List of All LLM Models

Discover and compare 500+ large language models with real-time rankings, benchmarks, and community votes.

MiniMax: MiniMax M2.5 (free)

MiniMax: MiniMax M2.5 (free)

By MiniMax

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments, M2.5 builds upon the coding expertise of M2.1 to extend into general office work, reaching fluency in generating and operating Word, Excel, and Powerpoint files, context switching between diverse software environments, and working across different agent and human teams. Scoring 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp, M2.5 is also more token efficient than previous generations, having been trained to optimize its actions and output through planning.

Release Date

12 Feb 2026

Context Size

204.80K

Z.ai: GLM 5

Z.ai: GLM 5

By Z.ai

GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it delivers production-grade performance on large-scale programming tasks, rivaling leading closed-source models. With advanced agentic planning, deep backend reasoning, and iterative self-correction, GLM-5 moves beyond code generation to full-system construction and autonomous execution.

Release Date

11 Feb 2026

Context Size

202.75K

Qwen: Qwen3 Max Thinking

Qwen: Qwen3 Max Thinking

By Qwen

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning. By significantly scaling model capacity and reinforcement learning compute, it delivers major gains in factual accuracy, complex reasoning, instruction following, alignment with human preferences, and agentic behavior.

Release Date

09 Feb 2026

Context Size

262.14K

Aurora Alpha

Aurora Alpha

By OpenRouter

This is a cloaked model provided to the community to gather feedback. A reasoning model designed for speed. It is built for coding assistants, real-time conversational applications, and agentic workflows. Default reasoning effort is set to medium for fast responses. For agentic coding use cases, we recommend changing effort to high. Note: All prompts and completions for this model are logged by the provider and may be used to improve the model.

Release Date

09 Feb 2026

Context Size

128K

Pony Alpha

Pony Alpha

By OpenRouter

Pony is a cutting-edge foundation model with strong performance in coding, agentic workflows, reasoning, and roleplay, making it well suited for hands-on coding and real-world use. **Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.

Release Date

06 Feb 2026

Context Size

200K

Anthropic: Claude Opus 4.6

Anthropic: Claude Opus 4.6

By Anthropic

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that operate across entire workflows rather than single prompts, making it especially effective for large codebases, complex refactors, and multi-step debugging that unfolds over time. The model shows deeper contextual understanding, stronger problem decomposition, and greater reliability on hard engineering tasks than prior generations. Beyond coding, Opus 4.6 excels at sustained knowledge work. It produces near-production-ready documents, plans, and analyses in a single pass, and maintains coherence across very long outputs and extended sessions. This makes it a strong default for tasks that require persistence, judgment, and follow-through, such as technical design, migration planning, and end-to-end project execution. For users upgrading from earlier Opus versions, see our [official migration guide here](https://openrouter.ai/docs/guides/guides/model-migrations/claude-4-6-opus)

Release Date

04 Feb 2026

Context Size

1M

Qwen: Qwen3 Coder Next

Qwen: Qwen3 Coder Next

By Qwen

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It uses a sparse MoE design with 80B total parameters and only 3B activated per token, delivering performance comparable to models with 10 to 20x higher active compute, which makes it well suited for cost-sensitive, always-on agent deployment. The model is trained with a strong agentic focus and performs reliably on long-horizon coding tasks, complex tool usage, and recovery from execution failures. With a native 256k context window, it integrates cleanly into real-world CLI and IDE environments and adapts well to common agent scaffolds used by modern coding tools. The model operates exclusively in non-thinking mode and does not emit <think> blocks, simplifying integration for production coding agents.

Release Date

04 Feb 2026

Context Size

262.14K

Sourceful: Riverflow V2 Pro

Sourceful: Riverflow V2 Pro

By sourceful

Riverflow V2 Pro is the most powerful variant of Sourceful's Riverflow 2.0 lineup, best for top-tier control and perfect text rendering. The Riverflow 2.0 series represents SOTA performance on image generation and editing tasks, using an integrated reasoning model to boost reliability and tackle complex challenges. Pricing is $0.15 per 1K/2K output image and $0.33 per 4K output image. Additional features: - Custom font rendering via font_inputs ($0.03/font, max 2) - Image enhancement via super_resolution_references ($0.20/reference, max 4) See the image generation docs for details: https://openrouter.ai/docs/features/multimodal/image-generation Note: Sourceful imposes a 4.5MB request size limit, therefore it is highly recommended to pass image URLs instead of Base64 data.

Release Date

02 Feb 2026

Context Size

8.19K

Sourceful: Riverflow V2 Fast

Sourceful: Riverflow V2 Fast

By sourceful

Riverflow V2 Fast is the fastest variant of Sourceful's Riverflow 2.0 lineup, best for production deployments and latency-critical workflows. The Riverflow 2.0 series represents SOTA performance on image generation and editing tasks, using an integrated reasoning model to boost reliability and tackle complex challenges. Pricing is $0.02 per 1K output image and $0.04 per 2K output image. Does not support 4K image output. Additional features: - Custom font rendering via font_inputs ($0.03/font, max 2) - Image enhancement via super_resolution_references ($0.20/reference, max 4) See the image generation docs for details: https://openrouter.ai/docs/features/multimodal/image-generation Note: Sourceful imposes a 4.5MB request size limit, therefore it is highly recommended to pass image URLs instead of Base64 data.

Release Date

02 Feb 2026

Context Size

8.19K

Free Models Router

Free Models Router

By OpenRouter

The simplest way to get free inference. openrouter/free is a router that selects free models at random from the models available on OpenRouter. The router smartly filters for models that support features needed for your request such as image understanding, tool calling, structured outputs and more.

Release Date

01 Feb 2026

Context Size

200K

StepFun: Step 3.5 Flash

StepFun: Step 3.5 Flash

By stepfun

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) architecture, it selectively activates only 11B of its 196B parameters per token. It is a reasoning model that is incredibly speed efficient even at long contexts.

Release Date

29 Jan 2026

Context Size

262.14K

Arcee AI: Trinity Large Preview

Arcee AI: Trinity Large Preview

By arcee-ai

Trinity-Large-Preview is a frontier-scale open-weight language model from Arcee, built as a 400B-parameter sparse Mixture-of-Experts with 13B active parameters per token using 4-of-256 expert routing. It excels in creative writing, storytelling, role-play, chat scenarios, and real-time voice assistance, better than your average reasoning model usually can. But we’re also introducing some of our newer agentic performance. It was trained to navigate well in agent harnesses like OpenCode, Cline, and Kilo Code, and to handle complex toolchains and long, constraint-filled prompts. The architecture natively supports very long context windows up to 512k tokens, with the Preview API currently served at 128k context using 8-bit quantization for practical deployment. Trinity-Large-Preview reflects Arcee’s efficiency-first design philosophy, offering a production-oriented frontier model with open weights and permissive licensing suitable for real-world applications and experimentation.

Release Date

27 Jan 2026

Context Size

131K

MoonshotAI: Kimi K2.5

MoonshotAI: Kimi K2.5

By moonshotai

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over approximately 15T mixed visual and text tokens, it delivers strong performance in general reasoning, visual coding, and agentic tool-calling.

Release Date

27 Jan 2026

Context Size

262.14K

Upstage: Solar Pro 3

Upstage: Solar Pro 3

By upstage

Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B total parameters and 12B active parameters per forward pass, it delivers exceptional performance while maintaining computational efficiency. Optimized for Korean with English and Japanese support.

Release Date

27 Jan 2026

Context Size

128K

MiniMax: MiniMax M2-her

MiniMax: MiniMax M2-her

By MiniMax

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations. Designed to stay consistent in tone and personality, it supports rich message roles (user_system, group, sample_message_user, sample_message_ai) and can learn from example dialogue to better match the style and pacing of your scenario, making it a strong choice for storytelling, companions, and conversational experiences where natural flow and vivid interaction matter most.

Release Date

23 Jan 2026

Context Size

65.54K

Writer: Palmyra X5

Writer: Palmyra X5

By Writer

Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agents across the enterprise. It delivers industry-leading speed and efficiency on context windows up to 1 million tokens, powered by a novel transformer architecture and hybrid attention mechanisms. This enables faster inference and expanded memory for processing large volumes of enterprise data, critical for scaling AI agents.

Release Date

21 Jan 2026

Context Size

1.04M

LiquidAI: LFM2.5-1.2B-Thinking (free)

LiquidAI: LFM2.5-1.2B-Thinking (free)

By Liquid

LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—while still running comfortably on edge devices. It supports long context (up to 32K tokens) and is designed to provide higher-quality “thinking” responses in a small 1.2B model.

Release Date

20 Jan 2026

Context Size

32.77K

LiquidAI: LFM2.5-1.2B-Thinking (free)

LiquidAI: LFM2.5-1.2B-Thinking (free)

By liquid

LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—while still running comfortably on edge devices. It supports long context (up to 32K tokens) and is...

Release Date

20 Jan 2026

Context Size

32.77K

LiquidAI: LFM2.5-1.2B-Instruct (free)

LiquidAI: LFM2.5-1.2B-Instruct (free)

By Liquid

LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-device AI. It delivers strong chat quality in a 1.2B parameter footprint, with efficient edge inference and broad runtime support.

Release Date

20 Jan 2026

Context Size

32.77K

LiquidAI: LFM2.5-1.2B-Instruct (free)

LiquidAI: LFM2.5-1.2B-Instruct (free)

By liquid

LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-device AI. It delivers strong chat quality in a 1.2B parameter footprint, with efficient edge inference and broad runtime support.

Release Date

20 Jan 2026

Context Size

32.77K

OpenAI: GPT Audio

OpenAI: GPT Audio

By OpenAI

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Audio is priced at $32 per million input tokens and $64 per million output tokens.

Release Date

19 Jan 2026

Context Size

128K

OpenAI: GPT Audio Mini

OpenAI: GPT Audio Mini

By OpenAI

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Input is priced at $0.60 per million tokens and output is priced at $2.40 per million tokens.

Release Date

19 Jan 2026

Context Size

128K

Z.ai: GLM 4.7 Flash

Z.ai: GLM 4.7 Flash

By Z.ai

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning, and tool collaboration, and has achieved leading performance among open-source models of the same size on several current public benchmark leaderboards.

Release Date

19 Jan 2026

Context Size

202.75K

Black Forest Labs: FLUX.2 Klein 4B

Black Forest Labs: FLUX.2 Klein 4B

By black-forest-labs

FLUX.2 [klein] 4B is the fastest and most cost-effective model in the FLUX.2 family, optimized for high-throughput use cases while maintaining excellent image quality. Pricing is based on the output image. The first generated megapixel is charged $0.014. Each subsequent megapixel is charged $0.001.

Release Date

14 Jan 2026

Context Size

40.96K

OpenAI: GPT-5.2-Codex

OpenAI: GPT-5.2-Codex

By OpenAI

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. The model supports building projects from scratch, feature development, debugging, large-scale refactoring, and code review. Compared to GPT-5.1-Codex, 5.2-Codex is more steerable, adheres closely to developer instructions, and produces cleaner, higher-quality code outputs. Reasoning effort can be adjusted with the `reasoning.effort` parameter. Read the [docs here](https://openrouter.ai/docs/use-cases/reasoning-tokens#reasoning-effort-level) Codex integrates into developer environments including the CLI, IDE extensions, GitHub, and cloud tasks. It adapts reasoning effort dynamically—providing fast responses for small tasks while sustaining extended multi-hour runs for large projects. The model is trained to perform structured code reviews, catching critical flaws by reasoning over dependencies and validating behavior against tests. It also supports multimodal inputs such as images or screenshots for UI development and integrates tool use for search, dependency installation, and environment setup. Codex is intended specifically for agentic coding applications.

Release Date

14 Jan 2026

Context Size

400K

AllenAI: Molmo2 8B

AllenAI: Molmo2 8B

By Ai2

Molmo2-8B is an open vision-language model developed by the Allen Institute for AI (Ai2) as part of the Molmo2 family, supporting image, video, and multi-image understanding and grounding. It is based on Qwen3-8B and uses SigLIP 2 as its vision backbone, outperforming other open-weight, open-data models on short videos, counting, and captioning, while remaining competitive on long-video tasks.

Release Date

09 Jan 2026

Context Size

36.86K

AllenAI: Olmo 3.1 32B Instruct

AllenAI: Olmo 3.1 32B Instruct

By Ai2

Olmo 3.1 32B Instruct is a large-scale, 32-billion-parameter instruction-tuned language model engineered for high-performance conversational AI, multi-turn dialogue, and practical instruction following. As part of the Olmo 3.1 family, this variant emphasizes responsiveness to complex user directions and robust chat interactions while retaining strong capabilities on reasoning and coding benchmarks. Developed by Ai2 under the Apache 2.0 license, Olmo 3.1 32B Instruct reflects the Olmo initiative’s commitment to openness and transparency.

Release Date

06 Jan 2026

Context Size

65.54K

ByteDance Seed: Seedream 4.5

ByteDance Seed: Seedream 4.5

By bytedance-seed

Seedream 4.5 is the latest in-house image generation model developed by ByteDance. Compared with Seedream 4.0, it delivers comprehensive improvements, especially in editing consistency, including better preservation of subject details, lighting, and color tone. It also enhances portrait refinement and small-text rendering. The model’s multi-image composition capabilities have been significantly strengthened, and both reasoning performance and visual aesthetics continue to advance, enabling more accurate and artistically expressive image generation. Pricing is $0.04 per output image, regardless of size.

Release Date

23 Dec 2025

Context Size

4.10K

ByteDance Seed: Seed 1.6 Flash

ByteDance Seed: Seed 1.6 Flash

By bytedance-seed

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding. It features a 256k context window and can generate outputs of up to 16k tokens.

Release Date

23 Dec 2025

Context Size

262.14K

ByteDance Seed: Seed 1.6

ByteDance Seed: Seed 1.6

By bytedance-seed

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and adaptive deep thinking with a 256K context window.

Release Date

23 Dec 2025

Context Size

262.14K

Showing page 6 of 26 with 762 models total