List of All LLM Models

Discover and compare 500+ large language models with real-time rankings, benchmarks, and community votes.

OpenAI: GPT-5.1-Codex

OpenAI: GPT-5.1-Codex

By OpenAI

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. The model supports building projects from scratch, feature development, debugging, large-scale refactoring, and code review. Compared to GPT-5.1, Codex is more steerable, adheres closely to developer instructions, and produces cleaner, higher-quality code outputs. Reasoning effort can be adjusted with the `reasoning.effort` parameter. Read the [docs here](https://openrouter.ai/docs/use-cases/reasoning-tokens#reasoning-effort-level) Codex integrates into developer environments including the CLI, IDE extensions, GitHub, and cloud tasks. It adapts reasoning effort dynamically—providing fast responses for small tasks while sustaining extended multi-hour runs for large projects. The model is trained to perform structured code reviews, catching critical flaws by reasoning over dependencies and validating behavior against tests. It also supports multimodal inputs such as images or screenshots for UI development and integrates tool use for search, dependency installation, and environment setup. Codex is intended specifically for agentic coding applications.

Release Date

13 Nov 2025

Context Size

400K

OpenAI: GPT-5.1-Codex-Mini

OpenAI: GPT-5.1-Codex-Mini

By OpenAI

GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex

Release Date

13 Nov 2025

Context Size

400K

Kwaipilot: KAT-Coder-Pro V1

Kwaipilot: KAT-Coder-Pro V1

By kwaipilot

KAT-Coder-Pro V1 is KwaiKAT's most advanced agentic coding model in the KAT-Coder series. Designed specifically for agentic coding tasks, it excels in real-world software engineering scenarios, achieving 73.4% solve rate on the SWE-Bench Verified benchmark. The model has been optimized for tool-use capability, multi-turn interaction, instruction following, generalization, and comprehensive capabilities through a multi-stage training process, including mid-training, supervised fine-tuning (SFT), reinforcement fine-tuning (RFT), and scalable agentic RL.

Release Date

10 Nov 2025

Context Size

262.14K

Polaris Alpha

Polaris Alpha

By OpenRouter

This model was an early snapshot of GPT-5.1 with reasoning effort set to minimal. Try the official launch of GPT-5.1 [here](/openai/gpt-5.1) This is a cloaked model provided to the community to gather feedback. A powerful, general-purpose model that excels across real-world tasks, with standout performance in coding, tool calling, and instruction following. **Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.

Release Date

06 Nov 2025

Context Size

256K

MoonshotAI: Kimi K2 Thinking

MoonshotAI: Kimi K2 Thinking

By moonshotai

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in Kimi K2, it activates 32 billion parameters per forward pass and supports 256 k-token context windows. The model is optimized for persistent step-by-step thought, dynamic tool invocation, and complex reasoning workflows that span hundreds of turns. It interleaves step-by-step reasoning with tool use, enabling autonomous research, coding, and writing that can persist for hundreds of sequential actions without drift. It sets new open-source benchmarks on HLE, BrowseComp, SWE-Multilingual, and LiveCodeBench, while maintaining stable multi-agent behavior through 200–300 tool calls. Built on a large-scale MoE architecture with MuonClip optimization, it combines strong reasoning depth with high inference efficiency for demanding agentic and analytical tasks.

Release Date

06 Nov 2025

Context Size

262.14K

Qwen: Qwen3 Embedding 0.6B

Qwen: Qwen3 Embedding 0.6B

By Qwen

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. This series inherits the exceptional multilingual capabilities, long-text understanding, and reasoning skills of its foundational model. The Qwen3 Embedding series represents significant advancements in multiple text embedding and ranking tasks, including text retrieval, code retrieval, text classification, text clustering, and bitext mining.

Release Date

05 Nov 2025

Context Size

8.19K

Amazon: Nova Premier 1.0

Amazon: Nova Premier 1.0

By Amazon

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.

Release Date

31 Oct 2025

Context Size

1M

Mistral: Mistral Embed 2312

Mistral: Mistral Embed 2312

By Mistral AI

Mistral Embed is a specialized embedding model for text data, optimized for semantic search and RAG applications. Developed by Mistral AI in late 2023, it produces 1024-dimensional vectors that effectively capture semantic relationships in text.

Release Date

31 Oct 2025

Context Size

8.19K

Google: Gemini Embedding 001

Google: Gemini Embedding 001

By Google

gemini-embedding-001 provides a unified cutting edge experience across domains, including science, legal, finance, and coding. This embedding model has consistently held a top spot on the Massive Text Embedding Benchmark (MTEB) Multilingual leaderboard since the experimental launch in March.

Release Date

31 Oct 2025

Context Size

20K

OpenAI: Text Embedding Ada 002

OpenAI: Text Embedding Ada 002

By OpenAI

text-embedding-ada-002 is OpenAI's legacy text embedding model.

Release Date

30 Oct 2025

Context Size

8.19K

Mistral: Codestral Embed 2505

Mistral: Codestral Embed 2505

By Mistral AI

Mistral Codestral Embed is specially designed for code, perfect for embedding code databases, repositories, and powering coding assistants with state-of-the-art retrieval.

Release Date

30 Oct 2025

Context Size

8.19K

OpenAI: Text Embedding 3 Large

OpenAI: Text Embedding 3 Large

By OpenAI

text-embedding-3-large is OpenAI's most capable embedding model for both english and non-english tasks. Embeddings are a numerical representation of text that can be used to measure the relatedness between two pieces of text. Embeddings are useful for search, clustering, recommendations, anomaly detection, and classification tasks.

Release Date

30 Oct 2025

Context Size

8.19K

OpenAI: Text Embedding 3 Small

OpenAI: Text Embedding 3 Small

By OpenAI

text-embedding-3-small is OpenAI's improved, more performant version of the ada embedding model. Embeddings are a numerical representation of text that can be used to measure the relatedness between two pieces of text. Embeddings are useful for search, clustering, recommendations, anomaly detection, and classification tasks.

Release Date

30 Oct 2025

Context Size

8.19K

Perplexity: Sonar Pro Search

Perplexity: Sonar Pro Search

By Perplexity

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system. It is designed for deeper reasoning and analysis. Pricing is based on tokens plus $18 per thousand requests. This model powers the Pro Search mode on the Perplexity platform. Sonar Pro Search adds autonomous, multi-step reasoning to Sonar Pro. So, instead of just one query + synthesis, it plans and executes entire research workflows using tools.

Release Date

30 Oct 2025

Context Size

200K

Mistral: Voxtral Small 24B 2507

Mistral: Voxtral Small 24B 2507

By Mistral AI

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding. Input audio is priced at $100 per million seconds.

Release Date

30 Oct 2025

Context Size

32K

OpenAI: gpt-oss-safeguard-20b

OpenAI: gpt-oss-safeguard-20b

By OpenAI

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-weight, 21B-parameter Mixture-of-Experts (MoE) model offers lower latency for safety tasks like content classification, LLM filtering, and trust & safety labeling. Learn more about this model in OpenAI's gpt-oss-safeguard [user guide](https://cookbook.openai.com/articles/gpt-oss-safeguard-guide).

Release Date

29 Oct 2025

Context Size

131.07K

Qwen: Qwen3 Embedding 8B

Qwen: Qwen3 Embedding 8B

By Qwen

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. This series inherits the exceptional multilingual capabilities, long-text understanding, and reasoning skills of its foundational model. The Qwen3 Embedding series represents significant advancements in multiple text embedding and ranking tasks, including text retrieval, code retrieval, text classification, text clustering, and bitext mining.

Release Date

28 Oct 2025

Context Size

32K

NVIDIA: Nemotron Nano 12B 2 VL (free)

NVIDIA: Nemotron Nano 12B 2 VL (free)

By Nvidia

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a hybrid Transformer-Mamba architecture, combining transformer-level accuracy with Mamba’s memory-efficient sequence modeling for significantly higher throughput and lower latency. The model supports inputs of text and multi-image documents, producing natural-language outputs. It is trained on high-quality NVIDIA-curated synthetic datasets optimized for optical-character recognition, chart reasoning, and multimodal comprehension. Nemotron Nano 2 VL achieves leading results on OCRBench v2 and scores ≈ 74 average across MMMU, MathVista, AI2D, OCRBench, OCR-Reasoning, ChartQA, DocVQA, and Video-MME—surpassing prior open VL baselines. With Efficient Video Sampling (EVS), it handles long-form videos while reducing inference cost. Open-weights, training data, and fine-tuning recipes are released under a permissive NVIDIA open license, with deployment supported across NeMo, NIM, and major inference runtimes.

Release Date

28 Oct 2025

Context Size

128K

NVIDIA: Nemotron Nano 12B 2 VL (free)

NVIDIA: Nemotron Nano 12B 2 VL (free)

By nvidia

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a hybrid Transformer-Mamba architecture, combining transformer-level accuracy with Mamba’s...

Release Date

28 Oct 2025

Context Size

128K

Qwen: Qwen3 Embedding 4B

Qwen: Qwen3 Embedding 4B

By Qwen

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. This series inherits the exceptional multilingual capabilities, long-text understanding, and reasoning skills of its foundational model. The Qwen3 Embedding series represents significant advancements in multiple text embedding and ranking tasks, including text retrieval, code retrieval, text classification, text clustering, and bitext mining.

Release Date

28 Oct 2025

Context Size

32.77K

MiniMax: MiniMax M2

MiniMax: MiniMax M2

By MiniMax

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier intelligence across general reasoning, tool use, and multi-step task execution while maintaining low latency and deployment efficiency. The model excels in code generation, multi-file editing, compile-run-fix loops, and test-validated repair, showing strong results on SWE-Bench Verified, Multi-SWE-Bench, and Terminal-Bench. It also performs competitively in agentic evaluations such as BrowseComp and GAIA, effectively handling long-horizon planning, retrieval, and recovery from execution errors. Benchmarked by [Artificial Analysis](https://artificialanalysis.ai/models/minimax-m2), MiniMax-M2 ranks among the top open-source models for composite intelligence, spanning mathematics, science, and instruction-following. Its small activation footprint enables fast inference, high concurrency, and improved unit economics, making it well-suited for large-scale agents, developer assistants, and reasoning-driven applications that require responsiveness and cost efficiency. To avoid degrading this model's performance, MiniMax highly recommends preserving reasoning between turns. Learn more about using reasoning_details to pass back reasoning in our [docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks).

Release Date

23 Oct 2025

Context Size

204.80K

Qwen: Qwen3 VL 32B Instruct

Qwen: Qwen3 VL 32B Instruct

By Qwen

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text comprehension, enabling fine-grained spatial reasoning, document and scene analysis, and long-horizon video understanding.Robust OCR in 32 languages, and enhanced multimodal fusion through Interleaved-MRoPE and DeepStack architectures. Optimized for agentic interaction and visual tool use, Qwen3-VL-32B delivers state-of-the-art performance for complex real-world multimodal tasks.

Release Date

23 Oct 2025

Context Size

262.14K

Andromeda Alpha

Andromeda Alpha

By OpenRouter

This model has been revealed as NVIDIA Nemotron Nano 2 VL. It continues to be offered for free by NVIDIA [here](https://openrouter.ai/nvidia/nemotron-nano-12b-v2-vl:free). This is a small reasoning VLM trained for image understanding. It's strengths include multi-image comprehension (6+ images), especially those containing charts and text. This is a cloaked model provided to the community to gather feedback. Note: All prompts and output are logged to improve the provider’s model and its product and services. Please do not upload any personal, confidential, or otherwise sensitive information. This is a trial use only. Do not use for production or business-critical systems.

Release Date

21 Oct 2025

Context Size

128K

LiquidAI: LFM2-8B-A1B

LiquidAI: LFM2-8B-A1B

By Liquid

LFM2-8B-A1B is an efficient on-device Mixture-of-Experts (MoE) model from Liquid AI’s LFM2 family, built for fast, high-quality inference on edge hardware. It uses 8.3B total parameters with only ~1.5B active per token, delivering strong performance while keeping compute and memory usage low—making it ideal for phones, tablets, and laptops.

Release Date

20 Oct 2025

Context Size

8.19K

LiquidAI: LFM2-2.6B

LiquidAI: LFM2-2.6B

By Liquid

LFM2 is a new generation of hybrid models developed by Liquid AI, specifically designed for edge AI and on-device deployment. It sets a new standard in terms of quality, speed, and memory efficiency.

Release Date

20 Oct 2025

Context Size

32.77K

IBM: Granite 4.0 Micro

IBM: Granite 4.0 Micro

By ibm-granite

Granite-4.0-H-Micro is a 3B parameter from the Granite 4 family of models. These models are the latest in a series of models released by IBM. They are fine-tuned for long context tool calling.

Release Date

20 Oct 2025

Context Size

131K

Microsoft: Phi 4 Mini Instruct

Microsoft: Phi 4 Mini Instruct

By Microsoft

Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - with a focus on high-quality, reasoning dense data. The model belongs to the Phi-4 model family and supports 128K token context length. The model underwent an enhancement process, incorporating both supervised fine-tuning and direct preference optimization to support precise instruction adherence and robust safety measures.

Release Date

17 Oct 2025

Context Size

131.07K

Deep Cogito: Cogito V2 Preview Llama 405B

Deep Cogito: Cogito V2 Preview Llama 405B

By deepcogito

Cogito v2 405B is a dense hybrid reasoning model that combines direct answering capabilities with advanced self-reflection. It represents a significant step toward frontier intelligence with dense architecture delivering performance competitive with leading closed models. This advanced reasoning system combines policy improvement with massive scale for exceptional capabilities.

Release Date

17 Oct 2025

Context Size

131.07K

OpenAI: GPT-5 Image Mini

OpenAI: GPT-5 Image Mini

By OpenAI

GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [GPT-5 Mini](https://openrouter.ai/openai/gpt-5-mini), with GPT Image 1 Mini for efficient image generation. This natively multimodal model features superior instruction following, text rendering, and detailed image editing with reduced latency and cost. It excels at high-quality visual creation while maintaining strong text understanding, making it ideal for applications that require both efficient image generation and text processing at scale.

Release Date

16 Oct 2025

Context Size

400K

Anthropic: Claude Haiku 4.5

Anthropic: Claude Haiku 4.5

By Anthropic

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4’s performance across reasoning, coding, and computer-use tasks, Haiku 4.5 brings frontier-level capability to real-time and high-volume applications. It introduces extended thinking to the Haiku line; enabling controllable reasoning depth, summarized or interleaved thought output, and tool-assisted workflows with full support for coding, bash, web search, and computer-use tools. Scoring >73% on SWE-bench Verified, Haiku 4.5 ranks among the world’s best coding models while maintaining exceptional responsiveness for sub-agents, parallelized execution, and scaled deployment.

Release Date

15 Oct 2025

Context Size

200K

Showing page 9 of 26 with 762 models total