List of All LLM Models

Discover and compare 500+ large language models with real-time rankings, benchmarks, and community votes.

Sherlock Dash Alpha

By OpenRouter

This model was an early snapshot of Grok 4.1 Fast with reasoning disabled. Try the official launch of Grok 4.1 Fast [here](/x-ai/grok-4.1-fast). This is a cloaked model provided to the community to gather feedback: a frontier non-reasoning model that excels at tool calling, with a 1.8M context window and multimodal support. **Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.

Release Date

15 Nov 2025

Context Size

1.84M

Sherlock Think Alpha

By OpenRouter

This model was an early snapshot of Grok 4.1 Fast with reasoning enabled. Try the official launch of Grok 4.1 Fast [here](/x-ai/grok-4.1-fast). This is a cloaked model provided to the community to gather feedback: a frontier reasoning model that excels at tool calling, with a 1.8M context window and multimodal support. **Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.

Release Date

15 Nov 2025

Context Size

1.84M

Deep Cogito: Cogito v2.1 671B

By deepcogito

Cogito v2.1 671B is a Mixture-of-Experts (MoE) model and one of the strongest open models globally, matching the performance of frontier closed and open models. It is trained with self-play reinforcement learning to reach state-of-the-art performance across multiple categories, including instruction following, coding, longer queries, and creative writing. This advanced system demonstrates significant progress toward scalable superintelligence through policy improvement.

Release Date

13 Nov 2025

Context Size

128K

OpenAI: GPT-5.1

By OpenAI

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style compared to GPT-5. It uses adaptive reasoning to allocate computation dynamically, responding quickly to simple queries while spending more depth on complex tasks. The model produces clearer, more grounded explanations with reduced jargon, making it easier to follow even on technical or multi-step problems. Built for broad task coverage, GPT-5.1 delivers consistent gains across math, coding, and structured analysis workloads, with more coherent long-form answers and improved tool-use reliability. It also features refined conversational alignment, enabling warmer, more intuitive responses without compromising precision. GPT-5.1 serves as the primary full-capability successor to GPT-5.

Release Date

13 Nov 2025

Context Size

400K

OpenAI: GPT-5.1 Chat

By OpenAI

GPT-5.1 Chat (also known as GPT-5.1 Instant) is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “think” on harder queries, improving accuracy on math, coding, and multi-step tasks without slowing down typical conversations. The model is warmer and more conversational by default, with better instruction following and more stable short-form reasoning. GPT-5.1 Chat is designed for high-throughput, interactive workloads where responsiveness and consistency matter more than deep deliberation.

Release Date

13 Nov 2025

Context Size

128K

OpenAI: GPT-5.1-Codex

By OpenAI

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. The model supports building projects from scratch, feature development, debugging, large-scale refactoring, and code review. Compared to GPT-5.1, Codex is more steerable, adheres closely to developer instructions, and produces cleaner, higher-quality code outputs. Reasoning effort can be adjusted with the `reasoning.effort` parameter. Read the [docs here](https://openrouter.ai/docs/use-cases/reasoning-tokens#reasoning-effort-level). Codex integrates into developer environments including the CLI, IDE extensions, GitHub, and cloud tasks. It adapts reasoning effort dynamically, providing fast responses for small tasks while sustaining extended multi-hour runs for large projects. The model is trained to perform structured code reviews, catching critical flaws by reasoning over dependencies and validating behavior against tests. It also supports multimodal inputs such as images or screenshots for UI development and integrates tool use for search, dependency installation, and environment setup. Codex is intended specifically for agentic coding applications.
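As a minimal sketch of how the `reasoning.effort` parameter fits into a chat-completions request body (the model slug and effort levels here are assumptions drawn from the linked docs; verify before use):

```python
import json

# Build the request body; nothing is sent here -- this only shows the shape.
payload = {
    "model": "openai/gpt-5.1-codex",
    "messages": [
        {"role": "user", "content": "Refactor this function to remove duplication."}
    ],
    # reasoning.effort trades latency for depth; the documented levels
    # are typically "low", "medium", and "high".
    "reasoning": {"effort": "high"},
}

body = json.dumps(payload)
print(body)
```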

Release Date

13 Nov 2025

Context Size

400K

OpenAI: GPT-5.1-Codex-Mini

By OpenAI

GPT-5.1-Codex-Mini is a smaller, faster version of GPT-5.1-Codex.

Release Date

13 Nov 2025

Context Size

400K

Kwaipilot: KAT-Coder-Pro V1

By kwaipilot

KAT-Coder-Pro V1 is KwaiKAT's most advanced agentic coding model in the KAT-Coder series. Designed specifically for agentic coding tasks, it excels in real-world software engineering scenarios, achieving a 73.4% solve rate on the SWE-Bench Verified benchmark. The model has been optimized for tool use, multi-turn interaction, instruction following, and generalization through a multi-stage training process that includes mid-training, supervised fine-tuning (SFT), reinforcement fine-tuning (RFT), and scalable agentic RL.

Release Date

10 Nov 2025

Context Size

262.14K

Polaris Alpha

By OpenRouter

This model was an early snapshot of GPT-5.1 with reasoning effort set to minimal. Try the official launch of GPT-5.1 [here](/openai/gpt-5.1). This is a cloaked model provided to the community to gather feedback: a powerful, general-purpose model that excels across real-world tasks, with standout performance in coding, tool calling, and instruction following. **Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.

Release Date

06 Nov 2025

Context Size

256K

MoonshotAI: Kimi K2 Thinking

By moonshotai

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in Kimi K2, it activates 32 billion parameters per forward pass and supports a 256K-token context window. The model is optimized for persistent step-by-step thought, dynamic tool invocation, and complex reasoning workflows that span hundreds of turns. It interleaves step-by-step reasoning with tool use, enabling autonomous research, coding, and writing that can persist for hundreds of sequential actions without drift. It sets new open-source benchmarks on HLE, BrowseComp, SWE-Multilingual, and LiveCodeBench, while maintaining stable multi-agent behavior through 200–300 tool calls. With MuonClip optimization, it combines strong reasoning depth with high inference efficiency for demanding agentic and analytical tasks.

Release Date

06 Nov 2025

Context Size

262.14K

Qwen: Qwen3 Embedding 0.6B

By Qwen

The Qwen3 Embedding series is the latest embedding model family from Qwen, specifically designed for text embedding and ranking tasks. It inherits the exceptional multilingual capabilities, long-text understanding, and reasoning skills of its foundation model, and delivers significant advances across text embedding and ranking tasks, including text retrieval, code retrieval, text classification, text clustering, and bitext mining.

Release Date

05 Nov 2025

Context Size

8.19K

Amazon: Nova Premier 1.0

By Amazon

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.

Release Date

31 Oct 2025

Context Size

1M

Mistral: Mistral Embed 2312

By Mistral AI

Mistral Embed is a specialized embedding model for text data, optimized for semantic search and RAG applications. Developed by Mistral AI in late 2023, it produces 1024-dimensional vectors that effectively capture semantic relationships in text.

Release Date

31 Oct 2025

Context Size

8.19K

Google: Gemini Embedding 001

By Google

gemini-embedding-001 provides a unified, cutting-edge experience across domains including science, legal, finance, and coding. The model has consistently held a top spot on the Massive Text Embedding Benchmark (MTEB) Multilingual leaderboard since its experimental launch in March 2025.

Release Date

31 Oct 2025

Context Size

20K

OpenAI: Text Embedding Ada 002

By OpenAI

text-embedding-ada-002 is OpenAI's legacy text embedding model.

Release Date

30 Oct 2025

Context Size

8.19K

Mistral: Codestral Embed 2505

By Mistral AI

Mistral Codestral Embed is specially designed for code, perfect for embedding code databases, repositories, and powering coding assistants with state-of-the-art retrieval.

Release Date

30 Oct 2025

Context Size

8.19K

OpenAI: Text Embedding 3 Large

By OpenAI

text-embedding-3-large is OpenAI's most capable embedding model for both English and non-English tasks. Embeddings are numerical representations of text that can be used to measure the relatedness between two pieces of text, and are useful for search, clustering, recommendations, anomaly detection, and classification tasks.
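Relatedness between two embeddings is usually measured with cosine similarity. A minimal sketch, using toy 3-dimensional vectors in place of the model's real high-dimensional outputs:

```python
import math

def cosine_similarity(a, b):
    # Relatedness of two embedding vectors: 1.0 means identical direction,
    # values near 0 mean unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embedding-model outputs.
v_cat = [0.9, 0.1, 0.2]
v_kitten = [0.85, 0.15, 0.25]
v_invoice = [0.05, 0.9, 0.1]

# "cat" should land closer to "kitten" than to "invoice".
assert cosine_similarity(v_cat, v_kitten) > cosine_similarity(v_cat, v_invoice)
```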

Release Date

30 Oct 2025

Context Size

8.19K

OpenAI: Text Embedding 3 Small

By OpenAI

text-embedding-3-small is OpenAI's improved, more performant successor to the ada embedding model. Embeddings are numerical representations of text that can be used to measure the relatedness between two pieces of text, and are useful for search, clustering, recommendations, anomaly detection, and classification tasks.

Release Date

30 Oct 2025

Context Size

8.19K

Perplexity: Sonar Pro Search

By Perplexity

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system, designed for deeper reasoning and analysis. Pricing is based on tokens plus $18 per thousand requests. This model powers the Pro Search mode on the Perplexity platform. Sonar Pro Search adds autonomous, multi-step reasoning to Sonar Pro: instead of a single query-and-synthesis pass, it plans and executes entire research workflows using tools.
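A back-of-the-envelope sketch of the per-request portion of that pricing (token charges are billed separately and not modeled here):

```python
# Sonar Pro Search request fee: $18 per 1,000 requests.
def request_fee_usd(num_requests: int) -> float:
    return num_requests * 18 / 1000

print(request_fee_usd(250))  # 250 requests cost $4.50 in request fees
```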

Release Date

30 Oct 2025

Context Size

200K

Mistral: Voxtral Small 24B 2507

By Mistral AI

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding. Input audio is priced at $100 per million seconds.
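That audio rate works out to $0.0001 per second; a quick sketch of the arithmetic:

```python
# Voxtral audio input pricing: $100 per 1,000,000 seconds of audio.
def audio_input_cost_usd(seconds: float) -> float:
    return seconds * 100 / 1_000_000

print(audio_input_cost_usd(3600))  # one hour of audio
```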

Release Date

30 Oct 2025

Context Size

32K

OpenAI: gpt-oss-safeguard-20b

By OpenAI

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-weight, 21B-parameter Mixture-of-Experts (MoE) model offers lower latency for safety tasks like content classification, LLM filtering, and trust & safety labeling. Learn more about this model in OpenAI's gpt-oss-safeguard [user guide](https://cookbook.openai.com/articles/gpt-oss-safeguard-guide).

Release Date

29 Oct 2025

Context Size

131.07K

Qwen: Qwen3 Embedding 8B

By Qwen

The Qwen3 Embedding series is the latest embedding model family from Qwen, specifically designed for text embedding and ranking tasks. It inherits the exceptional multilingual capabilities, long-text understanding, and reasoning skills of its foundation model, and delivers significant advances across text embedding and ranking tasks, including text retrieval, code retrieval, text classification, text clustering, and bitext mining.

Release Date

28 Oct 2025

Context Size

32K

NVIDIA: Nemotron Nano 12B 2 VL (free)

By Nvidia

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a hybrid Transformer-Mamba architecture, combining transformer-level accuracy with Mamba’s memory-efficient sequence modeling for significantly higher throughput and lower latency. The model supports text and multi-image document inputs, producing natural-language outputs. It is trained on high-quality NVIDIA-curated synthetic datasets optimized for optical character recognition, chart reasoning, and multimodal comprehension. Nemotron Nano 2 VL achieves leading results on OCRBench v2 and scores an average of ≈74 across MMMU, MathVista, AI2D, OCRBench, OCR-Reasoning, ChartQA, DocVQA, and Video-MME, surpassing prior open VL baselines. With Efficient Video Sampling (EVS), it handles long-form videos while reducing inference cost. Open weights, training data, and fine-tuning recipes are released under a permissive NVIDIA open license, with deployment supported across NeMo, NIM, and major inference runtimes.

Release Date

28 Oct 2025

Context Size

128K

Qwen: Qwen3 Embedding 4B

By Qwen

The Qwen3 Embedding series is the latest embedding model family from Qwen, specifically designed for text embedding and ranking tasks. It inherits the exceptional multilingual capabilities, long-text understanding, and reasoning skills of its foundation model, and delivers significant advances across text embedding and ranking tasks, including text retrieval, code retrieval, text classification, text clustering, and bitext mining.

Release Date

28 Oct 2025

Context Size

32.77K

MiniMax: MiniMax M2

By MiniMax

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier intelligence across general reasoning, tool use, and multi-step task execution while maintaining low latency and deployment efficiency. The model excels in code generation, multi-file editing, compile-run-fix loops, and test-validated repair, showing strong results on SWE-Bench Verified, Multi-SWE-Bench, and Terminal-Bench. It also performs competitively in agentic evaluations such as BrowseComp and GAIA, effectively handling long-horizon planning, retrieval, and recovery from execution errors. Benchmarked by [Artificial Analysis](https://artificialanalysis.ai/models/minimax-m2), MiniMax-M2 ranks among the top open-source models for composite intelligence, spanning mathematics, science, and instruction-following. Its small activation footprint enables fast inference, high concurrency, and improved unit economics, making it well-suited for large-scale agents, developer assistants, and reasoning-driven applications that require responsiveness and cost efficiency. To avoid degrading this model's performance, MiniMax highly recommends preserving reasoning between turns. Learn more about using reasoning_details to pass back reasoning in our [docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks).
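Preserving reasoning between turns amounts to echoing the assistant message, `reasoning_details` included, back into the next request's message list. A minimal sketch (the field shapes below are assumptions based on the linked reasoning docs; no request is actually sent):

```python
# Start of a multi-turn conversation.
messages = [{"role": "user", "content": "Plan the refactor in steps."}]

# Pretend this assistant message came back from the previous response,
# carrying its reasoning alongside the visible content:
assistant_turn = {
    "role": "assistant",
    "content": "Step 1: extract the parser into its own module...",
    "reasoning_details": [{"type": "reasoning.text", "text": "..."}],
}

# Append it unmodified, then add the next user turn, so the model sees
# its own prior reasoning on the next call.
messages.append(assistant_turn)
messages.append({"role": "user", "content": "Go ahead with step 1."})
```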

Release Date

23 Oct 2025

Context Size

196.61K

Qwen: Qwen3 VL 32B Instruct

By Qwen

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text comprehension, enabling fine-grained spatial reasoning, document and scene analysis, and long-horizon video understanding. It offers robust OCR in 32 languages and enhanced multimodal fusion through the Interleaved-MRoPE and DeepStack architectures. Optimized for agentic interaction and visual tool use, Qwen3-VL-32B delivers state-of-the-art performance on complex real-world multimodal tasks.

Release Date

23 Oct 2025

Context Size

131.07K

Andromeda Alpha

By OpenRouter

This model has been revealed as NVIDIA Nemotron Nano 2 VL. It continues to be offered for free by NVIDIA [here](https://openrouter.ai/nvidia/nemotron-nano-12b-v2-vl:free). This is a small reasoning VLM trained for image understanding. Its strengths include multi-image comprehension (6+ images), especially of images containing charts and text. This is a cloaked model provided to the community to gather feedback. Note: All prompts and output are logged to improve the provider’s model and its products and services. Please do not upload any personal, confidential, or otherwise sensitive information. This is for trial use only; do not use it for production or business-critical systems.

Release Date

21 Oct 2025

Context Size

128K

LiquidAI: LFM2-8B-A1B

By Liquid

LFM2-8B-A1B is an efficient on-device Mixture-of-Experts (MoE) model from Liquid AI’s LFM2 family, built for fast, high-quality inference on edge hardware. It uses 8.3B total parameters with only ~1.5B active per token, delivering strong performance while keeping compute and memory usage low—making it ideal for phones, tablets, and laptops.

Release Date

20 Oct 2025

Context Size

8.19K

LiquidAI: LFM2-2.6B

By Liquid

LFM2 is a new generation of hybrid models developed by Liquid AI, specifically designed for edge AI and on-device deployment. It sets a new standard in terms of quality, speed, and memory efficiency.

Release Date

20 Oct 2025

Context Size

32.77K
