List of All LLM Models

Discover and compare 500+ large language models with real-time rankings, benchmarks, and community votes.

ByteDance: Seedance 2.0 Fast

By bytedance

Seedance 2.0 Fast is a video generation model from ByteDance. It supports text-to-video, image-to-video with first and last frame control, and multimodal reference-to-video. It prioritizes generation speed and lower cost over maximum output quality. The number of tokens is given by (height of output video * width of output video * duration * 24) / 1024

Release Date

15 Apr 2026

Context Size

0
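
To make the token formula quoted above concrete, here is a minimal sketch of the calculation; the function name, the integer rounding, and the 24fps factor as a default are illustrative assumptions rather than documented provider behavior.

```python
def seedance_video_tokens(height: int, width: int, duration_s: float, fps: int = 24) -> int:
    """Estimate billed tokens for a generated clip using the formula quoted above:
    (height * width * duration * 24) / 1024.

    The function name, rounding, and fps default are assumptions for illustration.
    """
    return int(height * width * duration_s * fps / 1024)

# Example: a 5-second 1280x720 clip
# 720 * 1280 * 5 * 24 / 1024 = 108,000 tokens
print(seedance_video_tokens(720, 1280, 5))  # -> 108000
```

The same formula is quoted for Seedance 1.5 Pro later on this page, so the sketch applies there as well.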

Alibaba: Wan 2.7

By alibaba

Wan 2.7 is a video generation model from Alibaba. It supports text-to-video, image-to-video with first and last frame control, and reference-to-video, where multiple reference images guide the style and content of the generated scene.

Release Date

15 Apr 2026

Context Size

0

Elephant Alpha

By OpenRouter

Elephant Alpha is a 100B-parameter text model focused on intelligence efficiency, delivering strong performance while minimizing token usage. It supports a 256K context window with up to 32K output tokens, function calling, structured output, and prompt caching. It is particularly well-suited for code completion and debugging, rapid document processing, and lightweight agent interactions. Note: Prompts and completions may be logged by the provider and used to improve the model.

Release Date

13 Apr 2026

Context Size

262.14K

Anthropic: Claude Opus 4.6 (Fast)

By Anthropic

Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6): identical capabilities with higher output speed at 6x premium pricing. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode

Release Date

07 Apr 2026

Context Size

1M

Z.ai: GLM 5.1

By Z.ai

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on a single task for more than 8 hours, autonomously planning, executing, and improving itself throughout the process, ultimately delivering complete, engineering-grade results.

Release Date

07 Apr 2026

Context Size

202.75K

Cohere: Rerank 4 Pro

By Cohere

Cohere's AI search foundation model for enhancing the relevance of information surfaced within search and RAG systems. Features a 32K context window, multilingual support across 100+ languages, no data pre-processing required, and state-of-the-art performance with low latency.

Release Date

06 Apr 2026

Context Size

32.77K

Cohere: Rerank 4 Fast

By Cohere

Cohere's AI search foundation model for enhancing the relevance of information surfaced within search and RAG systems. Features a 32K context window, multilingual support across 100+ languages, no data pre-processing required, and high performance with the lowest latency.

Release Date

06 Apr 2026

Context Size

32.77K

Cohere: Rerank v3.5

By Cohere

Rerank v3.5 is designed to reorder search results for improved relevance. It supports multi-aspect and semi-structured data reranking over 100+ languages. Ideal for refining results from semantic or keyword search pipelines.

Release Date

05 Apr 2026

Context Size

4.10K
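
As a rough illustration of how a rerank model like this slots into a search pipeline, here is a minimal sketch using the Cohere Python SDK; the model identifier and the exact response fields are assumptions based on Cohere's current rerank API, not something stated in this listing.

```python
import cohere

# Minimal reranking sketch. The client/method shape follows the current Cohere
# Python SDK; the model identifier below is an assumption and should be replaced
# with the name of the deployed Rerank model.
co = cohere.ClientV2(api_key="<COHERE_API_KEY>")

query = "How do I rotate an API key?"
documents = [
    "Billing is handled on the first of every month.",
    "API keys can be rotated from the dashboard under Settings > Security.",
    "Our SDK supports Python, TypeScript, and Go.",
]

response = co.rerank(
    model="rerank-v3.5",   # assumed model identifier for illustration
    query=query,
    documents=documents,
    top_n=2,
)

# Each result carries the original document index and a relevance score.
for result in response.results:
    print(result.index, round(result.relevance_score, 3), documents[result.index])
```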

Google: Gemma 4 26B A4B (free)

By Google

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at a fraction of the compute cost. Supports multimodal input including text, images, and video (up to 60s at 1fps). Features a 256K token context window, native function calling, configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0.

Release Date

03 Apr 2026

Context Size

262.14K

Google: Gemma 4 31B (free)

By Google

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function calling, and multilingual support across 140+ languages. Strong on coding, reasoning, and document understanding tasks. Apache 2.0 license.

Release Date

02 Apr 2026

Context Size

262.14K

Qwen: Qwen3.6 Plus

By Qwen

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers major gains in agentic coding, front-end development, and overall reasoning, with a significantly improved “vibe coding” experience. The model excels at complex tasks such as 3D scenes, games, and repository-level problem solving, achieving a 78.8 score on SWE-bench Verified. It represents a substantial leap in both pure-text and multimodal capabilities, performing at the level of leading state-of-the-art models.

Release Date

02 Apr 2026

Context Size

1M

Z.ai: GLM 5V Turbo

By Z.ai

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively handles image, video, and text inputs, excels at long-horizon planning, complex coding, and task execution, and works seamlessly with agents to complete the full loop of “perceive → plan → execute”.

Release Date

01 Apr 2026

Context Size

202.75K

Arcee AI: Trinity Large Thinking

By arcee-ai

Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance in PinchBench, agentic workloads, and reasoning tasks. Launch video: https://youtu.be/Gc82AXLa0Rg?si=4RLn6WBz33qT--B7

Release Date

01 Apr 2026

Context Size

262.14K

xAI: Grok 4.20 Multi-Agent

By xAI

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information across complex tasks. Reasoning effort behavior:

- low / medium: 4 agents
- high / xhigh: 16 agents

Release Date

31 Mar 2026

Context Size

2M

xAI: Grok 4.20

By xAI

Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherence, delivering consistently precise and truthful responses. Reasoning can be enabled or disabled via the `enabled` field of the `reasoning` parameter in the API. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#controlling-reasoning-tokens)

Release Date

31 Mar 2026

Context Size

2M
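
For illustration, here is a minimal sketch of toggling reasoning through the `reasoning` parameter mentioned above, assuming an OpenRouter-style chat completions request; the model slug and the exact field names are assumptions, so check the linked docs before relying on them.

```python
import json
import urllib.request

# Hedged sketch of disabling reasoning for Grok 4.20 via the `reasoning` parameter.
# The model slug "x-ai/grok-4.20" is hypothetical and used only for illustration.
payload = {
    "model": "x-ai/grok-4.20",
    "messages": [{"role": "user", "content": "Summarize RFC 9110 in one line."}],
    "reasoning": {"enabled": False},  # set to True to re-enable reasoning tokens
}

req = urllib.request.Request(
    "https://openrouter.ai/api/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer <OPENROUTER_API_KEY>",
        "Content-Type": "application/json",
    },
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["choices"][0]["message"]["content"])
```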

Google: Lyria 3 Pro Preview

By Google

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz stereo audio from text prompts or from images. These models deliver structural coherence, including vocals, timed lyrics, and full instrumental arrangements. Lyria 3 Pro can generate full-length songs with verses, choruses, and bridges.

Release Date

30 Mar 2026

Context Size

1.05M

Google: Lyria 3 Clip Preview

By Google

30-second clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz stereo audio from text prompts or from images. These models deliver structural coherence, including vocals, timed lyrics, and full instrumental arrangements. Lyria 3 Clip can generate short clips, loops, and previews.

Release Date

30 Mar 2026

Context Size

1.05M

Qwen: Qwen3.6 Plus Preview

By Qwen

Qwen 3.6 Plus Preview is the next-generation evolution of the Qwen Plus series, featuring an advanced hybrid architecture that improves efficiency and scalability. It delivers stronger reasoning and more reliable agentic behavior compared to the 3.5 series. In benchmarks, it performs at or above leading state-of-the-art models. Designed as a flagship preview, it excels in agentic coding, front-end development, and complex problem-solving. Note: The model collects prompt and completion data that can be used to improve the model.

Release Date

30 Mar 2026

Context Size

1M

Alibaba: Wan 2.6

By alibaba

Alibaba's most advanced video generation model, supporting over 10 visual creation capabilities in a unified system. Wan 2.6 generates 1080p video at 24fps from text, images, reference videos, or audio, with native audio-visual synchronization and precise lip-sync. Key features include reference-to-video (insert a character's appearance and voice into new scenes), multi-shot storytelling from simple prompts, synchronized sound effects and music, and support for 16:9, 9:16, and 1:1 aspect ratios with clips up to 15 seconds.

Release Date

28 Mar 2026

Context Size

0

Kwaipilot: KAT-Coder-Pro V2

By kwaipilot

KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed for complex enterprise-grade software engineering and SaaS integration. It builds on the agentic coding strengths of earlier versions, with a focus on large-scale production environments, multi-system coordination, and seamless integration across modern software stacks, while also supporting web aesthetics generation to produce production-grade landing pages and presentation decks.

Release Date

27 Mar 2026

Context Size

256K

ByteDance: Seedance 1.5 Pro

By bytedance

ByteDance's next-generation audio-visual generation model with a 4.5B parameter Dual-Branch Diffusion Transformer architecture. Seedance 1.5 Pro generates video and audio simultaneously in a single unified pass — eliminating the timing issues of sequential audio dubbing. Supports multi-language lip-sync (English, Mandarin, Japanese, Korean, Spanish, and more), cinematic camera control (pan, tilt, zoom, orbit), multi-character dialogue, and character consistency across shots. Produces clips from 4–12 seconds at up to 1080p. The number of tokens is given by (height of output video * width of output video * duration * 24) / 1024

Release Date

23 Mar 2026

Context Size

0

OpenAI: Sora 2 Pro

By OpenAI

OpenAI's flagship video generation model, delivering production-quality video with physics-accurate motion, synchronized audio, and world-state persistence across shots. Sora 2 Pro follows intricate multi-shot instructions while maintaining consistent spatial relationships — objects don't disappear or change shape between cuts. Supports text-to-video and image-to-video, with synchronized background soundscapes, speech, and sound effects. Includes advanced content safety with C2PA metadata provenance and SynthID-style watermarking.

Release Date

23 Mar 2026

Context Size

0

Google: Veo 3.1

By Google

Google's state-of-the-art video generation model, built for maximum visual fidelity in final production cuts. Veo 3.1 generates high-quality 1080p video from text or image prompts with native synchronized audio — including dialogue, ambient effects, and background sound. Supports scene extension (up to 20 chained clips for 140+ second narratives), frames-to-video transitions between two images, vertical video for Shorts, and 4K upscaling.

Release Date

23 Mar 2026

Context Size

0

Reka Edge

By rekaai

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimized specifically to deliver industry-leading performance in image understanding, video analysis, object detection, and agentic tool-use.

Release Date

20 Mar 2026

Context Size

16.38K

Xiaomi: MiMo-V2-Omni

By Xiaomi

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capabilities (visual grounding, multi-step planning, tool use, and code execution), making it well-suited for complex real-world tasks that span modalities. It offers a 256K context window.

Release Date

18 Mar 2026

Context Size

262.14K

Xiaomi: MiMo-V2-Pro

By Xiaomi

MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like OpenClaw. It ranks among the global top tier in the standard PinchBench and ClawBench benchmarks, with perceived performance approaching that of Opus 4.6. MiMo-V2-Pro is designed to serve as the brain of agent systems, orchestrating complex workflows, driving production engineering tasks, and delivering results reliably.

Release Date

18 Mar 2026

Context Size

1.05M

MiniMax: MiniMax M2.7

By MiniMax

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participate in its own evolution, M2.7 integrates advanced agentic capabilities through multi-agent collaboration, enabling it to plan, execute, and refine complex tasks across dynamic environments. Trained for production-grade performance, M2.7 handles workflows such as live debugging, root cause analysis, financial modeling, and full document generation across Word, Excel, and PowerPoint. It delivers strong results on benchmarks including 56.2% on SWE-Pro and 57.0% on Terminal Bench 2, while achieving a 1495 ELO on GDPval-AA, setting a new standard for multi-agent systems operating in real-world digital workflows.

Release Date

18 Mar 2026

Context Size

196.61K

OpenAI: GPT-5.4 Nano

By OpenAI

GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and high-volume tasks. It supports text and image inputs and is designed for low-latency use cases such as classification, data extraction, ranking, and sub-agent execution. The model prioritizes responsiveness and efficiency over deep reasoning, making it ideal for pipelines that require fast, reliable outputs at scale. GPT-5.4 nano is well suited for background tasks, real-time systems, and distributed agent architectures where minimizing cost and latency is essential.

Release Date

17 Mar 2026

Context Size

400K
