Best LLMs for Roleplay 2026 | Creative Roleplaying AI Rankings

Top LLMs for roleplay, character simulation, storytelling, and creative interactions. Real-time leaderboard focused on creativity and consistency.

DeepSeek: DeepSeek V3.2

DeepSeek: DeepSeek V3.2

by DeepSeek

131.07K tokens

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism that reduces training and inference cost while preserving quality in long-context scenarios. A scalable reinforcement learning post-training framework further improves reasoning, with reported performance in the GPT-5 class, and the model has demonstrated gold-medal results on the 2025 IMO and IOI. V3.2 also uses a large-scale agentic task synthesis pipeline to better integrate reasoning into tool-use settings, boosting compliance and generalization in interactive environments. Users can control the reasoning behaviour with the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)

Position Medals
DeepSeek: DeepSeek V4 Flash

DeepSeek: DeepSeek V4 Flash

by DeepSeek

1.05M tokens

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and high-throughput workloads, while maintaining strong reasoning and coding performance. The model includes hybrid attention for efficient long-context processing. Reasoning efforts `high` and `xhigh` are supported; `xhigh` maps to max reasoning. It is well suited for applications such as coding assistants, chat systems, and agent workflows where responsiveness and cost efficiency are important.

Position Medals
Z.ai: GLM 4.5 Air (free)

Z.ai: GLM 4.5 Air (free)

by Z.ai

131.07K tokens

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter size. GLM-4.5-Air also supports hybrid inference modes, offering a "thinking mode" for advanced reasoning and tool use, and a "non-thinking mode" for real-time interaction. Users can control the reasoning behaviour with the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)

Position Medals

4

xAI: Grok 4.1 Fast

xAI: Grok 4.1 Fast

by xAI

2M tokens

5

DeepSeek: DeepSeek V4 Pro

DeepSeek: DeepSeek V4 Pro

by DeepSeek

1.05M tokens

6

Google: Gemini 2.5 Flash Lite

Google: Gemini 2.5 Flash Lite

by Google

1.05M tokens

7

Google: Gemma 4 31B (free)

Google: Gemma 4 31B (free)

by Google

262.14K tokens

8

Google: Gemini 3 Flash Preview

Google: Gemini 3 Flash Preview

by Google

1.05M tokens

9

OpenAI: gpt-oss-120b (free)

OpenAI: gpt-oss-120b (free)

by OpenAI

131.07K tokens