Best LLMs for Roleplay 2026 | Creative Roleplaying AI Rankings
Top LLMs for roleplay, character simulation, storytelling, and creative interactions. Real-time leaderboard focused on creativity and consistency.

DeepSeek: DeepSeek V3.2
by DeepSeek
•131.07K tokens
DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism that reduces training and inference cost while preserving quality in long-context scenarios. A scalable reinforcement learning post-training framework further improves reasoning, with reported performance in the GPT-5 class, and the model has demonstrated gold-medal results on the 2025 IMO and IOI. V3.2 also uses a large-scale agentic task synthesis pipeline to better integrate reasoning into tool-use settings, boosting compliance and generalization in interactive environments. Users can control the reasoning behaviour with the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)

DeepSeek: DeepSeek V4 Flash (free)
by DeepSeek
•1.05M tokens
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and high-throughput workloads, while maintaining strong reasoning and coding performance. The model includes hybrid attention for efficient long-context processing. Reasoning efforts `high` and `xhigh` are supported; `xhigh` maps to max reasoning. It is well suited for applications such as coding assistants, chat systems, and agent workflows where responsiveness and cost efficiency are important.


DeepSeek: DeepSeek V4 Pro
by DeepSeek
•1.05M tokens
DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding, and long-horizon agent workflows, with strong performance across knowledge, math, and software engineering benchmarks. Built on the same architecture as DeepSeek V4 Flash, it introduces a hybrid attention system for efficient long-context processing. Reasoning efforts `high` and `xhigh` are supported; `xhigh` maps to max reasoning. It is well suited for complex workloads such as full-codebase analysis, multi-step automation, and large-scale information synthesis, where both capability and efficiency are critical.

4

Z.ai: GLM 4.5 Air (free)
by Z.ai
131.07K tokens
5
Google: Gemini 3 Flash Preview
by Google
1.05M tokens
6
Google: Gemini 2.5 Flash Lite
by Google
1.05M tokens
7
Google: Gemma 4 31B (free)
by Google
262.14K tokens
8

OpenAI: gpt-oss-120b (free)
by OpenAI
131.07K tokens
9
Owl Alpha
by OpenRouter
1.05M tokens