List of All LLM Models

Discover and compare 500+ large language models with real-time rankings, benchmarks, and community votes.

Anthropic: Claude v2.1

By Anthropic

Claude 2.1 delivers advancements in key capabilities for enterprises, including an industry-leading 200K token context window, significant reductions in rates of model hallucination, system prompts, and a new beta feature: tool use.

Release Date

22 Nov 2023

Context Size

200K

Anthropic: Claude Instant v1.1

By Anthropic

Anthropic's model for low-latency, high throughput text generation. Supports hundreds of pages of text.

Release Date

22 Nov 2023

Context Size

100K

OpenHermes 2.5 Mistral 7B

By Teknium

A continuation of the [OpenHermes 2 model](/models/teknium/openhermes-2-mistral-7b), trained on additional code datasets. Potentially the most interesting finding from training on a good ratio of code instruction (estimated at around 7-14% of the total dataset) was that it boosted several non-code benchmarks, including TruthfulQA, AGIEval, and the GPT4All suite. It did, however, reduce the BigBench score, but the net gain overall is significant.

Release Date

20 Nov 2023

Context Size

4.10K

LLaVA 13B

By Haotian Liu

LLaVA is a large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding, achieving impressive chat capabilities and setting a new state-of-the-art accuracy on Science QA. #multimodal

Release Date

16 Nov 2023

Context Size

2.05K

Nous: Capybara 34B

By Nous Research

This model is trained on the Yi-34B model for 3 epochs on the Capybara dataset. It's the first 34B Nous model and first 200K context length Nous model.

Release Date

15 Nov 2023

Context Size

200K

OpenAI: GPT-4 Vision

By OpenAI

Ability to understand images, in addition to all other [GPT-4 Turbo capabilities](/models/openai/gpt-4-turbo). Training data: up to Apr 2023. **Note:** heavily rate limited by OpenAI while in preview. #multimodal

Release Date

13 Nov 2023

Context Size

128K

lzlv 70B

By lizpreciatior

A Mythomax/MLewd_13B-style merge of selected 70B models. A multi-model merge of several LLaMA2 70B finetunes for roleplaying and creative work. The goal was to create a model that combines creativity with intelligence for an enhanced experience. #merge #uncensored

Release Date

12 Nov 2023

Context Size

4.10K

Toppy M 7B

By Undi

A wild 7B parameter model that merges several models using the new task_arithmetic merge method from mergekit. List of merged models:

- NousResearch/Nous-Capybara-7B-V1.9
- [HuggingFaceH4/zephyr-7b-beta](/models/huggingfaceh4/zephyr-7b-beta)
- lemonilia/AshhLimaRP-Mistral-7B
- Vulkane/120-Days-of-Sodom-LoRA-Mistral-7b
- Undi95/Mistral-pippa-sharegpt-7b-qlora

#merge #uncensored

Release Date

10 Nov 2023

Context Size

4.10K
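The Toppy M 7B entry above mentions a task_arithmetic merge. As a rough illustration of the general idea only (not mergekit's actual implementation), task arithmetic adds scaled differences between each fine-tuned model and a shared base model back onto the base weights. The function name, the toy weight dictionaries, and the scale values below are all hypothetical.

```python
from typing import Dict, List

# Toy stand-in for model weights: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def task_arithmetic_merge(
    base: Weights, finetuned: List[Weights], scales: List[float]
) -> Weights:
    """Sketch of task arithmetic: base + sum(scale_i * (finetuned_i - base)).

    Hypothetical helper for illustration; real merges operate on tensors.
    """
    if len(finetuned) != len(scales):
        raise ValueError("need exactly one scale per fine-tuned model")
    merged: Weights = {}
    for name, base_vals in base.items():
        out = list(base_vals)  # start from a copy of the base weights
        for model, scale in zip(finetuned, scales):
            # Add this model's scaled "task vector" (its delta from base).
            for i, (ft, b) in enumerate(zip(model[name], base_vals)):
                out[i] += scale * (ft - b)
        merged[name] = out
    return merged


# Made-up numbers, chosen only to make the arithmetic easy to follow.
base = {"w": [1.0, 2.0]}
model_a = {"w": [2.0, 2.0]}
model_b = {"w": [1.0, 4.0]}
merged = task_arithmetic_merge(base, [model_a, model_b], [0.5, 0.5])
print(merged)  # {'w': [1.5, 3.0]}
```

Each fine-tune contributes only what it changed relative to the base, so parameters that a fine-tune left untouched do not dilute the others.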

Goliath 120B

By Alpindale

A large LLM created by combining two fine-tuned Llama 70B models into one 120B model. Combines Xwin and Euryale. Credits to:

- [@chargoddard](https://huggingface.co/chargoddard) for developing the framework used to merge the model, [mergekit](https://github.com/cg123/mergekit).
- [@Undi95](https://huggingface.co/Undi95) for helping with the merge ratios.

#merge

Release Date

10 Nov 2023

Context Size

6.14K

Auto Router

By OpenRouter

Your prompt will be processed by a meta-model and routed to one of dozens of models (see below), optimizing for the best possible output.

To see which model was used, visit [Activity](/activity), or read the `model` attribute of the response. Your response will be priced at the same rate as the routed model.

Learn more, including how to customize the models for routing, in our [docs](/docs/guides/routing/routers/auto-router).

Release Date

08 Nov 2023

Context Size

2M
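The Auto Router entry above says the routed model can be read from the `model` attribute of the response. A minimal sketch of that check follows, assuming the response body is a JSON object with a top-level `model` field; the helper name, the sample payload, and the model identifier in it are made up for illustration and are not taken from the API documentation.

```python
import json


def routed_model(response_body: str) -> str:
    """Return the `model` attribute from a JSON response body.

    Assumes a top-level string field named `model`; raises if it is absent.
    """
    payload = json.loads(response_body)
    model = payload.get("model")
    if not isinstance(model, str) or not model:
        raise ValueError("response has no `model` attribute")
    return model


# Hypothetical response body; the values are placeholders, not real output.
sample = json.dumps({"model": "example-vendor/example-model", "choices": []})
print(routed_model(sample))  # example-vendor/example-model
```

Logging this value alongside each request is one way to reconcile per-request cost later, since the entry states that pricing follows the routed model.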

OpenAI: GPT-3.5 Turbo 16k (older v1106)

By OpenAI

An older GPT-3.5 Turbo model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Sep 2021.

Release Date

06 Nov 2023

Context Size

16.39K

OpenAI: GPT-4 Turbo (older v1106)

By OpenAI

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling. Training data: up to April 2023.

Release Date

06 Nov 2023

Context Size

128K

Google: PaLM 2 Code Chat 32k

By Google

PaLM 2 fine-tuned for chatbot conversations that help with code-related questions.

Release Date

03 Nov 2023

Context Size

32.76K

Google: PaLM 2 Chat 32k

By Google

PaLM 2 is a language model by Google with improved multilingual, reasoning and coding capabilities.

Release Date

03 Nov 2023

Context Size

32.76K

OpenHermes 2 Mistral 7B

By Teknium

Trained on 900k instructions, surpasses all previous versions of Hermes 13B and below, and matches 70B on some benchmarks. Hermes 2 has strong multiturn chat skills and system prompt capabilities.

Release Date

01 Nov 2023

Context Size

8.19K

Mistral OpenOrca 7B

By OpenOrca

A fine-tune of Mistral using the OpenOrca dataset. First 7B model to beat all other models <30B.

Release Date

30 Oct 2023

Context Size

8.19K

Airoboros 70B

By Jon Durbin

A Llama 2 70B fine-tune using synthetic data (the Airoboros dataset). Currently based on [jondurbin/airoboros-l2-70b](https://huggingface.co/jondurbin/airoboros-l2-70b-2.2.1), but might get updated in the future.

Release Date

29 Oct 2023

Context Size

4.10K

Nous: Hermes 70B

By Nous Research

A state-of-the-art language model fine-tuned on over 300k instructions by Nous Research, with Teknium and Emozilla leading the fine-tuning process.

Release Date

20 Oct 2023

Context Size

4.10K

Xwin 70B

By xwin-lm

Xwin-LM aims to develop and open-source alignment tech for LLMs. Our first release, built upon the Llama2 base models, ranked TOP-1 on AlpacaEval. Notably, it's the first to surpass GPT-4 on this benchmark. The project will be continuously updated.

Release Date

15 Oct 2023

Context Size

8.19K

OpenAI: GPT-3.5 Turbo Instruct

By OpenAI

This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omitting chat-related optimizations. Training data: up to Sep 2021.

Release Date

28 Sep 2023

Context Size

4.09K

Mistral: Mistral 7B Instruct v0.1

By Mistral AI

A 7.3B parameter model that outperforms Llama 2 13B on all benchmarks, with optimizations for speed and context length.

Release Date

28 Sep 2023

Context Size

2.82K

Synthia 70B

By Migel Tissera

SynthIA (Synthetic Intelligent Agent) is a Llama-2 70B model trained on Orca-style datasets. It has been fine-tuned for instruction following and for long-form conversations.

Release Date

22 Sep 2023

Context Size

8.19K

Pygmalion: Mythalion 13B

By Pygmalion

A blend of the new Pygmalion-13b and MythoMax. #merge

Release Date

02 Sep 2023

Context Size

8.19K

OpenAI: GPT-4 32k

By OpenAI

GPT-4-32k is an extended version of GPT-4, with the same capabilities but quadrupled context length, allowing for processing up to 40 pages of text in a single pass. This is particularly beneficial for handling longer content like interacting with PDFs without an external vector database. Training data: up to Sep 2021.

Release Date

28 Aug 2023

Context Size

32.77K

OpenAI: GPT-4 32k (older v0314)

By OpenAI

GPT-4-32k is an extended version of GPT-4, with the same capabilities but quadrupled context length, allowing for processing up to 40 pages of text in a single pass. This is particularly beneficial for handling longer content like interacting with PDFs without an external vector database. Training data: up to Sep 2021.

Release Date

28 Aug 2023

Context Size

32.77K

OpenAI: GPT-3.5 Turbo 16k

By OpenAI

This model offers four times the context length of gpt-3.5-turbo, allowing it to support approximately 20 pages of text in a single request at a higher cost. Training data: up to Sep 2021.

Release Date

28 Aug 2023

Context Size

16.39K

Meta: CodeLlama 34B Instruct

By Meta Llama

Code Llama is built upon Llama 2 and excels at filling in code, handling extensive input contexts, and following programming instructions across a range of programming tasks without task-specific training.

Release Date

20 Aug 2023

Context Size

8.19K

Nous: Hermes 13B

By Nous Research

A state-of-the-art language model fine-tuned on over 300k instructions by Nous Research, with Teknium and Emozilla leading the fine-tuning process.

Release Date

20 Aug 2023

Context Size

4.10K

Phind: CodeLlama 34B v2

By Phind

A fine-tune of CodeLlama-34B on an internal dataset that helps it exceed GPT-4 on some benchmarks, including HumanEval.

Release Date

20 Aug 2023

Context Size

4.10K

Mancer: Weaver (alpha)

By Mancer

An attempt to recreate Claude-style verbosity, but don't expect the same level of coherence or memory. Meant for use in roleplay/narrative situations.

Release Date

02 Aug 2023

Context Size

8K

Showing page 24 of 25 with 737 models total