List of All LLM Models

Discover and compare 700+ large language models with real-time rankings, benchmarks, and community votes.

Microsoft: Phi-3 Medium 4K Instruct

By Microsoft

Phi-3 Medium 4K is a powerful 14-billion-parameter model designed for advanced language understanding, reasoning, and instruction following. Optimized through supervised fine-tuning and preference adjustments, it excels in tasks involving common sense, mathematics, logical reasoning, and code processing. At time of release, Phi-3 Medium demonstrated state-of-the-art performance among lightweight models. In the MMLU-Pro eval, the model even comes close to Llama3 70B-level performance. For 128k context length, try [Phi-3 Medium 128K](/models/microsoft/phi-3-medium-128k-instruct).

Release Date

15 Jun 2024

Context Size

4K

StarCoder2 15B Instruct

By BigCode

StarCoder2 15B Instruct excels in coding-related tasks, primarily in Python. It is the first self-aligned open-source LLM developed by BigCode. This model was fine-tuned without any human annotations or distilled data from proprietary LLMs. The base model uses [Grouped Query Attention](https://arxiv.org/abs/2305.13245) and was trained using the [Fill-in-the-Middle](https://arxiv.org/abs/2207.14255) objective on 4+ trillion tokens.
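
As a concrete illustration of the Fill-in-the-Middle setup, here is a minimal Python sketch using Hugging Face `transformers` against the base model. The `<fim_prefix>`/`<fim_suffix>`/`<fim_middle>` sentinel tokens follow the StarCoder convention; verify them against the tokenizer of the exact checkpoint you use.

```python
# Minimal sketch of Fill-in-the-Middle prompting with a StarCoder-style
# model. Sentinel token names follow the StarCoder convention; check the
# tokenizer config of your checkpoint before relying on them.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-15b"  # base model, not the Instruct variant
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Ask the model to fill in the function body between prefix and suffix.
prompt = (
    "<fim_prefix>def mean(xs):\n"
    "<fim_suffix>\n    return total / len(xs)\n"
    "<fim_middle>"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```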

Release Date

09 Jun 2024

Context Size

16.38K

Dolphin 2.9.2 Mixtral 8x22B 🐬

By Cognitive Computations

Dolphin 2.9.2 is designed for instruction following, conversation, and coding. This model is a finetune of [Mixtral 8x22B Instruct](/models/mistralai/mixtral-8x22b-instruct). It features a 64k context length and was fine-tuned with a 16k sequence length using ChatML templates. This model is a successor to [Dolphin Mixtral 8x7B](/models/cognitivecomputations/dolphin-mixtral-8x7b). The model is uncensored, with alignment and bias filtering stripped out, and therefore requires an external alignment layer for ethical use. Users are cautioned to use this highly compliant model responsibly, as detailed in a blog post about uncensored models at [erichartford.com/uncensored-models](https://erichartford.com/uncensored-models). #moe #uncensored
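
Because the fine-tune used ChatML templates, prompts for this model are typically assembled in the ChatML turn format. A minimal sketch (the system message shown is illustrative):

```python
# Minimal sketch of the ChatML turn format used for Dolphin fine-tunes.
# The system message is up to the user; this one is illustrative.
def chatml_prompt(system: str, user: str) -> str:
    """Assemble a single-turn ChatML prompt string."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(chatml_prompt("You are Dolphin, a helpful assistant.",
                    "Write a haiku about whales."))
```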

Release Date

08 Jun 2024

Context Size

65.54K

Qwen 2 72B Instruct

By Qwen

Qwen2 72B is a transformer-based model that excels in language understanding, multilingual capabilities, coding, mathematics, and reasoning. It features SwiGLU activation, attention QKV bias, and group query attention. It was pretrained on extensive data and further post-trained with supervised finetuning and direct preference optimization. For more details, see this [blog post](https://qwenlm.github.io/blog/qwen2/) and [GitHub repo](https://github.com/QwenLM/Qwen2). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
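
For readers unfamiliar with SwiGLU, the gated feed-forward block mentioned above can be sketched in a few lines of PyTorch; the dimensions here are illustrative, not Qwen2's actual sizes.

```python
# Minimal PyTorch sketch of a SwiGLU feed-forward block, the activation
# style mentioned above. Dimensions are illustrative, not Qwen2's real ones.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    def __init__(self, dim: int = 1024, hidden: int = 2730):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)  # gating branch
        self.up = nn.Linear(dim, hidden, bias=False)    # linear branch
        self.down = nn.Linear(hidden, dim, bias=False)  # project back down

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SiLU-gated linear unit: silu(gate(x)) elementwise-times up(x)
        return self.down(F.silu(self.gate(x)) * self.up(x))

x = torch.randn(2, 8, 1024)
print(SwiGLU()(x).shape)  # torch.Size([2, 8, 1024])
```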

Release Date

07 Jun 2024

Context Size

32.77K

OpenChat 3.6 8B

By OpenChat

OpenChat 8B is part of a family of open-source language models fine-tuned with C-RLFT (Conditioned Reinforcement Learning Fine-Tuning), a strategy inspired by offline reinforcement learning. It was trained on mixed-quality data without preference labels and outperforms many similarly sized models, including [Llama 3 8B Instruct](/models/meta-llama/llama-3-8b-instruct) and various fine-tuned models. It excels in general conversation, coding assistance, and mathematical reasoning.

- For OpenChat fine-tuned on Mistral 7B, check out [OpenChat 7B](/models/openchat/openchat-7b).
- For OpenChat fine-tuned on Llama 8B, check out [OpenChat 8B](/models/openchat/openchat-8b).

#open-source

Release Date

01 Jun 2024

Context Size

8.19K

NousResearch: Hermes 2 Pro - Llama-3 8B

By Nous Research

Hermes 2 Pro is an upgraded, retrained version of Nous Hermes 2, trained on an updated and cleaned version of the OpenHermes 2.5 Dataset, as well as a newly introduced Function Calling and JSON Mode dataset developed in-house.
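
Per the model card, the function-calling fine-tune emits calls inside `<tool_call>` tags containing a JSON payload. A rough parsing sketch follows; the tag convention is worth verifying against the version you deploy, and the `get_weather` call is illustrative.

```python
# Rough sketch of extracting a Hermes-style tool call from model output.
# The <tool_call> tag convention follows the model card; verify it against
# the checkpoint you actually deploy.
import json
import re

def parse_tool_calls(text: str) -> list[dict]:
    """Return the JSON payloads of any <tool_call>...</tool_call> blocks."""
    blocks = re.findall(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", text, re.DOTALL)
    return [json.loads(b) for b in blocks]

sample = (
    'I will look that up.\n<tool_call>\n'
    '{"name": "get_weather", "arguments": {"city": "Berlin"}}\n'
    '</tool_call>'
)
print(parse_tool_calls(sample))
# [{'name': 'get_weather', 'arguments': {'city': 'Berlin'}}]
```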

Release Date

27 May 2024

Context Size

8.19K

Mistral: Mistral 7B Instruct v0.3

By Mistral AI

A high-performing, industry-standard 7.3B parameter model, with optimizations for speed and context length. An improved version of [Mistral 7B Instruct v0.2](/models/mistralai/mistral-7b-instruct-v0.2), with the following changes:

- Extended vocabulary to 32768
- Supports v3 Tokenizer
- Supports function calling

NOTE: Support for function calling depends on the provider.
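
Where a provider does expose function calling, requests typically use the OpenAI-compatible `tools` schema, as in the minimal sketch below. The `base_url`, model slug, and `get_weather` tool are all illustrative.

```python
# Minimal sketch of a function-calling request against an OpenAI-compatible
# endpoint serving this model. Whether the provider honors `tools` varies;
# the base_url and tool schema here are illustrative.
from openai import OpenAI

client = OpenAI(base_url="https://example-provider/v1", api_key="...")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="mistralai/mistral-7b-instruct-v0.3",
    messages=[{"role": "user", "content": "Weather in Paris?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```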

Release Date

27 May 2024

Context Size

32.77K

Mistral: Mistral 7B Instruct

By Mistral AI

A high-performing, industry-standard 7.3B parameter model, with optimizations for speed and context length. *Mistral 7B Instruct has multiple version variants, and this is intended to be the latest version.*

Release Date

27 May 2024

Context Size

32.77K

Microsoft: Phi-3 Mini 128K Instruct

By Microsoft

Phi-3 Mini is a powerful 3.8B-parameter model designed for advanced language understanding, reasoning, and instruction following. Optimized through supervised fine-tuning and preference adjustments, it excels in tasks involving common sense, mathematics, logical reasoning, and code processing. At time of release, Phi-3 Mini demonstrated state-of-the-art performance among lightweight models. This model is static, trained on an offline dataset with an October 2023 cutoff date.

Release Date

26 May 2024

Context Size

128K

Microsoft: Phi-3 Medium 128K Instruct

By Microsoft

Phi-3 Medium 128K is a powerful 14-billion-parameter model designed for advanced language understanding, reasoning, and instruction following. Optimized through supervised fine-tuning and preference adjustments, it excels in tasks involving common sense, mathematics, logical reasoning, and code processing. At time of release, Phi-3 Medium demonstrated state-of-the-art performance among lightweight models. In the MMLU-Pro eval, the model even comes close to Llama3 70B-level performance. For 4k context length, try [Phi-3 Medium 4K](/models/microsoft/phi-3-medium-4k-instruct).

Release Date

24 May 2024

Context Size

128K

NeverSleep: Llama 3 Lumimaid 70B

By NeverSleep

The NeverSleep team is back, with a Llama 3 70B finetune trained on their curated roleplay data. Striking a balance between eRP and RP, Lumimaid was designed to be serious, yet uncensored when necessary. To enhance its overall intelligence and chat capability, roughly 40% of the training data was not roleplay. This provides a breadth of knowledge to access, while still keeping roleplay as the primary strength. Usage of this model is subject to [Meta's Acceptable Use Policy](https://llama.meta.com/llama3/use-policy/).

Release Date

16 May 2024

Context Size

8.19K

Perplexity: Llama3 Sonar 70B

By Perplexity

Llama3 Sonar is Perplexity's latest model family. It surpasses their earlier Sonar models in cost-efficiency, speed, and performance. This is a normal offline LLM, but the [online version](/models/perplexity/llama-3-sonar-large-32k-online) of this model has Internet access.

Release Date

14 May 2024

Context Size

32.77K

Perplexity: Llama3 Sonar 8B Online

By Perplexity

Llama3 Sonar is Perplexity's latest model family. It surpasses their earlier Sonar models in cost-efficiency, speed, and performance. This is the online version of the [offline chat model](/models/perplexity/llama-3-sonar-small-32k-chat). It is focused on delivering helpful, up-to-date, and factual responses. #online

Release Date

14 May 2024

Context Size

28K

Perplexity: Llama3 Sonar 8B

By Perplexity

Llama3 Sonar is Perplexity's latest model family. It surpasses their earlier Sonar models in cost-efficiency, speed, and performance. This is a normal offline LLM, but the [online version](/models/perplexity/llama-3-sonar-small-32k-online) of this model has Internet access.

Release Date

14 May 2024

Context Size

32.77K

DeepSeek V2.5

By DeepSeek

DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct, integrating the general and coding abilities of the two previous versions. For more information, see the [DeepSeek-V2 page](https://github.com/deepseek-ai/DeepSeek-V2).

Release Date

14 May 2024

Context Size

128K

Google: Gemini 1.5 Flash

By Google

Gemini 1.5 Flash is a foundation model that performs well at a variety of multimodal tasks such as visual understanding, classification, summarization, and creating content from image, audio and video. It's adept at processing visual and text inputs such as photographs, documents, infographics, and screenshots. Gemini 1.5 Flash is designed for high-volume, high-frequency tasks where cost and latency matter. On most common tasks, Flash achieves comparable quality to other Gemini Pro models at a significantly reduced cost. Flash is well-suited for applications like chat assistants and on-demand content generation where speed and scale matter. Usage of Gemini is subject to Google's [Gemini Terms of Use](https://ai.google.dev/terms). #multimodal

Release Date

14 May 2024

Context Size

1M

Perplexity: Llama3 Sonar 70B Online

By Perplexity

Llama3 Sonar is Perplexity's latest model family. It surpasses their earlier Sonar models in cost-efficiency, speed, and performance. This is the online version of the [offline chat model](/models/perplexity/llama-3-sonar-large-32k-chat). It is focused on delivering helpful, up-to-date, and factual responses. #online

Release Date

14 May 2024

Context Size

28K

Meta: Llama 3 8B (Base)

By Meta Llama

Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This is the base 8B pre-trained version. It has demonstrated strong performance compared to leading closed-source models in human evaluations. To read more about the model release, [click here](https://ai.meta.com/blog/meta-llama-3/). Usage of this model is subject to [Meta's Acceptable Use Policy](https://llama.meta.com/llama3/use-policy/).

Release Date

13 May 2024

Context Size

8.19K

Meta: LlamaGuard 2 8B

By Meta Llama

This safeguard model has 8B parameters and is based on the Llama 3 family. Just like its predecessor, [LlamaGuard 1](https://huggingface.co/meta-llama/LlamaGuard-7b), it can do both prompt and response classification. LlamaGuard 2 acts as a normal LLM would, generating text that indicates whether the given input/output is safe or unsafe; if deemed unsafe, it will also list the content categories violated. For best results, use raw prompt input or the `/completions` endpoint instead of the chat API. To read more about the model release, [click here](https://ai.meta.com/blog/meta-llama-3/). Usage of this model is subject to [Meta's Acceptable Use Policy](https://llama.meta.com/llama3/use-policy/).
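
A sketch of the recommended raw-prompt usage against an OpenAI-compatible `/completions` endpoint follows. The endpoint, model slug, and the abbreviated template are illustrative; take the full moderation template (including the category list) from Meta's model card.

```python
# Sketch of classifying a prompt with LlamaGuard 2 via a completions-style
# endpoint, per the advice above. The template below is abbreviated for
# illustration; use the full prompt from Meta's model card in practice.
from openai import OpenAI

client = OpenAI(base_url="https://example-provider/v1", api_key="...")

GUARD_TEMPLATE = """[INST] Task: Check if there is unsafe content in 'User' \
messages in conversations according to our safety policy.

<BEGIN CONVERSATION>

User: {message}

<END CONVERSATION>

Provide your safety assessment for the above conversation. [/INST]"""

completion = client.completions.create(
    model="meta-llama/llama-guard-2-8b",
    prompt=GUARD_TEMPLATE.format(message="How do I bake sourdough bread?"),
    max_tokens=16,
)
print(completion.choices[0].text.strip())  # "safe", or "unsafe" + category codes
```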

Release Date

13 May 2024

Context Size

8.19K

Meta: Llama 3 70B (Base)

By Meta Llama

Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This is the base 70B pre-trained version. It has demonstrated strong performance compared to leading closed-source models in human evaluations. To read more about the model release, [click here](https://ai.meta.com/blog/meta-llama-3/). Usage of this model is subject to [Meta's Acceptable Use Policy](https://llama.meta.com/llama3/use-policy/).

Release Date

13 May 2024

Context Size

8.19K

OpenAI: GPT-4o (2024-05-13)

By OpenAI

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of [GPT-4 Turbo](/models/openai/gpt-4-turbo) while being twice as fast and 50% more cost-effective. GPT-4o also offers improved performance in processing non-English languages and enhanced visual capabilities. For benchmarking against other models, it was briefly called ["im-also-a-good-gpt2-chatbot"](https://twitter.com/LiamFedus/status/1790064963966370209). #multimodal
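
A minimal sketch of a combined text-and-image request using the OpenAI Python SDK (the image URL is a placeholder):

```python
# Minimal sketch of a text+image request to GPT-4o using the OpenAI SDK.
# The image URL is a placeholder.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```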

Release Date

13 May 2024

Context Size

128K

OpenAI: GPT-4o

By OpenAI

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of [GPT-4 Turbo](/models/openai/gpt-4-turbo) while being twice as fast and 50% more cost-effective. GPT-4o also offers improved performance in processing non-English languages and enhanced visual capabilities. For benchmarking against other models, it was briefly called ["im-also-a-good-gpt2-chatbot"](https://twitter.com/LiamFedus/status/1790064963966370209). #multimodal

Release Date

13 May 2024

Context Size

128K

LLaVA v1.6 34B

By Haotian Liu

LLaVA Yi 34B is an open-source model trained by fine-tuning an LLM on multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture. Base LLM: [NousResearch/Nous-Hermes-2-Yi-34B](/models/nousresearch/nous-hermes-yi-34b). It was trained in December 2023.

Release Date

11 May 2024

Context Size

4.10K

OLMo 7B Instruct

By Ai2

OLMo 7B Instruct by the Allen Institute for AI is a model finetuned for question answering. It demonstrates **notable performance** across multiple benchmarks including TruthfulQA and ToxiGen.

**Open Source**: The model, its code, checkpoints, and logs are released under the [Apache 2.0 license](https://choosealicense.com/licenses/apache-2.0).

- [Core repo (training, inference, fine-tuning etc.)](https://github.com/allenai/OLMo)
- [Evaluation code](https://github.com/allenai/OLMo-Eval)
- [Further fine-tuning code](https://github.com/allenai/open-instruct)
- [Paper](https://arxiv.org/abs/2402.00838)
- [Technical blog post](https://blog.allenai.org/olmo-open-language-model-87ccfc95f580)
- [W&B Logs](https://wandb.ai/ai2-llm/OLMo-7B/reports/OLMo-7B--Vmlldzo2NzQyMzk5)

Release Date

10 May 2024

Context Size

2.05K

Qwen 1.5 110B Chat

By Qwen

Qwen1.5 110B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previously released Qwen, the improvements include:

- Significant performance improvement in human preference for chat models
- Multilingual support of both base and chat models
- Stable support of 32K context length for models of all sizes

For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).

Release Date

09 May 2024

Context Size

32.77K

Qwen 1.5 14B Chat

By Qwen

Qwen1.5 14B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previously released Qwen, the improvements include:

- Significant performance improvement in human preference for chat models
- Multilingual support of both base and chat models
- Stable support of 32K context length for models of all sizes

For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).

Release Date

09 May 2024

Context Size

32.77K

Qwen 1.5 32B Chat

By Qwen

Qwen1.5 32B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previously released Qwen, the improvements include:

- Significant performance improvement in human preference for chat models
- Multilingual support of both base and chat models
- Stable support of 32K context length for models of all sizes

For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).

Release Date

09 May 2024

Context Size

32.77K

Qwen 1.5 4B Chat

By Qwen

Qwen1.5 4B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previously released Qwen, the improvements include:

- Significant performance improvement in human preference for chat models
- Multilingual support of both base and chat models
- Stable support of 32K context length for models of all sizes

For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).

Release Date

09 May 2024

Context Size

32.77K

Qwen 1.5 72B Chat

By Qwen

Qwen1.5 72B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previously released Qwen, the improvements include:

- Significant performance improvement in human preference for chat models
- Multilingual support of both base and chat models
- Stable support of 32K context length for models of all sizes

For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).

Release Date

09 May 2024

Context Size

32.77K

Qwen 1.5 7B Chat

By Qwen

Qwen1.5 7B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previously released Qwen, the improvements include:

- Significant performance improvement in human preference for chat models
- Multilingual support of both base and chat models
- Stable support of 32K context length for models of all sizes

For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).

Release Date

09 May 2024

Context Size

32.77K

Showing page 21 of 25 with 737 models total