AI Model Leaderboard

Compare 107+ large language models side by side across five key benchmarks: MMLU, HumanEval, Math, Reasoning, and Coding. Find the best model for your use case, sorted by overall performance, cost, or capability.

Total Models: 107
Frontier Models: 27
Open Source: 51

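| # | Model | Tags | Provider | MMLU | HumanEval | Math | Reasoning | Coding | Overall | Context |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | OpenAI o3 | Multimodal | OpenAI | 93.8 | 94.5 | 96.2 | 96.8 | 94 | 95.1 | 200K |
| 2 | GPT-5.2 | Multimodal | OpenAI | 95.2 | 95.8 | 93.1 | 95.5 | 95.3 | 95 | 512K |
| 3 | Claude Opus 4.6 | Multimodal | Anthropic | 94.8 | 95.5 | 91.8 | 95 | 95.2 | 94.5 | 400K |
| 4 | Gemini 3 Ultra | Multimodal | Google | 94.5 | 93 | 92.5 | 94.8 | 93.2 | 93.6 | 2M |
| 5 | DeepSeek R2 | Open Source | DeepSeek | 92 | 93.5 | 95 | 94.5 | 93 | 93.6 | 256K |
| 6 | OpenAI o1 | | OpenAI | 91.8 | 92.4 | 94.8 | 95.1 | 92 | 93.2 | 200K |
| 7 | GPT-5 | Multimodal | OpenAI | 93.5 | 94 | 89.5 | 94.2 | 93.8 | 93 | 256K |
| 8 | DeepSeek R1 | Open Source | DeepSeek | 90.8 | 92 | 94.3 | 93.8 | 91.5 | 92.5 | 128K |
| 9 | Grok 4 | Multimodal | xAI | 93 | 92.5 | 91 | 93.5 | 92 | 92.4 | 256K |
| 10 | Claude Opus 4 | Multimodal | Anthropic | 92.3 | 95.2 | 85.6 | 93.5 | 95 | 92.3 | 200K |
| 11 | Gemini 2.5 Pro | Multimodal | Google | 92 | 93.2 | 86.4 | 92.8 | 93 | 91.5 | 1M |
| 12 | OpenAI o4-mini | | OpenAI | 88.2 | 91.5 | 92.8 | 92 | 91 | 91.1 | 200K |
| 13 | Llama 4 Behemoth | Open Source, Multimodal | Meta | 92.5 | 91 | 89.5 | 92 | 90.5 | 91.1 | 256K |
| 14 | Claude Sonnet 4.6 | Multimodal | Anthropic | 91.5 | 94 | 85.2 | 90.8 | 93.5 | 91 | 200K |
| 15 | Qwen 3 Max | Multimodal | Alibaba | 91.5 | 91 | 88.5 | 90.5 | 90 | 90.3 | 256K |
| 16 | Gemini 3 Pro | Multimodal | Google | 91 | 90.5 | 87.8 | 91.5 | 90 | 90.2 | 1M |
| 17 | Grok 3 | Multimodal | xAI | 91.2 | 90.5 | 85 | 91.5 | 90 | 89.6 | 128K |
| 18 | OpenAI o3-mini | | OpenAI | 86.9 | 90 | 90.2 | 90.5 | 89.8 | 89.5 | 200K |
| 19 | Claude Sonnet 4 | Multimodal | Anthropic | 89.5 | 93.9 | 80.1 | 89.8 | 93.5 | 89.4 | 200K |
| 20 | Qwen 3 235B-A22B | Open Source | Alibaba | 89.5 | 91.5 | 85 | 90.2 | 91 | 89.4 | 128K |
| 21 | DeepSeek V3.5 | Open Source | DeepSeek | 90.5 | 91 | 86.5 | 89 | 90 | 89.4 | 128K |
| 22 | OpenAI o1-mini | | OpenAI | 85.2 | 90 | 90 | 88.5 | 89.5 | 88.6 | 128K |
| 23 | Mistral Large 3 | Multimodal | Mistral AI | 90 | 89.5 | 84 | 89.5 | 89 | 88.4 | 256K |
| 24 | Claude 3.5 Sonnet | Multimodal | Anthropic | 88.7 | 93.7 | 78.3 | 87.6 | 93 | 88.3 | 200K |
| 25 | Llama 4 Maverick | Open Source, Multimodal | Meta | 89.2 | 91 | 82.5 | 88 | 90 | 88.1 | 1M |
| 26 | Grok 3.5 | | xAI | 89.5 | 89 | 84.5 | 89 | 88.5 | 88.1 | 128K |
| 27 | GPT-4.5 | Multimodal | OpenAI | 90.8 | 88.5 | 81.2 | 91.3 | 88 | 88 | 128K |
| 28 | GPT-4o | Multimodal | OpenAI | 88.7 | 90.2 | 76.6 | 86.4 | 89.5 | 86.3 | 128K |
| 29 | DeepSeek V3 | Open Source | DeepSeek | 87.1 | 89 | 82 | 84.5 | 88 | 86.1 | 128K |
| 30 | Llama 4 Scout | Open Source, Multimodal | Meta | 87.5 | 89.2 | 79 | 85.8 | 88 | 85.9 | 10M |
| 31 | Qwen 3 Plus | Multimodal | Alibaba | 87.5 | 87 | 82 | 86 | 86.5 | 85.8 | 128K |
| 32 | Ernie 5.0 | Multimodal | Baidu | 88 | 85 | 82.5 | 87.5 | 84 | 85.4 | 256K |
| 33 | Gemini 2.5 Flash | Multimodal | Google | 87 | 88.5 | 78.5 | 85.5 | 87 | 85.3 | 1M |
| 34 | GPT-5.2 Mini | Multimodal | OpenAI | 86.5 | 89.2 | 78.4 | 84 | 88.6 | 85.3 | 256K |
| 35 | Llama 3.1 405B | Open Source | Meta | 88.6 | 89 | 73.8 | 85 | 88.5 | 85 | 128K |
| 36 | Llama 3.3 70B | Open Source | Meta | 86 | 88.4 | 77 | 83.5 | 87 | 84.4 | 128K |
| 37 | Jamba 2 | | AI21 Labs | 87 | 84.5 | 79 | 86 | 83.5 | 84 | 512K |
| 38 | Qwen 2.5 72B | Open Source | Alibaba | 85.3 | 86.4 | 80 | 82.5 | 85.5 | 83.9 | 128K |
| 39 | Gemini 3 Flash | Multimodal | Google | 85.5 | 86 | 78.5 | 84 | 85.5 | 83.9 | 1M |
| 40 | Yi-Lightning 2 | | 01.AI | 86 | 84.5 | 80 | 85 | 83.5 | 83.8 | 128K |
| 41 | Pixtral Large 2 | Multimodal | Mistral AI | 86 | 84 | 78.5 | 85.5 | 83.5 | 83.5 | 128K |
| 42 | Claude Haiku 4.5 | Multimodal | Anthropic | 85 | 88.5 | 74.2 | 82.5 | 87 | 83.4 | 200K |
| 43 | Grok 2 | Multimodal | xAI | 87.5 | 85 | 76 | 83.2 | 84.5 | 83.2 | 128K |
| 44 | Reka Core 2 | Multimodal | Reka | 86.5 | 83.5 | 78 | 85.5 | 82.5 | 83.2 | 128K |
| 45 | DeepSeek Coder V3 | Open Source | DeepSeek | 78.5 | 94 | 72 | 76.5 | 93.5 | 82.9 | 128K |
| 46 | Gemini 2.0 Flash | Multimodal | Google | 85.8 | 86.5 | 73.4 | 82.1 | 85 | 82.6 | 1M |
| 47 | Claude 3 Opus | Multimodal | Anthropic | 86.8 | 84.9 | 72 | 85 | 84.5 | 82.6 | 200K |
| 48 | Command R+ 2 | | Cohere | 86.5 | 83 | 76 | 85.5 | 82 | 82.6 | 256K |
| 49 | WizardLM 3 | Open Source | Microsoft | 82.5 | 85 | 78 | 82 | 84.5 | 82.4 | 128K |
| 50 | Phi-4 | Open Source | Microsoft | 84.8 | 82.6 | 80.4 | 81 | 82 | 82.2 | 16K |
| 51 | Nemotron-4 340B | Open Source | NVIDIA | 85.5 | 82 | 78.5 | 84 | 81 | 82.2 | 128K |
| 52 | QwQ-32B-Preview | Open Source | Alibaba | 79.5 | 80 | 85.5 | 87 | 78.5 | 82.1 | 32K |
| 53 | Mistral Large 2 | | Mistral AI | 84 | 84.5 | 74.5 | 82.8 | 84 | 82 | 128K |
| 54 | Falcon 3 180B | Open Source | TII | 85 | 82.5 | 77 | 84 | 81.5 | 82 | 128K |
| 55 | DBRX 2 | Open Source | Databricks | 84 | 83.5 | 75.5 | 83 | 82.5 | 81.7 | 128K |
| 56 | Claude 3.5 Haiku | Multimodal | Anthropic | 84 | 88.1 | 69.3 | 78.2 | 86.5 | 81.2 | 200K |
| 57 | HyperCLOVA X 2 | Multimodal | Naver | 85 | 80.5 | 76.5 | 84 | 79.5 | 81.1 | 128K |
| 58 | GPT-4o Mini | Multimodal | OpenAI | 82 | 87 | 70.2 | 78.5 | 85.3 | 80.6 | 128K |
| 59 | Gemini 1.5 Pro | Multimodal | Google | 85.9 | 84.1 | 67.7 | 82 | 83.5 | 80.6 | 2M |
| 60 | Qwen 3 Turbo | Open Source | Alibaba | 82 | 83.5 | 75 | 80.5 | 82 | 80.6 | 128K |
| 61 | Mistral Medium 3 | | Mistral AI | 83.5 | 83 | 71.5 | 81 | 82.5 | 80.3 | 128K |
| 62 | Codestral 2 | | Mistral AI | 72 | 93.5 | 68 | 74 | 93 | 80.1 | 256K |
| 63 | Qwen 2.5 Coder 32B | Open Source | Alibaba | 74.2 | 92.7 | 68.5 | 72 | 92 | 79.9 | 128K |
| 64 | Nous Hermes 3 | Open Source | Nous Research | 80.5 | 82 | 73 | 79.5 | 81 | 79.2 | 128K |
| 65 | Inflection Pi-3 | | Inflection AI | 84 | 78.5 | 73 | 83 | 77 | 79.1 | 128K |
| 66 | Arctic 2 | Open Source | Snowflake | 81 | 80.5 | 74 | 80 | 79.5 | 79 | 128K |
| 67 | DeepSeek Coder V2 | Open Source | DeepSeek | 71 | 90.2 | 73.5 | 69 | 89.5 | 78.6 | 128K |
| 68 | Nemotron 70B | Open Source | NVIDIA | 83.5 | 80 | 68.5 | 80.5 | 79.5 | 78.4 | 128K |
| 69 | Command A | | Cohere | 82.8 | 80.5 | 68 | 80.2 | 79.5 | 78.2 | 256K |
| 70 | Llama 3.1 70B | Open Source | Meta | 82 | 80.5 | 68 | 79 | 80 | 77.9 | 128K |
| 71 | Dolphin 3 | Open Source | Cognitive Computations | 79 | 80.5 | 71 | 78 | 79.5 | 77.6 | 128K |
| 72 | GLM-4-Plus | Multimodal | Zhipu AI | 82 | 78 | 70.5 | 79 | 77.5 | 77.4 | 128K |
| 73 | Gemma 3 27B | Open Source, Multimodal | Google | 80.5 | 80 | 68 | 78.2 | 79 | 77.1 | 128K |
| 74 | Mistral Small 3 | Open Source | Mistral AI | 80.5 | 81 | 66.5 | 76.3 | 80 | 76.9 | 128K |
| 75 | Yi-Lightning | | 01.AI | 82 | 78.5 | 68 | 78 | 77.5 | 76.8 | 16K |
| 76 | MiniMax-01 | Open Source | MiniMax | 82.5 | 78 | 66.5 | 79 | 77.5 | 76.7 | 4M |
| 77 | Codestral | | Mistral AI | 70.5 | 90 | 60 | 68.5 | 91 | 76 | 256K |
| 78 | Falcon 3 40B | Open Source | TII | 78.5 | 76 | 70 | 77 | 75.5 | 75.4 | 64K |
| 79 | Gemini 2.0 Flash Lite | Multimodal | Google | 80.2 | 78.5 | 65 | 74.3 | 77.8 | 75.2 | 1M |
| 80 | Solar 2 Pro | Multimodal | Upstage | 77.5 | 76 | 70.5 | 76.5 | 75 | 75.1 | 64K |
| 81 | Amazon Nova Pro | Multimodal | Amazon | 80.2 | 77.5 | 64 | 77 | 76.5 | 75 | 300K |
| 82 | Command R 2 | | Cohere | 78 | 76.5 | 68.5 | 77 | 75 | 75 | 128K |
| 83 | Mixtral 8x22B | Open Source | Mistral AI | 77.8 | 79 | 64.2 | 75.5 | 78 | 74.9 | 64K |
| 84 | CodeLlama 2 70B | Open Source | Meta | 68 | 86 | 62 | 68.5 | 85.5 | 74 | 128K |
| 85 | Jamba 1.5 Large | Open Source | AI21 Labs | 80 | 75.5 | 62.4 | 76.8 | 75 | 73.9 | 256K |
| 86 | Gemma 3 9B | Open Source, Multimodal | Google | 74.5 | 76 | 68 | 73 | 75.5 | 73.4 | 128K |
| 87 | Yi-34B-Chat | Open Source | 01.AI | 76.5 | 74 | 66 | 74.5 | 73 | 72.8 | 32K |
| 88 | StarCoder 3 | Open Source | BigCode | 65 | 88.5 | 58 | 65 | 87.5 | 72.8 | 64K |
| 89 | Cohere Aya 3 | Open Source | Cohere | 78.5 | 72 | 66.5 | 76 | 71 | 72.8 | 64K |
| 90 | Qwen 2.5 7B | Open Source | Alibaba | 74.2 | 75.6 | 65 | 71.5 | 74 | 72.1 | 128K |
| 91 | BLOOM-3 | Open Source | BigScience | 76 | 70.5 | 65 | 74.5 | 69.5 | 71.1 | 64K |
| 92 | Phi-4 Mini | Open Source | Microsoft | 72.5 | 72 | 68.3 | 70 | 71.5 | 70.9 | 128K |
| 93 | WizardLM 2 8x22B | Open Source | Microsoft | 75.5 | 74 | 58 | 73.5 | 73.5 | 70.9 | 64K |
| 94 | Reka Core | Multimodal | Reka AI | 79 | 72.5 | 56 | 75 | 72 | 70.9 | 128K |
| 95 | Claude 3 Haiku | Multimodal | Anthropic | 75.2 | 75.9 | 57.3 | 71 | 74.5 | 70.8 | 200K |
| 96 | Gemma 2 27B | Open Source | Google | 75.2 | 73.5 | 58.3 | 72 | 73 | 70.4 | 8K |
| 97 | Command R+ | | Cohere | 75.7 | 71.5 | 57.8 | 74 | 72 | 70.2 | 128K |
| 98 | InternLM 2.5 20B | Open Source | Shanghai AI Lab | 73.8 | 72 | 62 | 71 | 71.5 | 70.1 | 1M |
| 99 | DBRX | Open Source | Databricks | 73.7 | 70.1 | 54.2 | 69.5 | 69.8 | 67.5 | 32K |
| 100 | Amazon Nova Lite | Multimodal | Amazon | 72.5 | 68 | 55 | 68 | 67.5 | 66.2 | 300K |
| 101 | Jamba 1.5 Mini | Open Source | AI21 Labs | 72 | 68 | 52.5 | 68.5 | 67 | 65.6 | 256K |
| 102 | Aya Expanse 32B | Open Source | Cohere | 73 | 65 | 52 | 70 | 64 | 64.8 | 128K |
| 103 | Mistral Nemo | Open Source | Mistral AI | 68 | 67.5 | 52 | 65.5 | 66.5 | 63.9 | 128K |
| 104 | Falcon 180B | Open Source | TII | 70.5 | 62 | 45 | 65 | 60.5 | 60.6 | 2K |
| 105 | Snowflake Arctic | Open Source | Snowflake | 67 | 64.5 | 45.5 | 62 | 63.5 | 60.5 | 4K |
| 106 | Llama 3.1 8B | Open Source | Meta | 68.4 | 62.1 | 47.2 | 62 | 61.5 | 60.2 | 128K |
| 107 | StarCoder 2 15B | Open Source | BigCode | 52 | 73.5 | 35 | 48.5 | 75 | 56.8 | 16K |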
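
The page does not say how the Overall column is derived. In the rows listed above, it matches the unweighted mean of the five benchmark scores (MMLU, HumanEval, Math, Reasoning, Coding) rounded to one decimal place. The following is a minimal Python sketch under that assumption; the `BENCHMARKS` tuple, the dictionary keys, and the `overall` and `rank` helpers are hypothetical names chosen for illustration, not part of the site.

```python
from statistics import mean

# Assumed column keys for the five benchmarks (illustrative names only).
BENCHMARKS = ("mmlu", "humaneval", "math", "reasoning", "coding")

# Three rows copied from the leaderboard table, deliberately out of order.
rows = [
    {"model": "GPT-5.2", "mmlu": 95.2, "humaneval": 95.8, "math": 93.1,
     "reasoning": 95.5, "coding": 95.3},
    {"model": "OpenAI o3", "mmlu": 93.8, "humaneval": 94.5, "math": 96.2,
     "reasoning": 96.8, "coding": 94.0},
    {"model": "DeepSeek R2", "mmlu": 92.0, "humaneval": 93.5, "math": 95.0,
     "reasoning": 94.5, "coding": 93.0},
]


def overall(row):
    """Unweighted mean of the five benchmark scores, rounded to one decimal.

    Assumption: this reproduces the Overall column; the page gives no formula.
    """
    return round(mean(row[name] for name in BENCHMARKS), 1)


def rank(rows):
    """Return the rows sorted by the computed overall score, highest first."""
    return sorted(rows, key=overall, reverse=True)


for position, row in enumerate(rank(rows), start=1):
    print(position, row["model"], overall(row))
```

Sorting on a different key, for example `key=lambda r: r["coding"]`, would reorder the same rows by a single benchmark instead of the computed average.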

Try All These Models in One Place

Vincony gives you access to 400+ AI models: compare responses side by side, run AI debates, and find the best model for your task.

Visit Vincony.com