| Rank | Model | Provider | Score (0-100) | Samples | Context | Price / 1M tokens (input / output) |
|---|---|---|---|---|---|---|
| 1 | claude-opus-4-7 | Anthropic | 100.0 | 1.8K | 1M | ¥36 / ¥180 |
| 2 | claude-opus-4-7-thinking | Anthropic | 98.8 | 1.8K | 1M | ¥36 / ¥180 |
| 3 | claude-opus-4-6-thinking | Anthropic | 97.5 | 1.8K | 1M | ¥36 / ¥180 |
| 4 | gpt-5.5 | OpenAI | 96.3 | 1.2K | 1.05M | ¥36 / ¥216 |
| 5 | claude-opus-4-6 | Anthropic | 95.1 | 2.2K | 1M | ¥36 / ¥180 |
| 6 | gpt-5.4-high | OpenAI | 93.8 | 1.6K | 1.05M | ¥18 / ¥108 |
| 7 | muse-spark | Meta | 92.6 | 1.2K | - | - |
| 8 | gemini-3.1-pro-preview | Google | 91.4 | 4.2K | 1.05M | ¥14.4 / ¥86.4 |
| 9 | claude-sonnet-4-6 | Anthropic | 90.1 | 2.4K | 1M | ¥21.6 / ¥108 |
| 10 | gpt-5.4 | OpenAI | 88.9 | 1.5K | 1.05M | ¥18 / ¥108 |
| 11 | gpt-5.5-high | OpenAI | 87.7 | 1.2K | 1.05M | ¥36 / ¥216 |
| 12 | gemini-3-pro | Google | 86.4 | 2.9K | 1.05M | ¥14.4 / ¥86.4 |
| 13 | kimi-k2.6 | Moonshot | 85.2 | 1.5K | 262K | ¥6.84 / ¥28.8 |
| 14 | qwen3.7-plus-preview | Alibaba | 84.0 | 977 | 131K | ¥3.6 / ¥21.6 |
| 15 | gemini-3-flash | Google | 82.7 | 5.1K | 1.05M | ¥3.6 / ¥21.6 |
| 16 | dola-seed-2.0-pro | ByteDance | 81.5 | 2.3K | - | - |
| 17 | gpt-5.2-chat-latest-20260210 | OpenAI | 80.2 | 3.1K | 400K | ¥12.6 / ¥101 |
| 18 | kimi-k2.5-thinking | Moonshot | 79.0 | 3.6K | 262K | ¥4.32 / ¥21.6 |
| 19 | qwen3.5-397b-a17b | Alibaba | 77.8 | 3K | 262K | ¥3.1 / ¥18.6 |
| 20 | gemma-4-31b | Google | 76.5 | 4.4K | 262K | ¥3.24 / ¥7.2 |
| 21 | glm-5v-turbo | Z.ai | 75.3 | 1.9K | 200K | ¥0 / ¥0 |
| 22 | gemini-3-flash (thinking-minimal) | Google | 74.1 | 4.9K | 1.05M | ¥3.6 / ¥21.6 |
| 23 | gemini-2.5-pro | Google | 72.8 | 8.9K | 1.05M | ¥9 / ¥72 |
| 24 | gpt-5.5-instant | OpenAI | 71.6 | 1.1K | 400K | ¥9 / ¥72 |
| 25 | qwen-vl-max-2025-08-13 | Alibaba | 70.4 | 356 | 131K | ¥1.66 / ¥4.13 |
| 26 | gpt-5.2-high | OpenAI | 69.1 | 3.8K | 400K | ¥12.6 / ¥101 |
| 27 | gpt-5.1-high | OpenAI | 67.9 | 2.2K | 400K | ¥9 / ¥72 |
| 28 | gemma-4-26b-a4b | Google | 66.7 | 2.7K | 262K | ¥0.94 / ¥2.88 |
| 29 | mimo-v2.5 | Xiaomi | 65.4 | 1.9K | 1.05M | ¥2.88 / ¥14.4 |
| 30 | kimi-k2.5-instant | Moonshot | 64.2 | 911 | 262K | ¥4.32 / ¥21.6 |
| 31 | chatgpt-4o-latest-20250326 | OpenAI | 63.0 | 3.7K | 128K | ¥18 / ¥72 |
| 32 | qwen3-vl-235b-a22b-instruct | Alibaba | 61.7 | 2.7K | 128K | ¥2.16 / ¥8.64 |
| 33 | grok-4.20-multi-agent-beta-0309 | xAI | 60.5 | 2.2K | 2M | ¥14.4 / ¥43.2 |
| 34 | gpt-5.4-mini-high | OpenAI | 59.3 | 2.2K | 400K | ¥5.4 / ¥32.4 |
| 35 | gpt-5-chat | OpenAI | 58.0 | 2.6K | 400K | ¥9 / ¥72 |
| 36 | grok-4.20-beta-0309-reasoning | xAI | 56.8 | 2.5K | 2M | ¥14.4 / ¥43.2 |
| 37 | gpt-5.2 | OpenAI | 55.6 | 4K | 400K | ¥12.6 / ¥101 |
| 38 | gemini-2.5-flash-preview-09-2025 | Google | 54.3 | 973 | 1M | ¥2.16 / ¥18 |
| 39 | mimo-v2-omni | Xiaomi | 53.1 | 2K | 262K | ¥2.88 / ¥14.4 |
| 40 | grok-4.3 | xAI | 51.9 | 1.1K | 1M | ¥9 / ¥18 |
| 41 | gemini-2.5-flash | Google | 50.6 | 7.7K | 1.05M | ¥2.16 / ¥18 |
| 42 | gpt-5.1 | OpenAI | 49.4 | 2.4K | 400K | ¥9 / ¥72 |
| 43 | gemini-3.1-flash-lite-preview | Google | 48.1 | 3.5K | 1.05M | ¥1.8 / ¥10.8 |
| 44 | qwen3.5-122b-a10b | Alibaba | 46.9 | 2.6K | 262K | ¥2.88 / ¥23 |
| 45 | qwen3.5-27b | Alibaba | 45.7 | 2.4K | 262K | ¥2.16 / ¥17.3 |
| 46 | ernie-5.0-preview-1220 | Baidu | 44.4 | 704 | 128K | ¥7.92 / ¥14.4 |
| 47 | gpt-5-high | OpenAI | 43.2 | 3K | 400K | ¥9 / ¥72 |
| 48 | o3-2025-04-16 | OpenAI | 42.0 | 3.6K | 200K | ¥14.4 / ¥57.6 |
| 49 | gpt-4.1-2025-04-14 | OpenAI | 40.7 | 2.8K | 1.05M | ¥14.4 / ¥57.6 |
| 50 | qwen3-vl-235b-a22b-thinking | Alibaba | 39.5 | 479 | 131K | ¥2.06 / ¥8.26 |
| 51 | gpt-5-mini-high | OpenAI | 38.3 | 2K | 400K | ¥1.8 / ¥14.4 |
| 52 | grok-4-0709 | xAI | 37.0 | 2.5K | 256K | ¥21.6 / ¥108 |
| 53 | grok-4-1-fast-reasoning | xAI | 35.8 | 2.7K | 2M | ¥1.44 / ¥3.6 |
| 54 | gpt-5.4-nano-high | OpenAI | 34.6 | 2.2K | 400K | ¥1.44 / ¥9 |
| 55 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | Google | 33.3 | 986 | 1.05M | ¥0.72 / ¥2.88 |
| 56 | o4-mini-2025-04-16 | OpenAI | 32.1 | 2.9K | 200K | ¥7.92 / ¥31.7 |
| 57 | claude-sonnet-4-20250514-thinking-32k | Anthropic | 30.9 | 230 | 200K | ¥21.6 / ¥108 |
| 58 | hunyuan-vision-1.5-thinking | Tencent | 29.6 | 497 | - | - |
| 59 | gemini-2.5-flash-lite-preview-06-17-thinking | Google | 28.4 | 2.5K | 65.5K | ¥0.72 / ¥2.88 |
| 60 | step-1o-turbo-202506 | StepFun | 27.2 | 394 | - | - |
| 61 | gpt-4.1-mini-2025-04-14 | OpenAI | 25.9 | 2.4K | 1.05M | ¥2.88 / ¥11.5 |
| 62 | claude-sonnet-4-20250514 | Anthropic | 24.7 | 397 | 200K | ¥21.6 / ¥108 |
| 63 | glm-4.6v | Z.ai | 23.5 | 551 | 128K | ¥2.16 / ¥6.48 |
| 64 | claude-3-7-sonnet-20250219-thinking-32k | Anthropic | 22.2 | 304 | - | - |
| 65 | claude-opus-4-20250514 | Anthropic | 21.0 | 471 | 200K | ¥108 / ¥540 |
| 66 | mistral-medium-2508 | Mistral | 19.8 | 3.7K | 262K | ¥2.88 / ¥14.4 |
| 67 | claude-opus-4-20250514-thinking-16k | Anthropic | 18.5 | 285 | 200K | ¥108 / ¥540 |
| 68 | gemma-3-27b-it | Google | 17.3 | 1.8K | 128K | ¥2.15 / ¥2.15 |
| 69 | mistral-medium-2505 | Mistral | 16.0 | 1.4K | 262K | ¥2.88 / ¥14.4 |
| 70 | glm-4.5v | Z.ai | 14.8 | 376 | 64K | ¥4.32 / ¥13 |
| 71 | step-3 | StepFun | 13.6 | 364 | 65.5K | ¥1.8 / ¥4.68 |
| 72 | claude-3-7-sonnet-20250219 | Anthropic | 12.3 | 306 | 200K | ¥21.6 / ¥108 |
| 73 | gemini-2.0-flash-001 | Google | 11.1 | 1.1K | 1.05M | ¥1.08 / ¥4.32 |
| 74 | gpt-5-nano-high | OpenAI | 9.9 | 527 | 400K | ¥0.36 / ¥2.88 |
| 75 | hunyuan-large-vision | Tencent | 8.6 | 263 | - | - |
| 76 | llama-4-maverick-17b-128e-instruct | Meta | 7.4 | 970 | 1M | ¥1.8 / ¥6.26 |
| 77 | claude-3-5-sonnet-20241022 | Anthropic | 6.2 | 318 | 200K | ¥21.6 / ¥108 |
| 78 | mistral-small-2506 | Mistral | 4.9 | 1.2K | 262K | ¥2.88 / ¥14.4 |
| 79 | llama-4-scout-17b-16e-instruct | Meta | 3.7 | 867 | 128K | ¥1.44 / ¥5.62 |
| 80 | mistral-small-3.1-24b-instruct-2503 | Mistral | 2.5 | 1.7K | 262K | ¥2.88 / ¥14.4 |
| 81 | claude-3-5-haiku-20241022 | Anthropic | 1.2 | 325 | 200K | ¥5.76 / ¥28.8 |
| 82 | molmo-2-8b | AllenAI | 0.0 | 281 | - | - |
Top model analysis: why claude-opus-4-7 ranks first
claude-opus-4-7 ranks first with a score of 100.0 on the 0-100 scale, based on 1.8K samples. Treat it as the first candidate for this leaderboard, then weigh it against price, context length, and availability.
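A note on reading the score column: the listed values are consistent with a linear rank-to-percent mapping, 100 × (N − rank) / (N − 1) with N = 82 ranked models. This is an inference from the numbers in the table, not a formula documented by the leaderboard, and the helper name `rank_to_score` is hypothetical. A minimal sketch that checks a few listed rows against that assumed mapping:

```python
def rank_to_score(rank: int, total: int) -> float:
    """Assumed mapping: rank 1 -> 100.0, last rank -> 0.0, linear in between."""
    return round(100 * (total - rank) / (total - 1), 1)


TOTAL = 82  # number of ranked models in the table

# (rank -> score) pairs copied from the table above
LISTED = {1: 100.0, 2: 98.8, 3: 97.5, 13: 85.2, 41: 50.6, 81: 1.2, 82: 0.0}

# Ranks whose listed score differs from the assumed mapping (expected: none)
mismatches = [r for r, s in LISTED.items() if rank_to_score(r, TOTAL) != s]

if __name__ == "__main__":
    print("mismatches:", mismatches)
```

If the mapping holds, a score of 100.0 says only that the model is ranked first. It does not say how large the gap to the next model is, which is one more reason to look at the other columns.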
How to choose: do not look only at rank #1
Start with the leaderboard closest to your task. Compare the top models by score and sample size, then check price, context length, open or closed access, and provider availability.
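The selection steps above can be sketched as a simple filter-then-sort. The rows below are copied from the table. The thresholds (at least 1,000 samples, at least 200K context, output price of at most ¥100 per 1M tokens) and the names `Row`, `parse_count`, and `shortlist` are illustrative assumptions, not part of the leaderboard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Row:
    rank: int
    model: str
    score: float
    samples: Optional[int]      # None when the table shows "-"
    context: Optional[int]      # context length in tokens
    price_out: Optional[float]  # yuan per 1M output tokens


def parse_count(text: str) -> Optional[int]:
    """Convert '1.8K', '1.05M', '977' or '-' to an integer (None for '-')."""
    text = text.strip()
    if text == "-":
        return None
    scale = {"K": 1_000, "M": 1_000_000}
    if text[-1] in scale:
        return round(float(text[:-1]) * scale[text[-1]])
    return int(text)


# A handful of rows from the table: rank, model, score, samples, context, output price
RAW = [
    (1, "claude-opus-4-7", 100.0, "1.8K", "1M", 180.0),
    (6, "gpt-5.4-high", 93.8, "1.6K", "1.05M", 108.0),
    (7, "muse-spark", 92.6, "1.2K", "-", None),
    (8, "gemini-3.1-pro-preview", 91.4, "4.2K", "1.05M", 86.4),
    (13, "kimi-k2.6", 85.2, "1.5K", "262K", 28.8),
    (14, "qwen3.7-plus-preview", 84.0, "977", "131K", 21.6),
    (15, "gemini-3-flash", 82.7, "5.1K", "1.05M", 21.6),
]
ROWS = [Row(r, m, s, parse_count(n), parse_count(c), p) for r, m, s, n, c, p in RAW]


def shortlist(rows: List[Row], min_samples: int, min_context: int,
              max_price_out: float) -> List[Row]:
    """Keep rows that meet every constraint, best score first.

    Rows with a missing value ("-") for a constrained field are dropped,
    because the constraint cannot be checked for them.
    """
    keep = [
        r for r in rows
        if r.samples is not None and r.samples >= min_samples
        and r.context is not None and r.context >= min_context
        and r.price_out is not None and r.price_out <= max_price_out
    ]
    return sorted(keep, key=lambda r: r.score, reverse=True)


if __name__ == "__main__":
    for r in shortlist(ROWS, min_samples=1000, min_context=200_000, max_price_out=100.0):
        print(r.rank, r.model, r.score, r.price_out)
```

With these example thresholds the top-ranked model drops out on price, which illustrates why rank alone is not enough. Open or closed access and provider availability are not in the table, so they would have to be checked separately.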