Chat · Text · Life, Physical & Social Science Leaderboard

Ranking for Text / Life, Physical & Social Science, based on public preference data.

Selection guide

Life, Physical & Social Science model ranking guide

Top five: claude-opus-4-6-thinking, claude-opus-4-6, claude-opus-4-7-thinking, gemini-3.5-flash, gemini-3.1-pro-preview
Models: 358
Published: 2026/05/27
Source: Arena public preference evaluation
Original leaderboard: Text / Industry Life And Physical And Social Science
Leaderboard dataset: LMArena latest parquet
Rank | Model | Provider | Score | Samples | Context | Price (input / output)
1 | claude-opus-4-6-thinking | Anthropic | 100.0 | 5.6K | 1M | ¥36 / ¥180
2 | claude-opus-4-6 | Anthropic | 99.7 | 5.9K | 1M | ¥36 / ¥180
3 | claude-opus-4-7-thinking | Anthropic | 99.4 | 3.4K | 1M | ¥36 / ¥180
4 | gemini-3.5-flash | Google | 99.2 | 1.4K | 1.05M | ¥10.8 / ¥64.8
5 | gemini-3.1-pro-preview | Google | 98.9 | 7.1K | 1.05M | ¥14.4 / ¥86.4
6 | qwen3.5-max-preview | Alibaba | 98.6 | 3.2K | - | -
7 | claude-opus-4-7 | Anthropic | 98.3 | 3.5K | 1M | ¥36 / ¥180
8 | ernie-5.1 | Baidu | 98.0 | 2.4K | 119K | ¥5.4 / ¥21.6
9 | glm-5.1 | Zai | 97.8 | 2.3K | 200K | ¥0 / ¥0
10 | gemini-3-pro | Google | 97.5 | 6.7K | 1.05M | ¥14.4 / ¥86.4
11 | gpt-5.5 | Openai | 97.2 | 2.8K | 1.05M | ¥36 / ¥216
12 | muse-spark | Meta | 96.9 | 2K | - | -
13 | gpt-5.4-high | Openai | 96.6 | 4.6K | 1.05M | ¥18 / ¥108
14 | gemini-2.5-pro | Google | 96.4 | 20.1K | 1.05M | ¥9 / ¥72
15 | gpt-5.5-high | Openai | 96.1 | 2.7K | 1.05M | ¥36 / ¥216
16 | mimo-v2.5-pro | Xiaomi | 95.8 | 2.6K | 1.05M | ¥7.2 / ¥21.6
17 | qwen3.7-max-preview | Alibaba | 95.5 | 664 | 1M | ¥18 / ¥54
18 | gemini-3-flash | Google | 95.2 | 4.9K | 1.05M | ¥3.6 / ¥21.6
19 | claude-sonnet-4-6 | Anthropic | 95.0 | 4.4K | 1M | ¥21.6 / ¥108
20 | qwen3.6-max-preview | Alibaba | 94.7 | 773 | 246K | ¥9.5 / ¥56.9
21 | ernie-5.0-preview-1203 | Baidu | 94.4 | 1.6K | 128K | ¥7.92 / ¥14.4
22 | kimi-k2.6 | Moonshot | 94.1 | 2.6K | 262K | ¥6.84 / ¥28.8
23 | glm-5 | Zai | 93.8 | 3.4K | 205K | ¥7.2 / ¥23
24 | deepseek-v4-pro | Deepseek | 93.6 | 2.8K | 1M | ¥3.13 / ¥6.26
25 | kimi-k2.5-thinking | Moonshot | 93.3 | 5.7K | 262K | ¥4.32 / ¥21.6
26 | qwen3.5-397b-a17b | Alibaba | 93.0 | 5.3K | 262K | ¥3.1 / ¥18.6
27 | gpt-5.4 | Openai | 92.7 | 4.8K | 1.05M | ¥18 / ¥108
28 | gemini-3-flash (thinking-minimal) | Google | 92.4 | 8.6K | 1.05M | ¥3.6 / ¥21.6
29 | grok-4.20-multi-agent-beta-0309 | Xai | 92.2 | 4.7K | 2M | ¥14.4 / ¥43.2
30 | glm-4.6 | Zai | 91.9 | 5.9K | 205K | ¥4.32 / ¥15.8
31 | mimo-v2-pro | Xiaomi | 91.6 | 3.6K | 1.05M | ¥7.2 / ¥21.6
32 | grok-4.20-beta-0309-reasoning | Xai | 91.3 | 4.7K | 2M | ¥14.4 / ¥43.2
33 | qwen3-max-preview | Alibaba | 91.0 | 4.2K | 262K | ¥6.2 / ¥24.8
34 | deepseek-v4-pro-thinking | Deepseek | 90.8 | 2.5K | 1M | ¥3.13 / ¥6.26
35 | gpt-5.1-high | Openai | 90.5 | 6.6K | 400K | ¥9 / ¥72
36 | glm-4.7 | Zai | 90.2 | 2.1K | 205K | ¥0 / ¥0
37 | dola-seed-2.0-pro | Bytedance | 89.9 | 6K | - | -
38 | gemma-4-31b | Google | 89.6 | 872 | 262K | ¥3.24 / ¥7.2
39 | claude-opus-4-5-20251101 | Anthropic | 89.4 | 10.3K | 200K | ¥36 / ¥180
40 | deepseek-v3.1-terminus | Deepseek | 89.1 | 577 | 128K | ¥1.8 / ¥5.04
41 | ernie-5.0-0110 | Baidu | 88.8 | 5.6K | 128K | ¥7.92 / ¥14.4
42 | deepseek-v3.1-terminus-thinking | Deepseek | 88.5 | 538 | 128K | ¥1.8 / ¥5.04
43 | mistral-large-3 | Mistral | 88.2 | 6.8K | 262K | ¥3.6 / ¥10.8
44 | amazon-nova-experimental-chat-26-02-10 | Amazon | 88.0 | 512 | - | -
45 | deepseek-v4-flash | Deepseek | 87.7 | 2.8K | 1M | ¥1.01 / ¥2.02
46 | mistral-medium-2508 | Mistral | 87.4 | 14.5K | 262K | ¥2.88 / ¥14.4
47 | amazon-nova-experimental-chat-12-10 | Amazon | 87.1 | 600 | - | -
48 | amazon-nova-experimental-chat-11-10 | Amazon | 86.8 | 3.9K | - | -
49 | ernie-5.0-preview-1022 | Baidu | 86.6 | 776 | 128K | ¥7.92 / ¥14.4
50 | claude-sonnet-4-5-20250929 | Anthropic | 86.3 | 12.3K | 200K | ¥21.6 / ¥108
51 | deepseek-v3.2-exp-thinking | Deepseek | 86.0 | 1.4K | 128K | ¥0 / ¥0
52 | grok-4.1 | Xai | 85.7 | 10.4K | 200K | ¥14.4 / ¥72
53 | gemma-4-26b-a4b | Google | 85.4 | 871 | 262K | ¥0.94 / ¥2.88
54 | deepseek-v3.2 | Deepseek | 85.2 | 7.2K | 128K | ¥2.09 / ¥3.1
55 | claude-opus-4-5-20251101-thinking-32k | Anthropic | 84.9 | 5.8K | 200K | ¥108 / ¥540
56 | qwen3-vl-235b-a22b-instruct | Alibaba | 84.6 | 1.8K | 128K | ¥2.16 / ¥8.64
57 | grok-4.20-beta1 | Xai | 84.3 | 4K | 2M | ¥14.4 / ¥43.2
58 | glm-4.5 | Zai | 84.0 | 3.9K | 131K | ¥4.32 / ¥15.8
59 | chatgpt-4o-latest-20250326 | Openai | 83.8 | 13.7K | 128K | ¥18 / ¥72
60 | gpt-5.2-chat-latest-20260210 | Openai | 83.5 | 5.1K | 400K | ¥12.6 / ¥101
61 | gpt-5.1 | Openai | 83.2 | 6.9K | 400K | ¥9 / ¥72
62 | deepseek-v4-flash-thinking | Deepseek | 82.9 | 2.8K | 1M | ¥1.01 / ¥2.02
63 | grok-4.1-thinking | Xai | 82.6 | 10.4K | 200K | ¥14.4 / ¥72
64 | qwen3-235b-a22b-instruct-2507 | Alibaba | 82.4 | 15.1K | 128K | ¥2.09 / ¥8.23
65 | grok-3-preview-02-24 | Xai | 82.1 | 5.8K | 1M | ¥9 / ¥18
66 | claude-sonnet-4-5-20250929-thinking-32k | Anthropic | 81.8 | 12.3K | 200K | ¥21.6 / ¥108
67 | mimo-v2.5 | Xiaomi | 81.5 | 2.6K | 1.05M | ¥2.88 / ¥14.4
68 | grok-4-0709 | Xai | 81.2 | 6.9K | 256K | ¥21.6 / ¥108
69 | qwen3.6-plus | Alibaba | 81.0 | 3K | 1M | ¥3.6 / ¥21.6
70 | gemini-2.5-flash-preview-09-2025 | Google | 80.7 | 5.2K | 1M | ¥2.16 / ¥18
71 | qwen3-next-80b-a3b-instruct | Alibaba | 80.4 | 3.5K | 131K | ¥1.04 / ¥4.13
72 | qwen3.5-27b | Alibaba | 80.1 | 4.2K | 262K | ¥2.16 / ¥17.3
73 | gemini-2.5-flash | Google | 79.8 | 19.8K | 1.05M | ¥2.16 / ¥18
74 | deepseek-v3.1 | Deepseek | 79.6 | 2.3K | 128K | ¥1.44 / ¥5.04
75 | qwen3.5-122b-a10b | Alibaba | 79.3 | 4.3K | 262K | ¥2.88 / ¥23
76 | hunyuan-t1-20250711 | Tencent | 79.0 | 723 | 131K | ¥0 / ¥0
77 | qwen3-235b-a22b-thinking-2507 | Alibaba | 78.7 | 1.5K | 131K | ¥2.07 / ¥8.26
78 | longcat-flash-chat-2602-exp | Meituan | 78.4 | 3.9K | 128K | ¥1.08 / ¥10.8
79 | deepseek-r1-0528 | Deepseek | 78.2 | 3.4K | 164K | ¥3.6 / ¥15.5
80 | kimi-k2.5-instant | Moonshot | 77.9 | 1.2K | 262K | ¥4.32 / ¥21.6
81 | gemini-3.1-flash-lite-preview | Google | 77.6 | 5.7K | 1.05M | ¥1.8 / ¥10.8
82 | deepseek-v3.1-thinking | Deepseek | 77.3 | 1.9K | 128K | ¥1.44 / ¥5.04
83 | o3-2025-04-16 | Openai | 77.0 | 10K | 200K | ¥14.4 / ¥57.6
84 | deepseek-v3.2-exp | Deepseek | 76.8 | 1.9K | 128K | ¥0 / ¥0
85 | deepseek-v3.2-thinking | Deepseek | 76.5 | 6.4K | 128K | ¥2.09 / ¥3.1
86 | gpt-5.5-instant | Openai | 76.2 | 4K | 400K | ¥9 / ¥72
87 | longcat-flash-chat | Meituan | 75.9 | 1.7K | 128K | ¥1.08 / ¥10.8
88 | amazon-nova-experimental-chat-10-20 | Amazon | 75.6 | 1.8K | - | -
89 | gpt-5.2 | Openai | 75.4 | 7.5K | 400K | ¥12.6 / ¥101
90 | gpt-5.2-high | Openai | 75.1 | 7.5K | 400K | ¥12.6 / ¥101
91 | grok-4-fast-reasoning | Xai | 74.8 | 3K | 2M | ¥1.44 / ¥3.6
92 | hunyuan-hy3-preview | Tencent | 74.5 | 1.1K | 256K | ¥0 / ¥0
93 | mimo-v2-flash (non-thinking) | Xiaomi | 74.2 | 7.1K | 262K | ¥0.72 / ¥2.16
94 | claude-opus-4-1-20250805-thinking-16k | Anthropic | 73.9 | 7.9K | 200K | ¥108 / ¥540
95 | grok-4-1-fast-reasoning | Xai | 73.7 | 8.8K | 2M | ¥1.44 / ¥3.6
96 | gpt-5.4-mini-high | Openai | 73.4 | 4.3K | 400K | ¥5.4 / ¥32.4
97 | kimi-k2-thinking-turbo | Moonshot | 73.1 | 9.6K | 262K | ¥17.3 / ¥72
98 | gpt-4.5-preview-2025-02-27 | Openai | 72.8 | 2.5K | 8.19K | ¥216 / ¥432
99 | grok-4.3 | Xai | 72.5 | 2.6K | 1M | ¥9 / ¥18
100 | gpt-5-chat | Openai | 72.3 | 5K | 400K | ¥9 / ¥72
101 | step-3.5-flash | Stepfun | 72.0 | 5.5K | 256K | ¥0.69 / ¥2.07
102 | claude-opus-4-1-20250805 | Anthropic | 71.7 | 12.1K | 200K | ¥108 / ¥540
103 | minimax-m2.1-preview | Minimax | 71.4 | 2.7K | 205K | ¥0 / ¥0
104 | qwen3.5-flash | Alibaba | 71.1 | 4.7K | 1M | ¥1.24 / ¥12.4
105 | grok-4-fast-chat | Xai | 70.9 | 1K | 2M | ¥1.44 / ¥3.6
106 | qwen3-max-2025-09-23 | Alibaba | 70.6 | 1.4K | 258K | ¥6.19 / ¥24.7
107 | qwen3-vl-235b-a22b-thinking | Alibaba | 70.3 | 1.3K | 131K | ¥2.06 / ¥8.26
108 | qwen3.5-35b-a3b | Alibaba | 70.0 | 4.6K | 262K | ¥1.8 / ¥14.4
109 | hunyuan-vision-1.5-thinking | Tencent | 69.7 | 339 | - | -
110 | amazon-nova-experimental-chat-26-01-10 | Amazon | 69.5 | 555 | - | -
111 | glm-4.5-air | Zai | 69.2 | 4.9K | 131K | ¥0 / ¥0
112 | amazon-nova-experimental-chat-10-09 | Amazon | 68.9 | 434 | - | -
113 | qwen3-235b-a22b-no-thinking | Alibaba | 68.6 | 6.3K | 131K | ¥2.07 / ¥8.26
114 | minimax-m2.7 | Minimax | 68.3 | 3.7K | 205K | ¥0 / ¥0
115 | mimo-v2-flash (thinking) | Xiaomi | 68.1 | 1.7K | 262K | ¥0.72 / ¥2.16
116 | gpt-5-high | Openai | 67.8 | 5.1K | 400K | ¥9 / ¥72
117 | mimo-v2-omni | Xiaomi | 67.5 | 463 | 262K | ¥2.88 / ¥14.4
118 | hunyuan-turbos-20250416 | Tencent | 67.2 | 2K | 131K | ¥0 / ¥0
119 | claude-haiku-4-5-20251001 | Anthropic | 66.9 | 12.5K | 200K | ¥7.2 / ¥36
120 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | Google | 66.7 | 7.5K | 1.05M | ¥0.72 / ¥2.88
121 | glm-4.6v | Zai | 66.4 | 467 | 128K | ¥2.16 / ¥6.48
122 | qwen3-30b-a3b-instruct-2507 | Alibaba | 66.1 | 3.7K | 262K | ¥2.16 / ¥3.6
123 | nvidia-nemotron-3-super-120b-a12b | Nvidia | 65.8 | 1.1K | 262K | ¥1.44 / ¥5.76
124 | kimi-k2-0905-preview | Moonshot | 65.5 | 1.7K | 262K | ¥4.32 / ¥18
125 | ling-flash-2.0 | Ant Group | 65.3 | 1K | 131K | ¥1.01 / ¥4.1
126 | gemma-3-27b-it | Google | 65.0 | 7.9K | 128K | ¥2.15 / ¥2.15
127 | qwen3-next-80b-a3b-thinking | Alibaba | 64.7 | 2.1K | 131K | ¥1.04 / ¥10.3
128 | gpt-5.3-chat-latest | Openai | 64.4 | 4.8K | 128K | ¥12.6 / ¥101
129 | deepseek-v3-0324 | Deepseek | 64.1 | 7.7K | 75K | ¥1.44 / ¥5.76
130 | gpt-oss-120b | Openai | 63.9 | 4.8K | 131K | ¥1.08 / ¥4.32
131 | gpt-4.1-2025-04-14 | Openai | 63.6 | 8.5K | 1.05M | ¥14.4 / ¥57.6
132 | qwen2.5-max | Alibaba | 63.3 | 5.7K | 32K | ¥11.5 / ¥46
133 | nova-2-lite | Amazon | 63.0 | 2.1K | 128K | ¥2.38 / ¥19.8
134 | mistral-medium-2505 | Mistral | 62.7 | 5.7K | 262K | ¥2.88 / ¥14.4
135 | grok-3-mini-high | Xai | 62.5 | 2.8K | 128K | ¥0 / ¥0
136 | gpt-5.4-nano-high | Openai | 62.2 | 4.2K | 400K | ¥1.44 / ¥9
137 | deepseek-r1 | Deepseek | 61.9 | 3.3K | 164K | ¥5.04 / ¥18
138 | grok-3-mini-beta | Xai | 61.6 | 3.9K | 1M | ¥9 / ¥18
139 | gemini-2.5-flash-lite-preview-06-17-thinking | Google | 61.3 | 5.3K | 65.5K | ¥0.72 / ¥2.88
140 | qwen3-32b | Alibaba | 61.1 | 798 | 131K | ¥2.07 / ¥8.26
141 | qwen3-235b-a22b | Alibaba | 60.8 | 4.6K | 131K | ¥2.07 / ¥8.26
142 | gpt-5-mini-high | Openai | 60.5 | 4.2K | 400K | ¥1.8 / ¥14.4
143 | o1-2024-12-17 | Openai | 60.2 | 4.8K | 128K | ¥108 / ¥432
144 | claude-opus-4-20250514-thinking-16k | Anthropic | 59.9 | 6K | 200K | ¥108 / ¥540
145 | minimax-m2.5 | Minimax | 59.7 | 5.9K | 205K | ¥0 / ¥0
146 | gemma-3-12b-it | Google | 59.4 | 760 | 128K | ¥1.96 / ¥1.96
147 | nvidia-llama-3.3-nemotron-super-49b-v1.5 | Nvidia | 59.1 | 594 | 131K | ¥2.88 / ¥2.88
148 | kimi-k2-0711-preview | Moonshot | 58.8 | 4.4K | 131K | ¥4.32 / ¥18
149 | trinity-large-thinking | - | 58.5 | 3.9K | 262K | ¥1.8 / ¥6.48
150 | claude-opus-4-20250514 | Anthropic | 58.3 | 7.1K | 200K | ¥108 / ¥540
151 | glm-4.7-flash | Zai | 58.0 | 1.8K | 200K | ¥0 / ¥0
152 | intellect-3 | - | 57.7 | 890 | 131K | ¥1.44 / ¥7.92
153 | gemini-2.0-flash-001 | Google | 57.4 | 7.8K | 1.05M | ¥1.08 / ¥4.32
154 | nvidia-nemotron-3-nano-30b-a3b-bf16 | Nvidia | 57.1 | 2.5K | 131K | ¥0 / ¥0
155 | step-1o-turbo-202506 | Stepfun | 56.9 | 1.5K | - | -
156 | minimax-m1 | Minimax | 56.6 | 5.9K | 1M | ¥0.95 / ¥9.03
157 | step-3 | Stepfun | 56.3 | 1.1K | 65.5K | ¥1.8 / ¥4.68
158 | o4-mini-2025-04-16 | Openai | 56.0 | 7.6K | 200K | ¥7.92 / ¥31.7
159 | qwen-plus-0125 | Alibaba | 55.7 | 968 | 1M | ¥0.83 / ¥2.07
160 | trinity-large-preview | - | 55.5 | 4.6K | 262K | ¥1.8 / ¥6.48
161 | glm-4-plus-0111 | Zai | 55.2 | 986 | 128K | ¥72 / ¥72
162 | minimax-m2 | Minimax | 54.9 | 1.1K | 197K | ¥0 / ¥0
163 | qwen3-coder-480b-a35b-instruct | Alibaba | 54.6 | 4.1K | 262K | ¥6.2 / ¥24.8
164 | o1-preview | Openai | 54.3 | 5.4K | 128K | ¥108 / ¥432
165 | qwq-32b | Alibaba | 54.1 | 4.5K | 131K | ¥2.07 / ¥6.2
166 | claude-sonnet-4-20250514-thinking-32k | Anthropic | 53.8 | 5.7K | 200K | ¥21.6 / ¥108
167 | mercury-2 | Inception Ai | 53.5 | 472 | 128K | ¥1.8 / ¥5.4
168 | mistral-small-2506 | Mistral | 53.2 | 2.9K | 262K | ¥2.88 / ¥14.4
169 | claude-sonnet-4-20250514 | Anthropic | 52.9 | 6.4K | 200K | ¥21.6 / ¥108
170 | o3-mini-high | Openai | 52.7 | 3.4K | 200K | ¥7.92 / ¥31.7
171 | ring-flash-2.0 | Ant Group | 52.4 | 1.1K | 131K | ¥1.01 / ¥4.1
172 | command-a-03-2025 | Cohere | 52.1 | 9.4K | 256K | ¥18 / ¥72
173 | deepseek-v3 | Deepseek | 51.8 | 3.8K | 128K | ¥0 / ¥0
174 | llama-3.3-nemotron-49b-super-v1 | Nvidia | 51.5 | 395 | 131K | ¥0 / ¥0
175 | gemini-2.0-flash-lite-preview-02-05 | Google | 51.3 | 4.4K | 1.05M | ¥0.54 / ¥2.16
176 | step-2-16k-exp-202412 | Stepfun | 51.0 | 863 | 16.4K | ¥37.5 / ¥118
177 | glm-4.5v | Zai | 50.7 | 749 | 64K | ¥4.32 / ¥13
178 | hunyuan-turbo-0110 | Tencent | 50.4 | 414 | - | -
179 | gpt-4.1-mini-2025-04-14 | Openai | 50.1 | 6.5K | 1.05M | ¥2.88 / ¥11.5
180 | llama-3.1-nemotron-ultra-253b-v1 | Nvidia | 49.9 | 498 | 128K | ¥4.32 / ¥13
181 | gemma-3n-e4b-it | Google | 49.6 | 3.7K | 128K | ¥0 / ¥0
182 | qwen3-30b-a3b | Alibaba | 49.3 | 4.6K | 128K | ¥0.79 / ¥7.78
183 | gemini-1.5-pro-002 | Google | 49.0 | 9.8K | - | -
184 | hunyuan-turbos-20250226 | Tencent | 48.7 | 384 | 131K | ¥0 / ¥0
185 | gpt-5-nano-high | Openai | 48.5 | 1.3K | 400K | ¥0.36 / ¥2.88
186 | qwen2.5-plus-1127 | Alibaba | 48.2 | 1.8K | - | -
187 | o1-mini | Openai | 47.9 | 9K | 128K | ¥7.92 / ¥31.7
188 | yi-lightning | - | 47.6 | 4.8K | 12K | ¥1.44 / ¥1.44
189 | o3-mini | Openai | 47.3 | 9.7K | 200K | ¥7.92 / ¥31.7
190 | grok-2-2024-08-13 | Xai | 47.1 | 10.8K | 1M | ¥9 / ¥18
191 | olmo-3.1-32b-instruct | Allenai | 46.8 | 1.9K | 200K | ¥14.4 / ¥57.6
192 | olmo-3-32b-think | Allenai | 46.5 | 1K | 128K | ¥2.16 / ¥3.24
193 | athene-v2-chat | - | 46.2 | 4.4K | - | -
194 | granite-4.1-8b | Ibm | 45.9 | 603 | 131K | ¥0.36 / ¥0.72
195 | gemma-3-4b-it | Google | 45.7 | 780 | 128K | ¥1.44 / ¥1.44
196 | grok-2-mini-2024-08-13 | Xai | 45.4 | 8.8K | 1M | ¥9 / ¥18
197 | gemini-1.5-flash-002 | Google | 45.1 | 6.1K | 2M | ¥0.54 / ¥2.2
198 | claude-3-7-sonnet-20250219-thinking-32k | Anthropic | 44.8 | 6.8K | - | -
199 | gpt-oss-20b | Openai | 44.5 | 1.7K | 131K | ¥0.32 / ¥1.3
200 | gpt-4.1-nano-2025-04-14 | Openai | 44.3 | 1.2K | 1.05M | ¥14.4 / ¥57.6
201 | mercury | Inception Ai | 44.0 | 288 | 128K | ¥1.8 / ¥5.4
202 | llama-3.1-405b-instruct-bf16 | Meta | 43.7 | 7.2K | 128K | ¥0 / ¥0
203 | gpt-4o-2024-05-13 | Openai | 43.4 | 19K | 128K | ¥36 / ¥108
204 | glm-4-plus | Zai | 43.1 | 4.7K | 128K | ¥54 / ¥54
205 | athene-70b-0725 | - | 42.9 | 3K | - | -
206 | llama-3.1-nemotron-70b-instruct | Nvidia | 42.6 | 1.3K | 128K | ¥0 / ¥0
207 | hunyuan-large-2025-02-10 | Tencent | 42.3 | 707 | - | -
208 | deepseek-v2.5-1210 | Deepseek | 42.0 | 1.1K | 1M | ¥1.01 / ¥2.02
209 | llama-4-maverick-17b-128e-instruct | Meta | 41.7 | 6.8K | 1M | ¥1.8 / ¥6.26
210 | gpt-4o-mini-2024-07-18 | Openai | 41.5 | 11.6K | 128K | ¥1.08 / ¥4.32
211 | hunyuan-standard-2025-02-10 | Tencent | 41.2 | 696 | - | -
212 | llama-3.1-405b-instruct-fp8 | Meta | 40.9 | 10K | 128K | ¥0 / ¥0
213 | claude-3-7-sonnet-20250219 | Anthropic | 40.6 | 7.5K | 200K | ¥21.6 / ¥108
214 | llama-3.3-70b-instruct | Meta | 40.3 | 9.4K | 128K | ¥0 / ¥0
215 | olmo-3.1-32b-think | Allenai | 40.1 | 1.4K | 200K | ¥14.4 / ¥57.6
216 | llama-4-scout-17b-16e-instruct | Meta | 39.8 | 5K | 128K | ¥1.44 / ¥5.62
217 | qwen-max-0919 | Alibaba | 39.5 | 3K | 131K | ¥2.48 / ¥9.91
218 | llama-3.1-70b-instruct | Meta | 39.2 | 9.4K | 131K | ¥2.88 / ¥2.88
219 | qwen2.5-72b-instruct | Alibaba | 38.9 | 6.9K | 131K | ¥4.13 / ¥12.4
220 | gpt-4o-2024-08-06 | Openai | 38.7 | 7.6K | 128K | ¥18 / ¥72
221 | deepseek-v2.5 | Deepseek | 38.4 | 4.3K | 1M | ¥1.01 / ¥2.02
222 | claude-3-5-sonnet-20241022 | Anthropic | 38.1 | 15.6K | 200K | ¥21.6 / ¥108
223 | claude-3-5-sonnet-20240620 | Anthropic | 37.8 | 13.7K | 200K | ¥21.6 / ¥108
224 | gemini-advanced-0514 | Google | 37.5 | 8.8K | - | -
225 | gemini-1.5-pro-001 | Google | 37.3 | 13.6K | - | -
226 | llama-3.1-tulu-3-70b | Allenai | 37.0 | 520 | - | -
227 | mistral-large-2407 | Mistral | 36.7 | 7.5K | 131K | ¥14.4 / ¥43.2
228 | mistral-small-3.1-24b-instruct-2503 | Mistral | 36.4 | 5.3K | 262K | ¥2.88 / ¥14.4
229 | reka-core-20240904 | - | 36.1 | 1.2K | - | -
230 | mistral-large-2411 | Mistral | 35.9 | 4.9K | 128K | ¥14.4 / ¥43.2
231 | hunyuan-large-vision | Tencent | 35.6 | 930 | - | -
232 | gpt-4-turbo-2024-04-09 | Openai | 35.3 | 16.4K | 128K | ¥72 / ¥216
233 | amazon-nova-pro-v1.0 | Amazon | 35.0 | 4.3K | 300K | ¥5.76 / ¥23
234 | gpt-4-1106-preview | Openai | 34.7 | 17.7K | 8.19K | ¥216 / ¥432
235 | claude-3-opus-20240229 | Anthropic | 34.5 | 33.8K | 200K | ¥108 / ¥540
236 | ibm-granite-h-small | Ibm | 34.2 | 898 | - | -
237 | command-r-plus-08-2024 | Cohere | 33.9 | 1.6K | 128K | ¥18 / ¥72
238 | gpt-4-0125-preview | Openai | 33.6 | 16K | 8.19K | ¥216 / ¥432
239 | jamba-1.5-large | - | 33.3 | 1.4K | 256K | ¥0 / ¥0
240 | claude-3-5-haiku-20241022 | Anthropic | 33.1 | 11.9K | 200K | ¥5.76 / ¥28.8
241 | olmo-2-0325-32b-instruct | Allenai | 32.8 | 580 | - | -
242 | magistral-medium-2506 | Mistral | 32.5 | 2K | 128K | ¥14.4 / ¥36
243 | mistral-small-24b-instruct-2501 | Mistral | 32.2 | 2.6K | 262K | ¥2.88 / ¥14.4
244 | c4ai-aya-expanse-32b | Cohere | 31.9 | 4.8K | - | -
245 | amazon-nova-lite-v1.0 | Amazon | 31.7 | 3.3K | 300K | ¥0.43 / ¥1.73
246 | gemini-1.5-flash-8b-001 | Google | 31.4 | 6.2K | 2M | ¥0.54 / ¥2.2
247 | gemini-1.5-flash-001 | Google | 31.1 | 10.9K | 2M | ¥0.54 / ¥2.2
248 | gemma-2-9b-it-simpo | - | 30.8 | 1.6K | 8.19K | ¥1.44 / ¥1.44
249 | nemotron-4-340b-instruct | Nvidia | 30.5 | 3.3K | - | -
250 | gemma-2-27b-it | Google | 30.3 | 12.9K | 8.19K | ¥0.58 / ¥0.58
251 | glm-4-0520 | Zai | 30.0 | 1.6K | 128K | ¥108 / ¥108
252 | llama-3.1-nemotron-51b-instruct | Nvidia | 29.7 | 682 | 128K | ¥0 / ¥0
253 | llama-3-70b-instruct | Meta | 29.4 | 25.9K | 8.19K | ¥3.67 / ¥5.33
254 | reka-flash-20240904 | - | 29.1 | 1.3K | 65.5K | ¥0.72 / ¥1.44
255 | qwen2.5-coder-32b-instruct | Alibaba | 28.9 | 956 | 131K | ¥2.07 / ¥6.2
256 | command-r-08-2024 | Cohere | 28.6 | 1.6K | 128K | ¥18 / ¥72
257 | command-r-plus | Cohere | 28.3 | 13.2K | 128K | ¥18 / ¥72
258 | amazon-nova-micro-v1.0 | Amazon | 28.0 | 3.3K | 128K | ¥0.25 / ¥1.01
259 | claude-3-sonnet-20240229 | Anthropic | 27.7 | 19.1K | 200K | ¥21.6 / ¥108
260 | phi-4 | Microsoft | 27.5 | 4.2K | 128K | ¥0.9 / ¥3.6
261 | c4ai-aya-expanse-8b | Cohere | 27.2 | 1.7K | - | -
262 | gemma-2-9b-it | Google | 26.9 | 9.2K | 8.19K | ¥1.44 / ¥1.44
263 | hunyuan-standard-256k | Tencent | 26.6 | 538 | - | -
264 | qwen2-72b-instruct | Alibaba | 26.3 | 6.5K | 131K | ¥4.13 / ¥12.4
265 | llama-3.1-tulu-3-8b | Allenai | 26.1 | 516 | - | -
266 | llama-3.1-8b-instruct | Meta | 25.8 | 8.4K | 131K | ¥0.79 / ¥0.79
267 | jamba-1.5-mini | - | 25.5 | 1.5K | 256K | ¥0 / ¥0
268 | ministral-8b-2410 | Mistral | 25.2 | 827 | 128K | ¥0.72 / ¥0.72
269 | claude-3-haiku-20240307 | Anthropic | 24.9 | 20.6K | 200K | ¥1.8 / ¥9
270 | gpt-4-0314 | Openai | 24.6 | 9.6K | 8.19K | ¥216 / ¥432
271 | yi-1.5-34b-chat | - | 24.4 | 4.3K | - | -
272 | command-r | Cohere | 24.1 | 9.2K | 128K | ¥18 / ¥72
273 | llama-3-8b-instruct | Meta | 23.8 | 17.3K | 8.19K | ¥0.29 / ¥0.29
274 | reka-flash-21b-20240226-online | - | 23.5 | 2.6K | - | -
275 | qwen1.5-110b-chat | Alibaba | 23.2 | 4.4K | - | -
276 | internlm2_5-20b-chat | - | 23.0 | 1.7K | - | -
277 | deepseek-coder-v2 | Deepseek | 22.7 | 2.6K | 1M | ¥1.01 / ¥2.02
278 | reka-flash-21b-20240226 | - | 22.4 | 4.1K | - | -
279 | gemma-2-2b-it | Google | 22.1 | 7.9K | 128K | ¥0 / ¥0
280 | mistral-medium | Mistral | 21.8 | 6.1K | 262K | ¥2.88 / ¥14.4
281 | qwen1.5-72b-chat | Alibaba | 21.6 | 6.9K | - | -
282 | qwq-32b-preview | Alibaba | 21.3 | 583 | 131K | ¥2.07 / ¥6.2
283 | mistral-large-2402 | Mistral | 21.0 | 10.9K | 262K | ¥2.88 / ¥14.4
284 | gpt-4-0613 | Openai | 20.7 | 15.3K | 8.19K | ¥216 / ¥432
285 | mixtral-8x22b-instruct-v0.1 | Mistral | 20.4 | 8.7K | 64K | ¥14.4 / ¥43.2
286 | gemini-pro-dev-api | Google | 20.2 | 3.3K | 1.05M | ¥14.4 / ¥86.4
287 | zephyr-orpo-141b-A35b-v0.1 | - | 19.9 | 827 | 200K | ¥108 / ¥432
288 | starling-lm-7b-beta | - | 19.6 | 2.9K | 200K | ¥5.4 / ¥18.7
289 | phi-3-medium-4k-instruct | Microsoft | 19.3 | 4.2K | 4.1K | ¥1.22 / ¥4.9
290 | granite-3.1-8b-instruct | Ibm | 19.0 | 560 | - | -
291 | yi-34b-chat | - | 18.8 | 2.7K | - | -
292 | qwen1.5-32b-chat | Alibaba | 18.5 | 3.9K | - | -
293 | llama-2-70b-chat | Meta | 18.2 | 6.6K | - | -
294 | gemini-pro | Google | 17.9 | 1.1K | 1.05M | ¥14.4 / ¥86.4
295 | mixtral-8x7b-instruct-v0.1 | Mistral | 17.6 | 12.8K | 32K | ¥5.04 / ¥5.04
296 | starling-lm-7b-alpha | - | 17.4 | 1.9K | 200K | ¥5.4 / ¥18.7
297 | wizardlm-70b | Microsoft | 17.1 | 1.3K | - | -
298 | granite-3.1-2b-instruct | Ibm | 16.8 | 549 | - | -
299 | phi-3-small-8k-instruct | Microsoft | 16.5 | 3.1K | 8.19K | ¥1.08 / ¥4.32
300 | llama-3.2-3b-instruct | Meta | 16.2 | 1.4K | 131K | ¥0.22 / ¥0.35
301 | tulu-2-dpo-70b | - | 16.0 | 1.1K | - | -
302 | nous-hermes-2-mixtral-8x7b-dpo | - | 15.7 | 651 | 1M | ¥36 / ¥180
303 | qwen1.5-14b-chat | Alibaba | 15.4 | 3.1K | - | -
304 | dbrx-instruct-preview | - | 15.1 | 5.6K | - | -
305 | vicuna-33b | - | 14.8 | 4K | - | -
306 | granite-3.0-8b-instruct | Ibm | 14.6 | 1.1K | - | -
307 | llama2-70b-steerlm-chat | Nvidia | 14.3 | 642 | - | -
308 | gpt-3.5-turbo-0125 | Openai | 14.0 | 11.4K | 16.4K | ¥3.6 / ¥10.8
309 | llama-2-13b-chat | Meta | 13.7 | 3.3K | - | -
310 | snowflake-arctic-instruct | - | 13.4 | 5.4K | - | -
311 | openchat-3.5-0106 | - | 13.2 | 2.2K | - | -
312 | mistral-7b-instruct-v0.2 | Mistral | 12.9 | 3.3K | 262K | ¥2.88 / ¥14.4
313 | gemma-1.1-7b-it | Google | 12.6 | 4.1K | - | -
314 | qwen1.5-7b-chat | Alibaba | 12.3 | 828 | - | -
315 | solar-10.7b-instruct-v1.0 | - | 12.0 | 721 | 128K | ¥0 / ¥0
316 | granite-3.0-2b-instruct | Ibm | 11.8 | 1.2K | - | -
317 | wizardlm-13b | Microsoft | 11.5 | 1.2K | - | -
318 | phi-3-mini-4k-instruct-june-2024 | Microsoft | 11.2 | 2K | 4.1K | ¥0.94 / ¥3.74
319 | openchat-3.5 | - | 10.9 | 1.4K | - | -
320 | codellama-34b-instruct | Meta | 10.6 | 1.2K | - | -
321 | deepseek-llm-67b-chat | Deepseek | 10.4 | 876 | 1M | ¥1.01 / ¥2.02
322 | dolphin-2.2.1-mistral-7b | - | 10.1 | 308 | 262K | ¥2.88 / ¥14.4
323 | openhermes-2.5-mistral-7b | - | 9.8 | 884 | 1M | ¥36 / ¥180
324 | phi-3-mini-4k-instruct | Microsoft | 9.5 | 3.5K | 4.1K | ¥0.94 / ¥3.74
325 | zephyr-7b-beta | - | 9.2 | 2K | - | -
326 | guanaco-33b | - | 9.0 | 526 | 200K | ¥14.4 / ¥57.6
327 | llama-3.2-1b-instruct | Meta | 8.7 | 1.4K | 16.4K | ¥0.07 / ¥0.08
328 | gpt-3.5-turbo-1106 | Openai | 8.4 | 3K | 16.4K | ¥7.2 / ¥14.4
329 | llama-2-7b-chat | Meta | 8.1 | 2.5K | 128K | ¥4.03 / ¥48
330 | smollm2-1.7b-instruct | - | 7.8 | 352 | - | -
331 | mpt-30b-chat | - | 7.6 | 509 | - | -
332 | vicuna-13b | - | 7.3 | 3.3K | - | -
333 | zephyr-7b-alpha | - | 7.0 | 294 | - | -
334 | phi-3-mini-128k-instruct | Microsoft | 6.7 | 3.4K | 128K | ¥0.94 / ¥3.74
335 | gemma-7b-it | Google | 6.4 | 1.6K | - | -
336 | falcon-180b-chat | - | 6.2 | 206 | - | -
337 | stripedhyena-nous-7b | - | 5.9 | 863 | - | -
338 | palm-2 | Google | 5.6 | 1.4K | - | -
339 | qwen-14b-chat | Alibaba | 5.3 | 894 | 32.8K | ¥1.04 / ¥3.1
340 | olmo-7b-instruct | Allenai | 5.0 | 1.2K | - | -
341 | vicuna-7b | - | 4.8 | 1.2K | - | -
342 | mistral-7b-instruct | Mistral | 4.5 | 1.6K | 262K | ¥2.88 / ¥14.4
343 | gemma-1.1-2b-it | Google | 4.2 | 1.9K | - | -
344 | koala-13b | - | 3.9 | 1.2K | - | -
345 | qwen1.5-4b-chat | Alibaba | 3.6 | 1.3K | - | -
346 | gemma-2b-it | Google | 3.4 | 879 | - | -
347 | chatglm3-6b | - | 3.1 | 888 | 200K | ¥5.4 / ¥18.7
348 | RWKV-4-Raven-14B | - | 2.8 | 844 | - | -
349 | chatglm2-6b | - | 2.5 | 436 | 200K | ¥5.4 / ¥18.7
350 | mpt-7b-chat | - | 2.2 | 677 | - | -
351 | gpt4all-13b-snoozy | - | 2.0 | 305 | 1M | ¥36 / ¥216
352 | oasst-pythia-12b | - | 1.7 | 1.1K | - | -
353 | alpaca-13b | - | 1.4 | 986 | - | -
354 | fastchat-t5-3b | - | 1.1 | 685 | - | -
355 | chatglm-6b | - | 0.8 | 819 | 200K | ¥5.4 / ¥18.7
356 | llama-13b | Meta | 0.6 | 414 | - | -
357 | stablelm-tuned-alpha-7b | - | 0.3 | 615 | - | -
358 | dolly-v2-12b | - | 0.0 | 589 | - | -

Top model analysis

Why claude-opus-4-6-thinking ranks first

claude-opus-4-6-thinking ranks first with a percent score of 100.0 from 5.6K samples. The lead is narrow: claude-opus-4-6 follows at 99.7 with 5.9K samples. Treat the top entry as the default starting point for this leaderboard, then compare price, context length, and availability.
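
The page does not say how the percent score is computed. The listed values are, however, consistent with a plain rank percentile over the 358 models, which would mean a gap such as 100.0 versus 99.7 reflects one rank position rather than a measured quality difference. The sketch below checks that reading against a few scores copied from the leaderboard. The formula and the function name `rank_percentile` are assumptions made for illustration, not something the source confirms.

```python
def rank_percentile(rank: int, total: int) -> float:
    """Map rank 1..total linearly onto 100..0.

    Assumed formula; the page does not document its scoring.
    """
    if total < 2 or not 1 <= rank <= total:
        raise ValueError("rank must be in 1..total and total must be at least 2")
    return 100.0 * (total - rank) / (total - 1)


# Scores copied from the leaderboard (rank -> listed percent score).
listed = {1: 100.0, 2: 99.7, 5: 98.9, 100: 72.3, 358: 0.0}

# Every spot-checked row matches the assumed formula after rounding.
for rank, score in listed.items():
    assert round(rank_percentile(rank, 358), 1) == score
```

If this reading holds, adjacent scores differ by roughly 0.28 points by construction, so the sample count and the practical columns carry more information than small score gaps.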

How to choose

Look beyond rank #1

Start with the leaderboard closest to your task. Compare the top models by score and sample size, then check price, context length, open or closed access, and provider availability.
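
The selection steps above can be sketched as a small filter. This is a minimal illustration, not part of the site: the `Entry` class, the `parse_count` and `shortlist` helpers, and all threshold values are hypothetical, and the price is taken as the listed ¥ figure without assuming a billing unit. The five rows are copied from this leaderboard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    rank: int
    model: str
    score: float
    samples: str                   # as printed on the page, e.g. "5.6K" or "664"
    context: str                   # e.g. "1M", "262K", or "-" when not listed
    input_price: Optional[float]   # listed input price in yuan; None when not listed


def parse_count(text: str) -> Optional[float]:
    """Convert page notation such as '5.6K' or '1.05M' to a number; '-' gives None."""
    text = text.strip()
    if text == "-":
        return None
    multipliers = {"K": 1_000, "M": 1_000_000}
    if text[-1] in multipliers:
        return float(text[:-1]) * multipliers[text[-1]]
    return float(text)


def shortlist(entries: List[Entry], min_samples: float = 2_000,
              min_context: float = 200_000, max_input_price: float = 20.0,
              top_n: int = 10) -> List[str]:
    """Take the top_n entries by rank, then drop those failing the practical filters."""
    kept = []
    for e in sorted(entries, key=lambda e: e.rank)[:top_n]:
        samples = parse_count(e.samples)
        context = parse_count(e.context)
        if samples is None or samples < min_samples:
            continue   # too few votes to trust the position
        if context is None or context < min_context:
            continue   # context window missing or too short
        if e.input_price is None or e.input_price > max_input_price:
            continue   # price missing or over budget
        kept.append(e.model)
    return kept


rows = [
    Entry(1, "claude-opus-4-6-thinking", 100.0, "5.6K", "1M", 36.0),
    Entry(4, "gemini-3.5-flash", 99.2, "1.4K", "1.05M", 10.8),
    Entry(5, "gemini-3.1-pro-preview", 98.9, "7.1K", "1.05M", 14.4),
    Entry(6, "qwen3.5-max-preview", 98.6, "3.2K", "-", None),
    Entry(8, "ernie-5.1", 98.0, "2.4K", "119K", 5.4),
]
result = shortlist(rows)
print(result)  # ['gemini-3.1-pro-preview']
```

With these example thresholds the top-ranked model is filtered out on price, which is the point of the advice: the best-ranked entry is not automatically the best fit for a given budget or deployment.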

FAQ

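Which metrics matter on the Life, Physical & Social Science leaderboard?

Look mainly at rank, percent score, sample size, and source. The score gives a quick comparison of models within the same leaderboard; the sample size indicates how stable the result is.

Why can't different leaderboards be merged directly into one overall score?

Leaderboards differ in their tasks, samples, and evaluation methodology. By default, 模力榜 ranks models only within a single leaderboard, to avoid forcing writing, coding, image, and other capabilities into one number.

How should I choose a Life, Physical & Social Science model?

Start with the leaderboard closest to your task, then weigh price, context length, open versus closed source, and vendor availability. A high rank does not mean a model fits every budget or deployment method.

How often is the leaderboard updated?

The page shows the most recent successfully collected public leaderboard data. It currently gives priority to the LMArena leaderboard dataset and keeps the original link in the page's source section.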