Chat · Text · Multi-Turn Leaderboard

Ranking for Text / Multi-Turn, based on public preference data.

Selection guide

Multi-Turn model ranking guide

Top models: claude-opus-4-6, claude-opus-4-7-thinking, claude-opus-4-6-thinking, claude-opus-4-7, gemini-3.5-flash
Models: 358
Published: 2026/05/27
Source: Arena public preference evaluation. Original leaderboard: Text / Multi Turn. Leaderboard dataset: LMArena latest parquet.
| Rank | Model | Provider | Score | Samples | Context | Price (input / output) |
|---|---|---|---|---|---|---|
| 1 | claude-opus-4-6 | Anthropic | 100.0 | 6.4K | 1M | ¥36 / ¥180 |
| 2 | claude-opus-4-7-thinking | Anthropic | 99.7 | 3.4K | 1M | ¥36 / ¥180 |
| 3 | claude-opus-4-6-thinking | Anthropic | 99.4 | 5.8K | 1M | ¥36 / ¥180 |
| 4 | claude-opus-4-7 | Anthropic | 99.2 | 3.6K | 1M | ¥36 / ¥180 |
| 5 | gemini-3.5-flash | Google | 98.9 | 1.6K | 1.05M | ¥10.8 / ¥64.8 |
| 6 | gemini-3.1-pro-preview | Google | 98.6 | 7.5K | 1.05M | ¥14.4 / ¥86.4 |
| 7 | gpt-5.4-high | OpenAI | 98.3 | 5.2K | 1.05M | ¥18 / ¥108 |
| 8 | gemini-3-pro | Google | 98.0 | 6.8K | 1.05M | ¥14.4 / ¥86.4 |
| 9 | qwen3.7-max-preview | Alibaba | 97.8 | 660 | 1M | ¥18 / ¥54 |
| 10 | muse-spark | Meta | 97.5 | 1.9K | - | - |
| 11 | qwen3.5-max-preview | Alibaba | 97.2 | 3.4K | - | - |
| 12 | glm-5.1 | Zai | 96.9 | 2.3K | 200K | ¥0 / ¥0 |
| 13 | mimo-v2.5-pro | Xiaomi | 96.6 | 2.7K | 1.05M | ¥7.2 / ¥21.6 |
| 14 | ernie-5.1 | Baidu | 96.4 | 2.5K | 119K | ¥5.4 / ¥21.6 |
| 15 | gemini-3-flash | Google | 96.1 | 5.3K | 1.05M | ¥3.6 / ¥21.6 |
| 16 | gpt-5.5-high | OpenAI | 95.8 | 2.8K | 1.05M | ¥36 / ¥216 |
| 17 | gpt-5.5 | OpenAI | 95.5 | 2.9K | 1.05M | ¥36 / ¥216 |
| 18 | gpt-5.4 | OpenAI | 95.2 | 5.6K | 1.05M | ¥18 / ¥108 |
| 19 | claude-opus-4-5-20251101 | Anthropic | 95.0 | 11.7K | 200K | ¥36 / ¥180 |
| 20 | claude-sonnet-4-5-20250929 | Anthropic | 94.7 | 13.2K | 200K | ¥21.6 / ¥108 |
| 21 | deepseek-v4-pro | DeepSeek | 94.4 | 3.1K | 1M | ¥3.13 / ¥6.26 |
| 22 | claude-sonnet-4-6 | Anthropic | 94.1 | 5K | 1M | ¥21.6 / ¥108 |
| 23 | claude-opus-4-5-20251101-thinking-32k | Anthropic | 93.8 | 6.5K | 200K | ¥108 / ¥540 |
| 24 | gemini-3-flash (thinking-minimal) | Google | 93.6 | 9.5K | 1.05M | ¥3.6 / ¥21.6 |
| 25 | gpt-5.2-chat-latest-20260210 | OpenAI | 93.3 | 6K | 400K | ¥12.6 / ¥101 |
| 26 | grok-4.20-beta-0309-reasoning | xAI | 93.0 | 5.3K | 2M | ¥14.4 / ¥43.2 |
| 27 | mimo-v2-pro | Xiaomi | 92.7 | 4K | 1.05M | ¥7.2 / ¥21.6 |
| 28 | grok-4.20-multi-agent-beta-0309 | xAI | 92.4 | 4.8K | 2M | ¥14.4 / ¥43.2 |
| 29 | chatgpt-4o-latest-20250326 | OpenAI | 92.2 | 14.6K | 128K | ¥18 / ¥72 |
| 30 | qwen3.6-max-preview | Alibaba | 91.9 | 742 | 246K | ¥9.5 / ¥56.9 |
| 31 | grok-4.20-beta1 | xAI | 91.6 | 3.9K | 2M | ¥14.4 / ¥43.2 |
| 32 | glm-5 | Zai | 91.3 | 3.5K | 205K | ¥7.2 / ¥23 |
| 33 | gemini-2.5-pro | Google | 91.0 | 20.8K | 1.05M | ¥9 / ¥72 |
| 34 | gemma-4-31b | Google | 90.8 | 1.1K | 262K | ¥3.24 / ¥7.2 |
| 35 | claude-sonnet-4-5-20250929-thinking-32k | Anthropic | 90.5 | 13.2K | 200K | ¥21.6 / ¥108 |
| 36 | deepseek-v4-pro-thinking | DeepSeek | 90.2 | 2.8K | 1M | ¥3.13 / ¥6.26 |
| 37 | qwen3-max-preview | Alibaba | 89.9 | 4.7K | 262K | ¥6.2 / ¥24.8 |
| 38 | gpt-5.1-high | OpenAI | 89.6 | 7.1K | 400K | ¥9 / ¥72 |
| 39 | glm-4.7 | Zai | 89.4 | 1.9K | 205K | ¥0 / ¥0 |
| 40 | kimi-k2.6 | Moonshot | 89.1 | 2.7K | 262K | ¥6.84 / ¥28.8 |
| 41 | mimo-v2.5 | Xiaomi | 88.8 | 2.8K | 1.05M | ¥2.88 / ¥14.4 |
| 42 | claude-opus-4-1-20250805-thinking-16k | Anthropic | 88.5 | 8.6K | 200K | ¥108 / ¥540 |
| 43 | qwen3.5-397b-a17b | Alibaba | 88.2 | 5.4K | 262K | ¥3.1 / ¥18.6 |
| 44 | amazon-nova-experimental-chat-26-02-10 | Amazon | 88.0 | 599 | - | - |
| 45 | kimi-k2.5-thinking | Moonshot | 87.7 | 6.4K | 262K | ¥4.32 / ¥21.6 |
| 46 | gpt-4.5-preview-2025-02-27 | OpenAI | 87.4 | 2.1K | 8.19K | ¥216 / ¥432 |
| 47 | deepseek-v4-flash | DeepSeek | 87.1 | 3K | 1M | ¥1.01 / ¥2.02 |
| 48 | claude-opus-4-1-20250805 | Anthropic | 86.8 | 13.3K | 200K | ¥108 / ¥540 |
| 49 | dola-seed-2.0-pro | Bytedance | 86.6 | 6.6K | - | - |
| 50 | gemma-4-26b-a4b | Google | 86.3 | 1.1K | 262K | ¥0.94 / ¥2.88 |
| 51 | grok-4.1 | xAI | 86.0 | 11.9K | 200K | ¥14.4 / ¥72 |
| 52 | qwen3-max-2025-09-23 | Alibaba | 85.7 | 1.7K | 258K | ¥6.19 / ¥24.7 |
| 53 | gpt-5.4-mini-high | OpenAI | 85.4 | 5K | 400K | ¥5.4 / ¥32.4 |
| 54 | gpt-5.5-instant | OpenAI | 85.2 | 4.6K | 400K | ¥9 / ¥72 |
| 55 | grok-4.1-thinking | xAI | 84.9 | 11.7K | 200K | ¥14.4 / ¥72 |
| 56 | qwen3.6-plus | Alibaba | 84.6 | 3.2K | 1M | ¥3.6 / ¥21.6 |
| 57 | gpt-5.1 | OpenAI | 84.3 | 8K | 400K | ¥9 / ¥72 |
| 58 | ernie-5.0-0110 | Baidu | 84.0 | 5.5K | 128K | ¥7.92 / ¥14.4 |
| 59 | qwen3-235b-a22b-instruct-2507 | Alibaba | 83.8 | 16.6K | 128K | ¥2.09 / ¥8.23 |
| 60 | deepseek-v4-flash-thinking | DeepSeek | 83.5 | 2.9K | 1M | ¥1.01 / ¥2.02 |
| 61 | mistral-large-3 | Mistral | 83.2 | 7.3K | 262K | ¥3.6 / ¥10.8 |
| 62 | qwen3-vl-235b-a22b-instruct | Alibaba | 82.9 | 2K | 128K | ¥2.16 / ¥8.64 |
| 63 | grok-4.3 | xAI | 82.6 | 3K | 1M | ¥9 / ¥18 |
| 64 | kimi-k2.5-instant | Moonshot | 82.4 | 1.5K | 262K | ¥4.32 / ¥21.6 |
| 65 | glm-4.6 | Zai | 82.1 | 5.7K | 205K | ¥4.32 / ¥15.8 |
| 66 | deepseek-v3.2 | DeepSeek | 81.8 | 8K | 128K | ¥2.09 / ¥3.1 |
| 67 | gpt-5-chat | OpenAI | 81.5 | 5.8K | 400K | ¥9 / ¥72 |
| 68 | deepseek-v3.2-exp | DeepSeek | 81.2 | 2K | 128K | ¥0 / ¥0 |
| 69 | grok-3-preview-02-24 | xAI | 81.0 | 4.8K | 1M | ¥9 / ¥18 |
| 70 | ernie-5.0-preview-1022 | Baidu | 80.7 | 703 | 128K | ¥7.92 / ¥14.4 |
| 71 | deepseek-v3.2-thinking | DeepSeek | 80.4 | 6.7K | 128K | ¥2.09 / ¥3.1 |
| 72 | mistral-medium-2508 | Mistral | 80.1 | 15.8K | 262K | ¥2.88 / ¥14.4 |
| 73 | hunyuan-vision-1.5-thinking | Tencent | 79.8 | 438 | - | - |
| 74 | gpt-5.2-high | OpenAI | 79.6 | 8.5K | 400K | ¥12.6 / ¥101 |
| 75 | gemini-3.1-flash-lite-preview | Google | 79.3 | 6.3K | 1.05M | ¥1.8 / ¥10.8 |
| 76 | gpt-5.2 | OpenAI | 79.0 | 8.8K | 400K | ¥12.6 / ¥101 |
| 77 | qwen3-next-80b-a3b-instruct | Alibaba | 78.7 | 4.1K | 131K | ¥1.04 / ¥4.13 |
| 78 | glm-4.5 | Zai | 78.4 | 3.9K | 131K | ¥4.32 / ¥15.8 |
| 79 | grok-4-fast-chat | xAI | 78.2 | 1.2K | 2M | ¥1.44 / ¥3.6 |
| 80 | grok-4-0709 | xAI | 77.9 | 6.6K | 256K | ¥21.6 / ¥108 |
| 81 | deepseek-v3.1-terminus-thinking | DeepSeek | 77.6 | 604 | 128K | ¥1.8 / ¥5.04 |
| 82 | deepseek-v3.2-exp-thinking | DeepSeek | 77.3 | 1.4K | 128K | ¥0 / ¥0 |
| 83 | hunyuan-hy3-preview | Tencent | 77.0 | 990 | 256K | ¥0 / ¥0 |
| 84 | ernie-5.0-preview-1203 | Baidu | 76.8 | 1.5K | 128K | ¥7.92 / ¥14.4 |
| 85 | longcat-flash-chat-2602-exp | Meituan | 76.5 | 4.2K | 128K | ¥1.08 / ¥10.8 |
| 86 | qwen3.5-122b-a10b | Alibaba | 76.2 | 4.5K | 262K | ¥2.88 / ¥23 |
| 87 | kimi-k2-thinking-turbo | Moonshot | 75.9 | 10.4K | 262K | ¥17.3 / ¥72 |
| 88 | qwen3.5-27b | Alibaba | 75.6 | 4.3K | 262K | ¥2.16 / ¥17.3 |
| 89 | gpt-5.3-chat-latest | OpenAI | 75.4 | 5.7K | 128K | ¥12.6 / ¥101 |
| 90 | minimax-m2.7 | Minimax | 75.1 | 4.1K | 205K | ¥0 / ¥0 |
| 91 | claude-haiku-4-5-20251001 | Anthropic | 74.8 | 13.7K | 200K | ¥7.2 / ¥36 |
| 92 | gemini-2.5-flash | Google | 74.5 | 21.2K | 1.05M | ¥2.16 / ¥18 |
| 93 | deepseek-v3.1-thinking | DeepSeek | 74.2 | 1.9K | 128K | ¥1.44 / ¥5.04 |
| 94 | qwen3-235b-a22b-thinking-2507 | Alibaba | 73.9 | 1.4K | 131K | ¥2.07 / ¥8.26 |
| 95 | amazon-nova-experimental-chat-12-10 | Amazon | 73.7 | 629 | - | - |
| 96 | step-3.5-flash | Stepfun | 73.4 | 6.1K | 256K | ¥0.69 / ¥2.07 |
| 97 | deepseek-r1-0528 | DeepSeek | 73.1 | 3K | 164K | ¥3.6 / ¥15.5 |
| 98 | grok-4-fast-reasoning | xAI | 72.8 | 3.1K | 2M | ¥1.44 / ¥3.6 |
| 99 | o3-2025-04-16 | OpenAI | 72.5 | 9.9K | 200K | ¥14.4 / ¥57.6 |
| 100 | mimo-v2-flash (non-thinking) | Xiaomi | 72.3 | 7.8K | 262K | ¥0.72 / ¥2.16 |
| 101 | longcat-flash-chat | Meituan | 72.0 | 2K | 128K | ¥1.08 / ¥10.8 |
| 102 | deepseek-v3.1 | DeepSeek | 71.7 | 2.5K | 128K | ¥1.44 / ¥5.04 |
| 103 | gpt-5-high | OpenAI | 71.4 | 5K | 400K | ¥9 / ¥72 |
| 104 | qwen3-235b-a22b-no-thinking | Alibaba | 71.1 | 6.8K | 131K | ¥2.07 / ¥8.26 |
| 105 | gpt-4.1-2025-04-14 | OpenAI | 70.9 | 9K | 1.05M | ¥14.4 / ¥57.6 |
| 106 | mimo-v2-omni | Xiaomi | 70.6 | 482 | 262K | ¥2.88 / ¥14.4 |
| 107 | gemini-2.5-flash-preview-09-2025 | Google | 70.3 | 5.8K | 1M | ¥2.16 / ¥18 |
| 108 | claude-opus-4-20250514-thinking-16k | Anthropic | 70.0 | 6.3K | 200K | ¥108 / ¥540 |
| 109 | minimax-m2.1-preview | Minimax | 69.7 | 2.9K | 205K | ¥0 / ¥0 |
| 110 | hunyuan-t1-20250711 | Tencent | 69.5 | 687 | 131K | ¥0 / ¥0 |
| 111 | grok-4-1-fast-reasoning | xAI | 69.2 | 9.8K | 2M | ¥1.44 / ¥3.6 |
| 112 | deepseek-r1 | DeepSeek | 68.9 | 2.4K | 164K | ¥5.04 / ¥18 |
| 113 | qwen3.5-35b-a3b | Alibaba | 68.6 | 4.8K | 262K | ¥1.8 / ¥14.4 |
| 114 | qwen3-vl-235b-a22b-thinking | Alibaba | 68.3 | 1.3K | 131K | ¥2.06 / ¥8.26 |
| 115 | qwen3.5-flash | Alibaba | 68.1 | 5.2K | 1M | ¥1.24 / ¥12.4 |
| 116 | deepseek-v3.1-terminus | DeepSeek | 67.8 | 691 | 128K | ¥1.8 / ¥5.04 |
| 117 | deepseek-v3-0324 | DeepSeek | 67.5 | 7.9K | 75K | ¥1.44 / ¥5.76 |
| 118 | claude-opus-4-20250514 | Anthropic | 67.2 | 7.7K | 200K | ¥108 / ¥540 |
| 119 | hunyuan-turbos-20250416 | Tencent | 66.9 | 1.7K | 131K | ¥0 / ¥0 |
| 120 | mistral-medium-2505 | Mistral | 66.7 | 5.8K | 262K | ¥2.88 / ¥14.4 |
| 121 | gpt-5.4-nano-high | OpenAI | 66.4 | 4.9K | 400K | ¥1.44 / ¥9 |
| 122 | amazon-nova-experimental-chat-26-01-10 | Amazon | 66.1 | 603 | - | - |
| 123 | qwen3-30b-a3b-instruct-2507 | Alibaba | 65.8 | 4.1K | 262K | ¥2.16 / ¥3.6 |
| 124 | amazon-nova-experimental-chat-11-10 | Amazon | 65.5 | 4.1K | - | - |
| 125 | mimo-v2-flash (thinking) | Xiaomi | 65.3 | 1.9K | 262K | ¥0.72 / ¥2.16 |
| 126 | claude-sonnet-4-20250514-thinking-32k | Anthropic | 65.0 | 5.9K | 200K | ¥21.6 / ¥108 |
| 127 | kimi-k2-0711-preview | Moonshot | 64.7 | 4.7K | 131K | ¥4.32 / ¥18 |
| 128 | glm-4.5-air | Zai | 64.4 | 5.1K | 131K | ¥0 / ¥0 |
| 129 | o1-preview | OpenAI | 64.1 | 5.9K | 128K | ¥108 / ¥432 |
| 130 | minimax-m2 | Minimax | 63.9 | 1.1K | 197K | ¥0 / ¥0 |
| 131 | nvidia-nemotron-3-super-120b-a12b | Nvidia | 63.6 | 1.3K | 262K | ¥1.44 / ¥5.76 |
| 132 | amazon-nova-experimental-chat-10-20 | Amazon | 63.3 | 1.9K | - | - |
| 133 | qwen3-coder-480b-a35b-instruct | Alibaba | 63.0 | 4.4K | 262K | ¥6.2 / ¥24.8 |
| 134 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | Google | 62.7 | 8.2K | 1.05M | ¥0.72 / ¥2.88 |
| 135 | kimi-k2-0905-preview | Moonshot | 62.5 | 2.1K | 262K | ¥4.32 / ¥18 |
| 136 | gpt-5-mini-high | OpenAI | 62.2 | 4.5K | 400K | ¥1.8 / ¥14.4 |
| 137 | qwen2.5-max | Alibaba | 61.9 | 4.9K | 32K | ¥11.5 / ¥46 |
| 138 | claude-sonnet-4-20250514 | Anthropic | 61.6 | 7.2K | 200K | ¥21.6 / ¥108 |
| 139 | qwen3-235b-a22b | Alibaba | 61.3 | 4.3K | 131K | ¥2.07 / ¥8.26 |
| 140 | minimax-m2.5 | Minimax | 61.1 | 6.4K | 205K | ¥0 / ¥0 |
| 141 | gemini-2.5-flash-lite-preview-06-17-thinking | Google | 60.8 | 5.7K | 65.5K | ¥0.72 / ¥2.88 |
| 142 | glm-4.6v | Zai | 60.5 | 480 | 128K | ¥2.16 / ¥6.48 |
| 143 | mercury-2 | Inception Ai | 60.2 | 540 | 128K | ¥1.8 / ¥5.4 |
| 144 | o1-2024-12-17 | OpenAI | 59.9 | 4.4K | 128K | ¥108 / ¥432 |
| 145 | amazon-nova-experimental-chat-10-09 | Amazon | 59.7 | 481 | - | - |
| 146 | gpt-4.1-mini-2025-04-14 | OpenAI | 59.4 | 7.2K | 1.05M | ¥2.88 / ¥11.5 |
| 147 | intellect-3 | - | 59.1 | 780 | 131K | ¥1.44 / ¥7.92 |
| 148 | grok-3-mini-high | xAI | 58.8 | 2.9K | 128K | ¥0 / ¥0 |
| 149 | gemini-2.0-flash-001 | Google | 58.5 | 6.9K | 1.05M | ¥1.08 / ¥4.32 |
| 150 | o4-mini-2025-04-16 | OpenAI | 58.3 | 7.9K | 200K | ¥7.92 / ¥31.7 |
| 151 | deepseek-v3 | DeepSeek | 58.0 | 3.7K | 128K | ¥0 / ¥0 |
| 152 | grok-3-mini-beta | xAI | 57.7 | 3.9K | 1M | ¥9 / ¥18 |
| 153 | qwen3-next-80b-a3b-thinking | Alibaba | 57.4 | 2.3K | 131K | ¥1.04 / ¥10.3 |
| 154 | mistral-small-2506 | Mistral | 57.1 | 3.1K | 262K | ¥2.88 / ¥14.4 |
| 155 | gemma-3-27b-it | Google | 56.9 | 7K | 128K | ¥2.15 / ¥2.15 |
| 156 | glm-4.7-flash | Zai | 56.6 | 2.1K | 200K | ¥0 / ¥0 |
| 157 | trinity-large-preview | - | 56.3 | 5K | 262K | ¥1.8 / ¥6.48 |
| 158 | gpt-oss-120b | OpenAI | 56.0 | 5K | 131K | ¥1.08 / ¥4.32 |
| 159 | claude-3-7-sonnet-20250219 | Anthropic | 55.7 | 7.1K | 200K | ¥21.6 / ¥108 |
| 160 | step-1o-turbo-202506 | Stepfun | 55.5 | 1.4K | - | - |
| 161 | command-a-03-2025 | Cohere | 55.2 | 9.5K | 256K | ¥18 / ¥72 |
| 162 | step-3 | Stepfun | 54.9 | 1.1K | 65.5K | ¥1.8 / ¥4.68 |
| 163 | nova-2-lite | Amazon | 54.6 | 2.1K | 128K | ¥2.38 / ¥19.8 |
| 164 | claude-3-7-sonnet-20250219-thinking-32k | Anthropic | 54.3 | 6.4K | - | - |
| 165 | qwen-plus-0125 | Alibaba | 54.1 | 842 | 1M | ¥0.83 / ¥2.07 |
| 166 | minimax-m1 | Minimax | 53.8 | 5.7K | 1M | ¥0.95 / ¥9.03 |
| 167 | glm-4.5v | Zai | 53.5 | 895 | 64K | ¥4.32 / ¥13 |
| 168 | nvidia-llama-3.3-nemotron-super-49b-v1.5 | Nvidia | 53.2 | 565 | 131K | ¥2.88 / ¥2.88 |
| 169 | gemma-3-12b-it | Google | 52.9 | 471 | 128K | ¥1.96 / ¥1.96 |
| 170 | qwen3-32b | Alibaba | 52.7 | 590 | 131K | ¥2.07 / ¥8.26 |
| 171 | nvidia-nemotron-3-nano-30b-a3b-bf16 | Nvidia | 52.4 | 2.6K | 131K | ¥0 / ¥0 |
| 172 | ling-flash-2.0 | Ant Group | 52.1 | 1.2K | 131K | ¥1.01 / ¥4.1 |
| 173 | claude-3-5-sonnet-20241022 | Anthropic | 51.8 | 16K | 200K | ¥21.6 / ¥108 |
| 174 | trinity-large-thinking | - | 51.5 | 4.3K | 262K | ¥1.8 / ¥6.48 |
| 175 | glm-4-plus-0111 | Zai | 51.3 | 851 | 128K | ¥72 / ¥72 |
| 176 | o3-mini-high | OpenAI | 51.0 | 2.5K | 200K | ¥7.92 / ¥31.7 |
| 177 | hunyuan-turbos-20250226 | Tencent | 50.7 | 293 | 131K | ¥0 / ¥0 |
| 178 | llama-3.1-nemotron-ultra-253b-v1 | Nvidia | 50.4 | 374 | 128K | ¥4.32 / ¥13 |
| 179 | llama-3.3-nemotron-49b-super-v1 | Nvidia | 50.1 | 301 | 131K | ¥0 / ¥0 |
| 180 | qwq-32b | Alibaba | 49.9 | 3.8K | 131K | ¥2.07 / ¥6.2 |
| 181 | o1-mini | OpenAI | 49.6 | 9.3K | 128K | ¥7.92 / ¥31.7 |
| 182 | o3-mini | OpenAI | 49.3 | 9.5K | 200K | ¥7.92 / ¥31.7 |
| 183 | gpt-5-nano-high | OpenAI | 49.0 | 1.5K | 400K | ¥0.36 / ¥2.88 |
| 184 | yi-lightning | - | 48.7 | 4.7K | 12K | ¥1.44 / ¥1.44 |
| 185 | olmo-3.1-32b-instruct | Allenai | 48.5 | 2.2K | 200K | ¥14.4 / ¥57.6 |
| 186 | qwen3-30b-a3b | Alibaba | 48.2 | 4.5K | 128K | ¥0.79 / ¥7.78 |
| 187 | gemini-2.0-flash-lite-preview-02-05 | Google | 47.9 | 3.6K | 1.05M | ¥0.54 / ¥2.16 |
| 188 | hunyuan-turbo-0110 | Tencent | 47.6 | 323 | - | - |
| 189 | gpt-4o-2024-05-13 | OpenAI | 47.3 | 20.4K | 128K | ¥36 / ¥108 |
| 190 | qwen2.5-plus-1127 | Alibaba | 47.1 | 1.8K | - | - |
| 191 | claude-3-5-sonnet-20240620 | Anthropic | 46.8 | 15K | 200K | ¥21.6 / ¥108 |
| 192 | llama-3.1-405b-instruct-bf16 | Meta | 46.5 | 6.8K | 128K | ¥0 / ¥0 |
| 193 | deepseek-v2.5-1210 | DeepSeek | 46.2 | 1.2K | 1M | ¥1.01 / ¥2.02 |
| 194 | gemini-1.5-pro-002 | Google | 45.9 | 10.1K | - | - |
| 195 | step-2-16k-exp-202412 | Stepfun | 45.7 | 805 | 16.4K | ¥37.5 / ¥118 |
| 196 | ring-flash-2.0 | Ant Group | 45.4 | 1.2K | 131K | ¥1.01 / ¥4.1 |
| 197 | olmo-3-32b-think | Allenai | 45.1 | 816 | 128K | ¥2.16 / ¥3.24 |
| 198 | grok-2-2024-08-13 | xAI | 44.8 | 10.7K | 1M | ¥9 / ¥18 |
| 199 | glm-4-plus | Zai | 44.5 | 4.8K | 128K | ¥54 / ¥54 |
| 200 | llama-4-maverick-17b-128e-instruct | Meta | 44.3 | 6.8K | 1M | ¥1.8 / ¥6.26 |
| 201 | llama-3.1-405b-instruct-fp8 | Meta | 44.0 | 11.1K | 128K | ¥0 / ¥0 |
| 202 | athene-v2-chat | - | 43.7 | 4.3K | - | - |
| 203 | gpt-4o-mini-2024-07-18 | OpenAI | 43.4 | 12.3K | 128K | ¥1.08 / ¥4.32 |
| 204 | hunyuan-large-2025-02-10 | Tencent | 43.1 | 510 | - | - |
| 205 | mercury | Inception Ai | 42.9 | 337 | 128K | ¥1.8 / ¥5.4 |
| 206 | granite-4.1-8b | IBM | 42.6 | 581 | 131K | ¥0.36 / ¥0.72 |
| 207 | llama-4-scout-17b-16e-instruct | Meta | 42.3 | 5.3K | 128K | ¥1.44 / ¥5.62 |
| 208 | llama-3.3-70b-instruct | Meta | 42.0 | 9.1K | 128K | ¥0 / ¥0 |
| 209 | gpt-4o-2024-08-06 | OpenAI | 41.7 | 8.2K | 128K | ¥18 / ¥72 |
| 210 | gemma-3n-e4b-it | Google | 41.5 | 3.4K | 128K | ¥0 / ¥0 |
| 211 | qwen-max-0919 | Alibaba | 41.2 | 3.1K | 131K | ¥2.48 / ¥9.91 |
| 212 | gpt-4.1-nano-2025-04-14 | OpenAI | 40.9 | 1K | 1.05M | ¥14.4 / ¥57.6 |
| 213 | llama-3.1-nemotron-70b-instruct | Nvidia | 40.6 | 1.2K | 128K | ¥0 / ¥0 |
| 214 | claude-3-opus-20240229 | Anthropic | 40.3 | 31.1K | 200K | ¥108 / ¥540 |
| 215 | magistral-medium-2506 | Mistral | 40.1 | 1.7K | 128K | ¥14.4 / ¥36 |
| 216 | hunyuan-standard-2025-02-10 | Tencent | 39.8 | 570 | - | - |
| 217 | qwen2.5-72b-instruct | Alibaba | 39.5 | 7.3K | 131K | ¥4.13 / ¥12.4 |
| 218 | gpt-oss-20b | OpenAI | 39.2 | 1.8K | 131K | ¥0.32 / ¥1.3 |
| 219 | mistral-small-3.1-24b-instruct-2503 | Mistral | 38.9 | 5.6K | 262K | ¥2.88 / ¥14.4 |
| 220 | gemini-advanced-0514 | Google | 38.7 | 7.8K | - | - |
| 221 | gemini-1.5-pro-001 | Google | 38.4 | 13.7K | - | - |
| 222 | gpt-4-turbo-2024-04-09 | OpenAI | 38.1 | 15.3K | 128K | ¥72 / ¥216 |
| 223 | grok-2-mini-2024-08-13 | xAI | 37.8 | 9.4K | 1M | ¥9 / ¥18 |
| 224 | claude-3-5-haiku-20241022 | Anthropic | 37.5 | 11.5K | 200K | ¥5.76 / ¥28.8 |
| 225 | deepseek-v2.5 | DeepSeek | 37.3 | 4.5K | 1M | ¥1.01 / ¥2.02 |
| 226 | mistral-large-2411 | Mistral | 37.0 | 4.4K | 128K | ¥14.4 / ¥43.2 |
| 227 | mistral-large-2407 | Mistral | 36.7 | 8.3K | 131K | ¥14.4 / ¥43.2 |
| 228 | athene-70b-0725 | - | 36.4 | 3.3K | - | - |
| 229 | gpt-4-1106-preview | OpenAI | 36.1 | 14.9K | 8.19K | ¥216 / ¥432 |
| 230 | llama-3.1-70b-instruct | Meta | 35.9 | 10.4K | 131K | ¥2.88 / ¥2.88 |
| 231 | gemma-3-4b-it | Google | 35.6 | 565 | 128K | ¥1.44 / ¥1.44 |
| 232 | hunyuan-large-vision | Tencent | 35.3 | 964 | - | - |
| 233 | gemini-1.5-flash-002 | Google | 35.0 | 6.3K | 2M | ¥0.54 / ¥2.2 |
| 234 | llama-3.1-tulu-3-70b | Allenai | 34.7 | 537 | - | - |
| 235 | olmo-3.1-32b-think | Allenai | 34.5 | 1.2K | 200K | ¥14.4 / ¥57.6 |
| 236 | gpt-4-0125-preview | OpenAI | 34.2 | 14.4K | 8.19K | ¥216 / ¥432 |
| 237 | ibm-granite-h-small | IBM | 33.9 | 974 | - | - |
| 238 | amazon-nova-pro-v1.0 | Amazon | 33.6 | 4.1K | 300K | ¥5.76 / ¥23 |
| 239 | gemini-1.5-flash-001 | Google | 33.3 | 11.1K | 2M | ¥0.54 / ¥2.2 |
| 240 | claude-3-sonnet-20240229 | Anthropic | 33.1 | 17.1K | 200K | ¥21.6 / ¥108 |
| 241 | llama-3.1-nemotron-51b-instruct | Nvidia | 32.8 | 781 | 128K | ¥0 / ¥0 |
| 242 | reka-core-20240904 | - | 32.5 | 1.3K | - | - |
| 243 | gemma-2-27b-it | Google | 32.2 | 13.6K | 8.19K | ¥0.58 / ¥0.58 |
| 244 | llama-3-70b-instruct | Meta | 31.9 | 21.4K | 8.19K | ¥3.67 / ¥5.33 |
| 245 | qwen2.5-coder-32b-instruct | Alibaba | 31.7 | 1K | 131K | ¥2.07 / ¥6.2 |
| 246 | olmo-2-0325-32b-instruct | Allenai | 31.4 | 363 | - | - |
| 247 | jamba-1.5-large | - | 31.1 | 1.6K | 256K | ¥0 / ¥0 |
| 248 | gemma-2-9b-it-simpo | - | 30.8 | 1.7K | 8.19K | ¥1.44 / ¥1.44 |
| 249 | glm-4-0520 | Zai | 30.5 | 1.6K | 128K | ¥108 / ¥108 |
| 250 | mistral-small-24b-instruct-2501 | Mistral | 30.3 | 2.2K | 262K | ¥2.88 / ¥14.4 |
| 251 | nemotron-4-340b-instruct | Nvidia | 30.0 | 3.1K | - | - |
| 252 | command-r-plus-08-2024 | Cohere | 29.7 | 1.8K | 128K | ¥18 / ¥72 |
| 253 | gpt-4-0314 | OpenAI | 29.4 | 8K | 8.19K | ¥216 / ¥432 |
| 254 | phi-4 | Microsoft | 29.1 | 3.5K | 128K | ¥0.9 / ¥3.6 |
| 255 | amazon-nova-lite-v1.0 | Amazon | 28.9 | 3.2K | 300K | ¥0.43 / ¥1.73 |
| 256 | qwen2-72b-instruct | Alibaba | 28.6 | 6.7K | 131K | ¥4.13 / ¥12.4 |
| 257 | gemma-2-9b-it | Google | 28.3 | 9.7K | 8.19K | ¥1.44 / ¥1.44 |
| 258 | command-r-plus | Cohere | 28.0 | 12.1K | 128K | ¥18 / ¥72 |
| 259 | c4ai-aya-expanse-32b | Cohere | 27.7 | 5.2K | - | - |
| 260 | claude-3-haiku-20240307 | Anthropic | 27.5 | 19.3K | 200K | ¥1.8 / ¥9 |
| 261 | reka-flash-20240904 | - | 27.2 | 1.3K | 65.5K | ¥0.72 / ¥1.44 |
| 262 | gemini-1.5-flash-8b-001 | Google | 26.9 | 6.7K | 2M | ¥0.54 / ¥2.2 |
| 263 | gpt-4-0613 | OpenAI | 26.6 | 13.2K | 8.19K | ¥216 / ¥432 |
| 264 | amazon-nova-micro-v1.0 | Amazon | 26.3 | 3.1K | 128K | ¥0.25 / ¥1.01 |
| 265 | deepseek-coder-v2 | DeepSeek | 26.1 | 2.7K | 1M | ¥1.01 / ¥2.02 |
| 266 | hunyuan-standard-256k | Tencent | 25.8 | 532 | - | - |
| 267 | mistral-large-2402 | Mistral | 25.5 | 9.5K | 262K | ¥2.88 / ¥14.4 |
| 268 | llama-3.1-8b-instruct | Meta | 25.2 | 8.9K | 131K | ¥0.79 / ¥0.79 |
| 269 | ministral-8b-2410 | Mistral | 24.9 | 916 | 128K | ¥0.72 / ¥0.72 |
| 270 | command-r-08-2024 | Cohere | 24.6 | 1.9K | 128K | ¥18 / ¥72 |
| 271 | qwen1.5-72b-chat | Alibaba | 24.4 | 5.5K | - | - |
| 272 | qwen1.5-110b-chat | Alibaba | 24.1 | 3.7K | - | - |
| 273 | c4ai-aya-expanse-8b | Cohere | 23.8 | 1.7K | - | - |
| 274 | jamba-1.5-mini | - | 23.5 | 1.6K | 256K | ¥0 / ¥0 |
| 275 | llama-3.1-tulu-3-8b | Allenai | 23.2 | 520 | - | - |
| 276 | yi-1.5-34b-chat | - | 23.0 | 3.4K | - | - |
| 277 | llama-3-8b-instruct | Meta | 22.7 | 14.8K | 8.19K | ¥0.29 / ¥0.29 |
| 278 | reka-flash-21b-20240226-online | - | 22.4 | 2K | - | - |
| 279 | command-r | Cohere | 22.1 | 8.3K | 128K | ¥18 / ¥72 |
| 280 | mistral-medium | Mistral | 21.8 | 4.6K | 262K | ¥2.88 / ¥14.4 |
| 281 | reka-flash-21b-20240226 | - | 21.6 | 3.4K | - | - |
| 282 | qwq-32b-preview | Alibaba | 21.3 | 487 | 131K | ¥2.07 / ¥6.2 |
| 283 | qwen1.5-32b-chat | Alibaba | 21.0 | 3.1K | - | - |
| 284 | gemini-pro | Google | 20.7 | 641 | 1.05M | ¥14.4 / ¥86.4 |
| 285 | internlm2_5-20b-chat | - | 20.4 | 1.8K | - | - |
| 286 | mixtral-8x22b-instruct-v0.1 | Mistral | 20.2 | 7.8K | 64K | ¥14.4 / ¥43.2 |
| 287 | gemini-pro-dev-api | Google | 19.9 | 2.5K | 1.05M | ¥14.4 / ¥86.4 |
| 288 | gemma-2-2b-it | Google | 19.6 | 7.9K | 128K | ¥0 / ¥0 |
| 289 | gpt-3.5-turbo-0125 | OpenAI | 19.3 | 10.3K | 16.4K | ¥3.6 / ¥10.8 |
| 290 | mixtral-8x7b-instruct-v0.1 | Mistral | 19.0 | 10.9K | 32K | ¥5.04 / ¥5.04 |
| 291 | yi-34b-chat | - | 18.8 | 1.9K | - | - |
| 292 | starling-lm-7b-beta | - | 18.5 | 2.2K | 200K | ¥5.4 / ¥18.7 |
| 293 | dbrx-instruct-preview | - | 18.2 | 4.9K | - | - |
| 294 | qwen1.5-14b-chat | Alibaba | 17.9 | 2.6K | - | - |
| 295 | wizardlm-70b | Microsoft | 17.6 | 1.2K | - | - |
| 296 | granite-3.1-8b-instruct | IBM | 17.4 | 551 | - | - |
| 297 | llama-3.2-3b-instruct | Meta | 17.1 | 1.3K | 131K | ¥0.22 / ¥0.35 |
| 298 | zephyr-orpo-141b-A35b-v0.1 | - | 16.8 | 683 | 200K | ¥108 / ¥432 |
| 299 | granite-3.1-2b-instruct | IBM | 16.5 | 579 | - | - |
| 300 | openchat-3.5-0106 | - | 16.2 | 1.8K | - | - |
| 301 | llama-2-70b-chat | Meta | 16.0 | 5.5K | - | - |
| 302 | phi-3-medium-4k-instruct | Microsoft | 15.7 | 3.8K | 4.1K | ¥1.22 / ¥4.9 |
| 303 | tulu-2-dpo-70b | - | 15.4 | 827 | - | - |
| 304 | openchat-3.5 | - | 15.1 | 1.2K | - | - |
| 305 | starling-lm-7b-alpha | - | 14.8 | 1.3K | 200K | ¥5.4 / ¥18.7 |
| 306 | deepseek-llm-67b-chat | DeepSeek | 14.6 | 620 | 1M | ¥1.01 / ¥2.02 |
| 307 | nous-hermes-2-mixtral-8x7b-dpo | - | 14.3 | 467 | 1M | ¥36 / ¥180 |
| 308 | vicuna-33b | - | 14.0 | 3.1K | - | - |
| 309 | gpt-3.5-turbo-1106 | OpenAI | 13.7 | 2.4K | 16.4K | ¥7.2 / ¥14.4 |
| 310 | snowflake-arctic-instruct | - | 13.4 | 4K | - | - |
| 311 | phi-3-small-8k-instruct | Microsoft | 13.2 | 3.1K | 8.19K | ¥1.08 / ¥4.32 |
| 312 | openhermes-2.5-mistral-7b | - | 12.9 | 722 | 1M | ¥36 / ¥180 |
| 313 | granite-3.0-8b-instruct | IBM | 12.6 | 1.1K | - | - |
| 314 | mistral-7b-instruct-v0.2 | Mistral | 12.3 | 2.6K | 262K | ¥2.88 / ¥14.4 |
| 315 | qwen1.5-7b-chat | Alibaba | 12.0 | 589 | - | - |
| 316 | llama2-70b-steerlm-chat | Nvidia | 11.8 | 451 | - | - |
| 317 | mpt-30b-chat | - | 11.5 | 341 | - | - |
| 318 | granite-3.0-2b-instruct | IBM | 11.2 | 1.2K | - | - |
| 319 | llama-2-13b-chat | Meta | 10.9 | 2.6K | - | - |
| 320 | phi-3-mini-4k-instruct-june-2024 | Microsoft | 10.6 | 1.8K | 4.1K | ¥0.94 / ¥3.74 |
| 321 | wizardlm-13b | Microsoft | 10.4 | 983 | - | - |
| 322 | solar-10.7b-instruct-v1.0 | - | 10.1 | 601 | 128K | ¥0 / ¥0 |
| 323 | gemma-1.1-7b-it | Google | 9.8 | 3.6K | - | - |
| 324 | zephyr-7b-beta | - | 9.5 | 1.6K | - | - |
| 325 | vicuna-13b | - | 9.2 | 2.6K | - | - |
| 326 | llama-3.2-1b-instruct | Meta | 9.0 | 1.2K | 16.4K | ¥0.07 / ¥0.08 |
| 327 | llama-2-7b-chat | Meta | 8.7 | 1.9K | 128K | ¥4.03 / ¥48 |
| 328 | qwen-14b-chat | Alibaba | 8.4 | 753 | 32.8K | ¥1.04 / ¥3.1 |
| 329 | dolphin-2.2.1-mistral-7b | - | 8.1 | 208 | 262K | ¥2.88 / ¥14.4 |
| 330 | phi-3-mini-4k-instruct | Microsoft | 7.8 | 2.9K | 4.1K | ¥0.94 / ¥3.74 |
| 331 | zephyr-7b-alpha | - | 7.6 | 283 | - | - |
| 332 | codellama-34b-instruct | Meta | 7.3 | 1.2K | - | - |
| 333 | falcon-180b-chat | - | 7.0 | 207 | - | - |
| 334 | mistral-7b-instruct | Mistral | 6.7 | 1.3K | 262K | ¥2.88 / ¥14.4 |
| 335 | stripedhyena-nous-7b | - | 6.4 | 738 | - | - |
| 336 | olmo-7b-instruct | Allenai | 6.2 | 537 | - | - |
| 337 | guanaco-33b | - | 5.9 | 331 | 200K | ¥14.4 / ¥57.6 |
| 338 | palm-2 | Google | 5.6 | 1.2K | - | - |
| 339 | smollm2-1.7b-instruct | - | 5.3 | 339 | - | - |
| 340 | phi-3-mini-128k-instruct | Microsoft | 5.0 | 2.5K | 128K | ¥0.94 / ¥3.74 |
| 341 | vicuna-7b | - | 4.8 | 935 | - | - |
| 342 | qwen1.5-4b-chat | Alibaba | 4.5 | 1.1K | - | - |
| 343 | gemma-7b-it | Google | 4.2 | 1.3K | - | - |
| 344 | chatglm3-6b | - | 3.9 | 637 | 200K | ¥5.4 / ¥18.7 |
| 345 | gemma-1.1-2b-it | Google | 3.6 | 1.5K | - | - |
| 346 | gemma-2b-it | Google | 3.4 | 768 | - | - |
| 347 | gpt4all-13b-snoozy | - | 3.1 | 253 | 1M | ¥36 / ¥216 |
| 348 | koala-13b | - | 2.8 | 792 | - | - |
| 349 | chatglm2-6b | - | 2.5 | 369 | 200K | ¥5.4 / ¥18.7 |
| 350 | mpt-7b-chat | - | 2.2 | 466 | - | - |
| 351 | RWKV-4-Raven-14B | - | 2.0 | 556 | - | - |
| 352 | alpaca-13b | - | 1.7 | 653 | - | - |
| 353 | oasst-pythia-12b | - | 1.4 | 683 | - | - |
| 354 | chatglm-6b | - | 1.1 | 492 | 200K | ¥5.4 / ¥18.7 |
| 355 | fastchat-t5-3b | - | 0.8 | 387 | - | - |
| 356 | stablelm-tuned-alpha-7b | - | 0.6 | 354 | - | - |
| 357 | llama-13b | Meta | 0.3 | 247 | - | - |
| 358 | dolly-v2-12b | - | 0.0 | 356 | - | - |

Top model analysis

Why claude-opus-4-6 ranks first

claude-opus-4-6 ranks first with a score of 100.0 on the leaderboard's 0 to 100 scale, based on 6.4K samples. Treat it as the first candidate for this leaderboard, then compare price, context length, and availability.

How to choose

Do not look only at rank #1

Start with the leaderboard closest to your task. Compare the top models by score and sample size, then check price, context length, open or closed access, and provider availability.
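The selection steps above can be sketched as a small filter. This is only an illustration: the `Entry` type, the `shortlist` function, and the three thresholds are hypothetical, and the context values assume that "1M" on the page means 1,000,000. Prices are the ¥ figures as listed; the page does not state the billing unit. The rows are copied from the leaderboard above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    model: str
    score: float                # 0-100 score shown on the leaderboard
    samples: int                # sample count, e.g. "6.4K" -> 6400
    context: Optional[int]      # context length; None where the page shows "-"
    price_out: Optional[float]  # output price in ¥ as listed; None for "-"


# A few rows copied from the leaderboard (hypothetical in-memory form).
ENTRIES = [
    Entry("claude-opus-4-6", 100.0, 6400, 1_000_000, 180.0),
    Entry("gemini-3.5-flash", 98.9, 1600, 1_050_000, 64.8),
    Entry("gemini-3.1-pro-preview", 98.6, 7500, 1_050_000, 86.4),
    Entry("qwen3.7-max-preview", 97.8, 660, 1_000_000, 54.0),
    Entry("muse-spark", 97.5, 1900, None, None),
    Entry("deepseek-v4-pro", 94.4, 3100, 1_000_000, 6.26),
]


def shortlist(entries: List[Entry], min_samples: int,
              max_price_out: float, min_context: int) -> List[Entry]:
    """Keep entries that meet every threshold, highest score first."""
    kept = [
        e for e in entries
        if e.samples >= min_samples
        and e.price_out is not None and e.price_out <= max_price_out
        and e.context is not None and e.context >= min_context
    ]
    return sorted(kept, key=lambda e: e.score, reverse=True)


# Assumed thresholds: at least 1000 samples, output price up to ¥100,
# and a context length of at least 200,000.
picks = shortlist(ENTRIES, min_samples=1000, max_price_out=100.0,
                  min_context=200_000)
print([e.model for e in picks])
```

With these assumed thresholds the rank #1 model drops out on price and qwen3.7-max-preview drops out on sample size, which is the point of not looking only at the top row.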

FAQ

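Which metrics matter on the Multi-Turn leaderboard?

Mainly rank, the score on a 100-point scale, sample size, and source. The score gives a quick comparison of models within the same leaderboard; the sample size indicates how stable the result is.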
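To compare sample sizes numerically, the abbreviated counts shown on the page ("6.4K", "660") first need converting to integers. A minimal sketch, assuming "K" means thousands and "M" means millions; `parse_count`, `enough_samples`, and the 1000-sample cutoff are hypothetical and not part of the leaderboard itself:

```python
from typing import Optional

# Assumed meaning of the suffixes used on the page.
_SUFFIXES = {"K": 1_000, "M": 1_000_000}


def parse_count(text: str) -> Optional[int]:
    """Convert an abbreviated count such as "6.4K" into an integer."""
    text = text.strip()
    if not text or text == "-":
        return None  # the page shows "-" for missing values
    multiplier = _SUFFIXES.get(text[-1].upper(), 1)
    number = text[:-1] if multiplier != 1 else text
    return round(float(number) * multiplier)


def enough_samples(samples: str, minimum: int = 1000) -> bool:
    """Return True when the sample count reaches an assumed cutoff."""
    count = parse_count(samples)
    return count is not None and count >= minimum


print(parse_count("6.4K"), enough_samples("6.4K"))
print(parse_count("660"), enough_samples("660"))
```

The cutoff is a judgment call: a model with 660 samples can still rank in the top ten, but its score rests on far fewer comparisons than one with several thousand.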

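Why can't different leaderboards be merged directly into one overall score?

Different leaderboards use different tasks, samples, and evaluation criteria. By default, this site (模力榜) ranks models only within a single leaderboard, to avoid forcing abilities such as writing, coding, and image work into one combined number.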

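How should I choose a Multi-Turn model?

Start with the leaderboard closest to your task, then factor in price, context length, open versus closed source, and vendor availability. A high rank does not mean a model suits every budget or deployment setup.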

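How often is the leaderboard updated?

The page shows the most recent successfully collected public leaderboard data. It currently prefers the LMArena leaderboard dataset and keeps the original link in the page's source section.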