Image · Image Edit · Multi-Image Edit Leaderboard

Ranking for Image Edit / Multi-Image Edit, based on public preference data.

Selection guide

Multi-Image Edit model ranking guide

Models: 36
Published: 2026/05/12
Arena public preference evaluation
Original leaderboard: Image Edit / Multi Image Edit
Leaderboard dataset: LMArena latest parquet
Rank | Model | Provider | Score | Samples | Context | Price (input / output)
1 | gpt-image-2 (medium) | OpenAI | 100.0 | 28K | - | ¥36.4 / ¥233
2 | gemini-3.1-flash-image-preview (nano-banana-2) [web-search] | Google | 97.1 | 66.4K | 131K | ¥3.6 / ¥21.6
3 | gemini-3-pro-image-preview (nano-banana-pro) | Google | 94.3 | 285.3K | 1.05M | ¥14.4 / ¥86.4
4 | gemini-3-pro-image-preview-2k (nano-banana-pro) | Google | 91.4 | 141K | 1.05M | ¥14.4 / ¥86.4
5 | chatgpt-image-latest-high-fidelity (20251216) | OpenAI | 88.6 | 141.4K | 1M | ¥36 / ¥216
6 | gpt-image-1.5-high-fidelity | OpenAI | 85.7 | 122.4K | 1M | ¥36 / ¥216
7 | uni-1.1-max | Luma AI | 82.9 | 7.9K | - | -
8 | seedream-4.5 | ByteDance | 80.0 | 317.8K | - | -
9 | uni-1.1 | Luma AI | 77.1 | 8.1K | - | -
10 | wan2.7-image-pro | Alibaba | 74.3 | 16.4K | 8.19K | ¥0 / ¥0
11 | wan2.7-image | Alibaba | 71.4 | 17.5K | 8.19K | ¥0 / ¥0
12 | seedream-5.0-lite | ByteDance | 68.6 | 109.1K | - | -
13 | reve-v1.1 | Reve | 65.7 | 230.3K | - | -
14 | kling-image-o1 | Kling | 62.9 | 31.4K | 200K | ¥108 / ¥432
15 | flux-2-max | BFL | 60.0 | 120.3K | - | -
16 | flux-2-pro | BFL | 57.1 | 120.6K | - | -
17 | gemini-2.5-flash-image-preview (nano-banana) | Google | 54.3 | 4421.2K | 1.05M | ¥2.16 / ¥18
18 | flux-2-flex | BFL | 51.4 | 109.1K | - | -
19 | reve-v1 | Reve | 48.6 | 532.3K | - | -
20 | wan2.6-image | Alibaba | 45.7 | 115.4K | - | -
21 | flux-2-klein-9b | BFL | 42.9 | 126.1K | - | -
22 | seedream-4-high-res-fal | ByteDance | 40.0 | 993.1K | - | -
23 | flux-2-dev | BFL | 37.1 | 70.7K | - | -
24 | seedream-4-2k | ByteDance | 34.3 | 30.5K | - | -
25 | qwen-image-edit-2511 | Alibaba | 31.4 | 141K | 8.19K | ¥3.6 / ¥14.4
26 | flux-2-klein-4b | BFL | 28.6 | 126.1K | - | -
27 | wan2.5-i2i-preview | Alibaba | 25.7 | 58.6K | - | -
28 | seedream-4-fal | ByteDance | 22.9 | 310.1K | - | -
29 | p-image-edit | - | 20.0 | 109.9K | - | ¥0 / ¥0
30 | gpt-image-1 | OpenAI | 17.1 | 1830K | - | ¥36 / ¥288
31 | gpt-image-1-mini | OpenAI | 14.3 | 602.4K | 1M | ¥36 / ¥216
32 | flux-1-kontext-pro | BFL | 11.4 | 142.2K | - | -
33 | flux-1-kontext-max | BFL | 8.6 | 52.1K | - | -
34 | gemini-2.0-flash-preview-image-generation | Google | 5.7 | 3175.4K | 1.05M | ¥1.08 / ¥4.32
35 | seededit-3.0 | ByteDance | 2.9 | 87.8K | - | -
36 | flux-1-kontext-dev | BFL | 0.0 | 130.5K | - | -

Top model analysis

Why gpt-image-2 (medium) ranks first

gpt-image-2 (medium) ranks first with a percent score of 100.0 based on 28K samples, the smallest sample count among the top six models. Treat it as the default candidate for this leaderboard, then compare price, context length, and availability.
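
The percent scores listed above are evenly spaced, which is consistent with a linear rescaling of rank onto a 0 to 100 range. The page does not document its formula, so the sketch below is only an observation that happens to reproduce all 36 published values; the function name `rank_to_percent` is hypothetical.

```python
def rank_to_percent(rank: int, n_models: int) -> float:
    """Map rank 1..n_models linearly onto 100..0, rounded to one decimal.

    Assumed formula (not documented by the page): (N - rank) / (N - 1) * 100.
    """
    if n_models < 2:
        return 100.0
    return round((n_models - rank) / (n_models - 1) * 100, 1)


# Published scores for ranks 1..36, copied from the leaderboard above.
PUBLISHED_SCORES = [
    100.0, 97.1, 94.3, 91.4, 88.6, 85.7, 82.9, 80.0, 77.1, 74.3, 71.4, 68.6,
    65.7, 62.9, 60.0, 57.1, 54.3, 51.4, 48.6, 45.7, 42.9, 40.0, 37.1, 34.3,
    31.4, 28.6, 25.7, 22.9, 20.0, 17.1, 14.3, 11.4, 8.6, 5.7, 2.9, 0.0,
]

# Ranks whose published score differs from the assumed formula.
mismatches = [
    rank
    for rank, score in enumerate(PUBLISHED_SCORES, start=1)
    if rank_to_percent(rank, len(PUBLISHED_SCORES)) != score
]
print("ranks that do not fit the assumed formula:", mismatches)
```

If this reading is right, the score carries the same information as the rank: a gap of about 2.9 points is one rank position, not a measured difference in preference.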

How to choose

Do not look only at rank #1

Start with the leaderboard closest to your task. Compare the top models by score and sample size, then check price, context length, open or closed access, and provider availability.
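
The selection steps above can be sketched as a small filter. The rows are copied from this leaderboard; the `Entry` type, the `shortlist` helper, and both thresholds are hypothetical and only illustrate the idea. Models without a listed price are skipped here, which is one possible policy rather than a recommendation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    rank: int
    model: str
    score: float
    samples_k: float               # sample count in thousands
    output_price: Optional[float]  # output price in ¥ as listed; None if not listed


# A few rows copied from the leaderboard above.
ENTRIES = [
    Entry(1, "gpt-image-2 (medium)", 100.0, 28.0, 233.0),
    Entry(2, "gemini-3.1-flash-image-preview (nano-banana-2) [web-search]", 97.1, 66.4, 21.6),
    Entry(3, "gemini-3-pro-image-preview (nano-banana-pro)", 94.3, 285.3, 86.4),
    Entry(4, "gemini-3-pro-image-preview-2k (nano-banana-pro)", 91.4, 141.0, 86.4),
    Entry(5, "chatgpt-image-latest-high-fidelity (20251216)", 88.6, 141.4, 216.0),
    Entry(7, "uni-1.1-max", 82.9, 7.9, None),
]


def shortlist(entries: List[Entry], min_samples_k: float,
              max_output_price: float) -> List[Entry]:
    """Keep entries with enough samples and a listed price within budget, best rank first."""
    kept = [
        e for e in entries
        if e.samples_k >= min_samples_k
        and e.output_price is not None
        and e.output_price <= max_output_price
    ]
    return sorted(kept, key=lambda e: e.rank)


# Hypothetical thresholds: at least 50K samples, output price at most ¥100.
picks = shortlist(ENTRIES, min_samples_k=50.0, max_output_price=100.0)
for e in picks:
    print(f"#{e.rank} {e.model}: score {e.score}, {e.samples_k}K samples, ¥{e.output_price} output")
```

With these example thresholds the top-ranked model drops out on both sample size and price, which is the point of the heading above.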

FAQ

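Which metrics matter on the Multi-Image Edit leaderboard?

Look mainly at rank, the percent score (0 to 100), sample size, and source. The score gives a quick comparison of models within the same leaderboard; the sample size indicates how stable the result is.

Why can't different leaderboards be combined into a single overall score?

Different leaderboards differ in tasks, samples, and evaluation criteria. By default, 模力榜 ranks models only within a single leaderboard, to avoid forcing abilities such as writing, coding, and image generation into one number.

How should I choose a multi-image edit model?

Start with the leaderboard closest to your task, then weigh price, context length, open or closed source, and provider availability. A high rank does not mean a model suits every budget or deployment method.

How often is the leaderboard updated?

The page shows the most recent successfully collected public leaderboard data. It currently prefers the LMArena leaderboard dataset and keeps the original link in the page's source section.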