更新时间:2026-06-25
| 模型 |
总分 |
专业能力 |
高难度提示词 |
编程 |
数学 |
创意写作 |
指令遵循 |
长文本 |
| claude-fable-5 |
1 |
3 |
2 |
1 |
1 |
1 |
1 |
2 |
| claude-opus-4-6-thinking |
2 |
1 |
1 |
3 |
3 |
2 |
2 |
1 |
| claude-opus-4-7-thinking |
3 |
6 |
4 |
2 |
5 |
3 |
3 |
4 |
| claude-opus-4-6 |
4 |
2 |
3 |
5 |
4 |
7 |
4 |
3 |
| claude-opus-4-7 |
5 |
4 |
5 |
4 |
7 |
5 |
6 |
5 |
| muse-spark |
6 |
30 |
11 |
13 |
30 |
11 |
24 |
29 |
| gemini-3.1-pro-preview |
7 |
10 |
8 |
14 |
10 |
6 |
9 |
8 |
| gemini-3-pro |
8 |
19 |
9 |
20 |
17 |
4 |
14 |
13 |
| claude-opus-4-8-thinking |
9 |
9 |
6 |
6 |
8 |
8 |
5 |
7 |
| gpt-5.5-high |
10 |
8 |
13 |
19 |
11 |
19 |
11 |
16 |
| gpt-5.4-high |
11 |
7 |
12 |
16 |
6 |
24 |
12 |
17 |
| claude-opus-4-8 |
12 |
5 |
7 |
7 |
9 |
16 |
7 |
6 |
| gemini-3.5-flash |
13 |
12 |
23 |
35 |
2 |
10 |
25 |
24 |
| gpt-5.2-chat-latest-20260210 |
14 |
26 |
18 |
21 |
38 |
41 |
28 |
31 |
| glm-5.1 |
15 |
16 |
15 |
9 |
21 |
12 |
18 |
14 |
| gpt-5.5 |
16 |
14 |
21 |
39 |
13 |
20 |
16 |
25 |
| qwen3.7-max-preview |
17 |
13 |
20 |
12 |
12 |
31 |
20 |
9 |
| grok-4.20-beta-0309-reasoning |
18 |
48 |
25 |
28 |
23 |
27 |
39 |
39 |
| grok-4.20-beta1 |
19 |
51 |
30 |
34 |
42 |
14 |
37 |
41 |
| gemini-3-flash |
20 |
25 |
22 |
33 |
20 |
15 |
27 |
26 |
| claude-opus-4-5-20251101-thinking-32k |
21 |
17 |
14 |
8 |
24 |
9 |
8 |
10 |
| gpt-5.5-instant |
22 |
45 |
26 |
26 |
29 |
17 |
31 |
30 |
| grok-4.20-multi-agent-beta-0309 |
23 |
34 |
32 |
27 |
41 |
22 |
45 |
44 |
| claude-sonnet-4-6 |
24 |
15 |
10 |
10 |
33 |
28 |
10 |
11 |
| glm-5.2 (max) |
25 |
29 |
17 |
11 |
– |
29 |
19 |
19 |
| claude-opus-4-5-20251101 |
26 |
18 |
16 |
15 |
28 |
13 |
13 |
12 |
| gpt-5.4 |
27 |
24 |
27 |
23 |
32 |
36 |
21 |
23 |
| ernie-5.1 |
28 |
35 |
28 |
22 |
16 |
40 |
32 |
38 |
| mimo-v2.5-pro |
29 |
11 |
19 |
17 |
14 |
39 |
15 |
15 |
| grok-4.1-thinking |
30 |
55 |
40 |
45 |
50 |
47 |
59 |
58 |
| qwen3.5-max-preview |
31 |
22 |
24 |
29 |
25 |
21 |
17 |
22 |
| qwen3.6-max-preview |
32 |
23 |
34 |
32 |
19 |
37 |
36 |
27 |
| gemini-3-flash (thinking-minimal) |
33 |
61 |
41 |
53 |
36 |
26 |
43 |
42 |
| kimi-k2.6 |
34 |
20 |
31 |
31 |
15 |
44 |
29 |
28 |
| grok-4.1 |
35 |
71 |
44 |
54 |
70 |
42 |
58 |
54 |
| deepseek-v4-pro-thinking |
36 |
47 |
38 |
51 |
18 |
34 |
35 |
33 |
| glm-5 |
37 |
33 |
37 |
46 |
55 |
25 |
40 |
34 |
| deepseek-v4-pro |
38 |
43 |
42 |
42 |
58 |
33 |
34 |
35 |
| claude-sonnet-4-5-20250929-thinking-32k |
39 |
21 |
29 |
18 |
35 |
23 |
22 |
18 |
| claude-sonnet-4-5-20250929 |
40 |
32 |
33 |
24 |
73 |
18 |
23 |
21 |
| dola-seed-2.0-pro |
41 |
42 |
36 |
30 |
43 |
68 |
54 |
52 |
| gpt-5.1-high |
42 |
41 |
45 |
55 |
37 |
46 |
38 |
43 |
| gemma-4-31b |
43 |
38 |
46 |
48 |
26 |
52 |
33 |
36 |
| kimi-k2.5-thinking |
44 |
37 |
48 |
41 |
22 |
48 |
46 |
45 |
| ernie-5.0-preview-1203 |
45 |
74 |
54 |
80 |
106 |
53 |
72 |
81 |
| claude-opus-4-1-20250805-thinking-16k |
46 |
36 |
35 |
25 |
48 |
30 |
26 |
20 |
| gpt-5.3-chat-latest |
47 |
53 |
47 |
47 |
79 |
65 |
53 |
47 |
| mimo-v2-pro |
48 |
28 |
43 |
40 |
40 |
49 |
42 |
37 |
| minimax-m3 |
49 |
31 |
51 |
37 |
31 |
70 |
41 |
49 |
| gpt-5.4-mini-high |
50 |
39 |
49 |
49 |
54 |
75 |
55 |
61 |
| claude-opus-4-1-20250805 |
51 |
59 |
39 |
36 |
64 |
35 |
30 |
32 |
| ernie-5.0-0110 |
52 |
82 |
53 |
56 |
59 |
50 |
65 |
73 |
| gemini-2.5-pro |
53 |
64 |
64 |
89 |
52 |
32 |
47 |
46 |
| gpt-4.5-preview-2025-02-27 |
54 |
104 |
88 |
98 |
103 |
38 |
49 |
68 |
| qwen3.6-plus |
55 |
49 |
50 |
52 |
39 |
64 |
50 |
50 |
| grok-4.3 |
56 |
83 |
66 |
61 |
87 |
43 |
82 |
59 |
| qwen3.5-397b-a17b |
57 |
40 |
52 |
50 |
46 |
56 |
51 |
48 |
| chatgpt-4o-latest-20250326 |
58 |
103 |
68 |
82 |
112 |
51 |
61 |
74 |
| glm-4.7 |
59 |
79 |
55 |
63 |
72 |
66 |
60 |
51 |
| gpt-5.1 |
60 |
68 |
70 |
77 |
81 |
59 |
63 |
66 |
| gemma-4-26b-a4b |
61 |
50 |
57 |
67 |
27 |
69 |
48 |
56 |
| gpt-5.2-high |
62 |
44 |
59 |
58 |
34 |
95 |
64 |
82 |
| deepseek-v4-flash-thinking |
63 |
57 |
67 |
72 |
60 |
61 |
56 |
57 |
| longcat-flash-chat-2602-exp |
64 |
58 |
60 |
43 |
61 |
90 |
85 |
77 |
| qwen3-max-preview |
65 |
54 |
61 |
65 |
56 |
83 |
66 |
67 |
| gpt-5.2 |
66 |
60 |
62 |
64 |
63 |
91 |
67 |
75 |
| deepseek-v4-flash |
67 |
67 |
65 |
66 |
77 |
60 |
62 |
62 |
| gpt-5-high |
68 |
66 |
81 |
85 |
62 |
111 |
90 |
114 |
| mimo-v2.5 |
69 |
46 |
56 |
57 |
44 |
80 |
57 |
53 |
| gemini-3.1-flash-lite-preview |
70 |
84 |
84 |
103 |
53 |
57 |
92 |
89 |
| mimo-v2-omni |
71 |
63 |
63 |
59 |
49 |
84 |
68 |
55 |
| kimi-k2.5-instant |
72 |
70 |
58 |
38 |
51 |
93 |
52 |
64 |
| o3-2025-04-16 |
73 |
87 |
90 |
101 |
45 |
100 |
101 |
119 |
| grok-4-1-fast-reasoning |
74 |
90 |
87 |
94 |
89 |
55 |
102 |
97 |
| kimi-k2-thinking-turbo |
75 |
65 |
71 |
62 |
57 |
85 |
75 |
83 |
| amazon-nova-experimental-chat-26-02-10 |
76 |
27 |
72 |
60 |
76 |
146 |
76 |
90 |
| gpt-5-chat |
77 |
78 |
73 |
93 |
97 |
89 |
77 |
78 |
| mistral-medium-3.5 |
78 |
88 |
83 |
71 |
65 |
87 |
70 |
86 |
| glm-4.6 |
79 |
89 |
86 |
99 |
83 |
71 |
83 |
84 |
| deepseek-v3.2 |
80 |
75 |
77 |
81 |
68 |
73 |
69 |
71 |
| deepseek-v3.2-exp-thinking |
81 |
77 |
79 |
74 |
75 |
79 |
79 |
87 |
| claude-opus-4-20250514-thinking-16k |
82 |
80 |
69 |
44 |
84 |
45 |
44 |
40 |
| qwen3-max-2025-09-23 |
83 |
115 |
75 |
75 |
67 |
81 |
81 |
85 |
| qwen3-235b-a22b-instruct-2507 |
84 |
73 |
74 |
78 |
85 |
105 |
80 |
80 |
| deepseek-v3.2-exp |
85 |
107 |
76 |
87 |
88 |
58 |
78 |
69 |
| deepseek-v3.2-thinking |
86 |
72 |
80 |
73 |
78 |
88 |
73 |
70 |
| deepseek-r1-0528 |
87 |
112 |
97 |
90 |
127 |
82 |
116 |
120 |
| grok-4-fast-chat |
88 |
109 |
98 |
102 |
82 |
98 |
107 |
100 |
| ernie-5.0-preview-1022 |
89 |
96 |
107 |
136 |
120 |
63 |
105 |
102 |
| kimi-k2-0905-preview |
90 |
117 |
94 |
83 |
91 |
102 |
117 |
124 |
| deepseek-v3.1-terminus-thinking |
91 |
– |
82 |
92 |
107 |
97 |
71 |
65 |
| kimi-k2-0711-preview |
92 |
121 |
103 |
95 |
136 |
116 |
136 |
134 |
| deepseek-v3.1 |
93 |
105 |
101 |
114 |
93 |
94 |
98 |
101 |
| qwen3.5-122b-a10b |
94 |
85 |
100 |
100 |
80 |
119 |
94 |
106 |
| deepseek-v3.1-thinking |
95 |
102 |
93 |
105 |
95 |
67 |
74 |
63 |
| minimax-m2.7 |
96 |
86 |
85 |
68 |
94 |
124 |
91 |
79 |
| nvidia-nemotron-3-ultra-550b-a55b-nvfp4 |
97 |
56 |
96 |
88 |
47 |
120 |
108 |
95 |
| deepseek-v3.1-terminus |
98 |
– |
111 |
125 |
128 |
62 |
114 |
104 |
| amazon-nova-experimental-chat-26-01-10 |
99 |
52 |
78 |
69 |
102 |
133 |
100 |
99 |
| mistral-large-3 |
100 |
110 |
102 |
84 |
116 |
109 |
97 |
107 |
数据来源:LMSYS Chatbot Arena (arena.ai) © Open-source research project by LMSYS Org.