Last updated: 2026-06-10

| Model | Overall | Expert | Hard Prompts | Coding | Math | Creative Writing | Instruction Following | Long Text |
|---|---|---|---|---|---|---|---|---|
| claude-opus-4-6-thinking | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 |
| claude-opus-4-7-thinking | 2 | 6 | 3 | 2 | 6 | 2 | 2 | 3 |
| claude-opus-4-6 | 3 | 2 | 2 | 5 | 4 | 6 | 3 | 2 |
| claude-opus-4-7 | 4 | 3 | 4 | 3 | 8 | 4 | 5 | 4 |
| muse-spark | 5 | 33 | 8 | 9 | 27 | 10 | 19 | 25 |
| gemini-3.1-pro-preview | 6 | 8 | 7 | 10 | 5 | 5 | 7 | 7 |
| gemini-3-pro | 7 | 16 | 9 | 17 | 19 | 3 | 13 | 12 |
| claude-opus-4-8-thinking | 8 | 18 | 5 | 4 | 16 | 7 | 4 | 5 |
| gpt-5.5-high | 9 | 4 | 12 | 14 | 9 | 19 | 8 | 14 |
| gpt-5.4-high | 10 | 5 | 11 | 15 | 3 | 26 | 10 | 16 |
| claude-opus-4-8 | 11 | 7 | 6 | 6 | 11 | 17 | 11 | 6 |
| gemini-3.5-flash | 12 | 9 | 22 | 34 | 1 | 9 | 23 | 21 |
| gpt-5.2-chat-latest-20260210 | 13 | 27 | 16 | 18 | 38 | 41 | 28 | 29 |
| glm-5.1 | 14 | 12 | 13 | 8 | 18 | 13 | 16 | 13 |
| grok-4.20-beta1 | 15 | 49 | 29 | 32 | 40 | 12 | 36 | 39 |
| gpt-5.5 | 16 | 15 | 19 | 40 | 10 | 21 | 15 | 23 |
| qwen3.7-max-preview | 17 | 11 | 17 | 11 | 7 | 31 | 18 | 10 |
| gemini-3-flash | 18 | 24 | 20 | 31 | 20 | 14 | 26 | 24 |
| grok-4.20-beta-0309-reasoning | 19 | 43 | 23 | 25 | 24 | 25 | 37 | 38 |
| claude-opus-4-5-20251101-thinking-32k | 20 | 13 | 14 | 7 | 23 | 8 | 6 | 9 |
| grok-4.20-multi-agent-beta-0309 | 21 | 32 | 30 | 19 | 41 | 27 | 40 | 41 |
| gpt-5.5-instant | 22 | 45 | 24 | 26 | 30 | 15 | 30 | 30 |
| claude-sonnet-4-6 | 23 | 14 | 10 | 12 | 31 | 20 | 9 | 8 |
| ernie-5.1 | 24 | 40 | 25 | 20 | 17 | 35 | 24 | 34 |
| claude-opus-4-5-20251101 | 25 | 17 | 15 | 13 | 29 | 11 | 12 | 11 |
| grok-4.1-thinking | 26 | 53 | 37 | 44 | 47 | 45 | 56 | 54 |
| gpt-5.4 | 27 | 22 | 26 | 27 | 34 | 38 | 22 | 22 |
| qwen3.5-max-preview | 28 | 21 | 21 | 24 | 26 | 24 | 17 | 20 |
| mimo-v2.5-pro | 29 | 10 | 18 | 21 | 14 | 34 | 14 | 15 |
| kimi-k2.6 | 30 | 20 | 27 | 29 | 13 | 39 | 27 | 26 |
| gemini-3-flash (thinking-minimal) | 31 | 55 | 39 | 51 | 35 | 22 | 42 | 40 |
| qwen3.6-max-preview | 32 | 23 | 32 | 33 | 15 | 37 | 35 | 31 |
| grok-4.1 | 33 | 68 | 43 | 53 | 66 | 40 | 57 | 52 |
| deepseek-v4-pro-thinking | 34 | 46 | 35 | 48 | 21 | 30 | 33 | 27 |
| glm-5 | 35 | 30 | 36 | 47 | 56 | 18 | 39 | 33 |
| deepseek-v4-pro | 36 | 44 | 40 | 39 | 59 | 33 | 32 | 32 |
| dola-seed-2.0-pro | 37 | 39 | 33 | 28 | 42 | 67 | 51 | 51 |
| claude-sonnet-4-5-20250929-thinking-32k | 38 | 19 | 28 | 16 | 37 | 23 | 20 | 17 |
| claude-sonnet-4-5-20250929 | 39 | 29 | 31 | 22 | 74 | 16 | 21 | 19 |
| gpt-5.1-high | 40 | 38 | 42 | 54 | 36 | 43 | 38 | 42 |
| gemma-4-31b | 41 | 37 | 44 | 45 | 25 | 50 | 31 | 36 |
| kimi-k2.5-thinking | 42 | 35 | 46 | 37 | 22 | 49 | 44 | 43 |
| minimax-m3 | 43 | 28 | 49 | 30 | 12 | 62 | 34 | 45 |
| ernie-5.0-preview-1203 | 44 | 72 | 52 | 77 | 104 | 51 | 68 | 78 |
| gpt-5.4-mini-high | 45 | 31 | 45 | 42 | 51 | 71 | 49 | 53 |
| mimo-v2-pro | 46 | 25 | 41 | 38 | 39 | 46 | 41 | 35 |
| claude-opus-4-1-20250805-thinking-16k | 47 | 36 | 34 | 23 | 48 | 28 | 25 | 18 |
| gpt-5.3-chat-latest | 48 | 52 | 47 | 46 | 77 | 65 | 53 | 47 |
| ernie-5.0-0110 | 49 | 81 | 51 | 52 | 60 | 47 | 64 | 73 |
| claude-opus-4-1-20250805 | 50 | 58 | 38 | 35 | 65 | 32 | 29 | 28 |
| grok-4.3 | 51 | 82 | 61 | 56 | 80 | 42 | 71 | 59 |
| gemini-2.5-pro | 52 | 60 | 62 | 86 | 50 | 29 | 45 | 44 |
| gpt-4.5-preview-2025-02-27 | 53 | 102 | 86 | 95 | 102 | 36 | 48 | 66 |
| qwen3.6-plus | 54 | 47 | 48 | 50 | 33 | 64 | 47 | 48 |
| qwen3.5-397b-a17b | 55 | 34 | 50 | 49 | 44 | 55 | 52 | 46 |
| chatgpt-4o-latest-20250326 | 56 | 101 | 65 | 80 | 110 | 48 | 60 | 72 |
| glm-4.7 | 57 | 79 | 53 | 60 | 70 | 63 | 59 | 49 |
| gpt-5.1 | 58 | 66 | 67 | 74 | 79 | 57 | 61 | 65 |
| gemma-4-26b-a4b | 59 | 48 | 55 | 64 | 28 | 69 | 46 | 55 |
| gpt-5.2-high | 60 | 42 | 58 | 55 | 32 | 92 | 62 | 79 |
| deepseek-v4-flash-thinking | 61 | 54 | 64 | 70 | 54 | 61 | 54 | 56 |
| gpt-5.2 | 62 | 56 | 57 | 61 | 63 | 86 | 65 | 69 |
| qwen3-max-preview | 63 | 51 | 59 | 63 | 57 | 80 | 63 | 64 |
| longcat-flash-chat-2602-exp | 64 | 57 | 60 | 41 | 62 | 89 | 80 | 77 |
| deepseek-v4-flash | 65 | 62 | 63 | 65 | 58 | 60 | 58 | 57 |
| gpt-5-high | 66 | 65 | 78 | 82 | 64 | 110 | 89 | 112 |
| gemini-3.1-flash-lite-preview | 67 | 77 | 81 | 100 | 55 | 54 | 87 | 86 |
| mimo-v2.5 | 68 | 41 | 54 | 57 | 52 | 74 | 55 | 50 |
| kimi-k2.5-instant | 69 | 67 | 56 | 36 | 49 | 87 | 50 | 61 |
| o3-2025-04-16 | 70 | 85 | 88 | 97 | 45 | 99 | 100 | 116 |
| grok-4-1-fast-reasoning | 71 | 87 | 84 | 91 | 86 | 53 | 104 | 95 |
| kimi-k2-thinking-turbo | 72 | 61 | 69 | 59 | 53 | 83 | 73 | 82 |
| amazon-nova-experimental-chat-26-02-10 | 73 | 26 | 70 | 58 | 71 | 145 | 72 | 88 |
| mimo-v2-omni | 74 | 63 | 68 | 62 | 43 | 98 | 81 | 60 |
| gpt-5-chat | 75 | 75 | 71 | 89 | 94 | 84 | 74 | 75 |
| glm-4.6 | 76 | 86 | 83 | 94 | 82 | 68 | 79 | 80 |
| mistral-medium-3.5 | 77 | 90 | 82 | 78 | 46 | 91 | 78 | 84 |
| deepseek-v3.2-exp-thinking | 78 | 76 | 77 | 72 | 72 | 78 | 76 | 85 |
| deepseek-v3.2 | 79 | 74 | 76 | 79 | 69 | 72 | 66 | 71 |
| claude-opus-4-20250514-thinking-16k | 80 | 80 | 66 | 43 | 83 | 44 | 43 | 37 |
| qwen3-max-2025-09-23 | 81 | 113 | 73 | 71 | 68 | 79 | 82 | 81 |
| qwen3-235b-a22b-instruct-2507 | 82 | 71 | 72 | 75 | 84 | 105 | 77 | 76 |
| deepseek-v3.2-exp | 83 | 105 | 74 | 84 | 87 | 56 | 75 | 68 |
| deepseek-v3.2-thinking | 84 | 69 | 80 | 69 | 76 | 85 | 69 | 67 |
| deepseek-r1-0528 | 85 | 111 | 94 | 87 | 124 | 81 | 114 | 118 |
| grok-4-fast-chat | 86 | 107 | 95 | 96 | 78 | 94 | 106 | 98 |
| ernie-5.0-preview-1022 | 87 | 92 | 105 | 132 | 118 | 59 | 105 | 100 |
| nvidia-nemotron-3-ultra-550b-a55b-nvfp4 | 88 | 64 | 98 | 108 | – | 95 | 108 | 90 |
| kimi-k2-0905-preview | 89 | 116 | 92 | 81 | 89 | 101 | 115 | 123 |
| deepseek-v3.1 | 90 | 103 | 97 | 112 | 91 | 90 | 97 | 99 |
| deepseek-v3.1-terminus-thinking | 91 | – | 79 | 90 | 100 | 96 | 67 | 63 |
| kimi-k2-0711-preview | 92 | 120 | 101 | 93 | 134 | 115 | 135 | 132 |
| deepseek-v3.1-thinking | 93 | 99 | 91 | 102 | 92 | 66 | 70 | 62 |
| qwen3.5-122b-a10b | 94 | 83 | 99 | 99 | 81 | 118 | 92 | 106 |
| amazon-nova-experimental-chat-26-01-10 | 95 | 50 | 75 | 66 | 101 | 131 | 98 | 96 |
| deepseek-v3.1-terminus | 96 | – | 109 | 123 | 126 | 58 | 112 | 102 |
| qwen3-vl-235b-a22b-instruct | 97 | 73 | 87 | 85 | 99 | 123 | 84 | 93 |
| mistral-large-3 | 98 | 112 | 100 | 83 | 115 | 109 | 96 | 107 |
| minimax-m2.7 | 99 | 84 | 89 | 68 | 96 | 126 | 90 | 83 |
| hunyuan-hy3-preview | 100 | 78 | 85 | 92 | 75 | 122 | 95 | 89 |

Data source: LMSYS Chatbot Arena (lmarena.ai). © Open-source research project by LMSYS Org.