| # | Model | Overall | Expert | Hard Prompts | Coding |
|---|-------|---------|--------|--------------|--------|
| 1 | gemini-3-pro | 1 | 4 | 1 | 4 |
| 2 | grok-4.1-thinking | 2 | 8 | 5 | 8 |
| 3 | gemini-3-flash | 3 | 7 | 4 | 10 |
| 4 | claude-opus-4-5-20251101-thinking-… | 4 | 2 | 2 | 1 |
| 5 | claude-opus-4-5-20251101 | 5 | 1 | 3 | 3 |
| 6 | grok-4.1 | 6 | 18 | 11 | 13 |
| 7 | gemini-3-flash (thinking-minimal) | 7 | 12 | 7 | 9 |
| 8 | gpt-5.1-high | 8 | 11 | 10 | 14 |
| 9 | ernie-5.0-0110 | 9 | 20 | 14 | 17 |
| 10 | claude-sonnet-4-5-20250929-thinkin… | 10 | 5 | 6 | 2 |
| 11 | claude-sonnet-4-5-20250929 | 11 | 9 | 9 | 7 |
| 12 | kimi-k2.5-thinking | 12 | 3 | 13 | 5 |
| 13 | gemini-2.5-pro | 13 | 16 | 19 | 35 |
| 14 | ernie-5.0-preview-1203 | 14 | 24 | 16 | 26 |
| 15 | claude-opus-4-1-20250805-thinking-… | 15 | 10 | 8 | 6 |
| 16 | claude-opus-4-1-20250805 | 16 | 15 | 12 | 11 |
| 17 | gpt-4.5-preview-2025-02-27 | 17 | 45 | 36 | 44 |
| 18 | chatgpt-4o-latest-20250326 | 18 | 49 | 21 | 34 |
| 19 | glm-4.7 | 19 | 32 | 17 | 18 |
| 20 | gpt-5.2 | 20 | 22 | 15 | 15 |
| 21 | gpt-5.2-high | 21 | 6 | 18 | 16 |
| 22 | gpt-5.1 | 22 | 17 | 23 | 23 |
| 23 | gpt-5-high | 23 | 19 | 27 | 33 |
| 24 | qwen3-max-preview | 24 | 13 | 20 | 20 |
| 25 | o3-2025-04-16 | 25 | 27 | 37 | 46 |
| 26 | grok-4-1-fast-reasoning | 26 | 41 | 35 | 48 |
| 27 | kimi-k2-thinking-turbo | 27 | 21 | 24 | 19 |
| 28 | gpt-5-chat | 28 | 26 | 25 | 39 |
| 29 | qwen3-max-2025-09-23 | 29 | 51 | 28 | 22 |
| 30 | glm-4.6 | 30 | 38 | 34 | 42 |
| 31 | claude-opus-4-20250514-thinking-16k | 31 | 25 | 22 | 12 |
| 32 | deepseek-v3.2-exp | 32 | 47 | 29 | 36 |
| 33 | deepseek-v3.2-exp-thinking | 33 | 36 | 30 | 25 |
| 34 | qwen3-235b-a22b-instruct-2507 | 34 | 23 | 26 | 28 |
| 35 | grok-4-fast-chat | 35 | 46 | 44 | 47 |