Large Language Model Capability Rankings

Last updated: May 1, 2026

Model Overall Expert Hard-Prompts Coding Math Creative-Writing Instruction-Following Long-Text
claude-opus-4-7-thinking 1 2 2 1 6 2 2 3
claude-opus-4-6-thinking 2 1 1 3 3 1 1 1
claude-opus-4-6 3 3 3 4 5 6 3 4
claude-opus-4-7 4 4 4 2 7 5 4 2
gemini-3.1-pro-preview 5 7 5 7 4 3 5 5
muse-spark 6 30 6 5 26 10 14 27
gpt-5.5-high 7 5 7 9 1 20 7 6
gemini-3-pro 8 15 8 17 11 4 12 10
grok-4.20-beta1 9 40 16 26 28 8 27 33
gpt-5.4-high 10 6 9 8 2 16 8 12
grok-4.20-beta-0309-reasoning 11 28 12 16 23 17 26 32
gpt-5.2-chat-latest-20260210 12 25 14 12 27 31 22 24
ernie-5.1-preview 13 20 18 21 9 35 23 23
grok-4.20-multi-agent-beta-0309 14 14 22 15 21 19 34 25
gemini-3-flash 15 18 17 27 12 11 21 19
gpt-5.5 16 24 23 24 8 28 18 21
claude-opus-4-5-20251101-thinking-32k 17 10 10 6 17 7 6 7
glm-5.1 18 27 19 11 19 12 17 11
grok-4.1-thinking 19 41 31 37 44 34 46 46
claude-opus-4-5-20251101 20 11 13 14 25 9 9 9
gpt-5.4 21 16 20 18 33 33 13 18
mimo-v2.5-pro 22 8 15 23 16 24 10 13
deepseek-v4-pro 23 35 33 54 39 15 30 39
claude-sonnet-4-6 24 13 11 10 22 26 11 8
qwen3.5-max-preview 25 9 21 28 13 27 15 20
gemini-3-flash (thinking-minimal) 26 42 29 38 29 14 33 31
deepseek-v4-pro-thinking 27 21 27 35 10 29 29 16
kimi-k2.6 28 19 25 19 24 39 24 28
dola-seed-2.0-pro 29 38 24 20 38 61 43 42
grok-4.1 30 58 39 47 57 32 47 44
gpt-5.4-mini-high 31 22 30 30 52 48 36 37
glm-5 32 32 34 43 37 21 35 26
gpt-5.1-high 33 34 38 48 32 36 32 36
claude-sonnet-4-5-20250929-thinking-32k 34 17 26 13 30 23 16 15
claude-sonnet-4-5-20250929 35 26 28 25 69 13 19 17
gemma-4-31b 36 33 40 39 20 41 28 29
ernie-5.0-0110 37 60 42 44 40 38 45 48
gpt-5.3-chat-latest 38 46 37 36 65 56 48 38
kimi-k2.5-thinking 39 29 41 29 14 45 39 40
ernie-5.0-preview-1203 40 65 46 66 92 42 60 70
claude-opus-4-1-20250805-thinking-16k 41 31 32 22 41 22 20 14
mimo-v2-pro 42 12 36 32 34 43 31 30
gemini-2.5-pro 43 49 52 73 42 18 38 35
claude-opus-4-1-20250805 44 48 35 31 53 25 25 22
qwen3.5-397b-a17b 45 45 47 51 35 47 51 47
qwen3.6-plus 46 39 43 34 15 63 44 43
gpt-4.5-preview-2025-02-27 47 91 75 86 90 30 41 57
chatgpt-4o-latest-20250326 48 90 54 68 100 40 54 60
glm-4.7 49 71 48 52 61 54 52 41
deepseek-v4-flash-thinking 50 53 44 56 45 55 50 63
gemini-3.1-flash-lite-preview 51 68 64 83 46 44 76 65
gpt-5.2-high 52 36 49 41 31 83 57 66
gpt-5.1 53 54 57 64 68 51 53 54
gpt-5.2 54 51 45 45 49 76 58 56
gemma-4-26b-a4b 55 37 51 55 18 58 40 45
qwen3-max-preview 56 44 53 53 47 71 56 55
gpt-5-high 57 52 70 72 51 99 78 101
longcat-flash-chat-2602-exp 58 47 55 42 54 78 77 67
deepseek-v4-flash 59 62 58 59 55 65 49 61
kimi-k2.5-instant 60 56 50 33 43 79 42 50
grok-4-1-fast-reasoning 61 76 74 76 70 49 92 82
o3-2025-04-16 62 72 77 88 36 89 89 105
kimi-k2-thinking-turbo 63 55 59 50 48 72 61 69
amazon-nova-experimental-chat-26-02-10 64 23 61 49 56 135 63 77
gpt-5-chat 65 66 62 80 82 77 68 68
glm-4.6 66 75 73 84 71 59 69 73
deepseek-v3.2-exp-thinking 67 67 69 63 59 69 67 75
deepseek-v3.2 68 69 67 69 60 64 62 62
qwen3-max-2025-09-23 69 99 63 61 58 73 71 74
claude-opus-4-20250514-thinking-16k 70 70 56 40 74 37 37 34
mimo-v2.5 71 57 60 46 95 80 55 52
deepseek-v3.2-exp 72 96 66 74 76 50 66 58
qwen3-235b-a22b-instruct-2507 73 64 65 67 73 94 70 72
deepseek-v3.2-thinking 74 59 72 60 66 74 65 59
deepseek-r1-0528 75 97 82 78 113 70 103 107
grok-4-fast-chat 76 95 83 87 67 84 93 83
ernie-5.0-preview-1022 77 80 93 119 106 53 91 85
kimi-k2-0905-preview 78 102 80 71 77 92 105 112
deepseek-v3.1 79 93 84 101 79 82 85 84
qwen3.5-122b-a10b 80 74 90 96 64 97 83 88
deepseek-v3.1-terminus-thinking 81 71 81 91 86 59 53
kimi-k2-0711-preview 82 104 87 85 126 106 123 121
deepseek-v3.1-thinking 83 89 78 90 81 57 64 51
deepseek-v3.1-terminus 84 96 111 115 52 100 89
amazon-nova-experimental-chat-26-01-10 85 43 68 57 89 118 90 87
qwen3-vl-235b-a22b-instruct 86 61 76 75 84 112 72 80
mistral-large-3 87 100 86 70 103 98 84 93
gpt-4.1-2025-04-14 88 113 88 91 139 60 88 81
claude-opus-4-20250514 89 81 81 79 101 46 74 49
grok-3-preview-02-24 90 114 95 106 137 62 81 76
glm-4.5 91 78 85 93 80 101 82 91
gemini-2.5-flash 92 92 103 133 94 66 86 86
grok-4-0709 93 85 104 116 62 67 95 92
mistral-medium-2508 94 108 91 94 111 91 97 102
claude-haiku-4-5-20251001 95 63 79 58 120 85 75 71
gemini-2.5-flash-preview-09-2025 96 77 102 130 83 87 87 90
qwen3.5-27b 97 88 98 104 50 120 102 95
gpt-5.4-nano-high 98 73 105 82 63 139 112 110
minimax-m2.7 99 84 92 77 88 121 94 96
grok-4-fast-reasoning 100 101 117 114 85 90 106 97