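Model Evaluation: Artificial Analysis Metrics

An aggregated view of public LLM metrics, covering intelligence, coding, math, speed, price, latency, and value for money.

Output speed

Inception Mercury 2, IBM Granite 3.3 8B (Non-reasoning), and IBM Granite 4.0 H Small have the highest output speeds, which makes them a good fit for scenarios that prioritize response speed.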

Intelligence

Artificial Analysis Intelligence Index · higher is better

Anthropic Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback)

Top 64.9 · Average 58.53 · 10 items

Coding

Artificial Analysis Coding Index · higher is better

Anthropic Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback)

Top 62 · Average 56.39 · 10 items

Math

Artificial Analysis Math Index · higher is better

OpenAI GPT-5.2 (xhigh)

Top 99 · Average 96.55 · 10 items

Speed

Output tokens/s · higher is better

Inception Mercury 2

Top 1096.3 · Average 416.86 · 10 items
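
The "Average" figure on each summary card matches the arithmetic mean of the ten values shown in the corresponding top-10 list on this page. The sketch below reproduces the Intelligence card from those listed values; the function name `summarize` is hypothetical, and treating the average as a plain mean of the displayed items is an inference from the numbers, not something the page states.

```python
from statistics import mean

def summarize(scores):
    """Return (top score, mean rounded to 2 decimals, item count)."""
    return max(scores), round(mean(scores), 2), len(scores)

# Top-10 Intelligence Index values as listed on this page.
intelligence_top10 = [64.9, 61.4, 60.2, 58.9, 57.3, 57.2, 56.8, 56.7, 56.6, 55.3]

top, average, count = summarize(intelligence_top10)
print(f"Top {top} · Average {average} · {count} items")
```

The same function applied to the ten listed speed values gives the 416.86 shown on the Speed card.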

Top 3 gaps by dimension

For each dimension, the #1 entry is scaled to 100, and the #2 and #3 entries are compared by how close they come to the leader.

Intelligence
#1 Anthropic Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) · 64.9
#2 Anthropic Claude Opus 4.8 (Adaptive Reasoning, Max Effort) · 61.4
#3 OpenAI GPT-5.5 (xhigh) · 60.2
Coding
#1 Anthropic Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) · 62
#2 OpenAI GPT-5.5 (xhigh) · 59.1
#3 OpenAI GPT-5.5 (high) · 58.5
Math
#1 OpenAI GPT-5.2 (xhigh) · 99
#2 OpenAI GPT-5 Codex (high) · 98.7
#3 Google Gemini 3 Flash Preview (Reasoning) · 97
Speed
#1 Inception Mercury 2 · 1096.3
#2 IBM Granite 3.3 8B (Non-reasoning) · 407
#3 IBM Granite 4.0 H Small · 406.2
Blended price
#1 Alibaba Qwen3.5 0.8B (Reasoning) · 0.02
#2 Alibaba Qwen3.5 0.8B (Non-reasoning) · 0.02
#3 Google Gemma 3n E4B Instruct · 0.025
Input price
#1 Alibaba Qwen3.5 0.8B (Reasoning) · 0.01
#2 Alibaba Qwen3.5 0.8B (Non-reasoning) · 0.01
#3 Alibaba Qwen3.5 2B (Reasoning) · 0.02
Output price
#1 Google Gemma 3n E4B Instruct · 0.04
#2 Alibaba Qwen3.5 0.8B (Reasoning) · 0.05
#3 Alibaba Qwen3.5 0.8B (Non-reasoning) · 0.05
First-token latency
#1 Cohere Command A+ · 0.18
#2 Cohere North Mini Code · 0.2
#3 NVIDIA Nemotron Nano 9B V2 (Reasoning) · 0.23
Value for money
#1 Alibaba Qwen3.5 0.8B (Reasoning) · 525
#2 Alibaba Qwen3.5 0.8B (Non-reasoning) · 495
#3 Alibaba Qwen3.5 4B (Reasoning) · 451.7
Provider coverage
#1 Alibaba · 81
#2 OpenAI · 67
#3 Google · 55
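
The scaling described for this section (leader set to 100, runners-up expressed relative to it) can be sketched as below. The function name `relative_to_leader` is hypothetical. For lower-is-better dimensions such as price and latency, the sketch assumes the ratio is inverted so that a score closer to 100 still means closer to the leader; the page does not say how it handles those dimensions.

```python
def relative_to_leader(values, lower_is_better=False):
    """Scale a ranked list so the first entry is 100.

    values must be ordered with the leader first. For lower-is-better
    metrics the ratio is inverted (an assumption, see the lead-in).
    """
    leader = values[0]
    if lower_is_better:
        return [round(100 * leader / v, 1) for v in values]
    return [round(100 * v / leader, 1) for v in values]

# Intelligence top 3 as listed on this page.
intelligence = relative_to_leader([64.9, 61.4, 60.2])

# Speed top 3: the gap between #1 and the rest is far wider.
speed = relative_to_leader([1096.3, 407, 406.2])

# First-token latency top 3, treated as lower-is-better.
latency = relative_to_leader([0.18, 0.2, 0.23], lower_is_better=True)

print(intelligence, speed, latency)
```

Under this scaling the Intelligence runners-up land in the low-to-mid 90s, while the Speed runners-up fall below 40, which is the kind of contrast this section is meant to surface.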

Intelligence Top 5

A direct comparison of the actual scores of the top 5 models; this shows ranking gaps more clearly than a share chart.

#1 Anthropic Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) · 64.9
#2 Anthropic Claude Opus 4.8 (Adaptive Reasoning, Max Effort) · 61.4
#3 OpenAI GPT-5.5 (xhigh) · 60.2
#4 OpenAI GPT-5.5 (high) · 58.9
#5 Anthropic Claude Opus 4.7 (Adaptive Reasoning, Max Effort) · 57.3

Intelligence

Artificial Analysis Intelligence Index · higher is better

#1 Anthropic Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) · 64.9
#2 Anthropic Claude Opus 4.8 (Adaptive Reasoning, Max Effort) · 61.4
#3 OpenAI GPT-5.5 (xhigh) · 60.2
#4 OpenAI GPT-5.5 (high) · 58.9
#5 Anthropic Claude Opus 4.7 (Adaptive Reasoning, Max Effort) · 57.3
#6 Google Gemini 3.1 Pro Preview · 57.2
#7 OpenAI GPT-5.4 (xhigh) · 56.8
#8 OpenAI GPT-5.5 (medium) · 56.7
#9 Alibaba Qwen3.7 Max · 56.6
#10 Google Gemini 3.5 Flash (high) · 55.3

Showing the top 10 items; full model names are in the list below the chart.

Coding

Artificial Analysis Coding Index · higher is better

#1 Anthropic Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) · 62
#2 OpenAI GPT-5.5 (xhigh) · 59.1
#3 OpenAI GPT-5.5 (high) · 58.5
#4 OpenAI GPT-5.4 (xhigh) · 57.2
#5 Anthropic Claude Opus 4.8 (Adaptive Reasoning, Max Effort) · 56.7
#6 OpenAI GPT-5.5 (medium) · 56.2
#7 Google Gemini 3.1 Pro Preview · 55.5
#8 OpenAI GPT-5.3 Codex (xhigh) · 53.1
#9 Anthropic Claude Opus 4.7 (Non-reasoning, High Effort) · 53.1
#10 Anthropic Claude Opus 4.7 (Adaptive Reasoning, Max Effort) · 52.5

Math

Artificial Analysis Math Index · higher is better

#1 OpenAI GPT-5.2 (xhigh) · 99
#2 OpenAI GPT-5 Codex (high) · 98.7
#3 Google Gemini 3 Flash Preview (Reasoning) · 97
#4 OpenAI GPT-5.2 (medium) · 96.7
#5 DeepSeek V3.2 Speciale · 96.7
#6 Xiaomi MiMo-V2-Flash (Reasoning) · 96.3
#7 OpenAI GPT-5.1 Codex (high) · 95.7
#8 Google Gemini 3 Pro Preview (high) · 95.7
#9 Z AI GLM-4.7 (Reasoning) · 95
#10 KwaiKAT KAT-Coder-Pro V1 · 94.7

Speed

Output tokens/s · higher is better

#1 Inception Mercury 2 · 1096.3
#2 IBM Granite 3.3 8B (Non-reasoning) · 407
#3 IBM Granite 4.0 H Small · 406.2
#4 OpenAI gpt-oss-120b (low) · 372.9
#5 OpenAI gpt-oss-120b (high) · 359.3
#6 Alibaba Qwen3.5 2B (Non-reasoning) · 351.3
#7 NVIDIA Nemotron 3 Nano Omni 30B A3B Reasoning · 300
#8 Google Gemini 2.5 Flash-Lite (Reasoning) · 293.4
#9 NVIDIA Nemotron Nano 12B v2 VL (Reasoning) · 291.5
#10 Google Gemini 3.1 Flash-Lite · 290.7

Blended price

Blended price per 1 million tokens · lower is better

#1 Alibaba Qwen3.5 0.8B (Reasoning) · 0.02
#2 Alibaba Qwen3.5 0.8B (Non-reasoning) · 0.02
#3 Google Gemma 3n E4B Instruct · 0.025
#4 Alibaba Qwen3.5 2B (Reasoning) · 0.04
#5 Alibaba Qwen3.5 2B (Non-reasoning) · 0.04
#6 Sarvam 30B (high) · 0.047
#7 Meta Llama 3.2 Instruct 1B · 0.05
#8 Google Gemma 3 4B Instruct · 0.05
#9 Liquid AI LFM2 24B A2B · 0.052
#10 Alibaba Qwen3.5 4B (Reasoning) · 0.06
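
The page does not say how the blended price is derived from the input and output prices. For the models that appear in all three price lists, the figures are consistent with a 3:1 weighting of input to output tokens, so the sketch below uses that weighting as an assumption. The function name `blended_price` and its parameters are hypothetical.

```python
def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average of input and output price per 1M tokens.

    The default 3:1 weighting is an assumption inferred from the
    figures listed on this page, not a documented formula.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight

# (input price, output price, blended price) as listed on this page.
listed = {
    "Alibaba Qwen3.5 0.8B (Reasoning)": (0.01, 0.05, 0.02),
    "Google Gemma 3n E4B Instruct": (0.02, 0.04, 0.025),
    "Alibaba Qwen3.5 2B (Reasoning)": (0.02, 0.1, 0.04),
}

for name, (inp, out, shown) in listed.items():
    computed = round(blended_price(inp, out), 3)
    print(f"{name}: computed {computed}, listed {shown}")
```

If the real weighting differs, only the two default arguments need to change.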

Input price

Per 1 million input tokens · lower is better

#1 Alibaba Qwen3.5 0.8B (Reasoning) · 0.01
#2 Alibaba Qwen3.5 0.8B (Non-reasoning) · 0.01
#3 Alibaba Qwen3.5 2B (Reasoning) · 0.02
#4 Alibaba Qwen3.5 2B (Non-reasoning) · 0.02
#5 Google Gemma 3n E4B Instruct · 0.02
#6 Sarvam 30B (high) · 0.026
#7 Liquid AI LFM2 24B A2B · 0.03
#8 Alibaba Qwen3.5 4B (Reasoning) · 0.03
#9 Alibaba Qwen3.5 4B (Non-reasoning) · 0.03
#10 IBM Granite 3.3 8B (Non-reasoning) · 0.03

Output price

Per 1 million output tokens · lower is better

#1 Google Gemma 3n E4B Instruct · 0.04
#2 Alibaba Qwen3.5 0.8B (Reasoning) · 0.05
#3 Alibaba Qwen3.5 0.8B (Non-reasoning) · 0.05
#4 Meta Llama 3.2 Instruct 1B · 0.05
#5 Google Gemma 3 4B Instruct · 0.08
#6 Mistral Ministral 3 3B · 0.1
#7 IBM Granite 4.1 8B · 0.1
#8 Alibaba Qwen3.5 2B (Reasoning) · 0.1
#9 Alibaba Qwen3.5 2B (Non-reasoning) · 0.1
#10 Meta Llama 3.1 Instruct 8B · 0.1

First-token latency

Median time to first token · lower is better

#1 Cohere Command A+ · 0.18
#2 Cohere North Mini Code · 0.2
#3 NVIDIA Nemotron Nano 9B V2 (Reasoning) · 0.23
#4 Alibaba Qwen3.5 4B (Reasoning) · 0.23
#5 Alibaba Qwen3.5 2B (Non-reasoning) · 0.24
#6 Alibaba Qwen3.5 4B (Non-reasoning) · 0.26
#7 NVIDIA Nemotron Nano 12B v2 VL (Reasoning) · 0.27
#8 Cohere Command A · 0.28
#9 Alibaba Qwen3.5 0.8B (Non-reasoning) · 0.28
#10 NVIDIA Llama Nemotron Super 49B v1.5 (Non-reasoning) · 0.3

Value for money

Intelligence Index / blended price · higher is better

#1 Alibaba Qwen3.5 0.8B (Reasoning) · 525
#2 Alibaba Qwen3.5 0.8B (Non-reasoning) · 495
#3 Alibaba Qwen3.5 4B (Reasoning) · 451.7
#4 Alibaba Qwen3.5 2B (Reasoning) · 407.5
#5 Alibaba Qwen3.5 4B (Non-reasoning) · 376.7
#6 Alibaba Qwen3.5 2B (Non-reasoning) · 367.5
#7 Alibaba Qwen3.5 9B (Reasoning) · 286.7
#8 Xiaomi MiMo-V2.5 · 280
#9 OpenAI gpt-oss-20B (high) · 278.4
#10 Xiaomi MiMo-V2-Flash (Feb 2026) · 276.7
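
The ratio named in this section's subtitle (Intelligence Index divided by blended price) can be sketched as follows. The function name `value_for_money` is hypothetical. The Intelligence Index values for the small Qwen models are not listed on this page, so the 10.5 and 9.9 used below are back-computed from the listed scores and the listed 0.02 blended price; treat them as illustrative inputs, not published figures.

```python
def value_for_money(intelligence_index, blended_price):
    """Intelligence Index points per unit of blended price (higher is better)."""
    if blended_price <= 0:
        raise ValueError("blended_price must be positive")
    return intelligence_index / blended_price

# Index values here are back-computed assumptions, not figures from the page.
reasoning = round(value_for_money(10.5, 0.02), 1)
non_reasoning = round(value_for_money(9.9, 0.02), 1)

print(reasoning, non_reasoning)
```

Because the denominator is a price close to zero for the cheapest models, small models with modest index scores can dominate this ranking, which matches the list above.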

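Provider coverage

Number of models currently returned by the API · higher means broader coverage

#1 Alibaba · 81
#2 OpenAI · 67
#3 Google · 55
#4 Anthropic · 32
#5 Mistral · 32
#6 DeepSeek · 31
#7 xAI · 20
#8 NVIDIA · 18
#9 Z AI · 18
#10 Meta · 17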
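
A coverage count of this kind amounts to grouping the returned models by provider. The sketch below shows the idea on a handful of records; the record shape (`provider` and `name` keys) is hypothetical and is not the actual API response format, and the sample is a small subset of model names taken from this page.

```python
from collections import Counter

# Hypothetical record shape; the real API response may differ.
models = [
    {"provider": "Alibaba", "name": "Qwen3.5 0.8B (Reasoning)"},
    {"provider": "Alibaba", "name": "Qwen3.5 0.8B (Non-reasoning)"},
    {"provider": "Alibaba", "name": "Qwen3.5 2B (Reasoning)"},
    {"provider": "OpenAI", "name": "GPT-5.5 (xhigh)"},
    {"provider": "OpenAI", "name": "GPT-5.5 (high)"},
    {"provider": "Google", "name": "Gemma 3n E4B Instruct"},
]

def provider_coverage(records):
    """Count models per provider, most-covered provider first."""
    return Counter(r["provider"] for r in records).most_common()

for rank, (provider, count) in enumerate(provider_coverage(models), start=1):
    print(f"#{rank} {provider} · {count}")
```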
