GPT-5 vs GPT 5.1 (2025-11-13)

GPT-5 and GPT 5.1 (2025-11-13) have identical list prices, so neither is cheaper on average (0% difference). GPT-5 from OpenAI (272,000-token context, reasoning, tool calls) vs. GPT 5.1 (2025-11-13) from Azure OpenAI (272,000-token context, reasoning, tool calls). Use Agent Command Center to A/B both in shadow mode and pick the winner per workload.

Side-by-side cost

Live workload comparison

Same workload run through both models. The cheaper one is highlighted.

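Input tokens per request: 3,000 (slider range 0 to 272,000)
Output tokens per request: 400 (slider range 0 to 128,000)
Requests: 5,000 (slider range 0 to 1,000,000)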
OpenAI: $1,179/mo (Input $1.25/M · Output $10.00/M)
Azure OpenAI: $1,179/mo (Input $1.25/M · Output $10.00/M)
At this workload, GPT 5.1 (2025-11-13) and GPT-5 cost the same: the difference is $0/month ($0/year).
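The monthly figure can be reproduced with simple arithmetic. The sketch below is not the page's calculator: it assumes the request count is per day and that a month is 365/12 days, neither of which the page states. Under those assumptions it arrives at the same $1,179/mo for both models. The function and parameter names are illustrative.

```python
def monthly_cost(input_tokens, output_tokens, requests_per_day,
                 input_price_per_m, output_price_per_m,
                 days_per_month=365 / 12):
    """Estimated monthly spend in dollars for a fixed per-request workload.

    Prices are dollars per million tokens. The per-day request unit and the
    365/12-day month are assumptions, not taken from the page.
    """
    per_request = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m) / 1_000_000
    return per_request * requests_per_day * days_per_month


# Workload from the comparison: 3,000 input tokens, 400 output tokens, 5,000 requests.
gpt5 = monthly_cost(3_000, 400, 5_000, 1.25, 10.00)
gpt51 = monthly_cost(3_000, 400, 5_000, 1.25, 10.00)

print(f"GPT-5: ${gpt5:,.0f}/mo")
print(f"GPT 5.1 (2025-11-13): ${gpt51:,.0f}/mo")
print(f"Difference: ${abs(gpt5 - gpt51):,.2f}/mo")
```

Because both models share the same input and output prices, the difference stays at zero for any workload, so the choice between them rests on quality, latency, and provider fit, not on list price.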
Production recipe — Agent Command Center
strategy: cost-optimized
primary:
  model: gpt-5-1-2025-11-13
  provider: azure-openai
fallback:
  model: gpt-5
  provider: openai
shadow: { sample_rate: 0.05 }   # mirror 5% of traffic to compare quality live
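As a rough illustration of what a recipe like this describes, the sketch below serves each request from the primary model, falls back if the primary raises, and mirrors about 5% of traffic to a shadow model for comparison. It is not Agent Command Center's implementation or API. The `route` function and its callables are hypothetical, and sending shadow traffic to the fallback model is an assumption.

```python
import random


def route(prompt, call_primary, call_fallback, call_shadow,
          sample_rate=0.05, rng=random):
    """Serve from primary, fall back on error, mirror a sample to shadow.

    All three callables are hypothetical stand-ins that take a prompt and
    return a response; they are not part of any real SDK.
    """
    try:
        response = call_primary(prompt)
    except Exception:
        # Primary failed: answer from the fallback model instead.
        response = call_fallback(prompt)

    # Mirror roughly `sample_rate` of requests; the shadow result is never
    # returned to the caller, and shadow errors are swallowed.
    if rng.random() < sample_rate:
        try:
            call_shadow(prompt)
        except Exception:
            pass
    return response


if __name__ == "__main__":
    mirrored = []
    rng = random.Random(0)  # seeded so the demo is repeatable
    for _ in range(10_000):
        route("hello",
              call_primary=lambda p: "answer from gpt-5-1-2025-11-13",
              call_fallback=lambda p: "answer from gpt-5",
              call_shadow=mirrored.append,
              sample_rate=0.05,
              rng=rng)
    print(f"mirrored {len(mirrored)} of 10,000 requests")
```

A production router would normally issue the shadow call asynchronously so it adds no latency to the user-facing response; the synchronous call here only keeps the sketch short.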
Feature            GPT-5          GPT 5.1 (2025-11-13)
Input price        $1.25/M        $1.25/M
Output price       $10.00/M       $10.00/M
Context window     272,000        272,000
Max output         128,000        128,000
Function calling
Vision
Audio input
Reasoning
Prompt caching
Structured output
Pricing verified   May 12, 2026   May 12, 2026
Cheaper option: tie (identical list prices)
Larger context: 272,000 tokens
More capabilities: 5 of 6 capability flags advertised

Benchmark comparison

Side-by-side public benchmark scores. Greener bar = winner.

Benchmark               Category     GPT-5    GPT 5.1 (2025-11-13)
Chatbot Arena ELO       general      1,450    not listed
MATH-500                math         99.6%    not listed
AIME 2024               math         98.4%    not listed
BFCL v3                 agent        96.3%    not listed
HumanEval               code         96.0%    not listed
IFEval                  general      95.6%    not listed
AIME 2025               math         94.6%    not listed
LiveCodeBench           code         90.0%    not listed
MMLU-Pro                reasoning    89.4%    not listed
Aider Polyglot          code         88.0%    not listed
GPQA Diamond            reasoning    87.3%    not listed
MMMU                    multimodal   84.2%    not listed
SWE-bench Verified      agent        74.9%    not listed
Humanity's Last Exam    reasoning    42.0%    not listed
ARC-AGI-2               reasoning    17.6%    not listed