Azure AI Foundry models & pricing

Azure AI Foundry hosts 76 models (71 with public pricing) covering 5 modalities. Azure marketplace catalog for Mistral, Cohere, Meta, Phi, and other partner models. Cheapest input starts at $0.0400/M tokens; the most premium goes up to $3,200/M. Use Future AGI's Agent Command Center to route any Azure AI Foundry model with cost-optimized fallback and unified observability.

Homepage ↗ Docs ↗
76

chat 60

Model Input / 1M Output / 1M Context Caps
Ministral 3B $0.0400/M $0.0400/M 128,000 tools
Phi 4 mini Instruct $0.0750/M $0.300/M 131,072 tools
Phi 4 mini Reasoning $0.0800/M $0.320/M 131,072 tools
Phi 4 Multimodal Instruct $0.0800/M $0.320/M 131,072 tools · vision · audio
Mistral Small 2503 $0.1000/M $0.300/M 128,000 tools · vision
Phi 4 $0.125/M $0.500/M 16,384 tools
Phi 4 Reasoning $0.125/M $0.500/M 32,768 tools · reasoning
Phi 3.5 mini Instruct $0.130/M $0.520/M 128,000
Phi 3.5 Vision Instruct $0.130/M $0.520/M 128,000 vision
Phi 3 mini 128k Instruct $0.130/M $0.520/M 128,000
Phi 3 mini 4k Instruct $0.130/M $0.520/M 4,096
Model Router $0.140/M
GPT Oss 120B $0.150/M $0.600/M 131,072 tools
Mistral Nemo $0.150/M $0.150/M 131,072 tools
Phi 3 Small 128k Instruct $0.150/M $0.600/M 128,000
Phi 3 Small 8k Instruct $0.150/M $0.600/M 8,192
Phi 3.5 MOE Instruct $0.160/M $0.640/M 128,000
Phi 3 Medium 128k Instruct $0.170/M $0.680/M 128,000
Phi 3 Medium 4k Instruct $0.170/M $0.680/M 4,096
Grok 4.1 Fast Non Reasoning $0.200/M $0.500/M 131,072 tools
Grok 4.1 Fast Reasoning $0.200/M $0.500/M 131,072 tools · reasoning
Grok 4 Fast Non Reasoning $0.200/M $0.500/M 131,072 tools
Grok 4 Fast Reasoning $0.200/M $0.500/M 131,072 tools
Grok Code Fast 1 $0.200/M $1.50/M 131,072 tools
Llama 4 Scout 17B 16e Instruct $0.200/M $0.780/M 10,000,000 tools · vision
Grok 3 mini $0.250/M $1.27/M 131,072 tools · reasoning
Grok 3 mini $0.250/M $1.27/M 131,072 tools · reasoning
Meta Llama 3.1 8B Instruct $0.300/M $0.610/M 128,000
Llama 3.2 11B Vision Instruct $0.370/M $0.370/M 128,000 tools · vision
Mistral Medium 2505 $0.400/M $2.00/M 131,072 tools
Jamba Instruct $0.500/M $0.700/M 70,000
Mistral Large 3 $0.500/M $1.50/M 256,000 tools · vision
DeepSeek v3.2 $0.580/M $1.68/M 163,840 tools · reasoning · cache
DeepSeek V3.2 Speciale $0.580/M $1.68/M 163,840 tools · reasoning · cache
Kimi K2.5 $0.600/M $3.00/M 262,144 tools · vision
Llama 3.3 70B Instruct $0.710/M $0.710/M 128,000 tools
Claude Haiku 4.5 $1.00/M $5.00/M 200,000 tools · vision · reasoning · cache
Mistral Small $1.00/M $3.00/M 32,000 tools
Meta Llama 3.70B Instruct $1.10/M $0.370/M 8,192
DeepSeek V3 $1.14/M $4.56/M 128,000
DeepSeek v3.0324 $1.14/M $4.56/M 128,000 tools
DeepSeek R1 $1.35/M $5.40/M 128,000 reasoning
Mai Ds R1 $1.35/M $5.40/M 128,000 reasoning
Llama 4 Maverick 17B 128e Instruct Fp8 $1.41/M $0.350/M 1,000,000 tools · vision
Mistral Large 2407 $2.00/M $6.00/M 128,000 tools
Mistral Large latest $2.00/M $6.00/M 128,000 tools
Llama 3.2 90B Vision Instruct $2.04/M $2.04/M 128,000 tools · vision
Meta Llama 3.1 70B Instruct $2.68/M $3.54/M 128,000
Claude Sonnet 4.5 $3.00/M $15.00/M 200,000 tools · vision · reasoning · cache
Claude Sonnet 4.6 $3.00/M $15.00/M 1,000,000 tools · vision · reasoning · cache
Grok 3 $3.00/M $15.00/M 131,072 tools
Grok 3 $3.00/M $15.00/M 131,072 tools
Grok 4 $3.00/M $15.00/M 131,072 tools
Mistral Large $4.00/M $12.00/M 32,000 tools
Claude Opus 4.5 $5.00/M $25.00/M 200,000 tools · vision · reasoning · cache
Claude Opus 4.6 $5.00/M $25.00/M 200,000 tools · vision · reasoning · cache
Claude Opus 4.7 $5.00/M $25.00/M 200,000 tools · vision · reasoning · cache
Meta Llama 3.1 405B Instruct $5.33/M $16.00/M 128,000
Claude Opus 4.1 $15.00/M $75.00/M 200,000 tools · vision · reasoning · cache
Jais 30B Chat $3,200/M $9,710/M 8,192

rerank 5

Model Input / 1M Output / 1M Context Caps
Cohere Rerank V3 English 4,096
Cohere Rerank V3 Multilingual 4,096
Cohere Rerank v3.5 4,096
Cohere Rerank V4.0 Fast 32,768
Cohere Rerank V4.0 Pro 32,768

ocr 5

Model Input / 1M Output / 1M Context Caps
Doc Intelligence Prebuilt Document
Doc Intelligence Prebuilt Layout
Doc Intelligence Prebuilt Read
Mistral Document AI 2505
Mistral Document AI 2512

embedding 3

Model Input / 1M Output / 1M Context Caps
Cohere Embed V3 English $0.1000/M 512
Cohere Embed V3 Multilingual $0.1000/M 512
Embed V 4.0 $0.120/M 128,000

image generation 3

Model Input / 1M Output / 1M Context Caps
Flux 1 Kontext Pro
Flux 1.1 Pro
Flux 2 Pro

FAQ

How many Azure AI Foundry models are there?

76 Azure AI Foundry models are listed across 5 modalities on this page. 71 have public per-token pricing.

How is Azure AI Foundry pricing verified?

Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.

Which Azure AI Foundry model is cheapest?

Input pricing on Azure AI Foundry starts at $0.0400 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.

Can I route to Azure AI Foundry via an OpenAI-compatible API?

Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a Azure AI Foundry target, and call Azure AI Foundry models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.

Route any Azure AI Foundry model via Agent Command Center →
OpenAI-compatible endpoint. Caching, fallback, guardrails, observability. Free for 100K requests/month.