IBM watsonx models & pricing

IBM watsonx hosts 29 models (28 with public pricing) covering 2 modalities. Enterprise foundation models — Granite, Llama, Mistral with AI governance. Cheapest input starts at $0.0600/M tokens; the most premium goes up to $500/M. Use Future AGI's Agent Command Center to route any IBM watsonx model with cost-optimized fallback and unified observability.

Homepage ↗ Docs ↗

chat 28

Model ↕	Input / 1M ↕	Output / 1M ↕	Blended ↕	Context ↕	Caps
IBM Granite 4 H Small	$0.0600/M	$0.250/M	$0.108/M	20,480	tools
IBM Granite Guardian 3.2 2B	$0.1000/M	$0.1000/M	$0.1000/M	8,192
IBM Granite Vision 3.2 2B	$0.1000/M	$0.1000/M	$0.1000/M	8,192	vision
Meta Llama Llama 3.2 1B Instruct	$0.1000/M	$0.1000/M	$0.1000/M	128,000	tools
Mistralai Mistral Small 2503	$0.1000/M	$0.300/M	$0.150/M	32,000	tools
Mistralai Mistral Small 3.1 24B Instruct 2503	$0.1000/M	$0.300/M	$0.150/M	32,000	tools
Meta Llama Llama 3.2 3B Instruct	$0.150/M	$0.150/M	$0.150/M	128,000	tools
OpenAI GPT Oss 120B	$0.150/M	$0.600/M	$0.262/M	8,192
IBM Granite 3.3 8B Instruct	$0.200/M	$0.200/M	$0.200/M	8,192	tools
IBM Granite 3.8B Instruct	$0.200/M	$0.200/M	$0.200/M	8,192	tools · cache
IBM Granite Guardian 3.3 8B	$0.200/M	$0.200/M	$0.200/M	8,192
Meta Llama Llama 3.2 11B Vision Instruct	$0.350/M	$0.350/M	$0.350/M	128,000	tools · vision
Meta Llama Llama 4 Maverick 17B	$0.350/M	$1.40/M	$0.612/M	128,000	tools
Meta Llama Llama Guard 3.11B Vision	$0.350/M	$0.350/M	$0.350/M	128,000	vision
Mistralai Pixtral 12B 2409	$0.350/M	$0.350/M	$0.350/M	128,000	vision
IBM Granite Ttm 1024.96 R2	$0.380/M	$0.380/M	$0.380/M	512
IBM Granite Ttm 1536.96 R2	$0.380/M	$0.380/M	$0.380/M	512
IBM Granite Ttm 512.96 R2	$0.380/M	$0.380/M	$0.380/M	512
Google Flan T5 Xl 3B	$0.600/M	$0.600/M	$0.600/M	8,192
IBM Granite 13B Chat v2	$0.600/M	$0.600/M	$0.600/M	8,192
IBM Granite 13B Instruct v2	$0.600/M	$0.600/M	$0.600/M	8,192
Meta Llama Llama 3.3 70B Instruct	$0.710/M	$0.710/M	$0.710/M	128,000	tools
Sdaia Allam 1.13B Instruct	$1.80/M	$1.80/M	$1.80/M	8,192
Meta Llama Llama 3.2 90B Vision Instruct	$2.00/M	$2.00/M	$2.00/M	128,000	tools · vision
Mistralai Mistral Large	$3.00/M	$10.00/M	$4.75/M	131,072	tools · cache
Mistralai Mistral Medium 2505	$3.00/M	$10.00/M	$4.75/M	128,000	tools
Bigscience Mt0 Xxl 13B	$500/M	$2,000/M	$875/M	8,192
Core42 Jais 13B Chat	$500/M	$2,000/M	$875/M	8,192

audio transcription 1

Model ↕	Input / 1M ↕	Output / 1M ↕	Blended ↕	Context ↕	Caps
Whisper Large V3 Turbo	—	—	—	—

FAQ

How many IBM watsonx models are there?

29 IBM watsonx models are listed across 2 modalities on this page. 28 have public per-token pricing.

How is IBM watsonx pricing verified?

Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.

Which IBM watsonx model is cheapest?

Input pricing on IBM watsonx starts at $0.0600 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.

Can I route to IBM watsonx via an OpenAI-compatible API?

Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a IBM watsonx target, and call IBM watsonx models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.

Route any IBM watsonx model via Agent Command Center →

OpenAI-compatible endpoint. Caching, fallback, guardrails, observability. Free for 100K requests/month.