IBM watsonx models & pricing
IBM watsonx hosts 29 models (28 with public pricing) covering 2 modalities. Enterprise foundation models — Granite, Llama, Mistral with AI governance. Cheapest input starts at $0.0600/M tokens; the most premium goes up to $500/M. Use Future AGI's Agent Command Center to route any IBM watsonx model with cost-optimized fallback and unified observability.
chat 28
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| IBM Granite 4 H Small | $0.0600/M | $0.250/M | 20,480 | tools |
| IBM Granite Guardian 3.2 2B | $0.1000/M | $0.1000/M | 8,192 | |
| IBM Granite Vision 3.2 2B | $0.1000/M | $0.1000/M | 8,192 | vision |
| Meta Llama Llama 3.2 1B Instruct | $0.1000/M | $0.1000/M | 128,000 | tools |
| Mistralai Mistral Small 2503 | $0.1000/M | $0.300/M | 32,000 | tools |
| Mistralai Mistral Small 3.1 24B Instruct 2503 | $0.1000/M | $0.300/M | 32,000 | tools |
| Meta Llama Llama 3.2 3B Instruct | $0.150/M | $0.150/M | 128,000 | tools |
| OpenAI GPT Oss 120B | $0.150/M | $0.600/M | 8,192 | |
| IBM Granite 3.3 8B Instruct | $0.200/M | $0.200/M | 8,192 | tools |
| IBM Granite 3.8B Instruct | $0.200/M | $0.200/M | 8,192 | tools · cache |
| IBM Granite Guardian 3.3 8B | $0.200/M | $0.200/M | 8,192 | |
| Meta Llama Llama 3.2 11B Vision Instruct | $0.350/M | $0.350/M | 128,000 | tools · vision |
| Meta Llama Llama 4 Maverick 17B | $0.350/M | $1.40/M | 128,000 | tools |
| Meta Llama Llama Guard 3.11B Vision | $0.350/M | $0.350/M | 128,000 | vision |
| Mistralai Pixtral 12B 2409 | $0.350/M | $0.350/M | 128,000 | vision |
| IBM Granite Ttm 1024.96 R2 | $0.380/M | $0.380/M | 512 | |
| IBM Granite Ttm 1536.96 R2 | $0.380/M | $0.380/M | 512 | |
| IBM Granite Ttm 512.96 R2 | $0.380/M | $0.380/M | 512 | |
| Google Flan T5 Xl 3B | $0.600/M | $0.600/M | 8,192 | |
| IBM Granite 13B Chat v2 | $0.600/M | $0.600/M | 8,192 | |
| IBM Granite 13B Instruct v2 | $0.600/M | $0.600/M | 8,192 | |
| Meta Llama Llama 3.3 70B Instruct | $0.710/M | $0.710/M | 128,000 | tools |
| Sdaia Allam 1.13B Instruct | $1.80/M | $1.80/M | 8,192 | |
| Meta Llama Llama 3.2 90B Vision Instruct | $2.00/M | $2.00/M | 128,000 | tools · vision |
| Mistralai Mistral Large | $3.00/M | $10.00/M | 131,072 | tools · cache |
| Mistralai Mistral Medium 2505 | $3.00/M | $10.00/M | 128,000 | tools |
| Bigscience Mt0 Xxl 13B | $500/M | $2,000/M | 8,192 | |
| Core42 Jais 13B Chat | $500/M | $2,000/M | 8,192 |
audio transcription 1
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Whisper Large V3 Turbo | — | — | — |
FAQ
How many IBM watsonx models are there?
29 IBM watsonx models are listed across 2 modalities on this page. 28 have public per-token pricing.
How is IBM watsonx pricing verified?
Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.
Which IBM watsonx model is cheapest?
Input pricing on IBM watsonx starts at $0.0600 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.
Can I route to IBM watsonx via an OpenAI-compatible API?
Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a IBM watsonx target, and call IBM watsonx models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.