W&B Inference models & pricing
W&B Inference hosts 16 models (16 with public pricing) covering 1 modalities. Weights & Biases serverless inference for Llama and DeepSeek. Cheapest input starts at $0.300/M tokens; the most premium goes up to $135,000/M. Use Future AGI's Agent Command Center to route any W&B Inference model with cost-optimized fallback and unified observability.
chat 16
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Minimaxai Minimax M2.5 | $0.300/M | $1.20/M | 197,000 | tools · reasoning |
| Moonshotai Kimi K2.5 | $0.600/M | $3.00/M | 262,144 | tools · vision · reasoning |
| Moonshotai Kimi K2 Instruct | $0.600/M | $2.50/M | 128,000 | |
| OpenAI GPT Oss 20B | $5,000/M | $20,000/M | 131,072 | |
| Microsoft Phi 4 mini Instruct | $8,000/M | $35,000/M | 128,000 | |
| Qwen Qwen3.235b A22b Instruct 2507 | $10,000/M | $10,000/M | 262,144 | |
| Qwen Qwen3.235b A22b Thinking 2507 | $10,000/M | $10,000/M | 262,144 | |
| OpenAI GPT Oss 120B | $15,000/M | $60,000/M | 131,072 | |
| Meta Llama Llama 4 Scout 17B 16e Instruct | $17,000/M | $66,000/M | 64,000 | |
| Meta Llama Llama 3.1 8B Instruct | $22,000/M | $22,000/M | 128,000 | |
| DeepSeek AI DeepSeek v3.1 | $55,000/M | $165,000/M | 128,000 | |
| Zai Org Glm 4.5 | $55,000/M | $200,000/M | 131,072 | |
| Meta Llama Llama 3.3 70B Instruct | $71,000/M | $71,000/M | 128,000 | |
| Qwen Qwen3 Coder 480B A35b Instruct | $100,000/M | $150,000/M | 262,144 | |
| DeepSeek AI DeepSeek v3.0324 | $114,000/M | $275,000/M | 161,000 | |
| DeepSeek AI DeepSeek R1.0528 | $135,000/M | $540,000/M | 161,000 |
FAQ
How many W&B Inference models are there?
16 W&B Inference models are listed across 1 modality on this page. 16 have public per-token pricing.
How is W&B Inference pricing verified?
Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.
Which W&B Inference model is cheapest?
Input pricing on W&B Inference starts at $0.300 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.
Can I route to W&B Inference via an OpenAI-compatible API?
Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a W&B Inference target, and call W&B Inference models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.