Together AI models & pricing
Together AI hosts 42 models (35 with public pricing) covering 2 modalities. Open-model inference cloud — Llama, DeepSeek, Qwen, Mixtral, FLUX with serverless and dedicated endpoints. Cheapest input starts at $0.008000/M tokens; the most premium goes up to $3.50/M. Use Future AGI's Agent Command Center to route any Together AI model with cost-optimized fallback and unified observability.
chat 39
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Blended ↕ | Context ↕ | Caps |
|---|---|---|---|---|---|
| OpenAI GPT Oss 20B | $0.0500/M | $0.200/M | $0.0875/M | 128,000 | tools |
| Together AI Up To 4B | $0.1000/M | $0.1000/M | $0.1000/M | — | |
| OpenAI GPT Oss 120B | $0.150/M | $0.600/M | $0.262/M | 131,072 | tools · reasoning |
| Qwen Qwen3 Next 80B A3b Instruct | $0.150/M | $1.50/M | $0.488/M | 262,144 | tools |
| Qwen Qwen3 Next 80B A3b Thinking | $0.150/M | $1.50/M | $0.488/M | 262,144 | tools |
| Meta Llama Llama 4 Scout 17B 16e Instruct | $0.180/M | $0.590/M | $0.283/M | — | tools |
| Meta Llama Meta Llama 3.1 8B Instruct Turbo | $0.180/M | $0.180/M | $0.180/M | — | tools |
| Qwen Qwen3.235b A22b Fp8 Tput | $0.200/M | $0.600/M | $0.300/M | 40,000 | |
| Qwen Qwen3.235b A22b Instruct 2507 Tput | $0.200/M | $6.00/M | $1.65/M | 262,000 | tools |
| Together AI 4.1B 8B | $0.200/M | $0.200/M | $0.200/M | — | |
| Zai Org Glm 4.5 Air Fp8 | $0.200/M | $1.10/M | $0.425/M | 128,000 | tools |
| Meta Llama Llama 4 Maverick 17B 128e Instruct Fp8 | $0.270/M | $0.850/M | $0.415/M | — | tools |
| Together AI 8.1B 21B | $0.300/M | $0.300/M | $0.300/M | 1,000 | |
| Zai Org Glm 4.7 | $0.450/M | $2.00/M | $0.837/M | 200,000 | tools · reasoning |
| Moonshotai Kimi K2.5 | $0.500/M | $2.80/M | $1.08/M | 256,000 | tools · vision · reasoning |
| DeepSeek AI DeepSeek R1.0528 Tput | $0.550/M | $2.19/M | $0.960/M | 128,000 | tools |
| DeepSeek AI DeepSeek v3.1 | $0.600/M | $1.70/M | $0.875/M | 128,000 | tools · reasoning |
| Mistralai Mixtral 8×7B Instruct v0.1 | $0.600/M | $0.600/M | $0.600/M | — | tools |
| Qwen Qwen3.5 397B A17b | $0.600/M | $3.60/M | $1.35/M | 262,144 | tools |
| Zai Org Glm 4.6 | $0.600/M | $2.20/M | $1.00/M | 200,000 | tools · reasoning |
| Qwen Qwen3.235b A22b Thinking 2507 | $0.650/M | $3.00/M | $1.24/M | 256,000 | tools |
| Together AI 21.1B 41B | $0.800/M | $0.800/M | $0.800/M | — | |
| Meta Llama Llama 3.3 70B Instruct Turbo | $0.880/M | $0.880/M | $0.880/M | — | tools |
| Meta Llama Meta Llama 3.1 70B Instruct Turbo | $0.880/M | $0.880/M | $0.880/M | — | tools |
| Together AI 41.1B 80B | $0.900/M | $0.900/M | $0.900/M | — | |
| Moonshotai Kimi K2 Instruct | $1.00/M | $3.00/M | $1.50/M | — | tools |
| Moonshotai Kimi K2 Instruct 0905 | $1.00/M | $3.00/M | $1.50/M | 262,144 | tools |
| DeepSeek AI DeepSeek v3 | $1.25/M | $1.25/M | $1.25/M | 65,536 | tools |
| Together AI 81.1B 110B | $1.80/M | $1.80/M | $1.80/M | — | |
| Qwen Qwen3 Coder 480B A35b Instruct Fp8 | $2.00/M | $2.00/M | $2.00/M | 256,000 | tools |
| DeepSeek AI DeepSeek R1 | $3.00/M | $7.00/M | $4.00/M | 128,000 | tools |
| Meta Llama Meta Llama 3.1 405B Instruct Turbo | $3.50/M | $3.50/M | $3.50/M | — | tools |
| Meta Llama Llama 3.2 3B Instruct Turbo | — | — | — | — | tools |
| Meta Llama Llama 3.3 70B Instruct Turbo Free | — | — | — | — | tools |
| Mistralai Mistral 7B Instruct v0.1 | — | — | — | — | tools |
| Mistralai Mistral Small 24B Instruct 2501 | — | — | — | — | tools |
| Qwen Qwen2.5 72B Instruct Turbo | — | — | — | — | tools |
| Qwen Qwen2.5 7B Instruct Turbo | — | — | — | — | tools |
| Togethercomputer Codellama 34B Instruct | — | — | — | — | tools |
embedding 3
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Blended ↕ | Context ↕ | Caps |
|---|---|---|---|---|---|
| Baai Bge Base En v1.5 | $0.008000/M | — | — | 512 | |
| Together AI Embedding Up To 150M | $0.008000/M | — | — | — | |
| Together AI Embedding 151M To 350M | $0.0160/M | — | — | — |
FAQ
How many Together AI models are there?
42 Together AI models are listed across 2 modalities on this page. 35 have public per-token pricing.
How is Together AI pricing verified?
Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.
Which Together AI model is cheapest?
Input pricing on Together AI starts at $0.008000 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.
Can I route to Together AI via an OpenAI-compatible API?
Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a Together AI target, and call Together AI models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.