Together AI models & pricing
Together AI hosts 42 models (35 with public pricing) across 2 modalities. It is an open-model inference cloud (Llama, DeepSeek, Qwen, Mixtral, FLUX) with serverless and dedicated endpoints. Input pricing starts at $0.008/M tokens; the most premium model runs up to $3.50/M. Use Future AGI's Agent Command Center to route any Together AI model with cost-optimized fallback and unified observability.
Modalities: chat (39), embedding (3)
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| BAAI bge-base-en-v1.5 | $0.008/M | — | 512 | |
| Together AI Embedding (up to 150M) | $0.008/M | — | — | |
| Together AI Embedding (151M–350M) | $0.016/M | — | — | |
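The rates in the table are quoted per 1M input tokens, so a job's cost is simply tokens divided by one million times the rate. A minimal sketch of that arithmetic, using the $0.008/M rate from the first row (the helper name is ours, not Together AI's):

```python
# Rate from the table above: $0.008 per 1M input tokens.
PRICE_PER_M_INPUT = 0.008  # USD per 1M tokens

def embedding_cost(tokens: int, price_per_m: float = PRICE_PER_M_INPUT) -> float:
    """Return the estimated USD cost of embedding `tokens` input tokens."""
    return tokens / 1_000_000 * price_per_m

# Embedding 10M tokens at $0.008/M works out to roughly $0.08.
print(embedding_cost(10_000_000))
```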
FAQ
How many Together AI models are there?
42 Together AI models are listed across 2 modalities on this page. 35 have public per-token pricing.
How is Together AI pricing verified?
Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.
Which Together AI model is cheapest?
Input pricing on Together AI starts at $0.008 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.
Can I route to Together AI via an OpenAI-compatible API?
Yes: point your OpenAI client at Future AGI's Agent Command Center, configure a Together AI target, and call Together AI models through the standard /v1/chat/completions surface. The same gateway can route to other providers as fallbacks. It is free for the first 100K requests/month.
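As a sketch of what such a call carries, the request body is a standard OpenAI-style chat completion payload; the gateway URL and the model id below are illustrative placeholders, not documented values:

```python
import json

# Hypothetical gateway endpoint -- substitute your Agent Command Center URL.
GATEWAY_URL = "https://gateway.example.com/v1/chat/completions"

# Standard OpenAI-style chat payload; the model id is an example Together AI id.
payload = {
    "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",
    "messages": [{"role": "user", "content": "Say hello."}],
}
body = json.dumps(payload)
# POST `body` to GATEWAY_URL with your gateway API key in the Authorization
# header; any OpenAI-compatible client pointed at the gateway base URL sends
# the same shape for you.
```

Because the surface is OpenAI-compatible, switching providers is a matter of changing the `model` field and gateway routing rules, not rewriting client code.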