Cohere models & pricing
Cohere hosts 22 models (22 with public pricing) covering 4 modalities. Command R+ chat, Embed v3 multilingual embeddings, Rerank, and enterprise RAG models. Cheapest input starts at $0.1000/M tokens; the most premium goes up to $100/M. Use Future AGI's Agent Command Center to route any Cohere model with cost-optimized fallback and unified observability.
embedding 8
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Embed English Light v2.0 | $0.1000/M | — | 1,024 | |
| Embed English Light v3.0 | $0.1000/M | — | 1,024 | |
| Embed English v2.0 | $0.1000/M | — | 4,096 | |
| Embed English v3.0 | $0.1000/M | — | 1,024 | |
| Embed Multilingual v2.0 | $0.1000/M | — | 768 | |
| Embed Multilingual v3.0 | $0.1000/M | — | 1,024 | |
| Embed v4.0 | $0.120/M | — | 128,000 | |
| Embed Multilingual Light v3.0 | $100/M | — | 1,024 |
chat 7
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Command R | $0.150/M | $0.600/M | 128,000 | tools |
| Command R 08.2024 | $0.150/M | $0.600/M | 128,000 | tools |
| Command R7b 12.2024 | $0.150/M | $0.0375/M | 128,000 | tools |
| Command Light | $0.300/M | $0.600/M | 4,096 | |
| Command A 03.2025 | $2.50/M | $10.00/M | 256,000 | tools |
| Command R+ | $2.50/M | $10.00/M | 128,000 | tools |
| Command R Plus 08.2024 | $2.50/M | $10.00/M | 128,000 | tools |
rerank 5
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Rerank English v2.0 | — | — | 4,096 | |
| Rerank English v3.0 | — | — | 4,096 | |
| Rerank Multilingual v2.0 | — | — | 4,096 | |
| Rerank Multilingual v3.0 | — | — | 4,096 | |
| Rerank v3.5 | — | — | 4,096 |
completion 2
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Command | $1.00/M | $2.00/M | 4,096 | |
| Command Nightly | $1.00/M | $2.00/M | 4,096 |
FAQ
How many Cohere models are there?
22 Cohere models are listed across 4 modalities on this page. 22 have public per-token pricing.
How is Cohere pricing verified?
Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.
Which Cohere model is cheapest?
Input pricing on Cohere starts at $0.1000 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.
Can I route to Cohere via an OpenAI-compatible API?
Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a Cohere target, and call Cohere models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.