Groq models & pricing
Groq hosts 14 models (11 with public pricing) covering 3 modalities. LPU inference for Llama, Mixtral, Whisper — sub-second TTFT at high throughput. Cheapest input starts at $0.0500/M tokens; the most premium goes up to $1.00/M. Use Future AGI's Agent Command Center to route any Groq model with cost-optimized fallback and unified observability.
chat 11
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Gemma 7B It | $0.0500/M | $0.0800/M | 8,192 | tools |
| Llama 3.1 8B Instant | $0.0500/M | $0.0800/M | 128,000 | tools |
| OpenAI GPT Oss 20B | $0.0750/M | $0.300/M | 131,072 | tools · reasoning |
| OpenAI GPT Oss Safeguard 20B | $0.0750/M | $0.300/M | 131,072 | tools · reasoning |
| Meta Llama Llama 4 Scout 17B 16e Instruct | $0.110/M | $0.340/M | 131,072 | tools · vision |
| OpenAI GPT Oss 120B | $0.150/M | $0.600/M | 131,072 | tools · reasoning |
| Meta Llama Llama 4 Maverick 17B 128e Instruct | $0.200/M | $0.600/M | 131,072 | tools · vision |
| Meta Llama Llama Guard 4.12B | $0.200/M | $0.200/M | 8,192 | |
| Qwen Qwen3.32b | $0.290/M | $0.590/M | 131,000 | tools · reasoning |
| Llama 3.3 70B Versatile | $0.590/M | $0.790/M | 128,000 | tools |
| Moonshotai Kimi K2 Instruct 0905 | $1.00/M | $3.00/M | 262,144 | tools |
audio transcription 2
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Whisper Large v3 | — | — | — | |
| Whisper Large V3 Turbo | — | — | — |
audio speech 1
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Playai TTS | — | — | 10,000 |
FAQ
How many Groq models are there?
14 Groq models are listed across 3 modalities on this page. 11 have public per-token pricing.
How is Groq pricing verified?
Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.
Which Groq model is cheapest?
Input pricing on Groq starts at $0.0500 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.
Can I route to Groq via an OpenAI-compatible API?
Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a Groq target, and call Groq models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.