Groq models & pricing

Groq hosts 14 models (11 with public per-token pricing) across three modalities: chat, audio transcription, and audio speech. Its LPU hardware serves models such as Llama, Mixtral, and Whisper with sub-second time-to-first-token (TTFT) at high throughput. Input pricing starts at $0.05 per 1M tokens; the most expensive model runs $1.00/M input and $3.00/M output. Use Future AGI's Agent Command Center to route any Groq model with cost-optimized fallback and unified observability.


Chat (11 models)

| Model | Input / 1M | Output / 1M | Context | Caps |
|---|---|---|---|---|
| Gemma 7B IT | $0.0500 | $0.0800 | 8,192 | tools |
| Llama 3.1 8B Instant | $0.0500 | $0.0800 | 128,000 | tools |
| OpenAI GPT-OSS 20B | $0.0750 | $0.300 | 131,072 | tools · reasoning |
| OpenAI GPT-OSS Safeguard 20B | $0.0750 | $0.300 | 131,072 | tools · reasoning |
| Meta Llama 4 Scout 17B 16E Instruct | $0.110 | $0.340 | 131,072 | tools · vision |
| OpenAI GPT-OSS 120B | $0.150 | $0.600 | 131,072 | tools · reasoning |
| Meta Llama 4 Maverick 17B 128E Instruct | $0.200 | $0.600 | 131,072 | tools · vision |
| Meta Llama Guard 4 12B | $0.200 | $0.200 | 8,192 | |
| Qwen3 32B | $0.290 | $0.590 | 131,000 | tools · reasoning |
| Llama 3.3 70B Versatile | $0.590 | $0.790 | 128,000 | tools |
| Moonshot AI Kimi K2 Instruct 0905 | $1.00 | $3.00 | 262,144 | tools |
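
Per-token rates convert to per-request cost with simple arithmetic. A minimal sketch, using the Llama 3.1 8B Instant rates from the table above:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD; prices are quoted per 1M tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Llama 3.1 8B Instant: $0.05/M input, $0.08/M output
cost = request_cost(10_000, 2_000, 0.05, 0.08)
print(f"${cost:.5f}")  # $0.00066
```

Note that output tokens usually dominate cost on reasoning models, where the output rate is 3-4x the input rate.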

Audio transcription (2 models)

| Model | Input / 1M | Output / 1M | Context | Caps |
|---|---|---|---|---|
| Whisper Large v3 | – | – | – | |
| Whisper Large v3 Turbo | – | – | – | |

Per-token pricing for these models is not publicly listed.

Audio speech (1 model)

| Model | Input / 1M | Output / 1M | Context | Caps |
|---|---|---|---|---|
| PlayAI TTS | – | – | 10,000 | |

FAQ

How many Groq models are there?

This page lists 14 Groq models across three modalities; 11 of them have public per-token pricing.

How is Groq pricing verified?

Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.

Which Groq model is cheapest?

Input pricing on Groq starts at $0.05 per 1M tokens (Gemma 7B IT and Llama 3.1 8B Instant). Sort the table by price, or use the in-page filter at the top, to find the cheapest model that matches your capability requirements.

Can I route to Groq via an OpenAI-compatible API?

Yes: point your OpenAI client at Future AGI's Agent Command Center, configure a Groq target, and call Groq models through the standard /v1/chat/completions endpoint. The same gateway can route to other providers as fallbacks. Free for the first 100K requests/month.
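
A minimal sketch of the request body an OpenAI-compatible client sends; the model id and the gateway URL mentioned in the comments are illustrative assumptions, not confirmed values:

```python
import json

def chat_request(model: str, prompt: str) -> dict:
    """Build a standard /v1/chat/completions request body.

    The same body works against any OpenAI-compatible gateway;
    only the base URL and API key change per provider.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# The model id below follows Groq's naming convention; treat it as an assumption.
body = chat_request("llama-3.1-8b-instant", "Say hello")
# POST json.dumps(body) to <your-gateway-base-url>/v1/chat/completions
print(json.dumps(body, indent=2))
```

Because the body is unchanged across providers, swapping the gateway's upstream target from Groq to another provider requires no client-side code changes.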

Route any Groq model via Agent Command Center →
OpenAI-compatible endpoint. Caching, fallback, guardrails, observability. Free for 100K requests/month.