Perplexity models & pricing

Perplexity hosts 46 models (28 with public pricing) covering 4 modalities. Sonar models with grounded web search and citation built-in. Cheapest input starts at $0.004000/M tokens; the most premium goes up to $5.00/M. Use Future AGI's Agent Command Center to route any Perplexity model with cost-optimized fallback and unified observability.

Homepage ↗ Docs ↗

chat 25

Model ↕	Input / 1M ↕	Output / 1M ↕	Blended ↕	Context ↕	Caps
Mistral 7B Instruct	$0.0700/M	$0.280/M	$0.123/M	4,096
Mixtral 8×7B Instruct	$0.0700/M	$0.280/M	$0.123/M	4,096
Pplx 7B Chat	$0.0700/M	$0.280/M	$0.123/M	8,192
Sonar Small Chat	$0.0700/M	$0.280/M	$0.123/M	16,384
Llama 3.1 8B Instruct	$0.200/M	$0.200/M	$0.200/M	131,072
Llama 3.1 Sonar Small 128k Chat Deprecated	$0.200/M	$0.200/M	$0.200/M	131,072
Llama 3.1 Sonar Small 128k Online Deprecated	$0.200/M	$0.200/M	$0.200/M	127,072
Codellama 34B Instruct	$0.350/M	$1.40/M	$0.612/M	16,384
Sonar Medium Chat	$0.600/M	$1.80/M	$0.900/M	16,384
Codellama 70B Instruct	$0.700/M	$2.80/M	$1.22/M	16,384
Llama 2.70B Chat	$0.700/M	$2.80/M	$1.22/M	4,096
Pplx 70B Chat	$0.700/M	$2.80/M	$1.22/M	4,096
Llama 3.1 70B Instruct	$1.00/M	$1.00/M	$1.00/M	131,072
Llama 3.1 Sonar Large 128k Chat Deprecated	$1.00/M	$1.00/M	$1.00/M	131,072
Llama 3.1 Sonar Large 128k Online Deprecated	$1.00/M	$1.00/M	$1.00/M	127,072
Sonar	$1.00/M	$1.00/M	$1.00/M	128,000
Sonar Reasoning	$1.00/M	$5.00/M	$2.00/M	128,000	reasoning
Sonar Deep Research	$2.00/M	$8.00/M	$3.50/M	128,000	reasoning
Sonar Reasoning Pro	$2.00/M	$8.00/M	$3.50/M	128,000	reasoning
Sonar Pro	$3.00/M	$15.00/M	$6.00/M	200,000
Llama 3.1 Sonar Huge 128k Online Deprecated	$5.00/M	$5.00/M	$5.00/M	127,072
Pplx 70B Online	—	$2.80/M	—	4,096
Pplx 7B Online	—	$0.280/M	—	4,096
Sonar Medium Online	—	$1.80/M	—	12,000
Sonar Small Online	—	$0.280/M	—	12,000

responses 18

Model ↕	Input / 1M ↕	Output / 1M ↕	Blended ↕	Context ↕	Caps
Anthropic Claude Haiku 4.5	—	—	—	—	tools
Anthropic Claude Opus 4.5	—	—	—	—	tools
Anthropic Claude Opus 4.6	—	—	—	—	tools
Anthropic Claude Opus 4.7	—	—	—	—	tools
Anthropic Claude Sonnet 4.5	—	—	—	—	tools
Google Gemini 2.5 Flash	—	—	—	—	tools
Google Gemini 2.5 Pro	—	—	—	—	tools
Google Gemini 3 Flash preview	—	—	—	—	tools
Google Gemini 3 Pro preview	—	—	—	—	tools
OpenAI GPT 5 mini	—	—	—	—	tools
OpenAI GPT 5.1	—	—	—	—	tools
OpenAI GPT 5.2	—	—	—	—	tools · reasoning
Perplexity Sonar	—	—	—	—	tools
Preset Advanced Deep Research	—	—	—	—	tools
Preset Deep Research	—	—	—	—	tools
Preset Fast Search	—	—	—	—	tools
Preset Pro Search	—	—	—	—	tools
Xai Grok 4.1 Fast Non Reasoning	—	—	—	—	tools

embedding 2

Model ↕	Input / 1M ↕	Output / 1M ↕	Blended ↕	Context ↕	Caps
Pplx Embed V1.0 6B	$0.004000/M	—	—	32,768
Pplx Embed V1.4b	$0.0300/M	—	—	32,768

search 1

Model ↕	Input / 1M ↕	Output / 1M ↕	Blended ↕	Context ↕	Caps
Search	—	—	—	—

FAQ

How many Perplexity models are there?

46 Perplexity models are listed across 4 modalities on this page. 28 have public per-token pricing.

How is Perplexity pricing verified?

Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.

Which Perplexity model is cheapest?

Input pricing on Perplexity starts at $0.004000 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.

Can I route to Perplexity via an OpenAI-compatible API?

Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a Perplexity target, and call Perplexity models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.

Route any Perplexity model via Agent Command Center →

OpenAI-compatible endpoint. Caching, fallback, guardrails, observability. Free for 100K requests/month.