Perplexity models & pricing

Perplexity hosts 46 models (28 with public pricing) covering 4 modalities. Sonar models with grounded web search and citation built-in. Cheapest input starts at $0.004000/M tokens; the most premium goes up to $5.00/M. Use Future AGI's Agent Command Center to route any Perplexity model with cost-optimized fallback and unified observability.

Homepage ↗ Docs ↗
46

chat 25

Model Input / 1M Output / 1M Context Caps
Mistral 7B Instruct $0.0700/M $0.280/M 4,096
Mixtral 8×7B Instruct $0.0700/M $0.280/M 4,096
Pplx 7B Chat $0.0700/M $0.280/M 8,192
Sonar Small Chat $0.0700/M $0.280/M 16,384
Llama 3.1 8B Instruct $0.200/M $0.200/M 131,072
Llama 3.1 Sonar Small 128k Chat Deprecated $0.200/M $0.200/M 131,072
Llama 3.1 Sonar Small 128k Online Deprecated $0.200/M $0.200/M 127,072
Codellama 34B Instruct $0.350/M $1.40/M 16,384
Sonar Medium Chat $0.600/M $1.80/M 16,384
Codellama 70B Instruct $0.700/M $2.80/M 16,384
Llama 2.70B Chat $0.700/M $2.80/M 4,096
Pplx 70B Chat $0.700/M $2.80/M 4,096
Llama 3.1 70B Instruct $1.00/M $1.00/M 131,072
Llama 3.1 Sonar Large 128k Chat Deprecated $1.00/M $1.00/M 131,072
Llama 3.1 Sonar Large 128k Online Deprecated $1.00/M $1.00/M 127,072
Sonar $1.00/M $1.00/M 128,000
Sonar Reasoning $1.00/M $5.00/M 128,000 reasoning
Sonar Deep Research $2.00/M $8.00/M 128,000 reasoning
Sonar Reasoning Pro $2.00/M $8.00/M 128,000 reasoning
Sonar Pro $3.00/M $15.00/M 200,000
Llama 3.1 Sonar Huge 128k Online Deprecated $5.00/M $5.00/M 127,072
Pplx 70B Online $2.80/M 4,096
Pplx 7B Online $0.280/M 4,096
Sonar Medium Online $1.80/M 12,000
Sonar Small Online $0.280/M 12,000

responses 18

Model Input / 1M Output / 1M Context Caps
Anthropic Claude Haiku 4.5 tools
Anthropic Claude Opus 4.5 tools
Anthropic Claude Opus 4.6 tools
Anthropic Claude Opus 4.7 tools
Anthropic Claude Sonnet 4.5 tools
Google Gemini 2.5 Flash tools
Google Gemini 2.5 Pro tools
Google Gemini 3 Flash preview tools
Google Gemini 3 Pro preview tools
OpenAI GPT 5 mini tools
OpenAI GPT 5.1 tools
OpenAI GPT 5.2 tools · reasoning
Perplexity Sonar tools
Preset Advanced Deep Research tools
Preset Deep Research tools
Preset Fast Search tools
Preset Pro Search tools
Xai Grok 4.1 Fast Non Reasoning tools

embedding 2

Model Input / 1M Output / 1M Context Caps
Pplx Embed V1.0 6B $0.004000/M 32,768
Pplx Embed V1.4b $0.0300/M 32,768

FAQ

How many Perplexity models are there?

46 Perplexity models are listed across 4 modalities on this page. 28 have public per-token pricing.

How is Perplexity pricing verified?

Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.

Which Perplexity model is cheapest?

Input pricing on Perplexity starts at $0.004000 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.

Can I route to Perplexity via an OpenAI-compatible API?

Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a Perplexity target, and call Perplexity models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.

Route any Perplexity model via Agent Command Center →
OpenAI-compatible endpoint. Caching, fallback, guardrails, observability. Free for 100K requests/month.