Perplexity models & pricing
Perplexity hosts 46 models (28 with public pricing) covering 4 modalities. Sonar models with grounded web search and citation built-in. Cheapest input starts at $0.004000/M tokens; the most premium goes up to $5.00/M. Use Future AGI's Agent Command Center to route any Perplexity model with cost-optimized fallback and unified observability.
chat 25
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Mistral 7B Instruct | $0.0700/M | $0.280/M | 4,096 | |
| Mixtral 8×7B Instruct | $0.0700/M | $0.280/M | 4,096 | |
| Pplx 7B Chat | $0.0700/M | $0.280/M | 8,192 | |
| Sonar Small Chat | $0.0700/M | $0.280/M | 16,384 | |
| Llama 3.1 8B Instruct | $0.200/M | $0.200/M | 131,072 | |
| Llama 3.1 Sonar Small 128k Chat Deprecated | $0.200/M | $0.200/M | 131,072 | |
| Llama 3.1 Sonar Small 128k Online Deprecated | $0.200/M | $0.200/M | 127,072 | |
| Codellama 34B Instruct | $0.350/M | $1.40/M | 16,384 | |
| Sonar Medium Chat | $0.600/M | $1.80/M | 16,384 | |
| Codellama 70B Instruct | $0.700/M | $2.80/M | 16,384 | |
| Llama 2.70B Chat | $0.700/M | $2.80/M | 4,096 | |
| Pplx 70B Chat | $0.700/M | $2.80/M | 4,096 | |
| Llama 3.1 70B Instruct | $1.00/M | $1.00/M | 131,072 | |
| Llama 3.1 Sonar Large 128k Chat Deprecated | $1.00/M | $1.00/M | 131,072 | |
| Llama 3.1 Sonar Large 128k Online Deprecated | $1.00/M | $1.00/M | 127,072 | |
| Sonar | $1.00/M | $1.00/M | 128,000 | |
| Sonar Reasoning | $1.00/M | $5.00/M | 128,000 | reasoning |
| Sonar Deep Research | $2.00/M | $8.00/M | 128,000 | reasoning |
| Sonar Reasoning Pro | $2.00/M | $8.00/M | 128,000 | reasoning |
| Sonar Pro | $3.00/M | $15.00/M | 200,000 | |
| Llama 3.1 Sonar Huge 128k Online Deprecated | $5.00/M | $5.00/M | 127,072 | |
| Pplx 70B Online | — | $2.80/M | 4,096 | |
| Pplx 7B Online | — | $0.280/M | 4,096 | |
| Sonar Medium Online | — | $1.80/M | 12,000 | |
| Sonar Small Online | — | $0.280/M | 12,000 |
responses 18
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Anthropic Claude Haiku 4.5 | — | — | — | tools |
| Anthropic Claude Opus 4.5 | — | — | — | tools |
| Anthropic Claude Opus 4.6 | — | — | — | tools |
| Anthropic Claude Opus 4.7 | — | — | — | tools |
| Anthropic Claude Sonnet 4.5 | — | — | — | tools |
| Google Gemini 2.5 Flash | — | — | — | tools |
| Google Gemini 2.5 Pro | — | — | — | tools |
| Google Gemini 3 Flash preview | — | — | — | tools |
| Google Gemini 3 Pro preview | — | — | — | tools |
| OpenAI GPT 5 mini | — | — | — | tools |
| OpenAI GPT 5.1 | — | — | — | tools |
| OpenAI GPT 5.2 | — | — | — | tools · reasoning |
| Perplexity Sonar | — | — | — | tools |
| Preset Advanced Deep Research | — | — | — | tools |
| Preset Deep Research | — | — | — | tools |
| Preset Fast Search | — | — | — | tools |
| Preset Pro Search | — | — | — | tools |
| Xai Grok 4.1 Fast Non Reasoning | — | — | — | tools |
embedding 2
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Pplx Embed V1.0 6B | $0.004000/M | — | 32,768 | |
| Pplx Embed V1.4b | $0.0300/M | — | 32,768 |
search 1
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Search | — | — | — |
FAQ
How many Perplexity models are there?
46 Perplexity models are listed across 4 modalities on this page. 28 have public per-token pricing.
How is Perplexity pricing verified?
Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.
Which Perplexity model is cheapest?
Input pricing on Perplexity starts at $0.004000 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.
Can I route to Perplexity via an OpenAI-compatible API?
Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a Perplexity target, and call Perplexity models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.