Anyscale models & pricing
Anyscale hosts 12 models (all 12 with public pricing) covering one modality: chat. It offers Ray-powered serverless inference for Llama, Mistral, and Qwen. Input pricing starts at $0.150/M tokens and tops out at $1.00/M. Use Future AGI's Agent Command Center to route any Anyscale model with cost-optimized fallback and unified observability.
Chat (12 models)
| Model | Input / 1M | Output / 1M | Context | Caps |
|---|---|---|---|---|
| Google Gemma 7B IT | $0.150/M | $0.150/M | 8,192 | |
| HuggingFaceH4 Zephyr 7B Beta | $0.150/M | $0.150/M | 16,384 | |
| Meta Llama 2 7B Chat HF | $0.150/M | $0.150/M | 4,096 | |
| Meta Llama 3 8B Instruct | $0.150/M | $0.150/M | 8,192 | |
| Mistral AI Mistral 7B Instruct v0.1 | $0.150/M | $0.150/M | 16,384 | tools |
| Mistral AI Mixtral 8×7B Instruct v0.1 | $0.150/M | $0.150/M | 16,384 | tools |
| Meta Llama 2 13B Chat HF | $0.250/M | $0.250/M | 4,096 | |
| Mistral AI Mixtral 8×22B Instruct v0.1 | $0.900/M | $0.900/M | 65,536 | tools |
| CodeLlama 34B Instruct HF | $1.00/M | $1.00/M | 4,096 | |
| CodeLlama 70B Instruct HF | $1.00/M | $1.00/M | 4,096 | |
| Meta Llama 2 70B Chat HF | $1.00/M | $1.00/M | 4,096 | |
| Meta Llama 3 70B Instruct | $1.00/M | $1.00/M | 8,192 | |
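Since input and output rates in the table are quoted per 1M tokens, per-request cost is just tokens ÷ 1,000,000 × rate. A minimal sketch using two rates from the table (the model slugs are illustrative, not Anyscale's exact identifiers):

```python
# Estimate per-request cost from per-1M-token rates in the table above.
PRICES = {  # USD per 1M tokens: (input rate, output rate)
    "meta-llama/Meta-Llama-3-8B-Instruct": (0.15, 0.15),
    "mistralai/Mixtral-8x22B-Instruct-v0.1": (0.90, 0.90),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single request at the table's rates."""
    inp_rate, out_rate = PRICES[model]
    return (input_tokens * inp_rate + output_tokens * out_rate) / 1_000_000

# A 2,000-token prompt with a 500-token completion on Llama 3 8B
# costs roughly $0.000375 at $0.150/M each way.
cost = request_cost("meta-llama/Meta-Llama-3-8B-Instruct", 2_000, 500)
```

Because all listed models charge the same rate for input and output, the math here collapses to total tokens × rate; the split matters on providers that price the two directions differently.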
FAQ
How many Anyscale models are there?
This page lists 12 Anyscale models, all in one modality (chat). All 12 have public per-token pricing.
How is Anyscale pricing verified?
Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.
Which Anyscale model is cheapest?
Input pricing on Anyscale starts at $0.150 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.
Can I route to Anyscale via an OpenAI-compatible API?
Yes. Point your OpenAI client at Future AGI's Agent Command Center, configure an Anyscale target, and call Anyscale models through the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.
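The /v1/chat/completions shape mentioned above can be sketched with just the standard library (no OpenAI SDK needed). The gateway URL, API key, and model slug below are placeholders, not real endpoints:

```python
import json
import urllib.request

GATEWAY = "https://your-gateway.example.com/v1"  # placeholder gateway URL

def chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a standard OpenAI-style /v1/chat/completions request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{GATEWAY}/chat/completions",
        data=body,
        headers={
            "Authorization": "Bearer YOUR_GATEWAY_KEY",  # placeholder key
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = chat_request("meta-llama/Meta-Llama-3-70B-Instruct",
                   "Summarize Ray Serve in one sentence.")
# urllib.request.urlopen(req) would send it to a live gateway.
```

Any OpenAI-compatible client library works the same way: swap its default base URL for the gateway's and pass the provider-prefixed model name.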