Anyscale models & pricing

Anyscale hosts 12 models (12 with public pricing) covering 1 modalities. Ray-powered serverless inference for Llama, Mistral, and Qwen. Cheapest input starts at $0.150/M tokens; the most premium goes up to $1.00/M. Use Future AGI's Agent Command Center to route any Anyscale model with cost-optimized fallback and unified observability.

Homepage ↗ Docs ↗

chat 12

Model Input / 1M Output / 1M Context Caps
Google Gemma 7B It $0.150/M $0.150/M 8,192
Huggingfaceh4 Zephyr 7B beta $0.150/M $0.150/M 16,384
Meta Llama Llama 2.7B Chat Hf $0.150/M $0.150/M 4,096
Meta Llama Meta Llama 3.8B Instruct $0.150/M $0.150/M 8,192
Mistralai Mistral 7B Instruct v0.1 $0.150/M $0.150/M 16,384 tools
Mistralai Mixtral 8×7B Instruct v0.1 $0.150/M $0.150/M 16,384 tools
Meta Llama Llama 2.13B Chat Hf $0.250/M $0.250/M 4,096
Mistralai Mixtral 8×22B Instruct v0.1 $0.900/M $0.900/M 65,536 tools
Codellama Codellama 34B Instruct Hf $1.00/M $1.00/M 4,096
Codellama Codellama 70B Instruct Hf $1.00/M $1.00/M 4,096
Meta Llama Llama 2.70B Chat Hf $1.00/M $1.00/M 4,096
Meta Llama Meta Llama 3.70B Instruct $1.00/M $1.00/M 8,192

FAQ

How many Anyscale models are there?

12 Anyscale models are listed across 1 modality on this page. 12 have public per-token pricing.

How is Anyscale pricing verified?

Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.

Which Anyscale model is cheapest?

Input pricing on Anyscale starts at $0.150 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.

Can I route to Anyscale via an OpenAI-compatible API?

Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a Anyscale target, and call Anyscale models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.

Route any Anyscale model via Agent Command Center →
OpenAI-compatible endpoint. Caching, fallback, guardrails, observability. Free for 100K requests/month.