Nebius models & pricing

Nebius hosts 30 models (30 with public pricing) covering 2 modalities. European GPU cloud with Inference Studio for Llama, Qwen, DeepSeek, FLUX. Cheapest input starts at $0.0100/M tokens; the most premium goes up to $1.00/M. Use Future AGI's Agent Command Center to route any Nebius model with cost-optimized fallback and unified observability.

Homepage ↗ Docs ↗
30

chat 27

Model Input / 1M Output / 1M Context Caps
Qwen Qwen2.5 Coder 7B $0.0100/M $0.0300/M 32,768 tools
Meta Llama Llama Guard 3.8B $0.0200/M $0.0600/M 128,000
Meta Llama Meta Llama 3.1 8B Instruct $0.0200/M $0.0600/M 128,000 tools
Qwen Qwen2 VL 7B Instruct $0.0200/M $0.0600/M 131,072 vision
Mistralai Mistral Nemo Instruct 2407 $0.0400/M $0.120/M 128,000 tools
Google Gemma 3.27B It $0.0600/M $0.200/M 128,000 tools · vision
Qwen Qwen2.5 32B Instruct $0.0600/M $0.200/M 128,000 tools
Qwen Qwen3.14b $0.0800/M $0.240/M 32,768 tools
Qwen Qwen3.4b $0.0800/M $0.240/M 32,768 tools
Nvidia Llama 3.3 Nemotron Super 49B v1 $0.1000/M $0.400/M 131,072 tools
Qwen Qwen3.30b A3b $0.1000/M $0.300/M 32,768 tools
Qwen Qwen3.32b $0.1000/M $0.300/M 32,768 tools
Meta Llama Llama 3.3 70B Instruct $0.130/M $0.400/M 128,000 tools
Meta Llama Meta Llama 3.1 70B Instruct $0.130/M $0.400/M 128,000 tools
Qwen Qwen2.5 72B Instruct $0.130/M $0.400/M 128,000 tools
Qwen Qwen2.5 VL 72B Instruct $0.130/M $0.400/M 131,072 tools · vision
Qwen Qwen2 VL 72B Instruct $0.130/M $0.400/M 131,072 tools · vision
Qwen Qwq 32B $0.150/M $0.450/M 32,768 tools · reasoning
Qwen Qwen3.235b A22b $0.200/M $0.600/M 262,144 tools
DeepSeek AI DeepSeek R1 Distill Llama 70B $0.250/M $0.750/M 128,000 tools
DeepSeek AI DeepSeek v3 $0.500/M $1.50/M 128,000 tools
DeepSeek AI DeepSeek v3.0324 $0.500/M $1.50/M 128,000 tools
Nvidia Llama 3.1 Nemotron Ultra 253B v1 $0.600/M $1.80/M 128,000 tools
DeepSeek AI DeepSeek R1 $0.800/M $2.40/M 128,000 tools · reasoning
DeepSeek AI DeepSeek R1.0528 $0.800/M $2.40/M 164,000 tools · reasoning
Meta Llama Meta Llama 3.1 405B Instruct $1.00/M $3.00/M 128,000 tools
Nousresearch Hermes 3 Llama 3.1 405B $1.00/M $3.00/M 128,000 tools

embedding 3

Model Input / 1M Output / 1M Context Caps
Baai Bge En Icl $0.0100/M 32,768
Baai Bge Multilingual Gemma2 $0.0100/M 8,192
Intfloat E5 Mistral 7B Instruct $0.0100/M 32,768

FAQ

How many Nebius models are there?

30 Nebius models are listed across 2 modalities on this page. 30 have public per-token pricing.

How is Nebius pricing verified?

Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.

Which Nebius model is cheapest?

Input pricing on Nebius starts at $0.0100 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.

Can I route to Nebius via an OpenAI-compatible API?

Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a Nebius target, and call Nebius models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.

Route any Nebius model via Agent Command Center →
OpenAI-compatible endpoint. Caching, fallback, guardrails, observability. Free for 100K requests/month.