Nebius models & pricing
Nebius hosts 30 models (30 with public pricing) covering 2 modalities. European GPU cloud with Inference Studio for Llama, Qwen, DeepSeek, FLUX. Cheapest input starts at $0.0100/M tokens; the most premium goes up to $1.00/M. Use Future AGI's Agent Command Center to route any Nebius model with cost-optimized fallback and unified observability.
chat 27
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Qwen Qwen2.5 Coder 7B | $0.0100/M | $0.0300/M | 32,768 | tools |
| Meta Llama Llama Guard 3.8B | $0.0200/M | $0.0600/M | 128,000 | |
| Meta Llama Meta Llama 3.1 8B Instruct | $0.0200/M | $0.0600/M | 128,000 | tools |
| Qwen Qwen2 VL 7B Instruct | $0.0200/M | $0.0600/M | 131,072 | vision |
| Mistralai Mistral Nemo Instruct 2407 | $0.0400/M | $0.120/M | 128,000 | tools |
| Google Gemma 3.27B It | $0.0600/M | $0.200/M | 128,000 | tools · vision |
| Qwen Qwen2.5 32B Instruct | $0.0600/M | $0.200/M | 128,000 | tools |
| Qwen Qwen3.14b | $0.0800/M | $0.240/M | 32,768 | tools |
| Qwen Qwen3.4b | $0.0800/M | $0.240/M | 32,768 | tools |
| Nvidia Llama 3.3 Nemotron Super 49B v1 | $0.1000/M | $0.400/M | 131,072 | tools |
| Qwen Qwen3.30b A3b | $0.1000/M | $0.300/M | 32,768 | tools |
| Qwen Qwen3.32b | $0.1000/M | $0.300/M | 32,768 | tools |
| Meta Llama Llama 3.3 70B Instruct | $0.130/M | $0.400/M | 128,000 | tools |
| Meta Llama Meta Llama 3.1 70B Instruct | $0.130/M | $0.400/M | 128,000 | tools |
| Qwen Qwen2.5 72B Instruct | $0.130/M | $0.400/M | 128,000 | tools |
| Qwen Qwen2.5 VL 72B Instruct | $0.130/M | $0.400/M | 131,072 | tools · vision |
| Qwen Qwen2 VL 72B Instruct | $0.130/M | $0.400/M | 131,072 | tools · vision |
| Qwen Qwq 32B | $0.150/M | $0.450/M | 32,768 | tools · reasoning |
| Qwen Qwen3.235b A22b | $0.200/M | $0.600/M | 262,144 | tools |
| DeepSeek AI DeepSeek R1 Distill Llama 70B | $0.250/M | $0.750/M | 128,000 | tools |
| DeepSeek AI DeepSeek v3 | $0.500/M | $1.50/M | 128,000 | tools |
| DeepSeek AI DeepSeek v3.0324 | $0.500/M | $1.50/M | 128,000 | tools |
| Nvidia Llama 3.1 Nemotron Ultra 253B v1 | $0.600/M | $1.80/M | 128,000 | tools |
| DeepSeek AI DeepSeek R1 | $0.800/M | $2.40/M | 128,000 | tools · reasoning |
| DeepSeek AI DeepSeek R1.0528 | $0.800/M | $2.40/M | 164,000 | tools · reasoning |
| Meta Llama Meta Llama 3.1 405B Instruct | $1.00/M | $3.00/M | 128,000 | tools |
| Nousresearch Hermes 3 Llama 3.1 405B | $1.00/M | $3.00/M | 128,000 | tools |
embedding 3
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Baai Bge En Icl | $0.0100/M | — | 32,768 | |
| Baai Bge Multilingual Gemma2 | $0.0100/M | — | 8,192 | |
| Intfloat E5 Mistral 7B Instruct | $0.0100/M | — | 32,768 |
FAQ
How many Nebius models are there?
30 Nebius models are listed across 2 modalities on this page. 30 have public per-token pricing.
How is Nebius pricing verified?
Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.
Which Nebius model is cheapest?
Input pricing on Nebius starts at $0.0100 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.
Can I route to Nebius via an OpenAI-compatible API?
Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a Nebius target, and call Nebius models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.