Novita AI models & pricing
Novita AI hosts 85 models (85 with public pricing) covering 3 modalities. Pay-as-you-go GPU inference for Llama, Qwen, FLUX, and image/video models. Cheapest input starts at $0.0100/M tokens; the most premium goes up to $2.11/M. Use Future AGI's Agent Command Center to route any Novita AI model with cost-optimized fallback and unified observability.
chat 80
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Meta Llama Llama 3.1 8B Instruct | $0.0200/M | $0.0500/M | 16,384 | |
| Paddlepaddle Paddleocr VL | $0.0200/M | $0.0200/M | 16,384 | vision |
| DeepSeek DeepSeek OCR | $0.0300/M | $0.0300/M | 8,192 | vision |
| Meta Llama Llama 3.2 3B Instruct | $0.0300/M | $0.0500/M | 32,768 | tools |
| Qwen Qwen3.4b Fp8 | $0.0300/M | $0.0300/M | 128,000 | reasoning |
| Qwen Qwen3.8b Fp8 | $0.0350/M | $0.138/M | 128,000 | reasoning |
| Zai Org Autoglm Phone 9B Multilingual | $0.0350/M | $0.138/M | 65,536 | vision |
| Meta Llama Llama 3.8B Instruct | $0.0400/M | $0.0400/M | 8,192 | |
| Mistralai Mistral Nemo | $0.0400/M | $0.170/M | 60,288 | |
| OpenAI GPT Oss 20B | $0.0400/M | $0.150/M | 131,072 | vision · reasoning |
| Google Gemma 3.12B It | $0.0500/M | $0.1000/M | 131,072 | vision |
| OpenAI GPT Oss 120B | $0.0500/M | $0.250/M | 131,072 | tools · vision · reasoning |
| Sao10k L3.8b Lunaris | $0.0500/M | $0.0500/M | 8,192 | |
| Sao10k L3.8b Stheno v3.2 | $0.0500/M | $0.0500/M | 8,192 | tools |
| DeepSeek DeepSeek R1.0528 Qwen3.8b | $0.0600/M | $0.0900/M | 128,000 | reasoning |
| Baichuan Baichuan M2.32b | $0.0700/M | $0.0700/M | 131,072 | |
| Baidu Ernie 4.5 21B A3b | $0.0700/M | $0.280/M | 120,000 | tools |
| Baidu Ernie 4.5 21B A3b Thinking | $0.0700/M | $0.280/M | 131,072 | reasoning |
| Qwen Qwen2.5 7B Instruct | $0.0700/M | $0.0700/M | 32,000 | tools |
| Qwen Qwen3 Coder 30B A3b Instruct | $0.0700/M | $0.270/M | 160,000 | tools |
| Qwen Qwen3 VL 8B Instruct | $0.0800/M | $0.500/M | 131,072 | tools · vision |
| Gryphe Mythomax L2.13b | $0.0900/M | $0.0900/M | 4,096 | |
| Qwen Qwen3.235b A22b Instruct 2507 | $0.0900/M | $0.580/M | 131,072 | tools |
| Qwen Qwen3.30b A3b Fp8 | $0.0900/M | $0.450/M | 40,960 | reasoning |
| Qwen Qwen3.32b Fp8 | $0.1000/M | $0.450/M | 40,960 | reasoning |
| Xiaomimimo Mimo V2 Flash | $0.1000/M | $0.300/M | 262,144 | tools · reasoning |
| Google Gemma 3.27B It | $0.119/M | $0.200/M | 98,304 | vision |
| Zai Org Glm 4.5 Air | $0.130/M | $0.850/M | 131,072 | tools · reasoning |
| Meta Llama Llama 3.3 70B Instruct | $0.135/M | $0.400/M | 131,072 | tools |
| Baidu Ernie 4.5 VL 28B A3b | $0.140/M | $0.560/M | 30,000 | tools · vision · reasoning |
| Nousresearch Hermes 2 Pro Llama 3.8B | $0.140/M | $0.140/M | 8,192 | |
| DeepSeek DeepSeek R1 Distill Qwen 14B | $0.150/M | $0.150/M | 32,768 | reasoning |
| Qwen Qwen3 Next 80B A3b Instruct | $0.150/M | $1.50/M | 131,072 | tools |
| Qwen Qwen3 Next 80B A3b Thinking | $0.150/M | $1.50/M | 131,072 | tools · reasoning |
| Meta Llama Llama 4 Scout 17B 16e Instruct | $0.180/M | $0.590/M | 131,072 | vision |
| Qwen Qwen3.235b A22b Fp8 | $0.200/M | $0.800/M | 40,960 | reasoning |
| Qwen Qwen3 VL 30B A3b Instruct | $0.200/M | $0.700/M | 131,072 | tools · vision |
| Qwen Qwen3 VL 30B A3b Thinking | $0.200/M | $1.00/M | 131,072 | tools · vision |
| Skywork R1v4 Lite | $0.200/M | $0.600/M | 262,144 | vision |
| Qwen Qwen Mt Plus | $0.250/M | $0.750/M | 16,384 | |
| Qwen Qwen3 Omni 30B A3b Instruct | $0.250/M | $0.970/M | 65,536 | tools · vision · audio |
| Qwen Qwen3 Omni 30B A3b Thinking | $0.250/M | $0.970/M | 65,536 | tools · vision · reasoning · audio |
| DeepSeek DeepSeek v3.2 | $0.269/M | $0.400/M | 163,840 | tools · reasoning |
| DeepSeek DeepSeek v3.0324 | $0.270/M | $1.12/M | 163,840 | tools |
| DeepSeek DeepSeek v3.1 | $0.270/M | $1.00/M | 131,072 | tools · reasoning |
| DeepSeek DeepSeek V3.1 Terminus | $0.270/M | $1.00/M | 131,072 | tools · reasoning |
| DeepSeek DeepSeek V3.2 exp | $0.270/M | $0.410/M | 163,840 | tools · reasoning |
| Meta Llama Llama 4 Maverick 17B 128e Instruct Fp8 | $0.270/M | $0.850/M | 1,048,576 | vision |
| Baidu Ernie 4.5 300B A47b Paddle | $0.280/M | $1.10/M | 123,000 | |
| DeepSeek DeepSeek R1 Distill Qwen 32B | $0.300/M | $0.300/M | 64,000 | reasoning |
| Kwaipilot Kat Coder Pro | $0.300/M | $1.20/M | 256,000 | tools |
| Minimax Minimax M2 | $0.300/M | $1.20/M | 204,800 | tools · reasoning |
| Minimax Minimax M2.1 | $0.300/M | $1.20/M | 204,800 | tools |
| Qwen Qwen3.235b A22b Thinking 2507 | $0.300/M | $3.00/M | 131,072 | tools · reasoning |
| Qwen Qwen3 Coder 480B A35b Instruct | $0.300/M | $1.30/M | 262,144 | tools |
| Qwen Qwen3 VL 235B A22b Instruct | $0.300/M | $1.50/M | 131,072 | tools · vision |
| Zai Org Glm 4.6v | $0.300/M | $0.900/M | 131,072 | tools · vision · reasoning |
| Qwen Qwen 2.5 72B Instruct | $0.380/M | $0.400/M | 32,000 | tools |
| Baidu Ernie 4.5 VL 28B A3b Thinking | $0.390/M | $0.390/M | 131,072 | tools · vision · reasoning |
| DeepSeek DeepSeek V3 Turbo | $0.400/M | $1.30/M | 64,000 | tools |
| Baidu Ernie 4.5 VL 424B A47b | $0.420/M | $1.25/M | 123,000 | vision · reasoning |
| Meta Llama Llama 3.70B Instruct | $0.510/M | $0.740/M | 8,192 | |
| Minimaxai Minimax M1.80k | $0.550/M | $2.20/M | 1,000,000 | tools · reasoning |
| Zai Org Glm 4.6 | $0.550/M | $2.20/M | 204,800 | tools · reasoning |
| Moonshotai Kimi K2 Instruct | $0.570/M | $2.30/M | 131,072 | tools |
| Moonshotai Kimi K2.0905 | $0.600/M | $2.50/M | 262,144 | tools |
| Moonshotai Kimi K2 Thinking | $0.600/M | $2.50/M | 262,144 | tools · reasoning |
| Zai Org Glm 4.5 | $0.600/M | $2.20/M | 131,072 | tools · reasoning |
| Zai Org Glm 4.5v | $0.600/M | $1.80/M | 65,536 | tools · vision · reasoning |
| Zai Org Glm 4.7 | $0.600/M | $2.20/M | 204,800 | tools · reasoning |
| Microsoft Wizardlm 2.8x22b | $0.620/M | $0.620/M | 65,535 | |
| DeepSeek DeepSeek Prover V2.671b | $0.700/M | $2.50/M | 160,000 | |
| DeepSeek DeepSeek R1.0528 | $0.700/M | $2.50/M | 163,840 | tools · reasoning |
| DeepSeek DeepSeek R1 Turbo | $0.700/M | $2.50/M | 64,000 | tools · reasoning |
| DeepSeek DeepSeek R1 Distill Llama 70B | $0.800/M | $0.800/M | 8,192 | reasoning |
| Qwen Qwen2.5 VL 72B Instruct | $0.800/M | $0.800/M | 32,768 | vision |
| Qwen Qwen3 VL 235B A22b Thinking | $0.980/M | $3.95/M | 131,072 | vision · reasoning |
| Sao10k L3.70b Euryale v2.1 | $1.48/M | $1.48/M | 8,192 | tools |
| Sao10k L31.70b Euryale v2.2 | $1.48/M | $1.48/M | 8,192 | tools |
| Qwen Qwen3 Max | $2.11/M | $8.45/M | 262,144 | tools |
embedding 3
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Baai Bge M3 | $0.0100/M | $0.0100/M | 8,192 | |
| Qwen Qwen3 Embedding 0.6B | $0.0700/M | — | 32,768 | |
| Qwen Qwen3 Embedding 8B | $0.0700/M | — | 32,768 |
rerank 2
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Baai Bge Reranker V2 M3 | $0.0100/M | $0.0100/M | 8,000 | |
| Qwen Qwen3 Reranker 8B | $0.0500/M | $0.0500/M | 32,768 |
FAQ
How many Novita AI models are there?
85 Novita AI models are listed across 3 modalities on this page. 85 have public per-token pricing.
How is Novita AI pricing verified?
Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.
Which Novita AI model is cheapest?
Input pricing on Novita AI starts at $0.0100 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.
Can I route to Novita AI via an OpenAI-compatible API?
Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a Novita AI target, and call Novita AI models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.