Novita AI models & pricing

Novita AI hosts 85 models (85 with public pricing) covering 3 modalities. Pay-as-you-go GPU inference for Llama, Qwen, FLUX, and image/video models. Cheapest input starts at $0.0100/M tokens; the most premium goes up to $2.11/M. Use Future AGI's Agent Command Center to route any Novita AI model with cost-optimized fallback and unified observability.

Homepage ↗ Docs ↗

chat 80

Model ↕	Input / 1M ↕	Output / 1M ↕	Blended ↕	Context ↕	Caps
Meta Llama Llama 3.1 8B Instruct	$0.0200/M	$0.0500/M	$0.0275/M	16,384
Paddlepaddle Paddleocr VL	$0.0200/M	$0.0200/M	$0.0200/M	16,384	vision
DeepSeek DeepSeek OCR	$0.0300/M	$0.0300/M	$0.0300/M	8,192	vision
Meta Llama Llama 3.2 3B Instruct	$0.0300/M	$0.0500/M	$0.0350/M	32,768	tools
Qwen Qwen3.4b Fp8	$0.0300/M	$0.0300/M	$0.0300/M	128,000	reasoning
Qwen Qwen3.8b Fp8	$0.0350/M	$0.138/M	$0.0608/M	128,000	reasoning
Zai Org Autoglm Phone 9B Multilingual	$0.0350/M	$0.138/M	$0.0608/M	65,536	vision
Meta Llama Llama 3.8B Instruct	$0.0400/M	$0.0400/M	$0.0400/M	8,192
Mistralai Mistral Nemo	$0.0400/M	$0.170/M	$0.0725/M	60,288
OpenAI GPT Oss 20B	$0.0400/M	$0.150/M	$0.0675/M	131,072	vision · reasoning
Google Gemma 3.12B It	$0.0500/M	$0.1000/M	$0.0625/M	131,072	vision
OpenAI GPT Oss 120B	$0.0500/M	$0.250/M	$0.100/M	131,072	tools · vision · reasoning
Sao10k L3.8b Lunaris	$0.0500/M	$0.0500/M	$0.0500/M	8,192
Sao10k L3.8b Stheno v3.2	$0.0500/M	$0.0500/M	$0.0500/M	8,192	tools
DeepSeek DeepSeek R1.0528 Qwen3.8b	$0.0600/M	$0.0900/M	$0.0675/M	128,000	reasoning
Baichuan Baichuan M2.32b	$0.0700/M	$0.0700/M	$0.0700/M	131,072
Baidu Ernie 4.5 21B A3b	$0.0700/M	$0.280/M	$0.123/M	120,000	tools
Baidu Ernie 4.5 21B A3b Thinking	$0.0700/M	$0.280/M	$0.123/M	131,072	reasoning
Qwen Qwen2.5 7B Instruct	$0.0700/M	$0.0700/M	$0.0700/M	32,000	tools
Qwen Qwen3 Coder 30B A3b Instruct	$0.0700/M	$0.270/M	$0.120/M	160,000	tools
Qwen Qwen3 VL 8B Instruct	$0.0800/M	$0.500/M	$0.185/M	131,072	tools · vision
Gryphe Mythomax L2.13b	$0.0900/M	$0.0900/M	$0.0900/M	4,096
Qwen Qwen3.235b A22b Instruct 2507	$0.0900/M	$0.580/M	$0.213/M	131,072	tools
Qwen Qwen3.30b A3b Fp8	$0.0900/M	$0.450/M	$0.180/M	40,960	reasoning
Qwen Qwen3.32b Fp8	$0.1000/M	$0.450/M	$0.188/M	40,960	reasoning
Xiaomimimo Mimo V2 Flash	$0.1000/M	$0.300/M	$0.150/M	262,144	tools · reasoning
Google Gemma 3.27B It	$0.119/M	$0.200/M	$0.139/M	98,304	vision
Zai Org Glm 4.5 Air	$0.130/M	$0.850/M	$0.310/M	131,072	tools · reasoning
Meta Llama Llama 3.3 70B Instruct	$0.135/M	$0.400/M	$0.201/M	131,072	tools
Baidu Ernie 4.5 VL 28B A3b	$0.140/M	$0.560/M	$0.245/M	30,000	tools · vision · reasoning
Nousresearch Hermes 2 Pro Llama 3.8B	$0.140/M	$0.140/M	$0.140/M	8,192
DeepSeek DeepSeek R1 Distill Qwen 14B	$0.150/M	$0.150/M	$0.150/M	32,768	reasoning
Qwen Qwen3 Next 80B A3b Instruct	$0.150/M	$1.50/M	$0.488/M	131,072	tools
Qwen Qwen3 Next 80B A3b Thinking	$0.150/M	$1.50/M	$0.488/M	131,072	tools · reasoning
Meta Llama Llama 4 Scout 17B 16e Instruct	$0.180/M	$0.590/M	$0.283/M	131,072	vision
Qwen Qwen3.235b A22b Fp8	$0.200/M	$0.800/M	$0.350/M	40,960	reasoning
Qwen Qwen3 VL 30B A3b Instruct	$0.200/M	$0.700/M	$0.325/M	131,072	tools · vision
Qwen Qwen3 VL 30B A3b Thinking	$0.200/M	$1.00/M	$0.400/M	131,072	tools · vision
Skywork R1v4 Lite	$0.200/M	$0.600/M	$0.300/M	262,144	vision
Qwen Qwen Mt Plus	$0.250/M	$0.750/M	$0.375/M	16,384
Qwen Qwen3 Omni 30B A3b Instruct	$0.250/M	$0.970/M	$0.430/M	65,536	tools · vision · audio
Qwen Qwen3 Omni 30B A3b Thinking	$0.250/M	$0.970/M	$0.430/M	65,536	tools · vision · reasoning · audio
DeepSeek DeepSeek v3.2	$0.269/M	$0.400/M	$0.302/M	163,840	tools · reasoning
DeepSeek DeepSeek v3.0324	$0.270/M	$1.12/M	$0.483/M	163,840	tools
DeepSeek DeepSeek v3.1	$0.270/M	$1.00/M	$0.453/M	131,072	tools · reasoning
DeepSeek DeepSeek V3.1 Terminus	$0.270/M	$1.00/M	$0.453/M	131,072	tools · reasoning
DeepSeek DeepSeek V3.2 exp	$0.270/M	$0.410/M	$0.305/M	163,840	tools · reasoning
Meta Llama Llama 4 Maverick 17B 128e Instruct Fp8	$0.270/M	$0.850/M	$0.415/M	1,048,576	vision
Baidu Ernie 4.5 300B A47b Paddle	$0.280/M	$1.10/M	$0.485/M	123,000
DeepSeek DeepSeek R1 Distill Qwen 32B	$0.300/M	$0.300/M	$0.300/M	64,000	reasoning
Kwaipilot Kat Coder Pro	$0.300/M	$1.20/M	$0.525/M	256,000	tools
Minimax Minimax M2	$0.300/M	$1.20/M	$0.525/M	204,800	tools · reasoning
Minimax Minimax M2.1	$0.300/M	$1.20/M	$0.525/M	204,800	tools
Qwen Qwen3.235b A22b Thinking 2507	$0.300/M	$3.00/M	$0.975/M	131,072	tools · reasoning
Qwen Qwen3 Coder 480B A35b Instruct	$0.300/M	$1.30/M	$0.550/M	262,144	tools
Qwen Qwen3 VL 235B A22b Instruct	$0.300/M	$1.50/M	$0.600/M	131,072	tools · vision
Zai Org Glm 4.6v	$0.300/M	$0.900/M	$0.450/M	131,072	tools · vision · reasoning
Qwen Qwen 2.5 72B Instruct	$0.380/M	$0.400/M	$0.385/M	32,000	tools
Baidu Ernie 4.5 VL 28B A3b Thinking	$0.390/M	$0.390/M	$0.390/M	131,072	tools · vision · reasoning
DeepSeek DeepSeek V3 Turbo	$0.400/M	$1.30/M	$0.625/M	64,000	tools
Baidu Ernie 4.5 VL 424B A47b	$0.420/M	$1.25/M	$0.628/M	123,000	vision · reasoning
Meta Llama Llama 3.70B Instruct	$0.510/M	$0.740/M	$0.568/M	8,192
Minimaxai Minimax M1.80k	$0.550/M	$2.20/M	$0.963/M	1,000,000	tools · reasoning
Zai Org Glm 4.6	$0.550/M	$2.20/M	$0.963/M	204,800	tools · reasoning
Moonshotai Kimi K2 Instruct	$0.570/M	$2.30/M	$1.00/M	131,072	tools
Moonshotai Kimi K2.0905	$0.600/M	$2.50/M	$1.08/M	262,144	tools
Moonshotai Kimi K2 Thinking	$0.600/M	$2.50/M	$1.08/M	262,144	tools · reasoning
Zai Org Glm 4.5	$0.600/M	$2.20/M	$1.00/M	131,072	tools · reasoning
Zai Org Glm 4.5v	$0.600/M	$1.80/M	$0.900/M	65,536	tools · vision · reasoning
Zai Org Glm 4.7	$0.600/M	$2.20/M	$1.00/M	204,800	tools · reasoning
Microsoft Wizardlm 2.8x22b	$0.620/M	$0.620/M	$0.620/M	65,535
DeepSeek DeepSeek Prover V2.671b	$0.700/M	$2.50/M	$1.15/M	160,000
DeepSeek DeepSeek R1.0528	$0.700/M	$2.50/M	$1.15/M	163,840	tools · reasoning
DeepSeek DeepSeek R1 Turbo	$0.700/M	$2.50/M	$1.15/M	64,000	tools · reasoning
DeepSeek DeepSeek R1 Distill Llama 70B	$0.800/M	$0.800/M	$0.800/M	8,192	reasoning
Qwen Qwen2.5 VL 72B Instruct	$0.800/M	$0.800/M	$0.800/M	32,768	vision
Qwen Qwen3 VL 235B A22b Thinking	$0.980/M	$3.95/M	$1.72/M	131,072	vision · reasoning
Sao10k L3.70b Euryale v2.1	$1.48/M	$1.48/M	$1.48/M	8,192	tools
Sao10k L31.70b Euryale v2.2	$1.48/M	$1.48/M	$1.48/M	8,192	tools
Qwen Qwen3 Max	$2.11/M	$8.45/M	$3.70/M	262,144	tools

embedding 3

Model ↕	Input / 1M ↕	Output / 1M ↕	Blended ↕	Context ↕
Baai Bge M3	$0.0100/M	$0.0100/M	$0.0100/M	8,192
Qwen Qwen3 Embedding 0.6B	$0.0700/M	—	—	32,768
Qwen Qwen3 Embedding 8B	$0.0700/M	—	—	32,768

rerank 2

Model ↕	Input / 1M ↕	Output / 1M ↕	Blended ↕	Context ↕	Caps
Baai Bge Reranker V2 M3	$0.0100/M	$0.0100/M	$0.0100/M	8,000
Qwen Qwen3 Reranker 8B	$0.0500/M	$0.0500/M	$0.0500/M	32,768

FAQ

How many Novita AI models are there?

85 Novita AI models are listed across 3 modalities on this page. 85 have public per-token pricing.

How is Novita AI pricing verified?

Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.

Which Novita AI model is cheapest?

Input pricing on Novita AI starts at $0.0100 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.

Can I route to Novita AI via an OpenAI-compatible API?

Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a Novita AI target, and call Novita AI models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.

Route any Novita AI model via Agent Command Center →

OpenAI-compatible endpoint. Caching, fallback, guardrails, observability. Free for 100K requests/month.