Lambda models & pricing
Lambda hosts 20 models (20 with public pricing) covering 1 modalities. Lambda Inference API — Llama, DeepSeek, Qwen on dedicated GPU clusters. Cheapest input starts at $0.0150/M tokens; the most premium goes up to $0.800/M. Use Future AGI's Agent Command Center to route any Lambda model with cost-optimized fallback and unified observability.
chat 20
| Model ↕ | Input / 1M ↕ | Output / 1M ↕ | Context ↕ | Caps |
|---|---|---|---|---|
| Llama3.2 11B Vision Instruct | $0.0150/M | $0.0250/M | 131,072 | tools · vision |
| Llama3.2 3B Instruct | $0.0150/M | $0.0250/M | 131,072 | tools |
| Hermes3.8b | $0.0250/M | $0.0400/M | 131,072 | tools |
| Lfm 7B | $0.0250/M | $0.0400/M | 131,072 | tools |
| Llama3.1 8B Instruct | $0.0250/M | $0.0400/M | 131,072 | tools |
| Llama 4 Maverick 17B 128e Instruct Fp8 | $0.0500/M | $0.1000/M | 131,072 | tools |
| Llama 4 Scout 17B 16e Instruct | $0.0500/M | $0.1000/M | 16,384 | tools |
| Qwen25 Coder 32B Instruct | $0.0500/M | $0.1000/M | 131,072 | tools |
| Qwen3.32b Fp8 | $0.0500/M | $0.1000/M | 131,072 | tools · reasoning |
| Lfm 40B | $0.1000/M | $0.200/M | 131,072 | tools |
| Hermes3.70b | $0.120/M | $0.300/M | 131,072 | tools |
| Llama3.1 70B Instruct Fp8 | $0.120/M | $0.300/M | 131,072 | tools |
| Llama3.1 Nemotron 70B Instruct Fp8 | $0.120/M | $0.300/M | 131,072 | tools |
| Llama3.3 70B Instruct Fp8 | $0.120/M | $0.300/M | 131,072 | tools |
| DeepSeek Llama3.3 70B | $0.200/M | $0.600/M | 131,072 | tools · reasoning |
| DeepSeek R1.0528 | $0.200/M | $0.600/M | 131,072 | tools · reasoning |
| DeepSeek v3.0324 | $0.200/M | $0.600/M | 131,072 | tools |
| DeepSeek R1.671b | $0.800/M | $0.800/M | 131,072 | tools · reasoning |
| Hermes3.405b | $0.800/M | $0.800/M | 131,072 | tools |
| Llama3.1 405B Instruct Fp8 | $0.800/M | $0.800/M | 131,072 | tools |
FAQ
How many Lambda models are there?
20 Lambda models are listed across 1 modality on this page. 20 have public per-token pricing.
How is Lambda pricing verified?
Pricing is aggregated from BerriAI/litellm, models.dev, and OpenRouter and refreshed weekly. Each row shows a per-model "verified" date. If a price is wrong, click the row to open the model page and use the inline "suggest edit" link — submissions go into a public review queue.
Which Lambda model is cheapest?
Input pricing on Lambda starts at $0.0150 per 1M tokens. Sort the table by price (or use the in-page filter at the top) to find the cheapest model that matches your capability requirements.
Can I route to Lambda via an OpenAI-compatible API?
Yes — point your OpenAI client at Future AGI's Agent Command Center, configure a Lambda target, and call Lambda models with the standard /v1/chat/completions surface. The same gateway can route to other providers as fallback. Free for the first 100K requests/month.