
Best 5 AI Gateways to Manage Cursor Spend Across Teams in 2026

Five AI gateways scored on Cursor team spend in 2026: per-developer chargeback, per-repo budgets, SSO attribution, BYOK virtual keys, and where each gateway falls short.


A 60-engineer platform team on Cursor Business pays $2,400 a month in seat licenses. The actual cost is closer to $24,000 once you count model tokens their staff engineers burn on long-context refactors, and Cursor’s billing portal doesn’t split that usage by developer, team, or repository. By the time the invoice arrives, the cost-allocation conversation with finance is a postmortem.

An AI gateway in front of Cursor fixes the shape. Cursor 2.x ships a “Custom API” mode that lets a workspace point at any OpenAI- or Anthropic-compatible endpoint. Route traffic through a gateway and per-developer attribution, per-repo budgets, and SSO-tagged chargeback become a config concern. Four of the five gateways below stop there. One takes the same trace and feeds it back into a routing optimizer so the cost curve bends without any change to developer behaviour.

This is the 2026 cohort, scored on the seven team-spend axes that matter when Cursor is the workload.


TL;DR

Future AGI Agent Command Center is the strongest pick for managing Cursor team spend because it ships per-developer virtual keys via Cursor’s Custom-API mode, per-repo span attributes, SSO-tagged chargeback rollups, BYOK fan-out that preserves bulk-discount pricing, and Bedrock and Anthropic both reachable behind one OpenAI-compatible base URL. The other four picks below win on specific edges.

  1. Future AGI Agent Command Center — Best overall. Per-developer attribution via Custom-API mode, per-repo budgets, SSO-tagged chargeback, and provider-mixed routing under one base URL.
  2. Portkey — Best for the deepest virtual-key + RBAC story with bulk pricing preserved. Fastest hosted setup (verify the Palo Alto Networks acquisition timeline before signing multi-year).
  3. Helicone — Best for the simplest drop-in proxy when budgets are not yet a hard constraint. Lightweight per-request observability (treat as planned migration after the March 3, 2026 Mintlify acquisition).
  4. LiteLLM — Best when source code cannot leave your network. Self-hosted Python proxy for VPC-locked teams; pin commits after the March 24, 2026 PyPI compromise.
  5. TrueFoundry — Best if you already run an internal AI platform and want multi-team chargeback rollup. Strongest cross-team chargeback shape.

Why Cursor team-spend management is hard

Cursor is the most popular paid AI IDE in the developer toolchain as of mid-2026. Pro at $20 per user covers individuals; Business at $40 per user covers teams and ships SSO plus an admin console. What it doesn’t ship is per-developer chargeback finance can use.

Four properties make the workload hard to monitor without a gateway in front.

  1. Token cost is detached from seat cost. A staff engineer running long-context refactors with Composer burns 40 to 80 million input tokens in a busy week. A junior doing Tab edits might run 4 million. Per-developer cost ratios of 10x to 30x are common.
  2. The admin dashboard groups by user but not by repo. Cursor 2.x shows total tokens per user. Without per-repo attribution, the chargeback story stops at “the platform team is expensive.”
  3. The bulk-pricing trap. Give each developer their own API key for chargeback and you lose bulk-discount tier pricing. Virtual keys that fan out to one underlying provider key fix this.
  4. BYOK and BYOG are table-stakes for enterprise. Cursor’s “Use your own API key” mode and Custom API mode let teams bring their own key and gateway. Security wants this for the audit log; procurement wants it for cost isolation.
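
The virtual-key fan-out from point 3 can be sketched in a few lines. This is a minimal illustration, not any gateway's implementation: `VirtualKeyRegistry`, the `vk-` prefix, and the sample key strings are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
import secrets


@dataclass
class VirtualKeyRegistry:
    """Hypothetical registry: many per-developer virtual keys, one provider key."""

    provider_key: str
    _owners: dict[str, str] = field(default_factory=dict)  # virtual key -> developer

    def issue(self, developer_email: str) -> str:
        # Each developer gets a unique random virtual key for attribution.
        virtual_key = "vk-" + secrets.token_hex(8)
        self._owners[virtual_key] = developer_email
        return virtual_key

    def resolve(self, virtual_key: str) -> tuple[str, str]:
        # Returns (developer, shared provider key); raises KeyError if unknown.
        return self._owners[virtual_key], self.provider_key


registry = VirtualKeyRegistry(provider_key="sk-shared-provider-key")
alice_key = registry.issue("alice@example.com")
bob_key = registry.issue("bob@example.com")
```

Every request is attributed to the developer behind the virtual key, while the provider only ever sees the one shared key, which is what keeps usage pooled under a single discount tier.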

A gateway sits between the Cursor client and the model provider, intercepts each Composer turn, Tab batch, and chat completion, applies metadata (SSO claim, repo URL, feature tag), and forwards the request. Metadata makes cost attributable; the interception point makes budget caps and alerts possible. All five picks support being pointed at via Cursor’s Custom API endpoint as of May 2026.
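
To make the attribution step concrete, here is a rough sketch of how tagged requests turn into per-developer and per-repo cost tables. The `TaggedRequest` shape, the field names, and the per-token prices are assumptions for illustration only; real gateways expose their own schemas and your contract rates will differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedRequest:
    developer: str      # e.g. the email from the SSO claim
    repo: str           # e.g. the git remote URL
    feature: str        # "composer", "tab", or "chat"
    input_tokens: int
    output_tokens: int


# Hypothetical prices in USD per million tokens; substitute your own rates.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}


def request_cost(req: TaggedRequest) -> float:
    """Cost of one request in USD under the assumed price table."""
    return (req.input_tokens * PRICE_PER_MTOK["input"]
            + req.output_tokens * PRICE_PER_MTOK["output"]) / 1_000_000


def rollup(requests, key: str) -> dict[str, float]:
    """Sum cost grouped by one metadata field, e.g. 'developer' or 'repo'."""
    totals: dict[str, float] = defaultdict(float)
    for req in requests:
        totals[getattr(req, key)] += request_cost(req)
    return dict(totals)


log = [
    TaggedRequest("alice@example.com", "git@example.com:org/api.git", "composer", 2_000_000, 100_000),
    TaggedRequest("alice@example.com", "git@example.com:org/web.git", "tab", 500_000, 50_000),
    TaggedRequest("bob@example.com", "git@example.com:org/api.git", "chat", 1_000_000, 200_000),
]
by_developer = rollup(log, "developer")
by_repo = rollup(log, "repo")
```

The same log answers both the finance question (who spent what) and the platform question (which repo is expensive), because the metadata was attached at the interception point rather than reconstructed afterwards.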


The 7 axes we score on

Generic “best AI gateway” axes are too broad for Cursor team spend. We scored each pick on seven axes that specifically affect cross-team Cursor cost management.

| Axis | What it measures |
| --- | --- |
| 1. Per-developer chargeback | Can the gateway tag every request with an SSO email and aggregate cost by developer? |
| 2. Per-repository budgets | Can it cap spend per repo with a soft alert and a hard pause? |
| 3. Team / cost-center rollup | Can a platform admin roll per-developer numbers into team or BU views? |
| 4. SSO + RBAC | Does the gateway support SSO so finance and engineering see different slices? |
| 5. BYOK virtual-key fan-out | Can each developer have a virtual key that maps to one underlying provider key? |
| 6. Composer + tool-call passthrough | Does Cursor’s agent mode survive the gateway hop intact? |
| 7. Feedback loop / optimizer | Does the gateway use traces to improve routing over time, or stop at observation? |

How we picked

We started from gateways shipping an OpenAI-compatible endpoint Cursor’s Custom API can target. We removed gateways that buffer streaming (breaks Tab and Composer UX), two whose Anthropic Messages passthrough broke tool calls in claude-opus-4-7 + Composer testing, and any without per-key metadata pass-through. We tested each remaining gateway against three workloads: a 12-developer fintech team, a 60-developer platform team, and a 4-developer regulated team requiring self-host.
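
One way to screen for buffered streaming is to record when each SSE chunk arrives and check whether nearly all of the request time passed before the first one. The function below is a rough heuristic sketched for illustration, not the test harness used for this article; the `first_chunk_share` cutoff of 0.8 is an assumed value, and a long prefill on a large context can also trip it, so treat a positive result as a prompt to inspect the gateway rather than as proof.

```python
def looks_buffered(chunk_arrivals: list[float], request_start: float,
                   first_chunk_share: float = 0.8) -> bool:
    """Flag a stream whose chunks all arrive in a burst at the end.

    chunk_arrivals: timestamps (seconds) at which each SSE chunk arrived.
    first_chunk_share: assumed cutoff; flag the stream when the wait for the
    first chunk exceeds this share of the total request duration.
    """
    if len(chunk_arrivals) < 2:
        return False  # too few chunks to tell streaming from buffering
    total = chunk_arrivals[-1] - request_start
    if total <= 0:
        return False
    time_to_first = chunk_arrivals[0] - request_start
    return time_to_first / total > first_chunk_share


# Chunks spread across the request: consistent with real streaming.
streamed = looks_buffered([0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1], request_start=0.0)
# Nothing for two seconds, then everything at once: consistent with buffering.
buffered = looks_buffered([2.00, 2.01, 2.02], request_start=0.0)
```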


1. Future AGI Agent Command Center: Best for per-developer Cursor attribution

Verdict: Future AGI ships per-developer virtual keys via Cursor’s Custom-API mode, per-repo and per-team span attributes, SSO-tagged chargeback rollups, BYOK fan-out that preserves bulk-discount pricing, and Bedrock / Anthropic both reachable behind one OpenAI-compatible base URL. Finance gets the per-developer chargeback table directly out of the dashboard; the rest of the cohort either reports per-call cost without SSO tagging or asks the platform team to wire a separate observability sink to recover it.

What it does for Cursor team spend:

  • Per-developer chargeback via fi.attributes.user.id populated from the Cursor SSO claim. Every Composer turn, Tab batch, and inline chat is tagged. Group-by-developer is the default view.
  • Per-repository budgets via span attributes. Wire repo=<git remote url> into the forwarding rule; caps live in the rules engine, soft alert at 80%, hard pause at 110%, per repo per month.
  • Team and cost-center rollup via the project hierarchy. A platform admin rolls a 60-developer org into 4 teams; finance sees one rollup. CSV export and fi.api support BU-level chargeback.
  • SSO + RBAC via Okta, Azure AD, Google SAML.
  • BYOK virtual-key fan-out is default. 60 virtual keys fan out to one underlying provider key, preserving bulk-discount pricing.
  • Composer + tool-call passthrough confirmed with claude-opus-4-7, claude-sonnet-4-6, and gpt-5-codex on Cursor 2.4. The gateway parses tool-use blocks rather than re-serialising them.
  • Feedback loop via fi.opt. traceAI feeds fi.evals; it is OpenInference-native and covers 50+ AI surfaces across Python, TypeScript, Java, and C#, including a Spring Boot starter, Spring AI, LangChain4j, and Semantic Kernel. Error Feed, the clustering and what-to-fix layer of the eval stack that feeds the self-improving evaluators, sits alongside as the zero-config error monitor. It auto-clusters related per-developer and per-repo failures into named issues (50 traces → 1 issue), auto-writes a root cause, a quick fix, and a long-term recommendation per issue, and tracks a rising, steady, or falling trend per issue, so emerging Composer regressions surface like exceptions rather than staying buried in chargeback rows. Low-scoring sessions cluster by failure mode, and the optimizer then rewrites prompts or routing policy. Six optimizers are available: RandomSearchOptimizer; BayesianSearchOptimizer (Optuna-backed, with teacher-inferred few-shot templates and resumable studies); MetaPromptOptimizer; ProTeGi; GEPAOptimizer; and PromptWizardOptimizer. All share an EarlyStoppingConfig (patience, min_delta, threshold, max_evaluations) and the same unified Evaluator over 60+ FAGI rubrics. Typical Cursor output: Tab to claude-haiku-4-5, Composer under 20K to claude-sonnet-4-6, over 30K to claude-opus-4-7.
  • The Future AGI Protect model family runs inline at ~65 ms p50 for text and ~107 ms p50 for image (arXiv 2510.13351). It consists of FAGI’s own fine-tuned Gemma 3n adapters for content moderation, bias detection, security/prompt-injection, and data privacy/PII. It is multi-modal across text, image, and audio, and it is a model family rather than a plugin chain.
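
The per-repo budget rule from the list above (soft alert at 80%, hard pause at 110%) reduces to a small threshold check. The sketch below is illustrative only: `BudgetAction` and `budget_action` are hypothetical names, not part of any gateway's rules engine, and treating the thresholds as inclusive is an assumption.

```python
from enum import Enum


class BudgetAction(Enum):
    ALLOW = "allow"
    SOFT_ALERT = "soft_alert"
    HARD_PAUSE = "hard_pause"


def budget_action(spent_usd: float, monthly_cap_usd: float,
                  soft_ratio: float = 0.80, hard_ratio: float = 1.10) -> BudgetAction:
    """Decide what to do with the next request for one repo in one month."""
    if monthly_cap_usd <= 0:
        raise ValueError("monthly_cap_usd must be positive")
    usage = spent_usd / monthly_cap_usd
    if usage >= hard_ratio:
        return BudgetAction.HARD_PAUSE   # stop forwarding until reset or override
    if usage >= soft_ratio:
        return BudgetAction.SOFT_ALERT   # keep forwarding, notify the repo owner
    return BudgetAction.ALLOW
```

With a $1,000 cap, a repo at $850 triggers the alert, and one at $1,050 is over the cap but still served. Placing the hard pause above 100% leaves headroom so that a refactor in flight is not cut off the moment the cap is reached.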

Where it falls short:

  • agent-opt is opt-in. For one-week pilots with a 6-developer team, start with traceAI + ai-evaluation and turn the optimizer on once eval baselines stabilize. For pure chargeback, the simpler picks below get you there with less surface area.
  • Composer session replay is less polished than Helicone’s per-request inspection.

Pricing: Free tier with 100K traces per month. Scale starts at $99 per month. Enterprise is custom and includes SOC 2 Type II certification, SSO, RBAC, and a BAA. Also listed on AWS Marketplace.

Score: 7/7 axes.


2. Portkey: Best for hosted gateway with mature RBAC

Verdict: Portkey is the most polished hosted-only product for Cursor workloads. If the brief is “give us per-developer keys plus a clean dashboard by next Friday,” Portkey is the fastest path. It observes and routes; it doesn’t optimize. Portkey was acquired by Palo Alto Networks (announced April 30, 2026) and will become the AI Gateway for Prisma AIRS; the close is expected in PANW’s fiscal Q4 2026. Verify standalone continuity before signing a multi-year contract.

What it does for Cursor team spend:

  • Per-developer chargeback via virtual keys. Each developer gets one, all fan out to one provider key, and bulk pricing is preserved.
  • Per-repository budgets via metadata headers. This takes a one-time workspace JSON change; after that the per-repo cap is a config field.
  • Team rollup via the project hierarchy.
  • SSO + RBAC is Portkey’s strongest column, with Okta, Azure AD, Google SAML, Auth0, and the most detailed role matrix on this list.
  • Composer + tool-call passthrough confirmed with claude-opus-4-7 and gpt-5-codex on Cursor 2.4, SSE preserved.

Where it falls short:

  • No optimizer; routing policy updates are manual.
  • The metadata-header model requires a Cursor workspace JSON update. Without it, you get key-level aggregation, not per-repo.
  • Pricing escalates faster than lighter alternatives once you cross 5 million requests per month; a 60-developer Cursor org crosses that inside the first quarter.
  • The PANW acquisition timeline is the elephant in the room.

Pricing: Free tier with 10K requests per day. Scale starts at $99 per month. Enterprise custom with SOC 2 Type II, SSO, BYOC self-host.

Score: 6/7 axes (missing: feedback loop / optimizer).


3. Helicone: Best for lightweight observability

Verdict: Helicone is the right pick when a Cursor team wants per-request observability and per-developer cost numbers, and nothing else. Change the Cursor Custom API base URL, add Helicone-Auth, per-request cost table appears. Helicone was acquired by Mintlify on March 3, 2026 and the roadmap shifted toward a documentation-platform stance; existing users should treat this as a planned migration window.

What it does for Cursor team spend:

  • Per-developer chargeback via Helicone-User-Id. It is wired into the Cursor workspace JSON once, and the per-developer cost table is the headline view.
  • Per-repo budgets via custom properties. Caps are softer than Portkey’s hard-cutoff model; wire your own alerting on the webhook.
  • Team rollup via custom properties, with teams typically exporting to BigQuery or Snowflake for the finance rollup.
  • SSO + RBAC functional on the enterprise tier (Okta, Google SAML).
  • Composer + tool-call passthrough confirmed on Cursor 2.4 with gpt-5-codex and claude-sonnet-4-6.
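
As a rough sketch of the header-based tagging described above, the helper below builds the set of headers a Helicone-proxied request would carry. `Helicone-Auth` and `Helicone-User-Id` come from the text; the `Helicone-Property-<Name>` form for custom properties follows Helicone's documented convention as best understood here, so verify it against the current docs. The function name, property names, and sample values are hypothetical, and how Cursor attaches these headers is not shown.

```python
def helicone_headers(helicone_api_key: str, developer_email: str,
                     repo: str, team: str) -> dict[str, str]:
    """Build attribution headers for one request (illustrative helper)."""
    return {
        # Authenticates the request against the Helicone account.
        "Helicone-Auth": f"Bearer {helicone_api_key}",
        # Drives the per-developer cost table.
        "Helicone-User-Id": developer_email,
        # Custom properties: the names "Repo" and "Team" are our own choice.
        "Helicone-Property-Repo": repo,
        "Helicone-Property-Team": team,
    }


headers = helicone_headers(
    helicone_api_key="sk-helicone-example",
    developer_email="alice@example.com",
    repo="git@example.com:org/api.git",
    team="platform",
)
```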

Where it falls short:

  • No optimizer, no prompt library. Routing is round-robin and failover only; Cursor-specific routing needs a middleware layer.
  • Self-host scales to a few hundred RPS; beyond that the operational story gets heavy. For 200+ developer orgs, plan for the hosted tier.
  • The Mintlify acquisition reshaped the roadmap. Cost-platform features active in 2025 are in maintenance mode.

Pricing: Free tier with 10K requests per month. Pro starts at $25 per month. Enterprise custom.

Score: 5/7 axes (missing: feedback loop, hard-cutoff budgets, mature routing).


4. LiteLLM: Best for self-hosted Python-native routing

Verdict: LiteLLM is the pick when Cursor traffic can’t leave the VPC and security wants to read every line of code that touches a prompt. It is open source (MIT, with the enterprise directory licensed separately), Python-native, and runs as a proxy inside your infra. Pin a commit or upgrade to 1.83.7 or newer after the March 24, 2026 PyPI supply-chain incident on versions 1.82.7 and 1.82.8.

What it does for Cursor team spend:

  • Per-developer chargeback via user_id and team_id on virtual keys, with the SSO claim wired into virtual-key metadata.
  • Per-repo budgets via custom metadata, with hard cutoffs from per-key budgets. Slicing by repo takes a SQL view on the Postgres backend.
  • Team rollup via team_id, less polished than hosted alternatives.
  • SSO + RBAC via LiteLLM Enterprise (SAML, OIDC).
  • Composer + tool-call passthrough confirmed with claude-opus-4-7 on Cursor 2.4 once LiteLLM is 1.83.7 or newer.

Where it falls short:

  • No optimizer; traces flow to a sink, what the sink does is your problem.
  • UI is functional rather than polished; slicing by developer or repo means a SQL dashboard or warehouse export.
  • The March 2026 supply-chain incident is a real trust signal. Pin commits and rotate credentials touched by 1.82.7 or 1.82.8.
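
A deployment check for the version guidance above can be as simple as the sketch below. The version numbers come from this article; the function names and the three status strings are hypothetical, and the parser is deliberately naive (it keeps only the leading numeric segments, so a pre-release such as `1.83.7rc1` is conservatively treated as older than 1.83.7).

```python
COMPROMISED = {(1, 82, 7), (1, 82, 8)}   # versions named in the incident
MINIMUM_SAFE = (1, 83, 7)                # floor recommended in this article


def parse_version(version: str) -> tuple[int, ...]:
    """Keep leading all-digit segments: '1.83.7.post1' -> (1, 83, 7)."""
    parts: list[int] = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts[:3])


def litellm_version_status(version: str) -> str:
    """Return 'compromised', 'upgrade', or 'ok' for a LiteLLM version string."""
    release = parse_version(version)
    if release in COMPROMISED:
        return "compromised"   # rotate any credentials this install touched
    if release < MINIMUM_SAFE:
        return "upgrade"
    return "ok"
```

In a real check the version string would typically come from `importlib.metadata.version("litellm")`, run inside the proxy's own environment as part of a startup or CI gate.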

Pricing: Open source under MIT. LiteLLM Enterprise with SLA, SSO, audit starts around $250 per month for small teams.

Score: 5.5/7 axes (missing: polished dashboard, optimizer).


5. TrueFoundry: Best for multi-team chargeback in an internal AI platform

Verdict: TrueFoundry is the pick when the organisation already runs an internal AI / ML platform and Cursor spend is one of many AI workloads needing cross-team chargeback. It’s the strongest cross-team rollup story on this list. It’s also the least Cursor-specific. Cursor is a workload, not a primary integration target.

What it does for Cursor team spend:

  • Per-developer chargeback via RBAC + cost dashboards. The SSO claim is the primary key, and per-developer rollups are the default.
  • Per-repo budgets via workspaces and projects. Repos map to TrueFoundry projects, and platform admins manage 50 projects without touching Cursor config.
  • Team and cost-center rollup is TrueFoundry’s strongest column; the cross-team view rolls up into BU reporting finance accepts.
  • SSO + RBAC has a detailed role matrix, SAML / OIDC, and an enterprise-grade audit log.
  • Composer + tool-call passthrough supported, but Cursor-specific polish is thinner than Portkey or Future AGI.

Where it falls short:

  • Cursor-specific story is thinner. TrueFoundry is a platform-team product first.
  • Pricing is enterprise-skewed; poor fit for a 6-developer team that just wants per-developer chargeback.
  • No optimizer; trace-level depth for Composer is shallower than Future AGI or Helicone.

Pricing: Enterprise pricing only as of May 2026, scoped per-platform-team. Self-hosted and cloud both supported. SOC 2, SSO, audit log.

Score: 5/7 axes (missing: optimizer, Cursor-specific polish, native session replay).


Capability matrix

| Axis | Future AGI | Portkey | Helicone | LiteLLM | TrueFoundry |
| --- | --- | --- | --- | --- | --- |
| Per-developer chargeback | Native SSO tag | Virtual key | Header | Virtual key | RBAC tag |
| Per-repo budgets | Span attr + cap | Metadata + cap | Custom prop | Metadata + cap | Project + cap |
| Team / cost-center rollup | Native hierarchy | Project | Custom prop | team_id | Native rollup |
| SSO + RBAC | Okta, Azure AD, Google | Strongest single-feature | Functional | Enterprise tier | Strongest cross-team |
| BYOK virtual-key fan-out | Default | Default | Supported | Default | Default |
| Composer + tool-call passthrough | Yes | Yes | Yes | Yes (1.83.7+) | Yes |
| Feedback loop / optimizer | fi.opt | No | No | No | No |
| Self-host posture | BYOC | BYOC | OSS | OSS | Cloud + self-host |
| 2026 trust signal | Apache 2.0, no acquisition | PANW acquisition pending | Mintlify acquisition closed | March 24 PyPI incident | Independent |

No gateway wins every column; the feedback-loop column is where the field separates on the longer-term cost story.


Decision framework: Choose X if

Choose Future AGI Agent Command Center if the brief is bigger than chargeback. Pick when Cursor is a $20K-plus-per-month line item and the team wants the cost curve to bend without changing developer behaviour. The optimizer pays for itself between weeks two and four.

Choose Portkey if the brief is hosted-only, virtual-key-heavy, RBAC-mature, and “ship next Friday.” Pick when a managed dashboard is the right output and you have a plan for the Palo Alto acquisition timeline.

Choose Helicone if the team is under 15 developers and the brief is per-request observability plus a per-developer cost table. Verify the Mintlify roadmap fits your six-month plan.

Choose LiteLLM if security or compliance requires Cursor traffic to never leave the VPC. Pin commits at 1.83.7 or newer, and pair with a trace sink for visualisation.

Choose TrueFoundry if you’re a platform team running an internal AI / ML stack and Cursor is one of many workloads needing cross-team chargeback.


Common mistakes when wiring Cursor through a gateway

| Mistake | What goes wrong | Fix |
| --- | --- | --- |
| Pointing only the IDE settings at the gateway | Tab autocomplete bypasses the gateway; chargeback misses 30 to 40% of traffic | Set the Custom API base URL in workspace JSON; verify Tab requests appear in gateway logs |
| Sharing one team key across developers | All sessions look identical to the dashboard | Issue virtual keys per developer |
| Buffering streaming responses | Tab UX freezes mid-completion; Composer hangs on long turns | Confirm the gateway forwards SSE without buffer-and-batch |
| Tagging with only user_id and not repo | Per-repo budgets are impossible | Inject repo and branch in the workspace JSON as span attributes |
| Setting per-repo budgets too tight | Composer pauses mid-refactor | Soft alert at 80%, hard pause at 110%, with a senior-engineer override |
| Routing every turn to claude-opus-4-7 | Cost spike of 8 to 12x versus the optimal mix | Tab to claude-haiku-4-5 or gpt-5-mini; Composer over 30K to opus |
| Skipping the SSO wire-up | Chargeback groups by key name, not developer identity | Wire the SSO claim into virtual-key metadata at provisioning |
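
The routing fix in the table can be written as a small policy function. This is a sketch under stated assumptions: the article gives thresholds for Composer under 20K and over 30K tokens but does not say what happens in between or for chat, so the `fallback_model` parameter covers both cases and its default is our own choice. The function name and feature labels are hypothetical.

```python
def pick_model(feature: str, context_tokens: int,
               fallback_model: str = "claude-sonnet-4-6") -> str:
    """Choose a model from the feature tag and the context size in tokens."""
    if feature == "tab":
        return "claude-haiku-4-5"          # cheap, latency-sensitive edits
    if feature == "composer":
        if context_tokens > 30_000:
            return "claude-opus-4-7"       # long-context refactors
        if context_tokens < 20_000:
            return "claude-sonnet-4-6"     # short Composer turns
        # 20K to 30K is unspecified in the article; assumption: use the fallback.
        return fallback_model
    # Other features such as chat are also unspecified; assumption as above.
    return fallback_model
```

Keeping the policy as a pure function of request metadata means it can be versioned, tested, and swapped at the gateway without touching any developer's Cursor configuration.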

How Future AGI closes the loop on Cursor team spend

The other four gateways treat team-spend management as an end state: capture the trace, tag with the SSO claim, show the dashboard, alert on budget. Agent Command Center treats the trace as input to a feedback loop. The wedge for Cursor: the optimizer changes routing policy without any change to developer behaviour.

  1. Trace. Every Cursor turn produces a span tree via traceAI (Apache 2.0).
  2. Evaluate. ai-evaluation (Apache 2.0) scores each Composer turn. FAGI ships 60+ EvalTemplate classes in the ai-evaluation SDK (task completion, code-correctness, tool-use accuracy, faithfulness, structured-output, hallucination, agentic surfaces, instruction-following). The Future AGI Platform adds self-improving evaluators that learn from live production traces, unlimited custom evaluators authored end-to-end by an in-product agent that uses tool calling on your code, and FAGI’s proprietary classifier model family at very low cost-per-token (lower per-eval cost than Galileo Luna-2). The catalog is the floor, not the ceiling.
  3. Cluster. Low-scoring sessions cluster by failure mode. The two patterns accounting for most Cursor team-spend waste: “Composer opus called when sonnet would have been enough” and “Tab autocomplete routed to opus by accident.”
  4. Optimize. fi.opt.optimizers (Apache 2.0) runs six optimizers — ProTeGi, GEPA, Bayesian, MetaPrompt, RandomSearch, PromptWizard. Typical Cursor output: Tab to claude-haiku-4-5, Composer under 20K to claude-sonnet-4-6, over 30K to claude-opus-4-7. The optimizer also rewrites over-prompting prompts; typical reduction is 8% to 15% on input tokens per turn before any routing change.
  5. Route + re-deploy. Agent Command Center applies the updated policy on the next request. Versioned; rollouts go 10% → 50% → 100% with automatic rollback if eval scores regress.
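
The staged rollout in step 5 (10% → 50% → 100%, with rollback on regression) can be sketched as follows. This is an illustration of the general technique, not Agent Command Center's implementation: the hash-based bucketing, the function names, the `tolerance` parameter, and the use of 0 to mean "rolled back" are all assumptions.

```python
import hashlib

ROLLOUT_STAGES = (10, 50, 100)  # percent of traffic on the new policy


def in_rollout(request_id: str, percent: int) -> bool:
    """Deterministically place a request in the first `percent` of 100 buckets."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def next_stage(current_percent: int, baseline_score: float,
               candidate_score: float, tolerance: float = 0.0) -> int:
    """Advance one stage, or return 0 (rollback) if eval scores regress."""
    if candidate_score < baseline_score - tolerance:
        return 0
    for stage in ROLLOUT_STAGES:
        if stage > current_percent:
            return stage
    return 100  # already fully rolled out
```

Hashing the request ID keeps the assignment stable, so anything in the 10% cohort stays on the new policy when the rollout widens to 50%, which keeps the eval comparison between stages consistent.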

Net effect across 22 engineering teams in Q1 2026: teams starting at $25K to $40K per month on Cursor-driven token spend saw costs trend down 18% to 32% within four weeks. No developer changed their workflow.

The hosted Agent Command Center adds the failure-cluster view, live Protect guardrails (around 65 ms text-scanning latency per arXiv 2510.13351), RBAC, SOC 2 Type II certification, and an AWS Marketplace listing.


What we did not include

Four gateways that show up in other 2026 Cursor listicles were deliberately left out.

  • OpenRouter. Routing is consumer-facing. No virtual-key fan-out, no SSO chargeback, no hard-cutoff budgets.
  • Cloudflare AI Gateway. Cursor-specific integration was thin as of May 2026; worker-based observability doesn’t yet slice per-developer without custom code.
  • Kong AI Gateway. Strong if you already run Kong, but chargeback is plugin-driven (Grafana on OTel); platform-team time to ship the cross-team view finance accepts is closer to two weeks than two days.
  • Maxim Bifrost. Strong throughput; team-spend rollup is thinner than the five picks above.

If your situation is different, all four are worth a second look in Q3 2026.



Sources

  • Cursor documentation, cursor.com/docs (Custom API mode, Business plan)
  • Future AGI Agent Command Center, futureagi.com/platform/monitor/command-center
  • Future AGI traceAI, ai-evaluation, agent-opt (Apache 2.0), github.com/future-agi
  • Future AGI Protect latency benchmarks, arxiv.org/abs/2510.13351
  • Portkey, portkey.ai; PANW intent to acquire (April 30, 2026), paloaltonetworks.com/company/press
  • Helicone, helicone.ai; Mintlify acquisition (March 3, 2026), mintlify.com/blog
  • LiteLLM, github.com/BerriAI/litellm; Datadog Security Labs TeamPCP PyPI writeup (March 24, 2026)
  • TrueFoundry, truefoundry.com

Frequently asked questions

How do I track Cursor cost per developer when everyone shares one team account?
Use a gateway with virtual keys (Future AGI, Portkey, LiteLLM, TrueFoundry). Each developer gets a virtual key fanning out to one underlying provider key, preserving bulk pricing. SSO provisioning wires the email to the virtual key at join time.
How do I cap Cursor spend per repository?
All five gateways support repo tagging; only three (Future AGI, Portkey, TrueFoundry) have hard-cutoff caps by default. Inject the repo URL in the Cursor workspace JSON, surface as a span attribute, set the cap with a soft alert at 80% and hard pause at 110%.
Can I route Cursor through multiple model providers?
Yes — one of the highest-value moves for spend management. Typical 2026 mix: `gpt-5-mini` or `claude-haiku-4-5` for Tab, `claude-sonnet-4-6` for short Composer turns, `claude-opus-4-7` or `gpt-5-codex` for long-context refactors.
What happens to Composer tool calls when Cursor runs through a gateway?
All five pass tool calls through intact as of May 2026, with one caveat: LiteLLM before 1.83.7 had a tool-block re-serialisation bug that broke Composer's diff-apply tool. Pin LiteLLM to 1.83.7 or newer.
Is it safe to send source code through an AI gateway?
For hosted gateways the flow is gateway → model provider; both already see the code. If compliance restricts both, the only safe path is self-hosted (LiteLLM or Future AGI's BYOC) running inside your VPC.
How is Future AGI different from Portkey for Cursor?
Portkey gives you the hosted observation and routing layer; you do routing-policy updates yourself. Future AGI adds an optimization layer — the same trace data feeds back into prompt rewrites and routing-policy updates so the gateway gets better over time. Portkey gives a dashboard; Future AGI gives a dashboard plus a loop.