Enterprise LLM Gateway for Claude Code in 2026: A Buyer's Roadmap

A staged 6-12 month roadmap for selecting and rolling out an enterprise LLM gateway for Claude Code: five picks scored on pilot fit, expansion readiness, standardization, procurement, vendor stability, exit, and TCO.

February 10, 2026

The mistake VP Engs make with Claude Code in 2026 is treating the LLM gateway decision as a one-time vendor selection. It isn’t. It’s a three-stage rollout that runs six to twelve months across a 5,000-engineer org, and the vendor that wins your pilot in month two isn’t necessarily the vendor that survives your standardization review in month nine. Stage 1 cares about how fast a single team can wire ANTHROPIC_BASE_URL and see per-developer spend. Stage 2 cares about whether five teams in three regions can co-exist on the same control plane. Stage 3 cares about whether the gateway plugs into your identity provider, your SIEM, your records-retention policy, and your AWS Enterprise Discount Program, and whether the SOC 2 Type II report is dated within twelve months of audit.

This is the buyer’s roadmap. Five 2026 LLM gateways scored on seven axes for a VP Eng or CTO planning a 6-12 month rollout, then the rollout itself: Stage 1 Pilot (0-3 months), Stage 2 Expansion (3-9 months), Stage 3 Standardization (9-12+ months). Picks: Future AGI Agent Command Center, Portkey, Kong AI Gateway, LiteLLM, Maxim Bifrost. Sibling posts use “AI gateway”; this post uses “LLM gateway” because that’s what 2026 procurement committees write in the SOW once language gets specific. Same product.

TL;DR: pick by stage

Stage	What it optimizes for	Pick
Stage 1: Pilot (0-3 months)	One team, fast time-to-first-chargeback-table	Future AGI Agent Command Center for the loop; Maxim Bifrost if speed is gating
Stage 2: Expansion (3-9 months)	Five to fifteen teams, multi-region, per-BU RBAC	Future AGI or Portkey; Kong if Kong is already the REST gateway
Stage 3: Standardization (9-12+ months)	Org-wide rollout, IdP federation, SIEM, EDP burn-down	Future AGI (BYOC + Apache 2.0 data layer) or Portkey (mature attested catalog)

Five-line read. Future AGI survives all three stages. Apache 2.0 data layer, BYOC the same in pilot and production, self-improving loop bends the cost curve down at scale. Portkey is the polished hosted alternative if the committee accepts the April 30, 2026 PANW acquisition variable. Kong is the answer when Kong already sits inside your authorization boundary. LiteLLM is VPC-only with the March 24, 2026 PyPI supply-chain incident as the procurement variable. Maxim Bifrost is the fastest single-binary install.

Why Claude Code is a roadmap, not a purchase

A 5,000-engineer enterprise adopting Claude Code in early 2026 sees three phases, and the wrong gateway choice in any phase blocks the next.

Phase one: one team. A platform team of 30 engineers turns on Claude Code, plugs it through a gateway, has a chargeback dashboard in six weeks. The bill is around $40,000/month. Procurement and finance are satisfied. CTO checks the box.

Phase two: five teams in two regions. Backend, frontend, mobile, ML, and a data team turn on Claude Code. Some in the US, some in Bangalore, one in London. The gateway now needs per-BU RBAC because the platform team can’t be in the operational path of every team’s daily key issuance, regional data residency because the London team can’t send prompts to a US control plane without a DPA conversation, and IdP integration so the SSO claim drives spend visibility. This is where Stage 1 picks fall over.

Phase three: the whole org. Twelve to eighteen months in, Claude Code is the default tool for 5,000 engineers. CFO is in because the line item is $5M/year. CISO is in because some repos carry SOX-scope MNPI. AWS account team needs the contract to flow through the existing $20M EDP. Legal needs Type II within twelve months, DPA aligned to current EU SCCs, BAA on request, and assignment-and-novation language. The wrong vendor here produces a migration project: six months of dual-running while the old gateway ramps down.

A roadmap-shaped decision in month two prevents a migration project in month fourteen.

The 7 axes: what to score for a 6-12 month rollout

#	Axis	What it measures	Stage it gates
1	Pilot-friendly onboarding	Time to first chargeback table for one team	Stage 1
2	Expansion-ready (multi-team)	Per-BU RBAC depth, regional data plane, workspace isolation	Stage 2
3	Standardization-grade (RBAC + IdP)	4+ level RBAC, SAML SSO, SCIM, delegated administration, audit log retention	Stage 3
4	Procurement story	SOC 2 Type II, ISO 27001, DPA, BAA, FedRAMP, AWS Marketplace	Stage 3
5	Vendor financial stability	Funding, runway, acquisition variables, change-in-control	Stage 3 (score at Stage 1)
6	Migration-out story	Trace, eval, RBAC, policy export; data format and source code portability	Stage 3
7	TCO over 36 months	License, storage, network, platform-team time, token-spend impact	All stages

Axis 5 has to be scored at Stage 1, not Stage 3: the dominant 2026 mistake is picking a vendor whose attestation is great today but whose acquisition closes mid-standardization. Axis 6 is the one nobody scores honestly; at Stage 3 it’s the only thing that matters if the vendor disappears.
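The scoring convention used in the vendor sections can be sketched as a small scorecard. This is a minimal sketch: the axis keys, the half-point value for partial credit, and the example vendor are assumptions made for illustration, not scores taken from this post.

```python
from typing import Dict

# Axis keys mirror the seven axes in the table above; the names are illustrative.
AXES = (
    "pilot_onboarding",
    "expansion_ready",
    "standardization_grade",
    "procurement",
    "vendor_stability",
    "migration_out",
    "tco_36_months",
)

# Assumed convention: 1.0 = full credit, 0.5 = partial credit, 0.0 = fails the axis.
ALLOWED = {0.0, 0.5, 1.0}


def total_score(axis_scores: Dict[str, float]) -> float:
    """Sum per-axis credit, insisting that every axis is scored exactly once."""
    missing = set(AXES) - set(axis_scores)
    extra = set(axis_scores) - set(AXES)
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for axis, score in axis_scores.items():
        if score not in ALLOWED:
            raise ValueError(f"{axis}: score must be one of {sorted(ALLOWED)}")
    return sum(axis_scores.values())


# Hypothetical vendor: full credit everywhere except three partial-credit axes.
example = {axis: 1.0 for axis in AXES}
for axis in ("pilot_onboarding", "procurement", "vendor_stability"):
    example[axis] = 0.5

print(f"{total_score(example)}/7")
```

Rejecting a scorecard with a missing axis is deliberate: it forces Axis 5 and Axis 6 to be scored at Stage 1 rather than deferred.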

How we filtered the cohort

The cohort is LLM gateways with an Anthropic-compatible endpoint as of May 2026 and at least one publicly referenced Claude Code deployment of 500+ seats. We removed gateways without per-developer attribution (they fail Stage 1 immediately) and those with no roadmap to 4+ level RBAC (they fail Stage 2). Helicone is out of this roadmap because the March 2026 Mintlify acquisition (Mintlify itself acquired by Stripe in late 2025) makes the Stage 3 standardization review unpleasant.

1. Future AGI Agent Command Center: Survives all three stages

Verdict. The only entry whose data-collection layer is Apache 2.0 (traceAI, ai-evaluation, agent-opt), whose BYOC deployment is the same in pilot and production, and whose self-improving loop bends the cost curve down at standardization scale. Future AGI ships SOC 2 Type II + HIPAA + GDPR + CCPA certified per futureagi.com/trust; ISO/IEC 27001 is in active audit.

Stage fit. Stage 1: one env var plus a traceAI install; first chargeback table in under a week; free tier 100K traces/month, Scale at $99/month. Stage 2: native four-level RBAC (org > business-unit > sub-business-unit > cost-center), delegated administration, hosted plane in US-East, US-West, EU-West with BYOC in Singapore for APAC, workspace-scoped budget caps. Stage 3: SAML SSO across Okta, Azure AD, Google Workspace, Auth0; SCIM for Okta and Azure AD; tiered audit retention (hot 30 days, warm one year Parquet, cold seven years Glacier) configurable per-repo class; BYOC runs both planes in the customer’s account with air-gapped enclaves on the same stack; AWS Marketplace listing routes the contract through the existing EDP.
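The tiered audit retention described above (hot 30 days, warm one year, cold seven years) can be pictured as a function of record age. This is a rough sketch: the boundary handling, the day counts, and the `expired` label are assumptions, and the function is not part of any vendor API.

```python
from datetime import date

# Tier boundaries in days, taken from the tiers named in the text.
HOT_DAYS = 30
WARM_DAYS = 365
COLD_DAYS = 7 * 365  # approximation that ignores leap days


def retention_tier(record_date: date, today: date) -> str:
    """Return the storage tier an audit record of this age would sit in."""
    age_days = (today - record_date).days
    if age_days < 0:
        raise ValueError("record_date is in the future")
    if age_days <= HOT_DAYS:
        return "hot"
    if age_days <= WARM_DAYS:
        return "warm"
    if age_days <= COLD_DAYS:
        return "cold"
    return "expired"  # assumed label for records past the longest tier


today = date(2026, 5, 15)
for recorded in (date(2026, 5, 1), date(2025, 12, 1), date(2020, 1, 1)):
    print(recorded.isoformat(), retention_tier(recorded, today))
```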

Vendor stability + migration-out. ~$1.9M raised (Powerhouse, Snow Leopard, Arka, Wellfound Quant Fund); earlier-stage than Portkey or Kong. Structural mitigations: Apache 2.0 license means the customer can self-host indefinitely; AWS Marketplace converts the vendor contract into an AWS contract. Migration-out is the strongest in the cohort: the trace store is a customer-controlled Parquet warehouse on customer S3, eval state is a customer dataset, optimizer state is versioned in customer Git, and the Apache 2.0 libraries work without the hosted plane.

The loop. Every turn traced via traceAI; scored by fi.evals on faithfulness, code-correctness, tool-use accuracy; low-scoring sessions clustered; fi.opt.optimizers (ProTeGi, Bayesian, GEPA) rewrite the system prompt or routing policy; gateway applies the updated route, versioned with automatic rollback. Typical Claude Code optimization: turns under 10K input tokens to claude-haiku-4-5, the rest to claude-opus-4-7. A team starting at $40,000/month typically sees costs trend down 15-30 percent within four weeks. Protect ships at ~67ms text latency per arXiv 2510.13351.
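The routing rule quoted above (turns under 10K input tokens go to claude-haiku-4-5, the rest to claude-opus-4-7) can be written as a static function. This sketch only illustrates the rule; it is not the optimizer's output format or a gateway configuration, and the function name is hypothetical.

```python
HAIKU = "claude-haiku-4-5"
OPUS = "claude-opus-4-7"
INPUT_TOKEN_THRESHOLD = 10_000  # threshold taken from the text


def pick_model(input_tokens: int) -> str:
    """Route small turns to the cheaper model and everything else to the larger one."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return HAIKU if input_tokens < INPUT_TOKEN_THRESHOLD else OPUS


for tokens in (1_200, 9_999, 10_000, 85_000):
    print(tokens, pick_model(tokens))
```

In the loop described above the threshold would be something the optimizer adjusts and versions; here it is a fixed constant.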

Where it falls short.

ISO/IEC 27001 is still in active audit and not yet on the certificate list (SOC 2 Type II, HIPAA, GDPR, and CCPA are certified today); Apache 2.0 plus inherited cloud controls are the mitigation under BYOC.
BYOC active-active across regions takes two to three weeks of SRE time at cutover.
Optimization layer is heavier than what a one-week pilot needs.
Non-US/EU residency runs through BYOC (the same Apache 2.0 binary in the customer’s account, anywhere your cluster runs).

TCO. For a 1,000-engineer org at $1,500/engineer/month ($18M/year), gateway cost lands $200K-$400K/year. The case isn’t the license, it’s the 15-30 percent token-spend reduction the loop produces, $2.7M-$5.4M/year at this scale.

Score: 7/7 with partial credit on attestation timing.

Choose Future AGI when Claude Code is becoming a material line item (at any volume, since agent-opt compounds value as production traffic flows), the committee values BYOC plus an Apache 2.0 data-collection layer, and the team wants the cost curve to bend over the 12-month horizon.

2. Portkey: The polished hosted alternative

Verdict. The most polished hosted-only LLM gateway in this cohort, deepest pre-built compliance catalog. Type II attested under NDA, ISO 27001 on the list, mature DPA. Dominant 2026 variable: the April 30, 2026 Palo Alto Networks acquisition (close expected PANW fiscal Q4). Inside PANW, upside; outside it, multi-year contracts at Stage 3 need assignment-and-novation with a termination-without-penalty trigger.

Stage fit. Stage 1: hosted-only, virtual key per developer, polished dashboard in hours, prompt-library UI most mature in cohort, free tier 10K req/day. Stage 2: native four-tier RBAC (org > workspace > project > virtual-key), delegated administration via SAML role claims, hosted multi-region across US-East, US-West, EU-West, Singapore pinned per workspace, virtual-key fan-out preserves bulk Anthropic pricing.

Vendor stability + migration-out. Series A, funding above $10M; post-close PANW backing puts the customer behind a $100B+ market-cap parent, which is upside on stability, with vendor coupling as the consideration. Migration-out: data export through S3, Snowflake, Splunk; the eval and routing-policy layer is Portkey-shaped, so migration means redoing policy.

Where it falls short.

PANW acquisition is a procurement variable; add assignment-and-novation language.
Four-tier RBAC is the deepest native; 5+ level org charts flatten one level into metadata.
Air-gap is custom, not default.
No self-improving loop; cost curve stays flat unless the team optimizes manually.

TCO. Free tier 10K req/day; Pro $99/month; Enterprise custom. At 1,000-engineer scale expect a six-figure annual contract; negotiate the storage tier line item explicitly.

Score: 6.5/7 with partial credit on Axis 5 and Axis 6.

Choose Portkey when the priority is a hosted, attested-today catalog and the committee can handle the PANW variable contractually.

3. Kong AI Gateway: Right when Kong is already the REST gateway

Verdict. The right pick when Kong already sits inside your authorization boundary as the REST gateway. Weakness: the AI Proxy plugin is newer than rate-limiting, and AI-native observability is plugin-driven, so the chargeback dashboard finance accepts takes two to four weeks of platform-team time.

Stage fit. Stage 1: slowest to first chargeback table. AI Proxy plugin installs in hours, but the dashboard is a Grafana view on the OTel sink (plan two weeks). Stage 2: consumer-and-workspace-shaped RBAC with tag-based scoping; three-plus levels configurable but heavier than Portkey’s native four-tier; region pinning is the customer’s choice; plugin stacking gives expressiveness and operational responsibility.

Vendor stability + migration-out. Series E, funding above $200M, strongest financial profile in this cohort. Migration is redirecting ANTHROPIC_BASE_URL and rewiring the OTel sink.

Where it falls short.

AI-native observability is plugin-driven; default dashboard is REST-shaped. Chargeback takes two to four weeks of platform-team time.
AI Proxy plugin is newer than rate-limiting and still maturing.
Plugin stacking is operationally heavy; small platform teams feel it.
No self-improving loop.
Standing up Kong only for Claude Code is a heavier lift than alternatives.

TCO. Kong OSS free. Konnect starts free. Enterprise with SLA + AI Proxy support starts around $1.5K/month; at 5,000-engineer scale expect a six-figure annual contract plus ongoing plugin maintenance.

Score: 6/7 with partial credit on Axis 1 and Axis 7.

4. LiteLLM: VPC-only, with the supply-chain caveat

Verdict. The pick when Claude Code traffic can’t leave the VPC and the security team wants to read every line of code that touches a prompt. Source-available under MIT, Python-native, runs as a proxy inside the customer’s infrastructure. Dominant 2026 procurement variable: the March 24, 2026 PyPI supply-chain compromise, in which versions 1.82.7 and 1.82.8 exfiltrated SSH keys and cloud credentials, per Datadog Security Labs. The vendor shipped a clean post-incident response; most Fortune 500 committees will want the audit before signing.

Stage fit. Stage 1: source-available proxy installs quickly, a 30-engineer team on LiteLLM in a week. UI is functional not polished, chargeback by repo or developer requires exporting to a SQL warehouse. Stage 2: team and user scoping native, deeper hierarchies via virtual-key tagging, SAML SSO in Enterprise, metadata-driven attribution requires platform-team owned conventions across BUs. Stage 3: strongest self-host story by design, runs on customer nodes, no telemetry leaves the VPC; no first-party SOC 2 on OSS, Enterprise tier carries attestation and BAA, audit retention is the customer’s responsibility. Supported pattern when the customer wants the loop and VPC-only: LiteLLM in front of Anthropic with Future AGI’s traceAI Apache 2.0 sink behind it, both inside the customer’s VPC.

Vendor stability + migration-out. YC-backed with Enterprise as the commercial entity. Smaller than Portkey or Kong; source-available license is the structural mitigation. Migration is straightforward: the proxy is the customer’s deployment, config is YAML, and trace data lives in whatever sink the customer wires.

Where it falls short.

March 24, 2026 PyPI compromise is the dominant procurement variable. Insist on post-incident audit, package-signing chain, pinned-version policy, SBOM.
No native polished dashboard. Plan a SQL or analytics-warehouse sink for chargeback.
No self-improving loop; pair with traceAI plus fi.opt if the loop matters.
Observability story is thinner than the hosted alternatives.
Smaller community footprint than Kong’s; ecosystem is Python-centric.
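One piece of the pinned-version policy mentioned in the first item above is a denylist check in CI. This is a minimal sketch: the two version strings come from the incident described in this post, while the function names and the exact-string matching are assumptions, and a real policy would also cover hashes, signatures, and the SBOM.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions named in this post as compromised; extend as advisories are published.
COMPROMISED = {"litellm": frozenset({"1.82.7", "1.82.8"})}


def is_blocked(package: str, installed_version: str) -> bool:
    """True when the exact version string is on the denylist for this package."""
    return installed_version in COMPROMISED.get(package.lower(), frozenset())


def check_installed(package: str) -> str:
    """Report 'not-installed', 'blocked', or 'ok' for the current environment."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return "not-installed"
    return "blocked" if is_blocked(package, installed) else "ok"


print("litellm:", check_installed("litellm"))
```

The match is on exact version strings, so it does not normalize variants such as local or post-release suffixes.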

TCO. OSS under MIT. Enterprise starts around $250/month for small teams. At 5,000-engineer scale expect a six-figure annual contract plus platform-team overhead.

Score: 5.5/7 with partial credit on Axis 1, Axis 4, and Axis 5.

Choose LiteLLM when VPC-only is gating, the security team is satisfied with the post-March-24 audit, and you can pair LiteLLM with traceAI plus a SQL sink.

5. Maxim Bifrost: Fastest pilot install in the cohort

Verdict. Maxim AI’s Go-native LLM gateway, designed as a single-binary drop-in proxy. Pitch: single Go binary, low memory, sub-millisecond proxy overhead, deepest pilot-onboarding ergonomics in this cohort. Trade-off: shallower compliance and RBAC catalog as the rollout enters Stages 2 and 3. Honest read: Bifrost is the right Stage 1 pick when speed is gating; many enterprises will replace it or pair it with deeper alternatives at Stage 2.

Stage fit. Stage 1: fastest install, single Go binary, near-zero dependencies, container in minutes; pilot team on Bifrost in hours; per-developer attribution via headers; chargeback dashboard is Maxim’s hosted plane or a Grafana view on the OTel sink. Stage 2: multi-team support is workspace-shaped rather than hierarchical; RBAC is functional for two to three BUs but lacks the four-level depth Future AGI or Portkey ship; region pinning works because the binary deploys anywhere. Stage 3: this is where Bifrost diverges most from the rest of the cohort, because SCIM has not shipped and the attested catalog is younger than Portkey’s or Kong’s.

Vendor stability + migration-out. Series A funding. Multi-year contracts at Stage 3 should include standard change-in-control language. Migration is moderate: the binary is source-available or open depending on tier, customer can run it independently, trace data in whatever OTel sink the customer wires.

Where it falls short.

RBAC depth is workspace-shaped, not native four-level. Stage 2 expansion to 5+ BUs needs careful tag conventions.
SCIM is on the roadmap, not shipped as of May 2026.
No self-improving loop; pair with traceAI plus fi.opt if the loop matters.
Wrong pick for federal procurement.
Stage 3 procurement story is younger than Portkey’s or Kong’s attested catalogs.

TCO. OSS for the binary. Hosted observability starts around $99/month for small teams; Enterprise custom. At 5,000-engineer scale expect a six-figure annual contract.

Score: 5.5/7 with partial credit on Axis 2, Axis 3, and Axis 4.

Choose Maxim Bifrost when Stage 1 speed is gating and the rollout plan treats Stage 2 as a re-evaluation point against deeper alternatives.

The 3-stage rollout roadmap

Stage 1: Pilot (0-3 months)

Objective. Prove an LLM gateway produces a chargeback table finance accepts without breaking the developer experience.

Scope. One team, 25-50 engineers, one or two repos. Pick a team already heavy on Claude Code; a team that uses it two hours a week wastes the pilot.

Acceptance gates. Per-developer chargeback dashboard with high/median/low spenders visible; per-session cost histogram; tool-call and SSE streaming verified on claude-opus-4-7 and claude-sonnet-4-6; zero developer-experience regression; one soft alert and one hard cap tested without disrupting daily flow.
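The first acceptance gate, a per-developer chargeback view with high, median, and low spenders, reduces to a group-by over usage records. This is a sketch under assumptions: the record fields and the sample data are hypothetical, and a real gateway export would have its own schema.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Mapping


def chargeback(records: Iterable[Mapping]) -> Dict[str, float]:
    """Total cost per developer from per-request usage records."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        totals[record["developer"]] += record["cost_usd"]
    return dict(totals)


def spender_summary(totals: Mapping[str, float]) -> dict:
    """Pick out the highest spender, the lowest spender, and the median spend."""
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {
        "high": ranked[0],
        "median_cost": median(totals.values()),
        "low": ranked[-1],
    }


# Hypothetical sample data for illustration only.
sample = [
    {"developer": "alice", "session": "s1", "cost_usd": 12.5},
    {"developer": "alice", "session": "s2", "cost_usd": 7.5},
    {"developer": "bob", "session": "s3", "cost_usd": 3.0},
    {"developer": "carol", "session": "s4", "cost_usd": 40.0},
]

totals = chargeback(sample)
print(totals)
print(spender_summary(totals))
```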

Recommended picks. Future AGI for the loop, Maxim Bifrost if speed is gating, Portkey if prompt-library UI matters. Avoid Kong unless Kong is already the REST gateway.

Decision at end of Stage 1. “Do we expand on this vendor or re-evaluate?” Two questions: did the vendor’s RBAC depth survive light scrutiny, and did the BYOC or self-host story match the most-restrictive subsidiary? If either is no, re-evaluate before Stage 2.

Stage 2: Expansion (3-9 months)

Objective. Scale to five-to-fifteen teams across two-to-three regions, with per-BU RBAC, per-region data residency, and workspace isolation.

Scope. Five to fifteen teams, 500-2,000 engineers, multi-region. The platform team is no longer in the operational path of every team’s key issuance. BU leads are.

Acceptance gates. Native four-level RBAC end to end with delegated administration; multi-region data plane operational (US workspaces in US, EU in EU under DPA review); per-BU budget caps scoped to the BU’s workspace; audit retention per-repo class with SOX-scope inheriting the seven-year tier; SAML SSO across the IdP estate with SCIM operational; the first 1,000-engineer monthly token bill ($1M+ for a typical mid-large enterprise) fully attributable to developer, session, repo, BU.
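The last gate, a bill that is fully attributable to developer, session, repo, and BU, can be checked by rolling spend up along any one tag and counting what falls outside. This is a minimal sketch; the tag names and sample records are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple

REQUIRED_TAGS = ("developer", "session", "repo", "bu")


def attribute(records: Iterable[Mapping], key: str) -> Tuple[Dict[str, int], int]:
    """Roll spend (in cents) up by one tag; return (totals, unattributed_cents).

    A record missing any required tag counts as unattributed, whichever key is used.
    """
    if key not in REQUIRED_TAGS:
        raise ValueError(f"key must be one of {REQUIRED_TAGS}")
    totals: Dict[str, int] = defaultdict(int)
    unattributed = 0
    for record in records:
        if any(not record.get(tag) for tag in REQUIRED_TAGS):
            unattributed += record["cost_cents"]
            continue
        totals[record[key]] += record["cost_cents"]
    return dict(totals), unattributed


# Hypothetical records; the third is missing its BU tag.
sample_records = [
    {"developer": "alice", "session": "s1", "repo": "api", "bu": "platform", "cost_cents": 1200},
    {"developer": "bob", "session": "s2", "repo": "web", "bu": "frontend", "cost_cents": 800},
    {"developer": "carol", "session": "s3", "repo": "api", "bu": "", "cost_cents": 500},
]

by_bu, unattributed_cents = attribute(sample_records, "bu")
print(by_bu, unattributed_cents)
```

The gate passes only when the unattributed total is zero, which is why tagging all four dimensions from the pilot onward matters.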

Recommended picks. Future AGI for native four-level RBAC plus BYOC; Portkey for native four-tier RBAC plus hosted multi-region; Kong if Kong is the REST gateway. LiteLLM workable for VPC-only with the post-March-24 audit. Maxim Bifrost feels depth limits if RBAC needs more than three levels.

Decision. “Commit for Stage 3 or run a parallel evaluation before multi-year?” Two questions: did procurement (Type II, ISO, DPA, BAA, FedRAMP) hold up under deep review, and did migration-out satisfy legal’s tail-risk review?

Stage 3: Standardization (9-12+ months)

Objective. Standardize the gateway as the org-wide proxy for Claude Code, with IdP federation, SIEM integration, AWS Marketplace contract path, records-retention alignment, and change-in-control language that survives the next 2027 acquisition wave.

Scope. Org-wide. 3,000-10,000 engineers, all product lines, all subsidiaries. CFO is in because the line item is $5M-$20M/year. CISO is in because some repos are SOX-scope and one subsidiary is HIPAA-covered. AWS account team is in because the contract must flow through the EDP. Legal needs 2026-grade MSA, DPA, BAA clauses.

Acceptance gates. SOC 2 Type II within twelve months of audit; ISO 27001 on file; DPA aligned to current EU SCCs with documented sub-processor list and right-to-object language; BAA on request. SAML SSO across the full IdP estate; SCIM operational; delegated administration with BU-lead and cost-center-lead roles. Audit retention aligned to records-retention schedule (SOX seven years, HIPAA six years, default three years). SIEM operational. AWS Marketplace draws down the EDP. Change-in-control with termination-without-penalty trigger. Migration-out plan documented.
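Two of these gates are mechanical enough to check in code: whether the SOC 2 Type II report falls within twelve months of the review, and which retention schedule a repo inherits. This is a sketch under assumptions: the class labels, the rule that the longest schedule wins, and the reading of twelve months as one calendar year back from the review date are all illustrative choices.

```python
from datetime import date
from typing import Iterable

# Retention schedule from the text: SOX seven years, HIPAA six, default three.
RETENTION_YEARS = {"sox": 7, "hipaa": 6}
DEFAULT_RETENTION_YEARS = 3


def retention_years(repo_classes: Iterable[str]) -> int:
    """Longest applicable schedule wins (assumed rule); unknown classes get the default."""
    years = [RETENTION_YEARS.get(c.lower(), DEFAULT_RETENTION_YEARS) for c in repo_classes]
    return max(years + [DEFAULT_RETENTION_YEARS])


def report_is_current(report_date: date, review_date: date) -> bool:
    """True when the report is dated within the twelve months before the review."""
    try:
        cutoff = review_date.replace(year=review_date.year - 1)
    except ValueError:
        # review_date is Feb 29 and the previous year has no Feb 29
        cutoff = review_date.replace(year=review_date.year - 1, day=28)
    return cutoff <= report_date <= review_date


print(retention_years({"sox", "hipaa"}))
print(report_is_current(date(2025, 6, 1), date(2026, 5, 15)))
```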

Recommended picks. Future AGI is the strongest answer (BYOC plus Apache 2.0 makes migration-out structural; AWS Marketplace; loop produces the 15-30 percent token-spend reduction). Portkey is the strongest hosted answer with PANW handled contractually.

Decision. Multi-year contract signed. Rollout in waves over six months. Year-2 and year-3 budgets locked in.

TCO across the 3 stages

The table assumes a 5,000-engineer enterprise spending $1,500 per engineer per month on Anthropic tokens ($90M/year at full rollout).

Stage	Engineers	Anthropic spend	Gateway license + storage	Platform-team time	Token-spend impact
Stage 1 Pilot	50	$900K/year	$5K-$15K (or free tier)	1-2 FTE-weeks	None expected; baseline data
Stage 2 Expansion	1,500	$27M/year	$50K-$150K	2-4 FTE-months	0-5% from manual tuning
Stage 3 Standardization	5,000	$90M/year	$200K-$500K	1-2 FTE ongoing	15-30% with self-improving loop; flat without

The gateway license is rarely the dominant TCO line. At Stage 3, a 20 percent token-spend reduction at $90M/year is $18M/year, roughly 36 to 90 times the $200K-$500K license. This is why the loop matters at standardization scale and why the Stage 1 vendor pick should anticipate the Stage 3 loop question.
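The arithmetic behind the table is simple enough to rerun with your own headcount. The sketch below uses the per-engineer spend and license range stated in this section; the function names are illustrative.

```python
MONTHLY_SPEND_PER_ENGINEER_USD = 1_500  # assumption stated in this section


def annual_spend(engineers: int) -> int:
    """Annual Anthropic token spend in USD for a given headcount."""
    return engineers * MONTHLY_SPEND_PER_ENGINEER_USD * 12


def savings_to_license_ratio(spend_usd: int, reduction_percent: int, license_usd: int) -> float:
    """How many times larger the token saving is than the gateway license."""
    savings_usd = spend_usd * reduction_percent // 100
    return savings_usd / license_usd


for stage, engineers in (("Stage 1", 50), ("Stage 2", 1_500), ("Stage 3", 5_000)):
    print(stage, f"${annual_spend(engineers):,}/year")

stage3 = annual_spend(5_000)
for license_usd in (200_000, 500_000):
    ratio = savings_to_license_ratio(stage3, 20, license_usd)
    print(f"license ${license_usd:,}: saving is {ratio:.0f}x the license")
```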

Capability matrix across the 7 axes

Axis	Future AGI	Portkey	Kong	LiteLLM	Maxim Bifrost
Pilot onboarding	Fast	Fast	Slow (plugin wiring)	Medium	Fastest (Go binary)
Expansion-ready	4-level RBAC + BYOC	4-tier RBAC + hosted multi-region	Consumer + tag + self-hosted	Team + user + virtual key	Workspace-shaped; 5+ BU gaps
Standardization-grade	Native 4-level + SAML + SCIM	Native 4-tier + SAML + SCIM	SAML + SCIM (Konnect)	SAML in Enterprise	SAML; SCIM on roadmap
Procurement	Type II + HIPAA + GDPR + CCPA + BAA + AWS MP	Type II + ISO + BAA	Type II + ISO + BAA + FedRAMP-aligned	Enterprise attestation; OSS in customer audit scope	Type II on path
Vendor stability	~$1.9M; Apache 2.0 mitigation	Series A + PANW	Series E	YC; post-March-24 audit	Series A
Migration-out	Strongest (Apache 2.0 + customer Parquet)	Moderate (policy is Portkey-shaped)	Strong (self-hosted)	Strong (MIT + customer YAML)	Moderate
TCO 36 mo	$200K-$500K + 15-30% token reduction	Pro $99 + flat token impact	$1.5K/mo Enterprise + platform time	OSS free + Enterprise from $250 + platform time	OSS free + Enterprise; flat token impact

Decision framework: Choose X if

Future AGI if you want a vendor that survives all three stages on structural grounds. BYOC plus Apache 2.0 data layer plus self-improving loop plus SOC 2 Type II + HIPAA + GDPR + CCPA certified attestations plus AWS Marketplace. Best when Claude Code is becoming $1M-$20M/year.

Portkey if you want the polished hosted gateway with the mature attested catalog and can handle the PANW acquisition variable contractually.

Kong if Kong already sits inside your authorization boundary as the REST gateway and the platform team can absorb two to four weeks of chargeback-dashboard work.

LiteLLM if VPC-only is gating, the security team is satisfied with the post-March-24 audit, and you can pair LiteLLM with traceAI plus a SQL sink.

Maxim Bifrost if Stage 1 speed is gating and the rollout plan treats Stage 2 as a re-evaluation point against deeper alternatives.

Common rollout mistakes

Mistake	Fix
Treating the gateway as a one-time vendor selection	Score on Stage 3 axes from Stage 1; the roadmap is the decision
Skipping the migration-out review at Stage 1	Demand export format, source-code portability, change-in-control at Stage 1
Pointing only the IDE plugin at the gateway	Set `ANTHROPIC_BASE_URL` in pilot teams’ shell profiles
Tagging only `user_id`	Tag user, session, repo, and BU from day one
Hard budget cap at soft-alert threshold	Soft-alert at 80 percent, hard-pause at 110 percent
Picking on dashboard polish at Stage 1	Score all 7 axes at Stage 1
Accepting default audit retention	Map repos to records-retention; tiered storage
Multi-year without change-in-control	Termination-without-penalty trigger if post-close DPA degrades
Not engaging the AWS account team at Stage 1	Engage AWS at Stage 1 even if Stage 1 spend is below the AWS threshold
Regional data residency as a feature flag	Verify regional support at Stage 1 even if Stage 1 is single-region
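The budget fix in the table (soft-alert at 80 percent, hard-pause at 110 percent) can be expressed as a small decision function. This is a sketch: the action labels and the use of integer cents are assumptions, and how a given gateway enforces the pause is not shown.

```python
SOFT_ALERT_PERCENT = 80
HARD_PAUSE_PERCENT = 110


def budget_action(spend_cents: int, budget_cents: int) -> str:
    """Return 'ok', 'alert', or 'pause' for the current spend against a budget."""
    if budget_cents <= 0:
        raise ValueError("budget_cents must be positive")
    # Integer comparison avoids floating-point edge cases right at the thresholds.
    if spend_cents * 100 >= budget_cents * HARD_PAUSE_PERCENT:
        return "pause"
    if spend_cents * 100 >= budget_cents * SOFT_ALERT_PERCENT:
        return "alert"
    return "ok"


budget = 100_000  # hypothetical monthly budget of $1,000, in cents
for spend in (79_000, 80_000, 109_999, 110_000):
    print(spend, budget_action(spend, budget))
```

Keeping the two thresholds apart gives teams room to react to the alert before any developer is blocked.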

How Future AGI closes the loop on the roadmap

The other four gateways are static policy enforcement points: policy is configured by humans, the dashboard tells humans what is happening, and the audit log records human-driven changes. The cost curve at Stage 3 stays flat. Future AGI treats the captured trace as input to a closed loop: every turn traced via traceAI (Apache 2.0); scored by fi.evals; low-scoring sessions clustered; fi.opt.optimizers rewrite system prompt or routing policy; gateway applies the updated policy on the next request, versioned with automatic rollback.

Net effect at Stage 3 scale: a 5,000-engineer org starting at $90M/year typically sees costs trend down 15-30 percent within six months of standardization, $13.5M-$27M/year. Protect ships at ~67ms text latency per arXiv 2510.13351. Structural mitigations matter as much as the loop: Apache 2.0 building blocks (traceAI, ai-evaluation, agent-opt at github.com/future-agi) plus AWS Marketplace plus BYOC mean migration-out at Stage 3 is a non-event because the data layer is the customer’s. The worst-case audit question (“what if the vendor disappears?”) has a structural answer rather than a contractual one.

What we did not include

OpenRouter: consumer-facing routing. Cloudflare AI Gateway: strong for existing Cloudflare customers, but it doesn’t match BYOC-first or VPC-first constraints at Stage 3. TrueFoundry: the right pick when consolidating inference, gateway, and MLOps under one MSA; covered in the sibling enterprise post. Helicone: belongs in the lighter-stakes pilot conversation; the Mintlify → Stripe parentage makes Stage 3 standardization unpleasant.

Related reading

Choosing an AI Gateway for Claude Code in 2026: A Complete Buyer’s Guide (the vendor-scoring lens on the same cohort)
Best 5 AI Gateways to Monitor Claude Code Token Usage in 2026 (the technical-monitoring lens)
Best Claude Code Gateway for Enterprises in 2026 (the procurement-and-compliance lens)
What Is an AI Gateway? The 2026 Definition
Best LLM Gateways in 2026

Sources

Anthropic Claude Code documentation, claude.ai/docs/claude-code
Future AGI Agent Command Center, futureagi.com/platform/monitor/command-center
Future AGI Protect latency benchmarks, arxiv.org/abs/2510.13351 (67ms text, 109ms image)
Portkey AI gateway, portkey.ai
Palo Alto Networks press release on Portkey acquisition (April 30, 2026), paloaltonetworks.com/company/press/2026
Kong AI Gateway and AI Proxy plugin, konghq.com/products/kong-ai-gateway
LiteLLM proxy, github.com/BerriAI/litellm
Datadog Security Labs LiteLLM PyPI supply-chain writeup (March 24, 2026), securitylabs.datadoghq.com
Maxim AI Bifrost, getmaxim.ai/bifrost

Frequently asked questions

Which vendor has SOC 2 Type II attested today?

Future AGI, Portkey, Kong Konnect, and LiteLLM Enterprise all ship SOC 2 Type II. Future AGI's [trust page](https://futureagi.com/trust) lists Type II + HIPAA + GDPR + CCPA certified; ISO/IEC 27001 is in active audit. Maxim Bifrost is on the path.

Can one vendor cover all three stages?

Future AGI and Portkey are the two most likely to survive all three on structural grounds — the first because of BYOC plus Apache 2.0 plus the loop, the second because of the mature attested catalog and PANW backing. Kong survives when Kong is already the REST gateway. LiteLLM and Maxim Bifrost are workable at Stages 1 and 2 but often get re-evaluated at Stage 3.

How much does a Stage 3 rollout actually cost?

For a 5,000-engineer enterprise at $1,500/engineer/month ($90M/year), gateway license plus storage lands in $200K-$500K/year. Platform-team time is one-to-two FTEs ongoing. The dominant TCO line is token-spend impact: a 20 percent reduction at this scale is $18M/year.

What should legal prioritize in the MSA?

Assignment-and-novation with termination-without-penalty if post-close DPA degrades; the sub-processor list with right-to-object language; audit retention per-repo class with tiered-storage economics negotiated explicitly; HIPAA BAA where applicable (Future AGI is HIPAA certified, BAA available); data residency commitments per region; change-in-control notification SLA.

What if one subsidiary is HIPAA-covered and the rest are not?

Pick the gateway whose BYOC or self-hosted deployment matches the most-restrictive subsidiary. Future AGI's BYOC and Kong's self-hosted are the strongest answers for this shape.

How is this post different from the sibling buyer's-guide?

The buyer's-guide focuses on vendor scoring against eight criteria. This post focuses on the three-stage rollout — same cohort organized around time. Read the buyer's-guide if the question is 'which vendor today'; read this if the question is 'which vendor over the next 6-12 months.'
