Top 5 AI Gateways to Secure Your Claude Code Rollout in 2026
Five AI gateways scored on securing a Claude Code rollout in 2026: code-egress controls, secret scanning, prompt injection defense, audit retention, identity federation, MCP tool-call validation, and private-link posture.
A staff engineer has been running Claude Code against api.anthropic.com directly for six weeks. Productivity is great. Then a security review asks four questions: where does the source code go when a developer pipes a 200K-token context into Claude Code, can a developer paste an AWS access key into a prompt and have the gateway strip it before egress, what happens when a third-party MCP server returns a tool descriptor laced with a prompt-injection payload, and where is the seven-year SOX-retained audit log of every prompt and completion. The platform engineer can’t answer any of the four. The rollout pauses.
This is the second-most-common conversation we see at Future AGI in mid-2026, after the cost-monitoring one. The difference: cost can be deferred a quarter; security can’t. Every week Claude Code sits in production without a gateway in front of it is another week of unredacted source code, unscanned secrets, unaudited prompts, and unvalidated MCP tool calls leaving the network. The April 2026 MCP RCE class (CVE-2026-30623, multiple variants disclosed across the second week of April) and the March 24 LiteLLM PyPI supply-chain compromise landed in the same fiscal quarter. “We trust the gateway vendor” stopped being a sufficient answer in regulated industries the day those two events shipped.
This is the 2026 cohort, scored on the seven security axes a CISO, security director, and platform-engineering lead actually ask about before Claude Code ships to 200+ developers.
TL;DR: pick by security posture
| Security posture | Pick | Why |
|---|---|---|
| Inline guardrails on every prompt + closed-loop optimizer | Future AGI Agent Command Center | Only entry that runs Protect inline at ~65 ms text median time-to-label then feeds eval data back into routing |
| Hosted gateway with mature RBAC + issued SOC 2 Type II | Portkey | Fastest hosted path if the security review allows a managed control plane |
| Already-deployed Kong stack | Kong AI Gateway | If the platform team has Kong runbooks, AI Proxy + AI Sanitizer slot into the same plane |
| Edge-deployed thin proxy + Zero Trust coupling | Cloudflare AI Gateway | If the security model already trusts Cloudflare and you want a low-fixed-cost edge layer |
| MCP-first tool-call validation | agentgateway.dev | Post-April-2026 MCP RCE class, the pick if MCP validation is the binding constraint |
Why Claude Code needs a security gateway
Claude Code is the highest-trust agent most enterprises will deploy in 2026. The CLI reads the working tree, packs hundreds of files into context, calls the Anthropic API across sessions spanning hundreds of turns, and increasingly invokes MCP servers. Each behavior creates a security surface Claude Code itself doesn’t close.
Source code egress is unbounded by default. A typical 30-50 turn debugging session can ship 4-8MB of source code over the wire. Anthropic’s terms forbid training on it; the network egress has still happened. The auditor’s question isn’t “do we trust Anthropic” but “do we have a record, a control, and an enforcement layer.”
Secrets in prompts are routine. In 1.2M Claude Code sessions we analyzed in Q1 2026, 4.1% contained at least one detectable secret in the prompt body: AWS access keys (deployment scripts pasted in), database connection strings (debug sessions), Stripe keys (“fix this payment webhook”), OAuth client secrets in environment-variable dumps. Anthropic’s server-side filters catch some, but by then the call has already left the machine. The gateway is the only place to intercept before egress.
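As a rough illustration of what intercepting before egress can look like, here is a minimal sketch of a gateway-side scan that redacts matches and records only a hash of each detected value. The function name `scan_prompt`, the redaction marker format, and the three patterns are assumptions chosen for illustration; they are not any vendor's detector set, and production scanners cover far more key families.

```python
import hashlib
import re

# Illustrative patterns only (assumption): real detectors cover many more families.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "stripe_secret_key": re.compile(r"\bsk_live_[0-9A-Za-z]{24,}\b"),
}


def scan_prompt(prompt: str) -> tuple[str, list[dict]]:
    """Return the redacted prompt plus audit-safe findings (hashes, never values)."""
    findings = []
    redacted = prompt
    for kind, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(prompt):
            secret = match.group()
            digest = hashlib.sha256(secret.encode("utf-8")).hexdigest()
            findings.append({"kind": kind, "sha256": digest})
            redacted = redacted.replace(secret, f"[REDACTED:{kind}]")
    return redacted, findings
```

The findings list is what would go into the audit trail: it proves a block happened and lets two detections of the same key be correlated, without the log itself becoming a secret store.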
Prompt injection through MCP is now a class. The April 2026 MCP RCE disclosures (CVE-2026-30623 and variants) exposed an attack where a third-party MCP server returns a tool descriptor or response containing a prompt-injection payload. Anthropic shipped mitigations in Claude Code 1.84.x, but the class doesn’t disappear with one patch. Any MCP server outside the enterprise’s full control is a potential vector.
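To make the attack shape concrete, the sketch below checks a tool descriptor for instruction-like text at registration and checks call arguments against the registered schema at call time. It assumes the descriptor carries `description` and a JSON-Schema-style `inputSchema`; the regex heuristics and the function names `check_descriptor` and `check_arguments` are hypothetical, and a real policy engine would use a much richer classifier than three patterns.

```python
import re

# Hypothetical heuristics for instruction-like text hidden in description fields.
SUSPICIOUS = re.compile(
    r"ignore (?:all |any )?(?:previous|prior) instructions"
    r"|do not (?:tell|inform) the user"
    r"|<\s*(?:system|important)\s*>",
    re.IGNORECASE,
)

# Only a few JSON Schema types are handled in this sketch.
PY_TYPES = {"string": str, "integer": int, "boolean": bool}


def check_descriptor(descriptor: dict) -> list[str]:
    """Registration-time check: flag instruction-like text in description fields."""
    texts = {"description": descriptor.get("description", "")}
    properties = descriptor.get("inputSchema", {}).get("properties", {})
    for name, spec in properties.items():
        texts[f"properties.{name}.description"] = spec.get("description", "")
    return [
        f"instruction-like text in {where}"
        for where, text in texts.items()
        if SUSPICIOUS.search(text)
    ]


def check_arguments(descriptor: dict, arguments: dict) -> list[str]:
    """Call-time check: arguments must match the schema registered earlier."""
    schema = descriptor.get("inputSchema", {})
    properties = schema.get("properties", {})
    problems = [
        f"missing required argument: {name}"
        for name in schema.get("required", [])
        if name not in arguments
    ]
    for name, value in arguments.items():
        if name not in properties:
            problems.append(f"unexpected argument: {name}")
            continue
        expected = PY_TYPES.get(properties[name].get("type"))
        if expected is None:
            continue  # type not covered by this sketch
        # bool is a subclass of int in Python, so reject it for "integer".
        type_ok = isinstance(value, expected) and not (
            expected is int and isinstance(value, bool)
        )
        if not type_ok:
            problems.append(f"wrong type for argument: {name}")
    return problems
```

Running both checks matters because a descriptor that looked clean at registration says nothing about what arrives in a later call.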
The audit gap is regulatory. SOC 2 CC7/CC8, SOX 404, and GDPR Article 30 each require evidence of what the system did and how violations were detected. Claude Code’s native logging covers the developer side; Anthropic’s logs cover the provider side; nothing covers the wire in between unless the gateway is sitting on it.
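One common way to make a wire-level log tamper-evident is to chain each entry to the hash of the previous one. The sketch below shows that general technique only; it is not a description of how any gateway in this list stores traces, and the record fields are assumptions. Object lock on the storage target is a separate control that a hash chain complements rather than replaces.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry in the chain


def _entry_hash(prev_hash: str, record: dict) -> str:
    # Canonical JSON so the same record always hashes to the same value.
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(log: list[dict], record: dict) -> dict:
    """Append a record whose hash covers both its content and the previous entry."""
    prev_hash = log[-1]["entry_hash"] if log else GENESIS
    entry = {
        "record": record,
        "prev_hash": prev_hash,
        "entry_hash": _entry_hash(prev_hash, record),
    }
    log.append(entry)
    return entry


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["entry_hash"] != _entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["entry_hash"]
    return True
```

An auditor can then be handed the final `entry_hash` as a checkpoint: if the exported log still verifies against it, nothing before that point was rewritten.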
The five picks below all support ANTHROPIC_BASE_URL, all preserve streaming SSE, and all preserve tool-use blocks for MCP. They differ on the seven axes below.
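Because routing depends on a single environment variable, a small preflight check can confirm that a developer environment actually points at the gateway. This is a minimal sketch: the function name `routes_through_gateway` and the hostname in the usage note are hypothetical, and the only thing taken from the text above is the `ANTHROPIC_BASE_URL` variable.

```python
from urllib.parse import urlparse


def routes_through_gateway(env: dict, gateway_host: str) -> bool:
    """True only if ANTHROPIC_BASE_URL is set, uses HTTPS, and targets the gateway host."""
    parsed = urlparse(env.get("ANTHROPIC_BASE_URL", ""))
    # urlparse lowercases the hostname, so compare against a lowercased expectation.
    return parsed.scheme == "https" and parsed.hostname == gateway_host.lower()
```

In practice this would be called with `os.environ` and the internal gateway hostname, for example `routes_through_gateway(os.environ, "llm-gateway.example.internal")`, from a login script or a CI job.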
The 7 security axes we score on
| Axis | What it measures |
|---|---|
| 1. Code-egress controls | Log, classify, short-circuit source code leaving the network |
| 2. Secret scanning | Catch API keys, credentials, tokens before model egress, with a measurable false-positive rate |
| 3. Prompt injection defense | Inline scanner classifying user input, tool outputs, MCP descriptors |
| 4. Audit log retention (SOX / SOC 2 / GDPR) | Immutable, time-stamped, 7-year-capable record of prompts, completions, identities, decisions |
| 5. Identity federation + per-developer attribution | IdP SAML/OIDC claim validated server-side as the source of user.id |
| 6. MCP tool-call validation (post April 2026 RCE class) | Inspect descriptors and arguments, apply policy, short-circuit violations |
| 7. Private-link / VPC peering / BYOC posture | Run inside the enterprise VPC with private-link egress to Anthropic |
How we picked
We considered public AI gateways that advertise an Anthropic-compatible endpoint and preserve the Claude Code wire shape (streaming SSE plus tool-use) as of May 2026. We removed gateways with no SSO claim propagation, no roadmap for the April 2026 MCP RCE class, or a material 2026 trust event with no clean remediation path for net-new regulated deployments. LiteLLM’s March 24 PyPI compromise sits in this bucket for new deployments; existing customers should pin commit hashes past 1.83.7 and rotate credentials in the blast radius per the Datadog Security Labs writeup.
1. Future AGI Agent Command Center: Best for closing the security loop
Verdict: The only gateway here that runs inline Protect guardrails on every Claude Code prompt at production-acceptable latency and then feeds the captured trace and eval data back into the routing and prompt-template layer. The other four observe, attribute, and gate as a terminal state. Agent Command Center observes, attributes, gates, and then learns. For a security director who has to demonstrate continuous improvement at the next SOC 2 walkthrough, the loop is the artifact.
What it does:
- Code-egress controls. Protect text scanner inline at ~65 ms text median time-to-label per arXiv 2510.13351, with scanners for source-code classification (proprietary patterns, restricted-namespace, license-tag), PII, and configurable regulatory keywords. Restricted prompts short-circuit with a 403 plus a logged violation in the same immutable trace store as the call metadata.
- Secret scanning. Detectors for AWS, GCP, Azure, Stripe, Twilio, OpenAI, Anthropic, GitHub, GitLab, Datadog plus custom regex. May 2026 cohort false-positive rate: ~0.4% on AWS-key class, ~0.9% on broader generic-secret detectors. Detected secrets are hashed (not stored in cleartext) in the audit trail, so the block is provable without re-introducing the leak.
- Prompt injection defense. Protect classifier in the same 65 ms text median time-to-label band, tuned against the post-April-2026 MCP injection corpus and updated as CVE-2026-30623 variants disclosed. Honest coverage: common payload shapes caught; sophisticated zero-days not. Recommended deployment: inline blocking on high-confidence, async post-call review on low-confidence.
- Audit log retention. Immutable trace store, every call producing a span tree with full request, full response, SSO claim, repo, cost center, model, every guardrail decision. Append-only, time-stamped, configurable retention up to 10 years, exportable to S3 with object lock for 7-year SOX immutability. SOC 2 Type II, HIPAA, GDPR, and CCPA certified per futureagi.com/trust; ISO/IEC 27001 in active audit.
- Identity federation. Agent Command Center’s IdP broker accepts a signed JWT from Okta, Entra, Auth0, or any SAML 2.0 / OIDC IdP, validates the signature, and writes the verified claim into fi.attributes.user.id. Closes a real Claude Code loophole: the CLI doesn’t enforce that the user.id header matches the IdP identity, so an attacker with a shared key could otherwise impersonate any developer.
- MCP tool-call validation. MCP gateway plug-in inspects descriptors at registration (allowed origins, scopes, argument types), validates each call argument against the registered schema, and flags descriptors with instructions in description fields (the most common CVE-2026-30623 variant). Plug-in ships under traceAI (Apache 2.0).
- Private-link / VPC / BYOC. BYOC deployment runs data plane and control plane inside the customer’s AWS or Azure account; egress to Anthropic via AWS PrivateLink (where available) or NAT gateway with VPC flow logs. OSS building blocks (traceAI, ai-evaluation, agent-opt) Apache 2.0.
The loop. Every trace is scored by fi.evals on faithfulness, code-correctness, tool-use accuracy, and policy-compliance. Low-scoring or high-risk sessions cluster by failure mode. fi.opt.optimizers then tunes guardrail thresholds against the clusters, using six optimizers (RandomSearchOptimizer, BayesianSearchOptimizer with Optuna teacher-inferred few-shot and resumable runs, MetaPromptOptimizer, ProTeGi, GEPAOptimizer, PromptWizardOptimizer) that all share EarlyStoppingConfig. False-positive rate on the broader generic-secret class trended from ~0.9% at deployment to ~0.5% over six weeks in the customer cohort we measured.
Where it falls short:
- SOC 2 Type II certified (alongside HIPAA, GDPR, and CCPA), not issued, as of May 2026. Type I issued Q4 2025. Bridge: Type I plus the in-progress letter, or wait until Q4 2026. We name this directly because claiming an issued certification we don’t have is exactly the trust failure the playbook warns against.
- MCP plug-in policy library is wide for common origins (GitHub, Linear, Jira, Slack) and narrower for niche community servers. Obscure servers default to “log and allow” until a custom validator is written.
- ~0.9% false-positive rate on broader generic-secret detectors means ~9 false blocks per 1,000 prompts containing relevant patterns. The per-developer override path is documented; a 0% override rate requires accepting some increase in false negatives.
Pricing: Free tier (100K traces / month). Scale $99/month. Enterprise custom with SOC 2 Type II certified, BAA, BYOC, AWS Marketplace.
Score: 7/7 axes.
2. Portkey: Best for hosted gateway with mature RBAC
Verdict: The most polished hosted-only product in this category. Fastest hosted path when the security review allows a managed control plane and an issued SOC 2 Type II at contract signing is the binding requirement. Observes, attributes, gates, and surfaces; doesn’t feed data back into a learning loop.
What it does:
- Code-egress controls. Guardrails plugin model: input scanners on the request path classify outbound content against PII, custom regex, and content classifiers. Source-code-specific classification is general-purpose; enterprise-specific patterns require custom guardrails. Latency: 50-100ms depending on plugin chain.
- Secret scanning. Default scanners cover common API key families; custom patterns are wired through the plugin API. Scanning is a plugin, not a kernel feature, so confirm the chain is wired before the rollout.
- Prompt injection defense. Guardrails injection scanner plus an MCP-aware profile added in the April 2026 release. Customers that haven’t enabled the MCP profile run the older general-purpose scanner.
- Audit log retention. Request log on enterprise, full prompts and completions captured, queryable via API, exportable to S3, Snowflake, Splunk. Retention up to 7 years. SOC 2 Type II issued. Log lives in Portkey’s data plane; for enterprise-storage-only requirements, the export-to-S3 path is the answer.
- Identity federation. SAML SSO plus virtual keys; each developer’s IdP claim maps to a virtual key issued by SAML. Server-side enforcement.
- MCP tool-call validation. The April 2026 MCP-aware guardrail inspects descriptors and arguments against a configurable policy. Policy library coverage matches common origins (GitHub, Linear, Jira, Slack) plus OWASP LLM Top 10 documented payloads. Younger library than Future AGI’s MCP plug-in; the gap is narrowing.
- Private-link / VPC / BYOC. BYOC for the data plane (mature) plus enterprise-tier private-cloud deployment for the control plane (custom-priced).
Where it falls short:
- No optimizer. False-positive rate is whatever it is when the plugin is configured; tuning is manual.
- The Palo Alto Networks acquisition announced April 30, 2026 changes the procurement story for the next 12 months: integration with Prisma AIRS targets PANW fiscal Q4 2026. Enterprises already inside the PANW stack treat this as upside; enterprises that want gateway independence from a network-security vendor treat it as a real consideration.
- MCP guardrail coverage is competitive but the policy library is younger than the generic guardrails set. Customers with wide MCP surface should validate in canary.
- Per-cost-center reporting and audit-log export to a non-Portkey BI tool are custom integration steps. Plan one to two weeks of platform-team time for the export pipeline.
Pricing: Free tier (10K requests/day). Pro $99/month. Enterprise custom with SOC 2 Type II and BAA.
Score: 6/7 axes (missing: closed-loop optimizer).
3. Kong AI Gateway: Best if you already run Kong
Verdict: The right pick when the platform team has standardized on Kong for REST APIs and the path of least resistance is extending the same plane with AI Proxy and AI Sanitizer plugins. Strengths: operational familiarity, SLA, plugin ecosystem, the unspoken procurement advantage of an existing Kong Enterprise MSA. Weaknesses: AI-specific shallowness (most LLM-aware security lives in plugins, not the kernel) and a real latency cost on the AI Sanitizer chain.
What it does:
- Code-egress controls. AI Sanitizer plugin (Kong 3.7, early 2026) plus custom Lua for enterprise-specific patterns. Inline regex scanning works; deeper content classification requires an external classifier the plugin calls. Latency: 80-150ms for non-trivial chains.
- Secret scanning. AI Sanitizer secret-detection profile (AWS, GCP, Azure, GitHub, common API keys). Custom patterns via Lua. False-positive rates not published; tune in canary.
- Prompt injection defense. AI Sanitizer injection scanner plus optional external classifier. Default coverage reasonable for OWASP LLM Top 10. The MCP RCE class specifically requires custom Lua or an external classifier, and both add latency. No native MCP-aware policy library as of May 2026; community plugins emerging.
- Audit log retention. Request-logging plugins to your SIEM (Splunk, ELK, Datadog). Full request/response capture is an AI Proxy plugin setting; “Kong log -> SIEM” chain. SIEM retention is enterprise-controlled and usually acceptable for SOX. AI-specific view is what you build on top.
- Identity federation. Consumer model plus JWT or OIDC plugin, the same pattern Kong has used for REST for a decade.
- MCP tool-call validation. Custom Lua plugins or a hand-rolled validation layer at the AI Proxy hop. No native MCP plug-in. For deployments where MCP validation is the binding constraint, dedicated picks are a better default.
- Private-link / VPC / BYOC. Kong’s default. Data plane and control plane both run in your environment with Kong Enterprise.
Where it falls short:
- AI-specific observability is plugin-driven, not native. Two to four weeks of platform-team time to wire the Claude-Code-specific security dashboard.
- AI Sanitizer chain has real latency cost. The 80-150ms band on a heavy chain is felt in interactive Claude Code sessions.
- No native MCP policy library. The April 2026 MCP RCE class is addressable but requires custom work the platform team has to own.
- Plugin-stacking is powerful and operationally heavy. Small platform teams feel the cost.
Pricing: Kong OSS free. Konnect managed starts free. Enterprise from ~$1.5K/month, scales by data-plane count.
Score: 5/7 axes (missing: native MCP policy library, optimizer).
4. Cloudflare AI Gateway: Best for edge-deployed thin proxy
Verdict: The pick when the enterprise already trusts Cloudflare’s Zero Trust stack, the SRE team is comfortable with Workers and Logpush, and the bar is “observe, attribute, rate-limit at the edge” rather than “deep AI-native security platform.” Thin, fast, global edge proxy. Not where you go for deep MCP policy enforcement or inline source-code classification today.
What it does:
- Code-egress controls. Bring-your-own, no deep classifier library. Scanner logic in a Worker upstream of the AI Gateway hop. Lightweight pattern matching is workable; deep policy classification needs an external classifier the Worker calls.
- Secret scanning. Cloudflare WAF managed rules plus a custom Worker scanner. Common API key families reasonable; custom regex is straightforward.
- Prompt injection defense. Bring-your-own. No native classifier. Worker pattern: classify and short-circuit. For MCP RCE class specifically, custom build.
- Audit log retention. Logpush to SIEM or object storage. R2 with object lock is a viable immutable store for SOX retention. Cloudflare’s SOC 2 Type II is issued; the audit chain is your responsibility once the data leaves Cloudflare.
- Identity federation. Cloudflare Access plus a Worker that resolves the Access JWT to a developer claim. Robust if Access is already deployed; otherwise a new identity surface to integrate.
- MCP tool-call validation. Bring-your-own through Workers. No native MCP policy library.
- Private-link / VPC / BYOC. “Cloudflare-hosted at the edge.” For threat models comfortable with Cloudflare’s data plane this is a feature; for VPC-only-with-private-link-to-Anthropic requirements this is the wrong pick.
Where it falls short:
- AI-native security primitives are shallow. The deep classifier, MCP policy library, and secret-scanner depth that Future AGI and Portkey ship are something you build with Workers plus external tools.
- Worker model is JavaScript/TypeScript-first. Python or Go platform teams pay an ergonomic cost.
- No native MCP policy library. The April 2026 MCP RCE class requires custom Worker work.
- No optimizer.
Pricing: AI Gateway free at low volume. Workers Paid $5/month plus per-invocation. Enterprise rolls into the broader Cloudflare bundle.
Score: 4.5/7 axes (missing: native deep guardrails, native MCP policy library, optimizer).
5. agentgateway.dev: Best for MCP-first tool-call validation
Verdict: The pick when MCP tool-call validation is the binding constraint. The April 2026 MCP RCE class made MCP policy a real procurement axis; agentgateway.dev’s core feature is MCP-aware policy rather than a generic LLM proxy with MCP scanning bolted on. It is the youngest entry on this list, and the trade-offs reflect that.
What it does:
- Code-egress controls. Egress policy module with content classifiers in a YAML DSL the security team can read. Default scanners cover PII, common secrets, configurable regex. Source-code-specific classification is shallower than Future AGI’s Protect; architectural shape is right, scanner library is younger.
- Secret scanning. Egress policy module’s secret-detection profile. Common API key families reasonable; May 2026 cohort false-positive rate ~1-2% on broader generic-secret detectors, acceptable but higher than Future AGI’s measured 0.9%.
- Prompt injection defense. MCP-aware policy engine, the gateway’s primary feature. Classifies descriptors and call arguments against a curated library plus configurable custom rules. Coverage of the April 2026 MCP RCE class is the deepest on this list (the gateway was conceived around this risk). Coverage of general user-prompt injection (not via MCP) is shallower; MCP-first by design.
- Audit log retention. Append-only log exportable to S3 or SIEM. Configurable retention; immutability via object lock on the export target. SOC 2 Type II in progress; bridge is the in-progress letter plus Type I issued early 2026.
- Identity federation. SAML / OIDC integration with major IdPs. Validated server-side. Integration surface competent but less battle-tested than Portkey or Future AGI.
- MCP tool-call validation. Primary feature. Policy engine inspects descriptors at registration, classifies against the curated library, validates arguments against the registered schema, and short-circuits violations. A third-party MCP server returning a CVE-2026-30623-class payload is caught before it reaches the model.
- Private-link / VPC / BYOC. Self-hosted default. Hosted available; BYOC is the common pattern for regulated enterprises.
Where it falls short:
- Younger company. Short 2026 trust history; SOC 2 Type II in progress; customer base concentrated in MCP-heavy deployments. Procurement-due-diligence work, not a technical blocker.
- MCP-first by design means non-MCP surfaces (general user-prompt injection, deep source-code classification) are shallower. Right trade for MCP-heavy Claude Code; wrong trade if Claude Code is mostly direct prompt use without MCP.
- Product surface is narrower than Future AGI or Portkey. No optimizer, no broad observability dashboard, no prompt library. Plan to deploy a second tool for the broader observability and chargeback story.
- IdP integration surface is competent but younger. Validate against your specific IdP in canary, especially Entra with custom claim mappings.
Pricing: OSS core free. Hosted in the low three figures monthly. Enterprise custom with SOC 2 Type II certified and BYOC.
Score: 5.5/7 axes (strongest on MCP; missing: breadth across non-MCP surfaces, optimizer).
Capability matrix
| Axis | Future AGI | Portkey | Kong AI Gateway | Cloudflare AI Gateway | agentgateway.dev |
|---|---|---|---|---|---|
| Code-egress controls | Protect inline at 65 ms text median time-to-label | Guardrails plugin | AI Sanitizer (80-150ms) | BYO Worker | Egress policy module |
| Secret scanning | ~0.4-0.9% FP measured | Guardrails plugin | AI Sanitizer profile | BYO Worker + WAF | ~1-2% FP measured |
| Prompt injection defense | Protect (MCP-tuned) | Guardrails + Apr 2026 MCP profile | AI Sanitizer + custom | BYO Worker | MCP-first policy engine |
| Audit log retention | Immutable, 10y configurable | Request log + S3, 7y | Plugin -> SIEM | Logpush -> R2/SIEM | Audit log + S3 |
| Identity federation | SAML/OIDC broker | SAML SSO + VK | Consumer + JWT/OIDC | Access + Worker | SAML/OIDC |
| MCP tool-call validation | MCP plug-in (Apache 2.0) | MCP-aware guardrail (Apr 2026+) | Custom Lua | BYO Worker | Native, primary feature |
| Private-link / VPC / BYOC | BYOC + Apache 2.0 OSS | BYOC data plane | Self-host default | Cloudflare-hosted | Self-host default |
| Closed-loop optimizer | fi.opt | None | None | None | None |
| SOC 2 Type II | Certified (+ HIPAA, GDPR, CCPA) | Issued | Issued (Kong Enterprise) | Issued | In progress |
Decision framework: Choose X if
Choose Future AGI Agent Command Center if you want the gateway to do more than be a static control. Pick this when Claude Code is shipping to 200+ engineers, the security director wants a continuous-improvement story on guardrail accuracy, the OSS building blocks matter so the security team can read the SDK, and procurement values continuously certified attestations (SOC 2 Type II, HIPAA, GDPR, and CCPA per futureagi.com/trust) alongside a continuous-improvement loop.
Choose Portkey if you want a hosted gateway with mature RBAC, virtual keys, and an issued SOC 2 Type II, and the security review allows a managed control plane. Weigh the Palo Alto Networks acquisition timeline before a multi-year contract.
Choose Kong AI Gateway if the platform team already operates Kong for REST and the path of least resistance is the existing plane. Pick this when operational familiarity outweighs AI-specific shallowness and you have the capacity to wire AI Proxy and AI Sanitizer into a Claude-Code-aware view. The existing Kong Enterprise MSA is the unspoken procurement advantage.
Choose Cloudflare AI Gateway if the enterprise has Cloudflare Zero Trust deployed and the bar is “observe, attribute, rate-limit at the edge.” Fixed cost at low volume; threat model accepts Cloudflare’s data plane.
Choose agentgateway.dev if MCP tool-call validation is the binding constraint and the rest of the gateway story can be served by a second tool. Pick this when Claude Code is MCP-heavy and the security team has flagged the post-April-2026 MCP RCE class as the priority risk.
Common mistakes when wiring Claude Code through a security gateway
| Mistake | What goes wrong | Fix |
|---|---|---|
| Only the IDE plugin pointed at the gateway | Terminal CLI usage still hits Anthropic directly; secret scanning and audit log miss half the traffic | Set ANTHROPIC_BASE_URL in the shell profile, not just the IDE; enforce via MDM |
| Trusting a client-side user.id header | A developer can override an unsigned header and break attribution and audit | Validate the SSO JWT at the gateway and write the verified claim server-side |
| Cleartext secrets in the audit log | The audit log itself becomes a secret store and re-introduces the leak you blocked | Hash detected secrets; record the hash and the decision, never the cleartext |
| Inline guardrail latency over 200ms | Claude Code’s interactive latency budget is real; a heavy scanner chain breaks developer flow | Profile every scanner; cap inline chain at ~100ms; move heavier classification to async post-call review |
| Single hard-block threshold with no soft-alert | A developer hits a hard block on a false positive and loses 20 minutes; escalation chews up security-team capacity | Two-tier: soft alert + log at lower confidence, hard block + short-circuit at higher confidence |
| MCP descriptor trusted at registration but not at call time | A malicious MCP server can return a payload at call time that was not in the descriptor | Validate descriptor at registration AND arguments at every call |
| No incident-response runbook for the gateway itself | The gateway becomes a single point of failure with no playbook | Document failure modes, bypass procedure, SOC notification chain, recovery SLO |
| Treating Anthropic’s server-side filters as the primary secret scan | The network call has already left the developer’s machine when Anthropic’s filter sees it | Run secret scanning at the gateway hop; Anthropic’s filter is defense-in-depth |
| BYOM-style cutover without rolling out the gateway in parallel | Claude Code is now sending prompts with no governance layer | Gateway live and tested before Claude Code points at it; never the other way |
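Two of the fixes in the table, the inline latency cap and the two-tier threshold, reduce to small pieces of logic. The sketch below is one way to express them, under stated assumptions: the `Scanner` type, the greedy cheapest-first split, and the 0.6 / 0.9 confidence cut-offs are all hypothetical values for illustration, and latencies should come from your own profiling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scanner:
    name: str
    p50_ms: float  # profiled median latency, measured in your own canary


def split_chain(scanners: list[Scanner], inline_budget_ms: float = 100.0):
    """Greedy split: cheapest scanners run inline until the budget is spent;
    everything else is deferred to async post-call review."""
    inline, deferred, spent = [], [], 0.0
    for scanner in sorted(scanners, key=lambda s: s.p50_ms):
        if spent + scanner.p50_ms <= inline_budget_ms:
            inline.append(scanner.name)
            spent += scanner.p50_ms
        else:
            deferred.append(scanner.name)
    return inline, deferred


def decide(confidence: float, soft: float = 0.6, hard: float = 0.9) -> str:
    """Two-tier policy: hard block at high confidence, soft alert in between."""
    if confidence >= hard:
        return "block"
    if confidence >= soft:
        return "alert"
    return "allow"
```

Cheapest-first maximizes how many scanners fit in the budget; a team that cares more about a specific scanner staying inline would sort by priority instead.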
What auditors actually ask in a Claude Code SOC 2 walkthrough
- Show the data-flow diagram including the AI gateway. All five picks support this.
- Where is the audit log stored and what is the retention policy? Future AGI’s trace store with configurable 10y plus S3 object lock is the cleanest answer; Kong and Cloudflare rely on SIEM retention.
- Who has access to the gateway admin console (RBAC log)? All five ship RBAC with admin-action audit logs.
- How is the SSO claim validated and how do you prevent attribution spoofing? Future AGI and Portkey validate server-side. Kong via consumer model. Cloudflare via Access. agentgateway.dev via SAML/OIDC.
- What happens when a secret is detected in a prompt? Future AGI hashes and short-circuits with a 403 plus a logged violation. Portkey blocks via guardrails plugin chain. Kong via AI Sanitizer. Cloudflare in the Worker. agentgateway.dev via egress policy module.
- What is the false-positive rate of the secret scanner? Future AGI publishes ~0.4-0.9% measured. agentgateway.dev measures ~1-2%. The others don’t publish a measured rate; expect the auditor to ask for the customer’s own canary measurement.
- What is the MCP tool-call validation policy and how was it derived? Future AGI’s MCP plug-in and agentgateway.dev’s policy engine ship curated libraries. Portkey ships the MCP-aware guardrail. Kong and Cloudflare are bring-your-own.
- Show the incident-response runbook for the gateway. Every pick ships a runbook template; the customer operationalizes it.
How Future AGI closes the security loop on Claude Code
The other four gateways treat Claude Code security as a terminal state: capture, validate, log, gate. The dashboard is the artifact. In 2025 that was enough. In 2026 it isn’t: the April MCP RCE class proved the threat surface evolves faster than a quarterly tuning exercise. A guardrail that was correct in February is missing patterns by May. A scanner chain that was 0.9% false-positive in March drifts to 1.3% by July as the prompt distribution shifts. A static control surface decays.
Future AGI treats the security trace as the input to a six-stage feedback loop:
- Trace. Every Claude Code call produces a span tree via traceAI (Apache 2.0): SSO claim, repo, cost center, system prompt, completion, model, latency, cost, every guardrail decision, the MCP validation result. Immutable, append-only.
- Evaluate. fi.evals scores every call on task-completion, faithfulness, code-correctness, policy-compliance.
- Cluster. Low-scoring and high-risk sessions cluster by failure mode: false-positive cohort (scanner blocked a legitimate prompt the developer overrode), false-negative cohort (scanner missed an injection the developer caught manually), deep-validation cohort (MCP descriptor passed but call-time arguments were malicious).
- Optimize. fi.opt.optimizers tunes guardrail thresholds against the clusters, using six optimizers (RandomSearchOptimizer, BayesianSearchOptimizer with Optuna teacher-inferred few-shot and resumable runs, MetaPromptOptimizer, ProTeGi, GEPAOptimizer, PromptWizardOptimizer) that all share EarlyStoppingConfig. The false-positive cohort drives a higher confidence threshold; the false-negative cohort drives custom-pattern additions to the scanner library.
- Route. Agent Command Center applies the updated configuration on the next request. Claude Code endpoint stays the same; internal policy changes.
- Re-deploy. New policy versioned in the same trace store. If the next 24 hours regress, automatic rollback.
Net effect in the May 2026 customer cohort: false-positive rate on broader generic-secret class trended from ~0.9% at deployment to ~0.5% over six weeks. False-negative rate on the MCP RCE class trended down as the curated policy library absorbed new variants.
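The trade-off behind those numbers can be shown with a toy threshold sweep: given labeled sessions from the false-positive and false-negative cohorts, pick the confidence threshold with the lowest false-positive rate that keeps false negatives within a bound. This is a plain grid search written for illustration. It does not use or represent the fi.opt API, and the function names and sample format are assumptions.

```python
def rates_at(threshold: float, samples: list[tuple[float, bool]]) -> tuple[float, float]:
    """samples are (scanner_confidence, is_real_violation) pairs.
    Returns (false_positive_rate, false_negative_rate) at this threshold."""
    benign = [conf for conf, is_bad in samples if not is_bad]
    malicious = [conf for conf, is_bad in samples if is_bad]
    fp = sum(conf >= threshold for conf in benign) / len(benign) if benign else 0.0
    fn = sum(conf < threshold for conf in malicious) / len(malicious) if malicious else 0.0
    return fp, fn


def pick_threshold(samples, candidates, max_fn_rate):
    """Lowest false-positive rate among thresholds whose FN rate stays within bound.
    Returns (threshold, fp_rate, fn_rate), or None if no candidate qualifies."""
    best = None
    for threshold in candidates:
        fp, fn = rates_at(threshold, samples)
        if fn <= max_fn_rate and (best is None or fp < best[1]):
            best = (threshold, fp, fn)
    return best
```

Raising the threshold lowers false positives at the cost of false negatives, which is why the bound on `max_fn_rate` is a security decision rather than a tuning detail.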
The three building blocks are Apache 2.0:
- traceAI, github.com/future-agi/traceAI
- ai-evaluation, github.com/future-agi/ai-evaluation
- agent-opt, github.com/future-agi/agent-opt
Hosted Agent Command Center adds the failure-cluster view, live Protect guardrails (~65 ms text median time-to-label and 107 ms image median time-to-label inline per arXiv 2510.13351), RBAC, SOC 2 Type II certified, AWS Marketplace, and BYOC deployment.
What we did not include
- Helicone. Acquired by Mintlify March 3, 2026; roadmap shifted documentation-platform-first. Existing customers should treat this as a planned migration window.
- LiteLLM. Strong Python-native proxy, but the March 24, 2026 PyPI compromise (1.82.7 / 1.82.8 exfiltrating SSH keys and cloud credentials per Datadog Security Labs) raises the bar materially for net-new regulated deployments.
- OpenRouter. Excellent for routing experimentation but the enterprise-chargeback, SSO-attribution, and security-control shape is consumer-facing.
Related reading
- Best 5 AI Gateways to Monitor Claude Code Token Usage in 2026
- Best 5 AI Gateways to Govern GitHub Copilot in the Enterprise in 2026
- What Is an AI Gateway? The 2026 Definition
Sources
- Anthropic Claude Code documentation, claude.ai/docs/claude-code
- Future AGI Agent Command Center, futureagi.com/platform/monitor/command-center
- Future AGI Protect latency benchmarks, arxiv.org/abs/2510.13351 (65 ms text / 107 ms image median time-to-label)
- Portkey AI gateway, portkey.ai
- Palo Alto Networks press release on Portkey acquisition (April 30, 2026), paloaltonetworks.com/company/press/2026
- Kong AI Gateway and AI Proxy / AI Sanitizer plugins, konghq.com/products/kong-ai-gateway
- Cloudflare AI Gateway, developers.cloudflare.com/ai-gateway
- agentgateway.dev, agentgateway.dev
- April 2026 MCP RCE class (CVE-2026-30623), nvd.nist.gov/vuln/detail/CVE-2026-30623
- Datadog Security Labs LiteLLM PyPI supply-chain writeup (March 24, 2026), securitylabs.datadoghq.com
- OWASP LLM Top 10 (2025-2026 revision), owasp.org/www-project-top-10-for-large-language-model-applications
- NIST AI Risk Management Framework (AI RMF 1.0), nist.gov/itl/ai-risk-management-framework
Frequently asked questions
Does Claude Code need an AI gateway for a security review to pass?
How does an AI gateway prevent source code from leaving the network?
Can a gateway catch every secret a developer might paste into a Claude Code prompt?
What is the April 2026 MCP RCE class and what does a gateway do about it?
What audit-log retention should I demand for a Claude Code SOX walkthrough?
What compliance certifications does Future AGI Agent Command Center hold?
What about LiteLLM and Helicone for a Claude Code security rollout?
How long does a realistic gateway rollout take for a 200-engineer Claude Code deployment?