Best 5 AI Gateways for Opencode Token Tracking and Access Controls in 2026

Five AI gateways scored on Opencode token tracking and access controls in 2026: per-dev attribution, per-repo budgets, audit logs, self-host posture, gaps.

March 16, 2026

A platform team running Opencode across 60 engineers can spend $52,000 on model tokens in a quarter, hand finance a single aggregate invoice, and have no way to answer “which repo, which developer, which model?” By the time the bill is queried, the audit trail is gone. For an OSS-leaning team that picked Opencode precisely to keep the stack auditable, that’s the opposite of why the tool was chosen.

An AI gateway in front of Opencode fixes this. It intercepts the model API calls (OpenAI, Anthropic, Bedrock, or a self-hosted vLLM endpoint), attaches per-developer, per-repo, and per-session metadata, and produces both the chargeback table and the access log finance and security are asking for. The five gateways below all do that. They don’t all do it the same way, and only one of them turns the same trace into a feedback loop that brings token usage down over time.

This is the 2026 cohort, scored on the seven token-tracking and access-control axes that matter when Opencode is the workload.

TL;DR

Future AGI Agent Command Center is the strongest pick for an AI gateway for Opencode token tracking and access controls because it ships per-developer virtual keys, per-repo span attributes propagated through every provider, per-session aggregation, OTel-native access logs with SSO mapping intact, and Anthropic / OpenAI / Bedrock / vLLM / Ollama all behind one OpenAI-compatible base URL. The other four picks below win on specific edges.

Future AGI Agent Command Center: Best overall. Per-developer, per-repo, and per-session attribution, OTel-native access logs, and provider-mixed routing under one base URL.
LiteLLM: Best for self-hosted OSS coding-agent infrastructure. Self-hosted OSS proxy that runs in your infra with source you can audit; pin commits after the March 24, 2026 PyPI compromise.
Portkey: Best when a hosted product with mature access controls is acceptable. Hosted gateway with the deepest RBAC and virtual keys (verify the Palo Alto Networks acquisition timeline before signing a multi-year contract).
Helicone: Best when you only need token counts and a per-request log. Lightweight per-request observability with minimal infra (treat adoption as a planned migration after the March 3, 2026 Mintlify acquisition).
Kong AI Gateway: Best if platform engineering already runs Kong. API-gateway-grade governance with the AI extension on the same control plane.

Why Opencode needs a gateway in front of it

Opencode is an open-source terminal coding agent (github.com/opencode-ai/opencode) in the same shape as Claude Code, but model-agnostic. It speaks any OpenAI-compatible endpoint, which means a single Opencode install can route to Anthropic via a shim, to OpenAI directly, to a local Ollama or vLLM box, or to a gateway that fans out to all three. Four properties make the workload hard to track without help:

Sessions are long and multi-model. A single bug-fix session can produce 30 to 80 turns, mixing a cheap local model for file edits with a frontier model for the reasoning. Per-call cost telemetry from several providers is meaningless without a session ID to stitch the calls together.
Cost is concentrated in a few power users. In our own usage data across teams piloting Opencode in Q1 2026, the top 10% of users accounted for 49% of token spend. The team average tells you nothing useful.
The OSS audience values self-host highly. Teams that picked Opencode over the closed-source agents generally did so to keep code on internal infra, audit logs in their own SIEM, and prompts off third-party systems. A hosted-only gateway pushes against that grain.
Access controls are part of the brief, not an afterthought. Per-developer attribution is the easy half. Per-repository and per-feature-branch attribution, backed by an audit log that survives a SOC 2 Type II audit, is the hard half, and the half the Opencode user base explicitly cares about.
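
To make the concentration point concrete, here is a minimal sketch of how a top-decile share could be computed from per-developer spend totals. The function name and the sample figures are illustrative assumptions, not data from the pilot teams.

```python
def top_decile_share(spend_by_developer: dict[str, float]) -> float:
    """Return the fraction of total spend attributable to the top 10% of developers."""
    if not spend_by_developer:
        return 0.0
    spends = sorted(spend_by_developer.values(), reverse=True)
    # Ceiling division in integers, so a team smaller than ten still has one "top" developer.
    top_n = max(1, (len(spends) + 9) // 10)
    total = sum(spends)
    return sum(spends[:top_n]) / total if total else 0.0


# Hypothetical ten-person team: one heavy user, nine light users (USD).
example_spend = {"dev01": 450.0, **{f"dev{i:02d}": 50.0 for i in range(2, 11)}}
example_share = top_decile_share(example_spend)
```

In this made-up team, one developer out of ten accounts for half of the spend, which is the kind of skew a team average hides.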

A gateway sits between the Opencode client and the model endpoint. It intercepts each call, applies a metadata header (developer ID from SSO, repo from git remote, session ID from Opencode’s session token), and forwards the request. Metadata makes per-call cost attributable. The interception point makes caps, alerts, audit logs, and routing possible. Self-host posture determines whether this is acceptable to a security-conscious OSS team in the first place.
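
The metadata step can be sketched as a small helper that a shim might call before forwarding a request. This is a sketch under assumptions: `x-session-id` appears later in this post, while `x-developer-id` and `x-repo` are hypothetical header names that you would replace with whatever your gateway expects.

```python
import subprocess


def git_remote_url(cwd: str = ".") -> str:
    """Read the origin remote of the repo at cwd; return '' if it cannot be read."""
    try:
        result = subprocess.run(
            ["git", "-C", cwd, "config", "--get", "remote.origin.url"],
            capture_output=True, text=True, check=False,
        )
    except OSError:  # git is not installed or not on PATH
        return ""
    return result.stdout.strip()


def attribution_headers(developer_id: str, session_id: str, repo: str) -> dict[str, str]:
    """Build the attribution headers, dropping any value that is empty."""
    headers = {
        "x-developer-id": developer_id,  # hypothetical name; sourced from the SSO claim
        "x-session-id": session_id,      # the Opencode session token
        "x-repo": repo,                  # hypothetical name; the git remote URL
    }
    return {name: value for name, value in headers.items() if value}
```

A shim would call `attribution_headers(email, session, git_remote_url())` and merge the result into the outgoing request headers.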

For the rest of this post, “gateway” means an AI gateway that speaks an OpenAI-compatible API. All five picks support pointing Opencode at them by changing the baseURL in the provider config.

The 7 axes we score on

The default “best AI gateway” axes (provider breadth, routing, fallback, observability, cost, security, deployment) are too generic for Opencode. We scored each pick on seven axes that specifically affect OSS coding-agent token tracking and access controls.

Axis	What it measures
1. Per-session token attribution	Can the gateway group token counts by Opencode session ID, not just by API key?
2. Per-developer chargeback	Can it tag and aggregate by developer email or SSO claim across providers?
3. Per-repository tagging	Can it attribute usage to a repository or workspace, surviving across model providers?
4. Tool-call passthrough	Does Opencode’s tool-use (bash, file edits, MCP tools) survive the gateway hop intact?
5. Access-control depth	RBAC, virtual keys, audit log, per-team rate limits: the controls a SOC 2 auditor expects
6. Streaming continuity	Does token streaming work without buffer-and-batch behaviour that breaks the CLI UX?
7. Self-host posture	Can the gateway run in your VPC so prompts, code, and audit logs never leave?

The score line at the end of each pick summarises all seven.
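
Axis 1 is easier to reason about with an example. The sketch below groups per-call records by session ID across providers; the record fields (`session_id`, `provider`, `input_tokens`, `output_tokens`) are assumed names, not any gateway's export schema.

```python
from collections import defaultdict


def tokens_by_session(calls: list[dict]) -> dict[str, dict]:
    """Sum token counts per session and note which providers each session touched."""
    totals: dict[str, dict] = defaultdict(
        lambda: {"input_tokens": 0, "output_tokens": 0, "providers": set()}
    )
    for call in calls:
        bucket = totals[call["session_id"]]
        bucket["input_tokens"] += call["input_tokens"]
        bucket["output_tokens"] += call["output_tokens"]
        bucket["providers"].add(call["provider"])
    return dict(totals)
```

Grouping by API key alone would merge every session behind that key into one row; the session ID is what keeps a local-model turn and a frontier-model turn in the same bucket.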

How we picked

We started from the universe of public AI gateways with an OpenAI-compatible endpoint as of May 2026. We removed gateways that didn’t preserve tool calls (two early proxies batched streaming and lost the tool-use block). We removed gateways without audit-log export, because access controls without auditability are theatre. We removed hosted-only gateways with no self-host story, because the Opencode reader profile rejects those at the outset. The remaining five are the cohort below.

1. Future AGI Agent Command Center: Best for per-developer / per-repo Opencode attribution

Verdict: Future AGI exposes per-developer virtual keys via Opencode’s OpenAI-compatible base URL, per-repo span attributes propagated through every provider, per-session aggregation, and OpenTelemetry-native access logs with SSO mapping intact. Anthropic, OpenAI, Bedrock, vLLM, and local Ollama all sit behind one base URL so a single Opencode install can mix providers without losing per-developer chargeback.

What it does for Opencode token tracking and access controls:

Per-session traces with the Opencode session ID exposed as a top-level span attribute. Each turn becomes a child span, so you can see which turn ballooned context to 180K tokens against a local Llama 3.3 before falling back to Sonnet 4.6.
Per-developer aggregation through fi.attributes.user.id, populated from your SSO claim at the gateway hop. The SSO mapping survives across the OpenAI, Anthropic, and self-hosted endpoints Opencode hits.
Per-repository tagging through arbitrary span attributes. Wire repo=<git remote url> into the gateway forwarding rule and the dashboard groups by repo across providers.
Tool-call passthrough preserved because the gateway parses the tool-use blocks rather than re-serialising them. MCP tool calls survive intact, which matters because Opencode’s tool surface is MCP-native.
Access-control depth: RBAC roles (admin / developer / read-only), virtual keys per developer or team, per-key rolling budget caps, immutable audit log exported to SIEM via OTel, SAML/OIDC SSO. SOC 2 Type II certified.
Streaming continuity maintained. SSE pass-through doesn’t buffer.
Self-host posture through BYOC plus the Apache 2.0 traceAI library. Prompts, code, and audit logs stay inside your VPC end-to-end.
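
A per-key rolling budget cap, as listed above, can be illustrated with a small in-memory sketch. This is not Future AGI's implementation; the class name, the window semantics, and the caller-supplied clock are all assumptions made for the example.

```python
from collections import deque


class RollingBudget:
    """Reject spend that would push the total over cap_usd within the last window_seconds."""

    def __init__(self, cap_usd: float, window_seconds: float):
        self.cap_usd = cap_usd
        self.window_seconds = window_seconds
        self._events: deque = deque()  # (timestamp, cost_usd), oldest first

    def try_spend(self, cost_usd: float, now: float) -> bool:
        # Drop events that have aged out of the rolling window.
        while self._events and self._events[0][0] <= now - self.window_seconds:
            self._events.popleft()
        spent = sum(cost for _, cost in self._events)
        if spent + cost_usd > self.cap_usd:
            return False  # the caller should block or downgrade the request
        self._events.append((now, cost_usd))
        return True
```

A gateway would typically hold one such object per virtual key and pass the request's estimated cost and the current time to `try_spend`.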

The loop. Every captured trace gets scored by fi.evals (faithfulness, code-correctness, tool-use accuracy). traceAI instruments 50+ AI surfaces across Python, TypeScript, Java, and C# (including the Spring Boot starter, Spring AI, LangChain4j, and Semantic Kernel) and is OpenInference-native. Error Feed, the clustering and what-to-fix layer of the eval stack that feeds the self-improving evaluators, sits alongside as a zero-config error monitor. It auto-clusters related per-developer and per-repo Opencode failures into named issues (50 traces → 1 issue), writes a root cause, a quick fix, and a long-term recommendation for each issue, and tracks a rising/steady/falling trend per issue so emerging regressions surface like exceptions rather than staying buried in audit logs. Low-scoring sessions become a failure dataset that fi.opt.optimizers uses to rewrite the system prompt or adjust the routing policy. The typical Opencode optimization is a three-tier rule: turns under 8K input tokens go to a local model, turns above 60K or with deep tool-use go to Claude Opus 4.7 or GPT-5.5 Pro, and everything in between goes to Sonnet 4.6. That single rule typically pulls token spend down 18-32% in the first four weeks. The Future AGI Protect model family runs inline at ~65 ms p50 for text and ~107 ms p50 for images (arXiv 2510.13351). It consists of FAGI’s own fine-tuned Gemma 3n adapters for content moderation, bias detection, security/prompt-injection, and data privacy/PII, multi-modal across text, image, and audio: a model family rather than a plugin chain.
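
The three-tier rule described above can be sketched as a single function. The 8K and 60K thresholds come from the text. The tier labels, the `tool_calls` count as a proxy for "deep tool-use", its threshold of 5, and the choice to let tool depth override a small input size are assumptions.

```python
def pick_tier(input_tokens: int, tool_calls: int = 0, deep_tool_threshold: int = 5) -> str:
    """Return 'local', 'mid', or 'frontier' for a single Opencode turn."""
    # Assumption: deep tool-use escalates the turn even when the input is small.
    if input_tokens > 60_000 or tool_calls >= deep_tool_threshold:
        return "frontier"  # e.g. Claude Opus 4.7 or GPT-5.5 Pro
    if input_tokens < 8_000:
        return "local"     # e.g. a model served by Ollama or vLLM
    return "mid"           # e.g. Sonnet 4.6
```

The gateway would map each label to a concrete upstream model, so the thresholds can be tuned without touching the Opencode client.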

Where it falls short:

agent-opt is opt-in. For one-week pilots focused on per-developer numbers and audit logs, start with traceAI + ai-evaluation and turn the optimizer on once eval baselines stabilize.
The repo-level dashboard assumes you wire the git-remote header at the Opencode shim. The wrapper is a 20-line script, but it isn’t free.

Pricing: Free tier with 100K traces / month. Scale tier starts at $99/month. Enterprise is custom with SOC 2 Type II and a BAA. AWS Marketplace listing for procurement.

Score: 7/7 axes.

2. LiteLLM: Best for self-hosted OSS coding-agent infrastructure

Verdict: LiteLLM is the natural fit when the team that picked Opencode also wants its gateway to live entirely on its own infrastructure. It’s source-available, Python-native, runs as a proxy on your nodes, and the access-control surface (virtual keys, team-id, budgets, audit log) is enough for most SOC 2 evidence requests. You give up the polish of the hosted alternatives, but the source is yours.

What it does for Opencode token tracking and access controls:

Per-session traces through LiteLLM’s metadata pass-through. Wire metadata.session_id to Opencode’s session token in your proxy config.
Per-developer chargeback through team_id and user_id on virtual keys. team_id maps to your SSO claim via the SAML/OIDC integration in the LiteLLM proxy.
Per-repository tagging through custom metadata fields. Slicing by repo requires exporting to your analytics warehouse; the built-in UI doesn’t natively group on custom metadata.
Tool-call passthrough confirmed for Anthropic tool-use blocks and OpenAI function-calls. MCP tools route through correctly via LiteLLM’s OpenAI-compatible endpoint.
Access-control depth: Virtual keys with team_id, user_id, per-key budgets, rate limits, model-allow-lists per key, admin/user split. Audit log via spend-log API; SIEM export is a webhook wire-up.
Streaming continuity works.
Self-host posture is the strongest in this list. Source-available under MIT (enterprise dir licensed separately); no telemetry leaves the VPC by default.

Where it falls short:

No optimizer. The traces inform humans, not the gateway. Wiring an external loop (Future AGI traceAI is the natural sink) is a second moving part.
The UI is functional, not polished. Slicing by developer or repo means a SQL dashboard on the spend-log table; finance won’t accept the raw admin UI as a chargeback source of truth.
March 24, 2026 PyPI supply-chain incident. Versions 1.82.7 and 1.82.8 were compromised; the malicious package exfiltrated SSH keys, cloud credentials, and Kubernetes configs to an attacker-controlled endpoint (Datadog Security Labs writeup). Pin commit hashes or upgrade past 1.83.7. Your security team will ask about this first.
Python runtime; throughput at high concurrency is materially below the Go-native alternatives.
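
The "SQL dashboard on the spend-log table" mentioned above amounts to a grouped sum. The sketch below uses an in-memory SQLite table with a hypothetical schema (`spend_log` with `user_id`, `repo`, `model`, `spend_usd`); LiteLLM's real spend-log schema differs, so treat the column names as placeholders.

```python
import sqlite3

ALLOWED_DIMENSIONS = {"user_id", "repo", "model"}


def demo_connection() -> sqlite3.Connection:
    """Build a throwaway table with made-up rows for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spend_log (user_id TEXT, repo TEXT, model TEXT, spend_usd REAL)"
    )
    conn.executemany(
        "INSERT INTO spend_log VALUES (?, ?, ?, ?)",
        [
            ("ana", "repo-a", "local-llama", 3.0),
            ("ana", "repo-b", "frontier", 1.0),
            ("raj", "repo-a", "frontier", 2.0),
        ],
    )
    return conn


def spend_by(conn: sqlite3.Connection, dimension: str) -> list[tuple]:
    """Total spend grouped by one dimension, largest first."""
    # Column names cannot be bound as parameters, so check against a whitelist.
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dimension}")
    query = (
        f"SELECT {dimension}, SUM(spend_usd) FROM spend_log "
        f"GROUP BY {dimension} ORDER BY SUM(spend_usd) DESC"
    )
    return conn.execute(query).fetchall()
```

In practice the same query would run against the warehouse copy of the spend log, with the repo taken from whatever custom metadata field the shim populates.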

Pricing: Open source under MIT. LiteLLM also sells an Enterprise tier with SLA + SSO + audit log + SCIM; starts around $250/month for small teams.

Score: 5.5/7 axes (missing: native polished dashboard, optimizer).

3. Portkey: Best for hosted gateway with mature access controls

Verdict: Portkey is the most polished hosted product in this category. If your team accepts a hosted gateway in exchange for the fastest path to virtual keys + RBAC + per-team budgets out of the box, Portkey is that path. It observes, routes, and enforces; it doesn’t feed results back into prompt optimization.

What it does for Opencode token tracking and access controls:

Per-session traces through Portkey’s x-portkey-trace-id header. The Opencode shim has to set it; without that, sessions blend at the gateway-key level.
Per-developer chargeback through virtual keys. Each developer or team gets a virtual key; all fan out to one underlying provider key, preserving bulk pricing.
Per-repository tagging through metadata headers (same shim caveat).
Tool-call passthrough confirmed as of May 2026 with claude-opus-4-7, claude-sonnet-4-6, gpt-5.5, and gpt-5.5-pro.
Access-control depth: RBAC, virtual keys, per-key budget caps, rate limits, model-allow-lists per key, audit log, SSO. Four-tier budget hierarchy (workspace / team / virtual-key / model) is the most fine-grained on this list.
Streaming continuity works for SSE; gRPC pass-through is on the roadmap.
Self-host posture through Portkey’s BYOC option (solid but not air-gapped; the control plane phones home for license validation).
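
To show what a four-tier budget hierarchy means in practice, here is a minimal sketch in which a request is admitted only if every tier still has headroom. It is not Portkey's API; the tier keys, the `Budget` class, and the function names are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

TIERS = ("workspace", "team", "virtual_key", "model")


@dataclass
class Budget:
    cap_usd: float
    spent_usd: float = 0.0


def first_exhausted_tier(budgets: dict[str, Budget], cost_usd: float) -> Optional[str]:
    """Return the broadest tier that cannot absorb cost_usd, or None if all can."""
    for tier in TIERS:
        budget = budgets.get(tier)
        if budget is not None and budget.spent_usd + cost_usd > budget.cap_usd:
            return tier
    return None


def charge(budgets: dict[str, Budget], cost_usd: float) -> bool:
    """Record the cost against every tier, but only if no tier would be exceeded."""
    if first_exhausted_tier(budgets, cost_usd) is not None:
        return False
    for budget in budgets.values():
        budget.spent_usd += cost_usd
    return True
```

Reporting which tier blocked the request is what lets an alert say "team budget exhausted" rather than a bare rejection.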

Where it falls short:

No optimizer.
The metadata-header model requires an Opencode wrapper for per-session and per-repo attribution. Without it, you only get key-level aggregation, the same level Anthropic and OpenAI dashboards already give you.
April 30, 2026: Palo Alto Networks announced intent to acquire Portkey; close expected PANW fiscal Q4 2026. Standalone-product continuity is pending integration. For an OSS-leaning team that picked Opencode to stay independent, this is a real consideration.
Pricing escalates above 5M requests/month faster than the open-source alternatives.

Pricing: Free tier with 10K requests/day. Scale tier starts at $99/month. Enterprise is custom with SOC 2 Type II.

Score: 6/7 axes (missing: feedback loop / optimization).

4. Helicone: Best for lightweight per-request observability

Verdict: Helicone is the right pick when you want per-request observability for Opencode and nothing else. Drop the proxy in front of OpenAI or Anthropic, get a per-request token and cost table, move on. If you also need fine-grained access controls or routing intelligence, the other four entries are deeper.

What it does for Opencode token tracking and access controls:

Per-session traces through Helicone-Session-Id. Same shim caveat as the others.
Per-developer chargeback through Helicone-User-Id. Aggregation is simple but the slicing is shallower than Portkey or Future AGI.
Per-repository tagging through custom properties (Helicone-Property-Repo, Helicone-Property-Branch).
Tool-call passthrough confirmed across the major providers Opencode points at.
Access-control depth: Per-key rate limits, usage alerts, basic team/role split. Audit log is per-organization, not per-key; SOC 2 auditors typically ask follow-up questions. Virtual keys are less mature than Portkey’s.
Streaming continuity works.
Self-host posture through Helicone’s open-source self-host. Scale-out beyond a few hundred RPS gets operational.

Where it falls short:

No optimizer, no prompt library.
Routing intelligence is basic (round-robin / failover). Opencode-specific routing (cheap local turns vs. Claude Opus 4.7 hard turns) has to be coded upstream of the proxy.
March 3, 2026: Helicone was acquired by Mintlify; public roadmap has shifted toward a documentation-platform stance. Existing users should treat this as a planned migration window. For a new-on-Helicone team in May 2026, this is the most uncomfortable line item on the procurement checklist.

Pricing: Free tier with 10K requests/month. Pro tier starts at $25/month. Enterprise is custom.

Score: 5/7 axes (missing: feedback loop, mature access controls; post-acquisition uncertainty also counts against it).

5. Kong AI Gateway: Best if you already run Kong

Verdict: Kong AI Gateway is the pick when the API platform team already runs Kong and extending that stack is the path of least resistance. The strengths are SLA, plugin ecosystem, audit-log maturity, and ops familiarity. The weakness is AI-specific shallowness: observability happens via plugins, not natively, and the cost-attribution view your finance team wants is something your platform team has to build.

What it does for Opencode token tracking and access controls:

Per-session traces through OpenTelemetry plugins. Kong’s OTel plugin captures the request lifecycle; wire span attributes through Lua or the AI Proxy plugin (Kong 3.6+).
Per-developer chargeback through consumer + tag patterns. The chargeback dashboard is third-party (typically Grafana on the OTel sink).
Per-repository tagging through tags on the Kong consumer.
Tool-call passthrough works through the AI Proxy plugin (Kong 3.6+).
Access-control depth: This is Kong’s strength. JWT, OAuth 2.1, mTLS, RBAC, ACL plugins, audit log to syslog / OTel / Splunk, rate-limiting plugins with per-consumer and per-route policies. Deeper than every other entry; the AI-specific surface is shallower.
Streaming continuity is supported in 3.6+; verify on your version.
Self-host posture is the entire point of Kong; it runs anywhere.

Where it falls short:

AI-specific observability is plugin-driven, not native. The default dashboard is the API-gateway view, not the LLM-cost view.
No optimizer.
The spend-tracking story requires wiring multiple plugins (AI Proxy + Prometheus + OTel + Grafana). Plan two to three weeks of platform-team time for the chargeback view finance will accept.
For a team not already on Kong, adoption cost is materially higher than the four entries above. This entry is for incumbency, not greenfield.

Pricing: Kong is open source. Kong Konnect (managed) starts free. Enterprise plans for SLA, plugins, and support start around $1.5K/month.

Score: 5/7 axes (missing: native AI observability, optimizer, polished cost dashboard).

Capability matrix

Axis	Future AGI	LiteLLM	Portkey	Helicone	Kong AI Gateway
Per-session attribution	✅ Native	✅ Metadata	✅ Header	✅ Header	✅ Plugin
Per-developer chargeback	✅ Native	✅ Team/user	✅ Virtual key	✅ Header	✅ Consumer
Per-repo tagging	✅ Span attr	✅ Metadata	✅ Metadata	✅ Custom prop	✅ Tags
Tool-call passthrough	✅	✅	✅	✅	✅ (3.6+)
Access-control depth	✅ RBAC + SOC 2 Type II	✅ Virtual key + audit	✅ RBAC + 4-tier budget	⚠️ Basic team	✅ Deep (JWT/mTLS/ACL)
Streaming continuity	✅	✅	✅	✅	✅
Self-host posture	✅ BYOC + OSS	✅ OSS (pin commits)	✅ BYOC	✅ OSS	✅ OSS
Feedback loop / optimizer	✅ `fi.opt`	❌	❌	❌	❌

Decision framework: Choose X if

Choose Future AGI if you want the gateway to do more than monitor and have the traces drive prompt and route optimization over time. Pick this when Opencode is a significant line item ($10K+/month), when you want the cost curve to bend downward, and when access controls must survive a SOC 2 audit.

Choose LiteLLM if your security or compliance team requires Opencode traffic to never leave the VPC and the team is comfortable with a Python proxy plus a separate dashboard. Pick this when source-availability beats hosted polish, and when you have the bandwidth to pin commit hashes after the March 24 PyPI incident.

Choose Portkey if you want mature hosted RBAC, virtual keys, a four-tier budget hierarchy, and a polished UI, and you don’t need the optimizer yet. Pick this when the procurement story matters and the team is willing to weigh the PANW acquisition timeline.

Choose Helicone if you want the lightest drop-in for per-request observability and don’t need routing or deep access controls. Pick this for teams under 10 developers, but plan a migration window given the Mintlify acquisition.

Choose Kong AI Gateway if you already operate Kong for REST APIs. Pick this when platform-team familiarity with Kong outweighs the AI-specific shallowness of the AI Proxy plugin, and when a Grafana operator on the team can build the chargeback view.

Common mistakes when wiring Opencode through a gateway

Mistake	What goes wrong	Fix
Pointing only one provider in `opencode.json` at the gateway	Other providers still hit upstream directly; chargeback misses half the traffic	Point every provider in `opencode.json` at the gateway
Sharing one team key across developers	All sessions look identical to the dashboard	Issue virtual keys per developer (Future AGI / LiteLLM / Portkey)
Not preserving the session header end-to-end	Per-session attribution is impossible; per-developer is the only cut you get	Inject `x-session-id` from Opencode’s session token via a thin shim
Buffering streaming responses	Opencode’s progress UI freezes mid-turn	Confirm the gateway forwards SSE without buffer-and-batch
Treating “access controls” as RBAC alone	Audit-log gaps surface during the SOC 2 readiness review	Verify the gateway exports an immutable audit log to SIEM, not just a dashboard
Skipping the git-remote header	The repo-level view collapses to “all spend, one repo”	Inject `repo=<git remote url>` from the shim into every request
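
The first mistake in the table is easy to check mechanically. The sketch below assumes an `opencode.json` shape in which each entry under `provider` carries its base URL at `options.baseURL`; verify the key names against your Opencode version before relying on it. The gateway URL in the usage note is a placeholder.

```python
import json


def providers_bypassing_gateway(config_text: str, gateway_url: str) -> list[str]:
    """List providers whose base URL is missing or does not point at the gateway."""
    config = json.loads(config_text)
    bypassing = []
    for name, provider in config.get("provider", {}).items():
        base_url = provider.get("options", {}).get("baseURL", "")
        if not base_url.startswith(gateway_url):
            bypassing.append(name)
    return sorted(bypassing)
```

Calling it with the file contents and something like `https://gateway.internal.example` returns the providers that would still hit upstream directly; an empty list means every provider goes through the gateway.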

How Future AGI closes the loop on Opencode spend

The other four gateways treat token tracking as an end state: capture, dashboard, budget, alert. Future AGI treats it as the input to a feedback loop with six stages:

Trace. Every Opencode turn produces a span tree via traceAI (Apache 2.0). Spans capture inputs, outputs, tool calls, model, session ID, SSO claim, repo, and branch.
Evaluate. ai-evaluation (Apache 2.0) scores every turn. FAGI ships 60+ EvalTemplate classes in the ai-evaluation SDK (task-completion, faithfulness, code-correctness, tool-use, structured-output, hallucination, agentic surfaces, instruction-following, groundedness), plus unlimited custom evaluators authored end-to-end by an in-product eval-authoring agent that uses tool calling on your code, plus self-improving evaluators on the Future AGI Platform that learn from live production traces, plus FAGI’s proprietary classifier model family at very low cost-per-token (lower per-eval cost than Galileo Luna-2). Scores live alongside the token-count data. The catalog is the floor, not the ceiling.
Cluster. Low-scoring sessions get clustered by failure mode. A common Opencode pattern is “frontier model called when a local 70B would have been enough”; the cost-quality mismatch becomes visible across the cohort.
Optimize. fi.opt.optimizers rewrites the system prompt or adjusts the routing policy. It offers six optimizers: RandomSearchOptimizer, BayesianSearchOptimizer (Optuna-backed, with teacher-inferred few-shot templates and resumable studies), MetaPromptOptimizer, ProTeGi, GEPAOptimizer, and PromptWizardOptimizer. All six share an EarlyStoppingConfig (patience + min_delta + threshold + max_evaluations) and the same unified Evaluator over 60+ FAGI rubrics. The typical Opencode optimization is a three-tier routing rule plus a tightened system prompt that cuts the over-prompting driving 20-40% of token spend.
Route. The gateway applies the updated policy on the next request. Same session ID, same developer, same repo; the routing changes underneath.
Re-deploy. Prompt + route are versioned. Regression on the next batch triggers automatic rollback. The audit log captures the route change so compliance can answer “why did the model selection change on May 12 for repo X?”
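
The re-deploy stage (versioned policies, rollback on regression, and an audit entry for each change) can be sketched as follows. This is an illustration under assumptions, not Future AGI's mechanism: the class names, the single scalar eval score per batch, and the 0.02 tolerance are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    policy: dict
    score: Optional[float] = None  # eval score from the latest batch on this version


class PolicyStore:
    def __init__(self, tolerance: float = 0.02):
        self.tolerance = tolerance
        self.versions: list[Version] = []
        self.audit_log: list[tuple] = []  # (event, active version number, reason)

    @property
    def active(self) -> Version:
        return self.versions[-1]

    def deploy(self, policy: dict, reason: str) -> None:
        self.versions.append(Version(policy))
        self.audit_log.append(("deploy", len(self.versions), reason))

    def record_batch(self, score: float) -> str:
        """Keep the active version, or roll back if it regresses against the previous one."""
        previous = self.versions[-2] if len(self.versions) > 1 else None
        if (previous is not None and previous.score is not None
                and score < previous.score - self.tolerance):
            self.versions.pop()
            self.audit_log.append(("rollback", len(self.versions), f"score fell to {score:.2f}"))
            return "rolled_back"
        self.active.score = score
        return "kept"
```

Because every deploy and rollback lands in `audit_log` with a reason, the question of why model selection changed on a given date has a recorded answer.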

Net effect: a team starting at $52,000/quarter typically sees spend trend down 18-32% within four weeks without changing developer behaviour. Token tracking and access controls stop being two procurement problems.
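
As a worked example of that range, the arithmetic for a $52,000 quarter looks like this. The function name is illustrative, and the percentages are the 18% and 32% bounds quoted above.

```python
def projected_quarterly_spend(baseline_usd: int, reduction_pct: int) -> int:
    """Apply a percentage reduction using integer arithmetic to avoid float rounding."""
    return baseline_usd * (100 - reduction_pct) // 100


conservative = projected_quarterly_spend(52_000, 18)  # 42640
optimistic = projected_quarterly_spend(52_000, 32)    # 35360
```

On these figures the quarterly bill would land between $35,360 and $42,640, a saving of $9,360 to $16,640.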

Three building blocks are open source: traceAI, ai-evaluation, and agent-opt (all Apache 2.0 on github.com/future-agi). The hosted Agent Command Center adds the failure-cluster view, live Protect guardrails (median ~65 ms text-scanner latency, arXiv 2510.13351), RBAC, SOC 2 Type II certification, immutable audit-log export to SIEM, and an AWS Marketplace listing for procurement.

What we did not include

Three gateways that show up in other 2026 listicles but didn’t fit the Opencode brief:

OpenRouter: great for model exploration and per-token economics, but the access-control surface is shallow and there’s no self-host story. The lack of virtual keys with team-level RBAC is a hard no for OSS-coding-agent teams.
Cloudflare AI Gateway: strong primitives, but access-control depth is thin as of May 2026 and the self-host story is by definition non-existent.
Maxim Bifrost: Go-native gateway with strong throughput numbers and a Code Mode for MCP token reduction. Covered in our cost-optimization listicle. For an Opencode-specific token-tracking and access-control story, it ranks just behind Kong on access-control depth and just behind LiteLLM on observability.

Sources

Opencode, github.com/opencode-ai/opencode
Future AGI Agent Command Center, futureagi.com/platform/monitor/command-center
LiteLLM proxy, github.com/BerriAI/litellm
Datadog Security Labs writeup of the LiteLLM PyPI compromise (March 24, 2026), securitylabs.datadoghq.com/articles/litellm-compromised-pypi-teampcp-supply-chain-campaign/
Portkey AI gateway, portkey.ai
Palo Alto Networks acquisition press release for Portkey (April 30, 2026), paloaltonetworks.com/company/press/2026/palo-alto-networks-to-acquire-portkey-to-secure-the-rise-of-ai-agents
Helicone proxy, helicone.ai (Mintlify acquisition, March 3, 2026)
Kong AI Gateway, konghq.com/products/kong-ai-gateway
Future AGI Protect latency benchmarks, arxiv.org/abs/2510.13351 (65 ms text, 107 ms image)

Frequently asked questions

What is the cheapest way to monitor Opencode token usage?

Helicone's free tier or the LiteLLM open-source proxy. Both give you per-request token counts. Per-developer or per-repo chargeback requires wiring custom headers from a thin Opencode shim, because the agent does not set them natively. Budget one engineering day.

Does Opencode support OpenAI-compatible endpoints?

Yes. Opencode speaks the OpenAI Chat Completions API natively, and the provider config in `opencode.json` is a thin wrapper around a base URL. All five gateways support pointing Opencode at them by changing the base URL. Anthropic models route through the gateway's OpenAI-compatible Anthropic adapter on most of these picks.

Can I route Opencode through multiple model providers simultaneously?

Yes; this is the main reason an OSS coding-agent team picks a gateway. The safe pattern is to define one provider per gateway endpoint in `opencode.json` and let the gateway decide which upstream to call. All five gateways support this; Future AGI is the only one that updates the routing policy automatically based on eval scores.

How do I track Opencode cost per developer when everyone shares one API key upstream?

Use a gateway with virtual keys (Future AGI, LiteLLM, Portkey). Each developer gets a virtual key that fans out to the team key, preserving bulk pricing while making per-developer chargeback possible. Wire the SSO claim into the virtual-key allocation so the audit log carries the real identity.
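
The fan-out idea can be sketched as a lookup table: many virtual keys, one upstream key, and the developer identity recovered at the gateway. This is a toy in-memory version with hypothetical names; real gateways persist the mapping and add budgets, expiry, and revocation.

```python
import secrets


class VirtualKeyRegistry:
    def __init__(self, upstream_key: str):
        self._upstream_key = upstream_key   # the single team key sent to the provider
        self._owners: dict[str, str] = {}   # virtual key -> developer identity

    def issue(self, developer_email: str) -> str:
        """Mint a new virtual key bound to one developer."""
        virtual_key = "vk-" + secrets.token_urlsafe(16)
        self._owners[virtual_key] = developer_email
        return virtual_key

    def resolve(self, virtual_key: str) -> tuple[str, str]:
        """Return (developer, upstream key); raises KeyError for unknown keys."""
        return self._owners[virtual_key], self._upstream_key
```

The developer only ever holds the virtual key, so the upstream key stays inside the gateway while each request still carries an identity for chargeback.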

What happens to tool calls when Opencode runs through a gateway?

All five gateways pass tool calls through intact as of May 2026. Two early-2025 proxies broke tool-use by re-serialising the content blocks; the five in this list have been tested against Opencode's tool surface (bash, file edits, MCP tools) and preserve the blocks correctly. Test passthrough explicitly for any gateway not on this list.

Is it safe to send source code through an AI gateway?

For hosted gateways, the data flow is gateway → model provider; both endpoints already see the code. If your compliance regime forbids both, the only safe picks are self-hosted LiteLLM (with post-incident commit pinning) or Future AGI's BYOC deployment, running inside your VPC with model traffic egressing through your own network. Kong AI Gateway self-hosted also fits if you accept the AI-specific setup overhead.

How is Future AGI Agent Command Center different from LiteLLM for Opencode?

LiteLLM is a Python-native proxy that ships virtual keys, budgets, and an audit log: strong on access controls, weaker on the analytics surface, no optimizer. Future AGI adds a polished dashboard, OpenTelemetry-native cost telemetry, span-level cost attribution, and the feedback loop that updates routing policies on eval scores. If your team will wire a SQL dashboard on LiteLLM, LiteLLM is the cheaper pick. If finance and security want a single source of truth without a custom build, Future AGI is the answer.
