Portkey Alternatives in 2026: 6 LLM Gateway and Observability Tools
FutureAGI, LiteLLM, Helicone, OpenRouter, Cloudflare AI Gateway, and Kong AI as Portkey alternatives in 2026. Pricing, OSS license, routing, tradeoffs.
You are probably here because Portkey has been your LLM gateway and you want to compare alternatives on price, OSS posture, eval depth, and self-hosting. This guide walks through six platforms that move teams off Portkey in 2026, from a Python proxy at the SDK boundary to a unified open-source stack that ties the gateway to evals, simulation, and guardrails. The recommendations are honest about where each alternative falls short.
TL;DR: Best Portkey alternative per use case
| Use case | Best pick | Why (one phrase) | Pricing | OSS |
|---|---|---|---|---|
| Unified gateway, eval, observe, simulate, optimize, guard | FutureAGI | One stack across pre-prod and prod | Free self-hosted (OSS), hosted from $0 + usage | Apache 2.0 |
| Python proxy at the SDK boundary | LiteLLM | Minimal moving parts | Open source free, Cloud usage-based | MIT |
| Gateway-first request analytics | Helicone | Fast OpenAI base URL swap | Hobby free, Pro $79/mo, Team $799/mo | Apache 2.0 |
| Hosted unified API across many providers | OpenRouter | One API key, many models | Pay-as-you-go credits | Closed platform, OSS clients |
| Cloudflare-native gateway | Cloudflare AI Gateway | Free tier and fast deploy | Bundled with Cloudflare Workers | Closed |
| Enterprise API gateway with AI plugins | Kong AI Gateway | Reuses existing Kong infra | Bundled with Kong subscription | OSS Kong Gateway |
If you only read one row: pick FutureAGI when the gateway and the rest of the platform should share a span tree, LiteLLM when the simplest path is a Python proxy, and Helicone when changing the base URL is the fastest fix. For deeper reads: see our LLM Gateways guide, the Agent Command Center page, and the Portkey integration guide.
Who Portkey is and where it stops
Portkey is a hosted AI gateway with provider routing, fallbacks, retries, semantic and simple caching, rate limits, budgets, virtual keys, prompt management, and observability. The Portkey Gateway repo is MIT and self-hostable. The Portkey hosted product adds the control plane, dashboards, prompt management UI, governance, and enterprise features. The docs describe Python and TypeScript SDKs, OpenAI-compatible endpoints, and integrations with Anthropic, Google Vertex, AWS Bedrock, Azure OpenAI, Mistral, Cohere, Groq, Together, and others.
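For reference, the swap onto an OpenAI-compatible gateway is small. Here is a minimal sketch of pointing the standard OpenAI Python SDK at Portkey's hosted endpoint; the base URL and the x-portkey-* header names follow the pattern described in Portkey's docs, but treat them as assumptions to verify against the current reference:

```python
from openai import OpenAI

# Point the standard OpenAI client at Portkey's gateway instead of
# api.openai.com. Header names per Portkey's docs; verify before use.
client = OpenAI(
    base_url="https://api.portkey.ai/v1",
    api_key="dummy",  # provider auth is resolved via the virtual key
    default_headers={
        "x-portkey-api-key": "<PORTKEY_API_KEY>",
        "x-portkey-virtual-key": "<VIRTUAL_KEY>",  # scopes budgets + provider creds
    },
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Ping through the gateway"}],
)
print(resp.choices[0].message.content)
```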
Portkey pricing is straightforward to read at first. The free tier supports 10,000 recorded logs per month and basic features. Production starts at $49 per month with 100,000 recorded logs and overages around $9 per additional 100,000. Enterprise is custom and includes private cloud, SAML SSO, SOC 2, HIPAA, on-prem deployment, and dedicated support. Verify the current pricing page since limits and feature gates change.
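As a sanity check on the overage math, a small sketch that prices a month of recorded logs from the figures quoted above; re-verify the tier numbers against the live pricing page before using this in a budget:

```python
def portkey_monthly_cost(logs: int) -> float:
    """Estimate Production-tier cost from the figures quoted above:
    $49/month includes 100,000 logs; ~$9 per additional 100,000."""
    BASE_FEE, INCLUDED = 49.0, 100_000
    OVERAGE_FEE, OVERAGE_BLOCK = 9.0, 100_000
    extra = max(0, logs - INCLUDED)
    blocks = -(-extra // OVERAGE_BLOCK)  # ceiling division
    return BASE_FEE + blocks * OVERAGE_FEE

# 1M recorded logs/month -> $49 + 9 blocks x $9 = $130
print(portkey_monthly_cost(1_000_000))
```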
Be fair about what Portkey does well. The gateway features are deep: virtual keys for budget isolation, semantic cache for prompt deduplication, fallback chains across providers, retries with backoff, prompt management with version history, and an observability dashboard with logs, metrics, and traces. The recent Guardrails 2.0 release moved Portkey into the policy enforcement space alongside the gateway. The hosted UX is polished, the documentation is clean, and the enterprise governance story (virtual keys, budgets, role-based access) is mature.
The honest gap is depth on evals and simulation. Portkey has eval surfaces around prompt experiments, but it is not a Braintrust-class closed-loop eval product or a FutureAGI-class unified eval-and-observability stack. There is no first-party simulated-user product for synthetic personas across text and voice. There is no prompt optimization loop that takes failing traces and ships a versioned prompt back through CI gates. There is no integrated dataset versioning for production trace-to-dataset workflows. Each of those is a real reason to compare alternatives even before pricing comes in.

The 6 Portkey alternatives compared
1. FutureAGI: Best for unified gateway + eval + observe + simulate + optimize + guard
Open source. Self-hostable. Hosted cloud option.
Most tools in this list pick one job. Portkey does the hosted gateway. LiteLLM does the Python proxy. Helicone does request analytics. OpenRouter does the credit-based unified API. Cloudflare AI Gateway does Cloudflare-native routing. Kong AI Gateway does the enterprise API gateway with AI plugins. FutureAGI does the loop, with the gateway as one layer of it. The Agent Command Center and the traceAI tracing layer share the same span tree, the same eval contract, the same prompt registry, and the same guardrail policy engine.
Architecture: gateway plus observability plus evals share the runtime. The Go-based Agent Command Center gateway accepts OpenAI-compatible HTTP, routes across 100+ providers, applies cache policy and rate limits, enforces PII redaction and prompt-injection guardrails, and emits OTel spans into the same ClickHouse-backed trace store that traceAI writes to. Eval scores attach as span attributes, simulated runs against synthetic personas use the same evaluator that judges production, and the prompt optimizer reads failing spans as labeled training examples. The repo is Apache 2.0 and self-hostable. Plumbing under it (Django, React/Vite, Postgres, ClickHouse, Redis, object storage, workers, Temporal) exists so the gateway and the eval loop do not need glue code.
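Because the gateway speaks OpenAI-compatible HTTP, pointing an existing client at a self-hosted Agent Command Center is the same base URL swap; the hostname below is a placeholder for your own deployment, not a documented default:

```python
from openai import OpenAI

# Hypothetical self-hosted Agent Command Center endpoint; replace the
# base URL with wherever your deployment exposes its OpenAI-compatible API.
client = OpenAI(
    base_url="https://gateway.internal.example.com/v1",
    api_key="<GATEWAY_KEY>",
)

# Requests routed here pick up cache policy, rate limits, and guardrails
# at the gateway, and emit OTel spans into the shared trace store.
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello through the gateway"}],
)
print(resp.choices[0].message.content)
```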

Pricing: FutureAGI starts at $0/month. The free tier includes 50 GB tracing and storage, 2,000 AI credits, 100,000 gateway requests, 100,000 cache hits, 1 million text simulation tokens, 60 voice simulation minutes, unlimited datasets, unlimited prompts, unlimited dashboards, 3 annotation queues, 3 monitors, unlimited team members, and unlimited projects. Usage after the free tier starts at $2/GB storage, $10 per 1,000 AI credits, $5 per 100,000 gateway requests, $1 per 100,000 cache hits, $2 per 1 million text simulation tokens, and $0.08 per voice minute. Boost is $250 per month, Scale is $750 per month, and Enterprise starts at $2,000 per month.
Best for: Pick FutureAGI when the gateway and the rest of the LLM platform should share the same trace tree, the same eval contract, and the same policy engine. The buying signal is teams running Portkey for routing plus a separate eval platform plus a separate guardrail tool, watching the three drift in production. RAG agents, voice agents, support automation, and BYOK LLM-as-judge teams fit this shape.
Skip if: Skip FutureAGI if your immediate need is a polished hosted gateway with governance and the eval workflow lives elsewhere. Portkey is closer to that shape. FutureAGI also has more services to self-host. Use the hosted product if you do not want to operate that surface.
2. LiteLLM: Best for a Python proxy at the SDK boundary
Open source. Self-hostable. LiteLLM Cloud option.
LiteLLM is the right alternative when the simplest path to value is a Python proxy or library that exposes 100+ LLM providers behind an OpenAI-compatible API. It is the lightest-weight gateway option, runs as a single service, and gets you out of provider lock-in without buying a UI.
Architecture: LiteLLM is MIT and supports Anthropic, AWS Bedrock, Google Vertex, Azure OpenAI, Cohere, Mistral, Together, Groq, OpenRouter, Replicate, HuggingFace, and many more. The proxy ships logging callbacks for Langfuse, Braintrust, OpenTelemetry, and others. Auth, rate limits, virtual keys, budgets, and team management are part of the proxy product, with a UI in the LiteLLM Admin app.
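The library side of LiteLLM looks like this: `litellm.completion` normalizes provider-specific APIs behind OpenAI-style messages, and the model-string prefix selects the backend. Model identifiers shown are illustrative; check the provider list for current names.

```python
from litellm import completion

# Same call shape across providers; the model-string prefix picks the backend.
# Provider API keys are read from the usual environment variables
# (OPENAI_API_KEY, ANTHROPIC_API_KEY, ...).
for model in ["gpt-4o-mini", "anthropic/claude-3-5-haiku-20241022"]:
    resp = completion(
        model=model,
        messages=[{"role": "user", "content": "One-line status check"}],
    )
    print(model, "->", resp.choices[0].message.content)
```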
Pricing: LiteLLM is open source and free under MIT. LiteLLM Cloud is a usage-based hosted offering built on the same proxy.
Best for: Pick LiteLLM if you want a single Python service in front of all your providers, do not need an opinionated UI, and prefer to wire callbacks into your existing eval and observability tools.
Skip if: Skip LiteLLM if you want a polished UI for analytics, prompts, evals, and guardrails out of the box. It is a proxy first. Note the 2025 security advisory and patch, and review your security posture before deploying it as the front door for your traffic.
3. Helicone: Best for gateway-first request analytics
Open source. Self-hostable. Hosted cloud option.
Helicone is the right alternative when the fastest path to value is changing the OpenAI base URL and seeing every request. The center of gravity is gateway operations. Note the March 3, 2026 Mintlify acquisition, which put the service into maintenance mode limited to security updates, new model support, bug fixes, and performance fixes.
Architecture: Helicone is Apache 2.0 and ships an OpenAI-compatible AI Gateway with request logging, provider routing, caching, rate limits, sessions, user metrics, cost tracking, datasets, alerts, reports, HQL, eval scores, feedback, and prompts.
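The base URL swap is literally one line plus an auth header. The oai.helicone.ai proxy host and the Helicone-* header names follow Helicone's documented pattern, but confirm them against current docs given the maintenance-mode status:

```python
from openai import OpenAI

# Route OpenAI traffic through Helicone's proxy so every request is logged.
# Host and header names per Helicone's docs; verify before production use.
client = OpenAI(
    api_key="<OPENAI_API_KEY>",
    base_url="https://oai.helicone.ai/v1",
    default_headers={
        "Helicone-Auth": "Bearer <HELICONE_API_KEY>",
        "Helicone-User-Id": "user-123",  # enables per-user spend metrics
    },
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Logged request"}],
)
print(resp.choices[0].message.content)
```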
Pricing: Hobby is free. Pro is $79 per month. Team is $799 per month. Enterprise is custom.
Best for: Pick Helicone if you want request analytics, user-level spend, model cost tracking, caching, fallbacks, and a low-friction gateway. It is a good first tool for teams with live traffic.
Skip if: Helicone will not replace a deep eval workbench by itself. Treat the maintenance-mode status as a roadmap risk to verify directly.
4. OpenRouter: Best for hosted unified API across many providers
Closed platform with OSS clients. Hosted only.
OpenRouter is the right alternative when the requirement is a single API key that fronts hundreds of LLMs with provider failover, normalized JSON, and credit-based billing. It is not a self-hosted gateway and does not aim to compete on observability depth.
Architecture: OpenRouter exposes an OpenAI-compatible API across 200+ models from OpenAI, Anthropic, Google, Mistral, Meta, AWS Bedrock, Azure, Together, DeepSeek, Cohere, and many more. Routing, failover, normalization, and quota live on the OpenRouter side. Client libraries on GitHub are open source even though the platform is not.
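In practice the unified API is the OpenAI SDK with a different base URL and OpenRouter's namespaced model identifiers; the model name below is illustrative, check the live model list for current IDs:

```python
from openai import OpenAI

# One key, many models: OpenRouter's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="<OPENROUTER_API_KEY>",
)

# Model IDs are namespaced by provider, e.g. "anthropic/...", "meta-llama/...".
resp = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "One-line status check"}],
)
print(resp.choices[0].message.content)
```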
Pricing: OpenRouter is pay-as-you-go. You buy credits and provider calls deduct the underlying provider price (no markup on provider pricing). The platform charges a 5.5% credit purchase fee. BYOK has separate rules: the first 1M requests per month are free, then a 5% BYOK fee applies. There is no monthly platform fee for individual usage.
Best for: Pick OpenRouter if your team wants one API key, one billing relationship, and access to many models without managing per-provider credentials.
Skip if: Skip OpenRouter if you need self-hosted control, deep observability dashboards, prompt management, evals, simulation, or guardrails.
5. Cloudflare AI Gateway: Best if your stack is on Cloudflare
Closed platform. Bundled with Cloudflare Workers.
Cloudflare AI Gateway is the right alternative when your application already runs on Cloudflare Workers or your team uses Cloudflare for DNS, CDN, and edge compute. The gateway sits at the edge, applies caching, rate limits, retries, fallback, and analytics, and is bundled with Cloudflare Workers usage.
Architecture: AI Gateway exposes a Cloudflare-hosted endpoint that proxies requests to OpenAI, Anthropic, Google AI, Mistral, AWS Bedrock, Azure OpenAI, Cohere, Groq, Replicate, HuggingFace, and Workers AI. Caching, rate limits, request retries, fallbacks, and analytics are first-class. Logs and analytics live in the Cloudflare dashboard.
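The gateway endpoint encodes your account and gateway IDs in the path. The URL shape below matches Cloudflare's documented pattern for the OpenAI provider, with placeholder IDs:

```python
from openai import OpenAI

# Cloudflare AI Gateway proxies each provider under a path that embeds your
# account and gateway IDs (placeholders below); caching, rate limits, and
# analytics apply at the edge before the request reaches OpenAI.
ACCOUNT_ID = "<CF_ACCOUNT_ID>"
GATEWAY_ID = "<CF_GATEWAY_ID>"

client = OpenAI(
    api_key="<OPENAI_API_KEY>",
    base_url=f"https://gateway.ai.cloudflare.com/v1/{ACCOUNT_ID}/{GATEWAY_ID}/openai",
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Edge-proxied request"}],
)
```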
Pricing: AI Gateway is bundled with Cloudflare Workers usage. There is a generous free tier; paid Workers tiers cover production at low marginal cost. Verify the current logging and analytics retention limits.
Best for: Pick Cloudflare AI Gateway if your application is on Cloudflare Workers or your team already pays for Cloudflare. The deploy story is short and the cost is low.
Skip if: Skip Cloudflare AI Gateway if you need deep prompt management, evals, simulation, or guardrails in the same product. The gateway is excellent at the edge layer; the rest is left to other tools.
6. Kong AI Gateway: Best if Kong is already in production
Open source Kong Gateway. Kong AI plugins. Konnect for hosted control plane.
Kong AI Gateway is the right alternative when Kong is already the API gateway in production and the LLM stack should reuse the same infrastructure. The AI plugins ship inside the existing Kong deploy and apply provider routing, semantic caching, prompt protections, rate limits, and analytics on the same control plane.
Architecture: Kong AI plugins integrate with Kong Gateway (open source under Apache 2.0) and Kong Konnect (hosted control plane). Plugins include AI Proxy, AI Prompt Decorator, AI Prompt Template, AI Request Transformer, AI Response Transformer, and Semantic Caching. Providers covered include OpenAI, Anthropic, Cohere, Mistral, Azure OpenAI, AWS Bedrock, and others.
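Enabling the AI Proxy plugin on an existing Kong route goes through the standard Admin API. A sketch follows; the route name is hypothetical, and the fields inside "config" (route_type, auth, model) follow the ai-proxy plugin's documented shape at time of writing, so treat them as assumptions to check against your Kong version:

```python
import requests

# Attach the ai-proxy plugin to an existing route via Kong's Admin API.
# Config field names are assumptions based on the ai-proxy docs; verify
# against your Kong version before applying.
admin = "http://localhost:8001"
payload = {
    "name": "ai-proxy",
    "config": {
        "route_type": "llm/v1/chat",
        "auth": {
            "header_name": "Authorization",
            "header_value": "Bearer <OPENAI_API_KEY>",
        },
        "model": {"provider": "openai", "name": "gpt-4o-mini"},
    },
}
r = requests.post(f"{admin}/routes/llm-chat/plugins", json=payload)
r.raise_for_status()
print(r.json()["id"])
```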
Pricing: Kong Gateway open source is free. Kong AI plugins are bundled with Kong Konnect or Kong Enterprise plans. Verify current pricing on the Kong site since plans change between annual and monthly billing.
Best for: Pick Kong AI Gateway if Kong is already the API gateway and the platform team wants the LLM gateway inside the same control plane. The reuse story is the buying signal.
Skip if: Skip Kong if your team does not run Kong today. The setup cost of standing up Kong purely for the LLM stack rarely beats a dedicated tool.

Decision framework: Choose X if…
- Choose FutureAGI if your dominant workload is gateway plus observability plus evals plus simulation plus guardrails in one open-source stack. Buying signal: Portkey for routing, Langfuse or Braintrust for evals, and a separate guardrail tool drift in production. Pairs with: OTel, OpenAI-compatible HTTP, BYOK judges, and self-hosted deployment.
- Choose LiteLLM if your dominant workload is a Python proxy at the SDK boundary. Buying signal: minimal moving parts, callbacks into your own eval and observability tools. Pairs with: Langfuse, Braintrust, or FutureAGI for the analytics layer.
- Choose Helicone if your dominant workload is gateway-first request analytics. Buying signal: live traffic now and SDK instrumentation later. Pairs with: OpenAI-compatible clients and provider failover.
- Choose OpenRouter if your dominant workload is one API key for many models. Buying signal: prototypes, hackathons, and fast model comparisons. Pairs with: any client SDK and a separate analytics tool.
- Choose Cloudflare AI Gateway if your dominant workload runs on Cloudflare Workers. Buying signal: edge-native deploy and low marginal cost. Pairs with: Workers AI and Cloudflare R2.
- Choose Kong AI Gateway if your dominant workload already routes through Kong. Buying signal: reuse of the existing API gateway. Pairs with: Kong plugins and Konnect.
Common mistakes when picking a Portkey alternative
- Treating “gateway” as a single capability. Portkey, FutureAGI, LiteLLM, Helicone, Cloudflare AI Gateway, and Kong AI Gateway cover different surfaces. Routing, virtual keys, budgets, semantic cache, retries, prompt protection, observability, and eval-attached scoring are not all in every product.
- Picking by free-tier limits. Free tiers reset every month and rarely match production volume. Build a real cost model on a representative day.
- Skipping the security review on a self-hosted gateway. Any service that holds provider API keys, sees PII, and routes traffic is in scope for SOC 2, HIPAA, or PCI in regulated industries.
- Migrating without freezing the trace shape. Trace IDs, span IDs, attribute names, timing fields, and cost fields differ across platforms. Lock the schema first.
- Ignoring fallback policy. A gateway without a fallback policy is a single point of failure. Configure fallback behavior before the first incident; a client-side sketch of the pattern follows this list.
- Pricing only the platform fee. Real cost is platform fee plus provider token spend plus overage plus retention plus seats plus on-call hours.
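To make the fallback point concrete, here is a minimal client-side sketch of the behavior a gateway-level fallback policy automates; real gateways layer backoff, health checks, and budget awareness on top, and the keys and models below are placeholders:

```python
from openai import OpenAI, APIError, APITimeoutError

# Ordered (base_url, api_key, model) fallback chain; a gateway-level
# policy does this server-side with backoff and health tracking on top.
CHAIN = [
    ("https://api.openai.com/v1", "<OPENAI_KEY>", "gpt-4o-mini"),
    ("https://openrouter.ai/api/v1", "<OPENROUTER_KEY>", "anthropic/claude-3.5-sonnet"),
]

def complete_with_fallback(messages: list[dict]):
    last_err = None
    for base_url, key, model in CHAIN:
        try:
            client = OpenAI(base_url=base_url, api_key=key, timeout=30)
            return client.chat.completions.create(model=model, messages=messages)
        except (APIError, APITimeoutError) as err:
            last_err = err  # log and try the next provider in the chain
    raise last_err
```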
What changed in the LLM gateway landscape in 2026
| Date | Event | Why it matters |
|---|---|---|
| Mar 9, 2026 | FutureAGI shipped Agent Command Center and ClickHouse trace storage | Gateway routing, guardrails, cost controls, and trace analytics moved into the same loop. |
| Mar 3, 2026 | Helicone joined Mintlify | Helicone remains usable but in maintenance mode. |
| Feb 2026 | Portkey shipped Guardrails 2.0 | Portkey moved into the policy enforcement space. |
| 2025 H2 | LiteLLM security advisory and patch | Reminder that any front-door proxy is in scope for security review. |
| Ongoing 2026 | Cloudflare AI Gateway expanded provider coverage | Cloudflare-native deploys are now competitive on provider breadth. |
| Ongoing 2026 | Kong shipped Semantic Caching plugin | Enterprise API gateways added LLM-aware caching. |
How to actually evaluate this for production
1. Run a real traffic mirror. Mirror 5 to 10 percent of production requests through each candidate gateway for 7 days. Compare p50, p95, and p99 latency, success rate, retry rate, fallback rate, cost per 1,000 requests, and token attribution.
2. Inspect the trace contract. The gateway and the observability backend must agree on trace ID, span ID, attribute names, request and response payload shape, and cost fields.
3. Cost-adjust for your real mix. Real cost is platform fee plus provider token spend plus overage plus retention plus seats plus security review hours.
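A sketch of the comparison in step 1, assuming you have collected per-request records from each candidate into simple lists; the record shape is hypothetical:

```python
import statistics

# Hypothetical record shape for a 7-day mirror: one dict per request, e.g.
# {"latency_ms": 412, "ok": True, "retried": False,
#  "fell_back": False, "cost_usd": 0.0021}

def summarize(name: str, records: list[dict]) -> None:
    lat = sorted(r["latency_ms"] for r in records)
    q = statistics.quantiles(lat, n=100)  # q[k-1] ~= k-th percentile
    n = len(records)
    print(
        f"{name}: p50={q[49]:.0f}ms p95={q[94]:.0f}ms p99={q[98]:.0f}ms "
        f"success={sum(r['ok'] for r in records) / n:.2%} "
        f"retry={sum(r['retried'] for r in records) / n:.2%} "
        f"fallback={sum(r['fell_back'] for r in records) / n:.2%} "
        f"cost/1k=${sum(r['cost_usd'] for r in records) / n * 1000:.2f}"
    )
```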
How FutureAGI implements gateway routing plus guardrails
FutureAGI is the production-grade gateway-plus-observability platform built around the closed reliability loop that Portkey alternatives stitch together by hand. The full stack runs on one Apache 2.0 self-hostable plane:
- Gateway: the Agent Command Center fronts 100+ providers with BYOK routing, fallback, latency-aware load balancing, and request caching; cost attribution rolls up per virtual key, per team, per model.
- Guardrails: 18+ runtime checks (PII, prompt injection, jailbreak, tool-call enforcement, output schema, regex policies) run pre-call and post-call on the gateway, with turing_flash screening at 50 to 70 ms p95.
- Tracing and evals: traceAI (Apache 2.0) auto-instruments 35+ frameworks across Python, TypeScript, Java, and C#; gateway spans land in the same trace tree as agent and retriever spans, and 50+ first-party eval metrics attach as span attributes for outcome, trajectory, and recovery scoring.
- Simulation and prompt optimization: persona-driven scenarios exercise the gateway path before live traffic, and six prompt-optimization algorithms consume failing trajectories as training data so production regressions feed back into versioned prompts that the CI gate enforces.
Pricing starts free with a 50 GB tracing tier, 100,000 gateway requests, and 100,000 cache hits; Boost is $250 per month, Scale is $750 per month with HIPAA, and Enterprise from $2,000 per month with SOC 2 Type II.
Most teams comparing Portkey alternatives end up running three or four tools in production: one for the gateway, one for traces, one for evals, one for guardrails. FutureAGI is the recommended pick because gateway, guardrails, tracing, evals, and simulation all live on one self-hostable runtime; the loop closes without stitching.
Sources
- Portkey pricing
- Portkey docs
- Portkey Gateway repo
- FutureAGI pricing
- Agent Command Center docs
- LiteLLM repo
- LiteLLM Cloud pricing
- Helicone pricing
- Helicone joining Mintlify
- OpenRouter docs
- Cloudflare AI Gateway docs
- Kong AI Gateway
Series cross-link
Next: Helicone Alternatives, Best LLM Gateways 2026, LangSmith Alternatives
Frequently asked questions
What is the best Portkey alternative in 2026?
Is Portkey open source?
Why do teams move off Portkey?
Can I self-host an alternative to Portkey?
How does Portkey pricing compare to alternatives in 2026?
Which alternative is closest to Portkey on routing features?
Does FutureAGI replace Portkey for guardrails?
What does Portkey still do better than the alternatives?