Best 5 Cloudflare AI Gateway Alternatives in 2026

Five Cloudflare AI Gateway alternatives scored on routing intelligence, observability depth, per-tenant chargeback, ecosystem portability, and the eval/optimizer loop that Cloudflare's gateway never grew.

January 29, 2026

Cloudflare AI Gateway is the obvious first gateway. It’s free, ships behind a one-line base_url change, and any team already on Workers, R2, or D1 gets it without a new procurement loop. In 2026, the exits are running ahead of the installs.

The pattern in HN threads, /r/LLMDevs migration posts, and the AI Gateway GitHub issue tracker repeats: teams adopt Cloudflare for the easy on-ramp, hit a ceiling at three to six months of production traffic, and look for a gateway that does more than relay-and-log. The ceiling is the same five capabilities: routing intelligence beyond round-robin, observability deeper than per-request rows, per-tenant chargeback at scale, escape from Cloudflare ecosystem lock-in, and a feedback loop where trace data actually changes how the gateway behaves.

This guide ranks five gateways worth migrating to and walks through the migration step that always trips teams up: the URL prefix pattern.

TL;DR: pick by exit reason

Why you are leaving Cloudflare AI Gateway	Pick	Why
You want trace data to feed back into routing and prompts	Future AGI Agent Command Center	Closes the loop from trace through eval to optimizer to route
You want a hosted gateway with the same proprietary polish but more routing depth	Portkey	Hosted virtual keys, prompt registry, and richer routing rules
You want a self-hosted, source-available proxy	LiteLLM	MIT-licensed proxy that runs entirely inside your VPC
You need enterprise SLA and plugin ecosystem on a Tier-1 API gateway	Kong AI Gateway	Extends a Kong stack with AI-specific policies and audit
You want a hosted gateway tightly bound to the Vercel platform	Vercel AI Gateway	Same one-line ergonomics, broader provider list, framework-native

Why people are leaving Cloudflare AI Gateway in 2026

Five exit drivers show up repeatedly. They aren’t all equal: the first three drive most migrations.

1. Routing intelligence stops at round-robin

Cloudflare AI Gateway’s routing surface, as documented in May 2026, supports fallback chains, per-provider rate limits, and basic retry on 5xx. What it doesn’t support: cost-aware, latency-aware, or model-aware routing (Sonnet for code, Haiku for classification, GPT-4o for vision, decided per request). The fallback chain is static and round-robin within tier. Teams running hot-swap A/B tests on Sonnet 4.6 vs GPT-4.1 vs Gemini 2.5 hit the ceiling fast and either bolt a routing layer in front (defeating the point) or migrate.
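To make the gap concrete, here is a minimal sketch of what model-aware plus cost-aware routing could look like as a per-request decision. The route table, model labels, prices, and latencies are all hypothetical placeholders, not real quotes or measurements, and this is not any vendor's actual routing engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str                  # illustrative label, not a real model ID
    cost_per_1k_tokens: float   # placeholder figure, not a real price
    p50_latency_ms: float       # placeholder figure, not a measurement
    task_classes: frozenset     # task classes this route is allowed to serve


# Hypothetical route table for illustration only.
ROUTES = [
    Route("code-model", 3.0, 900.0, frozenset({"code"})),
    Route("small-model", 0.25, 300.0, frozenset({"classification", "code"})),
    Route("vision-model", 2.5, 1100.0, frozenset({"vision"})),
]


def pick_route(task_class, max_latency_ms=None, routes=ROUTES):
    """Return the cheapest route that serves task_class within the latency budget."""
    candidates = [r for r in routes if task_class in r.task_classes]
    if max_latency_ms is not None:
        candidates = [r for r in candidates if r.p50_latency_ms <= max_latency_ms]
    if not candidates:
        raise LookupError(f"no route for task class {task_class!r}")
    return min(candidates, key=lambda r: r.cost_per_1k_tokens)
```

A static fallback chain cannot express this kind of choice, because the decision depends on the request's task class and on a latency budget rather than on a fixed provider order.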

2. Observability is per-request rows, not spans

The dashboard shows a flat list of requests with timestamp, model, latency, token counts, and cost. Useful for debugging a single failing call. Insufficient for an agent workload where one user action triggers eight LLM calls plus four tool calls: there’s no native span hierarchy, no session view, no agent-trace timeline. Cloudflare’s documented workaround is to emit OTel from your application code, but then the gateway isn’t the source of truth, your collector is, and you’re paying for two stacks.
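The difference between flat rows and a span hierarchy is easiest to see in code. The sketch below assumes each record carries a `span_id` and a `parent_id` (field names chosen for illustration, loosely following OpenTelemetry's parent/child idea); a per-request log without a parent pointer cannot be reassembled this way.

```python
from collections import defaultdict

# Hypothetical span records for one user action; field names are assumptions.
SAMPLE_SPANS = [
    {"span_id": "a", "parent_id": None, "kind": "agent", "name": "answer_ticket", "start_ms": 0},
    {"span_id": "c", "parent_id": "a", "kind": "tool", "name": "search_docs", "start_ms": 40},
    {"span_id": "b", "parent_id": "a", "kind": "llm", "name": "plan", "start_ms": 5},
    {"span_id": "d", "parent_id": "c", "kind": "llm", "name": "summarize", "start_ms": 60},
]


def build_trace_tree(spans):
    """Group flat span records by parent_id, with siblings ordered by start time."""
    children = defaultdict(list)
    for span in spans:
        children[span.get("parent_id")].append(span)
    for siblings in children.values():
        siblings.sort(key=lambda s: s["start_ms"])
    return children


def render(children, parent_id=None, depth=0):
    """Return indented lines, one per span, walking depth-first from the root."""
    lines = []
    for span in children.get(parent_id, []):
        lines.append("  " * depth + f'{span["kind"]}: {span["name"]}')
        lines.extend(render(children, span["span_id"], depth + 1))
    return lines
```

Calling `render(build_trace_tree(SAMPLE_SPANS))` yields one indented line per span, with the nested LLM call shown under the tool call that triggered it.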

3. Per-tenant chargeback gets harder at scale

The gateway exposes a cf-aig-metadata header that the client sets to tag requests by user, session, or team. The data lands in Logpush exports. For a small team, fine. For a multi-tenant SaaS billing 200 customers on per-token usage, the workflow (Logpush to R2, Athena or BigQuery query, join with the application identity table, generate invoice CSV) is a weekly batch job, not a live dashboard. /r/LLMDevs threads describe three sprints to ship a chargeback pipeline that other gateways ship native.
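As a rough illustration of both halves of that workflow, the sketch below builds a `cf-aig-metadata` header value as a JSON string and then sums cost per tenant from exported rows. The metadata keys (`tenant_id`, `team`) and the export row shape (`metadata`, `cost`) are assumptions made for the example; check your own Logpush schema before relying on them.

```python
import json
from collections import defaultdict


def metadata_header(tenant_id, team):
    """Build the cf-aig-metadata header as a compact JSON string (keys are hypothetical)."""
    payload = {"tenant_id": tenant_id, "team": team}
    return {"cf-aig-metadata": json.dumps(payload, separators=(",", ":"))}


def cost_by_tenant(rows):
    """Sum cost per tenant from exported log rows; the row shape is assumed."""
    totals = defaultdict(float)
    for row in rows:
        meta = json.loads(row.get("metadata") or "{}")
        totals[meta.get("tenant_id", "untagged")] += float(row["cost"])
    return dict(totals)
```

Untagged requests are grouped under `"untagged"` rather than dropped, so gaps in client-side tagging show up in the invoice reconciliation instead of disappearing.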

4. Lock-in to the Cloudflare ecosystem

The free price is the headline benefit and the lock-in vector. AI Gateway integrates cleanly with Workers AI, R2 for Logpush, D1 for application data, and the Cloudflare dashboard for routing config. A team with prompt versions in R2, audit logs in Logpush, and routing config in the dashboard has three Cloudflare touchpoints to migrate, not one. Most teams discover this after a year of accumulated artifacts.

5. No eval or optimizer loop

The largest functional gap, under-weighted at adoption. Cloudflare AI Gateway is a relay with a dashboard. It captures traces; it doesn’t score them, cluster failures, drive prompt rewrites, or update routing rules from trace data. In 2026, where Agent Command Center, Portkey’s Guardrails+Optimizer stack, and Maxim’s eval pipelines all close that loop, a relay-only gateway means prompt and routing improvements happen in spreadsheets and pull requests, not in the gateway.
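What "trace data changes how the gateway behaves" means can be sketched with a toy router that keeps a running mean of eval scores per route and prefers the best-scoring one. This is a deliberately simple illustration under assumed names (`FeedbackRouter`, `prior_score`); it is not how Agent Command Center, Portkey, or Maxim implement their loops.

```python
from collections import defaultdict


class FeedbackRouter:
    """Toy router: prefer the route with the highest mean eval score so far."""

    def __init__(self, routes, prior_score=0.5):
        self.routes = list(routes)
        self.prior = prior_score          # score assumed for routes with no data yet
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)

    def record(self, route, eval_score):
        """Feed one scored trace back into the router."""
        self.sums[route] += eval_score
        self.counts[route] += 1

    def mean_score(self, route):
        n = self.counts[route]
        return self.sums[route] / n if n else self.prior

    def choose(self):
        """Pick the best-scoring route; ties go to the earliest route in the list."""
        return max(self.routes, key=self.mean_score)
```

A relay-only gateway has no equivalent of `record`, so the same decision has to be made by a person reading a dashboard and editing configuration.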

What to look for in a Cloudflare AI Gateway replacement

Score replacements on the seven axes that map to the surfaces you’re actually outgrowing:

Axis	What it measures
1. Routing intelligence	Cost-aware, latency-aware, model-aware — or just round-robin?
2. Observability depth	Span hierarchies, session views, agent traces — or per-request rows?
3. Per-tenant chargeback	Native dashboard slicing by tenant — or a Logpush batch job?
4. Ecosystem portability	Does the gateway lock you to a single cloud’s storage and config?
5. Self-host posture	Can the gateway run inside your VPC, fully air-gapped from the vendor?
6. Eval + optimizer loop	Does the gateway use its own trace data to improve routing and prompts?
7. Migration tooling	Are there published guides or importers for Cloudflare specifically?

1. Future AGI Agent Command Center: Best for closing the loop

Verdict: Future AGI is the only gateway here that fixes Cloudflare AI Gateway’s largest gap: captured traces never feed back into how the gateway behaves. Agent Command Center captures the trace, scores it with the eval library, clusters failures, runs the optimizer, and pushes the updated route or prompt back into the gateway on the next request. The other four are observation and policy layers. FAGI is an observation layer wired to an optimizer.

What it fixes versus Cloudflare AI Gateway:

Routing intelligence beyond round-robin. Cost-aware and latency-aware routing rules ship native. Model-aware routing keys off prompt classification (Sonnet for code, Haiku for classification, GPT-4o for vision) and the classifier is itself a prompt the optimizer rewrites against eval scores.
Span hierarchies and session views. traceAI (Apache 2.0) instruments any agent framework (LangChain, LlamaIndex, CrewAI, plain SDK calls), producing OpenTelemetry spans with agent-trace conventions. One user action shows as one trace tree with LLM, tool, and retrieval calls nested correctly.
Per-tenant chargeback as a dashboard, not a batch job. Cost slices by session, user, repo, route, and tag are native dashboard filters. Invoice CSV is one export, not a three-sprint pipeline.
Native eval, not bolt-on. Every captured trace is scored against task-completion, faithfulness, and tool-use rubrics by default with ai-evaluation (Apache 2.0). Cost and quality data sit in the same row, so the routing policy can use both.
Self-improving loop. agent-opt (Apache 2.0) rewrites prompts via six optimizers (ProTeGi, GEPA, Bayesian, MetaPrompt, RandomSearch, PromptWizard), driven by eval scores. The routing policy is itself a target the optimizer can tune. Cloudflare’s gateway is static configuration; FAGI’s is a feedback loop.
OSS instrumentation, hosted control plane. traceAI, ai-evaluation, and agent-opt are all Apache 2.0. The hosted Command Center adds RBAC, failure-cluster views, the Protect guardrails layer (median 65 ms text-mode latency per arXiv 2510.13351, 107 ms image), and AWS Marketplace procurement.

Migration from Cloudflare AI Gateway: Cloudflare’s invocation pattern sets the SDK’s base_url to https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider} plus an API token. FAGI uses the same shape: change the base_url and swap the token. Routing rules, fallback chains, and metadata headers map directly. Eval and optimizer are additive: ship a like-for-like swap in two days, turn on the loop later. Timeline: three to five engineering days for the proxy cutover with shadow traffic; another sprint for eval and optimizer.

Where it falls short:

agent-opt is opt-in: start with traceAI + ai-evaluation in week one and turn the optimizer on once eval baselines stabilize. The loop compounds value over weeks rather than at day one.
For all-Cloudflare stacks (Workers, R2, D1), FAGI’s strongest surfaces sit outside the Cloudflare control plane. The integration is clean, but a single-vendor team gains a second vendor.

Pricing: Free tier with 100K traces/month. Scale tier from $99/month with linear per-trace scaling above 5M (no add-on multipliers). Enterprise with SOC 2 Type II and AWS Marketplace.

Score: 7 of 7 axes.

2. Portkey: Best for hosted depth

Verdict: Portkey is the natural step-up from Cloudflare AI Gateway when the requirement is “stay hosted but get more routing depth and a real prompt registry.” Closest functional match in hosted shape; the Palo Alto Networks April 2026 acquisition has added enterprise weight without (yet) reshaping the SMB SKU.

What it fixes versus Cloudflare AI Gateway:

Richer routing. Conditional routing matches on metadata, prompt class, or user attribute, then sends to the right provider. Cloudflare’s fallback is static; Portkey’s is rule-driven.
Native prompt registry. Prompt Studio stores prompts as versioned objects with diff views and rollback. Cloudflare has no first-party prompt store.
Virtual keys. Every developer or service gets a Portkey-issued key that fans out to one underlying provider key, preserving bulk pricing while exposing per-identity attribution. Stronger than Cloudflare’s metadata header.
Guardrails and Audit Logs as a bundled surface. Both configurable from the dashboard. Cloudflare leaves guardrails to upstream code.

Migration from Cloudflare AI Gateway: Both use the URL prefix pattern, so the base_url change is mechanical: point at https://api.portkey.ai/v1/proxy and add a virtual key header. Routing rules need re-expression in Portkey’s config (richer, so more work). The Logpush-to-Athena pipeline can be retired in favor of Portkey’s native dashboard. Timeline: five to seven engineering days.

Where it falls short:

No optimizer. Traces inform humans, not the gateway.
Pricing escalates above 5M requests/month; marginal cost compounds with add-ons (Guardrails, Prompt Studio, Audit Logs) enabled.
Post-Palo Alto acquisition, the standalone SMB SKU’s pricing trajectory is uncertain on a 12-to-24-month horizon.
Prompt Studio’s template dialect is Portkey-specific (handlebars-shaped with proprietary filters), so a future exit needs a rewrite step.

Pricing: Free tier with 10K requests/month. Scale tier from $99/month. Enterprise custom.

Score: 5 of 7 axes (missing: optimizer, fully open self-host).

3. LiteLLM: Best for self-hosted exit

Verdict: LiteLLM is the pick when the exit driver is ecosystem lock-in and the requirement is “this proxy runs entirely on our infrastructure, with source we can audit.” MIT-licensed, Python-native, the most popular self-hosted AI proxy on GitHub. You give up Cloudflare’s hosted polish; you gain full sovereignty.

What it fixes versus Cloudflare AI Gateway:

Ecosystem portability. A Python package and a container. Runs on Fly, AWS, GCP, bare metal. Cloudflare’s gateway is bound to Cloudflare. For multi-cloud or single-cloud-but-not-Cloudflare strategies, the cleanest swap.
Self-host posture. The proxy runs in your VPC. No telemetry leaves unless you configure an OTel sink. The answer for security reviews that flagged Cloudflare AI Gateway as a third-party data path.
Cost curve at scale. Open-source means no per-request licensing. Compute and storage scale linearly. Enterprise tier (from ~$250/month) adds SSO, audit, and SLA without per-request escalation.
Per-identity keys via team_id / user_id. A cleaner chargeback primitive than Cloudflare’s metadata header, with a native dashboard rather than a Logpush batch job.

Migration from Cloudflare AI Gateway: OpenAI-compatible endpoint, provider keys, metadata-as-key all map directly. URL prefix swap is one line. LiteLLM has no first-party prompt registry; pair with Langfuse, Future AGI, or in-repo Jinja2 files. You lose Cloudflare’s hosted dashboard UX and the Logpush-to-R2 pipeline. Timeline: five to seven engineering days plus a week for whatever observability sink you choose.

Where it falls short:

No optimizer.
The bundled UI is the weakest in this list; polish lives in the Enterprise tier.
Prompt-library story is a separate purchase or build.
Self-hosted ops carries an SRE cost the Cloudflare free tier didn’t.

Pricing: Open source under MIT. Enterprise from ~$250/month for small teams.

Score: 5 of 7 axes (missing: native prompt registry, optimization loop).

4. Kong AI Gateway: Best for enterprise platform teams

Verdict: Kong AI Gateway is the pick when your platform team already runs Kong for the company’s REST APIs and the path of least resistance is to extend the existing stack with AI-specific policies. Strengths: SLA, plugin ecosystem, audit posture. Weakness: AI-specific surfaces (prompt registry, eval, optimizer) live in plugins or upstream tools, not the product.

What it fixes versus Cloudflare AI Gateway:

Enterprise SLA and procurement. A Tier-1 API-gateway vendor for a decade. Compliance covers SOC 2, ISO 27001, and HIPAA-eligible workloads, the bar Cloudflare AI Gateway’s free tier was never sold against.
Plugin ecosystem. Existing Kong customers reuse rate-limiting, auth, request-transformation, and OTel plugins. The AI Proxy plugin (Kong 3.6+) handles Anthropic, OpenAI, Bedrock, and Vertex passthrough including tool calls. The OTel plugin gives the span hierarchies Cloudflare’s per-request log never had.
Self-host posture. Runs anywhere: bare metal, VPC, hybrid. Konnect (managed) is optional. No single-cloud lock-in.
Per-tenant chargeback via consumer + tag. Consumer abstraction plus tagging plus OTel feeds Grafana or Datadog for live per-tenant dashboards. Replaces the Logpush-to-Athena batch job.

Migration from Cloudflare AI Gateway: OpenAI-compatible endpoint via the AI Proxy plugin, provider keys, consumer + tag as the metadata analog, OTel for observability, all map directly. The URL prefix swap is one line. Kong has no first-party prompt registry; pair with Langfuse, Future AGI, or an in-house Git-backed store. You lose Cloudflare’s free tier and zero-ops posture. Timeline: ten to fifteen engineering days because work spans platform (AI Proxy plugin, OTel sink, Grafana for chargeback) and application (prompt-registry replacement) teams.

Where it falls short:

AI-specific observability is plugin-driven. The default dashboard is the API-gateway view, not the LLM-cost view.
No optimizer, no prompt registry, no eval library.
The two-week-plus setup means migration ROI shows up later than lighter alternatives.
Free tier is open-source self-host only; Konnect managed starts free but scales toward enterprise plans from ~$1.5K/month.

Pricing: Kong AI Gateway is open source. Konnect (managed) starts free. Enterprise plans from ~$1.5K/month.

Score: 5 of 7 axes (missing: native prompt registry, optimizer, native AI cost dashboard).

5. Vercel AI Gateway: Best for Vercel-platform teams

Verdict: Vercel AI Gateway is the pick when the application is already on Vercel and the goal is to swap Cloudflare’s URL prefix for Vercel’s without changing anything else. Same one-line ergonomics, broader provider list, framework-native bindings to the AI SDK, procurement on the existing Vercel invoice.

What it fixes versus Cloudflare AI Gateway:

Framework-native integration. Wired into Vercel’s AI SDK so the gateway feels like a framework feature rather than a separate vendor. For teams already on Next.js + AI SDK, the smallest delta.
Broader provider list. As of May 2026, routes to OpenAI, Anthropic, Google, Mistral, Groq, xAI, Cohere, and Bedrock with first-class support. Failover-plus-aliasing is slightly richer than Cloudflare’s for cross-provider swaps.
Procurement and billing on existing Vercel. Already paying Vercel for hosting? The gateway shows up on the same invoice. No new vendor onboarding, no new security review.
Edge runtime alignment. Co-located with the application’s edge functions, removing one hop teams sometimes hit when running Vercel functions through Cloudflare AI Gateway.

Migration from Cloudflare AI Gateway: Pure URL prefix swap. Replace https://gateway.ai.cloudflare.com/v1/... with https://ai-gateway.vercel.sh/v1/... and swap the token. Metadata headers are similar; observability is comparable depth (per-request, with session grouping). Timeline: two to three days.

Where it falls short:

No optimizer.
Observability is still per-request, slightly richer than Cloudflare’s dashboard but not span-hierarchy native. Production agent teams hit the same ceiling six months later.
No first-party prompt registry, no eval library, no audit-log depth at the level of Kong or Portkey.
Ecosystem lock-in is real: you trade Cloudflare’s lock for Vercel’s. For teams whose exit driver was specifically “single-vendor risk,” this gateway doesn’t solve the problem.
Pricing once usage scales is competitive with Cloudflare’s paid tier but no longer free.

Pricing: Free credit on Vercel Pro/Enterprise; usage-based above the credit, with rates per provider.

Score: 4 of 7 axes (missing: optimizer, native prompt registry, depth on chargeback and span hierarchies).

Capability matrix

Axis	Future AGI	Portkey	LiteLLM	Kong AI Gateway	Vercel AI Gateway
Routing intelligence	Cost + latency + model-aware, optimizer-tuned	Conditional rules	Cost + latency rules	Plugin-driven	Failover + aliasing
Observability depth	Span hierarchies + sessions	Per-request + session	Functional UI	OTel plugin + Grafana	Per-request + grouping
Per-tenant chargeback	Native dashboard	Virtual-key dashboard	`team_id` / `user_id`	Consumer + tag + Grafana	Per-request usage
Ecosystem portability	BYOC + OSS	Hosted (closed)	MIT, runs anywhere	OSS, runs anywhere	Vercel-bound
Self-host posture	OSS instrumentation, hosted control plane	Hosted only	MIT, full VPC	OSS, runs anywhere	Hosted only
Eval + optimizer loop	Yes (`ai-evaluation` + `agent-opt`)	No	No	No	No
Cloudflare migration tooling	Routing + metadata mapping	URL prefix swap	URL prefix swap	Manual setup	URL prefix swap

Migration notes: what breaks when leaving Cloudflare AI Gateway

Cloudflare AI Gateway uses the URL prefix pattern, so the cutover is a base_url change. Three surfaces around that one-line change need attention.
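One way to keep the cutover to a genuine one-line change is to resolve the gateway URL in a single helper. The sketch below is illustrative: the environment variable names (`LLM_GATEWAY_BASE_URL`, `CF_ACCOUNT_ID`, `CF_GATEWAY_ID`) are hypothetical, and only the Cloudflare URL template comes from the pattern described in this guide.

```python
import os

CLOUDFLARE_TEMPLATE = (
    "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}"
)


def gateway_base_url(provider, env=None):
    """Resolve the gateway base URL from one place.

    If LLM_GATEWAY_BASE_URL (hypothetical name) is set, it wins, so moving to a
    new gateway is a config change. Otherwise fall back to the Cloudflare prefix.
    """
    env = os.environ if env is None else env
    override = env.get("LLM_GATEWAY_BASE_URL")
    if override:
        return override.rstrip("/")
    return CLOUDFLARE_TEMPLATE.format(
        account_id=env["CF_ACCOUNT_ID"],
        gateway_id=env["CF_GATEWAY_ID"],
        provider=provider,
    )
```

The returned string is what you would pass as `base_url` when constructing the provider SDK client; every service that imports this helper moves together when the override is set.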

Re-routing client base URLs

Cloudflare AI Gateway is invoked by setting the SDK’s base_url to https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider} plus an API token header. Every alternative here uses the same URL prefix shape, so the cutover is one line. But services hard-code the URL in three places: SDK initialization, runtime config (often a feature-flag file or Workers env binding), and the deployment manifest. Teams with the URL in one place finish in an afternoon; teams with it pasted across services budget two days for the audit alone.
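The audit itself can be partly automated. The following is a minimal sketch that scans a source tree for the Cloudflare gateway prefix; the file-suffix list and function names are assumptions for illustration, and it will miss URLs assembled at runtime from separate account and gateway IDs.

```python
import re
from pathlib import Path

# Matches the Cloudflare AI Gateway prefix followed by any non-whitespace path.
GATEWAY_URL = re.compile(r"https://gateway\.ai\.cloudflare\.com/v1/\S*")

# Hypothetical set of file types worth scanning; adjust for your repositories.
SUFFIXES = {".py", ".ts", ".js", ".json", ".yaml", ".yml", ".toml"}


def find_in_text(text):
    """Return (line_number, url) pairs for every gateway URL found in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in GATEWAY_URL.finditer(line):
            # Trim quotes and punctuation that trail the URL in source files.
            hits.append((lineno, match.group(0).rstrip("\"',;)")))
    return hits


def find_hardcoded_urls(root):
    """Walk root and return (path, line_number, url) for each hard-coded URL."""
    results = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in SUFFIXES:
            text = path.read_text(encoding="utf-8", errors="ignore")
            results.extend((str(path), n, url) for n, url in find_in_text(text))
    return results
```

Running `find_hardcoded_urls(".")` from each repository root gives a first-pass list of SDK initializations, config files, and manifests that still point at Cloudflare.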

Replacing the Logpush chargeback pipeline

If you built per-tenant chargeback as Logpush → R2 → Athena → CSV, the replacement gateway probably ships chargeback native and the pipeline can be retired. Three steps: stand up the new dashboard, validate that totals match the last 30 days of Athena exports within a few percent, then turn off Logpush. Portkey and Future AGI expose per-tenant cost slices natively; LiteLLM and Kong need the consumer/tag pattern plus a Grafana board.
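The validation step can be expressed as a small reconciliation check. This sketch compares per-tenant totals from the old export against the new gateway's numbers and reports tenants that drift beyond a tolerance; the 3% default is an assumed reading of "a few percent", and the input dictionaries are hypothetical.

```python
def reconcile(old_totals, new_totals, tolerance=0.03):
    """Return {tenant: relative_drift} for tenants that differ beyond tolerance.

    Drift is measured against the larger of the two totals, so a tenant that is
    missing from one side shows up with a drift of 1.0.
    """
    mismatches = {}
    for tenant in sorted(set(old_totals) | set(new_totals)):
        old = old_totals.get(tenant, 0.0)
        new = new_totals.get(tenant, 0.0)
        baseline = max(abs(old), abs(new))
        drift = 0.0 if baseline == 0 else abs(new - old) / baseline
        if drift > tolerance:
            mismatches[tenant] = round(drift, 4)
    return mismatches
```

An empty result is the signal that Logpush can be switched off; any tenant listed is worth investigating before the old pipeline is retired.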

Reshaping routing rules

Cloudflare’s fallback chain is static and round-robin within tier. Every alternative supports richer rules, but they don’t translate one-to-one. Start with a like-for-like fallback chain on the new gateway, validate parity in shadow mode for a week, then add cost-aware, latency-aware, or model-aware routing. Redesigning routing during the cutover compounds risk.
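A like-for-like fallback chain and a shadow-mode parity check can be sketched as follows. The provider callables and the parity metric (the fraction of requests where both chains settle on the same provider) are illustrative assumptions, not a feature of any specific gateway.

```python
def call_with_fallback(chain, request):
    """Try each (name, callable) in order; return (name, response) from the first success."""
    failed = []
    for name, call in chain:
        try:
            return name, call(request)
        except Exception:
            # A production gateway would fall through only on retryable errors.
            failed.append(name)
    raise RuntimeError(f"all providers failed: {failed}")


def shadow_parity(old_chain, new_chain, requests):
    """Fraction of requests for which both chains end up on the same provider."""
    if not requests:
        return 1.0
    same = 0
    for request in requests:
        old_name, _ = call_with_fallback(old_chain, request)
        new_name, _ = call_with_fallback(new_chain, request)
        same += old_name == new_name
    return same / len(requests)
```

Holding parity near 1.0 for the shadow week before introducing richer rules keeps routing changes and the gateway swap as separate, individually reversible steps.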

Decision framework: Choose X if

Choose Future AGI if your reason for leaving is more than swapping vendors: you also want trace data to drive prompt rewrites and routing-policy updates, so the cost curve bends down over time. Pick this when production agent workloads are a significant line item and the OSS instrumentation (traceAI, ai-evaluation, agent-opt) plus the hosted Command Center together justify the migration.

Choose Portkey if you want the closest hosted match to Cloudflare’s shape with more routing depth, a real prompt registry, and virtual keys for multi-tenant chargeback. Pick this when you’re comfortable with the post-acquisition pricing trajectory and willing to accept the proprietary prompt-template dialect.

Choose LiteLLM if ecosystem lock-in is the dealbreaker and the requirement is “runs on our hardware, with source we can audit.” Pick this when self-host posture beats hosted polish and you have engineering budget for a separate prompt store and observability sink.

Choose Kong AI Gateway if your platform team already runs Kong and the path of least resistance is to extend the existing stack. Pick this when SLA, plugin ecosystem, and operational familiarity outweigh AI-specific shallowness.

Choose Vercel AI Gateway if your application is already on Vercel and the goal is the smallest possible delta. Pick this when “single-vendor risk” isn’t your exit driver and framework-native integration with the AI SDK is worth more than the depth you give up.

What we did not include

Three products show up in other 2026 listicles that we left out: OpenRouter (consumer-facing model marketplace, not the shape for an enterprise gateway replacement with chargeback and routing depth); Helicone (capable hosted observation layer, but routing is basic and chargeback depth is comparable to Cloudflare’s, so it solves only one of the five exit drivers); TrueFoundry (capable MLOps gateway, but Cloudflare-specific migration tooling isn’t published yet, worth a second look in Q3 2026).

Sources

Cloudflare AI Gateway documentation, developers.cloudflare.com/ai-gateway
Cloudflare AI Gateway routing and fallback reference, developers.cloudflare.com/ai-gateway/configuration
Cloudflare Logpush for AI Gateway, developers.cloudflare.com/ai-gateway/observability
Hacker News threads on AI Gateway migration patterns, Q1-Q2 2026, news.ycombinator.com
Reddit /r/LLMDevs migration discussions, January-May 2026
Portkey product documentation, portkey.ai/docs
LiteLLM GitHub repository, github.com/BerriAI/litellm
Kong AI Gateway product page, konghq.com/products/kong-ai-gateway
Vercel AI Gateway documentation, vercel.com/docs/ai-gateway
Future AGI Agent Command Center, futureagi.com/platform/monitor/command-center
Future AGI traceAI, github.com/future-agi/traceAI (Apache 2.0)
Future AGI ai-evaluation, github.com/future-agi/ai-evaluation (Apache 2.0)
Future AGI agent-opt, github.com/future-agi/agent-opt (Apache 2.0)
Future AGI Protect latency benchmark, arxiv.org/abs/2510.13351 (65 ms text, 107 ms image)

Frequently asked questions

Why are people moving off Cloudflare AI Gateway in 2026?

Five reasons, in order of weight: routing stops at round-robin and static fallback; observability is per-request rows rather than span hierarchies; per-tenant chargeback requires a Logpush-to-Athena pipeline rather than a native dashboard; the gateway locks teams into the Cloudflare ecosystem; no eval or optimizer loop, so trace data informs humans but never changes how the gateway behaves.

What is the closest like-for-like alternative to Cloudflare AI Gateway?

For the smallest delta on hosted ergonomics, Vercel AI Gateway — pure URL prefix swap, comparable observability, broader provider list. For more routing depth and a real prompt registry, Portkey. For the eval and optimizer loop on top, Future AGI Agent Command Center.

How do I migrate off Cloudflare AI Gateway?

The headline step is changing the SDK's `base_url` from `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}` to the new gateway's prefix and swapping the auth header. The work around that one line: audit every place the URL is hard-coded (SDK init, runtime config, deployment manifests), reshape routing rules on the new gateway, and replace any Logpush-based chargeback pipeline with the new gateway's native dashboard or a fresh OTel sink.

Is there an open-source Cloudflare AI Gateway alternative?

Yes. LiteLLM (MIT) and Kong AI Gateway are fully open source and self-hostable. Future AGI's `traceAI`, `ai-evaluation`, and `agent-opt` libraries are Apache 2.0; the Command Center hosted product layers on top. Portkey and Vercel AI Gateway are hosted-only.

Which Cloudflare AI Gateway alternative is cheapest at scale?

Cloudflare's free tier is hard to beat for low volume. Above a few million requests per month, self-hosted LiteLLM is usually the cheapest absolute spend, at the cost of engineering time. Future AGI's linear scaling above 5M traces (no add-on multipliers) is the most predictable hosted option once the free tier no longer covers your traffic.

How does Future AGI Agent Command Center compare to Cloudflare AI Gateway?

Cloudflare AI Gateway is a relay with a dashboard — strong free-tier ergonomics, shallow routing and observability. Future AGI is a relay plus eval suite plus optimizer, so trace data drives prompt rewrites and routing-policy updates over time. Both use the URL prefix pattern, so the cutover is a `base_url` change. FAGI's instrumentation libraries are Apache 2.0; the Protect guardrails layer adds median 65 ms text-mode latency per arXiv 2510.13351.
