What Is a Routing Policy?
The configuration object inside an LLM gateway that binds a routing strategy and targets to a set of conditions, evaluated on every request.
A routing policy is the configuration object inside an LLM gateway that defines how a request is mapped to a provider and model. It bundles a strategy (round-robin, weighted, least-latency, cost-optimized, or conditional), a list of targets, an optional condition tree, fallback chains, and timeouts into one named resource. The router evaluates the policy on every request. The policy is the configuration; the strategy is the algorithm. FutureAGI’s Agent Command Center exposes routing policies as a first-class CRUD resource on the gateway control plane.
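To make the split between policy and strategy concrete, here is a minimal sketch in plain Python. It is not the agentcc SDK: `Target`, `RoutingPolicy`, `make_round_robin`, `make_weighted`, and `build_selector` are hypothetical names invented for illustration. The policy holds only data, and the strategy is a function chosen by the policy's `strategy` field.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    provider: str
    weight: int = 1


@dataclass
class RoutingPolicy:
    """Hypothetical policy object: configuration only, no selection logic."""
    name: str
    strategy: str
    targets: list
    timeout_s: float = 30.0


def make_round_robin(targets):
    """Strategy: cycle through the targets in order."""
    cycle = itertools.cycle(targets)
    return lambda: next(cycle)


def make_weighted(targets, rng=None):
    """Strategy: pick a target with probability proportional to its weight."""
    rng = rng or random.Random()
    weights = [t.weight for t in targets]
    return lambda: rng.choices(targets, weights=weights, k=1)[0]


# The algorithm is looked up by name; the policy never contains it.
STRATEGIES = {"round-robin": make_round_robin, "weighted": make_weighted}


def build_selector(policy):
    return STRATEGIES[policy.strategy](policy.targets)


policy = RoutingPolicy(
    name="prod-chat-routing",
    strategy="round-robin",
    targets=[Target("openai-east", 50), Target("openai-west", 50)],
)
pick = build_selector(policy)
picks = [pick().provider for _ in range(4)]
```

Changing `strategy` to `"weighted"` swaps the algorithm without touching the target list, which is the property the definition above relies on.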
Why it matters in production LLM/agent systems
Hard-coding routing in YAML works for a single deployment. As soon as a team has staging vs. production, multiple models, A/B tests, and per-team SLAs, routing has to be configurable at runtime. The pain shows up in three ways:
- Deploy-coupled changes. Every routing tweak (shifting weight from openai-east to openai-west, adding a new model fallback) turns into a config-PR, deploy, restart cycle.
- Untracked drift. Without a versioned policy object, “why is this user getting Claude all of a sudden?” becomes archaeology across git history.
- No multi-tenant isolation. A platform team that wants to give Team A 100% Anthropic and Team B 100% OpenAI ends up writing one big conditional rule that nobody can read.
A first-class routing-policy resource fixes all three. Policies are named, versioned, and updatable via API. Each request carries the policy_id it was routed by, so dashboards and traces explain themselves. For agent systems where one task triggers many model calls, policies are how a platform team prevents one team’s runaway agent from starving another’s.
We’ve found that the second-order win is regression evidence. When a policy update lands and cost-per-trace shifts, a versioned policy_id lets the on-call engineer pinpoint which routing decision changed and roll back without touching application code. In our 2026 evals, teams running first-class routing-policy resources cut routing-related incident MTTR by roughly 60% compared with teams reading raw YAML in git history.
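One way to picture "versioned, with rollback" is an append-only revision list per policy. The sketch below is an in-memory stand-in written for illustration; `PolicyStore` and `PolicyRevision` are hypothetical names, and nothing here describes how the FutureAGI control plane actually stores revisions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRevision:
    policy_id: str
    version: int
    config: dict


class PolicyStore:
    """In-memory stand-in for a control plane that keeps every revision."""

    def __init__(self):
        self._revisions = {}  # policy_id -> list of PolicyRevision

    def update(self, policy_id, config):
        """Append a new revision; versions start at 1."""
        history = self._revisions.setdefault(policy_id, [])
        revision = PolicyRevision(policy_id, len(history) + 1, dict(config))
        history.append(revision)
        return revision

    def current(self, policy_id):
        return self._revisions[policy_id][-1]

    def rollback(self, policy_id, to_version):
        """Re-publish an older config as a new revision so history stays append-only."""
        old = self._revisions[policy_id][to_version - 1]
        return self.update(policy_id, old.config)


store = PolicyStore()
store.update("prod-chat-routing", {"strategy": "cost-optimized"})
store.update("prod-chat-routing", {"strategy": "least-latency"})
restored = store.rollback("prod-chat-routing", to_version=1)
```

Rolling back by appending, rather than deleting the bad revision, keeps the evidence of what changed and when, which is what the on-call engineer needs after the incident.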
How FutureAGI handles it
FutureAGI’s routing-policies resource is exposed in the Python and TypeScript SDKs and as REST on the Agent Command Center control plane. The shape:
from agentcc import AgentCommandCenter

client = AgentCommandCenter()
policy = client.routing_policies.create(
    name="prod-chat-routing",
    strategy="cost-optimized",
    config={
        "targets": {
            "gpt-4o": [
                {"provider": "openai-east", "weight": 50},
                {"provider": "openai-west", "weight": 50},
            ],
        },
        "model_fallbacks": {
            "gpt-4o": ["claude-sonnet-4", "gemini-2.0-pro"],
        },
        "conditional_routes": [
            {
                "name": "enterprise-tier",
                "priority": 10,
                "condition": {"field": "metadata.tier", "op": "$eq", "value": "enterprise"},
                "action": {"provider": "openai-dedicated"},
            },
        ],
    },
)
Internally, the gateway compiles the policy into a routing strategy plus a conditional router that evaluates $eq, $ne, $in, $nin, $regex, $gt, $lt, $gte, $lte, and $exists operators with $and, $or, $not combinators. Fields available on every request: model, user, stream, provider, session_id, request_id, metadata.<key>. Routes sort by priority (lowest first), and the first match wins.
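The evaluation rules in the paragraph above can be sketched in a few dozen lines. This is an illustrative reimplementation, not the gateway's code. Three things are assumptions: the combinator shape (`{"$and": [...]}`, `{"$or": [...]}`, `{"$not": {...}}`), the rule that a missing field fails every comparison except `$exists`, and the helper names `get_field`, `matches`, and `route`. The `gpt4-family` rule is made up for the example.

```python
import re

_MISSING = object()  # sentinel for "field not present on the request"


def get_field(request, path):
    """Resolve a dotted path such as 'metadata.tier'; return _MISSING if absent."""
    node = request
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


COMPARATORS = {
    "$eq": lambda a, b: a == b,
    "$ne": lambda a, b: a != b,
    "$in": lambda a, b: a in b,
    "$nin": lambda a, b: a not in b,
    "$regex": lambda a, b: isinstance(a, str) and re.search(b, a) is not None,
    "$gt": lambda a, b: a > b,
    "$lt": lambda a, b: a < b,
    "$gte": lambda a, b: a >= b,
    "$lte": lambda a, b: a <= b,
}


def matches(cond, request):
    """Evaluate a condition tree against a request dict."""
    if "$and" in cond:
        return all(matches(c, request) for c in cond["$and"])
    if "$or" in cond:
        return any(matches(c, request) for c in cond["$or"])
    if "$not" in cond:
        return not matches(cond["$not"], request)
    value = get_field(request, cond["field"])
    if cond["op"] == "$exists":
        return (value is not _MISSING) == bool(cond["value"])
    if value is _MISSING:
        return False  # assumption: a missing field fails every other comparison
    return COMPARATORS[cond["op"]](value, cond["value"])


def route(routes, request, default):
    """Try routes in ascending priority order; the first match wins."""
    for rule in sorted(routes, key=lambda r: r["priority"]):
        if matches(rule["condition"], request):
            return rule["action"]
    return default


routes = [
    {"name": "gpt4-family", "priority": 50,
     "condition": {"field": "model", "op": "$regex", "value": "^gpt-4"},
     "action": {"provider": "openai-west"}},
    {"name": "enterprise-tier", "priority": 10,
     "condition": {"field": "metadata.tier", "op": "$eq", "value": "enterprise"},
     "action": {"provider": "openai-dedicated"}},
]
request = {"model": "gpt-4o", "metadata": {"tier": "enterprise"}}
chosen = route(routes, request, default={"provider": "openai-east"})
```

Both rules match this request, but `enterprise-tier` has the lower priority number, so it is tried first and its action is returned.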
The OTel span emitted by the router carries agentcc.routing.policy_id, agentcc.routing.strategy, and agentcc.routing.target, joining the rest of the traceAI tree. Unlike Portkey’s “configs”, which flatten policy and strategy into one blob, Agent Command Center keeps the policy object versioned and the strategy a small set of named primitives, so changes are auditable. FutureAGI’s control plane stores the full revision history.
How to measure or detect it
Track routing-policy health with:
- Policy-version drift: number of active policies vs. policies referenced in the last 24h. Stale policies should be pruned.
- Conditional-rule match rate, per named rule. A rule with 0 matches in a week is probably dead.
- Target healthy ratio: fraction of targets passing circuit-breaker checks under each policy.
- Per-policy cost: cost_usd rolled up by agentcc.routing.policy_id to confirm a cost-optimized policy is actually optimising.
- Per-policy p99 latency: confirms a least-latency policy is converging on the fastest providers.
# Update a policy at runtime: no gateway restart.
client.routing_policies.update(policy.id, config={...new shape...})
The same policy ID flows into FutureAGI’s evaluation surface, so regression evals can be filtered by the policy that generated their traces.
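Two of the measurements listed above, per-policy cost and conditional-rule match rate, reduce to a simple rollup over exported spans. The sketch below assumes the spans have already been flattened into dictionaries. The `agentcc.routing.policy_id` and `cost_usd` keys come from the text above; the `matched_rule` key, the policy IDs, the `beta-users` rule, and all the numbers are made up for illustration.

```python
from collections import defaultdict

# Hypothetical flattened span records; in practice these would come from a trace export.
spans = [
    {"agentcc.routing.policy_id": "pol_a", "matched_rule": "enterprise-tier", "cost_usd": 0.012},
    {"agentcc.routing.policy_id": "pol_a", "matched_rule": None, "cost_usd": 0.004},
    {"agentcc.routing.policy_id": "pol_b", "matched_rule": None, "cost_usd": 0.020},
]
# Every named rule defined across the policies under review.
known_rules = {"enterprise-tier", "beta-users"}

cost_by_policy = defaultdict(float)
rule_hits = {name: 0 for name in known_rules}

for span in spans:
    # Per-policy cost: sum cost_usd grouped by the policy that routed the request.
    cost_by_policy[span["agentcc.routing.policy_id"]] += span["cost_usd"]
    # Rule match count: only spans that matched a known named rule contribute.
    if span["matched_rule"] in rule_hits:
        rule_hits[span["matched_rule"]] += 1

# Rules with zero matches in the window are candidates for removal.
dead_rules = sorted(name for name, hits in rule_hits.items() if hits == 0)
```

Run over a week of spans, `dead_rules` gives the list of probably-dead rules, and comparing `cost_by_policy` before and after a policy update shows whether a cost-optimized policy is doing its job.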
Common mistakes
- Treating routing-policy and routing-strategy as synonyms. The policy is the object; the strategy is one field on it.
- Putting too much logic in conditional rules. Past 10 rules, refactor into named policies and switch policies via metadata.
- Editing YAML in production instead of using the routing-policies API. YAML is for bootstrap; API is for runtime.
- Using a single policy for all teams. Multi-tenant systems benefit from per-team policies and a metadata.team conditional rule to pick between them.
- Forgetting that priority is ascending. priority: 10 runs before priority: 50.
Frequently Asked Questions
What is a routing policy?
A routing policy is the configuration object in an LLM gateway that binds a routing strategy (round-robin, weighted, least-latency, cost-optimized, conditional) to a set of targets, evaluated on every request.
How is a routing policy different from a routing strategy?
The strategy is the algorithm: round-robin, weighted, etc. The policy is the configuration object that pairs a strategy with targets, conditions, fallback chains, and timeouts.
How does FutureAGI implement routing policies?
Agent Command Center exposes a routing-policies CRUD resource on the control plane. Each policy has a name, a strategy, and a config (targets, conditions, comparators), and is applied at request time.