What Is a Routing Policy?
The configuration object inside an LLM gateway that binds a routing strategy and targets to a set of conditions, evaluated on every request.
A routing policy is the configuration object inside an LLM gateway that defines how a request is mapped to a provider and model. It bundles a strategy (round-robin, weighted, least-latency, cost-optimized, or conditional), a list of targets, an optional condition tree, fallback chains, and timeouts into one named resource. The router evaluates the policy on every request. The policy is the configuration; the strategy is the algorithm. FutureAGI’s Agent Command Center exposes routing policies as a first-class CRUD resource on the gateway control plane.
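The policy-vs-strategy split can be sketched as a plain data structure. This is an illustrative shape only, not the gateway's actual schema; every field name here mirrors the concepts above, and `timeout_s` is an assumed default:

```python
from dataclasses import dataclass, field

# Illustrative sketch only: the policy is the named object,
# the strategy is just one field on it.
@dataclass
class RoutingPolicy:
    name: str                    # named, versioned resource on the control plane
    strategy: str                # "round-robin" | "weighted" | "least-latency" | ...
    targets: dict                # model -> list of weighted provider targets
    model_fallbacks: dict = field(default_factory=dict)
    conditional_routes: list = field(default_factory=list)
    timeout_s: float = 30.0      # assumed default for illustration

policy = RoutingPolicy(
    name="prod-chat-routing",
    strategy="cost-optimized",
    targets={"gpt-4o": [{"provider": "openai-east", "weight": 50}]},
)
```

Swapping the strategy changes the algorithm; the policy object, its name, and its version history stay the same resource.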
Why it matters in production LLM/agent systems
Hard-coding routing in YAML works for a single deployment. As soon as a team has staging vs. production, multiple models, A/B tests, and per-team SLAs, routing has to be configurable at runtime. The pain shows up in three ways:
- Deploy-coupled changes. Every routing tweak — shifting weight from openai-east to openai-west, adding a new model fallback — turns into a config-PR, deploy, restart cycle.
- Untracked drift. Without a versioned policy object, “why is this user getting Claude all of a sudden?” becomes archaeology across git history.
- No multi-tenant isolation. A platform team that wants to give Team A 100% Anthropic and Team B 100% OpenAI ends up writing one big conditional rule that nobody can read.
A first-class routing-policy resource fixes all three. Policies are named, versioned, and updatable via API. Each request carries the policy_id it was routed by, so dashboards and traces explain themselves. For agent systems where one task triggers many model calls, policies are how a platform team prevents one team’s runaway agent from starving another’s.
We’ve found that the second-order win is regression evidence. When a policy update lands and cost-per-trace shifts, a versioned policy_id lets the on-call engineer pinpoint which routing decision changed and roll back without touching application code. In our 2026 evals, teams running first-class routing-policy resources cut routing-related incident MTTR by roughly 60% compared with teams reading raw YAML in git history.
How FutureAGI handles it
FutureAGI’s routing-policies resource is exposed in the Python and TypeScript SDKs and as REST on the Agent Command Center control plane. The shape:
```python
from agentcc import AgentCommandCenter

client = AgentCommandCenter()

policy = client.routing_policies.create(
    name="prod-chat-routing",
    strategy="cost-optimized",
    config={
        "targets": {
            "gpt-4o": [
                {"provider": "openai-east", "weight": 50},
                {"provider": "openai-west", "weight": 50},
            ],
        },
        "model_fallbacks": {
            "gpt-4o": ["claude-sonnet-4", "gemini-2.0-pro"],
        },
        "conditional_routes": [
            {
                "name": "enterprise-tier",
                "priority": 10,
                "condition": {"field": "metadata.tier", "op": "$eq", "value": "enterprise"},
                "action": {"provider": "openai-dedicated"},
            },
        ],
    },
)
```
Internally, the gateway compiles the policy into a routing strategy plus a conditional router that evaluates $eq, $ne, $in, $nin, $regex, $gt, $lt, $gte, $lte, and $exists operators with $and, $or, $not combinators. Fields available on every request: model, user, stream, provider, session_id, request_id, metadata.<key>. Routes sort by priority (lowest first), and the first match wins.
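A minimal sketch of how such a condition tree could be evaluated. This is illustrative only; the gateway's compiled router is not public, and the helper names here (`get_field`, `evaluate`, `match_route`) are invented for the example:

```python
import re

def get_field(request: dict, path: str):
    """Resolve dotted paths like 'metadata.tier' against a request dict."""
    value = request
    for part in path.split("."):
        if not isinstance(value, dict):
            return None
        value = value.get(part)
    return value

def evaluate(cond: dict, request: dict) -> bool:
    # Combinators: $and / $or take lists of sub-conditions, $not takes one.
    if "$and" in cond:
        return all(evaluate(c, request) for c in cond["$and"])
    if "$or" in cond:
        return any(evaluate(c, request) for c in cond["$or"])
    if "$not" in cond:
        return not evaluate(cond["$not"], request)

    value = get_field(request, cond["field"])
    op, target = cond["op"], cond.get("value")
    return {
        "$eq":     lambda: value == target,
        "$ne":     lambda: value != target,
        "$in":     lambda: value in target,
        "$nin":    lambda: value not in target,
        "$regex":  lambda: value is not None and re.search(target, str(value)) is not None,
        "$gt":     lambda: value is not None and value > target,
        "$lt":     lambda: value is not None and value < target,
        "$gte":    lambda: value is not None and value >= target,
        "$lte":    lambda: value is not None and value <= target,
        "$exists": lambda: (value is not None) == target,
    }[op]()

def match_route(routes: list, request: dict):
    # Routes sort by ascending priority; first match wins.
    for route in sorted(routes, key=lambda r: r["priority"]):
        if evaluate(route["condition"], request):
            return route["action"]
    return None
```

With the `enterprise-tier` rule from the policy above, a request carrying `metadata.tier = "enterprise"` resolves to `{"provider": "openai-dedicated"}`, while any other tier falls through to the base strategy.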
The OTel span emitted by the router carries agentcc.routing.policy_id, agentcc.routing.strategy, and agentcc.routing.target, joining the rest of the traceAI tree. Unlike Portkey’s “configs” — which flatten policy and strategy into one blob — Agent Command Center keeps the policy object versioned and the strategy a small set of named primitives, so changes are auditable. FutureAGI’s control plane stores the full revision history.
How to measure or detect it
Track routing-policy health with:
- Policy-version drift — number of active policies vs. policies referenced in the last 24h. Stale policies should be pruned.
- Conditional-rule match rate — per named rule. A rule with 0 matches in a week is probably dead.
- Target healthy ratio — fraction of targets passing circuit-breaker checks under each policy.
- Per-policy cost — `cost_usd` rolled up by `agentcc.routing.policy_id` to confirm a `cost-optimized` policy is actually optimising.
- Per-policy p99 latency — confirms a `least-latency` policy is converging on the fastest providers.
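The rollups above can be computed from exported span rows. A hedged sketch: the rows below are fabricated for illustration, and the `agentcc.routing.rule` attribute is an assumption (only `policy_id`, `strategy`, and `target` are named above):

```python
from collections import Counter, defaultdict

# Hypothetical span rows as they might land in an analytics store.
spans = [
    {"agentcc.routing.policy_id": "pol_1", "agentcc.routing.rule": "enterprise-tier", "cost_usd": 0.012},
    {"agentcc.routing.policy_id": "pol_1", "agentcc.routing.rule": None,              "cost_usd": 0.004},
    {"agentcc.routing.policy_id": "pol_2", "agentcc.routing.rule": None,              "cost_usd": 0.009},
]

cost_by_policy = defaultdict(float)   # per-policy cost rollup
rule_matches = Counter()              # conditional-rule match rate numerator
for span in spans:
    cost_by_policy[span["agentcc.routing.policy_id"]] += span["cost_usd"]
    if span["agentcc.routing.rule"]:
        rule_matches[span["agentcc.routing.rule"]] += 1
```

A named rule that never shows up in `rule_matches` over a week is the "probably dead" signal from the list above.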
```python
# Update a policy at runtime — no gateway restart.
client.routing_policies.update(policy.id, config={...new shape...})
```
The same policy ID flows into FutureAGI’s evaluation surface, so regression evals can be filtered by the policy that generated their traces.
Common mistakes
- Treating routing-policy and routing-strategy as synonyms. The policy is the object; the strategy is one field on it.
- Putting too much logic in conditional rules. Past 10 rules, refactor into named policies and switch policies via metadata.
- Editing YAML in production instead of using the routing-policies API. YAML is for bootstrap; API is for runtime.
- Using a single policy for all teams. Multi-tenant systems benefit from per-team policies and a `metadata.team` conditional rule to pick between them.
- Forgetting that priority is ascending — `priority: 10` runs before `priority: 50`.
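The per-team pattern can be sketched in a few lines. Policy names here are hypothetical, and in practice the selection would live in a `metadata.team` conditional rule on the gateway, not in application code:

```python
# Hypothetical per-team policy map; each value is a separate named,
# versioned policy rather than one unreadable mega-conditional.
TEAM_POLICIES = {
    "team-a": "anthropic-only",
    "team-b": "openai-only",
}
DEFAULT_POLICY = "prod-chat-routing"

def policy_for(metadata: dict) -> str:
    """Pick a routing policy from request metadata, falling back to the default."""
    return TEAM_POLICIES.get(metadata.get("team"), DEFAULT_POLICY)
```

Keeping one policy per team means each team's routing is independently auditable and roll-back-able by `policy_id`.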
Frequently Asked Questions
What is a routing policy?
A routing policy is the configuration object in an LLM gateway that binds a routing strategy (round-robin, weighted, least-latency, cost-optimized, conditional) to a set of targets, evaluated on every request.
How is a routing policy different from a routing strategy?
The strategy is the algorithm — round-robin, weighted, etc. The policy is the configuration object that pairs a strategy with targets, conditions, fallback chains, and timeouts.
How does FutureAGI implement routing policies?
Agent Command Center exposes a routing-policies CRUD resource on the control plane. Each policy has name, strategy, config (targets, conditions, comparators), and is applied at request time.