Example: Building an Enterprise Routing Configuration
This example demonstrates a multi-layered routing strategy for a SaaS company that balances performance for premium users, cost for standard users, and flexibility for internal development.
Goals:
Provide the fastest possible responses for "premium" customers on support-related queries.
Minimize costs for "standard" tier users.
Allow the internal "development" team to test a new, experimental model without affecting customers.

Routing Configuration (router.json):
{
  "routes": [
    {
      "name": "premium_support_fast_track",
      "conditions": {
        "all": [
          { "metadata.user.tier": { "eq": "premium" } },
          { "metadata.request.topic": { "eq": "support" } }
        ]
      },
      "targets": {
        "$any": ["anthropic/claude-4-opus", "openai/gpt-o3"],
        "sort": { "ttft": "MIN" }
      }
    },
    {
      "name": "standard_user_cost_optimized",
      "conditions": {
        "metadata.user.tier": { "eq": "standard" }
      },
      "targets": {
        "$any": ["mistral/mistral-large-latest", "anthropic/claude-4-sonnet"],
        "sort": { "price": "MIN" }
      }
    },
    {
      "name": "internal_dev_testing",
      "conditions": {
        "metadata.user.group": { "eq": "development" }
      },
      "targets": [
        { "model": "google/gemini-2.5-pro" }
      ]
    }
  ]
}
Configuration Breakdown:
Rule 1: premium_support_fast_track
Conditions: This rule applies only when a request comes from a user in the "premium" tier AND the request topic has been identified as "support". It uses the "all" operator to require both conditions.
Targets: It routes the request to a pool of high-performance models (anthropic/claude-4-opus, openai/gpt-o3) and selects the one with the lowest time-to-first-token (ttft), so the response starts as quickly as possible.
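To make the "all" semantics concrete, here is a minimal Python sketch of how such a conditions object could be evaluated. It is not the router's actual implementation: the helper names lookup and matches, and the assumption that a request carries a nested metadata dictionary addressed by dotted paths, are illustrative only.

```python
from typing import Any


def lookup(data: dict, dotted_path: str) -> Any:
    """Walk a nested dict along a dotted path; return None if a key is missing."""
    current: Any = data
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(conditions: dict, request: dict) -> bool:
    """Evaluate a conditions object that uses only `all` and `eq` (assumed semantics)."""
    if "all" in conditions:
        # Every nested condition must hold.
        return all(matches(sub, request) for sub in conditions["all"])
    # Otherwise each entry maps a dotted field path to {"eq": expected_value}.
    return all(
        lookup(request, path) == check["eq"]
        for path, check in conditions.items()
    )


# Conditions copied from the premium_support_fast_track rule.
rule_1_conditions = {
    "all": [
        {"metadata.user.tier": {"eq": "premium"}},
        {"metadata.request.topic": {"eq": "support"}},
    ]
}

# Hypothetical requests, shaped to fit the dotted paths used in the config.
premium_support = {"metadata": {"user": {"tier": "premium"}, "request": {"topic": "support"}}}
premium_billing = {"metadata": {"user": {"tier": "premium"}, "request": {"topic": "billing"}}}

print(matches(rule_1_conditions, premium_support))  # True
print(matches(rule_1_conditions, premium_billing))  # False
```

The second request fails because only one of the two nested conditions holds, which is the behavior the "all" operator is meant to express.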
Rule 2: standard_user_cost_optimized
Conditions: This is a broader rule that catches all requests from "standard" tier users.
Targets: It uses a pool of cost-effective models (mistral/mistral-large-latest, anthropic/claude-4-sonnet) and selects the one with the minimum price, optimizing for spend.
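The pool-plus-sort behavior can be sketched as a simple minimum over a metric. The function pick_target and the placeholder_stats table below are hypothetical: the numbers are arbitrary placeholders, not real prices or latencies, and the sketch assumes the router has per-model metrics available at routing time.

```python
def pick_target(targets: dict, stats: dict) -> str:
    """Return the model in `$any` with the smallest value of the sort metric."""
    candidates = targets["$any"]
    # `sort` is assumed to hold one metric/direction pair, e.g. {"price": "MIN"}.
    metric, direction = next(iter(targets["sort"].items()))
    if direction != "MIN":
        raise ValueError(f"unsupported sort direction: {direction}")
    return min(candidates, key=lambda model: stats[model][metric])


# Targets copied from the standard_user_cost_optimized rule.
rule_2_targets = {
    "$any": ["mistral/mistral-large-latest", "anthropic/claude-4-sonnet"],
    "sort": {"price": "MIN"},
}

# Placeholder metrics for illustration only; not real prices or latencies.
placeholder_stats = {
    "mistral/mistral-large-latest": {"price": 2.0, "ttft": 0.9},
    "anthropic/claude-4-sonnet": {"price": 3.0, "ttft": 0.6},
}

# With these placeholder values, the lower-priced entry is selected.
print(pick_target(rule_2_targets, placeholder_stats))
```

Changing the sort key to ttft would reuse the same function for the premium rule, which is why the pool and the sort criterion are kept as separate fields in this sketch.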
Rule 3: internal_dev_testing
Conditions: This rule applies to any user in the "development" group.
Targets: It directs their requests to google/gemini-2.5-pro, isolating test traffic from the production user base.
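Putting the three rules together, the sketch below walks the route list and reports which rule a request would hit. The text above does not say how overlapping rules are resolved, so top-to-bottom, first-match-wins evaluation is an assumption here, as are the helper names and the shape of the request metadata.

```python
from typing import Any, Optional

# Names and conditions copied from router.json; targets are omitted for brevity.
ROUTES = [
    {"name": "premium_support_fast_track",
     "conditions": {"all": [
         {"metadata.user.tier": {"eq": "premium"}},
         {"metadata.request.topic": {"eq": "support"}},
     ]}},
    {"name": "standard_user_cost_optimized",
     "conditions": {"metadata.user.tier": {"eq": "standard"}}},
    {"name": "internal_dev_testing",
     "conditions": {"metadata.user.group": {"eq": "development"}}},
]


def lookup(data: dict, dotted_path: str) -> Any:
    """Follow a dotted path through nested dicts; None if any key is missing."""
    current: Any = data
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(conditions: dict, request: dict) -> bool:
    """Assumed semantics: `all` requires every nested condition; `eq` is equality."""
    if "all" in conditions:
        return all(matches(sub, request) for sub in conditions["all"])
    return all(lookup(request, p) == c["eq"] for p, c in conditions.items())


def first_matching_route(routes: list, request: dict) -> Optional[str]:
    """Return the name of the first route whose conditions hold (assumed ordering)."""
    for route in routes:
        if matches(route["conditions"], request):
            return route["name"]
    return None


# Hypothetical requests.
dev = {"metadata": {"user": {"group": "development"}}}
standard_dev = {"metadata": {"user": {"tier": "standard", "group": "development"}}}
premium_billing = {"metadata": {"user": {"tier": "premium"}, "request": {"topic": "billing"}}}

print(first_matching_route(ROUTES, dev))              # internal_dev_testing
print(first_matching_route(ROUTES, standard_dev))     # standard_user_cost_optimized
print(first_matching_route(ROUTES, premium_billing))  # None
```

Under the first-match assumption, a developer whose account also carries the "standard" tier would be caught by Rule 2 before reaching Rule 3, and a premium request on a non-support topic matches no rule at all. Whether the real router behaves this way depends on its ordering and fallback behavior, which are not covered here.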