Example: Building an Enterprise Routing Configuration

This example demonstrates a multi-layered routing strategy for a SaaS company that balances performance for premium users, cost for standard users, and flexibility for internal development.

Goals:

Provide the fastest possible responses for "premium" customers on support-related queries.
Minimize costs for "standard" tier users.
Allow the internal "development" team to test a new, experimental model without affecting customers.

Routing Configuration (router.json):

{
  "routes": [
    {
      "name": "premium_support_fast_track",
      "conditions": {
        "all": [
          { "metadata.user.tier": { "eq": "premium" } },
          { "metadata.request.topic": { "eq": "support" } }
        ]
      },
      "targets": {
        "$any": ["anthropic/claude-4-opus", "openai/gpt-o3"],
        "sort": { "ttft": "MIN" }
      }
    },
    {
      "name": "standard_user_cost_optimized",
      "conditions": {
        "metadata.user.tier": { "eq": "standard" }
      },
      "targets": {
        "$any": ["mistral/mistral-large-latest", "anthropic/claude-4-sonnet"],
        "sort": { "price": "MIN" }
      }
    },
    {
      "name": "internal_dev_testing",
      "conditions": {
        "metadata.user.group": { "eq": "development" }
      },
      "targets": [
        { "model": "google/gemini-2.5-pro" }
      ]
    }
  ]
}

Configuration Breakdown:

Rule 1: premium_support_fast_track
- Conditions: This rule applies only when a request comes from a user in the "premium" tier AND the request topic has been identified as "support". This uses an all operator to combine conditions.
- Targets: It routes the request to a pool of high-performance models (anthropic/claude-4-opus, openai/gpt-o3) and selects the one with the lowest time-to-first-token (ttft), ensuring the fastest response.
Rule 2: standard_user_cost_optimized
- Conditions: This is a broader rule that catches all requests from "standard" tier users.
- Targets: It uses a pool of cost-effective models (mistral/mistral-large-latest, anthropic/claude-4-sonnet ) and selects the one with the minimum price, optimizing for spend.
Rule 3: internal_dev_testing
- Conditions: This rule applies to any user in the "development" group.
- Targets: It directs their requests to google/gemini-2.5-pro, isolating test traffic from the production user base.

PreviousRouting Engine (Enterprise Only)NextExample: Routing with Interceptors and Compliance

Last updated 15 minutes ago

Was this helpful?