Example: Building an Enterprise Routing Configuration

This example demonstrates a multi-layered routing strategy for a SaaS company that balances performance for premium users, cost for standard users, and flexibility for internal development.

Goals:

  1. Provide the fastest possible responses for "premium" customers on support-related queries.

  2. Minimize costs for "standard" tier users.

  3. Allow the internal "development" team to test a new, experimental model without affecting customers.

Enterprise routing config workflow

Routing Configuration (router.json):

{
  "routes": [
    {
      "name": "premium_support_fast_track",
      "conditions": {
        "all": [
          { "metadata.user.tier": { "eq": "premium" } },
          { "metadata.request.topic": { "eq": "support" } }
        ]
      },
      "targets": {
        "$any": ["anthropic/claude-4-opus", "openai/gpt-o3"],
        "sort": { "ttft": "MIN" }
      }
    },
    {
      "name": "standard_user_cost_optimized",
      "conditions": {
        "metadata.user.tier": { "eq": "standard" }
      },
      "targets": {
        "$any": ["mistral/mistral-large-latest", "anthropic/claude-4-sonnet"],
        "sort": { "price": "MIN" }
      }
    },
    {
      "name": "internal_dev_testing",
      "conditions": {
        "metadata.user.group": { "eq": "development" }
      },
      "targets": [
        { "model": "google/gemini-2.5-pro" }
      ]
    }
  ]
}

Configuration Breakdown:

  • Rule 1: premium_support_fast_track

    • Conditions: This rule applies only when a request comes from a user in the "premium" tier AND the request topic has been identified as "support". This uses an all operator to combine conditions.

    • Targets: It routes the request to a pool of high-performance models (anthropic/claude-4-opus, openai/gpt-o3) and selects the one with the lowest time-to-first-token (ttft), ensuring the fastest response.

  • Rule 2: standard_user_cost_optimized

    • Conditions: This is a broader rule that catches all requests from "standard" tier users.

    • Targets: It uses a pool of cost-effective models (mistral/mistral-large-latest, anthropic/claude-4-sonnet ) and selects the one with the minimum price, optimizing for spend.

  • Rule 3: internal_dev_testing

    • Conditions: This rule applies to any user in the "development" group.

    • Targets: It directs their requests to google/gemini-2.5-pro, isolating test traffic from the production user base.

Last updated

Was this helpful?