Example: Building an Enterprise Routing Configuration

End-to-end example of a multi-layer enterprise routing setup with tiering, cost and fallbacks.

This example demonstrates a multi-layered routing strategy for a SaaS company that balances performance for premium users, cost for standard users, and flexibility for internal development.

Goals:

  1. Provide the fastest possible responses for "premium" customers on support-related queries.

  2. Minimize costs for "standard" tier users.

  3. Allow the internal "development" team to test a new, experimental model without affecting customers.

Enterprise routing config workflow

Complete Chat Completion Request with Enterprise Routing:

Configuration Breakdown:

  • Request Structure:

    • Uses "model": "router/dynamic" to enable dynamic routing

    • User information is passed via the extra.user object

    • Router configuration is specified in the router field with "type": "conditional"

  • Route 1: premium_support_fast_track

    • Conditions: Applies when extra.user.tier equals "premium" AND extra.user.request.topic equals "support"

    • Targets: Routes to high-performance models (anthropic/claude-4-opus, openai/gpt-o3, gemini/gemini-2.5-pro, xai/grok-4) with filtering for fast response times (ttft < 1000ms) and sorts by minimum time-to-first-token

  • Route 2: standard_user_cost_optimized

    • Conditions: Catches all requests from "standard" tier users via extra.user.tier

    • Targets: Uses cost-effective models (mistral/mistral-large-latest, anthropic/claude-4-sonnet) and sorts by minimum price

  • Route 3: internal_dev_testing

    • Conditions: Applies to users in the "development" cost group via metadata.group_name (automatically set by LangDB based on the LangDB user's cost group assignment)

    • Targets: Routes directly to google/gemini-2.5-pro for isolated testing

  • Route 4: fallback_route

    • Conditions: Empty conditions array ("all": []) catches all remaining requests

    • Targets: Routes to openai/gpt-4o-mini as a reliable fallback option

Key Features Demonstrated:

  • Conditional Logic: Uses all operators and comparison operators ($eq, $lt)

  • Target Selection: Shows both single targets and pools with filtering/sorting

  • Request Context: Leverages both extra user data and metadata for routing decisions

  • Cost Group Integration: metadata.group_name is automatically populated by LangDB based on the LangDB user's cost group assignment

  • Retry Configuration: Includes max_retries: 2 for resilience


Additional Example Scenarios

Scenario 1: Standard User Cost-Optimized Request

This request will match standard_user_cost_optimized route and use cost-effective models.

Scenario 2: Internal Development Testing

This request will match internal_dev_testing route because the LangDB user belongs to the "development" cost group, automatically setting metadata.group_name = "development".

Scenario 3: Fallback Route

This request doesn't match any specific conditions and will use the fallback_route with GPT-4o-mini.

Last updated

Was this helpful?