Routing Engine (Enterprise Only)
Use JSON rules or an embedded script to manage AI models, cut costs, and boost performance. A guide for enterprise-level routing.

LangDB's Routing Engine enables organizations to control how user requests are handled by AI models, optimizing for cost, performance, compliance, and user experience. By defining routing rules in JSON, businesses can automate decision-making, ensure reliability, and maximize value from their AI investments.
What is Routing and What are its Components?
Routing is the process of directing an incoming request to the most appropriate AI model based on a set of rules. The LangDB router is composed of several key components that work together to execute this logic:
Routes: These are the core building blocks of your router. A router is essentially a list of routes that are evaluated in order. The first route whose conditions are met is executed.
Conditions: The logic that determines whether a route should be triggered. Conditions can evaluate request data, user metadata, or results from pre-request hooks.
Targets: The destination for a request if a route's conditions are met. This is typically one or more AI models.
Interceptors (Guardrails & Rate Limiters): These are pre-request hooks that can inspect, modify, or enrich a request before routing rules are evaluated. Their results can be used in conditions.
Message Mapper: A component used to block a request or modify the final response, often for handling errors like rate limits.
Example Use Cases
SLA-Driven Tiering
Guarantee premium performance for high-value customers.
extra.user.tier
, ttft
Route extra.user.tier: "premium"
to models with the lowest ttft
.
Geographic Compliance
Ensure data sovereignty and meet regulatory requirements (e.g., GDPR).
metadata.region
, extra.user.tags
If metadata.region: "EU"
, route to models for users with tags: ["GDPR"]
.
Intelligent Cost Management
Reduce operational expenses for internal or low-priority tasks.
metadata.group_name
, price
If metadata.group_name: "internal"
, sort available models by "sort_by": "price"
.
Content-Aware Routing
Improve accuracy by using specialized models for specific topics.
pre_request.semantic_guardrail.result.topic
If topic: "finance"
, route to a finance-tuned model.
Brand Safety Enforcement
Prevent brand damage by blocking or redirecting inappropriate content.
pre_request.toxicity_guardrail.result.passed
If passed: false
, block the request or route to a safe-reply model.
For more detailed examples, see the pages below:
ExamplesExample: Building an Enterprise Routing ConfigurationExample: Routing with Interceptors and ComplianceAnatomy of a Routing Request
A routing request is a standard chat completion request with two key additions:
The
model
must be set to"router/dynamic"
.A
router
object containing your routing logic must be included in the request body.
Here’s how the various components fit into a complete request. The example below shows a simple configuration with two routes: one for premium users and a fallback for everyone else.
{
"model": "router/dynamic",
"messages": [
{
"role": "user",
"content": "Our production API is down, I need help now!"
}
],
"extra": {
"user": { "tier": "premium" }
},
"router": {
"type": "conditional",
"routes": [
{
"name": "premium_support_fast_track",
"conditions": {
"all": [
{ "extra.user.tier": { "$eq": "premium" } }
]
},
"targets": {
"$any": ["anthropic/claude-4-opus", "openai/gpt-o3"],
"sort_by": "ttft",
"sort_order": "min"
}
},
{
"name": "default_fallback",
"conditions": { "all": [] },
"targets": "openai/gpt-4o-mini"
}
]
}
}
Routing Components Explained
This section breaks down each of the major components of the router
object.
Routes
The routes
property contains an array of route objects. These are evaluated sequentially from top to bottom, and the first route whose conditions
are met will be executed. Every route must have a name
, conditions
, and targets
.
Conditions
The conditions
block defines when a route should be activated. It uses a flexible JSON-based syntax.
Logical Operators: You can combine multiple conditions using
all
(AND) orany
(OR).Comparison Operators: Conditions use operators like
$eq
(equal),$neq
(not equal),$in
(in array),$lt
(less than),$gt
(greater than) to evaluate variables.Lazy Evaluation of Guardrails: It's important to note that guardrails are evaluated lazily. A guardrail interceptor will only be executed if the router encounters a condition that requires its result (
pre_request.{guardrail_name}.*
). This prevents unnecessary latency.
Targets
The targets
block defines what happens when a route is matched. It specifies one or more models to which the request can be sent.
Specifying Models
You can specify models in your targets
list in several ways, giving you flexibility in how you define your candidate pool.
Exact Name with Provider:
openai/gpt-4o
This is the most specific and recommended method. It uniquely identifies a single model from a single provider.
Provider Wildcard:
openai/*
This selects all available models from a specific provider (e.g., all models from OpenAI). This is useful for creating provider-level routing rules or fallbacks.
Model Name Only:
claude-3-opus
This selects all models with that name from any available provider. For example, if both Anthropic and another provider offered
claude-3-opus
, both would be added to the candidate pool. This is particularly useful when you want to use sorting to find the best provider for a specific model based on real-time metrics likeprice
orttft
. Use with caution, as it can be ambiguous if providers have different capabilities for the same model name.
Filtering Models
Before selecting a model, you can filter the list of potential targets in $any
using the filter
property. This is useful for ensuring models meet certain real-time performance criteria.
"targets": {
"$any": ["anthropic/claude-4-opus", "openai/gpt-o3"],
"filter": {
"error_rate": { "$lt": 0.02 }
}
}
This example ensures that only models with an error rate below 2% are considered.
Sorting Models
After filtering, the router can sort the remaining candidate models to find the best one based on a specific metric.
sort_by
: The metric to use for sorting. Common values areprice
,ttft
(time to first token), anderror_rate
.sort_order
: The direction to sort, eithermin
(for lowest cost/latency) ormax
.
"targets": {
"$any": ["mistral/mistral-large-latest", "anthropic/claude-4-sonnet"],
"sort_by": "price",
"sort_order": "min"
}
This example selects the cheapest model from the pool.
Interceptors (Guardrails & Rate Limiters)
Interceptors are hooks that run before the main routing logic is evaluated. They are defined in the pre_request
array.
Guardrails enforce content and safety policies (e.g., checking for toxicity or classifying topics).
Rate Limiters enforce usage quotas to prevent abuse.
The results of these interceptors are made available in the pre_request
variable space for use in your conditions. For detailed configuration, see Interceptors & Guardrails.
Message Mapper
The Message Mapper is used to take direct control of the response. Its most common use case is to block a request that has failed an interceptor check and return a custom error message.
{
"name": "rate_limit_exceeded_block",
"conditions": {
"pre_request.rate_limiter.passed": { "$eq": false }
},
"message_mapper": {
"modifier": "block",
"content": "You have exceeded your daily quota."
}
}
Metadata and Variables
Effective routing relies on rich contextual information. LangDB provides two main sources of data for your conditions: extra.user.*
data, which you pass in the request, and metadata.*
data, which is automatically populated by LangDB. For a complete list, see Variables & Functions.
Components Summary
Routes
A list of rules evaluated in order.
name
, conditions
, targets
Conditions
The "if" statement for a route.
all
, any
, operators ($eq
, $lt
), variables
Targets
The destination model(s) for a route.
$any
, filter
, sort_by
, sort_order
Interceptors
Pre-request hooks for validation or enrichment.
pre_request
array, type
(guardrail
, interceptor
)
Message Mapper
Blocks or modifies the final response.
modifier: "block"
, content
Performance Impact
Different routing components have different performance characteristics.
Guardrails & Interceptors:
Simple checks like a Rate Limiter are very fast, typically involving a quick lookup in Redis.
More complex guardrails, especially those that are "LLM-as-a-judge" (i.e., they make an LLM call to evaluate the prompt), will introduce significant latency to the request, equal to the duration of that LLM call.
Model Sorting:
Sorting models based on performance metrics (
ttft
,error_rate
) orprice
requires fetching this data from a metrics store (like Redis). This is a very fast operation but does add a small, fixed overhead to each routed request.
Tracing & Observability
Routing decisions are fully transparent and traceable within the LangDB ecosystem.
Routing Performance: The performance of the routing logic itself is tracked, so you can monitor for any overhead.
Candidate & Picked Models: For every request, the trace records:
The initial pool of
candidate models
for a matched route.The list of models remaining after
filtering
.The final
picked model
aftersorting
.
OpenTelemetry: All router metrics (decisions, latencies, error rates) are exported via OpenTelemetry, allowing you to integrate with your existing observability stack (e.g., DataDog, New Relic) for real-time analytics and alerting.
Last updated
Was this helpful?