Configure Fallback Routing with LangDB

Set up fallback routing with LangDB to keep AI applications online during traffic spikes or model outages by automatically switching models.


Ensure your AI applications stay online even during traffic spikes or model outages by configuring Fallback Routing. This guide walks you through setting up fallback routers using LangDB's routing feature.

What is Fallback Routing?

Fallback Routing allows LangDB to automatically switch to a backup model when your preferred model is slow, down, or overloaded. This helps you:

  • Avoid downtime

  • Improve reliability

  • Scale applications without manual intervention

Example: Basic Fallback Routing

Let’s say you want to use DeepSeek-Reasoner, but switch to GPT-4o if it becomes unavailable.

You can configure this through the LangDB UI, or programmatically with the following router definition:

{
    "model": "router/dynamic",
    "router": {
        "name": "fallback-router",
        "type": "fallback",
        "targets": [
            { "model": "deepseek-reasoner", "temperature": 0.7, "max_tokens": 400 },
            { "model": "gpt-4o", "temperature": 0.8, "max_tokens": 500 }
        ]
    }
}
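Since this router definition is just part of the request body, you can embed it in an ordinary chat-completion payload. The sketch below builds such a payload in Python; the exact endpoint URL and header names are assumptions, so check the LangDB API reference before sending real requests.

```python
import json

# Hypothetical sketch: embed the fallback router configuration from above
# in a chat-completion request body. Endpoint and auth details are assumed.
payload = {
    "model": "router/dynamic",
    "router": {
        "name": "fallback-router",
        "type": "fallback",
        "targets": [
            {"model": "deepseek-reasoner", "temperature": 0.7, "max_tokens": 400},
            {"model": "gpt-4o", "temperature": 0.8, "max_tokens": 500},
        ],
    },
    "messages": [
        {"role": "user", "content": "Summarize fallback routing in one line."}
    ],
}

body = json.dumps(payload)
# POST `body` to your LangDB chat-completions endpoint with your API key, e.g.:
#   requests.post(API_URL, data=body,
#                 headers={"Authorization": f"Bearer {API_KEY}"})
```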

Behavior

  • Requests are sent to deepseek-reasoner first.

  • If that request fails, the router automatically falls back to gpt-4o.

Example: Fallback with Load-Balancing

In the previous example, we implemented a simple fallback mechanism. However, a more robust solution would be to distribute queries across multiple providers of DeepSeek-R1 while maintaining a fallback to GPT-4o if both providers fail. This method helps balance traffic efficiently while ensuring uninterrupted AI services.

Here’s how you can configure Fallback Routing with Percentage-Based Load Balancing:

{
    "model": "router/dynamic",
    "router": {
        "name": "fallback-percentage-router",
        "type": "fallback",
        "targets": [
            {
                "model": "router/dynamic",
                "router": {
                    "name": "percentage-balanced",
                    "type": "percentage",
                    "model_a": [
                        { "model": "fireworksai/deepseek-r1", "temperature": 0.7, "max_tokens": 400 },
                        0.5
                    ],
                    "model_b": [
                        { "model": "deepseek/deepseek-reasoner", "temperature": 0.7, "max_tokens": 400 },
                        0.5
                    ]
                }
            },
            { "model": "gpt-4o", "temperature": 0.8, "max_tokens": 500 }
        ]
    }
}

How This Works:

  • Primary Route: The system distributes requests evenly (50-50%) between two providers of DeepSeek-R1 to balance the load.

  • Fallback Route: If both DeepSeek-R1 providers are unavailable or fail, all requests are automatically rerouted to GPT-4o, ensuring continuous service.
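The percentage tier amounts to a weighted random choice between providers. The sketch below shows that selection step in Python; the weights and model names mirror the config above, and the `rng` parameter is only there to make the behavior easy to demonstrate deterministically.

```python
import random

# Illustrative sketch of the percentage-based selection inside the primary
# route: pick one provider by weight. Weights are assumed to sum to 1.0.
def pick_primary(weighted, rng=random.random):
    """weighted: list of (model, weight) pairs."""
    r = rng()
    cumulative = 0.0
    for model, weight in weighted:
        cumulative += weight
        if r < cumulative:
            return model
    return weighted[-1][0]  # guard against floating-point rounding


# The 50-50 split from the configuration above:
providers = [
    ("fireworksai/deepseek-r1", 0.5),
    ("deepseek/deepseek-reasoner", 0.5),
]
```

If a request to the chosen provider fails, the enclosing fallback router reroutes it to gpt-4o, as described above.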

This approach provides load balancing and reliable fallback protection, making it ideal for AI applications facing high demand and occasional model unavailability.

In more complex scenarios, you can configure a multi-level fallback system with percentage-based distribution. This approach allows requests to be routed dynamically based on pricing, performance, or reliability, ensuring efficiency while preventing downtime.

By leveraging dynamic routing, you can:

  • Prevent downtime by automatically switching to backup models.

  • Optimize performance and cost with smart load balancing.

  • Ensure scalability without manual intervention.

With LangDB’s flexible and powerful routing capabilities, you can build AI applications that are not only intelligent but also robust and fail-safe.

Check out Routing Strategies for more routing options.

Routing Strategies