
Introduction to AI Gateway

Monitor, Govern and Secure your AI traffic.

What is an AI Gateway?

An AI gateway is a middleware that acts as a unified access point to multiple LLMs, optimizing, securing, and managing AI traffic. It simplifies integration with different AI providers while enabling cost control, observability, and performance benchmarking. With an AI gateway, businesses can seamlessly switch between models, monitor usage, and optimize costs.

LangDB provides OpenAI-compatible APIs to connect with multiple Large Language Models (LLMs) by just changing two lines of code.

Govern, Secure, and Optimize all of your AI Traffic with Cost Control, Optimization, and Full Observability.

What AI Gateway Offers Out of the Box

LangDB provides OpenAI-compatible APIs, enabling developers to connect with multiple LLMs by changing just two lines of code. With LangDB, you can:

  • Provide access to all major LLMs: Ensure seamless integration with leading large language models to maximize flexibility and power.

  • No framework code required: Enable plug-and-play functionality using any framework like Langchain, Vercel AI SDK, CrewAI, etc., for easy adoption.

  • Plug & Play Tracing & Cost Optimization: Simplify implementation of tracing and cost optimization features, ensuring streamlined operations.

  • Automatic routing based on cost, quality, and other variables: Dynamically route requests to the most suitable LLM based on predefined parameters.

  • Benchmark and provide insights: Deliver insights into the best-performing models for specific tasks, such as coding or reasoning, to enhance decision-making.

Quick Start with LangDB

LangDB offers both managed and self-hosted versions for organizations to manage AI traffic. Choose between the Hosted Gateway for ease of use or the Open-Source Gateway for full control.

Roadmap

  • Prompt Caching & Optimization (In Progress): Introduce caching mechanisms to optimize prompt usage and reduce redundant costs.

  • GuardRails (In Progress): Implement safeguards to enhance reliability and accuracy in AI outputs.

  • Leaderboard of models per category: Create a comparative leaderboard to highlight model performance across categories.

  • Ready-to-use evaluations for non-data scientists: Provide accessible evaluation tools for users without a data science background.

  • Readily fine-tunable data based on usage: Offer pre-configured datasets tailored for fine-tuning, enabling customized improvements with ease.

Quick Start

Quick Start guide for LangDB AI Gateway

The LangDB AI Gateway allows you to connect with multiple Large Language Models (LLMs) instantly, without any setup.

  1. Account Creation: Sign up on LangDB to start using the Hosted Gateway.

  2. Make your First Request: Test a chat window with two different models to see dynamic routing in action.

  3. Check out the Samples Section for Template Code: Use ready-made templates to integrate LangDB into your project effortlessly.

  4. Analytics Section: Monitor usage, costs, and performance insights through the LangDB analytics dashboard.

LangDB Sign Up Page
LangDB Signup - Setting Organisation
LangDB Signup - One Time Password
Sending your first request in LangDB Playground
Using LangDB Samples to generate template code
Checking out LangDB Dashboard for analytics

Quick Start

A full-featured, managed AI gateway that provides instant access to 250+ LLMs with enterprise-ready features.

Self Hosted

A self-hosted option for organizations that require complete control over their AI infrastructure.

Diagram showing LangDB as a central AI gateway for applications, offering features like cost management, tracing, routing, prompt caching, security, and guardrails. LangDB connects to multiple AI providers, including OpenAI, Gemini, Anthropic, Bedrock, Mistral, Hugging Face, Cohere, and Llama.

Using Parameters

Configure temperature, max_tokens, logit_bias, and more with LangDB AI Gateway. Test easily via API, UI, or Playground.

LangDB AI Gateway supports every LLM parameter like temperature, max_tokens, stop sequences, logit_bias, and more.

API Usage:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai/{langdb_project_id}/v1",  # LangDB API base URL
    api_key="xxxxx",  # LangDB token
)

response = client.chat.completions.create(
    model="gpt-4o", # Change Model
    messages=[
        {"role": "user", "content": "What are the earnings of Apple in 2022?"},
    ],
    temperature=0.7,               # temperature parameter
    max_tokens=150,                # max_tokens parameter
    stream=True,                   # stream parameter
)
// Client configured as in the Working with API section; `messages` is your message array
const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages,
  temperature: 0.7,              // temperature parameter
  max_tokens: 150,               // max_tokens parameter
  logit_bias: { '50256': -100 }, // logit_bias parameter
  stream: true,                  // stream parameter
});

UI

You can also use the UI to test various parameters and get code snippets.

Playground

Use the Playground to tweak parameters in real time via the Virtual Model config and send test requests instantly.

Samples

Explore ready-made code snippets complete with preconfigured parameters—copy, paste, and customize to fit your needs.

Working with MCPs

Learn how to connect to MCP Servers using LangDB AI Gateway

Instantly connect to managed MCP servers — skip the setup and start using fully managed MCPs with built-in authentication, seamless scalability, and full tracing. This guide gives you a quick walkthrough of how to get started with MCPs.

Quick Example

In this example, we’ll create a Virtual MCP Server by combining Slack and Gmail MCPs — and then connect it to an MCP Client like Cursor for instant access inside your chats.

Steps:

  1. Select Slack and Gmail from the MCP Servers list in the Virtual MCP section.

  2. Generate a Virtual MCP URL automatically.

  3. Install the MCP into Cursor with a single command.

Example install command:

npx @langdb/mcp setup slack_gmail_virtual https://api.langdb.ai/mcp/xxxxx --client cursor

What Happens Under the Hood?

  • Authentication is handled (via OAuth or API Key)

  • Full tracing and observability are available (inputs, outputs, errors, latencies)

  • MCP tools are treated just like normal function calls inside LangDB

Next Steps:

  • MCP Servers listed on LangDB: https://app.langdb.ai/mcp-servers

  • Explore MCP use cases.

Working with API

LangDB provides access to 350+ LLMs with OpenAI compatible APIs.

You can use LangDB as a drop-in replacement for OpenAI APIs, making it easy to integrate into existing workflows and libraries such as OpenAI Client SDK.

You can choose from any of the supported models.

from openai import OpenAI

langdb_project_id = "xxxxx"  # LangDB Project ID

client = OpenAI(
    base_url=f"https://api.us-east-1.langdb.ai/{langdb_project_id}/v1",
    api_key="xxxxx" ,           # LangDB token
)
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4", # Change Model
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "What are the earnings of Apple in 2022?"},
    ],
)
print("Assistant:", response.choices[0].message)
import { OpenAI } from 'openai';

const langdbProjectId = 'xxxx';  // LangDB Project ID

const client = new OpenAI({
  baseURL: `https://api.us-east-1.langdb.ai/${langdbProjectId}/v1`,
  apiKey: 'xxxx',  // Your LangDB token
});

const messages = [
  {
    role: 'system',
    content: 'You are a helpful assistant.'
  },
  {
    role: 'user',
    content: 'What are the earnings of Apple in 2022?'
  }
];
async function getAssistantReply() {
  const { choices } = await client.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: messages
  });
  console.log('Assistant:', choices[0].message.content);
}
getAssistantReply();
curl "https://api.us-east-1.langdb.ai/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $LANGDB_API_KEY" \
    -X "X-Project-Id: $Project_ID" \
    -d '{
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": "Write a haiku about recursion in programming."
            }
        ],
        "temperature": 0.8
    }'

After sending your request, you can see the Traces on the dashboard:

Check out the API reference here.

Trace

Track complete workflows with LangDB Traces. Get end-to-end visibility, multi-agent support, and error diagnosis.

A Trace represents the complete lifecycle of a workflow, spanning all components and systems involved.

Core Features:

  • End-to-End Visibility: Tracks model calls and tool usage across the entire workflow.

  • Multi Agent Ready: Perfect for workflows that involve multiple services, APIs, or tools.

  • Error Diagnosis: Quickly identify bottlenecks, failures, or inefficiencies in complex workflows.

Parent-Trace:

For workflows with nested operations (e.g., a workflow that triggers multiple sub-workflows), LangDB introduces the concept of a Parent-Trace, which links the parent workflow to its dependent sub-workflows. This hierarchical structure ensures you can analyze workflows at both macro and micro levels.

Headers for Trace:

  • trace-id: Tracks the parent workflow.

  • parent-trace-id: Links sub-workflows to the main workflow for hierarchical tracing.
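As a sketch (assuming the OpenAI client setup from the Working with API section, and using the header names exactly as listed above), linking a sub-workflow to its parent trace could look like this:

from uuid import uuid4
from openai import OpenAI

client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai/{langdb_project_id}/v1",  # LangDB API base URL
    api_key="xxxxx",  # LangDB token
)

parent_trace = str(uuid4())

# Parent workflow call, tagged with its own trace id
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Plan the sub-tasks for this request."}],
    extra_headers={"trace-id": parent_trace},
)

# Sub-workflow call, linked back to the parent via parent-trace-id
sub_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Execute the first sub-task."}],
    extra_headers={"trace-id": str(uuid4()), "parent-trace-id": parent_trace},
)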

Routing with Virtual Model

Manage routing strategies easily in LangDB AI Gateway’s UI to boost efficiency, speed, and reliability in AI workflows.

In LangDB AI Gateway, any virtual model can act as a router. Just define a strategy and a list of target models—it’ll route requests based on metrics like cost, latency, percentage, or custom rules.

Setting up Routing

Setting up routing in a virtual model is straightforward:

  1. Open any virtual model in the Chat Playground and click Show Config

  2. Choose a routing strategy (like fallback, optimized, percentage, etc.)

  3. Add your target models—each one can be configured just like the virtual models you set up in the previous section.

Each target defines:

  • Which model to use

  • Prompt

  • MCP Servers

  • Guardrails

  • Response Format

  • Custom parameters like temperature, max_tokens, penalties, etc.

All routing options are available directly in the virtual model config panel.

Check out more about Routing Strategies.

Working with Multiple Agents

Learn how to use LangDB to Trace Multi Agent workflows

LangDB automatically visualizes how agents interact, providing a clear view of workflows, hierarchies, and usage patterns by adding run and thread headers.

This allows developers to track interactions between agents seamlessly, ensuring clear visibility into workflows and dependencies.

What is a Multi-Agent System?

A multi-agent system consists of independent agents collaborating to solve complex tasks. Agents handle various roles such as user interaction, data processing, and workflow orchestration. LangDB streamlines tracking these interactions for better efficiency and transparency.

Why Track Workflows?

Tracking ensures:

  • Clear Execution Flow: Understand how agents interact.

  • Performance Optimization: Identify bottlenecks.

  • Reliability & Accountability: Improve transparency.

LangDB supports two main concepts.

  • Run: A complete end-to-end interaction between agents, grouped for easy tracking.

  • Thread: Aggregates multiple Runs into a single thread for a unified chat experience.

    Example

Using the same Run ID and Thread ID across multiple agents ensures seamless tracking, maintaining context across interactions and providing a complete view of the workflow.

Check out the full Multi-Agent Tracing Example.

Tracing

Track every model call, agent handoff, and tool execution for faster debugging and optimization.

LangDB Gateway provides detailed tracing to monitor, debug, and optimize LLM workflows.

Below is an example of a trace visualization from the dashboard, showcasing a detailed breakdown of the request stages:

In this example trace you’ll find:

  • Overview Metrics

    • Cost: Total spend for this request (e.g. $0.034).

    • Tokens: Input (5,774) vs. output (1,395).

    • Duration: Total end-to-end latency (29.52 s).

  • Timeline Breakdown: A parallel-track timeline showing each step—from moderation and relevance scoring to model inference and final reply.

  • Model Invocations: Every call to gpt-4o-mini, gpt-4o, etc., is plotted with precise start times and durations.

  • Agent Hand-offs: Transitions between your agents (e.g. search → booking → reply) are highlighted with custom labels like transfer_to_reply_agent.

  • Tool Integrations: External tools (e.g. booking_tool, travel_tool, python_repl_tool) appear inline with their execution times—so you can spot slow or failed runs immediately.

  • Guardrails: Rules like Min Word Count and Travel Relevance enforce domain-specific constraints and appear in the trace.

With this level of visibility you can quickly pinpoint bottlenecks, understand cost drivers, and ensure your multi-agent pipelines run smoothly.

Working with Headers

Explore how LangDB API headers like x-thread-id, x-run-id, x-label, and x-project-id improve LLM tracing, observability, and session tracking for better API management and debugging.

LangDB API provides robust support for HTTP headers, enabling developers to manage API requests efficiently with enhanced tracing, observability, and organization.

These headers play a crucial role in structuring interactions with multiple LLMs by providing tracing, request tracking, and session continuity, making it easier to monitor and analyze API usage.

Thread ID (x-thread-id)

Usage: Groups multiple related requests under the same conversation.

  • Useful for tracking interactions over a single user session.

  • Helps maintain context across multiple messages.

Thread Title (x-thread-title)

Usage: Assigns a custom, human-readable title to a thread.

  • This title is displayed in the LangDB UI, making it easier to identify and search for specific conversations.

Public Thread (x-thread-public)

Usage: Makes a thread publicly accessible via a shareable link.

  • Set the value to 1 or true to enable public sharing.

  • The public URL will be: https://app.langdb.ai/sharing/threads/{thread_id}

  • The x-thread-title, if set, will be displayed on the public thread page.
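For illustration, here is a minimal sketch that sets all three thread headers through the OpenAI client’s extra_headers (client setup as in the Working with API section; the title and values are placeholders):

from uuid import uuid4
from openai import OpenAI

client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai/{langdb_project_id}/v1",  # LangDB API base URL
    api_key="xxxxx",  # LangDB token
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_headers={
        "x-thread-id": str(uuid4()),          # group related requests in one conversation
        "x-thread-title": "Support session",  # human-readable title shown in the LangDB UI
        "x-thread-public": "true",            # make the thread shareable via a public link
    },
)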

Check Threads for more details.

Run ID (x-run-id)

Usage: Tracks a unique workflow execution in LangDB, such as a model call or tool invocation.

  • Enables precise tracking and debugging.

  • Each Run is independent for better observability.

Check Run for more details.

Label (x-label)

Usage: Adds a custom tag or label to an LLM model call for easier categorization.

  • Helps with tracing multiple agents.

Check Label for more details.

Project ID (x-project-id)

Usage: Identifies the project under which the request is being made.

  • Helps in cost tracking, monitoring, and organizing API calls within a specific project.

  • Can be set in headers or directly in the API base URL https://api.us-east-1.langdb.ai/${langdbProjectId}/v1

Thread

Use LangDB Threads to group messages, maintain conversation context, and enable seamless multi-turn interactions.

A Thread is simply a grouping of Message History that maintains context in a conversation or workflow. Threads are useful for keeping track of past messages and ensuring continuity across multiple exchanges.

Core Features:

  • Contextual Continuity: Ensures all related Runs are grouped for better observability.

  • Multi-Turn Support: Simplifies managing interactions that require maintaining state across multiple Runs.

Example:

A user interacting with a chatbot over multiple turns (e.g., asking follow-up questions) generates several messages, but all are grouped under a single Thread to maintain continuity.

Headers for Thread:

  • x-thread-id: Links all Runs in the same context or conversation.

  • x-thread-title: Assigns a custom, human-readable title to the thread, making it easier to identify.

  • x-thread-public: Makes the thread publicly accessible via a shareable link by setting its value to 1 or true.

Message

A Message in LangDB AI Gateway defines structured interactions between users, systems, and models in workflows.

A Message is the basic unit of communication in LangDB workflows. Messages define the interaction between the user, the system, and the model. Every workflow is built around exchanging and processing messages.

Core Features:

  • Structured Interactions: Messages define roles (user, system, assistant) to organize interactions clearly.

  • Multi-Role Flexibility: Different roles (e.g., system for instructions, user for queries) enable complex workflows.

  • Dynamic Responses: Messages form the backbone of LangDB’s chat-based interactions.

Example:

A simple interaction to generate a poem might look like this:

Virtual MCP Servers

Create Virtual MCP Servers in LangDB AI Gateway to unify tools, manage auth securely, and maintain full observability across workflows

A Virtual MCP Server lets you create a customized set of MCP tools by combining functions from multiple MCP servers — all with scoped access, unified auth, and full observability.

Why Use a Virtual MCP?

  • Selective Tools: Pick only the tools you need from existing MCP servers (e.g. Airtable's list_records, GitHub's create_issue, etc.)

  • Clean Auth Handling: Add your API keys only if needed. Otherwise, LangDB handles OAuth for you.

  • Full Tracing: Every call is traced on LangDB, with logs, latencies, inputs/outputs, and error metrics.

  • Easy Integration: Works out of the box with Cursor, Claude, Windsurf, and more.

  • Version Lock-in: Virtual MCPs are pinned to a specific server version to avoid breaking changes.

  • Poisoning Safety: Prevents injection or override by malicious tool definitions from source MCPs.

How to Set It Up

  1. Go to the Virtual MCP section in your LangDB project.

  2. Select the tools you want to include.

  3. (Optional) Add API keys or use LangDB-managed auth.

  4. Click Generate secure MCP URL.

Install in Cursor / Windsurf / Claude

Once you have the MCP URL:
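For example, using the setup command shown in the MCP Clients section (replace the server name and URL with your own):

npx @langdb/mcp setup <server_name> <mcp_url> --client cursor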

You're now ready to use your selected tools directly inside the editor.

Try it in the playground

You can also try the Virtual MCP servers by adding the server in the virtual model config.

API Reference

API Endpoints for LangDB

[
  { "role": "system", "content": "You are a helful assistant" },
  { "role": "user", "content": "Write me a poem about celluloids." }
]
from openai import OpenAI
from uuid import uuid4

client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai/{langdb_project_id}/v1",  # LangDB API base URL
    api_key=api_key,  # Replace with your LangDB token
)

thread_id = str(uuid4())
run_id = str(uuid4())

# Agent 1 handles the user's message
response1 = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "developer", "content": "You are a helpful assistant."},
              {"role": "user", "content": "Hello!"}],
    extra_headers={"x-thread-id": thread_id, "x-run-id": run_id}
)

# Agent 2 processes the response
response2 = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "developer", "content": "Processing user input."},
              {"role": "user", "content": response1.choices[0].message.content}],
    extra_headers={"x-thread-id": thread_id, "x-run-id": run_id}
)
With LangDB you can trace multiple agents. The result is a thread that stores the entire workflow interaction as a conversation, together with tracing that captures details about the LLMs, tool calls, models, parameters, and more.
A full end-to-end multi agent workflow traced on LangDB
Thread section in LangDB Dashboard
Trying different Parameters for chat completions through LangDB Playground
Trying different Parameters for chat completions through LangDB Samples
Quick guide on how to set up a LangDB Virtual Model on Windsurf, Claude, and Cursor.
Trace after running simple API call
An example of how a Trace linked to a Thread looks on LangDB

User Tracking

Track users in LangDB AI Gateway to analyze usage, optimize performance, and improve chatbot experiences.

LangDB AI enables user tracking to collect analytics and monitor usage patterns efficiently. By associating metadata with requests, developers can analyze interactions, optimize performance, and enhance user experience.

Example: Chatbot Analytics with User Tracking

For a chatbot service handling multiple users, tracking enables:

  • Recognizing returning users: Maintain conversation continuity.

  • Tracking usage trends: Identify common queries to improve responses.

  • User segmentation: Categorize users using tags (e.g., "websearch", "support").

  • Analytics: Identify heavy users and allocate resources efficiently.

curl 'https://api.us-east-1.langdb.ai/v1/chat/completions' \
-H 'authorization: Bearer LangDBApiKey' \
-H 'Content-Type: application/json' \
-d '{
  "model": "openai/gpt-4o-mini",
  "stream": true,
  "messages": [
    {
      "role": "user",
      "content": "Def bubbleSort()"
    }
  ],
  "extra": {
    "user": {
      "id": "7",
      "name": "mrunmay",
      "tags": ["coding", "software"]
    }  
  }
}'

User Tracking Fields

  • extra.user.id: Unique user identifier.

  • extra.user.name: User alias.

  • extra.user.tags: Custom tags to classify users (e.g., "coding", "software").

Fetching User Analytics & Usage Data

Once users are tracked, analytics and usage APIs can be used to retrieve insights based on id, name, or tags.

Check out the Usage and Analytics sections for more details.

Example:

curl -L \
  --request POST \
  --url 'https://api.us-east-1.langdb.ai/analytics/summary' \
  --header 'Authorization: Bearer langDBAPIKey' \
  --header 'X-Project-Id: langDBProjectID' \
  --header 'Content-Type: application/json' \
  --data '{
    "user_id": "7",
    "user_name": "mrunmay",
    "user_tags": ["software", "code"]   
  }'

Example response:

{
  "summary": [
    {
      "total_cost": 0.00030366,
      "total_requests": 1,
      "total_duration": 6240.888,
      "avg_duration": 6240.9,
      "duration": 6240.9,
      "duration_p99": 6240.9,
      "duration_p95": 6240.9,
      "duration_p90": 6240.9,
      "duration_p50": 6240.9,
      "total_input_tokens": 1139,
      "total_output_tokens": 137,
      "avg_ttft": 6240.9,
      "ttft": 6240.9,
      "ttft_p99": 6240.9,
      "ttft_p95": 6240.9,
      "ttft_p90": 6240.9,
      "ttft_p50": 6240.9,
      "tps": 204.46,
      "tps_p99": 204.46,
      "tps_p95": 204.46,
      "tps_p90": 204.46,
      "tps_p50": 204.46,
      "tpot": 0.05,
      "tpot_p99": 0.05,
      "tpot_p95": 0.05,
      "tpot_p90": 0.05,
      "tpot_p50": 0.05,
      "error_rate": 0.0,
      "error_request_count": 0
    }
  ],
  "start_time_us": 1737547895565066,
  "end_time_us": 1740139895565066
}
npx @langdb/mcp setup figma https://api.staging.langdb.ai/mcp/xxxxx --client cursor
Virtual MCP Server - Using it in Claude, Cursor, Windsurf.
Virtual MCP Server - Usage on cursor
Virtual MCP Server - Usage on LangDB Playground

Working with Agent Frameworks

Enable end-to-end tracing for AI agent frameworks with LangDB’s one-line init() integration.

LangDB integrates seamlessly with a variety of agent libraries to provide out-of-the-box tracing, observability, and cost insights. By simply initializing the LangDB client adapter for your agent framework, LangDB monkey‑patches the underlying client to inject tracing hooks—no further code changes required.

Prerequisites

  • LangDB Core installed:

    pip install 'pylangdb'
  • Optional feature flags (for framework-specific tracing):

    pip install 'pylangdb[<library_feature>]'
    # e.g. pylangdb[adk], pylangdb[openai_agents]
  • Environment Variables set:

    export LANGDB_API_KEY="xxxxx"
    export LANGDB_PROJECT_ID="xxxxx"

Quick Start

Import and initialize once, before creating or running any agents:

from pylangdb.<library> import init
# Monkey‑patch the client for tracing
init()

# ...then your existing agent setup...

Monkey‑patching note: The init() call wraps key client methods at runtime to capture telemetry. Ensure it runs as early as possible.

GitHub Repo: https://github.com/langdb/pylangdb

Example: Google ADK

pip install 'pylangdb[adk]'
from pylangdb.adk import init
init()

from google.adk.agents import Agent
# (rest of your Google ADK agent code)

This is an example of complete end-to-end trace using Google ADK and LangDB.

LangDB’s ADK adapter captures request/response metadata, token usage, and latency metrics automatically. During initialization it discovers and wraps all agents and sub‑agents in subfolders, linking their sessions for full end‑to‑end tracing across your workflow.

Supported Frameworks

Further Documentation

For full documentation including client capabilities, configuration, and detailed examples, check out the Python SDK documentation and GitHub.

MCP Support

Create, manage, and connect MCP servers easily to integrate dynamic tools and enhance your AI workflows with full tracing.

LangDB simplifies how you work with MCP (Model Context Protocol) servers — whether you want to use a built-in Virtual MCP or connect to an external MCP server.

Browse publicly-available MCP servers on LangDB

Model Context Protocol (MCP) is an open standard that enables AI models to seamlessly communicate with external systems. It allows models to dynamically process contextual data, ensuring efficient, adaptive, and scalable interactions. MCP simplifies request orchestration across distributed AI systems, enhancing interoperability and context-awareness.

With native tool integrations, MCP connects AI models to APIs, databases, local files, automation tools, and remote services through a standardized protocol. Developers can effortlessly integrate MCP with IDEs, business workflows, and cloud platforms, while retaining the flexibility to switch between LLM providers. This enables the creation of intelligent, multi-modal workflows where AI securely interacts with real-world data and tools.

For more details, visit the Model Context Protocol official page and explore Anthropic MCP documentation.

Using Virtual MCPs

Using API

LangDB allows you to create Virtual MCP Servers directly from the dashboard. You can instantly select and bundle tools like database queries, search APIs, or automation tasks into a single MCP URL — no external setup needed.

Here's an example of how you can use a Virtual MCP Server in your project:

from openai import OpenAI
from uuid import uuid4

client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai/LangDBProjectID/v1",
    api_key="xxxx",
    default_headers={"x-thread-id": str(uuid4())},
)
mcpServerUrl = "Virtual MCP Server URL"
response = client.chat.completions.create(
    model="openai/gpt-4.1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user",   "content": "What are the databases available"}
    ],
    extra_body={
        "mcp_servers": [
            {
                "server_url": mcpServerUrl,
                "type": "sse"
            }
        ]
    }
)
import { OpenAI } from 'openai';
import { v4 as uuid4 } from 'uuid';

const client = new OpenAI({
  baseURL: "https://api.us-east-1.langdb.ai/LangDBProjectID/v1",
  apiKey: "xxxx",
  defaultHeaders: {
    "x-thread-id": uuid4()
  }
});

const mcpServerUrl = 'Virtual MCP URL';

async function getAssistantReply() {
  const { choices } = await client.chat.completions.create({
    model: "openai/gpt-4.1-nano",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "What are the databases on ClickHouse?" }
    ],
    // @ts-expect-error mcp_servers is a LangDB extension
    mcp_servers: [
      { server_url: mcpServerUrl, type: 'sse' }
    ]
  });
  console.log('Assistant:', choices[0].message.content);
}

getAssistantReply();

Check out the Virtual MCP section for use cases.

Using MCP Clients

You can instantly connect LangDB’s Virtual MCP servers to editors like Cursor, Claude, or Windsurf.

Run this in your terminal to set up MCP in Cursor:

npx @langdb/mcp setup <server_name> <mcp_url> --client cursor

You can now call tools directly in your editor, with full tracing on LangDB.

Connecting to External MCP Servers

If you already have an MCP server hosted externally — like Smithery’s Exa MCP — you can plug it straight into LangDB with zero extra setup.

Just pass your external MCP server URL in extra_body when you make a chat completion request. For example, with Smithery:

extra_body = {
    "mcp_servers": [
        {
            "server_url": "wss://your-mcp-server.com/ws?config=your_encoded_config",
            "type": "ws"
        }
    ]
}

For a complete example of how to use external MCP, refer to the .

Label

Label LLM instances in LangDB AI Gateway for easy tracking, categorization, and improved observability.

Label in LangDB defines an LLM instance with a unique identifier for categorization and tracking.

Core Features

  • Model Categorization: Assign labels to LLM instances.

  • Observability: Track models by label.

Headers for Label:

  • x-label: Defines a label for an LLM instance.

{
    "x-label" : "research-agent"
}
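A minimal sketch of attaching the label from Python via extra_headers (client setup as in the Working with API section; the label value is just an example):

from openai import OpenAI

client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai/{langdb_project_id}/v1",  # LangDB API base URL
    api_key="xxxxx",  # LangDB token
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Find recent papers on LLM routing."}],
    extra_headers={"x-label": "research-agent"},  # tag this call for easier categorization
)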

Run

Track and monitor complete workflows with Runs in LangDB AI Gateway for better observability, debugging, and insights.

A Run represents a single workflow or operation executed within LangDB. This could be a model invocation, a tool call, or any other discrete task. Each Run is independent and can be tracked separately, making it easier to analyze and debug individual workflows.

Example of a Run:

Core Features:

  • Granular Tracking: Analyze and optimize the performance and cost of individual Runs.

  • Independent Execution: Each Run has a distinct lifecycle, enabling precise observability.

Example:

Generating a summary of a document, analyzing a dataset, or fetching information from an external API – each is a Run.

Headers for Run:

  • x-run-id: Identifies a specific Run for tracking and debugging purposes.
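As a sketch, a single Run can be tagged via extra_headers (client setup as in the Working with API section; the run id is generated client-side here):

from uuid import uuid4
from openai import OpenAI

client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai/{langdb_project_id}/v1",  # LangDB API base URL
    api_key="xxxxx",  # LangDB token
)

# One discrete task (e.g. summarizing a document) tracked as a single Run
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize this document in three bullet points."}],
    extra_headers={"x-run-id": str(uuid4())},
)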

Draft Mode

Simplify version control with LangDB Virtual Models’ draft mode—safely iterate, preview, and publish model versions without impacting live traffic.

LangDB’s Virtual Models support a draft mode that streamlines version management and ensures safe, iterative changes. In draft mode, modifications are isolated from the published version until you explicitly publish, giving you confidence that live traffic is unaffected by in-progress edits.

Version Workflow

  1. Edit in Draft

    • Making any change (e.g., adjusting parameters, adding guardrails, modifying messages) flips the version into a Modified draft.

  2. Save Draft

    • Click Save to record your changes. The draft is saved as a new version at the top of the version list, without affecting the live version.

    • Live API traffic remains pointed at the last published version.

  3. Publish Draft

    • Once validated, click Publish:

      • Saves the version as the new latest version.

      • Directs all live chat completion traffic to this version.

      • Keeps the previous published version visible in the list so you can reselect and republish if needed.

  4. Restore & Edit Previous Version

    • Open the version dropdown and select any listed version.

    • The selected version loads into the editor.

    • You can further modify this draft and click Save to create a new version entry.

  5. Re-Publish Any Version

    • To make any saved version live, select it from the dropdown and click Publish.

API Behavior

All chatCompletions requests to a Virtual Model endpoint automatically target the latest published version. Drafts and restored drafts never receive live traffic until published.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai",
    api_key=api_key,
)

# Always hits current published version
response = client.chat.completions.create(
    model="openai/langdb/my-virtual-model@latest",
    messages=[...],
)

To preview changes in a draft or restored draft, switch the UI or JSON view selector to that draft and experiment in the Virtual Model Editor — all without impacting production calls.

Best Practices

  • Iterate Safely: Leverage drafts for experimental guardrails or parameter tuning without risking production stability.

  • Frequent Publishing: Keep version history granular—publish stable drafts regularly to simplify tracking and rollbacks.

  • Use Restore Thoughtfully: Before restoring, ensure any important unsaved draft work is committed or intentionally discarded.

Cost Control

Control project expenses by setting user and group-based limits, monitoring AI usage, and optimizing costs in LangDB.

LangDB enables cost tracking, project budgeting, and cost groups to help manage AI usage efficiently.

Cost Groups (Business Tier & Above)

  • Available in Business & Enterprise tiers under User Management.

  • Organize users into cost groups to track and allocate spending.

  • Cost groups help in budgeting but are independent of user roles.

Project-Level Spending Limits

  • Set daily, monthly, and total spending limits per project.

  • Enforce per-user limits to prevent excessive usage.

  • Available in Project Settings → Cost Control.

Cost Group-Based Role Management

  • Admins and Billing users can define spending limits for cost groups.

  • Set daily, monthly, and total budgets per group.

  • Useful for controlling team-based expenses independently of project limits.

User Roles

Set user permissions with LangDB’s role-based system, giving Admins, Developers, and Billing users specific access and controls.

LangDB provides role-based access control to manage users efficiently within an organization. There are three primary roles: Admin, Developer, and Billing.

Each role has specific permissions and responsibilities, ensuring a structured and secure environment for managing teams.

Admin

Admins have the highest level of control within LangDB. They can:

  • Invite and manage users

  • Assign and modify roles for team members

  • Manage cost groups and usage tracking

  • Access billing details and payment settings

  • Configure organizational settings

  • Configure project model access restrictions

  • Configure project user access restrictions

Best for: Organization owners, team leads, or IT administrators managing team access and billing.

Developer

Developers focus on working with APIs and integrating LLMs. They have the following permissions:

  • Access and use LangDB APIs

  • Deploy and test applications using LangDB’s AI Gateway

  • View and monitor API usage and performance

Best for: Software developers, data scientists, and AI engineers working on LLM integrations.

Billing

Billing users have access to financial and cost-related features. Their permissions include:

  • Managing top-ups and subscriptions

  • Monitoring usage costs and optimizing expenses

Best for: Finance teams, accounting personnel, and cost management administrators.


Role Management

Admins can assign roles to users when inviting them to the organization. Role changes can also be made later through the user management panel.

Key Points:

  • Users can have multiple roles (e.g., both Developer and Billing).

  • Only Admins can assign or update roles.

  • Billing users cannot modify API access but can track and manage costs.

  • Role Management is only available in Professional, Business, and Enterprise tiers.

Custom MCP Servers

Learn how to connect your own custom MCP servers to LangDB AI Gateway.

While LangDB provides a rich library of pre-built MCP servers, you can also bring your own. By connecting a custom MCP server, you can leverage all the benefits of a Virtual MCP Server, including:

  • Unified Interface: Combine your custom tools with tools from other LangDB-managed servers.

  • Clean Auth Handling: Let LangDB manage authentication, or provide your own API keys and headers.

  • Full Observability: Get complete tracing for every call, with logs, latencies, and metrics.

  • Seamless Integration: Works out-of-the-box with clients like Cursor, Claude, and Windsurf.

  • Enhanced Security: Benefit from version pinning and protection against tool definition poisoning.

This guide explains how to connect your own custom MCP server, whether it uses an HTTP (REST API) or SSE (Server-Sent Events) transport.

Connecting Your Custom Server

When creating a Virtual MCP Server, you can add your own server alongside the servers deployed and managed by LangDB.

Steps to Configure a Custom Server

  1. Navigate to MCP Servers: Go to the "MCP Servers" section in your LangDB project and click "Create Virtual MCP Server".

  2. Add a Custom Server: In the "Server Configuration" section, click the "+ Add Server" button on the right and select "Custom" from the list.

  3. Configure Server Details: A new "Custom Server" block will appear on the left. Fill in the following details:

    • Server Name: Give your custom server a descriptive name.

    • Transport Type: Choose either HTTP (REST API) or SSE (Server-Sent Events) from the dropdown.

    • HTTP/SSE URL: Enter the endpoint URL for your custom MCP server. LangDB will attempt to connect to this URL to validate the server and fetch the available tools.

    • (Optional) HTTP Headers: If your server requires specific HTTP headers for authentication or other purposes, you can add them here.

    • (Optional) Environment Variables: If your server requires specific configuration via environment variables, you can add them.

  4. Select Tools: Once LangDB successfully connects to your server, it will display a list of all the tools exposed by your MCP server. You can select which tools you want to include in your Virtual MCP Server.

  5. Generate URL: After configuring your custom server and selecting the tools, you can generate the secure URL for your Virtual MCP Server and start using it in your applications.

Project Access Control

Control which users have access to your projects with LangDB's project-level user access restrictions.

Select which users in your organization can access specific projects. Only Admins can configure project access - other roles cannot modify these settings.

How It Works

  • Admin-only configuration: Only Admins can enable/disable user access per project

  • User-level control: Individual users can be granted or revoked project access

  • Role preservation: Users keep their organization roles but may be restricted from certain projects

  • API enforcement: Users without project access cannot make API calls to restricted projects

Setup (Admin Only)

  1. Project Settings → Users → User Access Configuration

  2. Search and select users to grant project access

  3. Toggle individual users on/off for the project

  4. Use "All Users" toggle to quickly enable/disable everyone

  5. Save configuration

User States

  • Enabled: User can access the project and make API calls

  • Disabled: User cannot access the project (blocked from API calls)

  • All Users toggle: Bulk enable/disable all organization users for the project

Common Use Cases

  • Sensitive projects: Restrict access to confidential or regulated projects

  • Client work: Limit project access to specific team members working with particular clients

  • Development stages: Control access to production vs development projects

  • Cost management: Prevent unauthorized usage by limiting project access

Troubleshooting

"Access denied" errors:

  • Check if the user is enabled for the specific project

  • Verify the user exists in the organization

  • Confirm the project access configuration is saved

Can't modify project access:

  • Only Admin role can configure project access

  • Ensure you're in the correct project settings

Beating GPT-5

LangDB's Auto Router delivers 83% satisfactory results at 35% lower cost than GPT-5. Real-world testing across 100 prompts shows router optimization without quality compromise.

Everyone assumes GPT-5 is untouchable — the safest, most accurate choice for every task. But our latest experiments tell a different story. When we put LangDB's Auto Router head-to-head against GPT-5, the results surprised us.

The Setup

We ran 100 real-world prompts across four categories: Finance, Writing, Science/Math, and Coding. One group always used GPT-5. The other let Auto Router decide the right model.

At first glance, you’d expect GPT-5 to dominate — and in strict A/B judging, it often did. But once we layered in a second check — asking an independent validator whether the Router’s answers were satisfactory (correct, useful, and complete) — the picture flipped.

What We Found

  • Costs Less: Router cut spend by 35% compared to GPT-5 ($1.04 vs $1.58).

  • Good Enough Most of the Time: Router's answers were judged satisfactory in 83% of cases.

  • Practical Wins: When you combine Router wins, ties, and “GPT-5 wins but Router still satisfactory,” the Router came out ahead in 86/100 tasks.

  • Safe: There were zero catastrophic failures — Router never produced unusable output.

Breaking Down Quality

On strict comparisons, GPT-5 outscored Router in 65 cases. Router directly won 10, with 25 ties. But here’s the catch: in the majority of those “GPT-5 wins,” the Router’s answer was still perfectly fine.

Think about defining a finance term, writing a short code snippet, or solving a straightforward math problem. GPT-5 might give a longer, more polished answer, but Router’s output was clear, correct, and usable — and it cost a fraction of the price.

The validator helped us separate “better” from “good enough.” And for most workloads, good enough at lower cost is exactly what you want.

Where Router Shines (and Struggles)

  • Finance: Router was flawless here, delivering satisfactory answers for every single prompt.

  • Coding: Router handled structured coding tasks well — effective in 30 out of 32 cases.

  • Science/Math: Router held its own, though GPT-5 still had the edge on trickier reasoning.

  • Writing: This was the weakest area for Router. GPT-5 consistently produced richer, more polished prose. Still, Router’s outputs were acceptable two-thirds of the time.


Why This Matters

The key takeaway isn’t that Router is “better than GPT-5” in raw accuracy. It’s that Router is better for your budget without compromising real-world quality. By knowing when a smaller model is good enough, you save money while still keeping GPT-5 in reserve for the hardest tasks.

In practice, that means:

  • Finance and Coding workloads → Route automatically and trust the savings.

  • Open-ended creative writing → Let Router escalate to GPT-5 when needed.

  • Everywhere else → Expect huge cost reductions without a hit to user experience.


Try It Yourself

Using the Router doesn’t require any special configuration:

{
  "model": "router/auto",
  "messages": [
    {
      "role": "user",
      "content": "Define liquidity in finance in one sentence."
    }
  ]
}

Just point to router/auto. LangDB takes care of routing — so you get the right balance of cost and quality, automatically.

Response Caching

Enable response caching in LangDB for faster, lower-cost results on repeated LLM queries.

Response caching is designed for faster response times, reduced compute cost, and consistent outputs when handling repeated or identical prompts. Perfect for dashboards, agents, and endpoints with predictable queries.

Benefits

  • Faster responses for identical requests (cache hit)

  • Reduced model/token usage for repeated inputs

  • Consistent outputs for the same input and parameters

Using Response Caching

Through Virtual Model

  1. Toggle Response Caching ON.

  2. Select the cache type:

    • Exact match (default): Reuses the cached response only for an identical prompt.

    • (Distance-based matching is coming soon.)

  3. Set Cache expiration time in seconds (default: 1200).

Once enabled, identical requests will reuse the cached output as long as it hasn’t expired.

Through API Calls

You can use caching on a per-request basis by including a cache field in your API body:

  • type: Currently only exact is supported.

  • expiration_time: Time in seconds (e.g., 1200 for 20 minutes).

If caching is enabled in both the virtual model and the request, the API payload takes priority.
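A sketch of a per-request cache configuration sent through the OpenAI client’s extra_body (the cache fields mirror the JSON request body shown later on this page; client setup as in the Working with API section):

from openai import OpenAI

client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai/{langdb_project_id}/v1",  # LangDB API base URL
    api_key="xxxxx",  # LangDB token
)

response = client.chat.completions.create(
    model="openai/gpt-4.1",
    messages=[{"role": "user", "content": "Summarize the news today"}],
    extra_body={
        "cache": {
            "type": "exact",          # currently only exact matching is supported
            "expiration_time": 1200,  # seconds (20 minutes)
        }
    },
)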

Pricing

  • Cache hits are billed at 0.1× the standard token price (90% cheaper than a normal model call).

Cache Hits

  • When a response is served from cache, it is clearly marked as Cache: HIT in traces.

  • You’ll also see:

    • Status: 200

    • Trace ID and Thread ID for debugging

    • Start time / Finish time: Notice how the duration is typically <0.01s for cache hits.

    • Cost: Cache hits are billed at a much lower rate (shown here as $0.000027).

  • The “Cache” field is displayed prominently (green “HIT” label).

Response caching in LangDB is a practical way to improve latency, reduce compute costs, and ensure consistent outputs for repeated queries. Use the UI or API to configure caching, monitor cache hits in traces and dashboard, and take advantage of reduced pricing for cached responses.

For most projects with stable or repeated inputs, enabling caching is a straightforward optimization that delivers immediate benefits.

Model Access Control

Control which models are available in your projects with LangDB's model access restrictions, ensuring teams only use approved models.

Restrict which AI models are available for specific projects. Only Admins can configure these restrictions - other roles are bound by the settings.

How It Works

  • Admin-only configuration: Only Admins can set which models are allowed per project

  • API enforcement: Restricted models return access denied errors

  • Team-wide: All project members are bound by the same restrictions

  • Universal: Works across all API endpoints and integrations

Setup (Admin Only)

  1. Project Settings → Model

  2. Select allowed models from the list

  3. Save configuration

Test with an API call to verify restrictions are working.

Common Use Cases

  • Cost control: Restrict expensive models in dev environments

  • Production stability: Only allow tested models in production

  • Compliance: Meet regulatory requirements by limiting model access

Troubleshooting

"Model not available" errors:

  • Check if the model is in the project's allowed list

  • Verify model restrictions are enabled

  • Confirm you're using the correct model identifier

Can't modify restrictions:

  • Only Admin role can configure restrictions

{
  "model": "openai/gpt-4.1",
  "messages": [
    {"role": "user", "content": "Summarize the news today"}
  ],
  "cache": {
    "type": "exact",
    "expiration_time": 1200
  }
}
Setting up response caching on LangDB
Tracing Showcasing LangDB Response Cache
Model Access  Control settings page in Projects

Google ADK

OpenAI Agents SDK

LangGraph

Agno

CrewAI

Example of how Labelled LLM calls look.
A complete trace view grouped by a Run.
Modified Virtual Model
Saved Version Draft
Publishing a Version of Virtual Model
Creating Cost groups for users on LangDB
Setting Project Level limits on LangDB
Setting Cost Group limits per project.
LangDB - Users and Roles settings
Configuring Custom MCP Server
User Access Control settings page in Projects
MCP Servers hosted on LangDB
Setting up and Using Virtual MCP.

Working with Agno

Unlock full observability for Agno agents and tasks—capture LLM calls, task execution, and agent interactions with LangDB’s init().

LangDB’s Agno integration provides end-to-end tracing for your Agno agent pipelines.

Installation

Install the LangDB client with Agno feature flag:

Quick Start

Export Environment Variables

Set your LangDB credentials:

Initialize Tracing

Import and run init() before configuring your Agno code:

Configure your Agno code

All Agno interactions, from invocation through tool calls to final output, are traced with LangDB.

Complete Agno Example

Here is a full example based on a Web Search Agno multi-agent team.

Example code

Check out the full sample on GitHub:

Setup Environment

Export Environment Variables

main.py

Running your Agent

Navigate to the parent directory of your agent project and use one of the following commands:

Traces on LangDB

When you run queries against your agent, LangDB automatically captures detailed traces of all agent interactions:

Next Steps: Advanced Agno Integration

This guide covered the basics of integrating LangDB with Agno using a Web Search agent example. For more complex scenarios and advanced use cases, check out our comprehensive resources.

Working with Google ADK

Instrument Google ADK pipelines with LangDB—capture nested agent flows, token usage, and latency metrics using a single init() call.

LangDB’s Google ADK integration provides end-to-end tracing for your ADK agent pipelines.

Installation

Enable end-to-end tracing for your Google ADK agents by installing the pylangdb client with the ADK feature flag:
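Mirroring the feature-flag install shown in Working with Agent Frameworks:

pip install 'pylangdb[adk]'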

Quick Start

Set your environment variables before initializing and running the script:
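For example (placeholders for your own credentials):

export LANGDB_API_KEY="xxxxx"
export LANGDB_PROJECT_ID="xxxxx"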

Initialize LangDB before creating or running any ADK agents:
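Following the pattern from the Working with Agent Frameworks section:

from pylangdb.adk import init
init()

from google.adk.agents import Agent
# (rest of your Google ADK agent code)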

Once initialized, LangDB automatically discovers all agents and sub-agents (including nested folders), wraps their key methods at runtime, and links sessions for full end-to-end tracing across your workflow.

Complete Google ADK Python Example

Here's a full example of a Google ADK agent implementation that you can instrument with LangDB. This sample is based on the official Google ADK quickstart.

Example code

Check out the full sample on GitHub:

Setup Environment

Project Structure

Create the following project structure:

__init__.py

Create an __init__.py file in the multi_tool_agent folder:

.env

Create .env file for your secrets

agent.py

Create an agent.py file with the following code:

Running Your Agent

Navigate to the parent directory of your agent project and use the following commands:

Open the URL provided (usually http://localhost:8000) in your browser and select "multi_tool_agent" from the dropdown menu.

Once your agent is running, try these example queries to test its functionality:

These queries will trigger the agent to use the functions we defined and provide responses based on our agent workflow.

Traces on LangDB

When you run queries against your ADK agent, LangDB automatically captures detailed traces of all agent interactions:

Next Steps: Advanced Google ADK Integration

This guide covered the basics of integrating LangDB with Google ADK using a simple weather and time agent example. For more complex scenarios and advanced use cases, check out our comprehensive resources.

Provider Routing

Automatically route requests across multiple AI providers for optimal cost, latency, and accuracy. One model name, multiple providers.

Stop worrying about which provider to pick. With Provider Routing, you can call a model by name, and LangDB will automatically select the right provider for you.

Why Use Provider Routing?

  • One Name, Many Providers – Call a model like deepseek-v3.1 and LangDB picks from DeepSeek official, Parasail, DeepInfra, Fireworks AI, and more.

  • Optimize by Mode – Choose whether you want lowest cost, fastest latency, highest accuracy, or simply balanced routing.


Quick Start
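As a minimal sketch (client setup as in the Working with API section), just pass the model name without a provider prefix:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai/{langdb_project_id}/v1",  # LangDB API base URL
    api_key="xxxxx",  # LangDB token
)

response = client.chat.completions.create(
    model="deepseek-v3.1",  # no provider prefix: LangDB routes across available providers
    messages=[{"role": "user", "content": "Explain reinforcement learning in simple terms."}],
)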

That’s it — LangDB will resolve deepseek-v3.1 across multiple providers, and by default use balanced mode.


Optimization Modes

When you specify only a model name, LangDB chooses the provider according to your selected mode.

Mode | What it does | Best for
balanced (default) | Distributes requests across providers for optimal overall performance | General apps
accuracy | Routes to the provider with the best benchmark score | Research, compliance
cost | Picks the cheapest provider by input/output token price | Support chatbots, FAQs
latency | Always selects the lowest latency provider | Real-time UIs, voice bots
throughput | Spreads requests across all providers to maximize concurrency | High-volume pipelines

Examples

Balanced (default)

LangDB chooses the provider dynamically, balancing cost, latency, and accuracy.


Cost Optimization

LangDB picks the cheapest provider for deepseek-v3.1 based on input/output token prices (e.g. Parasail, Fireworks AI, or DeepInfra if they’re lower than DeepSeek official).


Accuracy Optimization

Routes to the provider with the highest benchmark score for deepseek-v3.1.


Latency Optimization

Always picks the provider with the fastest response times.


Throughput Optimization

Distributes requests across all available providers for deepseek-v3.1 to maximize scale.


Explicit Provider Pinning

If you want full control, you can always specify the provider explicitly:

This bypasses provider routing and always uses the given provider.


Summary

  • Use model without provider → LangDB does provider routing.

  • Add :mode suffix → pick between balanced, accuracy, cost, latency, or throughput.

  • Use provider/model → pin a specific provider directly.

Provider Routing makes it easy to scale across multiple vendors without rewriting your code.

Usage

Track total usage, model-specific metrics, and user-level analytics to stay within limits and optimize LLM workflows.

Monitoring complements tracing by providing aggregate insights into the usage of LLM workflows.

Limits

LangDB enforces limits to ensure fair usage and cost management while allowing users to configure these limits as needed. Limits are categorized into:

  1. Daily Limits: Maximum usage per day, e.g., $10 in the Starter Tier.

  2. Monthly Limits: Total usage allowed in a month, e.g., $100.

  3. Total Limits: Cumulative limit over the project’s duration, e.g., $500.

Best Practices

  • Monitor usage regularly to avoid overages.

  • Plan limits based on project needs and anticipated workloads.

  • Upgrade tiers if usage consistently approaches limits.

Setting limits not only helps you stay within budget but also provides the flexibility to scale your usage as needed, ensuring your projects run smoothly and efficiently.

Usage APIs

The /usage/total endpoint retrieves the total usage statistics for your project for a given timeframe.

Example Response:

The /usage/models endpoint fetches timeseries usage statistics per model, allowing users to analyze the distribution of LLM usage.

Example Response:

Filtering By Users

As discussed in User Tracking, we can use filters to retrieve insights based on id, name, or tags.

Available Filters:

  • user_id: Filter data for a specific user by their unique ID.

  • user_name: Retrieve usage based on the user’s name.

  • user_tags: Filter by tags associated with a user (e.g., "websearch", "support").

Example response:

{
  "model": "deepseek-v3.1",
  "messages": [
    {
      "role": "user",
      "content": "Explain reinforcement learning in simple terms."
    }
  ]
}


{
  "model": "deepseek-v3.1",
  "messages": [{ "role": "user", "content": "Summarize this article." }]
}
{
  "model": "deepseek-v3.1:cost",
  "messages": [{ "role": "user", "content": "Write a short FAQ response." }]
}
{
  "model": "deepseek-v3.1:accuracy",
  "messages": [{ "role": "user", "content": "Solve this math word problem." }]
}
{
  "model": "deepseek-v3.1:latency",
  "messages": [{ "role": "user", "content": "Respond quickly for a live chat." }]
}
{
  "model": "deepseek-v3.1:throughput",
  "messages": [{ "role": "user", "content": "Translate this dataset." }]
}
{
  "model": "parasail/deepseek-v3.1",
  "messages": [{ "role": "user", "content": "Generate a poem." }]
}
curl --location 'https://api.us-east-1.langdb.ai/usage/total' \
--header 'x-project-id: langdbProjectID' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer langDBAPIKey' \
--data '{"start_time_us": 1693062345678,
         "end_time_us": 1695092345678}'
{
  "total": {
    "total_input_tokens": 4181386,
    "total_output_tokens": 206547,
    "total_cost": 11.890438685999994
  },
  "period_start": 1737504000000000,
  "period_end": 1740131013885000
}
curl --location 'https://api.us-east-1.langdb.ai/usage/models' \
--header 'x-project-id: langdbProjectID' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer langDBAPIKey' \
--data '{"start_time_us": 1693062345678, "end_time_us": 1695092345678,
         "min_unit": "hour"}'
{
  "models": [
    {
      "hour": "2025-02-14 08:00:00",
      "provider": "openai",
      "model_name": "gpt-4o-mini",
      "total_input_tokens": 13408,
      "total_output_tokens": 2169,
      "total_cost": 0.0039751199999999995
    },
    {
      "hour": "2025-02-13 08:00:00",
      "provider": "openai",
      "model_name": "gpt-4o-mini",
      "total_input_tokens": 55612,
      "total_output_tokens": 786,
      "total_cost": 0.01057608
    }
  ],
  "period_start": 1737504000000000,
  "period_end": 1740130915098000
}
curl -L \
  --request POST \
  --url 'https://api.us-east-1.langdb.ai/usage/models' \
  --header 'Authorization: Bearer langDBAPIKey' \
  --header 'X-Project-Id: langDBProjectID' \
  --header 'Content-Type: application/json' \
  --data '{
    "user_id": "123",
    "user_name": "mrunmay",
    "user_tags": ["websearch", "testings"]
  }'
{
  "models": [
    {
      "day": "2025-02-21 10:00:00",
      "provider": "openai",
      "model_name": "gpt-4o-mini",
      "total_input_tokens": 1112,
      "total_output_tokens": 130,
      "total_cost": 0.00029376
    },
    {
      "day": "2025-02-21 14:00:00",
      "provider": "openai",
      "model_name": "gpt-4o-mini",
      "total_input_tokens": 3317,
      "total_output_tokens": 328,
      "total_cost": 0.00083322
    }
  ],
  "period_start": 1737556513673410,
  "period_end": 1740148513673410
}
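
For programmatic access, the same usage endpoints can be called from Python. Below is a minimal sketch using the requests library, assuming the /usage/models endpoint, headers, and filter fields shown in the examples above:

import os
import requests

# Minimal sketch: query /usage/models with user filters (endpoint and headers as in the curl examples above)
response = requests.post(
    "https://api.us-east-1.langdb.ai/usage/models",
    headers={
        "Authorization": f"Bearer {os.environ['LANGDB_API_KEY']}",
        "X-Project-Id": os.environ["LANGDB_PROJECT_ID"],
        "Content-Type": "application/json",
    },
    json={"user_id": "123", "user_tags": ["websearch"]},
    timeout=30,
)
response.raise_for_status()
for row in response.json().get("models", []):
    print(row["provider"], row["model_name"], row["total_cost"])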
Usage dashboard displaying input and output tokens for each provider and model, as well as their cost.
pip install 'pylangdb[agno]'
export LANGDB_API_KEY="<your_langdb_api_key>"
export LANGDB_PROJECT_ID="<your_langdb_project_id>"
from pylangdb.agno import init
# Initialise LangDB
init()
import os
from pylangdb.agno import init
init()

from agno.agent import Agent
from agno.tools.duckduckgo import DuckDuckGoTools
from agno.models.langdb import LangDB

# Configure LangDB-backed model
langdb_model = LangDB(
    id="openai/gpt-4",
    api_key=os.getenv("LANGDB_API_KEY"),
    project_id=os.getenv("LANGDB_PROJECT_ID"),
)

# Create and run your agent
agent = Agent(
    name="Web Agent",
    role="Search the web for information",
    model=langdb_model,
    tools=[DuckDuckGoTools()],
    instructions="Answer questions using web search",
)

response = agent.run("What is LangDB?")
print(response)
pip install agno 'pylangdb[agno]' duckduckgo-search
export LANGDB_API_KEY="<your_langdb_api_key>"
export LANGDB_PROJECT_ID="<your_langdb_project_id>"
import os
from textwrap import dedent

# Initialize LangDB tracing and import model
from pylangdb.agno import init
init()
from agno.models.langdb import LangDB

# Import Agno agent components
from agno.agent import Agent
from agno.tools.duckduckgo import DuckDuckGoTools

# Function to create a LangDB model with selectable model name
def create_langdb_model(model_name="openai/gpt-4.1"):
    return LangDB(
        id=model_name,
        api_key=os.getenv("LANGDB_API_KEY"),
        project_id=os.getenv("LANGDB_PROJECT_ID"),
    )

web_agent = Agent(
    name="Web Agent",
    role="Search the web for comprehensive information and current data",
    model=create_langdb_model("openai/gpt-4.1"),
    tools=[DuckDuckGoTools()],
    instructions="Always use web search tools to find current and accurate information. Search for multiple aspects of the topic to gather comprehensive data.",
    show_tool_calls=True, 
    markdown=True,        
)


writer_agent = Agent(
    name="Writer Agent",
    role="Write comprehensive article on the provided topic",
    model=create_langdb_model("anthropic/claude-3.7-sonnet"),
    instructions="Use outlines to write articles",
    show_tool_calls=True,
    markdown=True,
)

agent_team = Agent(
    name="Research Team",
    team=[web_agent, writer_agent],  
    model=create_langdb_model("gemini/gemini-2.0-flash"),
    instructions=dedent("""\
        You are the coordinator of a research team with two specialists:
        
        1. Web Agent: Has DuckDuckGo search tools and must be used for ALL research tasks
        2. Writer Agent: Specializes in creating comprehensive articles
        
        WORKFLOW:
        1. ALWAYS delegate research tasks to the Web Agent first
        2. The Web Agent MUST use web search tools to gather current information
        3. Then delegate writing tasks to the Writer Agent using the research findings
        4. Ensure comprehensive coverage of the topic through multiple searches
        
        IMPORTANT: Never attempt to answer without first having the Web Agent conduct searches.
    """),
    show_tool_calls=True,
    markdown=True,
)

agent_team.print_response(
    "I need a comprehensive article about the Eiffel Tower. "
    "Please have the Web Agent search for current information about its history, architectural significance, and cultural impact. "
    "Then have the Writer Agent create a detailed article based on the research findings.", 
    stream=True
)
python main.py
https://github.com/langdb/langdb-samples/tree/main/examples/agno/agno-basic
Checkout: https://app.langdb.ai/sharing/threads/8a44dccc-c679-4fc3-9555-a07de103d637
Multi Team Agno Trace
pip install 'pylangdb[adk]'
export LANGDB_API_KEY="<your_langdb_api_key>"
export LANGDB_PROJECT_ID="<your_langdb_project_id>"
from pylangdb.adk import init
# Initialise LangDB
init()

# Then proceed with your normal ADK setup:
from google.adk.agents import Agent
# ...define and run agents...
pip install google-adk litellm 'pylangdb[adk]'
parent_folder/
└── multi_tool_agent/
    ├── __init__.py
    ├── agent.py
    └── .env
from . import agent
LANGDB_API_KEY="<your_langdb_api_key>"
LANGDB_PROJECT_ID="<your_langdb_project_id>"
# First initialize LangDB before defining any agents
from pylangdb.adk import init
init()

import datetime
from zoneinfo import ZoneInfo
from google.adk.agents import Agent

def get_weather(city: str) -> dict:
    if city.lower() != "new york":
        return {"status": "error", "error_message": f"Weather information for '{city}' is not available."}
    return {"status": "success", "report": "The weather in New York is sunny with a temperature of 25 degrees Celsius (77 degrees Fahrenheit)."}

def get_current_time(city: str) -> dict:
    if city.lower() != "new york":
        return {"status": "error", "error_message": f"Sorry, I don't have timezone information for {city}."}
    tz = ZoneInfo("America/New_York")
    now = datetime.datetime.now(tz)
    return {"status": "success", "report": f'The current time in {city} is {now.strftime("%Y-%m-%d %H:%M:%S %Z%z")}'}

root_agent = Agent(
    name="weather_time_agent",
    model="gemini-2.0-flash",
    description=("Agent to answer questions about the time and weather in a city." ),
    instruction=("You are a helpful agent who can answer user questions about the time and weather in a city."),
    tools=[get_weather, get_current_time],
)
adk web
What's the weather in New York?
Google ADK Quickstart
https://github.com/langdb/langdb-samples/tree/main/examples/google-adk/multi-tool-agent
Checkout: https://app.langdb.ai/sharing/threads/8425e068-77de-4f41-8aa9-d1111fc7d2b7
Sample Traces of Google ADK Quick Start

Prompt Caching

Leverage provider-side prompt caching for significant cost and latency savings on large, repeated prompts.

To save on inference costs, you can leverage prompt caching on supported providers and models. When a provider supports it, LangDB will make a best-effort to route subsequent requests to the same provider to make use of the warm cache.

Most providers automatically enable prompt caching for large prompts, but some, like Anthropic, require you to enable it on a per-message basis.

How Caching Works

Automatic Caching

Providers like OpenAI, Grok, DeepSeek, and (soon) Google Gemini enable caching by default once your prompt exceeds a certain length (e.g. 1024 tokens).

  • Activation: No change needed. Any prompt over the length threshold is written to cache.

  • Best Practice: Put your static content (system prompts, RAG context, long instructions) first in the message so it can be reused.

  • Pricing:

    • Cache Write: Mostly free or heavily discounted.

    • Cache Read: Deep discounts vs. fresh inference.

Manual Caching

Anthropic’s Claude family requires you to mark which parts of the message are cacheable by adding a cache_control object. You can also set a TTL to control how long the block stays in cache.

  • Activation: You must wrap static blocks in a content array and give them a cache_control entry.

  • TTL: Use {"ttl": "5m"} or {"ttl": "1h"} to control expiration (default 5 minutes).

  • Best For: Huge documents, long backstories, or repeated system instructions.

  • Pricing:

    • Cache Write: 1.25× the normal per-token rate

    • Cache Read: 0.1× (10%) of the normal per-token rate

  • Limitations: Ephemeral (expires after TTL), limited number of blocks.

In a run like the example below, the trace shows “Prompt Caching: 99.9% Write” along with a small cost increase (~25%) for the cache write.

Caching Example (Anthropic)

Here is an example of caching a large document. This can be done in either the system or user message.

{
  "model": "anthropic/claude-3.5-sonnet",
  "messages": [
    {
      "role": "system",
      "content": [
        {
          "type": "text",
          "text": "You are a helpful assistant that analyzes legal documents. The following is a terms of service document:"
        },
        {
          "type": "text",
          "text": "HUGE DOCUMENT TEXT...",
          "cache_control": {
            "type": "ephemeral",
            "ttl": "1h"
          }
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Summarize the key points about data privacy."
        }
      ]
    }
  ]
}
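
The same request can be sent through any OpenAI-compatible client pointed at LangDB. Below is a minimal sketch using the openai Python SDK, assuming the LangDB base URL and x-project-id header used elsewhere in these docs; the document text is a placeholder:

import os
from openai import OpenAI

# Sketch: send the Anthropic prompt-caching request above via LangDB's OpenAI-compatible API
client = OpenAI(
    api_key=os.environ["LANGDB_API_KEY"],
    base_url=os.environ.get("LANGDB_API_BASE_URL", "https://api.us-east-1.langdb.ai"),
    default_headers={"x-project-id": os.environ["LANGDB_PROJECT_ID"]},
)

completion = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",
    messages=[
        {
            "role": "system",
            "content": [
                {"type": "text", "text": "You are a helpful assistant that analyzes legal documents."},
                {
                    "type": "text",
                    "text": "HUGE DOCUMENT TEXT...",  # placeholder for the large, static document
                    "cache_control": {"type": "ephemeral", "ttl": "1h"},
                },
            ],
        },
        {"role": "user", "content": "Summarize the key points about data privacy."},
    ],
)
print(completion.choices[0].message.content)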

Provider Support Matrix

| Provider | Auto-cache? | Manual flag? | TTL | Write cost | Read cost |
| --- | --- | --- | --- | --- | --- |
| OpenAI | ✅ | ❌ | N/A | standard | 0.25x or 0.5x |
| Grok | ✅ | ❌ | N/A | standard | 0.25x |
| DeepSeek | ✅ | ❌ | N/A | standard | 0.25x |
| Anthropic Claude | ❌ | cache_control + TTL | 5 m / 1 h | 1.25× | 0.1× |


For the most up-to-date information on a specific model or provider's caching policy, pricing, and limitations, please refer to the model page on LangDB

Beating the Best Model

Save costs without losing quality. Auto Router delivers best-model accuracy at a fraction of the price.

Most developers assume that using the best model is the safest bet for every query. But in practice, that often means paying more than you need to — especially when cheaper models can handle simpler queries just as well.

LangDB’s Auto Router shows you don’t always need the “best” model — just the right model for the job.

The Question We Asked

When building AI applications, you face a constant trade-off: performance vs. cost. Do you always use the most powerful (and expensive) model to guarantee quality? Or do you risk cheaper alternatives that might fall short on complex tasks?

We wanted to find out: Can smart routing beat the "always use the best model" strategy?

Our Experiment

We designed a head-to-head comparison using 100 real-world queries across four domains: Finance, Writing, Science/Math, and Coding. Each query was tested against two strategies:

  • Auto Router → Analyzed query complexity and topic, then selected the most cost-effective model that could handle the task

  • Router:Accuracy → Always defaulted to the highest-performing model (the "best model" approach)

What made this test realistic:

  • Diverse complexity: 70 low-complexity queries (simple conversions, definitions) and 30 high-complexity queries (complex analysis, multi-step reasoning)

  • Real-world domains: Finance calculations, professional writing, scientific explanations, and coding problems

  • Impartial judging: Used GPT-5-mini as an objective judge to compare response quality

Sample of what we tested:

  • Finance: "A company has revenue of $200M and expenses of $150M. What is its profit?"

  • Writing: "Write a one-line professional email subject requesting a meeting"

  • Science/Math: "Convert 100 cm into meters"

  • Coding: "Explain what a variable is in programming in one sentence"

Results

| Metric | Auto Router | Router:Accuracy |
| --- | --- | --- |
| Total Cost | $0.95 | $1.64 |
| Wins | 65% | 0% |
| Ties | 35% | 35% |
| Losses | 0% | 0% |
| Accuracy Parity | 100% (wins + ties) | 100% |

What Wins & Ties Mean

  • Win → Auto Router chose a cheaper model, and the output was equal or better than the best model.

  • Tie → Auto Router escalated to the best model itself, because the query was complex enough to require it.

  • Loss → Didn’t happen. Auto Router never underperformed compared to always using the best model.

In other words: Auto Router matched or beat the best model strategy 100% of the time — while cutting costs by ~42%.

Category Breakdown

| Category | Count | Router Wins | Ties (Used Best Model) |
| --- | --- | --- | --- |
| Finance | 25 | 23 | 2 |
| Writing | 24 | 18 | 6 |
| Science & Math | 19 | 14 | 5 |
| Coding | 32 | 10 | 22 |

  • In Finance and Writing, Auto Router confidently used cheaper models most of the time.

  • In Coding, Auto Router often escalated to the best model — proving it knows when not to compromise.

The Methodology Behind the Magic

How Auto Router Works: Auto Router doesn't just pick models randomly. It uses a sophisticated classification system that:

  1. Analyzes query complexity — Is this a simple fact lookup or a complex reasoning task?

  2. Identifies the domain — Finance, writing, coding, or science/math?

  3. Matches to optimal model — Selects the most cost-effective model that can handle the specific complexity level

The "Always Best" Approach: Router:Accuracy takes the conservative route — always selecting the highest-performing model regardless of query complexity. It's like using a Formula 1 car for grocery shopping.

Fair Comparison: We used GPT-5-mini as an impartial judge to evaluate response quality across both strategies. The judge compared answers based on correctness, usefulness, and completeness without knowing which routing strategy was used.

What This Means for Developers

The Real-World Impact:

  • Cost optimization without compromise — Save 42% on API costs while maintaining quality

  • Intelligent escalation — Complex queries automatically get the best models

  • No manual tuning — The router handles the complexity analysis for you

Try It Yourself

Using Auto Router is simple — just point to router/auto:

Auto Router will automatically select the most cost-effective model that can handle your query complexity.
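
For example, here is a minimal sketch using the openai Python SDK, assuming the LangDB base URL and x-project-id header used elsewhere in these docs (an equivalent raw JSON request body appears at the end of this page):

import os
from openai import OpenAI

# Sketch: point an OpenAI-compatible client at LangDB and use router/auto as the model name
client = OpenAI(
    api_key=os.environ["LANGDB_API_KEY"],
    base_url=os.environ.get("LANGDB_API_BASE_URL", "https://api.us-east-1.langdb.ai"),
    default_headers={"x-project-id": os.environ["LANGDB_PROJECT_ID"]},
)

completion = client.chat.completions.create(
    model="router/auto",
    messages=[
        {
            "role": "user",
            "content": "A company has revenue of $200M and expenses of $150M. What is its profit?",
        }
    ],
)
print(completion.choices[0].message.content)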

The Bottom Line

  • Save Money → Auto Router avoids overpaying on simple queries

  • Stay Accurate → For complex cases, it automatically picks the strongest model

  • Smarter Than "Always Best" → Matches or beats the best-model-only approach at a fraction of the cost

Takeaway

You don't need to pick the "best" model every time.

With Auto Router:

  • Simple queries → cheaper models save you money

  • Complex queries → stronger models keep accuracy intact

  • Overall → 100% accuracy parity at 42% lower cost

That's the power of LangDB Auto Router.


{
  "model": "router/auto",
  "messages": [
    {
      "role": "user",
      "content": "A company has revenue of $200M and expenses of $150M. What is its profit?"
    }
  ]
}
Example Trace of Cache HIT
Cache write with Anthropic Prompt Caching
Quick Guide on how to use virtual MCP Server

Working with CrewAI

Add end-to-end tracing to CrewAI agent workflows with LangDB: monitor model calls, task execution, and LLM usage with a single init() call.

LangDB makes it effortless to trace CrewAI workflows end-to-end. With a single init() call, all agent interactions, task executions, and LLM calls are captured.

Checkout:

Installation

Install the LangDB client with the CrewAI feature flag:

pip install 'pylangdb[crewai]'

Quick Start

Export Environment Variables

Set your LangDB credentials:

export LANGDB_API_KEY="<your_langdb_api_key>"
export LANGDB_PROJECT_ID="<your_langdb_project_id>"

Initialize Tracing

Import and call init() before configuring your CrewAI code:

from pylangdb.crewai import init
# Initialise LangDB
init()

Configure your CrewAI code

import os
from dotenv import load_dotenv
from crewai import Agent, Task, Crew, LLM

# Configure LLM with LangDB headers
llm = LLM(
    model="openai/gpt-4o",  # Use LiteLLM Like Model Names
    api_key=os.getenv("LANGDB_API_KEY"),
    base_url=os.getenv("LANGDB_API_BASE_URL"),
    extra_headers={"x-project-id": os.getenv("LANGDB_PROJECT_ID")}
)

# Define agents and tasks as usual
researcher = Agent(
    role="researcher",
    goal="Research topic thoroughly",
    backstory="You are an expert researcher",
    llm=llm,
    verbose=True
)
task = Task(description="Research the given topic", agent=researcher)
crew = Crew(agents=[researcher], tasks=[task])

# Kick off the workflow
result = crew.kickoff()
print(result)

All CrewAI calls—agent initialization, task execution, and model responses—are automatically linked.

Complete CrewAI example

Here is a full example based on CrewAI report writing agent.

Example code

Check out the full sample on GitHub: https://github.com/langdb/langdb-samples/tree/main/examples/crewai/crewai-tracing

Setup Environment

pip install crewai 'pylangdb[crewai]' crewai_tools setuptools python-dotenv

Export Environment Variables

You also need to get API Key from Serper.dev

export LANGDB_API_KEY="<your_langdb_api_key>"
export LANGDB_PROJECT_ID="<your_langdb_project_id>"
export LANGDB_API_BASE_URL='https://api.us-east-1.langdb.ai'

main.py

#!/usr/bin/env python3

import os
import sys
from pylangdb.crewai import init
init()
from dotenv import load_dotenv
from crewai import Agent, Task, Crew, Process, LLM
from crewai_tools import SerperDevTool

load_dotenv()

def create_llm(model):
    return LLM(
        model=model,
        api_key=os.environ.get("LANGDB_API_KEY"),
        base_url=os.environ.get("LANGDB_API_BASE_URL"),
        extra_headers={"x-project-id": os.environ.get("LANGDB_PROJECT_ID")}
    )

class ResearchPlanningCrew:
    def researcher(self) -> Agent:
        return Agent(
            role="Research Specialist",
            goal="Research topics thoroughly",
            backstory="Expert researcher with skills in finding information",
            tools=[SerperDevTool()],
            llm=create_llm("openai/gpt-4o"),
            verbose=True
        )
    
    def planner(self) -> Agent:
        return Agent(
            role="Strategic Planner",
            goal="Create actionable plans based on research",
            backstory="Strategic planner who breaks down complex challenges",
            reasoning=True,
            max_reasoning_attempts=3,
            llm=create_llm("openai/anthropic/claude-3.7-sonnet"),
            verbose=True
        )
    
    def research_task(self) -> Task:
        return Task(
            description="Research the topic thoroughly and compile information",
            agent=self.researcher(),
            expected_output="Comprehensive research report"
        )
    
    def planning_task(self) -> Task:
        return Task(
            description="Create a strategic plan based on research",
            agent=self.planner(),
            expected_output="Strategic execution plan with phases and goals",
            context=[self.research_task()]
        )
    
    def crew(self) -> Crew:
        return Crew(
            agents=[self.researcher(), self.planner()],
            tasks=[self.research_task(), self.planning_task()],
            verbose=True,
            process=Process.sequential
        )

def main():
    topic = sys.argv[1] if len(sys.argv) > 1 else "Artificial Intelligence in Healthcare"
    
    crew_instance = ResearchPlanningCrew()
    
    # Update task descriptions with topic
    crew_instance.research_task().description = f"Research {topic} thoroughly and compile information"
    crew_instance.planning_task().description = f"Create a strategic plan for {topic} based on research"
    
    result = crew_instance.crew().kickoff()
    print(result)

if __name__ == "__main__":
    main()

Running your Agent

Navigate to the parent directory of your agent project and use one of the following commands:

python main.py

Traces on LangDB:

When you run queries against your agent, LangDB automatically captures detailed traces of all agent interactions:

Next Steps: Advanced CrewAI Integration

This guide covered the basics of integrating LangDB with CrewAI using a Research and Planning agent example. For more complex scenarios and advanced use cases, check out our comprehensive resources.

Retrieve a list of threads

POST /threads

Authorization: Bearer token required

Header parameters

  • X-Project-Id (string, required): LangDB project ID

Body

  • limit (integer, min: 1, required). Example: 10

  • offset (integer, required). Example: 100

Responses

  • 200: A list of threads with pagination info (application/json)

Example request:
POST /threads HTTP/1.1
Host: api.us-east-1.langdb.ai
Authorization: Bearer YOUR_SECRET_TOKEN
X-Project-Id: text
Content-Type: application/json
Accept: */*
Content-Length: 25

{
  "limit": 10,
  "offset": 100
}
Example response (200):

{
  "data": [
    {
      "id": "123e4567-e89b-12d3-a456-426614174000",
      "created_at": "2025-10-01T19:41:01.782Z",
      "updated_at": "2025-10-01T19:41:01.782Z",
      "model_name": "text",
      "project_id": "text",
      "score": 1,
      "title": "text",
      "user_id": "text"
    }
  ],
  "pagination": {
    "limit": 10,
    "offset": 100,
    "total": 10
  }
}
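
A minimal Python sketch of the same call, assuming the requests library and the endpoint and headers shown above:

import os
import requests

# Sketch: list threads with pagination via the /threads endpoint shown above
response = requests.post(
    "https://api.us-east-1.langdb.ai/threads",
    headers={
        "Authorization": f"Bearer {os.environ['LANGDB_API_KEY']}",
        "X-Project-Id": os.environ["LANGDB_PROJECT_ID"],
        "Content-Type": "application/json",
    },
    json={"limit": 10, "offset": 0},
    timeout=30,
)
response.raise_for_status()
body = response.json()
print(body["pagination"], [thread["id"] for thread in body["data"]])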

Virtual Models

Create, save, and reuse LLM configurations with Virtual Models in LangDB AI Gateway to streamline workflows and ensure consistent behavior.

LangDB’s Virtual Models let you save, share, and reuse model configurations—combining prompts, parameters, tools, and routing logic into a single named unit. This simplifies workflows and ensures consistent behavior across your apps, agents, and API calls.

Once saved, these configurations can be quickly accessed and reused across multiple applications.

Why do you need Virtual Models

Virtual models in LangDB are more than just model aliases. They are fully configurable AI agents that:

  • Let you define system/user messages upfront

  • Support routing logic to dynamically choose between models

  • Include MCP integrations and guardrails

  • Are callable from UI playground, API, and LangChain/OpenAI SDKs

Use virtual models to manage:

  • Prompt versioning and reuse

  • Consistent testing across different models

  • Precision tuning with per-model parameters

  • Seamless integration of tools and control logic

  • Routing using strategies like fallback, percentage-based, latency-based, optimized, and script-based selection

Setting Up Virtual Model

  1. Go to the Models section of your project

  2. Click on Create Virtual Model.

  3. Set prompt messages — define system and user messages to guide model behavior

  4. Set variables (optional) — useful if your prompts require dynamic values

  5. Select router type

    • None: Use a single model only

    • Fallback, Random, Cost, Percentage, Latency, Optimized: Configure smart routing across targets. Check out all available routing strategies.

  6. Add one or more targets

    • Each target defines a model, MCP servers, guardrails, system/user messages, response format, and its parameters (e.g. temperature, max_tokens, top_p, penalties)

  7. Select MCP Servers — connect tools like LangDB Search, Code Execution, or others

  8. Add guardrails (optional) — for validation, transformation, or filtering logic

  9. Set response format — choose between text, json_object, or json_schema

  10. Give your virtual model a name and Save.

Your virtual model now appears in the Models section of your project, ready to be used anywhere a model is accepted.

Updating and Versioning

You can edit virtual models anytime. LangDB supports formal versioning via the @version syntax:

  • langdb/my-model@latest or langdb/my-model → resolves to the latest version

  • langdb/my-model@v1 or langdb/my-model@1 → resolves to version 1

This allows you to safely test new versions, roll back to older ones, or maintain multiple stable variants of a model in parallel.

Using Your Virtual Model

Once saved, your virtual model is fully available across all LangDB interfaces:

  • Chat Playground: Select it from the model dropdown and test interactively.

  • OpenAI-Compatible SDKs: Works seamlessly with OpenAI clients by changing only the model name.

  • LangChain / CrewAI / other frameworks: Call it just like any base model by using model="langdb/my-model@latest" or a specific version like @v1.

This makes virtual models a portable, modular building block across all parts of your AI stack.
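
For example, here is a minimal sketch using the openai Python SDK, assuming a saved virtual model named langdb/my-model (a hypothetical name) and the LangDB base URL and project header used elsewhere in these docs:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["LANGDB_API_KEY"],
    base_url=os.environ.get("LANGDB_API_BASE_URL", "https://api.us-east-1.langdb.ai"),
    default_headers={"x-project-id": os.environ["LANGDB_PROJECT_ID"]},
)

# Call the saved virtual model exactly like any base model; pin a version with @v1 if needed
completion = client.chat.completions.create(
    model="langdb/my-model@latest",  # hypothetical virtual model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)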

https://app.langdb.ai/sharing/threads/3becbfed-a1be-ae84-ea3c-4942867a3e22
Simple CrewAI Example Trace
Routing Strategies
Virtual Model details page

Working with OpenAI Agents SDK

Trace OpenAI Agents SDK workflows end-to-end with LangDB—monitor model calls, tool invocations, and runner sessions via one-line init().

LangDB helps you add full tracing and observability to your OpenAI Agents SDK workflows—without changing your core logic. With a one-line initialization, LangDB captures model calls, tool invocations, and intermediate steps, giving you a complete view of how your agent operates.

Checkout:

Installation

Enable end-to-end tracing for your OpenAI Agents SDK agents by installing the pylangdb client with the openai feature flag:

pip install 'pylangdb[openai]'

Quick Start

Export Environment Variables

Set your LangDB credentials:

export LANGDB_API_KEY="<your_langdb_api_key>"
export LANGDB_PROJECT_ID="<your_langdb_project_id>"

Initialize Tracing

Import and run the initialize before configuring your OpenAI client:

from pylangdb.openai import init
# Initialise LangDB
init()

Configure OpenAI Client and Agent Runner

# Agent SDK imports
from agents import (
    Agent,
    Runner,
    set_default_openai_client,
    RunConfig,
    ModelProvider,
    Model,
    OpenAIChatCompletionsModel
)
from openai import AsyncOpenAI
import os
import uuid

# Configure the OpenAI client with LangDB headers
client = AsyncOpenAI(
    api_key=os.environ["LANGDB_API_KEY"],
    base_url=os.environ["LANGDB_API_BASE_URL"],
    default_headers={"x-project-id": os.environ["LANGDB_PROJECT_ID"]}
)
set_default_openai_client(client)

# Create a custom model provider for advanced routing
class CustomModelProvider(ModelProvider):
    def get_model(self, model_name: str | None) -> Model:
        return OpenAIChatCompletionsModel(model=model_name, openai_client=client)

agent = Agent(
    name="Math Tutor",
    instructions="You are a helpful assistant",
    model="openai/gpt-4.1", # Choose any model from avaialable model on LangDB
)
# Register your custom model provider to route model calls through LangDB
CUSTOM_MODEL_PROVIDER = CustomModelProvider()

# Assign a unique group_id to link all steps in this session trace
group_id = str(uuid.uuid4())
response = await Runner.run(
    agent,
    input="Hello, world!",
    run_config=RunConfig(
        model_provider=CUSTOM_MODEL_PROVIDER,  # Inject custom model provider
        group_id=group_id                      # Link all steps to the same trace
    )
)

Once executed, LangDB links all steps—model calls, intermediate tool usage, and runner orchestration—into a single session trace.

Complete OpenAI Agents SDK Example

Here is a full example based on OpenAI Agents SDK Quickstart which uses LangDB Tracing.

Example code

Check out the full sample on GitHub: https://github.com/langdb/langdb-samples/tree/main/examples/openai/openai-agents-tracing

Setup Environment

pip install openai-agents 'pylangdb[openai]'

Export Environment Variables

export LANGDB_API_KEY="<your_langdb_api_key>"
export LANGDB_PROJECT_ID="<your_langdb_project_id>"

main.py

# Initialize LangDB tracing
from pylangdb.openai import init
init()

# Agent SDK imports
from agents import (
    Agent,
    Runner,
    set_default_openai_client,
    set_default_openai_key,
    set_default_openai_api,
    RunConfig,
    ModelProvider,
    Model,
    OpenAIChatCompletionsModel
)
from openai import AsyncOpenAI
import os
import uuid
import asyncio


# Configure the OpenAI client with LangDB headers
client = AsyncOpenAI(api_key=os.environ["LANGDB_API_KEY"],
        base_url=os.environ["LANGDB_API_BASE_URL"],
        default_headers={"x-project-id": os.environ["LANGDB_PROJECT_ID"]})

# Set the configured client as default with tracing enabled
set_default_openai_client(client, use_for_tracing=True)
set_default_openai_api(api="chat_completions")
# set_default_openai_key(os.environ["LANGDB_API_KEY"])

# Create a custom model provider for advanced routing
class CustomModelProvider(ModelProvider):
    def get_model(self, model_name: str | None) -> Model:
        return OpenAIChatCompletionsModel(model=model_name, openai_client=client)

# Register your custom model provider to route model calls through LangDB
CUSTOM_MODEL_PROVIDER = CustomModelProvider()

math_tutor_agent = Agent(
    name="Math Tutor",
    handoff_description="Specialist agent for math questions",
    instructions="You provide help with math problems. Explain your reasoning at each step and include examples",
    model="anthropic/claude-3.7-sonnet" 
)

history_tutor_agent = Agent(
    name="History Tutor",
    handoff_description="Specialist agent for historical questions",
    instructions="You provide assistance with historical queries. Explain important events and context clearly.",
    model="gemini/gemini-2.0-flash" # Choose any model available on LangDB
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="You determine which agent to use based on the user's homework question",
    handoffs=[history_tutor_agent, math_tutor_agent],
    model="openai/gpt-4o-mini" # Choose any model available on LangDB
)
# Assign a unique group_id to link all steps in this session trace
group_id = str(uuid.uuid4())

# Define async function to run the agent
async def run_agent():
    response = await Runner.run(
        triage_agent,
        input="who was the first president of the united states?",
        run_config=RunConfig(
            model_provider=CUSTOM_MODEL_PROVIDER,  # Inject custom model provider
            group_id=group_id                      # Link all steps to the same trace
        )
    )
    print(response.final_output)

# Run the async function with asyncio
asyncio.run(run_agent())

Running Your Agent

Navigate to the parent directory of your agent project and use one of the following commands:

python main.py

Output:

The first president of the United States was **George Washington**.

Here's some important context:

*   **The American Revolution (1775-1783):** Washington was the commander-in-chief of the Continental Army during the Revolutionary War. His leadership was crucial in securing American independence from Great Britain.
*   **The Articles of Confederation (1781-1789):** After the war, the United States was governed by the Articles of Confederation. This system proved to be weak and ineffective, leading to calls for a stronger national government.
*   **The Constitutional Convention (1787):** Delegates from the states met in Philadelphia to revise the Articles of Confederation. Instead, they drafted a new Constitution that created a more powerful federal government. Washington presided over the convention, lending his prestige and influence to the process.
*   **The Constitution and the Presidency:** The Constitution established the office of the President of the United States.
*   **Election of 1789:** George Washington was unanimously elected as the first president by the Electoral College in 1789. There were no opposing candidates. This reflected the immense respect and trust the nation had in him.
*   **First Term (1789-1793):** Washington established many precedents for the presidency, including the formation of a cabinet, the practice of delivering an annual address to Congress, and the idea of serving only two terms. He focused on establishing a stable national government, paying off the national debt, and maintaining neutrality in foreign affairs.
*   **Second Term (1793-1797):** Washington faced challenges such as the Whiskey Rebellion and growing partisan divisions. He decided to retire after two terms, setting another crucial precedent for peaceful transitions of power.
*   **Significance:** Washington's leadership and integrity were essential in establishing the legitimacy and credibility of the new government. He is often considered the "Father of His Country" for his pivotal role in the founding of the United States.

Traces on LangDB

When you run queries against your agent, LangDB automatically captures detailed traces of all agent interactions:

Next Steps: Advanced OpenAI Agents SDK Integration

This guide covered the basics of integrating LangDB with OpenAI Agents SDK using a history and maths agent example. For more complex scenarios and advanced use cases, check out our comprehensive resources.

Routing

Intelligently route across multiple LLMs to ensure fast, reliable, and scalable AI operations.

LangDB AI Gateway optimizes LLM selection based on cost, speed, and availability, ensuring efficient request handling. This guide covers the various dynamic routing strategies available in the system, including fallback, script-based, optimized, percentage-based, and latency-based routing.

This ensures efficient request handling and optimal model selection tailored to specific application needs.

Understanding Targets

Before diving into routing strategies, it's essential to understand targets in LangDB AI Gateway. A target refers to a specific model or endpoint to which requests can be directed. Each target represents a potential processing unit within the routing logic, enabling optimal performance and reliability.

{
  "model": "router/dynamic",
  "router": {
    "type": "percentage",
    "targets_percentages": [
      40,
      60
    ],
    "targets": [
      {
        "model": "openai/gpt-4.1",
        "mcp_servers": [
          {
            "slug": "mymcp_zoyhbp3u",
            "name": "mymcp",
            "type": "sse",
            "server_url": "https://api.staging.langdb.ai/mymcp_zoyhbp3u"
          }
        ],
        "extra": {
          "guards": [
            "openai_moderation_y6ln88g4"
          ]
        }
      },
      {
        "model": "anthropic/claude-3.7-sonnet",
        "mcp_servers": [
          {
            "slug": "mymcp_zoyhbp3u",
            "name": "mymcp",
            "type": "sse",
            "server_url": "https://api.staging.langdb.ai/mymcp_zoyhbp3u"
          }
        ],
        "extra": {
          "guards": [
            "openai_moderation_y6ln88g4"
          ]
        },
        "temperature": 0.6,
        "messages": [
          {
            "content": "You are a helpful assistant",
            "id": "02cb4630-b01a-42d9-a226-94968865fbe0",
            "role": "system"
          }
        ]
      }
    ]
  }
}

Target Parameters

Each target in LangDB is essentially a self-contained configuration, similar to a virtual model. A target can include:

  • Model – The identifier for the base model to use (e.g. openai/gpt-4o)

  • Prompt – Optional system and user messages to steer the model

  • MCP Servers – Support for Virtual MCP Servers

  • Guardrails – Validations, Moderations.

  • Response Format – text, json_object, or json_schema

  • Custom Parameters – Tuning controls like:

    • temperature

    • max_tokens

    • top_p

    • frequency_penalty

    • presence_penalty

Routing Strategies

LangDB AI Gateway supports multiple routing strategies that can be combined and customized to meet your specific needs:

| Routing Strategy | Description |
| --- | --- |
| Fallback Routing | Sequentially routes requests through multiple models in case of failure or unavailability. |
| Optimized Routing | Selects the best model based on real-time performance metrics. |
| Percentage-Based Routing | Distributes traffic between multiple models using predefined weightings. |
| Latency-Based Routing | Chooses the model with the lowest response time for real-time applications. |
| Nested Routing | Combines multiple routing strategies for flexible traffic management. |

Fallback Routing

Fallback routing allows sequential attempts to different model targets in case of failure or unavailability. It ensures robustness by cascading through a list of models based on predefined logic.

{
    "model": "router/dynamic",
    "messages": [
        { "role": "system", "content": "You are a helpful assistant." },
        { "role": "user", "content": "What is the formula of a square plot?" }
    ],
    "router": {
        "router": "router",
        "type": "fallback", // Type: fallback/script/optimized/percentage/latency
        "targets": [
            { "model": "openai/gpt-4o-mini", "temperature": 0.9, "max_tokens": 500, "top_p": 0.9 },
            { "model": "deepseek/deepseek-chat", "frequency_penalty": 1, "presence_penalty": 0.6 }
        ]
    },
    "stream": false
}
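
Because the router block is just part of the request body, the same configuration can be sent from the openai Python SDK. Below is a minimal sketch, assuming the LangDB base URL and project header used elsewhere in these docs and the SDK's extra_body parameter for fields outside the standard OpenAI schema:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["LANGDB_API_KEY"],
    base_url=os.environ.get("LANGDB_API_BASE_URL", "https://api.us-east-1.langdb.ai"),
    default_headers={"x-project-id": os.environ["LANGDB_PROJECT_ID"]},
)

# Sketch: pass the fallback router configuration through extra_body alongside a normal chat request
completion = client.chat.completions.create(
    model="router/dynamic",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the formula of a square plot?"},
    ],
    extra_body={
        "router": {
            "type": "fallback",
            "targets": [
                {"model": "openai/gpt-4o-mini", "temperature": 0.9, "max_tokens": 500, "top_p": 0.9},
                {"model": "deepseek/deepseek-chat", "frequency_penalty": 1, "presence_penalty": 0.6},
            ],
        }
    },
)
print(completion.choices[0].message.content)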

Optimized Routing

Optimized routing automatically selects the best model based on real-time performance metrics such as latency, response time, and cost-efficiency.


{
    "model": "router/dynamic",
    "router": {
        "name": "fastest",
        "type": "optimized",
        "metric": "ttft",
        "targets": [
            { "model": "gpt-3.5-turbo", "temperature": 0.8, "max_tokens": 400, "frequency_penalty": 0.5 },
            { "model": "gpt-4o-mini", "temperature": 0.9, "max_tokens": 500, "top_p": 0.9 }
        ]
    }
}

Here, the request is routed to the model with the lowest Time-to-First-Token (TTFT) among gpt-3.5-turbo and gpt-4o-mini.

Metrics:

  • Requests – Total number of requests sent to the model.

  • InputTokens – Number of tokens provided as input to the model.

  • OutputTokens – Number of tokens generated by the model in response.

  • TotalTokens – Combined count of input and output tokens.

  • RequestsDuration – Total duration taken to process requests.

  • Ttft (Time-to-First-Token) (Default) – Time taken by the model to generate its first token after receiving a request.

  • LlmUsage – The total computational cost of using the model, often used for cost-based routing.

Percentage-Based Routing

Percentage-based routing distributes requests between models according to predefined weightings, allowing load balancing, A/B testing, or controlled experimentation with different configurations. Each model can have distinct parameters while sharing the request load.


{ 
  "model": "router/dynamic",
  "router": {
    "name": "dynamic",
    "type": "percentage",
    "targets": [
      { "model": "openai/gpt-4o-mini", "temperature": 0.9, "max_tokens": 500, "top_p": 0.9 },
      { "model": "openai/gpt-4o-mini", "temperature": 0.8, "max_tokens": 400, "frequency_penalty": 1 }
    ],
    "targets_percentages": [ 70, 30 ]
  }
}

Latency-Based Routing

Latency-based routing selects the model with the lowest response time, ensuring minimal delay for real-time applications like chatbots and interactive AI systems.


{
  "model": "router/dynamic",
  "router": {
    "name": "fastest_latency",
    "type": "latency",
    "targets": [
      { "model": "openai/gpt-4o-mini", "temperature": 0.9, "max_tokens": 500, "top_p": 0.9 },
      { "model": "deepseek/deepseek-chat", "frequency_penalty": 1, "presence_penalty": 0.6 },
      { "model": "gemini/gemini-2.0-flash-exp", "temperature": 0.8, "max_tokens": 400, "frequency_penalty": 0.5 }
    ]
  }
}

Nested Routing

LangDB AI allows nesting of routing strategies, enabling combinations like fallback within script-based selection. This flexibility helps refine model selection based on dynamic business needs.

{
    "model": "router/dynamic",
    "messages": [
        { "role": "system", "content": "You are a helpful assistant." },
        { "role": "user", "content": "What is the formula of a square plot?" }
    ],
    "router": {
        "type": "fallback",
        "targets": [
            {
                "model": "router/dynamic",
                "router": {
                    "name": "cheapest_script_execution",
                    "type": "script",
                    "script": "const route = ({ models }) => models \
                        .filter(m => m.inference_provider.provider === 'bedrock' && m.type === 'completions') \
                        .sort((a, b) => a.price.per_input_token - b.price.per_input_token)[0]?.model;"
                }
            },
            {
                "model": "router/dynamic",
                "router": {
                    "name": "fastest",
                    "type": "optimized",
                    "metric": "ttft",
                    "targets": [
                        { "model": "gpt-3.5-turbo", "temperature": 0.8, "max_tokens": 400, "frequency_penalty": 0.5 },
                        { "model": "gpt-4o-mini", "temperature": 0.9, "max_tokens": 500, "top_p": 0.9 }
                    ]
                }
            },
            { "model": "deepseek/deepseek-chat", "temperature": 0.7, "max_tokens": 300, "frequency_penalty": 1 }
        ]
    },
    "stream": false
}

Analytics

Get full visibility into API consumption with cost, speed, and reliability insights to optimize your LLM workflows efficiently.

You can monitor API usage with key insights.

After integrating LangDB into your project, the Analytics Dashboard becomes your central hub for understanding usage.

Metrics

LangDB’s Analytics Dashboard is segmented into several key panels:

Cost:

  • Tracks your total cost consumption across all integrated models.

  • Enables you to compare costs by provider/model/tags, helping you identify the most cost-effective options for your use cases.

Time:

  • Displays the average duration of requests in milliseconds.

  • Useful for benchmarking response times and optimizing performance for latency-sensitive applications.

Number of Requests:

  • Shows the total number of API calls made.

  • Helps you analyze usage patterns and allocate resources effectively.

Average Time to First Token (TTFT)

  • Indicates the average time taken to receive the first token from the API response.

  • This metric is critical for understanding initial latency.

Tokens Per Second (TPS)

  • Measures the throughput of token generation.

  • High TPS is indicative of efficient processing.

Time Per Output Token (TPOT)

  • Tracks the average time spent per output token.

  • Helps in identifying and troubleshooting bottlenecks in model output.

Error Rate

  • Displays the percentage of failed requests over total requests.

  • Helps monitor system stability and reliability.

Error Request Count

  • Tracks the total number of failed API requests.

  • Useful for debugging and troubleshooting failures effectively.

Analytics APIs

/analytics

Provides a detailed timeseries view of API usage metrics. Users can filter data by time range and group it by provider, model, or tags to analyze trends over different periods.

# groupBy: provider/tag/model
curl --location 'https://api.us-east-1.langdb.ai/analytics' \
--header 'x-project-id: langDBProjectID' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer langDBAPIKey' \
--data '{"start_time_us": 1693062345678, "end_time_us": 1695092345678, "groupBy": ["provider"]}'

Example response:

{
    "timeseries": [
    {
            "hour": "2025-01-23 04:00:00",
            "total_cost": 0.0006719999999999999,
            "total_requests": 2,
            "avg_duration": 814.4,
            "duration": 814.4,
            "duration_p99": 1125.4,
            "duration_p95": 1100.0,
            "duration_p90": 1068.3,
            "duration_p50": 814.4,
            "total_duration": 1628.778,
            "total_input_tokens": 72,
            "total_output_tokens": 38,
            "error_rate": 0.0,
            "error_request_count": 0,
            "avg_ttft": 814.4,
            "ttft": 814.4,
            "ttft_p99": 1125.4,
            "ttft_p95": 1100.0,
            "ttft_p90": 1068.3,
            "ttft_p50": 814.4,
            "tps": 67.54,
            "tps_p99": 110.03,
            "tps_p95": 107.55,
            "tps_p90": 104.45,
            "tps_p50": 79.63,
            "tpot": 0.04,
            "tpot_p99": 0.06,
            "tpot_p95": 0.06,
            "tpot_p90": 0.06,
            "tpot_p50": 0.04,
            "tag_tuple": [
                "openai"
            ]
        }
    ]
}

/analytics/summary

Provides aggregated usage metrics, allowing users to get a high-level overview of API consumption and error rates.

# groupBy: provider/tag/model
curl --location 'https://api.us-east-1.langdb.ai/analytics/summary' \
--header 'x-project-id: langDBProjectID' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer langDBAPIKey' \
--data '{"start_time_us": 1693062345678, "end_time_us": 1695092345678, "groupBy": ["provider"]}'

Example response:

{
    "summary": {
            "tag_tuple": [
                "togetherai"
            ],
            "total_cost": 0.0015163199999999998,
            "total_requests": 8,
            "total_duration": 5242.402,
            "avg_duration": 655.3,
            "duration": 655.3,
            "duration_p99": 969.2,
            "duration_p95": 962.5,
            "duration_p90": 954.1,
            "duration_p50": 624.3,
            "total_input_tokens": 853,
            "total_output_tokens": 200,
            "avg_ttft": 655.3,
            "ttft": 655.3,
            "ttft_p99": 969.2,
            "ttft_p95": 962.5,
            "ttft_p90": 954.1,
            "ttft_p50": 624.3,
            "tps": 200.86,
            "tps_p99": 336.04,
            "tps_p95": 304.95,
            "tps_p90": 266.08,
            "tps_p50": 186.24,
            "tpot": 0.03,
            "tpot_p99": 0.04,
            "tpot_p95": 0.04,
            "tpot_p90": 0.04,
            "tpot_p50": 0.03,
            "error_rate": 0.0,
            "error_request_count": 0
        }
}
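
The same summary can be fetched from Python. Below is a minimal sketch using the requests library, assuming the /analytics/summary endpoint, headers, and groupBy field shown above:

import os
import time
import requests

# Sketch: aggregated analytics for the last 24 hours, grouped by provider
end_time_us = int(time.time() * 1_000_000)
start_time_us = end_time_us - 24 * 60 * 60 * 1_000_000

response = requests.post(
    "https://api.us-east-1.langdb.ai/analytics/summary",
    headers={
        "Authorization": f"Bearer {os.environ['LANGDB_API_KEY']}",
        "x-project-id": os.environ["LANGDB_PROJECT_ID"],
        "Content-Type": "application/json",
    },
    json={"start_time_us": start_time_us, "end_time_us": end_time_us, "groupBy": ["provider"]},
    timeout=30,
)
response.raise_for_status()
print(response.json())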

Filtering By Users

As discussed in User Tracking, we can use filters to retrieve insights based on id, name, or tags.

Available Filters:

  • user_id: Filter data for a specific user by their unique ID.

  • user_name: Retrieve usage based on the user’s name.

  • user_tags: Filter by tags associated with a user (e.g., "websearch", "support").

curl -L \
  --request POST \
  --url 'https://api.us-east-1.langdb.ai/analytics/summary' \
  --header 'Authorization: Bearer langDBAPIKey' \
  --header 'X-Project-Id: langDBProjectID' \
  --header 'Content-Type: application/json' \
  --data '{
    "user_id": "123",
    "user_name": "mrunmay",
    "user_tags": ["websearch", "testings"]
  }'

Example response:

{
  "summary": [
    {
      "total_cost": 0.00112698,
      "total_requests": 4,
      "total_duration": 31645.018,
      "avg_duration": 7911.3,
      "duration": 7911.3,
      "duration_p99": 9819.3,
      "duration_p95": 9809.0,
      "duration_p90": 9796.1,
      "duration_p50": 8193.2,
      "total_input_tokens": 4429,
      "total_output_tokens": 458,
      "avg_ttft": 7911.3,
      "ttft": 7911.3,
      "ttft_p99": 9819.3,
      "ttft_p95": 9809.0,
      "ttft_p90": 9796.1,
      "ttft_p50": 8193.2,
      "tps": 154.43,
      "tps_p99": 207.79,
      "tps_p95": 206.1,
      "tps_p90": 203.99,
      "tps_p50": 160.85,
      "tpot": 0.07,
      "tpot_p99": 0.1,
      "tpot_p95": 0.09,
      "tpot_p90": 0.09,
      "tpot_p50": 0.07,
      "error_rate": 0.0,
      "error_request_count": 0
    }
  ],
  "start_time_us": 1737576094363076,
  "end_time_us": 1740168094363076
}

Auto Router

Stop guessing which model to pick. The Auto Router picks the best one for you—whether you care about cost, speed, or accuracy.


Why Use Auto Router?

  • Save Costs - Automatically uses cheaper models for simple queries

  • Get Faster Responses - Routes to the fastest model when speed matters

  • Guarantee Accuracy - Picks the best model for critical tasks

  • Handle Scale - No configuration hell, just works

Quick Start

Using API

Using UI

You can also try Auto Router through the LangDB dashboard:

Note: The UI shows only a few router variations. For all available options and advanced configurations, use the API.

Trace Example

Here's what happens behind the scenes when you use Auto Router:

That's it — no config needed. The router classifies the query and picks the best model automatically.

If you already know the query type (e.g., Finance), skip auto-classification with router/finance:accuracy.

Under the Hood

Behind the scenes, the Auto Router uses lightweight classifiers (NVIDIA for complexity, BART for topic) combined with LangDB's routing engine. These decisions are logged in traces so you can inspect why a query was sent to a specific model.

How It Works

The Auto Router uses a two-stage classification process:

  1. Complexity Classification: Uses NVIDIA's classification model to determine if a query is high or low complexity

  2. Topic Classification: Uses Facebook's BART Large model to identify the query's topic from these categories:

    • Academia

    • Finance

    • Marketing

    • Maths

    • Programming

    • Science

    • Vision

    • Writing

Based on these classifications and your chosen optimization strategy, the router automatically selects the best model from your available options.

Router Behavior

| Router Syntax | What happens |
| --- | --- |
| router/auto | Classifies complexity + topic. Low-complexity queries go to cheaper models; high-complexity queries go to stronger models. Then applies your optimization strategy. |
| router/auto:<mode> | Classifies topic only. Ignores complexity and always applies the chosen optimization (cost, accuracy, etc.) for that topic. |
| router/<topic>:<mode> | Skips classification. Directly routes to the specified topic with the chosen optimization mode. |

Optimization Modes

| Mode | What it does | Best for |
| --- | --- | --- |
| balanced | Intelligently distributes requests across models for optimal performance | General apps (default) |
| accuracy | Picks models with best benchmark scores | Research, compliance |
| cost | Routes to cheapest viable model | Support chatbots, FAQs |
| latency | Always picks the fastest | Real-time UIs, voice bots |
| throughput | Distributes across many models | High-volume pipelines |

Case Study

Use Cases

Cost Optimization

Perfect for FAQ bots, education apps, and high-volume content generation.

Accuracy Optimization

Ideal for finance, medical, legal, and research applications.

Latency Optimization

Great for real-time assistants, voice bots, and interactive UIs.

Balanced (Load Balanced)

Intelligently distributes requests across available models for optimal performance. Works well for most business applications and integrations.

Direct Category Routing

If you already know your query belongs to a specific domain, you can skip classification and directly route to a topic with your chosen optimization mode.

Result:

  • Skips complexity + topic classification

  • Directly applies accuracy optimization for the finance topic

  • Routes to the highest-scoring finance-optimized model

Available topic shortcuts:

  • router/finance:<mode>

  • router/writing:<mode>

  • router/academia:<mode>

  • router/programming:<mode>

  • router/science:<mode>

  • router/vision:<mode>

  • router/marketing:<mode>

  • router/maths:<mode>

Where <mode> can be: balanced, accuracy, cost, latency, or throughput.

Quick Decision Guide:

  • Don't know the type? → Use router/auto

  • Know the type? → Jump straight with router/<topic>:<mode>

Advanced Configuration

Topic-Specific Routing

Best Practices

  1. Choose the Right Mode - Match optimization to your use case

  2. Monitor Performance - Use LangDB's analytics to track routing decisions

  3. Combine with Fallbacks - Add fallback models for high availability

  4. Test Different Modes - Experiment to find the best fit

Integration with Other Features

The Auto Router works seamlessly with:

  • Guardrails - Apply content filtering before routing

  • MCP Servers - Access external tools and data sources

  • Response Caching - Cache responses for frequently asked questions

  • Analytics - Track routing decisions and performance metrics

{
  "model": "router/auto",
  "messages": [
    {
      "role": "user",
      "content": "What's the capital of France?"
    }
  ]
}


{
  "model": "router/auto:cost",
  "messages": [
    {
      "role": "user",
      "content": "What are your business hours?"
    }
  ]
}
{
  "model": "router/auto:accuracy",
  "messages": [
    {
      "role": "user",
      "content": "Analyze this financial risk assessment"
    }
  ]
}
{
  "model": "router/auto:latency",
  "messages": [
    {
      "role": "user",
      "content": "What's the weather like today?"
    }
  ]
}
{
  "model": "router/auto",
  "messages": [
    {
      "role": "user",
      "content": "Help me write a product description"
    }
  ]
}
{
  "model": "router/finance:accuracy",
  "messages": [
    {
      "role": "user",
      "content": "Analyze the risk factors in this financial derivative"
    }
  ]
}
{
  "model": "router/auto",
  "router": {
    "topic_routing": {
      "finance": "cost",
      "writing": "latency",
      "technical": "accuracy"
    }
  },
  "messages": [
    {
      "role": "user",
      "content": "Calculate the net present value of this investment"
    }
  ]
}
LangDB dashboard showing Auto Router models with filters for providers, context length, and pricing. The models table displays router options like auto, auto:balanced, auto:cost, and topic-specific routers for academia, finance, and marketing.
LangDB dashboard showing available Auto Router models and configuration options
LangDB Traces dashboard showing a successful Auto Router call. The left panel lists the trace with details like cost <$0.001, 1937 tokens, and a duration of 21.09s. A timeline below shows the 'auto' routing step taking 0.36s, followed by 'programming:cost' (20.71s) and 'deepseek-r1-0528-qwen3...' (20.52s). The right panel provides detailed information for the 'auto' trace, including a 200 status, Trace ID, Run ID, Thread ID, start time, and its 0.36s duration.

Beating the Best Model

Save costs without losing quality. Auto Router delivers best-model accuracy at a fraction of the price.

Beating GPT-5

Auto Router delivers 83% satisfactory results at 35% lower cost than GPT-5. Real-world testing shows router optimization without quality compromise.

Router Config Page
Analytics: LangDB displays dashboard analytics for metrics like TTFT, number of requests, etc.
https://app.langdb.ai/sharing/threads/53b87631-de7f-431a-a049-48556f899b4d
Trace of simple OpenAI Agents SDK Sample on LangDB

Working with LangGraph

Automatically instrument LangChain and LangGraph chains and agents with LangDB: gain live traces, cost analytics, and latency insights through init().

LangDB provides seamless tracing and observability for LangChain-based applications.

Checkout:

Installation

Install the LangDB client with LangChain support:

pip install 'pylangdb[langchain]'

Quick Start

Export Environment Variables

export LANGDB_API_KEY="<your_langdb_api_key>"
export LANGDB_PROJECT_ID="<your_langdb_project_id>"
export LANGDB_API_BASE_URL='https://api.us-east-1.langdb.ai'

Initialize LangDB

Import and run the initialize before configuring your LangChain/LangGraph:

from pylangdb.langchain import init
# Initialise LangDB
init()

Define your Agent

# Your existing LangChain code works with proper configuration
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage
import os 

api_base = "https://api.us-east-1.langdb.ai"
api_key = os.getenv("LANGDB_API_KEY")
project_id = os.getenv("LANGDB_PROJECT_ID")

# Default headers for API requests
default_headers: dict[str, str] = {
    "x-project-id": project_id
}

# Initialize OpenAI LLM with LangDB configuration
llm = ChatOpenAI(
    model_name="gpt-4o",
    temperature=0.3,
    openai_api_base=api_base,
    openai_api_key=api_key,
    default_headers=default_headers,
)
result = llm.invoke([HumanMessage(content="Hello, LangDB!")])

Once LangDB is initialized, all calls to llm, intermediate steps, tool executions, and nested chains are automatically traced and linked under a single session.

Complete LangGraph Agent Example

Here is a full LangGraph example based on ReAct Agent which uses LangDB Tracing.

Example code

Check out the full sample on GitHub: https://github.com/langdb/langdb-samples/tree/main/examples/langchain/langgraph-tracing

Setup Environment

Install the libraries using pip

pip install langgraph 'pylangdb[langchain]' langchain_openai geopy

Export Environment Variables

export LANGDB_API_KEY="<your_langdb_api_key>"
export LANGDB_PROJECT_ID="<your_langdb_project_id>"
export LANGDB_API_BASE_URL='https://api.us-east-1.langdb.ai'

main.py

# Initialize LangDB tracing
from pylangdb.langchain import init
init()

import os
from typing import Annotated, Sequence, TypedDict
from datetime import datetime

# Import required libraries
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage, ToolMessage
from langchain_core.tools import tool
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from geopy.geocoders import Nominatim
from pydantic import BaseModel, Field
import requests

# Initialize the model
def create_model():
    """Create and return the ChatOpenAI model."""
    api_base = os.getenv("LANGDB_API_BASE_URL")
    api_key = os.getenv("LANGDB_API_KEY")
    project_id = os.getenv("LANGDB_PROJECT_ID")
    default_headers = {
        "x-project-id": project_id,
    }
    llm = ChatOpenAI(
        model_name='openai/gpt-4o', # Choose any model from LangDB
        temperature=0.3,
        openai_api_base=api_base,
        openai_api_key=api_key,
        default_headers=default_headers
    )
    return llm
    
# Define the agent state
class AgentState(TypedDict):
    """The state of the agent."""
    messages: Annotated[Sequence[BaseMessage], add_messages]
    number_of_steps: int

# Define the weather tool
class SearchInput(BaseModel):
    location: str = Field(description="The city and state, e.g., San Francisco")
    date: str = Field(description="The forecasting date in format YYYY-MM-DD")

@tool("get_weather_forecast", args_schema=SearchInput, return_direct=True)
def get_weather_forecast(location: str, date: str) -> dict:
    """
    Retrieves the weather using Open-Meteo API for a given location (city) and a date (yyyy-mm-dd).
    Returns a dictionary with the time and temperature for each hour.
    """
    geolocator = Nominatim(user_agent="weather-app")
    location = geolocator.geocode(location)
    if not location:
        return {"error": "Location not found"}
    try:
        response = requests.get(
            f"https://api.open-meteo.com/v1/forecast?"
            f"latitude={location.latitude}&"
            f"longitude={location.longitude}&"
            "hourly=temperature_2m&"
            f"start_date={date}&end_date={date}",
            timeout=10
        )
        response.raise_for_status()
        data = response.json()
        return {
            time: f"{temp}°C" 
            for time, temp in zip(
                data["hourly"]["time"], 
                data["hourly"]["temperature_2m"]
            )
        }
    except Exception as e:
        return {"error": f"Failed to fetch weather data: {str(e)}"}


# Define the nodes
def call_model(state: AgentState) -> dict:
    """Call the model with the current state and return the response."""
    # bind_tools returns a new model configured with the tool definitions
    model = create_model().bind_tools([get_weather_forecast])
    messages = state["messages"]
    response = model.invoke(messages)
    return {"messages": [response], "number_of_steps": state["number_of_steps"] + 1}

def route_to_tool(state: AgentState) -> str:
    """Determine the next step based on the model's response."""
    messages = state["messages"]
    last_message = messages[-1]
    if hasattr(last_message, 'tool_calls') and last_message.tool_calls:
        return "call_tool"
    return END

# Create the graph
def create_agent():
    """Create and return the LangGraph agent."""
    # Create the graph
    workflow = StateGraph(AgentState)
    workflow.add_node("call_model", call_model)
    workflow.add_node("call_tool", ToolNode([get_weather_forecast]))
    workflow.set_entry_point("call_model")    
    workflow.add_conditional_edges(
        "call_model",
        route_to_tool,
        {
            "call_tool": "call_tool",
            END: END
        }
    )
    workflow.add_edge("call_tool", "call_model")
    return workflow.compile()

def main():
    agent = create_agent()
    query = f"What's the weather in Paris today? Today is {datetime.now().strftime('%Y-%m-%d')}."
    initial_state = {
        "messages": [HumanMessage(content=query)],
        "number_of_steps": 0
    }
    print(f"Query: {query}")
    print("\nRunning agent...\n")
    for output in agent.stream(initial_state):
        for key, value in output.items():
            if key == "__end__":
                continue
            print(f"\n--- {key.upper()} ---")
            if key == "messages":
                for msg in value:
                    if hasattr(msg, 'content'):
                        print(f"{msg.type}: {msg.content}")
                    if hasattr(msg, 'tool_calls') and msg.tool_calls:
                        print(f"Tool Calls: {msg.tool_calls}")
            else:
                print(value)

if __name__ == "__main__":
    main()

Running your Agent

Navigate to your agent project directory and run:

python main.py

Output

--- CALL_MODEL ---
{'messages': [AIMessage(content="The weather in Paris on July 1, 2025, is as follows:\n\n- 00:00: 28.1°C\n- 01:00: 27.0°C\n- 02:00: 26.3°C\n- 03:00: 25.7°C\n- 04:00: 25.1°C\n- 05:00: 24.9°C\n- 06:00: 25.8°C\n- 07:00: 27.6°C\n- 08:00: 29.6°C\n- 09:00: 31.7°C\n- 10:00: 33.7°C\n- 11:00: 35.1°C\n- 12:00: 36.3°C\n- 13:00: 37.3°C\n- 14:00: 38.6°C\n- 15:00: 37.9°C\n- 16:00: 38.1°C\n- 17:00: 37.8°C\n- 18:00: 37.3°C\n- 19:00: 35.3°C\n- 20:00: 33.2°C\n- 21:00: 30.8°C\n- 22:00: 28.7°C\n- 23:00: 27.3°C\n\nIt looks like it's going to be a hot day in Paris!", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 319, 'prompt_tokens': 585, 'total_tokens': 904, 'completion_tokens_details': None, 'prompt_tokens_details': None, 'cost': 0.005582999999999999}, 'model_name': 'gpt-4o', 'system_fingerprint': None, 'id': '3bbde343-79e3-4d8f-bd97-b07179ee92c0', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None}, id='run--4fd3896d-1fbd-4c91-9c21-bd6cf3d2949e-0', usage_metadata={'input_tokens': 585, 'output_tokens': 319, 'total_tokens': 904, 'input_token_details': {}, 'output_token_details': {}})], 'number_of_steps': 2}

Traces on LangDB

When you run queries against your agent, LangDB automatically captures detailed traces of all agent interactions:

Next Steps: Advanced LangGraph Integration

This guide covered the basics of integrating LangDB with LangGraph using a ReAct agent example. For more complex scenarios and advanced use cases, check out our comprehensive resources.

Retrieve messages for a specific thread

get
Authorizations
Path parameters
thread_idstring · uuidRequired

The ID of the thread to retrieve messages from

Header parameters
X-Project-IdstringRequired

LangDB project ID

Responses
200

A list of messages for the given thread

application/json
get
GET /threads/{thread_id}/messages HTTP/1.1
Host: api.us-east-1.langdb.ai
Authorization: Bearer YOUR_SECRET_TOKEN
X-Project-Id: text
Accept: */*
200

A list of messages for the given thread

[
  {
    "model_name": "gpt-4o-mini",
    "thread_id": "123e4567-e89b-12d3-a456-426614174000",
    "user_id": "langdb",
    "content_type": "Text",
    "content": "text",
    "content_array": [
      "text"
    ],
    "type": "system",
    "tool_call_id": "123e4567-e89b-12d3-a456-426614174000",
    "tool_calls": "text",
    "created_at": "2025-01-29 10:25:00.736000",
    "id": "123e4567-e89b-12d3-a456-426614174000"
  }
]

Retrieve the total cost for a specific thread

get
Authorizations
Path parameters
thread_idstring · uuidRequired

The ID of the thread for which to retrieve cost information

Header parameters
X-Project-IdstringRequired

LangDB project ID

Responses
200

The total cost and token usage for the specified thread

application/json
get
GET /threads/{thread_id}/cost HTTP/1.1
Host: api.us-east-1.langdb.ai
Authorization: Bearer YOUR_SECRET_TOKEN
X-Project-Id: text
Accept: */*
200

The total cost and token usage for the specified thread

{
  "total_cost": 0.022226999999999997,
  "total_output_tokens": 171,
  "total_input_tokens": 6725
}
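
Both thread endpoints can also be called from a short script. The sketch below uses Python's requests library with the same base URL and headers shown above; the thread ID is a placeholder you would replace with one of your own threads.

import os
import requests

BASE_URL = "https://api.us-east-1.langdb.ai"
HEADERS = {
    "Authorization": f"Bearer {os.environ['LANGDB_API_KEY']}",
    "X-Project-Id": os.environ["LANGDB_PROJECT_ID"],
}

thread_id = "123e4567-e89b-12d3-a456-426614174000"  # placeholder thread ID

# Retrieve all messages for the thread
messages = requests.get(f"{BASE_URL}/threads/{thread_id}/messages", headers=HEADERS).json()

# Retrieve the total cost and token usage for the same thread
cost = requests.get(f"{BASE_URL}/threads/{thread_id}/cost", headers=HEADERS).json()

print(len(messages), "messages")
print("total cost:", cost["total_cost"])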

Create a new model

post

Register and configure a new LLM under your LangDB project

Authorizations
Header parameters
X-Admin-KeystringRequired

LangDB Admin Key

Body
model_namestringRequiredExample: my-model
descriptionstringRequiredExample: A custom completions model for text and image inputs
provider_info_idstring · uuidRequiredExample: e2e9129b-6661-4eeb-80a2-0c86964974c9
project_idstringRequiredExample: 55f4a12b-74c8-4294-8e4b-537f13fc3861
publicbooleanOptionalExample: false
request_response_mappingstringOptionalExample: openai-compatible
model_typestringRequiredExample: completions
input_token_pricenumber · float | nullableOptionalExample: 0.00001
output_token_pricenumber · float | nullableOptionalExample: 0.00003
context_sizeinteger | nullableOptionalExample: 128000
capabilitiesstring[]OptionalExample: ["tools"]
input_typesstring[]OptionalExample: ["text","image"]
output_typesstring[]OptionalExample: ["text","image"]
tagsstring[]Optional
mp_pricenumber · float | nullableOptional
owner_namestringRequiredExample: openai
priorityintegerRequiredExample: 0
model_name_in_providerstringOptionalExample: my-model-v1.2
parametersobjectOptional

Additional configuration parameters

Example: {"top_k":{"default":0,"description":"Limits the token sampling to only the top K tokens.","min":0,"required":false,"step":1,"type":"int"},"top_p":{"default":1,"description":"Nucleus sampling alternative.","max":1,"min":0,"required":false,"step":0.05,"type":"float"}}
Responses
200

Created

application/json
post
POST /admin/models HTTP/1.1
Host: api.xxx.langdb.ai
Authorization: Bearer JWT
X-Admin-Key: text
Content-Type: application/json
Accept: */*
Content-Length: 884

{
  "model_name": "my-model",
  "description": "A custom completions model for text and image inputs",
  "provider_info_id": "e2e9129b-6661-4eeb-80a2-0c86964974c9",
  "project_id": "55f4a12b-74c8-4294-8e4b-537f13fc3861",
  "public": false,
  "request_response_mapping": "openai-compatible",
  "model_type": "completions",
  "input_token_price": 0.00001,
  "output_token_price": 0.00003,
  "context_size": 128000,
  "capabilities": [
    "tools"
  ],
  "input_types": [
    "text",
    "image"
  ],
  "output_types": [
    "text",
    "image"
  ],
  "tags": [],
  "type_prices": {
    "text_generation": 0.00002
  },
  "mp_price": null,
  "owner_name": "openai",
  "priority": 0,
  "model_name_in_provider": "my-model-v1.2",
  "parameters": {
    "top_k": {
      "default": 0,
      "description": "Limits the token sampling to only the top K tokens.",
      "min": 0,
      "required": false,
      "step": 1,
      "type": "int"
    },
    "top_p": {
      "default": 1,
      "description": "Nucleus sampling alternative.",
      "max": 1,
      "min": 0,
      "required": false,
      "step": 0.05,
      "type": "float"
    }
  }
}
200

Created

{
  "id": "55f4a12b-74c8-4294-8e4b-537f13fc3861",
  "model_name": "my-model",
  "description": "A custom completions model for text and image inputs",
  "provider_info_id": "e2e9129b-6661-4eeb-80a2-0c86964974c9",
  "model_type": "completions",
  "input_token_price": "0.00001",
  "output_token_price": "0.00003",
  "context_size": 128000,
  "capabilities": [
    "tools"
  ],
  "input_types": [
    "text",
    "image"
  ],
  "output_types": [
    "text",
    "image"
  ],
  "tags": [],
  "type_prices": null,
  "mp_price": null,
  "model_name_in_provider": "my-model-v1.2",
  "owner_name": "openai",
  "priority": 0,
  "parameters": {
    "top_k": {
      "default": 0,
      "description": "Limits the token sampling to only the top K tokens.",
      "min": 0,
      "required": false,
      "step": 1,
      "type": "int"
    },
    "top_p": {
      "default": 1,
      "description": "An alternative to sampling with temperature.",
      "max": 1,
      "min": 0,
      "required": false,
      "step": 0.05,
      "type": "float"
    }
  }
}

Retrieve pricing information

get

Returns the pricing details for LangDB services.

Responses
200

Successful retrieval of pricing information

application/json
get
GET /pricing HTTP/1.1
Host: api.us-east-1.langdb.ai
Accept: */*
200

Successful retrieval of pricing information

{
  "model": "gpt-3.5-turbo-0125",
  "provider": "openai",
  "price": {
    "per_input_token": 0.5,
    "per_output_token": 1.5,
    "valid_from": null
  },
  "input_formats": [
    "text"
  ],
  "output_formats": [
    "text"
  ],
  "capabilities": [
    "tools"
  ],
  "type": "completions",
  "limits": {
    "max_context_size": 16385
  }
}

List models

get
Responses
200

OK

application/json
get
GET /models HTTP/1.1
Host: api.us-east-1.langdb.ai
Accept: */*
200

OK

{
  "object": "list",
  "data": [
    {
      "id": "o1-mini",
      "object": "model",
      "created": 1686935002,
      "owned_by": "openai"
    }
  ]
}

Set custom prices for imported models

post

Set custom pricing for models imported from providers like Bedrock, Azure, Vertex that do not have built-in pricing

Authorizations
Path parameters
project_idstring · uuidRequired

UUID of the project

Example: 25a0b7d9-86cf-448d-8395-66e9073d876f
Body

Request body for setting custom prices on imported models

Responses
200

Custom prices set successfully

application/json
post
POST /projects/{project_id}/custom_prices HTTP/1.1
Host: api.us-east-1.langdb.ai
Authorization: Bearer YOUR_SECRET_TOKEN
Content-Type: application/json
Accept: */*
Content-Length: 88

{
  "bedrock/twelvelabs.pegasus-1-2-v1:0": {
    "per_input_token": 1.23,
    "per_output_token": 2.12
  }
}
200

Custom prices set successfully

{
  "bedrock/ai21.jamba-1-5-large-v1:0": {
    "per_input_token": 1.23,
    "per_output_token": 2.12
  }
}

Guardrails

Enforce safety, compliance, and quality with LangDB guardrails—moderate content, validate responses, and detect security risks.

LangDB allows developers to enforce specific constraints and checks on their LLM calls, ensuring safety, compliance, and quality control.

Guardrails currently support request validation and logging, ensuring structured oversight of LLM interactions.

Guardrail Templates on LangDB

These guardrails include:

  • Content Moderation: Detects and filters harmful or inappropriate content (e.g., toxicity detection, sentiment analysis).

  • Security Checks: Identifies and mitigates security risks (e.g., PII detection, prompt injection detection).

  • Compliance Enforcement: Ensures adherence to company policies and factual accuracy (e.g., policy adherence, factual accuracy).

  • Response Validation: Validates response format and structure (e.g., word count, JSON schema, regex patterns).

Guardrails can be configured via the UI or API, providing flexibility for different use cases.
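
For example, guards can be attached to an individual chat completion request through the extra.guards field of the request body (the same field shown in the Chat Completions API reference). The guard IDs below are placeholders; use the IDs generated for the guardrails configured in your own project.

import os
import requests

response = requests.post(
    "https://api.us-east-1.langdb.ai/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['LANGDB_API_KEY']}",
        "X-Project-Id": os.environ["LANGDB_PROJECT_ID"],
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Tell me about your products."}],
        # Attach project-specific guard IDs to this request (placeholders below)
        "extra": {
            "guards": [
                "toxicity_detection_xxxxxxxx",
                "word_count_validator_xxxxxxxx",
            ]
        },
    },
)
print(response.json()["choices"][0]["message"]["content"])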

Guardrail Behaviour

When a guardrail blocks an input or output, the system returns a structured error response. Below are some example responses for different scenarios:

Example 1: Input Rejected by Guard

{
  "id": "",
  "object": "chat.completion",
  "created": 0,
  "model": "",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Input rejected by guard",
        "tool_calls": null,
        "refusal": null,
        "tool_call_id": null
      },
      "finish_reason": "rejected"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0,
    "cost": 0.0
  }
}

Example 2: Output Rejected by Guard

{
  "id": "5ef4d8b1-f700-46ca-8439-b537f58f7dc6",
  "object": "chat.completion",
  "created": 1741865840,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Output rejected by guard",
        "tool_calls": null,
        "refusal": null,
        "tool_call_id": null
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 21,
    "completion_tokens": 40,
    "total_tokens": 61,
    "cost": 0.000032579999999999996
  }
}

Limitations

It is important to note that guardrails cannot be applied to streaming outputs.

Guardrail Templates

LangDB provides prebuilt templates to enforce various constraints on LLM responses. These templates cover areas such as content moderation, security, compliance, and validation.

The following table provides quick access to each guardrail template:

Guardrail
Description

Toxicity Detection
Detects and filters toxic or harmful content.

JSON Schema Validator
Validates responses against a user-defined JSON schema.

Competitor Mention Check
Detects mentions of competitor names or products.

PII Detection
Identifies personally identifiable information in responses.

Prompt Injection Detection
Detects attempts to manipulate the AI through prompt injections.

Company Policy Compliance
Ensures responses align with company policies.

Regex Pattern Validator
Validates responses against specified regex patterns.

Word Count Validator
Ensures responses meet specified word count requirements.

Sentiment Analysis
Evaluates sentiment to ensure appropriate tone.

Language Validator
Checks if responses are in allowed languages.

Topic Adherence
Ensures responses stay on specified topics.

Factual Accuracy
Validates that responses contain factually accurate information.

Toxicity Detection (content-toxicity)

Detects and filters out toxic, harmful, or inappropriate content.

Parameter
Type
Description
Defaults

threshold

number

Confidence threshold for toxicity detection.

Required

categories

array

Categories of toxicity to detect.

["hate", "harassment", "violence", "self-harm", "sexual", "profanity"]

evaluation_criteria

array

Criteria used for toxicity evaluation.

["Hate speech", "Harassment", "Violence", "Self-harm", "Sexual content", "Profanity"]

JSON Schema Validator (validation-json-schema)

Validates responses against a user-defined JSON schema.

Parameter
Type
Description
Defaults

schema

object

Custom JSON schema to validate against (replace with your own schema)

Required
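
As an illustration, a guard of this type might be configured with a schema like the following (shown here as a plain Python dict). The exact format in which the schema is supplied depends on how you configure the guard in the UI or API.

# Illustrative JSON Schema that a JSON Schema Validator guard could be configured with.
# It requires the response to be an object with a string "answer" and a numeric "confidence".
response_schema = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["answer", "confidence"],
    "additionalProperties": False,
}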

Competitor Mention Check (content-competitor-mentions)

Detects mentions of competitor names or products in LLM responses.

Parameter
Type
Description
Defaults

competitors

array

List of competitor names.

["company1", "company2"]

match_partial

boolean

Whether to match partial names.

true

case_sensitive

boolean

Whether matching should be case sensitive

false

PII Detection (security-pii-detection)

Detects personally identifiable information (PII) in responses.

Parameter
Type
Description
Defaults

pii_types

array

Types of PII to detect.

["email", "phone", "ssn", "credit_card"]

redact

boolean

Whether to redact detected PII.

false

Prompt Injection Detection (security-prompt-injection)

Identifies prompt injection attacks attempting to manipulate the AI.

Parameter
Type
Description
Defaults

threshold

number

Confidence threshold for injection detection.

Required

detection_patterns

array

Common patterns used in prompt injection attacks.

["Ignore previous instructions", "Forget your training", "Tell me your prompt"]

evaluation_criteria

array

Criteria used for detection.

["Attempts to override system instructions", "Attempts to extract system prompt information", "Attempts to make the AI operate outside its intended purpose"]

Company Policy Compliance (compliance-company-policy)

Ensures that responses align with predefined company policies.

Parameter
Type
Description
Defaults

embedding_model

string

Model used for text embedding.

text-embedding-ada-002

threshold

number

Similarity threshold for compliance.

Required

dataset

object

Example dataset for compliance checking.

Contains predefined examples
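
Conceptually, a compliance guard like this embeds the response and the examples in the configured dataset using the chosen embedding model, then compares their similarity against the threshold. The snippet below is only a minimal sketch of that idea, not LangDB's actual implementation.

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def complies_with_policy(response_vec: np.ndarray,
                         policy_vecs: list[np.ndarray],
                         threshold: float) -> bool:
    # The response passes if it is sufficiently similar to at least one
    # compliant example from the configured dataset.
    return any(cosine_similarity(response_vec, p) >= threshold for p in policy_vecs)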

Regex Pattern Validator (validation-regex-pattern)

Validates responses against specific regex patterns.

Parameter
Type
Description
Defaults

patterns

array

List of regex patterns.

["^[A-Za-z0-9\s.,!?]+$"]

match_type

string

Whether all, any, or none of the patterns must match.

"all"

Word Count Validator (validation-word-count)

Ensures responses meet specified word count requirements.

Parameter
Type
Description
Defaults

min_words

number

Minimum number of words required.

10

max_words

number

Maximum number of words allowed.

500

count_method

string

Method for word counting.

split

Sentiment Analysis (content-sentiment-analysis)

Evaluates the sentiment of responses to ensure appropriate tone.

Parameter
Type
Description
Defaults

allowed_sentiments

array

Allowed sentiment categories.

["positive", "neutral"]

threshold

number

Confidence threshold for sentiment detection.

0.7

Language Validator (content-language-validation)

Checks if responses are in allowed languages.

Parameter
Type
Description
Defaults

allowed_languages

array

List of allowed languages.

["english"]

threshold

number

Confidence threshold for language detection.

0.9

Topic Adherence (content-topic-adherence)

Ensures responses stay on specified topics.

Parameter
Type
Description
Defaults

allowed_topics

array

List of allowed topics.

["Product information", "Technical assistance"]

forbidden_topics

array

List of forbidden topics.

["politics", "religion"]

threshold

number

Confidence threshold for topic detection.

0.7

Factual Accuracy (content-factual-accuracy)

Validates that responses contain factually accurate information.

Parameter
Type
Description
Defaults

reference_facts

array

List of reference facts.

[]

threshold

number

Confidence threshold for factuality assessment.

0.8

evaluation_criteria

array

Criteria used to assess factual accuracy.

["Contains verifiable information", "Avoids speculative claims"]

Fetch analytics data

post
Authorizations
Header parameters
X-Project-IdstringRequired

LangDB project ID

Body
start_time_usinteger · int64OptionalDeprecated

Start time in microseconds.

Example: 1693062345678
end_time_usinteger · int64OptionalDeprecated

End time in microseconds.

Example: 1693082345678
periodstring · enumOptional

Time period for filtering data. If provided, start_time and end_time will be ignored.

Example: last_monthPossible values:
Responses
200

Successful response

application/json
post
POST /analytics HTTP/1.1
Host: api.us-east-1.langdb.ai
Authorization: Bearer YOUR_SECRET_TOKEN
X-Project-Id: text
Content-Type: application/json
Accept: */*
Content-Length: 23

{
  "period": "last_month"
}
200

Successful response

{
  "timeseries": [
    {
      "hour": "2025-02-20 18:00:00",
      "total_cost": 12.34,
      "total_requests": 1000,
      "avg_duration": 250.5,
      "duration": 245.7,
      "duration_p99": 750.2,
      "duration_p95": 500.1,
      "duration_p90": 400.8,
      "duration_p50": 200.3,
      "total_duration": 1,
      "total_input_tokens": 1,
      "total_output_tokens": 1,
      "error_rate": 1,
      "error_request_count": 1,
      "avg_ttft": 1,
      "ttft": 1,
      "ttft_p99": 1,
      "ttft_p95": 1,
      "ttft_p90": 1,
      "ttft_p50": 1,
      "tps": 1,
      "tps_p99": 1,
      "tps_p95": 1,
      "tps_p90": 1,
      "tps_p50": 1,
      "tpot": 0.85,
      "tpot_p99": 1.5,
      "tpot_p95": 1.2,
      "tpot_p90": 1,
      "tpot_p50": 0.75,
      "tag_tuple": [
        "text"
      ]
    }
  ],
  "start_time": 1,
  "end_time": 1
}

Fetch analytics summary

post
Authorizations
Header parameters
X-Project-IdstringRequired

LangDB project ID

Body
start_time_usinteger · int64OptionalDeprecatedExample: 1693062345678
end_time_usinteger · int64OptionalDeprecatedExample: 1693082345678
periodstring · enumOptional

Time period for filtering data. If provided, start_time and end_time will be ignored.

Example: last_monthPossible values:
groupBystring[]RequiredExample: ["provider"]
Responses
200

Successful response

application/json
post
POST /analytics/summary HTTP/1.1
Host: api.us-east-1.langdb.ai
Authorization: Bearer YOUR_SECRET_TOKEN
X-Project-Id: text
Content-Type: application/json
Accept: */*
Content-Length: 46

{
  "period": "last_month",
  "groupBy": [
    "provider"
  ]
}
200

Successful response

{
  "summary": [
    {
      "tag_tuple": [
        "openai",
        "gpt-4"
      ],
      "total_cost": 156.78,
      "total_requests": 5000,
      "total_duration": 1250000,
      "avg_duration": 250,
      "duration": 245.5,
      "duration_p99": 750,
      "duration_p95": 500,
      "duration_p90": 400,
      "duration_p50": 200,
      "total_input_tokens": 100000,
      "total_output_tokens": 50000,
      "avg_ttft": 100,
      "ttft": 98.5,
      "ttft_p99": 300,
      "ttft_p95": 200,
      "ttft_p90": 150,
      "ttft_p50": 80,
      "tps": 10.5,
      "tps_p99": 20,
      "tps_p95": 15,
      "tps_p90": 12,
      "tps_p50": 8,
      "tpot": 0.85,
      "tpot_p99": 1.5,
      "tpot_p95": 1.2,
      "tpot_p90": 1,
      "tpot_p50": 0.75,
      "error_rate": 1,
      "error_request_count": 1
    }
  ],
  "start_time": 1,
  "end_time": 1
}

Get total usage

post
Authorizations
Header parameters
X-Project-IdstringRequired

LangDB project ID

Body
start_time_usinteger · int64OptionalDeprecatedExample: 1693062345678
end_time_usinteger · int64OptionalDeprecated

End time in microseconds.

Example: 1693082345678
periodstring · enumOptional

Time period for filtering data. If provided, start_time and end_time will be ignored.

Example: last_monthPossible values:
Responses
200

OK

application/json
post
POST /usage/total HTTP/1.1
Host: api.us-east-1.langdb.ai
Authorization: Bearer YOUR_SECRET_TOKEN
X-Project-Id: text
Content-Type: application/json
Accept: */*
Content-Length: 23

{
  "period": "last_month"
}
200

OK

{
  "models": [
    {
      "provider": "openai",
      "model_name": "gpt-4o",
      "total_input_tokens": 3196182,
      "total_output_tokens": 74096,
      "total_cost": 10.4776979999,
      "cost_per_input_token": 3,
      "cost_per_output_token": 12
    }
  ],
  "total": {
    "total_input_tokens": 4181386,
    "total_output_tokens": 206547,
    "total_cost": 11.8904386859
  },
  "period_start": 1737504000,
  "period_end": 1740120949
}

Get usage by model

post
Authorizations
Header parameters
X-Project-IdstringRequired

LangDB project ID

Body
start_time_usinteger · int64OptionalExample: 1693062345678
end_time_usinteger · int64Optional
min_unitstring · enumOptional

The granularity of the returned usage data.

Example: hourPossible values:
Responses
200

Successful response

application/json
post
POST /usage/models HTTP/1.1
Host: api.us-east-1.langdb.ai
Authorization: Bearer YOUR_SECRET_TOKEN
X-Project-Id: text
Content-Type: application/json
Accept: */*
Content-Length: 65

{
  "start_time_us": 1693062345678,
  "end_time_us": 1,
  "min_unit": "hour"
}
200

Successful response

{
  "models": [
    {
      "hour": "2025-02-15 09:00:00",
      "provider": "openai",
      "model_name": "gpt-4o",
      "total_input_tokens": 451235,
      "total_output_tokens": 2553,
      "total_cost": 1.3843410000000005
    }
  ],
  "period_start": 1737504000000000,
  "period_end": 1740121147931000
}

Create chat completion

post
Authorizations
Header parameters
X-Project-IdstringRequired

LangDB project ID

Body
modelstringRequired

ID of the model to use. This can be either a specific model ID or a virtual model identifier.

Example: gpt-4o
temperaturenumber · max: 2Optional

Sampling temperature.

Example: 0.8
top_pnumber · max: 1Optional

Nucleus sampling probability.

max_tokensinteger · min: 1Optional

The maximum number of tokens that can be generated in the chat completion.

ninteger · min: 1Optional

How many chat completion choices to generate for each input message.

Default: 1
stopone ofOptional

Up to 4 sequences where the API will stop generating further tokens.

stringOptional
or
string[]Optional
presence_penaltynumber · min: -2 · max: 2Optional

Penalize new tokens based on whether they appear in the text so far.

frequency_penaltynumber · min: -2 · max: 2Optional

Penalize new tokens based on their existing frequency in the text so far.

logprobsbooleanOptional

Whether to return log probabilities of the output tokens.

top_logprobsinteger · min: 1 · max: 20Optional

The number of most likely tokens to return at each position, for which the log probabilities are returned. Requires logprobs=true.

seedintegerOptional

If specified, the backend will make a best effort to return deterministic results.

response_formatone ofOptional

Format for the model's response.

string · enumOptionalPossible values:
or
tool_choiceone ofOptional

Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools.

string · enumOptionalPossible values:
or
or
parallel_tool_callsbooleanOptional

Whether to enable parallel function calling during tool use.

Default: true
streambooleanOptional

Whether to stream back partial progress.

Default: false
userstringOptionalDeprecated

Deprecated. This field is being replaced by safety_identifier and prompt_cache_key. Use prompt_cache_key to maintain caching optimizations. A stable identifier for your end-users was previously used to boost cache hit rates by better bucketing similar requests and to help detect and prevent abuse.

safety_identifierstringOptional

Stable identifier for your end-users, used to help detect and prevent abuse. Prefer this over user. For caching optimization, combine with prompt_cache_key.

prompt_cache_keystringOptional

Used to cache responses for similar requests to optimize cache hit rates. LangDB supports prompt caching; see https://docs.langdb.ai/features/prompt-caching. Can be used instead of the user field for cache bucketing.

Responses
200

OK

application/json
post
POST /v1/chat/completions HTTP/1.1
Host: api.us-east-1.langdb.ai
Authorization: Bearer YOUR_SECRET_TOKEN
X-Project-Id: text
Content-Type: application/json
Accept: */*
Content-Length: 859

{
  "model": "router/dynamic",
  "messages": [
    {
      "role": "user",
      "content": "Write a haiku about recursion in programming."
    }
  ],
  "temperature": 0.8,
  "max_tokens": 1000,
  "top_p": 0.9,
  "frequency_penalty": 0.1,
  "presence_penalty": 0.2,
  "stream": false,
  "response_format": "json_object",
  "mcp_servers": [
    {
      "server_url": "wss://your-mcp-server.com/ws?config=your_encoded_config",
      "type": "ws"
    }
  ],
  "router": {
    "type": "conditional",
    "routes": [
      {
        "conditions": {
          "all": [
            {
              "extra.user.tier": {
                "$eq": "premium"
              }
            }
          ]
        },
        "name": "premium_user",
        "targets": {
          "$any": [
            "openai/gpt-4.1-mini",
            "xai/grok-4",
            "anthropic/claude-sonnet-4"
          ],
          "filter": {
            "error_rate": {
              "$lt": 0.01
            }
          },
          "sort_by": "ttft",
          "sort_order": "min"
        }
      },
      {
        "name": "basic_user",
        "targets": "openai/gpt-4.1-nano"
      }
    ]
  },
  "extra": {
    "guards": [
      "word_count_validator_bd4bdnun",
      "toxicity_detection_4yj4cdvu"
    ],
    "user": {
      "id": "7",
      "name": "mrunmay",
      "tier": "premium",
      "tags": [
        "coding",
        "software"
      ]
    }
  }
}
200

OK

{
  "id": "text",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 1,
      "message": {
        "role": "assistant",
        "content": "text",
        "tool_calls": [
          {
            "id": "text",
            "type": "function",
            "function": {
              "name": "text",
              "arguments": "text"
            }
          }
        ],
        "function_call": {
          "name": "text",
          "arguments": "text"
        }
      },
      "logprobs": {
        "content": [
          {
            "token": "text",
            "logprob": 1
          }
        ],
        "refusal": [
          {
            "token": "text",
            "logprob": 1
          }
        ]
      }
    }
  ],
  "created": 1,
  "model": "text",
  "system_fingerprint": "text",
  "object": "chat.completion",
  "usage": {
    "prompt_tokens": 1,
    "completion_tokens": 1,
    "total_tokens": 1,
    "prompt_tokens_details": {
      "cached_tokens": 1,
      "cache_creation_tokens": 1,
      "audio_tokens": 1
    },
    "completion_tokens_details": {
      "reasoning_tokens": 1,
      "accepted_prediction_tokens": 1,
      "rejected_prediction_tokens": 1,
      "audio_tokens": 1
    },
    "cost": 1
  }
}

Create embeddings

post

Creates an embedding vector representing the input text or token arrays.

Authorizations
Body
modelstringRequired

ID of the model to use for generating embeddings.

Example: text-embedding-ada-002
inputone ofRequired
stringOptional

The text to embed.

or
string[]Optional

Array of text strings to embed.

encoding_formatstring · enumOptional

The format to return the embeddings in.

Default: floatPossible values:
dimensionsinteger · min: 1 · max: 1536Optional

The number of dimensions the resulting embeddings should have.

Example: 1536
Responses
200

Successful response with embeddings

application/json
post
POST /v1/embeddings HTTP/1.1
Host: api.us-east-1.langdb.ai
Authorization: Bearer YOUR_SECRET_TOKEN
Content-Type: application/json
Accept: */*
Content-Length: 136

{
  "input": "The food was delicious and the waiter was kind.",
  "model": "text-embedding-ada-002",
  "encoding_format": "float",
  "dimensions": 1536
}
200

Successful response with embeddings

{
  "data": [
    {
      "embedding": [
        1
      ],
      "index": 1
    }
  ],
  "model": "text",
  "usage": {
    "prompt_tokens": 1,
    "total_tokens": 1
  }
}
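
Because the endpoint is OpenAI-compatible, you can also call it with the official OpenAI Python SDK. The sketch below assumes the base URL and project header used elsewhere in this documentation.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai/v1",
    api_key=os.environ["LANGDB_API_KEY"],
    default_headers={"x-project-id": os.environ["LANGDB_PROJECT_ID"]},
)

embedding = client.embeddings.create(
    model="text-embedding-ada-002",
    input="The food was delicious and the waiter was kind.",
)
print(len(embedding.data[0].embedding), "dimensions")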

Getting Started

Use LangDB’s Python SDK to generate completions, monitor API usage, retrieve analytics, and evaluate LLM workflows efficiently.

Key Features

LangDB exposes two complementary capabilities:

  1. Chat Completions Client – Call LLMs using the LangDb Python client. This works as a drop-in replacement for openai.ChatCompletion while adding automatic usage, cost and latency reporting.

  2. Agent Tracing – Instrument your existing AI framework (ADK, LangChain, CrewAI, etc.) with a single init() call. All calls are routed through the LangDB collector and enriched with additional framework metadata that is visible on the LangDB dashboard.


Quick Start (Chat Completions)


Agent Tracing Quick Start

Note: Always initialize LangDB before importing any framework-specific classes to ensure proper instrumentation.

Example Trace Screenshot

Supported Frameworks (Tracing)

Framework
Installation
Import Pattern
Key Features

How It Works

LangDB uses intelligent monkey patching to instrument your AI frameworks at runtime:

Technical details for each framework:

Google ADK

  • Patches Agent.__init__ to inject callbacks

  • Tracks agent hierarchies and tool usage

  • Maintains thread context across invocations

OpenAI

  • Intercepts HTTP requests via AsyncOpenAI.post

  • Propagates trace context via headers

  • Correlates spans across agent interactions

LangChain

  • Modifies httpx.Client.send for request tracing

  • Automatically tracks chains and agents

  • Injects trace headers into all requests

CrewAI

  • Intercepts litellm.completion for LLM calls

  • Tracks crew members and task delegation

  • Propagates context through LiteLLM headers

Agno

  • Patches LangDB.invoke and client parameters

  • Traces workflows and model interactions

  • Maintains consistent session context

Installation

Configuration

Set your credentials (or pass them directly to the init() function):

Client Usage (Chat Completions)

Initialize LangDb Client

Chat Completions

Thread Operations

Get Messages

Retrieve messages from a specific thread:

Get Thread Cost

Get cost and token usage information for a thread:

Analytics

Get analytics data for specific tags:

Evaluate Multiple Threads

List Available Models

Framework-Specific Examples (Tracing)

Google ADK

OpenAI

LangChain

CrewAI

Agno

Advanced Configuration

Environment Variables

Variable
Description
Default

Custom Configuration

All init() functions accept the same optional parameters:

Technical Details

Session and Thread Management

  • Thread ID: Maintains consistent session identifiers across agent calls

  • Run ID: Unique identifier for each execution trace

  • Invocation Tracking: Tracks the sequence of agent invocations

  • State Persistence: Maintains context across callbacks and sub-agent interactions

Distributed Tracing

  • OpenTelemetry Integration: Uses OpenTelemetry for standardized tracing

  • Attribute Propagation: Automatically propagates LangDB-specific attributes

  • Span Correlation: Links related spans across different agents and frameworks

  • Custom Exporters: Supports multiple export formats (OTLP, Console)

API Reference

Initialization Functions

Each framework has a simple init() function that handles all necessary setup:

  • pylangdb.adk.init(): Patches Google ADK Agent class with LangDB callbacks

  • pylangdb.openai.init(): Initializes OpenAI agents tracing

  • pylangdb.langchain.init(): Initializes LangChain tracing

  • pylangdb.crewai.init(): Initializes CrewAI tracing

  • pylangdb.agno.init(): Initializes Agno tracing

All init functions accept optional parameters for custom configuration (collector_endpoint, api_key, project_id)

Troubleshooting

Common Issues

  1. Missing API Key: Ensure LANGDB_API_KEY and LANGDB_PROJECT_ID are set

  2. Tracing Not Working: Check that initialization functions are called before creating agents

  3. Network Issues: Verify collector endpoint is accessible

  4. Framework Conflicts: Initialize LangDB integration before other instrumentation

pip install pylangdb[client]
from pylangdb.client import LangDb

# Initialize LangDB client
client = LangDb(api_key="your_api_key", project_id="your_project_id")

# Simple chat completion
resp = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(resp.choices[0].message.content)
# Install the package with Google ADK support
pip install pylangdb[adk]
# Import and initialize LangDB tracing
# First initialize LangDB before defining any agents
from pylangdb.adk import init
init()

import datetime
from zoneinfo import ZoneInfo
from google.adk.agents import Agent

def get_weather(city: str) -> dict:
    if city.lower() != "new york":
        return {"status": "error", "error_message": f"Weather information for '{city}' is not available."}
    return {"status": "success", "report": "The weather in New York is sunny with a temperature of 25 degrees Celsius (77 degrees Fahrenheit)."}

def get_current_time(city: str) -> dict:
    if city.lower() != "new york":
        return {"status": "error", "error_message": f"Sorry, I don't have timezone information for {city}."}
    tz = ZoneInfo("America/New_York")
    now = datetime.datetime.now(tz)
    return {"status": "success", "report": f'The current time in {city} is {now.strftime("%Y-%m-%d %H:%M:%S %Z%z")}'}

root_agent = Agent(
    name="weather_time_agent",
    model="gemini-2.0-flash",
    description=("Agent to answer questions about the time and weather in a city." ),
    instruction=("You are a helpful agent who can answer user questions about the time and weather in a city."),
    tools=[get_weather, get_current_time],
)

Google ADK

pip install pylangdb[adk]

from pylangdb.adk import init

Automatic sub-agent discovery

OpenAI

pip install pylangdb[openai]

from pylangdb.openai import init

Custom model provider support and Run Tracing

LangChain

pip install pylangdb[langchain]

from pylangdb.langchain import init

Automatic chain tracing

CrewAI

pip install pylangdb[crewai]

from pylangdb.crewai import init

Multi-agent crew tracing

Agno

pip install pylangdb[agno]

from pylangdb.agno import init

Tool usage tracing, model interactions

# For client library functionality (chat completions, analytics, etc.)
pip install pylangdb[client]

# For framework tracing - install specific framework extras
pip install pylangdb[adk]      # Google ADK tracing
pip install pylangdb[openai]   # OpenAI agents tracing
pip install pylangdb[langchain] # LangChain tracing
pip install pylangdb[crewai]   # CrewAI tracing
pip install pylangdb[agno]     # Agno tracing
export LANGDB_API_KEY="your-api-key"
export LANGDB_PROJECT_ID="your-project-id"
from pylangdb import LangDb

# Initialize with API key and project ID
client = LangDb(api_key="your_api_key", project_id="your_project_id")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Say hello!"}
]

response = client.completion(
    model="gemini-1.5-pro-latest",
    messages=messages,
    temperature=0.7,
    max_tokens=100
)
messages = client.get_messages(thread_id="your_thread_id")

# Access message details
for message in messages:
    print(f"Type: {message.type}")
    print(f"Content: {message.content}")
    if message.tool_calls:
        for tool_call in message.tool_calls:
            print(f"Tool: {tool_call.function.name}")
usage = client.get_usage(thread_id="your_thread_id")
print(f"Total cost: ${usage.total_cost:.4f}")
print(f"Input tokens: {usage.total_input_tokens}")
print(f"Output tokens: {usage.total_output_tokens}")
# Get raw analytics data
analytics = client.get_analytics(
    tags="model1,model2",
    start_time_us=None,  # Optional: defaults to 24 hours ago
    end_time_us=None     # Optional: defaults to current time
)

# Get analytics as a pandas DataFrame
df = client.get_analytics_dataframe(
    tags="model1,model2",
    start_time_us=None,
    end_time_us=None
)
df = client.create_evaluation_df(thread_ids=["thread1", "thread2"])
print(df.head())
models = client.list_models()
print(models)
from pylangdb.adk import init

# Monkey-patch the client for tracing
init()

# Import your agents after initializing tracing
from google.adk.agents import Agent
from travel_concierge.sub_agents.booking.agent import booking_agent
from travel_concierge.sub_agents.in_trip.agent import in_trip_agent
from travel_concierge.sub_agents.inspiration.agent import inspiration_agent
from travel_concierge.sub_agents.planning.agent import planning_agent
from travel_concierge.sub_agents.post_trip.agent import post_trip_agent
from travel_concierge.sub_agents.pre_trip.agent import pre_trip_agent
from travel_concierge.tools.memory import _load_precreated_itinerary


root_agent = Agent(
    model="openai/gpt-4.1",
    name="root_agent",
    description="A Travel Conceirge using the services of multiple sub-agents",
    instruction="Instruct the travel concierge to plan a trip for the user.",
    sub_agents=[
        inspiration_agent,
        planning_agent,
        booking_agent,
        pre_trip_agent,
        in_trip_agent,
        post_trip_agent,
    ],
    before_agent_callback=_load_precreated_itinerary,
)
import asyncio
import uuid
import os

# Import LangDB tracing
from pylangdb.openai import init

# Initialize tracing
init()

# Import agent components
from agents import (
    Agent,
    Runner,
    set_default_openai_client,
    RunConfig,
    ModelProvider,
    Model,
    OpenAIChatCompletionsModel
)

# Configure OpenAI client with environment variables
from openai import AsyncOpenAI

client = AsyncOpenAI(
    api_key=os.environ.get("LANGDB_API_KEY"),
    base_url=os.environ.get("LANGDB_API_BASE_URL"),
    default_headers={
        "x-project-id": os.environ.get("LANGDB_PROJECT_ID")
    }
)
set_default_openai_client(client)

# Create a custom model provider
class CustomModelProvider(ModelProvider):
    def get_model(self, model_name: str | None) -> Model:
        return OpenAIChatCompletionsModel(model=model_name, openai_client=client)

CUSTOM_MODEL_PROVIDER = CustomModelProvider()

agent = Agent(
    name="Math Tutor",
    model="gpt-4.1",
    instructions="You are a math tutor who can help students with their math homework.",
)

group_id = str(uuid.uuid4())
# Use the model provider with a unique group_id for tracing
async def run_agent():
    response = await Runner.run(
        agent,
        input="Hello World",
        run_config=RunConfig(
            model_provider=CUSTOM_MODEL_PROVIDER,  # Inject custom model provider
            group_id=group_id                      # Link all steps to the same trace
        )
    )
    print(response.final_output)

# Run the async function with asyncio
asyncio.run(run_agent())
import os
from pylangdb.langchain import init

init()

# Get environment variables for configuration
api_base = os.getenv("LANGDB_API_BASE_URL")
api_key = os.getenv("LANGDB_API_KEY")
if not api_key:
    raise ValueError("Please set the LANGDB_API_KEY environment variable")

project_id = os.getenv("LANGDB_PROJECT_ID")

# Default headers for API requests
default_headers: dict[str, str] = {
    "x-project-id": project-id
}

# Your existing LangChain code works with proper configuration
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

# Initialize OpenAI LLM with proper configuration
llm = ChatOpenAI(
    model_name="gpt-4",
    temperature=0.3,
    openai_api_base=api_base,
    openai_api_key=api_key,
    default_headers=default_headers,
)
result = llm.invoke([HumanMessage(content="Hello, LangChain!")])
import os
from crewai import Agent, Task, Crew, LLM
from dotenv import load_dotenv

load_dotenv()

# Import and initialize LangDB tracing
from pylangdb.crewai import init

# Initialize tracing before importing or creating any agents
init()

# Initialize API credentials
api_key = os.environ.get("LANGDB_API_KEY")
api_base = os.environ.get("LANGDB_API_BASE_URL")
project_id = os.environ.get("LANGDB_PROJECT_ID")

# Create LLM with proper headers
llm = LLM(
    model="gpt-4",
    api_key=api_key,
    base_url=api_base,
    extra_headers={
        "x-project-id": project_id
    }
)
# Create and use your CrewAI components as usual
# They will be automatically traced by LangDB
researcher = Agent(
    role="researcher",
    goal="Research the topic thoroughly",
    backstory="You are an expert researcher",
    llm=llm,
    verbose=True
)

task = Task(
    description="Research the given topic",
    agent=researcher
)

crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()
import os
from agno.agent import Agent
from agno.tools.duckduckgo import DuckDuckGoTools

# Import and initialize LangDB tracing
from pylangdb.agno import init
init()

# Import LangDB model after initializing tracing
from agno.models.langdb import LangDB

# Create agent with LangDB model
agent = Agent(
    name="Web Agent",
    role="Search the web for information",
    model=LangDB(
        id="openai/gpt-4",
        base_url=os.getenv("LANGDB_API_BASE_URL") + '/' + os.getenv("LANGDB_PROJECT_ID") + '/v1',
        api_key=os.getenv("LANGDB_API_KEY"),
        project_id=os.getenv("LANGDB_PROJECT_ID"),
    ),
    tools=[DuckDuckGoTools()],
    instructions="Answer questions using web search",
    show_tool_calls=True,
    markdown=True,
)

# Use the agent
response = agent.run("What is LangDB?")

LANGDB_API_KEY

Your LangDB API key

Required

LANGDB_PROJECT_ID

Your LangDB project ID

Required

LANGDB_API_BASE_URL

LangDB API base URL

https://api.us-east-1.langdb.ai

LANGDB_TRACING_BASE_URL

Tracing collector endpoint

https://api.us-east-1.langdb.ai:4317

LANGDB_TRACING

Enable/disable tracing

true

LANGDB_TRACING_EXPORTERS

Comma-separated list of exporters

otlp, console

from pylangdb.openai import init

init(
    collector_endpoint='https://api.us-east-1.langdb.ai:4317',
    api_key="langdb-api-key",
    project_id="langdb-project-id"
)
Google ADK Trace Example