# Introduction to AI Gateway

Monitor, Govern and Secure your AI traffic.

## **What is an AI Gateway?**

An AI gateway is a middleware that acts as a unified access point to multiple LLMs, optimizing, securing, and managing AI traffic. It simplifies integration with different AI providers while enabling cost control, observability, and performance benchmarking. With an AI gateway, businesses can seamlessly switch between models, monitor usage, and optimize costs.

Diagram showing LangDB as a central AI gateway for applications, offering features like cost management, tracing, routing, prompt caching, security, and guardrails. LangDB connects to multiple AI providers, including OpenAI, Gemini, Anthropic, Bedrock, Mistral, Hugging Face, Cohere, and Llama.

LangDB provides OpenAI-compatible APIs, enabling developers to connect with multiple Large Language Models (LLMs) by changing just two lines of code. Govern, secure, and optimize all of your AI traffic with cost control, optimization, and full observability.

**What AI Gateway Offers Out of the Box**

With LangDB, you can:

* **Provide access to all major LLMs**\
  Ensure seamless integration with leading large language models to maximize flexibility and power.
* **No framework code required**\
  Enable plug-and-play functionality using any framework like LangChain, Vercel AI SDK, CrewAI, etc., for easy adoption.
* **Plug & Play Tracing & Cost Optimization**\
  Simplify implementation of tracing and cost optimization features, ensuring streamlined operations.
* **Automatic routing based on cost, quality, and other variables**\
  Dynamically route requests to the most suitable LLM based on predefined parameters.
* **Benchmark and provide insights**\
  Deliver insights into the best-performing models for specific tasks, such as coding or reasoning, to enhance decision-making.

**Quick Start with LangDB**

LangDB offers both managed and self-hosted versions for organisations to manage AI traffic. Choose between the **Hosted Gateway** for ease of use or the **Open-Source Gateway** for full control.


| Option | Description |
| --- | --- |
| [Quick Start](#hosted) | A full-featured, managed AI gateway that provides instant access to 250+ LLMs with enterprise-ready features. |
| [Self Hosted](#open-source) | A self-hosted option for organizations that require complete control over their AI infrastructure. |
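To make the idea of routing on cost and quality more concrete, here is a minimal sketch of one possible selection rule: pick the cheapest model that meets a quality threshold. It is only an illustration and does not describe how LangDB's router is implemented. The `ModelOption` type, the `pick_model` function, the model names, and all prices and scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    """Hypothetical description of a candidate model (not a LangDB type)."""
    name: str
    cost_per_1k_tokens: float  # illustrative price, not real pricing
    quality_score: float       # illustrative benchmark score in [0, 1]


def pick_model(options: list, min_quality: float) -> ModelOption:
    """Return the cheapest option whose quality meets the threshold."""
    eligible = [o for o in options if o.quality_score >= min_quality]
    if not eligible:
        raise ValueError("no model satisfies the quality threshold")
    return min(eligible, key=lambda o: o.cost_per_1k_tokens)


# Made-up candidates used only to exercise the rule
candidates = [
    ModelOption("model-a", cost_per_1k_tokens=0.50, quality_score=0.90),
    ModelOption("model-b", cost_per_1k_tokens=0.10, quality_score=0.75),
    ModelOption("model-c", cost_per_1k_tokens=0.05, quality_score=0.60),
]

chosen = pick_model(candidates, min_quality=0.70)
print(chosen.name)  # prints "model-b"
```

A gateway can apply a rule of this kind per request, which is why the calling application does not need to change when the preferred model changes.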

#### Roadmap

* **Prompt Caching & Optimization** *(In Progress)*\
  Introduce caching mechanisms to optimize prompt usage and reduce redundant costs.
* **GuardRails** *(In Progress)*\
  Implement safeguards to enhance reliability and accuracy in AI outputs.
* **Leaderboard of models per category**\
  Create a comparative leaderboard to highlight model performance across categories.
* **Ready-to-use evaluations for non-data scientists**\
  Provide accessible evaluation tools for users without a data science background.
* **Readily fine-tunable data based on usage**\
  Offer pre-configured datasets tailored for fine-tuning, enabling customized improvements with ease.

# Quick Start

Quick Start guide for LangDB AI Gateway

The **LangDB AI Gateway** allows you to connect with multiple Large Language Models (LLMs) instantly, without any setup.

{% stepper %}
{% step %}
### Account Creation

Sign up on [LangDB](https://app.langdb.ai/) to start using the Hosted Gateway.

{% endstep %}

{% step %}
### Make your First Request

Test a chat window with two different models to see dynamic routing in action.

Sending your first request in LangDB Playground

{% endstep %}

{% step %}
### Check out the Samples Section for Template Code

Use ready-made templates to integrate LangDB into your project effortlessly.

Using LangDB Samples to generate template code

{% endstep %}

{% step %}
### Analytics Section

Monitor usage, costs, and performance insights through the LangDB analytics dashboard.

Checking out LangDB Dashboard for analytics

{% endstep %}
{% endstepper %}

# Working with API

LangDB provides access to 350+ LLMs with OpenAI-compatible APIs.

You can use LangDB as a drop-in replacement for OpenAI APIs, making it easy to integrate into existing workflows and libraries such as the OpenAI Client SDK.

You can choose from any of the [supported models](https://app.langdb.ai/models).

{% tabs %}
{% tab title="Python" %}
```python
from openai import OpenAI

langdb_project_id = "xxxxx"  # LangDB Project ID

client = OpenAI(
    base_url=f"https://api.us-east-1.langdb.ai/{langdb_project_id}/v1",
    api_key="xxxxx",  # LangDB token
)

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",  # Change Model
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "What are the earnings of Apple in 2022?"},
    ],
)

print("Assistant:", response.choices[0].message)
```
{% endtab %}

{% tab title="TypeScript" %}
```typescript
import { OpenAI } from 'openai';

const langdbProjectId = 'xxxx'; // LangDB Project ID

const client = new OpenAI({
  baseURL: `https://api.us-east-1.langdb.ai/${langdbProjectId}/v1`,
  apiKey: 'xxxx', // Your LangDB token
});

const messages = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'What are the earnings of Apple in 2022?' }
];

async function getAssistantReply() {
  const { choices } = await client.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: messages
  });
  console.log('Assistant:', choices[0].message.content);
}

getAssistantReply();
```
{% endtab %}

{% tab title="cURL" %}
```bash
# The project ID is sent as a request header (-H), not as the HTTP method (-X)
curl "https://api.us-east-1.langdb.ai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LANGDB_API_KEY" \
  -H "X-Project-Id: $LANGDB_PROJECT_ID" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Write a haiku about recursion in programming."
      }
    ],
    "temperature": 0.8
  }'
```
{% endtab %}
{% endtabs %}

After sending your request, you can see the Traces on the dashboard:

[Check out the API reference here.](https://docs.langdb.ai/docs/api-reference)

# Working with Multiple Agents

Learn how to use LangDB to Trace Multi Agent workflows

When you add [`run`](https://docs.langdb.ai/ai-gateway/concepts/run) and [`thread`](https://docs.langdb.ai/ai-gateway/concepts/thread) headers to your requests, LangDB automatically visualizes how agents interact, providing a clear view of workflows, hierarchies, and usage patterns. This lets developers track interactions between agents and see the dependencies within a workflow.

With LangDB you can trace multiple agents. Each workflow is captured as a thread, which stores all of its interactions as a conversation, together with tracing details about the LLMs, tool calls, models, parameters, and so on.

## What is a Multi-Agent System?

A multi-agent system consists of independent agents collaborating to solve complex tasks. Agents handle various roles such as user interaction, data processing, and workflow orchestration. LangDB streamlines tracking these interactions for better efficiency and transparency.

## Why Track Workflows?

Tracking ensures:

* **Clear Execution Flow**: Understand how agents interact.
* **Performance Optimization**: Identify bottlenecks.
* **Reliability & Accountability**: Improve transparency.

LangDB supports two main concepts.

* [**Run**](https://docs.langdb.ai/docs/concepts/run): A complete end-to-end interaction between agents, grouped for easy tracking.
* [**Thread**](https://docs.langdb.ai/docs/concepts/thread): Aggregate multiple Runs into a single thread for a unified chat experience.

Example

Using the same **Run ID** and **Thread ID** across multiple agents ensures seamless tracking, maintaining context across interactions and providing a complete view of the workflow.

```python
import os
from uuid import uuid4

from openai import OpenAI

langdb_project_id = os.environ["LANGDB_PROJECT_ID"]  # LangDB Project ID
api_key = os.environ["LANGDB_API_KEY"]  # LangDB token

client = OpenAI(
    base_url=f"https://api.us-east-1.langdb.ai/{langdb_project_id}/v1",  # LangDB API base URL
    api_key=api_key,
)

# Generate the IDs once and reuse them for every agent in the workflow
thread_id = str(uuid4())
run_id = str(uuid4())

# Agent 1 handles the user message
response1 = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "developer", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    extra_headers={"x-thread-id": thread_id, "x-run-id": run_id},
)

# Agent 2 processes the response
response2 = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "developer", "content": "Processing user input."},
        {"role": "user", "content": response1.choices[0].message.content},
    ],
    extra_headers={"x-thread-id": thread_id, "x-run-id": run_id},
)
```

Check out the full Multi-Agent Tracing Example [here](https://github.com/langdb/langdb-samples/blob/main/examples/langchain/langchain-multi-agent/langchain-multi-agent.ipynb).
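The relationship between runs and threads described above can be sketched with a small helper that builds the tracing headers. The header names `x-thread-id` and `x-run-id` come from the example; the `tracing_headers` function itself is a hypothetical convenience and is not part of any LangDB SDK.

```python
from typing import Optional
from uuid import uuid4


def tracing_headers(thread_id: Optional[str] = None,
                    run_id: Optional[str] = None) -> dict:
    """Build the x-thread-id / x-run-id headers, generating any missing ID."""
    return {
        "x-thread-id": thread_id or str(uuid4()),
        "x-run-id": run_id or str(uuid4()),
    }


# One thread for the whole conversation, a fresh run per workflow execution
first_run = tracing_headers()
second_run = tracing_headers(thread_id=first_run["x-thread-id"])

print(first_run["x-thread-id"] == second_run["x-thread-id"])  # prints "True"
print(first_run["x-run-id"] == second_run["x-run-id"])        # prints "False"
```

Every agent call that belongs to the same workflow execution would then pass the same dictionary, for example `extra_headers=first_run`.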
# Working with Agent Frameworks

Enable end-to-end tracing for AI agent frameworks with LangDB's one-line init() integration.

LangDB integrates seamlessly with a variety of agent libraries to provide out-of-the-box tracing, observability, and cost insights. When you initialize the LangDB client adapter for your agent framework, LangDB **monkey-patches** the underlying client to inject tracing hooks, with no further code changes required.

### Prerequisites

* **LangDB Core** installed:

  ```
  pip install 'pylangdb'
  ```
* **Optional feature flags** (for framework-specific tracing):

  ```
  pip install 'pylangdb[]' # e.g. pylangdb[adk], pylangdb[openai]
  ```
* **Environment Variables** set:

  ```bash
  export LANGDB_API_KEY="xxxxx"
  export LANGDB_PROJECT_ID="xxxxx"
  ```

## Quick Start

Import and initialize once, before creating or running any agents:

```python
from pylangdb. import init

# Monkey-patch the client for tracing
init()

# ...then your existing agent setup...
```

> **Monkey-patching note:** The `init()` call wraps key client methods at runtime to capture telemetry. Ensure it runs as early as possible.

**GitHub Repo:**

### Example: Google ADK

```bash
pip install 'pylangdb[adk]'
```

```python
from pylangdb.adk import init

init()

from google.adk.agents import Agent

# (rest of your Google ADK agent code)
```

This is an example of a complete end-to-end trace using Google ADK and LangDB.

LangDB's ADK adapter captures request/response metadata, token usage, and latency metrics automatically. During initialization it discovers and wraps all agents and sub-agents in subfolders, linking their sessions for full end-to-end tracing across your workflow.

## Supported Frameworks


| Framework | Guide |
| --- | --- |
| Google ADK | working-with-google-adk |
| OpenAI Agents SDK | working-with-openai-agents-sdk |
| LangGraph | working-with-langgraph |
| Agno | working-with-agno |
| CrewAI | working-with-crewai |
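To make the monkey-patching behind `init()` more concrete, the sketch below shows the general technique on a stand-in class: a method is replaced at runtime with a wrapper that records telemetry and then calls the original. This is a generic Python illustration and not pylangdb's actual implementation. `FakeClient`, `init_tracing`, and the `captured` list are hypothetical names.

```python
import functools
import time


class FakeClient:
    """Stand-in for an LLM client; hypothetical, not a real SDK class."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


captured = []  # telemetry records collected by the wrapper


def init_tracing() -> None:
    """Replace FakeClient.complete with a wrapper that records each call."""
    original = FakeClient.complete
    if getattr(original, "_is_traced", False):
        return  # already patched, so calling init_tracing() twice is safe

    @functools.wraps(original)
    def traced(self, prompt: str) -> str:
        start = time.perf_counter()
        result = original(self, prompt)
        captured.append({
            "method": "complete",
            "prompt": prompt,
            "latency_s": time.perf_counter() - start,
        })
        return result

    traced._is_traced = True
    FakeClient.complete = traced


# Patch first, then use the client exactly as before
init_tracing()
client = FakeClient()
reply = client.complete("hello")
print(reply)          # prints "HELLO"
print(len(captured))  # prints "1"
```

Because the patch replaces the method on the class, only calls made after patching are recorded, which is why the guides ask you to call `init()` as early as possible.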

## Further Documentation

For full documentation including client capabilities, configuration, and detailed examples, check out the [Python SDK documentation](https://docs.langdb.ai/docs/python-sdk/getting-started) and [GitHub](https://github.com/langdb/pylangdb).

# Working with Google ADK

Instrument Google ADK pipelines with LangDB: capture nested agent flows, token usage, and latency metrics using a single init() call.

LangDB's Google ADK integration provides end-to-end tracing for your ADK agent pipelines.

Checkout: https://app.langdb.ai/sharing/threads/8425e068-77de-4f41-8aa9-d1111fc7d2b7

## Installation

Enable end-to-end tracing for your Google ADK agents by installing the `pylangdb` client with the ADK feature flag:

```bash
pip install 'pylangdb[adk]'
```

## Quick Start

Set your environment variables **before** running the script:

```bash
export LANGDB_API_KEY=""
export LANGDB_PROJECT_ID=""
```

Initialize LangDB **before** creating or running any ADK agents:

```python
from pylangdb.adk import init

# Initialise LangDB
init()

# Then proceed with your normal ADK setup:
from google.adk.agents import Agent

# ...define and run agents...
```

Once initialized, LangDB automatically discovers all agents and sub-agents (including nested folders), wraps their key methods at runtime, and links sessions for full end-to-end tracing across your workflow.

## Complete Google ADK Python Example

Here's a full example of a Google ADK agent implementation that you can instrument with LangDB. This sample is based on the official [Google ADK Quickstart](https://google.github.io/adk-docs/get-started/quickstart/).

### Example code

> Check out the full sample on GitHub:

### Setup Environment

```bash
pip install google-adk litellm 'pylangdb[adk]'
```

### Project Structure

Create the following project structure:

```
parent_folder/
└── multi_tool_agent/
    ├── __init__.py
    ├── agent.py
    └── .env
```

### \_\_init\_\_.py

Create an `__init__.py` file in the `multi_tool_agent` folder:

```python
from . import agent
```

### .env

Create a `.env` file for your secrets:

```bash
LANGDB_API_KEY=""
LANGDB_PROJECT_ID=""
```

### agent.py

Create an `agent.py` file with the following code:

```python
# First initialize LangDB before defining any agents
from pylangdb.adk import init

init()

import datetime
from zoneinfo import ZoneInfo

from google.adk.agents import Agent


def get_weather(city: str) -> dict:
    """Return a canned weather report; only New York is supported."""
    if city.lower() != "new york":
        return {
            "status": "error",
            "error_message": f"Weather information for '{city}' is not available.",
        }
    return {
        "status": "success",
        "report": "The weather in New York is sunny with a temperature of 25 degrees Celsius (77 degrees Fahrenheit).",
    }


def get_current_time(city: str) -> dict:
    """Return the current local time; only New York is supported."""
    if city.lower() != "new york":
        return {
            "status": "error",
            "error_message": f"Sorry, I don't have timezone information for {city}.",
        }
    tz = ZoneInfo("America/New_York")
    now = datetime.datetime.now(tz)
    return {
        "status": "success",
        "report": f'The current time in {city} is {now.strftime("%Y-%m-%d %H:%M:%S %Z%z")}',
    }


root_agent = Agent(
    name="weather_time_agent",
    model="gemini-2.0-flash",
    description=("Agent to answer questions about the time and weather in a city."),
    instruction=("You are a helpful agent who can answer user questions about the time and weather in a city."),
    tools=[get_weather, get_current_time],
)
```

### Running Your Agent

Navigate to the parent directory of your agent project and run the following command:

```bash
adk web
```

Open the URL provided (usually ) in your browser and select "multi\_tool\_agent" from the dropdown menu.

Once your agent is running, try this example query to test its functionality:

```
What's the weather in New York?
```

Queries like this trigger the agent to use the functions defined above and respond based on the agent workflow.

### Traces on LangDB

When you run queries against your ADK agent, LangDB automatically captures detailed traces of all agent interactions:

## Next Steps: Advanced Google ADK Integration

This guide covered the basics of integrating LangDB with Google ADK using a simple weather and time agent example. For more complex scenarios and advanced use cases, check out our comprehensive resources in the [Guides Section](https://app.gitbook.com/s/jwXvc8GvLP6kUvOASaLP/).

# Working with OpenAI Agents SDK

Trace OpenAI Agents SDK workflows end-to-end with LangDB: monitor model calls, tool invocations, and runner sessions via one-line init().

LangDB helps you add full tracing and observability to your OpenAI Agents SDK workflows without changing your core logic. With a one-line initialization, LangDB captures model calls, tool invocations, and intermediate steps, giving you a complete view of how your agent operates.

Checkout: https://app.langdb.ai/sharing/threads/53b87631-de7f-431a-a049-48556f899b4d

## Installation

Enable end-to-end tracing for your OpenAI Agents SDK agents by installing the `pylangdb` client with the `openai` feature flag:

```bash
pip install 'pylangdb[openai]'
```

## Quick Start

### Export Environment Variables

Set your LangDB credentials. The client configuration below also reads `LANGDB_API_BASE_URL`:

```bash
export LANGDB_API_KEY=""
export LANGDB_PROJECT_ID=""
export LANGDB_API_BASE_URL='https://api.us-east-1.langdb.ai'
```

### Initialize Tracing

Import and run `init()` **before** configuring your OpenAI client:

```python
from pylangdb.openai import init

# Initialise LangDB
init()
```

### Configure OpenAI Client and Agent Runner

```python
import asyncio
import os
import uuid

# Agent SDK imports
from agents import (
    Agent,
    Runner,
    set_default_openai_client,
    RunConfig,
    ModelProvider,
    Model,
    OpenAIChatCompletionsModel,
)
from openai import AsyncOpenAI

# Configure the OpenAI client with LangDB headers
client = AsyncOpenAI(
    api_key=os.environ["LANGDB_API_KEY"],
    base_url=os.environ["LANGDB_API_BASE_URL"],
    default_headers={"x-project-id": os.environ["LANGDB_PROJECT_ID"]},
)
set_default_openai_client(client)


# Create a custom model provider for advanced routing
class CustomModelProvider(ModelProvider):
    def get_model(self, model_name: str | None) -> Model:
        return OpenAIChatCompletionsModel(model=model_name, openai_client=client)


# Register your custom model provider to route model calls through LangDB
CUSTOM_MODEL_PROVIDER = CustomModelProvider()

agent = Agent(
    name="Math Tutor",
    instructions="You are a helpful assistant",
    model="openai/gpt-4.1",  # Choose any model available on LangDB
)

# Assign a unique group_id to link all steps in this session trace
group_id = str(uuid.uuid4())


async def main():
    # Runner.run is a coroutine, so it must be awaited inside an async function
    response = await Runner.run(
        agent,
        input="Hello, world!",
        run_config=RunConfig(
            model_provider=CUSTOM_MODEL_PROVIDER,  # Inject custom model provider
            group_id=group_id,  # Link all steps to the same trace
        ),
    )
    print(response.final_output)


asyncio.run(main())
```

Once executed, LangDB links all steps (model calls, intermediate tool usage, and runner orchestration) into a single session trace.
## Complete OpenAI Agents SDK Example

Here is a full example based on the [OpenAI Agents SDK Quickstart](https://openai.github.io/openai-agents-python/quickstart/) which uses LangDB Tracing.

### Example code

> Check out the full sample on GitHub:

### Setup Environment

```
pip install openai-agents 'pylangdb[openai]'
```

### Export Environment Variables

```bash
export LANGDB_API_KEY=""
export LANGDB_PROJECT_ID=""
export LANGDB_API_BASE_URL='https://api.us-east-1.langdb.ai'
```

### main.py

```python
# Initialize LangDB tracing
from pylangdb.openai import init

init()

# Agent SDK imports
from agents import (
    Agent,
    Runner,
    set_default_openai_client,
    set_default_openai_key,
    set_default_openai_api,
    RunConfig,
    ModelProvider,
    Model,
    OpenAIChatCompletionsModel,
)
from openai import AsyncOpenAI

import os
import uuid
import asyncio

# Configure the OpenAI client with LangDB headers
client = AsyncOpenAI(
    api_key=os.environ["LANGDB_API_KEY"],
    base_url=os.environ["LANGDB_API_BASE_URL"],
    default_headers={"x-project-id": os.environ["LANGDB_PROJECT_ID"]},
)

# Set the configured client as default with tracing enabled
set_default_openai_client(client, use_for_tracing=True)
set_default_openai_api(api="chat_completions")
# set_default_openai_key(os.environ["LANGDB_API_KEY"])


# Create a custom model provider for advanced routing
class CustomModelProvider(ModelProvider):
    def get_model(self, model_name: str | None) -> Model:
        return OpenAIChatCompletionsModel(model=model_name, openai_client=client)


# Register your custom model provider to route model calls through LangDB
CUSTOM_MODEL_PROVIDER = CustomModelProvider()

math_tutor_agent = Agent(
    name="Math Tutor",
    handoff_description="Specialist agent for math questions",
    instructions="You provide help with math problems. Explain your reasoning at each step and include examples",
    model="anthropic/claude-3.7-sonnet",
)

history_tutor_agent = Agent(
    name="History Tutor",
    handoff_description="Specialist agent for historical questions",
    instructions="You provide assistance with historical queries. Explain important events and context clearly.",
    model="gemini/gemini-2.0-flash",  # Choose any model available on LangDB
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="You determine which agent to use based on the user's homework question",
    handoffs=[history_tutor_agent, math_tutor_agent],
    model="openai/gpt-4o-mini",  # Choose any model available on LangDB
)

# Assign a unique group_id to link all steps in this session trace
group_id = str(uuid.uuid4())


# Define async function to run the agent
async def run_agent():
    response = await Runner.run(
        triage_agent,
        input="who was the first president of the united states?",
        run_config=RunConfig(
            model_provider=CUSTOM_MODEL_PROVIDER,  # Inject custom model provider
            group_id=group_id,  # Link all steps to the same trace
        ),
    )
    print(response.final_output)


# Run the async function with asyncio
asyncio.run(run_agent())
```

### Running Your Agent

Navigate to the parent directory of your agent project and run:

```bash
python main.py
```

#### Output:

```
The first president of the United States was **George Washington**.

Here's some important context:

* **The American Revolution (1775-1783):** Washington was the commander-in-chief of the Continental Army during the Revolutionary War. His leadership was crucial in securing American independence from Great Britain.
* **The Articles of Confederation (1781-1789):** After the war, the United States was governed by the Articles of Confederation. This system proved to be weak and ineffective, leading to calls for a stronger national government.
* **The Constitutional Convention (1787):** Delegates from the states met in Philadelphia to revise the Articles of Confederation. Instead, they drafted a new Constitution that created a more powerful federal government. Washington presided over the convention, lending his prestige and influence to the process.
* **The Constitution and the Presidency:** The Constitution established the office of the President of the United States.
* **Election of 1789:** George Washington was unanimously elected as the first president by the Electoral College in 1789. There were no opposing candidates. This reflected the immense respect and trust the nation had in him.
* **First Term (1789-1793):** Washington established many precedents for the presidency, including the formation of a cabinet, the practice of delivering an annual address to Congress, and the idea of serving only two terms. He focused on establishing a stable national government, paying off the national debt, and maintaining neutrality in foreign affairs.
* **Second Term (1793-1797):** Washington faced challenges such as the Whiskey Rebellion and growing partisan divisions. He decided to retire after two terms, setting another crucial precedent for peaceful transitions of power.
* **Significance:** Washington's leadership and integrity were essential in establishing the legitimacy and credibility of the new government. He is often considered the "Father of His Country" for his pivotal role in the founding of the United States.
```

### Traces on LangDB

When you run queries against your agent, LangDB automatically captures detailed traces of all agent interactions:

Trace of simple OpenAI Agents SDK Sample on LangDB
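The `main.py` example above reads three environment variables with `os.environ[...]`, which raises a bare `KeyError` if one of them is missing. A small pre-flight check can give a clearer message. This is a sketch under stated assumptions: the variable names are taken from the example, while the `missing_env_vars` helper is hypothetical and not part of pylangdb.

```python
import os
from typing import Mapping

# Variable names taken from the main.py example above
REQUIRED_VARS = ("LANGDB_API_KEY", "LANGDB_PROJECT_ID", "LANGDB_API_BASE_URL")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment
example_env = {"LANGDB_API_KEY": "xxxxx", "LANGDB_PROJECT_ID": ""}
print(missing_env_vars(example_env))
# prints "['LANGDB_PROJECT_ID', 'LANGDB_API_BASE_URL']"
```

Calling `missing_env_vars()` with no argument checks the real environment, so a script could exit early with the list of missing names before it calls `init()`.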

## Next Steps: Advanced OpenAI Agents SDK Integration

This guide covered the basics of integrating LangDB with OpenAI Agents SDK using a history and maths agent example. For more complex scenarios and advanced use cases, check out our comprehensive resources in the [Guides Section](https://app.gitbook.com/s/jwXvc8GvLP6kUvOASaLP/).

# Working with LangGraph

Automatically instrument LangChain chains and agents with LangDB: gain live traces, cost analytics, and latency insights through init().

LangDB provides seamless tracing and observability for LangChain-based applications.

Checkout: https://app.langdb.ai/sharing/threads/bfced28f-5966-4491-867e-fd3875fe3282

## Installation

Install the LangDB client with LangChain support:

```bash
pip install 'pylangdb[langchain]'
```

## Quick Start

### Export Environment Variables

```bash
export LANGDB_API_KEY=""
export LANGDB_PROJECT_ID=""
export LANGDB_API_BASE_URL='https://api.us-east-1.langdb.ai'
```

### Initialize LangDB

Import and run `init()` **before** configuring your LangChain/LangGraph code:

```python
from pylangdb.langchain import init

# Initialise LangDB
init()
```

### Define your Agent

```python
# Your existing LangChain code works with proper configuration.
# These imports match the complete example below and need the langchain_openai package.
import os

from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

api_base = "https://api.us-east-1.langdb.ai"
api_key = os.getenv("LANGDB_API_KEY")
project_id = os.getenv("LANGDB_PROJECT_ID")

# Default headers for API requests
default_headers: dict[str, str] = {
    "x-project-id": project_id
}

# Initialize OpenAI LLM with LangDB configuration
llm = ChatOpenAI(
    model_name="gpt-4o",
    temperature=0.3,
    openai_api_base=api_base,
    openai_api_key=api_key,
    default_headers=default_headers,
)

result = llm.invoke([HumanMessage(content="Hello, LangDB!")])
```

Once LangDB is initialized, all calls to `llm`, intermediate steps, tool executions, and nested chains are automatically traced and linked under a single session.

## Complete LangGraph Agent Example

Here is a full LangGraph example based on a ReAct agent which uses LangDB Tracing.
### Example code

> Check out the full sample on GitHub:

### Setup Environment

Install the libraries using pip:

```bash
pip install langgraph 'pylangdb[langchain]' langchain_openai geopy
```

### Export Environment Variables

```bash
export LANGDB_API_KEY=""
export LANGDB_PROJECT_ID=""
export LANGDB_API_BASE_URL='https://api.us-east-1.langdb.ai'
```

### main.py

```python
# Initialize LangDB tracing
from pylangdb.langchain import init

init()

import os
from typing import Annotated, Sequence, TypedDict
from datetime import datetime

# Import required libraries
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage, ToolMessage
from langchain_core.tools import tool
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from geopy.geocoders import Nominatim
from pydantic import BaseModel, Field
import requests


# Initialize the model
def create_model():
    """Create and return the ChatOpenAI model."""
    api_base = os.getenv("LANGDB_API_BASE_URL")
    api_key = os.getenv("LANGDB_API_KEY")
    project_id = os.getenv("LANGDB_PROJECT_ID")
    default_headers = {
        "x-project-id": project_id,
    }
    llm = ChatOpenAI(
        model_name='openai/gpt-4o',  # Choose any model from LangDB
        temperature=0.3,
        openai_api_base=api_base,
        openai_api_key=api_key,
        default_headers=default_headers
    )
    return llm


# Define the agent state
class AgentState(TypedDict):
    """The state of the agent."""
    messages: Annotated[Sequence[BaseMessage], add_messages]
    number_of_steps: int


# Define the weather tool
class SearchInput(BaseModel):
    location: str = Field(description="The city and state, e.g., San Francisco")
    date: str = Field(description="The forecasting date in format YYYY-MM-DD")


@tool("get_weather_forecast", args_schema=SearchInput, return_direct=True)
def get_weather_forecast(location: str, date: str) -> dict:
    """
    Retrieves the weather using Open-Meteo API for a given location (city) and a date (yyyy-mm-dd).
    Returns a dictionary with the time and temperature for each hour.
    """
    geolocator = Nominatim(user_agent="weather-app")
    location = geolocator.geocode(location)
    if not location:
        return {"error": "Location not found"}
    try:
        response = requests.get(
            f"https://api.open-meteo.com/v1/forecast?"
            f"latitude={location.latitude}&"
            f"longitude={location.longitude}&"
            "hourly=temperature_2m&"
            f"start_date={date}&end_date={date}",
            timeout=10
        )
        response.raise_for_status()
        data = response.json()
        return {
            time: f"{temp}°C"
            for time, temp in zip(
                data["hourly"]["time"],
                data["hourly"]["temperature_2m"]
            )
        }
    except Exception as e:
        return {"error": f"Failed to fetch weather data: {str(e)}"}


# Define the nodes
def call_model(state: AgentState) -> dict:
    """Call the model with the current state and return the response."""
    # bind_tools returns a new runnable, so keep the result
    model = create_model().bind_tools([get_weather_forecast])
    messages = state["messages"]
    response = model.invoke(messages)
    return {"messages": [response], "number_of_steps": state["number_of_steps"] + 1}


def route_to_tool(state: AgentState) -> str:
    """Determine the next step based on the model's response."""
    messages = state["messages"]
    last_message = messages[-1]
    if hasattr(last_message, 'tool_calls') and last_message.tool_calls:
        return "call_tool"
    return END


# Create the graph
def create_agent():
    """Create and return the LangGraph agent."""
    # Create the graph
    workflow = StateGraph(AgentState)
    workflow.add_node("call_model", call_model)
    workflow.add_node("call_tool", ToolNode([get_weather_forecast]))
    workflow.set_entry_point("call_model")
    workflow.add_conditional_edges(
        "call_model",
        route_to_tool,
        {
            "call_tool": "call_tool",
            END: END
        }
    )
    workflow.add_edge("call_tool", "call_model")
    return workflow.compile()


def main():
    agent = create_agent()
    query = f"What's the weather in Paris today? Today is {datetime.now().strftime('%Y-%m-%d')}."
    initial_state = {
        "messages": [HumanMessage(content=query)],
        "number_of_steps": 0
    }
    print(f"Query: {query}")
    print("\nRunning agent...\n")
    for output in agent.stream(initial_state):
        for key, value in output.items():
            if key == "__end__":
                continue
            print(f"\n--- {key.upper()} ---")
            if key == "messages":
                for msg in value:
                    if hasattr(msg, 'content'):
                        print(f"{msg.type}: {msg.content}")
                    if hasattr(msg, 'tool_calls') and msg.tool_calls:
                        print(f"Tool Calls: {msg.tool_calls}")
            else:
                print(value)


if __name__ == "__main__":
    main()
```

### Running your Agent

Navigate to the parent directory of your agent project and run:

```bash
python main.py
```

#### Output

{% code overflow="wrap" %}
```bash
--- CALL_MODEL ---
{'messages': [AIMessage(content="The weather in Paris on July 1, 2025, is as follows:\n\n- 00:00: 28.1°C\n- 01:00: 27.0°C\n- 02:00: 26.3°C\n- 03:00: 25.7°C\n- 04:00: 25.1°C\n- 05:00: 24.9°C\n- 06:00: 25.8°C\n- 07:00: 27.6°C\n- 08:00: 29.6°C\n- 09:00: 31.7°C\n- 10:00: 33.7°C\n- 11:00: 35.1°C\n- 12:00: 36.3°C\n- 13:00: 37.3°C\n- 14:00: 38.6°C\n- 15:00: 37.9°C\n- 16:00: 38.1°C\n- 17:00: 37.8°C\n- 18:00: 37.3°C\n- 19:00: 35.3°C\n- 20:00: 33.2°C\n- 21:00: 30.8°C\n- 22:00: 28.7°C\n- 23:00: 27.3°C\n\nIt looks like it's going to be a hot day in Paris!", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 319, 'prompt_tokens': 585, 'total_tokens': 904, 'completion_tokens_details': None, 'prompt_tokens_details': None, 'cost': 0.005582999999999999}, 'model_name': 'gpt-4o', 'system_fingerprint': None, 'id': '3bbde343-79e3-4d8f-bd97-b07179ee92c0', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None}, id='run--4fd3896d-1fbd-4c91-9c21-bd6cf3d2949e-0', usage_metadata={'input_tokens': 585, 'output_tokens': 319, 'total_tokens': 904, 'input_token_details': {}, 'output_token_details': {}})], 'number_of_steps': 2}
```
{% endcode %}

### Traces on LangDB

When you run queries against your agent, LangDB automatically captures detailed traces of all agent interactions:

## Next Steps: Advanced LangGraph Integration

This guide covered the basics of integrating LangDB with LangGraph using a ReAct agent example. For more complex scenarios and advanced use cases, check out our comprehensive resources in the [Guides Section](https://app.gitbook.com/s/jwXvc8GvLP6kUvOASaLP/).

# Working with Agno

Unlock full observability for Agno agents and tasks: capture LLM calls, task execution, and agent interactions with LangDB's init().

LangDB's Agno integration provides end-to-end tracing for your Agno agent pipelines.

Checkout: https://app.langdb.ai/sharing/threads/8a44dccc-c679-4fc3-9555-a07de103d637

## Installation

Install the LangDB client with the Agno feature flag:

```bash
pip install 'pylangdb[agno]'
```

## Quick Start

### Export Environment Variables

Set your LangDB credentials:

```bash
export LANGDB_API_KEY=""
export LANGDB_PROJECT_ID=""
```

### Initialize Tracing

Import and run `init()` **before** configuring your Agno code:

```python
from pylangdb.agno import init

# Initialise LangDB
init()
```

### Configure your Agno code

```python
import os

from pylangdb.agno import init

init()

from agno.agent import Agent
from agno.tools.duckduckgo import DuckDuckGoTools
from agno.models.langdb import LangDB

# Configure LangDB-backed model
langdb_model = LangDB(
    id="openai/gpt-4",
    api_key=os.getenv("LANGDB_API_KEY"),
    project_id=os.getenv("LANGDB_PROJECT_ID"),
)

# Create and run your agent
agent = Agent(
    name="Web Agent",
    role="Search the web for information",
    model=langdb_model,
    tools=[DuckDuckGoTools()],
    instructions="Answer questions using web search",
)

response = agent.run("What is LangDB?")
print(response)
```

All Agno interactions, from invocation through tool calls to final output, are traced with LangDB.

## Complete Agno Example

Here is a full example based on the Web Search Agno Multi Agent Team.
### Example code > Check out the full sample on GitHub: ### Setup Environment ``` pip install agno 'pylangdb[agno]' duckduckgo-search ``` ### Export Environment Variables ```bash export LANGDB_API_KEY="" export LANGDB_PROJECT_ID="" ``` ### main.py ```python import os from textwrap import dedent # Initialize LangDB tracing and import model from pylangdb.agno import init init() from agno.models.langdb import LangDB # Import Agno agent components from agno.agent import Agent from agno.tools.duckduckgo import DuckDuckGoTools # Function to create a LangDB model with selectable model name def create_langdb_model(model_name="openai/gpt-4.1"): return LangDB( id=model_name, api_key=os.getenv("LANGDB_API_KEY"), project_id=os.getenv("LANGDB_PROJECT_ID"), ) web_agent = Agent( name="Web Agent", role="Search the web for comprehensive information and current data", model=create_langdb_model("openai/gpt-4.1"), tools=[DuckDuckGoTools()], instructions="Always use web search tools to find current and accurate information. Search for multiple aspects of the topic to gather comprehensive data.", show_tool_calls=True, markdown=True, ) writer_agent = Agent( name="Writer Agent", role="Write comprehensive article on the provided topic", model=create_langdb_model("anthropic/claude-3.7-sonnet"), instructions="Use outlines to write articles", show_tool_calls=True, markdown=True, ) agent_team = Agent( name="Research Team", team=[web_agent, writer_agent], model=create_langdb_model("gemini/gemini-2.0-flash"), instructions=dedent("""\ You are the coordinator of a research team with two specialists: 1. Web Agent: Has DuckDuckGo search tools and must be used for ALL research tasks 2. Writer Agent: Specializes in creating comprehensive articles WORKFLOW: 1. ALWAYS delegate research tasks to the Web Agent first 2. The Web Agent MUST use web search tools to gather current information 3. Then delegate writing tasks to the Writer Agent using the research findings 4. 
Ensure comprehensive coverage of the topic through multiple searches IMPORTANT: Never attempt to answer without first having the Web Agent conduct searches. """), show_tool_calls=True, markdown=True, ) agent_team.print_response( "I need a comprehensive article about the Eiffel Tower. " "Please have the Web Agent search for current information about its history, architectural significance, and cultural impact. " "Then have the Writer Agent create a detailed article based on the research findings.", stream=True ) ``` ### Running your Agent Navigate to the parent directory of your agent project and use one of the following commands: ```bash python main.py ``` ### Traces on LangDB When you run queries against your agent, LangDB automatically captures detailed traces of all agent interactions:

## Next Steps: Advanced Agno Integration

This guide covered the basics of integrating LangDB with Agno using a Web Search agent example. For more complex scenarios and advanced use cases, check out our comprehensive resources in the [Guides Section](https://app.gitbook.com/s/jwXvc8GvLP6kUvOASaLP/).

# Working with CrewAI

Unlock full observability for CrewAI agents and tasks: capture LLM calls, task execution, and agent interactions with LangDB’s init().

LangDB makes it effortless to trace CrewAI workflows end-to-end. With a single `init()` call, all agent interactions, task executions, and LLM calls are captured.

Check out: https://app.langdb.ai/sharing/threads/3becbfed-a1be-ae84-ea3c-4942867a3e22

## Installation

Install the LangDB client with the CrewAI feature flag:

```bash
pip install 'pylangdb[crewai]'
```

## Quick Start

### Export Environment Variables

Set your LangDB credentials:

```bash
export LANGDB_API_KEY=""
export LANGDB_PROJECT_ID=""
```

### Initialize Tracing

Import and call `init()` **before** configuring your CrewAI code:

```python
from pylangdb.crewai import init

# Initialise LangDB
init()
```

### Configure your CrewAI code

```python
import os
from dotenv import load_dotenv
from crewai import Agent, Task, Crew, LLM

# Load variables from a local .env file, if present
load_dotenv()

# Configure LLM with LangDB headers
llm = LLM(
    model="openai/gpt-4o",  # Use LiteLLM-style model names
    api_key=os.getenv("LANGDB_API_KEY"),
    base_url=os.getenv("LANGDB_API_BASE_URL"),
    extra_headers={"x-project-id": os.getenv("LANGDB_PROJECT_ID")}
)

# Define agents and tasks as usual
researcher = Agent(
    role="researcher",
    goal="Research topic thoroughly",
    backstory="You are an expert researcher",
    llm=llm,
    verbose=True
)
task = Task(
    description="Research the given topic",
    expected_output="A short research summary",  # CrewAI tasks need an expected output
    agent=researcher
)
crew = Crew(agents=[researcher], tasks=[task])

# Kick off the workflow
result = crew.kickoff()
print(result)
```

All CrewAI calls (agent initialization, task execution, and model responses) are automatically linked.

## Complete CrewAI example

Here is a full example based on CrewAI report writing agent.
### Example code

> Check out the full sample on GitHub:

### Setup Environment

```bash
pip install crewai 'pylangdb[crewai]' crewai_tools setuptools python-dotenv
```

### Export Environment Variables

You also need an API key from [Serper.dev](https://serper.dev/) for the search tool.

```bash
export LANGDB_API_KEY=""
export LANGDB_PROJECT_ID=""
export LANGDB_API_BASE_URL='https://api.us-east-1.langdb.ai'
export SERPER_API_KEY=""
```

### main.py

```python
#!/usr/bin/env python3
import os
import sys

# Initialize LangDB tracing before importing CrewAI
from pylangdb.crewai import init
init()

from dotenv import load_dotenv
from crewai import Agent, Task, Crew, Process, LLM
from crewai_tools import SerperDevTool

load_dotenv()

def create_llm(model):
    return LLM(
        model=model,
        api_key=os.environ.get("LANGDB_API_KEY"),
        base_url=os.environ.get("LANGDB_API_BASE_URL"),
        extra_headers={"x-project-id": os.environ.get("LANGDB_PROJECT_ID")}
    )

class ResearchPlanningCrew:
    def researcher(self) -> Agent:
        return Agent(
            role="Research Specialist",
            goal="Research topics thoroughly",
            backstory="Expert researcher with skills in finding information",
            tools=[SerperDevTool()],
            llm=create_llm("openai/gpt-4o"),
            verbose=True
        )

    def planner(self) -> Agent:
        return Agent(
            role="Strategic Planner",
            goal="Create actionable plans based on research",
            backstory="Strategic planner who breaks down complex challenges",
            reasoning=True,
            max_reasoning_attempts=3,
            llm=create_llm("openai/anthropic/claude-3.7-sonnet"),
            verbose=True
        )

    def research_task(self, topic: str, agent: Agent) -> Task:
        return Task(
            description=f"Research {topic} thoroughly and compile information",
            agent=agent,
            expected_output="Comprehensive research report"
        )

    def planning_task(self, topic: str, agent: Agent, research: Task) -> Task:
        return Task(
            description=f"Create a strategic plan for {topic} based on research",
            agent=agent,
            expected_output="Strategic execution plan with phases and goals",
            context=[research]
        )

    def crew(self, topic: str) -> Crew:
        # Build each agent and task once so the crew, the tasks, and the
        # planning context all refer to the same instances
        researcher = self.researcher()
        planner = self.planner()
        research = self.research_task(topic, researcher)
        planning = self.planning_task(topic, planner, research)
        return Crew(
            agents=[researcher, planner],
            tasks=[research, planning],
            verbose=True,
            process=Process.sequential
        )

def main():
    topic = sys.argv[1] if len(sys.argv) > 1 else "Artificial Intelligence in Healthcare"
    result = ResearchPlanningCrew().crew(topic).kickoff()
    print(result)

if __name__ == "__main__":
    main()
```

### Running your Agent

Navigate to your agent project directory and run the following command:

```bash
python main.py
```

### Traces on LangDB

When you run queries against your agent, LangDB automatically captures detailed traces of all agent interactions:

## Next Steps: Advanced CrewAI Integration This guide covered the basics of integrating LangDB with CrewAI using a Research and Planning agent example. For more complex scenarios and advanced use cases, check out our comprehensive resources in [Guides Section](https://app.gitbook.com/s/jwXvc8GvLP6kUvOASaLP/). # Working with MCPs Learn how to connect to MCP Servers using LangDB AI Gateway Instantly connect to managed MCP servers — skip the setup and start using fully managed MCPs with built-in authentication, seamless scalability, and full tracing. This guide gives you a quick walkthrough of how to get started with MCPs. ## Quick Example In this example, we’ll create a [**Virtual MCP Server**](https://docs.langdb.ai/docs/concepts/virtual-mcp-servers) by combining Slack and Gmail MCPs — and then connect it to an MCP Client like **Cursor** for instant access inside your chats.

Quick guide on how to set up a LangDB Virtual Model in Windsurf, Claude, and Cursor.

### Steps:

1. **Select Slack and Gmail** from MCP Servers in the Virtual MCP section.
2. **Generate** a Virtual MCP URL automatically.
3. **Install** the MCP into Cursor with a single command.

Example install command:

```bash
npx @langdb/mcp setup slack_gmail_virtual https://api.langdb.ai/mcp/xxxxx --client cursor
```

## What Happens Under the Hood?

* Authentication is handled (via OAuth or API Key)
* Full tracing and observability are available (inputs, outputs, errors, latencies)
* MCP tools are treated just like normal function calls inside LangDB

## Next Steps:

* MCP Servers listed on LangDB:
* Explore [MCP Usecases](https://app.gitbook.com/o/BdRlRaljD2qHr7uIBrM0/s/jwXvc8GvLP6kUvOASaLP/).

# Working with Headers

Explore how LangDB API headers like x-thread-id, x-run-id, x-label, and x-project-id improve LLM tracing, observability, and session tracking for better API management and debugging.

LangDB API provides robust support for **HTTP headers**, enabling developers to manage API requests efficiently with enhanced tracing, observability, and organization. These headers play a crucial role in structuring interactions with multiple LLMs by providing **tracing, request tracking, and session continuity**, making it easier to monitor and analyze API usage.

## Thread ID (x-thread-id)

**Usage:** Groups multiple related requests under the same conversation.

* Useful for tracking interactions over a single user session.
* Helps maintain context across multiple messages.

## Thread Title (x-thread-title)

**Usage:** Assigns a custom, human-readable title to a thread.

* This title is displayed in the LangDB UI, making it easier to identify and search for specific conversations.

## Public Thread (x-thread-public)

**Usage:** Makes a thread publicly accessible via a shareable link.

* Set the value to `1` or `true` to enable public sharing.
* The public URL will be: `https://app.langdb.ai/sharing/threads/{thread_id}` * The `x-thread-title`, if set, will be displayed on the public thread page. Check [Threads](https://docs.langdb.ai/docs/concepts/thread) for more details. ## Run ID (x-run-id) **Usage:** Tracks a unique workflow execution in LangDB, such as a model call or tool invocation. * Enables precise tracking and debugging. * Each Run is independent for better observability. Check [Run](https://docs.langdb.ai/docs/concepts/run) for more details. ## Label (x-label) **Usage:** Adds a custom tag or label to a LLM Model Call for easier categorization. * Helps with tracing multiple agents. Check [Label](https://docs.langdb.ai/docs/concepts/label) for more details. ## Project ID (x-project-id) **Usage:** Identifies the project under which the request is being made. * Helps in cost tracking, monitoring, and organizing API calls within a specific project. * Can be set in headers or directly in the API base URL `https://api.us-east-1.langdb.ai/${langdbProjectId}/v1` # User Tracking Track users in LangDB AI Gateway to analyze usage, optimize performance, and improve chatbot experiences. LangDB AI enables user tracking to collect analytics and monitor usage patterns efficiently. By associating metadata with requests, developers can analyze interactions, optimize performance, and enhance user experience. ## Example: Chatbot Analytics with User Tracking For a chatbot service handling multiple users, tracking enables: * **Recognizing returning users**: Maintain conversation continuity. * **Tracking usage trends**: Identify common queries to improve responses. * **User segmentation**: Categorize users using tags (e.g., "websearch", "support"). * **Analytics**: Identify heavy users and allocate resources efficiently. 
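As a rough illustration of the metadata described above, the sketch below builds a chat completion request body with user-tracking fields attached. The `extra.user` field names follow the curl example in this section; the helper name `build_tracked_request` and the sample values are assumptions made for illustration and are not part of LangDB's API.

```python
import json


def build_tracked_request(model, prompt, user_id, user_name, tags):
    """Return a chat completion body with user-tracking metadata.

    Hypothetical helper: only the `extra.user` field names come from the
    LangDB docs; the function itself is illustrative.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "extra": {
            "user": {
                "id": str(user_id),   # unique user identifier
                "name": user_name,    # user alias
                "tags": list(tags),   # custom tags used for segmentation
            }
        },
    }


body = build_tracked_request(
    model="openai/gpt-4o-mini",
    prompt="Write a bubble sort function",
    user_id=7,
    user_name="mrunmay",
    tags=["coding", "software"],
)

# Serialize the body exactly as it would be sent over HTTP
print(json.dumps(body, indent=2))
```

If you call the gateway through the OpenAI Python SDK, the same `extra` object can most likely be passed via the `extra_body` argument of `chat.completions.create`; treat that as an assumption to verify against your SDK version.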
```bash
curl 'https://api.us-east-1.langdb.ai/v1/chat/completions' \
  -H 'authorization: Bearer LangDBApiKey' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "openai/gpt-4o-mini",
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "Def bubbleSort()"
      }
    ],
    "extra": {
      "user": {
        "id": "7",
        "name": "mrunmay",
        "tags": ["coding", "software"]
      }
    }
  }'
```

### User Tracking Fields

* **`extra.user.id`**: Unique user identifier.
* **`extra.user.name`**: User alias.
* **`extra.user.tags`**: Custom tags to classify users (e.g., "coding", "software").

## Fetching User Analytics & Usage Data

Once users are tracked, analytics and usage APIs can be used to retrieve insights based on `id`, `name`, or `tags`.

Check out the [Usage](https://docs.langdb.ai/docs/features/usage#usage-apis) and [Analytics](https://docs.langdb.ai/docs/features/analytics#analytics-apis) sections for more details.

Example:

```bash
curl -L \
  --request POST \
  --url 'https://api.us-east-1.langdb.ai/analytics/summary' \
  --header 'Authorization: Bearer langDBAPIKey' \
  --header 'X-Project-Id: langDBProjectID' \
  --header 'Content-Type: application/json' \
  --data '{
    "user_id": "7",
    "user_name": "mrunmay",
    "user_tags": ["software", "code"]
  }'
```

Example response:

```json
{
  "summary": [
    {
      "total_cost": 0.00030366,
      "total_requests": 1,
      "total_duration": 6240.888,
      "avg_duration": 6240.9,
      "duration": 6240.9,
      "duration_p99": 6240.9,
      "duration_p95": 6240.9,
      "duration_p90": 6240.9,
      "duration_p50": 6240.9,
      "total_input_tokens": 1139,
      "total_output_tokens": 137,
      "avg_ttft": 6240.9,
      "ttft": 6240.9,
      "ttft_p99": 6240.9,
      "ttft_p95": 6240.9,
      "ttft_p90": 6240.9,
      "ttft_p50": 6240.9,
      "tps": 204.46,
      "tps_p99": 204.46,
      "tps_p95": 204.46,
      "tps_p90": 204.46,
      "tps_p50": 204.46,
      "tpot": 0.05,
      "tpot_p99": 0.05,
      "tpot_p95": 0.05,
      "tpot_p90": 0.05,
      "tpot_p50": 0.05,
      "error_rate": 0.0,
      "error_request_count": 0
    }
  ],
  "start_time_us": 1737547895565066,
  "end_time_us": 1740139895565066
}
```

# Using Parameters

Configure temperature,
max\_tokens, logit\_bias, and more with LangDB AI Gateway. Test easily via API, UI, or Playground.

LangDB AI Gateway supports every LLM parameter like temperature, max\_tokens, stop sequences, logit\_bias, and more.

## API Usage:

{% tabs %}
{% tab title="Python" %}
```python
import os
from openai import OpenAI

# Point the OpenAI client at the LangDB gateway
client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai",
    api_key=os.getenv("LANGDB_API_KEY"),
)

response = client.chat.completions.create(
    model="gpt-4o",  # Change Model
    messages=[
        {"role": "user", "content": "What are the earnings of Apple in 2022?"},
    ],
    temperature=0.7,  # temperature parameter
    max_tokens=150,  # max_tokens parameter
    stream=True  # stream parameter
)
```
{% endtab %}

{% tab title="TypeScript" %}
```typescript
const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages,
  temperature: 0.7, // temperature parameter
  max_tokens: 150, // max_tokens parameter
  logit_bias: { '50256': -100 }, // logit_bias parameter
  stream: true, // stream parameter
});
```
{% endtab %}
{% endtabs %}

## UI

You can also use the UI to test various parameters and get code snippets.

### Playground

Use the Playground to tweak parameters in real time via the [Virtual Model config](https://docs.langdb.ai/docs/concepts/virtual-models) and send test requests instantly.

Trying different Parameters for chat completions through LangDB Playground

### Samples Explore ready-made code snippets complete with preconfigured parameters—copy, paste, and customize to fit your needs.

Trying different Parameters for chat completions through LangDB Samples

# Thread Use LangDB Threads to group messages, maintain conversation context, and enable seamless multi-turn interactions. A **Thread** is simply a grouping of **Message History** that maintains context in a conversation or workflow. Threads are useful for keeping track of past messages and ensuring continuity across multiple exchanges.
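To make the grouping concrete, here is a minimal sketch of how a client might reuse one thread ID across several turns. The header names `x-thread-id`, `x-thread-title`, and `x-thread-public` come from the Working with Headers section; the helper `thread_headers` is hypothetical, and using a UUID as the thread ID is an assumption based on the shared thread URLs shown elsewhere in these docs.

```python
import uuid


def thread_headers(thread_id, title=None, public=False):
    """Build the thread-related headers for a single request (hypothetical helper)."""
    headers = {"x-thread-id": thread_id}
    if title is not None:
        headers["x-thread-title"] = title
    if public:
        headers["x-thread-public"] = "true"
    return headers


# Generate one ID per conversation and reuse it on every turn
thread_id = str(uuid.uuid4())

first_turn = thread_headers(thread_id, title="Trip planning")
follow_up = thread_headers(thread_id)

print(first_turn)
print(follow_up)
```

Both dictionaries carry the same `x-thread-id`, so the two requests would be grouped under a single Thread. With the OpenAI Python SDK, headers like these can typically be passed through the `extra_headers` argument of a request.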

**Core Features:** * **Contextual Continuity**: Ensures all related Runs are grouped for better observability. * **Multi-Turn Support**: Simplifies managing interactions that require maintaining state across multiple Runs. **Example:** A user interacting with a chatbot over multiple turns (e.g., asking follow-up questions) generates several messages, but all are grouped under a single **Thread** to maintain continuity. **Headers for Thread:** * `x-thread-id`: Links all Runs in the same context or conversations. * `x-thread-title`: Assigns a custom, human-readable title to the thread, making it easier to identify. * `x-thread-public`: Makes the thread publicly accessible via a shareable link by setting its value to `1` or `true`. # Trace Track complete workflows with LangDB Traces. Get end-to-end visibility, multi-agent support, and error diagnosis. A **Trace** represents the **complete lifecycle of a workflow**, spanning all components and systems involved.

An example of how a Trace linked to a Thread looks on LangDB

**Core Features:** * **End-to-End Visibility**: Tracks model calls, tools across the entire workflow. * **Multi Agent Ready**: Perfect for workflows that involve multiple services, APIs, or tools. * **Error Diagnosis**: Quickly identify bottlenecks, failures, or inefficiencies in complex workflows. **Parent-Trace:** For workflows with nested operations (e.g., a workflow that triggers multiple sub-workflows), LangDB introduces the concept of a **Parent-Trace**, which links the parent workflow to its dependent sub-workflows. This hierarchical structure ensures you can analyze workflows at both macro and micro levels. **Headers for Trace:** * `trace-id`: Tracks the parent workflow. * `parent-trace-id`: Links sub-workflows to the main workflow for hierarchical tracing. # Run Track and monitor complete workflows with Runs in LangDB AI Gateway for better observability, debugging, and insights. A **Run** represents a **single workflow or operation** executed within LangDB. This could be a model invocation, a tool call, or any other discrete task. Each Run is independent and can be tracked separately, making it easier to analyze and debug individual workflows. Example of a Run:

**Core Features:** * **Granular Tracking**: Analyze and optimize the performance and cost of individual Runs. * **Independent Execution**: Each Run has a distinct lifecycle, enabling precise observability. **Example:** Generating a summary of a document, analyzing a dataset, or fetching information from an external API – each is a **Run**. **Headers for Run:** * `x-run-id`: Identifies a specific Run for tracking and debugging purposes. # Label Label LLM instances in LangDB AI Gateway for easy tracking, categorization, and improved observability. Label in LangDB defines an LLM instance with a unique identifier for categorization and tracking.
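As a small sketch of how labels might be used in a multi-agent setup, the snippet below prepares one header set per agent so that each agent's model calls can be told apart in traces. The `x-label` and `x-project-id` header names come from the Working with Headers section; the helper `agent_headers`, the agent names, and the `demo-project` value are assumptions for illustration.

```python
import os


def agent_headers(label, project_id=None):
    """Return request headers that tag an agent's model calls (hypothetical helper)."""
    # Fall back to the environment variable used throughout these guides
    project_id = project_id or os.getenv("LANGDB_PROJECT_ID", "")
    return {"x-project-id": project_id, "x-label": label}


# One header set per agent, all within the same project
research = agent_headers("research-agent", project_id="demo-project")
writer = agent_headers("writer-agent", project_id="demo-project")

print(research)
print(writer)
```

Each header set could then be supplied wherever your framework accepts extra request headers, for example the `extra_headers` argument used in the CrewAI `LLM` configuration.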

### Core Features

* **Model Categorization**: Assign labels to LLM instances.
* **Observability**: Track models by label.

### Headers for Label:

* **x-label**: Defines a label for an LLM instance.

```json
{
  "x-label" : "research-agent"
}
```

# Message

A Message in LangDB AI Gateway defines structured interactions between users, systems, and models in workflows.

A **Message** is the **basic unit of communication** in LangDB workflows. Messages define the interaction between the user, the system, and the model. Every workflow is built around exchanging and processing messages.

**Core Features:**

* **Structured Interactions**: Messages define roles (`user`, `system`, `assistant`) to organize interactions clearly.
* **Multi-Role Flexibility**: Different roles (e.g., `system` for instructions, `user` for queries) enable complex workflows.
* **Dynamic Responses**: Messages form the backbone of LangDB’s chat-based interactions.

**Example:** A simple interaction to generate a poem might look like this:

```json
[
  { "role": "system", "content": "You are a helpful assistant" },
  { "role": "user", "content": "Write me a poem about celluloids." }
]
```

# Virtual Models

Create, save, and reuse LLM configurations with Virtual Models in LangDB AI Gateway to streamline workflows and ensure consistent behavior.

LangDB’s **Virtual Models** let you save, share, and reuse model configurations, combining prompts, parameters, tools, and routing logic into a single named unit. This simplifies workflows and ensures consistent behavior across your apps, agents, and API calls. Once saved, these configurations can be quickly accessed and reused across multiple applications.

## Why do you need Virtual Models Virtual models in LangDB are more than just model aliases. They are **fully configurable AI agents** that: * Let you define **system/user messages** upfront * Support **routing logic** to dynamically choose between models * Include **MCP integrations** and **guardrails** * Are callable from **UI playground**, **API**, and **LangChain/OpenAI SDKs** Use virtual models to manage: * Prompt versioning and reuse * Consistent testing across different models * Precision tuning with per-model parameters * Seamless integration of tools and control logic * Routing using strategies like fallback, percentage-based, latency-based, optimized, and script-based selection ## Setting Up Virtual Model {% embed url="" %} 1. Go to the **Models** 2. Click on **Create Virtual Model.** 3. **Set prompt messages** — define system and user messages to guide model behavior 4. **Set variables** (optional) — useful if your prompts require dynamic values 5. **Select router type** * `None`: Use a single model only * `Fallback`, `Random`, `Cost`,`Percentage`, `Latency`, `Optimized`: Configure smart routing across targets. Checkout all [Routing Strategies](https://docs.langdb.ai/docs/features/routing). 6. **Add one or more targets** * Each target defines a model, mcp servers, guardrails, system-user messages, response format and its parameters (e.g. temperature, max\_tokens, top\_p, penalties) 7. **Select MCP Servers** — connect tools like LangDB Search, Code Execution, or others 8. **Add guardrails** (optional) — for validation, transformation, or filtering logic 9. **Set response format** — choose between `text`, `json_object`, or `json_schema` 10. Give your virtual model a name and Save. Your virtual model now appears in the **Models** section of your project, ready to be used anywhere a model is accepted. ## Updating and Versioning You can edit virtual models anytime. 
LangDB supports formal versioning via the `@version` syntax:

* `langdb/my-model@latest` or `langdb/my-model` → resolves to the latest version
* `langdb/my-model@v1` or `langdb/my-model@1` → resolves to version 1

This allows you to safely test new versions, roll back to older ones, or maintain multiple stable variants of a model in parallel.

## Using Your Virtual Model

Once saved, your virtual model is fully available across all LangDB interfaces:

* **Chat Playground**: Select it from the model dropdown and test interactively.
* **OpenAI-Compatible SDKs**: Works seamlessly with OpenAI clients by changing only the `model` name.
* **LangChain / CrewAI / other frameworks**: Call it just like any base model by using `model="langdb/my-model@latest"` or a specific version like `@v1`.

This makes virtual models a portable, modular building block across all parts of your AI stack.

# Routing with Virtual Model

Manage routing strategies easily in LangDB AI Gateway’s UI to boost efficiency, speed, and reliability in AI workflows.

In LangDB AI Gateway, any virtual model can act as a router. Just define a strategy and a list of target models, and it will route requests based on metrics like cost, latency, percentage, or custom rules.

## Setting up Routing

{% embed url="" %}

Setting up routing in a virtual model is straightforward:

1. Open any virtual model in the Chat Playground and click **Show Config**
2. Choose a **routing strategy** (like `fallback`, `optimized`, `percentage`, etc.)
3. Add your **target models**. Each one can be configured just like the virtual models you set up in the previous section

Each target defines:

* Which model to use
* Prompt
* MCP Servers
* Guardrails
* Response Format
* Custom parameters like temperature, max\_tokens, penalties, etc.

All routing options are available directly in the virtual model config panel. Check more about the [Routing Strategies](https://docs.langdb.ai/docs/features/routing).
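To give a feel for what a `fallback` strategy means, here is a purely conceptual sketch that tries a list of targets in order and returns the first successful response. It is not LangDB's routing engine: in LangDB the strategy is configured on the virtual model rather than written in client code, and `call_with_fallback` and `fake_send` are hypothetical names used only for this illustration.

```python
from typing import Callable, Sequence, Tuple


def call_with_fallback(targets: Sequence[str], send: Callable[[str], str]) -> Tuple[str, str]:
    """Try each target in order; return (model, response) for the first success."""
    errors = []
    for model in targets:
        try:
            return model, send(model)
        except Exception as exc:  # any failure moves on to the next target
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all targets failed: " + "; ".join(errors))


def fake_send(model: str) -> str:
    """Stand-in for a real model call; the first target always fails."""
    if model == "openai/gpt-4o":
        raise TimeoutError("simulated timeout")
    return f"response from {model}"


model, reply = call_with_fallback(
    ["openai/gpt-4o", "anthropic/claude-3.7-sonnet"],
    fake_send,
)
print(model, "->", reply)
```

Here the first target raises a simulated timeout, so the request falls through to the second target. Other strategies such as percentage-based or latency-based routing differ only in how the next target is chosen.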
# Draft Mode Simplify version control with LangDB Virtual Models’ draft mode—safely iterate, preview, and publish model versions without impacting live traffic. LangDB’s Virtual Models support a **draft mode** that streamlines version management and ensures safe, iterative changes. In draft mode, modifications are isolated from the published version until you explicitly publish, giving you confidence that live traffic is unaffected by in-progress edits. ## Version Workflow 1. **Edit in Draft** * Making any change (e.g., adjusting parameters, adding guardrails, modifying messages) flips the version into a **Modified** draft.

2. **Save Draft** * Click **Save** to record your changes. The draft is saved as a new version at the top of the version list, without affecting the live version.

* Live API traffic remains pointed at the last published version. 3. **Publish Draft** * Once validated, click **Publish**: * Saves the version as the new latest version.

* Directs all live chat completion traffic to this version. * Keeps the previous published version visible in the list so you can reselect and republish if needed. 4. **Restore & Edit Previous Version** * Open the version dropdown and select any listed version. * The selected version loads into the editor. * You can further modify this draft and click **Save** to create a new version entry. 5. **Re-Publish Any Version** * To make any saved version live, select it from the dropdown and click **Publish**. ## API Behavior All `chatCompletions` requests to a Virtual Model endpoint automatically target the **latest published version**. Drafts and restored drafts never receive live traffic until published. ```python from openai import OpenAI client = OpenAI( base_url="https://api.us-east-1.langdb.ai", api_key=api_key, ) # Always hits current published version response = client.chat.completions.create( model="openai/langdb/my-virtual-model@latest", messages=[...], ) ``` To preview changes in a draft or restored draft, switch the UI or JSON view selector to that draft and experiment in the Virtual Model Editor — all without impacting production calls. ## Best Practices * **Iterate Safely**: Leverage drafts for experimental guardrails or parameter tuning without risking production stability. * **Frequent Publishing**: Keep version history granular—publish stable drafts regularly to simplify tracking and rollbacks. * **Use Restore Thoughtfully**: Before restoring, ensure any important unsaved draft work is committed or intentionally discarded. # Virtual MCP Servers Create Virtual MCP Servers in LangDB AI Gateway to unify tools, manage auth securely, and maintain full observability across workflows A **Virtual MCP Server** lets you create a customized set of MCP tools by combining functions from multiple MCP servers — all with scoped access, unified auth, and full observability. 
{% embed url="" %}
Quick Guide on how to use virtual MCP Server
{% endembed %}

## Why Use a Virtual MCP?

* **Selective Tools**: Pick only the tools you need from existing MCP servers (e.g. Airtable's `list_records`, GitHub's `create_issue`, etc.)
* **Clean Auth Handling**: Add your API keys only if needed. Otherwise, LangDB handles OAuth for you.
* **Full Tracing**: Every call is traced on LangDB, with logs, latencies, input/output, and error metrics.
* **Easy Integration**: Works out of the box with Cursor, Claude, Windsurf, and more.
* **Version Lock-in**: Virtual MCPs are pinned to a specific server version to avoid breaking changes.
* **Poisoning Safety**: Prevents injection or override by malicious tool definitions from source MCPs.

## How to Set It Up

1. Go to Virtual MCP Servers in your LangDB project.
2. Select the tools you want to include.
3. (Optional) Add API keys or use LangDB-managed auth.
4. Click `Generate secure MCP URL`.

Virtual MCP Server - Using it in Claude, Cursor, Windsurf.

### Install in Cursor / Windsurf / Claude Once you have the MCP URL: {% code overflow="wrap" %} ```bash npx @langdb/mcp setup figma https://api.staging.langdb.ai/mcp/xxxxx --client cursor ``` {% endcode %}

You're now ready to use your selected tools directly inside the editor. ### Try it in the playground You can also try the Virtual MCP servers by adding the server in the [virtual model ](https://docs.langdb.ai/docs/concepts/virtual-models)config.

Virtual MCP Server - Usage on LangDB Playground

# Custom MCP Servers Learn how to connect your own custom MCP servers to LangDB AI Gateway. While LangDB provides a rich library of pre-built MCP servers, you can also bring your own. By connecting a custom MCP server, you can leverage all the benefits of a [Virtual MCP Server](https://docs.langdb.ai/docs/concepts/virtual-mcp-servers), including: * **Unified Interface**: Combine your custom tools with tools from other LangDB-managed servers. * **Clean Auth Handling**: Let LangDB manage authentication, or provide your own API keys and headers. * **Full Observability**: Get complete tracing for every call, with logs, latencies, and metrics. * **Seamless Integration**: Works out-of-the-box with clients like Cursor, Claude, and Windsurf. * **Enhanced Security**: Benefit from version pinning and protection against tool definition poisoning. This guide explains how to connect your own custom MCP server, whether it uses an HTTP (REST API) or SSE (Server-Sent Events) transport. ## Connecting Your Custom Server When creating a Virtual MCP Server, you can add your own server alongside the servers deployed and managed by LangDB. ### Steps to Configure a Custom Server 1. **Navigate to MCP Servers**: Go to the "MCP Servers" section in your LangDB project and click "Create Virtual MCP Server". 2. **Add a Custom Server**: In the "Server Configuration" section, click the "+ Add Server" button on the right and select "Custom" from the list. 3. **Configure Server Details**: A new "Custom Server" block will appear on the left. Fill in the following details:

* **Server Name**: Give your custom server a descriptive name. * **Transport Type**: Choose either `HTTP (REST API)` or `SSE (Server-Sent Events)` from the dropdown. * **HTTP/SSE URL**: Enter the endpoint URL for your custom MCP server. LangDB will attempt to connect to this URL to validate the server and fetch the available tools. * **(Optional) HTTP Headers**: If your server requires specific HTTP headers for authentication or other purposes, you can add them here. * **(Optional) Environment Variables**: If your server requires specific configuration via environment variables, you can add them. 4. **Select Tools**: Once LangDB successfully connects to your server, it will display a list of all the tools exposed by your MCP server. You can select which tools you want to include in your Virtual MCP Server. 5. **Generate URL**: After configuring your custom server and selecting the tools, you can generate the secure URL for your Virtual MCP Server and start using it in your applications. # Tracing Track every model call, agent handoff, and tool execution for faster debugging and optimization. LangDB Gateway provides detailed tracing to monitor, debug, and optimize LLM workflows. Below is an example of a trace visualization from the dashboard, showcasing a detailed breakdown of the request stages:

A full end-to-end multi agent workflow traced on LangDB

In this example trace you’ll find:

* **Overview Metrics**
  * **Cost**: Total spend for this request (e.g. $0.034).
  * **Tokens**: Input (5,774) vs. output (1,395).
  * **Duration**: Total end-to-end latency (29.52 s).
* **Timeline Breakdown**\
  A parallel-track timeline showing each step, from moderation and relevance scoring to model inference and final reply.
* **Model Invocations**\
  Every call to `gpt-4o-mini`, `gpt-4o`, etc., is plotted with precise start times and durations.
* **Agent Hand-offs**\
  Transitions between your agents (e.g. search → booking → reply) are highlighted with custom labels like `transfer_to_reply_agent`.
* **Tool Integrations**\
  External tools (e.g. `booking_tool`, `travel_tool`, `python_repl_tool`) appear inline with their execution times, so you can spot slow or failed runs immediately.
* **Guardrails**\
  Rules like **Min Word Count** and **Travel Relevance** enforce domain-specific constraints and appear in the trace.

With this level of visibility you can quickly pinpoint bottlenecks, understand cost drivers, and ensure your multi-agent pipelines run smoothly.

# Auto Router

Stop guessing which model to pick. The Auto Router picks the best one for you, whether you care about cost, speed, or accuracy.

## Why Use Auto Router?

* **Save Costs** - Automatically uses cheaper models for simple queries
* **Get Faster Responses** - Routes to the fastest model when speed matters
* **Guarantee Accuracy** - Picks the best model for critical tasks
* **Handle Scale** - No configuration hell, just works

### Quick Start

#### Using API

```json
{
  "model": "router/auto",
  "messages": [
    {
      "role": "user",
      "content": "What's the capital of France?"
    }
  ]
}
```

#### Using UI

You can also try Auto Router through the LangDB dashboard:

LangDB dashboard showing Auto Router models with filters for providers, context length, and pricing. The models table displays router options like auto, auto:balanced, auto:cost, and topic-specific routers for academia, finance, and marketing. — LangDB dashboard showing available Auto Router models and configuration options

> **Note:** The UI shows only a few router variations. For all available options and advanced configurations, use the API. ### Trace Example Here's what happens behind the scenes when you use Auto Router:

LangDB Traces dashboard showing a successful Auto Router call. The left panel lists the trace with details like cost <$0.001, 1937 tokens, and a duration of 21.09s. A timeline below shows the 'auto' routing step taking 0.36s, followed by 'programming:cost' (20.71s) and 'deepseek-r1-0528-qwen3...' (20.52s). The right panel provides detailed information for the 'auto' trace, including a 200 status, Trace ID, Run ID, Thread ID, start time, and its 0.36s duration.

That's it, no config needed. The router classifies the query and picks the best model automatically.

> If you already know the query type (e.g., Finance), skip auto-classification with `router/finance:accuracy`.

## Under the Hood

Behind the scenes, the Auto Router uses lightweight classifiers (NVIDIA for complexity, BART for topic) combined with LangDB's routing engine. These decisions are logged in traces so you can inspect why a query was sent to a specific model.

### How It Works

The Auto Router uses a two-stage classification process:

1. **Complexity Classification**: Uses NVIDIA's classification model to determine if a query is high or low complexity
2. **Topic Classification**: Uses Facebook's BART Large model to identify the query's topic from these categories:
   * Academia
   * Finance
   * Marketing
   * Maths
   * Programming
   * Science
   * Vision
   * Writing

Based on these classifications and your chosen optimization strategy, the router automatically selects the best model from your available options.

### Router Behavior

| Router Syntax | What happens |
| ----------------------- | ------------ |
| `router/auto` | Classifies complexity + topic. Low-complexity queries go to cheaper models; high-complexity queries go to stronger models. Then applies your optimization strategy. |
| `router/auto:<mode>` | Classifies topic only. Ignores complexity and always applies the chosen optimization (cost, accuracy, etc.) for that topic. |
| `router/<topic>:<mode>` | Skips classification. Directly routes to the specified topic with the chosen optimization mode. |

## Optimization Modes

| Mode | What it does | Best for |
| ------------ | ------------------------------------------------------------------------ | ------------------------- |
| `balanced` | Intelligently distributes requests across models for optimal performance | General apps (default) |
| `accuracy` | Picks models with best benchmark scores | Research, compliance |
| `cost` | Routes to cheapest viable model | Support chatbots, FAQs |
| `latency` | Always picks the fastest | Real-time UIs, voice bots |
| `throughput` | Distributes across many models | High-volume pipelines |

## Case Study


* [**Beating the Best Model**](beating-the-best-model) - Save costs without losing quality. Auto Router delivers best-model accuracy at a fraction of the price.
* [**Beating GPT-5**](beating-gpt-5) - Auto Router delivers 83% satisfactory results at 35% lower cost than GPT-5. Real-world testing shows router optimization without quality compromise.

## Use Cases

### Cost Optimization

Perfect for FAQ bots, education apps, and high-volume content generation.

```json
{
  "model": "router/auto:cost",
  "messages": [
    {
      "role": "user",
      "content": "What are your business hours?"
    }
  ]
}
```

### Accuracy Optimization

Ideal for finance, medical, legal, and research applications.

```json
{
  "model": "router/auto:accuracy",
  "messages": [
    {
      "role": "user",
      "content": "Analyze this financial risk assessment"
    }
  ]
}
```

### Latency Optimization

Great for real-time assistants, voice bots, and interactive UIs.

```json
{
  "model": "router/auto:latency",
  "messages": [
    {
      "role": "user",
      "content": "What's the weather like today?"
    }
  ]
}
```

### Balanced (Load Balanced)

Intelligently distributes requests across available models for optimal performance. Works well for most business applications and integrations.

```json
{
  "model": "router/auto",
  "messages": [
    {
      "role": "user",
      "content": "Help me write a product description"
    }
  ]
}
```

## Direct Category Routing

If you already know your query belongs to a specific domain, you can skip classification and directly route to a topic with your chosen optimization mode.

```json
{
  "model": "router/finance:accuracy",
  "messages": [
    {
      "role": "user",
      "content": "Analyze the risk factors in this financial derivative"
    }
  ]
}
```

**Result:**

* Skips complexity + topic classification
* Directly applies accuracy optimization for the finance topic
* Routes to the highest-scoring finance-optimized model

**Available topic shortcuts:**

* `router/finance:<mode>`
* `router/writing:<mode>`
* `router/academia:<mode>`
* `router/programming:<mode>`
* `router/science:<mode>`
* `router/vision:<mode>`
* `router/marketing:<mode>`
* `router/maths:<mode>`

Where `<mode>` can be: `balanced`, `accuracy`, `cost`, `latency`, or `throughput`.
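The shortcut names above all follow one pattern: a topic (or `auto`) plus an optimization mode. As a minimal sketch, the model string can be assembled and validated before a request is sent. The helper `router_model` is hypothetical and not part of any LangDB SDK; the topic and mode names are copied from the lists above, and requiring a mode for topic shortcuts is an assumption based on the examples shown.

```python
from typing import Optional

# Topic and mode names copied from the lists in this section.
TOPICS = {"academia", "finance", "marketing", "maths",
          "programming", "science", "vision", "writing"}
MODES = {"balanced", "accuracy", "cost", "latency", "throughput"}


def router_model(topic: Optional[str] = None, mode: Optional[str] = None) -> str:
    """Hypothetical helper: build a router model string such as 'router/finance:accuracy'."""
    if topic is not None and topic not in TOPICS:
        raise ValueError(f"unknown topic: {topic!r}")
    if mode is not None and mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if topic is None:
        # No topic given: let the router classify the query.
        return "router/auto" if mode is None else f"router/auto:{mode}"
    if mode is None:
        # Assumption: the docs only show topic shortcuts with an explicit mode.
        raise ValueError("a topic shortcut needs an explicit mode")
    return f"router/{topic}:{mode}"


# Request body in the same shape as the JSON examples above.
payload = {
    "model": router_model("finance", "accuracy"),
    "messages": [{"role": "user", "content": "Analyze this financial risk assessment"}],
}
```

Validating the names locally catches typos such as `router/finace:cost` before they reach the gateway.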
**Quick Decision Guide:**

* **Don't know the type?** → Use `router/auto`
* **Know the type?** → Jump straight with `router/<topic>:<mode>`

## Advanced Configuration

### Topic-Specific Routing

```json
{
  "model": "router/auto",
  "router": {
    "topic_routing": {
      "finance": "cost",
      "writing": "latency",
      "technical": "accuracy"
    }
  },
  "messages": [
    {
      "role": "user",
      "content": "Calculate the net present value of this investment"
    }
  ]
}
```

## Best Practices

1. **Choose the Right Mode** - Match optimization to your use case
2. **Monitor Performance** - Use LangDB's analytics to track routing decisions
3. **Combine with Fallbacks** - Add fallback models for high availability
4. **Test Different Modes** - Experiment to find the best fit

## Integration with Other Features

The Auto Router works seamlessly with:

* **Guardrails** - Apply content filtering before routing
* **MCP Servers** - Access external tools and data sources
* **Response Caching** - Cache responses for frequently asked questions
* **Analytics** - Track routing decisions and performance metrics

# Beating the Best Model

Save costs without losing quality. Auto Router delivers best-model accuracy at a fraction of the price.

Most developers assume that using the *best* model is the safest bet for every query.\
But in practice, that often means **paying more than you need to**, especially when cheaper models can handle simpler queries just as well.

LangDB’s Auto Router shows you don’t always need the “best” model, just the *right* model for the job.

## The Question We Asked When building AI applications, you face a constant trade-off: **performance vs. cost**. Do you always use the most powerful (and expensive) model to guarantee quality? Or do you risk cheaper alternatives that might fall short on complex tasks? We wanted to find out: *Can smart routing beat the "always use the best model" strategy?* ## Our Experiment We designed a head-to-head comparison using **100 real-world queries** across four domains: Finance, Writing, Science/Math, and Coding. Each query was tested against two strategies: * **Auto Router** → Analyzed query complexity and topic, then selected the most cost-effective model that could handle the task * **Router:Accuracy** → Always defaulted to the highest-performing model (the "best model" approach) **What made this test realistic:** * **Diverse complexity:** 70 low-complexity queries (simple conversions, definitions) and 30 high-complexity queries (complex analysis, multi-step reasoning) * **Real-world domains:** Finance calculations, professional writing, scientific explanations, and coding problems * **Impartial judging:** Used GPT-5-mini as an objective judge to compare response quality **Sample of what we tested:** * **Finance:** "A company has revenue of $200M and expenses of $150M. What is its profit?" * **Writing:** "Write a one-line professional email subject requesting a meeting" * **Science/Math:** "Convert 100 cm into meters" * **Coding:** "Explain what a variable is in programming in one sentence" ## Results | Metric | Auto Router | Router:Accuracy | | ------------------- | ------------------ | --------------- | | **Total Cost** | **$0.95** | $1.64 | | **Wins** | **65%** | 0% | | **Ties** | 35% | 35% | | **Losses** | 0% | 0% | | **Accuracy Parity** | 100% (wins + ties) | 100% | ### What Wins & Ties Mean * **Win** → Auto Router chose a **cheaper model**, and the output was *equal or better* than the best model. 
* **Tie** → Auto Router escalated to the **best model itself**, because the query was complex enough to require it. * **Loss** → Didn’t happen. Auto Router never underperformed compared to always using the best model. > In other words: **Auto Router matched or beat the best model strategy 100% of the time — while cutting costs by \~42%.** ## Category Breakdown | Category | Count | Router Wins | Ties (Used Best Model) | | -------------- | ----- | ----------- | ---------------------- | | Finance | 25 | 23 | 2 | | Writing | 24 | 18 | 6 | | Science & Math | 19 | 14 | 5 | | Coding | 32 | 10 | 22 | * In **Finance** and **Writing**, Auto Router confidently used cheaper models most of the time. * In **Coding**, Auto Router often escalated to the best model — proving it knows when not to compromise. ## The Methodology Behind the Magic **How Auto Router Works:** Auto Router doesn't just pick models randomly. It uses a sophisticated classification system that: 1. **Analyzes query complexity** — Is this a simple fact lookup or a complex reasoning task? 2. **Identifies the domain** — Finance, writing, coding, or science/math? 3. **Matches to optimal model** — Selects the most cost-effective model that can handle the specific complexity level **The "Always Best" Approach:** Router:Accuracy takes the conservative route — always selecting the highest-performing model regardless of query complexity. It's like using a Formula 1 car for grocery shopping. **Fair Comparison:** We used GPT-5-mini as an impartial judge to evaluate response quality across both strategies. The judge compared answers based on correctness, usefulness, and completeness without knowing which routing strategy was used. 
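The headline figures quoted above can be re-derived from the two tables. The short check below is only a sketch with illustrative variable names: it recomputes the cost saving from the totals in the Results table and confirms that the per-category rows add up to the reported win and tie counts.

```python
# Totals from the Results table (USD, 100-query run).
auto_router_cost = 0.95
accuracy_router_cost = 1.64

# Relative saving of Auto Router versus always using the best model.
savings_pct = round((accuracy_router_cost - auto_router_cost) / accuracy_router_cost * 100)

# Rows from the Category Breakdown table: (count, router wins, ties).
categories = {
    "Finance": (25, 23, 2),
    "Writing": (24, 18, 6),
    "Science & Math": (19, 14, 5),
    "Coding": (32, 10, 22),
}
total_queries = sum(count for count, _, _ in categories.values())
total_wins = sum(wins for _, wins, _ in categories.values())
total_ties = sum(ties for _, _, ties in categories.values())

print(f"saving: {savings_pct}%  queries: {total_queries}  wins: {total_wins}  ties: {total_ties}")
```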
## What This Means for Developers **The Real-World Impact:** * **Cost optimization without compromise** — Save 42% on API costs while maintaining quality * **Intelligent escalation** — Complex queries automatically get the best models * **No manual tuning** — The router handles the complexity analysis for you ## Try It Yourself Using Auto Router is simple — just point to `router/auto`: ```json { "model": "router/auto", "messages": [ { "role": "user", "content": "A company has revenue of $200M and expenses of $150M. What is its profit?" } ] } ``` Auto Router will automatically select the most cost-effective model that can handle your query complexity. ## The Bottom Line * **Save Money** → Auto Router avoids overpaying on simple queries * **Stay Accurate** → For complex cases, it automatically picks the strongest model * **Smarter Than "Always Best"** → Matches or beats the best-model-only approach at a fraction of the cost ## Takeaway You don't need to pick the "best" model every time. With Auto Router: * **Simple queries → cheaper models** save you money * **Complex queries → stronger models** keep accuracy intact * **Overall → 100% accuracy parity at 42% lower cost** That's the power of LangDB Auto Router. # Beating GPT-5 LangDB's Auto Router delivers 83% satisfactory results at 35% lower cost than GPT-5. Real-world testing across 100 prompts shows router optimization without quality compromise. Everyone assumes GPT-5 is untouchable — the safest, most accurate choice for every task. But our latest experiments tell a different story. When we put LangDB's **Auto Router** head-to-head against GPT-5, the results surprised us. ## The Setup We ran **100 real-world prompts** across four categories: Finance, Writing, Science/Math, and Coding. One group always used GPT-5. The other let Auto Router decide the right model. At first glance, you’d expect GPT-5 to dominate — and in strict A/B judging, it often did. 
But once we layered in a second check — asking an independent validator whether the Router’s answers were *satisfactory* (correct, useful, and complete) — the picture flipped.

## What We Found * **Costs Less:** Router cut spend by **35%** compared to GPT-5 ($1.04 vs $1.58). * **Good Enough Most of the Time:** Router's answers were judged satisfactory in **83% of cases**. * **Practical Wins:** When you combine Router wins, ties, and “GPT-5 wins but Router still satisfactory,” the Router came out ahead in **86/100 tasks**. * **Safe:** There were **zero catastrophic failures** — Router never produced unusable output. ## Breaking Down Quality On strict comparisons, GPT-5 outscored Router in 65 cases. Router directly won 10, with 25 ties. But here’s the catch: in the majority of those “GPT-5 wins,” the Router’s answer was still perfectly fine. Think about defining a finance term, writing a short code snippet, or solving a straightforward math problem. GPT-5 might give a longer, more polished answer, but Router’s output was clear, correct, and usable — and it cost a fraction of the price. The validator helped us separate “better” from “good enough.” And for most workloads, *good enough at lower cost* is exactly what you want. ## Where Router Shines (and Struggles) * **Finance:** Router was flawless here, delivering **satisfactory answers for every single prompt**. * **Coding:** Router handled structured coding tasks well — effective in 30 out of 32 cases. * **Science/Math:** Router held its own, though GPT-5 still had the edge on trickier reasoning. * **Writing:** This was the weakest area for Router. GPT-5 consistently produced richer, more polished prose. Still, Router’s outputs were acceptable two-thirds of the time. *** ### Why This Matters The key takeaway isn’t that Router is “better than GPT-5” in raw accuracy. It’s that **Router is better for your budget without compromising real-world quality**. By knowing when a smaller model is good enough, you save money while still keeping GPT-5 in reserve for the hardest tasks. In practice, that means: * **Finance and Coding workloads** → Route automatically and trust the savings. 
* **Open-ended creative writing** → Let Router escalate to GPT-5 when needed. * **Everywhere else** → Expect huge cost reductions without a hit to user experience. *** ## Try It Yourself Using the Router doesn’t require any special configuration: ```json { "model": "router/auto", "messages": [ { "role": "user", "content": "Define liquidity in finance in one sentence." } ] } ``` Just point to `router/auto`. LangDB takes care of routing — so you get the right balance of cost and quality, automatically. # Provider Routing Automatically route requests across multiple AI providers for optimal cost, latency, and accuracy. One model name, multiple providers. Stop worrying about which provider to pick. With Provider Routing, you can call a model by name, and LangDB will automatically select the right provider for you. ## Why Use Provider Routing? * **One Name, Many Providers** – Call a model like `deepseek-v3.1` and LangDB picks from DeepSeek official, Parasail, DeepInfra, Fireworks AI, and more. * **Optimize by Mode** – Choose whether you want lowest cost, fastest latency, highest accuracy, or simply balanced routing. *** ### Quick Start ```json { "model": "deepseek-v3.1", "messages": [ { "role": "user", "content": "Explain reinforcement learning in simple terms." } ] } ``` That’s it — LangDB will resolve `deepseek-v3.1` across multiple providers, and by default use **balanced** mode. *** ## Optimization Modes When you specify only a model name, LangDB chooses the provider according to your selected mode. 
| Mode | What it does | Best for | | -------------- | --------------------------------------------------------------------- | ------------------------- | | **balanced** | Distributes requests across providers for optimal overall performance | General apps (default) | | **accuracy** | Routes to the provider with the best benchmark score | Research, compliance | | **cost** | Picks the cheapest provider by input/output token price | Support chatbots, FAQs | | **latency** | Always selects the lowest latency provider | Real-time UIs, voice bots | | **throughput** | Spreads requests across all providers to maximize concurrency | High-volume pipelines | *** ## Examples ### Balanced (default) ```json { "model": "deepseek-v3.1", "messages": [{ "role": "user", "content": "Summarize this article." }] } ``` LangDB chooses the provider dynamically, balancing cost, latency, and accuracy. *** ### Cost Optimization ```json { "model": "deepseek-v3.1:cost", "messages": [{ "role": "user", "content": "Write a short FAQ response." }] } ``` LangDB picks the **cheapest provider** for `deepseek-v3.1` based on input/output token prices (e.g. Parasail, Fireworks AI, or DeepInfra if they’re lower than DeepSeek official). *** ### Accuracy Optimization ```json { "model": "deepseek-v3.1:accuracy", "messages": [{ "role": "user", "content": "Solve this math word problem." }] } ``` Routes to the provider with the **highest benchmark score** for `deepseek-v3.1`. *** ### Latency Optimization ```json { "model": "deepseek-v3.1:latency", "messages": [{ "role": "user", "content": "Respond quickly for a live chat." }] } ``` Always picks the provider with the **fastest response times**. *** ### Throughput Optimization ```json { "model": "deepseek-v3.1:throughput", "messages": [{ "role": "user", "content": "Translate this dataset." }] } ``` Distributes requests across **all available providers** for `deepseek-v3.1` to maximize scale. 
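The examples above differ only in the suffix appended to the model name. As a minimal sketch, the request body can be built in Python as shown below. `provider_routed_request` is a hypothetical helper, not a LangDB API; it only assembles the JSON body shown in the examples and leaves sending it to whichever OpenAI-compatible client you use.

```python
import json
from typing import Optional

# Mode names copied from the Optimization Modes table.
MODES = ("balanced", "accuracy", "cost", "latency", "throughput")


def provider_routed_request(model: str, prompt: str, mode: Optional[str] = None) -> dict:
    """Hypothetical helper: build a chat request body with an optional ':mode' suffix."""
    if mode is not None and mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # Without a suffix the gateway is documented to default to balanced mode.
    model_name = model if mode is None else f"{model}:{mode}"
    return {
        "model": model_name,
        "messages": [{"role": "user", "content": prompt}],
    }


body = provider_routed_request("deepseek-v3.1", "Write a short FAQ response.", mode="cost")
print(json.dumps(body, indent=2))
```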
*** ## Explicit Provider Pinning If you want full control, you can always specify the provider explicitly: ```json { "model": "parasail/deepseek-v3.1", "messages": [{ "role": "user", "content": "Generate a poem." }] } ``` This bypasses provider routing and always uses the given provider. *** ## Summary * Use `model` without provider → LangDB does **provider routing**. * Add `:mode` suffix → pick between balanced, accuracy, cost, latency, or throughput. * Use `provider/model` → pin a specific provider directly. Provider Routing makes it easy to scale across multiple vendors without rewriting your code. # Routing Intelligently route across multiple LLMs to ensure fast, reliable, and scalable AI operations. LangDB AI Gateway optimizes LLM selection based on cost, speed, and availability, ensuring efficient request handling. This guide covers the various dynamic routing strategies available in the system, including fallback, script-based, optimized, percentage-based, and latency-based routing. This ensures efficient request handling and optimal model selection tailored to specific application needs.

## Understanding Targets Before diving into routing strategies, it's essential to understand **targets** in LangDB AI Gateway. A **target** refers to a specific model or endpoint to which requests can be directed. Each target represents a potential processing unit within the routing logic, enabling optimal performance and reliability. ```json { "model": "router/dynamic", "router": { "type": "percentage", "targets_percentages": [ 40, 60 ], "targets": [ { "model": "openai/gpt-4.1", "mcp_servers": [ { "slug": "mymcp_zoyhbp3u", "name": "mymcp", "type": "sse", "server_url": "https://api.staging.langdb.ai/mymcp_zoyhbp3u" } ], "extra": { "guards": [ "openai_moderation_y6ln88g4" ] } }, { "model": "anthropic/claude-3.7-sonnet", "mcp_servers": [ { "slug": "mymcp_zoyhbp3u", "name": "mymcp", "type": "sse", "server_url": "https://api.staging.langdb.ai/mymcp_zoyhbp3u" } ], "extra": { "guards": [ "openai_moderation_y6ln88g4" ] }, "temperature": 0.6, "messages": [ { "content": "You are a helpful assistant", "id": "02cb4630-b01a-42d9-a226-94968865fbe0", "role": "system" } ] } ] } } ``` **Target Parameters** Each target in LangDB is essentially a self-contained configuration, similar to a virtual model. A target can include: * **Model** – The identifier for the base model to use (e.g. `openai/gpt-4o`) * **Prompt** – Optional system and user messages to steer the model * [**MCP Servers**](https://docs.langdb.ai/docs/concepts/virtual-mcp-servers) – Support to [Virtual MCP Servers](https://docs.langdb.ai/docs/concepts/virtual-mcp-servers) * [**Guardrails**](https://docs.langdb.ai/docs/features/guardrails) – Validations, Moderations. 
* **Response Format** – `text`, `json_object`, or `json_schema`
* **Custom Parameters** – Tuning controls like:
  * `temperature`
  * `max_tokens`
  * `top_p`
  * `frequency_penalty`
  * `presence_penalty`

## Routing Strategies

LangDB AI Gateway supports multiple routing strategies that can be combined and customized to meet your specific needs:

| Routing Strategy | Description |
| ----------------------------------------------------- | --------------------------------------------------------------------------- |
| [Fallback Routing](#fallback-routing) | Sequentially routes requests through multiple models in case of failure or unavailability. |
| [Optimized Routing](#optimized-routing) | Selects the best model based on real-time performance metrics. |
| [Percentage-Based Routing](#percentage-based-routing) | Distributes traffic between multiple models using predefined weightings. |
| [Latency-Based Routing](#latency-based-routing) | Chooses the model with the lowest response time for real-time applications. |
| [Nested Routing](#nested-routing) | Combines multiple routing strategies for flexible traffic management. |

### Fallback Routing

Fallback routing allows sequential attempts to different model targets in case of failure or unavailability. It ensures robustness by cascading through a list of models based on predefined logic.

```json
{
  "model": "router/dynamic",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is the formula of a square plot?" }
  ],
  "router": {
    "router": "router",
    "type": "fallback",
    "targets": [
      {
        "model": "openai/gpt-4o-mini",
        "temperature": 0.9,
        "max_tokens": 500,
        "top_p": 0.9
      },
      {
        "model": "deepseek/deepseek-chat",
        "frequency_penalty": 1,
        "presence_penalty": 0.6
      }
    ]
  },
  "stream": false
}
```

The `type` field accepts `fallback`, `script`, `optimized`, `percentage`, or `latency`.

### Optimized Routing

Optimized routing automatically selects the best model based on real-time performance metrics such as latency, response time, and cost-efficiency.
```json { "model": "router/dynamic", "router": { "name": "fastest", "type": "optimized", "metric": "ttft", "targets": [ { "model": "gpt-3.5-turbo", "temperature": 0.8, "max_tokens": 400, "frequency_penalty": 0.5 }, { "model": "gpt-4o-mini", "temperature": 0.9, "max_tokens": 500, "top_p": 0.9 } ] } } ``` Here, the request is routed to the model with the lowest Time-to-First-Token (TTFT) among gpt-3.5-turbo and gpt-4o-mini. **Metrics:** * **Requests** – Total number of requests sent to the model. * **InputTokens** – Number of tokens provided as input to the model. * **OutputTokens** – Number of tokens generated by the model in response. * **TotalTokens** – Combined count of input and output tokens. * **RequestsDuration** – Total duration taken to process requests. * **Ttft (Time-to-First-Token) (Default)** – Time taken by the model to generate its first token after receiving a request. * **LlmUsage** – The total computational cost of using the model, often used for cost-based routing. ### Percentage-Based Routing Percentage-based routing distributes requests between models according to predefined weightings, allowing load balancing, A/B testing, or controlled experimentation with different configurations. Each model can have distinct parameters while sharing the request load. ```json { "model": "router/dynamic", "router": { "name": "dynamic", "type": "percentage", "targets": [ { "model": "openai/gpt-4o-mini", "temperature": 0.9, "max_tokens": 500, "top_p": 0.9 }, { "model": "openai/gpt-4o-mini", "temperature": 0.8, "max_tokens": 400, "frequency_penalty": 1 } ], "targets_percentages": [ 70, 30 ] } } ``` ### Latency-Based Routing Latency-based routing selects the model with the lowest response time, ensuring minimal delay for real-time applications like chatbots and interactive AI systems. 
``` { "model": "router/dynamic", "router": { "name": "fastest_latency", "type": "latency", "targets": [ { "model": "openai/gpt-4o-mini", "temperature": 0.9, "max_tokens": 500, "top_p": 0.9 }, { "model": "deepseek/deepseek-chat", "frequency_penalty": 1, "presence_penalty": 0.6 }, { "model": "gemini/gemini-2.0-flash-exp", "temperature": 0.8, "max_tokens": 400, "frequency_penalty": 0.5 } ] } } ``` ### Nested Routing LangDB AI allows nesting of routing strategies, enabling combinations like fallback within script-based selection. This flexibility helps refine model selection based on dynamic business needs. ```json { "model": "router/dynamic", "messages": [ { "role": "system", "content": "You are a helpful assistant." }, { "role": "user", "content": "What is the formula of a square plot?" } ], "router": { "type": "fallback", "targets": [ { "model": "router/dynamic", "router": { "name": "cheapest_script_execution", "type": "script", "script": "const route = ({ models }) => models \ .filter(m => m.inference_provider.provider === 'bedrock' && m.type === 'completions') \ .sort((a, b) => a.price.per_input_token - b.price.per_input_token)[0]?.model;" } }, { "model": "router/dynamic", "router": { "name": "fastest", "type": "optimized", "metric": "ttft", "targets": [ { "model": "gpt-3.5-turbo", "temperature": 0.8, "max_tokens": 400, "frequency_penalty": 0.5 }, { "model": "gpt-4o-mini", "temperature": 0.9, "max_tokens": 500, "top_p": 0.9 } ] } }, { "model": "deepseek/deepseek-chat", "temperature": 0.7, "max_tokens": 300, "frequency_penalty": 1 } ] }, "stream": false } ``` # MCP Support Create, manage, and connect MCP servers easily to integrate dynamic tools and enhance your AI workflows with full tracing. LangDB simplifies how you work with MCP (Model Context Protocol) servers — whether you want to use a **built-in** [**Virtual MCP**](https://docs.langdb.ai/docs/concepts/virtual-mcp-servers) or connect to an **external MCP server**.

MCP Servers hosted on LangDB — Browse publicly-available MCP servers on LangDB

**Model Context Protocol (MCP)** is an **open standard** that enables AI models to **seamlessly communicate** with external systems. It allows models to **dynamically process contextual data**, ensuring **efficient, adaptive, and scalable** interactions. MCP simplifies **request orchestration** across distributed AI systems, enhancing **interoperability and context-awareness**. With **native tool integrations**, MCP connects AI models to **APIs, databases, local files, automation tools, and remote services** through a standardized protocol. Developers can effortlessly integrate MCP with **IDEs, business workflows, and cloud platforms**, while retaining the flexibility to **switch between LLM providers**. This enables the creation of **intelligent, multi-modal workflows** where AI securely interacts with real-world data and tools. For more details, visit the [Model Context Protocol official page](https://modelcontextprotocol.io/introduction) and explore [Anthropic MCP documentation](https://docs.anthropic.com/en/docs/build-with-claude/mcp). ## Using Virtual MCPs ### Using API LangDB allows you to create [**Virtual MCP Servers**](https://docs.langdb.ai/docs/concepts/virtual-mcp-servers) directly from the dashboard.\ You can instantly select and bundle tools like database queries, search APIs, or automation tasks into a single MCP URL — no external setup needed. 
Here's an example of how you can use a Virtual MCP Server in your project:

{% tabs %}
{% tab title="Python" %}
```python
from openai import OpenAI
from uuid import uuid4

# Point the OpenAI client at the LangDB gateway for your project.
client = OpenAI(
    base_url="https://api.us-east-1.langdb.ai/LangDBProjectID/v1",
    api_key="xxxx",
    default_headers={"x-thread-id": str(uuid4())},
)

mcpServerUrl = "Virtual MCP Server URL"

response = client.chat.completions.create(
    model="openai/gpt-4.1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What are the databases available"},
    ],
    # mcp_servers is a LangDB extension, so it is passed through extra_body.
    extra_body={
        "mcp_servers": [
            {"server_url": mcpServerUrl, "type": "sse"}
        ]
    },
)
```
{% endtab %}

{% tab title="TypeScript" %}
```typescript
import { OpenAI } from 'openai';
import { v4 as uuid4 } from 'uuid';

// Point the OpenAI client at the LangDB gateway for your project.
const client = new OpenAI({
  baseURL: "https://api.us-east-1.langdb.ai/LangDBProjectID/v1",
  apiKey: "xxxx",
  defaultHeaders: { "x-thread-id": uuid4() }
});

const mcpServerUrl = 'Virtual MCP URL';

async function getAssistantReply() {
  const { choices } = await client.chat.completions.create({
    model: "openai/gpt-4.1-nano",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "what are the databases on clickhouse?" }
    ],
    // @ts-expect-error mcp_servers is a LangDB extension
    mcp_servers: [
      { server_url: mcpServerUrl, type: 'sse' }
    ]
  });
  console.log('Assistant:', choices[0].message.content);
}
```
{% endtab %}
{% endtabs %}

Check out the [Virtual MCP](https://docs.langdb.ai/docs/concepts/virtual-mcp-servers) and [Guides](https://app.gitbook.com/s/jwXvc8GvLP6kUvOASaLP/using-mcps) sections for use cases.

### Using MCP Clients

You can instantly connect LangDB’s Virtual MCP servers to editors like **Cursor**, **Claude**, or **Windsurf**.

Run this in your terminal to set up MCP in Cursor:

```bash
npx @langdb/mcp setup --client cursor
```

You can now call tools directly in your editor, with full tracing on LangDB.

## Connecting to External MCP Servers

If you already have an MCP server hosted externally, such as Smithery’s Exa MCP, you can plug it straight into LangDB with zero extra setup.

Just pass your external MCP server URL in `extra_body` when you make a chat completion request. For example, with Smithery:

```python
extra_body = {
    "mcp_servers": [
        {
            "server_url": "wss://your-mcp-server.com/ws?config=your_encoded_config",
            "type": "ws"
        }
    ]
}
```

For a **complete example** of how to use external MCP, refer to the [Smithery guide](https://app.gitbook.com/s/0xzoP8YRDIkACawdU2ht/smithery).

# Usage

Track total usage, model-specific metrics, and user-level analytics to stay within limits and optimize LLM workflows.

**Monitoring** complements tracing by providing aggregate insights into the usage of LLM workflows.

Usage: input and output tokens for each provider and model, along with their cost.

## Limits

LangDB enforces limits to ensure fair usage and cost management while allowing users to configure these limits as needed. Limits are categorized into:

1. **Daily Limits:** Maximum usage per day, e.g., $10 in the Starter Tier.
2. **Monthly Limits:** Total usage allowed in a month, e.g., $100.
3. **Total Limits:** Cumulative limit over the project’s duration, e.g., $500.

### Best Practices * Monitor usage regularly to avoid overages. * Plan limits based on project needs and anticipated workloads. * Upgrade tiers if usage consistently approaches limits. Setting limits not only helps you stay within budget but also provides the flexibility to scale your usage as needed, ensuring your projects run smoothly and efficiently. ## Usage APIs ### [`/usage/total`](https://docs.langdb.ai/docs/api-reference#usage-total) Retrieves the total usage statistics for your project for a timeframe. ```bash curl --location 'https://api.us-east-1.langdb.ai/usage/total' \ --header 'x-project-id: langdbProjectID' \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer langDBAPIKey' \ --data '{"start_time_us": 1693062345678, "end_time_us": 1695092345678}' ``` Example Response: ```json { "total": { "total_input_tokens": 4181386, "total_output_tokens": 206547, "total_cost": 11.890438685999994 }, "period_start": 1737504000000000, "period_end": 1740131013885000 } ``` ### [`/usage/models`](https://docs.langdb.ai/docs/api-reference#usage-models) Fetches timeseries usage statistics per model, allowing users to analyze the distribution of LLM usage. 
```bash curl --location 'https://api.us-east-1.langdb.ai/usage/models' \ --header 'x-project-id: langdbProjectID' \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer langDBAPIKey' \ --data '{"start_time_us": 1693062345678, "end_time_us": 1695092345678, "min_unit": "hour"} ' ``` Example Response: ```json { "models": [ { "hour": "2025-02-14 08:00:00", "provider": "openai", "model_name": "gpt-4o-mini", "total_input_tokens": 13408, "total_output_tokens": 2169, "total_cost": 0.0039751199999999995 }, { "hour": "2025-02-13 08:00:00", "provider": "openai", "model_name": "gpt-4o-mini", "total_input_tokens": 55612, "total_output_tokens": 786, "total_cost": 0.01057608 } ], "period_start": 1737504000000000, "period_end": 1740130915098000 } ``` ### Filtering By Users As discussed in User Tracking, we can use filters to retrieve insights based on `id`, `name`, or `tags`. **Available Filters:** * **`user_id`**: Filter data for a specific user by their unique ID. * **`user_name`**: Retrieve usage based on the user’s name. * **`user_tags`**: Filter by tags associated with a user (e.g., "websearch", "support"). 
```bash curl -L \ --request POST \ --url 'https://api.us-east-1.langdb.ai/usage/models' \ --header 'Authorization: Bearer langDBAPIKey' \ --header 'X-Project-Id: langDBProjectID' \ --header 'Content-Type: application/json' \ --data '{ "user_id": "123", "user_name": "mrunmay", "user_tags": ["websearch", "testings"] }' ``` Example response: ```json { "models": [ { "day": "2025-02-21 10:00:00", "provider": "openai", "model_name": "gpt-4o-mini", "total_input_tokens": 1112, "total_output_tokens": 130, "total_cost": 0.00029376 }, { "day": "2025-02-21 14:00:00", "provider": "openai", "model_name": "gpt-4o-mini", "total_input_tokens": 3317, "total_output_tokens": 328, "total_cost": 0.00083322 } ], "period_start": 1737556513673410, "period_end": 1740148513673410 } ``` # Analytics Get full visibility into API consumption with cost, speed, and reliability insights to optimize your LLM workflows efficiently. You can monitor API usage with key insights.

Analytics dashboard displaying metrics such as TTFT and number of requests.

After integrating LangDB into your project, the Analytics Dashboard becomes your central hub for understanding usage. ## **Metrics** LangDB’s Analytics Dashboard is segmented into several key panels: ### **Cost:** * Tracks your total cost consumption across all integrated models. * Enables you to compare costs by provider/model/tags, helping you identify the most cost-effective options for your use cases. ### **Time:** * Displays the average duration of requests in milliseconds. * Useful for benchmarking response times and optimizing performance for latency-sensitive applications. ### **Number of Requests:** * Shows the total number of API calls made. * Helps you analyze usage patterns and allocate resources effectively. ### **Average Time to First Token (TTFT)** * Indicates the average time taken to receive the first token from the API response. * This metric is critical for understanding initial latency. ### **Tokens Per Second (TPS)** * Measures the throughput of token generation. * High TPS is indicative of efficient processing. ### **Time Per Output Token (TPOT)** * Tracks the average time spent per output token. * Helps in identifying and troubleshooting bottlenecks in model output. ### Error Rate * Displays the **percentage of failed requests** over total requests. * Helps monitor **system stability and reliability**. ### Error Request Count * Tracks the **total number of failed API requests**. * Useful for debugging and **troubleshooting failures** effectively. ## Analytics APIs ### [/analytics](https://docs.langdb.ai/docs/api-reference#analytics) Provides a detailed timeseries view of API usage metrics. Users can filter data by time range and group it by provider, model, or tags to analyze trends over different periods.

```bash
# groupBy: provider/tag/model
# Replace <start_time_us> and <end_time_us> with your time range.
curl --location 'https://api.us-east-1.langdb.ai/analytics' \
--header 'x-project-id: langDBProjectID' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer langDBAPIKey' \
--data '{"start_time_us": <start_time_us>, "end_time_us": <end_time_us>, "groupBy": ["provider"]}'
```
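The `start_time_us` and `end_time_us` fields need concrete values before the request can be sent. The sketch below builds a request body from Python datetimes. It assumes the `_us` suffix means microseconds since the Unix epoch, which is consistent with the `period_start` values in the example responses but is not stated explicitly here. The helper names `to_epoch_us` and `analytics_body` are hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone


def to_epoch_us(moment: datetime) -> int:
    """Convert a timezone-aware datetime to integer microseconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Dividing two timedeltas with // yields an exact integer.
    return (moment - epoch) // timedelta(microseconds=1)


def analytics_body(start: datetime, end: datetime, group_by: list) -> str:
    """Build the JSON body for an analytics request (assumed field semantics)."""
    return json.dumps({
        "start_time_us": to_epoch_us(start),
        "end_time_us": to_epoch_us(end),
        "groupBy": group_by,
    })


start = datetime(2025, 1, 22, tzinfo=timezone.utc)
end = start + timedelta(days=30)
print(analytics_body(start, end, ["provider"]))
```

The resulting string can be passed to `--data` in the curl command or sent with any HTTP client.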

Example response: ```json { "timeseries": [ { "hour": "2025-01-23 04:00:00", "total_cost": 0.0006719999999999999, "total_requests": 2, "avg_duration": 814.4, "duration": 814.4, "duration_p99": 1125.4, "duration_p95": 1100.0, "duration_p90": 1068.3, "duration_p50": 814.4, "total_duration": 1628.778, "total_input_tokens": 72, "total_output_tokens": 38, "error_rate": 0.0, "error_request_count": 0, "avg_ttft": 814.4, "ttft": 814.4, "ttft_p99": 1125.4, "ttft_p95": 1100.0, "ttft_p90": 1068.3, "ttft_p50": 814.4, "tps": 67.54, "tps_p99": 110.03, "tps_p95": 107.55, "tps_p90": 104.45, "tps_p50": 79.63, "tpot": 0.04, "tpot_p99": 0.06, "tpot_p95": 0.06, "tpot_p90": 0.06, "tpot_p50": 0.04, "tag_tuple": [ "openai" ] } ] } ``` ### [/analytics/summary](https://docs.langdb.ai/docs/api-reference#analytics-summary) Provides aggregated usage metrics, allowing users to get a high-level overview of API consumption and error rates.

```bash
# groupBy: provider/tag/model
# Replace <start_time_us> and <end_time_us> with your time range.
curl --location 'https://api.us-east-1.langdb.ai/analytics/summary' \
--header 'x-project-id: langDBProjectID' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer langDBAPIKey' \
--data '{"start_time_us": <start_time_us>, "end_time_us": <end_time_us>, "groupBy": ["provider"]}'
```

Example response:

```json
{
  "summary": {
    "tag_tuple": ["togetherai"],
    "total_cost": 0.0015163199999999998,
    "total_requests": 8,
    "total_duration": 5242.402,
    "avg_duration": 655.3,
    "duration": 655.3,
    "duration_p99": 969.2,
    "duration_p95": 962.5,
    "duration_p90": 954.1,
    "duration_p50": 624.3,
    "total_input_tokens": 853,
    "total_output_tokens": 200,
    "avg_ttft": 655.3,
    "ttft": 655.3,
    "ttft_p99": 969.2,
    "ttft_p95": 962.5,
    "ttft_p90": 954.1,
    "ttft_p50": 624.3,
    "tps": 200.86,
    "tps_p99": 336.04,
    "tps_p95": 304.95,
    "tps_p90": 266.08,
    "tps_p50": 186.24,
    "tpot": 0.03,
    "tpot_p99": 0.04,
    "tpot_p95": 0.04,
    "tpot_p90": 0.04,
    "tpot_p50": 0.03,
    "error_rate": 0.0,
    "error_request_count": 0
  }
}
```

### Filtering By Users

As discussed in User Tracking, we can use filters to retrieve insights based on `id`, `name`, or `tags`.

**Available Filters:**

* **`user_id`**: Filter data for a specific user by their unique ID.
* **`user_name`**: Retrieve usage based on the user’s name.
* **`user_tags`**: Filter by tags associated with a user (e.g., "websearch", "support").

```bash
curl -L \
  --request POST \
  --url 'https://api.us-east-1.langdb.ai/analytics/summary' \
  --header 'Authorization: Bearer langDBAPIKey' \
  --header 'X-Project-Id: langDBProjectID' \
  --header 'Content-Type: application/json' \
  --data '{
    "user_id": "123",
    "user_name": "mrunmay",
    "user_tags": ["websearch", "testings"]
  }'
```

Example response:

```json
{
  "summary": [
    {
      "total_cost": 0.00112698,
      "total_requests": 4,
      "total_duration": 31645.018,
      "avg_duration": 7911.3,
      "duration": 7911.3,
      "duration_p99": 9819.3,
      "duration_p95": 9809.0,
      "duration_p90": 9796.1,
      "duration_p50": 8193.2,
      "total_input_tokens": 4429,
      "total_output_tokens": 458,
      "avg_ttft": 7911.3,
      "ttft": 7911.3,
      "ttft_p99": 9819.3,
      "ttft_p95": 9809.0,
      "ttft_p90": 9796.1,
      "ttft_p50": 8193.2,
      "tps": 154.43,
      "tps_p99": 207.79,
      "tps_p95": 206.1,
      "tps_p90": 203.99,
      "tps_p50": 160.85,
      "tpot": 0.07,
      "tpot_p99": 0.1,
      "tpot_p95": 0.09,
      "tpot_p90": 0.09,
      "tpot_p50": 0.07,
      "error_rate": 0.0,
      "error_request_count": 0
    }
  ],
  "start_time_us": 1737576094363076,
  "end_time_us": 1740168094363076
}
```

# Guardrails

Enforce safety, compliance, and quality with LangDB guardrails: moderate content, validate responses, and detect security risks.

LangDB allows developers to enforce specific constraints and checks on their LLM calls, ensuring safety, compliance, and quality control. Guardrails currently support request validation and logging, ensuring structured oversight of LLM interactions.

Guardrail templates available on LangDB.

These guardrails include: * **Content Moderation**: Detects and filters harmful or inappropriate content (e.g., toxicity detection, sentiment analysis). * **Security Checks**: Identifies and mitigates security risks (e.g., PII detection, prompt injection detection). * **Compliance Enforcement**: Ensures adherence to company policies and factual accuracy (e.g., policy adherence, factual accuracy). * **Response Validation**: Validates response format and structure (e.g., word count, JSON schema, regex patterns). Guardrails can be configured via the UI or API, providing flexibility for different use cases. ## Guardrail Behaviour When a guardrail blocks an input or output, the system returns a structured error response. Below are some example responses for different scenarios: ### **Example 1: Input Rejected by Guard** ```json { "id": "", "object": "chat.completion", "created": 0, "model": "", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Input rejected by guard", "tool_calls": null, "refusal": null, "tool_call_id": null }, "finish_reason": "rejected" } ], "usage": { "prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0, "cost": 0.0 } } ``` ### **Example 2: Output Rejected by Guard** ```json { "id": "5ef4d8b1-f700-46ca-8439-b537f58f7dc6", "object": "chat.completion", "created": 1741865840, "model": "gpt-4o-mini", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Output rejected by guard", "tool_calls": null, "refusal": null, "tool_call_id": null }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 21, "completion_tokens": 40, "total_tokens": 61, "cost": 0.000032579999999999996 } } ``` ## Limitations It is important to note that **guardrails cannot be applied to streaming outputs**. ## Guardrail Templates LangDB provides prebuilt templates to enforce various constraints on LLM responses. These templates cover areas such as content moderation, security, compliance, and validation. 
The following table provides quick access to each guardrail template:

| Guardrail | Description |
| --- | --- |
| [Toxicity Detection](#toxicity-detection-content-toxicity) | Detects and filters toxic or harmful content. |
| [JSON Schema Validator](#json-schema-validator-validation-json-schema) | Validates responses against a user-defined JSON schema. |
| [Competitor Mention Check](#competitor-mention-check-content-competitor-mentions) | Detects mentions of competitor names or products. |
| [PII Detection](#pii-detection-security-pii-detection) | Identifies personally identifiable information in responses. |
| [Prompt Injection Detection](#prompt-injection-detection-security-prompt-injection) | Detects attempts to manipulate the AI through prompt injections. |
| [Company Policy Compliance](#company-policy-compliance-compliance-company-policy) | Ensures responses align with company policies. |
| [Regex Pattern Validator](#regex-pattern-validator-validation-regex-pattern) | Validates responses against specified regex patterns. |
| [Word Count Validator](#word-count-validator-validation-word-count) | Ensures responses meet specified word count requirements. |
| [Sentiment Analysis](#sentiment-analysis-content-sentiment-analysis) | Evaluates sentiment to ensure appropriate tone. |
| [Language Validator](#language-validator-content-language-validation) | Checks if responses are in allowed languages. |
| [Topic Adherence](#topic-adherence-content-topic-adherence) | Ensures responses stay on specified topics. |
| [Factual Accuracy](#factual-accuracy-content-factual-accuracy) | Validates that responses contain factually accurate information. |

### **Toxicity Detection (`content-toxicity`)**

Detects and filters out toxic, harmful, or inappropriate content.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| `threshold` | `number` | Confidence threshold for toxicity detection. | Required |
| `categories` | `array` | Categories of toxicity to detect. | `["hate", "harassment", "violence", "self-harm", "sexual", "profanity"]` |
| `evaluation_criteria` | `array` | Criteria used for toxicity evaluation. | `["Hate speech", "Harassment", "Violence", "Self-harm", "Sexual content", "Profanity"]` |

### **JSON Schema Validator (`validation-json-schema`)**

Validates responses against a user-defined JSON schema.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| `schema` | `object` | Custom JSON schema to validate against (replace with your own schema). | Required |

### **Competitor Mention Check (`content-competitor-mentions`)**

Detects mentions of competitor names or products in LLM responses.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| `competitors` | `array` | List of competitor names. | `["company1", "company2"]` |
| `match_partial` | `boolean` | Whether to match partial names. | `true` |
| `case_sensitive` | `boolean` | Whether matching should be case sensitive. | `false` |

### **PII Detection (`security-pii-detection`)**

Detects personally identifiable information (PII) in responses.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| `pii_types` | `array` | Types of PII to detect. | `["email", "phone", "ssn", "credit_card"]` |
| `redact` | `boolean` | Whether to redact detected PII. | `false` |

### **Prompt Injection Detection (`security-prompt-injection`)**

Identifies prompt injection attacks attempting to manipulate the AI.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| `threshold` | `number` | Confidence threshold for injection detection. | Required |
| `detection_patterns` | `array` | Common patterns used in prompt injection attacks. | `["Ignore previous instructions", "Forget your training", "Tell me your prompt"]` |
| `evaluation_criteria` | `array` | Criteria used for detection. | `["Attempts to override system instructions", "Attempts to extract system prompt information", "Attempts to make the AI operate outside its intended purpose"]` |

### **Company Policy Compliance (`compliance-company-policy`)**

Ensures that responses align with predefined company policies.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| `embedding_model` | `string` | Model used for text embedding. | `text-embedding-ada-002` |
| `threshold` | `number` | Similarity threshold for compliance. | Required |
| `dataset` | `object` | Example dataset for compliance checking. | Contains predefined examples |

### **Regex Pattern Validator (`validation-regex-pattern`)**

Validates responses against specific regex patterns.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| `patterns` | `array` | List of regex patterns. | `["^[A-Za-z0-9\s.,!?]+$"]` |
| `match_type` | `string` | Whether all, any, or none of the patterns must match. | `"all"` |

### **Word Count Validator (`validation-word-count`)**

Ensures responses meet specified word count requirements.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| `min_words` | `number` | Minimum number of words required. | `10` |
| `max_words` | `number` | Maximum number of words allowed. | `500` |
| `count_method` | `string` | Method for word counting. | `split` |

### **Sentiment Analysis (`content-sentiment-analysis`)**

Evaluates the sentiment of responses to ensure appropriate tone.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| `allowed_sentiments` | `array` | Allowed sentiment categories. | `["positive", "neutral"]` |
| `threshold` | `number` | Confidence threshold for sentiment detection. | `0.7` |

### **Language Validator (`content-language-validation`)**

Checks if responses are in allowed languages.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| `allowed_languages` | `array` | List of allowed languages. | `["english"]` |
| `threshold` | `number` | Confidence threshold for language detection. | `0.9` |

### **Topic Adherence (`content-topic-adherence`)**

Ensures responses stay on specified topics.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| `allowed_topics` | `array` | List of allowed topics. | `["Product information", "Technical assistance"]` |
| `forbidden_topics` | `array` | List of forbidden topics. | `["politics", "religion"]` |
| `threshold` | `number` | Confidence threshold for topic detection. | `0.7` |

### **Factual Accuracy (`content-factual-accuracy`)**

Validates that responses contain factually accurate information.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| `reference_facts` | `array` | List of reference facts. | `[]` |
| `threshold` | `number` | Confidence threshold for factuality assessment. | `0.8` |
| `evaluation_criteria` | `array` | Criteria used to assess factual accuracy. | `["Contains verifiable information", "Avoids speculative claims"]` |

# User Roles

Set user permissions with LangDB’s role-based system, giving Admins, Developers, and Billing users specific access and controls.

LangDB provides role-based access control to manage users efficiently within an organization. There are three primary roles: **Admin, Developer, and Billing**. Each role has specific permissions and responsibilities, ensuring a structured and secure environment for managing teams.

## **Admin**

Admins have the highest level of control within LangDB. They can:

* Invite and manage users
* Assign and modify roles for team members
* Manage cost groups and usage tracking
* Access billing details and payment settings
* Configure organizational settings
* Configure project model access restrictions
* Configure project user access restrictions

**Best for:** Organization owners, team leads, or IT administrators managing team access and billing.

## **Developer**

Developers focus on working with APIs and integrating LLMs. They have the following permissions:

* Access and use LangDB APIs
* Deploy and test applications using LangDB’s AI Gateway
* View and monitor API usage and performance

**Best for:** Software developers, data scientists, and AI engineers working on LLM integrations.

## Billing

Billing users have access to financial and cost-related features. Their permissions include:

* Managing top-ups and subscriptions
* Monitoring usage costs and optimizing expenses

**Best for:** Finance teams, accounting personnel, and cost management administrators.

***

## Role Management

Admins can assign roles to users when inviting them to the organization. Role changes can also be made later through the user management panel.

### **Key Points:**

* Users can have multiple roles (e.g., both Developer and Billing).
* Only Admins can assign or update roles.
* Billing users cannot modify API access but can track and manage costs.
* **Role Management is only available in Professional, Business, and Enterprise tiers.**

# Cost Control

Control project expenses by setting user and group-based limits, monitoring AI usage, and optimizing costs in LangDB.

LangDB enables **cost tracking, project budgeting, and cost groups** to help manage AI usage efficiently.

## **Cost Groups (Business Tier & Above)**

* **Available in Business & Enterprise tiers** under **User Management**.
* Organize users into **cost groups** to track and allocate spending.
* Cost groups help in **budgeting but are independent of user roles**.

Creating Cost groups for users on LangDB

## **Project-Level Spending Limits** * **Set daily, monthly, and total spending limits** per project. * Enforce **per-user limits** to prevent excessive usage. * Available in **Project Settings → Cost Control**.

## **Cost Group-Based Role Management** * **Admins and Billing users** can define **spending limits for cost groups**. * Set **daily, monthly, and total budgets** per group. * Useful for **controlling team-based expenses** independently of project limits.

# Response Caching Enable response caching in LangDB for faster, lower-cost results on repeated LLM queries. Response caching is designed for **faster response times**, reduced compute cost, and consistent outputs when handling repeated or identical prompts. Perfect for dashboards, agents, and endpoints with predictable queries. ## Benefits * Faster responses for identical requests (cache hit) * Reduced model/token usage for repeated inputs * Consistent outputs for the same input and parameters ## Using Response Caching ### Through Virtual Model 1. Toggle **Response Caching** ON. 2. Select the cache type: * **Exact match** (default): Matches prompt. * *(Distance-based matching is coming soon.)* 3. Set **Cache expiration time** in seconds (default: `1200`).
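To make the exact-match and expiration settings concrete, here is a minimal in-memory sketch of the idea. It is an illustration only, not LangDB's implementation: the `ExactMatchCache` class is hypothetical, and keying on the full request (model, messages, and parameters) is an assumption based on the "same input and parameters" wording above.

```python
import hashlib
import json
import time


class ExactMatchCache:
    """Illustrative exact-match response cache with a fixed expiration time."""

    def __init__(self, expiration_time: float = 1200, clock=time.monotonic):
        self.expiration_time = expiration_time  # seconds, mirroring the default above
        self._clock = clock
        self._entries = {}

    @staticmethod
    def _key(request: dict) -> str:
        # Canonical JSON so that key order does not change the cache key.
        canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def put(self, request: dict, response: str) -> None:
        self._entries[self._key(request)] = (self._clock(), response)

    def get(self, request: dict):
        """Return the cached response, or None on a miss or an expired entry."""
        key = self._key(request)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if self._clock() - stored_at >= self.expiration_time:
            del self._entries[key]
            return None
        return response


# A controllable clock makes the expiry behaviour easy to see.
now = [0.0]
cache = ExactMatchCache(expiration_time=1200, clock=lambda: now[0])
request = {
    "model": "openai/gpt-4.1",
    "messages": [{"role": "user", "content": "Summarize the news today"}],
}
cache.put(request, "cached summary")
print(cache.get(request))  # cached summary
now[0] = 1200.0
print(cache.get(request))  # None
```

Any change to the request, even a single character in the prompt, produces a different key and therefore a miss.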

Once enabled, identical requests will reuse the cached output as long as it hasn’t expired.

### Through API Calls

You can use caching on a per-request basis by including a `cache` field in your API body:

```json
{
  "model": "openai/gpt-4.1",
  "messages": [
    {"role": "user", "content": "Summarize the news today"}
  ],
  "cache": {
    "type": "exact",
    "expiration_time": 1200
  }
}
```

* `type`: Currently only `exact` is supported.
* `expiration_time`: Time in seconds (e.g., 1200 for 20 minutes).

**If caching is enabled in both the virtual model and the request, the API payload takes priority.**

### **Pricing**

* Cache hits are billed at **0.1× the standard token price** (90% cheaper than a normal model call).

## Cache Hits

* When a response is served from cache, it is clearly marked as **Cache: HIT** in traces.
* You’ll also see:
  * **Status:** `200`
  * **Trace ID** and **Thread ID** for debugging
  * **Start time / Finish time**: Notice how the **duration** is typically `<0.01s` for cache hits.
  * **Cost:** Cache hits are billed at a much lower rate (shown here as `$0.000027`).
* The “Cache” field is displayed prominently (green “HIT” label).

Tracing Showcasing LangDB Response Cache

Response caching in LangDB is a practical way to improve latency, reduce compute costs, and ensure consistent outputs for repeated queries. Use the UI or API to configure caching, monitor cache hits in traces and dashboard, and take advantage of reduced pricing for cached responses. For most projects with stable or repeated inputs, enabling caching is a straightforward optimization that delivers immediate benefits. # Prompt Caching Leverage provider-side prompt caching for significant cost and latency savings on large, repeated prompts. To save on inference costs, you can leverage prompt caching on supported providers and models. When a provider supports it, LangDB will make a best-effort to route subsequent requests to the same provider to make use of the warm cache. Most providers automatically enable prompt caching for large prompts, but some, like Anthropic, require you to enable it on a per-message basis. ## How Caching Works ### **Automatic Caching** Providers like OpenAI, Grok, DeepSeek, and (soon) Google Gemini enable caching by default once your prompt exceeds a certain length (e.g. 1024 tokens). * **Activation**: No change needed. Any prompt over the length threshold is written to cache. * **Best Practice**: Put your static content (system prompts, RAG context, long instructions) *first* in the message so it can be reused. * **Pricing**: * **Cache Write**: Mostly free or heavily discounted. * **Cache Read**: Deep discounts vs. fresh inference.

### **Manual Caching:** Anthropic’s Claude family requires you to mark which parts of the message are cacheable by adding a `cache_control` object. You can also set a TTL to control how long the block stays in cache. * **Activation**: You must wrap static blocks in a `content` array and give them a `cache_control` entry. * **TTL**: Use `{"ttl": "5m"}` or `{"ttl": "1h"}` to control expiration (default 5 minutes). * **Best For**: Huge documents, long backstories, or repeated system instructions. * **Pricing**: * **Cache Write**: 1.25× the normal per-token rate * **Cache Read**: 0.1× (10%) of the normal per-token rate * **Limitations**: Ephemeral (expires after TTL), limited number of blocks.

Cache write with Anthropic Prompt Caching — In this run you’ll see “Prompt Caching: 99.9% Write,” a small cost increase (~25%).
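The write and read multipliers above imply that manual caching costs more on the first request and pays off from the second one onward. The sketch below works through that arithmetic for the cached block only. It assumes every request after the first is a cache read within the TTL. The function name `relative_prompt_cost` is hypothetical.

```python
def relative_prompt_cost(num_requests: int,
                         write_multiplier: float = 1.25,
                         read_multiplier: float = 0.1) -> float:
    """Cost of the cached block over num_requests, in units of one uncached pass.

    Assumes one cache write followed by cache reads for every later request.
    """
    if num_requests < 1:
        raise ValueError("num_requests must be at least 1")
    return write_multiplier + read_multiplier * (num_requests - 1)


for n in (1, 2, 10):
    cached = relative_prompt_cost(n)
    print(f"{n} requests: cached {cached:.2f}x vs uncached {n:.2f}x")
# 1 requests: cached 1.25x vs uncached 1.00x
# 2 requests: cached 1.35x vs uncached 2.00x
# 10 requests: cached 2.15x vs uncached 10.00x
```

If the cached block expires between requests, the next request pays the write multiplier again, so real savings depend on how often the prompt is reused within the TTL.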

## **Caching Example ( Anthropic)** Here is an example of caching a large document. This can be done in either the `system` or `user` message. ```json { "model": "anthropic/claude-3.5-sonnet", "messages": [ { "role": "system", "content": [ { "type": "text", "text": "You are a helpful assistant that analyzes legal documents. The following is a terms of service document:" }, { "type": "text", "text": "HUGE DOCUMENT TEXT...", "cache_control": { "type": "ephemeral", "ttl": "1h" } } ] }, { "role": "user", "content": [ { "type": "text", "text": "Summarize the key points about data privacy." } ] } ] } ``` *** ## Provider Support Matrix | Provider | Auto-cache? | Manual flag? | TTL | Write cost | Read cost | | ---------------- | :------------------: | :-------------------: | :-------: | :--------: | :-----------: | | OpenAI | :white\_check\_mark: | :x: | N/A | standard | 0.25x or 0.5x | | Grok | :white\_check\_mark: | :x: | N/A | standard | 0.25x | | DeepSeek | :white\_check\_mark: | :x: | N/A | standard | 0.25x | | Anthropic Claude | :x: | `cache_control` + TTL | 5 m / 1 h | 1.25× | 0.1× | *** > For the most up-to-date information on a specific model or provider's caching policy, pricing, and limitations, please refer to the model page on LangDB # Model Access Control Control which models are available in your projects with LangDB's model access restrictions, ensuring teams only use approved models. Restrict which AI models are available for specific projects. Only **Admins** can configure these restrictions - other roles are bound by the settings.

Model Access Control settings page in Projects

## How It Works * **Admin-only configuration**: Only Admins can set which models are allowed per project * **API enforcement**: Restricted models return access denied errors * **Team-wide**: All project members are bound by the same restrictions * **Universal**: Works across all API endpoints and integrations ## Setup (Admin Only) 1. **Project Settings** → **Model** 2. **Select allowed models** from the list 3. **Save configuration** Test with an API call to verify restrictions are working. ## Common Use Cases * **Cost control**: Restrict expensive models in dev environments * **Production stability**: Only allow tested models in production * **Compliance**: Meet regulatory requirements by limiting model access ## Troubleshooting **"Model not available" errors:** * Check if the model is in the project's allowed list * Verify model restrictions are enabled * Confirm you're using the correct model identifier **Can't modify restrictions:** * Only **Admin** role can configure restrictions # Project Access Control Control which users have access to your projects with LangDB's project-level user access restrictions. Select which users in your organization can access specific projects. Only **Admins** can configure project access - other roles cannot modify these settings.

User Access Control settings page in Projects

## How It Works * **Admin-only configuration**: Only Admins can enable/disable user access per project * **User-level control**: Individual users can be granted or revoked project access * **Role preservation**: Users keep their organization roles but may be restricted from certain projects * **API enforcement**: Users without project access cannot make API calls to restricted projects ## Setup (Admin Only) 1. **Project Settings** → **Users** → **User Access Configuration** 2. **Search and select users** to grant project access 3. **Toggle individual users** on/off for the project 4. **Use "All Users"** toggle to quickly enable/disable everyone 5. **Save configuration** ## User States * **Enabled**: User can access the project and make API calls * **Disabled**: User cannot access the project (blocked from API calls) * **All Users toggle**: Bulk enable/disable all organization users for the project ## Common Use Cases * **Sensitive projects**: Restrict access to confidential or regulated projects * **Client work**: Limit project access to specific team members working with particular clients * **Development stages**: Control access to production vs development projects * **Cost management**: Prevent unauthorized usage by limiting project access ## Troubleshooting ### **"Access denied" errors:** * Check if the user is enabled for the specific project * Verify the user exists in the organization * Confirm the project access configuration is saved ### **Can't modify project access:** * Only **Admin** role can configure project access * Ensure you're in the correct project settings # Getting Started Use LangDB’s Python SDK to generate completions, monitor API usage, retrieve analytics, and evaluate LLM workflows efficiently. 
[![PyPI version](https://badge.fury.io/py/pylangdb.svg)](https://badge.fury.io/py/pylangdb) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

### Key Features

LangDB exposes **two complementary capabilities**:

1. **Chat Completions Client**: Call LLMs using the `LangDb` Python client. This works as a drop-in replacement for `openai.ChatCompletion` while adding automatic usage, cost and latency reporting.
2. **Agent Tracing**: Instrument your existing AI framework (ADK, LangChain, CrewAI, etc.) with a single `init()` call. All calls are routed through the LangDB collector and enriched with additional framework metadata, which is visible on the LangDB dashboard.

***

## Quick Start (Chat Completions)

```bash
pip install pylangdb[client]
```

```python
from pylangdb.client import LangDb

# Initialize LangDB client
client = LangDb(api_key="your_api_key", project_id="your_project_id")

# Simple chat completion
resp = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(resp.choices[0].message.content)
```

***

## Agent Tracing Quick Start

```bash
# Install the package with Google ADK support
pip install pylangdb[adk]
```

```python
# Import and initialize LangDB tracing
# First initialize LangDB before defining any agents
from pylangdb.adk import init
init()

import datetime
from zoneinfo import ZoneInfo
from google.adk.agents import Agent


def get_weather(city: str) -> dict:
    if city.lower() != "new york":
        return {"status": "error", "error_message": f"Weather information for '{city}' is not available."}
    return {"status": "success", "report": "The weather in New York is sunny with a temperature of 25 degrees Celsius (77 degrees Fahrenheit)."}


def get_current_time(city: str) -> dict:
    if city.lower() != "new york":
        return {"status": "error", "error_message": f"Sorry, I don't have timezone information for {city}."}
    tz = ZoneInfo("America/New_York")
    now = datetime.datetime.now(tz)
    return {"status": "success", "report": f'The current time in {city} is {now.strftime("%Y-%m-%d %H:%M:%S %Z%z")}'}


root_agent = Agent(
    name="weather_time_agent",
    model="gemini-2.0-flash",
    description=("Agent to answer questions about the time and weather in a city."),
    instruction=("You are a helpful agent who can answer user questions about the time and weather in a city."),
    tools=[get_weather, get_current_time],
)
```

> **Note:** Always initialize LangDB **before** importing any framework-specific classes to ensure proper instrumentation.

**Example Trace Screenshot**

![Google ADK Trace Example](https://raw.githubusercontent.com/langdb/pylangdb/main/assets/adk.png)

### Supported Frameworks (Tracing)

| Framework | Installation | Import Pattern | Key Features |
| --- | --- | --- | --- |
| Google ADK | `pip install pylangdb[adk]` | `from pylangdb.adk import init` | Automatic sub-agent discovery |
| OpenAI | `pip install pylangdb[openai]` | `from pylangdb.openai import init` | Custom model provider support and Run Tracing |
| LangChain | `pip install pylangdb[langchain]` | `from pylangdb.langchain import init` | Automatic chain tracing |
| CrewAI | `pip install pylangdb[crewai]` | `from pylangdb.crewai import init` | Multi-agent crew tracing |
| Agno | `pip install pylangdb[agno]` | `from pylangdb.agno import init` | Tool usage tracing, model interactions |

## How It Works

LangDB uses intelligent monkey patching to instrument your AI frameworks at runtime:


#### Google ADK

* Patches `Agent.__init__` to inject callbacks
* Tracks agent hierarchies and tool usage
* Maintains thread context across invocations

#### OpenAI

* Intercepts HTTP requests via `AsyncOpenAI.post`
* Propagates trace context via headers
* Correlates spans across agent interactions

#### LangChain

* Modifies `httpx.Client.send` for request tracing
* Automatically tracks chains and agents
* Injects trace headers into all requests

#### CrewAI

* Intercepts `litellm.completion` for LLM calls
* Tracks crew members and task delegation
* Propagates context through LiteLLM headers

#### Agno

* Patches `LangDB.invoke` and client parameters
* Traces workflows and model interactions
* Maintains consistent session context
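To make the patching idea concrete, here is a minimal sketch of how wrapping a class's `__init__` can inject a tracing callback. It is an illustration only, not pylangdb's actual implementation: the `Agent` class is a stand-in rather than `google.adk.agents.Agent`, and `trace_callback` and this `init()` are hypothetical helpers written for this example.

```python
import functools


class Agent:
    """Stand-in for a framework agent class (hypothetical, not google.adk)."""

    def __init__(self, name, *, before_agent_callback=None):
        self.name = name
        self.before_agent_callback = before_agent_callback


def trace_callback(agent_name):
    """Hypothetical tracing hook; a real one would emit a span."""
    return f"trace:{agent_name}"


def init():
    """Wrap Agent.__init__ so every agent created afterwards is traced."""
    original_init = Agent.__init__
    if getattr(original_init, "_patched", False):
        return  # already patched, so calling init() twice is harmless

    @functools.wraps(original_init)
    def patched_init(self, *args, **kwargs):
        user_callback = kwargs.pop("before_agent_callback", None)

        def combined(agent_name):
            # Run the tracing hook first, then the user's own callback.
            trace = trace_callback(agent_name)
            if user_callback is not None:
                user_callback(agent_name)
            return trace

        original_init(self, *args, before_agent_callback=combined, **kwargs)

    patched_init._patched = True
    Agent.__init__ = patched_init


seen = []
early = Agent("early")  # created before init(): not instrumented
init()
init()                  # second call is a no-op
agent = Agent("weather_time_agent", before_agent_callback=seen.append)

print(early.before_agent_callback)
print(agent.before_agent_callback(agent.name))
print(seen)
```

In this sketch, the agent created before `init()` carries no tracing callback, which illustrates why initialization order matters for patch-based instrumentation.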

## Installation

```bash
# For client library functionality (chat completions, analytics, etc.)
pip install pylangdb[client]

# For framework tracing - install specific framework extras
pip install pylangdb[adk]        # Google ADK tracing
pip install pylangdb[openai]     # OpenAI agents tracing
pip install pylangdb[langchain]  # LangChain tracing
pip install pylangdb[crewai]     # CrewAI tracing
pip install pylangdb[agno]       # Agno tracing
```

## Configuration

Set your credentials (or pass them directly to the `init()` function):

```bash
export LANGDB_API_KEY="your-api-key"
export LANGDB_PROJECT_ID="your-project-id"
```

## Client Usage (Chat Completions)

### Initialize LangDb Client

```python
from pylangdb import LangDb

# Initialize with API key and project ID
client = LangDb(api_key="your_api_key", project_id="your_project_id")
```

### Chat Completions

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Say hello!"}
]

response = client.completion(
    model="gemini-1.5-pro-latest",
    messages=messages,
    temperature=0.7,
    max_tokens=100
)
```

### Thread Operations

**Get Messages**

Retrieve messages from a specific thread:

```python
messages = client.get_messages(thread_id="your_thread_id")

# Access message details
for message in messages:
    print(f"Type: {message.type}")
    print(f"Content: {message.content}")
    if message.tool_calls:
        for tool_call in message.tool_calls:
            print(f"Tool: {tool_call.function.name}")
```

**Get Thread Cost**

Get cost and token usage information for a thread:

```python
usage = client.get_usage(thread_id="your_thread_id")
print(f"Total cost: ${usage.total_cost:.4f}")
print(f"Input tokens: {usage.total_input_tokens}")
print(f"Output tokens: {usage.total_output_tokens}")
```

### Analytics

Get analytics data for specific tags:

```python
# Get raw analytics data
analytics = client.get_analytics(
    tags="model1,model2",
    start_time_us=None,  # Optional: defaults to 24 hours ago
    end_time_us=None     # Optional: defaults to current time
)

# Get analytics as a pandas DataFrame
df = client.get_analytics_dataframe(
    tags="model1,model2",
    start_time_us=None,
    end_time_us=None
)
```

#### Evaluate Multiple Threads

```python
df = client.create_evaluation_df(thread_ids=["thread1", "thread2"])
print(df.head())
```

#### List Available Models

```python
models = client.list_models()
print(models)
```

## Framework-Specific Examples (Tracing)

### Google ADK

```python
from pylangdb.adk import init

# Monkey-patch the client for tracing
init()

# Import your agents after initializing tracing
from google.adk.agents import Agent
from travel_concierge.sub_agents.booking.agent import booking_agent
from travel_concierge.sub_agents.in_trip.agent import in_trip_agent
from travel_concierge.sub_agents.inspiration.agent import inspiration_agent
from travel_concierge.sub_agents.planning.agent import planning_agent
from travel_concierge.sub_agents.post_trip.agent import post_trip_agent
from travel_concierge.sub_agents.pre_trip.agent import pre_trip_agent
from travel_concierge.tools.memory import _load_precreated_itinerary

root_agent = Agent(
    model="openai/gpt-4.1",
    name="root_agent",
    description="A Travel Concierge using the services of multiple sub-agents",
    instruction="Instruct the travel concierge to plan a trip for the user.",
    sub_agents=[
        inspiration_agent,
        planning_agent,
        booking_agent,
        pre_trip_agent,
        in_trip_agent,
        post_trip_agent,
    ],
    before_agent_callback=_load_precreated_itinerary,
)
```

### OpenAI

```python
import asyncio
import uuid
import os

# Import LangDB tracing
from pylangdb.openai import init

# Initialize tracing
init()

# Import agent components
from agents import (
    Agent,
    Runner,
    set_default_openai_client,
    RunConfig,
    ModelProvider,
    Model,
    OpenAIChatCompletionsModel
)

# Configure OpenAI client with environment variables
from openai import AsyncOpenAI

client = AsyncOpenAI(
    api_key=os.environ.get("LANGDB_API_KEY"),
    base_url=os.environ.get("LANGDB_API_BASE_URL"),
    default_headers={
        "x-project-id": os.environ.get("LANGDB_PROJECT_ID")
    }
)
set_default_openai_client(client)

# Create a custom model provider
class CustomModelProvider(ModelProvider):
    def get_model(self, model_name: str | None) -> Model:
        return OpenAIChatCompletionsModel(model=model_name, openai_client=client)

CUSTOM_MODEL_PROVIDER = CustomModelProvider()

agent = Agent(
    name="Math Tutor",
    model="gpt-4.1",
    instructions="You are a math tutor who can help students with their math homework.",
)

group_id = str(uuid.uuid4())

# Use the model provider with a unique group_id for tracing
async def run_agent():
    response = await Runner.run(
        agent,
        input="Hello World",
        run_config=RunConfig(
            model_provider=CUSTOM_MODEL_PROVIDER,  # Inject custom model provider
            group_id=group_id  # Link all steps to the same trace
        )
    )
    print(response.final_output)

# Run the async function with asyncio
asyncio.run(run_agent())
```

### LangChain

```python
import os
from pylangdb.langchain import init

init()

# Get environment variables for configuration
api_base = os.getenv("LANGDB_API_BASE_URL")
api_key = os.getenv("LANGDB_API_KEY")
if not api_key:
    raise ValueError("Please set the LANGDB_API_KEY environment variable")

project_id = os.getenv("LANGDB_PROJECT_ID")

# Default headers for API requests
default_headers: dict[str, str] = {
    "x-project-id": project_id
}

# Your existing LangChain code works with proper configuration
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

# Initialize OpenAI LLM with proper configuration
llm = ChatOpenAI(
    model_name="gpt-4",
    temperature=0.3,
    openai_api_base=api_base,
    openai_api_key=api_key,
    default_headers=default_headers,
)

result = llm.invoke([HumanMessage(content="Hello, LangChain!")])
```

### CrewAI

```python
import os
from crewai import Agent, Task, Crew, LLM
from dotenv import load_dotenv

load_dotenv()

# Import and initialize LangDB tracing
from pylangdb.crewai import init

# Initialize tracing before importing or creating any agents
init()

# Initialize API credentials
api_key = os.environ.get("LANGDB_API_KEY")
api_base = os.environ.get("LANGDB_API_BASE_URL")
project_id = os.environ.get("LANGDB_PROJECT_ID")

# Create LLM with proper headers
llm = LLM(
    model="gpt-4",
    api_key=api_key,
    base_url=api_base,
    extra_headers={
        "x-project-id": project_id
    }
)

# Create and use your CrewAI components as usual
# They will be automatically traced by LangDB
researcher = Agent(
    role="researcher",
    goal="Research the topic thoroughly",
    backstory="You are an expert researcher",
    llm=llm,
    verbose=True
)

task = Task(
    description="Research the given topic",
    expected_output="A concise summary of the findings",  # CrewAI tasks require an expected output
    agent=researcher
)

crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()
```

### Agno

```python
import os
from agno.agent import Agent
from agno.tools.duckduckgo import DuckDuckGoTools

# Import and initialize LangDB tracing
from pylangdb.agno import init
init()

# Import LangDB model after initializing tracing
from agno.models.langdb import LangDB

# Create agent with LangDB model
agent = Agent(
    name="Web Agent",
    role="Search the web for information",
    model=LangDB(
        id="openai/gpt-4",
        base_url=os.getenv("LANGDB_API_BASE_URL") + '/' + os.getenv("LANGDB_PROJECT_ID") + '/v1',
        api_key=os.getenv("LANGDB_API_KEY"),
        project_id=os.getenv("LANGDB_PROJECT_ID"),
    ),
    tools=[DuckDuckGoTools()],
    instructions="Answer questions using web search",
    show_tool_calls=True,
    markdown=True,
)

# Use the agent
response = agent.run("What is LangDB?")
```

### Advanced Configuration

#### Environment Variables

| Variable                   | Description                       | Default                                |
| -------------------------- | --------------------------------- | -------------------------------------- |
| `LANGDB_API_KEY`           | Your LangDB API key               | Required                               |
| `LANGDB_PROJECT_ID`        | Your LangDB project ID            | Required                               |
| `LANGDB_API_BASE_URL`      | LangDB API base URL               | `https://api.us-east-1.langdb.ai`      |
| `LANGDB_TRACING_BASE_URL`  | Tracing collector endpoint        | `https://api.us-east-1.langdb.ai:4317` |
| `LANGDB_TRACING`           | Enable/disable tracing            | `true`                                 |
| `LANGDB_TRACING_EXPORTERS` | Comma-separated list of exporters | `otlp`, `console`                      |

#### Custom Configuration

All `init()` functions accept the same optional parameters:

```python
from pylangdb.openai import init

init(
    collector_endpoint='https://api.us-east-1.langdb.ai:4317',
    api_key="langdb-api-key",
    project_id="langdb-project-id"
)
```

### Technical Details

#### Session and Thread Management

* **Thread ID**: Maintains consistent session identifiers across agent calls
* **Run ID**: Unique identifier for each execution trace
* **Invocation Tracking**: Tracks the sequence of agent invocations
* **State Persistence**: Maintains context across callbacks and sub-agent interactions

#### Distributed Tracing

* **OpenTelemetry Integration**: Uses OpenTelemetry for standardized tracing
* **Attribute Propagation**: Automatically propagates LangDB-specific attributes
* **Span Correlation**: Links related spans across different agents and frameworks
* **Custom Exporters**: Supports multiple export formats (OTLP, Console)

### API Reference

#### Initialization Functions

Each framework has a simple `init()` function that handles all necessary setup:

* `pylangdb.adk.init()`: Patches the Google ADK `Agent` class with LangDB callbacks
* `pylangdb.openai.init()`: Initializes OpenAI agents tracing
* `pylangdb.langchain.init()`: Initializes LangChain tracing
* `pylangdb.crewai.init()`: Initializes CrewAI tracing
* `pylangdb.agno.init()`: Initializes Agno tracing

All `init()` functions accept optional parameters for custom configuration (`collector_endpoint`, `api_key`, `project_id`).

## Troubleshooting

#### Common Issues

1. **Missing API Key**: Ensure `LANGDB_API_KEY` and `LANGDB_PROJECT_ID` are set.
2. **Tracing Not Working**: Check that initialization functions are called before creating agents.
3. **Network Issues**: Verify that the collector endpoint is accessible.
4. **Framework Conflicts**: Initialize the LangDB integration before other instrumentation.

# Rate Limits & Free Tier Usage

This document explains how rate limits work on LangDB, especially for free-tier accounts, and how to remove them.

### Free Tier Limits

LangDB’s **free tier** is intended for light experimentation and evaluation. The following limits apply:

#### Model Restrictions

* Without adding credits, access to certain models is restricted.

#### API Call Limit

* **100 LLM calls per day**
* Applies across all models accessed via the LangDB Gateway
* Once the limit is exceeded, requests fail until the daily quota resets

#### IP-Based Enforcement

* On the free tier, rate limits are enforced **per IP address**.
* If one free account exceeds the daily limit from a server or machine, other free accounts making requests from the **same IP** may also be blocked.
* This behavior is intentional, to prevent abuse of the free tier via multiple accounts.

> Note: This IP-based restriction applies **only to free-tier accounts**.

### Removing Limits by Adding Credits

Adding credits to a LangDB account immediately removes all free-tier restrictions.

#### After a Top-Up

* No daily API call limit
* No IP-based rate limiting
* Full access to all available models through the gateway
* Suitable for demos, staging, and production workloads

There is no mandatory monthly subscription. You can add a one-time custom top-up and use credits on a pay-as-you-go basis.
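For free-tier experiments, it can help to track the 100-calls-per-day limit on the client side so a script stops before requests start failing. The sketch below is a hypothetical helper, not part of pylangdb. It assumes the quota resets on a calendar-day boundary, which the limits above do not specify, and it cannot see calls made by other free accounts sharing the same IP address.

```python
import datetime


class DailyCallBudget:
    """Client-side counter for a per-day call quota (hypothetical helper)."""

    def __init__(self, limit=100):
        self.limit = limit
        self.day = None
        self.used = 0

    def try_acquire(self, today=None):
        """Count one call and return True if today's budget is not yet spent."""
        today = today or datetime.date.today()
        if today != self.day:
            # Assumed reset rule: a new calendar day starts a fresh budget.
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


# Small demonstration with a limit of 3 and fixed dates.
budget = DailyCallBudget(limit=3)
day = datetime.date(2025, 1, 1)
results = [budget.try_acquire(day) for _ in range(4)]
next_day_ok = budget.try_acquire(day + datetime.timedelta(days=1))
print(results)      # [True, True, True, False]
print(next_day_ok)  # True
```

A script would call `try_acquire()` before each LLM request and skip or defer the request when it returns `False`.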
### Signup Credits * New accounts currently receive 0.10 USD in signup credits * Earlier accounts may have received higher signup credits (for example, 10 USD) * The current signup credit amount is intentional and may change in the future # API Reference API Endpoints for LangDB ## POST /v1/chat/completions > Create chat completion ```json {"openapi":"3.0.1","info":{"title":"AI Gateway API","version":"1.0"},"servers":[{"url":"https://api.us-east-1.langdb.ai","description":"LangDB API Server"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer"}},"parameters":{"XProjectId":{"name":"X-Project-Id","in":"header","description":"LangDB project ID","required":true,"schema":{"type":"string"}}},"schemas":{"CreateChatCompletionRequest":{"type":"object","required":["model","messages"],"properties":{"model":{"type":"string","description":"ID of the model to use. This can be either a specific model ID or a virtual model identifier."},"messages":{"type":"array","description":"A list of messages in the conversation.","items":{"type":"object","properties":{"role":{"type":"string","enum":["system","user","assistant","tool"]},"name":{"type":"string","description":"Optional name for the participant (e.g., function name)."},"content":{"oneOf":[{"type":"string","description":"Text content."},{"type":"array","description":"Array of content parts for multimodal inputs.","items":{"type":"object","required":["type"],"properties":{"type":{"type":"string","description":"The type of the content part (e.g., text or image_url).","enum":["text","image_url"]},"text":{"type":"string","description":"The text content, required when type=text."},"image_url":{"type":"object","description":"Image URL payload, required when type=image_url.","properties":{"url":{"type":"string"},"detail":{"type":"string","enum":["low","high","auto"]}}}}}}]},"tool_call_id":{"type":"string","description":"Tool call ID this message is responding to (for 
role=tool)."},"tool_calls":{"type":"array","description":"Tool calls requested by the assistant.","items":{"type":"object","properties":{"id":{"type":"string"},"type":{"type":"string","enum":["function"]},"function":{"type":"object","properties":{"name":{"type":"string"},"arguments":{"type":"string"}}}}}},"function_call":{"description":"Deprecated. For legacy function calling responses.","oneOf":[{"type":"string"},{"type":"object","properties":{"name":{"type":"string"}}}]}}}},"temperature":{"type":"number","minimum":0,"maximum":2,"description":"Sampling temperature."},"top_p":{"type":"number","minimum":0,"maximum":1,"description":"Nucleus sampling probability."},"max_tokens":{"type":"integer","minimum":1,"description":"The maximum number of tokens that can be generated in the chat completion."},"n":{"type":"integer","minimum":1,"default":1,"description":"How many chat completion choices to generate for each input message."},"stop":{"description":"Up to 4 sequences where the API will stop generating further tokens.","oneOf":[{"type":"string"},{"type":"array","items":{"type":"string"}}]},"presence_penalty":{"type":"number","minimum":-2,"maximum":2,"description":"Penalize new tokens based on whether they appear in the text so far."},"frequency_penalty":{"type":"number","minimum":-2,"maximum":2,"description":"Penalize new tokens based on their existing frequency in the text so far."},"logit_bias":{"type":"object","description":"Modify the likelihood of specified tokens appearing in the completion.","additionalProperties":{"type":"number"}},"logprobs":{"type":"boolean","description":"Whether to return log probabilities of the output tokens."},"top_logprobs":{"type":"integer","minimum":1,"maximum":20,"description":"The number of most likely tokens to return at each position, for which the log probabilities are returned. 
Requires logprobs=true."},"seed":{"type":"integer","description":"If specified, the backend will make a best effort to return deterministic results."},"response_format":{"description":"Format for the model's response.","oneOf":[{"type":"string","enum":["text","json_object"]},{"type":"object","properties":{"type":{"type":"string","enum":["text","json_object"]}}}]},"tools":{"type":"array","description":"A list of tools the model may call. Currently, only functions are supported as a tool.","items":{"$ref":"#/components/schemas/ChatCompletionTool"}},"tool_choice":{"$ref":"#/components/schemas/ChatCompletionToolChoiceOption"},"parallel_tool_calls":{"$ref":"#/components/schemas/ParallelToolCalls"},"functions":{"description":"Deprecated. Old-style function definitions.","type":"array","items":{"type":"object","required":["name"],"properties":{"name":{"type":"string"},"description":{"type":"string"},"parameters":{"type":"object"}}}},"stream":{"type":"boolean","description":"Whether to stream back partial progress.","default":false},"stream_options":{"type":"object","nullable":true,"default":null,"description":"Options for streaming responses. Only set when stream=true.","properties":{"include_obfuscation":{"type":"boolean","description":"When true, adds obfuscation fields to streaming deltas to normalize payload sizes and mitigate side-channel attacks.\nSet to false to reduce bandwidth overhead when trusted links are used.\n"},"include_usage":{"type":"boolean","description":"When set, an additional final chunk with total token usage is streamed before data: [DONE].\nOther chunks will include a usage field with null values.\nIf the stream is interrupted, you may not receive the final usage chunk.\n"}}},"user":{"type":"string","deprecated":true,"description":"Deprecated. This field is being replaced by `safety_identifier` and `prompt_cache_key`.\nUse `prompt_cache_key` to maintain caching optimizations. 
A stable identifier for your end-users\nwas previously used to boost cache hit rates by better bucketing similar requests and to help detect and prevent abuse.\n"},"safety_identifier":{"type":"string","description":"Stable identifier for your end-users, used to help detect and prevent abuse. Prefer this over `user`.\nFor caching optimization, combine with `prompt_cache_key`.\n"},"prompt_cache_key":{"type":"string","description":"Used to cache responses for similar requests to optimize cache hit rates. LangDB supports prompt caching; see https://docs.langdb.ai/features/prompt-caching. Can be used instead of the user field for cache bucketing."},"mcp_servers":{"type":"array","description":"Model Context Protocol servers to use during the request. These enable capabilities like web search.","items":{"type":"object","required":["server_url","type"],"properties":{"server_url":{"type":"string","description":"URL for the MCP server connection."},"type":{"type":"string","description":"Connection type for MCP server.","enum":["ws","sse"]}}}},"router":{"type":"object","description":"Advanced routing configuration for dynamic model selection based on conditions.","required":["type","routes"],"properties":{"type":{"type":"string","description":"Type of routing strategy.","enum":["conditional"]},"routes":{"type":"array","description":"Array of routing rules that define conditions and target models.","items":{"$ref":"#/components/schemas/RouterRule"}}}},"extra":{"type":"object","description":"Additional configuration options for the completion request.","properties":{"guards":{"type":"array","description":"List of guard identifiers to apply to this request. Guards provide content filtering and validation.","items":{"type":"string","description":"Identifier for a specific guard to apply."}},"user":{"type":"object","description":"User-specific information to associate with this request. 
This can be used for analytics and personalization.","properties":{"id":{"type":"string","description":"Unique identifier for the user."},"name":{"type":"string","description":"Name of the user."},"tags":{"type":"array","description":"List of tags associated with the user. Can be used for categorization or filtering.","items":{"type":"string"}}}}}}}},"ChatCompletionTool":{"type":"object","properties":{"type":{"type":"string","enum":["function"],"description":"The type of the tool. Currently, only `function` is supported."},"function":{"$ref":"#/components/schemas/FunctionObject"}},"required":["type","function"]},"FunctionObject":{"type":"object","properties":{"description":{"type":"string","description":"A description of what the function does."},"name":{"type":"string","description":"The name of the function to be called."},"parameters":{"$ref":"#/components/schemas/FunctionParameters"},"strict":{"type":"boolean","nullable":true,"description":"Whether to enable strict schema adherence."}},"required":["name"]},"FunctionParameters":{"type":"object","description":"The parameters the functions accepts, described as a JSON Schema object."},"ChatCompletionToolChoiceOption":{"description":"Controls which (if any) tool is called by the model. `none` means the model will not call any tool and instead generates a message. `auto` means the model can pick between generating a message or calling one or more tools. `required` means the model must call one or more tools.\n","oneOf":[{"type":"string","enum":["none","auto","required"]},{"$ref":"#/components/schemas/ChatCompletionAllowedToolsChoice"},{"$ref":"#/components/schemas/ChatCompletionNamedToolChoice"}]},"ChatCompletionAllowedToolsChoice":{"type":"object","title":"Allowed tools","description":"Constrains the tools available to the model to a pre-defined set.","properties":{"type":{"type":"string","enum":["allowed_tools"],"description":"Allowed tool configuration type. 
Always `allowed_tools`."},"allowed_tools":{"$ref":"#/components/schemas/ChatCompletionAllowedTools"}},"required":["type","allowed_tools"]},"ChatCompletionAllowedTools":{"type":"object","title":"Allowed tools","description":"Constrains the tools available to the model to a pre-defined set.","properties":{"mode":{"type":"string","enum":["auto","required"],"description":"`auto` allows the model to pick from among the allowed tools and generate a message. `required` requires the model to call one or more of the allowed tools.\n"},"tools":{"type":"array","description":"A list of tool definitions that the model should be allowed to call.","items":{"type":"object","description":"A tool definition that the model should be allowed to call.","additionalProperties":true}}},"required":["mode","tools"]},"ChatCompletionNamedToolChoice":{"type":"object","description":"Specifies a tool the model should use.","properties":{"type":{"type":"string","enum":["function"],"description":"The type of the tool. Currently, only `function` is supported."},"function":{"type":"object","properties":{"name":{"type":"string"}},"required":["name"]}},"required":["type","function"]},"ParallelToolCalls":{"description":"Whether to enable parallel function calling during tool use.","type":"boolean","default":true},"RouterRule":{"type":"object","description":"A routing rule that defines conditions and target models for dynamic routing.","required":["name","conditions","targets"],"properties":{"name":{"type":"string","description":"Unique identifier for this routing rule."},"conditions":{"$ref":"#/components/schemas/RouteConditions"},"targets":{"oneOf":[{"type":"string","description":"Single target model identifier."},{"$ref":"#/components/schemas/RouteTargets"}]}}},"RouteConditions":{"type":"object","description":"Logical conditions for determining when a route should be activated.","properties":{"all":{"type":"array","description":"All conditions must be true (AND 
logic).","items":{"$ref":"#/components/schemas/ConditionExpression"}},"any":{"type":"array","description":"At least one condition must be true (OR logic).","items":{"$ref":"#/components/schemas/ConditionExpression"}}}},"ConditionExpression":{"type":"object","description":"A single condition expression for evaluating request metadata or user information.","additionalProperties":{"oneOf":[{"type":"object","description":"Comparison operators for field evaluation.","properties":{"$eq":{"description":"Equals comparison."},"$neq":{"description":"Not equals comparison."},"$in":{"type":"array","description":"Value is in the provided array.","items":{"description":"Array item for comparison."}},"$lt":{"description":"Less than comparison."},"$gt":{"description":"Greater than comparison."}}},{"type":"string","description":"Direct string value comparison."},{"type":"number","description":"Direct numeric value comparison."},{"type":"boolean","description":"Direct boolean value comparison."}]}},"RouteTargets":{"type":"object","description":"Target model configuration for routing with filtering and sorting options.","properties":{"$any":{"type":"array","description":"Array of target models to choose from with fallback support.","items":{"type":"string","description":"Model identifier (e.g., \"openai/gpt-4o\", \"openai/*\", \"gpt-4o\")."}},"filter":{"type":"object","description":"Filter models based on real-time metrics.","additionalProperties":true},"sort_by":{"type":"string","description":"Sort candidate models by metric.","enum":["price","ttft","throughput","latency"]},"sort_order":{"type":"string","description":"Sort order for candidate models.","enum":["min","max"]}}},"CreateChatCompletionResponse":{"type":"object","description":"Represents a chat completion response returned by model, based on the provided input.","required":["choices","created","id","model","object"],"properties":{"id":{"type":"string","description":"A unique identifier for the chat 
completion."},"choices":{"type":"array","description":"A list of chat completion choices. Can be more than one if `n` is greater than 1.","items":{"type":"object","required":["finish_reason","index","message"],"properties":{"finish_reason":{"type":"string","description":"The reason the model stopped generating tokens. This will be `stop` if the model hit a natural stop point or a provided stop sequence,\n`length` if the maximum number of tokens specified in the request was reached,\n`content_filter` if content was omitted due to a flag from our content filters,\n`tool_calls` if the model called a tool, or `function_call` (deprecated) if the model called a function.\n","enum":["stop","length","tool_calls","content_filter","function_call"]},"index":{"type":"integer","description":"The index of the choice in the list of choices."},"message":{"type":"object","properties":{"role":{"type":"string","enum":["assistant"]},"content":{"type":"string","nullable":true},"tool_calls":{"type":"array","items":{"type":"object","properties":{"id":{"type":"string"},"type":{"type":"string","enum":["function"]},"function":{"type":"object","properties":{"name":{"type":"string"},"arguments":{"type":"string"}}}}}},"function_call":{"description":"Deprecated. 
For legacy function calling responses.","type":"object","nullable":true,"properties":{"name":{"type":"string"},"arguments":{"type":"string"}}}},"required":["role","content"]},"logprobs":{"description":"Log probability information for the choice.","type":"object","nullable":true,"properties":{"content":{"description":"A list of message content tokens with log probability information.","type":"array","items":{"type":"object","properties":{"token":{"type":"string"},"logprob":{"type":"number"}}},"nullable":true},"refusal":{"description":"A list of message refusal tokens with log probability information.","type":"array","items":{"type":"object","properties":{"token":{"type":"string"},"logprob":{"type":"number"}}},"nullable":true}},"required":["content","refusal"]}}}},"created":{"type":"integer","description":"The Unix timestamp (in seconds) of when the chat completion was created."},"model":{"type":"string","description":"The model used for the chat completion."},"system_fingerprint":{"type":"string","description":"This fingerprint represents the backend configuration that the model runs with.\n\nCan be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism.\n"},"object":{"type":"string","description":"The object type, which is always `chat.completion`.","enum":["chat.completion"]},"usage":{"type":"object","description":"Usage statistics for the completion request.","properties":{"prompt_tokens":{"type":"integer","description":"Number of tokens in the prompt."},"completion_tokens":{"type":"integer","description":"Number of tokens in the completion."},"total_tokens":{"type":"integer","description":"Total number of tokens 
used."},"prompt_tokens_details":{"type":"object","properties":{"cached_tokens":{"type":"integer"},"cache_creation_tokens":{"type":"integer"},"audio_tokens":{"type":"integer"}}},"completion_tokens_details":{"type":"object","properties":{"reasoning_tokens":{"type":"integer"},"accepted_prediction_tokens":{"type":"integer"},"rejected_prediction_tokens":{"type":"integer"},"audio_tokens":{"type":"integer"}}},"cost":{"type":"number","description":"Total cost for the request in the provider's billing unit."}}}}}}},"paths":{"/v1/chat/completions":{"post":{"operationId":"createChatCompletion","tags":["Completions"],"summary":"Create chat completion","parameters":[{"$ref":"#/components/parameters/XProjectId"}],"requestBody":{"required":true,"content":{"application/json":{"schema":{"$ref":"#/components/schemas/CreateChatCompletionRequest"}}}},"responses":{"200":{"description":"OK","content":{"application/json":{"schema":{"$ref":"#/components/schemas/CreateChatCompletionResponse"}}}}}}}}} ``` ## Create embeddings > Creates an embedding vector representing the input text or token arrays.
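As a rough illustration of the `EmbeddingsRequest` schema for this endpoint, the sketch below builds, without sending, a `POST /v1/embeddings` request using only the Python standard library. The helper name `build_embeddings_request` and the model ID `text-embedding-3-small` are assumptions made for this example; check which embedding models your project can access.

```python
import json
import urllib.request

BASE_URL = "https://api.us-east-1.langdb.ai"  # server URL from the spec


def build_embeddings_request(api_key, model, texts, dimensions=None):
    """Build (but do not send) a POST /v1/embeddings request."""
    payload = {"model": model, "input": texts, "encoding_format": "float"}
    if dimensions is not None:
        # The schema bounds `dimensions` to the range 1..1536.
        if not 1 <= dimensions <= 1536:
            raise ValueError("dimensions must be between 1 and 1536")
        payload["dimensions"] = dimensions
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # BearerAuth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_embeddings_request(
    "your_api_key",
    "text-embedding-3-small",  # placeholder model ID (assumption)
    ["Hello!"],
    dimensions=256,
)
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

Passing `req` to `urllib.request.urlopen` would send it; according to the response schema, the JSON reply carries the vectors under `data[i].embedding`.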
```json {"openapi":"3.0.1","info":{"title":"AI Gateway API","version":"1.0"},"servers":[{"url":"https://api.us-east-1.langdb.ai","description":"LangDB API Server"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer"}},"schemas":{"EmbeddingsRequest":{"type":"object","required":["model","input"],"properties":{"model":{"type":"string","description":"ID of the model to use for generating embeddings."},"input":{"oneOf":[{"type":"string","description":"The text to embed."},{"type":"array","items":{"type":"string"},"description":"Array of text strings to embed."}]},"encoding_format":{"type":"string","enum":["float","base64"],"default":"float","description":"The format to return the embeddings in."},"dimensions":{"type":"integer","description":"The number of dimensions the resulting embeddings should have.","minimum":1,"maximum":1536}}},"EmbeddingsResponse":{"type":"object","required":["data","model","usage"],"properties":{"data":{"type":"array","items":{"type":"object","required":["embedding","index"],"properties":{"embedding":{"oneOf":[{"type":"array","items":{"type":"number"},"description":"The embedding vector, returned when encoding_format=float."},{"type":"string","description":"The base64-encoded embedding vector, returned when encoding_format=base64."}]},"index":{"type":"integer","description":"The index of this embedding in the input array."}}}},"model":{"type":"string","description":"The model used for generating the embeddings."},"usage":{"type":"object","required":["prompt_tokens","total_tokens"],"properties":{"prompt_tokens":{"type":"integer","description":"The number of tokens in the prompt."},"total_tokens":{"type":"integer","description":"The total number of tokens used (same as prompt_tokens since there is no completion)."}}}}}}},"paths":{"/v1/embeddings":{"post":{"operationId":"generateEmbeddings","tags":["Completions"],"summary":"Create embeddings","description":"Creates an embedding vector 
representing the input text or token arrays.\n","requestBody":{"required":true,"content":{"application/json":{"schema":{"$ref":"#/components/schemas/EmbeddingsRequest"}}}},"responses":{"200":{"description":"Successful response with embeddings","content":{"application/json":{"schema":{"$ref":"#/components/schemas/EmbeddingsResponse"}}}}}}}}} ``` {% openapi src="" path="/threads" method="post" %} [threads.yaml](https://2130779299-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F0peUeEHQMLSxYq8BpQGt%2Fuploads%2FWAcaFQwQf8EZJYQWRZ6x%2Fthreads.yaml?alt=media\&token=55757627-a455-4ca6-ba19-7464debcb4b4) {% endopenapi %} ## GET /threads/{thread\_id}/messages > Retrieve messages for a specific thread ```json {"openapi":"3.0.1","info":{"title":"AI Gateway API","version":"1.0"},"servers":[{"url":"https://api.us-east-1.langdb.ai","description":"LangDB API Server"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer"}},"parameters":{"XProjectId":{"name":"X-Project-Id","in":"header","description":"LangDB project ID","required":true,"schema":{"type":"string"}}},"schemas":{"ThreadMessage":{"type":"object","properties":{"model_name":{"type":"string"},"thread_id":{"type":"string","format":"uuid"},"user_id":{"type":"string"},"content_type":{"type":"string"},"content":{"type":"string","description":"The raw text or structured content of the message"},"content_array":{"type":"array","description":"In some cases, message content may be split into multiple parts","items":{"type":"string"}},"type":{"type":"string","description":"The role or type of message (e.g., system, human, ai)"},"tool_call_id":{"type":"string","format":"uuid","nullable":true,"description":"If relevant, the ID of the tool call"},"tool_calls":{"type":"string","nullable":true,"description":"Any tool calls (if applicable) related to this 
message"},"created_at":{"type":"string","format":"date-time"},"id":{"type":"string","format":"uuid","description":"The unique identifier for the message"}},"required":["model_name","thread_id","user_id","content_type","content","type","created_at","id"]}}},"paths":{"/threads/{thread_id}/messages":{"get":{"operationId":"listThreadMessages","tags":["Threads"],"summary":"Retrieve messages for a specific thread","parameters":[{"$ref":"#/components/parameters/XProjectId"},{"name":"thread_id","in":"path","required":true,"schema":{"type":"string","format":"uuid"},"description":"The ID of the thread to retrieve messages from"}],"responses":{"200":{"description":"A list of messages for the given thread","content":{"application/json":{"schema":{"type":"array","items":{"$ref":"#/components/schemas/ThreadMessage"}}}}}}}}}} ``` ## GET /threads/{thread\_id}/cost > Retrieve the total cost for a specific thread ```json {"openapi":"3.0.1","info":{"title":"AI Gateway API","version":"1.0"},"servers":[{"url":"https://api.us-east-1.langdb.ai","description":"LangDB API Server"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer"}},"parameters":{"XProjectId":{"name":"X-Project-Id","in":"header","description":"LangDB project ID","required":true,"schema":{"type":"string"}}},"schemas":{"ThreadCost":{"type":"object","properties":{"total_cost":{"type":"number","format":"float","description":"The total cost of the thread"},"total_output_tokens":{"type":"integer","description":"The number of output tokens used"},"total_input_tokens":{"type":"integer","description":"The number of input tokens used"}},"required":["total_cost","total_output_tokens","total_input_tokens"]}}},"paths":{"/threads/{thread_id}/cost":{"get":{"operationId":"getThreadCost","tags":["Threads"],"summary":"Retrieve the total cost for a specific 
thread","parameters":[{"$ref":"#/components/parameters/XProjectId"},{"name":"thread_id","in":"path","required":true,"schema":{"type":"string","format":"uuid"},"description":"The ID of the thread for which to retrieve cost information"}],"responses":{"200":{"description":"The total cost and token usage for the specified thread","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ThreadCost"}}}}}}}}} ``` ## POST /analytics > Fetch analytics data ```json {"openapi":"3.0.1","info":{"title":"AI Gateway API","version":"1.0"},"servers":[{"url":"https://api.us-east-1.langdb.ai","description":"LangDB API Server"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer"}},"parameters":{"XProjectId":{"name":"X-Project-Id","in":"header","description":"LangDB project ID","required":true,"schema":{"type":"string"}}},"schemas":{"AnalyticsResponse":{"type":"object","required":["timeseries","start_time_us","end_time_us"],"properties":{"timeseries":{"type":"array","description":"An array of analytics timeseries records.","items":{"type":"object","required":["hour","total_cost","total_requests","avg_duration","duration","duration_p99","duration_p95","duration_p90","duration_p50","total_duration","total_input_tokens","total_output_tokens","error_rate","error_request_count","avg_ttft","ttft","ttft_p99","ttft_p95","ttft_p90","ttft_p50","tps","tps_p99","tps_p95","tps_p90","tps_p50","tpot","tpot_p99","tpot_p95","tpot_p90","tpot_p50","tag_tuple"],"properties":{"hour":{"type":"string","description":"The timestamp for the record (e.g. 
'YYYY-MM-DD HH:mm:ss')."},"total_cost":{"type":"number","format":"double","description":"Total cost incurred during this period."},"total_requests":{"type":"integer","description":"Total number of requests."},"avg_duration":{"type":"number","format":"double","description":"Average duration (in milliseconds) of requests."},"duration":{"type":"number","format":"double","description":"Duration (in milliseconds) of requests."},"duration_p99":{"type":"number","format":"double","description":"99th percentile of request durations."},"duration_p95":{"type":"number","format":"double","description":"95th percentile of request durations."},"duration_p90":{"type":"number","format":"double","description":"90th percentile of request durations."},"duration_p50":{"type":"number","format":"double","description":"50th percentile of request durations."},"total_duration":{"type":"number","format":"double","description":"Total duration (in milliseconds) of all requests."},"total_input_tokens":{"type":"integer","description":"Total number of input tokens used."},"total_output_tokens":{"type":"integer","description":"Total number of output tokens generated."},"error_rate":{"type":"number","format":"double","description":"Error rate (as a percentage) over the period."},"error_request_count":{"type":"integer","description":"Number of error requests."},"avg_ttft":{"type":"number","format":"double","description":"Average time to first byte (TTFT) in milliseconds."},"ttft":{"type":"number","format":"double","description":"Time to first byte (TTFT) in milliseconds."},"ttft_p99":{"type":"number","format":"double","description":"99th percentile of TTFT."},"ttft_p95":{"type":"number","format":"double","description":"95th percentile of TTFT."},"ttft_p90":{"type":"number","format":"double","description":"90th percentile of TTFT."},"ttft_p50":{"type":"number","format":"double","description":"50th percentile of TTFT."},"tps":{"type":"number","format":"double","description":"Transactions per 
second."},"tps_p99":{"type":"number","format":"double","description":"99th percentile TPS."},"tps_p95":{"type":"number","format":"double","description":"95th percentile TPS."},"tps_p90":{"type":"number","format":"double","description":"90th percentile TPS."},"tps_p50":{"type":"number","format":"double","description":"50th percentile TPS."},"tpot":{"type":"number","format":"double","description":"Average transactions per output token (Tpot)."},"tpot_p99":{"type":"number","format":"double","description":"99th percentile of Tpot."},"tpot_p95":{"type":"number","format":"double","description":"95th percentile of Tpot."},"tpot_p90":{"type":"number","format":"double","description":"90th percentile of Tpot."},"tpot_p50":{"type":"number","format":"double","description":"50th percentile of Tpot."},"tag_tuple":{"type":"array","items":{"type":"string"},"description":"A tuple of tags associated with the record."}}}},"start_time":{"type":"integer","format":"int64","description":"Start time in seconds since epoch."},"end_time":{"type":"integer","format":"int64","description":"End time in seconds since epoch."},"start_time_us":{"type":"integer","format":"int64","description":"Start time in microseconds since epoch. Deprecated, use start_time instead.","deprecated":true},"end_time_us":{"type":"integer","format":"int64","description":"End time in microseconds since epoch. Deprecated, use end_time instead.","deprecated":true}}}}},"paths":{"/analytics":{"post":{"tags":["Analytics"],"summary":"Fetch analytics data","parameters":[{"$ref":"#/components/parameters/XProjectId"}],"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","properties":{"start_time_us":{"type":"integer","format":"int64","description":"Start time in microseconds.","deprecated":true},"end_time_us":{"type":"integer","format":"int64","description":"End time in microseconds.","deprecated":true},"period":{"type":"string","description":"Time period for filtering data. 
If provided, start_time and end_time will be ignored.","enum":["last_minute","last15_minute","last_15_minutes","last_hour","last3_hours","last_3_hours","last12_hours","last_12_hours","last_day","last_week","last_month","last_year"]}}}}}},"responses":{"200":{"description":"Successful response","content":{"application/json":{"schema":{"$ref":"#/components/schemas/AnalyticsResponse"}}}}}}}}} ``` ## POST /analytics/summary > Fetch analytics summary ```json {"openapi":"3.0.1","info":{"title":"AI Gateway API","version":"1.0"},"servers":[{"url":"https://api.us-east-1.langdb.ai","description":"LangDB API Server"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer"}},"parameters":{"XProjectId":{"name":"X-Project-Id","in":"header","description":"LangDB project ID","required":true,"schema":{"type":"string"}}},"schemas":{"AnalyticsSummaryResponse":{"type":"object","required":["summary","start_time_us","end_time_us"],"properties":{"summary":{"type":"array","description":"An array of aggregated analytics summary records.","items":{"$ref":"#/components/schemas/AnalyticsSummaryItem"}},"start_time":{"type":"integer","format":"int64","description":"Start time in seconds since epoch."},"end_time":{"type":"integer","format":"int64","description":"End time in seconds since epoch."},"start_time_us":{"type":"integer","format":"int64","description":"Start time in microseconds since epoch. Deprecated, use start_time instead.","deprecated":true},"end_time_us":{"type":"integer","format":"int64","description":"End time in microseconds since epoch. 
Deprecated, use end_time instead.","deprecated":true}}},"AnalyticsSummaryItem":{"type":"object","required":["tag_tuple","total_cost","total_requests","total_duration","avg_duration","duration","duration_p99","duration_p95","duration_p90","duration_p50","total_input_tokens","total_output_tokens","avg_ttft","ttft","ttft_p99","ttft_p95","ttft_p90","ttft_p50","tps","tps_p99","tps_p95","tps_p90","tps_p50","tpot","tpot_p99","tpot_p95","tpot_p90","tpot_p50","error_rate","error_request_count"],"properties":{"tag_tuple":{"type":"array","items":{"type":"string"},"description":"The grouping key(s) used for this summary. In this example the key is returned as tag_tuple. (Depending on the groupBy parameter, this field might contain provider names or model names.)\n"},"total_cost":{"type":"number","format":"double","description":"Aggregated total cost in USD for the group."},"total_requests":{"type":"integer","description":"Aggregated total number of requests for the group."},"total_duration":{"type":"number","format":"double","description":"Aggregated total duration (in milliseconds) for the group."},"avg_duration":{"type":"number","format":"double","description":"Average duration (in milliseconds) of requests for the group."},"duration":{"type":"number","format":"double","description":"Representative duration (in milliseconds) for the group."},"duration_p99":{"type":"number","format":"double","description":"99th percentile of request durations."},"duration_p95":{"type":"number","format":"double","description":"95th percentile of request durations."},"duration_p90":{"type":"number","format":"double","description":"90th percentile of request durations."},"duration_p50":{"type":"number","format":"double","description":"50th percentile of request durations."},"total_input_tokens":{"type":"integer","description":"Aggregated total input tokens used in the group."},"total_output_tokens":{"type":"integer","description":"Aggregated total output tokens generated in the 
group."},"avg_ttft":{"type":"number","format":"double","description":"Average time-to-first-byte (TTFT) in milliseconds for the group."},"ttft":{"type":"number","format":"double","description":"Representative TTFT value in milliseconds for the group."},"ttft_p99":{"type":"number","format":"double","description":"99th percentile of TTFT."},"ttft_p95":{"type":"number","format":"double","description":"95th percentile of TTFT."},"ttft_p90":{"type":"number","format":"double","description":"90th percentile of TTFT."},"ttft_p50":{"type":"number","format":"double","description":"50th percentile of TTFT."},"tps":{"type":"number","format":"double","description":"Aggregated transactions per second (TPS) for the group."},"tps_p99":{"type":"number","format":"double","description":"99th percentile of TPS."},"tps_p95":{"type":"number","format":"double","description":"95th percentile of TPS."},"tps_p90":{"type":"number","format":"double","description":"90th percentile of TPS."},"tps_p50":{"type":"number","format":"double","description":"50th percentile of TPS."},"tpot":{"type":"number","format":"double","description":"Aggregated average transactions per output token (TPOT) for the group."},"tpot_p99":{"type":"number","format":"double","description":"99th percentile of TPOT."},"tpot_p95":{"type":"number","format":"double","description":"95th percentile of TPOT."},"tpot_p90":{"type":"number","format":"double","description":"90th percentile of TPOT."},"tpot_p50":{"type":"number","format":"double","description":"50th percentile of TPOT."},"error_rate":{"type":"number","format":"double","description":"Error rate (as a percentage) for the group."},"error_request_count":{"type":"integer","description":"Total number of error requests in the group."}}}}},"paths":{"/analytics/summary":{"post":{"tags":["Analytics"],"summary":"Fetch analytics 
summary","parameters":[{"$ref":"#/components/parameters/XProjectId"}],"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","required":["groupBy"],"properties":{"start_time_us":{"type":"integer","format":"int64","deprecated":true},"end_time_us":{"type":"integer","format":"int64","deprecated":true},"period":{"type":"string","description":"Time period for filtering data. If provided, start_time and end_time will be ignored.","enum":["last_minute","last15_minute","last_15_minutes","last_hour","last3_hours","last_3_hours","last12_hours","last_12_hours","last_day","last_week","last_month","last_year"]},"groupBy":{"type":"array","items":{"type":"string"}}}}}}},"responses":{"200":{"description":"Successful response","content":{"application/json":{"schema":{"$ref":"#/components/schemas/AnalyticsSummaryResponse"}}}}}}}}} ``` ## POST /usage/total > Get total usage ```json {"openapi":"3.0.1","info":{"title":"AI Gateway API","version":"1.0"},"servers":[{"url":"https://api.us-east-1.langdb.ai","description":"LangDB API Server"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer"}},"parameters":{"XProjectId":{"name":"X-Project-Id","in":"header","description":"LangDB project ID","required":true,"schema":{"type":"string"}}},"schemas":{"TotalUsageResponse":{"type":"object","properties":{"models":{"type":"array","items":{"type":"object","properties":{"provider":{"type":"string"},"model_name":{"type":"string"},"total_input_tokens":{"type":"integer"},"total_output_tokens":{"type":"integer"},"total_cost":{"type":"number","format":"float"},"cost_per_input_token":{"type":"number","format":"float"},"cost_per_output_token":{"type":"number","format":"float"}}}},"total":{"type":"object","properties":{"total_input_tokens":{"type":"integer"},"total_output_tokens":{"type":"integer"},"total_cost":{"type":"number","format":"float"}}},"period_start":{"type":"integer","format":"int64","description":"Start 
of the period in seconds since epoch."},"period_end":{"type":"integer","format":"int64","description":"End of the period in seconds since epoch."},"period_start_us":{"type":"integer","format":"int64","description":"Start of the period in microseconds since epoch. Deprecated, use period_start instead.","deprecated":true},"period_end_us":{"type":"integer","format":"int64","description":"End of the period in microseconds since epoch. Deprecated, use period_end instead.","deprecated":true}},"required":["models","total","period_start","period_end"]}}},"paths":{"/usage/total":{"post":{"tags":["Analytics"],"summary":"Get total usage","parameters":[{"$ref":"#/components/parameters/XProjectId"}],"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","properties":{"start_time_us":{"type":"integer","format":"int64","deprecated":true},"end_time_us":{"type":"integer","format":"int64","description":"End time in microseconds.","deprecated":true},"period":{"type":"string","description":"Time period for filtering data. 
If provided, start_time and end_time will be ignored.","enum":["last_minute","last15_minute","last_15_minutes","last_hour","last3_hours","last_3_hours","last12_hours","last_12_hours","last_day","last_week","last_month","last_year"]}}}}}},"responses":{"200":{"description":"OK","content":{"application/json":{"schema":{"$ref":"#/components/schemas/TotalUsageResponse"}}}}}}}}} ``` ## POST /usage/models > Get usage by model ```json {"openapi":"3.0.1","info":{"title":"AI Gateway API","version":"1.0"},"servers":[{"url":"https://api.us-east-1.langdb.ai","description":"LangDB API Server"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer"}},"parameters":{"XProjectId":{"name":"X-Project-Id","in":"header","description":"LangDB project ID","required":true,"schema":{"type":"string"}}},"schemas":{"GranularUsageResponse":{"type":"object","properties":{"models":{"type":"array","description":"Usage statistics for each model at the specified granularity (hourly, daily, or monthly)","items":{"type":"object","properties":{"hour":{"type":"string","description":"Timestamp for the record"},"provider":{"type":"string"},"model_name":{"type":"string"},"total_input_tokens":{"type":"integer"},"total_output_tokens":{"type":"integer"},"total_cost":{"type":"number","format":"float"}}}},"period_start":{"type":"integer","format":"int64","description":"Start of the usage period in microseconds since epoch"},"period_end":{"type":"integer","format":"int64","description":"End of the usage period in microseconds since epoch"}},"required":["models","period_start","period_end"]}}},"paths":{"/usage/models":{"post":{"tags":["Analytics"],"summary":"Get usage by 
model","parameters":[{"$ref":"#/components/parameters/XProjectId"}],"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","properties":{"start_time_us":{"type":"integer","format":"int64"},"end_time_us":{"type":"integer","format":"int64"},"min_unit":{"type":"string","enum":["hour","day","month"],"description":"The granularity of the returned usage data."}}}}}},"responses":{"200":{"description":"Successful response","content":{"application/json":{"schema":{"$ref":"#/components/schemas/GranularUsageResponse"}}}}}}}}} ``` ## Retrieve pricing information > Returns the pricing details for LangDB services. ```json {"openapi":"3.0.1","info":{"title":"AI Gateway API","version":"1.0"},"servers":[{"url":"https://api.us-east-1.langdb.ai","description":"LangDB API Server"}],"paths":{"/pricing":{"get":{"operationId":"getPricing","tags":["Pricing"],"summary":"Retrieve pricing information","description":"Returns the pricing details for LangDB services.","responses":{"200":{"description":"Successful retrieval of pricing information","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ModelPricing"}}}}}}}},"components":{"schemas":{"ModelPricing":{"type":"object","properties":{"model":{"type":"string"},"provider":{"type":"string"},"price":{"$ref":"#/components/schemas/PriceDetails"},"input_formats":{"type":"array","items":{"type":"string"}},"output_formats":{"type":"array","items":{"type":"string"}},"capabilities":{"type":"array","items":{"type":"string"}},"type":{"type":"string"},"limits":{"$ref":"#/components/schemas/UsageLimits"}}},"PriceDetails":{"type":"object","properties":{"per_input_token":{"type":"number","format":"float"},"per_output_token":{"type":"number","format":"float"},"valid_from":{"type":"string","format":"date-time","nullable":true}}},"UsageLimits":{"type":"object","properties":{"max_context_size":{"type":"integer"}}}}}} ``` ## GET /models > List models ```json {"openapi":"3.0.1","info":{"title":"AI Gateway 
API","version":"1.0"},"servers":[{"url":"https://api.us-east-1.langdb.ai","description":"LangDB API Server"}],"paths":{"/models":{"get":{"operationId":"listModels","tags":["Models"],"summary":"List models","responses":{"200":{"description":"OK","content":{"application/json":{"schema":{"type":"object","properties":{"object":{"type":"string","description":"Type of the response, e.g. 'list'."},"data":{"type":"array","description":"Array of model objects.","items":{"type":"object","properties":{"id":{"type":"string","description":"Unique identifier for the model."},"object":{"type":"string","description":"Object type, typically 'model'."},"created":{"type":"integer","description":"Unix timestamp when the model was created."},"owned_by":{"type":"string","description":"Owner of the model."}}}}}}}}}}}}}} ``` ## Create a new model > Register and configure a new LLM under your LangDB project ```json {"openapi":"3.0.1","info":{"title":"LangDB Admin Models API","version":"1.0"},"servers":[{"url":"https://api.xxx.langdb.ai","description":"Enterprise API server"}],"security":[{"bearerAuth":[]}],"components":{"securitySchemes":{"bearerAuth":{"type":"http","scheme":"bearer","bearerFormat":"JWT"}}},"paths":{"/admin/models":{"post":{"tags":["Models","Admin"],"summary":"Create a new model","description":"Register and configure a new LLM under your LangDB project","parameters":[{"name":"X-Admin-Key","in":"header","required":true,"description":"LangDB Admin 
Key","schema":{"type":"string"}}],"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","required":["model_name","description","provider_info_id","project_id","model_type","owner_name","priority"],"properties":{"model_name":{"type":"string"},"description":{"type":"string"},"provider_info_id":{"type":"string","format":"uuid"},"project_id":{"type":"string"},"public":{"type":"boolean"},"request_response_mapping":{"type":"string"},"model_type":{"type":"string"},"input_token_price":{"type":"number","format":"float","nullable":true},"output_token_price":{"type":"number","format":"float","nullable":true},"context_size":{"type":"integer","nullable":true},"capabilities":{"type":"array","items":{"type":"string"}},"input_types":{"type":"array","items":{"type":"string"}},"output_types":{"type":"array","items":{"type":"string"}},"tags":{"type":"array","items":{"type":"string"}},"type_prices":{"type":"object","nullable":true,"additionalProperties":{"type":"number"}},"mp_price":{"type":"number","format":"float","nullable":true},"owner_name":{"type":"string"},"priority":{"type":"integer"},"model_name_in_provider":{"type":"string"},"parameters":{"type":"object","description":"Additional configuration 
parameters"}}}}}},"responses":{"200":{"description":"Created","content":{"application/json":{"schema":{"type":"object","properties":{"id":{"type":"string","format":"uuid"},"model_name":{"type":"string"},"description":{"type":"string"},"provider_info_id":{"type":"string","format":"uuid"},"model_type":{"type":"string"},"input_token_price":{"type":"string"},"output_token_price":{"type":"string"},"context_size":{"type":"integer"},"capabilities":{"type":"array","items":{"type":"string"}},"input_types":{"type":"array","items":{"type":"string"}},"output_types":{"type":"array","items":{"type":"string"}},"tags":{"type":"array","items":{"type":"string"}},"type_prices":{"type":"object","nullable":true,"additionalProperties":{"type":"number"}},"mp_price":{"type":"number","format":"float","nullable":true},"model_name_in_provider":{"type":"string"},"owner_name":{"type":"string"},"priority":{"type":"integer"},"parameters":{"type":"object"}}}}}}}}}}} ``` ## Set custom prices for imported models > Set custom pricing for models imported from providers like Bedrock, Azure, Vertex that do not have built-in pricing ```json {"openapi":"3.0.1","info":{"title":"AI Gateway API","version":"1.0"},"servers":[{"url":"https://api.us-east-1.langdb.ai","description":"LangDB API Server"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer"}},"schemas":{"CustomPricesRequest":{"type":"object","description":"Request body for setting custom prices on imported models","additionalProperties":{"$ref":"#/components/schemas/ModelPricingDetails"}},"ModelPricingDetails":{"type":"object","required":["per_input_token","per_output_token"],"properties":{"per_input_token":{"type":"number","format":"float","description":"Price per input token"},"per_output_token":{"type":"number","format":"float","description":"Price per output token"}}},"CustomPricesResponse":{"type":"object","description":"Custom pricing for imported 
models","additionalProperties":{"$ref":"#/components/schemas/ModelPricingDetails"}}}},"paths":{"/projects/{project_id}/custom_prices":{"post":{"operationId":"setCustomPrices","tags":["Models","Pricing"],"summary":"Set custom prices for imported models","description":"Set custom pricing for models imported from providers like Bedrock, Azure, Vertex that do not have built-in pricing","parameters":[{"name":"project_id","in":"path","required":true,"description":"UUID of the project","schema":{"type":"string","format":"uuid"}}],"requestBody":{"required":true,"content":{"application/json":{"schema":{"$ref":"#/components/schemas/CustomPricesRequest"}}}},"responses":{"200":{"description":"Custom prices set successfully","content":{"application/json":{"schema":{"$ref":"#/components/schemas/CustomPricesResponse"}}}}}}}}} ``` # Getting Started with Agents Explore step-by-step tutorials for building AI agents. {% embed url="" %} Quick Overview of Building Agents and LangDB {% endembed %}


* [Building Report Writing Agent Using CrewAI](building-reporting-writing-agent-using-crewai)
* [Building Travel Agent with OpenAI](building-travel-agent-with-openai-agents-sdk)
* [Building Web Search Agent with Google-ADK](building-web-search-agent-with-google-adk)
* [Building Travel Concierge with Google ADK](building-travel-concierge-with-google-adk)
* [Building a Reasoning Finance Team with Agno](building-a-reasoning-finance-team-with-agno)
* [Building Complex Data Extraction with LangGraph](building-complex-data-extraction-with-langgraph)

# Building Report Writing Agent Using CrewAI

Build a multi-agent report system using CrewAI and LangDB. Learn to configure agents, integrate MCP tools, and trace end-to-end report generation workflows.

This guide walks through the full setup, from configuring your agents to sharing a public execution trace.

## Code

* LangDB Samples:

## Goal

Create a report-writing AI system where:

* A **Researcher Agent** gathers up-to-date information using web tools like Tavily Search.
* An **Analyst Agent** processes and synthesizes the findings.
* A **Report Writer Agent** generates a clean, markdown-formatted report.

LangDB enables seamless model routing, tracing, and observability across this pipeline, including full visibility into MCP tool calls like Tavily Search used by the Researcher Agent.

### Installation

```bash
pip install crewai 'pylangdb[crewai]' python-dotenv
```

## Project Structure

```
report-writing-agent/
├── configs
│   ├── agents.yaml
│   └── tasks.yaml
├── main.py
└── README.mdc
```

## Quick Start Example

The code snippets below illustrate the key components of the three-agent reporting workflow.

### Export Environment Variables

```bash
export LANGDB_API_KEY=""
export LANGDB_PROJECT_ID=""
export LANGDB_API_BASE_URL='https://api.us-east-1.langdb.ai'
```

### Initialize Tracing

The first and most important step is to initialize `pylangdb` tracing **before** any other CrewAI code runs. This call instruments the environment to automatically capture all agent and tool activity.

```python
from pylangdb.crewai import init
from dotenv import load_dotenv

# Load environment variables and initialize tracing
load_dotenv()
init()
```

## Configure the LLM

Next, define a helper function to instantiate LLMs with the necessary LangDB tracing headers. This ensures all model calls are captured.

```python
from crewai import LLM
import os

def create_llm(model):
    # Route every model call through the LangDB gateway
    return LLM(
        model=model,
        api_key=os.environ.get("LANGDB_API_KEY"),
        # Same variable name as exported above
        base_url=os.environ.get("LANGDB_API_BASE_URL"),
        extra_headers={
            "x-project-id": os.environ.get("LANGDB_PROJECT_ID")
        }
    )
```

## Define the Crew

Then use the `@CrewBase` decorator to define the agents and tasks, loading configurations from external YAML files.
```python from crewai import Agent, Task, Crew, Process from crewai.project import CrewBase, agent, crew, task @CrewBase class ReportGenerationCrew(): """Report Generation crew""" agents_config = "configs/agents.yaml" tasks_config = "configs/tasks.yaml" @agent def researcher(self) -> Agent: return Agent( config=self.agents_config['researcher'], llm=create_llm("openai/langdb/reportresearcher_9wzgx5n5" ) # LangDB Virtual Model with MCP ) @agent def analyst(self) -> Agent: return Agent( config=self.agents_config['analyst'], llm=create_llm("openai/anthropic/claude-3.7-sonnet") ) # ... report_writer agent definition follows the same pattern ... @crew def crew(self) -> Crew: return Crew( agents=[self.researcher(), self.analyst(), self.report_writer()], tasks=[self.research_task(), self.analysis_task(), self.report_writing_task()], process=Process.sequential ) ``` ## Run the Workflow Finally, instantiate the crew and `kickoff()` the process with a specific topic. ```python def generate_report(topic): crew_instance = ReportGenerationCrew() # ... update task descriptions with the topic ... result = crew_instance.crew().kickoff() return result if __name__ == "__main__": generate_report("The Impact of AI on Social Media Marketing in 2024") ``` Running the script will generate a full trace in LangDB that includes every model call, tool invocation, and inter-agent message. ### Configs #### agents.yaml ```yaml researcher: role: "Domain Researcher" goal: "Gather up-to-date information on the given topic" backstory: "Experienced researcher with expertise in finding credible sources." ``` #### tasks.yaml ```yaml research_task: description: > Research the given topic thoroughly using web search tools. Gather current information, statistics, trends, and key insights. Focus on finding credible sources and recent developments. expected_output: > A comprehensive research summary with key findings, statistics, and relevant information about the topic. Include source citations. 
``` Full source code—including additional agents, tasks, and YAML configs—is available in the repository: . ## Configuring MCP and Models To enable the Researcher Agent to retrieve fresh, real‑time information and ensure every search query is recorded for auditing and debugging, we configure a Virtual MCP Server and attach it to a Virtual Model. This setup provides: * **Live Web Search**: Integrate external search capabilities directly into your agent. * **Traceability**: All MCP tool calls (search queries, parameters, responses) are logged in LangDB for observability and troubleshooting. * **Consistency**: Using a dedicated MCP Server ensures uniform search behavior across runs. ### Steps To Create a Virtual MCP 1. In LangDB UI, navigate to **Projects → MCP Servers**. 2. Click **+ New Virtual MCP Server**: * **Name**: `web-search-mcp` * **Underlying MCP**: Tavily Search MCP * **Requires API Key**: Make sure Tavily API Key is configured in your environment to authenticate this operation. 3. Navigate to **Models → + New Virtual Model**: * **Name**: `report-researcher` * **Base Model**: GPT-4.1 or similar * **Attach**: `web-search-mcp` as the search tool 4. Copy the model identifier (e.g. `openai/langdb/report-researcher`) and use it in the Researcher agent.
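Step 4 can be kept out of the source code by reading the identifier from the environment. The sketch below is a minimal illustration, assuming a hypothetical `RESEARCHER_MODEL` variable (not part of LangDB or CrewAI) and falling back to the example identifier from step 4.

```python
import os

# Example identifier from step 4; replace it with the one generated for your project.
DEFAULT_RESEARCHER_MODEL = "openai/langdb/report-researcher"


def researcher_model_id(env=None):
    """Return the virtual model identifier for the Researcher agent.

    RESEARCHER_MODEL is a hypothetical variable name used only in this sketch.
    """
    env = os.environ if env is None else env
    model_id = env.get("RESEARCHER_MODEL", "").strip()
    # Fall back to the documented example when the variable is unset or blank
    return model_id or DEFAULT_RESEARCHER_MODEL


print(researcher_model_id())
```

The returned string can then be passed to `create_llm()` in the Researcher agent definition.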

Setting Up Virtual MCP Server With Virtual Model

LangDB will log all MCP calls for traceability. ## Custom Model Usage You can use any model available on LangDB. When specifying a model, ensure it follows the LiteLLM naming convention for provider-specific models (e.g., `openai/gpt-4o`, `anthropic/claude-3-sonnet-20240229`). To customize, simply update the `create_llm()` calls with your preferred model identifiers: ```python tool_llm = create_llm("openai/langdb/report-researcher") analysis_llm = create_llm("openai/gpt-4o") writer_llm = create_llm("openai/google/gemini-2.5-pro") ``` Ensure the model string matches a valid LangDB or OpenAI namespace. All routing, tracing, and MCP integrations remain identical regardless of the model. > When you create a new Virtual Model in LangDB, it will generate a unique model name (for example, `openai/langdb/report-researcher@v1`).\ > **Be sure to replace** the example model name in your `main.py` and in your agent config files with the actual model name generated for your project. ## Running the Agent Execute the workflow by passing a topic: ``` python main.py "The Impact of AI on Social Media Marketing in 2024" ``` The CLI will prompt for a topic if none is provided. ## Conclusion Below is a real, shareable example of a generated report and full execution trace using this pipeline: {% embed url="" %} ### References * [LangDB Virtual MCP Servers](https://app.gitbook.com/s/0peUeEHQMLSxYq8BpQGt/concepts/virtual-mcp-servers) * [LangDB Virtual Models](https://app.gitbook.com/s/0peUeEHQMLSxYq8BpQGt/concepts/virtual-models) * [CrewAI Documentation](https://docs.crewai.com/) # Building Travel Agent with OpenAI Agents SDK Integrate OpenAI Agents SDK with LangDB for multi-agent travel workflows. Configure guardrails, virtual MCP search, and model routing for reliable outputs.

Check Out: https://app.langdb.ai/sharing/threads/43cfa16f-042e-44ca-ad21-06f52afeca39

## Code

* LangDB Samples:

This guide illustrates how to build a multi-agent travel query workflow using the OpenAI Agents SDK, augmented by LangDB for guardrails, tracing, virtual MCP servers (tool integration), and model routing.

OpenAI introduced the **Agents SDK**, a lightweight, Python-first toolkit for building agentic AI apps. It’s built around three primitives:

* **Agents**: LLMs paired with tools and instructions to complete tasks autonomously.
* **Handoffs**: Let agents delegate tasks to other agents.
* **Guardrails**: Validate inputs/outputs to keep workflows safe and reliable.

## Overview

The workflow is a 4-agent pipeline:

1. **Query Router Agent**: Routes user queries to the appropriate specialist agent.
2. **Booking Specialist**: Manages booking-related requests.
3. **Travel Recommendation Specialist**: Provides destination recommendations with web search support.
4. **Reply Agent**: Formats the final output for the user.

## Installation

```bash
pip install openai-agents 'pylangdb[openai]' python-dotenv
```

## Environment Variables

```bash
export LANGDB_API_KEY=""
export LANGDB_PROJECT_ID=""
export LANGDB_API_BASE_URL="https://api.us-east-1.langdb.ai"
```

## Code Walkthrough

The snippets below break down how to configure the OpenAI Agents SDK with LangDB for end-to-end tracing and custom model routing.

### Initialize LangDB Tracing

First, initialize `pylangdb` tracing. This must be the first step to ensure all subsequent SDK operations are captured.

```python
import os
import asyncio
import uuid
from dotenv import load_dotenv
from pylangdb.openai import init

# Load environment variables and initialize tracing
load_dotenv()
init()
```

### Configure the OpenAI Client & Model Provider

Next, configure the `AsyncOpenAI` client to send all requests through the LangDB gateway.
We then create a `CustomModelProvider` to ensure the Agents SDK uses this client for all model calls.

```python
from agents import (
    Agent,
    Runner,
    set_default_openai_client,
    RunConfig,
    ModelProvider,
    Model,
    OpenAIChatCompletionsModel
)
from openai import AsyncOpenAI

# Configure the client with LangDB headers
client = AsyncOpenAI(
    api_key=os.environ["LANGDB_API_KEY"],
    base_url=os.environ["LANGDB_API_BASE_URL"],
    default_headers={"x-project-id": os.environ["LANGDB_PROJECT_ID"]}
)

# Set the configured client as default for tracing
set_default_openai_client(client, use_for_tracing=True)

# Build a model object that sends its calls through the LangDB client
def get_model(model_name):
    return OpenAIChatCompletionsModel(model=model_name, openai_client=client)

# Custom model provider so models resolved by name are also routed through LangDB
class CustomModelProvider(ModelProvider):
    def get_model(self, model_name) -> Model:
        # Delegates to the module-level get_model() helper above
        return get_model(model_name)

CUSTOM_MODEL_PROVIDER = CustomModelProvider()
```

### Define the Agents

Now, define the specialist agents and the router agent that orchestrates them. The `model` parameter can be any model available in LangDB, including the virtual models we configure in the next section.

```python
# Define specialist agents
booking_agent = Agent(
    name="Booking Specialist",
    instructions="You are a booking specialist. You help customers with their booking and reservation questions.",
    model=get_model("openai/gpt-4o-mini")
)

travel_recommendation_agent = Agent(
    name="Travel Recommendation Specialist",
    instructions="You are a travel recommendation specialist. You help customers find ideal destinations and travel plans.",
    model=get_model("langdb/recc_8ac7wclb")  # A virtual model with search tools attached
)

reply_agent = Agent(
    name="Reply Agent",
    instructions="You reply to the user's query and make it more informal by adding emojis.",
    model=get_model("langdb/reply_idzqgtrm")  # A virtual model for formatting
)

# Define the orchestrator agent
query_router_agent = Agent(
    name="Query Router",
    instructions="You determine which specialist to use based on the user's query, then hand off to the reply agent.",
    model=get_model("langdb/router_c77w2sji"),  # A virtual model for routing
    handoffs=[reply_agent],
    tools=[
        booking_agent.as_tool(
            tool_name="booking_tool",
            tool_description="Use for questions about flight bookings or reservations.",
        ),
        travel_recommendation_agent.as_tool(
            tool_name="travel_tool",
            tool_description="Use for travel destination recommendations or planning.",
        )
    ]
)
```

### Run the Workflow

Finally, use the `Runner` to execute the workflow. We inject our `CustomModelProvider` and a `group_id` into the `RunConfig` to ensure all steps are routed through LangDB and linked in the same trace.

```python
async def run_travel_agent(query: str):
    group_id = str(uuid.uuid4())  # Links all steps in this session
    response = await Runner.run(
        query_router_agent,
        input=query,
        run_config=RunConfig(
            model_provider=CUSTOM_MODEL_PROVIDER,
            group_id=group_id
        )
    )
    print(response.final_output)

if __name__ == "__main__":
    asyncio.run(run_travel_agent("I want to book a flight to Paris."))
```

### Configuring MCPs, Guardrails, and Models

To empower agents with tools like web search or to enforce specific behaviors with guardrails, you use LangDB Virtual Models. This allows you to attach functionality directly to a model identifier without changing your agent code.

1. In the LangDB UI, navigate to **Models → + New Virtual Model**.
2. Create virtual models for your agents (e.g., `travel-recommender`, `query-router`).
3. Attach tools and guardrails as needed:
   * **For the `travel_recommendation_agent`**: Attach an MCP Server (like Tavily Search) to give it live web search capabilities.
   * **For the `query_router_agent`**: Attach guardrails to validate incoming requests. For example:
     * **Topic Adherence**: Ensure the query is travel-related.
     * **OpenAI Moderation**: Block harmful or disallowed content.
     * **Minimum Word Count**: Reject overly short or vague queries.
   * **For the `reply_agent`**: Attach a **Language Validator** guardrail to ensure the final output is in the expected language.
4. Use the virtual model's identifier (e.g., `langdb/travel-recommender`) as the `model` string in your `Agent` definition.
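To make the guardrail idea concrete, the sketch below shows what a minimum-word-count check does conceptually. It is a local illustration only, not LangDB's implementation: in LangDB the guardrail is attached to the virtual model in the UI, and the function name and the default threshold of 4 words are assumptions made for this example.

```python
def meets_minimum_word_count(query: str, minimum_words: int = 4) -> bool:
    """Return True when the query has at least `minimum_words` whitespace-separated words."""
    return len(query.split()) >= minimum_words


queries = [
    "Paris?",
    "I want to book a flight to Paris.",
]
for query in queries:
    verdict = "accepted" if meets_minimum_word_count(query) else "rejected"
    print(f"{verdict}: {query}")
```

Keeping such checks on the virtual model rather than in the agent code means the threshold can be changed without touching the application.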

## Full Trace

After setting up the virtual models, run a query such as:

```bash
uv run app.py "I'm planning a trip to Japan in April. What are the must-see cherry blossom spots?"
```

This produces the following trace:

You can check out the entire trace here: {% embed url="" %} ## References * [OpenAI Agents SDK](https://openai.github.io/openai-agents-python/) * [LangDB Virtual MCP Servers](https://docs.langdb.ai/concepts/virtual-mcp-servers) * [LangDB Guardrails](https://docs.langdb.ai/features/guardrails) * [LangDB Virtual Models](https://docs.langdb.ai/concepts/virtual-models) # Building Web Search Agent with Google-ADK Set up a web search agent with Google ADK and LangDB. Guide covers creating virtual MCP servers, integrating search tools, model routing, and full observability. This guide walks you through setting up a powerful WebSearch agent that leverages Google ADK for orchestration and LangDB for LLM access, tracing, and flexible routing.

Check out: https://app.langdb.ai/sharing/threads/b6ddc154-33a0-403f-948c-9a559d93445a

## Code * LangDB Samples: ## Overview The final agent will use a `SequentialAgent` to orchestrate two sub-agents: 1. **Critic Agent**: Receives a user's query, searches the web for information using a tool, and provides an initial analysis with source references. 2. **Reviser Agent**: Takes the critic's output, refines the content, and synthesizes a final, polished answer. LangDB automatically captures the entire workflow, giving you full visibility into the handoff between agents and the tools they use. ## Installation ```bash pip install google-adk "pylangdb[adk]" python-dotenv ``` ### Environment Variables ```bash export LANGDB_API_KEY="" export LANGDB_PROJECT_ID="" ``` ### Project Structure ``` web-search-agent/ ├── web-search/ │ ├── agent.py # Root SequentialAgent │ ├── __init__.py │ └── sub_agents/ │ ├── __init__.py │ ├── critic/ │ │ ├── agent.py # Uses Virtual Model │ │ ├── __init__.py │ │ └── prompt.py # Contains the critic's instruction prompt │ └── reviser/ │ ├── agent.py │ ├── __init__.py │ └── prompt.py # Contains the reviser's instruction prompt └── pyproject.toml ``` ## Code Walkthrough ### Initialize LangDB Tracing The most important step is to call `pylangdb.adk.init()` **before** any Google ADK modules are imported. This instruments the environment for automatic tracing. ```python # web-search/agent.py from pylangdb.adk import init init() from google.adk.agents import SequentialAgent from .sub_agents.critic import critic_agent from .sub_agents.reviser import reviser_agent # ... ``` ### Define the Critic Agent with a Virtual Model The `critic_agent` is responsible for the web search. Instead of hard-coding a tool, we assign it a **LangDB Virtual Model**. This virtual model has a Tavily Search MCP attached, giving the agent search capabilities without changing its code. ```python # web-search/sub_agents/critic/agent.py from google.adk.agents import Agent # ... other imports for the callback ... 
critic_agent = Agent( # This virtual model has a Tavily Search MCP attached in the LangDB UI model="langdb/critic_agent_37r6urik", name="critic_agent", instruction=prompt.CRITIC_PROMPT, after_model_callback=_render_reference # Formats search results ) ``` ### Define the Sequential Agent The root `agent.py` defines a `SequentialAgent` that orchestrates the workflow, passing the user's query first to the `critic_agent` and then its output to the `reviser_agent`. ```python # web-search/agent.py # ... imports and init() call ... llm_auditor = SequentialAgent( name='web_search_agent', description=( 'A 2-step web search agent that first searches and analyzes web content,' ' then refines and synthesizes the information to provide comprehensive' ' answers to user queries.' ), sub_agents=[critic_agent, reviser_agent], ) # This is the entry point for the ADK root_agent = llm_auditor ``` ### Configuring MCPs and Virtual Models To empower the `critic_agent` with live web search, we first create a **Virtual MCP Server** for the search tool and then attach it to a **Virtual Model**. #### 1. Create a Virtual MCP Server First, create a dedicated MCP server for the search tool. 1. In the LangDB UI, navigate to **Projects → MCP Servers**. 2. Click **+ New Virtual MCP Server** and configure it: * **Name**: `web-search-mcp` * **Underlying MCP**: Select **Tavily Search**. * **Note**: The Tavily MCP requires an API key. Ensure you have added your `TAVILY_API_KEY` to your LangDB account secrets for the tool to function. #### 2. Create and Configure the Virtual Model Next, create a virtual model and attach the MCP you just made. 1. Navigate to **Models → + New Virtual Model**. 2. Give it a name (e.g., `critic-agent`). 3. In the **Tools** section, click **+ Attach MCP Server** and select the `web-search-mcp` you created. 4. Save the model and copy its identifier (e.g., `langdb/critic-agent_xxxxxx`). 5. Use this identifier as the `model` in your `critic_agent` definition. 
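Before launching the agent, it can help to confirm that the variables exported in the Environment Variables step are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_langdb_vars` is hypothetical and not part of `pylangdb` or Google ADK.

```python
import os

# Variable names taken from the Environment Variables step of this guide
REQUIRED_VARS = ("LANGDB_API_KEY", "LANGDB_PROJECT_ID")


def missing_langdb_vars(env=None):
    """Return the names of required LangDB variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_langdb_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("LangDB environment looks complete.")
```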
### Running the Agent With your `pyproject.toml` configured for the ADK, you can run the agent locally: ```bash adk web ``` Navigate to `http://localhost:8000`, select `web-search`, and enter a query. ### Full Trace Every run is captured in LangDB, showing the full sequence from the initial query to the final revised answer, including the tool calls made by the critic agent.

Complete Web Search Agent Google ADK Trace

You can checkout the entire conversation history as well as tracing from the thread: {% embed url="" %} ## References * [Google ADK Documentation](https://google.github.io/adk-docs/) * [LangDB Virtual MCP Servers](https://docs.langdb.ai/concepts/virtual-mcp-servers) * [LangDB Virtual Models](https://docs.langdb.ai/concepts/virtual-models) * [Tavily Search MCP](https://app.langdb.ai/mcp-servers/tavily-mcp-4024f9c3-3d20-48d2-92da-4c7e9910e5f9) # Building Travel Concierge with Google-ADK This guide demonstrates how to enhance [Google's ADK Travel Concierge](https://github.com/google/adk-samples/tree/main/python/agents/travel-concierge) sample by integrating it with LangDB. By leveraging LangDB's AI gateway, you can offload complex tool logic to virtual MCPs, dynamically select models, and gain full end-to-end observability into your agent's operations. This approach simplifies the agent's code, improves security by centralizing API key management, and provides powerful tracing capabilities. {% embed url="" %}

Trace of Travel Concierge with Google ADK — Checkout: https://app.langdb.ai/sharing/threads/8425e068-77de-4f41-8aa9-d1111fc7d2b7

## Code * LangDB Samples: ## Overview The Travel Concierge is a sophisticated agent system built with Google ADK that handles the complete travel lifecycle through specialized sub-agents. ### Travel Concierge Architecture The system consists of these specialized sub-agents: 1. **Main Agent**: Orchestrates the workflow and routes user queries to specialized sub-agents. 2. **Inspiration Agent**: Helps users discover travel destinations based on preferences. 3. **Planning Agent**: Creates detailed itineraries and travel plans. 4. **Booking Agent**: Assists with booking flights, hotels, and activities. 5. **Pre-Trip Agent**: Provides preparation guidance before departure. 6. **In-Trip Agent**: Offers real-time assistance during travel. 7. **Post-Trip Agent**: Collects feedback and helps with post-travel tasks. ### Key Benefits With LangDB, this multi-agent system gains significant capabilities: * **Seamless Integration**: All LLM calls flow through a centralized AI gateway * **Tool Abstraction**: Virtual MCPs simplify integration with external services * **End-to-End Tracing**: Complete visibility into agent interactions and tool usage * **Flexible Model Selection**: Choose any model without code changes * **Centralized Configuration**: Tools, guardrails, and monitoring managed in one place ### Installation ```bash pip install google-adk "pylangdb[adk]" python-dotenv ``` ### Environment Variables Create a `.env` file in your project root with the following variables: ```bash # Required for AI platform integration LANGDB_API_KEY="your_api_key_here" LANGDB_PROJECT_ID="your_project_id_here" ``` ### Project Structure ``` travel-concierge/ ├── .env ├── README.md └── travel_concierge/ ├── agent.py # Main orchestrator agent ├── __init__.py ├── prompt.py ├── profiles/ # User profile templates ├── shared_libraries/ # Common utilities ├── sub_agents/ # Specialized travel agents │ ├── booking/ │ ├── inspiration/ │ ├── in_trip/ │ ├── planning/ │ ├── post_trip/ │ └── pre_trip/ └── 
tools/ # Shared tools
```

## How the Integration Works

### Seamless LLM Integration

The key to enhancing Google's ADK sample is directing all LLM calls through a centralized AI gateway. This is achieved through the Python SDK:

```python
# travel_concierge/__init__.py
from pylangdb.adk import init

# Initialize LangDB tracing BEFORE importing any Google ADK modules
# This sends all LLM API calls through LangDB's gateway
init()
```

By calling `init()` before any Google ADK imports, the integration:

1. Patches the ADK's underlying model calling mechanisms
2. Routes all LLM requests through LangDB's API
3. Attaches tracing metadata to each request
4. Captures all agent-to-agent handoffs and tool calls

This is a significant architectural improvement over the original sample, which requires manual instrumentation to achieve observability.

### Virtual Model References

Instead of hardcoding model names or building complex tool integration, we reference LangDB virtual models in the agent definitions:

```python
# travel_concierge/agent.py
from google.adk.agents import Agent
from .prompt import MAIN_AGENT_PROMPT
from .sub_agents.inspiration import inspiration_agent
from .sub_agents.planning import planning_agent
from .sub_agents.booking import booking_agent
from .sub_agents.pre_trip import pre_trip_agent
from .sub_agents.in_trip import in_trip_agent
from .sub_agents.post_trip import post_trip_agent

# Main orchestrator agent
travel_concierge = Agent(
    model="langdb/travel_concierge_main",  # LangDB Virtual Model
    name="travel_concierge",
    description="A comprehensive travel assistant that helps with all phases of travel planning and execution.",
    instruction=MAIN_AGENT_PROMPT,
    sub_agents=[
        inspiration_agent,
        planning_agent,
        booking_agent,
        pre_trip_agent,
        in_trip_agent,
        post_trip_agent
    ]
)

# This is the entry point for the ADK
root_agent = travel_concierge
```

Notice the `model="langdb/travel_concierge_main"` parameter.
This doesn't point directly to a specific LLM but to a **LangDB Virtual Model** that can have:

* A specific underlying LLM (like Google's Gemini Pro)
* Attached tools and MCPs
* Guardrails for input/output validation
* Custom handling and retry logic

This approach offloads significant complexity from the application code to the LangDB AI gateway.

### Simplified Sub-Agent Implementation

The `inspiration_agent` uses a virtual model that connects to a Google Maps MCP. The agent code itself only needs to reference the tools, not implement them:

```python
# travel_concierge/sub_agents/inspiration/agent.py
from google.adk.agents import Agent
from google.adk.tools.agent_tool import AgentTool
from . import prompt

inspiration_agent = Agent(
    model="langdb/inspiration_agent_z73m3wmd",  # Virtual Model
    name="inspiration_agent",
    description="A travel inspiration agent who inspire users, and discover their next vacations; Provide information about places, activities, interests,",
    instruction=prompt.INSPIRATION_AGENT_INSTR,
    # place_agent and poi_agent are defined elsewhere in the sample (omitted here);
    # map lookups are handled by the Google Maps MCP attached to the virtual model
    tools=[
        AgentTool(agent=place_agent),
        AgentTool(agent=poi_agent),
    ],
)
```

**Key Benefit:** The agent code is significantly cleaner. All tool configuration is offloaded to the central platform, allowing you to:

1. Change search providers without code changes
2. Add guardrails for travel queries
3. Deploy updates to tools without redeploying your application

### Tool Definitions vs. Virtual MCPs

**Original Approach (Tool Implementation)**

In the original Google ADK sample, you would need to implement a complex tool to interact with the Google Places API directly:

```python
# travel_concierge/tools/map.py (Original Approach Snippet)
import os
import requests
from google.adk.tools import ToolContext

class PlacesService:
    """Wrapper to Places API."""

    def find_place_from_text(self, query: str):
        """Fetches place details using a text query."""
        # ... complex request and parsing logic ...

# Shared service instance; the API key has to be supplied through the environment
places_service = PlacesService()

# The tool function itself
def map_tool(key: str, tool_context: ToolContext):
    # ... logic to inspect state and call PlacesService ...
    pois = tool_context.state[key]["places"]
    for poi in pois:
        location = poi["place_name"] + ", " + poi["address"]
        result = places_service.find_place_from_text(location)
        # ... update state with results ...
    return {"places": pois}
```

This approach requires:

* Managing API keys in your environment
* Implementing error handling and retries
* Deploying code changes for API updates
* Writing custom integration code for each tool

**Enhanced Approach (Virtual MCPs)**

With the Virtual MCP approach, you offload the entire tool implementation to the central platform. This means:

1. No tool implementation code needed in your application.
2. API keys are managed securely in the platform.
3. You get built-in error handling and retries.
4. Tools can be updated or switched without code changes.

## Configuring Virtual Models and MCPs

This approach separates tool configuration from code, moving it to a web interface where it can be managed and updated without deployments.

### **Creating Virtual MCP Servers**

Virtual MCP servers act as API gateways to external tools and services:

1. In the LangDB UI, navigate to **Projects → MCP Servers**.
2. Click **+ New Virtual MCP Server** and create the necessary MCPs.

Each MCP is then connected to a virtual model, which is used by a specific agent:

* **Inspiration Agent** uses a virtual model connected to a **Google Maps** MCP.
* **Planning Agent** uses a virtual model connected to an **Airbnb** MCP.
* **Search Agent** uses a virtual model connected to a **Tavily Search** MCP.

### **Attaching MCPs to Virtual Models**

Virtual models connect your agent code to the right tools automatically:

1. Navigate to **Models → + New Virtual Model**.
2. 
For the Inspiration Agent:
   * Name: `travel_inspiration_agent`
   * Base Model: `anthropic/claude-sonnet-4`
   * Attach the **Google Maps MCP** created in the previous section
   * Add optional guardrails (e.g., content filters for travel advice)
3. For the Google Search agent in shared tools:
   * Name: `google_search_agent`
   * Base Model: `gemini/gemini-2.5-pro-preview`
   * Attach the **Tavily Search MCP** created in the previous section
   * Add custom response templates for place recommendations

**Key Benefits:**

* **Separation of Concerns**: Code handles agent orchestration while LangDB handles tools and models
* **Dynamic Updates**: Change tools without redeploying your application
* **Security**: API keys stored securely in LangDB, not in application code
* **Monitoring**: Track usage patterns and error rates in one place

### **Run the Agent**

```bash
adk web
```

Navigate to `http://localhost:8000`, select `travel_concierge`, and start your travel planning conversation.

### Example Conversation with Traced Tool Calls

Here is an example conversation that shows how the `inspiration_agent` uses a virtual model connected to a Google Maps MCP to provide travel suggestions.

**User Input:**

```
Inspire me about Maldives
```

**Agent Execution Flow:**

1. The root `travel_concierge` agent receives the request and transfers control to the `inspiration_agent`.
   * **Tool Call**: `transfer_to_agent(agent_name="inspiration_agent")`
2. The `inspiration_agent` is activated. Its virtual model is configured to use the Google Maps MCP.
   * **Tool Call**: `poi_agent(request="Maldives")`
3. The Google Maps MCP is invoked, which searches for points of interest in the Maldives and returns a structured list of places with details, ratings, and map links.

**Final Agent Response:**

```
Of course, the Maldives is a stunning choice! Here are a few points of interest to get you started on your dream vacation:

1. 
North Malé Atoll Highlights: A stunning coral island known for its powdery white sand, turquoise waters, vibrant marine life, and luxury resorts. Perfect for snorkeling, diving, and relaxation. Rating: 4.7 Map: https://www.google.com/maps/search/?api=1&query=4.416667,73.5&query_place_id=ChIJs_UZm7KJPzsRyXC6lVstCGY 2. Ari Atoll Highlights: Renowned for its crystal-clear waters and diverse marine life, including whale sharks, manta rays, and colorful reefs, making it a diver’s paradise. Rating: 4.4 Map: https://www.google.com/maps/search/?api=1&query=3.8833333,72.8333333&query_place_id=ChIJc2fMBIzdQDsRml476HWWT5g 3. Malé City Highlights: The bustling capital city featuring the 17th-century Hukuru Miskiy mosque, Maldives Islamic Centre, vibrant markets, and local eateries. Great for experiencing Maldivian culture. Rating: 4.3 Map: https://www.google.com/maps/search/?api=1&query=4.1752709,73.5091172&query_place_id=ChIJVUvxlVV-PzsRV7p9OeoztoE Would you like to dive deeper into any of these destinations? Or, if you're feeling inspired and ready to start planning, I can hand you over to our planning_agent to help with itineraries, flights, and hotels. ``` ## Full Tracing with LangDB The true power of the LangDB integration becomes apparent in the comprehensive tracing capabilities. While the original Google ADK sample provides basic conversation logging, LangDB captures every aspect of the agent system's operation:

End-to-end tracing in LangDB shows all agent interactions and tool calls

You can checkout the entire conversation here: {% embed url="" %} In the LangDB trace view, you can see: 1. The exact flow paths between agents 2. Every tool call made through virtual MCPs 3. Complete inputs and outputs of each model invocation 4. Performance metrics and token usage 5. Errors and retries (if any occurred) ## Conclusion: Benefits of LangDB Integration By enhancing Google's ADK Travel Concierge sample with LangDB integration, we've achieved several significant improvements: 1. **Architectural Simplification**: Cleaner code with less boilerplate by offloading tool logic to LangDB 2. **Dynamic Tool Configuration**: Change tools and models without code changes 3. **Enhanced Security**: API keys managed centrally in LangDB 4. **Complete Observability**: End-to-end tracing of all agent interactions 5. **Improved Development Workflow**: Update agent capabilities without redeployment This approach demonstrates how LangDB's AI gateway can enhance agent frameworks by providing a layer of abstraction that simplifies development while adding powerful capabilities. ### References * [Google ADK Documentation](https://google.github.io/adk-docs/) * [LangDB Virtual MCP Servers](https://docs.langdb.ai/concepts/virtual-mcp-servers) * [LangDB Virtual Models](https://docs.langdb.ai/concepts/virtual-models) * [End-to-End Tracing on Google ADK with LangDB](https://blog.langdb.ai/discover-end-to-end-tracing-on-google-adk-with-langdb) * [Google ADK Travel Concierge Sample](https://github.com/google/adk-samples/tree/main/python/agents/travel-concierge) # Building a Reasoning Finance Team with Agno Build a multi-agent financial analysis team with LangDB and Agno. This guide demonstrates how to construct a sophisticated, multi-agent financial analysis team using LangDB. The team collaborates to deliver in-depth insights on publicly traded companies by combining web research and financial data analysis.

Building a Reasoning Finance Team with Agno — Checkout: https://app.langdb.ai/sharing/threads/73c91c58-eab7-4c6b-afe1-5ab6324f1ada

## Code

* LangDB Samples:

## Overview

The system is composed of two specialist agents orchestrated by a coordinating team:

1. **Web Search Agent**: Responsible for gathering the latest news and market sentiment from the internet.
2. **Finance Agent**: Equipped with YFinanceTools to fetch and analyze quantitative stock data, including pricing, fundamentals, and analyst recommendations.
3. **Reasoning Finance Team**: A coordinator that directs the two agents, synthesizes their findings, and produces a final, comprehensive report.

LangDB provides the backbone for this system, enabling seamless model access, tool integration, and full observability into each agent's actions and the team's collaborative process.

## Installation

```bash
pip install "pylangdb[agno]" python-dotenv yfinance
```

## Environment Variables

Create a `.env` file or export the following environment variables:

```bash
export LANGDB_API_KEY=""
export LANGDB_PROJECT_ID=""
```

## Code Walkthrough

### Initialize LangDB

Start by initializing LangDB to enable automatic tracing and model routing. This should be done before importing any other components from the `agno` library.

```python
import os
from dotenv import load_dotenv
from pylangdb.agno import init

# Load the .env file first so init() can see the LangDB credentials
load_dotenv()
init()

# Import agno components only after init() has run
from agno.agent import Agent
from agno.models.langdb import LangDB
from agno.team import Team
from agno.tools.yfinance import YFinanceTools
# ... other imports
```

### Define the Web Search Agent

The `web_agent` is responsible for searching the web. Instead of hard-coding a search tool, we assign it a **LangDB Virtual Model**. This decouples the agent's logic from the specific tools it uses. The virtual model is configured in the LangDB UI to provide search capabilities, as explained in the configuration section below.

```python
web_agent = Agent(
    name="Web Search Agent",
    role="Search the web for the information",
    model=LangDB(id="langdb/search_agent_xmf4v5jk"),  # Virtual model with search attached
    instructions="Always include sources"
)
```

### Define the Finance Agent

This agent is equipped with `YFinanceTools` to access a wide range of financial data. It is powered by Grok-4 and has specific instructions to format its output professionally.

```python
finance_agent = Agent(
    name="Finance AI Agent",
    role="Analyse the given stock",
    model=LangDB(id="xai/grok-4"),
    tools=[YFinanceTools(
        stock_price=True,
        stock_fundamentals=True,
        analyst_recommendations=True,
        company_info=True,
        company_news=True
    )],
    instructions=[
        "Use tables to display stock prices, fundamentals (P/E, Market Cap), and recommendations.",
        "Clearly state the company name and ticker symbol.",
        "Focus on delivering actionable financial insights."
    ]
)
```

### Define the Coordinating Team

The `ReasoningFinanceTeam` orchestrates the two agents. It operates in `coordinate` mode, allowing it to delegate tasks, synthesize information, and ensure the final output meets the specified success criteria.

```python
reasoning_finance_team = Team(
    name="Reasoning Finance Team",
    mode="coordinate",
    model=LangDB(id="xai/grok-4"),
    members=[web_agent, finance_agent],
    instructions=[
        "Collaborate to provide comprehensive financial and investment insights",
        "Consider both fundamental analysis and market sentiment",
        "Use tables and charts to display data clearly and professionally",
        "Present findings in a structured, easy-to-follow format",
        "Only output the final consolidated analysis, not individual agent responses"
    ],
    markdown=True,
    show_members_responses=True,
    success_criteria="The team has provided a complete financial analysis with data, visualizations, risk assessment, and actionable investment recommendations supported by quantitative analysis and market research."
)
```

### Configuring Virtual Models and Tools

To empower the `web_agent` with live web search capabilities without hard-coding tools, we configure a **Virtual Model** in LangDB. This model is backed by a **Virtual MCP Server** that provides the actual search functionality.

**Create a Virtual MCP Server**

First, create a dedicated MCP server for the search tool.

1. 
In the LangDB UI, navigate to **Projects → MCP Servers**. 2. Click **+ New Virtual MCP Server** and configure it: * **Name**: `web-search-mcp` * **Underlying MCP**: Select **Tavily Search**. * **Note**: The Tavily MCP requires an API key. Ensure you have added your `TAVILY_API_KEY` to your LangDB account secrets for the tool to function. **Create and Configure the Virtual Model** Next, create a virtual model and attach the MCP you just made. 1. Navigate to **Models → + New Virtual Model**. 2. Give it a name (e.g., `search-agent`). 3. In the **Tools** section, click **+ Attach MCP Server** and select the `web-search-mcp` you created. 4. Save the model and copy its identifier (e.g., `langdb/search-agent_xxxxxx`). 5. Use this identifier as the `model` in your `web_agent` definition. ### Running the Team To run the team, simply call the `print_response` method with a detailed prompt. The team will autonomously delegate tasks to the appropriate agents and generate a consolidated response. ```python reasoning_finance_team.print_response( """Compare the tech sector giants (AAPL, GOOGL, MSFT) performance:\n 1. Get financial data for all three companies\n 2. Analyze recent news affecting the tech sector\n 3. Calculate comparative metrics and correlations\n 4. Recommend portfolio allocation weights""" ) ``` ## Full Trace Every execution is captured in LangDB, providing a complete trace of the agno team's operations. This includes the initial prompt, each agent's individual contributions, the tools they used, and the final synthesized output. > Checkout: for full conversation

Full Trace of Building a Reasoning Finance Team with Agno

## References * [LangDB Virtual Models](https://docs.langdb.ai/concepts/virtual-models) * [Agno Documentation](https://docs.agno.com/) # Building Complex Data Extraction with LangGraph Build a multi-agent data extraction agent with LangDB and LangGraph. This guide shows how to build a sophisticated LangGraph agent for extracting structured information from meeting transcripts using LangDB. Leverage LangDB's AI gateway to create multi-stage workflows with confidence scoring, validation loops, and comprehensive tracing.

Trace of Complex Data Extraction with LangGraph — Checkout: https://app.langdb.ai/sharing/threads/efe57123-9c81-4428-ada1-8d2e55bce4e8

## Code

* LangDB Samples:

## Overview

The Complex Data Extraction agent processes meeting transcripts through a multi-stage workflow with validation, refinement, and synthesis phases.

### Data Extraction Architecture

The system implements these specialized processing stages:

1. **Preprocessing Node**: Analyzes transcript structure and determines complexity
2. **Initial Extraction Node**: Performs data extraction with confidence scoring
3. **Validation Node**: Validates extraction quality and provides feedback
4. **Refinement Node**: Refines extraction based on validation feedback
5. **Synthesis Node**: Produces final comprehensive summary
6. **Fallback Node**: Provides simplified extraction if complex workflow fails

### Key Benefits

With LangDB, this multi-stage extraction system gains:

* **End-to-End Tracing**: Complete visibility into processing stages and decision points
* **Confidence Scoring**: Built-in quality assessment for each extraction section
* **Iterative Refinement**: Multiple validation loops with feedback-driven improvements
* **Modular Architecture**: Clean separation of concerns across nodes and tools
* **Robust Error Handling**: Fallback mechanisms ensure reliable processing
* **Centralized Configuration**: All LLM calls routed through LangDB's AI gateway

### Installation

```bash
pip install "pylangdb[langchain]" langchain langgraph langchain_openai pydantic python-dotenv
```

### Environment Variables

Create a `.env` file in your project root with the following variables:

```bash
# Required for AI platform integration
LANGDB_API_KEY="your_api_key_here"
LANGDB_PROJECT_ID="your_project_id_here"
LANGDB_API_BASE_URL="https://api.us-east-1.langdb.ai"
```

### Project Structure

```
langchain-data-extraction/
├── .env
├── README.md
├── requirements.txt
├── main.py        # Main execution script
├── agent.py       # Agent construction and workflow
├── models.py      # Data models and state definitions
├── nodes.py       # Node functions and routing logic
├── tools.py       # Tool definitions
└── transcript.py  # Sample transcript data
```

## How the Integration Works

### Seamless LangGraph Integration

The key to enhancing LangGraph with LangDB is directing all LLM calls through a centralized AI gateway:

```python
# main.py
from pylangdb.langchain import init

# Initialize LangDB tracing BEFORE importing any LangGraph modules
init()
```

By calling `init()` before any LangGraph imports, the integration:

1. Patches LangGraph's underlying model calling mechanisms
2. Routes all LLM requests through LangDB's API
3. Attaches tracing metadata to each request
4. Captures all node transitions and tool calls

This provides comprehensive observability into complex multi-stage workflows.

### Virtual Model References

Instead of hardcoding model names, we reference LangDB virtual models:

```python
# nodes.py
import os

from langchain_openai import ChatOpenAI


def create_model():
    """Create and return the ChatOpenAI model with tools bound."""
    api_base = os.getenv("LANGDB_API_BASE_URL")
    api_key = os.getenv("LANGDB_API_KEY")
    project_id = os.getenv("LANGDB_PROJECT_ID")

    # The project ID is sent as a header on every request
    default_headers = {"x-project-id": project_id}

    llm = ChatOpenAI(
        model_name='openai/gpt-4o',
        temperature=0.2,
        openai_api_base=f"{api_base}/v1",
        openai_api_key=api_key,
        default_headers=default_headers
    )

    # The tool functions are defined in tools.py
    return llm.bind_tools([
        analyze_transcript_structure,
        extract_with_confidence,
        validate_extraction,
        refine_extraction
    ])
```

The `model_name='openai/gpt-4o'` parameter can be replaced with a **LangDB Virtual Model** reference that includes:

* A specific underlying LLM
* Attached tools and MCPs
* Guardrails for input/output validation
* Custom handling and retry logic

This approach offloads complexity from the application code to the LangDB AI gateway.
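One way to make that swap a configuration change rather than a code edit is to read the model identifier from the environment. The sketch below is only an illustration and is not part of the sample project: the helper name `gateway_model_kwargs` and the `LANGDB_MODEL` variable are assumptions introduced here, while the remaining keys mirror the `ChatOpenAI` arguments used in `create_model()`.

```python
import os
from typing import Any, Dict, Mapping, Optional


def gateway_model_kwargs(
    model: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Any]:
    """Build ChatOpenAI keyword arguments that point at the LangDB gateway.

    LANGDB_MODEL is a hypothetical variable used only in this sketch.
    """
    api_base = env["LANGDB_API_BASE_URL"].rstrip("/")
    return {
        # Explicit argument wins, then the environment, then the default
        # model name used in create_model().
        "model_name": model or env.get("LANGDB_MODEL", "openai/gpt-4o"),
        "temperature": 0.2,
        "openai_api_base": f"{api_base}/v1",
        "openai_api_key": env["LANGDB_API_KEY"],
        "default_headers": {"x-project-id": env["LANGDB_PROJECT_ID"]},
    }
```

With a helper like this, `ChatOpenAI(**gateway_model_kwargs())` would use whichever model or virtual model identifier is configured for the deployment, and the workflow code itself stays unchanged.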
### Modular State Management

The system uses TypedDict for type-safe state management:

```python
# models.py
from typing import Annotated, Any, Dict, List, Sequence, TypedDict

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages

# ComplexityLevel and ExtractionPhase are defined elsewhere in this module (not shown)


class ComplexAgentState(TypedDict):
    """Extended state for complex extraction workflow"""
    messages: Annotated[Sequence[BaseMessage], add_messages]
    transcript: str
    complexity_level: ComplexityLevel
    extraction_attempts: int
    max_attempts: int
    confidence_scores: Dict[str, float]
    validation_feedback: List[str]
    extraction_data: Dict[str, Any]
    current_phase: ExtractionPhase
    requires_refinement: bool
    processing_complete: bool
    error_count: int
```

This state structure enables type safety, observability, debugging, and extensibility.

### Advanced Workflow Patterns

The agent implements sophisticated workflow patterns:

```python
# agent.py
from langgraph.graph import StateGraph
from langgraph.prebuilt import ToolNode

# Node functions, routing functions, and tools come from nodes.py and tools.py


def create_complex_agent():
    """Create and return the complex LangGraph agent"""
    workflow = StateGraph(ComplexAgentState)

    # Add nodes
    workflow.add_node("preprocessing", preprocessing_node)
    workflow.add_node("initial_extraction", initial_extraction_node)
    workflow.add_node("validation", validation_node)
    workflow.add_node("refinement", refinement_node)
    workflow.add_node("synthesis", synthesis_node)
    workflow.add_node("fallback", fallback_node)
    workflow.add_node("tool_node", ToolNode([
        analyze_transcript_structure,
        extract_with_confidence,
        validate_extraction,
        refine_extraction
    ]))

    # Set entry point
    workflow.set_entry_point("preprocessing")

    # Add conditional edges for smart routing
    workflow.add_conditional_edges(
        "preprocessing",
        route_after_preprocessing,
        {
            "tool_node": "tool_node",
            "initial_extraction": "initial_extraction"
        }
    )

    return workflow.compile()
```

**Key Benefits:**

* **Conditional Routing**: Smart routing based on validation results
* **Tool Integration**: Seamless tool calls with automatic routing
* **Error Recovery**: Fallback mechanisms for robust processing
* **Observability**: Every decision point is traced in LangDB

## Configuring Virtual Models and Tools

This approach separates tool configuration from code, moving it to a web interface where it can be managed without deployments.

### **Creating Virtual MCP Servers**

Virtual MCP servers act as API gateways to external tools and services:

1. In the LangDB UI, navigate to **Projects → MCP Servers**.
2. Click **+ New Virtual MCP Server** and create the necessary MCPs:
   * **Transcript Analysis MCP**: For preprocessing and structure analysis
   * **Data Extraction MCP**: For structured information extraction
   * **Validation MCP**: For quality assessment and feedback
   * **Refinement MCP**: For iterative improvement

### **Attaching MCPs to Virtual Models**

Virtual models connect your agent code to the right tools automatically:

1. Navigate to **Models → + New Virtual Model**.
2. For the Preprocessing Node:
   * Name: `transcript_preprocessing`
   * Base Model: `openai/gpt-4o`
   * Attach the **Transcript Analysis MCP**
   * Add guardrails for transcript processing
3. For the Extraction Node:
   * Name: `data_extraction`
   * Base Model: `openai/gpt-4o`
   * Attach the **Data Extraction MCP**
   * Add custom response templates for structured output
4. For the Validation Node:
   * Name: `extraction_validation`
   * Base Model: `openai/gpt-4o`
   * Attach the **Validation MCP**
   * Add quality assessment rules

### **Key Benefits:**

* **Separation of Concerns**: Code handles workflow orchestration while LangDB handles tools and models
* **Dynamic Updates**: Change tools without redeploying your application
* **Security**: API keys stored securely in LangDB, not in application code
* **Monitoring**: Track usage patterns and error rates in one place

### **Run the Agent**

```bash
python main.py
```

The agent will process the sample transcript and provide detailed output showing each processing phase, confidence scores, and the final synthesized summary.
## Sample Output Here are key snippets from running the complex data extraction agent: **Agent Startup:** ```bash uv run langchain-data-extraction/main.py === COMPLEX EXTRACTION AGENT STARTING === Transcript length: 7296 characters Estimated complexity: ComplexityLevel.COMPLEX ``` **Preprocessing Phase:** ``` === PREPROCESSING PHASE === Transcript analysis complete. Complexity: ComplexityLevel.COMPLEX, Participants: 41, Topics: 15 ``` **Initial Extraction:** ``` **Meeting Summary: Q1 2024 Mobile App Redesign Project Kickoff** **Participants and Their Roles:** - Sarah Chen: Project Manager - Marcus Rodriguez: Lead Developer - Dr. Kim Patel: UX Research Director - Jennifer Wu: Product Owner - David Thompson: QA Manager - Lisa Chang: Marketing Director - Alex Kumar: DevOps Engineer ``` **Validation Feedback:** ``` **Validation Feedback on Meeting Summary Extraction:** **Completeness of Information Extracted:** - The summary captures the main points of the meeting, including participant roles, key decisions, action items, conflicts, and resolutions. - It includes the meeting's purpose, duration, and date, which are essential for context. ``` **Final Comprehensive Summary:** ``` **Comprehensive Summary: Q1 2024 Mobile App Redesign Project Kickoff** **Meeting Overview:** - **Date:** January 15, 2024 - **Duration:** 2 hours 30 minutes - **Purpose:** Initiate the mobile app redesign project aimed at improving user engagement by 40% and reducing bounce rate by 25%. **Key Decisions:** 1. **Phased Launch Approach:** Soft launch on March 31st to selected users, full release on April 15th. 2. **Technical Architecture:** Adopt Redux Toolkit for state management and React Query for data fetching. **Action Items:** 1. **CI/CD Pipeline Setup:** Alex Kumar to establish by January 29th (High Priority). 2. **Accessibility Testing Checklist:** Dr. Kim Patel to create by January 22nd (Medium Priority). 3. 
**Contractor Hiring:** Marcus Rodriguez to hire React Native contractor by January 20th (High Priority). **Risk Assessment and Mitigation:** - **Timeline and Migration Complexity:** Concerns over the tight timeline and React Native migration. Mitigation includes hiring a contractor and adjusting the release schedule. - **Team Capacity:** Addressed by hiring additional resources due to a senior developer's medical leave. ``` This output demonstrates the agent's ability to: 1. **Process Complex Transcripts**: Handle large transcripts (7,296 characters) with multiple participants and topics 2. **Multi-Stage Processing**: Execute preprocessing, extraction, validation, and synthesis phases 3. **Comprehensive Extraction**: Extract detailed information including participants, decisions, action items, conflicts, risks, and follow-up meetings 4. **Structured Output**: Produce well-organized, comprehensive summaries with clear sections 5. **Quality Validation**: Include validation feedback to ensure extraction quality 6. **Detailed Analysis**: Provide insights into project goals, technical decisions, and risk mitigation strategies The agent successfully transforms a raw meeting transcript into a structured, actionable summary that captures all critical information for project stakeholders. ## Full Tracing with LangDB The true power of the LangDB integration becomes apparent in the comprehensive tracing capabilities. While basic LangGraph provides conversation logging, LangDB captures every aspect of the complex workflow:

End-to-end tracing in LangDB shows all workflow stages and tool calls

In the LangDB trace view, you can see:

1. **Node Transitions**: Exact flow between preprocessing → extraction → validation → synthesis
2. **Tool Calls**: Every tool invocation with inputs and outputs
3. **Confidence Scores**: Quality assessment for each extraction section
4. **State Changes**: Complete state evolution throughout the workflow
5. **Performance Metrics**: Token usage and timing for each LLM call

## Advanced Features

### Confidence Scoring System

The agent assigns a confidence score to each extracted section and averages them into an overall score:

```python
# tools.py (signature abbreviated in this excerpt)
def extract_with_confidence(...) -> Dict[str, Any]:
    """Extract structured data with confidence scoring"""
    # Calculate confidence scores for different sections
    confidence_scores = {
        "participants": 0.95 if len(participants) > 0 else 0.3,
        "decisions": 0.9 if len(decisions) > 0 else 0.5,
        "action_items": 0.85 if len(action_items) > 0 else 0.4,
        "conflicts": 0.8 if len(conflicts) > 0 else 0.7,
        "phases": 0.9 if len(phases) > 0 else 0.6,
        "insights": 0.8 if len(key_insights) > 0 else 0.5
    }

    # The overall score is the unweighted mean of the section scores
    overall_confidence = sum(confidence_scores.values()) / len(confidence_scores)

    return {
        "extraction_data": extraction_data,
        "confidence_scores": confidence_scores,
        "overall_confidence": overall_confidence,
        "extraction_complete": overall_confidence > 0.7
    }
```

### Conditional Routing Logic

After validation, the agent decides whether to call tools again or move on to synthesis:

```python
# nodes.py
def route_after_validation(state: ComplexAgentState) -> str:
    """Route based on validation results"""
    if state["extraction_attempts"] >= state["max_attempts"]:
        return "synthesis"  # Skip to synthesis if too many attempts

    # Check for validation results in messages
    last_message = state["messages"][-1]
    if hasattr(last_message, 'tool_calls') and last_message.tool_calls:
        return "tool_node"

    # Default routing logic
    return "synthesis"
```

The system includes robust fallback mechanisms:

```python
# nodes.py
def fallback_node(state: ComplexAgentState) -> Dict[str, Any]:
    """Fallback to simplified extraction if complex extraction fails"""
    print("\n=== FALLBACK PHASE ===")

    fallback_prompt = f"""
    Perform a simplified extraction from this transcript:

    {state['transcript']}

    Focus on basic information: participants, main topics, and key outcomes.
    Use a simpler structure if the complex extraction failed.
    """

    # Simplified processing logic
    response = llm.invoke([HumanMessage(content=fallback_prompt)])

    return {
        "messages": [response],
        "processing_complete": True,
        "current_phase": ExtractionPhase.FALLBACK
    }
```

## Conclusion: Benefits of LangDB Integration

By enhancing LangGraph with LangDB integration, we've achieved several significant improvements:

1. **Comprehensive Observability**: Full tracing of complex multi-stage workflows
2. **Modular Architecture**: Clean separation of concerns across nodes and tools
3. **Quality Assurance**: Built-in confidence scoring and validation loops
4. **Robust Error Handling**: Fallback mechanisms ensure reliable processing
5. **Dynamic Configuration**: Change tools and models without code changes
6. **Performance Monitoring**: Track token usage and timing for optimization

This approach demonstrates how LangDB's AI gateway can strengthen LangGraph with better tracing, quality control, reliability, and maintainability.

## References

* [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
* [LangDB Virtual MCP Servers](https://docs.langdb.ai/concepts/virtual-mcp-servers)
* [LangDB Virtual Models](https://docs.langdb.ai/concepts/virtual-models)
* [LangChain Data Extraction Sample](https://github.com/langdb/langdb-samples/tree/main/examples/langchain/langchain-data-extraction)

# Bring Gemini, Claude, DeepSeek to Agents SDK

Run agents across 350+ models with LangDB and OpenAI Agents SDK, optimizing for speed, quality, or cost without changing your code.
OpenAI recently introduced the **Agents SDK**, a lightweight, Python-first toolkit for building agentic AI apps. It’s built around three primitives: * **Agents**: LLMs paired with tools and instructions to complete tasks autonomously. * **Handoffs**: Let agents delegate tasks to other agents. * **Guardrails**: Validate inputs/outputs to keep workflows safe and reliable.

A trace for Multi Agent workflow using Agents SDK on LangDB

## **TLDR;** OpenAI’s Agents SDK is great for building tool-using agents with handoffs. But what if you want to use Gemini's latest reasoning model or take advantage of DeepSeek’s 90 percent lower cost? With LangDB, you can run the same agent logic across more than 350 models, including Claude, Gemini, DeepSeek, and Grok, without changing your code. It works out of the box with frameworks like CrewAI and LangChain, and provides full trace visibility into every agent, model, and tool call. ## Building a Multi-Agent Customer Support System Let’s create a production-grade AI agent system using the **OpenAI Agents SDK**, with **LangDB** providing multi-model flexibility and runtime control. We’ll build four specialized agents, each powered by a different model to demonstrate LangDB's multi-model capabilities: * **OrderAgent** (claude-3.7-sonnet): checks order status * **FAQAgent** (gemini-2.5-pro): answers common customer questions * **ComplaintAgent** (grok-2): handles complaints with empathy * **ReservationAgent** (gpt-4o): manages table bookings Then, we use a **Classifier Agent** to route user queries to the appropriate sub-agent using built-in handoff. ### Overview

Each leaf agent is powered by a tool, a simple Python function decorated with `@function_tool`.

### **Example: Tool for Order Status**

```python
from agents import Agent, function_tool


@function_tool
def check_order_status(order_id: str) -> str:
    # Static lookup table standing in for a real order system
    order_statuses = {
        "12345": "Being prepared.",
        "67890": "Dispatched.",
        "11121": "Still processing."
    }
    return order_statuses.get(order_id, "Order ID not found.")
```

### Defining the agent:

```python
order_agent = Agent(
    name="OrderAgent",
    model="anthropic/claude-3.7-sonnet",
    instructions="Help customers with their order status.",
    tools=[check_order_status]
)
```

Each of the other agents follows the same structure, varying only in tools, instructions, and model selection.

### Routing User Queries with Handoffs

```python
classifier_agent = Agent(
    name="User Interface Agent",
    model="openai/gpt-4o-mini",
    instructions="You are a restaurant customer support agent. Handoff to the appropriate agent based on the user query.",
    handoffs=[order_agent, faq_agent, complaint_agent, reservation_agent]
)
```

This agent functions as a controller, deciding which specialized agent should handle the user's request.

## Running Multi-Model Agents with LangDB

To run the agents through LangDB and switch between different providers, first configure the OpenAI-compatible client like this:

```python
import os
from uuid import uuid4

from agents import Agent, set_default_openai_client
from openai import AsyncOpenAI

# Read the LangDB credentials from the environment instead of hard-coding them
langdb_api_key = os.environ["LANGDB_API_KEY"]
langdb_project_id = os.environ["LANGDB_PROJECT_ID"]

client = AsyncOpenAI(
    api_key=langdb_api_key,
    base_url=f"https://api.us-east-1.langdb.ai/{langdb_project_id}/v1",
    default_headers={"x-thread-id": str(uuid4()), "x-run-id": str(uuid4())},
)
set_default_openai_client(client, use_for_tracing=False)
```

LangDB allows you to test your agents using models such as **GPT**, **Claude**, **Gemini**, or **Grok**, while keeping the agent logic unchanged.
To switch models, change only the `model` value in the agent definition:

```python
faq_agent = Agent(
    name="FAQAgent",
    model="gemini/gemini-2.5-pro-exp-03-25",  # or claude-3.7-sonnet, gpt-4o
    instructions="Answer common customer questions about hours, menu, and location.",
    # ... remaining arguments unchanged
)
```

This approach enables you to evaluate multiple providers and optimize for quality, cost, or latency.

## Observing Agent Execution

LangDB provides a unified trace view that helps developers inspect agent flows across different models. Each span in the trace shows the model used, any tools called, and the time taken at each step.

Complete Trace example on LangDB — Each row in the timeline corresponds to a different model and agent interaction.

In the example above, you can see:

* Multiple agents running on different models in a single request
* Visual breakdown of agent handoffs and tool calls
* Timeline and cost details for each model involved

This trace graph shows a single user query routed through multiple agents, each using a different model. It captures the classifier decision, tool calls, and model usage in one place, making it easy to inspect and debug the full interaction flow. This visibility helps with debugging behavior, verifying tool usage, and understanding model performance across complex workflows.

## Why This Matters

In real-world applications, different models can excel depending on the type of task or interaction style required:

* Some are better suited for fast, low-latency tasks
* Others handle nuanced, empathetic, or creative responses well
* Certain models are optimized for summarization or structured formatting
* Others provide strong performance for general-purpose conversations

LangDB lets you assign the most suitable model to each agent, giving you task-specific control while maintaining a unified development experience.

## Conclusion

The OpenAI Agents SDK provides a clean way to define agent workflows. Paired with LangDB, it becomes possible to run the same agent setup across multiple model providers without changing your application code.

LangDB gives you visibility into agent execution through trace views and lets you switch between over 350 supported models using a consistent interface. This makes it easier to compare performance, debug behavior, and adapt to evolving requirements.

## Try It Yourself

To explore this setup, check out the sample project in the repository. It contains the necessary code to run agents locally, modify model configurations, and observe how requests are routed across different models using LangDB.

Visit [LangDB](https://langdb.ai/) and sign up to get a $10 credit.

View the sample code: [GitHub Repository](https://github.com/langdb/langdb-samples/tree/main/examples/openai/customer-support).
*** ### References * [OpenAI Agents SDK Documentation](https://openai.github.io/openai-agents-python/) * [LangDB Documentation](https://docs.langdb.ai/) * [LangDB Model List](https://app.langdb.ai/models) * [LangDB Samples Repository](https://github.com/langdb/langdb-samples) # Connecting LLMs to the Web with Real-Time Search Tools Integrate LangDB's Search tool to fetch live data during LLM queries, delivering up-to-date, context-rich, and verifiable AI responses. Traditional LLMs rely on static training data, making them prone to outdated responses, hallucinations, and missing critical context. LangDB's **built-in Search tool** solves this by fetching real-time data, improving accuracy and contextual relevance. ## The Challenge: Stale or Incomplete Knowledge * **Static Corpus**: Most LLMs are trained on large datasets, but that training is typically a snapshot in time. Once trained, the model doesn’t automatically update its knowledge. * **Inaccurate or Outdated Information**: Without a method to query current data, an LLM may provide answers that were correct at the time of training but are no longer valid. * **Limited Context**: Even if the model has relevant data, it might not surface the best context without a guided search mechanism. ## Introducing LangDB Search Tool LangDB’s built-in Search tool addresses these challenges by allowing real-time querying of databases, documents, or external sources: * **On-Demand Queries**: Instead of relying solely on the LLM’s training data, the Search tool can fetch the latest information at query time. * **Integrated with LangDB**: The search functionality is seamlessly woven into the LangDB, ensuring that developers can use it without additional overhead. * **API-Ready**: LangDB’s search tool can be accessed via API too. ## Search vs No-Search Below is a side-by-side comparison of using LangDB’s search tool versus relying on static model knowledge. 
The left image shows results **with search enabled**, pulling real-time, up-to-date information. The right image shows the **same query without search**, leading to more generic and potentially outdated responses.

Comparing scenarios with search enabled and off on LangDB Playground

| **Feature** | **No Search** | **With LangDB Search** |
| ---------------------- | ---------------------------------------- | ------------------------------------------------ |
| **Data Freshness** | Static, based on training corpus | Dynamic, fetches real-time information |
| **Accuracy** | Prone to outdated or incorrect responses | Pulls from latest sources, improving reliability |
| **Context Depth** | Limited by internal model memory | Integrates external sources for better insights |
| **Hallucination Risk** | Higher | Lower, as responses are backed by retrieved data |

## Using Search through API

LangDB’s search tool can be integrated via API to fetch real-time data. The example below enables search for a single chat completion request by adding the `websearch` entry to `mcp_servers`:

```bash
curl 'https://api.us-east-1.langdb.ai/{LangDB_ProjectID}/v1/chat/completions' \
  -H 'authorization: Bearer LangDBAPIKey' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "openai/gpt-4o-mini",
    "mcp_servers": [{ "name": "websearch", "type": "in-memory" }],
    "messages": [
      {
        "role": "user",
        "content": "what are the latest models by anthropic and what are its standout features?"
      }
    ]
  }'
```

This allows the LLM to enhance responses with live data, ensuring greater accuracy and relevance.

## Conclusion

LangDB’s built-in Search tool eliminates the limitations of static LLMs by integrating **real-time web search**, ensuring your AI retrieves the most relevant, up-to-date, and accurate information. Whether you're building **chatbots, research tools, or automation systems**, dynamic search enhances responses with verifiable data, reducing hallucinations and improving decision-making.

# Configure Fallback Routing with LangDB

Set up fallback routing with LangDB to keep AI applications online during traffic spikes or model outages by automatically switching models.
Ensure your AI applications stay online even during traffic spikes or model outages by configuring **Fallback Routing**. This guide walks you through setting up fallback routers using LangDB's routing feature. ## What is Fallback Routing? Fallback Routing allows LangDB to automatically switch to a backup model when your preferred model is slow, down, or overloaded. This helps you: * Avoid downtime * Improve reliability * Scale applications without manual intervention ## Example: Basic Fallback Routing Let’s say you want to use **DeepSeek-Reasoner**, but switch to **GPT-4o** if it becomes unavailable. Here’s how you can use the UI to set it up:

Here’s how you can set it up programmatically:

```json
{
  "model": "router/dynamic",
  "router": {
    "name": "fallback-router",
    "type": "fallback",
    "targets": [
      { "model": "deepseek-reasoner", "temperature": 0.7, "max_tokens": 400 },
      { "model": "gpt-4o", "temperature": 0.8, "max_tokens": 500 }
    ]
  }
}
```

### Behavior

* First, it tries **deepseek-reasoner**
* If that fails, it automatically falls back to **GPT-4o**

## Example: Fallback with Load-Balancing

In the previous example, we implemented a simple fallback mechanism. A more robust solution is to **distribute queries across multiple providers of DeepSeek-R1** while maintaining a fallback to GPT-4o if both providers fail. This balances traffic while keeping the service available.

Here’s how you can configure **Fallback Routing with Percentage-Based Load Balancing**:

```json
{
  "model": "router/dynamic",
  "router": {
    "name": "fallback-percentage-router",
    "type": "fallback",
    "targets": [
      {
        "model": "router/dynamic",
        "router": {
          "name": "percentage-balanced",
          "type": "percentage",
          "model_a": [
            { "model": "fireworksai/deepseek-r1", "temperature": 0.7, "max_tokens": 400 },
            0.5
          ],
          "model_b": [
            { "model": "deepseek/deepseek-reasoner", "temperature": 0.7, "max_tokens": 400 },
            0.5
          ]
        }
      },
      { "model": "gpt-4o", "temperature": 0.8, "max_tokens": 500 }
    ]
  }
}
```

### How This Works:

* **Primary Route:** The first target is a nested percentage router that splits requests **evenly (50/50)** between two providers of DeepSeek-R1 to balance the load.
* **Fallback Route:** If both DeepSeek-R1 providers are unavailable or fail, all requests are automatically rerouted to **GPT-4o**, ensuring continuous service.

This approach provides **load balancing and reliable fallback protection**, making it well suited to AI applications facing high demand and occasional model unavailability.

In more complex scenarios, you can configure a **multi-level fallback system** with percentage-based distribution.
This approach allows requests to be routed dynamically based on pricing, performance, or reliability, ensuring efficiency while preventing downtime.

Check out [Routing Strategies](https://docs.langdb.ai/features/routing#routing-strategies) for more routing strategies.

By leveraging **dynamic routing**, you can:

* **Prevent downtime** by automatically switching to backup models.
* **Optimize performance and cost** with smart load balancing.
* **Ensure scalability** without manual intervention.

With LangDB’s flexible and powerful routing capabilities, **you can build AI applications that are not only intelligent but also robust and fail-safe**.

# Tracing Multiple Agents

Monitor, trace, and analyze multi-agent workflows using LangDB

When working with AI-powered workflows, efficiency, scalability, and cost control are critical. A single-agent architecture often fails to meet all operational requirements, leading developers to adopt multi-agent workflows. LangDB simplifies this by offering seamless tracing, analytics, and optimization tools for multi-agent workflows.

**TL;DR:** Combining Gemini and GPT offers lower costs, though it’s slightly slower than using a single model.

| **Tag** | **Total Cost** | **Total Requests** | **Avg Duration (ms)** | **Total Duration (ms)** | **Total Input Tokens** | **Total Output Tokens** | **Avg TTFT (ms)** | **TPS** | **TPOT** |
| ----------- | -------------- | ------------------ | --------------------- | ----------------------- | ---------------------- | ----------------------- | ----------------- | ------- | -------- |
| \[gemini] | 0.037731 | 4 | 2390.0 | 9559.802 | 11197 | 345 | 2390.0 | 1207.35 | 0.03 |
| \[gpt-4o] | 0.057021 | 4 | 4574.9 | 18299.686 | 17235 | 443 | 4574.9 | 966.03 | 0.04 |
| \[combined] | 0.034887 | 4 | 2765.5 | 11061.997 | 10009 | 405 | 2765.5 | 941.42 | 0.03 |

Let’s break down a typical **multi-agent workflow**:

## **Example: Generating a Chart for a given question**

* **User Request**: *"Generate a chart of average temperature in Alaska over the past decade."*
* **Agent Roles**:
  1. **Researcher Agent**: Gathers raw temperature data by calling a search tool.
  2. **Chart Generator Agent**: Processes the data and creates a visualization using a code execution tool.

Here’s how this workflow is structured across multiple agents.

Agent Architecture of Chart and Research agent
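The two-agent hand-off in the diagram can be sketched as a small pipeline. This is a minimal illustration, not LangDB's implementation: the `Agent` class, `run_workflow`, and the model identifiers are placeholders of our own, and the model call is stubbed (`fake_call_model`) so the example runs offline. In a real setup each agent would send its prompt through an OpenAI-compatible client instead.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical signature for a model call: (model_name, prompt) -> text.
ModelCall = Callable[[str, str], str]


@dataclass
class Agent:
    """One step of the workflow, bound to a specific model."""
    name: str
    model: str          # placeholder model identifier
    instructions: str

    def run(self, task: str, call_model: ModelCall) -> str:
        prompt = f"{self.instructions}\n\nTask: {task}"
        return call_model(self.model, prompt)


def run_workflow(question: str, researcher: Agent, chart_generator: Agent,
                 call_model: ModelCall) -> str:
    """The researcher gathers data, then the chart generator visualizes it."""
    data = researcher.run(question, call_model)
    return chart_generator.run(f"Create a chart from this data:\n{data}", call_model)


def fake_call_model(model: str, prompt: str) -> str:
    """Offline stand-in for a real LLM request; echoes which model was used."""
    return f"[{model}] handled: {prompt.splitlines()[-1]}"


researcher = Agent("researcher", "gemini", "Gather raw data using the search tool.")
chart_generator = Agent("chart_generator", "gpt-4o", "Plot the data using the code execution tool.")

result = run_workflow(
    "Average temperature in Alaska over the past decade",
    researcher, chart_generator, fake_call_model,
)
print(result)
```

Because each agent carries its own `model` field, switching between a single-model and a mixed-model setup only means changing those two values.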

To evaluate the efficiency of multi-agent workflows, we ran three different setups in LangDB. These scenarios illustrate how using the same model across agents compares to combining models.

## Scenarios We Tested

In the first scenario, both the Researcher Agent and the Chart Generator Agent used Gemini, focusing on speed and cost. In the second scenario, both agents used GPT-4o, prioritizing accuracy but incurring higher costs. In the third scenario, we combined models: the Researcher Agent used Gemini for data gathering, while the Chart Generator Agent used GPT-4o for visualization. This allowed us to balance speed, cost, and accuracy.
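To make the trade-off concrete, the relative differences can be computed from the summary table at the top of this section. The snippet below is a sketch: the figures are copied from that table, while the `runs` dictionary and the `relative_change` helper are names of our own.

```python
# Total cost and average duration (ms) per tag, copied from the summary table.
runs = {
    "gemini":   {"cost": 0.037731, "avg_ms": 2390.0},
    "gpt-4o":   {"cost": 0.057021, "avg_ms": 4574.9},
    "combined": {"cost": 0.034887, "avg_ms": 2765.5},
}


def relative_change(new: float, baseline: float) -> float:
    """Fractional change of `new` versus `baseline` (negative means lower)."""
    return (new - baseline) / baseline


for baseline in ("gemini", "gpt-4o"):
    cost = relative_change(runs["combined"]["cost"], runs[baseline]["cost"])
    duration = relative_change(runs["combined"]["avg_ms"], runs[baseline]["avg_ms"])
    print(f"combined vs {baseline}: cost {cost:+.1%}, avg duration {duration:+.1%}")
```

On these four-request runs, the combined setup comes out roughly 8% cheaper than Gemini-only and roughly 39% cheaper than GPT-4o-only, while being about 16% slower than Gemini-only and about 40% faster than GPT-4o-only. With so few requests per run, these ratios are indicative rather than conclusive.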

## Tracing and Analytics with LangDB

After running these scenarios, we used LangDB's tracing capabilities to monitor task routing, response times, and token usage. The tracing interface allowed us to see exactly how tasks were distributed between agents and measure the efficiency of each workflow.

Using LangDB's **analytics**, we evaluated the performance data from these runs to generate the table presented earlier. This included:

* **Cost Analysis**: Tracking how much each agent contributed to the overall expense.
* **Time Analysis**: Measuring the average response time across different workflows.
* **Number of Requests**: Recording how many requests each run processed.
* **Average Time to First Token (TTFT)**: Capturing the time taken for the first token to appear.
* **Tokens Per Second (TPS)**: Analyzing the efficiency of token generation per second.
* **Time Per Output Token (TPOT)**: Evaluating the time taken per output token to assess response efficiency.

| **Tag** | **Total Cost** | **Total Requests** | **Avg Duration (ms)** | **Total Duration (ms)** | **Total Input Tokens** | **Total Output Tokens** | **Avg TTFT (ms)** | **TPS** | **TPOT** |
| ----------- | -------------- | ------------------ | --------------------- | ----------------------- | ---------------------- | ----------------------- | ----------------- | ------- | -------- |
| \[gemini] | 0.037731 | 4 | 2390.0 | 9559.802 | 11197 | 345 | 2390.0 | 1207.35 | 0.03 |
| \[gpt-4o] | 0.057021 | 4 | 4574.9 | 18299.686 | 17235 | 443 | 4574.9 | 966.03 | 0.04 |
| \[combined] | 0.034887 | 4 | 2765.5 | 11061.997 | 10009 | 405 | 2765.5 | 941.42 | 0.03 |

#### Key Takeaways from the Scenarios

**1. Gemini-Only Workflow**

* **Cost**: Slightly higher than combined but cheaper than GPT-4o.
* **Speed**: Fastest average duration (2390 ms) and highest TPS (1207.35).

**2. GPT-4o-Only Workflow**

* **Cost**: Highest overall.
* **Speed**: Slowest average duration (4574.9 ms) and lower TPS than Gemini (966.03).

**3. Combined Workflow**

* **Cost**: Lowest of all workflows.
* **Speed**: Moderate average duration (2765.5 ms) and the lowest TPS (941.42).

***

## Why Tracing Matters for Multi-Agent Workflows

In a multi-agent workflow, different models handle specialized tasks. Without proper observability, **you don't know how efficiently each model is performing**. This lack of visibility can lead to:

* **Escalating Costs**: Premium models being overused for trivial tasks.
* **Performance Bottlenecks**: Delays caused by slow or inefficient task execution.
* **Operational Inefficiency**: No clear insights on how to optimize workflows.

## Why LangDB is Essential for Multi-Agent Workflows

### With LangDB, you can:

1. **Monitor Key Metrics**: Track costs, latency, number of requests, and token usage for each workflow.
2. **Analyze Performance Trends**: Understand response times, time to first token, and processing speeds across different models.
3. **Gain Detailed Insights**: Use detailed analytics to identify inefficiencies and optimize workflows accordingly.

***

## Next Steps

To explore multi-agent workflows with LangDB:

1. Visit the [LangDB Samples](https://github.com/langdb/langdb-samples) Repository for setup instructions and examples.
2. Try out [LangDB](https://app.langdb.ai/) to monitor and optimize your workflows in real time.
3. Experiment with combining different models to achieve the best results for your unique use case.

Explore LangDB’s capabilities by visiting the [LangDB Samples Repository](https://github.com/langdb/langdb-samples) or [trying the platform](https://app.langdb.ai/) for multi-agent tracing and analytics.

# Send GitHub Release Summaries to Slack

Connect GitHub and Slack seamlessly with LangDB MCP: summarize releases, format highlights, and keep teams aligned in real time.

Use LangDB MCP to automatically extract and summarize GitHub release notes, then post clean updates into Slack channels.

## Key Benefits

* Cut noise from raw GitHub feeds.
* Share structured highlights instantly.
* Keep project teams aligned without manual effort.

## How It Works

> GitHub Release Event ➔ LangDB Agent (Summarize) ➔ Post to Slack

The agent reads release notes, extracts key highlights, formats them for Slack, and posts into the specified channel.

## Setup

### Slack MCP Server

* **Authentication**: OAuth 2.0
* **Environment Variables**: None required
* OAuth authentication is triggered automatically when the MCP server is used.

**Note**: Ensure the Slack app has permission to post into the target Slack channels.

### GitHub MCP Server

* **Authentication**: Personal Access Token (PAT)
* **Environment Variables**: Token needs to be manually set
* Recommended scopes for token:
  * `repo` (access to releases)
  * `read:org` (if monitoring org-related activities)

Check out:

* GitHub MCP Server:
* Slack MCP Server: [https://app.staging.langdb.ai/mcp-servers/slack](https://app.langdb.ai/mcp-servers/slack)

Steps:

1. Generate a GitHub PAT from [GitHub Developer Settings](https://github.com/settings/tokens).
2. Configure the GitHub MCP server with this token.

**Security Tip**: Treat PATs as secrets; do not expose them publicly.

| MCP Server | Authentication Type |
| ---------- | --------------------- |
| Slack | OAuth 2.0 |
| GitHub | Personal Access Token |
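Since the GitHub token has to be set manually and treated as a secret, one common pattern is to read it from the environment, fail fast when it is missing, and only ever log a masked form. The sketch below illustrates that pattern in plain Python; the variable name `GITHUB_PERSONAL_ACCESS_TOKEN` and both helper functions are assumptions for illustration, not part of LangDB's API.

```python
import os
from typing import Mapping

# Assumed variable name for illustration; use whatever your MCP server expects.
TOKEN_VAR = "GITHUB_PERSONAL_ACCESS_TOKEN"


def load_github_token(env: Mapping[str, str] = os.environ, var: str = TOKEN_VAR) -> str:
    """Read the PAT from the environment and fail fast if it is missing."""
    token = env.get(var, "").strip()
    if not token:
        raise RuntimeError(f"{var} is not set; generate a PAT and export it first.")
    return token


def mask_secret(value: str, visible: int = 4) -> str:
    """Return a log-safe form that shows only the first few characters."""
    return value[:visible] + "*" * max(len(value) - visible, 0)


# Example with an obviously fake token supplied through a stand-in environment.
example_env = {TOKEN_VAR: "ghp_exampleexampleexample"}
print(mask_secret(load_github_token(example_env)))
```

Keeping the token out of source files and configuration committed to version control is what the security tip above amounts to in practice.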

### Interaction Flow

#### Example Query

```
Summarize the last release of langdb/ai-gateway and post it to #mcp-test.
```

#### Output in Slack

{% code overflow="wrap" %}
```markdown
Summary of the latest release (commit ed568141d9ee18c5eecdde0d448fd105f316de36) of langdb/ai-gateway:

- Date: 2025-04-08
- Author: Karolis Gudiškis
- Change: Added a safeguard in the OpenAI model integration to ignore output if the 'choices' array is empty. This prevents potential errors or unnecessary processing when no choices are returned by the OpenAI API.
- File affected: core/src/model/openai.rs

Commit message: "chore: Ignore output if choices are empty"

For details, see: https://github.com/langdb/ai-gateway/commit/ed568141d9ee18c5eecdde0d448fd105f316de36
```
{% endcode %}

# Figma ➔ Code Implementation

Use LangDB’s Figma MCP integration to query designs, extract components, and generate production-ready Python code faster.

This showcase highlights LangDB's capability to connect with Figma and extract design specifications that can be used to automate Python code generation.

This showcase demonstrates:

* Querying Figma design files programmatically.
* Extracting layout and component metadata.
* Using the design metadata as the basis for Python implementation.

## Why It Matters

* Bridge the gap between design and engineering workflows.
* Save time by automating the extraction of Figma designs.
* Empower developers to move from design to code faster.

## How It Works

> Figma Design ➔ AI Agent ➔ Extract Metadata ➔ Implement in Python

Your AI Agent/Editor reads the Figma file structure, extracts relevant information, and enables code generation based on the design layout.

## MCP Server Setup

### Figma MCP Server

* **Authentication**: Personal Access Token (API Key).
* **Environment Variables**: API key must be securely stored.

Steps:

1. Generate a Figma Personal Access Token from Figma Settings.
2. Configure the Figma MCP server inside LangDB using this token.

| MCP Server | Authentication Type |
| ---------- | --------------------- |
| Figma | Personal Access Token |

Check out:

* Figma MCP Server:

## Interaction Flow

#### Example Query

```
Implement this Figma design in a single react component.
```

Cursor Output

import React from 'react';
import './LandingPage.css';

const LandingPage = () => {
  return (
    <div className="landing-page">
      <div className="content">
        <h1 className="title">Find the project you love</h1>
        <p className="description">
          Lorem ipsum dolor sit amet, consectetur adipiscing elit. Faucibus in libero risus semper habitant arcu eget. Et integer facilisi eget diam.Lorem ipsum dolor sit ame,consectetur
        </p>
        <button className="primary-button">Explore</button>
      </div>
    </div>
  );
};

export default LandingPage;

### Key Takeaways

* Automate the jump from UI/UX design to functional Python code.
* Use Figma as a dynamic blueprint for real applications.

# Database Analytics (ClickHouse)

Let product teams ask natural questions and retrieve real-time ClickHouse data without writing SQL, powered by LangDB Virtual MCP.

Demonstrates how a **product manager** can ask natural language questions directly against a ClickHouse database and get insights without writing SQL.

## Objective

Ask natural language questions about **flight operations, delays, and patterns** from the [`ontime`](https://clickhouse.com/docs/getting-started/example-datasets/ontime) dataset, such as:

### Example Queries:

* "What is the number of flights per day from the year 2000 to 2008"
* "How many flights delayed by more than 10 minutes, grouped by the day of the week, for 2000-2008"
* "What are the total number of delays"

### Behind the Scenes

An AI agent interprets your question, generates optimized ClickHouse SQL, and returns a clean, structured result.

### MCP Server Setup: ClickHouse

To connect an AI agent to your ClickHouse database, the following environment variables must be configured:

| Variable | Description |
| --------------------------------- | ---------------------------------------- |
| `clickhouse_host` | ClickHouse server hostname |
| `clickhouse_port` | ClickHouse server port |
| `clickhouse_user` | ClickHouse username |
| `clickhouse_password` | ClickHouse password |
| `clickhouse_secure` | Whether to use TLS |
| `clickhouse_verify` | Whether to verify TLS certificates |
| `clickhouse_connect_timeout` | Connection timeout (seconds, optional) |
| `clickhouse_send_receive_timeout` | Send/receive timeout (seconds, optional) |

> Credentials are securely stored and encrypted. Never share them externally.

Check out the ClickHouse MCP Server:

#### Interaction Flow

{% code overflow="wrap" %}
```
How many flights delayed by more than 10 minutes
```
{% endcode %}
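For a question like the one above, the agent has to translate the request into SQL before it can return a result. The sketch below shows one plausible shape for that query; it is not the SQL the LangDB agent actually emits. It assumes the table is named `ontime` and that "delayed" means departure delay (the `DepDelay` column, in minutes) as in the ClickHouse example-dataset documentation; an agent could just as reasonably pick arrival delay. The function name and parameters are hypothetical.

```python
from typing import Optional, Tuple


def delayed_flights_sql(min_delay_minutes: int = 10,
                        years: Optional[Tuple[int, int]] = None,
                        by_day_of_week: bool = False) -> str:
    """Build an illustrative ClickHouse query counting delayed flights."""
    # int() guards against non-numeric input ending up in the SQL text.
    conditions = [f"DepDelay > {int(min_delay_minutes)}"]
    if years is not None:
        start, end = years
        conditions.append(f"Year >= {int(start)} AND Year <= {int(end)}")
    where = " AND ".join(conditions)

    if by_day_of_week:
        return (f"SELECT DayOfWeek, count() AS delayed_flights FROM ontime "
                f"WHERE {where} GROUP BY DayOfWeek ORDER BY DayOfWeek")
    return f"SELECT count() AS delayed_flights FROM ontime WHERE {where}"


# "How many flights delayed by more than 10 minutes"
print(delayed_flights_sql())
# "... grouped by the day of the week, for 2000-2008"
print(delayed_flights_sql(years=(2000, 2008), by_day_of_week=True))
```

The ambiguity between departure and arrival delay is a reminder that a natural-language question can map to more than one query, so it is worth checking which interpretation the agent chose.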

Output

```json
{
  "delayed_flights_over_10_min": 264924
}
```

***

### Key Takeaways

* Product teams can ask questions directly to the database.
* No SQL writing required: just ask and analyze.
* Enables faster decision-making around usage trends and cost controls.

# Personal Knowledgebase with DuckDuckGo + Qdrant

Create a domain-specific semantic index by pairing real-time DuckDuckGo search results with Qdrant storage, powered by LangDB.

This use case demonstrates how to build a **private, self-updating knowledgebase** using public web search (DuckDuckGo) and a vector database (Qdrant). By pairing search results with embeddings, you can create a local semantic index for recall, reasoning, or exploration later, all from a single prompt.

### This Showcase Demonstrates:

* Querying DuckDuckGo for relevant articles and insights.
* Embedding the content into vector format using an LLM-powered embedding model.
* Storing the semantic vectors into Qdrant for future retrieval via semantic search.

### Why It Matters

* Maintain your own always-updating, domain-specific research archive.
* Avoid reliance on vendor-controlled platforms or search engines.
* Enable AI-powered recall and insight generation from public sources.

## MCP Server Setup

#### DuckDuckGo MCP Server

No configuration is required.

* Simply add the DuckDuckGo MCP server.
* Supports direct web search out of the box.
* No authentication or environment variables needed.

#### Qdrant MCP Server Setup

To store embeddings, Qdrant requires secure credential configuration:

| Variable | Description |
| ----------------- | ----------------------------- |
| `collection_name` | Name of the Qdrant collection |
| `qdrant_url` | URL of your Qdrant instance |
| `qdrant_api_key` | API key for authentication |

Check out MCP Servers:

* DuckDuckGo:
* Qdrant:

### Interaction Flow

```
Search DuckDuckGo for LangDB AI Gateway and embed the articles.
```

### Qdrant Output

### Benefits

* **Own your knowledgebase**: No vendor lock-in, no third-party exposure.
* **Fast recall**: Ask contextual questions and retrieve relevant sources.
* **Always current**: Update your Qdrant index with new searches as the field evolves.

# Context7 + Sequential Thinking for Smarter Coding Workflows

Supercharge coding workflows by using live documentation and structured reasoning tools directly in editors like Cursor and Windsurf.

This use case demonstrates how developers can supercharge coding workflows by combining **Context7** (for live documentation) with **Sequential Thinking** (for structured code planning and debugging), all inside LLM-native code editors like Cursor or Windsurf.

### This Showcase Demonstrates:

* Using **Context7** to inject accurate, up-to-date code documentation inside your coding prompt.
* Using **Sequential Thinking** to reason step-by-step through logic, architecture, or debugging.
* Integrating both tools to eliminate hallucinations and maintain continuity across coding sessions.

### Why It Matters

* Prevents outdated or incorrect suggestions from LLMs.
* Injects precise documentation without leaving your editor.
* Helps developers break down, plan, and revise complex code logic directly in the IDE.

### How It Works

**Editor Prompt ➔ Context7 Docs ➔ Code Planning with Sequential Thinking**

Your AI editor fetches framework or library documentation using Context7, then enables a structured thought process using Sequential Thinking to guide implementation or debugging.

To enable this workflow, simply add the following MCP servers to your LangDB Virtual MCP:

* `context7` – for live documentation injection
* `sequential-thinking` – for structured reasoning steps

No additional configuration is required.

Check out MCP Servers:

* Context7:
* Sequential Thinking:

## Interaction Flow

{% code overflow="wrap" %}
```
I’m working on a SvelteKit app and want to understand how to use afterNavigate from @sveltejs/kit.
Can you break down its usage step-by-step — when to use it, how to register it, and what kind of tasks it’s best suited for (like scroll restoration or analytics)? Please include code examples using