Monitor, Govern and Secure your AI traffic.
An AI gateway is middleware that acts as a unified access point to multiple LLMs, optimizing, securing, and managing AI traffic. It simplifies integration with different AI providers while enabling cost control, observability, and performance benchmarking. With an AI gateway, businesses can seamlessly switch between models, monitor usage, and optimize costs.
LangDB provides OpenAI-compatible APIs to connect with multiple Large Language Models (LLMs) by changing just two lines of code.
Govern, secure, and optimize all of your AI traffic with cost control, optimization, and full observability.
What the AI Gateway Offers Out of the Box
LangDB provides OpenAI-compatible APIs, enabling developers to connect with multiple LLMs by changing just two lines of code (see the sketch after the list below). With LangDB, you can:
Access to all major LLMs: Ensure seamless integration with leading large language models to maximize flexibility and power.
No framework code required: Enable plug-and-play functionality with any framework, such as LangChain, Vercel AI SDK, or CrewAI, for easy adoption.
Plug-and-play tracing and cost optimization: Simplify implementation of tracing and cost-optimization features, ensuring streamlined operations.
Automatic routing based on cost, quality, and other variables: Dynamically route requests to the most suitable LLM based on predefined parameters.
Benchmarks and insights: Deliver insights into the best-performing models for specific tasks, such as coding or reasoning, to enhance decision-making.
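As a concrete illustration of the two-line change, here is a minimal sketch using the OpenAI Python SDK; the API key and project ID are placeholders, and the base URL pattern is taken from the headers section of this guide.

```python
from openai import OpenAI

client = OpenAI(
    api_key="LANGDB_API_KEY",                                       # line 1: your LangDB API key
    base_url="https://api.us-east-1.langdb.ai/YOUR_PROJECT_ID/v1",  # line 2: LangDB's OpenAI-compatible endpoint
)

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Say hello from LangDB."}],
)
print(completion.choices[0].message.content)
```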
Quick Start with LangDB
LangDB offers both managed and self-hosted versions for organizations to manage AI traffic. Choose between the Hosted Gateway for ease of use or the Open-Source Gateway for full control.
Prompt Caching & Optimization (in progress): Introduce caching mechanisms to optimize prompt usage and reduce redundant costs.
GuardRails (in progress): Implement safeguards to enhance reliability and accuracy in AI outputs.
Leaderboard of models per category: Create a comparative leaderboard to highlight model performance across categories.
Ready-to-use evaluations for non-data scientists: Provide accessible evaluation tools for users without a data science background.
Readily fine-tunable data based on usage: Offer pre-configured datasets tailored for fine-tuning, enabling customized improvements with ease.
A fully featured, managed AI gateway that provides instant access to 250+ LLMs with enterprise-ready features.
A self-hosted option for organizations that require complete control over their AI infrastructure.
Quick Start
Self Hosted
Instantly connect to managed MCP servers — skip the setup and start using fully managed MCPs with built-in authentication, seamless scalability, and full tracing. This guide gives you a quick walkthrough of how to get started with MCPs.
Select Slack and Gmail from MCP Servers in the Virtual MCP section.
Generate a Virtual MCP URL automatically.
Install the MCP into Cursor with a single command.
Example install command:
Authentication is handled (via OAuth or API Key)
Full tracing and observability are available (inputs, outputs, errors, latencies)
MCP tools are treated just like normal function calls inside LangDB
In this example, we’ll create a Virtual MCP Server by combining Slack and Gmail MCPs — and then connect it to an MCP client like Cursor for instant access inside your chats.
MCP Servers listed on LangDB:
LangDB API provides robust support for HTTP headers, enabling developers to manage API requests efficiently with enhanced tracing, observability, and organization.
These headers play a crucial role in structuring interactions with multiple LLMs by providing tracing, request tracking, and session continuity, making it easier to monitor and analyze API usage.
x-thread-id
Usage: Groups multiple related requests under the same conversation.
Useful for tracking interactions over a single user session.
Helps maintain context across multiple messages.
x-run-id
Usage: Tracks a unique workflow execution in LangDB, such as a model call or tool invocation.
Enables precise tracking and debugging.
Each Run is independent for better observability.
Label header
Usage: Adds a custom tag or label to an LLM model call for easier categorization.
Helps with tracing multiple agents.
Project header
Usage: Identifies the project under which the request is being made.
Helps in cost tracking, monitoring, and organizing API calls within a specific project.
Can be set in headers or directly in the API base URL: https://api.us-east-1.langdb.ai/${langdbProjectId}/v1
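Here is a minimal sketch of passing these headers with the OpenAI Python SDK pointed at LangDB’s OpenAI-compatible endpoint; the key, project ID, and header values are placeholders.

```python
from openai import OpenAI

client = OpenAI(
    api_key="LANGDB_API_KEY",  # placeholder LangDB API key
    base_url="https://api.us-east-1.langdb.ai/YOUR_PROJECT_ID/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_headers={
        "x-thread-id": "thread-123",  # group related requests into one conversation
        "x-run-id": "run-456",        # track this specific workflow execution
    },
)
```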
LangDB AI Gateway supports all standard LLM parameters, such as temperature, max_tokens, stop sequences, logit_bias, and more.
You can also use the UI to test various parameters and get code snippets.
Explore ready-made code snippets complete with preconfigured parameters — copy, paste, and customize to fit your needs.
Use the Playground to tweak parameters in real time and send test requests instantly.
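For instance, a request with a few common parameters might look like the following sketch, reusing an OpenAI-compatible client configured for LangDB as shown earlier; values are illustrative only.

```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize MCP in one sentence."}],
    temperature=0.7,   # sampling randomness
    max_tokens=256,    # cap on generated tokens
    top_p=0.9,         # nucleus sampling probability mass
    stop=["\n\n"],     # stop sequences
)
```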
LangDB AI enables user tracking to collect analytics and monitor usage patterns efficiently. By associating metadata with requests, developers can analyze interactions, optimize performance, and enhance user experience.
For a chatbot service handling multiple users, tracking enables:
Recognizing returning users: Maintain conversation continuity.
Tracking usage trends: Identify common queries to improve responses.
User segmentation: Categorize users using tags (e.g., "websearch", "support").
Analytics: Identify heavy users and allocate resources efficiently.
extra.user.id: Unique user identifier.
extra.user.name: User alias.
extra.user.tags: Custom tags to classify users (e.g., "coding", "software").
Once users are tracked, the analytics and usage APIs can be used to retrieve insights based on id, name, or tags.
Example:
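The following is a hedged sketch of attaching user metadata to a request; the extra.user payload shape is inferred from the field names above and may differ from the actual API.

```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's new today?"}],
    extra_body={
        "extra": {
            "user": {
                "id": "user-42",                   # extra.user.id
                "name": "alice",                   # extra.user.name
                "tags": ["websearch", "support"],  # extra.user.tags
            }
        }
    },
)
```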
A Thread is simply a grouping of Message History that maintains context in a conversation or workflow. Threads are useful for keeping track of past messages and ensuring continuity across multiple exchanges.
Core Features:
Contextual Continuity: Ensures all related Runs are grouped for better observability.
Multi-Turn Support: Simplifies managing interactions that require maintaining state across multiple Runs.
Example:
A user interacting with a chatbot over multiple turns (e.g., asking follow-up questions) generates several messages, but all are grouped under a single Thread to maintain continuity.
Headers for Thread:
x-thread-id: Links all Runs in the same context or conversation.
The Hosted AI Gateway allows you to connect with multiple Large Language Models (LLMs) instantly, without any setup.
A Trace represents the complete lifecycle of a workflow, spanning all components and systems involved.
Core Features:
End-to-End Visibility: Tracks model calls and tools across the entire workflow.
Multi-Agent Ready: Perfect for workflows that involve multiple services, APIs, or tools.
Error Diagnosis: Quickly identify bottlenecks, failures, or inefficiencies in complex workflows.
Parent-Trace:
For workflows with nested operations (e.g., a workflow that triggers multiple sub-workflows), LangDB introduces the concept of a Parent-Trace, which links the parent workflow to its dependent sub-workflows. This hierarchical structure ensures you can analyze workflows at both macro and micro levels.
Headers for Trace:
trace-id: Tracks the parent workflow.
parent-trace-id: Links sub-workflows to the main workflow for hierarchical tracing.
This allows developers to track interactions between agents seamlessly, ensuring clear visibility into workflows and dependencies.
A multi-agent system consists of independent agents collaborating to solve complex tasks. Agents handle various roles such as user interaction, data processing, and workflow orchestration. LangDB streamlines tracking these interactions for better efficiency and transparency.
Tracking ensures:
Clear Execution Flow: Understand how agents interact.
Performance Optimization: Identify bottlenecks.
Reliability & Accountability: Improve transparency.
LangDB supports two main concepts for multi-agent tracking: Runs and Threads.
Example: Using the same Run ID and Thread ID across multiple agents ensures seamless tracking, maintaining context across interactions and providing a complete view of the workflow.
Sign up on LangDB to start using the Hosted Gateway.
LangDB automatically visualizes how agents interact, providing a clear view of workflows, hierarchies, and usage patterns when you add the Run and Thread headers.
Run: A complete end-to-end interaction between agents, grouped for easy tracking.
Thread: Aggregates multiple Runs into a single thread for a unified chat experience.
Check out the full Multi-Agent Tracing Example; a minimal sketch of the header usage follows below.
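In this sketch, two agent calls share one x-thread-id and one x-run-id so LangDB can group them into a single traced interaction; it reuses an OpenAI-compatible client configured for LangDB as shown earlier, and the IDs and prompts are illustrative.

```python
import uuid

thread_id = str(uuid.uuid4())  # one Thread for the whole conversation
run_id = str(uuid.uuid4())     # one Run for this end-to-end interaction

def agent_step(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        extra_headers={"x-thread-id": thread_id, "x-run-id": run_id},
    )
    return response.choices[0].message.content

plan = agent_step("Plan a 2-day Tokyo itinerary.")         # planner agent
reply = agent_step(f"Summarize this plan warmly: {plan}")  # reply agent
```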
A Virtual MCP Server lets you create a customized set of MCP tools by combining functions from multiple MCP servers — all with scoped access, unified auth, and full observability.
Selective Tools: Pick only the tools you need from existing MCP servers (e.g., Airtable's list_records, GitHub's create_issue).
Clean Auth Handling: Add your API keys only if needed. Otherwise, LangDB handles OAuth for you.
Full Tracing: Every call is traced on LangDB — with logs, latencies, input/output, and error metrics.
Easy Integration: Works out of the box with Cursor, Claude, Windsurf, and more.
Version Lock-in: Virtual MCPs are pinned to a specific server version to avoid breaking changes.
Poisoning Safety: Prevents injection or override by malicious tool definitions from source MCPs.
Go to your Virtual MCP server on LangDB Project.
Select the tools you want to include.
(Optional) Add API keys or use LangDB-managed auth.
Click Generate secure MCP URL.
Once you have the MCP URL:
You're now ready to use your selected tools directly inside the editor.
You can also try Virtual MCP servers by adding the server to your MCP client's config.
A Message is the basic unit of communication in LangDB workflows. Messages define the interaction between the user, the system, and the model. Every workflow is built around exchanging and processing messages.
Core Features:
Structured Interactions: Messages define roles (user, system, assistant) to organize interactions clearly.
Multi-Role Flexibility: Different roles (e.g., system for instructions, user for queries) enable complex workflows.
Dynamic Responses: Messages form the backbone of LangDB’s chat-based interactions.
Example:
A simple interaction to generate a poem might look like this:
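The original snippet is not preserved here, so the following is a minimal sketch of the standard role-structured message format, with client setup assumed as in the earlier examples.

```python
messages = [
    {"role": "system", "content": "You are a concise poet."},    # instructions for the model
    {"role": "user", "content": "Write a haiku about autumn."},  # the user's query
]

response = client.chat.completions.create(model="gpt-4o", messages=messages)
print(response.choices[0].message.content)  # the assistant's poem
```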
A Run represents a single workflow or operation executed within LangDB. This could be a model invocation, a tool call, or any other discrete task. Each Run is independent and can be tracked separately, making it easier to analyze and debug individual workflows.
Core Features:
Granular Tracking: Analyze and optimize the performance and cost of individual Runs.
Independent Execution: Each Run has a distinct lifecycle, enabling precise observability.
Example:
Generating a summary of a document, analyzing a dataset, or fetching information from an external API – each is a Run.
Headers for Run:
x-run-id: Identifies a specific Run for tracking and debugging purposes.
LangDB Gateway provides detailed tracing to monitor, debug, and optimize LLM workflows.
Below is an example of a trace visualization from the dashboard, showcasing a detailed breakdown of the request stages:
In this example trace you’ll find:
Overview Metrics
Cost: Total spend for this request (e.g. $0.034).
Tokens: Input (5,774) vs. output (1,395).
Duration: Total end-to-end latency (29.52 s).
Timeline Breakdown: A parallel-track timeline showing each step — from moderation and relevance scoring to model inference and the final reply.
Model Invocations: Every call to gpt-4o-mini, gpt-4o, etc., is plotted with precise start times and durations.
Agent Hand-offs: Transitions between your agents (e.g., search → booking → reply) are highlighted with custom labels like transfer_to_reply_agent.
Tool Integrations: External tools (e.g., booking_tool, travel_tool, python_repl_tool) appear inline with their execution times — so you can spot slow or failed runs immediately.
Guardrails: Rules like Min Word Count and Travel Relevance enforce domain-specific constraints and appear in the trace.
With this level of visibility you can quickly pinpoint bottlenecks, understand cost drivers, and ensure your multi-agent pipelines run smoothly.
Virtual Models in LangDB let you save and reuse AI configurations, ensuring streamlined workflows and consistent behavior across applications. They allow you to define system and user messages upfront, customize model parameters like temperature, max tokens, and penalties, and integrate LangDB’s built-in tools for enhanced response capabilities.
Once saved, these configurations can be quickly accessed and reused across multiple applications.
Predefined Conversations: Set system and user messages to guide AI behavior.
Custom Parameters: Fine-tune settings like temperature, max tokens, top-p, logit bias, and penalties.
Tools Integration: Augment AI responses with LangDB’s built-in tools.
Reusable Configurations: Save models under a unique name for easy retrieval and use.
Guardrails: Attach guardrails directly to a virtual model.
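Since the API reference notes that the model field accepts a virtual model identifier, invoking a saved configuration might look like the following sketch; the identifier format is an assumption.

```python
response = client.chat.completions.create(
    model="my-virtual-model",  # hypothetical name of a saved virtual model
    messages=[{"role": "user", "content": "Draft a welcome email."}],
)
```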
Monitoring complements tracing by providing aggregate insights into the usage of LLM workflows.
LangDB enforces limits to ensure fair usage and cost management while allowing users to configure these limits as needed. Limits are categorized into:
Daily Limits: Maximum usage per day, e.g., $10 in the Starter Tier.
Monthly Limits: Total usage allowed in a month, e.g., $100.
Total Limits: Cumulative limit over the project’s duration, e.g., $500.
Monitor usage regularly to avoid overages.
Plan limits based on project needs and anticipated workloads.
Upgrade tiers if usage consistently approaches limits.
Setting limits not only helps you stay within budget but also provides the flexibility to scale your usage as needed, ensuring your projects run smoothly and efficiently.
Retrieves the total usage statistics for your project over a given timeframe.
Example Response:
Fetches timeseries usage statistics per model, allowing users to analyze the distribution of LLM usage.
Example Response:
As discussed in User Tracking, you can use filters to retrieve insights based on id, name, or tags.
Available Filters:
user_id: Filter data for a specific user by their unique ID.
user_name: Retrieve usage based on the user’s name.
user_tags: Filter by tags associated with a user (e.g., "websearch", "support").
Example response:
LangDB AI Gateway optimizes LLM selection based on cost, speed, and availability, ensuring efficient request handling. This guide covers the various dynamic routing strategies available in the system, including fallback, script-based, optimized, percentage-based, and latency-based routing.
This ensures efficient request handling and optimal model selection tailored to specific application needs.
Before diving into routing strategies, it's essential to understand targets in LangDB AI Gateway. A target refers to a specific model or endpoint to which requests can be directed. Each target represents a potential processing unit within the routing logic, enabling optimal performance and reliability.
Target Parameters
Each target in the routing configuration can have custom parameters that define its behavior. These parameters allow fine-tuning of model outputs to align with specific requirements.
Common parameters include:
model: The model identifier (e.g., openai/gpt-4o, deepseek/deepseek-chat).
temperature: Controls the randomness of responses (higher values make responses more creative).
max_tokens: Limits the number of tokens in the response.
top_p: Determines the probability mass for nucleus sampling.
frequency_penalty: Reduces repetition by penalizing frequent tokens.
presence_penalty: Encourages diversity by discouraging token reuse.
You can customize parameters for each target model to fine-tune its behavior and output. Parameters such as temperature, max_tokens, and frequency_penalty can be adjusted to meet specific requirements.
Example of customizing model parameters:
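The original snippet is not preserved here; as an illustrative stand-in, the sketch below expresses per-target overrides as Python dicts mirroring a JSON router config — the parameter names come from the list above, but the surrounding config schema is an assumption.

```python
targets = [
    {
        "model": "openai/gpt-4o",
        "temperature": 0.9,        # more creative responses
        "max_tokens": 500,
    },
    {
        "model": "deepseek/deepseek-chat",
        "temperature": 0.3,        # more deterministic responses
        "frequency_penalty": 0.5,  # discourage repetition
    },
]
```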
LangDB AI Gateway supports multiple routing strategies that can be combined and customized to meet your specific needs:
Fallback Routing: Sequentially routes requests through multiple models in case of failure or unavailability.
Optimized Routing: Selects the best model based on real-time performance metrics.
Percentage-Based Routing: Distributes traffic between multiple models using predefined weightings.
Latency-Based Routing: Chooses the model with the lowest response time for real-time applications.
Nested Routing: Combines multiple routing strategies for flexible traffic management.
Fallback routing allows sequential attempts to different model targets in case of failure or unavailability. It ensures robustness by cascading through a list of models based on predefined logic.
Optimized routing automatically selects the best model based on real-time performance metrics such as latency, response time, and cost-efficiency.
Here, the request is routed to whichever of gpt-3.5-turbo and gpt-4o-mini has the lowest Time-to-First-Token (TTFT).
Metrics:
Requests – Total number of requests sent to the model.
InputTokens – Number of tokens provided as input to the model.
OutputTokens – Number of tokens generated by the model in response.
TotalTokens – Combined count of input and output tokens.
RequestsDuration – Total duration taken to process requests.
Ttft (Time-to-First-Token) (Default) – Time taken by the model to generate its first token after receiving a request.
LlmUsage – The total computational cost of using the model, often used for cost-based routing.
Percentage-based routing distributes requests between models according to predefined weightings, allowing load balancing, A/B testing, or controlled experimentation with different configurations. Each model can have distinct parameters while sharing the request load.
Latency-based routing selects the model with the lowest response time, ensuring minimal delay for real-time applications like chatbots and interactive AI systems.
LangDB AI allows nesting of routing strategies, enabling combinations like fallback within script-based selection. This flexibility helps refine model selection based on dynamic business needs.
LangDB AI Gateway provides an intuitive user interface (UI) to configure and manage routing strategies. Through the UI, users can set up model routing dynamically, ensuring requests are directed efficiently based on cost, speed, and performance.
You can monitor API usage with key insights.
After integrating LangDB into your project, the Analytics Dashboard becomes your central hub for understanding usage.
LangDB’s Analytics Dashboard is segmented into several key panels:
Total Cost: Tracks your total cost consumption across all integrated models, and lets you compare costs by provider, model, or tags to identify the most cost-effective options for your use cases.
Average Duration: Displays the average duration of requests in milliseconds — useful for benchmarking response times and optimizing performance for latency-sensitive applications.
Total Requests: Shows the total number of API calls made, helping you analyze usage patterns and allocate resources effectively.
Time to First Token (TTFT): Indicates the average time taken to receive the first token of the API response — critical for understanding initial latency.
Tokens per Second (TPS): Measures the throughput of token generation; high TPS indicates efficient processing.
Time per Output Token: Tracks the average time spent per output token, helping identify and troubleshoot bottlenecks in model output.
Error Rate: Displays the percentage of failed requests over total requests, helping monitor system stability and reliability.
Error Count: Tracks the total number of failed API requests — useful for debugging and troubleshooting failures effectively.
Provides a detailed timeseries view of API usage metrics. Users can filter data by time range and group it by provider, model, or tags to analyze trends over different periods.
Example response:
Provides aggregated usage metrics, allowing users to get a high-level overview of API consumption and error rates.
Example response:
As discussed in User Tracking, you can use filters to retrieve insights based on id, name, or tags.
Available Filters:
user_id: Filter data for a specific user by their unique ID.
user_name: Retrieve usage based on the user’s name.
user_tags: Filter by tags associated with a user (e.g., "websearch", "support").
Example response:
Model Context Protocol (MCP) is an open standard that enables AI models to seamlessly communicate with external systems. It allows models to dynamically process contextual data, ensuring efficient, adaptive, and scalable interactions. MCP simplifies request orchestration across distributed AI systems, enhancing interoperability and context-awareness.
With native tool integrations, MCP connects AI models to APIs, databases, local files, automation tools, and remote services through a standardized protocol. Developers can effortlessly integrate MCP with IDEs, business workflows, and cloud platforms, while retaining the flexibility to switch between LLM providers. This enables the creation of intelligent, multi-modal workflows where AI securely interacts with real-world data and tools.
Here's an example of how you can use a Virtual MCP Server in your project:
You can instantly connect LangDB’s Virtual MCP servers to editors like Cursor, Claude, or Windsurf.
Run this in your terminal to set up MCP in Cursor:
You can now call tools directly in your editor, with full tracing on LangDB.
If you already have an MCP server hosted externally — like Smithery’s Exa MCP — you can plug it straight into LangDB with zero extra setup.
Just pass your external MCP server URL in extra_body when you make a chat completion request. For example, with Smithery:
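A hypothetical sketch follows; the extra_body key name and server-URL shape are assumptions — only the general mechanism (passing the MCP server URL via extra_body) comes from this guide.

```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Search the web for MCP news."}],
    extra_body={
        "mcp_servers": [  # assumed key; consult the gateway docs for the exact field
            {"server_url": "https://server.smithery.ai/exa/mcp"}  # placeholder URL
        ]
    },
)
```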
Select type: Optimized
Select metric: TTFT
Select Target Models and Parameters
For example, to define the same TTFT-optimized router as in the UI:
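Below is a hedged sketch of such a definition, expressed as a Python dict mirroring the JSON config; the key names are assumptions based on the UI fields above.

```python
router = {
    "type": "optimized",
    "metric": "ttft",  # route to the target with the lowest time-to-first-token
    "targets": [
        {"model": "gpt-3.5-turbo"},
        {"model": "gpt-4o-mini"},
    ],
}
```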
LangDB simplifies how you work with MCP (Model Context Protocol) servers — whether you want to use a built-in Virtual MCP or connect to an external MCP server.
LangDB allows you to create Virtual MCP Servers directly from the dashboard. You can instantly select and bundle tools like database queries, search APIs, or automation tasks into a single MCP URL — no external setup needed.
You can define a router using JSON; example configurations are available in the documentation.
LangDB enables cost tracking, project budgeting, and cost groups to help manage AI usage efficiently.
Available in Business & Enterprise tiers under User Management.
Organize users into cost groups to track and allocate spending.
Cost groups help in budgeting but are independent of user roles.
Set daily, monthly, and total spending limits per project.
Enforce per-user limits to prevent excessive usage.
Available in Project Settings → Cost Control.
Admins and Billing users can define spending limits for cost groups.
Set daily, monthly, and total budgets per group.
Useful for controlling team-based expenses independently of project limits.
LangDB Guardrails allow developers to enforce specific constraints and checks on their LLM calls, ensuring safety, compliance, and quality control.
Guardrails currently support request validation and logging, ensuring structured oversight of LLM interactions.
These guardrails include:
Content Moderation: Detects and filters harmful or inappropriate content (e.g., toxicity detection, sentiment analysis).
Security Checks: Identifies and mitigates security risks (e.g., PII detection, prompt injection detection).
Compliance Enforcement: Ensures adherence to company policies and factual accuracy (e.g., policy adherence, factual accuracy).
Response Validation: Validates response format and structure (e.g., word count, JSON schema, regex patterns).
Guardrails can be configured via the UI or API, providing flexibility for different use cases.
When a guardrail blocks an input or output, the system returns a structured error response. Below are some example responses for different scenarios:
It is important to note that guardrails cannot be applied to streaming outputs.
LangDB provides prebuilt templates to enforce various constraints on LLM responses. These templates cover areas such as content moderation, security, compliance, and validation.
The following table provides quick access to each guardrail template:

| Template | Description |
| --- | --- |
| Toxicity Detection (content-toxicity) | Detects and filters toxic or harmful content. |
| JSON Schema Validation (validation-json-schema) | Validates responses against a user-defined JSON schema. |
| Competitor Mentions (content-competitor-mentions) | Detects mentions of competitor names or products. |
| PII Detection (security-pii-detection) | Identifies personally identifiable information in responses. |
| Prompt Injection Detection (security-prompt-injection) | Detects attempts to manipulate the AI through prompt injections. |
| Company Policy Compliance (compliance-company-policy) | Ensures responses align with company policies. |
| Regex Pattern Validation (validation-regex-pattern) | Validates responses against specified regex patterns. |
| Word Count Validation (validation-word-count) | Ensures responses meet specified word count requirements. |
| Sentiment Analysis (content-sentiment-analysis) | Evaluates sentiment to ensure appropriate tone. |
| Language Validation (content-language-validation) | Checks if responses are in allowed languages. |
| Topic Adherence (content-topic-adherence) | Ensures responses stay on specified topics. |
| Factual Accuracy (content-factual-accuracy) | Validates that responses contain factually accurate information. |
Toxicity Detection (content-toxicity)
Detects and filters out toxic, harmful, or inappropriate content.

| Parameter | Type | Description | Default / Example |
| --- | --- | --- | --- |
| threshold | number | Confidence threshold for toxicity detection. | Required |
| categories | array | Categories of toxicity to detect. | ["hate", "harassment", "violence", "self-harm", "sexual", "profanity"] |
| evaluation_criteria | array | Criteria used for toxicity evaluation. | ["Hate speech", "Harassment", "Violence", "Self-harm", "Sexual content", "Profanity"] |

JSON Schema Validation (validation-json-schema)
Validates responses against a user-defined JSON schema.

| Parameter | Type | Description | Default / Example |
| --- | --- | --- | --- |
| schema | object | Custom JSON schema to validate against (replace with your own schema). | Required |

Competitor Mentions (content-competitor-mentions)
Detects mentions of competitor names or products in LLM responses.

| Parameter | Type | Description | Default / Example |
| --- | --- | --- | --- |
| competitors | array | List of competitor names. | ["company1", "company2"] |
| match_partial | boolean | Whether to match partial names. | true |
| case_sensitive | boolean | Whether matching should be case sensitive. | false |

PII Detection (security-pii-detection)
Detects personally identifiable information (PII) in responses.

| Parameter | Type | Description | Default / Example |
| --- | --- | --- | --- |
| pii_types | array | Types of PII to detect. | ["email", "phone", "ssn", "credit_card"] |
| redact | boolean | Whether to redact detected PII. | false |

Prompt Injection Detection (security-prompt-injection)
Identifies prompt injection attacks attempting to manipulate the AI.

| Parameter | Type | Description | Default / Example |
| --- | --- | --- | --- |
| threshold | number | Confidence threshold for injection detection. | Required |
| detection_patterns | array | Common patterns used in prompt injection attacks. | ["Ignore previous instructions", "Forget your training", "Tell me your prompt"] |
| evaluation_criteria | array | Criteria used for detection. | ["Attempts to override system instructions", "Attempts to extract system prompt information", "Attempts to make the AI operate outside its intended purpose"] |

Company Policy Compliance (compliance-company-policy)
Ensures that responses align with predefined company policies.

| Parameter | Type | Description | Default / Example |
| --- | --- | --- | --- |
| embedding_model | string | Model used for text embedding. | text-embedding-ada-002 |
| threshold | number | Similarity threshold for compliance. | Required |
| dataset | object | Example dataset for compliance checking. | Contains predefined examples |

Regex Pattern Validation (validation-regex-pattern)
Validates responses against specific regex patterns.

| Parameter | Type | Description | Default / Example |
| --- | --- | --- | --- |
| patterns | array | List of regex patterns. | ["^[A-Za-z0-9\s.,!?]+$"] |
| match_type | string | Whether all, any, or none of the patterns must match. | "all" |

Word Count Validation (validation-word-count)
Ensures responses meet specified word count requirements.

| Parameter | Type | Description | Default / Example |
| --- | --- | --- | --- |
| min_words | number | Minimum number of words required. | 10 |
| max_words | number | Maximum number of words allowed. | 500 |
| count_method | string | Method used for word counting. | split |

Sentiment Analysis (content-sentiment-analysis)
Evaluates the sentiment of responses to ensure appropriate tone.

| Parameter | Type | Description | Default / Example |
| --- | --- | --- | --- |
| allowed_sentiments | array | Allowed sentiment categories. | ["positive", "neutral"] |
| threshold | number | Confidence threshold for sentiment detection. | 0.7 |

Language Validation (content-language-validation)
Checks if responses are in allowed languages.

| Parameter | Type | Description | Default / Example |
| --- | --- | --- | --- |
| allowed_languages | array | List of allowed languages. | ["english"] |
| threshold | number | Confidence threshold for language detection. | 0.9 |

Topic Adherence (content-topic-adherence)
Ensures responses stay on specified topics.

| Parameter | Type | Description | Default / Example |
| --- | --- | --- | --- |
| allowed_topics | array | List of allowed topics. | ["Product information", "Technical assistance"] |
| forbidden_topics | array | List of forbidden topics. | ["politics", "religion"] |
| threshold | number | Confidence threshold for topic detection. | 0.7 |

Factual Accuracy (content-factual-accuracy)
Validates that responses contain factually accurate information.

| Parameter | Type | Description | Default / Example |
| --- | --- | --- | --- |
| reference_facts | array | List of reference facts. | [] |
| threshold | number | Confidence threshold for factuality assessment. | 0.8 |
| evaluation_criteria | array | Criteria used to assess factual accuracy. | ["Contains verifiable information", "Avoids speculative claims"] |
LangDB provides role-based access control to manage users efficiently within an organization. There are three primary roles: Admin, Developer, and Billing.
Each role has specific permissions and responsibilities, ensuring a structured and secure environment for managing teams.
Admins have the highest level of control within LangDB. They can:
Invite and manage users
Assign and modify roles for team members
Manage cost groups and usage tracking
Access billing details and payment settings
Configure organizational settings
Best for: Organization owners, team leads, or IT administrators managing team access and billing.
Developers focus on working with APIs and integrating LLMs. They have the following permissions:
Access and use LangDB APIs
Deploy and test applications using LangDB’s AI Gateway
View and monitor API usage and performance
Best for: Software developers, data scientists, and AI engineers working on LLM integrations.
Billing users have access to financial and cost-related features. Their permissions include:
Managing top-ups and subscriptions
Monitoring usage costs and optimizing expenses
Best for: Finance teams, accounting personnel, and cost management administrators.
Admins can assign roles to users when inviting them to the organization. Role changes can also be made later through the user management panel.
Users can have multiple roles (e.g., both Developer and Billing).
Only Admins can assign or update roles.
Billing users cannot modify API access but can track and manage costs.
Role Management is only available in Professional, Business, and Enterprise tiers.
LangDB simplifies working with multiple Large Language Models (LLMs) through a single API. It excels at analytics, usage monitoring, and evaluation, giving developers insights into model performance, usage stats, and costs. This guide covers installation, setup, and key functionalities.
To install the LangDB Python client, run:
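The exact command was not preserved here; assuming the client is published on PyPI as pylangdb, it would be:

```bash
# Assumed package name; check PyPI if it differs.
pip install pylangdb
```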
Initialize the client with your API key and project ID:
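A hedged sketch of initialization follows — the import path and constructor arguments are assumptions based on the description above:

```python
from pylangdb import LangDb  # assumed import path

client = LangDb(
    api_key="LANGDB_API_KEY",      # your LangDB API key
    project_id="YOUR_PROJECT_ID",  # your LangDB project ID
)
```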
You can generate a response using the completion method:
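A sketch of the call, assuming an OpenAI-style signature for completion; the released client may differ:

```python
response = client.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=300,
)
```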
You can fetch messages from a specific thread using its thread_id:
Retrieve cost and token usage details for a thread:
You can retrieve analytics for specific model tags:
Alternatively, you can convert analytics data into a Pandas DataFrame for easier analysis:
To generate an evaluation DataFrame containing message and cost information for multiple threads:
To list all models supported by LangDB:
LangDB provides built-in evaluation capabilities, allowing developers to assess model performance, response accuracy, and cost efficiency. By analyzing messages, token usage, and analytics data, teams can fine-tune their models for better results.
Register and configure a new LLM under your LangDB project. The request is authorized with a LangDB Admin Key and accepts fields including: the model name (e.g., my-model), a description (e.g., "A custom completions model for text and image inputs"), project and provider identifiers (e.g., e2e9129b-6661-4eeb-80a2-0c86964974c9, 55f4a12b-74c8-4294-8e4b-537f13fc3861), a boolean flag (false), the provider type (openai-compatible), the model type (completions), per-token input and output prices (0.00001, 0.00003), context size (128000), capabilities (["tools"]), input and output modalities (["text","image"]), the upstream provider (openai), a numeric default (0), a version tag (my-model-v1.2), and additional configuration parameters, for example:
{"top_k":{"default":0,"description":"Limits the token sampling to only the top K tokens.","min":0,"required":false,"step":1,"type":"int"},"top_p":{"default":1,"description":"Nucleus sampling alternative.","max":1,"min":0,"required":false,"step":0.05,"type":"float"}}
A successful request returns Created.
Creates an embedding vector representing the input text or token arrays.
Parameters:
model: ID of the model to use for generating embeddings (e.g., text-embedding-ada-002).
input: The text to embed, or an array of text strings to embed.
encoding_format: The format to return the embeddings in (e.g., float).
dimensions: The number of dimensions the resulting embeddings should have (e.g., 1536).
Returns a successful response with the embeddings.
Chat completion requests take the LangDB project ID in the path, along with parameters including:
model: ID of the model to use — either a specific model ID or a virtual model identifier (e.g., gpt-4o).
temperature: Sampling temperature (e.g., 0.8).
A successful request returns 200 OK.
LangDB provides access to 250+ LLMs with OpenAI compatible APIs.
You can use LangDB as a drop-in replacement for OpenAI APIs, making it easy to integrate into existing workflows and libraries such as OpenAI Client SDK.
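In practice the swap looks like this sketch — only the API key and base URL change relative to a stock OpenAI SDK setup (URL pattern as shown in the headers section; placeholders throughout):

```python
from openai import OpenAI

client = OpenAI(
    api_key="LANGDB_API_KEY",
    base_url="https://api.us-east-1.langdb.ai/YOUR_PROJECT_ID/v1",
)
# Existing OpenAI-based code continues to work unchanged from here.
```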
Retrieve thread messages:
thread_id (path): The ID of the thread to retrieve messages from.
LangDB project ID.
Returns a list of messages for the given thread.
Retrieve thread cost:
thread_id (path): The ID of the thread for which to retrieve cost information.
LangDB project ID.
Returns the total cost and token usage for the specified thread.
You can choose from any of the supported models.
The usage APIs take the LangDB project ID along with:
Start time in microseconds (e.g., 1693062345678).
End time in microseconds (e.g., 1693082345678).
The grouped-analytics variant additionally accepts grouping fields (e.g., ["provider"]).
Both return a successful response with the requested usage data.