Guardrails

Enforce safety, compliance, and quality with LangDB guardrails—moderate content, validate responses, and detect security risks.


LangDB allows developers to enforce specific constraints and checks on their LLM calls, ensuring safety, compliance, and quality control.

Guardrails currently support request validation and logging, providing structured oversight of LLM interactions.

These guardrails include:

  • Content Moderation: Detects and filters harmful or inappropriate content (e.g., toxicity detection, sentiment analysis).

  • Security Checks: Identifies and mitigates security risks (e.g., PII detection, prompt injection detection).

  • Compliance Enforcement: Ensures adherence to company policies and factual accuracy (e.g., policy adherence, factual accuracy).

  • Response Validation: Validates response format and structure (e.g., word count, JSON schema, regex patterns).

Guardrails can be configured via the UI or API, providing flexibility for different use cases.

Guardrail Behaviour

When a guardrail blocks an input or output, the system returns a structured error response. Below are some example responses for different scenarios:

Example 1: Input Rejected by Guard

{
  "id": "",
  "object": "chat.completion",
  "created": 0,
  "model": "",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Input rejected by guard",
        "tool_calls": null,
        "refusal": null,
        "tool_call_id": null
      },
      "finish_reason": "rejected"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0,
    "cost": 0.0
  }
}

Example 2: Output Rejected by Guard

{
  "id": "5ef4d8b1-f700-46ca-8439-b537f58f7dc6",
  "object": "chat.completion",
  "created": 1741865840,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Output rejected by guard",
        "tool_calls": null,
        "refusal": null,
        "tool_call_id": null
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 21,
    "completion_tokens": 40,
    "total_tokens": 61,
    "cost": 0.000032579999999999996
  }
}

Limitations

Guardrails cannot be applied to streaming outputs.

Guardrail Templates

LangDB provides prebuilt templates to enforce various constraints on LLM responses. These templates cover areas such as content moderation, security, compliance, and validation.

The following table provides quick access to each guardrail template:

| Guardrail | Description |
| --- | --- |
| Toxicity Detection (content-toxicity) | Detects and filters toxic or harmful content. |
| JSON Schema Validator (validation-json-schema) | Validates responses against a user-defined JSON schema. |
| Competitor Mention Check (content-competitor-mentions) | Detects mentions of competitor names or products. |
| PII Detection (security-pii-detection) | Identifies personally identifiable information in responses. |
| Prompt Injection Detection (security-prompt-injection) | Detects attempts to manipulate the AI through prompt injections. |
| Company Policy Compliance (compliance-company-policy) | Ensures responses align with company policies. |
| Regex Pattern Validator (validation-regex-pattern) | Validates responses against specified regex patterns. |
| Word Count Validator (validation-word-count) | Ensures responses meet specified word count requirements. |
| Sentiment Analysis (content-sentiment-analysis) | Evaluates sentiment to ensure appropriate tone. |
| Language Validator (content-language-validation) | Checks if responses are in allowed languages. |
| Topic Adherence (content-topic-adherence) | Ensures responses stay on specified topics. |
| Factual Accuracy (content-factual-accuracy) | Validates that responses contain factually accurate information. |

Toxicity Detection (content-toxicity)

Detects and filters out toxic, harmful, or inappropriate content.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| threshold | number | Confidence threshold for toxicity detection. | Required |
| categories | array | Categories of toxicity to detect. | ["hate", "harassment", "violence", "self-harm", "sexual", "profanity"] |
| evaluation_criteria | array | Criteria used for toxicity evaluation. | ["Hate speech", "Harassment", "Violence", "Self-harm", "Sexual content", "Profanity"] |
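As a sketch, assuming the parameters are supplied as a flat JSON object (the exact wrapper shape depends on how the guardrail is attached), a configuration might look like this; the 0.8 threshold is an illustrative value, since the parameter is required and has no default:

{
  "threshold": 0.8,
  "categories": ["hate", "harassment", "violence", "self-harm", "sexual", "profanity"],
  "evaluation_criteria": ["Hate speech", "Harassment", "Violence", "Self-harm", "Sexual content", "Profanity"]
}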

JSON Schema Validator (validation-json-schema)

Validates responses against a user-defined JSON schema.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| schema | object | Custom JSON schema to validate against (replace with your own schema). | Required |
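A minimal sketch with an illustrative user-defined schema; replace it with your own:

{
  "schema": {
    "type": "object",
    "properties": {
      "answer": { "type": "string" },
      "confidence": { "type": "number" }
    },
    "required": ["answer"]
  }
}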

Competitor Mention Check (content-competitor-mentions)

Detects mentions of competitor names or products in LLM responses.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| competitors | array | List of competitor names. | ["company1", "company2"] |
| match_partial | boolean | Whether to match partial names. | true |
| case_sensitive | boolean | Whether matching should be case sensitive. | false |
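Spelled out with the documented defaults (the competitor names are placeholders to replace):

{
  "competitors": ["company1", "company2"],
  "match_partial": true,
  "case_sensitive": false
}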

PII Detection (security-pii-detection)

Detects personally identifiable information (PII) in responses.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| pii_types | array | Types of PII to detect. | ["email", "phone", "ssn", "credit_card"] |
| redact | boolean | Whether to redact detected PII. | false |
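For example, with redaction switched on (the documented default for redact is false):

{
  "pii_types": ["email", "phone", "ssn", "credit_card"],
  "redact": true
}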

Prompt Injection Detection (security-prompt-injection)

Identifies prompt injection attacks attempting to manipulate the AI.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| threshold | number | Confidence threshold for injection detection. | Required |
| detection_patterns | array | Common patterns used in prompt injection attacks. | ["Ignore previous instructions", "Forget your training", "Tell me your prompt"] |
| evaluation_criteria | array | Criteria used for detection. | ["Attempts to override system instructions", "Attempts to extract system prompt information", "Attempts to make the AI operate outside its intended purpose"] |

Company Policy Compliance (compliance-company-policy)

Ensures that responses align with predefined company policies.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| embedding_model | string | Model used for text embedding. | text-embedding-ada-002 |
| threshold | number | Similarity threshold for compliance. | Required |
| dataset | object | Example dataset for compliance checking. | Contains predefined examples |
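A sketch with an example threshold; note that the dataset structure shown here (an examples array of text/compliant pairs) is a hypothetical illustration, since the table above only states that it contains predefined examples:

{
  "embedding_model": "text-embedding-ada-002",
  "threshold": 0.85,
  "dataset": {
    "examples": [
      { "text": "We offer refunds within 30 days of purchase.", "compliant": true }
    ]
  }
}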

Regex Pattern Validator (validation-regex-pattern)

Validates responses against specific regex patterns.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| patterns | array | List of regex patterns. | ["^[A-Za-z0-9\s.,!?]+$"] |
| match_type | string | Whether all, any, or none of the patterns must match. | "all" |

Word Count Validator (validation-word-count)

Ensures responses meet specified word count requirements.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| min_words | number | Minimum number of words required. | 10 |
| max_words | number | Maximum number of words allowed. | 500 |
| count_method | string | Method for word counting. | split |
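With the documented defaults, the configuration is simply:

{
  "min_words": 10,
  "max_words": 500,
  "count_method": "split"
}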

Sentiment Analysis (content-sentiment-analysis)

Evaluates the sentiment of responses to ensure appropriate tone.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| allowed_sentiments | array | Allowed sentiment categories. | ["positive", "neutral"] |
| threshold | number | Confidence threshold for sentiment detection. | 0.7 |
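For instance, to allow only positive or neutral responses at the default confidence:

{
  "allowed_sentiments": ["positive", "neutral"],
  "threshold": 0.7
}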

Language Validator (content-language-validation)

Checks if responses are in allowed languages.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| allowed_languages | array | List of allowed languages. | ["english"] |
| threshold | number | Confidence threshold for language detection. | 0.9 |
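For instance, to accept English-only responses:

{
  "allowed_languages": ["english"],
  "threshold": 0.9
}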

Topic Adherence (content-topic-adherence)

Ensures responses stay on specified topics.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| allowed_topics | array | List of allowed topics. | ["Product information", "Technical assistance"] |
| forbidden_topics | array | List of forbidden topics. | ["politics", "religion"] |
| threshold | number | Confidence threshold for topic detection. | 0.7 |
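For instance, using the documented defaults:

{
  "allowed_topics": ["Product information", "Technical assistance"],
  "forbidden_topics": ["politics", "religion"],
  "threshold": 0.7
}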

Factual Accuracy (content-factual-accuracy)

Validates that responses contain factually accurate information.

| Parameter | Type | Description | Defaults |
| --- | --- | --- | --- |
| reference_facts | array | List of reference facts. | [] |
| threshold | number | Confidence threshold for factuality assessment. | 0.8 |
| evaluation_criteria | array | Criteria used to assess factual accuracy. | ["Contains verifiable information", "Avoids speculative claims"] |
