Guardrails
LangDB allow developers to enforce specific constraints and checks on their LLM calls, ensuring safety, compliance, and quality control.
Guardrails currently support request validation and logging, ensuring structured oversight of LLM interactions.
These guardrails include:
Content Moderation: Detects and filters harmful or inappropriate content (e.g., toxicity detection, sentiment analysis).
Security Checks: Identifies and mitigates security risks (e.g., PII detection, prompt injection detection).
Compliance Enforcement: Ensures adherence to company policies and factual accuracy (e.g., policy adherence, factual accuracy).
Response Validation: Validates response format and structure (e.g., word count, JSON schema, regex patterns).
Guardrails can be configured via the UI or API, providing flexibility for different use cases.
Guardrail Templates
LangDB provides prebuilt templates to enforce various constraints on LLM responses. These templates cover areas such as content moderation, security, compliance, and validation.
The following table provides quick access to each guardrail template:
Detects and filters toxic or harmful content.
Validates responses against a user-defined JSON schema.
Detects mentions of competitor names or products.
Identifies personally identifiable information in responses.
Detects attempts to manipulate the AI through prompt injections.
Ensures responses align with company policies.
Validates responses against specified regex patterns.
Ensures responses meet specified word count requirements.
Evaluates sentiment to ensure appropriate tone.
Checks if responses are in allowed languages.
Ensures responses stay on specified topics.
Validates that responses contain factually accurate information.
Toxicity Detection (content-toxicity
)
content-toxicity
)Detects and filters out toxic, harmful, or inappropriate content.
JSON Schema Validator (validation-json-schema
)
validation-json-schema
)Validates responses against a user-defined JSON schema.
schema
object
Custom JSON schema to validate against (replace with your own schema)
Required
Competitor Mention Check (content-competitor-mentions
)
content-competitor-mentions
)Detects mentions of competitor names or products in LLM responses.
competitors
array
List of competitor names.
["company1", "company2"]
match_partial
boolean
Whether to match partial names.
true
case_sensitive
boolean
Whether matching should be case sensitive
false
PII Detection (security-pii-detection
)
security-pii-detection
)Detects personally identifiable information (PII) in responses.
pii_types
array
Types of PII to detect.
["email", "phone", "ssn", "credit_card"]
redact
boolean
Whether to redact detected PII.
false
Prompt Injection Detection (security-prompt-injection
)
security-prompt-injection
)Identifies prompt injection attacks attempting to manipulate the AI.
threshold
number
Confidence threshold for injection detection.
Required
detection_patterns
array
Common patterns used in prompt injection attacks.
["Ignore previous instructions", "Forget your training", "Tell me your prompt"]
evaluation_criteria
array
Criteria used for detection.
["Attempts to override system instructions", "Attempts to extract system prompt information", "Attempts to make the AI operate outside its intended purpose"]
Company Policy Compliance (compliance-company-policy
)
compliance-company-policy
)Ensures that responses align with predefined company policies.
embedding_model
string
Model used for text embedding.
text-embedding-ada-002
threshold
number
Similarity threshold for compliance.
Required
dataset
object
Example dataset for compliance checking.
Contains predefined examples
Regex Pattern Validator (validation-regex-pattern
)
validation-regex-pattern
)Validates responses against specific regex patterns.
patterns
array
Model List of regex patterns.
["^[A-Za-z0-9\s.,!?]+$"]
match_type
string
Whether all, any, or none of the patterns must match.
"all"
Word Count Validator (validation-word-count
)
validation-word-count
)Ensures responses meet specified word count requirements.
min_words
number
Model List of regex patterns.
10
max_words
number
Whether all, any, or none of the patterns must match.
500
count_method
string
Method for word counting.
split
Sentiment Analysis (content-sentiment-analysis
)
content-sentiment-analysis
)Evaluates the sentiment of responses to ensure appropriate tone.
allowed_sentiments
array
Allowed sentiment categories.
["positive", "neutral"]
threshold
number
Confidence threshold for sentiment detection.
0.7
Language Validator (content-language-validation
)
content-language-validation
)Checks if responses are in allowed languages.
allowed_languages
array
List of allowed languages.
["english"]
threshold
number
Confidence threshold for language detection.
0.9
Topic Adherence (content-topic-adherence
)
content-topic-adherence
)Ensures responses stay on specified topics.
allowed_topics
array
List of allowed topics.
["Product information", "Technical assistance"]
forbidden_topics
array
List of forbidden topics.
["politics", "religion"]
threshold
number
Confidence threshold for topic detection.
0.7
Factual Accuracy (content-factual-accuracy
)
content-factual-accuracy
)Validates that responses contain factually accurate information.
reference_facts
array
List of reference facts.
[]
threshold
number
Confidence threshold for factuality assessment.
0.8
evaluation_criteria
array
Criteria used to assess factual accuracy.
["Contains verifiable information", "Avoids speculative claims"]
Last updated
Was this helpful?