2025 LangDB. All rights reserved.

Rate Limiting

Apply rate limiting, cost control and more.

Rate limiting prevents API abuse by capping the number of requests allowed within a specific time frame. You can configure hourly, daily, and monthly request limits.

This ensures fair usage and helps maintain system performance and stability.

# Limit to 1000 requests per hour, per day, and per month
ai-gateway serve \
    --rate-hourly 1000 \
    --rate-daily 1000 \
    --rate-monthly 1000

Or in config.yaml:

rate_limit:
  hourly: 100
  daily: 1000
  monthly: 10000
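
To illustrate how hourly, daily, and monthly limits can combine, here is a minimal Python sketch of a fixed-window counter. It is not the gateway's implementation: the `FixedWindowLimiter` class, the fixed-window strategy, and the 30-day approximation of a month are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Window lengths in seconds. A "month" is approximated as 30 days here;
# this is an assumption of the sketch, not documented gateway behaviour.
WINDOW_SECONDS = {"hourly": 3600, "daily": 86400, "monthly": 30 * 86400}


@dataclass
class FixedWindowLimiter:
    """Hypothetical limiter that mirrors the rate_limit keys in config.yaml."""

    limits: dict  # e.g. {"hourly": 100, "daily": 1000, "monthly": 10000}
    counts: dict = field(default_factory=dict)  # (window name, window index) -> requests seen

    def allow(self, now: float) -> bool:
        """Return True and count the request if every configured window has room."""
        keys = [(name, int(now // WINDOW_SECONDS[name])) for name in self.limits]
        if any(self.counts.get(key, 0) >= self.limits[key[0]] for key in keys):
            return False  # a server would answer with HTTP 429 at this point
        for key in keys:
            self.counts[key] = self.counts.get(key, 0) + 1
        return True


limiter = FixedWindowLimiter({"hourly": 100, "daily": 1000, "monthly": 10000})
print(limiter.allow(now=0.0))
```

A request is accepted only when all three windows still have capacity, so the tightest limit always wins. The sketch never discards counters for expired windows; a long-running service would need to prune them.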

When a rate limit is exceeded, the API will return a 429 (Too Many Requests) response.
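
On the client side, a 429 is usually handled by waiting and retrying. The following is a minimal sketch of exponential backoff, assuming a caller-supplied `send` function that performs the request and returns a `(status, body)` pair. The function name, parameters, and delay schedule are hypothetical and not part of the gateway's API.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a status other than 429 or attempts run out."""
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # Double the wait after each rate-limited attempt: 1s, 2s, 4s, ...
        sleep(base_delay * 2 ** (attempt - 1))
        status, body = send()
    return status, body
```

Short backoff helps with brief bursts. If an hourly, daily, or monthly quota is exhausted, retries will keep returning 429 until the window resets, so callers may prefer to surface the error once `max_attempts` is reached.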

Why Rate Limiting Matters

  • Prevents excessive LLM API usage: Controls the number of requests per user to avoid resource exhaustion.

  • Optimizes model inference efficiency: Ensures that LLM requests are processed smoothly without congestion.
