Rate Limits

We use a token bucket model to ensure fair usage and stability.

Model overview

  • A bucket holds a limited number of tokens.
  • Tokens are consumed per request.
  • Tokens are refilled over time at a fixed rate.
  • When the bucket is empty, further requests are rejected with HTTP 429 (Too Many Requests) until tokens are refilled.
[Refill Rate] ---> [ Token Bucket ] ---> [Requests]
                      | empty?
                      v
                    429
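
The model above can be sketched in a few lines of Python. This is only an illustration of the general token bucket technique, not the service's actual implementation: the class name `TokenBucket`, the parameters `capacity` and `refill_rate`, and the one-token-per-request cost are all assumptions made for the example.

```python
import time


class TokenBucket:
    """Illustrative token bucket; names and parameters are hypothetical."""

    def __init__(self, capacity, refill_rate, now=time.monotonic):
        self.capacity = capacity          # maximum tokens the bucket can hold
        self.refill_rate = refill_rate    # tokens added per second
        self._now = now                   # injectable clock, handy for testing
        self.tokens = float(capacity)     # start with a full bucket
        self.last_refill = now()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, capped at capacity.
        current = self._now()
        elapsed = current - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = current

    def try_consume(self, cost=1):
        # Return True if the request may proceed, False if it should get a 429.
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `capacity=2` and `refill_rate=1.0`, two back-to-back calls to `try_consume()` succeed, a third fails, and one more request is allowed after roughly a second has passed.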

Handling 429 responses

  • Back off and retry after a short delay.
  • Every 429 response includes a Retry-After header that gives the number of seconds to wait. Honor this value before retrying.
  • Use batching and idempotency where possible.
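
A client-side retry loop that follows these guidelines might look like the sketch below. It is a minimal example under stated assumptions: `call_with_retry` and its parameters are hypothetical names, and `send` stands in for whatever function performs the request and returns a `(status, headers, body)` tuple with headers as a dict keyed by `"Retry-After"`. The fallback delay and attempt limit are illustrative choices, not documented values.

```python
import time


def call_with_retry(send, max_attempts=5, default_delay=1.0, sleep=time.sleep):
    """Call send() and retry on HTTP 429, waiting for Retry-After seconds.

    send is a hypothetical callable returning (status, headers, body).
    Returns the last response, which is still a 429 if attempts run out.
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts:
            break  # out of attempts; do not sleep again
        try:
            # Retry-After is given in seconds; fall back if it is missing.
            delay = float(headers.get("Retry-After", default_delay))
        except ValueError:
            delay = default_delay  # header present but not a number
        sleep(max(delay, 0.0))
    return status, headers, body
```

Retrying a request is only safe if repeating it has no additional effect, which is why idempotent requests pair well with this pattern.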

Example 429 response

{
  "errorCode": "RATE_LIMITED",
  "message": "You’re sending requests too quickly. Please try again shortly.",
  "requestId": "{{requestId}}"
}

Headers

HTTP/1.1 429 Too Many Requests
Retry-After: 3
Content-Type: application/json

In this example, the Retry-After value of 3 means the client should wait 3 seconds before retrying.