Rate Limits
We use a token bucket model to ensure fair usage and stability.
Model overview
- A bucket holds a limited number of tokens.
- Tokens are consumed per request.
- Tokens are refilled over time at a fixed rate.
- When the bucket is empty, requests are rate-limited with HTTP 429.
[Refill Rate] ---> [ Token Bucket ] ---> [Requests]
                          | empty?
                          v
                         429
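
To make the model concrete, here is a minimal sketch of a token bucket in Python. It is illustrative only and does not describe how the service is actually implemented. The class name TokenBucket, the parameters capacity, refill_rate, and clock, and the method try_consume are all hypothetical names chosen for this example, and the cost of one token per request is an assumption.

```python
import time


class TokenBucket:
    """Illustrative token bucket; all names here are hypothetical."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock              # injectable time source, in seconds
        self.tokens = float(capacity)   # assumption: the bucket starts full
        self.last_refill = clock()

    def try_consume(self, cost=1.0):
        """Return True if the request may proceed, False if it would get a 429."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Refill at a fixed rate, never exceeding the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With capacity=2 and refill_rate=1.0, two back-to-back requests succeed, a third is rejected, and one more request succeeds after a second has passed.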
Handling 429 responses
- Back off and retry after a short delay.
- 429 responses always include a Retry-After header giving the delay in seconds. Wait at least that long before retrying.
- Batch requests where possible so that fewer requests draw on the bucket, and make requests idempotent so that they are safe to retry.
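
The guidance above can be sketched as a small client-side retry helper. This is one possible approach under stated assumptions, not an official client. The function call_with_retry and its parameters are hypothetical, and send stands in for whatever HTTP call your client makes; it is assumed to return a (status, headers, body) tuple with headers as a plain dict keyed by "Retry-After". The exponential fallback and the small random jitter are additions for robustness, not documented API behavior.

```python
import random
import time


def call_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry on HTTP 429, honoring Retry-After.

    send is a hypothetical zero-argument callable returning
    (status, headers, body); headers is assumed to be a dict.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # out of attempts; return the last 429 to the caller
        try:
            # Retry-After is expressed in seconds.
            delay = float(int(headers.get("Retry-After")))
        except (TypeError, ValueError):
            # Fallback (assumption): exponential backoff if the header
            # is missing or unparsable.
            delay = base_delay * (2 ** attempt)
        # Small jitter so that many clients do not retry in lockstep.
        sleep(max(delay, 0.0) + random.uniform(0, 0.25))
    return status, headers, body
```

Retrying is only safe when the request is idempotent, which is why the helper leaves it to the caller to decide which requests to wrap.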
Example 429 response
HTTP/1.1 429 Too Many Requests
Retry-After: 3
Content-Type: application/json

{
  "errorCode": "RATE_LIMITED",
  "message": "You’re sending requests too quickly. Please try again shortly.",
  "requestId": "{{requestId}}"
}
Retry-After is expressed in seconds, so in this example the client should wait at least 3 seconds before retrying.