# Rate Limits

> How Polinate rate limiting works and how to handle 429s

Polinate rate-limits API requests with a token bucket model, which keeps usage fair across clients and the service stable.

## Model overview

* A bucket holds a limited number of tokens.
* Tokens are consumed per request.
* Tokens are refilled over time at a fixed rate.
* When the bucket is empty, requests are rate-limited with HTTP 429.

```text
[Refill Rate] ---> [ Token Bucket ] ---> [Requests]
                      | empty?
                      v
                    429
```
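The model above can be sketched in a few lines of Python. This is only an illustration of the general token bucket technique, not Polinate's actual implementation: the class name `TokenBucket`, the parameters `capacity` and `refill_per_second`, and the one-token-per-request cost are all assumptions made for the example.

```python
import time


class TokenBucket:
    """Illustrative token bucket. All names and values here are hypothetical."""

    def __init__(self, capacity: float, refill_per_second: float, clock=time.monotonic):
        self.capacity = capacity                    # maximum tokens the bucket can hold
        self.refill_per_second = refill_per_second  # fixed refill rate
        self.clock = clock                          # injectable clock, handy for testing
        self.tokens = capacity                      # start with a full bucket
        self.last_refill = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last refill, capped at capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now

    def try_consume(self, cost: float = 1.0) -> bool:
        """Return True if the request may proceed, False if it would be rate-limited."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `capacity=2` and `refill_per_second=1`, two back-to-back requests succeed, a third is rejected, and one more request becomes possible after a second has passed.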

## Handling 429 responses

* Back off and retry after a short delay.
* 429 responses **always include** a `Retry-After` header, expressed **in seconds**. Wait at least that long before retrying.
* Batch work into fewer requests where possible, and make requests idempotent so that a retry cannot apply the same change twice.
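A minimal sketch of a client-side retry loop that honors `Retry-After` might look like the following. It is deliberately transport-agnostic: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, so it can wrap whatever HTTP client you use. The function name, `max_attempts`, and the exponential fallback delay are assumptions for illustration, not part of the Polinate API.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, raw body); a hypothetical shape for this sketch
Response = Tuple[int, Mapping[str, str], bytes]


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    fallback_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-429 response or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    attempt = 1
    while True:
        response = send()
        status, headers, _body = response
        if status != 429 or attempt >= max_attempts:
            return response

        # Header names are case-insensitive, so normalize before the lookup.
        lowered = {name.lower(): value for name, value in headers.items()}
        try:
            delay = float(lowered["retry-after"])  # seconds, per the docs above
        except (KeyError, ValueError):
            # Defensive fallback in case the header is missing or malformed.
            delay = fallback_delay * 2 ** (attempt - 1)

        sleep(max(delay, 0.0))
        attempt += 1
```

If every attempt is rate-limited, the last 429 response is returned to the caller rather than raised, so the calling code decides how to surface the failure.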

## Example 429 response

```json
{
  "errorCode": "RATE_LIMITED",
  "message": "You’re sending requests too quickly. Please try again shortly.",
  "requestId": "{{requestId}}"
}
```
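When logging or reporting a rate-limit error, it can help to parse the body into a typed value so that `requestId` is easy to pass along. The sketch below assumes only the three fields shown in the example body; the `RateLimitInfo` class and `parse_rate_limit_body` function are hypothetical names chosen for this illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RateLimitInfo:
    """Fields from the example 429 body; the class itself is hypothetical."""
    error_code: str
    message: str
    request_id: str


def parse_rate_limit_body(body: str) -> RateLimitInfo:
    # Raises KeyError if a field is absent and json.JSONDecodeError on invalid JSON.
    data = json.loads(body)
    return RateLimitInfo(
        error_code=data["errorCode"],
        message=data["message"],
        request_id=data["requestId"],
    )
```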

### Headers

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 3
Content-Type: application/json
```

<Note>
  `Retry-After` is expressed in seconds.
</Note>