Global + per-route

Rate limits.

Projekt enforces a global per-token ceiling on the entire API surface, plus tighter per-route limits on a handful of expensive, security-sensitive, or destructive endpoints. Every response carries X-RateLimit-* headers so a well-behaved client never needs to see a 429.

The global ceiling

Every authenticated call is bucketed by your token. The limit is 600 requests per 60-second sliding window, per token. Unauthenticated calls are bucketed by IP at the same rate.

In practice that averages out to 10 req/s. Because the limit applies to the whole window, short bursts above that rate are fine as long as the 60-second total stays within 600. If you're pulling thousands of issues, paginate with limit=100 and you'll never get close.
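
To make the headroom concrete, the arithmetic can be sketched as below. The helper names are illustrative and not part of any Projekt client; only the 600-per-60s ceiling and the limit=100 page size come from the text above.

```python
import math

GLOBAL_LIMIT = 600     # requests per window, from the text above
WINDOW_SECONDS = 60    # window length in seconds
PAGE_SIZE = 100        # the limit=100 page size suggested above


def requests_needed(total_items: int, page_size: int = PAGE_SIZE) -> int:
    """Number of paginated calls needed to fetch total_items records."""
    return math.ceil(total_items / page_size)


def budget_fraction(total_items: int) -> float:
    """Share of one window's budget that a full paginated pull consumes."""
    return requests_needed(total_items) / GLOBAL_LIMIT


print(requests_needed(5_000))          # 50 calls
print(GLOBAL_LIMIT / WINDOW_SECONDS)   # 10.0 requests per second on average
```

Under these assumptions, pulling 5,000 issues uses 50 of the 600 requests available in a single window.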

Tighter per-route limits

A small set of endpoints carry extra limits on top of the global one, because they are expensive (search), security-sensitive (auth), or destructive (mutations on certain resources). The exact thresholds are visible via the response headers.

Endpoint family	Limit	Why
`POST /api-keys`	10 / hour	Keep automated provisioning from generating endless keys.
`GET /orgs/search`, `/people/search`, `/palette/search`	30 / 60s	Search hits broad indexes; per-search cap protects the DB.
`POST /projects`, `PUT /projects/:id`, `DELETE /projects/:id`	30 / 60s	Project mutations cascade — slow them down.
`POST /oauth/register`	5 / hour	Dynamic Client Registration is a low-frequency operation.
`POST /oauth/token`	60 / 60s	Cap brute-force on token exchange.
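
A client that calls these endpoints on a schedule might keep its own copy of the table. The sketch below copies the thresholds from the rows above and derives the steady pace that roughly stays within each one. The dictionary keys and the function name are invented for illustration, and the response headers remain the source of truth.

```python
# (limit, window in seconds), copied from the table above.
# The family names used as keys are hypothetical labels.
ROUTE_LIMITS = {
    "api-keys": (10, 3600),
    "search": (30, 60),
    "project-mutations": (30, 60),
    "oauth-register": (5, 3600),
    "oauth-token": (60, 60),
}


def min_spacing(family: str) -> float:
    """Seconds between calls for an evenly paced client at the limit."""
    limit, window = ROUTE_LIMITS[family]
    return window / limit


for family in ROUTE_LIMITS:
    print(f"{family}: one call every {min_spacing(family):g}s")
```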

Response headers

Every response — even successful ones — carries:

Header	Meaning
`X-RateLimit-Limit`	The bucket size (e.g. `600`).
`X-RateLimit-Remaining`	How many requests you can still send before refusal.
`X-RateLimit-Reset`	Seconds until the bucket refills (or until the next slot opens, for sliding-window backends).

On a 429, treat X-RateLimit-Reset like Retry-After: sleep that long before the next request.
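
A minimal sketch of reading the three headers follows. The `RateLimitState` class, the function names, and the 5-second fallback are assumptions made for illustration; only the header names and their meanings come from the table above.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitState:
    limit: int            # X-RateLimit-Limit
    remaining: int        # X-RateLimit-Remaining
    reset_seconds: float  # X-RateLimit-Reset


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimitState]:
    """Return the parsed headers, or None if any is missing or malformed.

    A plain dict is case-sensitive; HTTP libraries usually hand back a
    case-insensitive mapping, which works here as well.
    """
    try:
        return RateLimitState(
            limit=int(headers["X-RateLimit-Limit"]),
            remaining=int(headers["X-RateLimit-Remaining"]),
            reset_seconds=float(headers["X-RateLimit-Reset"]),
        )
    except (KeyError, ValueError):
        return None


def retry_delay(status_code: int, headers: Mapping[str, str],
                fallback: float = 5.0) -> float:
    """Seconds to sleep before the next request.

    On a 429 this is the X-RateLimit-Reset value (or `fallback`, an assumed
    default, when the headers are unusable); otherwise it is 0.
    """
    if status_code != 429:
        return 0.0
    state = parse_rate_limit(headers)
    return state.reset_seconds if state is not None else fallback
```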

Recommended client behaviour

Read X-RateLimit-Remaining from every response. If it drops below ~10% of X-RateLimit-Limit, slow down voluntarily.
On 429, sleep X-RateLimit-Reset seconds and retry exactly once.
If you hit two consecutive 429s, fall back to exponential backoff (cap at 30s, give up after 6 tries).
For bulk jobs, consider running with a small parallelism factor (4–8 workers) and a token bucket that mirrors the server's window.
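
For the bulk-job case, a client-side limiter shared by all workers could look like the sketch below. It uses a sliding-window log rather than a classic token bucket, because that tracks a 600-per-60s sliding window more closely. The class and its parameters are illustrative, not part of any Projekt SDK.

```python
import threading
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` acquisitions in any `window`-second span."""

    def __init__(self, limit=600, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self._clock = clock            # injectable for testing
        self._sleep = sleep
        self._stamps = deque()         # send times still inside the window
        self._lock = threading.Lock()  # shared by all workers

    def _wait_time(self) -> float:
        """Seconds until a slot is free (0.0 if one is free now)."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            return 0.0
        return self.window - (now - self._stamps[0])

    def acquire(self) -> None:
        """Block until a slot is free, then record the send time."""
        while True:
            with self._lock:
                delay = self._wait_time()
                if delay == 0.0:
                    self._stamps.append(self._clock())
                    return
            # Sleep outside the lock so other workers are not blocked.
            self._sleep(delay)
```

Each worker would call `acquire()` on the shared limiter immediately before sending a request. Setting `limit` a little below 600 leaves room for clock skew between client and server.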

Python example

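import time

import requests


def call(method, url, **kw):
    """Send a request, retrying on 429 for up to 6 attempts in total."""
    for attempt in range(6):
        r = requests.request(method, url, **kw)
        if r.status_code != 429:
            return r
        if attempt < 5:  # no point sleeping after the final attempt
            # Treat X-RateLimit-Reset like Retry-After; default 5s, cap 30s.
            wait = float(r.headers.get("X-RateLimit-Reset", "5"))
            time.sleep(min(wait, 30))
    r.raise_for_status()  # still 429 after 6 attempts: raises HTTPError
    return r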
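
The example above waits for X-RateLimit-Reset on every 429. A variant that follows the recommended behaviour more closely, switching to exponential backoff from the second consecutive 429, might look like the sketch below. The 1-second base delay and the `send` callable are assumptions; the text only fixes the 30s cap and the six tries.

```python
import time


def call_with_backoff(send, sleep=time.sleep, tries=6, base=1.0, cap=30.0):
    """Call `send()` until it returns a non-429 response or tries run out.

    `send` is any zero-argument callable returning an object with
    `.status_code` and `.headers` (for example a requests.Response).
    """
    for n in range(tries):
        r = send()
        if r.status_code != 429 or n == tries - 1:
            return r  # success, a different error, or the final 429
        if n == 0:
            # First 429: wait as long as the server asks (assumed 5s default).
            wait = float(r.headers.get("X-RateLimit-Reset", "5"))
        else:
            # Consecutive 429s: exponential backoff, 2s, 4s, 8s, 16s.
            wait = base * 2 ** n
        sleep(min(wait, cap))
    return r
```

With the requests library, `send` could be a lambda that wraps `requests.request(method, url, **kw)`. The caller decides what to do with a final 429, for instance by calling `raise_for_status()` on the returned response.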

One token, one bucket

Sharing a token across many machines means they share the bucket. For a fan-out workload, mint one personal access token (PAT) per worker, with descriptive names such as worker-01 and worker-02, so each worker gets its own 600/60s budget.