Rate limits & errors

How Latentface rate limits work, what to do when you hit one, and how to read every error response.

1

Plan-level limits

Limits are expressed in requests per minute (rpm). Free: 10 rpm. Starter: 60 rpm. Scale: 300 rpm with a priority queue. Enterprise: no per-minute cap. Every response includes `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers.
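
If you call the API without an SDK, you can read these headers to slow down before you hit the limit. The sketch below is illustrative only: the function names are hypothetical, it assumes `X-RateLimit-Remaining` is an integer count, and it leaves `X-RateLimit-Reset` unparsed because this guide does not specify its format.

```python
from typing import Mapping, Optional, Tuple


def read_rate_limit(headers: Mapping[str, str]) -> Tuple[Optional[int], Optional[str]]:
    """Return (remaining, reset) from a response's rate-limit headers."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw_remaining = lowered.get("x-ratelimit-remaining")
    try:
        # Assumption: the header carries an integer request count.
        remaining = int(raw_remaining) if raw_remaining is not None else None
    except ValueError:
        remaining = None  # header present but not an integer
    # The format of X-RateLimit-Reset is not stated in this guide, so it is
    # returned as the raw string rather than guessed at.
    return remaining, lowered.get("x-ratelimit-reset")


def should_slow_down(headers: Mapping[str, str], threshold: int = 1) -> bool:
    """True when fewer than `threshold` requests remain in the current window."""
    remaining, _ = read_rate_limit(headers)
    return remaining is not None and remaining < threshold
```

Pass in the header mapping from whichever HTTP client you use. A missing or malformed header yields `None`, so the caller decides how cautious to be.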

2

Handle 429 responses

`429 Too Many Requests` responses include a `Retry-After` header (seconds). Both SDKs honour it automatically with exponential backoff.
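
If you are not using an SDK, the same behaviour can be approximated by hand. This is a minimal sketch, not the SDKs' actual implementation: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, and the backoff base and cap are assumed values.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next attempt (attempt is 0-based)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("retry-after")
    if value is not None:
        try:
            # Retry-After is documented above as a number of seconds.
            return max(0.0, float(value))
        except ValueError:
            pass  # not numeric: fall through to backoff
    # Fallback (assumed values): exponential backoff, capped at `cap` seconds.
    return min(cap, base * 2 ** attempt)


def send_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        status, headers, _body = response
        if status != 429:
            return response
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, attempt))
    return response  # still 429 after the final attempt
```

The `sleep` parameter is injectable so the loop can be tested without real waiting.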

Tip: If your traffic is bursty, move to the Scale tier. Its priority queue handles bursts of up to 2× rpm for 60 s.

3

Error taxonomy

`400` invalid input · `401` bad key · `402` out of quota · `404` resource not found · `429` rate limit · `5xx` our fault (automatic retry recommended).
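
The taxonomy above can be encoded as a small lookup so client code branches on it consistently. This is a sketch with hypothetical names. Treating only `429` and `5xx` as retryable is an assumption drawn from this page, on the reasoning that the other codes need a change to the request, key, or quota first.

```python
# Meanings copied from the taxonomy in this guide.
ERROR_MEANINGS = {
    400: "invalid input",
    401: "bad key",
    402: "out of quota",
    404: "resource not found",
    429: "rate limit",
}


def describe_status(status: int) -> str:
    """Map an HTTP status code to the short description used in this guide."""
    if 500 <= status <= 599:
        return "server error"
    return ERROR_MEANINGS.get(status, "unknown")


def is_retryable(status: int) -> bool:
    """True for 429 (after waiting) and 5xx (automatic retry recommended)."""
    # Assumption: other 4xx codes will fail again until the caller fixes
    # the input, the API key, or the quota.
    return status == 429 or 500 <= status <= 599
```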

4

Idempotency

Pass `Idempotency-Key` on any `POST` to guarantee exactly-once semantics across retries. Keys expire after 24 hours.
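
The important detail is that the key is generated once per logical operation and reused on every retry of it. The sketch below assumes `Idempotency-Key` is sent as an HTTP request header and that a UUID is an acceptable key format; neither is confirmed on this page, and the helper names are hypothetical.

```python
import uuid
from typing import Dict, Mapping, Optional


def new_idempotency_key() -> str:
    """Create a fresh key for one logical operation (assumed format: UUID4)."""
    return str(uuid.uuid4())


def post_headers(idempotency_key: str,
                 extra: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build request headers that carry the given idempotency key."""
    headers = dict(extra or {})
    # Assumption: the key travels as a request header of this name.
    headers["Idempotency-Key"] = idempotency_key
    return headers


# Generate the key once, outside any retry loop...
key = new_idempotency_key()
# ...and send the same key on the first attempt and on every retry.
first_attempt = post_headers(key, {"Content-Type": "application/json"})
retry_attempt = post_headers(key, {"Content-Type": "application/json"})
```

Generating a new key inside the retry loop would defeat the purpose, because each attempt would then look like a separate operation.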
