Rate limits & errors

How Latentface rate limits work across the Free, Starter, and Scale plans, how to read the X-RateLimit headers and handle 429 responses, and what each error code means.

By Latentface Team · Updated May 27, 2026

Plan-level limits

Limits are measured in requests per minute (rpm). Free allows 10 rpm, Starter 60 rpm, and Scale 300 rpm with a priority queue. Enterprise has no per-minute cap. Each response includes `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers.
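
If you call the API without an SDK, you can read these headers straight off each response. The sketch below is one possible way to do that in Python. `RateLimitState` and `read_rate_limit` are hypothetical names, not part of any Latentface SDK, and because this page does not state the unit of `X-RateLimit-Reset`, the value is kept as a raw string rather than interpreted.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitState:
    # Requests left in the current window, or None if the header is absent/invalid.
    remaining: Optional[int]
    # Raw X-RateLimit-Reset value. Its unit is not assumed here.
    reset_raw: Optional[str]


def read_rate_limit(headers: Mapping[str, str]) -> RateLimitState:
    """Extract rate-limit headers from a response (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw_remaining = lowered.get("x-ratelimit-remaining")
    try:
        remaining = int(raw_remaining) if raw_remaining is not None else None
    except ValueError:
        remaining = None
    return RateLimitState(remaining, lowered.get("x-ratelimit-reset"))
```

A caller could, for example, pause or queue outgoing requests once `remaining` reaches `0`.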

Handle 429 responses

`429 Too Many Requests` responses include a `Retry-After` header giving the number of seconds to wait. Both SDKs honour it automatically and retry with exponential backoff.
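
For clients that do not use an SDK, the same behaviour can be approximated by hand. The following is a minimal sketch, not the SDKs' actual implementation: `send`, `retry_delay`, and `send_with_retry` are hypothetical names, and the backoff base, cap, and jitter strategy are assumptions chosen for illustration. `send` stands in for whatever function performs your HTTP request and returns `(status, headers, body)`.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next try (base and cap are assumed values)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("retry-after")
    if value is not None:
        try:
            # Retry-After is documented as a number of seconds.
            return max(0.0, float(value))
        except ValueError:
            pass
    # Fallback: exponential backoff with full jitter.
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def send_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() and retry while the response status is 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for attempt in range(max_attempts - 1):
        status, headers, _body = response
        if status != 429:
            break
        sleep(retry_delay(headers, attempt))
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.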

Tip: If your traffic is bursty, move to the Scale tier. Its priority queue handles bursts of up to 2× the rpm limit for 60 seconds.

Error taxonomy

`400` invalid input · `401` bad key · `402` out of quota · `404` resource not found · `429` rate limit · `5xx` our fault (automatic retry recommended).
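
One way to act on this taxonomy in client code is to map each status to a short label and a retry decision. The sketch below assumes that only `429` and `5xx` are worth retrying, since the other listed codes describe problems a repeat request will not fix. `classify` is a hypothetical helper, not an SDK function.

```python
from typing import Tuple

# Labels taken from the error taxonomy above.
_CLIENT_ERRORS = {
    400: "invalid input",
    401: "bad key",
    402: "out of quota",
    404: "resource not found",
}


def classify(status: int) -> Tuple[str, bool]:
    """Return (label, retryable) for an HTTP status (retry policy is an assumption)."""
    if status == 429:
        return ("rate limit", True)
    if 500 <= status <= 599:
        return ("server error", True)
    if status in _CLIENT_ERRORS:
        return (_CLIENT_ERRORS[status], False)
    # Codes not listed in the taxonomy are left for the caller to decide.
    return ("unlisted", False)
```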

Idempotency

Pass `Idempotency-Key` on any `POST` to guarantee exactly-once semantics across retries. Keys live for 24 hours.
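
The important detail is that the key is generated once per logical operation and reused on every retry of it. A minimal sketch under stated assumptions: the key is sent as an HTTP header, a UUID is an acceptable key format (this page does not specify one), and `send` is a hypothetical function that performs the `POST` with the given headers and returns the status code.

```python
import uuid
from typing import Callable, Dict


def idempotent_post(send: Callable[[Dict[str, str]], int],
                    max_attempts: int = 3) -> int:
    """Retry a POST on 429/5xx while reusing a single Idempotency-Key."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # Generated once, before the first attempt. UUID format is an assumption.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    status = send(dict(headers))
    for _ in range(max_attempts - 1):
        if status != 429 and status < 500:
            break
        # Same key on every retry, so the server can deduplicate the request.
        status = send(dict(headers))
    return status
```

Generating a fresh key inside the retry loop would defeat the purpose, because each attempt would then look like a new request. A production version would also wait between attempts.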
