🚦API Reference
Rate limits & errors
How Latentface rate limits work, what to do when you hit one, and how to read every error response.
1
Plan-level limits
Free: 10 requests per minute (rpm). Starter: 60 rpm. Scale: 300 rpm with a priority queue. Enterprise: no per-minute cap. Every response includes `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers.
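For clients that want to pace themselves rather than wait for a rejection, the plan limits translate directly into a minimum spacing between requests. The sketch below is illustrative only: `PLAN_RPM`, `min_interval_seconds`, and `remaining_requests` are hypothetical names, and it assumes `X-RateLimit-Remaining` carries a plain integer. `X-RateLimit-Reset` is left unparsed because its format is not stated here.

```python
# Plan limits as listed above; Enterprise has no per-minute cap.
PLAN_RPM = {"free": 10, "starter": 60, "scale": 300, "enterprise": None}


def min_interval_seconds(plan):
    """Smallest even spacing between requests that stays within the plan's rpm."""
    rpm = PLAN_RPM[plan]
    return 0.0 if rpm is None else 60.0 / rpm


def remaining_requests(headers):
    """Read X-RateLimit-Remaining from a header mapping; None if absent.

    Assumption: the header value is a plain integer.
    """
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "x-ratelimit-remaining":
            return int(value)
    return None
```

On the Free plan this works out to one request every 6 seconds; on Scale, one every 0.2 seconds.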
2
Handle 429 responses
`429 Too Many Requests` responses include a `Retry-After` header (seconds). Both SDKs honour it automatically with exponential backoff.
Tip: If your traffic is bursty, move to the Scale tier. Its priority queue handles bursts of up to 2× the rpm limit for 60 s.
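If you call the API without an SDK, the same behaviour can be approximated by hand. The following is a minimal sketch, not the SDKs' actual implementation: `send` is a hypothetical callable that performs one request and returns `(status_code, headers)`, and the fallback delays (`base_delay` doubled per attempt) are assumed values used only when `Retry-After` is missing.

```python
import time


def call_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical callable returning (status_code, headers_dict).
    """
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        if attempt == max_attempts - 1:
            break  # out of attempts; return the last 429 to the caller
        retry_after = headers.get("Retry-After")
        if retry_after is not None:
            # Prefer the server's hint (seconds).
            delay = float(retry_after)
        else:
            # Assumed fallback: exponential backoff of 1 s, 2 s, 4 s, ...
            delay = base_delay * (2 ** attempt)
        sleep(delay)
    return status, headers
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.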
3
Error taxonomy
`400` invalid input · `401` bad key · `402` out of quota · `404` resource not found · `429` rate limit · `5xx` our fault (automatic retry recommended).
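One way to act on this list in client code is a small lookup that returns the meaning of a status and whether a retry is worthwhile. This is a sketch under assumptions: `classify_status` is a hypothetical helper, and treating `400`, `401`, `402`, and `404` as non-retryable is our reading of the list rather than a documented rule.

```python
def classify_status(status):
    """Return (meaning, should_retry) for an HTTP status code from the list above."""
    known = {
        400: ("invalid input", False),
        401: ("bad key", False),
        402: ("out of quota", False),
        404: ("resource not found", False),
        429: ("rate limit", True),  # retry after the Retry-After delay
    }
    if status in known:
        return known[status]
    if 500 <= status <= 599:
        # Server-side failure: automatic retry is recommended.
        return ("server error", True)
    # Assumption: anything not listed is not retried.
    return ("unclassified", False)
```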
4
Idempotency
Pass `Idempotency-Key` on any `POST` to guarantee exactly-once semantics across retries. Keys live for 24 hours.
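The important detail is that the key is generated once per logical operation and reused on every retry. A minimal sketch, assuming a hypothetical `send(payload, headers)` callable that performs the `POST` and returns the status code; the UUID format for the key is also an assumption, since any unique string would serve the same purpose.

```python
import uuid


def post_with_idempotency(send, payload, max_attempts=3):
    """POST with one Idempotency-Key shared by all attempts.

    send is a hypothetical callable: send(payload, headers) -> status_code.
    """
    key = str(uuid.uuid4())  # generated once, outside the retry loop
    headers = {"Idempotency-Key": key}
    status = None
    for _ in range(max_attempts):
        status = send(payload, headers)
        if status < 500:
            break  # only 5xx responses are retried in this sketch
    return status, key
```

Generating a fresh key inside the loop would defeat the purpose, because each retry would then look like a new request.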