Latentface.ipynb

A face-model API for developers who live in notebooks.

§1 Install + authenticate

Install the SDK, set your API key, and you're done. The SDK is thin on purpose — everything below is a single HTTP call against the same endpoints you can hit with `curl`.
In [1]:
pip install latentface
3.4 s
Out[1]:
Collecting latentface
  Downloading latentface-2.1.0-py3-none-any.whl (42 kB)
Installing collected packages: latentface
Successfully installed latentface-2.1.0
In [2]:
import os
import latentface

latentface.api_key = os.environ["LATENTFACE_API_KEY"]
0.0 s
Out[2]:
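If the variable is not set, `os.environ[...]` raises a bare `KeyError`. Below is a minimal sketch of a friendlier guard. It assumes nothing about the SDK beyond the `LATENTFACE_API_KEY` name used above; `load_api_key` is a hypothetical helper, not part of `latentface`.

```python
import os


def load_api_key(env=os.environ, name="LATENTFACE_API_KEY"):
    """Return the API key from the environment or fail with a clear message.

    Hypothetical helper for illustration; not part of the latentface SDK.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the notebook.")
    return key
```

The result could then be assigned to `latentface.api_key` in place of the direct `os.environ` lookup.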

§2 Embed a face

Every face becomes a 512-dimensional ArcFace vector. The similarity between two faces is the cosine similarity between their vectors. This is the primitive; everything else (matching, swapping, searching) is built on top.
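To make the cosine relationship concrete, here is a minimal pure-Python sketch. It is not the SDK's implementation: `cosine_similarity` is an illustrative helper, and the 3-d toy vectors stand in for 512-d embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy vectors standing in for 512-d embeddings.
a = [1.0, 0.0, 0.0]
b = [1.0, 1.0, 0.0]
sim = cosine_similarity(a, b)  # cosine of a 45-degree angle
dist = 1.0 - sim               # cosine distance: 0 means same direction
```

Similarity and distance move in opposite directions: a higher similarity means a smaller distance, so it matters which of the two a score reports.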
In [3]:
from latentface import embed

url = "https://latentface.net/sample-a.jpg"
v = embed(url)
v
94 ms
Out[3]:

§3 Find similar faces

Compare two faces and get a cosine similarity score. The API returns the cosine similarity between the two 512-d embedding vectors; the same face under different lighting or at a different angle typically scores above 0.65.
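A score like this can drive a same-person decision or a small gallery search. The sketch below is a minimal illustration: the 0.65 cutoff comes from the paragraph above, while `rank_matches`, the gallery layout, and the toy 2-d vectors are assumptions and not part of the SDK.

```python
import math

SAME_FACE_THRESHOLD = 0.65  # typical same-face score quoted above; tune on your own data


def cosine_similarity(u, v):
    """Cosine similarity of two non-zero, equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def rank_matches(query, gallery, threshold=SAME_FACE_THRESHOLD):
    """Return (label, score) pairs at or above the threshold, best first.

    gallery is a hypothetical dict mapping a label to its embedding vector.
    """
    scored = [(label, cosine_similarity(query, vec)) for label, vec in gallery.items()]
    hits = [(label, score) for label, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


gallery = {
    "alice": [0.9, 0.1],
    "bob": [0.1, 0.9],
    "alice-2": [0.8, 0.3],
}
matches = rank_matches([1.0, 0.0], gallery)
```

With these toy vectors only the two `alice` entries clear the cutoff. A real threshold should be validated against your own false-accept and false-reject tolerances.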
In [4]:
from latentface import match

result = match(
    image_a=open("face-a.jpg", "rb"),
    image_b=open("face-b.jpg", "rb"),
)
result
47 ms
Out[4]:

§4 The model zoo

Four models ship today. All four are mirrored on Hugging Face Spaces for offline reproduction of any benchmark result below.
In [5]:
from latentface import list_models

list_models()
12 ms
Out[5]:
ModelCard(name='face-embed', version='v2.1', task='embedding', in=(H,W,3), out=(512,), tier='all')
ModelCard(name='face-swap', version='v3.0', task='swap', in=(H,W,3)×2, out=(H,W,3), tier='starter+')
ModelCard(name='face-match', version='v2.1', task='similarity', in=(128,)×2, out=(float,), tier='all')
ModelCard(name='face-enhance', version='v1.4', task='super-res', in=(H,W,3), out=(4H,4W,3), tier='scale+')
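The `tier` field reads like a minimum plan. The sketch below illustrates that gating under two assumptions: `'starter+'` means Starter and every higher tier, and tiers are ordered Free < Starter < Scale < Enterprise as in the §5 pricing table. `models_for_tier` and the `MODEL_TIERS` mapping are hypothetical; the mapping just copies the values from the output above.

```python
TIER_ORDER = ["free", "starter", "scale", "enterprise"]  # assumed ordering

# Minimum-tier strings copied from the list_models() output.
MODEL_TIERS = {
    "face-embed": "all",
    "face-swap": "starter+",
    "face-match": "all",
    "face-enhance": "scale+",
}


def models_for_tier(tier):
    """Return the model names available on the given tier (assumed semantics)."""
    rank = TIER_ORDER.index(tier.lower())
    available = []
    for name, requirement in MODEL_TIERS.items():
        # 'all' is treated as available from the free tier upward.
        minimum = "free" if requirement == "all" else requirement.rstrip("+")
        if rank >= TIER_ORDER.index(minimum):
            available.append(name)
    return available
```

Under these assumptions, `models_for_tier("Free")` yields only the embedding and matching models.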

§5 Pricing

Pay per call. No seat minimums, no contract call, no demo-request gate. The free tier is rate-limited and watermarks output; paid tiers do not.
In [9]:
from latentface import price_table

price_table()
2 ms
Out[9]:
   tier        monthly  calls_included  p95_latency  watermark
0  Free        $0       100/day         2 s          yes
1  Starter     $49      10,000/month    2 s          no
2  Scale       $199     100,000/month   <1 s         no
3  Enterprise  custom   negotiated      <500 ms      no
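As a quick worked example, the effective price per included call is the monthly fee divided by the monthly allowance. The figures are copied from the table above; `effective_cost_per_call` is an illustrative helper, and overage pricing is not modelled.

```python
def effective_cost_per_call(monthly_usd, calls_included):
    """Monthly fee divided by included calls, in USD per call."""
    if calls_included <= 0:
        raise ValueError("calls_included must be positive")
    return monthly_usd / calls_included


starter = effective_cost_per_call(49, 10_000)   # about 0.0049 USD per call
scale = effective_cost_per_call(199, 100_000)   # about 0.00199 USD per call
```

At full utilisation, Scale works out to roughly two-fifths of Starter's per-call price. The gap narrows if you use only part of the allowance.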

§6 Benchmarks

Latency and per-call cost across four providers, measured on 2026-03-18 against a 1,000-request sample of the WIDER FACE validation set. Methodology and raw data are linked in the bibliography below; re-run locally with `latentface.bench()`.
In [10]:
from latentface import bench

bench(tasks=["embedding", "swap"])
Out[10]:
   provider      embedding_p95  swap_p95  cost_per_call_usd
0  Latentface    94 ms          2.3 s     $0.005
1  Hugging Face  182 ms         4.1 s     $0.008
2  Replicate     138 ms         3.2 s     $0.007
3  Face++        246 ms         2.9 s     $0.010
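For anyone re-running the numbers, a p95 is the latency that 95% of requests stay at or below. The sketch below uses the nearest-rank method. The source does not say which percentile method `bench()` uses, so treat `p95_nearest_rank` and the made-up sample latencies as illustrative assumptions.

```python
def p95_nearest_rank(latencies_ms):
    """95th percentile by the nearest-rank method (no interpolation)."""
    if not latencies_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(latencies_ms)
    # 1-based rank = ceil(0.95 * n), computed in integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


samples = list(range(1, 101))  # made-up latencies: 1 ms through 100 ms
p95 = p95_nearest_rank(samples)
```

Interpolating methods can give slightly different values on small samples, so compare providers using the same method and sample size.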

§7 FAQ

Q1. How are rate limits enforced?

Per-API-key sliding-window counter. The free tier is 100 calls per 24 h. Paid tiers are soft-capped at the monthly allowance with per-call overage pricing disclosed in the `price_table()` output above.
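A per-key sliding window can be sketched in a few lines. This illustrates the general technique and is not Latentface's server code. It uses the sliding-window log variant, which stores one timestamp per accepted call. The `SlidingWindowLimiter` class and its in-memory storage are assumptions; the defaults mirror the free-tier numbers above.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per key within any `window_s`-second span."""

    def __init__(self, limit=100, window_s=24 * 60 * 60):
        self.limit = limit
        self.window_s = window_s
        self._calls = defaultdict(deque)  # api_key -> timestamps of accepted calls

    def allow(self, api_key, now):
        """Accept and record the call at time `now` (seconds), or reject it."""
        calls = self._calls[api_key]
        # Drop timestamps that have slid out of the window.
        while calls and now - calls[0] >= self.window_s:
            calls.popleft()
        if len(calls) >= self.limit:
            return False
        calls.append(now)
        return True


# Small limits make the behaviour easy to follow.
limiter = SlidingWindowLimiter(limit=2, window_s=10)
decisions = [limiter.allow("key-1", t) for t in (0, 1, 2, 10)]
```

Here the third call is rejected, and the fourth is accepted because the call at `t=0` has left the 10-second window. Unlike a fixed daily reset, the window moves with each request.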

Q2. Do you store the images I send?

No. Request bodies are processed in-memory and discarded after the response is written. We retain one log line per request with request_id + status + latency for billing; no image bytes, no hashes.

Q3. Do you train on my calls?

No. The published models are trained on publicly released research datasets (listed in the model cards). Customer requests are never used as training data, now or in future model versions.

Q4. Is there a free tier for research use?

Yes — the free tier is rate-limited but unrestricted for research purposes. Ping [email protected] with your institution and we'll quadruple the free-tier quota for the duration of your project.

Q5. Can I self-host?

Not today. The Hugging Face Spaces mirrors give you reproducible model weights for offline research; production self-hosting with a licensed inference server is on the 2027 roadmap.

§8 Why an API, not a product

The existing face-model tools ship as consumer apps. You download something, click through a wizard, and get a filtered video out. That is fine if you want a filtered video. It is the wrong abstraction if you want to build a feature.

A face-model API, by contrast, is closer to a tokenizer. You call it, get a vector, and decide what the vector means inside your product. The API doesn't know whether you're building a dating app lookalike, an e-commerce try-on, or a game NPC, and it shouldn't need to.

Latentface is the smallest thing that feels like that. Four endpoints. Two SDKs. Pay per call. Documentation that fits in a notebook you can run today.

The Latentface team · 2026-04-20

§9 Get an API key

No seat minimums, no procurement call. Paste your email; we return a key. The first 100 calls are free.
In [0]:
curl -X POST https://latentface.net/keys \
  -d '[email protected]'
Out[0]:
{
  "api_key": "lf_live_4f•••••••••••••••••••••",
  "free_calls": 100,
  "docs_url": "https://latentface.net/docs/quickstart"
}
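The response can be wired into the environment variable that §1 reads. In this minimal sketch the JSON shape is copied from the output above with a placeholder key; `store_api_key` is a hypothetical helper, not part of the SDK.

```python
import json
import os


def store_api_key(response_text, env=os.environ):
    """Parse the key-issuance response and export the key for the SDK to read.

    Returns the number of free calls reported by the response, if present.
    """
    payload = json.loads(response_text)
    env["LATENTFACE_API_KEY"] = payload["api_key"]
    return payload.get("free_calls")


# Placeholder response mirroring the fields shown above.
response_text = (
    '{"api_key": "lf_live_PLACEHOLDER", "free_calls": 100, '
    '"docs_url": "https://latentface.net/docs/quickstart"}'
)
fake_env = {}
free_calls = store_api_key(response_text, env=fake_env)
```

Writing to `os.environ` only affects the current process. To keep the key across sessions, export it from your shell profile or a secrets manager.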