Latentface.ipynb
A face-model API for developers who live in notebooks.
§1 Install + authenticate
Install the SDK, set your API key, and you're done. The SDK is thin on purpose — everything below is a single HTTP call against the same endpoints you can hit with `curl`.
In [1]:
pip install latentface
3.4 s
Out[1]:
# pre-recorded — click ▶ run to re-execute
Collecting latentface
Downloading latentface-2.1.0-py3-none-any.whl (42 kB)
Installing collected packages: latentface
Successfully installed latentface-2.1.0
In [2]:
import os
import latentface
# Read the key from the environment so it never appears in the notebook itself.
latentface.api_key = os.environ["LATENTFACE_API_KEY"]
0.0 s
Out[2]:
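Reading the key with `os.environ[...]` raises a bare `KeyError` when the variable is unset. A minimal sketch of a friendlier loader follows. `load_api_key` is a hypothetical helper, not part of the SDK; only the variable name `LATENTFACE_API_KEY` comes from the cell above.

```python
import os


def load_api_key(env=None, name="LATENTFACE_API_KEY"):
    """Return the API key from the environment, or fail with a clear message.

    Hypothetical helper for illustration; not part of the latentface SDK.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the notebook")
    return key
```

With this in place, the assignment would read `latentface.api_key = load_api_key()`.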
§2 Embed a face
Every face becomes a 512-dimensional ArcFace vector. The cosine similarity between two vectors measures how alike the two faces are. This is the primitive; everything else (matching, swapping, searching) is built on top.
In [3]:
from latentface import embed
url = "https://latentface.net/sample-a.jpg"
v = embed(url)
v
94 ms
Out[3]:
§3 Find similar faces
Compare two faces and get a cosine similarity score. The API returns the similarity (not a distance) between the two 512-d embedding vectors, so higher means more alike; the same face across different lighting or angles typically scores above 0.65.
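To make the score concrete, here is a minimal sketch of cosine similarity and the 0.65 rule of thumb in plain Python. The function names are hypothetical and the API computes the score server-side; this only illustrates what the number means. Real embeddings have 512 elements, while the usage note uses toy 2-d vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def looks_like_same_face(a, b, threshold=0.65):
    """Apply the rule of thumb from the text: scores above 0.65 suggest a match."""
    return cosine_similarity(a, b) > threshold
```

For example, `looks_like_same_face([1, 0], [1, 1])` is true (similarity about 0.71), while two orthogonal vectors such as `[1, 0]` and `[0, 1]` score 0 and do not match. The right threshold for a given product is a tuning decision; 0.65 is only the typical value quoted above.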
In [4]:
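from latentface import match
# Open both images in binary mode; the context manager closes them afterwards.
with open("face-a.jpg", "rb") as image_a, open("face-b.jpg", "rb") as image_b:
    result = match(image_a=image_a, image_b=image_b)
result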
47 ms
Out[4]:
§4 The model zoo
Four models ship today. All four are mirrored on Hugging Face Spaces for offline reproduction of any benchmark result below.
In [5]:
from latentface import list_models
list_models()
12 ms
Out[5]:
ModelCard(name='face-embed', version='v2.1', task='embedding', in=(H,W,3), out=(512,), tier='all')
ModelCard(name='face-swap', version='v3.0', task='swap', in=(H,W,3)×2, out=(H,W,3), tier='starter+')
ModelCard(name='face-match', version='v2.1', task='similarity', in=(512,)×2, out=(float,), tier='all')
ModelCard(name='face-enhance', version='v1.4', task='super-res', in=(H,W,3), out=(4H,4W,3), tier='scale+')
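The `tier` field can be read as a minimum plan. A sketch of that check follows, assuming `'all'` means every tier and a trailing `+` means "this tier or any higher one", with the tier order taken from the pricing table. `tier_allows` is a hypothetical helper, not an SDK function.

```python
# Tier order taken from the pricing table; the meaning of "+" is an assumption.
TIER_ORDER = ["free", "starter", "scale", "enterprise"]


def tier_allows(model_tier, account_tier):
    """Return True if an account on `account_tier` may call a model tagged `model_tier`.

    Assumes 'all' means every tier and 'starter+' means Starter or any higher tier.
    """
    if model_tier == "all":
        return True
    minimum = model_tier.rstrip("+")
    return TIER_ORDER.index(account_tier.lower()) >= TIER_ORDER.index(minimum)
```

Under these assumptions, `tier_allows("starter+", "free")` is false and `tier_allows("scale+", "Enterprise")` is true.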
§5 Pricing
Pay per call. No seat minimums, no contract call, no demo-request gate. The free tier is rate-limited and watermarks its output; paid tiers do not watermark.
In [9]:
from latentface import price_table
price_table()
2 ms
Out[9]:
   tier        monthly  calls_included  p95_latency  watermark  note
0  Free        $0       100/day         2 s          yes
1  Starter     $49      10,000/month    2 s          no
2  Scale       $199     100,000/month   <1 s         no
3  Enterprise  custom   negotiated      <500 ms      no         most popular for labs
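For comparing plans, the effective price per call follows from dividing the monthly fee by the included calls. A small sketch using only the Starter and Scale figures from the table; it assumes the full allowance is used and ignores overage pricing, which the table does not list.

```python
# Monthly price in USD and included calls, copied from the pricing table.
TIERS = {
    "Starter": (49, 10_000),
    "Scale": (199, 100_000),
}


def cost_per_included_call(monthly_usd, calls_included):
    """Effective USD per call when the whole monthly allowance is used."""
    return monthly_usd / calls_included


for name, (monthly_usd, calls) in TIERS.items():
    print(f"{name}: ${cost_per_included_call(monthly_usd, calls):.5f} per call")
# Starter: $0.00490 per call
# Scale: $0.00199 per call
```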
§6 Benchmarks
Latency and per-call cost across four providers, measured on 2026-03-18 against a 1,000-request sample of the WIDER FACE validation set. Methodology and raw data are linked in the bibliography below; re-run locally with `latentface.bench()`.
In [10]:
from latentface import bench
bench(tasks=["embedding", "swap"])
Out[10]:
   provider      embedding_p95  swap_p95  cost_per_call_usd
0  Latentface    94 ms          2.3 s     $0.005
1  Hugging Face  182 ms         4.1 s     $0.008
2  Replicate     138 ms         3.2 s     $0.007
3  Face++        246 ms         2.9 s     $0.010
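A p95 figure is the latency that 95% of requests come in at or under. The source does not say which percentile definition `bench()` uses, so the following is only a sketch of one common choice, the nearest-rank method, applied to your own latency samples. `percentile_nearest_rank` is a hypothetical helper.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]
```

On a 1,000-request sample this picks the 950th fastest request, so 50 slower outliers do not move the number.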
§7 FAQ
Q1. How are rate limits enforced?
Per-API-key sliding-window counter. The free tier is 100 calls per 24 h. Paid tiers are soft-capped at the monthly allowance with per-call overage pricing disclosed in the `price_table()` output above.
Q2. Do you store the images I send?
No. Request bodies are processed in-memory and discarded after the response is written. We retain one log line per request with request_id + status + latency for billing; no image bytes, no hashes.
Q3. Do you train on my calls?
No. The published models are trained on publicly released research datasets (listed in the model cards). Customer requests are never used as training data, now or in future model versions.
Q4. Is there a free tier for research use?
Yes — the free tier is rate-limited but unrestricted for research purposes. Ping [email protected] with your institution and we'll quadruple the free-tier quota for the duration of your project.
Q5. Can I self-host?
Not today. The Hugging Face Spaces mirrors give you reproducible model weights for offline research; production self-hosting with a licensed inference server is on the 2027 roadmap.
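The per-key sliding-window counter mentioned in Q1 can be sketched client-side, which is handy for staying under the free-tier limit before the server rejects a call. This is an illustrative log-based variant, not the service's actual implementation; the class name and the injectable `clock` parameter are assumptions, and the defaults mirror the 100 calls per 24 h quoted in Q1.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per key within any `window_s`-second window."""

    def __init__(self, limit=100, window_s=24 * 60 * 60, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self._calls = {}  # api_key -> deque of call timestamps, oldest first

    def allow(self, api_key):
        """Record a call and return True, or return False if the key is at its limit."""
        now = self.clock()
        calls = self._calls.setdefault(api_key, deque())
        # Drop timestamps that have slid out of the window.
        while calls and now - calls[0] >= self.window_s:
            calls.popleft()
        if len(calls) >= self.limit:
            return False
        calls.append(now)
        return True
```

Keeping one timestamp per call costs memory proportional to the limit, which is fine at 100 calls per key; larger limits are usually served by approximate counters instead.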
§8 Why an API, not a product
The existing face-model tools ship as consumer apps. You download something, click through a wizard, and get a filtered video out. That is fine if you want a filtered video. It is the wrong abstraction if you want to build a feature.
A face-model API, by contrast, is closer to a tokenizer. You call it, get a vector, and decide what the vector means inside your product. The API doesn't know whether you're building a dating app lookalike, an e-commerce try-on, or a game NPC — and it shouldn't need to.
Latentface is the smallest thing that feels like that. Four endpoints. Two SDKs. Pay per call. Documentation that fits in a notebook you can run today.
— The Latentface team · 2026-04-20
§9 Get an API key
No seat minimums, no procurement call. Paste your email; we return a key. The first 100 calls are free.
Out[0]:
{
"api_key": "lf_live_4f•••••••••••••••••••••",
"free_calls": 100,
"docs_url": "https://latentface.net/docs/quickstart"
}