Security Model
Authentication
Every request, except those to public endpoints, must include an API key. Authentication failures map to these status codes:
- Missing key → `401 Unauthorized`
- Unknown key → `401 Unauthorized`
- Insufficient scope → `403 Forbidden`

API keys are compared with `hmac.compare_digest`, which runs in constant time and so prevents timing attacks.
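
As an illustration of the constant-time comparison described above, here is a minimal sketch. The `API_KEYS` mapping, its sample keys, and the `find_key` helper are hypothetical and only stand in for however the service actually stores its keys.

```python
import hmac

# Hypothetical key store: API key -> set of granted scopes.
API_KEYS = {
    "k-predict-123": {"predict"},
    "k-admin-456": {"predict", "read_models", "admin"},
}


def find_key(presented: str):
    """Return the scopes for a presented key, or None if the key is unknown."""
    presented_bytes = presented.encode("utf-8")
    matched = None
    for known, scopes in API_KEYS.items():
        # compare_digest does not short-circuit on the first differing byte,
        # so response time does not reveal how much of a key was guessed.
        if hmac.compare_digest(presented_bytes, known.encode("utf-8")):
            matched = scopes
    # No early return inside the loop: every stored key is always checked.
    return matched
```

A `None` result corresponds to the "unknown key" case, which is answered with `401 Unauthorized`.
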
Scopes

| Scope | Grants access to |
|---|---|
| `predict` | `/predict`, `/predict/batch`, `/predict/async*` |
| `read_models` | `/models` |
| `admin` | `/debug/*`, `/admin/*` |

- `/health`, `/ready`, `/metrics`: no auth required (public endpoints).
- `/jobs/{id}`: auth required, no specific scope.
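
The scope rules and status codes in this section can be sketched as a single decision function. This is an assumption-laden illustration, not the service's actual code: `SCOPE_RULES`, `PUBLIC_PATHS`, and `authorize` are hypothetical names, glob matching via `fnmatchcase` is one possible reading of the `*` patterns, and `200` simply stands in for "let the request through to routing".

```python
from fnmatch import fnmatchcase

PUBLIC_PATHS = {"/health", "/ready", "/metrics"}

# (path pattern, required scope) pairs taken from the scopes table.
SCOPE_RULES = [
    ("/predict", "predict"),
    ("/predict/batch", "predict"),
    ("/predict/async*", "predict"),
    ("/models", "read_models"),
    ("/debug/*", "admin"),
    ("/admin/*", "admin"),
]


def authorize(path, scopes):
    """Return an HTTP status for a request.

    `scopes` is the set of scopes granted to the caller's key,
    or None when the key is missing or unknown.
    """
    if path in PUBLIC_PATHS:
        return 200  # public endpoint, no auth required
    if scopes is None:
        return 401  # missing or unknown key
    for pattern, required in SCOPE_RULES:
        if fnmatchcase(path, pattern):
            return 200 if required in scopes else 403
    # Authenticated, and no specific scope is required (e.g. /jobs/{id}).
    return 200
```

For example, a key holding only `read_models` gets `403` on `/predict/batch`, while any valid key can reach `/jobs/{id}`.
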
Rate limiting
Per-tenant sliding-window rate limits keyed on `tenant_id`.

| Endpoint | Limit |
|---|---|
| `/predict` | 10 req / 1s |
| `/models` | 2 req / 1s |

Requests over the limit receive `429 Too Many Requests`.
- In-process (`REDIS_URL` not set): per-process counters only. Accurate for single-process deployments.
- Redis-backed (`REDIS_URL` set): atomic Lua script operating on a sorted set. Accurate across multiple processes.

Every response includes `X-RateLimit-Mode: local` or `X-RateLimit-Mode: distributed`.
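
To make the in-process mode concrete, here is a minimal sliding-window sketch keyed on `tenant_id`. The `SlidingWindowLimiter` class and its injectable `clock` parameter are hypothetical; the real limiter may store and evict timestamps differently, and this version is not thread-safe.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Per-tenant sliding-window counter held in process memory."""

    def __init__(self, limit, window_seconds=1.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the window can be tested without sleeping
        self._hits = defaultdict(deque)  # tenant_id -> timestamps of accepted requests

    def allow(self, tenant_id):
        """Return True if the request fits in the window, False if it should get 429."""
        now = self.clock()
        hits = self._hits[tenant_id]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Under these assumptions, `SlidingWindowLimiter(limit=10)` would model the `/predict` limit and `SlidingWindowLimiter(limit=2)` the `/models` limit. Because the counters live in one process, several worker processes would each allow the full limit, which is why the Redis-backed mode exists.
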
Payload guard
Requests with a body larger than 1 MB are rejected with `413 Request Entity Too Large` before they reach any route handler.
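
One way such a guard could work is sketched below as a plain ASGI middleware. Several things here are assumptions: that the service is ASGI-based, that 1 MB means 1,048,576 bytes, and the names `PayloadGuard` and `probe`. The sketch only inspects the declared `Content-Length`; bodies sent without that header would need a streaming byte count as well.

```python
import asyncio

MAX_BODY_BYTES = 1024 * 1024  # assumption: "1 MB" is taken as 1 MiB


class PayloadGuard:
    """Hypothetical ASGI middleware that rejects oversized bodies before routing."""

    def __init__(self, app, max_bytes=MAX_BODY_BYTES):
        self.app = app
        self.max_bytes = max_bytes

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            headers = dict(scope.get("headers", []))
            declared = headers.get(b"content-length")
            if declared is not None and declared.isdigit() and int(declared) > self.max_bytes:
                # Answer directly; the wrapped app (and its routes) never runs.
                await send({"type": "http.response.start", "status": 413,
                            "headers": [(b"content-length", b"0")]})
                await send({"type": "http.response.body", "body": b""})
                return
        await self.app(scope, receive, send)


async def _ok_app(scope, receive, send):
    # Stand-in for the real application: always answers 200.
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


def probe(content_length):
    """Push a fake request through the guard and return the response status."""
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    scope = {"type": "http",
             "headers": [(b"content-length", str(content_length).encode())]}
    asyncio.run(PayloadGuard(_ok_app)(scope, receive, send))
    return sent[0]["status"]
```

With this sketch, `probe(MAX_BODY_BYTES)` passes through to the app, while `probe(MAX_BODY_BYTES + 1)` is answered with 413 by the guard itself.
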
Production checklist
- [ ] Set `ENV=production`
- [ ] Set `API_KEYS` with strong randomly generated keys
- [ ] Run behind TLS-terminating reverse proxy
- [ ] Restrict `/metrics` and `/debug/*` to internal networks
- [ ] Use Redis-backed rate limiting for multi-process deployments

See Auth Configuration.
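
For the "strong randomly generated keys" item, one option is Python's standard `secrets` module. The `generate_api_key` helper below is a hypothetical name, and the sketch only produces the key material; the format expected by `API_KEYS` is defined in Auth Configuration and is not assumed here.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from `num_bytes` of OS-level randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print one fresh key; generate a separate key per tenant or client.
    print(generate_api_key())
```
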

