Rate Limiting
Three-layer rate limiting architecture protecting QUESTPIE Autopilot API endpoints.
QUESTPIE Autopilot uses a three-layer rate limiting architecture to protect against abuse while keeping agent workflows unthrottled.
Architecture
| Layer | Scope | Where | Limit | Storage |
|---|---|---|---|---|
| 1 — Auth | Login/signup endpoints | Better Auth plugin | 30 req/min global, 10/5min login, 5/5min signup | In-memory |
| 2 — IP | All API endpoints | Hono middleware (before auth) | 20 req/min per IP | SQLite |
| 3 — Actor | Authenticated endpoints | Hono middleware (after auth) | Per-actor, per-endpoint | SQLite |
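The ordering in the table (IP check before auth, actor check after auth) can be sketched as a plain function pipeline. This is a minimal, dependency-free illustration and not the actual Hono middleware: the `Checks` interface, `handleRequest`, and the 401 response for unauthenticated requests are all assumptions made for the example, and Layer 1 (the Better Auth plugin) is not modelled.

```typescript
// Hypothetical shapes for illustration; none of these names come from the codebase.
interface IncomingRequest {
  ip: string;
  path: string;
  token?: string;
}

interface Checks {
  ipAllowed(ip: string): boolean; // Layer 2: per-IP limit
  authenticate(token: string | undefined): string | null; // returns an actor id, or null
  actorAllowed(actorId: string, path: string): boolean; // Layer 3: per-actor limit
}

interface Decision {
  status: 200 | 401 | 429;
  stoppedAt?: "ip" | "auth" | "actor";
}

function handleRequest(req: IncomingRequest, checks: Checks): Decision {
  // Layer 2 runs first, so a flood from one IP is rejected before any auth work.
  if (!checks.ipAllowed(req.ip)) {
    return { status: 429, stoppedAt: "ip" };
  }
  // Authentication sits between the two layers (the 401 is an assumption).
  const actorId = checks.authenticate(req.token);
  if (actorId === null) {
    return { status: 401, stoppedAt: "auth" };
  }
  // Layer 3 needs the actor identity, so it can only run after auth.
  if (!checks.actorAllowed(actorId, req.path)) {
    return { status: 429, stoppedAt: "actor" };
  }
  return { status: 200 };
}
```

Running the IP check first means a request that is over the IP limit never reaches authentication or the per-actor counters.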
Per-Actor Limits (Layer 3)
| Actor Type | Endpoint | Limit | Key |
|---|---|---|---|
| Agent (type: agent) | All | Exempt | n/a |
| Webhook (source: webhook) | All | Exempt | n/a |
| Human | /api/search* | 10/min | actor:{id}:/api/search |
| Human | /api/chat* | 20/min | actor:{id}:/api/chat |
| Human | Everything else | 300/min | actor:{id} |
Agents and webhooks are exempt from actor-level rate limiting to avoid blocking automated workflows.
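The table above can be read as a small lookup function. The sketch below is an illustration under assumptions: the `Actor` shape and the name `resolveActorLimit` are hypothetical, and the `*` in `/api/search*` and `/api/chat*` is interpreted as a simple prefix match.

```typescript
// Hypothetical actor shape; the field names mirror the table (type: agent, source: webhook).
interface Actor {
  id: string;
  type: "human" | "agent";
  source?: string;
}

interface ActorLimit {
  key: string; // counter key, as in the Key column
  limitPerMinute: number;
}

// Returns null when the actor is exempt from Layer 3.
function resolveActorLimit(actor: Actor, path: string): ActorLimit | null {
  if (actor.type === "agent" || actor.source === "webhook") {
    return null; // agents and webhooks are never throttled at the actor level
  }
  // Assumption: the trailing * in the table means "starts with".
  if (path.startsWith("/api/search")) {
    return { key: `actor:${actor.id}:/api/search`, limitPerMinute: 10 };
  }
  if (path.startsWith("/api/chat")) {
    return { key: `actor:${actor.id}:/api/chat`, limitPerMinute: 20 };
  }
  return { key: `actor:${actor.id}`, limitPerMinute: 300 };
}
```

Because the search and chat keys include the endpoint, those counters are tracked separately from the general 300/min budget, which uses the bare `actor:{id}` key.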
Response Headers
Every API response includes the conventional X-RateLimit-* headers:
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 287
X-RateLimit-Reset: 1711234567

| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the window |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
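A client can use these headers to pace itself. The following sketch parses them and computes how long to wait once the budget is used up. The helper names (`parseRateLimit`, `msUntilAllowed`) are hypothetical, and the example assumes `X-RateLimit-Reset` is expressed in seconds, as the 10-digit sample value suggests.

```typescript
// Structural type so the helper works with the fetch Headers class or any stand-in.
interface HeaderSource {
  get(name: string): string | null;
}

interface RateLimitInfo {
  limit: number;
  remaining: number;
  resetAtMs: number; // reset time converted to milliseconds
}

function readInt(headers: HeaderSource, name: string): number | null {
  const raw = headers.get(name);
  if (raw === null) return null;
  const value = Number.parseInt(raw, 10);
  return Number.isNaN(value) ? null : value;
}

// Returns null if any of the three headers is missing or malformed.
function parseRateLimit(headers: HeaderSource): RateLimitInfo | null {
  const limit = readInt(headers, "X-RateLimit-Limit");
  const remaining = readInt(headers, "X-RateLimit-Remaining");
  const resetSeconds = readInt(headers, "X-RateLimit-Reset");
  if (limit === null || remaining === null || resetSeconds === null) return null;
  // Assumption: the reset header is a Unix timestamp in seconds.
  return { limit, remaining, resetAtMs: resetSeconds * 1000 };
}

// Milliseconds to wait before the next request; 0 while budget remains.
function msUntilAllowed(info: RateLimitInfo, nowMs: number): number {
  if (info.remaining > 0) return 0;
  return Math.max(0, info.resetAtMs - nowMs);
}
```

With `fetch`, `parseRateLimit(response.headers)` can be called directly, since the standard `Headers` object exposes a compatible `get` method.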
When the limit is exceeded, the API returns 429 Too Many Requests:
{ "error": "Rate limit exceeded" }
Exempt Paths
The following paths are exempt from IP rate limiting:
- /hooks/*: Webhook endpoints (external services like GitHub, Slack)
- /api/status: Health check endpoint
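One way to express this exemption check is sketched below. The constant and function names are hypothetical, and the matching rule is an assumption: a pattern ending in `/*` is treated as a prefix, and anything else must match exactly.

```typescript
// The two patterns come from the list above; the matching logic is an assumption.
const IP_LIMIT_EXEMPT_PATTERNS: string[] = ["/hooks/*", "/api/status"];

function isIpLimitExempt(path: string): boolean {
  return IP_LIMIT_EXEMPT_PATTERNS.some((pattern) =>
    pattern.endsWith("/*")
      ? path.startsWith(pattern.slice(0, -1)) // "/hooks/*" becomes the prefix "/hooks/"
      : path === pattern
  );
}
```

Under this rule `/hooks/github` is exempt, while `/api/statuses` is not, because `/api/status` is matched exactly rather than as a prefix.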
SQLite Storage
Rate limit counters are stored in a rate_limit_entries SQLite table with sliding window semantics. Expired entries are cleaned up automatically every 5 minutes.
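Because the Layer 2 and Layer 3 counters live in SQLite, they persist across process restarts and work correctly in single-instance deployments. Layer 1 counters are held in memory and reset when the process restarts.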
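To make the sliding window semantics concrete, here is a minimal sketch that keeps one timestamp per accepted request. It is an illustration under assumptions, not the actual implementation: an in-memory `Map` stands in for the rate_limit_entries table, and the class and method names are hypothetical.

```typescript
interface HitResult {
  allowed: boolean;
  remaining: number;
  resetAtMs: number; // when the oldest counted request leaves the window
}

class SlidingWindowLimiter {
  // Stand-in for the SQLite table: key -> timestamps of accepted requests.
  private entries = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  hit(key: string, nowMs: number): HitResult {
    const cutoff = nowMs - this.windowMs;
    // Only requests inside the trailing window count toward the limit.
    const recent = (this.entries.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(nowMs);
    this.entries.set(key, recent);
    const oldest = recent.length > 0 ? recent[0] : nowMs;
    return {
      allowed,
      remaining: Math.max(0, this.limit - recent.length),
      resetAtMs: oldest + this.windowMs,
    };
  }

  // Mirrors the periodic cleanup: drop keys with no timestamps left in the window.
  cleanup(nowMs: number): number {
    const cutoff = nowMs - this.windowMs;
    let removed = 0;
    this.entries.forEach((times, key) => {
      if (times.every((t) => t <= cutoff)) {
        this.entries.delete(key);
        removed += 1;
      }
    });
    return removed;
  }
}
```

For example, `new SlidingWindowLimiter(20, 60000)` would model the per-IP limit, with `cleanup` called from a timer every 5 minutes. Unlike a fixed window, the budget frees up gradually as individual requests age out rather than all at once.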