Rate Limiting

Three-layer rate limiting architecture protecting QUESTPIE Autopilot API endpoints.

QUESTPIE Autopilot applies rate limits in three layers to protect against abuse. Agents and webhooks are exempt from the per-actor layer so that automated workflows are not throttled.

Architecture

Layer	Scope	Where	Limit	Storage
1 — Auth	Login/signup endpoints	Better Auth plugin	30 req/min global, 10/5min login, 5/5min signup	In-memory
2 — IP	All API endpoints	Hono middleware (before auth)	20 req/min per IP	SQLite
3 — Actor	Authenticated endpoints	Hono middleware (after auth)	Per-actor, per-endpoint	SQLite

Per-Actor Limits (Layer 3)

Actor Type	Endpoint	Limit	Key
Agent (`type: agent`)	All	Exempt	—
Webhook (`source: webhook`)	All	Exempt	—
Human	`/api/search*`	10/min	`actor:{id}:/api/search`
Human	`/api/chat*`	20/min	`actor:{id}:/api/chat`
Human	Everything else	300/min	`actor:{id}`

Agents and webhooks are exempt from actor-level rate limiting to avoid blocking automated workflows.
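The table above can be read as a small lookup function. The following is a minimal sketch, not the project's actual code: the `Actor` shape, the `resolveActorLimit` name, and the use of a simple prefix match for patterns such as `/api/search*` are all assumptions made for illustration.

```typescript
// Hypothetical actor shape; the real type in QUESTPIE Autopilot may differ.
type Actor = {
  id: string;
  type: "human" | "agent";
  source?: string; // e.g. "webhook"
};

type ActorLimit = { key: string; limit: number; windowMs: number };

// Endpoint-specific limits from the table. Checked in order, first match wins.
const ENDPOINT_LIMITS: Array<{ prefix: string; limit: number }> = [
  { prefix: "/api/search", limit: 10 },
  { prefix: "/api/chat", limit: 20 },
];

const DEFAULT_LIMIT = 300; // "Everything else"
const WINDOW_MS = 60_000; // all Layer 3 limits are per minute

// Returns null when the actor is exempt from Layer 3.
function resolveActorLimit(actor: Actor, path: string): ActorLimit | null {
  if (actor.type === "agent" || actor.source === "webhook") return null;

  for (const { prefix, limit } of ENDPOINT_LIMITS) {
    // Assumption: "/api/search*" is treated as a plain prefix match.
    if (path.startsWith(prefix)) {
      return { key: `actor:${actor.id}:${prefix}`, limit, windowMs: WINDOW_MS };
    }
  }
  return { key: `actor:${actor.id}`, limit: DEFAULT_LIMIT, windowMs: WINDOW_MS };
}
```

Under these assumptions, a human actor `u1` calling `/api/search?q=x` is counted against the key `actor:u1:/api/search`, while the same actor calling any other endpoint shares the single `actor:u1` bucket.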

Response Headers

Every API response includes the conventional `X-RateLimit-*` headers:

X-RateLimit-Limit: 300
X-RateLimit-Remaining: 287
X-RateLimit-Reset: 1711234567

Header	Description
`X-RateLimit-Limit`	Maximum requests allowed in the window
`X-RateLimit-Remaining`	Requests remaining in the current window
`X-RateLimit-Reset`	Unix timestamp when the window resets
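As an illustration of how these three values relate to a limiter's internal state, here is a minimal sketch. The `WindowState` type and `rateLimitHeaders` function are hypothetical names, and the sketch assumes the reset time is tracked in milliseconds and rounded up to whole seconds.

```typescript
// Hypothetical snapshot of one rate limit window.
type WindowState = {
  limit: number; // maximum requests allowed in the window
  used: number; // requests already counted in the window
  resetAtMs: number; // when the window resets, in milliseconds since the epoch
};

function rateLimitHeaders(state: WindowState): Record<string, string> {
  // Never report a negative remaining count, even if the counter overshoots.
  const remaining = Math.max(0, state.limit - state.used);
  return {
    "X-RateLimit-Limit": String(state.limit),
    "X-RateLimit-Remaining": String(remaining),
    // The header carries a Unix timestamp in seconds.
    "X-RateLimit-Reset": String(Math.ceil(state.resetAtMs / 1000)),
  };
}
```

With `limit: 300`, `used: 13`, and `resetAtMs: 1711234567000`, this produces the header values shown in the example above.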

When a limit is exceeded, the API returns `429 Too Many Requests` with a JSON error body:

{ "error": "Rate limit exceeded" }
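A client can use the `X-RateLimit-Reset` header to decide how long to wait before retrying after a 429. The helper below is a sketch under stated assumptions: `retryDelayMs` is a hypothetical name, and the one-second fallback for a missing or malformed header is an arbitrary choice, not documented API behavior.

```typescript
// Returns how many milliseconds to wait before retrying a request.
// `resetHeader` is the raw X-RateLimit-Reset value (Unix seconds), or null if absent.
function retryDelayMs(status: number, resetHeader: string | null, nowMs: number): number {
  if (status !== 429) return 0; // not rate limited, no wait needed

  const resetSeconds = Number(resetHeader);
  if (!Number.isFinite(resetSeconds) || resetSeconds <= 0) {
    return 1000; // assumption: fall back to a short fixed delay
  }
  // Wait until the window resets; never return a negative delay.
  return Math.max(0, resetSeconds * 1000 - nowMs);
}
```

A caller would pass the response status, the header value, and `Date.now()`, then sleep for the returned duration before sending the request again.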

Exempt Paths

The following paths are exempt from IP rate limiting:

/hooks/* — Webhook endpoints (external services like GitHub, Slack)
/api/status — Health check endpoint
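The exemption check can be sketched as a small path matcher. The names `IP_LIMIT_EXEMPT` and `isIpLimitExempt` are hypothetical, and the matching rules (a trailing `/*` means "anything under this prefix", everything else must match exactly) are an assumption about how the patterns above are interpreted.

```typescript
// Patterns taken from the list above.
const IP_LIMIT_EXEMPT = ["/hooks/*", "/api/status"];

function isIpLimitExempt(path: string): boolean {
  return IP_LIMIT_EXEMPT.some((pattern) =>
    pattern.endsWith("/*")
      ? path.startsWith(pattern.slice(0, -1)) // "/hooks/*" becomes the prefix "/hooks/"
      : path === pattern // exact match for patterns without a wildcard
  );
}
```

Under these rules `/hooks/github` and `/api/status` skip the IP limiter, while `/api/status/details` and `/api/tasks` do not.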

SQLite Storage

Rate limit counters for Layers 2 and 3 are stored in a `rate_limit_entries` SQLite table with sliding-window semantics. Expired entries are cleaned up automatically every 5 minutes.

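Storing counters in SQLite means the Layer 2 and Layer 3 limits persist across process restarts and behave correctly in single-instance deployments. Layer 1 counters are held in memory, so they reset when the process restarts.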
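To make the sliding-window behavior concrete, the sketch below uses an in-memory `Map` as a stand-in for the `rate_limit_entries` table. It is not the actual storage code: the `SlidingWindowLimiter` class, the timestamp-log approach, and the choice not to count rejected requests are all assumptions for illustration.

```typescript
// In-memory stand-in for the rate_limit_entries table:
// each key maps to the timestamps (ms) of its recent accepted requests.
class SlidingWindowLimiter {
  private entries = new Map<string, number[]>();

  // Records a request and reports whether it fits within `limit` per `windowMs`.
  hit(key: string, limit: number, windowMs: number, nowMs: number): boolean {
    const cutoff = nowMs - windowMs;
    // Keep only timestamps that still fall inside the sliding window.
    const recent = (this.entries.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < limit;
    if (allowed) recent.push(nowMs); // assumption: rejected requests are not counted
    this.entries.set(key, recent);
    return allowed;
  }

  // Periodic cleanup: drops keys whose timestamps have all expired.
  // Returns the number of keys removed.
  cleanup(maxWindowMs: number, nowMs: number): number {
    const cutoff = nowMs - maxWindowMs;
    let removed = 0;
    this.entries.forEach((times, key) => {
      if (times.every((t) => t <= cutoff)) {
        this.entries.delete(key);
        removed++;
      }
    });
    return removed;
  }
}
```

In a running service, `cleanup` would be called on a timer (every 5 minutes, per the description above), and `hit` would be replaced by queries against the SQLite table so that counters survive a restart.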