Per-user daily cap
The user is blocked automatically once the daily cap is reached. Users can manually request a higher cap. This prevents unbounded abuse from any single account.
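A minimal sketch of a per-user daily cap, assuming spend is tracked in US dollars in an in-memory dict. The class name `DailyCapTracker`, the default cap value, and the `raise_cap` method are hypothetical names chosen for illustration, not part of any particular library.

```python
from collections import defaultdict
from datetime import date
from typing import Optional


class DailyCapTracker:
    """Tracks per-user spend per calendar day and blocks once the cap is reached."""

    def __init__(self, default_cap_usd: float = 5.0):
        self.default_cap_usd = default_cap_usd  # assumed default, tune per product
        self.caps = {}                          # user_id -> manually approved override
        self.spend = defaultdict(float)         # (user_id, day) -> spend so far

    def cap_for(self, user_id: str) -> float:
        return self.caps.get(user_id, self.default_cap_usd)

    def raise_cap(self, user_id: str, new_cap_usd: float) -> None:
        # Meant to be called after a human approves the user's increase request.
        self.caps[user_id] = new_cap_usd

    def try_charge(self, user_id: str, cost_usd: float,
                   day: Optional[date] = None) -> bool:
        """Record the cost and return True, or return False if it would exceed the cap."""
        key = (user_id, day or date.today())
        if self.spend[key] + cost_usd > self.cap_for(user_id):
            return False  # auto-block: nothing is recorded
        self.spend[key] += cost_usd
        return True
```

Keying spend by `(user_id, day)` means the cap resets on its own when the date changes. A production version would presumably keep this state in a shared store instead of process memory.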
Per-request cap
The cost of any single request is capped. This prevents denial-of-service attacks that rely on very long inputs.
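One way to enforce a per-request cap is to estimate the cost before doing any work and reject the request if the estimate is too high. The sketch below assumes cost is driven by input and output token counts. The prices, the cap value, and the names `estimate_cost_usd`, `check_request`, and `RequestTooExpensive` are illustrative assumptions.

```python
class RequestTooExpensive(Exception):
    """Raised when a request's estimated cost exceeds the per-request cap."""


def estimate_cost_usd(input_tokens: int, max_output_tokens: int,
                      in_price_per_1k: float = 0.001,
                      out_price_per_1k: float = 0.002) -> float:
    # Hypothetical prices; substitute the real rates of the service in use.
    return (input_tokens / 1000) * in_price_per_1k \
        + (max_output_tokens / 1000) * out_price_per_1k


def check_request(input_tokens: int, max_output_tokens: int,
                  cap_usd: float = 0.05) -> float:
    """Return the estimated cost, or raise before any expensive work starts."""
    cost = estimate_cost_usd(input_tokens, max_output_tokens)
    if cost > cap_usd:
        raise RequestTooExpensive(
            f"estimated ${cost:.4f} exceeds per-request cap ${cap_usd:.4f}")
    return cost
```

Running the check before processing is the point: a very long input is rejected at the cost of a length count, not a full request.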
Cost/latency ratio
Legitimate, efficient use keeps the cost/latency ratio within a normal range. Abuse patterns often push the ratio to an extreme in one direction.
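As a rough illustration, the ratio check can be sketched as a comparison against the median ratio over recent requests, flagging anything that deviates by more than a chosen factor in either direction. The function name `ratio_outliers`, the `(request_id, cost_usd, latency_s)` sample layout, and the factor of 10 are assumptions, not values from the text.

```python
from statistics import median


def ratio_outliers(samples, factor: float = 10.0):
    """Return ids of requests whose cost/latency ratio is extreme.

    samples: iterable of (request_id, cost_usd, latency_s) tuples.
    A request is flagged if its ratio is more than `factor` times the
    median ratio, or less than the median divided by `factor`.
    """
    ratios = {
        rid: cost / latency
        for rid, cost, latency in samples
        if cost > 0 and latency > 0  # skip records that cannot form a ratio
    }
    if not ratios:
        return []
    baseline = median(ratios.values())
    return [
        rid for rid, r in ratios.items()
        if r > baseline * factor or r < baseline / factor
    ]
```

The median is used as the baseline so that a handful of abusive requests does not shift what counts as normal.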
Alert thresholds
At 80% of the daily budget, issue a warning. At 100%, throttle automatically and escalate to human review.
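The thresholds can be expressed as a small decision function. This is a sketch under the assumption that spend and budget share one unit. The `Action` enum and the `budget_action` name are hypothetical, and the caller is assumed to start the human-review escalation when it receives `THROTTLE`.

```python
from enum import Enum


class Action(Enum):
    OK = "ok"
    WARN = "warn"
    THROTTLE = "throttle"  # caller is expected to also escalate to human review


def budget_action(spent: float, daily_budget: float,
                  warn_at: float = 0.8) -> Action:
    """Map current spend to an action: warn at 80%, throttle at 100%."""
    if daily_budget <= 0:
        raise ValueError("daily_budget must be positive")
    used = spent / daily_budget
    if used >= 1.0:
        return Action.THROTTLE
    if used >= warn_at:
        return Action.WARN
    return Action.OK
```

Checking the 100% threshold first ensures a budget that is already exhausted never reports only a warning.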