Per-user daily cap

Users are blocked automatically once they hit the cap. They can manually request an increase. This prevents unbounded abuse from any single account.
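
A minimal sketch of how such a cap could be tracked, assuming spend is measured in USD and kept in memory. The names `DailyCap`, `try_charge`, and `approve_increase` are hypothetical and chosen only for illustration; a real deployment would likely keep the counters in a shared store.

```python
from collections import defaultdict
from datetime import date


class DailyCap:
    """Tracks per-user spend per day and refuses charges past the cap."""

    def __init__(self, default_cap_usd: float):
        self.default_cap_usd = default_cap_usd
        self.overrides = {}              # user_id -> manually approved cap
        self.spent = defaultdict(float)  # (user_id, day) -> spend so far

    def cap_for(self, user_id: str) -> float:
        # A manually approved increase takes precedence over the default.
        return self.overrides.get(user_id, self.default_cap_usd)

    def try_charge(self, user_id: str, cost_usd: float, today: date) -> bool:
        """Record the charge and return True, or return False if it would exceed the cap."""
        key = (user_id, today)
        if self.spent[key] + cost_usd > self.cap_for(user_id):
            return False  # auto-block: nothing is recorded
        self.spent[key] += cost_usd
        return True

    def approve_increase(self, user_id: str, new_cap_usd: float) -> None:
        """Assumed to be called by a human after a manual increase request."""
        self.overrides[user_id] = new_cap_usd
```

Keying the counter on `(user_id, day)` means the cap resets on the next day without a separate cleanup job, at the price of old keys accumulating in this in-memory version.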

Per-request cap

The cost of any single request is capped. This prevents denial-of-service via very long inputs.
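
One way to enforce this is to estimate the worst-case cost before running the request and reject it up front. The sketch below assumes token-based pricing; the function names, the per-1k-token prices, and the cap value are all hypothetical parameters, not values from any particular provider.

```python
def estimate_cost_usd(input_tokens: int, max_output_tokens: int,
                      usd_per_1k_input: float, usd_per_1k_output: float) -> float:
    """Worst-case cost: all input tokens plus the full output allowance."""
    return (input_tokens / 1000 * usd_per_1k_input
            + max_output_tokens / 1000 * usd_per_1k_output)


def within_request_cap(input_tokens: int, max_output_tokens: int,
                       usd_per_1k_input: float, usd_per_1k_output: float,
                       cap_usd: float) -> bool:
    """Return True if the request's worst-case cost fits under the cap."""
    cost = estimate_cost_usd(input_tokens, max_output_tokens,
                             usd_per_1k_input, usd_per_1k_output)
    return cost <= cap_usd
```

Checking the estimate before execution matters here: a cap applied only after the request has run would still let the oversized input consume resources.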

Cost/latency ratio

Efficient use shows a cost/latency ratio within a normal range. Abuse patterns are often extreme in one direction: a ratio far above or far below that range.
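
As a rough illustration, the check can be sketched as flagging requests whose ratio is far from the median in either direction. The choice of the median as the baseline, the `factor` threshold of 10, and the `(request_id, cost_usd, latency_s)` tuple layout are assumptions made for this example.

```python
import math
from statistics import median


def ratio_outliers(requests, factor=10.0):
    """Return ids whose cost/latency ratio is more than `factor` times
    above or below the median ratio.

    requests: iterable of (request_id, cost_usd, latency_s) tuples.
    Entries with non-positive cost or latency are skipped.
    """
    ratios = {rid: cost / latency
              for rid, cost, latency in requests
              if cost > 0 and latency > 0}
    if not ratios:
        return []
    baseline = median(ratios.values())
    limit = math.log(factor)
    # Comparing on a log scale treats "10x too high" and "10x too low" symmetrically.
    return [rid for rid, r in ratios.items()
            if abs(math.log(r / baseline)) > limit]
```

The log-scale comparison is what makes the test two-sided, so an extreme ratio in either direction is flagged with the same threshold.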

Alert thresholds

Daily budget at 80% → warn. At 100% → auto-throttle and escalate to human review.
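
The thresholds map onto a small decision function. This is a sketch under the assumption that spend and budget share a unit and that escalation accompanies the throttle step; `budget_action` and its return labels are hypothetical names.

```python
def budget_action(spent_usd: float, daily_budget_usd: float) -> str:
    """Map current spend to an action: "ok", "warn", or "throttle".

    "throttle" is assumed to also trigger escalation to human review.
    """
    if daily_budget_usd <= 0:
        raise ValueError("daily_budget_usd must be positive")
    used = spent_usd / daily_budget_usd
    if used >= 1.0:
        return "throttle"
    if used >= 0.8:
        return "warn"
    return "ok"
```

Checking the 100% threshold first ensures a budget that is already exhausted is throttled and does not fall through to the warning branch.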