Input layer
PII detection + redaction. Prompt injection classifier. Rate limits. Content policy filter. Runs before LLM inference.
Advertisement
Model layer
RLHF training. Constitutional AI. Refusal training. Baked into model. Can't be updated per app.
Advertisement
Output layer
Hallucination detection. PII output filter. Toxicity filter. Format validation. Runs after LLM, before user sees.
Compose
Real systems combine all three. Each has failure modes. Independence maximizes coverage.