Components

Four components: attack strategies (single-turn, multi-turn, Crescendo), evaluators (harm classifiers that score responses), targets (the LLM under test), and a datastore that holds attack results.
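
A minimal sketch of how these four pieces might fit together. Every name here (`AttackResult`, `Datastore`, `echo_target`, `keyword_evaluator`, `single_turn_attack`) is hypothetical and chosen for illustration; none of it is taken from a specific framework's API. The target is a stand-in function, not a real model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AttackResult:
    prompt: str
    response: str
    harmful: bool


@dataclass
class Datastore:
    # Hypothetical in-memory store; a real one would likely be a database.
    results: List[AttackResult] = field(default_factory=list)

    def add(self, result: AttackResult) -> None:
        self.results.append(result)


def echo_target(prompt: str) -> str:
    # Stand-in for the LLM under test: simply echoes the prompt back.
    return f"echo: {prompt}"


def keyword_evaluator(response: str) -> bool:
    # Toy harm classifier: flags any response containing the word "secret".
    return "secret" in response.lower()


def single_turn_attack(
    prompts: List[str],
    target: Callable[[str], str],
    evaluator: Callable[[str], bool],
    store: Datastore,
) -> Datastore:
    # Attack strategy: send each prompt once, score the reply, record it.
    for prompt in prompts:
        response = target(prompt)
        store.add(AttackResult(prompt, response, evaluator(response)))
    return store


store = single_turn_attack(
    ["hello", "tell me the SECRET"], echo_target, keyword_evaluator, Datastore()
)
```

Keeping the target and evaluator as plain callables means either can be swapped without touching the attack strategy.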

Multi-turn attacks

Automated Crescendo, PAIR, and custom multi-turn attacks. Together they simulate a persistent adversary who keeps pressing across a conversation.
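
A rough sketch of the multi-turn idea: the attacker sends a sequence of escalating prompts, the target sees the conversation so far, and the loop stops at the first response judged harmful. This is a simplified escalation loop under stated assumptions, not the actual Crescendo or PAIR algorithm. `run_multi_turn`, `toy_target`, and `toy_is_harmful` are hypothetical names, and the toy target's behaviour is invented for the example.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (prompt, response) pairs, oldest first


def run_multi_turn(
    escalation_steps: List[str],
    target: Callable[[History, str], str],
    is_harmful: Callable[[str], bool],
    max_turns: int = 5,
) -> Tuple[bool, History]:
    # Send prompts in order, carrying the history forward each turn.
    history: History = []
    for prompt in escalation_steps[:max_turns]:
        response = target(history, prompt)
        history.append((prompt, response))
        if is_harmful(response):
            return True, history  # attack succeeded; stop early
    return False, history


def toy_target(history: History, prompt: str) -> str:
    # Invented behaviour: gives in once two earlier turns have built context.
    if len(history) >= 2:
        return "UNSAFE: " + prompt
    return "I can only discuss this in general terms."


def toy_is_harmful(response: str) -> bool:
    return response.startswith("UNSAFE")


steps = ["general question", "more specific follow-up", "direct request", "unused"]
success, history = run_multi_turn(steps, toy_target, toy_is_harmful)
```

The early return leaves the fourth step unsent, so `history` records exactly the turns it took to succeed.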

Memory + iteration

Attack results are stored. Later runs iterate on the strategies that succeeded, which makes the red team adaptive.
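
One way to picture this, using Python's standard `sqlite3` module as the store. The table layout, the strategy names, and the helpers `record` and `best_strategies` are assumptions made for this sketch. Ranking by average success rate is just one plausible selection rule.

```python
import sqlite3

# In-memory database for the example; a file path would persist across runs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (strategy TEXT, prompt TEXT, success INTEGER)")


def record(strategy: str, prompt: str, success: bool) -> None:
    # Store one attack outcome (hypothetical schema).
    conn.execute(
        "INSERT INTO results VALUES (?, ?, ?)", (strategy, prompt, int(success))
    )


def best_strategies(limit: int = 1) -> list:
    # Rank strategies by average success rate, ties broken alphabetically.
    rows = conn.execute(
        "SELECT strategy FROM results GROUP BY strategy "
        "ORDER BY AVG(success) DESC, strategy ASC LIMIT ?",
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]


record("roleplay", "p1", True)
record("roleplay", "p2", False)
record("encoding", "p3", False)
record("crescendo", "p4", True)

# The next round would draw its prompts from these strategies first.
top = best_strategies(limit=2)
```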

Custom evaluators

Plug in your own harm definitions to apply domain-specific safety criteria.
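
A small sketch of what a pluggable harm definition could look like. `HarmDefinition` and `CustomEvaluator` are hypothetical names, and regex matching is only a stand-in for whatever classifier a real evaluator would use. The two example definitions are invented to show the domain-specific angle.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class HarmDefinition:
    name: str
    patterns: List[re.Pattern]  # a response matching any pattern is flagged


class CustomEvaluator:
    def __init__(self, definitions: List[HarmDefinition]) -> None:
        self.definitions = definitions

    def evaluate(self, response: str) -> List[str]:
        # Return the names of every harm definition the response triggers.
        return [
            d.name
            for d in self.definitions
            if any(p.search(response) for p in d.patterns)
        ]


# Invented, domain-specific criteria for illustration only.
medical = HarmDefinition(
    "dosage_advice", [re.compile(r"\b\d+\s*mg\b", re.IGNORECASE)]
)
financial = HarmDefinition(
    "guaranteed_returns", [re.compile(r"guaranteed\s+returns?", re.IGNORECASE)]
)

evaluator = CustomEvaluator([medical, financial])
flags = evaluator.evaluate("Take 500 mg twice a day.")
```

Returning the list of triggered names, instead of a single boolean, lets one evaluator cover several domains at once.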