Loop

Attempt task → evaluator scores the attempt → on failure, reflect verbally on what went wrong → store the reflection → retry with stored reflections in context.
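
The loop above can be sketched as a small driver function. This is a minimal illustration, not a reference implementation: `attempt`, `evaluate`, and `reflect` are hypothetical callables standing in for the actor model, the evaluator, and the self-reflection step, and `max_tries` is an assumed retry budget.

```python
from typing import Callable, List, Tuple


def reflexion_loop(
    attempt: Callable[[str, List[str]], str],    # hypothetical actor: (task, reflections) -> output
    evaluate: Callable[[str], Tuple[bool, str]],  # hypothetical evaluator: output -> (passed, feedback)
    reflect: Callable[[str, str], str],           # hypothetical reflector: (output, feedback) -> reflection
    task: str,
    max_tries: int = 3,                           # assumed retry budget
) -> Tuple[str, bool, List[str]]:
    """Try, score, reflect on failure, store the reflection, retry."""
    reflections: List[str] = []
    output = ""
    for _ in range(max_tries):
        # Every attempt sees all reflections stored so far.
        output = attempt(task, reflections)
        passed, feedback = evaluate(output)
        if passed:
            return output, True, reflections
        # On failure, turn the evaluator feedback into a verbal lesson and keep it.
        reflections.append(reflect(output, feedback))
    return output, False, reflections
```

The function returns the last output, whether it passed, and the reflections gathered along the way, so a caller can inspect what was learned even when every try fails.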

Advantage

Learn from failures without fine-tuning: model weights stay fixed. Adaptation happens at test time, through the context.

Domains

Code generation with a test signal. Reasoning tasks with a verifier. Agent tool use where a reward signal is available.
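
For the code-generation case, the test signal can be as simple as running the candidate against a few assertions and returning the failure message as feedback. The sketch below assumes a hypothetical `evaluate_candidate` helper. It uses `exec` for brevity, which is unsafe for untrusted model output, so a real setup would need a sandbox.

```python
from typing import Tuple


def evaluate_candidate(source: str, test_source: str) -> Tuple[bool, str]:
    """Hypothetical evaluator: run candidate code, then its tests, in one namespace.

    WARNING: exec() runs arbitrary code. This is for illustration only;
    untrusted output should be executed in an isolated sandbox.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)       # define the candidate's functions
        exec(test_source, namespace)  # run assertions against them
    except Exception as exc:
        # The error type and message become the feedback to reflect on.
        return False, f"{type(exc).__name__}: {exc}"
    return True, "all tests passed"
```

The returned pair is a pass/fail flag plus a feedback string, which is the kind of signal the reflection step can reason about.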

Memory management

Accumulate reflections across many tries. Summarize when memory grows. Cap total memory size.
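
One way to combine the three rules is sketched below. Everything here is an assumption made for illustration: the `ReflectionMemory` class, the `summarize` callable (which in practice would probably be a model call), and the `max_items` and `max_chars` limits, with size measured in characters rather than tokens.

```python
from typing import Callable, List


class ReflectionMemory:
    """Hypothetical reflection store: accumulate, summarize on growth, cap size."""

    def __init__(
        self,
        summarize: Callable[[List[str]], str],  # hypothetical: many reflections -> one summary
        max_items: int = 4,                      # assumed: summarize once this is exceeded
        max_chars: int = 400,                    # assumed: hard cap on stored reflection text
    ) -> None:
        self.summarize = summarize
        self.max_items = max_items
        self.max_chars = max_chars
        self.items: List[str] = []

    def size(self) -> int:
        return sum(len(item) for item in self.items)

    def add(self, reflection: str) -> None:
        self.items.append(reflection)
        if len(self.items) > self.max_items:
            # Fold everything but the newest reflection into a single summary.
            older, newest = self.items[:-1], self.items[-1]
            self.items = [self.summarize(older), newest]
        # Enforce the cap: drop the oldest entries first.
        while len(self.items) > 1 and self.size() > self.max_chars:
            self.items.pop(0)
        # If a single entry is still too long, truncate it.
        if self.size() > self.max_chars:
            self.items[0] = self.items[0][: self.max_chars]

    def render(self) -> str:
        """Text to place in the context for the next try."""
        return "\n".join(self.items)
```

Keeping the newest reflection verbatim while summarizing older ones is a design choice in this sketch: the most recent failure is usually the most relevant to the next try.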