Agent 'memory' conflates several different things under one word: working memory (this turn), episodic memory (this conversation), and semantic memory (long-term facts about the user and the world). Each has different storage, retrieval, and policy needs.

Working memory

The model's context window. Token-limited. Includes the system prompt, recent turns, and retrieved context. Assembled fresh each turn, not persisted.
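
A minimal sketch of per-turn assembly under a token budget. The names `build_context` and `count_tokens` are hypothetical, and the word-count tokenizer is an assumption standing in for whatever tokenizer the model actually uses. The priority order (system prompt, then retrieved context, then the newest turns that fit) is one reasonable policy, not the only one.

```python
def count_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word.
    # A real system would use the model's own tokenizer.
    return len(text.split())


def build_context(system_prompt: str, retrieved: list[str],
                  turns: list[str], budget: int) -> list[str]:
    """Assemble one turn's context window; nothing here is persisted."""
    context = [system_prompt]
    used = count_tokens(system_prompt)

    # Add retrieved snippets in order until one no longer fits.
    for snippet in retrieved:
        cost = count_tokens(snippet)
        if used + cost > budget:
            break
        context.append(snippet)
        used += cost

    # Walk the history newest-first, keeping turns while they fit.
    kept = []
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost

    # Restore chronological order for the turns that made the cut.
    context.extend(reversed(kept))
    return context
```

With a budget of 12 "tokens", the oldest turn is the first thing dropped; the function is rerun from scratch on every turn.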

Episodic memory

Conversation history, scoped to a session. Often summarized as it grows. Stored in a fast key-value store (Redis, DynamoDB). Visible only to the current session; ephemeral.
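
A rough sketch of session-scoped history that folds older turns into a summary and expires after a TTL. `EpisodicStore` is a hypothetical class; the in-memory dict stands in for a KV store such as Redis or DynamoDB, and `_summarize` is a placeholder for what would likely be a model call. The `max_turns` and `ttl_seconds` values are illustrative assumptions.

```python
import time
from typing import Optional


class EpisodicStore:
    """In-memory stand-in for a fast KV store, keyed by session ID."""

    def __init__(self, ttl_seconds: float = 1800.0, max_turns: int = 4):
        self.ttl_seconds = ttl_seconds
        self.max_turns = max_turns
        self._sessions = {}  # session_id -> {"summary", "turns", "expires_at"}

    def _summarize(self, summary: str, turns: list[str]) -> str:
        # Placeholder: truncate and join. A real system would summarize properly.
        parts = ([summary] if summary else []) + [t[:40] for t in turns]
        return " | ".join(parts)

    def _load(self, session_id: str, now: float) -> dict:
        session = self._sessions.get(session_id)
        # Missing or expired sessions start empty: episodic memory is ephemeral.
        if session is None or session["expires_at"] <= now:
            return {"summary": "", "turns": [], "expires_at": now + self.ttl_seconds}
        return session

    def append(self, session_id: str, turn: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        session = self._load(session_id, now)
        session["turns"].append(turn)
        if len(session["turns"]) > self.max_turns:
            # Keep the newest half verbatim; fold the rest into the summary.
            keep = max(1, self.max_turns // 2)
            old, recent = session["turns"][:-keep], session["turns"][-keep:]
            session["summary"] = self._summarize(session["summary"], old)
            session["turns"] = recent
        session["expires_at"] = now + self.ttl_seconds  # refresh TTL on write
        self._sessions[session_id] = session

    def read(self, session_id: str, now: Optional[float] = None) -> tuple[str, list[str]]:
        now = time.time() if now is None else now
        session = self._load(session_id, now)
        return session["summary"], list(session["turns"])
```

Reads are keyed by session ID only, so one session cannot see another's history, and an expired session reads back as empty.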

Semantic memory

Long-term facts about the user, the world, and past resolved issues. A vector DB for retrieval plus a structured DB for facts. Updated explicitly (or with user confirmation). Survives across sessions and devices.
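
A toy sketch of the two-store shape with a confirmation gate on writes. `SemanticMemory`, `propose`, and `search` are hypothetical names; the bag-of-words `embed` function is an assumption standing in for a real embedding model, and the dicts stand in for a structured DB and a vector DB. Facts are keyed by user ID rather than session ID, which is what lets them outlive a session.

```python
import math
from collections import Counter
from typing import Optional


def embed(text: str) -> Counter:
    # Assumption: bag-of-words counts instead of a learned embedding.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class SemanticMemory:
    def __init__(self):
        self._facts = {}  # user_id -> {key: value}, the structured store
        self._index = {}  # user_id -> [(vector, text)], the retrieval index

    def propose(self, user_id: str, key: str, value: str, confirmed: bool) -> bool:
        """Write a fact only when the update is explicit or user-confirmed."""
        if not confirmed:
            return False
        facts = self._facts.setdefault(user_id, {})
        facts[key] = value
        # Rebuild this user's index so overwritten facts leave no stale vectors.
        self._index[user_id] = [
            (embed(f"{k} {v}"), f"{k}: {v}") for k, v in facts.items()
        ]
        return True

    def get(self, user_id: str, key: str) -> Optional[str]:
        return self._facts.get(user_id, {}).get(key)

    def search(self, user_id: str, query: str, top_k: int = 1) -> list[str]:
        q = embed(query)
        scored = [(cosine(q, vec), text) for vec, text in self._index.get(user_id, [])]
        scored = [s for s in scored if s[0] > 0]  # drop facts with no overlap
        scored.sort(key=lambda s: s[0], reverse=True)
        return [text for _, text in scored[:top_k]]
```

Exact lookups go through `get`, fuzzy recall goes through `search`, and an unconfirmed `propose` leaves both stores untouched.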

Three layers, three storage models. Conflating them produces bugs (creepy persistence) or limits (no cross-session memory).