# Memory Stream & Retrieval

A central limitation of modern LLMs is the restricted context window: models such as ChatGPT can process only a finite number of tokens (roughly 128k), so older information eventually falls out of context and is effectively forgotten. To simulate long-term cognition, Generative Agents use a Memory Stream, a dynamic, structured log of experiences that preserves and prioritizes meaningful information. Each memory in the stream contains:

* A natural language description
* A creation timestamp
* A last-accessed timestamp
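
Such a record can be sketched as a small data class. This is a minimal illustration; the field names and types are assumptions, not the paper's actual implementation:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Memory:
    """One entry in the memory stream (field names are illustrative)."""
    description: str  # natural language description of the experience
    created_at: datetime = field(default_factory=datetime.now)
    last_accessed: datetime = field(default_factory=datetime.now)

m = Memory("Arjay is composing a new song.")
```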

The fundamental unit is an observation — a direct perception or experience. For example:

* Arjay is composing a new song.
* Logan is discussing town planning with Owen.
* Lulu is organizing a bonfire event with Arjay.
* A new visitor arrived at Selena’s tavern.

To keep context relevant, the agent continuously retrieves a subset of these memories based on:

* **Recency**: Prioritizing recent events using an exponential decay function.
* **Importance**: Scored by the LLM at memory creation — highly significant events persist longer.
* **Relevance**: Calculated using embedding similarity between memory text and the current situation.
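
The three signals above can be combined into a single retrieval score, typically as a weighted sum. The sketch below assumes an hourly decay rate, equal weights, a 1–10 importance scale, and precomputed embeddings; none of these specifics come from the source:

```python
import math
from datetime import datetime

def recency_score(last_accessed, now, decay=0.995):
    """Exponential decay per hour since the memory was last accessed."""
    hours = (now - last_accessed).total_seconds() / 3600
    return decay ** hours

def cosine_similarity(a, b):
    """Embedding similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieval_score(memory, query_embedding, now, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of recency, importance, and relevance (weights assumed equal)."""
    w_rec, w_imp, w_rel = weights
    rec = recency_score(memory["last_accessed"], now)
    imp = memory["importance"] / 10  # LLM-assigned score, assumed on a 1-10 scale
    rel = cosine_similarity(memory["embedding"], query_embedding)
    return w_rec * rec + w_imp * imp + w_rel * rel
```

At each step, the agent would rank all memories by this score and pass only the top-k into the prompt.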

This ensures that only the most contextually relevant and meaningful memories are fed into the LLM for behavior generation.
