# Memory Stream & Retrieval

A central limitation of modern LLMs is the restricted context window: models such as ChatGPT can process only a finite number of tokens (roughly 128k), so older information eventually falls out of context and is effectively forgotten. To simulate long-term cognition, Generative Agents use a Memory Stream, a dynamic, structured log of experiences that preserves and prioritizes meaningful information. Each memory in the stream contains:

* A natural language description
* A creation timestamp
* A last-accessed timestamp
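
Such a record can be sketched as a small data class. This is a minimal illustration; the field names and types are assumptions, not the paper's actual implementation:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Memory:
    """One entry in the memory stream (field names are illustrative)."""
    description: str  # natural language description of the experience
    created_at: datetime = field(default_factory=datetime.now)
    last_accessed: datetime = field(default_factory=datetime.now)

m = Memory("Arjay is composing a new song.")
```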

The fundamental unit is an observation — a direct perception or experience. For example:

* Arjay is composing a new song.
* Logan is discussing town planning with Owen.
* Lulu is organizing a bonfire event with Arjay.
* A new visitor arrived at Selena’s tavern.

To keep context relevant, the agent continuously retrieves a subset of these memories based on:

* **Recency**: Prioritizing recent events using an exponential decay function.
* **Importance**: Scored by the LLM at memory creation — highly significant events persist longer.
* **Relevance**: Calculated using embedding similarity between memory text and the current situation.
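
The three signals above can be combined into a single retrieval score, typically as a weighted sum. The sketch below assumes an hourly decay rate, equal weights, a 1–10 importance scale, and precomputed embeddings; none of these specifics come from the source:

```python
import math
from datetime import datetime

def recency_score(last_accessed, now, decay=0.995):
    """Exponential decay per hour since the memory was last accessed."""
    hours = (now - last_accessed).total_seconds() / 3600
    return decay ** hours

def cosine_similarity(a, b):
    """Embedding similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieval_score(memory, query_embedding, now, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of recency, importance, and relevance (weights assumed equal)."""
    w_rec, w_imp, w_rel = weights
    rec = recency_score(memory["last_accessed"], now)
    imp = memory["importance"] / 10  # LLM-assigned score, assumed on a 1-10 scale
    rel = cosine_similarity(memory["embedding"], query_embedding)
    return w_rec * rec + w_imp * imp + w_rel * rel
```

At each step, the agent would rank all memories by this score and pass only the top-k into the prompt.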

This ensures that only the most contextually relevant and meaningful memories are fed into the LLM for behavior generation.
