In 2026, most of us have learned basic prompt engineering, prompt chaining with frameworks like LangChain and LangGraph, and retrieval systems such as RAG. Beyond these, Agentic AI patterns such as Planning and ReAct offer the most advanced and comprehensive ways to improve the quality of an agent's output. One critical capability that is starting to trend in 2026 and beyond, however, is the ability to store conversation history. Being able to retrieve conversation history allows the agent to understand you better over time. We can leverage both Lakebase and Mosaic AI Vector Search to store and retrieve this information, also known as Agent Memory.
Cognitive memory
The design of modern agent memory systems is inspired by the paper “Cognitive Architectures for Language Agents” by Sumers et al. from Princeton University (https://arxiv.org/abs/2309.02427). The high-level design is shown in Figure 1 below. Since Large Language Models are often described as loose analogues of the human brain, it makes sense to model agent memory on human memory. These memory categories can also be found in medical and cognitive-science publications.
Figure 1. Cognitive architectures for language agents by Sumers et al.
The diagram above illustrates the cognitive architecture of an artificial intelligence agent, designed to mimic human thought processes. At its center is the "Working Memory," which acts as the active processing hub where reasoning takes place based on incoming "Observations" from the outside world. To process these observations, the working memory interacts with three distinct long-term memory banks: Procedural Memory, Semantic Memory, and Episodic Memory. The system continuously pulls relevant information from these memories to understand its situation and updates them through an ongoing learning process. Table 1 summarizes the differences between short-term and long-term memory.
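As a loose illustration of this architecture, the sketch below models a working memory that records observations and recalls items from the three long-term memory banks. All class, field, and method names here are invented for this example, not taken from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class LongTermMemory:
    procedural: list = field(default_factory=list)  # rules and instructions
    semantic: list = field(default_factory=list)    # facts and concepts
    episodic: list = field(default_factory=list)    # past experiences

@dataclass
class WorkingMemory:
    observations: list = field(default_factory=list)
    long_term: LongTermMemory = field(default_factory=LongTermMemory)

    def observe(self, event: str) -> None:
        """Record an incoming observation from the environment."""
        self.observations.append(event)

    def recall(self, keyword: str) -> list:
        """Pull relevant items from all three long-term memory banks."""
        banks = (self.long_term.procedural
                 + self.long_term.semantic
                 + self.long_term.episodic)
        return [item for item in banks if keyword in item]

wm = WorkingMemory()
wm.long_term.procedural.append("chess rule: a king in checkmate ends the game")
wm.long_term.episodic.append("lost a chess game to the Sicilian Defense")
wm.observe("opponent played the Sicilian Defense")
print(wm.recall("Sicilian"))  # the matching episodic memory
```

A real agent would replace the keyword match in `recall` with retrieval against proper storage, which the rest of this article covers.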
| Memory type | Definition | Example |
| --- | --- | --- |
| Short-term memory | Memory that is discarded after an event completes | Conversation history, a chat session |
| Long-term: Semantic (Declarative) | Facts and concepts | Skill level: intermediate |
| Long-term: Episodic (Declarative) | Experiences | A PGN (game log) of a match you won or that especially impressed you |
| Long-term: Procedural | Instructions and rules | The rules of chess (Is it a checkmate?) |

Table 1. Short-term vs long-term memory
When users interact with an LLM agent, the working memory collaborates with a "Decision Procedure" to map out and weigh different choices. This feeds into a dedicated "Planning" phase, which operates in a careful loop: the system first proposes potential plans, evaluates their viability, and selects the best option. Once an optimal path is chosen, the system moves to "Execution," translating its internal reasoning into concrete "Actions" directed back out into the environment. This workflow is shown in Figure 2.
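The propose–evaluate–select loop described above can be sketched in a few lines of Python. The `propose`, `evaluate`, and `execute` callables are placeholders for real components (for example, LLM calls and tool invocations); all names here are illustrative:

```python
def plan_and_act(observation, propose, evaluate, execute):
    """One pass of the planning loop: propose -> evaluate -> select -> execute."""
    candidates = propose(observation)                # propose potential plans
    scored = [(evaluate(p), p) for p in candidates]  # evaluate viability
    best_score, best_plan = max(scored)              # select the best option
    return execute(best_plan)                        # act on the environment

# Toy example: pick the chess move with the highest (stubbed) evaluation.
moves = {"e4": 0.6, "d4": 0.5, "Nf3": 0.4}
result = plan_and_act(
    observation="opening position",
    propose=lambda obs: list(moves),
    evaluate=lambda plan: moves[plan],
    execute=lambda plan: f"played {plan}",
)
print(result)  # -> played e4
```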
Figure 2. The planning phase of working memory
Memory storage
The human brain is remarkable in how much information it can store. For agent memory, however, we need a place to store information temporarily for short-term memory and durably for long-term memory. Fortunately, Databricks provides two lightning-fast storage options: Lakebase and Mosaic AI Vector Search.
1. Lakebase (Relational / Key-Value Storage)
Lakebase, Databricks' managed PostgreSQL offering, is primarily used for Short-Term Memory (checkpoints) and Procedural Memory, because these types of memory require exact, high-speed lookups based on specific IDs.
Short-Term Memory: Every time the agent makes a move in your chess game, the current state (the board, the turn, the chat history) is saved as a "checkpoint." You retrieve this using a thread_id. You don't need to "search" for it; you need the exact latest version.
Procedural Memory (Rules): Instructions like "If a player is in checkmate, end the game" are static. These are stored as structured text or JSON in a standard relational table.
Profile (Semantic): Simple user facts (e.g., user_id: 123, favorite_opening: "Sicilian") are best stored in a standard JSONB column for quick, direct lookups.
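The three Lakebase patterns above can be sketched with a local stand-in. The example below uses SQLite in place of Lakebase/PostgreSQL so it runs anywhere (`json_extract` requires SQLite's JSON1 support, which ships with modern Python builds); every table and column name is made up for illustration:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Short-term memory: checkpoints retrieved by exact (thread_id, step) key.
conn.execute("""
    CREATE TABLE checkpoints (
        thread_id TEXT,
        step      INTEGER,
        state     TEXT,       -- JSON-serialized agent state
        PRIMARY KEY (thread_id, step)
    )
""")

# Semantic profile: one JSON document per user (JSONB in real PostgreSQL).
conn.execute("CREATE TABLE profiles (user_id INTEGER PRIMARY KEY, profile TEXT)")

def save_checkpoint(thread_id, step, state):
    conn.execute("INSERT INTO checkpoints VALUES (?, ?, ?)",
                 (thread_id, step, json.dumps(state)))

def latest_checkpoint(thread_id):
    # Exact, indexed lookup -- no semantic search needed.
    row = conn.execute(
        "SELECT state FROM checkpoints WHERE thread_id = ? "
        "ORDER BY step DESC LIMIT 1", (thread_id,)).fetchone()
    return json.loads(row[0]) if row else None

save_checkpoint("game-42", 1, {"board": "start", "turn": "white"})
save_checkpoint("game-42", 2, {"board": "e4", "turn": "black"})

conn.execute("INSERT INTO profiles VALUES (?, ?)",
             (123, json.dumps({"favorite_opening": "Sicilian"})))
opening = conn.execute(
    "SELECT json_extract(profile, '$.favorite_opening') "
    "FROM profiles WHERE user_id = ?", (123,)).fetchone()[0]

print(latest_checkpoint("game-42"))  # latest state for this thread
print(opening)
```

The key point is that both lookups are exact-match queries on an ID, which is what a relational store like Lakebase is optimized for.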
2. Vector Database (Semantic Search)
A Vector DB is used for Long-Term Memory (Episodic & Semantic Collection) where the agent needs to find information based on meaning rather than an exact ID.
Episodic Memory (Past Experiences): If you ask the agent, "Have I ever lost a game like this before?", the agent converts your current board state into a "vector" (embedding) and searches the database for similar historical games. Mosaic AI’s vector search can efficiently handle millions of records.
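The similarity lookup can be illustrated with a toy in-memory index. Real systems like Mosaic AI Vector Search handle embedding storage and approximate nearest-neighbor search at scale; here the "embeddings" are hand-made 3-dimensional vectors and the cosine-similarity search is brute force:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (math.sqrt(sum(x * x for x in a))
            * math.sqrt(sum(y * y for y in b)))
    return dot / norm

# Tiny episodic index: (game summary, hand-made embedding) pairs.
episodic_index = [
    ("lost: queenside attack collapsed", [0.9, 0.1, 0.0]),
    ("won: kingside pawn storm",         [0.1, 0.9, 0.2]),
    ("draw: locked pawn structure",      [0.2, 0.2, 0.9]),
]

def most_similar(query_vec, index):
    """Return the entry whose embedding is closest to the query."""
    return max(index, key=lambda item: cosine(query_vec, item[1]))

# Query: a board state that (by assumption) embeds near a queenside collapse.
game, _ = most_similar([0.8, 0.2, 0.1], episodic_index)
print(game)  # -> lost: queenside attack collapsed
```

In production, an embedding model would produce the vectors and the vector index would replace the brute-force `max`, but the retrieve-by-meaning pattern is the same.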
Semantic Memory: If you have thousands of chess strategy documents, a Vector DB allows the agent to retrieve only the relevant paragraphs about “opening tactics.” Like knowledge gained from practicing chess puzzles and real-world play, these facts are declarative but require semantic understanding to find. Figure 3 below illustrates the comparison.
Figure 3. Comparison between memory storage
Conclusion
The Databricks Data Intelligence Platform is no longer just a data processing engine. When your data and AI come together, it can empower the next generation of intelligence. The key takeaway is that memory type should drive your storage choice. Use Lakebase when you need exact, high-speed access to checkpoints, rules, or user profiles. Turn to vector search when the agent needs to recall past experiences or surface relevant knowledge based on semantic similarity. Together, governed by Unity Catalog, these tools give your agents a durable, secure memory layer built directly into the Lakehouse.

Jason Yip
Director of Data and AI, Tredence Inc.