Chapter 8: Memory Management

Effective memory management is crucial for intelligent agents to retain information. Agents require different types of memory, much like humans, to operate efficiently. This chapter delves into memory management, specifically addressing the immediate (short-term) and persistent (long-term) memory requirements of agents.

In agent systems, memory refers to an agent’s ability to retain and utilize information from past interactions, observations, and learning experiences. This capability allows agents to make informed decisions, maintain conversational context, and improve over time. Agent memory is generally categorized into two main types: short-term (contextual) memory, which holds the immediate context of the current interaction, and long-term (persistent) memory, which retains knowledge across many sessions.

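This split can be pictured with a minimal, framework-independent sketch (the class and its methods are purely illustrative):

```python
class AgentMemory:
    """Illustrative split between short-term and long-term agent memory."""

    def __init__(self):
        self.short_term = []   # turns of the current conversation (cleared per session)
        self.long_term = {}    # facts that persist across sessions

    def observe(self, message: str):
        """Record a turn in the current conversation."""
        self.short_term.append(message)

    def remember(self, key: str, fact: str):
        """Persist a fact beyond the current session."""
        self.long_term[key] = fact

    def end_session(self):
        """Short-term context is discarded; long-term knowledge survives."""
        self.short_term.clear()


memory = AgentMemory()
memory.observe("User: I'm planning a trip.")
memory.remember("user_interest", "travel")
memory.end_session()
print(memory.short_term, memory.long_term)  # [] {'user_interest': 'travel'}
```

Everything that follows in this chapter is, in essence, a production-grade version of these two containers.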

Practical Applications & Use Cases

Memory management is vital for agents to track information and perform intelligently over time; it is what allows them to move beyond basic question answering. Applications range from sustaining coherent multi-turn conversations to personalization and long-running, multi-step tasks.

Memory enables agents to maintain history, learn, personalize interactions, and manage complex, time-dependent problems.

Hands-On Code: Memory Management in Google Agent Developer Kit (ADK)

The Google Agent Developer Kit (ADK) offers a structured method for managing context and memory, including components for practical application. A solid grasp of ADK’s Session, State, and Memory is vital for building agents that need to retain information.

Just as in human interactions, agents require the ability to recall previous exchanges to conduct coherent and natural conversations. ADK simplifies context management through three core concepts and their associated services.

Every interaction with an agent can be considered a unique conversation thread, and agents might need to access data from earlier interactions. ADK structures this around three core concepts: the Session (one conversation thread), its State (temporary data scoped to that thread), and Memory (searchable knowledge that outlives any single thread).

ADK provides dedicated services for managing critical components essential for building complex, stateful, and context-aware agents. The SessionService manages chat threads (Session objects) by handling their initiation, recording, and termination, while the MemoryService oversees the storage and retrieval of long-term knowledge (Memory).

Both the SessionService and MemoryService offer various configuration options, allowing users to choose storage methods based on application needs. In-memory options are available for testing purposes, though data will not persist across restarts. For persistent storage and scalability, ADK also supports database and cloud-based services.

Session: Keeping Track of Each Chat

A Session object in ADK is designed to track and manage individual chat threads. Upon initiation of a conversation with an agent, the SessionService generates a Session object, represented as google.adk.sessions.Session. This object encapsulates all data relevant to a specific conversation thread, including unique identifiers (id, app_name, user_id), a chronological record of events as Event objects, a storage area for session-specific temporary data known as state, and a timestamp indicating the last update (last_update_time). Developers typically interact with Session objects indirectly through the SessionService, which is responsible for managing the lifecycle of conversation sessions: initiating new sessions, resuming previous ones, recording session activity (including state updates), identifying active sessions, and removing session data. The ADK provides several SessionService implementations with varying storage mechanisms for session history and temporary data, such as the InMemorySessionService, which is suitable for testing but does not persist data across application restarts.

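The shape of a Session can be sketched as a plain data structure (an illustrative stand-in for google.adk.sessions.Session, not the real class):

```python
import time
from dataclasses import dataclass, field


@dataclass
class SketchSession:
    """Illustrative stand-in mirroring the fields of an ADK Session."""
    id: str
    app_name: str
    user_id: str
    events: list = field(default_factory=list)    # chronological record of Event objects
    state: dict = field(default_factory=dict)     # session-scoped temporary data
    last_update_time: float = field(default_factory=time.time)


session = SketchSession(id="session1", app_name="chat_app", user_id="user1")
session.state["topic"] = "travel"
print(session.state)  # {'topic': 'travel'}
```

In real ADK code you would never construct this object yourself; the SessionService creates and manages it for you, as shown next.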

## Example: Using InMemorySessionService
## This is suitable for local development and testing where data
## persistence across application restarts is not required.
from google.adk.sessions import InMemorySessionService

session_service = InMemorySessionService()

For reliable persistence to a database you manage, ADK offers DatabaseSessionService.

## Example: Using DatabaseSessionService
## This is suitable for production or development requiring persistent storage.
## You need to configure a database URL (e.g., for SQLite, PostgreSQL, etc.).
## Requires: pip install google-adk[sqlalchemy] and a database driver (e.g., psycopg2 for PostgreSQL)
from google.adk.sessions import DatabaseSessionService

## Example using a local SQLite file:
db_url = "sqlite:///./my_agent_data.db"
session_service = DatabaseSessionService(db_url=db_url)

Finally, VertexAiSessionService leverages Vertex AI infrastructure for scalable production deployments on Google Cloud.

## Example: Using VertexAiSessionService
## This is suitable for scalable production on Google Cloud Platform, leveraging
## Vertex AI infrastructure for session management.
## Requires: pip install google-adk[vertexai] and GCP setup/authentication
from google.adk.sessions import VertexAiSessionService

PROJECT_ID = "your-gcp-project-id" # Replace with your GCP project ID
LOCATION = "us-central1" # Replace with your desired GCP location

## The app_name used with this service should correspond to the Reasoning Engine ID or name
REASONING_ENGINE_APP_NAME = "projects/your-gcp-project-id/locations/us-central1/reasoningEngines/your-engine-id" # Replace with your Reasoning Engine resource name

session_service = VertexAiSessionService(project=PROJECT_ID, location=LOCATION)

## When using this service, pass REASONING_ENGINE_APP_NAME to service methods:
## session_service.create_session(app_name=REASONING_ENGINE_APP_NAME, ...)
## session_service.get_session(app_name=REASONING_ENGINE_APP_NAME, ...)
## session_service.append_event(session, event, app_name=REASONING_ENGINE_APP_NAME)
## session_service.delete_session(app_name=REASONING_ENGINE_APP_NAME, ...)

Choosing an appropriate SessionService is crucial as it determines how the agent’s interaction history and temporary data are stored and their persistence.

Each message exchange involves a cyclical process:

  1. A message is received, and the Runner retrieves or establishes a Session using the SessionService.
  2. The agent processes the message using the Session’s context (state and historical interactions).
  3. The agent generates a response and may update the state; the Runner encapsulates this as an Event.
  4. The session_service.append_event method records the new event and updates the state in storage, and the Session then awaits the next message.

Ideally, the delete_session method is employed to terminate the session when the interaction concludes. This process illustrates how the SessionService maintains continuity by managing the Session-specific history and temporary data.

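The cycle can be simulated in plain Python to make the flow concrete (a toy sketch of the Runner loop, not ADK code):

```python
def handle_turn(sessions: dict, session_id: str, user_message: str) -> dict:
    """One turn of the receive -> process -> record cycle (illustrative)."""
    # Retrieve or establish the Session (what the SessionService does).
    session = sessions.setdefault(session_id, {"events": [], "state": {}})
    # The agent processes the message using the session's context.
    turn_number = len(session["events"]) // 2 + 1
    reply = f"echo: {user_message} (turn {turn_number})"
    # Both sides of the exchange are recorded as events, and state is updated.
    session["events"].append({"author": "user", "text": user_message})
    session["events"].append({"author": "agent", "text": reply})
    session["state"]["last_reply"] = reply
    return session


sessions = {}
handle_turn(sessions, "s1", "Hello")
handle_turn(sessions, "s1", "How are you?")
print(len(sessions["s1"]["events"]))  # 4
```

Each turn leaves behind an auditable event trail and an updated state, which is exactly the continuity the SessionService provides.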

State: The Session’s Scratchpad

In the ADK, each Session, representing a chat thread, includes a state component akin to an agent’s temporary working memory for the duration of that specific conversation. While session.events logs the entire chat history, session.state stores and updates dynamic data points relevant to the active chat.

Fundamentally, session.state operates as a dictionary, storing data as key-value pairs. Its core function is to enable the agent to retain and manage details essential for coherent dialogue, such as user preferences, task progress, incremental data collection, or conditional flags influencing subsequent agent actions.

The state’s structure comprises string keys paired with values of serializable Python types, including strings, numbers, booleans, lists, and dictionaries containing these basic types. State is dynamic, evolving throughout the conversation. The permanence of these changes depends on the configured SessionService.

State organization is achieved using key prefixes that define data scope and persistence: keys without a prefix are session-specific, user: keys are shared across all of a user’s sessions, app: keys are shared across all users of the application, and temp: keys are temporary and never persisted.

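How a session service might route prefixed keys to their respective scopes can be sketched as follows (illustrative only; ADK handles this routing internally):

```python
def route_state_delta(delta: dict) -> dict:
    """Group a state delta by ADK-style prefix scopes (illustrative)."""
    scopes = {"session": {}, "user": {}, "app": {}, "temp": {}}
    for key, value in delta.items():
        prefix, _, rest = key.partition(":")
        if rest and prefix in ("user", "app", "temp"):
            scopes[prefix][rest] = value
        else:
            scopes["session"][key] = value  # no recognized prefix -> session scope
    return scopes


delta = {
    "task_status": "active",         # session-specific
    "user:login_count": 3,           # tied to the user, across sessions
    "app:api_version": "v2",         # shared by all users
    "temp:validation_needed": True,  # never persisted
}
print(route_state_delta(delta)["user"])  # {'login_count': 3}
```

The agent itself never sees this routing; it simply reads and writes one flat session.state dictionary.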

The agent accesses all state data through a single session.state dictionary. The SessionService handles data retrieval, merging, and persistence. State should be updated upon adding an Event to the session history via session_service.append_event(). This ensures accurate tracking, proper saving in persistent services, and safe handling of state changes.

  1. The Simple Way: Using output_key (for Agent Text Replies): This is the easiest method if you just want to save your agent’s final text response directly into the state. When you set up your LlmAgent, just tell it the output_key you want to use. The Runner sees this and automatically creates the necessary actions to save the response to the state when it appends the event. Let’s look at a code example demonstrating state update via output_key.

## Import necessary classes from the Google Agent Developer Kit (ADK)
from google.adk.agents import LlmAgent
from google.adk.sessions import InMemorySessionService, Session
from google.adk.runners import Runner
from google.genai.types import Content, Part

## Define an LlmAgent with an output_key.
greeting_agent = LlmAgent(
   name="Greeter",
   model="gemini-2.0-flash",
   instruction="Generate a short, friendly greeting.",
   output_key="last_greeting"
)

## --- Setup Runner and Session ---
app_name, user_id, session_id = "state_app", "user1", "session1"
session_service = InMemorySessionService()
runner = Runner(
   agent=greeting_agent,
   app_name=app_name,
   session_service=session_service
)
session = session_service.create_session(
   app_name=app_name,
   user_id=user_id,
   session_id=session_id
)
print(f"Initial state: {session.state}")

## --- Run the Agent ---
user_message = Content(parts=[Part(text="Hello")])
print("\n--- Running the agent ---")
for event in runner.run(
   user_id=user_id,
   session_id=session_id,
   new_message=user_message
):
   if event.is_final_response():
     print("Agent responded.")

## --- Check Updated State ---
## Correctly check the state *after* the runner has finished processing all events.
updated_session = session_service.get_session(app_name=app_name, user_id=user_id, session_id=session_id)
print(f"\nState after agent run: {updated_session.state}")

Behind the scenes, the Runner sees your output_key and automatically creates the necessary actions with a state_delta when it calls append_event.

  2. The Standard Way: Using EventActions.state_delta (for More Complicated Updates): For times when you need to do more complex things – like updating several keys at once, saving things that aren’t just text, targeting specific scopes like user: or app:, or making updates that aren’t tied to the agent’s final text reply – you’ll manually build a dictionary of your state changes (the state_delta) and include it within the EventActions of the Event you’re appending. Let’s look at one example:

import time
from google.adk.tools.tool_context import ToolContext
from google.adk.sessions import InMemorySessionService

## --- Define the Recommended Tool-Based Approach ---
def log_user_login(tool_context: ToolContext) -> dict:
   """
   Updates the session state upon a user login event.
   This tool encapsulates all state changes related to a user login.
   Args:
       tool_context: Automatically provided by ADK, gives access to session state.
   Returns:
       A dictionary confirming the action was successful.
   """
   # Access the state directly through the provided context.
   state = tool_context.state

   # Get current values or defaults, then update the state.
   # This is much cleaner and co-locates the logic.
   login_count = state.get("user:login_count", 0) + 1
   state["user:login_count"] = login_count
   state["task_status"] = "active"
   state["user:last_login_ts"] = time.time()
   state["temp:validation_needed"] = True

   print("State updated from within the `log_user_login` tool.")

   return {
       "status": "success",
       "message": f"User login tracked. Total logins: {login_count}."
   }

## --- Demonstration of Usage ---
## In a real application, an LLM Agent would decide to call this tool.
## Here, we simulate a direct call for demonstration purposes.

## 1. Setup
session_service = InMemorySessionService()
app_name, user_id, session_id = "state_app_tool", "user3", "session3"
session = session_service.create_session(
   app_name=app_name,
   user_id=user_id,
   session_id=session_id,
   state={"user:login_count": 0, "task_status": "idle"}
)
print(f"Initial state: {session.state}")

## 2. Simulate a tool call (in a real app, the ADK Runner does this)
## We create a ToolContext manually just for this standalone example.
from google.adk.agents.invocation_context import InvocationContext
mock_context = ToolContext(
   invocation_context=InvocationContext(
       app_name=app_name, user_id=user_id, session_id=session_id,
       session=session, session_service=session_service
   )
)

## 3. Execute the tool
log_user_login(mock_context)

## 4. Check the updated state
updated_session = session_service.get_session(app_name=app_name, user_id=user_id, session_id=session_id)
print(f"State after tool execution: {updated_session.state}")

## The updated state reflects the changes made inside the tool,
## with the state-change logic encapsulated cleanly in one place.

This code demonstrates a tool-based approach for managing user session state in an application. It defines a function log_user_login, which acts as a tool. This tool is responsible for updating the session state when a user logs in. The function takes a ToolContext object, provided by the ADK, to access and modify the session’s state dictionary. Inside the tool, it increments a user:login_count, sets the task_status to “active”, records the user:last_login_ts (timestamp), and adds a temporary flag temp:validation_needed.

The demonstration part of the code simulates how this tool would be used. It sets up an in-memory session service and creates an initial session with some predefined state. A ToolContext is then manually created to mimic the environment in which the ADK Runner would execute the tool. The log_user_login function is called with this mock context. Finally, the code retrieves the session again to show that the state has been updated by the tool’s execution. The goal is to show how encapsulating state changes within tools makes the code cleaner and more organized compared to directly manipulating state outside of tools.

Note that direct modification of the session.state dictionary after retrieving a session is strongly discouraged as it bypasses the standard event processing mechanism. Such direct changes will not be recorded in the session’s event history, may not be persisted by the selected SessionService, could lead to concurrency issues, and will not update essential metadata such as timestamps. The recommended methods for updating the session state are using the output_key parameter on an LlmAgent (specifically for the agent’s final text responses) or including state changes within EventActions.state_delta when appending an event via session_service.append_event(). The session.state should primarily be used for reading existing data.

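The reason updates should flow through the event mechanism can be illustrated with a toy event log, where every state change is recorded as an event before being merged (a sketch of the principle, not ADK’s implementation):

```python
import time


def append_event(session: dict, author: str, state_delta=None):
    """Record an event and apply its state delta in one step (illustrative)."""
    event = {"author": author, "state_delta": state_delta or {}}
    session["events"].append(event)               # the change stays auditable in history
    session["state"].update(event["state_delta"])  # state is only mutated via the event
    session["last_update_time"] = time.time()      # metadata stays accurate


session = {"events": [], "state": {}, "last_update_time": 0.0}
append_event(session, "agent", {"last_greeting": "Hello!"})
print(session["state"])        # {'last_greeting': 'Hello!'}
print(len(session["events"]))  # 1
```

Writing directly to session["state"] would skip the event log and the timestamp update, which is precisely why ADK discourages mutating session.state outside of append_event.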

To recap, when designing your state, keep it simple, use basic data types, give your keys clear names and use prefixes correctly, avoid deep nesting, and always update state using the append_event process.

Memory: Long-Term Knowledge with MemoryService

In agent systems, the Session component maintains a record of the current chat history (events) and temporary data (state) specific to a single conversation. However, for agents to retain information across multiple interactions or access external data, long-term knowledge management is necessary. This is facilitated by the MemoryService.

## Example: Using InMemoryMemoryService
## This is suitable for local development and testing where data
## persistence across application restarts is not required.
## Memory content is lost when the app stops.
from google.adk.memory import InMemoryMemoryService

memory_service = InMemoryMemoryService()

Session and State can be conceptualized as short-term memory for a single chat session, whereas the Long-Term Knowledge managed by the MemoryService functions as a persistent and searchable repository. This repository may contain information from multiple past interactions or external sources. The MemoryService, as defined by the BaseMemoryService interface, establishes a standard for managing this searchable, long-term knowledge. Its primary functions include adding information, which involves extracting content from a session and storing it using the add_session_to_memory method, and retrieving information, which allows an agent to query the store and receive relevant data using the search_memory method.

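The add/search contract can be pictured with a naive keyword index (illustrative only; real implementations such as Vertex AI RAG use semantic retrieval):

```python
class SketchMemoryService:
    """Naive illustration of the add_session_to_memory / search_memory contract."""

    def __init__(self):
        self._entries = []  # (session_id, text) pairs

    def add_session_to_memory(self, session_id: str, events: list):
        """Extract content from a finished session into the long-term store."""
        for text in events:
            self._entries.append((session_id, text))

    def search_memory(self, query: str) -> list:
        """Return stored entries relevant to the query (here: substring match)."""
        q = query.lower()
        return [text for _, text in self._entries if q in text.lower()]


memory = SketchMemoryService()
memory.add_session_to_memory("s1", ["User prefers window seats.",
                                    "User is allergic to peanuts."])
print(memory.search_memory("window"))  # ['User prefers window seats.']
```

The interface is the important part: sessions flow in once, and any later conversation can query the accumulated knowledge.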

The ADK offers several implementations for creating this long-term knowledge store. The InMemoryMemoryService provides a temporary storage solution suitable for testing purposes, but data is not preserved across application restarts. For production environments, the VertexAiRagMemoryService is typically utilized. This service leverages Google Cloud’s Retrieval Augmented Generation (RAG) service, enabling scalable, persistent, and semantic search capabilities (see also Chapter 14 on RAG).

## Example: Using VertexAiRagMemoryService
## This is suitable for scalable production on GCP, leveraging
## Vertex AI RAG (Retrieval Augmented Generation) for persistent,
## searchable memory.
## Requires: pip install google-adk[vertexai], GCP
## setup/authentication, and a Vertex AI RAG Corpus.
from google.adk.memory import VertexAiRagMemoryService

## The resource name of your Vertex AI RAG Corpus
RAG_CORPUS_RESOURCE_NAME = "projects/your-gcp-project-id/locations/us-central1/ragCorpora/your-corpus-id" # Replace with your Corpus resource name

## Optional configuration for retrieval behavior
SIMILARITY_TOP_K = 5 # Number of top results to retrieve
VECTOR_DISTANCE_THRESHOLD = 0.7 # Threshold for vector similarity

memory_service = VertexAiRagMemoryService(
   rag_corpus=RAG_CORPUS_RESOURCE_NAME,
   similarity_top_k=SIMILARITY_TOP_K,
   vector_distance_threshold=VECTOR_DISTANCE_THRESHOLD
)

## When using this service, methods like add_session_to_memory
## and search_memory will interact with the specified Vertex AI
## RAG Corpus.

Hands-on code: Memory Management in LangChain and LangGraph

In LangChain and LangGraph, Memory is a critical component for creating intelligent and natural-feeling conversational applications. It allows an AI agent to remember information from past interactions, learn from feedback, and adapt to user preferences. LangChain’s memory feature provides the foundation for this by referencing a stored history to enrich current prompts and then recording the latest exchange for future use. As agents handle more complex tasks, this capability becomes essential for both efficiency and user satisfaction.

Short-Term Memory: This is thread-scoped, meaning it tracks the ongoing conversation within a single session or thread. It provides immediate context, but a full history can challenge an LLM’s context window, potentially leading to errors or poor performance. LangGraph manages short-term memory as part of the agent’s state, which is persisted via a checkpointer, allowing a thread to be resumed at any time.

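The context-window pressure described above is commonly relieved by trimming the thread before each model call. A minimal sketch of such a trimming helper (illustrative, not the LangGraph API):

```python
def trim_history(messages: list, max_messages: int = 4) -> list:
    """Keep the first (system) message plus the most recent turns (illustrative)."""
    if not messages:
        return []
    system, rest = messages[0], messages[1:]
    return [system] + rest[-max_messages:]


thread = ["system: be helpful"] + [f"turn {i}" for i in range(1, 11)]
print(trim_history(thread))
# ['system: be helpful', 'turn 7', 'turn 8', 'turn 9', 'turn 10']
```

A checkpointer still persists the full thread; trimming only limits what is sent to the LLM on each call.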

Long-Term Memory: This stores user-specific or application-level data across sessions and is shared between conversational threads. It is saved in custom “namespaces” and can be recalled at any time in any thread. LangGraph provides stores to save and recall long-term memories, enabling agents to retain knowledge indefinitely.

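Namespaced long-term storage can be pictured as a key-value map keyed by a (namespace, key) pair (a toy sketch, not the langgraph store API):

```python
class NamespacedStore:
    """Toy key-value store with LangGraph-style namespaces (illustrative)."""

    def __init__(self):
        self._data = {}

    def put(self, namespace: tuple, key: str, value: dict):
        """Save a memory under a namespace, e.g. ('users', 'sam')."""
        self._data[(namespace, key)] = value

    def get(self, namespace: tuple, key: str):
        """Recall a memory; any thread can do this at any time."""
        return self._data.get((namespace, key))


store = NamespacedStore()
store.put(("users", "sam"), "preferences", {"seat": "window"})
# A different conversational thread can recall it via the same namespace:
print(store.get(("users", "sam"), "preferences"))  # {'seat': 'window'}
```

The namespace is what decouples the memory from any one thread, which is the defining property of long-term memory here.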

LangChain provides several tools for managing conversation history, ranging from manual control to automated integration within chains.

ChatMessageHistory: Manual Memory Management. For direct and simple control over a conversation’s history outside of a formal chain, the ChatMessageHistory class is ideal. It allows for the manual tracking of dialogue exchanges.

from langchain.memory import ChatMessageHistory

## Initialize the history object
history = ChatMessageHistory()

## Add user and AI messages
history.add_user_message("I'm heading to New York next week.")
history.add_ai_message("Great! It's a fantastic city.")

## Access the list of messages
print(history.messages)

ConversationBufferMemory: Automated Memory for Chains. For integrating memory directly into chains, ConversationBufferMemory is a common choice. It holds a buffer of the conversation and makes it available to your prompt. Its behavior can be customized with two key parameters: memory_key, the prompt variable name under which the history is injected (defaulting to "history"), and return_messages, which controls whether the history is supplied as a single formatted string or as a list of message objects.

from langchain.memory import ConversationBufferMemory

## Initialize memory
memory = ConversationBufferMemory()

## Save a conversation turn
memory.save_context({"input": "What's the weather like?"}, {"output": "It's sunny today."})

## Load the memory as a string
print(memory.load_memory_variables({}))

Integrating this memory into an LLMChain allows the model to access the conversation’s history and provide contextually relevant responses.


from langchain_openai import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.memory import ConversationBufferMemory

# 1. Define LLM and Prompt
llm = OpenAI(temperature=0)
template = """You are a helpful travel agent.
Previous conversation: {history}
New question: {question}
Response:"""
prompt = PromptTemplate.from_template(template)

# 2. Configure Memory
# The memory_key "history" matches the variable in the prompt
memory = ConversationBufferMemory(memory_key="history")

# 3. Build the Chain
conversation = LLMChain(llm=llm, prompt=prompt, memory=memory)

# 4. Run the Conversation
response = conversation.predict(question="I want to book a flight.")
print(response)
response = conversation.predict(question="My name is Sam, by the way.")
print(response)
response = conversation.predict(question="What was my name again?")
print(response)

For improved effectiveness with chat models, it is recommended to use a structured list of message objects by setting return_messages=True.


from langchain_openai import ChatOpenAI
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory
from langchain_core.prompts import (
    ChatPromptTemplate,
    MessagesPlaceholder,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)

# 1. Define Chat Model and Prompt
llm = ChatOpenAI()
prompt = ChatPromptTemplate(
    messages=[
        SystemMessagePromptTemplate.from_template("You are a friendly assistant."),
        MessagesPlaceholder(variable_name="chat_history"),
        HumanMessagePromptTemplate.from_template("{question}")
    ]
)

# 2. Configure Memory
# return_messages=True is essential for chat models
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# 3. Build the Chain
conversation = LLMChain(llm=llm, prompt=prompt, memory=memory)

# 4. Run the Conversation
response = conversation.predict(question="Hi, I'm Jane.")
print(response)
response = conversation.predict(question="Do you remember my name?")
print(response)

Types of Long-Term Memory: Long-term memory allows systems to retain information across different conversations, providing a deeper level of context and personalization. It can be broken down into three types analogous to human memory: semantic memory (facts about the user or the world), episodic memory (recollections of specific past interactions), and procedural memory (the rules and instructions that govern the agent’s own behavior).

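The distinction between the three types can be made concrete with a small plain-Python sketch (illustrative only, not a LangGraph API) that files memories under per-type namespaces:

```python
from collections import defaultdict

# A toy long-term store keyed by (user_id, memory_type) namespaces.
# This mimics the namespace idea without any framework dependency.
store: dict = defaultdict(list)

def remember(user_id: str, memory_type: str, item: dict) -> None:
    store[(user_id, memory_type)].append(item)

def recall(user_id: str, memory_type: str) -> list:
    return store[(user_id, memory_type)]

# Semantic: facts about the user
remember("sam", "semantic", {"fact": "prefers aisle seats"})
# Episodic: a specific past interaction
remember("sam", "episodic", {"event": "booked a NYC flight last spring"})
# Procedural: rules that shape the agent's own behavior
remember("sam", "procedural", {"rule": "always confirm dates before booking"})

print(recall("sam", "semantic"))
```

Each type answers a different question at prompt-building time: what is true about the user, what happened before, and how the agent should behave.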

Below is pseudo-code demonstrating how an agent might use reflection to update its procedural memory stored in a LangGraph BaseStore.


# Node that updates the agent's instructions
def update_instructions(state: State, store: BaseStore):
    namespace = ("agent_instructions",)
    # Get the current instructions from the store
    current_instructions = store.get(namespace, key="agent_a")

    # Create a prompt to ask the LLM to reflect on the conversation
    # and generate new, improved instructions
    prompt = prompt_template.format(
        instructions=current_instructions.value["instructions"],
        conversation=state["messages"]
    )

    # Get the new instructions from the LLM
    output = llm.invoke(prompt)
    new_instructions = output['new_instructions']

    # Save the updated instructions back to the store
    store.put(namespace, "agent_a", {"instructions": new_instructions})

# Node that uses the instructions to generate a response
def call_model(state: State, store: BaseStore):
    namespace = ("agent_instructions",)
    # Retrieve the latest instructions from the store
    instructions = store.get(namespace, key="agent_a")

    # Use the retrieved instructions to format the prompt
    prompt = prompt_template.format(instructions=instructions.value["instructions"])
    # ... application logic continues

LangGraph stores long-term memories as JSON documents in a store. Each memory is organized under a custom namespace (like a folder) and a distinct key (like a filename). This hierarchical structure allows for easy organization and retrieval of information. The following code demonstrates how to use InMemoryStore to put, get, and search for memories.


from langgraph.store.memory import InMemoryStore

# A placeholder for a real embedding function
def embed(texts: list[str]) -> list[list[float]]:
    # In a real application, use a proper embedding model
    return [[1.0, 2.0] for _ in texts]

# Initialize an in-memory store. For production, use a database-backed store.
store = InMemoryStore(index={"embed": embed, "dims": 2})

# Define a namespace for a specific user and application context
user_id = "my-user"
application_context = "chitchat"
namespace = (user_id, application_context)

# 1. Put a memory into the store
store.put(
    namespace,
    "a-memory",  # The key for this memory
    {
        "rules": [
            "User likes short, direct language",
            "User only speaks English & python",
        ],
        "my-key": "my-value",
    },
)

# 2. Get the memory by its namespace and key
item = store.get(namespace, "a-memory")
print("Retrieved Item:", item)

# 3. Search for memories within the namespace, filtering by content
# and sorting by vector similarity to the query.
items = store.search(
    namespace,
    filter={"my-key": "my-value"},
    query="language preferences"
)
print("Search Results:", items)

Vertex Memory Bank


Memory Bank, a managed service in the Vertex AI Agent Engine, provides agents with persistent, long-term memory. The service uses Gemini models to asynchronously analyze conversation histories to extract key facts and user preferences.


This information is stored persistently, organized by a defined scope like user ID, and intelligently updated to consolidate new data and resolve contradictions. Upon starting a new session, the agent retrieves relevant memories through either a full data recall or a similarity search using embeddings. This process allows an agent to maintain continuity across sessions and personalize responses based on recalled information.

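Memory Bank manages retrieval for you, but the underlying idea of embedding-based similarity search can be sketched in plain Python. In this toy version, hand-made vectors stand in for a real embedding model (this is not the Vertex AI API): stored facts are scored against a query vector by cosine similarity and the best matches are returned.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two non-zero vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": in practice an embedding model produces these vectors.
memories = [
    ("User prefers window seats", [0.9, 0.1, 0.0]),
    ("User is allergic to peanuts", [0.0, 0.2, 0.9]),
    ("User often travels to New York", [0.7, 0.6, 0.1]),
]

def retrieve(query_vec: list[float], k: int = 2) -> list[str]:
    # Rank stored memories by similarity to the query and keep the top k
    ranked = sorted(memories, key=lambda m: cosine(query_vec, m[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# A query vector close to the "seat preference" memory
print(retrieve([1.0, 0.0, 0.0]))
```

A production service adds the pieces this sketch omits: asynchronous extraction of facts from raw transcripts, consolidation of contradictory entries, and scoping by user ID.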

The agent’s runner interacts with the VertexAiMemoryBankService, which is initialized first. This service handles the automatic storage of memories generated during the agent’s conversations. Each memory is tagged with a unique USER_ID and APP_NAME, ensuring accurate retrieval in the future.


from google.adk.memory import VertexAiMemoryBankService

agent_engine_id = agent_engine.api_resource.name.split("/")[-1]
memory_service = VertexAiMemoryBankService(
    project="PROJECT_ID",
    location="LOCATION",
    agent_engine_id=agent_engine_id
)

session = await session_service.get_session(
    app_name=app_name,
    user_id="USER_ID",
    session_id=session.id
)

await memory_service.add_session_to_memory(session)

Memory Bank offers seamless integration with the Google ADK, providing an immediate out-of-the-box experience. For users of other agent frameworks, such as LangGraph and CrewAI, Memory Bank also offers support through direct API calls. Online code examples demonstrating these integrations are readily available for interested readers.


At a Glance


What: Agentic systems need to remember information from past interactions to perform complex tasks and provide coherent experiences. Without a memory mechanism, agents are stateless, unable to maintain conversational context, learn from experience, or personalize responses for users. This fundamentally limits them to simple, one-shot interactions, failing to handle multi-step processes or evolving user needs. The core problem is how to effectively manage both the immediate, temporary information of a single conversation and the vast, persistent knowledge gathered over time.


Why: The standardized solution is to implement a dual-component memory system that distinguishes between short-term and long-term storage. Short-term, contextual memory holds recent interaction data within the LLM’s context window to maintain conversational flow. For information that must persist, long-term memory solutions use external databases, often vector stores, for efficient, semantic retrieval. Agentic frameworks like the Google ADK provide specific components to manage this, such as Session for the conversation thread and State for its temporary data. A dedicated MemoryService is used to interface with the long-term knowledge base, allowing the agent to retrieve and incorporate relevant past information into its current context.

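This dual-component design can be sketched as a minimal class (a toy illustration, not an ADK or LangChain API): a bounded buffer plays the role of short-term context, and a plain dict stands in for the persistent long-term store.

```python
from collections import deque

class AgentMemory:
    """Toy dual memory: bounded short-term buffer + persistent long-term store."""

    def __init__(self, context_window: int = 4):
        self.short_term = deque(maxlen=context_window)  # holds only recent turns
        self.long_term: dict[str, str] = {}             # survives across sessions

    def add_turn(self, role: str, text: str) -> None:
        # Oldest turns fall off automatically once the window is full
        self.short_term.append((role, text))

    def remember(self, key: str, value: str) -> None:
        self.long_term[key] = value

    def build_context(self) -> str:
        # Combine persistent facts with the recent conversation for the prompt
        facts = "; ".join(f"{k}={v}" for k, v in self.long_term.items())
        turns = "\n".join(f"{r}: {t}" for r, t in self.short_term)
        return f"Known facts: {facts}\n{turns}"

mem = AgentMemory(context_window=2)
mem.remember("name", "Sam")
mem.add_turn("user", "I want to book a flight.")
mem.add_turn("ai", "Where to?")
mem.add_turn("user", "New York.")  # first turn is evicted from the window
print(mem.build_context())
```

The bounded deque mirrors the LLM’s finite context window, while the dict mirrors an external store; a real system would back the latter with a database or vector store and retrieve only the facts relevant to the current query.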

Rule of thumb: Use this pattern when an agent needs to do more than answer a single question. It is essential for agents that must maintain context throughout a conversation, track progress in multi-step tasks, or personalize interactions by recalling user preferences and history. Implement memory management whenever the agent is expected to learn or adapt based on past successes, failures, or newly acquired information.


Visual summary


Fig.1: Memory management design pattern


Key Takeaways


To quickly recap the main points about memory management:

- Agents need memory to maintain conversational context, learn from experience, and personalize interactions; without it they are stateless.
- Short-term memory holds the recent context of a single conversation, while long-term memory persists knowledge across sessions, typically in an external store.
- Google ADK manages context through Session, State, and a MemoryService; LangChain offers ChatMessageHistory and ConversationBufferMemory; LangGraph provides namespaced stores for long-term memory.
- Vertex AI Memory Bank is a managed service that extracts, consolidates, and retrieves long-term memories for agents.


Conclusion


This chapter dove into the critical job of memory management for agent systems, showing the difference between short-lived context and knowledge that persists long-term. We discussed how these types of memory are structured and where they are used in building smarter agents that can remember things. We took a detailed look at how Google ADK provides specific pieces like Session, State, and MemoryService to handle this. Now that we’ve covered how agents can remember things, both short-term and long-term, we can move on to how they can learn and adapt. The next pattern, “Learning and Adaptation,” is about an agent changing how it thinks, acts, or what it knows, based on new experiences or data.


References


  1. ADK Memory, https://google.github.io/adk-docs/sessions/memory/
  2. LangGraph Memory, https://langchain-ai.github.io/langgraph/concepts/memory/
  3. Vertex AI Agent Engine Memory Bank, https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-memory-bank-in-public-preview