refactor(ltm): redesign long-term memory with context compaction (reopen of #8144)#8226
refactor(ltm): redesign long-term memory with context compaction (reopen of #8144)#8226RC-CHN wants to merge 8 commits into
Conversation
- Add raw_records / contexts / summaries data model per group - Add LLM summary compaction strategy alongside truncation - Add turn-based (_split_into_rounds) granularity - Add image caption integration into LTM history - Add tool_call / tool_result persistence into raw_records - Add active reply support driven by LTM state - Improve summary injection prefix with system note and delimiters - Add info-level logging for summary compaction lifecycle - Clarify default summary prompt with explicit preserve/drop rules - Add context_guard for history overflow protection in agent runner - Add internal agent history compaction in agent_sub_stages - Add comprehensive LTM unit tests and compaction test suites
There was a problem hiding this comment.
Sorry @RC-CHN, your pull request is larger than the review limit of 150000 diff characters
There was a problem hiding this comment.
Code Review
This pull request introduces a significant upgrade to the Long-Term Memory (LTM v2) system, implementing a more robust architecture for group chat context management. Key changes include the introduction of a RequestContextGuard to protect provider requests from token limits without mutating persistent history, and the implementation of dual compaction strategies (turn-based truncation and LLM-based summarization) for both group and private chats. Feedback identifies several critical issues: potential blocking of message recording due to holding locks across network calls, memory leaks from uncleaned session locks, and resource leaks from uncancelled tasks in the tool loop. Additionally, improvements were suggested for handling malformed JSON in tool arguments, preventing system prompt duplication, and ensuring consistent fallback logic for compression providers.
There was a problem hiding this comment.
Code Review
This pull request implements LongTermMemory v2, introducing sophisticated context management for group chats, including logical round splitting and dual compaction strategies (truncation and LLM-based summarization). It also adds a RequestContextGuard to ensure provider requests stay within token limits without mutating canonical history. Feedback identifies several areas for refinement: the raw_records buffer needs trimming during message handling to prevent memory exhaustion, and robust error handling should be added to tool call parsing. Furthermore, the reviewer pointed out a memory leak in the session lock dictionary, suggested optimizing string length calculations for memory checks, and noted that the compression provider fallback logic needs to be correctly implemented to match the intended design.
- Treat lines starting with <T:CALL>, <T:RES, or <BOT/ as regular user messages when their respective parsers return None, instead of silently dropping them. Defensive guard against malformed internal markers.
Avoid allocating a new bytes object for every string when calculating buffer size in _trim_raw_records. Character count is sufficient for the approximate memory cap.
reopen of #8144 , purify commit history
Modifications / 改动点
Core:
astrbot/builtin_stars/astrbot/long_term_memory.pymax_cntring buffer withraw_records(deque) +_raw_cursor+contexts(append-only list). Old segments are never rebuilt._build_segments()converts raw chat lines into OpenAI-format context segments, handling tool calls, parallel tools, and multi-step chains.<BOT/>markers replace[You/]to avoid nickname collisions.on_agent_donerecords tool-call chains and now includes the @bot prompt in contexts so future rounds see the user's original message.asyncio.Lockfor concurrency safety;remove_session()for cleanup.Hook wiring:
astrbot/builtin_stars/astrbot/main.py@on_llm_response→@on_agent_donefor accurate tool-chain recording.group_icl_enable=trueskips Conversation DB query (conversation=None).Config:
astrbot/builtin_stars/astrbot/default.pycontext_limit_reached_strategy→"llm_compress".Agent runner:
astrbot/core/astr_main_agent.py_get_compress_providerauto-falls back to the main chat provider whenllm_compress_provider_idis unset, preventing silent truncation.Tests:
tests/unit/test_long_term_memory.pyPure functions: extract, parse, truncate, build_segments.
Integration: round-trip lifecycle, multi-round accumulation, tool chains, persona preservation, concurrent safety.
This is NOT a breaking change. / 这不是一个破坏性变更。
Screenshots or Test Results / 运行截图或测试结果
😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in
requirements.txtandpyproject.toml./ 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到
requirements.txt和pyproject.toml文件相应位置。😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。