feat(memory): add behavioral action insight extraction #6175
bennyyuan1008 wants to merge 5 commits into
Add extract_action_insights() API parallel to extract_memories():
- New Pydantic models: ActionInsightItem, ExtractedActionInsights
- New function: extract_action_insights_from_content() in analyze.py
- Prompt templates in en.json for D/L/P insight extraction
- Public API on Memory, MemoryScope, MemorySlice, Flow
- Async variant aextract_action_insights()
- 7 unit tests covering normal path, empty input, LLM failure, and delegation through all three layers

Pure addition: zero existing code paths modified.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
📝 Walkthrough
Adds behavioral action insight extraction and injection across the full agent-memory stack.
Changes: Behavioral Memory Feature
Sequence Diagram(s)
sequenceDiagram
participant Agent
participant LiteAgent
participant Memory
participant LLM
participant Storage
rect rgba(200, 150, 255, 0.5)
Note over Agent,Storage: Agent execution and insight extraction
Agent->>LiteAgent: execute task
LiteAgent->>LiteAgent: receive LLM output
LiteAgent->>Memory: extract_action_insights(output)
Memory->>LLM: system+user prompts (response_model=ExtractedActionInsights)
LLM-->>Memory: list[ActionInsightItem]
Memory-->>LiteAgent: insights
LiteAgent->>Storage: _save_action_insight(insight) × N with /behavioral scope
end
rect rgba(150, 200, 150, 0.5)
Note over Agent,Storage: Memory recall and prompt injection
Agent->>Memory: recall_memories(query, limit=10/30)
Memory-->>Agent: factual + behavioral matches with adjusted scores
Agent->>Agent: partition by metadata.type == "action_insight"
Agent->>Agent: format factual under "Relevant memories"
Agent->>Agent: format behavioral under "Behavioral insights"
Agent->>LLM: updated system message with both sections
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 2
🧹 Nitpick comments (1)
lib/crewai/tests/memory/test_unified_memory.py (1)
999-1111: ⚡ Quick win
Add explicit async coverage for `Memory.aextract_action_insights()`.
The new tests validate only the sync extraction path. Since this PR introduces a public async variant too, add at least one async test (success + graceful failure to `[]`) to prevent regressions in that API contract.
As per coding guidelines in the provided PR objectives/review stack context, the async API is part of this feature's public surface and should be validated at test level.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/tests/memory/test_unified_memory.py` around lines 999 - 1111, Add explicit async test coverage for the Memory.aextract_action_insights() method to complement the existing sync extract_action_insights() tests. Create at least two async test functions: one that validates the success path (similar to test_extract_action_insights_returns_list_from_llm but using async/await and pytest.mark.asyncio), and another that validates the graceful failure path where LLM errors result in an empty list being returned (similar to test_extract_action_insights_llm_failure_returns_empty). These tests should mirror the structure and assertions of their sync counterparts to ensure the async API contract is properly validated and prevent future regressions.
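A minimal sketch of the two async cases requested above (success, and graceful failure to `[]`). `FakeMemory` is a hypothetical stand-in that models only the contract under discussion, not the real `Memory` class. The PR's tests would presumably use `pytest.mark.asyncio`; plain `asyncio.run` is used here so the sketch runs without plugins.

```python
import asyncio
from typing import Any, Callable


class FakeMemory:
    """Hypothetical stand-in for Memory; models only the async contract."""

    def __init__(self, llm_call: Callable[[str], list[Any]]) -> None:
        self._llm_call = llm_call

    async def aextract_action_insights(self, content: str) -> list[Any]:
        if not content.strip():
            return []
        try:
            # Run the blocking LLM call off the event loop.
            return await asyncio.to_thread(self._llm_call, content)
        except Exception:
            # Contract under test: extraction failure degrades to [].
            return []


def failing_llm(content: str) -> list[Any]:
    raise RuntimeError("LLM unavailable")


async def check_success() -> list[Any]:
    memory = FakeMemory(lambda content: [{"type": "lesson", "content": content}])
    return await memory.aextract_action_insights("retry with backoff")


async def check_failure() -> list[Any]:
    return await FakeMemory(failing_llm).aextract_action_insights("anything")
```

Mirroring the sync tests this way keeps the two API surfaces under the same assertions, so a change to one path that breaks the other is caught.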
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/memory/analyze.py`:
- Around line 108-110: The `type` field in the `ActionInsightItem` class is
currently declared as a generic `str` type, which allows any string value
despite the description specifying it must be one of 'decision', 'lesson', or
'pattern'. Replace the `type: str` annotation with a constrained type (such as a
Literal type from typing or an Enum) that enforces the three allowed values,
ensuring Pydantic validation will reject any other values and maintain
consistency in persisted metadata.
- Around line 261-266: The function-calling branch that calls llm.call(messages,
response_model=ExtractedActionInsights) on line 262 does not handle JSON-string
responses consistently with the non-function-calling branch which uses
json.loads() to parse strings. Add handling for JSON-string responses in the
function-calling branch (after the llm.call() invocation on line 262) by
checking if the response is a string and parsing it via json.loads(), similar to
the pattern used in the non-function-calling branch that starts around line
269-271, so both branches handle string responses uniformly instead of relying
on exception catching.
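Both analyze.py findings can be illustrated together with a stdlib-only sketch: accept either a parsed dict or a JSON string, and keep only items whose `type` is one of the three allowed values. The `"insights"` key and the `parse_insights` helper are assumptions for illustration. In the real model, a Pydantic `Literal` annotation would reject invalid values at validation time rather than filter them.

```python
import json
from typing import Any, Literal, get_args

# The three allowed values come from the review comment above.
InsightType = Literal["decision", "lesson", "pattern"]
ALLOWED_TYPES = frozenset(get_args(InsightType))


def parse_insights(response: Any) -> list[dict[str, Any]]:
    """Accept a dict or a JSON string; return only well-typed insight items."""
    if isinstance(response, str):
        try:
            response = json.loads(response)
        except json.JSONDecodeError:
            return []
    if not isinstance(response, dict):
        return []
    items = response.get("insights", [])  # "insights" key is an assumption
    if not isinstance(items, list):
        return []
    return [
        item
        for item in items
        if isinstance(item, dict)
        and isinstance(item.get("type"), str)
        and item["type"] in ALLOWED_TYPES
    ]
```

Handling the string case explicitly means both branches treat string responses the same way, instead of one of them relying on a caught exception.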
📒 Files selected for processing (6)
- lib/crewai/src/crewai/flow/runtime/__init__.py
- lib/crewai/src/crewai/memory/analyze.py
- lib/crewai/src/crewai/memory/memory_scope.py
- lib/crewai/src/crewai/memory/unified_memory.py
- lib/crewai/src/crewai/translations/en.json
- lib/crewai/tests/memory/test_unified_memory.py
PR crewAIInc#2 (save path):
- BaseAgentExecutor._save_to_memory(): extract_action_insights after extract_memories, save each insight via remember() with metadata
- LiteAgent._save_to_memory(): same pattern (no root_scope)

PR crewAIInc#3 (split prompt injection):
- agent/core.py _retrieve_memory_context(): post-filter recall results into factual/behavioral, format with separate templates
- agent/core.py _prepare_kickoff(): same post-filter pattern
- lite_agent.py _inject_memory_context(): same post-filter pattern
- en.json: new "behavioral_memory" template slice

All insight saves skip LLM (Group A fast path: scope/categories/importance all pre-filled). Extraction failure degrades to empty list silently.
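The post-filter step described in this commit can be sketched as follows. The match shape (a dict with `content` and `metadata`) and both helper names are assumptions. The section headings are the ones shown in the sequence diagram, and the real code formats through en.json templates rather than inline strings.

```python
from typing import Any

Match = dict[str, Any]  # assumed shape: {"content": str, "metadata": dict}


def partition_matches(matches: list[Match]) -> tuple[list[Match], list[Match]]:
    """Split recall results into (factual, behavioral) by metadata.type."""
    factual: list[Match] = []
    behavioral: list[Match] = []
    for match in matches:
        metadata = match.get("metadata") or {}
        if metadata.get("type") == "action_insight":
            behavioral.append(match)
        else:
            factual.append(match)
    return factual, behavioral


def format_sections(matches: list[Match]) -> str:
    """Render each non-empty partition under its own heading."""
    factual, behavioral = partition_matches(matches)
    sections = []
    if factual:
        lines = "\n".join(f"- {m['content']}" for m in factual)
        sections.append(f"Relevant memories:\n{lines}")
    if behavioral:
        lines = "\n".join(f"- {m['content']}" for m in behavioral)
        sections.append(f"Behavioral insights:\n{lines}")
    return "\n\n".join(sections)
```

Partitioning after a single recall call keeps one retrieval round trip while still letting the prompt present the two kinds of memory separately.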
Actionable comments posted: 1
🧹 Nitpick comments (1)
lib/crewai/tests/agents/test_lite_agent.py (1)
1142-1185: ⚡ Quick win
Add a scoped-memory regression case for behavioral saves.
This test uses a plain mock, so it won't catch the `MemoryScope.remember()` kwarg incompatibility path. Add a case with a real `MemoryScope`/`MemorySlice` (or autospecced mock) to lock this contract down.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/tests/agents/test_lite_agent.py` around lines 1142 - 1185, The current test_lite_agent_save_to_memory_saves_action_insights test uses a plain Mock for mock_memory, which doesn't validate the actual method signatures of MemoryScope.remember(). Add a new test case using an autospecced mock based on the real MemoryScope class (or use a real MemorySlice instance if available) instead of a plain Mock. This will ensure the test catches any incompatibilities between what the LiteAgent passes to remember() and what the actual MemoryScope.remember() method accepts as keyword arguments, properly locking down the contract between the agent and memory system.
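The difference between a plain mock and an autospecced one can be shown with the standard library alone. `ScopedMemoryStub` is a hypothetical class whose `remember()` signature lacks `agent_role`; the real `MemoryScope.remember()` signature may differ in its other parameters.

```python
from unittest.mock import Mock, create_autospec


class ScopedMemoryStub:
    """Hypothetical stand-in: remember() does not accept agent_role."""

    def remember(self, content, scope=None, metadata=None):
        return None


def call_like_agent(memory) -> bool:
    """Return True if the call is accepted, False if the signature rejects it."""
    try:
        memory.remember("prefer retries", scope="/behavioral", agent_role="researcher")
    except TypeError:
        return False
    return True


# A plain Mock accepts any keyword argument, so the bug stays hidden.
plain_mock = Mock()

# An autospecced mock enforces the real method signature.
strict_mock = create_autospec(ScopedMemoryStub, instance=True)
```

With `plain_mock` the call succeeds silently, while `strict_mock` raises `TypeError` for the unexpected `agent_role` keyword, which is exactly the failure the review wants the test suite to surface.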
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/lite_agent.py`:
- Around line 639-652: The `self._memory.remember()` call is passing an
`agent_role` parameter that is not accepted by the `MemoryScope.remember()`
method, causing silent failures in behavioral insight storage. Remove the
`agent_role=self.role` keyword argument from the `remember()` call to make it
compatible with the memory object's API and ensure behavioral insights are
properly persisted.
📒 Files selected for processing (6)
- lib/crewai/src/crewai/agent/core.py
- lib/crewai/src/crewai/agents/agent_builder/base_agent_executor.py
- lib/crewai/src/crewai/lite_agent.py
- lib/crewai/src/crewai/translations/en.json
- lib/crewai/tests/agents/test_lite_agent.py
- lib/crewai/tests/memory/test_unified_memory.py
- Add _save_action_insight(): embed → search → update-or-insert, bypassing EncodingFlow since all fields are pre-filled (Group A fast path)
- Add compute_behavioral_adjustment(): frequency-based confidence (1-10 obs), staleness decay from last_observed_at (60d half-life), contradiction penalty
- Apply adjustment factor in recall() shallow path for action_insight records
- Replace blind remember() calls in both executors with aggregation save
- Replace categories/metadata dicts in save path with structured params
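To make the three factors in compute_behavioral_adjustment() concrete, here is a rough sketch. Only the 60-day half-life and the 1-10 observation range come from the commit message. The linear confidence ramp, the 0.5 contradiction penalty, and the function name are assumptions, not the PR's actual formula.

```python
from datetime import datetime, timezone
from typing import Any, Optional

HALF_LIFE_DAYS = 60.0              # from the commit message
FULL_CONFIDENCE_OBSERVATIONS = 10  # upper end of the "1-10 obs" range
CONTRADICTION_PENALTY = 0.5        # assumed value


def behavioral_adjustment_sketch(
    metadata: dict[str, Any], now: Optional[datetime] = None
) -> float:
    """Return a multiplier in (0, 1] for an action-insight score."""
    now = now or datetime.now(timezone.utc)

    # Frequency-based confidence: assumed linear ramp, clamped to 1..10.
    count = int(metadata.get("observation_count", 1))
    count = max(1, min(count, FULL_CONFIDENCE_OBSERVATIONS))
    confidence = count / FULL_CONFIDENCE_OBSERVATIONS

    # Staleness: exponential decay that halves every HALF_LIFE_DAYS.
    staleness = 1.0
    last_seen = metadata.get("last_observed_at")
    if last_seen:
        last = datetime.fromisoformat(last_seen)
        if last.tzinfo is None:
            last = last.replace(tzinfo=timezone.utc)
        age_days = max(0.0, (now - last).total_seconds() / 86400)
        staleness = 0.5 ** (age_days / HALF_LIFE_DAYS)

    penalty = CONTRADICTION_PENALTY if metadata.get("contradicts") else 1.0
    return confidence * staleness * penalty
```

Under these assumptions, an insight observed ten times and last seen 60 days ago gets a multiplier of 0.5, and the same insight flagged as contradicted gets 0.25.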
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
lib/crewai/src/crewai/memory/unified_memory.py (2)
645-662: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Catch `_llm` initialization failures before they escape.
`self._llm` is evaluated before `extract_action_insights_from_content()` runs, so a missing/invalid LLM configuration can still raise here instead of returning `[]` as the method contract promises.
🐛 Proposed fix
 def extract_action_insights(self, content: str) -> list[Any]:
@@
-    return extract_action_insights_from_content(content, self._llm)
+    try:
+        return extract_action_insights_from_content(content, self._llm)
+    except Exception:
+        return []
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/memory/unified_memory.py` around lines 645 - 662, The extract_action_insights method passes self._llm directly to extract_action_insights_from_content without first validating that the LLM is initialized, which contradicts the docstring promise to return an empty list on LLM failure. Add a guard check before calling extract_action_insights_from_content to verify that self._llm exists and is properly configured, returning an empty list immediately if the LLM is missing or invalid. This ensures initialization failures are caught and handled gracefully within the method rather than escaping as an exception.
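Why the `try` has to wrap the attribute access itself can be seen in a small sketch. `LazyLLMMemory` and its factory argument are hypothetical; the point is only that a lazily initialized `_llm` property can raise before the extraction function is ever called.

```python
from typing import Any, Callable


class LazyLLMMemory:
    """Hypothetical stand-in: _llm is a lazy property that can raise on access."""

    def __init__(self, llm_factory: Callable[[], Callable[[str], list[Any]]]) -> None:
        self._llm_factory = llm_factory

    @property
    def _llm(self) -> Callable[[str], list[Any]]:
        return self._llm_factory()

    def extract_action_insights(self, content: str) -> list[Any]:
        try:
            # The property access is inside the try, so init errors are caught too.
            llm = self._llm
            return llm(content)
        except Exception:
            return []


def broken_factory() -> Callable[[str], list[Any]]:
    raise ValueError("missing API key")
```

If the `self._llm` lookup sat outside the `try`, the `ValueError` from `broken_factory` would propagate to the caller despite the documented `[]` fallback.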
753-772: ⚠️ Potential issue | 🔴 Critical
The deep recall path does not apply behavioral scoring adjustments to action insights.
In the shallow branch (lines 738-744), `compute_behavioral_adjustment()` multiplies scores by confidence and staleness factors based on `observation_count`, `last_observed_at`, and `contradicts` metadata. In the deep branch (RecallFlow), only `compute_composite_score()` is applied in `synthesize_results()` (line 256ish), without the behavioral adjustment. This means action insights retrieved via the default deep path skip confidence and contradiction penalties.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/memory/unified_memory.py` around lines 753 - 772, The RecallFlow deep recall path does not apply behavioral scoring adjustments to the results retrieved, unlike the shallow branch which calls compute_behavioral_adjustment(). After obtaining results from flow.state.final_results in the RecallFlow block, apply the same behavioral adjustment logic used in the shallow branch that multiplies scores by confidence and staleness factors based on observation_count, last_observed_at, and contradicts metadata. Extract and apply the compute_behavioral_adjustment() call to each result in the deep branch to ensure consistent penalty application for confidence and contradiction factors across both recall paths.
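One way to keep the two recall paths consistent is a shared post-processing helper that both call. The sketch below assumes matches are dicts with `score` and `metadata` keys and takes the adjustment function as a parameter; the helper name and match shape are illustrative, not the PR's types.

```python
from typing import Any, Callable

Match = dict[str, Any]  # assumed shape: {"score": float, "metadata": dict}


def apply_behavioral_adjustment(
    matches: list[Match], adjust: Callable[[dict[str, Any]], float]
) -> list[Match]:
    """Scale action-insight scores by adjust(metadata), then re-rank."""
    adjusted = []
    for match in matches:
        metadata = match.get("metadata") or {}
        score = match["score"]
        if metadata.get("type") == "action_insight":
            score *= adjust(metadata)
        # Copy rather than mutate so callers keep the raw results.
        adjusted.append({**match, "score": score})
    return sorted(adjusted, key=lambda m: m["score"], reverse=True)
```

Because the helper re-sorts after scaling, a penalized insight can drop below a factual match that originally scored lower, which is the behavior the shallow path already has.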
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/agents/agent_builder/base_agent_executor.py`:
- Around line 79-84: The behavioral root_scope construction at lines 79-84 is
missing the leading-slash normalization that is applied to the factual memory
root_scope at lines 58-59. Apply the same normalization pattern used in the
factual branch to the behavioral root_scope to ensure consistency. This ensures
that when memory.root_scope is relative, both behavioral insights and factual
memories are written under the same scope tree, preventing recall partition
mismatches. Update the string formatting for the behavioral root_scope to
include any leading-slash normalization that the factual memory path applies.
- Around line 71-85: Replace the private _save_action_insight() method call with
the public remember() method to ensure compatibility across all memory
implementations (Memory, MemoryScope, MemorySlice). Instead of passing insight
details as separate parameters to _save_action_insight(), use the remember()
method with a metadata dictionary containing type="action_insight" along with
insight_type, domain, rationale, and context_signals. Preserve the
scope="/behavioral" and root_scope logic by passing them as appropriate
parameters to the remember() method. This ensures behavioral insights are
persisted reliably without silent failures when wrapper classes are used instead
of the base Memory class.
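The leading-slash normalization from the first item can be sketched as a small helper. `sanitize_role` is a hypothetical stand-in for `sanitize_scope_name()`, whose real sanitization rules are not shown in this thread, and `agent_root_scope` is an illustrative name.

```python
import re
from typing import Optional


def sanitize_role(role: Optional[str]) -> str:
    """Hypothetical stand-in for sanitize_scope_name()."""
    cleaned = re.sub(r"[^a-z0-9_-]+", "-", (role or "unknown").lower()).strip("-")
    return cleaned or "unknown"


def agent_root_scope(base_root: Optional[str], role: Optional[str]) -> Optional[str]:
    """Build '/<base>/agent/<role>' with exactly one leading slash."""
    if not isinstance(base_root, str) or not base_root:
        return None
    path = f"{base_root.strip('/')}/agent/{sanitize_role(role)}"
    return "/" + path.lstrip("/")
```

Routing both the factual and the behavioral branch through one helper like this means a relative `root_scope` such as `crew/main` and an absolute `/crew/main/` resolve to the same scope tree.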
In `@lib/crewai/src/crewai/memory/types.py`:
- Around line 415-423: The issue is that when last_seen contains a
timezone-aware ISO string (like "2026-06-16T10:00:00Z"),
datetime.fromisoformat() returns a timezone-aware datetime, but
datetime.utcnow() is naive. Subtracting them raises a TypeError which gets
silently caught in the except block, preventing staleness calculation. To fix
this, after parsing last_seen with datetime.fromisoformat() into the last
variable, normalize timezone-aware datetimes to naive UTC by checking if
last.tzinfo is not None, and if so, convert it to a naive UTC datetime (for
example by using astimezone(timezone.utc).replace(tzinfo=None) or equivalent),
ensuring both datetimes are naive before computing the staleness multiplier with
datetime.utcnow().
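A sketch of the timezone fix, with one deliberate swap: instead of normalizing to naive UTC as the comment suggests, it normalizes everything to aware UTC, since `datetime.utcnow()` is deprecated as of Python 3.12. The helper names are illustrative.

```python
from datetime import datetime, timezone


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO-8601 string into a timezone-aware UTC datetime."""
    # fromisoformat() only accepts a trailing "Z" on Python 3.11+, so map it.
    if timestamp.endswith("Z"):
        timestamp = timestamp[:-1] + "+00:00"
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Assumption: naive timestamps were written in UTC.
        return parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def age_in_days(last_seen: str, now: datetime) -> float:
    """Age of last_seen relative to an aware `now`, in days."""
    return (now - parse_utc(last_seen)).total_seconds() / 86400
```

Either direction works as long as both operands of the subtraction agree; mixing naive and aware values is what raises the `TypeError` that the `except` block currently swallows.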
In `@lib/crewai/src/crewai/memory/unified_memory.py`:
- Line 921: The `Memory` class uses `self._logger` at line 921 in the warning
call and at lines 979-980, but `_logger` is not defined as an instance variable
in the class, causing an AttributeError when the code tries to log. To fix this,
either initialize `_logger` as an instance variable in the class constructor
(for example, using Python's logging module), or replace all references to
`self._logger` with an appropriate logger instance that is properly available in
the Memory class. Ensure the logger is defined before any code path attempts to
use it, so that exception handlers don't encounter the same AttributeError when
trying to log failures.
- Around line 930-975: The search-update-save aggregation in the insight
consolidation block (where similar records are searched and either updated via
self._storage.update() or new records are saved via self._storage.save())
performs writes directly on the storage layer without serialization. To prevent
race conditions where concurrent saves of the same insight can create duplicates
or lose observation counts, refactor this logic to route the entire
search-update-save operation through the existing _submit_save() mechanism
instead of calling storage methods directly. This ensures the aggregation is
serialized and atomic, matching the pattern used elsewhere in remember().
- Around line 957-966: The metadata dictionary being constructed for action
insights is missing the agent_role parameter, which is accepted by the method
but not persisted. Add the agent_role to the metadata dictionary (the one with
"type": "action_insight") to ensure saved behavioral insights can be properly
attributed and filtered by the agent role that produced them. Include agent_role
in the metadata dict alongside the existing fields like insight_type, domain,
rationale, and the timestamp fields.
- Around line 924-928: The method does not fall back to self.root_scope when the
root_scope argument is not provided, unlike remember() and recall() which do.
Modify the logic in the effective_scope determination block (around
join_scope_paths) to check if root_scope is None or not provided, and if so, use
self.root_scope as the fallback value before joining with the scope parameter.
This ensures behavioral insights are stored under the correct namespace and
remain visible to root-scoped recalls.
- Around line 930-954: In the consolidation logic where you iterate through
similar records and check if sim_score >= threshold, add an additional filter to
verify that the record being compared has the same insight kind before
aggregating. The current code aggregates the first record above the similarity
threshold without checking if metadata.type, insight_type, or domain match the
current record type. Before the aggregation occurs (before updating the record
with updated observation counts), add a condition to compare the insight types
from both records' metadata to ensure you are only consolidating insights of the
same kind. This prevents factual records from being merged with action insights
or other mismatched insight types.
- Around line 917-923: Add a guard clause at the beginning of the code block
that embeds action insight content (before the try block starting with
embed_text call) to check if self.read_only is true and return None early if it
is, matching the behavior of the remember() and remember_many() methods. This
ensures that the read-only mode is honored before any embedding or persistence
operations occur.
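The race described in the lines 930-975 item can be illustrated with a toy store. This sketch uses a `threading.Lock` rather than the `_submit_save()` mechanism the review recommends, because that mechanism's interface is not shown here. `InsightStore` is hypothetical, and exact-match lookup stands in for similarity search.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class InsightStore:
    """Hypothetical in-memory store keyed by insight content."""

    def __init__(self) -> None:
        self._records: dict[str, int] = {}
        self._lock = threading.Lock()

    def observe(self, content: str) -> int:
        # The lock makes search + update-or-insert a single atomic step,
        # so concurrent observers cannot both take the "insert" branch.
        with self._lock:
            count = self._records.get(content, 0) + 1
            self._records[content] = count
            return count

    def count(self, content: str) -> int:
        with self._lock:
            return self._records.get(content, 0)


def observe_concurrently(store: InsightStore, content: str, times: int) -> int:
    """Record the same insight from several threads; return the final count."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(lambda _: store.observe(content), range(times)))
    return store.count(content)
```

Whatever serialization mechanism is used, the property to preserve is the same: N concurrent observations of one insight should end as one record with `observation_count == N`.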
In `@lib/crewai/tests/memory/test_unified_memory.py`:
- Around line 1485-1514: The hardcoded timestamp "2026-06-16T10:00:00Z" in the
meta dictionaries of test_behavioral_adjustment_no_discount (line 1485) and
test_behavioral_adjustment_contradiction (line 1509) will become stale as wall
time advances, causing the staleness decay calculation in
compute_behavioral_adjustment to apply unexpected discounts and break the
multiplier assertions (line 1487 expecting 1.0 and line 1513 expecting 0.5). Fix
this by either mocking the current time within these test functions to always be
"2026-06-16T10:00:00Z" (or shortly after), or by using a dynamically generated
timestamp that is always fresh relative to the test execution time, ensuring the
staleness decay does not interfere with the assertions.
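The second option (a timestamp generated relative to test execution time) might look like the sketch below. `staleness_multiplier` is a hypothetical stand-in for the decay term inside `compute_behavioral_adjustment()`, using the 60-day half-life from the commit message.

```python
from datetime import datetime, timedelta, timezone


def fresh_timestamp(age: timedelta = timedelta(0)) -> str:
    """ISO-8601 UTC timestamp that is `age` old at call time."""
    return (datetime.now(timezone.utc) - age).isoformat()


def staleness_multiplier(last_observed_at: str, half_life_days: float = 60.0) -> float:
    """Hypothetical stand-in: exponential decay with a 60-day half-life."""
    last = datetime.fromisoformat(last_observed_at)
    age_days = max(0.0, (datetime.now(timezone.utc) - last).total_seconds() / 86400)
    return 0.5 ** (age_days / half_life_days)
```

A test built on `fresh_timestamp()` keeps its expected multiplier stable no matter when it runs; assertions should still use a small tolerance, because a few microseconds elapse between generating the timestamp and evaluating the decay.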
📒 Files selected for processing (6)
- lib/crewai/src/crewai/agents/agent_builder/base_agent_executor.py
- lib/crewai/src/crewai/lite_agent.py
- lib/crewai/src/crewai/memory/types.py
- lib/crewai/src/crewai/memory/unified_memory.py
- lib/crewai/tests/agents/test_lite_agent.py
- lib/crewai/tests/memory/test_unified_memory.py
🚧 Files skipped from review as they are similar to previous changes (1)
- lib/crewai/src/crewai/lite_agent.py
memory._save_action_insight(
    content=insight.content,
    insight_type=insight.type,
    domain=insight.domain,
    rationale=insight.rationale,
    context_signals=insight.context_signals,
    scope="/behavioral",
    agent_role=self.agent.role,
    root_scope=(
        f"{base_root.rstrip('/')}/agent/"
        f"{sanitize_scope_name(self.agent.role or 'unknown')}"
        if isinstance(base_root, str) and base_root
        else None
    ),
)
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether all memory objects reachable from executors expose _save_action_insight.
# Expected: every concrete memory type used by BaseAgentExecutor has this method
# OR executor should call a public API instead.
set -euo pipefail
echo "== Classes defining _save_action_insight =="
rg -n --type=py '\bdef\s+_save_action_insight\s*\('
echo
echo "== Memory wrapper/public surfaces (Memory, MemoryScope, MemorySlice) =="
rg -n --type=py 'class\s+(Memory|MemoryScope|MemorySlice)\b|def\s+extract_action_insights\s*\(|def\s+remember_many\s*\(|def\s+_save_action_insight\s*\('
echo
echo "== BaseAgentExecutor/LiteAgent call sites =="
rg -n --type=py '_save_action_insight\s*\('
Repository: crewAIInc/crewAI
Length of output: 1951
🏁 Script executed:
# Read base_agent_executor.py around the flagged lines to understand context and exception handling
head -n 100 lib/crewai/src/crewai/agents/agent_builder/base_agent_executor.py | tail -n 40
Repository: crewAIInc/crewAI
Length of output: 2037
🏁 Script executed:
# Also check the full context around line 86 for the exception handler
sed -n '60,95p' lib/crewai/src/crewai/agents/agent_builder/base_agent_executor.py
Repository: crewAIInc/crewAI
Length of output: 1447
🏁 Script executed:
# Check the type of memory object passed to BaseAgentExecutor
rg -n 'self\.memory|memory:' lib/crewai/src/crewai/agents/agent_builder/base_agent_executor.py -A 2 -B 2 | head -40
Repository: crewAIInc/crewAI
Length of output: 200
🏁 Script executed:
# Check if Memory or related classes have wrapper/proxy subclasses
rg -n 'class.*Memory.*:' lib/crewai/src/crewai/memory/ --type=py | head -20
Repository: crewAIInc/crewAI
Length of output: 584
🏁 Script executed:
# Check what public APIs exist in Memory class for saving insights
sed -n '57,100p' lib/crewai/src/crewai/memory/unified_memory.py
Repository: crewAIInc/crewAI
Length of output: 1733
🏁 Script executed:
# Check MemoryScope for _save_action_insight method
rg -n '_save_action_insight' lib/crewai/src/crewai/memory/memory_scope.py
Repository: crewAIInc/crewAI
Length of output: 42
🏁 Script executed:
# Check BaseAgentExecutor for memory type annotation
sed -n '1,50p' lib/crewai/src/crewai/agents/agent_builder/base_agent_executor.py | grep -A 5 -B 5 'memory'
Repository: crewAIInc/crewAI
Length of output: 1427
🏁 Script executed:
# Check full class definition and constructor of BaseAgentExecutor
rg -n 'class BaseAgentExecutor|def __init__|self\.memory' lib/crewai/src/crewai/agents/agent_builder/base_agent_executor.py -A 2 | head -30
Repository: crewAIInc/crewAI
Length of output: 159
🏁 Script executed:
# Check if MemoryScope is used as alternative memory type
rg -n 'MemoryScope\|MemorySlice' lib/crewai/src/crewai/agents/agent_builder/base_agent_executor.py
Repository: crewAIInc/crewAI
Length of output: 42
🏁 Script executed:
# Check what memory types can be passed to agent/executor
fd -n 'agent.py' lib/crewai/src/crewai/agents/ --exec grep -l 'memory' {} \; | head -5
Repository: crewAIInc/crewAI
Length of output: 291
🏁 Script executed:
# Check MemoryScope class definition and all its methods
sed -n '38,250p' lib/crewai/src/crewai/memory/memory_scope.py | grep -E '^\s+def |class MemoryScope|class MemorySlice'
Repository: crewAIInc/crewAI
Length of output: 977
🏁 Script executed:
# Check if MemoryScope or MemorySlice have _save_action_insight
sed -n '38,300p' lib/crewai/src/crewai/memory/memory_scope.py | tail -100
Repository: crewAIInc/crewAI
Length of output: 3796
🏁 Script executed:
# Check the complete MemoryScope class for all method definitions
awk '/^class MemoryScope/,/^class / {print NR": "$0}' lib/crewai/src/crewai/memory/memory_scope.py | grep 'def ' | head -20
Repository: crewAIInc/crewAI
Length of output: 42
🏁 Script executed:
# Count lines in memory_scope.py and check structure
wc -l lib/crewai/src/crewai/memory/memory_scope.py
Repository: crewAIInc/crewAI
Length of output: 109
🏁 Script executed:
# Get all methods in MemoryScope class
rg -n '^\s+def ' lib/crewai/src/crewai/memory/memory_scope.py | head -30
Repository: crewAIInc/crewAI
Length of output: 1445
🏁 Script executed:
# Check if extract_action_insights returns objects that have a public save method
sed -n '645,680p' lib/crewai/src/crewai/memory/unified_memory.py
Repository: crewAIInc/crewAI
Length of output: 1537
🏁 Script executed:
# Check what the insight object looks like and if there's a public way to save it
rg -n 'insight\.' lib/crewai/src/crewai/agents/agent_builder/base_agent_executor.py -A 2 | head -20
Repository: crewAIInc/crewAI
Length of output: 443
🏁 Script executed:
# Check what MemoryScope.extract_action_insights returns
sed -n '174,178p' lib/crewai/src/crewai/memory/memory_scope.py
Repository: crewAIInc/crewAI
Length of output: 289
🏁 Script executed:
# Look for how _save_action_insight is implemented in Memory
sed -n '888,920p' lib/crewai/src/crewai/memory/unified_memory.py
Repository: crewAIInc/crewAI
Length of output: 1289
Replace private _save_action_insight call with public remember() API.
At line 71, the code calls the private _save_action_insight() method, which is only defined in the Memory class. If a MemoryScope or MemorySlice wrapper is used instead, this call fails with an AttributeError. The exception handler (line 86) silently logs the error and continues, dropping behavioral insights without alerting the developer.
The extract_action_insights() docstring explicitly documents using remember() with appropriate metadata instead. Use the public remember() method (available on all memory types) with metadata={"type": "action_insight", "insight_type": insight.type, "domain": insight.domain, ...} to persist insights reliably across all memory implementations.
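A sketch of what saving through the public API could look like. The `remember()` signature in `SupportsRemember` is an assumption (the real method may take more parameters, including `root_scope`), and `Insight` is a hypothetical mirror of `ActionInsightItem`. Only the metadata keys are taken from the comment above.

```python
from dataclasses import dataclass, field
from typing import Any, Optional, Protocol


class SupportsRemember(Protocol):
    """Assumed minimal surface shared by Memory, MemoryScope, and MemorySlice."""

    def remember(
        self,
        content: str,
        *,
        scope: Optional[str] = None,
        metadata: Optional[dict[str, Any]] = None,
    ) -> Any: ...


@dataclass
class Insight:
    """Hypothetical mirror of ActionInsightItem's fields."""

    content: str
    type: str
    domain: str
    rationale: str
    context_signals: list[str] = field(default_factory=list)


def save_insight(memory: SupportsRemember, insight: Insight) -> Any:
    """Persist an insight through the public remember() API only."""
    return memory.remember(
        insight.content,
        scope="/behavioral",
        metadata={
            "type": "action_insight",
            "insight_type": insight.type,
            "domain": insight.domain,
            "rationale": insight.rationale,
            "context_signals": list(insight.context_signals),
        },
    )


class RecordingMemory:
    """Test double that records remember() calls."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, Optional[str], Optional[dict[str, Any]]]] = []

    def remember(self, content, *, scope=None, metadata=None):
        self.calls.append((content, scope, metadata))
        return metadata
```

Depending only on `remember()` avoids the `AttributeError` on wrapper classes, at the cost of losing the aggregation logic that `_save_action_insight()` performs; that trade-off would need to be resolved in the PR itself.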
root_scope=(
    f"{base_root.rstrip('/')}/agent/"
    f"{sanitize_scope_name(self.agent.role or 'unknown')}"
    if isinstance(base_root, str) and base_root
    else None
),
Normalize behavioral root_scope the same way as factual memory path.
At Line 79, behavioral scope construction omits the leading-slash normalization used in the factual branch (Lines 58-59). If memory.root_scope is relative, behavioral insights can be written under a different scope tree than factual memories, causing recall partition mismatches.
Suggested fix
-    root_scope=(
-        f"{base_root.rstrip('/')}/agent/"
-        f"{sanitize_scope_name(self.agent.role or 'unknown')}"
-        if isinstance(base_root, str) and base_root
-        else None
-    ),
+    root_scope=(
+        (
+            "/" + f"{base_root.rstrip('/')}/agent/{sanitize_scope_name(self.agent.role or 'unknown')}".lstrip("/")
+        )
+        if isinstance(base_root, str) and base_root
+        else None
+    ),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/crewai/src/crewai/agents/agent_builder/base_agent_executor.py` around
lines 79 - 84, The behavioral root_scope construction at lines 79-84 is missing
the leading-slash normalization that is applied to the factual memory root_scope
at lines 58-59. Apply the same normalization pattern used in the factual branch
to the behavioral root_scope to ensure consistency. This ensures that when
memory.root_scope is relative, both behavioral insights and factual memories are
written under the same scope tree, preventing recall partition mismatches.
Update the string formatting for the behavioral root_scope to include any
leading-slash normalization that the factual memory path applies.
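The normalization requested above can be sketched in isolation. This is a minimal sketch under stated assumptions, not the PR's code: `build_agent_root_scope` is a hypothetical helper name, and the `sanitize_scope_name` defined here is a simplified stand-in for the real function, whose exact behavior is not shown in this review.

```python
import re
from typing import Optional


def sanitize_scope_name(name: str) -> str:
    """Simplified stand-in (assumption): lowercase, runs of other chars become '-'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-") or "unknown"


def build_agent_root_scope(base_root: Optional[str], role: Optional[str]) -> Optional[str]:
    """Return '/<base>/agent/<role>' with exactly one leading slash, or None."""
    if not isinstance(base_root, str) or not base_root:
        return None
    path = f"{base_root.rstrip('/')}/agent/{sanitize_scope_name(role or 'unknown')}"
    # Relative and absolute roots collapse to the same absolute form.
    return "/" + path.lstrip("/")
```

With this shape, a relative root such as `crew/research` and an absolute root such as `/crew/research/` resolve to the same scope tree, which is the property the review comment is concerned with.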
similar = self._storage.search(
    embedding,
    scope_prefix=effective_scope,
    limit=3,
    min_score=0.0,
)

# 4. Aggregate or insert
threshold = self._config.consolidation_threshold  # default 0.85
now = datetime.utcnow()

for record, sim_score in similar:
    if sim_score >= threshold:
        # Found similar insight: aggregate
        existing_meta = dict(record.metadata)
        count = existing_meta.get("observation_count", 1)
        existing_meta["observation_count"] = count + 1
        existing_meta["last_observed_at"] = now.isoformat()

        updated = record.model_copy(update={
            "last_accessed": now,
            "metadata": existing_meta,
        })
        self._storage.update(updated)
        return updated
Filter consolidation candidates to action insights of the same kind.
The search scans every record in the scope, then aggregates the first record above the similarity threshold without checking metadata.type, insight_type, or domain. A similar factual record, or a different behavioral insight kind, can receive action-insight counters while keeping the old content/type metadata.
🐛 Proposed fix
for record, sim_score in similar:
+ if record.metadata.get("type") != "action_insight":
+ continue
+ if record.metadata.get("insight_type") != insight_type:
+ continue
+ if domain and record.metadata.get("domain") != domain:
+ continue
if sim_score >= threshold:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/crewai/src/crewai/memory/unified_memory.py` around lines 930 - 954, In
the consolidation logic where you iterate through similar records and check if
sim_score >= threshold, add an additional filter to verify that the record being
compared has the same insight kind before aggregating. The current code
aggregates the first record above the similarity threshold without checking if
metadata.type, insight_type, or domain match the current record type. Before the
aggregation occurs (before updating the record with updated observation counts),
add a condition to compare the insight types from both records' metadata to
ensure you are only consolidating insights of the same kind. This prevents
factual records from being merged with action insights or other mismatched
insight types.
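The candidate filter described above can be exercised on its own. The following is a minimal sketch that assumes candidates are plain `(metadata, score)` pairs rather than the PR's `MemoryRecord` objects; `pick_consolidation_target` is a hypothetical name used only for illustration.

```python
from typing import Any, Optional

# (metadata, similarity score) pairs, highest similarity first (assumption).
Candidate = tuple[dict[str, Any], float]


def pick_consolidation_target(
    similar: list[Candidate],
    insight_type: str,
    domain: Optional[str],
    threshold: float = 0.85,
) -> Optional[dict[str, Any]]:
    """Return the first candidate that is the same kind of insight and similar enough."""
    for metadata, score in similar:
        if metadata.get("type") != "action_insight":
            continue  # factual records never receive observation counters
        if metadata.get("insight_type") != insight_type:
            continue  # e.g. a "lesson" is not merged into a "decision"
        if domain and metadata.get("domain") != domain:
            continue
        if score >= threshold:
            return metadata
    return None
```

Returning `None` corresponds to the "no similar insight" branch, where a new record is inserted instead of an existing one being updated.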
similar = self._storage.search(
    embedding,
    scope_prefix=effective_scope,
    limit=3,
    min_score=0.0,
)

# 4. Aggregate or insert
threshold = self._config.consolidation_threshold  # default 0.85
now = datetime.utcnow()

for record, sim_score in similar:
    if sim_score >= threshold:
        # Found similar insight: aggregate
        existing_meta = dict(record.metadata)
        count = existing_meta.get("observation_count", 1)
        existing_meta["observation_count"] = count + 1
        existing_meta["last_observed_at"] = now.isoformat()

        updated = record.model_copy(update={
            "last_accessed": now,
            "metadata": existing_meta,
        })
        self._storage.update(updated)
        return updated

# 5. No similar insight: insert new
metadata: dict[str, Any] = {
    "type": "action_insight",
    "insight_type": insight_type,
    "domain": domain,
    "rationale": rationale,
    "context_signals": context_signals,
    "observation_count": 1,
    "first_observed_at": now.isoformat(),
    "last_observed_at": now.isoformat(),
}
record = MemoryRecord(
    content=content,
    scope=effective_scope,
    categories=[insight_type, domain],
    metadata=metadata,
    importance=0.5,
    embedding=embedding,
)
self._storage.save([record])
Serialize search-update-save aggregation through the existing save pool.
remember() routes writes through _submit_save() to avoid races, but this method performs search/update/save directly. Concurrent saves of the same insight can both miss each other or overwrite observation_count, producing duplicates or lost observations.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/crewai/src/crewai/memory/unified_memory.py` around lines 930 - 975, The
search-update-save aggregation in the insight consolidation block (where similar
records are searched and either updated via self._storage.update() or new
records are saved via self._storage.save()) performs writes directly on the
storage layer without serialization. To prevent race conditions where concurrent
saves of the same insight can create duplicates or lose observation counts,
refactor this logic to route the entire search-update-save operation through the
existing _submit_save() mechanism instead of calling storage methods directly.
This ensures the aggregation is serialized and atomic, matching the pattern used
elsewhere in remember().
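To illustrate why serializing the whole read-modify-write step matters, here is a toy sketch. How `_submit_save()` works internally is not shown in this review, so the single-worker `ThreadPoolExecutor` below is an assumption standing in for it; `InsightStore` and `hammer` are hypothetical names.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class InsightStore:
    """Toy store that keeps one observation counter per insight key."""

    def __init__(self) -> None:
        self._counts: dict[str, int] = {}
        # A single worker runs submitted jobs one at a time, in submission order.
        self._pool = ThreadPoolExecutor(max_workers=1)

    def _aggregate_or_insert(self, key: str) -> int:
        # Read-modify-write: correct only because the pool serializes calls.
        count = self._counts.get(key, 0) + 1
        self._counts[key] = count
        return count

    def save(self, key: str) -> Future:
        # Callers never touch the dict directly; they queue the whole operation.
        return self._pool.submit(self._aggregate_or_insert, key)

    def count(self, key: str) -> int:
        return self._counts.get(key, 0)

    def close(self) -> None:
        self._pool.shutdown(wait=True)


def hammer(store: InsightStore, key: str, workers: int, per_worker: int) -> None:
    """Save the same insight from several threads and wait for every write."""
    def worker() -> None:
        for future in [store.save(key) for _ in range(per_worker)]:
            future.result()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The point of the design is that the search, the counter update, and the insert all run inside one queued job, so two concurrent saves of the same insight cannot both observe "not found".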
metadata: dict[str, Any] = {
    "type": "action_insight",
    "insight_type": insight_type,
    "domain": domain,
    "rationale": rationale,
    "context_signals": context_signals,
    "observation_count": 1,
    "first_observed_at": now.isoformat(),
    "last_observed_at": now.isoformat(),
}
Persist agent_role when provided.
The method accepts agent_role, but new-record metadata drops it, so saved behavioral insights cannot be attributed or filtered by the agent role that produced them.
🐛 Proposed fix
metadata: dict[str, Any] = {
"type": "action_insight",
"insight_type": insight_type,
"domain": domain,
"rationale": rationale,
"context_signals": context_signals,
"observation_count": 1,
"first_observed_at": now.isoformat(),
"last_observed_at": now.isoformat(),
}
+ if agent_role is not None:
+ metadata["agent_role"] = agent_role
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/crewai/src/crewai/memory/unified_memory.py` around lines 957 - 966, The
metadata dictionary being constructed for action insights is missing the
agent_role parameter, which is accepted by the method but not persisted. Add the
agent_role to the metadata dictionary (the one with "type": "action_insight") to
ensure saved behavioral insights can be properly attributed and filtered by the
agent role that produced them. Include agent_role in the metadata dict alongside
the existing fields like insight_type, domain, rationale, and the timestamp
fields.
…ep recall
- Add module-level logger to Memory class (self._logger was undefined)
- Fix fromisoformat Z suffix → TypeError when subtracting naive utcnow()
- Add read_only guard to _save_action_insight()
- Add self.root_scope fallback to _save_action_insight()
- Apply behavioral adjustment in deep recall path too
- Add _save_action_insight delegation on MemoryScope and MemorySlice
- Fix test timestamps to avoid staleness interference
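The timestamp issue named in this commit message comes from mixing timezone-aware and naive datetimes: parsing a `Z`-suffixed string yields an aware value, and subtracting a naive `utcnow()` from it raises `TypeError`. One way to avoid the mismatch is sketched below. It is not necessarily how the commit fixes it; `parse_utc` and `age_in_days` are hypothetical helpers, and treating naive timestamps as UTC is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 string into a timezone-aware UTC datetime."""
    # Python versions before 3.11 reject a trailing 'Z' in fromisoformat().
    if timestamp.endswith("Z"):
        timestamp = timestamp[:-1] + "+00:00"
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Assumption: naive timestamps were written in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def age_in_days(last_observed_at: str, now: Optional[datetime] = None) -> float:
    """Days elapsed since the stored timestamp, using aware datetimes on both sides."""
    current = now if now is not None else datetime.now(timezone.utc)
    return (current - parse_utc(last_observed_at)).total_seconds() / 86400.0
```

Keeping both operands aware means the subtraction is well defined regardless of whether the stored string carried a `Z`, an explicit offset, or no offset at all.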
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/memory/unified_memory.py`:
- Around line 776-783: The deep recall path does not re-sort results after
applying behavioral adjustments via compute_behavioral_adjustment, while the
shallow recall path does. After the loop that modifies m.score for
action_insight records (where m.score *= adj is applied), add a re-sort
operation on the results list sorted by score in descending order to maintain
consistent ordering that reflects the adjusted relevance scores, similar to the
pattern used in the shallow recall path around line 755.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: f0c8eb29-2897-4868-8672-1384d5ecf75a
📒 Files selected for processing (4)
- lib/crewai/src/crewai/memory/memory_scope.py
- lib/crewai/src/crewai/memory/types.py
- lib/crewai/src/crewai/memory/unified_memory.py
- lib/crewai/tests/memory/test_unified_memory.py
🚧 Files skipped from review as they are similar to previous changes (3)
- lib/crewai/src/crewai/memory/types.py
- lib/crewai/src/crewai/memory/memory_scope.py
- lib/crewai/tests/memory/test_unified_memory.py
# Apply behavioral adjustment to action insights in deep recall too
for m in results:
    if m.record.metadata.get("type") == "action_insight":
        adj, adj_reasons = compute_behavioral_adjustment(
            m.record.metadata, self._config,
        )
        m.score *= adj
        m.match_reasons.extend(adj_reasons)
Re-sort results after behavioral adjustment to maintain ordering invariant.
The shallow recall path (line 755) sorts results after applying behavioral adjustments, but the deep recall path does not. If compute_behavioral_adjustment significantly alters scores, the final ordering may not reflect the adjusted relevance.
🐛 Proposed fix to add re-sort after adjustment
for m in results:
if m.record.metadata.get("type") == "action_insight":
adj, adj_reasons = compute_behavioral_adjustment(
m.record.metadata, self._config,
)
m.score *= adj
m.match_reasons.extend(adj_reasons)
+ results.sort(key=lambda m: m.score, reverse=True)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/crewai/src/crewai/memory/unified_memory.py` around lines 776 - 783, The
deep recall path does not re-sort results after applying behavioral adjustments
via compute_behavioral_adjustment, while the shallow recall path does. After the
loop that modifies m.score for action_insight records (where m.score *= adj is
applied), add a re-sort operation on the results list sorted by score in
descending order to maintain consistent ordering that reflects the adjusted
relevance scores, similar to the pattern used in the shallow recall path around
line 755.
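A small self-contained sketch shows how the ordering can go stale when scores are adjusted in place. The `Match` dataclass and the boost rule in `compute_behavioral_adjustment` below are hypothetical stand-ins; the PR's real function takes the record metadata and the memory config, and its scoring rule is not shown in this review.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Match:
    content: str
    score: float
    metadata: dict[str, Any] = field(default_factory=dict)
    match_reasons: list[str] = field(default_factory=list)


def compute_behavioral_adjustment(metadata: dict[str, Any]) -> tuple[float, list[str]]:
    """Hypothetical rule: boost insights observed at least three times."""
    if metadata.get("observation_count", 1) >= 3:
        return 1.5, ["frequently observed"]
    return 1.0, []


def adjust_and_rank(results: list[Match]) -> list[Match]:
    """Apply the adjustment to action insights, then restore descending order."""
    for m in results:
        if m.metadata.get("type") == "action_insight":
            adj, reasons = compute_behavioral_adjustment(m.metadata)
            m.score *= adj
            m.match_reasons.extend(reasons)
    # Without this sort the list keeps its pre-adjustment order.
    results.sort(key=lambda m: m.score, reverse=True)
    return results
```

In this sketch a factual match at 0.8 initially outranks an insight at 0.6; after the 1.5x boost the insight scores about 0.9 and only moves to the front because of the final sort.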
Behavioral memory from ReAct chain execution is a powerful extension — capturing how an agent solves problems (tool selection patterns, retry strategies, common failure modes) gives you a lightweight form of meta-learning without retraining. A few design considerations:
Implementation paths:
For provenance (linking back to specific tasks), embedding task IDs into the behavioral memory records lets you trace why a pattern was learned and validate it against historical outcomes.
The #6159 feature request ties into this nicely: capturing action patterns is a natural complement to entity/event memory. Together they give you "what happened" + "how we responded."
Part of the memory systems conversation. SwarmAI. Discussion: T-MEM
Summary
Add behavioral memory extraction — a new API to extract professional domain experience from ReAct execution traces. This is PR #1 of the Behavioral Memory feature (3-PR split).
Closes #6159
What's added
New models (analyze.py)
- ActionInsightItem: structured insight with type (decision/lesson/pattern), content, rationale, domain, and context_signals
- ExtractedActionInsights: LLM response wrapper
- extract_action_insights_from_content(): pure extraction helper, best-effort (empty list on failure)
Public API (unified_memory.py)
- Memory.extract_action_insights() + async aextract_action_insights()
- MemoryScope / MemorySlice delegation (memory_scope.py)
- Flow.extract_action_insights() (flow/runtime/__init__.py)
Prompt (en.json)
- extract_action_insights_system: accuracy-first extraction criteria:
- extract_action_insights_user: user prompt template
Tests (test_unified_memory.py)
Design principles
- metadata.type="action_insight" on existing MemoryRecord
PR split
- _save_to_memory() path (store with metadata)
Part of #6159
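The best-effort contract described for the extraction helper (an empty list on any failure) can be sketched as follows. This is an illustration only: the PR uses Pydantic models and an LLM call with a response model, whereas this sketch uses a dataclass stand-in, a hypothetical `call_llm` callable that returns JSON text, and assumed field names (`insight_type` in particular, along with the `"insights"` key).

```python
import json
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ActionInsightItem:
    insight_type: str  # "decision", "lesson", or "pattern"
    content: str
    rationale: str = ""
    domain: str = ""
    context_signals: list[str] = field(default_factory=list)


def extract_action_insights(
    content: str, call_llm: Callable[[str], str]
) -> list[ActionInsightItem]:
    """Return parsed insights, or an empty list on empty input or any failure."""
    if not content.strip():
        return []
    try:
        payload = json.loads(call_llm(content))
        return [ActionInsightItem(**item) for item in payload["insights"]]
    except Exception:
        # Best-effort: an LLM error or malformed output yields no insights.
        return []
```

Swallowing every exception keeps extraction from interrupting the agent's main task, at the cost of hiding failures unless they are logged separately.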
Summary by CodeRabbit