add Mimir as persistent cross-session memory backend#6208
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughRewrites ChangesMimir Memory Backend via MCP
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (3)
lib/crewai/src/crewai/memory/storage/mimir_storage.py (1)
88-103: ⚡ Quick winAsync delegates block the event loop.
asave,asearch, andadeletedirectly call their synchronous counterparts, which perform blocking I/O. In an async context this will block the entire event loop.Consider wrapping in
asyncio.to_thread(Python 3.9+) or usingloop.run_in_executor.♻️ Proposed fix using asyncio.to_thread
+import asyncio + # --- Async implementations required by CrewAI architecture --- async def asave(self, records: list[MemoryRecord]) -> None: - self.save(records) + await asyncio.to_thread(self.save, records) async def asearch( self, query_embedding: list[float], scope_prefix: str | None = None, limit: int = 10, min_score: float = 0.0, **kwargs: Any ) -> list[tuple[MemoryRecord, float]]: - return self.search(query_embedding, scope_prefix, limit, min_score, **kwargs) + return await asyncio.to_thread( + self.search, query_embedding, scope_prefix, limit, min_score, **kwargs + ) async def adelete(self, record_ids: list[str] | None = None, **kwargs: Any) -> int: - return self.delete(record_ids, **kwargs) + return await asyncio.to_thread(self.delete, record_ids, **kwargs)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py` around lines 88 - 103, The async methods asave, asearch, and adelete in the MimirStorage class are blocking the event loop by directly calling their synchronous counterparts which perform blocking I/O operations. Refactor these three methods to wrap the synchronous calls using asyncio.to_thread, which will execute the blocking operations in a thread pool rather than blocking the main event loop. Import asyncio at the top of the file if not already present, then modify each async method to await the asyncio.to_thread call with the corresponding synchronous method and its arguments.lib/crewai/src/crewai/memory/storage/factory.py (2)
54-58: 💤 Low valueComments are in Italian; use English for consistency.
♻️ Proposed fix
- # 1. Se l'utente chiede esplicitamente "mimir", restituiamo il nostro nuovo backend + # 1. If the user explicitly requests "mimir", return the new backend if spec == "mimir": return MimirStorage() - # 2. Altrimenti, seguiamo il flusso normale della factory personalizzata + # 2. Otherwise, follow the normal custom factory flow🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/memory/storage/factory.py` around lines 54 - 58, The comments in the factory storage creation method are written in Italian, which breaks consistency with the rest of the codebase that should use English. Replace the Italian comments with their English equivalents: translate "Se l'utente chiede esplicitamente "mimir", restituiamo il nostro nuovo backend" to explain that if the user explicitly requests "mimir", return the new MimirStorage backend, and translate "Altrimenti, seguiamo il flusso normale della factory personalizzata" to explain the fallback to the normal custom factory flow. Ensure all comments are in English for code consistency.
55-56: ⚖️ Poor tradeoff
MimirStorage()instantiated without configuration options.The factory always returns a default-configured
MimirStorage(). Users cannot specify a custompathor other options via the spec string. Consider supporting a dict/object spec or environment variable fallback for configurability.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/memory/storage/factory.py` around lines 55 - 56, The MimirStorage() instantiation in the factory method does not accept any configuration options, preventing users from customizing the path or other settings. Modify the code to parse the spec parameter to extract configuration details (such as path), check for environment variable fallbacks when spec doesn't provide configuration, and pass these extracted configuration options as arguments to the MimirStorage() constructor to support customizable storage initialization.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/memory/storage/factory.py`:
- Line 18: The top-level import of MimirStorage creates a hard dependency on the
mimir_client package for all users regardless of whether they use the mimir
storage type. Remove the unconditional import statement for MimirStorage from
the module-level imports and move it inside the factory method's conditional
branch that specifically handles the "mimir" storage type, so the import only
occurs when actually needed for lazy loading.
In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py`:
- Around line 33-46: The metadata variable is assigned directly from
record.metadata or an empty dict, and then mutated in-place with the update
call. When record.metadata is not None, this mutates the original dict and can
cause unwanted side effects for the caller. Create a shallow copy of the
metadata dictionary (either from record.metadata or an empty dict) before
calling the update method to ensure the original record.metadata remains
unchanged.
- Around line 77-86: The delete() method in the Mimir storage backend only
processes the record_ids parameter and ignores other filter parameters like
scope_prefix, categories, older_than, and metadata_filter that are passed
through **kwargs, causing silent failures where deletion operations return 0
without actually performing the requested filtering. Modify the delete() method
to either implement support for these filter parameters by passing them to the
client.forget() call, or explicitly raise NotImplementedError with a clear
message if these unsupported parameters are provided in **kwargs, so callers
receive immediate feedback instead of silent failures.
- Around line 15-26: The `__init__` method in the MimirStorage class accepts a
`table_name` parameter with a default value and assigns it to
`self._table_name`, but this parameter is never used anywhere in the class and
MimirClient does not support table configuration at the constructor level.
Remove the `table_name` parameter from the `__init__` method signature (keeping
only `path` and `**kwargs`) and delete the line that assigns `self._table_name =
table_name`.
- Around line 48-75: The search method is calling the non-existent
self.client.recall() method, which will fail at runtime. Replace this call with
the correct mimir-client API method: use
self.client.search_semantic(query_vector=query_embedding, limit=limit) for
embedding-based search or self.client.search_hybrid(query=...,
query_vector=query_embedding, limit=limit) for combined text and embedding
search. Pass the query_embedding parameter (which is currently being ignored) as
the required query_vector argument to the search method. Ensure the response
parsing correctly handles the actual structure returned by the mimir-client
search methods, as the current row structure may not match the expected format.
Apply similar corrections to the save() method (replace self.client.remember()
with appropriate artifact creation methods) and delete() method (replace
self.client.forget() with appropriate deletion methods).
---
Nitpick comments:
In `@lib/crewai/src/crewai/memory/storage/factory.py`:
- Around line 54-58: The comments in the factory storage creation method are
written in Italian, which breaks consistency with the rest of the codebase that
should use English. Replace the Italian comments with their English equivalents:
translate "Se l'utente chiede esplicitamente "mimir", restituiamo il nostro
nuovo backend" to explain that if the user explicitly requests "mimir", return
the new MimirStorage backend, and translate "Altrimenti, seguiamo il flusso
normale della factory personalizzata" to explain the fallback to the normal
custom factory flow. Ensure all comments are in English for code consistency.
- Around line 55-56: The MimirStorage() instantiation in the factory method does
not accept any configuration options, preventing users from customizing the path
or other settings. Modify the code to parse the spec parameter to extract
configuration details (such as path), check for environment variable fallbacks
when spec doesn't provide configuration, and pass these extracted configuration
options as arguments to the MimirStorage() constructor to support customizable
storage initialization.
In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py`:
- Around line 88-103: The async methods asave, asearch, and adelete in the
MimirStorage class are blocking the event loop by directly calling their
synchronous counterparts which perform blocking I/O operations. Refactor these
three methods to wrap the synchronous calls using asyncio.to_thread, which will
execute the blocking operations in a thread pool rather than blocking the main
event loop. Import asyncio at the top of the file if not already present, then
modify each async method to await the asyncio.to_thread call with the
corresponding synchronous method and its arguments.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 587782d0-4b0e-48df-b292-5845dffb0f31
📒 Files selected for processing (2)
lib/crewai/src/crewai/memory/storage/factory.pylib/crewai/src/crewai/memory/storage/mimir_storage.py
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/memory/storage/factory.py`:
- Around line 21-22: The import statement on line 21 uses an incorrect
filesystem-based path (lib.crewai.src.crewai.memory.storage.mimir_storage)
instead of a proper Python module path. This causes a ModuleNotFoundError at
runtime. Additionally, it creates a hard dependency on mimir-client for all
users regardless of whether they need it. Remove the entire import statement for
MimirStorage from line 21 since the correct lazy import pattern already exists
on line 57 within a conditional check.
In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py`:
- Line 7: Update the method signatures in the MimirStorage class to conform to
the StorageBackend Protocol definition. Replace the current incompatible
parameters (value, query, key) with the Protocol-required parameters: the save()
method should accept records: list[MemoryRecord] and return None, the search()
method should accept query_embedding: list[float] along with optional
scope_prefix: str | None and return list[tuple[MemoryRecord, float]], and the
delete() method should accept optional scope_prefix: str | None and record_ids:
list[str] | None, returning an int. Update the internal implementation of each
method to work with these new signatures so that calls from unified_memory.py
with the expected parameters will execute correctly at runtime.
- Line 67: The delete method in the mimir_storage.py file currently returns
None, but the StorageBackend Protocol expects it to return an int representing
the count of deleted records. Update the return type annotation of the delete
method from None to int, and modify the implementation to track and return the
number of records that were actually deleted instead of returning nothing. This
will ensure consistency with the protocol contract and allow callers like
encoding_flow.py to properly use the deletion count for logging and metrics.
- Around line 63-65: The exception handling in the search method of MimirStorage
class is catching all exceptions and returning an empty list, which masks
failures from the caller in unified_memory.py. Instead of silently returning an
empty list after logging the error, re-raise the exception after logging it (or
remove the return statement entirely so the exception propagates naturally).
This allows callers to distinguish between a genuine empty result and an actual
error condition.
- Around line 36-41: The create_artifact method call in the save method is using
an incorrect payload parameter instead of the required keyword arguments.
Replace the create_artifact call to pass artifact_type (such as "memory"),
content (using str(value)), and metadata (using clean_metadata) as separate
keyword arguments instead of a single payload parameter. Add the await keyword
before the create_artifact call since it is an async method. Additionally, mark
the save method itself as async since it now contains an await statement.
- Around line 49-52: The search_semantic method in the self.client object
expects a pre-computed embedding vector passed as the query_vector parameter
(type list[float]), not raw text via query_text. You need to either generate the
embedding from the query text before calling search_semantic using an embedding
service or the mimir-semantic package and pass it as query_vector, or
alternatively replace the search_semantic call entirely with search_fulltext if
you only need text-based search or search_hybrid if you need combined text and
vector search. Remove the unsupported query_text parameter and provide the
correct parameter based on which search method you choose.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 26811172-eb3e-41e4-bec6-509dd4910b99
📒 Files selected for processing (2)
lib/crewai/src/crewai/memory/storage/factory.pylib/crewai/src/crewai/memory/storage/mimir_storage.py
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
| class MimirStorage(StorageBackend): |
There was a problem hiding this comment.
MimirStorage declares StorageBackend conformance but method signatures are incompatible.
The StorageBackend Protocol (from backend.py:45-100) defines:
save(self, records: list[MemoryRecord]) -> Nonesearch(self, query_embedding: list[float], scope_prefix: str | None = None, ...) -> list[tuple[MemoryRecord, float]]delete(self, scope_prefix: str | None = None, ..., record_ids: list[str] | None = None, ...) -> int
This implementation uses entirely different signatures (value/query/key parameters, different return types). When unified_memory.py calls self._storage.search(embedding, scope_prefix=..., categories=..., limit=..., min_score=...), the call will fail because this method expects (query: str, limit, filter, score_threshold).
The backend must conform to the Protocol signatures or callers will break at runtime.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py` at line 7, Update the
method signatures in the MimirStorage class to conform to the StorageBackend
Protocol definition. Replace the current incompatible parameters (value, query,
key) with the Protocol-required parameters: the save() method should accept
records: list[MemoryRecord] and return None, the search() method should accept
query_embedding: list[float] along with optional scope_prefix: str | None and
return list[tuple[MemoryRecord, float]], and the delete() method should accept
optional scope_prefix: str | None and record_ids: list[str] | None, returning an
int. Update the internal implementation of each method to work with these new
signatures so that calls from unified_memory.py with the expected parameters
will execute correctly at runtime.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py`:
- Around line 45-54: The search_fulltext() method is asynchronous but is being
called without the await keyword on line 47, which causes it to return a
coroutine object instead of actual search results. To fix this, either make the
calling method async and add await before the self.client.search_fulltext()
call, or alternatively switch to using MimirSyncClient instead of MimirClient in
the __init__ method to use the synchronous version of the search_fulltext()
method that does not require await.
- Around line 69-77: The delete_artifact() method being called in the loop for
each r_id is async but lacks an await keyword on line 73, meaning the coroutine
is created but never executed and deletions never occur. Make the method
containing this code (the one with the try-except block handling record
deletion) an async method by adding the async keyword to its definition, then
add await before each self.client.delete_artifact() call to properly wait for
the async operation to complete before incrementing deleted_count.
- Around line 22-38: The MimirStorage.save() method is declared as async but
violates the StorageBackend protocol which expects save() to be synchronous and
accept records: list[MemoryRecord] as the parameter. Replace the async save()
method with async def asave() using the correct signature that accepts records:
list[MemoryRecord], then iterate through the records to extract their values and
metadata before passing them to self.client.create_artifact(). If synchronous
callers need support, implement a separate synchronous save() wrapper that calls
asave() appropriately, or ensure all callers use await with asave().
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: ae9974b3-4fe2-48f3-85a1-ec30c8be1443
📒 Files selected for processing (2)
lib/crewai/src/crewai/memory/storage/factory.pylib/crewai/src/crewai/memory/storage/mimir_storage.py
🚧 Files skipped from review as they are similar to previous changes (1)
- lib/crewai/src/crewai/memory/storage/factory.py
| # Diventa ASYNC per supportare l'SDK di Mimir (Fix Immagine 4) | ||
| async def save(self, value: Any, metadata: Optional[Dict[str, Any]] = None, agent: Optional[str] = None) -> None: | ||
| """Saves a value to the Mimir storage using async artifact creation.""" | ||
| clean_metadata = dict(metadata) if metadata else {} | ||
| if agent: | ||
| clean_metadata["agent"] = agent | ||
|
|
||
| try: | ||
| # Passiamo i parametri corretti richiesti esplicitamente dall'SDK (Fix Immagine 4) | ||
| await self.client.create_artifact( | ||
| artifact_type="memory", | ||
| content=str(value), | ||
| metadata=clean_metadata | ||
| ) | ||
| except Exception as e: | ||
| logger.error(f"Error saving to MimirStorage: {e}") | ||
| raise e |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify how UnifiedMemory calls the storage save method
rg -n "self._storage\.save" lib/crewai/src/crewai/memory/ -B 2 -A 2Repository: crewAIInc/crewAI
Length of output: 518
🏁 Script executed:
# Find and inspect StorageBackend protocol definition
fd -name "*.py" -path "*/memory/*" | head -20Repository: crewAIInc/crewAI
Length of output: 291
🏁 Script executed:
# Look for StorageBackend or storage interface definitions
rg -n "class StorageBackend|def save" lib/crewai/src/crewai/memory/ -B 2 -A 3 | head -50Repository: crewAIInc/crewAI
Length of output: 2551
🏁 Script executed:
# Check if encoding_flow.py is async-capable
rg -n "async def|await" lib/crewai/src/crewai/memory/encoding_flow.py | head -20Repository: crewAIInc/crewAI
Length of output: 42
🏁 Script executed:
# Look for the storage backend interface/protocol
rg -n "StorageBackend|BaseStorage" lib/crewai/src/crewai/memory/storage/ -A 5Repository: crewAIInc/crewAI
Length of output: 4115
🏁 Script executed:
# Check the full save method signature in mimir_storage.py
cat lib/crewai/src/crewai/memory/storage/mimir_storage.pyRepository: crewAIInc/crewAI
Length of output: 3510
🏁 Script executed:
# Check backend.py to see the full StorageBackend protocol
cat lib/crewai/src/crewai/memory/storage/backend.pyRepository: crewAIInc/crewAI
Length of output: 7112
async def save() violates the StorageBackend protocol and will silently fail when called without await.
The StorageBackend protocol defines save() as synchronous, but MimirStorage.save() is declared async. Additionally, the parameter signature is completely incompatible: the protocol expects records: list[MemoryRecord], but MimirStorage.save() takes value: Any, metadata, agent.
When encoding_flow.py calls self._storage.save(records) at line 476 without await, it receives a coroutine object that is never executed. The save operation silently never happens—no exception, no warning.
Use the protocol's async def asave() method instead. Implement asave() with the correct signature (records: list[MemoryRecord]) and keep save() as a synchronous wrapper if synchronous calls are needed, or remove save() if all callers can be made async-aware.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py` around lines 22 - 38,
The MimirStorage.save() method is declared as async but violates the
StorageBackend protocol which expects save() to be synchronous and accept
records: list[MemoryRecord] as the parameter. Replace the async save() method
with async def asave() using the correct signature that accepts records:
list[MemoryRecord], then iterate through the records to extract their values and
metadata before passing them to self.client.create_artifact(). If synchronous
callers need support, implement a separate synchronous save() wrapper that calls
asave() appropriately, or ensure all callers use await with asave().
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
lib/crewai/src/crewai/memory/storage/mimir_storage.py (3)
65-71:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winEdge case:
filter={}ignoreskeyand deletes nothing.When
filteris an empty dict{}, the condition on line 67 evaluatesfilteras truthy, sorecord_ids = filter.get("record_ids")returnsNone. Theif record_ids:check then fails, resulting in zero deletions while thekeyparameter is completely ignored.Consider treating an empty filter the same as
None:- record_ids = filter.get("record_ids") if filter else [key] + record_ids = filter.get("record_ids") if filter and filter.get("record_ids") else [key]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py` around lines 65 - 71, The issue is that when filter is an empty dictionary {}, the condition filter evaluates to truthy, but filter.get("record_ids") returns None since the key doesn't exist, causing the if record_ids check to fail and the key parameter to be ignored entirely. Fix this by modifying the logic to treat an empty filter the same as a None filter, so the fallback to [key] is used in both cases. This can be done by checking whether filter exists and contains the record_ids key before using it, or by explicitly checking if filter is empty and defaulting to None.
44-52:⚠️ Potential issue | 🔴 CriticalFix iteration over
search_fulltext()return value.The
search_fulltext()method returns aSearchResponsePydantic model with aresultsfield (list ofSearchResultobjects), not an iterable list directly. The code must iterate overresults.resultsinstead ofresults:try: response = self.client.search_fulltext(query=query, limit=limit) formatted_results = [] for res in response.results: if hasattr(res, 'score') and res.score < score_threshold: continue formatted_results.append(getattr(res, 'content', str(res))) return formatted_results🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py` around lines 44 - 52, In the try block where search_fulltext() is called, the code incorrectly iterates directly over the results variable. The search_fulltext() method returns a SearchResponse Pydantic model object, not a list directly. Change the code to store the response in a variable (e.g., response), then iterate over response.results instead of iterating over results directly. This applies to the iteration at the beginning of the for loop that checks hasattr(res, 'score').
69-71:⚠️ Potential issue | 🔴 CriticalThe
delete_artifact()method no longer exists in mimir-client (removed as of v5.0.0).Artifacts are now append-only in mimir-client, and the
delete_artifactmethod was removed from the public API. This code will fail at runtime when used with current versions of mimir-client. Either:
- Update the implementation to use
delete_tenant()for cleanup via cascade operations, or- Document that MimirStorage deletion is not supported with current mimir-client versions and remove/disable the delete() method.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py` around lines 69 - 71, The self.client.delete_artifact() method call no longer exists in mimir-client v5.0.0 and artifacts are now append-only. Replace the loop iterating over record_ids in the delete method with either a call to self.client.delete_tenant() for cascade deletion operations, or alternatively document that MimirStorage deletion is not supported with current mimir-client versions and disable the entire delete() method by removing or commenting out the implementation while keeping the method signature. Update any references to deleted_count accordingly based on which approach is chosen.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py`:
- Around line 20-21: Fix the typo in the comment above the MimirSyncClient
initialization. Change "Inization" to "Initialization" in the comment that
precedes the self.client assignment. This is a simple spelling correction in the
comment line that describes the synchronous code initialization for
MimirSyncClient.
- Around line 10-21: In the __init__ method of the MimirStorage class, fix the
typo in the comment "Inization" to "Initialization". Additionally, before
passing self.config as keyword arguments to MimirSyncClient, filter or validate
the config dictionary to ensure it only contains the supported parameters
(api_url, tenant, timeout). Extract these expected keys from self.config and
pass only those to the MimirSyncClient constructor to prevent TypeError from
unexpected parameters.
---
Outside diff comments:
In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py`:
- Around line 65-71: The issue is that when filter is an empty dictionary {},
the condition filter evaluates to truthy, but filter.get("record_ids") returns
None since the key doesn't exist, causing the if record_ids check to fail and
the key parameter to be ignored entirely. Fix this by modifying the logic to
treat an empty filter the same as a None filter, so the fallback to [key] is
used in both cases. This can be done by checking whether filter exists and
contains the record_ids key before using it, or by explicitly checking if filter
is empty and defaulting to None.
- Around line 44-52: In the try block where search_fulltext() is called, the
code incorrectly iterates directly over the results variable. The
search_fulltext() method returns a SearchResponse Pydantic model object, not a
list directly. Change the code to store the response in a variable (e.g.,
response), then iterate over response.results instead of iterating over results
directly. This applies to the iteration at the beginning of the for loop that
checks hasattr(res, 'score').
- Around line 69-71: The self.client.delete_artifact() method call no longer
exists in mimir-client v5.0.0 and artifacts are now append-only. Replace the
loop iterating over record_ids in the delete method with either a call to
self.client.delete_tenant() for cascade deletion operations, or alternatively
document that MimirStorage deletion is not supported with current mimir-client
versions and disable the entire delete() method by removing or commenting out
the implementation while keeping the method signature. Update any references to
deleted_count accordingly based on which approach is chosen.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: b70aa63a-ecf7-484b-b4f2-4e78ba5a8ae9
📒 Files selected for processing (1)
lib/crewai/src/crewai/memory/storage/mimir_storage.py
tcconnally
left a comment
There was a problem hiding this comment.
This is excellent — thank you for picking this up @Jacopos311!
A few notes from the Mimir side:
-
Repo has moved: Mimir now lives at
Perseus-Computing-LLC/mimir(transferred fromtcconnally/mimir). Please update any references in docs/README. -
mimir_clientSDK: I don't see this on PyPI yet. Mimir is primarily accessed via MCP tools (stdin/stdio or HTTP). The standard integration path would be:- MCP client in Python (e.g.,
mcppackage) connecting to the Mimir binary - Or direct SQLite reads (Mimir's DB is a standard
.dbfile)
Happy to help define the right client interface — Mimir exposes 29 MCP tools includingmimir_remember,mimir_recall,mimir_search_memories,mimir_forget, etc.
- MCP client in Python (e.g.,
-
The
save/search/deletesignatures look correct for the CrewAIStorageBackendinterface. Thesearch_fulltextmethod name mapped tomimir_recall/mimir_search_memorieswould be the right Mimir MCP call. -
Score threshold: Mimir uses confidence decay (0.0–1.0), not raw distance scores. The
score_thresholdfilter approach works well here.
Happy to collaborate on this — Mimir is MIT-licensed and we'd love to see it become a first-class CrewAI memory backend. Let me know if you need any help with the Mimir API or client integration!
|
Our Mimir (
Recommended integration pathInstead of import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
class MimirStorage(StorageBackend):
def __init__(self, config=None):
self.config = config or {}
# Launch Mimir as a subprocess and connect via MCP stdio
self.server_params = StdioServerParams(
command="mimir",
args=["--db-path", self.config.get("db_path", "~/.mimir/mimir.db")]
)
async def _call_tool(self, name, args):
async with stdio_client(self.server_params) as (read, write):
async with ClientSession(read, write) as session:
await session.initialize()
result = await session.call_tool(name, args)
return result
def save(self, value, metadata=None, agent=None):
return asyncio.run(self._call_tool("mimir_remember", {
"category": "crewai_memory",
"key": f"memory_{hash(value)}",
"body_json": json.dumps({"value": str(value), "metadata": metadata or {}, "agent": agent})
}))
def search(self, query, limit=3, ...):
return asyncio.run(self._call_tool("mimir_recall", {
"query": query,
"limit": limit
}))Alternatively, Mimir's SQLite database is a standard Happy to pair on this — Mimir is our project and we want to make sure this integration uses the right API! |
|
Hi @tcconnally, Thank you so much for the detailed review and the warm feedback! I am incredibly excited about bringing Mimir as a first-class memory backend into CrewAI. Your corrections regarding PyPI and the repository relocation are extremely helpful. I will immediately update the references in the documentation and README to point to Thank you for providing the recommended implementation path via the MCP Python SDK. Transitioning the I am going to refactor the codebase following your implementation architecture. I would love to team up on this to make sure everything aligns perfectly with Mimir's standard design. I'll keep you posted here as soon as the refactored code is pushed! Thanks again! |
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/README.md`:
- Line 561: The documentation for MimirStorage in the README is unclear about
how to actually pass the database path. Add a code snippet showing the correct
initialization pattern where MimirStorage is instantiated with a config
parameter as a dictionary containing the db_path key, following the proposed
addition format in the review comment. This will clarify for users that they
must use the config dict syntax like MimirStorage(config={"db_path": "..."})
rather than trying alternative approaches like passing the path directly as an
argument.
In `@lib/crewai/src/crewai/memory/storage/factory.py`:
- Around line 54-56: The early return for "mimir" spec bypasses the custom
factory mechanism that should be consulted first. Remove the hardcoded check for
spec == "mimir" and instead allow the custom factory override mechanism to be
consulted first, returning None if no custom factory is set, at which point the
MimirStorage class should be instantiated as the fallback default
implementation. This ensures users can wrap or replace the mimir backend through
set_memory_storage_factory() just like any other spec.
In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py`:
- Around line 57-59: The Mimir storage implementation hardcodes the category as
"crewai_memory" in the save operation (around line 57-59) and doesn't include
scope information in the recall queries (around lines 72-75), causing
cross-user/crew memory leakage in shared databases. Modify the save method to
include the MemoryRecord.scope or configured app/user scope information in the
Mimir category, key, or body_json fields, and update the recall method to filter
by the same scope when querying Mimir, ensuring that both the write and read
operations enforce the same scoping rules to maintain memory isolation across
different crews and users.
- Around line 23-29: The StdioServerParameters args in the MimirStorage
initialization contains two issues: the flag should be "--db" instead of
"--db-path" to match Mimir's documented CLI interface, and the db_path value
containing "~" must be expanded using os.path.expanduser() before being passed
to the subprocess args since subprocess does not interpret shell expansions.
Update the args list to use the correct flag name and expand the user path
before adding it to the arguments.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 98235e81-74ac-4ca2-ba68-3a33697fa32e
📒 Files selected for processing (3)
lib/crewai/README.mdlib/crewai/src/crewai/memory/storage/factory.pylib/crewai/src/crewai/memory/storage/mimir_storage.py
| "category": "crewai_memory", | ||
| "key": f"memory_{hash(value)}", | ||
| "body_json": json.dumps(body_data) |
There was a problem hiding this comment.
Thread memory scope into both writes and recalls.
Every save writes to the global crewai_memory category, while recall sends only query and limit. In a shared Mimir database, that can return memories across crews/users instead of honoring the PR’s app/user scoping goal. Persist the MemoryRecord.scope/categories or configured app/user scope into the Mimir category/key/body and apply the same scope when recalling.
Also applies to: 72-75
🧰 Tools
🪛 ast-grep (0.43.0)
[info] 58-58: use jsonify instead of json.dumps for JSON output
Context: json.dumps(body_data)
Note: Security best practice.
(use-jsonify)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/crewai/src/crewai/memory/storage/mimir_storage.py` around lines 57 - 59,
The Mimir storage implementation hardcodes the category as "crewai_memory" in
the save operation (around line 57-59) and doesn't include scope information in
the recall queries (around lines 72-75), causing cross-user/crew memory leakage
in shared databases. Modify the save method to include the MemoryRecord.scope or
configured app/user scope information in the Mimir category, key, or body_json
fields, and update the recall method to filter by the same scope when querying
Mimir, ensuring that both the write and read operations enforce the same scoping
rules to maintain memory isolation across different crews and users.
Implemented the
MimirStoragebackend inside the unified memory system and wired it intofactory.pyfor built-in resolution.This resolves the feature request to support Mimir for persistent cross-session memory.
Closes #6168.
Summary by CodeRabbit
save,search, anddeletesignatures and behavior were revised, including limited support for filtering.