ENG-740: Redis key performance issue (base storage layer)#7409
Conversation
Phase 1: Infrastructure + Tests
- Context manager get_dsr_cache_store() for obtaining store instances
- Fix clear() to use SCAN for finding all keys in mixed state
- Add 14 migration tests with factory functions and parametrization
- Tests verify backward compatibility and migration behavior

No production code changes - infrastructure only.
Greptile Summary: This PR introduces a new Redis cache management layer (a `RedisCacheManager` with set-based key indexing, plus a `DSRCacheStore` built on top of it).
Confidence Score: 3/5
Last reviewed commit: 8581498
```python
def set(
    self,
    dsr_id: str,
    part: str,
    value: RedisValue,
    expire_seconds: Optional[int] = None,
) -> Optional[bool]:
    """
    Set a value for the given DSR and part. Registers the key in the DSR index.
    """
    key = _dsr_key(dsr_id, part)
    return self._manager.set_with_index(
        key, value, _dsr_index_prefix(dsr_id), expire_seconds
    )
```
**Index sets accumulate stale references when data keys expire**

When `set_with_index` is called with `expire_seconds`, the data key gets a TTL but the index set member (the key name stored in the Redis SET at `__idx:dsr:{dsr_id}`) does not. When the data key expires via Redis TTL, the index set still references it.

This means `get_all_keys()` will return keys that no longer exist in Redis, and `delete_keys_by_index()` in the manager will attempt to delete already-expired keys (harmless but wasteful). More importantly, the index set itself never expires, so for DSRs that are never explicitly `clear()`-ed, the index sets will accumulate indefinitely.
Consider either:
- Setting a TTL on the index set itself (e.g., matching the max data key TTL), or
- Filtering out dead keys in `get_all_keys()` (check existence before returning), or
- Documenting this as a known limitation, i.e. that `clear()` should always be called.
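For illustration, a minimal sketch of the dead-key filtering option, written as a store method. `self._redis` and `_dsr_index_prefix` come from the diff above; the self-pruning behavior is an assumption for this sketch, not existing code:

```python
def get_all_keys(self, dsr_id: str) -> list:
    """Return only keys that still exist, pruning stale index members."""
    index_key = _dsr_index_prefix(dsr_id)
    live_keys = []
    for key in self._redis.smembers(index_key):
        if self._redis.exists(key):
            live_keys.append(key)
        else:
            # The data key expired via TTL; drop the stale reference so the
            # index set self-heals instead of growing indefinitely.
            self._redis.srem(index_key, key)
    return live_keys
```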
```python
def clear(self, dsr_id: str) -> None:
    """
    Delete all cache keys for this DSR and remove the index.

    Always uses SCAN to find all keys (both indexed and legacy) to ensure
    complete cleanup in mixed-key scenarios.
    """
    # Use SCAN to find ALL keys (indexed + legacy)
    all_keys_via_scan = list(self._redis.scan_iter(match=f"*{dsr_id}*", count=500))

    index_prefix = _dsr_index_prefix(dsr_id)

    # Delete all found keys
    for key in all_keys_via_scan:
        self._redis.delete(key)
        self._manager.remove_key_from_index(index_prefix, key)

    # Delete the index itself
    self._manager.delete_index(index_prefix)
```
**SCAN pattern `*{dsr_id}*` will match the index set key itself**

The SCAN pattern `f"*{dsr_id}*"` on line 293 will also match the index set key `__idx:dsr:{dsr_id}` because it contains the `dsr_id` substring. This means line 299 deletes the index set key via `self._redis.delete(key)`, and then line 303 tries to delete it again via `self._manager.delete_index(index_prefix)`. The second call is a harmless no-op, but line 300 also calls `remove_key_from_index` on the already-deleted index, which is an SREM on a key that no longer exists.

More importantly, deleting keys one at a time in a loop (line 299) is less efficient than batching. Consider:
Suggested change:

```diff
 def clear(self, dsr_id: str) -> None:
     """
     Delete all cache keys for this DSR and remove the index.
     Always uses SCAN to find all keys (both indexed and legacy) to ensure
     complete cleanup in mixed-key scenarios.
     """
     # Use SCAN to find ALL keys (indexed + legacy)
     all_keys_via_scan = list(self._redis.scan_iter(match=f"*{dsr_id}*", count=500))
     index_prefix = _dsr_index_prefix(dsr_id)
-    # Delete all found keys
-    for key in all_keys_via_scan:
-        self._redis.delete(key)
-        self._manager.remove_key_from_index(index_prefix, key)
+    # Delete all found keys in batch
+    if all_keys_via_scan:
+        self._redis.delete(*all_keys_via_scan)
     # Delete the index itself
     self._manager.delete_index(index_prefix)
```
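Alternatively, if the index key should never pass through the data-key DEL at all, the SCAN results could be filtered first. A rough sketch of the relevant lines inside `clear()` (assumes `decode_responses=True` so `scan_iter` yields `str` keys; otherwise the comparison needs a bytes/str normalization):

```python
# Exclude the index set key from the SCAN results so only data keys are
# deleted here; delete_index() then removes the index exactly once.
index_key = _dsr_index_prefix(dsr_id)
data_keys = [k for k in all_keys_via_scan if k != index_key]
if data_keys:
    self._redis.delete(*data_keys)
self._manager.delete_index(index_key)
```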
```python
logical key; set_encoded_object stores under EN_ + logical key in Redis).
"""

from typing import Any, Dict, Tuple
```
**Unused imports `Any` and `Dict`**

`Any` and `Dict` are imported but never used in this module. Only `Tuple` is used.
Suggested change:

```diff
-from typing import Any, Dict, Tuple
+from typing import Tuple
```
```python
__idx:{index_prefix}; members are the actual cache key names.
"""

from typing import Any, List, Optional, Union
```
**Unused import `Any`**

`Any` is imported but never used as a type annotation in this module; it only appears in a docstring.
Suggested change:

```diff
-from typing import Any, List, Optional, Union
+from typing import List, Optional, Union
```
```python
from fides.common.cache.manager import RedisCacheManager

# Value types supported by Redis
RedisValue = Union[bytes, float, int, str]
```
**`RedisValue` type alias duplicated across modules**

`RedisValue = Union[bytes, float, int, str]` is defined identically in both manager.py (line 17) and dsr_store.py (line 25). Since dsr_store.py already imports from manager.py, consider importing `RedisValue` from there instead of redefining it.
Suggested change:

```diff
-from fides.common.cache.manager import RedisCacheManager
-
-# Value types supported by Redis
-RedisValue = Union[bytes, float, int, str]
+from fides.common.cache.manager import RedisCacheManager, RedisValue
```
```python
@contextmanager
def get_dsr_cache_store(
    *,
    backfill_index_on_legacy_read: bool = True,
    migrate_legacy_on_read: bool = True,
) -> Iterator[DSRCacheStore]:
    """
    Context manager that yields a DSRCacheStore backed by the application Redis connection.

    The store handles both new-format keys (dsr:{id}:{part}) and legacy keys
    (id-{id}-{field}-{attr}) with automatic migration on read when migrate_legacy_on_read=True.

    Args:
        backfill_index_on_legacy_read: When listing keys and falling back to SCAN for
            legacy keys, add those keys to the index. Default True.
        migrate_legacy_on_read: When a get finds value in legacy key only, write to
            new key, delete legacy key, add new key to index. Default True.

    Yields:
        DSRCacheStore instance ready for use

    Usage:
        with get_dsr_cache_store() as store:
            store.clear(privacy_request_id)

        with get_dsr_cache_store() as store:
            store.write_identity(pr_id, "email", "user@example.com")
            value = store.get_identity(pr_id, "email")
    """
    from fides.api.util.cache import get_cache

    redis_client = get_cache()
```
**Import inside function body**

The `get_cache` import is placed inside the function body. Per project conventions, imports should be at the top of the module. If there's a circular dependency concern here, it may be worth noting that in a comment; otherwise move it to the top.
Suggested change:

```diff
+from fides.api.util.cache import get_cache
+
+
 @contextmanager
 def get_dsr_cache_store(
     *,
     backfill_index_on_legacy_read: bool = True,
     migrate_legacy_on_read: bool = True,
 ) -> Iterator[DSRCacheStore]:
     """
     Context manager that yields a DSRCacheStore backed by the application Redis connection.

     The store handles both new-format keys (dsr:{id}:{part}) and legacy keys
     (id-{id}-{field}-{attr}) with automatic migration on read when migrate_legacy_on_read=True.

     Args:
         backfill_index_on_legacy_read: When listing keys and falling back to SCAN for
             legacy keys, add those keys to the index. Default True.
         migrate_legacy_on_read: When a get finds value in legacy key only, write to
             new key, delete legacy key, add new key to index. Default True.

     Yields:
         DSRCacheStore instance ready for use

     Usage:
         with get_dsr_cache_store() as store:
             store.clear(privacy_request_id)

         with get_dsr_cache_store() as store:
             store.write_identity(pr_id, "email", "user@example.com")
             value = store.get_identity(pr_id, "email")
     """
-    from fides.api.util.cache import get_cache
-
     redis_client = get_cache()
```
Context Used: Rule from dashboard - Python imports should always be placed at the top of the file, not near the code that uses them.
Ticket ENG-740
Description Of Changes
TL;DR: a Redis cache manager with automatic set-based key indexing and lazy migration of old keys. It is not used in any APIs or async functions in this PR; this is purely the storage logic.

This PR lays the groundwork for a Redis cache layer that avoids using `KEYS` (and eventually `SCAN`) by providing a convenience layer for secondary index storage (and also a starting point for other Redis improvements like clustering, enhanced logging, monitoring, etc.). This implementation provides the following (and nothing else):

- A `RedisCacheManager` that provides a layer around the Redis client to automatically handle secondary indexing of keys so that they can be cleared and queried without synchronous (and slow) access like `KEYS` (or `SCAN`). It does this by allowing you to specify an index name and then transparently recording keys related to that index in a Redis set with that name (i.e. `dsr:1234` or something). Then you can quickly get all the keys for iteration, determine membership, etc.

- A `DSRCacheStore`, which could be a separate PR if you prefer since it sits on top of the `RedisCacheManager`, that has well-known operations for caching DSRs. It takes advantage of the underlying secondary index management in `RedisCacheManager` and also performs backwards-compatible key translation (`KeyMapper`) for well-known types of DSR data, with lazy migration on reads and writes to the new index mechanism. This is because (among other things) there will be an initial period where we have legacy data being processed and we need to be able to continue working while moving over to this new format (at which point `KeyMapper` and the things that use it can be removed).

The `DSRCacheStore` addresses the underlying `KEYS` issue by leveraging `SCAN` instead for legacy keys (which uses pagination so, while not nearly as fast, it causes less contention).

Neither the `RedisCacheManager` nor the `DSRCacheStore` is used in any active API/async worker code in this PR; it is purely a base for the follow-ons, which I have split by DSR functionality: key clearing (the worst offender of `KEYS` currently), identity cache, encryption and custom fields, request body, etc., in subsequent PRs.

Code Changes
(See above)
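To make the indexing idea concrete, here is a minimal standalone sketch of the pattern described above. The `__idx:` prefix mirrors the diff, but the function names and signatures are illustrative assumptions, not the actual `RedisCacheManager` API:

```python
from typing import Optional, Set

import redis

r = redis.Redis(decode_responses=True)

def set_with_index(
    key: str, value: str, index_name: str, expire_seconds: Optional[int] = None
) -> None:
    # Write the data key and register it in the index set in one round trip.
    pipe = r.pipeline()
    pipe.set(key, value, ex=expire_seconds)
    pipe.sadd(f"__idx:{index_name}", key)
    pipe.execute()

def get_all_keys(index_name: str) -> Set[str]:
    # O(index size) lookup of related keys; no blocking KEYS, no paginated SCAN.
    return r.smembers(f"__idx:{index_name}")
```

With this shape, clearing a DSR becomes a set read plus a batched DEL rather than a full keyspace scan.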
Steps to Confirm
None; there are unit tests covering the key management logic, mapping, lazy reads, index management, etc.
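For illustration, one such lazy-migration test might look roughly like this; the `redis_client` fixture, legacy key format, and `store.get` signature are assumptions based on the docstrings above, not the actual suite:

```python
import pytest

@pytest.mark.parametrize(
    "legacy_key, part",
    [("id-123-identity-email", "identity:email")],  # hypothetical legacy mapping
)
def test_legacy_read_migrates_key(redis_client, legacy_key, part):
    # Seed a legacy-format key, then read through the store with migration on.
    redis_client.set(legacy_key, "user@example.com")

    with get_dsr_cache_store(migrate_legacy_on_read=True) as store:
        assert store.get("123", part) == "user@example.com"

    # Lazy migration should rewrite the value under the new key format,
    # index it, and delete the legacy key.
    assert not redis_client.exists(legacy_key)
```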
Pre-Merge Checklist
- `CHANGELOG.md` updated
- Up to date with `main`
- `downgrade()` migration is correct and works