
ENG-740: Redis key performance issue (base storage layer) #7409

Open
johnewart wants to merge 4 commits into main from johnewart/ENG-740/base-framework

Conversation


@johnewart johnewart commented Feb 18, 2026

Ticket ENG-740

Description Of Changes

TL;DR - a Redis cache manager with automatic set-based key indexing and lazy migration of old keys. It is not used by any APIs or async functions in this PR; this is purely the storage logic.

This PR lays the groundwork for a Redis cache layer that avoids using KEYS (and eventually SCAN) by providing a convenience layer for secondary index storage; it also serves as a starting point for other Redis improvements such as clustering, enhanced logging, and monitoring. This implementation provides the following, and nothing else: a RedisCacheManager that wraps the Redis client to automatically handle secondary indexing of keys, so that they can be cleared and queried without slow, synchronous access like KEYS (or SCAN). It does this by letting you specify an index name and transparently recording related keys in a Redis set with that name (e.g. dsr:1234). You can then quickly fetch all the keys for iteration, test membership, and so on (a minimal sketch of the idea follows below).
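
A minimal sketch of the set-based indexing idea, assuming a standard redis-py client. The method names mirror the description above and the __idx: prefix mentioned in the review summary, but the bodies are illustrative rather than the PR's actual implementation:

from typing import List, Optional, Union

import redis

RedisValue = Union[bytes, float, int, str]


class IndexedCache:
    """Illustrative wrapper: every write also records the key in a Redis SET."""

    def __init__(self, client: redis.Redis) -> None:
        self._redis = client

    def set_with_index(
        self,
        key: str,
        value: RedisValue,
        index_prefix: str,
        expire_seconds: Optional[int] = None,
    ) -> None:
        # Write the data key, then register it in the index set.
        self._redis.set(key, value, ex=expire_seconds)
        self._redis.sadd(f"__idx:{index_prefix}", key)

    def get_all_keys(self, index_prefix: str) -> List[str]:
        # A single SMEMBERS call replaces a server-wide KEYS scan.
        return [k.decode() for k in self._redis.smembers(f"__idx:{index_prefix}")]

    def delete_keys_by_index(self, index_prefix: str) -> None:
        keys = self.get_all_keys(index_prefix)
        if keys:
            self._redis.delete(*keys)
        self._redis.delete(f"__idx:{index_prefix}")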

The second part, which could be a separate PR if you prefer since it sits on top of the RedisCacheManager, is a DSRCacheStore with well-known operations for caching DSRs. It takes advantage of the underlying secondary index management in RedisCacheManager, performs backwards-compatible key translation (KeyMapper) for well-known types of DSR data, and lazily migrates keys to the new index mechanism on reads and writes. This matters because (among other things) there will be an initial period where legacy data is still being processed, and we need to keep working while moving over to the new format (at which point KeyMapper and the things that use it can be removed). A rough sketch of the migration-on-read flow follows below.
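
A rough sketch of lazy migration on read, assuming redis-py and the key formats described in this PR (new dsr:{id}:{part}, legacy id-{id}-{field}-{attr}). The function and the simplified legacy key are hypothetical; the real translation lives in KeyMapper:

from typing import Optional

import redis


def get_with_migration(client: redis.Redis, dsr_id: str, part: str) -> Optional[bytes]:
    """Read a DSR value, migrating it from its legacy key on first access."""
    new_key = f"dsr:{dsr_id}:{part}"
    value = client.get(new_key)
    if value is not None:
        return value

    # Fall back to the legacy format; the real mapping is handled by KeyMapper.
    legacy_key = f"id-{dsr_id}-{part}"
    value = client.get(legacy_key)
    if value is not None:
        ttl = client.ttl(legacy_key)
        # ttl is -1 for keys with no expiry; preserve a positive TTL.
        client.set(new_key, value, ex=ttl if ttl > 0 else None)
        client.sadd(f"__idx:dsr:{dsr_id}", new_key)
        client.delete(legacy_key)
    return value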

The DSRCacheStore addresses the underlying KEYS issue by using SCAN for legacy keys instead; SCAN paginates, so while it is not nearly as fast, it causes far less contention (compared below).
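
For comparison, with redis-py (r being a connected client), the blocking and paginated approaches look like this; count=500 matches the batch size used in this PR's clear():

# Blocking: KEYS walks the entire keyspace in a single command.
keys = r.keys(f"*{dsr_id}*")

# Paginated: SCAN iterates cursor-by-cursor, roughly 500 keys per round trip.
keys = list(r.scan_iter(match=f"*{dsr_id}*", count=500))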

Neither the RedisCacheManager nor the DSRCacheStore is used in any active API/async worker code in this PR; it is purely a base for the follow-ons, which I have split by DSR functionality in subsequent PRs: key clearing (currently the worst KEYS offender), identity cache, encryption and custom fields, request body, etc.

Code Changes

(See above)

Steps to Confirm

None; unit tests cover the key management logic, key mapping, lazy reads, index management, etc. (one such migration behavior is illustrated below).
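
For illustration, a hypothetical test of the migration-on-read behavior those tests verify, using fakeredis as a stand-in for the PR's own in-memory MockRedis:

import fakeredis


def test_legacy_identity_key_migrates_on_read() -> None:
    r = fakeredis.FakeRedis()
    r.set("id-pr_123-identity-email", "user@example.com")

    # Before migration: only the legacy key exists.
    assert r.get("dsr:pr_123:identity:email") is None

    # Simulate migration-on-read: copy to the new key, index it, drop the old one.
    value = r.get("id-pr_123-identity-email")
    r.set("dsr:pr_123:identity:email", value)
    r.sadd("__idx:dsr:pr_123", "dsr:pr_123:identity:email")
    r.delete("id-pr_123-identity-email")

    assert r.get("dsr:pr_123:identity:email") == b"user@example.com"
    assert not r.exists("id-pr_123-identity-email")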

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration label to the entry if your change includes a DB migration
    • Add a high-risk label to the entry if your change includes a high-risk change (i.e. potential for performance impact or unexpected regression) that should be flagged
    • Updates unreleased work already in Changelog, no new entry necessary
  • UX feedback:
    • All UX related changes have been reviewed by a designer
    • No UX review needed
  • Followup issues:
    • Followup issues created
    • No followup issues
  • Database migrations:
    • Ensure that your downrev is up to date with the latest revision on main
    • Ensure that your downgrade() migration is correct and works
      • If a downgrade migration is not possible for this change, please call this out in the PR description!
    • No migrations
  • Documentation:
    • Documentation complete, PR opened in fidesdocs
    • Documentation issue created in fidesdocs
    • If there are any new client scopes created as part of the pull request, remember to update public-facing documentation that references our scope registry
    • No documentation updates required

  Phase 1: Infrastructure + Tests
  - Context manager get_dsr_cache_store() for obtaining store instances
  - Fix clear() to use SCAN for finding all keys in mixed state
  - Add 14 migration tests with factory functions and parametrization
  - Tests verify backward compatibility and migration behavior
  No production code changes - infrastructure only.
@johnewart johnewart requested a review from a team as a code owner February 18, 2026 01:48

vercel bot commented Feb 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments

Project                Deployment    Actions    Updated (UTC)
fides-plus-nightly     Ignored       Ignored    Feb 18, 2026 1:48am
fides-privacy-center   Ignored       Ignored    Feb 18, 2026 1:48am



greptile-apps bot commented Feb 18, 2026

Greptile Summary

This PR introduces a new Redis cache management layer (RedisCacheManager + DSRCacheStore) that replaces expensive KEYS/SCAN commands with SET-based secondary indexes for DSR (privacy request) cache operations. It also includes a KeyMapper that maps all known legacy key formats to the new dsr:{id}:{part} scheme, and supports lazy migration of legacy keys on read.

  • RedisCacheManager wraps a Redis client with index-backed operations (set_with_index, delete_keys_by_index, etc.) stored as Redis SETs at __idx:{prefix}
  • DSRCacheStore provides convenience methods for all DSR field types (identity, encryption, DRP, masking secrets, etc.) with automatic legacy key fallback and migration
  • KeyMapper encodes all discovered legacy key patterns alongside their new-format equivalents (see the sketch after this list)
  • Tests are thorough with 3 test files covering key mapping, store operations, and migration behavior using in-memory mocks — no real Redis required
  • Note: Index sets do not have TTLs, so stale references can accumulate when data keys expire. The clear() method's SCAN pattern also matches the index set key itself, and clear() deletes keys individually rather than in a batch.
  • This is infrastructure-only — none of this code is wired into active API or async worker paths in this PR
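
As an illustration of the mapping the summary describes, a hypothetical KeyMapper-style helper that returns both key formats for one field type; the real class covers many more DSR key types and the exact legacy patterns may differ:

from typing import Tuple


def identity_keys(dsr_id: str, attribute: str) -> Tuple[str, str]:
    """Return (new_key, legacy_key) for a DSR identity attribute.

    Hypothetical: mirrors the dsr:{id}:{part} scheme and the
    id-{id}-{field}-{attr} legacy format described in this PR.
    """
    new_key = f"dsr:{dsr_id}:identity:{attribute}"
    legacy_key = f"id-{dsr_id}-identity-{attribute}"
    return new_key, legacy_key


# e.g. identity_keys("pr_123", "email")
# -> ("dsr:pr_123:identity:email", "id-pr_123-identity-email")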

Confidence Score: 3/5

  • Infrastructure-only PR with no active code paths affected, but has design concerns around index set TTL and clear() implementation that should be addressed before follow-on PRs wire this into production.
  • Score of 3 reflects that while the code is well-structured and thoroughly tested, there are two meaningful issues: (1) index sets never expire, creating a potential memory leak for DSRs that aren't explicitly cleared, and (2) the clear() method's SCAN pattern inadvertently matches index keys and uses inefficient one-by-one deletion. These should be resolved before this layer is used in production code paths. The PR also exceeds 500 lines (1265 additions across 10 files).
  • src/fides/common/cache/dsr_store.py needs attention for the index TTL and clear() issues. src/fides/common/cache/manager.py and src/fides/common/cache/key_mapping.py have minor unused import cleanup.

Important Files Changed

src/fides/common/cache/manager.py
    Core RedisCacheManager with index-backed key management. Clean design with a minor unused import (Any). Index sets lack TTL management, which could lead to stale references when data keys expire.

src/fides/common/cache/dsr_store.py
    DSR-specific cache store with legacy migration. Has two issues: (1) index sets retain stale references when data keys expire via TTL, and (2) the clear() SCAN pattern matches the index set key itself, plus it deletes keys one by one instead of batching. Duplicated RedisValue type alias.

src/fides/common/cache/key_mapping.py
    Maps legacy key patterns to the new dsr:{id}:{part} format. Comprehensive coverage of all known DSR cache key types. Unused imports (Any, Dict) should be removed.

src/fides/common/cache/__init__.py
    Context manager entry point for DSRCacheStore. Has an import inside a function body (get_cache) that should be moved to module level per project conventions unless a circular dependency prevents it.

tests/common/cache/conftest.py
    Test conftest that overrides heavy session fixtures to avoid starting FastAPI, DB, and Celery for cache-only tests. Well-structured approach.

tests/common/cache/test_dsr_store.py
    Thorough unit tests for DSRCacheStore with in-memory mocks covering set/get, indexing, clear, delete, legacy migration, and all convenience methods.

tests/common/cache/test_dsr_store_key_mapping.py
    Comprehensive tests for every KeyMapper method verifying both new-format and legacy key generation. Clean and complete.

tests/common/cache/test_dsr_store_migration.py
    Migration tests using MockRedis with a real RedisCacheManager. Covers legacy key reading, migration on read, mixed-state clearing, index backfill, and DSR isolation. Good use of parametrize and factory functions.

Last reviewed commit: 8581498


@greptile-apps greptile-apps bot left a comment


10 files reviewed, 6 comments


Comment on lines +113 to +126
def set(
    self,
    dsr_id: str,
    part: str,
    value: RedisValue,
    expire_seconds: Optional[int] = None,
) -> Optional[bool]:
    """
    Set a value for the given DSR and part. Registers the key in the DSR index.
    """
    key = _dsr_key(dsr_id, part)
    return self._manager.set_with_index(
        key, value, _dsr_index_prefix(dsr_id), expire_seconds
    )

Index sets accumulate stale references when data keys expire

When set_with_index is called with expire_seconds, the data key gets a TTL but the index set member (the key name stored in the Redis SET at __idx:dsr:{dsr_id}) does not. When the data key expires via Redis TTL, the index set still references it.

This means get_all_keys() will return keys that no longer exist in Redis, and delete_keys_by_index() in the manager will attempt to delete already-expired keys (harmless but wasteful). More importantly, the index set itself never expires, so for DSRs that are never explicitly clear()-ed, the index sets will accumulate indefinitely.

Consider one of the following:

  1. Setting a TTL on the index set itself (e.g., matching the max data key TTL), or
  2. Filtering out dead keys in get_all_keys() (check existence before returning; sketched after this list), or
  3. Documenting this as a known limitation that clear() should always be called.
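
If option 2 were chosen, one possible shape (illustrative, assuming redis-py; a pipeline batches the EXISTS checks into a single round trip):

from typing import List

import redis


def get_live_keys(client: redis.Redis, index_prefix: str) -> List[str]:
    """Return only index members whose data keys still exist, pruning the rest."""
    members = [m.decode() for m in client.smembers(f"__idx:{index_prefix}")]
    if not members:
        return []
    # Batch the existence checks into one round trip.
    pipe = client.pipeline()
    for key in members:
        pipe.exists(key)
    flags = pipe.execute()
    live = [key for key, flag in zip(members, flags) if flag]
    dead = [key for key, flag in zip(members, flags) if not flag]
    if dead:
        # Opportunistically remove stale references from the index set.
        client.srem(f"__idx:{index_prefix}", *dead)
    return live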

Comment on lines +285 to +303
def clear(self, dsr_id: str) -> None:
    """
    Delete all cache keys for this DSR and remove the index.

    Always uses SCAN to find all keys (both indexed and legacy) to ensure
    complete cleanup in mixed-key scenarios.
    """
    # Use SCAN to find ALL keys (indexed + legacy)
    all_keys_via_scan = list(self._redis.scan_iter(match=f"*{dsr_id}*", count=500))

    index_prefix = _dsr_index_prefix(dsr_id)

    # Delete all found keys
    for key in all_keys_via_scan:
        self._redis.delete(key)
        self._manager.remove_key_from_index(index_prefix, key)

    # Delete the index itself
    self._manager.delete_index(index_prefix)

SCAN pattern *{dsr_id}* will match the index set key itself

The SCAN pattern f"*{dsr_id}*" on line 293 will also match the index set key __idx:dsr:{dsr_id} because it contains the dsr_id substring. This means line 299 deletes the index set key via self._redis.delete(key), and then line 303 tries to delete it again via self._manager.delete_index(index_prefix). The second call is a harmless no-op, but line 300 also calls remove_key_from_index on the already-deleted index — this is an SREM on a key that no longer exists.

More importantly, deleting keys one at a time in a loop (line 299) is less efficient than batching. Consider:

Suggested change

Before (per-key deletes in a loop):

def clear(self, dsr_id: str) -> None:
    """
    Delete all cache keys for this DSR and remove the index.
    Always uses SCAN to find all keys (both indexed and legacy) to ensure
    complete cleanup in mixed-key scenarios.
    """
    # Use SCAN to find ALL keys (indexed + legacy)
    all_keys_via_scan = list(self._redis.scan_iter(match=f"*{dsr_id}*", count=500))
    index_prefix = _dsr_index_prefix(dsr_id)
    # Delete all found keys
    for key in all_keys_via_scan:
        self._redis.delete(key)
        self._manager.remove_key_from_index(index_prefix, key)
    # Delete the index itself
    self._manager.delete_index(index_prefix)

After (one batched DEL):

def clear(self, dsr_id: str) -> None:
    """
    Delete all cache keys for this DSR and remove the index.
    Always uses SCAN to find all keys (both indexed and legacy) to ensure
    complete cleanup in mixed-key scenarios.
    """
    # Use SCAN to find ALL keys (indexed + legacy)
    all_keys_via_scan = list(self._redis.scan_iter(match=f"*{dsr_id}*", count=500))
    index_prefix = _dsr_index_prefix(dsr_id)
    # Delete all found keys in batch
    if all_keys_via_scan:
        self._redis.delete(*all_keys_via_scan)
    # Delete the index itself
    self._manager.delete_index(index_prefix)

logical key; set_encoded_object stores under EN_ + logical key in Redis).
"""

from typing import Any, Dict, Tuple

Unused imports Any and Dict

Any and Dict are imported but never used in this module. Only Tuple is used.

Suggested change

Before:
    from typing import Any, Dict, Tuple
After:
    from typing import Tuple

__idx:{index_prefix}; members are the actual cache key names.
"""

from typing import Any, List, Optional, Union

Unused import Any

Any is imported but never used as a type annotation in this module — it only appears in a docstring.

Suggested change

Before:
    from typing import Any, List, Optional, Union
After:
    from typing import List, Optional, Union

Comment on lines +22 to +25
from fides.common.cache.manager import RedisCacheManager

# Value types supported by Redis
RedisValue = Union[bytes, float, int, str]

RedisValue type alias duplicated across modules

RedisValue = Union[bytes, float, int, str] is defined identically in both manager.py (line 17) and dsr_store.py (line 25). Since dsr_store.py already imports from manager.py, consider importing RedisValue from there instead of redefining it.

Suggested change

Before:
    from fides.common.cache.manager import RedisCacheManager

    # Value types supported by Redis
    RedisValue = Union[bytes, float, int, str]
After:
    from fides.common.cache.manager import RedisCacheManager, RedisValue

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +27 to +58
@contextmanager
def get_dsr_cache_store(
    *,
    backfill_index_on_legacy_read: bool = True,
    migrate_legacy_on_read: bool = True,
) -> Iterator[DSRCacheStore]:
    """
    Context manager that yields a DSRCacheStore backed by the application Redis connection.

    The store handles both new-format keys (dsr:{id}:{part}) and legacy keys
    (id-{id}-{field}-{attr}) with automatic migration on read when migrate_legacy_on_read=True.

    Args:
        backfill_index_on_legacy_read: When listing keys and falling back to SCAN for
            legacy keys, add those keys to the index. Default True.
        migrate_legacy_on_read: When a get finds value in legacy key only, write to
            new key, delete legacy key, add new key to index. Default True.

    Yields:
        DSRCacheStore instance ready for use

    Usage:
        with get_dsr_cache_store() as store:
            store.clear(privacy_request_id)

        with get_dsr_cache_store() as store:
            store.write_identity(pr_id, "email", "user@example.com")
            value = store.get_identity(pr_id, "email")
    """
    from fides.api.util.cache import get_cache

    redis_client = get_cache()

Import inside function body

The get_cache import is placed inside the function body. Per project conventions, imports should be at the top of the module. If there's a circular dependency concern here, it may be worth noting that in a comment — otherwise move it to the top.

Suggested change

Before (import inside the function body):

@contextmanager
def get_dsr_cache_store(
    *,
    backfill_index_on_legacy_read: bool = True,
    migrate_legacy_on_read: bool = True,
) -> Iterator[DSRCacheStore]:
    """...(docstring unchanged; see the full text above)..."""
    from fides.api.util.cache import get_cache

    redis_client = get_cache()

After (import moved to module level):

from fides.api.util.cache import get_cache

@contextmanager
def get_dsr_cache_store(
    *,
    backfill_index_on_legacy_read: bool = True,
    migrate_legacy_on_read: bool = True,
) -> Iterator[DSRCacheStore]:
    """...(docstring unchanged; see the full text above)..."""
    redis_client = get_cache()

Context Used: Rule from dashboard - Python imports should always be placed at the top of the file, not near the code that uses them. (source)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
