feat(hash): add per-field expiration support (HEXPIRE, HTTL, HPERSIST) by tejaslodayadd · Pull Request #3269 · apache/kvrocks

tejaslodayadd · 2025-11-24T21:38:25Z

Implements per-field expiration support for HASH data type, following Redis 7.4 specification.

Commands:

HEXPIRE key seconds FIELDS numfields field [field ...]
HTTL key FIELDS numfields field [field ...]
HPERSIST key FIELDS numfields field [field ...]

Implementation Details

Uses backward-compatible flag-based encoding for field values
Legacy format (no expiration): [raw value]
New format (with expiration): [0xFF][flags][8-byte timestamp][value]
Existing data works without migration
Expired fields are filtered out on read operations (HGET, HMGET, HGETALL, etc.)
Compaction filter cleans up expired fields with 5-minute lazy delete buffer

This extends the existing key-level expiration (HSETEXPIRE) added in #2750 to support per-field granularity, which is a common feature request similar to Redis 7.4's hash field expiration support.

Reference: https://redis.io/docs/latest/develop/data-types/hashes/

aleksraiden · 2025-11-24T22:01:57Z

@torwig could you please check this PR, as I remember, we talk about this feature? Thanks

aleksraiden · 2025-11-24T22:02:44Z

@tejaslodayadd lot ot thanks, cool feature, we need support of this

tejaslodayadd · 2025-11-24T23:26:07Z

+  // Check for hash field expiration
+  if (metadata.Type() == kRedisHash) {
+    // First check if metadata (key-level) is expired
+    if (IsMetadataExpired(ikey, metadata)) {
+      return true;
+    }
+    // Then check per-field expiration
+    // Use lazy deletion with 5 minute buffer similar to metadata expiration
+    uint64_t lazy_expired_ts = util::GetTimeStampMS() - 300000;
+    HashFieldValue field_value;
+    if (!HashFieldValue::Decode(value.ToString(), &field_value)) {
+      // Failed to decode, keep the field
+      return false;
+    }
+    // Check if field has expiration and if it's expired (with lazy delete buffer)
+    if (field_value.expire > 0 && field_value.expire <= lazy_expired_ts) {
+      return true;
+    }
+    return false;
+  }


The key logic for hash field expiration cleanup during compaction -- when the metadata type is kRedisHash, the filter:

First checks if the entire key is expired (key-level expiration)

Then decodes the field value using HashFieldValue::Decode()

Checks if the field has a non-zero expiration timestamp

Uses a 5-minute lazy delete buffer (util::GetTimeStampMS() - 300000) to avoid race conditions between the HEXPIRE command and compaction (same pattern used for key-level expiration)

tejaslodayadd · 2025-11-25T01:27:43Z

After discussing this internally, there's a problem:

Currently, HLEN returns the field count by reading metadata.size directly - an O(1) operation. However, with per-field expiration:

Expired fields remain in storage until compaction cleans them up
The metadata.size counter doesn't know when individual fields expire
Result: A hash with 5 fields where 1 has expired will still report HLEN = 5 instead of 4

Options:

Accept approximate count: Keep current O(1) behavior, document that HLEN returns approximate count when using field TTL, accurate after compaction (eventual consistent)
Scan-based count: HLEN scans all fields and counts non-expired ones. Con: O(n) instead of O(1), expensive for large hashes
Optimized scan with metadata tracking: Add a flag in HashMetadata to track if any field has expiration set. HLEN returns metadata.size directly if no fields have TTL -- O(1), and only scan when hash has fields with expiration -- O(n)

Let us know which way you'd want us to move forward @PragmaTwice @torwig @aleksraiden

git-hulk · 2025-11-25T03:15:46Z

@tejaslodayadd Thanks for your contribution.

I just had a glance at this PR and have two questions about the design:

When would the size in metadata be updated? It seems the size won't be corrected after the compaction.
Should we also take care of the HSCAN command? Let's filter out the expired fields.

From my personal perspective, it's acceptable to sacrifice the short-term correctness of a command like HLEN to improve performance.

PragmaTwice · 2025-11-25T04:05:24Z

Hi, thank you for your contribution. I skimmed through and found some issue in this design:

"New format (with expiration): [0xFF][flags][8-byte timestamp][value]". Hmm the hash field value can actually be binary, so even in the old encoding, the first byte of the value CAN BE 0xff. How to distinguish them? I think the answer is negative.
-2 = field doesn't exist, 1 = expiration set, 0 = expiration not set (e.g., invalid expire time). Usually this comment indicates that you should use enum classes instead of integers.
the size field in metatdata seems not well mantained, in compaction filters.
it seems HashFieldValue can be a view (zero-copy)?
The code seems vibe-coded. It is fine but there are lots of useless code comment and the PR author should understand details of the llm-written code (so that the review process can be meaningful).

ltagliamonte-dd · 2025-12-02T22:06:16Z

@git-hulk @PragmaTwice FYI I'm going to help @tejaslodayadd with this PR.
I've addressed few of the comments you folks left.

I'm currently debating how to proceed with the compaction size/problem.
during compaction of the default CF we can't update the metadata CF (where the hash size is), so i think we could go for a lazy repair pattern:

during compactions if a field is expired we just update it's value to a STUB value (0xFE) so we immediately reclaim the disk space
during read operations like HGET, HGETALL, HSCAN we do a synchronous/asynchronous Repair:
- Create an Atomic WriteBatch.
- Delete the key from defaultCF.
- Update/Merge the count in metadataCF.
- Commit.

What do you folks think about this approach?

ltagliamonte-dd · 2025-12-02T23:02:03Z

when it comes to:

i've implemented a new format that uses a two-byte magic marker (0xFF 0xFE) instead of a single byte, let me know if this works for you @PragmaTwice

New format: [[0xFF][0xFE][1-byte flags][8-byte timestamp if flag set][value]]
Old format: [[0xFF][1-byte flags][8-byte timestamp if flag set][value]]

ltagliamonte-dd · 2025-12-05T21:31:20Z

@PragmaTwice @git-hulk Just pushed the lazy repair approach discussed before, looking forward to your feedbacks!
Thanks!

git-hulk · 2025-12-09T02:01:44Z

  subkey_opts.disable_auto_compactions = config_->rocks_db.disable_auto_compactions;
  subkey_opts.table_properties_collector_factories.emplace_back(
      NewCompactOnExpiredTableCollectorFactory(std::string(kPrimarySubkeyColumnFamilyName), 0.3));
+  subkey_opts.merge_operator = std::make_shared<HashMergeOperator>();


The merge operator would be invoked when iterating over any keys in this column family, and it could significantly affect performance.

git-hulk · 2025-12-09T02:04:41Z

+  return rocksdb::Status::OK();
+}
+
+void Hash::asyncRepairHash(const std::string &ns_key, const Slice &field, const HashMetadata &metadata) const {


It's not a good idea to spawn a thread per expired hash field. Perhaps we can reuse the existing task runner, or have a fixed thread pool for this purpose.

dashjay · 2026-01-07T02:37:14Z

@ltagliamonte-dd

For Data Consistency: Rocksdb should take care this, although rocksdb has not transaction, writebatch can help us put them all together.
For Key Bloat problem: this is a trade-off things to solve the problem between compatibility and the space. For kvrocks engine we flatten all hash keys to kvrocks for reduce the write amp, add more key for one TTL but not every key I think this is acceptable but need do more benchmark.
For Expensive TTL Updates: if we take one TTL as one key, put the TTL in value but not encoded into key, so that we just need to put the same key to engine other than (first scan, delete old, put new).
For Breaks Point Lookups: this must be the biggest problem, no matter where I put TTL, it's a problem. To solve this problem without breaking the original semantics, I believe that in addition to metadata, other indexes need to be introduced.

ltagliamonte-dd · 2026-01-07T22:40:21Z

@PragmaTwice @git-hulk would love to hear from you as well, in order to focus the development in the right direction.

ltagliamonte-dd · 2026-01-09T21:16:18Z

@PragmaTwice @git-hulk @aleksraiden gentle ping about my last few comments.

ltagliamonte-dd · 2026-01-13T21:00:08Z

@PragmaTwice @git-hulk @aleksraiden gentle ping about my last few comments. thanks

aleksraiden · 2026-01-13T21:16:05Z

Still wrong clang-tidy check as i see. @torwig could you check and suggest? Lot of thanks

ltagliamonte-dd · 2026-01-13T21:23:56Z

@aleksraiden thanks yes that needs to be addressed, but i would love to hear from the community leaders how to proceed with this feature.
Another idea that come to my mind is the following:

add a "v2" flag to the Hash key when we write new hash keys
during compaction rewrite all non v2 subkeys values adding the new TTL encoding in the value.

basically after the first major compaction all hash keys would be migrated to v2.

Mrt3383x38

Selam ugulama güncellendigi zaman ne borsada binance nede maht wallet çalışmıyo tsk.

Düz_metin_1739740615805.csv

dashjay · 2026-02-10T06:37:51Z

@ltagliamonte-dd I tried to make another PR, PTLA, go easy on me 👍

thailowki · 2026-03-19T05:53:17Z

@PragmaTwice @git-hulk @aleksraiden gentle ping. Please let me and @ltagliamonte-dd know if you could review this PR

aleksraiden · 2026-03-19T13:51:49Z

Please, check an error at tests

thailowki · 2026-04-07T23:58:52Z

@PragmaTwice @git-hulk @aleksraiden I updated the PR. Please let me and @ltagliamonte-dd know if you could review this PR

jihuayu · 2026-04-08T02:30:39Z

I recall we had an online discussion regarding this feature. Was the outcome of that discussion shared with the community?

Made-with: Cursor

thailowki · 2026-04-08T03:33:38Z

I recall we had an online discussion regarding this feature. Was the outcome of that discussion shared with the community?

I'm not sure about this. @ltagliamonte-dd if you recall ?

ltagliamonte-dd · 2026-04-15T20:13:34Z

I recall we had an online discussion regarding this feature. Was the outcome of that discussion shared with the community?

I'm not sure about this. @ltagliamonte-dd if you recall ?

@thailowki @jihuayu i've tried to reach out @git-hulk @aleksraiden @PragmaTwice on zulipchat but didn't get any attention.
Overall i understand that this is a very tricky feature to add, especially when it comes to updating existing installations.
Would it be possible to just add this new encoding behind feature flag in the config?

aleksraiden · 2026-04-15T20:18:21Z

As I see, at now a main stopper for update and test are unresolved conflicts:

src/storage/redis_db.cc
src/types/redis_hash.cc
src/types/redis_hash.h
tests/cppunit/types/hash_test.cc

P.S. Sorry, I'am only in this place, no possibility to answer at zulip chat, sorry.

PragmaTwice · 2026-04-16T02:37:34Z

We started a proposal #3432 and a tracking issue #3436.

PragmaTwice · 2026-05-30T05:16:25Z

Refer to #3436.

Add hash per-field expiration implementation

f5505fb

tejaslodayadd changed the title ~~feat(hash): add hash per-field expiration~~ feat(hash): add per-field expiration support (HEXPIRE, HTTL, HPERSIST) Nov 24, 2025

aleksraiden requested review from PragmaTwice and torwig November 24, 2025 22:00

tejaslodayadd marked this pull request as ready for review November 24, 2025 22:05

tejaslodayadd commented Nov 24, 2025

View reviewed changes

Fix lint

98eedac

git-hulk self-requested a review November 25, 2025 03:15

ltagliamonte-dd added 3 commits December 2, 2025 10:28

support expiration in HSCAN

8ca5965

HashFieldValue zero-copy view

487e185

use enums

ef35dc7

ltagliamonte-dd and others added 7 commits December 2, 2025 15:02

use new marker

6db0126

Merge branch 'unstable' into hash-field-expiration

b49f46e

use new marker 2

76e6010

fix unit

864cc2d

fix unit

3a13f32

Merge branch 'unstable' into hash-field-expiration

bf70596

add lazy repair

f5041e9

ltagliamonte-dd added 2 commits December 7, 2025 12:17

fix CI

ec2e2d0

add merge operator

f7dd87e

git-hulk reviewed Dec 9, 2025

View reviewed changes

remove merge

5dc043d

Merge branch 'unstable' into hash-field-expiration

3972c94

Merge branch 'unstable' into hash-field-expiration

2de5915

Mrt3383x38 reviewed Jan 13, 2026

View reviewed changes

This was referenced Jan 22, 2026

feat(hash): add per-field expiration support dashjay/kvrocks#1

Open

feat(hash): add per-field expiration support #3346

Closed

Merge branch 'unstable' into hash-field-expiration

c78a0ba

Merge branch 'unstable' into hash-field-expiration

82424c2

thailowki and others added 3 commits March 19, 2026 22:23

fix build error

5808b2b

Merge branch 'unstable' into hash-field-expiration

ba93001

Merge branch 'unstable' into hash-field-expiration

f7a0d4f

Merge branch 'unstable' into hash-field-expiration

fabd31e

chore: fix include ordering for clang-format

77796b0

Made-with: Cursor

Merge branch 'unstable' into hash-field-expiration

7adbb6b

PragmaTwice closed this May 30, 2026

Conversation

tejaslodayadd commented Nov 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Commands:

Implementation Details

Uh oh!

aleksraiden commented Nov 24, 2025

Uh oh!

aleksraiden commented Nov 24, 2025

Uh oh!

tejaslodayadd Nov 24, 2025

Choose a reason for hiding this comment

Uh oh!

tejaslodayadd commented Nov 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

git-hulk commented Nov 25, 2025

Uh oh!

PragmaTwice commented Nov 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ltagliamonte-dd commented Dec 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ltagliamonte-dd commented Dec 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ltagliamonte-dd commented Dec 5, 2025

Uh oh!

git-hulk Dec 9, 2025

Choose a reason for hiding this comment

Uh oh!

git-hulk Dec 9, 2025

Choose a reason for hiding this comment

Uh oh!

dashjay commented Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ltagliamonte-dd commented Jan 7, 2026

Uh oh!

ltagliamonte-dd commented Jan 9, 2026

Uh oh!

ltagliamonte-dd commented Jan 13, 2026

Uh oh!

aleksraiden commented Jan 13, 2026

Uh oh!

ltagliamonte-dd commented Jan 13, 2026

Uh oh!

Mrt3383x38 left a comment

Choose a reason for hiding this comment

Uh oh!

dashjay commented Feb 10, 2026

Uh oh!

thailowki commented Mar 19, 2026

Uh oh!

aleksraiden commented Mar 19, 2026

Uh oh!

thailowki commented Apr 7, 2026

Uh oh!

jihuayu commented Apr 8, 2026

Uh oh!

thailowki commented Apr 8, 2026

Uh oh!

ltagliamonte-dd commented Apr 15, 2026

Uh oh!

aleksraiden commented Apr 15, 2026

Uh oh!

PragmaTwice commented Apr 16, 2026

Uh oh!

PragmaTwice commented May 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

tejaslodayadd commented Nov 24, 2025 •

edited

Loading

tejaslodayadd commented Nov 25, 2025 •

edited

Loading

PragmaTwice commented Nov 25, 2025 •

edited

Loading

ltagliamonte-dd commented Dec 2, 2025 •

edited

Loading

ltagliamonte-dd commented Dec 2, 2025 •

edited

Loading

dashjay commented Jan 7, 2026 •

edited

Loading