fix(table): validate snapshot timestamp drift on add snapshot#3062
fix(table): validate snapshot timestamp drift on add snapshot#3062mrutunjay-kinagi wants to merge 2 commits intoapache:mainfrom
Conversation
Fokko
left a comment
There was a problem hiding this comment.
@mrutunjay-kinagi Can you explain more about the use-case? I'm hesitant with these kind of features since it isn't implemented in other OSS implementations AFAIK.
| elif ( | ||
| base_metadata.snapshot_log and update.snapshot.timestamp_ms - base_metadata.snapshot_log[-1].timestamp_ms < -ONE_MINUTE_MS | ||
| ): | ||
| raise ValueError( |
There was a problem hiding this comment.
Maybe CommitFailedException?
There was a problem hiding this comment.
Good point. I kept ValueError for consistency with the existing validation branches in this same method (sequence/row-id checks all currently raise ValueError). If you prefer, I can switch this validation path to CommitFailedException in a follow-up commit.
|
@Fokko Thanks for the review. The use case here is handling malformed/clock-skewed commit metadata safely and aligning behavior with Java/Rust, which already apply a 1-minute backward-drift tolerance for snapshot timestamps.\n\nWithout this guard, a client can add a snapshot timestamp that is significantly older than current table metadata ( / latest ), which can make timeline semantics inconsistent for timestamp-based operations.\n\nThis change keeps the same 60s tolerance and only rejects larger backward drift; small skew remains allowed. |
|
Clarification on my previous comment (formatting issue): The guard is intended to prevent adding snapshots that are significantly older than the table timeline, specifically older than This follows the same tolerance approach used in Java/Rust and keeps small clock skew allowed while rejecting larger backward drift. |
could you point to the java/rust references? |
|
The existing validation branches inside are raising , so for symmetry I left this guard as as well. If you’d prefer switching to later that’s easy to follow up on but I’d keep the current behavior for consistency. |
|
Reposting cleanly: the guard maintains the same exception type as the surrounding validation branches, so I kept it as |
|
@kevinjqliu Here are the references we aligned with:
|
Rationale for this change
Align snapshot timestamp validation with other Iceberg implementations by rejecting snapshots that drift backwards by more than one minute.
The new guard in
AddSnapshotUpdatechecks:snapshot_logentrylast_updated_msBoth checks allow up to 60 seconds of clock skew tolerance.
Are these changes tested?
Yes.
test_update_metadata_add_snapshot_rejects_old_timestamp_vs_snapshot_log.test_update_metadata_add_snapshot_rejects_old_timestamp_vs_last_updated.Are there any user-facing changes?
No API changes.
ValueErrorwhen timestamp drift exceeds tolerance.