Skip to content

fix(buffers): disk buffers v2 reopen buffer after restart crash loop#25570

Draft
blt wants to merge 2 commits into
blt/antithesis_vector_end-to-end_ack_scenariofrom
blt/disk_buffers_v2_reopen_buffer_after_restart_crash_loop
Draft

fix(buffers): disk buffers v2 reopen buffer after restart crash loop#25570
blt wants to merge 2 commits into
blt/antithesis_vector_end-to-end_ack_scenariofrom
blt/disk_buffers_v2_reopen_buffer_after_restart_crash_loop

Conversation

@blt

@blt blt commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Summary

We have observed in Antithesis scenario vector_to_vector_e2e_disk that
the disk buffer can be driven to a state where Vector will crash loop
without recovery. This happens when the reader is set to resume from a
data file that no longer exists and will not be created.

Strictly, the disk buffer reader is meant to chase the writer. This
happens because the deletion of completed data files is not atomic. When
a reader finishes an acks a data file delete_completed_data_file:

  • unlinks the .dat file
  • advances the reader id in the ledger
  • fsyncs the ledger

If Vector crashes after the unlink but before the ledger is fsync'ed
this will cause the next instance of Vector to attempt to read a .dat
that will never appear, crashing said instance. Mostly. OS page cache
complicates this some, as the newly introduced
reopen_recovers_with_reader_resume_data_file_is_missing discusses.

Vector configuration

How did you test this PR?

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • Some CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • make fmt
      • make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
      • make test
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR introduces changes Vector dependencies (modifies Cargo.lock), please
    run make build-licenses to regenerate the license inventory and commit the changes (if any). More details on the dd-rust-license-tool.

blt commented Jun 3, 2026

Copy link
Copy Markdown
Contributor Author

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

@datadog-vectordotdev

This comment has been minimized.

@blt blt force-pushed the blt/antithesis-harness-first-scenario branch from 3614c5a to 5209ae6 Compare June 4, 2026 00:25
@blt blt force-pushed the blt/disk_buffers_v2_reopen_buffer_after_restart_crash_loop branch from db4960b to e3cc97c Compare June 4, 2026 00:25
@blt blt changed the base branch from blt/antithesis-harness-first-scenario to graphite-base/25570 June 4, 2026 01:06
@blt blt force-pushed the blt/disk_buffers_v2_reopen_buffer_after_restart_crash_loop branch from e3cc97c to f6542c0 Compare June 4, 2026 01:06
@blt blt changed the base branch from graphite-base/25570 to blt/antithesis_vector_end-to-end_ack_scenario June 4, 2026 01:06
@blt blt force-pushed the blt/antithesis_vector_end-to-end_ack_scenario branch from 3cce5aa to 84df4d7 Compare June 4, 2026 04:48
@blt blt force-pushed the blt/disk_buffers_v2_reopen_buffer_after_restart_crash_loop branch from f6542c0 to 61e1f3e Compare June 4, 2026 04:48
Comment thread changelog.d/disk_v2_reopen_recovers_missing_data_file.fix.md Fixed
@blt blt force-pushed the blt/disk_buffers_v2_reopen_buffer_after_restart_crash_loop branch 2 times, most recently from 6cf0d05 to 597b732 Compare June 4, 2026 13:17
@blt blt force-pushed the blt/antithesis_vector_end-to-end_ack_scenario branch from 84df4d7 to c066e34 Compare June 4, 2026 13:17
@blt blt force-pushed the blt/disk_buffers_v2_reopen_buffer_after_restart_crash_loop branch from 597b732 to b3c5011 Compare June 4, 2026 14:57
@blt blt force-pushed the blt/antithesis_vector_end-to-end_ack_scenario branch 2 times, most recently from 8ff610c to 4811fb8 Compare June 4, 2026 15:05
@blt blt force-pushed the blt/disk_buffers_v2_reopen_buffer_after_restart_crash_loop branch 2 times, most recently from 05d79f8 to fe305fa Compare June 4, 2026 15:20
@blt blt force-pushed the blt/antithesis_vector_end-to-end_ack_scenario branch from 4811fb8 to 85cfa4b Compare June 4, 2026 15:20
@github-actions github-actions Bot added the domain: vdev Anything related to the vdev tooling label Jun 4, 2026
@blt blt changed the title disk buffers v2 reopen buffer after restart crash loop fix: disk buffers v2 reopen buffer after restart crash loop Jun 4, 2026
@blt blt changed the title fix: disk buffers v2 reopen buffer after restart crash loop fix(buffers): disk buffers v2 reopen buffer after restart crash loop Jun 4, 2026
@blt blt marked this pull request as ready for review June 4, 2026 15:28
@blt blt requested a review from a team as a code owner June 4, 2026 15:28
@blt blt force-pushed the blt/antithesis_vector_end-to-end_ack_scenario branch from 85cfa4b to d40777e Compare June 4, 2026 16:32
@blt blt force-pushed the blt/disk_buffers_v2_reopen_buffer_after_restart_crash_loop branch from fe305fa to 17992e7 Compare June 4, 2026 16:32

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 17992e7285

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

data_file_path = data_file_path.to_string_lossy().as_ref(),
"Reader resume data file is missing; it was fully acknowledged before deletion. Advancing past it."
);
self.ledger.increment_acked_reader_file_id();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Repair ledger accounting before skipping missing files

When a crash happens after delete_completed_data_file unlinks a data file but before its mmap changes are durably flushed, the restored ledger can still include that file's bytes and last-read record state even though the file is gone. This branch only advances the reader file id, so the skipped file's size remains in total_buffer_size and its record ids may be treated as gaps; after the remaining files drain, the buffer can still appear non-empty/full and the reader can wait for a next writer file instead of reaching an empty state. The recovery path needs to make the same accounting durable before unlink, or explicitly repair the skipped file's ledger state here.

Useful? React with 👍 / 👎.

@blt blt force-pushed the blt/disk_buffers_v2_reopen_buffer_after_restart_crash_loop branch from 17992e7 to 6dfab2e Compare June 4, 2026 17:08
@blt blt force-pushed the blt/antithesis_vector_end-to-end_ack_scenario branch from d40777e to 7ecc846 Compare June 4, 2026 17:08
@github-actions github-actions Bot removed the domain: vdev Anything related to the vdev tooling label Jun 4, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6dfab2e1a6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

data_file_path = data_file_path.to_string_lossy().as_ref(),
"Reader resume data file is missing; it was fully acknowledged before deletion. Advancing past it."
);
self.ledger.increment_acked_reader_file_id();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Do not assume missing reader files were acknowledged

This branch also matches buffers affected by the creation-side durability bug this patch calls out: before this change (and still on non-Unix, where sync_directory is a no-op), a crash after the ledger/file contents were flushed but before the new data file's directory entry was durable can leave reader_file_id < writer_file_id with the reader's data file missing even though none of its records were delivered. Incrementing the acked reader file id here silently skips those unread records (and leaves the ledger's byte/record accounting inconsistent), so the recovery path should distinguish/repair the deleted-after-ack case rather than treating every missing file below the writer as fully acknowledged.

Useful? React with 👍 / 👎.

@blt blt force-pushed the blt/antithesis_vector_end-to-end_ack_scenario branch from 7ecc846 to b4576c4 Compare June 4, 2026 17:30
@blt blt force-pushed the blt/disk_buffers_v2_reopen_buffer_after_restart_crash_loop branch from 6dfab2e to 80aa8c9 Compare June 4, 2026 17:30

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 80aa8c93f1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 886 to 888
let data_file_mmap = self
.ledger
.filesystem()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Recheck reader position before mmaping writer file

When recovery skips a missing reader file here, ensure_ready_for_read() can advance reader_file_id until it equals writer_file_id and open the writer's current file. Because the outer while condition is not rechecked before this mmap, a crash after the writer rolled to a freshly created but still-empty next data file still reopens into open_mmap_readable() on that empty writer file, which fails instead of falling back to the slow path. Recheck the reader/writer IDs after ensure_ready_for_read() and break once they are equal.

Useful? React with 👍 / 👎.

@blt blt force-pushed the blt/disk_buffers_v2_reopen_buffer_after_restart_crash_loop branch from 80aa8c9 to cf9f640 Compare June 4, 2026 18:28
@blt blt marked this pull request as draft June 4, 2026 18:42
blt added 2 commits June 12, 2026 16:55
We have observed in Antithesis scenario `vector_to_vector_e2e_disk` that
the disk buffer can be driven to a state where Vector will crash loop
without recovery. This happens when the reader is set to resume from a
data file that no longer exists and will not be created.

Strictly, the disk buffer reader is meant to chase the writer. This
happens because the deletion of completed data files is not atomic. When
a reader finishes an acks a data file `delete_completed_data_file`:

* unlinks the .dat file
* advances the reader id in the ledger
* fsyncs the ledger

If Vector crashes after the unlink but before the ledger is fsync'ed
this will cause the next instance of Vector to attempt to read a .dat
that will never appear, crashing said instance. Mostly. OS page cache
complicates this some, as the newly introduced
`reopen_recovers_with_reader_resume_data_file_is_missing` discusses.

REF SMPTNG-749
This commit fixes the crash-loop by carefully re-arranging when the
ledger is updated and ensuring that directory metadata is also
fsynced. This later is important as ledger still updates after the
unlink takes place. If a file is gone we can assume that it has been
read iff the ledger points to a new dat file, meaning that the data was
previously read entirely and readers may skip the 'stale' reference.

I have left this skip event as a warn! but considering how likely this
is in practice in the event of node termination, for example, it might
rightly be a debug.

Overflow of dat file indexes remains a concern but 1. I would like to
demonstrate this in an antithesis run and 2. that belongs in a separate
commit.
@blt blt force-pushed the blt/antithesis_vector_end-to-end_ack_scenario branch from b4576c4 to f538c36 Compare June 12, 2026 16:57
@blt blt force-pushed the blt/disk_buffers_v2_reopen_buffer_after_restart_crash_loop branch from cf9f640 to 0e78f49 Compare June 12, 2026 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants