Skip to content

CommandChanges.shouldCleanup short-circuits to NO if there is no data…#4873

Open
belliottsmith wants to merge 5 commits into
apache:trunkfrom
belliottsmith:consistent-expunge
Open

CommandChanges.shouldCleanup short-circuits to NO if there is no data…#4873
belliottsmith wants to merge 5 commits into
apache:trunkfrom
belliottsmith:consistent-expunge

Conversation

@belliottsmith

Copy link
Copy Markdown
Contributor

…, but this is incorrect as Cleanup.EXPUNGE may have dropped the data and the record must receive cleanup EXPUNGE and be reported as ERASED. This can lead to lienarizability violations.

Thanks for sending a pull request! Here are some tips if you're new here:

  • Ensure you have added or run the appropriate tests for your PR.
  • Be sure to keep the PR description updated to reflect all changes.
  • Write your PR title to summarize what this PR proposes.
  • If possible, provide a concise example to reproduce the issue for a faster review.
  • Read our contributor guidelines
  • If you're making a documentation change, see our guide to documentation contribution

Commit messages should follow the following format:

<One sentence description, usually Jira title or CHANGES.txt summary>

<Optional lengthier description (context on patch)>

patch by <Authors>; reviewed by <Reviewers> for CASSANDRA-#####

Co-authored-by: Name1 <email1>
Co-authored-by: Name2 <email2>

The Cassandra Jira

@alanwang67

Copy link
Copy Markdown
Contributor

+1

@ifesdjeen ifesdjeen requested review from Copilot and ifesdjeen June 11, 2026 13:20

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to prevent Accord journal cleanup/expunge decisions from being surfaced as NotDefined on the load path (which the PR description notes can cause linearizability violations). The visible changes in this repo primarily add/adjust tests and test utilities to reproduce and guard the behavior.

Changes:

  • Add a new regression unit test (AccordExpungeTest) covering Cleanup.EXPUNGE interactions with CommandChanges reconstruction and journal.loadCommand.
  • Extend AccordGenerators.commandsBuilder() with an overload to allow custom TxnId generation (used by the new test).
  • Adjust distributed burn-test behavior and minor test cleanup/comment removals.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
test/unit/org/apache/cassandra/utils/AccordGenerators.java Adds an overload to build commands with a caller-supplied TxnId generator.
test/unit/org/apache/cassandra/service/accord/AccordExpungeTest.java New regression test exercising EXPUNGE → loadCommand behavior across journal states.
test/distributed/org/apache/cassandra/service/accord/journal/AccordJournalBurnTest.java Sets a global burn-test tuning parameter (CMD_BASE_CHECK_CHANCE) via static initializer.
test/distributed/org/apache/cassandra/service/accord/AccordJournalCompactionTest.java Minor whitespace-only change.
test/distributed/org/apache/cassandra/distributed/test/accord/AccordLoadTest.java Removes commented-out config lines.
.gitmodules Points the modules/accord submodule at a personal fork/branch instead of the canonical Apache repo.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +139 to +142
validate((before, rb, db) -> {
CommandChanges builder = new CommandChanges(before.txnId);
builder.maybeCleanup(true, FULL, rb, db);
Command reconstructed = builder.construct(rb);
Comment on lines +169 to +180
private void loadAndValidate(Command before, Journal journal, RedundantBefore rb, DurableBefore db)
{
SoftAssertions checks = new SoftAssertions();
Command loaded = journal.loadCommand(COMMAND_STORE_ID, before.txnId, rb, db);
checks.assertThat(loaded).as("loadCommand returned null after RedundantBefore advance for %s; "
+ "AccordSafeCommand will surface this as NotDefined", loaded).isNotNull();

checks.assertThat(loaded.saveStatus())
.as("loadCommand did not return Erased for previously-written %s; "
+ "loaded=%s, redundantBefore=%s", before, loaded, rb)
.isEqualTo(SaveStatus.Erased);
}
Comment thread .gitmodules
Comment on lines 1 to +4
[submodule "modules/accord"]
path = modules/accord
url = https://github.com/apache/cassandra-accord.git
branch = trunk
url = https://github.com/belliottsmith/cassandra-accord.git
branch = consistent-expunge
Comment on lines +104 to +107
static
{
Cluster.RandomLoader.CMD_BASE_CHECK_CHANCE = 0.1f;
}
…, but this is incorrect as Cleanup.EXPUNGE may have dropped the data and the record must receive cleanup EXPUNGE and be reported as ERASED. This can lead to lienarizability violations.
@belliottsmith belliottsmith force-pushed the consistent-expunge branch 5 times, most recently from 4d1f0a5 to 03ddba8 Compare June 12, 2026 11:00
@belliottsmith belliottsmith force-pushed the consistent-expunge branch 2 times, most recently from c05492b to e938fcb Compare June 12, 2026 18:15
 - system_accord_debug.executors has wrong clustering type
 - RemoteToLocalVirtualTable should lazily allocate the partition collector to avoid LIMIT clause filtering removing it before it's populated
 - Ensure ReadCoordinator callbacks are invoked on owning thread
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants