Skip to content

Revert to cfb1e15: indexLock intersection with block locks is sufficient#1

Draft
Copilot wants to merge 9 commits intomasterfrom
copilot/optimize-throughput-16kib-blocks
Draft

Revert to cfb1e15: indexLock intersection with block locks is sufficient#1
Copilot wants to merge 9 commits intomasterfrom
copilot/optimize-throughput-16kib-blocks

Conversation

Copy link

Copilot AI commented Mar 20, 2026

Previous fix attempts over-engineered the solution to "hash is findable before data is committed." The indexLock overlap with block lock acquisition already provides the necessary guarantee — no special index-update ordering is required.

Why the existing design is correct

Write path:

  1. Acquire indexLock → acquire blockWriteLock (while holding indexLock)
  2. UpdateIndex — expose new hash in in-memory dict
  3. MarkDirtyAsync — set dirty bit before any persisted changes
  4. Release indexLock; then CommitWrite under blockWriteLock only

Read path: acquires blockReadLock while holding indexLock

The overlap is the invariant: any reader that finds the new hash must contend for blockReadLock, which blocks until blockWriteLock is released after CommitWrite. Readers always observe committed data. Crash safety follows from MarkDirtyAsync firing before CommitWrite.

Changes

  • Restore cfb1e15 logic in BlockCache, BlockStorage, and BlockIndex — reverts the three-phase placeholder approach and the "hold indexLock through data write" approach
  • Expand comments on blockWriteLock acquisition and CommitWrite call site to make the invariant explicit
  • Add <summary> docs to BlockStorage.UpdateIndex, CommitWrite, and MarkDirtyAsync capturing locking contract and crash-safety ordering requirement

💬 Send tasks to Copilot coding agent from Slack and Teams to turn conversations into code. Copilot posts an update in your thread when it's finished.

BlockCache.WriteAsync used to hold the global indexLock semaphore for the
entire write, including the 16KiB memcpy into the backing store.  The read
path already released the lock before its data copy; this brings writes in
line.

New locking sequence for writes:
  1. Hold indexLock: eviction, UpdateIndex (dictionary only), MarkDirtyAsync,
     acquire block write lock (while still under indexLock to prevent races).
  2. Release indexLock.
  3. CommitWrite under block lock only: persisted index entry + 16KiB copy.

Supporting changes:
- BlockIndex.UpdateDictionary: updates in-memory positions dictionary only.
- BlockIndex.CommitEntry: writes the persisted index entry only.
- BlockStorage.UpdateIndex/CommitWrite: thin wrappers exposing the split.
- BlockStorage.MarkDirtyAsync: promoted private -> internal.

Co-authored-by: lostmsu <239520+lostmsu@users.noreply.github.com>
Copilot AI changed the title [WIP] Suggest optimizations to increase throughput at 16KiB blocks Release indexLock before the 16 KiB memcpy in WriteAsync Mar 20, 2026
Copilot AI requested a review from lostmsu March 20, 2026 16:47
StressTest.MeasureWriteThroughputAsync drives N concurrent write-only tasks
against any IContentCache for the given duration, returning bytes/second.

ThroughputTests.WriteIsFasterWithIndexLockReleasedBeforeDataCopy runs two
10-second scenarios and asserts the optimized BlockCache is faster:
  Before (GlobalWriteSerializingCache — global SemaphoreSlim(1) for writes,
          simulating old indexLock-held-during-copy behaviour): ~725 MiB/s
  After  (optimized BlockCache, indexLock released before data copy): ~1.36 GiB/s

GlobalWriteSerializingCache implements IDisposable so the SemaphoreSlim
is disposed properly.

Co-authored-by: lostmsu <239520+lostmsu@users.noreply.github.com>
Agent-Logs-Url: https://github.com/BorgGames/Hash/sessions/5ec2bd17-f616-46ca-a50d-5d637e6b2c39
Copilot AI changed the title Release indexLock before the 16 KiB memcpy in WriteAsync Optimize 16KiB write throughput: release indexLock before data copy Mar 20, 2026
StressTest.RunCorrectnessAsync drives ProcessorCount*2 concurrent read/write/
hash-verify tasks against any IContentCache for the given duration, throwing
HashMismatchException immediately if any data corruption is detected.

CorrectnessTests.ConcurrentReadWriteProducesNoDataCorruption exercises an
in-memory BlockCache (64 blocks, forcing constant evictions) directly — no
TCP layer. This directly answers "are there correctness stress-tests?" and
validates the locking changes produce no data corruption.

Also address code-review feedback from the previous commit:
- BlockCache: remove redundant `blockWriteLock = default` after disposal in
  the catch path (exception is re-thrown immediately, variable unused)
- BlockIndex.UpdateDictionary: include blockIndex in both error messages to
  aid debugging of concurrency faults

Co-authored-by: lostmsu <239520+lostmsu@users.noreply.github.com>
Agent-Logs-Url: https://github.com/BorgGames/Hash/sessions/0930c8ef-24d0-4fbb-8238-b8802f739a5b
Copilot AI changed the title Optimize 16KiB write throughput: release indexLock before data copy Add correctness and before/after throughput stress tests for BlockCache Mar 20, 2026
- StressTest.AbuseAsync: remove unused exception variable e from
  catch (OperationCanceledException e) — fixes CS0168 warning
- BlockCache.WriteAsync: add comment explaining why index and
  blockWriteLock are hoisted before the try block (both are needed
  after the finally/indexLock.Release continuation)

Co-authored-by: lostmsu <239520+lostmsu@users.noreply.github.com>
Agent-Logs-Url: https://github.com/BorgGames/Hash/sessions/39ffc0ea-d8a6-48b7-b728-ec388d6592af
Copilot AI changed the title Add correctness and before/after throughput stress tests for BlockCache Add direct BlockCache correctness stress test Mar 20, 2026
- Delete test/ThroughputTests.cs: removes the before/after perf
  comparison test (added for discussion only) and GlobalWriteSerializingCache,
  which simulated the old code path that held indexLock through the data copy
- Remove StressTest.MeasureWriteThroughputAsync and WriteLoopAsync: both were
  only used by ThroughputTests
- Simplify BlockIndex.UpdateDictionary error messages to single-line
  interpolated strings (code review feedback)

CI already runs `dotnet test` with no filter, so CorrectnessTests runs on
every push/PR without any workflow changes.

Co-authored-by: lostmsu <239520+lostmsu@users.noreply.github.com>
Agent-Logs-Url: https://github.com/BorgGames/Hash/sessions/4bdb24b4-cd77-463e-ad7f-7d1da349ec3b
Copilot AI changed the title Add direct BlockCache correctness stress test Remove throughput comparison test and old code path; correctness tests in CI Mar 20, 2026
The optimization called UpdateIndex (making the new hash visible in the
in-memory dictionary) before CommitWrite (writing the actual block data).
This violated the invariant: the hash was observable in the index before
the corresponding data was committed.

Fix: swap the order so CommitWrite runs first, then UpdateIndex, both
still under indexLock and blockWriteLock.  The hash is now only visible
in the dictionary after the data is already written.

The blockWriteLock is still needed to prevent a concurrent reader that
already holds readLock from reading a block while data is being written.

Update docstrings in BlockStorage and BlockIndex to reflect the new
calling contract for CommitWrite/UpdateIndex and CommitEntry/UpdateDictionary.

Co-authored-by: lostmsu <239520+lostmsu@users.noreply.github.com>
Agent-Logs-Url: https://github.com/BorgGames/Hash/sessions/6c77ca38-b6da-482e-9bf0-059168358e19
Copilot AI changed the title Remove throughput comparison test and old code path; correctness tests in CI Fix write-before-index-update race in BlockCache.WriteAsync Mar 20, 2026
…ck release

The previous fix incorrectly held indexLock for the entire data write, serialising
all operations and defeating the throughput optimisation.

The reviewer confirmed the runtime concern ("hash visible before data written") is not
a problem because blockWriteLock already prevents any reader from observing incomplete
data.  The correct improvement is the "invalid hash" approach for the persisted index:
  first set the persisted index entry to a placeholder, then write block data,
  then set the persisted index entry to the real hash.

Phase 1 (under indexLock + blockWriteLock):
  - MarkDirtyAsync
  - CommitPlaceholder: persisted index <- zero-length random-hash sentinel
  - UpdateIndex: in-memory dict <- remove evicted hash, add new hash
  Release indexLock

Phase 2 (blockWriteLock only): WriteBlockData

Phase 3 (blockWriteLock only): CommitFinalEntry (persisted index <- real hash+length)

Release blockWriteLock

This restores the early-indexLock-release optimisation while ensuring the persisted
index never contains a "real" hash for a block whose data has not yet been written,
making crash recovery unambiguous.

BlockStorage: replace CommitWrite with CommitPlaceholder / WriteBlockData / CommitFinalEntry
BlockIndex:   update CommitEntry and UpdateDictionary docstrings accordingly

Co-authored-by: lostmsu <239520+lostmsu@users.noreply.github.com>
Agent-Logs-Url: https://github.com/BorgGames/Hash/sessions/0d9d79ca-c2b7-41f1-a9dc-ae373fa77a34
Copilot AI changed the title Fix write-before-index-update race in BlockCache.WriteAsync Three-phase write: placeholder persisted index + restore early indexLock release Mar 20, 2026
…ndex ordering needed

The two previous commits added unnecessary complexity to address the concern
"hash is findable before data is committed". Re-reasoning from cfb1e15:

The indexLock intersection with block lock acquisitions is sufficient:
- Writer: acquires blockWriteLock WHILE holding indexLock, updates in-memory dict,
  releases indexLock, then runs CommitWrite under blockWriteLock only.
- Reader: acquires indexLock, finds hash->blockIndex, acquires blockReadLock WHILE
  holding indexLock, then releases indexLock and reads data.
- Any reader that finds the new hash must acquire blockReadLock, which blocks until
  blockWriteLock is released (after CommitWrite). Readers always see committed data.

Crash safety: MarkDirtyAsync fires before CommitWrite, so the dirty bit is always set
before any block data changes in persisted storage. A crash anywhere in CommitWrite
always triggers a full index rebuild on restart.

No special index-update ordering is needed.

Changes vs cfb1e15:
- Expand the comment on blockWriteLock acquisition to explain the invariant explicitly
- Add a comment explaining CommitWrite safety at the await-using site
- Add doc comments to BlockStorage.UpdateIndex, CommitWrite, MarkDirtyAsync
- Tighten BlockIndex.UpdateDictionary doc comment

Co-authored-by: lostmsu <239520+lostmsu@users.noreply.github.com>
Agent-Logs-Url: https://github.com/BorgGames/Hash/sessions/75d86e09-4608-4350-a920-0933ee4ecd03
Copilot AI changed the title Three-phase write: placeholder persisted index + restore early indexLock release Revert to cfb1e15: indexLock intersection with block locks is sufficient Mar 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants