WIP/adapter: bypass catalog_transact_inner for storage usage updates (25x speedup) #35436
Draft
aljoscha wants to merge 14 commits into MaterializeInc:main from
Conversation
Documenting the finding that storage_usage_update blocks the coordinator main loop for seconds in large environments, and a proposal to bypass catalog_transact_inner since the durable ID allocator it uses is unused. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace PROMPT.md with storage-usage coordinator blocking investigation instructions. Create storage-usage-log.md for recording session findings. Modeled after the ddl-perf prompt/log workflow: each milestone gets committed, re-read PROMPT.md after context compaction, update log and design doc as findings emerge. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Baseline measurements (optimized build, --storage-usage-collection-interval-sec=10s):
- ~2.4k shards: ~51ms/cycle
- ~5k shards: ~150ms/cycle
- ~10k shards: ~499ms/cycle
Scaling is linear with shard count. Also updated setup notes: always use --optimized builds for measurements (debug builds are too slow for reliable cycle timing), and the collection interval is a startup flag, not an ALTER SYSTEM SET variable. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added timing logs to storage_usage_update and catalog::transact to identify the dominant cost of coordinator blocking during storage usage collection. Results at 10k shards (optimized build):
- transact_inner op loop: ~430ms (91%)
- Open transaction: ~10ms (2%)
- Persist commit: ~6ms (1.3%)
- Group commit: ~15ms (3%)
- Oracle: ~2ms (0.4%)
The fix (bypassing catalog_transact_inner) should reduce the total from ~475ms to ~17ms. Updated design doc and log with findings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
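A breakdown like the one above can be gathered with simple `Instant`-based phase timers. The sketch below is a hypothetical stand-in (`PhaseTimer` is not Materialize's actual instrumentation) showing how wall-clock time can be attributed to named phases and reported as percentages:

```rust
use std::time::{Duration, Instant};

// Hypothetical helper for attributing wall-clock time to named phases,
// in the spirit of the timing logs added to storage_usage_update.
struct PhaseTimer {
    phases: Vec<(&'static str, Duration)>,
}

impl PhaseTimer {
    fn new() -> Self {
        PhaseTimer { phases: Vec::new() }
    }

    // Run `f`, recording how long it took under `name`.
    fn time<T>(&mut self, name: &'static str, f: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = f();
        self.phases.push((name, start.elapsed()));
        out
    }

    // Report each phase as a percentage of the total recorded time.
    fn report(&self) -> Vec<(&'static str, f64)> {
        let total: Duration = self.phases.iter().map(|(_, d)| *d).sum();
        self.phases
            .iter()
            .map(|(n, d)| (*n, 100.0 * d.as_secs_f64() / total.as_secs_f64()))
            .collect()
    }
}

fn main() {
    let mut t = PhaseTimer::new();
    // Stand-in workloads; the real phases are the op loop, persist commit, etc.
    t.time("op loop", || std::thread::sleep(Duration::from_millis(40)));
    t.time("group commit", || std::thread::sleep(Duration::from_millis(10)));
    for (name, pct) in t.report() {
        println!("{name}: {pct:.1}%");
    }
}
```

Percentages computed this way only cover time spent inside timed closures; untimed gaps between phases would make the reported shares overstate their true fraction of the cycle.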
…peedup) Rewrote storage_usage_update to bypass catalog_transact_inner entirely. Instead of creating N Op::WeirdStorageUsageUpdates ops and routing them through the catalog transaction machinery (which incurs a durable persist transaction just to bump an unused id allocator), we now:
1. Get a write timestamp from the oracle (unchanged, ~2ms)
2. Pack N BuiltinTableUpdate rows directly via pack_storage_usage_update
3. Submit via builtin_table_update().execute() (group commit, ~18ms)
The durable STORAGE_USAGE_ID_ALLOC_KEY allocator is replaced with a cheap local counter (storage_usage_next_id on the Coordinator). All rows within one collection batch share the same id so that consumers can identify which rows were collected together.
Results at 10k shards (optimized build):
- Before: ~499ms per collection cycle
- After: ~20ms per collection cycle
- Speedup: 25x
The Op::WeirdStorageUsageUpdates variant and durable id allocator are now dead code and can be cleaned up in a follow-up. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
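The shape of the new path can be sketched outside the codebase. The types below (`ShardUsage`, `UsageRow`, `pack_usage_batch`) are hypothetical stand-ins for the real BuiltinTableUpdate machinery; what they illustrate is packing every row of one collection cycle directly, under a single batch id drawn from an in-memory counter, rather than routing N ops through a catalog transaction:

```rust
// Stand-in types; the real path packs BuiltinTableUpdate rows and submits
// them through group commit.
#[derive(Debug, Clone, PartialEq)]
struct ShardUsage {
    shard: String,
    size_bytes: u64,
}

#[derive(Debug, Clone, PartialEq)]
struct UsageRow {
    collection_id: u64, // shared by every row in one collection batch
    shard: String,
    size_bytes: u64,
}

// Pack all rows for one collection cycle directly, assigning them a single
// id from a cheap in-memory counter instead of a durable allocator.
fn pack_usage_batch(next_id: &mut u64, usages: &[ShardUsage]) -> Vec<UsageRow> {
    let collection_id = *next_id;
    *next_id += 1;
    usages
        .iter()
        .map(|u| UsageRow {
            collection_id,
            shard: u.shard.clone(),
            size_bytes: u.size_bytes,
        })
        .collect()
}

fn main() {
    let mut next_id = 0u64;
    let usages = vec![
        ShardUsage { shard: "s1".into(), size_bytes: 1024 },
        ShardUsage { shard: "s2".into(), size_bytes: 2048 },
    ];
    // One batch, one id, one write: no per-row catalog ops.
    let batch = pack_usage_batch(&mut next_id, &usages);
    assert!(batch.iter().all(|r| r.collection_id == 0));
    println!("{batch:?}");
}
```

Because the per-batch work is a counter bump plus a row-packing loop, the cost scales with row construction alone, consistent with the op loop (not persist I/O) having been 91% of the old cycle time.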
…fix results Added measured cost breakdown (the op loop is 91% of the cost, not persist I/O), scaling data across shard counts, prototype fix results (25x speedup at 10k shards), and remaining cleanup items. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove Op::WeirdStorageUsageUpdates, its transact_op match arm, the weird_builtin_table_update return path, allocate_storage_usage_ids, STORAGE_USAGE_ID_ALLOC_KEY, and the diagnostic info! log from storage_usage_update. All of these became dead code when the storage usage path was changed to bypass catalog_transact_inner and pack builtin table rows directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove storage_usage IdAllocKey entries from catalog test snapshots to match the removal of storage usage code. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
368e078 to 67e53b9
Rewrote storage_usage_update to bypass catalog_transact_inner entirely.
Instead of creating N Op::WeirdStorageUsageUpdates ops and routing them
through the catalog transaction machinery (which incurs a durable persist
transaction just to bump an unused id allocator), we now:

1. Get a write timestamp from the oracle (unchanged, ~2ms)
2. Pack N BuiltinTableUpdate rows directly via pack_storage_usage_update
3. Submit via builtin_table_update().execute() (group commit, ~18ms)
The durable STORAGE_USAGE_ID_ALLOC_KEY allocator is replaced with a cheap
local counter (storage_usage_next_id on the Coordinator). The id column in
mz_storage_usage_by_shard is unused by any view or query.
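The cost difference between the two allocation strategies can be modeled with a toy trait. `IdAlloc` and both implementations below are hypothetical, not the real STORAGE_USAGE_ID_ALLOC_KEY code path; the durable variant merely counts simulated writes where the real allocator pays a persist-backed catalog transaction:

```rust
// Toy model of the two allocation strategies.
trait IdAlloc {
    fn next(&mut self) -> u64;
}

// Durable-style: every allocation is "written" before it is handed out.
struct DurableAlloc {
    stored: u64,
    writes: u64, // counts simulated durable writes
}

impl IdAlloc for DurableAlloc {
    fn next(&mut self) -> u64 {
        self.writes += 1; // stand-in for the persist transaction
        let id = self.stored;
        self.stored += 1;
        id
    }
}

// Local counter, like storage_usage_next_id on the Coordinator: a plain
// field bump with no durable write. This assumes, per the PR description,
// that the id column is unused by any view or query and only needs to
// group rows collected in the same batch.
struct LocalAlloc {
    next: u64,
}

impl IdAlloc for LocalAlloc {
    fn next(&mut self) -> u64 {
        let id = self.next;
        self.next += 1;
        id
    }
}

fn main() {
    let mut durable = DurableAlloc { stored: 0, writes: 0 };
    let mut local = LocalAlloc { next: 0 };
    for _ in 0..3 {
        durable.next();
        local.next();
    }
    // Three collection cycles cost three durable writes under the old
    // scheme and zero under the local counter.
    assert_eq!(durable.writes, 3);
    println!("durable writes avoided: {}", durable.writes);
}
```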
Results at 10k shards (optimized build):

- Before: ~499ms per collection cycle
- After: ~20ms per collection cycle
- Speedup: 25x
The Op::WeirdStorageUsageUpdates variant and durable id allocator are now
dead code and can be cleaned up in a follow-up.