feat(merge): adversarial-review test coverage (F4/F5/F7) + F14 sub-region engine fix by g-talbot · Pull Request #6428 · quickwit-oss/quickwit

g-talbot · 2026-05-13T18:00:25Z

Stacked on #6423. Closes the remaining open code findings from the adversarial review (.planning/reviews/2026-05-13-adversarial-review-parquet-merger.md).

Summary

Three test additions plus one engine fix surfaced by the F4 tests:

F4 — MS-7 generalized to multi-input × sub-region.
F14 — sub-region path inverted col/sub-region loop nesting to keep page cache bounded.
F5 — prefix-aware proptest over the streaming engine.
F7 — production-shape integration test (5 inputs × ~4 RGs × 4 outputs × prefix_len=1).

F4: MS-7 generalized

test_ms7_per_input_bound_across_num_inputs sweeps num_inputs ∈ {1, 3, 8} × rows_per_input ∈ {3 000, 30 000} and asserts the per-input peak stays bounded.
test_ms7_per_input_bound_across_sub_regions_does_not_scale_with_rows runs the prefix_len=0 multi-output sub-region path at 3 000 vs 30 000 rows and asserts peak doesn't scale with input row count.

The sub-region test surfaced F14 — without the engine fix, the sub-region path's peak grew ~9× when rows grew 10×.

F14: invert col/sub-region loop nesting

Parquet streams emit pages in column-major order (all of col 0, then all of col 1, ...). The old sub-region-outer / col-inner ordering meant that while processing sub-region 0's col K, the stream emitted cols 0..K-1's remaining pages first to reach col K — those skipped pages got cached under their own col_idx for later sub-regions to consume. The cache scaled with input row count.

Fix: new process_split_region_col_outer function for the needs_split path. Cols iterate in the outer loop, sub-regions in the inner. Each parquet col chunk is fully consumed from the stream across all sub-regions before col K+1 starts. Single-region path stays on the existing process_region unchanged.

Mechanics: pre-determine writer assignments for the region's sub-regions (sub-regions can span multiple output writers; consecutive sub-regions on the same writer get coalesced into one combined Region so each writer holds one concurrent RG — RGs on the same writer are sequential per parquet's single-active-RG constraint).

F5: prefix-aware proptest

prop_merge_prefix_aligned_streaming sweeps (num_inputs ∈ 1..=3, per-input RG specs, num_outputs ∈ 1..=3) with prefix_len=1. Asserts on every generated case:

MC-1: rows preserved.
MC-3: sorted_series monotone within each output.
MS-3: MergeOutputFile.num_row_groups matches footer.
PA-1+PA-3: assert_unique_rg_prefix_keys passes.
CS-1: MergeOutputFile.output_rg_partition_prefix_len matches on-disk KV.

32 cases capped to keep runtime tight. Fixture honors the storekey property: different metric_names produce non-overlapping sorted_series byte ranges, same (metric, row_offset) across inputs gets the same sorted_series (the realistic "same series in multiple splits" case).

F7: production-shape integration test

test_f7_production_shape_multi_input_multi_rg_multi_output: 5 inputs × ~4 prefix-aligned RGs each × 4 outputs × prefix_len=1, with overlap (every metric appears in 2-3 of the 5 inputs). Asserts MC-1, MS-3, PA-1+PA-3, MS-5 (cross-output sorted_series monotonicity — a single metric CAN span outputs when its region overflows the per-output budget; the cross-output invariant is sorted_series monotonicity, not "each metric in one output"), and CS-1.

Test plan

cargo test -p quickwit-parquet-engine --lib — 498 unit tests pass (was 495 on feat(merge): legacy promotion path + body-col schema evolution #6423; added test_ms7_per_input_bound_across_num_inputs, test_ms7_per_input_bound_across_sub_regions_does_not_scale_with_rows, prop_merge_prefix_aligned_streaming, test_f5_single_input_two_metrics_minimal, test_f7_production_shape_multi_input_multi_rg_multi_output).
cargo clippy -p quickwit-parquet-engine --tests --all-features with -Dwarnings.
cargo fmt --all -- --check (nightly).
cargo doc --no-deps -p quickwit-parquet-engine warning-free.

Status of the adversarial review

After this PR, the remaining open work from the original review is just the spec docs (PA/MS/CS supplement to ADR-002). All code findings (F1, F2, F4, F6–F14) are closed across PRs #6424, #6425, #6426, #6423, and this one.

🤖 Generated with Claude Code

Three test additions plus one engine fix surfaced by the F4 tests. ## F4: MS-7 generalized to multi-input × sub-region The existing MS-7 test proved the per-input page-cache bound for one input and one region. F4 extends the coverage: - `test_ms7_per_input_bound_across_num_inputs` sweeps `num_inputs ∈ {1, 3, 8}` × `rows_per_input ∈ {3 000, 30 000}` and asserts the per-input peak stays bounded. Cross-axis growth check: going from 1 input to 8 must not push the peak up. - `test_ms7_per_input_bound_across_sub_regions_does_not_scale_with_rows` runs the prefix_len=0 multi-output sub-region path at 3 000 vs 30 000 rows and asserts peak doesn't scale with input row count. **This test surfaced F14 (below) — without the engine fix, the sub-region path's peak grew ~9× when rows grew 10×.** Tests serialize via `ms7_serial_lock` because `PEAK_BODY_COL_PAGE_CACHE_LEN` is process-global; concurrent tests would pollute each other's readings. ## F14: invert col/sub-region loop nesting in the split-region path Parquet streams emit pages in column-major order (all of col 0, then all of col 1, ...). The old sub-region-outer / col-inner ordering meant that while processing sub-region 0's col K, the stream emitted cols 0..K-1's remaining pages first to reach col K — those skipped pages got cached under their own col_idx for later sub-regions to consume, and the cache scaled with input row count. Fix: new `process_split_region_col_outer` function for the `needs_split` path. Cols iterate in the outer loop, sub-regions in the inner. Each parquet col chunk is fully consumed from the stream across all sub-regions before col K+1 starts. Cache for col K is empty before col K+1's pages arrive. Mechanics: pre-determine writer assignments for the region's sub-regions (a top-level region's sub-regions may span multiple output writers; consecutive sub-regions on the same writer get coalesced into one combined Region so each writer holds one RG concurrently — RGs on the same writer are sequential, so coalescing keeps the parquet writer's single-active-RG constraint intact). Single-region path stays on the existing `process_region`. ## F5: prefix-aware proptest `prop_merge_prefix_aligned_streaming` sweeps `(num_inputs ∈ 1..=3, per-input RG specs, num_outputs ∈ 1..=3)` with prefix_len=1 and asserts MC-1 (rows preserved), MC-3 (sorted_series monotone within each output), MS-3 (num_row_groups matches footer), PA-1+PA-3 (`assert_unique_rg_prefix_keys`), and CS-1 (metastore prefix_len == KV) on every generated case. 32 cases capped to keep runtime under a second. Fixture: `make_prefix_len_one_input` writes one RG per `(metric_name, rows)` entry by calling `writer.flush()` between batches. `sorted_series` encodes `metric_base + row_offset_within_metric`, mirroring production's storekey property that different metric_names produce non-overlapping `sorted_series` byte ranges. Plus a focused unit test `test_f5_single_input_two_metrics_minimal` that pins one specific case for fast iteration. ## F7: production-shape integration test `test_f7_production_shape_multi_input_multi_rg_multi_output`: 5 inputs × ~4 prefix-aligned RGs each × 4 outputs × prefix_len=1. Asserts the full invariant bundle (MC-1, MS-3, PA-1+PA-3, MS-5 cross-output sorted_series monotonicity, CS-1) — the corner the adversarial review flagged as "untested production case". MS-5 is "across adjacent outputs, sorted_series is monotone non-decreasing." A single metric CAN span outputs (the engine splits at sorted_series transitions inside an overflowing region), so the cross-output invariant is sorted_series monotonicity, not "each metric in one output." ## Verification - `cargo test -p quickwit-parquet-engine --lib` — 498 unit tests pass. - `cargo clippy -p quickwit-parquet-engine --tests --all-features` with `-Dwarnings`. - `cargo doc --no-deps -p quickwit-parquet-engine` warning-free. - `cargo fmt --all -- --check` (nightly via PATH override). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

g-talbot requested a review from a team as a code owner May 13, 2026 18:00

g-talbot force-pushed the gtt/legacy-promotion-and-schema-evo branch from 2a3f09a to edd45a6 Compare May 13, 2026 18:11

g-talbot force-pushed the gtt/adversarial-review-test-coverage branch from 0e2b36d to 6ed25aa Compare May 13, 2026 18:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(merge): adversarial-review test coverage (F4/F5/F7) + F14 sub-region engine fix#6428

feat(merge): adversarial-review test coverage (F4/F5/F7) + F14 sub-region engine fix#6428
g-talbot wants to merge 1 commit into
gtt/legacy-promotion-and-schema-evofrom
gtt/adversarial-review-test-coverage

g-talbot commented May 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

g-talbot commented May 13, 2026

Summary

F4: MS-7 generalized

F14: invert col/sub-region loop nesting

F5: prefix-aware proptest

F7: production-shape integration test

Test plan

Status of the adversarial review

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant