perf: Skip RowFilter when all predicate columns are in the projection#20417
perf: Skip RowFilter when all predicate columns are in the projection#20417darmie wants to merge 5 commits intoapache:mainfrom
Conversation
When all predicate columns are in the output projection, late materialization provides no I/O benefit. Replace the expensive RowFilter path with a lightweight batch-level filter to avoid CachedArrayReader/ReadPlanBuilder/try_next_batch overhead.
Add a dedicated test verifying that when all predicate columns are in the output projection, the opener skips RowFilter and applies a batch filter instead — and that both the batch filter and RowFilter paths produce correct results. Simplify the 4-way stream branching into two independent steps: first apply the empty-batch filter, then optionally wrap with EarlyStoppingStream.
Skip dynamic filter expressions (TopK, join pushdown) when deciding whether a predicate is single-conjunct. This preserves the batch filter optimization for queries like Q25 (WHERE col <> '' ORDER BY col LIMIT N) where TopK adds runtime conjuncts, while still routing multi-conjunct static predicates through RowFilter for incremental evaluation.
|
run benchmark clickbench_partitioned |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
run benchmark tpch |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
run benchmark tpcds |
|
🤖 |
|
show benchmark queue |
|
🤖 Hi @Dandandan, you asked to view the benchmark queue (#20417 (comment)).
|
Change is_subset to strict equality for predicate vs projection column indices. When there are non-predicate projection columns (e.g. SELECT * WHERE col = X), RowFilter provides significant value by skipping their decode for non-matching rows. Only skip RowFilter when every projected column is a predicate column. Also exclude dynamic filter expressions (TopK, join pushdown) when counting conjuncts, so runtime-generated filters don't prevent the batch filter optimization for single static predicates.
|
run benchmark clickbench_partitioned |
Which issue does this PR close?
Rationale for this change
When
pushdown_filters = trueand all predicate columns are already in the output projection, the arrow-rsRowFilter(late materialization) machinery provides zero I/O benefit — those columns must be decoded for the projection anyway. Yet the RowFilter adds substantial CPU overhead fromCachedArrayReader,ReadPlanBuilder::with_predicate, andParquetDecoderState::try_next_batch(~1100 extra CPU samples on Q10 flamegraph). This causes regressions on 15 of the 43 ClickBench queries.See profiling details.
What changes are included in this PR?
In
opener.rs, before callingbuild_row_filter(), check whether all predicate column indices are a subset of the projection column indices. If so:build_row_filter()entirely (no RowFilter overhead)batch_filter()If not a subset (i.e., there are non-projected columns that could be skipped), proceed with the RowFilter path as before.
ClickBench results on key regression queries (pushdown ON, fix vs baseline):
Are these changes tested?
Yes. Added
test_skip_row_filter_when_filter_cols_subset_of_projectionwhich validates:All existing tests pass (81 tests in
datafusion-datasource-parquet).Are there any user-facing changes?
No. Behavior is identical — queries return the same results. Performance improves for queries where filter columns overlap with projection columns when
pushdown_filters = true.