Skip to content

perf: Optimize set operations to avoid RowConverter deserialization overhead#20623

Open
neilconway wants to merge 7 commits intoapache:mainfrom
neilconway:neilc/optimize-array-take-results
Open

perf: Optimize set operations to avoid RowConverter deserialization overhead#20623
neilconway wants to merge 7 commits intoapache:mainfrom
neilconway:neilc/optimize-array-take-results

Conversation

@neilconway
Copy link
Contributor

@neilconway neilconway commented Feb 28, 2026

Which issue does this PR close?

Rationale for this change

Several array set operations (e.g., array_distinct, array_union, array_intersect, array_except) share a similar structure:

  • Convert the input(s) using RowConverter, ideally in bulk
  • Apply the set operation as appropriate, which involves adding or removing elements from the candidate set of result Rows
  • Convert the final set of Rows back into ArrayRef

We can do better for the final step: instead of converting from Rows back into ArrayRef, we can just track which indices in the input(s) correspond to the values we want to return. We can then grab those values with a single take, which avoids the Row -> ArrayRef deserialization overhead. This is a 5-20% performance win, depending on the set operation and the characteristics of the input.

The only wrinkle is that for intersect and union, because there are multiple inputs we need to concatenate the inputs together so that we have a single index space. It turns out that this optimization is a win, even incurring the concat overhead.

What changes are included in this PR?

  • Add a benchmark for array_except
  • Implement this optimization for array_distinct, array_union, array_intersect, array_except

Are these changes tested?

Yes, and benchmarked.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the functions Changes to functions implementation label Feb 28, 2026
@neilconway
Copy link
Contributor Author

Benchmarks:

  array_union:

  ┌──────────────────┬─────────┬─────────┬────────┐
  │     Scenario     │ Before  │  After  │ Change │
  ├──────────────────┼─────────┼─────────┼────────┤
  │ high_overlap/10  │ 237 µs  │ 208 µs  │ -12.3% │
  ├──────────────────┼─────────┼─────────┼────────┤
  │ high_overlap/50  │ 1.06 ms │ 886 µs  │ -16.3% │
  ├──────────────────┼─────────┼─────────┼────────┤
  │ high_overlap/100 │ 2.02 ms │ 1.70 ms │ -16.1% │
  ├──────────────────┼─────────┼─────────┼────────┤
  │ low_overlap/10   │ 269 µs  │ 224 µs  │ -18.0% │
  ├──────────────────┼─────────┼─────────┼────────┤
  │ low_overlap/50   │ 1.24 ms │ 1.00 ms │ -20.9% │
  ├──────────────────┼─────────┼─────────┼────────┤
  │ low_overlap/100  │ 2.18 ms │ 1.73 ms │ -19.6% │
  └──────────────────┴─────────┴─────────┴────────┘

  array_intersect:

  ┌──────────────────┬─────────┬─────────┬────────┐
  │     Scenario     │ Before  │  After  │ Change │
  ├──────────────────┼─────────┼─────────┼────────┤
  │ high_overlap/10  │ 216 µs  │ 199 µs  │ -8.3%  │
  ├──────────────────┼─────────┼─────────┼────────┤
  │ high_overlap/50  │ 1.11 ms │ 1.01 ms │ -8.4%  │
  ├──────────────────┼─────────┼─────────┼────────┤
  │ high_overlap/100 │ 2.18 ms │ 1.99 ms │ -8.6%  │
  ├──────────────────┼─────────┼─────────┼────────┤
  │ low_overlap/10   │ 176 µs  │ 174 µs  │ -1.6%  │
  ├──────────────────┼─────────┼─────────┼────────┤
  │ low_overlap/50   │ 1.01 ms │ 997 µs  │ -1.4%  │
  ├──────────────────┼─────────┼─────────┼────────┤
  │ low_overlap/100  │ 2.02 ms │ 2.00 ms │ -0.7%  │
  └──────────────────┴─────────┴─────────┴────────┘

array_except:

  ┌──────────────────┬─────────┬─────────┬─────────────────────────┐
  │     Scenario     │ Before  │  After  │         Change          │
  ├──────────────────┼─────────┼─────────┼─────────────────────────┤
  │ high_overlap/10  │ 218 µs  │ 212 µs  │ -3.4%                   │
  ├──────────────────┼─────────┼─────────┼─────────────────────────┤
  │ high_overlap/50  │ 908 µs  │ 874 µs  │ -1.5% (not significant) │
  ├──────────────────┼─────────┼─────────┼─────────────────────────┤
  │ high_overlap/100 │ 1.74 ms │ 1.68 ms │ -2.9%                   │
  ├──────────────────┼─────────┼─────────┼─────────────────────────┤
  │ low_overlap/10   │ 250 µs  │ 231 µs  │ -8.1%                   │
  ├──────────────────┼─────────┼─────────┼─────────────────────────┤
  │ low_overlap/50   │ 1.12 ms │ 1.00 ms │ -11.2%                  │
  ├──────────────────┼─────────┼─────────┼─────────────────────────┤
  │ low_overlap/100  │ 2.01 ms │ 1.75 ms │ -11.7%                  │
  └──────────────────┴─────────┴─────────┴─────────────────────────┘

array_distinct:

  ┌────────────────────┬──────────┬──────────┬────────┐
  │      Scenario      │  Before  │  After   │ Change │
  ├────────────────────┼──────────┼──────────┼────────┤
  │ high_duplicate/10  │ 69.3 µs  │ 65.6 µs  │ -8.2%  │
  ├────────────────────┼──────────┼──────────┼────────┤
  │ high_duplicate/50  │ 350.0 µs │ 322.2 µs │ -8.9%  │
  ├────────────────────┼──────────┼──────────┼────────┤
  │ high_duplicate/100 │ 671.4 µs │ 624.8 µs │ -7.2%  │
  ├────────────────────┼──────────┼──────────┼────────┤
  │ low_duplicate/10   │ 106.8 µs │ 91.2 µs  │ -15.0% │
  ├────────────────────┼──────────┼──────────┼────────┤
  │ low_duplicate/50   │ 526.3 µs │ 434.3 µs │ -18.1% │
  ├────────────────────┼──────────┼──────────┼────────┤
  │ low_duplicate/100  │ 975.7 µs │ 767.3 µs │ -21.4% │
  └────────────────────┴──────────┴──────────┴────────┘

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

functions Changes to functions implementation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

In set ops, avoid RowConverter for results when possible

1 participant