Skip to content

feat: Count aggegate#7267

Open
blaginin wants to merge 4 commits intodevelopfrom
db/count
Open

feat: Count aggegate#7267
blaginin wants to merge 4 commits intodevelopfrom
db/count

Conversation

@blaginin
Copy link
Copy Markdown
Member

@blaginin blaginin commented Apr 2, 2026

blaginin added 3 commits April 2, 2026 16:57
Signed-off-by: blaginin <github@blaginin.me>
Signed-off-by: blaginin <github@blaginin.me>
Signed-off-by: blaginin <github@blaginin.me>
@blaginin blaginin requested a review from joseph-isaacs April 2, 2026 16:05
@blaginin blaginin self-assigned this Apr 2, 2026
@blaginin blaginin added the changelog/feature A new feature label Apr 2, 2026
@blaginin blaginin enabled auto-merge (squash) April 2, 2026 16:05
@blaginin blaginin changed the title feat:Count aggegate feat: Count aggegate Apr 2, 2026
.as_primitive()
.typed_value::<u64>()
.vortex_expect("count partial should not be null");
*partial += val;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about overflow?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from nan_count i assume we don't really think this would happen in practice

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's probably fair for a count to not exceed u64

type Partial = u64;

fn id(&self) -> AggregateFnId {
AggregateFnId::new_ref("vortex.count")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure (also @gatesn ) we want a global namespace of scalar-fns and aggregations?

Signed-off-by: blaginin <github@blaginin.me>
Comment on lines +115 to +122
fn try_accumulate(
&self,
_state: &mut Self::Partial,
_batch: &ArrayRef,
_ctx: &mut ExecutionCtx,
) -> VortexResult<bool> {
Ok(false)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you remove this. We want a different system to short cut decompression, similar to execute_parent

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think then let's wait for that system to merge - any pr i should follow?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this the equivalent of "reduce"? i.e. something that can operate without looking at data buffers.

So maybe we just need to rename this to fn reduce(...) and Joe will be happy!

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq bot commented Apr 2, 2026

Merging this PR will improve performance by 12.32%

⚡ 1 improved benchmark
✅ 1121 untouched benchmarks
⏩ 1530 skipped benchmarks1

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation chunked_bool_canonical_into[(1000, 10)] 72.9 µs 64.9 µs +12.32%

Comparing db/count (329a3a0) with develop (38ab5af)

Open in CodSpeed

Footnotes

  1. 1530 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog/feature A new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants