GH-50043: [C++][Python] Fix hash_any/hash_all on sliced boolean arrays #50094
I wonder if we have the same bug elsewhere in aggregate kernels (e.g. for different types). It's out of scope here, but do you have any input, @fenfeng9?
I don't see the same kind of offset issue in nearby aggregate kernels for other data types. I'll take a closer look separately, and if I find another case, I'll file a new issue.
Perfect, thank you! This looks good to me now. Will merge if green.
Thank you for your quick and patient review.
Thanks again @fenfeng9!
Rationale for this change
hash_any and hash_all could return incorrect results for sliced nullable boolean arrays. The validity bitmap was read using the slice offset, but the boolean values bitmap was not.
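To make the mismatch concrete, here is a minimal pure-Python sketch of the failure mode. It is a simplified model, not the Arrow C++ kernel: the helpers `pack_bits`, `get_bit`, and `any_valid_true` and the `apply_offset_to_values` flag are hypothetical names introduced for illustration. The only assumption carried over from Arrow is that bitmaps are packed least-significant-bit first and that a slice is described by an offset and a length into the parent buffers.

```python
def pack_bits(bits):
    """Pack a list of booleans into an LSB-first bitmap (hypothetical helper)."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def get_bit(bitmap, i):
    """Return bit i of an LSB-first packed bitmap."""
    return bool((bitmap[i // 8] >> (i % 8)) & 1)


def any_valid_true(values, validity, offset, length, apply_offset_to_values):
    """Return True if any non-null slot of the slice holds a true value.

    With apply_offset_to_values=False the validity bitmap honours the slice
    offset but the values bitmap does not, which models the reported bug.
    """
    for i in range(length):
        if not get_bit(validity, offset + i):
            continue  # null slot, ignored by the aggregation
        value_index = offset + i if apply_offset_to_values else i
        if get_bit(values, value_index):
            return True
    return False


# Parent array: [True, null, False, False]
values = pack_bits([True, True, False, False])
validity = pack_bits([True, False, True, True])

# Slice with offset=2, length=2 covers [False, False], both non-null.
buggy = any_valid_true(values, validity, 2, 2, apply_offset_to_values=False)
fixed = any_valid_true(values, validity, 2, 2, apply_offset_to_values=True)
```

In this model `buggy` is `True` because the values are read from slots 0 and 1 of the parent buffer, while `fixed` is `False` because both bitmaps are read from slots 2 and 3.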
What changes are included in this PR?
Apply the slice offset when reading boolean values in hash_any/hash_all.
Add C++ and Python regression tests.
Are these changes tested?
Yes.
Are there any user-facing changes?
No.