Skip to content

HBASE-29995 Improve existing hash implementations by reading 4 bytes at once#7934

Open
jinhyukify wants to merge 5 commits intoapache:masterfrom
jinhyukify:HBASE-29995
Open

HBASE-29995 Improve existing hash implementations by reading 4 bytes at once#7934
jinhyukify wants to merge 5 commits intoapache:masterfrom
jinhyukify:HBASE-29995

Conversation

@jinhyukify
Copy link
Contributor

@Apache9
Copy link
Contributor

Apache9 commented Mar 15, 2026

Please add a dummy change in pom.xml so we can run all the tests in HBase to see if there are any problems with the changes? once we confirm there is no problem, you can do a force push to remove the dummy commit and then merge the PR.

And why not read 8 bytes at once?

@jinhyukify
Copy link
Contributor Author

@Apache9 Thank you for checking!

And why not read 8 bytes at once?

The existing hashes (Jenkins, Murmur, Murmur3) are all 4-byte based, so they operate on 4-byte chunks. There is currently no place in the algorithms where 8-byte reads are used.

In theory we could read 8 bytes and split them into two ints, but given the current structure it would not provide much benefit. If we add hash functions that operate on 8-byte words in the future, we could revisit this.

return (int) assembleCrossingLE(offset, Bytes.SIZEOF_INT);
}

private long assembleCrossingLE(int offset, int wordBytes) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're adding a non-trivial amount of code here, and it seems to overlap with the existing get(int offset) implementation to some degree. Would it make sense to refactor get to call this new method instead, so we can reduce code duplication? We could consider that approach as long as the performance overhead is negligible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for checking.
I changed to use assembleCrossingLE method in get method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants