feat: use HashMap for create_shared_string to fix O(N²) performance#8958
feat: use HashMap for create_shared_string to fix O(N²) performance#8958statxc wants to merge 3 commits intogoogle:masterfrom
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
aead9fc to
77cdf32
Compare
|
@jtdavis777 Could you please review this PR? thanks |
|
this will take me a bit to ingest, as I am a volunteer and not a rust specialist. But I really appreciate you working on the issue and will try to get this in as soon as I can. for now I can kick off the pipeline to make sure all works! |
|
hey @RenzoMXD -- would you be willing to help weigh in on some of these Rust PRs? |
|
I did a quick gpt session to understand what is going on here, and this seems like a safe and meaningful improvement. Having a test is perfect! GPT suggested a few small optimizations to improve efficiency overall -- feel free to take it with a grain of salt. I'd like another rust user to weigh in before I merge as well if possible just to make sure all is well. let found = self.strings_pool.binary_search_by(|offset| {
let ptr = offset.value() as usize;
let str_memory = &buf[buf.len() - ptr..];
let size = u32::from_le_bytes([
str_memory[0],
str_memory[1],
str_memory[2],
str_memory[3],
]) as usize;
let stored = &str_memory[4..4 + size];
stored.cmp(s.as_bytes())
}); |
This suggestion is good. The result is the same. Both compare stored string bytes against |
…ust/optimize-create-shared-string-hashmap
c3876d5 to
b0f7867
Compare
|
@jtdavis777 I updated. please review again. Thanks |
LGTM. Code is clear and test works well. Thanks. @jtdavis777 @statxc |
|
@jtdavis777 Any update for me, please |
|
Awesome, I will look this over again and get it merged as I have time today |
Closes: #8687
Problem
FlatBufferBuilder::create_shared_stringmaintains a sortedVecof offsets for string deduplication. On a cache miss, it inserts into the Vec usingVec::insert, which shifts all subsequent elements - an O(N) operation. Since every unique string triggers this path, inserting N unique strings costs O(N²) total.With ~500K unique strings, this causes ~80% of serialization time to be spent in
create_shared_string.Fix
Replace the sorted
Vecwith aHashMap<String, WIPOffset<&'fbb str>>for thestdfeature (the default). Lookup and insertion are both O(1) amortized instead of O(log N) + O(N).The original
Vec-based implementation is preserved behind#[cfg(not(feature = "std"))]sinceallocdoes not provideHashMap.Changes
rust/flatbuffers/src/builder.rs- swapstrings_poolfromVectoHashMap(cfg-gated), updatenew_in()andcreate_shared_stringaccordinglytests/rust_usage_test/tests/integration_test.rs- addtest_shared_strings_pool_deduplicationcovering uniqueness, deduplication, reset behavior, and end-to-end buffer correctnessIs this a breaking change?
No.
strings_poolis a private field - no user code can reference it. The public signature ofcreate_shared_stringis identical. The only observable difference is faster performance. Bothstdandno_stdbuilds compile cleanly, and all 264 existing tests pass unchanged.Test added?
Yes.
test_shared_strings_pool_deduplicationverifies:reset()clears the pool and deduplication still works afterwardTest plan