Improve performance symmetry of Set.intersect#19292
Improve performance symmetry of Set.intersect#19292aw0lid wants to merge 2 commits intodotnet:mainfrom
Conversation
❗ Release notes required
Warning No PR link found in some release notes, please consider adding it.
|
|
Benchmark will need to be a bdn, to see how it performs in jitted code, with proper preheat, etc. |
|
Please do the BDN benchmark in a style that does "before" vs "after" comparison, to make it apparent what has been improved and by how much. There should be some setup samples over at tests/benchmarks |
5492130 to
a585874
Compare
|
The benchmarks show that certain constellations ended up being slower, this should be addressed before merging. |
To address concerns regarding the reported 15% regression in Disjoint_Huge_Sets, I ran 6 full benchmark sets (3 for Main, 3 for PR) on the same machine to account for statistical variance and CPU throttling. Environment:
1. Disjoint Huge Sets (Regression Concern)The reported 15% regression is within measurement noise. The Main branch itself shows ~22% variance between runs due to CPU throttling.
Conclusion: PR is statistically equivalent to Main (~3% faster on average). Previous 15% observation was a measurement outlier, not a code regression. 2. Massive Win: Asymmetric IntersectionsThis PR eliminates the catastrophic performance asymmetry in
3. Memory & Identity Path
✅ Final VerdictThis PR effectively eliminates the O(N) bottleneck in asymmetric intersections while leaving disjoint set performance intact. The previous "regression" is purely environmental noise, not a code-level issue. |
|
I trust the BDN benchmark and its StdDev algorithm and ability to keep iterating until it gets stable more. Tiny_Intersect_Huge |
a585874 to
bcdfb38
Compare
bcdfb38 to
924a294
Compare
Benchmark Comparison: Main vs PR (focus on regression concerns)
Summary:
|
Improve Set.intersect Performance for Asymmetric Set Sizes (Fixes #19139)
Summary
This change removes argument-order sensitivity in
Set.intersectby selecting the traversal direction based on tree height.The previous implementation always traversed one tree and queried the other using
mem, causing pathological performance when intersecting sets with highly asymmetric sizes depending solely on argument ordering.The new implementation ensures that intersection performance depends on input sizes rather than parameter order, while preserving existing semantics, balancing behavior, and API surface.
Problem
Previously, the performance of
Set.intersectdepended heavily on argument order:This occurred because:
This violates an expected property of set operations: intersection performance should not depend on argument ordering. In highly asymmetric scenarios, this resulted in unnecessary traversal of large trees when a much smaller traversal space was available.
Design Goals
Solution
1. Height-Based Direction Selection
Instead of computing element counts (which would require$O(n)$ traversal), the algorithm compares tree heights:
The traversal direction is chosen so that the smaller tree is traversed whenever doing so preserves existing semantics. Tree height is used as a constant-time proxy for size. While height is not identical to element count, it is monotonic with tree growth in balanced trees and provides an efficient heuristic without additional traversal cost.
2. Direction-Aware Traversal Strategies
Two traversal strategies are used:
Existing Strategy (
intersectionAux)Traverses one tree using
memlookup and inserts elements from the traversed tree. Retained when traversal direction already matches existing behavior.New Optimized Strategy (
intersectionAuxFromSmall)Traverses the smaller tree. Queries the larger tree using
tryGetand inserts the element instance stored in the queried tree. This minimizes traversal work while preserving existing result construction behavior.3. Value Retrieval via
tryGetUnlike
mem, this returns the element instance stored in the queried tree, matching existing behavior of the original implementation. Although F# Set equality is comparer-based, returning the stored instance preserves consistency with the previous implementation, which always inserted elements originating from the queried tree.4. Intersection Selection
Traversal is always chosen to minimize work while preserving previous semantics.
Algorithmic Complexity
Let:
Reasoning:
add, preserving the same construction complexity as the original implementation.Why Height Instead of Size?
Computing element count would require full traversal ($O(n)$), defeating the purpose of optimization. Height provides:
Alternative Approaches Considered
A split-based intersection algorithm (used in some functional set implementations like OCaml's Set) was considered. While split-based approaches can provide strong asymptotic guarantees, they:
The chosen solution minimizes change surface while resolving the observed asymmetry.
Benchmark Methodology
Benchmarks implemented using BenchmarkDotNet.
Comparison setup:
FSharp.Corefrom NuGet.DisableImplicitFSharpCoreReference = true.tests/benchmarksBenchmark Code
Benchmark Results
Before (main branch)
After (this PR)
Performance Impact
Reviewer Checklist
tests/benchmarksConclusion
This change restores symmetry of performance characteristics in
Set.intersectby selecting traversal direction using tree height. It removes argument-order sensitivity while preserving existing semantics, implementation guarantees, and balancing behavior, introducing a measurable improvement for highly asymmetric workloads without affecting other scenarios.No changes were made to tree structure, balancing logic, or public APIs; only traversal direction and lookup strategy were adjusted.