Range.contains failed to be inlined/optimized #90609
Closed
Labels
A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
I-slow (Issue: Problems and improvements with respect to performance of generated code.)
It was suggested on Stack Overflow (https://stackoverflow.com/questions/69844819/rust-range-contains-failed-to-be-inlined-optimized) that I ask here.
I am aware that optimization in complex situations can fail to apply. However, rather straightforward inlining "in the small" should still apply.
I was running my code through Clippy and it suggested changing the following:
Into
Since it is more readable. Unfortunately, the resulting assembly output is twice as long, even at optimization level 3. Manually inlining it (two levels of nesting down) gives almost the same code as `version1` and is as efficient.

If I remove the `|| value == SPECIAL_VALUE`, they all compile to the same code (though with one more instruction added to decrement the parameter value before a compare). Also, if I change `SPECIAL_VALUE` to something not adjacent to the range, they all compile to the same assembly code as `version2`, which is why I kept it at `0` unless I eventually have to change it.

I have a link to Godbolt with the code here: https://rust.godbolt.org/z/d9PWYEKc8
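
For readers who cannot open the Godbolt link, here is a minimal sketch of the shape of code being compared. It is not the exact code from the link: the `u32` parameter type, the bounds `LOW` and `HIGH`, and the `folded` helper are assumptions made for illustration. Only the names `version1`, `version2` and `SPECIAL_VALUE`, and the value `0`, come from the description above.

```rust
// Hypothetical bounds: the real ones live in the Godbolt link and are not reproduced here.
pub const LOW: u32 = 1;
pub const HIGH: u32 = 10;
// 0 is adjacent to the start of the range, as described in the issue.
pub const SPECIAL_VALUE: u32 = 0;

/// Explicit comparisons, the form Clippy flags as less readable.
pub fn version1(value: u32) -> bool {
    (value >= LOW && value <= HIGH) || value == SPECIAL_VALUE
}

/// The form Clippy suggests, using a range and `contains`.
pub fn version2(value: u32) -> bool {
    (LOW..=HIGH).contains(&value) || value == SPECIAL_VALUE
}

/// Illustration only: with an unsigned value and SPECIAL_VALUE == LOW - 1,
/// the whole predicate is equivalent to a single upper-bound check.
pub fn folded(value: u32) -> bool {
    value <= HIGH
}
```

Under these assumptions all three functions return the same result for every input, so a single compare is the shortest output one could hope for from both `version1` and `version2`.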
Why is the compiler failing to properly inline/optimize `version2`? Is it an "optimization bug"? Or am I misunderstanding some semantics of Rust? Maybe something with the borrowing prevents the optimization, but can't the compiler assume no mutation of `value` due to the aliasing and referencing rules? Because the optimization is applied in `version1`, it would suggest that LLVM knows that, because the value is unsigned, it can simplify the comparison. So it may be that there is a missed optimization opportunity in the Rust frontend?

Trying to do something similar in C++ gives the optimal short assembly in GCC but not in Clang: https://godbolt.org/z/erYPYsvhf