lifecycle: Add exception handling for CPU affinity#113
lifecycle: Add exception handling for CPU affinity#113SangKyeong-Jeong wants to merge 3 commits intoeclipse-score:mainfrom
Conversation
License Check Results🚀 The license check job ran with the Bazel command: bazel run --lockfile_mode=error //:license-checkStatus: Click to expand output |
|
The created documentation from the pull request is available at: docu-html |
|
@SangKyeong-Jeong whats the status here can we review it |
FScholPer
left a comment
There was a problem hiding this comment.
Can you please check also the OS Abstraction Layer
|
@FScholPer We will double-check the OSAL implementation files (specifically under src/internal/osal/linux and qnx) to ensure the CPU core limit is handled consistently across different platforms. We'll update the PR if any adjustments are needed. |
|
This issue is unrelated to the OSAL. When the CPU core count is 32, the return value of the ConfigurationManager::kDefaultProcessorAffinityMask() function is 0, causing an error when setting the CPU affinity to 0. Shifting a 32-bit integer by 32 bits results in a shift-count-overflow error, which leads to undefined behavior. Consequently, the return value of ConfigurationManager::kDefaultProcessorAffinityMask() becomes 0, causing this issue. |
|
@FScholPer Please review this. Thanks. |
Shifting a 32-bit number by 32 bits causes undefined behavior, so modify it to perform the shift operation on a 64-bit number and then cast it.
|
We encounter the Formatting checks fail. It looks like the temporal download connection problem, so I want to re-run the batch, but maybe don't have a permission to do. |
| const uint32_t ConfigurationManager::kDefaultProcessExecutionError = 1U; | ||
| uint32_t ConfigurationManager::kDefaultProcessorAffinityMask() { | ||
| return (1U << osal::getNumCores()) - 1U; | ||
| return static_cast<uint32_t>((1ULL << osal::getNumCores()) - 1ULL); |
There was a problem hiding this comment.
When there are 64 cores, is there still the issue or ? ;) In general, this affinity in LM supports only up to 32 cores, since later on cpu_mask_ also does not support more. So either we do an assert is there is more than 32 cores, or we do wanirng logs that default affinity is set only to first 32 cores ;)
There was a problem hiding this comment.
Even if there are 64 cores, this issue does not reproduce. Inside the osal::getNumCores() function, there is exception handling, and the maximum value this function can return is 32. Therefore, a similar problem does not occur on a 64-core system.
If this issue is not fixed, CPU affinity cannot be configured in a 32-core system, which causes the lifecycle to not operate correctly. I have explained this in the PR.
There was a problem hiding this comment.
This PR addresses the Undefined Behavior on 32-core systems. As chungsky mentioned, getNumCores() already caps the count at 32, so >32 core systems are safe with this default.
|
Done rebasing |
When there are 32 CPU cores, an issue arises where the process cannot be executed, and when there are more than 32 cores, a problem occurs where fewer cores than expected are used. To resolve this issue, exception handling is added to set the maximum number of cores using the uint32_t type when there are more than 32 cores.