Fix SMP scheduler evicting task when preemption is disabled#1379
Fix SMP scheduler evicting task when preemption is disabled#1379Xyerophyte wants to merge 1 commit intoFreeRTOS:mainfrom
Conversation
In the SMP time-slicing path of xTaskIncrementTick(), xYieldPendings[] was set unconditionally without checking xPreemptionDisable. Although the yield-processing block correctly skipped acting on it for cores with preemption disabled, the stale flag remained and was later misinterpreted by FromISR functions (xTaskGenericNotifyFromISR, xTaskResumeFromISR, etc.) that use xYieldPendings as a proxy for whether prvYieldForTask() decided to yield. This caused *pxHigherPriorityTaskWoken to be erroneously set to pdTRUE, triggering portYIELD_FROM_ISR() and ultimately evicting the preemption-disabled task via prvSelectHighestPriorityTask(). Three-layer fix: 1. Root cause: Guard the time-slicing xYieldPendings[] assignment in xTaskIncrementTick() with a configUSE_TASK_PREEMPTION_DISABLE check so the flag is never set for cores running preemption-disabled tasks. 2. Defense in depth: In the four functions that check xYieldPendings[] after prvYieldForTask() (xTaskResumeFromISR, xTaskRemoveFromEventList, xTaskGenericNotifyFromISR, vTaskGenericNotifyGiveFromISR), save the value before the call and only report a yield if it transitioned from pdFALSE to pdTRUE, eliminating false positives from stale flags. 3. Safety net: Add a preemption-disable check in the SMP vTaskSwitchContext(), analogous to the existing uxSchedulerSuspended check. If a context switch is somehow triggered for a core whose current task has preemption disabled, the switch is deferred by setting xYieldPendings and will execute when vTaskPreemptionEnable() is called. Closes FreeRTOS#1376 Made-with: Cursor
|
There was a problem hiding this comment.
Pull request overview
Fixes an SMP FreeRTOS-Kernel issue where a stale xYieldPendings[] flag could cause an ISR-driven context switch to evict a task that has preemption disabled.
Changes:
- Guard time-slicing yield-pending flag updates in
xTaskIncrementTick()when the current task has preemption disabled. - Add an SMP safety check in
vTaskSwitchContext()to defer context switching while the current task has preemption disabled. - Update several ISR-related paths to infer “did
prvYieldForTask()request a yield?” using a before/afterxYieldPendings[]comparison.
Comments suppressed due to low confidence (2)
tasks.c:8224
- The transition-only check can leave pxHigherPriorityTaskWoken unset when xYieldPendings[core] was already pdTRUE on entry (e.g., a yield already pended earlier in the same ISR/critical section). In that case the ISR may skip portYIELD_FROM_ISR() even though a switch is pending, delaying preemption. Consider basing pxHigherPriorityTaskWoken on the post-call xYieldPendings state (as before), and avoid the original stale-flag issue by gating on xPreemptionDisable for the current core (when enabled) and/or ensuring the stale flag cannot be set (which is already addressed in xTaskIncrementTick).
BaseType_t xYieldPendingBefore = xYieldPendings[ portGET_CORE_ID() ];
prvYieldForTask( pxTCB );
if( ( xYieldPendingBefore == pdFALSE ) &&
( xYieldPendings[ portGET_CORE_ID() ] == pdTRUE ) )
{
if( pxHigherPriorityTaskWoken != NULL )
{
*pxHigherPriorityTaskWoken = pdTRUE;
}
tasks.c:8361
- Same issue as xTaskGenericNotifyFromISR(): only reporting a yield when xYieldPendings transitions pdFALSE->pdTRUE can prevent pxHigherPriorityTaskWoken being set when a yield was already pending for this core before prvYieldForTask(). That can delay an otherwise-required context switch at ISR exit. Consider using the post-call xYieldPendings value directly (as before) and adding a targeted preemption-disabled guard if needed.
BaseType_t xYieldPendingBefore = xYieldPendings[ portGET_CORE_ID() ];
prvYieldForTask( pxTCB );
if( ( xYieldPendingBefore == pdFALSE ) &&
( xYieldPendings[ portGET_CORE_ID() ] == pdTRUE ) )
{
if( pxHigherPriorityTaskWoken != NULL )
{
*pxHigherPriorityTaskWoken = pdTRUE;
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| BaseType_t xYieldPendingBefore = xYieldPendings[ portGET_CORE_ID() ]; | ||
|
|
||
| prvYieldForTask( pxTCB ); | ||
|
|
||
| if( xYieldPendings[ portGET_CORE_ID() ] != pdFALSE ) | ||
| if( ( xYieldPendingBefore == pdFALSE ) && | ||
| ( xYieldPendings[ portGET_CORE_ID() ] != pdFALSE ) ) | ||
| { | ||
| xYieldRequired = pdTRUE; | ||
| } |
| BaseType_t xYieldPendingBefore = xYieldPendings[ portGET_CORE_ID() ]; | ||
|
|
||
| prvYieldForTask( pxUnblockedTCB ); | ||
|
|
||
| if( xYieldPendings[ portGET_CORE_ID() ] != pdFALSE ) | ||
| if( ( xYieldPendingBefore == pdFALSE ) && | ||
| ( xYieldPendings[ portGET_CORE_ID() ] != pdFALSE ) ) | ||
| { | ||
| xReturn = pdTRUE; | ||
| } |
|
Thankyou for responding to my bug report in #1376. I was beginning to wonder if I should file a PR rather than an issue, since obviously a lot of issues are feature requests, whereas my issue was more of a rollback request. I did predict to my colleague at the time of filing the issue, that there would be a reluctance to roll back the incorrect code, and that the proposed fix would probably involve many other risky changes. This turned out to be correct. The refactoring proposed here is probably a good idea, but unrelated to the bug. Could I make the suggestion that prior to the refactorings proposed here, the incorrect code first be rolled back? I say this because, based on my experience working with legacy codebases that evolved significant complexity over time, I think it's better to do refactoring proactively (as part of a plan to improve the codebase) rather than reactively (as part of an effort to fix reported issue). Once the codebase is back in a working state, the refactoring can be applied on top, and if issues are encountered with the refactoring, it too can roll back. |



Summary
Fixes #1376 — SMP scheduler evicting a task that has preemption disabled via
vTaskPreemptionDisable().In the SMP time-slicing path of
xTaskIncrementTick(),xYieldPendings[xCoreID]was set topdTRUEunconditionally without checkingxPreemptionDisable. The later yield-processing block correctly skipped acting on it for preemption-disabled cores, but the stale flag remained. When a subsequent ISR (e.g.xTaskGenericNotifyFromISR()) calledprvYieldForTask()— which correctly did not yield — it then checkedxYieldPendings[portGET_CORE_ID()]and found the stalepdTRUE, erroneously setting*pxHigherPriorityTaskWoken = pdTRUE. This triggeredportYIELD_FROM_ISR()→vTaskSwitchContext()→prvSelectHighestPriorityTask(), which evicted the preemption-disabled task.Changes (three-layer fix)
Root cause (
xTaskIncrementTick): Guard the time-slicingxYieldPendings[]assignment with aconfigUSE_TASK_PREEMPTION_DISABLEcheck so the flag is never set for cores running preemption-disabled tasks.Defense in depth (4 functions): In
xTaskResumeFromISR,xTaskRemoveFromEventList,xTaskGenericNotifyFromISR, andvTaskGenericNotifyGiveFromISR— savexYieldPendings[portGET_CORE_ID()]before callingprvYieldForTask()and only report a yield if the flag transitioned frompdFALSEtopdTRUE. This eliminates false positives from any stale flags.Safety net (
vTaskSwitchContextSMP path): Add axPreemptionDisablecheck analogous to the existinguxSchedulerSuspendedcheck. If a context switch is somehow triggered for a core whose task has preemption disabled, the switch is deferred (xYieldPendings = pdTRUE) and will execute whenvTaskPreemptionEnable()callsprvYieldCore().Configuration
Only affects SMP builds with:
Single-core builds are completely unaffected (all changes are inside
#if ( configNUMBER_OF_CORES > 1 )or equivalent guards).Test plan
vTaskPreemptionEnable) still triggers deferred yields correctly