Cancel timer when Workflow.await condition is satisfied#2799

Merged
mjameswh merged 6 commits intotemporalio:masterfrom
mfateev:await-cancel-timer-clean
Mar 17, 2026

@mfateev
Member

@mfateev mfateev commented Feb 24, 2026

Summary

  • Workflow.await(duration, condition) now cancels the timer when the condition is satisfied before the timeout expires
  • Prevents unnecessary workflow tasks caused by timer fired events
  • If condition is already true when called, returns immediately without creating a timer
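
The three bullets above can be read as a small decision table. The following is a toy model only, not Temporal SDK code: `AwaitTimerModel`, `simulateAwait`, and the event labels are hypothetical names chosen for illustration, and the old-behavior branch assumes the uncancelled timer simply fires later, as the second bullet suggests.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one timed await; class and event labels are illustrative, not SDK code. */
final class AwaitTimerModel {

  /** Return value of the modeled await plus the timer events it leaves in history. */
  static final class Outcome {
    final boolean conditionMet;
    final List<String> events;

    Outcome(boolean conditionMet, List<String> events) {
      this.conditionMet = conditionMet;
      this.events = events;
    }
  }

  /**
   * @param cancelOnCondition whether the new cancel-on-condition behavior is active
   * @param trueAtCall whether the condition is already true when await is called
   * @param trueBeforeTimeout whether the condition becomes true before the timeout
   */
  static Outcome simulateAwait(
      boolean cancelOnCondition, boolean trueAtCall, boolean trueBeforeTimeout) {
    List<String> events = new ArrayList<>();
    boolean conditionMet = trueAtCall || trueBeforeTimeout;

    // New behavior: condition already true, so no timer is created at all.
    if (cancelOnCondition && trueAtCall) {
      return new Outcome(true, events);
    }

    events.add("TIMER_STARTED");
    if (conditionMet && cancelOnCondition) {
      // New behavior: the timer is cancelled as soon as the condition resolves.
      events.add("TIMER_CANCELED");
    } else {
      // Timeout, or old behavior where the timer keeps running and fires later (assumed).
      events.add("TIMER_FIRED");
    }
    return new Outcome(conditionMet, events);
  }

  public static void main(String[] args) {
    System.out.println(simulateAwait(true, false, true).events);
    System.out.println(simulateAwait(false, false, true).events);
  }
}
```

In this model the old behavior always ends with a fired timer, which is the extra workflow task the PR aims to avoid.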

Background

Fixes #2312. Port of the same fix from the Go SDK: temporalio/sdk-go#2153

Implementation

  • Added CANCEL_AWAIT_TIMER_ON_CONDITION(4) SDK flag for replay compatibility
  • When the flag is enabled, the timer is created inside a CancellationScope so it can be cancelled when the condition resolves
  • Uses checkSdkFlag (not tryUseSdkFlag) so the flag is not auto-enabled for new workflows in this release — it must wait at least 1 release before being enabled by default, per SDK flag rollout policy
  • TODO comment marks where to switch to tryUseSdkFlag in the next release
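
To make the checkSdkFlag versus tryUseSdkFlag distinction concrete, here is a minimal sketch of the semantics described above. `FlagRegistrySketch` is a hypothetical stand-in, not the SDK's actual class; in particular, the replay branch of `tryUseSdkFlag` is an assumption about how such a registry would typically behave.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical stand-in for an SDK flag registry; not the Temporal SDK's real class. */
final class FlagRegistrySketch {
  enum Flag { CANCEL_AWAIT_TIMER_ON_CONDITION }

  private final Set<Flag> recorded = EnumSet.noneOf(Flag.class);
  private final boolean replaying;

  FlagRegistrySketch(boolean replaying, Set<Flag> flagsFromHistory) {
    this.replaying = replaying;
    this.recorded.addAll(flagsFromHistory);
  }

  /** Read-only: true only if the flag is already recorded; never enables it. */
  boolean checkSdkFlag(Flag flag) {
    return recorded.contains(flag);
  }

  /** Opt-in: on a first execution, records the flag; on replay, only reads it (assumed). */
  boolean tryUseSdkFlag(Flag flag) {
    if (!replaying) {
      recorded.add(flag);
      return true;
    }
    return recorded.contains(flag);
  }

  public static void main(String[] args) {
    FlagRegistrySketch fresh =
        new FlagRegistrySketch(false, EnumSet.noneOf(Flag.class));
    // With checkSdkFlag, a new workflow stays on the old code path.
    System.out.println(fresh.checkSdkFlag(Flag.CANCEL_AWAIT_TIMER_ON_CONDITION));
    // Switching to tryUseSdkFlag would opt new workflows in.
    System.out.println(fresh.tryUseSdkFlag(Flag.CANCEL_AWAIT_TIMER_ON_CONDITION));
  }
}
```

Under these assumptions, shipping one release with the read-only check means every worker already understands the flag before any new workflow starts recording it.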

Test plan

  • testTimerCancelledWhenFlagEnabled — explicitly enables the flag, verifies TIMER_CANCELED in history
  • testTimerNotCancelledWhenFlagDisabled — default (flag off), verifies timer is NOT cancelled (old behavior)
  • testNoTimerWhenConditionImmediatelySatisfiedWithFlag — verifies no timer created when condition is already true
  • testAwaitReturnValue — verifies return value semantics (true=condition met, false=timeout)
  • testReplayOldHistoryWithoutFlag — replay compatibility with old workflow histories recorded without the flag

…tisfied

This change addresses GitHub issue temporalio#2312 by ensuring that
Workflow.await(duration, condition) cancels the timer when the condition
is satisfied before the timeout expires.

Changes:
- Add CANCEL_AWAIT_TIMER_ON_CONDITION SDK flag for backward compatibility
- Modify SyncWorkflowContext.await() to use a CancellationScope to cancel
  the timer when condition is satisfied before timeout
- Skip timer creation entirely if condition is already satisfied
- Add comprehensive tests including replay compatibility test

The new behavior is enabled by default for new workflows via the SDK flag
mechanism. Old workflows replay correctly with the original behavior.

Follow the Go SDK pattern (PR temporalio#2153) per reviewer feedback:

1. Use checkSdkFlag instead of tryUseSdkFlag so the flag is NOT
   auto-enabled for new workflows. Add TODO to switch to tryUseSdkFlag
   in the next release.
2. Remove CANCEL_AWAIT_TIMER_ON_CONDITION from initialFlags.
3. Tests explicitly toggle the flag to verify both old behavior
   (timer NOT cancelled) and new behavior (timer cancelled).
@mfateev mfateev requested a review from a team as a code owner February 24, 2026 16:42
replayContext.checkSdkFlag(SdkFlag.CANCEL_AWAIT_TIMER_ON_CONDITION);

// If new behavior is enabled and condition is already satisfied, skip creating timer
if (cancelTimerOnCondition && unblockCondition.get()) {
Contributor

nit: I'd move this block into the first branch of the if (cancelTimerOnCondition) { below

Member Author

Done


boolean conditionSatisfied = !timer.isCompleted();
if (conditionSatisfied) {
timerScope.cancel("await condition resolved");
Contributor

I'm not sure about providing a reason message here. AFAICS, we're not currently providing reason messages in similar context anywhere in the Java SDK, and we don't do that either in the Go SDK, which was used as reference for this PR.

But no strong opinion either way.

}

@WorkflowInterface
public interface TestAwaitWorkflow {
Contributor

nit: I believe we generally try to deduplicate workflow interfaces in our tests.

Contributor

@mjameswh mjameswh left a comment

Two very minor comments, but overall looks correct to me.

@mjameswh mjameswh merged commit 9516965 into temporalio:master Mar 17, 2026
15 checks passed
@gianny82

Hi,

I see in the description:

> If condition is already true when called, returns immediately without creating a timer

Does it mean we break determinism if the condition is initially false (timer started event present in history) and later on we change the workflow definition in a way that makes the condition already true when Workflow.await is called?

What would be the way to make such a change without breaking determinism? I don't think we can call Workflow.getVersion from the unblock condition.

@mjameswh
Contributor

> Does it mean we break determinism if the condition is initially false (timer started event present in history) and later on we change the workflow definition in a way that makes the condition already true when Workflow.await is called?

Yes, that's correct, and that has always been the case.

The change you describe necessarily implies a change in the relative order of events in the workflow history, and would therefore cause an NDE, independently of the present PR.

Do you have a specific case in mind?
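
As a rough illustration of why a reordered history leads to an NDE, the sketch below compares the commands produced on replay against what was recorded. It is a deliberately simplified model, not how the SDK implements replay; `ReplayCheckSketch`, `firstMismatch`, and the command labels are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified replay check; names and command labels are illustrative only. */
final class ReplayCheckSketch {

  /**
   * Returns the index of the first recorded command that the replayed code does not
   * reproduce at the same position, or -1 if the recorded history is a prefix of it.
   */
  static int firstMismatch(List<String> recorded, List<String> produced) {
    for (int i = 0; i < recorded.size(); i++) {
      if (i >= produced.size() || !recorded.get(i).equals(produced.get(i))) {
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // History recorded while the condition was initially false: a timer was started.
    List<String> recorded = Arrays.asList("START_TIMER", "SCHEDULE_ACTIVITY");
    // Changed code whose condition is already true and therefore skips the timer.
    List<String> produced = Arrays.asList("SCHEDULE_ACTIVITY");
    int mismatch = firstMismatch(recorded, produced);
    System.out.println(mismatch >= 0 ? "non-deterministic at " + mismatch : "ok");
  }
}
```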

@gianny82

gianny82 commented Apr 2, 2026

> Yes, that's correct, and that has always been the case.
>
> The change you describe necessarily implies a change in the relative order of events in the workflow history, and would therefore cause an NDE, independently of the present PR.
>
> Do you have a specific case in mind?

I don't think that has always been the case, I am using Java Temporal SDK 1.32.1 and I see the timer is created by adding following:

Workflow.await(Duration.ofSeconds(5), () -> true)

Doesn't that mean that upgrading to the new version might cause NDE in existing workflows even without applying any change to the workflow definition? Am I missing something?

In my case, I created a helper to do similar things (cancelling the timer when needed and also adding a summary to it) and noticed the difference from Workflow.await behavior, so I wondered if that was OK, googled it, and found this issue :)
Anyway I guess new behavior is similar to changing the duration from zero to non-zero and vice versa, not sure if it would make sense to document it as it might not be obvious.

Another question, would it make sense to have a version of Workflow.await that also allows to pass timer options in the way we can set the summary?

@mjameswh
Contributor

mjameswh commented Apr 2, 2026

> I don't think that has always been the case, I am using Java Temporal SDK 1.32.1 and I see the timer is created by adding following:

No, the SDK itself has its own mechanism, "SDK flags", to determine whether it needs to replay using the legacy logic or the fixed logic. That's a bit similar to GetVersion, but internal to the SDK itself.
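
A minimal sketch of that idea, applied to the `() -> true` case raised above: the commands an await emits depend on which flags were recorded in the history being replayed. `FlagGatedReplaySketch` and its method are hypothetical, and the flag is modeled as a plain string rather than the SDK's own type.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Illustrative only: picks legacy or fixed logic from flags found in history. */
final class FlagGatedReplaySketch {
  static final String FLAG = "CANCEL_AWAIT_TIMER_ON_CONDITION";

  /** Commands a timed await emits when its condition is already true at call time. */
  static List<String> commandsForAlreadyTrueCondition(Set<String> flagsInHistory) {
    List<String> commands = new ArrayList<>();
    if (flagsInHistory.contains(FLAG)) {
      // Fixed logic: history was recorded with the flag, so no timer is expected.
      return commands;
    }
    // Legacy logic: history predates the flag, so the timer is still started.
    commands.add("START_TIMER");
    return commands;
  }

  public static void main(String[] args) {
    Set<String> none = Collections.emptySet();
    System.out.println(commandsForAlreadyTrueCondition(none));
    System.out.println(commandsForAlreadyTrueCondition(Collections.singleton(FLAG)));
  }
}
```

In this model a workflow started on an older SDK has no flag in its history, so replay on the newer SDK still produces the timer command and stays consistent with what was recorded.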

@mjameswh
Contributor

mjameswh commented Apr 2, 2026

> would it make sense to have a version of Workflow.await that also allows to pass timer options in the way we can set the summary

It would certainly make sense. See #2751.

Development

Successfully merging this pull request may close these issues.

Workflow.await(duration, condition) does not automatically cancel the timer if the condition is resolved
