Summary
SchedulerService.pauseJob() does nothing for an in-flight job. The
executor reads job.status at pickup and writes that cached value back
to the row on completion, stomping the paused flag that pauseJob
just wrote.
The pause sticks for jobs that are not currently running — those are
the only cases where the documented behavior holds.
Fault site
src/scheduler/executor.ts:59
let newStatus = job.status; // captured from the job object passed at pickup
The happy-path (runStatus === "ok" + cron schedule) flows through to
the UPDATE at line 108–132 without touching newStatus, so the value
written back is whatever the row was at pickup time. If pauseJob ran
during the job's runtime work (ctx.runtime.handleMessage on line 43,
which is the long-running call), the SQL update on line 118 stomps the
paused row back to active.
Reproduction
- Create a cron job whose handler sleeps for ~30 s (any long-running
task works).
- Wait for the job to fire — confirm it is in
handleMessage.
- Call
pauseJob(id). The SQL update succeeds (status = 'paused').
- Let the job finish. The executor's UPDATE on line 108 runs and
writes status = 'active' back. Pause is gone, job fires on next
cron tick.
A unit test can reproduce this without runtime sleep by stubbing
runtime.handleMessage with a promise that the test resolves manually
after pauseJob has been called.
Workaround
Set enabled = 0 via direct SQL instead of calling pauseJob on a
running job. enabled is checked at fire time, not cached at pickup,
so the next scheduled fire is suppressed. This is what
feedback_scheduler_pause_use_enabled_zero documents as the current
operator workaround.
Proposed fix
One of two shapes:
-
Re-read row status before the UPDATE. Right before the
ctx.db.run(UPDATE ...) on line 108, re-query the current row's
status. If it is paused, write paused back instead of the
cached newStatus. One extra read per execution.
-
Compose with WHERE on the UPDATE. Add AND status != 'paused'
to the WHERE on line 120, then run a follow-up UPDATE that writes
only the run telemetry (last_run_at, last_run_status, run_count,
consecutive_errors, last_delivery_status) regardless of pause
state. Run telemetry should always reflect that the job ran; only
the lifecycle status field needs to honor the operator's pause.
Shape 2 is the more correct one: the row's status field is the
operator's lifecycle intent (active / paused / failed / completed),
and the executor should not be in the business of overwriting that
intent for happy-path runs. The executor only owns lifecycle changes
when it has cause to (success + at-kind → completed, MAX errors
→ failed, etc.); for a happy-path cron run, the executor has no
lifecycle decision to make and should not touch status at all.
Why this matters
Operators who hit "pause" on a job in the dashboard expect the job to
stop. If it's mid-flight, the pause silently fails. The
enabled = 0 workaround works but is undocumented surface for an
operator who reasonably trusts the pauseJob API.
Summary
SchedulerService.pauseJob()does nothing for an in-flight job. Theexecutor reads
job.statusat pickup and writes that cached value backto the row on completion, stomping the
pausedflag thatpauseJobjust wrote.
The pause sticks for jobs that are not currently running — those are
the only cases where the documented behavior holds.
Fault site
src/scheduler/executor.ts:59The happy-path (
runStatus === "ok"+ cron schedule) flows through tothe UPDATE at line 108–132 without touching
newStatus, so the valuewritten back is whatever the row was at pickup time. If
pauseJobranduring the job's runtime work (
ctx.runtime.handleMessageon line 43,which is the long-running call), the SQL update on line 118 stomps the
pausedrow back toactive.Reproduction
task works).
handleMessage.pauseJob(id). The SQL update succeeds (status = 'paused').writes
status = 'active'back. Pause is gone, job fires on nextcron tick.
A unit test can reproduce this without runtime sleep by stubbing
runtime.handleMessagewith a promise that the test resolves manuallyafter
pauseJobhas been called.Workaround
Set
enabled = 0via direct SQL instead of callingpauseJobon arunning job.
enabledis checked at fire time, not cached at pickup,so the next scheduled fire is suppressed. This is what
feedback_scheduler_pause_use_enabled_zerodocuments as the currentoperator workaround.
Proposed fix
One of two shapes:
Re-read row status before the UPDATE. Right before the
ctx.db.run(UPDATE ...)on line 108, re-query the current row'sstatus. If it ispaused, writepausedback instead of thecached
newStatus. One extra read per execution.Compose with WHERE on the UPDATE. Add
AND status != 'paused'to the WHERE on line 120, then run a follow-up UPDATE that writes
only the run telemetry (last_run_at, last_run_status, run_count,
consecutive_errors, last_delivery_status) regardless of pause
state. Run telemetry should always reflect that the job ran; only
the lifecycle status field needs to honor the operator's pause.
Shape 2 is the more correct one: the row's
statusfield is theoperator's lifecycle intent (active / paused / failed / completed),
and the executor should not be in the business of overwriting that
intent for happy-path runs. The executor only owns lifecycle changes
when it has cause to (success +
at-kind →completed, MAX errors→
failed, etc.); for a happy-path cron run, the executor has nolifecycle decision to make and should not touch
statusat all.Why this matters
Operators who hit "pause" on a job in the dashboard expect the job to
stop. If it's mid-flight, the pause silently fails. The
enabled = 0workaround works but is undocumented surface for anoperator who reasonably trusts the
pauseJobAPI.