scheduler: pauseJob() is silently no-op on in-flight jobs (executor caches status at pickup)

## Summary

`SchedulerService.pauseJob()` does nothing for an in-flight job. The
executor reads `job.status` at pickup and writes that cached value back
to the row on completion, stomping the `paused` flag that `pauseJob`
just wrote.

The pause sticks for jobs that are not currently running — those are
the only cases where the documented behavior holds.

## Fault site

`src/scheduler/executor.ts:59`

```ts
let newStatus = job.status; // captured from the job object passed at pickup
```

The happy-path (`runStatus === "ok"` + cron schedule) flows through to
the UPDATE at line 108–132 without touching `newStatus`, so the value
written back is whatever the row was at pickup time. If `pauseJob` ran
during the job's runtime work (`ctx.runtime.handleMessage` on line 43,
which is the long-running call), the SQL update on line 118 stomps the
`paused` row back to `active`.

## Reproduction

1. Create a cron job whose handler sleeps for ~30 s (any long-running
   task works).
2. Wait for the job to fire — confirm it is in `handleMessage`.
3. Call `pauseJob(id)`. The SQL update succeeds (`status = 'paused'`).
4. Let the job finish. The executor's UPDATE on line 108 runs and
   writes `status = 'active'` back. Pause is gone, job fires on next
   cron tick.

A unit test can reproduce this without runtime sleep by stubbing
`runtime.handleMessage` with a promise that the test resolves manually
after `pauseJob` has been called.

## Workaround

Set `enabled = 0` via direct SQL instead of calling `pauseJob` on a
running job. `enabled` is checked at fire time, not cached at pickup,
so the next scheduled fire is suppressed. This is what
`feedback_scheduler_pause_use_enabled_zero` documents as the current
operator workaround.

## Proposed fix

One of two shapes:

1. **Re-read row status before the UPDATE.** Right before the
   `ctx.db.run(UPDATE ...)` on line 108, re-query the current row's
   `status`. If it is `paused`, write `paused` back instead of the
   cached `newStatus`. One extra read per execution.

2. **Compose with WHERE on the UPDATE.** Add `AND status != 'paused'`
   to the WHERE on line 120, then run a follow-up UPDATE that writes
   only the run telemetry (last_run_at, last_run_status, run_count,
   consecutive_errors, last_delivery_status) regardless of pause
   state. Run telemetry should always reflect that the job ran; only
   the lifecycle status field needs to honor the operator's pause.

Shape 2 is the more correct one: the row's `status` field is the
operator's lifecycle intent (active / paused / failed / completed),
and the executor should not be in the business of overwriting that
intent for happy-path runs. The executor only owns lifecycle changes
when it has cause to (success + `at`-kind → `completed`, MAX errors
→ `failed`, etc.); for a happy-path cron run, the executor has no
lifecycle decision to make and should not touch `status` at all.

## Why this matters

Operators who hit "pause" on a job in the dashboard expect the job to
stop. If it's mid-flight, the pause silently fails. The
`enabled = 0` workaround works but is undocumented surface for an
operator who reasonably trusts the `pauseJob` API.



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

scheduler: pauseJob() is silently no-op on in-flight jobs (executor caches status at pickup) #148

Summary

Fault site

Reproduction

Workaround

Proposed fix

Why this matters

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

scheduler: pauseJob() is silently no-op on in-flight jobs (executor caches status at pickup) #148

Description

Summary

Fault site

Reproduction

Workaround

Proposed fix

Why this matters

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions