
ENT-14108: drain cf-agent in hub preremove.sh before stopping cfengine3 umbrella #2262

Draft
larsewi wants to merge 1 commit into cfengine:master from larsewi:drain-cf-agent-prerm

larsewi commented May 20, 2026

Race: prerm stops cf-execd and cf-postgres, but a still-running cf-agent then starts cf-php-fpm, which pulls cf-postgres back in as a dependency (Wants=cf-postgres.service). The start fails because the data dir is being torn down, and Restart=always keeps cf-postgres looping into the next install's postinst. There the loop eventually clobbers the postgres that postinst launched, so postinst's final pg_ctl stop fails and dpkg aborts.

This PR drains in-flight cf-agent in prerm before the umbrella stop, so policy can't re-trigger services during teardown. See the commit message for the full mechanism.

A cf-agent process spawned by cf-execd can keep running after
systemctl stop cf-execd.service, finish a policy run, and then call
systemctl start cf-php-fpm.service. cf-php-fpm has Wants=cf-postgres.service,
so systemd pulls cf-postgres back in as a dependency. cf-postgres fails to
start (data dir being torn down) and Restart=always, RestartSec=10 keeps it
looping. The loop continues into the next install's postinst, collides with
the postinst-launched postgres, and the postinst's final pg_ctl stop fails
with "PID file does not exist" — dpkg sees exit 1 and aborts:

  dpkg: error processing package cfengine-nova-hub (--install):
   installed cfengine-nova-hub package post-installation script subprocess
   returned error exit status 1

Fix: in prerm, stop cf-execd first so no new cf-agent runs spawn, then wait
up to 60s for any in-progress cf-agent to drain (SIGKILL any survivor), and
only then run the cfengine3 umbrella stop. cf-php-fpm stays up throughout
the drain, so the policy run passes without re-triggering anything.

Ticket: ENT-14108
Signed-off-by: Lars Erik Wik <lars.erik.wik@northern.tech>
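
For readers following along, the drain step described in the commit message could look roughly like the sketch below. This is not the code from this PR. The function name `drain_process`, the one-second poll interval, and the use of `pgrep`/`pkill` (from procps) are assumptions made for illustration; only the 60s limit and the SIGKILL fallback come from the commit message.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the PR diff.
# Waits until no process with the exact name $1 is running.
# After $2 seconds (default 60) it sends SIGKILL to whatever is left.
# Returns 0 if the processes exited on their own, 1 if they had to be killed.
drain_process() {
    name=$1
    timeout=${2:-60}
    waited=0
    # pgrep -x matches the process name exactly (assumes procps is installed).
    while pgrep -x "$name" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            # Deadline reached: force-kill the remaining processes.
            pkill -KILL -x "$name"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Intended ordering in prerm, per the commit message (left as comments so the
# sketch has no side effects; the umbrella unit name is an assumption):
#   systemctl stop cf-execd.service      # no new cf-agent runs get spawned
#   drain_process cf-agent 60 || echo "cf-agent did not exit in time, killed"
#   systemctl stop cfengine3.service     # umbrella stop, only after the drain
```

Polling by process name keeps the sketch short, but it would also wait on a cf-agent started by hand, not only on runs spawned by cf-execd.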
@larsewi larsewi marked this pull request as draft May 20, 2026 12:09
larsewi commented May 20, 2026

@cf-bottom Jenkins please :)

