fix(squid): increase healthcheck tolerance to prevent intermittent startup failures #3936
lpcox wants to merge 2 commits into
…artup failures

The Squid container can fail its healthcheck on loaded runners when initialization (chown, base64 decode, IPv6 check, Squid startup) takes longer than the 7-second window.

Increase start_period from 2s to 5s, retries from 5 to 10, and interval/timeout from 1s to 2s, giving Squid ~25s total to become healthy without meaningfully slowing the happy path.

Fixes #3934

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
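The 7-second and ~25-second figures in the commit message appear to follow from `start_period + retries × interval`. A minimal sketch of that arithmetic is below; `HealthcheckTiming`, `parseSeconds`, and `toleranceWindowSeconds` are hypothetical names introduced for illustration and are not part of the PR, and the formula is an approximation that ignores probe duration and `timeout`.

```typescript
// Hypothetical helper types and functions for illustration only (not from the PR).
interface HealthcheckTiming {
  interval: string;
  timeout: string;
  retries: number;
  start_period: string;
}

// Parses whole-second durations such as '5s'. Other Compose duration
// formats (e.g. '1m30s') are deliberately not handled in this sketch.
function parseSeconds(value: string): number {
  const match = /^(\d+)s$/.exec(value);
  if (!match) {
    throw new Error(`unsupported duration: ${value}`);
  }
  return Number(match[1]);
}

// Approximate worst-case window before the container is marked unhealthy:
// the start period plus `retries` failed probes spaced `interval` apart.
function toleranceWindowSeconds(hc: HealthcheckTiming): number {
  return parseSeconds(hc.start_period) + hc.retries * parseSeconds(hc.interval);
}

// Values taken from the PR description (before and after the change).
const before: HealthcheckTiming = { interval: '1s', timeout: '1s', retries: 5, start_period: '2s' };
const after: HealthcheckTiming = { interval: '2s', timeout: '2s', retries: 10, start_period: '5s' };

console.log(toleranceWindowSeconds(before)); // 7
console.log(toleranceWindowSeconds(after)); // 25
```

Under this approximation the change roughly triples the tolerance window, while a container that is ready early is still marked healthy on its first successful probe.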
✅ Coverage Check Passed
Pull request overview
This PR adjusts the Docker Compose Squid service healthcheck timing to reduce intermittent “container is unhealthy” startup failures on slower or heavily loaded runners, improving reliability for dependent services that wait on service_healthy.
Changes:
- Increased Squid healthcheck `start_period` from 2s → 5s.
- Relaxed probe parameters: `interval` 1s → 2s, `timeout` 1s → 2s, `retries` 5 → 10.
| File | Description |
|---|---|
| src/services/squid-service.ts | Relaxes Squid container healthcheck parameters to provide a larger startup tolerance window. |
Copilot's findings
- Files reviewed: 1/1 changed files
- Comments generated: 1
  healthcheck: {
    test: ['CMD', 'nc', '-z', 'localhost', '3128'],
-   interval: '1s',
-   timeout: '1s',
-   retries: 5,
-   start_period: '2s',
+   interval: '2s',
+   timeout: '2s',
+   retries: 10,
+   start_period: '5s',
Smoke Test: Claude Engine ✅
Result: PASS
Smoke Test: Gemini Engine Validation
Overall status: FAIL

Warning: Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.
@copilot address the review feedback
Implemented in |
🔥 Smoke Test: Copilot BYOK (Offline) Mode
Running in BYOK offline mode ( Author: @lpcox | Overall: PARTIAL (MCP + BYOK ✅; pre-step data not injected
🔥 Smoke Test Results
PR: fix(squid): increase healthcheck tolerance to prevent intermittent startup failures
Overall: PASS ✅
Smoke Test Results — FAIL
fix(squid): increase healthcheck tolerance to prevent intermittent startup failures ✅

Warning: Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
Chroot Runtime Version Comparison
Overall: FAILED — Python and Node.js versions differ between host and chroot.
Summary
Increases the Squid container healthcheck tolerance to prevent intermittent "container is unhealthy" failures on loaded runners.
Problem
The Squid healthcheck was configured with tight timing:
- `start_period: 2s`
- `retries: 5`
- `interval: 1s`
- `timeout: 1s`

On loaded ubuntu-24.04 runners, Squid initialization (chown preflight → base64 config decode → IPv6 check → Squid startup) can exceed this window, causing dependent containers to fail with `dependency failed to start: container awf-squid is unhealthy`.
Fix
Relaxed healthcheck parameters:
- `start_period: 5s` (was 2s)
- `retries: 10` (was 5)
- `interval: 2s` (was 1s)
- `timeout: 2s` (was 1s)

Total window: ~25s. Happy-path startup still detected within 5-7s (first successful probe during start_period or first retry).
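To make the effect of the old and new timings concrete, here is a rough simulation of a probe loop. It is a simplified model under stated assumptions, not Docker's actual scheduler: probes fire at multiples of `interval`, a probe succeeds as soon as the service is ready, failures at or before the end of `start_period` are not counted, and probe duration and `timeout` are ignored. The names `Timing`, `Outcome`, and `simulate` are hypothetical.

```typescript
// Simplified model for illustration; names and semantics are assumptions.
interface Timing {
  intervalSec: number;
  retries: number;
  startPeriodSec: number;
}

interface Outcome {
  status: 'healthy' | 'unhealthy';
  atSec: number;
}

// Simulates probes at t = interval, 2*interval, ... for a service that
// starts accepting connections at `readyAtSec` seconds after container start.
function simulate(timing: Timing, readyAtSec: number): Outcome {
  let failures = 0;
  for (let t = timing.intervalSec; ; t += timing.intervalSec) {
    if (t >= readyAtSec) {
      // First successful probe marks the container healthy.
      return { status: 'healthy', atSec: t };
    }
    if (t > timing.startPeriodSec) {
      // Failures only count once the start period has elapsed (assumed boundary).
      failures += 1;
      if (failures >= timing.retries) {
        return { status: 'unhealthy', atSec: t };
      }
    }
  }
}

// Timings from the PR description.
const oldTiming: Timing = { intervalSec: 1, retries: 5, startPeriodSec: 2 };
const newTiming: Timing = { intervalSec: 2, retries: 10, startPeriodSec: 5 };

// A slow start (ready at 8s) fails under the old timing but passes under the new one.
console.log(simulate(oldTiming, 8)); // { status: 'unhealthy', atSec: 7 }
console.log(simulate(newTiming, 8)); // { status: 'healthy', atSec: 8 }
```

In this model the new timing only gives up after the tenth counted failure at 24 seconds, which lines up with the ~25s window stated above.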
Fixes #3934