Problem
The touchpoints-production-sidekiq-worker app is crashing repeatedly (every ~16 minutes) and showing 0/1 instances running. This prevents ALL background jobs from processing, including:
- Export jobs (form responses, events, versions, digital service accounts)
- Email notifications
- Scheduled tasks
Impact
- No background jobs are being processed in production
- Users requesting data exports never receive their files
- Scheduled jobs via cf run-task may also be affected
- Cloud Foundry is sending continuous "sidekiq worker failed" emails
Root Cause
The sidekiq worker app is configured to run bin/rails server (Rails web server) instead of bundle exec sidekiq.
Evidence
$ cf app touchpoints-production-sidekiq-worker
# instances: 0/1 # Worker is DOWN
$ cf events touchpoints-production-sidekiq-worker | head -5
# Shows continuous crashes with "APP/PROC/WEB: Exited with status 1"
$ cf curl /v3/apps/$(cf app touchpoints-production-sidekiq-worker --guid)/droplets/current
# Shows process type: "web":"bin/rails server -b 0.0.0.0 -p $PORT -e $RAILS_ENV"
# Should be: "worker":"bundle exec sidekiq"
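The droplet check above can be scripted so it is repeatable after each deploy. This is a minimal sketch, not code from the repository: the function names process_types and classify_start_command are hypothetical, and it assumes the API response contains a process_types object as shown in the evidence. It uses only tr and grep so that no JSON tool is required.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repo).

# Reads droplet JSON on stdin and prints only the process_types object.
# Newlines are removed first so pretty-printed JSON is matched too.
process_types() {
  tr -d '\n' | grep -o '"process_types"[[:space:]]*:[[:space:]]*{[^}]*}'
}

# Reads droplet JSON on stdin and prints which kind of start command
# the process_types object contains: sidekiq, rails-server, or unknown.
classify_start_command() {
  local types
  types="$(process_types)"
  case "$types" in
    *sidekiq*)        echo "sidekiq" ;;
    *"rails server"*) echo "rails-server" ;;
    *)                echo "unknown" ;;
  esac
}
```

Usage would look like: cf curl "/v3/apps/$(cf app touchpoints-production-sidekiq-worker --guid)/droplets/current" | classify_start_command. A command set with cf push -c may be recorded on the app's process rather than in the droplet, so treat the result as the buildpack-detected default, not necessarily the command that is actually running.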
Why This Causes Crashes
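- Since there's no manifest or command override, the Ruby buildpack falls back to its default start command, rails server, under the web process type
- No route exists to send traffic to the sidekiq worker, so that web server can never serve a request
- Cloud Foundry's default health check for a web process is a port check, so the app is judged as a web server rather than as a worker
- Each time the process exits (status 1 in the events above), CF marks the instance as crashed and restarts it
- Sidekiq itself is never started, so queued jobs are never picked up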
Proposed Solution
Fix 1: Update Deploy Script (Immediate Fix - Recommended)
Modify .circleci/deploy-sidekiq.sh to pass the correct command:
# Lines 131-133: add the -c flag with the sidekiq command
if cf push "$app_name" \
    -t 180 \
    -c "bundle exec sidekiq -C config/sidekiq.yml" \
    --health-check-type process; then
This is the fastest fix with the smallest change surface.
Fix 2: Create Separate Sidekiq Manifests (Alternative)
Create manifest files for each environment:
touchpoints-production-sidekiq.yml
touchpoints-staging-sidekiq.yml
touchpoints-demo-sidekiq.yml
Each with:
applications:
- name: touchpoints-production-sidekiq-worker
  command: bundle exec sidekiq -C config/sidekiq.yml
  memory: 4G
  # ... other configs
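If this option is chosen, the three manifests could be generated from one template so they do not drift apart. The sketch below is illustrative only: write_sidekiq_manifest is a hypothetical helper, and the no-route and health-check-type attributes are assumptions based on the worker having no route and on the process health check used in Fix 1. The remaining "other configs" are left out because they are not listed here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): writes
# touchpoints-<environment>-sidekiq.yml into the directory given as $2
# (default: current directory).
write_sidekiq_manifest() {
  local environment="$1" dir="${2:-.}"
  cat > "$dir/touchpoints-$environment-sidekiq.yml" <<EOF
applications:
- name: touchpoints-$environment-sidekiq-worker
  command: bundle exec sidekiq -C config/sidekiq.yml
  memory: 4G
  # Assumptions, not taken from an existing manifest:
  no-route: true
  health-check-type: process
EOF
}
```

Running it for each of production, staging, and demo produces the three files named above; each can then be passed to cf push with -f.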
Fix 3: Create Procfile (Best Practice - Most Change)
Create Procfile with multiple process types:
web: bundle exec rails s -b 0.0.0.0 -p $PORT -e $RAILS_ENV
worker: bundle exec sidekiq -C config/sidekiq.yml
Then update manifests to use different process types.
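Before relying on a Procfile, it may be worth confirming that it really declares the worker process type with the expected command. The following is a small sketch under that assumption; procfile_command is a hypothetical helper name, and it expects the simple "type: command" layout shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): prints the command that a
# Procfile declares for the given process type, or nothing if the type
# is not declared. $2 is the Procfile path (default: ./Procfile).
procfile_command() {
  local process_type="$1" file="${2:-Procfile}"
  sed -n "s/^${process_type}:[[:space:]]*//p" "$file"
}
```

For example, procfile_command worker should print bundle exec sidekiq -C config/sidekiq.yml; an empty result means the worker type is missing and the app would again fall back to the web command.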
Implementation Plan
Phase 1: Fix Production (URGENT)
- Update .circleci/deploy-sidekiq.sh to include the sidekiq command
- Deploy to production
- Verify the worker starts: cf app touchpoints-production-sidekiq-worker (should show 1/1)
- Check logs: cf logs touchpoints-production-sidekiq-worker --recent
- Verify job processing in the Sidekiq Web UI at /admin/sidekiq
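Because the crashes recur on a roughly 16-minute cycle, a single 1/1 reading right after the deploy may not be conclusive. One way to make the verification step less manual is a small polling helper, sketched below. Both function names are hypothetical, and worker_is_up assumes cf app prints an "instances:" line in the form shown in the evidence.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repo).

# Runs the given command up to $1 times, sleeping $2 seconds after each
# failed attempt. Returns 0 as soon as the command succeeds, 1 otherwise.
wait_until() {
  local attempts="$1" delay="$2" i=0
  shift 2
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Succeeds if cf app reports 1/1 instances for the named app.
worker_is_up() {
  cf app "$1" | grep -Eq '^instances:[[:space:]]+1/1'
}
```

For example, wait_until 12 10 worker_is_up touchpoints-production-sidekiq-worker polls for up to about two minutes. Repeating the check once more than 16 minutes have passed would give more confidence that the crash loop is gone.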
Phase 2: Fix Staging and Demo
- Same deployment script update applies to all environments
- Deploy to staging: touchpoints-staging-sidekiq-worker
- Deploy to demo: touchpoints-demo-sidekiq-worker
- Verify each environment
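Verifying each environment could be done in one pass with a loop like the one below. This is only a sketch: check_all_workers and the CF_CMD variable are hypothetical, and it assumes all three apps are visible from the currently targeted org and space. If each environment lives in its own space, the target would need to be switched between checks.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): checks the sidekiq worker
# in every environment and returns non-zero if any is not at 1/1.
# CF_CMD (an assumption) allows substituting another command for cf,
# for example a stub when testing without a Cloud Foundry login.
check_all_workers() {
  local cf_cmd="${CF_CMD:-cf}" environment app failed=0
  for environment in production staging demo; do
    app="touchpoints-$environment-sidekiq-worker"
    if "$cf_cmd" app "$app" 2>/dev/null | grep -Eq '^instances:[[:space:]]+1/1'; then
      echo "$app: OK"
    else
      echo "$app: NOT RUNNING"
      failed=1
    fi
  done
  return "$failed"
}
```

The non-zero return value makes it usable as a final step in a deploy job, so a worker that is down fails the pipeline instead of going unnoticed.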
Phase 3: Improvements (Follow-up)
- Add error handling and retry policies to export jobs
- Configure monitoring for job failures
- Add user-facing error notifications
- Consider increasing concurrency if needed
Verification Checklist
- cf app touchpoints-production-sidekiq-worker shows instances: 1/1 (not 0/1)
Current Status
- Production: BROKEN (0/1 instances, continuous crashes)
- Staging: Likely broken (same deploy script)
- Demo: Likely broken (same deploy script)
Related Files
- .circleci/deploy-sidekiq.sh - deployment script (needs the -c flag)
- config/sidekiq.yml - Sidekiq configuration (concurrency: 1, queues: default, mailers)
- app/jobs/ - all background jobs, currently not processing
References
- manifest.sample.yml (line 12) - contains a commented example of the correct sidekiq command
- config/initializers/vcap_services.rb - sets up the Redis connection from CF services
- config/initializers/sidekiq.rb - configures the Sidekiq Redis connection