Skip to content

Let SidekiqReporter publish zero for required-but-empty queues#33

Merged
bjonord merged 1 commit into
mainfrom
allow-publishing-zero-for-required-queues
Jun 22, 2026
Merged

Let SidekiqReporter publish zero for required-but-empty queues#33
bjonord merged 1 commit into
mainfrom
allow-publishing-zero-for-required-queues

Conversation

@bjonord

@bjonord bjonord commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

SidekiqReporter iterates Sidekiq::Queue.all, so a queue that has never been enqueued on a given shard is silently absent from CloudWatch. When that queue has a per-queue target-tracking autoscaling policy, its scale-in alarm sits permanently in INSUFFICIENT_DATA and the whole ECS service stops scaling in.

What does this PR do?

  • Adds an optional required_queues: kwarg to SidekiqReporter.enable. Defaults to [], so existing consumers behave exactly as before.
  • Collector now iterates (required_queues + Sidekiq::Queue.all.keys).uniq and emits 0 for any required_queues entry not present in Redis. paused? behavior is preserved for live queues.
  • Bumps to 0.3.0 (additive public API change).

Note

Consumer-side change for the Teamtailor monorepo: bump the gem and pass required_queues: QUEUES_WITH_AUTOSCALING_POLICY from config/initializers/03_sidekiq.rb. Follow-up PR.

📖 Description for AI review

Context

ECS Fargate Sidekiq services on Teamtailor autoscale via per-queue CloudWatch target-tracking policies on the custom metric sidekiq_queue_latency (namespace Teamtailor/queue_times). Each queue has its own latency target (critical 1s, default 30s, import/low/mailers_low/within_1_hour/within_3_hours 300s, etc.).

AWS target-tracking only scales in when every policy can confirm it's safe. Verified on eu-cyan-sidekiq-main (2026-06-22): the publisher emitted 13 queues but never within_3_hours (that shard never enqueues to it), so its scale-in alarm sat in INSUFFICIENT_DATA and the service ratcheted from 1 to 5 tasks over weeks with zero scale-in events between spikes. Roughly $800/mo in waste across low-traffic shards.

What was added

  • required_queues: kwarg on SidekiqReporter.enable. The collector lambda closes over it and passes it through to collect_metrics.
  • collect_metrics(required_queues = []): keeps zero-arg backwards compatibility, walks the dedup'd union of required_queues and Sidekiq::Queue.all, defaulting absent queues to 0. Preserves the existing paused? ? 0 : latency rule for live queues.
  • Spec coverage: existing zero-arg tests preserved; new tests cover the required-queue plumbing through .enable, missing-queue zero emission, and dedup when a required queue is also live.

What was intentionally left out

  • The canonical queue list itself stays out of the gem. It's deployment-specific (which queues have policies depends on the infrastructure config), so passing it in from the consumer keeps the gem decoupled.
  • No CHANGELOG.md entry (none exists in this repo today).

Key constraints to verify

  • Calling .enable without required_queues: must behave exactly as in 0.2.0. The default [] plus the dedup'd union ensures this.
  • The minor-version bump (0.2.0 to 0.3.0) reflects the new public kwarg without breaking the old surface.

@bjonord bjonord requested a review from himynameisjonas June 22, 2026 10:24
@bjonord bjonord self-assigned this Jun 22, 2026
`SidekiqReporter` only published metrics for queues that currently
exist in Redis. A queue with a per-queue CloudWatch autoscaling
policy that never enqueues on a given shard would sit permanently
in INSUFFICIENT_DATA, blocking scale-in for the whole ECS service.

Add an optional `required_queues:` kwarg to `.enable`. The collector
iterates `(required_queues + live.keys).uniq` and emits `0` (the
correct latency for an empty queue) for any required queue missing
from Redis. Behavior with `required_queues` omitted is unchanged,
so existing consumers are unaffected.

Bump to 0.3.0 (new public kwarg).
@bjonord bjonord force-pushed the allow-publishing-zero-for-required-queues branch from 37ebb3b to 3e8cafe Compare June 22, 2026 14:00
@bjonord bjonord merged commit 58e405d into main Jun 22, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants