PAYMENTS-11567 Resque latency metrics #30

Closed
WillemHoman wants to merge 1 commit into main from PAYMENTS-11567-resque_latency

@WillemHoman WillemHoman commented May 28, 2026


What? Why?

Adds opt-in latency metrics for Resque jobs.

  • resque_job_queue_latency_seconds{job_class} : how long the job spent on the queue prior to being picked up for processing
  • resque_job_perform_duration_seconds{job_class} : how long it takes to process the job

Off by default. Opt in per service by setting PROMETHEUS_RESQUE_PER_JOB_METRICS_ENABLED=1.

Supports both ActiveJob and vanilla Resque jobs. resque_job_perform_duration_seconds covers every job. resque_job_queue_latency_seconds is emitted only for jobs whose payload has the ActiveJob shape, because it reads its timestamps from that payload (see below).

The metrics introduced

resque_job_queue_latency_seconds

How long the job spent on the queue: time from scheduled_at (or enqueued_at) until the worker picks the job up.
Buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 30, 60, 120, 300].
Latency is clamped to 0.0 minimum so that clock skew between the enqueuer and the worker host cannot produce a negative observation.
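
The clamping rule above can be illustrated with a minimal sketch. The helper name queue_latency_seconds and its now: parameter are hypothetical and are not taken from the gem; the sketch only assumes the anchor arrives as an ISO 8601 string, as described in the payload fields.

```ruby
require 'time'

# Hypothetical helper (not the gem's API): seconds between the anchor
# timestamp and "now", clamped so that clock skew between hosts can
# never produce a negative observation.
def queue_latency_seconds(anchor_iso8601, now: Time.now)
  anchor = Time.iso8601(anchor_iso8601)
  [now - anchor, 0.0].max
end

now = Time.iso8601('2026-06-11T01:23:50Z')
queue_latency_seconds('2026-06-11T01:23:45Z', now: now) # anchor 5 s in the past
queue_latency_seconds('2026-06-11T01:23:55Z', now: now) # anchor in the future, clamped to 0.0
```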

This metric relies on the job having the ActiveJob payload shape.
i.e. payload['args'][0] (a Hash) must contain

  • job_class (string) — used as the metric label.
  • enqueued_at (ISO 8601 string) — anchor when scheduled_at is absent.
  • scheduled_at (ISO 8601 string, optional) — preferred anchor when present.

For vanilla Resque jobs without these fields, the metric silently no-ops.

Sample payloads for both shapes are in Payload parsing below.

The gem detects the payload shape by the fields above rather than by matching the JobWrapper class name, so that non-ActiveJob Resque jobs can also opt in by populating the necessary fields.

Converting vanilla jobs to support resque_job_queue_latency_seconds

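Vanilla Resque jobs that need queue latency should be either converted to ActiveJob (see Converting to ActiveJob) or enqueued through a wrapper class that stamps the required fields (see Wrapper class alternative).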

Retries

ActiveJob re-stamps both enqueued_at and scheduled_at on every retry (including each re-queue).
This means that resque_job_queue_latency_seconds only reflects the latency for each individual retry.
To measure the entire duration from initial enqueue until dequeue of the final attempt, a separate userland metric is needed.
For example, domain events should have an occurred_at timestamp as per the docs.

occurred_at is populated with the time at which the event occurred, typically the current time

resque_job_perform_duration_seconds

How long the job takes to process.
This is the total child-process lifetime from fork to Process.waitpid return.
Buckets: [0.05, 0.1, 0.25, 0.5, 1, 2, 5, 10, 30, 60].

This works for both vanilla Resque jobs and ActiveJobs and doesn't rely on any payload fields.

Why this is implemented in bc-prometheus-ruby instead of each service

In-child metric collection requires a synchronous flush before exit!, which is how Resque children exit by default.
The flush is bounded by the bc-prometheus-ruby worker thread's sleep cadence (0.5 s by default), which makes it slow relative to fast publishes.
For example, this metric was initially introduced in https://github.com/bigcommerce/bigpay/pull/10597 but had to be reverted because it increased Resque job run time from 20 ms to 500 ms.
In contrast, the parent process is long-lived, so the bc-prom worker thread drains naturally between jobs without any synchronous wait.
Moving the instrumentation to the parent eliminates the per-job latency tax entirely.

This also ensures consistent metric names and labels.

Opt-in env var

This is opt-in because per-job series can be high cardinality when a service has many Resque job classes.
This is enabled by PROMETHEUS_RESQUE_PER_JOB_METRICS_ENABLED (default 0).
It is read once at boot via the existing Bigcommerce::Prometheus.configure.
This mirrors the existing PROMETHEUS_ENABLED opt-in pattern.
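
As a rough illustration of the opt-in gate, the sketch below reads the variable once and caches the result. The method name per_job_metrics_enabled? and the rule that only the literal "1" enables the feature are assumptions; the gem's actual parsing inside Bigcommerce::Prometheus.configure may differ.

```ruby
# Hypothetical reader (assumption): treats only the literal "1" as enabled.
# An unset variable falls back to the documented default of 0.
def per_job_metrics_enabled?(env = ENV)
  env.fetch('PROMETHEUS_RESQUE_PER_JOB_METRICS_ENABLED', '0').to_s.strip == '1'
end

# Read once at boot and reuse the cached value afterwards.
PER_JOB_METRICS_ENABLED = per_job_metrics_enabled?
```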

Implementation

bc-prometheus-ruby follows a producer/aggregator split.
Worker processes push metric observations as JSON envelopes to a long-running aggregator server, which accumulates them and exposes /metrics for Prometheus to scrape.
The new per-job metrics follow the same pattern:

  • a producer hook installed in the worker process captures queue latency and perform duration and pushes type: 'resque_job' envelopes.
  • a new aggregator type collector (TypeCollectors::ResqueJob) on the server side receives those envelopes and records them into two Prometheus histograms.

Producer — Bigcommerce::Prometheus::Integrations::Resque::JobMetrics

Records the metrics and pushes them to the aggregator.

Payload parsing

Resque payloads arrive in one of two shapes: ActiveJob-enqueued or vanilla.

An ActiveJob enqueued via .perform_later is stored with ActiveJob's JobWrapper as the Resque class, and the job's own details serialized into args[0]:

{
  "class": "ActiveJob::QueueAdapters::ResqueAdapter::JobWrapper",
  "args": [
    {
      "job_class": "BigPay::SomePublishJob",
      "job_id": "07b08b09-8a1e-4f4d-9e0e-1f2a3b4c5d6e",
      "queue_name": "scheduled_action",
      "arguments": [12345],
      "enqueued_at": "2026-06-11T01:23:45.123456789Z",
      "scheduled_at": null
    }
  ]
}

A vanilla job enqueued via Resque.enqueue(BigPay::Kount::ProcessNotificationJob, 12345) is stored with its own class at the top level and raw positional args:

{
  "class": "BigPay::Kount::ProcessNotificationJob",
  "args": [12345]
}

Payload parsing models each shape as its own DTO behind a common interface of #job_class and #anchor_time:

  • ActiveJobPayload — built from the inner args[0] hash. From the first example: job_class is "BigPay::SomePublishJob", and anchor_time parses from scheduled_at, falling back to enqueued_at.
  • VanillaResquePayload — built from the whole payload. From the second example: job_class is "BigPay::Kount::ProcessNotificationJob". anchor_time is always nil, because the payload carries no timestamps. Malformed payloads yield 'unknown'.

JobPayload.for(resque_job) decides which to build, once per job.
The criterion: a payload is ActiveJob-shaped when args[0] is a Hash carrying a truthy job_class; everything else is treated as vanilla.
In the examples above, the first payload meets the criterion and the second does not.

Both DTOs eagerly extract everything in initialize and hold no reference to the Resque::Job afterwards.
The recording methods take the prebuilt payload rather than reparsing the job on each call.
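
The classification criterion can be sketched as follows. This is an illustrative approximation, not the code in this PR: it takes a raw payload Hash rather than a Resque::Job, models the DTOs as Structs, and the module name JobPayloadSketch and helper parse_time are hypothetical.

```ruby
require 'time'

# Both DTOs expose the same interface: #job_class and #anchor_time.
ActiveJobPayload     = Struct.new(:job_class, :anchor_time)
VanillaResquePayload = Struct.new(:job_class, :anchor_time)

module JobPayloadSketch
  # ActiveJob-shaped when args[0] is a Hash carrying a truthy job_class;
  # everything else is treated as vanilla.
  def self.for(payload)
    inner = payload.is_a?(Hash) ? Array(payload['args']).first : nil
    if inner.is_a?(Hash) && inner['job_class']
      # Prefer scheduled_at, fall back to enqueued_at.
      stamp = inner['scheduled_at'] || inner['enqueued_at']
      ActiveJobPayload.new(inner['job_class'].to_s, parse_time(stamp))
    else
      klass = payload.is_a?(Hash) ? payload['class'] : nil
      # Vanilla payloads carry no timestamps, so the anchor is always nil.
      VanillaResquePayload.new(klass ? klass.to_s : 'unknown', nil)
    end
  end

  # Returns nil for missing or unparseable timestamps.
  def self.parse_time(value)
    value ? Time.iso8601(value.to_s) : nil
  rescue ArgumentError
    nil
  end
end
```

Applied to the two sample payloads above, the first yields an ActiveJobPayload anchored on enqueued_at (scheduled_at is null) and the second yields a VanillaResquePayload with a nil anchor.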

Consequences: resque_job_perform_duration_seconds emits a meaningful job_class label for vanilla Resque jobs too.
However, resque_job_queue_latency_seconds no-ops for vanilla Resque jobs, since their #anchor_time is always nil.

How the metrics are recorded

WorkerInstrumentation is a module prepended onto Resque::Worker that wraps perform_with_fork.
It builds a payload object once per job via JobPayload.for, records queue latency before super, and records perform duration in ensure:

def perform_with_fork(job, &block)
  started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  payload    = JobPayload.for(job)
  JobMetrics.record_queue_latency(payload)
  super
ensure
  JobMetrics.record_perform_duration(
    payload,
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at
  )
end

Class-method API

  • JobMetrics.start(client:) — no-op unless the env var is on. Prepends WorkerInstrumentation onto Resque::Worker. Idempotent.
  • JobMetrics.record_queue_latency(payload) — records how long the job sat on the queue: the seconds from the payload's #anchor_time to now. Feeds resque_job_queue_latency_seconds, labelled by the payload's #job_class. Does nothing when the payload has no anchor.
  • JobMetrics.record_perform_duration(payload, duration) — records how long the job took to process; the caller measures and supplies the duration. Feeds resque_job_perform_duration_seconds, labelled by the payload's #job_class.
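
The "no-op unless enabled, idempotent prepend" behaviour of start can be sketched in isolation. FakeWorker, TimingHook and install_hook are hypothetical stand-ins for Resque::Worker, WorkerInstrumentation and JobMetrics.start; the real implementation may guard idempotency differently.

```ruby
# Stand-in for Resque::Worker (assumption, for illustration only).
class FakeWorker
  def perform_with_fork(job)
    "performed #{job}"
  end
end

# Stand-in for WorkerInstrumentation: wraps the method and times it.
module TimingHook
  def perform_with_fork(job)
    started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    super
  ensure
    TimingHook.durations << (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at)
  end

  def self.durations
    @durations ||= []
  end
end

# Returns true only when the hook was newly installed.
def install_hook(worker_class, enabled:)
  return false unless enabled
  return false if worker_class.ancestors.include?(TimingHook)

  worker_class.prepend(TimingHook)
  true
end

install_hook(FakeWorker, enabled: true) # installs the hook
install_hook(FakeWorker, enabled: true) # second call changes nothing
FakeWorker.new.perform_with_fork('job-1')
```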

On the wire, each recording is sent to the aggregator with an envelope of type: 'resque_job'.

Both record_* methods rescue StandardError and log a warning — metric push failures never propagate into the publish/perform path.
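
A minimal sketch of the rescue-and-warn pattern, under stated assumptions: the client is anything that responds to send_json, and the envelope keys other than type (metric, job_class, value) are illustrative guesses rather than the wire format used by this PR.

```ruby
require 'json'
require 'logger'

# Stand-in client that records what would be pushed to the aggregator.
class RecordingClient
  attr_reader :sent

  def initialize
    @sent = []
  end

  def send_json(envelope)
    @sent << JSON.generate(envelope)
  end
end

# Stand-in client that always fails, to exercise the rescue path.
class FailingClient
  def send_json(_envelope)
    raise IOError, 'aggregator unreachable'
  end
end

module SafeJobMetrics
  # Returns true on success and false on failure; never raises StandardError.
  def self.record_perform_duration(client, job_class, duration, logger: Logger.new($stderr))
    client.send_json(
      type: 'resque_job',                 # routing key named in this PR
      metric: 'perform_duration_seconds', # hypothetical field name
      job_class: job_class,
      value: duration
    )
    true
  rescue StandardError => e
    logger.warn("resque_job metric push failed: #{e.class}: #{e.message}")
    false
  end
end
```

With FailingClient the call logs a warning and returns false, so the surrounding perform path carries on unaffected.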

Aggregator — type collectors

Everything above is the producer side, running in the worker process. The aggregator is the other half: the long-running server process that receives the pushed envelopes and exposes the accumulated metrics at /metrics for Prometheus to scrape.

Two type collectors, one per envelope shape, are registered side-by-side in Instrumentors::Resque#start.
The upstream PrometheusExporter::Server::Collector routes each envelope to whichever collector's type matches envelope['type'] — no in-collector dispatch needed.

  • Bigcommerce::Prometheus::TypeCollectors::Resque: the pre-existing aggregator. It continues to own the aggregate worker/queue gauges (resque_workers_total, jobs_failed_total, jobs_pending_total, jobs_processed_total, queues_total, queue_sizes) fed by Collectors::Resque#collect.
  • Bigcommerce::Prometheus::TypeCollectors::ResqueJob: the new aggregator. It owns the two new histograms:
    • resque_job_queue_latency_seconds with buckets tuned for queue dwell ([0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 30, 60, 120, 300]).
    • resque_job_perform_duration_seconds with buckets tuned for per-job work ([0.05, 0.1, 0.25, 0.5, 1, 2, 5, 10, 30, 60]).
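
To show how an observation lands in these cumulative buckets, here is a small self-contained sketch. SketchHistogram is a hypothetical teaching aid, not the histogram class used by the aggregator, and the observed values 0.8 and 1.7 are made up.

```ruby
PERFORM_BUCKETS = [0.05, 0.1, 0.25, 0.5, 1, 2, 5, 10, 30, 60].freeze

# Minimal cumulative histogram: an observation increments every bucket
# whose upper bound (le) is greater than or equal to the value.
class SketchHistogram
  attr_reader :counts, :sum, :count

  def initialize(buckets)
    @buckets = buckets
    @counts  = buckets.to_h { |le| [le, 0] }
    @sum     = 0.0
    @count   = 0
  end

  def observe(value)
    @buckets.each { |le| @counts[le] += 1 if value <= le }
    @sum   += value
    @count += 1
  end
end

histogram = SketchHistogram.new(PERFORM_BUCKETS)
histogram.observe(0.8) # counted in le=1 and every larger bucket
histogram.observe(1.7) # counted in le=2 and every larger bucket
```

After these two observations the le=0.5 bucket holds 0, le=1 holds 1, and le=2 and above hold 2, the same cumulative pattern that Prometheus histogram output follows.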

Wiring

Integrations::Resque.start(client:) now also calls JobMetrics.start(client:).
If the env var is off, that call returns immediately and nothing is hooked. If on, the hooks install and the metrics flow.

Integrations::Resque.start is itself invoked inside a Resque.before_first_fork block (registered by Instrumentors::Resque#setup_middleware).

Converting to ActiveJob

Converting a vanilla job is mostly mechanical, with three things to plan for:

  • In-flight old-shape payloads in Redis will fail against the converted class at deploy time. Drain the queue first, or keep a temporary self.perform shim.
  • resque-scheduler YAML entries can't enqueue ActiveJobs natively. Each scheduled entry needs a small shim class.
  • Arguments pass through ActiveJob serialization. ActiveRecord records become GlobalIDs.

Wrapper class alternative

An alternative to converting a vanilla job to ActiveJob: a small service-local wrapper class that produces the ActiveJob payload shape at enqueue time.

The wrapper stamps the fields the gem reads into args[0] and delegates perform to the target job. The target job class is untouched. Callers opt in by enqueueing through the wrapper.

class InstrumentedEnqueue
  def self.enqueue(job_class, *args)
    Resque.enqueue_to(
      Resque.queue_from_class(job_class),
      self,
      {
        'job_class' => job_class.name,
        'enqueued_at' => Time.now.utc.iso8601(9),
        'arguments' => args
      }
    )
  end

  def self.perform(payload)
    payload['job_class'].constantize.perform(*payload['arguments'])
  end
end

# the call site changes from
Resque.enqueue(BigPay::Kount::ProcessNotificationJob, 12345)
# to
InstrumentedEnqueue.enqueue(BigPay::Kount::ProcessNotificationJob, 12345)

This produces a payload that meets the classification criterion in Payload parsing: args[0] is a Hash with a truthy job_class. queue_latency anchors on the stamped enqueued_at, and both metrics label with the target job's class name.

Trade-offs versus converting to ActiveJob:

  • Per-call-site opt-in. Only enqueues that go through the wrapper get the metric. Un-migrated call sites coexist safely — their payloads simply skip the metric — so there is no deploy-ordering risk.
  • The key names must match the gem's contract exactly: job_class, enqueued_at, scheduled_at.
  • resque-scheduler YAML entries don't fit this pattern. Recurring scheduled jobs need a different approach.

Testing

# HELP ruby_resque_job_queue_latency_seconds Seconds between when a Resque job was due to run (scheduled_at if set, falling back to enqueued_at) and when a worker process picked it up. Recorded per attempt; retries-with-backoff anchor on scheduled_at, excluding the intentional backoff wait. Opt-in via PROMETHEUS_RESQUE_PER_JOB_METRICS_ENABLED.
# TYPE ruby_resque_job_queue_latency_seconds histogram
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="+Inf"} 4
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="300"} 4
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="120"} 4
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="60"} 4
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="30"} 4
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="5"} 4
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="2.5"} 0
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="1"} 0
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="0.5"} 0
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="0.25"} 0
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="0.1"} 0
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="0.05"} 0
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="0.01"} 0
ruby_resque_job_queue_latency_seconds_count{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob"} 4
ruby_resque_job_queue_latency_seconds_sum{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob"} 17.02194
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="+Inf"} 2
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="300"} 2
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="120"} 2
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="60"} 2
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="30"} 2
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="5"} 2
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="2.5"} 0
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="1"} 0
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="0.5"} 0
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="0.25"} 0
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="0.1"} 0
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="0.05"} 0
ruby_resque_job_queue_latency_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="0.01"} 0
ruby_resque_job_queue_latency_seconds_count{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob"} 2
ruby_resque_job_queue_latency_seconds_sum{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob"} 7.9414940000000005

# HELP ruby_resque_job_perform_duration_seconds Total Resque child process lifetime (fork to waitpid). Includes fork overhead, Redis reconnect, after_fork hooks, perform, and exit. Used as the per-job throughput signal at the worker-pod level. Opt-in via PROMETHEUS_RESQUE_PER_JOB_METRICS_ENABLED.
# TYPE ruby_resque_job_perform_duration_seconds histogram
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="+Inf"} 2
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="60"} 2
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="30"} 2
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="10"} 2
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="5"} 2
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="2"} 2
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="1"} 1
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="0.5"} 0
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="0.25"} 0
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="0.1"} 0
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob",le="0.05"} 0
ruby_resque_job_perform_duration_seconds_count{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob"} 2
ruby_resque_job_perform_duration_seconds_sum{job_class="BigPay::DomainEventing::Payment::Webhook::PublishWebhookTransactionCreatedEventJob"} 2.5267620000522584
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="+Inf"} 2
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="60"} 2
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="30"} 2
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="10"} 2
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="5"} 2
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="2"} 2
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="1"} 2
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="0.5"} 0
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="0.25"} 0
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="0.1"} 0
ruby_resque_job_perform_duration_seconds_bucket{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob",le="0.05"} 0
ruby_resque_job_perform_duration_seconds_count{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob"} 2
ruby_resque_job_perform_duration_seconds_sum{job_class="BigPay::DomainEventing::Payment::Transaction::PublishUnstablePaymentTransactionCreatedEventJob"} 1.4121759999543428
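
One way to read the sample output is to divide each _sum by its _count to get the mean observation. The helper mean_seconds below is a hypothetical convenience, and the numbers are copied from the PublishWebhookTransactionCreatedEventJob series above.

```ruby
# Mean observation for a histogram series: _sum divided by _count.
def mean_seconds(sum, count)
  return nil if count.zero?

  sum / count
end

webhook_queue_latency = mean_seconds(17.02194, 4)
webhook_perform       = mean_seconds(2.5267620000522584, 2)

puts webhook_queue_latency.round(3) # 4.255
puts webhook_perform.round(3)       # 1.263
```

The queue latency mean of roughly 4.26 s is consistent with the buckets, where all four observations fall between le=2.5 and le=5.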

@WillemHoman WillemHoman force-pushed the PAYMENTS-11567-resque_latency branch 3 times, most recently from fe8acb7 to 12b01d8 Compare May 29, 2026 00:11
@WillemHoman

Author

bugbot run

@cursor cursor Bot left a comment


✅ Bugbot reviewed your changes and found no new issues!

Reviewed by Cursor Bugbot for commit 12b01d8. Configure here.

Copilot AI left a comment


Pull request overview

Adds an opt-in Resque integration to emit per-job Prometheus histograms (queue latency and perform duration) from the parent worker process, avoiding the per-job synchronous flush overhead that previously regressed throughput in fork-per-job children.

Changes:

  • Add PROMETHEUS_RESQUE_PER_JOB_METRICS_ENABLED configuration and parent-side Resque worker instrumentation to emit resque_job_* envelopes.
  • Introduce a dedicated resque_job type collector (TypeCollectors::ResqueJob) to expose the new per-job histograms.
  • Add/adjust specs and documentation (README + changelog) for the new functionality.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

Show a summary per file
File | Description
spec/bigcommerce/prometheus/type_collectors/resque_spec.rb | Updates spec to exercise TypeCollectors::Base#collect label-merging behavior.
spec/bigcommerce/prometheus/type_collectors/resque_job_spec.rb | Adds spec coverage for new ResqueJob type collector routing + histogram observation behavior.
spec/bigcommerce/prometheus/integrations/resque/job_payload_spec.rb | Adds spec coverage for payload parsing (job_class + anchor timestamp selection/parsing).
spec/bigcommerce/prometheus/integrations/resque/job_metrics_spec.rb | Adds spec coverage for envelope shape and error-rescue behavior without requiring Resque.
README.md | Documents opt-in per-job metrics and adds configuration table row.
lib/bigcommerce/prometheus/type_collectors/resque.rb | Clarifies responsibility boundaries between aggregate Resque metrics vs per-job metrics.
lib/bigcommerce/prometheus/type_collectors/resque_job.rb | New type collector for per-job histograms with explicit type: 'resque_job' routing.
lib/bigcommerce/prometheus/integrations/resque/job_payload.rb | New payload parser used by per-job metrics logic.
lib/bigcommerce/prometheus/integrations/resque/job_metrics.rb | New parent-side Resque worker instrumentation + metric envelope emission.
lib/bigcommerce/prometheus/integrations/resque.rb | Wires JobMetrics.start into the Resque integration startup path.
lib/bigcommerce/prometheus/instrumentors/resque.rb | Registers the new ResqueJob type collector with the exporter server.
lib/bigcommerce/prometheus/configuration.rb | Adds resque_per_job_metrics_enabled config key (env var gated).
lib/bigcommerce/prometheus.rb | Requires new integration/type collector files.
CHANGELOG.md | Adds pending-release entry for the new per-job Resque metrics.

Comment thread lib/bigcommerce/prometheus/integrations/resque/job_payload.rb
Comment thread lib/bigcommerce/prometheus/integrations/resque/job_metrics.rb
Comment thread CHANGELOG.md Outdated

Copilot AI left a comment


Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

Comment thread lib/bigcommerce/prometheus/integrations/resque/job_metrics.rb
Comment thread lib/bigcommerce/prometheus/type_collectors/resque_job.rb
Comment thread spec/bigcommerce/prometheus/type_collectors/resque_spec.rb Outdated
Comment thread README.md

Copilot AI left a comment


Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.

Comment thread lib/bigcommerce/prometheus/integrations/resque/job_metrics.rb
Comment thread lib/bigcommerce/prometheus/integrations/resque/job_metrics.rb
Comment thread lib/bigcommerce/prometheus/integrations/resque/job_metrics.rb
Comment thread spec/bigcommerce/prometheus/integrations/resque/job_metrics_spec.rb Outdated

Copilot AI left a comment


Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated no new comments.

2 participants