
Bug 2007112 - Perform a binary search backfill with the sheriffing bot with change detection technique integration#9237

Open
junngo wants to merge 1 commit into mozilla:master from junngo:sheriff-automated-backfill

Conversation

@junngo (Contributor) commented Feb 18, 2026

Hi there :)
I've integrated the smart backfill logic in this patch.
I haven't added unit tests yet. I'd like to confirm that the overall direction and policy make sense first. Once that’s agreed on, I’ll follow up with tests.
Please let me know if you have any questions or if there’s anything you’d like me to change.
Bugzilla: https://bugzilla.mozilla.org/show_bug.cgi?id=2007112
Related Patch: https://phabricator.services.mozilla.com/D282605

Features

  1. This improves on the previous approach, which backfilled a fixed set of pushes around the alert and stopped without further analysis. In this patch, the suspected change point is refined by incrementally filling in missing performance data, while minimizing unnecessary test executions.
  2. I kept the previous workflow and state machine [0]. With this logic, after a backfill completes successfully, the patch re-verifies the data to identify the actual culprit.
  3. I added new columns to the BackfillRecord table.
  • iteration_count
  • last_detected_push_id
  • anchor_push_id
  • backfill_logs [1]
  4. During verification, multiple potential culprit pushes can be detected. In such cases, we select the culprit that is closest to the previous alert point.
    In the example below [1], the initial alert point is push 10 (previous_push_id: 10). After backfilling, verification detects changes at pushes 9 and 14. Since push 9 is closer to the previous alert point (10), it is selected as the culprit.
    In this example, the search window is intentionally kept small for testing purposes. In production, we typically use a wider search window (around 12–24 pushes).
    (Note: in this example, push 9 represents a regression, while push 14 represents an improvement.)
  5. When searching to the right, the logic looks ahead by a fixed 25 pushes. Ideally, this value should match the number of pushes triggered by the real backfill action on the mozilla-central side. If we need to manage this value per record, we could add a new column for it.
  6. I intentionally did not change the existing workflow. If a failure occurs during verification (in the verify_and_iterate method), the backfill record status remains SUCCESSFUL. We could introduce a new status to represent this case more explicitly, or alternatively change the status to FAILED. I left this as a policy decision to be discussed.
  7. The re_run_detect_changes method is based on the alert generation logic [2], in order to recreate the same environment used when alerts are originally generated.
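To make the per-iteration policy above concrete, here is a minimal sketch of the direction decision that the backfill_logs entries record. The function and field names are illustrative, not the patch's actual API:

```python
def next_direction(detected_push_id, previous_push_id):
    """Classify one verification iteration, the way backfill_logs
    records it. Names here are illustrative, not the patch's API."""
    if detected_push_id == previous_push_id:
        # Same culprit detected twice in a row: stop iterating.
        return "stabilized"
    # Otherwise keep refining toward the newly detected push.
    return "left" if detected_push_id < previous_push_id else "right"
```

This matches the two iterations in the example log [1]: the first detects push 9 against a previous point of 10 ("left"), the second detects 9 against 9 ("stabilized").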

[0]

PRELIMINARY = 0
READY_FOR_PROCESSING = 1
BACKFILLED = 2
SUCCESSFUL = 3
FAILED = 4

[1] Example

[{
  "iteration": 1,
  "detected_push_id": 9,
  "detected_t_value": 7.831531802772834, 
  "candidates": [{"push_id": 9, "t_value": 7.831531802772834, "push_timestamp": 1770761099}, {"push_id": 14, "t_value": 11.361080927457797, "push_timestamp": 1770782865}], 
  "timestamp": "2026-02-17T16:31:34.889370",
  "previous_push_id": 10,
  "direction": "left",
  "notes": "Detected push moved left (from 10 to 9)"
},{
  "iteration": 2,
  "detected_push_id": 9,
  "detected_t_value": 7.831531802772834, 
  "candidates": [{"push_id": 9, "t_value": 7.831531802772834, "push_timestamp": 1770761099}, {"push_id": 14, "t_value": 11.361080927457797, "push_timestamp": 1770782865}],
  "timestamp": "2026-02-17T16:36:49.724971",
  "previous_push_id": 9,
  "direction": "stabilized",
  "notes": "Detected push same as previous, culprit stabilized"
}]
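Given a log entry like the ones above, the closest-candidate policy (feature 4) can be sketched as follows; select_culprit is a hypothetical name, not the patch's API:

```python
def select_culprit(candidates, previous_push_id):
    """Pick the detected change point closest to the previous
    alert point."""
    return min(candidates, key=lambda c: abs(c["push_id"] - previous_push_id))

# Candidates and previous push taken from iteration 1 of the log above:
candidates = [
    {"push_id": 9, "t_value": 7.831531802772834},
    {"push_id": 14, "t_value": 11.361080927457797},
]
culprit = select_culprit(candidates, previous_push_id=10)
# push 9 is 1 push away from 10, push 14 is 4 away, so push 9 is selected
```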

[2]

def generate_new_alerts_in_series(signature):
    # get series data starting from either:
    # (1) the last alert, if there is one
    # (2) the alerts max age
    # (use whichever is newer)
    max_alert_age = alert_after_ts = datetime.now() - settings.PERFHERDER_ALERTS_MAX_AGE
    series = PerformanceDatum.objects.filter(signature=signature, push_timestamp__gte=max_alert_age)
    latest_alert_timestamp = (
        PerformanceAlert.objects.filter(series_signature=signature)
        .select_related("summary__push__time")
        .order_by("-summary__push__time")
        .values_list("summary__push__time", flat=True)[:1]
    )
    if latest_alert_timestamp:
        latest_ts = latest_alert_timestamp[0]
        series = series.filter(push_timestamp__gt=latest_ts)
        if latest_ts > alert_after_ts:
            alert_after_ts = latest_ts
    datum_with_replicates = (
        PerformanceDatum.objects.filter(
            signature=signature,
            repository=signature.repository,
            push_timestamp__gte=alert_after_ts,
        )
        .annotate(
            has_replicate=Exists(
                PerformanceDatumReplicate.objects.filter(performance_datum_id=OuterRef("pk"))
            )
        )
        .filter(has_replicate=True)
    )
    replicates = PerformanceDatumReplicate.objects.filter(
        performance_datum_id__in=Subquery(datum_with_replicates.values("id"))
    ).values_list("performance_datum_id", "value")
    replicates_map: dict[int, list[float]] = {}
    for datum_id, value in replicates:
        replicates_map.setdefault(datum_id, []).append(value)
    revision_data = {}
    for d in series:
        if not revision_data.get(d.push_id):
            revision_data[d.push_id] = RevisionDatum(
                int(time.mktime(d.push_timestamp.timetuple())), d.push_id, [], []
            )
        revision_data[d.push_id].values.append(d.value)
        revision_data[d.push_id].replicates.extend(replicates_map.get(d.id, []))
    min_back_window = signature.min_back_window
    if min_back_window is None:
        min_back_window = settings.PERFHERDER_ALERTS_MIN_BACK_WINDOW
    max_back_window = signature.max_back_window
    if max_back_window is None:
        max_back_window = settings.PERFHERDER_ALERTS_MAX_BACK_WINDOW
    fore_window = signature.fore_window
    if fore_window is None:
        fore_window = settings.PERFHERDER_ALERTS_FORE_WINDOW
    alert_threshold = signature.alert_threshold
    if alert_threshold is None:
        alert_threshold = settings.PERFHERDER_REGRESSION_THRESHOLD
    data = revision_data.values()
    analyzed_series = detect_changes(
        data,
        min_back_window=min_back_window,
        max_back_window=max_back_window,
        fore_window=fore_window,
    )
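For context, a hedged sketch of how the analyzed series might then be filtered into alert candidates. The attribute names (change_detected, t) and the stand-in AnalyzedDatum class are assumptions for illustration; the real objects come from detect_changes:

```python
from dataclasses import dataclass

@dataclass
class AnalyzedDatum:
    # Stand-in for the objects returned by detect_changes; the field
    # names here are illustrative assumptions, not the real API.
    push_id: int
    t: float
    change_detected: bool

def extract_candidates(analyzed_series, alert_threshold):
    """Keep only pushes flagged as change points whose t-value
    magnitude clears the alert threshold."""
    return [
        {"push_id": d.push_id, "t_value": d.t}
        for d in analyzed_series
        if d.change_detected and abs(d.t) >= alert_threshold
    ]
```

A negative t-value would correspond to an improvement and a positive one to a regression in the example [1], which is why the filter uses the magnitude.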
