feat(preprod): Add snapshot image comparison task with odiff #109150
NicoHinderling wants to merge 1 commit into master from
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking.
🚨 Warning: This pull request contains Frontend and Backend changes! Making changes to Sentry's Frontend and Backend in a single pull request is discouraged: the Frontend and Backend are not atomically deployed. If the changes are interdependent, they must be split into two pull requests and made forward- or backward-compatible, so that the Backend or Frontend can be safely deployed independently. Have questions? Please ask in the
Force-pushed 5bc85e9 to eb716d6
Force-pushed eb716d6 to 6ced8f1
Force-pushed 6ced8f1 to 66b9902
Force-pushed 2eb8356 to 9f93bed
Force-pushed b3074bd to 09db923
Force-pushed 09db923 to b3e1513
```python
from dataclasses import dataclass


@dataclass(frozen=True)
```
Why use dataclass over a pydantic model?
My (very) limited understanding is that Pydantic matters for data models at API boundaries, because it validates and coerces incoming data at runtime (e.g. when you receive JSON): type coercion (string "42" → int 42), required/optional field enforcement, custom validators, and clear error messages when validation fails.
For this class, we construct it ourselves from known-correct values inside our own code, so there's nothing to validate; it isn't built from external input or serialized to JSON.
Pydantic also carries a slight runtime overhead, so for this use case a plain dataclass seems like the better fit.
Let me know if you think I'm missing something.
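As a quick sketch of the trade-off (the class and field names below are illustrative, not from this PR): a frozen dataclass gives an immutable, low-overhead value object with no runtime validation, which is exactly the property being relied on here.

```python
from dataclasses import FrozenInstanceError, dataclass


# Hypothetical internal value object: constructed only from known-correct
# values inside our own code, so there is nothing for Pydantic-style
# runtime validation to check.
@dataclass(frozen=True)
class ComparisonResult:
    filename: str
    diff_pixel_count: int


result = ComparisonResult(filename="home.png", diff_pixel_count=42)

# frozen=True makes instances immutable and hashable; assigning to
# result.diff_pixel_count after construction raises FrozenInstanceError.
```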
Force-pushed db87543 to 8f0eb58
Force-pushed 8f0eb58 to b277064
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
Force-pushed b277064 to bb07625
Force-pushed bb07625 to 5c143a6
Cursor Bugbot has reviewed your changes and found 2 potential issues.
```python
    response = self._read_json(line)
except RuntimeError:
    self._process = None
    raise
```
Process not killed when nulling reference on error
Low Severity
When _read_json raises RuntimeError (e.g., due to JSONDecodeError from malformed output), self._process is set to None without killing the underlying process. The close() method will then no-op since self._process is already None, and the process local variable in compare() is the only remaining reference. When it goes out of scope, Popen.__del__ does not kill the subprocess — it only warns. The same pattern applies in the BrokenPipeError handler, though there the process is likely already dead. Explicitly terminating the process before nulling the reference would prevent orphaned processes.
Additional Locations (1)
Fixed — both error handlers now explicitly process.kill() + process.wait() before nulling the reference, preventing orphaned subprocesses.
- Claude Code
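The described fix can be sketched roughly like this (class and method names are assumptions, not the PR's actual code): terminate and reap the child before dropping the last strong reference to it.

```python
import subprocess


class OdiffServer:
    """Illustrative sketch of the cleanup pattern, not the real implementation."""

    def __init__(self):
        self._process = None

    def _shutdown_on_error(self):
        # Kill before nulling the reference: Popen.__del__ only emits a
        # ResourceWarning for a still-running child, it never kills it.
        process = self._process
        self._process = None
        if process is not None and process.poll() is None:
            process.kill()
            process.wait()  # reap to avoid a zombie
```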
```python
    extra={"head_artifact_id": head_artifact_id, "base_artifact_id": base_artifact_id},
)
comparison.state = PreprodSnapshotComparison.State.PROCESSING
comparison.save(update_fields=["state"])
```
Created comparison lacks atomic state guard against duplicates
Low Severity
The created=True branch transitions the comparison from PENDING to PROCESSING via a non-atomic save, while the not created branch correctly uses an atomic filter().update() guard. If two Celery workers dispatch the same task concurrently, one gets created=True and proceeds unconditionally, while the other gets created=False and can also pass the atomic PENDING check before the first worker's save completes. Both workers then fully execute the comparison, and if one fails, its FAILED state save could overwrite the other's SUCCESS.
Additional Locations (1)
This is a false positive. get_or_create is backed by a unique constraint on (head_snapshot_metrics, base_snapshot_metrics), so only one worker can ever get created=True — the other will either get created=False or hit the IntegrityError handler, both of which route through the atomic filter().update() path. The plain save() on the created=True branch is safe because we definitively own the row we just inserted. Added a comment in code to clarify this.
Force-pushed b2c0d14 to 266ccd9
Force-pushed 266ccd9 to f96f69a
Cursor Bugbot has reviewed your changes and found 3 potential issues.
```python
    extra={"head_artifact_id": head_artifact_id, "base_artifact_id": base_artifact_id},
)
comparison.state = PreprodSnapshotComparison.State.PROCESSING
comparison.save(update_fields=["state"])
```
Race allows duplicate processing between created and not-created paths
Medium Severity
The get_or_create inserts the comparison with PENDING state, then a separate save() transitions it to PROCESSING. Between these two non-atomic steps, a concurrent worker taking the created=False path can atomically filter(state=PENDING).update(state=PROCESSING) and succeed, causing both workers to proceed with the comparison simultaneously. Creating with PROCESSING state directly (or using an atomic update in the created=True path as well) would close this window.
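The suggested fix amounts to an atomic compare-and-set on the state column. A minimal sketch, with raw sqlite3 standing in for the Django ORM's `filter(...).update(...)` (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comparison (id INTEGER PRIMARY KEY, state TEXT)")
conn.execute("INSERT INTO comparison (id, state) VALUES (1, 'PENDING')")


def claim_for_processing(conn, comparison_id):
    # Atomic compare-and-set: only the caller that finds PENDING wins;
    # everyone else sees zero rows updated and backs off.
    cur = conn.execute(
        "UPDATE comparison SET state = 'PROCESSING' "
        "WHERE id = ? AND state = 'PENDING'",
        (comparison_id,),
    )
    return cur.rowcount == 1


first = claim_for_processing(conn, 1)   # row was PENDING: claim succeeds
second = claim_for_processing(conn, 1)  # already PROCESSING: claim fails
```

Routing both the `created=True` and `created=False` paths through a guard like this closes the window the review comment describes.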
```python
comparison.state = PreprodSnapshotComparison.State.FAILED
comparison.error_code = PreprodSnapshotComparison.ErrorCode.INTERNAL_ERROR
comparison.save(update_fields=["state", "error_code"])
raise
```
Error handler defeats retry by persisting terminal state
Medium Severity
The task is configured with Retry(times=3) and processing_deadline_duration=300, so it's designed to retry on timeouts. However, the except Exception: handler catches TimeoutError, sets comparison.state to FAILED, and then re-raises. On retry, the state-guard at line 126–136 finds the comparison in FAILED state (not PENDING), so the atomic update returns updated=0 and the task exits early without reprocessing. This makes the retry configuration completely ineffective for the one scenario it's meant to handle.
Additional Locations (1)
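One possible shape for a fix, sketched with a stand-in model object (all names here are assumptions, not the PR's code): restore PENDING on retryable timeouts so the retry's PENDING guard can pass, and persist FAILED only for non-retryable errors.

```python
class Comparison:
    """Minimal stand-in for the model; only `state` matters here."""

    def __init__(self):
        self.state = "PROCESSING"


def finish_comparison(comparison, work):
    try:
        work()
        comparison.state = "SUCCESS"
    except TimeoutError:
        # Retryable: reset to PENDING so the retry's atomic
        # PENDING -> PROCESSING guard succeeds on the next attempt.
        comparison.state = "PENDING"
        raise
    except Exception:
        # Non-retryable: a terminal FAILED state is appropriate.
        comparison.state = "FAILED"
        raise
```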
```python
try:
    return json.loads(line)
except JSONDecodeError as e:
    raise RuntimeError(f"odiff server returned invalid JSON: {line!r}") from e
```
Startup error loses stderr due to unset process reference
Low Severity
_read_json reads stderr from self._process, but when called from _start(), self._process is still None (it's assigned only at line 81 after successful startup). If odiff exits immediately, the actual stderr content from the local proc variable is never captured, and the RuntimeError message contains an empty string, making startup failures hard to debug.
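The fix implied by this report is to read startup diagnostics from the local process handle rather than `self._process`. A hedged sketch (the command and line-based handshake are illustrative, not odiff's actual protocol):

```python
import subprocess


def start_server(argv):
    proc = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    line = proc.stdout.readline()
    if not line:
        # The child exited before producing any output; read stderr from
        # the local handle (self._process would still be None here), so
        # the error message actually explains the startup failure.
        stderr = proc.stderr.read()
        proc.wait()
        raise RuntimeError(f"server failed to start: {stderr!r}")
    return proc, line
```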



Update
Now broken into 3 PRs:
#109380
#109381
#109382
Summary
Adds visual regression detection for preprod snapshots by comparing images between head and base artifacts using the odiff binary.
New modules:
- image_diff — wraps odiff in a persistent server mode for efficient batch comparisons, using a dual-threshold approach (base + color-sensitive) to catch both structural and subtle color changes
- compare_snapshots task — async Celery task that fetches manifests from object storage, matches images by filename, and categorizes results as changed/added/removed/unchanged. Diff masks are stored as base64-encoded PNGs in object storage alongside a comparison manifest JSON

How it works:
- The PreprodSnapshotComparison model is updated with summary counts

Test plan
- Tests for compare_images and compare_images_batch covering identical, different, size-mismatched, and threshold-sensitive inputs
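The filename-matching and categorization step described in the summary can be sketched as follows; `differs` stands in for the odiff comparison, and all names here are illustrative:

```python
def categorize(head_images, base_images, differs):
    """Bucket images into changed/added/removed/unchanged by filename."""
    head, base = set(head_images), set(base_images)
    result = {
        "added": sorted(head - base),    # only in the head artifact
        "removed": sorted(base - head),  # only in the base artifact
        "changed": [],
        "unchanged": [],
    }
    for name in sorted(head & base):
        key = "changed" if differs(name) else "unchanged"
        result[key].append(name)
    return result
```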