feat(preprod): Add odiff server wrapper and Dockerfile binary install#109380
feat(preprod): Add odiff server wrapper and Dockerfile binary install#109380NicoHinderling merged 12 commits intomasterfrom
Conversation
self-hosted/Dockerfile
Outdated
| && apt-get update \ | ||
| && apt-get install -y --no-install-recommends $buildDeps \ | ||
| && uv sync --frozen --quiet --no-install-project \ | ||
| && case "$(dpkg --print-architecture)" in \ |
There was a problem hiding this comment.
we should move this into its own layer as it's separate from uv.lock, otherwise we'll do this every time a python dep changes
pretty sure this is all possible with docker primitives + multistage (to handle the docker arch for multiplat builds) at this point too; let me just commit this for you
There was a problem hiding this comment.
Thank you!
cfd6b61 to
4b1becf
Compare
This stack of pull requests is managed by Graphite. Learn more about stacking. |
10e12d3 to
5b2dabe
Compare
5b2dabe to
01b72d2
Compare
ca48cee to
12f0802
Compare
| from dataclasses import dataclass | ||
|
|
||
|
|
||
| @dataclass(frozen=True) |
There was a problem hiding this comment.
There was a problem hiding this comment.
I don't have too strong of an opinion but I'd lean to convention over a perfect performance implementation. I haven't personally seen dataclass used in the codebase, but I widely see pydantic models used, so for ease of use/devs to reuse best practices I'd lean towards using the more widely used convention, but up to you
There was a problem hiding this comment.
(i moved it to pydantic in this most recent iteration btw)
| if process is None or process.stdin is None or process.stdout is None: | ||
| raise RuntimeError("odiff server failed to initialize") | ||
| self._request_id += 1 | ||
| request: dict[str, object] = { |
There was a problem hiding this comment.
Might be a bit cleaner to have this be a type so we don't use untyped dicts
There was a problem hiding this comment.
for this case, I think we can leave this request object as is
| compare_path: str | Path, | ||
| output_path: str | Path, | ||
| **options: object, | ||
| ) -> dict[str, Any]: |
There was a problem hiding this comment.
Might be good to type the response with a simple model to we're not using untyped dicts
| def __exit__(self, *args: object) -> None: | ||
| self.close() | ||
|
|
||
| def _read_json(self, line: bytes) -> dict[str, Any]: |
There was a problem hiding this comment.
We likely wouldn't need this if we just had a type for the response.
| raise RuntimeError("odiff server timed out after 30s") | ||
| line = process.stdout.readline() | ||
| try: | ||
| response = self._read_json(line) |
There was a problem hiding this comment.
This would become something like response = OdiffResponse.model_validate(line) if we used a pydantic model: https://docs.pydantic.dev/latest/concepts/json/
There was a problem hiding this comment.
gotcha, okay let me try that out
There was a problem hiding this comment.
I kept parse_obj(json.loads(line)) rather than model_validate_json since Sentry pins pydantic to v1 (pydantic>=1.10.26,<2 in pyproject.toml). model_validate_json is v2-only and doesn't exist at runtime or in mypy's pydantic plugin unfortunately
| "jest-fail-on-console": "3.3.1", | ||
| "jest-junit": "16.0.0", | ||
| "knip": "5.83.1", | ||
| "odiff-bin": "^4.3.2", |
There was a problem hiding this comment.
Why do we need the package.json change? This is only touching backend
There was a problem hiding this comment.
right, i removed it initially as you remember, but as it turns out, for local development, this is the easiest way to have access to odiff
| '@babel/preset-react', // Still used in jest | ||
| '@babel/preset-typescript', // Still used in jest | ||
| '@emotion/babel-plugin', // Still used in jest | ||
| 'odiff-bin', // raw binary consumed by Python backend, not a JS import |
There was a problem hiding this comment.
Hmm, why is this needed in knip? That's only a frontend check IIUC
There was a problem hiding this comment.
this is needed because odiff is used for dev purposes and knip freaks out that we have a package that doesnt seem to be referenced in the frontend code
b1f6cc6 to
2296369
Compare
| def __exit__(self, *args: object) -> None: | ||
| self.close() | ||
|
|
||
| def _read_json(self, line: bytes) -> dict[str, object]: |
There was a problem hiding this comment.
I'd like to remove this entirely, it seems it's only used twice (_read_response and _start). _read_response is already typed and the json.loads should be fine to do inline. If it throws, that's a valid error we want to catch (and catching and rethrowing anyways?)
In _start I feel we should just inline json.loads(). Both errors here are just rethrown, so I don't see what catching and rethrowing is doing in this function
There was a problem hiding this comment.
tried to incorporate this feedback. Removed _read_json entirely and inlined json.loads at both call sites. Kept parse_obj over model_validate_json since we pin pydantic to v1.
Add the odiff subprocess wrapper (OdiffServer) for image comparison, the DiffResult type, and install the odiff binary in the self-hosted Dockerfile.
2296369 to
4b62d3f
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
…109381) ## Summary - Add `compare_images` and `compare_images_batch` functions using dual-threshold detection (base + color-sensitive) - Produces base64-encoded diff mask PNGs with pixel-level change data - Unit tests covering identical, different, different-size, threshold sensitivity, bytes input, and batch modes **Stack**: 2/3 — depends on #109380, next: Celery task --------- Co-authored-by: Claude <noreply@anthropic.com>

Summary
OdiffServersubprocess wrapper that communicates with odiff via JSON-over-stdin/stdout protocolDiffResultdataclass for structured diff outputStack: 1/3 — next: image comparison library