Skip to content

refactor(breaking): rename breaking module + dual-vocabulary change-category aliasing#1424

Merged
gcko merged 4 commits into
mainfrom
feature/drc-3553-m4-backend-parser-api-alias-window-oss-cloud-python
Jun 12, 2026
Merged

refactor(breaking): rename breaking module + dual-vocabulary change-category aliasing#1424
gcko merged 4 commits into
mainfrom
feature/drc-3553-m4-backend-parser-api-alias-window-oss-cloud-python

Conversation

@even-wei

@even-wei even-wei commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

PR checklist

  • Added/ran appropriate tests
  • DCO signed

What type of PR is this?

refactor / feature (backwards-compatible)

What this PR does / why we need it

M4 (OSS half) of the change-category vocabulary migration. Opens the dual-vocabulary aliasing window so Recce Cloud and OSS can roll over from the legacy vocabulary (breaking/non_breaking/partial_breaking/unknown) to v2 (model_wide/column/additive) independently over one release window.

  • Renames recce/util/breaking.pyrecce/util/change_classifier.py; recce/util/breaking.py remains a deprecated re-export shim (removed next release).
  • Adds CHANGE_CATEGORY_ALIASES + normalize_change_category() (accepts either vocab, returns canonical legacy value) + to_v2_change_category().
  • Defensive parsing: Pydantic field_validator(mode="before") on NodeChange.category and CllNode.change_category normalize v2 input to legacy (forward-compat).
  • Opt-in Accept-Vocabulary: v2 output on /api/info and /api/cll (default path unchanged).
  • CI lint (scripts/check_wire_enum_literals.sh + .github/workflows/wire-enum-lint.yaml) fails PRs adding NEW legacy wire-enum literals under js/packages/ui/src/; allowlist via inline marker // wire-enum-ok.

Note on issue AC#1: the parse_change_category("breaking") example is pseudo-code — the real function diffs SQL. The intent is realized by the string normalize_change_category(); the new test asserts the four equivalences against it.

Which issue(s) this PR fixes

Resolves DRC-3553 (OSS portion). Cloud-side (recce-cloud-infra) is a separate PR.

Special notes for your reviewer

  • Wire/enum values intentionally stay legacy this release; only aliasing added.
  • Alias maps live in recce/models/types.py (not change_classifier.py) to avoid a circular import; re-exported from change_classifier.
  • The rename shows as create+modify rather than a single git rename because the old path is reoccupied by the shim. Net new logic ≈ 250 LOC.
  • Tests: 186 passed (tests/util/test_breaking.py, tests/adapter/dbt_adapter/test_dbt_cll.py, tests/test_merged_lineage.py, tests/util/test_cll_cache.py).

Does this PR introduce a user-facing change?

Opt-in Accept-Vocabulary: v2 response header and a dual-vocabulary aliasing window for change categories. The legacy vocabulary (breaking/non_breaking/partial_breaking) is DEPRECATED and will be REMOVED next release; migrate to v2 (model_wide/column/additive). Both vocabularies are accepted on input during the window. recce.util.breaking is deprecated; import from recce.util.change_classifier.

…ategory aliasing

M4 (OSS half) of the change-category vocabulary migration. Opens the
dual-vocabulary aliasing window: wire enums stay legacy
(breaking/non_breaking/partial_breaking/unknown); the v2 vocabulary
(model_wide/column/additive) is accepted on input and emitted opt-in via the
`Accept-Vocabulary: v2` response header.

- Rename recce/util/breaking.py -> recce/util/change_classifier.py;
  breaking.py kept as a deprecated re-export shim (removed next release).
- Add CHANGE_CATEGORY_ALIASES + normalize_change_category() +
  to_v2_change_category() (in models/types.py to avoid a circular import,
  re-exported from change_classifier).
- Defensive Pydantic field validators normalize v2 input to legacy.
- Opt-in Accept-Vocabulary: v2 output on /api/info and /api/cll.
- CI lint (scripts/check_wire_enum_literals.sh + workflow) blocks NEW legacy
  wire-enum literals under js/packages/ui/src/; allowlist marker // wire-enum-ok.

Resolves DRC-3553 (OSS portion). Cloud-side is a separate repo/PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: even-wei <evenwei@infuseai.io>
Comment thread .github/workflows/wire-enum-lint.yaml Fixed
@codecov

codecov Bot commented Jun 9, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 93.24009% with 29 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
recce/util/change_classifier.py 91.28% 25 Missing ⚠️
recce/models/types.py 91.66% 2 Missing ⚠️
recce/server.py 95.23% 1 Missing ⚠️
tests/test_server.py 97.87% 1 Missing ⚠️
Files with missing lines Coverage Δ
recce/adapter/dbt_adapter/__init__.py 79.16% <100.00%> (ø)
recce/util/breaking.py 100.00% <100.00%> (+9.32%) ⬆️
tests/util/test_breaking.py 98.35% <100.00%> (+0.20%) ⬆️
recce/server.py 72.01% <95.23%> (+1.31%) ⬆️
tests/test_server.py 98.78% <97.87%> (-0.22%) ⬇️
recce/models/types.py 97.43% <91.66%> (-0.82%) ⬇️
recce/util/change_classifier.py 91.28% <91.28%> (ø)

... and 3 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

even-wei and others added 2 commits June 9, 2026 18:28
Add an explicit top-level permissions block scoped to contents:read on the
new Wire-Enum Lint workflow. The job only checks out the repo and runs a
read-only diff script, so the default broad GITHUB_TOKEN scope is unneeded.

Resolves the CodeQL "Workflow does not contain permissions" alert flagged on
this PR and matches the repo convention (tests-python.yaml,
address-dependabot.yaml).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: even-wei <evenwei@infuseai.io>
@even-wei even-wei requested a review from gcko June 12, 2026 01:17
@gcko

gcko commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Code Review: PR #1424

SHA ab55c079 · Verdict NO-GO

The breaking.pychange_classifier.py rename is verbatim (diff confirms identical body) and the shim re-exports cleanly — no circular import (verified by import), 58/58 tests pass, flake8 clean. The aliasing core (normalize_change_category, to_v2_change_category, the two model validators) is correct. The findings below are on the net-new v2 output layer.

Issues

  1. recce/server.py:697 — The Accept-Vocabulary: v2 path for /api/cll serializes with model_dump(exclude_none=True), but the default path (return CllOutput(current=cll), line 701, response_model=CllOutput with no response_model_exclude_none) includes null fields.
    Evidence: line 537 sets response_model_exclude_none=True explicitly on another route, proving /api/cll's default keeps nulls; the v2 branch passes exclude_none=True. So opting into v2 silently drops every null field (source_name, change_status, impacted, null change_category, and all CllColumn nulls) — a response-shape change beyond the documented category remap. The inline comment claims "only ... remap change_category to v2." /api/info avoids this by using exclude_none=True in both modes (line 637); /api/cll does not.
    Pass C.

  2. recce/server.py:577-606, recce/util/change_classifier.py:590-620 — The v2 output paths have zero test coverage: wants_v2_vocabulary, translate_merged_lineage_to_v2, _maybe_v2_lineage, _maybe_v2_cll, and both endpoint v2 branches.
    Evidence: the added tests in test_breaking.py exercise only normalize_change_category / to_v2_change_category / the model validators / the shim — never an endpoint or the translate helpers. One endpoint test asserting the v2 response shape would have caught Issue 1.
    Pass E.

Notes

  1. tests/util/test_breaking.pytest_parse_change_category_accepts_both_vocabularies never calls parse_change_category; every assertion is on normalize_change_category. The class docstring acknowledges this, but the test name still asserts coverage that does not exist.
    Pass E.

  2. recce/models/types.py:173normalize_change_category's docstring promises unrecognized labels pass through "forward-compat: don't hard-fail on a label a future vocabulary introduces." But as a mode="before" validator feeding the ChangeCategory Literal, an unrecognized label is still rejected by Literal validation (NodeChange(category="some_future_label") raises). The passthrough only benefits non-model callers; the model boundary is not forward-compatible.
    Pass A.


Resolve Issues 1–2 (or accept Issue 2 as a tracked follow-up) for GO.

@gcko gcko left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review: NO-GO — 2 issues (v2 /api/cll exclude_none divergence; untested v2 output paths). See review comment.

…r v2 output paths

Address PR #1424 review feedback (gcko):

- /api/cll v2 branch no longer passes exclude_none=True to model_dump();
  the manual serialization now matches the default response_model shape,
  so the ONLY difference between modes is the change_category labels
- add endpoint tests for /api/info and /api/cll v2 branches, including
  shape-parity assertions, plus unit tests for wants_v2_vocabulary and
  translate_merged_lineage_to_v2
- rename test_parse_change_category_accepts_both_vocabularies ->
  test_normalize_change_category_accepts_both_vocabularies (the test
  exercises normalize_change_category, not parse_change_category)
- scope normalize_change_category docstring: the unrecognized-label
  passthrough benefits direct callers only; ChangeCategory Literal
  fields still reject unknown labels at model boundaries

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: even-wei <evenwei@infuseai.io>
@even-wei

Copy link
Copy Markdown
Contributor Author

@gcko Thanks for the review — all four findings addressed in 474131a9.

Issue 1 — v2 /api/cll exclude_none divergence (recce/server.py:697): Fixed. The v2 branch now serializes with plain model_dump() (no exclude_none), matching the default response_model=CllOutput shape exactly — the Cll models carry no field aliases, so the only remaining difference between the two modes is the change_category labels. The inline comment now states this invariant explicitly.

Issue 2 — untested v2 output paths: Fixed. Added:

  • tests/test_server.py::TestV2VocabularyEndpoints — endpoint tests for both v2 branches: /api/info (test_api_info_lineage_respects_accept_vocabulary_header, exercising _maybe_v2_lineage) and /api/cll (test_api_cll_v2_remaps_categories_and_preserves_shape, exercising _maybe_v2_cll). The /api/cll test asserts key-set parity between legacy and v2 responses on nodes with null-valued optional fields — exactly the assertion that would have caught Issue 1, and it now pins the fix.
  • tests/util/test_breaking.py — unit tests for wants_v2_vocabulary (absent/None/empty/case-and-whitespace-tolerant matching) and translate_merged_lineage_to_v2 (in-place remap, change_status untouched, non-dict passthrough).

Note 1 — misleading test name: Fixed. Renamed to test_normalize_change_category_accepts_both_vocabularies; the class docstring explaining the AC#1 intent stays.

Note 2 — forward-compat docstring overpromise (recce/models/types.py:173): Fixed (doc-level). The docstring now scopes the passthrough to direct callers and explicitly states that ChangeCategory Literal fields (e.g. NodeChange.category) still reject unrecognized labels after normalization. Loosening the Literal itself is intentionally out of scope for the aliasing window — the wire contract stays legacy-strict until the v2 cutover.

Verification: make format / make flake8 clean; tests/util/test_breaking.py + tests/test_server.py::TestV2VocabularyEndpoints 62/62 pass; full make test — the only failures are the pre-existing local-env ones (SPA-route tests needing a built frontend, rich-console width assertions in CLI tests), confirmed identical on the unmodified tree.

@gcko

gcko commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Code Review: PR #1424

SHA 474131a9 · Verdict GO · Incremental (since ab55c079)

All four prior findings resolved. Verified the new commit 474131a9:

  • Issue 1 (resolved)server.py:700 now uses model_dump() (no exclude_none), matching the default response_model=CllOutput path. CLL models carry no aliases, so by_alias divergence is moot. The new test_api_cll_v2_remaps_categories_and_preserves_shape asserts per-node key parity between legacy and v2 (including null fields) and passes.
  • Issue 2 (resolved)tests/test_server.py::TestV2VocabularyEndpoints covers both /api/info and /api/cll v2 paths end-to-end; test_breaking.py adds test_wants_v2_vocabulary_header_detection and test_translate_merged_lineage_to_v2 (incl. non-dict / missing-change defensive cases).
  • Note 1 (resolved) — test renamed to test_normalize_change_category_accepts_both_vocabularies, matching what it asserts.
  • Note 2 (resolved)normalize_change_category docstring now states the passthrough does not make Pydantic Literal boundaries forward-compatible.

Verification: 62 tests pass (test_server.py::TestV2VocabularyEndpoints + full test_breaking.py), flake8 clean.

@gcko gcko left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review: GO (incremental). All prior findings resolved. See review comment.

@gcko gcko merged commit 06f7ed6 into main Jun 12, 2026
22 checks passed
@gcko gcko deleted the feature/drc-3553-m4-backend-parser-api-alias-window-oss-cloud-python branch June 12, 2026 01:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants