Skip to content

refactor(scorer): score batches over typed change details#196

Merged
behinddwalls merged 3 commits into
mainfrom
scorer
Jun 5, 2026
Merged

refactor(scorer): score batches over typed change details#196
behinddwalls merged 3 commits into
mainfrom
scorer

Conversation

@behinddwalls
Copy link
Copy Markdown
Collaborator

@behinddwalls behinddwalls commented Jun 4, 2026

Summary

Why?

The scorer took entity.Change (just URIs), so it could not score on real
change size — the example heuristic counted URIs as a placeholder. With
typed change details now persisted on change records, the scorer can score a
batch on its actual lines/files changed.

What?

  • Add entity.BatchChanges — the normalized, batch-level view of all changes
    in a batch (BatchID, Queue, []ChangeInfo) with aggregation helpers.
  • Scorer.Score now takes entity.BatchChanges; the heuristic ValueFunc and the
    composite scorer operate over it.
  • The score controller resolves each request's change records, flattens their
    details into BatchChanges, and scores the batch once — replacing the
    per-request multiplicative product over len(URIs).
  • Example wiring buckets by total lines changed.

Consumes the typed details persisted by the change-details change.

Test Plan

  • make build, make test, make lint, make check-mocks/gazelle/tidy
  • make integration-test, make e2e-test (start -> validate enrich ->
    score normalizes the batch and scores on real change size)

Issues

Stack

  1. refactor(change): persist typed change details from the change provider #195
  2. @ refactor(scorer): score batches over typed change details #196
  3. feat(extensions): fake implementations with error injection #197
  4. feat(example): per-queue extension wiring; retire buildrunner/noop #193
  5. refactor(conflict): take a batch of changes as analyzer input #202

@behinddwalls behinddwalls marked this pull request as ready for review June 4, 2026 19:17
@behinddwalls behinddwalls requested review from a team and sbalabanov as code owners June 4, 2026 19:17
@behinddwalls behinddwalls marked this pull request as draft June 4, 2026 19:22
@behinddwalls behinddwalls marked this pull request as ready for review June 4, 2026 19:24
behinddwalls added a commit that referenced this pull request Jun 5, 2026
## Summary
### Why?

Two "change"-related stores were in the wrong shape. `ChangeStore` — the
real, used store that records per-URI claims for in-flight requests and
backs `start`'s URI claiming and `validate`'s overlap detection — lived
as its own top-level extension and was injected into controllers as a
separate dependency, bypassing the `storage.Storage` aggregator that
owns
every other entity store. Meanwhile `ChangeProviderStore` (exposed via
`Storage.GetChangeProviderStore()`, with a mysql impl, mock, and schema)
was dead code: no controller ever called it, and its
`entity.ChangeProvider`
was orphaned alongside it.

### What?

- Move `ChangeStore` into `package storage`: the interface, mysql impl,
and `change.sql` schema now live under `extension/storage[/mysql]`, and
  `ChangeStore` is a first-class member of the `Storage` aggregator via
  `GetChangeStore()` — matching every other store.
- `start` and `validate` controllers drop their separate `changeStore`
  constructor param and read `store.GetChangeStore()`; the example
  orchestrator no longer constructs/injects it separately.
- Delete the dead `ChangeProviderStore` (interface, mysql, mock,
schema),
`GetChangeProviderStore()`, and the orphaned
`entity/change_provider.go`.
- Fold the standalone changestore integration suite into the shared
`StorageContractSuite` (driven through `GetChangeStore()`); e2e drops
the
  now-redundant changestore schema apply.

## Test Plan
- ✅ `make build`, `make test` (start/validate controllers pass against
the
  storage-package mock)
- ✅ `make lint`, `make check-gazelle`, `make check-mocks`, `make
check-tidy`
  (no drift; `go.mod` / `MODULE.bazel` unchanged)
- ✅ `make integration-test` (storage mysql suite now exercises the
change
  store via `GetChangeStore()`)
- ✅ `make e2e-test` (full land→validate flow: URI claim + overlap
detection,
  `change` table applied from the storage schema dir)

## Issues


## Stack
1. @ #191
1. #195
1. #196
1. #197
## Summary

### Why?

The change provider already produces rich per-URI facts (author, changed
files, line counts), but its value types lived in the extension layer and
the data was thrown away — validate fetched ChangeInfo only to log a file
count, and ChangeRecord stored an opaque Metadata JSON string that was never
written. Nothing downstream could read typed change facts.

### What?

- Move the change value types into entities: entity.User, entity.ChangedFile
  (now with LinesModified), entity.ChangeDetails (the facts), and
  entity.ChangeInfo (URI -> Details), with aggregation helpers. The
  changeprovider extension and GitHub impl now produce these.
- Replace ChangeRecord.Metadata (opaque string) with typed Details
  (ChangeDetails); the change table's metadata JSON column becomes details.
- Add ChangeStore.UpdateDetails — a version-guarded conditional write,
  following the optimistic-locking contract (arithmetic in the controller).
- validate now persists each fetched ChangeInfo onto the request's change
  records (per-URI, idempotent; ErrVersionMismatch is a benign no-op).

This is the producer half: typed details now exist and are persisted. The
score controller consumes them in a follow-up.

## Test Plan

- ✅ `make build`, `make test`, `make lint`, `make check-mocks/gazelle/tidy`
- ✅ `make integration-test` (storage contract suite round-trips Details and
  covers UpdateDetails create/update/version-mismatch)
## Summary

### Why?

The scorer took entity.Change (just URIs), so it could not score on real
change size — the example heuristic counted URIs as a placeholder. With
typed change details now persisted on change records, the scorer can score a
batch on its actual lines/files changed.

### What?

- Add entity.BatchChanges — the normalized, batch-level view of all changes
  in a batch (BatchID, Queue, []ChangeInfo) with aggregation helpers.
- Scorer.Score now takes entity.BatchChanges; the heuristic ValueFunc and the
  composite scorer operate over it.
- The score controller resolves each request's change records, flattens their
  details into BatchChanges, and scores the batch once — replacing the
  per-request multiplicative product over len(URIs).
- Example wiring buckets by total lines changed.

Consumes the typed details persisted by the change-details change.

## Test Plan

- ✅ `make build`, `make test`, `make lint`, `make check-mocks/gazelle/tidy`
- ✅ `make integration-test`, `make e2e-test` (start -> validate enrich ->
  score normalizes the batch and scores on real change size)
Base automatically changed from change-details to main June 5, 2026 05:48
@behinddwalls behinddwalls enabled auto-merge June 5, 2026 05:51
@behinddwalls behinddwalls disabled auto-merge June 5, 2026 05:51
@behinddwalls behinddwalls merged commit 9adffc9 into main Jun 5, 2026
15 checks passed
@behinddwalls behinddwalls deleted the scorer branch June 5, 2026 05:51
@behinddwalls behinddwalls deployed to stack-rebase June 5, 2026 05:51 — with GitHub Actions Active
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants