lisa/fix-prompt-to-use-fewer-labels#422
Conversation
gusfcarvalho
left a comment
There was a problem hiding this comment.
Findings inline. Build + unit tests pass locally, gofmt clean, CI green (unit/lint/check-diff). The proposed-filter-subset design is sound and well-tested on the main path; two issues to address before merge.
Inline:
- Medium —
dashboard_suggestionsfilter-labels unique index is dropped + recreated (non-CONCURRENTLY) on every boot, unlike its idempotent sibling. Exclusive lock + uniqueness gap each startup. (migrator.go + test-migrator mirror.) - Medium — empty-model
proposed_filter_labelsfallback degenerates to{_policy}only, never reaching the documented{_policy, provider, type}default; uncovered by tests. (+ a minor double-counted metric.)
Out-of-scope observation (Low, no change requested): the pre-existing idx_dashboard_suggestions_unique_pending (keyed on label_set_hash) is untouched and now coexists with the new subset-keyed index. Two pending suggestions for the same control + same evidence label-set but different proposed subsets would be silently rejected by the old index (OnConflict DoNothing → counted Excluded), dropping the lower-confidence one. Rare since evidence→subset is deterministic, but worth a conscious decision on whether the old index should stay.
gusfcarvalho
left a comment
There was a problem hiding this comment.
All pass-1 findings addressed in b0ffab5 and verified:
- Medium (idempotent index):
DROPremoved; bothinternal/service/migrator.goandinternal/tests/migrate.gonow useCREATE UNIQUE INDEX IF NOT EXISTS— no more every-boot rebuild / uniqueness gap. - Medium (fallback subset):
validateProposedFilterLabelsnow seeds the empty-model case fromdefaultFilterLabelSubset(evidence)before re-injecting_policy, so it yields{_policy, provider, type}instead of{_policy}only; the double-countedfallback_filter_labelsmetric is gone. New testTestValidateMappingsEmptyFilterLabelsUseDefaultSubsetlocks in both behaviors.
Local go build ./..., package unit tests, and gofmt clean; CI check-diff/lint/unit-tests green (integration still running, as in round 1). The out-of-scope Low (old label_set_hash index coexisting with the new subset index) was a non-blocking observation. LGTM.
|
PR approved. Marking as ready for e2e. |
gusfcarvalho
left a comment
There was a problem hiding this comment.
Blocker found in runtime testing: the new proposed_filter_labels output schema uses a schema-valued additionalProperties (open-ended map), which Anthropic's strict structured-output mode rejects with a 400 — every cell fails, so the feature produces zero suggestions. Inline comment has the error and a concrete array-of-{key,value} fix. This needs to land before merge.
gusfcarvalho
left a comment
There was a problem hiding this comment.
Blocker fixed and verified in cf3f8db. proposed_filter_labels is now a strict array of {key, value} objects (additionalProperties: false, required: [key, value]), so Anthropic structured-output no longer 400s. RawMapping.ProposedFilterLabels is a ProposedFilterLabels named map with a custom UnmarshalJSON accepting both the pair-array and legacy object form, leaving validateProposedFilterLabels semantics untouched; prompt prose + golden updated. Good new tests: recursive strict-schema guard (every object has additionalProperties:false), array-vs-legacy equivalence, and duplicate-key-last-wins.
Verified: go build ./..., package unit tests, and gofmt clean; CI check-diff/lint/unit-tests green (integration running, as in prior rounds). All prior findings (two round-1 Mediums + this Blocker) addressed. LGTM.
|
PR approved. Marking as ready for e2e. |
automated implementation by lisa.