fix(sec): persist orphan redemptions + record runs + idempotent backfill#168
Open
sroussey wants to merge 3 commits into
Open
fix(sec): persist orphan redemptions + record runs + idempotent backfill#168sroussey wants to merge 3 commits into
sroussey wants to merge 3 commits into
Conversation
…eal yet
H-4: processRedemption8K previously early-returned and wrote nothing when
spacRepo.getDeals(cik) was empty. For SPACs where a 5.07 / 2.01 / 8.01
vote-results 8-K is ingested before any 1.01 definitive-agreement 8-K
("5.07-first ingestion"), the redemption was lost permanently — no row,
no dead-letter, no correlation when the missing 1.01 later landed.
Delete the deals.length === 0 guard. deriveDeals already reads the full
redemption-extraction set on every recompute and partitions the timeline
into deal windows (including the spacDealGrouping.redemption.test.ts:87-97
"completed-only deal" case), so an orphan extraction persisted here is
automatically correlated by any future write that mints a deal. No schema
change, no version bump.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01LfEFT4C5ayZkU7157sNwTg
…kfill
C-1: processRedemption8K never wrote to extractor_runs, so the major-bump
coverage gate (`sec version coverage extractor redemption`) always read 0
and `sec version drop-previous extractor redemption` was a no-op. Wire
recordRun around both the well-defined PARSE_ERROR catch and the section
runner — success on return, failure on throw + rethrow. Trigger-item and
SPAC gates remain unrecorded (defensive dead branches in production).
H-1 / H-2: BackfillRedemptionsTask re-ran every known-SPAC trigger 8-K on
every invocation. Add a left-anti-join against extractor_runs via
hasSuccessfulRun(cik, accession, "redemption", activeVersion) so the
sweep is idempotent; a new --force flag opts back into reprocessing.
Replace the per-SPAC (form, cik) loop with two bulk filingRepo.query({
form }) calls + an in-memory CIK-set filter, matching the existing
UpdateAllFormsTask pattern.
H-3: emit a progress log every 100 processed and honor context.signal so
a long sweep can be aborted cleanly; the next invocation resumes
naturally since already-extracted filings are skipped.
CLI: `sec spac backfill-redemptions [--force] [--dry-run]` now reports
selected / processed / skipped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01LfEFT4C5ayZkU7157sNwTg
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes and operational improvements for the SPAC redemption 8‑K AI extractor introduced in #166, ensuring redemptions are not dropped during out-of-order ingestion and making historical backfills efficient and resumable.
Changes:
- Persist “orphan” redemption extractions even when no
spac_dealexists yet, relying on laterderiveDealsrecomputes to correlate them. - Record
extractor_runsfor redemption extraction outcomes and use these runs to makebackfill-redemptionsidempotent (with--force/--dry-run). - Optimize backfill candidate selection via bulk filing queries and add progress + abort-signal handling.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/task/spac/BackfillRedemptionsTask.ts | Adds candidate selection, --force/idempotent skipping via extractor_runs, progress logging, and cancellation support. |
| src/task/spac/BackfillRedemptionsTask.test.ts | Expands tests for candidate selection and backfill idempotency/force behavior. |
| src/task/forms/ProcessAccessionDocFormTask.redemption.test.ts | Updates redemption escalation tests to register a fake model/provider; adds extractor_runs recording coverage. |
| src/storage/spac/SpacReportWriter.ts | Updates docstring to clarify orphan redemption persistence and later correlation behavior. |
| src/sec/forms/miscellaneous-filings/redemption8k.ts | Removes “no deal” guard; adds redemption extractor_runs recording and PARSE_ERROR handling integration. |
| src/sec/forms/miscellaneous-filings/redemption8k.test.ts | Reworks tests to validate orphan persistence, later correlation, idempotency, and extractor_runs recording. |
| src/sec/forms/miscellaneous-filings/redemption8k.e2e.test.ts | Updates e2e expectations to reflect orphan persistence and later rollup after deal minting. |
| src/commands/spac.ts | Adds --force and --dry-run flags; logs processed/skipped counts. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
79
to
+103
| const deadLetters = new ExtractionDeadLetterRepo(); | ||
| const runRepo = new ExtractorRunRepo(globalServiceRegistry.get(EXTRACTOR_RUN_REPOSITORY_TOKEN)); | ||
| const model = args.model ?? (await getRedemptionModel()); | ||
| const model_id = resolveModelId(model); | ||
|
|
||
| const recordRedemptionRun = async (success: boolean, error: string | null): Promise<void> => { | ||
| try { | ||
| await runRepo.recordRun({ | ||
| cik, | ||
| accession_number, | ||
| form, | ||
| extractor_id: EXTRACTOR_ID, | ||
| extractor_version, | ||
| slot_at_run, | ||
| success, | ||
| error: error === null ? null : error.slice(0, 4096), | ||
| }); | ||
| } catch (recordErr) { | ||
| console.error( | ||
| `Failed to record extractor_runs row for ${cik}/${accession_number}@${EXTRACTOR_ID}:${extractor_version}:`, | ||
| recordErr | ||
| ); | ||
| } | ||
| }; | ||
|
|
Comment on lines
+121
to
+137
| const extractorVersion = activeSlot?.semver ?? DEFAULT_REDEMPTION_VERSION; | ||
|
|
||
| const signal = (context as unknown as { signal?: AbortSignal }).signal; | ||
|
|
||
| // Isolate per-filing failures: one bad 8-K (fetch error, malformed body) | ||
| // must not abort the sweep over the remaining accessions. | ||
| // must not abort the sweep over the remaining candidates. | ||
| let processed = 0; | ||
| for (const accessionNumber of accessions) { | ||
| let skipped = 0; | ||
| for (const c of candidates) { | ||
| if (signal?.aborted) break; | ||
| if (!input.force) { | ||
| const already = await runRepo.hasSuccessfulRun( | ||
| c.cik, | ||
| c.accession_number, | ||
| REDEMPTION_EXTRACTOR_ID, | ||
| extractorVersion | ||
| ); |
…ndidates Two Copilot-review findings on #168: - getRedemptionModel() throw used to bypass the surrounding try/catch and abort the whole 8-K processing, leaving no extractor_runs row. Move the recordRedemptionRun helper above the model-resolution call and wrap that call in a try/catch that dead-letters with reason_code MODEL_RESOLUTION_ERROR, records a failed run, and returns cleanly — mirroring the PARSE_ERROR path. - BackfillRedemptionsTask's skip predicate ran one hasSuccessfulRun query per candidate AND used exact-semver matching, so a patch-only version bump would reprocess every previously-extracted filing. Switch to a single listFilingsWithoutSuccessfulRun call up front, which both narrows to one storage query and applies the codebase's standard major.minor.* gating.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two related fixes to the redemption AI-extractor that ships in #166:
processRedemption8Kused to early-return and write nothing when the SPAC had nospac_dealrow yet. For SPACs whose 5.07 / 2.01 / 8.01 vote-results 8-K is ingested before any 1.01 definitive-agreement 8-K (out-of-order ingestion), the redemption was lost permanently. Delete the guard;deriveDealsalready correlates the full extraction set on every recompute.extractor_runsrecording around the redemption extractor's success/failure outcomes, and makeBackfillRedemptionsTaskidempotent + fast.Findings addressed
extractor_runs, sosec version coverage extractor redemptionalways read 0 andsec version drop-previous extractor redemptionwas a no-op. Now records success/failure for every well-defined outcome.extractor_runs(hasSuccessfulRun(...)) skips already-extracted filings;--forceopts back in.(form, cik)query loop replaced with two bulkfilingRepo.query({ form })calls + in-memory CIK-set filter, matching the existingUpdateAllFormsTaskpattern.context.signalso an aborted sweep resumes naturally on the next invocation.Commits
fix(forms/8-K): persist redemption extraction even when SPAC has no deal yet— Plan B (H-4). Deletes thedeals.length === 0guard inprocessRedemption8K, updates the JSDoc onSpacReportWriter.recordRedemption, inverts the "no deal yields no extraction" tests into orphan-persistence + later-deal-correlation tests, and updates the e2e test.fix(sec): record extractor_runs for redemption + idempotent, fast backfill— Plan A (C-1 / H-1 / H-2 / H-3). WiresrecordRunaroundprocessRedemption8K's PARSE_ERROR catch and the section runner. RewritesBackfillRedemptionsTaskwith aforceinput, a bulk-query candidate selector (selectRedemptionBackfillCandidates), thehasSuccessfulRunleft-anti-join, abort-signal handling, and a progress log. Adds--force/--dry-runflags tosec spac backfill-redemptions.Test plan
bun test src/sec/forms/miscellaneous-filings/ src/storage/spac/(Plan B scope)bun test src/task/spac/ src/sec/forms/miscellaneous-filings/ src/task/forms/ src/storage/versioning/ src/storage/spac/(Plan A scope)bun run build(clean both commits)All 212 affected tests pass; build clean.
Migration
No schema change, no extractor version bump. Run
sec spac backfill-redemptionsonce post-deploy to populateextractor_runsfor filings already extracted at the current version and to re-extract orphan redemptions that were dropped by the (now-deleted)deals.length === 0guard. The sweep is idempotent and resumes naturally after an interruption.🤖 Generated with Claude Code
Generated by Claude Code