Skip to content

fix(sec): persist orphan redemptions + record runs + idempotent backfill#168

Open
sroussey wants to merge 3 commits into
mainfrom
claude/wonderful-hypatia-jx5g36
Open

fix(sec): persist orphan redemptions + record runs + idempotent backfill#168
sroussey wants to merge 3 commits into
mainfrom
claude/wonderful-hypatia-jx5g36

Conversation

@sroussey

Copy link
Copy Markdown
Contributor

Summary

Two related fixes to the redemption AI-extractor that ships in #166:

  1. Persist orphan redemptions. processRedemption8K used to early-return and write nothing when the SPAC had no spac_deal row yet. For SPACs whose 5.07 / 2.01 / 8.01 vote-results 8-K is ingested before any 1.01 definitive-agreement 8-K (out-of-order ingestion), the redemption was lost permanently. Delete the guard; deriveDeals already correlates the full extraction set on every recompute.
  2. Operability. Wire extractor_runs recording around the redemption extractor's success/failure outcomes, and make BackfillRedemptionsTask idempotent + fast.

Findings addressed

  • C-1 — Redemption sub-pass never wrote to extractor_runs, so sec version coverage extractor redemption always read 0 and sec version drop-previous extractor redemption was a no-op. Now records success/failure for every well-defined outcome.
  • H-1 — Backfill task re-ran every known-SPAC trigger 8-K on every invocation. Left-anti-join against extractor_runs (hasSuccessfulRun(...)) skips already-extracted filings; --force opts back in.
  • H-2 — Per-SPAC (form, cik) query loop replaced with two bulk filingRepo.query({ form }) calls + in-memory CIK-set filter, matching the existing UpdateAllFormsTask pattern.
  • H-3 — Progress log every 100 processed; honors context.signal so an aborted sweep resumes naturally on the next invocation.
  • H-4 — 5.07-first ingestion lost redemption rows. Guard removed; orphan extractions now persist and self-correlate when a later 1.01 mints the deal.

Commits

  1. fix(forms/8-K): persist redemption extraction even when SPAC has no deal yet — Plan B (H-4). Deletes the deals.length === 0 guard in processRedemption8K, updates the JSDoc on SpacReportWriter.recordRedemption, inverts the "no deal yields no extraction" tests into orphan-persistence + later-deal-correlation tests, and updates the e2e test.
  2. fix(sec): record extractor_runs for redemption + idempotent, fast backfill — Plan A (C-1 / H-1 / H-2 / H-3). Wires recordRun around processRedemption8K's PARSE_ERROR catch and the section runner. Rewrites BackfillRedemptionsTask with a force input, a bulk-query candidate selector (selectRedemptionBackfillCandidates), the hasSuccessfulRun left-anti-join, abort-signal handling, and a progress log. Adds --force / --dry-run flags to sec spac backfill-redemptions.

Test plan

  • bun test src/sec/forms/miscellaneous-filings/ src/storage/spac/ (Plan B scope)
  • bun test src/task/spac/ src/sec/forms/miscellaneous-filings/ src/task/forms/ src/storage/versioning/ src/storage/spac/ (Plan A scope)
  • bun run build (clean both commits)

All 212 affected tests pass; build clean.

Migration

No schema change, no extractor version bump. Run sec spac backfill-redemptions once post-deploy to populate extractor_runs for filings already extracted at the current version and to re-extract orphan redemptions that were dropped by the (now-deleted) deals.length === 0 guard. The sweep is idempotent and resumes naturally after an interruption.

🤖 Generated with Claude Code


Generated by Claude Code

claude added 2 commits June 24, 2026 08:38
…eal yet

H-4: processRedemption8K previously early-returned and wrote nothing when
spacRepo.getDeals(cik) was empty. For SPACs where a 5.07 / 2.01 / 8.01
vote-results 8-K is ingested before any 1.01 definitive-agreement 8-K
("5.07-first ingestion"), the redemption was lost permanently — no row,
no dead-letter, no correlation when the missing 1.01 later landed.

Delete the deals.length === 0 guard. deriveDeals already reads the full
redemption-extraction set on every recompute and partitions the timeline
into deal windows (including the spacDealGrouping.redemption.test.ts:87-97
"completed-only deal" case), so an orphan extraction persisted here is
automatically correlated by any future write that mints a deal. No schema
change, no version bump.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01LfEFT4C5ayZkU7157sNwTg
…kfill

C-1: processRedemption8K never wrote to extractor_runs, so the major-bump
coverage gate (`sec version coverage extractor redemption`) always read 0
and `sec version drop-previous extractor redemption` was a no-op. Wire
recordRun around both the well-defined PARSE_ERROR catch and the section
runner — success on return, failure on throw + rethrow. Trigger-item and
SPAC gates remain unrecorded (defensive dead branches in production).

H-1 / H-2: BackfillRedemptionsTask re-ran every known-SPAC trigger 8-K on
every invocation. Add a left-anti-join against extractor_runs via
hasSuccessfulRun(cik, accession, "redemption", activeVersion) so the
sweep is idempotent; a new --force flag opts back into reprocessing.
Replace the per-SPAC (form, cik) loop with two bulk filingRepo.query({
form }) calls + an in-memory CIK-set filter, matching the existing
UpdateAllFormsTask pattern.

H-3: emit a progress log every 100 processed and honor context.signal so
a long sweep can be aborted cleanly; the next invocation resumes
naturally since already-extracted filings are skipped.

CLI: `sec spac backfill-redemptions [--force] [--dry-run]` now reports
selected / processed / skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01LfEFT4C5ayZkU7157sNwTg

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes and operational improvements for the SPAC redemption 8‑K AI extractor introduced in #166, ensuring redemptions are not dropped during out-of-order ingestion and making historical backfills efficient and resumable.

Changes:

  • Persist “orphan” redemption extractions even when no spac_deal exists yet, relying on later deriveDeals recomputes to correlate them.
  • Record extractor_runs for redemption extraction outcomes and use these runs to make backfill-redemptions idempotent (with --force / --dry-run).
  • Optimize backfill candidate selection via bulk filing queries and add progress + abort-signal handling.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/task/spac/BackfillRedemptionsTask.ts Adds candidate selection, --force/idempotent skipping via extractor_runs, progress logging, and cancellation support.
src/task/spac/BackfillRedemptionsTask.test.ts Expands tests for candidate selection and backfill idempotency/force behavior.
src/task/forms/ProcessAccessionDocFormTask.redemption.test.ts Updates redemption escalation tests to register a fake model/provider; adds extractor_runs recording coverage.
src/storage/spac/SpacReportWriter.ts Updates docstring to clarify orphan redemption persistence and later correlation behavior.
src/sec/forms/miscellaneous-filings/redemption8k.ts Removes “no deal” guard; adds redemption extractor_runs recording and PARSE_ERROR handling integration.
src/sec/forms/miscellaneous-filings/redemption8k.test.ts Reworks tests to validate orphan persistence, later correlation, idempotency, and extractor_runs recording.
src/sec/forms/miscellaneous-filings/redemption8k.e2e.test.ts Updates e2e expectations to reflect orphan persistence and later rollup after deal minting.
src/commands/spac.ts Adds --force and --dry-run flags; logs processed/skipped counts.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 79 to +103
const deadLetters = new ExtractionDeadLetterRepo();
const runRepo = new ExtractorRunRepo(globalServiceRegistry.get(EXTRACTOR_RUN_REPOSITORY_TOKEN));
const model = args.model ?? (await getRedemptionModel());
const model_id = resolveModelId(model);

const recordRedemptionRun = async (success: boolean, error: string | null): Promise<void> => {
try {
await runRepo.recordRun({
cik,
accession_number,
form,
extractor_id: EXTRACTOR_ID,
extractor_version,
slot_at_run,
success,
error: error === null ? null : error.slice(0, 4096),
});
} catch (recordErr) {
console.error(
`Failed to record extractor_runs row for ${cik}/${accession_number}@${EXTRACTOR_ID}:${extractor_version}:`,
recordErr
);
}
};

Comment on lines +121 to +137
const extractorVersion = activeSlot?.semver ?? DEFAULT_REDEMPTION_VERSION;

const signal = (context as unknown as { signal?: AbortSignal }).signal;

// Isolate per-filing failures: one bad 8-K (fetch error, malformed body)
// must not abort the sweep over the remaining accessions.
// must not abort the sweep over the remaining candidates.
let processed = 0;
for (const accessionNumber of accessions) {
let skipped = 0;
for (const c of candidates) {
if (signal?.aborted) break;
if (!input.force) {
const already = await runRepo.hasSuccessfulRun(
c.cik,
c.accession_number,
REDEMPTION_EXTRACTOR_ID,
extractorVersion
);
…ndidates

Two Copilot-review findings on #168:

- getRedemptionModel() throw used to bypass the surrounding try/catch and
  abort the whole 8-K processing, leaving no extractor_runs row. Move the
  recordRedemptionRun helper above the model-resolution call and wrap that
  call in a try/catch that dead-letters with reason_code MODEL_RESOLUTION_ERROR,
  records a failed run, and returns cleanly — mirroring the PARSE_ERROR path.
- BackfillRedemptionsTask's skip predicate ran one hasSuccessfulRun query per
  candidate AND used exact-semver matching, so a patch-only version bump would
  reprocess every previously-extracted filing. Switch to a single
  listFilingsWithoutSuccessfulRun call up front, which both narrows to one
  storage query and applies the codebase's standard major.minor.* gating.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants