fix(sec): seal merger-proxy + redemption + transactional SPAC deal recompute #169
Open
sroussey wants to merge 2 commits into
…extractors

The merger-proxy and redemption extractors that landed via PR #166 missed the prompt-injection seal helpers introduced in PR #165. The seal (raw-byte verifyRowSpan at the gate, boundSourceSpan at persist) is now applied to both extractors, so an unbounded source_span can no longer ship through SpacMergerExtractionRepo / SpacRedemptionExtractionRepo via filer-controlled DEFM14A or post-vote 8-K narrative.

Also widen the fence defang to neutralize the </UNTRUSTED&Tab;FILER&Tab;DOCUMENT> family of bypasses: add whitespace named entities (Tab, NewLine, nbsp, ensp, emsp, thinsp, zwsp, zwnj, zwj) to NAMED_ENTITY_TABLE and collapse numeric whitespace entities (numeric character references that decode to whitespace) to a single space before the TAG_SHAPED scan. The per-call 64-bit nonce on the real fence remains the primary defense; this closes the layered defang gap.

No extractor version bumps: the prompt is unchanged for non-adversarial inputs, and the gate change is normalization-only.
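The entity-collapsing step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `wrapUntrusted` code: the function names, the code-point set, the shape of the `TAG_SHAPED` regex, and the replacement token are all assumptions, and every whitespace entity is mapped to a plain space for simplicity.

```typescript
// Illustrative sketch only: names and behavior here are assumptions, not the PR's code.

// Named entities the commit message lists as whitespace-like.
const WHITESPACE_ENTITY_NAMES = new Set([
  "Tab", "NewLine", "nbsp", "ensp", "emsp", "thinsp", "zwsp", "zwnj", "zwj",
]);

// Code points treated as whitespace when written as numeric references (assumed set).
const WHITESPACE_CODE_POINTS = new Set([
  0x09, 0x0a, 0x0d, 0x20, 0xa0, 0x2002, 0x2003, 0x2009, 0x200b, 0x200c, 0x200d,
]);

// Replace whitespace-like named and numeric entities with a single space,
// leaving every other entity (for example &amp;) untouched.
function collapseWhitespaceEntities(text: string): string {
  const withoutNamed = text.replace(
    /&([A-Za-z]+);/g,
    (match: string, name: string) => (WHITESPACE_ENTITY_NAMES.has(name) ? " " : match),
  );
  return withoutNamed.replace(
    /&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g,
    (match: string, hex: string | undefined, dec: string | undefined) => {
      const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec ?? "", 10);
      return WHITESPACE_CODE_POINTS.has(codePoint) ? " " : match;
    },
  );
}

// Hypothetical tag-shaped scan: any opening or closing fence-like token.
const TAG_SHAPED = /<\s*\/?\s*UNTRUSTED\s+FILER\s+DOCUMENT[^>]*>/gi;

// Normalize first, then neutralize anything that now looks like the fence.
function defangFence(text: string): string {
  return collapseWhitespaceEntities(text).replace(TAG_SHAPED, "[fence-like token removed]");
}
```

With this sketch, `defangFence("x </UNTRUSTED&Tab;FILER&#x9;DOCUMENT> y")` returns `"x [fence-like token removed] y"`, while `defangFence("AT&amp;T")` is returned unchanged. Running normalization before the scan is what lets one regex cover both the literal and the entity-encoded spellings.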
…ompute

Two SPAC correctness issues:

1. processMergerProxy never wrote to extractor_runs. The outer ProcessAccessionDocFormTask records a run for the form's extractor id (DEFM14A), but the merger-proxy nested extractor id was uncovered, so `sec version coverage extractor merger-proxy` always read zero and `drop-previous` was a no-op. Mirrors the redemption recordRun pattern from PR #168: success at the end, PARSE_ERROR in the segmenter catch, PROVIDER_ERROR around runSection.

2. SpacReportWriter.recomputeAndSaveDeals deleted orphan deal rows then wrote new deals in a non-atomic loop. A crash, AbortSignal, or DB error between the delete and the final saveDeal corrupted the SPAC report row. New SpacDealReplace helper wraps the delete+upsert pass in a real transaction: better-sqlite3 `db.transaction` for SQLite, BEGIN/COMMIT/ROLLBACK on a checked-out PG client. In-memory fallback retains the sequential semantics (no concurrency in tests).

No extractor version bump: merger-proxy stays at 1.0.0; `coverage` will simply start populating an empty table.
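For the PG path, the delete+upsert transaction described in item 2 might look roughly like the sketch below. It is not the actual SpacDealReplace helper: the `SqlClient` interface, the `Deal` shape, the `spac_deals` table and its columns, and the `RecordingClient` test double are all hypothetical names introduced for illustration.

```typescript
// Illustrative sketch only: every name below is an assumption, not the PR's code.

// Minimal structural stand-in for a checked-out PG client.
interface SqlClient {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

// Hypothetical deal row.
interface Deal {
  id: string;
  spacCik: string;
  payload: string;
}

// Delete orphan rows and upsert the recomputed deals as one unit:
// either every statement commits, or the ROLLBACK restores the prior rows.
async function replaceDealsAtomically(
  client: SqlClient,
  spacCik: string,
  deals: Deal[],
): Promise<void> {
  await client.query("BEGIN");
  try {
    await client.query(
      "DELETE FROM spac_deals WHERE spac_cik = $1 AND NOT (id = ANY($2))",
      [spacCik, deals.map((deal) => deal.id)],
    );
    for (const deal of deals) {
      await client.query(
        "INSERT INTO spac_deals (id, spac_cik, payload) VALUES ($1, $2, $3) " +
          "ON CONFLICT (id) DO UPDATE SET payload = EXCLUDED.payload",
        [deal.id, deal.spacCik, deal.payload],
      );
    }
    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK");
    throw err; // surface the original failure to the caller
  }
}

// Test double: records the first keyword of each statement and can
// simulate a failure on statements that start with a given keyword.
class RecordingClient implements SqlClient {
  readonly statements: string[] = [];
  private readonly failOn: string | undefined;

  constructor(failOn?: string) {
    this.failOn = failOn;
  }

  async query(sql: string, _params?: unknown[]): Promise<unknown> {
    this.statements.push(sql.split(" ")[0]);
    if (this.failOn !== undefined && sql.startsWith(this.failOn)) {
      throw new Error(`simulated failure on ${this.failOn}`);
    }
    return undefined;
  }
}
```

Passing `new RecordingClient("INSERT")` simulates a failure between the delete and the final write, which is the window the commit message describes. In that case the recorded statements end with ROLLBACK instead of COMMIT and the error is rethrown.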
Summary
Four HIGH-priority findings from an automated security/correctness review of `sec` for the last 24h, scoped to extractors that #165 / #166 left half-wired.

Stacked on PR #165 (`claude/wonderful-hypatia-y6cb4l`). The seal helpers (`verifyRowSpan`, `boundSourceSpan`, multi-stage `wrapUntrusted` defang) are inherited from that PR; this PR extends them to the two missed extractors and closes a defang gap. Retarget to `main` after #165 merges.

- HIGH-1: An unbounded `source_span` could ship through `SpacMergerExtractionRepo` / `SpacRedemptionExtractionRepo`.
- HIGH-2: The `wrapUntrusted` defang missed `</UNTRUSTED&Tab;FILER&Tab;DOCUMENT>`-style tokens because `&Tab;` wasn't in the named-entity table and `&` / `;` broke the tag-scan regex. The per-call nonce on the real fence still holds; this closes the layered gap.
- HIGH-3: `processMergerProxy` never wrote to `extractor_runs`. `sec version coverage extractor merger-proxy` read zero; `drop-previous` was a no-op.
- HIGH-4: `SpacReportWriter.recomputeAndSaveDeals` deleted orphan deal rows then wrote new rows non-atomically. A crash mid-pass corrupted the SPAC report row.

Two commits, scoped per concern pair:

- `fix(forms): apply prompt-injection seal to merger-proxy + redemption extractors` (HIGH-1 + HIGH-2)
- `fix(spac): record extractor_runs for merger-proxy + transactional recompute` (HIGH-3 + HIGH-4)

Test plan

- `bun test src/sec/forms/registration-statements/s1/`
- `bun test src/sec/forms/proxies-information-statements/`
- `bun test src/sec/forms/miscellaneous-filings/`
- `bun test src/storage/spac/`
- `bun test src/storage/form-8k-event/`
- `bun run build`

Generated by Claude Code