Full-repo sweep remediation: 44 fixes, stack-graphs rip, read-only MCP surface#175

Merged
theagenticguy merged 4 commits into main from sweep/full-remediation
Jun 1, 2026

@theagenticguy
Owner

Autonomous full-repo sweep (ERPAVal session-bba601) followed by full remediation, split into three logically separated commits. mise run check:full is green (2064 tests, 0 failures), including all determinism/parity gates.

What this does

1. fix: remediate 44 sweep findings + align local gate with CI (5aafec5)
44 adversarially-confirmed findings (5 further candidates were refuted by independent verifiers) across 16 packages. Activated four inert safety rails: the embedder-fingerprint guard (embedderModelId was never persisted, so the guard never fired), the SARIF inline-suppression regex (it matched inside string literals, a gate bypass), the ci-init templates (they emitted 3 nonexistent flags, so generated CI failed on first run), and the doctor lbug check. Fixed the empty-array graphHash divergence (and recreated the missing parity test), the incremental closure carry-forward, and ~20 more. Rewrote 6 fictional-API package READMEs and SPECS.md; fixed doc count drift.

Also fixed three latent gate-fidelity bugs found in passing: biome colliding with sibling git-worktree configs, mise run check diverging from CI on the heavy docs package (red on any machine without cached Chromium), and a stale committed SBOM.cdx.json poisoning the local OSV scan.
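
The SARIF inline-suppression fix in this commit can be pictured with a minimal TypeScript sketch. The marker text `codehub-ignore`, the set of comment openers, and both function names are assumptions made for illustration; the PR does not show the real regex.

```typescript
// Hypothetical marker text; the real marker used by the sarif package is not shown in this PR.
// Unanchored form: matches the marker anywhere on the line, including inside a string literal.
const UNANCHORED_MARKER = /codehub-ignore\b/;

// Anchored form: the marker only counts when it directly follows a comment opener.
// The opener set (//, #, /*) is an assumption.
const ANCHORED_MARKER = /(?:\/\/|#|\/\*)\s*codehub-ignore\b/;

export function isSuppressedUnanchored(line: string): boolean {
  return UNANCHORED_MARKER.test(line);
}

export function isSuppressedAnchored(line: string): boolean {
  return ANCHORED_MARKER.test(line);
}

// A string literal that merely mentions the marker.
const stringLiteralLine = 'const token = "codehub-ignore";';
// A genuine trailing comment.
const commentLine = "eval(input); // codehub-ignore";

console.log(isSuppressedUnanchored(stringLiteralLine)); // true: the bypass
console.log(isSuppressedAnchored(stringLiteralLine)); // false: the finding is still reported
console.log(isSuppressedAnchored(commentLine)); // true: intended suppression still works
```

This sketch has a known limit: a string literal that itself contains a comment opener followed by the marker would still match. Closing that gap needs a tokenizer rather than a single regex, and the PR does not say whether the real fix goes that far.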

2. refactor(mcp)!: remove source-mutating tools — MCP surface is read-only (5e30bd0)
Removed rename and remove_dead_code (the only tools that edited user source). Artifact writers (scan/pack_codebase/group_sync, which write .codehub/ outputs) stay. Surface is now 28 tools; server.test.ts pins the exact set and asserts no destructiveHint=true tool exists. The opencodehub-refactoring skill is reframed to analysis-only.
BREAKING: the two MCP tools are gone — OpenCodeHub plans and verifies refactors; it does not apply edits.

3. refactor(ingestion): rip the dead stack-graphs subsystem (7b91d4b)
~1,900 LOC of test-only code (zero production callers), superseded by SCIP-as-oracle and orphaned by upstream archival (github/stack-graphs, archived 2025-09-09). Removed the subsystem, vendored rule data, and NOTICE attribution. Capability note in the commit: non-SCIP languages now use the three-tier walker's global tier only (already the de-facto behavior).

Verification

  • mise run check:full: lint + typecheck + test + banned-strings + licenses + OSV + pack-determinism all green.
  • Determinism gates pass: double-run graphHash, embeddings determinism, pack byte-identity.
  • Docs site builds (64 pages, links valid).
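
To illustrate why the empty-array fix matters for the graphHash determinism gates listed above, here is a rough sketch. It assumes one backend cannot tell an empty list from a missing value; the PR does not describe the actual failure mode. The sentinel value, the `Row` shape, and every function name are hypothetical.

```typescript
import { createHash } from "node:crypto";

type Value = string | number | string[];
type Row = Record<string, Value>;

// Hypothetical sentinel; the real encoding is not shown in the PR.
const EMPTY_ARRAY_SENTINEL = "__EMPTY_ARRAY__";

// Replace empty arrays with the sentinel before writing.
function encodeRow(row: Row): Row {
  const out: Row = {};
  for (const [key, value] of Object.entries(row)) {
    out[key] = Array.isArray(value) && value.length === 0 ? EMPTY_ARRAY_SENTINEL : value;
  }
  return out;
}

// Restore empty arrays after reading.
function decodeRow(row: Row): Row {
  const out: Row = {};
  for (const [key, value] of Object.entries(row)) {
    out[key] = value === EMPTY_ARRAY_SENTINEL ? [] : value;
  }
  return out;
}

// Stand-in for a backend that silently drops empty lists (an assumption).
function lossyStore(row: Row): Row {
  const out: Row = {};
  for (const [key, value] of Object.entries(row)) {
    if (Array.isArray(value) && value.length === 0) continue;
    out[key] = value;
  }
  return out;
}

// Canonical hash: keys sorted per row, then sha256 over the JSON text.
function hashRows(rows: Row[]): string {
  const canonical = rows.map((row) =>
    Object.keys(row).sort().map((key) => [key, row[key]]),
  );
  return createHash("sha256").update(JSON.stringify(canonical)).digest("hex");
}

const original: Row[] = [{ id: "fn:a", tags: [] }];
const naive = original.map(lossyStore);
const guarded = original.map((row) => decodeRow(lossyStore(encodeRow(row))));

console.log(hashRows(naive) === hashRows(original)); // false: hashes diverge
console.log(hashRows(guarded) === hashRows(original)); // true: parity restored
```

A parity test in this style hashes the same logical graph through both backends and fails on any byte difference, which is the kind of check the recreated graph-hash-parity.test.ts is described as providing.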

Notes for review

  • The remediation commit is large by necessity (44 findings); the two rips are isolated and reviewable on their own.
  • Durable lessons recorded under .erpaval/solutions/.

🤖 Generated with Claude Code

The stack-graphs name-resolution subsystem (~1,900 LOC: the clean-room
.tsg evaluator, the Python/TS resolvers, resolver-strategy, the __all__
post-processor, and the vendored github/stack-graphs rule data) ran only
under test. `getResolver()` had zero production callers; the live
`resolve()` path in context.ts never dispatched through it.

It was a planned precision layer to replace the three-tier walker's lossy
global tier, but two forces stranded it: the SCIP-as-oracle phase
(scip-index, confidence 1.0) supersedes it for every SCIP-covered
language, and upstream github/stack-graphs was archived 2025-09-09.

Removed: the subsystem files + tests + fixtures, vendor/stack-graphs-python/,
the NOTICE attribution block, and the `resolverStrategyName: "stack-graphs"`
opt-in + stale docstrings on the python/typescript/tsx/javascript providers.

Capability note: non-SCIP-covered languages (Rust gap, Swift, COBOL) now
rely on the three-tier walker's global tier with no precision overlay.
That was the de-facto behavior already, since the layer never ran.

The MCP surface must never edit a user's source files. Remove the two
tools that did: `rename` (rewrote source to apply a symbol rename) and
`remove_dead_code` (deleted source ranges via fs.writeFileAtomic).

Removed: both tool modules + tests, analysis/rename.ts + its types and
index re-exports, the analysis-bridge callRunRename wrapper (and its
now-orphaned createNodeFs/FsAbstraction imports), and the mcp index
re-exports. Kept analysis/dead-code.ts (classifyDeadness still backs the
read-only list_dead_code) and fs.ts/git.ts (shared with staleness +
detect-changes).

Artifact writers stay: scan, pack_codebase, group_sync are readOnlyHint
false because they write .codehub/ artifacts (SARIF, code-packs, contract
registries), not user source — destructiveHint stays false.
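
The annotation split described above, together with the pin that server.test.ts is said to enforce, might look roughly like this. Only a handful of the 28 tools are shown, the registry shape is hypothetical, and `surfaceViolations` is an illustrative helper rather than code from the repository.

```typescript
interface ToolAnnotations {
  readOnlyHint: boolean;
  destructiveHint: boolean;
}

interface ToolDef {
  name: string;
  annotations: ToolAnnotations;
}

// Illustrative subset; the real surface has 28 tools.
const TOOLS: ToolDef[] = [
  { name: "impact", annotations: { readOnlyHint: true, destructiveHint: false } },
  { name: "list_dead_code", annotations: { readOnlyHint: true, destructiveHint: false } },
  // Artifact writers: not read-only (they write .codehub/ outputs), but not destructive.
  { name: "scan", annotations: { readOnlyHint: false, destructiveHint: false } },
  { name: "pack_codebase", annotations: { readOnlyHint: false, destructiveHint: false } },
  { name: "group_sync", annotations: { readOnlyHint: false, destructiveHint: false } },
];

const EXPECTED_NAMES = ["impact", "list_dead_code", "scan", "pack_codebase", "group_sync"];

// Returns a list of human-readable violations; an empty list means the surface is as pinned.
export function surfaceViolations(
  tools: readonly ToolDef[],
  expectedNames: readonly string[],
): string[] {
  const violations: string[] = [];
  const actual = tools.map((tool) => tool.name).sort();
  const expected = [...expectedNames].sort();
  if (JSON.stringify(actual) !== JSON.stringify(expected)) {
    violations.push(`tool set drifted: got [${actual.join(", ")}]`);
  }
  for (const tool of tools) {
    if (tool.annotations.destructiveHint) {
      violations.push(`${tool.name} declares destructiveHint=true`);
    }
  }
  return violations;
}

console.log(surfaceViolations(TOOLS, EXPECTED_NAMES)); // []
```

Pinning the exact name set means that re-adding a tool such as `rename` fails the test even if its annotations are benign, so a source-mutating tool cannot return without an explicit change to the pin.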

The surface is now 28 tools. server.test.ts pins the exact name set and
asserts no registered tool has destructiveHint=true; annotations.test.ts
asserts the two tools are absent. Docs, skills, the tool-catalog, the
server INSTRUCTIONS string, and the codehub-init agent-context stanza are
updated to 28; the opencodehub-refactoring skill is reframed to
analysis-only (plan with impact/context, apply edits yourself, verify with
detect_changes).

BREAKING CHANGE: the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors; it does not apply source edits.

Full-repo sweep (ERPAVal session-bba601) surfaced 44 adversarially-confirmed
findings across 16 packages; all are fixed here. Highlights:

- embedder: persist embedderModelId in analyze storeMeta so the
  fingerprint-mismatch guard actually fires (was inert).
- sarif: anchor the inline-suppression marker to a comment opener so a
  marker inside a string literal can no longer suppress a real finding.
- cli: fix ci-init templates to emit only real flags (verdict has no
  --output-format/--pr-comment; analyze has no --incremental); make doctor
  hard-fail on a missing graph binding and drop phantom CODEHUB_STORE hints.
- ingestion: stripPhaseKeys no longer drops strictDetectors/embeddings*/
  coverage options; wire coveragePhase into DEFAULT_PHASES; re-BFS the
  closure for incremental processes/communities; persist structured_json
  (citations/side_effects/invariants) from the summarize phase.
- analysis: real sha256 author-exclusion in verdict privacy mode; count
  findings by id (not ruleId) so tiers don't undercount; hoist the
  ACCESSES fetch in api-impact.
- storage: encode empty arrays with a sentinel so lbug graphHash stays
  byte-identical to DuckDB; recreate graph-hash-parity.test.ts.
- core-types: deterministic merge tiebreak in KnowledgeGraph; kind-aware
  parseNodeId; forward-incompat schema-version branch.
- scip-ingest: drop phantom @bufbuild/protobuf + @opencodehub/analysis
  deps; bounds-check proto-reader FIXED64/FIXED32 skips.
- cobol-proleap: fix parseJreMajor for Java 9+ schemes; SIGKILL escalation
  no longer races the exit handler for the settle reason.
- frameworks: drop unused zod dep; honest stage-count docs (stages 3/5
  ship standalone but stage 5 can't run at profile time).
- plus pack, policy, wiki, mcp pack-codebase/connection-pool fixes.
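
As one concrete example from the list above, the cobol-proleap item concerns Java's two version-string schemes: legacy `1.x` strings (Java 8 and earlier) and bare major numbers (Java 9+). A minimal sketch of handling both follows. `parseJavaMajor` is a hypothetical stand-in, not the package's actual parseJreMajor, and the sample inputs are assumed shapes of version strings.

```typescript
// Extracts the Java major version from a version string.
// Legacy scheme:  "1.8.0_292" -> 8   (major is the second component)
// Java 9+ scheme: "11.0.2"    -> 11  (major is the first component)
export function parseJavaMajor(version: string): number | undefined {
  const match = /^(\d+)(?:\.(\d+))?/.exec(version.trim());
  if (match === null) return undefined;
  const first = Number(match[1]);
  // Only the legacy scheme starts with "1." followed by the real major.
  if (first === 1 && match[2] !== undefined) return Number(match[2]);
  return first;
}

console.log(parseJavaMajor("1.8.0_292")); // 8
console.log(parseJavaMajor("11.0.2")); // 11
console.log(parseJavaMajor("17")); // 17
console.log(parseJavaMajor("21-ea")); // 21
console.log(parseJavaMajor("not-a-version")); // undefined
```

A parser that always reads the second component would report 0 for "11.0.2", which is presumably the class of bug the fix addresses. The PR does not spell out the original defect.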

Plus six fictional-API package READMEs rewritten to real exports, SPECS.md
rewritten, doc count drift fixed (signature registered → tools, 11 skills,
18 packages, 19 scanners), and the astro llms-txt CODEHUB_STORE
contradiction removed.

Infra (latent gate-fidelity bugs fixed in passing):
- biome: respect .gitignore + negated-include so sibling git worktrees'
  nested biome.json no longer collide with the root config.
- mise: exclude @opencodehub/docs from build/test/typecheck to mirror CI
  (its astro+playwright build was making `mise run check` red on any
  machine without cached Chromium); add a docs:build task that installs
  chromium on demand.
- untrack + gitignore SBOM.cdx.json (a release artifact regenerated by
  cdxgen; the stale committed copy poisoned the local OSV scan).

Durable lessons recorded under .erpaval/solutions/.

…command-injection)

CodeQL flagged runCommand's spawn in scip-ingest/runners with four
command-injection alerts (js/shell-command-injection-from-environment,
js/indirect-command-line-injection, js/shell-command-constructed-from-input):
the executable name flows through a variable rather than a literal, and
withCodehubBinOnPath feeds process.env PATH into the spawn.

The code was already defended — `shell: false` array-spawn, and the Kotlin
`sh -c` chain shellQuote-escapes every interpolated path — but CodeQL does
not recognize the custom shellQuote as a sanitizer and treats the variable
`cmd` as tainted.

Add a closed `ALLOWED_COMMANDS` allowlist of the 12 executables buildCommand
(plus the dotnet probe and the Kotlin sh chain) can hand to runCommand, and
have runCommand validate `cmd` against it BEFORE spawning, then spawn the
literal recovered from the allowlist (`safeCmd`) rather than the incoming
argument. This is a real defense-in-depth barrier at the spawn boundary and
the taint barrier CodeQL's js/shell-command-* queries recognize.

Export `isAllowedCommand` + `ALLOWED_COMMANDS` and pin both the
every-emitted-command-is-allowlisted contract and the
rejects-arbitrary/injected-command behavior in tests (+2 cases). These
alerts were pre-existing on main (created 2026-05-05 / 2026-05-29).
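
The allowlist barrier described in this commit could be sketched as below. The entries in `ALLOWED_COMMANDS` are placeholders (the PR does not enumerate the 12 real executables), `resolveAllowed` is a hypothetical helper name, and the real runCommand signature is not shown in the source.

```typescript
import { spawn } from "node:child_process";

// Placeholder entries; the actual allowlist contents are not listed in the PR.
export const ALLOWED_COMMANDS: readonly string[] = [
  "scip-typescript",
  "scip-python",
  "dotnet",
  "sh",
];

export function isAllowedCommand(cmd: string): boolean {
  return ALLOWED_COMMANDS.includes(cmd);
}

// Returns the allowlist's own entry rather than the caller's value,
// and throws before anything is spawned if there is no exact match.
export function resolveAllowed(cmd: string): string {
  const safeCmd = ALLOWED_COMMANDS.find((allowed) => allowed === cmd);
  if (safeCmd === undefined) {
    throw new Error(`refusing to spawn non-allowlisted command: ${JSON.stringify(cmd)}`);
  }
  return safeCmd;
}

export function runCommand(cmd: string, args: readonly string[]) {
  const safeCmd = resolveAllowed(cmd); // validate BEFORE spawning
  // shell: false keeps args as an argv array, so no shell parsing occurs.
  return spawn(safeCmd, [...args], { shell: false });
}

console.log(isAllowedCommand("dotnet")); // true
console.log(isAllowedCommand("dotnet; rm -rf /")); // false
```

At runtime the recovered string equals the incoming one, so the extra step changes nothing for valid input. Its value is that the spawned name is provably one of a closed set, which is easier for both reviewers and static analysis to reason about than a custom escaping function.
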
@theagenticguy
Owner Author

Security: fixed 4 CodeQL command-injection alerts (ac60c2f)

Investigated the open code-scanning alerts. Four were genuine CodeQL command-injection findings in packages/scip-ingest/src/runners/index.ts, all pre-existing on main (created 2026-05-05 / 2026-05-29).

Root cause: the spawn was already defended — shell: false array-spawn, and the Kotlin sh -c chain shellQuote-escapes every interpolated path — but CodeQL doesn't recognize the custom shellQuote as a sanitizer and treats the variable cmd (a resolved indexer binary name) as tainted.

Fix: added a closed ALLOWED_COMMANDS allowlist (the 12 executables buildCommand / the dotnet probe / the Kotlin chain can spawn) and made runCommand validate against it before spawning, then spawn the literal recovered from the allowlist rather than the incoming argument. This is real defense-in-depth at the spawn boundary and the taint barrier CodeQL's js/shell-command-* queries recognize. Exported isAllowedCommand + pinned the contract in 2 new tests.

Dependency CVEs (OSV / pnpm audit) were already clean — the earlier OSV failure was a stale committed SBOM.cdx.json, fixed in 5aafec5.

Note: 2 pre-existing verify-global-install CI failures (not from this branch)

  • macos-x64-node22-nvm: Cannot find module onnxruntime_binding.node — a missing native prebuild for darwin/x64 in the global-install smoke test (embedder dep packaging gap).
  • macos-arm64-node22-volta: a 63s-vs-60s install-budget timing flake + a volta-specific npm-prefix path quirk.

Both reproduce on main and are unrelated to this diff. Out of scope for this PR; flagging for a separate native-prebuild fix.

@theagenticguy theagenticguy merged commit dbb574a into main Jun 1, 2026
37 of 39 checks passed
@theagenticguy theagenticguy deleted the sweep/full-remediation branch June 1, 2026 18:54
@github-actions github-actions Bot mentioned this pull request Jun 1, 2026
theagenticguy pushed a commit that referenced this pull request Jun 1, 2026
🤖 Automated release via release-please
---


<details><summary>analysis: 0.4.0</summary>

##
[0.4.0](analysis-v0.3.3...analysis-v0.4.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/core-types bumped to 0.4.0
    * @opencodehub/sarif bumped to 0.2.0
    * @opencodehub/storage bumped to 0.3.0
    * @opencodehub/wiki bumped to 0.3.0
</details>

<details><summary>cli: 0.6.0</summary>

##
[0.6.0](cli-v0.5.6...cli-v0.6.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.4.0
    * @opencodehub/core-types bumped to 0.4.0
    * @opencodehub/embedder bumped to 0.1.3
    * @opencodehub/ingestion bumped to 0.5.0
    * @opencodehub/mcp bumped to 0.5.0
    * @opencodehub/pack bumped to 0.3.0
    * @opencodehub/policy bumped to 0.2.0
    * @opencodehub/sarif bumped to 0.2.0
    * @opencodehub/scanners bumped to 0.2.4
    * @opencodehub/search bumped to 0.3.0
    * @opencodehub/storage bumped to 0.3.0
    * @opencodehub/wiki bumped to 0.3.0
</details>

<details><summary>cobol-proleap: 0.2.0</summary>

##
[0.2.0](cobol-proleap-v0.1.9...cobol-proleap-v0.2.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/core-types bumped to 0.4.0
    * @opencodehub/ingestion bumped to 0.5.0
</details>

<details><summary>core-types: 0.4.0</summary>

##
[0.4.0](core-types-v0.3.0...core-types-v0.4.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))
</details>

<details><summary>embedder: 0.1.3</summary>

##
[0.1.3](embedder-v0.1.2...embedder-v0.1.3)
(2026-06-01)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/core-types bumped to 0.4.0
</details>

<details><summary>frameworks: 0.2.0</summary>

##
[0.2.0](frameworks-v0.1.1...frameworks-v0.2.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/core-types bumped to 0.4.0
</details>

<details><summary>ingestion: 0.5.0</summary>

##
[0.5.0](ingestion-v0.4.5...ingestion-v0.5.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.4.0
    * @opencodehub/core-types bumped to 0.4.0
    * @opencodehub/embedder bumped to 0.1.3
    * @opencodehub/frameworks bumped to 0.2.0
    * @opencodehub/scip-ingest bumped to 0.3.0
    * @opencodehub/storage bumped to 0.3.0
    * @opencodehub/summarizer bumped to 0.2.0
</details>

<details><summary>mcp: 0.5.0</summary>

##
[0.5.0](mcp-v0.4.5...mcp-v0.5.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.4.0
    * @opencodehub/core-types bumped to 0.4.0
    * @opencodehub/embedder bumped to 0.1.3
    * @opencodehub/pack bumped to 0.3.0
    * @opencodehub/sarif bumped to 0.2.0
    * @opencodehub/scanners bumped to 0.2.4
    * @opencodehub/search bumped to 0.3.0
    * @opencodehub/storage bumped to 0.3.0
</details>

<details><summary>pack: 0.3.0</summary>

##
[0.3.0](pack-v0.2.4...pack-v0.3.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.4.0
    * @opencodehub/core-types bumped to 0.4.0
    * @opencodehub/ingestion bumped to 0.5.0
    * @opencodehub/sarif bumped to 0.2.0
    * @opencodehub/storage bumped to 0.3.0
</details>

<details><summary>policy: 0.2.0</summary>

##
[0.2.0](policy-v0.1.1...policy-v0.2.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))
</details>

<details><summary>sarif: 0.2.0</summary>

##
[0.2.0](sarif-v0.1.2...sarif-v0.2.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))
</details>

<details><summary>scanners: 0.2.4</summary>

##
[0.2.4](scanners-v0.2.3...scanners-v0.2.4)
(2026-06-01)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/sarif bumped to 0.2.0
</details>

<details><summary>scip-ingest: 0.3.0</summary>

##
[0.3.0](scip-ingest-v0.2.5...scip-ingest-v0.3.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/core-types bumped to 0.4.0
</details>

<details><summary>search: 0.3.0</summary>

##
[0.3.0](search-v0.2.3...search-v0.3.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/core-types bumped to 0.4.0
    * @opencodehub/storage bumped to 0.3.0
</details>

<details><summary>storage: 0.3.0</summary>

##
[0.3.0](storage-v0.2.3...storage-v0.3.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/core-types bumped to 0.4.0
</details>

<details><summary>summarizer: 0.2.0</summary>

##
[0.2.0](summarizer-v0.1.1...summarizer-v0.2.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))
</details>

<details><summary>wiki: 0.3.0</summary>

##
[0.3.0](wiki-v0.2.3...wiki-v0.3.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/core-types bumped to 0.4.0
    * @opencodehub/storage bumped to 0.3.0
    * @opencodehub/summarizer bumped to 0.2.0
</details>

<details><summary>root: 0.7.0</summary>

##
[0.7.0](root-v0.6.7...root-v0.7.0)
(2026-06-01)


### ⚠ BREAKING CHANGES

* **sweep:** the `rename` and `remove_dead_code` MCP tools are removed.
OpenCodeHub plans and verifies refactors via read-only analysis
(impact/context/detect_changes); it does not apply source edits.

### Features

* **sweep:** remediate 44 findings, rip stack-graphs + source-mutating
MCP tools
([#175](#175))
([dbb574a](dbb574a))
</details>

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>