Conversation
Co-authored-by: Eric Elliott <support@paralleldrive.com>
```js
const src = readFileSync(path.resolve(cwd, file), "utf8");
const loc = src.split("\n").length;
const buf = Buffer.from(src, "utf8");
const gzipRatio = gzipSync(buf).length / buf.length;
```
Division by zero for empty files produces Infinity
Low Severity
gzipSync(buf).length / buf.length produces Infinity when an empty file is read, because Buffer.from("", "utf8").length is 0. While the default --min-loc 50 filters these out, using --min-loc 1 or --min-loc 0 lets Infinity propagate as gzipRatio. This causes the formatter to display "Infinity%" in the table and JSON.stringify to serialize it as null in JSON output.
Pull request overview
Adds an initial implementation and documentation for a new `npx aidd churn` hotspot-analysis command (LoC × churn × complexity, with gzip “density”) and updates the AI “split PR” skill/docs to reference it.
Changes:
- Adds churn analysis core modules (git churn collection, file metrics collection, scoring, formatting) plus a new Commander `churn` subcommand.
- Introduces the `tsmetrics-core` dependency for AST-based cyclomatic complexity.
- Adds/updates planning docs and skill/README references for the new command.
Reviewed changes
Copilot reviewed 15 out of 17 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| tasks/aidd-churn-epic.md | New epic/spec for the churn feature and related deliverables |
| plan.md | Links the churn epic from the main plan |
| package.json | Adds tsmetrics-core dependency |
| package-lock.json | Updates lockfile for tsmetrics-core (and other lockfile changes) |
| lib/churn-collector.js | New git churn collector with structured errors |
| lib/file-metrics-collector.js | New LoC/gzip/complexity metrics collector |
| lib/churn-scorer.js | New pure composite scoring + filtering/sorting |
| lib/churn-formatter.js | New table + JSON output formatting |
| lib/churn-command.js | New Commander subcommand wiring collectors/scorer/formatter |
| lib/churn-scorer.test.js | Unit tests for scoring logic |
| lib/churn-formatter.test.js | Unit tests for table/JSON formatting |
| bin/aidd.js | Registers the new churn command on the CLI |
| ai/skills/index.md | Adds index entry for the new split-pr skill |
| ai/skills/aidd-split-pr/* | Adds split-pr skill spec + README |
| README.md | Adds /split-pr mention and documents aidd churn usage |
```js
const src = readFileSync(path.resolve(cwd, file), "utf8");
const loc = src.split("\n").length;
const buf = Buffer.from(src, "utf8");
```
The collector claims to "skip" binary files, but readFileSync(..., 'utf8') typically does not throw for binaries—it will just decode with replacement characters—so binary assets can end up in the scoring output with meaningless loc/complexity. Consider reading as a Buffer first and detecting likely-binary content (e.g., NUL bytes) to skip those paths.
```diff
-const src = readFileSync(path.resolve(cwd, file), "utf8");
-const loc = src.split("\n").length;
-const buf = Buffer.from(src, "utf8");
+const fullPath = path.resolve(cwd, file);
+const buf = readFileSync(fullPath);
+// Heuristic: treat files containing NUL bytes as binary and skip them.
+let isBinary = false;
+for (let i = 0; i < buf.length; i++) {
+  if (buf[i] === 0) {
+    isBinary = true;
+    break;
+  }
+}
+if (isBinary) {
+  return map;
+}
+const src = buf.toString("utf8");
+const loc = src.split("\n").length;
```
```js
/**
 * Returns file metrics for each path: { loc, gzipRatio, complexity }.
 * Skips files that cannot be read (binary, missing, etc).
 */
export const collectFileMetrics = ({ files, cwd = process.cwd() }) =>
  files.reduce((map, file) => {
    try {
      const src = readFileSync(path.resolve(cwd, file), "utf8");
      const loc = src.split("\n").length;
      const buf = Buffer.from(src, "utf8");
      const gzipRatio = gzipSync(buf).length / buf.length;
      const complexity = measureComplexity(file, src);
      map.set(file, { complexity, gzipRatio, loc });
    } catch {
      // skip unreadable files silently
    }
    return map;
  }, new Map());
```
There are no unit tests covering collectFileMetrics yet (LoC counting, gzip ratio formatting, binary/unreadable file skipping, and the TypeScript-vs-non-TypeScript complexity behavior). Given this feeds the CLI output directly, please add tests for these cases.
```js
.option("--days <n>", "git log window in days", "90")
.option("--top <n>", "max results to show", "20")
.option("--min-loc <n>", "minimum lines of code to include", "50")
.option("--json", "output raw JSON")
.action(async ({ days, top, minLoc, json }) => {
  const cwd = process.cwd();
  try {
    const churnMap = collectChurn({ cwd, days: Number(days) });
    const files = [...churnMap.keys()];
    const metricsMap = collectFileMetrics({ cwd, files });
    const results = scoreFiles(churnMap, metricsMap, {
      minLoc: Number(minLoc),
      top: Number(top),
```
--days, --top, and --min-loc are converted with Number(...) but never validated. Invalid inputs become NaN (e.g. --min-loc foo), which can yield confusing empty output or odd git args. Prefer Commander’s option coercion (parseInt) plus explicit validation (finite, positive) and a clear error message before running the analysis.
```diff
-.option("--days <n>", "git log window in days", "90")
-.option("--top <n>", "max results to show", "20")
-.option("--min-loc <n>", "minimum lines of code to include", "50")
-.option("--json", "output raw JSON")
-.action(async ({ days, top, minLoc, json }) => {
-  const cwd = process.cwd();
-  try {
-    const churnMap = collectChurn({ cwd, days: Number(days) });
-    const files = [...churnMap.keys()];
-    const metricsMap = collectFileMetrics({ cwd, files });
-    const results = scoreFiles(churnMap, metricsMap, {
-      minLoc: Number(minLoc),
-      top: Number(top),
+.option(
+  "--days <n>",
+  "git log window in days",
+  (value) => Number.parseInt(value, 10),
+  90,
+)
+.option(
+  "--top <n>",
+  "max results to show",
+  (value) => Number.parseInt(value, 10),
+  20,
+)
+.option(
+  "--min-loc <n>",
+  "minimum lines of code to include",
+  (value) => Number.parseInt(value, 10),
+  50,
+)
+.option("--json", "output raw JSON")
+.action(async ({ days, top, minLoc, json }) => {
+  const ensurePositiveInteger = (value, name) => {
+    if (
+      !Number.isFinite(value) ||
+      !Number.isInteger(value) ||
+      value <= 0
+    ) {
+      console.error(
+        chalk.red(
+          `❌ Invalid value for --${name}: ${String(value)}. Expected a positive integer.`,
+        ),
+      );
+      process.exit(1);
+    }
+  };
+  ensurePositiveInteger(days, "days");
+  ensurePositiveInteger(top, "top");
+  ensurePositiveInteger(minLoc, "min-loc");
+  const cwd = process.cwd();
+  try {
+    const churnMap = collectChurn({ cwd, days });
+    const files = [...churnMap.keys()];
+    const metricsMap = collectFileMetrics({ cwd, files });
+    const results = scoreFiles(churnMap, metricsMap, {
+      minLoc,
+      top,
```
```js
const JS_TS_EXTENSIONS = new Set([
  ".js",
  ".jsx",
  ".ts",
  ".tsx",
  ".mjs",
  ".cjs",
]);
```
JS_TS_EXTENSIONS uses ALL_CAPS constant naming, which conflicts with the project JavaScript guide (“Avoid ALL_CAPS for constants…”, ai/rules/javascript/javascript.mdc). Consider renaming to lowerCamelCase (e.g., jsTsExtensions) for consistency with the rest of the codebase.
```diff
 "gray-matter": "^4.0.3",
-"js-sha3": "^0.9.3"
+"js-sha3": "^0.9.3",
+"tsmetrics-core": "^1.4.1"
```
tsmetrics-core@^1.4.1 declares a typescript@^4.9.4 peer dependency (see lockfile), but this repo uses TypeScript 5.x in devDependencies. This can lead to peer-dep resolution warnings/errors (especially with pnpm/Yarn PnP) and potentially multiple TS versions installed. Consider upgrading tsmetrics-core to a version that supports TS 5.x, or aligning/declaring the appropriate TypeScript version strategy (dependency vs peer) for runtime use.
```diff
-"tsmetrics-core": "^1.4.1"
+"tsmetrics-core": "^1.5.0"
```
```js
export const collectChurn = ({ cwd = process.cwd(), days = 90 } = {}) => {
  const since = `${days} days ago`;
  let output;

  try {
    output = execSync(
      `git log --since="${since}" --name-only --pretty=format: --diff-filter=ACMR`,
      { cwd, encoding: "utf8" },
    );
```
collectChurn builds a shell command string and interpolates days directly into --since="${since}". Since collectChurn is exported, any caller passing untrusted/non-numeric input could potentially cause shell injection or at least produce invalid git arguments. Validate/coerce days to a finite positive integer inside collectChurn and consider using execFileSync (args array) to avoid invoking a shell.
```js
export const collectChurn = ({ cwd = process.cwd(), days = 90 } = {}) => {
  const since = `${days} days ago`;
  let output;

  try {
    output = execSync(
      `git log --since="${since}" --name-only --pretty=format: --diff-filter=ACMR`,
      { cwd, encoding: "utf8" },
    );
  } catch (cause) {
    const isNotRepo = cause.message?.includes("not a git repository");
    throw createError(
      isNotRepo ? churnErrors.NotAGitRepo : churnErrors.GitError,
      { cause },
    );
  }

  return output
    .split("\n")
    .map((f) => f.trim())
    .filter(Boolean)
    .reduce((map, file) => map.set(file, (map.get(file) ?? 0) + 1), new Map());
};
```
There are no unit tests covering collectChurn yet (e.g., parsing git log output, the day window behavior, and structured errors when not in a git repo). Given this is core to the new churn feature and the repo already has extensive lib-level tests, please add coverage for these scenarios.
```js
const src = readFileSync(path.resolve(cwd, file), "utf8");
const loc = src.split("\n").length;
const buf = Buffer.from(src, "utf8");
const gzipRatio = gzipSync(buf).length / buf.length;
```
gzipRatio is computed as gzipSync(buf).length / buf.length. For empty files buf.length is 0, which yields Infinity/NaN and can break sorting/formatting. Handle the empty-file case explicitly (e.g., treat density as 0 or 1, consistently).
```diff
-const gzipRatio = gzipSync(buf).length / buf.length;
+const gzipRatio =
+  buf.length === 0 ? 0 : gzipSync(buf).length / buf.length;
```
```md
**Requirements**:
- Given a source file, should return line count, gzip compression ratio, and cyclomatic complexity
- Given a binary or unreadable file, should skip it gracefully
- Given a non-TypeScript file, should return complexity of 1
```
The epic’s requirement says non-TypeScript files should return complexity of 1, but the current implementation attempts complexity parsing for several non-TS extensions (e.g. .js/.jsx/.mjs/.cjs). Either update this requirement to match the intended behavior (complexity for JS too) or adjust the implementation so the spec and shipped behavior are consistent.
```diff
-- Given a non-TypeScript file, should return complexity of 1
+- Given a file that is neither TypeScript nor JavaScript-family (`.js`, `.jsx`, `.mjs`, `.cjs`), should return a default complexity of 1
```
```js
export const addChurnCommand = (program) => {
  program
    .command("churn")
    .description("rank files by hotspot score (LoC × churn × complexity)")
    .option("--days <n>", "git log window in days", "90")
    .option("--top <n>", "max results to show", "20")
    .option("--min-loc <n>", "minimum lines of code to include", "50")
    .option("--json", "output raw JSON")
    .action(async ({ days, top, minLoc, json }) => {
      const cwd = process.cwd();
      try {
        const churnMap = collectChurn({ cwd, days: Number(days) });
        const files = [...churnMap.keys()];
        const metricsMap = collectFileMetrics({ cwd, files });
        const results = scoreFiles(churnMap, metricsMap, {
          minLoc: Number(minLoc),
          top: Number(top),
        });
        console.log(json ? formatJson(results) : formatTable(results));
      } catch (err) {
        try {
          handleChurnErrors({
            GitError: ({ message }) =>
              console.error(chalk.red(`❌ Git error: ${message}`)),
            NotAGitRepo: () =>
              console.error(
                chalk.red("❌ Not a git repository. Run inside a git repo."),
              ),
          })(err);
        } catch {
          console.error(chalk.red(`❌ ${err.message}`));
        }
        process.exit(1);
      }
    });
```
There’s no integration/smoke test for the new churn CLI subcommand yet (e.g., npx aidd churn --json produces parseable JSON and exits 0). Since the command wires together git parsing + file IO + formatting, adding at least a minimal CLI-level test would help prevent regressions.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
```js
output = execSync(
  `git log --since="${since}" --name-only --pretty=format: --diff-filter=ACMR`,
  { cwd, encoding: "utf8" },
);
```
Shell injection via days parameter in execSync
Medium Severity
The days parameter is interpolated directly into a shell command string passed to execSync. Because the value is placed inside double quotes in the shell (--since="${since}"), command substitution like $(...) would be executed. The immediate CLI caller converts to Number() which mitigates this, but collectChurn is an exported function that accepts arbitrary input without validation, leaving the injection surface open for any future caller.
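To make the risk concrete, here is a sketch of how the interpolation goes wrong. The malicious `days` value is hypothetical, and the string is only constructed, never passed to a shell:

```javascript
// A crafted days value smuggles a $(...) command substitution into the git
// command. Double quotes do NOT suppress $() expansion in POSIX shells, so
// the shell would run `touch /tmp/pwned` before git ever sees the argument.
const days = "$(touch /tmp/pwned)"; // hypothetical malicious caller input
const since = `${days} days ago`;
const cmd = `git log --since="${since}" --name-only --pretty=format: --diff-filter=ACMR`;

console.log(cmd);
// Note the live substitution embedded inside the quoted --since value.
```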
Additional Locations (1)
```js
.action(async ({ days, top, minLoc, json }) => {
  const cwd = process.cwd();
  try {
    const churnMap = collectChurn({ cwd, days: Number(days) });
```
Non-numeric --days silently produces wrong results
Medium Severity
When a user passes a non-numeric value like --days abc, Number("abc") evaluates to NaN. This produces the git argument --since="NaN days ago", which git silently treats as an invalid date and returns the entire commit history instead of the intended time window. The user receives a plausible-looking but incorrect hotspot ranking with no error or warning.


Note
Medium Risk
Adds a new CLI surface that shells out to `git` and reads/compresses local files, so failures or edge cases can impact developer workflows and performance. Also introduces a new production dependency (`tsmetrics-core`) and related parsing logic.
Overview
Adds a new `npx aidd churn` CLI subcommand to rank repository hotspots by a composite score (LoC × git churn × cyclomatic complexity, with gzip density as an extra signal) and prints either an aligned table or raw JSON. Implements the supporting collectors/scorers/formatters (including `tsmetrics-core`-based complexity measurement), wires the command into `bin/aidd.js`, and adds unit tests for scoring/formatting. Documentation is updated to introduce the new `/split-pr` skill and to reference `churn` in both the main README and project planning docs, along with adding the new `ai/skills/aidd-split-pr` directory/index entries.
Written by Cursor Bugbot for commit a04ea7f.