diff --git a/formats/pr-comment-responses.md b/formats/pr-comment-responses.md new file mode 100644 index 0000000..a88e0d6 --- /dev/null +++ b/formats/pr-comment-responses.md @@ -0,0 +1,108 @@ + + + +--- +name: pr-comment-responses +type: format +description: > + Output format for responding to pull request review comments. + Structures per-thread analysis, contradiction detection, and + response generation in either document or action mode. +produces: pr-comment-responses +--- + +# Format: PR Comment Responses + +The output MUST be a structured response plan for pull request review +threads. The format adapts based on `output_mode`: + +- **Document mode**: produce the full report below. +- **Action mode**: use the same section structure below as an analysis + and planning artifact, rather than a prose report. + +## Output Structure + +### 1. Thread Summary + +Summarize all review threads by state: + +| State | Count | Description | +|-------|-------|-------------| +| **Pending** | N | Active threads requiring response | +| **Outdated** | N | Threads on code that has since changed | +| **Resolved** | N | Already resolved — skipped unless user requests | + +- **Total threads**: count +- **Actionable threads**: count (pending only, unless user overrides) +- **Skipped threads**: count and reason (resolved, outdated) + +### 2. Contradiction Report + +Identify conflicting feedback across different reviewers on the same +code area or design decision. For each contradiction: + +```markdown +#### Contradiction C-: + +- **Reviewer A** (@handle): +- **Reviewer B** (@handle): +- **Conflict**: +- **Resolution Options / Tradeoffs**: +- **Decision Needed**: +``` + +If no contradictions are detected, state: "None identified." + +### 3. Per-Thread Responses + +For each actionable thread, in file order: + +```markdown +#### Thread T-: : + +- **Reviewer**: @handle +- **Thread State**: Pending / Outdated +- **Comment Summary**: <1–2 sentence summary of the reviewer's point> +- **Response Type**: Fix / Explain / Both +- **Analysis**: +- **Response**: + - *(If Fix)*: + - *(If Explain)*: + - *(If Both)*: +- **Linked Contradiction**: C- (if part of a detected contradiction) +``` + +### 4. Action Summary + +| Category | Count | Details | +|----------|-------|---------| +| **Code fixes applied** | N | Threads where code was changed | +| **Explanations provided** | N | Threads answered with rationale | +| **Skipped (resolved)** | N | Already resolved threads | +| **Skipped (outdated)** | N | Threads on changed code | +| **Needs discussion** | N | Contradictions or ambiguous feedback | + +- **Files modified**: list of files changed by fixes +- **Commits**: list of commits created (action mode only) +- **Unresolved items**: threads that need human judgment + +## Formatting Rules + +- Threads MUST be ordered by file path, then by line number within + each file. +- Every actionable thread MUST have a response — do not skip threads + without stating why. +- Code fixes MUST show before/after snippets with enough context + (at least 3 lines) to verify correctness. +- If a section has no content, state "None identified" — never omit + sections. +- In action mode, present the full analysis to the user and obtain + explicit confirmation before executing any mutation (code change, + comment post, thread resolution). + +## Response Type + +- `response_type` MUST be one of: **Fix**, **Explain**, or **Both**. +- If a per-thread override is shown, it MUST use one of the same values. diff --git a/manifest.yaml b/manifest.yaml index dd3eaee..fd845f9 100644 --- a/manifest.yaml +++ b/manifest.yaml @@ -699,6 +699,14 @@ formats: Prioritized triage report for issues, pull requests, or work items. Classifies items by priority, effort, and recommended action. + - name: pr-comment-responses + path: formats/pr-comment-responses.md + produces: pr-comment-responses + description: > + Output format for responding to pull request review comments. + Structures per-thread analysis, contradiction detection, and + response generation in either document or action mode. + - name: release-notes path: formats/release-notes.md produces: release-notes @@ -1409,6 +1417,28 @@ templates: taxonomies: [kernel-defect-categories] format: exhaustive-review-report + - name: review-pull-request + path: templates/review-pull-request.md + description: > + Review a pull request's diff, commits, and linked issues. + Supports document mode (investigation report) or action mode + (post inline review comments via GitHub API). Language-agnostic + by default with optional language-specific focus. + persona: systems-engineer + protocols: [anti-hallucination, self-verification, operational-constraints] + format: investigation-report + + - name: respond-to-pr-comments + path: templates/respond-to-pr-comments.md + description: > + Process PR review feedback and generate per-thread responses. + Supports document mode (response plan) or action mode (make + code fixes, post replies, resolve threads via GitHub API). + Detects contradictory feedback across reviewers. + persona: systems-engineer + protocols: [anti-hallucination, self-verification, operational-constraints] + format: pr-comment-responses + - name: reconstruct-behavior path: templates/reconstruct-behavior.md description: > diff --git a/templates/respond-to-pr-comments.md b/templates/respond-to-pr-comments.md new file mode 100644 index 0000000..621728d --- /dev/null +++ b/templates/respond-to-pr-comments.md @@ -0,0 +1,239 @@ + + + +--- +name: respond-to-pr-comments +description: > + Process pull request review feedback and generate per-thread responses. + Supports document mode (structured response plan) or action mode + (make code fixes, post replies, and resolve threads via GitHub API). + Detects contradictory feedback across reviewers. +persona: systems-engineer +protocols: + - guardrails/anti-hallucination + - guardrails/self-verification + - guardrails/operational-constraints +format: pr-comment-responses +params: + pr_reference: "Pull request to respond to — URL or number (e.g., #42)" + review_threads: "Review feedback to address — 'all pending', specific thread URLs, or pasted comments" + codebase_context: "What this code does, relevant architecture, design decisions that inform responses" + response_mode: "How to respond per-thread — 'auto' (heuristic), 'fix' (code changes), or 'explain' (rationale)" + output_mode: "Output mode — 'document' (produce response plan) or 'action' (make changes and post replies via gh CLI)" +input_contract: null +output_contract: + type: pr-comment-responses + description: > + A structured per-thread response plan with code fixes and/or + explanations. In action mode, responses are executed as code + changes, reply comments, and thread resolutions. +--- + +# Task: Respond to PR Review Comments + +You are tasked with processing review feedback on a pull request and +generating responses for each review thread — either code fixes, +explanatory replies, or both. + +## Inputs + +**Pull Request**: {{pr_reference}} + +**Review Threads to Address**: {{review_threads}} + +**Codebase Context**: {{codebase_context}} + +**Response Mode**: {{response_mode}} + +**Output Mode**: {{output_mode}} + +## Instructions + +### Phase 1: Gather Review Threads + +1. **Read all review threads** on the PR: + - Use `gh pr view {{pr_reference}} --comments` for a quick overview, but use + `gh api graphql` to fetch the authoritative review-thread data + needed for deterministic action mode execution + - For each review thread, record: + - `thread_id`: the GraphQL review thread ID (required for + `resolveReviewThread`) + - Reviewer handle + - File path and line number + - Thread state (pending, resolved, outdated) + - Full comment text and any replies + - Whether the thread is on code that still exists in the + current diff + - For each review comment within the thread, record: + - `comment_id`: the review comment database ID (required for + REST `in_reply_to` when posting a reply) + - Author handle + - Comment body + - Use a GraphQL query via `gh api graphql` that includes each + thread's ID, state, path, and line metadata, plus each + comment's database ID, author, and body + - Preserve these IDs in your working notes so later action + steps can post replies and resolve the correct threads + +2. **Filter threads** based on `review_threads` parameter: + - If `all pending` — include all threads with state `pending` + - If specific threads are listed — include only those + - Skip `resolved` threads unless the user explicitly requests them + - Flag `outdated` threads (code has changed since the comment) + and ask the user whether to address them + +3. **Read the current code** at each thread's location: + - Fetch the file content at the relevant lines + - Understand the surrounding context (function, class, module) + - Check if the code has changed since the review comment was posted + +### Phase 2: Detect Contradictions + +Compare feedback across different reviewers on the same code area +or design decision: + +1. **Group threads by location**: threads on the same file within + 10 lines of each other, or threads referencing the same function + or design concept. + +2. **Compare positions**: for each group, check if reviewers disagree: + - Reviewer A says "add error handling" but Reviewer B says + "keep it simple, don't over-engineer" + - Reviewer A says "use approach X" but Reviewer B says + "use approach Y" + - Reviewer A approves a pattern but Reviewer B flags it + +3. **Report contradictions** with both positions and a recommended + resolution. Do NOT silently pick one side — flag for the user. + +### Phase 3: Analyze Each Thread + +For each actionable thread, determine the response type: + +1. **If `response_mode` is `auto`**, apply these heuristics: + + | Reviewer Feedback | Response Type | + |---|---| + | Points out a bug, missing check, or incorrect behavior | **Fix** | + | Asks "why" or questions a design choice | **Explain** | + | Suggests a refactor or alternative approach | **Both** | + | Requests documentation or comment changes | **Fix** | + | Flags a style or convention issue | **Fix** | + | Raises a concern without a specific ask | **Explain** | + +2. **If `response_mode` is `fix`** — generate a code fix for every + thread. If a fix is not applicable (e.g., the comment is a design + question), note this and fall back to an explanation. + +3. **If `response_mode` is `explain`** — generate an explanatory reply + for every thread. If the feedback clearly requires a code change + (e.g., a bug), note this and recommend the user switch to `auto`. + +For each thread, produce: + +- **Analysis**: Why the feedback is valid, partially valid, or based + on a misunderstanding. Be honest — if the reviewer is right, + acknowledge it. If they are wrong, explain why respectfully. +- **Fix** (if applicable): The specific code change, shown as + before/after with at least 3 lines of surrounding context. +- **Explanation** (if applicable): A draft reply to the reviewer + that explains the design decision, tradeoff, or rationale. Keep + it professional, concise, and technical. + +### Phase 4: Output + +#### Document Mode (`output_mode: document`) + +Produce the output following the `pr-comment-responses` format: +1. Thread Summary (by state) +2. Contradiction Report +3. Per-Thread Responses (in file order) +4. Action Summary + +#### Action Mode (`output_mode: action`) + +Execute responses with **mandatory user confirmation at every step**: + +1. **Present the full analysis** (thread summary, contradictions, + per-thread responses) to the user using the document structure. + +2. **For each thread with a code fix**: + a. Show the proposed diff to the user. + b. Ask: "Apply this fix? (yes / skip / edit)" + c. If confirmed, make the code change in the file. + d. Do NOT commit yet — batch all fixes first. + +3. **After all fixes are applied**: + a. Show the user a summary of all changes made. + b. Ask: "Commit and push these changes? (yes / no)" + c. If confirmed, commit with a descriptive message referencing + the review threads addressed, then push. + +4. **For each thread with an explanation**: + a. Show the draft reply to the user. + b. Ask: "Post this reply? (yes / skip / edit)" + c. If confirmed, write the reply payload to `reply.json` and post: + ```json + { + "body": "", + "in_reply_to": + } + ``` + ``` + gh api repos/{owner}/{repo}/pulls/{pr_number}/comments \ + --method POST \ + --input reply.json + ``` + +5. **For threads that were fixed**: + a. Ask: "Resolve these threads? (yes / no)" + b. If confirmed, resolve each thread using: + ``` + gh api graphql \ + -f query='mutation($threadId: ID!) { + resolveReviewThread(input: {threadId: $threadId}) { + thread { isResolved } + } + }' \ + -F threadId="" + ``` + +6. **Never take any action without explicit user confirmation.** + If the user skips all items, produce a document-mode report instead. + +### Phase 5: Handle Edge Cases + +- **No pending threads**: Report "No actionable review threads found" + and list any resolved/outdated threads for reference. +- **Large thread count (>20)**: Process in batches of 10. After each + batch, summarize progress and ask to continue. +- **Outdated threads**: Flag these separately. Ask the user whether + to address them — the code may have already changed to address + the feedback. +- **Threads on deleted files**: Skip with a note explaining the file + no longer exists. + +## Non-Goals + +- Do NOT perform a new code review — focus only on addressing + existing feedback. +- Do NOT modify code beyond what is needed to address review comments. +- Do NOT resolve threads without user confirmation. +- Do NOT dismiss or ignore valid feedback — if a reviewer is correct, + acknowledge it and fix it. +- Do NOT take sides in contradictions — present both positions and + let the user decide. +- Do NOT push commits without explicit user confirmation. + +## Quality Checklist + +Before finalizing, verify: + +- [ ] Every pending thread has a response (fix, explanation, or both) +- [ ] Contradictions across reviewers are explicitly flagged +- [ ] Code fixes show before/after with sufficient context +- [ ] Draft replies are professional, concise, and technical +- [ ] Resolved and outdated threads are accounted for (skipped with reason) +- [ ] In action mode: user confirmation obtained before every mutation +- [ ] Thread states (pending/resolved/outdated) are accurately reported +- [ ] Files modified by fixes are listed in the action summary diff --git a/templates/review-pull-request.md b/templates/review-pull-request.md new file mode 100644 index 0000000..a8f993d --- /dev/null +++ b/templates/review-pull-request.md @@ -0,0 +1,231 @@ + + + +--- +name: review-pull-request +description: > + Review a pull request's diff, commits, and linked issues to produce + a structured code review. Supports document mode (investigation report) + or action mode (post inline review comments via GitHub API). + Language-agnostic by default with optional language-specific focus. +persona: systems-engineer +protocols: + - guardrails/anti-hallucination + - guardrails/self-verification + - guardrails/operational-constraints +format: investigation-report +params: + pr_reference: "Pull request to review — URL, number (e.g., #42), or pasted diff" + review_focus: "What to focus on — e.g., correctness, security, performance, all" + language_focus: "Optional — primary language(s) for language-specific analysis (e.g., C, TypeScript)" + additional_protocols: "Optional — specific protocols to apply (e.g., memory-safety-c, thread-safety)" + context: "What this PR does, which system it affects, any known concerns" + output_mode: "Output mode — 'document' (produce investigation report) or 'action' (post review comments via gh CLI)" +input_contract: null +output_contract: + type: investigation-report + description: > + A structured code review report with per-finding severity, + file/line references, and an overall verdict. In action mode, + findings are posted as inline review comments on the PR. +--- + +# Task: Review Pull Request + +You are tasked with performing a thorough **code review** of a pull +request, analyzing the changes in context — not just the code in +isolation, but the diff, commit history, linked issues, and CI status. + +## Inputs + +**Pull Request**: {{pr_reference}} + +**Review Focus**: {{review_focus}} + +**Language Focus**: {{language_focus}} + +**Additional Protocols to Apply**: {{additional_protocols}} + +**Context**: {{context}} + +**Output Mode**: {{output_mode}} + +## Instructions + +### Phase 1: Gather PR Context + +1. **Read the PR metadata**: + - Title, description, and linked issues or work items + - Author and reviewers + - Target branch and source branch + - CI/CD status (passing, failing, pending) + - Existing review comments and their resolution state + - PR labels and milestone + +2. **Read the diff**: + - Use `gh pr diff` or equivalent to obtain the full diff + - Note which files are added, modified, and deleted + - Note the total size (files changed, lines added/removed) + +3. **Read linked issues** (if any): + - What problem does this PR claim to solve? + - What acceptance criteria are stated? + - Use linked issues to evaluate whether the PR actually addresses + the stated goals + +4. **If language focus is specified**, identify which additional analysis + protocols are relevant (e.g., `memory-safety-c` for C, + `thread-safety` for concurrent code). Apply these in Phase 2. + +### Phase 2: Analyze Changes + +Apply the **anti-hallucination protocol** throughout — base your review +ONLY on the code visible in the diff and any files you read for context. +Do not assume behaviors not visible in the code. + +For each changed file, evaluate: + +#### Correctness +- Does the change accomplish what the PR description claims? +- Are edge cases introduced or left unhandled by the change? +- Do the changes break any existing behavior? Check callers and + dependents of modified functions. +- Are return values and error codes handled correctly in new code? +- If the PR links to an issue, does the change actually fix it? + +#### Safety +- Are there memory safety issues introduced by the change? +- Are there concurrency issues (data races, deadlocks) in new code? +- Are there resource leaks introduced (file handles, connections)? +- Does the change affect initialization or cleanup paths? + +#### Security +- Is new input validated before use in sensitive operations? +- Are there injection risks introduced (SQL, command, path traversal)? +- Are secrets or credentials handled appropriately? +- Does the change widen the attack surface? + +#### Change Quality +- Is the commit history clean and logical? (atomic commits, + meaningful messages, no "fix typo" chains) +- Is the diff minimal — does it change only what is necessary? +- Are there unrelated changes bundled in? +- Is there adequate test coverage for the new behavior? +- Are documentation and comments updated to reflect the change? + +#### If additional protocols are specified +- Apply each specified protocol (e.g., `memory-safety-c`, + `thread-safety`) systematically to the changed code. + +### Phase 3: Produce Findings + +Format each finding as: + +``` +[SEVERITY: Critical|High|Medium|Low|Informational] +File: +Line: +Issue: +Evidence: +Suggestion: +``` + +Group findings by file, ordered by severity within each file. + +### Phase 4: Verdict + +Produce an overall assessment: + +- **Approve**: No Critical or High findings. Medium/Low findings + are acceptable or easily addressed. +- **Approve with suggestions**: No Critical findings. High findings + are minor or have clear fixes. Include specific suggestions. +- **Request changes**: Any Critical finding, or multiple High findings + that indicate systemic issues. State what must change before approval. + +Summarize: +- Total findings by severity +- Top 3 findings ranked by impact +- Whether the PR achieves its stated goal (per linked issues) +- Whether test coverage is adequate for the changes + +### Phase 5: Output + +#### Document Mode (`output_mode: document`) + +Produce the output following the `investigation-report` format. Map +PR review concepts to report sections: + +| Report Section | PR Review Content | +|---|---| +| Executive Summary | Overall verdict + key findings | +| Problem Statement | What the PR claims to change and why | +| Investigation Scope | Files changed, diff size, linked issues | +| Findings | Per-file findings with severity | +| Root Cause Analysis | Omit or use for systemic patterns | +| Remediation Plan | Suggested fixes ordered by priority | +| Prevention | Process suggestions (testing, CI checks) | +| Open Questions | Ambiguities in the PR or linked issues | +| Coverage | Files examined, search method, exclusions, limitations | + +#### Action Mode (`output_mode: action`) + +1. **Present findings** to the user using the document structure above. +2. **Ask the user to confirm** which findings to post as review comments. + Present each finding (or batch by file) and ask: + - Post this comment? (yes / skip / edit) +3. **Post confirmed findings** as inline review comments using a JSON + payload file so `comments` is sent as an array, not a string: + ``` + cat > review.json <<'EOF' + { + "body": "", + "event": "", + "commit_id": "", + "comments": [ + { + "path": "path/to/file.ext", + "body": "", + "line": 123, + "side": "RIGHT" + } + ] + } + EOF + + gh api repos/{owner}/{repo}/pulls/{pr_number}/reviews \ + --method POST \ + --input review.json + ``` + Fetch the head SHA with `gh pr view {pr_number} --json headRefOid --jq .headRefOid` + before constructing the payload. Each inline comment object must include + `path` and comment `body`, plus the review location fields required by + GitHub's API: typically `line` and `side` for a diff comment on the new code. +4. **Never post without explicit user confirmation.** If the user skips + all findings, do not submit a review. + +## Non-Goals + +- Do NOT refactor the code — identify issues, do not rewrite. +- Do NOT review code outside the PR diff unless it is directly called + by or affected by the changed code. +- Do NOT comment on personal style preferences — focus on correctness, + safety, security, and change quality. +- Do NOT merge the PR programmatically. The verdict is advisory. In + action mode, you may post an `APPROVE`, `REQUEST_CHANGES`, or `COMMENT` + review only after explicit user confirmation, and you must not merge. +- Do NOT modify the PR branch or push commits. + +## Quality Checklist + +Before finalizing, verify: + +- [ ] Every finding cites a specific file and line from the diff +- [ ] Every finding has a severity rating +- [ ] Every finding includes a concrete fix suggestion +- [ ] Findings are grouped by file and ordered by severity +- [ ] The verdict is consistent with the findings +- [ ] Linked issues were checked against the actual changes +- [ ] CI status was noted (or stated as unavailable) +- [ ] At least 3 findings have been re-verified against the diff +- [ ] In action mode: user confirmation was obtained before every post