feat: improve reliability of generated agent skills by clay-good · Pull Request #1284 · Fission-AI/OpenSpec

clay-good · 2026-06-30T16:33:32Z

What this does

Upgrades the 11 agent skills OpenSpec generates — the SKILL.md files that tell coding agents how to run the OpenSpec workflow — so an agent can reliably pick the right skill, know when it's done, recover when it gets stuck, and write specs that follow OpenSpec's conventions. It also adds allowed-tools frontmatter so agents stop asking permission on every openspec call, and a validation gate so a malformed skill can't ship.

Nothing an agent actually does changes — same commands, same prompts, same artifacts. Only the instructions get clearer. (Verified by diffing every skill's executed CLI commands against main — identical.)

What it fixes

Measured across all 11 generated skills today:

- Ambiguous triggers: several skills answer the same request ("I want to build X") with nothing telling the agent which to choose.
- No "done" signal: not one skill stated a success condition, so an agent can't tell "finished" from "stalled."
- No recovery: skills named blocked/error states but not how to get unstuck.
- Permission friction: no skill pre-approved the openspec CLI, so agents prompt on every call.
- Spec-writing rules stranded in the docs (issue #1289, "Concepts from docs not included in the skills"): the guidance for what belongs in a spec lived only in docs/, so agents drafting specs never followed it.

Before → after (if merged)

| | Before | After |
| --- | --- | --- |
| Skills that say when to use them vs. their siblings | 2 / 11 | 11 / 11 |
| Skills that state a success / "done" condition | 0 / 11 | 11 / 11 |
| Skills with named failure + concrete recovery | 0 / 11 | 11 / 11 |
| Skills that hand off to the next skill | 0 / 11 | 11 / 11 |
| `openspec` CLI pre-approved (no permission prompts) | no | yes |
| Spec-writing conventions reach the drafting agent | no | yes |
| A malformed skill can be generated | yes | no (blocked by a gate) |

An objective conformance scorecard (printed on every test run) goes from 33/81 → 81/81 checks passing.

Why it's safe to merge

- Behavior preserved: commands, prompts, and artifacts are unchanged; proven by a command-level diff of all 11 skills against main.
- Tests green: full suite passes except one pre-existing zsh-installer failure that also fails on main (unrelated shell-completion test).
- allowed-tools is pure upside: agents that honor it stop prompting; agents that ignore it are unaffected.
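
A command-level diff of the kind described above could be approximated as follows. This is only a sketch, not the script used for this PR: `extractCliCommands` and `sameCommands` are hypothetical names, and the assumption that each command starts a line with `openspec ` (optionally after a `$ ` prompt) is ours.

```typescript
// Hypothetical helper (not from the PR): collect every `openspec ...` invocation
// that appears at the start of a line in a skill body, ignoring a leading `$ ` prompt.
export function extractCliCommands(skillBody: string): string[] {
  const commands: string[] = [];
  for (const rawLine of skillBody.split("\n")) {
    const line = rawLine.trim().replace(/^\$\s+/, "");
    if (line.indexOf("openspec ") === 0) {
      // Collapse runs of whitespace so formatting changes do not count as differences.
      commands.push(line.replace(/\s+/g, " "));
    }
  }
  return commands;
}

// True when both versions of a skill would run the same commands in the same order.
export function sameCommands(before: string, after: string): boolean {
  const a = extractCliCommands(before);
  const b = extractCliCommands(after);
  return a.length === b.length && a.every((command, i) => command === b[i]);
}
```

Running `sameCommands` on each skill's body from main and from the branch would flag any skill whose instructions were rewritten in a way that changes what the agent executes.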

Notes for review

- One planned piece, AGENTS.md guidance for agents that don't load skills, was dropped because OpenSpec no longer generates that file (legacy-cleanup deletes it as obsolete), so there's nowhere to put it.
- Command (slash-command) templates are intentionally left unchanged; no spec requirement covers them and they're independently rewritable.
- tasks.md has the full requirement-by-requirement coverage matrix if you want the details.

🤖 Generated with Claude Code

… skills

Add an OpenSpec change proposal (proposal/design/tasks + spec delta) that establishes a quality contract for the 11 generated agent skills: trigger disambiguation, canonical structure, explicit success criteria, named failure recovery, single-source skill/command generation, shared-snippet reuse, lean always-on body, and cross-skill navigation.

Proposal only: no skill code or CLI behavior changes in this PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-06-30T16:33:53Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

📝 Walkthrough

Adds planning docs for a unified skill-authoring-conventions contract, plus related bundle, validation, and agent-guidance specs. The proposal, design, capability spec, and task list define shared instruction structure, deduplication rules, allowed-tools, publishable bundle requirements, and validation gates.

Changes

skill-authoring-conventions planning documents

| Layer / File(s) | Summary |
| --- | --- |
| Proposal: gaps and planned changes `openspec/changes/improve-skill-instructions/proposal.md` | Describes instruction-quality gaps, the unified contract, canonical section structure, shared generation, lean bodies, related navigation, validation, and non-goals. |
| Design: canonical structure and deduplication `openspec/changes/improve-skill-instructions/design.md` | Defines the procedural layout, preserves `explore` and `onboard`, and specifies shared snippets, deduplication, worked examples, `allowed-tools`, and cross-platform notes. |
| Bundle validation and AGENTS guidance `openspec/changes/improve-skill-instructions/specs/skill-distribution/spec.md`, `openspec/changes/improve-skill-instructions/specs/docs-agent-instructions/spec.md` | Adds publishable-bundle requirements, validation-before-publish gating, listing metadata, and regenerated `openspec/AGENTS.md` guidance. |
| Capability spec: authoring requirements `openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md` | States the authoring rules for trigger disambiguation, section ordering, success criteria, recovery behavior, shared sources, references, navigation, and conformance validation. |
| Task checklist for implementation `openspec/changes/improve-skill-instructions/tasks.md` | Lists the implementation work for shared constants, workflow refactoring, canonical rewrites, variants, validation gates, bundle metadata, and test coverage. |

Estimated code review effort: 2 (Simple) | ~10 minutes

Possibly related PRs

Fission-AI/OpenSpec#564: Updates the workflow skill generation path that these new skill-authoring and validation rules target.
Fission-AI/OpenSpec#719: Also touches propose skill/command generation, which is standardized here via shared instruction guidance.

Suggested reviewers: TabishB

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the PR’s main goal of improving generated agent skills, even if it is broader than the specific authoring-spec changes. |

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md (1)
73-79: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Reconcile unconditional requirement with conditional scenario.

The requirement states "Each skill SHALL reference the related or next skill" unconditionally, but the scenario's WHEN clause only triggers "when a natural next or sibling skill exists." This leaves terminal skills (e.g., feedback) without a defined behavior. Either:

Add a scenario covering the absence of a related skill, or

Soften the requirement to "SHALL where a natural next or sibling skill exists."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md`
around lines 73 - 79, The Cross-Skill Navigation requirement is unconditional
while the scenario in the skill-authoring-conventions spec only applies when a
natural next or sibling skill exists. Update the Requirement: Cross-Skill
Navigation text and/or add a complementary scenario in the same spec so terminal
skills like feedback have explicit behavior, using the existing requirement and
scenario wording as the anchor. Make the policy consistent by either qualifying
the requirement with “where a natural next or sibling skill exists” or adding an
absence case that defines what terminal skills should do.
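
The second option (qualifying the requirement so it applies only where a natural next or sibling skill exists) might look like this in a generator. This is a hedged sketch: the `RELATED` map, the `relatedLine` function, and the `Related: <skill>` output format are all hypothetical; the example pairs are the ones named in tasks.md, and `feedback` stands in for a terminal skill.

```typescript
// Hypothetical map of skill -> natural next/sibling skill (illustrative pairs only).
const RELATED: Record<string, string | undefined> = {
  propose: "apply",
  verify: "archive",
  "new-change": "continue",
  feedback: undefined, // terminal skill: no natural successor
};

// Returns the Related line for a skill, or null when no natural successor exists,
// so terminal skills have explicit, defined behavior instead of an unmet requirement.
export function relatedLine(skill: string): string | null {
  // Own-property check avoids picking up inherited keys such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(RELATED, skill)) return null;
  const next = RELATED[skill];
  return next ? "Related: " + next : null;
}
```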

🧹 Nitpick comments (2)

openspec/changes/improve-skill-instructions/design.md (1)
31-39: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Add language specifier to fenced code block.

Satisfy markdownlint MD040 by tagging the structural diagram as text (or markdown). No semantic change.
-```
+```text
 Use when     — one line; includes the sibling boundary
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openspec/changes/improve-skill-instructions/design.md` around lines 31 - 39,
The fenced structural diagram in the skill instructions is missing a language
tag and needs to be annotated to satisfy markdownlint MD040. Update the fenced
block in the design document so the diagram is explicitly marked as text (or
markdown) while keeping the content unchanged; use the existing fenced section
containing the “Use when”, “Inputs”, “Steps”, and “Guardrails” headings as the
target.
Source: Linters/SAST tools

openspec/changes/improve-skill-instructions/proposal.md (1)
3-3: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Repetitive use of "right" weakens prose.

Three instances in one sentence dilute impact. Vary the wording: e.g., "correct skill," "proper steps," "intended place."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openspec/changes/improve-skill-instructions/proposal.md` at line 3, The
sentence in the OpenSpec proposal repeats “right” three times, making the prose
feel repetitive and weak. Revise that sentence in the proposal text to vary the
wording while preserving meaning, using distinct phrasing such as “correct
skill,” “proper steps,” and “intended place” so the opening reads more cleanly.
Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openspec/changes/improve-skill-instructions/proposal.md`:
- Line 59: The proposal text has a wording typo in the new capability spec
reference: update the phrase in the skill-authoring-conventions entry from “on
archive” to “on disk” or “in the repository.” Locate the bullet mentioning
openspec/specs/skill-authoring-conventions and correct the description so it
matches the repository context.

In
`@openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md`:
- Line 25: The skill-authoring convention text currently uses a lowercase
section name that conflicts with the proposal’s title-case naming. Update the
instructions in the spec so the required sections are named consistently as
title-case labels, matching the proposal’s “Use when / Inputs / Steps / Success
/ Failure & recovery / Guardrails / Related” ordering, and keep this wording
aligned wherever the section list is referenced so generators and tests can
match it deterministically.

In `@openspec/changes/improve-skill-instructions/tasks.md`:
- Line 31: The task item overstates that every skill must have a Related line,
but terminal or isolated skills may not have a natural successor. Update the
wording in the task list entry to explicitly scope it to skills that have a
natural workflow successor, or add a short exception list for terminal cases
such as feedback; keep the change aligned with the related skill-instruction
spec language.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2afb5da9-8590-4a2f-95a4-fe390e1ad158

📥 Commits

Reviewing files that changed from the base of the PR and between 546224e and 96c3e97.

📒 Files selected for processing (4)

openspec/changes/improve-skill-instructions/design.md
openspec/changes/improve-skill-instructions/proposal.md
openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md
openspec/changes/improve-skill-instructions/tasks.md

coderabbitai · 2026-06-30T16:36:51Z

+- `src/core/templates/workflows/*.ts` — rewrite the 11 workflow instruction strings (and feedback) to the new conventions; collapse each skill/command pair onto one instruction source.
+- `src/core/templates/workflows/store-selection.ts` (and likely new sibling snippet modules) — house the shared change-selection, artifact-loop, and context/rules guardrail blocks.
+- `src/core/shared/skill-generation.ts` / `src/core/templates/skill-templates.ts` — adjust the assembly so skill and command derive from one source.
+- `openspec/specs/skill-authoring-conventions/` — new capability spec created on archive.


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Fix typo: "on archive" → "on disk" (or "in the repository").

"On archive" does not fit the context of creating a new spec directory.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openspec/changes/improve-skill-instructions/proposal.md` at line 59, The proposal text has a wording typo in the new capability spec reference: update the phrase in the skill-authoring-conventions entry from “on archive” to “on disk” or “in the repository.” Locate the bullet mentioning openspec/specs/skill-authoring-conventions and correct the description so it matches the repository context.

coderabbitai · 2026-06-30T16:36:52Z

+
+## 4. Cross-skill navigation
+
+- [ ] 4.1 Add a Related line to every skill pointing to its natural next/sibling (e.g. `propose` → `apply`, `verify` → `archive`, `new-change` → `continue`)


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Clarify terminal skills without a natural next/sibling.

"Every skill" includes terminal or isolated skills that may not have a meaningful next step. Either enumerate exceptions (e.g., feedback) or change to "every skill that has a natural workflow successor," matching the spec's conditional scenario.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openspec/changes/improve-skill-instructions/tasks.md` at line 31, The task item overstates that every skill must have a Related line, but terminal or isolated skills may not have a natural successor. Update the wording in the task list entry to explicitly scope it to skills that have a natural workflow successor, or add a short exception list for terminal cases such as feedback; keep the change aligned with the related skill-instruction spec language.

…eservation contract

- Correct duplication/size figures to measured values (onboard 543, bulk-archive 237, verify 160, explore 278 instruction lines; skill/command overlap 89-100% for 9 of 11 pairs; propose body 87% identical to ff-change).
- Add an audit-evidence table and worked before/after examples (trigger disambiguation, explicit success, failure recovery) to design.md.
- Add a Behavior Preservation requirement and tighten the single-source and lean-body scenarios to be normalizable/testable.
- Add behavior-preservation and single-source identity validation tasks.

Strict-validated with the repo CLI.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

alfred-openspec

This proposal is solid. The measured audit evidence plus behavior-preservation contract makes this safe to take forward, and the single-source plan matches the existing template drift risk.

…ribution, and AGENTS.md guidance

Broaden the proposal from instruction quality to making OpenSpec's skills first-class Agent Skills packages and getting them listable in a public directory:

- skill-authoring-conventions: add standard-conformance and a generation/CI validation gate; anchor the lean-body rule to the standard's <500-line / ~5000-token budget with references/ split (onboard is the one over-budget body).
- skill-distribution (new capability): a validated, publishable bundle and a documented listing checklist.
- docs-agent-instructions (modified): openspec/AGENTS.md advertises the skills and the deterministic CLI so non-skill-loading agents follow the same workflow.

Notes: agents.sh is a voice product, not the registry; the target is the Agent Skills standard (agentskills.io) and the skills.sh directory. Verified all 11 skill names already satisfy name==folder and the charset rules; deliberately orthogonal to add-tool-command-surface-capabilities (no layout/delivery change). Strict-validated; 3 spec deltas.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai

🧹 Nitpick comments (1)

openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md (1)
73-77: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Clarify or relax the "one level deep" path constraint.

Requiring reference links to be references/ files at "a relative path one level deep" is brittle if skill folders are nested or reorganized. Either explain why the depth matters (e.g., standard-mandated layout), or rephrase to require a stable relative path from SKILL.md without prescribing depth.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md`
around lines 73 - 77, Update the “Reference material in on-demand files”
scenario in skill-authoring-conventions so the link requirement is less brittle:
either justify the “one level deep” constraint or change it to require a stable
relative link from SKILL.md without mandating directory depth. Keep the existing
references/ guidance and adjust the scenario text so authors can place linked
material using the relevant relative path while preserving the rule that the
body remains readable without opening the reference file.
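
If the "one level deep" constraint is kept, a validator could express it as a simple pattern check. The sketch below is an assumption about how that might be done, not the PR's code: `isOneLevelReference` is a hypothetical name, and restricting references to `.md` files is our simplification.

```typescript
// Hypothetical check: a reference link is accepted only when it is a relative path
// of the form references/<file>.md, i.e. exactly one directory below SKILL.md.
export function isOneLevelReference(link: string): boolean {
  return /^references\/[^/]+\.md$/.test(link);
}
```

Relaxing the rule to "a stable relative path from SKILL.md", as the comment suggests, would amount to loosening this pattern (for example, allowing further `/` segments) while still rejecting absolute paths and `../` escapes.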

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b5555378-acd1-4754-84df-87178d86198c

📥 Commits

Reviewing files that changed from the base of the PR and between 3561e72 and 77ca885.

📒 Files selected for processing (6)

openspec/changes/improve-skill-instructions/design.md
openspec/changes/improve-skill-instructions/proposal.md
openspec/changes/improve-skill-instructions/specs/docs-agent-instructions/spec.md
openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md
openspec/changes/improve-skill-instructions/specs/skill-distribution/spec.md
openspec/changes/improve-skill-instructions/tasks.md

✅ Files skipped from review due to trivial changes (4)

openspec/changes/improve-skill-instructions/specs/skill-distribution/spec.md
openspec/changes/improve-skill-instructions/design.md
openspec/changes/improve-skill-instructions/tasks.md
openspec/changes/improve-skill-instructions/proposal.md

- allowed-tools: each skill declares its toolset and emits the standard's allowed-tools frontmatter; Bash scoped to Bash(openspec:*) for CLI-only skills, unrestricted Bash only for apply-change/onboard (arbitrary commands). Declared set is a validated superset of body usage, so strict-allowlist agents never block a needed tool and ignoring agents are unaffected (pure upside).
- New requirement + scenarios in skill-authoring-conventions; design rationale for the asymmetric-risk decision; tasks; validation-gate covers tool coverage.
- Coherence pass: clarify that conformance/distribution/allowed-tools target the 11 generated SKILL.md skills; feedback is held to the authoring bar only.

3 deltas, 14 reqs / 30 scenarios, strict-valid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
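
The "declared set is a validated superset of body usage" rule can be illustrated with a small check. This is a sketch under stated assumptions rather than the PR's implementation: `coversUsage` and `allowedToolsLine` are hypothetical names, the tool name `Read` used in examples is illustrative, and rendering the frontmatter value as a space-separated list is an assumption about the output format.

```typescript
// True when every tool the skill body uses is covered by the declared set.
// Assumption: unrestricted "Bash" covers any scoped "Bash(...)" entry.
export function coversUsage(declared: string[], used: string[]): boolean {
  return used.every((tool) => {
    if (declared.indexOf(tool) !== -1) return true;
    return tool.indexOf("Bash(") === 0 && declared.indexOf("Bash") !== -1;
  });
}

// Assumption: the frontmatter value is a space-separated list of tool names.
export function allowedToolsLine(declared: string[]): string {
  return "allowed-tools: " + declared.join(" ");
}
```

For a CLI-only skill, `allowedToolsLine(["Read", "Bash(openspec:*)"])` would yield a line that pre-approves only the `openspec` binary, while a skill that runs arbitrary build or test commands would declare plain `Bash`.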

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md (1)
18-18: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Use title-case for section names.

The spec still uses lowercase "use when" here. Per the design's canonical structure (line 53), section names are title-case: Use when / Inputs / Steps / Success / Failure & recovery / Guardrails / Related. Use consistent title-case section names so generators and validators can match them deterministically. This applies to line 25 as well.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md`
at line 18, Update the spec text in the relevant clause and any matching
references so section names use title-case consistently; specifically, replace
the lowercase “use when” wording in the affected requirement with “Use when,”
and align the other section heading mention near the same area to the canonical
title-case names used by the design structure. Keep the wording deterministic so
generators and validators can match section names like Use when, Inputs, Steps,
Success, Failure & recovery, Guardrails, and Related.
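
To show why exact casing matters for deterministic matching, here is a minimal sketch of a section-order check. It is illustrative only: `sectionProblems` is a hypothetical name, the section list is the one quoted in the comment above, and the assumption that each section label starts a line (optionally behind Markdown `#` or `*` markers) is ours.

```typescript
// Canonical section names, in order, as quoted in the review comment.
const CANONICAL_SECTIONS = [
  "Use when",
  "Inputs",
  "Steps",
  "Success",
  "Failure & recovery",
  "Guardrails",
  "Related",
];

// Index of the first line that begins with `name` (case-sensitive), or -1.
function firstLineStartingWith(lines: string[], name: string): number {
  let found = -1;
  lines.forEach((line, i) => {
    if (found === -1 && line.indexOf(name) === 0) found = i;
  });
  return found;
}

// Lists canonical sections that are missing or appear out of order.
export function sectionProblems(body: string): string[] {
  // Strip leading Markdown heading/bold markers so "## Steps" and "**Steps**" both match.
  const lines = body.split("\n").map((line) => line.replace(/^[#*\s]+/, ""));
  const problems: string[] = [];
  let previous = -1;
  for (const name of CANONICAL_SECTIONS) {
    const index = firstLineStartingWith(lines, name);
    if (index === -1) {
      problems.push("missing: " + name);
    } else if (index < previous) {
      problems.push("out of order: " + name);
    } else {
      previous = index;
    }
  }
  return problems;
}
```

Because the comparison is case-sensitive, a body that labels its first section "use when" is reported as missing "Use when", which is the mismatch the comment warns about.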

♻️ Duplicate comments (1)

openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md (1)
25-25: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Use title-case for section names.

The scenario still lists 'a "use when" line' in lowercase. Align with the design's canonical structure (line 53) using title-case Use when, and ensure Inputs is also capitalized for consistency within the same list.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md`
at line 25, The canonical instruction sequence in the scenario still uses
lowercase section labels; update the wording in the specification so the listed
sections match the design’s title-case convention. In the relevant requirement
text, change the “use when” entry to “Use when” and ensure “Inputs” remains
capitalized, keeping the rest of the ordered list aligned with the same
title-case style.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5f8d4c0f-014a-4cd5-b90a-fb65bfacdb75

📥 Commits

Reviewing files that changed from the base of the PR and between 77ca885 and 6b362a8.

📒 Files selected for processing (4)

openspec/changes/improve-skill-instructions/design.md
openspec/changes/improve-skill-instructions/proposal.md
openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md
openspec/changes/improve-skill-instructions/tasks.md

🚧 Files skipped from review as they are similar to previous changes (2)

openspec/changes/improve-skill-instructions/tasks.md
openspec/changes/improve-skill-instructions/proposal.md

coderabbitai · 2026-06-30T19:55:54Z

+#### Scenario: CLI bash pre-approved and narrowly scoped
+- **WHEN** a skill invokes the OpenSpec CLI through a shell tool
+- **THEN** its `allowed-tools` SHALL pre-approve the OpenSpec CLI invocation scoped to that binary (for example `Bash(openspec:*)`)
+- **AND** unrestricted shell access SHALL be declared only for skills that run arbitrary build or test commands (for example the implementation skill)


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Clarify which skill is "the implementation skill."

The spec uses "the implementation skill" as the example for unrestricted Bash, but the design explicitly names apply-change and onboard. Use the actual skill name (apply-change) or clarify that "implementation skill" refers to it, so the generator and validation have an unambiguous target.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openspec/changes/improve-skill-instructions/specs/skill-authoring-conventions/spec.md` at line 127, The unrestricted shell access rule is ambiguous because it refers to “the implementation skill” instead of the explicitly named skill. Update the wording in the skill-authoring conventions spec to use apply-change directly, or clearly define that “implementation skill” means apply-change, so generators and validators have a single unambiguous target.

…ission-AI#1289)

Address issue Fission-AI#1289: docs/concepts.md's "What a Spec Is (and Is Not)" guidance (what belongs in a spec vs. what to keep out) never reaches the skills that draft specs, so agents write implementation-laden specs unless separately instructed.

Add a SPEC_CONTENT_GUIDANCE shared snippet, sourced from concepts.md and embedded by the spec-authoring skills (propose, ff-change, continue-change, sync-specs), plus a new "Embedded Spec-Content Guidance" requirement in skill-authoring-conventions and a test asserting the snippet stays aligned with the docs. Proposal, design, and tasks updated to match.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…guidance

A doc-vs-skill audit found Fission-AI#1289 (spec-content guidance stranded in the docs) is one instance of a class: rules that shape artifact quality live only in docs/ and never reach the skills that draft artifacts, so agents don't follow them unless told. Confirmed absent from the templates by grep: right-sized rigor (Lite/Full), RFC-2119 keyword meanings, scenario quality (edge cases), and delta conventions (MODIFIED shows prior value, REMOVED says why).

Generalize the requirement "Embedded Spec-Content Guidance" into "Embedded Authoring Guidance" (5 scenarios) covering the whole class, add a SPEC_CONVENTIONS_GUIDANCE shared snippet alongside SPEC_CONTENT_GUIDANCE, and require AGENTS.md (docs-agent-instructions) to carry the same conventions so non-skill agents get them too. Design gains an audit table plus two deliberately out-of-scope divergences (enabler-graph vs. gate wording; update-vs-fresh heuristics, owned by add-update-workflow).

Now 15 requirements / 36 scenarios; strict-valid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Incorporate maintainer direction on the skills architecture:

- Duplication across skill files is intentional (self-contained skills → independent rewrites), so drop the single-source/DRY pillar. Item 4 becomes "self-contained skills, shared conventions by reference"; the spec requirement, design principles/decisions/alternatives, and tasks (no single-source refactor, no extracted procedure constants) follow.
- Favor design/behavior guidance over procedure-heavy "if this then that" skills. Item 2 becomes guidance-first; the canonical structure's Steps section becomes Guidance, with deep/exact procedure moved to references/.
- Deliver the Fission-AI#1289-class authoring guidance as a proposal-writing reference the artifact-drafting skills link to (item 12), not inline shared snippets. AGENTS.md carries the same reference for non-skill agents. Tests assert the reference matches concepts.md and that skills link to it.

Three architecture principles now stated up front in What Changes and design. Still 15 requirements / 36 scenarios; strict-valid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…rewrite

Start implementing skill-authoring-conventions in this PR, with a measured before/after as proof it earns its place.

- Add a conformance scorer (src/core/shared/skill-conformance.ts) that scores every skill against the conventions on objective signals (trigger boundary, success criteria, failure & recovery, guardrails, related-skills, body budget, authoring-reference link) and prints a scorecard.
- Add the authoring-conventions reference (the proposal-writing reference, src/core/templates/workflows/authoring-conventions.ts) — compact form of docs/concepts.md (belongs/avoid, rigor, RFC-2119 meanings, scenario quality, delta conventions). Closes the Fission-AI#1289 class.
- Emit it on disk: getSkillReferenceFiles + init/update write references/ for exactly the skills that link it (verified e2e: openspec init emits openspec-propose/references/authoring-conventions.md; new-change gets none).
- Rewrite the create-a-change family (new/propose/ff/continue) and sync-specs skills to the conventions — trigger boundaries, Use when/Inputs/Success/Failure & recovery/Related, and the reference link for the spec-authoring ones. Behavior preserved (same commands/prompts/artifacts); command templates unchanged (self-contained, independently rewritable).
- Regenerate golden template hashes; add skill-conformance test.

Measured efficacy: convention checks passing rose 33/81 -> 57/81 across the 11 skills; the five rewritten skills now score full marks (7/7 or 8/8). Full suite green except a pre-existing, environment-specific zsh-installer failure (fails identically on baseline).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Complete the implementation across all 11 skills and the cross-cutting infra.

Skills (behavior-preserving, command templates unchanged):
- Rewrote apply, archive, bulk-archive, verify (procedural), explore (stance), onboard (tutorial), and feedback to the conventions — trigger boundaries, Use when/Inputs/Success/Failure & recovery/Related. Combined with the earlier create-a-change family + sync-specs, all 11 skills now conform.

allowed-tools (item 11):
- Declared per-skill toolsets (skill-tools.ts); generateSkillContent emits allowed-tools frontmatter. Bash scoped to Bash(openspec:*) for CLI-only skills; unrestricted Bash only for apply and onboard.

Conformance gate (item 8) + distribution (item 9/skill-distribution):
- validateSkillConformance enforces frontmatter validity, name==folder, resolvable references, and declared tools as hard errors; body budget is a warning. Wired into init/update (fail rather than write a bad skill) and covered in CI. Bundle-validation test + docs/skill-distribution.md checklist.

Efficacy: convention checks 33/81 -> 80/81 (the one miss is onboard's over-budget body, a documented warning). Full suite green except the pre-existing env-specific zsh-installer failure. Regenerated all golden hashes.

BLOCKED / flagged for maintainer: docs-agent-instructions (AGENTS.md, item 10) is left unbuilt because the codebase removed openspec/AGENTS.md generation and legacy-cleanup deletes it as obsolete; re-introducing it would contradict that direction. See tasks.md §9.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
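To make the allowed-tools item concrete, here is a minimal sketch of emitting that frontmatter from a per-skill tool list. It is not the PR's `generateSkillContent`: `renderFrontmatter`, `SkillMeta`, and the comma-separated layout are hypothetical, and the value of `CLI` is inferred from the "Bash scoped to Bash(openspec:*)" wording above.

```typescript
// Hypothetical sketch; names and layout are illustrative, not the PR's code.
const CLI = 'Bash(openspec:*)'; // assumed scoped-Bash token, per the commit message

interface SkillMeta {
  name: string;
  description: string;
  allowedTools: string[];
}

// Builds a YAML frontmatter block that declares the skill's pre-approved tools.
function renderFrontmatter(meta: SkillMeta): string {
  return [
    '---',
    `name: ${meta.name}`,
    // JSON.stringify yields a double-quoted string, which is also valid YAML.
    `description: ${JSON.stringify(meta.description)}`,
    // Comma-separated list: an assumption about the target agent's format.
    `allowed-tools: ${meta.allowedTools.join(', ')}`,
    '---',
  ].join('\n');
}

const frontmatter = renderFrontmatter({
  name: 'openspec-new-change',
  description: 'Start a new OpenSpec change',
  allowedTools: [CLI, 'Read', 'AskUserQuestion', 'TodoWrite'],
});
console.log(frontmatter);
```

Scoping Bash to the `openspec` prefix rather than granting unrestricted Bash keeps the pre-approval narrow. This is why only apply and onboard are described as needing the wider grant.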

…er-body, drop AGENTS.md

Methodical pass to fully satisfy the spec deltas:

- Lean body: moved deep reference material out of the skill bodies into emitted references/ files — onboard artifact skeletons (references/onboarding-artifact-templates.md, body now 456 lines, under the 500 budget), sync-specs delta-format (references/delta-format.md), bulk-archive conflict examples (references/conflict-resolution.md), verify dimension detail (references/verification-dimensions.md). Generalized getSkillReferenceFiles to a REFERENCE_REGISTRY; onboard's *command* keeps the skeletons inline (self-contained, no references/ dir).
- Declared tools cover body usage (6.3): the gate now fails if a body uses an unambiguous tool token (AskUserQuestion/TodoWrite/Grep/Glob/WebFetch/WebSearch) not in the declared allowed-tools.
- Reference/docs drift (10.7): a test asserts the authoring-conventions reference and docs/concepts.md share the same anchor items.
- Dropped the docs-agent-instructions capability: OpenSpec removed AGENTS.md generation (legacy-cleanup deletes openspec/AGENTS.md as obsolete), so there is no always-on surface to target. Spec delta deleted; proposal/design/tasks updated. Always-on guidance can return in a separate change once a surface exists.

Result: conformance scorecard 33/81 -> 81/81 (all skills fully conformant); 2 spec deltas, 14 requirements / 32 scenarios, strict-valid; full suite green except the pre-existing env-specific zsh-installer failure. Golden hashes regenerated (only the changed skills + onboard command; other command templates byte-identical).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
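The "declared tools cover body usage" check can be pictured roughly as follows. This is a sketch under assumptions, not the gate's actual code: `findUndeclaredTools` is a hypothetical name, and whole-word matching is one plausible reading of "unambiguous tool token". The token list is the one given in the commit message.

```typescript
// Hypothetical sketch of the coverage check; not the PR's implementation.
const UNAMBIGUOUS_TOOL_TOKENS = [
  'AskUserQuestion',
  'TodoWrite',
  'Grep',
  'Glob',
  'WebFetch',
  'WebSearch',
];

// Returns tool tokens that appear in the skill body but are not declared.
function findUndeclaredTools(body: string, declared: string[]): string[] {
  const declaredSet = new Set(declared);
  return UNAMBIGUOUS_TOOL_TOKENS.filter((token) => {
    // Whole-word match, so "Globally" does not count as a use of "Glob".
    const usedInBody = new RegExp(`\\b${token}\\b`).test(body);
    return usedInBody && !declaredSet.has(token);
  });
}

const missing = findUndeclaredTools(
  'Prompt with the AskUserQuestion tool, then Grep the specs.',
  ['Read', 'Grep'],
);
console.log(missing);
```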

…f-scope

Methodical verification that the spec is fully built out:

- Added a requirement-by-requirement coverage matrix (14/14 requirements, all scenarios) mapping each to concrete code/test evidence.
- Behavior Preservation proven: diffed the executed CLI command set of all 11 skills against the pre-rewrite baseline — identical for every skill (the apparent verify deltas are prose in the new recovery section, not executed commands); user-facing prompts preserved; per-skill behavioral specs hold.
- Clarified that slash-command templates are out of the spec's scope (no scenario governs them) — not an open task; the 10 unchanged command templates stay byte-identical by design.

Scorecard 81/81; 2 deltas, 14 req / 32 scenarios; strict-valid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
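A behavior-preservation diff of this kind could look something like the sketch below. How the PR actually extracted "executed" commands is not shown in the thread. This version assumes a command counts as executed only when it sits on its own line inside a fenced block, which is why `openspec` mentions in recovery prose do not register as deltas. `extractCommands` and `diffCommands` are hypothetical names.

```typescript
// Hypothetical sketch; the extraction rule is an assumption, not the PR's method.
const FENCE = '`'.repeat(3); // fence marker built indirectly to keep this block readable

// Collects lines starting with "openspec " that appear inside fenced blocks.
function extractCommands(skillText: string): Set<string> {
  const commands = new Set<string>();
  let inFence = false;
  for (const rawLine of skillText.split('\n')) {
    const line = rawLine.trim();
    if (line.startsWith(FENCE)) {
      inFence = !inFence; // toggle on every opening or closing fence
      continue;
    }
    if (inFence && line.startsWith('openspec ')) {
      commands.add(line);
    }
  }
  return commands;
}

// Compares the command sets of two versions of a skill.
function diffCommands(before: string, after: string): { added: string[]; removed: string[] } {
  const a = extractCommands(before);
  const b = extractCommands(after);
  return {
    added: Array.from(b).filter((c) => !a.has(c)).sort(),
    removed: Array.from(a).filter((c) => !b.has(c)).sort(),
  };
}
```

An empty `added` and `removed` for every skill would correspond to the "identical for every skill" claim above.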

alfred-openspec

Reviewed the latest skill-authoring updates through e46c3e4, including the bda1816 spec-content guidance change. The direction looks right: artifact-drafting skills now link to a shared authoring-conventions reference instead of duplicating docs guidance, and the proposal/spec/tasks are aligned with the self-contained-skill architecture.

Verified locally: targeted authoring/conformance/parity tests pass, build passes, and openspec validate improve-skill-instructions --strict passes.

TabishB

Ok this PR is super rough. I think we need to start over here and make more intentional, deliberate changes, a bit at a time. A lot of it just makes the skill bulkier and I can't really see it adding much value.

The main part I like is the tools work and auto-approving OpenSpec commands.

TabishB · 2026-07-03T12:55:54Z

+/**
+ * Scores one skill's description + instructions against the conventions.
+ */
+export function scoreSkillConformance(input: ConformanceInput): ConformanceResult {


I'm not sure I agree that these are the key ingredients of every single skill. Not every single skill is implicit or explicit in nature. Different types of skills serve different purposes.

i.e., some skills are implicitly triggered vs some are explicit.

These seem focused on implicit skills (which we don't really have in OpenSpec to begin with). Even if these were implicit, I don't think they would follow the same pattern.

TabishB · 2026-07-03T12:58:03Z


 ${STORE_SELECTION_GUIDANCE}

+**Use when:** the user wants to write code and check off a change's tasks. To confirm the work is correct without modifying tasks, use \`openspec-verify-change\`; to create missing artifacts (proposal, design, tasks) rather than implement them, use \`openspec-continue-change\`.


As mentioned in the call yesterday, "Use when" as part of the instruction makes no sense. By the time the instruction is loaded, the agent has already chosen to invoke the skill.

In general, a lot of the skills at the moment are expected to be explicitly triggered rather than implicitly triggered.

Not to mention, this just doubles up on the description above anyway.

TabishB · 2026-07-03T12:58:49Z


+**Use when:** the user wants to write code and check off a change's tasks. To confirm the work is correct without modifying tasks, use \`openspec-verify-change\`; to create missing artifacts (proposal, design, tasks) rather than implement them, use \`openspec-continue-change\`.
+
+**Inputs:** optionally a change name. If omitted, infer it from conversation context; auto-select when only one active change exists; if vague or ambiguous you MUST run \`openspec list --json\` and prompt for available changes.


This just seems to double up on the Input section below?

TabishB · 2026-07-03T13:01:37Z

+
+**Failure & recovery**
+- **Ambiguous or missing change name:** run \`openspec list --json\` and prompt with the AskUserQuestion tool; never guess.
+- **\`state: "blocked"\` (missing artifacts):** stop implementing and invoke \`openspec-continue-change\` to create the missing artifacts, then re-run the apply instructions.


A user might not have this skill installed. I'm not sure if continue would be the right thing to do here either?

I would expect this to be a soft warning with a prompt that asks to proceed.

TabishB · 2026-07-03T13:02:31Z

- **Allows artifact updates**: If implementation reveals design issues, suggest updating artifacts - not phase-locked, work fluidly`,
+- **Allows artifact updates**: If implementation reveals design issues, suggest updating artifacts - not phase-locked, work fluidly
+
+**Success:** every task in the tasks file is checked \`- [x]\`, and \`openspec instructions apply --change "<name>" --json\` reports \`state: "all_done"\` with 0 remaining tasks.


I'm not sure this is the real success criterion. Ideally, success is that the change is implemented as expected, the tasks are ticked off, the result matches the specs, etc.

TabishB · 2026-07-03T13:06:29Z


 ${STORE_SELECTION_GUIDANCE}

+**Use when:** the user wants to finalize a single completed change - sync its delta specs and move it to the archive. To sync main specs without archiving (keeping the change active), use \`openspec-sync-specs\`; to archive several changes in one run, use \`openspec-bulk-archive-change\`.


OK, I'm sensing a theme here: I think the assumptions that the agent has put into this, coming from the model, are just not right. It's repeated the same mistake here as above. We also have no clue whether this is actually what's missing in the skills.

There's no empirical or anecdotal evidence for the need for these additional sections. It doesn't feel tied to anything in particular, and we can't really prove this makes it better or worse.

TabishB · 2026-07-03T13:14:51Z

+export const SKILL_TOOLS: Record<string, string[]> = {
+  'openspec-explore': [CLI, 'Read', 'Grep', 'Glob', 'Write', 'Edit', 'AskUserQuestion', 'TodoWrite'],
+  'openspec-new-change': [CLI, 'Read', 'AskUserQuestion', 'TodoWrite'],
+  'openspec-continue-change': [CLI, 'Read', 'Write', 'Edit', 'AskUserQuestion', 'TodoWrite'],
+  'openspec-apply-change': [FULL_BASH, 'Read', 'Write', 'Edit', 'Grep', 'Glob', 'AskUserQuestion', 'TodoWrite', 'Skill'],
+  'openspec-ff-change': [CLI, 'Read', 'Write', 'Edit', 'AskUserQuestion', 'TodoWrite'],
+  'openspec-sync-specs': [CLI, 'Read', 'Edit', 'AskUserQuestion', 'TodoWrite'],
+  'openspec-archive-change': [CLI, 'Read', 'AskUserQuestion', 'TodoWrite', 'Skill'],
+  'openspec-bulk-archive-change': [CLI, 'Read', 'Edit', 'AskUserQuestion', 'TodoWrite'],
+  'openspec-verify-change': [CLI, 'Read', 'Grep', 'Glob', 'AskUserQuestion', 'TodoWrite'],
+  'openspec-onboard': [FULL_BASH, 'Read', 'Grep', 'Glob', 'Write', 'Edit', 'AskUserQuestion', 'TodoWrite'],
+  'openspec-propose': [CLI, 'Read', 'Write', 'Edit', 'AskUserQuestion', 'TodoWrite'],
+};


Are some of these tools coding-agent agnostic? Like, are TodoWrite and AskUserQuestion tool-agnostic? I don't think so.

FULL_BASH seems tricky. How does it work with user-level safeguards?

TabishB · 2026-07-03T13:16:58Z

+    if (shouldGenerateSkills) {
+      const conformanceErrors: string[] = [];
+      for (const { template, dirName } of skillTemplates) {
+        conformanceErrors.push(...validateSkillConformance(template, dirName).errors);
+      }
+      if (conformanceErrors.length > 0) {
+        throw new Error(`Skill conformance check failed:\n- ${conformanceErrors.join('\n- ')}`);
+      }
+    }
+


It makes no sense for this to be gated during init. I feel like if this did get added, it should be a linting rule for when people create skills.

Imagine if we made an update to a skill that was non-conformant. This would just cause errors for users initializing openspec in their project.
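One way to picture the lint-rule alternative raised in this comment, as a rough sketch rather than a proposal from the thread: the check returns its findings instead of throwing, so an authoring-time lint command can fail hard while init can at most warn. `lintSkills` and `checkNameMatchesFolder` are hypothetical names, and the single rule shown stands in for whatever `validateSkillConformance` actually checks.

```typescript
// Hypothetical sketch of a lint-style check; not the PR's validateSkillConformance.
interface SkillTemplate {
  name: string;
}

interface LintResult {
  ok: boolean;
  errors: string[];
}

// Stand-in rule: only the "name == folder" check named in the commit message.
function checkNameMatchesFolder(template: SkillTemplate, dirName: string): string[] {
  return template.name === dirName
    ? []
    : [`${dirName}: frontmatter name "${template.name}" does not match folder`];
}

// Collects findings and returns them, leaving the severity decision to the caller.
function lintSkills(skills: Array<{ template: SkillTemplate; dirName: string }>): LintResult {
  const errors: string[] = [];
  for (const { template, dirName } of skills) {
    errors.push(...checkNameMatchesFolder(template, dirName));
  }
  return { ok: errors.length === 0, errors };
}

const result = lintSkills([
  { template: { name: 'openspec-propose' }, dirName: 'openspec-propose' },
  { template: { name: 'openspec-verify' }, dirName: 'openspec-verify-change' },
]);
if (!result.ok) {
  // During init this would be a warning at most; a lint step could exit non-zero instead.
  console.warn(`Skill lint findings:\n- ${result.errors.join('\n- ')}`);
}
```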

clay-good · 2026-07-03T20:13:51Z

I opened #1300 to auto-approve the openspec CLI in generated skills and I am closing this PR now.

clay-good requested a review from TabishB as a code owner June 30, 2026 16:33

coderabbitai Bot reviewed Jun 30, 2026

clay-good self-assigned this Jun 30, 2026

alfred-openspec previously approved these changes Jun 30, 2026

clay-good dismissed alfred-openspec’s stale review via 77ca885 June 30, 2026 19:29

coderabbitai Bot reviewed Jun 30, 2026

clay-good requested a review from alfred-openspec June 30, 2026 19:55

coderabbitai Bot reviewed Jun 30, 2026

clay-good mentioned this pull request Jun 30, 2026

Publish skills to skills.sh #1258

Open

clay-good changed the title ~~Proposal: skill-authoring-conventions — update all agent skills~~ [Docs] Propose skill-authoring-conventions — quality bar, distribution, and allowed-tools for all 11 agent skills Jul 1, 2026

clay-good mentioned this pull request Jul 1, 2026

Concepts from docs not included in the skills #1289

Open

clay-good and others added 5 commits July 1, 2026 12:31

clay-good changed the title ~~[Docs] Propose skill-authoring-conventions — quality bar, distribution, and allowed-tools for all 11 agent skills~~ [Feature] Build out skill-authoring-conventions — rewrite all 11 agent skills + allowed-tools + conformance gate (33/81 → 80/81) Jul 1, 2026

alfred-openspec approved these changes Jul 2, 2026

clay-good changed the title ~~[Feature] Make the generated agent skills more reliable — clear triggers, success & recovery steps, pre-approved CLI~~ feat: improve reliability of generated agent skills Jul 2, 2026

TabishB requested changes Jul 3, 2026

clay-good mentioned this pull request Jul 3, 2026

feat(skills): auto-approve the openspec CLI in generated skills and commands #1300

Open

clay-good closed this Jul 3, 2026


		## 4. Cross-skill navigation

		- [ ] 4.1 Add a Related line to every skill pointing to its natural next/sibling (e.g. `propose` → `apply`, `verify` → `archive`, `new-change` → `continue`)




3 participants
