v2.0.0 Release: Plan/Design workflow, working memory fix, learning system review

## Overview

Final workstream to reach v2.0.0. Eight tracks of work (5 done, 3 remaining).

---

## Track 1: Rename Shepherd → Evaluator ✅

- [x] Rename `shared/agents/shepherd.md` → `shared/agents/evaluator.md`
- [x] Update all references across skills, commands, agents, and orchestration files
- [x] Update plugin.json manifests, tests, README, CLAUDE.md, CONTRIBUTING.md

**Completed in PR #171**

## Track 2: Tester Agent + QA Skill ✅

- [x] Create `shared/skills/qa/SKILL.md` + `shared/agents/tester.md`
- [x] Add Tester to quality gate pipeline, `/implement` command, plugin.json manifests
- [x] Add tests

**Completed in PR #171**

## Track 3: Ambient Mode Refinements ✅

- [x] Full pipeline rearchitecture: lean SessionStart preamble, SKILL router, three-tier depth classification
- [x] All intents verified (REVIEW, RESOLVE, DEBUG), edge cases tested

**Completed in PRs #172, #174**

## Track 4: Plan/Design Workflow + Gap Analysis ✅

Unified `/plan` command with gap analysis, design review, and plan-aware `/implement`. Implementation differed from original spec — unified into `devflow-plan` plugin with a `designer` agent rather than separate gap-checker/design-reviewer agents.

- [x] Create gap-analysis and design-review skills + references
- [x] Create `shared/agents/designer.md` (unified gap-checker + design-reviewer role)
- [x] Extend `shared/agents/synthesizer.md` (planning mode)
- [x] Create `devflow-plan` plugin (plugin.json, commands/plan.md, README)
- [x] Add plan detection to `/implement` command (accepts plan documents)
- [x] Deprecate `/specify` in favor of unified `/plan`
- [x] Update `src/cli/plugins.ts`, marketplace.json, CLAUDE.md, README.md
- [x] Build & verify (71 skills distributed, 614/614 tests passing)

**Completed in PR #176**

## Track 5: Working Memory Message Extraction Bug 🐛

See #179 for details.

- [ ] Fix `background-memory-update` message extraction: `tail -1` → `head -1` for first user message
- [ ] Verify assistant extraction is correct (last message)
- [ ] Test with real session transcript

## Track 6: Self-Learning System Review ✅ (scoped down)

Core learning system reviewed and stabilized in PR #173 (thresholds, single toggle, frontmatter, --reset). Remaining items (PF-004 god script decomposition, PF-006 jq latency, PF-005 hook duplication) are tracked in #23 (Tech Debt Backlog) — not v2 blockers.

- [x] Fix self-learning thresholds, single toggle, frontmatter (#173)
- [x] Review confidence scoring and temporal decay for correctness
- [x] Review observation deduplication reliability
- [ ] ~~PF-004 god script decomposition~~ → deferred to tech debt
- [ ] ~~PF-006 jq loop latency~~ → deferred to tech debt
- [ ] ~~PF-005 hook interface duplication~~ → deferred to tech debt

## Track 7: Project Knowledge Refinements 🔬

See #177 for details. Revisit the project knowledge system from scratch — analyze current decisions/pitfalls implementation, resolve orphaned `PROJECT-PATTERNS.md`, decide on two-file vs three-file model, evaluate extraction mechanisms and cross-workflow flow.

- [ ] Analyze current implementation end-to-end
- [ ] Decide file model and resolve PROJECT-PATTERNS.md
- [ ] Implement refinements
- [ ] Verify cross-workflow knowledge flow

## Track 8: Ambient Mode Detection Tweaks 🎯

See #178 for details. Refine the lean classification approach based on real usage — intent accuracy, depth calibration, edge cases, router mappings.

- [ ] Collect observations from real usage
- [ ] Identify and fix misclassification patterns
- [ ] Apply classification/router adjustments

---

## Release Checklist

- [x] Tracks 1-4 complete
- [ ] Track 5 complete (#179 — memory extraction bug)
- [x] Track 6 scoped and core fixes shipped
- [ ] Track 7 complete (#177 — project knowledge refinements)
- [ ] Track 8 complete (#178 — ambient detection tweaks)
- [ ] `npm run build` clean
- [ ] `npm test` — all tests pass
- [ ] Version bump to 2.0.0 in package.json (already done)
- [ ] CHANGELOG.md updated
- [ ] Tag v2.0.0 and create GitHub release


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v2.0.0 Release: Plan/Design workflow, working memory fix, learning system review #170

Overview

Track 1: Rename Shepherd → Evaluator ✅

Track 2: Tester Agent + QA Skill ✅

Track 3: Ambient Mode Refinements ✅

Track 4: Plan/Design Workflow + Gap Analysis ✅

Track 5: Working Memory Message Extraction Bug 🐛

Track 6: Self-Learning System Review ✅ (scoped down)

Track 7: Project Knowledge Refinements 🔬

Track 8: Ambient Mode Detection Tweaks 🎯

Release Checklist

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

v2.0.0 Release: Plan/Design workflow, working memory fix, learning system review #170

Description

Overview

Track 1: Rename Shepherd → Evaluator ✅

Track 2: Tester Agent + QA Skill ✅

Track 3: Ambient Mode Refinements ✅

Track 4: Plan/Design Workflow + Gap Analysis ✅

Track 5: Working Memory Message Extraction Bug 🐛

Track 6: Self-Learning System Review ✅ (scoped down)

Track 7: Project Knowledge Refinements 🔬

Track 8: Ambient Mode Detection Tweaks 🎯

Release Checklist

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions