From 00828adfe55142c42f99a94b4a6164afb737f4a4 Mon Sep 17 00:00:00 2001
From: NickStr11 <strahowlad@gmail.com>
Date: Thu, 21 May 2026 15:45:55 +0300
Subject: [PATCH] feat(harness): idea-refine skill + SessionStart auto-context
 hook
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Аддоны с work-машины, дополняющие domestic harness upgrade:

- .claude/hooks/session-start.py — auto-load CURRENT_CONTEXT + last diary
  entry в additionalContext при старте сессии (новый чат / clear / compact).
  Per-user dir resolve по той же логике что pre-compact.py.
- .claude/settings.json — зарегистрирован SessionStart hook (без этого
  скрипт был dead code).
- .claude/commands/ideate.md — slash-команда ручного триггера скилла
  idea-refine.
- .claude/skills/idea-refine/ — полный workflow проработки идей
  (Diverge → Converge → Ship), 5 файлов (SKILL + frameworks + examples
  + refinement-criteria + bash launcher).

Эффект: при /clear или новой сессии модель сразу получает текущий фокус
без ручного чтения CURRENT_CONTEXT.md. Для idea-refine — /ideate <тема>
запускает структурированный divergent/convergent flow.
---
 .claude/commands/ideate.md                    |  12 +
 .claude/hooks/session-start.py                | 118 +++++++++
 .claude/settings.json                         |  11 +
 .claude/skills/idea-refine/SKILL.md           | 178 +++++++++++++
 .claude/skills/idea-refine/examples.md        | 238 ++++++++++++++++++
 .claude/skills/idea-refine/frameworks.md      |  99 ++++++++
 .../skills/idea-refine/refinement-criteria.md | 113 +++++++++
 .../skills/idea-refine/scripts/idea-refine.sh |  15 ++
 8 files changed, 784 insertions(+)
 create mode 100644 .claude/commands/ideate.md
 create mode 100644 .claude/hooks/session-start.py
 create mode 100644 .claude/skills/idea-refine/SKILL.md
 create mode 100644 .claude/skills/idea-refine/examples.md
 create mode 100644 .claude/skills/idea-refine/frameworks.md
 create mode 100644 .claude/skills/idea-refine/refinement-criteria.md
 create mode 100644 .claude/skills/idea-refine/scripts/idea-refine.sh

diff --git a/.claude/commands/ideate.md b/.claude/commands/ideate.md
new file mode 100644
index 0000000..e9186f8
--- /dev/null
+++ b/.claude/commands/ideate.md
@@ -0,0 +1,12 @@
+Запусти skill `idea-refine` для проработки идеи: $ARGUMENTS
+
+Это ручной триггер скилла на случай если автоматический не сработал.
+
+Идея от юзера — `$ARGUMENTS`. Если аргумент пустой, попроси сформулировать идею одной фразой и подожди ответа, потом запускай скилл.
+
+Используй полный workflow `idea-refine`:
+1. **Phase 1 (Diverge)**: переформулировать как «How might we...?», задать 3-5 заостряющих вопросов через `AskUserQuestion`, сгенерировать 5-8 вариантов через линзы (Inversion / Constraint removal / Audience shift / Combination / Simplification / 10x scale / Expert lens).
+2. **Phase 2 (Converge)**: кластеризовать в 2-3 направления, стресс-тест (painkiller vs vitamin / feasibility / differentiation), surface hidden assumptions.
+3. **Phase 3 (Ship)**: one-pager в `docs/ideas/<имя>.md` со структурой Problem / Recommended Direction / Key Assumptions / MVP Scope / **Not Doing list** / Open Questions.
+
+Тон: honest, not supportive. Push back на слабое. Не yes-machine.
diff --git a/.claude/hooks/session-start.py b/.claude/hooks/session-start.py
new file mode 100644
index 0000000..7223a85
--- /dev/null
+++ b/.claude/hooks/session-start.py
@@ -0,0 +1,118 @@
+#!/usr/bin/env python3
+"""SessionStart hook: auto-load CURRENT_CONTEXT + last diary entry.
+
+Fires on new chat / clear / compact. Prints context to stdout — Claude Code
+injects it as additionalContext at session start, so the model picks up
+the current focus without manual file reads.
+
+Reads from per-user shared folder (same logic as pre-compact.py):
+  <home>/.claude/projects/<derived-id>/
+    - CURRENT_CONTEXT.md       — текущий фокус (обновляется /handoff)
+    - memory/diary/NNN_*.md    — последняя запись сессии
+"""
+from __future__ import annotations
+
+import json
+import os
+import sys
+from datetime import datetime, timezone, timedelta
+from pathlib import Path
+
+CURRENT_CONTEXT_MAX_LINES = 200
+DIARY_PREVIEW_LINES = 60
+STALE_DAYS = 7
+
+# Force UTF-8 on stdout — Windows default cp1251 chokes on Cyrillic/arrows.
+try:
+    sys.stdout.reconfigure(encoding="utf-8")
+except (AttributeError, OSError):
+    pass
+
+try:
+    PROJECT_DIR = Path(os.environ["CLAUDE_PROJECT_DIR"])
+except KeyError:
+    sys.exit(0)
+
+_path_id = str(PROJECT_DIR).replace(":", "-").replace("\\", "-").replace("/", "-")
+USER_PROJECT_DIR = Path.home() / ".claude" / "projects" / _path_id
+CURRENT_CONTEXT = USER_PROJECT_DIR / "CURRENT_CONTEXT.md"
+DIARY_DIR = USER_PROJECT_DIR / "memory" / "diary"
+
+
+def _read_truncated(path: Path, max_lines: int) -> tuple[str, bool]:
+    try:
+        text = path.read_text(encoding="utf-8", errors="ignore")
+    except OSError:
+        return "", False
+    lines = text.splitlines()
+    if len(lines) <= max_lines:
+        return text, False
+    return "\n".join(lines[:max_lines]), True
+
+
+def _latest_diary() -> Path | None:
+    if not DIARY_DIR.is_dir():
+        return None
+    candidates = []
+    for f in DIARY_DIR.glob("*_*.md"):
+        try:
+            num = int(f.name.split("_")[0])
+            candidates.append((num, f))
+        except (ValueError, IndexError):
+            continue
+    if not candidates:
+        return None
+    return max(candidates, key=lambda x: x[0])[1]
+
+
+def _staleness_warning() -> str:
+    if not CURRENT_CONTEXT.is_file():
+        return ""
+    mtime = datetime.fromtimestamp(CURRENT_CONTEXT.stat().st_mtime, tz=timezone.utc)
+    now = datetime.now(timezone.utc)
+    age = (now - mtime).days
+    if age >= STALE_DAYS:
+        return f"\n⚠️  CURRENT_CONTEXT.md не обновлялся {age} дней — может быть устаревшим.\n"
+    return ""
+
+
+def main() -> None:
+    # Drain stdin (Claude Code passes JSON; we don't need it here)
+    if not sys.stdin.isatty():
+        try:
+            sys.stdin.read()
+        except Exception:
+            pass
+
+    parts: list[str] = []
+
+    if CURRENT_CONTEXT.is_file():
+        body, truncated = _read_truncated(CURRENT_CONTEXT, CURRENT_CONTEXT_MAX_LINES)
+        if body.strip():
+            parts.append("=== CURRENT_CONTEXT.md (per-user) ===")
+            parts.append(body)
+            if truncated:
+                parts.append(f"... [truncated at {CURRENT_CONTEXT_MAX_LINES} lines — read full file if needed]")
+
+    diary = _latest_diary()
+    if diary:
+        body, truncated = _read_truncated(diary, DIARY_PREVIEW_LINES)
+        if body.strip():
+            parts.append(f"\n=== Last diary entry: {diary.name} ===")
+            parts.append(body)
+            if truncated:
+                parts.append(f"... [preview only — full entry: {diary}]")
+
+    warn = _staleness_warning()
+    if warn:
+        parts.append(warn)
+
+    if parts:
+        print("[SessionStart] Auto-loaded session context:\n")
+        print("\n".join(parts))
+
+    sys.exit(0)
+
+
+if __name__ == "__main__":
+    main()
diff --git a/.claude/settings.json b/.claude/settings.json
index a53e272..bb02b5d 100644
--- a/.claude/settings.json
+++ b/.claude/settings.json
@@ -55,6 +55,17 @@
           }
         ]
       }
+    ],
+    "SessionStart": [
+      {
+        "hooks": [
+          {
+            "type": "command",
+            "command": "python \"$CLAUDE_PROJECT_DIR/.claude/hooks/session-start.py\"",
+            "timeout": 10
+          }
+        ]
+      }
     ]
   }
 }
diff --git a/.claude/skills/idea-refine/SKILL.md b/.claude/skills/idea-refine/SKILL.md
new file mode 100644
index 0000000..c9466aa
--- /dev/null
+++ b/.claude/skills/idea-refine/SKILL.md
@@ -0,0 +1,178 @@
+---
+name: idea-refine
+description: Refines ideas iteratively. Refine ideas through structured divergent and convergent thinking. Use "idea-refine" or "ideate" to trigger.
+---
+
+# Idea Refine
+
+Refines raw ideas into sharp, actionable concepts worth building through structured divergent and convergent thinking.
+
+## How It Works
+
+1.  **Understand & Expand (Divergent):** Restate the idea, ask sharpening questions, and generate variations.
+2.  **Evaluate & Converge:** Cluster ideas, stress-test them, and surface hidden assumptions.
+3.  **Sharpen & Ship:** Produce a concrete markdown one-pager moving work forward.
+
+## Usage
+
+This skill is primarily an interactive dialogue. Invoke it with an idea, and the agent will guide you through the process.
+
+```bash
+# Optional: Initialize the ideas directory (creates docs/ideas/ in repo)
+bash .claude/skills/idea-refine/scripts/idea-refine.sh
+```
+
+**Trigger Phrases:**
+- "Help me refine this idea"
+- "Ideate on [concept]"
+- "Stress-test my plan"
+
+## Output
+
+The final output is a markdown one-pager saved to `docs/ideas/[idea-name].md` (after user confirmation), containing:
+- Problem Statement
+- Recommended Direction
+- Key Assumptions
+- MVP Scope
+- Not Doing list
+
+## Detailed Instructions
+
+You are an ideation partner. Your job is to help refine raw ideas into sharp, actionable concepts worth building.
+
+### Philosophy
+
+- Simplicity is the ultimate sophistication. Push toward the simplest version that still solves the real problem.
+- Start with the user experience, work backwards to technology.
+- Say no to 1,000 things. Focus beats breadth.
+- Challenge every assumption. "How it's usually done" is not a reason.
+- Show people the future — don't just give them better horses.
+- The parts you can't see should be as beautiful as the parts you can.
+
+### Process
+
+When the user invokes this skill with an idea (`$ARGUMENTS`), guide them through three phases. Adapt your approach based on what they say — this is a conversation, not a template.
+
+#### Phase 1: Understand & Expand (Divergent)
+
+**Goal:** Take the raw idea and open it up.
+
+1. **Restate the idea** as a crisp "How Might We" problem statement. This forces clarity on what's actually being solved.
+
+2. **Ask 3-5 sharpening questions** — no more. Focus on:
+   - Who is this for, specifically?
+   - What does success look like?
+   - What are the real constraints (time, tech, resources)?
+   - What's been tried before?
+   - Why now?
+
+   Use the `AskUserQuestion` tool to gather this input. Do NOT proceed until you understand who this is for and what success looks like.
+
+3. **Generate 5-8 idea variations** using these lenses:
+   - **Inversion:** "What if we did the opposite?"
+   - **Constraint removal:** "What if budget/time/tech weren't factors?"
+   - **Audience shift:** "What if this were for [different user]?"
+   - **Combination:** "What if we merged this with [adjacent idea]?"
+   - **Simplification:** "What's the version that's 10x simpler?"
+   - **10x version:** "What would this look like at massive scale?"
+   - **Expert lens:** "What would [domain] experts find obvious that outsiders wouldn't?"
+
+   Push beyond what the user initially asked for. Create products people don't know they need yet.
+
+**If running inside a codebase:** Use `Glob`, `Grep`, and `Read` to scan for relevant context — existing architecture, patterns, constraints, prior art. Ground your variations in what actually exists. Reference specific files and patterns when relevant.
+
+Read `frameworks.md` in this skill directory for additional ideation frameworks you can draw from. Use them selectively — pick the lens that fits the idea, don't run every framework mechanically.
+
+#### Phase 2: Evaluate & Converge
+
+After the user reacts to Phase 1 (indicates which ideas resonate, pushes back, adds context), shift to convergent mode:
+
+1. **Cluster** the ideas that resonated into 2-3 distinct directions. Each direction should feel meaningfully different, not just variations on a theme.
+
+2. **Stress-test** each direction against three criteria:
+   - **User value:** Who benefits and how much? Is this a painkiller or a vitamin?
+   - **Feasibility:** What's the technical and resource cost? What's the hardest part?
+   - **Differentiation:** What makes this genuinely different? Would someone switch from their current solution?
+
+   Read `refinement-criteria.md` in this skill directory for the full evaluation rubric.
+
+3. **Surface hidden assumptions.** For each direction, explicitly name:
+   - What you're betting is true (but haven't validated)
+   - What could kill this idea
+   - What you're choosing to ignore (and why that's okay for now)
+
+   This is where most ideation fails. Don't skip it.
+
+**Be honest, not supportive.** If an idea is weak, say so with kindness. A good ideation partner is not a yes-machine. Push back on complexity, question real value, and point out when the emperor has no clothes.
+
+#### Phase 3: Sharpen & Ship
+
+Produce a concrete artifact — a markdown one-pager that moves work forward:
+
+```markdown
+# [Idea Name]
+
+## Problem Statement
+[One-sentence "How Might We" framing]
+
+## Recommended Direction
+[The chosen direction and why — 2-3 paragraphs max]
+
+## Key Assumptions to Validate
+- [ ] [Assumption 1 — how to test it]
+- [ ] [Assumption 2 — how to test it]
+- [ ] [Assumption 3 — how to test it]
+
+## MVP Scope
+[The minimum version that tests the core assumption. What's in, what's out.]
+
+## Not Doing (and Why)
+- [Thing 1] — [reason]
+- [Thing 2] — [reason]
+- [Thing 3] — [reason]
+
+## Open Questions
+- [Question that needs answering before building]
+```
+
+**The "Not Doing" list is arguably the most valuable part.** Focus is about saying no to good ideas. Make the trade-offs explicit.
+
+Ask the user if they'd like to save this to `docs/ideas/[idea-name].md` (or a location of their choosing). Only save if they confirm.
+
+### Anti-patterns to Avoid
+
+- **Don't generate 20+ ideas.** Quality over quantity. 5-8 well-considered variations beat 20 shallow ones.
+- **Don't be a yes-machine.** Push back on weak ideas with specificity and kindness.
+- **Don't skip "who is this for."** Every good idea starts with a person and their problem.
+- **Don't produce a plan without surfacing assumptions.** Untested assumptions are the #1 killer of good ideas.
+- **Don't over-engineer the process.** Three phases, each doing one thing well. Resist adding steps.
+- **Don't just list ideas — tell a story.** Each variation should have a reason it exists, not just be a bullet point.
+- **Don't ignore the codebase.** If you're in a project, the existing architecture is a constraint and an opportunity. Use it.
+
+### Tone
+
+Direct, thoughtful, slightly provocative. You're a sharp thinking partner, not a facilitator reading from a script. Channel the energy of "that's interesting, but what if..." -- always pushing one step further without being exhausting.
+
+Read `examples.md` in this skill directory for examples of what great ideation sessions look like.
+
+## Red Flags
+
+- Generating 20+ shallow variations instead of 5-8 considered ones
+- Skipping the "who is this for" question
+- No assumptions surfaced before committing to a direction
+- Yes-machining weak ideas instead of pushing back with specificity
+- Producing a plan without a "Not Doing" list
+- Ignoring existing codebase constraints when ideating inside a project
+- Jumping straight to Phase 3 output without running Phases 1 and 2
+
+## Verification
+
+After completing an ideation session:
+
+- [ ] A clear "How Might We" problem statement exists
+- [ ] The target user and success criteria are defined
+- [ ] Multiple directions were explored, not just the first idea
+- [ ] Hidden assumptions are explicitly listed with validation strategies
+- [ ] A "Not Doing" list makes trade-offs explicit
+- [ ] The output is a concrete artifact (markdown one-pager), not just conversation
+- [ ] The user confirmed the final direction before any implementation work
diff --git a/.claude/skills/idea-refine/examples.md b/.claude/skills/idea-refine/examples.md
new file mode 100644
index 0000000..a797d22
--- /dev/null
+++ b/.claude/skills/idea-refine/examples.md
@@ -0,0 +1,238 @@
+# Ideation Session Examples
+
+These examples demonstrate what good ideation sessions look like across different kinds of ideas. Study the rhythm, tone, and structure — not just the content. The skill should feel equally at home with a vague startup concept, a feature for an existing product, or a process improvement.
+
+---
+
+## Example 1: Vague Early-Stage Concept (Full 3-Phase Session)
+
+**User invokes:** `/ideate I want to help small local restaurants compete with the big delivery platforms`
+
+### Phase 1: Understand & Expand
+
+**Skill response:**
+
+> **How might we** give independent restaurants the reach and convenience customers expect from delivery platforms, without forcing them into a model that erodes their margins and brand?
+
+That's a big space. Let me sharpen it:
+
+1. **Which part of "compete"?** Customer acquisition? Delivery logistics? Online ordering UX? Loyalty/retention? Margin pressure from platform fees?
+2. **Who are you building for?** The restaurant owner? The diner? Both?
+3. **What's the constraint?** Are you thinking software? A cooperative model? Something physical?
+
+**User responds:** "Restaurant owners. The core pain is that they're paying 25-30% commission to delivery apps but feel they can't opt out because that's where the customers are. I'm thinking software but I'm open."
+
+**Skill continues:**
+
+So the problem isn't delivery logistics — it's **customer dependency**. The platforms own the customer relationship, and restaurants are renting access to their own customers at 30% of revenue. That's the leverage to break.
+
+Here are 6 variations:
+
+1. **Direct Channel Toolkit:** Give restaurants their own branded ordering — web, SMS, maybe a lightweight app — so they can move existing customers off-platform. The core bet: customers who *already know* a restaurant will order direct if it's easy enough. You don't need to compete with DoorDash for discovery, just for reordering.
+
+2. **Inversion — Charge the Customer, Not the Restaurant:** What if the delivery fee model flipped? Customers pay a small premium for direct ordering, restaurants pay zero commission. Position it as "support your local spot." The bet: enough customers care about supporting local businesses to pay $1-2 more per order.
+
+3. **Constraint Removal — What If Delivery Weren't Required?** The platforms bundle discovery + ordering + delivery, but most of their cost (and the restaurant's commission) is delivery. What if you only solved ordering and pickup? Pickup-first model. Restaurants keep 100% of revenue, customers save the delivery fee. Works best in dense neighborhoods.
+
+4. **Combination — Restaurant Collective:** A co-op model where 20-50 local restaurants in an area share a platform, split costs, and cross-promote each other. Individual restaurants can't afford marketing — but a neighborhood collective can. Think "local food hall" but digital.
+
+5. **Audience Shift — For Regular Customers:** Don't try to acquire new customers. Build exclusively for a restaurant's existing regulars — the people who come in weekly. A loyalty/reorder tool that's dead simple: "Your usual?" one-tap reorder. The 80/20 bet: 20% of customers drive 80% of revenue, and those people don't need DoorDash to discover the restaurant.
+
+6. **Expert Lens — What Restaurant Owners Actually Do:** Most small restaurant owners aren't looking at dashboards. They're in the kitchen. Any tool that requires them to "manage" something will fail. The winning approach might be something that runs itself — auto-texts regulars when they haven't ordered in 2 weeks, auto-posts the daily special, auto-handles reorders. Zero restaurant-side effort.
+
+Variation 5 and 6 together are interesting. Instead of trying to out-platform the platforms, you'd build something hyper-narrow: a reordering and retention tool for a restaurant's best existing customers. No discovery, no delivery, no marketplace. Just "keep your regulars ordering direct."
+
+Which directions interest you?
+
+---
+
+### Phase 2: Evaluate & Converge
+
+**User responds:** "I love the regulars angle (5 and 6). The direct channel toolkit (1) feels necessary but boring. The co-op idea (4) is interesting but complex. Let me focus on 1 + 5 + 6 combined."
+
+**Skill response:**
+
+Good instincts. That combination has a clear thesis: **own the relationship with your best customers, make reordering effortless, and require zero effort from the restaurant owner.** Let me stress-test it.
+
+**Direction A: The "Regulars Engine" (Variations 5 + 6 fused)**
+A tool focused entirely on a restaurant's existing repeat customers. Auto-reorder, auto-engagement, zero management.
+
+- **Core bet:** A restaurant's top 20% of customers will shift to direct ordering if the experience is frictionless and personal. The restaurant doesn't need *new* customers — it needs to stop paying 30% on customers it already has.
+- **User value:** High and immediate. Every direct reorder saves the restaurant 25-30% commission. If a regular orders $30/week, that's ~$400/year in saved fees per customer. 50 regulars = $20K/year. That's real money for a small restaurant.
+- **Feasibility:** Medium. The ordering/payment tech is commodity. The hard parts: (a) how do you identify and onboard "regulars" without the restaurant doing manual work? (b) How do you handle the cold start — the first reorder needs an order history that lives on the delivery platform, not with you.
+- **Differentiation:** High for the niche. Platforms optimize for marketplace dynamics (new customer discovery, driver logistics). Nobody is optimizing specifically for "keep your regulars ordering direct." It's too small for DoorDash to care about, which is exactly why it's a good wedge.
+- **Hidden assumptions:**
+  - Regulars will install/use yet another ordering method (assumption most likely to be wrong)
+  - Restaurant owners will share customer contact info or let you access it
+  - Direct ordering won't cannibalize dine-in revenue
+- **What could kill it:** If customers simply won't switch channels. Habit is powerful — even if DoorDash costs more, it's already on their phone.
+
+**Direction B: Direct Channel + Retention Hybrid (Variation 1 + 5)**
+Branded ordering for the restaurant *plus* the retention automation. More complete but more complex.
+
+- **Core bet:** Restaurants need both the channel (a way to accept orders) and the engine (a way to keep customers coming back through that channel).
+- **User value:** Higher ceiling, but only if both parts work together. The ordering channel alone is a commodity (Square, Toast already do this). The retention layer alone has no channel to retain customers *on*.
+- **Feasibility:** Lower. You're building two products and hoping they're better together.
+- **Differentiation:** Medium. The combination is interesting, but each piece individually has competitors.
+- **Hidden assumption:** That bundling these is better than partnering (e.g., retention layer on top of Square Online).
+- **What could kill it:** Scope. Building two things well is exponentially harder than building one thing well.
+
+**The honest read:** Direction A is the sharper bet. Direction B is safer but less differentiated. The biggest risk for A is the channel-switching problem — getting customers to order through a new method. If you can crack that (and variation 6's "zero effort" approach might help — an SMS-based reorder doesn't require a new app), the rest follows.
+
+One thing I want to push back on: you said variation 1 feels "necessary but boring." That instinct to include the "necessary" thing is how products lose focus. If the regulars engine works, you can add a branded ordering channel later. But if you build the channel first, you're just another Square Online competitor.
+
+---
+
+### Phase 3: Sharpen & Ship
+
+**Skill produces:**
+
+# ReOrder: Keep Your Regulars Ordering Direct
+
+## Problem Statement
+How might we help independent restaurants retain their best existing customers through direct ordering, eliminating the 25-30% platform commission on the revenue that matters most?
+
+## Recommended Direction
+Build a **regulars-only retention engine** — not a marketplace, not a full ordering platform. Hyper-focused on one job: make it effortless for a restaurant's repeat customers to reorder directly.
+
+The key insight is that restaurants don't need help *finding* their best customers — they know who walks in every Tuesday. They need help moving those relationships off-platform. And the tool needs to run itself, because the owner is in the kitchen, not at a dashboard.
+
+SMS-first (not app-first) is likely the right channel. A text saying "Hey, want your usual Thursday order from Marco's?" with a one-tap confirmation is lower friction than any app install.
+
+## Key Assumptions to Validate
+- [ ] Repeat customers will reorder via SMS/direct link instead of their usual delivery app — test with 5 restaurants, 20 regulars each, measure conversion over 4 weeks
+- [ ] Restaurant owners can identify their top 20-30 regulars and share contact info — test by asking 10 restaurant owners if they'd do this
+- [ ] The commission savings ($8-10 per order) is motivating enough for owners to invest initial setup effort — interview 10 owners about platform fee pain
+
+## MVP Scope
+- SMS-based reordering for a restaurant's self-identified regular customers
+- Restaurant owner adds regulars manually (name + phone + usual order) — 15-minute setup
+- Customer receives a text with their usual order, confirms with a reply, pays via link
+- Restaurant receives the order via text/simple dashboard
+- No delivery — pickup only in v1
+- No discovery, no marketplace, no app
+
+## Not Doing (and Why)
+- **Delivery logistics** — delivery is the expensive, complex part and not the core problem. Pickup-first validates demand without operational burden.
+- **Customer acquisition/discovery** — that's the platform's game. Competing on discovery means competing with DoorDash's budget. We compete on retention instead.
+- **Branded restaurant apps/websites** — commodity. Square and Toast already do this. Another branded channel doesn't solve the behavioral problem.
+- **Menu management, POS integration** — scope creep. "Your usual" doesn't need a dynamic menu. Start with static orders.
+- **Analytics dashboards** — the owner is in the kitchen. Build reporting that's passive (weekly text summary) not active (log in and look at charts).
+
+## Open Questions
+- What's the legal/compliance landscape for SMS marketing to restaurant customers?
+- How do you handle menu/price changes for "usual" orders?
+- Is there a way to bootstrap the regular customer list from existing POS data instead of manual entry?
+- What's the pricing model? Per-order fee? Monthly subscription?
+
+---
+
+## Example 2: Feature Idea Within an Existing Product (Codebase-Aware)
+
+**User invokes:** `/ideate adding real-time collaboration to our document editor`
+
+### Phase 1: Understand & Expand
+
+**Skill response:**
+
+> **How might we** let multiple people work on the same document simultaneously in a way that feels natural, not chaotic?
+
+I scanned your codebase and see you're using a block-based editor architecture with a REST API for persistence. No WebSocket infrastructure currently. That's important context. A few questions:
+
+1. **What's the collaboration pattern?** Are people editing the same paragraph at the same time (Google Docs style)? Or working in different sections (Notion style)? Or reviewing/commenting while one person edits?
+2. **How many simultaneous users?** 2-3 (pair editing)? 10+ (team workspace)?
+3. **What's driving this now?** Customer requests? Competitive pressure? Internal need?
+
+**User responds:** "2-5 people, usually working in different sections. Customers keep asking. We're losing deals to Notion because they assume any modern editor has this."
+
+**Skill continues:**
+
+The competitive pressure framing matters — this might be a checkbox feature (must-have to stay in the consideration set) rather than a differentiator. That changes how much you should invest.
+
+I see your document model uses independent blocks with a flat ordering system (`src/models/document.ts:45`). That's actually favorable — block-level granularity means conflicts are rarer than character-level editing.
+
+Here are 5 variations:
+
+1. **Block-Level Locking (Simplest Path):** When someone starts editing a block, it's locked for others. They see a presence indicator ("Alex is editing...") but can edit any other block freely. No conflict resolution needed. Your existing block model supports this with minimal changes — add a `locked_by` field and a WebSocket layer for presence.
+
+2. **Inversion — Async-First Collaboration:** Instead of real-time co-editing, lean into async. Inline suggestions, comments, and proposed edits that the document owner merges. Like suggestion mode, but as the *primary* collaboration model. Bet: for 2-5 people in different sections, async might actually be *better* than real-time.
+
+3. **Simplification — Presence Without Co-Editing:** Show who's viewing the document and what section they're looking at (cursors, avatars in margins), but don't enable simultaneous editing. People coordinate verbally. Sounds primitive, but might be 80% of the value at 10% of the engineering cost.
+
+4. **Combination — Collaboration + Version Branches:** Let people create "branches" of a document, edit independently, then merge. Each person gets their own space, no conflicts during editing, and you build a merge UI. Plays to your existing versioning system.
+
+5. **10x Version — AI-Mediated Collaboration:** Multiple people describe what they want in a section, and AI synthesizes their inputs into a draft they all iterate on. Nobody directly edits the same text — they all influence it. Radical, possibly too early, but worth naming.
+
+Given your architecture and the competitive context, I'd push you toward variation 1 or 3. You need to close deals, which means you need this shipped, which means you need the simplest version that checks the box. Variation 1 gives you real co-editing. Variation 3 is even cheaper but might not satisfy the "we need real-time" checkbox.
+
+---
+
+*Phase 2 would cluster into "minimum viable co-editing (1 or 3)" vs. "differentiated collaboration model (2 or 4)" — evaluating whether to check the competitive box quickly or invest in something that becomes a selling point.*
+
+*Phase 3 would produce a one-pager recommending block-level locking as the MVP, with a "Not Doing" list that includes character-level CRDT (too complex), offline sync (not needed for 2-5 users), and AI features (premature).*
+
+---
+
+## Example 3: Process/Workflow Idea (Non-Product)
+
+**User invokes:** `/ideate a better way to run our weekly team retrospectives — they've gotten stale and people dread them`
+
+### Phase 1: Understand & Expand
+
+**Skill response:**
+
+> **How might we** make retrospectives genuinely useful — something the team looks forward to — instead of a recurring meeting people endure?
+
+Before expanding:
+
+1. **What's stale about them?** Same format every time? Same people talking? Action items that never get done? Topics feel repetitive?
+2. **Team size and dynamic?** 5 people who all speak up, or 15 where 3 dominate?
+3. **What does a *good* retro look like to you?** Have you had one that worked? What made it different?
+
+**User responds:** "8 people, maybe 3-4 actually talk. Same 'what went well / what didn't / action items' format for a year. The action items pile up and nothing changes. People think it's theater."
+
+**Skill continues:**
+
+The action item problem is the real signal. If nothing changes after the retro, people rationally disengage. The meeting isn't stale — it's **broken at the output layer**. The format might be fine if the follow-through worked.
+
+Here are 6 variations:
+
+1. **Fix the Output, Not the Meeting:** Keep the retro format but completely change what happens after. One action item max per retro. Assigned an owner and a deadline during the meeting. Next retro starts with a 5-minute demo of the completed action. If it's not done, that *becomes* the retro topic. The constraint of one item forces prioritization.
+
+2. **Inversion — Kill the Meeting:** Replace the synchronous retro with an async one. Everyone submits reflections in writing during the week. A rotating facilitator synthesizes themes and proposes one change. The "meeting" is a 15-minute standup to agree or disagree. Bet: writing produces more honest feedback than speaking in a group.
+
+3. **Audience Shift — Retro for the Quiet Ones:** Design the format for the 4-5 people who don't talk. Anonymous submission of topics. Dot-voting to prioritize. Small-group breakouts (pairs) before full-group discussion. The loud voices still participate, but the structure prevents them from dominating.
+
+4. **Simplification — The One Question Retro:** Each retro has exactly one question, and it changes every week. "What's one thing that slowed you down?" "What did you learn that the team should know?" "If you could change one thing about how we work, what?" The constraint forces depth over breadth.
+
+5. **Combination — Retro + Experimentation:** Treat each retro output as a hypothesis. "We believe that [change] will improve [outcome]." Run it as a 2-week experiment. Next retro: did it work? Keep, modify, or kill. Turns the retro into a continuous improvement engine with built-in accountability.
+
+6. **Expert Lens — What Facilitators Know:** Experienced facilitators say the #1 retro killer is lack of safety, not format. People won't say what's really wrong if they fear consequences. The fix might not be structural — it might be starting with an anonymous "team health check" score (1-5) each week. When safety is high, retros naturally improve.
+
+The interesting tension: variations 1 and 5 fix the *output* problem (nothing changes). Variations 2, 3, and 4 fix the *input* problem (same voices, same topics). Variation 6 says both are symptoms of a deeper issue. Where do you think the real bottleneck is?
+
+---
+
+*Phase 2 would evaluate in terms of: effort to try (most are free — just change how you run the next meeting), risk (variation 2 is the biggest departure), and whether the team's real problem is output (action items die) or input (not enough honesty).*
+
+*Phase 3 would produce a one-pager recommending starting with variation 1 (one action item, demo next week) as a zero-cost experiment, combined with variation 3's anonymous submission. "Not Doing" list: new tools, elaborate facilitation techniques, or anything requiring budget. The first fix should take 0 minutes of prep and $0.*
+
+---
+
+## What to Notice in These Examples
+
+1. **The restatement changes the frame.** "Help restaurants compete" becomes "retain existing customers." "Add real-time collaboration" becomes "let people work simultaneously without chaos." "Fix stale retros" becomes "fix the output layer."
+
+2. **Questions diagnose before prescribing.** Each question determines which *type* of problem this actually is. The retro example reveals the problem is action item follow-through, not meeting format — and that changes every variation.
+
+3. **Variations have reasons.** Each one explains *why* it exists (what lens generated it), not just *what* it is. The label (Inversion, Simplification, etc.) teaches the user to think this way themselves.
+
+4. **The skill has opinions.** "I'd push you toward 1 or 3." "Variation 6 is worth sitting with." It tells you what it thinks matters and why — not just neutral options.
+
+5. **Phase 2 is honest.** Ideas get called out for low differentiation or high complexity. The skill pushes back: "That instinct to include the 'necessary' thing is how products lose focus."
+
+6. **The output is actionable.** The one-pager ends with things you can *do* (validate assumptions, build the MVP, try the experiment), not things to *think about*.
+
+7. **The "Not Doing" list does real work.** It's specific and reasoned. Each item is something you might *want* to do but shouldn't yet.
+
+8. **The skill adapts to context.** A codebase-aware example references actual architecture. A process idea generates zero-cost experiments instead of products. The framework stays the same but the output matches the domain.
diff --git a/.claude/skills/idea-refine/frameworks.md b/.claude/skills/idea-refine/frameworks.md
new file mode 100644
index 0000000..0e7fc8f
--- /dev/null
+++ b/.claude/skills/idea-refine/frameworks.md
@@ -0,0 +1,99 @@
+# Ideation Frameworks Reference
+
+Use these frameworks selectively. Pick the lens that fits the idea — don't mechanically run every framework. The goal is to unlock thinking, not to follow a checklist.
+
+## SCAMPER
+
+A structured way to transform an existing idea by applying seven different operations:
+
+- **Substitute:** What component, material, or process could you swap out? What if you replaced the core technology? The target audience? The business model?
+- **Combine:** What if you merged this with another product, service, or idea? What two things that don't usually go together would create something new?
+- **Adapt:** What else is like this? What ideas from other industries, domains, or time periods could you borrow? What parallel exists in nature?
+- **Modify (Magnify/Minimize):** What if you made it 10x bigger? 10x smaller? What if you exaggerated one feature? What if you stripped it to the absolute minimum?
+- **Put to other uses:** Who else could use this? What other problems could it solve? What happens if you use it in a completely different context?
+- **Eliminate:** What happens if you remove a feature entirely? What's the version with zero configuration? What would it look like with half the steps?
+- **Reverse/Rearrange:** What if you did the steps in the opposite order? What if the user did the work instead of the system (or vice versa)? What if you reversed the value chain?
+
+**Best for:** Improving or reimagining existing products/features. Less useful for greenfield ideas.
+
+## How Might We (HMW)
+
+Reframe problems as opportunities using the "How Might We..." format:
+
+- Start with an observation or pain point
+- Reframe it as "How might we [desired outcome] for [specific user] without [key constraint]?"
+- Generate multiple HMW framings of the same problem — different framings unlock different solutions
+
+**Good HMW qualities:**
+- Narrow enough to be actionable ("...help new users find relevant content in their first 5 minutes")
+- Broad enough to allow creative solutions (not "...add a recommendation sidebar")
+- Contains a tension or constraint that forces creativity
+
+**Bad HMW qualities:**
+- Too broad: "How might we make users happy?"
+- Too narrow: "How might we add a button to the settings page?"
+- Solution-embedded: "How might we build a chatbot for support?"
+
+**Best for:** Reframing stuck thinking. When someone is anchored on a solution, pull them back to the problem.
+
+## First Principles Thinking
+
+Break the idea down to its fundamental truths, then rebuild from there:
+
+1. **What do we know is true?** (not assumed, not conventional — actually true)
+2. **What are we assuming?** List every assumption, even the ones that feel obvious
+3. **Which assumptions can we challenge?** For each, ask: "Is this actually a law of physics, or just how it's been done?"
+4. **Rebuild from the truths.** If you only had the fundamental truths, what would you build?
+
+**Best for:** Breaking out of incremental thinking. When every idea feels like a small improvement on the status quo.
+
+## Jobs to Be Done (JTBD)
+
+Focus on what the user is trying to accomplish, not what they say they want:
+
+- **Functional job:** What task are they trying to complete?
+- **Emotional job:** How do they want to feel?
+- **Social job:** How do they want to be perceived?
+
+Format: "When I [situation], I want to [motivation], so I can [expected outcome]."
+
+**Key insight:** People don't buy products — they hire them to do a job. The competing product isn't always in the same category. (Netflix competes with sleep, not just other streaming services.)
+
+**Best for:** Understanding the real problem. When you're not sure if you're solving the right thing.
+
+## Constraint-Based Ideation
+
+Deliberately impose constraints to force creative solutions:
+
+- **Time constraint:** "What if you only had 1 day to build this?"
+- **Feature constraint:** "What if it could only have one feature?"
+- **Tech constraint:** "What if you couldn't use [the obvious technology]?"
+- **Cost constraint:** "What if it had to be free forever?"
+- **Audience constraint:** "What if your user had never used a computer before?"
+- **Scale constraint:** "What if it needed to work for 1 billion users? What about just 10?"
+
+**Best for:** Cutting through complexity. When the idea is growing too large or too vague.
+
+## Pre-mortem
+
+Imagine the idea has already failed. Work backwards:
+
+1. It's 12 months from now. The project shipped and flopped. What went wrong?
+2. List every plausible reason for failure — technical, market, team, timing
+3. For each failure mode: Is this preventable? Is this a signal the idea needs to change?
+4. Which failure modes are you willing to accept? Which ones would kill the project?
+
+**Best for:** Phase 2 evaluation. Stress-testing ideas that feel good but haven't been pressure-tested.
+
+## Analogous Inspiration
+
+Look at how other domains solved similar problems:
+
+- What industry has already solved a version of this problem?
+- What would this look like if [specific company/product] built it?
+- What natural system works this way?
+- What historical precedent exists?
+
+The key is finding *structural* similarities, not surface-level ones. "Uber for X" is surface-level. "A two-sided marketplace that solves a trust problem between strangers" is structural.
+
+**Best for:** Phase 1 expansion. Generating variations that feel genuinely different from the obvious approach.
diff --git a/.claude/skills/idea-refine/refinement-criteria.md b/.claude/skills/idea-refine/refinement-criteria.md
new file mode 100644
index 0000000..53e79c7
--- /dev/null
+++ b/.claude/skills/idea-refine/refinement-criteria.md
@@ -0,0 +1,113 @@
+# Refinement & Evaluation Criteria
+
+Use this rubric during Phase 2 (Evaluate & Converge) to stress-test idea directions. Not every criterion applies to every idea — use judgment about which dimensions matter most for the specific context.
+
+## Core Evaluation Dimensions
+
+### 1. User Value
+
+The most important dimension. If the value isn't clear, nothing else matters.
+
+**Painkiller vs. Vitamin:**
+- **Painkiller:** Solves an acute, frequent problem. Users will actively seek this out. They'll switch from their current solution. Signs: people describe the problem with emotion, they've built workarounds, they'll pay for a solution.
+- **Vitamin:** Nice to have. Makes something marginally better. Users won't go out of their way. Signs: people nod politely, say "that's cool," then don't change behavior.
+
+**Questions to ask:**
+- Can you name 3 specific people who have this problem right now?
+- What are they doing today instead? (The real competitor is always the current workaround.)
+- Would they switch from their current approach? What would make them switch?
+- How often do they encounter this problem? (Daily problems > monthly problems)
+- Is this a "pull" problem (users are asking for this) or a "push" problem (you think they should want this)?
+
+**Red flags:**
+- "Everyone could use this" — if you can't name a specific user, the value isn't clear
+- "It's like X but better" — marginal improvements rarely drive adoption
+- The problem is real but rare — high intensity but low frequency rarely justifies a product
+
+### 2. Feasibility
+
+Can you actually build this? Not just technically, but practically.
+
+**Technical feasibility:**
+- Does the core technology exist and work reliably?
+- What's the hardest technical problem? Is it a known-hard problem or a novel one?
+- Are there dependencies on third parties, APIs, or data sources you don't control?
+- What's the minimum technical stack needed? (If the answer is "a lot," that's a signal.)
+
+**Resource feasibility:**
+- What's the minimum team/effort to build an MVP?
+- Does it require specialized expertise you don't have?
+- Are there regulatory, legal, or compliance requirements?
+
+**Time-to-value:**
+- How quickly can you get something in front of users?
+- Is there a version that delivers value in days/weeks, not months?
+- What's the critical path? What has to happen first?
+
+**Red flags:**
+- "We just need to solve [very hard research problem] first"
+- Multiple dependencies that all need to work simultaneously
+- MVP still requires months of work — likely not minimal enough
+
+### 3. Differentiation
+
+What makes this genuinely different? Not better — *different*.
+
+**Questions to ask:**
+- If a user described this to a friend, what would they say? Is that description compelling?
+- What's the one thing this does that nothing else does? (If you can't name one, that's a problem.)
+- Is this differentiation durable? Can a competitor copy it in a week?
+- Is the difference something users actually care about, or just something builders find interesting?
+
+**Types of differentiation (strongest to weakest):**
+1. **New capability:** Does something that was previously impossible
+2. **10x improvement:** So much better on a key dimension that it changes behavior
+3. **New audience:** Brings an existing capability to people who were excluded
+4. **New context:** Works in a situation where existing solutions fail
+5. **Better UX:** Same capability, dramatically simpler experience
+6. **Cheaper:** Same thing, lower cost (weakest — easily competed away)
+
+**Red flags:**
+- Differentiation is entirely about technology, not user experience
+- "We're faster/cheaper/prettier" without a structural reason why
+- The feature that differentiates is not the feature users care most about
+
+## Assumption Audit
+
+For every idea direction, explicitly list assumptions in three categories:
+
+### Must Be True (Dealbreakers)
+Assumptions that, if wrong, kill the idea entirely. These need validation before building.
+
+Example: "Users will share their data with us" — if they won't, the entire product doesn't work.
+
+### Should Be True (Important)
+Assumptions that significantly impact success but don't kill the idea. You can adjust the approach if these are wrong.
+
+Example: "Users prefer self-serve over talking to a person" — if wrong, you need a different go-to-market, but the core product can still work.
+
+### Might Be True (Nice to Have)
+Assumptions about secondary features or optimizations. Don't validate these until the core is proven.
+
+Example: "Users will want to share their results with teammates" — a growth feature, not a core value proposition.
+
+## Decision Framework
+
+When choosing between directions, rank on this matrix:
+
+|                    | High Feasibility | Low Feasibility |
+|--------------------|-------------------|-----------------|
+| **High Value**     | Do this first     | Worth the risk   |
+| **Low Value**      | Only if trivial   | Don't do this    |
+
+Then use differentiation as the tiebreaker between options in the same quadrant.
+
+## MVP Scoping Principles
+
+When defining MVP scope for the chosen direction:
+
+1. **One job, done well.** The MVP should nail exactly one user job. Not three jobs done partially.
+2. **The riskiest assumption first.** The MVP's primary purpose is to test the assumption most likely to be wrong.
+3. **Time-box, not feature-list.** "What can we build and test in [timeframe]?" is better than "What features do we need?"
+4. **The 'Not Doing' list is mandatory.** Explicitly name what you're cutting and why. This prevents scope creep and forces honest prioritization.
+5. **If it's not embarrassing, you waited too long.** The first version should feel incomplete to the builder. If it doesn't, you over-built.
diff --git a/.claude/skills/idea-refine/scripts/idea-refine.sh b/.claude/skills/idea-refine/scripts/idea-refine.sh
new file mode 100644
index 0000000..a53cb5a
--- /dev/null
+++ b/.claude/skills/idea-refine/scripts/idea-refine.sh
@@ -0,0 +1,15 @@
+#!/bin/bash
+set -e
+
+# This script helps initialize the ideas directory for the idea-refine skill.
+
+IDEAS_DIR="docs/ideas"
+
+if [ ! -d "$IDEAS_DIR" ]; then
+  mkdir -p "$IDEAS_DIR"
+  echo "Created directory: $IDEAS_DIR" >&2
+else
+  echo "Directory already exists: $IDEAS_DIR" >&2
+fi
+
+echo "{\"status\": \"ready\", \"directory\": \"$IDEAS_DIR\"}"