Merged
9 changes: 6 additions & 3 deletions .github/workflows/ci.yml
@@ -10,13 +10,16 @@ on:

 jobs:
   test:
-    runs-on: ubuntu-latest
+    runs-on: ${{ matrix.os }}

     strategy:
-      # One Node version failing shouldn't cancel the other — we want to see
-      # which versions regress independently.
+      # One cell failing shouldn't cancel the others — we want to see which
+      # OS/Node combinations regress independently.
       fail-fast: false
       matrix:
+        # Linux and Windows are both first-class targets. macOS shares the
+        # POSIX path with Linux, so it isn't a separate cell here.
+        os: [ubuntu-latest, windows-latest]
         # package.json declares `engines.node: >=20`. We test 20 (minimum) and
         # 22 (current LTS) so both supported versions stay green. Extending
         # to newer LTS is ~30s of extra runtime, which is worth the coverage.
2 changes: 2 additions & 0 deletions README.md
@@ -54,6 +54,8 @@ almanac

 Requires Node 20, or Node 22 and newer. The npm package is `codealmanac`; the commands are `almanac` and `alm`.

+Works on macOS, Linux, and native Windows (PowerShell or cmd) — WSL counts as Linux. Scheduled automation uses launchd on macOS and Task Scheduler on Windows.
+
 ## Try The Sample Wiki

 Want to see the shape before running an agent over your own repo?
85 changes: 85 additions & 0 deletions docs/plans/2026-06-21-windows-support.md
@@ -0,0 +1,85 @@
# Windows Support Implementation Plan

**Issue:** [#1 — Not Detecting Codex CLI](https://github.com/AlmanacCode/codealmanac/issues/1)
**Prior art:** Draft [PR #2](https://github.com/AlmanacCode/codealmanac/pull/2) (`codex/windows-support`, cut from v0.2.23, now CONFLICTING). We borrow its scheduler/setup/doctor/install work and re-apply onto current `main`, but fix the parts it missed and consolidate duplicated primitives.

**Goal:** Make codealmanac work end-to-end on native Windows / PowerShell: provider detection, agent execution (capture/bootstrap), run cancellation, and auto-scheduling. No change to capture/Garden semantics on macOS.

---

## Root cause (verified)

Three independent layers break on native Windows:

1. **Detection (the reported bug).** `commandExists` / `defaultCommandExists` / `resolveClaudeExecutable` shell out to `sh -lc 'command -v X'`. Windows has no `sh`, so every provider reports "not found on PATH". Three duplicated copies:
- `src/agent/readiness/providers/cli-status.ts:8` ← **the live path the user's screenshot hits** (`codex-cli.ts` → `commandExists`)
- `src/agent/auth/claude.ts:37` (`resolveClaudeExecutable`)
- `src/harness/providers/codex/status.ts:4` (`defaultCommandExists`)
2. **Spawning the CLIs.** Every `spawn(command, …)` omits `shell`. Since the CVE-2024-27980 hardening (Node 20.12.2 and later), Node on Windows refuses to spawn `.bat`/`.cmd` files without `shell: true`, so npm's `.cmd` shims fail to launch, and `.ps1` shims cannot be spawned directly at all. Affects status probes **and** the real run paths (`harness/providers/codex/exec.ts`, `app-server.ts` → `process/process-group.ts`, `agent/auth/claude.ts` `defaultSpawnCli`).
3. **Process-group lifecycle.** `process/process-group.ts` uses `detached:true` + `process.kill(-pgid)` (POSIX negative-PID group signal). On Windows this throws/no-ops, leaking the agent's child tree on cancel. Windows needs `taskkill /PID <pid> /T /F`.

In addition, the **scheduling** layer is macOS-launchd-only (`src/automation/`, with `/usr/bin/env` hardcoded in setup), so there is no Windows Task Scheduler path.

PR #2 only patched the **old** single-file `harness/providers/codex.ts` (since refactored into `codex/`) and never touched layer-1's live `agent/readiness` path, `agent/auth/claude.ts`, or layer 3.

---

## Architecture decision

Rather than scatter `if (process.platform === "win32") { where … } else { sh … }` and `shell: process.platform === "win32"` across 5+ spawn sites, introduce **one shared cross-platform process module** and route every caller through it. Scattering the branches was PR #2's approach, and it is a smell this project's CLAUDE.md explicitly pushes back on ("a central status file should not know provider-specific details", "no one-off fixes").

`src/process/exec.ts` (new):
- `commandExists(command): boolean` — pure-Node PATH + PATHEXT scan (no subprocess at all). Removes the `sh` dependency on **every** platform, which is strictly more correct.
- `resolveExecutable(command): string | undefined` — full resolved path (used by claude auth's `pathToClaudeCodeExecutable` and to feed spawns).
- `crossSpawn(command, args, options)` — thin wrapper that sets `shell: true` on win32 and resolves shims; single place that knows the Windows quirk.

This collapses 3 copies of `commandExists` into 1 and removes per-site platform branches.
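
A minimal sketch of what the pure-Node PATH + PATHEXT scan could look like, assuming an injectable context so tests can fake the platform and filesystem. The `ExecEnv` shape, the `isExecutable` probe, and the default PATHEXT list are illustrative assumptions, not the module's actual API:

```typescript
import { accessSync, constants, statSync } from "node:fs";
import * as path from "node:path";

// Hypothetical injection shape: lets tests fake platform, env, and filesystem.
export interface ExecEnv {
  platform: NodeJS.Platform;
  env: Record<string, string | undefined>;
  isExecutable?: (file: string) => boolean;
}

function defaultIsExecutable(file: string): boolean {
  try {
    if (!statSync(file).isFile()) return false;
    accessSync(file, constants.X_OK); // on Windows this only checks existence
    return true;
  } catch {
    return false;
  }
}

export function resolveExecutable(
  command: string,
  ctx: ExecEnv = { platform: process.platform, env: process.env },
): string | undefined {
  const win = ctx.platform === "win32";
  const p = win ? path.win32 : path.posix;
  const isExecutable = ctx.isExecutable ?? defaultIsExecutable;
  // Windows often spells the variable `Path`; accept both spellings.
  const pathValue = ctx.env.PATH ?? ctx.env.Path ?? "";
  const dirs = pathValue.split(p.delimiter).filter((d) => d.length > 0);
  // Assumed fallback list when PATHEXT is unset.
  const exts = win
    ? (ctx.env.PATHEXT ?? ".COM;.EXE;.BAT;.CMD").split(";").filter((e) => e.length > 0)
    : [""];
  // A command that already has an extension is also tried as written.
  const suffixes = win && p.extname(command) !== "" ? ["", ...exts] : exts;
  for (const dir of dirs) {
    for (const suffix of suffixes) {
      const candidate = p.join(dir, command + suffix);
      if (isExecutable(candidate)) return candidate;
    }
  }
  return undefined;
}

export function commandExists(command: string, ctx?: ExecEnv): boolean {
  return resolveExecutable(command, ctx) !== undefined;
}
```

Because nothing here starts a subprocess, a test can pass a fake `isExecutable` and a `win32` platform on any host, which matches the env-injection testing style Task 1 asks for.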

---

## Tasks (TDD: failing test → implement → verify, per project convention)

### Task 1 — Shared cross-platform exec module
- **Create** `src/process/exec.ts`: `commandExists`, `resolveExecutable`, `crossSpawn`.
- **Test** `test/process-exec.test.ts`: PATHEXT resolution on a faked win32 env, POSIX `command -v`-equivalent behavior, missing-command returns false. Use `withTempHome` style env injection (inject PATH/PATHEXT + platform, no real subprocess).

### Task 2 — Route detection + status spawns through it
- **Modify** `src/agent/readiness/providers/cli-status.ts` — `commandExists` + `runStatusCommand` use the shared module.
- **Modify** `src/agent/auth/claude.ts` — `resolveClaudeExecutable` + `defaultSpawnCli` use the shared module.
- **Modify** `src/harness/providers/codex/status.ts` — delete the duplicated `defaultCommandExists`/`defaultRunStatus`, import shared.
- **Test**: extend existing provider/codex-harness tests to assert detection succeeds with a Windows `.cmd` shim on PATH (faked).

### Task 3 — Route run/execution spawns through it
- **Modify** `src/harness/providers/codex/exec.ts` and `app-server.ts` (via `process-group.ts`) to spawn through the shared helper so `.cmd`/`.ps1` shims launch.
- **Modify** `src/process/background.ts` detached spawn similarly.
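
As a rough illustration of the shared spawn helper these call sites would route through, the sketch below sets `shell: true` only on win32. `crossSpawnOptions` is a hypothetical helper split out so the platform decision can be tested without starting a process; the real `crossSpawn` may resolve shims differently:

```typescript
import { spawn, type ChildProcess, type SpawnOptions } from "node:child_process";

// Hypothetical helper: computes spawn options without spawning anything.
export function crossSpawnOptions(
  options: SpawnOptions = {},
  platform: NodeJS.Platform = process.platform,
): SpawnOptions {
  // Only Windows needs a shell, so cmd.exe can run npm's `.cmd` shims.
  return platform === "win32" ? { ...options, shell: true } : { ...options };
}

export function crossSpawn(
  command: string,
  args: readonly string[],
  options: SpawnOptions = {},
): ChildProcess {
  return spawn(command, args, crossSpawnOptions(options));
}
```

With `shell: true`, Node joins the command and arguments into one command line without escaping them, so this approach is only safe while the arguments are fixed flags rather than user-supplied text.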

### Task 4 — Windows-safe process termination
- **Modify** `src/process/process-group.ts`: on win32, terminate via `taskkill /PID <pid> /T /F` instead of `process.kill(-pgid)`; keep POSIX path unchanged. Guard `detached` semantics per-platform.
- **Test** `test/process-group.test.ts`: win32 branch invokes taskkill (injected exec), POSIX branch unchanged.
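
One way the per-platform termination and its injected-exec test seam could be shaped is sketched here. `terminateTree` and `TerminateDeps` are hypothetical names chosen for illustration; only the `taskkill /PID <pid> /T /F` invocation and the negative-PID group signal come from the plan:

```typescript
import { execFile } from "node:child_process";

// Hypothetical dependency bundle so tests can observe calls without killing anything.
export interface TerminateDeps {
  platform: NodeJS.Platform;
  exec: (file: string, args: string[]) => void;
  kill: (pid: number, signal: NodeJS.Signals) => void;
}

const defaultDeps: TerminateDeps = {
  platform: process.platform,
  exec: (file, args) => {
    execFile(file, args, () => {
      // Best-effort: the tree may already be gone.
    });
  },
  kill: (pid, signal) => {
    process.kill(pid, signal);
  },
};

export function terminateTree(pid: number, deps: TerminateDeps = defaultDeps): void {
  if (deps.platform === "win32") {
    // /T walks the child tree, /F forces termination.
    deps.exec("taskkill", ["/PID", String(pid), "/T", "/F"]);
    return;
  }
  try {
    // A negative PID signals the whole process group (needs `detached: true` at spawn).
    deps.kill(-pid, "SIGTERM");
  } catch {
    // Group already exited.
  }
}
```

A test can pass recording stubs for `exec` and `kill` and assert which one was called for each platform.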

### Task 5 — Windows Task Scheduler (borrow PR #2)
- **Create** `src/commands/automation/windows.ts` (install/status/uninstall via `schtasks`, manifests under `~/.almanac/automation/`). Re-apply PR #2's file; fix the stray tab-indentation in its source.
- **Modify** `src/commands/automation.ts` — `platform` injection + win32 branch (from PR #2).
- **Modify** `src/cli/register-wiki-lifecycle-commands.ts` — generic "platform scheduler" descriptions.
- **Test** `test/automation.test.ts` — add `platform:"darwin"` to existing launchd tests; add win32 schtasks tests (from PR #2).
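
For illustration, here is a sketch of how an interval in seconds might be mapped onto `schtasks /Create` arguments. The function names are hypothetical and the real `windows.ts` may choose different schedule types; the flags used (`/Create`, `/TN`, `/TR`, `/SC`, `/MO`, `/F`) are standard `schtasks` options, and `MINUTE` schedules accept modifiers from 1 to 1439:

```typescript
export type Schedule =
  | { ok: true; args: string[] }
  | { ok: false; error: string };

// Hypothetical mapping from an interval to schtasks schedule flags.
export function schtasksSchedule(intervalSeconds: number): Schedule {
  if (!Number.isInteger(intervalSeconds) || intervalSeconds < 60 || intervalSeconds % 60 !== 0) {
    // Task Scheduler has no sub-minute granularity, so an interval like 30s is rejected.
    return { ok: false, error: "interval must be a whole number of minutes, at least 1m" };
  }
  const minutes = intervalSeconds / 60;
  if (minutes <= 1439) {
    return { ok: true, args: ["/SC", "MINUTE", "/MO", String(minutes)] };
  }
  const days = minutes / 1440;
  if (Number.isInteger(days) && days <= 365) {
    return { ok: true, args: ["/SC", "DAILY", "/MO", String(days)] };
  }
  return { ok: false, error: "interval cannot be expressed as a schtasks schedule" };
}

// Hypothetical argument builder; /F overwrites an existing task with the same name.
export function schtasksCreateArgs(taskName: string, command: string, schedule: string[]): string[] {
  return ["/Create", "/TN", taskName, "/TR", command, ...schedule, "/F"];
}
```

Validating the interval in a pure function like this keeps rejection cheap and testable before anything is written to disk or registered with the scheduler.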

### Task 6 — Setup / doctor / install path platform-awareness (borrow PR #2)
- **Create** `src/install/ephemeral.ts` (`looksEphemeralInstallPath`, handles `%TEMP%`/`%TMP%`/`_npx`).
- **Modify** `src/commands/setup.ts` (win32 `almanac.cmd` program args), `setup/install-path.ts` (`cmd.exe /d /s /c npm.cmd …`), `doctor-checks/install.ts` + `probes.ts` + `types.ts`, `uninstall.ts`.
- **Test**: extend `test/setup.test.ts`, `test/doctor.test.ts`, `test/uninstall.test.ts` with win32 cases (from PR #2).
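
A rough sketch of what an ephemeral-install check could look like, assuming it treats any `_npx` path segment or any location under the temp directories as ephemeral. Only the function name and the `%TEMP%`/`%TMP%`/`_npx` cases come from the plan; the parameters, the POSIX fallbacks, and the matching rules are assumptions:

```typescript
import * as path from "node:path";

export function looksEphemeralInstallPath(
  installPath: string,
  env: Record<string, string | undefined> = process.env,
  platform: NodeJS.Platform = process.platform,
): boolean {
  const win = platform === "win32";
  const p = win ? path.win32 : path.posix;
  // Windows paths compare case-insensitively, so fold case there.
  const norm = (s: string): string => {
    const n = p.normalize(s);
    return win ? n.toLowerCase() : n;
  };
  const target = norm(installPath);
  // npx unpacks packages under a cache directory segment named `_npx`.
  if (target.split(p.sep).includes("_npx")) return true;
  const roots = win ? [env.TEMP, env.TMP] : [env.TMPDIR, "/tmp"];
  return roots
    .filter((r): r is string => typeof r === "string" && r.length > 0)
    .map(norm)
    .some((root) => {
      // Compare on a separator boundary so `Temp` does not match `Temperature`.
      const prefix = root.endsWith(p.sep) ? root : root + p.sep;
      return target === root || target.startsWith(prefix);
    });
}
```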

### Task 7 — CI + docs
- **Modify** `.github/workflows/ci.yml` — add `windows-latest` matrix (Node 20 & 22) (from PR #2).
- **Modify** `README.md` — drop "macOS only", document Windows support + scheduler caveat.
- Update `.almanac/` pages PR #2 touched if still accurate.

---

## Out of scope / risks
- `cursor-agent` on Windows is detected/spawned the same way but unverified (no cursor CLI here).
- WSL is already covered (it's Linux); this targets **native** Windows.
- `taskkill`-based termination is best-effort. `/F` is a forced kill with no graceful SIGTERM equivalent, so cancellation on Windows is more abrupt than on POSIX. Acceptable.
- Paths containing spaces under `shell:true`: mitigated by resolving full paths and quoting them; status/run args are simple flags.

## Verification
`npm run lint` (tsc), `npm test` (vitest), `npm run build` (tsup) — all green. Then a real-machine smoke test on this Windows box: `almanac` status detects Codex, and a `capture`/`bootstrap` dry run launches the agent.
15 changes: 6 additions & 9 deletions src/agent/auth/claude.ts
@@ -1,8 +1,9 @@
-import { spawn, spawnSync, type ChildProcess } from "node:child_process";
+import { spawn, type ChildProcess } from "node:child_process";
 import { createRequire } from "node:module";
 import { dirname, join } from "node:path";

 import type { SpawnCliFn, SpawnedProcess } from "../types.js";
+import { crossSpawn, resolveExecutable } from "../../process/exec.js";

 /**
  * Claude auth gate — accepts either an active Claude subscription login
@@ -34,12 +35,7 @@ const AUTH_TIMEOUT_MS = 10_000;
  * the same binary so Almanac agrees with `claude auth status`.
  */
 export function resolveClaudeExecutable(): string | undefined {
-  const result = spawnSync("sh", ["-lc", "command -v claude"], {
-    encoding: "utf8",
-  });
-  if (result.status !== 0) return undefined;
-  const found = result.stdout.trim().split("\n")[0]?.trim();
-  return found !== undefined && found.length > 0 ? found : undefined;
+  return resolveExecutable("claude");
 }

 /**
@@ -58,8 +54,9 @@ function resolveCliJsPath(): string {
  * Claude Code CLI.
  */
 export const defaultSpawnCli: SpawnCliFn = (args: string[]) => {
-  const command = resolveClaudeExecutable() ?? "claude";
-  const child = spawn(command, args, {
+  // Pass the bare command so crossSpawn lets the shell resolve the npm
+  // `.cmd` shim on Windows; on POSIX it resolves via PATH as before.
+  const child = crossSpawn("claude", args, {
     stdio: ["ignore", "pipe", "pipe"],
   });
   return child as unknown as SpawnedProcess;
13 changes: 5 additions & 8 deletions src/agent/readiness/providers/cli-status.ts
@@ -1,15 +1,12 @@
-import { spawn, spawnSync, type ChildProcess } from "node:child_process";
+import { type ChildProcess } from "node:child_process";

 import type { SpawnCliFn } from "../../types.js";
+import { commandExists, crossSpawn } from "../../../process/exec.js";

 const STATUS_TIMEOUT_MS = 3_000;

-export function commandExists(command: string): boolean {
-  const result = spawnSync("sh", ["-lc", `command -v ${command}`], {
-    encoding: "utf8",
-  });
-  return result.status === 0 && result.stdout.trim().length > 0;
-}
+// Re-exported so providers keep a single import site for PATH detection.
+export { commandExists };

 export function runInjectedStatusCommand(
   spawnCli: SpawnCliFn,
@@ -75,7 +72,7 @@ export function runStatusCommand(
       resolve(value);
     };
     try {
-      child = spawn(command, args, { stdio: ["ignore", "pipe", "pipe"] });
+      child = crossSpawn(command, args, { stdio: ["ignore", "pipe", "pipe"] });
     } catch (err: unknown) {
       const msg = err instanceof Error ? err.message : String(err);
       resolve({ ok: false, detail: msg });
4 changes: 2 additions & 2 deletions src/cli/register-wiki-lifecycle-commands.ts
@@ -324,7 +324,7 @@ export function registerWikiLifecycleCommands(program: Command): void {

   automation
     .command("install [tasks...]")
-    .description("install the macOS launchd automation jobs")
+    .description("install the platform scheduler automation jobs (launchd on macOS, Task Scheduler on Windows)")
     .option("--every <duration>", "run interval for capture or a single selected task")
     .option("--quiet <duration>", "minimum quiet time before capture (default: 45m)")
     .option("--garden-every <duration>", "Garden run interval (default: 4h)")
@@ -353,7 +353,7 @@ export function registerWikiLifecycleCommands(program: Command): void {

   automation
     .command("uninstall [tasks...]")
-    .description("remove the macOS launchd automation jobs")
+    .description("remove the platform scheduler automation jobs (launchd on macOS, Task Scheduler on Windows)")
     .action(async (tasks: string[]) => {
       const parsed = parseAutomationTaskIds(tasks);
       if (!parsed.ok) {
62 changes: 60 additions & 2 deletions src/commands/automation.ts
@@ -1,6 +1,8 @@
+import { execFile } from "node:child_process";
 import { existsSync } from "node:fs";
 import { homedir } from "node:os";
 import path from "node:path";
+import { promisify } from "node:util";

 import {
   bootstrapLaunchdJob,
@@ -33,8 +35,20 @@ import type { CommandResult } from "../cli/helpers.js";
 import { ensureAutomationCaptureSince } from "../config/index.js";
 import { parseDuration } from "../indexer/duration.js";
 import { findNearestAlmanacDir } from "../paths.js";
+import {
+  installWindowsAutomation,
+  statusWindowsAutomation,
+  uninstallWindowsAutomation,
+  type WindowsAutomationJob,
+} from "./automation/windows.js";

 export { cleanupLegacyHooks } from "../automation/legacy-hooks.js";
+export {
+  defaultWindowsCaptureManifestPath,
+  readWindowsManifest,
+  windowsManifestPath,
+  windowsTaskName,
+} from "./automation/windows.js";

 export interface AutomationOptions {
   tasks?: ScheduledTaskId[];
@@ -54,6 +68,8 @@ export interface AutomationOptions {
   exec?: ExecFn;
   now?: Date;
   configPath?: string;
+  /** Override scheduler platform; production uses `process.platform`. */
+  platform?: NodeJS.Platform;
 }

 export interface AutomationStatusOptions {
@@ -63,6 +79,8 @@ export interface AutomationStatusOptions {
   gardenPlistPath?: string;
   updatePlistPath?: string;
   exec?: ExecFn;
+  /** Override scheduler platform; production uses `process.platform`. */
+  platform?: NodeJS.Platform;
 }

 interface PlannedAutomationJob {
@@ -82,6 +100,25 @@ const TASK_LABELS: Record<ScheduledTaskId, string> = {
   update: "auto-update automation",
 };

+const execFileAsync = promisify(execFile);
+
+async function defaultWindowsExec(
+  file: string,
+  args: string[],
+): Promise<{ stdout?: string; stderr?: string }> {
+  return await execFileAsync(file, args);
+}
+
+function toWindowsJob(planned: PlannedAutomationJob): WindowsAutomationJob {
+  return {
+    taskId: planned.task.id,
+    intervalInput: planned.intervalInput,
+    intervalSeconds: planned.job.intervalSeconds,
+    programArguments: planned.job.programArguments,
+    workingDirectory: planned.job.workingDirectory,
+  };
+}
+
 export async function runAutomationInstall(
   options: AutomationOptions = {},
 ): Promise<CommandResult> {
@@ -90,15 +127,26 @@ export async function runAutomationInstall(
     return { stdout: "", stderr: `almanac: ${plan.error}\n`, exitCode: 1 };
   }

-  await writeAutomationPlists(plan.value);
-
   const captureJob = plan.value.jobs.find((job) => job.task.id === "capture");
   const captureSince = captureJob === undefined
     ? null
     : await ensureAutomationCaptureSince(
       (options.now ?? new Date()).toISOString(),
       options.configPath,
     );
+
+  if ((options.platform ?? process.platform) === "win32") {
+    return installWindowsAutomation({
+      home: options.homeDir ?? homedir(),
+      jobs: plan.value.jobs.map(toWindowsJob),
+      disabledTaskIds: plan.value.disabledGardenPlistPath !== null ? ["garden"] : [],
+      captureSince,
Comment on lines +138 to +143

P2 Badge Record capture baseline only after install succeeds

On Windows this branch calls installWindowsAutomation only after ensureAutomationCaptureSince has already written automation.capture_since. If windowsSchedule rejects the interval (for example almanac automation install --every 30s) or schtasks /Create fails, the command exits non-zero but the baseline has advanced, so a later successful install skips transcripts from before the failed attempt. Validate/create the Windows tasks before recording the capture baseline, or roll the baseline back on failure.
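
The ordering this comment asks for could be sketched as follows: compute the candidate baseline without persisting it, run the install, and write the baseline only when the install succeeds. All three callbacks and the `installWithBaseline` name are hypothetical stand-ins; how they map onto `ensureAutomationCaptureSince` and `installWindowsAutomation` is an assumption:

```typescript
type CommandResult = { stdout: string; stderr: string; exitCode: number };

// Hypothetical orchestration: persist the capture baseline only after a successful install.
export async function installWithBaseline(
  proposeBaseline: () => string, // computes a value, writes nothing
  install: (captureSince: string) => Promise<CommandResult>,
  persistBaseline: (value: string) => Promise<void>,
): Promise<CommandResult> {
  const candidate = proposeBaseline();
  const result = await install(candidate);
  if (result.exitCode === 0) {
    await persistBaseline(candidate);
  }
  // On failure the stored baseline is untouched, so a retry sees earlier transcripts.
  return result;
}
```

The alternative the comment mentions, rolling the baseline back on failure, would need the previous value to be read before the write, which this ordering avoids.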


+      exec: options.exec ?? defaultWindowsExec,
+    });
+  }
+
+  await writeAutomationPlists(plan.value);
+
   const activated = await activateAutomationJobs(plan.value, options.exec);
   if (!activated.ok) {
     return activated.result;
@@ -116,6 +164,13 @@ export async function runAutomationUninstall(
 ): Promise<CommandResult> {
   const home = options.homeDir ?? homedir();
   const tasks = selectedTaskIds(options.tasks, false);
+  if ((options.platform ?? process.platform) === "win32") {
+    return uninstallWindowsAutomation({
+      home,
+      taskIds: tasks,
+      exec: options.exec ?? defaultWindowsExec,
+    });
+  }
   const exec = options.exec;
   const removed: string[] = [];
   for (const task of tasks.map((id) => scheduledTaskDefinition(id))) {
@@ -145,6 +200,9 @@ export async function runAutomationStatus(
 ): Promise<CommandResult> {
   const home = options.homeDir ?? homedir();
   const tasks = selectedTaskIds(options.tasks, false);
+  if ((options.platform ?? process.platform) === "win32") {
+    return statusWindowsAutomation({ home, taskIds: tasks });
+  }
   const sections: string[] = [];
   for (const task of tasks.map((id) => scheduledTaskDefinition(id))) {
     const status = await readLaunchdJobStatus({