Releases · Protocol-zero-0/evolution-kernel

26 May 10:09

v1.1.2

5dfef32

v1.1.2 — Packaging fix: roles/ now ship in the wheel Latest


Released: 2026-05-26 · hours after v1.1.1 · pip install -U evolution-kernel

TL;DR: v1.1.0 and v1.1.1 wheels shipped the runtime but not the reference role scripts — pip install evolution-kernel users had nowhere to point roles.executor at. This release fixes the packaging and adds a bundled: prefix so the same config works for pip install and git clone setups.

The bug

The wheel only contained evolution_kernel/ and evolution_kernel/templates/. The reference roles (planner.py, executor.sh, evaluator.py, goal_evaluator.py, strategist.py) lived at the repo top level, outside the package, so setuptools never bundled them. The README's ["python3", "roles/planner.py"] only resolved for users whose CWD was a fresh git-clone of the repo. Discovered while smoke-testing the v1.1.1 ship — fixed before sundown.

The fix

Move roles/* → evolution_kernel/roles/* so setuptools ships them as package_data.
Add a bundled:<name> prefix that the kernel resolves to the absolute path inside the installed wheel via importlib.resources. Same evolution.yml now works for both pip install users and developers running from a checkout.
Update all 5 evolution-kernel init templates + examples/evolution.yml + both READMEs to use bundled: form. evolution-kernel init now writes configs that work out of the box on a pip-installed box.

roles:
  planner:   ["python3", "bundled:planner.py"]
  executor:  ["bash",    "bundled:executor.sh"]
  evaluator: ["python3", "bundled:evaluator.py"]

→ 30-second smoke test:

pip install -U evolution-kernel
evolution-kernel init     # 3 questions, drops a working evolution.yml

Backward compatibility

bundled: is opt-in. Plain argv (["python3", "myplanner.py"]) still passes through unchanged. Existing configs that hardcoded roles/X at the repo top level need updating to bundled:X — re-run evolution-kernel init for a fresh config, or do a single search-and-replace.
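The resolution rule can be sketched roughly as follows. This is a minimal illustration, not the kernel's actual code: the function name `resolve_bundled`, the `package` parameter, and the chosen exception types are assumptions. The pass-through, path-separator, and missing-file behaviours mirror what the release's tests are said to cover.

```python
from importlib import resources

BUNDLED_PREFIX = "bundled:"


def resolve_bundled(argv, package="evolution_kernel.roles"):
    """Return a copy of argv with every bundled:<name> item replaced by a path.

    Hypothetical helper: the package name and the error types are assumptions.
    """
    resolved = []
    for arg in argv:
        if not arg.startswith(BUNDLED_PREFIX):
            # Plain argv is opt-out by default: pass it through unchanged.
            resolved.append(arg)
            continue
        name = arg[len(BUNDLED_PREFIX):]
        # Reject anything that could escape the package directory.
        if not name or "/" in name or "\\" in name or name in (".", ".."):
            raise ValueError(f"bundled role must be a bare file name: {arg!r}")
        target = resources.files(package).joinpath(name)
        if not target.is_file():
            raise FileNotFoundError(f"no bundled role named {name!r} in {package}")
        resolved.append(str(target))
    return resolved
```

With the fixed wheel installed, calling `resolve_bundled(["python3", "bundled:planner.py"])` would be expected to yield a path under site-packages, while `["python3", "myplanner.py"]` comes back untouched.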

Verification

108 tests pass (102 prior + 6 new bundled-role tests covering happy path, no-op for plain argv, path-separator rejection, missing-file errors, and end-to-end through load_config)
Wheel now contains all 5 role scripts under evolution_kernel/roles/
Fresh-venv pip install + bundled: resolution returns real site-packages paths for all 5 roles
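A check like the wheel-contents item above can be reproduced with the standard library, since a wheel is a zip archive. The sketch below is an assumption about how one might verify it, not the project's own test; the helper name `missing_roles` is hypothetical, and the five file names are taken from the bug description.

```python
import zipfile

# The five reference roles named in the release notes.
EXPECTED_ROLES = (
    "planner.py",
    "executor.sh",
    "evaluator.py",
    "goal_evaluator.py",
    "strategist.py",
)


def missing_roles(wheel, prefix="evolution_kernel/roles/"):
    """Return the sorted role scripts that are absent from the wheel archive.

    `wheel` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(wheel) as archive:
        names = set(archive.namelist())
    return sorted(role for role in EXPECTED_ROLES if prefix + role not in names)
```

An empty list means all five scripts are present under `evolution_kernel/roles/`.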

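Chinese summary (translated)

Released: 2026-05-26 · a few hours after v1.1.1 · pip install -U evolution-kernel

In one sentence: the v1.1.0 / v1.1.1 wheels carried only the runtime and not the 5 reference role scripts, so pip install evolution-kernel users had no file to point roles.executor at. This release fixes the packaging and adds the bundled: prefix, so the same evolution.yml runs as-is under both pip install and git clone setups.

Where the bug was

Previously the wheel contained only the evolution_kernel/ package plus templates/*.yml. The 5 reference role scripts (`planner.py` / `executor.sh` / `evaluator.py` / `goal_evaluator.py` / `strategist.py`) lived at the repo top level, outside the package, so setuptools did not pack them. The README's `["python3", "roles/planner.py"]` only worked for users whose cwd was the root of a git-cloned repo. The problem was found during the v1.1.1 release smoke test and fixed the same day.

How it was fixed

Move `roles/` → `evolution_kernel/roles/` so setuptools ships them as `package_data`
Add the `bundled:` prefix, which the kernel resolves via `importlib.resources` to an absolute path inside the wheel; the same `evolution.yml` runs directly for both pip installs and git clones
Update the 5 init templates + `examples/evolution.yml` + the Chinese and English READMEs to use the `bundled:` form throughout. Configs written by `evolution-kernel init` work out of the box on a pip-installed machine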

roles:
  planner:   ["python3", "bundled:planner.py"]
  executor:  ["bash",    "bundled:executor.sh"]
  evaluator: ["python3", "bundled:evaluator.py"]

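→ 30-second smoke test:

pip install -U evolution-kernel
evolution-kernel init     # 3 questions, drops a working evolution.yml

Compatibility

`bundled:` is opt-in. Plain argv (e.g. `["python3", "myplanner.py"]`) passes through unchanged. Existing configs that hardcode `roles/X` (top-level path) need to change to `bundled:X`; re-running `evolution-kernel init` or a one-off search-and-replace is enough.

Verification

108 tests pass (102 existing + 6 new bundled-role tests covering happy path / plain argv pass-through / path-separator rejection / missing-file errors / end-to-end through `load_config`)
The wheel now contains the 5 role scripts (under `evolution_kernel/roles/`)
In a clean venv, `pip install` + `bundled:` resolution returns real site-packages paths for all 5 roles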


26 May 09:55

Protocol-zero-0

v1.1.1

33f9a73

v1.1.1 — Executor permission-mode fix

Released: 2026-05-26 · 9 days after v1.1.0 · pip install -U evolution-kernel

TL;DR: If you tried coding_agent.tool: claude-code on v1.1.0 and your runs silently produced no patch, this fixes it.

→ Try the 10-minute quickstart on the fixed build: examples/quickstart/ — zero cost, no Anthropic API key required (Claude Pro is enough).

What changed

roles/executor.sh now passes --permission-mode bypassPermissions to claude -p. Without it, the inner Claude session refuses to edit files in non-interactive mode — the kernel run completes but the worktree stays unchanged, every attempt gets rejected by the evaluator, and budget is spent on no-op iterations.

Why this is safe

Governor already isolates each attempt in a temporary git worktree, and scope.py rejects any change outside allowed_paths. The kernel is the trust boundary; the inner claude -p session does not need its own permission prompts on top of that. With sandbox.enabled: true (firejail), you also get an OS-level read-only-root cage — two layers below the executor, so bypassPermissions inside the inner agent is the right default.
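To make the scope argument concrete, here is a minimal sketch of an allowed-paths check. The release notes do not show how `scope.py` is implemented, so the function name `out_of_scope` and the matching rule (a changed file must equal an allowed path or sit beneath it) are assumptions for illustration only.

```python
from pathlib import PurePosixPath


def out_of_scope(changed_files, allowed_paths):
    """Return the changed files that fall outside every allowed path.

    Hypothetical rule: paths are repo-relative; absolute paths and any
    path containing ".." are always rejected.
    """
    allowed = [PurePosixPath(p) for p in allowed_paths]
    rejected = []
    for changed in changed_files:
        path = PurePosixPath(changed)
        if path.is_absolute() or ".." in path.parts:
            rejected.append(changed)
            continue
        # Accept an exact match or any file below an allowed directory.
        if not any(path == root or root in path.parents for root in allowed):
            rejected.append(changed)
    return rejected
```

Matching on path components rather than string prefixes keeps `src2/b.py` from slipping through an allowance for `src/`.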

Affected configurations

You're affected if your evolution.yml sets coding_agent.tool: claude-code. Setups using aider or a custom executor (like examples/oss_fix_demo/) are unaffected.

Compatibility

No breaking changes. Pure bug fix. Single runtime dependency (PyYAML) preserved. Cumulative test count: 102 (unchanged from v1.1.0).

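Chinese summary (translated)

Released: 2026-05-26 · exactly 9 days after v1.1.0 · pip install -U evolution-kernel

In one sentence: on v1.1.0, running the kernel with `coding_agent.tool: claude-code` left every attempt "finished but with no files changed"; this release fixes that.

→ Get on the fixed build in 10 minutes: `examples/quickstart/`, zero cost, no Anthropic API key needed (a Claude Pro subscription is enough).

What changed

`roles/executor.sh` now adds `--permission-mode bypassPermissions` when calling `claude -p`. Without the flag, Claude refused to edit files in non-interactive mode: the kernel run completed but the worktree was untouched, every attempt was rejected by the evaluator, and budget was burned on idle iterations.

Why this is safe

Governor already isolates each attempt in a temporary git worktree, and `scope.py` rejects any change outside `allowed_paths`. The kernel is the trust boundary, so the inner `claude -p` does not need another layer of permission prompts. With `sandbox.enabled: true` (firejail) you can stack an OS-level read-only-root cage on top, giving two sandbox layers below the executor, so `bypassPermissions` is the right default for the inner agent.

Who is affected

You are affected if your `evolution.yml` sets `coding_agent.tool: claude-code`. Setups using `aider` or a custom executor (such as `examples/oss_fix_demo/`) are not affected.

Compatibility

No breaking changes. The single dependency (PyYAML) is unchanged. Cumulative test count: 102 (same as v1.1.0).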


17 May 06:34

Protocol-zero-0

v1.1.0

5bfb505

v1.1.0 — ready to publicize

The v1.0 line shipped the runtime; v1.1 ships the first 10 minutes a stranger spends with the runtime. No kernel refactor, no new dependencies, no abstractions — just the onboarding surface that turns "interesting README" into "I just ran it."

What's new

`evolution-kernel init` — three-question scaffolder (#28)

A new subcommand asks 3 questions — mission, template, allowed paths — and drops a valid evolution.yml in the current directory. Five starter templates ship as plain YAML and cover the common mission shapes:

lint — drive a linter/formatter to zero violations
coverage — raise test coverage
perf — optimize a measurable workload
benchmark — FunSearch / AlphaEvolve-style population search (k-branch parallel)
custom — blank-ish starter

No interactive prompt library, no template base classes, no Python template generator — the wizard is 76 lines of stdlib input() calls, and the rendered output is fed through load_config() before it can hit disk, so a broken template can never escape.
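The "validate before it can hit disk" guarantee can be sketched as below. This is an illustration under stated assumptions rather than the wizard's real code: `write_validated_config` is a hypothetical name, and the `validate` callable stands in for `load_config()`, whose signature the notes do not give.

```python
import os
import tempfile


def write_validated_config(rendered, dest, validate):
    """Write `rendered` to `dest` only if `validate(rendered)` does not raise."""
    # Validation runs first, so an invalid template never creates a file.
    validate(rendered)
    directory = os.path.dirname(os.path.abspath(dest))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(rendered)
        # Atomic rename: readers see either the old file or the full new one.
        os.replace(tmp_path, dest)
    except BaseException:
        os.unlink(tmp_path)
        raise
    return dest
```

Writing through a temporary file is an extra precaution chosen for this sketch; the notes only state that the rendered output is validated before being written.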

`examples/quickstart/` — see the loop close in 1.4 seconds (#30)

A turn-key demo that takes a stranger from git clone to evolution/accepted commit in one shell snippet, zero cost, no API key:

pip install -e . && pip install ruff
bash examples/quickstart/setup.sh
evolution-kernel --config examples/quickstart/evolution.yml \
                 --repo /tmp/ek-quickstart-target \
                 --ledger /tmp/ek-quickstart-ledger --loop

Measured wall-clock on a developer laptop: 1.4 s. The mission is small on purpose (drive ruff to zero violations on src/messy.py) so that the entire closed loop (worktree sandbox, scope enforcement, ledger writes, evolution/accepted branch advancing) fits in one terminal scroll. No LLM is in the loop; the planner, executor, and evaluator are deterministic Python scripts committed into the demo target itself. This is intentional: the example demonstrates the runtime, not LLM smarts. For the LLM-driven story, see examples/oss_fix_demo/ below.

`examples/oss_fix_demo/` — real OSS fix via `claude` CLI (#32)

The companion to quickstart, pointed at a real published OSS package: python-slugify v8.0.4 (1,106 LoC, MIT). The executor is claude -p --permission-mode acceptEdits, billed against the operator's Claude Pro / Max subscription — no API key, no per-token charge.

Verified end-to-end (2026-05-17):

10 real ruff violations on the cloned target
Claude makes the semantic edits (F401 → explicit as-alias re-exports), wall-clock 34 s
A ruff check --fix && ruff format postprocess mops up structural autofixes (I001 import sort)
Run 0001 accepted, evolution/accepted advanced, real commit bae97a8 landed
Total --loop time: 48 s, $0 marginal cost

The realistic split — LLM does semantic work, deterministic tooling handles structural cleanup — mirrors how production teams actually chain agents with formatters.

README hero block (#34)

The first thing a visitor sees is now a copy-pasteable ▶ Try in 10 minutes snippet plus a compact ASCII workflow diagram showing Observe → Plan → Execute → Evaluate → accept/reject → ledger. The existing investor-narrative Motivation / SWE-bench Verified worked-example sections are intact below.

Numbers

102 tests pass under python -m unittest discover -s tests on Python 3.10 and 3.12 (99 baseline + 3 new for the init wizard, covering all 5 templates via subTest).
evolution_kernel/*.py: 1,969 lines — well under the v1.1 soft cap of baseline + 200 (= 2,089).
Single runtime dependency (PyYAML) preserved.
No kernel changes — every new behavior lives in init_wizard.py, the templates, or under examples/. The runtime that v1.0.0 froze is byte-identical.

Issues closed

#27 evolution-kernel init subcommand + 5 YAML templates
#29 examples/quickstart/ 10-minute zero-cost ruff cleanup demo
#31 examples/oss_fix_demo/ real OSS fix via claude CLI
#33 README hero — ▶ Try in 10 minutes + ASCII workflow

Migration

None. v1.1 is a strict superset of v1.0 on the kernel surface. Existing configs and ledgers continue to work unchanged.


14 May 03:24

Protocol-zero-0

v1.0.0

5e3ab83

v1.0.0 — Phase 4: Sandbox + Remote Observer

The kernel crosses the "soul plugin" (灵魂插件) bar: an evolution runtime you can point at any git repository and trust to run unattended overnight, sandboxed at the OS level, with evidence pulled from anywhere on the network.

Highlights

Process sandbox via firejail (PR #19, closes #17)

Executor argv is wrapped with firejail --quiet --noprofile --read-only=/ --read-write=<worktree> --read-write=<run_dir> when sandbox.enabled: true.
The rest of the filesystem is mounted read-only, so a misbehaving executor cannot write to /tmp, ~/.ssh, or anywhere else on disk during a round.
Verified end-to-end: planted /tmp/sandbox-leak-<run_id>.txt write attempt blocked with OSError: [Errno 30] Read-only file system; same fixture without the sandbox writes the file. CI installs firejail and runs this assertion on every push.
Planner and evaluator stay unsandboxed (read-mostly) to keep the blast radius of any policy bug as small as possible.
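The argv wrapping described in the list above can be sketched as a small pure function. The firejail flags are copied from the release notes; the function name `wrap_with_firejail` and its parameters are assumptions, and the kernel's real wrapper may differ.

```python
def wrap_with_firejail(argv, worktree, run_dir, enabled=False):
    """Prefix an executor argv with the firejail flags listed in the notes.

    Hypothetical helper: with enabled=False the argv is returned unchanged,
    matching the documented default of sandbox.enabled: false.
    """
    if not enabled:
        return list(argv)
    return [
        "firejail",
        "--quiet",
        "--noprofile",
        "--read-only=/",                # whole filesystem read-only...
        f"--read-write={worktree}",     # ...except the attempt's worktree
        f"--read-write={run_dir}",      # ...and the run directory
        *argv,
    ]
```

Keeping the wrapper a pure list transformation makes it easy to assert on the exact command line without having firejail installed.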

Remote observer — HTTP evidence source (PR #20, closes #18)

evidence_sources gains type: http with url, method, headers, timeout.
Uses stdlib urllib.request, so the kernel's single-dependency rule still holds — pyproject.toml lists only PyYAML>=6.0.
Captures status, body (64 KiB cap with a truncated flag), and a sorted list of response headers so the ledger is stable for diffing.
Non-2xx responses still record the body — the planner gets to decide how to react instead of the observer silently retrying.
Verified end-to-end: a Governor.run_once against a local python3 -m http.server records the JSON response under observation.json.sources[0] with status: 200.
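A rough sketch of an observer with the properties listed above, using only `urllib`. The function names, the record layout, and the `"Name: value"` header format are assumptions; only the 64 KiB cap, the `truncated` flag, the sorted headers, and the "non-2xx still records the body" behaviour come from the notes.

```python
import urllib.error
import urllib.request

BODY_CAP_BYTES = 64 * 1024  # 64 KiB cap, as stated in the release notes


def summarize_response(status, raw, header_items):
    """Build a stable, diff-friendly record from a raw HTTP response."""
    return {
        "status": status,
        "body": raw[:BODY_CAP_BYTES].decode("utf-8", errors="replace"),
        "truncated": len(raw) > BODY_CAP_BYTES,
        # Sorting keeps the ledger entry stable across runs.
        "headers": sorted(f"{key}: {value}" for key, value in header_items),
    }


def fetch_http_evidence(url, method="GET", headers=None, timeout=10.0):
    """Fetch one evidence source; non-2xx responses are recorded, not retried."""
    request = urllib.request.Request(url, method=method, headers=headers or {})
    try:
        response = urllib.request.urlopen(request, timeout=timeout)
        status = response.status
    except urllib.error.HTTPError as err:
        # HTTPError doubles as a response object carrying the error body.
        response, status = err, err.code
    with response:
        # Read one byte past the cap so truncation can be detected.
        raw = response.read(BODY_CAP_BYTES + 1)
        return summarize_response(status, raw, response.headers.items())
```

Connection failures and timeouts still raise here; how the kernel records those is not described in the notes.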

Test surface

Metric	v0.3 → v1.0
Tests	67 → 99 (+16 sandbox, +16 HTTP)
Runtime	~1,200 → ~1,400 lines of Python
Third-party deps	1 (PyYAML) → 1 (PyYAML)
CI matrix	Python 3.10 + 3.12, both green with firejail installed

Compatibility

sandbox.enabled defaults to false. v0.3 configs run byte-for-byte identically.
type: http is an additive evidence source; existing type: file and type: shell are unchanged.
The EvidenceSource dataclass gained new fields but kept the existing ones at the same positions.

What ships in v1.0

Multi-round LLM loop with history injection.
max_total_usd / max_total_tokens / max_iterations / max_consecutive_failures hard stops.
Full ledger audit trail (survives process restarts).
Git worktree sandbox per attempt.
Scope enforcement against allowed_paths.
Aider + Claude Code executor adapters, Anthropic + OpenAI planner/evaluator adapters.
Goal evaluator — stops when the mission is "won".
k-branch parallel exploration (FunSearch / AlphaEvolve style).
NEW Process sandbox via firejail.
NEW Remote observer (HTTP evidence source).

Changelog (since v0.3.0)

5e3ab83 chore: release v1.0.0 (#21)
61bb2ee feat: HTTP evidence source for the observer (closes #18) (#20)
f41b55f feat: process sandbox via firejail (closes #17) (#19)


13 May 19:55

Protocol-zero-0

v0.3.0

818860b

v0.3.0 — k-branch parallel exploration

First tagged release. The repository's pyproject.toml is now in sync with the public version — earlier README badges (v0.2) referenced unreleased states.

Highlights since v0.1.0

This release bundles four months of work that took the kernel from a flat MVP to a population-level evolution runtime:

🧬 Phase 3 — k-branch parallel exploration (#15)

Governor.run_once_parallel(goal, k) spawns k independent worktrees per round, each running plan → execute → evaluate.
The highest-fitness survivor is promoted to evolution/accepted; the rest are recorded under ledger/failed/.
New parallel.k_branches config field (default 1, fully back-compatible).
Evaluator role now emits a float fitness; older evaluators that only set hard_gates_passed keep working via automatic back-fill.
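The survivor selection and fitness back-fill described above might look roughly like this. The dictionary keys `fitness` and `hard_gates_passed` follow the notes; everything else is an assumption, including the back-fill values (1.0 for passed gates, 0.0 otherwise) and the rule that a branch must pass its hard gates to be promoted.

```python
def fitness_of(result):
    """Return a float fitness, back-filling from hard_gates_passed if absent."""
    if result.get("fitness") is not None:
        return float(result["fitness"])
    # Assumed back-fill for older evaluators that only report the gate flag.
    return 1.0 if result.get("hard_gates_passed") else 0.0


def select_survivor(results):
    """Split k branch results into (winner, losers).

    The winner is the highest-fitness branch among those passing the hard
    gates; if none pass, the winner is None and every branch is a loser.
    """
    passing = [r for r in results if r.get("hard_gates_passed")]
    if not passing:
        return None, list(results)
    winner = max(passing, key=fitness_of)
    losers = [r for r in results if r is not winner]
    return winner, losers
```

In the kernel's terms, the winner would be promoted to evolution/accepted and the losers recorded under ledger/failed/.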

🎯 Phase 2 — Goal evaluator + Strategist (#13)

Goal evaluator — after every accepted round, an external role decides whether the mission has been won; true → CLI exits 0.
Strategist — every N rounds, an external role injects { stage, next_milestone, taboo_directions } into the planner's input.
Both default to disabled; existing configs are unchanged.

🔁 Phase 1 — LLM loop + history + cost guard (PR #4, retroactively v0.2)

Multi-round LLM loop with history injection — planner sees prior rounds' reflections.
Budget guards: max_total_usd, max_total_tokens.
Anthropic + OpenAI planner/evaluator support; Aider + Claude Code executor support.

🛡️ Phase 0 — MVP closed loop (PR #2, retroactively v0.1)

Observer → planner → executor → evaluator → ledger.
Git worktree sandbox; every change reversible.
mutation_scope.allowed_paths enforcement.
Iteration / consecutive-failure hard stops.

What works today

Feature	Status
Multi-round LLM loop with memory	✅
Budget guards (`max_total_usd`, `max_total_tokens`)	✅
Iteration / consecutive-failure hard stops	✅
Full ledger audit trail	✅
Git worktree sandbox	✅
Scope enforcement	✅
Config-driven LLM provider / model / coding agent	✅
Aider and Claude Code executor support	✅
Anthropic and OpenAI planner/evaluator support	✅
Goal evaluator — stops when mission is won	✅
k-branch parallel exploration (FunSearch / AlphaEvolve)	✅
Process sandbox (firejail / bwrap)	🔧 next

Tests

67 passed — CI green on Python 3.10 and 3.12.

Install

pip install evolution-kernel==0.3.0

(or clone the repo — single dependency: PyYAML.)
