Merged
29 changes: 16 additions & 13 deletions README.md
@@ -294,9 +294,9 @@ sandbox:
   backend: firejail

 roles:
-  planner: ["python3", "roles/planner.py"]
-  executor: ["bash", "roles/executor.sh"]
-  evaluator: ["python3", "roles/evaluator.py"]
+  planner: ["python3", "bundled:planner.py"]
+  executor: ["bash", "bundled:executor.sh"]
+  evaluator: ["python3", "bundled:evaluator.py"]
 EOF

 # 3. Run overnight
@@ -367,9 +367,12 @@ sandbox:
   extra_args: []               # appended verbatim before `--`

 roles:
-  planner: ["python3", "roles/planner.py"]
-  executor: ["bash", "roles/executor.sh"]
-  evaluator: ["python3", "roles/evaluator.py"]
+  # `bundled:` resolves to evolution_kernel/roles/<filename> inside the installed
+  # wheel — works for both `pip install evolution-kernel` and git-clone setups.
+  # Or replace any entry with your own argv (`["python3", "myplanner.py"]`).
+  planner: ["python3", "bundled:planner.py"]
+  executor: ["bash", "bundled:executor.sh"]
+  evaluator: ["python3", "bundled:evaluator.py"]
 ```

 **Switch to OpenAI:**
@@ -443,7 +446,7 @@ Each role is an executable that receives:
   --worktree <path>   path to the isolated git sandbox checkout
 ```

-`roles/planner.py`, `roles/executor.sh`, and `roles/evaluator.py` are the reference implementation. Copy, modify, or replace them entirely — with a shell script, a Docker call, or anything that reads `--input` and writes `--output`.
+The bundled `evolution_kernel/roles/planner.py`, `executor.sh`, and `evaluator.py` are the reference implementation, shipped inside the wheel — reference them in your `evolution.yml` as `bundled:<name>`. Copy, modify, or replace them entirely — with a shell script, a Docker call, or anything that reads `--input` and writes `--output`.

 ---

@@ -462,12 +465,12 @@ Being honest about where v1.0 is *not* yet.
 ## Project layout

 ```
-evolution_kernel/        ~1,900-line runtime (Governor · Observer · HardStops · Sandbox · Config · CLI · Scope)
-roles/                   reference planner, executor, evaluator, goal_evaluator, strategist
-examples/                demo target + sandbox demo + working evolution.yml
-docs/                    protocol spec + first-task spec
-tests/                   99 unit + acceptance tests · 14 fixture role scripts
-evidence/                checked-in artifacts of runs anyone can reproduce
+evolution_kernel/        ~1,900-line runtime (Governor · Observer · HardStops · Sandbox · Config · CLI · Scope)
+evolution_kernel/roles/  reference planner, executor, evaluator, goal_evaluator, strategist — bundled in wheel, addressable as `bundled:<name>`
+examples/                demo target + sandbox demo + working evolution.yml
+docs/                    protocol spec + first-task spec
+tests/                   99 unit + acceptance tests · 14 fixture role scripts
+evidence/                checked-in artifacts of runs anyone can reproduce
 ```

 ---
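The README change above says any role can be replaced by "anything that reads `--input` and writes `--output`". A minimal sketch of such a replacement follows. It assumes the input and output files are JSON, which this diff does not confirm; `run_role` and the result keys are hypothetical names, and the real schema lives in the project's protocol spec.

```python
import argparse
import json
import tempfile
from pathlib import Path


def run_role(argv: list[str]) -> dict:
    """Hypothetical custom role: read --input, write --output."""
    parser = argparse.ArgumentParser(description="hypothetical custom role")
    parser.add_argument("--input", required=True)
    parser.add_argument("--output", required=True)
    parser.add_argument("--worktree", default=".")
    # Tolerate any other flags the kernel passes; the full list is not shown in this diff.
    args, _unknown = parser.parse_known_args(argv)

    # Assumption: the input file is JSON. The real schema is defined elsewhere.
    payload = json.loads(Path(args.input).read_text(encoding="utf-8"))
    result = {"received_keys": sorted(payload), "worktree": args.worktree}
    Path(args.output).write_text(json.dumps(result), encoding="utf-8")
    return result


# Demo with throwaway files standing in for what the kernel would provide.
with tempfile.TemporaryDirectory() as tmp:
    in_path = Path(tmp) / "in.json"
    out_path = Path(tmp) / "out.json"
    in_path.write_text(json.dumps({"mission": "test"}), encoding="utf-8")
    demo_result = run_role(
        ["--input", str(in_path), "--output", str(out_path), "--worktree", tmp, "--extra", "x"]
    )
    demo_written = json.loads(out_path.read_text(encoding="utf-8"))
```

A script like this would be wired in through the config as a plain argv entry, for example `planner: ["python3", "myplanner.py"]`, with no `bundled:` prefix.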
29 changes: 16 additions & 13 deletions README.zh.md
@@ -290,9 +290,9 @@ sandbox:
   backend: firejail

 roles:
-  planner: ["python3", "roles/planner.py"]
-  executor: ["bash", "roles/executor.sh"]
-  evaluator: ["python3", "roles/evaluator.py"]
+  planner: ["python3", "bundled:planner.py"]
+  executor: ["bash", "bundled:executor.sh"]
+  evaluator: ["python3", "bundled:evaluator.py"]
 EOF

 # 3. Run overnight and leave it unattended
@@ -360,9 +360,12 @@ sandbox:
   extra_args: []               # extra arguments appended to the firejail command (before `--`)

 roles:
-  planner: ["python3", "roles/planner.py"]
-  executor: ["bash", "roles/executor.sh"]
-  evaluator: ["python3", "roles/evaluator.py"]
+  # `bundled:` resolves automatically to evolution_kernel/roles/<filename> inside the installed wheel;
+  # it works out of the box for both `pip install evolution-kernel` and git-clone installs.
+  # You can also replace any entry with your own argv (e.g. `["python3", "myplanner.py"]`).
+  planner: ["python3", "bundled:planner.py"]
+  executor: ["bash", "bundled:executor.sh"]
+  evaluator: ["python3", "bundled:evaluator.py"]
 ```

 **Switch to OpenAI:**
@@ -436,7 +439,7 @@ python3 -m pytest tests/ -v
   --worktree <path>   path to the isolated git sandbox checkout
 ```

-`roles/planner.py`, `roles/executor.sh`, and `roles/evaluator.py` are the reference implementation. Copy, modify, or replace them entirely: a shell script, a Docker call, or anything that reads `--input` and writes `--output` will do.
+The bundled `evolution_kernel/roles/planner.py`, `executor.sh`, and `evaluator.py` are the reference implementation and ship with the wheel; reference them in `evolution.yml` as `bundled:<name>`. Copy, modify, or replace them entirely: a shell script, a Docker call, or anything that reads `--input` and writes `--output` will do.

 ---

@@ -455,12 +458,12 @@ python3 -m pytest tests/ -v
 ## Project layout

 ```
-evolution_kernel/        ~1,900-line runtime (Governor · Observer · HardStops · Sandbox · Config · CLI · Scope)
-roles/                   reference planner / executor / evaluator / goal evaluator / strategist
-examples/                demo target + sandbox demo + ready-to-run evolution.yml
-docs/                    protocol spec + first evolution task spec
-tests/                   99 unit + acceptance tests · 14 fixture role scripts
-evidence/                checked-in artifacts of reproducible runs
+evolution_kernel/        ~1,900-line runtime (Governor · Observer · HardStops · Sandbox · Config · CLI · Scope)
+evolution_kernel/roles/  reference planner / executor / evaluator / goal evaluator / strategist; shipped with the wheel, referenced as `bundled:<name>`
+examples/                demo target + sandbox demo + ready-to-run evolution.yml
+docs/                    protocol spec + first evolution task spec
+tests/                   99 unit + acceptance tests · 14 fixture role scripts
+evidence/                checked-in artifacts of reproducible runs
 ```

 ---
40 changes: 40 additions & 0 deletions evolution_kernel/_bundled.py
@@ -0,0 +1,40 @@
+"""Resolve `bundled:<filename>` argv entries to absolute paths inside the wheel.
+
+The reference roles ship inside the installed package at
+`evolution_kernel/roles/`. Templates and example configs reference them
+through the `bundled:` prefix so the same config works for both
+`pip install evolution-kernel` users and `git clone` developers.
+"""
+from __future__ import annotations
+
+from importlib.resources import as_file, files
+from pathlib import Path
+
+BUNDLED_PREFIX = "bundled:"
+
+
+def resolve_bundled(arg: str) -> str:
+    """If `arg` starts with `bundled:`, return the absolute path to the
+    matching file inside `evolution_kernel/roles/`. Otherwise return `arg`
+    unchanged.
+
+    Raises FileNotFoundError when the prefix is used but the file does
+    not exist in the bundle (clearer signal than a downstream
+    `subprocess` error).
+    """
+    if not arg.startswith(BUNDLED_PREFIX):
+        return arg
+    name = arg[len(BUNDLED_PREFIX):]
+    if not name or "/" in name or "\\" in name:
+        raise ValueError(
+            f"bundled: prefix takes a bare filename, got {arg!r}"
+        )
+    resource = files("evolution_kernel").joinpath("roles", name)
+    with as_file(resource) as path:
+        resolved = Path(path)
+        if not resolved.exists():
+            raise FileNotFoundError(
+                f"bundled role {name!r} not found in evolution_kernel.roles "
+                f"(looked at {resolved})"
+            )
+        return str(resolved.resolve())
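To see the prefix-resolution technique in isolation, here is a minimal sketch that applies the same steps to a standard-library package, so it runs without `evolution_kernel` installed. `resolve_prefixed` and its `package` and `subdir` parameters are hypothetical and used only for illustration; the stdlib `email/mime` directory stands in for `evolution_kernel/roles`.

```python
from importlib.resources import as_file, files
from pathlib import Path

PREFIX = "bundled:"


def resolve_prefixed(arg: str, package: str, subdir: str) -> str:
    """Hypothetical generalisation: map `bundled:<name>` to a file in package/subdir."""
    if not arg.startswith(PREFIX):
        return arg  # ordinary argv entries pass through untouched
    name = arg[len(PREFIX):]
    # Only bare filenames are accepted, so the prefix cannot escape the directory.
    if not name or "/" in name or "\\" in name:
        raise ValueError(f"expected a bare filename, got {arg!r}")
    resource = files(package) / subdir / name
    with as_file(resource) as path:
        resolved = Path(path)
        if not resolved.exists():
            raise FileNotFoundError(f"{name!r} not found (looked at {resolved})")
        return str(resolved.resolve())


# `email/mime/text.py` ships with CPython and stands in for a bundled role.
resolved_argv = [resolve_prefixed(a, "email", "mime") for a in ["python3", "bundled:text.py"]]
```

One caveat worth keeping in mind: `as_file` only guarantees the path while its context manager is open. For a package installed as ordinary files the path stays valid afterwards, but for a package imported from a zip archive the resource is extracted to a temporary file that is removed on exit, so a path returned from inside the `with` block would not outlive it.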
19 changes: 13 additions & 6 deletions evolution_kernel/config.py
@@ -295,17 +295,24 @@ def _parse_roles(value: Any) -> Roles:
     if not value:
         return Roles()

+    from ._bundled import resolve_bundled
+
     def _argv(label: str) -> tuple[str, ...]:
         v = value.get(label)
         if v is None:
             return ()
         if isinstance(v, str):
-            return (v,)
-        if isinstance(v, list) and all(isinstance(x, str) and x.strip() for x in v):
-            return tuple(x.strip() for x in v)
-        raise ConfigError(
-            f"`roles.{label}` must be a string or a list of non-empty strings"
-        )
+            items = (v,)
+        elif isinstance(v, list) and all(isinstance(x, str) and x.strip() for x in v):
+            items = tuple(x.strip() for x in v)
+        else:
+            raise ConfigError(
+                f"`roles.{label}` must be a string or a list of non-empty strings"
+            )
+        try:
+            return tuple(resolve_bundled(x) for x in items)
+        except (FileNotFoundError, ValueError) as e:
+            raise ConfigError(f"`roles.{label}`: {e}") from e

     return Roles(
         planner=_argv("planner"),
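The `_argv` change above normalises each role entry to a tuple, runs every element through the resolver, and converts resolver failures into a `ConfigError` that names the offending key. The sketch below reproduces that flow in a self-contained form. `ConfigError`, `parse_argv`, and `fake_resolve` here are hypothetical stand-ins, not the project's own objects, and the `/opt/roles` path is invented for the demo.

```python
from typing import Any, Callable


class ConfigError(Exception):
    """Hypothetical stand-in for the project's ConfigError."""


def parse_argv(label: str, value: dict[str, Any], resolve: Callable[[str], str]) -> tuple[str, ...]:
    v = value.get(label)
    if v is None:
        return ()
    if isinstance(v, str):
        items: tuple[str, ...] = (v,)
    elif isinstance(v, list) and all(isinstance(x, str) and x.strip() for x in v):
        items = tuple(x.strip() for x in v)
    else:
        raise ConfigError(f"`roles.{label}` must be a string or a list of non-empty strings")
    try:
        return tuple(resolve(x) for x in items)
    except (FileNotFoundError, ValueError) as e:
        # Re-raise with the config key so the user sees where the bad entry lives.
        raise ConfigError(f"`roles.{label}`: {e}") from e


def fake_resolve(arg: str) -> str:
    """Stub resolver for the demo: knows exactly one bundled file."""
    if not arg.startswith("bundled:"):
        return arg
    name = arg[len("bundled:"):]
    if name != "planner.py":
        raise FileNotFoundError(f"bundled role {name!r} not found")
    return "/opt/roles/" + name  # invented path, for illustration only


roles = {"planner": ["python3", " bundled:planner.py "], "executor": "run.sh"}
planner_argv = parse_argv("planner", roles, fake_resolve)
executor_argv = parse_argv("executor", roles, fake_resolve)
missing_argv = parse_argv("evaluator", roles, fake_resolve)
```

Wrapping the resolver's exceptions at this layer means a typo such as `bundled:nope.py` fails at config-load time with the role name attached, rather than later when the subprocess is launched.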
File renamed without changes.
File renamed without changes.
@@ -5,7 +5,7 @@
 is complete, and writes goal_evaluation.json.

 LLM provider/model are read from config.json in the same run directory (same
-pattern as roles/planner.py).
+pattern as evolution_kernel/roles/planner.py).
 """
 from __future__ import annotations

File renamed without changes.
@@ -5,7 +5,7 @@
 (current stage, next milestone, taboo directions), and writes strategy.json.

 LLM provider/model are read from config.json in the same run directory (same
-pattern as roles/planner.py).
+pattern as evolution_kernel/roles/planner.py).
 """
 from __future__ import annotations

6 changes: 3 additions & 3 deletions evolution_kernel/templates/benchmark.yml
@@ -34,6 +34,6 @@ parallel:
   k_branches: 3   # explore 3 candidates per round; best fitness wins

 roles:
-  planner: ["python3", "roles/planner.py"]
-  executor: ["bash", "roles/executor.sh"]
-  evaluator: ["python3", "roles/evaluator.py"]
+  planner: ["python3", "bundled:planner.py"]
+  executor: ["bash", "bundled:executor.sh"]
+  evaluator: ["python3", "bundled:evaluator.py"]
6 changes: 3 additions & 3 deletions evolution_kernel/templates/coverage.yml
@@ -30,6 +30,6 @@ hard_stops:
   max_total_tokens: 1000000

 roles:
-  planner: ["python3", "roles/planner.py"]
-  executor: ["bash", "roles/executor.sh"]
-  evaluator: ["python3", "roles/evaluator.py"]
+  planner: ["python3", "bundled:planner.py"]
+  executor: ["bash", "bundled:executor.sh"]
+  evaluator: ["python3", "bundled:evaluator.py"]
6 changes: 3 additions & 3 deletions evolution_kernel/templates/custom.yml
@@ -31,6 +31,6 @@ hard_stops:
   max_total_tokens: 500000

 roles:
-  planner: ["python3", "roles/planner.py"]
-  executor: ["bash", "roles/executor.sh"]
-  evaluator: ["python3", "roles/evaluator.py"]
+  planner: ["python3", "bundled:planner.py"]
+  executor: ["bash", "bundled:executor.sh"]
+  evaluator: ["python3", "bundled:evaluator.py"]
6 changes: 3 additions & 3 deletions evolution_kernel/templates/lint.yml
@@ -30,6 +30,6 @@ hard_stops:
   max_total_tokens: 500000

 roles:
-  planner: ["python3", "roles/planner.py"]
-  executor: ["bash", "roles/executor.sh"]
-  evaluator: ["python3", "roles/evaluator.py"]
+  planner: ["python3", "bundled:planner.py"]
+  executor: ["bash", "bundled:executor.sh"]
+  evaluator: ["python3", "bundled:evaluator.py"]
6 changes: 3 additions & 3 deletions evolution_kernel/templates/perf.yml
@@ -30,6 +30,6 @@ hard_stops:
   max_total_tokens: 1500000

 roles:
-  planner: ["python3", "roles/planner.py"]
-  executor: ["bash", "roles/executor.sh"]
-  evaluator: ["python3", "roles/evaluator.py"]
+  planner: ["python3", "bundled:planner.py"]
+  executor: ["bash", "bundled:executor.sh"]
+  evaluator: ["python3", "bundled:evaluator.py"]
8 changes: 4 additions & 4 deletions examples/evolution.yml
@@ -6,7 +6,7 @@ llm:
   model: claude-sonnet-4-6
   api_key_env: ANTHROPIC_API_KEY   # name of the env var holding the key

-# Coding agent used by roles/executor.sh
+# Coding agent used by bundled:executor.sh
 coding_agent:
   tool: aider   # aider | claude-code

@@ -31,6 +31,6 @@ hard_stops:
   max_total_tokens: 500000   # stop if total tokens reaches 500k

 roles:
-  planner: ["python3", "roles/planner.py"]
-  executor: ["bash", "roles/executor.sh"]
-  evaluator: ["python3", "roles/evaluator.py"]
+  planner: ["python3", "bundled:planner.py"]
+  executor: ["bash", "bundled:executor.sh"]
+  evaluator: ["python3", "bundled:evaluator.py"]
2 changes: 1 addition & 1 deletion examples/oss_fix_demo/bots/executor.py
@@ -1,6 +1,6 @@
 """OSS-fix-demo executor: invokes `claude -p` inside the worktree.

-The kernel-bundled `roles/executor.sh` claude-code path drops permission
+The kernel-bundled `evolution_kernel/roles/executor.sh` claude-code path drops permission
 flags, so claude refuses to make edits in non-interactive mode. This
 wrapper sets `--permission-mode acceptEdits` so the agent actually edits
 files. The cost is whatever your Claude Pro / Max subscription already
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "evolution-kernel"
-version = "1.1.1"
+version = "1.1.2"
 description = "A minimal autonomous evolution kernel with isolated planner, executor, evaluator roles."
 readme = "README.md"
 requires-python = ">=3.10"
@@ -52,5 +52,5 @@ evolution-kernel = "evolution_kernel.cli:main"
 packages = ["evolution_kernel"]

 [tool.setuptools.package-data]
-evolution_kernel = ["templates/*.yml"]
+evolution_kernel = ["templates/*.yml", "roles/*"]
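The `package-data` change is what makes the role scripts land in the built wheel. Since a wheel is a zip archive, one way to confirm it is to list the archive members under the expected prefix. The sketch below does that against an in-memory archive built for the demo; `bundled_files` is a hypothetical helper, and for a real check you would pass the path of the `.whl` produced by your build instead.

```python
import io
import zipfile


def bundled_files(wheel, prefix: str) -> list[str]:
    """Return archive members under `prefix`, skipping directory entries."""
    with zipfile.ZipFile(wheel) as zf:
        return sorted(
            n for n in zf.namelist() if n.startswith(prefix) and not n.endswith("/")
        )


# Demo: a tiny in-memory archive standing in for a built wheel.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("evolution_kernel/__init__.py", "")
    zf.writestr("evolution_kernel/roles/planner.py", "")
    zf.writestr("evolution_kernel/roles/executor.sh", "")
    zf.writestr("evolution_kernel/templates/custom.yml", "")
buf.seek(0)

found = bundled_files(buf, "evolution_kernel/roles/")
```

An empty result for `evolution_kernel/roles/` on a real wheel would suggest the `roles/*` glob is not being picked up, in which case `bundled:` entries would fail at config-load time.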

94 changes: 94 additions & 0 deletions tests/test_bundled_roles.py
@@ -0,0 +1,94 @@
+"""Tests for the `bundled:` prefix in `roles.*` argv entries."""
+from __future__ import annotations
+
+import tempfile
+import unittest
+from pathlib import Path
+
+from evolution_kernel._bundled import BUNDLED_PREFIX, resolve_bundled
+from evolution_kernel.config import ConfigError, load_config
+
+
+SAMPLE_CONFIG = """\
+mission: "test"
+llm:
+  provider: anthropic
+  model: claude-sonnet-4-6
+  api_key_env: ANTHROPIC_API_KEY
+coding_agent:
+  tool: aider
+evidence_sources:
+  - type: file
+    path: "metrics.json"
+mutation_scope:
+  allowed_paths:
+    - "src/"
+roles:
+  planner: ["python3", "bundled:planner.py"]
+  executor: ["bash", "bundled:executor.sh"]
+  evaluator: ["python3", "bundled:evaluator.py"]
+"""
+
+
+class BundledPrefixTest(unittest.TestCase):
+    def test_resolve_bundled_returns_absolute_path_to_existing_file(self):
+        path = resolve_bundled("bundled:executor.sh")
+        resolved = Path(path)
+        self.assertTrue(resolved.is_absolute(), f"expected absolute path, got {path!r}")
+        self.assertTrue(resolved.exists(), f"resolved path does not exist: {path!r}")
+        self.assertEqual(resolved.name, "executor.sh")
+        self.assertEqual(resolved.parent.name, "roles")
+
+    def test_resolve_bundled_is_noop_for_non_prefixed_strings(self):
+        self.assertEqual(resolve_bundled("python3"), "python3")
+        self.assertEqual(resolve_bundled("/abs/path/to/exec"), "/abs/path/to/exec")
+        self.assertEqual(resolve_bundled("./relative/path.py"), "./relative/path.py")
+
+    def test_resolve_bundled_rejects_path_separators(self):
+        with self.assertRaises(ValueError):
+            resolve_bundled("bundled:../escape.py")
+        with self.assertRaises(ValueError):
+            resolve_bundled("bundled:subdir/file.py")
+        with self.assertRaises(ValueError):
+            resolve_bundled(BUNDLED_PREFIX)  # empty name
+
+    def test_resolve_bundled_raises_for_missing_files(self):
+        with self.assertRaises(FileNotFoundError):
+            resolve_bundled("bundled:does-not-exist.xyz")
+
+    def test_load_config_resolves_bundled_in_roles(self):
+        with tempfile.TemporaryDirectory() as tmpdir:
+            cfg_path = Path(tmpdir) / "evolution.yml"
+            cfg_path.write_text(SAMPLE_CONFIG, encoding="utf-8")
+            cfg = load_config(str(cfg_path))
+
+        # `bundled:planner.py` should have become an absolute path to an
+        # existing file inside the installed package.
+        planner_script = Path(cfg.roles.planner[1])
+        executor_script = Path(cfg.roles.executor[1])
+        evaluator_script = Path(cfg.roles.evaluator[1])
+
+        for script in (planner_script, executor_script, evaluator_script):
+            self.assertTrue(script.is_absolute(), f"not absolute: {script}")
+            self.assertTrue(script.exists(), f"missing: {script}")
+            self.assertEqual(script.parent.name, "roles")
+
+        # Surrounding argv entries (python3, bash) must be untouched.
+        self.assertEqual(cfg.roles.planner[0], "python3")
+        self.assertEqual(cfg.roles.executor[0], "bash")
+        self.assertEqual(cfg.roles.evaluator[0], "python3")
+
+    def test_load_config_surfaces_missing_bundled_role_as_config_error(self):
+        bad_cfg = SAMPLE_CONFIG.replace(
+            'bundled:planner.py', 'bundled:nope.py'
+        )
+        with tempfile.TemporaryDirectory() as tmpdir:
+            cfg_path = Path(tmpdir) / "evolution.yml"
+            cfg_path.write_text(bad_cfg, encoding="utf-8")
+            with self.assertRaises(ConfigError) as ctx:
+                load_config(str(cfg_path))
+            self.assertIn("nope.py", str(ctx.exception))
+
+
+if __name__ == "__main__":  # pragma: no cover
+    unittest.main()