Merged
32 changes: 16 additions & 16 deletions definitions/create-expert/perstack.toml
@@ -41,7 +41,7 @@

[experts."create-expert"]
defaultModelTier = "high"
version = "1.0.23"
version = "1.0.24"
description = "Creates and modifies Perstack expert definitions in perstack.toml"
instruction = """
You create and modify Perstack expert definitions. perstack.toml is the single deliverable.
@@ -73,9 +73,7 @@ If must signal has not passed after 3 iterations, report what passed, what faile
"""
delegates = [
"@create-expert/write",
"@create-expert/review",
"@create-expert/verify",
"@create-expert/test",
]

[experts."create-expert".skills."@perstack/base"]
@@ -91,7 +89,7 @@ pick = ["readTextFile", "exec", "attemptCompletion"]

[experts."@create-expert/write"]
defaultModelTier = "high"
version = "1.0.23"
version = "1.0.24"
description = """
Produces perstack.toml from the user's request. The file includes an embedded test spec in the header comments.
Provide: (1) the user's request, (2) optionally path to existing perstack.toml, (3) optionally verification failure feedback.
@@ -207,7 +205,7 @@ pick = [

[experts."@create-expert/review"]
defaultModelTier = "low"
version = "1.0.23"
version = "1.0.24"
description = """
Reviews perstack.toml for instruction quality and signal design.
Provide: (1) path to perstack.toml, (2) the user's original request.
@@ -255,7 +253,7 @@ pick = ["readTextFile", "todo", "attemptCompletion"]

[experts."@create-expert/verify"]
defaultModelTier = "low"
version = "1.0.23"
version = "1.0.24"
description = """
Runs the test query via @create-expert/test, then executes hard signal checks from the perstack.toml header.
Provide: (1) path to perstack.toml, (2) the coordinator expert name to test.
@@ -266,11 +264,11 @@ You run the test and verify the results. Two phases:

## Phase 1: Run Test

Read the test spec from the perstack.toml header comments to extract the test query. Delegate to @create-expert/test with: perstack.toml path, the test query, and the coordinator expert name.
Read the test spec from the perstack.toml header comments to extract the test query. Delegate to @create-expert/test with: perstack.toml path, the test query, and the coordinator expert name. Note the work directory path returned by test.

## Phase 2: Execute Hard Signals

After test completes, execute the verification signals from the perstack.toml header.
After test completes, execute the verification signals from the perstack.toml header. Run signal commands in the work directory reported by test.

You do NOT read produced artifacts. You do NOT review content, quality, or style. Your only inputs are command outputs and their expected results.

@@ -291,11 +289,11 @@ Re-run the must signal. Compare with first result.
## Verdicts

- **PASS** — must signal passes and reproduces. Should signal results reported with counts vs threshold.
- **CONTINUE** — must signal failed or did not reproduce. Include: command, expected, actual, fix needed.
- **CONTINUE** — must signal failed or did not reproduce. Include: command, expected, actual, and a fix recommendation **for perstack.toml** (not for the produced artifacts). The deliverable being iterated is the expert definition, not the test output.

Should signal failures beyond threshold are reported as known limitations but do NOT cause CONTINUE — only the must signal blocks.

attemptCompletion with: verdict, must signal result, should signal results, reproducibility result, and (if CONTINUE) fix feedback.
attemptCompletion with: verdict, must signal result, should signal results, reproducibility result, and (if CONTINUE) fix feedback targeting perstack.toml.
"""
delegates = ["@create-expert/test"]
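
Note on the verify change above: the new wording asks for signal commands to run in the work directory reported by test, with the must signal re-run for reproducibility. A minimal Python sketch of that control flow follows. It is not how Perstack executes signals; the `run_signal` and `verdict` helpers, the `MUST` command, and the "compare stripped stdout to an expected string" convention are all assumptions made for illustration, since the header test-spec format is not shown in this diff.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_signal(command, work_dir):
    """Run one signal command inside work_dir and return its stripped stdout."""
    result = subprocess.run(command, cwd=work_dir, capture_output=True, text=True)
    return result.stdout.strip()


def verdict(must_command, expected, work_dir):
    """Return PASS only if the must signal matches and the re-run reproduces it."""
    first = run_signal(must_command, work_dir)
    second = run_signal(must_command, work_dir)  # reproducibility re-run
    if first == expected and second == first:
        return "PASS"
    return "CONTINUE"


# Hypothetical must signal: does out.txt exist in the work directory?
MUST = [sys.executable, "-c", "import os; print(os.path.exists('out.txt'))"]

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work_dir:
        # Stand-in for the artifact the tested expert would produce.
        Path(work_dir, "out.txt").write_text("done")
        print(verdict(MUST, "True", work_dir))
```

Passing `cwd=work_dir` is the point of the sketch: a relative path such as `out.txt` resolves against the test run's directory rather than the directory that holds perstack.toml. Should signals are left out here because, per the instruction text, they are reported but never block.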

@@ -312,7 +310,7 @@ pick = ["readTextFile", "exec", "todo", "attemptCompletion"]

[experts."@create-expert/test"]
defaultModelTier = "low"
version = "1.0.23"
version = "1.0.24"
description = """
Executes a test query against a Perstack expert and reports what happened.
Provide: (1) path to perstack.toml, (2) the test query, (3) the coordinator expert name.
@@ -323,15 +321,17 @@ Run a test query against an expert and report exactly what happened. Do NOT eval

You can ONLY delegate to coordinators (plain names like "game-dev"), NOT to delegates (names starting with @).

1. Read perstack.toml to understand the expert structure
2. Use addDelegateFromConfig to add the coordinator as a delegate
3. Call the coordinator with the test query
4. removeDelegate to unload the expert
1. Create a dedicated work directory for this test run (e.g., test-run-1)
2. Read perstack.toml to understand the expert structure
3. Use addDelegateFromConfig to add the coordinator as a delegate
4. Call the coordinator with the test query
5. removeDelegate to unload the expert

Do NOT delete or modify perstack.toml. Report facts only.
NEVER delete or modify perstack.toml. By running the expert in a separate work directory, perstack.toml in the parent directory is naturally isolated from the expert's file operations.

attemptCompletion with:
- **Query**: the test query executed
- **Work directory**: the path where the expert produced its output
- **Produced**: files created/modified, outputs returned, actions taken
- **Errors**: any failures (if none, state "none")
"""
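
Note on the test change above: the isolation argument rests on the expert writing only inside its own work directory. The Python sketch below shows one way to picture that, under stated assumptions: `make_work_dir`, `run_isolated`, and the `action` callback are hypothetical names, the callback stands in for the coordinator call, and the SHA-256 comparison of perstack.toml is an extra check added for illustration, not something the instruction text requires.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def make_work_dir(project_dir, run_number):
    """Create a dedicated directory such as test-run-1 under project_dir."""
    work_dir = Path(project_dir) / f"test-run-{run_number}"
    work_dir.mkdir(parents=True, exist_ok=False)  # fail rather than reuse a run
    return work_dir


def run_isolated(project_dir, run_number, action):
    """Run action(work_dir) and report facts only, including config integrity."""
    config = Path(project_dir) / "perstack.toml"
    before = file_digest(config)
    work_dir = make_work_dir(project_dir, run_number)
    action(work_dir)  # stand-in for calling the coordinator with the test query
    produced = sorted(
        p.relative_to(work_dir).as_posix()
        for p in work_dir.rglob("*")
        if p.is_file()
    )
    return {
        "work_directory": str(work_dir),
        "produced": produced,
        "config_unchanged": file_digest(config) == before,
    }
```

The returned dictionary mirrors the **Work directory** and **Produced** fields of the completion report; `config_unchanged` would flag the case where a run did touch perstack.toml despite the separate directory.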