docs(sdk): add normative invariants to architecture docs#274
docs(sdk): add normative invariants to architecture docs#274
Conversation
…815\n\nCo-authored-by: openhands <openhands@all-hands.dev>
…Co-authored-by: openhands <openhands@all-hands.dev>
…: openhands <openhands@all-hands.dev>
…\n\nCo-authored-by: openhands <openhands@all-hands.dev>
…d-by: openhands <openhands@all-hands.dev>
|
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment, like Feel free to include any additional details that might help me get this PR into a better state. You can manage your notification settings |
|
@OpenHands The trouble with this PR is that it displays our OCL constraints in the user-facing docs, but they are meant for AI agents, not for humans. I think we have two options:
WDYT? Answer me directly on github, in a comment you post to this PR. Note that it will be rendered as markdown. |
|
I'm on it! enyst can track my progress at all-hands.dev |
|
I did not make any repository changes in this interaction. I answered the PR comment question by recommending how to handle the OCL constraints so they’re available to AI agents but not shown in user-facing docs:
No git diff to review, no commits, and nothing to push. |
|
@OpenHands Merge main and fix conflicts. Then read all comments to this PR, and implement Option 1: keep the natural-language invariants visible, but move the OCL blocks into HTML comments (non-rendered) with clear sentinels like ... so agents can reliably extract them. |
|
I'm on it! enyst can track my progress at all-hands.dev |
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
|
Summary of new changes since my last update:
Checklist:
Note: I did not re-check |
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
|
I moved these invariants in The docs pages are for humans, but this information is just a restatement in more formal language intended for agents, not for humans. @xingyaoww WDYT about this solution? |
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Acceptable - Clean pattern, but needs verification and refinement
Taste Rating: The sidecar file pattern is elegant and the separation of concerns is good. However, the value proposition is speculative and some content needs refinement.
Verification Status
I cloned software-agent-sdk and verified key invariants against source code:
✅ Verified Correct:
AgentBasehasfrozen=Trueinmodel_configEventhasfrozen=Trueandextra="forbid"- Conversation factory returns
RemoteConversationwhen workspace isRemoteWorkspace - Tool naming:
TerminalTool→terminal(via_camel_to_snake().removesuffix("_tool")) - Workspace factory returns
RemoteWorkspacewhenhostis provided - Multi-action batching:
_combine_action_eventsasserts emptythoughtfor all but first action
The invariants I spot-checked are accurate. Good work on verification.
Key Issues
🟠 Important: Value Proposition Unclear
The PR description says this "maybe" will help LLMs avoid drift, but provides no concrete evidence:
- No examples of drift this would prevent
- No measurement or testing against actual LLM behavior
- No comparison before/after
Recommendation: Either demonstrate measurable benefit (e.g., "this prevents X type of invalid PR") or frame this as an experiment to gather data.
🟡 Suggestion: Design Commentary vs Normative Constraints
Several invariant files include design discussions (especially workspace.ai-invariants.md pause/resume section). This is valuable context but blurs the line between "normative constraint" and "architectural commentary."
Consider:
- Creating a separate section for "Design Discussions" or "Open Questions"
- Keeping "Invariants (Normative)" strictly to verifiable contracts
- Moving design rationale to the main
.mdxfiles where humans can also benefit
🟢 Acceptable: Unrelated Formatting Changes
Some .mdx files have whitespace changes (trailing newlines, blank line cleanup) that are unrelated to the main PR goal. Not a blocker, but ideally these would be in a separate cleanup commit.
| #### Discussion: `pause()` / `resume()` semantics (design tradeoff) | ||
|
|
||
| There is an argument that this is compatible with the “swap workspaces without rewriting code” principle, because most client code should only rely on the *core* workspace and conversation operations, while optional capabilities are feature-detected or used conditionally. | ||
|
|
||
| There is a mild design smell here: the method names `pause()` / `resume()` suggest a strong guarantee (that work is actually suspended), but the SDK currently treats them as a **best-effort resource management hook**. | ||
|
|
||
| - Locally, there is often nothing meaningful the workspace can suspend at the boundary (it is operating on the host OS), so `LocalWorkspace.pause()` is a no-op. | ||
| - Some remote/container workspaces may be able to pause a container or VM, but others may not. | ||
|
|
||
| This tension matters because it creates two different reasonable expectations: | ||
|
|
||
| 1. *Ergonomic expectation*: orchestration code can call `pause()` unconditionally and it will be safe. | ||
| 2. *Guarantee expectation*: calling `pause()` actually pauses resource usage. | ||
|
|
||
| **Maybe it would make sense to** model this explicitly as an optional capability: | ||
|
|
||
| - Add `supports_pause` (or a richer `pause_capability`) to `BaseWorkspace`, and | ||
| - Make `pause()` / `resume()` no-ops everywhere by default (including remote) while letting pausable implementations override, | ||
| - Keep a strict helper (e.g., `pause_or_raise()`) for callers who require a guarantee. | ||
|
|
There was a problem hiding this comment.
🟡 Suggestion: This discussion is valuable but doesn't belong in "Invariants (Normative)".
Recommendation: Move this design smell discussion to either:
- A "Design Rationale" section in
workspace.mdx(where humans can see it) - A separate
workspace.design-discussion.mdfile - Or at minimum, retitle this section "Design Discussion: pause/resume semantics"
Normative invariants should be contracts that can be verified/enforced, not open-ended design debates.
| 1. **[Tool (Spec)](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-sdk/openhands/sdk/tool/spec.py)** - Configuration object with `name` (e.g., "BashTool") and `params` (e.g., `{"working_dir": "/workspace"}`) | ||
| 2. **Resolver Lookup** - Registry finds the registered resolver for the tool name | ||
| 3. **Factory Invocation** - Resolver calls the tool's `.create()` method with params and conversation state | ||
| 4. **Instance Creation** - Tool instance(s) are created with configured executors | ||
| 5. **Agent Usage** - Instances are added to the agent's tools_map for execution | ||
|
|
||
| **Registration Types:** | ||
|
|
||
| | Type | Registration | Resolver Behavior | | ||
| |------|-------------|-------------------| | ||
| | **Tool Instance** | `register_tool(name, instance)` | Returns the fixed instance (params not allowed) | | ||
| | **Tool Subclass** | `register_tool(name, ToolClass)` | Calls `ToolClass.create(**params, conv_state=state)` | | ||
| | **Factory Function** | `register_tool(name, factory)` | Calls `factory(**params, conv_state=state)` | | ||
|
|
||
| ### File Organization | ||
|
|
||
| Tools follow a consistent file structure for maintainability: | ||
|
|
||
| ``` | ||
| openhands-tools/openhands/tools/my_tool/ | ||
| ├── __init__.py # Export MyTool | ||
| ├── definition.py # Action, Observation, MyTool(ToolDefinition) | ||
| ├── impl.py # MyExecutor(ToolExecutor) | ||
| └── [other modules] # Tool-specific utilities | ||
| ``` | ||
|
|
||
| **File Responsibilities:** | ||
|
|
||
| | File | Contains | Purpose | | ||
| |------|----------|---------| | ||
| | `definition.py` | Action, Observation, ToolDefinition subclass | Public API, schema definitions, factory method | | ||
| | `impl.py` | ToolExecutor implementation | Business logic, state management, execution | | ||
| | `__init__.py` | Tool exports | Package interface | | ||
|
|
||
| **Benefits:** | ||
| - **Separation of Concerns** - Public API separate from implementation | ||
| - **Avoid Circular Imports** - Import `impl` only inside `create()` method | ||
| - **Consistency** - All tools follow same structure for discoverability | ||
|
|
||
| **Example Reference:** See [`terminal/`](https://github.com/OpenHands/software-agent-sdk/tree/main/openhands-tools/openhands/tools/terminal) for complete implementation |
There was a problem hiding this comment.
🟢 Nit: This entire "File Organization" section describes project conventions, not runtime invariants.
It's useful content, but consider whether it belongs in:
- The main
tool-system.mdxpage (for human readers) - A separate "SDK Conventions" doc
- Or a different section title like "## Code Organization Patterns (Informative)"
Not a blocker, just a categorization question.
xingyaoww
left a comment
There was a problem hiding this comment.
My only concern here is how do we keep this up-to-date :(
We have a lot of docs (for human) and sometimes it is already hard to make sure they are accurate and up-to-date..
Maybe we will need some CI that runs periodically that updates these?
HUMAN: I’ve been thinking for a while that maybe it will help LLMs to
I asked OH to model the first few layers of the SDK (I think it only did the first in-package layer, will come back to that if we want this) and add constraints in both natural language and, if simple, in good old OCL.
——
Summary of changes
Added "Invariants (Normative)" sections across SDK architecture docs (design, agent, conversation, events, tool system, workspace)
Included OCL-like invariants where simple and precise natural-language invariants where complex
Fixed/maintained MDX structure so tables/mermaid blocks render correctly
closes Analyze the SDK and model it software-agent-sdk#1815
@enyst can click here to continue refining the PR