Summary (from RFC)
- Problem: agent-server and CLI hardcode how an Agent is built; multi-agent patterns require manual wiring and aren't shareable.
- Proposal: introduce a Strategy protocol in openhands-sdk to build a configured Agent.
- Strategy: a Python class with setup() and build_agent(llm, params).
- Consumers: agent-server loads strategies via --extensions, CLI via --strategy; app_server can list strategies and launch them.
- Outcome: pluggable, shareable multi-agent pipelines with no added runtime overhead once a conversation has started.
RFC: Pluggable Multi-Agent Strategies for OpenHands
Status: Draft
Author: @rbren
Scope: openhands-sdk, openhands-agent-server, openhands-cli
Problem
Both agent-server and openhands-cli hardcode how an Agent gets built. Multi-agent patterns (delegation, adversarial review, phased pipelines) require manually wiring DelegateTool, register_agent(), critic configs, custom system prompts, and skills. There's no way to package and share these configurations.
Proposal
A Strategy is a Python class that takes an LLM and returns a configured Agent. One protocol, defined in openhands-sdk, usable by every consumer.
The Protocol
Lives in openhands-sdk because that's the shared layer.
# openhands/sdk/strategy.py
from abc import ABC, abstractmethod
from typing import Any

from pydantic import BaseModel

from openhands.sdk import Agent, LLM


class StrategyParam(BaseModel):
    name: str
    description: str
    type: str = "string"  # string | integer | boolean
    default: Any = None


class Strategy(ABC):
    """A factory that assembles SDK primitives into a configured Agent."""

    name: str
    description: str
    params: list[StrategyParam] = []

    def setup(self) -> None:
        """Called once on load. Register tools, agent types, etc."""

    @abstractmethod
    def build_agent(self, llm: LLM, params: dict[str, Any] | None = None) -> Agent:
        """Return a fully configured Agent for this strategy."""
That's it. A Strategy calls the same things you'd call by hand — register_agent(), DelegateTool, AgentContext, Critic, whatever — and returns an Agent.
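Because build_agent() receives params as a plain dict, a consumer will probably want to check that dict against the declared StrategyParam list before calling it. The RFC does not specify this step. The following is a minimal sketch, assuming a hypothetical helper named resolve_params and a dataclass stand-in for StrategyParam so the snippet runs without the SDK or pydantic.

```python
from dataclasses import dataclass
from typing import Any, Optional


# Stand-in for the RFC's pydantic StrategyParam (assumption: same four fields).
@dataclass
class StrategyParam:
    name: str
    description: str
    type: str = "string"  # string | integer | boolean
    default: Any = None


_PY_TYPES = {"string": str, "integer": int, "boolean": bool}


def resolve_params(declared: list[StrategyParam],
                   supplied: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical helper: fill defaults, reject unknown or mistyped values."""
    supplied = supplied or {}
    unknown = set(supplied) - {p.name for p in declared}
    if unknown:
        raise ValueError(f"Unknown params: {sorted(unknown)}")
    resolved: dict[str, Any] = {}
    for p in declared:
        value = supplied.get(p.name, p.default)
        expected = _PY_TYPES[p.type]
        # bool is a subclass of int in Python, so reject it explicitly for integers.
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool))
        if value is not None and wrong_type:
            raise TypeError(f"{p.name} must be of type {p.type}")
        resolved[p.name] = value
    return resolved


declared = [StrategyParam(name="max_turns", type="integer", default=8,
                          description="Max turns per phase")]
print(resolve_params(declared, None))              # {'max_turns': 8}
print(resolve_params(declared, {"max_turns": 3}))  # {'max_turns': 3}
```

Whether this check belongs in the SDK or in each consumer is left open here; the sketch only shows that the three declared type names are enough to validate a request body.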
How Each Consumer Uses It
agent-server (--extensions=./waterfall.py):
  startup: load module → find Strategy subclass → call setup() → register in dict
  API:     POST /api/strategies/{name}/start → strategy.build_agent(llm, params)
             → wrap in StartConversationRequest
             → existing conversation_service.start()
New files: strategy_router.py (discovery + start endpoints). Modified: __main__.py (+6 lines for --extensions arg and loader). POST /api/conversations is unchanged.
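The startup and API flow for agent-server could look roughly like the sketch below. Agent, LLM, and Strategy are simplified stand-ins so the code runs without openhands-sdk, and StrategyRegistry with its register() and build() methods is a hypothetical name, not part of the RFC. Wrapping the Agent in StartConversationRequest and calling conversation_service.start() is omitted.

```python
from typing import Any, Optional


# Minimal stand-ins so the sketch runs without openhands-sdk (assumptions).
class LLM:
    def __init__(self, model: str) -> None:
        self.model = model


class Agent:
    def __init__(self, llm: LLM, tools: Optional[list] = None) -> None:
        self.llm = llm
        self.tools = tools or []


class Strategy:
    name: str = ""
    description: str = ""

    def setup(self) -> None:
        pass

    def build_agent(self, llm: LLM,
                    params: Optional[dict[str, Any]] = None) -> Agent:
        raise NotImplementedError


class StrategyRegistry:
    """Hypothetical name -> Strategy dict, filled once at server startup."""

    def __init__(self) -> None:
        self._strategies: dict[str, Strategy] = {}

    def register(self, strategy: Strategy) -> None:
        if strategy.name in self._strategies:
            raise ValueError(f"Duplicate strategy name: {strategy.name}")
        strategy.setup()  # called once on load, as in the startup flow
        self._strategies[strategy.name] = strategy

    def build(self, name: str, llm: LLM,
              params: Optional[dict[str, Any]] = None) -> Agent:
        # What POST /api/strategies/{name}/start would do before handing the
        # Agent to the existing conversation start path.
        try:
            strategy = self._strategies[name]
        except KeyError:
            raise LookupError(f"Unknown strategy: {name}") from None
        return strategy.build_agent(llm, params)


class Echo(Strategy):
    name = "echo"
    description = "Trivial single-agent strategy"

    def build_agent(self, llm, params=None):
        return Agent(llm=llm)


registry = StrategyRegistry()
registry.register(Echo())
agent = registry.build("echo", LLM(model="example-model"))
print(type(agent).__name__, agent.llm.model)  # Agent example-model
```

Rejecting duplicate names at registration is a design choice of this sketch; the RFC does not say what should happen when two extensions declare the same name.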
openhands-cli (openhands --strategy=./waterfall.py):
  startup: load module → find Strategy subclass → call setup()
  run:     strategy.build_agent(llm_from_agent_store, params)
             → Conversation(agent=agent, workspace=...) → .run()
Modified: entrypoint.py (+1 arg), setup.py (if --strategy, use it instead of load_agent_specs()). Everything else — TUI, confirmation, visualizer — works unchanged because it already takes an Agent.
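The CLI change amounts to one new argument and one branch. The sketch below assumes argparse and uses hypothetical names (parse_args, choose_agent) with injected loader callables standing in for load_strategy() and load_agent_specs(); the real entrypoint.py and setup.py may be structured differently.

```python
import argparse
from typing import Any, Callable, Optional


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="openhands")
    # The single new argument described for entrypoint.py.
    parser.add_argument("--strategy", default=None,
                        help="Path to a .py file defining a Strategy")
    parser.add_argument("--task", default=None)
    return parser.parse_args(argv)


def choose_agent(strategy_path: Optional[str],
                 load_strategy: Callable[[str], Any],
                 load_default: Callable[[], Any],
                 llm: Any) -> Any:
    """Hypothetical branch for setup.py: use the strategy if one was given."""
    if strategy_path is None:
        return load_default()  # existing behaviour, unchanged
    strategy = load_strategy(strategy_path)
    return strategy.build_agent(llm, None)


class _FakeStrategy:
    """Stand-in for a loaded Strategy; returns a string instead of an Agent."""

    def build_agent(self, llm, params=None):
        return f"agent built by strategy with {llm}"


args = parse_args(["--strategy", "./waterfall.py", "--task", "Build a REST API"])
agent = choose_agent(args.strategy, lambda path: _FakeStrategy(),
                     lambda: "default agent", llm="llm")
print(agent)  # agent built by strategy with llm
```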
app_server (future, UI dropdown):
  GET /api/strategies → list loaded strategies with param schemas → render form
  POST → build StartConversationRequest → POST to agent-server inside sandbox
Replaces the hardcoded AgentType.PLAN / AgentType.DEFAULT switch in _create_agent_with_context() with a pluggable version.
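For the UI dropdown, the listing response has to carry each strategy's name, description, and param schema so the form can be rendered. The RFC does not define the response shape. The sketch below assumes a flat JSON list and uses dataclass stand-ins (StrategyInfo is a hypothetical name) in place of the real Strategy and StrategyParam classes.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


# Stand-in for the RFC's pydantic StrategyParam (assumption: same four fields).
@dataclass
class StrategyParam:
    name: str
    description: str
    type: str = "string"  # string | integer | boolean
    default: Any = None


@dataclass
class StrategyInfo:
    """Hypothetical stand-in carrying only the metadata a Strategy declares."""
    name: str
    description: str
    params: list[StrategyParam] = field(default_factory=list)


def list_strategies(loaded: dict[str, StrategyInfo]) -> list[dict[str, Any]]:
    """Shape a GET /api/strategies response body (the shape is an assumption)."""
    return [asdict(info) for _, info in sorted(loaded.items())]


loaded = {"waterfall": StrategyInfo(
    name="waterfall",
    description="Design -> Code -> Test pipeline",
    params=[StrategyParam(name="max_turns", type="integer", default=8,
                          description="Max turns per phase")])}
print(json.dumps(list_strategies(loaded), indent=2))
```

With the real pydantic model, model_dump() on each StrategyParam would play the role that asdict() plays here.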
What an Extension Looks Like
# waterfall.py
from openhands.sdk.strategy import Strategy, StrategyParam
from openhands.sdk import Agent, AgentContext, LLM
from openhands.sdk.context import Skill
from openhands.sdk.tool import Tool
from openhands.tools.delegate import DelegateTool, register_agent
from openhands.tools.preset.default import get_default_tools


def _designer(llm: LLM) -> Agent:
    return Agent(llm=llm, tools=[], agent_context=AgentContext(
        system_message_suffix="You are a system architect. Output a design doc."))


def _coder(llm: LLM) -> Agent:
    return Agent(llm=llm, tools=get_default_tools())


class Waterfall(Strategy):
    name = "waterfall"
    description = "Design → Code → Test pipeline"
    params = [StrategyParam(name="max_turns", type="integer", default=8,
                            description="Max turns per phase")]

    def setup(self):
        register_agent("designer", _designer, "System architect")
        register_agent("coder", _coder, "Implementation engineer")

    def build_agent(self, llm, params=None):
        max_turns = (params or {}).get("max_turns", 8)
        return Agent(
            llm=llm,
            tools=[Tool(name="DelegateTool"), *get_default_tools()],
            agent_context=AgentContext(system_message_suffix=f"""
Orchestrate a waterfall process:
1. Delegate to 'designer' for architecture ({max_turns} turns max)
2. Delegate to 'coder' to implement the design
3. Verify the result
"""))
# agent-server
python -m openhands.agent_server --extensions=./waterfall.py
# CLI
openhands --strategy=./waterfall.py --task "Build a REST API for todos"
Why This Works
The SDK already has every primitive: DelegateTool, register_agent(), Critic, IterativeRefinementConfig, AgentContext, skills, hooks, and plugins. A strategy just wires them together; no new execution engine is needed.
Agent is the universal currency. agent-server serializes it into StartConversationRequest. CLI passes it to Conversation(). app_server POSTs it as JSON to agent-server. All three already revolve around building an Agent — this just makes the building step pluggable.
Zero runtime overhead. build_agent() runs once at conversation start. After that, the conversation is identical to any other.
It's just Python. No YAML, no DSL. Import from openhands.sdk, use any library, test standalone.
Loader (shared, ~20 lines)
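# openhands/sdk/strategy.py (addition)
import importlib.util
import inspect
from pathlib import Path


def load_strategy(path: str) -> Strategy:
    """Import a .py file, find the Strategy subclass, instantiate and setup."""
    p = Path(path).resolve()
    spec = importlib.util.spec_from_file_location(p.stem, p)
    if spec is None or spec.loader is None:
        raise ValueError(f"Cannot import strategy module from {path}")
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    for attr in vars(mod).values():
        # Skip the base class itself and any still-abstract intermediate classes.
        if (isinstance(attr, type) and issubclass(attr, Strategy)
                and attr is not Strategy and not inspect.isabstract(attr)):
            instance = attr()
            instance.setup()
            return instance
    raise ValueError(f"No Strategy subclass found in {path}")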
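To see the discovery logic run end to end without the SDK installed, the sketch below writes a small extension file to a temporary directory and loads it. The Strategy class here is a stand-in, and the module name strategy_standin is made up so the temporary file has something to import; a real extension would import from openhands.sdk.strategy instead.

```python
import importlib.util
import sys
import tempfile
import types
from pathlib import Path


# Stand-in base class (assumption: only the parts the loader touches).
class Strategy:
    name = ""

    def setup(self) -> None:
        pass


# Publish the stand-in under a made-up module name so the temp file can import it.
_standin = types.ModuleType("strategy_standin")
_standin.Strategy = Strategy
sys.modules["strategy_standin"] = _standin


def load_strategy(path: str) -> Strategy:
    """Same discovery steps as the shared loader, against the stand-in base."""
    p = Path(path).resolve()
    spec = importlib.util.spec_from_file_location(p.stem, p)
    if spec is None or spec.loader is None:
        raise ValueError(f"Cannot import {path}")
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    for attr in vars(mod).values():
        if isinstance(attr, type) and issubclass(attr, Strategy) and attr is not Strategy:
            instance = attr()
            instance.setup()
            return instance
    raise ValueError(f"No Strategy subclass found in {path}")


EXTENSION_SOURCE = '''
from strategy_standin import Strategy

class Demo(Strategy):
    name = "demo"
    def setup(self):
        self.ready = True
'''

with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "demo_ext.py"
    good.write_text(EXTENSION_SOURCE)
    loaded = load_strategy(str(good))
    print(loaded.name, loaded.ready)  # demo True

    empty = Path(tmp) / "empty_ext.py"
    empty.write_text("x = 1\n")
    try:
        load_strategy(str(empty))
        rejected = False
    except ValueError:
        rejected = True
    print("file without a Strategy rejected:", rejected)
```

Note that the loader returns the first matching class it finds, so a file defining several strategies would expose only one of them.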
Open Questions
- Should strategies be distributable as plugins? A strategy could be a plugin directory with strategy.py + skills/ + hooks/. The existing plugin system already handles git repos and local paths.
- Strategy composition: can a strategy use another strategy? Probably yes, since build_agent is just Python, but do we want a formal API for it?
- Should GET /api/strategies exist on agent-server, or only on app_server? Agent-server is low-level; discovery might belong higher up.
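On the composition question, plain Python already allows one strategy to wrap another without any formal API. The sketch below is only an illustration under stated assumptions: Agent and Strategy are simplified stand-ins, and WithReview is a hypothetical wrapper that reuses an inner strategy's Agent and extends its system prompt.

```python
from typing import Any, Optional


class Agent:  # stand-in for the SDK Agent
    def __init__(self, llm: str, system_message_suffix: str = "") -> None:
        self.llm = llm
        self.system_message_suffix = system_message_suffix


class Strategy:  # stand-in for the proposed base class
    name = ""

    def setup(self) -> None:
        pass

    def build_agent(self, llm: str,
                    params: Optional[dict[str, Any]] = None) -> Agent:
        raise NotImplementedError


class Waterfall(Strategy):
    name = "waterfall"

    def build_agent(self, llm, params=None):
        return Agent(llm, "Orchestrate a waterfall process.")


class WithReview(Strategy):
    """Hypothetical wrapper: reuse another strategy and extend its prompt."""

    def __init__(self, inner: Strategy) -> None:
        self.inner = inner
        self.name = f"{inner.name}+review"

    def setup(self) -> None:
        self.inner.setup()  # the inner strategy still registers its agents

    def build_agent(self, llm, params=None):
        base = self.inner.build_agent(llm, params)
        suffix = base.system_message_suffix + " Finish with a critical review."
        return Agent(base.llm, suffix)


composed = WithReview(Waterfall())
composed.setup()
agent = composed.build_agent("example-llm")
print(composed.name)  # waterfall+review
print(agent.system_message_suffix)
```

One limitation this exposes: the shared loader instantiates classes with no arguments, so a wrapper whose constructor takes an inner strategy could not be discovered from a file directly. That gap may be the strongest argument for a formal composition API.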