RFC: Pluggable Multi-Agent Strategies #2364

@enyst

Summary (from RFC)

  • Problem: agent-server and CLI hardcode how an Agent is built; multi-agent patterns
    require manual wiring and aren't shareable.
  • Proposal: introduce a Strategy protocol in openhands-sdk to build a configured Agent.
  • Strategy: a Python class with setup() and build_agent(llm, params).
  • Consumers: agent-server loads strategies via --extensions, CLI via --strategy, app-server
    can list strategies and launch them.
  • Outcome: pluggable, shareable multi-agent pipelines with no added per-turn overhead,
    since build_agent() runs once at conversation start.

RFC: Pluggable Multi-Agent Strategies for OpenHands

Status: Draft
Author: @rbren
Scope: openhands-sdk, openhands-agent-server, openhands-cli

Problem

Both agent-server and openhands-cli hardcode how an Agent gets built. Multi-agent patterns (delegation, adversarial review, phased pipelines) require manually wiring DelegateTool, register_agent(), critic configs, custom system prompts, and skills. There's no way to package and share these configurations.

Proposal

A Strategy is a Python class that takes an LLM and returns a configured Agent. One protocol, defined in openhands-sdk, usable by every consumer.

The Protocol

The protocol lives in openhands-sdk because that is the layer shared by every consumer.

# openhands/sdk/strategy.py

from abc import ABC, abstractmethod
from typing import Any
from pydantic import BaseModel
from openhands.sdk import Agent, LLM

class StrategyParam(BaseModel):
    name: str
    description: str
    type: str = "string"   # string | integer | boolean
    default: Any = None

class Strategy(ABC):
    """A factory that assembles SDK primitives into a configured Agent."""

    name: str
    description: str
    params: list[StrategyParam] = []

    def setup(self) -> None:
        """Called once on load. Register tools, agent types, etc."""

    @abstractmethod
    def build_agent(self, llm: LLM, params: dict[str, Any] | None = None) -> Agent:
        """Return a fully configured Agent for this strategy."""

That's it. A Strategy calls the same things you'd call by hand — register_agent(), DelegateTool, AgentContext, Critic, whatever — and returns an Agent.
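The protocol declares each parameter's type and default but does not say who applies them. One possible approach, sketched below, is a small helper that fills in defaults, coerces values to the declared type, and rejects unknown names before build_agent() sees them. The helper name resolve_params and the dataclass stand-in for StrategyParam are assumptions made so the sketch runs without the SDK installed; neither is part of the RFC.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StrategyParam:
    """Stand-in for the RFC's pydantic StrategyParam (same fields)."""
    name: str
    description: str
    type: str = "string"  # string | integer | boolean
    default: Any = None


def _to_bool(value: Any) -> bool:
    """Accept real booleans plus the usual string spellings from a CLI or form."""
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered in ("true", "1", "yes"):
            return True
        if lowered in ("false", "0", "no"):
            return False
        raise ValueError(f"not a boolean: {value!r}")
    return bool(value)


_COERCERS = {"string": str, "integer": int, "boolean": _to_bool}


def resolve_params(
    declared: list[StrategyParam], supplied: Optional[dict[str, Any]]
) -> dict[str, Any]:
    """Hypothetical helper: apply defaults, coerce types, reject unknown names."""
    remaining = dict(supplied or {})
    resolved: dict[str, Any] = {}
    for param in declared:
        value = remaining.pop(param.name, param.default)
        # A missing value with no default stays None instead of being coerced.
        resolved[param.name] = None if value is None else _COERCERS[param.type](value)
    if remaining:
        raise ValueError(f"unknown params: {sorted(remaining)}")
    return resolved
```

With a helper like this, a strategy could read params["max_turns"] directly instead of repeating the default inside build_agent().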

How Each Consumer Uses It

agent-server (--extensions=./waterfall.py):

startup:  load module → find Strategy subclass → call setup() → register in dict
API:      POST /api/strategies/{name}/start  →  strategy.build_agent(llm, params)
                                             →  wrap in StartConversationRequest
                                             →  existing conversation_service.start()

New files: strategy_router.py (discovery + start endpoints). Modified: __main__.py (+6 lines for --extensions arg and loader). POST /api/conversations is unchanged.
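As a rough illustration of the startup and start flow above, the sketch below keeps loaded strategies in a dict and maps a start call onto build_agent(). It leaves out the HTTP layer entirely. The class name StrategyRegistry and the plain dict standing in for StartConversationRequest are assumptions for illustration, not the contents of strategy_router.py.

```python
from typing import Any, Optional


class StrategyRegistry:
    """Hypothetical in-memory registry; the RFC only says 'register in dict'."""

    def __init__(self) -> None:
        self._strategies: dict[str, Any] = {}

    def register(self, strategy: Any) -> None:
        # setup() is assumed to have run already in the loader.
        if strategy.name in self._strategies:
            raise ValueError(f"duplicate strategy name: {strategy.name}")
        self._strategies[strategy.name] = strategy

    def names(self) -> list[str]:
        return sorted(self._strategies)

    def start(
        self, name: str, llm: Any, params: Optional[dict[str, Any]] = None
    ) -> dict[str, Any]:
        """Mirror of POST /api/strategies/{name}/start without the HTTP layer."""
        if name not in self._strategies:
            raise LookupError(f"unknown strategy: {name}")
        agent = self._strategies[name].build_agent(llm, params)
        # Placeholder for StartConversationRequest; the real model has more fields.
        return {"agent": agent}
```

An HTTP handler would presumably translate the LookupError into a 404 and pass the returned request to the existing conversation service.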

openhands-cli (openhands --strategy=./waterfall.py):

startup:  load module → find Strategy subclass → call setup()
run:      strategy.build_agent(llm_from_agent_store, params)
          → Conversation(agent=agent, workspace=...) → .run()

Modified: entrypoint.py (+1 arg), setup.py (if --strategy, use it instead of load_agent_specs()). Everything else — TUI, confirmation, visualizer — works unchanged because it already takes an Agent.
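The branch described for setup.py can be sketched as follows. This is an assumption-heavy illustration: it uses argparse, which the actual CLI may not use, and the functions build_parser and select_agent are hypothetical names. The loader and the default-agent path are passed in as callables so the sketch runs without the SDK.

```python
import argparse
from typing import Any, Callable


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; only --strategy and --task come from the RFC."""
    parser = argparse.ArgumentParser(prog="openhands")
    parser.add_argument("--strategy", default=None, help="path to a strategy .py file")
    parser.add_argument("--task", default=None, help="initial task for the agent")
    return parser


def select_agent(
    args: argparse.Namespace,
    llm: Any,
    load_strategy: Callable[[str], Any],
    load_default_agent: Callable[[], Any],
) -> Any:
    """Use the strategy when --strategy is given, else the existing default path."""
    if args.strategy:
        return load_strategy(args.strategy).build_agent(llm, None)
    return load_default_agent()
```

Because both branches return an Agent, the code that constructs the Conversation would not need to know which one ran.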

app_server (future — UI dropdown):

GET /api/strategies → list loaded strategies with param schemas → render form
POST → build StartConversationRequest → POST to agent-server inside sandbox

Replaces the hardcoded AgentType.PLAN / AgentType.DEFAULT switch in _create_agent_with_context() with a pluggable version.
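For the listing step, a UI needs each strategy's name, description, and parameter schema in a serializable form. The sketch below shows one possible shape for that payload. The function name describe_strategy and the exact JSON layout are assumptions, and SimpleNamespace objects stand in for real Strategy and StrategyParam instances.

```python
import json
from types import SimpleNamespace
from typing import Any


def describe_strategy(strategy: Any) -> dict[str, Any]:
    """Hypothetical response item for GET /api/strategies."""
    return {
        "name": strategy.name,
        "description": strategy.description,
        "params": [
            {
                "name": p.name,
                "description": p.description,
                "type": p.type,
                "default": p.default,
            }
            for p in strategy.params
        ],
    }


# Stand-in objects with the same attributes as the RFC's Waterfall example.
waterfall = SimpleNamespace(
    name="waterfall",
    description="Design → Code → Test pipeline",
    params=[
        SimpleNamespace(
            name="max_turns",
            description="Max turns per phase",
            type="integer",
            default=8,
        )
    ],
)
payload = json.dumps([describe_strategy(waterfall)])
```

A form renderer could then pick an input widget per param from the type field and prefill it from default.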

What an Extension Looks Like

# waterfall.py
from openhands.sdk.strategy import Strategy, StrategyParam
from openhands.sdk import Agent, AgentContext, LLM
from openhands.sdk.context import Skill
from openhands.sdk.tool import Tool
from openhands.tools.delegate import DelegateTool, register_agent
from openhands.tools.preset.default import get_default_tools

def _designer(llm: LLM) -> Agent:
    return Agent(llm=llm, tools=[], agent_context=AgentContext(
        system_message_suffix="You are a system architect. Output a design doc."))

def _coder(llm: LLM) -> Agent:
    return Agent(llm=llm, tools=get_default_tools())

class Waterfall(Strategy):
    name = "waterfall"
    description = "Design → Code → Test pipeline"
    params = [StrategyParam(name="max_turns", type="integer", default=8,
                            description="Max turns per phase")]

    def setup(self):
        register_agent("designer", _designer, "System architect")
        register_agent("coder", _coder, "Implementation engineer")

    def build_agent(self, llm, params=None):
        max_turns = (params or {}).get("max_turns", 8)
        return Agent(
            llm=llm,
            tools=[Tool(name="DelegateTool"), *get_default_tools()],
            agent_context=AgentContext(system_message_suffix=f"""
Orchestrate a waterfall process:
1. Delegate to 'designer' for architecture ({max_turns} turns max)
2. Delegate to 'coder' to implement the design
3. Verify the result
"""))

# agent-server
python -m openhands.agent_server --extensions=./waterfall.py

# CLI
openhands --strategy=./waterfall.py --task "Build a REST API for todos"

Why This Works

The SDK already has every primitive: DelegateTool, register_agent(), Critic, IterativeRefinementConfig, AgentContext, skills, hooks, and plugins. A strategy just wires them together. No new execution engine.

Agent is the universal currency. agent-server serializes it into StartConversationRequest. CLI passes it to Conversation(). app_server POSTs it as JSON to agent-server. All three already revolve around building an Agent — this just makes the building step pluggable.

Zero runtime overhead. build_agent() runs once at conversation start. After that, the conversation is identical to any other.

It's just Python. No YAML, no DSL. Import from openhands.sdk, use any library, test standalone.

Loader (shared, ~20 lines)

# openhands/sdk/strategy.py (addition)

import importlib.util
from pathlib import Path

def load_strategy(path: str) -> Strategy:
    """Import a .py file, find the Strategy subclass, instantiate and setup."""
    p = Path(path).resolve()
    spec = importlib.util.spec_from_file_location(p.stem, p)
    if spec is None or spec.loader is None:
        raise ValueError(f"Cannot import {path} as a Python module")
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    # Return the first Strategy subclass found in the module namespace.
    for attr in vars(mod).values():
        if isinstance(attr, type) and issubclass(attr, Strategy) and attr is not Strategy:
            instance = attr()
            instance.setup()
            return instance
    raise ValueError(f"No Strategy subclass found in {path}")
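The loader above returns the first Strategy subclass it finds in the module namespace, which could also be a subclass the extension merely imported or an abstract intermediate base. A stricter discovery step, sketched below as one possible refinement, keeps only concrete classes defined in the loaded module itself. The function name find_strategy_classes is hypothetical, and the demo uses a stand-in base class and a synthetic module so it runs without the SDK.

```python
import inspect
from abc import ABC, abstractmethod
from types import ModuleType


def find_strategy_classes(mod: ModuleType, base: type) -> list[type]:
    """Concrete subclasses of `base` defined in `mod` itself, not imported into it."""
    return [
        attr
        for attr in vars(mod).values()
        if isinstance(attr, type)
        and issubclass(attr, base)
        and attr is not base
        and attr.__module__ == mod.__name__
        and not inspect.isabstract(attr)
    ]


# Demo with a stand-in base class and a synthetic module named "ext".
class Base(ABC):
    @abstractmethod
    def build_agent(self, llm, params=None): ...


class Local(Base):
    __module__ = "ext"  # pretend this class was defined in the extension file

    def build_agent(self, llm, params=None):
        return llm


class Imported(Base):
    __module__ = "some_library"  # pretend this one was imported into the file

    def build_agent(self, llm, params=None):
        return llm


mod = ModuleType("ext")
mod.Base, mod.Local, mod.Imported = Base, Local, Imported
found = find_strategy_classes(mod, Base)  # only Local qualifies
```

A loader built on this could raise when the list is empty and decide explicitly what to do when it contains more than one class.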

Open Questions

  1. Should strategies be distributable as plugins? A strategy could be a plugin directory with strategy.py + skills/ + hooks/. The existing plugin system already handles git repos and local paths.
  2. Strategy composition — can a strategy use another strategy? Probably yes, since build_agent is just Python, but do we want a formal API for it?
  3. Should GET /api/strategies exist on agent-server, or only on app_server? Agent-server is low-level; discovery might belong higher up.
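On question 2, informal composition already seems possible with nothing beyond build_agent(). The sketch below shows a hypothetical composite that dispatches to one of several inner strategies based on a "variant" param. The Router class and the "variant" param are illustrative assumptions, and the Strategy class is a trimmed stand-in for the RFC's base class. Because the loader shown earlier instantiates strategies with no arguments, a loadable version would need to construct its inner strategies itself.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class Strategy(ABC):
    """Stand-in for the RFC's Strategy base class, trimmed to what the sketch needs."""

    name: str
    description: str

    def setup(self) -> None:
        """Called once on load."""

    @abstractmethod
    def build_agent(self, llm: Any, params: Optional[dict[str, Any]] = None) -> Any:
        """Return a configured agent."""


class Router(Strategy):
    """Hypothetical composite: picks one inner strategy by the 'variant' param."""

    name = "router"
    description = "Dispatch to one of several inner strategies"

    def __init__(self, inner: dict[str, Strategy], default: str) -> None:
        self._inner = inner
        self._default = default

    def setup(self) -> None:
        # Forward setup so inner strategies can register their tools and agents.
        for strategy in self._inner.values():
            strategy.setup()

    def build_agent(self, llm: Any, params: Optional[dict[str, Any]] = None) -> Any:
        remaining = dict(params or {})
        variant = remaining.pop("variant", self._default)
        # The inner strategy receives the remaining params untouched.
        return self._inner[variant].build_agent(llm, remaining)
```

A formal composition API would mainly add value if strategies need to merge or post-process each other's Agents, which plain delegation like this does not cover.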
