Add per-turn multi-agent mode #28685
| #[experimental("thread/start.multiAgentMode")] | ||
| #[serde(default)] | ||
| pub multi_agent_mode: MultiAgentMode, |
There was a problem hiding this comment.
Still experimental. This is also not yet written to config.toml. Note: this field is on the response, not on the request.
There was a problem hiding this comment.
- Add this to the thread/start request for consistency's sake.
| let inherited_multi_agent_mode = match initial_history.get_multi_agent_mode() { | ||
| Some(mode) => Some(mode), | ||
| None => match Self::inherited_thread_id_for_spawn( | ||
| &initial_history, | ||
| Some(&session_source), | ||
| parent_thread_id, | ||
| forked_from_thread_id, | ||
| ) { | ||
| Some(thread_id) => match self.get_thread(thread_id).await { | ||
| Ok(thread) => Some(thread.config_snapshot().await.multi_agent_mode), | ||
| Err(_) => None, | ||
| }, | ||
| None => None, | ||
| }, | ||
| }; |
There was a problem hiding this comment.
This means that subagents inherit the multi-agent mode from the root and therefore spawn with the corresponding developer instruction. One potential issue: if a subagent is spawned in Proactive mode, all of its subagents would be eager too, but if the user switches back to explicit, the subagent does not necessarily get that instruction update. That is acceptable, because once we hit max concurrency the subagent cannot spawn more, and the root agent will be less eager about spawning agents too.
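A minimal sketch of the fallback order in the snippet above, assuming simplified stand-in types. `inherited_mode` and the `load_parent_mode` closure are hypothetical names for illustration; the real code resolves the parent through `inherited_thread_id_for_spawn` and `get_thread`.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Hypothetical helper: a mode recorded in the child's initial history wins.
/// Otherwise the parent's snapshot is consulted lazily, and a failed parent
/// lookup (or no parent at all) yields `None`.
pub fn inherited_mode(
    history_mode: Option<MultiAgentMode>,
    load_parent_mode: impl FnOnce() -> Option<Result<MultiAgentMode, String>>,
) -> Option<MultiAgentMode> {
    history_mode.or_else(|| load_parent_mode().and_then(Result::ok))
}
```

The closure is only called when the history carries no mode, which mirrors the nested `match` in the diff without the extra indentation.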
| crate::session::multi_agents::multi_agent_mode_for_turn(next, &next.session_source) | ||
| .and_then(|multi_agent_mode| { | ||
| previous | ||
| .filter(|previous| previous.multi_agent_mode != Some(multi_agent_mode)) | ||
| .map(|_| MultiAgentModeInstructions::new(multi_agent_mode).render()) | ||
| }), |
There was a problem hiding this comment.
The repo's AGENTS.md discourages single-use helpers... but I am going to add one anyway.
- update
| sandbox: AppServerSandboxPolicy::DangerFullAccess, | ||
| active_permission_profile: None, | ||
| reasoning_effort: None, | ||
| multi_agent_mode: Default::default(), |
There was a problem hiding this comment.
You're probably right in designing this way, but just to ask: Does it make sense to add these feature flags as capabilities rather than modes? Like, allowing subagent delegation might be necessary for future features as well.
There was a problem hiding this comment.
To be fair, I am not familiar enough with capabilities in the harness world to make a judgement here. My rough intuition is that a capability is more or less what is available, whereas a mode here is more or less a user-intended choice?
| use codex_protocol::protocol::MULTI_AGENT_MODE_CLOSE_TAG; | ||
| use codex_protocol::protocol::MULTI_AGENT_MODE_OPEN_TAG; | ||
|
|
||
| const PROACTIVE_MULTI_AGENT_MODE_TEXT: &str = "Proactive multi-agent delegation is active. Any earlier instruction requiring an explicit user request before spawning sub-agents no longer applies. Use sub-agents when parallel work would materially improve speed or quality. This mode remains active until a later multi-agent mode developer message changes it."; |
There was a problem hiding this comment.
I'm amazed the most correct solution is to just append a line that tells the agent to ignore other parts of the prompt. Is that precedented? Do we need to eval stuff like this?
There was a problem hiding this comment.
Yes, we will definitely eval this. The goal here is to leave the default behavior (i.e. the current baseline) unchanged and introduce new prompts like this that we can run evals with.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 119bc45785
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ] | ||
| .into_iter() | ||
| .flatten() | ||
| .flat_map(multi_agent_v2_usage_hint_variants) |
There was a problem hiding this comment.
Stabilize full-history fork hints for proactive mode
With features.multi_agent_mode enabled, a V2 child spawned with the default fork_turns: "all" keeps the parent's TurnContext (so it inherits Proactive) but this filtering change only removes old hint variants; the full-history branch below still pushes the raw subagent_usage_hint_text, which ends with the explicit-request-only sentence. Because the retained previous context already has multi_agent_mode: Proactive, no fresh <multi_agent_mode> override is emitted, so the child sees the last developer instruction as “do not spawn” and proactive delegation is disabled for full-history subagents.
Useful? React with 👍 / 👎.
| thread_source, | ||
| dynamic_tools, | ||
| metrics_service_name: service_name, | ||
| multi_agent_mode, |
There was a problem hiding this comment.
Persist initial mode across pre-turn resumes
When a client sets multiAgentMode on thread/start and the app-server process unloads or restarts before the first turn, this value is only passed into the in-memory spawn options. The persisted thread metadata still records multi_agent_version but not the initial mode, and resume reconstructs the mode only from TurnContext items, so a later thread/resume falls back to explicitRequestOnly before any turn has written a context item. Please persist the initial mode with the thread/session metadata or otherwise restore it on resume.
AGENTS.md reference: AGENTS.md:L103-L110
| RolloutItem::TurnContext(turn_context) => turn_context.multi_agent_mode, | ||
| RolloutItem::SessionMeta(_) | ||
| | RolloutItem::ResponseItem(_) | ||
| | RolloutItem::InterAgentCommunication(_) | ||
| | RolloutItem::Compacted(_) | ||
| | RolloutItem::EventMsg(_) => None, |
There was a problem hiding this comment.
Restore standalone mode updates on resume
When Op::ThreadSettings is used on its own to change multi_agent_mode, update_thread_settings persists a ThreadSettingsApplied event with the new snapshot but does not write a TurnContext. This reconstruction helper ignores EventMsg, so after an unload/restart the resumed session falls back to an older turn context or the default and loses a persistent thread-setting update.
AGENTS.md reference: AGENTS.md:L103-L110
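One way to picture the reconstruction gap described above, as a hedged sketch with simplified stand-in types. The `RolloutItem` variants below are hypothetical reductions of the real enum, and `latest_mode` is an illustrative name, not the helper in the diff.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Hypothetical, reduced rollout item. The real enum has more variants and
/// carries full payloads rather than a single field.
pub enum RolloutItem {
    TurnContext { multi_agent_mode: Option<MultiAgentMode> },
    ThreadSettingsApplied { multi_agent_mode: Option<MultiAgentMode> },
    Other,
}

/// Walks the rollout from newest to oldest and returns the first mode found.
/// Unlike a helper that only reads `TurnContext`, this also honors a
/// standalone settings update that was persisted without a turn.
pub fn latest_mode(items: &[RolloutItem]) -> Option<MultiAgentMode> {
    items.iter().rev().find_map(|item| match item {
        RolloutItem::TurnContext { multi_agent_mode }
        | RolloutItem::ThreadSettingsApplied { multi_agent_mode } => *multi_agent_mode,
        RolloutItem::Other => None,
    })
}
```

Note the assumption baked into `find_map`: an item whose mode is `None` is skipped in favor of an older item, which may or may not be the desired resume behavior.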
| FeatureSpec { | ||
| id: Feature::MultiAgentMode, | ||
| key: "multi_agent_mode", | ||
| stage: Stage::UnderDevelopment, | ||
| default_enabled: false, | ||
| }, |
There was a problem hiding this comment.
If this feature is off, we continue adding the no_spawn hint to the static developer message and do not inject any turn-level developer message, even if multi_agent_mode is supplied on the turn.
jif-oai
left a comment
There was a problem hiding this comment.
I think there is a broader design issue here: requested mode and effective model policy are separate concepts, but ownership is spread across feature gating, multi-agent version, usage-hint eligibility, session state, rollout reconstruction, prompt generation, and child inheritance.
I would love to see a more centralised concept become the actual source of truth.
| next: &TurnContext, | ||
| ) -> Option<String> { | ||
| let multi_agent_mode = | ||
| crate::session::multi_agents::multi_agent_mode_for_turn(next, &next.session_source)?; |
There was a problem hiding this comment.
This compares only the requested mode, but model-visible policy also depends on feature/V2/usage-hint eligibility.
On cold resume, disabled -> enabled with retained proactive suppresses the first proactive fragment because the raw modes match, while enabled -> disabled emits no explicit revocation and leaves the old proactive fragment in history
There was a problem hiding this comment.
This is a good point. Instead of using feature.multi_agent_mode to toggle the static hint behavior and disable the turn-level developer message, I think we can rely purely on the turn-level developer message with a feature.multi_agent_mode check.
If the last turn is proactive and feature.multi_agent_mode is disabled on resume, we would inject the explicit instruction and toggle the multi-agent mode back to explicit.
"On cold resume, disabled -> enabled with retained proactive suppresses the first proactive fragment because the raw modes match"
I initially thought this was less of a concern on resume, but it makes more sense now. I think what I am coming up with should address this sufficiently too.
Let me iterate on this a little. We can have a centralized capability helper instead of having each entry point check its own piece.
b32aa8d to 511fa89
1f54314 to 5d78618
| const DEFAULT_MULTI_AGENT_V2_NO_SPAWN_HINT_TEXT: &str = "Do not spawn sub-agents unless the user explicitly asks for sub-agents, delegation, or parallel agent work."; | ||
|
|
||
| fn default_multi_agent_v2_usage_hint_text(usage_hint_text: &str, max_concurrency: usize) -> String { | ||
| format!( | ||
| "{usage_hint_text}\n{DEFAULT_MULTI_AGENT_V2_SHARED_USAGE_HINT_TEXT}\nThere are {max_concurrency} available concurrency slots, meaning that up to {max_concurrency} agents can be active at once, including you.\n\n{DEFAULT_MULTI_AGENT_V2_NO_SPAWN_HINT_TEXT}" |
There was a problem hiding this comment.
We are removing the no-spawn hint from the base static hint. The idea is that, with the effective multi-agent mode calculated, we can deterministically add the no-spawn (explicit invoke) or spawn (proactive) developer message per turn.
| fn body(&self) -> String { | ||
| match self.multi_agent_mode { | ||
| MultiAgentMode::ExplicitRequestOnly => { | ||
| EXPLICIT_REQUEST_ONLY_MULTI_AGENT_MODE_TEXT.to_string() | ||
| } | ||
| MultiAgentMode::Proactive => PROACTIVE_MULTI_AGENT_MODE_TEXT.to_string(), | ||
| } | ||
| } |
There was a problem hiding this comment.
For each MultiAgentMode, we then choose the corresponding developer message.
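For illustration, a self-contained sketch of how such a `body` could be wrapped into the final developer message. The tag values and both message texts below are assumptions (the real constants come from `codex_protocol` and the full explicit-mode text is not shown in this thread), and the shape of `render` is a guess.

```rust
// Assumed values: the review mentions a <multi_agent_mode> message, but the
// actual constants are defined elsewhere and not visible here.
const OPEN_TAG: &str = "<multi_agent_mode>";
const CLOSE_TAG: &str = "</multi_agent_mode>";
// Shortened placeholder texts, not the real prompt strings.
const EXPLICIT_TEXT: &str = "Do not spawn sub-agents unless the user explicitly asks.";
const PROACTIVE_TEXT: &str = "Proactive multi-agent delegation is active.";

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

pub struct MultiAgentModeInstructions {
    multi_agent_mode: MultiAgentMode,
}

impl MultiAgentModeInstructions {
    pub fn new(multi_agent_mode: MultiAgentMode) -> Self {
        Self { multi_agent_mode }
    }

    /// Picks the text for the selected mode.
    fn body(&self) -> &'static str {
        match self.multi_agent_mode {
            MultiAgentMode::ExplicitRequestOnly => EXPLICIT_TEXT,
            MultiAgentMode::Proactive => PROACTIVE_TEXT,
        }
    }

    /// Wraps the body in the mode tags so later messages can supersede it.
    pub fn render(&self) -> String {
        format!("{OPEN_TAG}\n{}\n{CLOSE_TAG}", self.body())
    }
}
```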
5d78618 to ddb088f
| &next.session_source, | ||
| ); | ||
| let previous = previous?; | ||
| if previous.multi_agent_mode == effective_multi_agent_mode { |
There was a problem hiding this comment.
Remember that we only store effective_multi_agent_mode (the agent mode calculated from the combination of the current thread's MA version and features such as MA v2, usage hint enabled, multi-agent mode, etc.). We can therefore use it to diff whether the agent mode changed between turns (consider a thread resume where multi-agent mode goes from disabled to enabled). This gives us a single point of comparison: it tells us whether MA mode is enabled relative to the last turn and whether we should add the developer message that restricts its usage.
| if previous.multi_agent_mode == effective_multi_agent_mode { | ||
| return None; | ||
| } |
There was a problem hiding this comment.
If we observe no change, there is no need to add a developer message; the existing message already represents the current need.
ddb088f to a50ab75
| } | ||
|
|
||
| match effective_multi_agent_mode { | ||
| Some(multi_agent_mode) => Some(MultiAgentModeInstructions::new(multi_agent_mode).render()), |
There was a problem hiding this comment.
If there is a diff, we will inject that instruction based on the new multi agent mode.
| None if previous.multi_agent_mode == Some(MultiAgentMode::Proactive) => { | ||
| Some(MultiAgentModeInstructions::new(MultiAgentMode::ExplicitRequestOnly).render()) | ||
| } |
There was a problem hiding this comment.
None means that no multi agent mode is valid, but if the previous mode is proactive, we need to intentionally disable it.
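Putting the three annotated branches together, the decision could be sketched as below. This is a simplified stand-in: `PreviousTurn` and `mode_to_announce` are hypothetical names, the function returns the mode to announce rather than a rendered string, and the final `None => None` arm is an assumption because it is not shown in the thread.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Hypothetical reduction of the previous turn context to the one field
/// that matters for this comparison.
pub struct PreviousTurn {
    pub multi_agent_mode: Option<MultiAgentMode>,
}

/// Returns the mode whose instructions should be injected this turn, if any.
pub fn mode_to_announce(
    previous: Option<&PreviousTurn>,
    effective: Option<MultiAgentMode>,
) -> Option<MultiAgentMode> {
    // No previous turn context: nothing to diff against.
    let previous = previous?;
    // No change: the message already in history still applies.
    if previous.multi_agent_mode == effective {
        return None;
    }
    match effective {
        // The mode changed: announce the new one.
        Some(mode) => Some(mode),
        // Mode no longer applicable after a proactive turn: revoke explicitly.
        None if previous.multi_agent_mode == Some(MultiAgentMode::Proactive) => {
            Some(MultiAgentMode::ExplicitRequestOnly)
        }
        // Assumed remaining case: nothing to revoke, so stay silent.
        None => None,
    }
}
```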
a50ab75 to ec2f4a9
| if !multi_agent_mode_is_applicable(multi_agent_version, multi_agent_v2, session_source) { | ||
| return None; | ||
| } |
There was a problem hiding this comment.
If the mode is not applicable, we know we are not using MA v2 and therefore are not injecting a usage hint.
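A rough sketch of the centralized calculation, under stated assumptions: the real `effective_multi_agent_mode` takes the multi-agent version, the v2 config, and the session source, which are collapsed here into a single hypothetical `applicable` flag. The resolution rules follow the PR description (an unset selection or a disabled gate resolves to explicit-request-only on eligible turns).

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Hypothetical simplification of the effective-mode calculation.
pub fn effective_mode(
    applicable: bool,
    requested: Option<MultiAgentMode>,
    feature_enabled: bool,
) -> Option<MultiAgentMode> {
    // Not an eligible multi-agent v2 turn: no mode is model-visible.
    if !applicable {
        return None;
    }
    // Gate off: fall back to the safe default regardless of the request.
    if !feature_enabled {
        return Some(MultiAgentMode::ExplicitRequestOnly);
    }
    // Gate on: honor the selection, defaulting when nothing was selected.
    Some(requested.unwrap_or(MultiAgentMode::ExplicitRequestOnly))
}
```

Keeping this in one function means every entry point compares the same value, which is the single source of truth the review asked for.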
ec2f4a9 to 9ac7c38
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9ac7c38f19
| /// Omitted keeps the loaded session's current mode. | ||
| #[experimental("turn/start.multiAgentMode")] | ||
| #[ts(optional = nullable)] | ||
| pub multi_agent_mode: Option<MultiAgentMode>, |
There was a problem hiding this comment.
Surface sticky multi-agent mode updates to clients
Because this new turn/start field is documented as sticky for subsequent turns, a request that only changes multiAgentMode now applies a persistent ThreadSettingsOverrides, but ThreadSettingsSnapshot and the app-server ThreadSettings notification shape still omit the mode. The core will emit ThreadSettingsApplied, yet app-server event handling compares a snapshot that is identical to the prior baseline and sends no thread/settings/updated notification, so another attached app-server client can keep showing/using the old mode while subsequent omitted turns run with the new one. Please include the mode in the thread settings snapshots/notifications when accepting this sticky override.
AGENTS.md reference: AGENTS.md:L103-L110
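To make the notification gap concrete, here is a hedged sketch with a reduced, hypothetical `ThreadSettingsSnapshot`. The real snapshot has more fields and `should_notify` is an illustrative name. The point is only that a field missing from the snapshot can never make two snapshots differ.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Hypothetical, reduced snapshot that already includes the mode.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ThreadSettingsSnapshot {
    pub reasoning_effort: Option<String>,
    pub multi_agent_mode: Option<MultiAgentMode>,
}

/// A settings-updated notification is only worth sending when the applied
/// snapshot differs from the baseline the clients already know about.
pub fn should_notify(baseline: &ThreadSettingsSnapshot, applied: &ThreadSettingsSnapshot) -> bool {
    baseline != applied
}
```

With the mode carried in the snapshot, a turn that only changes `multiAgentMode` produces a differing snapshot and other attached clients can be told about it.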
| multi_agent_mode: super::multi_agents::effective_multi_agent_mode( | ||
| self.multi_agent_version, | ||
| &self.config.multi_agent_v2, | ||
| &self.session_source, | ||
| self.multi_agent_mode, | ||
| self.features.enabled(Feature::MultiAgentMode), | ||
| ), |
There was a problem hiding this comment.
Persist the requested mode, not only the effective one
When multiAgentMode: "proactive" is accepted while the mode feature is disabled or multi-agent v2 is not currently applicable, the live SessionConfiguration retains Some(Proactive), but this rollout snapshot writes only the effective model-visible mode (explicitRequestOnly or None). On unload/restart, conversation_history.get_multi_agent_mode() restores from TurnContextItem, so the retained request is lost and a later eligible resume will not become proactive unless the client resends the setting. Persist the requested sticky mode separately from the effective prompt diff baseline.
AGENTS.md reference: AGENTS.md:L103-L110
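A small sketch of what persisting both values could look like, assuming hypothetical names throughout (`PersistedModes`, `snapshot`, and the boolean eligibility flags are not from the PR). It shows why the requested mode has to survive a restart for a later eligible resume to become proactive.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Hypothetical persisted pair: what the client asked for, and what the
/// model actually saw on that turn.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PersistedModes {
    pub requested: Option<MultiAgentMode>,
    pub effective: Option<MultiAgentMode>,
}

/// Builds the pair for a turn. `applicable` and `feature_enabled` stand in
/// for the real eligibility inputs.
pub fn snapshot(
    applicable: bool,
    feature_enabled: bool,
    requested: Option<MultiAgentMode>,
) -> PersistedModes {
    let effective = if !applicable {
        None
    } else if feature_enabled {
        Some(requested.unwrap_or(MultiAgentMode::ExplicitRequestOnly))
    } else {
        Some(MultiAgentMode::ExplicitRequestOnly)
    };
    PersistedModes { requested, effective }
}
```

On resume, feeding the stored `requested` value back into the same calculation recovers proactive mode once the gate is enabled, while `effective` remains the baseline for the prompt diff.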
Why
Multi-agent v2 currently carries an explicit-request-only delegation rule in its static usage hint. That provides a safe default, but it prevents clients from selecting proactive delegation per turn without changing static guidance or rewriting prior model context.
This change makes delegation mode a session selection that can be updated through turn/start, while deriving the effective model-visible mode separately for each turn. Eligible multi-agent v2 turns remain explicit-request-only unless proactive mode is both selected and enabled.
What changed
- turn/start.multiAgentMode parameter with explicitRequestOnly and proactive values. Omission retains the loaded session's current optional selection.
- features.multi_agent_mode feature gate. Eligible multi-agent v2 turns use the selected mode when enabled; an unset selection or disabled gate resolves to explicitRequestOnly.
- TurnContextItem as the durable baseline for resume and context-update comparisons.
Historical rollout items are not rewritten. Later mode developer messages establish the current rule incrementally.
Not covered
- thread/start and selected-mode reporting from thread lifecycle/settings APIs; those are isolated in the stacked Expose thread-level multi-agent mode #28792.
- config.toml; selection remains session/turn scoped.
Verification
- CARGO_INCREMENTAL=0 just test -p codex-core multi_agent_mode
- turn/start.multiAgentMode produces proactive developer instructions for an eligible v2 turn.
Stack
Followed by #28792, which adds thread/start initialization and lifecycle/settings observability.
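The sticky turn/start semantics described under "What changed" (an omitted field retains the session's current selection) can be sketched as follows. `SessionSettings` and `apply_turn_start` are hypothetical names used only for illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Hypothetical session state holding the optional selection.
#[derive(Debug, Default)]
pub struct SessionSettings {
    pub multi_agent_mode: Option<MultiAgentMode>,
}

impl SessionSettings {
    /// Applies the optional field from a turn/start request. `None` means
    /// the field was omitted, so the current selection is left untouched.
    pub fn apply_turn_start(&mut self, requested: Option<MultiAgentMode>) {
        if let Some(mode) = requested {
            self.multi_agent_mode = Some(mode);
        }
    }
}
```

A client that sends `proactive` once and then omits the field on later turns keeps the proactive selection until it sends a different value.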