
Add per-turn multi-agent mode #28685

Open
shijie-oai wants to merge 1 commit into main from shijie/ma-v2-mode

@shijie-oai shijie-oai commented Jun 17, 2026

Collaborator

Why

Multi-agent v2 currently carries an explicit-request-only delegation rule in its static usage hint. That provides a safe default, but it prevents clients from selecting proactive delegation per turn without changing static guidance or rewriting prior model context.

This change makes delegation mode a session selection that can be updated through turn/start, while deriving the effective model-visible mode separately for each turn. Eligible multi-agent v2 turns remain explicit-request-only unless proactive mode is both selected and enabled.

What changed

  • Add the experimental turn/start.multiAgentMode parameter with explicitRequestOnly and proactive values. Omission retains the loaded session's current optional selection.
  • Add the default-off features.multi_agent_mode feature gate. Eligible multi-agent v2 turns use the selected mode when enabled; an unset selection or disabled gate resolves to explicitRequestOnly.
  • Treat mode prompting as inapplicable for multi-agent v1 and other unsupported session configurations, producing no multi-agent mode developer message rather than rejecting the turn.
  • Move the explicit-request-only rule out of the static v2 usage hint and into a bounded, tagged developer context fragment.
  • Emit the effective mode in initial context and only when that effective mode changes on later turns.
  • Persist the effective mode in TurnContextItem as the durable baseline for resume and context-update comparisons.

Historical rollout items are not rewritten. Later mode developer messages establish the current rule incrementally.
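
The resolution rule described in the bullets above can be sketched as follows. This is a minimal illustration, not the PR's code: `resolve_effective_mode` and its boolean parameters are hypothetical stand-ins for the real eligibility check and the `features.multi_agent_mode` gate.

```rust
/// Delegation mode, mirroring the two values named in this PR.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Hypothetical resolver: derives the model-visible mode for one turn.
///
/// `eligible` stands in for "multi-agent v2 turn where mode prompting applies";
/// `gate_enabled` stands in for the `features.multi_agent_mode` gate.
pub fn resolve_effective_mode(
    eligible: bool,
    gate_enabled: bool,
    selected: Option<MultiAgentMode>,
) -> Option<MultiAgentMode> {
    if !eligible {
        // v1 and unsupported configurations: no mode message at all.
        return None;
    }
    if !gate_enabled {
        // A disabled gate always resolves to the safe default.
        return Some(MultiAgentMode::ExplicitRequestOnly);
    }
    // An unset selection also resolves to the safe default.
    Some(selected.unwrap_or(MultiAgentMode::ExplicitRequestOnly))
}
```

Under these assumptions, proactive delegation is model-visible only when the turn is eligible, the gate is on, and the selection is `Proactive`; every other eligible combination yields `ExplicitRequestOnly`.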

Not covered

  • Initial selection through thread/start and selected-mode reporting from thread lifecycle/settings APIs; those are isolated in the stacked PR, Expose thread-level multi-agent mode #28792.
  • A TUI control or slash command for selecting the mode.
  • Persisting a preferred mode to config.toml; selection remains session/turn scoped.
  • Changes to multi-agent concurrency limits, tool availability, or model catalog capability declarations.
  • Rewriting historical rollout prompt items. Cold resume restores the latest persisted effective mode when available while leaving historical developer messages intact.

Verification

  • CARGO_INCREMENTAL=0 just test -p codex-core multi_agent_mode
  • Focused app-server coverage verifies that turn/start.multiAgentMode produces proactive developer instructions for an eligible v2 turn.

Stack

Followed by #28792, which adds thread/start initialization and lifecycle/settings observability.

Comment on lines +181 to +183
#[experimental("thread/start.multiAgentMode")]
#[serde(default)]
pub multi_agent_mode: MultiAgentMode,

@shijie-oai shijie-oai Jun 17, 2026

Collaborator Author

Still experimental. This is also not written to config.toml yet. Note: this is on the response, not on the request.

@shijie-oai shijie-oai Jun 17, 2026

Collaborator Author

  • Add this to the thread/start request for consistency's sake.

Comment thread codex-rs/core/src/thread_manager.rs Outdated
Comment on lines +1414 to +1428
let inherited_multi_agent_mode = match initial_history.get_multi_agent_mode() {
    Some(mode) => Some(mode),
    None => match Self::inherited_thread_id_for_spawn(
        &initial_history,
        Some(&session_source),
        parent_thread_id,
        forked_from_thread_id,
    ) {
        Some(thread_id) => match self.get_thread(thread_id).await {
            Ok(thread) => Some(thread.config_snapshot().await.multi_agent_mode),
            Err(_) => None,
        },
        None => None,
    },
};

Collaborator Author

This means the subagent inherits the multi-agent mode from the root and therefore spawns with the corresponding developer instruction. One potential issue: if a subagent is spawned in Proactive, all of its subagents would be eager too, but if the user switches back to explicit, the subagent does not necessarily get that instruction update. That is okay, because if we do hit max concurrency the subagent cannot spawn more, and the root agent will be less eager with agent spawning too.
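
The fallback order in the snippet above (history first, then the parent thread) can be sketched with `Option::or_else`. This is a synchronous simplification under stated assumptions: `inherited_mode` and the `parent_lookup` closure are hypothetical, and the real code awaits the thread lookup.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Hypothetical, synchronous stand-in for the nested `match` in the diff.
///
/// `from_history` models the mode recovered from the initial history;
/// `parent_lookup` models reading the parent thread's config snapshot.
pub fn inherited_mode(
    from_history: Option<MultiAgentMode>,
    parent_lookup: impl FnOnce() -> Option<MultiAgentMode>,
) -> Option<MultiAgentMode> {
    // History wins; the parent is consulted only when history has no mode.
    from_history.or_else(parent_lookup)
}
```

Because the closure runs lazily, the parent thread is not consulted when the history already carries a mode, which matches the order of the nested `match`.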

Comment on lines +230 to +235
crate::session::multi_agents::multi_agent_mode_for_turn(next, &next.session_source)
    .and_then(|multi_agent_mode| {
        previous
            .filter(|previous| previous.multi_agent_mode != Some(multi_agent_mode))
            .map(|_| MultiAgentModeInstructions::new(multi_agent_mode).render())
    }),

Collaborator Author

The repo's AGENTS.md discourages single-use helpers, but I am going to add one anyway.

  • update

@shijie-oai shijie-oai marked this pull request as ready for review June 17, 2026 17:03
@shijie-oai shijie-oai requested a review from a team as a code owner June 17, 2026 17:03
sandbox: AppServerSandboxPolicy::DangerFullAccess,
active_permission_profile: None,
reasoning_effort: None,
multi_agent_mode: Default::default(),

Contributor

You're probably right in designing this way, but just to ask: Does it make sense to add these feature flags as capabilities rather than modes? Like, allowing subagent delegation might be necessary for future features as well.

Collaborator Author

To be fair, I am not familiar enough with capabilities in the harness world to make a judgement here. My rough understanding is that a capability is more or less what is available, whereas a mode here is more or less a choice the user intends?

use codex_protocol::protocol::MULTI_AGENT_MODE_CLOSE_TAG;
use codex_protocol::protocol::MULTI_AGENT_MODE_OPEN_TAG;

const PROACTIVE_MULTI_AGENT_MODE_TEXT: &str = "Proactive multi-agent delegation is active. Any earlier instruction requiring an explicit user request before spawning sub-agents no longer applies. Use sub-agents when parallel work would materially improve speed or quality. This mode remains active until a later multi-agent mode developer message changes it.";

Contributor

I'm amazed the most correct solution is to just append a line that tells the agent to ignore other parts of the prompt. Is that precedented? Do we need to eval stuff like this?

Collaborator Author

Yes, we will definitely eval this. The goal here is to leave the default behavior (i.e. the current baseline) unchanged and to introduce new prompts like this that we can run evals with.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Contributor

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 119bc45785

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

]
.into_iter()
.flatten()
.flat_map(multi_agent_v2_usage_hint_variants)

Contributor

P2: Stabilize full-history fork hints for proactive mode

With features.multi_agent_mode enabled, a V2 child spawned with the default fork_turns: "all" keeps the parent's TurnContext (so it inherits Proactive) but this filtering change only removes old hint variants; the full-history branch below still pushes the raw subagent_usage_hint_text, which ends with the explicit-request-only sentence. Because the retained previous context already has multi_agent_mode: Proactive, no fresh <multi_agent_mode> override is emitted, so the child sees the last developer instruction as “do not spawn” and proactive delegation is disabled for full-history subagents.


thread_source,
dynamic_tools,
metrics_service_name: service_name,
multi_agent_mode,

Contributor

P2: Persist initial mode across pre-turn resumes

When a client sets multiAgentMode on thread/start and the app-server process unloads or restarts before the first turn, this value is only passed into the in-memory spawn options. The persisted thread metadata still records multi_agent_version but not the initial mode, and resume reconstructs the mode only from TurnContext items, so a later thread/resume falls back to explicitRequestOnly before any turn has written a context item. Please persist the initial mode with the thread/session metadata or otherwise restore it on resume.

AGENTS.md reference: AGENTS.md:L103-L110


Comment thread codex-rs/core/src/context/multi_agent_mode_instructions.rs
Comment on lines +2872 to +2877
RolloutItem::TurnContext(turn_context) => turn_context.multi_agent_mode,
RolloutItem::SessionMeta(_)
| RolloutItem::ResponseItem(_)
| RolloutItem::InterAgentCommunication(_)
| RolloutItem::Compacted(_)
| RolloutItem::EventMsg(_) => None,

Contributor

P2: Restore standalone mode updates on resume

When Op::ThreadSettings is used on its own to change multi_agent_mode, update_thread_settings persists a ThreadSettingsApplied event with the new snapshot but does not write a TurnContext. This reconstruction helper ignores EventMsg, so after an unload/restart the resumed session falls back to an older turn context or the default and loses a persistent thread-setting update.

AGENTS.md reference: AGENTS.md:L103-L110
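
The reconstruction gap described in this comment can be sketched with a reverse scan over rollout items. Everything here is a simplified assumption: the `RolloutItem` variants are trimmed to three, `SettingsApplied` is a hypothetical stand-in for a persisted thread-settings event, and it is assumed the real helper is applied newest-first.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Simplified stand-in for the rollout item type.
pub enum RolloutItem {
    /// Carries the mode persisted with a turn, if any.
    TurnContext { multi_agent_mode: Option<MultiAgentMode> },
    /// Hypothetical shape for a persisted thread-settings event.
    SettingsApplied { multi_agent_mode: Option<MultiAgentMode> },
    /// Any other item, which carries no mode.
    Other,
}

/// Returns the most recent persisted mode, scanning newest to oldest.
/// With `include_settings_events` false, settings events are ignored,
/// which reproduces the behavior the review comment flags.
pub fn latest_mode(
    items: &[RolloutItem],
    include_settings_events: bool,
) -> Option<MultiAgentMode> {
    items.iter().rev().find_map(|item| match item {
        RolloutItem::TurnContext { multi_agent_mode } => *multi_agent_mode,
        RolloutItem::SettingsApplied { multi_agent_mode } if include_settings_events => {
            *multi_agent_mode
        }
        _ => None,
    })
}
```

When a settings-only update follows the last turn, the scan that skips events returns the older turn's mode, while the scan that includes them returns the update.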


Comment on lines +992 to +997
FeatureSpec {
    id: Feature::MultiAgentMode,
    key: "multi_agent_mode",
    stage: Stage::UnderDevelopment,
    default_enabled: false,
},

Collaborator Author

If this feature is off, we continue adding the no_spawn hint to the static developer message and do not inject any turn-level developer message, even if multi_agent_mode is supplied on the turn.

@jif-oai jif-oai left a comment

Collaborator

I think there is a broader design issue here: requested mode and effective model policy are separate concepts, but ownership is spread across feature gating, multi-agent version, usage-hint eligibility, session state, rollout reconstruction, prompt generation, and child inheritance

I would love to see a more centralised concept becoming the actual source of truth

next: &TurnContext,
) -> Option<String> {
let multi_agent_mode =
crate::session::multi_agents::multi_agent_mode_for_turn(next, &next.session_source)?;

Collaborator

This compares only the requested mode, but model-visible policy also depends on feature/V2/usage-hint eligibility.
On cold resume, disabled -> enabled with a retained proactive selection suppresses the first proactive fragment because the raw modes match, while enabled -> disabled emits no explicit revocation and leaves the old proactive fragment in history.

@shijie-oai shijie-oai Jun 17, 2026

Collaborator Author

This is a good point. Instead of using features.multi_agent_mode to toggle the static hint behavior and disable the turn-level developer message, I think we can rely purely on the turn-level developer message with a features.multi_agent_mode check.

If the last turn was proactive and features.multi_agent_mode is disabled on resume, we would inject the explicit instruction and toggle the multi-agent mode back to explicit.

On cold resume, disabled -> enabled with retained proactive suppresses the first proactive fragment because the raw modes match

At first I thought this was less of a concern on resume, but it makes more sense now. I think what I am coming up with should address this sufficiently too.

@shijie-oai

Collaborator Author

I think there is a broader design issue here: requested mode and effective model policy are separate concepts, but ownership is spread across feature gating, multi-agent version, usage-hint eligibility, session state, rollout reconstruction, prompt generation, and child inheritance

I would love to see a more centralised concept becoming the actual source of truth

Let me iterate on this a little bit. We can have a centralized capability helper instead of having each entry point check its own piece.

@shijie-oai shijie-oai marked this pull request as draft June 17, 2026 18:12
@shijie-oai shijie-oai force-pushed the shijie/ma-v2-mode branch 2 times, most recently from 1f54314 to 5d78618 June 17, 2026 22:04
Comment on lines -248 to -252
const DEFAULT_MULTI_AGENT_V2_NO_SPAWN_HINT_TEXT: &str = "Do not spawn sub-agents unless the user explicitly asks for sub-agents, delegation, or parallel agent work.";

fn default_multi_agent_v2_usage_hint_text(usage_hint_text: &str, max_concurrency: usize) -> String {
format!(
"{usage_hint_text}\n{DEFAULT_MULTI_AGENT_V2_SHARED_USAGE_HINT_TEXT}\nThere are {max_concurrency} available concurrency slots, meaning that up to {max_concurrency} agents can be active at once, including you.\n\n{DEFAULT_MULTI_AGENT_V2_NO_SPAWN_HINT_TEXT}"

Collaborator Author

We are removing the no-spawn hint from the base static hint. The idea is that, with the effective multi-agent mode calculated, we can deterministically add the no-spawn (explicit invoke) or spawn (proactive) developer message per turn.
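
A rough sketch of what the static hint could look like once the no-spawn sentence is dropped. The function name and the `shared_hint_text` parameter are assumptions (the shared hint constant's text is not shown in this thread, so it is passed in rather than guessed); only the concurrency sentence is taken from the removed lines above.

```rust
/// Hypothetical post-change builder for the static v2 usage hint.
/// It no longer appends the explicit-request-only sentence; that rule
/// is expected to arrive through the per-turn mode developer message.
pub fn static_usage_hint_text(
    usage_hint_text: &str,
    shared_hint_text: &str,
    max_concurrency: usize,
) -> String {
    format!(
        "{usage_hint_text}\n{shared_hint_text}\nThere are {max_concurrency} available concurrency slots, meaning that up to {max_concurrency} agents can be active at once, including you."
    )
}
```

Keeping the static hint mode-neutral means a later mode message never has to contradict text that was baked into the session's first developer message.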

Comment on lines +33 to +40
fn body(&self) -> String {
    match self.multi_agent_mode {
        MultiAgentMode::ExplicitRequestOnly => {
            EXPLICIT_REQUEST_ONLY_MULTI_AGENT_MODE_TEXT.to_string()
        }
        MultiAgentMode::Proactive => PROACTIVE_MULTI_AGENT_MODE_TEXT.to_string(),
    }
}

Collaborator Author

For each MultiAgentMode value, we then choose the corresponding developer message.
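
For illustration, a tagged fragment built from the selected body might be rendered like this. The tag strings and both body texts are placeholders chosen for the sketch, and `ModeInstructions` is a hypothetical stand-in for `MultiAgentModeInstructions`; the real constants live in `codex_protocol` and the real prompt text is in the diff.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

// Assumed tag values; the real ones are MULTI_AGENT_MODE_OPEN_TAG / CLOSE_TAG.
const OPEN_TAG: &str = "<multi_agent_mode>";
const CLOSE_TAG: &str = "</multi_agent_mode>";

// Placeholder bodies, not the PR's actual prompt text.
const EXPLICIT_TEXT: &str = "Spawn sub-agents only on explicit user request.";
const PROACTIVE_TEXT: &str = "Proactive multi-agent delegation is active.";

/// Hypothetical stand-in for the instructions type in the diff.
pub struct ModeInstructions {
    mode: MultiAgentMode,
}

impl ModeInstructions {
    pub fn new(mode: MultiAgentMode) -> Self {
        Self { mode }
    }

    /// Picks the body text for the mode.
    fn body(&self) -> &'static str {
        match self.mode {
            MultiAgentMode::ExplicitRequestOnly => EXPLICIT_TEXT,
            MultiAgentMode::Proactive => PROACTIVE_TEXT,
        }
    }

    /// Wraps the body in the open and close tags so the fragment is bounded.
    pub fn render(&self) -> String {
        format!("{OPEN_TAG}\n{}\n{CLOSE_TAG}", self.body())
    }
}
```

Bounding the fragment with tags is what lets later code recognize a mode message in history without matching on the prompt wording.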

&next.session_source,
);
let previous = previous?;
if previous.multi_agent_mode == effective_multi_agent_mode {

Collaborator Author

Remember that we store only effective_multi_agent_mode: the mode calculated from the combination of the current thread's multi-agent version and its features (multi-agent v2, usage hint enabled, multi-agent mode, etc.). We can therefore use it to diff whether the agent mode changed between turns (consider a thread resume where multi-agent mode goes from disabled to enabled). This gives a single point of comparison that tells us whether the mode changed relative to the last turn and whether we should add a developer message to restrict its usage.

Comment on lines +106 to +108
if previous.multi_agent_mode == effective_multi_agent_mode {
    return None;
}

Collaborator Author

If we observe no change, there is no need to add a developer message; the existing message already represents the current rule.

}

match effective_multi_agent_mode {
Some(multi_agent_mode) => Some(MultiAgentModeInstructions::new(multi_agent_mode).render()),

Collaborator Author

If there is a diff, we inject the instruction for the new multi-agent mode.

Comment on lines +112 to +114
None if previous.multi_agent_mode == Some(MultiAgentMode::Proactive) => {
    Some(MultiAgentModeInstructions::new(MultiAgentMode::ExplicitRequestOnly).render())
}

Collaborator Author

None means that no multi-agent mode is valid, but if the previous mode was proactive, we need to intentionally disable it.
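
Putting the preceding fragments together, the per-turn update logic might look roughly like this. It is a sketch assembled from the hunks quoted in this thread: `mode_update`, `PreviousTurn`, and `render` are hypothetical names, the rendered string is a placeholder, and the final `None => None` arm is an assumption since the diff does not show it.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Stand-in for the persisted baseline from the previous turn.
pub struct PreviousTurn {
    pub multi_agent_mode: Option<MultiAgentMode>,
}

/// Placeholder renderer; the real one emits the tagged prompt fragment.
fn render(mode: MultiAgentMode) -> String {
    format!("<multi_agent_mode>{mode:?}</multi_agent_mode>")
}

/// Returns a developer message only when the effective mode changed.
pub fn mode_update(
    previous: Option<&PreviousTurn>,
    effective: Option<MultiAgentMode>,
) -> Option<String> {
    // No baseline means this is not a later-turn update.
    let previous = previous?;
    if previous.multi_agent_mode == effective {
        // No change: the message already in history still applies.
        return None;
    }
    match effective {
        // The mode changed: state the new rule.
        Some(mode) => Some(render(mode)),
        // Mode no longer applicable after proactive: revoke explicitly.
        None if previous.multi_agent_mode == Some(MultiAgentMode::Proactive) => {
            Some(render(MultiAgentMode::ExplicitRequestOnly))
        }
        // Assumed: nothing to revoke, so stay silent.
        None => None,
    }
}
```

Comparing effective modes rather than requested ones means a gate flip on resume shows up as a diff even when the client's selection did not change.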

Comment on lines +59 to +61
if !multi_agent_mode_is_applicable(multi_agent_version, multi_agent_v2, session_source) {
    return None;
}

Collaborator Author

If the mode is not applicable, we know we are not using multi-agent v2 and are therefore not injecting the usage hint.

@shijie-oai shijie-oai requested a review from jif-oai June 18, 2026 07:12
@shijie-oai shijie-oai marked this pull request as ready for review June 18, 2026 07:12

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Contributor

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9ac7c38f19


/// Omitted keeps the loaded session's current mode.
#[experimental("turn/start.multiAgentMode")]
#[ts(optional = nullable)]
pub multi_agent_mode: Option<MultiAgentMode>,

Contributor

P2: Surface sticky multi-agent mode updates to clients

Because this new turn/start field is documented as sticky for subsequent turns, a request that only changes multiAgentMode now applies a persistent ThreadSettingsOverrides, but ThreadSettingsSnapshot and the app-server ThreadSettings notification shape still omit the mode. The core will emit ThreadSettingsApplied, yet app-server event handling compares a snapshot that is identical to the prior baseline and sends no thread/settings/updated notification, so another attached app-server client can keep showing/using the old mode while subsequent omitted turns run with the new one. Please include the mode in the thread settings snapshots/notifications when accepting this sticky override.

AGENTS.md reference: AGENTS.md:L103-L110
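
The notification issue raised here comes down to snapshot equality. A minimal sketch, assuming a trimmed-down `SettingsSnapshot` (hypothetical; the real `ThreadSettingsSnapshot` has more fields) and a hypothetical `settings_update` helper that decides whether to notify:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Hypothetical, trimmed-down settings snapshot that includes the mode.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct SettingsSnapshot {
    pub reasoning_effort: Option<String>,
    pub multi_agent_mode: Option<MultiAgentMode>,
}

/// Returns the snapshot to broadcast, or None when nothing observable changed.
pub fn settings_update(
    baseline: &SettingsSnapshot,
    current: &SettingsSnapshot,
) -> Option<SettingsSnapshot> {
    (baseline != current).then(|| current.clone())
}
```

If the snapshot type omitted `multi_agent_mode`, a mode-only change would compare equal to the baseline and no update would be sent, which is the behavior the comment describes.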


Comment on lines +419 to +425
multi_agent_mode: super::multi_agents::effective_multi_agent_mode(
    self.multi_agent_version,
    &self.config.multi_agent_v2,
    &self.session_source,
    self.multi_agent_mode,
    self.features.enabled(Feature::MultiAgentMode),
),

Contributor

P2: Persist the requested mode, not only the effective one

When multiAgentMode: "proactive" is accepted while the mode feature is disabled or multi-agent v2 is not currently applicable, the live SessionConfiguration retains Some(Proactive), but this rollout snapshot writes only the effective model-visible mode (explicitRequestOnly or None). On unload/restart, conversation_history.get_multi_agent_mode() restores from TurnContextItem, so the retained request is lost and a later eligible resume will not become proactive unless the client resends the setting. Persist the requested sticky mode separately from the effective prompt diff baseline.

AGENTS.md reference: AGENTS.md:L103-L110
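
One possible shape for this suggestion, shown only as a sketch: persist the requested and effective modes as separate fields and restore each into its own role on resume. All type and field names below are hypothetical and do not reflect the actual `TurnContextItem` layout.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MultiAgentMode {
    ExplicitRequestOnly,
    Proactive,
}

/// Hypothetical persisted record; both field names are illustrative.
pub struct PersistedTurnContext {
    /// What the client asked for (sticky across turns).
    pub requested_mode: Option<MultiAgentMode>,
    /// What the model was actually told on that turn.
    pub effective_mode: Option<MultiAgentMode>,
}

/// State rebuilt on resume.
#[derive(Debug, PartialEq, Eq)]
pub struct ResumedModes {
    /// Restored client selection, re-resolved on the next turn.
    pub sticky_selection: Option<MultiAgentMode>,
    /// Baseline used to decide whether a new mode message is needed.
    pub diff_baseline: Option<MultiAgentMode>,
}

pub fn restore(item: &PersistedTurnContext) -> ResumedModes {
    ResumedModes {
        sticky_selection: item.requested_mode,
        diff_baseline: item.effective_mode,
    }
}
```

With this split, a `proactive` request accepted while the gate is off survives a restart, and a later eligible turn can still become proactive without the client resending the setting.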


3 participants