feat: pursue a goal autonomously #270
…ng, and metadata reservation
…ta, and replay ignore
…nd behind goal-command flag
…ed by goal-command
…h budget threshold bands
…urnFlow afterStep
…with budget and step-cap stops
…st, flag docs, and gates
…plus loop-safety hardening
…int, not a fatal error
…ching + compaction)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bf46229fbc
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
💡 Codex Review
Reviewed commit: e68378e8ba
💡 Codex Review
Reviewed commit: b0815f5ea1
💡 Codex Review
Reviewed commit: b7f34e153d
💡 Codex Review
Reviewed commit: 734b1d438d
    if (!stopHookContinuationUsed) {
      const stopBlock = await this.agent.hooks?.triggerBlock('Stop', {
        signal,
Block Stop-hook continuations after budget exhaustion
When an active goal hits its token budget, afterStep returns stopTurn, but runTurn still invokes this shouldContinueAfterStop hook before the goal driver can mark the goal blocked. In sessions with a Stop hook that returns a block/continuation, this branch appends the hook prompt and returns continue: true, causing another model step after the hard budget has already been reached. Check stopForGoalBudget before allowing steer/Stop-hook continuations so configured token ceilings remain deterministic.
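A minimal sketch of the suggested ordering, assuming hypothetical shapes: `StepOutcome`, `StopHookResult`, `ContinueDecision`, and `decideAfterStop` are illustrative names, not the PR's real types, and the real hook call is asynchronous. The point is only that the budget stop is checked before any Stop-hook continuation is considered.

```typescript
// Hypothetical shapes for illustration; the PR's real types may differ.
interface StepOutcome {
  stopTurn: boolean;
  stopForGoalBudget: boolean;
}

interface StopHookResult {
  block: boolean;
  prompt?: string;
}

interface ContinueDecision {
  continue: boolean;
  appendedPrompt?: string;
}

function decideAfterStop(
  outcome: StepOutcome,
  stopHookContinuationUsed: boolean,
  runStopHook: () => StopHookResult,
): ContinueDecision {
  // A hard budget stop wins: the Stop hook is never consulted once the ceiling is hit.
  if (outcome.stopForGoalBudget) {
    return { continue: false };
  }
  // Otherwise allow at most one Stop-hook continuation, as in the original branch.
  if (!stopHookContinuationUsed) {
    const hook = runStopHook();
    if (hook.block) {
      return { continue: true, appendedPrompt: hook.prompt };
    }
  }
  return { continue: false };
}
```

With this ordering, a Stop hook that always blocks cannot trigger another model step after the budget has been exhausted, because the hook is not invoked at all on that path.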
    const reservedGoal = this.session.metadata.custom?.['goal'];
    const patchCustom = (payload.metadata as Partial<SessionMeta> | undefined)?.custom;
    if (patchCustom !== undefined && 'goal' in patchCustom) {
Reserve goal metadata on create and fork
This guard only protects updateSessionMetadata, but the public create/fork session metadata path is copied directly into Session.metadata.custom, where goal is now reserved. An SDK caller that creates or forks a session with custom metadata { goal: ... } can still seed arbitrary goal state outside the lifecycle methods and flag gate, and later getGoal()/goal driving will treat that host metadata as the current goal. Apply the same reservation to create/fork metadata inputs.
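One way to apply the same reservation to caller-supplied metadata could look like the sketch below. `RESERVED_CUSTOM_KEYS`, `assertNoReservedCustomKeys`, and `createSessionMetadata` are hypothetical names introduced for illustration; only the reserved key `goal` comes from the PR. Whether to reject the input or silently strip the key is a design choice this sketch does not settle; it rejects.

```typescript
// Hypothetical helper names; only the reserved key 'goal' is taken from the PR.
const RESERVED_CUSTOM_KEYS: ReadonlySet<string> = new Set(['goal']);

type CustomMetadata = Record<string, unknown>;

function assertNoReservedCustomKeys(custom: CustomMetadata | undefined): void {
  if (custom === undefined) return;
  for (const key of Object.keys(custom)) {
    if (RESERVED_CUSTOM_KEYS.has(key)) {
      throw new Error(
        `metadata.custom.${key} is reserved; use the goal lifecycle methods instead`,
      );
    }
  }
}

// Stand-in for the create/fork input path: validate first, then copy.
function createSessionMetadata(input: { custom?: CustomMetadata }): { custom: CustomMetadata } {
  assertNoReservedCustomKeys(input.custom);
  return { custom: { ...(input.custom ?? {}) } };
}
```

Calling the same check from the create, fork, and update paths would keep the reservation in one place instead of three separate guards.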
    typeof state.status === 'string' &&
    ALL_GOAL_STATUSES.has(state.status) &&
    typeof state.turnsUsed === 'number' &&
    typeof state.tokensUsed === 'number' &&
Reject goal records missing wall-clock fields
normalizeMetadata() relies on this predicate before treating metadata.custom.goal as a SessionGoalState, but this accepts records that lack required fields such as wallClockMs, createdAt, and actor timestamps. For an imported/forked session with the checked subset present, the malformed record is kept and later snapshots compute elapsed time from undefined, producing NaN/invalid goal status output instead of dropping the corrupt goal. Require the full numeric/string fields used by toSnapshot() and budget math here.
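A stricter predicate might look like the following sketch. The field types are assumptions: `wallClockMs` and `createdAt` are treated as finite numbers here, the status values in `ALL_GOAL_STATUSES` are placeholders, and the actor timestamps mentioned above are omitted because their names are not shown in the diff. `GoalStateSketch` and `isGoalStateSketch` are hypothetical names.

```typescript
// Placeholder status values; the PR's real ALL_GOAL_STATUSES may differ.
const ALL_GOAL_STATUSES: ReadonlySet<string> = new Set(['active', 'blocked', 'complete']);

// Assumed field types; the real SessionGoalState has more fields.
interface GoalStateSketch {
  status: string;
  turnsUsed: number;
  tokensUsed: number;
  wallClockMs: number;
  createdAt: number;
}

function isFiniteNumber(value: unknown): value is number {
  // Rejects undefined, NaN, and Infinity, which would poison elapsed-time math.
  return typeof value === 'number' && Number.isFinite(value);
}

function isGoalStateSketch(value: unknown): value is GoalStateSketch {
  if (typeof value !== 'object' || value === null) return false;
  const state = value as Record<string, unknown>;
  const status = state['status'];
  return (
    typeof status === 'string' &&
    ALL_GOAL_STATUSES.has(status) &&
    isFiniteNumber(state['turnsUsed']) &&
    isFiniteNumber(state['tokensUsed']) &&
    isFiniteNumber(state['wallClockMs']) &&
    isFiniteNumber(state['createdAt'])
  );
}
```

Using a finite-number check rather than a bare `typeof` test also drops records that already contain `NaN`, so a corrupt goal is discarded instead of surfacing in snapshots.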
    await store.setBudgetLimits({ budgetLimits: budget, actor: 'model' });
    return { output: `Goal budget set: ${formatBudget(args.value, args.unit)}.` };
Stop immediately when SetGoalBudget is already exhausted
When the model records a user-stated budget after the goal has already spent at least that much (for example it has used one turn and calls SetGoalBudget({ value: 1, unit: 'turns' }) as instructed), this persists an over-budget goal but returns an ordinary tool result. Because the goal driver only checks budgets at the turn boundary, the current tool batch and even the next model step can continue before the goal is blocked; return a stop hint or block immediately when the new snapshot is over budget.
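The immediate check could be sketched as below, assuming a hypothetical `stopTurn` flag on the tool result and simplified usage counters; `GoalUsage`, `ToolResult`, `isOverBudget`, and `setGoalBudgetResult` are illustrative names, and persistence via `store.setBudgetLimits` is left out. The comparison uses `>=` because the comment describes a goal that has spent "at least that much" as exhausted.

```typescript
type BudgetUnit = 'turns' | 'tokens';

// Simplified usage counters; the real snapshot carries more state.
interface GoalUsage {
  turnsUsed: number;
  tokensUsed: number;
}

// `stopTurn` is a hypothetical stop hint, not a confirmed field of the PR's tool result.
interface ToolResult {
  output: string;
  stopTurn?: boolean;
}

function isOverBudget(usage: GoalUsage, value: number, unit: BudgetUnit): boolean {
  const used = unit === 'turns' ? usage.turnsUsed : usage.tokensUsed;
  return used >= value;
}

function setGoalBudgetResult(usage: GoalUsage, value: number, unit: BudgetUnit): ToolResult {
  if (isOverBudget(usage, value, unit)) {
    // Signal the stop right away instead of waiting for the next turn boundary.
    return {
      output: `Goal budget set: ${value} ${unit}. Budget already exhausted; stopping.`,
      stopTurn: true,
    };
  }
  return { output: `Goal budget set: ${value} ${unit}.` };
}
```

For the case described above, one turn used and a budget of one turn, the result carries the stop hint, so the rest of the tool batch and the next model step do not run on an over-budget goal.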
This PR implements the “goal” feature, which has been requested many times.
Related Issue
TBA.
Problem
TBA.
What changed
TBA.
Checklist
- … gen-changesets skill, or this PR needs no changeset.
- … gen-docs skill, or this PR needs no doc update.