Signed-commit push silently invents unrelated file changes (and bypasses protected_files) when checkout is shallow and base branch advances #36934

## Summary

When the agent checks out the target repo shallowly and the base branch advances during the run, `pushSignedCommits` produces a PR commit whose **parent** is current `origin/<base>` but whose **tree** comes from the agent's stale-base checkout. This silently reverts every intervening change to the affected files, so the resulting PR contains dozens to hundreds of file changes the agent never made. The `protected_files` policy is also bypassed, because it is evaluated against the agent's emitted patch rather than against the synthesized GraphQL `fileChanges` payload.

## Concrete repro

- Workflow run: https://github.com/github/github-automation/actions/runs/26955005543
- Resulting PR: https://github.com/github/github/pull/434722
- gh-aw actions version used by the run: `v0.78.1`

| Fact | Value |
|---|---|
| PR commit | `53829c3f360d090447237acd79707c7e6f99a477` |
| PR commit parent | `dc55a3747c9c38503e62f90af61796290d8169ea` (current `master`) |
| PR commit committer | `web-flow` (signed via GraphQL `createCommitOnBranch`) |
| PR commit files changed | **99** (+7323 / -2263) |
| Agent's actual patch (from `safe-output-items.jsonl` artifact) | **1 file** — `test/integration/blob_controller_test.rb`, ~14 lines |
| Agent's bundle tip SHA | `f1ddbe37b9b9b444815e935cd4c2401275735151` (never reachable via API) |
| Agent's bundle declared prerequisite | `07028dfaa35b484328b7f8dbe32ee5c36d152ad3` (`2026-06-04T10:23:10Z`) |
| `master` was ahead of that prerequisite by | **178 commits** at safe-outputs time |

`compare 07028dfa...master` → `ahead_by: 178, files: 98`. That matches the PR's 99 changed files almost exactly (98 from staleness + 1 from the agent's edit).
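The arithmetic above can be illustrated with a toy model. This is not gh-aw code: trees are plain path-to-content maps and the file names are made up. It only shows why comparing the agent's tree with the current base, rather than with the stale base, yields the staleness files plus the agent's own edit.

```javascript
// Toy model: a "tree" is a map from path to file content.
// Returns the sorted list of paths whose content differs between two trees.
function changedPaths(from, to) {
  const paths = new Set([...Object.keys(from), ...Object.keys(to)]);
  return [...paths].filter((p) => from[p] !== to[p]).sort();
}

// Hypothetical three-file repo at the commit the agent started from.
const staleBase = { "a.rb": "a1", "b.rb": "b1", "test.rb": "t1" };
// The base branch moved on: a.rb and b.rb changed upstream.
const currentBase = { "a.rb": "a2", "b.rb": "b2", "test.rb": "t1" };
// The agent edited only test.rb, starting from the stale base.
const agentTip = { ...staleBase, "test.rb": "t2" };

const intended = changedPaths(staleBase, agentTip);
const pushed = changedPaths(currentBase, agentTip);

console.log(intended); // [ 'test.rb' ]
console.log(pushed); // [ 'a.rb', 'b.rb', 'test.rb' ]
```

In the incident the same relationship holds at scale: 1 intended file, 98 files of upstream drift, 99 files in the PR.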

## Mechanism

The workflow uses `actions/checkout@v6` with `fetch-depth: 20` on a high-churn monorepo (`github/github` advances dozens of commits per hour). The agent's local "master" was more than 3 hours stale by the time the safe-outputs job ran. The generated safe-outputs job checks out `master` again with `fetch-depth: 20`, but the bundle commits it unpacks still record their original (stale) first parent.

In `actions/setup/js/push_signed_commits.cjs`, `pushSignedCommits`:

1. Resolves `expectedHeadOid` from `git ls-remote origin refs/heads/<branch>` / `baseRefOid` — i.e. **current `origin/<base>`**.
2. For each commit in `origin/<base>..HEAD`, runs `git diff-tree -r --raw <sha>` against the commit's **own first parent** (the stale base baked into the bundle) and packages those into `fileChanges: { additions, deletions }`.
3. Calls the GraphQL `createCommitOnBranch` mutation with that combination.

GitHub then writes a new commit whose parent is current `<base>` but whose tree is `parent_tree + agent_fileChanges`. Because `agent_fileChanges` were computed against a 178-commit-old tree, every file where `stale_<base>_tree` differs from `current_<base>_tree` gets silently reverted to the stale content.
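For readers unfamiliar with the mutation, here is a minimal sketch of the input it takes. The field names follow GitHub's public GraphQL schema for `createCommitOnBranch` as I understand it. The helper `buildCommitInput`, the repository name, and the branch name are hypothetical, and the real code in `push_signed_commits.cjs` may be structured differently.

```javascript
// Builds the `input` variable for the GraphQL createCommitOnBranch mutation.
// `additions` maps path -> full file content; `deletions` is a list of paths.
function buildCommitInput({ repo, branch, expectedHeadOid, headline, additions, deletions }) {
  return {
    branch: { repositoryNameWithOwner: repo, branchName: branch },
    // Becomes the new commit's parent; must equal the branch's current head.
    expectedHeadOid,
    message: { headline },
    fileChanges: {
      // Each addition carries the WHOLE file, base64-encoded, not a diff hunk.
      additions: Object.entries(additions).map(([path, text]) => ({
        path,
        contents: Buffer.from(text, "utf8").toString("base64"),
      })),
      deletions: deletions.map((path) => ({ path })),
    },
  };
}

const input = buildCommitInput({
  repo: "octo-org/example", // hypothetical
  branch: "agent/fix-test", // hypothetical
  expectedHeadOid: "dc55a3747c9c38503e62f90af61796290d8169ea",
  headline: "Fix flaky test",
  additions: { CODEOWNERS: "* @stale-team\n" }, // made-up stale content
  deletions: [],
});
console.log(JSON.stringify(input, null, 2));
```

Because each addition replaces the file wholesale, any path in the payload whose content came from a stale tree overwrites whatever the parent commit has for that path.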

## Protection bypass

The workflow has:

```yaml
safe-outputs:
  create-pull-request:
    max_patch_files: 100
    protected_files: [..., "CODEOWNERS", ...]
    protected_files_policy: "blocked"
```

`CODEOWNERS` and `.rubocop_todo.yml` (+6775 lines, reverting a recent cleanup) both ended up in the PR. The protection check inspects the agent's emitted patch (1 file) — not the synthesized `fileChanges` payload actually sent to GraphQL — so it has no chance to fire.

## Why this is severe

- The PR title/body/commit message describe a 1-file change. A reviewer skimming the description has no signal that 98 unrelated files were touched, and several of those reverts would be silently merged if auto-merge or trust were enabled.
- `protected_files` is an explicit safety net that users rely on. It is bypassed without warning.
- Any gh-aw workflow that targets a high-churn repo with the documented `fetch-depth` examples is exposed.

## Suggested fix (fixes 1 and 2 are both needed)

A natural first instinct is "just deepen the safe-outputs checkout". That alone is **not sufficient**: even with a full clone, `pushSignedCommits` still computes `fileChanges` from `diff-tree` against each bundle commit's recorded first parent (the stale base), and still sends that payload to a GraphQL mutation whose `expectedHeadOid` is current `<base>`. The 99-file PR would still happen.

The core fix is the pair of fixes 1 and 2 below; fixes 3 and 4 are additional safeguards.

### 1. Deepen until the bundle's base is reachable

In the generated safe-outputs job, after unbundling the agent's commits, fetch `origin/<base>` until the bundle's recorded base SHA (the prerequisite encoded in the `.bundle` header, or the first parent of the bundle's tip) is reachable locally. Either:

- iterate `git fetch origin <base> --deepen=<N>` until `git merge-base --is-ancestor <bundle-base> origin/<base>` succeeds, or
- run `git fetch origin <base> --unshallow` once when shallow.

Without this, fix 2 has no merge base to work with.
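The iterative option could look roughly like the sketch below. `deepenUntilReachable` and the `runGit` callback are hypothetical names. `runGit(args)` is assumed to run `git <args>` and return the exit status as a number, and the step size and round limit are arbitrary defaults. The demo uses a simulated runner so the loop can be followed without a repository.

```javascript
// Deepens a shallow clone until `bundleBase` is an ancestor of origin/<base>.
// `runGit(args)` is caller-supplied and returns the git exit status.
function deepenUntilReachable({ runGit, base, bundleBase, step = 50, maxRounds = 20 }) {
  const reachable = () =>
    // Status 0 means "is an ancestor". Anything else (1, or an error status
    // when the object is missing from the shallow clone) means "not yet".
    runGit(["merge-base", "--is-ancestor", bundleBase, `origin/${base}`]) === 0;

  for (let round = 0; round < maxRounds; round++) {
    if (reachable()) return round; // number of deepen rounds that were needed
    const status = runGit(["fetch", "origin", base, `--deepen=${step}`]);
    if (status !== 0) throw new Error(`git fetch --deepen failed with status ${status}`);
  }
  if (reachable()) return maxRounds;
  throw new Error(`${bundleBase} not reachable from origin/${base} after ${maxRounds} rounds`);
}

// Simulated runner: the base becomes reachable after three fetches.
let fetches = 0;
const fakeGit = (args) => {
  if (args[0] === "fetch") {
    fetches++;
    return 0;
  }
  return fetches >= 3 ? 0 : 1;
};

const rounds = deepenUntilReachable({ runGit: fakeGit, base: "master", bundleBase: "07028dfa" });
console.log(rounds); // 3
```

In a real job the runner could wrap `child_process.spawnSync("git", args).status`. The round limit keeps a bad bundle header from turning into an unbounded fetch loop.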

### 2. Compute `fileChanges` against current `origin/<base>` (the GraphQL parent)

In `pushSignedCommits`, after fix 1 has made the merge base reachable, either:

- **Rebase the bundle onto current `origin/<base>`** and then build `fileChanges` per rebased commit (each commit's first parent now matches what GraphQL will use), **or**
- **Build a single combined diff** with `git diff <merge-base>...<bundle-tip>` (three-dot — diff from the merge base, *not* from the agent's recorded stale parent) and send that as a single `createCommitOnBranch` call.

If the rebase / cherry-pick has conflicts, refuse the push with a clear error and surface it to the user — silently inventing 98 file changes is the failure mode being fixed.
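One conservative way to implement the "refuse rather than invent" rule without a full rebase is a path-overlap check, sketched below. This is an assumption of mine, not existing gh-aw behavior, and `assertNoOverlap` is a hypothetical name. It is stricter than a real merge, since it rejects any file touched on both sides even when the hunks would not conflict. That strictness is deliberate here, because `fileChanges` additions carry whole files.

```javascript
// Conservative pre-push check. Inputs are lists of paths, for example from
//   git diff --name-only <merge-base> <bundle-tip>     (agentPaths)
//   git diff --name-only <merge-base> origin/<base>    (upstreamPaths)
// Throws if any file was changed on both sides since the merge base.
function assertNoOverlap(agentPaths, upstreamPaths) {
  const upstream = new Set(upstreamPaths);
  const overlap = agentPaths.filter((p) => upstream.has(p));
  if (overlap.length > 0) {
    throw new Error(
      `Refusing to push: ${overlap.length} file(s) changed on both sides since the merge base: ` +
        overlap.join(", ")
    );
  }
  return agentPaths; // only these paths belong in the fileChanges payload
}

const ok = assertNoOverlap(
  ["test/integration/blob_controller_test.rb"],
  ["CODEOWNERS", ".rubocop_todo.yml"]
);
console.log(ok); // [ 'test/integration/blob_controller_test.rb' ]
```

When the check passes, the payload is limited to the agent's own paths, so upstream changes to other files are left untouched on the new commit.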

### 3. Validate the synthesized payload, not the agent's input

Run `protected_files`, `max_patch_files`, and `max_patch_size` against the final `fileChanges` array that is about to be sent to GraphQL, not only against the agent's emitted patch. This is a defense-in-depth check that would have caught this incident even without fixes 1 and 2.
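A minimal sketch of such a payload-level check follows. `validateFileChanges` is a hypothetical name, and the matching rule (a protected entry matches the full path or the final path component) is an assumption, since the actual `protected_files` matching semantics are not described here. A `max_patch_size` check is omitted for brevity.

```javascript
// Validates the final GraphQL fileChanges payload against policy limits.
// Matching rule (an assumption): a protected entry matches a path when it
// equals the full path or the path's final component.
function validateFileChanges(fileChanges, { protectedFiles = [], maxPatchFiles = Infinity } = {}) {
  const paths = [
    ...(fileChanges.additions || []).map((a) => a.path),
    ...(fileChanges.deletions || []).map((d) => d.path),
  ];
  const errors = [];
  if (paths.length > maxPatchFiles) {
    errors.push(`payload touches ${paths.length} files, limit is ${maxPatchFiles}`);
  }
  const isProtected = (p) => protectedFiles.some((f) => p === f || p.split("/").pop() === f);
  for (const p of paths.filter(isProtected)) {
    errors.push(`protected file in payload: ${p}`);
  }
  return errors; // an empty array means the payload passed
}

const errors = validateFileChanges(
  {
    additions: [
      { path: "test/integration/blob_controller_test.rb", contents: "" },
      { path: "CODEOWNERS", contents: "" },
    ],
    deletions: [],
  },
  { protectedFiles: ["CODEOWNERS"], maxPatchFiles: 100 }
);
console.log(errors); // [ 'protected file in payload: CODEOWNERS' ]
```

Run against the incident's 99-file payload with `max_patch_files: 100`, the file-count limit alone would not have fired, but the `CODEOWNERS` entry would have.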

### 4. Default `fetch-depth` guidance

Until fixes 1 and 2 ship, the example/template workflows for high-churn targets should default to `fetch-depth: 0` (or auto-deepen on demand) rather than `20`.

## Artifacts available

I have the workflow run artifacts saved locally and can attach: the agent's `aw-*.patch` (1 file), the agent's `.bundle` (declares `07028dfa` as prerequisite), and the `safe-output-items.jsonl` containing the `create_pull_request` item.


