rn-133: add git-am and commit messages article

chriscool · chriscool · commit 1d69c1a8b4ca · 2026-03-30T12:12:11.000+02:00
diff --git a/rev_news/drafts/edition-133.md b/rev_news/drafts/edition-133.md
@@ -25,9 +25,244 @@ This edition covers what happened during the months of February and March 2026.
 ### Reviews
 -->
 
-<!---
 ### Support
--->
+
++ [git-am applies commit message diffs](https://lore.kernel.org/git/bcqvh7ahjjgzpgxwnr4kh3hfkksfruf54refyry3ha7qk7dldf@fij5calmscvm)
+
+On February 6, 2026, Matthias Beyer forwarded to the Git mailing list a
+surprising warning that had just circulated on Mastodon:
+
+> PSA: Did you know that it's **unsafe** to put code diffs into your
+> commit messages?
+>
+> Such diffs will be applied by patch(1) (also git-am(1)) as part of
+> the code change!
+>
+> This is how a sleep(1) made it into i3 4.25-2 in Debian unstable.
+
+The incident had originated in the i3 window manager project, where a
+commit message contained an unindented diff for illustration purposes.
+When Debian packagers later applied the patch using `patch(1)`, the diff
+in the commit message was applied as actual code, sneaking a spurious
+`sleep(1)` call into the Debian unstable package. Matthias asked the
+list whether this was a known issue and whether it could be an attack
+vector.
+
+To understand why this happens, it helps to know how `git-am` parses
+its input. When processing a patch email, it must split the stream into
+two parts: the commit message and the actual patch to apply. It does
+this by treating the first occurrence of any of the following lines as
+the boundary between the two:
+
+- a line consisting of only three dashes (`---`),
+- a line beginning with `diff -`, or
+- a line beginning with `Index: `.
+
+Everything before that boundary becomes the commit message; everything
+after is fed to the patch application machinery. Crucially, `git-am`
+scans from the top of the email, so the very first such line it
+encounters terminates the commit message regardless of whether that
+line was meant to be part of the message text.
+
+This design dates back to the tool's origins. As Jeff King (also known
+as "Peff") quickly explained in reply to Matthias, `git-am` was
+originally designed to handle patches sent by all kinds of people, not
+just Git users. A contributor might have generated a diff with plain
+GNU `diff` and typed the rest of the email by hand, without any `---`
+separator. The tool was therefore intentionally permissive: it would
+find a `diff -` line anywhere in the email and treat it as the start
+of the patch. Peff demonstrated this with a live example. He fed `git
+am` a hand-typed email containing a GNU diff, and it produced the
+expected commit.
+
+This historical context also explained why `git-am` is notoriously
+hard to fix: "I don't think there is a way to unambiguously parse the
+single-stream output that format-patch produces," Peff wrote, noting
+that he could find at least three earlier discussions of the same
+problem (in 2015, 2022, and 2024). The stream is simply ambiguous by
+design. Even the `---` marker itself cannot be used to robustly split
+things, since `---` on a line by itself is a valid diff hunk line
+indicating that the string `--` was removed from a file.
+
+Matthias proposed parsing from the end of the email rather than from
+the top. Peff replied that this would still be ambiguous for the same
+reasons, and would introduce new corner cases.
+
+Jacob Keller noted early on that the issue was certainly surprising but
+that he was unsure it constituted a security attack vector, since
+someone should be reading the commit message before applying. But
+Matthias pushed back: the whole point was that nobody realized the
+behavior was there. He called it "sheer luck" that it was only a
+`sleep(1)` and not something more malicious crafted as a diff in the
+commit message.
+
+Florian Weimer wondered whether the `git-format-patch` output was
+really ambiguous, given that the patch section is normally preceded by
+a diffstat block. Peff replied that the diffstat is optional and is not
+even parsed by the receiving side at all.
+
+Jakob Haufe added an important nuance: even if `git-am` were fixed to
+require indented diffs, it would only partially mitigate the problem,
+because `patch(1)` (which many distributions use to apply upstream
+fixes to packages) is even more permissive. It will strip a consistent
+level of indentation from diffs before applying them. He quoted the
+`patch(1)` manual page: "If the entire diff is indented by a
+consistent amount, [...] this is taken into account." The i3 incident
+had in fact been triggered by `patch(1)`, not `git-am`.
+
+Kristoffer Haugsbakk synthesized this into a clear summary of the
+situation and immediately proposed documenting it.
+
+Matthias also highlighted the broader applicability beyond email
+workflows: Linux distributions like NixOS routinely fetch patches
+directly from upstream Git repositories and apply them to packages
+using `patch(1)`. He noted that even after 15 years of using Git and
+being comfortable with email patch workflows, he himself had not known
+about this behavior.
+
+Several directions were then explored to look for solutions.
+
+Peff observed the irony that `git-format-patch` does have a `--attach`
+option which puts the message and the patch in separate MIME parts —
+making them unambiguous in principle. However, `git-mailinfo` (which
+powers `git-am` under the hood) decodes both parts into a single
+stream and still treats a `diff` line in the message part as the start
+of a patch. Fixing this would require careful surgery to avoid
+breaking the existing forgiving handling of patches received as a
+single attachment.
+
+Patrick Steinhardt suggested that even if parsing cannot be made
+unambiguous, `git-am` could at least detect the ambiguity and bail by
+default with an `--accept-ambiguous-patch` override. Jacob Keller
+proposed going further: a new "unambiguous mode" where
+`git-format-patch` would produce output that new versions of `git-am`
+could distinguish unambiguously, while old versions would still handle
+the common case the same way as before.
+
+Jacob had also sketched a concrete scheme: add a new unambiguous
+marker after the `---` separator, so that old versions of `git-am`
+would still cut at the `---` and ignore everything up to the diff, while
+new versions would wait for the new marker and correctly ignore any
+diff appearing before it. Since the new marker would come after `---`,
+it would not be inserted into the commit message when applied.
+
+Peff replied that this was trickier than it sounded: the new marker
+would have to be something that could never appear legitimately in a
+commit message, and both sides would need to complain if they saw
+multiple markers. He explored further options: reversible quoting of
+`---` and `diff` lines in the commit message (analogous to the `>From`
+quoting used in mbox files), applied only when the message would
+otherwise be ambiguous. This way, if an older `git-am` received the
+mail, the worst case would be visible quoting in the commit message —
+ugly but readable. Junio Hamano, the Git maintainer, added another
+thought: refusing to accept unsigned patches at all.
+
+Peff also proposed a simpler receiver-side improvement: a
+`git am --strict` mode that would always require a `---` separator
+before the diff, on the assumption that well-formatted patches from Git
+always have one. This would not help with diffs that legitimately
+appear before the `---`, but would eliminate the most common accidental
+cases.
+
+None of these ideas led to an immediate implementation, as they all
+involve backward compatibility tradeoffs that would need careful
+thought.
+
+On February 8, Kristoffer sent a documentation patch titled "doc: add
+caveat about roundtripping format-patch" which introduced a new
+`Documentation/format-patch-caveats.adoc` file explaining the
+behavior. The caveat was designed to be included in the documentation
+for `git-am`, `git-format-patch`, and `git-send-email`.
+
+Junio reviewed
+[version 1](https://lore.kernel.org/git/format-patch_caveats.281@msgid.xyz)
+and offered a correction to the wording: rather than saying that an
+unindented diff in the commit message "will not only cut the message
+short but cause that very diff to be applied, along with the patch in
+the patch section," Junio noted that the outcome is not so
+deterministic. The diff in the commit message might get applied, or
+the patch machinery might trip on something and fail outright. He also
+flagged that the space after the `---` in the cover letter was
+inconsistent with the project's conventions.
+
+Phillip Wood reviewed the patch and found the mention of
+`git-send-email` a bit distracting, since that command merely runs
+`git-format-patch` and does not do any formatting itself. He also
+suggested wording improvements: replacing "One might want to use [...]
+patch(1)" with "Given these limitations, one might be tempted to [...]".
+
+Kristoffer incorporated all of this in
+[version 2](https://lore.kernel.org/git/V2_format-patch_caveats.34b@msgid.xyz),
+which dropped the `git-send-email` mention from the introductory
+paragraph (while keeping the CAVEATS section in its documentation, for
+users who encounter it there), removed example code blocks in favor of
+clearer prose, and used the list of message-terminating patterns
+already present in `git-am`'s documentation. Junio reviewed it and
+queued it with the comment "Nicely written."
+
+A third version,
+[version 3](https://lore.kernel.org/git/pull.2220.v3.git.git.1772559813151.gitgitgadget@gmail.com),
+was submitted and received Junio's approval to go to `next`.
+
+Meanwhile, Phillip had observed that since the parsing cannot be fixed,
+"perhaps we should update our sample `commit-msg` hook to reject
+messages that will cause problems." On February 7, he sent a 3-patch
+series titled "commit-msg.sample: reject messages that would confuse
+`git am`". The series:
+
+1. Added a `.gitattributes` rule for sample hooks (which are shell
+   scripts but have `.sample` extensions).
+2. Extended the sample `commit-msg` hook to scan the body of the commit
+   message for unindented `diff -` and `Index: ` lines and reject the
+   commit with a helpful error message.
+3. Added a further check to detect `---` separator lines in the message
+   body, which would cause `git-am` to silently truncate the commit
+   message.
+
+Peff reacted with measured skepticism to patch 3 in
+[version 1](https://lore.kernel.org/git/cover.1770476279.git.phillip.wood@dunelm.org.uk):
+he and Junio both pointed out that they themselves sometimes use `---`
+intentionally in commit messages to add notes that will appear in the
+formatted patch email but not end up in the final commit message when
+applied. Junio explained the trick: "when I know what I want to write
+below the three-dash lines, I would commit with `---` and additional
+notes below it, so that I do not forget during format-patch. When the
+commit is turned into a patch email [...] `am` cuts at the first one,
+and `apply` knows that the garbage lines at front, including
+three-dash lines, do not matter until it sees `^diff`, this works out
+perfectly well."
+
+Peff confirmed he used the same trick. Phillip, acknowledging that at
+least three developers relied on this behavior, decided to drop patch 3
+entirely, reducing the series from three patches to two, in
+[version 2](https://lore.kernel.org/git/cover.1770993281.git.phillip.wood@dunelm.org.uk).
+He also refined the diff detection in the body: the v2 correctly skips
+the first paragraph of the message (which becomes the email Subject
+header and so does not go through the patch boundary detection), skips
+lines below a scissors line, and handles the `core.commentChar` and
+`core.commentString` configuration options for determining which lines
+are comments. Junio reviewed version 2 with detailed questions about
+the scissors-line logic.
+
+Kristoffer verified that version 2 worked with `git commit
+--cleanup=scissors --verbose` and was satisfied.
+
+The discussion did not lead to a fundamental fix to the ambiguous
+parsing in `git-am`, which remains an open problem with no obvious
+backward-compatible solution. But it produced two concrete
+improvements that were accepted and are now in `master`: a CAVEATS
+section in the documentation for `git-am`, `git-format-patch`, and
+`git-send-email` spelling out exactly how commit messages can
+inadvertently interfere with patch application, and an enhanced sample
+`commit-msg` hook that rejects messages containing unindented diffs.
+
+The thread also served as a useful reminder that this problem is not
+limited to email workflows: any project that generates patches from
+Git commits using `git-format-patch` and applies them with `patch(1)`
+or `git-am` is exposed to it. The practical advice for authors is
+simple: if you include diffs in commit messages for illustrative
+purposes, make sure to indent them consistently, and be aware that
+even that does not protect you from `patch(1)`.
 
 ## Developer Spotlight: Olamide Caleb Bello