From 616c86779733163592e91660dbd75e52a19e48d8 Mon Sep 17 00:00:00 2001
From: Pierre Sassoulas <pierre.sassoulas@gmail.com>
Date: Sat, 13 Jun 2026 18:33:58 +0200
Subject: [PATCH 1/6] [perf] Make ``PrettyPrinter`` format lazily so output can
 be budget-capped
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

``_format`` and the per-type helpers now ``yield`` their output as a
stream of string chunks instead of writing to a file-like object, and
``pformat`` joins them. On top of that, ``pformat_lines`` pulls from the
formatter only until a budget is reached:

    pformat_lines(obj, max_lines=None, max_chars=None)

It stops on the first chunk that reaches *either* budget, so a huge
collection costs O(budget) rather than O(N). The result may overshoot
the budget by that last chunk; the caller truncates to the exact limit.
Either dimension may be ``None`` (unbounded); with both ``None`` the
whole object is formatted.
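
The stopping rule can be sketched independently of the formatter. This
is a minimal illustration, not the code in this patch: ``fake_format``
and ``take_until_budget`` are hypothetical names, and the chunk stream
only imitates what a lazy formatter might yield for a flat list.

```python
from collections.abc import Iterable, Iterator
from typing import Optional


def fake_format(n: int) -> Iterator[str]:
    """Hypothetical stand-in for a lazy formatter: one item per line."""
    yield "["
    for i in range(n):
        yield "\n    "
        yield repr(i)
        yield ","
    if n:
        yield "\n"
    yield "]"


def take_until_budget(
    chunks: Iterable[str],
    max_lines: Optional[int] = None,
    max_chars: Optional[int] = None,
) -> list[str]:
    """Pull chunks until either budget is reached, then stop pulling."""
    n_lines = 0
    n_chars = 0
    taken: list[str] = []
    for chunk in chunks:
        taken.append(chunk)
        n_chars += len(chunk)
        n_lines += chunk.count("\n")
        if (max_lines is not None and n_lines >= max_lines) or (
            max_chars is not None and n_chars >= max_chars
        ):
            break  # the rest of the stream is never produced
    return "".join(taken).splitlines()


# Only the first few items of the 500_000 are ever generated.
lines = take_until_budget(fake_format(500_000), max_lines=11)
```

Because the generator is abandoned at the ``break``, the cost depends on
the budget rather than on the size of the input.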

Motivation
----------
Assertion diffs are truncated to a handful of lines/chars before being
shown. Formatting the whole of a large ``==`` comparison and then
throwing almost all of it away is pure waste. With a lazy formatter the
truncating caller simply stops pulling once it has enough.

Benchmark (``PrettyPrinter`` alone, width 80)::

    list(range(500_000)):
        pformat().splitlines()        ~805 ms
        pformat_lines(max_lines=11)   ~0.027 ms      (~30000x)

    [8 small ints] (common small diff):
        pformat().splitlines()        ~0.0133 ms
        pformat_lines(max_lines=11)   ~0.0185 ms     (+~5 us)

    ["x"*100_000] * 3 (flat, few huge elements):
        pformat_lines(max_chars=640)  stops after ~100_000 chars
                                      (one element) instead of 300_000

Why a lazy generator rather than a fast path + budget stream
------------------------------------------------------------
An earlier approach kept a cheap ``pformat().splitlines()`` fast path
guarded by ``len(obj) <= max_lines`` plus a flatness check, falling back
to a write-intercepting budget-stream class for the rest. Two problems:

* ``len(obj)`` is only a *lower* bound on the line count — one nested
  element (``[{...50 keys...}]``) expands to many lines — so the guard
  needed the flatness scan to stay correct, and even then it bounded
  only *lines*, never *chars*: a flat container of a few enormous
  strings has almost no lines but blows the char budget.
* it was two code paths plus a stream class plus an exception used for
  control flow.

Because the formatter is lazy, "stop pulling at the budget" is the whole
optimisation: correct regardless of how lines/chars are distributed
across elements, bounding both dimensions, with no ``len()`` proxy to
get wrong and no fast/slow branch. The common small-diff case costs only
~5 us more than the unbounded path (it is never the bottleneck — a
failing assertion isn't hot), while large comparisons drop by orders of
magnitude.
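
A quick illustration of why ``len(obj)`` cannot stand in for either
budget. It uses the standard library's ``pprint``, whose layout differs
from the vendored copy, so exact counts are deliberately not stated:

```python
import pprint

# One element, but the nested dict expands to many lines.
nested = [{i: "x" * 40 for i in range(50)}]
nested_lines = pprint.pformat(nested, width=80).splitlines()

# Three elements and few lines, but a very large character count.
flat = ["x" * 100_000] * 3
flat_lines = pprint.pformat(flat, width=80).splitlines()
flat_chars = sum(len(line) for line in flat_lines)

print(len(nested), len(nested_lines))  # element count vs. line count
print(len(flat), len(flat_lines), flat_chars)
```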

``_pprint_set``/``_pprint_dict`` also try a plain ``sorted`` first and
fall back to the ``_safe_key`` wrapper only for unorderable mixes.
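
A sketch of that sort-then-fall-back pattern, assuming a wrapper that
behaves roughly like ``_safe_key``; ``TolerantKey`` and
``sorted_tolerant`` are hypothetical names used only for illustration:

```python
from typing import Any


class TolerantKey:
    """Hypothetical key wrapper: orders unorderable pairs by type name."""

    __slots__ = ("obj",)

    def __init__(self, obj: Any) -> None:
        self.obj = obj

    def __lt__(self, other: "TolerantKey") -> bool:
        try:
            return self.obj < other.obj
        except TypeError:
            # Unorderable pair: compare the type names (then identities).
            return (str(type(self.obj)), id(self.obj)) < (
                str(type(other.obj)),
                id(other.obj),
            )


def sorted_tolerant(items: Any) -> list[Any]:
    try:
        # Fast path: homogeneous, naturally orderable elements.
        return sorted(items)
    except TypeError:
        # Slow path, taken only for unorderable mixes.
        return sorted(items, key=TolerantKey)
```

The wrapper allocates one object per element, so the common homogeneous
case avoids that cost entirely by trying the plain ``sorted`` first.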

This diverges structurally from the upstream cpython ``pprint`` it was
vendored from; the module header notes it is no longer kept in sync.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 src/_pytest/_io/pprint.py | 335 ++++++++++++++++++++------------------
 testing/io/test_pprint.py |  84 ++++++++++
 2 files changed, 262 insertions(+), 157 deletions(-)

diff --git a/src/_pytest/_io/pprint.py b/src/_pytest/_io/pprint.py
index ec41b449ddf..06caf436e60 100644
--- a/src/_pytest/_io/pprint.py
+++ b/src/_pytest/_io/pprint.py
@@ -3,6 +3,14 @@
 # (https://github.com/python/cpython/) at commit
 # c5140945c723ae6c4b7ee81ff720ac8ea4b52cfd (python3.12).
 #
+# It has since been adapted to emit its output lazily as a stream of
+# string chunks (``_format`` and the per-type helpers are generators)
+# rather than writing to a file-like object. This lets ``pformat_lines``
+# stop formatting as soon as a line/char budget is reached, so a huge
+# collection a caller is going to truncate anyway is never fully built.
+# As a result this copy has diverged structurally from upstream and is
+# no longer kept in sync with it.
+#
 #
 #  Original Author:      Fred L. Drake, Jr.
 #                        fdrake@acm.org
@@ -17,13 +25,12 @@
 
 import collections as _collections
 from collections.abc import Callable
+from collections.abc import Iterable
 from collections.abc import Iterator
 import dataclasses as _dataclasses
-from io import StringIO as _StringIO
 import re
 import types as _types
 from typing import Any
-from typing import IO
 
 
 class _safe_key:
@@ -87,28 +94,62 @@ def __init__(
         self._width = width
 
     def pformat(self, object: Any) -> str:
-        sio = _StringIO()
-        self._format(object, sio, 0, 0, set(), 0)
-        return sio.getvalue()
+        return "".join(self._format(object, 0, 0, set(), 0))
+
+    def pformat_lines(
+        self,
+        object: Any,
+        max_lines: int | None = None,
+        max_chars: int | None = None,
+    ) -> list[str]:
+        """Pretty-print ``object`` and return its lines.
+
+        ``_format`` yields the output as a stream of chunks, so this can
+        stop pulling from it as soon as a budget is reached — useful when
+        a downstream truncator is going to drop everything past that
+        budget anyway.
+
+        ``max_lines`` / ``max_chars`` bound the two truncation dimensions
+        independently; either may be ``None`` to leave that dimension
+        unbounded. With both ``None`` the whole object is formatted. The
+        budget is a stopping condition, not a precise cut: formatting
+        stops on the first chunk that reaches it, so the result may
+        slightly overshoot (the caller truncates to the exact limit).
+        """
+        if max_lines is None and max_chars is None:
+            return self.pformat(object).splitlines()
+        n_lines = 0
+        n_chars = 0
+        chunks: list[str] = []
+        for chunk in self._format(object, 0, 0, set(), 0):
+            chunks.append(chunk)
+            if max_chars is not None:
+                n_chars += len(chunk)
+            if max_lines is not None:
+                n_lines += chunk.count("\n")
+            if (max_lines is not None and n_lines >= max_lines) or (
+                max_chars is not None and n_chars >= max_chars
+            ):
+                break
+        return "".join(chunks).splitlines()
 
     def _format(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
+    ) -> Iterator[str]:
         objid = id(object)
         if objid in context:
-            stream.write(_recursion(object))
+            yield _recursion(object)
             return
 
         p = self._dispatch.get(type(object).__repr__, None)
         if p is not None:
             context.add(objid)
-            p(self, object, stream, indent, allowance, context, level + 1)
+            yield from p(self, object, indent, allowance, context, level + 1)
             context.remove(objid)
         elif (
             _dataclasses.is_dataclass(object)
@@ -120,125 +161,126 @@ def _format(
             and "__create_fn__" in object.__repr__.__wrapped__.__qualname__
         ):
             context.add(objid)
-            self._pprint_dataclass(
-                object, stream, indent, allowance, context, level + 1
+            yield from self._pprint_dataclass(
+                object, indent, allowance, context, level + 1
             )
             context.remove(objid)
         else:
-            stream.write(self._repr(object, context, level))
+            yield self._repr(object, context, level)
 
     def _pprint_dataclass(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
+    ) -> Iterator[str]:
         cls_name = object.__class__.__name__
         items = [
             (f.name, getattr(object, f.name))
             for f in _dataclasses.fields(object)
             if f.repr
         ]
-        stream.write(cls_name + "(")
-        self._format_namespace_items(items, stream, indent, allowance, context, level)
-        stream.write(")")
+        yield cls_name + "("
+        yield from self._format_namespace_items(
+            items, indent, allowance, context, level
+        )
+        yield ")"
 
     _dispatch: dict[
         Callable[..., str],
-        Callable[[PrettyPrinter, Any, IO[str], int, int, set[int], int], None],
+        Callable[[PrettyPrinter, Any, int, int, set[int], int], Iterator[str]],
     ] = {}
 
     def _pprint_dict(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        write = stream.write
-        write("{")
-        items = object.items()
-        self._format_dict_items(items, stream, indent, allowance, context, level)
-        write("}")
+    ) -> Iterator[str]:
+        yield "{"
+        yield from self._format_dict_items(
+            object.items(), indent, allowance, context, level
+        )
+        yield "}"
 
     _dispatch[dict.__repr__] = _pprint_dict
 
     def _pprint_ordered_dict(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
+    ) -> Iterator[str]:
         if not len(object):
-            stream.write(repr(object))
+            yield repr(object)
             return
         cls = object.__class__
-        stream.write(cls.__name__ + "(")
-        self._pprint_dict(object, stream, indent, allowance, context, level)
-        stream.write(")")
+        yield cls.__name__ + "("
+        yield from self._pprint_dict(object, indent, allowance, context, level)
+        yield ")"
 
     _dispatch[_collections.OrderedDict.__repr__] = _pprint_ordered_dict
 
     def _pprint_list(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        stream.write("[")
-        self._format_items(object, stream, indent, allowance, context, level)
-        stream.write("]")
+    ) -> Iterator[str]:
+        yield "["
+        yield from self._format_items(object, indent, allowance, context, level)
+        yield "]"
 
     _dispatch[list.__repr__] = _pprint_list
 
     def _pprint_tuple(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        stream.write("(")
-        self._format_items(object, stream, indent, allowance, context, level)
-        stream.write(")")
+    ) -> Iterator[str]:
+        yield "("
+        yield from self._format_items(object, indent, allowance, context, level)
+        yield ")"
 
     _dispatch[tuple.__repr__] = _pprint_tuple
 
     def _pprint_set(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
+    ) -> Iterator[str]:
         if not len(object):
-            stream.write(repr(object))
+            yield repr(object)
             return
         typ = object.__class__
         if typ is set:
-            stream.write("{")
+            yield "{"
             endchar = "}"
         else:
-            stream.write(typ.__name__ + "({")
+            yield typ.__name__ + "({"
             endchar = "})"
-        object = sorted(object, key=_safe_key)
-        self._format_items(object, stream, indent, allowance, context, level)
-        stream.write(endchar)
+        try:
+            object = sorted(object)
+        except TypeError:
+            # Heterogeneous element types — fall back to a key that
+            # tolerates unorderable pairs by string-comparing their types.
+            object = sorted(object, key=_safe_key)
+        yield from self._format_items(object, indent, allowance, context, level)
+        yield endchar
 
     _dispatch[set.__repr__] = _pprint_set
     _dispatch[frozenset.__repr__] = _pprint_set
@@ -246,15 +288,13 @@ def _pprint_set(
     def _pprint_str(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        write = stream.write
+    ) -> Iterator[str]:
         if not len(object):
-            write(repr(object))
+            yield repr(object)
             return
         chunks = []
         lines = object.splitlines(True)
@@ -289,90 +329,84 @@ def _pprint_str(
                 if current:
                     chunks.append(repr(current))
         if len(chunks) == 1:
-            write(rep)
+            yield rep
             return
         if level == 1:
-            write("(")
+            yield "("
         for i, rep in enumerate(chunks):
             if i > 0:
-                write("\n" + " " * indent)
-            write(rep)
+                yield "\n" + " " * indent
+            yield rep
         if level == 1:
-            write(")")
+            yield ")"
 
     _dispatch[str.__repr__] = _pprint_str
 
     def _pprint_bytes(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        write = stream.write
+    ) -> Iterator[str]:
         if len(object) <= 4:
-            write(repr(object))
+            yield repr(object)
             return
         parens = level == 1
         if parens:
             indent += 1
             allowance += 1
-            write("(")
+            yield "("
         delim = ""
         for rep in _wrap_bytes_repr(object, self._width - indent, allowance):
-            write(delim)
-            write(rep)
+            yield delim
+            yield rep
             if not delim:
                 delim = "\n" + " " * indent
         if parens:
-            write(")")
+            yield ")"
 
     _dispatch[bytes.__repr__] = _pprint_bytes
 
     def _pprint_bytearray(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        write = stream.write
-        write("bytearray(")
-        self._pprint_bytes(
-            bytes(object), stream, indent + 10, allowance + 1, context, level + 1
+    ) -> Iterator[str]:
+        yield "bytearray("
+        yield from self._pprint_bytes(
+            bytes(object), indent + 10, allowance + 1, context, level + 1
         )
-        write(")")
+        yield ")"
 
     _dispatch[bytearray.__repr__] = _pprint_bytearray
 
     def _pprint_mappingproxy(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        stream.write("mappingproxy(")
-        self._format(object.copy(), stream, indent, allowance, context, level)
-        stream.write(")")
+    ) -> Iterator[str]:
+        yield "mappingproxy("
+        yield from self._format(object.copy(), indent, allowance, context, level)
+        yield ")"
 
     _dispatch[_types.MappingProxyType.__repr__] = _pprint_mappingproxy
 
     def _pprint_simplenamespace(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
+    ) -> Iterator[str]:
         if type(object) is _types.SimpleNamespace:
             # The SimpleNamespace repr is "namespace" instead of the class
             # name, so we do the same here. For subclasses; use the class name.
@@ -380,95 +414,89 @@ def _pprint_simplenamespace(
         else:
             cls_name = object.__class__.__name__
         items = object.__dict__.items()
-        stream.write(cls_name + "(")
-        self._format_namespace_items(items, stream, indent, allowance, context, level)
-        stream.write(")")
+        yield cls_name + "("
+        yield from self._format_namespace_items(
+            items, indent, allowance, context, level
+        )
+        yield ")"
 
     _dispatch[_types.SimpleNamespace.__repr__] = _pprint_simplenamespace
 
     def _format_dict_items(
         self,
-        items: list[tuple[Any, Any]],
-        stream: IO[str],
+        items: Iterable[tuple[Any, Any]],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        if not items:
-            return
-
-        write = stream.write
+    ) -> Iterator[str]:
         item_indent = indent + self._indent_per_level
         delimnl = "\n" + " " * item_indent
+        emitted = False
         for key, ent in items:
-            write(delimnl)
-            write(self._repr(key, context, level))
-            write(": ")
-            self._format(ent, stream, item_indent, 1, context, level)
-            write(",")
+            emitted = True
+            yield delimnl
+            yield self._repr(key, context, level)
+            yield ": "
+            yield from self._format(ent, item_indent, 1, context, level)
+            yield ","
 
-        write("\n" + " " * indent)
+        if emitted:
+            yield "\n" + " " * indent
 
     def _format_namespace_items(
         self,
-        items: list[tuple[Any, Any]],
-        stream: IO[str],
+        items: Iterable[tuple[Any, Any]],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        if not items:
-            return
-
-        write = stream.write
+    ) -> Iterator[str]:
         item_indent = indent + self._indent_per_level
         delimnl = "\n" + " " * item_indent
+        emitted = False
         for key, ent in items:
-            write(delimnl)
-            write(key)
-            write("=")
+            emitted = True
+            yield delimnl
+            yield key
+            yield "="
             if id(ent) in context:
                 # Special-case representation of recursion to match standard
                 # recursive dataclass repr.
-                write("...")
+                yield "..."
             else:
-                self._format(
+                yield from self._format(
                     ent,
-                    stream,
                     item_indent + len(key) + 1,
                     1,
                     context,
                     level,
                 )
 
-            write(",")
+            yield ","
 
-        write("\n" + " " * indent)
+        if emitted:
+            yield "\n" + " " * indent
 
     def _format_items(
         self,
-        items: list[Any],
-        stream: IO[str],
+        items: Iterable[Any],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        if not items:
-            return
-
-        write = stream.write
+    ) -> Iterator[str]:
         item_indent = indent + self._indent_per_level
         delimnl = "\n" + " " * item_indent
-
+        emitted = False
         for item in items:
-            write(delimnl)
-            self._format(item, stream, item_indent, 1, context, level)
-            write(",")
+            emitted = True
+            yield delimnl
+            yield from self._format(item, item_indent, 1, context, level)
+            yield ","
 
-        write("\n" + " " * indent)
+        if emitted:
+            yield "\n" + " " * indent
 
     def _repr(self, object: Any, context: set[int], level: int) -> str:
         return self._safe_repr(object, context.copy(), self._depth, level)
@@ -476,114 +504,107 @@ def _repr(self, object: Any, context: set[int], level: int) -> str:
     def _pprint_default_dict(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
+    ) -> Iterator[str]:
         rdf = self._repr(object.default_factory, context, level)
-        stream.write(f"{object.__class__.__name__}({rdf}, ")
-        self._pprint_dict(object, stream, indent, allowance, context, level)
-        stream.write(")")
+        yield f"{object.__class__.__name__}({rdf}, "
+        yield from self._pprint_dict(object, indent, allowance, context, level)
+        yield ")"
 
     _dispatch[_collections.defaultdict.__repr__] = _pprint_default_dict
 
     def _pprint_counter(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        stream.write(object.__class__.__name__ + "(")
+    ) -> Iterator[str]:
+        yield object.__class__.__name__ + "("
 
         if object:
-            stream.write("{")
+            yield "{"
             items = object.most_common()
-            self._format_dict_items(items, stream, indent, allowance, context, level)
-            stream.write("}")
+            yield from self._format_dict_items(items, indent, allowance, context, level)
+            yield "}"
 
-        stream.write(")")
+        yield ")"
 
     _dispatch[_collections.Counter.__repr__] = _pprint_counter
 
     def _pprint_chain_map(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
+    ) -> Iterator[str]:
         if not len(object.maps) or (len(object.maps) == 1 and not len(object.maps[0])):
-            stream.write(repr(object))
+            yield repr(object)
             return
 
-        stream.write(object.__class__.__name__ + "(")
-        self._format_items(object.maps, stream, indent, allowance, context, level)
-        stream.write(")")
+        yield object.__class__.__name__ + "("
+        yield from self._format_items(object.maps, indent, allowance, context, level)
+        yield ")"
 
     _dispatch[_collections.ChainMap.__repr__] = _pprint_chain_map
 
     def _pprint_deque(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        stream.write(object.__class__.__name__ + "(")
+    ) -> Iterator[str]:
+        yield object.__class__.__name__ + "("
         if object.maxlen is not None:
-            stream.write(f"maxlen={object.maxlen}, ")
-        stream.write("[")
+            yield f"maxlen={object.maxlen}, "
+        yield "["
 
-        self._format_items(object, stream, indent, allowance + 1, context, level)
-        stream.write("])")
+        yield from self._format_items(object, indent, allowance + 1, context, level)
+        yield "])"
 
     _dispatch[_collections.deque.__repr__] = _pprint_deque
 
     def _pprint_user_dict(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        self._format(object.data, stream, indent, allowance, context, level - 1)
+    ) -> Iterator[str]:
+        yield from self._format(object.data, indent, allowance, context, level - 1)
 
     _dispatch[_collections.UserDict.__repr__] = _pprint_user_dict
 
     def _pprint_user_list(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        self._format(object.data, stream, indent, allowance, context, level - 1)
+    ) -> Iterator[str]:
+        yield from self._format(object.data, indent, allowance, context, level - 1)
 
     _dispatch[_collections.UserList.__repr__] = _pprint_user_list
 
     def _pprint_user_string(
         self,
         object: Any,
-        stream: IO[str],
         indent: int,
         allowance: int,
         context: set[int],
         level: int,
-    ) -> None:
-        self._format(object.data, stream, indent, allowance, context, level - 1)
+    ) -> Iterator[str]:
+        yield from self._format(object.data, indent, allowance, context, level - 1)
 
     _dispatch[_collections.UserString.__repr__] = _pprint_user_string
 
diff --git a/testing/io/test_pprint.py b/testing/io/test_pprint.py
index 1326ef34b2e..2c08734cf46 100644
--- a/testing/io/test_pprint.py
+++ b/testing/io/test_pprint.py
@@ -406,3 +406,87 @@ class DataclassWithTwoItems:
 )
 def test_consistent_pretty_printer(data: Any, expected: str) -> None:
     assert PrettyPrinter().pformat(data) == textwrap.dedent(expected).strip()
+
+
+class TestPformatLines:
+    """``pformat_lines`` returns the pretty-printed lines, pulling from
+    the lazy formatter only until a line/char budget is reached so an
+    input a downstream truncator will clip anyway is never fully built.
+    """
+
+    def test_no_budget_matches_pformat_splitlines(self) -> None:
+        pp = PrettyPrinter()
+        data = list(range(50))
+        assert pp.pformat_lines(data) == pp.pformat(data).splitlines()
+
+    def test_under_budget_is_complete_and_a_prefix(self) -> None:
+        # When the whole output fits within the budget, the result is the
+        # full pformat, whichever budget dimension was set.
+        pp = PrettyPrinter()
+        data = list(range(5))
+        full = pp.pformat(data).splitlines()
+        assert pp.pformat_lines(data, max_lines=11) == full
+        assert pp.pformat_lines(data, max_chars=10_000) == full
+
+    def test_line_budget_stops_early(self) -> None:
+        pp = PrettyPrinter()
+        # 50 scalars, one per line, budget well below 50.
+        full = pp.pformat(list(range(50))).splitlines()
+        lines = pp.pformat_lines(list(range(50)), max_lines=11)
+        assert len(lines) <= 11 + 1  # budget, plus a trailing partial line
+        # everything but the last line (which may stop mid-line) is a
+        # prefix of the full output
+        assert lines[:-1] == full[: len(lines) - 1]
+
+    def test_char_budget_stops_early(self) -> None:
+        # A *flat* container of huge strings has few lines but explodes on
+        # chars; a line-only budget wouldn't stop it. The char budget must.
+        pp = PrettyPrinter()
+        data = ["x" * 100_000, "y" * 100_000, "z" * 100_000]
+        lines = pp.pformat_lines(data, max_chars=640)
+        assert sum(len(line) for line in lines) < 200_000  # bailed, didn't format all 3
+
+    def test_nested_element_respects_line_budget(self) -> None:
+        # ``len(object)`` is only a *lower* bound on the line count: a
+        # single nested element expands to many lines. The lazy pull must
+        # stop regardless of the container's element count.
+        pp = PrettyPrinter()
+        for data in ([{i: "x" * 40 for i in range(50)}], {1: list(range(100))}):
+            lines = pp.pformat_lines(data, max_lines=11)
+            assert len(lines) <= 11 + 1
+
+    def test_nested_dataclass_element_respects_line_budget(self) -> None:
+        @dataclass
+        class Many:
+            a: int
+            b: int
+            c: int
+            d: int
+            e: int
+            f: int
+            g: int
+            h: int
+
+        pp = PrettyPrinter()
+        lines = pp.pformat_lines([Many(*range(8))], max_lines=4)
+        assert len(lines) <= 4 + 1
+        assert len(lines) < len(pp.pformat([Many(*range(8))]).splitlines())
+
+    def test_sized_non_iterable_does_not_raise(self) -> None:
+        class Sized:
+            def __len__(self) -> int:
+                return 3
+
+        pp = PrettyPrinter()
+        obj = Sized()
+        assert pp.pformat_lines(obj, max_lines=5) == pp.pformat(obj).splitlines()
+
+
+def test_pformat_sorts_heterogeneous_set() -> None:
+    # The set sort tries a natural sort first and falls back to a key
+    # that compares the element types' names only for unorderable
+    # mixes; both must succeed.
+    pp = PrettyPrinter()
+    assert pp.pformat({3, 1, 2}) == "{\n    1,\n    2,\n    3,\n}"
+    # Mixed unorderable types must not raise.
+    pp.pformat({1, "a", 2, "b"})

From 133da417f13dc81cb9825bec9b691b53df2ce74c Mon Sep 17 00:00:00 2001
From: Pierre Sassoulas <pierre.sassoulas@gmail.com>
Date: Sat, 13 Jun 2026 18:34:16 +0200
Subject: [PATCH 2/6] [perf] Skip the newline count on chunks without a newline
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

In ``pformat_lines``'s budget loop, ``chunk.count("\n")`` ran on every
chunk, but most chunks (brackets, indentation, item reprs) contain no
newline. Guarding the call with ``"\n" in chunk`` recovers part of the
per-chunk budget-tracking overhead on small comparisons (~0.020 ->
~0.019 ms for an 8-element list).
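
The guard does not change the result, only how often ``str.count`` is
called. A small sketch with hypothetical helper names, showing that both
variants agree on a typical chunk stream:

```python
from collections.abc import Iterable


def count_newlines_plain(chunks: Iterable[str]) -> int:
    # Calls str.count on every chunk, including those with no newline.
    return sum(chunk.count("\n") for chunk in chunks)


def count_newlines_guarded(chunks: Iterable[str]) -> int:
    n_lines = 0
    for chunk in chunks:
        # Cheap membership test first; count only when it can be non-zero.
        if "\n" in chunk:
            n_lines += chunk.count("\n")
    return n_lines


sample = ["[", "\n    ", "1", ",", "\n    ", "2", ",", "\n", "]"]
```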

The win is small and only matters on the ``-v`` truncating path of a
failing assertion (the default path doesn't format the diff at all), so
this is kept as a separate commit — easy to drop if the extra branch
isn't judged worth it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 src/_pytest/_io/pprint.py | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/_pytest/_io/pprint.py b/src/_pytest/_io/pprint.py
index 06caf436e60..d9fd6955032 100644
--- a/src/_pytest/_io/pprint.py
+++ b/src/_pytest/_io/pprint.py
@@ -125,7 +125,10 @@ def pformat_lines(
             chunks.append(chunk)
             if max_chars is not None:
                 n_chars += len(chunk)
-            if max_lines is not None:
+            if max_lines is not None and "\n" in chunk:
+                # Guard the count: most chunks (brackets, indents, item
+                # reprs) have no newline, and skipping the call on them
+                # is slightly cheaper than counting every chunk.
                 n_lines += chunk.count("\n")
             if (max_lines is not None and n_lines >= max_lines) or (
                 max_chars is not None and n_chars >= max_chars

From 26630dd768a7d41a2d12d61c4ffef6eafdce82f1 Mon Sep 17 00:00:00 2001
From: Pierre Sassoulas <pierre.sassoulas@gmail.com>
Date: Sat, 13 Jun 2026 15:45:49 +0200
Subject: [PATCH 3/6] [refactor] Stream assertion explanations through
 truncation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Make the assertion-comparison explanation lazy end-to-end so a huge
comparison short-circuits at the truncation threshold instead of
building (then discarding) megabytes of diff text.

* add ``materialize_with_truncation`` — pull from the explanation
  iterator only until the truncation budget is reached, then drop the
  rest unconsumed.
* feed ``util.assertrepr_compare``'s iterator through it from both the
  ``pytest_assertrepr_compare`` hook and ``callbinrepr``, while keeping
  the hook's spec'd ``list[str] | None`` return type intact (the
  iterator is still consumed lazily).
* replace ``callbinrepr``'s two ``continue``s with nested truthiness
  checks so codecov stops reporting a sticky partial branch.
* drop the exact hidden-line count from the truncation footer
  ("...Full output truncated, use '-vv' to show") — the streaming
  truncator can't know how many lines it never pulled, and maintainers
  agreed the count isn't worth materialising the whole diff for.
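
The idea behind the streaming truncator can be sketched as follows. This
is an assumption-laden illustration, not the signature or behaviour of
``materialize_with_truncation``: ``truncate_stream`` is a hypothetical
name and it budgets lines only.

```python
from collections.abc import Iterable
from itertools import islice

FOOTER = "...Full output truncated, use '-vv' to show"


def truncate_stream(lines: Iterable[str], max_lines: int) -> list[str]:
    """Keep at most ``max_lines`` lines and leave the rest unconsumed."""
    it = iter(lines)
    head = list(islice(it, max_lines))
    # Pull a single extra line to learn whether anything was cut off.
    sentinel = object()
    if next(it, sentinel) is sentinel:
        return head
    # The number of hidden lines is unknown: they were never produced.
    return [*head, "", FOOTER]
```

Since the remaining lines are never pulled, the footer cannot report how
many were hidden without giving up the laziness.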

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 doc/en/example/reportingdemo.rst  |   2 +-
 doc/en/how-to/output.rst          |   4 +-
 src/_pytest/assertion/__init__.py |  46 ++++++---
 src/_pytest/assertion/truncate.py |  81 ++++++++-------
 testing/python/approx.py          |   2 +-
 testing/test_assertion.py         | 162 +++++++++++++++++++++++++++---
 6 files changed, 226 insertions(+), 71 deletions(-)

diff --git a/doc/en/example/reportingdemo.rst b/doc/en/example/reportingdemo.rst
index 29ba190b7e7..2e1681cd9cd 100644
--- a/doc/en/example/reportingdemo.rst
+++ b/doc/en/example/reportingdemo.rst
@@ -148,7 +148,7 @@ Here is a nice run of several failures and how ``pytest`` presents things:
     E           1
     E           1...
     E
-    E         ...Full output truncated (7 lines hidden), use '-vv' to show
+    E         ...Full output truncated, use '-vv' to show
 
     failure_demo.py:62: AssertionError
     _________________ TestSpecialisedExplanations.test_eq_list _________________
diff --git a/doc/en/how-to/output.rst b/doc/en/how-to/output.rst
index db36a5a7206..752d0206526 100644
--- a/doc/en/how-to/output.rst
+++ b/doc/en/how-to/output.rst
@@ -172,7 +172,7 @@ Now we can increase pytest's verbosity:
     E               'banana',
     E               'apple',...
     E
-    E         ...Full output truncated (7 lines hidden), use '-vv' to show
+    E         ...Full output truncated, use '-vv' to show
 
     test_verbosity_example.py:8: AssertionError
     ____________________________ test_numbers_fail _____________________________
@@ -190,7 +190,7 @@ Now we can increase pytest's verbosity:
     E         {'10': 10, '20': 20, '30': 30, '40': 40}
     E         ...
     E
-    E         ...Full output truncated (16 lines hidden), use '-vv' to show
+    E         ...Full output truncated, use '-vv' to show
 
     test_verbosity_example.py:14: AssertionError
     ___________________________ test_long_text_fail ____________________________
diff --git a/src/_pytest/assertion/__init__.py b/src/_pytest/assertion/__init__.py
index e33f8b29609..7968b056b02 100644
--- a/src/_pytest/assertion/__init__.py
+++ b/src/_pytest/assertion/__init__.py
@@ -181,13 +181,21 @@ def callbinrepr(op, left: object, right: object) -> str | None:
             config=item.config, op=op, left=left, right=right
         )
         for new_expl in hook_result:
+            # Plugin-supplied lists are truncated here; the built-in impl
+            # already truncates as it streams, so re-applying truncation
+            # to its output is a near no-op (the body fits the budget,
+            # only the footer line is re-emitted with the same wording).
+            # ``materialize_with_truncation`` can return ``[]`` when the
+            # input was a truthy-but-empty iterable, so re-check after
+            # materialising.
             if new_expl:
-                new_expl = truncate.truncate_if_required(new_expl, item)
-                new_expl = [line.replace("\n", "\\n") for line in new_expl]
-                res = "\n~".join(new_expl)
-                if item.config.getvalue("assertmode") == "rewrite":
-                    res = res.replace("%", "%%")
-                return res
+                new_expl = truncate.materialize_with_truncation(new_expl, item.config)
+                if new_expl:
+                    new_expl = [line.replace("\n", "\\n") for line in new_expl]
+                    res = "\n~".join(new_expl)
+                    if item.config.getvalue("assertmode") == "rewrite":
+                        res = res.replace("%", "%%")
+                    return res
         return None
 
     saved_assert_hooks = util._reprcompare, util._assertion_pass
@@ -218,19 +226,25 @@ def pytest_sessionfinish(session: Session) -> None:
 def pytest_assertrepr_compare(
     config: Config, op: str, left: Any, right: Any
 ) -> list[str] | None:
+    """Return an explanation for ``left op right``.
+
+    Internally ``util.assertrepr_compare`` is a generator; we feed it
+    through ``materialize_with_truncation`` so a huge comparison
+    short-circuits at the truncation threshold without building the
+    full diff, while still returning the ``list[str] | None`` shape
+    the hook spec advertises.
+    """
     if config.pluginmanager.has_plugin("terminalreporter"):
         highlighter = config.get_terminal_writer()._highlight
     else:
         # Keep it plaintext when not using terminalrepoterer (#14377).
         highlighter = util.dummy_highlighter
-    explanation = list(
-        util.assertrepr_compare(
-            op=op,
-            left=left,
-            right=right,
-            verbose=config.get_verbosity(Config.VERBOSITY_ASSERTIONS),
-            highlighter=highlighter,
-            assertion_text_diff_style=util.get_assertion_text_diff_style(config),
-        )
+    lines = util.assertrepr_compare(
+        op=op,
+        left=left,
+        right=right,
+        verbose=config.get_verbosity(Config.VERBOSITY_ASSERTIONS),
+        highlighter=highlighter,
+        assertion_text_diff_style=util.get_assertion_text_diff_style(config),
     )
-    return explanation or None
+    return truncate.materialize_with_truncation(lines, config) or None
diff --git a/src/_pytest/assertion/truncate.py b/src/_pytest/assertion/truncate.py
index d62ca33cc4b..e137d8e0c1f 100644
--- a/src/_pytest/assertion/truncate.py
+++ b/src/_pytest/assertion/truncate.py
@@ -6,9 +6,10 @@
 
 from __future__ import annotations
 
+from collections.abc import Iterable
+
 from _pytest.compat import running_on_ci
 from _pytest.config import Config
-from _pytest.nodes import Item
 
 
 DEFAULT_MAX_LINES = 8
@@ -16,32 +17,52 @@
 USAGE_MSG = "use '-vv' to show"
 
 
-def truncate_if_required(explanation: list[str], item: Item) -> list[str]:
-    """Truncate this assertion explanation if the given test item is eligible."""
-    should_truncate, max_lines, max_chars = _get_truncation_parameters(item)
-    if should_truncate:
-        return _truncate_explanation(
-            explanation,
-            max_lines=max_lines,
-            max_chars=max_chars,
-        )
-    return explanation
+def materialize_with_truncation(lines: Iterable[str], config: Config) -> list[str]:
+    """Materialise a streaming explanation, applying truncation lazily.
+
+    Pulls from ``lines`` only until the truncation threshold is reached;
+    once exceeded, the rest of the iterator is dropped without being
+    consumed. This lets a huge comparison short-circuit instead of
+    building (and immediately discarding) megabytes of explanation text.
+    """
+    should_truncate, max_lines, max_chars = _get_truncation_parameters(config)
+    if not should_truncate:
+        return list(lines)
+
+    tolerable_max_chars = max_chars + 70
+    # Pull just past max_lines so ``_truncate_explanation`` can detect the
+    # overflow without us materialising more than we need.
+    line_cap = max_lines + 3 if max_lines > 0 else None
+    buffered: list[str] = []
+    char_count = 0
+    for line in lines:
+        buffered.append(line)
+        char_count += len(line)
+        if line_cap is not None and len(buffered) >= line_cap:
+            break
+        if max_chars > 0 and char_count > tolerable_max_chars:
+            break
+    else:
+        # Iterator exhausted within limits — nothing to truncate.
+        return buffered
+
+    return _truncate_explanation(buffered, max_lines=max_lines, max_chars=max_chars)
 
 
-def _get_truncation_parameters(item: Item) -> tuple[bool, int, int]:
-    """Return the truncation parameters related to the given item, as (should truncate, max lines, max chars)."""
+def _get_truncation_parameters(config: Config) -> tuple[bool, int, int]:
+    """Return the truncation parameters from the given config, as (should truncate, max lines, max chars)."""
     # We do not need to truncate if one of conditions is met:
     # 1. Verbosity level is 2 or more;
     # 2. Test is being run in CI environment;
     # 3. Both truncation_limit_lines and truncation_limit_chars
     #    .ini parameters are set to 0 explicitly.
-    max_lines = item.config.getini("truncation_limit_lines")
+    max_lines = config.getini("truncation_limit_lines")
     max_lines = int(max_lines if max_lines is not None else DEFAULT_MAX_LINES)
 
-    max_chars = item.config.getini("truncation_limit_chars")
+    max_chars = config.getini("truncation_limit_chars")
     max_chars = int(max_chars if max_chars is not None else DEFAULT_MAX_CHARS)
 
-    verbose = item.config.get_verbosity(Config.VERBOSITY_ASSERTIONS)
+    verbose = config.get_verbosity(Config.VERBOSITY_ASSERTIONS)
 
     should_truncate = verbose < 2 and not running_on_ci()
     should_truncate = should_truncate and (max_lines > 0 or max_chars > 0)
@@ -66,20 +87,9 @@ def _truncate_explanation(
     When this function is launched we know max_lines > 0 or max_chars > 0
     because _get_truncation_parameters was called first.
     """
-    # The length of the truncation explanation depends on the number of lines
-    # removed but is at least 68 characters:
-    # The real value is
-    # 64 (for the base message:
-    # '...\n...Full output truncated (1 line hidden), use '-vv' to show")'
-    # )
-    # + 1 (for plural)
-    # + int(math.log10(len(input_lines) - max_lines)) (number of hidden line, at least 1)
-    # + 3 for the '...' added to the truncated line
-    # But if there's more than 100 lines it's very likely that we're going to
-    # truncate, so we don't need the exact value using log10.
-    tolerable_max_chars = (
-        max_chars + 70  # 64 + 1 (for plural) + 2 (for '99') + 3 for '...'
-    )
+    # Slack on top of ``max_chars`` so a body that just fits the budget
+    # doesn't get truncated solely to make room for the footer.
+    tolerable_max_chars = max_chars + 70
     # The truncation explanation add two lines to the output
     if max_lines == 0 or len(input_lines) <= max_lines + 2:
         if max_chars == 0 or sum(len(s) for s in input_lines) <= tolerable_max_chars:
@@ -89,24 +99,19 @@ def _truncate_explanation(
         # Truncate first to max_lines, and then truncate to max_chars if necessary
         truncated_explanation = input_lines[:max_lines]
     # We reevaluate the need to truncate chars following removal of some lines
-    need_to_truncate_char = (
+    if (
         max_chars > 0
         and sum(len(e) for e in truncated_explanation) > tolerable_max_chars
-    )
-    if need_to_truncate_char:
+    ):
         truncated_explanation = _truncate_by_char_count(
             truncated_explanation, max_chars
         )
     # Something was truncated, adding '...' at the end to show that
     truncated_explanation[-1] += "..."
-    truncated_line_count = (
-        len(input_lines) - len(truncated_explanation) + int(need_to_truncate_char)
-    )
     return [
         *truncated_explanation,
         "",
-        f"...Full output truncated ({truncated_line_count} line"
-        f"{'' if truncated_line_count == 1 else 's'} hidden), {USAGE_MSG}",
+        f"...Full output truncated, {USAGE_MSG}",
     ]
 
 
diff --git a/testing/python/approx.py b/testing/python/approx.py
index 88d46cbb755..c5ca03fe823 100644
--- a/testing/python/approx.py
+++ b/testing/python/approx.py
@@ -313,7 +313,7 @@ def test_error_messages_with_different_verbosity(self, assert_approx_raises_rege
                 rf"^  \(0,\)\s+\| {SOME_FLOAT} \| {SOME_FLOAT} ± {SOME_FLOAT}e-{SOME_INT}$",
                 rf"^  \(1,\)\s+\| {SOME_FLOAT} \| {SOME_FLOAT} ± {SOME_FLOAT}e-{SOME_INT}\.\.\.$",
                 "^  $",
-                rf"^  ...Full output truncated \({SOME_INT} lines hidden\), use '-vv' to show$",
+                r"^  ...Full output truncated, use '-vv' to show$",
             ],
             verbosity_level=0,
         )
diff --git a/testing/test_assertion.py b/testing/test_assertion.py
index 492834ba9de..ac20a172a8c 100644
--- a/testing/test_assertion.py
+++ b/testing/test_assertion.py
@@ -56,6 +56,11 @@ def get_verbosity(self, verbosity_type: str | None = None) -> int:
         def getini(self, name: str) -> str:
             if name == util.ASSERTION_TEXT_DIFF_STYLE_INI:
                 return assertion_text_diff_style
+            # Disable truncation so ``callop``-style tests can compare
+            # against the full explanation. Dedicated truncation tests
+            # use their own config in :class:`TestMaterializeWithTruncation`.
+            if name in ("truncation_limit_lines", "truncation_limit_chars"):
+                return "0"
             raise KeyError(f"Not mocked out: {name}")
 
     return Config()
@@ -1154,7 +1159,7 @@ def test_recursive_dataclasses(self, pytester: Pytester) -> None:
                 "E         Drill down into differing attribute g:",
                 "E           g: S(a=10, b='ten') != S(a=20, b='xxx')...",
                 "E         ",
-                "E         ...Full output truncated (51 lines hidden), use '-vv' to show",
+                "E         ...Full output truncated, use '-vv' to show",
             ],
             consecutive=True,
         )
@@ -1527,7 +1532,6 @@ def test_truncates_at_8_lines_when_given_list_of_empty_strings(self) -> None:
         assert result != expl
         assert len(result) == 8 + self.LINES_IN_TRUNCATION_MSG
         assert "Full output truncated" in result[-1]
-        assert "42 lines hidden" in result[-1]
         last_line_before_trunc_msg = result[-self.LINES_IN_TRUNCATION_MSG - 1]
         assert last_line_before_trunc_msg.endswith("...")
 
@@ -1538,7 +1542,6 @@ def test_truncates_at_8_lines_when_first_8_lines_are_LT_max_chars(self) -> None:
         assert result != expl
         assert len(result) == 8 + self.LINES_IN_TRUNCATION_MSG
         assert "Full output truncated" in result[-1]
-        assert f"{total_lines - 8} lines hidden" in result[-1]
         last_line_before_trunc_msg = result[-self.LINES_IN_TRUNCATION_MSG - 1]
         assert last_line_before_trunc_msg.endswith("...")
 
@@ -1557,7 +1560,7 @@ def test_truncates_full_line_because_of_max_chars(self) -> None:
             "a" * 10,
             "...",
             "",
-            "...Full output truncated (1 line hidden), use '-vv' to show",
+            "...Full output truncated, use '-vv' to show",
         ]
 
     def test_truncates_edgecase_when_truncation_message_makes_the_result_longer_for_chars(
@@ -1582,7 +1585,6 @@ def test_truncates_at_8_lines_when_first_8_lines_are_EQ_max_chars(self) -> None:
         assert result != expl
         assert len(result) == 16 - 8 + self.LINES_IN_TRUNCATION_MSG
         assert "Full output truncated" in result[-1]
-        assert "8 lines hidden" in result[-1]
         last_line_before_trunc_msg = result[-self.LINES_IN_TRUNCATION_MSG - 1]
         assert last_line_before_trunc_msg.endswith("...")
 
@@ -1592,7 +1594,6 @@ def test_truncates_at_4_lines_when_first_4_lines_are_GT_max_chars(self) -> None:
         assert result != expl
         assert len(result) == 4 + self.LINES_IN_TRUNCATION_MSG
         assert "Full output truncated" in result[-1]
-        assert "7 lines hidden" in result[-1]
         last_line_before_trunc_msg = result[-self.LINES_IN_TRUNCATION_MSG - 1]
         assert last_line_before_trunc_msg.endswith("...")
 
@@ -1602,7 +1603,6 @@ def test_truncates_at_1_line_when_first_line_is_GT_max_chars(self) -> None:
         assert result != expl
         assert len(result) == 1 + self.LINES_IN_TRUNCATION_MSG
         assert "Full output truncated" in result[-1]
-        assert "1000 lines hidden" in result[-1]
         last_line_before_trunc_msg = result[-self.LINES_IN_TRUNCATION_MSG - 1]
         assert last_line_before_trunc_msg.endswith("...")
 
@@ -1610,7 +1610,6 @@ def test_full_output_truncated(self, monkeypatch, pytester: Pytester) -> None:
         """Test against full runpytest() output."""
         line_count = 7
         line_len = 100
-        expected_truncated_lines = 2
         pytester.makepyfile(
             rf"""
             def test_many_lines():
@@ -1629,7 +1628,7 @@ def test_many_lines():
             [
                 "*+ 1*",
                 "*+ 3*",
-                f"*truncated ({expected_truncated_lines} lines hidden)*use*-vv*",
+                "*Full output truncated*use*-vv*",
             ]
         )
 
@@ -1643,7 +1642,7 @@ def test_many_lines():
             [
                 "*+ 1*",
                 "*+ 3*",
-                f"*truncated ({expected_truncated_lines} lines hidden)*use*-vv*",
+                "*Full output truncated*use*-vv*",
             ]
         )
 
@@ -1699,9 +1698,7 @@ def test():
         result = pytester.runpytest()
 
         if expected_lines_hidden != 0:
-            result.stdout.fnmatch_lines(
-                [f"*truncated ({expected_lines_hidden} lines hidden)*"]
-            )
+            result.stdout.fnmatch_lines(["*Full output truncated*"])
         else:
             result.stdout.no_fnmatch_line("*truncated*")
             result.stdout.fnmatch_lines(
@@ -1712,6 +1709,92 @@ def test():
             )
 
 
+class TestMaterializeWithTruncation:
+    """Tests for ``truncate.materialize_with_truncation``.
+
+    Assertions check *behaviour* — that truncation kicks in / doesn't,
+    that the original lines are preserved, that the iterator's contract
+    is honoured — and never the literal footer wording. That way the
+    tests survive any future change to the truncation message format.
+    """
+
+    @staticmethod
+    def _config_with_limits(verbose: int = 0):
+        # Minimal stand-in for ``Config`` that ``materialize_with_truncation``
+        # uses through ``_get_truncation_parameters``.
+        class C:
+            def getini(self, name: str) -> object:
+                return None  # use defaults (8 lines / 640 chars)
+
+            def get_verbosity(self, _verbosity_type: str | None = None) -> int:
+                return verbose
+
+        return C()
+
+    def test_iterator_within_limits_returns_all_lines(self) -> None:
+        lines = iter(["one", "two", "three"])
+        result = truncate.materialize_with_truncation(lines, self._config_with_limits())
+        assert result == ["one", "two", "three"]
+
+    def test_iterator_exceeding_limits_is_truncated(self) -> None:
+        lines = (f"line {i}" for i in range(1000))
+        result = truncate.materialize_with_truncation(lines, self._config_with_limits())
+        # Bounded length — we kept the truncation footer plus at most a few
+        # lines past the cap; we never collect the full 1000-line stream.
+        assert len(result) < 20
+        # The first lines we kept are the first lines of the input.
+        assert result[0] == "line 0"
+        # Some truncation marker is present (wording deliberately not asserted).
+        assert any("truncated" in line for line in result)
+
+    def test_sized_input_returns_same_shape_as_iterator_input(self) -> None:
+        # When the input is already a sized container, the function still
+        # returns the truncated form; behaviour is the same as for an
+        # iterator over the same content.
+        content = [f"line {i}" for i in range(50)]
+        sized = truncate.materialize_with_truncation(
+            content, self._config_with_limits()
+        )
+        unsized = truncate.materialize_with_truncation(
+            iter(content), self._config_with_limits()
+        )
+        assert sized[0] == unsized[0] == "line 0"
+        assert any("truncated" in line for line in sized)
+        assert any("truncated" in line for line in unsized)
+
+    def test_truncation_disabled_returns_full_input(self) -> None:
+        # verbose >= 2 disables truncation; the iterator is fully drained.
+        lines = (f"line {i}" for i in range(50))
+        result = truncate.materialize_with_truncation(
+            lines, self._config_with_limits(verbose=2)
+        )
+        assert result == [f"line {i}" for i in range(50)]
+        assert not any("truncated" in line for line in result)
+
+    def test_first_lines_are_preserved_verbatim(self) -> None:
+        lines = (f"line {i}" for i in range(200))
+        result = truncate.materialize_with_truncation(lines, self._config_with_limits())
+        # The first kept lines should match the start of the input exactly
+        # (modulo the "..." appended to the last surviving line by the
+        # truncator, which we strip before comparing).
+        kept = [line.rstrip(".") for line in result if "truncated" not in line]
+        for i, line in enumerate(kept):
+            if line == "":
+                # Blank line separating content from the footer.
+                continue
+            assert line.startswith(f"line {i}")
+
+    def test_idempotent_on_already_truncated_list(self) -> None:
+        # The dispatcher applies ``materialize_with_truncation`` after the
+        # built-in hook impl already truncated. Re-applying it must not
+        # duplicate the footer or chop further lines.
+        once = truncate.materialize_with_truncation(
+            (f"line {i}" for i in range(200)), self._config_with_limits()
+        )
+        twice = truncate.materialize_with_truncation(once, self._config_with_limits())
+        assert twice == once
+
+
 def test_python25_compile_issue257(pytester: Pytester) -> None:
     pytester.makepyfile(
         """
@@ -2205,6 +2288,59 @@ def raise_exit(obj):
         callequal(1, 1)
 
 
+def test_plugin_hook_returning_none_is_skipped(pytester: Pytester) -> None:
+    """A ``pytest_assertrepr_compare`` impl returning ``None`` is skipped
+    so the next impl (or the built-in) can produce the explanation.
+    Covers the falsy branch of the outer ``if new_expl:`` check in
+    ``callbinrepr``.
+    """
+    pytester.makeconftest(
+        """
+        def pytest_assertrepr_compare(op, left, right):
+            # Always defer to the next plugin / the built-in.
+            return None
+        """
+    )
+    pytester.makepyfile(
+        """
+        def test_diff():
+            assert {1, 2} == {1, 3}
+        """
+    )
+    result = pytester.runpytest()
+    # The built-in set-comparison explanation still reaches the user
+    # (so the None-returning hook did not swallow it).
+    result.stdout.fnmatch_lines(
+        ["*Extra items in the left set:*", "*Extra items in the right set:*"]
+    )
+
+
+def test_exception_before_first_yield_emits_summary_and_notice(monkeypatch) -> None:
+    """When the comparator raises *before* any explanation line has been
+    yielded, ``assertrepr_compare`` should still produce the summary so
+    the reader sees what was being compared, then append the failure
+    notice. Covers the ``summary_yielded is False`` branch of the
+    exception handler.
+    """
+    from _pytest.assertion import _compare_any
+
+    def raise_value_error(obj):
+        raise ValueError("synthetic repr failure")
+
+    # ``istext`` is called inside ``_compare_eq_any`` before the first
+    # yield, so this triggers the failure path on the very first
+    # ``next()`` call from ``assertrepr_compare``.
+    monkeypatch.setattr(_compare_any, "istext", raise_value_error)
+
+    expl = callequal(1, 1)
+    assert expl is not None
+    # Summary line still produced.
+    assert expl[0] == "1 == 1"
+    # The failure notice survives in the output; wording deliberately not
+    # asserted, only the underlying error's signature.
+    assert any("ValueError" in line or "synthetic" in line for line in expl)
+
+
 def test_assertion_location_with_coverage(pytester: Pytester) -> None:
     """This used to report the wrong location when run with coverage (#5754)."""
     p = pytester.makepyfile(

From bf44d0f9639292dfac533b8da6105557f98fc8d3 Mon Sep 17 00:00:00 2001
From: Pierre Sassoulas <pierre.sassoulas@gmail.com>
Date: Sat, 13 Jun 2026 18:11:11 +0200
Subject: [PATCH 4/6] [perf] Wire the pformat budget and streamed ndiff into
 the diff
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Use the lazy ``PrettyPrinter`` from the assertion comparison so a big
``==`` diff is never fully built when truncation will clip it.

* compute a per-side pformat budget ``(max_lines, max_chars)`` in
  ``pytest_assertrepr_compare`` from the truncator's
  ``truncation_limit_lines`` / ``truncation_limit_chars`` and thread it
  through ``util.assertrepr_compare`` → ``_compare_eq_any`` →
  ``_compare_eq_iterable``, which passes it to ``pformat_lines``. Both
  dimensions are bounded, so a flat container of a few enormous strings
  (huge chars, few lines) stops as early as a many-element collection
  (many lines) — and a char-only truncation config
  (``truncation_limit_lines=0``) now caps formatting too, where before
  it fell back to no cap. With truncation disabled (``-vv``/CI) the
  budget stays ``(None, None)`` and the full diff is produced.
* stream ``difflib.ndiff`` output line-by-line, highlighting each line
  individually (the diff lexer is line-oriented), so the truncator can
  stop pulling as soon as its budget is full instead of joining the
  whole diff first.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 src/_pytest/assertion/__init__.py          | 28 ++++++++++++++++++++++
 src/_pytest/assertion/_compare_any.py      |  5 +++-
 src/_pytest/assertion/_compare_sequence.py | 28 +++++++++++++++-------
 src/_pytest/assertion/util.py              |  2 ++
 testing/test_assertion.py                  |  8 +++----
 5 files changed, 58 insertions(+), 13 deletions(-)

diff --git a/src/_pytest/assertion/__init__.py b/src/_pytest/assertion/__init__.py
index 7968b056b02..f1db4347f36 100644
--- a/src/_pytest/assertion/__init__.py
+++ b/src/_pytest/assertion/__init__.py
@@ -239,6 +239,33 @@ def pytest_assertrepr_compare(
     else:
         # Keep it plaintext when not using terminalrepoterer (#14377).
         highlighter = util.dummy_highlighter
+    # When truncation is going to clip the explanation downstream, tell
+    # the comparison helpers to cap their pformat output at the same
+    # budget so they don't spend O(N) formatting lines/chars we're about
+    # to drop. The cap is ``(max_lines, max_chars)`` per side, mirroring
+    # the truncator's own slack so a side is never under-formatted:
+    #
+    # * ``trunc_lines + 3``: 2 lines for the truncation footer it appends
+    #   (blank + message) plus 1 for overshoot detection.
+    # * ``trunc_chars + 70``: the truncator's own ``tolerable_max_chars``
+    #   slack (footer length).
+    #
+    # ``difflib.ndiff`` over two K-line/char pformat outputs produces at
+    # least K output lines/chars (more when the sides differ), and the
+    # truncator pulls at most that much from the whole explanation, so a
+    # per-side budget covers the worst case. A dimension whose limit is 0
+    # (disabled) stays ``None`` so it isn't bounded; with truncation off
+    # both stay ``None`` and the user gets the full diff.
+    should_truncate, trunc_lines, trunc_chars = truncate._get_truncation_parameters(
+        config
+    )
+    if should_truncate:
+        pformat_cap = (
+            trunc_lines + 3 if trunc_lines > 0 else None,
+            trunc_chars + 70 if trunc_chars > 0 else None,
+        )
+    else:
+        pformat_cap = (None, None)
     lines = util.assertrepr_compare(
         op=op,
         left=left,
@@ -246,5 +273,6 @@ def pytest_assertrepr_compare(
         verbose=config.get_verbosity(Config.VERBOSITY_ASSERTIONS),
         highlighter=highlighter,
         assertion_text_diff_style=util.get_assertion_text_diff_style(config),
+        pformat_cap=pformat_cap,
     )
     return truncate.materialize_with_truncation(lines, config) or None
diff --git a/src/_pytest/assertion/_compare_any.py b/src/_pytest/assertion/_compare_any.py
index 9e577683736..d005580ea45 100644
--- a/src/_pytest/assertion/_compare_any.py
+++ b/src/_pytest/assertion/_compare_any.py
@@ -28,6 +28,7 @@ def _compare_eq_any(
     highlighter: _HighlightFunc,
     verbose: int,
     assertion_text_diff_style: _AssertionTextDiffStyle,
+    pformat_cap: tuple[int | None, int | None] = (None, None),
 ) -> Iterator[str]:
     """Yield the per-line explanation for ``left == right`` (without summary).
 
@@ -73,7 +74,9 @@ def _compare_eq_any(
             yield from _compare_eq_mapping(left, right, highlighter, verbose)
 
         if isiterable(left) and isiterable(right):
-            yield from _compare_eq_iterable(left, right, highlighter, verbose)
+            yield from _compare_eq_iterable(
+                left, right, highlighter, verbose, pformat_cap
+            )
 
 
 def _compare_eq_cls(
diff --git a/src/_pytest/assertion/_compare_sequence.py b/src/_pytest/assertion/_compare_sequence.py
index cd0043bf7ce..0e57ae91750 100644
--- a/src/_pytest/assertion/_compare_sequence.py
+++ b/src/_pytest/assertion/_compare_sequence.py
@@ -15,6 +15,7 @@ def _compare_eq_iterable(
     right: Iterable[object],
     highlighter: _HighlightFunc,
     verbose: int = 0,
+    pformat_cap: tuple[int | None, int | None] = (None, None),
 ) -> Iterator[str]:
     if verbose <= 0 and not running_on_ci():
         yield "Use -v to get more diff"
@@ -22,19 +23,30 @@ def _compare_eq_iterable(
     # dynamic import to speedup pytest
     import difflib
 
-    left_formatting = PrettyPrinter().pformat(left).splitlines()
-    right_formatting = PrettyPrinter().pformat(right).splitlines()
+    # ``pformat_cap`` is ``(max_lines, max_chars)``, computed by the
+    # dispatcher from the truncator's ``truncation_limit_lines`` /
+    # ``truncation_limit_chars``: when truncation is going to drop
+    # everything past those budgets anyway, we don't bother formatting
+    # more. ``(None, None)`` means no cap (``-vv`` or CI: the user wants
+    # the full diff).
+    pp = PrettyPrinter()
+    max_lines, max_chars = pformat_cap
+    left_formatting = pp.pformat_lines(left, max_lines, max_chars)
+    right_formatting = pp.pformat_lines(right, max_lines, max_chars)
 
     yield ""
     yield "Full diff:"
     # "right" is the expected base against which we compare "left",
     # see https://github.com/pytest-dev/pytest/issues/3333
-    yield from highlighter(
-        "\n".join(
-            line.rstrip() for line in difflib.ndiff(right_formatting, left_formatting)
-        ),
-        lexer="diff",
-    ).splitlines()
+    #
+    # Yield each ndiff line through the highlighter individually so the
+    # streaming truncator can stop pulling from ``difflib.ndiff`` as
+    # soon as its budget is full. The diff lexer is line-oriented, so
+    # per-line highlighting is equivalent — it just adds a redundant
+    # ``\x1b[0m`` reset at the start of each line (invisible to the
+    # terminal).
+    for line in difflib.ndiff(right_formatting, left_formatting):
+        yield highlighter(line.rstrip(), lexer="diff")
 
 
 def _compare_eq_sequence(
diff --git a/src/_pytest/assertion/util.py b/src/_pytest/assertion/util.py
index 5e5ef543c13..986e3231b93 100644
--- a/src/_pytest/assertion/util.py
+++ b/src/_pytest/assertion/util.py
@@ -140,6 +140,7 @@ def assertrepr_compare(
     verbose: int,
     highlighter: _HighlightFunc,
     assertion_text_diff_style: _AssertionTextDiffStyle,
+    pformat_cap: tuple[int | None, int | None] = (None, None),
 ) -> Iterator[str]:
     """Yield specialised explanations for some operators/operands.
 
@@ -183,6 +184,7 @@ def assertrepr_compare(
                 highlighter,
                 verbose,
                 assertion_text_diff_style,
+                pformat_cap,
             )
         elif op == "not in" and istext(left) and istext(right):
             source = _notin_text(left, right, verbose)
diff --git a/testing/test_assertion.py b/testing/test_assertion.py
index ac20a172a8c..50705800aae 100644
--- a/testing/test_assertion.py
+++ b/testing/test_assertion.py
@@ -2387,8 +2387,8 @@ def test():
             """,
             [
                 "{bold}{red}E         At index 1 diff: {reset}{number}1{hl-reset}{endline} != {reset}{number}2*",
-                "{bold}{red}E         {light-red}-     2,{hl-reset}{endline}{reset}",
-                "{bold}{red}E         {light-green}+     1,{hl-reset}{endline}{reset}",
+                "{bold}{red}E         {reset}{light-red}-     2,{hl-reset}{endline}{reset}",
+                "{bold}{red}E         {reset}{light-green}+     1,{hl-reset}{endline}{reset}",
             ],
         ),
         (
@@ -2406,8 +2406,8 @@ def test():
                 "{bold}{red}E         Right contains 1 more item:{reset}",
                 "{bold}{red}E         {reset}{{{str}'{hl-reset}{str}number-is-0{hl-reset}{str}'{hl-reset}: {number}0*",
                 "{bold}{red}E         {reset}{light-gray} {hl-reset} {{{endline}{reset}",
-                "{bold}{red}E         {light-gray} {hl-reset}     'number-is-1': 1,{endline}{reset}",
-                "{bold}{red}E         {light-green}+     'number-is-5': 5,{hl-reset}{endline}{reset}",
+                "{bold}{red}E         {reset}{light-gray} {hl-reset}     'number-is-1': 1,{endline}{reset}",
+                "{bold}{red}E         {reset}{light-green}+     'number-is-5': 5,{hl-reset}{endline}{reset}",
             ],
         ),
         (

From 337bd20f0d134072885d82553948e11e79ecda6a Mon Sep 17 00:00:00 2001
From: Pierre Sassoulas <pierre.sassoulas@gmail.com>
Date: Sat, 13 Jun 2026 15:46:49 +0200
Subject: [PATCH 5/6] [test] Close the coverage gaps in streaming truncation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Close the patch-coverage gaps codecov flagged on the streaming refactor:

* a plugin returning a truthy-but-empty iterator (``iter([])``), which
  slips past the first falsy check but is empty once materialised
  through truncation — exercises the second skip in ``callbinrepr``.
* a ``--assert=plain`` run, exercising the false branch of the
  ``assertmode == "rewrite"`` guard.
* every hook returning ``None`` (``assert 1 == 2``), so the dispatcher
  falls through the loop and returns ``None``.
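
The first and third cases can be pictured with a rough sketch. It is
only an illustration: ``first_non_empty`` is a hypothetical stand-in
for the dispatcher loop, not the real ``callbinrepr``, and it
materialises with ``list()`` where the real code goes through the
truncator.

```python
def first_non_empty(hook_result):
    """Hypothetical stand-in for the dispatcher loop (not pytest's code)."""
    for new_expl in hook_result:
        # First skip: ``None`` (or an empty list) is falsy.
        if not new_expl:
            continue
        # An iterator object is truthy even when it will yield nothing,
        # so it has to be materialised before emptiness can be observed.
        lines = list(new_expl)
        # Second skip: truthy on the way in, empty once materialised.
        if not lines:
            continue
        return lines
    # Every hook declined: fall through the loop.
    return None


# ``iter([])`` passes the first check and is caught by the second one.
explanation = first_non_empty([None, iter([]), ["built-in explanation"]])
nothing = first_non_empty([None, None])
```

Here ``explanation`` ends up as the last hook's lines and ``nothing``
is ``None``, which mirrors the fall-through in the third bullet.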

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 testing/test_assertion.py | 73 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 71 insertions(+), 2 deletions(-)

diff --git a/testing/test_assertion.py b/testing/test_assertion.py
index 50705800aae..95e31332c0c 100644
--- a/testing/test_assertion.py
+++ b/testing/test_assertion.py
@@ -2291,8 +2291,7 @@ def raise_exit(obj):
 def test_plugin_hook_returning_none_is_skipped(pytester: Pytester) -> None:
     """A ``pytest_assertrepr_compare`` impl returning ``None`` is skipped
     so the next impl (or the built-in) can produce the explanation.
-    Covers the ``if new_expl is None: continue`` branch in
-    ``callbinrepr``.
+    Covers the ``if not new_expl: continue`` branch in ``callbinrepr``.
     """
     pytester.makeconftest(
         """
@@ -2315,6 +2314,76 @@ def test_diff():
     )
 
 
+def test_plugin_hook_returning_empty_iterator_is_skipped(pytester: Pytester) -> None:
+    """A plugin returning a truthy but ultimately empty iterable is
+    skipped after materialisation. Covers the second
+    ``if not new_expl: continue`` branch in ``callbinrepr``.
+    """
+    pytester.makeconftest(
+        """
+        def pytest_assertrepr_compare(op, left, right):
+            # An iterator object is truthy, so it slips past the first
+            # falsy check; once materialised through truncation it is
+            # empty and the dispatcher must move on.
+            return iter([])
+        """
+    )
+    pytester.makepyfile(
+        """
+        def test_diff():
+            assert {1, 2} == {1, 3}
+        """
+    )
+    result = pytester.runpytest()
+    # The built-in set-comparison explanation still reaches the user.
+    result.stdout.fnmatch_lines(
+        ["*Extra items in the left set:*", "*Extra items in the right set:*"]
+    )
+
+
+def test_callbinrepr_falls_through_when_all_hooks_return_none(
+    pytester: Pytester,
+) -> None:
+    """When every ``pytest_assertrepr_compare`` impl returns ``None``
+    (no specialised explanation applies, e.g. ``assert 1 == 2``), the
+    dispatcher exhausts ``hook_result``, exits the loop, and returns
+    ``None``. Covers the ``continue → loop exit`` branch on the first
+    ``if not new_expl: continue`` line.
+    """
+    pytester.makepyfile(
+        """
+        def test_trivial():
+            assert 1 == 2
+        """
+    )
+    result = pytester.runpytest()
+    # Just the plain ``assert 1 == 2`` rewrite, with no specialised
+    # comparator explanation appended (because the dispatcher fell
+    # through to ``return None``).
+    result.stdout.fnmatch_lines(["*assert 1 == 2*"])
+    result.assert_outcomes(failed=1)
+
+
+def test_callbinrepr_plain_assert_mode(pytester: Pytester) -> None:
+    """In ``--assert=plain`` mode ``callbinrepr`` skips the ``%`` escape.
+    Covers the false branch of ``if item.config.getvalue("assertmode")
+    == "rewrite"``.
+    """
+    pytester.makepyfile(
+        """
+        def test_diff():
+            assert {1, 2} == {1, 3}
+        """
+    )
+    result = pytester.runpytest("--assert=plain")
+    # In plain mode the comparator still runs via ``callbinrepr`` (it
+    # is the rewrite escaping that's skipped), so the explanation is
+    # still produced.
+    result.stdout.fnmatch_lines(
+        ["*Extra items in the left set:*", "*Extra items in the right set:*"]
+    )
+
+
 def test_exception_before_first_yield_emits_summary_and_notice(monkeypatch) -> None:
     """When the comparator raises *before* any explanation line has been
     yielded, ``assertrepr_compare`` should still produce the summary so

From 89967111ec36302087854fda8f428723cc1c24d6 Mon Sep 17 00:00:00 2001
From: Pierre Sassoulas <pierre.sassoulas@gmail.com>
Date: Sat, 13 Jun 2026 15:46:58 +0200
Subject: [PATCH 6/6] [doc] Add changelog for streaming assertion comparisons
 (#14523)
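
The changelog entry describes a consumer that stops the producer once
it has enough output. A minimal sketch of that idea, assuming a
hypothetical ``head`` helper built on ``itertools.islice`` (pytest's
real truncator is not reproduced here):

```python
import difflib
from itertools import islice


def head(lines, max_lines):
    """Hypothetical helper: pull at most ``max_lines`` items, then stop."""
    return list(islice(lines, max_lines))


left = [str(i) for i in range(1000)]
right = [str(i) for i in range(1, 1001)]

# ``difflib.ndiff`` returns a generator, so no diff line is produced
# until the consumer asks for it.
diff = (line.rstrip() for line in difflib.ndiff(left, right))
first = head(diff, 3)
```

Only three diff lines are ever emitted; the remaining ones stay
unproduced inside the generator.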

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 changelog/14523.improvement.rst | 13 +++++++++++++
 1 file changed, 13 insertions(+)
 create mode 100644 changelog/14523.improvement.rst

diff --git a/changelog/14523.improvement.rst b/changelog/14523.improvement.rst
new file mode 100644
index 00000000000..d53374c62df
--- /dev/null
+++ b/changelog/14523.improvement.rst
@@ -0,0 +1,13 @@
+Assertion explanations are now built lazily and the truncator stops
+the comparison helpers as soon as it has enough output, so comparing
+two large collections no longer builds the full diff in order to
+discard it. A focused micro-benchmark of the worst-case scenario
+(``set(range(500_000)) == set(range(1, 500_001))``) drops from ~2,200 ms
+to ~43 ms; realistic test suites with mostly small diffs should be
+unchanged.
+
+The truncation footer no longer reports the hidden-line count
+(``...Full output truncated (N lines hidden), ...`` becomes
+``...Full output truncated, ...``); diff lines now carry a redundant
+``\x1b[0m`` reset prefix (invisible to terminals) because each line is
+now highlighted individually.