vinitkumar · Jun 10, 2026
diff --git a/docs/rust_memory_benchmark.rst b/docs/rust_memory_benchmark.rst
@@ -6,7 +6,7 @@ This benchmark compares peak memory usage before and after the Rust serializer s
 Summary
 -------
 
-The current branch reduces serializer peak RSS by about **77.44 MiB** on a 100,000-record payload that produces **78.17 MiB** of XML. That is a **49.1% reduction** in serializer memory delta compared with the previous PR commit.
+The current branch still reduces serializer peak RSS by about **77.43 MiB** on a 100,000-record payload that produces **78.17 MiB** of XML. That is a **49.1% reduction** in serializer memory delta compared with the previous PR commit.
 
 .. list-table::
    :header-rows: 1
@@ -23,10 +23,10 @@ The current branch reduces serializer peak RSS by about **77.44 MiB** on a 100,0
      - 157.70 MiB
      - 0.180s
    * - Current
-     - ``07d840f``
-     - 271.65 MiB
-     - 80.26 MiB
-     - 0.265s
+     - ``43543e8`` + worktree
+     - 270.91 MiB
+     - 80.27 MiB
+     - 0.232s
 
 The memory result matches the implementation change: the previous version held roughly one final XML payload in Rust plus one Python ``bytes`` payload, while the current version writes into the Python ``bytes`` object directly.
 
@@ -36,14 +36,14 @@ Methodology
 The benchmark uses ``benchmark_memory_rust.py`` with a deterministic generated payload so the Rust fast path can be measured without file parsing or pure-Python fallback behavior.
 
 * Machine: Apple Silicon arm64
-* OS: macOS 26.5
+* OS: macOS 26.5.1
 * Python: 3.14.0
 * Build: ``python3 -m maturin develop --release --offline``
 * Payload: 100,000 nested records
 * Input JSON size: 44.31 MiB
 * Output XML size: 78.17 MiB
 * Measurement: process ``ru_maxrss`` after payload creation versus peak after ``json2xml_rs.dicttoxml(payload, attr_type=True)``
-* Sampling: three fresh Python processes per version
+* Sampling: five fresh Python processes for the current version, run under hyperfine 1.20.0 with ``--runs 5 --show-output --export-json``
 
 The baseline RSS is captured after the large Python payload is already built. The reported serializer delta is ``peak_rss - baseline_rss``, which focuses the comparison on output construction rather than payload allocation.
 
@@ -89,11 +89,41 @@ Raw Samples
      - 271.55 MiB
      - 80.30 MiB
      - 0.258s
+   * - current-hyperfine-1
+     - 190.77 MiB
+     - 271.12 MiB
+     - 80.36 MiB
+     - 0.233s
+   * - current-hyperfine-2
+     - 190.66 MiB
+     - 271.02 MiB
+     - 80.36 MiB
+     - 0.234s
+   * - current-hyperfine-3
+     - 190.62 MiB
+     - 270.98 MiB
+     - 80.36 MiB
+     - 0.234s
+   * - current-hyperfine-4
+     - 190.78 MiB
+     - 270.84 MiB
+     - 80.06 MiB
+     - 0.231s
+   * - current-hyperfine-5
+     - 190.38 MiB
+     - 270.61 MiB
+     - 80.23 MiB
+     - 0.229s
+
+Hyperfine Result
+----------------
+
+Hyperfine measured full fresh-process runtime for the current benchmark command at **684.3 ms +/- 26.5 ms** across five runs, with a range of **664.1 ms to 728.0 ms**. The exported hyperfine memory sample was **272.59 MiB**, which is consistent with the script's average peak RSS of **270.91 MiB**.
 
 Tradeoff
 --------
 
-The memory improvement comes with a throughput cost in this release benchmark. Average conversion time increased from 0.180s to 0.265s, about **47.5% slower** for this payload.
+The memory improvement comes with a throughput cost in this release benchmark. Average conversion time increased from 0.180s for the previous commit to 0.232s for the current build (the mean of the five hyperfine-driven samples), about **29.0% slower** for this payload.
 
 That cost is likely from routing every XML write through ``std::io::Write`` and PyO3's bytes writer. The memory win is substantial for large outputs, but latency-sensitive callers may want more timing data before treating the bytes-writer path as a universal improvement.
 
@@ -105,6 +135,7 @@ Run each version in a fresh process after installing the desired Rust extension
 .. code-block:: bash
 
    python3 -m maturin develop --release --offline
-   python3 benchmark_memory_rust.py --records 100000 --label current-release-1
+   hyperfine --runs 5 --show-output --export-json /private/tmp/json2xml-rust-memory-hyperfine.json \
+     'python3 benchmark_memory_rust.py --records 100000 --label current-hyperfine'
 
 For the previous comparison, install commit ``7dd86b0`` in a temporary worktree, then run the same command from the main checkout so the benchmark script stays identical.
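Review note: the summary figures in the docs change above can be re-derived from the five raw ``current-hyperfine`` rows. The following is a minimal sketch of that arithmetic, not part of the repository. The helper names ``mean`` and ``percent_change`` are hypothetical, and the previous-version values (157.70 MiB delta, 0.180s) are copied from the summary table rather than recomputed.

```rust
/// Arithmetic mean of a slice (hypothetical helper, not in the repo).
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Signed percentage change from `old` to `new`.
fn percent_change(old: f64, new: f64) -> f64 {
    (new - old) / old * 100.0
}

fn main() {
    // Raw current-hyperfine-1..5 samples from the "Raw Samples" table.
    let peak_mib = [271.12, 271.02, 270.98, 270.84, 270.61];
    let delta_mib = [80.36, 80.36, 80.36, 80.06, 80.23];
    let secs = [0.233, 0.234, 0.234, 0.231, 0.229];

    // Previous-version averages as reported in the summary table.
    let prev_delta_mib = 157.70;
    let prev_secs = 0.180;

    let cur_delta = mean(&delta_mib);
    println!("avg peak RSS:      {:.2} MiB", mean(&peak_mib));
    println!("avg serializer d:  {:.2} MiB", cur_delta);
    println!("memory saved:      {:.2} MiB", prev_delta_mib - cur_delta);
    // A negative change is a reduction, so flip the sign for display.
    println!("memory reduction:  {:.1}%", -percent_change(prev_delta_mib, cur_delta));
    println!("time increase:     {:.1}%", percent_change(prev_secs, mean(&secs)));
}
```

With these inputs the printed values should line up with the 270.91 MiB, 80.27 MiB, 77.43 MiB, 49.1% and 29.0% figures quoted in the documentation. The slowdown figure depends on using the unrounded mean of the timing samples.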
diff --git a/lat.md/architecture.md b/lat.md/architecture.md
@@ -44,7 +44,7 @@ The May 2026 benchmark on Apple Silicon shows the Rust extension as the best opt
 
 Reproduction docs require contributors to record machine, OS, Python, and tool availability before comparing results. `benchmark_all.py` mixes library calls and CLI subprocesses intentionally, so its Go and Zig rows include process startup overhead.
 
-The June 2026 Rust memory benchmark uses [[benchmark_memory_rust.py#main]] to compare release builds in fresh Python processes. The bytes-writer implementation cuts serializer peak RSS by about half for large outputs, with a documented throughput tradeoff.
+The June 2026 Rust memory benchmark uses [[benchmark_memory_rust.py#main]] under hyperfine to compare release builds in fresh Python processes. The bytes-writer implementation cuts serializer peak RSS by about half for large outputs, with a documented throughput tradeoff.
 
 ## Dependency security
 

diff --git a/lat.md/tests.md b/lat.md/tests.md
@@ -153,3 +153,7 @@ XML name validation should agree across the ASCII fast path, parser-backed path,
 ### XML attribute name validation
 
 Attribute name validation should reject malformed custom attribute keys while preserving parser-accepted edge names such as underscores, hyphens, and xml-prefixed names.
+
+### Rust invalid-name attrs escape once
+
+Rust XML-name helpers should return raw invalid keys for later attribute escaping so borrowed-name optimizations cannot reintroduce double escaping.
diff --git a/rust/src/lib.rs b/rust/src/lib.rs
@@ -11,6 +11,9 @@ use pyo3::prelude::*;
 use pyo3::types::{PyBool, PyBytes, PyDict, PyFloat, PyInt, PyList, PyString};
 #[cfg(feature = "python")]
 use std::io::Write;
+
+use std::borrow::Cow;
+
 /// Escape special XML characters in a string (allocating convenience wrapper).
 #[inline]
 pub fn escape_xml(s: &str) -> String {
@@ -172,28 +175,27 @@ pub fn is_valid_xml_name(key: &str) -> bool {
 /// (unescaped) original key when a fallback is needed. Escaping of the
 /// attribute value is handled later by `make_attr_string`, so we must NOT
 /// escape here to avoid double-escaping.
-pub fn make_valid_xml_name(key: &str) -> (String, Option<(String, String)>) {
+pub fn make_valid_xml_name<'a>(
+    key: &'a str,
+) -> (Cow<'a, str>, Option<(&'static str, Cow<'a, str>)>) {
     // Already valid
     if is_valid_xml_name(key) {
-        return (key.to_string(), None);
+        return (Cow::Borrowed(key), None);
     }
 
     // Numeric key - prepend 'n'
     if key.bytes().all(|b| b.is_ascii_digit()) && !key.is_empty() {
-        return (format!("n{}", key), None);
+        return (Cow::Owned(format!("n{}", key)), None);
     }
 
     // Try replacing spaces with underscores
     let with_underscores = key.replace(' ', "_");
     if is_valid_xml_name(&with_underscores) {
-        return (with_underscores, None);
+        return (Cow::Owned(with_underscores), None);
     }
 
     // Fall back to using "key" with name attribute (raw value, escaped later)
-    (
-        "key".to_string(),
-        Some(("name".to_string(), key.to_string())),
-    )
+    (Cow::Borrowed("key"), Some(("name", Cow::Borrowed(key))))
 }
 
 /// Build an attribute string from key-value pairs (allocating convenience wrapper).
@@ -392,10 +394,10 @@ fn write_dict_contents(
     cfg: &ConvertConfig,
 ) -> PyResult<()> {
     for (key, val) in dict.iter() {
-        let key_str: String = key.str()?.extract()?;
-        let (xml_key, name_attr_pair) = make_valid_xml_name(&key_str);
-        let name_attr = name_attr_pair.as_ref().map(|(_, v)| v.as_str());
-
+        let key_py_str = key.str()?;
+        let key_str = key_py_str.to_str()?;
+        let (xml_key, name_attr_pair) = make_valid_xml_name(key_str);
+        let name_attr = name_attr_pair.as_ref().map(|(_, v)| v.as_ref());
         // Lists in dicts get special wrapping treatment
         if let Ok(list) = val.cast::<PyList>() {
             let first_is_scalar = list
@@ -755,16 +757,23 @@ mod tests {
         fn falls_back_to_key_with_name_attr() {
             let (name, attr) = make_valid_xml_name("-invalid");
             assert_eq!(name, "key");
-            assert_eq!(attr, Some(("name".to_string(), "-invalid".to_string())));
+            assert_eq!(
+                attr.as_ref().map(|(k, v)| (*k, v.as_ref())),
+                Some(("name", "-invalid"))
+            );
         }
 
         #[test]
+        // @lat: [[tests#XML helper behavior#Rust invalid-name attrs escape once]]
         fn returns_raw_key_for_invalid_names() {
             // make_valid_xml_name must return the raw key, not escaped.
             // Escaping happens later in make_attr_string to avoid double-escaping.
             let (name, attr) = make_valid_xml_name("tag&name");
             assert_eq!(name, "key");
-            assert_eq!(attr, Some(("name".to_string(), "tag&name".to_string())));
+            assert_eq!(
+                attr.as_ref().map(|(k, v)| (*k, v.as_ref())),
+                Some(("name", "tag&name"))
+            );
         }
 
         #[test]
@@ -773,7 +782,9 @@ mod tests {
             // a single level of escaping, not &amp;amp;
             let (name, attr) = make_valid_xml_name("tag&name");
             assert_eq!(name, "key");
-            let attrs = attr.map(|(k, v)| vec![(k, v)]).unwrap_or_default();
+            let attrs = attr
+                .map(|(k, v)| vec![(k.to_string(), v.into_owned())])
+                .unwrap_or_default();
             let attr_string = make_attr_string(&attrs);
             assert_eq!(attr_string, " name=\"tag&amp;name\"");
         }