49 changes: 40 additions & 9 deletions docs/rust_memory_benchmark.rst
@@ -6,7 +6,7 @@ This benchmark compares peak memory usage before and after the Rust serializer s
Summary
-------

The current branch reduces serializer peak RSS by about **77.44 MiB** on a 100,000-record payload that produces **78.17 MiB** of XML. That is a **49.1% reduction** in serializer memory delta compared with the previous PR commit.
The current branch still reduces serializer peak RSS by about **77.43 MiB** on a 100,000-record payload that produces **78.17 MiB** of XML. That is a **49.1% reduction** in serializer memory delta compared with the previous PR commit.

.. list-table::
:header-rows: 1
@@ -23,10 +23,10 @@ The current branch reduces serializer peak RSS by about **77.44 MiB** on a 100,0
- 157.70 MiB
- 0.180s
* - Current
- ``07d840f``
- 271.65 MiB
- 80.26 MiB
- 0.265s
- ``43543e8`` + worktree
- 270.91 MiB
- 80.27 MiB
- 0.232s

The memory result matches the implementation change: the previous version held roughly one final XML payload in Rust plus one Python ``bytes`` payload, while the current version writes into the Python ``bytes`` object directly.

@@ -36,14 +36,14 @@ Methodology
The benchmark uses ``benchmark_memory_rust.py`` with a deterministic generated payload so the Rust fast path can be measured without file parsing or pure-Python fallback behavior.

* Machine: Apple Silicon arm64
* OS: macOS 26.5
* OS: macOS 26.5.1
* Python: 3.14.0
* Build: ``python3 -m maturin develop --release --offline``
* Payload: 100,000 nested records
* Input JSON size: 44.31 MiB
* Output XML size: 78.17 MiB
* Measurement: process ``ru_maxrss`` after payload creation versus peak after ``json2xml_rs.dicttoxml(payload, attr_type=True)``
* Sampling: three fresh Python processes per version
* Sampling: five fresh Python processes for the current version using ``hyperfine 1.20.0 --runs 5 --show-output --export-json``

The baseline RSS is captured after the large Python payload is already built. The reported serializer delta is ``peak_rss - baseline_rss``, which focuses the comparison on output construction rather than payload allocation.

@@ -89,11 +89,41 @@ Raw Samples
- 271.55 MiB
- 80.30 MiB
- 0.258s
* - current-hyperfine-1
- 190.77 MiB
- 271.12 MiB
- 80.36 MiB
- 0.233s
* - current-hyperfine-2
- 190.66 MiB
- 271.02 MiB
- 80.36 MiB
- 0.234s
* - current-hyperfine-3
- 190.62 MiB
- 270.98 MiB
- 80.36 MiB
- 0.234s
* - current-hyperfine-4
- 190.78 MiB
- 270.84 MiB
- 80.06 MiB
- 0.231s
* - current-hyperfine-5
- 190.38 MiB
- 270.61 MiB
- 80.23 MiB
- 0.229s

Hyperfine Result
----------------

Hyperfine measured full fresh-process runtime for the current benchmark command at **684.3 ms +/- 26.5 ms** across five runs, with a range of **664.1 ms to 728.0 ms**. The exported hyperfine memory sample was **272.59 MiB**, which is consistent with the script's average peak RSS of **270.91 MiB**.
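The averages quoted for the current version can be reproduced from the five ``current-hyperfine`` rows in Raw Samples. The sketch below is only a cross-check, not part of the benchmark script; it assumes the three numeric columns before the timing column are baseline RSS, peak RSS, and serializer delta, in that order (the column headers are not visible in this hunk), and the helper name ``mean`` is illustrative.

```rust
/// Arithmetic mean of a slice of samples.
fn mean(samples: &[f64]) -> f64 {
    samples.iter().sum::<f64>() / samples.len() as f64
}

fn main() {
    // Values copied from the current-hyperfine-1..5 rows, in MiB.
    // Assumption: column order is baseline RSS, peak RSS, delta.
    let baseline_rss = [190.77, 190.66, 190.62, 190.78, 190.38];
    let peak_rss = [271.12, 271.02, 270.98, 270.84, 270.61];
    let delta = [80.36, 80.36, 80.36, 80.06, 80.23];

    println!("mean baseline RSS: {:.2} MiB", mean(&baseline_rss));
    println!("mean peak RSS:     {:.2} MiB", mean(&peak_rss));
    println!("mean delta:        {:.2} MiB", mean(&delta));
}
```

The peak and delta means round to 270.91 MiB and 80.27 MiB, matching the summary table.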

Tradeoff
--------

The memory improvement comes with a throughput cost in this release benchmark. Average conversion time increased from 0.180s to 0.265s, about **47.5% slower** for this payload.
The memory improvement comes with a throughput cost in this release benchmark. Average conversion time increased from the previous 0.180s to the current hyperfine-sampled 0.232s, about **29.0% slower** for this payload.

That cost is likely from routing every XML write through ``std::io::Write`` and PyO3's bytes writer. The memory win is substantial for large outputs, but latency-sensitive callers may want more timing data before treating the bytes-writer path as a universal improvement.
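The percentages in this document follow from a plain relative-change calculation. The sketch below recomputes them from the figures in the summary table and raw samples; ``percent_change`` is an illustrative helper, not something the benchmark script is known to define. It also shows why the slowdown reads 29.0% rather than 28.9%: the figure appears to come from the unrounded mean of the five timing samples (0.2322s), not from the rounded 0.232s.

```rust
/// Signed percentage change from `before` to `after`.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // Serializer memory delta in MiB: previous vs current (summary table).
    println!("memory delta change: {:.1}%", percent_change(157.70, 80.27));

    // Conversion time in seconds: previous average vs the mean of the
    // five current-hyperfine samples.
    let samples = [0.233, 0.234, 0.234, 0.231, 0.229];
    let current = samples.iter().sum::<f64>() / samples.len() as f64;
    println!("time change (unrounded mean): {:.1}%", percent_change(0.180, current));
    println!("time change (rounded 0.232):  {:.1}%", percent_change(0.180, 0.232));
}
```

With these inputs the three lines come out at roughly -49.1%, 29.0%, and 28.9%.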

@@ -105,6 +135,7 @@ Run each version in a fresh process after installing the desired Rust extension
.. code-block:: bash

python3 -m maturin develop --release --offline
python3 benchmark_memory_rust.py --records 100000 --label current-release-1
hyperfine --runs 5 --show-output --export-json /private/tmp/json2xml-rust-memory-hyperfine.json \
'python3 benchmark_memory_rust.py --records 100000 --label current-hyperfine'

For the previous comparison, install commit ``7dd86b0`` in a temporary worktree, then run the same command from the main checkout so the benchmark script stays identical.
2 changes: 1 addition & 1 deletion lat.md/architecture.md
@@ -44,7 +44,7 @@ The May 2026 benchmark on Apple Silicon shows the Rust extension as the best opt

Reproduction docs require contributors to record machine, OS, Python, and tool availability before comparing results. `benchmark_all.py` mixes library calls and CLI subprocesses intentionally, so its Go and Zig rows include process startup overhead.

The June 2026 Rust memory benchmark uses [[benchmark_memory_rust.py#main]] to compare release builds in fresh Python processes. The bytes-writer implementation cuts serializer peak RSS by about half for large outputs, with a documented throughput tradeoff.
The June 2026 Rust memory benchmark uses [[benchmark_memory_rust.py#main]] under hyperfine to compare release builds in fresh Python processes. The bytes-writer implementation cuts serializer peak RSS by about half for large outputs, with a documented throughput tradeoff.

## Dependency security

4 changes: 4 additions & 0 deletions lat.md/tests.md
@@ -153,3 +153,7 @@ XML name validation should agree across the ASCII fast path, parser-backed path,
### XML attribute name validation

Attribute name validation should reject malformed custom attribute keys while preserving parser-accepted edge names such as underscores, hyphens, and xml-prefixed names.

### Rust invalid-name attrs escape once

Rust XML-name helpers should return raw invalid keys for later attribute escaping so borrowed-name optimizations cannot reintroduce double escaping.
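
A minimal sketch of why the raw key matters. The two functions below are simplified stand-ins that reuse the names `escape_xml` and `make_attr_string` for readability; their bodies are assumptions, since the crate's real implementations are not shown in this diff. If the name helper escaped the key itself, the attribute builder would escape it a second time.

```rust
/// Simplified stand-in for the crate's `escape_xml` (assumed behavior).
fn escape_xml(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&apos;"),
            _ => out.push(c),
        }
    }
    out
}

/// Hypothetical attribute builder: escapes each value exactly once.
fn make_attr_string(attrs: &[(String, String)]) -> String {
    attrs
        .iter()
        .map(|(k, v)| format!(" {}=\"{}\"", k, escape_xml(v)))
        .collect()
}

fn main() {
    let raw = "tag&name";
    // Raw key handed to the builder: one level of escaping.
    let once = make_attr_string(&[("name".to_string(), raw.to_string())]);
    // Pre-escaped key handed to the builder: the bug the test guards against.
    let twice = make_attr_string(&[("name".to_string(), escape_xml(raw))]);
    println!("{once}");
    println!("{twice}");
}
```

The first line contains `tag&amp;name`; the second contains `tag&amp;amp;name`, which is the double escaping this test entry is meant to rule out.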
41 changes: 26 additions & 15 deletions rust/src/lib.rs
@@ -11,6 +11,9 @@ use pyo3::prelude::*;
use pyo3::types::{PyBool, PyBytes, PyDict, PyFloat, PyInt, PyList, PyString};
#[cfg(feature = "python")]
use std::io::Write;

use std::borrow::Cow;

/// Escape special XML characters in a string (allocating convenience wrapper).
#[inline]
pub fn escape_xml(s: &str) -> String {
@@ -172,28 +175,27 @@ pub fn is_valid_xml_name(key: &str) -> bool {
/// (unescaped) original key when a fallback is needed. Escaping of the
/// attribute value is handled later by `make_attr_string`, so we must NOT
/// escape here to avoid double-escaping.
pub fn make_valid_xml_name(key: &str) -> (String, Option<(String, String)>) {
pub fn make_valid_xml_name<'a>(
key: &'a str,
) -> (Cow<'a, str>, Option<(&'static str, Cow<'a, str>)>) {
// Already valid
if is_valid_xml_name(key) {
return (key.to_string(), None);
return (Cow::Borrowed(key), None);
}

// Numeric key - prepend 'n'
if key.bytes().all(|b| b.is_ascii_digit()) && !key.is_empty() {
return (format!("n{}", key), None);
return (Cow::Owned(format!("n{}", key)), None);
}

// Try replacing spaces with underscores
let with_underscores = key.replace(' ', "_");
if is_valid_xml_name(&with_underscores) {
return (with_underscores, None);
return (Cow::Owned(with_underscores), None);
}

// Fall back to using "key" with name attribute (raw value, escaped later)
(
"key".to_string(),
Some(("name".to_string(), key.to_string())),
)
(Cow::Borrowed("key"), Some(("name", Cow::Borrowed(key))))
}

/// Build an attribute string from key-value pairs (allocating convenience wrapper).
@@ -392,10 +394,10 @@ fn write_dict_contents(
cfg: &ConvertConfig,
) -> PyResult<()> {
for (key, val) in dict.iter() {
let key_str: String = key.str()?.extract()?;
let (xml_key, name_attr_pair) = make_valid_xml_name(&key_str);
let name_attr = name_attr_pair.as_ref().map(|(_, v)| v.as_str());

let key_py_str = key.str()?;
let key_str = key_py_str.to_str()?;
let (xml_key, name_attr_pair) = make_valid_xml_name(key_str);
let name_attr = name_attr_pair.as_ref().map(|(_, v)| v.as_ref());
// Lists in dicts get special wrapping treatment
if let Ok(list) = val.cast::<PyList>() {
let first_is_scalar = list
@@ -755,16 +757,23 @@ mod tests {
fn falls_back_to_key_with_name_attr() {
let (name, attr) = make_valid_xml_name("-invalid");
assert_eq!(name, "key");
assert_eq!(attr, Some(("name".to_string(), "-invalid".to_string())));
assert_eq!(
attr.as_ref().map(|(k, v)| (*k, v.as_ref())),
Some(("name", "-invalid"))
);
}

#[test]
// @lat: [[tests#XML helper behavior#Rust invalid-name attrs escape once]]
fn returns_raw_key_for_invalid_names() {
// make_valid_xml_name must return the raw key, not escaped.
// Escaping happens later in make_attr_string to avoid double-escaping.
let (name, attr) = make_valid_xml_name("tag&name");
assert_eq!(name, "key");
assert_eq!(attr, Some(("name".to_string(), "tag&name".to_string())));
assert_eq!(
attr.as_ref().map(|(k, v)| (*k, v.as_ref())),
Some(("name", "tag&name"))
);
}

#[test]
@@ -773,7 +782,9 @@ mod tests {
// a single level of escaping, not &amp;amp;
let (name, attr) = make_valid_xml_name("tag&name");
assert_eq!(name, "key");
let attrs = attr.map(|(k, v)| vec![(k, v)]).unwrap_or_default();
let attrs = attr
.map(|(k, v)| vec![(k.to_string(), v.into_owned())])
.unwrap_or_default();
let attr_string = make_attr_string(&attrs);
assert_eq!(attr_string, " name=\"tag&amp;name\"");
}
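
For readers following the `Cow` change in `make_valid_xml_name`, here is a standalone sketch of the new return shape. The branch structure mirrors the added lines in this diff, but `is_valid_xml_name` below is a deliberately simplified ASCII-only stand-in (an assumption; the real validator is not shown here), and `is_borrowed` is a hypothetical helper added only to make the allocation behavior observable.

```rust
use std::borrow::Cow;

/// Simplified ASCII-only stand-in for the crate's validator (assumption).
fn is_valid_xml_name(key: &str) -> bool {
    let mut chars = key.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '_' | '-' | '.'))
}

/// Same branch structure as the updated function in this diff.
fn make_valid_xml_name<'a>(
    key: &'a str,
) -> (Cow<'a, str>, Option<(&'static str, Cow<'a, str>)>) {
    // Already valid: borrow the caller's key, no allocation.
    if is_valid_xml_name(key) {
        return (Cow::Borrowed(key), None);
    }
    // Numeric key: a new string is unavoidable, so return an owned value.
    if !key.is_empty() && key.bytes().all(|b| b.is_ascii_digit()) {
        return (Cow::Owned(format!("n{}", key)), None);
    }
    // Spaces replaced with underscores: also owned.
    let with_underscores = key.replace(' ', "_");
    if is_valid_xml_name(&with_underscores) {
        return (Cow::Owned(with_underscores), None);
    }
    // Fallback: static element name plus the raw key, both borrowed.
    (Cow::Borrowed("key"), Some(("name", Cow::Borrowed(key))))
}

/// Hypothetical helper: true when no allocation backs the value.
fn is_borrowed(c: &Cow<'_, str>) -> bool {
    matches!(c, Cow::Borrowed(_))
}

fn main() {
    for key in ["item", "123", "my key", "tag&name"] {
        let (name, attr) = make_valid_xml_name(key);
        println!("{key:?} -> {name} (borrowed: {}) attr: {attr:?}", is_borrowed(&name));
    }
}
```

Under these assumptions, only the numeric and space-replacement branches allocate. The already-valid path and the fallback path both borrow, which is presumably where the per-key savings in `write_dict_contents` come from.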