HTTP/2 server: write response body in place, drop per-response body-sized buffer (fixes #1271) by canemon-markov · Pull Request #1282 · JuliaWeb/HTTP.jl

canemon-markov · 2026-06-04T23:08:30Z

Problem

Fixes #1271. The HTTP/2 server allocates a fresh Vector{UInt8} sized to the
(windowed) response body on every response and copies the body into it
before writing to the socket. Two paths do this:

- _write_h2_batch_via_single_buffer!: the buffered-response fast path
- _write_data_frames_h2_server!: streaming / window-overflow batches

With the h2 server spawning one task per stream (Threads.@spawn), these
body-sized per-response allocations drive sustained memory growth proportional
to traffic × body size, in both retained live heap and process RSS. The HTTP/1
path does not show this: it writes the body in place (write(io, body)), with
no intermediate copy.

Fix

Write each DATA frame as a reused 9-byte header followed by the payload slice
taken directly from the body via a view. A unit-range view of a
Vector{UInt8} or Base.CodeUnits (both DenseVector) is a stride-1
StridedVector, so write(conn, ::view) takes Reseau's zero-copy pointer
path straight to the socket — matching the HTTP/1 behavior. Per-response
allocation drops from O(body size) to O(1) (the 9-byte header plus the
small HEADERS-frame bytes).
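
As a rough illustration of the write shape described above (not the code in this PR), the sketch below fills one reused 9-byte frame header per DATA frame and writes the payload as a unit-range view of the body. The names `fill_frame_header!` and `write_data_frames!` are hypothetical, the sink is a plain `IO` rather than a Reseau connection, and flow-control windows are ignored; the header layout follows RFC 9113.

```julia
# Sketch only: these names are hypothetical, not HTTP.jl internals.

const FRAME_TYPE_DATA = 0x00   # DATA frame type (RFC 9113, section 6.1)
const FLAG_END_STREAM = 0x01   # END_STREAM flag for DATA frames

# Fill a caller-owned 9-byte buffer with an HTTP/2 frame header:
# 24-bit payload length, 8-bit type, 8-bit flags, 1 reserved bit + 31-bit stream id.
function fill_frame_header!(hdr::Vector{UInt8}, len::Integer, frame_type::UInt8,
                            flags::UInt8, stream_id::Integer)
    length(hdr) == 9 || throw(ArgumentError("frame header must be 9 bytes"))
    0 <= len < 2^24 || throw(ArgumentError("payload length must fit in 24 bits"))
    hdr[1] = (len >> 16) % UInt8
    hdr[2] = (len >> 8) % UInt8
    hdr[3] = len % UInt8
    hdr[4] = frame_type
    hdr[5] = flags
    sid = UInt32(stream_id) & 0x7fffffff   # clear the reserved bit
    hdr[6] = (sid >> 24) % UInt8
    hdr[7] = (sid >> 16) % UInt8
    hdr[8] = (sid >> 8) % UInt8
    hdr[9] = sid % UInt8
    return hdr
end

# Write `body` as DATA frames: header first, then a view of the body.
# No buffer proportional to the body size is allocated here.
function write_data_frames!(io::IO, body::AbstractVector{UInt8}, stream_id::Integer;
                            max_frame_size::Integer = 16_384)
    hdr = Vector{UInt8}(undef, 9)          # allocated once, reused for every frame
    n = length(body)
    offset = 0
    while true
        len = min(max_frame_size, n - offset)
        is_last = offset + len == n
        flags = is_last ? FLAG_END_STREAM : 0x00
        fill_frame_header!(hdr, len, FRAME_TYPE_DATA, flags, stream_id)
        write(io, hdr)
        # The unit-range view refers to the body's own memory; no copy is made here.
        len > 0 && write(io, view(body, offset+1:offset+len))
        offset += len
        is_last && break
    end
    return offset
end
```

Calling `write_data_frames!(IOBuffer(), body, 1)` with a `Vector{UInt8}` body exercises the same header-then-view pattern; whether the underlying transport then avoids a copy depends on its own `write` methods.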

Accepted server connections enable TCP_NODELAY, so splitting the frame header
and payload into separate writes does not introduce Nagle latency.

Results

Two-process repro: a standalone HTTP/2 server, driven with 100,000 measured
requests of the same response body. Server-process memory sampled via a /mem
endpoint after a settled forced GC.

Scenario	Protocol	Requests	Live heap Δ	RSS Δ	Server proto counts	Client failures
Before (HTTP.jl 2.1.0)	h2	100,000	+20,508 KB	+28.0 MB	h1=0 h2=100,111 other=0	0
After (this PR)	h2	100,000	−0.0 KB	−2.9 MB	h1=0 h2=100,111 other=0	0

Before, the server retained about 20 MB of live heap (+20,508 KB, after settled
forced GC) and gained 28.0 MB of RSS over the run. After, both stay flat on the
identical HTTP/2-only workload, with zero client failures.

Testing

Full test/http2_server_tests.jl passes, including the flow-control / window
/ large-body DATA-frame-splitting testsets that exercise this code.
Note: the timeout-middleware testset has 2 failures present on master
independently of this change (verified against the pristine base) — a
separate, pre-existing issue, not introduced here.


…JuliaWeb#1271)

The HTTP/2 server response-body write paths allocated a fresh Vector{UInt8}
sized to the (windowed) response body on every response and copied the body
into it before a single socket write:

- _write_data_frames_h2_server! (streaming / window-overflow batches)
- _write_h2_batch_via_single_buffer! (buffered-response fast path)

Under the h2 server's per-stream Threads.@spawn, those large, short-lived
allocations are retained by the glibc malloc per-thread arenas and not returned
to the OS, so process RSS climbs in proportion to response body size even though
the Julia heap stays flat. The HTTP/1 path does not exhibit this because it
writes the body in place (write(io, body)) with no intermediate copy. This is
the mechanism behind JuliaWeb#1271 ("HTTP/2 server leaks memory proportional to
response body size").

Write each DATA frame as a reused 9-byte header followed by the payload slice
taken directly from the body via a view. A unit-range view of a Vector{UInt8}
or Base.CodeUnits (both DenseVector) is a stride-1 StridedVector, so
write(conn, ::view) takes the zero-copy pointer path straight to the socket,
matching the HTTP/1 behavior. Per-response allocation drops from O(body size)
to O(1) (the 9-byte header plus the small HEADERS-frame bytes).

Accepted server connections set TCP_NODELAY, so splitting the frame header and
payload into separate writes does not incur Nagle latency.

All HTTP/2 server tests pass (the 2 pre-existing failures in the timeout
middleware testset are present on master with and without this change).

codecov · 2026-06-04T23:15:08Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.62%. Comparing base (4e2b698) to head (e9590a8).

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1282      +/-   ##
==========================================
+ Coverage   84.53%   84.62%   +0.09%     
==========================================
  Files          28       28              
  Lines       10774    10766       -8     
==========================================
+ Hits         9108     9111       +3     
+ Misses       1666     1655      -11


quinnj · 2026-06-05T15:50:31Z

Thanks for digging into this. One tradeoff I want to think through before merging: this extends the v2.1.0 BytesBody fix down into the lower DATA-frame writers, which is good for avoiding the body-sized copy, but it also changes the write shape from one contiguous buffered write per DATA batch into separate writes for the frame header and payload.

For TLS, I think that may mean more TLS application writes/records; for plain TCP, it likely means more syscalls. A benchmark that would help: compare master/v2.1.0 vs this branch for a large fixed Vector{UInt8} response over both h2c and h2/TLS, tracking RSS/live heap plus req/s/latency, and ideally write/syscall counts. That would tell us whether the RSS win comes with a meaningful throughput/latency cost.
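
To make the write-shape difference concrete, here is a minimal sketch that counts application-level write calls for the two strategies. `CountingSink`, `send!`, `send_batched!` and `send_split!` are hypothetical names for illustration only; the sink wraps an `IOBuffer`, so this counts calls rather than syscalls or TLS records, and it says nothing about throughput or latency.

```julia
# Hypothetical sink that counts application-level write calls.
mutable struct CountingSink
    io::IOBuffer
    writes::Int
end
CountingSink() = CountingSink(IOBuffer(), 0)

function send!(sink::CountingSink, bytes::AbstractVector{UInt8})
    sink.writes += 1
    return write(sink.io, bytes)
end

# 9-byte DATA frame header (type 0x00, no flags) for a payload of `len` bytes.
function data_header(len::Integer, stream_id::Integer)
    sid = UInt32(stream_id) & 0x7fffffff
    return UInt8[(len >> 16) % UInt8, (len >> 8) % UInt8, len % UInt8,
                 0x00, 0x00,
                 (sid >> 24) % UInt8, (sid >> 16) % UInt8,
                 (sid >> 8) % UInt8, sid % UInt8]
end

# Old shape: copy headers and payload into one contiguous buffer, write once.
function send_batched!(sink::CountingSink, body::Vector{UInt8}, stream_id::Integer;
                       max_frame_size::Integer = 16_384)
    buf = UInt8[]                                  # grows to roughly the body size
    for start in 1:max_frame_size:length(body)
        stop = min(start + max_frame_size - 1, length(body))
        append!(buf, data_header(stop - start + 1, stream_id))
        append!(buf, view(body, start:stop))
    end
    return send!(sink, buf)
end

# New shape: two writes per frame, payload taken as a view of the body.
function send_split!(sink::CountingSink, body::Vector{UInt8}, stream_id::Integer;
                     max_frame_size::Integer = 16_384)
    total = 0
    for start in 1:max_frame_size:length(body)
        stop = min(start + max_frame_size - 1, length(body))
        total += send!(sink, data_header(stop - start + 1, stream_id))
        total += send!(sink, view(body, start:stop))
    end
    return total
end
```

Both functions emit the same bytes; the batched variant issues one write per batch at the cost of a body-sized buffer, while the split variant issues two writes per DATA frame without that buffer. A real comparison would still need the h2c and h2/TLS benchmark described above.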

canemon-markov wants to merge 1 commit into JuliaWeb:master from canemon-markov:fix/h2-server-zerocopy-response-body
