Skip to content

Commit 1e158a7

Browse files
etrclaude
andcommitted
TASK-047: fire request_received and body_chunk hooks (pre-handler short-circuit)
Wires the two pre-routing, pre-handler hook phases that observe the inbound request. Both support short-circuit via hook_action::respond_with(...): a non-pass return populates mr->response_, sets the new mr->skip_handler flag, and routes through finalize_answer's new skip-handler branch. The body_chunk short-circuit also destroys any in-flight MHD_PostProcessor so its 32 KB buffer is freed (ASan-verified). Closes #273 (early 413 on oversize body) — demonstrated by examples/early_413.cpp. Partially addresses #272 (observation half of delayed body processing). Acceptance tests: - hooks_body_chunk_ctx_shape: compile-time pin for the TASK-045 ctx shapes (mutable http_request*, std::span<const std::byte> chunk, std::uint64_t offset, bool is_final). - hooks_request_received_short_circuit: 413 hook aborts the upload before any body bytes flow; downstream body_chunk hook never fires. Second sub-test pins the non-short-circuit path through to 200. - hooks_body_chunk_observes_progress: accumulated bytes equal body size; offsets are monotonic and start at zero. - hooks_body_chunk_short_circuit_no_leak: form-urlencoded POST forces MHD_PostProcessor allocation; first-chunk short-circuit must free it (sentinel for ASan). Implementation: - New fire_short_circuit_hooks_for_phase template in hook_handle.cpp (sibling to TASK-046's fire_hooks_for_phase) returns std::optional<http_response>; same shared_lock-snapshot-then-iterate pattern, throwing hook treated as pass per DR-009 §5.2. - New any_hooks_-gated firing sites in webserver_impl::requests_answer_first_step (request_received) and requests_answer_second_step (body_chunk). - modded_request gains a skip_handler bool; finalize_answer gains an early-exit branch that routes directly to materialize_and_queue_response when set. Both pipeline functions also re-check the flag so a request_received short-circuit suppresses body_chunk firings on subsequent MHD callbacks. - File-size mitigation: the two firing-site insertions pushed src/detail/webserver_request.cpp over the 500-LOC ceiling; mirrored the TASK-046 split pattern by carving src/detail/webserver_body_pipeline.cpp out (hosts both pipeline functions plus two anon-ns helpers that keep requests_answer_second_step at CCN <= 10). - hook_context.hpp doc comments now warn that body_chunk fires from arbitrary MHD worker threads at arbitrary granularity (no I/O, no per-chunk allocation; the chunk span aliases MHD-owned memory). - hooks_no_firing narrowed to drop request_received and body_chunk from its "must observe zero" set. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 1b942d9 commit 1e158a7

16 files changed

Lines changed: 969 additions & 64 deletions

examples/Makefile.am

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,7 @@
1919
LDADD = $(top_builddir)/src/libhttpserver.la
2020
AM_CPPFLAGS = -I$(top_srcdir)/src -I$(top_srcdir)/src/httpserver/
2121
METASOURCES = AUTO
22-
noinst_PROGRAMS = hello_world shared_state service custom_error allowing_disallowing_methods handlers hello_with_get_arg args_processing setting_headers custom_access_log minimal_https minimal_file_response minimal_deferred url_registration minimal_ip_ban banned_ip_log benchmark_select benchmark_threads benchmark_nodelay deferred_with_accumulator file_upload file_upload_with_callback empty_response_example iovec_response_example pipe_response_example daemon_info external_event_loop turbo_mode binary_buffer_response
22+
noinst_PROGRAMS = hello_world shared_state service custom_error allowing_disallowing_methods handlers hello_with_get_arg args_processing setting_headers custom_access_log minimal_https minimal_file_response minimal_deferred url_registration minimal_ip_ban banned_ip_log early_413 benchmark_select benchmark_threads benchmark_nodelay deferred_with_accumulator file_upload file_upload_with_callback empty_response_example iovec_response_example pipe_response_example daemon_info external_event_loop turbo_mode binary_buffer_response
2323

2424
hello_world_SOURCES = hello_world.cpp
2525
shared_state_SOURCES = shared_state.cpp
@@ -38,6 +38,7 @@ deferred_with_accumulator_SOURCES = deferred_with_accumulator.cpp
3838
url_registration_SOURCES = url_registration.cpp
3939
minimal_ip_ban_SOURCES = minimal_ip_ban.cpp
4040
banned_ip_log_SOURCES = banned_ip_log.cpp
41+
early_413_SOURCES = early_413.cpp
4142
benchmark_select_SOURCES = benchmark_select.cpp
4243
benchmark_threads_SOURCES = benchmark_threads.cpp
4344
benchmark_nodelay_SOURCES = benchmark_nodelay.cpp

examples/early_413.cpp

Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,54 @@
1+
// Demonstrates the solution to issue #273: short-circuit large uploads
2+
// with a 413 BEFORE any body bytes are consumed.
3+
//
4+
// Register a `request_received` hook that inspects Content-Length; if
5+
// it exceeds the configured cap, return a respond_with(413) action.
6+
// libhttpserver aborts the upload -- the resource handler is not
7+
// invoked and the body bytes never cross the daemon's I/O boundary.
8+
9+
#include <cstddef>
10+
#include <functional>
11+
#include <memory>
12+
#include <string>
13+
14+
#include <httpserver.hpp>
15+
16+
namespace hs = httpserver;
17+
18+
namespace {
19+
constexpr std::size_t kMaxUploadBytes = 1 * 1024 * 1024; // 1 MB
20+
} // namespace
21+
22+
class upload_resource : public hs::http_resource {
23+
public:
24+
hs::http_response render_post(const hs::http_request&) override {
25+
return hs::http_response::string("UPLOAD OK");
26+
}
27+
};
28+
29+
int main() {
30+
hs::webserver ws{hs::create_webserver(8080)};
31+
32+
auto h = ws.add_hook(hs::hook_phase::request_received,
33+
std::function<hs::hook_action(hs::request_received_ctx&)>(
34+
[](hs::request_received_ctx& ctx) {
35+
std::string cl{ctx.request->get_header("Content-Length")};
36+
if (cl.empty()) return hs::hook_action::pass();
37+
try {
38+
if (std::stoull(cl) > kMaxUploadBytes) {
39+
return hs::hook_action::respond_with(
40+
hs::http_response::empty().with_status(413));
41+
}
42+
} catch (...) {
43+
// Malformed Content-Length: let the normal pipeline
44+
// produce a 400 elsewhere.
45+
}
46+
return hs::hook_action::pass();
47+
}));
48+
49+
auto resource = std::make_shared<upload_resource>();
50+
ws.register_path("/upload", resource);
51+
52+
ws.start(true); // blocking
53+
return 0;
54+
}

specs/tasks/M5-routing-lifecycle/TASK-047.md

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -8,11 +8,11 @@
88
Wire the two pre-routing, pre-handler phases that observe the inbound request. Both support short-circuit (`hook_action::respond_with(...)`). Closes feature request #273 (early 413 on oversize body); partially addresses #272 (delayed body processing — the `body_chunk` half).
99

1010
**Action Items:**
11-
- [ ] Fire `request_received` from `webserver_impl::requests_answer_first_step` (`webserver.cpp:2010`) — after `mr->dhr.reset(new http_request(...))` so the `http_request` is populated, before the body read decision. Context: `request_received_ctx { http_request& req }`. (Mutable ref so a hook can adjust per-request state like content-size limit if a follow-up adds that knob.)
12-
- [ ] Fire `body_chunk` from `webserver_impl::requests_answer_second_step` (`webserver.cpp:2034`) — once per chunk delivered by MHD, before the chunk is appended to `mr->dhr` content / post-processor. Context: `body_chunk_ctx { http_request& req; std::span<const char> bytes; std::size_t total_bytes_received; }`.
13-
- [ ] Short-circuit handling: if any hook returns `hook_action::respond_with(r)`, stash `r` into `mr->response_`, mark `mr` as "skip-to-finalize", and let the existing finalize path queue it. The resource handler is not invoked. For `body_chunk` specifically, also drop any in-flight post-processor (`MHD_destroy_post_processor`) to free its buffer.
14-
- [ ] Document in the public header that `body_chunk` fires from MHD worker threads at arbitrary granularity — chunks may be byte-level on slow networks. Hooks must be cheap (no I/O, no allocation in the hot path) or risk back-pressuring the connection.
15-
- [ ] Same `fire_*` pattern as TASK-046 (shared_lock → copy → release → iterate).
11+
- [x] Fire `request_received` from `webserver_impl::requests_answer_first_step` (relocated to `src/detail/webserver_body_pipeline.cpp` by the TASK-047 split) — after `mr->dhr.reset(new http_request(...))` so the `http_request` is populated, before the body read decision. Context: `request_received_ctx { http_request* request; steady_clock::time_point received_at; }` (the TASK-045 shape).
12+
- [x] Fire `body_chunk` from `webserver_impl::requests_answer_second_step` (same file) — once per chunk delivered by MHD, before the chunk is appended to `mr->dhr` content / post-processor. Context: `body_chunk_ctx { http_request* request; std::span<const std::byte> chunk; std::uint64_t offset; bool is_final; }` (the TASK-045 shape; `offset` carries the running total of bytes already buffered).
13+
- [x] Short-circuit handling: a non-pass hook return stashes the response into `mr->response_`, sets `mr->skip_handler = true`, and lets the existing finalize path queue it (new `if (mr->skip_handler)` early-return in `finalize_answer`). The resource handler is not invoked. The `body_chunk` short-circuit also calls `MHD_destroy_post_processor(mr->pp)` to free the 32 KB buffer so ASan stays clean.
14+
- [x] Documented in the public header (`src/httpserver/hook_context.hpp`) that `body_chunk` fires from MHD worker threads at arbitrary granularity and hooks must be cheap (no blocking I/O, no per-chunk heap allocation).
15+
- [x] Same `fire_*` pattern as TASK-046 (shared_lock → reserve(8) → copy → release → iterate → try/catch → `log_dispatch_error`). The `hook_action`-returning variant `fire_short_circuit_hooks_for_phase` lives in `hook_handle.cpp` next to the void variant.
1616

1717
**Dependencies:**
1818
- Blocked by: TASK-045
@@ -29,4 +29,4 @@ Wire the two pre-routing, pre-handler phases that observe the inbound request. B
2929
**Related Decisions:** DR-012, §4.10
3030
**Resolves:** Issue #273. Partially addresses issue #272 (observation half only — body-buffer steal remains a v2.1 candidate).
3131

32-
**Status:** Not Started
32+
**Status:** Done

src/Makefile.am

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,7 @@ lib_LTLIBRARIES = libhttpserver.la
2525
# builds. The WS-off branch in websocket_handler.cpp provides stub
2626
# definitions (every member throws feature_unavailable except is_valid()
2727
# which returns false).
28-
libhttpserver_la_SOURCES = string_utilities.cpp webserver.cpp http_utils.cpp file_info.cpp http_request.cpp http_request_auth.cpp http_response.cpp create_webserver.cpp create_test_request.cpp websocket_handler.cpp hook_handle.cpp detail/http_endpoint.cpp detail/body.cpp detail/ip_representation.cpp detail/http_request_impl.cpp detail/http_request_impl_tls.cpp detail/webserver_setup.cpp detail/webserver_register.cpp detail/webserver_routes.cpp detail/webserver_callbacks.cpp detail/webserver_callbacks_lifecycle.cpp detail/webserver_websocket.cpp detail/webserver_dispatch.cpp detail/webserver_request.cpp
28+
libhttpserver_la_SOURCES = string_utilities.cpp webserver.cpp http_utils.cpp file_info.cpp http_request.cpp http_request_auth.cpp http_response.cpp create_webserver.cpp create_test_request.cpp websocket_handler.cpp hook_handle.cpp detail/http_endpoint.cpp detail/body.cpp detail/ip_representation.cpp detail/http_request_impl.cpp detail/http_request_impl_tls.cpp detail/webserver_setup.cpp detail/webserver_register.cpp detail/webserver_routes.cpp detail/webserver_callbacks.cpp detail/webserver_callbacks_lifecycle.cpp detail/webserver_websocket.cpp detail/webserver_dispatch.cpp detail/webserver_request.cpp detail/webserver_body_pipeline.cpp
2929
# noinst_HEADERS: shipped in the tarball but NEVER installed under $prefix/include.
3030
# Detail headers (httpserver/detail/*.hpp) live here so they cannot leak to
3131
# downstream consumers — the public surface comes in through <httpserver.hpp>.
Lines changed: 207 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,207 @@
1+
/*
2+
This file is part of libhttpserver
3+
Copyright (C) 2011-2026 Sebastiano Merlino
4+
5+
This library is free software; you can redistribute it and/or
6+
modify it under the terms of the GNU Lesser General Public
7+
License as published by the Free Software Foundation; either
8+
version 2.1 of the License, or (at your option) any later version.
9+
10+
This library is distributed in the hope that it will be useful,
11+
but WITHOUT ANY WARRANTY; without even the implied warranty of
12+
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
13+
Lesser General Public License for more details.
14+
15+
You should have received a copy of the GNU Lesser General Public
16+
License along with this library; if not, write to the Free Software
17+
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
18+
USA
19+
*/
20+
21+
// TASK-047 -- The two body-pipeline stages
22+
// (requests_answer_first_step / requests_answer_second_step) extracted
23+
// from webserver_request.cpp to keep that TU under the 500-LOC ceiling
24+
// after adding the request_received and body_chunk firing sites
25+
// (matches the TASK-046 split pattern that carved
26+
// webserver_callbacks_lifecycle.cpp out of webserver_callbacks.cpp).
27+
//
28+
// Also hosts the small anon-ns helper that wraps the body_chunk firing
29+
// site, both to keep the second-step orchestrator at CCN <= 10 and to
30+
// make the short-circuit teardown path single-sourced.
31+
32+
#include "httpserver/webserver.hpp"
33+
#include "httpserver/detail/webserver_impl.hpp"
34+
35+
#include <microhttpd.h>
36+
37+
#include <strings.h>
38+
39+
#include <chrono>
40+
#include <cstddef>
41+
#include <cstdint>
42+
#include <cstring>
43+
#include <iostream>
44+
#include <optional>
45+
#include <span>
46+
#include <string>
47+
#include <utility>
48+
49+
#include "httpserver/create_webserver.hpp"
50+
#include "httpserver/hook_action.hpp"
51+
#include "httpserver/hook_context.hpp"
52+
#include "httpserver/hook_phase.hpp"
53+
#include "httpserver/http_request.hpp"
54+
#include "httpserver/http_response.hpp"
55+
#include "httpserver/http_utils.hpp"
56+
#include "httpserver/detail/modded_request.hpp"
57+
58+
namespace httpserver {
59+
60+
using httpserver::http::http_utils;
61+
62+
namespace detail {
63+
64+
namespace {
65+
66+
// Wrap the body_chunk firing site so requests_answer_second_step stays a
67+
// flat sequence of small steps. Returns true iff a hook short-circuited
68+
// and the caller should signal MHD that the chunk was consumed (returns
69+
// MHD_YES with *upload_data_size = 0). Side effects on short-circuit:
70+
// - mr->response_ is populated with the hook-supplied response,
71+
// - mr->skip_handler is set so finalize_answer routes through the
72+
// skip branch,
73+
// - any in-flight MHD_PostProcessor is destroyed (32 KB buffer freed).
74+
bool fire_and_maybe_short_circuit_body_chunk(webserver_impl* impl,
75+
modded_request* mr,
76+
const char* upload_data,
77+
size_t upload_data_size) {
78+
::httpserver::body_chunk_ctx ctx{
79+
mr->dhr.get(),
80+
std::span<const std::byte>(
81+
reinterpret_cast<const std::byte*>(upload_data),
82+
upload_data_size),
83+
static_cast<std::uint64_t>(mr->dhr->get_content().size()),
84+
/*is_final=*/false};
85+
auto sc = impl->fire_body_chunk(ctx);
86+
if (!sc) return false;
87+
mr->response_.emplace(std::move(*sc));
88+
mr->skip_handler = true;
89+
if (mr->pp != nullptr) {
90+
MHD_destroy_post_processor(mr->pp);
91+
mr->pp = nullptr;
92+
}
93+
return true;
94+
}
95+
96+
// Feed @p upload_data through MHD's post processor (when one is
97+
// attached) and close any open upload-target stream. Pulled out of the
98+
// second_step orchestrator so the orchestrator stays under the CCN bar.
99+
void run_post_processor_if_attached(modded_request* mr,
100+
webserver* parent,
101+
const char* upload_data,
102+
size_t upload_data_size) {
103+
if (mr->pp == nullptr) return;
104+
mr->ws = parent;
105+
MHD_post_process(mr->pp, upload_data, upload_data_size);
106+
if (mr->upload_ostrm != nullptr && mr->upload_ostrm->is_open()) {
107+
mr->upload_ostrm->close();
108+
}
109+
}
110+
111+
} // namespace
112+
113+
MHD_Result webserver_impl::requests_answer_first_step(MHD_Connection* connection, struct detail::modded_request* mr) {
114+
mr->dhr.reset(new http_request(connection, parent->unescaper));
115+
mr->dhr->set_file_cleanup_callback(parent->file_cleanup_callback);
116+
117+
// TASK-047 -- request_received hook. Fires after the http_request is
118+
// populated but before any body bytes are read (and before any
119+
// post-processor is created). Mutable ref so a hook may adjust
120+
// per-request state. Short-circuit: stash the response, mark
121+
// skip-to-finalize, and return MHD_YES. MHD will call back into
122+
// requests_answer_second_step with *upload_data_size == 0, which
123+
// routes through complete_request -> finalize_answer, where the
124+
// skip_handler branch goes straight to materialize_and_queue_response.
125+
// No post-processor exists at this point, so no teardown is needed.
126+
if (any_hooks_[static_cast<std::size_t>(
127+
::httpserver::hook_phase::request_received)]
128+
.load(std::memory_order_relaxed)) {
129+
::httpserver::request_received_ctx ctx{
130+
mr->dhr.get(),
131+
std::chrono::steady_clock::now()};
132+
if (auto sc = fire_request_received(ctx)) {
133+
mr->response_.emplace(std::move(*sc));
134+
mr->skip_handler = true;
135+
return MHD_YES;
136+
}
137+
}
138+
139+
if (!mr->has_body) {
140+
return MHD_YES;
141+
}
142+
143+
mr->dhr->set_content_size_limit(parent->content_size_limit);
144+
const char *encoding = MHD_lookup_connection_value(connection, MHD_HEADER_KIND, http_utils::http_header_content_type);
145+
146+
if (parent->post_process_enabled &&
147+
(nullptr != encoding &&
148+
((0 == strncasecmp(http_utils::http_post_encoding_form_urlencoded, encoding, strlen(http_utils::http_post_encoding_form_urlencoded))) ||
149+
(0 == strncasecmp(http_utils::http_post_encoding_multipart_formdata, encoding, strlen(http_utils::http_post_encoding_multipart_formdata)))))) {
150+
const size_t post_memory_limit(32 * 1024); // Same as #MHD_POOL_SIZE_DEFAULT
151+
mr->pp = MHD_create_post_processor(connection, post_memory_limit, &webserver_impl::post_iterator, mr);
152+
} else {
153+
mr->pp = nullptr;
154+
}
155+
return MHD_YES;
156+
}
157+
158+
MHD_Result webserver_impl::requests_answer_second_step(MHD_Connection* connection, const char* method,
159+
const char* version, const char* upload_data,
160+
size_t* upload_data_size, struct detail::modded_request* mr) {
161+
if (0 == *upload_data_size) return complete_request(connection, mr, version, method);
162+
163+
if (!mr->has_body) {
164+
*upload_data_size = 0;
165+
return MHD_YES;
166+
}
167+
168+
// TASK-047 -- a prior pre-handler short-circuit (request_received in
169+
// first_step, or body_chunk on an earlier chunk) already populated
170+
// mr->response_. Consume the chunk so MHD advances; the next
171+
// *upload_data_size == 0 callback will route to finalize_answer's
172+
// skip_handler branch.
173+
if (mr->skip_handler) {
174+
*upload_data_size = 0;
175+
return MHD_YES;
176+
}
177+
178+
// TASK-047 -- body_chunk hook fires per chunk BEFORE the bytes are
179+
// appended to mr->dhr / fed to MHD_post_process.
180+
if (any_hooks_[static_cast<std::size_t>(
181+
::httpserver::hook_phase::body_chunk)]
182+
.load(std::memory_order_relaxed)) {
183+
if (fire_and_maybe_short_circuit_body_chunk(
184+
this, mr, upload_data, *upload_data_size)) {
185+
*upload_data_size = 0;
186+
return MHD_YES;
187+
}
188+
}
189+
190+
#ifdef DEBUG
191+
std::cout << "Writing content: " << std::string(upload_data, *upload_data_size) << std::endl;
192+
#endif // DEBUG
193+
// The post iterator is only created from the libmicrohttpd for content of type
194+
// multipart/form-data and application/x-www-form-urlencoded
195+
// all other content (which is indicated by mr-pp == nullptr)
196+
// has to be put to the content even if put_processed_data_to_content is set to false
197+
if (mr->pp == nullptr || parent->put_processed_data_to_content) {
198+
mr->dhr->grow_content(upload_data, *upload_data_size);
199+
}
200+
run_post_processor_if_attached(mr, parent, upload_data, *upload_data_size);
201+
202+
*upload_data_size = 0;
203+
return MHD_YES;
204+
}
205+
206+
} // namespace detail
207+
} // namespace httpserver

0 commit comments

Comments
 (0)