Skip to content

http_server: unify Fluent Bit HTTP listeners on flb_http_server#11538

Merged
edsiper merged 42 commits intomasterfrom
http-server-unified
Mar 12, 2026
Merged

http_server: unify Fluent Bit HTTP listeners on flb_http_server#11538
edsiper merged 42 commits intomasterfrom
http-server-unified

Conversation

@edsiper
Copy link
Copy Markdown
Member

@edsiper edsiper commented Mar 11, 2026

The first HTTP server implementations in Fluent Bit evolved as a mix of the Monkey HTTP server runtime and plugin-local standalone parsing logic. Later, when HTTP/2 support was added through nghttp2, that left callers with two different server paths to handle. We started addressing that some time ago by introducing a unified HTTP server interface, and this PR is another major step in that direction.

With this change, plugins and internal components that expose HTTP endpoints rely on a single internal server API instead of carrying duplicated protocol-specific logic. That reduces per-plugin complexity substantially, removes more than 4,000 lines of duplicated code, and moves the HTTP server closer to what it should be: a real internal API that hides transport and protocol details behind one consistent interface.

For short:

  • Unifies Fluent Bit’s HTTP listener implementations on flb_http_server across the built-in API server, HTTP input plugins, and Prometheus/Vivo exporters.
  • Adds shared core-owned http_server.* configuration handling and wires it through input and output integration points.
  • Removes duplicated legacy HTTP hosting paths from migrated plugins and internal listeners.
  • Adds internal and runtime regression coverage for built-in endpoints, protocol matrices, exporters, and connection-limit handling.
Property Type Default Scope Notes
http_server.http2 bool true listeners using flb_http_server true enables autodetect / HTTP/2-capable listener path, false forces HTTP/1.1 only
http_server.buffer_max_size size 10000K listeners using flb_http_server Maximum request buffer size accepted by the server
http_server.buffer_chunk_size size 10K config surface only for now Exposed as shared server config; currently mainly kept for compatibility/future internal use
http_server.max_connections int 0 listeners using flb_http_server 0 means unlimited; > 0 limits concurrent active connections

Compatibility aliases currently accepted for opted-in callers:

Alias Canonical Property
http2 http_server.http2
buffer_max_size http_server.buffer_max_size
buffer_chunk_size http_server.buffer_chunk_size
max_connections http_server.max_connections

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Unified, options-driven in-process HTTP server: per-instance configuration, HTTP/2, max connections, multi-worker support, worker lifecycle hooks, runtime endpoint registration, and higher-level response helpers.
  • Bug Fixes

    • Improved request normalization, reliable remote-address extraction, consistent default response headers, and consolidation of legacy connection handling for more stable HTTP behavior.
  • Tests

    • Expanded end-to-end HTTP tests and harness: protocol matrix (HTTP/1.1, HTTP/2), TLS, max-connections, and internal server/metrics/health validations.

edsiper added 11 commits March 11, 2026 11:22
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
@edsiper edsiper requested a review from cosmo0920 as a code owner March 11, 2026 17:31
@edsiper edsiper added this to the Fluent Bit v5.0 milestone Mar 11, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Unified in-process HTTP server introduced and integrated across core, server, APIs, and many plugins: new flb_http_server runtime, endpoint registration and utilities, per-instance http_server config plumbing, migration from legacy per-connection modules to the new server, and expanded tests and harnesses.

Changes

Cohort / File(s) Summary
Core HTTP server public headers
include/fluent-bit/http_server/flb_hs.h, include/fluent-bit/http_server/flb_http_server.h, include/fluent-bit/http_server/flb_hs_utils.h, include/fluent-bit/http_server/flb_http_server_config_map.h
Add new types, options/config structs, worker/runtime APIs, endpoint callback type, route registration API, and HTTP server config-map interface.
HTTP common & request helpers
include/fluent-bit/flb_http_common.h, src/flb_http_common.c
New forward decl struct flb_connection; add flb_http_request_get_remote_address() and flb_http_request_normalize(); ensure default Server and X-Http-Engine headers on commit.
Input / Output public interfaces
include/fluent-bit/flb_input.h, include/fluent-bit/flb_output.h, src/flb_input.c, src/flb_output.c
Add FLB_*_HTTP_SERVER flags, forward declare flb_http_server_config, add per-instance http_server_config/properties and config-map plumbing with validation hooks.
HTTP server implementation
src/http_server/flb_hs.c, src/http_server/flb_http_server.c, src/http_server/flb_http_server_config_map.c, src/http_server/flb_http_server_http1.c, src/http_server/flb_http_server_http2.c, src/http_server/flb_hs_utils.c
New server runtime with multi-worker runtime structs, route registry, endpoint registration API, request dispatch, buffer lifecycle helpers, normalization and per-protocol response enhancements, and options-based init/start/destroy.
API endpoints & utilities migration
src/http_server/api/v1/*, src/http_server/api/v2/*
Migrate endpoints to HS callback signature (struct flb_hs *, struct flb_http_request *, struct flb_http_response *), register via flb_hs_register_endpoint, and produce responses via flb_hs utilities; remove per-thread MQ plumbing.
Plugin migrations (inputs)
plugins/in_elasticsearch/*, plugins/in_http/*, plugins/in_opentelemetry/*, plugins/in_splunk/*, plugins/in_prometheus_remote_write/*
Remove legacy per-connection modules/collectors; plugins now initialize and start flb_http_server via options, set .cb_collect = NULL, opt into FLB_INPUT_HTTP_SERVER, and delete many _conn and bulk connection files.
Plugin migrations (outputs)
plugins/out_prometheus_exporter/*, plugins/out_vivo_exporter/*
Migrate exporters to use flb_http_server options and new request/response APIs; update create/start signatures and lifecycle wrappers.
Removed per-connection modules
plugins/*/*_conn.c, *_conn.h, *_bulk_conn.c, *_bulk_conn.h (various)
Deleted legacy per-connection buffer/connection management, parsers, and public headers across multiple input plugins.
Help integration
src/flb_help.c
Include HTTP server config-map entries in help payloads for inputs and outputs.
Build & tests
src/http_server/CMakeLists.txt, tests/internal/CMakeLists.txt, tests/internal/http_server.c, tests/runtime_shell/*, tests/runtime_shell/http_harness.sh
Add HTTP server source to build, new unit tests, a shell HTTP test harness, and numerous runtime-shell protocol matrix tests covering inputs/outputs and HTTP/HTTP2/TLS scenarios.
Misc / vendored
lib/monkey/CMakeLists.txt, lib/monkey/mk_server/mk_server.c
Bump Monkey patch version and add scheduler cleanup on MK_SCHED_SIGNAL_FREE_ALL.
Large internal refactors
plugins/*/*_prot.c, src/http_server/api/*, src/http_server/flb_hs.c
Many internal signature changes and control-flow shifts from Monkey per-connection callbacks to flb_hs route callbacks; careful review of endpoint registration, response commit, normalization, and buffer lifecycle recommended.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant FlbHS as "Fluent Bit HS"
    participant Router as "Route Registry"
    participant Handler as "Endpoint Handler"
    participant Response as "Response Builder"

    Client->>FlbHS: TCP/TLS HTTP request
    FlbHS->>FlbHS: parse request, flb_http_request_normalize(request)
    FlbHS->>Router: lookup route (path, match_type)
    Router-->>FlbHS: matched callback
    FlbHS->>Handler: invoke callback(hs, request, response)
    Handler->>Response: flb_hs_response_set_payload / set headers
    Response->>Response: ensure default headers (Server: "Fluent Bit", X-Http-Engine: "Monkey heritage")
    Response->>FlbHS: commit response
    FlbHS->>Client: send HTTP response
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested labels

docs-required

Suggested reviewers

  • fujimotos
  • koleini

Poem

🐇 I hopped through headers, routes in my paw,
I planted endpoints neat as a law.
Plugins now listen where routes softly sing,
Workers hum rhythms — servers take wing.
Hop, hop, hooray — the HTTP bells ring.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch http-server-unified

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
src/http_server/flb_http_server.c (1)

815-821: ⚠️ Potential issue | 🟠 Major

Don't advertise h2 when http2 is explicitly disabled.

When callers set http_server.http2 off, the option plumbing maps the listener to HTTP_PROTOCOL_VERSION_11, but Lines 815-821 still advertise h2 over ALPN for every TLS listener. A client can negotiate HTTP/2 during TLS setup and then hit an HTTP/1.1-only session. Build the ALPN list from session->protocol_version and omit h2 for forced HTTP/1.x listeners.

One way to derive ALPN from the configured protocol mode
-    if (session->tls_provider != NULL) {
-        result = flb_tls_set_alpn(session->tls_provider, "h2,http/1.0,http/1.1");
+    if (session->tls_provider != NULL) {
+        const char *alpn;
+
+        alpn = "h2,http/1.0,http/1.1";
+
+        if (session->protocol_version != HTTP_PROTOCOL_VERSION_AUTODETECT &&
+            session->protocol_version < HTTP_PROTOCOL_VERSION_20) {
+            alpn = "http/1.0,http/1.1";
+        }
+
+        result = flb_tls_set_alpn(session->tls_provider, alpn);
 
         if (result != 0) {
             return -1;
         }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/http_server/flb_http_server.c` around lines 815 - 821, The ALPN list
currently always advertises "h2" regardless of session->protocol_version; update
the TLS ALPN setup (the block calling flb_tls_set_alpn on session->tls_provider)
to construct the ALPN string based on session->protocol_version so that if
session->protocol_version == HTTP_PROTOCOL_VERSION_11 you omit "h2" and only
advertise the HTTP/1.x tokens (e.g., "http/1.1,http/1.0"), otherwise include
"h2" (e.g., "h2,http/1.1,http/1.0"); keep the existing flb_tls_set_alpn() call
and result/error handling but pass the dynamically chosen ALPN string.
src/flb_help.c (3)

548-550: ⚠️ Potential issue | 🟡 Minor

Duplicate options_size assignment.

Line 550 duplicates the assignment from line 548. This appears to be a copy-paste error.

🧹 Proposed fix
         config_map = flb_config_map_create(ins->config, ins->p->config_map);
         options_size = mk_list_size(config_map);

-        options_size = mk_list_size(config_map);
         if (ins->flags & FLB_OUTPUT_NET) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_help.c` around lines 548 - 550, Remove the duplicated assignment to
options_size: there are two consecutive calls assigning options_size =
mk_list_size(config_map); — keep a single assignment and delete the redundant
second one so options_size is computed exactly once (search for the symbols
options_size, config_map, and mk_list_size in flb_help.c to locate the
duplicate).

273-299: ⚠️ Potential issue | 🔴 Critical

Critical: config_map variable is overwritten, causing memory leak and incorrect help output.

The code has a serious bug where the config_map variable holding the plugin's config map (created at line 267) is overwritten by flb_http_server_get_config_map() at line 274. This causes:

  1. Memory leak: The original plugin config map is never destroyed
  2. Incorrect help output: Lines 295-299 iterate over config_map which now points to the HTTP server config (or a destroyed map), not the plugin's options
  3. Double free risk: The HTTP server config map is destroyed at both line 292 and line 299
🐛 Proposed fix using separate variable
     /* properties['options']: options exposed by the plugin */
     if (ins->p->config_map) {
+        struct mk_list *http_server_cm = NULL;
+        struct mk_list *plugin_config_map = NULL;
+
         flb_mp_map_header_append(&mh);
         pack_str(&mp_pck, "options");

-        config_map = flb_config_map_create(ins->config, ins->p->config_map);
-        options_size = mk_list_size(config_map);
+        plugin_config_map = flb_config_map_create(ins->config, ins->p->config_map);
+        options_size = mk_list_size(plugin_config_map);

         if ((ins->flags & (FLB_INPUT_NET | FLB_INPUT_NET_SERVER)) != 0) {
             options_size += 3;
         }
         if ((ins->flags & FLB_INPUT_HTTP_SERVER) != 0) {
-            config_map = flb_http_server_get_config_map(ins->config);
-            options_size += mk_list_size(config_map);
-            flb_config_map_destroy(config_map);
+            http_server_cm = flb_http_server_get_config_map(ins->config);
+            options_size += mk_list_size(http_server_cm);
+            flb_config_map_destroy(http_server_cm);
         }

         msgpack_pack_array(&mp_pck, options_size);

         if ((ins->flags & (FLB_INPUT_NET | FLB_INPUT_NET_SERVER)) != 0) {
             pack_config_map_entry(&mp_pck, &m_input_net_listen);
             pack_config_map_entry(&mp_pck, &m_input_net_host);
             pack_config_map_entry(&mp_pck, &m_input_net_port);
         }
         if ((ins->flags & FLB_INPUT_HTTP_SERVER) != 0) {
-            config_map = flb_http_server_get_config_map(ins->config);
-            mk_list_foreach(head, config_map) {
+            http_server_cm = flb_http_server_get_config_map(ins->config);
+            mk_list_foreach(head, http_server_cm) {
                 m = mk_list_entry(head, struct flb_config_map, _head);
                 pack_config_map_entry(&mp_pck, m);
             }
-            flb_config_map_destroy(config_map);
+            flb_config_map_destroy(http_server_cm);
         }

-        mk_list_foreach(head, config_map) {
+        mk_list_foreach(head, plugin_config_map) {
             m = mk_list_entry(head, struct flb_config_map, _head);
             pack_config_map_entry(&mp_pck, m);
         }
-        flb_config_map_destroy(config_map);
+        flb_config_map_destroy(plugin_config_map);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_help.c` around lines 273 - 299, The current code overwrites the
plugin's config_map with the HTTP server config_map causing leaks, wrong output
and double-free; fix it by using a separate variable (e.g., http_config_map) for
flb_http_server_get_config_map() calls, call mk_list_foreach and
pack_config_map_entry on the original plugin config_map for the plugin options
(where pack_config_map_entry is used), iterate http_config_map separately for
HTTP options, and ensure you call flb_config_map_destroy() once for each
distinct map (destroy plugin config_map and http_config_map independently) so no
map is leaked or freed twice.

554-590: ⚠️ Potential issue | 🔴 Critical

Critical: Same config_map overwrite bug exists in flb_help_output.

This function has the same critical bug as flb_help_input. The plugin's config_map (line 547) is overwritten at line 555 when FLB_OUTPUT_HTTP_SERVER is set, causing memory leaks, incorrect help output, and potential double-free.

🐛 Proposed fix using separate variable
     /* properties['options']: options exposed by the plugin */
     if (ins->p->config_map) {
+        struct mk_list *http_server_cm = NULL;
+        struct mk_list *plugin_config_map = NULL;
+
         flb_mp_map_header_append(&mh);
         pack_str(&mp_pck, "options");

-        config_map = flb_config_map_create(ins->config, ins->p->config_map);
-        options_size = mk_list_size(config_map);
+        plugin_config_map = flb_config_map_create(ins->config, ins->p->config_map);
+        options_size = mk_list_size(plugin_config_map);

-        options_size = mk_list_size(config_map);
         if (ins->flags & FLB_OUTPUT_NET) {
             options_size += 2;
         }
         if (ins->flags & FLB_OUTPUT_HTTP_SERVER) {
-            config_map = flb_http_server_get_config_map(ins->config);
-            options_size += mk_list_size(config_map);
-            flb_config_map_destroy(config_map);
+            http_server_cm = flb_http_server_get_config_map(ins->config);
+            options_size += mk_list_size(http_server_cm);
+            flb_config_map_destroy(http_server_cm);
         }
         if (ins->flags & FLB_IO_OPT_TLS) {
             tls_config = flb_tls_get_config_map(ins->config);
             options_size += mk_list_size(tls_config);
         }

         msgpack_pack_array(&mp_pck, options_size);

         if (ins->flags & FLB_OUTPUT_NET) {
             pack_config_map_entry(&mp_pck, &m_output_net_host);
             pack_config_map_entry(&mp_pck, &m_output_net_port);
         }
         if (ins->flags & FLB_OUTPUT_HTTP_SERVER) {
-            config_map = flb_http_server_get_config_map(ins->config);
-            mk_list_foreach(head, config_map) {
+            http_server_cm = flb_http_server_get_config_map(ins->config);
+            mk_list_foreach(head, http_server_cm) {
                 m = mk_list_entry(head, struct flb_config_map, _head);
                 pack_config_map_entry(&mp_pck, m);
             }
-            flb_config_map_destroy(config_map);
+            flb_config_map_destroy(http_server_cm);
         }
         if (ins->flags & FLB_IO_OPT_TLS) {
             mk_list_foreach(head, tls_config) {
                 m = mk_list_entry(head, struct flb_config_map, _head);
                 pack_config_map_entry(&mp_pck, m);
             }
             flb_config_map_destroy(tls_config);
         }

-        mk_list_foreach(head, config_map) {
+        mk_list_foreach(head, plugin_config_map) {
             m = mk_list_entry(head, struct flb_config_map, _head);
             pack_config_map_entry(&mp_pck, m);
         }
-        flb_config_map_destroy(config_map);
+        flb_config_map_destroy(plugin_config_map);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_help.c` around lines 554 - 590, flb_help_output currently reuses the
local variable config_map for the HTTP-server config map, overwriting the
plugin's original config_map and causing double-free/leak; fix by introducing
distinct locals (e.g., http_config_map and tls_config_map) for
flb_http_server_get_config_map() and flb_tls_get_config_map(), use those
variables in their respective mk_list_foreach loops and flb_config_map_destroy()
calls, and leave the original config_map (the plugin's config map used at the
final mk_list_foreach) untouched; update references to tls_config to
tls_config_map and ensure each created map is destroyed exactly once.
plugins/in_prometheus_remote_write/prom_rw.c (1)

7-11: ⚠️ Potential issue | 🟡 Minor

Typo in license header comments.

The license comments contain LICENSE-2.in_in instead of LICENSE-2.0. This appears to be a copy-paste or search-replace error.

📝 Proposed fix
-/*  Licensed under the Apache License, Version 2.in_in (the "License");
+/*  Licensed under the Apache License, Version 2.0 (the "License");
 ...
-*      http://www.apache.org/licenses/LICENSE-2.in_in
+*      http://www.apache.org/licenses/LICENSE-2.0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/in_prometheus_remote_write/prom_rw.c` around lines 7 - 11, Fix the
typo in the license header by replacing the incorrect token "Licensed under the
Apache License, Version 2.in_in (the \"License\");" (and any other occurrences
of "2.in_in" in the top-of-file comment) with the correct text "Licensed under
the Apache License, Version 2.0 (the \"License\");" so the header reads the
standard Apache License, Version 2.0.
src/flb_input.c (1)

444-451: ⚠️ Potential issue | 🟠 Major

Memory leak: http_server_config not freed on ring buffer allocation failure.

When flb_ring_buffer_create fails, the function frees instance but does not free instance->http_server_config, causing a memory leak.

🐛 Proposed fix
         /* allocate a ring buffer */
         instance->rb = flb_ring_buffer_create(instance->ring_buffer_size);
         if (!instance->rb) {
             flb_error("instance %s could not initialize ring buffer",
                       flb_input_name(instance));
+            flb_free(instance->http_server_config);
             flb_free(instance);
             return NULL;
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_input.c` around lines 444 - 451, The ring buffer allocation failure
path frees instance but skips freeing instance->http_server_config, leaking
memory; modify the error path in the block that calls flb_ring_buffer_create
(check instance->rb) to free instance->http_server_config (e.g., call the
appropriate free/destructor for http_server_config or
flb_free(instance->http_server_config)) before calling flb_free(instance) and
returning NULL so all allocated sub-structures are released (refer to symbols:
flb_ring_buffer_create, instance->rb, instance->http_server_config, flb_free).
plugins/in_splunk/splunk_prot.c (1)

688-693: ⚠️ Potential issue | 🟡 Minor

Missing error response for HTTP/1.1 without Host header.

When an HTTP/1.1 request lacks the required Host header, the function returns -1 without sending any response to the client. This leaves the client waiting without feedback.

🐛 Proposed fix
     /* HTTP/1.1 needs Host header */
     if (request->protocol_version == HTTP_PROTOCOL_VERSION_11 &&
         request->host == NULL) {
-
+        send_response_ng(response, 400, "error: Host header required for HTTP/1.1\n");
         return -1;
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/in_splunk/splunk_prot.c` around lines 688 - 693, The code currently
returns -1 when an HTTP/1.1 request lacks a Host header (check using
request->protocol_version == HTTP_PROTOCOL_VERSION_11 and request->host == NULL)
but does not send any response; update this branch so it sends a 400 Bad Request
to the client (with a short explanatory body or header like "Host header
required") using the project's existing HTTP response helper (or, if none
exists, write a minimal "HTTP/1.1 400 Bad Request" response) before returning -1
so the client receives feedback.
🧹 Nitpick comments (11)
tests/runtime_shell/http_harness.sh (1)

206-235: Consider parameterizing the TLS virtual host.

The hardcoded tls.vhost leo.vcap.me may be test-specific. If this needs to vary across test environments, consider making it a parameter. Otherwise, this is acceptable for the current test scope.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/runtime_shell/http_harness.sh` around lines 206 - 235, The
hmt_write_tls_config function currently hardcodes tls.vhost to "leo.vcap.me";
change the function signature to accept an additional parameter (e.g.,
tls_vhost) and use that variable in place of the hardcoded value, while
preserving backward compatibility by defaulting tls_vhost to "leo.vcap.me" when
empty; update callers that invoke hmt_write_tls_config to pass an explicit vhost
where needed (or omit to use the default), and ensure the heredoc uses the new
tls_vhost variable in the tls.vhost line so tests can vary the virtual host per
environment.
tests/runtime_shell/out_vivo_exporter_expect.sh (1)

97-101: Avoid pinning this test to the legacy engine label.

The check at Lines 97-101 makes X-HTTP-Engine: Monkey heritage part of the contract for a path this PR just migrated onto the unified HTTP server. If the header still matters for compatibility, assert that it exists instead of hard-coding the backend string.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/runtime_shell/out_vivo_exporter_expect.sh` around lines 97 - 101, The
test currently asserts the exact header value "X-HTTP-Engine: Monkey heritage"
which pins the contract to the legacy engine; update the check to only assert
the presence of the X-HTTP-Engine header key (case-insensitive) instead of the
specific value. Locate the block that greps "$logs_headers" and replace the
pattern '^x-http-engine: Monkey heritage' with a pattern that matches just the
header name (e.g., '^x-http-engine:'), or alternatively split the check to first
grep for the header key and only assert exact values in tests that are
explicitly about engine labels; use the existing variable logs_headers to
perform this presence check.
plugins/out_vivo_exporter/vivo_http.c (1)

376-383: Remove vivo_http_server_mq_push_metrics() or document why it must remain.

This function is dead code—no call sites exist in the codebase. Removing it is safer than maintaining a stub that silently drops data. If backward compatibility is required, add a comment explaining that intent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/out_vivo_exporter/vivo_http.c` around lines 376 - 383,
vivo_http_server_mq_push_metrics is an unused stub that silently drops data;
either delete the function to remove dead code or, if it must remain for
API/backwards-compat reasons, mark it appropriately and add a clear comment
explaining why it is retained (e.g., "kept for ABI compatibility; intentionally
no-op") and consider making it static or adding an explicit TODO/DEPRECATION
note; update any public headers or documentation accordingly so callers know the
behavior.
tests/runtime_shell/out_prometheus_exporter_expect.sh (2)

67-73: Inconsistent HTTP request approach.

Line 54 uses hmt_request_with_method for the root endpoint, but this block uses direct curl for the metrics endpoint. Consider using the harness helper consistently for maintainability, or add a comment explaining why direct curl is needed here (e.g., to capture headers in a specific format).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/runtime_shell/out_prometheus_exporter_expect.sh` around lines 67 - 73,
The test is inconsistently using a direct curl call to fetch /metrics while
other requests use the harness helper hmt_request_with_method; replace this curl
block with the harness helper or, if curl is required to capture raw headers,
add a comment explaining why. Specifically, update the code that sets
metrics_status (currently using curl with variables header_file, metrics_body,
port) to call hmt_request_with_method("/metrics", "GET", ...) and capture
status/body/headers the same way as other tests, or document why
metrics_body/header_file need raw curl output before returning 1 on non-200
responses.

52-52: Hardcoded sleep could cause test flakiness.

The sleep 2 after waiting for the log message might be insufficient in slow CI environments or excessive in fast ones. Consider using a more dynamic wait approach similar to hmt_wait_for_log if metrics need time to populate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/runtime_shell/out_prometheus_exporter_expect.sh` at line 52, Replace
the hardcoded "sleep 2" with a dynamic wait: remove the fixed sleep in the
script and instead use the existing hmt_wait_for_log helper (or implement a
short polling loop) to wait until the Prometheus exporter metrics are actually
available (e.g., poll the exporter endpoint with curl and check for the expected
metric name or status code), timing out after a reasonable interval; update the
script where "sleep 2" appears and reference hmt_wait_for_log or the new polling
logic so tests no longer rely on a fixed delay.
plugins/in_prometheus_remote_write/prom_rw_config.c (1)

58-66: Consider adding NULL check for defensive coding.

The prom_rw_config_destroy function dereferences ctx without a NULL check. While callers are expected to pass valid pointers, a defensive NULL check would prevent crashes if the destroy function is called during error paths where ctx might be NULL.

🛡️ Proposed defensive check
 int prom_rw_config_destroy(struct flb_prom_remote_write *ctx)
 {
+    if (ctx == NULL) {
+        return 0;
+    }
+
     flb_http_server_destroy(&ctx->http_server);
 
     flb_free(ctx->listen);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/in_prometheus_remote_write/prom_rw_config.c` around lines 58 - 66,
The prom_rw_config_destroy function currently dereferences ctx (calling
flb_http_server_destroy(&ctx->http_server) and freeing members) without checking
for NULL; add a defensive NULL check at the top of prom_rw_config_destroy to
return early if ctx is NULL, and only call flb_http_server_destroy and flb_free
on ctx->listen, ctx->tcp_port and ctx itself when ctx is non-NULL (optionally
ensuring each member is non-NULL before freeing). This change should be applied
inside prom_rw_config_destroy in prom_rw_config.c.
src/http_server/flb_hs.c (2)

314-314: atoi does not validate port string.

Using atoi for port conversion provides no error detection for invalid input (non-numeric strings return 0). Consider using strtol with error checking, or ensure the port is validated upstream before reaching this code.

♻️ Suggested improvement
-    options.port = atoi(tcp_port);
+    char *endptr;
+    long port = strtol(tcp_port, &endptr, 10);
+    if (*endptr != '\0' || port <= 0 || port > 65535) {
+        flb_error("[http_server] invalid port: %s", tcp_port);
+        flb_free(hs);
+        return NULL;
+    }
+    options.port = (int) port;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/http_server/flb_hs.c` at line 314, Replace the unsafe atoi conversion for
tcp_port with a validated strtol-based parse: use strtol(tcp_port, &endptr, 10),
check errno for ERANGE, ensure endptr consumed the whole string (no non-digit
chars), and verify the parsed value is within valid port range (1–65535) before
assigning to options.port; on parse/validation failure, log a clear error and
return/fail early instead of setting options.port to 0.

145-145: Route path is stored without copying - relies on caller providing stable string.

The route->path is assigned directly from the path parameter without copying. This is safe if all callers pass string literals (as they currently do), but could cause use-after-free if a dynamically allocated string is passed and later freed. Consider documenting this assumption or using flb_sds_create to copy the path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/http_server/flb_hs.c` at line 145, Route path is currently assigned
directly (route->path = path) which assumes the caller keeps the string alive;
change this to make an owned copy using flb_sds_create(path) when initializing
the route (replace the direct assignment in the route creation code in
flb_hs.c), check the return for NULL and handle allocation failure, and ensure
the route teardown code frees the SDS (use flb_sds_destroy or the
project-equivalent) so the route owns its path; alternatively, if you prefer not
to copy, add a clear comment/documentation at the route creation site
documenting the requirement that callers must supply a stable (e.g., literal)
string.
plugins/in_splunk/splunk_prot.c (1)

574-581: Unreachable code detected.

The return SPLUNK_AUTH_SUCCESS; at line 580 is unreachable. The control flow either returns at line 574 (SPLUNK_AUTH_UNAUTHORIZED) or line 577 (SPLUNK_AUTH_MISSING_CRED), making the final return statement dead code.

♻️ Proposed cleanup
         }
 
         return SPLUNK_AUTH_UNAUTHORIZED;
     }
-    else {
-        return SPLUNK_AUTH_MISSING_CRED;
-    }
 
-    return SPLUNK_AUTH_SUCCESS;
+    return SPLUNK_AUTH_MISSING_CRED;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/in_splunk/splunk_prot.c` around lines 574 - 581, The final "return
SPLUNK_AUTH_SUCCESS;" is unreachable because the preceding branches always
return SPLUNK_AUTH_UNAUTHORIZED or SPLUNK_AUTH_MISSING_CRED; remove the dead
return or restructure the function so a success path actually reaches a return
of SPLUNK_AUTH_SUCCESS (update the branch logic that currently returns
SPLUNK_AUTH_UNAUTHORIZED / SPLUNK_AUTH_MISSING_CRED), referencing the constants
SPLUNK_AUTH_UNAUTHORIZED, SPLUNK_AUTH_MISSING_CRED and SPLUNK_AUTH_SUCCESS to
locate the code to change.
src/http_server/api/v1/health.c (2)

146-151: Redundant (void) hs cast after actual usage.

The (void) hs; at line 149 is misleading because hs is actually used at line 147 in is_healthy(hs). This cast typically suppresses "unused parameter" warnings, but here the parameter is used.

♻️ Remove redundant cast
 static int cb_health(struct flb_hs *hs,
                      struct flb_http_request *request,
                      struct flb_http_response *response)
 {
     int ret;
     int status = is_healthy(hs);
 
-    (void) hs;
     (void) request;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/http_server/api/v1/health.c` around lines 146 - 151, The cast (void) hs
is incorrect because hs is used in is_healthy(hs); remove the redundant line
"(void) hs;" and keep any legitimate unused-parameter suppression such as
"(void) request;" if request is unused; verify that is_healthy(hs) call and the
variables ret and status remain unchanged and compile cleanly.

171-174: Consider removing empty function or documenting intent.

The flb_hs_health_destroy function is now a no-op. If health resource cleanup is now handled by the hs lifecycle, consider either removing this function entirely (if no external callers remain) or adding a brief comment explaining why it's intentionally empty.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/http_server/api/v1/health.c` around lines 171 - 174, The
flb_hs_health_destroy function is a no-op; either delete this function if there
are no remaining external callers or keep it but add a one-line comment
explaining why it's intentionally empty (e.g., "cleanup handled by hs
lifecycle") to avoid confusion; if you remove it, also remove or update any
references/calls to flb_hs_health_destroy to prevent linker errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/in_http/http_prot.c`:
- Around line 66-74: The content_type_is_urlencoded function only matches the
exact string and should accept parameterized types (e.g.,
"application/x-www-form-urlencoded; charset=UTF-8"); modify
content_type_is_urlencoded to ignore any parameters by locating the first ';'
(if present), trimming trailing whitespace on the media-type portion, and then
performing a case-insensitive comparison of that media-type against
"application/x-www-form-urlencoded" (use strcasecmp or strncasecmp on the
truncated portion) so requests with parameters are accepted.

In `@plugins/in_prometheus_remote_write/prom_rw_prot.c`:
- Around line 71-76: When flb_input_metrics_append(...) fails after a successful
decode (i.e., after cmt_decode_prometheus_remote_write_destroy(context) is
called), treat this as a server-side ingest error and return a 5xx response
rather than collapsing it into the malformed-payload path; update the branch
that checks the result of flb_input_metrics_append (and the analogous branch
around the second occurrence cited) to log the error and return an appropriate
5xx status (or propagate a server error code) instead of returning 400, while
still returning 400 only when the decode itself fails.

In `@plugins/out_prometheus_exporter/prom_http.c`:
- Around line 53-63: The current handler copies ph->metrics_payload and returns
404 when payload==NULL; change this so that when ph->metrics_payload is NULL (no
snapshot yet) the code returns HTTP 200 with an empty Prometheus payload instead
of 404, while still treating actual allocation/clone failures from
cfl_sds_create_len as 500. Concretely, in the block around pthread_mutex_lock /
ph->metrics_payload / cfl_sds_create_len, detect a NULL original payload and
call flb_http_response_set_status(response, 200) and
flb_http_response_commit(response) with an empty body (or an empty SDS) to keep
/metrics scrapeable; only if cfl_sds_create_len fails after a non-NULL source
should you return flb_http_response_set_status(response, 500) and commit. Keep
the mutex usage and existing calls to flb_http_response_commit and
flb_http_response_set_status (and reference functions cfl_sds_create_len,
flb_http_response_set_status, flb_http_response_commit, and
ph->metrics_payload).

In `@src/flb_http_common.c`:
- Around line 998-1017: The commit path currently forces default identity
headers by calling flb_http_response_get_header and then
flb_http_response_set_header for "server" and "x-http-engine", which prevents
flb_http_response_unset_header from taking effect; remove (or disable) the
flb_http_response_set_header calls in this shared commit path (the block that
inspects "server" and "x-http-engine") so headers are not re-inserted here, and
instead set any default values in the specific listener/exporter code that owns
that policy; references: flb_http_response_get_header,
flb_http_response_set_header, flb_http_response_unset_header, and the
surrounding commit logic in flb_http_common.c.

In `@src/http_server/api/v1/metrics.c`:
- Around line 429-435: The call to flb_hs_register_endpoint for
"/api/v1/metrics/prometheus" inside api_v1_metrics ignores its return value so a
failure can be masked; capture the return (e.g., int ret =
flb_hs_register_endpoint(..., cb_metrics_prometheus)), check it for error and
return it immediately on failure before attempting to register the second
endpoint (cb_metrics), otherwise proceed to register and return the second
call's result.

In `@src/http_server/api/v1/trace.c`:
- Around line 599-603: The DELETE branch omits the msgpack map wrapper causing
malformed output; in the branch that calls disable_trace_input(hs, input_name,
input_nlen) you should pack a map header before packing fields — use
msgpack_pack_map(&mp_pck, 2) (matching the POST/error branches) and then call
msgpack_pack_str_with_body(&mp_pck, HTTP_FIELD_STATUS, HTTP_FIELD_STATUS_LEN)
and msgpack_pack_str_with_body(&mp_pck, HTTP_RESULT_OK, HTTP_RESULT_OK_LEN) so
the response is properly wrapped.

In `@src/http_server/api/v2/metrics.c`:
- Around line 61-79: Move the buffer usage increment so the buffer is protected
before any access: increment buf->users before reading buf->raw_data and before
calling cmt_encode_prometheus_create (and the analogous cmt_encode_* call in the
other handler around lines 99-117), then decrement buf->users after the response
payload is set and the encoder payload destroyed; specifically adjust the
sequence around the cmt variable assignment and cmt_encode_prometheus_create so
buf->users++ precedes those calls and buf->users-- remains after
cmt_encode_prometheus_destroy and return.
- Around line 123-126: The first flb_hs_register_endpoint call for
"/api/v2/metrics/prometheus" currently ignores its return value; change the code
to capture and check that return (from flb_hs_register_endpoint) and if it
indicates failure return that error immediately instead of proceeding to
register "/api/v2/metrics"; ensure you still call flb_hs_register_endpoint for
cb_metrics only if the prometheus registration succeeded and propagate any error
from either registration back to the caller.

In `@src/http_server/api/v2/reload.c`:
- Around line 120-122: The code is returning HTTP 400 for internal JSON
serialization failures; update those branches to return HTTP 500 instead. Locate
the branches that call flb_http_response_set_status(response, 400) followed by
return flb_http_response_commit(response) (the two occurrences around the JSON
encoding/error handling in reload.c) and change the status code passed to
flb_http_response_set_status to 500 so internal server errors are reported
correctly while still committing the response with
flb_http_response_commit(response).

In `@src/http_server/flb_hs_utils.c`:
- Around line 78-99: The call to flb_hs_response_set_content_type in
flb_hs_response_set_payload is not checked; capture its return value and if it
returns -1, propagate an error (return -1) immediately instead of continuing to
call flb_http_response_set_body and flb_http_response_commit. Modify
flb_hs_response_set_payload to call flb_hs_response_set_content_type, check for
-1, and return on error so the response is not committed without a valid
Content-Type.

In `@src/http_server/flb_http_server.c`:
- Around line 793-796: When options->workers > 1 you set options->reuse_port and
force options->use_caller_event_loop = FLB_FALSE, but later the code immediately
rejects any non-caller-context server, leaving the worker/runtime API unusable;
either invoke the runtime start path for multi-worker mode here (i.e., call the
server runtime start routine for the new worker API immediately after setting
options->workers/reuse_port/use_caller_event_loop) or do not flip
options->use_caller_event_loop to FLB_FALSE (and keep the worker-related fields
private) so the rejection code for non-caller-context servers no longer breaks
startup; update the logic around the non-caller-context rejection to only reject
when the runtime start path was not taken, referencing options->workers,
options->reuse_port, options->use_caller_event_loop and the non-caller-context
rejection check so the chosen wiring is consistent.

In `@tests/internal/http_server.c`:
- Around line 170-183: The test calls flb_http_server_init_with_options(&server,
&options) but never tears the server down; insert a call to
flb_http_server_destroy(&server) after the TEST_CHECK assertions and before
test_http_server_context_destroy(&context) (mirroring
test_http_server_options_multi_worker_magic) so the server is properly
destroyed; ensure you use the same server variable passed to
flb_http_server_init_with_options and keep the existing order of cleanup
(destroy server, then test_http_server_context_destroy, then flb_config_exit).

---

Outside diff comments:
In `@plugins/in_prometheus_remote_write/prom_rw.c`:
- Around line 7-11: Fix the typo in the license header by replacing the
incorrect token "Licensed under the Apache License, Version 2.in_in (the
\"License\");" (and any other occurrences of "2.in_in" in the top-of-file
comment) with the correct text "Licensed under the Apache License, Version 2.0
(the \"License\");" so the header reads the standard Apache License, Version
2.0.

In `@plugins/in_splunk/splunk_prot.c`:
- Around line 688-693: The code currently returns -1 when an HTTP/1.1 request
lacks a Host header (check using request->protocol_version ==
HTTP_PROTOCOL_VERSION_11 and request->host == NULL) but does not send any
response; update this branch so it sends a 400 Bad Request to the client (with a
short explanatory body or header like "Host header required") using the
project's existing HTTP response helper (or, if none exists, write a minimal
"HTTP/1.1 400 Bad Request" response) before returning -1 so the client receives
feedback.

In `@src/flb_help.c`:
- Around line 548-550: Remove the duplicated assignment to options_size: there
are two consecutive calls assigning options_size = mk_list_size(config_map); —
keep a single assignment and delete the redundant second one so options_size is
computed exactly once (search for the symbols options_size, config_map, and
mk_list_size in flb_help.c to locate the duplicate).
- Around line 273-299: The current code overwrites the plugin's config_map with
the HTTP server config_map causing leaks, wrong output and double-free; fix it
by using a separate variable (e.g., http_config_map) for
flb_http_server_get_config_map() calls, call mk_list_foreach and
pack_config_map_entry on the original plugin config_map for the plugin options
(where pack_config_map_entry is used), iterate http_config_map separately for
HTTP options, and ensure you call flb_config_map_destroy() once for each
distinct map (destroy plugin config_map and http_config_map independently) so no
map is leaked or freed twice.
- Around line 554-590: flb_help_output currently reuses the local variable
config_map for the HTTP-server config map, overwriting the plugin's original
config_map and causing double-free/leak; fix by introducing distinct locals
(e.g., http_config_map and tls_config_map) for flb_http_server_get_config_map()
and flb_tls_get_config_map(), use those variables in their respective
mk_list_foreach loops and flb_config_map_destroy() calls, and leave the original
config_map (the plugin's config map used at the final mk_list_foreach)
untouched; update references to tls_config to tls_config_map and ensure each
created map is destroyed exactly once.

In `@src/flb_input.c`:
- Around line 444-451: The ring buffer allocation failure path frees instance
but skips freeing instance->http_server_config, leaking memory; modify the error
path in the block that calls flb_ring_buffer_create (check instance->rb) to free
instance->http_server_config (e.g., call the appropriate free/destructor for
http_server_config or flb_free(instance->http_server_config)) before calling
flb_free(instance) and returning NULL so all allocated sub-structures are
released (refer to symbols: flb_ring_buffer_create, instance->rb,
instance->http_server_config, flb_free).

In `@src/http_server/flb_http_server.c`:
- Around line 815-821: The ALPN list currently always advertises "h2" regardless
of session->protocol_version; update the TLS ALPN setup (the block calling
flb_tls_set_alpn on session->tls_provider) to construct the ALPN string based on
session->protocol_version so that if session->protocol_version ==
HTTP_PROTOCOL_VERSION_11 you omit "h2" and only advertise the HTTP/1.x tokens
(e.g., "http/1.1,http/1.0"), otherwise include "h2" (e.g.,
"h2,http/1.1,http/1.0"); keep the existing flb_tls_set_alpn() call and
result/error handling but pass the dynamically chosen ALPN string.

---

Nitpick comments:
In `@plugins/in_prometheus_remote_write/prom_rw_config.c`:
- Around line 58-66: The prom_rw_config_destroy function currently dereferences
ctx (calling flb_http_server_destroy(&ctx->http_server) and freeing members)
without checking for NULL; add a defensive NULL check at the top of
prom_rw_config_destroy to return early if ctx is NULL, and only call
flb_http_server_destroy and flb_free on ctx->listen, ctx->tcp_port and ctx
itself when ctx is non-NULL (optionally ensuring each member is non-NULL before
freeing). This change should be applied inside prom_rw_config_destroy in
prom_rw_config.c.

In `@plugins/in_splunk/splunk_prot.c`:
- Around line 574-581: The final "return SPLUNK_AUTH_SUCCESS;" is unreachable
because the preceding branches always return SPLUNK_AUTH_UNAUTHORIZED or
SPLUNK_AUTH_MISSING_CRED; remove the dead return or restructure the function so
a success path actually reaches a return of SPLUNK_AUTH_SUCCESS (update the
branch logic that currently returns SPLUNK_AUTH_UNAUTHORIZED /
SPLUNK_AUTH_MISSING_CRED), referencing the constants SPLUNK_AUTH_UNAUTHORIZED,
SPLUNK_AUTH_MISSING_CRED and SPLUNK_AUTH_SUCCESS to locate the code to change.

In `@plugins/out_vivo_exporter/vivo_http.c`:
- Around line 376-383: vivo_http_server_mq_push_metrics is an unused stub that
silently drops data; either delete the function to remove dead code or, if it
must remain for API/backwards-compat reasons, mark it appropriately and add a
clear comment explaining why it is retained (e.g., "kept for ABI compatibility;
intentionally no-op") and consider making it static or adding an explicit
TODO/DEPRECATION note; update any public headers or documentation accordingly so
callers know the behavior.

In `@src/http_server/api/v1/health.c`:
- Around line 146-151: The cast (void) hs is incorrect because hs is used in
is_healthy(hs); remove the redundant line "(void) hs;" and keep any legitimate
unused-parameter suppression such as "(void) request;" if request is unused;
verify that is_healthy(hs) call and the variables ret and status remain
unchanged and compile cleanly.
- Around line 171-174: The flb_hs_health_destroy function is a no-op; either
delete this function if there are no remaining external callers or keep it but
add a one-line comment explaining why it's intentionally empty (e.g., "cleanup
handled by hs lifecycle") to avoid confusion; if you remove it, also remove or
update any references/calls to flb_hs_health_destroy to prevent linker errors.

In `@src/http_server/flb_hs.c`:
- Line 314: Replace the unsafe atoi conversion for tcp_port with a validated
strtol-based parse: use strtol(tcp_port, &endptr, 10), check errno for ERANGE,
ensure endptr consumed the whole string (no non-digit chars), and verify the
parsed value is within valid port range (1–65535) before assigning to
options.port; on parse/validation failure, log a clear error and return/fail
early instead of setting options.port to 0.
- Line 145: Route path is currently assigned directly (route->path = path) which
assumes the caller keeps the string alive; change this to make an owned copy
using flb_sds_create(path) when initializing the route (replace the direct
assignment in the route creation code in flb_hs.c), check the return for NULL
and handle allocation failure, and ensure the route teardown code frees the SDS
(use flb_sds_destroy or the project-equivalent) so the route owns its path;
alternatively, if you prefer not to copy, add a clear comment/documentation at
the route creation site documenting the requirement that callers must supply a
stable (e.g., literal) string.

In `@tests/runtime_shell/http_harness.sh`:
- Around line 206-235: The hmt_write_tls_config function currently hardcodes
tls.vhost to "leo.vcap.me"; change the function signature to accept an
additional parameter (e.g., tls_vhost) and use that variable in place of the
hardcoded value, while preserving backward compatibility by defaulting tls_vhost
to "leo.vcap.me" when empty; update callers that invoke hmt_write_tls_config to
pass an explicit vhost where needed (or omit to use the default), and ensure the
heredoc uses the new tls_vhost variable in the tls.vhost line so tests can vary
the virtual host per environment.

In `@tests/runtime_shell/out_prometheus_exporter_expect.sh`:
- Around line 67-73: The test is inconsistently using a direct curl call to
fetch /metrics while other requests use the harness helper
hmt_request_with_method; replace this curl block with the harness helper or, if
curl is required to capture raw headers, add a comment explaining why.
Specifically, update the code that sets metrics_status (currently using curl
with variables header_file, metrics_body, port) to call
hmt_request_with_method("/metrics", "GET", ...) and capture status/body/headers
the same way as other tests, or document why metrics_body/header_file need raw
curl output before returning 1 on non-200 responses.
- Line 52: Replace the hardcoded "sleep 2" with a dynamic wait: remove the fixed
sleep in the script and instead use the existing hmt_wait_for_log helper (or
implement a short polling loop) to wait until the Prometheus exporter metrics
are actually available (e.g., poll the exporter endpoint with curl and check for
the expected metric name or status code), timing out after a reasonable
interval; update the script where "sleep 2" appears and reference
hmt_wait_for_log or the new polling logic so tests no longer rely on a fixed
delay.

In `@tests/runtime_shell/out_vivo_exporter_expect.sh`:
- Around line 97-101: The test currently asserts the exact header value
"X-HTTP-Engine: Monkey heritage" which pins the contract to the legacy engine;
update the check to only assert the presence of the X-HTTP-Engine header key
(case-insensitive) instead of the specific value. Locate the block that greps
"$logs_headers" and replace the pattern '^x-http-engine: Monkey heritage' with a
pattern that matches just the header name (e.g., '^x-http-engine:'), or
alternatively split the check to first grep for the header key and only assert
exact values in tests that are explicitly about engine labels; use the existing
variable logs_headers to perform this presence check.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: abee6a80-96e3-4344-8dbd-0eee8bc594a4

📥 Commits

Reviewing files that changed from the base of the PR and between a1d9c2a and 8ee77e3.

📒 Files selected for processing (84)
  • include/fluent-bit/flb_http_common.h
  • include/fluent-bit/flb_input.h
  • include/fluent-bit/flb_output.h
  • include/fluent-bit/http_server/flb_hs.h
  • include/fluent-bit/http_server/flb_hs_utils.h
  • include/fluent-bit/http_server/flb_http_server.h
  • include/fluent-bit/http_server/flb_http_server_config_map.h
  • plugins/in_elasticsearch/CMakeLists.txt
  • plugins/in_elasticsearch/in_elasticsearch.c
  • plugins/in_elasticsearch/in_elasticsearch.h
  • plugins/in_elasticsearch/in_elasticsearch_bulk_conn.c
  • plugins/in_elasticsearch/in_elasticsearch_bulk_conn.h
  • plugins/in_elasticsearch/in_elasticsearch_bulk_prot.c
  • plugins/in_elasticsearch/in_elasticsearch_bulk_prot.h
  • plugins/in_elasticsearch/in_elasticsearch_config.c
  • plugins/in_http/CMakeLists.txt
  • plugins/in_http/http.c
  • plugins/in_http/http.h
  • plugins/in_http/http_config.c
  • plugins/in_http/http_prot.c
  • plugins/in_http/http_prot.h
  • plugins/in_opentelemetry/CMakeLists.txt
  • plugins/in_opentelemetry/http_conn.c
  • plugins/in_opentelemetry/http_conn.h
  • plugins/in_opentelemetry/opentelemetry.c
  • plugins/in_opentelemetry/opentelemetry.h
  • plugins/in_opentelemetry/opentelemetry_config.c
  • plugins/in_opentelemetry/opentelemetry_prot.c
  • plugins/in_opentelemetry/opentelemetry_prot.h
  • plugins/in_prometheus_remote_write/CMakeLists.txt
  • plugins/in_prometheus_remote_write/prom_rw.c
  • plugins/in_prometheus_remote_write/prom_rw.h
  • plugins/in_prometheus_remote_write/prom_rw_config.c
  • plugins/in_prometheus_remote_write/prom_rw_conn.c
  • plugins/in_prometheus_remote_write/prom_rw_conn.h
  • plugins/in_prometheus_remote_write/prom_rw_prot.c
  • plugins/in_prometheus_remote_write/prom_rw_prot.h
  • plugins/in_splunk/CMakeLists.txt
  • plugins/in_splunk/splunk.c
  • plugins/in_splunk/splunk.h
  • plugins/in_splunk/splunk_config.c
  • plugins/in_splunk/splunk_conn.c
  • plugins/in_splunk/splunk_conn.h
  • plugins/in_splunk/splunk_prot.c
  • plugins/in_splunk/splunk_prot.h
  • plugins/out_prometheus_exporter/prom.c
  • plugins/out_prometheus_exporter/prom_http.c
  • plugins/out_prometheus_exporter/prom_http.h
  • plugins/out_vivo_exporter/vivo.c
  • plugins/out_vivo_exporter/vivo_http.c
  • plugins/out_vivo_exporter/vivo_http.h
  • src/flb_help.c
  • src/flb_http_common.c
  • src/flb_input.c
  • src/flb_output.c
  • src/http_server/CMakeLists.txt
  • src/http_server/api/v1/health.c
  • src/http_server/api/v1/health.h
  • src/http_server/api/v1/metrics.c
  • src/http_server/api/v1/plugins.c
  • src/http_server/api/v1/storage.c
  • src/http_server/api/v1/trace.c
  • src/http_server/api/v1/uptime.c
  • src/http_server/api/v2/metrics.c
  • src/http_server/api/v2/reload.c
  • src/http_server/flb_hs.c
  • src/http_server/flb_hs_utils.c
  • src/http_server/flb_http_server.c
  • src/http_server/flb_http_server_config_map.c
  • src/http_server/flb_http_server_http1.c
  • src/http_server/flb_http_server_http2.c
  • tests/internal/CMakeLists.txt
  • tests/internal/http_server.c
  • tests/runtime_shell/CMakeLists.txt
  • tests/runtime_shell/http_harness.sh
  • tests/runtime_shell/in_elasticsearch_matrix_expect.sh
  • tests/runtime_shell/in_http_matrix_expect.sh
  • tests/runtime_shell/in_http_max_connections_expect.sh
  • tests/runtime_shell/in_opentelemetry_matrix_expect.sh
  • tests/runtime_shell/in_prometheus_remote_write_matrix_expect.sh
  • tests/runtime_shell/in_splunk_matrix_expect.sh
  • tests/runtime_shell/internal_http_server_expect.sh
  • tests/runtime_shell/out_prometheus_exporter_expect.sh
  • tests/runtime_shell/out_vivo_exporter_expect.sh
💤 Files with no reviewable changes (24)
  • plugins/in_http/CMakeLists.txt
  • plugins/in_elasticsearch/in_elasticsearch.h
  • plugins/in_opentelemetry/http_conn.h
  • plugins/in_http/http_prot.h
  • plugins/in_prometheus_remote_write/CMakeLists.txt
  • plugins/in_opentelemetry/opentelemetry_prot.h
  • plugins/in_splunk/splunk_prot.h
  • plugins/in_splunk/CMakeLists.txt
  • plugins/in_elasticsearch/in_elasticsearch_bulk_conn.c
  • plugins/in_elasticsearch/in_elasticsearch_bulk_conn.h
  • plugins/in_opentelemetry/opentelemetry.h
  • plugins/in_prometheus_remote_write/prom_rw_prot.h
  • plugins/in_opentelemetry/CMakeLists.txt
  • plugins/in_elasticsearch/CMakeLists.txt
  • plugins/in_elasticsearch/in_elasticsearch_bulk_prot.h
  • plugins/in_splunk/splunk_conn.h
  • plugins/in_prometheus_remote_write/prom_rw_conn.c
  • plugins/in_prometheus_remote_write/prom_rw_conn.h
  • plugins/in_splunk/splunk_conn.c
  • plugins/in_prometheus_remote_write/prom_rw.h
  • plugins/in_opentelemetry/http_conn.c
  • plugins/in_http/http.h
  • plugins/in_splunk/splunk.h
  • plugins/in_opentelemetry/opentelemetry_prot.c

Comment on lines +998 to +1017
char *server_header;

version = flb_http_response_get_version(response);

server_header = flb_http_response_get_header(response, "server");
if (server_header == NULL) {
flb_http_response_set_header(response,
"server",
strlen("server"),
"Fluent Bit",
strlen("Fluent Bit"));
}

if (flb_http_response_get_header(response, "x-http-engine") == NULL) {
flb_http_response_set_header(response,
"x-http-engine",
strlen("x-http-engine"),
"Monkey heritage",
strlen("Monkey heritage"));
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't force identity headers in the shared response commit path.

This makes flb_http_response_unset_header() ineffective for server and x-http-engine, and it changes wire-visible behavior for every migrated listener/exporter. If some listeners still want these defaults, set them where that policy is owned instead of re-inserting them here.

Suggested minimal fix
 int flb_http_response_commit(struct flb_http_response *response)
 {
     int len;
     char tmp[64];
     int version;
-    char *server_header;
 
     version = flb_http_response_get_version(response);
-
-    server_header = flb_http_response_get_header(response, "server");
-    if (server_header == NULL) {
-        flb_http_response_set_header(response,
-                                     "server",
-                                     strlen("server"),
-                                     "Fluent Bit",
-                                     strlen("Fluent Bit"));
-    }
-
-    if (flb_http_response_get_header(response, "x-http-engine") == NULL) {
-        flb_http_response_set_header(response,
-                                     "x-http-engine",
-                                     strlen("x-http-engine"),
-                                     "Monkey heritage",
-                                     strlen("Monkey heritage"));
-    }
 
     if (response->body == NULL) {
         flb_http_response_set_header(response,
                                      "content-length",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
char *server_header;
version = flb_http_response_get_version(response);
server_header = flb_http_response_get_header(response, "server");
if (server_header == NULL) {
flb_http_response_set_header(response,
"server",
strlen("server"),
"Fluent Bit",
strlen("Fluent Bit"));
}
if (flb_http_response_get_header(response, "x-http-engine") == NULL) {
flb_http_response_set_header(response,
"x-http-engine",
strlen("x-http-engine"),
"Monkey heritage",
strlen("Monkey heritage"));
}
version = flb_http_response_get_version(response);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_http_common.c` around lines 998 - 1017, The commit path currently
forces default identity headers by calling flb_http_response_get_header and then
flb_http_response_set_header for "server" and "x-http-engine", which prevents
flb_http_response_unset_header from taking effect; remove (or disable) the
flb_http_response_set_header calls in this shared commit path (the block that
inspects "server" and "x-http-engine") so headers are not re-inserted here, and
instead set any default values in the specific listener/exporter code that owns
that policy; references: flb_http_response_get_header,
flb_http_response_set_header, flb_http_response_unset_header, and the
surrounding commit logic in flb_http_common.c.

Comment on lines +793 to +796
if (options->workers > 1) {
options->reuse_port = FLB_TRUE;
options->use_caller_event_loop = FLB_FALSE;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

workers > 1 currently selects a mode that can never start.

Lines 793-796 force multi-worker sessions into internally managed-runtime mode, but Lines 810-813 immediately reject every non-caller-context server. That leaves the new worker/runtime API exposed but nonfunctional. Either call the runtime start path here or keep the worker fields private until the implementation is complete.

Possible wiring if the runtime path is intended to be live now
-    if (!flb_http_server_running_on_caller_context(session)) {
-        flb_error("[http_server] internally managed worker runtime is not enabled yet");
-        return -1;
-    }
+    if (!flb_http_server_running_on_caller_context(session)) {
+        return flb_http_server_runtime_start(session);
+    }

Also applies to: 810-813

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/http_server/flb_http_server.c` around lines 793 - 796, When
options->workers > 1 you set options->reuse_port and force
options->use_caller_event_loop = FLB_FALSE, but later the code immediately
rejects any non-caller-context server, leaving the worker/runtime API unusable;
either invoke the runtime start path for multi-worker mode here (i.e., call the
server runtime start routine for the new worker API immediately after setting
options->workers/reuse_port/use_caller_event_loop) or do not flip
options->use_caller_event_loop to FLB_FALSE (and keep the worker-related fields
private) so the rejection code for non-caller-context servers no longer breaks
startup; update the logic around the non-caller-context rejection to only reject
when the runtime start path was not taken, referencing options->workers,
options->reuse_port, options->use_caller_event_loop and the non-caller-context
rejection check so the chosen wiring is consistent.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8ee77e323a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

src/flb_help.c Outdated
Comment on lines +274 to +276
config_map = flb_http_server_get_config_map(ins->config);
options_size += mk_list_size(config_map);
flb_config_map_destroy(config_map);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve plugin config map when adding HTTP help options

This overwrites config_map (which initially points to the plugin’s own config map) with the temporary HTTP-server map and then destroys it, but later the function still iterates config_map to emit plugin options. For inputs with FLB_INPUT_HTTP_SERVER (e.g. -i http -h), that leaves config_map dangling and can produce a use-after-free/crash or missing option output in help generation.

Useful? React with 👍 / 👎.

src/flb_help.c Outdated
Comment on lines +555 to +557
config_map = flb_http_server_get_config_map(ins->config);
options_size += mk_list_size(config_map);
flb_config_map_destroy(config_map);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep output plugin config map alive in help path

The output help path has the same lifetime bug: config_map is reassigned to the temporary HTTP-server map and destroyed, then reused later as if it still referenced the output plugin map. For HTTP-server-enabled outputs (e.g. -o prometheus_exporter -h), this can iterate freed memory and lead to crashes or incomplete option lists.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/http_server/flb_http_server.c (1)

977-977: Redundant assignment.

The assignment result = 0; is unnecessary since result was already 0 from the successful flb_http_server_start call on line 970.

Proposed fix
-    result = 0;
-
     if (session->runtime == NULL && session->downstream != NULL) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/http_server/flb_http_server.c` at line 977, Remove the redundant
assignment to result after a successful flb_http_server_start call: since
flb_http_server_start already set result to 0 on success, delete the line
"result = 0;" to avoid a needless reassignment; update the code around the
flb_http_server_start call (look for the variable "result" and the call to
"flb_http_server_start") so the function logic relies on the existing value
rather than reassigning it.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/http_server/flb_http_server.c`:
- Line 977: Remove the redundant assignment to result after a successful
flb_http_server_start call: since flb_http_server_start already set result to 0
on success, delete the line "result = 0;" to avoid a needless reassignment;
update the code around the flb_http_server_start call (look for the variable
"result" and the call to "flb_http_server_start") so the function logic relies
on the existing value rather than reassigning it.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 34bba052-d399-4bce-96bd-06443f111488

📥 Commits

Reviewing files that changed from the base of the PR and between 8ee77e3 and ae3f272.

📒 Files selected for processing (1)
  • src/http_server/flb_http_server.c

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/http_server/api/v1/trace.c (1)

621-639: ⚠️ Potential issue | 🟠 Major

Return an error status in the 503 JSON body.

This branch sends HTTP 503 but still packs HTTP_RESULT_OK, so callers can receive an error response with {"status":"ok"}.

🔧 Suggested fix
     else if (http_status == 503) {
         msgpack_pack_map(&mp_pck, 2);
         msgpack_pack_str_with_body(&mp_pck, HTTP_FIELD_STATUS, HTTP_FIELD_STATUS_LEN);
-        msgpack_pack_str_with_body(&mp_pck, HTTP_RESULT_OK, HTTP_RESULT_OK_LEN);
+        msgpack_pack_str_with_body(&mp_pck, HTTP_RESULT_ERROR, HTTP_RESULT_ERROR_LEN);
         msgpack_pack_str_with_body(&mp_pck, HTTP_FIELD_MESSAGE, HTTP_FIELD_MESSAGE_LEN);
         if (error_msg) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/http_server/api/v1/trace.c` around lines 621 - 639, The 503 branch
currently packs HTTP_RESULT_OK into the response body despite sending HTTP 503;
update the packing to return an error status (replace the use of HTTP_RESULT_OK
with the appropriate error constant, e.g., HTTP_RESULT_ERROR or a defined error
constant) so msgpack_pack_str_with_body(&mp_pck, HTTP_FIELD_STATUS, ...) writes
an error value, and keep the existing HTTP_FIELD_MESSAGE/error_msg handling
(including flb_sds_destroy) intact; adjust only the constant used for the status
field in the http_status == 503 branch.
🧹 Nitpick comments (4)
plugins/out_prometheus_exporter/prom_http.c (1)

76-84: Inconsistent return value handling for flb_hs_response_set_payload.

At lines 66-70, the return value of flb_hs_response_set_payload is propagated to the caller, but here it's ignored and the function unconditionally returns 0. If the helper fails (e.g., allocation or commit failure), this path will silently report success.

💡 Suggested fix
-    flb_hs_response_set_payload(response,
+    ret = flb_hs_response_set_payload(response,
                                 200,
                                 FLB_HS_CONTENT_TYPE_PROMETHEUS,
                                 payload,
                                 cfl_sds_len(payload));
 
     cfl_sds_destroy(payload);
 
-    return 0;
+    return ret;

This requires declaring int ret; at the top of the function.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/out_prometheus_exporter/prom_http.c` around lines 76 - 84, The code
ignores the return value of flb_hs_response_set_payload in this branch; capture
its return (e.g., declare an int ret), assign ret =
flb_hs_response_set_payload(...), ensure cfl_sds_destroy(payload) is still
called, and then return ret instead of unconditionally returning 0 so failures
in flb_hs_response_set_payload (allocation/commit errors) are propagated by the
function.
plugins/in_http/http_prot.c (1)

42-64: Consider aligning with the more robust approach used in content_type_is_urlencoded.

Per RFC 7231, optional whitespace (OWS) before the semicolon is allowed (e.g., application/json ; charset=UTF-8). The current implementation would reject such requests. The content_type_is_urlencoded helper handles this correctly by locating the separator first and trimming whitespace.

For consistency and spec compliance, consider using the same separator-based approach here.

💡 Suggested refactor
 static int content_type_is_json(const char *content_type)
 {
-    size_t length;
+    const char *separator;
+    size_t media_type_length;
 
     if (content_type == NULL) {
         return FLB_FALSE;
     }
 
-    length = strlen(content_type);
-
-    if (length == 16 &&
-        strncasecmp(content_type, "application/json", 16) == 0) {
-        return FLB_TRUE;
+    separator = strchr(content_type, ';');
+    if (separator == NULL) {
+        return strcasecmp(content_type, "application/json") == 0;
     }
 
-    if (length > 16 &&
-        (strncasecmp(content_type, "application/json;", 17) == 0 ||
-         strncasecmp(content_type, "application/json ", 17) == 0)) {
-        return FLB_TRUE;
+    media_type_length = separator - content_type;
+
+    while (media_type_length > 0 &&
+           isspace((unsigned char) content_type[media_type_length - 1])) {
+        media_type_length--;
     }
 
-    return FLB_FALSE;
+    if (media_type_length != 16) {  /* strlen("application/json") */
+        return FLB_FALSE;
+    }
+
+    return strncasecmp(content_type, "application/json", media_type_length) == 0;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/in_http/http_prot.c` around lines 42 - 64, The content_type_is_json
function rejects valid MIME values with optional whitespace before the
semicolon; change it to follow the separator-first approach used by
content_type_is_urlencoded: locate the first ';' (or end), trim trailing OWS
from the type token, then compare the trimmed token case-insensitively to
"application/json"; update content_type_is_json to use the same separator/trim
logic to accept inputs like "application/json ; charset=UTF-8".
src/flb_input.c (2)

671-682: Minor: Redundant NULL check.

The if (tmp) check at line 676 is redundant since tmp != NULL is already verified in the condition at line 673.

🧹 Proposed simplification
     else if ((ins->p->flags & FLB_INPUT_HTTP_SERVER) != 0 &&
              flb_http_server_property_is_allowed(k) == FLB_TRUE &&
              tmp != NULL) {
         kv = flb_kv_item_create(&ins->http_server_properties, (char *) k, NULL);
         if (!kv) {
-            if (tmp) {
-                flb_sds_destroy(tmp);
-            }
+            flb_sds_destroy(tmp);
             return -1;
         }
         kv->val = tmp;
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_input.c` around lines 671 - 682, The code contains a redundant NULL
check for tmp inside the block guarded by (tmp != NULL); remove the inner if
(tmp) { flb_sds_destroy(tmp); } branch and simply destroy tmp unconditionally
before returning on flb_kv_item_create failure. Update the error path in the
block that checks ins->p->flags & FLB_INPUT_HTTP_SERVER and
flb_http_server_property_is_allowed(k) so that after
flb_kv_item_create(&ins->http_server_properties, (char *) k, NULL) fails you
call flb_sds_destroy(tmp) and return -1; keep the assignment kv->val = tmp when
creation succeeds.

446-454: Redundant NULL check; also pre-existing memory leak.

  1. The NULL check at line 449 is unnecessary since http_server_config is guaranteed to be non-NULL at this point (allocation succeeded at line 268, otherwise we would have returned earlier).

  2. Similar to the proxy context path, this error path leaks the four hash tables, the proxy context (for non-core plugins), and potentially host fields (for network plugins).

🔧 Proposed simplification and complete cleanup
         instance->rb = flb_ring_buffer_create(instance->ring_buffer_size);
         if (!instance->rb) {
             flb_error("instance %s could not initialize ring buffer",
                       flb_input_name(instance));
-            if (instance->http_server_config != NULL) {
-                flb_free(instance->http_server_config);
-            }
+            /* Clean up all allocated resources */
+            flb_hash_table_destroy(instance->ht_log_chunks);
+            flb_hash_table_destroy(instance->ht_metric_chunks);
+            flb_hash_table_destroy(instance->ht_trace_chunks);
+            flb_hash_table_destroy(instance->ht_profile_chunks);
+            if (plugin->type != FLB_INPUT_PLUGIN_CORE && instance->context) {
+                flb_free(instance->context);
+            }
+            flb_free(instance->http_server_config);
             flb_free(instance);
             return NULL;
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/flb_input.c` around lines 446 - 454, Remove the redundant NULL check for
instance->http_server_config and replace this early-error path with a full
cleanup: when instance->rb is NULL free the four instance hash tables, free the
proxy context (instance->proxy_ctx) for non-core plugins, free any allocated
host fields (e.g., instance->hosts or similar network-related allocations), then
free instance->http_server_config and finally flb_free(instance) before
returning NULL so no resources are leaked. Ensure you reference and call the
appropriate free/destroy helpers used elsewhere for those hash tables and proxy
context.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/in_prometheus_remote_write/prom_rw.c`:
- Around line 51-79: The code sets the input context via
flb_input_set_context(ins, ctx) before starting the HTTP server, which leads to
a dangling ins->context if server init fails and prom_rw_config_destroy(ctx) is
called; to fix, defer calling flb_input_set_context until after the HTTP server
is successfully started (i.e., after flb_http_server_start and
flb_input_downstream_set succeed) or, alternatively, if you keep the current
order, ensure you clear the stored context on error by calling
flb_input_set_context(ins, NULL) before prom_rw_config_destroy(ctx); update the
block around flb_input_set_context, flb_http_server_init_with_options,
flb_http_server_start, and flb_input_downstream_set to implement one of these
two approaches consistently.

In `@src/flb_input.c`:
- Around line 333-339: On ctx allocation failure in the function that creates
the proxy input (where struct flb_plugin_input_proxy_context is allocated), add
cleanup for the four previously created hash tables (ht_log_chunks,
ht_metric_chunks, ht_trace_chunks, ht_profile_chunks) before freeing
http_server_config and instance and returning NULL; call the appropriate
destroy/free API used elsewhere in this file for those hash tables to ensure
they are released and guard against NULLs if necessary.

In `@src/http_server/api/v1/trace.c`:
- Around line 665-668: api_v1_trace() currently ignores failures from
flb_hs_register_endpoint(), so if either registration for "/api/v1/traces/"
(cb_traces) or "/api/v1/trace/" (cb_trace) fails the function still returns
success; update api_v1_trace() to check the return value of each
flb_hs_register_endpoint(hs, ...) call and propagate an error (non-zero/FAIL)
immediately if any call fails so partial registration cannot occur. Ensure you
reference and handle both registration calls for cb_traces and cb_trace,
returning the appropriate failure code from api_v1_trace() when
flb_hs_register_endpoint() indicates error.

In `@src/http_server/api/v2/reload.c`:
- Around line 72-75: The early error branches after msgpack_sbuffer_init leak
the msgpack buffer; ensure mp_sbuf is destroyed before returning on signal
failures (both the GenerateConsoleCtrlEvent branch and the kill() branch around
lines with ret==0). Update the error paths in the function to call
msgpack_sbuffer_destroy(&mp_sbuf) (or route them to a common cleanup label that
destroys mp_sbuf) before calling flb_http_response_set_status(...) and
flb_http_response_commit(...), so the temporary msgpack allocation is freed on
all early 500 returns.

In `@src/http_server/flb_hs.c`:
- Around line 172-177: The eviction loop unconditionally frees the oldest health
sample (mk_list_entry_first -> struct flb_hs_hc_buf) while is_healthy() may have
incremented buf->users, causing a use-after-free; change the loop to check
buf->users before removing/freeing: for each oldest buf, if buf->users == 0 then
mk_list_del(&buf->_head), flb_free(buf) and decrement
hs->health_counter.period_counter, otherwise stop/skip eviction (e.g., break the
loop or move to the next-safe policy) so you never free a buffer still in use.
Update the loop around hs->health_metrics and keep references to struct
flb_hs_hc_buf and is_healthy() in mind when implementing the check.
- Around line 313-334: The flb_http_server_options struct is left at defaults
after flb_http_server_options_init(&options), so shared listener settings from
config->http_server (like http2 enablement, buffer sizing, and connection
limits) are not applied; populate options from config->http_server right after
flb_http_server_options_init(&options) (before using options to create the
server) by copying the relevant fields (e.g., options.http2 =
config->http_server.http2; options.buffer_size =
config->http_server.buffer_size; options.max_connections =
config->http_server.max_connections; etc.), ensuring all shared http_server.*
settings are threaded into options in the same place you already set
options.address, options.port, options.networking_setup and options.event_loop.
- Around line 108-132: flb_hs_buf_cleanup currently frees buffer->data and
buffer->raw_data unconditionally, which can race with in-flight handlers that
increment buffer->users; modify flb_hs_buf_cleanup to check buffer->users and,
if >0, avoid immediate free by setting a new flag (e.g., buffer->pending_free =
1) and return, otherwise perform the same destruction logic (calling
flb_sds_destroy, raw_free or flb_free) and clear pending_free; ensure the
counterpart that decrements buffer->users (where handlers release buffers)
checks pending_free and calls flb_hs_buf_cleanup when users reaches 0 so cleanup
is deferred until no users remain.

In `@tests/internal/http_server.c`:
- Around line 129-137: The test must stop if
flb_http_server_init_with_options(&server, &options) fails to avoid accessing an
uninitialized server: capture the return value (e.g., int ret =
flb_http_server_init_with_options(...)), keep the TEST_CHECK(ret == 0)
assertion, and immediately return (or otherwise abort the test function) when
ret != 0 so subsequent reads of server and the call to
flb_http_server_destroy(&server) are not executed; apply the same pattern to the
other test block that calls flb_http_server_init_with_options().

---

Outside diff comments:
In `@src/http_server/api/v1/trace.c`:
- Around line 621-639: The 503 branch currently packs HTTP_RESULT_OK into the
response body despite sending HTTP 503; update the packing to return an error
status (replace the use of HTTP_RESULT_OK with the appropriate error constant,
e.g., HTTP_RESULT_ERROR or a defined error constant) so
msgpack_pack_str_with_body(&mp_pck, HTTP_FIELD_STATUS, ...) writes an error
value, and keep the existing HTTP_FIELD_MESSAGE/error_msg handling (including
flb_sds_destroy) intact; adjust only the constant used for the status field in
the http_status == 503 branch.

---

Nitpick comments:
In `@plugins/in_http/http_prot.c`:
- Around line 42-64: The content_type_is_json function rejects valid MIME values
with optional whitespace before the semicolon; change it to follow the
separator-first approach used by content_type_is_urlencoded: locate the first
';' (or end), trim trailing OWS from the type token, then compare the trimmed
token case-insensitively to "application/json"; update content_type_is_json to
use the same separator/trim logic to accept inputs like "application/json ;
charset=UTF-8".

In `@plugins/out_prometheus_exporter/prom_http.c`:
- Around line 76-84: The code ignores the return value of
flb_hs_response_set_payload in this branch; capture its return (e.g., declare an
int ret), assign ret = flb_hs_response_set_payload(...), ensure
cfl_sds_destroy(payload) is still called, and then return ret instead of
unconditionally returning 0 so failures in flb_hs_response_set_payload
(allocation/commit errors) are propagated by the function.

In `@src/flb_input.c`:
- Around line 671-682: The code contains a redundant NULL check for tmp inside
the block guarded by (tmp != NULL); remove the inner if (tmp) {
flb_sds_destroy(tmp); } branch and simply destroy tmp unconditionally before
returning on flb_kv_item_create failure. Update the error path in the block that
checks ins->p->flags & FLB_INPUT_HTTP_SERVER and
flb_http_server_property_is_allowed(k) so that after
flb_kv_item_create(&ins->http_server_properties, (char *) k, NULL) fails you
call flb_sds_destroy(tmp) and return -1; keep the assignment kv->val = tmp when
creation succeeds.
- Around line 446-454: Remove the redundant NULL check for
instance->http_server_config and replace this early-error path with a full
cleanup: when instance->rb is NULL free the four instance hash tables, free the
proxy context (instance->proxy_ctx) for non-core plugins, free any allocated
host fields (e.g., instance->hosts or similar network-related allocations), then
free instance->http_server_config and finally flb_free(instance) before
returning NULL so no resources are leaked. Ensure you reference and call the
appropriate free/destroy helpers used elsewhere for those hash tables and proxy
context.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3cacabae-b672-4855-bf21-18238d1d0f79

📥 Commits

Reviewing files that changed from the base of the PR and between ae3f272 and dff5d4c.

📒 Files selected for processing (20)
  • lib/monkey/CMakeLists.txt
  • lib/monkey/mk_server/mk_server.c
  • plugins/in_http/http_prot.c
  • plugins/in_prometheus_remote_write/prom_rw.c
  • plugins/in_prometheus_remote_write/prom_rw_config.c
  • plugins/in_prometheus_remote_write/prom_rw_prot.c
  • plugins/in_splunk/splunk_prot.c
  • plugins/out_prometheus_exporter/prom_http.c
  • src/flb_help.c
  • src/flb_input.c
  • src/http_server/api/v1/health.c
  • src/http_server/api/v1/metrics.c
  • src/http_server/api/v1/trace.c
  • src/http_server/api/v2/metrics.c
  • src/http_server/api/v2/reload.c
  • src/http_server/flb_hs.c
  • src/http_server/flb_hs_utils.c
  • tests/internal/http_server.c
  • tests/runtime_shell/http_harness.sh
  • tests/runtime_shell/out_prometheus_exporter_expect.sh
✅ Files skipped from review due to trivial changes (1)
  • lib/monkey/CMakeLists.txt
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/flb_help.c

Comment on lines +313 to 334
flb_http_server_options_init(&options);
options.protocol_version = HTTP_PROTOCOL_VERSION_AUTODETECT;
options.flags = 0;
options.request_callback = flb_hs_request_handler;
options.user_data = hs;
options.address = (char *) listen;
errno = 0;
port = strtol(tcp_port, &end, 10);
if (errno == ERANGE || end == tcp_port || *end != '\0' ||
port <= 0 || port > 65535) {
flb_error("[http_server] invalid monitoring tcp_port '%s'", tcp_port);
flb_free(hs);
return NULL;
}
options.port = (int) port;
options.networking_flags = 0;
flb_net_setup_init(&hs->net_setup);
options.networking_setup = &hs->net_setup;
options.event_loop = config->evl;
options.system_context = config;
options.use_caller_event_loop = FLB_TRUE;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Thread the shared http_server.* settings into options.

This path hard-codes the listener behavior and leaves the other flb_http_server_options fields at defaults, so the built-in API server will ignore configured http_server.http2, buffer sizing, and max-connection limits. That breaks the shared listener configuration introduced by this migration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/http_server/flb_hs.c` around lines 313 - 334, The flb_http_server_options
struct is left at defaults after flb_http_server_options_init(&options), so
shared listener settings from config->http_server (like http2 enablement, buffer
sizing, and connection limits) are not applied; populate options from
config->http_server right after flb_http_server_options_init(&options) (before
using options to create the server) by copying the relevant fields (e.g.,
options.http2 = config->http_server.http2; options.buffer_size =
config->http_server.buffer_size; options.max_connections =
config->http_server.max_connections; etc.), ensuring all shared http_server.*
settings are threaded into options in the same place you already set
options.address, options.port, options.networking_setup and options.event_loop.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
src/http_server/flb_hs.c (1)

354-374: ⚠️ Potential issue | 🟠 Major

Still not threading the shared http_server.* settings into the built-in listener.

This path still hard-codes protocol autodetect and leaves the rest of flb_http_server_options at defaults, so the built-in API server ignores configured http_server.http2, buffer_max_size, and max_connections. plugins/out_prometheus_exporter/prom_http.c Lines 115-130 already wires those settings through, so this listener is still inconsistent with the unified-server contract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/http_server/flb_hs.c` around lines 354 - 374, The built-in listener
currently hard-codes options (e.g., options.protocol_version) and leaves other
flb_http_server_options fields at defaults so configured http_server.* values
(http2, buffer_max_size, max_connections, flags) are ignored; update the flb_hs
code that calls flb_http_server_options_init and sets options.* so it copies the
configured http_server settings into options (map configured http2 ->
options.protocol_version, buffer_max_size -> options.buffer_max_size,
max_connections -> options.max_connections, and preserve configured
flags/networking flags), following the same wiring approach used in
plugins/out_prometheus_exporter/prom_http.c (use the same fields and helpers
rather than hard-coded defaults), keeping request_callback =
flb_hs_request_handler and the existing networking/event-loop assignments.
🧹 Nitpick comments (2)
src/http_server/api/v2/reload.c (1)

175-184: Return 405 for unsupported methods instead of 400.

This endpoint handles GET, POST, and PUT; requests with other methods should return 405 (Method Not Allowed) per RFC 9110. Include an Allow: GET, POST, PUT header in the response to inform clients of supported methods.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/http_server/api/v2/reload.c` around lines 175 - 184, Replace the final
400 response with a 405 Method Not Allowed and add an Allow header listing the
supported methods; specifically, in the branch after the GET/POST/PUT handling,
call flb_http_response_set_status(response, 405), add the header "Allow: GET,
POST, PUT" using the response header API used elsewhere in this module, then
call flb_http_response_commit(response) as before (refer to request->method,
flb_http_response_set_status, and flb_http_response_commit in the snippet).
include/fluent-bit/http_server/flb_hs.h (1)

71-76: Make the route-path lifetime contract explicit in the public API.

flb_hs_register_endpoint() stores a borrowed path pointer (see src/http_server/flb_hs.c Lines 171-174), but this header does not tell callers that the string must outlive hs. Either copy the route during registration or document that lifetime requirement here so future dynamic callers do not register dangling paths.

Also applies to: 106-109

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/fluent-bit/http_server/flb_hs.h` around lines 71 - 76, The public API
currently stores a borrowed pointer in struct flb_hs_route.path via
flb_hs_register_endpoint; make the lifetime contract explicit by owning the
string: modify flb_hs_register_endpoint to strdup/copy the provided path and
store that copy in flb_hs_route.path, and ensure the copy is freed when routes
are removed or when the server is destroyed (e.g., in flb_hs_unregister_endpoint
/ hs cleanup); alternatively, if you prefer not to change ownership, add a clear
comment in the header next to struct flb_hs_route and the prototype for
flb_hs_register_endpoint stating that the caller must guarantee the path string
outlives the flb_hs instance—pick one approach and implement corresponding
allocation/free or documentation consistently for the other occurrences noted
(the second flb_hs_route usage).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/flb_input.c`:
- Around line 392-398: When flb_net_host_set(plugin->name, &instance->host,
input) fails the code frees http_server_config and instance but leaks
already-allocated resources; update the failure path to also free/destroy
ht_log_chunks, ht_metric_chunks, ht_trace_chunks and ht_profile_chunks and tear
down any proxy context (instance->proxy) before calling
flb_free(instance->http_server_config) and flb_free(instance). Use the same
destroy/free helpers used when tearing down an input instance (null-check each
of ht_log_chunks, ht_metric_chunks, ht_trace_chunks, ht_profile_chunks and
instance->proxy before calling their respective destroy/free functions) so the
path fully cleans up on flb_net_host_set failure.

In `@src/http_server/api/v1/trace.c`:
- Around line 282-283: The code calls cfl_sds_len(request->body) (used in
flb_pack_json calls) without checking request->body for NULL, which can crash on
body-less requests; add a guard before both JSON parse paths (the flb_pack_json
calls around the symbols request->body, cfl_sds_len, and flb_pack_json) that
checks if request->body is non-NULL (and optionally non-empty) and if NULL
respond with an HTTP 400 / error path (or skip parsing and handle empty body)
instead of calling cfl_sds_len; update both locations (the two flb_pack_json
usages) to perform the NULL check and return/handle appropriately.

In `@src/http_server/flb_hs.c`:
- Around line 351-379: The endpoint state allocated by flb_hs_endpoints(hs) is
not freed on the pre-init error paths; update the error handling after endpoint
initialization (the failure branches that currently call flb_free(hs)) to call
flb_hs_endpoints_free(hs) (or flb_hs_destroy(hs) if you want full cleanup)
before freeing hs so endpoint buffers like ep_root_buf are released.
Specifically, replace the direct flb_free(hs) on the validation and
flb_http_server_init_with_options() failure paths with a call to
flb_hs_endpoints_free(hs) followed by flb_free(hs) (or call flb_hs_destroy(hs)
and return NULL) to ensure consistent cleanup.

---

Duplicate comments:
In `@src/http_server/flb_hs.c`:
- Around line 354-374: The built-in listener currently hard-codes options (e.g.,
options.protocol_version) and leaves other flb_http_server_options fields at
defaults so configured http_server.* values (http2, buffer_max_size,
max_connections, flags) are ignored; update the flb_hs code that calls
flb_http_server_options_init and sets options.* so it copies the configured
http_server settings into options (map configured http2 ->
options.protocol_version, buffer_max_size -> options.buffer_max_size,
max_connections -> options.max_connections, and preserve configured
flags/networking flags), following the same wiring approach used in
plugins/out_prometheus_exporter/prom_http.c (use the same fields and helpers
rather than hard-coded defaults), keeping request_callback =
flb_hs_request_handler and the existing networking/event-loop assignments.

---

Nitpick comments:
In `@include/fluent-bit/http_server/flb_hs.h`:
- Around line 71-76: The public API currently stores a borrowed pointer in
struct flb_hs_route.path via flb_hs_register_endpoint; make the lifetime
contract explicit by owning the string: modify flb_hs_register_endpoint to
strdup/copy the provided path and store that copy in flb_hs_route.path, and
ensure the copy is freed when routes are removed or when the server is destroyed
(e.g., in flb_hs_unregister_endpoint / hs cleanup); alternatively, if you prefer
not to change ownership, add a clear comment in the header next to struct
flb_hs_route and the prototype for flb_hs_register_endpoint stating that the
caller must guarantee the path string outlives the flb_hs instance—pick one
approach and implement corresponding allocation/free or documentation
consistently for the other occurrences noted (the second flb_hs_route usage).

In `@src/http_server/api/v2/reload.c`:
- Around line 175-184: Replace the final 400 response with a 405 Method Not
Allowed and add an Allow header listing the supported methods; specifically, in
the branch after the GET/POST/PUT handling, call
flb_http_response_set_status(response, 405), add the header "Allow: GET, POST,
PUT" using the response header API used elsewhere in this module, then call
flb_http_response_commit(response) as before (refer to request->method,
flb_http_response_set_status, and flb_http_response_commit in the snippet).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d86bb3ab-7ad8-4eb3-9d5c-ee7573790d57

📥 Commits

Reviewing files that changed from the base of the PR and between dff5d4c and b7e015b.

📒 Files selected for processing (13)
  • include/fluent-bit/http_server/flb_hs.h
  • plugins/in_http/http_prot.c
  • plugins/in_prometheus_remote_write/prom_rw.c
  • plugins/out_prometheus_exporter/prom_http.c
  • src/flb_input.c
  • src/http_server/api/v1/metrics.c
  • src/http_server/api/v1/register.c
  • src/http_server/api/v1/storage.c
  • src/http_server/api/v1/trace.c
  • src/http_server/api/v2/metrics.c
  • src/http_server/api/v2/reload.c
  • src/http_server/flb_hs.c
  • tests/internal/http_server.c

edsiper added 24 commits March 12, 2026 10:57
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
…rtup

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
…aths

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant