collector: add nfqueue collector#3551

Open
denisvmedia wants to merge 1 commit into prometheus:master from denisvmedia:collector/nfqueue

@denisvmedia denisvmedia commented Feb 22, 2026

Closes #3518

Summary

Add a new collector that exposes metrics from /proc/net/netfilter/nfnetlink_queue, enabling monitoring of NFQUEUE userspace packet processing queues.

The collector is disabled by default and can be enabled with --collector.nfqueue. It can be excluded at build time with the nonfqueue build tag.

Motivation

NFQUEUE is used by many common network security tools — fail2ban, Suricata (IPS mode), Snort, and custom libnetfilter_queue applications. When a queue fills up, packets are silently dropped by the kernel, which is notoriously hard to diagnose without direct access to the host. This collector makes that state observable.

Metrics

| Metric | Type | Description |
| --- | --- | --- |
| node_nfqueue_queue_total | Gauge | Current number of packets waiting in the queue |
| node_nfqueue_dropped_total | Counter | Packets dropped, with reason label: queue_full (queue was full) or user (failed to send to userspace) |
| node_nfqueue_info | Gauge (=1) | Non-numeric metadata: peer_portid, copy_mode, copy_range |

All metrics carry a queue label with the queue ID. The copy_mode label in node_nfqueue_info is a human-readable string: none, meta, or packet.

Example output for a live system with one active queue:

# HELP node_nfqueue_dropped_total Total number of packets dropped.
# TYPE node_nfqueue_dropped_total counter
node_nfqueue_dropped_total{queue="1",reason="queue_full"} 0
node_nfqueue_dropped_total{queue="1",reason="user"} 0
# HELP node_nfqueue_info Non-numeric metadata about the queue (value is always 1).
# TYPE node_nfqueue_info gauge
node_nfqueue_info{copy_mode="packet",copy_range="1500",peer_portid="803",queue="1"} 1
# HELP node_nfqueue_queue_total Current number of packets waiting in the queue.
# TYPE node_nfqueue_queue_total gauge
node_nfqueue_queue_total{queue="1"} 0
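
As a rough illustration of how one line of /proc/net/netfilter/nfnetlink_queue could map onto the series above, here is a minimal standard-library sketch. It is not the collector's code, which reads the file through procfs.NFNetLinkQueue(). The field order (queue ID, peer port ID, queue total, copy mode, copy range, queue dropped, user dropped) and the numeric copy mode values 0, 1, 2 are assumptions based on the kernel's output layout, and queueStats, copyModeName, parseLine, and render are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// queueStats holds the fields of one proc file line that the metrics use.
// Hypothetical type; the field order is an assumption about the kernel format.
type queueStats struct {
	QueueID, PeerPortID, QueueTotal   uint64
	CopyMode, CopyRange               uint64
	QueueDropped, UserDropped         uint64
}

// copyModeName maps the numeric copy mode to the copy_mode label value.
func copyModeName(mode uint64) string {
	switch mode {
	case 0:
		return "none"
	case 1:
		return "meta"
	case 2:
		return "packet"
	default:
		return "unknown" // assumption: the PR does not say how other values are labeled
	}
}

// parseLine parses the first seven whitespace-separated fields of one line.
func parseLine(line string) (queueStats, error) {
	fields := strings.Fields(line)
	if len(fields) < 7 {
		return queueStats{}, fmt.Errorf("expected at least 7 fields, got %d", len(fields))
	}
	nums := make([]uint64, 7)
	for i := range nums {
		n, err := strconv.ParseUint(fields[i], 10, 64)
		if err != nil {
			return queueStats{}, fmt.Errorf("field %d: %w", i, err)
		}
		nums[i] = n
	}
	return queueStats{
		QueueID: nums[0], PeerPortID: nums[1], QueueTotal: nums[2],
		CopyMode: nums[3], CopyRange: nums[4],
		QueueDropped: nums[5], UserDropped: nums[6],
	}, nil
}

// render formats the series for one queue in the text exposition format.
func render(q queueStats) []string {
	id := strconv.FormatUint(q.QueueID, 10)
	return []string{
		fmt.Sprintf(`node_nfqueue_dropped_total{queue=%q,reason="queue_full"} %d`, id, q.QueueDropped),
		fmt.Sprintf(`node_nfqueue_dropped_total{queue=%q,reason="user"} %d`, id, q.UserDropped),
		fmt.Sprintf(`node_nfqueue_info{copy_mode=%q,copy_range="%d",peer_portid="%d",queue=%q} 1`,
			copyModeName(q.CopyMode), q.CopyRange, q.PeerPortID, id),
		fmt.Sprintf(`node_nfqueue_queue_total{queue=%q} %d`, id, q.QueueTotal),
	}
}

func main() {
	// Synthetic line shaped like the queue in the example output above.
	q, err := parseLine("    1    803     0 2  1500     0     0        0  1")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, line := range render(q) {
		fmt.Println(line)
	}
}
```

Keeping the non-numeric fields on a separate info series with a constant value of 1 means a change in copy_range or peer_portid does not break the continuity of the numeric series.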

Implementation notes

  • Data source: procfs.NFNetLinkQueue() from github.com/prometheus/procfs — already present in the dependency tree at v0.19.2 (added in procfs#677). No dependency changes required.
  • Graceful degradation: returns ErrNoData if /proc/net/netfilter/nfnetlink_queue does not exist, so the collector does not produce errors on systems where NFQUEUE is not in use.
  • No special permissions: the proc file is world-readable.
  • Cardinality: bounded by the number of active NFQUEUE queues. Each queue produces one time series for node_nfqueue_queue_total, one for node_nfqueue_info, and two for node_nfqueue_dropped_total (one per reason). Even in extreme cases (100+ queues) the cardinality is negligible.

Testing

Unit tests added in collector/nfqueue_linux_test.go using testutil.GatherAndCompare against a fixture file with three synthetic queue entries (collector/fixtures/proc/net/netfilter/nfnetlink_queue). Tests pass on Linux (verified via Docker).

The collector was also validated against a live Linux host where an active NFQUEUE queue was present, confirming correct metric emission with real kernel data.

@denisvmedia denisvmedia force-pushed the collector/nfqueue branch 8 times, most recently from aed9afd to f846d6b Compare February 23, 2026 05:21
@discordianfish
Member

While I don't have an issue in general with vibe-coded contributions, it would be polite to mark them as such and at least make sure the style matches (e.g. tests using the fixtures instead) and that they follow naming best practices.

@SuperQ We should probably figure out how to deal with that. Maybe if we hint AI agents the contributions would be even better at following our style and best practices than most human contributions 😬

@denisvmedia
Author

denisvmedia commented Feb 23, 2026

@discordianfish I did use AI, but not blindly. I quickly hacked the solution together myself, then asked AI to write the tests and brush up the code a little (to comply with the standards and fix a couple of issues it spotted with the tests). I also used AI to create the PR description; that's where I'm fully guilty :)

P.S. And thanks for the review!

@denisvmedia denisvmedia force-pushed the collector/nfqueue branch 2 times, most recently from 852f8cb to cc50a39 Compare February 23, 2026 15:41
}

func TestNFQueueStats(t *testing.T) {
testcase := `# HELP node_nfqueue_dropped_total Total number of packets dropped.
Member

Is there a reason to not use the e2e fixtures instead of this test?

Author

@denisvmedia denisvmedia Feb 25, 2026

Well, some other tests do that as well:

Image

To me it looks like, on the contrary, there is no strong motivation to use e2e fixtures here. We would have to update end-to-end-test.sh to enable this collector and then regenerate all fixtures. Given that it's a niche collector, I think it's cleaner to keep the expected output here as a string instead.

WDYT?

Signed-off-by: Denis Voytyuk <5462781+denisvmedia@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Add Netfilter Queue (NFQUEUE) Collector

2 participants