Skip to content

Flow-filter fixes (ICMP, private prefix overlap)#1319

Open
qmonnet wants to merge 16 commits intomainfrom
pr/qmonnet/flow-filter-fix
Open

Flow-filter fixes (ICMP, private prefix overlap)#1319
qmonnet wants to merge 16 commits intomainfrom
pr/qmonnet/flow-filter-fix

Conversation

@qmonnet
Copy link
Member

@qmonnet qmonnet commented Mar 6, 2026

This Pull Request contains a fix to address a failure reported by Fredi, and then a bunch of follow-up fixes that turned out to be required on top of the first change.

See individual commit descriptions for details. The important ones are the fix(...): ... commits.

Fixes: #1310

@qmonnet qmonnet self-assigned this Mar 6, 2026
@qmonnet qmonnet requested a review from a team as a code owner March 6, 2026 22:12
Copilot AI review requested due to automatic review settings March 6, 2026 22:12
@qmonnet qmonnet added bug Something isn't working area/nat Related to Network Address Translation (NAT) labels Mar 6, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes flow-filter lookup behavior for portless protocols (e.g., ICMP) and for cases where port-forwarding rules overlap with “no-ports”/masquerading rules, addressing issue #1310.

Changes:

  • Refactors FlowFilterTable into separate subtables for traffic with ports (TCP/UDP) vs without ports (ICMP), and updates overlay setup accordingly.
  • Updates IpPortPrefixTrie::lookup() to prefer more-specific matching prefixes when both port-scoped and all-ports entries overlap.
  • Enhances packet test utilities and adds/updates flow-filter tests, including a new regression test for private-prefix overlap with masquerading + port-forwarding.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
net/src/packet/test_utils.rs Extracts common test header/transport builders and adds an IPv6 “with transport” packet builder used by tests.
lpm/src/trie/ip_port_prefix_trie.rs Adjusts lookup ordering to handle overlaps between port-specific and all-ports entries.
flow-filter/src/tables.rs Splits flow-filter state into with_ports vs no_ports subtables and updates lookups/tests.
flow-filter/src/setup.rs Builds both subtables from overlay data, with an option to skip port-specific prefixes for the no-ports table.
flow-filter/src/lib.rs Updates test helpers and adds a regression test reproducing the reported overlap/ICMP failure.

@qmonnet qmonnet force-pushed the pr/qmonnet/flow-filter-fix branch from db5ed1c to 437e8c0 Compare March 6, 2026 22:32
@qmonnet qmonnet added the ci:+vlab Enable VLAB tests label Mar 9, 2026
@qmonnet qmonnet closed this Mar 9, 2026
@qmonnet qmonnet reopened this Mar 9, 2026
qmonnet added 4 commits March 9, 2026 15:39
Several packet-creation utilities use the same code to build an Ethernet
header (modulo the Ethertype field), we can move it to a shared function
to avoid duplicating too much code. Similarly, we move transport
creation out of build_test_ipv4_packet_with_transport(), because we'll
reuse it for the IPv6 equivalent function in a follow-up commit.

Also remove an erroneous comment for build_test_tcp_ipv4_packet().

Signed-off-by: Quentin Monnet <qmo@qmon.net>
Counterpart to build_test_ipv4_packet_with_transport(); we'll use it in
a follow-up commit to build IPv6 UDP or TCP packets with actual
transport headers, for tests in flow-filter.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
When we changed the type of the sets of IPs within struct VpcExpose to
PrefixPortsSet, we ommitted a few occurrences. Fix the return value for
function get_split_prefixes_for_manifest() in the flow-filter crate.

Some other occurrences in the config crate will be addressed in
follow-up work (separate PR).

Fixes: 9fee292 ("refactor(config): Use wrapper type for lists of prefixes")
Signed-off-by: Quentin Monnet <qmo@qmon.net>
Consider the following setup:

- VPC 2 and VPC 3 expose disjoint prefixes to VPC 1
- VPC 1 exposes 1.0.0.1/32:22 (TCP) via port forwarding to VPC 2
- VPC 1 exposes 1.0.0.0/24 via masquerading to VPC 3

On VPC 1, this means we have an overlap between the private prefixes and
associated port ranges for the peerings with VPC 2 and VPC 3. When we do
the flow-filter lookup for the source connection information for pinging
over the masqueraded connection, the following happens:

- We run a longest-prefix match to see if the longest-matching entry
  covers all ports. This would be good without the overlap. Here this
  fails, because the port-forwarding rule is more specific (1.0.0.1/32)
  than the prefix for masquerading (1.0.0.0/24), but it does not cover
  all ports.

- Then we try to see if there's a matching prefix in the trie that
  covers the ports for the packet, but this also fails, because the
  packet may have no ports (ICMP) or ports not within the range of the
  port forwarding rules.

Instead, we need to look at all entries, starting with the
longest-prefix match, and see if there's one that either works with the
packet's ports (if it has ports), or is not restricted to any port range
(for use with masquerading). That's what we do in this commit.

After the change, the example works. However, this analysis raised a
follow-up issue: When sending from 1.0.0.1, port 22, we always retrieve
the destination information object for a single destination (they should
be two, one for VPC 2 and one for VPC 3, because both can be reached
from this IP and port: we should be able to tell based on the
destination address). This is because we skip prefix splitting in some
cases, but this should not happen. We'll treat this issue (and add a
test) in a follow-up commit.

Reported-by: Fredi Raspall <fredi@githedgehog.com>
Signed-off-by: Quentin Monnet <qmo@qmon.net>
@qmonnet qmonnet force-pushed the pr/qmonnet/flow-filter-fix branch 2 times, most recently from 2ccf855 to c156270 Compare March 9, 2026 16:44
qmonnet added 10 commits March 10, 2026 00:47
Weather or not we find multiple matches for the local expose when
mapping it against the default remote expose (if any) doesn't matter: we
always need to insert the same information. Fix, and simplify insertion
of connection information in that case.

We haven't hit the error that we're removing so far, but we'd hit it in
a test after a fix coming in a follow-up commit, for splitting prefixes.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
In flow-filter, when doing overlap detection to proceed to splitting
overlapping prefixes, we'd "optimise" the case when two prefixes overlap
on the local or the remote end (but not both) of a manifest, in which
case we're able to tell the destination VPC discriminant by combining
information from both source and destination, anyway.

But this is not correct, because we retrieve the destination information
from the source information, and if we don't know we've got overlap when
looking up the source information, then we won't find all the relevant
destination information and won't be able to figure out the right
connection information that should apply.

For example:

- VPC 2 and 3 expose disjoint prefixes to VPC 1
- VPC 1 exposes 1.0.0.1/32:22 (TCP) to VPC 2 via port forwarding
- VPC 1 exposes 1.0.0.0/24 to VPC 3 via masquerading

VPC 1 sends a TCP packet from 1.0.0.1:22. If we haven't split, we'll
retrieve either the destination information for the peering with VPC 2,
or the one with VPC 3, but not both. By contrast, if we split, we know
that 1.0.0.1:22 (TCP) is associated to two different prefixes, the one
exposed from VPC 2, the one from VPC 3, and then we're able to compare
the destination address to determine where to send (and what NAT mode to
apply).

This change addresses the issue for TCP and UDP. However... It breaks
ICMP in some cases! This is because as soon as there is overlap
involving ports, like in the example above, prefixes (as in "prefixes
with associated optional port ranges") are split to extract the
overlapping sections: even prefixes not restricted to ports, once split
against prefixes with ports, produce a set of prefixes that all have
associated port ranges. And we cannot compare an ICMP packet against a
rule that has associated port ranges (given that ICMP doesn't use L4
ports), so we never find a match.

This issue will be addressed by creating a separate context table for
ICMP, in follow-up commits.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
Change the internal of the struct in preparation for adding another
field to it.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
In preparation for the addition of a separate table for holding context
for ICMP packets, introduce struct FlowFilterSubtable, that holds the
HashMap, and now implements the several variants for the insertion
methods that we need.

To avoid updating the ".insert()" call (and exposing the "with_ports"
subtable) to all call locations in tests, we add an ".insert()" helper
back to struct FlowFilterTable, just for using in tests.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
Keep the code to detect overlap between prefixes for a peering inside of
FlowFilterTable's add_peering() method, but move the code for the
follow-up processing and context insertion to the sub-table object's
implementation. This is in preparation for adding a second sub-table to
FlowFilterTable, working on different prefix lists.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
Clean-up FlowFilterTable's add_peering() by moving overlap detection and
prefix splitting to a separate function. This may help to generate
distinct lists of prefixes to process in a follow-up commit.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
We are about to add a second sub-table to struct FlowFilterTable, and to
look up for peering information from that table when a packet has no L4
ports.

The issue we have with the current tests is that tests do not build
full, valid packets, and many of them don't even bother building an
actual transport header. As a result, they don't have L4 ports, and
the flow-filter won't be looking at the right sub-table when processing
them.

In preparation for the addition of this new sub-table, this commit adds
L4 ports to all packets claiming to be TCP or UDP in flow-filter tests.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
It turns out that ICMP, or any other protocol without L4 ports that we'd
like to process, need a separate processing in flow-filter. This is
because in case of overlap between prefixes (be it on the same or
distinct ends of a manifest), we split prefixes to "extract" the
overlapping section, and assign the multiple connection information
objects to this "shared" section only. But this means that as soon as
one prefix has an associated port range (such as an expose using port
forwarding) and overlaps with another one, the split prefixes generated
from this overlap all use port ranges, even if the second prefix was
port-agnostic in the first place. As a result, ICMP packets will no
longer match against the resulting "rules" contained in the table: it
simply has no L4 ports to compare to the table entries.

We could maybe solve this by changing the architecture of the table one
way or another, but this would likely add some complexity. Here we pick
a rather simple approach, even if it may not be the most elegant: we
build a second sub-table where we only insert prefixes that have no
associated port ranges. This relies on the assumption that ICMP packets
will never match a connection supposed to use a specific port range,
anyway.

We may want to change this solution in the future. There's some impact
on memory consumption: we grow the FlowFilterTable quite a lot, in
particular if there's no prefix associated with port ranges in the
config, we double its size for no gain. But as a workaround for
addressing ICMP support, this should work for now.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
In recent commits, we addressed several issues related to ICMP support
and prefix splitting in the flow-filter stage. Add a test to validate
the fixes.

This test is largely based on a configuration provided by Fredi. The
first ICMP packet check is the one that made us realise the issues; the
other two checks about overlapping private addresses and ports on
192.168.90.100:22 were added as a result of subsequent code analysis.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
When dealing with the case when we have multiple matches from the
flow-filter lookup, but not flow table entry for the packet, we added a
debug_assert!() to ensure that, indeed, we had multiple entries.
However, we check the assertion _after_ filtering results based on the
L4 protocol in use, which may just as well filter out all matches we
found and leave the "data_set" empty.

Instead we want to check that we have multiple matches, but before
filtering them.

Update the related test to avoid hitting the assert, but make it fail to
find a relevant RemoteData object due to the L4 protocol filtering.

Fixes: 9d8336d ("feat(flow-filter): Account for TCP/UDP for port forwarding")
Signed-off-by: Quentin Monnet <qmo@qmon.net>
@qmonnet qmonnet force-pushed the pr/qmonnet/flow-filter-fix branch from c156270 to 30ea11b Compare March 10, 2026 01:14
qmonnet added 2 commits March 10, 2026 12:02
The logic in overlap processing to split prefixes is complex, and
incorrect. Let's fix it:

1. For each prefix, split as much as we can, and re-insert fragments in
   the stack of fragments to process.

   This is to make sure that after splitting a fragment once, we still
   compare it to other, potentially smaller, fragments. This solves an
   issue encountered when combining several exposes with port forwarding
   and masquerading in a manifest: some prefixes would be left whole,
   later causing a failure to insert entries in the table.

2. Once we have split the prefix as much as we could, there's no partial
   overlap possible. Look for full overlap, and add a MultipleMatches
   entry each time we find it.

3. If there's no overlap for the fragment, then add a Single entry
   object.

Also add a test to validate that we've fixed the issue.

Fixes: a0b9e76 ("fix(flow-filter): Fix prefixes splitting for multiple overlap")
Reported-by: Fredi Raspall <fredi@githedgehog.com>
Signed-off-by: Quentin Monnet <qmo@qmon.net>
When we produce the set of overlapping prefixes, we *insert* overlapping
prefixes, as we found them, into sets local_overlap and remote_overlap;
and then, we *extend* the accumulator sets with these.

But the two operations, insert and extend, may actually overwrite
previous entries in the sets. It does not matter in terms of overlap
detection, but it does matter for the RemoteData values associated to
the overlapping prefixes in the HashMap, because we need to use them
later.

Make sure we do not overwrite, but instead complete, existing entries in
a set when collecting overlapping prefixes.

Fixes: e6a906c ("feat(flow-filter): Add support for masquerade/port forwarding overlap")
Signed-off-by: Quentin Monnet <qmo@qmon.net>
@qmonnet qmonnet force-pushed the pr/qmonnet/flow-filter-fix branch from 30ea11b to ef5b833 Compare March 10, 2026 12:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/nat Related to Network Address Translation (NAT) bug Something isn't working ci:+vlab Enable VLAB tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flow-filter denies legit traffic, at least with masquerading.

2 participants