Add UID-based nftables firewall for NATS and monit connections#399
Add UID-based nftables firewall for NATS and monit connections#399rkoster wants to merge 5 commits intocloudfoundry:mainfrom
Conversation
Implement a firewall mechanism that restricts NATS (mbus) connections to the bosh-agent process only, using UID-based filtering with nftables. Key changes: - Add platform/firewall package with Manager and NatsFirewallHook interfaces - Implement NftablesFirewall that creates UID-based egress rules - Add GetNatsFirewallHook() to Platform interface - Integrate BeforeConnect hook in nats_handler.go for connection/reconnection - Support DNS re-resolution on reconnect for HA failover scenarios - Add stub implementations for Windows and dummy platforms The firewall rules allow only the agent's UID to connect to NATS/director ports while blocking other processes, improving security posture.
Add comprehensive unit tests for the new firewall functionality: platform/firewall tests (23 tests): - SetupMonitFirewall: table/chain/rule creation, error handling - SetupNATSFirewall: IPv4/IPv6, DNS resolution, https/empty URL handling - BeforeConnect: delegation to SetupNATSFirewall - Cleanup: table deletion and error handling mbus/nats_handler tests (4 new tests): - Firewall hook is called on Start - BeforeConnect receives correct mbus URL - Handler still starts when hook returns nil - Warning logged but no failure when BeforeConnect errors Also: - Add DNSResolver interface for testable DNS resolution - Inject resolver dependency via NewNftablesFirewallWithDeps - Configure test logging to use GinkgoWriter for visibility
- Fix ST1023 linter error: omit type from var declaration - Add linux_header.txt for counterfeiter to add build tags to Linux-only fakes - Regenerate fake_nftables_conn.go and fake_dnsresolver.go with //go:build linux tag - This fixes macOS/Windows build failures due to google/nftables being Linux-only
mbus/nats_handler.go
Outdated
| var natsOptions = []nats.Option{ | ||
| nats.RetryOnFailedConnect(true), | ||
| nats.DisconnectErrHandler(func(c *nats.Conn, err error) { | ||
| h.logger.Debug(natsHandlerLogTag, "Nats disconnected with Error: %v", err.Error()) |
There was a problem hiding this comment.
I know this was not introduced in your PR, but I see you handle err nil on line 170 and this seems to have similar issue?
| } | ||
|
|
||
| // Update firewall rules before initial connection | ||
| h.updateFirewallForNATS() |
There was a problem hiding this comment.
The SetupNetworking in ubuntu_net_manager is still calling SetupNatsFirewall. Will this be an issue to have both iptables with cgroup matching and nftables?
| &expr.Cmp{ | ||
| Op: expr.CmpOpEq, | ||
| Register: 1, | ||
| Data: net.ParseIP("127.0.0.1").To4(), |
There was a problem hiding this comment.
Do we also want to add a rule for ipv6 for loopback?
platform/firewall/firewall.go
Outdated
| SetupNATSFirewall(mbusURL string) error | ||
|
|
||
| // Cleanup removes all agent-managed firewall rules | ||
| Cleanup() error |
There was a problem hiding this comment.
I don't see this is being called?
There was a problem hiding this comment.
Good catch, have removed it, since even during a restart of the agent (which sometimes happens during an update_settings) it is better to keep the firewall rules in place.
platform/linux_platform.go
Outdated
| p.firewallManager = mgr | ||
|
|
||
| // Set up monit firewall rules immediately | ||
| if err := mgr.SetupMonitFirewall(); err != nil { |
There was a problem hiding this comment.
Wondering if this can be more explicitly called during setup and not in a getter?
| return f.conn.Flush() | ||
| } | ||
|
|
||
| func (f *NftablesFirewall) ensureTable() error { |
There was a problem hiding this comment.
Looks like these 2 methods always return nil?
|
Im a little worried by the general approach of teaching the agent about os-version specific things. Specifically I worry that it will (further?) violate layering by pushing version specific customisation from the stemcell into the agent. Its probably ok here since this somewhat sits between agent setup and stemcell config, but i wanted to at least mention it even if nothing comes of it. |
- Fix nil pointer dereference in DisconnectErrHandler when err is nil - Remove iptables-based SetupNatsFirewall code (replaced by nftables) - Remove unused Cleanup() method from firewall interface - Move firewall initialization from lazy getter to explicit SetupFirewall() - Add comment explaining IPv6 loopback is intentionally not protected (monit only binds to 127.0.0.1:2822)
There currently is nothing OS specific about this feature, because it works on both noble an jammy. So this is an effort to simplify and centralise all the different nats and monit firewall codepaths into the agent, where it can more easily be tested (compared to the stemcell builder). |
The nftables library batches operations until Flush() is called, so AddTable/AddChain/AddRule never return errors. Removing the misleading error return types from these internal helper methods.
There was a problem hiding this comment.
This was confusing as I was expecting a C header based on naming.
Maybe clearer would be to put it in firewallfakes/ and name it linux_build_flag_header.txt or - if not in the directory - have the name indicate counterfeiter or fakes: {counterfeiter|fakes}_linux_build_flag_header.txt.
Summary
This PR implements a UID-based nftables firewall to protect NATS (mbus) and monit connections, replacing the previous cgroup-based iptables approach.
Motivation
The existing cgroup-based firewall approach has limitations in nested container environments (like Garden containers running on BOSH VMs). In these environments:
The UID-based approach solves this by using nftables with
meta skuidmatching, which works reliably regardless of container nesting.Changes
New
platform/firewallPackagefirewall.go- DefinesManagerandNatsFirewallHookinterfacesnftables_firewall.go- Linux implementation usinggithub.com/google/nftables(netlink-based, no CLI required)nftables_firewall_other.go- Stub for non-Linux platformsFirewall Rules
Creates an nftables table
bosh_agentwith two output chains:monit_accessnats_accessPlatform Integration
GetNatsFirewallHook()toPlatforminterfaceLinuxPlatforminitializes firewall and implements the hookNATS Handler Integration
nats_handler.gocallsBeforeConnecthook before NATS connection and on each reconnectionTesting
Technical Details
github.com/google/nftableslibrary which communicates via netlink (nonftCLI needed)Testing Performed
ginkgo -r platform/firewall mbuscurl 127.0.0.1:2822hangs (blocked)curl 127.0.0.1:2822returns 401 (allowed through firewall)