Add TLS-through-proxy E2E test and diagnostic logging #88
Merged
shawnburke merged 12 commits into main on Mar 11, 2026
…roxy

This adds comprehensive testing and debugging capabilities for the WebSocket proxy's TLS-through-HTTP-CONNECT-proxy scenario, which matches customer production environments (corporate proxy → relay.cortex.io:443).

Changes:
- Add diagnostic logging to ws_proxy.go:
  - CONNECT response details (status, headers, close indication)
  - TLS handshake start/completion with certificate info
  - First read details on tunnel startup for debugging failures
  - Buffered data warnings after CONNECT
- Add new unit test TestWebSocketProxyFullFlowThroughHTTPProxyWithTLS:
  - Tests TCP → HTTP CONNECT → TLS handshake → WebSocket upgrade
  - Verifies TLS connection through transparent HTTP proxy
  - Confirms 101 Switching Protocols over TLS
- Enable TLS in E2E relay test (PROXY=1 mode):
  - Add nginx TLS termination proxy (snyk-broker-tls)
  - Create docker-compose.proxy.yml for HTTPS broker URL
  - Configure mitmproxy with combined CA bundle for upstream verification
  - Full test: axon-relay → mitmproxy → TLS → snyk-broker

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Split docker-tests job into parallel jobs:
  - docker-build: builds image and uploads artifact
  - scaffold-tests: runs scaffold tests (parallel)
  - relay-test-no-proxy: runs relay test without proxy (parallel)
  - relay-test-with-proxy: runs relay test with TLS proxy (parallel)
- Add certificate generation for CI:
  - generate-certs.sh: creates broker TLS cert from mitmproxy CA
  - Track mitmproxy CA files in git (needed to sign broker cert)
  - Generated certs (broker-server-*, combined-ca-bundle.crt) remain gitignored

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
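To illustrate what "creates broker TLS cert from mitmproxy CA" involves, here is a Go sketch of signing a server certificate with a CA and checking that it verifies. The contents of generate-certs.sh are not shown in this PR, so this is not that script: the function names are hypothetical, the CA is a throwaway stand-in for the mitmproxy CA, and the validity periods are arbitrary.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"math/big"
	"time"
)

// newCA creates a throwaway self-signed CA (a stand-in for the mitmproxy CA).
func newCA() (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "example test CA"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	// Self-signed: the template is its own parent.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// signServerCert issues a server certificate for host, signed by the CA.
func signServerCert(ca *x509.Certificate, caKey *ecdsa.PrivateKey, host string) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: host},
		DNSNames:     []string{host}, // clients match the hostname against SANs
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, ca, &key.PublicKey, caKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// verifyServerCert checks that leaf chains to ca and is valid for host.
func verifyServerCert(leaf, ca *x509.Certificate, host string) error {
	roots := x509.NewCertPool()
	roots.AddCert(ca)
	_, err := leaf.Verify(x509.VerifyOptions{Roots: roots, DNSName: host})
	return err
}
```

In the E2E setup the verifying side would be mitmproxy checking the nginx TLS proxy's certificate, which is why the CA has to be available wherever the broker cert is generated.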
Remove t.Logf calls from goroutines in the proxy handler to prevent "Log in goroutine after Test has completed" panic. The test may return before the proxy's copy goroutines finish logging. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
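The commit above fixes the panic by removing the log calls. For cases where goroutine logging is worth keeping, one alternative is to gate the logger so that calls made after the test ends are dropped. The sketch below is an assumption-laden illustration, not code from this PR: `gatedLogger` and its methods are hypothetical names.

```go
package main

import "sync"

// gatedLogger forwards messages to logf until Close is called,
// after which messages are silently dropped.
type gatedLogger struct {
	mu     sync.Mutex
	closed bool
	logf   func(format string, args ...any)
}

// newGatedLogger wraps a printf-style function such as (*testing.T).Logf.
func newGatedLogger(logf func(format string, args ...any)) *gatedLogger {
	return &gatedLogger{logf: logf}
}

// Logf is safe to call from any goroutine. The mutex is held while
// logging, so Close cannot return while a log call is still in flight.
func (g *gatedLogger) Logf(format string, args ...any) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.closed {
		return
	}
	g.logf(format, args...)
}

// Close stops all further logging.
func (g *gatedLogger) Close() {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.closed = true
}
```

In a test this could be wired up as `gl := newGatedLogger(t.Logf)` followed by `t.Cleanup(gl.Close)`, with the proxy's copy goroutines calling `gl.Logf` instead of `t.Logf`.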
- Revert to sequential test execution (simpler, more debuggable)
- Run proxy test before no-proxy test
- Add LOG_LEVEL=debug to proxy.env for verbose logging
- Add debug output when axon endpoint check fails:
  - Show full info response
  - Show axon-relay logs
  - Show mitmproxy logs (proxy mode)
  - List certificate files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The axon-relay was trying to proxy requests to cortex-fake through mitmproxy, but DNS resolution for mitmproxy was failing. Add NO_PROXY to bypass the proxy for internal services that don't need it. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
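The effect of NO_PROXY can be seen with Go's standard `http.ProxyFromEnvironment`. The sketch below assumes the relay's HTTP client picks its proxy from the environment this way, which this PR does not confirm, and the variable values in `setExampleProxyEnv` are illustrative rather than copied from proxy.env.

```go
package main

import (
	"net/http"
	"os"
)

// setExampleProxyEnv sets illustrative values (assumptions, not the
// actual contents of proxy.env). It must run before the first call to
// http.ProxyFromEnvironment, which reads the environment only once.
func setExampleProxyEnv() {
	os.Setenv("HTTP_PROXY", "http://mitmproxy:8080")
	os.Setenv("HTTPS_PROXY", "http://mitmproxy:8080")
	os.Setenv("NO_PROXY", "cortex-fake")
}

// proxyFor reports which proxy a request to rawURL would use,
// or "direct" when the request bypasses the proxy.
func proxyFor(rawURL string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return "", err
	}
	u, err := http.ProxyFromEnvironment(req)
	if err != nil {
		return "", err
	}
	if u == nil {
		return "direct", nil
	}
	return u.String(), nil
}
```

With these values, a request to `http://cortex-fake/` goes direct while a request to `https://relay.cortex.io/` is sent through `http://mitmproxy:8080`.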
The DNS lookup for mitmproxy was failing with "server misbehaving" because mitmproxy wasn't fully ready when axon-relay started. Add a healthcheck to verify mitmproxy is listening on port 8080. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The mitmproxy Docker image doesn't include netcat (nc), causing the healthcheck to fail. Use python3's socket module instead, which is available in the image. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
After killing the broker process, TCP sockets may remain in TIME_WAIT state, causing EADDRINUSE if we start the new broker too quickly. Add waitForPortAvailable() which polls until the port is bindable, with a 10 second timeout. This gives the OS time to release the port from TIME_WAIT state before the new broker tries to bind. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The script was exiting early if broker certs existed, but the combined CA bundle creation was after that check. This caused mitmproxy to fail in CI with exit code 6 because combined-ca-bundle.crt didn't exist. Move the combined CA bundle creation before the broker cert check so it's always created. The combined bundle must be regenerated on each machine anyway since system CAs differ between macOS/Linux/CI. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Show certificate files before docker-compose up, and capture mitmproxy logs if the container fails to start. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The mitmproxy entrypoint script tries to change the mitmproxy user's UID/GID to match files in /home/mitmproxy/.mitmproxy. In CI, files are owned by UID 1001 (runner), but group 1001 doesn't exist in the container, causing "usermod: group '1001' does not exist".

Fix by:
1. Mount certs to /certs instead of /home/mitmproxy/.mitmproxy
2. Use --set confdir=/certs to tell mitmproxy where to find CA files

This prevents the entrypoint from finding files in the default location and trying to change ownership.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
8bb9fa6 to 46b10b3
46b10b3 to b9be750
Summary
- Diagnostic logging in `ws_proxy.go` for debugging WebSocket tunnel failures through corporate proxies
- Unit test `TestWebSocketProxyFullFlowThroughHTTPProxyWithTLS` covering TCP → CONNECT → TLS → WebSocket flow
- TLS enabled in the E2E relay test (`PROXY=1` mode)

Context
A customer reported that the WebSocket proxy does not work through their corporate proxy (`proxy-na0.fiserv.one:8080` → `relay.cortex.io:443`). The diagnostic logging will help identify where the tunnel fails in their environment. The E2E test now validates the same scenario: HTTP CONNECT proxy → TLS handshake → WebSocket upgrade.

Changes
Diagnostic logging (`ws_proxy.go`):
- CONNECT response details (status, headers, `Connection: close` indication)

Unit test (`ws_proxy_test.go`):
- `TestWebSocketProxyFullFlowThroughHTTPProxyWithTLS`: Tests TLS through transparent HTTP CONNECT proxy

E2E test (`test/relay/`):
- `docker-compose.proxy.yml`: Override for HTTPS broker URL in proxy mode
- `nginx-tls-proxy.conf`: nginx TLS termination in front of snyk-broker
- `docker-compose.yml`: Added snyk-broker-tls service, mitmproxy CA trust
- `relay_test.sh`: Uses proxy override when `PROXY=1`
- `generate-certs.sh`: Creates broker TLS cert for CI environments

CI (`.github/workflows/ci.yml`):
- Split `docker-tests` into parallel jobs:
  - `docker-build` → uploads image artifact
  - `scaffold-tests` (parallel)
  - `relay-test-no-proxy` (parallel)
  - `relay-test-with-proxy` (parallel)

Test plan
- `go test ./server/snykbroker/...` - all tests pass
- `./relay_test.sh` (non-proxy mode) - passes
- `PROXY=1 ./relay_test.sh` (TLS-through-proxy) - passes

🤖 Generated with Claude Code