Add TLS-through-proxy E2E test and diagnostic logging#88

Merged
shawnburke merged 12 commits into main from shawn/websocket-tls-proxy-e2e-test
Mar 11, 2026
@shawnburke shawnburke commented Mar 10, 2026

Summary

  • Adds diagnostic logging to ws_proxy.go for debugging WebSocket tunnel failures through corporate proxies
  • Adds unit test TestWebSocketProxyFullFlowThroughHTTPProxyWithTLS covering TCP → CONNECT → TLS → WebSocket flow
  • Enables full TLS-through-proxy scenario in E2E relay test (PROXY=1 mode)
  • Runs proxy and no-proxy E2E tests in parallel in CI (reduces CI time)

Context

A customer reported that the WebSocket proxy does not work through their corporate proxy (proxy-na0.fiserv.one:8080 → relay.cortex.io:443). The diagnostic logging will help identify where the tunnel fails in their environment. The E2E test now validates the same scenario: HTTP CONNECT proxy → TLS handshake → WebSocket upgrade.

Changes

Diagnostic logging (ws_proxy.go):

  • CONNECT response details (status, headers, Connection: close indication)
  • TLS handshake start/completion with peer certificate info
  • First read on tunnel startup (helps identify immediate failures)
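
For readers unfamiliar with the flow, the first two logging points above can be sketched roughly as follows. This is not the code in ws_proxy.go: the names `connectViaProxy`, `handshakeTLS`, `bufConn`, and `demo` are hypothetical, the log wording is invented, and the "proxy" is an in-process stand-in built on `net.Pipe` and `httptest`. The sketch stops after the TLS handshake and does not perform the WebSocket upgrade.

```go
package main

import (
	"bufio"
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"io"
	"log"
	"net"
	"net/http"
	"net/http/httptest"
)

// bufConn keeps any bytes the proxy sent right after its CONNECT reply.
type bufConn struct {
	net.Conn
	r *bufio.Reader
}

func (c *bufConn) Read(p []byte) (int, error) { return c.r.Read(p) }

// connectViaProxy sends CONNECT for target over conn and logs the proxy's reply.
func connectViaProxy(conn net.Conn, target string) (net.Conn, error) {
	if _, err := fmt.Fprintf(conn, "CONNECT %s HTTP/1.1\r\nHost: %s\r\n\r\n", target, target); err != nil {
		return nil, fmt.Errorf("write CONNECT: %w", err)
	}
	br := bufio.NewReader(conn)
	resp, err := http.ReadResponse(br, &http.Request{Method: http.MethodConnect})
	if err != nil {
		return nil, fmt.Errorf("read CONNECT response: %w", err)
	}
	log.Printf("CONNECT response: status=%q headers=%v close=%t", resp.Status, resp.Header, resp.Close)
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("proxy refused CONNECT: %s", resp.Status)
	}
	if n := br.Buffered(); n > 0 {
		log.Printf("warning: %d bytes already buffered after CONNECT response", n)
	}
	return &bufConn{Conn: conn, r: br}, nil
}

// handshakeTLS runs the TLS handshake over the tunnel and logs the peer certificate.
func handshakeTLS(tunnel net.Conn, cfg *tls.Config) (*tls.Conn, error) {
	log.Printf("TLS handshake starting: serverName=%s", cfg.ServerName)
	tc := tls.Client(tunnel, cfg)
	if err := tc.Handshake(); err != nil {
		return nil, fmt.Errorf("TLS handshake: %w", err)
	}
	if certs := tc.ConnectionState().PeerCertificates; len(certs) > 0 {
		log.Printf("TLS handshake complete: subject=%q notAfter=%s", certs[0].Subject, certs[0].NotAfter)
	}
	return tc, nil
}

// demo wires both steps to a local TLS server behind a stand-in CONNECT proxy.
func demo() error {
	srv := httptest.NewTLSServer(http.NotFoundHandler())
	defer srv.Close()
	target := srv.Listener.Addr().String()
	clientSide, proxySide := net.Pipe()
	defer clientSide.Close()
	go func() { // hypothetical transparent proxy: answer CONNECT, then copy bytes
		defer proxySide.Close()
		br := bufio.NewReader(proxySide)
		req, err := http.ReadRequest(br)
		if err != nil {
			return
		}
		up, err := net.Dial("tcp", req.Host)
		if err != nil {
			return
		}
		defer up.Close()
		fmt.Fprint(proxySide, "HTTP/1.1 200 Connection established\r\n\r\n")
		go io.Copy(up, br)
		io.Copy(proxySide, up)
	}()
	tunnel, err := connectViaProxy(clientSide, target)
	if err != nil {
		return err
	}
	roots := x509.NewCertPool()
	roots.AddCert(srv.Certificate())
	host, _, _ := net.SplitHostPort(target)
	tc, err := handshakeTLS(tunnel, &tls.Config{RootCAs: roots, ServerName: host})
	if err != nil {
		return err
	}
	defer tc.Close()
	return nil
}

func main() {
	if err := demo(); err != nil {
		log.Fatal(err)
	}
}
```

Wrapping the connection in `bufConn` is one way to avoid losing bytes that arrive in the same read as the CONNECT response; whether ws_proxy.go handles buffered data this way is not stated in this PR.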

Unit test (ws_proxy_test.go):

  • TestWebSocketProxyFullFlowThroughHTTPProxyWithTLS: Tests TLS through transparent HTTP CONNECT proxy

E2E test (test/relay/):

  • docker-compose.proxy.yml: Override for HTTPS broker URL in proxy mode
  • nginx-tls-proxy.conf: nginx TLS termination in front of snyk-broker
  • docker-compose.yml: Added snyk-broker-tls service, mitmproxy CA trust
  • relay_test.sh: Uses proxy override when PROXY=1
  • generate-certs.sh: Creates broker TLS cert for CI environments

CI (.github/workflows/ci.yml):

  • Split docker-tests into parallel jobs:
    • docker-build → uploads image artifact
    • scaffold-tests (parallel)
    • relay-test-no-proxy (parallel)
    • relay-test-with-proxy (parallel)

Test plan

  • go test ./server/snykbroker/... - all tests pass
  • ./relay_test.sh (non-proxy mode) - passes
  • PROXY=1 ./relay_test.sh (TLS-through-proxy) - passes
  • CI runs parallel tests successfully

🤖 Generated with Claude Code

shawnburke and others added 11 commits March 11, 2026 04:06
…roxy

This adds comprehensive testing and debugging capabilities for the WebSocket
proxy's TLS-through-HTTP-CONNECT-proxy scenario, which matches customer
production environments (corporate proxy → relay.cortex.io:443).

Changes:
- Add diagnostic logging to ws_proxy.go:
  - CONNECT response details (status, headers, close indication)
  - TLS handshake start/completion with certificate info
  - First read details on tunnel startup for debugging failures
  - Buffered data warnings after CONNECT

- Add new unit test TestWebSocketProxyFullFlowThroughHTTPProxyWithTLS:
  - Tests TCP → HTTP CONNECT → TLS handshake → WebSocket upgrade
  - Verifies TLS connection through transparent HTTP proxy
  - Confirms 101 Switching Protocols over TLS

- Enable TLS in E2E relay test (PROXY=1 mode):
  - Add nginx TLS termination proxy (snyk-broker-tls)
  - Create docker-compose.proxy.yml for HTTPS broker URL
  - Configure mitmproxy with combined CA bundle for upstream verification
  - Full test: axon-relay → mitmproxy → TLS → snyk-broker

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Split docker-tests job into parallel jobs:
  - docker-build: builds image and uploads artifact
  - scaffold-tests: runs scaffold tests (parallel)
  - relay-test-no-proxy: runs relay test without proxy (parallel)
  - relay-test-with-proxy: runs relay test with TLS proxy (parallel)

- Add certificate generation for CI:
  - generate-certs.sh: creates broker TLS cert from mitmproxy CA
  - Track mitmproxy CA files in git (needed to sign broker cert)
  - Generated certs (broker-server-*, combined-ca-bundle.crt) remain gitignored

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove t.Logf calls from goroutines in the proxy handler to prevent
"Log in goroutine after Test has completed" panic. The test may
return before the proxy's copy goroutines finish logging.
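
This commit fixes the panic by removing the log calls. A different pattern, shown here only as an illustrative alternative and not what this commit does, is to have the copy goroutines return data and to join them before the caller returns, so any logging happens on the caller's goroutine. The names `relay` and `demo` are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync"
)

// relay copies in both directions and returns only after both copy goroutines
// have exited. It reports byte counts instead of logging inside the goroutines,
// so the caller (for example a test) decides where to log.
func relay(a, b net.Conn) (aToB, bToA int64) {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		aToB, _ = io.Copy(b, a)
		b.Close() // unblock the opposite copy
	}()
	go func() {
		defer wg.Done()
		bToA, _ = io.Copy(a, b)
		a.Close()
	}()
	wg.Wait()
	return aToB, bToA
}

// demo pushes "ping" and "pong" through relay using in-memory pipes.
func demo() (int64, int64) {
	client, a := net.Pipe()
	b, server := net.Pipe()
	go func() { // server: read the request, answer, hang up
		io.ReadFull(server, make([]byte, 4))
		server.Write([]byte("pong"))
		server.Close()
	}()
	go func() { // client: send the request, read the answer, hang up
		client.Write([]byte("ping"))
		io.ReadFull(client, make([]byte, 4))
		client.Close()
	}()
	return relay(a, b)
}

func main() {
	up, down := demo()
	fmt.Printf("client->server: %d bytes, server->client: %d bytes\n", up, down)
}
```

In a test, the values returned by `relay` could be passed to `t.Logf` safely, because the call would run before the test function returns.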

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Revert to sequential test execution (simpler, more debuggable)
- Run proxy test before no-proxy test
- Add LOG_LEVEL=debug to proxy.env for verbose logging
- Add debug output when axon endpoint check fails:
  - Show full info response
  - Show axon-relay logs
  - Show mitmproxy logs (proxy mode)
  - List certificate files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The axon-relay was trying to proxy requests to cortex-fake through
mitmproxy, but DNS resolution for mitmproxy was failing. Add NO_PROXY
to bypass the proxy for internal services that don't need it.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The DNS lookup for mitmproxy was failing with "server misbehaving"
because mitmproxy wasn't fully ready when axon-relay started. Add a
healthcheck to verify mitmproxy is listening on port 8080.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The mitmproxy Docker image doesn't include netcat (nc), causing the
healthcheck to fail. Use python3's socket module instead, which is
available in the image.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
After killing the broker process, TCP sockets may remain in TIME_WAIT
state, causing EADDRINUSE if we start the new broker too quickly.

Add waitForPortAvailable() which polls until the port is bindable,
with a 10 second timeout. This gives the OS time to release the port
from TIME_WAIT state before the new broker tries to bind.
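
A minimal sketch of such a polling helper, assuming it is written in Go: the signature, the 100 ms poll interval, and the `demo` function are assumptions for illustration, and only the name `waitForPortAvailable` and the 10 second timeout come from the commit message.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"time"
)

// waitForPortAvailable polls until addr can be bound or the timeout elapses.
// Each attempt binds and immediately releases the port.
func waitForPortAvailable(addr string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		ln, err := net.Listen("tcp", addr)
		if err == nil {
			return ln.Close()
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("port %s still unavailable after %s: %w", addr, timeout, err)
		}
		time.Sleep(100 * time.Millisecond) // assumed poll interval
	}
}

// demo holds a port, checks that a short wait times out, then releases the
// port and checks that a longer wait succeeds.
func demo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	addr := ln.Addr().String()
	if err := waitForPortAvailable(addr, 200*time.Millisecond); err == nil {
		ln.Close()
		return fmt.Errorf("expected a timeout while %s is in use", addr)
	}
	time.AfterFunc(300*time.Millisecond, func() { ln.Close() })
	return waitForPortAvailable(addr, 10*time.Second)
}

func main() {
	if err := demo(); err != nil {
		log.Fatal(err)
	}
	fmt.Println("port became available")
}
```

There is a small window between the helper releasing the port and the new broker binding it, so this reduces the chance of EADDRINUSE rather than eliminating it.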

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The script was exiting early if broker certs existed, but the combined
CA bundle creation was after that check. This caused mitmproxy to fail
in CI with exit code 6 because combined-ca-bundle.crt didn't exist.

Move the combined CA bundle creation before the broker cert check so
it's always created. The combined bundle must be regenerated on each
machine anyway since system CAs differ between macOS/Linux/CI.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Show certificate files before docker-compose up, and capture
mitmproxy logs if the container fails to start.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The mitmproxy entrypoint script tries to change the mitmproxy user's
UID/GID to match files in /home/mitmproxy/.mitmproxy. In CI, files are
owned by UID 1001 (runner), but group 1001 doesn't exist in the
container, causing "usermod: group '1001' does not exist".

Fix by:
1. Mount certs to /certs instead of /home/mitmproxy/.mitmproxy
2. Use --set confdir=/certs to tell mitmproxy where to find CA files

This prevents the entrypoint from finding files in the default location
and trying to change ownership.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@shawnburke shawnburke force-pushed the shawn/websocket-tls-proxy-e2e-test branch 2 times, most recently from 8bb9fa6 to 46b10b3 Compare March 11, 2026 00:03
@shawnburke shawnburke force-pushed the shawn/websocket-tls-proxy-e2e-test branch from 46b10b3 to b9be750 Compare March 11, 2026 00:27
@shawnburke shawnburke merged commit da8d4ec into main Mar 11, 2026
16 checks passed
@shawnburke shawnburke deleted the shawn/websocket-tls-proxy-e2e-test branch March 11, 2026 00:39