-
Notifications
You must be signed in to change notification settings - Fork 2
fix: ensure fully qualified URL recorded for auditing #142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bec1a7e to
5948012
Compare
| Method string | ||
| URL string | ||
| URL string // The fully qualified request URL (scheme, domain, optional path). | ||
| Host string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Including the host here seems redundant if the URL is fully qualified. Should we remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for analyzing logs and building dashboards based on logs - it will be easier to have structured logs:
- scheme
- domain
- path
- query_params
?
Probably we'll want to run some aggregations queries based on domain? It may be harder to run aggregations queries if we only have string with fully qualified URL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now I'm inclined to just make this the fully-qualified URL so that the application logs are accurate and the consumers actually get the full URL. Consumers can parse it, if desired.
That said, I think it would make sense to ultimately structure the logs a bit better in coder. I'll see how things pan out after dogfooding for a bit.
proxy/proxy.go
Outdated
| // For HTTP proxy requests, req.URL contains the full URL. | ||
| // For HTTPS via CONNECT tunnel, req.URL only contains the path. | ||
| evalURL := req.URL.String() | ||
| if req.URL.Scheme == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check that req.URL.Scheme is empty or req.URL.Host is empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be the scheme because http.ReadRequest() removes the scheme internally for CONNECT.
Previously the audit URL only contained the path (e.g., "/lookup/github.com/...") instead of the full URL with scheme and host (e.g., "https://sum.golang.org/..."). This happened because http.ReadRequest() only populates req.URL with the path when clients don't know they're going through a proxy (boundary's normal transparent proxy operation). The fix constructs a fully qualified URL by prepending the scheme and host when req.URL.Scheme is empty. Note: When clients explicitly configure boundary as a proxy (e.g., via HTTP_PROXY or curl -x), req.URL already contains the full URL and no reconstruction is needed. This case is uncommon in normal boundary usage but is handled correctly. Fixes #141
5948012 to
b7c34b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before:
time=2026-01-07T20:23:10.453Z level=INFO msg=ALLOW method=GET url=/test host=httpforever.com rule="domain=httpforever.com”
After:
time=2026-01-07T20:24:02.912Z level=INFO msg=ALLOW method=GET url=http://httpforever.com/test host=httpforever.com rule="domain=httpforever.com"
Previously the audit URL only contained the path (e.g., "/lookup/github.com/...") for HTTPS requests via CONNECT tunnel, instead of the full URL with scheme and host (e.g.,
"https://sum.golang.org/lookup/github.com/...").For example, the application logs look like this
time=2026-01-06T15:51:12.123Z level=WARN msg=DENY method=POST url=/api/api_post.php host=pastebin.com. Theurlfield should beurl=https://pastebin.com/api/api_post.php.This happened because http.ReadRequest() only populates req.URL with the path when parsing requests inside a TLS tunnel. The fix checks if req.URL.Scheme is empty and prepends the scheme and host when needed. A single URL is constructed for use with the rules engine and auditing purposes.
Fixes #141