fix(security): rebuild proxy URL from trusted endpoint (py/full-ssrf)#100
Conversation
…ll-ssrf The previous py/partial-ssrf fix validated the joined URL's scheme and netloc against the trusted deployment endpoint, but CodeQL upgraded the finding to py/full-ssrf because the final target_url still flowed directly from urljoin(endpoint, user_path) into client.stream(url=...). Rebuild target_url from urlunparse() using the trusted endpoint's scheme and netloc, with only path/params/query carried over from the joined URL. The validation still raises 400 for mismatches; the rebuild step makes the host portion definitely come from the DB-loaded endpoint, breaking the full-URL taint flow. Also added an upfront sanity check that rejects empty scheme/netloc on the endpoint so a malformed deployment record can't degrade the guard. Signed-off-by: mschwab <mschwab@nvidia.com>
Signed-off-by: mschwab <mschwab@nvidia.com>
|
📝 WalkthroughWalkthroughGateway proxy target URL validation is refactored to robustly validate the configured endpoint, join request paths, and reject requests that attempt to change the upstream scheme or host. The logic now validates endpoint configuration upfront, rebuilds the final URL via ChangesProxy Target URL Validation
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
plugins/nemo-agents/src/nemo_agents_plugin/api/v2/gateway.py (1)
169-170: 💤 Low valuePre-existing edge case: query string collision.
If
trailing_uricontains an encoded?(e.g.,path%3Fx=1),joined.querywill be non-empty. Appending?{request.url.query}then produces a malformed URL with two?characters. Not introduced by this PR, but may warrant a future fix.Example fix
- if request.url.query: - target_url = f"{target_url}?{request.url.query}" + if request.url.query: + sep = "&" if joined.query else "?" + target_url = f"{target_url}{sep}{request.url.query}"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/nemo-agents/src/nemo_agents_plugin/api/v2/gateway.py` around lines 169 - 170, The current append logic blindly adds "?{request.url.query}" which produces malformed URLs when the target already contains a query or an encoded '?' in trailing_uri; update the conditional to detect existing queries by checking joined.query or the presence of '?' in target_url or '%3F' in trailing_uri and, when any are true, append "&{request.url.query}" instead of "?{request.url.query}"; reference the variables target_url, request.url.query, trailing_uri and joined.query in the fix.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@plugins/nemo-agents/src/nemo_agents_plugin/api/v2/gateway.py`:
- Around line 169-170: The current append logic blindly adds
"?{request.url.query}" which produces malformed URLs when the target already
contains a query or an encoded '?' in trailing_uri; update the conditional to
detect existing queries by checking joined.query or the presence of '?' in
target_url or '%3F' in trailing_uri and, when any are true, append
"&{request.url.query}" instead of "?{request.url.query}"; reference the
variables target_url, request.url.query, trailing_uri and joined.query in the
fix.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: a63e4b0d-8b8b-4b42-aae5-d0c6d2b064a5
📒 Files selected for processing (1)
plugins/nemo-agents/src/nemo_agents_plugin/api/v2/gateway.py
Summary
Batch 3 of the CodeQL cleanup. Addresses the 1 critical
py/full-ssrfalert that surfaced after #91 cleared the relatedpy/partial-ssrf.Alert addressed
py/full-ssrfplugins/nemo-agents/.../api/v2/gateway.py—client.stream(url=target_url, ...)wheretarget_urlwas the result ofurljoin(endpoint, user_path). The previouspy/partial-ssrffix validated the joined URL's scheme/netloc against the trusted deployment endpoint, but CodeQL upgraded the finding because the full URL still flowed directly from a user-supplied path component into the request.target_urlviaurlunparse((endpoint.scheme, endpoint.netloc, joined.path, joined.params, joined.query, ""))after the existing scheme/netloc match check. The host portion now definitely comes from the DB-loaded deployment endpoint instead of from the user-supplied path text, breaking the full-URL taint flow. Added an upfront 500 if the deployment's stored endpoint has no scheme/netloc so a malformed record can't sneak past the guard.Test plan
uv run pytest plugins/nemo-agents/tests/unit/test_gateway.py— 20 pass (including the existing parametrized SSRF guard test from fix(security): address high+ CodeQL alerts across plugins, services, SDK #91)ruff checkon touched file — cleanpy/full-ssrfclosesStacking
This branch is off
main. PRs #91 is merged; #98 is still open. No conflicts expected with either.Summary by CodeRabbit