feat(custom-engine): HTTP transport multipart file upload (http.files)#102
Open
zpzjzj wants to merge 3 commits into
Open
feat(custom-engine): HTTP transport multipart file upload (http.files)#102zpzjzj wants to merge 3 commits into
zpzjzj wants to merge 3 commits into
Conversation
Builds on the http JSON core (#99): when engine.custom.http.files is declared, the request becomes multipart/form-data — the rendered JSON body is sent as the `payload` field and each matched workspace file as a `files` part (the part filename is the workspace-relative path). Without files the request stays plain JSON (unchanged). - config: validateCustomHTTP now accepts files; each path must be a non-empty relative path (no leading / or ..). Shared config.WorkspaceRelPathSafe applies the same rule at config load and in the transport. - agent (custom_http_files.go): expandHTTPFiles resolves each entry against the workspace file listing — exact path or doublestar glob; only regular files (the `find -type f` listing excludes directories and .git); required:true (the default) errors on no match, required:false skips; results deduped + sorted. buildMultipartBody streams the payload + file parts with a 256 MiB total cap. Reuses the temp-file + DownloadFile read pattern. - custom_http.go run(): un-blanks rt and dispatches JSON vs multipart via the new buildRequest; the multipart Content-Type (with boundary) wins over any user header. listWorkspaceFiles mirrors the evaluator's collect-artifacts lister; they stay separate only because agent cannot import evaluator (a shared leaf package would consolidate them). Scope: multipart upload only. URL artifact download in the result is a follow-up. Tests: httptest multipart cases (exact file, glob excluding dirs/non-matches, required-missing error, optional-missing skip, unsafe path reject, multipart Content-Type, payload carries SessionInput); validator file-path cases. make test (race), make verify (0 issues), and the e2e custom-engine suite are green. Refs docs/design/custom-engine.md (HTTP file upload) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…r test CI (ubuntu + windows) failed on the newline-filename test: - A newline in a multipart part filename corrupts the Content-Disposition header (Go's multipart writer does not escape CR/LF) — a real injection/ corruption edge, not just a test artifact. - Windows cannot create a newline-named file, so the real-file test was non-portable. Fix: listWorkspaceFiles now also drops entries containing CR/LF (alongside the absolute/.. filter and the -print0 split fix), so such names never reach a multipart filename or DownloadFile. Replace the non-portable real-file test with a portable fakeFindRuntime test that feeds canned `find -print0` output (a safe relative path, an absolute entry, and a CR/LF name) and asserts only the safe relative path survives expandHTTPFiles.
Collaborator
Author
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 254a3669d6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…ching listWorkspaceFiles strips a leading ./ from every workspace entry, but expandHTTPFiles matched the raw configured path/glob against the stripped listing. So a natural relative form like `./diff.patch` or `./src/*.go` matched nothing — a required entry errored "matched no files" and an optional one was silently skipped even though the file exists. Normalize the configured pattern (strip leading ./) the same way before matching. Test covers exact + glob. Found by Codex review on #102.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Builds on the http JSON core (#99). When
engine.custom.http.filesis declared, the request becomesmultipart/form-data— the rendered JSON body is sent as thepayloadfield and each matched workspace file as afilespart (the part filename is the workspace-relative path). Without files the request stays plain JSON (unchanged).Refs
docs/design/custom-engine.md(HTTP file upload). NotCloses— URL artifact download in the result is the remaining follow-up (PR-C).What's implemented
validator.go):validateCustomHTTPaccepts files; eachpathmust be a non-empty workspace-relative path (no leading/or..). The sharedconfig.WorkspaceRelPathSafeapplies the same rule at config load and in the transport.custom_http_files.go):expandHTTPFilesresolves each entry against the workspace file listing — exact path or doublestar glob (*,**); only regular files (thefind -type flisting excludes directories and.git);required: true(default) errors on no match,required: falseskips; results deduped + sorted.buildMultipartBody/addFilePartstream each file (download →os.Statcap-check →io.Copy) with a 256 MiB cumulative cap.custom_http.go):rundispatches JSON vs multipart via a newbuildRequest; the multipartContent-Type(with boundary) wins over any user header.Security hardening (from self-review)
find -print0(NUL-separated) instead of newline-separated: a workspace filename containing a newline can no longer split one listing entry into a bogus absolute path that a**/*glob would match and read off-workspace. Listing entries are additionally filtered throughWorkspaceRelPathSafe. Regression test included.os.Statbefore loading, and content is streamed (io.Copy), so a single oversized file is rejected before it is read into memory.Known limitations
maxHTTPUploadBytes(256 MiB); with many parallel cases peak RAM isN × cap. A streamingio.Pipebody would lower it — a future improvement (noted in code).internal/evaluator/artifacts_collect.go::listWorkspaceFileshas the same pre-existing newline-split issue; out of scope here, flagged for a separate fix.Test plan
make test(go test -race ./...) — green; newcustom_http_test.gomultipart cases: exact file, glob (excludes dirs + non-matches), required-missing error, optional-missing skip, unsafe-path reject, newline-filename regression, multipart Content-Type, payload carriesSessionInput.internal/confighttp file-path validator cases (valid globs /../ absolute).make verify(fmt + vet + revive + golangci-lint) — 0 issues.make e2ecustom-engine (local + http JSON) suite — green.Notes for reviewers
listWorkspaceFilesmirrors the evaluator's collect-artifacts lister; they stay separate only becauseagentcannot importevaluator(a shared leaf package would consolidate them).🤖 Generated with Claude Code