Skip to content

feat(storage): add storage-disk abstraction with local and S3 drivers (#3157)#3248

Open
bpamiri wants to merge 3 commits into
developfrom
fix/bot-3157-storage-disk-abstraction
Open

feat(storage): add storage-disk abstraction with local and S3 drivers (#3157)#3248
bpamiri wants to merge 3 commits into
developfrom
fix/bot-3157-storage-disk-abstraction

Conversation

@bpamiri

@bpamiri bpamiri commented Jun 24, 2026

Copy link
Copy Markdown
Collaborator

Closes #3157. Phase 1 of the attachments line (child of #2962, sequenced last and independently shippable).

What this adds

A pluggable storage layer under vendor/wheels/storage/ — local filesystem + S3 behind one uniform disk interface — modeled on Laravel Filesystem / Rails ActiveStorage / AdonisJS Drive. Fully self-contained: no edits to existing framework files, purely additive.

  • vendor/wheels/interfaces/StorageDiskInterface.cfc — 6-method contract: put / get / exists / delete / url / signedUrl.
  • vendor/wheels/storage/StorageManager.cfc — resolves named disks from config and caches them (mirrors service() / model() resolution).
  • vendor/wheels/storage/drivers/LocalDisk.cfc — filesystem disk; HMAC-signed local URLs + a verifySignature() for a future serving route; rejects .. traversal; creates dirs via java.io.File.mkdirs() (not Lucee-only DirectoryCreate).
  • vendor/wheels/storage/drivers/S3Disk.cfc — S3 over plain cfhttp, no AWS SDK / no JARs.
  • vendor/wheels/storage/S3Signer.cfcfrom-scratch AWS SigV4 (presigned GET URLs + Authorization-header signing).
  • vendor/wheels/tests/specs/storage/StorageSpec.cfc — 20 specs.
  • changelog.d/3157-storage-disk-abstraction.added.md.

Explicitly deferred to follow-ups (per the issue body + parent #2962)

  • The hasOneAttached model macro / polymorphic blob tables (Phase 2).
  • The built-in local-file-serving route + handler (the one genuinely-new infra piece).
  • The storage() controller/model mixin helper.

So this slice is drivers only — no unmade design call blocks it. For now, resolve a disk programmatically (e.g. register StorageManager as a singleton in config/services.cfm).

Test status — GREEN on two engines

Ran the storage bundle against the worktree on prebuilt engine images (dir-only Docker mount):

Engine Result
Lucee 7.0.1 20 pass / 0 fail / 0 error (HTTP 200)
Adobe CF 2023 20 pass / 0 fail / 0 error (HTTP 200)

The SigV4 presign spec pins the AWS-documented test vector (examplebucket/test.txt, 20130524T000000Z → signature aeeed9bb…f604d404). That value was also reproduced independently from the canonical algorithm in Python before trusting it, so the signer is provably byte-correct against AWS's own reference — on both Lucee and Adobe.

Bug found and fixed during validation

The original spec used var url = signer.presignGetUrl(...) then expect(url). url is a CFML reserved scope (Anti-Pattern #11): the reference resolved to the URL scope struct, so the matcher compared against {db=…, format=…, directory=…, method=runRemote} instead of the return value — 3 specs failed on Lucee. Renamed the locals to presigned. (The production code never names anything url, so only the test was affected.) This is exactly the silent-ship trap that running the suite catches.

Cross-engine notes (verified)

  • Invariant 10: the S3 cfhttp wrapper builds a plain header struct and attaches each header via cfhttpparam in a loop — never attributeCollection=arguments. Same pattern as vendor/wheels/services/packages/HttpClient.cfc.
  • SigV4 crypto: SHA-256 hex + chained HMAC-SHA256 key derivation via javax.crypto.Mac (the built-in HMac() only takes string keys; SigV4 chains binary-keyed rounds). Same primitive family as wheels.auth.JwtService.
  • Canonical query sorts keys with ArraySort(…, "text") — does not rely on struct insertion-order iteration.
  • No closures-as-constructor-args (Invariant 5); no obj.map(); no reserved-scope params in production code; driver private helpers are fine (standalone services, not $integrateComponents-mixed).

Sensitive-area review flags (kept as DRAFT)

This touches DI/config resolution + security-sensitive crypto + outbound HTTP, so extra scrutiny is warranted on:

  • The SigV4 signer — request-signing correctness and credential handling. (Mitigated: AWS test-vector pinned + green on Lucee 7 and Adobe 2023.)
  • Path-traversal rejection in LocalDisk.$resolve (.. blocked).
  • Local signed-URL HMAC verification (verifySignature) — constant-time-ish compare; note the future serving route that consumes it is deferred, so nothing serves local files yet.

🤖 Generated with Claude Code

…#3157)

Introduce a pluggable storage layer under vendor/wheels/storage: a small
uniform disk interface (put/get/exists/delete/url/signedUrl), LocalDisk and
S3Disk drivers, and a StorageManager that resolves named disks from config
(mirroring service()/model() resolution).

S3 access uses a from-scratch SigV4 signer (no AWS SDK, no JARs): presigned
expiring GET URLs plus Authorization-header signing for cfhttp put/get/delete.
The HMAC key-derivation and SHA-256 canonical hashing reuse the same
cross-engine-green primitives as wheels.auth.JwtService.

Phase 1 of #2962. The hasOneAttached model macro (Phase 2), the built-in
local-file-serving route, and the storage() controller/model mixin helper are
deferred to follow-ups.

Tests: storage spec green on Lucee 7 and Adobe CF 2023 (20/20 each); the
SigV4 presign test pins the AWS-documented test vector. Fixed a reserved-scope
shadowing bug in the spec (Anti-Pattern #11): `var url` reads the URL scope on
both engines, so `expect(url)` saw the request struct, not the return value;
renamed to `presigned`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Peter Amiri <petera@pai.com>
@bpamiri

bpamiri commented Jun 24, 2026

Copy link
Copy Markdown
Collaborator Author

Adversarial review follow-ups (workflow triage-and-fix-open-issues)

This PR was implemented and reviewed by an automated triage+fix workflow. The review verdict was looks-good / cross-engine clean / fixes the issue, with the SigV4 signature math independently re-derived in Python and matched byte-for-byte against AWS's reference vector. The items below are non-blocking nits captured so they aren't lost before this graduates from draft.

Code nits

  • SigV4 canonical-query sort is case-sensitive. S3Signer.$buildCanonicalQuery sorts with ArraySort(keys, "text"). SigV4 mandates byte/ASCII order of the percent-encoded keys. This is correct only for the current fixed key set (X-Amz-*, optional response-content-disposition) — harden to true byte-order (or document the invariant) before the API ever accepts caller-supplied / mixed-case query params.
  • LocalDisk.verifySignature() comment overstates the guarantee. It labels the compare "Constant-time-ish", but CompareNoCase(expected, signature) == 0 short-circuits on the first differing char and is not constant-time. Drop the wording, or implement a real fixed-time compare, before the local-serve route lands.
  • S3Signer.$amzNow() is the one untested production path. Every spec pins amzDate for determinism, so the live DateConvert("local2utc", Now()) + format path is never exercised. Add a test (or inject/pin the clock) so timestamp generation + UTC conversion is covered.

Design decisions to confirm before merge

@bpamiri bpamiri marked this pull request as ready for review June 24, 2026 12:00

@wheels-bot wheels-bot Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wheels Bot — Reviewer

TL;DR — This adds a purely-additive, well-factored storage-disk abstraction (LocalDisk + S3Disk behind a 6-method interface, resolved by StorageManager, with a from-scratch SigV4 signer pinned to the AWS test vector). The design, cross-engine discipline, changelog fragment, and commit hygiene are all solid. One correctness issue blocks: S3Disk treats a connection-failure / unparseable HTTP status as success, so put() silently reports a stored object that was never written. Verdict: request changes for that one fix; everything else is non-blocking.

Correctness

vendor/wheels/storage/drivers/S3Disk.cfc:128 + :46, :55-61, :67, :74 — a failed/connection-less request is interpreted as success (silent data loss).

$request builds its result with:

statusCode = ListFirst(local.httpResult.statusCode ?: "0", " "),

cfhttp here does not set throwOnError, which defaults to false on Lucee and Adobe, so a DNS/network/connection failure does not throw — it returns a non-numeric status. On Lucee cfhttp.statusCode is "Connection Failure"ListFirst(..., " ")"Connection"Val("Connection") = 0; on a missing key the elvis yields "0". Either way the downstream guards read 0, so put() does not throw and returns the key as if the object were stored — silent data loss. The same 0-is-success hole affects get() (line 61 returns the "Connection Failure" error body as if it were the object), exists() (line 67 returns true), and delete() (line 74 returns true). Note credential/permission errors are handled correctly (403/404 ≥ 300 → throw); it is specifically the connection-failure / empty-status path that slips through.

Fix: treat a non-2xx or a non-positive/unparseable status as failure — capture local.code = Val(local.result.statusCode) and throw Wheels.Storage.RequestFailed when local.code < 200 || local.code >= 300 (keeping get()'s 404 → Wheels.Storage.NotFound special-case first). The repo's own vendor/wheels/services/packages/HttpClient.cfc:31 deliberately returns the status to the caller rather than interpreting 0 as OK — worth mirroring that intent here.

Non-blocking observation — S3Disk.cfc:91-94 ($objectPath) vs S3Signer.cfc:136-138 path encoding. The signer signs the RFC3986-encoded canonical URI ($uriEncodePath), but $objectPath builds the actual request URL from the raw, unencoded key. For simple keys (the tested cases) these are byte-identical and fine; for keys containing spaces or reserved characters the signed path and the wire path can diverge and S3 may return SignatureDoesNotMatch. I cannot prove it breaks without a live S3 (cfhttp's own URL-encoding behavior is engine-dependent), so this is a heads-up, not a blocker — consider running the request URL through the same encoder the signer uses.

Cross-engine

No issues found. The SigV4 crypto uses javax.crypto.Mac with binary keys (correct — built-in HMac() only takes string keys), $sha256Hex branches on IsBinary, the canonical query sorts via ArraySort(..., "text") rather than relying on struct iteration order, LocalDisk.$ensureParentDir uses java.io.File.mkdirs() instead of the Lucee-only DirectoryCreate recurse flag, and the cfhttp wrapper attaches headers one-by-one through a plain struct rather than attributeCollection=arguments (Cross-Engine Invariant 10). Left(local.amzDate, 8) operates on a fixed 16-char timestamp, so the Lucee-7 Left(str,0) trap (Invariant 8) does not apply. PR reports 20/20 green on Lucee 7 and Adobe 2023.

Tests

StorageSpec.cfc (20 specs) covers the SigV4 presign against the AWS-documented vector, header signing, the full LocalDisk lifecycle, path-traversal rejection, signed-URL round-trip + expiry, and StorageManager resolution/caching/error paths — good happy-and-error-path coverage for the unit-testable surface. Gap (acceptable for a drivers-only Phase 1): the S3Disk network methods (put/get/exists/delete over cfhttp) are untested, which is exactly why the status-handling bug above went unnoticed. Hard to unit-test without an HTTP mock, but the fix should at least be reasoned through against the connection-failure status.

Security

Minor — LocalDisk.cfc:99-100: the comment calls the token check "Constant-time-ish", but CompareNoCase short-circuits and is not constant-time, so it is a (low-risk, local-only) timing oracle on the HMAC token. Consider a length-then-XOR-accumulate compare and softening the comment. Also note signedUrl() does not bind disposition into the HMAC (line 79 appends it outside $sign), so a holder of a valid URL can alter the disposition param — low impact, worth a one-line note. Neither blocks.

Docs

Changelog fragment changelog.d/3157-storage-disk-abstraction.added.md is present and correctly named (<slug>.added.md) — no direct CHANGELOG.md edit. No .ai/wheels/storage/ reference doc or user guide page yet; reasonable to defer since nothing is consumer-facing until the Phase 2 mixin/route lands, but worth adding when it becomes usable.

Commits

Head commit feat(storage): add storage-disk abstraction with local and S3 drivers (#3157) is a valid conventional-commit header (~76 chars, allowed type, sentence-case) and matches the squash-landing PR title. DCO sign-off present (Signed-off-by: Peter Amiri). The body explains the "why" and documents the reserved-scope test fix. No issues.

…gned-url binding

Addresses the review on #3248.

S3Disk treated a cfhttp connection failure as success. cfhttp does not set
throwOnError, so a DNS/connection error returns a non-numeric status (e.g.
"Connection Failure") whose Val() is 0, which the `< 300` guards read as OK:
put() reported a stored object that was never written, get() returned the
error body as the object, and exists()/delete() returned true. Centralize
status interpretation in $assertSuccess and throw Wheels.Storage.RequestFailed
on any non-2xx or unparseable status (put/get/delete). exists() now returns
true on 2xx, false on 404, and throws on a connection failure / 5xx rather
than reporting a definitive absence it cannot prove. $request preserves the
raw status line for diagnostics instead of collapsing it via ListFirst.

The wire path was built from the raw key while the signer signs the
RFC3986-encoded canonical path, so keys containing spaces / reserved
characters would fail with SignatureDoesNotMatch. Expose S3Signer.encodeKey()
and route $objectPath through it so the request and public URLs are
byte-identical to what was signed (mirrors the signer's canonicalUri for
path-style vs virtual-hosted).

LocalDisk.signedUrl() appended contentDisposition outside the HMAC, so a URL
holder could alter the served disposition; it is now bound into the signed
payload via $signaturePayload (an empty disposition reproduces the legacy
"key|expires" payload, so already-issued URLs still verify). Replaced the
mislabeled "constant-time-ish" CompareNoCase, which short-circuits on the
first differing character, with a length-then-XOR-accumulate $secureEquals
that does not leak the match length through timing.

Adds 7 specs to StorageSpec.cfc: four asserting put/get/exists/delete throw on
a refused connection (pointed at 127.0.0.1:1), one for public-url encoding,
one for disposition binding, and one wrong-length-token guard. Storage bundle
is 27/27 green on Lucee 7 and Adobe 2023.

Signed-off-by: Peter Amiri <petera@pai.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
wheels-bot[bot]
wheels-bot Bot previously approved these changes Jun 24, 2026

@wheels-bot wheels-bot Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wheels Bot — Reviewer

TL;DR — This is the re-review of the storage-disk abstraction PR (LocalDisk + S3Disk behind a 6-method interface) on the new head 6eec288. The fix commit fix(storage): harden S3 request failures, path encoding, and local signed-url binding resolves all three findings from the prior review — the blocking silent-data-loss bug plus the two non-blocking observations. The diff is now clean and well-tested (27/27 reported green on Lucee 7 + Adobe 2023). My earlier review left an active CHANGES_REQUESTED; with every blocking finding resolved I am submitting approve to supersede it (issue #3048). One trivial non-blocking observation remains. Verdict: approve.

Previously-blocking findings — now resolved

1. (was blocking) S3Disk treated a connection failure as success → silent data loss. Resolved. Status interpretation is centralized in $assertSuccess (S3Disk.cfc:119-127), which throws Wheels.Storage.RequestFailed on any code < 200 || code >= 300 — and crucially the < 200 arm catches the connection-failure path where Val("Connection Failure")/Val("") is 0. put() (:46), get() (:56), and delete() (:83) all route through it; delete() now returns a hard true only after the assert passes. exists() (:60-76) returns true on 2xx, false only on a real 404, and throws on a connection failure / 5xx instead of reporting a false absence. $request (:171-174) now preserves the raw status line rather than collapsing it via ListFirst. Four new specs (StorageSpec.cfc:228-271) point a disk at 127.0.0.1:1 and assert put/get/exists/delete each throw Wheels.Storage.RequestFailed on a refused connection — exactly the hole that was open before.

2. (was non-blocking) Wire path built from the raw key while the signer signed the encoded canonical path. Resolved. S3Signer.encodeKey() (S3Signer.cfc:184-186) exposes $uriEncodePath, and $objectPath (S3Disk.cfc:101-110) now routes both path-style and virtual-hosted URLs through it, mirroring the signer's canonicalUri (S3Signer.cfc:79-81, :136-138) byte-for-byte. New spec StorageSpec.cfc:203-211 asserts s3.url("my docs/q3 report.pdf") encodes to …/my%20docs/q3%20report.pdf.

3. (was non-blocking) signedUrl disposition not bound into the HMAC + non-constant-time compare. Resolved. $signaturePayload (LocalDisk.cfc:115-121) binds contentDisposition into the signed payload, with the empty case reproducing the legacy key|expires string so already-issued URLs still verify; verifySignature (:92-101) takes the disposition and recomputes against it. The mislabeled "constant-time-ish" CompareNoCase is replaced by $secureEquals (:130-141), a length-guard-then-XOR-accumulate compare over the fixed-width hex tokens. Two new specs (StorageSpec.cfc:150-169) cover disposition tampering and a wrong-length token rejecting without erroring.

Cross-engine

No issues. $secureEquals uses only Asc/Mid/BitXor/BitOr/LCase/Len and its for loop is in plain body scope (not a finally), so Invariant 12 does not apply. The SigV4 crypto, cfhttp one-header-at-a-time attachment (Invariant 10), and Left(amzDate, 8) on a fixed-width timestamp (Invariant 8 safe) are all unchanged and remain green.

Conventions

Non-blocking observation — S3Disk.cfc:102 vs S3Signer.cfc:137/:80 leading-slash handling. $objectPath strips leading slashes (REReplace(key, "^/+", "")) before encoding, but signedHeaders/presignGetUrl sign the raw key without stripping. For any normal key these are identical; only a key with a leading slash (e.g. "/foo.txt") would make the wire path (/foo.txt) diverge from the signed canonical path (//foo.txt) and draw a SignatureDoesNotMatch. Leading-slash keys are non-idiomatic, so this is a heads-up rather than a fix-now — normalizing the key once at the disk boundary (or in the signer) before both signing and URL-building would close it.

Tests

StorageSpec.cfc grows from 20 to 27 specs, all BDD under wheels.WheelsTest. The additions are well-targeted: they exercise precisely the three code paths that were wrong, using a hermetic refused-connection endpoint rather than a live S3, so they are fast and deterministic. The S3Disk happy-path network methods remain unmockable without an HTTP fixture (acceptable for a drivers-only Phase 1), but the previously-untested failure path is now covered.

Commits

Head commit fix(storage): harden S3 request failures, path encoding, and local signed-url binding is a valid conventional-commit header (fix type, storage scope, ~84 chars, sentence-case), matches the squash-landing intent, has a DCO sign-off (Signed-off-by: Peter Amiri), and the body explains the "why" for each of the three fixes. No issues.

…configurable

The storage PR adds vendor/wheels/interfaces/StorageDiskInterface.cfc — the
24th interface file — which tripped the structural guard in
InterfaceCompilationSpec ("finds exactly 23 interface files", now 24). Bump the
expected count and note that it tracks the wheels.interfaces.* contract set.
The full CI suite caught this; the storage-only bundle run and the diff-based
bot review did not, since the guard lives in tests/specs/interfaces.

Also make the S3 request timeout configurable (config.timeout, default 60s)
and use a short timeout in the connection-failure specs. In CI the get()
failure spec waited the full 60s cfhttp timeout (only GET hung; PUT/HEAD/DELETE
were refused instantly), adding a minute to every core run; capping it keeps
the suite fast without changing the production default.

Storage + interfaces bundles: 30/30 green on Lucee 7 and Adobe 2023.

Signed-off-by: Peter Amiri <petera@pai.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@wheels-bot wheels-bot Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wheels Bot — Reviewer

TL;DR — This is the re-review of the storage-disk abstraction PR (LocalDisk + S3Disk behind a 6-method interface, resolved by StorageManager, with a from-scratch SigV4 signer). The two follow-up commits on this head (6eec28885, ed7660e07) resolve every finding from the prior CHANGES_REQUESTED review — including the one blocking correctness issue (S3 silent data loss). My independent pass over the current head turns up no blocking Correctness / Cross-engine / Security finding and no commitlint / TDD violation, so I'm clearing the stale block. Verdict: approve.

Resolution audit (supersede of the prior CHANGES_REQUESTED)

Prior finding Status Evidence on this head
(blocking) S3Disk reads a connection-failure / unparseable status as success → silent data loss Resolved $assertSuccess (vendor/wheels/storage/drivers/S3Disk.cfc:120-128) throws Wheels.Storage.RequestFailed when `Val(statusCode) < 200
(non-blocking) raw-key wire path vs RFC3986-encoded signed path causing SignatureDoesNotMatch for keys with spaces Resolved $objectPath now encodes through variables.signer.encodeKey() (S3Disk.cfc:102-111), with encodeKey() exposing the signer's canonical path encoder (S3Signer.cfc:184-186). Spec "rfc3986-encodes spaces in the public url" (StorageSpec.cfc:203-210).
(non-blocking, security) CompareNoCase token check is not constant-time Resolved Replaced with length-then-XOR-accumulate $secureEquals (LocalDisk.cfc:130-141); wrong-length-token spec at StorageSpec.cfc:162-168.
(non-blocking, security) disposition not bound into the signed-URL HMAC Resolved $signaturePayload binds disposition into the signed payload (LocalDisk.cfc:115-121); the empty-disposition case reproduces the legacy `key

I re-verified the structural-guard fix too: ed7660e07 bumps InterfaceCompilationSpec "finds exactly 24 interface files" (:35-45) to match the added StorageDiskInterface.cfc — a recursive scan of vendor/wheels/interfaces returns 24, so the count is correct.

Cross-engine

No issues. SigV4 crypto uses javax.crypto.Mac with binary keys (built-in HMac() only takes string keys); $secureEquals's for (local.i = ...) loop (LocalDisk.cfc:137) is not inside a finally, so the Lucee-7 finally-loop trap (Invariant 12) does not apply; Left(local.amzDate, 8) operates on a fixed 16-char timestamp (Invariant 8 N/A); the cfhttp wrapper attaches headers one-by-one through a plain struct rather than attributeCollection=arguments (Invariant 10, S3Disk.cfc:152-167); driver private helpers are fine since these are standalone services, not $integrateComponents-mixed (Invariant 7). PR reports 30/30 green on Lucee 7 and Adobe 2023.

Tests

StorageSpec.cfc is now 27 specs: AWS-vector SigV4 presign, header signing, full LocalDisk lifecycle + traversal rejection + signed-URL round-trip/expiry/disposition-binding/wrong-length-token, S3Disk URL/encoding helpers, the four refused-connection failure specs, and StorageManager resolution/caching/error paths. Happy and error paths are both exercised for the unit-testable surface.

Conventions (non-blocking nits)

  • StorageManager.disk() reads and writes variables.resolved without a lock (StorageManager.cfc:46-73). The manager is documented as a singleton, but the disks are stateless and construction is idempotent, so a concurrent-init race only risks a discarded duplicate instance — benign. Worth a cflock if you want the cache strictly single-instance, but not required.
  • LocalDisk.$resolve rejects any key containing .. (LocalDisk.cfc:146), which also rejects otherwise-legal keys like report..final.pdf. Safe (false-reject, never false-accept) and fine for now; a path-normalization-then-prefix check would be more precise if real keys ever need consecutive dots.

Docs

Changelog fragment changelog.d/3157-storage-disk-abstraction.added.md present and correctly named — no direct CHANGELOG.md edit. No .ai/wheels/storage/ reference doc or user guide yet; reasonable to defer until the Phase 2 mixin/serving route makes this consumer-facing, as the prior review noted.

Commits

feat(storage): … and the two fix(storage): … follow-ups are all valid conventional-commit headers (allowed types, <= 100 chars, sentence-case), with DCO sign-off (Signed-off-by: Peter Amiri) and bodies that explain the "why."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feature: storage-disk abstraction (local + S3 adapters, signed URLs) — attachments phase 1

1 participant