fix(huggingface-transformers): align HFT_Device tests with auto-resolve semantics by sroussey · Pull Request #595 · workglow-dev/libs

sroussey · 2026-06-23T08:23:20Z

Problem

resolveHftPipelineDevice (in providers/huggingface-transformers/src/ai/common/HFT_Device.ts) deliberately normalizes server-side undefined / "wasm" / "webgpu" inputs to "auto" so transformers.js can delegate EP selection to onnxruntime-node. The accompanying unit tests in packages/test/src/test/ai-provider-hft/HFT_Device.test.ts shipped in 0.3.18/0.3.19 asserting toBeUndefined() for those same cases — the tests were carried over from the deleted HFT_Pipeline.ts and never updated. The test/source mismatch shipped as a latent bug.

Decision

Source is the spec; rewrite the tests. The downstream consumer, builder, flipped "webgpu" → "auto" in commit dd2acdc because the auto-resolve path is the intended runtime behavior. Since the source already matches that runtime intent, the source stays and the tests are corrected.

Files changed

packages/test/src/test/ai-provider-hft/HFT_Device.test.ts
- In "passes auto through on the server": resolveHftPipelineDevice(undefined) now asserts .toBe("auto") instead of .toBeUndefined().
- Renamed the second it("strips browser-only devices on the server", ...) to it("normalizes browser-only devices to auto on the server", ...), with both assertions changed to .toBe("auto").
- Browser block left untouched.
providers/huggingface-transformers/src/ai/common/HFT_Device.ts
- Extended the JSDoc above resolveHftPipelineDevice with a one-line note that on the server, missing/browser-only inputs normalize to "auto" (which transformers.js delegates to onnxruntime-node for EP auto-selection).
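The server-side behavior that the corrected tests pin down can be sketched as below. This is a minimal illustration, not the actual HFT_Device.ts source: `normalizeServerDevice` and `BROWSER_ONLY_DEVICES` are hypothetical names, and passing other values (for example "cpu") through unchanged is an assumption that the PR text does not confirm.

```typescript
// Hypothetical sketch; NOT the real resolveHftPipelineDevice implementation.
type DeviceInput = string | undefined;

// Devices that only make sense in a browser runtime (per the PR description).
const BROWSER_ONLY_DEVICES = new Set<string>(["wasm", "webgpu"]);

/**
 * On the server, missing or browser-only inputs normalize to "auto",
 * so that EP selection can be delegated downstream.
 * Assumption: any other value is returned unchanged.
 */
function normalizeServerDevice(device: DeviceInput): string {
  if (device === undefined || BROWSER_ONLY_DEVICES.has(device)) {
    return "auto";
  }
  return device;
}

// The expectations the rewritten tests assert for the server block.
const serverCases: Array<[DeviceInput, string]> = [
  [undefined, "auto"],
  ["wasm", "auto"],
  ["webgpu", "auto"],
  ["auto", "auto"],
];

for (const [input, expected] of serverCases) {
  const actual = normalizeServerDevice(input);
  if (actual !== expected) {
    throw new Error(`expected ${expected} for ${String(input)}, got ${actual}`);
  }
}
```

The old tests expected `undefined` for the first three rows; the table above is what `.toBe("auto")` now encodes.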

Verification

| Command | Result |
| --- | --- |
| `bunx vitest run packages/test/src/test/ai-provider-hft/HFT_Device.test.ts` | 3 passed (3) |
| `bun run build:types` (turbo, 36 packages) | 36 successful, 36 cached, FULL TURBO |

No source semantics changed; only the tests and JSDoc were corrected to reflect the existing runtime behavior already in use by the downstream builder consumer.

Generated by Claude Code

…ve semantics

The HFT_Device unit tests asserted that resolveHftPipelineDevice returns undefined on the server for missing/browser-only inputs, but the source deliberately normalizes those inputs to "auto" so transformers.js delegates EP selection to onnxruntime-node. The tests were carried over from the deleted HFT_Pipeline.ts and never updated; downstream builder/main (commit dd2acdc, "webgpu" -> "auto") confirms the auto-resolve path is the intended runtime behavior.

- Update HFT_Device.test.ts to expect "auto" for undefined / wasm / webgpu on the server, matching source.
- Rename the second case to "normalizes browser-only devices to auto on the server" to reflect the new assertions.
- Add a JSDoc note on resolveHftPipelineDevice explaining the server normalization to "auto" and the onnxruntime-node EP delegation.

Co-Authored-By: Claude <noreply@anthropic.com>

sroussey · 2026-06-23T17:34:01Z

CI status update (initial check, this PR):

Two job failures (test-vitest-ai-provider-hft, test-vitest-rag) appear to be out of scope for this PR. Both fail at the same boundary:

Error: OrtSessionOptionsAppendExecutionProvider_Cuda: Failed to load shared library
  at new OnnxruntimeSessionHandler (.../onnxruntime-node@1.24.3/dist/backend.js:50:92)

The failing tests are ZeroShotTasks.integration.test.ts:168/229/300 and RagWorkflow.integration.test.ts:185 — none of which this PR touches. Both jobs invoke HF_TRANSFORMERS_ONNX provider tasks that now flow through the new resolveHftPipelineDevice shipped in 0.3.18/0.3.19 (commit 7d1c6b2). With the new logic returning "auto" on the server for missing/wasm/webgpu, transformers.js asks onnxruntime-node to auto-pick an execution provider; on a CI runner without CUDA shared libraries the CUDA EP load fails before falling back to CPU.

This PR's scope was purely to align the HFT_Device unit tests with the shipped source semantics (and clarify the JSDoc). All HFT_Device.test.ts cases pass (3/3). The integration-test breakage appears to originate in main itself: the new "auto" path landed before this PR was opened. This has not been confirmed independently, because no full main CI run has executed against the post-7d1c6b2 tree (the run on 936c0e1 was cancelled).

Recommend handling the EP-selection issue in a separate fix on main. Options include:

- Detecting CUDA availability before returning "auto" in resolveHftPipelineDevice (return "cpu" as fallback when process.env.ORT_PROVIDERS doesn't include CUDAExecutionProvider).
- Wiring onnxruntime-node session options with an explicit EP allowlist that omits CUDA in CI.
- Conditionally registering a CPU-only EP in CI via env var.
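A rough sketch of the first two options follows, under stated assumptions. `hasCudaProvider`, `resolveServerDeviceWithFallback`, and `buildExecutionProviders` are hypothetical helpers, not existing functions in this repo. ORT_PROVIDERS is the variable named above; treating it as a comma-separated list is an assumption, and nothing here confirms that onnxruntime-node itself reads it. The EP list mirrors the shape of the `executionProviders` session option; wiring it through transformers.js is not shown.

```typescript
// Hypothetical sketch of a CUDA-aware fallback; not a proposed final fix.
type Env = Record<string, string | undefined>;

// Inputs that the current server logic resolves to "auto".
const AUTO_LIKE = new Set<string>(["auto", "wasm", "webgpu"]);

// Assumption: ORT_PROVIDERS holds a comma-separated list of EP names.
function hasCudaProvider(env: Env): boolean {
  const raw = env.ORT_PROVIDERS ?? "";
  return raw
    .split(",")
    .map((name) => name.trim())
    .includes("CUDAExecutionProvider");
}

// Option 1: only return "auto" when CUDA is declared available;
// otherwise fall back to "cpu" so no CUDA EP load is attempted.
function resolveServerDeviceWithFallback(
  device: string | undefined,
  env: Env,
): string {
  if (device !== undefined && !AUTO_LIKE.has(device)) {
    return device; // explicit, non-browser device: leave as requested
  }
  return hasCudaProvider(env) ? "auto" : "cpu";
}

// Option 2: build an explicit EP allowlist that omits CUDA when it is
// not declared, for use as the session's executionProviders value.
function buildExecutionProviders(env: Env): string[] {
  return hasCudaProvider(env) ? ["cuda", "cpu"] : ["cpu"];
}
```

On a CI runner with no ORT_PROVIDERS set, `resolveServerDeviceWithFallback(undefined, {})` yields "cpu" and `buildExecutionProviders({})` yields a CPU-only list. Taking `env` as a parameter rather than reading `process.env` directly keeps both helpers testable without mutating global state.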

The other PR jobs (build, typecheck-budget, test-vitest-unit, test-vitest-integration, test-vitest-ai-provider-api, test-vitest-ai-provider-nodellama, CodeQL, Analyze) all pass.

Generated by Claude Code

sroussey closed this Jun 24, 2026

sroussey deleted the claude/beautiful-mayer-y6cb4l branch June 24, 2026 04:21

sroussey wants to merge 1 commit into main from claude/beautiful-mayer-y6cb4l
