fix(huggingface-transformers): align HFT_Device tests with auto-resolve semantics#595

Closed
sroussey wants to merge 1 commit into main from claude/beautiful-mayer-y6cb4l

@sroussey

Collaborator

Problem

resolveHftPipelineDevice (in providers/huggingface-transformers/src/ai/common/HFT_Device.ts) deliberately normalizes server-side undefined / "wasm" / "webgpu" inputs to "auto" so transformers.js can delegate EP selection to onnxruntime-node. The accompanying unit tests in packages/test/src/test/ai-provider-hft/HFT_Device.test.ts shipped in 0.3.18/0.3.19 asserting toBeUndefined() for those same cases — the tests were carried over from the deleted HFT_Pipeline.ts and never updated. The test/source mismatch shipped as a latent bug.
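The server-side normalization described above can be sketched roughly as follows. This is an illustration only, not the code in HFT_Device.ts: the name `sketchResolveServerDevice` is hypothetical, the real function's signature is not shown in this PR, and passing other explicit devices through unchanged is an assumption.

```typescript
// Hypothetical sketch of the server-side behavior described in this PR.
// The real resolveHftPipelineDevice may have a different signature and
// additional browser-side logic that is not modeled here.
type DeviceHint = string | undefined;

// Devices that only make sense in a browser runtime.
const BROWSER_ONLY_DEVICES = new Set<string>(["wasm", "webgpu"]);

function sketchResolveServerDevice(device: DeviceHint): string {
  // Missing or browser-only inputs normalize to "auto", leaving
  // execution-provider selection to the runtime underneath.
  if (device === undefined || BROWSER_ONLY_DEVICES.has(device)) {
    return "auto";
  }
  // Assumption: any other explicit device is passed through unchanged.
  return device;
}
```

Under this sketch, `undefined`, `"wasm"`, and `"webgpu"` all yield `"auto"`, which is the behavior the old `toBeUndefined()` assertions contradicted.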

Decision

Source is the spec; rewrite the tests. The downstream consumer, builder, flipped "webgpu" → "auto" in commit dd2acdc because the auto-resolve path is the intended runtime behavior. The source matches the downstream consumer's runtime intent, so the source stays and the tests are corrected.

Files changed

  • packages/test/src/test/ai-provider-hft/HFT_Device.test.ts
    • In "passes auto through on the server": resolveHftPipelineDevice(undefined) now asserts .toBe("auto") instead of .toBeUndefined().
    • Renamed the second it("strips browser-only devices on the server", ...) to it("normalizes browser-only devices to auto on the server", ...), with both assertions changed to .toBe("auto").
    • Browser block left untouched.
  • providers/huggingface-transformers/src/ai/common/HFT_Device.ts
    • Extended the JSDoc above resolveHftPipelineDevice with a one-line note that on the server, missing/browser-only inputs normalize to "auto" (which transformers.js delegates to onnxruntime-node for EP auto-selection).

Verification

| Command | Result |
| --- | --- |
| bunx vitest run packages/test/src/test/ai-provider-hft/HFT_Device.test.ts | 3 passed (3) |
| bun run build:types (turbo, 36 packages) | 36 successful, 36 cached, FULL TURBO |

No source semantics changed; only the tests and JSDoc were corrected to reflect the existing runtime behavior already in use by the downstream builder consumer.


Generated by Claude Code

…ve semantics

The HFT_Device unit tests asserted that resolveHftPipelineDevice returns
undefined on the server for missing/browser-only inputs, but the source
deliberately normalizes those inputs to "auto" so transformers.js delegates
EP selection to onnxruntime-node. The tests were carried over from the
deleted HFT_Pipeline.ts and never updated; downstream builder/main
(commit dd2acdc, "webgpu" -> "auto") confirms the auto-resolve path is
the intended runtime behavior.

- Update HFT_Device.test.ts to expect "auto" for undefined / wasm / webgpu
  on the server, matching source.
- Rename the second case to "normalizes browser-only devices to auto on
  the server" to reflect the new assertions.
- Add a JSDoc note on resolveHftPipelineDevice explaining the server
  normalization to "auto" and the onnxruntime-node EP delegation.

Co-Authored-By: Claude <noreply@anthropic.com>

Collaborator Author

CI status update (initial check, this PR):

Two job failures (test-vitest-ai-provider-hft, test-vitest-rag) appear to be out of scope for this PR. Both fail at the same boundary:

Error: OrtSessionOptionsAppendExecutionProvider_Cuda: Failed to load shared library
  at new OnnxruntimeSessionHandler (.../onnxruntime-node@1.24.3/dist/backend.js:50:92)

The failing tests are ZeroShotTasks.integration.test.ts:168/229/300 and RagWorkflow.integration.test.ts:185 — none of which this PR touches. Both jobs invoke HF_TRANSFORMERS_ONNX provider tasks that now flow through the new resolveHftPipelineDevice shipped in 0.3.18/0.3.19 (commit 7d1c6b2). With the new logic returning "auto" on the server for missing/wasm/webgpu, transformers.js asks onnxruntime-node to auto-pick an execution provider; on a CI runner without CUDA shared libraries the CUDA EP load fails before falling back to CPU.

This PR's scope was purely to align the HFT_Device unit tests with the shipped source semantics (and clarify the JSDoc). All HFT_Device.test.ts cases pass (3/3). The integration-test breakage appears to originate in main itself: the new "auto" path landed before this PR was opened. No full main CI run has executed against the post-7d1c6b2 tree to confirm this independently; the run on 936c0e1 was cancelled.

Recommend handling the EP-selection issue in a separate fix on main — options include:

  1. Detecting CUDA availability before returning "auto" in resolveHftPipelineDevice (return "cpu" as fallback when process.env.ORT_PROVIDERS doesn't include CUDAExecutionProvider).
  2. Wiring onnxruntime-node session options with an explicit EP allowlist that omits CUDA in CI.
  3. Conditionally registering a CPU-only EP in CI via env var.
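As a rough illustration of option 1, the check could look something like the sketch below. It is hypothetical: `serverDeviceFor` is an invented name, `ORT_PROVIDERS` is the variable proposed in this comment rather than a setting known to be read by onnxruntime-node, and treating it as a comma-separated list of provider names is an assumption.

```typescript
// Hypothetical helper illustrating option 1; not the project's actual code.
// The environment is passed in as a plain object so the logic is easy to test.
type Env = Record<string, string | undefined>;

function serverDeviceFor(env: Env): "auto" | "cpu" {
  // Assumption: ORT_PROVIDERS holds a comma-separated list of
  // execution-provider names, e.g. "CUDAExecutionProvider,CPUExecutionProvider".
  const providers = (env.ORT_PROVIDERS ?? "")
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);

  // Only hand selection to the runtime when CUDA is declared available;
  // otherwise pin to CPU so no CUDA shared library load is attempted.
  return providers.includes("CUDAExecutionProvider") ? "auto" : "cpu";
}
```

On a CI runner where the variable is unset, this returns `"cpu"`. The trade-off is that hosts with CUDA would need to declare it explicitly to keep the auto-selection path.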

The other PR jobs (build, typecheck-budget, test-vitest-unit, test-vitest-integration, test-vitest-ai-provider-api, test-vitest-ai-provider-nodellama, CodeQL, Analyze) all pass.


Generated by Claude Code

@sroussey sroussey closed this Jun 24, 2026
@sroussey sroussey deleted the claude/beautiful-mayer-y6cb4l branch June 24, 2026 04:21