Fix server-side device resolution to avoid CUDA probing #597
Merged
…tead of "auto"

Passing device "auto" to onnxruntime-node makes it probe the CUDA execution provider, which throws "OrtSessionOptionsAppendExecutionProvider_Cuda: Failed to load shared library" on hosts without CUDA (CPU-only CI runners), failing every HFT integration test as a PermanentJobError.

Resolve "auto" (and the browser-only "wasm"/"webgpu") to undefined on the server so onnxruntime-node defaults to the CPU execution provider. Concrete server devices (cpu/gpu/metal) still pass through.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01N2vdcSPQJz63DPoiD9JSY2
sroussey added a commit that referenced this pull request Jun 24, 2026
## @workglow/task-graph

### Features

#### task-graph

- bubble subgraph events for While and Fallback (data) groups
- emit and bubble per-task task_progress events
- bubble subgraph events from streaming groups too
- bubble subgraph task events up to the parent graph
- emit graph-level task_complete event per finished task

### Bug Fixes

#### task-graph

- only emit task_progress while a task is actively running
- tear down subgraph event bridges in finally
- harden task_complete emit against throwing listeners

## @workglow/test

### Features

#### task-graph

- bubble subgraph events for While and Fallback (data) groups
- emit and bubble per-task task_progress events
- bubble subgraph events from streaming groups too
- bubble subgraph task events up to the parent graph
- emit graph-level task_complete event per finished task

### Bug Fixes

#### task-graph

- only emit task_progress while a task is actively running
- harden task_complete emit against throwing listeners

#### huggingface-transformers

- resolve server device to undefined instead of "auto" (#597)

### Chores

- update deps

### Updated Dependencies

- `@aws-sdk/client-sqs`: ^3.1075.0
- `@cloudflare/workers-types`: ^4.20260624.1
- `miniflare`: ^4.20260623.0

## @workglow/aws

### Chores

- update deps

### Updated Dependencies

- `@aws-sdk/client-sqs`: ^3.1075.0

## @workglow/cloudflare

### Chores

- update deps

### Updated Dependencies

- `@cloudflare/workers-types`: ^4.20260624.1

## @workglow/huggingface-transformers

### Bug Fixes

#### huggingface-transformers

- resolve server device to undefined instead of "auto" (#597)

## @workglow/web

### Chores

- update deps

### Updated Dependencies

- `@vitejs/plugin-react`: ^6.0.3
- `vite`: ^8.1.0
Summary
Changes the server-side device resolution logic in `resolveHftPipelineDevice()` to return `undefined` instead of `"auto"` for unspecified or browser-only device types. This allows onnxruntime-node to default to the CPU execution provider instead of probing for CUDA, which fails on CPU-only environments (e.g., CI runners without CUDA libraries).

Key Changes

- Changed `resolveHftPipelineDevice()` to return `string | undefined` instead of `string`
- It now returns `undefined` for:
  - `undefined` input (was `"auto"`)
  - `"auto"` input (was `"auto"`)
  - `"wasm"` and `"webgpu"` (browser-only, now stripped on server)
- Tests now expect an `undefined` return value for `"auto"` on the server

Implementation Details

The change addresses a runtime issue where onnxruntime-node would attempt CUDA detection when given `"auto"`, causing failures in environments without CUDA libraries installed. By returning `undefined`, the library uses its default CPU execution provider without probing, which is the appropriate behavior for server environments that don't explicitly request GPU acceleration.

https://claude.ai/code/session_01N2vdcSPQJz63DPoiD9JSY2