Fix server-side device resolution to avoid CUDA probing #597

Merged
sroussey merged 1 commit into main from claude/trusting-shannon-pqyk7n on Jun 23, 2026

@sroussey

Collaborator

Summary

Changes the server-side device resolution logic in resolveHftPipelineDevice() to return undefined instead of "auto" for unspecified or browser-only device types. This allows onnxruntime-node to default to the CPU execution provider instead of probing for CUDA, which fails on CPU-only environments (e.g., CI runners without CUDA libraries).

Key Changes

  • Return type: Updated resolveHftPipelineDevice() to return string | undefined instead of string
  • Server-side logic: Changed to return undefined for:
    • undefined input (was "auto")
    • "auto" input (was "auto")
    • "wasm" and "webgpu" (browser-only, now stripped on server)
  • Pass-through behavior: Any other device string (for example "cpu", "gpu", or "metal") is returned unchanged
  • Test updates: Updated test expectations to reflect the new undefined return value for "auto" on the server
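
The rules listed above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of HFT_Device.ts: the function name `resolveDeviceSketch`, the explicit `isServer` parameter, and the browser branch are all assumptions made for the example, since the PR describes only the server-side behavior.

```typescript
// Hypothetical sketch of the server-side rules described in this PR.
// The real resolveHftPipelineDevice() may detect the environment differently.
const BROWSER_ONLY_DEVICES = new Set<string>(["wasm", "webgpu"]);

function resolveDeviceSketch(
  device: string | undefined,
  isServer: boolean
): string | undefined {
  if (!isServer) {
    // Assumption: browser behavior is untouched by this PR and still
    // falls back to "auto" when no device is given.
    return device ?? "auto";
  }
  // Server: unspecified, "auto", and browser-only devices resolve to undefined
  // so the runtime can pick its own default instead of probing for CUDA.
  if (
    device === undefined ||
    device === "auto" ||
    BROWSER_ONLY_DEVICES.has(device)
  ) {
    return undefined;
  }
  // Concrete server devices such as "cpu", "gpu", or "metal" pass through.
  return device;
}
```

Under these assumptions, `resolveDeviceSketch("auto", true)` yields `undefined`, while `resolveDeviceSketch("cpu", true)` yields `"cpu"`.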

Implementation Details

The change addresses a runtime issue where onnxruntime-node would attempt CUDA detection when given "auto", causing failures in environments without CUDA libraries installed. By returning undefined, the library uses its default CPU execution provider without probing, which is the appropriate behavior for server environments that don't explicitly request GPU acceleration.
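
One way a caller might consume the `string | undefined` result is to leave the `device` key out of the options object entirely when no device was resolved. The sketch below assumes a hypothetical `PipelineOptionsSketch` shape and `buildPipelineOptions` helper; the PR does not say whether the real code omits the key or passes `device: undefined` through.

```typescript
// Hypothetical options shape; not the library's actual pipeline options type.
interface PipelineOptionsSketch {
  dtype?: string;
  device?: string;
}

// Builds an options object that only contains keys with defined values,
// so an unresolved device leaves the runtime free to use its default.
function buildPipelineOptions(
  device: string | undefined,
  dtype?: string
): PipelineOptionsSketch {
  return {
    ...(dtype !== undefined ? { dtype } : {}),
    ...(device !== undefined ? { device } : {}),
  };
}
```

Omitting the key, rather than setting it to `undefined`, keeps the object clean for code that checks `"device" in options`; either approach would be consistent with the behavior described above.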

https://claude.ai/code/session_01N2vdcSPQJz63DPoiD9JSY2

…tead of "auto"

Passing device "auto" to onnxruntime-node makes it probe the CUDA execution
provider, which throws "OrtSessionOptionsAppendExecutionProvider_Cuda: Failed to
load shared library" on hosts without CUDA (CPU-only CI runners), failing every
HFT integration test as a PermanentJobError.

Resolve "auto" (and the browser-only "wasm"/"webgpu") to undefined on the server
so onnxruntime-node defaults to the CPU execution provider. Concrete server
devices (cpu/gpu/metal) still pass through.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01N2vdcSPQJz63DPoiD9JSY2
@github-actions

Coverage Report

| Status | Category | Percentage | Covered / Total |
|---|---|---|---|
| 🔵 | Lines | 62.45% | 25214 / 40372 |
| 🔵 | Statements | 62.3% | 26094 / 41883 |
| 🔵 | Functions | 63.69% | 4788 / 7517 |
| 🔵 | Branches | 51.13% | 12334 / 24119 |

File Coverage

| File | Stmts | Branches | Functions | Lines | Uncovered Lines |
|---|---|---|---|---|---|
| Changed Files | | | | | |
| providers/huggingface-transformers/src/ai/common/HFT_Device.ts | 0% | 0% | 0% | 0% | 9-35 |

Generated in workflow #2610 for commit b9eb099 by the Vitest Coverage Report Action

@sroussey merged commit 4b148a4 into main on Jun 23, 2026
14 checks passed
sroussey added a commit that referenced this pull request on Jun 24, 2026

## @workglow/task-graph

### Features

#### task-graph

- bubble subgraph events for While and Fallback (data) groups
- emit and bubble per-task task_progress events
- bubble subgraph events from streaming groups too
- bubble subgraph task events up to the parent graph
- emit graph-level task_complete event per finished task

### Bug Fixes

#### task-graph

- only emit task_progress while a task is actively running
- tear down subgraph event bridges in finally
- harden task_complete emit against throwing listeners

## @workglow/test

### Features

#### task-graph

- bubble subgraph events for While and Fallback (data) groups
- emit and bubble per-task task_progress events
- bubble subgraph events from streaming groups too
- bubble subgraph task events up to the parent graph
- emit graph-level task_complete event per finished task

### Bug Fixes

#### task-graph

- only emit task_progress while a task is actively running
- harden task_complete emit against throwing listeners

#### huggingface-transformers

- resolve server device to undefined instead of "auto" (#597)

### Chores

- update deps

### Updated Dependencies

- `@aws-sdk/client-sqs`: ^3.1075.0
- `@cloudflare/workers-types`: ^4.20260624.1
- `miniflare`: ^4.20260623.0

## @workglow/aws

### Chores

- update deps

### Updated Dependencies

- `@aws-sdk/client-sqs`: ^3.1075.0

## @workglow/cloudflare

### Chores

- update deps

### Updated Dependencies

- `@cloudflare/workers-types`: ^4.20260624.1

## @workglow/huggingface-transformers

### Bug Fixes

#### huggingface-transformers

- resolve server device to undefined instead of "auto" (#597)

## @workglow/web

### Chores

- update deps

### Updated Dependencies

- `@vitejs/plugin-react`: ^6.0.3
- `vite`: ^8.1.0