Fix CUDA build with contrib ops disabled #28554

Draft: Copilot wants to merge 2 commits into main from copilot/fix-onnxruntime-build-cuda

Copilot AI commented May 19, 2026

Description

The CUDA Attention kernel (core/providers/cuda/llm/attention.cc) depends on contrib_ops internals (flash attention, memory efficient attention, unfused attention helpers) but was compiled unconditionally. When building with --disable_contrib_ops, GetAttentionKernelOptions() is unavailable (guarded by #ifndef DISABLE_CONTRIB_OPS in cuda_kernel.h), causing a compile error.
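The failure mode described above can be sketched in isolation. This is a simplified illustration, not the actual onnxruntime source: `CudaKernelSketch`, `AttentionSketch`, and the `use_flash_attention` field are hypothetical stand-ins. Only the `#ifndef DISABLE_CONTRIB_OPS` guard and the `GetAttentionKernelOptions` name come from this PR.

```cpp
// Hypothetical options struct; the real one lives in contrib_ops.
struct AttentionKernelOptions {
  bool use_flash_attention = true;
};

// Hypothetical stand-in for the CUDA kernel base class. The accessor exists
// only when contrib ops are enabled, mirroring the guard in cuda_kernel.h.
class CudaKernelSketch {
 public:
#ifndef DISABLE_CONTRIB_OPS
  const AttentionKernelOptions* GetAttentionKernelOptions() const {
    return &options_;
  }

 private:
  AttentionKernelOptions options_;
#endif
};

// Hypothetical stand-in for the Attention kernel. It calls the guarded
// accessor unconditionally, so it only compiles when the accessor exists.
template <typename T>
class AttentionSketch : public CudaKernelSketch {
 public:
  bool UsesFlashAttention() const {
    return this->GetAttentionKernelOptions()->use_flash_attention;
  }
};
```

Compiling this sketch with `-DDISABLE_CONTRIB_OPS` should fail at the call inside `UsesFlashAttention()`, because the guarded member no longer exists. That mirrors the error reported for `Attention<float>`.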

Changes:

  • cmake/onnxruntime_providers_cuda.cmake — Exclude attention.h/attention.cc from the CUDA provider source list when contrib ops are disabled
  • cuda_execution_provider.cc — Guard Attention kernel forward declarations and BuildKernelCreateInfo registrations (opset 23 and 24) with #ifndef DISABLE_CONTRIB_OPS

The CPU EP still provides the ONNX domain Attention kernel as fallback.
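The registration guard can be illustrated with a small self-contained sketch. `KernelEntry`, `BuildKernelTable`, and `HasKernel` are hypothetical names introduced for illustration. The real code guards `BuildKernelCreateInfo<...>` entries inside `RegisterCudaKernels()`, and the exact placement of the guards there is not reproduced here.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one kernel registration entry.
struct KernelEntry {
  std::string op_name;
  int since_version;
};

// Builds the kernel table. The Attention entries are compiled in only when
// DISABLE_CONTRIB_OPS is not defined, so a build with contrib ops disabled
// never references the excluded attention.h/attention.cc sources.
inline std::vector<KernelEntry> BuildKernelTable() {
  std::vector<KernelEntry> table = {
      {"Sin", 22},
      {"Unsqueeze", 23},
#ifndef DISABLE_CONTRIB_OPS
      {"Attention", 23},
      {"Attention", 24},
#endif
  };
  return table;
}

// Returns true if any entry in the table has the given operator name.
inline bool HasKernel(const std::vector<KernelEntry>& table,
                      const std::string& op_name) {
  for (const auto& entry : table) {
    if (entry.op_name == op_name) {
      return true;
    }
  }
  return false;
}
```

In a default build, `HasKernel(BuildKernelTable(), "Attention")` is true. With `-DDISABLE_CONTRIB_OPS` the two Attention entries are dropped and the lookup returns false, which is the point at which a runtime would fall back to another provider's kernel.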

Motivation and Context

Building onnxruntime with CUDA enabled and --disable_contrib_ops fails:

error C2039: 'GetAttentionKernelOptions': is not a member of 'onnxruntime::cuda::Attention<float>'

This is a valid build configuration (useful for reducing compile time) that should be supported.

The CUDA Attention kernel implementation (core/providers/cuda/llm/attention.cc)
depends on contrib ops (flash attention, memory efficient attention, unfused
attention helpers from contrib_ops/cuda/bert/). When DISABLE_CONTRIB_OPS is
defined, these dependencies are unavailable, causing compilation failures.

Fix by:
1. Excluding attention.h/attention.cc from the CUDA provider build when
   contrib ops are disabled (cmake change).
2. Guarding the Attention kernel class declarations and registrations in
   cuda_execution_provider.cc with #ifndef DISABLE_CONTRIB_OPS.

The CPU EP still provides the standard ONNX domain Attention kernel as
fallback when the CUDA implementation is unavailable.

Agent-Logs-Url: https://github.com/microsoft/onnxruntime/sessions/4bbef367-4e58-49e5-9bca-8d5a2c8ee872

Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix onnxruntime build with CUDA enabled and contrib ops disabled" to "Fix CUDA build with contrib ops disabled" May 19, 2026
Copilot AI requested a review from tianleiwu May 19, 2026 03:59
github-actions bot left a comment

You can commit the suggested changes from lintrunner.

@@ -3083,9 +3089,11 @@ static Status RegisterCudaKernels(KernelRegistry& kernel_registry) {
BuildKernelCreateInfo<ONNX_OPERATOR_VERSIONED_KERNEL_CLASS_NAME(kCudaExecutionProvider, kOnnxDomain, 23, 23, Unsqueeze)>,

// Opset 24

Suggested change
// Opset 24
// Opset 24

@@ -3005,9 +3009,11 @@ static Status RegisterCudaKernels(KernelRegistry& kernel_registry) {
BuildKernelCreateInfo<ONNX_OPERATOR_TYPED_KERNEL_CLASS_NAME(kCudaExecutionProvider, kOnnxDomain, 22, BFloat16, Sin)>,

// Opset 23

Suggested change
// Opset 23
// Opset 23

Development

Successfully merging this pull request may close these issues.

[Build] Cannot build onnxruntime with CUDA enabled and contrib ops disabled

2 participants