[TIRX] Bind parallel loops to GPU threads before VerifyMemory#19363
Open
zhils wants to merge 4 commits into apache:main from
Conversation
`VerifyMemory` on GPU targets treats direct accesses outside thread environments as illegal. In the ScatterValue CUDA lowering path, `topi.scatter_elements` emits `ForKind::kParallel` loops without explicit thread bindings, which triggers false host-memory access failures (e.g. "Did you forget to bind?") during TIR verification. This change adds a new `tirx` pass (`BindParallelLoopsToThreads`) and inserts it before `VerifyMemory` in the `s_tir` pipelines (including adreno). The pass rewrites parallel loops into `blockIdx.x/threadIdx.x` thread-extent regions, substitutes loop vars with global thread indices, and adds bounds checks for non-divisible extents. This preserves correctness while ensuring GPU kernels pass memory verification for this path.
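The loop rewrite described above can be modeled with a small pure-Python simulation (hypothetical names; the actual pass operates on TIR in C++). It shows the essential index math: a flat parallel loop is split across `blockIdx.x`/`threadIdx.x`, the loop variable is replaced by the global thread index, and a bounds check guards the case where the extent does not divide evenly into the launch size.

```python
def bind_parallel_loop(extent, num_blocks, threads_per_block, body):
    """Illustrative model of rewriting `for i in parallel(extent)` into a
    blockIdx.x/threadIdx.x thread-extent region. Not the TIR pass itself."""
    for block_idx in range(num_blocks):                  # blockIdx.x
        for thread_idx in range(threads_per_block):      # threadIdx.x
            global_idx = block_idx * threads_per_block + thread_idx
            if global_idx < extent:  # bounds check for non-divisible extents
                body(global_idx)

# Usage: an extent of 10 with 4 threads per block needs 3 blocks;
# the guard skips the 2 out-of-range threads in the last block.
visited = []
bind_parallel_loop(10, 3, 4, visited.append)
```

The bounds check is what keeps the rewrite correct for arbitrary extents rather than only multiples of the block size.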
Code Review
This pull request introduces the BindParallelLoopsToThreads pass, which converts ForKind::kParallel loops into GPU block and thread bindings, and integrates this pass into the S-TIR pipelines. Additionally, it provides a configuration option to allow unsupported host compilers for NVCC on Windows and adds a functional test for scatter operations on CUDA. Review feedback identifies a critical issue regarding the handling of nested parallel loops which could lead to invalid GPU register bindings, an inconsistency in GPU device type definitions between files, and a minor code redundancy in the loop variable substitution logic.
Fix three correctness/configuration issues in the GPU parallel-loop binding path used before `VerifyMemory`:

- Preserve non-zero loop mins by mapping parallel indices as `min + global_idx` instead of `global_idx`.
- Avoid rewriting parallel loops when already inside a thread environment, to prevent invalid nested bindings.
- Register `cuda.nvcc_allow_unsupported_compiler` as a valid `PassContext` key so the NVCC workaround can be enabled via config without raising "Invalid config option".

Made-with: Cursor
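The first fix, preserving a non-zero loop min, amounts to the following index mapping (a pure-Python sketch with hypothetical names, not the pass's actual substitution code):

```python
def loop_index(loop_min, global_idx):
    """Map a flat GPU thread index back to the original loop's iteration
    variable. The loop may start at a non-zero min, so the correct
    mapping is min + global_idx; using global_idx alone would shift
    every access when min != 0."""
    return loop_min + global_idx

# A parallel loop over [2, 7) has min=2 and extent=5; its 5 threads
# must touch indices 2..6, not 0..4.
indices = [loop_index(2, g) for g in range(5)]
```

For the third fix, once the key is registered it can presumably be set like any other option, e.g. `tvm.transform.PassContext(config={"cuda.nvcc_allow_unsupported_compiler": True})` (assuming the standard TVM Python API).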
- Add `kDLWebGPU` to `IsGPUDevice` in `verify_memory.cc`
- Remove redundant `Var` wrapper in `loop_partition.cc`
- Fix nested parallel loop handling in `bind_parallel_loops_to_threads.cc`