Signal when SPIR-V legalization needs targeted cleanup by AnastaZIuk · Pull Request #8283 · microsoft/DirectXShaderCompiler

AnastaZIuk · 2026-03-20T18:12:56Z

Summary

add dedicated SpirvEmitter bits for legalization-time loop unroll and legalization-time SSA rewrite
set those bits only for the lowering paths that still need those transforms for correctness
pass those signals into the narrowed SPIR-V legalization overload
update external/SPIRV-Headers and external/SPIRV-Tools submodule pointers to the matching branch state
adapt DXC to the current SPIRV-Tools legalization API and the current MSVC getFunctionType signature
fix vector splat of spec constants so specialization-constant composites emit the legal instruction form
update 7 CodeGenSPIRV checks to match the accepted SPIR-V shape

Root cause

The expensive blanket policy lives in the SPIR-V optimizer recipe, not in HLSL parsing or the DXC CLI front-end.

DXC is still the producer that knows when a specific lowering path genuinely needs those expensive legalization transforms for correctness, so the optimizer should not have to guess that globally.

Two confirmed correctness-sensitive cases are:

variable image sample offsets, where a producer-side signal is still needed for lowerings that must end in constant-image-operand forms such as ConstOffset
combined image sampler conversion, which does not need loop unroll but still needs legalize-time SSA rewrite cleanup to avoid invalid SPIR-V such as storing sampled image types

This branch also fixes one separate producer-side correctness bug: vector splat of bool spec constants was being emitted as OpConstantComposite instead of the legal OpSpecConstantComposite.

The same narrowing also exposed one existing image-atomic cleanup dependency. The matching LocalSingleStoreElim follow-up lives in the companion KhronosGroup/SPIRV-Tools#6612 branch and is rolled here through the updated submodule pointer.

Spec basis

The hard legality rules here are much narrower than the old blanket cleanup.

Image, sampler, and sampled image objects must not appear as operands to OpPhi instructions, or OpSelect instructions, or any instructions other than the image or sampler instructions specified to operate on them.

All OpSampledImage instructions must be in the same block in which their Result <id> are consumed.

Spec:

https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_universal_validation_rules

The involved types are explicitly opaque:

This type is opaque: values of this type have no defined physical size or bit pattern.

Spec:

For the image-offset case, the relevant SPIR-V rule is also explicit:

ConstOffset ... It must be an <id> of a constant instruction

Spec:

https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Image_Operands

For the retained-loop case, OpSelect is also explicitly allowed on pointer-typed objects:

Before version 1.4, Result Type must be a pointer, scalar, or vector.

The types of Object 1 and Object 2 must be the same as Result Type.

Spec:

https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpSelect

For the image-atomic follow-up, OpImageTexelPointer is also explicit:

Image must have a type of OpTypePointer with Type OpTypeImage.

Spec:

https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpImageTexelPointer

For the spec-constant splat fix, the composite rules are also explicit:

The Constituents must all be <id>s of non-specialization constant-instruction declarations or an OpUndef.

The Constituents must be the <id> of other specialization constants, constant declarations, or an OpUndef.

Spec:

Vulkan tightens this further:

Objects of types OpTypeImage, OpTypeSampler, OpTypeSampledImage ... must not be stored to or modified

If the DescriptorHeapEXT capability is not declared, structure types must not contain opaque types

Spec:

https://docs.vulkan.org/refpages/latest/refpages/source/StandaloneSpirv.html

Benchmark

reproducer: godbolt.org/z/o5xf1hq36
benchmark: 19.161 s -> 3.02282 s (median of 3 runs)

Validation

fresh full local CodeGenSPIRV on this branch pair passes with 1438 expected passes, 2 expected failures, and 0 unexpected failures

Updated test notes

vk.spec-constant.reuse.hlsl: the old check expected OpConstantComposite from a bool spec-constant splat. That shape is invalid. This branch now emits the legal OpSpecConstantComposite.
15-loop-var-unroll-ok.hlsl: the old check implicitly depended on eager optimizer-side full unroll. This branch keeps LoopControl::Unroll in the IR and uses a legal pointer OpSelect path instead.
cast.flat-conversion.matrix.hlsl: the old check assumed the matrix-to-struct path stayed as pre-folded constant composites. This branch rebuilds the target values from the loaded matrix via OpCompositeExtract and OpCompositeConstruct. The stored value is the same and the row-major or column-major decoration still carries the memory layout rule.
type.rwstructured-buffer.array.counter.indirect.hlsl and type.rwstructured-buffer.array.counter.indirect2.hlsl: the old checks assumed a single flattened access chain directly to the counter or data element. This branch keeps an intermediate structured access chain and then indexes the member. The resource, binding, atomic, and data access semantics are unchanged.
vk.binding.global-struct-of-resources.optimized.hlsl: the old regex overconstrained generated ids. This branch preserves the same resource names and bindings but uses different legal ids, so the test now checks the stable semantic parts instead.
max_id.hlsl: the old check overconstrained an exact Bound value. This branch still exercises the same larger-limit success path, but the exact bound is no longer stable after the accepted optimizer recipe changes.

Companion SPIR-V optimizer PR:
KhronosGroup/SPIRV-Tools#6612

github-actions · 2026-03-20T18:18:01Z

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:

git-clang-format --diff 54afd2c6a5f1a5bcbef66967515fa69f5a3650fd 9ac57982725c4264348b20ad3fb271acc0993754 -- tools/clang/lib/Frontend/Rewrite/RewriteObjC.cpp tools/clang/lib/SPIRV/SpirvEmitter.cpp tools/clang/lib/SPIRV/SpirvEmitter.h

View the diff from clang-format here.

diff --git a/tools/clang/lib/SPIRV/SpirvEmitter.cpp b/tools/clang/lib/SPIRV/SpirvEmitter.cpp
index 4a616a00..0920fc1c 100644
--- a/tools/clang/lib/SPIRV/SpirvEmitter.cpp
+++ b/tools/clang/lib/SPIRV/SpirvEmitter.cpp
@@ -593,8 +593,7 @@ SpirvEmitter::SpirvEmitter(CompilerInstance &ci)
       constEvaluator(astContext, spvBuilder), entryFunction(nullptr),
       curFunction(nullptr), curThis(nullptr), seenPushConstantAt(),
       isSpecConstantMode(false), needsLegalization(false),
-      needsLegalizationLoopUnroll(false),
-      needsLegalizationSsaRewrite(false),
+      needsLegalizationLoopUnroll(false), needsLegalizationSsaRewrite(false),
       beforeHlslLegalization(false), mainSourceFile(nullptr) {
 
   // Get ShaderModel from command line hlsl profile option.
@@ -957,8 +956,7 @@ void SpirvEmitter::HandleTranslationUnit(ASTContext &context) {
       !dsetbindingsToCombineImageSampler.empty() ||
       spirvOptions.signaturePacking;
   needsLegalizationSsaRewrite =
-      needsLegalizationSsaRewrite ||
-      !dsetbindingsToCombineImageSampler.empty();
+      needsLegalizationSsaRewrite || !dsetbindingsToCombineImageSampler.empty();
 
   // Run legalization passes
   if (spirvOptions.codeGenHighLevel) {
@@ -16679,9 +16677,9 @@ bool SpirvEmitter::spirvToolsLegalize(std::vector<uint32_t> *mod,
   } else if (needsLegalizationSsaRewrite) {
     legalizationSsaRewriteMode = spvtools::SSARewriteMode::OpaqueOnly;
   }
-  optimizer.RegisterLegalizationPasses(
-      spirvOptions.preserveInterface, needsLegalizationLoopUnroll,
-      legalizationSsaRewriteMode);
+  optimizer.RegisterLegalizationPasses(spirvOptions.preserveInterface,
+                                       needsLegalizationLoopUnroll,
+                                       legalizationSsaRewriteMode);
   // Add flattening of resources if needed.
   if (spirvOptions.flattenResourceArrays) {
     optimizer.RegisterPass(

Check this box to apply formatting changes to this branch.

s-perron

I don't think we will be able to accept these changes. I understand the place this is coming from. Optimization take time, and if we do lots of unrolling it creates lots of extra code that needs to be optimized.

The problem is that we have to generate legal code, and there are cases where these optimizations have proven necessary.

As a simple example, this PR changes the expected behaviour of on of our example from the SPIR-V cookbook.

We can try to work on alternatives, but they have to be done in a way that we can still legalize as many cases as we currently do. If not, we will break someone.

Things that cannot happen:

The front-end is not the place to determine if an optimization can be skipped. It has too little information and cannot account for how other optimizations will change things. For example, can we correctly legalize the code if we limit ssa rewrite to opaque objects only? Not very likely. That pass is needed to enable all sort of things like scalar replacement. It might work in your code, but I can easily come up with cases where it will not.
Unrolling can significantly impact runtime performance, and the impact is often unpredictable. There are some drivers that I have see that do not do much unrolling in their drivers, and having spir-v do the unrolling is important for runtime performance. We cannot change the unrolling pass in the performance passes.

Potential ideas:

We could modify the unrolling in the legalization passes to look for illegal code in the loop. This could leave many of the loops in your code.
You can disable the performance passs, or replace them with your own set of passes to run for the optimization phase. See here.
If you want even more control, you can dump the code before legalization (-fcgl), and call the optimizer yourself with the passes you want.

These are ideas for limiting the unrolling. I don't think there is a way to limit the ssa rewrite pass the way you want. However, I don't think that is a good idea. It could impact too many other passes.

On a side note, do you see any changes in the time it takes drivers to compile the shaders/pipelines if you skip all of the unrolling in sirv-opt?

s-perron · 2026-03-26T15:55:08Z

tools/clang/test/CodeGenSPIRV/legal-examples/15-loop-var-unroll-ok.hlsl

-// CHECK-NEXT: [[ptr:%[0-9]+]] = OpAccessChain %_ptr_Uniform_S %gRWSBuffer
-// CHECK-NEXT:                OpStore [[ptr]] [[val]]
+// CHECK: OpLoopMerge {{%[0-9]+}} {{%[0-9]+}} Unroll
+// CHECK: [[which:%[0-9]+]] = OpSelect %_ptr_Uniform_type_StructuredBuffer_S {{%[0-9]+}} %gSBuffer1 %gSBuffer2


This is not legal. We have to unroll in this case.

yep, it would only be legal with SPV_KHR_variable_pointers if the gSBuffer pointers were of Shader Storage Class.

However I could go out on a limb and claim that it was never supposed to be legal without [[unroll]] on the loop ( I know the example has unroll so in this case SPIR-V codegen shouldn't change)

devshgraphicsprogramming · 2026-03-26T18:43:39Z

We can try to work on alternatives, but they have to be done in a way that we can still legalize as many cases as we currently do. If not, we will break someone.

Yes my first suggestion was probably do it as a flag like the #7996 PR which also changes previous behaviour, but takes care to avoid changing the old default behaviour.

Unrolling can significantly impact runtime performance, and the impact is often unpredictable. There are some drivers that I have see that do not do much unrolling in their drivers, and having spir-v do the unrolling is important for runtime performance. We cannot change the unrolling pass in the performance passes.

We know that not every driver is Mesa and doesn't optimize fully and are currently running benchmarks for performance regressions.

However I am of the opinion that:

if something is marked up explicitly as [loop] or the equivalent SPIR-V hint it should never unroll even with -O3 or if that breaks legalization
things without [unroll] shouldn't unroll unless: their body codegen is illegal or that SPIR-V optimizer pass has been explicitly called for

TL;DR the performance passes should only be allowed to unroll loops without [loop], but thats a thing to address in SPIR-V Tools.

There's obviously the problem of if-statements, switches, constrant propagation, etc. which can allow turning illegal code legal and they have no attributes to control that.

There's a few examples which in my option are a bit messed up and wouldn't make people cry if a flag disabled them
https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/SPIRV-Cookbook.rst#multiple-calls-to-a-function

We could modify the unrolling in the legalization passes to look for illegal code in the loop. This could leave many of the loops in your code.

Love the idea, I guess that moves the problem and our impl into SPIR-V Tools ?

The front-end is not the place to determine if an optimization can be skipped. It has too little information and cannot account for how other optimizations will change things. For example, can we correctly legalize the code if we limit ssa rewrite to opaque objects only? Not very likely. That pass is needed to enable all sort of things like scalar replacement. It might work in your code, but I can easily come up with cases where it will not.

Actually if we step back and look at your idea above, an attribute like [[vk::allow_legalization]] and [[vk::disallow_legalization]] plus a flag to control the default for anything unannotated could be a neat design.

If this could percolate to the SPIR-V Spec as an OpDecorate or similar it could be a really powerful hint what to skip analyzing or speculatively unrolling, constant propagate, etc. for legalization within SPIR-V Tools

At the end of the day analyzing whether to unroll or legalize might be more expensive than just unrolling and legalizing, since you can't exaclty and in leess time, know if all the transformations will end up making the code legal before you try them.

My intuition tells me Halting Problem will rear its ugly head at some point if we try this without annotations and some help from the frontend.

DXC is still the producer that knows when a specific lowering path genuinely needs those expensive legalization transforms for correctness, so the optimizer should not have to guess that globally.

@AnastaZIuk is correct here, unrolling for legalization should not kick in without [unroll]

Stuff like

  [unroll]
  for( int j = 0; j < 2; j++ ) {
    if (constant > j) {
      lSBuffer = gSBuffer1;
    } else {
      lSBuffer = gSBuffer2;
    }
    gRWSBuffer[j] = lSBuffer[j];
  }

should legalize fine, but things like

  [loop]
for( int j = 0; j < 2; j++ ) {
  if (constant > j) {
    lSBuffer = gSBuffer1;
  } else {
    lSBuffer = gSBuffer2;
  }
  gRWSBuffer[j] = lSBuffer[j];
}

really shouldn't IMHO

On a side note, do you see any changes in the time it takes drivers to compile the shaders/pipelines if you skip all of the unrolling in spirv-opt?

We'll take our worst shader and let you know.

The main problem for us is hot-reload/iteration time, and majority of the time is spent in SPIR-V Tools. Whereas with a 58k LoC shader with template hell like ours we'd expect to spend majority of the time in Clang's Sema.

Mind you the difference between SPIR-V ResultID upper bound is only 10k vs 16k for the -fcgl - Vd case and -O3, and the SPIR-V is 12k vs 20k opcodes but the compile time difference is an order of magnitude.

We're open for ideas

#s-perron how would you have us solve the problem of "too much SPIR-V spam".

Sidequests

This branch also fixes one separate producer-side correctness bug: vector splat of bool spec constants was being emitted as OpConstantComposite instead of the legal OpSpecConstantComposite.

@s-perron was that done correctly, or is that something you'd want fixed?

github-project-automation bot added this to HLSL Roadmap Mar 20, 2026

github-project-automation bot moved this to New in HLSL Roadmap Mar 20, 2026

AnastaZIuk mentioned this pull request Mar 20, 2026

Narrow blanket SPIR-V legalization work in optimizer recipes KhronosGroup/SPIRV-Tools#6612

Open

AnastaZIuk marked this pull request as ready for review March 20, 2026 18:18

AnastaZIuk changed the title ~~Signal when SPIR-V legalization needs loop unroll~~ Signal when SPIR-V legalization needs targeted cleanup Mar 20, 2026

AnastaZIuk mentioned this pull request Mar 21, 2026

Narrow blanket SPIR-V loop unroll in optimizer recipes Devsh-Graphics-Programming/SPIRV-Tools#2

Draft

Signal when SPIR-V legalization needs targeted cleanup

66e975a

AnastaZIuk force-pushed the unroll branch from a3904f6 to 66e975a Compare March 22, 2026 07:12

Update SPIR-V submodule pointers

9ac5798

s-perron requested changes Mar 26, 2026

View reviewed changes

github-project-automation bot moved this from New to In progress in HLSL Roadmap Mar 26, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Signal when SPIR-V legalization needs targeted cleanup#8283

Signal when SPIR-V legalization needs targeted cleanup#8283
AnastaZIuk wants to merge 2 commits intomicrosoft:mainfrom
Devsh-Graphics-Programming:unroll

AnastaZIuk commented Mar 20, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Mar 20, 2026 •

edited

Loading

Uh oh!

s-perron left a comment

Uh oh!

s-perron Mar 26, 2026

Uh oh!

devshgraphicsprogramming Mar 26, 2026 •

edited

Loading

Uh oh!

devshgraphicsprogramming commented Mar 26, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

AnastaZIuk commented Mar 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root cause

Spec basis

Benchmark

Validation

Updated test notes

Uh oh!

github-actions bot commented Mar 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

s-perron left a comment

Choose a reason for hiding this comment

Uh oh!

s-perron Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

devshgraphicsprogramming Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

devshgraphicsprogramming commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

We're open for ideas

Sidequests

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

AnastaZIuk commented Mar 20, 2026 •

edited

Loading

github-actions bot commented Mar 20, 2026 •

edited

Loading

devshgraphicsprogramming Mar 26, 2026 •

edited

Loading

devshgraphicsprogramming commented Mar 26, 2026 •

edited

Loading