LLVM and SPIRV-LLVM-Translator pulldown (WW15 2026) by iclsrc · Pull Request #21723 · intel/llvm

iclsrc · 2026-04-10T03:01:43Z

LLVM: llvm/llvm-project@7a3b7f1
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@b241000

These are previously covered by AMDGPUWmmaIntrinsicModsAllReuse.

CONFLICT (content): Merge conflict in clang/include/clang/Basic/DiagnosticSemaKinds.td

As proposed in riscv-non-isa/riscv-c-api-doc#110. No real compiler-rt implementation as Linux does not list these extensions in hwprobe. Signed-off-by: Luke Wren <wren6991@gmail.com>

@bogners

…yout (#188139) fixes #188131 This change addresses stylistic changes @bogners requested in llvm/llvm-project#186215. It also adds `storeMatrixArrayFromVector` to SPIRVLegalizePointerCast.cpp, used when we detect the matrix array of vector memory layout. Changes to storeArrayFromVector were cleanup. Assisted-by Github Copilot for test case check lines

…#188896) When SPIRV-LLVM-Translator is built in-tree (i.e., placed in llvm/projects folder), llvm-spirv target exists. Drop legacy llvm-spirv_target dependency (was for non-runtime build) and add llvm-spirv to runtimes dependencies.

Get rid of several .h.def files which were used to ensure that the macro definitions from llvm-libc-macro would be included in the public header. Replace this logic with YAML instead: add entries to the "macros" list that point to the correct "macro_header" to ensure it would be included. For C standard library headers, list several standard-defined macros to document their availability. For POSIX/Linux headers, only reference a handful of macros, since more planning is needed to decide how to represent platform-specific macros in YAML.

…an (#189109)

…123) Use the generic switch rather than encoding the version number it currently corresponds to.

… for risc-v (#110690)

The code generated for calls with FPCC eligible structs as arguments doesn't consider the bitfield, which results in a store crossing the boundary of the memory allocated using alloca. For example, for the code:

```
struct __attribute__((packed, aligned(1))) S {
  const float f0;
  unsigned f1 : 1;
};
unsigned func(struct S arg) { return arg.f1; }
```

The generated IR is:

```
define dso_local signext i32 @func(
    float [[TMP0:%.*]], i32 [[TMP1:%.*]]) #[[ATTR0:[0-9]+]] {
  [[ENTRY:.*:]]
    [[ARG:%.*]] = alloca [[STRUCT_S:%.*]], align 1
    [[TMP2:%.*]] = getelementptr inbounds nuw { float, i32 }, ptr [[ARG]], i32 0, i32 0
    store float [[TMP0]], ptr [[TMP2]], align 1
    [[TMP3:%.*]] = getelementptr inbounds nuw { float, i32 }, ptr [[ARG]], i32 0, i32 1
    store i32 [[TMP1]], ptr [[TMP3]], align 1
    [[F1:%.*]] = getelementptr inbounds nuw [[STRUCT_S]], ptr [[ARG]], i32 0, i32 1
    [[BF_LOAD:%.*]] = load i8, ptr [[F1]], align 1
    [[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], 1
    [[BF_CAST:%.*]] = zext i8 [[BF_CLEAR]] to i32
    ret i32 [[BF_CAST]]
```

Here, `store i32 [[TMP1]], ptr [[TMP3]], align 1` can be seen crossing the boundary of the allocated memory. If the IR is inspected after optimizations (EarlyCSEPass), what remains is:

```
define dso_local noundef signext i32 @func(
    float [[TMP0:%.*]], i32 [[TMP1:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
  [[ENTRY:.*:]]
    ret i32 0
```

The patch trims the second member of the struct after taking the bitwidth into consideration to decide the appropriate integer type, and the test shows the results of this patch. Note that the bug is seen only when the `f` extension is enabled for FPCC eligibility.

Co-authored-by: muhammad.kamran4 <muhammad.kamran@esperantotech.com>

…697) Device libs has a fast sqrt macro implemented this way.

Add tests targeting assembly printing and miscellaneous CodeGen areas with low coverage:
- asm-printer-cpool.ll: HexagonAsmPrinter exercising constant pool entry emission.
- asm-operand-modifiers.ll: Inline asm operand modifier printing paths (lo/hi/mem).
- target-objfile-sdata.ll, split-double-volatile.ll, reg-info-types.ll: Miscellaneous CodeGen coverage for HexagonTargetObjectFile small data classification, HexagonSplitDouble volatile load handling, and HexagonRegisterInfo register class queries.
- constext-store-imm.ll: HexagonConstExtenders store-immediate optimization paths.

This removes dyn_cast invocations where the argument is already of the target type (including through subtyping). This was created by adding a static assert in dyn_cast and letting an LLM iterate until the code base compiled. I then went through each example and cleaned it up. This does not commit the static assert in dyn_cast, because it would prevent a lot of uses in templated code. To prevent backsliding we should instead add an LLVM aware version of https://clang.llvm.org/extra/clang-tidy/checks/readability/redundant-casting.html (or expand the existing one).
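The detection idea described above can be sketched outside of LLVM. The following is a minimal illustration only: it uses standard `dynamic_cast` rather than LLVM's `dyn_cast`, and the class hierarchy, `isRedundantCast`, and `checkedDynCast` are hypothetical names invented for this sketch, not part of the patch.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical three-level hierarchy, loosely mirroring IR class names.
struct Value { virtual ~Value() = default; };
struct Instruction : Value {};
struct CallInst : Instruction {};

// A cast From* -> To* is redundant when From already is To or derives from
// To, because the conversion is then an implicit upcast that cannot fail.
template <typename To, typename From>
constexpr bool isRedundantCast = std::is_base_of<To, From>::value;

// Hypothetical checked cast: rejects redundant uses at compile time and
// otherwise behaves like a plain dynamic_cast.
template <typename To, typename From> To *checkedDynCast(From *V) {
  static_assert(!isRedundantCast<To, From>,
                "argument is already of the target type; cast is redundant");
  return dynamic_cast<To *>(V);
}
```

With this in place, `checkedDynCast<Instruction>(someCallInstPtr)` fails to compile, while a genuine downcast such as `checkedDynCast<CallInst>(someValuePtr)` still works. As the commit message notes, a hard assert of this kind is too strict for templated code, where the source type is not always known to differ from the target.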

CONFLICT (content): Merge conflict in llvm/lib/IR/DiagnosticInfo.cpp

The test appeared to pass, but it was not actually checking the right thing. A WeakVH simply nulls itself when the value it points to is replaced, so VarIndex became null and a zero value was used, and the test checks still looked fine. Only WeakTrackingVH can be updated to follow the new value. Change the test slightly so that using a zero index is wrong.

Previously, it generated extra `single` quote marks around the outer braces (i.e., `'{'` `6442:\220,1\22` `'}'`). SPIR-V backend does not expect that. It expects `{6442:\220,1\22}`.

… device (#189140)

[Driver][HIP] Fix bundled -S emitting bitcode instead of assembly for device

PR #188262 added support for bundling HIP -S output under the new offload driver, but the device backend still entered the bitcode-emitting path in ConstructPhaseAction. The condition at the Backend phase checked for the new offload driver and directed device code to emit TY_LLVM_BC, without excluding the -S case. This caused the device section in the bundled .s to contain LLVM bitcode instead of textual AMDGPU assembly.

This broke the HIP UT CheckCodeObjAttr test, which greps copyKernel.s for "uniform_work_group_size", a string that only appears in textual assembly, not in bitcode.

Fix by excluding -S (without -emit-llvm) from the new-driver bitcode path, so the device backend falls through to emit TY_PP_Asm (textual assembly). Also add a missing lit test check that the device backend produces assembler output for the bundled -S case.

Fixes: LCOMPILER-553

…aries (#189044) We only did this for local variables but were missing it for globals.

… (#189058)

…188917)

…ardOperands API to BranchOpInterface (#187864) To simplify the output of the reduction-tree pass, this PR introduces the eraseRedundantBlocksInRegion. For regions containing multiple execution paths, this functionality selects the shortest 'interesting' path. Additionally, this PR adds the getSuccessorForwardOperands API to BranchOpInterface. This allows us to extract the ForwardOperands for a specific path chosen from multiple alternatives, enabling the creation of a cf.br operation for the redirected jump.

…tions (#189113) Fixes llvm/llvm-project#187716.

…ter (#188924)

…ssorForwardOperands API to BranchOpInterface" (#189150) Reverts llvm/llvm-project#187864, because it is causing some build bot failures. See https://lab.llvm.org/buildbot/#/builders/138/builds/27662 and https://lab.llvm.org/buildbot/#/builders/169/builds/21376/steps/11/logs/stdio for memory leak issues.

…on index (#188508) When a dynamic index of -1 (the kPoisonIndex sentinel) was folded into the static position of a vector.insert op, foldDenseElementsAttrDestInsertOp would proceed to call calculateInsertPosition, which returned -1. The subsequent iterator arithmetic (allValues.begin() + (-1)) was undefined behaviour, causing an assertion in DenseElementsAttr::get. Fix by bailing out early in foldDenseElementsAttrDestInsertOp when any static position equals kPoisonIndex, consistent with how InsertChainFullyInitialized already guards this case. Fixes #188404 Assisted-by: Claude Code
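The early bail-out described above can be illustrated with a standalone sketch. This is a simplified one-dimensional analogue, not the MLIR code: `foldInsertAt` is a hypothetical helper, and `kPoisonIndex` is redefined locally with the value -1 stated in the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the sentinel named in the commit message (value -1 per the text).
constexpr int64_t kPoisonIndex = -1;

// Hypothetical 1-D analogue of the fold: writes `elem` at `pos` and returns
// true, or returns false without touching `values` when the fold must bail out.
bool foldInsertAt(std::vector<int> &values, int64_t pos, int elem) {
  // Guard first: values.begin() + (-1) would be undefined behaviour.
  if (pos == kPoisonIndex)
    return false;
  // Defensive bounds check (an addition of this sketch, not taken from the patch).
  if (pos < 0 || pos >= static_cast<int64_t>(values.size()))
    return false;
  *(values.begin() + pos) = elem;
  return true;
}
```

The point of the ordering is that the sentinel is checked before any iterator arithmetic happens, so a poison position leaves the destination untouched instead of reaching the undefined `begin() + (-1)`.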

…nt (#189163) When invoking `-test-bytecode-roundtrip=test-dialect-version=X.Y` on a module that contains no test dialect operations, the reader type callback in `runTest0` called `reader.getDialectVersion<test::TestDialect>()` and then immediately asserted that it succeeded. However, if the test dialect was never referenced in the bytecode (because no test dialect types appear in the module), the dialect's version information is not stored in the bytecode, so `getDialectVersion` legitimately returns failure. When the test dialect version is unavailable in the bytecode being read, the module contains no test dialect types, so no "funky"-group overrides are needed and the callback can safely skip by returning `success()`. A regression test is added with a module that has no test dialect ops, exercising the `test-dialect-version=2.0` path that previously crashed. Fixes #128321 Fixes #128325 Assisted-by: Claude Code

… (#188064) This PR adds two new field specifiers (`operand` and `attribute`) and extends the existing one (`result`):
- `default_factory` parameter is added for `result` and `attribute` to specify a default value via a lambda/function
- `kw_only` parameter is added for all three specifiers, to make a field a keyword-only parameter (without giving a default value).

```python
def result(
    *,
    infer_type: bool = False,
    default_factory: Optional[Callable[[], Any]] = None,
    kw_only: bool = False,
) -> Any: ...


def operand(
    *,
    kw_only: bool = False,
) -> Any: ...


def attribute(
    *,
    default_factory: Optional[Callable[[], Any]] = None,
    kw_only: bool = False,
) -> Any: ...
```

Examples of how to use them:

```python
class OperandSpecifierOp(TestFieldSpecifiers.Operation, name="operand_specifier"):
    a: Operand[IntegerType[32]] = operand()
    b: Optional[Operand[IntegerType[32]]] = None
    c: Operand[IntegerType[32]] = operand(kw_only=True)


class ResultSpecifierOp(TestFieldSpecifiers.Operation, name="result_specifier"):
    a: Result[IntegerType[32]] = result()
    b: Result[IntegerType[16]] = result(infer_type=True)
    c: Result[IntegerType] = result(
        default_factory=lambda: IntegerType.get_signless(8)
    )
    d: Sequence[Result[IntegerType]] = result(default_factory=list)
    e: Result[IntegerType[32]] = result(kw_only=True)


class AttributeSpecifierOp(
    TestFieldSpecifiers.Operation, name="attribute_specifier"
):
    a: IntegerAttr = attribute()
    b: IntegerAttr = attribute(
        default_factory=lambda: IntegerAttr.get(IntegerType.get_signless(32), 42)
    )
    c: StringAttr["a"] | StringAttr["b"] = attribute(
        default_factory=lambda: StringAttr.get("a")
    )
    d: IntegerAttr = attribute(kw_only=True)
```

---------

Co-authored-by: Rolf Morel <rolfmorel@gmail.com>

This fixes 04785ad. Co-authored-by: Google Bazel Bot <google-bazel-bot@google.com>

Before the start of the algorithm in weak crossing SIV test, we need to ensure both addrecs are `nsw`

If the trimming candidate subtree is rooted at an alternate-shuffle node with binary ops, and this subtree has the same cost as the buildvector node cost, it is better to stick with the buildvector node to avoid runtime perf regressions from shuffle/extra operations overhead that the cost model may underestimate. Skip trimming if the subtree contains ExtractElement nodes, since those operate on already-materialized vectors, which may reduce vector-to-scalar code movement and give better perf. Reviewers: hiraditya, bababuck, fhahn, RKSimon Pull Request: llvm/llvm-project#188272

Implement non-negative value tracking for SUB-CTLZ chains in GlobalISel, matching the behavior previously added to SelectionDAG. Additionally, refactor the SelectionDAG implementation from the previous patch to improve performance and code density. Related to llvm/llvm-project#136516 and llvm/llvm-project#186338 (comment)

…ace (#188514) The `PromotableRegionOpInterface` implementations use two helpers that are likely useful for other dialects implementing this interface as well:
- `updateTerminator`: Appends the reaching definition as an operand to a block's terminator, falling back to a default when the block has no entry (e.g. dead code).
- `replaceWithNewResults`: Clones an operation with additional result types while preserving its regions, then replaces the original.

This PR extracts them into a common utility header so that downstream dialects can reuse them directly. I'm open to discussion about the location of these utilities.

This implements handling for throwing calls inside an EH cleanup handler. When such a call occurs, the CFG flattening pass replaces it with a cir.try_call op that unwinds to a terminate block. A new CIR operation, cir.eh.terminate, is added to facilitate this handling, and the design document is updated to describe the new behavior. Assisted-by: Cursor / claude-4.6-opus-high

…320) We had an errorNYI diagnostic to trigger when we generated an alias for a ctor or dtor that had an existing declaration. Because functions are used via flat symbol references, all that is needed is to erase the old declaration. This change does that.

Move some functions around so that the CallBrInst processing is contained. The 'static' functions don't need to be declared at the top; just place them before the calls. Fix the naming to use lower-case for the first letter of function names.

This fixes b6e4d27. Co-authored-by: Google Bazel Bot <google-bazel-bot@google.com>

…t & mask ops in sg to wi pass (#187392) This PR adds patterns for the following vector ops in the new sg-to-wi pass:
1. Transpose
2. BitCast
3. CreateMask
4. ConstantMask

…6 (#189468) Fixes: LCOMPILER-1673

…ol-conversion (#189149) Fixes llvm/llvm-project#176889.

…(#189279) This patch introduces an amdgpu wrapper for `rocdl.global.load.async.to.lds.bN` intrinsics, which were introduced in gfx1250. Assisted-by: Claude --------- Signed-off-by: Eric Feng <Eric.Feng@amd.com>

…e.delinearize_index (#188369) Allow `affine.delinearize_index` and `affine.linearize_index` to operate on `vector<...x index>` types in addition to scalar indices. --------- Signed-off-by: Keshav Vinayak Jha <keshavvinayakjha@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

This implements handling of cleanup scopes in cases where a flag is needed to indicate whether or not the cleanup is active. This happens in cases where a cleanup is no longer required, but it isn't at the top of the cleanup stack so it can't be popped. A temporary variable is used to set the cleanup to an inactive state when it is no longer needed. Assisted-by: Cursor / claude-4.6-opus-high (implementation) Assisted-by: Cursor / gpt-5.3-codex (tests)

…v_pulldown

…sts (#3660) Round trip for corresponding CHECK-LLVM is already working for some tests. So they could be enabled Original commit: KhronosGroup/SPIRV-LLVM-Translator@3f5257681447f4c

Update after llvm-project commit 8e1e371 ("[IR][NFC] Mark BranchInst as deprecated (#187314)", 2026-03-19). Original commit: KhronosGroup/SPIRV-LLVM-Translator@6b5f17f12b4be00

After llvm-project commit cf92512 ("[DebugInfo] Add Verifier check for local imports in CU's imports field (#187118)", 2026-03-19), DebugInfo got lost for these tests. Ensure the metadata follows the expected format. Original commit: KhronosGroup/SPIRV-LLVM-Translator@9691713f67ce02c

The tests started to fail with "Unable to meet SPIR-V requirements for this target" after upstream commit llvm/llvm-project@85049fc357ac ("[HLSL][SPIRV] Add support for -g to generate NonSemantic Debug Info (#187051)", 2026-03-25). Original commit: KhronosGroup/SPIRV-LLVM-Translator@40ce6c71d8d5b56

) Original commit: KhronosGroup/SPIRV-LLVM-Translator@34fdf7fcf4e0fd7

Replace manual save/set/restore of `SPIRVUseTextFormat` with `llvm::SaveAndRestore` to guarantee restoration on all exit paths, including the early return on write error. Fixes Coverity CID 546125. Resolves KhronosGroup/SPIRV-LLVM-Translator#3414 Original commit: KhronosGroup/SPIRV-LLVM-Translator@01ee67ccc9a2c61
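Why a scoped guard covers the early-return path can be shown with a small standalone sketch. It deliberately does not use `llvm::SaveAndRestore` itself, so that it compiles without LLVM headers; `ScopedSaveAndRestore`, `UseTextFormat`, and `writeAsText` are hypothetical stand-ins for the real class, flag, and writer.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for llvm::SaveAndRestore: saves the current value,
// installs a new one, and restores the old value in the destructor.
template <typename T> class ScopedSaveAndRestore {
public:
  ScopedSaveAndRestore(T &Target, T NewValue)
      : Ref(Target), Old(std::move(Target)) {
    Ref = std::move(NewValue);
  }
  ~ScopedSaveAndRestore() { Ref = std::move(Old); }
  ScopedSaveAndRestore(const ScopedSaveAndRestore &) = delete;
  ScopedSaveAndRestore &operator=(const ScopedSaveAndRestore &) = delete;

private:
  T &Ref;
  T Old;
};

// Hypothetical global flag, playing the role of SPIRVUseTextFormat.
bool UseTextFormat = false;

// Hypothetical writer: returns false on a (simulated) write error.
bool writeAsText(bool SimulateWriteError) {
  ScopedSaveAndRestore<bool> Guard(UseTextFormat, true);
  if (SimulateWriteError)
    return false; // early return: the destructor still restores the flag
  return UseTextFormat; // true while the guard is active
}
```

With manual save/set/restore, each `return` needs its own restore statement, and a missed one leaves the flag set. With the guard, the restore runs on every exit path, so the flag is back to `false` after both `writeAsText(false)` and `writeAsText(true)`.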

Move annotation strings created from UserSemantic decorations to the constant address space. Even though these strings should disappear before instruction selection, we ought to avoid globals in the private addrspace. Also set the source file and auxiliary data arguments to `null` instead of poison/undef, which seems to be more common in llvm. Original commit: KhronosGroup/SPIRV-LLVM-Translator@8f16307ff9dbe9e

A recent version of SPIRV-Tools found several issues with the test, such as `DebugTypeFunction` having the wrong return type operand and `DebugTypeBasic` missing the flags operand. Original commit: KhronosGroup/SPIRV-LLVM-Translator@bf469923a25d484

) A malformed SPIR-V binary can contain an instruction WordCount below the instruction's minimum, causing wraparound in `resize(WordCount - FixedWC)` and a ~17 GB allocation that can result in `std::bad_alloc` when VA space is limited (32-bit systems, ulimit) or process hang on memory access. Fix by rejecting the malformed input early. AI-assisted: Claude Sonnet 4.6 (commercial SaaS) Original commit: KhronosGroup/SPIRV-LLVM-Translator@5adf335eedd8ba0
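The arithmetic behind the wraparound can be checked with a short sketch. It assumes 32-bit word counts and 4-byte words, which is consistent with the ~17 GB figure above but is an assumption of this example; the function names are hypothetical and do not come from the translator.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: the unchecked subtraction that would feed resize().
// Unsigned arithmetic wraps modulo 2^32 when WordCount < FixedWC.
constexpr uint32_t uncheckedOperandWords(uint32_t WordCount, uint32_t FixedWC) {
  return WordCount - FixedWC;
}

// Hypothetical guard: a word count below the instruction's fixed minimum
// marks the input as malformed and should be rejected before resizing.
constexpr bool hasValidWordCount(uint32_t WordCount, uint32_t FixedWC) {
  return WordCount >= FixedWC;
}

// Allocation size in bytes, assuming 4-byte words (assumption of this sketch).
constexpr uint64_t bytesFor(uint32_t Words) { return uint64_t{Words} * 4; }
```

For example, a word count of 2 against a fixed minimum of 3 wraps to 4294967295 words, which is 17179869180 bytes under the 4-byte assumption. Checking `hasValidWordCount` first turns that case into an early rejection.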

As in title, problem exposed during `sanitize_overflow` enablement in triton compiler: intel/intel-xpu-backend-for-triton#6533 Original commit: KhronosGroup/SPIRV-LLVM-Translator@b2410000b1ff3c9

Conflicts: clang/test/lit.site.cfg.py.in libclc/clc/lib/amdgpu/workitem/clc_get_local_id.cl libclc/libspirv/lib/amdgcn-amdhsa/SOURCES

github-advanced-security

zizmor found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

rampitec and others added 30 commits March 27, 2026 15:20

[AMDGPU] Remove neg support from 4 more gfx1250 WMMA (#189115)

a2d84b5

These are previously covered by AMDGPUWmmaIntrinsicModsAllReuse.

Merge from 'main' to 'sycl-web' (149 commits)

bbb3d47

CONFLICT (content): Merge conflict in clang/include/clang/Basic/DiagnosticSemaKinds.td

[RISCV] Allocate feature bits for Zifencei and Zmmul (#143306)

efba01a

As proposed in riscv-non-isa/riscv-c-api-doc#110. No real compiler-rt implementation as Linux does not list these extensions in hwprobe. Signed-off-by: Luke Wren <wren6991@gmail.com>

[LLD][skip ci] Fix typo in linker_script.rst (#148867)

1128d74

libclc: Simplify fract implementation (#189080)

15bc5b0

[compiler-rt] Add interceptors for free_[aligned_]sized for asan+hwas…

a5fa4db

…an (#189109)

[Fuchsia] Set LIBCXX_ABI_UNSTABLE instead of LIBCXX_ABI_VERSION (#189…

c4847d2

…123) Use the generic switch rather than encoding the version number it currently corresponds to.

AMDGPU: Skip last corrections and scaling for afn llvm.sqrt.f64 (#183…

9be0cc1

…697) Device libs has a fast sqrt macro implemented this way.

Merge from 'main' to 'sycl-web' (18 commits)

0781c47

CONFLICT (content): Merge conflict in llvm/lib/IR/DiagnosticInfo.cpp

[XeVM] Fix the cache-control metadata string generation. (#187591)

8e59c3a

Previously, it generated extra `single` quote marks around the outer braces (i.e., `'{'` `6442:\220,1\22` `'}'`). SPIR-V backend does not expect that. It expects `{6442:\220,1\22}`.

[clang][bytecode] Skip rvalue subobject adjustments for global tempor…

fb09449

…aries (#189044) We only did this for local variables but were missing it for globals.

[clang][bytecode] Add support for objc array- and dictionary literals…

cb8b65e

… (#189058)

[clang][bytecode] Handle strcmp() not pointing to primitive arrays (#…

097abb3

…188917)

[clang-tidy] Fix rvalue-reference-param-not-moved FP on implicit func…

ad91a2f

…tions (#189113) Fixes llvm/llvm-project#187716.

[mlir][vector] Reject alignment attribute on tensor-level gather/scat…

5ae2fe7

…ter (#188924)

[Clang][NFC] Add the list of C++26 papers approved in Kona and Croydon

64d2f70

[Clang][NFC] Trivial relocation is no longer a c++26 feature

16e0658

jhuber6 and others added 27 commits March 30, 2026 14:32

[LLVM] Fix invalid shadowed type name

23f95fa

[Bazel] Fixes 04785ad (#189456)

6021270

This fixes 04785ad. Co-authored-by: Google Bazel Bot <google-bazel-bot@google.com>

[DA] Require nsw for AddRecs in the WeakCrossing SIV test (#185041)

804ece6

Before the start of the algorithm in weak crossing SIV test, we need to ensure both addrecs are `nsw`

[Bazel] Fixes b6e4d27 (#189473)

19caff4

This fixes b6e4d27. Co-authored-by: Google Bazel Bot <google-bazel-bot@google.com>

[MLIR] [XeGPU] Add distribution patterns for vector transpose, bitcas…

e50f08b

…t & mask ops in sg to wi pass (#187392) This PR adds patterns for the following vector ops in the new sg-to-wi pass:
1. Transpose
2. BitCast
3. CreateMask
4. ConstantMask

[AMDGPU] Drop A and B neg modifier from amdgcn_wmma_bf16_16x16x32_bf1…

5f99854

…6 (#189468) Fixes: LCOMPILER-1673

[clang-tidy] Add AllowLogicalOperatorConversion option to implicit-bo…

76f5c5d

…ol-conversion (#189149) Fixes llvm/llvm-project#176889.

[mlir][amdgpu] implement amdgpu.global_load_async_to_lds for gfx1250 …

ae835de

…(#189279) This patch introduces an amdgpu wrapper for `rocdl.global.load.async.to.lds.bN` intrinsics, which were introduced in gfx1250. Assisted-by: Claude --------- Signed-off-by: Eric Feng <Eric.Feng@amd.com>

Merge commit '7a3b7f142d8ffd7b3e2a9cf0a065e3ff7bf76241' into llvmspir…

78a867a

…v_pulldown

Add round-trip tests through SPIR-V backend for previously failing te…

8b096c3

…sts (#3660) Round trip for corresponding CHECK-LLVM is already working for some tests. So they could be enabled Original commit: KhronosGroup/SPIRV-LLVM-Translator@3f5257681447f4c

Migrate away from BranchInst

dba69fd

Update after llvm-project commit 8e1e371 ("[IR][NFC] Mark BranchInst as deprecated (#187314)", 2026-03-19). Original commit: KhronosGroup/SPIRV-LLVM-Translator@6b5f17f12b4be00

Adjust tests where DCE is removing IR and enable round-trip tests (#3665

dd76f0d

) Original commit: KhronosGroup/SPIRV-LLVM-Translator@34fdf7fcf4e0fd7

Add sadd_with_overflow_i8 support (#3673)

ff06617

As in title, problem exposed during `sanitize_overflow` enablement in triton compiler: intel/intel-xpu-backend-for-triton#6533 Original commit: KhronosGroup/SPIRV-LLVM-Translator@b2410000b1ff3c9

iclsrc added the disable-lint label (Skip linter check step and proceed with build jobs) Apr 10, 2026

Merge remote-tracking branch 'origin/sycl' into llvmspirv_pulldown

4582a4c

Conflicts: clang/test/lit.site.cfg.py.in libclc/clc/lib/amdgpu/workitem/clc_get_local_id.cl libclc/libspirv/lib/amdgcn-amdhsa/SOURCES

github-advanced-security bot found potential problems Apr 10, 2026

iclsrc wants to merge 3218 commits into sycl from llvmspirv_pulldown

20 participants
