Enable MSVC (cl.exe) build on Windows ARM64#1285
Enable MSVC (cl.exe) build on Windows ARM64#1285pdeep854 wants to merge 5 commits intoxtensor-stack:masterfrom
Conversation
serge-sans-paille
left a comment
There was a problem hiding this comment.
First set of review, more to come later but at least it will trigger the discussion
| #include <algorithm> | ||
| #include <array> | ||
| #include <complex> | ||
| #include <cstdio> |
There was a problem hiding this comment.
unused header, probably a leftover from a debug session.
| for (size_t i = 0; i < sizeof...(Is); ++i) | ||
| if ((bitmask >> i) & 1u) | ||
| std::swap(mask_buffer[inserted++], mask_buffer[i]); | ||
| // Fill remaining positions with the last valid index to avoid undefined behavior |
There was a problem hiding this comment.
why not always setting zero?
There was a problem hiding this comment.
after swapping some unused entries at the end of mask_buffer still contain old index values. This mask is later used by swizzle() to gather vector elements. on MSVC arm64, swizzle() returns 0 for invalid indices, so these leftover values cause incorrect results.
to fix this the unused entries are filled with the last valid index, which acts as a safe placeholder. can also set it to 0
| compress_mask.store_aligned(&mask_out[0]); | ||
| alignas(A::alignment()) T z_out[size]; | ||
| z.store_aligned(&z_out[0]); | ||
| auto res = swizzle(z, compress_mask); |
There was a problem hiding this comment.
what if the overload we were using previously actually exists?
There was a problem hiding this comment.
thanks will remove it as it was added for debugging
include/xsimd/arch/xsimd_neon.hpp
Outdated
| alignas(16) T data[] = { static_cast<T>(args)... }; | ||
| if constexpr (sizeof(T) == 1) | ||
| { | ||
| if constexpr (std::is_unsigned<T>::value) |
There was a problem hiding this comment.
xsimd requires only C++14 as of this PR :-/
| { | ||
| #if defined(_MSC_VER) && defined(_M_ARM64) | ||
| alignas(16) double data[] = { d0, d1 }; | ||
| return vld1q_f64(data); |
There was a problem hiding this comment.
we could make this the default if it's more prtable: https://godbolt.org/z/fP3Tvjv5f
| static_cast<unsigned_type>(b0 ? -1LL : 0LL), | ||
| static_cast<unsigned_type>(b1 ? -1LL : 0LL) | ||
| }; | ||
| return vld1q_u64(data); |
9862d57 to
920fafc
Compare
920fafc to
e2f0536
Compare
|
@serge-sans-paille i have addressed all review comments. request you to review |
serge-sans-paille
left a comment
There was a problem hiding this comment.
Another set of review.
| if ((bitmask >> i) & 1u) | ||
| std::swap(mask_buffer[inserted++], mask_buffer[i]); | ||
| // Fill remaining (don't-care) tail positions with index 0. | ||
| for (size_t i = inserted; i < sizeof...(Is); ++i) |
There was a problem hiding this comment.
use std::fill or memset instead.
| { | ||
| using type = uint64x2_t; | ||
| }; | ||
|
|
There was a problem hiding this comment.
I like this change
include/xsimd/arch/xsimd_neon.hpp
Outdated
| template <class A, class T, class... Args, detail::enable_integral_t<T> = 0> | ||
| XSIMD_INLINE batch<T, A> set(batch<T, A> const&, requires_arch<neon>, Args... args) noexcept | ||
| { | ||
| #if defined(_MSC_VER) && defined(_M_ARM64) && !defined(__clang__) |
There was a problem hiding this comment.
we could probably syndicate this long guard in a single macro variable somewhere.
include/xsimd/arch/xsimd_neon.hpp
Outdated
| alignas(16) T data[] = { static_cast<T>(args)... }; | ||
| return detail::msvc_arm64_load<T>(data); | ||
| #else | ||
| return xsimd::types::detail::neon_vector_type<T> { args... }; |
There was a problem hiding this comment.
I have the feeling that we could use detail::msvc_arm64_load<T> in all context there, avoiding the macro guard.
include/xsimd/arch/xsimd_neon.hpp
Outdated
| using unsigned_type = as_unsigned_integer_t<T>; | ||
| alignas(16) unsigned_type data[] = { static_cast<unsigned_type>(args ? -1LL : 0LL)... }; | ||
| return detail::msvc_arm64_load_u<T>(data); | ||
| #else |
PR: Enable MSVC (cl.exe) build on Windows ARM64
Summary
This PR restores full build compatibility for xsimd with the MSVC compiler (
cl.exe) onWindows ARM64 (
_M_ARM64). Prior to this change, the project failed to compile with MSVCon ARM64 due to fundamental differences in how MSVC exposes ARM NEON intrinsics compared
to GCC and Clang.
All changes are strictly additive and backward-compatible: every existing GCC/Clang code
path is preserved unchanged inside
#elsebranches. No behaviour is altered for anynon-MSVC-ARM64 target.
Testing
The build was verified locally on a Windows ARM64 machine using Visual Studio 2022:
Result: All three targets build successfully with zero errors: