initial migration of OA and static_map by srinivasyadav18 · Pull Request #7705 · NVIDIA/cccl

srinivasyadav18 · 2026-02-18T04:14:14Z

Description

closes #7463

Checklist

New or existing tests cover these changes.
The documentation is up to date with these changes.

copy-pr-bot · 2026-02-18T04:14:18Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

PointKernel · 2026-02-18T20:47:35Z

cudax/include/cuda/experimental/__cuco/__detail/bitwise_compare.cuh

+_CCCL_HOST_DEVICE inline int __cuda_memcmp(const void* __lhs, const void* __rhs, ::cuda::std::size_t __count)
+{
+  auto __lhs_c = reinterpret_cast<const unsigned char*>(__lhs);
+  auto __rhs_c = reinterpret_cast<const unsigned char*>(__rhs);
+  while (__count--)
+  {
+    auto const __lhs_v = *__lhs_c++;
+    auto const __rhs_v = *__rhs_c++;
+    if (__lhs_v < __rhs_v)
+    {
+      return -1;
+    }
+    if (__lhs_v > __rhs_v)
+    {
+      return 1;
+    }
+  }
+  return 0;
+}


Does CCCL internally offer something similar? I feel this is a very generic utility that we shouldn't need to implement ourselves.
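For context on the question, the helper in the diff follows the same contract as `std::memcmp`: it compares bytes as `unsigned char` and reports the sign of the first difference. The following is a minimal host-only sketch of that equivalence. The names `bytewise_compare`, `sign_of`, and `agrees_with_memcmp` are hypothetical and are not part of CCCL or this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical host-only stand-in for the helper in the diff above.
// Returns -1, 0, or 1 depending on the first differing byte.
inline int bytewise_compare(const void* lhs, const void* rhs, std::size_t count)
{
  const auto* l = static_cast<const unsigned char*>(lhs);
  const auto* r = static_cast<const unsigned char*>(rhs);
  for (; count != 0; --count, ++l, ++r)
  {
    if (*l != *r)
    {
      return *l < *r ? -1 : 1;
    }
  }
  return 0;
}

// std::memcmp only guarantees the sign of its result, so normalize to -1/0/1
// before comparing the two implementations.
inline int sign_of(int v)
{
  return (v > 0) - (v < 0);
}

// True when the hand-written loop and std::memcmp agree on the ordering.
inline bool agrees_with_memcmp(const void* lhs, const void* rhs, std::size_t count)
{
  return bytewise_compare(lhs, rhs, count) == sign_of(std::memcmp(lhs, rhs, count));
}
```

On the host the standard function could be used directly. Whether an equivalent device-callable utility already exists in CCCL is the open question raised in this comment.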

PointKernel · 2026-02-18T20:54:08Z

cudax/include/cuda/experimental/__cuco/__detail/equal_wrapper.cuh

+  template <class _Lhs, class _Rhs>
+  _CCCL_DEVICE constexpr __equal_result __equal_to(const _Lhs& __lhs, const _Rhs& __rhs) const noexcept
+  {
+    return __equal(__lhs, __rhs) ? __equal_result::__equal : __equal_result::__unequal;
+  }
+
+  template <__is_insert _IsInsert, class _Lhs, class _Rhs>
+  _CCCL_DEVICE constexpr __equal_result operator()(const _Lhs& __lhs, const _Rhs& __rhs) const noexcept


The purpose of this equal wrapper is to encapsulate both the bitwise sentinel check and the key comparison via the key_equal comparator into a single API, so users don’t need to manually perform a sentinel check before invoking the equality comparison.

However, in cases where the sentinel check has already been performed and only key equality is desired, it is preferable to call __equal_to directly instead of using the wrapper operator, which always performs the sentinel check.

It would be helpful to add documentation here to clarify this distinction and guide users on when to use each path.
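The distinction could be documented along the lines of the host-only sketch below. It is an illustration under assumptions, not the PR's implementation: `equal_wrapper_sketch`, `equal_result`, and the choice to test the second argument against the empty sentinel are all hypothetical, and the `_IsInsert` template parameter from the diff is omitted.

```cpp
#include <cassert>
#include <cstring>
#include <functional>

// Hypothetical result type: the slot is empty, or the keys match or differ.
enum class equal_result
{
  unequal,
  empty,
  equal
};

template <class Key, class KeyEqual = std::equal_to<Key>>
struct equal_wrapper_sketch
{
  Key empty_sentinel;
  KeyEqual key_equal{};

  // Key comparison only. Use this path when the caller has already
  // established that the slot does not hold the sentinel.
  equal_result equal_to(const Key& lhs, const Key& rhs) const
  {
    return key_equal(lhs, rhs) ? equal_result::equal : equal_result::unequal;
  }

  // Full path: bitwise sentinel check first, then the key comparison.
  // Assumption: the second argument is the slot content.
  equal_result operator()(const Key& probe_key, const Key& slot_key) const
  {
    if (std::memcmp(&slot_key, &empty_sentinel, sizeof(Key)) == 0)
    {
      return equal_result::empty;
    }
    return equal_to(probe_key, slot_key);
  }
};
```

With a sentinel of `-1`, calling the wrapper as `eq(5, -1)` reports an empty slot, whereas `eq.equal_to(-1, -1)` skips the sentinel check and reports plain key equality.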

PointKernel · 2026-02-18T20:58:07Z

cudax/include/cuda/experimental/__cuco/__detail/prime.hpp

We probably want to use cuda::std::array instead of std::array for the prime array, as it can be calculated on either device or host.

PointKernel · 2026-02-18T21:01:31Z

cudax/include/cuda/experimental/__cuco/__detail/utils.cuh

+//! @brief Converts pair to tuple.
+template <class _Key, class _Value>
+struct __slot_to_tuple
+{
+  template <class _Slot>
+  _CCCL_DEVICE ::cuda::std::tuple<_Key, _Value> operator()(const _Slot& __slot)
+  {
+    return ::cuda::std::tuple<_Key, _Value>(__slot.first, __slot.second);
+  }
+};
+
+//! @brief Device functor returning whether the input slot is filled.
+//!
+//! Template parameter:
+//! - `_Key`: Key type
+
+template <class _Key>
+struct __slot_is_filled
+{
+  _Key __empty_key_sentinel;
+
+  template <class _Slot>
+  _CCCL_DEVICE bool operator()(const _Slot& __slot)
+  {
+    return !__detail::__bitwise_compare(::cuda::std::get<0>(__slot), __empty_key_sentinel);
+  }
+};


I think these can be safely removed, as they are only used by the legacy implementation.

PointKernel · 2026-02-18T21:02:30Z

cudax/include/cuda/experimental/__cuco/__detail/utils.hpp

+constexpr _ForwardIt __lower_bound(_ForwardIt __first, _ForwardIt __last, const _Tp& __value)
+{
+  using __diff_type = typename std::iterator_traits<_ForwardIt>::difference_type;


Suggested change

-constexpr _ForwardIt __lower_bound(_ForwardIt __first, _ForwardIt __last, const _Tp& __value)
-{
-  using __diff_type = typename std::iterator_traits<_ForwardIt>::difference_type;
+_CCCL_HOST_DEVICE constexpr _ForwardIt __lower_bound(_ForwardIt __first, _ForwardIt __last, const _Tp& __value)
+{
+  using __diff_type = typename cuda::std::iterator_traits<_ForwardIt>::difference_type;

This is supposed to be a host-device API.

We could also use CCCL’s lower_bound if it exists and is constexpr under C++17.
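The constraint matters because `std::lower_bound` only became `constexpr` in C++20, so a C++17 build needs either a library version that is already `constexpr` or a hand-written one. A minimal host-only sketch follows, assuming random-access iterators. The name `lower_bound_sketch` and the `primes_sketch` table are hypothetical, and a host-device version would additionally carry the `_CCCL_HOST_DEVICE` annotation from the suggestion above.

```cpp
#include <cassert>
#include <iterator>

// Binary search for the first element that is not less than `value`.
// Assumption: random-access iterators, so plain arithmetic can be used
// instead of std::distance / std::advance.
template <class ForwardIt, class T>
constexpr ForwardIt lower_bound_sketch(ForwardIt first, ForwardIt last, const T& value)
{
  using diff_type = typename std::iterator_traits<ForwardIt>::difference_type;
  diff_type count = last - first;
  while (count > 0)
  {
    const diff_type step = count / 2;
    const ForwardIt mid  = first + step;
    if (*mid < value)
    {
      // Everything up to and including mid is too small.
      first = mid + 1;
      count -= step + 1;
    }
    else
    {
      count = step;
    }
  }
  return first;
}

// Hypothetical sorted table used only to exercise the function at compile time.
constexpr int primes_sketch[] = {2, 3, 5, 7, 11, 13};

static_assert(*lower_bound_sketch(primes_sketch, primes_sketch + 6, 6) == 7,
              "first prime not less than 6 is 7");
static_assert(lower_bound_sketch(primes_sketch, primes_sketch + 6, 14) == primes_sketch + 6,
              "no prime in the table is >= 14");
```

The `static_assert` lines confirm that the search is usable in constant expressions under C++17.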

PointKernel · 2026-02-18T21:33:43Z

cudax/include/cuda/experimental/__cuco/__static_map/kernels.cuh

+  const auto __loop_stride = ::cuda::experimental::cuco::__detail::__grid_stride() / _CgSize;
+  auto __idx               = ::cuda::experimental::cuco::__detail::__global_thread_id() / _CgSize;
+
+  auto __warp                  = cg::tiled_partition<32, cg::thread_block>(__block);


Suggested change

-  auto __warp = cg::tiled_partition<32, cg::thread_block>(__block);
+  auto __warp = cg::tiled_partition<warp_size, cg::thread_block>(__block);

Is there any internal utility we could use to avoid the magic number? We do have such a utility in cuco, though.

PointKernel · 2026-02-18T21:34:56Z

cudax/include/cuda/experimental/__cuco/static_map.cuh

+template <class _Key,
+          class _Tp,
+          ::cuda::thread_scope _Scope = ::cuda::thread_scope_device,
+          class _KeyEqual             = thrust::equal_to<_Key>,


Suggested change

-          class _KeyEqual             = thrust::equal_to<_Key>,
+          class _KeyEqual             = cuda::std::equal_to<_Key>,

PointKernel · 2026-02-18T21:37:37Z

cudax/include/cuda/experimental/__cuco/static_map.cuh

+  using key_type            = _Key;
+  using mapped_type         = _Tp;
+  using value_type          = ::cuda::std::pair<_Key, _Tp>;
+  using size_type           = ::cuda::std::size_t;


If so, we can get rid of cuco::extent as well as the hash sanitizing logic altogether.

PointKernel · 2026-02-18T21:41:11Z

cudax/include/cuda/experimental/__cuco/static_map.cuh

+//! @tparam _KeyEqual Binary callable type used to compare two keys for equality
+//! @tparam _ProbingScheme Probing scheme type (e.g., `linear_probing`, `double_hashing`)
+//! @tparam _BucketSize Number of slots per bucket
+//! @tparam _MemoryResource Type of memory resource used for device storage


Let’s move all existing documentation here as well.

PointKernel · 2026-02-18T21:48:58Z

cudax/include/cuda/experimental/__cuco/__detail/extent.cuh

+{
+/// @brief A valid (post-rounding) extent type.
+template <class _SizeType>
+using __valid_extent = extent<_SizeType, dynamic_extent>;


valid_extent is intended as a wrapper around fast_div for runtime sizes, or around compile-time constants otherwise. If we drop compile-time support, there is no reason to keep this type.

initial migration of OA and static_map

0edb761

github-project-automation bot added this to CCCL Feb 18, 2026

github-project-automation bot moved this to Todo in CCCL Feb 18, 2026

cccl-authenticator-app bot moved this from Todo to In Progress in CCCL Feb 18, 2026

PointKernel self-requested a review February 18, 2026 20:10

PointKernel requested changes Feb 18, 2026

srinivasyadav18 wants to merge 1 commit into NVIDIA:main from srinivasyadav18:cuco_static_map
