[WIP] Improve reducer API for memory allocation & compute by acosmicflamingo · Pull Request #9290 · NVIDIA/cccl

acosmicflamingo · 2026-06-08T02:13:19Z

Description

Resolves #5638

Add 'get_temp_storage_bytes' and 'compute' functions to _Reduce class to improve API usability (as suggested by @kkraus14 and @shwina).

Checklist

New or existing tests cover these changes.
The documentation is up to date with these changes.

@kkraus14

* Start migrating Reduce.__call__ dunder methods to more descriptive API (as suggested by @kkraus14 and @shwina) * Add 'get_temp_storage_bytes' and 'compute' method stubs * Display deprecation warnings when users use utilize Reduce.__call__ dunder method (and add respective tests)

copy-pr-bot · 2026-06-08T02:13:22Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

acosmicflamingo · 2026-06-08T22:14:07Z

@shwina I imagine that we want to avoid something like this, correct?

diff a/python/cuda_cccl/cuda/compute/algorithms/_reduce.py 
     b/python/cuda_cccl/cuda/compute/algorithms/_reduce.py
@@ -35,6 +35,8 @@ from ..typing import (
 
 class _Reduce:
     __slots__ = [
+        "d_in",
+        "d_out",
         "d_in_cccl",
         "d_out_cccl",
         "h_init_cccl",
@@ -52,6 +54,8 @@ class _Reduce:
         h_init: np.ndarray | GpuStruct,
         determinism: Determinism,
     ):
+        self.d_in = d_in
+        self.d_out = d_out
         self.d_in_cccl = cccl.to_cccl_input_iter(d_in)
         self.d_out_cccl = cccl.to_cccl_output_iter(d_out)
         self.h_init_cccl = cccl.to_cccl_value(h_init)
@@ -152,8 +156,8 @@ class _Reduce:
         h_init: np.ndarray | GpuStruct,
         stream=None,
     ):
-        set_cccl_iterator_state(self.d_in_cccl, d_in)
-        set_cccl_iterator_state(self.d_out_cccl, d_out)
+        set_cccl_iterator_state(self.d_in_cccl, self.d_in)
+        set_cccl_iterator_state(self.d_out_cccl, self.d_out)
 
         # Update op state for stateful ops
         op_adapter = make_op_adapter(op)

On the surface, it would pave the way for implementing the API we seek. However, make_reduce_into utilizes the cache_with_registered_key_functions decorator, and caching can pose an issue if we're now saving self.d_in and self.d_out attributes in our reducers. It explains why in my test suite, I could have two identical tests (or three, four, etc) and only the first would pass an assert statement. In the subsequent tests when I'm creating a new reducer, it fetches the reducer created from the first test, and mutates those self.d_in and self.d_out attributes and doesn't actually touch the d_in and d_out arrays that were passed to make_reduce_into in the current test.

acosmicflamingo added 3 commits June 7, 2026 16:59

Fix broken link in python/cuda_cccl README

f4d7242

Move deprecated _Reduce.__call__ functionality to _execute_call

4022970

github-project-automation Bot added this to CCCL Jun 8, 2026

github-project-automation Bot moved this to Todo in CCCL Jun 8, 2026

cccl-authenticator-app Bot moved this from Todo to In Progress in CCCL Jun 8, 2026

Polish new API kwargs

b17dadc

acosmicflamingo force-pushed the improve-mem-compute-api branch from 3d56a30 to b17dadc Compare June 8, 2026 14:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Improve reducer API for memory allocation & compute#9290

[WIP] Improve reducer API for memory allocation & compute#9290
acosmicflamingo wants to merge 4 commits into
NVIDIA:mainfrom
acosmicflamingo:improve-mem-compute-api

acosmicflamingo commented Jun 8, 2026

Uh oh!

copy-pr-bot Bot commented Jun 8, 2026

Uh oh!

acosmicflamingo commented Jun 8, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

acosmicflamingo commented Jun 8, 2026

Description

Checklist

Uh oh!

copy-pr-bot Bot commented Jun 8, 2026

Uh oh!

acosmicflamingo commented Jun 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

acosmicflamingo commented Jun 8, 2026 •

edited

Loading