cudax/stf: cuda_try migration — stream_ctx (PR8)#9306
Conversation
Convert stream synchronization, submit timing, wait(), and UNITTEST host register paths to cuda_try; roll back reorder timing events on throw via SCOPE(fail) and use cudaEventCreateWithFlags for overload-set safety.
SummaryThis PR continues the staged migration from ChangesCore stream context operations:
Unit test updates:
Metrics
WalkthroughSingle file updating stream context error handling from ChangesStream Context Error Handling Refactoring
Possibly related PRs
Suggested labels
Suggested reviewers
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 5c509e00-6061-424e-9f00-203ceaf361b2
📒 Files selected for processing (1)
cudax/include/cuda/experimental/__stf/stream/stream_ctx.cuh
| startEvent = cuda_try<cudaEventCreateWithFlags>(cudaEventDefault); | ||
| stopEvent = cuda_try<cudaEventCreateWithFlags>(cudaEventDefault); | ||
| timing_events_owned = true; |
There was a problem hiding this comment.
critical: Event leak on partial creation failure.
If cudaEventCreateWithFlags succeeds at line 565 but fails at line 566, startEvent is non-null but timing_events_owned is still false. The SCOPE(fail) guard at line 530 checks if (timing_events_owned) and skips cleanup, leaking startEvent.
Move line 567 immediately after line 565:
startEvent = cuda_try<cudaEventCreateWithFlags>(cudaEventDefault);
+timing_events_owned = true;
stopEvent = cuda_try<cudaEventCreateWithFlags>(cudaEventDefault);
-timing_events_owned = true;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| startEvent = cuda_try<cudaEventCreateWithFlags>(cudaEventDefault); | |
| stopEvent = cuda_try<cudaEventCreateWithFlags>(cudaEventDefault); | |
| timing_events_owned = true; | |
| startEvent = cuda_try<cudaEventCreateWithFlags>(cudaEventDefault); | |
| timing_events_owned = true; | |
| stopEvent = cuda_try<cudaEventCreateWithFlags>(cudaEventDefault); |
| cuda_try<cudaSetDevice>(0); | ||
| cuda_try<cudaEventRecord>(stopEvent, fence()); | ||
| cuda_try<cudaEventSynchronize>(stopEvent); | ||
| state.submission_time = cuda_try<cudaEventElapsedTime>(startEvent, stopEvent); | ||
| timing_events_owned = false; |
There was a problem hiding this comment.
critical: Event leak in success path.
After computing elapsed time at line 582, neither startEvent nor stopEvent is destroyed. Line 583 disables the SCOPE(fail) cleanup by setting timing_events_owned = false, so the events are never freed. cudaEvent_t requires explicit cudaEventDestroy.
Add destruction before disabling ownership:
state.submission_time = cuda_try<cudaEventElapsedTime>(startEvent, stopEvent);
+cuda_safe_call(cudaEventDestroy(startEvent));
+cuda_safe_call(cudaEventDestroy(stopEvent));
timing_events_owned = false;Use cuda_safe_call here (not cuda_try) because the SCOPE(fail) is still active and we cannot let destruction throw into the failure handler.
|
/ok to test 837b649 |
😬 CI Workflow Results🟥 Finished in 34m 53s: Pass: 21%/55 | Total: 5h 44m | Max: 34m 50s | Hits: 99%/2329See results here. |
|
/ok to test 85303ba |
Summary
stream/stream_ctx.cuhfromcuda_safe_calltocuda_try(16 sites).cudaEventCreateWithFlagsandSCOPE(fail)to destroy timing events iftask.run()or end-timing throws.cudaHostRegisterpaths usecuda_try<>; scalar fixture usesSCOPE(fail)tofreemalloc on register failure.cuda_safe_call→cuda_trymigration (PR8 of 11).Test plan
ninja -j 12 -C build/cudax-cpp20 cudax.test.stf.unittest_headers.__stf.stream.stream_ctx