Conversation
Actionable comments posted: 1
♻️ Duplicate comments (1)
cpp/src/mip_heuristics/mip_scaling_strategy.cu (1)
260-263: `row_skip_scaling` is now correctly zero-initialized before use.
The `thrust::fill` at lines 261-262 ensures `row_skip_scaling` is deterministic before `compute_big_m_skip_rows` populates it. This resolves the prior uninitialized-read risk.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/mip_scaling_strategy.cu` around lines 260 - 263, The prior uninitialized-read risk for row_skip_scaling is fixed by explicitly zero-initializing it; ensure you keep the thrust::fill call that uses handle_ptr_->get_thrust_policy() to fill row_skip_scaling.begin()..end() with i_t(0) before calling compute_big_m_skip_rows so compute_big_m_skip_rows only reads initialized values from row_skip_scaling; confirm the allocation of iteration_scaling remains unchanged and placed after row_skip_scaling is initialized.
🧹 Nitpick comments (2)
cpp/src/mip_heuristics/mip_scaling_strategy.cu (2)
66-80: Prefer `thrust::maximum<T>` / `thrust::minimum<T>` over custom `max_op_t` / `min_op_t`.
Both structs are direct re-implementations of standard Thrust functors that are already `__host__ __device__`-compatible and accepted by CUB's `DeviceSegmentedReduce::Reduce`.
♻️ Proposed refactor

```diff
-template <typename t_t>
-struct max_op_t {
-  __host__ __device__ t_t operator()(const t_t& lhs, const t_t& rhs) const
-  {
-    return lhs > rhs ? lhs : rhs;
-  }
-};
-
-template <typename t_t>
-struct min_op_t {
-  __host__ __device__ t_t operator()(const t_t& lhs, const t_t& rhs) const
-  {
-    return lhs < rhs ? lhs : rhs;
-  }
-};
```

Then replace usages with `thrust::maximum<f_t>{}` and `thrust::minimum<f_t>{}` at call sites.
As per coding guidelines: "Avoid reinventing functionality already available in Thrust, CCCL, or RMM libraries."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/mip_scaling_strategy.cu` around lines 66 - 80, Replace the custom functors max_op_t and min_op_t with the standard Thrust functors: remove or stop using max_op_t and min_op_t and update all call sites to use thrust::maximum<f_t>{} and thrust::minimum<f_t>{} (or the appropriate template type instead of f_t) so the code leverages the existing __host__ __device__-compatible Thrust implementations accepted by CUB::DeviceSegmentedReduce::Reduce.
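A call site after this refactor would look roughly like the sketch below. The surrounding argument names (`values_in`, `row_offsets`, `row_inf_norm`, `stream`) are assumptions for illustration, not the file's actual identifiers; only the swap from `max_op_t<f_t>{}` to `thrust::maximum<f_t>{}` is the point.

```cuda
#include <cub/device/device_segmented_reduce.cuh>
#include <thrust/functional.h>

// Before: cub::DeviceSegmentedReduce::Reduce(..., max_op_t<f_t>{}, f_t{0}, stream);
// After: the standard Thrust functor, already __host__ __device__-compatible.
cub::DeviceSegmentedReduce::Reduce(temp_storage,
                                   temp_storage_bytes,
                                   values_in,           // hypothetical input iterator
                                   row_inf_norm.data(), // per-segment maxima out
                                   num_rows,
                                   row_offsets,         // segment begin offsets
                                   row_offsets + 1,     // segment end offsets
                                   thrust::maximum<f_t>{},
                                   f_t{0},              // identity for max of abs values
                                   stream);
```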
104-177: `compute_big_m_skip_rows` recomputes `row_inf_norm` redundantly, and ignores the passed `temp_storage` stream.
Two issues:
- The function hard-codes `op_problem.handle_ptr->get_stream()` for its CUB calls (unlike `compute_row_inf_norm`, which accepts a `stream_view` parameter). This inconsistency would silently break if a caller ever passes a different stream.
- The max-abs segmented reduction on lines 121-130 recomputes `row_inf_norm`, which is already computed in the first call to `compute_row_inf_norm` on loop iteration 0 (line 301). Since `compute_big_m_skip_rows` is called once before the loop, `row_inf_norm` gets computed twice before any scaling is applied.

Consider adding a `stream_view` parameter to `compute_big_m_skip_rows`, consistent with `compute_row_inf_norm`, and passing the already-computed `row_inf_norm` from a preceding `compute_row_inf_norm` call instead of re-running the max reduce.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/mip_scaling_strategy.cu` around lines 104 - 177, compute_big_m_skip_rows currently reruns the max-abs segmented reduction into row_inf_norm and hard-codes op_problem.handle_ptr->get_stream(), causing redundant work and a stream mismatch; change the signature of compute_big_m_skip_rows to accept a raft::stream_view (or stream_view alias used elsewhere) and remove the initial max-abs cub::DeviceSegmentedReduce call so the function uses the row_inf_norm that was produced by compute_row_inf_norm; also replace all uses of op_problem.handle_ptr->get_stream() in compute_big_m_skip_rows with the new stream_view and ensure temp_storage/temp_storage_bytes and other CUB/Thrust calls use that stream_view when invoked; finally update callers to compute_big_m_skip_rows to pass the stream_view and to call compute_row_inf_norm once beforehand so row_inf_norm is not recomputed.
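Put together, the suggested refactor amounts to a signature and call sequence like the following sketch. Every name here (`problem_t`, the parameter list, the caller lines) is an assumption pieced together from the comment, not the actual declaration in the file.

```cuda
#include <rmm/cuda_stream_view.hpp>
#include <rmm/device_uvector.hpp>

// Sketch only: take the stream and the precomputed norm as parameters instead
// of hard-coding op_problem.handle_ptr->get_stream() and re-reducing.
template <typename i_t, typename f_t>
void compute_big_m_skip_rows(const problem_t<i_t, f_t>& op_problem,
                             const rmm::device_uvector<f_t>& row_inf_norm,  // reused, not recomputed
                             rmm::device_uvector<i_t>& row_skip_scaling,
                             rmm::device_uvector<std::byte>& temp_storage,
                             rmm::cuda_stream_view stream_view);

// Caller: compute the norm once, then share it with both consumers.
compute_row_inf_norm(op_problem, row_inf_norm, temp_storage, stream_view);
compute_big_m_skip_rows(op_problem, row_inf_norm, row_skip_scaling, temp_storage, stream_view);
```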
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/mip_heuristics/mip_scaling_strategy.cu`:
- Around line 327-370: You apply per-iteration scaling (iteration_scaling) to
coefficients and bounds but never record the accumulated per-row scaling
factors, so later dual unscaling can't be performed; add and update a per-row
accumulator in op_problem_scaled_ (e.g., op_problem_scaled_.row_scaling_factors
initialized to 1) and multiply it by the iteration scaling for each row each
iteration. Concretely: locate iteration_scaling, coefficient_row_index, and the
three thrust::transform calls that multiply op_problem_scaled_.coefficients,
.constraint_lower_bounds, and .constraint_upper_bounds; add a thrust::transform
to update op_problem_scaled_.row_scaling_factors by applying iteration_scaling
(use a permutation iterator with coefficient_row_index when mapping scalings
through coefficient indices to row indices for coefficients, and a direct
iterator for constraint bounds) so the per-row product is maintained for
post-solve dual unscaling.
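The accumulator update described in this prompt could be sketched as below. The member `row_scaling_factors` and the surrounding names are taken from the review text and are assumptions about the actual code, not confirmed identifiers.

```cuda
#include <thrust/fill.h>
#include <thrust/functional.h>
#include <thrust/transform.h>

// One-time init (e.g., at allocation): start every row's running factor at 1.
thrust::fill(handle_ptr_->get_thrust_policy(),
             op_problem_scaled_.row_scaling_factors.begin(),
             op_problem_scaled_.row_scaling_factors.end(),
             f_t{1});

// Each scaling iteration: fold the per-row iteration scaling into the running
// product so duals can be unscaled after the solve.
thrust::transform(handle_ptr_->get_thrust_policy(),
                  op_problem_scaled_.row_scaling_factors.begin(),
                  op_problem_scaled_.row_scaling_factors.end(),
                  iteration_scaling.begin(),
                  op_problem_scaled_.row_scaling_factors.begin(),
                  thrust::multiplies<f_t>{});
```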
---
Duplicate comments:
In `@cpp/src/mip_heuristics/mip_scaling_strategy.cu`:
- Around line 260-263: The prior uninitialized-read risk for row_skip_scaling is
fixed by explicitly zero-initializing it; ensure you keep the thrust::fill call
that uses handle_ptr_->get_thrust_policy() to fill
row_skip_scaling.begin()..end() with i_t(0) before calling
compute_big_m_skip_rows so compute_big_m_skip_rows only reads initialized values
from row_skip_scaling; confirm the allocation of iteration_scaling remains
unchanged and placed after row_skip_scaling is initialized.
---
Nitpick comments:
In `@cpp/src/mip_heuristics/mip_scaling_strategy.cu`:
- Around line 66-80: Replace the custom functors max_op_t and min_op_t with the
standard Thrust functors: remove or stop using max_op_t and min_op_t and update
all call sites to use thrust::maximum<f_t>{} and thrust::minimum<f_t>{} (or the
appropriate template type instead of f_t) so the code leverages the existing
__host__ __device__-compatible Thrust implementations accepted by
CUB::DeviceSegmentedReduce::Reduce.
- Around line 104-177: compute_big_m_skip_rows currently reruns the max-abs
segmented reduction into row_inf_norm and hard-codes
op_problem.handle_ptr->get_stream(), causing redundant work and a stream
mismatch; change the signature of compute_big_m_skip_rows to accept a
raft::stream_view (or stream_view alias used elsewhere) and remove the initial
max-abs cub::DeviceSegmentedReduce call so the function uses the row_inf_norm
that was produced by compute_row_inf_norm; also replace all uses of
op_problem.handle_ptr->get_stream() in compute_big_m_skip_rows with the new
stream_view and ensure temp_storage/temp_storage_bytes and other CUB/Thrust
calls use that stream_view when invoked; finally update callers to
compute_big_m_skip_rows to pass the stream_view and to call compute_row_inf_norm
once beforehand so row_inf_norm is not recomputed.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/mip_heuristics/mip_scaling_strategy.cu`:
- Around line 146-150: Thrust kernel launches in mip_scaling_strategy.cu (e.g.,
the thrust::transform that uses handle_ptr->get_thrust_policy() and
min_nonzero_bounds_abs_transform_t<f_t>{}) are missing explicit CUDA error
checks; after each Thrust call (all thrust::transform / thrust::copy /
thrust::fill invocations at the reported locations) add a CUDA
device-synchronization error check using the same pattern as CUB calls (e.g.,
wrap a cudaDeviceSynchronize() or thrust::detail::backend::cuda::sync call
followed by RAFT_CUDA_TRY or
CUDA_CHECK(cudaGetLastError())/CUDA_CHECK(cudaDeviceSynchronize()) ) so any
launch or runtime error is caught and propagated—apply this change to each
affected symbol/site in the file (the transform using
min_nonzero_bounds_abs_transform_t, other thrust::transform/copy/fill
invocations around lines reported) to ensure consistent error handling.
- Around line 300-309: Clamp the user-provided
row_scaling_k_min/row_scaling_k_max to a safe exponent range before calling
ldexp and before passing into the kernel: compute safe_k_min/safe_k_max by
clamping row_scaling_k_min/row_scaling_k_max to a predetermined safe exponent
window (e.g., the sensible defaults like -20..20 or the numeric limits for f_t's
exponent range), then use safe_k_min/safe_k_max when computing
min_scaling/max_scaling (instead of raw row_scaling_k_min/row_scaling_k_max),
include those safe values in the kernel lambda capture, and replace the per-row
k and required_k clamps with clamps against safe_k_min/safe_k_max so ldexp(1,k)
cannot underflow/overflow.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- cpp/src/mip_heuristics/diversity/diversity_manager.cu
- cpp/src/mip_heuristics/diversity/diversity_manager.cuh
- cpp/src/mip_heuristics/mip_scaling_strategy.cu
- cpp/src/mip_heuristics/mip_scaling_strategy.cuh
```cpp
thrust::transform(handle_ptr->get_thrust_policy(),
                  row_bounds_begin,
                  row_bounds_end,
                  row_min_nonzero_bounds.begin(),
                  min_nonzero_bounds_abs_transform_t<f_t>{});
```
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the file structure and the specific lines mentioned
wc -l cpp/src/mip_heuristics/mip_scaling_strategy.cu

Repository: NVIDIA/cuopt
Length of output: 106
🏁 Script executed:
# Read the section around lines 146-150 to see the context
sed -n '140,160p' cpp/src/mip_heuristics/mip_scaling_strategy.cu
Length of output: 1229
🏁 Script executed:
# Search for all thrust:: launches in the file and check if they are followed by RAFT_CUDA_TRY
cat cpp/src/mip_heuristics/mip_scaling_strategy.cu | grep -n "thrust::" | head -20
Length of output: 1476
🏁 Script executed:
# More detailed check: look for the pattern of thrust launches and nearby error checks
python3 <<'EOF'
import re
with open('cpp/src/mip_heuristics/mip_scaling_strategy.cu', 'r') as f:
lines = f.readlines()
# Find all thrust:: calls with context
results = []
for i, line in enumerate(lines):
if 'thrust::' in line and not line.strip().startswith('//'):
# Look at the next few lines to see if there's a RAFT_CUDA_TRY
context_start = max(0, i - 1)
context_end = min(len(lines), i + 5)
context = ''.join(lines[context_start:context_end])
# Check if RAFT_CUDA_TRY appears in the next few lines
has_check = 'RAFT_CUDA_TRY' in context
results.append({
'line': i + 1,
'code': line.strip(),
'has_check': has_check,
'context': context.strip()
})
# Print summary
print("Thrust launches found:")
print("=" * 80)
for r in results:
print(f"Line {r['line']}: {r['code'][:70]}")
print(f" Has error check: {r['has_check']}")
print()
EOF
Length of output: 4849
🏁 Script executed:
# Also check what RAFT_CUDA_TRY pattern looks like in this file
grep -n "RAFT_CUDA_TRY" cpp/src/mip_heuristics/mip_scaling_strategy.cu | head -10
Length of output: 626
🏁 Script executed:
# Check the specific ranges mentioned in the comment
echo "=== Line 146-150 ==="
sed -n '146,150p' cpp/src/mip_heuristics/mip_scaling_strategy.cu
echo ""
echo "=== Line 207-225 ==="
sed -n '207,225p' cpp/src/mip_heuristics/mip_scaling_strategy.cu
echo ""
echo "=== Line 320-321 ==="
sed -n '320,321p' cpp/src/mip_heuristics/mip_scaling_strategy.cu
echo ""
echo "=== Line 325-336 ==="
sed -n '325,336p' cpp/src/mip_heuristics/mip_scaling_strategy.cu
Length of output: 1835
🏁 Script executed:
# Check whether this file uses a specific pattern - perhaps thrust operations are wrapped at a higher level
grep -B2 -A2 "thrust::transform\|thrust::upper_bound\|thrust::count\|thrust::fill" cpp/src/mip_heuristics/mip_scaling_strategy.cu | head -50
Length of output: 2085
🏁 Script executed:
# Check if there's any wrapper or special handling in thrust_policy
find . -type f -name "*.hpp" -o -name "*.h" | xargs grep -l "thrust_policy\|get_thrust_policy" | head -5
Length of output: 222
🏁 Script executed:
# Check the RAFT handle implementation to see if get_thrust_policy includes error checking
rg "get_thrust_policy|thrust_policy" -A 3 -B 1 --max-count=10
Length of output: 50369
🏁 Script executed:
# Look for any error checking patterns after thrust operations in the broader codebase
rg "thrust::(transform|fill|upper_bound|count|transform_reduce)" -A 2 | grep -E "CUDA|cuda|error|check" | head -20
Length of output: 1439
🏁 Script executed:
# Specifically check other scaling/heuristics files in the same directory for comparison
ls -la cpp/src/mip_heuristics/
Length of output: 1531
🏁 Script executed:
# Check how other files in the same directory handle Thrust launches
find cpp/src/mip_heuristics -name "*.cu" -o -name "*.cuh" | xargs grep -l "thrust::" | head -3
Length of output: 210
🏁 Script executed:
# Sample one other file to see if this is a systemic pattern
head -250 cpp/src/mip_heuristics/mip_scaling_strategy.cu | tail -50 | rg "thrust::" -A 1
Length of output: 857
🏁 Script executed:
# More comprehensive check: do ANY thrust calls in this repo have post-call error checks?
rg "thrust::(transform|fill|upper_bound|count)" -A 1 | grep -B1 "RAFT_CUDA_TRY\|cudaPeekAtLastError" | head -20
Length of output: 38
Add explicit CUDA error checks after Thrust launches.
CUB calls are checked with RAFT_CUDA_TRY, but Thrust kernel launches lack explicit error verification. This inconsistency can hide launch and runtime failures, weakening error propagation.
🔧 Minimal fix pattern
```diff
  thrust::transform(handle_ptr->get_thrust_policy(),
                    row_bounds_begin,
                    row_bounds_end,
                    row_min_nonzero_bounds.begin(),
                    min_nonzero_bounds_abs_transform_t<f_t>{});
+ RAFT_CUDA_TRY(cudaPeekAtLastError());
```

Apply the same pattern after each Thrust launch.
Affected locations: 146–150, 207–225, 320–321, 325–336, 331–336, 352–353, 374–387, 396–402, 462–468, 470–475, 476–481.
Per coding guidelines: "Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification" and "Verify error propagation from CUDA to user-facing APIs is complete."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```cpp
thrust::transform(handle_ptr->get_thrust_policy(),
                  row_bounds_begin,
                  row_bounds_end,
                  row_min_nonzero_bounds.begin(),
                  min_nonzero_bounds_abs_transform_t<f_t>{});
RAFT_CUDA_TRY(cudaPeekAtLastError());
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/mip_scaling_strategy.cu` around lines 146 - 150,
Thrust kernel launches in mip_scaling_strategy.cu (e.g., the thrust::transform
that uses handle_ptr->get_thrust_policy() and
min_nonzero_bounds_abs_transform_t<f_t>{}) are missing explicit CUDA error
checks; after each Thrust call (all thrust::transform / thrust::copy /
thrust::fill invocations at the reported locations) add a CUDA
device-synchronization error check using the same pattern as CUB calls (e.g.,
wrap a cudaDeviceSynchronize() or thrust::detail::backend::cuda::sync call
followed by RAFT_CUDA_TRY or
CUDA_CHECK(cudaGetLastError())/CUDA_CHECK(cudaDeviceSynchronize()) ) so any
launch or runtime error is caught and propagated—apply this change to each
affected symbol/site in the file (the transform using
min_nonzero_bounds_abs_transform_t, other thrust::transform/copy/fill
invocations around lines reported) to ensure consistent error handling.
This PR adds row scaling to MIP and uses PDLP scaling for both barrier and PDLP.
Summary by CodeRabbit
New Features
Bug Fixes
Chores