Add Probing Implications and Improve Variable Order#580
Add Probing Implications and Improve Variable Order#580rapids-bot[bot] merged 25 commits intoNVIDIA:mainfrom
Conversation
📝 WalkthroughWalkthroughAdjusts probing behavior and defaults (probing cache now optional in presolve, removed from early solver path), enlarges probing cache implementation (returns bool, supports modification/substitution queues), adds variable-substitution and bounds APIs on problem_t, introduces remap flag for trivial presolve, updates diversity_config defaults, and adds a thread-safe work unit ordered queue. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
cpp/src/mip/diversity/diversity_config.hpp(1 hunks)cpp/src/mip/diversity/diversity_manager.cu(1 hunks)cpp/src/mip/local_search/rounding/constraint_prop.cu(1 hunks)cpp/src/mip/presolve/multi_probe.cu(1 hunks)cpp/src/mip/presolve/probing_cache.cu(5 hunks)cpp/src/mip/presolve/trivial_presolve.cu(1 hunks)cpp/src/mip/presolve/trivial_presolve.cuh(3 hunks)cpp/src/mip/presolve/trivial_presolve_helpers.cuh(1 hunks)cpp/src/mip/problem/problem.cu(4 hunks)cpp/src/mip/problem/problem.cuh(1 hunks)cpp/src/mip/problem/problem_helpers.cuh(3 hunks)cpp/src/mip/utilities/work_unit_ordered_queue.cuh(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-22T14:25:22.899Z
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 527
File: cpp/src/mip/diversity/lns/rins.cu:167-175
Timestamp: 2025-10-22T14:25:22.899Z
Learning: In MIP (Mixed Integer Programming) problems in the cuOPT codebase, `n_integer_vars == 0` is impossible by definition—MIP problems must have at least one integer variable. If there are no integer variables, it would be a pure Linear Programming (LP) problem, not a MIP problem.
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/problem/problem_helpers.cuh
🧬 Code graph analysis (5)
cpp/src/mip/local_search/rounding/constraint_prop.cu (1)
cpp/src/utilities/copy_helpers.hpp (4)
get_lower(107-110)get_lower(107-107)get_upper(113-116)get_upper(113-113)
cpp/src/mip/presolve/multi_probe.cu (2)
cpp/src/linear_programming/pdlp.cu (2)
compute_stats(843-867)compute_stats(843-846)cpp/src/mip/presolve/bounds_presolve.cu (2)
calculate_activity(103-113)calculate_activity(103-103)
cpp/src/mip/problem/problem.cu (1)
cpp/src/utilities/copy_helpers.hpp (16)
device_copy(247-253)device_copy(247-248)device_copy(264-270)device_copy(264-266)device_copy(281-287)device_copy(281-282)device_copy(296-313)device_copy(296-296)make_span(332-337)make_span(332-334)make_span(340-345)make_span(340-342)make_span(348-351)make_span(348-348)make_span(354-357)make_span(354-354)
cpp/src/mip/diversity/diversity_manager.cu (2)
cpp/src/mip/presolve/probing_cache.cu (2)
compute_probing_cache(708-803)compute_probing_cache(708-710)cpp/src/mip/presolve/lb_probing_cache.cu (2)
compute_probing_cache(320-406)compute_probing_cache(320-322)
cpp/src/mip/presolve/probing_cache.cu (3)
cpp/src/utilities/copy_helpers.hpp (22)
make_span(332-337)make_span(332-334)make_span(340-345)make_span(340-342)make_span(348-351)make_span(348-348)make_span(354-357)make_span(354-354)host_copy(128-135)host_copy(128-128)host_copy(146-161)host_copy(146-146)host_copy(172-175)host_copy(172-172)host_copy(185-191)host_copy(185-185)host_copy(202-208)host_copy(202-202)host_copy(219-222)host_copy(219-219)host_copy(233-236)host_copy(233-233)cpp/src/mip/problem/problem.cu (2)
view(917-954)view(917-917)cpp/src/mip/presolve/trivial_presolve.cu (2)
trivial_presolve(24-24)trivial_presolve(28-28)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.10, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.12, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.13, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.13, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.11, arm64, rockylinux8
- GitHub Check: checks / check-style
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
cpp/src/mip/diversity/diversity_manager.cu (2)
18-18: Consider makingfj_only_runa runtime configuration.The compile-time constant limits flexibility. If different problem instances might benefit from skipping probing cache during presolve, consider making this a runtime configuration parameter in
diversity_config.
188-191: Add early exit when insufficient time remains for probing.The concern is valid. Although
compute_probing_cacheinternally guards withtimer.check_time_limit()and exits immediately, it still incurs function call and parallel setup overhead. Whentime_for_probing_cacheis zero or negative (which can occur iftime_limitis small or nearly exhausted), the function would exit immediately on its first check without performing any useful work.Following the established codebase pattern (seen throughout
diverse_solver.hpp), add a guard at the call site to skip the function entirely when time budget is insufficient:if (time_for_probing_cache > 0.0) { timer_t probing_timer{time_for_probing_cache}; compute_probing_cache(ls.constraint_prop.bounds_update, *problem_ptr, probing_timer); }This avoids unnecessary overhead and aligns with how time-limited operations are handled elsewhere in the codebase.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
cpp/src/mip/diversity/diversity_config.hpp(1 hunks)cpp/src/mip/diversity/diversity_manager.cu(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- cpp/src/mip/diversity/diversity_config.hpp
🧰 Additional context used
🧬 Code graph analysis (1)
cpp/src/mip/diversity/diversity_manager.cu (2)
cpp/src/mip/presolve/probing_cache.cu (2)
compute_probing_cache(698-793)compute_probing_cache(698-700)cpp/src/mip/presolve/lb_probing_cache.cu (2)
compute_probing_cache(310-396)compute_probing_cache(310-312)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: conda-cpp-build / 12.9.1, 3.10, amd64, rockylinux8
- GitHub Check: conda-cpp-build / 13.0.2, 3.10, amd64, rockylinux8
- GitHub Check: conda-cpp-build / 12.9.1, 3.10, arm64, rockylinux8
- GitHub Check: conda-cpp-build / 13.0.2, 3.10, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.11, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.12, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.13, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.10, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-sh-client / 13.0.2, 3.10, amd64, rockylinux8
🔇 Additional comments (1)
cpp/src/mip/diversity/diversity_manager.cu (1)
191-193: The double invocation oftrivial_presolveis intentional, not redundant.The two calls serve different purposes:
- Inside
compute_probing_cache(probing_cache.cu:790):trivial_presolve(problem, true)remaps cache IDs- After
compute_probing_cachereturns (diversity_manager.cu:193):trivial_presolve(*problem_ptr)performs cleanup without remapping (remap_cache_ids defaults to false)The different parameter values indicate distinct responsibilities—the first call handles cache ID remapping as part of probing, while the second performs final presolve operations. No changes needed.
Note: The review references line 793, but the actual call is at line 790 in probing_cache.cu.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
cpp/src/mip/problem/problem.cu (2)
1120-1139: Fix lost objective mass and race during substitution updatesWe’re still dropping objective contribution when substituting: multiple substitutions targeting the same survivor race on
objective_coefficients[substituting_var_idx], and the eliminated variable’s coefficient stays non-zero, so its term survives in the objective. This is the same bug flagged earlier—the data race plus the lingering coefficient wipe out the intendedc_i * coeff * x_sub. Please atomically accumulate onto the surviving variable and zero the eliminated one.- f_t objective_offset_difference = - objective_coefficients[var_idx] * substitute_offset[idx]; - atomicAdd(objective_offset, objective_offset_difference); - objective_coefficients[substituting_var_idx] += - objective_coefficients[var_idx] * substitute_coefficient[idx]; + f_t coeff = objective_coefficients[var_idx]; + f_t objective_offset_difference = coeff * substitute_offset[idx]; + atomicAdd(objective_offset, objective_offset_difference); + atomicAdd(objective_coefficients.data() + substituting_var_idx, + coeff * substitute_coefficient[idx]); + objective_coefficients[var_idx] = 0.;
1226-1233: Do not zero the surviving variable’s objective coefficientThis line zeros
objective_coefficients[next_var_idx], i.e. the coefficient you just transferred mass onto. The surviving variable loses all objective weight, reintroducing the bug from the earlier review. Drop this assignment—the eliminated variable is already nulled out by the fix above.- variables[offset_begin + 1] = dummy_substituted_variable; - coefficients[offset_begin + 1] = 0.; - objective_coefficients[next_var_idx] = 0.; + variables[offset_begin + 1] = dummy_substituted_variable; + coefficients[offset_begin + 1] = 0.;
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
cpp/src/mip/presolve/probing_cache.cu(5 hunks)cpp/src/mip/problem/problem.cu(4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-22T14:25:22.899Z
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 527
File: cpp/src/mip/diversity/lns/rins.cu:167-175
Timestamp: 2025-10-22T14:25:22.899Z
Learning: In MIP (Mixed Integer Programming) problems in the cuOPT codebase, `n_integer_vars == 0` is impossible by definition—MIP problems must have at least one integer variable. If there are no integer variables, it would be a pure Linear Programming (LP) problem, not a MIP problem.
Applied to files:
cpp/src/mip/problem/problem.cu
🧬 Code graph analysis (2)
cpp/src/mip/problem/problem.cu (1)
cpp/src/utilities/copy_helpers.hpp (16)
device_copy(237-243)device_copy(237-238)device_copy(254-260)device_copy(254-256)device_copy(271-277)device_copy(271-272)device_copy(286-303)device_copy(286-286)make_span(322-327)make_span(322-324)make_span(330-335)make_span(330-332)make_span(338-341)make_span(338-338)make_span(344-347)make_span(344-344)
cpp/src/mip/presolve/probing_cache.cu (3)
cpp/src/utilities/copy_helpers.hpp (22)
make_span(322-327)make_span(322-324)make_span(330-335)make_span(330-332)make_span(338-341)make_span(338-338)make_span(344-347)make_span(344-344)host_copy(118-125)host_copy(118-118)host_copy(136-151)host_copy(136-136)host_copy(162-165)host_copy(162-162)host_copy(175-181)host_copy(175-175)host_copy(192-198)host_copy(192-192)host_copy(209-212)host_copy(209-209)host_copy(223-226)host_copy(223-223)cpp/src/mip/problem/problem.cu (2)
view(907-944)view(907-907)cpp/src/mip/presolve/trivial_presolve.cu (2)
trivial_presolve(14-14)trivial_presolve(18-18)
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
cpp/src/mip/presolve/probing_cache.cu (1)
686-693: Critical: std::find on device memory causes undefined behavior.
count_per_variableis adevice_uvector, so its iterators point to GPU memory. Callingstd::findon device iterators from host code is undefined behavior and will either crash or return garbage.Copy
count_per_variableto host first, then find the cutoff:auto h_priority_indices = host_copy(priority_indices); + auto h_counts = host_copy(count_per_variable); problem.handle_ptr->sync_stream(); - // Find the index of the first 0 element in h_priority_indices - auto first_zero_it = std::find(count_per_variable.begin(), count_per_variable.end(), 0); + // Find the index of the first 0 element in counts + auto first_zero_it = std::find(h_counts.begin(), h_counts.end(), 0); size_t first_zero_index = (first_zero_it != count_per_variable.end()) - ? std::distance(count_per_variable.begin(), first_zero_it) - : count_per_variable.size(); + ? std::distance(h_counts.begin(), first_zero_it) + : h_counts.size(); h_priority_indices.erase(h_priority_indices.begin() + first_zero_index, h_priority_indices.end());Note: This is the same issue flagged in the past review comment for lines 686-694, indicating it was not fully addressed.
Based on past review comment
🧹 Nitpick comments (6)
cpp/src/mip/presolve/probing_cache.cu (6)
499-500: Expensive assertions in hot path.These bound-validation assertions execute once per variable per probe in a multi-threaded loop. In debug builds this adds significant overhead; in release builds they vanish, hiding potential bugs.
Consider replacing with conditional error handling that logs and gracefully degrades:
- cuopt_assert(h_var_bounds[i].x <= lower_bound, "lower bound violation"); - cuopt_assert(h_var_bounds[i].y >= upper_bound, "upper bound violation"); + if (h_var_bounds[i].x > lower_bound || h_var_bounds[i].y < upper_bound) { + CUOPT_LOG_WARNING("Bound violation for var %d: inferred [%f,%f] vs current [%f,%f]", + i, lower_bound, upper_bound, h_var_bounds[i].x, h_var_bounds[i].y); + continue; // Skip this update + }
568-597: First-substitution-wins policy may be suboptimal.When multiple substitutions target the same variable, the code arbitrarily keeps the first one (line 578). If later substitutions have better coefficients or fewer dependencies, this heuristic might miss opportunities.
Consider documenting this policy explicitly or evaluating substitutions by a quality metric (e.g., smallest coefficient magnitude, fewest cascading dependencies).
720-722: Modification/substitution vectors accumulate across steps.The per-thread
modification_vector_poolandsubstitution_vector_poolare never cleared afterapply_modification_queue_to_problem(line 771) or between steps. While modifications are applied incrementally, the vectors continue to grow, holding stale entries.Clear the modification pool after applying, or document if accumulation across steps is intentional for replay/debugging:
{ // TODO when we have determinism, check current threads work/time counter and filter queue // items that are smaller or equal to that apply_modification_queue_to_problem(modification_vector_pool, problem); + // Clear applied modifications + for (auto& vec : modification_vector_pool) { vec.clear(); }
736-736: Hardcoded step size may not suit all problem scales.The step size is fixed at
min(2048, priority_indices.size()). For very large problems (millions of variables) this creates many sequential steps, reducing parallelism. For tiny problems it's a no-op but adds loop overhead.Consider scaling step size by available parallelism or making it configurable:
- const size_t step_size = min((size_t)2048, priority_indices.size()); + const size_t step_size = std::min(priority_indices.size(), + std::max((size_t)256, priority_indices.size() / (max_threads * 4)));
728-728: Unused compute_stats flag.
compute_statsis set totruefor each worker, but the stats are never collected, aggregated, or logged. This suggests incomplete instrumentation.Do you want me to generate code to collect and report probing statistics (e.g., infeasibility counts, bound improvements per probe, wall-clock time per variable)?
704-705: Commented-out call to old priority computation.The old
compute_prioritized_integer_indicespath is commented out. If it's deprecated, remove it; if it's an alternative heuristic, document when to use which.Remove the commented line or convert it to a configuration option if both heuristics have merit.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cpp/src/mip/presolve/probing_cache.cu(5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
cpp/src/mip/presolve/probing_cache.cu (3)
cpp/src/utilities/copy_helpers.hpp (22)
make_span(322-327)make_span(322-324)make_span(330-335)make_span(330-332)make_span(338-341)make_span(338-338)make_span(344-347)make_span(344-344)host_copy(118-125)host_copy(118-118)host_copy(136-151)host_copy(136-136)host_copy(162-165)host_copy(162-162)host_copy(175-181)host_copy(175-175)host_copy(192-198)host_copy(192-192)host_copy(209-212)host_copy(209-209)host_copy(223-226)host_copy(223-223)cpp/src/mip/problem/problem.cu (2)
view(907-944)view(907-907)cpp/src/mip/presolve/trivial_presolve.cu (2)
trivial_presolve(14-14)trivial_presolve(18-18)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.12, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.12, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.10, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.11, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.13, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.10, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.11, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.13, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-sh-client / 13.0.2, 3.10, amd64, rockylinux8
🔇 Additional comments (5)
cpp/src/mip/presolve/probing_cache.cu (5)
9-9: LGTM!The new includes support the refactored probing workflow.
trivial_presolve.cuhis invoked at line 790, andwork_unit_ordered_queue.cuhaligns with the new per-thread work-unit processing described in the PR objectives.Also applies to: 13-13
534-566: LGTM!The modification queue correctly merges per-thread updates by taking the intersection of bounds (max of lower, min of upper), ensuring conservative tightening when multiple threads propose conflicting modifications for the same variable.
381-382: ****The codebase does not contain a
problem.is_binary_variablefield, and the current implementation is semantically correct. Theis_binarycheck at line 382 appropriately distinguishes between variables with exact [0,1] bounds versus any variable with bound width of 1. This distinction is intentional and properly explained by the comment—the variable substitution logic at line 513 requires specifically binary-constrained variables [0,1], not just any unit-interval variable. The comment already clarifies this nuance.Likely an incorrect or invalid review comment.
771-771: The review comment is incorrect and based on a false premise.Modifications in
apply_modification_queue_to_problemonly update variable bounds viaproblem.update_variable_bounds()— they do not eliminate variables or change variable indices. The tuple structure<time, var_idx, lb, ub>confirms modifications only tighten bounds.Since modifications cannot eliminate variables, substitutions applied later (line 785) safely reference the indices collected during the loop. The indices remain valid. The asymmetry between modifications (in-loop) and substitutions (after-loop) poses no risk of stale references.
trivial_presolvewithremap_cache_ids=truehandles any subsequent variable elimination and remapping after substitutions are complete.Likely an incorrect or invalid review comment.
646-670: The review comment is incorrect. The code properly handles CUB's runtime buffer sizing.CUB's temp_storage_bytes requirement depends on the operator, data type, segment layout, number of segments and device configuration, with no closed-form formula. The two reductions operate on fundamentally different segment structures:
- Reduction 1:
n_constraintssegments usingproblem.offsets- Reduction 2:
n_variablessegments usingproblem.reverse_offsetsSince segment count and layout differ, the buffer requirements are independent and can differ. The resize at line 657 is necessary, not redundant. The code correctly implements CUB's pattern: call with d_temp_storage == nullptr to query the requirement, allocate, then call again with the allocated storage.
Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (3)
cpp/src/mip/presolve/probing_cache.cu (2)
439-441: Replace assertion with proper infeasibility propagation.The assertion requires at least one probe to be feasible, but both probes can legitimately fail when the variable's domain is genuinely infeasible due to conflicting constraints discovered during probing. This assertion will crash in debug builds or be silently ignored in release builds, masking real infeasibility.
Replace the assertion with proper infeasibility detection and propagation:
- cuopt_assert(multi_probe_presolve.infeas_constraints_count_0 == 0 || - multi_probe_presolve.infeas_constraints_count_1 == 0, - "At least one of the probes must be feasible"); + if (multi_probe_presolve.infeas_constraints_count_0 > 0 && + multi_probe_presolve.infeas_constraints_count_1 > 0) { + CUOPT_LOG_WARNING("Both probes infeasible for var %d - variable domain is infeasible", var_idx); + // Signal global infeasibility or propagate to caller + return; + }Based on past review comments.
658-671: Remove redundant segmented reduce call.Lines 647-656 and 662-671 call
cub::DeviceSegmentedReduce::Reducewith identical parameters, both writing tocount_per_variable. The second call simply overwrites the result of the first, making lines 647-656 redundant.Remove the redundant first reduce call:
cub::DeviceSegmentedReduce::Reduce(d_temp_storage, temp_storage_bytes, input_transform_it_2, count_per_variable.data(), problem.n_variables, problem.reverse_offsets.data(), problem.reverse_offsets.data() + 1, cuda::std::plus<>{}, 0, problem.handle_ptr->get_stream()); - - temp_storage.resize(temp_storage_bytes, problem.handle_ptr->get_stream()); - d_temp_storage = thrust::raw_pointer_cast(temp_storage.data()); - - // Run reduction - cub::DeviceSegmentedReduce::Reduce(d_temp_storage, - temp_storage_bytes, - input_transform_it_2, - count_per_variable.data(), - problem.n_variables, - problem.reverse_offsets.data(), - problem.reverse_offsets.data() + 1, - cuda::std::plus<>{}, - 0, - problem.handle_ptr->get_stream());cpp/src/mip/problem/problem.cu (1)
1119-1138: Zero out eliminated variables' objective coefficients.The kernel correctly transfers the scaled objective coefficient to the surviving variable using
atomicAdd(lines 1136-1137). However, the eliminated variable's objective coefficient (objective_coefficients[var_idx]) is never explicitly zeroed unless it creates duplicates (see line 1232). If an eliminated variable doesn't produce duplicates in any constraint, its coefficient remains non-zero and will incorrectly affect the objective value.Add this kernel after line 1138 to zero out all eliminated variables' objective coefficients:
+ thrust::for_each(handle_ptr->get_thrust_policy(), + d_var_indices.begin(), + d_var_indices.end(), + [objective_coefficients = make_span(objective_coefficients)] __device__(i_t var_idx) { + objective_coefficients[var_idx] = 0.; + });Based on past review comments.
🧹 Nitpick comments (1)
cpp/src/mip/problem/problem.cu (1)
1856-1857: Clarify assertion messages for bound tightening.The assertion logic is correct (verifying that new bounds tighten existing bounds), but the error messages could be clearer about what constitutes a violation.
Consider updating the messages to be more descriptive:
- cuopt_assert(variable_bounds[var_idx].x <= lb_values[i], "variable lower bound violation"); - cuopt_assert(variable_bounds[var_idx].y >= ub_values[i], "variable upper bound violation"); + cuopt_assert(variable_bounds[var_idx].x <= lb_values[i], + "New lower bound must be >= existing lower bound (bound tightening)"); + cuopt_assert(variable_bounds[var_idx].y >= ub_values[i], + "New upper bound must be <= existing upper bound (bound tightening)");
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
cpp/src/mip/presolve/probing_cache.cu(5 hunks)cpp/src/mip/problem/problem.cu(4 hunks)python/libcuopt/pyproject.toml(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-22T14:25:22.899Z
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 527
File: cpp/src/mip/diversity/lns/rins.cu:167-175
Timestamp: 2025-10-22T14:25:22.899Z
Learning: In MIP (Mixed Integer Programming) problems in the cuOPT codebase, `n_integer_vars == 0` is impossible by definition—MIP problems must have at least one integer variable. If there are no integer variables, it would be a pure Linear Programming (LP) problem, not a MIP problem.
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
🧬 Code graph analysis (2)
cpp/src/mip/problem/problem.cu (1)
cpp/src/utilities/copy_helpers.hpp (16)
device_copy(237-243)device_copy(237-238)device_copy(254-260)device_copy(254-256)device_copy(271-277)device_copy(271-272)device_copy(286-303)device_copy(286-286)make_span(322-327)make_span(322-324)make_span(330-335)make_span(330-332)make_span(338-341)make_span(338-338)make_span(344-347)make_span(344-344)
cpp/src/mip/presolve/probing_cache.cu (3)
cpp/src/utilities/copy_helpers.hpp (22)
make_span(322-327)make_span(322-324)make_span(330-335)make_span(330-332)make_span(338-341)make_span(338-338)make_span(344-347)make_span(344-344)host_copy(118-125)host_copy(118-118)host_copy(136-151)host_copy(136-136)host_copy(162-165)host_copy(162-162)host_copy(175-181)host_copy(175-175)host_copy(192-198)host_copy(192-192)host_copy(209-212)host_copy(209-209)host_copy(223-226)host_copy(223-223)cpp/src/mip/problem/problem.cu (2)
view(907-944)view(907-907)cpp/src/mip/presolve/trivial_presolve.cu (2)
trivial_presolve(14-14)trivial_presolve(18-18)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: wheel-build-cuopt-sh-client / 13.0.2, 3.10, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.12, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.13, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.10, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.13, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.11, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.11, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.12, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.10, arm64, rockylinux8
🔇 Additional comments (6)
python/libcuopt/pyproject.toml (1)
57-57: Verify that the updated package size threshold was validated during testing.The
max_allowed_size_compressedthreshold increased by 10MB (1.7%) to accommodate the new features (probing implications, variable substitution, work-queue utilities). This appears proportionate given the substantial backend additions described in the PR.Ensure the package was built and its compressed size verified to confirm it fits within 605M during your local testing.
cpp/src/mip/problem/problem.cu (2)
1701-1701: Clarify the commented-out mutex lock.The commented-out
std::lock_guardraises questions about thread safety inget_host_user_problem. If concurrent access is possible and thread safety is required, the mutex should be uncommented. If the mutex is unnecessary, the comment should be removed to avoid confusion.Could you clarify whether:
- Thread safety is needed for this method?
- If yes, why is the mutex commented out?
- If no, can the comment be removed?
1328-1332: LGTM - Clean CSR sorting wrapper.The function provides a clean interface to sort constraint rows by variable indices using cuSPARSE.
cpp/src/mip/presolve/probing_cache.cu (3)
535-567: LGTM - Correct bound tightening logic.The function correctly merges bound modifications by taking the maximum lower bound and minimum upper bound for each variable, which is commutative and ensures determinism regardless of processing order.
569-598: LGTM - Proper substitution deduplication.The function correctly handles potential duplicate substitutions by taking only the first substitution for each variable (line 579), preventing conflicting substitutions. The determinism depends on the queue order being deterministic, which appears to be ensured by timestamp ordering.
689-697: LGTM - Correctly implemented priority computation and early exit.The code correctly:
- Uses
thrust::lower_boundwiththrust::greatercomparator to find the first zero in the descending priority sequence (lines 689-697)- Ensures a non-zero early-exit threshold using
std::max(2, ...)to prevent premature termination on small problems (line 783)Both improvements address issues from past reviews.
Also applies to: 782-784
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (2)
cpp/src/mip/presolve/probing_cache.cu (2)
439-441: Both probes can be infeasible; replace assert with handlingThe assert can trip on genuinely infeasible domains. Handle and propagate infeasibility instead of asserting. This was flagged previously.
- cuopt_assert(multi_probe_presolve.infeas_constraints_count_0 == 0 || - multi_probe_presolve.infeas_constraints_count_1 == 0, - "At least one of the probes must be feasible"); + if (multi_probe_presolve.infeas_constraints_count_0 > 0 && + multi_probe_presolve.infeas_constraints_count_1 > 0) { + CUOPT_LOG_WARNING("Both probes infeasible for var %d; domain infeasible", var_idx); + // Option 1: enqueue global infeasibility signal and return + return; + }
538-571: Bound-merge can produce inverted intervals; guard before applyingMerging via max(lb), min(ub) may yield lb>ub. Validate and either drop the conflicting update or clamp with tolerance before calling
update_variable_bounds.for (const auto& [var_idx, modifications] : var_bounds_modifications) { - var_indices.push_back(var_idx); - lb_values.push_back(modifications.first); - ub_values.push_back(modifications.second); + auto lb = modifications.first; + auto ub = modifications.second; + if (lb <= ub) { + var_indices.push_back(var_idx); + lb_values.push_back(lb); + ub_values.push_back(ub); + } else { + CUOPT_LOG_WARNING("Skipping conflicting bounds for var %d: [%f, %f]", var_idx, lb, ub); + } }
🧹 Nitpick comments (2)
cpp/src/mip/problem/problem.cu (1)
1088-1101: Simplify template and call-site for are_exclusive
are_exclusivedoesn’t usef_t. Drop the extra template parameter and update the call to avoid confusion.-template <typename i_t, typename f_t> -bool are_exclusive(const std::vector<i_t>& var_indices, - const std::vector<i_t>& var_to_substitude_indices) +template <typename i_t> +bool are_exclusive(const std::vector<i_t>& var_indices, + const std::vector<i_t>& var_to_substitude_indices)And:
- cuopt_assert((are_exclusive<i_t, f_t>(var_indices, var_to_substitude_indices)), + cuopt_assert((are_exclusive<i_t>(var_indices, var_to_substitude_indices)),cpp/src/mip/presolve/probing_cache.cu (1)
692-701: Trim priority list by zero counts using host data for simplicityCurrent device-side lower_bound is more complex than needed and mixes device iterators with host erase. Copy counts to host and trim by the first zero position to avoid policy/iterator pitfalls.
- // Find the index of the first 0 element in count_per_variable - auto first_zero_it = thrust::lower_bound(problem.handle_ptr->get_thrust_policy(), - count_per_variable.begin(), - count_per_variable.end(), - 0, - thrust::greater<i_t>()); - size_t first_zero_index = (first_zero_it != count_per_variable.end()) - ? thrust::distance(count_per_variable.begin(), first_zero_it) - : count_per_variable.size(); + auto h_counts = host_copy(count_per_variable); + auto first_zero_it = std::find(h_counts.begin(), h_counts.end(), 0); + size_t first_zero_index = (first_zero_it != h_counts.end()) + ? std::distance(h_counts.begin(), first_zero_it) + : h_counts.size();
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
cpp/src/mip/presolve/probing_cache.cu(5 hunks)cpp/src/mip/problem/problem.cu(4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-22T14:25:22.899Z
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 527
File: cpp/src/mip/diversity/lns/rins.cu:167-175
Timestamp: 2025-10-22T14:25:22.899Z
Learning: In MIP (Mixed Integer Programming) problems in the cuOPT codebase, `n_integer_vars == 0` is impossible by definition—MIP problems must have at least one integer variable. If there are no integer variables, it would be a pure Linear Programming (LP) problem, not a MIP problem.
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
🧬 Code graph analysis (2)
cpp/src/mip/problem/problem.cu (1)
cpp/src/utilities/copy_helpers.hpp (16)
device_copy(237-243)device_copy(237-238)device_copy(254-260)device_copy(254-256)device_copy(271-277)device_copy(271-272)device_copy(286-303)device_copy(286-286)make_span(322-327)make_span(322-324)make_span(330-335)make_span(330-332)make_span(338-341)make_span(338-338)make_span(344-347)make_span(344-344)
cpp/src/mip/presolve/probing_cache.cu (3)
cpp/src/utilities/copy_helpers.hpp (22)
make_span(322-327)make_span(322-324)make_span(330-335)make_span(330-332)make_span(338-341)make_span(338-338)make_span(344-347)make_span(344-344)host_copy(118-125)host_copy(118-118)host_copy(136-151)host_copy(136-136)host_copy(162-165)host_copy(162-162)host_copy(175-181)host_copy(175-175)host_copy(192-198)host_copy(192-192)host_copy(209-212)host_copy(209-209)host_copy(223-226)host_copy(223-223)cpp/src/mip/problem/problem.cu (4)
integer_equal(749-752)integer_equal(749-749)view(907-944)view(907-907)cpp/src/mip/presolve/trivial_presolve.cu (2)
trivial_presolve(14-14)trivial_presolve(18-18)
🔇 Additional comments (2)
cpp/src/mip/problem/problem.cu (1)
1135-1156: Objective mass transfer: LGTMAtomic accumulation onto the surviving variable and the objective offset fix the earlier loss-of-mass bug under parallel substitution.
If double precision may run on pre-sm_60 GPUs, confirm
atomicAdd(double*, ...)availability or use a fallback (e.g., RAFT/CUB atomics).cpp/src/mip/presolve/probing_cache.cu (1)
792-798: Good integration: apply substitutions then remap via trivial presolveApplying substitutions before
trivial_presolve(problem, remap_cache_ids)aligns with the new workflow and should maximize reductions.
|
🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you! If this is an "epic" issue, then please add the "epic" label to this issue. |
|
@akifcorduk should this PR go into the 25.12 release? |
|
Moving base branch to main as substitution logic is a bit more involved. |
|
🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you! If this is an "epic" issue, then please add the "epic" label to this issue. |
1 similar comment
|
🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you! If this is an "epic" issue, then please add the "epic" label to this issue. |
|
🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you! If this is an "epic" issue, then please add the "epic" label to this issue. |
2 similar comments
|
🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you! If this is an "epic" issue, then please add the "epic" label to this issue. |
|
🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you! If this is an "epic" issue, then please add the "epic" label to this issue. |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In @cpp/src/mip/presolve/trivial_presolve.cuh:
- Around line 184-197: The code uses std::fill on pb.reverse_original_ids
without ensuring it has the expected size; explicitly resize
pb.reverse_original_ids to pb.n_variables before the std::fill call so the fill
and subsequent writes in the for loop are safe; update the block around
remap_cache_ids to call pb.reverse_original_ids.resize(pb.n_variables)
immediately before std::fill, then proceed with std::fill(...) and the existing
loop that sets pb.reverse_original_ids[pb.original_ids[i]] = i.
In @cpp/src/mip/problem/problem_helpers.cuh:
- Around line 358-396: The code creates cuSPARSE resources (cusparseCreate,
cusparseSetStream, cusparseCreateMatDescr, cusparseSetMatIndexBase,
cusparseSetMatType) without checking return values so exceptions from later
calls (check_cusparse_status around
cusparseXcsrsort/cusparseXcsrsort_bufferSizeExt) can leak the handle/descriptor;
fix by wrapping resource acquisition in RAII or by checking each creation call
with check_cusparse_status (or equivalent) immediately after cusparseCreate and
cusparseCreateMatDescr and on the set calls, and ensure you call
cusparseDestroy(handle) and cusparseDestroyMatDescr(matA) in all error paths
(use try/catch or a scope guard) so resources are always freed if an exception
is thrown.
In @cpp/src/mip/problem/problem.cu:
- Around line 1264-1269: The code reads var_indices[0] into
dummy_substituted_variable before checking sizes, causing undefined behavior if
var_indices is empty; move the assignment of dummy_substituted_variable to after
the size assertions (or add an explicit cuopt_assert(!var_indices.empty(),
"...") before accessing index 0) and retain the existing cuopt_assert checks
comparing var_indices.size() to var_to_substitude_indices.size(),
offset_values.size(), and coefficient_values.size(); ensure are_exclusive<i_t,
f_t>(var_indices, var_to_substitude_indices) is still validated either before or
after the non-empty check as appropriate.
🧹 Nitpick comments (2)
cpp/src/mip/problem/problem_helpers.cuh (1)
385-390: Consider using in-place gather to avoid extra allocation.The current implementation allocates a temporary
values_sortedbuffer, gathers into it, then copies back. This could be simplified.♻️ Alternative using swap
// apply the permutation to the values rmm::device_uvector<f_t> values_sorted(nnz, stream); thrust::gather( handle_ptr->get_thrust_policy(), P.begin(), P.end(), values.begin(), values_sorted.begin()); - thrust::copy( - handle_ptr->get_thrust_policy(), values_sorted.begin(), values_sorted.end(), values.begin()); + values = std::move(values_sorted); // Swap instead of copycpp/src/mip/presolve/probing_cache.cu (1)
542-574: Minor non-determinism in iteration order.While the comment states this should be deterministic,
std::unordered_mapiteration order is implementation-defined. The final bounds values are deterministic (max lb, min ub), but the order in which variables are processed might vary. This shouldn't affect correctness but may affect debug reproducibility.🔧 Optional: Use ordered map or sort output for deterministic processing order
- std::unordered_map<i_t, std::pair<f_t, f_t>> var_bounds_modifications; + std::map<i_t, std::pair<f_t, f_t>> var_bounds_modifications;
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
cpp/CMakeLists.txtcpp/src/mip/diversity/diversity_manager.cucpp/src/mip/presolve/probing_cache.cucpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/presolve/trivial_presolve.cuhcpp/src/mip/problem/presolve_data.cuhcpp/src/mip/problem/problem.cucpp/src/mip/problem/problem.cuhcpp/src/mip/problem/problem_helpers.cuh
🚧 Files skipped from review as they are similar to previous changes (2)
- cpp/src/mip/diversity/diversity_manager.cu
- cpp/src/mip/problem/problem.cuh
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{cu,cuh}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cu,cuh}: Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification
Avoid reinventing functionality already available in Thrust, CCCL, or RMM libraries; prefer standard library utilities over custom implementations
Files:
cpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/presolve/trivial_presolve.cuhcpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem_helpers.cuhcpp/src/mip/problem/problem.cu
**/*.{cu,cuh,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...
Files:
cpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/presolve/trivial_presolve.cuhcpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem_helpers.cuhcpp/src/mip/problem/problem.cu
**/*.cu
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.cu: Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations
Detect inefficient GPU kernel launches with low occupancy or poor memory access patterns; optimize for coalesced memory access and minimize warp divergence in hot paths
Files:
cpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem.cu
**/*.{cu,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Files:
cpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem.cu
🧠 Learnings (28)
📓 Common learnings
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions
Applied to files:
cpp/CMakeLists.txtcpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem_helpers.cuhcpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes
Applied to files:
cpp/CMakeLists.txtcpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/problem/presolve_data.cuhcpp/src/mip/problem/problem_helpers.cuhcpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
Applied to files:
cpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/presolve/trivial_presolve.cuhcpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem_helpers.cuhcpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Applied to files:
cpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/problem/presolve_data.cuhcpp/src/mip/problem/problem_helpers.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Applied to files:
cpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/problem/presolve_data.cuhcpp/src/mip/problem/problem_helpers.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.cu : Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations
Applied to files:
cpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/problem/presolve_data.cuhcpp/src/mip/problem/problem_helpers.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Applied to files:
cpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem_helpers.cuhcpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Applied to files:
cpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem_helpers.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Ensure test isolation: prevent GPU state, cached memory, and global variables from leaking between test cases; verify each test independently initializes its environment
Applied to files:
cpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/problem/presolve_data.cuhcpp/src/mip/problem/problem_helpers.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Applied to files:
cpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/problem/problem_helpers.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cpp,hpp,h} : Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Applied to files:
cpp/src/mip/presolve/probing_cache.cuhcpp/src/mip/problem/presolve_data.cuhcpp/src/mip/problem/problem_helpers.cuhcpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Applied to files:
cpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem_helpers.cuhcpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Applied to files:
cpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem_helpers.cuhcpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for problem transformations: verify correctness of original→transformed→postsolve mappings and index consistency across problem representations
Applied to files:
cpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Applied to files:
cpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem_helpers.cuhcpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh} : Avoid reinventing functionality already available in Thrust, CCCL, or RMM libraries; prefer standard library utilities over custom implementations
Applied to files:
cpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem_helpers.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.cu : Detect inefficient GPU kernel launches with low occupancy or poor memory access patterns; optimize for coalesced memory access and minimize warp divergence in hot paths
Applied to files:
cpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem_helpers.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Test with free variables, singleton problems, and extreme problem dimensions near resource limits to validate edge case handling
Applied to files:
cpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Applied to files:
cpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem_helpers.cuhcpp/src/mip/problem/problem.cu
📚 Learning: 2025-10-22T14:25:22.899Z
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 527
File: cpp/src/mip/diversity/lns/rins.cu:167-175
Timestamp: 2025-10-22T14:25:22.899Z
Learning: In MIP (Mixed Integer Programming) problems in the cuOPT codebase, `n_integer_vars == 0` is impossible by definition—MIP problems must have at least one integer variable. If there are no integer variables, it would be a pure Linear Programming (LP) problem, not a MIP problem.
Applied to files:
cpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Applied to files:
cpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Applied to files:
cpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Write tests validating numerical correctness of optimization results (not just 'runs without error'); test degenerate cases (infeasible, unbounded, empty, singleton problems)
Applied to files:
cpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover
Applied to files:
cpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem.cu
📚 Learning: 2025-12-04T04:11:12.640Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 500
File: cpp/src/dual_simplex/scaling.cpp:68-76
Timestamp: 2025-12-04T04:11:12.640Z
Learning: In the cuOPT dual simplex solver, CSR/CSC matrices (including the quadratic objective matrix Q) are required to have valid dimensions and indices by construction. Runtime bounds checking in performance-critical paths like matrix scaling is avoided to prevent slowdowns. Validation is performed via debug-only check_matrix() calls wrapped in #ifdef CHECK_MATRIX.
Applied to files:
cpp/src/mip/problem/problem_helpers.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh} : Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification
Applied to files:
cpp/src/mip/problem/problem_helpers.cuh
📚 Learning: 2025-12-03T23:29:26.391Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 500
File: cpp/src/dual_simplex/sparse_matrix.cpp:519-524
Timestamp: 2025-12-03T23:29:26.391Z
Learning: In cpp/src/dual_simplex/sparse_matrix.cpp, the check_matrix() function is debug/diagnostic code (wrapped in #ifdef CHECK_MATRIX) that intentionally prints errors without necessarily returning early. The return codes from this debug code are not actively checked; the purpose is to print all validation errors in one pass for better diagnostics.
Applied to files:
cpp/src/mip/problem/problem_helpers.cuh
🧬 Code graph analysis (1)
cpp/src/mip/problem/problem.cu (1)
cpp/src/utilities/copy_helpers.hpp (8)
device_copy(207-213)device_copy(207-208)device_copy(224-230)device_copy(224-226)device_copy(241-247)device_copy(241-242)device_copy(256-273)device_copy(256-256)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.12, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.13, arm64, rockylinux8
- GitHub Check: checks / check-style
🔇 Additional comments (21)
cpp/CMakeLists.txt (1)
21-27: LGTM - Environment variable override for CUDA architectures.This provides useful flexibility for developers to specify custom GPU architectures. The conditional logic and fallback to
rapids_cuda_init_architecturesis correct.Consider adding a note in documentation about valid architecture string formats (e.g.,
"70;80;90"or"70-real;80-real") to help users avoid build errors from invalid values.cpp/src/mip/presolve/probing_cache.cuh (1)
117-120: LGTM - Updated function signature for presolve integration.The signature change appropriately:
- Returns
boolto indicate feasibility/success status- Accepts
bound_presolve_t&for applying bound modifications- Includes
timer_tfor time-bounded operationsThis aligns well with the PR's objective of applying probing implications during presolve rather than only during solve.
cpp/src/mip/problem/problem_helpers.cuh (1)
317-322: LGTM - cuSPARSE error helper.Good error handling wrapper that converts cuSPARSE errors to exceptions with meaningful error messages.
cpp/src/mip/presolve/trivial_presolve.cuh (1)
357-362: LGTM - Newremap_cache_idsparameter propagated correctly.The parameter is correctly forwarded from
trivial_presolvetoupdate_from_csr. The default value offalsemaintains backward compatibility with existing call sites.cpp/src/mip/problem/presolve_data.cuh (3)
22-29: LGTM - Well-structured substitution record.The
substitution_tstruct cleanly captures variable substitution relationships with timestamp for ordering during postsolve. The formulax_substituted = offset + coefficient * x_substitutingis clearly documented.Consider adding a brief comment indicating which problem context the indices (
substituting_var,substituted_var) refer to (e.g., original problem indices vs. presolved indices) to help maintainers verify index mapping consistency across problem transformations. Based on learnings about ensuring correct problem context access.
54-55: LGTM - Copy constructor correctly propagates substitution data.The
variable_substitutionsvector is correctly copied in the copy constructor, ensuring substitution data is preserved whenpresolve_data_tis copied across streams or problem instances.
92-95: LGTM - Clear documentation for substitution storage.The comment clearly explains the substitution formula and when it's applied. Using
std::vectorfor host-side postsolve data is appropriate.cpp/src/mip/problem/problem.cu (5)
826-840: LGTM! Post-process substitution logic is correct.The linear substitution formula
x[substituted] = offset + coefficient * x[substituting]properly reverses the presolve substitutions. The bounds assertions ensure index safety.
1239-1251: LGTM! Exclusive check implementation is correct.The algorithm correctly sorts copies and uses
std::set_intersectionto detect overlap. O(n log n) complexity is acceptable.
1379-1405: LGTM! Duplicate detection logic is correct.The algorithm correctly handles chains of consecutive duplicates by accumulating coefficients at
duplicate_start_idxand marking subsequent entries for elimination.
1503-1507: LGTM!Clean delegation to the existing CSR sorting utility.
2024-2052: LGTM! Bounds update logic with validation is correct.The assertions correctly validate that bounds are being tightened (new lower bound ≥ old, new upper bound ≤ old), preventing incorrect bound relaxation.
cpp/src/mip/presolve/probing_cache.cu (9)
9-17: LGTM!Appropriate includes for the new probing functionality.
497-538: LGTM! Global bounds inference and substitution detection are mathematically sound.The logic correctly:
- Infers global bounds by taking
minof lower bounds andmaxof upper bounds from both probes- Detects substitution opportunities when a binary variable's probes imply different singleton values for another variable, using the correct linear formula
x_i = l_0 + (l_1 - l_0) * x_var_idx
576-611: LGTM! Cycle removal logic is correct.The algorithm deterministically breaks bidirectional edges by keeping only the edge where
substituting_var < substituted_var, ensuring a consistent DAG structure.
613-643: LGTM! DFS substitution merging is mathematically correct.The composition formula correctly chains substitutions:
child = C_offset + C_coeff * (P_offset + P_coeff * Root)= (C_offset + C_coeff * P_offset) + (C_coeff * P_coeff) * RootConsider that deep substitution chains could cause stack overflow in the recursive implementation, though this is unlikely in practice for typical MIP problems.
If substitution chains can be arbitrarily deep in pathological cases, consider converting to an iterative implementation with an explicit stack.
645-677: LGTM! Substitution graph flattening is well-designed.The algorithm correctly:
- Removes cycles via
sanitize_graph- Identifies root nodes (substituting vars that aren't substituted)
- Flattens chains via DFS so all substitutions point directly to roots
679-736: LGTM! Substitution application logic is correct.The function properly:
- Merges and flattens substitutions from all threads
- Converts current problem indices to original indices for post-processing
- Stores substitutions for postsolve recovery
- Applies substitutions to the problem
738-840: LGTM! Priority computation algorithm is efficient and correct.The two-pass reduction efficiently computes how many integer variables each variable shares constraints with (O(nnz) complexity). The
thrust::lower_boundwithgreater<>correctly finds the first zero in the descending-sorted array.
940-941: LGTM! Return value now propagates infeasibility detection.The function correctly returns
problem_is_infeasible.load()to allow callers to detect and handle problem infeasibility discovered during probing.
914-931: Move#pragma omp singleinside the#pragma omp parallelblock.The
#pragma omp singleat line 915 is positioned outside the#pragma omp parallel { }block (which closes at line 914), making it an implicit parallel region rather than a synchronization point within the existing parallel region. To properly synchronize modification application after the parallel for loop completes, move the#pragma omp singleblock inside the parallel region, before its closing brace.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @cpp/src/mip/diversity/diversity_manager.cu:
- Around line 199-200: The call to trivial_presolve currently omits
remap_cache_ids after compute_probing_cache, causing index mapping mismatch;
change the invocation of trivial_presolve on the problem (the call at
trivial_presolve(*problem_ptr)) to pass remap_cache_ids=true (i.e., call
trivial_presolve(*problem_ptr, /*remap_cache_ids=*/true)) so it matches the
pattern used after compute_probing_cache and preserves probing cache ID
remapping; update the invocation near compute_probing_cache to include the remap
flag.
🧹 Nitpick comments (3)
cpp/src/mip/utilities/work_unit_ordered_queue.cuh (3)
70-74: Consider adding a combinedtry_popfor safe concurrent usage.The current API allows a TOCTOU race: thread A calls
empty()→ returns false, thread B pops the last item, thread A callspop()ortop()→ undefined behavior. This is a common pitfall with thread-safe queues.Consider adding a combined atomic operation:
♻️ Suggested addition
/** * @brief Atomically pop and return the front entry if non-empty. * * @return std::optional<entry_t> The front entry, or std::nullopt if empty. */ std::optional<entry_t> try_pop() { std::lock_guard<std::mutex> lock(mutex_); if (queue_.empty()) { return std::nullopt; } entry_t result = std::move(const_cast<entry_t&>(queue_.top())); queue_.pop(); return result; }This requires adding
#include <optional>. Alternatively, document the expected usage pattern in the class docstring to warn users about the TOCTOU hazard.Also applies to: 81-95
125-130: Consider usingstd::greaterinstead of custom comparator.The custom
min_heap_cmpcan be replaced withstd::greater<entry_t>, which lexicographically compares pairs (first by timestamp, then byTif timestamps are equal). This aligns with the guideline to prefer standard library utilities. Based on learnings, prefer standard library utilities over custom implementations.♻️ Suggested simplification
- private: - // Custom comparator for min-heap based on timestamp - struct min_heap_cmp { - bool operator()(const entry_t& a, const entry_t& b) const - { - return a.first > b.first; // earlier time has higher priority - } - }; - - std::priority_queue<entry_t, std::vector<entry_t>, min_heap_cmp> queue_; + private: + std::priority_queue<entry_t, std::vector<entry_t>, std::greater<entry_t>> queue_;Note: This changes tie-breaking behavior when timestamps are equal (will compare
Tvalues). IfTdoesn't support comparison or if timestamp-only ordering is required, keep the custom comparator.
35-36: File extension suggests CUDA but contains no CUDA code.The
.cuhextension typically indicates a CUDA header with device code. This file is pure C++ (mutex, std::priority_queue). Consider renaming to.hppif project conventions allow, to clarify that no CUDA compilation is required for includers.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
cpp/src/mip/diversity/diversity_config.hppcpp/src/mip/diversity/diversity_manager.cucpp/src/mip/presolve/multi_probe.cucpp/src/mip/presolve/trivial_presolve.cucpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/problem/problem.cuhcpp/src/mip/utilities/work_unit_ordered_queue.cuh
🚧 Files skipped from review as they are similar to previous changes (3)
- cpp/src/mip/presolve/multi_probe.cu
- cpp/src/mip/problem/problem.cuh
- cpp/src/mip/presolve/trivial_presolve.cu
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{cu,cuh}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cu,cuh}: Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification
Avoid reinventing functionality already available in Thrust, CCCL, or RMM libraries; prefer standard library utilities over custom implementations
Files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/utilities/work_unit_ordered_queue.cuhcpp/src/mip/diversity/diversity_manager.cu
**/*.{cu,cuh,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...
Files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/utilities/work_unit_ordered_queue.cuhcpp/src/mip/diversity/diversity_manager.cucpp/src/mip/diversity/diversity_config.hpp
**/*.cu
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.cu: Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations
Detect inefficient GPU kernel launches with low occupancy or poor memory access patterns; optimize for coalesced memory access and minimize warp divergence in hot paths
Files:
cpp/src/mip/diversity/diversity_manager.cu
**/*.{cu,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/diversity/diversity_config.hpp
**/*.{h,hpp,py}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Verify C API does not break ABI stability (no struct layout changes, field reordering); maintain backward compatibility in Python and server APIs with deprecation warnings
Files:
cpp/src/mip/diversity/diversity_config.hpp
**/*.{cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cpp,hpp,h}: Check for unclosed file handles when reading MPS/QPS problem files; ensure RAII patterns or proper cleanup in exception paths
Validate input sanitization to prevent buffer overflows and resource exhaustion attacks; avoid unsafe deserialization of problem files
Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state
Files:
cpp/src/mip/diversity/diversity_config.hpp
🧠 Learnings (20)
📓 Common learnings
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/diversity/diversity_manager.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/diversity/diversity_manager.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/diversity/diversity_manager.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Test with free variables, singleton problems, and extreme problem dimensions near resource limits to validate edge case handling
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/diversity/diversity_manager.cucpp/src/mip/diversity/diversity_config.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh} : Avoid reinventing functionality already available in Thrust, CCCL, or RMM libraries; prefer standard library utilities over custom implementations
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/utilities/work_unit_ordered_queue.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/diversity/diversity_manager.cucpp/src/mip/diversity/diversity_config.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/diversity/diversity_manager.cucpp/src/mip/diversity/diversity_config.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/diversity/diversity_manager.cucpp/src/mip/diversity/diversity_config.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.cu : Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/diversity/diversity_manager.cucpp/src/mip/diversity/diversity_config.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cpp,hpp,h} : Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/diversity/diversity_manager.cucpp/src/mip/diversity/diversity_config.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/diversity/diversity_manager.cucpp/src/mip/diversity/diversity_config.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Ensure test isolation: prevent GPU state, cached memory, and global variables from leaking between test cases; verify each test independently initializes its environment
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/diversity/diversity_manager.cucpp/src/mip/diversity/diversity_config.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/diversity/diversity_manager.cucpp/src/mip/diversity/diversity_config.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Applied to files:
cpp/src/mip/presolve/trivial_presolve_helpers.cuhcpp/src/mip/diversity/diversity_config.hpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Applied to files:
cpp/src/mip/utilities/work_unit_ordered_queue.cuhcpp/src/mip/diversity/diversity_manager.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover
Applied to files:
cpp/src/mip/diversity/diversity_manager.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms
Applied to files:
cpp/src/mip/diversity/diversity_manager.cu
🧬 Code graph analysis (1)
cpp/src/mip/diversity/diversity_manager.cu (2)
cpp/src/mip/presolve/lb_probing_cache.cu (2)
compute_probing_cache(310-397)compute_probing_cache(310-312)cpp/src/mip/presolve/trivial_presolve.cu (2)
trivial_presolve(14-14)trivial_presolve(18-18)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.13, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.10, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.11, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.12, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.11, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.13, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.10, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.12, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-sh-client / 13.1.0, 3.10, amd64, rockylinux8
- GitHub Check: checks / check-style
🔇 Additional comments (6)
cpp/src/mip/presolve/trivial_presolve_helpers.cuh (1)
28-37: LGTM! Correct optimization to skip zero-coefficient edges.The early return for zero coefficients is a valid optimization that avoids unnecessary bound checks on edges that contribute nothing to the constraint. The exact
== 0.comparison is appropriate here since matrix coefficients are typically exact input values (not computed), and this pattern is consistent with existing code in this file (line 142).cpp/src/mip/utilities/work_unit_ordered_queue.cuh (3)
1-26: LGTM on includes and license header.Standard includes are appropriate for the functionality. No issues here.
117-121:clear()implementation is correct.Reinitializing via assignment is a standard pattern since
std::priority_queuelacks a nativeclear(). Thread-safe with the lock.
49-65: Push methods are thread-safe and well-implemented.Both copy and move overloads properly acquire the lock before modifying the queue.
cpp/src/mip/diversity/diversity_config.hpp (1)
17-18: Significant increase in probing time budgets.The time budget for probing has increased substantially:
time_ratio_of_probing_cache: 0.04 → 0.1 (2.5× increase, from 4% to 10% of solve time)max_time_on_probing: 15.0 → 60.0 (4× increase)These changes align with the PR objective of applying probing implications during presolve. Verify that the 60-second cap doesn't negatively impact users who expect faster turnaround on smaller problems or those with tight time budgets.
cpp/src/mip/diversity/diversity_manager.cu (1)
188-198: Probing cache computation correctly added before trivial presolve.The probing step is properly guarded by
!fj_only_runand implements sound time budget management:
- Uses
std::minto cap probing time at eithermax_time_on_probingor the proportional limit- Creates a dedicated
probing_timerfor the operation- Correctly captures the
boolreturn value indicating infeasibility and exits early if neededThe
compute_probing_cachefunction returnsboolfor thebound_presolve_ttemplate specialization used here, so the assignment and infeasibility check are correct.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
cpp/src/mip/presolve/probing_cache.cu (2)
542-574: Minor determinism concern with unordered_map iteration.The comment at lines 547-548 claims determinism, but
std::unordered_mapiteration order is not guaranteed to be deterministic across different runs or platforms. While this doesn't affect correctness (the bound updates are commutative), it could affect reproducibility for debugging purposes.If strict determinism is required, consider using
std::mapor sortingvar_indicesbefore callingupdate_variable_bounds.Optional: Sort for deterministic order
for (const auto& [var_idx, modifications] : var_bounds_modifications) { var_indices.push_back(var_idx); lb_values.push_back(modifications.first); ub_values.push_back(modifications.second); } + // Sort for deterministic order (optional) + std::vector<size_t> indices(var_indices.size()); + std::iota(indices.begin(), indices.end(), 0); + std::sort(indices.begin(), indices.end(), [&](size_t a, size_t b) { + return var_indices[a] < var_indices[b]; + }); + std::vector<i_t> sorted_var_indices(var_indices.size()); + std::vector<f_t> sorted_lb(var_indices.size()), sorted_ub(var_indices.size()); + for (size_t i = 0; i < indices.size(); ++i) { + sorted_var_indices[i] = var_indices[indices[i]]; + sorted_lb[i] = lb_values[indices[i]]; + sorted_ub[i] = ub_values[indices[i]]; + } if (var_indices.size() > 0) { - problem.update_variable_bounds(var_indices, lb_values, ub_values); + problem.update_variable_bounds(sorted_var_indices, sorted_lb, sorted_ub);
883-931: OpenMPsingledirective appears misplaced outside parallel region.The
#pragma omp singleat line 914 appears after the closing brace of the#pragma omp parallelregion at line 913. This means the single block executes sequentially outside the parallel region, which is the intended behavior here (synchronization point), but the#pragma omp singledirective is unnecessary and misleading in this context.The code structure should either:
- Remove the
#pragma omp singlesince it's already sequential, or- Move the synchronization logic inside the parallel region with proper barriers.
Current behavior is correct but the pragma is confusing.
Remove unnecessary `#pragma omp single`
} } -#pragma omp single - { + // Synchronization point - apply queued modifications + { // TODO when we have determinism, check current threads work/time counter and filter queue // items that are smaller or equal to that apply_modification_queue_to_problem(modification_vector_pool, problem);
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/presolve_data.cuh
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{cu,cuh}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cu,cuh}: Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification
Avoid reinventing functionality already available in Thrust, CCCL, or RMM libraries; prefer standard library utilities over custom implementations
Files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cu
**/*.cu
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.cu: Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations
Detect inefficient GPU kernel launches with low occupancy or poor memory access patterns; optimize for coalesced memory access and minimize warp divergence in hot paths
Files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/presolve/probing_cache.cu
**/*.{cu,cuh,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...
Files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cu
**/*.{cu,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/presolve/probing_cache.cu
🧠 Learnings (26)
📓 Common learnings
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms
Applied to files:
cpp/src/mip/diversity/diversity_manager.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.cu : Detect inefficient GPU kernel launches with low occupancy or poor memory access patterns; optimize for coalesced memory access and minimize warp divergence in hot paths
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh} : Avoid reinventing functionality already available in Thrust, CCCL, or RMM libraries; prefer standard library utilities over custom implementations
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for problem transformations: verify correctness of original→transformed→postsolve mappings and index consistency across problem representations
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.cu : Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cpp,hpp,h} : Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuhcpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Ensure test isolation: prevent GPU state, cached memory, and global variables from leaking between test cases; verify each test independently initializes its environment
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Applied to files:
cpp/src/mip/diversity/diversity_manager.cucpp/src/mip/problem/presolve_data.cuh
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Applied to files:
cpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Test with free variables, singleton problems, and extreme problem dimensions near resource limits to validate edge case handling
Applied to files:
cpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-10-22T14:25:22.899Z
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 527
File: cpp/src/mip/diversity/lns/rins.cu:167-175
Timestamp: 2025-10-22T14:25:22.899Z
Learning: In MIP (Mixed Integer Programming) problems in the cuOPT codebase, `n_integer_vars == 0` is impossible by definition—MIP problems must have at least one integer variable. If there are no integer variables, it would be a pure Linear Programming (LP) problem, not a MIP problem.
Applied to files:
cpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Applied to files:
cpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Applied to files:
cpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Write tests validating numerical correctness of optimization results (not just 'runs without error'); test degenerate cases (infeasible, unbounded, empty, singleton problems)
Applied to files:
cpp/src/mip/presolve/probing_cache.cu
🧬 Code graph analysis (1)
cpp/src/mip/diversity/diversity_manager.cu (3)
cpp/src/mip/presolve/probing_cache.cu (2)
compute_probing_cache(843-939)compute_probing_cache(843-845)cpp/src/mip/presolve/lb_probing_cache.cu (2)
compute_probing_cache(310-397)compute_probing_cache(310-312)cpp/src/mip/presolve/trivial_presolve.cu (2)
trivial_presolve(14-14)trivial_presolve(18-18)
🔇 Additional comments (11)
cpp/src/mip/problem/presolve_data.cuh (2)
22-29: LGTM! Clean substitution struct design.The
substitution_tstruct appropriately captures the linear relationshipx_substituted = offset + coefficient * x_substituting. The fields are well-named and the structure aligns with standard variable substitution patterns in MIP presolving.One minor observation:
timestampusesf_t(floating-point), which is consistent with thetimer.elapsed_time()return type used elsewhere. This is acceptable for ordering purposes.
54-55: Proper initialization and lifecycle management for substitutions.The
variable_substitutionsvector is correctly:
- Initialized from the source in the copy constructor (line 55)
- Cleared during
initialize_var_mapping(line 69) to ensure clean stateThe documentation comment at lines 94-95 clearly explains the substitution formula and its usage in post-processing.
Also applies to: 69-69, 93-96
cpp/src/mip/presolve/probing_cache.cu (7)
383-384: Verifyis_binarydetection uses correct bounds.The
is_binarycheck at line 384 uses the current boundslbandubfromh_var_bounds[var_idx]. However, if previous probing iterations have already modified these bounds viamodification_vector, theh_var_boundsvector may be stale until the next synchronization step. This could cause missed substitution opportunities or incorrect substitution detection.Consider whether
is_binaryshould be determined from the original problem bounds or if the current approach is intentional (detecting binary variables after bound tightening).
460-468: Verify bound modification uses correct original bounds.When the current probe is infeasible and the other probe's interval type is
GEQorLEQ, the code usesbounds.xandbounds.y(lines 464, 467) which reference the variable bounds fetched at line 380. Ensure these are the intended values, asboundsis a local copy ofh_var_bounds[var_idx]and modifications from other threads won't be reflected until the synchronization step.
497-538: Global bounds inference and substitution logic is correct.The logic correctly:
- Computes global bounds as
min(lb_0, lb_1)andmax(ub_0, ub_1)since if a bound holds for both probe directions, it must hold globally.- Detects substitutions for binary variables when both probes result in singletons with different values.
- Uses the correct linear formula:
x_substituted = l_0 + (l_1 - l_0) * x_substituting.Note: The
TODOat line 499 suggests parallelizing this check. For large problems with many variables, the current O(n_variables) per probed variable could become a bottleneck.
614-643: DFS substitution composition logic is correct.The DFS correctly composes substitution chains:
- Given parent:
curr_var = P_offset + P_coeff * Root_Var- And child:
child_var = C_offset + C_coeff * curr_var- Produces:
child_var = (C_offset + C_coeff * P_offset) + (C_coeff * P_coeff) * Root_VarThis is mathematically correct linear composition.
830-838: Verifylower_boundwithgreatercomparator finds zeros correctly.The code uses
thrust::lower_boundwiththrust::greater<i_t>()on a descending-sorted array to find the first zero. With thegreatercomparator,lower_boundfinds the first position where the element is not greater than 0 (i.e., ≤ 0). Since the values are non-negative counts, this correctly finds the first zero.This is subtle but correct. Consider adding a comment explaining the logic.
925-928: Early exit heuristic may terminate prematurely on small problems.The early exit condition checks if the number of new implied singletons is less than
max(2, min(100, n_variables / 50)). For problems with fewer than 100 variables, this threshold could be as low as 2, potentially causing early termination even when useful implications remain to be discovered.Consider whether this threshold is appropriate for small problems or if a minimum iteration count should be enforced.
701-729: The control flow and index mapping are correct. Theh_variable_mappingsnapshot taken at lines 702–704 is used exclusively within the loop at lines 708–729 for converting current indices to original IDs before the loop completes.substitute_variablesis called after the loop finishes at line 733–734 and does not modifyvariable_mapping(it only updates coefficients, variables, and constraint bounds). No stale mapping issue exists.Likely an incorrect or invalid review comment.
cpp/src/mip/diversity/diversity_manager.cu (2)
188-202: Well-structured probing integration in presolve.The probing cache computation is correctly integrated:
- Time budget is computed from config parameters with a maximum cap (line 193).
- Infeasibility is properly detected and returned early (lines 196-198).
- Cache ID remapping is enabled for trivial presolve (line 201).
- Bounds sanity check follows (line 202).
The guard
!fj_only_runensures this doesn't run in FJ-only mode, which is appropriate.
190-193: No changes needed. The probing time configuration values (time_ratio_of_probing_cacheandmax_time_on_probing) are properly configurable parameters with sensible defaults (10% ratio, 60-second maximum). The time allocation logic at line 193 correctly applies both constraints usingstd::min(), ensuring the allocated time respects both the percentage-based limit and the absolute maximum. The code follows the guideline to make resource limits configurable rather than hard-coded.
aliceb-nv
left a comment
There was a problem hiding this comment.
Awesome work Akif :) Very good idea to add some determinism future-proofing
My comments are minor, but I think it could be beneficial as well to add nvtx ranges to these functions for ease of profiling, as probing can take a substantial amount of runtime
LGTM!
There was a problem hiding this comment.
You can actually just pass -DCMAKE_CUDA_ARCHITECTURES='90-real' for example to the --cmake-args arg of ./build.sh; cmake takes the last specified value on the cmdline for its define so it will override the default
There was a problem hiding this comment.
oh that's great pointer. i was going to change the whole build script to do that. i was going to remove cmake changes anyway.
| f_t lb = get_lower(bounds); | ||
| f_t ub = get_upper(bounds); | ||
| // note that is_binary does not always mean the bound difference is one | ||
| bool is_binary = ub == 1 && lb == 0; |
There was a problem hiding this comment.
We might want to add a canonization pass to the trivial_presolve to turn any integer with [lb, lb+1] bounds into [0, 1]; FJ benefits greatly from recognizing binary variables
Papilo might already do that!
There was a problem hiding this comment.
Let me test on few instances to see if papilo actually does that. Thanks for the pointer.
| } | ||
|
|
||
| template <typename i_t, typename f_t> | ||
| std::vector<i_t> compute_priority_indices_by_implied_integers(problem_t<i_t, f_t>& problem) |
There was a problem hiding this comment.
What does the "implied integers" refer to here?
There was a problem hiding this comment.
The implied integer means that a change in ones variable bounds might imply whole integer bound change on other variables. We prioritize variables that are neighbors to more integer variables. Probing on variables that are neighbors with mosly continuous variables is not super useful.
There was a problem hiding this comment.
Gotcha! I got confused at first because Papilo also produces a list of "implied integers" - in the sense of continuous variables that are proven to only take integer values in feasible solutions
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cpp/src/dual_simplex/presolve.cpp
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{cu,cuh,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...
Files:
cpp/src/dual_simplex/presolve.cpp
**/*.{cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cpp,hpp,h}: Check for unclosed file handles when reading MPS/QPS problem files; ensure RAII patterns or proper cleanup in exception paths
Validate input sanitization to prevent buffer overflows and resource exhaustion attacks; avoid unsafe deserialization of problem files
Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state
Files:
cpp/src/dual_simplex/presolve.cpp
**/*.{cu,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Files:
cpp/src/dual_simplex/presolve.cpp
🧠 Learnings (19)
📓 Common learnings
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2026-01-14T00:38:33.700Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 746
File: cpp/src/dual_simplex/barrier.cu:549-552
Timestamp: 2026-01-14T00:38:33.700Z
Learning: In cpp/src/dual_simplex/barrier.cu's form_adat() method within the barrier solver, the raft::copy from d_original_A_values to device_AD.x is necessary and must not be removed. This copy restores the original matrix A values before they are scaled by the diagonal matrix D (via d_inv_diag_prime). Since D changes between iterations and the scaling mutates device_AD.x, copying the original values each iteration is required for correctness.
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Test with free variables, singleton problems, and extreme problem dimensions near resource limits to validate edge case handling
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cpp,hpp,h} : Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.cu : Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Applied to files:
cpp/src/dual_simplex/presolve.cpp
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Applied to files:
cpp/src/dual_simplex/presolve.cpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.12, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.11, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.10, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.12, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.13, arm64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.11, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.13, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-mps-parser / 13.1.0, 3.10, amd64, rockylinux8
- GitHub Check: wheel-build-cuopt-sh-client / 13.1.0, 3.10, amd64, rockylinux8
🔇 Additional comments (1)
cpp/src/dual_simplex/presolve.cpp (1)
1-6: Copyright header update looks good.Standard year range update for ongoing development.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| // TODO: fix me! In test ./cpp/build/tests/linear_programming/C_API_TEST | ||
| // c_api/TimeLimitTestFixture.time_limit/2 this is crashing. It is crashing only if it is run as | ||
| // whole in sequence and not filtering the respective test. Crash could be observed in previous | ||
| // versions by setting probing cache time to zero. assert(dual_res_inf < 1e-6); | ||
| return dual_res_inf; |
There was a problem hiding this comment.
Track this TODO: commented-out assertion may mask real correctness issues.
Disabling the dual residual assertion due to test failures is a reasonable temporary workaround, but the asymmetry with the primal assertion at line 1236 (assert(primal_res < 1e-6)) suggests the root cause needs investigation. The TODO indicates the crash depends on test execution order and probing cache timing, which hints at possible state pollution between tests rather than a fundamental tolerance issue.
Consider:
- Opening an issue to track this TODO and prevent it from becoming permanent tech debt.
- Investigating whether test isolation is the problem (shared state not being reset between tests).
- If the tolerance is genuinely too strict, adjusting both primal and dual assertions consistently rather than removing only one.
The function still returns dual_res_inf, so callers can perform validation if needed, but silent failures in the crushing step could propagate incorrect dual solutions.
Would you like me to open an issue to track the investigation of this assertion failure?
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
cpp/src/mip/problem/problem.cu (2)
1264-1269: Guard against empty substitution set still needed.Line 1266 accesses
var_indices[0]before validating that the vector is non-empty. If called with an empty substitution set, this causes undefined behavior.🔧 Recommended fix
void problem_t<i_t, f_t>::substitute_variables(const std::vector<i_t>& var_indices, const std::vector<i_t>& var_to_substitute_indices, const std::vector<f_t>& offset_values, const std::vector<f_t>& coefficient_values) { raft::common::nvtx::range fun_scope("substitute_variables"); + cuopt_assert(var_indices.size() == var_to_substitute_indices.size(), "size mismatch"); + cuopt_assert(var_indices.size() == offset_values.size(), "size mismatch"); + cuopt_assert(var_indices.size() == coefficient_values.size(), "size mismatch"); + if (var_indices.empty()) { return; } cuopt_assert((are_exclusive<i_t, f_t>(var_indices, var_to_substitute_indices)), "variables and var_to_substitute_indices are not exclusive"); const i_t dummy_substituted_variable = var_indices[0]; - cuopt_assert(var_indices.size() == var_to_substitute_indices.size(), "size mismatch"); - cuopt_assert(var_indices.size() == offset_values.size(), "size mismatch"); - cuopt_assert(var_indices.size() == coefficient_values.size(), "size mismatch");
2036-2064: Validate lb ≤ ub before writing bounds.The assertions on lines 2057-2058 verify that new bounds are stricter than old bounds, but there's no check that
lb_values[i] <= ub_values[i]. If the caller passes inverted bounds (lb > ub), they would be written tovariable_bounds, potentially causing downstream solver failures.🔧 Recommended fix
[lb_values = make_span(d_lb_values), ub_values = make_span(d_ub_values), variable_bounds = make_span(variable_bounds), var_indices = make_span(d_var_indices)] __device__(auto i) { i_t var_idx = var_indices[i]; cuopt_assert(variable_bounds[var_idx].x <= lb_values[i], "variable lower bound violation"); cuopt_assert(variable_bounds[var_idx].y >= ub_values[i], "variable upper bound violation"); + cuopt_assert(lb_values[i] <= ub_values[i], "inverted bounds: lb > ub"); variable_bounds[var_idx].x = lb_values[i]; variable_bounds[var_idx].y = ub_values[i]; });
🧹 Nitpick comments (5)
cpp/src/mip/presolve/probing_cache.cu (3)
383-384: Consider using tolerance-based comparison for binary check.The exact equality check
ub == 1 && lb == 0could miss near-binary variables due to floating-point precision. While line 388 correctly usesproblem.integer_equal(ub - lb, 1.)for interval detection, thisis_binaryflag gates the substitution logic on line 520.This is consistent with a past comment that suggested using tolerance-based comparison here.
♻️ Suggested improvement
- // note that is_binary does not always mean the bound difference is one - bool is_binary = ub == 1 && lb == 0; + // note that is_binary does not always mean the bound difference is one + f_t int_tol = bound_presolve.context.settings.tolerances.integrality_tolerance; + bool is_binary = integer_equal<f_t>(ub, 1., int_tol) && integer_equal<f_t>(lb, 0., int_tol);
860-861: Hard-coded thread count should be configurable.The
max_threads = 8is hard-coded, which violates the guideline to make resource limits configurable. This could underutilize machines with more cores or cause issues on systems with fewer cores.♻️ Suggested improvement
Consider making this configurable via settings or at least auto-detecting based on available hardware:
- const size_t max_threads = 8; + const size_t max_threads = std::min(static_cast<size_t>(8), + static_cast<size_t>(omp_get_max_threads()));Or better, expose this as a configuration parameter in the solver settings.
913-929: Remove misleading#pragma omp singleoutside parallel region.The
#pragma omp singleon line 913 is placed outside the#pragma omp parallelblock which ends at line 912. When used outside a parallel region,omp singlehas no effect—the code simply executes on the main thread, which is what happens anyway.This appears intentional (synchronize modifications after parallel work), but the pragma is misleading and suggests a misunderstanding.
♻️ Suggested fix
} } -#pragma omp single - { // TODO when we have determinism, check current threads work/time counter and filter queue // items that are smaller or equal to that apply_modification_queue_to_problem(modification_vector_pool, problem); // copy host bounds again, because we changed some problem bounds raft::copy(h_var_bounds.data(), problem.variable_bounds.data(), h_var_bounds.size(), problem.handle_ptr->get_stream()); problem.handle_ptr->sync_stream(); if (n_of_implied_singletons - last_it_implied_singletons < (size_t)std::max(2, (min(100, problem.n_variables / 50)))) { early_exit = true; } last_it_implied_singletons = n_of_implied_singletons; - } } // end of stepcpp/src/mip/problem/problem.cu (2)
1313-1314: Note: Non-deterministic atomicAdd on objective coefficients.The
atomicAddonobjective_coefficients[substituting_var_idx]is thread-safe but may produce non-deterministic results due to floating-point associativity when multiple source variables substitute to the same target. This was acknowledged in prior discussion—documenting for awareness.
1900-1900: Consider removing or documenting commented-out lock guards.The commented-out
std::lock_guardon lines 1900 and 2042 suggests thread-safety was considered. If these methods are guaranteed single-threaded, remove the dead code. If concurrent access is possible, document the thread-safety assumptions or restore the guards.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
cpp/src/mip/presolve/probing_cache.cucpp/src/mip/problem/problem.cu
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{cu,cuh}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cu,cuh}: Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification
Avoid reinventing functionality already available in Thrust, CCCL, or RMM libraries; prefer standard library utilities over custom implementations
Files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
**/*.cu
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.cu: Verify race conditions and correctness of GPU kernel shared memory, atomics, and warp-level operations
Detect inefficient GPU kernel launches with low occupancy or poor memory access patterns; optimize for coalesced memory access and minimize warp divergence in hot paths
Files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
**/*.{cu,cuh,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...
Files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
**/*.{cu,cpp,hpp,h}
📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)
Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
🧠 Learnings (20)
📓 Common learnings
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for problem transformations: verify correctness of original→transformed→postsolve mappings and index consistency across problem representations
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Test with free variables, singleton problems, and extreme problem dimensions near resource limits to validate edge case handling
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh} : Avoid reinventing functionality already available in Thrust, CCCL, or RMM libraries; prefer standard library utilities over custom implementations
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2026-01-14T00:38:38.038Z
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 746
File: cpp/src/dual_simplex/barrier.cu:549-552
Timestamp: 2026-01-14T00:38:38.038Z
Learning: In cpp/src/dual_simplex/barrier.cu's form_adat() method within the barrier solver, the raft::copy from d_original_A_values to device_AD.x is necessary and must not be removed. This copy restores the original matrix A values before they are scaled by the diagonal matrix D (via d_inv_diag_prime). Since D changes between iterations and the scaling mutates device_AD.x, copying the original values each iteration is required for correctness.
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-10-22T14:25:22.899Z
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 527
File: cpp/src/mip/diversity/lns/rins.cu:167-175
Timestamp: 2025-10-22T14:25:22.899Z
Learning: In MIP (Mixed Integer Programming) problems in the cuOPT codebase, `n_integer_vars == 0` is impossible by definition—MIP problems must have at least one integer variable. If there are no integer variables, it would be a pure Linear Programming (LP) problem, not a MIP problem.
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes
Applied to files:
cpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Applied to files:
cpp/src/mip/problem/problem.cucpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cpp,hpp,h} : Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
Applied to files:
cpp/src/mip/problem/problem.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Applied to files:
cpp/src/mip/presolve/probing_cache.cu
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Write tests validating numerical correctness of optimization results (not just 'runs without error'); test degenerate cases (infeasible, unbounded, empty, singleton problems)
Applied to files:
cpp/src/mip/presolve/probing_cache.cu
🧬 Code graph analysis (1)
cpp/src/mip/presolve/probing_cache.cu (2)
cpp/src/utilities/copy_helpers.hpp (18)
host_copy(118-125)host_copy(118-118)host_copy(136-151)host_copy(136-136)host_copy(162-168)host_copy(162-162)host_copy(179-182)host_copy(179-179)host_copy(193-196)host_copy(193-193)make_span(292-297)make_span(292-294)make_span(300-305)make_span(300-302)make_span(308-311)make_span(308-308)make_span(314-317)make_span(314-314)cpp/src/mip/problem/problem.cu (2)
integer_equal(900-900)view(1058-1058)
🔇 Additional comments (11)
cpp/src/mip/presolve/probing_cache.cu (9)
123-138: LGTM!Log level changes from DEBUG to TRACE appropriately reduce output clutter for these internal diagnostic messages, as requested in previous feedback.
437-471: LGTM!The infeasibility handling now properly detects when both probes are infeasible and signals global infeasibility instead of asserting. The modification queue correctly captures bound updates based on the feasible interval type. This addresses the critical issue from previous reviews.
497-538: LGTM!The global bounds inference logic is correct: when both probes are feasible,
min(lb_0, lb_1)andmax(ub_0, ub_1)give conservative global bounds that are valid regardless of the probed variable's value. The substitution formulax_i = offset + coefficient * x_var_idxis correctly computed. The tolerance-based singleton check addresses prior feedback.
542-574: LGTM!The modification aggregation correctly takes the tightest bounds when multiple threads report updates for the same variable. The deterministic comment in the code is accurate for the bound values themselves, even though
unordered_mapiteration order is undefined.
576-677: LGTM!The graph-based substitution merging is well-designed:
sanitize_graphremoves cycles with a deterministic tie-breakerdfscorrectly propagates substitution coefficients using the formula for composing linear transformationsmerge_substitutionsproperly identifies root variables and flattens the substitution graphThe mathematical transformations in lines 631-634 are correct.
679-736: LGTM!The substitution application correctly:
- Flattens the substitution graph via
merge_substitutions- Converts current variable indices to original indices for postsolve storage (lines 702-704, 717-728)
- Applies substitutions to the problem
This properly maintains index mapping consistency across problem transformations. Based on learnings, this is critical for correct postsolve behavior.
738-840: LGTM!The priority computation correctly identifies variables that share constraints with many other integer variables. Key observations:
- CUB segmented reduce operations now properly query temp storage size before each reduction (lines 750-759, 786-797), addressing the previous duplicate reduce issue
- The
thrust::lower_boundwiththrust::greater<i_t>()correctly finds the first zero-count position in the descending-sorted array (lines 830-834)- Trimming non-integer variables (count = 0) avoids unnecessary probing
924-927: LGTM!The early-exit threshold now uses
std::max(2, ...)to ensure a non-zero minimum, addressing the previous issue where small problems withn_variables < 50would compute a threshold of zero.
940-944: LGTM!Template instantiation macro correctly updated to reflect the new
boolreturn type ofcompute_probing_cache.cpp/src/mip/problem/problem.cu (2)
1239-1251: LGTM! Helper function correctly checks for exclusive sets.The implementation correctly determines if two vectors share any common elements using sorted set intersection.
826-841: Variable substitution post-processing logic is correct.The formula
h_assignment[sub.substituted_var] = offset + coefficient * h_assignment[sub.substituting_var]correctly reverses the presolve substitution. The bounds checks viacuopt_assertproperly guard against out-of-range access.Transitive substitution chains (A→B→C) are already flattened by
merge_substitutions()in the upstream probing logic via DFS, ensuring all substituting variables are roots with no further dependencies. Forward iteration is therefore safe regardless of vector order.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
|
/merge |
We were computing the probing cache but actually were not using the probing implications to change the problem. We were just using the cache during solve. In this PR, we add implications to the problem during presolve. When a probing a range is infeasible, we fix the variable bounds to other range. If both probing ranges imply some global bounds on other variables, we update those global bounds. Also if both probings of a binary variable imply some singletons on some other variable, we can substitute that variable with the current variable(with some coefficients).
This PR also changes the variable ordering to prioritize max possible implications: number of other integers that the variable shares constraint with.
With this PR, we can add more variable substitutions since the function is ready now.
Overall there is a consistent increase in number of feasible on H100 system. Compared to baseline main over two batches of miplib runs: 226(average) compared to before: 223(average). Objective gap: 13.4% compared to 14.4%. Overall 3 feasible and 1% gap improvement.
Summary by CodeRabbit
New Features
Bug Fixes
Performance
✏️ Tip: You can customize this high-level summary in your review settings.