Conversation
📝 WalkthroughWalkthroughThis PR extends MPS format writing with quadratic objective support, implements CSR matrix symmetrization utilities in two locations, improves numerical stability in presolve via Kahan compensation, optimizes quadratic matrix construction, and introduces comprehensive round-trip MPS I/O tests. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~35 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (6)
cpp/src/barrier/barrier.cu (1)
3484-3484: Send this through debug logging instead ofprintf.Line 3484 adds solver-internal diagnostics to the normal barrier output stream. That changes the default log format for every solve; please gate it behind
settings.log.debug(...)or a debug flag instead.Suggested change
- settings.log.printf("norm_b: %.2e, norm_c: %.2e\n", norm_b, norm_c); + settings.log.debug("norm_b: %.2e, norm_c: %.2e\n", norm_b, norm_c);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/barrier/barrier.cu` at line 3484, The diagnostic printf call settings.log.printf("norm_b: %.2e, norm_c: %.2e\n", norm_b, norm_c) should be moved to debug logging so it doesn't alter normal output; replace or wrap it with the logger's debug method (e.g., settings.log.debug(...)) or conditionally emit it only when a debug flag is enabled, keeping the same formatted message and variable references (norm_b, norm_c) so the diagnostic remains available but only in debug mode.cpp/src/dual_simplex/presolve.cpp (1)
1504-1512: Consider adding a defensive assertion for size consistency.The loop iterates over
input_x.size()but accessesremoved_lower_bounds[j]. While the current code flow ensures these sizes match after variable restoration, adding an explicit assertion would guard against future refactoring that might break this invariant.🛡️ Suggested defensive assertion
if (presolve_info.removed_lower_bounds.size() > 0) { i_t num_lower_bounds = 0; + assert(input_x.size() == presolve_info.removed_lower_bounds.size()); // We removed some lower bounds so we need to map the crushed solution back to the original // variables for (i_t j = 0; j < input_x.size(); j++) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/dual_simplex/presolve.cpp` around lines 1504 - 1512, Before iterating to restore removed lower bounds, add a defensive assertion that input_x.size() equals presolve_info.removed_lower_bounds.size() to catch future regressions; place it just before the loop that references input_x and presolve_info.removed_lower_bounds (the block that computes num_lower_bounds, updates input_x[j], and calls settings.log.printf) so the code verifies size consistency before accessing removed_lower_bounds[j].cpp/src/utilities/sparse_matrix_helpers.hpp (1)
35-119: Significant code duplication withsparse_matrix_utils.hpp.This implementation is nearly identical to
cpp/libmps_parser/src/utilities/sparse_matrix_utils.hpp. Consider consolidating into a single shared utility to avoid maintenance burden and potential divergence.Additionally, the comment at line 44 says "Optimized 2-pass algorithm" but the implementation is actually a 3-pass algorithm (count → fill → deduplicate).
💡 Comment fix
- // Optimized 2-pass algorithm (no COO intermediate) + // Optimized 3-pass algorithm (no COO intermediate)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/utilities/sparse_matrix_helpers.hpp` around lines 35 - 119, The function symmetrize_csr is duplicated vs cpp/libmps_parser/src/utilities/sparse_matrix_utils.hpp and its header comment incorrectly says "Optimized 2-pass algorithm" though the code uses three passes (count → fill → deduplicate); consolidate by moving the implementation of symmetrize_csr (and its helper locals: temp_offsets, temp_indices, temp_values, row_pos, workspace, out_offsets/out_indices/out_values handling) into a single shared utility header/source used by both consumers (create a common utilities header/namespace, replace the duplicate definitions with an include or call to that single symmetrize_csr), update callers to include the shared header, remove the redundant implementation, and correct the top comment to "Optimized 3-pass algorithm (count → fill → deduplicate)".cpp/libmps_parser/src/utilities/sparse_matrix_utils.hpp (1)
32-50: Guard againstn_rows <= 0in pointer-based overload.If
n_rowsis zero or negative, the function will still allocate vectors and attempt iteration, which could lead to unexpected behavior.🛡️ Proposed early-return guard
template <typename i_t, typename f_t> void symmetrize_csr(const f_t* in_values, const i_t* in_indices, const i_t* in_offsets, i_t n_rows, std::vector<f_t>& out_values, std::vector<i_t>& out_indices, std::vector<i_t>& out_offsets) { + if (n_rows <= 0) { + out_values.clear(); + out_indices.clear(); + out_offsets.assign(1, 0); + return; + } // Optimized 3-pass algorithm // Pass 1: Count entries per row in A + A^T🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/libmps_parser/src/utilities/sparse_matrix_utils.hpp` around lines 32 - 50, The symmetrize_csr function lacks a guard for non-positive n_rows and may dereference in_offsets/in_indices; at the top of symmetrize_csr add an early return for n_rows <= 0 that clears out_values and out_indices and sets out_offsets to an empty vector (or a single 0 element if callers expect offsets.size() == 1), then return immediately to avoid accessing in_offsets[0] or allocating row_counts; this prevents any further loops or allocations (references: symmetrize_csr, in_offsets, out_offsets, row_counts).cpp/src/pdlp/optimization_problem.cu (1)
18-18: Consider using thesymmetrize_csrutility instead of duplicating the algorithm.The
sparse_matrix_helpers.hppheader is included at line 18, but instead of callingcuopt::symmetrize_csr(), the same 3-pass algorithm is duplicated inline. This creates maintenance burden and risks divergence.Also, the comment at line 192 says "2-pass algorithm" but it's actually 3-pass.
♻️ Proposed refactor using the utility
- // Build Q + Q^T using optimized 2-pass algorithm (no COO intermediate) - // Memory: ~3× nnz, ~2x faster than original COO-based approach - i_t qn = size_offsets - 1; // Number of variables - - // Pass 1: Count entries per row in Q + Q^T - std::vector<i_t> row_counts(qn, 0); - ... (entire algorithm) - Q_values_.resize(nz); + // Build Q + Q^T using optimized 3-pass algorithm (no COO intermediate) + std::vector<f_t> in_values(Q_values, Q_values + size_values); + std::vector<i_t> in_indices(Q_indices, Q_indices + size_indices); + std::vector<i_t> in_offsets(Q_offsets, Q_offsets + size_offsets); + + cuopt::symmetrize_csr(in_values, in_indices, in_offsets, Q_values_, Q_indices_, Q_offsets_);Also applies to: 192-267
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/pdlp/optimization_problem.cu` at line 18, Replace the duplicated 3-pass CSR symmetrization code with a call to the existing utility cuopt::symmetrize_csr(), removing the inline algorithm between the current symmetrization start and end (the block flagged around lines 192-267) and pass the same CSR inputs/outputs (row_ptr, col_idx, values or their device pointers) used by the duplicated code; ensure the call signature matches cuopt::symmetrize_csr(row_ptr_in, col_idx_in, vals_in, nnz_in, row_ptr_out, col_idx_out, vals_out, allocator/stream arguments) and adapt variable names accordingly, then update the nearby comment that currently reads "2-pass algorithm" to correctly reflect "3-pass algorithm" or remove it if no longer needed.cpp/libmps_parser/tests/mps_parser_test.cpp (1)
1022-1043: Consider using a cross-platform temp file mechanism and RAII for cleanup.The tests use hardcoded
/tmp/paths which may not work on all platforms. Additionally, if a test fails before reachingstd::filesystem::remove(), the temp file will remain.Consider using
std::filesystem::temp_directory_path()and a RAII wrapper or test fixture for automatic cleanup.💡 Example improvement
TEST(mps_roundtrip, linear_programming_basic) { std::string input_file = cuopt::test::get_rapids_dataset_root_dir() + "/linear_programming/good-mps-1.mps"; - std::string temp_file = "/tmp/mps_roundtrip_lp_test.mps"; + std::string temp_file = + (std::filesystem::temp_directory_path() / "mps_roundtrip_lp_test.mps").string(); + + // RAII cleanup + struct TempFileGuard { + std::string path; + ~TempFileGuard() { std::filesystem::remove(path); } + } guard{temp_file}; // Read original auto original = parse_mps<int, double>(input_file, true); // ... rest of test - // Cleanup - std::filesystem::remove(temp_file); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/libmps_parser/tests/mps_parser_test.cpp` around lines 1022 - 1043, Replace the hardcoded "/tmp/mps_roundtrip_lp_test.mps" with a platform-safe temp path (use std::filesystem::temp_directory_path() combined with std::filesystem::unique_path, e.g. unique_path("mps_roundtrip_%%%%-%%%%.mps")) inside the TEST(mps_roundtrip, linear_programming_basic) test; create a small RAII helper (e.g. TempFile or use a test fixture with a destructor/TearDown) that holds the std::filesystem::path and removes the file in its destructor, then use that temp path when calling mps_writer_t<int,double>::write and parse_mps<int,double> so cleanup always runs even on failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/libmps_parser/src/utilities/sparse_matrix_utils.hpp`:
- Around line 119-135: The wrapper symmetrize_csr must handle an empty
in_offsets to avoid underflow when computing n_rows; add a guard at the start of
the function that checks if in_offsets.empty() and if so clear out_values and
out_indices and set out_offsets to a single element {0} (or otherwise initialize
outputs to represent an empty CSR) then return early; otherwise proceed to
compute n_rows = in_offsets.size() - 1 and call the underlying symmetrize_csr
overload as before.
In `@cpp/src/utilities/sparse_matrix_helpers.hpp`:
- Around line 136-152: The wrapper symmetrize_csr currently computes n_rows =
in_offsets.size() - 1 which underflows when in_offsets is empty; add a guard at
the top of the function to handle the empty-input case (e.g., if
in_offsets.empty() ) and either return early after clearing/setting out_values,
out_indices, and out_offsets to empty/default state or set n_rows = 0 before
calling the lower-level symmetrize_csr overload; update the function
symmetrize_csr(const std::vector<f_t>&,... ) to use this guard so it never
performs size()-1 on an empty in_offsets.
---
Nitpick comments:
In `@cpp/libmps_parser/src/utilities/sparse_matrix_utils.hpp`:
- Around line 32-50: The symmetrize_csr function lacks a guard for non-positive
n_rows and may dereference in_offsets/in_indices; at the top of symmetrize_csr
add an early return for n_rows <= 0 that clears out_values and out_indices and
sets out_offsets to an empty vector (or a single 0 element if callers expect
offsets.size() == 1), then return immediately to avoid accessing in_offsets[0]
or allocating row_counts; this prevents any further loops or allocations
(references: symmetrize_csr, in_offsets, out_offsets, row_counts).
In `@cpp/libmps_parser/tests/mps_parser_test.cpp`:
- Around line 1022-1043: Replace the hardcoded "/tmp/mps_roundtrip_lp_test.mps"
with a platform-safe temp path (use std::filesystem::temp_directory_path()
combined with std::filesystem::unique_path, e.g.
unique_path("mps_roundtrip_%%%%-%%%%.mps")) inside the TEST(mps_roundtrip,
linear_programming_basic) test; create a small RAII helper (e.g. TempFile or use
a test fixture with a destructor/TearDown) that holds the std::filesystem::path
and removes the file in its destructor, then use that temp path when calling
mps_writer_t<int,double>::write and parse_mps<int,double> so cleanup always runs
even on failures.
In `@cpp/src/barrier/barrier.cu`:
- Line 3484: The diagnostic printf call settings.log.printf("norm_b: %.2e,
norm_c: %.2e\n", norm_b, norm_c) should be moved to debug logging so it doesn't
alter normal output; replace or wrap it with the logger's debug method (e.g.,
settings.log.debug(...)) or conditionally emit it only when a debug flag is
enabled, keeping the same formatted message and variable references (norm_b,
norm_c) so the diagnostic remains available but only in debug mode.
In `@cpp/src/dual_simplex/presolve.cpp`:
- Around line 1504-1512: Before iterating to restore removed lower bounds, add a
defensive assertion that input_x.size() equals
presolve_info.removed_lower_bounds.size() to catch future regressions; place it
just before the loop that references input_x and
presolve_info.removed_lower_bounds (the block that computes num_lower_bounds,
updates input_x[j], and calls settings.log.printf) so the code verifies size
consistency before accessing removed_lower_bounds[j].
In `@cpp/src/pdlp/optimization_problem.cu`:
- Line 18: Replace the duplicated 3-pass CSR symmetrization code with a call to
the existing utility cuopt::symmetrize_csr(), removing the inline algorithm
between the current symmetrization start and end (the block flagged around lines
192-267) and pass the same CSR inputs/outputs (row_ptr, col_idx, values or their
device pointers) used by the duplicated code; ensure the call signature matches
cuopt::symmetrize_csr(row_ptr_in, col_idx_in, vals_in, nnz_in, row_ptr_out,
col_idx_out, vals_out, allocator/stream arguments) and adapt variable names
accordingly, then update the nearby comment that currently reads "2-pass
algorithm" to correctly reflect "3-pass algorithm" or remove it if no longer
needed.
In `@cpp/src/utilities/sparse_matrix_helpers.hpp`:
- Around line 35-119: The function symmetrize_csr is duplicated vs
cpp/libmps_parser/src/utilities/sparse_matrix_utils.hpp and its header comment
incorrectly says "Optimized 2-pass algorithm" though the code uses three passes
(count → fill → deduplicate); consolidate by moving the implementation of
symmetrize_csr (and its helper locals: temp_offsets, temp_indices, temp_values,
row_pos, workspace, out_offsets/out_indices/out_values handling) into a single
shared utility header/source used by both consumers (create a common utilities
header/namespace, replace the duplicate definitions with an include or call to
that single symmetrize_csr), update callers to include the shared header, remove
the redundant implementation, and correct the top comment to "Optimized 3-pass
algorithm (count → fill → deduplicate)".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6f4d3fea-12c6-4d00-a3ec-893c0132ba72
📒 Files selected for processing (8)
cpp/libmps_parser/include/mps_parser/mps_writer.hppcpp/libmps_parser/src/mps_writer.cppcpp/libmps_parser/src/utilities/sparse_matrix_utils.hppcpp/libmps_parser/tests/mps_parser_test.cppcpp/src/barrier/barrier.cucpp/src/dual_simplex/presolve.cppcpp/src/pdlp/optimization_problem.cucpp/src/utilities/sparse_matrix_helpers.hpp
| template <typename i_t, typename f_t> | ||
| void symmetrize_csr(const std::vector<f_t>& in_values, | ||
| const std::vector<i_t>& in_indices, | ||
| const std::vector<i_t>& in_offsets, | ||
| std::vector<f_t>& out_values, | ||
| std::vector<i_t>& out_indices, | ||
| std::vector<i_t>& out_offsets) | ||
| { | ||
| i_t n_rows = static_cast<i_t>(in_offsets.size()) - 1; | ||
| symmetrize_csr(in_values.data(), | ||
| in_indices.data(), | ||
| in_offsets.data(), | ||
| n_rows, | ||
| out_values, | ||
| out_indices, | ||
| out_offsets); | ||
| } |
There was a problem hiding this comment.
Handle edge case: empty in_offsets vector.
When in_offsets is empty, in_offsets.size() - 1 will underflow (for unsigned) or produce an unexpected value. Add a guard for this edge case.
🛡️ Proposed fix
template <typename i_t, typename f_t>
void symmetrize_csr(const std::vector<f_t>& in_values,
const std::vector<i_t>& in_indices,
const std::vector<i_t>& in_offsets,
std::vector<f_t>& out_values,
std::vector<i_t>& out_indices,
std::vector<i_t>& out_offsets)
{
+ if (in_offsets.empty()) {
+ out_values.clear();
+ out_indices.clear();
+ out_offsets.clear();
+ return;
+ }
i_t n_rows = static_cast<i_t>(in_offsets.size()) - 1;
symmetrize_csr(in_values.data(),
in_indices.data(),
in_offsets.data(),
n_rows,
out_values,
out_indices,
out_offsets);
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/libmps_parser/src/utilities/sparse_matrix_utils.hpp` around lines 119 -
135, The wrapper symmetrize_csr must handle an empty in_offsets to avoid
underflow when computing n_rows; add a guard at the start of the function that
checks if in_offsets.empty() and if so clear out_values and out_indices and set
out_offsets to a single element {0} (or otherwise initialize outputs to
represent an empty CSR) then return early; otherwise proceed to compute n_rows =
in_offsets.size() - 1 and call the underlying symmetrize_csr overload as before.
| template <typename i_t, typename f_t> | ||
| void symmetrize_csr(const std::vector<f_t>& in_values, | ||
| const std::vector<i_t>& in_indices, | ||
| const std::vector<i_t>& in_offsets, | ||
| std::vector<f_t>& out_values, | ||
| std::vector<i_t>& out_indices, | ||
| std::vector<i_t>& out_offsets) | ||
| { | ||
| i_t n_rows = static_cast<i_t>(in_offsets.size()) - 1; | ||
| symmetrize_csr(in_values.data(), | ||
| in_indices.data(), | ||
| in_offsets.data(), | ||
| n_rows, | ||
| out_values, | ||
| out_indices, | ||
| out_offsets); | ||
| } |
There was a problem hiding this comment.
Same edge case issue: empty in_offsets can cause underflow.
Same issue as in sparse_matrix_utils.hpp - if in_offsets is empty, in_offsets.size() - 1 will produce an unexpected value.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/utilities/sparse_matrix_helpers.hpp` around lines 136 - 152, The
wrapper symmetrize_csr currently computes n_rows = in_offsets.size() - 1 which
underflows when in_offsets is empty; add a guard at the top of the function to
handle the empty-input case (e.g., if in_offsets.empty() ) and either return
early after clearing/setting out_values, out_indices, and out_offsets to
empty/default state or set n_rows = 0 before calling the lower-level
symmetrize_csr overload; update the function symmetrize_csr(const
std::vector<f_t>&,... ) to use this guard so it never performs size()-1 on an
empty in_offsets.
Description
Issue
Checklist