Add support for handling free variables in QPs with the augmented system #1146

chris-maes wants to merge 3 commits into
Conversation
📝 Walkthrough

This PR adds native tracking and handling of free primal variables in the barrier QP path: presolve records free indices for QPs, iteration state stores a device free-flag mask, the initial point and positivity logic respect free variables, GPU kernels skip complementarity terms for free variables, and the barrier mu/RHS calculations are adjusted accordingly.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 3 passed, ❌ 2 failed (1 warning, 1 inconclusive)
Actionable comments posted: 2
Inline comments, `cpp/src/barrier/barrier.cu`:

- Around lines 3077-3079: the division by `mu_denom` when computing `mu_aff` can hit zero if all variables are free (`data.n_free_vars == data.x.size()` and `data.n_upper_bounds == 0`). Centralize the pair-count logic shared by `compute_mu()` and `solve()` in a helper that computes `denom = static_cast<f_t>(data.x.size()) + static_cast<f_t>(data.n_upper_bounds) - static_cast<f_t>(data.n_free_vars)`, and if `denom` is zero or below an epsilon, set `mu_aff`/`mu` to 0 (or a safe fallback) instead of dividing, to avoid NaN/Inf.
- Around lines 292-296: the change currently forces `use_augmented = true` whenever `has_Q` is true, removing the ADAT path and skipping dense-column handling. Only enable the augmented system when it is actually required (Q is non-diagonal, or there is at least one native free column); otherwise preserve the ADAT path and run dense-column detection as before, keeping the existing handling of the diagonal-Q, zero-native-free-column case.
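The gating asked for in the second comment can be sketched as a small predicate. This is a hedged sketch only: `has_Q`, `q_is_diagonal`, and `n_native_free_cols` are assumed stand-ins for whatever flags the module actually uses, not the real identifiers.

```cpp
#include <cassert>

// Hedged sketch of the suggested condition: only force the augmented
// system when it is actually required, i.e. Q is non-diagonal or there
// is at least one native free column; otherwise keep the ADAT path and
// dense-column detection. All three parameter names are assumptions.
bool needs_augmented(bool has_Q, bool q_is_diagonal, int n_native_free_cols)
{
  return has_Q && (!q_is_diagonal || n_native_free_cols > 0);
}
```

Under this shape, a diagonal-Q problem with no native free columns keeps the ADAT path exactly as before.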
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 2ac07e73-8534-4bbf-8dc3-4d43e4e1321b
📒 Files selected for processing (3)
- cpp/src/barrier/barrier.cu
- cpp/src/dual_simplex/presolve.cpp
- cpp/src/dual_simplex/presolve.hpp
```cpp
f_t mu_denom = static_cast<f_t>(data.x.size()) + static_cast<f_t>(data.n_upper_bounds);
mu_denom -= static_cast<f_t>(data.n_free_vars);
mu_aff = complementarity_aff_sum / mu_denom;
```
Guard the zero-complementarity case before dividing by mu_denom.
With native free variables, an equality-constrained QP can legitimately have n_free_vars == x.size() and n_upper_bounds == 0. Then this denominator becomes 0, so mu_aff/mu turn into NaN and the predictor-corrector step blows up. Please centralize the pair-count calculation, handle denom == 0, and reuse that helper for the initial mu computed in solve() as well.
Possible fix

```diff
+ auto complementarity_pairs = [&]() -> f_t {
+   return static_cast<f_t>(data.x.size()) +
+          static_cast<f_t>(data.n_upper_bounds) -
+          static_cast<f_t>(data.n_free_vars);
+ };
+
- f_t mu_denom = static_cast<f_t>(data.x.size()) + static_cast<f_t>(data.n_upper_bounds);
- mu_denom -= static_cast<f_t>(data.n_free_vars);
- mu_aff = complementarity_aff_sum / mu_denom;
+ const f_t mu_denom = complementarity_pairs();
+ if (mu_denom == f_t(0)) {
+   mu_aff = f_t(0);
+   sigma  = f_t(0);
+   new_mu = f_t(0);
+   return;
+ }
+ mu_aff = complementarity_aff_sum / mu_denom;
```

Apply the same helper/guard in compute_mu() and in the initial mu setup inside solve().
As per coding guidelines, "Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks."
Also applies to: 3303-3312
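The guard the review asks for can be illustrated as a standalone host-side helper. This is a hedged sketch, not the solver's code: `n`, `n_upper`, and `n_free` mirror `data.x.size()`, `data.n_upper_bounds`, and `data.n_free_vars`, and `double` stands in for `f_t`.

```cpp
#include <cassert>

// Hedged sketch of the guarded pair-count helper: compute the number of
// complementarity pairs and return mu = sum / pairs, except when there
// are no pairs (all variables free), in which case return 0 rather than
// dividing by zero and producing NaN/Inf.
double guarded_mu(double complementarity_sum, int n, int n_upper, int n_free)
{
  const double denom = static_cast<double>(n) + static_cast<double>(n_upper) -
                       static_cast<double>(n_free);
  if (denom <= 0.0) { return 0.0; }  // equality-constrained QP, all vars free
  return complementarity_sum / denom;
}
```

Because the denominator is a difference of integer counts, the zero case is exact; an epsilon threshold is still a reasonable extra safety margin per the cited guideline.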
🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!
```cpp
    vector_norm2<i_t, f_t>(data.dual_residual));
#endif
  // Make sure (w, x, v, z) > 0
  // Save free variable x values before forcing positive (they can be negative)
```
Yuck. This save and restore is gross. We should just have another ensure_positive take in a list of indices to keep positive.
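The overload the reviewer suggests could look roughly like the following free-standing sketch (hedged: `ensure_positive_except` and its signature are hypothetical, using `std::vector<double>` in place of the solver's dense-vector type): clamp every entry up to `eps` except the listed free indices, so no save/restore is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hedged sketch of an ensure_positive variant that takes the indices to
// leave untouched: free variables keep their (possibly negative) values,
// all other entries are raised to at least eps.
void ensure_positive_except(std::vector<double>& x, double eps,
                            const std::vector<int>& free_indices)
{
  std::vector<char> is_free(x.size(), 0);
  for (int j : free_indices) { is_free[j] = 1; }
  for (std::size_t j = 0; j < x.size(); ++j) {
    if (!is_free[j]) { x[j] = std::max(x[j], eps); }
  }
}
```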
```cpp
problem.Q.check_matrix("Before free variable expansion");

if (settings.barrier_presolve && free_variables > 0) {
  // For QPs using the augmented system, keep free variables as-is rather than
```
Do we know we are using the augmented system yet?
```cpp
  data.d_diag_.size(),
  cuda::std::divides<>{},
  stream_view_.value());
// For native free variables (QP): use Q diagonal if available, otherwise a static regularizer
```
Is this only for the ADA^T case?
```cpp
  },
  stream_view_.value());
} else {
  cub::DeviceTransform::Transform(
```
I'm confused about this case: it is for data.Q.n == 0 || !data.Q.diagonal
```cpp
if (lp.Q.n > 0) { create_Q(lp, Q); }

iteration_data_t<i_t, f_t> data(lp, num_upper_bounds, Q, settings);
// Set up native free variable tracking for QPs
```
This should happen inside the iteration_data_t constructor
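The suggested refactor, sketched in a hedged, simplified form (field and parameter names such as `free_mask` are assumptions; a `std::vector<char>` stands in for the device free-flag mask): the constructor takes the free indices and builds the tracking state itself, so no caller does post-construction setup.

```cpp
#include <cassert>
#include <vector>

// Hedged sketch: free-variable tracking initialized inside the
// iteration-data constructor rather than by the caller after the fact.
struct iteration_data_sketch {
  int n_free_vars;
  std::vector<char> free_mask;  // stand-in for the device free-flag mask

  iteration_data_sketch(int num_cols, const std::vector<int>& free_indices)
    : n_free_vars(static_cast<int>(free_indices.size())), free_mask(num_cols, 0)
  {
    for (int j : free_indices) { free_mask[j] = 1; }
  }
};
```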
```cpp
  data.d_complementarity_xz_rhs_.size(),
  [new_mu] HD(f_t dx_aff, f_t dz_aff) { return -(dx_aff * dz_aff) + new_mu; },
  stream_view_.value());
auto fill_linear_cc_rhs = [&](raft::device_span<f_t> out,
```
This should be a method rather than a lambda
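A hedged host-side sketch of the suggested shape (the class name, `static` method, and `std::vector<double>` in place of `raft::device_span<f_t>` are all simplifying assumptions): the capturing lambda becomes a member function that receives its inputs explicitly, which also matches the corrector-RHS pattern `-(dx_aff * dz_aff) + mu` seen above.

```cpp
#include <cassert>
#include <vector>

// Hedged sketch: fill_linear_cc_rhs as a named method instead of a
// capturing lambda defined inside solve().
class barrier_sketch {
 public:
  // out[i] = -(dx_aff[i] * dz_aff[i]) + mu
  static void fill_linear_cc_rhs(std::vector<double>& out,
                                 const std::vector<double>& dx_aff,
                                 const std::vector<double>& dz_aff,
                                 double mu)
  {
    for (std::size_t i = 0; i < out.size(); ++i) {
      out[i] = -(dx_aff[i] * dz_aff[i]) + mu;
    }
  }
};
```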
Actionable comments posted: 2
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 08e56a46-95d0-4316-aa97-aa4e5c5a1d18
📒 Files selected for processing (3)
- cpp/src/barrier/barrier.cu
- cpp/src/barrier/dense_vector.hpp
- cpp/src/dual_simplex/presolve.cpp
```cpp
if (data.n_free_vars > 0) {
  std::vector<i_t> nonnegative_variables(data.w.size(), 1);
  for (i_t j : presolve_info.free_variable_indices) {
    nonnegative_variables[j] = 0;
  }

  data.x.ensure_positive(epsilon_adjust, nonnegative_variables);
```
Build the positivity mask at x.size(), not w.size().
data.x.ensure_positive(..., mask) walks x.size() entries, but this mask is allocated with data.w.size() (n_upper_bounds). On any QP where a free column index is outside the upper-bound list, this writes/reads past the mask and can corrupt the initial point or crash before the first iteration.
Suggested fix

```diff
- std::vector<i_t> nonnegative_variables(data.w.size(), 1);
+ std::vector<i_t> nonnegative_variables(data.x.size(), 1);
  for (i_t j : presolve_info.free_variable_indices) {
    nonnegative_variables[j] = 0;
  }
  data.x.ensure_positive(epsilon_adjust, nonnegative_variables);
```

As per coding guidelines, "Flag invalid memory access patterns including out-of-bounds, use-after-free, and host/device confusion."
```cpp
if (settings.barrier_presolve && free_variables > 0 && problem.Q.n > 0) {
  presolve_info.free_variable_indices.clear();
  for (i_t j = 0; j < problem.num_cols; j++) {
    if (problem.lower[j] == -inf && problem.upper[j] == inf) {
      presolve_info.free_variable_indices.push_back(j);
    }
  }
} else if (settings.barrier_presolve && free_variables > 0) {
```
Clear the split-pair metadata in the native-QP branch.
This path switches free-variable representation from free_variable_pairs to free_variable_indices, but it only clears the latter. Later code still reads presolve_info.free_variable_pairs, so a reused presolve_info_t can carry stale split-variable mappings into a QP that kept its columns native.
Suggested fix

```diff
  if (settings.barrier_presolve && free_variables > 0 && problem.Q.n > 0) {
    presolve_info.free_variable_indices.clear();
+   presolve_info.free_variable_pairs.clear();
    for (i_t j = 0; j < problem.num_cols; j++) {
      if (problem.lower[j] == -inf && problem.upper[j] == inf) {
        presolve_info.free_variable_indices.push_back(j);
      }
    }
```
Description
Issue
Checklist