Skip to content

fix(sos): refuse masked SOS variables with a clear error; fix reformulate_sos=True no-op on native SOS solvers#689

Merged
FabianHofmann merged 21 commits into
masterfrom
fix/sos-masked-variables
May 18, 2026
Merged

fix(sos): refuse masked SOS variables with a clear error; fix reformulate_sos=True no-op on native SOS solvers#689
FabianHofmann merged 21 commits into
masterfrom
fix/sos-masked-variables

Conversation

@FBumann
Copy link
Copy Markdown
Collaborator

@FBumann FBumann commented May 18, 2026

Stacked on #684. Tracks #688.

Summary

Two related fixes around SOS handling:

  1. Refuse masked SOS variables up front. The SOS plumbing (direct-API solver builds + LP-file writer) treats linopy variable labels as solver column indices, which breaks as soon as any variable in the model is masked. Today's symptoms are solver-specific and confusing:

    Until the proper fix (Using SOS on masked variables breaks #688), surface a single clear NotImplementedError instead.

  2. Fix reformulate_sos=True to actually reformulate on native SOS solvers. It was silently no-op'd with a warning ("supports SOS natively, ignored"). The 3 options (True, 'auto', False) now each have a distinct use case.

This enables a clean workaround for the masked-SOS bug: pass reformulate_sos=True and the SOS is converted to binary+linear before reaching the masked-SOS guard.

Where the guard lives

In Solver._build (covers both Model.solve and the 2-step Solver.from_model(...).solve() API) and inside sos_to_file (covers standalone m.to_file()).

Affected piecewise test

test_lp_per_entity_nan_padding exercised both method='lp' and method='sos2' on NaN-padded breakpoints. The sos2 case relied on the silent corruption — split into:

  • test_lp_per_entity_nan_padding — keeps the LP-path regression coverage (the original purpose)
  • test_sos2_per_entity_nan_padding_errors — asserts the new guard fires

Users on piecewise with heterogeneous breakpoints should pick method='lp' (or 'incremental') until #688.

Test plan

  • test/test_sos_constraints.py — 5 new tests (Gurobi/Xpress direct raise, LP writer raise, reformulate_sos=True workaround works, reformulate_sos=True actually reformulates on Gurobi)
  • test/test_sos_reformulation.py — all 60 existing tests still pass
  • test/test_piecewise_constraints.py — all 242 tests pass (1 updated, 1 added)
  • pre-commit run --all-files clean

🤖 Generated with Claude Code

@FabianHofmann I would like to merge this, creating a baseline for masked sos constraints. Then working on proper support for them with tests etc. SO much of this code will be reverted.

FabianHofmann and others added 15 commits May 18, 2026 10:30
Adds io_api='direct' for Xpress: loads the model through the native
loadproblem array API instead of writing an LP/MPS file. Includes a
model.to_xpress() helper mirroring to_gurobipy/to_highspy/to_mosek,
SOS attachment, and feature flags (DIRECT_API, SOS_CONSTRAINTS).
Refactors _run_file to share a _solve helper with the new _run_direct.
The lowercase Xpress API (addnames, chgobjsense, read, setlogfile,
readbasis, postsolve, writebasis, writebinsol, getDual) exists in
every released xpress version. Drop the try/except dispatch to the
PascalCase aliases — they only added uncovered fallback branches
without behavioural benefit on either 9.4 or 9.8+.
…e` on native SOS solvers

The SOS plumbing (direct-API solver builds + LP-file writer) treats linopy
variable labels as solver column indices. That assumption breaks as soon as
any variable in the model is masked (label space becomes non-contiguous):
- Gurobi direct: `IndexError`
- Xpress direct (after #684): `?404 Invalid column number`
- LP file: parse errors on Xpress/CPLEX, silent SOS-set corruption on Gurobi

Until the proper fix (#688), refuse masked SOS up front with a clear
`NotImplementedError` instead of producing solver-specific failures deeper
down.

- Add `linopy.io._raise_if_sos_has_masked(model)` and call it from
  `Solver._build` (covers both `Model.solve` and the 2-step
  `Solver.from_model().solve()` API) plus `sos_to_file` (covers standalone
  `m.to_file()`).
- Fix the related bug where `reformulate_sos=True` silently no-op'd on
  solvers that support SOS natively (only a warning was emitted). `True`
  now means "always reformulate", as documented.
- The combination of the two changes gives users a working workaround for
  masked SOS: pass `reformulate_sos=True` and the SOS gets converted to
  binary+linear constraints before reaching the masked-SOS guard.
- Adjust the piecewise NaN-padding regression test: split the LP and SOS2
  cases since the SOS2 path now errors (separate `test_sos2_per_entity_
  nan_padding_errors` asserts that), and the LP case keeps its original
  regression coverage.

Stacked on #684 — the Xpress changes are part of the affected surface.

Refs #688.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Felix <117816358+FBumann@users.noreply.github.com>
Move the masked-SOS check to ``Model._check_sos_unmasked`` and call it
from a single hoisted spot in ``to_file`` (covers LP and MPS) plus
``Solver._build``. Removes the now-unreachable masked filter from the
Xpress direct ``add_sos`` helper. Restore Big-M / finite-bounds note to
the ``reformulate_sos`` docstring. Replace the brittle log-string check
in ``test_reformulate_sos_true_reformulates_on_native_solver`` with a
behavioural assertion on the auxiliary artifacts the reformulation
writes into the LP file, and parametrize the workaround test to also
run on HiGHS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Conflict in test/test_sos_constraints.py: both branches added new tests
after test_sos2_xpress_direct. Kept both — xpress QP+SOS test from the
base, masked-SOS guard + reformulate_sos workaround tests from this
branch. Tightened ``_masked_sos_model``'s ``sos_type`` parameter to
``Literal[1, 2]`` so mypy stays clean across the merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@FBumann FBumann marked this pull request as ready for review May 18, 2026 13:23
Copy link
Copy Markdown
Collaborator

@FabianHofmann FabianHofmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good and the error message makes things quite clear

Comment thread test/test_sos_constraints.py Outdated
assert np.isclose(m.objective.value, 5)


def _masked_sos_model(sos_type: int = 1) -> Model:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's make this a fixture

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment thread test/test_sos_constraints.py Outdated
Comment on lines +191 to +208
@pytest.mark.skipif("gurobi" not in available_solvers, reason="Gurobi not installed")
def test_gurobi_direct_raises_on_masked_sos() -> None:
m = _masked_sos_model()
with pytest.raises(NotImplementedError, match="masked"):
m.solve(solver_name="gurobi", io_api="direct")


@pytest.mark.skipif("xpress" not in available_solvers, reason="Xpress not installed")
def test_xpress_direct_raises_on_masked_sos() -> None:
m = _masked_sos_model()
with pytest.raises(NotImplementedError, match="masked"):
m.solve(solver_name="xpress", io_api="direct")


def test_lp_writer_raises_on_masked_sos(tmp_path: Path) -> None:
m = _masked_sos_model()
with pytest.raises(NotImplementedError, match="masked"):
m.to_file(tmp_path / "sos.lp", io_api="lp")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pack that into a parameter matrix

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Base automatically changed from xpress-direct to master May 18, 2026 14:07
FBumann and others added 6 commits May 18, 2026 16:19
Addresses review feedback on #689:
- _masked_sos_model() helper -> masked_sos_model pytest fixture
- collapse three test_*_raises_on_masked_sos tests into one
  test_masked_sos_raises parametrized over (gurobi-direct, xpress-direct, lp-writer)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace single parametrized test using lambda triggers with:
- test_direct_api_raises_on_masked_sos: parametrized over solver_name
- test_lp_writer_raises_on_masked_sos: dedicated for the to_file path

The solve and to_file entry points hit distinct guards (Solver._build vs
sos_to_file), so a single matrix obscured what was being exercised.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace hard-coded gurobi/xpress + skipif marks with a parametrize list
built from SolverFeature.SOS_CONSTRAINTS ∩ DIRECT_API, matching the
pattern already used in test_piecewise_constraints.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolve conflicts in linopy/solvers.py and test/test_sos_constraints.py
in favor of HEAD: the centralized Solver._build / to_file guard
(_check_sos_unmasked) rejects masked SOS up front, so master's per-solver
mask filter in the Xpress build path is unreachable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Deduplicate the shared 10-line model construction between
test_lp_per_entity_nan_padding and test_sos2_per_entity_nan_padding_errors
into a module-level nan_padded_pwl_model factory fixture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… fix

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@FBumann
Copy link
Copy Markdown
Collaborator Author

FBumann commented May 18, 2026

@FabianHofmann ready for approval

@FBumann FBumann added this to the v0.8.0 milestone May 18, 2026
@FabianHofmann FabianHofmann merged commit deb6901 into master May 18, 2026
22 checks passed
@FabianHofmann FabianHofmann deleted the fix/sos-masked-variables branch May 18, 2026 15:51
FBumann added a commit that referenced this pull request May 18, 2026
Add test/test_sos_masked.py covering SOS-with-masked-variables across:

- 1D and 2D SOS variables
- All mask placements (none, sos_dim, non_sos_dim, both_dims) plus an
  optional mask on an unrelated variable (the angle that exposes the
  label-vs-position bug elsewhere in the model)
- SOS1 and SOS2
- All SOS-capable solvers × {direct, lp} io_apis

Asymmetric objective coefficients [-1,-2,-3,-4] along the SOS dim break
permutation symmetry. The fixture's analytical optimizer (matching
list-position adjacency for SOS2) supplies an exact (objective,
per-slot solution) oracle so wrong indexing surfaces as a visible
divergence, not a permutation-equivalent silent pass.

Three-layer oracle per test:
  1. status == "ok" (catches OOB raises and LP parser rejections)
  2. objective.value == expected (catches silent-wrong-answer when the
     bug constrains the wrong columns)
  3. element-wise solution.values == expected (catches the rare
     permutation-equivalent case where objective happens to match)

Also drop the defensive Model._check_sos_unmasked workaround from
master (#689) along with its call sites in Solver._build and to_file.
With this commit alone the tests run and surface the real bug;
subsequent commits in this PR apply the actual fixes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
FBumann added a commit that referenced this pull request May 19, 2026
Add test/test_sos_masked.py covering SOS-with-masked-variables across:

- 1D and 2D SOS variables
- All mask placements (none, sos_dim, non_sos_dim, both_dims) plus an
  optional mask on an unrelated variable (the angle that exposes the
  label-vs-position bug elsewhere in the model)
- SOS1 and SOS2
- All SOS-capable solvers × {direct, lp} io_apis

Asymmetric objective coefficients [-1,-2,-3,-4] along the SOS dim break
permutation symmetry. The fixture's analytical optimizer (matching
list-position adjacency for SOS2) supplies an exact (objective,
per-slot solution) oracle so wrong indexing surfaces as a visible
divergence, not a permutation-equivalent silent pass.

Three-layer oracle per test:
  1. status == "ok" (catches OOB raises and LP parser rejections)
  2. objective.value == expected (catches silent-wrong-answer when the
     bug constrains the wrong columns)
  3. element-wise solution.values == expected (catches the rare
     permutation-equivalent case where objective happens to match)

Also drop the defensive Model._check_sos_unmasked workaround from
master (#689) along with its call sites in Solver._build and to_file.
With this commit alone the tests run and surface the real bug;
subsequent commits in this PR apply the actual fixes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
FBumann added a commit that referenced this pull request May 20, 2026
* test(sos): regression matrix for masked SOS variables (#688)

Add test/test_sos_masked.py covering SOS-with-masked-variables across:

- 1D and 2D SOS variables
- All mask placements (none, sos_dim, non_sos_dim, both_dims) plus an
  optional mask on an unrelated variable (the angle that exposes the
  label-vs-position bug elsewhere in the model)
- SOS1 and SOS2
- All SOS-capable solvers × {direct, lp} io_apis

Asymmetric objective coefficients [-1,-2,-3,-4] along the SOS dim break
permutation symmetry. The fixture's analytical optimizer (matching
list-position adjacency for SOS2) supplies an exact (objective,
per-slot solution) oracle so wrong indexing surfaces as a visible
divergence, not a permutation-equivalent silent pass.

Three-layer oracle per test:
  1. status == "ok" (catches OOB raises and LP parser rejections)
  2. objective.value == expected (catches silent-wrong-answer when the
     bug constrains the wrong columns)
  3. element-wise solution.values == expected (catches the rare
     permutation-equivalent case where objective happens to match)

Also drop the defensive Model._check_sos_unmasked workaround from
master (#689) along with its call sites in Solver._build and to_file.
With this commit alone the tests run and surface the real bug;
subsequent commits in this PR apply the actual fixes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sos): resolve linopy labels to solver column positions in direct builds

Both Gurobi and Xpress direct API builds were passing linopy variable
labels straight to vendor `addSOS` as if they were 0-based column
positions in the solver's variable array. Labels equal positions only
when no variable in the model is masked anywhere — with any mask, the
SOS members get attached to the wrong (or out-of-range) columns.

Introduce a shared `_sos_set_positions(labels, weights, label_to_pos)`
helper that drops masked entries (label = -1) along with their weights
and maps the survivors to active-variable positions via
`model.variables.label_index.label_to_pos`. Both `Gurobi._build_solver_model`
and `Xpress._build_solver_model` route through it.

Empty SOS sets (all members masked, e.g. when a non-SOS dim mask
removes an entire set) are now skipped instead of producing an empty
vendor call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sos): filter masked SOS members + robust set id in LP writer

The LP writer was emitting variable names from the raw label array. For
a masked SOS member (label = -1) this produced `x-1`, which LP parsers
either reject outright (cplex, xpress, mosek) or silently corrupt into
a wrong SOS set (gurobi LP reader). The set identifier was computed as
`labels.isel(sos_dim=0)`, which is -1 whenever the first slot of the
set is masked — both an invalid LP identifier and one that collides
across distinct sets that all happen to have a masked first slot.

Both are fixed in `sos_to_file`:

- Per-set id is now the `max` of non-masked labels along the SOS dim,
  which is always a valid label (any of the set's surviving members
  would do; max is the simplest stable choice). Fully-masked sets get
  id -1 and are filtered out before emission.
- Masked member rows are dropped from the polars dataframe before the
  `group_by` aggregation, so masked members never reach the file.

The fix preserves the LP path's polars vectorization (no switch to
Python-level iteration) and stays separate from the direct-API helper
— the two paths have identical-shape fixes implemented in each path's
natural idiom.

Also drop the now-obsolete `test_direct_api_raises_on_masked_sos` /
`test_lp_writer_raises_on_masked_sos` tests in `test_sos_constraints`:
they were guarding the defensive `_check_sos_unmasked` workaround that
this PR replaces with the real fix. The same coverage (and much more)
is now in `test_sos_masked.py`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(pwl): convert masked-sos2 NaN-padding test to positive assertion

The companion sos2 test was guarding the defensive _check_sos_unmasked
workaround — asserting that piecewise-with-sos2 NaN-padded breakpoints
raised NotImplementedError. With the real #688 fix landed, masked SOS
lambdas now flow through both direct-API and LP-writer paths
correctly, so the sos2 piecewise solve produces the same answer as the
lp variant (y_b = 12.5 on the chord (5,10)→(15,15)).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(sos): move masked-reformulation test to test_sos_reformulation

The "reformulation + masked SOS" integration test lived in
test_sos_constraints alongside SOS-validation tests but its concern is
the reformulation pipeline (the apply/undo path on a masked input),
not the SOS constraint itself. Move it to TestSolveWithReformulation
in test_sos_reformulation.py where the other reformulation tests live.

Also drop the "documented workaround" framing in the docstring — with
#688 fixed, reformulation isn't a workaround for masked SOS anymore,
just one of two ways to solve such models.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(sos): extract _iter_sos_sets to deduplicate direct-API plumbing

The SOS handling in Gurobi._build_solver_model and
Xpress._build_solver_model was ~22 lines each, ~95% identical: same
iteration over model.variables.sos, same 1D vs multi-dim stack/groupby
branching, same masked-entry filter, same label→position resolution.
Only the vendor addSOS call differed.

Extract _iter_sos_sets(model) yielding (sos_type, positions, weights)
per active SOS set. Each subclass collapses to a 2-line loop with its
vendor call. The shared bug-prone parts (multi-dim groupby, mask
filtering, label resolution) live in one place; a future SOS-capable
direct-API solver (#683, OETC) plugs in with one addSOS line.

Also replace `# type: ignore[assignment]` on sos_type/sos_dim with
runtime int()/str() casts. The casts both document the actual runtime
types and guard against future changes to what gets stored in
var.attrs, with no measurable cost. The remaining type: ignore is on
the int() call itself (mypy is overly conservative about xarray's
Hashable-typed attrs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(pwl): parametrize masked-sos2 NaN-padding across solver × io_api

Default ``m.solve()`` only hit gurobi-lp, which silently drops unknown
``x-1`` members and produced the right answer on master by accident.
The bare assertion couldn't catch #688 on cplex-lp (parse error on
``x-1``), gurobi-direct (label-vs-position OOB), or xpress (pre-direct-
API SOS refusal) — only the full solver × io_api matrix surfaces those.

Empirically on master: 4/5 cells fail; only gurobi-lp passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(sos): numpy-only _iter_sos_sets, inline _sos_set_positions

Replace the xarray stack/groupby/unstack pipeline with a transpose +
reshape that iterates SOS sets as numpy columns, and inline the
single-use _sos_set_positions helper. _iter_sos_sets now yields numpy
arrays for positions and weights instead of round-tripping through
Python lists. Trim the LP-writer SOS comments to one concise line each.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sos): pass Python lists to Xpress addSOS

Xpress's addSOS requires list arguments; the numpy-only _iter_sos_sets
yields ndarrays, which Xpress rejects with "SOS indices must be a list
of variables". Convert positions and weights at the vendor-call
boundary, keeping _iter_sos_sets numpy-internal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(sos): convert numpy args to lists at Gurobi addSOS boundary

The numpy-only _iter_sos_sets yields ndarrays; gurobipy's stubs reject
an ndarray both as an MVar index and as addSOS weights. Convert at the
vendor-call boundary, matching the Xpress fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): annotate SOS fixture vars to satisfy mypy

mypy . (run in CI over test/) flagged two inference conflicts in the
SOS fixture: active_per_j mixes int and None keys, and expected_sol is
assigned both 1D and 2D arrays. Add explicit annotations so the newer
numpy shape-typed stubs accept both branches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(sos): cover empty-slice skip in LP writer

Add a test with a fully-masked SOS variable so sos_to_file hits the
`if df.is_empty(): continue` guard — the one uncovered patch line
codecov flagged. Verified the branch is covered (io.py coverage now
spans the guard).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants