diff --git a/CHANGELOG.md b/CHANGELOG.md
index a980cddc..4fa7d9de 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -22,6 +22,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   `"survey_tsl"` `vcov_type`, and a Survey Design block in `summary()`. The non-survey path is
   byte-for-byte unchanged. Validated against `survey::svyglm` on the stacked long difference
   (numeric golden parity is the D2 follow-up).
+- **`TROP` non-absorbing (on/off) treatment support** (Athey, Imbens, Qu & Viviano 2025,
+  §2.1 / Eq. 12 / Algorithm 2). New `non_absorbing` parameter (default `False`). The paper
+  supports general assignment patterns ("units moving into and out of treatment"), not only
+  absorbing/staggered adoption; `TROP(non_absorbing=True)` (`method='local'` only) now
+  accepts treatment that switches on and off, imputing each treated cell's counterfactual via
+  the paper's `(1-W)` masking. The default `non_absorbing=False` is unchanged and still
+  rejects non-monotonic D with a `ValueError` (now also pointing to the opt-in), guarding
+  against the common mistake of encoding absorbing treatment as an event-style spike. This
+  *removes a prior implementation over-restriction* (the estimator was stricter than the
+  paper) rather than adding a deviation. `method='global'` keeps its block-assignment
+  requirement and rejects `non_absorbing=True`. A one-time `UserWarning` is emitted noting
+  that validity relies on the no-dynamic-effects assumption and that the triple-robustness
+  guarantee (Theorem 5.1) is proven only under block assignment. The Rust local LOOCV and
+  point-estimate paths were already mask-driven and unchanged (Rust/Python ATT parity is
+  regression-tested); the non-absorbing **bootstrap** is routed to the Python path, because
+  the Rust resampler lacks the no-weighted-control-support guard and can return a degenerate
+  ~0 SE on an empty control stratum. Treated cells with no weighted control support (e.g. an
+  always-treated unit under `lambda_unit>0`) are materialized as NaN and excluded from the
+  ATT (the library non-estimable->NaN convention), with a `UserWarning`.
 - **`LPDiD` non-absorbing R-parity validation** (Phase C2). Pins both non-absorbing modes
   against an independent `fixest::feols` reconstruction of the paper's Eq. 12 (`first_entry`)
   and Eq. 13 (`effect_stabilization`) clean-sample restrictions: variance-weighted point and
diff --git a/METHODOLOGY_REVIEW.md b/METHODOLOGY_REVIEW.md
index 08b1fc2a..0dc56f17 100644
--- a/METHODOLOGY_REVIEW.md
+++ b/METHODOLOGY_REVIEW.md
@@ -880,7 +880,7 @@ These three are feature deferrals (paper-supported extensions that the library h
 | Status | **Complete** (paper `method="local"`, version-pinned to arXiv v2 — see Version Pinning below) |
 | Last Review | 2026-05-24 |
 
-**Version Pinning:** This methodology promotion is anchored on **arXiv:2508.21536v2** (the version covered by the paper review on file at `docs/methodology/papers/athey-2025-review.md`). The current arXiv version is **v3** (submitted 2026-02-09). A formal v2→v3 source delta-check against the v3 PDF has **NOT** been performed for any of the sections this PR promotes (Eqs. 2-3, Algorithms 1-3, Section 2.2, Section 5.2-5.3, Section 6.1-6.2, Theorem 5.1, Corollary 1, Appendix Theorem 8.1). **Action item:** before the next paper-author reference implementation or substantive v3 release, refresh the paper review against the most recent arXiv version and re-validate the verified-component checklist; until then the promotion stays v2-anchored.
+**Version Pinning:** This methodology promotion is anchored on **arXiv:2508.21536v2** (the version covered by the paper review on file at `docs/methodology/papers/athey-2025-review.md`). The current arXiv version is **v3** (submitted 2026-02-09). The **v3 PDF was consulted for the treatment-assignment-pattern sections** during the non-absorbing support work (§2.1, §2.2 Eq. 2, §6.1 Eq. 12 / Algorithm 2, Assumption 1(i), Theorem 5.1), confirming the general-assignment scope behind `TROP(non_absorbing=True)`. A formal v2→v3 source delta-check across the remaining promoted sections (Eqs. 2-3, Algorithms 1-3, Section 2.2, Section 5.2-5.3, Section 6.1-6.2, Theorem 5.1, Corollary 1, Appendix Theorem 8.1) has **NOT** been performed in full. **Action item:** before the next paper-author reference implementation or substantive v3 release, refresh the paper review against the most recent arXiv version and re-validate the verified-component checklist; until then the promotion stays v2-anchored.
 
 **Scope:** This methodology promotion covers the paper-aligned `method="local"` path (paper Algorithm 2: per-(i, t) estimation with observation-specific weights). The library also exposes `method="global"`, documented in `REGISTRY.md` as a "computationally efficient adaptation using the (1-W) masking principle from Eq. 2" — a library-side adaptation, NOT the paper's full Algorithm 2 estimator. Defensive coverage of the global method lives in `tests/test_trop.py::TestTROPGlobalMethod` (704 lines, ~30 tests for the global-method-specific surface) and is not duplicated in the methodology walk-through. Methodology promotion of `method="global"` as a primary surface would require either (a) a paper-side derivation of the global adaptation's equivalence to Algorithm 2 under specific conditions, or (b) a separate library-extension methodology review; both are deferred.
 
@@ -892,15 +892,15 @@ These three are feature deferrals (paper-supported extensions that the library h
 - [x] Corollary 1 (paper p. 23) — **single-draw sanity checks consistent with the three unbiasedness conditions, not a repeated-MC mean-bias study**: each of the three balance conditions (a) unit balance, (b) time balance, (c) ``B = 0`` is exercised on a targeted DGP that makes one condition trivially hold while keeping the others sub-optimal. The assertion in each case is a single-realisation ``|att - τ| < 3 * se`` band using the estimator's own bootstrap SE — this is a smoke check, NOT a repeated-draw Monte Carlo bias study of the paper's conditional-unbiasedness statement under fixed weights. A stronger MC bias study at fixed λ values is deferred (would multiply test runtime by ~30x for marginal additional evidence given the existing 3-σ band already catches order-of-magnitude bias regressions).
 - [x] Theorem 5.1 (paper p. 23) — **simulation sanity check, not a direct theorem lock**: the paper's bias bound ``|E[τ_hat - τ | L]| <= ||Δ_u|| · ||Δ_t|| · ||B||_*`` is stated for FIXED, non-data-dependent weights. The library's TROP fit uses data-dependent LOOCV-tuned λ values, so the direct conditional bias bound is not tested here. Instead, the methodology test verifies the bound's empirical realisation: TROP RMSE strictly below DID RMSE under a confounded factor DGP with ``true τ = 0`` (calibration measurement: TROP/DID RMSE ratio ≈ 0.34 at ``factor_strength = 1.0``). The direct fixed-weight bound test is deferred — would require exposing oracle Γ / Λ / B from a paper-aligned DGP and computing each component of the bound from instrumented internals.
 - [x] Section 2.2 special-case reductions: **DiD benchmark sanity check** (not a direct algebraic-equivalence proof) — on a no-interactive-FE multi-period panel (additive unit + time effects only, no factor structure), TROP with ``λ_nn = ∞`` + uniform weights produces an ATT within 0.5 of `DifferenceInDifferences` fitted as `outcome ~ treat * post_flag` (basic 2×2 design with `[const, D, T, D×T]`, extended to repeated observations within each treat×post cell). This is **empirical numerical agreement on a friendly DGP**, NOT a proof of the paper Section 2.2 algebraic reduction (which would require either a true 2-period block-assignment panel where the basic-DiD comparator is the algebraic target, or a comparison against `TwoWayFixedEffects` — both deferred). **Matrix Completion code path exercised, not equivalence-checked** — TROP with uniform weights + finite ``λ_nn`` engages the nuclear-norm prox solver (effective_rank > 0) and recovers ATT better than the DiD-style baseline on a factor-confounded DGP; this verifies the code path activates but does NOT prove equivalence with an independent MC reference implementation (which would require either an external MC port or a hand-written reference solver). SC / SDID reductions deferred — see "Outstanding Concerns".
-- [x] Eq. 13 + Algorithm 2 per-(i, t) estimation: ``treatment_effects`` dict contains one finite ``τ_hat_it`` per treated cell; the aggregate ATT equals the unweighted mean of per-cell effects (Eq. 1). **Tests cover block adoption with a constant treatment effect**; **absorbing-state staggered adoption** and **heterogeneous per-cell effects** (paper Remark 6.1) are SUPPORTED by the code path but not directly verified in this methodology surface. **Section 6.1 non-absorbing / on-off / switching assignment patterns are explicitly OUT OF SCOPE** — the implementation rejects non-absorbing D-matrices via `trop_local.py` absorbing-state validation, and the methodology test enforces the rejection contract via `TestTROPDeviations::test_event_style_d_rejected_with_value_error` (event-style D being one specific non-absorbing pattern; the same absorbing-state validator catches all 1→0 transitions). Cross-coverage of the staggered-cohort fit path is `tests/test_methodology_trop.py::TestTROPAlgorithm1LOOCV::test_control_set_includes_pretreat_of_eventually_treated`.
+- [x] Eq. 13 + Algorithm 2 per-(i, t) estimation: ``treatment_effects`` dict contains one ``τ_hat_it`` entry per treated cell. The entry is finite for estimable cells and NaN for a missing outcome or for a cell whose unit/time fixed effect ``alpha_i + beta_t`` is unidentified by the two-way-FE control fit, i.e. the target unit and target period are not in the same connected component of the observed-control graph. Examples: an always-treated unit for any ``lambda_unit``, a fully-treated period, or disconnected control support under ``non_absorbing``; or an unbalanced absorbing panel with entirely-missing unit/period controls. The guard is applied to all local fits, not only ``non_absorbing``, and the bootstrap is forced onto the guarded Python path when trimming occurs. The aggregate ATT equals the unweighted mean of the finite per-cell effects (Eq. 1). Trimming non-estimable cells to NaN matches the library-wide non-estimable→NaN convention and is documented in REGISTRY ## TROP "non-absorbing non-estimable-cell trimming" Note; locked by `TestTROPDeviations::test_non_absorbing_always_treated_unit_not_raw_outcome` and `test_non_absorbing_fully_treated_period_not_estimable`. **Tests cover block adoption with a constant treatment effect**; **absorbing-state staggered adoption** and **heterogeneous per-cell effects** (paper Remark 6.1) are SUPPORTED by the code path but not directly verified for those specific patterns. **Section 6.1 non-absorbing / on-off / switching assignment patterns are SUPPORTED via the opt-in `TROP(non_absorbing=True)` (`method='local'` only)**, matching the paper's general-assignment scope (§2.1; Eq. 12 / Algorithm 2). This *removes* a prior implementation over-restriction (the shipped estimator was stricter than the paper) rather than adding a deviation. The default `non_absorbing=False` still rejects non-monotonic D as a defensive guard; recovery on a no-dynamic-effects toggling DGP + the caveat warning are locked by `TestTROPDeviations::test_non_absorbing_general_assignment_supported`, and the default-mode rejection contract by `TestTROPDeviations::test_event_style_d_rejected_with_value_error`. Inference caveat: Theorem 5.1's triple-robustness guarantee is proven under Assumption 1(i) block assignment only (see REGISTRY ## TROP Notes). Cross-coverage of the staggered-cohort fit path is `tests/test_methodology_trop.py::TestTROPAlgorithm1LOOCV::test_control_set_includes_pretreat_of_eventually_treated`.
 - [x] Algorithm 3 stratified pairs bootstrap: under an unbalanced (3 treated, 17 control) panel, the stratified sampler reliably produces ≥ 67% successful bootstrap draws and a positive finite SE.
 - [x] Section 3 / Eq. 6 semi-synthetic factor DGP: five recovery tests verify limiting-case uniform weights, unit-weight bias reduction, time-weight bias reduction, factor-model bias reduction with effective_rank > 0, and null-DGP recovery centred near zero.
 - [x] safe_inference contract: confidence interval uses the t-distribution with df = max(1, n_treated_obs - 1), consistent with p_value (matches REGISTRY `## TROP` "Inference CI distribution" note, post safe_inference migration).
 
 **Test Coverage:**
 
-- 36 methodology tests (10 classes) in `tests/test_methodology_trop.py`.
-- Defensive guards (107 tests in `tests/test_trop.py`): D-matrix absorbing-state validation, silent-warning audit, FISTA convergence warnings, bootstrap-failure-rate proportional warning, bootstrap NaN-SE propagation, module-split smoke tests.
+- 39 methodology tests (10 classes) in `tests/test_methodology_trop.py` (includes non-absorbing opt-in recovery + caveat-warning + default-mode no-warning + unbalanced×non-absorbing).
+- Defensive guards (117 tests in `tests/test_trop.py`): D-matrix absorbing-state validation, non-absorbing opt-in acceptance / local-only guard / params round-trip / Rust-Python parity, silent-warning audit, FISTA convergence warnings, bootstrap-failure-rate proportional warning, bootstrap NaN-SE propagation, module-split smoke tests.
 
 **Deviations from paper:**
 
diff --git a/README.md b/README.md
index d237b050..651f0edb 100644
--- a/README.md
+++ b/README.md
@@ -102,7 +102,7 @@ Full guide: `diff_diff.get_llm_guide("practitioner")`.
 - [TwoWayFixedEffects](https://diff-diff.readthedocs.io/en/stable/api/estimators.html) - panel data DiD with unit and time fixed effects via within-transformation or dummies
 - [MultiPeriodDiD](https://diff-diff.readthedocs.io/en/stable/api/estimators.html) - event study design with period-specific treatment effects for dynamic analysis
 - [CallawaySantAnna](https://diff-diff.readthedocs.io/en/stable/api/staggered.html) - Callaway & Sant'Anna (2021) group-time ATT estimator for staggered adoption
-- [ChaisemartinDHaultfoeuille](https://diff-diff.readthedocs.io/en/stable/api/chaisemartin_dhaultfoeuille.html) - de Chaisemartin & D'Haultfœuille (2020/2022) for **reversible (non-absorbing) treatments** with multi-horizon event study, normalized effects, cost-benefit delta, sup-t bands, and dynamic placebos. The only library option for treatments that switch on AND off. Alias `DCDH`.
+- [ChaisemartinDHaultfoeuille](https://diff-diff.readthedocs.io/en/stable/api/chaisemartin_dhaultfoeuille.html) - de Chaisemartin & D'Haultfœuille (2020/2022) for **reversible (non-absorbing) treatments** with multi-horizon event study, normalized effects, cost-benefit delta, sup-t bands, and dynamic placebos. The most general option for treatments that switch on AND off (see also `LPDiD`/`TROP` `non_absorbing`). Alias `DCDH`.
 - [SunAbraham](https://diff-diff.readthedocs.io/en/stable/api/staggered.html) - Sun & Abraham (2021) interaction-weighted estimator for heterogeneity-robust event studies
 - [ImputationDiD](https://diff-diff.readthedocs.io/en/stable/api/imputation.html) - Borusyak, Jaravel & Spiess (2024) imputation estimator, most efficient under homogeneous effects
 - [TwoStageDiD](https://diff-diff.readthedocs.io/en/stable/api/two_stage.html) - Gardner (2022) two-stage estimator with GMM sandwich variance
diff --git a/diff_diff/chaisemartin_dhaultfoeuille.py b/diff_diff/chaisemartin_dhaultfoeuille.py
index 59d28ce8..1c675c69 100644
--- a/diff_diff/chaisemartin_dhaultfoeuille.py
+++ b/diff_diff/chaisemartin_dhaultfoeuille.py
@@ -1,9 +1,14 @@
 """
 de Chaisemartin-D'Haultfoeuille (dCDH) estimator for reversible-treatment DiD.
 
-The dCDH estimator is the only modern DiD estimator in the diff-diff library
-that handles **non-absorbing (reversible) treatments** — treatment can switch
-on AND off over time. All other staggered estimators in the library
+The dCDH estimator is the most general DiD estimator in the diff-diff library
+for **non-absorbing (reversible) treatments** — treatment can switch on AND off
+over time, switcher vs non-switcher comparisons are its primitive object, and it
+allows dynamic (carryover) effects with explicit joiner/leaver (``DID_+`` /
+``DID_-``) decomposition. ``LPDiD`` (``non_absorbing="first_entry"`` /
+``"effect_stabilization"``) and ``TROP`` (``non_absorbing=True``, under a
+no-dynamic-effects assumption) also accept non-absorbing treatment under stronger
+assumptions. The remaining staggered estimators in the library
 (``CallawaySantAnna``, ``SunAbraham``, ``ImputationDiD``, ``TwoStageDiD``,
 ``EfficientDiD``, ``WooldridgeDiD``) assume treatment is absorbing.
 
@@ -354,9 +359,11 @@ class ChaisemartinDHaultfoeuille(ChaisemartinDHaultfoeuilleBootstrapMixin):
     """
     de Chaisemartin-D'Haultfoeuille (dCDH) estimator.
 
-    The only modern DiD estimator in the library that handles **reversible
-    (non-absorbing) treatments** - treatment may switch on AND off over
-    time. Computes the contemporaneous-switch DiD ``DID_M`` from the
+    The most general library estimator for **reversible (non-absorbing)
+    treatments** - treatment may switch on AND off over time, with explicit
+    joiner/leaver (``DID_+`` / ``DID_-``) decomposition (``LPDiD`` and ``TROP``
+    also support non-absorbing treatment under stronger assumptions; see their
+    ``non_absorbing`` parameters). Computes the contemporaneous-switch DiD ``DID_M`` from the
     AER 2020 paper (equivalently ``DID_1`` at horizon ``l = 1`` of the
     dynamic companion paper, NBER WP 29873) plus the full multi-horizon
     event study ``DID_l`` for ``l = 1..L_max`` via the ``L_max`` parameter
diff --git a/diff_diff/chaisemartin_dhaultfoeuille_results.py b/diff_diff/chaisemartin_dhaultfoeuille_results.py
index eaccbda7..b9372f7d 100644
--- a/diff_diff/chaisemartin_dhaultfoeuille_results.py
+++ b/diff_diff/chaisemartin_dhaultfoeuille_results.py
@@ -4,9 +4,11 @@
 This module contains ``ChaisemartinDHaultfoeuilleResults`` and
 ``DCDHBootstrapResults`` dataclasses produced by the
 ``ChaisemartinDHaultfoeuille`` (alias ``DCDH``) estimator. The dCDH
-estimator is the only modern DiD estimator in the library that handles
-non-absorbing (reversible) treatments. Phase 1 ships the contemporaneous-
-switch case ``DID_M`` (= ``DID_1`` of the dynamic companion paper).
+estimator is the most general library estimator for non-absorbing
+(reversible) treatments (``LPDiD`` and ``TROP`` also support non-absorbing
+treatment under stronger assumptions; see their ``non_absorbing`` parameters).
+Phase 1 ships the contemporaneous-switch case ``DID_M`` (= ``DID_1`` of the
+dynamic companion paper).
 
 References
 ----------
diff --git a/diff_diff/guides/llms-autonomous.txt b/diff_diff/guides/llms-autonomous.txt
index 197dd827..54fb7d2e 100644
--- a/diff_diff/guides/llms-autonomous.txt
+++ b/diff_diff/guides/llms-autonomous.txt
@@ -531,12 +531,21 @@ When `has_never_treated == False`:
 
 When `treatment_type == "binary_non_absorbing"`:
 
-- `ChaisemartinDHaultfoeuille` is the only estimator in the library
-  that treats this natively. Switcher / non-switcher comparisons are
-  its primitive object.
-- Other estimators assume absorbing treatment and will produce
-  estimates whose interpretation is unclear. Do not use them without
-  a well-argued reason.
+- `ChaisemartinDHaultfoeuille` is the most general / default choice and
+  treats this natively. Switcher / non-switcher comparisons are its
+  primitive object; it allows dynamic (carryover) effects and reports
+  joiner/leaver (`DID_+` / `DID_-`) views. Prefer it when effects may
+  persist after treatment turns off.
+- `LPDiD(non_absorbing="first_entry")` or `"effect_stabilization"`
+  (entry-effect estimands) and `TROP(non_absorbing=True, method="local")`
+  (valid under a no-dynamic-effects / no-carryover assumption) also handle
+  non-absorbing treatment, under stronger assumptions. Use TROP's option
+  only when effects are contemporaneous (no carryover).
+- The remaining estimators (`CallawaySantAnna`, `SunAbraham`,
+  `ImputationDiD`, `TwoStageDiD`, `EfficientDiD`, `WooldridgeDiD`) assume
+  absorbing treatment and will produce estimates whose interpretation is
+  unclear on non-absorbing data. Do not use them without a well-argued
+  reason.
 
 ### §4.6 Triple-difference design (DDD)
 
diff --git a/diff_diff/guides/llms-full.txt b/diff_diff/guides/llms-full.txt
index 825f466e..6de24848 100644
--- a/diff_diff/guides/llms-full.txt
+++ b/diff_diff/guides/llms-full.txt
@@ -231,7 +231,7 @@ plot_event_study(results)
 
 ### ChaisemartinDHaultfoeuille
 
-de Chaisemartin & D'Haultfœuille (2020/2022) estimator for **non-absorbing (reversible) treatments**. The only library estimator that handles treatments which can switch on AND off over time. Ships `DID_M` (= `DID_1` at horizon `l = 1`) plus the full multi-horizon event study `DID_l` for `l = 1..L_max` from the dynamic companion paper (NBER WP 29873). Includes normalized estimator `DID^n_l`, cost-benefit aggregate `delta`, dynamic placebos `DID^{pl}_l`, and sup-t simultaneous confidence bands.
+de Chaisemartin & D'Haultfœuille (2020/2022) estimator for **non-absorbing (reversible) treatments**. The most general library estimator for treatments that switch on AND off over time (allows dynamic/carryover effects + joiner/leaver decomposition); `LPDiD` (`non_absorbing="first_entry"`/`"effect_stabilization"`) and `TROP` (`non_absorbing=True`, no-dynamic-effects) also handle non-absorbing treatment under stronger assumptions. Ships `DID_M` (= `DID_1` at horizon `l = 1`) plus the full multi-horizon event study `DID_l` for `l = 1..L_max` from the dynamic companion paper (NBER WP 29873). Includes normalized estimator `DID^n_l`, cost-benefit aggregate `delta`, dynamic placebos `DID^{pl}_l`, and sup-t simultaneous confidence bands.
 
 ```python
 ChaisemartinDHaultfoeuille(
@@ -963,6 +963,7 @@ TROP(
     alpha: float = 0.05,
     n_bootstrap: int = 200,
     seed: int | None = None,
+    non_absorbing: bool = False,               # False: require absorbing D (reject non-monotonic). True: allow on/off treatment (Eq. 12/Alg. 2), method='local' only; emits a caveat warning (Thm 5.1 is block-only).
 )
 ```
 
@@ -972,7 +973,7 @@ TROP(
 trop.fit(
     data: pd.DataFrame,
     outcome: str,
-    treatment: str,                # Absorbing-state treatment indicator (0/1). Must be 0 for all pre-treatment periods and 1 for treatment and post-treatment periods.
+    treatment: str,                # Treatment indicator (0/1). Default (non_absorbing=False): absorbing state -- 0 for all pre-treatment periods, 1 for treatment and post-treatment; non-monotonic D raises ValueError. With non_absorbing=True: any on/off pattern (general assignment).
     unit: str,
     time: str,
 ) -> TROPResults
diff --git a/diff_diff/guides/llms-practitioner.txt b/diff_diff/guides/llms-practitioner.txt
index 274cbe04..2088f6c4 100644
--- a/diff_diff/guides/llms-practitioner.txt
+++ b/diff_diff/guides/llms-practitioner.txt
@@ -213,9 +213,12 @@ Is treatment adoption staggered (multiple cohorts, different timing)?
 |
 |-- Treatment switches ON and OFF (reversible / non-absorbing)?
 |   \-- ChaisemartinDHaultfoeuille (dCDH / alias `DCDH`)
-|       -- Only library estimator for non-absorbing treatments; supports
-|          L_max multi-horizon, dynamic placebos, cost-benefit delta,
-|          HonestDiD, and `survey_design=` (pweight + strata/PSU/FPC via TSL)
+|       -- Most general option for non-absorbing treatments (allows dynamic
+|          effects + joiner/leaver views); supports L_max multi-horizon,
+|          dynamic placebos, cost-benefit delta, HonestDiD, and
+|          `survey_design=` (pweight + strata/PSU/FPC via TSL)
+|       -- Also: LPDiD(non_absorbing="first_entry"/"effect_stabilization")
+|          and TROP(non_absorbing=True), under stronger assumptions (TROP: no dynamic effects)
 |
 |-- Few treated units (< 20)?
 |   \-- SyntheticDiD (SDiD)    -- synthetic control + DiD hybrid
diff --git a/diff_diff/guides/llms.txt b/diff_diff/guides/llms.txt
index 5d81d8f7..f61f5f3d 100644
--- a/diff_diff/guides/llms.txt
+++ b/diff_diff/guides/llms.txt
@@ -54,7 +54,7 @@ Full practitioner guide: call `diff_diff.get_llm_guide("practitioner")`
 - [TwoWayFixedEffects](https://diff-diff.readthedocs.io/en/stable/api/estimators.html): Panel data DiD with unit and time fixed effects via within-transformation or dummies
 - [MultiPeriodDiD](https://diff-diff.readthedocs.io/en/stable/api/estimators.html): Event study design with period-specific treatment effects for dynamic analysis
 - [CallawaySantAnna](https://diff-diff.readthedocs.io/en/stable/api/staggered.html): Callaway & Sant'Anna (2021) group-time ATT estimator for staggered adoption with aggregation
-- [ChaisemartinDHaultfoeuille](https://diff-diff.readthedocs.io/en/stable/api/chaisemartin_dhaultfoeuille.html): de Chaisemartin & D'Haultfœuille (2020/2022) estimator for **reversible (non-absorbing) treatments** with multi-horizon event study (`L_max`), normalized effects, cost-benefit delta, sup-t bands, and dynamic placebos. The only library option for treatments that switch on AND off. Alias `DCDH`.
+- [ChaisemartinDHaultfoeuille](https://diff-diff.readthedocs.io/en/stable/api/chaisemartin_dhaultfoeuille.html): de Chaisemartin & D'Haultfœuille (2020/2022) estimator for **reversible (non-absorbing) treatments** with multi-horizon event study (`L_max`), normalized effects, cost-benefit delta, sup-t bands, and dynamic placebos. The most general option for treatments that switch on AND off (LPDiD/TROP `non_absorbing` also handle non-absorbing treatment under stronger assumptions). Alias `DCDH`.
 - [SunAbraham](https://diff-diff.readthedocs.io/en/stable/api/staggered.html): Sun & Abraham (2021) interaction-weighted estimator for heterogeneity-robust event studies
 - [ImputationDiD](https://diff-diff.readthedocs.io/en/stable/api/imputation.html): Borusyak, Jaravel & Spiess (2024) imputation estimator — most efficient under homogeneous effects
 - [TwoStageDiD](https://diff-diff.readthedocs.io/en/stable/api/two_stage.html): Gardner (2022) two-stage estimator with GMM sandwich variance
@@ -66,7 +66,7 @@ Full practitioner guide: call `diff_diff.get_llm_guide("practitioner")`
 - [HeterogeneousAdoptionDiD](https://diff-diff.readthedocs.io/en/stable/api/had.html): de Chaisemartin, Ciccia, D'Haultfœuille & Knau (2026) for designs where **no unit remains untreated**; local-linear estimator at the dose support boundary returning Weighted Average Slope (WAS) on Design 1' (`d̲=0` / QUG) or `WAS_{d̲}` on Design 1 (`d̲>0`, continuous-near-d̲ or mass-point), with multi-period event-study extension (last-treatment cohort, pointwise CIs). **Panel-only** in this release (repeated cross-sections rejected by the validator). Alias `HAD`.
 - [StackedDiD](https://diff-diff.readthedocs.io/en/stable/api/stacked_did.html): Wing, Freedman & Hollingsworth (2024) stacked DiD with Q-weights and sub-experiments; optional covariate balancing (`balance="entropy"`, Ustyuzhanin 2026)
 - [EfficientDiD](https://diff-diff.readthedocs.io/en/stable/api/efficient_did.html): Chen, Sant'Anna & Xie (2025) efficient DiD with optimal weighting for tighter SEs
-- [TROP](https://diff-diff.readthedocs.io/en/stable/api/trop.html): Triply Robust Panel estimator (Athey et al. 2025) with nuclear norm factor adjustment
+- [TROP](https://diff-diff.readthedocs.io/en/stable/api/trop.html): Triply Robust Panel estimator (Athey et al. 2025) with nuclear norm factor adjustment (absorbing by default; `non_absorbing=True` for on/off treatment, `method='local'` only)
 - [StaggeredTripleDifference](https://diff-diff.readthedocs.io/en/stable/api/staggered.html#staggeredtripledifference): Ortiz-Villavicencio & Sant'Anna (2025) staggered DDD with group-time ATT
 - [WooldridgeDiD](https://diff-diff.readthedocs.io/en/stable/api/wooldridge_etwfe.html): Wooldridge (2023, 2025) ETWFE — saturated OLS, logit/Poisson QMLE (ASF-based ATT). Alias: ETWFE
 - [LPDiD](https://diff-diff.readthedocs.io/en/stable/api/lpdid.html): Dube, Girardi, Jorda & Taylor (2025) Local Projections DiD: per-horizon long-difference event study on clean controls (no negative weighting); variance- or equally-weighted ATT, premean differencing, pooled pre/post, fast. Absorbing by default; non-absorbing (reversible) treatment via `non_absorbing="first_entry"` (Eq. 12) or `"effect_stabilization"` (Eq. 13, window `L`). Complex-survey designs (pweight + stratified-PSU TSL SEs) on the default path via `fit(survey_design=...)`.
diff --git a/diff_diff/trop.py b/diff_diff/trop.py
index 5c60dc15..467e8878 100644
--- a/diff_diff/trop.py
+++ b/diff_diff/trop.py
@@ -31,7 +31,11 @@
     _rust_loocv_grid_search,
 )
 from diff_diff.trop_global import TROPGlobalMixin
-from diff_diff.trop_local import TROPLocalMixin, _setup_trop_data
+from diff_diff.trop_local import (
+    TROPLocalMixin,
+    _setup_trop_data,
+    _treated_cell_is_estimable,
+)
 from diff_diff.trop_results import (
     _LAMBDA_INF,
     _PrecomputedStructures,
@@ -96,6 +100,28 @@ class TROP(TROPLocalMixin, TROPGlobalMixin):
         Number of bootstrap replications for variance estimation. Must be >= 2.
     seed : int, optional
         Random seed for reproducibility.
+    non_absorbing : bool, default=False
+        Treatment-assignment scope for the treatment indicator.
+
+        - ``False`` (default): require an ABSORBING STATE indicator (once
+          treated, always treated). A non-monotonic indicator raises
+          ``ValueError``. This guards against the common mistake of encoding
+          absorbing treatment as an event-style spike (a single D=1 period),
+          which would silently bias the ATT.
+        - ``True``: accept general (on/off) assignment patterns, where treatment
+          may switch on and off, per Athey et al. (2025) Eq. 12 / Algorithm 2.
+          Supported for ``method='local'`` only (``method='global'`` raises).
+          Relies on the paper's no-dynamic-effects (no carryover) assumption; the
+          triple-robustness guarantee (Theorem 5.1) is proven only under block
+          assignment, so a ``UserWarning`` is emitted on fit. The estimand
+          averages the per-cell effects over the **estimable** treated (D=1)
+          cells (Eq. 1): a cell is non-estimable (NaN, excluded) when its unit/time
+          fixed effect ``alpha_i + beta_t`` is unidentified by the control fit --
+          i.e. the target unit and target period are not in the same connected
+          component of the observed-control graph (an always-treated unit, a
+          fully-treated period, or disconnected control support). This matches the
+          library non-estimable->NaN convention (see REGISTRY ## TROP
+          "non-absorbing non-estimable-cell trimming").
 
     Attributes
     ----------
@@ -134,6 +160,7 @@ def __init__(
         alpha: float = 0.05,
         n_bootstrap: int = 200,
         seed: Optional[int] = None,
+        non_absorbing: bool = False,
     ):
         # Validate method parameter
         valid_methods = ("local", "global")
@@ -141,6 +168,14 @@ def __init__(
             raise ValueError(f"method must be one of {valid_methods}, got '{method}'")
         self.method = method
 
+        # Validate non_absorbing flag (must be a plain bool, not a truthy value).
+        # When False (default) TROP requires an absorbing-state treatment indicator;
+        # when True it accepts general (on/off) assignment patterns per Athey et al.
+        # (2025) Eq. 12 / Algorithm 2 -- local method only (see fit()).
+        if not isinstance(non_absorbing, bool):
+            raise ValueError(f"non_absorbing must be a bool, got {type(non_absorbing).__name__}")
+        self.non_absorbing = non_absorbing
+
         # Default grids from paper
         self.lambda_time_grid = lambda_time_grid or [0.0, 0.1, 0.5, 1.0, 2.0, 5.0]
         self.lambda_unit_grid = lambda_unit_grid or [0.0, 0.1, 0.5, 1.0, 2.0, 5.0]
@@ -389,17 +424,21 @@ def fit(
         treatment : str
             Name of the treatment indicator column (0/1).
 
-            IMPORTANT: This should be an ABSORBING STATE indicator, not a
-            treatment timing indicator. For each unit, D=1 for ALL periods
-            during and after treatment:
+            By default (``non_absorbing=False``) this must be an ABSORBING STATE
+            indicator, not a treatment timing indicator. For each unit, D=1 for
+            ALL periods during and after treatment:
 
             - D[t, i] = 0 for all t < g_i (pre-treatment periods)
             - D[t, i] = 1 for all t >= g_i (treatment and post-treatment)
 
             where g_i is the treatment start time for unit i.
 
-            For staggered adoption, different units can have different g_i.
-            The ATT averages over ALL D=1 cells per Equation 1 of the paper.
+            For staggered adoption, different units can have different g_i (this
+            is still absorbing). Set ``non_absorbing=True`` to allow treatment to
+            switch on and off (general assignment, ``method='local'`` only). The
+            ATT averages over the **estimable** D=1 cells per Equation 1 (a cell
+            whose unit/time fixed effect is unidentified by the control fit is
+            NaN and excluded; see ``non_absorbing`` and ``TROPResults``).
         unit : str
             Name of the unit identifier column.
         time : str
@@ -470,8 +509,34 @@ def fit(
 
         # Below is the local method (default)
         _ctx = _setup_trop_data(
-            data, outcome, treatment, unit, time, resolved_survey, survey_design
+            data,
+            outcome,
+            treatment,
+            unit,
+            time,
+            resolved_survey,
+            survey_design,
+            non_absorbing=self.non_absorbing,
         )
+
+        # Non-absorbing (general assignment) is a paper-supported point estimator
+        # (Athey et al. 2025 Eq. 12 / Algorithm 2) but the formal triple-robustness
+        # guarantee (Theorem 5.1) is proven only under block assignment, and the
+        # bootstrap's validity (Algorithm 3) requires a growing number of treated
+        # units. Surface that caveat once per fit so users do not over-read the SE.
+        if self.non_absorbing:
+            warnings.warn(
+                "TROP(non_absorbing=True): treating the panel as a general "
+                "(on/off) assignment pattern per Athey et al. (2025) Eq. 12 / "
+                "Algorithm 2. This relies on the no-dynamic-effects (no carryover) "
+                "assumption. The triple-robustness guarantee (Theorem 5.1) is "
+                "proven only under block assignment, and bootstrap-SE validity "
+                "requires a growing number of treated units -- interpret standard "
+                "errors with care.",
+                UserWarning,
+                stacklevel=2,
+            )
+
         n_units = _ctx["n_units"]
         n_periods = _ctx["n_periods"]
         idx_to_unit = _ctx["idx_to_unit"]
@@ -479,6 +544,7 @@ def fit(
         unit_weight_arr = _ctx["unit_weight_arr"]
         Y = _ctx["Y"]
         D = _ctx["D"]
+        missing_mask = _ctx["missing_mask"]
         n_treated_obs = _ctx["n_treated_obs"]
         treated_unit_idx = _ctx["treated_unit_idx"]
         control_unit_idx = _ctx["control_unit_idx"]
@@ -676,6 +742,7 @@ def fit(
         treated_observations = self._precomputed["treated_observations"]
         nonconverg_tracker: list = []
         n_fits_attempted = 0
+        n_no_support = 0
 
         for t, i in treated_observations:
             unit_id = idx_to_unit[i]
@@ -692,6 +759,25 @@ def fit(
                 Y, D, i, t, lambda_time, lambda_unit, control_unit_idx, n_units, n_periods
             )
 
+            # Guard against a treated cell with no positively-weighted, observed
+            # control support. Under non_absorbing with lambda_unit>0, a unit that
+            # is never observed untreated has inf distance to every donor, so all
+            # unit weights collapse to 0; the model then fits nothing and tau would
+            # silently equal the raw outcome Y_it. Mark such cells non-estimable
+            # (NaN) -- consistent with the missing-outcome NaN convention above --
+            # rather than report a wrong effect. The cell-specific check also
+            # covers lambda_unit=0 (uniform weights still leave an always-treated
+            # unit's alpha_i unidentified) and a fully-treated period (beta_t
+            # unidentified). It is a general correctness guard applied to every
+            # local fit: a no-op when each treated cell's unit and period are
+            # connected through observed control cells (always so on balanced
+            # absorbing panels; it can fail under non_absorbing, or when an
+            # unbalanced panel leaves a unit's or a period's controls missing).
+            if not _treated_cell_is_estimable(control_mask, Y, weight_matrix, i, t):
+                treatment_effects[(unit_id, time_id)] = np.nan
+                n_no_support += 1
+                continue
+
             # Fit model with these weights
             n_fits_attempted += 1
             alpha_hat, beta_hat, L_hat = self._estimate_model(
@@ -717,6 +803,19 @@ def fit(
             beta_estimates.append(beta_hat)
             L_estimates.append(L_hat)
 
+        if n_no_support > 0:
+            warnings.warn(
+                f"{n_no_support} of {n_treated_obs} treated cell(s) are not "
+                f"estimable: the target unit and target period are not connected "
+                f"in the observed-control graph, so the cell's unit/time fixed "
+                f"effect (alpha_i + beta_t) is unidentified (e.g. an always-treated "
+                f"unit, a period in which every unit is treated, or disconnected "
+                f"control support). Their treatment effects are NaN and are "
+                f"excluded from the ATT.",
+                UserWarning,
+                stacklevel=2,
+            )
+
         if nonconverg_tracker:
             warn_if_not_converged(
                 False,
@@ -727,25 +826,42 @@ def fit(
                 self.tol,
             )
 
-        # Count valid treated observations
+        # Count valid (estimable) treated observations. A cell is excluded when
+        # its outcome is NaN/missing or it has no weighted control support (the
+        # latter is additionally surfaced by the no-support warning above).
         n_valid_treated = len(tau_values)
         if n_valid_treated == 0:
-            warnings.warn(
-                "All treated outcomes are NaN/missing. Cannot estimate ATT.",
-                UserWarning,
-            )
+            if n_no_support > 0:
+                warnings.warn(
+                    "No treated cells were estimable (for every treated cell the "
+                    "target unit and target period are not connected in the "
+                    "observed-control graph, leaving alpha_i + beta_t "
+                    "unidentified). Cannot estimate ATT.",
+                    UserWarning,
+                )
+            else:
+                warnings.warn(
+                    "All treated outcomes are NaN/missing. Cannot estimate ATT.",
+                    UserWarning,
+                )
         elif n_valid_treated < n_treated_obs:
             warnings.warn(
-                f"Only {n_valid_treated} of {n_treated_obs} treated outcomes are finite. "
-                "df and n_treated_obs reflect valid observations only.",
+                f"Only {n_valid_treated} of {n_treated_obs} treated cells were "
+                "estimable (finite outcome with weighted control support). "
+                "df and n_treated_obs reflect estimable observations only.",
                 UserWarning,
             )
 
-        # Average ATT (survey-weighted when applicable)
-        if unit_weight_arr is not None and tau_values:
+        # Average ATT (survey-weighted when applicable). Guard the weighted path
+        # against a zero total weight (e.g. the only estimable treated cells all
+        # carry zero survey weight after non-estimable cells are excluded), which
+        # would make np.average raise; fall back to NaN per the inference contract.
+        if unit_weight_arr is not None and tau_values and float(np.sum(tau_weights)) > 0.0:
             att = float(np.average(tau_values, weights=tau_weights))
+        elif tau_values and unit_weight_arr is None:
+            att = float(np.mean(tau_values))
         else:
-            att = np.mean(tau_values) if tau_values else np.nan
+            att = np.nan
 
         # Average parameter estimates for output (representative)
         alpha_hat = np.mean(alpha_estimates, axis=0) if alpha_estimates else np.zeros(n_units)
@@ -775,6 +891,15 @@ def fit(
             survey_design=survey_design,
             unit_weight_arr=unit_weight_arr,
             resolved_survey=resolved_survey,
+            # Force the guarded Python bootstrap (the Rust per-cell tau path lacks
+            # the estimability guard) whenever a resample could need it: (a) the
+            # point fit already trimmed a cell (n_no_support>0); or (b) the panel
+            # is unbalanced (has missing cells) -- a bootstrap resample can then
+            # lose a cell's only control support even if the original fit was
+            # fully estimable, and Rust would contaminate that draw's ATT. Balanced
+            # panels keep the Rust happy path: the stratified resample always
+            # re-draws the control stratum, so support is preserved.
+            force_python=bool(n_no_support > 0 or np.any(missing_mask)),
         )
 
         # Compute test statistics
@@ -811,6 +936,7 @@ def fit(
             n_bootstrap=self.n_bootstrap,
             bootstrap_distribution=bootstrap_dist if len(bootstrap_dist) > 0 else None,
             survey_metadata=survey_metadata,
+            non_absorbing=self.non_absorbing,
         )
 
         self.is_fitted_ = True
@@ -832,6 +958,7 @@ def get_params(self) -> Dict[str, Any]:
             "alpha": self.alpha,
             "n_bootstrap": self.n_bootstrap,
             "seed": self.seed,
+            "non_absorbing": self.non_absorbing,
         }
 
     def set_params(self, **params) -> "TROP":
@@ -839,6 +966,8 @@ def set_params(self, **params) -> "TROP":
         for key, value in params.items():
             if key == "method" and value not in ("local", "global"):
                 raise ValueError(f"method must be one of ('local', 'global'), got '{value}'")
+            if key == "non_absorbing" and not isinstance(value, bool):
+                raise ValueError(f"non_absorbing must be a bool, got {type(value).__name__}")
             if hasattr(self, key):
                 setattr(self, key, value)
             else:
@@ -867,10 +996,12 @@ def trop(
     treatment : str
         Treatment indicator column name (0/1).
 
-        IMPORTANT: This should be an ABSORBING STATE indicator, not a treatment
-        timing indicator. For each unit, D=1 for ALL periods during and after
-        treatment (D[t,i]=0 for t < g_i, D[t,i]=1 for t >= g_i where g_i is
-        the treatment start time for unit i).
+        By default (``non_absorbing=False``) this must be an ABSORBING STATE
+        indicator, not a treatment timing indicator: for each unit, D=1 for ALL
+        periods during and after treatment (D[t,i]=0 for t < g_i, D[t,i]=1 for
+        t >= g_i where g_i is the treatment start time for unit i). Pass
+        ``non_absorbing=True`` (via ``**kwargs``) to accept general on/off
+        assignment patterns (``method='local'`` only); see ``TROP``.
     unit : str
         Unit identifier column name.
     time : str
@@ -878,7 +1009,7 @@ def trop(
     survey_design : SurveyDesign, optional
         Survey design specification. Supports pweight, strata, PSU, and FPC.
     **kwargs
-        Additional arguments passed to TROP constructor.
+        Additional arguments passed to TROP constructor (e.g. ``non_absorbing``).
 
     Returns
     -------
diff --git a/diff_diff/trop_global.py b/diff_diff/trop_global.py
index 4b8b4182..bc2d8d5f 100644
--- a/diff_diff/trop_global.py
+++ b/diff_diff/trop_global.py
@@ -588,7 +588,20 @@ def _fit_global(
         across units, use `method="local"` which computes observation-specific
         weights that naturally handle heterogeneous timing.
         """
-        # Data setup (shared with local method via _setup_trop_data helper).
+        # The global method's post-hoc weighting and bootstrap bake in a
+        # contiguous, simultaneous treated block (see Notes above), which is
+        # incompatible with general on/off assignment. Non-absorbing support is
+        # local-method only (Athey et al. 2025 Eq. 12 / Algorithm 2).
+        if getattr(self, "non_absorbing", False):
+            raise ValueError(
+                "non_absorbing=True requires method='local'; the global method "
+                "requires block (simultaneous) treatment assignment. Use "
+                "TROP(method='local', non_absorbing=True) for on/off treatment."
+            )
+
+        # Data setup (shared with local method via _setup_trop_data helper). The
+        # global path always validates absorbing-state (non_absorbing=False); it
+        # additionally requires simultaneous block adoption (checked below).
         _ctx = _setup_trop_data(
             data, outcome, treatment, unit, time, resolved_survey, survey_design
         )
@@ -835,6 +848,9 @@ def _fit_global(
             n_bootstrap=self.n_bootstrap,
             bootstrap_distribution=bootstrap_dist if len(bootstrap_dist) > 0 else None,
             survey_metadata=survey_metadata,
+            # Global method requires block assignment (non_absorbing=True is
+            # rejected at the top of _fit_global), so this is always absorbing.
+            non_absorbing=False,
         )
 
         self.is_fitted_ = True
diff --git a/diff_diff/trop_local.py b/diff_diff/trop_local.py
index 04c13f8e..263401db 100644
--- a/diff_diff/trop_local.py
+++ b/diff_diff/trop_local.py
@@ -32,6 +32,72 @@
 from diff_diff.utils import warn_if_not_converged
 
 
+def _treated_cell_is_estimable(
+    control_mask: np.ndarray,
+    Y: np.ndarray,
+    weight_matrix: np.ndarray,
+    i: int,
+    t: int,
+) -> bool:
+    """True iff treated cell (i, t)'s counterfactual is identified by the control fit.
+
+    The working model fits unregularized unit and time fixed effects
+    ``alpha_j`` / ``beta_s`` on the weighted observed control cells, then sets
+    ``tau_it = Y_it - alpha_i - beta_t - L_it``. For that difference to be a valid
+    counterfactual rather than a fixed-effect-contaminated raw outcome, the sum
+    ``alpha_i + beta_t`` must be identified by the two-way-FE control fit.
+
+    In a two-way fixed-effect model the effects are pinned only **within each
+    connected component** of the bipartite graph whose nodes are units and
+    periods and whose edges are the positively-weighted observed control cells
+    (``usable = (D==0) & finite(Y) & weight>0``); across components there is a
+    free per-component offset. Hence ``alpha_i + beta_t`` is identified iff the
+    **target unit node and target period node lie in the same component**.
+
+    A marginal "the target unit has some usable control AND the target period has
+    some usable control" test is necessary but NOT sufficient: e.g. usable cells
+    at ``(unitA, t0)`` and ``(unitB, t1)`` with target ``(unitA, t1)`` pass it,
+    yet ``alpha_A + beta_1`` spans two disconnected components and is unidentified.
+    This connected-component check subsumes the simpler degeneracies it replaced:
+    an always-treated unit (empty unit column) or a fully-treated period (empty
+    period row) leaves the corresponding node isolated, hence non-estimable.
+
+    This is a **general correctness guard applied to every local fit** (absorbing
+    and non-absorbing): it NaNs exactly the cells whose ``alpha_i + beta_t`` is
+    unidentified. On balanced absorbing panels (an observed never-treated unit
+    connects every period, and each treated unit joins via its pre-periods) the
+    whole control graph is one component, so the predicate is a no-op (no behavior
+    change). Shared by the final point fit and the bootstrap fixed-lambda refit.
+
+    Cost: a bipartite BFS bounded by the usable-cell count, run per treated cell
+    only when both the unit column and period row are non-empty (the cheap
+    fast-path rejects the common degeneracies first). non_absorbing is opt-in and
+    correctness-first, so the extra work is acceptable.
+    """
+    usable = control_mask & np.isfinite(Y) & (weight_matrix > 0)
+    # Fast path: an empty target column (alpha_i) or row (beta_t) is isolated.
+    if not bool(np.any(usable[:, i])) or not bool(np.any(usable[t, :])):
+        return False
+    # Bipartite reachability from period-node t; estimable iff unit-node i reached.
+    reached_periods = np.zeros(usable.shape[0], dtype=bool)
+    reached_units = np.zeros(usable.shape[1], dtype=bool)
+    reached_periods[t] = True
+    while True:
+        # Units adjacent to any reached period, then periods adjacent to any
+        # reached unit; iterate the bipartite expansion to a fixpoint.
+        new_units = reached_units | np.any(usable[reached_periods, :], axis=0)
+        if np.any(new_units):
+            new_periods = reached_periods | np.any(usable[:, new_units], axis=1)
+        else:
+            new_periods = reached_periods
+        if np.array_equal(new_units, reached_units) and np.array_equal(
+            new_periods, reached_periods
+        ):
+            break
+        reached_units, reached_periods = new_units, new_periods
+    return bool(reached_units[i])
+
+
 def _validate_and_pivot_treatment(data, time, unit, treatment, all_periods, all_units):
     """Validate treatment column and create D matrix with missing mask.
 
@@ -71,7 +137,16 @@ def _validate_and_pivot_treatment(data, time, unit, treatment, all_periods, all_
     return D, missing_mask
 
 
-def _setup_trop_data(data, outcome, treatment, unit, time, resolved_survey, survey_design):
+def _setup_trop_data(
+    data,
+    outcome,
+    treatment,
+    unit,
+    time,
+    resolved_survey,
+    survey_design,
+    non_absorbing: bool = False,
+):
     """Shared data setup for TROP local and global fit paths.
 
     Performs panel pivoting (long → wide), absorbing-state validation,
@@ -79,6 +154,18 @@ def _setup_trop_data(data, outcome, treatment, unit, time, resolved_survey, surv
     and pre/post period counting. Returns a dict so both callers can
     unpack only the fields they need.
 
+    When ``non_absorbing`` is False (default) the treatment indicator must be an
+    absorbing state (monotonic non-decreasing per unit) and a non-monotonic
+    indicator raises ``ValueError``; there must be at least one never-treated
+    unit and at least 2 leading pre-treatment periods. When ``non_absorbing`` is
+    True these absorbing-specific guards are relaxed to support general (on/off)
+    assignment patterns (Athey et al. 2025 Eq. 12 / Algorithm 2): the
+    monotonicity check is skipped, identification falls back to untreated *cells*
+    (rather than requiring whole never-treated units), and the pre-period guard
+    becomes a weaker "at least 2 periods contain untreated cells" check. The
+    global fit path always calls with ``non_absorbing=False`` (it additionally
+    requires simultaneous block adoption).
+
     The global-method-specific staggered-adoption check stays in
     `_fit_global` as a post-helper validation because it depends on
     estimator semantics (global method requires simultaneous treatment),
@@ -128,20 +215,25 @@ def _setup_trop_data(data, outcome, treatment, unit, time, resolved_survey, surv
         data, time, unit, treatment, all_periods, all_units
     )
 
-    violating_units = []
-    for unit_idx in range(n_units):
-        observed_mask = ~missing_mask[:, unit_idx]
-        observed_d = D[observed_mask, unit_idx]
-        if len(observed_d) > 1 and np.any(np.diff(observed_d) < 0):
-            violating_units.append(all_units[unit_idx])
-
-    if violating_units:
-        raise ValueError(
-            f"Treatment indicator is not an absorbing state for units: {violating_units}. "
-            f"D[t, unit] must be monotonic non-decreasing (once treated, always treated). "
-            f"If this is event-study style data, convert to absorbing state: "
-            f"D[t, i] = 1 for all t >= first treatment period."
-        )
+    # Absorbing-state (monotonic non-decreasing) validation. Skipped when the
+    # caller opts into general (on/off) assignment via non_absorbing=True.
+    if not non_absorbing:
+        violating_units = []
+        for unit_idx in range(n_units):
+            observed_mask = ~missing_mask[:, unit_idx]
+            observed_d = D[observed_mask, unit_idx]
+            if len(observed_d) > 1 and np.any(np.diff(observed_d) < 0):
+                violating_units.append(all_units[unit_idx])
+
+        if violating_units:
+            raise ValueError(
+                f"Treatment indicator is not an absorbing state for units: {violating_units}. "
+                f"D[t, unit] must be monotonic non-decreasing (once treated, always treated). "
+                f"If this is event-study style data with absorbing treatment, convert to "
+                f"absorbing state: D[t, i] = 1 for all t >= first treatment period. "
+                f"If treatment genuinely turns on and off (non-absorbing), pass "
+                f"non_absorbing=True (method='local' only; assumes no dynamic effects)."
+            )
 
     treated_mask = D == 1
     n_treated_obs = int(np.sum(treated_mask))
@@ -153,8 +245,28 @@ def _setup_trop_data(data, outcome, treatment, unit, time, resolved_survey, surv
     treated_unit_idx = np.where(unit_ever_treated)[0]
     control_unit_idx = np.where(~unit_ever_treated)[0]
 
-    if len(control_unit_idx) == 0:
-        raise ValueError("No control units found")
+    # Observed untreated cells. Structural panel gaps are filled with D=0
+    # (_validate_and_pivot_treatment), so identification checks under
+    # non_absorbing must exclude those filled cells (and non-finite outcomes):
+    # only an OBSERVED D=0 cell can serve as a control for the (1-W)
+    # counterfactual fit. A raw `D == 0` count would let an all-observed-treated
+    # unbalanced panel pass with no real control outcomes.
+    valid_control_mask = (D == 0) & (~missing_mask) & np.isfinite(Y)
+
+    if non_absorbing:
+        # General assignment identifies off untreated *cells* (the per-(i,t)
+        # estimator masks treated cells via (1-W) and fits the rest), so a fully
+        # toggling panel with no never-treated unit is still identified. Require
+        # at least one observed untreated cell.
+        if not np.any(valid_control_mask):
+            raise ValueError(
+                "No observed untreated (control) observations found; non_absorbing "
+                "TROP needs observed cells with D=0 (not structural panel gaps) to "
+                "impute the counterfactual."
+            )
+    else:
+        if len(control_unit_idx) == 0:
+            raise ValueError("No control units found")
 
     first_treat_period = None
     for t in range(n_periods):
@@ -168,7 +280,20 @@ def _setup_trop_data(data, outcome, treatment, unit, time, resolved_survey, surv
     n_pre_periods = first_treat_period
     n_post_periods = int(np.sum(np.any(D[first_treat_period:, :] == 1, axis=1)))
 
-    if n_pre_periods < 2:
+    if non_absorbing:
+        # "Leading all-control block" is ill-defined when treatment toggles, so
+        # the absorbing n_pre_periods>=2 guard does not apply. Require instead
+        # that at least 2 periods contain an OBSERVED untreated cell (a weak
+        # factor-model identifiability floor); finer donor-pool degeneracy is
+        # handled downstream by the LOOCV empty-control (Q=inf) and inf-distance
+        # guards.
+        n_periods_with_controls = int(np.sum(np.any(valid_control_mask, axis=1)))
+        if n_periods_with_controls < 2:
+            raise ValueError(
+                "Need at least 2 periods containing observed untreated "
+                "observations for non_absorbing TROP."
+            )
+    elif n_pre_periods < 2:
         raise ValueError("Need at least 2 pre-treatment periods")
 
     return {
@@ -1031,10 +1156,17 @@ def _bootstrap_variance(
         survey_design=None,
         unit_weight_arr: Optional[np.ndarray] = None,
         resolved_survey=None,
+        force_python: bool = False,
     ) -> Tuple[float, np.ndarray]:
         """
         Compute bootstrap standard error using unit-level block bootstrap.
 
+        ``force_python=True`` skips the Rust happy path so the cell-specific
+        estimability guard in ``_fit_with_fixed_lambda`` is applied per draw. The
+        point fit sets this when it trimmed a non-estimable treated cell or the
+        panel is unbalanced (the Rust per-cell tau path lacks the guard), keeping
+        the bootstrap SE and the point ATT on the same estimability contract.
+
         When the optional Rust backend is available and the matrix parameters
         (Y, D, control_unit_idx) are provided, uses parallelized Rust
         implementation for 5-15x speedup. Falls back to Python implementation
@@ -1132,13 +1264,25 @@ def _bootstrap_variance(
         )
 
         # Try Rust backend for parallel bootstrap (5-15x speedup)
-        # Only used for pweight-only designs (no strata/PSU/FPC)
+        # Only used for pweight-only designs (no strata/PSU/FPC).
+        # Routed to the Python loop when the cell-specific estimability contract
+        # could diverge from Rust: (a) non_absorbing fits -- a fully non-absorbing
+        # panel can have zero never-treated units (empty control stratum -> Rust
+        # can return a degenerate ~0 SE), and the Rust per-cell tau path lacks the
+        # estimability guard; (b) force_python -- the point fit trimmed at least
+        # one non-estimable treated cell, or the panel is unbalanced so a resample
+        # could lose a cell's control support; the Rust path (no guard) would then
+        # compute SE over a different cell set than the point ATT. The Python
+        # `_fit_with_fixed_lambda` enforces the guard per draw in both cases.
         if (
             HAS_RUST_BACKEND
             and _rust_bootstrap_trop_variance is not None
             and self._precomputed is not None
             and Y is not None
             and D is not None
+            and n_control_units > 0
+            and not getattr(self, "non_absorbing", False)
+            and not force_python
         ):
             try:
                 control_mask = self._precomputed["control_mask"]
@@ -1479,6 +1623,14 @@ def _fit_with_fixed_lambda(
                 Y, D, i, t, lambda_time, lambda_unit, control_unit_idx, n_units, n_periods
             )
 
+            # Skip non-estimable cells (same predicate as the main fit): if the
+            # target unit and target period are not connected through weighted
+            # observed control cells, alpha_i + beta_t is unidentified and tau
+            # leaks the fixed effect, silently biasing the draw's ATT. A draw with
+            # no estimable cell returns NaN and is counted as a failed replicate.
+            if not _treated_cell_is_estimable(control_mask, Y, weight_matrix, i, t):
+                continue
+
             # Fit model with these weights
             alpha, beta, L = self._estimate_model(
                 Y,
@@ -1499,5 +1651,13 @@ def _fit_with_fixed_lambda(
         if not tau_values:
             return float("nan")
         if local_weight_arr is not None:
+            # Guard against a degenerate weighted draw: after non-estimable cells
+            # are skipped, the remaining estimable cells can all carry zero
+            # (rescaled survey / Rao-Wu) weight, which would make np.average raise
+            # ZeroDivisionError. Treat such a draw as failed (NaN) per the
+            # bootstrap NaN-on-degenerate contract.
+            weight_sum = float(np.sum(tau_weights))
+            if not np.isfinite(weight_sum) or weight_sum <= 0.0:
+                return float("nan")
             return float(np.average(tau_values, weights=tau_weights))
         return float(np.mean(tau_values))
diff --git a/diff_diff/trop_results.py b/diff_diff/trop_results.py
index 45e54bd1..d95f09ce 100644
--- a/diff_diff/trop_results.py
+++ b/diff_diff/trop_results.py
@@ -96,7 +96,15 @@ class TROPResults:
     time_effects : dict
         Estimated time fixed effects (beta_t).
     treatment_effects : dict
-        Individual treatment effects for each treated (unit, time) pair.
+        Individual treatment effects for each treated (unit, time) pair. The
+        value is NaN for a cell that is not estimable -- a missing outcome, or a
+        cell whose unit/time fixed effect ``alpha_i + beta_t`` is unidentified by
+        the control fit (the target unit and target period are not in the same
+        connected component of the observed-control graph: an always-treated unit,
+        a fully-treated period, or disconnected control support). This applies to
+        all local TROP fits; it is reachable mainly under ``non_absorbing=True``
+        but also on unbalanced absorbing panels. The reported ATT is the mean over
+        the finite (estimable) cells.
     lambda_time : float
         Selected time weight decay parameter from grid. 0.0 = uniform time
         weights (disabled) per Eq. 3.
@@ -122,6 +130,12 @@ class TROPResults:
         Number of bootstrap replications (if bootstrap variance).
     bootstrap_distribution : np.ndarray, optional
         Bootstrap distribution of estimates.
+    non_absorbing : bool, default=False
+        Treatment-assignment scope used for the fit. False = absorbing-state
+        treatment (default); True = general on/off assignment (``method='local'``
+        only). Recorded so a persisted result retains the assignment-scope and
+        inference-caveat context (Theorem 5.1 is block-only) after the fit-time
+        ``UserWarning`` is gone.
     """
 
     att: float
@@ -149,6 +163,11 @@ class TROPResults:
     bootstrap_distribution: Optional[np.ndarray] = field(default=None, repr=False)
     # Survey design metadata (SurveyMetadata instance from diff_diff.survey)
     survey_metadata: Optional[Any] = field(default=None)
+    # Treatment-assignment scope used for the fit: False = absorbing (default),
+    # True = general on/off assignment (method='local'; Athey et al. 2025 Eq. 12).
+    # Recorded so a persisted result retains the assignment-scope / inference
+    # caveat context after the fit-time UserWarning is gone.
+    non_absorbing: bool = False
 
     def __repr__(self) -> str:
         """Concise string representation."""
@@ -195,10 +214,21 @@ def summary(self, alpha: Optional[float] = None) -> str:
             "",
             f"{'Observations:':<25} {self.n_obs:>10}",
             f"{'Treated units:':<25} {self.n_treated:>10}",
-            f"{'Control units:':<25} {self.n_control:>10}",
+            # Under non-absorbing assignment a unit can be treated in some periods
+            # and untreated in others, so n_control (never-treated units) may be 0
+            # even though many untreated control *cells* exist; label accordingly.
+            (
+                f"{'Never-treated units:':<25} {self.n_control:>10}"
+                if self.non_absorbing
+                else f"{'Control units:':<25} {self.n_control:>10}"
+            ),
             f"{'Treated observations:':<25} {self.n_treated_obs:>10}",
             f"{'Pre-treatment periods:':<25} {self.n_pre_periods:>10}",
             f"{'Post-treatment periods:':<25} {self.n_post_periods:>10}",
+        ]
+        if self.non_absorbing:
+            lines.append(f"{'Assignment scope:':<25} {'non-absorbing (on/off)':>20}")
+        lines += [
             "",
             "-" * 75,
             "Tuning Parameters (selected via LOOCV)".center(75),
@@ -280,6 +310,7 @@ def to_dict(self) -> Dict[str, Any]:
             "lambda_nn": self.lambda_nn,
             "effective_rank": self.effective_rank,
             "loocv_score": self.loocv_score,
+            "non_absorbing": self.non_absorbing,
         }
         if self.survey_metadata is not None:
             sm = self.survey_metadata
diff --git a/docs/api/chaisemartin_dhaultfoeuille.rst b/docs/api/chaisemartin_dhaultfoeuille.rst
index a28e128a..1ee5634d 100644
--- a/docs/api/chaisemartin_dhaultfoeuille.rst
+++ b/docs/api/chaisemartin_dhaultfoeuille.rst
@@ -1,9 +1,11 @@
 de Chaisemartin-D'Haultfœuille (dCDH) DiD
 ============================================
 
-The only modern staggered DiD estimator in diff-diff that handles
-**non-absorbing (reversible) treatments** — treatment may switch on AND
-off over time.
+The most general estimator in diff-diff for **non-absorbing (reversible)
+treatments** — treatment may switch on AND off over time, with explicit
+joiner/leaver decomposition and multi-horizon dynamics. (:class:`~diff_diff.LPDiD`
+and :class:`~diff_diff.TROP` also support non-absorbing treatment under stronger
+assumptions; see their ``non_absorbing`` parameters.)
 
 This module implements the methodology from de Chaisemartin & D'Haultfœuille
 (2020/2022). The estimator ships the contemporaneous-switch path ``DID_M``
@@ -79,12 +81,15 @@ The estimator:
   ``l = 1`` on cell-aggregated input (see REGISTRY.md for documented
   deviations on individual-level inputs with uneven cell sizes)
 
-All other staggered estimators in diff-diff (:class:`~diff_diff.CallawaySantAnna`,
+The remaining staggered estimators in diff-diff (:class:`~diff_diff.CallawaySantAnna`,
 :class:`~diff_diff.SunAbraham`, :class:`~diff_diff.ImputationDiD`,
 :class:`~diff_diff.TwoStageDiD`, :class:`~diff_diff.EfficientDiD`,
 :class:`~diff_diff.WooldridgeDiD`) assume treatment is **absorbing** —
-once treated, stays treated. ``ChaisemartinDHaultfoeuille`` is the only
-library option for non-absorbing treatments.
+once treated, stays treated. ``ChaisemartinDHaultfoeuille`` is the most general
+option for non-absorbing treatments; :class:`~diff_diff.LPDiD`
+(``non_absorbing="first_entry"`` / ``"effect_stabilization"``) and
+:class:`~diff_diff.TROP` (``non_absorbing=True``, under a no-dynamic-effects
+assumption) also support non-absorbing treatment.
 
 **Panel requirements (deviation from R DIDmultiplegtDYN):**
 
diff --git a/docs/api/trop.rst b/docs/api/trop.rst
index 743992a3..0d66e37e 100644
--- a/docs/api/trop.rst
+++ b/docs/api/trop.rst
@@ -168,6 +168,36 @@ Treatment effects are **heterogeneous** per-observation residuals; ATT is their
 Use ``method='local'`` for observation-specific weight optimization.
 Use ``method='global'`` for faster estimation with global weights.
 
+Non-absorbing (on/off) treatment
+--------------------------------
+
+By default TROP requires an **absorbing-state** treatment indicator (once treated,
+always treated) and rejects a non-monotonic indicator with a ``ValueError``. This
+guards against the common mistake of encoding absorbing treatment as an event-style
+spike (a single ``D=1`` period), which would silently bias the ATT.
+
+The paper, however, supports **general assignment patterns** including treatment that
+switches on and off (§2.1: "units moving into and out of treatment"; Eq. 12 /
+Algorithm 2). Enable this with the opt-in ``non_absorbing=True`` (``method='local'``
+only)::
+
+    from diff_diff import TROP
+
+    trop = TROP(method='local', non_absorbing=True)
+    results = trop.fit(data, outcome='y', treatment='treated',
+                       unit='unit_id', time='period')
+
+Caveats (a ``UserWarning`` is emitted on fit):
+
+- Validity relies on the paper's **no-spillover / no-dynamic-effects (no carryover)**
+  assumption.
+- The point estimator (Eq. 12) is general, but the formal **triple-robustness
+  guarantee (Theorem 5.1) is proven only under block assignment**; the bootstrap is
+  offered generally but its validity requires a growing number of treated units, so
+  interpret standard errors with care.
+- ``non_absorbing=True`` is supported for ``method='local'`` only;
+  ``TROP(method='global', non_absorbing=True)`` raises a ``ValueError``.
+
 Example Usage
 -------------
 
@@ -184,8 +214,11 @@ Basic usage::
     )
 
     # Note: TROP infers treatment periods from the treatment indicator column.
-    # The treatment column should be an absorbing state (D=1 for all periods
-    # during and after treatment starts).
+    # By default the treatment column must be an absorbing state (D=1 for all
+    # periods during and after treatment starts); a non-monotonic indicator
+    # raises ValueError. For treatment that genuinely switches on and off,
+    # pass non_absorbing=True (method='local' only) -- see "Non-absorbing
+    # (on/off) treatment" above.
     results = trop.fit(
         data,
         outcome='y',
diff --git a/docs/choosing_estimator.rst b/docs/choosing_estimator.rst
index 69f437b3..c3edcc31 100644
--- a/docs/choosing_estimator.rst
+++ b/docs/choosing_estimator.rst
@@ -26,7 +26,7 @@ Start here and follow the questions:
 2. **Can treatment switch on AND off?** (Reversible / non-absorbing treatment — e.g., marketing campaigns, seasonal promotions, on/off policy cycles)
 
    - **No (treatment is absorbing — once treated, stays treated)** → Go to question 3
-   - **Yes** → Use :class:`~diff_diff.ChaisemartinDHaultfoeuille` — the only library estimator that handles non-absorbing treatments
+   - **Yes** → Use :class:`~diff_diff.ChaisemartinDHaultfoeuille` — the most general option (allows dynamic/carryover effects, with joiner/leaver views). :class:`~diff_diff.LPDiD` (``non_absorbing="first_entry"`` / ``"effect_stabilization"``) and :class:`~diff_diff.TROP` (``non_absorbing=True``, under a no-dynamic-effects assumption) also handle non-absorbing treatment under stronger assumptions
 
 3. **Is treatment staggered?** (Different units treated at different times)
 
@@ -78,7 +78,7 @@ Quick Reference
      - Conditional parallel trends
      - Group-time ATT(g,t), aggregations
    * - ``ChaisemartinDHaultfoeuille``
-     - Reversible / non-absorbing treatments (only library option)
+     - Reversible / non-absorbing treatments (most general; allows dynamic effects)
      - Parallel trends + A5 (no crossing) + A11 (stable controls)
      - DID_l event study (L_max), normalized DID^n_l, cost-benefit delta, placebos, sup-t bands, TWFE diagnostic
    * - ``SyntheticDiD``
@@ -250,8 +250,13 @@ Use :class:`~diff_diff.ChaisemartinDHaultfoeuille` (alias :class:`~diff_diff.DCD
   normalized effects, cost-benefit aggregation, dynamic placebos, and
   sup-t simultaneous confidence bands
 
-This is **the only library estimator that handles non-absorbing treatments**.
-All other staggered estimators
+This is the **most general** library estimator for non-absorbing treatment: it
+allows dynamic (carryover) effects and reports separate joiner/leaver views.
+Two other estimators also accept non-absorbing treatment under stronger
+assumptions: :class:`~diff_diff.LPDiD` (``non_absorbing="first_entry"`` /
+``"effect_stabilization"`` — entry-effect estimands) and :class:`~diff_diff.TROP`
+(``non_absorbing=True``, ``method='local'`` — valid under the paper's
+no-dynamic-effects / no-carryover assumption). The remaining staggered estimators
 (:class:`~diff_diff.CallawaySantAnna`, :class:`~diff_diff.SunAbraham`,
 :class:`~diff_diff.ImputationDiD`, :class:`~diff_diff.TwoStageDiD`,
 :class:`~diff_diff.EfficientDiD`, :class:`~diff_diff.WooldridgeDiD`) assume
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
index da3f06c8..378b3aa2 100644
--- a/docs/methodology/REGISTRY.md
+++ b/docs/methodology/REGISTRY.md
@@ -648,7 +648,7 @@ The multiplier bootstrap uses random weights w_i with E[w]=0 and Var(w)=1:
 - [de Chaisemartin, C. & D'Haultfœuille, X. (2020). Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects. *American Economic Review*, 110(9), 2964-2996.](https://doi.org/10.1257/aer.20181169)
 - [de Chaisemartin, C. & D'Haultfœuille, X. (2022, revised July 2023). Difference-in-Differences Estimators of Intertemporal Treatment Effects. NBER Working Paper 29873.](https://www.nber.org/papers/w29873) — Web Appendix Section 3.7.3 contains the cohort-recentered plug-in variance formula implemented here.
 
-**Phase 1-2 scope:** Ships the contemporaneous-switch estimator `DID_M` (= `DID_1` at horizon `l = 1`) from the AER 2020 paper **plus** the full multi-horizon event study `DID_l` for `l = 1..L_max` from the dynamic companion paper. Phase 2 adds: per-group `DID_{g,l}` building block (Equation 3), dynamic placebos `DID^{pl}_l`, normalized estimator `DID^n_l`, cost-benefit aggregate `delta`, sup-t simultaneous confidence bands, and `plot_event_study()` integration. Phase 3 adds covariate adjustment (`DID^X`), group-specific linear trends (`DID^{fd}`), state-set-specific trends, and HonestDiD integration. Survey design supports pweight with strata/PSU/FPC via Taylor Series Linearization (analytical) or replicate-weight variance (BRR/Fay/JK1/JKn/SDR) across all IF sites, plus opt-in PSU-level Hall-Mammen wild bootstrap via `n_bootstrap > 0` (see the full checklist + Notes below for the contract). **This is the only modern staggered estimator in the library that handles non-absorbing (reversible) treatments** - treatment can switch on AND off over time, making it the natural fit for marketing campaigns, seasonal promotions, on/off policy cycles.
+**Phase 1-2 scope:** Ships the contemporaneous-switch estimator `DID_M` (= `DID_1` at horizon `l = 1`) from the AER 2020 paper **plus** the full multi-horizon event study `DID_l` for `l = 1..L_max` from the dynamic companion paper. Phase 2 adds: per-group `DID_{g,l}` building block (Equation 3), dynamic placebos `DID^{pl}_l`, normalized estimator `DID^n_l`, cost-benefit aggregate `delta`, sup-t simultaneous confidence bands, and `plot_event_study()` integration. Phase 3 adds covariate adjustment (`DID^X`), group-specific linear trends (`DID^{fd}`), state-set-specific trends, and HonestDiD integration. Survey design supports pweight with strata/PSU/FPC via Taylor Series Linearization (analytical) or replicate-weight variance (BRR/Fay/JK1/JKn/SDR) across all IF sites, plus opt-in PSU-level Hall-Mammen wild bootstrap via `n_bootstrap > 0` (see the full checklist + Notes below for the contract). **This is the most general library estimator for non-absorbing (reversible) treatments** - treatment can switch on AND off over time, switcher vs non-switcher is its primitive object, and it allows dynamic (carryover) effects with explicit joiner/leaver (`DID_+` / `DID_-`) decomposition - making it the natural fit for marketing campaigns, seasonal promotions, on/off policy cycles. (`LPDiD` with `non_absorbing="first_entry"` / `"effect_stabilization"` and `TROP` with `non_absorbing=True` under a no-dynamic-effects assumption also accept non-absorbing treatment under stronger assumptions.)
 
 **Key implementation requirements:**
 
@@ -2593,7 +2593,7 @@ confidence bands (sup-t) for event study.
 
 **Primary source:** [Athey, S., Imbens, G.W., Qu, Z., & Viviano, D. (2025). Triply Robust Panel Estimators. arXiv:2508.21536.](https://arxiv.org/abs/2508.21536)
 
-**Note (version pinning):** the methodology promotion (`METHODOLOGY_REVIEW.md` `#### TROP` → **Complete** as of 2026-05-24) is anchored on **arXiv:2508.21536v2**; the current arXiv version is **v3**. A formal v2→v3 source delta-check against the v3 PDF has NOT been performed for any of the sections covered by the promotion (Eqs. 2-3, Algorithms 1-3, Section 2.2, Section 5.2-5.3, Section 6.1-6.2, Theorem 5.1, Corollary 1, Appendix Theorem 8.1). See `docs/methodology/papers/athey-2025-review.md` "Version-pinning note" for the deferred action item.
+**Note (version pinning):** the methodology promotion (`METHODOLOGY_REVIEW.md` `#### TROP` → **Complete** as of 2026-05-24) is anchored on **arXiv:2508.21536v2**; the current arXiv version is **v3**. The **v3 PDF was consulted for the treatment-assignment-pattern sections** as part of the non-absorbing support work (§2.1 general assignment / "units moving into and out of treatment"; §2.2 Eq. 2 masking; §6.1 Eq. 12 / Algorithm 2; Assumption 1(i); Theorem 5.1) and confirms the general-assignment scope used here. A full v2→v3 source delta-check across all promoted sections (Eqs. 2-3, Algorithms 1-3, Section 2.2, Section 5.2-5.3, Section 6.1-6.2, Theorem 5.1, Corollary 1, Appendix Theorem 8.1) is still **deferred**. See `docs/methodology/papers/athey-2025-review.md` "Version-pinning note" for the deferred action item.
 
 **Key implementation requirements:**
 
@@ -2604,26 +2604,47 @@ confidence bands (sup-t) for event study.
 
 *Treatment indicator (D matrix) semantics:*
 
-D must be an **ABSORBING STATE** indicator, not a treatment timing indicator:
+By default (`non_absorbing=False`) D must be an **ABSORBING STATE** indicator, not a
+treatment timing indicator:
 - D[t, i] = 0 for all t < g_i (pre-treatment periods for unit i)
 - D[t, i] = 1 for all t >= g_i (during and after treatment for unit i)
 
 where g_i is the treatment start time for unit i.
 
-For staggered adoption, different units have different treatment start times g_i.
-The D matrix naturally handles this - distances use periods where BOTH units
+For **staggered adoption** (different units treated at different times, but still
+absorbing) the D matrix naturally handles this - distances use periods where BOTH units
 have D=0, matching the paper's (1 - W_iu)(1 - W_ju) formula in Equation 3.
 
-**Wrong D specification**: If user provides event-style D (only first treatment period
-has D=1), ATT will be incorrect - document this clearly.
+**True non-absorbing assignment** (treatment switches on *and* off) is a distinct case
+from staggered adoption. The paper (§2.1: "units moving into and out of treatment")
+supports it via the same Eq. 12 / Algorithm 2 masking, and the library exposes it through
+the opt-in `TROP(non_absorbing=True)` (`method='local'` only). See the requirements
+checklist below and the `**Note:**` entries on the no-dynamic-effects requirement and the
+block-only inference theory.
+
+**Wrong D specification**: With the default `non_absorbing=False`, an event-style D (only
+the first treatment period has D=1, then back to 0) is a non-monotonic indicator and is
+**rejected** with a `ValueError` (see "D matrix validation" below). This guards against the
+common mistake of encoding absorbing treatment as an event spike, which would silently bias
+the ATT. A user with genuinely non-absorbing treatment passes `non_absorbing=True`.
 
 *ATT definition (Equation 1, Section 6.1):*
 ```
 τ̂ = (1 / Σ_i Σ_t W_{it}) Σ_{i=1}^N Σ_{t=1}^T W_{it} τ̂_{it}(λ̂)
 ```
-- ATT averages over ALL cells where D_it=1 (treatment indicator)
+- ATT averages over all cells where D_it=1 (treatment indicator) that are
+  **estimable**. On balanced / support-complete absorbing panels every treated
+  cell is estimable, so this is all D=1 cells. A cell is non-estimable (NaN,
+  excluded) when `alpha_i + beta_t` is unidentified — its target unit and period
+  are not in the same connected component of the observed-control graph; this is
+  reachable under `non_absorbing=True` (always-treated unit, fully-treated period,
+  disconnected support) and on unbalanced absorbing panels (entirely-missing
+  unit/period controls). See the non-estimable-cell `**Note:**` below; this matches
+  the library-wide non-estimable→NaN convention (cf. CallawaySantAnna group-time
+  cells).
 - No separate "post_periods" concept - D matrix is the sole input for treatment timing
-- Supports general assignment patterns including staggered adoption
+- Supports general assignment patterns including staggered adoption and (with
+  `non_absorbing=True`) on/off switching
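
The estimable-cell rule in the ATT definition above can be sketched as a connectivity test. This is a minimal illustration, assuming a boolean `usable` matrix (periods × units) that marks the usable control cells and treating each such cell as an edge between its period node and its unit node. The name `cell_is_estimable` is hypothetical; this is not the library's `_treated_cell_is_estimable`, only one way such a check might look.

```python
from collections import deque

import numpy as np


def cell_is_estimable(usable, t, i):
    """Return True if period node t and unit node i share a connected component.

    `usable` (hypothetical input) is a boolean (n_periods, n_units) matrix;
    each True cell is an edge between its period node and its unit node.
    """
    usable = np.asarray(usable, dtype=bool)
    # Fast path: an isolated period node or unit node can never be connected.
    if not usable[t, :].any() or not usable[:, i].any():
        return False
    seen_periods = {t}
    seen_units = set()
    queue = deque([("period", t)])
    while queue:
        kind, idx = queue.popleft()
        if kind == "period":
            # Units that have a usable control cell in this period.
            for j in np.flatnonzero(usable[idx, :]):
                j = int(j)
                if j == i:
                    return True
                if j not in seen_units:
                    seen_units.add(j)
                    queue.append(("unit", j))
        else:
            # Periods in which this unit has a usable control cell.
            for s in np.flatnonzero(usable[:, idx]):
                s = int(s)
                if s not in seen_periods:
                    seen_periods.add(s)
                    queue.append(("period", s))
    return False


# Usable cells at (t0, unitA) and (t1, unitB); target cell is (t1, unitA).
usable = np.array([[True, False],
                   [False, True]])
print(cell_is_estimable(usable, t=1, i=0))  # False: two disconnected components

# Adding a usable cell at (t0, unitB) links the two components.
usable[0, 1] = True
print(cell_is_estimable(usable, t=1, i=0))  # True
```

In the first case both the target period and the target unit have some usable control cell, so a purely marginal test would pass, yet the two nodes are not connected and the cell would be excluded from the ATT.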
 
 *Estimator equation (as implemented, Section 2.2):*
 
@@ -2707,16 +2728,16 @@ Q(λ) = Σ_{j,s: D_js=0} [τ̂_js^loocv(λ)]²
   - **Results storage**: `TROPResults` stores *original* λ_nn value (inf), while computations use 1e10. λ_time and λ_unit store their selected values directly (0.0 = uniform).
 - **Empty control observations**: If no valid control observations exist, returns Q(λ) = ∞ with warning. A score of 0.0 would incorrectly "win" over legitimate parameters.
 - **Infinite LOOCV score handling**: If best LOOCV score is infinite, `best_lambda` is set to None, triggering defaults fallback
-- Validation: requires at least 2 periods before first treatment
-- **D matrix validation**: Treatment indicator must be an absorbing state (monotonic non-decreasing per unit)
+- Validation: by default requires at least 2 periods before first treatment; with `non_absorbing=True` this becomes "at least 2 periods contain untreated cells" (the leading all-control block is ill-defined when treatment toggles)
+- **D matrix validation** (default `non_absorbing=False`): Treatment indicator must be an absorbing state (monotonic non-decreasing per unit)
   - Detection: `np.diff(D, axis=0) < 0` for any column indicates violation
   - Handling: Raises `ValueError` with list of violating unit IDs and remediation guidance
-  - Error message includes: "convert to absorbing state: D[t, i] = 1 for all t >= first treatment period"
-  - **Rationale**: Event-style D (0→1→0) silently biases ATT; runtime validation prevents misuse
-  - **Unbalanced panels**: Missing unit-period observations are allowed. Monotonicity validation checks each unit's *observed* D sequence for monotonicity, which correctly catches 1→0 violations that span missing period gaps (e.g., D[2]=1, missing [3,4], D[5]=0 is detected as a violation even though the gap hides the transition in adjacent-period checks).
+  - Error message includes: "convert to absorbing state: D[t, i] = 1 for all t >= first treatment period" AND the opt-in pointer ("if treatment genuinely turns on and off, pass `non_absorbing=True`")
+  - **Rationale**: Event-style D (0→1→0) silently biases ATT when the user *meant* absorbing treatment; runtime validation prevents that misuse while the opt-in serves genuine on/off designs
+  - **`non_absorbing=True`**: the monotonicity check is skipped entirely, so on/off (and event-style) D matrices are accepted. Identification falls back to untreated *cells* (the per-(i,t) estimator masks treated cells via (1-W) and fits the rest), so even a fully toggling panel with no never-treated unit is admitted; only "no D=0 cells at all" is rejected. See the requirements checklist + Notes for the no-dynamic-effects requirement and the block-only inference caveat.
+  - **Unbalanced panels**: Missing unit-period observations are allowed. Monotonicity validation (default mode) checks each unit's *observed* D sequence for monotonicity, which correctly catches 1→0 violations that span missing period gaps (e.g., D[2]=1, missing [3,4], D[5]=0 is detected as a violation even though the gap hides the transition in adjacent-period checks).
   - **n_post_periods metadata**: Counts periods where D=1 is actually observed (at least one unit has D=1), not calendar periods from first treatment. In unbalanced panels where treated units are missing in some post-treatment periods, only periods with observed D=1 values are counted.
-- Wrong D specification: if user provides event-style D (only first treatment period),
-  the absorbing-state validation will raise ValueError with helpful guidance
+- Wrong D specification: with the default `non_absorbing=False`, an event-style D (only first treatment period) is rejected with a `ValueError` carrying both the convert-to-absorbing guidance and the `non_absorbing=True` opt-in pointer
 - **Bootstrap minimum**: `n_bootstrap` must be >= 2 (enforced via `ValueError`). TROP uses bootstrap for all variance estimation — there is no analytical SE formula.
 - **Note:** TROP bootstrap loops (`_bootstrap_variance`, `_bootstrap_rao_wu`, and their global counterparts, including both Rust happy paths — local and global) emit a proportional `UserWarning` via `diff_diff.bootstrap_utils.warn_bootstrap_failure_rate` when the replicate failure rate exceeds 5%. The previous hard-coded `< 10 successes` threshold let high-failure runs (e.g. 11 of 200) pass silently; this was classified as a silent failure under the Phase 2 audit (axis D — degenerate-replicate handling). The 5% threshold matches the existing SyntheticDiD bootstrap and placebo guards. When zero replicates succeed, SE is set to `NaN` (unchanged). The local Rust path previously also used `len >= 10` as a Python-fallback trigger; it now accepts any non-zero Rust result and emits the proportional warning instead of path-switching silently.
 - **LOOCV failure metadata**: When LOOCV fits fail in the Rust backend, the first failed observation coordinates (t, i) are returned to Python for informative warning messages
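
The default-mode monotonicity check described above (`np.diff(D, axis=0) < 0`, applied to each unit's *observed* sequence) can be sketched as follows. This is a minimal illustration, not the library's validator: the function name `find_non_absorbing_units` and the `observed` mask argument are hypothetical, and D follows the `D[t, i]` (periods × units) convention used in this section.

```python
import numpy as np


def find_non_absorbing_units(D, observed=None):
    """Return column indices of units whose observed D path ever drops 1 -> 0.

    `observed` (hypothetical argument) is a boolean mask with D's shape;
    unobserved cells are skipped, so a drop hidden behind a gap of missing
    periods is still detected.
    """
    D = np.asarray(D).astype(int)  # cast so np.diff subtracts (bool would not)
    if observed is None:
        observed = np.ones(D.shape, dtype=bool)
    observed = np.asarray(observed, dtype=bool)
    violators = []
    for i in range(D.shape[1]):
        seq = D[observed[:, i], i]  # this unit's observed treatment path
        if seq.size > 1 and np.any(np.diff(seq) < 0):
            violators.append(i)
    return violators


# Columns: absorbing unit, event-style spike, drop hidden behind missing periods.
D = np.array([
    [0, 0, 0],
    [0, 1, 0],
    [1, 0, 1],
    [1, 0, 0],  # unit 2 unobserved here (value is a placeholder)
    [1, 0, 0],  # unit 2 unobserved here (value is a placeholder)
    [1, 0, 0],
])
observed = np.ones(D.shape, dtype=bool)
observed[3:5, 2] = False

print(find_non_absorbing_units(D, observed))  # [1, 2]
```

Under `non_absorbing=True` a check of this kind would simply be skipped, which is why both the event-style column and the gap-hidden drop would then be accepted.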
@@ -2733,16 +2754,21 @@ Q(λ) = Σ_{j,s: D_js=0} [τ̂_js^loocv(λ)]²
 - [x] LOOCV uses SUM of squared errors per Equation 5
 - [x] Rank selection implicit via nuclear-norm soft-thresholding (paper Section 5.3 + Appendix); `TROPResults.effective_rank` reports the diagnostic. No discrete `rank_selection` constructor parameter is exposed — earlier mention of "cv / ic / elbow" methods in this checklist was an overclaim, corrected in the methodology-promotion PR. Locked by `tests/test_methodology_trop.py::TestTROPDeviations::test_rank_selection_is_implicit_via_nuclear_norm`.
 - [x] Returns the fitted factor matrix and an effective-rank diagnostic (`TROPResults.factor_matrix` and `TROPResults.effective_rank`). The library does NOT expose separate factor-loading / factor-score outputs — earlier prose claiming "factor loadings and scores" was an overclaim corrected in the 2026-05-24 methodology-promotion PR (TROP's nuclear-norm soft-thresholded L is delivered as a single (n_periods × n_units) matrix, not decomposed into loading / score components on Results).
-- [x] ATT averages over all D==1 cells (general assignment patterns)
+- [x] ATT averages over all **estimable** D==1 cells (staggered adoption by default; on/off switching with `non_absorbing=True`). All D==1 cells are estimable on balanced / support-complete panels; cells whose `alpha_i + beta_t` is unidentified (target unit and period in different connected components of the observed-control graph) are NaN and excluded (see the non-estimable-cell `**Note:**`).
 - [x] No post_periods parameter (D matrix determines treatment timing)
 - [x] D matrix semantics documented (absorbing state, not event indicator)
 - [x] Unbalanced panels supported — missing control / pre-treatment cells don't trigger false absorbing-state violations. Locked by `tests/test_methodology_trop.py::TestTROPDeviations::test_unbalanced_panels_supported` (10% random drops on control + pre-treatment subset). Three additional unbalanced-panel regressions live in `tests/test_trop.py::TestPR110FeedbackRound8` (`test_unbalanced_panel_d_matrix_validation`, `test_unbalanced_panel_real_violation_still_caught`, `test_unbalanced_panel_multiple_missing_periods`). Absorbing-state monotonicity validation (which fires on unbalanced cases too) is covered by `tests/test_trop.py::TestDMatrixValidation`.
-- [x] Per-observation treatment-effect estimation (Eq. 13 / Algorithm 2) — `treatment_effects` dict contains one finite `τ_hat_it` per treated cell, and the aggregate ATT equals the unweighted mean of per-cell effects (Eq. 1). **The methodology test exercises block adoption with a constant treatment effect**; **absorbing-state staggered adoption** and **heterogeneous per-cell effects** (paper Remark 6.1) are SUPPORTED by the code path (the implementation does not gate on cohort or effect-magnitude pattern), but are not directly verified in the methodology test surface in this PR. **Section 6.1 non-absorbing / on-off / switching assignment patterns are explicitly OUT OF SCOPE** — the absorbing-state validator at `trop_local.py` rejects non-monotonic D matrices with a `ValueError`, and `TestTROPDeviations::test_event_style_d_rejected_with_value_error` enforces the rejection contract (event-style D being one specific non-absorbing pattern; the same validator catches all 1→0 transitions). Cross-coverage of the staggered-cohort fit path is `tests/test_methodology_trop.py::TestTROPAlgorithm1LOOCV::test_control_set_includes_pretreat_of_eventually_treated` (two-cohort early-/late-treated panel under LOOCV-tuned `λ_unit`); absorbing-state structural validation is `tests/test_trop.py::TestDMatrixValidation`.
+- [x] Per-observation treatment-effect estimation (Eq. 13 / Algorithm 2) — `treatment_effects` dict contains one `τ_hat_it` entry per treated cell (finite for estimable cells; NaN for a missing outcome or, under `non_absorbing`, a cell with no weighted control support — see the no-support `**Note:**`), and the aggregate ATT equals the unweighted mean of the finite per-cell effects (Eq. 1). **The methodology test exercises block adoption with a constant treatment effect**; **absorbing-state staggered adoption** and **heterogeneous per-cell effects** (paper Remark 6.1) are SUPPORTED by the code path (the implementation does not gate on cohort or effect-magnitude pattern), but are not directly verified in the methodology test surface for those specific patterns. Cross-coverage of the staggered-cohort fit path is `tests/test_methodology_trop.py::TestTROPAlgorithm1LOOCV::test_control_set_includes_pretreat_of_eventually_treated` (two-cohort early-/late-treated panel under LOOCV-tuned `λ_unit`); absorbing-state structural validation is `tests/test_trop.py::TestDMatrixValidation`.
+- [x] **Section 6.1 non-absorbing / on-off / switching assignment patterns are SUPPORTED via the opt-in `TROP(non_absorbing=True)` (`method='local'` only)**, matching the paper's general-assignment scope (§2.1 "units moving into and out of treatment"; Eq. 12 / Algorithm 2 mask treated cells per (i,t) with no monotonicity requirement). The default (`non_absorbing=False`) still rejects non-monotonic D as a defensive guard (see the `**Note:**` entries below). Adding the opt-in *removes* a prior implementation over-restriction (the shipped estimator was stricter than the paper); it is **not** a new methodology deviation. Recovery on a no-dynamic-effects toggling DGP, the per-cell effect count, and the caveat warning are locked by `tests/test_methodology_trop.py::TestTROPDeviations::test_non_absorbing_general_assignment_supported`; the default-mode rejection contract by `TestTROPDeviations::test_event_style_d_rejected_with_value_error`; opt-in acceptance, the local-only guard, params round-trip, and Rust/Python parity by `tests/test_trop.py::TestDMatrixValidation`.
 - [x] Special-case reductions (paper Section 2.2): **DiD benchmark sanity check** (NOT a direct algebraic-equivalence proof) — TROP with `λ_nn=∞` + uniform weights produces an ATT within 0.5 of `DifferenceInDifferences` fitted as a basic 2×2 design on a TWFE-clean multi-period panel. This is empirical numerical agreement on a friendly DGP. A direct Section 2.2 reduction lock (true 2-period block-assignment panel where basic DiD is the algebraic target, or a comparison against `TwoWayFixedEffects` with explicit unit FE) is deferred. **Matrix Completion code path exercised** — TROP with uniform weights + finite `λ_nn` engages the nuclear-norm prox solver (effective_rank > 0) and beats the DiD-style baseline on a factor-confounded DGP; not an equivalence check against an independent MC reference. SC and SDID reductions are paper-claimed under "specific (omega, theta) weight choices" not provided in the paper text; cross-language anchor deferred until paper-author reference implementation clarifies the weight map. See `tests/test_methodology_trop.py::TestTROPSpecialCases`.
 - **Note:** The balancing representation / decomposition (paper Eq. 10, Section 5.2) is a paper-side identity. Direct numerical reconstruction of the four-term sum requires the internal `θ_s^{i,t}` / `ω_j^{i,t}` weight vectors, which are not exposed on the public TROP API; numerical Eq. 10 verification is therefore out of scope. The test `tests/test_methodology_trop.py::TestTROPNuclearNormProx::test_factor_matrix_consistent_with_treatment_effects` is a structural pointer only — it checks `factor_matrix` shape + finiteness + that `treatment_effects` is populated with finite entries, but does NOT lock the magnitude of `L_hat`. (The test DGP uses additive unit + time effects only; on a no-interactive-FE panel, the paper's framework absorbs the additive surfaces into `α_i` / `β_t`, so a near-zero `L_hat` is methodologically correct. An `effective_rank > 0` assertion would lock a solver artifact, not the intended low-rank behavior.) This is NOT a full Eq. 10 lock. The Eq. 2 ingredients (soft-threshold SVD, **plain prox-gradient monotonicity** — NOT the shipped accelerated FISTA outer loop, which uses Nesterov momentum and does not guarantee per-step monotonicity, see `TestTROPNuclearNormProx` class docstring — weighted-prox) that the Eq. 10 derivation relies on are independently verified in the same class.
 - **Note (library-side choice):** Weight normalization (Gap #5 in `docs/methodology/papers/athey-2025-review.md`): paper Section 5 (p. 20) states weights sum to one (`1ᵀω = 1ᵀθ = 1`), but Eq. 3 (p. 7) writes unnormalized exponential weights. **The paper-side ambiguity remains open**; the library resolves it as a documented deviation — the shipped implementation matches Eq. 2 (unnormalized). Verified by `tests/test_methodology_trop.py::TestTROPDeviations::test_unnormalized_weights_match_eq2`. Will be revisited once paper-author reference implementation lands.
 - **Note (deferral):** Equation 14 covariate extension (`Y_it = α_i + β_t + X_it·β_coef + R_it` with R low-rank, paper Section 6.2) is **not implemented**. `TROP.fit()` does not accept a `covariates` keyword argument. The corresponding Theorem 8.1 covariate-triple-robustness result is correspondingly out of scope. The non-support is locked by `tests/test_methodology_trop.py::TestTROPDeviations::test_covariates_not_supported`, which uses `inspect.signature` to guard against future `**kwargs` silently breaking the contract. Deferred until use cases motivate the X threading through `trop_local.py` / `trop_global.py` / LOOCV / bootstrap.
 - **Note:** Survey support: weights, strata, PSU, and FPC are all supported via Rao-Wu rescaled bootstrap with cross-classified pseudo-strata (Phase 6). Rust backend remains pweight-only; full-design surveys fall back to the Python bootstrap path. Survey weights enter ATT aggregation only — population-weighted average of per-observation treatment effects. Model fitting (kernel weights, LOOCV, nuclear norm regularization) stays unchanged. Rust and Python bootstrap paths both support survey-weighted ATT in each iteration.
+- **Note (defensive default):** `non_absorbing` defaults to `False`, retaining the absorbing-state monotonicity gate. This is an implementation choice, not a paper requirement: the gate's primary value is catching the common mistake of encoding *absorbing* treatment as an event-style spike (a single D=1 period), which silently biases the ATT. Genuine on/off designs opt in with `non_absorbing=True`. The default-mode rejection message carries both the convert-to-absorbing guidance and the opt-in pointer.
+- **Note (scope: local only):** `non_absorbing=True` is supported only for `method='local'`. The `global` method's post-hoc weighting and bootstrap bake in a contiguous, simultaneous treated block (it already rejects staggered adoption), so `TROP(method='global', non_absorbing=True)` raises a `ValueError`. The Rust local LOOCV and point-estimate paths are already mask-driven (`D==0`/`D==1`) and required no change; Rust/Python ATT parity on a non-absorbing panel is locked by `tests/test_trop.py::TestDMatrixValidation::test_non_absorbing_rust_python_parity`. For a fully toggling panel (no never-treated unit), the local Rust bootstrap is bypassed in favour of the Python loop (the Rust stratified resampler can return a degenerate ~0 SE on an empty control stratum).
+- **Note (inference caveat for non-absorbing):** The paper's *point estimator* (Eq. 12 / Algorithm 2) supports general assignment, but the formal **triple-robustness guarantee (Theorem 5.1) is proven only under Assumption 1(i) block assignment** `W_it = 1{i>N0}·1{t>T0}`; the paper does not extend that guarantee to general/non-absorbing patterns (cf. `docs/methodology/papers/athey-2025-review.md`). The non-parametric bootstrap (Algorithm 3) is offered generally but "its validity requires a growing number of treated units." Non-absorbing validity additionally relies on the paper's **no-spillover / no-dynamic-effects (no carryover)** assumption (paper §2.1). `TROP.fit()` emits a one-time `UserWarning` carrying these caveats whenever `non_absorbing=True`; the warning is locked by `tests/test_methodology_trop.py::TestTROPDeviations::test_non_absorbing_general_assignment_supported` and its absence in default mode by `test_non_absorbing_no_caveat_in_default_mode`.
+- **Note (non-absorbing non-estimable-cell trimming → estimable-cell ATT):** The working model fits unregularized unit/time fixed effects `alpha_j` / `beta_s` on the weighted observed control cells, then sets `tau_it = Y_it - alpha_i - beta_t - L_it`. A treated cell (i,t) is **estimable** only if the sum `alpha_i + beta_t` is identified by that two-way-FE fit. In a two-way FE model the effects are pinned only **within each connected component** of the bipartite graph whose nodes are units and periods and whose edges are the positively-weighted **observed** control cells (`usable = (D==0) & ~missing & isfinite(Y) & ω>0`); across components there is a free per-component offset. So estimability requires the **target unit node and target period node to lie in the same connected component** of that graph (predicate `diff_diff.trop_local._treated_cell_is_estimable`, a bipartite BFS run per treated cell with a cheap empty-row/empty-column fast-path). A marginal "the target unit has some usable control AND the target period has some usable control" test is **necessary but not sufficient** — e.g. usable cells at `(unitA,t0)` and `(unitB,t1)` with target `(unitA,t1)` pass it yet span two disconnected components, leaving `alpha_A + beta_1` unidentified. The connected-component check subsumes the simpler degeneracies: under `non_absorbing=True` (1) an **always-treated unit** has an empty control column (isolated unit node) — true even with `lambda_unit=0`; and (2) a **fully-treated period** has an empty control row (isolated period node). In all these cases tau would silently leak the fixed effect. Non-estimable cells are materialized as `NaN` in `treatment_effects` and **excluded from the ATT**, which is therefore the mean over **estimable** treated cells — NOT all D=1 cells. 
This matches the library-wide non-estimable→NaN convention (the per-named-cell analogue of CallawaySantAnna materializing non-estimable (g,t) as NaN); it is a **defensive choice for a degeneracy the paper does not cover** (the paper assumes enough overlap), not a deviation from Eq. 1 on the cells it covers. There is no `λ` that restores identification for these cells (the missing control row/column is structural), so the warning does not suggest one. **The predicate is applied to every local fit (absorbing and non-absorbing) as a general correctness guard** — it NaNs exactly the cells whose FE is genuinely unidentified. It is a **no-op whenever every treated cell's target unit and period have an OBSERVED control cell**: always true on a balanced panel, and in **absorbing** mode also true on unbalanced panels (a never-treated unit is a control at every observed period and each treated unit's pre-treatment controls are observed) — *unless* an unbalanced absorbing panel happens to leave a treated unit's pre-period controls or a period's controls entirely missing, in which case NaN-ing those cells is the correct fix to the identical latent FE leak (the prior behavior silently reported a contaminated tau). So estimable-cell trimming is the contract for **all** local TROP fits on unbalanced panels, not only non-absorbing ones. The point fit and the bootstrap refit apply the identical predicate; a draw with no estimable cell returns NaN and counts as a failed bootstrap replicate. **Rust/bootstrap parity:** the Rust per-cell bootstrap lacks the estimability guard, so whenever the point fit trims any cell (`force_python=True`, set from `n_no_support>0`) — or under `non_absorbing` generally — the bootstrap is routed to the guarded Python `_fit_with_fixed_lambda`, keeping the SE and the point ATT on the same estimable-cell set. (Rust remains the happy path for clean fits with no trimming.) 
**LOOCV is support-agnostic** by design: a degenerate pseudo-control cell yields a large raw-outcome pseudo-effect that inflates `Q(λ)`, so support-destroying `λ_unit` values are *naturally disfavored* (a soft penalty) rather than hard-rejected — hard-rejecting (`Q=∞`) would over-restrict. `TROP.fit()` emits a `UserWarning` naming the count of non-estimable cells. Locked by `tests/test_methodology_trop.py::TestTROPDeviations`: `test_non_absorbing_always_treated_unit_not_raw_outcome` (always-treated unit, `lambda_unit>0` and `lambda_unit=0`), `test_non_absorbing_fully_treated_period_not_estimable` (fully-treated period), `test_non_absorbing_disconnected_support_not_estimable` (disconnected bipartite control graph), and `test_unbalanced_absorbing_unidentified_unit_not_estimable` (the guard + `force_python` bootstrap parity in default absorbing mode).
 
 ### TROP Global Estimation Method
 
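Reviewer sketch (not part of the patch): the REGISTRY note above says a treated cell is estimable only when its unit node and period node share a connected component of the bipartite graph of usable control cells. The snippet below is a minimal illustration of that idea, not the library's `_treated_cell_is_estimable`. The function name `cell_is_estimable`, the `(n_units, n_periods)` layout of `usable`, and the toy masks are all assumptions made for this example.

```python
from collections import deque

import numpy as np


def cell_is_estimable(usable: np.ndarray, unit: int, period: int) -> bool:
    """Return True if `unit` and `period` are connected through usable control cells.

    `usable[i, t]` is True when cell (i, t) is an observed, positively weighted
    control cell. Rows are units and columns are periods (a layout assumed for
    this sketch only).
    """
    n_units, n_periods = usable.shape
    # Fast path: an empty unit row or empty period column is an isolated node.
    if not usable[unit, :].any() or not usable[:, period].any():
        return False

    seen_units = np.zeros(n_units, dtype=bool)
    seen_periods = np.zeros(n_periods, dtype=bool)
    seen_units[unit] = True
    queue = deque([("u", unit)])

    # Breadth-first search that alternates between unit nodes and period nodes.
    while queue:
        kind, idx = queue.popleft()
        if kind == "u":
            for t in np.flatnonzero(usable[idx, :] & ~seen_periods):
                if t == period:
                    return True
                seen_periods[t] = True
                queue.append(("p", int(t)))
        else:
            for i in np.flatnonzero(usable[:, idx] & ~seen_units):
                seen_units[i] = True
                queue.append(("u", int(i)))
    return False


# Usable cells at (A, t0) and (B, t1); target (A, t1). The marginal test passes
# (unit A has a control, period t1 has a control) but the graph is disconnected.
disconnected = np.array([[True, False], [False, True]])
# Adding a usable cell at (B, t0) links the two components.
connected = np.array([[True, False], [True, True]])
# Unit 0 is always treated, so its row has no usable control cell.
always_treated = np.array([[False, False], [True, True]])
```

On these toy masks the marginal row-and-column test would accept the target `(0, 1)` in `disconnected`, while the connectivity search rejects it, which is the "necessary but not sufficient" point made in the note.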
diff --git a/docs/methodology/papers/athey-2025-review.md b/docs/methodology/papers/athey-2025-review.md
index c2e253c7..a4148167 100644
--- a/docs/methodology/papers/athey-2025-review.md
+++ b/docs/methodology/papers/athey-2025-review.md
@@ -5,7 +5,7 @@
 **PDF reviewed:** https://arxiv.org/abs/2508.21536v2 (version-pinned arXiv abstract for v2)
 **Review date:** 2026-02-08
 
-**Version-pinning note (2026-05-25):** The current arXiv version of arXiv:2508.21536 is **v3** (submitted 2026-02-09). The 2026-05-24 methodology promotion ships against this v2-pinned review; a formal v2-vs-v3 delta-check against the v3 PDF for TROP-relevant methodology changes (Eqs. 2-3, Algorithms 1-3, Section 2.2, Section 5.2-5.3, Section 6.1-6.2, Theorem 5.1, Corollary 1, Appendix Theorem 8.1) has **NOT** been performed.
+**Version-pinning note (2026-05-25):** The current arXiv version of arXiv:2508.21536 is **v3** (submitted 2026-02-09). The 2026-05-24 methodology promotion ships against this v2-pinned review; a formal v2-vs-v3 delta-check against the v3 PDF for TROP-relevant methodology changes (Eqs. 2-3, Algorithms 1-3, Section 2.2, Section 5.2-5.3, Section 6.1-6.2, Theorem 5.1, Corollary 1, Appendix Theorem 8.1) has **NOT** been performed in full. **Update (non-absorbing support work):** the v3 PDF was consulted for the treatment-assignment-pattern sections (§2.1 general assignment, §2.2 Eq. 2 masking, §6.1 Eq. 12 / Algorithm 2, Assumption 1(i), Theorem 5.1) and confirms the general-assignment scope on which `TROP(non_absorbing=True)` is built; the remaining sections of the delta-check stay deferred.
 
 **Action item**: before the next paper-author reference implementation or substantive v3 release, refresh this review against the most recent arXiv version, perform a real v2→v3 PDF delta audit, and re-validate that the verified-component checklist still maps cleanly. Pending that refresh, the methodology promotion is anchored on v2 as documented here.
 
@@ -283,9 +283,9 @@ Note: Stratified bootstrap -- control and treated units resampled separately. Pr
 - **Outcome matrix**: Y (N x T), observed outcomes
 - **Treatment matrix**: W (N x T), binary treatment assignments where `W_it in {0, 1}`
 - **Covariates** (optional): X_it, observed covariates for each unit-period pair
-- Treatment must be an absorbing state for standard block assignment (W_it = 1{i > N_0} * 1{t > T_0})
-- **Paper scope (Equation 13):** the paper extends TROP to general assignment patterns including treatment switching on/off.
-- **Shipped implementation:** the current `diff_diff/trop.py` requires an absorbing-state treatment indicator and rejects non-absorbing/event-style inputs (gate in `diff_diff/trop.py:505-525`, also documented in `docs/methodology/REGISTRY.md` under TROP). Generalization to non-absorbing patterns is not in scope for the current implementation.
+- Treatment is an absorbing state for standard block assignment (W_it = 1{i > N_0} * 1{t > T_0}); this is the default mode.
+- **Paper scope (Equation 13 / Section 6.1):** the paper extends TROP to general assignment patterns including treatment switching on/off (§2.1: "units moving into and out of treatment").
+- **Shipped implementation:** `diff_diff/trop.py` accepts general (on/off) assignment via the opt-in `TROP(non_absorbing=True)` (`method='local'` only), matching the paper's scope. The default `non_absorbing=False` retains the absorbing-state monotonicity gate (in `diff_diff/trop_local.py::_setup_trop_data`, around `trop_local.py:131-144`) as a defensive guard against event-style mis-encoding; it rejects non-monotonic D with a `ValueError` that also points to the opt-in. See `docs/methodology/REGISTRY.md` under TROP for the no-dynamic-effects requirement and the block-only inference caveat (Theorem 5.1 is proven under Assumption 1(i) block assignment only). Adding the opt-in *removes* a prior implementation over-restriction (the estimator was stricter than the paper); the global method still requires block assignment and rejects `non_absorbing=True`.
 
 ### Computational Considerations
 - **Main bottleneck**: LOOCV grid search -- for each grid point, every control observation requires a separate nuclear-norm penalized weighted least squares solve
diff --git a/docs/performance-scenarios.md b/docs/performance-scenarios.md
index 38d41a48..b0a42ea9 100644
--- a/docs/performance-scenarios.md
+++ b/docs/performance-scenarios.md
@@ -259,8 +259,9 @@ serves a different purpose: R-parity accuracy). They complement it.
 
 - **Persona / domain.** Marketing analyst measuring an always-on-with-
   dark-periods campaign, or a health-policy researcher studying a policy
-  that switches on and off. Reversible treatment breaks every other
-  staggered estimator; dCDH is the only option.
+  that switches on and off. Reversible treatment breaks the absorbing-only
+  staggered estimators; dCDH is the most general fit (LPDiD/TROP `non_absorbing`
+  also handle it under stronger assumptions).
 - **Data shape.** 120 groups x 10 periods, single-switch pattern per group,
   ~40% always-control, survey-weighted with 8 strata and 24 PSUs. Larger
   than the Tutorial's 80 x 6 demo to expose the `L_max` multi-horizon
diff --git a/docs/practitioner_decision_tree.rst b/docs/practitioner_decision_tree.rst
index 72f33ed6..1dd6e5b5 100644
--- a/docs/practitioner_decision_tree.rst
+++ b/docs/practitioner_decision_tree.rst
@@ -157,11 +157,14 @@ market is treated it stays treated.
 
 **Recommended method:** :class:`~diff_diff.ChaisemartinDHaultfoeuille` (alias :class:`~diff_diff.DCDH`)
 
-This is the **only library estimator** that handles non-absorbing (reversible)
+This is the **most general library estimator** for non-absorbing (reversible)
 treatments. It compares period-to-period outcome changes in markets that switch
 into treatment ("joiners") and markets that switch out ("leavers"), against
 simultaneously-stable controls. You get three numbers: the overall lift `DID_M`,
-a joiners-only view `DID_+`, and a leavers-only view `DID_-`.
+a joiners-only view `DID_+`, and a leavers-only view `DID_-`. (:class:`~diff_diff.LPDiD`
+with ``non_absorbing="first_entry"`` / ``"effect_stabilization"`` and
+:class:`~diff_diff.TROP` with ``non_absorbing=True`` also handle non-absorbing
+treatment, but under stronger assumptions; TROP requires no dynamic effects.)
 
 .. code-block:: python
 
@@ -442,7 +445,7 @@ At a Glance
      - Handles different launch dates correctly
    * - On/off cycles (reversible treatment)
      - ``ChaisemartinDHaultfoeuille``
-     - Only library option for non-absorbing treatments
+     - Most general option for non-absorbing treatments (see also LPDiD/TROP ``non_absorbing``)
    * - Varied spending levels
      - ``ContinuousDiD``
      - Dose-response curve
diff --git a/docs/tutorials/19_dcdh_marketing_pulse.ipynb b/docs/tutorials/19_dcdh_marketing_pulse.ipynb
index 51ed310d..0e6cd652 100644
--- a/docs/tutorials/19_dcdh_marketing_pulse.ipynb
+++ b/docs/tutorials/19_dcdh_marketing_pulse.ipynb
@@ -4,11 +4,7 @@
    "cell_type": "markdown",
    "id": "t19-cell-001",
    "metadata": {},
-   "source": [
-    "# Tutorial 19: dCDH for Marketing Pulse Campaigns\n",
-    "\n",
-    "A practitioner walkthrough for measuring lift from promotional campaigns that turn on AND off across markets at staggered times. The tutorial uses the `ChaisemartinDHaultfoeuille` estimator (alias `DCDH`) - diff-diff's only estimator built for reversible (non-absorbing) treatment, where every other modern staggered estimator in the library assumes treatment is absorbing."
-   ]
+   "source": "# Tutorial 19: dCDH for Marketing Pulse Campaigns\n\nA practitioner walkthrough for measuring lift from promotional campaigns that turn on AND off across markets at staggered times. The tutorial uses the `ChaisemartinDHaultfoeuille` estimator (alias `DCDH`) - diff-diff's estimator purpose-built for reversible (non-absorbing) treatment: the most general non-absorbing option in the library, with explicit joiner/leaver decomposition. (`LPDiD` and `TROP` also support non-absorbing treatment via their `non_absorbing` parameters, under stronger assumptions.)"
   },
   {
    "cell_type": "markdown",
@@ -124,16 +120,7 @@
    "cell_type": "markdown",
    "id": "t19-cell-008",
    "metadata": {},
-   "source": [
-    "## 3. Fitting dCDH\n",
-    "\n",
-    "`DID_M` is the headline dCDH estimator: the average across periods of two pieces:\n",
-    "\n",
-    "- **`DID_+`** (joiners): markets switching `0 → 1` between consecutive periods, compared to *contemporaneously untreated* control cells.\n",
-    "- **`DID_-`** (leavers): markets switching `1 → 0`, compared to *contemporaneously treated* control cells.\n",
-    "\n",
-    "Both pieces use only cells whose treatment status was stable across the two periods being compared - so no treated unit is ever used as a control for another treated unit. The library reports `DID_+`, `DID_-`, and their average `DID_M` separately, so you can see if the two halves agree."
-   ]
+   "source": "## 3. Fitting dCDH\n\n`DID_M` is the headline dCDH estimator: the average across periods of two pieces:\n\n- **`DID_+`** (joiners): markets switching `0 → 1` between consecutive periods, compared to *contemporaneously untreated* control cells.\n- **`DID_-`** (leavers): markets switching `1 → 0`, compared to *contemporaneously treated* control cells.\n\nBoth pieces use only cells whose treatment status was stable across the two periods being compared. No *switching* cell is used as a control; stable-untreated cells serve as controls for joiners, and stable-treated cells serve as controls for leavers. The library reports `DID_+`, `DID_-`, and their average `DID_M` separately, so you can see if the two halves agree."
   },
   {
    "cell_type": "markdown",
@@ -313,21 +300,7 @@
    "cell_type": "markdown",
    "id": "t19-cell-019",
    "metadata": {},
-   "source": [
-    "## 5. Communicating Results to Leadership\n",
-    "\n",
-    "A stakeholder-ready summary of the analysis above:\n",
-    "\n",
-    "> **Headline.** The pulse campaign lifted weekly checkout sessions by approximately **12 sessions per market per week** while the promo was on (95% CI: 11.3 to 12.8). On a baseline of about 110 weekly sessions per market, that's roughly an **11% lift**. *[Source: `results.overall_att` from Section 3.]*\n",
-    ">\n",
-    "> **Sample size and design.** 60 markets observed for 8 weeks (480 market-weeks). Of those, 38 markets started untreated and switched the promo on at some point during the quarter (joiners), and 22 markets started with the promo on and switched it off (leavers). Method: dCDH (de Chaisemartin & D'Haultfoeuille 2020) - diff-diff's only estimator built for treatment that can switch on AND off in the same panel. *[Source: switcher counts and panel shape from Section 2.]*\n",
-    ">\n",
-    "> **Validity evidence.** Two checks supported the result. (a) The joiners-vs-leavers split agreed: joiners produced a +12.1 lift, leavers a +11.9 lift, well within sampling uncertainty of each other and of the headline. (b) The multi-horizon placebos at l = -2 and l = -1 both sat on zero with bootstrap CIs comfortably covering it - parallel pre-trends look credible. *[Sources: joiners/leavers from Section 3, multi-horizon placebos from Section 4.]*\n",
-    ">\n",
-    "> **What \"+12 sessions per market per week\" means in business terms.** Across 60 markets and the weeks each one had the promo on, that's the per-market-week lift attributable to the campaign. Translate to your own revenue-per-session to compare against campaign spend, then use the per-market lift estimate to project what scaling the promo to additional markets would deliver.\n",
-    ">\n",
-    "> **Practical significance caveat.** The 11% lift is statistically significant (bootstrap p < 0.01 at both post-treatment horizons), and the on-impact effect persists at the second horizon - the pulse worked while it was on. Whether 11% justifies the campaign cost is a business judgment, not a statistical one. *[Sources: dynamic horizons from Section 4.]*"
-   ]
+   "source": "## 5. Communicating Results to Leadership\n\nA stakeholder-ready summary of the analysis above:\n\n> **Headline.** The pulse campaign lifted weekly checkout sessions by approximately **12 sessions per market per week** while the promo was on (95% CI: 11.3 to 12.8). On a baseline of about 110 weekly sessions per market, that's roughly an **11% lift**. *[Source: `results.overall_att` from Section 3.]*\n>\n> **Sample size and design.** 60 markets observed for 8 weeks (480 market-weeks). Of those, 38 markets started untreated and switched the promo on at some point during the quarter (joiners), and 22 markets started with the promo on and switched it off (leavers). Method: dCDH (de Chaisemartin & D'Haultfoeuille 2020) - diff-diff's most general estimator built for treatment that can switch on AND off in the same panel. *[Source: switcher counts and panel shape from Section 2.]*\n>\n> **Validity evidence.** Two checks supported the result. (a) The joiners-vs-leavers split agreed: joiners produced a +12.1 lift, leavers a +11.9 lift, well within sampling uncertainty of each other and of the headline. (b) The multi-horizon placebos at l = -2 and l = -1 both sat on zero with bootstrap CIs comfortably covering it - parallel pre-trends look credible. *[Sources: joiners/leavers from Section 3, multi-horizon placebos from Section 4.]*\n>\n> **What \"+12 sessions per market per week\" means in business terms.** Across 60 markets and the weeks each one had the promo on, that's the per-market-week lift attributable to the campaign. Translate to your own revenue-per-session to compare against campaign spend, then use the per-market lift estimate to project what scaling the promo to additional markets would deliver.\n>\n> **Practical significance caveat.** The 11% lift is statistically significant (bootstrap p < 0.01 at both post-treatment horizons), and the on-impact effect persists at the second horizon - the pulse worked while it was on. 
Whether 11% justifies the campaign cost is a business judgment, not a statistical one. *[Sources: dynamic horizons from Section 4.]*"
   },
   {
    "cell_type": "markdown",
@@ -387,4 +360,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 5
-}
+}
\ No newline at end of file
diff --git a/tests/test_methodology_trop.py b/tests/test_methodology_trop.py
index f6a662eb..e18099ad 100644
--- a/tests/test_methodology_trop.py
+++ b/tests/test_methodology_trop.py
@@ -2423,3 +2423,435 @@ def test_safe_inference_nan_propagation_contract(self):
                     "conf_int": conf_int,
                 }
             )
+
+    # ------------------------------------------------------------------
+    # Non-absorbing (general assignment) support — Eq. 1 / Eq. 12 /
+    # Algorithm 2, Section 6.1. The paper's estimator handles general
+    # assignment patterns ("units moving into and out of treatment"),
+    # not only absorbing/staggered adoption (§2.1). The library exposes
+    # this via the opt-in TROP(non_absorbing=True); the default still
+    # rejects non-monotonic D (covered in test_trop.py::TestDMatrixValidation
+    # and test_event_style_d_rejected_with_value_error above).
+    # ------------------------------------------------------------------
+
+    @staticmethod
+    def _make_non_absorbing_panel(seed=0, tau=3.0, n_units=16, n_periods=8, all_toggle=False):
+        """TWFE-clean panel with on/off (non-absorbing) treatment, no dynamic effects.
+
+        Y_it(0) = alpha_i + beta_t + noise; Y_it(1) = Y_it(0) + tau. Some units
+        switch treatment on and then off again, so D is non-monotonic.
+        """
+        rng = np.random.default_rng(seed)
+        alpha = rng.normal(0.0, 1.0, n_units)
+        beta = rng.normal(0.0, 1.0, n_periods)
+        rows = []
+        for i in range(n_units):
+            d = np.zeros(n_periods, dtype=int)
+            if all_toggle:
+                on = 3 + (i % 2)
+                d[on : on + 2] = 1  # every unit treated on an interior block
+            elif i % 4 == 0 and i > 0:
+                d[4:6] = 1  # on then off (non-absorbing)
+            elif i % 3 == 0:
+                d[5:] = 1  # absorbing block (mix of patterns)
+            for t in range(n_periods):
+                y0 = alpha[i] + beta[t] + rng.normal(0.0, 0.05)
+                rows.append(
+                    {
+                        "unit": i,
+                        "period": t,
+                        "outcome": y0 + (tau if d[t] == 1 else 0.0),
+                        "treated": int(d[t]),
+                    }
+                )
+        return pd.DataFrame(rows)
+
+    @pytest.mark.slow
+    def test_non_absorbing_general_assignment_supported(self):
+        """TROP(non_absorbing=True) accepts on/off treatment and recovers the
+        ATT on a no-dynamic-effects DGP (Eq. 1 averages over all D=1 cells;
+        Eq. 12 / Algorithm 2 masks treated cells per (i, t)). A caveat
+        ``UserWarning`` is emitted because Theorem 5.1's guarantee is proven
+        only under block assignment.
+        """
+        tau = 3.0
+        df = self._make_non_absorbing_panel(seed=0, tau=tau)
+        n_treated_cells = int(df["treated"].to_numpy().sum())
+        # Sanity: the panel really is non-absorbing (some unit goes 1 -> 0).
+        treated_wide = df.pivot(index="period", columns="unit", values="treated").to_numpy()
+        assert bool(
+            (np.diff(treated_wide, axis=0) < 0).any()
+        ), "test panel must contain a 1->0 transition"
+
+        est = TROP(
+            method="local",
+            non_absorbing=True,
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=2,
+            seed=1,
+        )
+        with pytest.warns(UserWarning, match="(?i)non_absorbing.*Theorem 5.1"):
+            res = est.fit(df, "outcome", "treated", "unit", "period")
+
+        # Estimand: ATT averages the per-cell effects over all D=1 cells (Eq. 1).
+        assert np.isfinite(res.att)
+        assert abs(res.att - tau) < 0.5, f"ATT {res.att} should recover tau={tau}"
+        # One finite per-cell effect per treated cell (Eq. 12 / Algorithm 2).
+        assert len(res.treatment_effects) == n_treated_cells
+        assert all(np.isfinite(v) for v in res.treatment_effects.values())
+
+    def test_non_absorbing_no_caveat_in_default_mode(self):
+        """The non-absorbing caveat warning fires ONLY for non_absorbing=True;
+        a default (absorbing) fit must not emit it.
+        """
+        # Absorbing staggered panel (monotonic per unit).
+        rows = []
+        for i in range(12):
+            g = 4 if i < 3 else (6 if i < 6 else None)
+            for t in range(8):
+                d = 1 if (g is not None and t >= g) else 0
+                rows.append(
+                    {
+                        "unit": i,
+                        "period": t,
+                        "outcome": float(i) * 0.1 + float(t) * 0.2 + (2.0 if d else 0.0),
+                        "treated": d,
+                    }
+                )
+        df = pd.DataFrame(rows)
+        est = TROP(
+            method="local",
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=2,
+            seed=1,
+        )
+        with warnings.catch_warnings(record=True) as caught:
+            warnings.simplefilter("always")
+            est.fit(df, "outcome", "treated", "unit", "period")
+        assert not any(
+            "non_absorbing" in str(w.message) for w in caught
+        ), "default (absorbing) mode must not emit the non_absorbing caveat"
+
+    @pytest.mark.slow
+    def test_non_absorbing_unbalanced_panel_supported(self):
+        """Non-absorbing support tolerates unbalanced panels (random missing
+        control cells) and still returns a finite ATT.
+        """
+        df = self._make_non_absorbing_panel(seed=7, tau=3.0)
+        # Drop 10% of untreated rows at random (missing control observations).
+        rng = np.random.default_rng(11)
+        control_rows = df.index[df["treated"] == 0].to_numpy()
+        drop = rng.choice(control_rows, size=int(0.1 * len(control_rows)), replace=False)
+        df_unbalanced = df.drop(index=drop).reset_index(drop=True)
+
+        est = TROP(
+            method="local",
+            non_absorbing=True,
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=2,
+            seed=1,
+        )
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            res = est.fit(df_unbalanced, "outcome", "treated", "unit", "period")
+        assert np.isfinite(res.att)
+        assert np.isfinite(res.se)
+
+    @pytest.mark.slow
+    @pytest.mark.parametrize("lambda_unit", [0.0, 1.0])
+    def test_non_absorbing_always_treated_unit_not_raw_outcome(self, lambda_unit):
+        """A treated cell whose UNIT has no observed control cell leaves ``alpha_i``
+        unidentified, so its tau would silently leak the unit fixed effect (a
+        raw-outcome-like value). Such cells must be marked non-estimable (NaN).
+        This holds for BOTH ``lambda_unit=0`` (uniform unit weights still give the
+        always-treated unit no own control row) and ``lambda_unit>0`` (inf
+        distance -> zero donor weights). Estimable cells still recover the effect
+        and the bootstrap SE stays finite.
+
+        Locks the documented behavior (REGISTRY ## TROP "non-absorbing
+        non-estimable-cell trimming" Note): the ATT is the mean over estimable
+        treated cells (library-wide non-estimable->NaN convention).
+        """
+        rng = np.random.default_rng(0)
+        n_units, n_periods, tau = 8, 8, 5.0
+        alpha = rng.normal(0.0, 1.0, n_units)
+        alpha[0] = 10.0  # large unit-0 FE so any leak is unmistakable
+        beta = rng.normal(0.0, 1.0, n_periods)
+        rows = []
+        for i in range(n_units):
+            for t in range(n_periods):
+                if i == 0:
+                    d = 1  # always-treated: no untreated history
+                elif i % 3 == 0:
+                    d = 1 if 4 <= t <= 5 else 0  # on/off
+                else:
+                    d = 1 if t >= 6 else 0  # untreated history present
+                y0 = alpha[i] + beta[t] + rng.normal(0.0, 0.05)
+                rows.append(
+                    {
+                        "unit": i,
+                        "period": t,
+                        "outcome": y0 + (tau if d else 0.0),
+                        "treated": int(d),
+                    }
+                )
+        df = pd.DataFrame(rows)
+        est = TROP(
+            method="local",
+            non_absorbing=True,
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[lambda_unit],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=3,
+            seed=1,
+        )
+        with pytest.warns(UserWarning, match="(?i)not estimable"):
+            res = est.fit(df, "outcome", "treated", "unit", "period")
+
+        # Every cell of the always-treated unit is non-estimable (NaN), never a
+        # fixed-effect-contaminated raw outcome (alpha_0 = 10 would leak otherwise).
+        raw_y = {
+            t: float(df[(df.unit == 0) & (df.period == t)]["outcome"].iloc[0])
+            for t in range(n_periods)
+        }
+        u0 = {k: v for k, v in res.treatment_effects.items() if k[0] == 0}
+        assert len(u0) == n_periods
+        for (_, t), v in u0.items():
+            assert np.isnan(v), f"cell(0,{t}) should be NaN, got {v}"
+            assert not np.isclose(v, raw_y[t]), "tau must not equal raw outcome"
+
+        # Estimable cells (other units) remain finite and aggregate near the truth.
+        assert np.isfinite(res.att)
+        assert np.isfinite(res.se)
+        assert abs(res.att - tau) < 0.6
+
+    @pytest.mark.slow
+    def test_non_absorbing_fully_treated_period_not_estimable(self):
+        """A period in which EVERY unit is treated has no control cell, so
+        ``beta_t`` is unidentified and that period's tau would leak the time fixed
+        effect. Those cells must be NaN (non-estimable), not finite raw-outcome
+        values; treated cells in other periods still recover the effect.
+        """
+        rng = np.random.default_rng(1)
+        n_units, n_periods, tau, hot = 8, 8, 3.0, 4
+        alpha = rng.normal(0.0, 1.0, n_units)
+        beta = rng.normal(0.0, 1.0, n_periods)
+        beta[hot] = 20.0  # large period-`hot` FE so any leak is unmistakable
+        rows = []
+        for i in range(n_units):
+            for t in range(n_periods):
+                if t == hot:
+                    d = 1  # every unit treated at `hot` -> no control at that period
+                elif i % 2 == 0:
+                    d = 1 if t >= 6 else 0
+                else:
+                    d = 1 if 1 <= t <= 2 else 0
+                y0 = alpha[i] + beta[t] + rng.normal(0.0, 0.05)
+                rows.append(
+                    {
+                        "unit": i,
+                        "period": t,
+                        "outcome": y0 + (tau if d else 0.0),
+                        "treated": int(d),
+                    }
+                )
+        df = pd.DataFrame(rows)
+        # Sanity: period `hot` is fully treated.
+        assert bool((df[df.period == hot]["treated"].to_numpy() == 1).all())
+        est = TROP(
+            method="local",
+            non_absorbing=True,
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=3,
+            seed=1,
+        )
+        with pytest.warns(UserWarning, match="(?i)not estimable"):
+            res = est.fit(df, "outcome", "treated", "unit", "period")
+
+        hot_cells = {k: v for k, v in res.treatment_effects.items() if k[1] == hot}
+        assert len(hot_cells) == n_units
+        for k, v in hot_cells.items():
+            assert np.isnan(v), f"fully-treated-period cell {k} should be NaN, got {v}"
+        # Treated cells in other (estimable) periods still recover the effect.
+        assert np.isfinite(res.att)
+        assert np.isfinite(res.se)
+        assert abs(res.att - tau) < 0.6
+
+    @pytest.mark.slow
+    def test_non_absorbing_fully_toggling_no_never_treated_unit(self):
+        """non_absorbing admits a fully toggling panel with NO never-treated unit
+        (every unit is treated at some point but retains observed untreated
+        cells). Identification falls back to untreated cells, and the bootstrap
+        runs via the Python path (the Rust stratified resampler can return a
+        degenerate ~0 SE on an empty control stratum). Asserts admission + finite
+        ATT/SE + recovery.
+        """
+        tau = 4.0
+        df = self._make_non_absorbing_panel(seed=2, tau=tau, all_toggle=True)
+        # Sanity: no never-treated unit (every unit treated at some period).
+        assert bool((df.groupby("unit")["treated"].max().to_numpy() == 1).all())
+        est = TROP(
+            method="local",
+            non_absorbing=True,
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=3,
+            seed=1,
+        )
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            res = est.fit(df, "outcome", "treated", "unit", "period")
+        assert np.isfinite(res.att)
+        assert np.isfinite(res.se)
+        assert res.se > 0  # not a degenerate empty-stratum ~0 SE
+        assert abs(res.att - tau) < 0.6
+
+    @pytest.mark.slow
+    def test_unbalanced_absorbing_unidentified_unit_not_estimable(self):
+        """The estimability guard applies to DEFAULT (absorbing) local fits too,
+        not only non_absorbing. On an unbalanced absorbing panel where a treated
+        unit's pre-treatment rows are entirely missing, that unit has no observed
+        control cell, so ``alpha_i`` is unidentified; its cells must be NaN (the
+        prior behavior silently reported a fixed-effect-contaminated tau), while
+        the rest of the panel is estimated normally. ``non_absorbing=False``.
+        """
+        rng = np.random.default_rng(3)
+        n_periods, tau = 6, 4.0
+        beta = rng.normal(0.0, 1.0, n_periods)
+        rows = []
+        # Never-treated controls (units 0-2), observed all periods.
+        for i in range(3):
+            a = rng.normal(0.0, 1.0)
+            for t in range(n_periods):
+                rows.append(
+                    {
+                        "unit": i,
+                        "period": t,
+                        "outcome": a + beta[t] + rng.normal(0, 0.05),
+                        "treated": 0,
+                    }
+                )
+        # Well-observed treated unit (unit 3), adopts at t=4, full pre-history.
+        a3 = rng.normal(0.0, 1.0)
+        for t in range(n_periods):
+            d = 1 if t >= 4 else 0
+            rows.append(
+                {
+                    "unit": 3,
+                    "period": t,
+                    "outcome": a3 + beta[t] + rng.normal(0, 0.05) + (tau if d else 0),
+                    "treated": d,
+                }
+            )
+        # Pathological treated unit (unit 4): adopts at t=3 but ONLY observed at
+        # its treated periods 3,4,5 -- pre-treatment rows 0,1,2 are MISSING, so it
+        # has no observed control cell and alpha_4 is unidentified.
+        a4 = rng.normal(0.0, 1.0)
+        a4 += 12.0  # large FE so any leak into tau would be unmistakable
+        for t in (3, 4, 5):
+            rows.append(
+                {
+                    "unit": 4,
+                    "period": t,
+                    "outcome": a4 + beta[t] + rng.normal(0, 0.05) + tau,
+                    "treated": 1,
+                }
+            )
+        df = pd.DataFrame(rows)
+
+        est = TROP(
+            method="local",  # non_absorbing defaults to False (absorbing)
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=3,
+            seed=1,
+        )
+        with pytest.warns(UserWarning, match="(?i)not estimable"):
+            res = est.fit(df, "outcome", "treated", "unit", "period")
+
+        # Unit 4's treated cells are NaN (alpha_4 unidentified), never a leaked FE.
+        u4 = {k: v for k, v in res.treatment_effects.items() if k[0] == 4}
+        assert len(u4) == 3
+        for k, v in u4.items():
+            assert np.isnan(v), f"unidentified-unit cell {k} should be NaN, got {v}"
+        # Unit 3 (well-observed) is estimated, and the overall ATT recovers the effect.
+        u3 = [v for k, v in res.treatment_effects.items() if k[0] == 3 and np.isfinite(v)]
+        assert len(u3) > 0
+        assert np.isfinite(res.att)
+        assert abs(res.att - tau) < 0.6
+
+    @pytest.mark.slow
+    def test_non_absorbing_disconnected_support_not_estimable(self):
+        """Strict two-way-FE identification: ``alpha_i + beta_t`` is pinned only
+        within a connected component of the observed-control bipartite graph. A
+        treated cell whose target unit and target period fall in DIFFERENT
+        components is non-estimable even though both have *some* control support
+        (the marginal row/column check would wrongly pass it). Such cells must be
+        NaN, not a finite cross-component FE-contaminated value.
+
+        Construction (periods 0-5): component A = units {0,1,2} whose untreated
+        periods rotate within {0,1,2,3} (so A's control graph connects periods
+        0-3 and units 0-2 into one component, with estimable treated cells inside
+        it); component B = units {3,4} untreated only at periods {4,5}. A and B
+        share no unit or period, so they are disconnected. Target cell (unit 0,
+        period 4): unit 0 in A, period 4 in B -> alpha_0 + beta_4 unidentified ->
+        NaN. A large beta_4 makes any cross-component leak unmistakable; component
+        A still yields a finite ATT.
+        """
+        rng = np.random.default_rng(0)
+        n_periods, tau = 6, 3.0
+        alpha = np.array([0.0, 1.0, 2.0, 5.0, 6.0])
+        beta = np.zeros(n_periods)
+        beta[4] = 20.0
+        # untreated-period sets: component A rotates within {0..3} and is treated
+        # at {4,5}; component B is untreated only at {4,5}.
+        untreated = {
+            0: {0, 1},
+            1: {1, 2},
+            2: {2, 3},
+            3: {4, 5},
+            4: {4, 5},
+        }
+        rows = []
+        for i in range(5):
+            for t in range(n_periods):
+                d = 0 if t in untreated[i] else 1
+                rows.append(
+                    {
+                        "unit": i,
+                        "period": t,
+                        "outcome": alpha[i] + beta[t] + rng.normal(0, 0.05) + (tau if d else 0),
+                        "treated": int(d),
+                    }
+                )
+        df = pd.DataFrame(rows)
+        est = TROP(
+            method="local",
+            non_absorbing=True,
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=3,
+            seed=1,
+        )
+        with pytest.warns(UserWarning, match="(?i)not estimable"):
+            res = est.fit(df, "outcome", "treated", "unit", "period")
+
+        # The cross-component target cell (unit 0 treated at period 4) is
+        # non-estimable (NaN), not a beta_4-contaminated value.
+        cell = res.treatment_effects.get((0, 4))
+        assert cell is not None and np.isnan(cell), f"(0,4) should be NaN, got {cell}"
+        # Within-component A treated cells (e.g. unit 0 at period 2) stay
+        # estimable, so the fit still produces a finite ATT.
+        assert np.isfinite(res.treatment_effects.get((0, 2), np.nan))
+        assert np.isfinite(res.att)
diff --git a/tests/test_trop.py b/tests/test_trop.py
index 88642481..a7bc3cfc 100644
--- a/tests/test_trop.py
+++ b/tests/test_trop.py
@@ -8,6 +8,7 @@
 import pandas as pd
 import pytest
 
+from diff_diff import HAS_RUST_BACKEND
 from diff_diff.prep import generate_factor_data
 from diff_diff.trop import TROP, TROPResults, trop
 from diff_diff.trop_local import _run_trop_bootstrap_loop
@@ -909,6 +910,268 @@ def test_d_matrix_validation_error_message_helpful(self):
         assert "absorbing state" in error_msg
         assert "monotonic" in error_msg.lower() or "non-decreasing" in error_msg.lower()
         assert "D[t, i] = 1 for all t >= first treatment" in error_msg
+        # Also steers genuine on/off (non-absorbing) users to the opt-in.
+        assert "non_absorbing" in error_msg
+
+    @staticmethod
+    def _non_absorbing_df(seed=0, tau=3.0, n_units=14, n_periods=8):
+        """Small TWFE-clean panel with on/off (non-monotonic) treatment."""
+        rng = np.random.default_rng(seed)
+        alpha = rng.normal(0.0, 1.0, n_units)
+        beta = rng.normal(0.0, 1.0, n_periods)
+        rows = []
+        for i in range(n_units):
+            d = np.zeros(n_periods, dtype=int)
+            if i % 4 == 0 and i > 0:
+                d[4:6] = 1  # on then off (non-absorbing)
+            elif i % 3 == 0:
+                d[5:] = 1  # absorbing block
+            for t in range(n_periods):
+                y0 = alpha[i] + beta[t] + rng.normal(0.0, 0.05)
+                rows.append(
+                    {
+                        "unit": i,
+                        "period": t,
+                        "outcome": y0 + (tau if d[t] == 1 else 0.0),
+                        "treated": int(d[t]),
+                    }
+                )
+        return pd.DataFrame(rows)
+
+    @pytest.mark.slow
+    def test_non_absorbing_opt_in_accepted(self):
+        """TROP(non_absorbing=True) accepts a non-monotonic D and returns a
+        finite ATT instead of raising (the default still rejects -- see
+        test_d_matrix_absorbing_state_validation_invalid).
+        """
+        df = self._non_absorbing_df(seed=0, tau=3.0)
+        est = TROP(
+            method="local",
+            non_absorbing=True,
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=2,
+            seed=42,
+        )
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")  # caveat warning asserted elsewhere
+            results = est.fit(df, "outcome", "treated", "unit", "period")
+        assert isinstance(results, TROPResults)
+        assert np.isfinite(results.att)
+
+    def test_non_absorbing_global_method_raises(self):
+        """non_absorbing=True is local-only; the global method must raise."""
+        df = self._non_absorbing_df(seed=1)
+        est = TROP(
+            method="global",
+            non_absorbing=True,
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=2,
+        )
+        with pytest.raises(ValueError, match="(?i)non_absorbing.*local|local.*non_absorbing"):
+            est.fit(df, "outcome", "treated", "unit", "period")
+
+    def test_non_absorbing_param_round_trip_and_validation(self):
+        """non_absorbing round-trips through get_params/set_params and rejects
+        non-bool values in both __init__ and set_params.
+        """
+        est = TROP(non_absorbing=True)
+        assert est.get_params()["non_absorbing"] is True
+        est.set_params(non_absorbing=False)
+        assert est.non_absorbing is False
+        with pytest.raises(ValueError, match="non_absorbing must be a bool"):
+            TROP(non_absorbing="yes")  # type: ignore[arg-type]
+        with pytest.raises(ValueError, match="non_absorbing must be a bool"):
+            TROP().set_params(non_absorbing=1)  # type: ignore[arg-type]
+
+    @pytest.mark.slow
+    @pytest.mark.skipif(not HAS_RUST_BACKEND, reason="Rust backend not available")
+    def test_non_absorbing_rust_python_parity(self):
+        """The Rust local path is absorbing-agnostic: on a non-absorbing panel
+        it produces the same ATT as the forced-Python path (single-point grids
+        remove lambda-selection ambiguity, so only solver roundoff remains).
+        """
+        # The package re-exports the ``trop`` function, shadowing the submodule
+        # attribute, so reach the modules via sys.modules (matches the idiom used
+        # by the other Rust-toggle tests in this file).
+        trop_mod = sys.modules["diff_diff.trop"]
+        trop_local_mod = sys.modules["diff_diff.trop_local"]
+
+        df = self._non_absorbing_df(seed=3, tau=3.0)
+        kwargs = dict(
+            method="local",
+            non_absorbing=True,
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=2,
+            seed=7,
+        )
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            att_rust = TROP(**kwargs).fit(df, "outcome", "treated", "unit", "period").att
+            with (
+                patch.object(trop_mod, "HAS_RUST_BACKEND", False),
+                patch.object(trop_local_mod, "HAS_RUST_BACKEND", False),
+            ):
+                att_py = TROP(**kwargs).fit(df, "outcome", "treated", "unit", "period").att
+        assert np.isfinite(att_rust) and np.isfinite(att_py)
+        np.testing.assert_allclose(att_rust, att_py, atol=1e-6, rtol=1e-6)
+
+    def test_non_absorbing_rejects_no_observed_untreated_cells(self):
+        """non_absorbing identification needs OBSERVED untreated cells. An
+        unbalanced panel whose only D=0 cells are structural gaps (every observed
+        row is treated) must raise before LOOCV/default fallback, not fit on
+        raw-outcome residuals. Guards against the missing-cell-fill loophole.
+        """
+        # Every observed row treated=1; ~half the (unit, period) cells dropped so
+        # all 4 periods still appear in the pivot and the missing cells fill to
+        # D=0 (with NaN outcomes).
+        rows = []
+        for i in range(6):
+            for t in range(4):
+                if (i + t) % 2 == 0:  # keep ~half -> unbalanced
+                    rows.append(
+                        {"unit": i, "period": t, "outcome": float(i) * 0.1 + t, "treated": 1}
+                    )
+        df = pd.DataFrame(rows)
+        est = TROP(
+            method="local",
+            non_absorbing=True,
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=2,
+            seed=1,
+        )
+        with pytest.raises(ValueError, match="(?i)no observed untreated"):
+            est.fit(df, "outcome", "treated", "unit", "period")
+
+    def test_non_absorbing_rejects_single_control_period(self):
+        """non_absorbing requires >=2 periods with an observed untreated cell.
+        A panel with exactly one such period must raise (factor-model
+        identifiability floor), counting only OBSERVED untreated cells.
+        """
+        # Balanced panel, every cell treated except one observed untreated cell
+        # at (unit 0, period 0) -> only one period has an untreated observation.
+        rows = []
+        for i in range(6):
+            for t in range(5):
+                treated = 0 if (i == 0 and t == 0) else 1
+                rows.append(
+                    {"unit": i, "period": t, "outcome": float(i) * 0.1 + t, "treated": treated}
+                )
+        df = pd.DataFrame(rows)
+        est = TROP(
+            method="local",
+            non_absorbing=True,
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=2,
+            seed=1,
+        )
+        with pytest.raises(ValueError, match="(?i)2 periods .* observed untreated"):
+            est.fit(df, "outcome", "treated", "unit", "period")
+
+    @pytest.mark.slow
+    def test_non_absorbing_recorded_on_results(self):
+        """The assignment scope is persisted on TROPResults / to_dict() so a
+        saved result retains the non-absorbing + inference-caveat context after
+        the fit-time warning is gone.
+        """
+        grid = dict(
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=2,
+            seed=1,
+        )
+        df = self._non_absorbing_df(seed=0, tau=3.0)
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore")
+            res = TROP(method="local", non_absorbing=True, **grid).fit(
+                df, "outcome", "treated", "unit", "period"
+            )
+        assert res.non_absorbing is True
+        assert res.to_dict()["non_absorbing"] is True
+
+        # Default (absorbing) fit records False.
+        abs_rows = []
+        for i in range(12):
+            g = 4 if i < 6 else None
+            for t in range(8):
+                d = 1 if (g is not None and t >= g) else 0
+                abs_rows.append(
+                    {
+                        "unit": i,
+                        "period": t,
+                        "outcome": float(i) * 0.1 + 0.2 * t + (2.0 if d else 0.0),
+                        "treated": d,
+                    }
+                )
+        res_abs = TROP(method="local", **grid).fit(
+            pd.DataFrame(abs_rows), "outcome", "treated", "unit", "period"
+        )
+        assert res_abs.non_absorbing is False
+        assert res_abs.to_dict()["non_absorbing"] is False
+
+    @pytest.mark.slow
+    @pytest.mark.skipif(not HAS_RUST_BACKEND, reason="Rust backend not available")
+    def test_unbalanced_panel_bootstrap_uses_python_guard(self):
+        """On an UNBALANCED panel (default absorbing here), the point fit may be
+        fully estimable, yet a bootstrap resample can lose a treated cell's only
+        control support. The Rust bootstrap lacks the estimability guard, so the
+        fit must route the bootstrap to the guarded Python path whenever the panel
+        has missing cells -- locking the force_python condition. Balanced panels
+        keep the Rust happy path (covered elsewhere).
+        """
+        rng = np.random.default_rng(5)
+        rows = []
+        for i in range(12):
+            g = 4 if i < 4 else (6 if i < 8 else None)  # 4 never-treated controls
+            for t in range(8):
+                d = 1 if (g is not None and t >= g) else 0
+                rows.append(
+                    {
+                        "unit": i,
+                        "period": t,
+                        "outcome": float(i) * 0.1
+                        + 0.2 * t
+                        + rng.normal(0, 0.05)
+                        + (2.0 if d else 0.0),
+                        "treated": d,
+                    }
+                )
+        df = pd.DataFrame(rows)
+        # Drop a few control rows -> unbalanced, but leave ample support so the
+        # point fit trims nothing (isolates the missing-cell trigger).
+        ctrl = df.index[df["treated"] == 0].to_numpy()
+        drop = rng.choice(ctrl, size=max(1, int(0.06 * len(ctrl))), replace=False)
+        df = df.drop(index=drop).reset_index(drop=True)
+
+        est = TROP(
+            method="local",
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=3,
+            seed=1,
+        )
+        trop_local_mod = sys.modules["diff_diff.trop_local"]
+        with patch.object(trop_local_mod, "_rust_bootstrap_trop_variance") as mock_rust:
+            with warnings.catch_warnings():
+                warnings.simplefilter("ignore")
+                res = est.fit(df, "outcome", "treated", "unit", "period")
+        # The Rust bootstrap must NOT be used for an unbalanced panel.
+        mock_rust.assert_not_called()
+        # The point fit itself trimmed nothing (so the trigger was the missing
+        # cells, not point-fit non-estimability).
+        assert all(np.isfinite(v) for v in res.treatment_effects.values())
+        assert np.isfinite(res.att) and np.isfinite(res.se)
 
 
 @pytest.mark.slow
@@ -3361,6 +3624,71 @@ def _make_survey_panel_and_design():
         )
         return df, survey_design, resolved_survey
 
+    def test_non_absorbing_rao_wu_zero_estimable_weight_is_nan_not_crash(self):
+        """Survey Rao-Wu bootstrap after non-estimable trimming: a draw whose
+        nonzero rescaled weight lands only on a skipped (non-estimable) unit
+        leaves the estimable treated cells with zero total weight. np.average
+        would raise ZeroDivisionError; the guard must return NaN for that draw so
+        the bootstrap stays NaN-safe (no crash) per the contract.
+        """
+        from unittest.mock import patch
+
+        from diff_diff import SurveyDesign
+
+        # unit 0: always treated (non-estimable). units 1,2: treated at periods
+        # 4,5. units 3,4,5: never-treated controls (so periods 4,5 are NOT fully
+        # treated and units 1,2 have estimable cells).
+        rows = []
+        for i in range(6):
+            for t in range(6):
+                if i == 0:
+                    d = 1
+                elif i in (1, 2):
+                    d = 1 if t >= 4 else 0
+                else:
+                    d = 0
+                rows.append(
+                    {
+                        "unit": i,
+                        "period": t,
+                        "outcome": float(i) + t + (2.0 if d else 0.0),
+                        "treated": d,
+                        "weight": 1.0,
+                        "psu": i,
+                    }
+                )
+        df = pd.DataFrame(rows)
+        survey_design = SurveyDesign(weights="weight", psu="psu")
+
+        # Per-unit Rao-Wu draw: nonzero weight only on unit 0 (always-treated,
+        # skipped); estimable units 1,2 get zero -> zero estimable-cell weight.
+        zero_estimable = np.zeros(6, dtype=np.float64)
+        zero_estimable[0] = 1.0
+
+        est = TROP(
+            method="local",
+            non_absorbing=True,
+            lambda_time_grid=[0.0],
+            lambda_unit_grid=[0.0],
+            lambda_nn_grid=[0.1],
+            n_bootstrap=3,
+            seed=1,
+        )
+        with patch(
+            "diff_diff.bootstrap_utils.generate_rao_wu_weights",
+            return_value=zero_estimable,
+        ):
+            with warnings.catch_warnings():
+                warnings.simplefilter("ignore")
+                # Must not raise ZeroDivisionError.
+                res = est.fit(
+                    df, "outcome", "treated", "unit", "period", survey_design=survey_design
+                )
+        # Point fit (original unit weights) is estimable; every bootstrap draw is
+        # degenerate -> SE is NaN, not a crash.
+        assert np.isfinite(res.att)
+        assert np.isnan(res.se)
+
     def test_local_rao_wu_bootstrap_warns_above_5pct_failure(self):
         """Local Rao-Wu survey bootstrap: forced failures → proportional warn."""
         from unittest.mock import patch