feat(had): cluster-robust SE on the continuous paths (Phase 2a) by igerber · Pull Request #596 · igerber/diff-diff

igerber · 2026-07-01T21:39:48Z

Summary

Thread cluster= into bias_corrected_local_linear on the HeterogeneousAdoptionDiD continuous designs (continuous_at_zero / continuous_near_d_lower). Previously cluster= was ignored on the continuous path with a UserWarning; the Phase-1c wrapper already supports cluster, so the estimator just needed to pass it through. The CCT-2014 robust variance becomes cluster-robust and the β̂-scale SE is se_robust / |den|.
Composes with the weights= shortcut (weighted cluster-robust). cluster= + survey_design= raises NotImplementedError — the Binder (1983) TSL survey variance is composed from the per-unit influence function and would silently override the cluster-robust SE; route clustering through survey_design=SurveyDesign(psu=<cluster_col>) instead.
Cluster must be unit-constant: a nonexistent column, NaN, or within-unit-varying cluster now raises (mirroring the mass-point path) instead of being silently ignored. Result metadata reports vcov_type="cr1" + cluster_name with inference_method="analytical_nonparametric" (distinguishing the clustered CCT variance from the mass-point 2SLS CR1 sandwich).
Scope: Phase 2a static path only. The event-study (Phase 2b) path still defers cluster= with a warning (its per-horizon cband sup-t bootstrap normalizes HC-scale perturbations and would mix variance families under clustering); rescoped the TODO.md row to Phase 2b.

Methodology references

Method name(s): HeterogeneousAdoptionDiD continuous-dose designs (de Chaisemartin, Ciccia, D'Haultfœuille & Knau 2026), Theorem 1 / Eq 3 (Design 1') and Theorem 3 / Eq 11 (Design 1); Phase-1c bias-corrected local linear (Calonico-Cattaneo-Titiunik 2014) cluster-robust variance.
Paper / source link(s): REGISTRY § "HeterogeneousAdoptionDiD" — new "Note (continuous cluster-robust SE, Phase 2a)" documents the threading, the exact-rescale validation anchor, and the survey/weights/event-study interactions.
Any intentional deviations from the source (and why): None new. The clustered CCT SE is the existing bias_corrected_local_linear cluster path (golden-tested vs nprobust DGP 4); the β-scale rescale (se_robust / |den|) is a deterministic linear transform, so the estimator SE equals the direct call to machine precision.

Validation

Tests added/updated: tests/test_had.py::TestClusterHandling — new test_cluster_threaded_on_continuous_path, test_cluster_weighted_on_continuous_path, test_cluster_threaded_on_continuous_near_d_lower, test_cluster_survey_design_raises_on_continuous, test_auto_design_continuous_threads_cluster; the 4 pre-existing "cluster ignored on continuous → lax validation" tests are flipped to expect the now-correct ValueError on invalid cluster. Each SE test asserts an exact match (atol=1e-12) to bias_corrected_local_linear(cluster=...).se_robust / |den| (unweighted and weighted).
tests/test_had.py 309 passed / 2 skipped; tests/test_methodology_had.py + tests/test_had_pretests.py 281 passed (non-cluster continuous paths byte-unchanged). black/ruff clean; mypy 0-new.

Security / privacy

Confirm no secrets/PII in this PR: Yes

github-actions · 2026-07-01T21:42:58Z

PR Review Report

Overall Assessment

⛔ Blocker — one unmitigated P0 finding.

Executive Summary

The PR correctly documents the intended Phase 2a static continuous-path threading of cluster= into bias_corrected_local_linear.
The main methodology path matches the REGISTRY note for clustered CCT SE and beta-scale rescaling.
However, the new continuous clustered path does not guard against fewer than two effective clusters.
The underlying lprobust_vce clustered meat computes and returns a matrix even with one cluster, so the new HAD path can report finite clustered SE / p-values / CIs where clustered inference is unidentified.
The event-study continuous cluster deferral is documented in TODO.md and REGISTRY, so it is not a blocker.

Methodology

Finding 1

Severity: P0
Location: diff_diff/had.py:L3592-L3608, diff_diff/had.py:L3942-L3963, diff_diff/_nprobust_port.py:L369-L386
Impact: The PR newly exposes continuous-path clustered inference, but it does not enforce at least two effective clusters. HeterogeneousAdoptionDiD.fit() passes cluster_arr into _fit_continuous() and labels the result vcov_type="cr1", while bias_corrected_local_linear() delegates to lprobust_vce(). That port path iterates over np.unique(cluster) and returns the clustered meat with no n_clusters < 2 guard. For a single cluster, this can produce a finite SE, and safe_inference() will then emit finite t-stat, p-value, and CI for unidentified clustered inference. This conflicts with the project’s NaN-inference contract and the mass-point CR1 behavior, which explicitly returns se=nan when n_clusters < 2 at diff_diff/had.py:L2455-L2471.
Concrete fix: Add an effective-cluster guard for the continuous clustered path. The robust fix is inside the local-linear/lprobust layer after positive-weight and active-window filtering, because a sample can have multiple global clusters but only one cluster in the kernel window. Return se_robust=np.nan or otherwise propagate NaN to HAD so safe_inference() sets t-stat, p-value, and CI to NaN. Add HAD-level tests for single-cluster continuous fits.

Code Quality

Severity: None
Impact: Apart from the P0 guard above, the static path is narrowly scoped and uses the existing _fit_continuous() and safe_inference() patterns.
Concrete fix: None beyond the P0 fix.

Performance

Severity: None
Impact: The extra cluster aggregation is O(G) and only runs when cluster= is supplied. No material performance concern in the changed diff.
Concrete fix: None.

Maintainability

Severity: P2
Location: diff_diff/had.py:L3572-L3584
Impact: cluster= + survey_design= is rejected after cluster extraction/validation, so invalid cluster columns can raise ValueError before the intended unsupported-composition NotImplementedError. This is not a correctness issue, but it makes the public contract less predictable.
Concrete fix: Consider moving the cluster_arg is not None and resolved_survey_unit_full is not None rejection before cluster-column extraction for continuous paths.

Tech Debt

Finding 1

Severity: P3 informational
Location: TODO.md:L40, docs/methodology/REGISTRY.md:L3202
Impact: Continuous event-study cluster= remains deferred. This is explicitly tracked and documented, so it is not a defect for this PR.
Concrete fix: No action required for Phase 2a.

Security

Severity: None
Impact: No secrets, unsafe IO, or security-sensitive behavior found in the changed diff.
Concrete fix: None.

Documentation / Tests

Finding 1

Severity: P1
Location: tests/test_had.py:L1585-L1685
Impact: The new tests cover valid clustered continuous paths, weighted composition, survey rejection, and invalid cluster columns, but they do not cover the degenerate effective-cluster case. That gap lets the P0 finite-inference issue through.
Concrete fix: Add regression tests asserting NaN-coupled inference for:

continuous_at_zero with all units in one cluster.
continuous_near_d_lower with all units in one cluster.
Weighted continuous path where zero weights leave only one positive-weight cluster.

Path to Approval

Add a continuous clustered-inference guard so fewer than two effective clusters produce se=np.nan and NaN t-stat / p-value / CI.
Ensure the guard applies after positive-weight filtering and, ideally, after the local-linear active-window selection.
Add the three regression tests listed under Documentation / Tests.
Re-run the HAD and local-linear test slices covering clustered continuous paths.

github-actions · 2026-07-01T21:51:44Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 802ff70f7401f7c2825dee192dc7a9e22c8836e9

PR Review Report

Overall Assessment

⛔ Blocker — one unmitigated P0 remains.

Executive Summary

The static continuous cluster= threading now matches the documented beta-scale rescale path: bias_corrected_local_linear(..., cluster=...) then se_robust / |den|.
The previous global single-cluster issue is partially addressed: the PR now NaNs SE when the positive-weight support has fewer than two clusters.
However, the CCT variance is computed on the active kernel window, not the full positive-weight support.
The new guard can still report finite clustered SE / p-values / CIs when globally there are ≥2 clusters but only one cluster contributes inside the local-linear bandwidth.
Continuous event-study cluster deferral remains documented in TODO.md and REGISTRY.md, so it is informational only.
I could not run tests in this sandbox because numpy is not installed.

Methodology

Finding 1

Severity: P0
Location: diff_diff/had.py:L4018-L4021, diff_diff/_nprobust_port.py:L1219-L1255, diff_diff/_nprobust_port.py:L1357-L1358, diff_diff/_nprobust_port.py:L373-L386
Impact: The previous effective-cluster blocker is only partially fixed. HAD counts clusters on the positive-weight support, but the CCT local-linear variance actually subsets to the active kernel window via ind, then passes eC = cluster[ind] into lprobust_vce(). If two clusters exist globally but all in-window observations belong to one cluster, HAD’s guard does not fire. The port then computes a clustered meat matrix with that one active cluster and can return a finite se_robust; safe_inference() will emit finite t-stat, p-value, and CI for unidentified clustered inference. This is silent incorrect statistical output. The new REGISTRY.md wording says “positive-weight support,” but that does not match the in-code CCT variance contract.

Concrete fix: Count effective clusters after the same active-window selection used by lprobust: positive-weight filtering, sorting if needed, kernel weights, ind_b / ind_h, and the h > b conditional. The safest fix is in bias_corrected_local_linear() or _nprobust_port.lprobust() immediately after eC = cluster[ind]: if len(np.unique(eC)) < 2, return se_rb=np.nan and ensure CI/inference fields are NaN. Then update the registry note from “positive-weight support” to “active kernel window after positive-weight filtering.”

Code Quality

Severity: None
Impact: The static path is otherwise narrowly scoped and continues to route inference through safe_inference().
Concrete fix: None beyond the P0 guard.

Performance

Severity: None
Impact: The added cluster handling is linear in sample size and only active when cluster= is supplied.
Concrete fix: None.

Maintainability

Severity: None
Impact: The prior continuous cluster= + survey_design= error-order concern is addressed by rejecting the unsupported composition before cluster extraction.
Concrete fix: None.

Tech Debt

Finding 1

Severity: P3 informational
Location: TODO.md:L40, docs/methodology/REGISTRY.md:L3202, diff_diff/had.py:L4436-L4444
Impact: Continuous event-study cluster= remains deferred with a warning. This is explicitly tracked and documented, so it is not a defect for this Phase 2a static-path PR.
Concrete fix: No action required for this PR.

Security

Severity: None
Impact: No secrets, unsafe IO, or security-sensitive behavior found in the changed diff.
Concrete fix: None.

Documentation / Tests

Finding 1

Severity: P1
Location: tests/test_had.py:L1687-L1738
Impact: The new regression tests cover global one-cluster and positive-weight one-cluster cases, but not the active-bandwidth one-cluster case that the CCT variance actually uses. The P0 can pass those tests.
Concrete fix: Add a regression test with at least two global positive-weight clusters where only one cluster appears inside the selected local-linear bandwidth, and assert se, t_stat, p_value, and both CI endpoints are NaN. Add it for continuous_at_zero; ideally mirror for continuous_near_d_lower.

Path to Approval

Add an active-window effective-cluster guard in the local-linear/lprobust layer, or reproduce the exact active-window mask in HAD before accepting clustered SE.
Set clustered continuous se / se_robust to NaN when active-window cluster count is < 2, so safe_inference() NaNs t-stat, p-value, and CI.
Update REGISTRY.md to say effective clusters are counted on the active kernel window after positive-weight filtering.
Add the active-window single-cluster regression test described above.
Re-run the HAD continuous clustered test slice and local-linear clustered variance tests.

Thread cluster= into bias_corrected_local_linear on the continuous designs (continuous_at_zero / continuous_near_d_lower). Previously cluster= was ignored on the continuous path with a UserWarning; the Phase-1c wrapper already supports cluster, so the estimator just needed to pass it through. The CCT-2014 robust variance becomes cluster-robust and the beta-scale SE is se_robust / |den|. - Resolve the per-unit cluster array for the continuous designs (was mass-point only) and thread it into _fit_continuous -> bias_corrected_local_linear. - Composes with the weights= shortcut (weighted cluster-robust). - cluster= + survey_design= raises NotImplementedError (rejected BEFORE cluster extraction so the error is predictable even with a malformed cluster column; the Binder-TSL survey variance would override the cluster-robust SE -- route clustering through survey_design=SurveyDesign(psu=<col>)). - Cluster must be unit-constant: a nonexistent column, NaN, or within-unit- varying cluster now raises (mirrors the mass-point path) instead of being silently ignored. - Single-cluster guard at the variance-computation site: _nprobust_port.lprobust NaNs se_rb/se_cl when fewer than two clusters fall in the ACTIVE KERNEL WINDOW (eC = cluster[ind]) -- a stricter condition than the global cluster count, since clusters can be separated from the boundary by the bandwidth. This NaN-couples the downstream t-stat / p-value / CI (att stays finite), matching the mass-point CR1 single-cluster contract, and also covers the direct bias_corrected_local_linear API. A window with >=2 clusters is bit-identical, so the nprobust clustered DGP-4 golden parity is preserved. - Result metadata reports vcov_type="cr1" + cluster_name; inference_method stays "analytical_nonparametric". - Event-study (Phase 2b) cluster threading remains a follow-up (the per-horizon cband sup-t bootstrap would mix variance families under clustering); that path still emits the "cluster ignored" warning. Validated: the estimator clustered SE equals the direct bias_corrected_local_linear(cluster=...).se_robust / |den| to machine precision, unweighted and weighted; single/degenerate-cluster fits NaN the inference triple. REGISTRY note + CHANGELOG + TODO (Phase 2b remains) updated. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-07-01T22:00:14Z

🔁 AI review rerun (requested by @igerber)

Head SHA: f3419fb7caa0e6db0793ef800e20eb0e40e6dfd7

PR Review Report

Overall Assessment

✅ Looks good — no unmitigated P0/P1 findings.

Executive Summary

The prior P0 is resolved: the effective-cluster guard now runs inside _nprobust_port.lprobust() after the same active kernel-window selection used for the CCT variance.
Static continuous HAD paths now pass cluster_arr into bias_corrected_local_linear() and rescale se_robust / |den|, matching the new Registry note.
cluster= + survey_design= on static continuous paths is rejected before cluster extraction, avoiding silent Binder-TSL override.
Continuous event-study cluster= remains deferred, but it is documented in both TODO.md and REGISTRY.md, so it is informational only.
I could not run tests in this sandbox because numpy and pytest are not installed; AST parsing of changed Python files passed.

Methodology

Finding 1

Severity: None
Impact: The changed static continuous methodology matches the Registry contract. HAD continuous estimands rely on the paper’s no-true-control, positive-dose design with boundary/local-polynomial control logic; the PR’s variance change is confined to threading the already-ported robust local-linear CCT/nprobust cluster variance and applying the deterministic beta-scale rescale. (arxiv.org)
Concrete fix: None.

Finding 2

Severity: None
Impact: The previous active-window cluster-count blocker is addressed at diff_diff/_nprobust_port.py:L1356-L1375: eC = cluster[ind] is the same active subset consumed by lprobust_vce(), and <2 distinct active clusters now NaNs both SEs before downstream safe_inference(). This matches the Registry note at docs/methodology/REGISTRY.md:L3202.
Concrete fix: None.

Code Quality

Finding 1

Severity: None
Impact: The static path changes are narrowly scoped: cluster validation/extraction is centralized at diff_diff/had.py:L3354-L3392, the continuous call threads cluster_arr at diff_diff/had.py:L3600-L3608, and inference remains routed through safe_inference().
Concrete fix: None.

Performance

Finding 1

Severity: None
Impact: Added work is linear in units and only active when cluster= is supplied. The active-window unique-cluster check is negligible relative to the local-polynomial fit.
Concrete fix: None.

Maintainability

Finding 1

Severity: None
Impact: The new behavior is documented in code comments and the Registry, including survey/weights/event-study interactions. The unsupported static continuous cluster= + survey_design= composition is explicitly rejected at diff_diff/had.py:L3361-L3375.
Concrete fix: None.

Tech Debt

Finding 1

Severity: P3 informational
Impact: Continuous event-study cluster= is still deferred and emits a warning at diff_diff/had.py:L4430-L4438. This is tracked in TODO.md:L40 and documented in docs/methodology/REGISTRY.md:L3202, so it is not a blocker.
Concrete fix: No action required for this PR.

Security

Finding 1

Severity: None
Impact: No secrets, unsafe I/O, shell execution, or security-sensitive behavior introduced in the changed diff.
Concrete fix: None.

Documentation / Tests

Finding 1

Severity: None
Impact: Tests now cover the prior blocker at the variance layer (tests/test_bias_corrected_lprobust.py:L320-L335) and cover static HAD cluster threading, weighted threading, survey rejection, both continuous designs, and single-cluster NaN behavior (tests/test_had.py:L1596-L1738).
Concrete fix: None.

Finding 2

Severity: P3 informational
Impact: I could not execute the test suite here because numpy and pytest are unavailable. I did verify that the changed Python files parse with ast.parse.
Concrete fix: Ensure CI runs the focused HAD/local-linear clustered tests plus the broader HAD suite.

igerber force-pushed the feature/had-continuous-cluster-threading branch from c44cb7e to 802ff70 Compare July 1, 2026 21:48

igerber force-pushed the feature/had-continuous-cluster-threading branch from 802ff70 to f3419fb Compare July 1, 2026 21:57

igerber added the ready-for-ci Triggers CI test workflows label Jul 1, 2026

igerber merged commit c791eb9 into main Jul 2, 2026
33 of 34 checks passed

igerber deleted the feature/had-continuous-cluster-threading branch July 2, 2026 01:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(had): cluster-robust SE on the continuous paths (Phase 2a)#596

feat(had): cluster-robust SE on the continuous paths (Phase 2a)#596
igerber merged 1 commit into
mainfrom
feature/had-continuous-cluster-threading

igerber commented Jul 1, 2026

Uh oh!

github-actions Bot commented Jul 1, 2026

Uh oh!

github-actions Bot commented Jul 1, 2026

Uh oh!

github-actions Bot commented Jul 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

igerber commented Jul 1, 2026

Summary

Methodology references

Validation

Security / privacy

Uh oh!

github-actions Bot commented Jul 1, 2026

PR Review Report

Overall Assessment

Executive Summary

Methodology

Finding 1

Code Quality

Performance

Maintainability

Tech Debt

Finding 1

Security

Documentation / Tests

Finding 1

Path to Approval

Uh oh!

github-actions Bot commented Jul 1, 2026

PR Review Report

Overall Assessment

Executive Summary

Methodology

Finding 1

Code Quality

Performance

Maintainability

Tech Debt

Finding 1

Security

Documentation / Tests

Finding 1

Path to Approval

Uh oh!

github-actions Bot commented Jul 1, 2026

PR Review Report

Overall Assessment

Executive Summary

Methodology

Finding 1

Finding 2

Code Quality

Finding 1

Performance

Finding 1

Maintainability

Finding 1

Tech Debt

Finding 1

Security

Finding 1

Documentation / Tests

Finding 1

Finding 2

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant