diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md index 2405db13..8ec76e19 100644 --- a/docs/methodology/REGISTRY.md +++ b/docs/methodology/REGISTRY.md @@ -19,6 +19,7 @@ This document provides the academic foundations and key implementation requireme - [TwoStageDiD](#twostagedid) - [StackedDiD](#stackeddid) - [WooldridgeDiD (ETWFE)](#wooldridgedid-etwfe) + - [LPDiD](#lpdid) 3. [Advanced Estimators](#advanced-estimators) - [SyntheticDiD](#syntheticdid) - [SyntheticControl](#syntheticcontrol) @@ -1783,6 +1784,78 @@ Consolidated list of substantive deviations from the W2025 paper and from R `etw --- +## LPDiD + +**Primary source:** [Dube, A., Girardi, D., Jordà, Ò., & Taylor, A. M. (2025). "A Local Projections Approach to Difference-in-Differences." *Journal of Applied Econometrics*, 40(5), 741-758.](https://doi.org/10.1002/jae.70000) (Open Access; NBER Working Paper 31184; FRBSF Working Paper 2023-12.) Paper review on file: `docs/methodology/papers/dube-2025-review.md` (main article + official online appendix; equation/section numbering pinned to the JAE 2025 version). + +**Reference implementations:** Stata `lpdid` (SSC `s459273`, the authors' reference); R `alexCardazzi/lpdid` (third-party; absorbing + non-absorbing); authors' example scripts `danielegirardi/lpdid` (R + Stata). + +### Identification + +Model-based DiD: untreated potential outcomes follow the two-way fixed-effects DGP `E[y_it(0)|i,t] = alpha_i + delta_t` (paper Eq. 1), under two assumptions: + +- **No anticipation (Assumption 1):** `E[y_it(p) - y_it(0)] = 0` for all `t < p`. +- **Parallel trends (Assumption 2):** `E[y_it(0) - y_{i1}(0) | p_i = p] = E[y_it(0) - y_{i1}(0)]` for all `t in {2..T}`, `p in {1..T, inf}` - untreated potential-outcome trends are common across cohorts, stated relative to the first period (`t = 1`). 
The base period used for LP-DiD's long difference (first-lag `t-1` vs premean) is a separate efficiency/robustness choice, NOT part of this identification assumption (see the PMD edge case). + +Treatment is binary; the main path assumes **absorbing** treatment (`D_{is} <= D_{it}` for `s < t`). Target parameter: the cohort-specific dynamic ATT `tau_h^g = E[y_{i,p_g+h}(p_g) - y_{i,p_g+h}(0) | p_i = p_g]`, h periods after group g enters at p_g. Treatment effects may be dynamic and heterogeneous across cohorts. + +The key device is the **clean-control restriction**: each horizon-h regression keeps only newly-treated obs (`Delta_D_it = 1`) and not-yet-treated "clean" controls (`D_{i,t+h} = 0`, absorbing case). Excluding already-treated units from the control group is what eliminates the negative-weighting bias of naive TWFE/LP (paper Eqs. 6-7). Only the entry-period rows (`t = p_g`) identify each `beta_h` (online Appendix A.2). + +### Key Equations + +**LP-DiD regression (paper Eq. 4 restricted by Eq. 8), run separately per horizon `h in {-Q..H}`, `h != -1`:** + + y_{i,t+h} - y_{i,t-1} = beta_h^{LP-DiD} * Delta_D_it + delta_t^h + e_it^h + sample: Delta_D_it = 1 (newly treated) OR D_{i,t+h} = 0 (clean control) + +`delta_t^h` = calendar-time fixed effects; **no unit FE** (differenced out). `h = -1` is the reference (coefficient fixed at 0); negative h give pre-trend placebos. + +**Estimand = variance-weighted ATT (paper Eqs. 9-10; online Appendix B):** + + E(beta_h^{LP-DiD}) = sum_{g != 0} omega_{g,h} * tau_h^g + omega_{g,h} = N_CCS_{g,h} * n_{g,h} * (1 - n_{g,h}) / sum_{g != 0}[...], n_{g,h} = N_g / N_CCS_{g,h} + +Weights are **always non-negative** (the central result). Via Frisch-Waugh-Lovell, the residualized treatment dummy is the per-group constant `Delta_D~_g = 1 - N_g/N_CCS_{g,h}` (online Appendix Eqs. B.4-B.6) - the hook for the reweighting implementation. 
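A quick numeric illustration of the weight formula above, as a minimal sketch. The cohort labels and counts are made-up numbers, and `vw_weights` is a hypothetical helper for illustration only, not part of the library or of the reference packages. It also checks that the per-treated-observation weight `omega_{g,h}/N_g` is proportional to the residualized dummy `1 - n_{g,h}`, which is why reweighting by the inverse yields the equally-weighted estimand.

```python
def vw_weights(n_treated, n_ccs):
    """Variance weights omega_{g,h} from N_g and N_CCS_{g,h} (hypothetical helper)."""
    raw = {}
    for g, n_g in n_treated.items():
        share = n_g / n_ccs[g]                      # n_{g,h} = N_g / N_CCS_{g,h}
        raw[g] = n_ccs[g] * share * (1.0 - share)   # N_CCS * n * (1 - n), never negative
    total = sum(raw.values())
    return {g: w / total for g, w in raw.items()}


# Illustrative (made-up) counts for two treated cohorts at one horizon.
n_treated = {"g1": 10, "g2": 30}    # N_g
n_ccs = {"g1": 100, "g2": 60}       # N_CCS_{g,h}: treated obs plus their clean controls

weights = vw_weights(n_treated, n_ccs)   # raw terms are 9 and 15, so 9/24 and 15/24

# omega_{g,h} / N_g is proportional to the residualized dummy 1 - n_{g,h}.
per_obs = {g: weights[g] / n_treated[g] for g in weights}
resid = {g: 1.0 - n_treated[g] / n_ccs[g] for g in weights}
ratio_weights = per_obs["g1"] / per_obs["g2"]
ratio_resid = resid["g1"] / resid["g2"]
```

In this toy case the smaller cohort `g1` gets less total weight than `g2` because its treated share sits further from one half, even though each of its treated observations is weighted more heavily.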
+ +**Equally-weighted ATT (paper Section 3.3) - two equivalent routes:** +- `reweight=True`: weight each observation in the clean-control sample `CCS_{g,h}` (the newly-treated obs and their clean controls) by `(omega_{g,h}/N_g)^{-1}`. Numerically equivalent to **Callaway-Sant'Anna (2021)**. +- Regression adjustment (RA): fit the long difference on time FE using clean controls only, predict each treated obs's counterfactual, average residuals: `beta_h^{RA} = N_TR^{-1} sum_{TR}[(y_{i,t+h}-y_{i,t-1}) - Ehat((y_{i,t+h}-y_{i,t-1}) | D_{i,t+h}=0)]`. An imputation estimator in the **BJS (2024)** sense. + +**Covariates (paper Section 4.1):** recommended RA path `beta_{h,x}^{RA} = N_TR^{-1} sum_{TR}[(y_{i,t+h}-y_{i,t-1}) - gamma~^h x_i - delta~_t^h]`, with `gamma~^h, delta~_t^h` from a clean-control-only regression. **PMD base period (Section 3.4):** replace `y_{i,t-1}` with the mean of the last `k` pretreatment periods (`k=t-1` = all); single-cohort `k=t-1` == BJS. **Pooled estimand (Section 3.5):** posttreatment-mean long difference `(1/(H+1)) sum_{h=0}^H y_{i,t+h} - y_{i,t-1}` as the dependent variable. + +### Standard Errors + +**The paper specifies no SE formula** - Section 1 defers to "standard, well-understood techniques." The reference Stata uses **cluster-robust SEs at the unit level** (`vce(cluster unit)`, footnote 9); pooled / joint tests stack the per-horizon regressions (`suest`). No bootstrap is discussed. Any analytical SE the library ships - and in particular an influence-function cluster variance for the RA path - is therefore an **implementation choice validated against the reference package, not against the paper**, and must be documented under Deviations once implemented (PR-B). + +### Edge Cases + +- **Composition effects (Section 3.6):** the treated/clean-control set can change across horizons. `no_composition` tightens the clean-control condition to `D_{i,t+H}=0` at all horizons (and excludes cohorts with `p_g > T-H` to fix the treated set). 
Costs statistical power. +- **Bias-variance (Sections 3.3, 5.3):** variance weighting (default) -> lower variance, some bias; equal weighting (`reweight`) -> unbiased, higher variance. In the paper's simulation, variance weighting had lower RMSE at short horizons (h=0,1) and equal weighting had lower RMSE at long horizons. +- **PMD vs first-lag (Section 3.4):** PMD gains efficiency under low autocorrelation but can amplify bias if PT holds only in some pretreatment periods; first-lag relies on weaker PT (Marcus & Sant'Anna 2021). Choose the base period ex-ante. +- **Covariate-weight positivity (online Appendix B.2):** direct covariate inclusion keeps non-negative weights ONLY under linear + homogeneous covariate effects (B.2.1; main-text Assumption 6); in the general case (B.2.2) weights are not guaranteed positive -> prefer the RA covariate path (the direct path should carry a homogeneity-assumption warning). +- **Non-absorbing (Section 4.2, online Appendix C):** settings with few/no never-treated units are handled via the effect-stabilization assumption (Assumption 9, window `L`) with a modified clean-control window (Eq. 13). Two distinct estimands - first-time entry (Eq. 12) and effect-stabilization (Eq. 13). Deferred to a later PR; the absorbing main path rejects non-absorbing input. + +### Deviations from the paper / from R / library extensions + +*To be populated in PR-B (source + tests). 
Anticipated entries: (1) the analytical/cluster SE convention (paper specifies none - implementation choice vs Stata `lpdid` `vce(cluster unit)`); (2) any RA-path influence-function variance; (3) the `pmd="max"` / integer-`k` panel-start edge behavior vs the package; (4) absorbing-only scope in the first release, with non-absorbing (Section 4.2) deferred.* + +### Implementation Checklist + +- [ ] Per-horizon long-difference OLS with time FE, no unit FE; `h=-1` reference fixed at 0 (PR-B) +- [ ] Clean-control sample restriction (absorbing: `D_{i,t+h}=0`) (PR-B) +- [ ] Variance-weighted (default) + reweighted (equal-weight) estimands (PR-B) +- [ ] Regression-adjustment covariate path (recommended) + direct-inclusion path with homogeneity warning (PR-B) +- [ ] PMD base period; pooled pre/post estimands (PR-B) +- [ ] `no_composition` option (PR-B) +- [ ] Cluster-robust SE at unit level by default; NaN-consistent inference via `safe_inference` (PR-B) +- [ ] `LPDiDResults` with `summary()` / `to_dict()` / cluster metadata (PR-B) +- [ ] Layered tests: analytical DGPs + cross-estimator equivalence (CS / BJS / Stacked / DiD) + self-generated R-parity (PR-B) +- [ ] doc-deps.yaml mapping for `diff_diff/lpdid.py` + `lpdid_results.py`; llms.txt catalog entry (PR-B, test-enforced) +- [ ] Non-absorbing extension (Section 4.2) - deferred to a later PR +- [ ] Survey-design support - deferred to a later PR + +--- + # Advanced Estimators ## SyntheticDiD diff --git a/docs/methodology/papers/dube-2025-review.md b/docs/methodology/papers/dube-2025-review.md new file mode 100644 index 00000000..fe0d276f --- /dev/null +++ b/docs/methodology/papers/dube-2025-review.md @@ -0,0 +1,202 @@ +# Paper Review: A Local Projections Approach to Difference-in-Differences + +**Authors:** Arindrajit Dube, Daniele Girardi, Òscar Jordà, Alan M. Taylor +**Citation:** Dube, A., Girardi, D., Jordà, Ò., & Taylor, A. M. (2025). A Local Projections Approach to Difference-in-Differences. 
*Journal of Applied Econometrics*, 40(5), 741-758. https://doi.org/10.1002/jae.70000 (Open Access, CC BY) +**PDF reviewed:** papers/J of Applied Econometrics - 2025 - Dube - A Local Projections Approach to Difference-in-Differences.pdf (main article, 18 pp) **+** papers/lpdid_online_appendix.pdf (official Wiley online appendix, 24 pp; Appendices A-C reviewed in full, D-F = simulation/empirical replication, skimmed). Both gitignored under top-level `papers/`; do NOT commit the PDFs. +**Review date:** 2026-06-28 + +--- + +## Methodology Registry Entry + +*Formatted to match docs/methodology/REGISTRY.md structure. The corresponding registry entry is mirrored in `docs/methodology/REGISTRY.md` (`## LPDiD`, under "Modern Staggered Estimators").* + +## LPDiD + +**Primary source:** Dube, Girardi, Jordà & Taylor (2025), *Journal of Applied Econometrics* 40(5):741-758, https://doi.org/10.1002/jae.70000. NBER WP 31184; FRBSF WP 2023-12. Reference Stata package `lpdid` (SSC s459273); authors' example repo https://github.com/danielegirardi/lpdid (R + Stata). + +**Key implementation requirements:** + +*Assumption checks / warnings:* +- **Assumption 1 (No anticipation):** `E[y_it(p) - y_it(0)] = 0` for all `t < p`. Units do not respond before treatment. +- **Assumption 2 (Parallel trends):** `E[y_it(0) - y_{i1}(0) | p_i = p] = E[y_it(0) - y_{i1}(0)]` for all `t in {2..T}`, `p in {1..T, inf}` - untreated potential-outcome trends are common across cohorts, stated relative to the first period (`t = 1`). The base period for LP-DiD's long difference (first-lag `t-1` vs premean) is a separate efficiency/robustness choice, not part of this assumption (see PMD edge case below; first-lag relies on weaker PT than premean per Marcus & Sant'Anna 2021). +- Treatment is **binary**. The Section 3 main path assumes **absorbing** treatment (`D_{is} <= D_{it}` for `s < t`); non-absorbing handled by the Section 4.2 extension with a modified clean-control window. 
+- Identification is **model-based** (DGP `E[y_it(0)|i,t] = alpha_i + delta_t`, Equation 1). Treatment effects may be **dynamic and heterogeneous** across cohorts. + +*Target parameter (Section 2.1):* + +Cohort-specific dynamic ATT at horizon `h` for group `g`: + + tau_h^g = E[ y_{i,p_g+h}(p_g) - y_{i,p_g+h}(0) | p_i = p_g ] + +`p_g` = period group `g` first enters treatment; `g = 0` is never-treated. + +*Estimator equation (Equation 4 restricted by Equation 8, as implemented):* + +The LP regression is run **separately for each horizon `h`** on a long-differenced outcome: + + y_{i,t+h} - y_{i,t-1} = beta_h^{LP-DiD} * Delta_D_it + delta_t^h + e_it^h + +estimated on the **restricted (clean) sample** of observations that are either: + + newly treated: Delta_D_it = 1 (Delta_D_it = D_it - D_{i,t-1}) + clean control: D_{i,t+h} = 0 (absorbing case; Equation 8) + +where: +- `y_{i,t+h} - y_{i,t-1}` = long difference (horizon-`h` outcome minus base period `t-1`) +- `Delta_D_it` = first difference of the treatment indicator (1 in the period a unit switches on) +- `delta_t^h` = time (calendar-period) fixed effects, horizon-specific +- **No unit fixed effects** (differenced out by construction) +- `h in {-Q, ..., 0, ..., H}`, `h != -1` (the base period). Negative `h` give pre-trend / placebo coefficients. + +`beta_h^{LP-DiD}` consistently estimates a **variance-weighted ATT** (VWATT, Goodman-Bacon 2021 terminology): + + E(beta_h^{LP-DiD}) = sum_{g != 0} omega_{g,h}^{LP-DiD} * tau_h^g (Equation 9) + omega_{g,h}^{LP-DiD} = [N_CCS_{g,h} * n_{g,h} * (1 - n_{g,h})] / sum_{g != 0}[ N_CCS_{g,h} * n_{g,h} * (1 - n_{g,h}) ] (Equation 10) + +`N_CCS_{g,h}` = # obs in the clean-control sample for `(g,h)`; `n_{g,h} = N_g / N_CCS_{g,h}` = treated share. **Weights are always non-negative** (this is the central result — no negative weighting). Derived via Frisch-Waugh-Lovell in online Appendix B. + +*Equally weighted ATT — two numerically equivalent routes (Section 3.3):* + +1. 
**Reweighted regression (`reweight=True`):** assign each obs in `CCS_{g,h}` weight `(omega_{g,h}^{LP-DiD} / N_g)^{-1}`. In practice obtained via an auxiliary regression of `Delta_D` on time indicators in the Equation-8 sample (FWL). Reweighted LP-DiD is **numerically equivalent to Callaway-Sant'Anna (2021)**. +2. **Regression adjustment (RA):** fit the long-difference on time indicators using **clean controls only**, predict the counterfactual for each treated obs, average the residuals: + + beta_h^{LP-DiD,RA} = N_TR^{-1} * sum_{(i,t) in TR} [ (y_{i,t+h} - y_{i,t-1}) - Ehat(y_{i,t+h} - y_{i,t-1} | D_{i,t+h}=0) ] + + `TR` = set of newly-treated obs (`Delta_D_it = 1`). This RA form is an **imputation estimator in the sense of Borusyak-Jaravel-Spiess (2024)**. + +*With covariates / doubly robust (Section 4.1):* + +Under conditional PT (Assumptions 3-5, linear CEF), the **recommended** covariate path is RA (Equation 11): + + beta_{h,x}^{LP-DiD,RA} = N_TR^{-1} * sum_{(i,t) in TR} [ (y_{i,t+h} - y_{i,t-1}) - gammahat^h * x_i - deltahat_t^h ] + +where `gammahat^h, deltahat_t^h` come from the clean-control-only regression `y_{i,t+h} - y_{i,t-1} = delta_t^h + gamma^h x_i + u_it^h`. Generalizes to non-linear/semiparametric first-stage CEF (Equation 11 still holds). + +**Direct covariate inclusion vs RA - positivity of weights (settled in online Appendix B.2):** +- **Linear + homogeneous covariate effects (App. B.2.1):** adding `x_i` directly to the Equation-8 OLS leaves the cohort weights **unchanged** (Equation B.6 still holds) -> weights stay non-negative. Safe. +- **General (heterogeneous/non-linear) covariate effects (App. B.2.2):** weights become proportional to the residual of `Delta_D` on time indicators *and* the covariates (Equations B.10-B.11); the authors state it is **"not possible to ensure that they are always positive."** Negative weighting can return. +- **=> Recommendation (authors, Section 4.1.1 + App. 
B.2.2): use the RA covariate path**, which is unbiased for the equally-weighted ATT under the weaker Assumptions 3-5. The direct-inclusion path is only safe under the strong homogeneity assumption (main-text Assumption 6 / App. B.2.1). **Implementation: the simple-covariate path should at minimum carry a docstring/warning that it assumes homogeneous covariate effects; default practitioners toward RA.** + +*Premean-differenced base period (PMD, Section 3.4):* + +Replace the single base period `t-1` with the mean of the last `k` pretreatment periods: + + y_{i,t+h} - (1/k) * sum_{tau=t-k}^{t-1} y_{i,tau} = beta_h^{PMD LP-DiD} * Delta_D_it + delta_t^h + e_it^h + +`k = t-1` uses all pretreatment obs. Same weights as Section 3.2. PMD LP-DiD with `k=t-1` and a single treated group is **numerically equal to BJS** (footnotes 10-11); with multiple groups, very close but not identical. + +*Pooled estimand (Section 3.5):* + +Single overall ATT over the posttreatment window `h in {0..H}` by using the posttreatment mean long-difference as the dependent variable: + + (1/(H+1)) * sum_{h=0}^{H} y_{i,t+h} - y_{i,t-1} + +(can be combined with PMD on the base period). Pre-window pooling is analogous over negative horizons. + +*Standard errors (Section 1 - NOT specified by the paper):* +- **The paper deliberately gives no SE formula.** Section 1: "all the estimators ... allow for standard statistical inference using well-understood techniques ... For this reason, we do not discuss statistical inference here." +- Default in the reference Stata implementation: **cluster-robust at the unit level** (`vce(cluster unit)`, footnote 9). +- Pooled / joint tests across horizons: stack the per-horizon regressions (Stata `suest`). +- Bootstrap: not discussed in the paper. +- => Any analytical SE we ship (and especially the influence-function variance for the RA path) is an **implementation choice to be validated against the reference package**, not against the paper. 
Document under "Deviations from the paper". + +*Edge cases:* +- **Composition effects (Section 3.6):** the treated/clean-control set can change across horizons `h`. To hold the **control** set fixed: tighten the clean-control condition to `D_{i,t+H} = 0` at all horizons (`H` = max horizon) -> the `no_composition` / `nocomp` option. To hold the **treated** set fixed: exclude cohorts with `p_g > T - H`. Costs statistical power. +- **Bias-variance trade-off (Sections 3.3, 5.3):** variance weighting (default) -> lower variance, some bias; equal weighting (`reweight`) -> unbiased, higher variance. In the paper's simulation, variance weighting achieved lower RMSE at short horizons (h=0,1) and equal weighting at long horizons. +- **PMD vs first-lag (Section 3.4):** PMD gains efficiency under low autocorrelation but can **amplify bias** if PT holds only in some pretreatment periods; first-lag differencing relies on a **weaker** PT assumption (Marcus & Sant'Anna 2021). +- **Few/no never-treated units (non-absorbing, Section 4.2.3):** use the effect-stabilization assumption (Assumption 9, parameter `L`) so units with no treatment change in `[t-L, t+h]` (previously treated, but with stabilized effects) can serve as clean controls. + +*Algorithm (per horizon `h`):* +1. Build `Delta_D_it = D_it - D_{i,t-1}`; identify newly-treated obs (`Delta_D_it = 1`). +2. Restrict to the clean sample: newly treated OR clean control (`D_{i,t+h} = 0` absorbing; Equation 13 window for non-absorbing). +3. Form the long difference `y_{i,t+h} - y_{i,t-1}` (or PMD base; or pooled posttreatment mean). +4. Run OLS of the long difference on `Delta_D_it` + time FE (+ direct covariates, or RA first-stage on controls; + reweighting weights for equal-weight estimand). +5. `beta_h` is the event-study coefficient at horizon `h`. Cluster-robust SE at unit level. +6. Repeat across `h in {-Q..H}, h != -1`; `h = -1` is the (zero) reference. + +**Reference implementation(s):** +- Stata: `lpdid` (SSC `s459273`) - the authors' reference implementation. 
+- R: `alexCardazzi/lpdid` (third-party, A. Cardazzi & Z. Porreca; covers absorbing AND non-absorbing); authors' own R example scripts in `danielegirardi/lpdid`. +- Stata RA syntax (footnote 9): `teffects ra (Dhy i.time) (dtreat) if D.treat==1 | Fh.treat==0, atet vce(cluster unit)`. + +**Requirements checklist:** +- [ ] Long-difference dependent variable with selectable base period (`t-1` default, PMD `k`) +- [ ] Clean-control sample restriction (absorbing: `D_{i,t+h}=0`) +- [ ] Per-horizon OLS with time FE, no unit FE; `h=-1` reference fixed at 0 +- [ ] Variance-weighted (default) and reweighted (equal-weight) estimands +- [ ] Regression-adjustment path for covariates (recommended) + direct-inclusion path +- [ ] Pooled pre/post estimands +- [ ] `no_composition` option (fixed control/treated set across horizons) +- [ ] Cluster-robust SE at unit level by default (implementation choice; validate vs package) +- [ ] Non-absorbing extension (Section 4.2) - phased follow-up + +--- + +## Implementation Notes + +### Data Structure Requirements +- Balanced or unbalanced **panel**: unit `i`, time `t`, outcome `y_it`, binary treatment `D_it`. +- Absorbing main path needs a well-defined entry period `p_i` per unit (`inf` for never-treated). +- Long differences require observing `y` at both `t-1` (base) and `t+h` (target) for each used obs -> missing target/base rows drop out at each horizon. + +### Computational Considerations +- **Speed is the headline advantage.** Table 2: at `N=184, T=54, 26 events`, LP-DiD ~0.16s vs CS ~137.5s, SA ~105.5s, BJS ~0.54s. LP-DiD is a stack of small OLS fits -> trivially parallelizable across horizons. +- Memory is modest: each horizon is an independent regression on a sample subset; no large group-time matrix as in CS. 
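The per-horizon algorithm and the data requirements above can be sketched in a few lines of pandas/NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: `lpdid_horizon` and the column names `unit`, `time`, `treat`, `y` are hypothetical, the panel is assumed to have consecutive periods with no gaps (so `shift` lines up with calendar time), treatment is absorbing, only post-treatment horizons are handled, and no standard errors are computed. The toy data is noiseless with a built-in effect of 2.0, so each horizon should recover that value up to floating-point error.

```python
import numpy as np
import pandas as pd


def lpdid_horizon(df, h):
    """Hypothetical helper: one LP-DiD point estimate at horizon h >= 0 (first-lag base)."""
    if h < 0:
        raise ValueError("this sketch covers post-treatment horizons only")
    df = df.sort_values(["unit", "time"]).copy()
    df["d_treat"] = df["treat"] - df.groupby("unit")["treat"].shift(1)   # Delta_D_it
    df["long_diff"] = (df.groupby("unit")["y"].shift(-h)
                       - df.groupby("unit")["y"].shift(1))               # y_{i,t+h} - y_{i,t-1}
    df["treat_lead"] = df.groupby("unit")["treat"].shift(-h)             # D_{i,t+h}
    # Rows missing the base or the target outcome drop out at this horizon.
    observed = df["long_diff"].notna() & df["d_treat"].notna()
    # Clean sample: newly treated OR not yet treated at t+h.
    clean = (df["d_treat"] == 1) | (df["treat_lead"] == 0)
    s = df.loc[observed & clean]
    # OLS of the long difference on Delta_D plus a full set of time dummies (no unit FE).
    time_fe = pd.get_dummies(s["time"], dtype=float).to_numpy()
    X = np.column_stack([s["d_treat"].to_numpy(dtype=float), time_fe])
    coef, *_ = np.linalg.lstsq(X, s["long_diff"].to_numpy(dtype=float), rcond=None)
    return float(coef[0])


# Toy staggered panel: two cohorts (entry at t=4 and t=6) plus never-treated units.
entry = {0: 4, 1: 4, 2: 6, 3: 6, 4: None, 5: None}
rows = []
for unit, p in entry.items():
    for t in range(1, 9):
        treat = int(p is not None and t >= p)
        rows.append({"unit": unit, "time": t, "treat": treat,
                     "y": unit + 0.3 * t ** 2 + 2.0 * treat})   # alpha_i + delta_t + effect
panel = pd.DataFrame(rows)

# Each horizon is an independent small regression, so this loop could run in parallel.
betas = {h: lpdid_horizon(panel, h) for h in range(3)}
```

Already-treated rows (for example the first cohort at `t = 6`) have `d_treat == 0` and `treat_lead == 1`, so they fall outside the clean sample at every horizon.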
+ +### Tuning Parameters + +| Parameter | Type | Default | Selection Method | +|-----------|------|---------|-----------------| +| `pre_window` (Q) | int | application-specific | # pre-treatment horizons (placebos) to estimate | +| `post_window` (H) | int | application-specific | # post-treatment horizons | +| `reweight` | bool | False (variance-weighted) | True -> equally weighted ATT (= CS) | +| `pmd` (k) | None / int / "max" | None (`t-1` base) | average over `k` pretreatment periods; choose ex-ante | +| `no_composition` | bool | False | True -> fixed control/treated set across horizons | +| covariates / RA | columns | None | conditional-PT settings; RA recommended over direct inclusion | +| cluster | column | unit | cluster-robust SE level (implementation default) | + +### Relation to Existing diff-diff Estimators +The paper *proves* numerical equivalences we can exploit for **internal cross-validation** (no external package needed): +- **Reweighted LP-DiD == Callaway-Sant'Anna (2021)** [Section 3.7]. Strongest cross-check; we have `CallawaySantAnna`. +- **PMD LP-DiD (k=t-1, single cohort) == ImputationDiD/BJS** [Section 3.4, footnotes 10-11]; we have `ImputationDiD`. +- **Variance-weighted LP-DiD == Cengiz et al. (2019) stacked regression** [Section 3.7]; we have `StackedDiD`. +- **2x2, h=0 == first-difference / static TWFE == plain DiD** [Section 2.2]; we have `DifferenceInDifferences` / `TwoWayFixedEffects`. +- RA LP-DiD is an imputation estimator -> shares machinery conceptually with `ImputationDiD`. +- Backend reuse: `linalg.solve_ols` (with `weights`, `cluster_ids`) covers every estimation path; `survey.py` (`_resolve_survey_for_fit`, `_compute_stratified_psu_meat`) is the template for the Phase D survey extension. 
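The simplest of the equivalences listed above (2x2, `h=0`) can be checked numerically without any library code. The following is a minimal sketch on simulated data; the DGP, sample size, and variable names are illustrative assumptions, and it deliberately uses plain NumPy rather than the `DifferenceInDifferences` class so that no library signature is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
treated = rng.random(n) < 0.4                       # units that switch on in period 2
y_pre = rng.normal(size=n) + 0.5 * treated          # level differences are allowed
y_post = y_pre + 1.0 + 1.5 * treated + rng.normal(scale=0.3, size=n)
first_diff = y_post - y_pre                         # long difference at h = 0

# Plain 2x2 DiD: difference in mean changes between the two groups.
did_2x2 = first_diff[treated].mean() - first_diff[~treated].mean()

# LP-DiD at h = 0: regress the first difference on Delta_D plus the single
# time effect, which reduces to an intercept with only one post period.
X = np.column_stack([np.ones(n), treated.astype(float)])
coef, *_ = np.linalg.lstsq(X, first_diff, rcond=None)
lp_beta0 = coef[1]

same = np.isclose(did_2x2, lp_beta0)
```

The two numbers agree because OLS on an intercept and a binary regressor returns the difference in group means; the same pattern (compute both estimators, compare with a tight tolerance) is what the cross-estimator equivalence tests would extend to the staggered cases.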
+ +--- + +## Appendix Derivations (Online Appendix A-C, reviewed) + +### A - LP/DiD equivalence and the identification insight +- **A.1 (2x2):** the LP regression at `h=0` is *exactly* a first-difference regression, and `beta_0^LP = beta^STWFE = beta^2x2 = ATT`. (Confirms the 2x2 cross-validation test against `DifferenceInDifferences`.) +- **A.2 (single cohort, multiple periods):** `beta_h^LP` recovers the dynamic DiD estimand `tau_h` exactly. **Key identification fact:** the LP regression is equivalent to a *cross-sectional* regression run only on entry-period rows (`t = s`) - "observations with `t != s` do not contribute to the estimated coefficient `beta_h^LP`." Only the period in which a unit switches on identifies the horizon-`h` coefficient. (Mirrors the absorbing clean-control restriction; useful when reasoning about which rows drive each estimate.) +- **A.3 (staggered, homogeneous effects):** an LP with leads *and* lags (Appendix Eq A.2: `y_{i,t+h}-y_{i,t-1} = delta_t^h + beta_h^LP Delta_D_it + sum_{j != 0} theta_j^h Delta_D_{i,t-j} + e`) recovers `tau_h`. LP differs from dynamic TWFE only in how unit FE are removed (LP differences around treatment times rather than full-sample mean-differencing), which can reduce bias under partial PT violations. + +### B - Weights (the FWL derivation, App. B) +- **DGP (Eq B.1):** `y_{i,t+h} - y_{i,t-1} = delta_t^h + tau_{i,t+h} D_{i,t+h} - tau_{i,t-1} D_{i,t-1} + e_it^h`. +- Each obs satisfying the clean-control condition belongs to exactly **one** clean-control sample `CCS_{g,h}` (taken at `t = p_g`) -> LP-DiD == Cengiz stacked. +- **FWL (Eq B.4):** `E(beta_h^{LP-DiD}) = [sum_g sum_{i in CCS_{g,h}} DeltaDtilde_{i,pg} * E(y_{i,pg+h}-y_{i,pg-1})] / [sum_g sum_{i in CCS_{g,h}} DeltaDtilde_{i,pg}^2]`, where `DeltaDtilde` is the residual of `Delta_D` on time indicators within the clean sample. 
+- **Residualized treatment dummy (Eq B.5):** for every unit in group `g`, `DeltaDtilde_{i,pg} = DeltaDtilde_{g,pg} = 1 - N_g / N_{CCS_{g,h}}` (a per-group constant). This is the practical hook for the **reweighting** implementation (equal-weight estimand = weight obs by the inverse of this). +- **Weight (Eq B.6) = main-text Eq 10**, re-expressed: `omega_{g,h} = N_{CCS_{g,h}} n_{g,h}(1-n_{g,h}) / sum_{g != 0}[...]`, `n_{g,h} = N_g/N_{CCS_{g,h}}`. Always non-negative. +- **B.2 covariate weights:** see the "Direct covariate inclusion vs RA" block above - B.2.1 (homogeneous) leaves weights unchanged; B.2.2 (general) gives `omega^c` proportional to the residual of `Delta_D` on time indicators **and** covariates (Eq B.10-B.11), not guaranteed positive. + +### C - Non-absorbing weights (App. C, for Phase C) +- Introduces **exit events** (`Delta_D_{g,j} = -1`), exit periods `q_g^n`, and exit-event dynamic effects `eta_h^{g,n}`. Assumption 9 (effect stabilization) applies to exit events too. +- Eq C.1 decomposes `E[y_{t+h}-y_{t-1}]` over treatment AND exit events within the `[t-L, t+h]` window -> justifies the modified clean-control condition of main-text **Equation 13**: clean controls = units with no treatment change in `[t-L, t+h]`. +- **Non-absorbing weight:** `omega_{g,n,h}^{LP-DiD''} = nbar_{g,n,h} N_{CCS_{g,n,h}} [nhat_{g,n,h}(1-nhat_{g,n,h})] / sum[...]`, where `nhat` = share of newly-treated and `nbar` = share of group `g` among newly-treated, in `CCS_{p_g^n, h}`. Non-negative. (Phase C will need both the "first-time entry" estimand of Eq 12 and this effect-stabilization estimand of Eq 13 - they are distinct.) + +### D - dynamic-selection simulation (App. D, reference for Phase C tests) +- DGP calibrated to ANRR democracy: 184 units x 51 periods, `rho=0.98`, treatment entry on `psi*Delta_y_{i,t-1} + (1-psi)u_i <= theta` (dynamic selection), reversals allowed after >=5 periods, effect stabilizes after 9 periods, estimators include `Delta_y_{i,t-1}` as a control. 
Conditional (not unconditional) PT. Flags **Nickell bias** from the lagged-outcome control (only material under high autocorrelation + short T). A candidate template for a Phase C non-absorbing test DGP. + +--- + +## Gaps and Uncertainties + +1. **No SE formula in the paper.** Inference is deferred to "standard techniques" (Section 1, p.742). The reference Stata uses `vce(cluster unit)`. Our analytical SEs - and especially the contributor's hand-rolled influence-function cluster variance for the RA path - are **unspecified by the paper** and must be validated against the `lpdid` package / R reference, then documented as an implementation choice under "Deviations". (Page 742; footnote 9.) +2. ~~Weight derivations are in the unreviewed online appendix.~~ **RESOLVED** - the official `lpdid_online_appendix.pdf` (Appendices A-C) has now been reviewed and the FWL weight derivations (App. B), covariate-weight positivity conditions (App. B.2.1/B.2.2), and non-absorbing weights (App. C) are captured in the "Appendix Derivations" section above. Remaining appendix material not transcribed: D (dynamic-selection simulation, skimmed), E (LW banking-dereg replication), F (ANRR replication) - empirical/simulation reproduction, not needed for the estimator contract. +3. **Non-absorbing extension (Section 4.2)** is presented as illustrative, not exhaustive ("a comprehensive discussion ... would indeed require a whole article", p.751). The effect-stabilization window `L` (Assumption 9, Equation 13) and the "first-time entry" estimand (Equation 12) are two distinct estimands; a Phase C design will need to choose/expose both deliberately. The contributor's scaffold rejects non-absorbing entirely. +4. **PMD multi-cohort vs BJS:** the two are "very similar (although not identical)" (p.747) - do not assert exact equality in tests except in the single-cohort `k=t-1` case. +5. **`pmd="max"` semantics:** the paper's `k=t-1` "use all pretreatment periods" is per-observation (expanding window). 
The contributor implemented `pmd="max"` as the expanding mean of all prior periods and integer `pmd=k` as the trailing-`k` mean - verify this matches the package's `pmd` option exactly (the paper fixes the formula but not the option's edge behavior at the panel start). +6. **Pooled SE / joint tests** use stacking (`suest`); the variance of the pooled estimand under clustering should be cross-checked against the package rather than assembled ad hoc. diff --git a/docs/references.rst b/docs/references.rst index b5f1ee1d..16e60d8f 100644 --- a/docs/references.rst +++ b/docs/references.rst @@ -256,6 +256,17 @@ Multi-Period and Staggered Adoption Source for the 8-step practitioner workflow surfaced via ``diff_diff.get_llm_guide("practitioner")`` and the README ``## Practitioner Workflow`` section. See ``docs/methodology/REGISTRY.md`` for the diff-diff renumbering and per-step deviations. +Local Projections DiD +--------------------- + +- **Dube, A., Girardi, D., Jordà, Ò., & Taylor, A. M. (2025).** "A Local Projections Approach to Difference-in-Differences." *Journal of Applied Econometrics*, 40(5), 741-758. https://doi.org/10.1002/jae.70000 (Open Access; NBER Working Paper 31184; FRBSF Working Paper 2023-12.) + + Primary source for the ``LPDiD`` estimator: per-horizon long-difference local-projection regressions estimated on a "clean control" sample, yielding non-negatively-weighted event-study and pooled ATTs (variance-weighted by default; reweighted regression or regression adjustment for the equally-weighted ATT; premean-differenced base periods; covariates; non-absorbing extension). Reference Stata package ``lpdid`` (SSC s459273); R ``alexCardazzi/lpdid``. Paper review on file at ``docs/methodology/papers/dube-2025-review.md``. + +- **Jordà, Ò. (2005).** "Estimation and Inference of Impulse Responses by Local Projections." *American Economic Review*, 95(1), 161-182. 
https://doi.org/10.1257/0002828053828518 + + Origin of the local-projections method that LP-DiD adapts to the difference-in-differences setting. + Continuous Treatment DiD ------------------------