ENH: Add Hansen-Lee misspecification-robust J-test to IVGMM #701

Open
hass-nation wants to merge 1 commit into bashtage:main from hass-nation:gmm/hansen-lee-j-test

Conversation

@hass-nation

Closes #430

What this adds

IVGMMResults.robust_j_stat — a misspecification-robust J-test from Hansen & Lee (2021, Econometrica 89(3), 1419-1447).

The standard J-statistic uses the uncentered moment covariance as its weight matrix. Under misspecification (E[g(z,θ)] != 0), this causes the test to saturate: it stays bounded above by n however large the moment violation is. The robust statistic replaces the uncentered covariance with the centered covariance S_c = (1/n) sum (g_i - g_bar)(g_i - g_bar)', giving J* = n g_bar' S_c^{-1} g_bar, which is asymptotically chi2(q) under correct specification (q being the degree of overidentification) and diverges at rate n under misspecification.
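The difference between the two weight matrices can be checked numerically. The following is a minimal sketch, not the code in this PR: it assumes the heteroskedasticity-robust case, takes an arbitrary n-by-q array of per-observation moments in place of the fitted model's moments, and the helper name `j_statistics` is hypothetical.

```python
import numpy as np


def j_statistics(g):
    """Return (J, J*) for an n-by-q array of per-observation moments g_i.

    Hypothetical helper for illustration; not part of linearmodels.
    """
    n = g.shape[0]
    g_bar = g.mean(axis=0)
    # Uncentered covariance: (1/n) sum g_i g_i'
    s_u = g.T @ g / n
    # Centered covariance: (1/n) sum (g_i - g_bar)(g_i - g_bar)'
    dev = g - g_bar
    s_c = dev.T @ dev / n
    # Quadratic forms n * g_bar' S^{-1} g_bar, using solve instead of an inverse
    j_std = n * g_bar @ np.linalg.solve(s_u, g_bar)
    j_rob = n * g_bar @ np.linalg.solve(s_c, g_bar)
    return float(j_std), float(j_rob)


rng = np.random.default_rng(0)
n, q = 500, 3
# Simulated moments with a non-zero mean, mimicking a misspecified model
g = rng.standard_normal((n, q)) + np.array([2.0, -1.0, 0.5])
j_std, j_rob = j_statistics(g)
print(f"J = {j_std:.2f} (n = {n}), J* = {j_rob:.2f}")
```

With the same moments in both statistics, `j_std` stays below n while `j_rob` has no such ceiling.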

API

res = IVGMM(y, exog, endog, instr).fit(cov_type="robust")
res.j_stat         # existing: standard J-test
res.robust_j_stat  # new: Hansen-Lee robust J-test
print(res.summary) # shows both side-by-side in the header table

Works for all cov_type values: robust, unadjusted, kernel, clustered.

Implementation

  • _IVGMMBase._hansen_lee_j_statistic: computes J* using the centered weight matrix estimator matching cov_type
  • _gmm_post_estimation in both _IVGMMBase and IVGMM: threads cov_type/cov_config through and adds robust_j_stat to the results dict
  • IVGMMResults.robust_j_stat property with full docstring and math
  • IVGMMResults._top_right: summary header now shows J-stat, HL J-stat, p-values, distributions, and iterations
  • Zero new dependencies

Tests: 14 new, 6515 existing all pass

  • test_robust_j_stat_formula_matches_manual_calculation: J* = n g_bar' S_c^{-1} g_bar exactly
  • test_robust_j_algebraic_identity_iterated_gmm: J* = nJ/(n - J) for iterated GMM, where J is the standard statistic
  • test_robust_j_stat_nonnegative_all_cov_types: valid for robust/unadjusted/kernel/clustered
  • test_ivgmmcue_has_robust_j_stat: IVGMMCUE also exposes robust_j_stat
  • test_summary_contains_hl_j_statistic: both stats shown in summary
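The algebraic identity checked for iterated GMM can be derived with the Sherman-Morrison formula. This is a sketch under the assumption that both statistics are evaluated at the same estimate, so they share the same g_bar and the uncentered covariance is S_u = (1/n) sum g_i g_i' (the symbols S_u and a are introduced here for illustration):

```latex
\begin{aligned}
S_u &= \frac{1}{n}\sum_{i=1}^{n} g_i g_i' = S_c + \bar g\,\bar g', \\
\bar g' S_u^{-1} \bar g
  &= a - \frac{a^{2}}{1 + a} = \frac{a}{1 + a},
  \qquad a = \bar g' S_c^{-1} \bar g, \\
J &= n\,\frac{a}{1 + a} < n,
  \qquad J^{*} = n a = \frac{nJ}{n - J}.
\end{aligned}
```

The second line applies Sherman-Morrison to the rank-one update of S_c; the bound J < n and the identity both follow from it.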

References

Hansen, B. E. & Lee, S. (2021). Inference for iterated GMM under misspecification. Econometrica, 89(3), 1419-1447.

Adds `robust_j_stat` property to `IVGMMResults` (and `IVGMMCUE`),
implementing the misspecification-robust J-test from Hansen & Lee
(2021, Econometrica, 89(3), 1419-1447).

The standard J-statistic uses the uncentered moment covariance as its
weight matrix, which under model misspecification (E[g(z,theta)] != 0)
leads to a test that saturates: it stays bounded above by n. The
Hansen-Lee statistic replaces the uncentered covariance with the
*centered* covariance S_c = (1/n) sum (g_i - g_bar)(g_i - g_bar)',
giving J* = n * g_bar' S_c^{-1} g_bar ~ chi2(q) under correct
specification and diverging at rate n under misspecification.

Changes:
- `_IVGMMBase._hansen_lee_j_statistic`: computes J* for any cov_type
  (robust/heteroskedastic, homoskedastic, kernel, clustered)
- `_IVGMMBase._gmm_post_estimation` + `IVGMM._gmm_post_estimation`:
  accept cov_type/cov_config and include `robust_j_stat` in results
- `IVGMM.fit` + `IVGMMCUE.fit`: pass cov_type/cov_config through
- `IVGMMResults.robust_j_stat` property with full docstring
- `IVGMMResults._top_right`: summary now shows both J-stats side-by-side
- 14 new tests in `linearmodels/tests/iv/test_hansen_lee_j_stat.py`
  covering type checks, formula verification, all cov_types, CUE,
  summary display, and the algebraic identity J* = nJ/(n - J) for
  iterated GMM

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov

codecov Bot commented Jun 28, 2026

Codecov Report

❌ Patch coverage is 98.49624% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 99.53%. Comparing base (b535f55) to head (299e3d1).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| linearmodels/iv/model.py | 90.90% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #701      +/-   ##
==========================================
- Coverage   99.54%   99.53%   -0.01%     
==========================================
  Files         101      102       +1     
  Lines       17426    17557     +131     
  Branches     1430     1437       +7     
==========================================
+ Hits        17347    17476     +129     
- Misses         29       30       +1     
- Partials       50       51       +1     
| Flag | Coverage Δ |
|---|---|
| adder | 99.52% <98.49%> (-0.01%) ⬇️ |
| subtractor | 99.52% <98.49%> (-0.01%) ⬇️ |

@bashtage bashtage left a comment

Owner

Please fix the ruff linting error.

@bashtage

Owner

Need Ruff fix, and probably to be blackened. Should probably run isort on the changed files as well.

Oh no! 💥 💔 💥
5 files would be reformatted, 124 files would be left unchanged.
Skipped 1 files
RUF002 Docstring contains ambiguous `–` (EN DASH). Did you mean `-` (HYPHEN-MINUS)?
    --> linearmodels/iv/model.py:1098:54
     |
1096 |         ----------
1097 |         Hansen, B. E. & Lee, S. (2021). Inference for iterated GMM under
1098 |         misspecification. *Econometrica*, 89(3), 1419–1447.
     |                                                      ^
1099 |         """
1100 |         y, x, z = self._wy, self._wx, self._wz
     |

RUF002 Docstring contains ambiguous `–` (EN DASH). Did you mean `-` (HYPHEN-MINUS)?
    --> linearmodels/iv/results.py:1503:54
     |
1501 |         ----------
1502 |         Hansen, B. E. & Lee, S. (2021). Inference for iterated GMM under
1503 |         misspecification. *Econometrica*, 89(3), 1419–1447.
     |                                                      ^
1504 |         """
1505 |         return self._robust_j_stat
     |

F841 Local variable `expected_df` is assigned to but never used
  --> linearmodels/tests/iv/test_hansen_lee_j_stat.py:78:5
   |
76 |     ninstr = data.instr.shape[1] + data.exog.shape[1]
77 |     nendog = data.endog.shape[1]
78 |     expected_df = ninstr - nendog - data.exog.shape[1]
   |     ^^^^^^^^^^^
79 |     # df = total instruments - total params = (nexog+ninstr) - (nexog+nendog)
80 |     #    = ninstr - nendog
   |
help: Remove assignment to unused variable `expected_df`

Found 3 errors.
No fixes available (1 hidden fix can be enabled with the `--unsafe-fixes` option).
linearmodels/tests/iv/test_hansen_lee_j_stat.py:78:5: F841 local variable 'expected_df' is assigned to but never used

def test_robust_j_stat_df_equals_overidentification_degree(res_robust, data):
    ninstr = data.instr.shape[1] + data.exog.shape[1]
    nendog = data.endog.shape[1]
    expected_df = ninstr - nendog - data.exog.shape[1]
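Ruff suggests removing `expected_df`, but given the test's name the variable may have been meant for an assertion. The following is a rough sketch of the degrees-of-freedom arithmetic from the comment in the excerpt; the `SimpleNamespace` stand-in for the `data` fixture, its column counts, and the helper name are assumptions made for illustration.

```python
from types import SimpleNamespace

import numpy as np


def overidentification_df(nexog, nendog, ninstr):
    """Hypothetical helper: total instruments minus total parameters."""
    # (nexog + ninstr) - (nexog + nendog) simplifies to ninstr - nendog
    return (nexog + ninstr) - (nexog + nendog)


# Stand-in for the `data` fixture; the shapes are made up for this sketch
data = SimpleNamespace(
    exog=np.empty((100, 3)),
    endog=np.empty((100, 2)),
    instr=np.empty((100, 4)),
)

expected_df = overidentification_df(
    data.exog.shape[1], data.endog.shape[1], data.instr.shape[1]
)
# In the real test the value would then be used rather than discarded, e.g.
#     assert res_robust.robust_j_stat.df == expected_df
# (this assumes the returned statistic exposes a `df` attribute)
print(expected_df)
```

Using the value in an assertion would clear F841 and make the test check what its name promises; deleting the assignment, as Ruff proposes, is the other option.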

Development

Successfully merging this pull request may close these issues.

ENH: add misspecificatin robust inference for GMM models
