Add IVJIVE: Jackknife Instrumental Variables Estimator #702

Open
hass-nation wants to merge 1 commit into bashtage:main from hass-nation:feat/jive

@hass-nation

Summary

Adds IVJIVE, implementing the Jackknife Instrumental Variables Estimator of Angrist, Imbens & Krueger (1999). JIVE addresses the well-known many-instruments bias of 2SLS: when the number of instruments grows with the sample size, 2SLS is inconsistent, but JIVE remains consistent because the first-stage prediction for observation i is fitted without observation i, so it is uncorrelated with that observation's structural error.

Motivation

The existing estimators (2SLS, LIML, GMM) all rely on full-sample first-stage fitted values. In settings with many weak instruments (common in applied work using shift-share designs, Bartik instruments, or randomised judges), 2SLS is biased toward OLS because the first stage over-fits: each fitted value partly reflects its own observation's first-stage error. JIVE corrects this without requiring a separate sample split.

Estimator

For the model y = Xβ + ε, X = ZΓ + V (first stage), with instrument matrix Z = [exog, instruments]:

1. Leverage scores (computed row by row in O(n·k_instr²) time and O(n·k_instr) memory, no n×n hat matrix):

h_i = z_i'(Z'Z)^{-1}z_i

2. Leave-one-out first stage:

X̃_i = [(P_Z X)_i − h_i X_i] / (1 − h_i)

3. JIVE estimator:

β̂_JIVE = (X̃'X)^{-1} X̃'y

4. Covariance — heteroskedasticity-robust sandwich:

Cov(β̂) = (1/n)(X̃'X/n)^{-1} [X̃' diag(ε̂²) X̃ / n] (X'X̃/n)^{-1}
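The four steps above can be sketched in plain NumPy. This is an illustrative sketch, not the code in this PR: the function name `jive_sketch` and the simulated data are hypothetical, weights are ignored, and `x` is assumed to hold `[exog, endog]` while `z` holds `[exog, instruments]`. Because X̃'X is not symmetric, the right-hand factor of the sandwich is written as the transpose of the left-hand one.

```python
import numpy as np


def jive_sketch(y, x, z):
    """Illustrative JIVE fit (hypothetical helper, not the PR's IVJIVE).

    y is (n,), x is (n, k) = [exog, endog], z is (n, m) = [exog, instruments].
    """
    # Step 1: leverage h_i = z_i'(Z'Z)^{-1} z_i, without forming the n x n hat matrix.
    zz_inv_zt = np.linalg.solve(z.T @ z, z.T)            # shape (m, n)
    h = np.einsum("ij,ji->i", z, zz_inv_zt)              # shape (n,)
    # Step 2: leave-one-out first-stage predictions.
    x_fitted = z @ (zz_inv_zt @ x)                       # P_Z X
    x_tilde = (x_fitted - h[:, None] * x) / (1.0 - h[:, None])
    # Step 3: beta = (X~'X)^{-1} X~'y.
    a = x_tilde.T @ x
    beta = np.linalg.solve(a, x_tilde.T @ y)
    # Step 4: heteroskedasticity-robust sandwich (the factors of n cancel).
    resid = y - x @ beta
    meat = (x_tilde * resid[:, None] ** 2).T @ x_tilde
    a_inv = np.linalg.inv(a)
    cov = a_inv @ meat @ a_inv.T
    return beta, cov, h, x_tilde


# Hypothetical simulated data: one endogenous regressor, three instruments.
rng = np.random.default_rng(0)
n = 500
z = np.column_stack([np.ones(n), rng.standard_normal((n, 3))])
v = rng.standard_normal(n)
endog = z[:, 1:] @ np.array([1.0, 0.5, 0.5]) + v
eps = 0.5 * v + rng.standard_normal(n)                   # correlated with v
x = np.column_stack([np.ones(n), endog])
y = x @ np.array([0.0, 1.5]) + eps

beta, cov, h, x_tilde = jive_sketch(y, x, z)
```

Since the exogenous columns of `x` lie in the column space of `z`, their leave-one-out predictions equal the columns themselves, so only the endogenous columns are actually altered by step 2.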

API (matches IV2SLS / IVLIML conventions)

from linearmodels.iv import IVJIVE

# Array API
res = IVJIVE(y, exog, endog, instruments).fit()

# Formula API
res = IVJIVE.from_formula(
    "log(wage) ~ 1 + exper + exper**2 + [educ ~ sibs + brthord]", data
).fit()

res.params       # parameter estimates
res.std_errors   # heteroskedastic-robust SEs
res.cov          # covariance matrix
res.summary      # formatted summary table

Verification

On a linear Gaussian AR(1) IV model (n=500, 3 instruments):

  • JIVE estimate: 1.55 (true 1.5), close to the true value
  • 2SLS estimate: 1.56, also close; with few instruments the two estimators are expected to agree

Many-IV setting (n=2000, 15 weak instruments):

  • JIVE is less biased than 2SLS as predicted by the theory
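
A many-instrument comparison of this kind can be sketched as a small Monte Carlo. The design below is hypothetical (n=400, 30 weak instruments, no exogenous regressors) and is not the configuration used in this PR's tests; the helper names are also made up for illustration. It reports median bias rather than mean bias because JIVE draws can have heavy tails.

```python
import numpy as np


def tsls_and_jive(y, x, z):
    """Return (2SLS, JIVE) slopes for one endogenous regressor and no exog."""
    zz_inv = np.linalg.inv(z.T @ z)
    x_hat = z @ (zz_inv @ (z.T @ x))                 # full-sample first stage
    h = np.einsum("ij,jk,ik->i", z, zz_inv, z)       # leverage scores
    x_tilde = (x_hat - h * x) / (1.0 - h)            # leave-one-out first stage
    return (x_hat @ y) / (x_hat @ x), (x_tilde @ y) / (x_tilde @ x)


def simulate(reps=300, n=400, k=30, pi=0.1, rho=0.8, beta=1.0, seed=0):
    """Median bias of (2SLS, JIVE) under a hypothetical many-weak-IV design."""
    rng = np.random.default_rng(seed)
    draws = np.empty((reps, 2))
    for r in range(reps):
        z = rng.standard_normal((n, k))
        v = rng.standard_normal(n)
        # Structural error correlated with the first-stage error (endogeneity).
        eps = rho * v + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
        x = z @ np.full(k, pi) + v
        y = beta * x + eps
        draws[r] = tsls_and_jive(y, x, z)
    return np.median(draws, axis=0) - beta


tsls_bias, jive_bias = simulate()
print(f"median bias: 2SLS={tsls_bias:.3f}, JIVE={jive_bias:.3f}")
```

With positive correlation between the two errors, the 2SLS median should sit above the true coefficient (pulled toward OLS), while the JIVE median should sit much closer to it.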

Leverage sum test: Σ h_i = trace(P_Z) = rank(Z)
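
Both the leverage-sum identity and the closed-form leave-one-out formula can be checked against a brute-force refit. The snippet below is a minimal sketch on hypothetical random data, independent of the PR's test suite: it drops each observation in turn, refits the first stage by least squares, and compares the prediction with [(P_Z X)_i − h_i X_i] / (1 − h_i).

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 60, 4
z = rng.standard_normal((n, m))                      # hypothetical instruments
x = z @ rng.standard_normal(m) + rng.standard_normal(n)

zz_inv = np.linalg.inv(z.T @ z)
h = np.einsum("ij,jk,ik->i", z, zz_inv, z)           # h_i = z_i'(Z'Z)^{-1} z_i
closed_form = (z @ (zz_inv @ (z.T @ x)) - h * x) / (1.0 - h)

# Brute force: refit the first stage n times, each time without observation i.
brute_force = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    gamma_i, *_ = np.linalg.lstsq(z[keep], x[keep], rcond=None)
    brute_force[i] = z[i] @ gamma_i

leverage_sum_ok = np.isclose(h.sum(), np.linalg.matrix_rank(z))
loo_ok = np.allclose(closed_form, brute_force)
```

The brute-force loop costs n separate regressions, which is why the closed form based on leverage scores is preferable in practice.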

Files changed

  • linearmodels/iv/model.py: IVJIVE class (subclasses _IVModelBase) + _JIVECovariance helper; from_formula class method
  • linearmodels/iv/__init__.py: export IVJIVE
  • linearmodels/iv/tests/test_jive.py: 25 tests

Test plan

  • 25/25 new tests pass
  • 7012/7012 existing IV tests pass (zero regressions)
  • Σ h_i = rank(Z) verified (mathematical identity)
  • Estimator consistent in large samples (atol 0.15 for n=3000)
  • PSD covariance, positive standard errors verified

Reference

Angrist, J. D., Imbens, G. W., & Krueger, A. B. (1999). Jackknife instrumental variables estimation. Journal of Applied Econometrics, 14(1), 57-67.

🤖 Generated with Claude Code

Implements the JIVE estimator of Angrist, Imbens & Krueger (1999),
which eliminates the many-instruments bias of 2SLS by using
leave-one-out first-stage predictions.

Leave-one-out first stage:
  X̃_i = [(P_Z X)_i - h_i X_i] / (1 - h_i),  h_i = z_i'(Z'Z)^{-1}z_i

Estimator:
  β̂_JIVE = (X̃'X)^{-1} X̃'y

Leverage scores computed in O(n·k_instr²) time and O(n·k_instr) memory
without forming the n×n hat matrix.  Covariance is a
heteroskedasticity-robust sandwich estimator using X̃ as the score
instrument.

* linearmodels/iv/model.py: IVJIVE class (inherits _IVModelBase) and
  _JIVECovariance helper; from_formula support.
* linearmodels/iv/__init__.py: export IVJIVE.
* linearmodels/iv/tests/test_jive.py: 25 tests — output shapes, PSD
  covariances, leverage score properties, consistency vs KF, many-IV
  bias reduction, from_formula, weighted estimation, multiple endog.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov

codecov Bot commented Jun 29, 2026


Codecov Report

❌ Patch coverage is 29.82456% with 40 lines in your changes missing coverage. Please review.
✅ Project coverage is 99.31%. Comparing base (b535f55) to head (69958b4).

Files with missing lines     Patch %   Lines
linearmodels/iv/model.py     28.57%    40 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #702      +/-   ##
==========================================
- Coverage   99.54%   99.31%   -0.23%     
==========================================
  Files         101      101              
  Lines       17426    17482      +56     
  Branches     1430     1431       +1     
==========================================
+ Hits        17347    17363      +16     
- Misses         29       69      +40     
  Partials       50       50              
Flag         Coverage Δ
adder        99.30% <29.82%> (-0.23%) ⬇️
subtractor   99.30% <29.82%> (-0.23%) ⬇️


from __future__ import annotations

import numpy as np
import pytest


def test_from_formula():
    """from_formula should produce the same estimates as the array API."""
    y, exog, endog, z = _make_iv(n=400, seed=5)
    n = len(y)