Add ESA and JSA calibration targets by MaxGhenis · Pull Request #325 · PolicyEngine/policyengine-uk-data

MaxGhenis · 2026-04-11T21:34:28Z

Summary

- Add DWP claimant-count targets for ESA total, ESA contributory, ESA income-related, and current JSA claims.
- Fix the OBR Jobseeker's Allowance target to map to total jsa rather than only jsa_income.
- Add focused tests for the new DWP targets and the OBR JSA mapping.

Why

The current calibration target set barely constrains ESA/JSA. ESA only has a coarse OBR spending anchor, and JSA spending was incorrectly matched to jsa_income only. That made the ESA/JSA asset-rule integration unstable under recalibration.
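To illustrate the mapping issue, here is a minimal sketch with made-up numbers. It assumes total JSA is the sum of an income-based and a contributory component; the name `jsa_contrib`, the weights, and the amounts are hypothetical and not taken from the repository.

```python
import numpy as np

# Hypothetical household records: annual JSA amounts in pounds and survey weights.
jsa_income = np.array([3000.0, 0.0, 2500.0, 0.0])
jsa_contrib = np.array([0.0, 4000.0, 0.0, 3500.0])  # assumed component name
weights = np.array([100.0, 80.0, 120.0, 90.0])

# Assumption: total JSA is the sum of the two components.
jsa_total = jsa_income + jsa_contrib


def weighted_total(values, weights):
    """Weighted aggregate that a spending target would be compared against."""
    return float(np.sum(values * weights))


income_only = weighted_total(jsa_income, weights)
total = weighted_total(jsa_total, weights)
print(f"income-based only: {income_only:,.0f}")
print(f"total JSA:         {total:,.0f}")
```

In this toy data the income-based aggregate is less than half of the total, so matching a total-JSA spending target against it would push the calibration to scale up income-based claimants to close a gap that actually belongs to the contributory component.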

Verification

uv run pytest -q policyengine_uk_data/tests/test_esa_jsa_targets.py
uv run pytest -q policyengine_uk_data/tests/test_target_registry.py
uvx ruff check policyengine_uk_data/targets/sources/dwp.py policyengine_uk_data/targets/sources/obr.py policyengine_uk_data/tests/test_esa_jsa_targets.py

MaxGhenis · 2026-04-12T00:54:57Z

Update after re-running the calibration comparison more carefully.

The earlier optimistic ~18% national mean absolute relative error result was not a fair apples-to-apples comparison. I re-ran this as a matched admin-only benchmark using the same expanded target set, the same constituency calibration harness, and 5 matched random seeds for:

- current main
- all_assets = UC + HB + IS + PC + ESA/JSA asset-rule bundle

Post-constituency-recalibration summary across 5 seeds:

| Model | National mean abs rel error | National median abs rel error | National within 10% | National within 20% | Constituency-local median abs rel error | Constituency-local within 10% |
| --- | --- | --- | --- | --- | --- | --- |
| `current_main` | `69.0%` median (`68.4%` to `69.2%`) | `18.4%` | `36.4%` | `52.2%` | `5.65%` | `89.9%` |
| `all_assets` | `61.6%` median (`61.3%` to `61.9%`) | `18.7%` | `36.6%` | `50.9%` | `5.71%` | `89.4%` |

So the stable result is not “clear across-the-board improvement”. The full asset-rule bundle does consistently improve national mean error by about 7.1pp, but it is roughly flat to slightly worse on the other national metrics and on constituency-local fit.
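For readers who want to see how these aggregate metrics can diverge, the following is a minimal sketch of how they might be computed for one run. The function name, the inclusive thresholds, and the toy numbers are assumptions for illustration; they are not the repository's evaluation code or results.

```python
import numpy as np


def summarize_errors(estimates, targets):
    """Return aggregate relative-error metrics for one calibration run."""
    estimates = np.asarray(estimates, dtype=float)
    targets = np.asarray(targets, dtype=float)
    # Absolute relative error per target; assumes no target is zero.
    abs_rel = np.abs(estimates - targets) / np.abs(targets)
    return {
        "mean_abs_rel_error": float(abs_rel.mean()),
        "median_abs_rel_error": float(np.median(abs_rel)),
        # Assumption: "within X%" means abs_rel <= X (inclusive).
        "within_10pct": float((abs_rel <= 0.10).mean()),
        "within_20pct": float((abs_rel <= 0.20).mean()),
    }


# Toy values only: one badly missed target dominates the mean.
targets = [100, 100, 100, 100]
baseline = summarize_errors([105, 95, 120, 400], targets)
variant = summarize_errors([108, 92, 125, 300], targets)
print(baseline)
print(variant)
```

In this toy case the variant lowers the mean error by shrinking the single large miss, while the median and the within-20% share both get worse. That is the same qualitative pattern as in the table: the mean is sensitive to a few large errors, the other metrics are not.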

Two concrete takeaways:

- The comparison needs to be treated as stochastic. Single calibration draws are not trustworthy here.
- The more important next step is probably a target audit / target-family breakdown, not more one-off recalibration runs. We already found one real mapping issue on JSA, and the current objective seems able to trade off target families in ways that are not obvious from one aggregate loss number.
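Treating the comparison as stochastic could look like the sketch below, which pairs runs by seed and summarizes the per-seed differences. The function name and the per-seed values are placeholders chosen for illustration, not the results of the runs described here.

```python
import statistics


def paired_seed_summary(baseline_by_seed, variant_by_seed):
    """Summarize variant-minus-baseline deltas over seeds present in both runs."""
    seeds = sorted(set(baseline_by_seed) & set(variant_by_seed))
    deltas = [variant_by_seed[s] - baseline_by_seed[s] for s in seeds]
    return {
        "n_seeds": len(seeds),
        "median_delta": statistics.median(deltas),
        "min_delta": min(deltas),
        "max_delta": max(deltas),
        # True only if every seed moves in the same direction.
        "same_sign": all(d < 0 for d in deltas) or all(d > 0 for d in deltas),
    }


# Placeholder metric values keyed by seed (not real results).
baseline_runs = {0: 0.70, 1: 0.68, 2: 0.69, 3: 0.71, 4: 0.69}
variant_runs = {0: 0.62, 1: 0.62, 2: 0.61, 3: 0.63, 4: 0.62}
summary = paired_seed_summary(baseline_runs, variant_runs)
print(summary)
```

Note that the median of paired deltas need not equal the difference between the two per-model medians, so it helps to state which of the two a reported delta refers to.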

MaxGhenis · 2026-04-12T01:00:08Z

Follow-up on the calibration/target side after rerunning the fair admin-only comparison.

Corrected calibration read

With the new ESA/JSA target additions in this PR, the fair comparison is not "everything gets dramatically better". Over 5 matched fresh constituency-recalibration seeds:

- current_main median national mean absolute relative error: 69.0%
- all_assets median national mean absolute relative error: 61.6%
- stable delta on that metric: about -7.1pp

But the rest of the metrics are mixed:

- national median absolute relative error: slightly worse
- national within 10%: slightly better
- national within 20%: worse
- constituency-local fit: slightly worse

So the cleaner conclusion is that the expanded asset-rule package improves national mean admin-target error, but not the whole loss surface.
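One way to see which parts of the loss surface move is to break the error down by target family instead of reporting a single aggregate. The sketch below assumes each target carries a family label; the labels and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def error_by_family(rows):
    """Mean absolute relative error per family.

    rows: iterable of (family, estimate, target); targets must be non-zero.
    """
    grouped = defaultdict(list)
    for family, estimate, target in rows:
        grouped[family].append(abs(estimate - target) / abs(target))
    return {family: mean(errors) for family, errors in grouped.items()}


# Hypothetical targets tagged with made-up family labels.
rows = [
    ("esa", 110.0, 100.0),
    ("esa", 90.0, 100.0),
    ("jsa", 150.0, 100.0),
    ("council_tax", 100.0, 100.0),
]
breakdown = error_by_family(rows)
for family, err in sorted(breakdown.items(), key=lambda kv: -kv[1]):
    print(f"{family:12s} {err:.1%}")
```

Comparing this breakdown between two models, per seed, would show whether an improvement in the national mean comes from one family at the expense of another.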

Target freshness audit

I also checked the current source dates against official releases. There are newer admin targets available for several target families:

- OBR welfare/tax-benefit spend targets in tax_benefit.csv still point at obr_march_2024_efo, but OBR published newer March 2026 detailed forecast tables on 3 March 2026.
- dwp.py benefit cap targets still use ...to-february-2025, but DWP has a newer official release, "Benefit cap: number of households capped to November 2025".
- dwp.py two-child-limit targets still reference the April 2024 publication, but DWP has a newer official release, "Universal Credit claimants statistics on the two child limit policy, April 2025".
- local_uc.py still uses country-level UC-by-children proportions from November 2023, while the UC statistics collection now runs through 12 February 2026 on Stat-Xplore, so those splits can be refreshed.

The new ESA/JSA count targets added here are different: they are already anchored to the current DWP benefit statistics release for February 2026, so I do not think they are the stale part.
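The audit above can be restated as a small staleness check. This is only a sketch: the dictionary keys are made-up labels, and each date is rounded to the first of the month named in the audit, so the gaps are approximate and mix reference periods with publication dates.

```python
from datetime import date

# (period currently referenced, newest period named in the audit), month precision.
AUDIT = {
    "obr_spend": (date(2024, 3, 1), date(2026, 3, 1)),
    "benefit_cap": (date(2025, 2, 1), date(2025, 11, 1)),
    "two_child_limit": (date(2024, 4, 1), date(2025, 4, 1)),
    "uc_by_children": (date(2023, 11, 1), date(2026, 2, 1)),
    "esa_jsa_counts": (date(2026, 2, 1), date(2026, 2, 1)),
}


def months_behind(in_use, latest):
    """Whole months between the period in use and the latest one."""
    return (latest.year - in_use.year) * 12 + (latest.month - in_use.month)


# Keep only families whose source lags the newest named release.
stale = {
    name: months_behind(in_use, latest)
    for name, (in_use, latest) in AUDIT.items()
    if months_behind(in_use, latest) > 0
}
for name, gap in sorted(stale.items(), key=lambda kv: -kv[1]):
    print(f"{name:16s} {gap} months behind")
```

Under these assumptions the ESA/JSA count targets do not appear in the stale set, which matches the reading above.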

Add ESA and JSA calibration targets

39a609a

MaxGhenis merged commit cdeb599 into main Apr 12, 2026
3 checks passed

MaxGhenis deleted the codex/uk-data-esa-targets branch April 12, 2026 02:52
