Conversation
Update after re-running the calibration comparison more carefully. The earlier optimistic
Post-constituency-recalibration summary across 5 seeds:
So the stable result is not “clear across-the-board improvement”. The full asset-rule bundle does consistently improve national mean error by about
Two concrete takeaways:
Follow-up on the calibration/target side after rerunning the fair admin-only comparison.
With the new ESA/JSA target additions in this PR, the fair comparison is not "everything gets dramatically better". Over 5 matched fresh constituency-recalibration seeds:
But the rest of the metrics are mixed:
So the cleaner conclusion is that the expanded asset-rule package improves national mean admin-target error, but not the whole loss surface.
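A matched-seed comparison of this kind can be sketched as follows. This is a minimal illustration, not the script used for the numbers in this thread: the function names, the target names, and all values are hypothetical placeholders, and "error" is assumed here to mean absolute relative error against each admin target.

```python
from statistics import mean


def abs_rel_error(estimate, target):
    """Absolute relative error of one estimate against its target."""
    return abs(estimate - target) / abs(target)


def mean_error(estimates, targets):
    """Mean absolute relative error across all targets for one run."""
    return mean(abs_rel_error(estimates[name], targets[name]) for name in targets)


def matched_seed_deltas(baseline_runs, candidate_runs, targets):
    """Per-seed change in mean error (candidate minus baseline).

    Only seeds present in both runs are compared, so each delta is a
    like-for-like pair. Negative values mean the candidate improved.
    """
    shared_seeds = sorted(baseline_runs.keys() & candidate_runs.keys())
    return {
        seed: mean_error(candidate_runs[seed], targets)
        - mean_error(baseline_runs[seed], targets)
        for seed in shared_seeds
    }


def per_target_deltas(baseline, candidate, targets):
    """Change in error for each target within a single matched seed."""
    return {
        name: abs_rel_error(candidate[name], targets[name])
        - abs_rel_error(baseline[name], targets[name])
        for name in targets
    }


# Toy placeholder values only; these are NOT results from this PR.
targets = {"target_a": 100.0, "target_b": 200.0}
baseline_runs = {0: {"target_a": 110.0, "target_b": 220.0}}
candidate_runs = {0: {"target_a": 102.0, "target_b": 230.0}}

seed_deltas = matched_seed_deltas(baseline_runs, candidate_runs, targets)
target_deltas = per_target_deltas(baseline_runs[0], candidate_runs[0], targets)
```

In the toy data the mean error falls while `target_b` gets worse, which is the same shape as the conclusion above: an improvement in the mean does not imply an improvement on every metric.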
I also checked the current source dates against official releases. There are newer admin targets available for several target families:
The new ESA/JSA count targets added here are different: they are already anchored to the current DWP benefit statistics release for February 2026, so I do not think they are the stale part.
Summary
jsa rather than only jsa_income
Why
The current calibration target set barely constrains ESA/JSA. ESA only has a coarse OBR spending anchor, and JSA spending was incorrectly matched to jsa_income only. That made the ESA/JSA asset-rule integration unstable under recalibration.
Verification
uv run pytest -q policyengine_uk_data/tests/test_esa_jsa_targets.py
uv run pytest -q policyengine_uk_data/tests/test_target_registry.py
uvx ruff check policyengine_uk_data/targets/sources/dwp.py policyengine_uk_data/targets/sources/obr.py policyengine_uk_data/tests/test_esa_jsa_targets.py
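For readers who want to see why the jsa versus jsa_income distinction matters, here is a minimal sketch. It is not taken from the test files listed above. It assumes that total JSA is the sum of an income-based component (jsa_income) and a contribution-based component (called jsa_contrib here as an assumption), and it uses a hypothetical helper and made-up household rows.

```python
def weighted_total(rows, variables):
    """Weighted aggregate of the given variables across household rows."""
    return sum(
        row["weight"] * sum(row[variable] for variable in variables)
        for row in rows
    )


# Hypothetical households; the amounts and weights are placeholders.
households = [
    {"weight": 2.0, "jsa_income": 100.0, "jsa_contrib": 0.0},
    {"weight": 3.0, "jsa_income": 0.0, "jsa_contrib": 50.0},
]

# Aggregate the calibration would see if the target were matched to
# jsa_income only.
income_only_total = weighted_total(households, ["jsa_income"])

# Aggregate if the target is matched to total JSA (assumed to be
# jsa_income + jsa_contrib).
full_jsa_total = weighted_total(households, ["jsa_income", "jsa_contrib"])

# The gap is spending that a total-JSA target would wrongly push onto
# the income-based component alone.
unmatched_spending = full_jsa_total - income_only_total
```

Under these assumptions, calibrating a total JSA spending figure against the income-only aggregate would push weights to close a gap that comes from the variable mismatch rather than from the data.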