[codex] Fix microsim self-employment smoke test #7978

Open
daphnehanse11 wants to merge 1 commit into PolicyEngine:main from daphnehanse11:codex/fix-microsim-self-employment

Conversation

@daphnehanse11
Collaborator

Summary

  • update the microsimulation smoke test to treat self-employment income as nonzero when its aggregate magnitude is nonzero
  • document why a weighted self-employment total can be negative
  • add a changelog fragment

Why

The failing GitHub Actions job on PR #7977 was not caused by the Texas Medicaid change. It failed in policyengine_us/tests/microsimulation/test_microsim.py because the test assumed self_employment_income.sum() > 0.

That assumption is too strict: self_employment_income is modeled as net business income, so weighted losses can outweigh weighted profits in a sampled dataset and produce a negative aggregate total even when the variable is present and working.
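A toy illustration of that point, using made-up records and plain Python rather than the real dataset or `MicroSeries` (the values and weights below are hypothetical):

```python
# Hypothetical records: (net self-employment income, survey weight).
# These numbers are invented purely to illustrate the arithmetic.
records = [
    (40_000.0, 100.0),   # profitable business
    (-90_000.0, 80.0),   # large net loss
    (5_000.0, 50.0),     # small profit
]

# Weighted aggregate: losses can outweigh profits and push this below zero.
weighted_total = sum(value * weight for value, weight in records)

# Weighted aggregate of magnitudes: zero only if every record is zero.
weighted_abs_total = sum(abs(value) * weight for value, weight in records)

print(weighted_total)      # -2950000.0
print(weighted_abs_total)  # 11450000.0
```

Here a `> 0` check on the weighted total fails even though the variable is clearly populated, while the same check on the magnitude passes.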

Root cause

The smoke test was using a positivity check where it really needed a nonzero-magnitude check.

Validation

  • make format
  • python -m py_compile policyengine_us/tests/microsimulation/test_microsim.py
  • confirmed locally that MicroSeries.sum() can be negative while MicroSeries.abs().sum() remains nonzero

Notes

I could not run the full dataset-backed microsimulation test locally in this sandbox because the Hugging Face datasets are not reachable here, so CI should be the source of truth for the full repro and verification.

@codecov

codecov bot commented Apr 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.84%. Comparing base (b70c547) to head (65e4304).
⚠️ Report is 13 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##              main    #7978      +/-   ##
===========================================
- Coverage   100.00%   92.84%   -7.16%     
===========================================
  Files            1     4212    +4211     
  Lines           13    60818   +60805     
  Branches         0      307     +307     
===========================================
+ Hits            13    56468   +56455     
- Misses           0     4222    +4222     
- Partials         0      128     +128     
Flag Coverage Δ
unittests 92.84% <ø> (-7.16%) ⬇️

@daphnehanse11 marked this pull request as ready for review on April 10, 2026, 19:51
Collaborator

@baogorek left a comment

Please see the comment. My robot thinks self_employment_income can be negative whereas employment_income is non-negative, so they should be treated separately.

# Check that the microsim calculates important variables as nonzero in current year.
# Self-employment is net business income, so weighted totals can be negative
# when losses outweigh profits in the sampled records.
for var in ["employment_income", "self_employment_income"]:
Collaborator

My robot is saying that we should split these apart in the list: do .sum() > 0 for employment_income and .abs().sum() > 0 for self_employment_income.
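
A rough sketch of what that split could look like, written against plain lists instead of the actual microsimulation objects. The helper names (`weighted_sum`, `has_positive_total`, `has_nonzero_magnitude`, `failing_variables`) and the `CHECKS` mapping are hypothetical and not taken from test_microsim.py:

```python
def weighted_sum(values, weights):
    """Weighted total of a variable across records."""
    return sum(v * w for v, w in zip(values, weights))


def has_positive_total(values, weights):
    """Strict check for variables that should never be negative."""
    return weighted_sum(values, weights) > 0


def has_nonzero_magnitude(values, weights):
    """Looser check for net variables whose total may be negative."""
    return weighted_sum([abs(v) for v in values], weights) > 0


# Hypothetical mapping: each variable gets the check that matches its sign convention.
CHECKS = {
    "employment_income": has_positive_total,
    "self_employment_income": has_nonzero_magnitude,
}


def failing_variables(data, weights):
    """Return the names of variables whose check does not pass."""
    return [name for name, check in CHECKS.items() if not check(data[name], weights)]


# Made-up toy data: self-employment losses outweigh profits.
weights = [1.0, 2.0, 1.0]
data = {
    "employment_income": [30_000.0, 0.0, 52_000.0],
    "self_employment_income": [10_000.0, -40_000.0, 0.0],
}
print(failing_variables(data, weights))  # []
```

Keeping the strict check on employment_income means a sign error there would still be caught, which a blanket `.abs().sum() > 0` for both variables would miss.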
