Add ACA marketplace plan selection proxies by daphnehanse11 · Pull Request #618 · PolicyEngine/policyengine-us-data

daphnehanse11 · 2026-03-17T22:32:13Z

Summary

This PR adds ACA marketplace plan-selection proxies to the calibration pipeline.

It introduces a simple bronze-vs-benchmark selection layer that can be used to model households leaving part of the benchmark-based ACA credit unused when they choose a cheaper bronze plan.

Concretely, this PR:

adds CMS-derived 2024 marketplace proxy tables under policyengine_us_data/storage/calibration_targets
adds a shared marketplace_plan_selection helper for seeded bronze-vs-benchmark assignment
wires those assignments into publish_local_area.py
wires derived proxy aggregates into unified_matrix_builder.py
adds tests for the proxy builder and seeded assignment logic
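
To make "seeded bronze-vs-benchmark assignment" concrete, here is a minimal sketch of one way such an assignment could work. It is not the code of the PR's marketplace_plan_selection helper: the function names, the hashing scheme, and the convention that a benchmark chooser gets a ratio of 1.0 are all assumptions made for illustration.

```python
import hashlib


def seeded_uniform(tax_unit_id: int, seed: int = 0) -> float:
    """Map (seed, tax_unit_id) to a reproducible draw in [0, 1).

    Hypothetical helper: hashing keeps the draw stable across runs and
    independent of row order.
    """
    digest = hashlib.sha256(f"{seed}:{tax_unit_id}".encode()).digest()
    # Keep 53 bits so the quotient is exactly representable and below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def assign_plan(tax_unit_id, bronze_probability, bronze_to_benchmark_ratio, seed=0):
    """Return (selects_bronze, benchmark_ratio) for one tax unit.

    Assumption: a unit that stays on the benchmark plan has ratio 1.0.
    """
    selects_bronze = seeded_uniform(tax_unit_id, seed) < bronze_probability
    ratio = bronze_to_benchmark_ratio if selects_bronze else 1.0
    return selects_bronze, ratio
```

Calling assign_plan twice with the same tax unit ID and seed returns the same result, which is the property a seeded assignment needs so that the matrix build and the published file agree.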

New derived outputs

This PR adds calibration-side / published-data support for:

selected_marketplace_plan_premium_proxy
used_aca_ptc
unused_aca_ptc

Published tax-unit inputs now also include:

selects_bronze_marketplace_plan
selected_marketplace_plan_benchmark_ratio
state_marketplace_bronze_probability
state_marketplace_bronze_to_benchmark_ratio
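
As a rough sketch of how the derived outputs could relate to the inputs, the function below decomposes a credit into used and unused parts. The formulas are assumptions, not taken from the PR: the premium proxy is taken to be a hypothetical benchmark_premium input scaled by selected_marketplace_plan_benchmark_ratio, and the used credit is capped at that premium.

```python
def decompose_ptc(aca_ptc, benchmark_premium, benchmark_ratio):
    """Split aca_ptc into used and unused parts for one tax unit.

    Assumptions (illustrative only):
      - benchmark_premium is a hypothetical annual benchmark premium input
      - the selected plan's premium is benchmark_premium * benchmark_ratio
      - the credit cannot exceed the selected plan's premium
    """
    selected_premium_proxy = benchmark_premium * benchmark_ratio
    used_aca_ptc = min(aca_ptc, selected_premium_proxy)
    unused_aca_ptc = aca_ptc - used_aca_ptc
    return selected_premium_proxy, used_aca_ptc, unused_aca_ptc


# Illustrative numbers only: a bronze chooser whose plan costs 75% of benchmark.
premium, used, unused = decompose_ptc(7000.0, 8000.0, 0.75)
```

Under these assumptions a unit that stays on the benchmark plan (ratio 1.0) never leaves credit unused, because the credit is already bounded by the benchmark premium.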

Data sources and fallbacks

The proxy builder uses 2024 CMS marketplace public use files where available.

For state-based marketplaces (SBMs) whose price menus are missing, the derived state ratio table carries an explicit state-specific fallback with provenance columns (source, source_year, source_basis), so the fallback is visible in the data rather than hidden in code.
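
One way to picture a ratio table that carries its own provenance is sketched below. Only the three provenance column names come from the description above; the StateRatio type, the build_ratio_table function, and the label strings such as "cms_puf" are hypothetical, and any numbers passed in would be placeholders rather than CMS figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateRatio:
    state: str
    bronze_to_benchmark_ratio: float
    source: str        # where the ratio came from
    source_year: int   # year of the underlying data
    source_basis: str  # how the ratio was derived


def build_ratio_table(states, observed, fallbacks):
    """Build one row per state, preferring observed ratios over fallbacks.

    observed:  {state: ratio} derived from the public use files
    fallbacks: {state: (ratio, source, source_year, source_basis)}
    A state with neither raises, so a gap cannot pass silently.
    """
    rows = []
    for state in states:
        if state in observed:
            rows.append(
                StateRatio(state, observed[state], "cms_puf", 2024, "observed_price_menu")
            )
        elif state in fallbacks:
            ratio, source, year, basis = fallbacks[state]
            rows.append(StateRatio(state, ratio, source, year, basis))
        else:
            raise KeyError(f"no ratio or fallback for state {state!r}")
    return rows
```

Because every row records its source, a reviewer can filter the table for fallback rows instead of reading the builder code to find them.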

Validation

Automated tests:

pytest policyengine_us_data/tests/test_marketplace_plan_selection.py policyengine_us_data/tests/test_aca_marketplace_plan_selection_proxies.py -q
pytest policyengine_us_data/tests/test_calibration/test_unified_matrix_builder.py -q

Smoke test:

built an SC-only synthetic target matrix for selected_marketplace_plan_premium_proxy, used_aca_ptc, and unused_aca_ptc
published an H5 from the same geography and unit weights
recomputed the same totals from the published H5
matrix totals and H5 recomputation matched to float noise
used_aca_ptc + unused_aca_ptc == aca_ptc held exactly in the published H5 audit
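
The smoke test above can be sketched as a small audit routine. This is an illustration of the idea, not the script that was run: the function names, the tolerance values, and the list-based data layout are assumptions.

```python
import math


def weighted_total(values, weights):
    """Weighted sum of per-unit values, using fsum to limit rounding error."""
    return math.fsum(v * w for v, w in zip(values, weights))


def audit_totals(matrix_totals, unit_values, weights, rel_tol=1e-9):
    """Return the variables whose recomputed total disagrees with the matrix.

    matrix_totals: {variable: total taken from the target matrix}
    unit_values:   {variable: per-unit values read back from the published file}
    """
    mismatches = []
    for name, expected in matrix_totals.items():
        recomputed = weighted_total(unit_values[name], weights)
        if not math.isclose(recomputed, expected, rel_tol=rel_tol, abs_tol=1e-9):
            mismatches.append(name)
    return mismatches


def identity_holds(used, unused, aca_ptc):
    """Check used + unused == aca_ptc exactly, unit by unit."""
    return all(u + n == p for u, n, p in zip(used, unused, aca_ptc))
```

An empty list from audit_totals corresponds to "matched to float noise", while identity_holds is the stricter exact check on the decomposition.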

Notes

This PR is additive plumbing. It does not:

change ACA law logic in policyengine-us
change the current calibration target set
make these variables part of the optimization objective yet

A follow-up rules PR in policyengine-us can expose the new tax-unit variables as first-class model variables.

baogorek

Hi @daphnehanse11, I'm going to let my Claude do the talking below, but the short of it is that there's a lot to do. I think Codex went for the quick win, and there's just not a quick win here.

The CMS data sourcing is thorough, and the underlying goal of decomposing PTC into used vs. unused makes sense. However, I think the approach needs to be restructured. The matrix builder should stay generic and not contain variable-specific logic, and the variables you're deriving don't yet exist in the places they need to for calibration to actually work.

Here's the full path I'd suggest, roughly in dependency order:

1. policyengine-us: Add used_aca_ptc, unused_aca_ptc, and selects_bronze_marketplace_plan as real calculated variables with formulas and parameters. The state-level bronze selection probabilities and price ratios from your CMS data become parameters there. Everything downstream depends on these existing first.
2. ETL scripts (policy_data.db): Derive state-level calibration targets (e.g., total used PTC by state) from the CMS data and load them into the targets database. That's where calibration targets live now.
3. enhanced_cps.py: Wire up the bronze plan selection so the legacy calibration pipeline has access to the new variables.
4. target_config.yaml: Add the new variable names so the unified matrix builder picks them up. This needs no code changes to the builder itself, just config.

With this approach, the matrix builder never needs to know what these variables are. It just sees new names in the config and new rows in the database, same as any other target.

I'd suggest starting with step 1 since everything else depends on it.
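
The "generic builder" idea in the comment above can be sketched as follows. This is not the code of unified_matrix_builder.py, and the targets table layout (variable, state, value) is a hypothetical stand-in for the real policy_data.db schema. The point is that the function contains no variable names: everything it builds is driven by the configured list and the database rows.

```python
import sqlite3


def build_matrix_rows(conn, configured_variables, unit_values, unit_states):
    """Build one matrix row per target whose variable is named in the config.

    conn:                 sqlite3 connection with a hypothetical `targets` table
    configured_variables: variable names, e.g. read from a YAML config
    unit_values:          {variable: per-unit values}
    unit_states:          state code of each unit, aligned with unit_values
    """
    if not configured_variables:
        return []
    placeholders = ",".join("?" for _ in configured_variables)
    query = (
        "SELECT variable, state, value FROM targets "
        f"WHERE variable IN ({placeholders}) ORDER BY variable, state"
    )
    rows = []
    for variable, state, target in conn.execute(query, list(configured_variables)):
        values = unit_values[variable]
        # A unit contributes its value only to targets for its own state.
        row = [v if s == state else 0.0 for v, s in zip(values, unit_states)]
        rows.append((variable, state, target, row))
    return rows
```

Adding used_aca_ptc as a target in this sketch means adding its name to the configured list and its rows to the table; the function body does not change.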

New "under construction" node type (amber dashed) for showing pipeline changes that are actively being developed:

US:
- PR #611: Pipeline orchestrator in Overview (Modal hardening)
- PR #540: Category takeup rerandomization in Stage 2, extracted puf_impute.py + source_impute.py modules in Stage 4
- PR #618: CMS marketplace data + plan selection in Stage 5

UK:
- PR #291: New Stage 9, OA calibration pipeline (6 phases)
- PR #296: New Stage 10, Adversarial weight regularisation
- PR #279: Modal GPU calibration nodes in Stages 6, 7, Overview

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…-plan-selection

# Conflicts:
# policyengine_us_data/calibration/unified_matrix_builder.py
# policyengine_us_data/storage/calibration_targets/README.md
# tests/unit/test_aca_marketplace_plan_selection_proxies.py
# tests/unit/test_aca_marketplace_targets.py
# tests/unit/test_marketplace_plan_selection.py

baogorek

@daphnehanse11 I'm requesting that this PR be refocused on the targets ETL and perhaps the ECPS logic. Please note that the PR as it stands will not affect the ECPS, because it touches neither loss.py nor enhanced_cps.py. I don't think your coding agent was able to pick up on the two distinct paths.

I cannot approve the changes in unified_matrix_builder.py or publish_local_area.py, and I recommend that they be removed from the PR. Hard-coded variables in the matrix builder are what made the junkyard the junkyard. We need to do everything humanly (or codexly) possible to never, ever hard-code a variable in unified_matrix_builder.py.

It is possible that publish_local_area.py will need a small modification before this works in local area calibration. Once these targets are in, we can start building models locally and test out the changes. So, I really think this needs to be a two-part process.

So if you want the ECPS to be improved, which will get you a benefit now, this PR needs a separate edit to loss.py or enhanced_cps.py. In that case, some CSVs are acceptable in the storage/calibration folder. If you only want better local area h5 calibration, then there should not be CSVs at all, with the exception of sources that are not available for download online (like our national "Tips" target). Please see etl_medicaid.py for reference.

Note: the meaning of "ETL" is
E: Extract from the original source
T: Transform the data
L: Load the data into the database.
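
A minimal sketch of the three ETL steps, for illustration only. It does not reproduce etl_medicaid.py or the real targets schema: the column names (state, amount), the targets table layout, and the in-memory CSV standing in for a downloaded source file are all assumptions.

```python
import csv
import io
import sqlite3
from collections import defaultdict


def extract(raw_csv: str):
    """E: parse the source file (here, CSV text already fetched from the source)."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(records, variable):
    """T: aggregate record-level amounts into one target per state."""
    totals = defaultdict(float)
    for record in records:
        totals[record["state"]] += float(record["amount"])
    return [(variable, state, total) for state, total in sorted(totals.items())]


def load(conn, rows):
    """L: write the targets into the database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS targets (variable TEXT, state TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO targets VALUES (?, ?, ?)", rows)
    conn.commit()
```

Keeping the extract step pointed at the original source, rather than at a CSV checked into the repository, is what distinguishes this pattern from storing derived tables under storage.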

Forgive me for being tough on this PR: the target sourcing is excellent work. There is just a lot of risk in modifying some of these files.

daphnehanse11 added 3 commits March 17, 2026 18:29

Add ACA marketplace plan selection proxies

7d52739

Format marketplace plan selection files

68beaf4

Add marketplace fallback premium data

bb4ab99

baogorek requested changes Mar 18, 2026


Add marketplace target ETL and validator scaffold

3d1fa04

daphnehanse11 added 5 commits April 8, 2026 11:27

Fix marketplace plan selection lint

63b4eae

Use selected-plan ratio for ACA bronze targets

7e2be50

Add changelog fragment for marketplace proxy work

06799fe

Rename changelog fragment for Towncrier

7bb5a95

daphnehanse11 requested review from baogorek and juaristi22 April 9, 2026 13:53

baogorek requested changes Apr 10, 2026

daphnehanse11 wants to merge 9 commits into main from codex/aca-marketplace-plan-selection

2 participants
