1 change: 1 addition & 0 deletions changelog.d/1118.added
@@ -0,0 +1 @@
Add scoped Stage 3 fitted-weight contract artifacts for regional and national fits.
10 changes: 10 additions & 0 deletions docs/engineering/pipeline-map.md
@@ -460,12 +460,14 @@ Fit regional log-weights using L0 HardConcrete gates on GPU
| `modal_gpu` Modal GPU Container | `external` | `unknown` | `unknown` | |
| `fit_spec_regional` FittedWeightsSpec regional | `library` | `unknown` | `unknown` | |
| `fit_artifacts_regional` ScopedFitArtifacts regional | `library` | `unknown` | `unknown` | |
| `fit_contract_builder_regional` FittedWeightsContractBuilder regional | `library` | `unknown` | `unknown` | |
| `create_model` Create SparseCalibrationWeights | `process` | `unknown` | `unknown` | |
| `extract_weights` Extract Weights | `process` | `unknown` | `unknown` | |
| `out_weights` calibration_weights.npy | `artifact` | `unknown` | `unknown` | |
| `out_geo_s6` geography_assignment.npz | `artifact` | `unknown` | `unknown` | |
| `out_diag` unified_diagnostics.csv | `artifact` | `unknown` | `unknown` | |
| `out_config_s6` unified_run_config.json | `artifact` | `unknown` | `unknown` | |
| `out_fit_contract_regional` fitted_weights_regional_contract.json | `artifact` | `unknown` | `unknown` | |
| `util_l0` l0-python | `utility` | `unknown` | `unknown` | |
| `util_pytorch` PyTorch | `utility` | `unknown` | `unknown` | |
| `init_weights` Compute Initial Weights | `library` | `current` | `moving` | `policyengine_us_data.calibration.unified_calibration.compute_initial_weights` |
@@ -479,6 +481,9 @@ Fit regional log-weights using L0 HardConcrete gates on GPU
- `fit_artifacts_regional` -> `out_geo_s6` `documents`
- `fit_artifacts_regional` -> `out_diag` `documents`
- `fit_artifacts_regional` -> `out_config_s6` `documents`
- `fit_artifacts_regional` -> `out_fit_contract_regional` `documents`
- `fit_model` -> `fit_contract_builder_regional` `data_flow`
- `fit_contract_builder_regional` -> `out_fit_contract_regional` `produces_artifact`
- `init_weights` -> `create_model` `data_flow`
- `create_model` -> `fit_model` `data_flow`
- `modal_gpu` -> `fit_model` `runs_on_infra` (runs on)
@@ -507,12 +512,14 @@ Fit national log-weights for the national H5 output using the same L0 calibratio
| `modal_gpu_national` Modal GPU Container | `external` | `unknown` | `unknown` | |
| `fit_spec_national` FittedWeightsSpec national | `library` | `unknown` | `unknown` | |
| `fit_artifacts_national` ScopedFitArtifacts national | `library` | `unknown` | `unknown` | |
| `fit_contract_builder_national` FittedWeightsContractBuilder national | `library` | `unknown` | `unknown` | |
| `create_model_national` Create National SparseCalibrationWeights | `process` | `unknown` | `unknown` | |
| `extract_national_weights` Extract National Weights | `process` | `unknown` | `unknown` | |
| `out_national_weights` national_calibration_weights.npy | `artifact` | `unknown` | `unknown` | |
| `out_national_geo_s6` national_geography_assignment.npz | `artifact` | `unknown` | `unknown` | |
| `out_national_diag` national_unified_diagnostics.csv | `artifact` | `unknown` | `unknown` | |
| `out_national_config_s6` national_unified_run_config.json | `artifact` | `unknown` | `unknown` | |
| `out_fit_contract_national` fitted_weights_national_contract.json | `artifact` | `unknown` | `unknown` | |
| `util_l0_national` l0-python | `utility` | `unknown` | `unknown` | |
| `util_pytorch_national` PyTorch | `utility` | `unknown` | `unknown` | |
| `init_weights` Compute Initial Weights | `library` | `current` | `moving` | `policyengine_us_data.calibration.unified_calibration.compute_initial_weights` |
@@ -526,6 +533,9 @@ Fit national log-weights for the national H5 output using the same L0 calibratio
- `fit_artifacts_national` -> `out_national_geo_s6` `documents`
- `fit_artifacts_national` -> `out_national_diag` `documents`
- `fit_artifacts_national` -> `out_national_config_s6` `documents`
- `fit_artifacts_national` -> `out_fit_contract_national` `documents`
- `fit_model` -> `fit_contract_builder_national` `data_flow`
- `fit_contract_builder_national` -> `out_fit_contract_national` `produces_artifact`
- `init_weights` -> `create_model_national` `data_flow`
- `create_model_national` -> `fit_model` `data_flow`
- `modal_gpu_national` -> `fit_model` `runs_on_infra` (runs on)
14 changes: 14 additions & 0 deletions docs/engineering/stages/fit_weights.md
@@ -21,6 +21,20 @@ step parameters. Manual legacy package runs may proceed without the contract
only through the explicit no-contract fallback, which emits a warning and
records that only the package checksum was available.

Each successful scoped fit writes a semantic Stage 3 handoff contract next to
the primary fitted-weight artifacts:

- regional: `fitted_weights_regional_contract.json`;
- national: `fitted_weights_national_contract.json`.

These contracts use the canonical `fitted_weights` stage-contract type, include
the matching Stage 2 package and contract inputs, list the scoped weights,
geography, run config, legacy diagnostics, and epoch log outputs, and embed the
solver parameters plus package, geography, weights, and diagnostics summaries.
The fit step manifests record these contract JSON files as normal outputs so
Stage 4 can validate a scoped semantic handoff without relying only on filename
conventions.
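
As a rough illustration of the handoff described above, the following sketch shows how a downstream consumer might check a scoped contract before trusting its outputs. The field names (`contract_type`, `scope`, `outputs`, `role`) and the helper `check_fitted_weights_contract` are hypothetical; the diff does not show the real contract schema, so treat this as a sketch under those assumptions rather than the project's validator.

```python
import json

# Hypothetical output roles, taken from the outputs the prose lists:
# weights, geography, run config, legacy diagnostics, and epoch log.
REQUIRED_OUTPUT_ROLES = {"weights", "geography", "run_config", "diagnostics", "epoch_log"}


def check_fitted_weights_contract(contract: dict, scope: str) -> list:
    """Return a list of problems; an empty list means the handoff looks usable."""
    problems = []
    # The contract type name comes from the docs; the key name is assumed.
    if contract.get("contract_type") != "fitted_weights":
        problems.append("unexpected contract type")
    if contract.get("scope") != scope:
        problems.append(f"expected scope {scope!r}")
    declared = {entry.get("role") for entry in contract.get("outputs", [])}
    for role in sorted(REQUIRED_OUTPUT_ROLES - declared):
        problems.append(f"missing output role: {role}")
    return problems


# Illustrative contract body; the layout is an assumption, the filenames are
# the regional artifact names used elsewhere in this change.
example = json.loads("""
{
  "contract_type": "fitted_weights",
  "scope": "regional",
  "outputs": [
    {"role": "weights", "path": "calibration_weights.npy"},
    {"role": "geography", "path": "geography_assignment.npz"},
    {"role": "run_config", "path": "unified_run_config.json"},
    {"role": "diagnostics", "path": "unified_diagnostics.csv"}
  ]
}
""")

print(check_fitted_weights_contract(example, "regional"))
# → ['missing output role: epoch_log']
```

Checking declared roles rather than filenames is the point of the semantic handoff: the consumer asks what each file is, not what it happens to be called.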

The current artifact names remain behavior-compatible:

- regional: `calibration_weights.npy`, `geography_assignment.npz`,
8 changes: 4 additions & 4 deletions docs/generated/pipeline_api.json
@@ -1164,7 +1164,7 @@
"docstring": "Scoped output bundle created before Stage 3 bytes become files.",
"id": "fitted_weights_output_bundle",
"kind": "class",
"line": 113,
"line": 302,
"metadata": {
"api_refs": [
"policyengine_us_data.fit_weights.bundles.FittedWeightsOutputBundle"
@@ -3086,7 +3086,7 @@
"docstring": "Promote a completed pipeline run to production.\n\n1. Verify run status is \"completed\"\n2. Promote every staged artifact in one Hugging Face commit\n3. Upload/copy every artifact to GCS\n4. Finalize release_manifest.json, tag the release, and update\n version_manifest.json\n5. Update run status to \"promoted\"\n\nArgs:\n run_id: The run ID to promote.\n candidate_version: Candidate staging scope used for staged source files.\n release_version: Stable version used for final release metadata.\n\nReturns:\n Summary message.",
"id": "promote_pipeline_run",
"kind": "function",
"line": 2091,
"line": 2133,
"metadata": {
"api_refs": [
"modal_app.pipeline.promote_run"
@@ -3541,7 +3541,7 @@
"docstring": "Run the full pipeline end-to-end.\n\nArgs:\n branch: Git branch to build from.\n gpu: GPU type for regional calibration.\n epochs: Training epochs for regional calibration.\n national_gpu: GPU type for national calibration.\n national_epochs: Training epochs for national.\n num_workers: Number of parallel H5 workers.\n n_clones: Number of clones for H5 building.\n skip_national: Skip national calibration/H5.\n resume_run_id: Resume a previously failed run.\n clear_checkpoints: Wipe ALL checkpoints before building\n (default False). Normally not needed \u2014 checkpoints are\n scoped by commit SHA, so stale ones from other commits\n are cleaned automatically. Use True only to force a\n full rebuild of the current commit.\n candidate_version: Candidate staging scope used for HF staging.\n release_version: Final stable release version. Usually empty until\n promotion.\n base_release_version: Stable release current when this candidate was\n built.\n release_bump: Intended SemVer bump for this candidate.\n sha_override: Exact source SHA deployed by GitHub Actions. When\n provided, this is recorded instead of reading the current\n branch tip.\n run_id: Cross-system run ID created by GitHub.\n run_context: Serialized run context from the launcher workflow.\n modal_app_name: Deployed Modal app name for this run.\n modal_environment: Modal environment used for this run.\n chunked_matrix: Build the calibration matrix in clone-household\n chunks instead of the non-chunked path. Opt-in; default off.\n chunk_size: Clone-household columns per chunk when\n ``chunked_matrix`` is True.\n parallel_matrix: Fan chunked matrix building across Modal\n workers via ``build_matrix_chunk_worker``. Only meaningful\n when ``chunked_matrix`` is True; ignored otherwise.\n num_matrix_workers: Number of Modal workers when\n ``parallel_matrix`` is True.\n\nReturns:\n The run ID for use with promote.",
"id": "run_modal_pipeline",
"kind": "function",
"line": 1113,
"line": 1115,
"metadata": {
"api_refs": [
"modal_app.pipeline.run_pipeline"
@@ -4479,7 +4479,7 @@
"docstring": "Verify deployed-image imports and subprocess seams.",
"id": "verify_runtime_seams",
"kind": "function",
"line": 739,
"line": 741,
"metadata": {
"api_refs": [
"modal_app.pipeline.verify_runtime_seams"
62 changes: 60 additions & 2 deletions docs/generated/pipeline_map.json
@@ -4486,6 +4486,21 @@
"source": "fit_artifacts_regional",
"target": "out_config_s6"
},
{
"edge_type": "documents",
"source": "fit_artifacts_regional",
"target": "out_fit_contract_regional"
},
{
"edge_type": "data_flow",
"source": "fit_model",
"target": "fit_contract_builder_regional"
},
{
"edge_type": "produces_artifact",
"source": "fit_contract_builder_regional",
"target": "out_fit_contract_regional"
},
{
"edge_type": "data_flow",
"source": "init_weights",
@@ -4546,14 +4561,16 @@
"node_ids": [
"fit_spec_regional",
"fit_artifacts_regional",
"fit_contract_builder_regional",
"init_weights",
"create_model",
"fit_model",
"extract_weights",
"out_weights",
"out_geo_s6",
"out_diag",
"out_config_s6"
"out_config_s6",
"out_fit_contract_regional"
]
}
],
@@ -4588,6 +4605,12 @@
"label": "ScopedFitArtifacts regional",
"node_type": "library"
},
{
"description": "Builds the regional fitted_weights stage contract from Stage 2 package identity, fit parameters, artifacts, and diagnostics",
"id": "fit_contract_builder_regional",
"label": "FittedWeightsContractBuilder regional",
"node_type": "library"
},
{
"description": "n_features = 5.16M, init_keep_prob = 0.999",
"id": "create_model",
@@ -4624,6 +4647,12 @@
"label": "unified_run_config.json",
"node_type": "artifact"
},
{
"description": "Scoped Stage 3 contract for regional fitted weights, geography, run config, and legacy diagnostics",
"id": "out_fit_contract_regional",
"label": "fitted_weights_regional_contract.json",
"node_type": "artifact"
},
{
"description": "SparseCalibrationWeights - HardConcrete gates",
"id": "util_l0",
@@ -4722,6 +4751,21 @@
"source": "fit_artifacts_national",
"target": "out_national_config_s6"
},
{
"edge_type": "documents",
"source": "fit_artifacts_national",
"target": "out_fit_contract_national"
},
{
"edge_type": "data_flow",
"source": "fit_model",
"target": "fit_contract_builder_national"
},
{
"edge_type": "produces_artifact",
"source": "fit_contract_builder_national",
"target": "out_fit_contract_national"
},
{
"edge_type": "data_flow",
"source": "init_weights",
@@ -4782,14 +4826,16 @@
"node_ids": [
"fit_spec_national",
"fit_artifacts_national",
"fit_contract_builder_national",
"init_weights",
"create_model_national",
"fit_model",
"extract_national_weights",
"out_national_weights",
"out_national_geo_s6",
"out_national_diag",
"out_national_config_s6"
"out_national_config_s6",
"out_fit_contract_national"
]
}
],
@@ -4824,6 +4870,12 @@
"label": "ScopedFitArtifacts national",
"node_type": "library"
},
{
"description": "Builds the national fitted_weights stage contract from Stage 2 package identity, fit parameters, artifacts, and diagnostics",
"id": "fit_contract_builder_national",
"label": "FittedWeightsContractBuilder national",
"node_type": "library"
},
{
"description": "National target scope with L0 HardConcrete gates",
"id": "create_model_national",
@@ -4860,6 +4912,12 @@
"label": "national_unified_run_config.json",
"node_type": "artifact"
},
{
"description": "Scoped Stage 3 contract for national fitted weights, geography, run config, and legacy diagnostics",
"id": "out_fit_contract_national",
"label": "fitted_weights_national_contract.json",
"node_type": "artifact"
},
{
"description": "SparseCalibrationWeights - HardConcrete gates",
"id": "util_l0_national",
38 changes: 38 additions & 0 deletions docs/pipeline_map.yaml
@@ -1047,6 +1047,7 @@ stages:
node_ids:
- fit_spec_regional
- fit_artifacts_regional
- fit_contract_builder_regional
- init_weights
- create_model
- fit_model
@@ -1055,6 +1056,7 @@ stages:
- out_geo_s6
- out_diag
- out_config_s6
- out_fit_contract_regional
extra_nodes:
- id: in_pkg_s6
label: calibration_package.pkl
@@ -1072,6 +1074,10 @@ stages:
label: ScopedFitArtifacts regional
node_type: library
description: Regional fitted-weight artifact filenames and remote result mapping
- id: fit_contract_builder_regional
label: FittedWeightsContractBuilder regional
node_type: library
description: Builds the regional fitted_weights stage contract from Stage 2 package identity, fit parameters, artifacts, and diagnostics
- id: create_model
label: Create SparseCalibrationWeights
node_type: process
@@ -1096,6 +1102,10 @@ stages:
label: unified_run_config.json
node_type: artifact
description: Hyperparameters + SHA256 checksums
- id: out_fit_contract_regional
label: fitted_weights_regional_contract.json
node_type: artifact
description: Scoped Stage 3 contract for regional fitted weights, geography, run config, and legacy diagnostics
- id: util_l0
label: l0-python
node_type: utility
@@ -1123,6 +1133,15 @@ stages:
- source: fit_artifacts_regional
target: out_config_s6
edge_type: documents
- source: fit_artifacts_regional
target: out_fit_contract_regional
edge_type: documents
- source: fit_model
target: fit_contract_builder_regional
edge_type: data_flow
- source: fit_contract_builder_regional
target: out_fit_contract_regional
edge_type: produces_artifact
- source: init_weights
target: create_model
edge_type: data_flow
@@ -1172,6 +1191,7 @@ stages:
node_ids:
- fit_spec_national
- fit_artifacts_national
- fit_contract_builder_national
- init_weights
- create_model_national
- fit_model
@@ -1180,6 +1200,7 @@ stages:
- out_national_geo_s6
- out_national_diag
- out_national_config_s6
- out_fit_contract_national
extra_nodes:
- id: in_pkg_national_s6
label: calibration_package.pkl
@@ -1197,6 +1218,10 @@ stages:
label: ScopedFitArtifacts national
node_type: library
description: National fitted-weight artifact filenames and remote result mapping
- id: fit_contract_builder_national
label: FittedWeightsContractBuilder national
node_type: library
description: Builds the national fitted_weights stage contract from Stage 2 package identity, fit parameters, artifacts, and diagnostics
- id: create_model_national
label: Create National SparseCalibrationWeights
node_type: process
@@ -1221,6 +1246,10 @@ stages:
label: national_unified_run_config.json
node_type: artifact
description: National hyperparameters + SHA256 checksums
- id: out_fit_contract_national
label: fitted_weights_national_contract.json
node_type: artifact
description: Scoped Stage 3 contract for national fitted weights, geography, run config, and legacy diagnostics
- id: util_l0_national
label: l0-python
node_type: utility
@@ -1248,6 +1277,15 @@ stages:
- source: fit_artifacts_national
target: out_national_config_s6
edge_type: documents
- source: fit_artifacts_national
target: out_fit_contract_national
edge_type: documents
- source: fit_model
target: fit_contract_builder_national
edge_type: data_flow
- source: fit_contract_builder_national
target: out_fit_contract_national
edge_type: produces_artifact
- source: init_weights
target: create_model_national
edge_type: data_flow
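
Because this change adds the same node ids in three places (`node_ids`, `extra_nodes`, and the edge list), a small consistency check can catch an edge that points at an undeclared node. The sketch below assumes a stage is available as a plain dict with `node_ids`, `extra_nodes`, and `edges` keys at one level; that flattened layout and the helper name `undeclared_edge_endpoints` are assumptions for illustration, since the hunks above do not show the enclosing keys.

```python
def undeclared_edge_endpoints(stage: dict) -> set:
    """Return edge endpoints that are not declared as nodes in the stage."""
    declared = set(stage.get("node_ids", []))
    declared |= {node["id"] for node in stage.get("extra_nodes", [])}
    endpoints = set()
    for edge in stage.get("edges", []):
        endpoints.update((edge["source"], edge["target"]))
    return endpoints - declared


# Ids and edge types are copied from the regional hunks in this diff;
# `out_fit_contract_regional` is deliberately left undeclared here.
stage = {
    "node_ids": ["fit_artifacts_regional", "fit_contract_builder_regional", "fit_model"],
    "extra_nodes": [],
    "edges": [
        {"source": "fit_artifacts_regional", "target": "out_fit_contract_regional",
         "edge_type": "documents"},
        {"source": "fit_model", "target": "fit_contract_builder_regional",
         "edge_type": "data_flow"},
        {"source": "fit_contract_builder_regional", "target": "out_fit_contract_regional",
         "edge_type": "produces_artifact"},
    ],
}

print(undeclared_edge_endpoints(stage))
# → {'out_fit_contract_regional'}
```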