feat(governance): add distributional drift detection (PSI, KS) #53

Open

Hopelynconsult wants to merge 2 commits into develop from feature/governance-drift-detector

Conversation

@Hopelynconsult (Collaborator)

Summary

  • Adds governance/drift_detector.py — distributional drift checks that compare recent prediction/input windows against a reference baseline.
  • Two non-parametric methods: Population Stability Index (PSI) and two-sample Kolmogorov-Smirnov.
  • Per-feature DriftResult rolled up into a DriftReport. Designed to plug into the prediction-history JSONL written by the anomaly detector (feat(governance): add anomaly detection for inference outputs #35).
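
For reviewers who want the shape without opening the diff, a minimal sketch of the two dataclasses. Field names here are illustrative, not necessarily the exact ones in the module:

```python
# Illustrative shape only; field names are assumptions, not the PR's exact API.
import json
from dataclasses import asdict, dataclass, field

@dataclass
class DriftResult:
    feature: str      # feature the test ran on
    method: str       # "psi" or "ks"
    statistic: float  # PSI value or KS D-statistic
    severity: str     # "stable" | "moderate" | "severe"

@dataclass
class DriftReport:
    results: list = field(default_factory=list)  # list of DriftResult

    def to_json(self) -> str:
        # Mirrors the CalibrationReport-style JSON serialisation.
        return json.dumps({"results": [asdict(r) for r in self.results]}, indent=2)
```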

Why this is distinct from the anomaly detector

The anomaly detector (#35) flags individual predictions whose features sit outside historical norms — point anomalies. A model can have zero point anomalies and still be silently drifting if the distribution of its predictions has shifted (e.g., post-monsoon Sentinel-2 statistics on a deforestation model). PSI/KS catch that distributional shift over a window.

What's in the PR

  • DriftResult (per-feature) and DriftReport (multi-feature) dataclasses with JSON serialisation, mirroring governance.calibration.CalibrationReport style.
  • population_stability_index() — bins on the reference's quantiles (canonical PSI), industry-standard severity bands (< 0.1 stable, 0.1–0.25 moderate, >= 0.25 severe). Constant-reference fallback to a single bin.
  • kolmogorov_smirnov() — supremum CDF gap with an asymptotic p-value computed from the standard Kolmogorov series. Avoids pulling scipy into evaluation-time deps. Both tests are sketched after this list.
  • detect_drift() — one-shot entrypoint that takes feature-name → values dicts for both windows and runs PSI or KS per feature.
  • write_drift_report() — persistence alongside model cards / calibration reports.
  • 13 unit tests: identical vs shifted distributions, both methods, per-feature severity isolation, constant-reference edge case, validation (non-finite, empty, feature mismatch, unknown method), JSON round-trip.
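
A condensed, hedged sketch of the two tests as described above. It is numpy-based, and the signatures, default bin count, and epsilon smoothing are assumptions; the real implementations live in governance/drift_detector.py:

```python
# Hedged sketch; signatures, bin count, and epsilon smoothing are assumptions.
import math
import numpy as np

def population_stability_index(reference, current, bins=10, eps=1e-6):
    ref = np.asarray(reference, dtype=float)
    cur = np.asarray(current, dtype=float)
    # Canonical PSI: bin edges come from the *reference* quantiles.
    edges = np.unique(np.quantile(ref, np.linspace(0.0, 1.0, bins + 1)))
    if edges.size < 2:                      # constant reference: single bin
        edges = np.array([0.0, 0.0])
    edges[0], edges[-1] = -np.inf, np.inf   # open-ended outer bins
    ref_frac = np.histogram(ref, edges)[0] / ref.size
    cur_frac = np.histogram(cur, edges)[0] / cur.size
    ref_frac = np.clip(ref_frac, eps, None)  # smooth empty bins
    cur_frac = np.clip(cur_frac, eps, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

def psi_severity(psi, stable=0.10, moderate=0.25):
    # Industry-standard bands: < 0.1 stable, 0.1-0.25 moderate, >= 0.25 severe.
    return "stable" if psi < stable else "moderate" if psi < moderate else "severe"

def kolmogorov_smirnov(reference, current, terms=100):
    ref = np.sort(np.asarray(reference, dtype=float))
    cur = np.sort(np.asarray(current, dtype=float))
    # Supremum gap between the two empirical CDFs, checked at every sample point.
    grid = np.concatenate([ref, cur])
    d = float(np.max(np.abs(
        np.searchsorted(ref, grid, side="right") / ref.size
        - np.searchsorted(cur, grid, side="right") / cur.size)))
    # Asymptotic p-value from the Kolmogorov series:
    #   Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * lam^2),
    # with lam = sqrt(n_eff) * d.
    n_eff = ref.size * cur.size / (ref.size + cur.size)
    lam = math.sqrt(n_eff) * d
    if lam < 0.05:  # series oscillates for tiny lam; p is ~1 there anyway
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, terms + 1))
    return d, float(min(max(p, 0.0), 1.0))
```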

Plugging into the existing pipeline

The anomaly detector already writes a JSONL of per-prediction features (mean_confidence, std_confidence, positive_fraction, entropy). Wiring PSI on those four features over a rolling window is a 30-line script — left as a follow-up so this PR stays focused.
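
A hypothetical version of that follow-up script, for concreteness only. The history path is the one named above; detect_drift()'s keyword arguments, the window sizes, and write_drift_report()'s signature are assumptions:

```python
# Hypothetical follow-up wiring, not part of this PR.
import json
from pathlib import Path

from governance.drift_detector import detect_drift, write_drift_report

FEATURES = ("mean_confidence", "std_confidence", "positive_fraction", "entropy")

def window(rows, start, stop):
    # Pivot a slice of the JSONL history into feature-name -> values.
    sliced = rows[start:stop]
    return {f: [r[f] for r in sliced] for f in FEATURES}

history = Path("outputs/anomalies/history.jsonl")  # written by the anomaly detector (#35)
rows = [json.loads(line) for line in history.read_text().splitlines() if line.strip()]

reference = window(rows, 0, 500)     # baseline window
current = window(rows, -500, None)   # most recent window
report = detect_drift(reference, current, method="psi")  # keyword name assumed
write_drift_report(report)           # exact signature assumed
```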

Follow-ups (out of scope here)

  • A scheduled CI job that reads the last N days of outputs/anomalies/history.jsonl, picks a baseline window, and emits a drift report.
  • A drift_score threshold added to scripts/governance_ci_gate.py so a release fails if any monitored feature is in the severe band.
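
Roughly what the second follow-up could look like; the report location and JSON field names are assumptions, not the module's actual output:

```python
# Hypothetical addition to scripts/governance_ci_gate.py (follow-up, not in
# this PR); the report path and field names are assumptions.
import json
import sys
from pathlib import Path

report = json.loads(Path("outputs/drift/drift_report.json").read_text())
severe = [r["feature"] for r in report["results"] if r["severity"] == "severe"]
if severe:
    sys.exit(f"governance gate: severe drift in {', '.join(severe)}")
print("governance gate: no severe drift")
```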

Test plan

  • pytest tests/test_drift_detector.py -q → 13 passed
  • Reviewer: confirm the severity bands (PSI_STABLE=0.10, PSI_MODERATE=0.25, KS significance 0.05) are the right defaults for our use case before we wire them into the CI gate.
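
For flavour, one illustrative test in the spirit of the shifted-distribution cases (not copied from the test file; field names assumed as in the sketches above):

```python
# Illustrative only, in the spirit of tests/test_drift_detector.py.
import random

from governance.drift_detector import detect_drift

def test_shifted_distribution_is_severe():
    rng = random.Random(0)
    reference = {"mean_confidence": [rng.gauss(0.8, 0.05) for _ in range(1000)]}
    current = {"mean_confidence": [rng.gauss(0.5, 0.05) for _ in range(1000)]}
    report = detect_drift(reference, current, method="psi")
    # Severity field name assumed, as in the dataclass sketch above.
    assert report.results[0].severity == "severe"
```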

🤖 Generated with Claude Code

Complement to the per-point anomaly detector (#35): the anomaly detector
flags individual predictions whose features fall outside historical
norms; this module compares the *distribution* of recent predictions (or
inputs) against a reference baseline and flags drift even when no single
prediction is anomalous.

Two non-parametric tests:
- Population Stability Index over reference quantile bins. PSI < 0.1
  stable, 0.1-0.25 moderate, >= 0.25 severe (industry-standard rule of
  thumb).
- Two-sample Kolmogorov-Smirnov, with the asymptotic p-value computed
  from the standard Kolmogorov series so we don't pull in scipy at
  evaluation time.

Both run per-feature; a DriftReport aggregates per-feature DriftResults
so callers (CI gate, monitoring dashboards) decide their own aggregation
policy. Designed to plug into the prediction-history JSONL emitted by
the anomaly detector so drift can run as a scheduled CI step over the
last N days of production predictions.

- DriftResult / DriftReport dataclasses with JSON serialisation
- detect_drift() one-shot entrypoint covering both methods
- write_drift_report() for persistence alongside model cards
- 13 tests covering identical/shifted distributions, both methods,
  per-feature severity, edge cases (constant reference, non-finite,
  empty windows), feature mismatch validation, and JSON round-trip