governance/drift_detector: clarify PSI drifted threshold semantics + minor cleanups #55

@Goldokpa

Follow-up from review of #53.

Main concern — PSI drifted flag fires on "moderate" drift.

In detect_drift (PSI branch):

drifted=psi >= PSI_STABLE,  # PSI_STABLE = 0.10

Combined with _psi_severity, a PSI of, e.g., 0.12 is marked drifted=True with severity="moderate". Since report.any_drifted is what consumers will gate on (CI, alerts, etc.), "moderate" drift triggers alerts, which conflicts with the docstring's canonical guidance that >0.25 is the significant-drift threshold.

The existing tests pass either way because they only assert severity in {"moderate", "severe"} and any_drifted is True.

Proposal:

  • Expose a configurable threshold parameter on detect_drift, e.g. psi_drift_threshold: float = PSI_MODERATE.
  • Compute drifted as psi >= psi_drift_threshold, so that with the PSI_MODERATE default any_drifted only fires on significant drift, while severity is still reported for the moderate range.
  • Update / add tests to lock in both branches (moderate-but-not-drifted, severe-and-drifted).
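
A minimal sketch of the proposal, not the module's actual code. PSI_STABLE = 0.10 is quoted above; PSI_MODERATE = 0.25 is inferred from the docstring guidance, and PsiResult, classify_psi, and the "none" severity label are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

PSI_STABLE = 0.10    # quoted in the issue
PSI_MODERATE = 0.25  # assumed: upper bound of the "moderate" band


@dataclass
class PsiResult:
    """Hypothetical result container, not the module's real type."""
    psi: float
    severity: str
    drifted: bool


def _psi_severity(psi: float) -> str:
    """Map a PSI value to a severity band (band edges are assumptions)."""
    if psi < PSI_STABLE:
        return "none"
    if psi < PSI_MODERATE:
        return "moderate"
    return "severe"


def classify_psi(psi: float, psi_drift_threshold: float = PSI_MODERATE) -> PsiResult:
    """Report severity for every band, but flag drift only at the threshold."""
    return PsiResult(
        psi=psi,
        severity=_psi_severity(psi),
        drifted=psi >= psi_drift_threshold,
    )


# The two branches the new tests would lock in.
moderate = classify_psi(0.12)
severe = classify_psi(0.30)
# Callers who want the current behavior can opt back in explicitly.
legacy = classify_psi(0.12, psi_drift_threshold=PSI_STABLE)
```

With these assumed constants, a PSI of 0.12 keeps severity="moderate" but no longer sets drifted, while passing psi_drift_threshold=PSI_STABLE restores today's alerting behavior for consumers that depend on it.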

Smaller items (can fold into the same PR or split out):

  1. KS p-value with ties. The asymptotic Kolmogorov formula assumes continuous distributions; with heavy ties (e.g. constant-vs-constant or coarse discrete data) p-values can be biased. Add a docstring note or short comment in kolmogorov_smirnov calling this out.
  2. Magic constant in KS severity. severe = p_value < (ks_significance / 5) — promote 5 (or the resulting factor) to a named constant like KS_SEVERE_FACTOR = 5 with a brief justification, or expose it as a parameter.
  3. Double-sort micro-opt in KS. kolmogorov_smirnov sorts ref and cur twice each (once via np.searchsorted(np.sort(ref), ...), once implicitly via combined). Hoist ref_sorted = np.sort(ref) / cur_sorted = np.sort(cur) once. Cosmetic only.
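
The three smaller items could look roughly like this. It is a sketch under assumptions, not the existing implementation: the function body, the ks_severity helper, the "none"/"moderate" labels, and the small-lambda cutoff are illustrative; only the names kolmogorov_smirnov, ks_significance, and KS_SEVERE_FACTOR come from the items above.

```python
import math

import numpy as np

# Named factor instead of a bare 5: "severe" means p < ks_significance / 5.
KS_SEVERE_FACTOR = 5


def kolmogorov_smirnov(ref, cur):
    """Two-sample KS statistic with an asymptotic p-value.

    Note: the asymptotic Kolmogorov formula assumes continuous
    distributions. With heavy ties (constant or coarse discrete data)
    the p-value can be biased.
    """
    # Sort each sample exactly once and reuse the sorted arrays.
    ref_sorted = np.sort(np.asarray(ref, dtype=float))
    cur_sorted = np.sort(np.asarray(cur, dtype=float))
    n, m = ref_sorted.size, cur_sorted.size

    # Evaluate both empirical CDFs at every observed point.
    combined = np.concatenate([ref_sorted, cur_sorted])
    cdf_ref = np.searchsorted(ref_sorted, combined, side="right") / n
    cdf_cur = np.searchsorted(cur_sorted, combined, side="right") / m
    d = float(np.max(np.abs(cdf_ref - cdf_cur)))

    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The truncated series is unreliable here; the true p-value is ~1.
        return d, 1.0
    # Kolmogorov series: 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2)
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2) for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)


def ks_severity(p_value, ks_significance=0.05):
    """Hypothetical helper showing where KS_SEVERE_FACTOR is applied."""
    if p_value >= ks_significance:
        return "none"
    if p_value < ks_significance / KS_SEVERE_FACTOR:
        return "severe"
    return "moderate"


same_d, same_p = kolmogorov_smirnov(np.arange(100), np.arange(100))
shift_d, shift_p = kolmogorov_smirnov(np.arange(100), np.arange(100) + 100)
```

Hoisting the sorts does not change the result; it only avoids repeated work, which matches the "cosmetic only" framing of item 3.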
