Fix tascCODA returning no credible effects (theta collapse) by Zethson · Pull Request #1017 · scverse/pertpy

Zethson · 2026-06-10T13:57:07Z

What

Fixes #1015 — tascCODA returned zero credible effects on data where it should recover them.

Root cause

The global spike-and-slab mixing weight theta ~ Beta(1, d) collapses to its prior (median ≈ 0.01) under numpyro NUTS. Because the credibility threshold is

delta = 1/(l0 - l1) * log(1/p_t - 1),   p_t = (theta*l1/2) / (theta*l1/2 + (1-theta)*l0/2)

a near-zero theta is a double failure: it shrinks b_tilde = (1-theta)*spike + theta*slab toward the spike (≈0) and sends delta → ∞. Nothing can clear the threshold.
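To make the threshold behaviour concrete, here is a minimal sketch of the two formulas above in plain Python. The penalty values `l0 = 50` and `l1 = 5` are assumptions, not taken from this PR; they were chosen because with them `theta = 0.34` gives a delta of roughly 0.066, consistent with the upstream figure quoted further down. Since `1/p_t - 1` simplifies to `(1 - theta) * l0 / (theta * l1)`, delta grows without bound as theta approaches 0, though only logarithmically.

```python
import math


def p_t(theta, l0, l1):
    """Mixing probability from the PR description."""
    slab = theta * l1 / 2          # slab density weight at zero
    spike = (1 - theta) * l0 / 2   # spike density weight at zero
    return slab / (slab + spike)


def delta(theta, l0=50.0, l1=5.0):
    """Credibility threshold; l0 and l1 defaults are assumed, not from the PR."""
    return math.log(1 / p_t(theta, l0, l1) - 1) / (l0 - l1)


if __name__ == "__main__":
    # Smaller theta means a larger threshold that effects must clear.
    for theta in (0.5, 0.34, 0.01, 1e-6):
        print(f"theta={theta:g}  delta={delta(theta):.4f}")
```

Under these assumed penalties the threshold rises monotonically as theta shrinks, while the effects themselves are pulled toward the spike, which is the double failure described above.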

This is the model's true marginal posterior, not a numpyro bug. A single global theta gates a high-dimensional slab, so the low-theta funnel mouth (slab unconstrained) carries almost all the marginal volume — confirmed by Betancourt's incomplete-reparameterization result: no practitioner-applicable reparameterization changes theta's marginal, and the slab is already non-centered.
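For a sense of scale, the prior median that theta falls back to can be computed in closed form: the CDF of Beta(1, d) is `1 - (1 - x)**d`, so the median is `1 - 0.5**(1/d)`. The sketch below evaluates it for a few values of `d`. The values 74 and 148 are assumptions that reuse the effect counts from the validation table as stand-ins for `d = D*(T - n_ref)`; the actual `d` used by the model may differ.

```python
def beta_1_d_median(d):
    """Median of Beta(1, d), solved from the CDF 1 - (1 - x)**d = 0.5."""
    return 1 - 0.5 ** (1 / d)


if __name__ == "__main__":
    # d = 74 and d = 148 are illustrative stand-ins, not values read from the code.
    for d in (1, 10, 74, 148):
        print(f"d={d:>3}  prior median={beta_1_d_median(d):.4f}")
```

With `d` in the range of tens to hundreds the prior median is on the order of 0.01, which agrees with the collapsed value reported above.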

Why upstream "works": the reference TFP implementation samples with a fixed identity mass matrix and a fixed 10-leapfrog trajectory started at theta = 0.5, so it physically cannot traverse to the funnel and stays pinned near init (back-solving its reported Delta = 0.066 gives theta ≈ 0.34). numpyro NUTS (adaptive mass + dynamic trajectories) mixes well enough to find the genuinely-collapsed posterior. In other words, the published results were computed with theta effectively fixed, never inferred — and the canonical spike-and-slab LASSO (Ročková–George) is itself estimated by MAP/EM precisely because full-Bayes MCMC on this posterior is multimodal/prohibitive.

Fix

Hold theta fixed via numpyro.deterministic at pen_args["theta"] (default 0.5, the reference's operating point; settable, e.g. 0.34 to match upstream's exact delta).

This is the minimal change that reproduces the paper's effective behavior. Unchanged: samples["theta"] (still present, as a constant), the delta credibility rule, the summary layout, arviz dims/coords, and param_names. The now-dead d = D*(T - n_ref) and the stale theta init entry are removed.

Validation (tutorial data, `formula="Health"`, `phi=0`, automatic reference)

| scenario | θ median | credible | recovered |
| --- | --- | --- | --- |
| before (sampled θ) | 0.01 | 0 / 74 | — |
| full 3-group, θ=0.5 | 0.500 | 3 / 148 | Immune, B cells, TA cells |
| 2-group, θ=0.5 | 0.500 | 1 / 74 | TA cells |
| 2-group, θ=0.34 (`pen_args`) | 0.340 | 1 / 74 | TA cells |

Stable across seeds {0, 7, 42, 123} and across θ∈[0.34, 0.5]; delta is finite. The recovered set matches the effects reported in the issue.

Trade-off / faithfulness

theta is no longer a random variable. That is a deliberate, documented deviation from the generative model — but it is faithful to the published results, which depend on theta staying near 0.5. Keeping theta sampled and instead crippling the numpyro sampler to mimic upstream's non-convergence would be fragile and init-dependent; a regularized-horseshoe redesign would be robust but is a different method (no theta/delta).

tascCODA returned zero credible effects because the global spike-and-slab mixing weight theta collapses to its Beta(1, d) prior (~0.01) under numpyro NUTS, which sends the selection threshold delta to infinity and zeroes out every node effect (issue #1015).

This is the model's true marginal posterior, not a sampler bug: a single global theta gates a high-dimensional slab, so the low-theta funnel mouth carries almost all the marginal volume. The reference TFP implementation only avoids the collapse because its fixed identity-mass, short-trajectory HMC stays pinned near the theta=0.5 init, i.e. the published results were computed with theta effectively fixed, never inferred.

Hold theta fixed via numpyro.deterministic at pen_args["theta"] (default 0.5, the reference's operating point). samples["theta"], the delta credibility rule, arviz dims and param_names are all unchanged. On the tutorial data this recovers the expected credible effects (Immune, B cells, TA cells) and is stable across seeds and across theta in [0.34, 0.5].

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

codecov-commenter · 2026-06-10T14:11:26Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.90%. Comparing base (5fa8ed7) to head (b705184).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1017      +/-   ##
==========================================
+ Coverage   77.81%   77.90%   +0.09%     
==========================================
  Files          50       50              
  Lines        6580     6581       +1     
==========================================
+ Hits         5120     5127       +7     
+ Misses       1460     1454       -6

Files with missing lines	Coverage Δ
pertpy/tools/_coda/_tasccoda.py	`77.45% <100.00%> (+0.13%)`	⬆️

... and 1 file with indirect coverage changes


github-actions Bot added the bug Something isn't working label Jun 10, 2026

Zethson added a commit that referenced this pull request Jun 10, 2026

docs: changelog entry for tascCODA theta-collapse fix (#1017)

26595dd

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

github-actions Bot added the chore label Jun 10, 2026

Zethson force-pushed the fix/tasccoda-theta-collapse-1015 branch from 26595dd to b705184 on June 10, 2026, 13:59

Zethson wants to merge 1 commit into main from fix/tasccoda-theta-collapse-1015
