PEtab v2 problem importer — the 'two-adapter' proof (first step: parameters table → FreeParameter/Prior) #407

@wshlavacek

Motivation

The M2 modularization gave PyBNF first-class, registry-backed Prior (ADR-0010, pybnf/priors/) and NoiseModel (ADR-0011, pybnf/noise/) abstractions, deliberately PEtab-defaulted but not PEtab-bound (ADR-0004). The payoff that justifies that shape is a PEtab v2 problem importer: a thin adapter that reads a problem.yaml + its TSV tables + SBML model and produces the same internal objects a native .conf produces.

That makes it the "two-adapter" proof the refactor plan calls out — native .conf and a PEtab problem feeding one set of FreeParameter/Prior/NoiseModel/exp-data objects. If both adapters land on the same objects, the abstractions are right; if PEtab forces a special case, we learn where they're wrong.

This is an umbrella/tracking issue. It scopes the whole importer and pins down a concrete, self-contained first step that can be split into its own issue when work begins.

What a PEtab v2 problem is

  • problem.yaml: references the model file(s) and the TSV tables
  • model: SBML (PyBNF already imports SBML/Antimony: SbmlModel in pset.py:1027, BngsimAntimony)
  • parameters.tsv: parameterId, parameterScale (lin|log|log10), lowerBound, upperBound, nominalValue, estimate (0|1), objectivePriorType, objectivePriorParameters
  • observables.tsv: observableId, observableFormula, noiseFormula, observableTransformation, noiseDistribution
  • measurements.tsv: observableId, simulationConditionId, measurement, time, …
  • conditions.tsv: per-condition parameter/species overrides
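To make the table layout concrete, here is a minimal sketch that reads an in-memory parameters.tsv with the standard library only. The column names are the ones listed above; the sample rows and the `read_parameters_tsv` helper are invented for illustration and are not part of PyBNF or the petab package.

```python
import csv
import io

# Hypothetical two-row table using the columns named in the list above.
SAMPLE_TSV = (
    "parameterId\tparameterScale\tlowerBound\tupperBound\tnominalValue\testimate\n"
    "k_on\tlog10\t1e-3\t1e3\t1.0\t1\n"
    "k_off\tlin\t0\t10\t0.5\t0\n"
)


def read_parameters_tsv(text):
    """Return one dict per row, keyed by column name (all values are strings)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


rows = read_parameters_tsv(SAMPLE_TSV)
# estimate is 0|1 in the table; only rows with "1" are free parameters.
estimated = [r["parameterId"] for r in rows if r["estimate"] == "1"]
print(estimated)
```

Every value arrives as a string, so numeric conversion and validation would belong to whatever layer builds the internal objects.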

Mapping to PyBNF's existing abstractions

| PEtab concept | PyBNF target | Status |
|---|---|---|
| parameterScale lin / log10 | Scale Linear / Log10 (pybnf/priors/scale.py) | ✅ exists |
| parameterScale = log (natural log) | | ⚠️ gap: PyBNF has Log10 only |
| objectivePriorType uniform / normal / laplace | Uniform / Normal / Laplace family (pybnf/priors/) | ✅ exists |
| parameterScale{Uniform,Normal,Laplace} (prior in scaled space) | prior-in-own-scale, no Jacobian (ADR-0003) | ✅ exact match |
| estimate = 0 (fixed) | NoPrior / fixed value | ✅ exists |
| lowerBound / upperBound | reflecting bounds on FreeParameter | ✅ exists |
| noiseDistribution normal / laplace | Gaussian / (Laplace exists only as a prior today) | ⚠️ partial |
| observableTransformation lin / log / log10 | NoiseModel scale-additive-on axis (ADR-0011) | ✅ partial |
| location = median (PEtab v2 hardcodes) | Location Interpretation axis (CONTEXT.md) | ✅ exists |
| observableFormula / noiseFormula (sympy math over model entities) | | ⚠️ biggest chunk: needs a formula layer |
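The "prior in scaled space, no Jacobian" row can be illustrated with a small numeric sketch. Assuming a log10-scaled parameter with a normal prior placed on log10(theta), the log prior is evaluated directly at the scaled value. The second function shows the change-of-variables term that would appear only if the density were expressed with respect to theta instead. The function names are illustrative and not PyBNF API; ADR-0003 remains the authority on what PyBNF actually does.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log density of Normal(mean, sd) at x."""
    return -0.5 * math.log(2.0 * math.pi * sd**2) - (x - mean) ** 2 / (2.0 * sd**2)


def log_prior_in_scaled_space(theta, mean, sd):
    """Prior lives on log10(theta): evaluate there, with no Jacobian term."""
    return normal_logpdf(math.log10(theta), mean, sd)


def log_prior_wrt_theta(theta, mean, sd):
    """Same prior as a density over theta, for contrast only.

    Adds log|d log10(theta) / d theta| = -log(theta * ln 10).
    """
    return log_prior_in_scaled_space(theta, mean, sd) - math.log(theta * math.log(10.0))


theta = 10.0
scaled = log_prior_in_scaled_space(theta, mean=1.0, sd=1.0)
contrast = log_prior_wrt_theta(theta, mean=1.0, sd=1.0)
print(scaled, contrast)
```

The two values differ by exactly the Jacobian term, which is the quantity the "no Jacobian" convention leaves out.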

Proposed first step (concrete, self-contained, CI-testable)

parameters.tsv → FreeParameter + Prior. This is the cleanest 1:1 with the fresh M2.3 abstraction, needs no simulator, and is pure/oracle-testable:

  1. A pybnf/petab/parameters.py reader: parameters.tsv row → (prior family, scale, bounds, nominal, estimate) → a FreeParameter carrying a Prior, reusing PRIOR_KEYWORD_MAP/the prior registry — not a parallel mapping table.
  2. Round-trip / equivalence tests: a PEtab parameters table and the equivalent native *_var lines produce bit-identical FreeParameter/Prior objects (the two-adapter contract, scoped to parameters).
  3. Surface the gaps the table above flags as explicit NotImplementedErrors with clear messages (natural-log scale; PEtab prior types we lack), so the boundary is documented in code, not silent.
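The three steps above could be sketched roughly as follows. Everything here is a stand-in: `ParameterSpec` plays the role of a FreeParameter carrying a Prior, and the two dicts play the role of the prior registry (the real reader would reuse PRIOR_KEYWORD_MAP rather than define its own table). The semicolon separator for objectivePriorParameters and the uniform default for an empty objectivePriorType are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Stand-ins for the registry lookups (assumption, not PyBNF's actual tables).
KNOWN_SCALES = {"lin": "Linear", "log10": "Log10"}
KNOWN_PRIORS = {"uniform": "Uniform", "normal": "Normal", "laplace": "Laplace"}


@dataclass(frozen=True)
class ParameterSpec:
    """Hypothetical value object standing in for FreeParameter + Prior."""
    name: str
    scale: str
    lower: float
    upper: float
    nominal: float
    estimate: bool
    prior_family: Optional[str]
    prior_args: Tuple[float, ...]


def row_to_spec(row):
    """Map one parameters.tsv row (dict of strings) to a ParameterSpec."""
    scale_key = row.get("parameterScale") or "lin"
    if scale_key == "log":
        # Step 3: make the known gap loud instead of silently mis-mapping it.
        raise NotImplementedError(
            "parameterScale 'log' (natural log) is not supported: PyBNF has Log10 only"
        )
    if scale_key not in KNOWN_SCALES:
        raise ValueError(f"unknown parameterScale {scale_key!r}")

    estimate = row["estimate"] == "1"
    prior_family, prior_args = None, ()  # estimate = 0: fixed value, no prior
    if estimate:
        prior_key = row.get("objectivePriorType") or "uniform"
        if prior_key not in KNOWN_PRIORS:
            raise NotImplementedError(
                f"PEtab prior type {prior_key!r} has no counterpart in this sketch"
            )
        prior_family = KNOWN_PRIORS[prior_key]
        raw = row.get("objectivePriorParameters") or ""
        prior_args = tuple(float(p) for p in raw.split(";") if p)

    return ParameterSpec(
        name=row["parameterId"],
        scale=KNOWN_SCALES[scale_key],
        lower=float(row["lowerBound"]),
        upper=float(row["upperBound"]),
        nominal=float(row["nominalValue"]),
        estimate=estimate,
        prior_family=prior_family,
        prior_args=prior_args,
    )


petab_row = {
    "parameterId": "k_on", "parameterScale": "log10",
    "lowerBound": "1e-3", "upperBound": "1e3", "nominalValue": "1.0",
    "estimate": "1", "objectivePriorType": "normal",
    "objectivePriorParameters": "0;2",
}
# What the native adapter would be expected to build for the same parameter.
native_equivalent = ParameterSpec("k_on", "Log10", 1e-3, 1e3, 1.0, True, "Normal", (0.0, 2.0))
spec = row_to_spec(petab_row)
```

With a frozen dataclass, the step 2 equivalence check reduces to `spec == native_equivalent`; the real test would compare actual FreeParameter/Prior objects built by both adapters.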

This proves the prior catalog maps to PEtab's prior catalog and turns the "are the abstractions right?" question into a passing test — without committing to the formula parser or the full YAML wiring yet.

Subsequent chunks (rough order, each its own issue when reached)

  • Step 1 — parameters → Prior/FreeParameter (the first step above)
  • observables.tsv → NoiseModel selection + the noiseDistribution/transformation → (family, scale-additive-on, location) mapping (ADR-0011)
  • measurements.tsv + conditions.tsv → PyBNF exp-data + per-condition model overrides
  • observableFormula / noiseFormula expression layer (sympy over model entities) — the largest piece
  • problem.yaml top-level wiring + SBML model load → a complete Configuration
  • End-to-end: import a small published PEtab benchmark problem and fit it
  • Catalog parity follow-ups for the ⚠️ gaps (natural-log Scale; Laplace noise family; any missing PEtab prior types)
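For the observables chunk, the noiseDistribution/transformation → (family, scale-additive-on, location) mapping might look something like the sketch below. `NoiseSelection` and `select_noise` are hypothetical names, the string values are placeholders for whatever the NoiseModel registry uses, and the Laplace branch simply mirrors the gap listed in the last bullet.

```python
from typing import NamedTuple


class NoiseSelection(NamedTuple):
    """Hypothetical triple describing which NoiseModel to build."""
    family: str
    scale_additive_on: str
    location: str


def select_noise(noise_distribution="normal", observable_transformation="lin"):
    """Map two observables.tsv cells to a noise selection (sketch only)."""
    if observable_transformation not in ("lin", "log", "log10"):
        raise ValueError(f"unknown observableTransformation {observable_transformation!r}")
    if noise_distribution == "laplace":
        # Laplace exists only as a prior today, so surface the gap explicitly.
        raise NotImplementedError("Laplace noise family is not available yet")
    if noise_distribution != "normal":
        raise ValueError(f"unknown noiseDistribution {noise_distribution!r}")
    # Location is fixed to the median, as the issue says PEtab v2 hardcodes it.
    return NoiseSelection("Gaussian", observable_transformation, "median")


selection = select_noise("normal", "log10")
print(selection)
```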

Notes / constraints

  • New runtime deps (e.g. petab, python-libsbml, sympy if not already pulled in) must be hand-mirrored into .github/actions/setup-pybnf or the tests/integration CI tiers go red (the recurring single-sync-point gotcha).
  • Keep the importer simulator-free where possible so it runs in the bngsim-less CI tier.
  • Out-of-scope framing comes from dev/refactor-plan.md ("Out of scope (future) → PEtab-problem importer"). Relevant ADRs: 0003 (no Jacobian), 0004 (PEtab-defaulted not -bound), 0010 (Prior), 0011 (NoiseModel).
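One way to keep importer code usable in the simulator-less tier is to probe for the simulator package instead of importing it at module load. The sketch below uses only the standard library; the package name `bngsim` is assumed from the tier name above, and `require_simulator` is a hypothetical helper.

```python
import importlib.util

# find_spec returns None when a top-level package is not installed,
# so this check never triggers the import itself.
HAVE_BNGSIM = importlib.util.find_spec("bngsim") is not None


def require_simulator():
    """Raise a clear error only at the point where a simulator is really needed."""
    if not HAVE_BNGSIM:
        raise RuntimeError("this step needs bngsim, which is not installed")


print("bngsim available:", HAVE_BNGSIM)
```

Pure table-parsing code would never call `require_simulator`, so it can be imported and tested in either tier.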

🤖 Generated with Claude Code
