Add as = "Rle" return type to read_bigwig() (#18) #21

Merged: jayhesselberth merged 1 commit into main from rle-return-type on Jun 17, 2026

@jayhesselberth

Member

Summary

Adds read_bigwig(..., as = "Rle"), requested by @jmw86069 in #18, for
per-base coverage queries (e.g. over exons) as a lighter-weight
rtracklayer replacement.

  • Returns a per-base run-length-encoded vector spanning the queried range:
    a single Rle for one chromosome, or a named RleList for several.
  • Expanded length equals end - start when both are supplied, otherwise the
    data extent of each chromosome. bigWig coordinates are 0-based and
    half-open, so element i (1-based R index) is 0-based genomic position
    start + i - 1.
  • Uncovered bases are set to fill (default 0, the coverage convention;
    pass fill = NA to mark them missing).
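
The index arithmetic in the bullets above can be checked with a few lines of base R. This is only an illustrative sketch: the window values are made up, and `index_of()` is a hypothetical helper, not a function in this package.

```r
# Illustrative window in bigWig coordinates: 0-based, half-open [start, end).
start <- 100L
end   <- 110L

# Expanded length of the per-base vector for this window.
n <- end - start

# 0-based genomic position covered by each 1-based element i.
pos0 <- start + seq_len(n) - 1L

# Hypothetical helper: 1-based element index for a 0-based genomic position.
index_of <- function(pos, start) pos - start + 1L

first_pos <- pos0[1]
last_pos  <- pos0[n]
idx_105   <- index_of(105L, start)
```

The last element sits at position end - 1, since the window excludes end itself.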

Implementation notes

The conversion lives in R (as_rle() / runs_to_rle() in R/read.r). The
C++ layer (read_bigwig_cpp) already emits run-length data, and an Rle's
payload is just (values, lengths), so only compact runs cross the
C++/R boundary — building the S4 object stays in R alongside the
S4Vectors/IRanges classes that define it. No C++ changes.
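
To illustrate why only (values, lengths) needs to cross the boundary, here is a minimal base R sketch of gap filling over a window. It is not the code in R/read.r: `fill_runs()` and its arguments are hypothetical, and it assumes the input runs are sorted, non-overlapping, and already clipped to the window.

```r
# Sketch only: fill_runs() is a hypothetical name, not the package's runs_to_rle().
# Runs are 0-based half-open [run_start, run_end), sorted and non-overlapping,
# and assumed to lie within the window [start, end).
fill_runs <- function(run_start, run_end, value, start, end, fill = 0) {
  values  <- numeric(0)
  lengths <- integer(0)
  cursor  <- start
  for (k in seq_along(run_start)) {
    if (run_start[k] > cursor) {
      # Uncovered stretch before this run: emit one run of `fill`.
      values  <- c(values, fill)
      lengths <- c(lengths, run_start[k] - cursor)
    }
    values  <- c(values, value[k])
    lengths <- c(lengths, run_end[k] - run_start[k])
    cursor  <- run_end[k]
  }
  if (cursor < end) {
    # Uncovered stretch after the last run.
    values  <- c(values, fill)
    lengths <- c(lengths, end - cursor)
  }
  list(values = values, lengths = as.integer(lengths))
}

# Two covered runs inside the window [100, 110), with gaps around them.
r <- fill_runs(
  run_start = c(102L, 107L),
  run_end   = c(105L, 109L),
  value     = c(1.5, 2),
  start = 100L, end = 110L
)
per_base <- rep(r$values, r$lengths)

# Same runs with gaps marked missing instead of zero.
r_na <- fill_runs(c(102L, 107L), c(105L, 109L), c(1.5, 2), 100L, 110L, fill = NA)

# If S4Vectors is installed, the compact pair maps directly onto an Rle.
if (requireNamespace("S4Vectors", quietly = TRUE)) {
  x <- S4Vectors::Rle(r$values, r$lengths)
}
```

The run lengths always sum to end - start, so the expanded vector spans the queried window whatever the coverage.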

Adds S4Vectors to Imports (for Rle) and imports RleList from
IRanges.

Tests

New local (non-network) tests cover windowed length, interior gap filling,
fill = NA, whole-chromosome extent, and the multi-chromosome RleList.
Full suite passes locally (68 tests). Leaving the full R CMD check to CI.

Note

Builds on the remote Range-request fix (#18, already on main) — per-exon
remote queries would crash without it.

read_bigwig(..., as = "Rle") returns a per-base run-length-encoded
vector spanning the queried range: an Rle for a single chromosome, or
a named RleList for several. Uncovered bases are set to a `fill` value
(default 0, the coverage convention; use NA to mark them missing).

The conversion lives in R (as_rle/runs_to_rle): the C++ layer already
emits run-length data via read_bigwig_cpp, and the Rle payload is just
(values, lengths), so only compact runs cross the boundary. Building
the S4 object stays in R next to the S4Vectors/IRanges classes that
define it.

Adds S4Vectors to Imports (Rle), imports RleList from IRanges, and
local (non-network) tests covering windowed length, gap filling, NA
fill, whole-chrom extent, and the multi-chrom RleList.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jayhesselberth jayhesselberth merged commit b9394a1 into main Jun 17, 2026
7 checks passed
@jayhesselberth jayhesselberth deleted the rle-return-type branch June 17, 2026 12:57