Add as = "Rle" return type to read_bigwig() (#18) #21
Merged
read_bigwig(..., as = "Rle") returns a per-base run-length-encoded vector spanning the queried range: an Rle for a single chromosome, or a named RleList for several. Uncovered bases are set to a `fill` value (default 0, the coverage convention; use NA to mark them missing). The conversion lives in R (as_rle/runs_to_rle): the C++ layer already emits run-length data via read_bigwig_cpp, and the Rle payload is just (values, lengths), so only compact runs cross the boundary. Building the S4 object stays in R next to the S4Vectors/IRanges classes that define it. Adds S4Vectors to Imports (Rle), imports RleList from IRanges, and local (non-network) tests covering windowed length, gap filling, NA fill, whole-chrom extent, and the multi-chrom RleList. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Adds `read_bigwig(..., as = "Rle")`, requested by @jmw86069 in #18, for per-base coverage queries (e.g. over exons) as a lighter-weight `rtracklayer` replacement.
- Returns a single `Rle` for one chromosome, or a named `RleList` for several.
- Length is `end - start` when both are supplied, otherwise the data extent of each chromosome. bigWig coordinates are 0-based half-open, so element `i` is genomic position `start + i - 1`.
- Uncovered bases are set to `fill` (default `0`, the coverage convention; pass `fill = NA` to mark them missing).

Implementation notes
The conversion lives in R (`as_rle()` / `runs_to_rle()` in `R/read.r`). The C++ layer (`read_bigwig_cpp`) already emits run-length data, and an `Rle`'s payload is just `(values, lengths)`, so only compact runs cross the C++/R boundary, and building the S4 object stays in R alongside the S4Vectors/IRanges classes that define it. No C++ changes.
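As a rough illustration of why only compact runs need to cross the boundary, the sketch below turns a handful of runs into a `(values, lengths)` payload with gap filling. It is not the package's `runs_to_rle()`: the helper name `runs_to_payload` and its arguments are hypothetical, and it assumes the runs arrive sorted, non-overlapping, and in 0-based half-open coordinates.

```r
# Hypothetical helper (not the package's runs_to_rle()): convert sorted,
# non-overlapping 0-based half-open runs into an Rle payload
# (values, lengths) covering [start, end), filling gaps with `fill`.
runs_to_payload <- function(run_start, run_end, run_value,
                            start, end, fill = 0) {
  values <- numeric(0)
  lengths <- integer(0)
  pos <- start  # next position not yet covered by the payload
  for (k in seq_along(run_start)) {
    s <- max(run_start[k], start)  # clip the run to the window
    e <- min(run_end[k], end)
    if (s >= e) next               # run lies entirely outside the window
    if (s > pos) {                 # uncovered gap before this run
      values <- c(values, fill)
      lengths <- c(lengths, s - pos)
    }
    values <- c(values, run_value[k])
    lengths <- c(lengths, e - s)
    pos <- e
  }
  if (pos < end) {                 # uncovered gap after the last run
    values <- c(values, fill)
    lengths <- c(lengths, end - pos)
  }
  list(values = values, lengths = as.integer(lengths))
}

# Two runs inside the window [100, 125), with a gap at [105, 110)
# and a trailing gap at [120, 125).
payload <- runs_to_payload(
  run_start = c(100, 110), run_end = c(105, 120), run_value = c(2.5, 1),
  start = 100, end = 125
)

# Base R can expand the payload to per-base values for inspection.
per_base <- inverse.rle(structure(payload, class = "rle"))

# With S4Vectors installed, the same payload would build the S4 object:
# S4Vectors::Rle(payload$values, payload$lengths)
```

The payload stays proportional to the number of runs rather than the number of bases, which is the point of doing the S4 construction on the R side.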
Adds `S4Vectors` to `Imports` (for `Rle`) and imports `RleList` from `IRanges`.

Tests
New local (non-network) tests cover windowed length, interior gap filling, `fill = NA`, whole-chromosome extent, and the multi-chromosome `RleList`. Full suite passes locally (68 tests). Leaving the full `R CMD check` to CI.

Note
Builds on the remote Range-request fix (#18, already on `main`): per-exon remote queries would crash without it.
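For the per-exon use case, the coordinate convention from the Summary (element `i` is genomic position `start + i - 1`) comes down to simple index arithmetic. The sketch below uses a plain numeric vector as a stand-in for the per-base result; the window, the coverage values, and the helper names `pos_of` and `exon_slice` are all made up for illustration.

```r
# Hypothetical per-base vector for the window [start, end) = [1000, 1010),
# standing in for an as = "Rle" result expanded to plain numeric values.
start <- 1000
end <- 1010
cov <- c(0, 0, 3, 3, 3, 5, 5, 0, 0, 1)

# Element i corresponds to 0-based genomic position start + i - 1.
pos_of <- function(i) start + i - 1

# A 0-based half-open exon [s, e) maps to elements (s - start + 1):(e - start).
exon_slice <- function(s, e) cov[(s - start + 1):(e - start)]

# Coverage over the exon [1002, 1007), i.e. positions 1002 through 1006.
exon_cov <- exon_slice(1002, 1007)
```

The slice has `e - s` elements, matching the half-open width of the exon.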