ATSEmapper takes per-sample or per-cell splice junction files (from regtools) and identifies alternative splicing events (ATSEs) by building splice graphs across all samples and annotating junctions against a reference genome. Designed for single-cell and bulk RNA-seq data.
ATSEmapper contains two modules:
- ATSEmapper — maps ATSEs from splice junction files; outputs a compressed TSV of splicing events with PSI-based usage scores
- ATSEviz — Python plotting library for visualizing splicing events and isoforms in Jupyter notebooks
ATSEmapper efficiently identifies alternative splicing events from RNA-seq splice junction data generated by regtools.
- Batch processing of multiple splice junction files
- Integration with genome annotations (GTF/GFF3 via gffutils)
- Canonical splice site verification against genome sequence
- Junction annotation and filtering
- PSI-based splice site usage quantification
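
To make the canonical splice site check listed above concrete, here is a minimal sketch that tests whether an intron is flanked by one of the canonical dinucleotide pairs (GT-AG, GC-AG, AT-AC). It is an illustration, not ATSEmapper's actual code: the function names, the 0-based half-open coordinate convention, and the plain-string sequence are all assumptions (in practice the sequence would presumably be fetched with pyfaidx).

```python
# Hypothetical sketch of a canonical splice site check; not ATSEmapper's implementation.

# Canonical (donor, acceptor) dinucleotides, read on the transcribed strand.
CANONICAL_PAIRS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_canonical(chrom_seq: str, intron_start: int, intron_end: int, strand: str) -> bool:
    """Check the two bases at each end of an intron.

    intron_start / intron_end are assumed to be 0-based, half-open
    coordinates of the intron itself (no exon anchor bases).
    """
    left = chrom_seq[intron_start:intron_start + 2].upper()
    right = chrom_seq[intron_end - 2:intron_end].upper()
    if strand == "-":
        # On the minus strand the donor sits at the genomic right end.
        donor, acceptor = reverse_complement(right), reverse_complement(left)
    else:
        donor, acceptor = left, right
    return (donor, acceptor) in CANONICAL_PAIRS


if __name__ == "__main__":
    plus_seq = "AAAGTCCCCAGTTT"   # intron GT...AG at positions 3..11
    print(is_canonical(plus_seq, 3, 11, "+"))
```

The minus-strand branch matters because a GT-AG intron on the reverse strand appears as CT...AC in the forward genome sequence.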
ATSEmapper processes splice junction files generated by regtools:
regtools junctions extract -s 0 sample.bam -o sample.junctions.bed
- Each file corresponds to one sample (bulk RNA-seq or single-cell)
- Optimized for bulk RNA-seq and plate-based single-cell data
- Input can be a directory of .bed/.junc files or a .txt file listing paths (one per line)
- 10X Genomics (multi-cell BAM) support is under development
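
When junction files are spread over a directory and you prefer the .txt listing form of input, a small helper can write the list. This is a sketch under assumptions: the helper name and the output filename are made up for illustration, and only the one-path-per-line layout comes from the description above.

```python
from pathlib import Path


def write_junction_list(junction_dir, list_path, suffixes=(".bed", ".junc")):
    """Write the absolute path of every junction file in junction_dir
    to list_path, one per line, and return the paths that were written."""
    files = sorted(
        p.resolve()
        for p in Path(junction_dir).iterdir()
        if p.is_file() and p.suffix in suffixes
    )
    Path(list_path).write_text("".join(f"{p}\n" for p in files))
    return files


if __name__ == "__main__":
    # Hypothetical locations; adjust to your own layout.
    junction_dir = Path("path/to/junction/files")
    if junction_dir.is_dir():
        written = write_junction_list(junction_dir, "junction_files.txt")
        print(f"listed {len(written)} junction files")
```

The resulting junction_files.txt can then be passed to --input in place of the directory.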
From GitHub (first time):
git clone https://github.com/yourusername/ATSEmapper.git
pip install -e ATSEmapper
If you already have the repo locally:
pip install -e /path/to/ATSEmapper
The -e flag installs in editable mode: atsemapper becomes importable anywhere in your environment, and code changes take effect without reinstalling.
This installs one CLI command: atsemapper.
Requirements:
- Python ≥ 3.10
- pandas
- numpy
- networkx
- matplotlib
- seaborn
- pyfaidx (genome sequence access)
- gffutils (annotation parsing)
- tqdm
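
A quick way to confirm that the environment satisfies the list above is to probe each package with importlib. The sketch below is only a convenience check; the helper name is hypothetical, and it tests importability, not package versions.

```python
import sys
from importlib.util import find_spec

# Import names of the packages listed above, plus atsemapper itself.
REQUIRED = [
    "pandas", "numpy", "networkx", "matplotlib",
    "seaborn", "pyfaidx", "gffutils", "tqdm", "atsemapper",
]


def missing_modules(names):
    """Return the subset of names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info < (3, 10):
        print("Python >= 3.10 is required, found", sys.version.split()[0])
    missing = missing_modules(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```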
atsemapper \
--input path/to/junction/files \
--annotation path/to/annotation.gtf \
--genome path/to/genome.fa

# Python API: run the same pipeline from a script or notebook
from atsemapper.atsemapper.main import run_atsemapper

class Args:
    input = "path/to/junction/files"
    annotation = "path/to/annotation.gtf"
    genome = "path/to/genome.fa"
    output = None  # auto-generated if None
    db_path = None  # annotation.db path; created if missing
    min_intron = 50
    max_intron = 500000
    min_reads = 100
    min_cells = 2
    batch_size = 10
    num_workers = 4
    sequencing_type = "single_cell"
    annotation_status = "both"
    only_canonical = False
    tolerance = 1
    min_splice_site_usage = 0.01
    sample_size = None
    log_file = None
    verbose = False

output_file = run_atsemapper(Args())

See examples/atsemapper_demo.py for a full working example.
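
If you run several configurations from one script, redefining a class each time gets repetitive. The sketch below builds the same settings with types.SimpleNamespace and a small factory. It assumes that run_atsemapper only reads attributes from the object it receives, which the Args pattern above suggests but does not guarantee; make_args and DEFAULTS are hypothetical names.

```python
from types import SimpleNamespace

# Defaults copied from the Args example; the three paths have no default.
DEFAULTS = dict(
    input=None, annotation=None, genome=None,
    output=None, db_path=None,
    min_intron=50, max_intron=500000,
    min_reads=100, min_cells=2,
    batch_size=10, num_workers=4,
    sequencing_type="single_cell", annotation_status="both",
    only_canonical=False, tolerance=1,
    min_splice_site_usage=0.01,
    sample_size=None, log_file=None, verbose=False,
)


def make_args(**overrides):
    """Return an attribute-style settings object, rejecting unknown names."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    return SimpleNamespace(**{**DEFAULTS, **overrides})


args = make_args(
    input="path/to/junction/files",
    annotation="path/to/annotation.gtf",
    genome="path/to/genome.fa",
    sequencing_type="bulk",
    min_reads=5,
)
# With atsemapper installed, this object would be passed on as:
# output_file = run_atsemapper(args)
```

Rejecting unknown names catches typos such as min_read early, which a plain class attribute would silently accept.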
| Parameter | Description | Default |
|---|---|---|
| --input | Directory of .bed/.junc files, or a .txt file listing paths, or a single file | required |
| --annotation | GTF/GFF3 genome annotation file | required |
| --genome | FASTA genome sequence file | required |
| --output | Output directory (auto-generated with timestamp if omitted) | LeafletFA_ATSE_mapper_output_<DATE> |
| --db_path | Path to existing gffutils SQLite annotation database | {output}/annotation.db |
| --min-intron | Minimum intron length (bp) | 50 |
| --max-intron | Maximum intron length (bp) | 500000 |
| --min-reads | Minimum total reads across all cells for a junction | 100 |
| --min-cells | Minimum number of cells/samples with a junction | 2 |
| --batch-size | Number of files to process per batch | 10 |
| --num-workers | Number of parallel worker threads | 4 |
| --sequencing-type | single_cell or bulk | single_cell |
| --annotation-status | Junction annotation filter: both, either, unanno_also | both |
| --only-canonical | Keep only canonical splice sites (GT-AG, GC-AG, AT-AC) | False |
| --tolerance | Tolerance (bp) for matching splice sites to annotated exons | 1 |
| --min-splice-site-usage | Minimum proportion of reads a junction must contribute at a splice site | 0.01 |
| --sample-size | Randomly sample N junction files (useful for testing) | all files |
| --log-file | Path to log file | {output}/atsemapper_<timestamp>.log |
| --verbose | Enable debug-level logging | False |
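
To illustrate how the junction-level thresholds in the table might interact, the sketch below applies --min-intron, --max-intron, --min-reads and --min-cells to a single junction. The semantics are assumptions made for illustration (inclusive length bounds, a cell counting as soon as it has one read); ATSEmapper's own filtering may differ, and passes_filters is a hypothetical name.

```python
def passes_filters(start, end, reads_per_sample,
                   min_intron=50, max_intron=500000,
                   min_reads=100, min_cells=2):
    """Decide whether one junction survives the basic filters.

    start, end        -- intron coordinates (end - start is the intron length)
    reads_per_sample  -- dict mapping sample/cell name to junction read count
    """
    intron_length = end - start
    total_reads = sum(reads_per_sample.values())
    n_cells = sum(1 for reads in reads_per_sample.values() if reads > 0)
    return (
        min_intron <= intron_length <= max_intron
        and total_reads >= min_reads
        and n_cells >= min_cells
    )


if __name__ == "__main__":
    junction_reads = {"cell_a": 60, "cell_b": 50}
    print(passes_filters(1000, 2000, junction_reads))
```

With the defaults, a junction supported by many reads in only one cell is dropped, which is why bulk runs with few samples usually lower --min-reads and keep an eye on --min-cells.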
ATSEmapper writes a compressed TSV to the output directory:
{output}/atse_events_<timestamp>.tsv.gz
Each row is an alternative splicing event with junction coordinates, gene annotation, and splice site usage.
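
As a rough illustration of what a splice site usage value expresses, the sketch below computes, for each junction, its share of the reads among all junctions that start at the same position. Grouping by a shared start coordinate is an assumption made here for simplicity; the tool's exact grouping and column names are not described in this document, and splice_site_usage is a hypothetical name.

```python
from collections import defaultdict


def splice_site_usage(junction_reads):
    """Return each junction's fraction of reads at its shared start site.

    junction_reads maps (chrom, start, end, strand) to a read count.
    Junctions whose site has zero reads in total are left out.
    """
    site_totals = defaultdict(int)
    for (chrom, start, _end, strand), reads in junction_reads.items():
        site_totals[(chrom, start, strand)] += reads

    usage = {}
    for junction, reads in junction_reads.items():
        chrom, start, _end, strand = junction
        total = site_totals[(chrom, start, strand)]
        if total > 0:
            usage[junction] = reads / total
    return usage


counts = {
    ("chr1", 100, 200, "+"): 30,
    ("chr1", 100, 300, "+"): 10,
    ("chr1", 500, 600, "+"): 5,
}
usage = splice_site_usage(counts)
# Junctions below a minimum usage (cf. --min-splice-site-usage) could then be dropped.
kept = {j: u for j, u in usage.items() if u >= 0.01}
```

In this toy input the two junctions leaving position 100 receive usages of 0.75 and 0.25, while the lone junction at position 500 receives 1.0.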
atsemapper \
--input examples/junctions/ \
--annotation /path/to/gencode.v38.annotation.gtf \
--genome /path/to/GRCh38.primary_assembly.genome.fa \
--sequencing-type bulk \
--min-reads 5 \
--only-canonical
ATSEviz is a Python plotting library for visualizing splicing events and isoforms. It is designed for use in Jupyter notebooks; there is no CLI.
See examples/visualization_examples/visualize_atses.ipynb for a full walkthrough.
- Sashimi-style splice junction plots with usage-ratio coloring
- Isoform plots with exon/CDS/intron structure
- Intron compression for compact visualization
- PDF export
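
The idea behind intron compression can be sketched independently of the plotting code: long gaps between exons are capped at a fixed display length so that exons stay visible. The function below is a generic illustration of that coordinate transform, not ATSEviz's implementation; compress_introns and its max_intron parameter are hypothetical.

```python
def compress_introns(exons, max_intron=100):
    """Map exon intervals to plotting coordinates with capped intron lengths.

    exons       -- iterable of (start, end) genomic intervals
    max_intron  -- longest gap (in bp) to draw between consecutive exon blocks

    Overlapping exons are merged first, so the result describes exon
    blocks rather than the individual input exons.
    """
    merged = []
    for start, end in sorted(exons):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    compressed = []
    removed = 0        # intronic bases cut away so far
    prev_end = None
    for start, end in merged:
        if prev_end is not None:
            gap = start - prev_end
            removed += max(0, gap - max_intron)
        compressed.append((start - removed, end - removed))
        prev_end = end
    return compressed


blocks = compress_introns([(100, 200), (1200, 1300), (1350, 1400)], max_intron=100)
```

Here the 1000 bp intron is shortened to 100 bp while the 50 bp intron is left as it is, so exon lengths and their order are preserved.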
import gffutils
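from atsemapper.atseviz.main import (
    plot_exons_and_junctions,
    plot_isoforms,
    fetch_transcripts_and_annotations,
    fetch_transcripts_for_gene,
    # Used below; assumed to be exported from the same module as the other helpers.
    determine_region_boundaries_from_transcripts,
)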
db = gffutils.FeatureDB("path/to/annotation.db", keep_order=True)
transcripts = fetch_transcripts_for_gene(db, "Ptbp1")
transcript_data = fetch_transcripts_and_annotations(db, transcripts)
region_start, region_end = determine_region_boundaries_from_transcripts(transcript_data)
plot_isoforms(db, transcript_data, region_start, region_end, transcript_order=transcripts)
If you use ATSEmapper in your research, please cite:
Isaev et al. (2025). LeafletFA: A comprehensive framework for alternative splicing analysis
from single cell RNA-seq data. Journal Name. DOI: 10.xxxx/xxxxx
Contributions are welcome. Please open a Pull Request:
- Fork the repository
- Create a feature branch (git checkout -b feature/my-feature)
- Commit your changes
- Push and open a Pull Request
MIT License — see the LICENSE file for details.
Questions or bug reports: open a GitHub issue or email karin.isaev@gmail.com.