cellgeni/nf-cluster

Badges: nf-test, Nextflow, nf-core template version, run with docker, run with singularity

Introduction

cellgeni/nf-cluster is a Nextflow pipeline that processes single-cell ATAC data, taking Cell Ranger ARC output directories as input and producing integrated clustering and visualization outputs.

The current ATAC workflow performs:

  • AMULET doublet calling
  • metadata attachment and QC filtering
  • tile matrix generation and feature selection
  • Scrublet doublet scoring
  • on-disk concatenation across samples
  • spectral embedding
  • RAPIDS neighbors, Leiden clustering, and UMAP
  • Scanpy embedding plots colored by Leiden and selected metadata columns

Usage

Prepare a CSV sample sheet with the following columns:

sample,path
SAMPLE_A,/path/to/cellranger_arc_count_output_A
SAMPLE_B,/path/to/cellranger_arc_count_output_B

Each path should point to a Cell Ranger ARC output directory that contains a fragments file (for example, fragments.tsv.gz).
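
Before launching a run, it can help to check the sample sheet for obvious mistakes. The following is a minimal sketch, not part of the pipeline: `parse_sample_sheet` is a hypothetical helper, and the rules it enforces (a header of exactly `sample,path`, non-empty fields, unique sample names) are assumptions rather than documented pipeline behavior.

```python
import csv
import io


def parse_sample_sheet(text):
    """Parse sample sheet text and return a list of (sample, path) tuples.

    Assumed rules (not confirmed by the pipeline): the header is exactly
    'sample,path', every field is non-empty, and sample names are unique.
    """
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != ["sample", "path"]:
        raise ValueError(f"expected header 'sample,path', got {reader.fieldnames}")

    rows = []
    for row in reader:
        # Missing trailing fields are returned as None by DictReader.
        sample = (row["sample"] or "").strip()
        path = (row["path"] or "").strip()
        if not sample or not path:
            raise ValueError(f"row needs both sample and path: {row}")
        rows.append((sample, path))

    if not rows:
        raise ValueError("sample sheet has no data rows")
    names = [sample for sample, _ in rows]
    if len(set(names)) != len(names):
        raise ValueError("sample names must be unique")
    return rows


example = """sample,path
SAMPLE_A,/path/to/cellranger_arc_count_output_A
SAMPLE_B,/path/to/cellranger_arc_count_output_B
"""
for sample, path in parse_sample_sheet(example):
    print(sample, path)
```

To check a file on disk, pass its contents, for example `parse_sample_sheet(open("examples/samples.csv").read())`.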

Run the ATAC workflow:

nextflow run cellgeni/nf-cluster \
   --input examples/samples.csv \
   --atac.genome hg38 \
   --outdir results

You can also provide parameters through a YAML/JSON params file:

nextflow run cellgeni/nf-cluster \
   -params-file params.yml
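
Nextflow generally treats a dotted command-line option such as `--atac.genome` as a nested parameter, so a params file would typically carry the same setting as a nested key. The sketch below generates a JSON params file body from dotted names under that assumption. `nest_params` is a hypothetical helper, and the exact keys the pipeline accepts should be checked against its schema.

```python
import json


def nest_params(flat):
    """Turn {"atac.genome": "hg38"} into {"atac": {"genome": "hg38"}}."""
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create intermediate dictionaries on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Values taken from the command-line example above.
params = nest_params({
    "input": "examples/samples.csv",
    "atac.genome": "hg38",
    "outdir": "results",
})
params_json = json.dumps(params, indent=2)
print(params_json)
```

Saving the printed text to a file (for example `params.json`, a name chosen here for illustration) lets it be passed with `-params-file`.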

Key Parameters

  • Required:
    • --input: CSV sample sheet with the columns sample and path
    • --atac.genome: genome label used by the ATAC workflow
  • Common:
    • --random_state
    • --outdir
  • RAPIDS neighbors:
    • --neighbors.n_neighbors
    • --neighbors.algorithm
    • --neighbors.metric
    • --neighbors.method
  • RAPIDS Leiden:
    • --leiden.resolution
    • --leiden.theta
    • --leiden.n_iterations
    • --leiden.key_added
  • RAPIDS UMAP:
    • --umap.min_dist
    • --umap.spread
    • --umap.n_components
    • --umap.init_pos
  • Embedding plotting:
    • --embeddingplot.basis (default: X_umap)
    • --embeddingplot.color (list, default includes leiden)
    • --embeddingplot.legend_loc
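
The grouped options above are passed on the command line with their dotted names. As a rough illustration, the sketch below assembles a `nextflow run` command from a dictionary of options. `build_command` is a hypothetical helper, and the numeric values are placeholders chosen for illustration, not documented defaults.

```python
import shlex


def build_command(pipeline, options):
    """Build an argv list for `nextflow run` from a dict of pipeline options."""
    argv = ["nextflow", "run", pipeline]
    for name, value in options.items():
        # Pipeline parameters use a double dash; dotted names address grouped options.
        argv += [f"--{name}", str(value)]
    return argv


command = build_command(
    "cellgeni/nf-cluster",
    {
        "input": "examples/samples.csv",
        "atac.genome": "hg38",
        "outdir": "results",
        "leiden.resolution": 0.5,  # illustrative value, not a documented default
        "umap.min_dist": 0.3,      # illustrative value, not a documented default
    },
)
print(shlex.join(command))
```

The list form can be handed to `subprocess.run(command, check=True)` directly, which avoids shell quoting issues.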

Output Overview

The pipeline writes linked outputs under the chosen outdir, including:

  • h5ad/filtered: QC-filtered AnnData files
  • h5ad/neighbors: AnnData with neighbor graph
  • h5ad/leiden: AnnData with Leiden labels in obs
  • h5ad/umap: AnnData with UMAP coordinates in obsm
  • plots/embedding: PNG embedding plots (for example, UMAP colored by leiden)
  • amulet: AMULET outputs
  • reports: Nextflow execution report, timeline, trace, and DAG
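
After a run finishes, a quick check can confirm that the expected locations exist. This is a minimal sketch that assumes each entry listed above is a directory directly under the output directory. `missing_outputs` is a hypothetical helper, not part of the pipeline.

```python
from pathlib import Path

# Relative locations taken from the output list above.
EXPECTED_OUTPUTS = [
    "h5ad/filtered",
    "h5ad/neighbors",
    "h5ad/leiden",
    "h5ad/umap",
    "plots/embedding",
    "amulet",
    "reports",
]


def missing_outputs(outdir):
    """Return the expected subdirectories that are absent under outdir."""
    root = Path(outdir)
    # is_dir() follows symlinks, so linked output directories count as present.
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).is_dir()]


missing = missing_outputs("results")
if missing:
    print("missing outputs:", ", ".join(missing))
else:
    print("all expected output directories are present")
```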
