Convert untar, pigz/compress, kallisto/index, samtools/index, bowtie2/build to topic versions #9796

Merged
FriederikeHanssen merged 41 commits into master from feat/topics-version-batch1
Feb 2, 2026

@pinin4fjords
Member

@pinin4fjords pinin4fjords commented Jan 28, 2026

Summary

Converts 5 modules from versions.yml file-based version emission to the modern topic: versions channel emission pattern:

  • untar
  • pigz/compress
  • kallisto/index
  • samtools/index
  • bowtie2/build

Changes per module

  • main.nf: Replace path "versions.yml", emit: versions with structured tuple output using topic: versions
  • main.nf: Remove heredoc version blocks from both script: and stub: sections
  • meta.yml: Remove old versions: output section, add versions_<tool>: output and topics: section
  • tests/main.nf.test: Update assertions to use process.out.findAll { key, val -> key.startsWith('versions') }
  • tests/main.nf.test.snap: Regenerated with structured version tuples ["PROCESS_NAME", "tool", "version"]

Version output format

Before:

path "versions.yml", emit: versions

After:

tuple val("${task.process}"), val('toolname'), eval('tool --version ...'), emit: versions_toolname, topic: versions
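As a rough illustration of why the tuple form is easier to consume downstream, here is a small Python sketch that collates structured version tuples into a per-process mapping, similar in spirit to what a pipeline might do with everything published on the topic. The helper name `collate_versions`, the sample process names and the version strings are all invented for illustration; this is not nf-core or Nextflow code.

```python
from collections import defaultdict


def collate_versions(version_tuples):
    """Group (process, tool, version) tuples into {process: {tool: version}}.

    Hypothetical helper for illustration only; it mirrors the shape of the
    structured tuples, not any actual nf-core or Nextflow API.
    """
    collated = defaultdict(dict)
    for process, tool, version in version_tuples:
        collated[process][tool] = version
    # Sort keys so the result does not depend on arrival order.
    return {
        process: dict(sorted(tools.items()))
        for process, tools in sorted(collated.items())
    }


# Sample tuples in the ["PROCESS_NAME", "tool", "version"] shape (values invented).
emitted = [
    ("SAMTOOLS_INDEX", "samtools", "1.22.1"),
    ("BOWTIE2_BUILD", "bowtie2", "2.5.4"),
    ("UNTAR", "untar", "1.34"),
]

versions = collate_versions(emitted)
```

Because each tuple already names its process and tool, nothing has to be parsed out of a YAML file written by the task script.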

Bug fix: fastq_align_dedup_bwamem subworkflow

Fixed non-deterministic GPU test failures caused by incorrect use of .first() on subworkflow version channels:

Problem: The workflow called .first() on BAM_SORT_STATS_SAMTOOLS.out.versions and FASTQ_ALIGN_BWA.out.versions. These are subworkflow outputs that mix versions from several parallel processes (FLAGSTAT, IDXSTATS, etc.), so .first() kept whichever version happened to arrive first, which depends on process completion order. This caused:

  • Single md5 values to flip between FLAGSTAT and IDXSTATS versions across runs
  • GPU tests to fail non-deterministically due to variable process timing

Fix:

  • Remove .first() from subworkflow version channels (should only be used on process outputs, not subworkflows)
  • Sort versions by md5 hash in tests for deterministic ordering
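The failure mode and the fix can be sketched in plain Python, assuming two version records that may arrive in either order. The record contents and the helper names (`first_item`, `sorted_by_md5`) are invented for illustration; the actual fix lives in the Nextflow subworkflow and its nf-test files.

```python
import hashlib


def md5_hex(text):
    """Return the md5 hex digest of a string (stand-in for a file's md5)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def first_item(items):
    """Mimic keeping only the first item to arrive on a channel."""
    return items[0]


def sorted_by_md5(items):
    """Order items by md5 digest so the result ignores arrival order."""
    return sorted(items, key=md5_hex)


# Two possible arrival orders for the same pair of version records.
# The process names come from the description above; the contents are invented.
run_a = ["FLAGSTAT: samtools 1.22.1", "IDXSTATS: samtools 1.22.1"]
run_b = list(reversed(run_a))

# Taking the first item depends on which process finished first...
first_differs = first_item(run_a) != first_item(run_b)
# ...whereas sorting by md5 yields the same sequence for both runs.
stable = sorted_by_md5(run_a) == sorted_by_md5(run_b)
```

Keeping every record and sorting by a content-derived key makes the snapshot independent of process timing.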

Conda environment fixes

Pin dependencies in several modules to match their container versions, preventing compatibility issues when running with conda (particularly numpy 2.x and pandas dtype coercion issues):

  • gmmdemux: Pin Python 3.12.2, numpy 1.26.4, pandas 2.2.1, scipy 1.12.0, scikit-learn 1.4.1
  • pypgx (all 4 modules): Pin Python 3.12.3, numpy 1.26.4, pandas 2.2.2, scipy 1.13.1, scikit-learn 1.5.0
  • emu/abundance: Pin Python 3.13.9, numpy 2.3.4, pandas 2.3.3
  • orthofinder: Pin Python 3.12.1, numpy 1.26.3, scipy 1.11.4
  • checkm2/predict: Pin Python 3.12.9, numpy 1.26.4, pandas 2.2.3, scipy 1.15.2, scikit-learn 1.6.1, tensorflow 2.17.0, keras 3.9.0

Test plan

  • All modules pass nf-core modules lint
  • All tests pass locally (except kallisto non-stub which requires x86_64 - snapshot manually updated)
  • CI tests pass

🤖 Generated with Claude Code

…/build to topic-based version emission

Converts these modules from versions.yml file emission to the modern
topic: versions channel emission pattern.

Changes per module:
- main.nf: Replace `path "versions.yml", emit: versions` with
  `tuple val("${task.process}"), val('<tool>'), eval('<version_cmd>'),
  emit: versions_<tool>, topic: versions`
- main.nf: Remove heredoc version blocks from script and stub sections
- meta.yml: Remove old versions output, add versions_<tool> output and
  topics section
- tests: Update assertions to use findAll for versions outputs
- tests: Regenerate snapshots with structured version tuples

Updates 11 subworkflows to remove explicit version collection for
converted modules (versions now collected via topic channel):
- archive_extract
- bam_sort_stats_samtools
- bam_dedup_stats_samtools_umitools
- bam_markduplicates_picard
- bam_dedup_stats_samtools_umicollapse
- fastq_align_bwaaln
- bam_split_by_region
- fastq_remove_rrna
- fasta_index_dna
- fasta_index_bismark_bwameth
- fasta_index_methylseq

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@pinin4fjords pinin4fjords force-pushed the feat/topics-version-batch1 branch from 9407d1d to 01c2896 Compare January 28, 2026 20:42
@pinin4fjords pinin4fjords removed the request for review from maxulysse January 28, 2026 20:43
Remove .out.versions references from fastq_align_dedup_bismark and
fastq_align_dedup_bwamem since samtools/index now uses topic-based
version emission.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@mashehu
Contributor

mashehu commented Jan 29, 2026

This PR is getting quite big, maybe split them up by tool?

pinin4fjords and others added 2 commits January 29, 2026 09:41
Remove aliased SAMTOOLS_INDEX_ALIGNMENTS and SAMTOOLS_INDEX_DEDUPLICATED
.out.versions references since samtools/index now uses topic-based
version emission.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This change doubles the number of shards from 16 to 32 to reduce disk
pressure on CI runners. This should be reverted before merging.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@pinin4fjords pinin4fjords requested review from a team as code owners January 29, 2026 09:51
@pinin4fjords
Member Author

pinin4fjords commented Jan 29, 2026

> This PR is getting quite big, maybe split them up by tool?

It was only 5 modules! The fallout is just extensive. If I have to make a PR per module I'm not going to be very motivated to tackle these updates.

@mashehu
Contributor

mashehu commented Jan 29, 2026

> The fallout is just extensive

That's why I would split it up into per-module (or per-tool) PRs, to also make them easier to finish

@pinin4fjords
Member Author

> The fallout is just extensive
>
> That's why I would split it up into per-module (or per-tool) PRs, to also make them easier to finish

I've got 35 of these to do 😢

@mashehu
Contributor

mashehu commented Jan 29, 2026

you mean <1,500 😜 https://nf-co.re/stats/code/container_conversion/

@pinin4fjords
Member Author

> This PR is getting quite big, maybe split them up by tool?
>
> It was only 5 modules! The fallout is just extensive. If I have to make a PR per module I'm not going to be very motivated to tackle these updates.
>
> you mean <1,500 😜 https://nf-co.re/stats/code/container_conversion/

Ha! Dream on 😛

pinin4fjords and others added 11 commits January 29, 2026 11:07
This reverts commit 590763d.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pin Python and key dependencies to versions matching the biocontainers
image to prevent pandas compatibility issues with newer Python versions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pin Python and key dependencies to versions matching the biocontainers
image to prevent pandas compatibility issues with newer Python versions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pin Python and key dependencies to versions matching the Wave container
to prevent compatibility issues with different conda resolutions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pin Python, numpy, and scipy to versions matching the biocontainers
image to prevent numpy 2.x compatibility issues.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@pinin4fjords
Member Author

@nf-core-bot update gpu snapshot path: subworkflows/nf-core/fastq_align_dedup_bwamem/tests/gpu.nf.test

@pinin4fjords
Member Author

@nf-core-bot update gpu snapshot path: subworkflows/nf-core/fastq_align_dedup_bwameth/tests/gpu.nf.test

nf-core-bot and others added 5 commits January 29, 2026 16:27
The GAWK transformation now handles multiple checkm2 output formats:
- stem only -> adds .fna.gz
- stem.fna -> adds .gz
- stem.fna.gz -> unchanged

This fixes conda test failures where checkm2 outputs genome names
differently than docker due to dependency version differences.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
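The three cases listed in the commit message above can be sketched in Python as follows. The change itself was made in a GAWK program that is not shown here, so the function name `normalize_genome_name` and the sample names are purely illustrative assumptions.

```python
def normalize_genome_name(name):
    """Return the genome name with a .fna.gz suffix, whatever form it arrived in.

    Illustrative stand-in for the GAWK transformation; not the actual code.
    """
    if name.endswith(".fna.gz"):
        # stem.fna.gz -> unchanged
        return name
    if name.endswith(".fna"):
        # stem.fna -> adds .gz
        return name + ".gz"
    # stem only -> adds .fna.gz
    return name + ".fna.gz"


# Invented sample names covering the three input forms.
samples = ["genome1", "genome1.fna", "genome1.fna.gz"]
normalized = [normalize_genome_name(s) for s in samples]
```

Checking the longest suffix first keeps the function idempotent: running it on an already normalized name changes nothing.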
output:
tuple val(meta), path('bowtie2') , emit: index
- path "versions.yml" , emit: versions
+ tuple val("${task.process}"), val('bowtie2'), eval('bowtie2 --version 2>&1 | head -1 | sed "s/^.*bowtie2-align-s version //; s/ .*//"'), emit: versions_bowtie2, topic: versions
Contributor

Be nice to shorten this expression, but not going to worry about it here.

Member Author

Yeah, I'd rather leave expressions as they were before the topics change if poss.
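
For readers puzzling over the expression discussed in this thread, the two sed substitutions can be mirrored in Python as below. The sample line is an assumption about what the first line of `bowtie2 --version` looks like (the path and version number are invented), and `extract_bowtie2_version` is a hypothetical helper, not part of the module.

```python
import re


def extract_bowtie2_version(first_line):
    """Mirror the two sed substitutions from the eval() command."""
    # s/^.*bowtie2-align-s version //  -> drop everything up to and including the marker
    without_prefix = re.sub(
        r"^.*bowtie2-align-s version ", "", first_line, count=1
    )
    # s/ .*//  -> drop everything from the first remaining space onward
    return re.sub(r" .*", "", without_prefix, count=1)


# Assumed example of the first line of `bowtie2 --version`; path and version invented.
sample = "/usr/local/bin/bowtie2-align-s version 2.5.4"
version = extract_bowtie2_version(sample)
```

The first substitution removes the path and marker text; the second discards anything that follows the version number on the same line.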

pinin4fjords and others added 2 commits February 2, 2026 09:23
- Add conda-forge:: prefix and sort dependencies alphabetically in
  checkm2, emu, gmmdemux, and orthofinder environment.yml files
- Remove unused ch_versions from fastq_align_bwaaln subworkflow

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@pinin4fjords pinin4fjords requested a review from SPPearce February 2, 2026 09:25
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
SPPearce added a commit that referenced this pull request Feb 2, 2026
@pinin4fjords
Member Author

pinin4fjords commented Feb 2, 2026

There's some issue with the Galah test w/ Docker in CI today that wasn't there on Friday. It relies on a file hosted on Zenodo, and maybe the CI runners are being blocked from downloading it.

But it passes on a clean linux machine I started:

ubuntu@ip-172-31-32-102:~/modules$ nf-test test modules/nf-core/galah/tests/main.nf.test --profile docker --verbose

🚀 nf-test 0.9.3
https://www.nf-test.com
(c) 2021 - 2024 Lukas Forer and Sebastian Schoenherr

Load .nf-test/plugins/nft-anndata/0.1.0/nft-anndata-0.1.0.jar
Load .nf-test/plugins/nft-bam/0.6.0/nft-bam-0.6.0.jar
Load .nf-test/plugins/nft-csv/0.1.0/nft-csv-0.1.0.jar
Load .nf-test/plugins/nft-compress/0.1.0/nft-compress-0.1.0.jar
Load .nf-test/plugins/nft-fastq/0.0.1/nft-fastq-0.0.1.jar
Load .nf-test/plugins/nft-utils/0.0.7/nft-utils-0.0.7.jar
Load .nf-test/plugins/nft-vcf/1.0.7/nft-vcf-1.0.7.jar

Test Process GALAH

  Test [250843e1] 'genomes - no qc_table' 
    > N E X T F L O W  ~  version 25.10.3
    > Launching `/home/ubuntu/modules/.nf-test-250843e1bfa0e25f378392bb851be0a.nf` [sick_neumann] DSL2 - revision: a04438af3e
    > WARN: There's no process matching config selector: CHECKM2_PREDICT
    > WARN: There's no process matching config selector: GAWK
    > [fb/2a5aa6] Submitted process > GALAH (test)
    PASSED (5.96s)
  Test [7c0102a5] 'genomes - checkm2 qc_table' 
    > N E X T F L O W  ~  version 25.10.3
    > Launching `/home/ubuntu/modules/.nf-test-7c0102a5180b51fac4608c4f0036a39d.nf` [dreamy_brown] DSL2 - revision: b7f789018d
    > Staging foreign file: https://zenodo.org/records/14897628/files/checkm2_database.tar.gz
    > [0c/970c16] Submitted process > UNTAR (checkm2_database.tar.gz)
    > [6a/8a1026] Submitted process > CHECKM2_PREDICT (test_checkm2)
    > [0f/785f47] Submitted process > GAWK (test_checkm2)
    > [8f/b2ee16] Submitted process > GALAH (test)
    PASSED (448.54s)
  Test [99a8b8c5] 'genomes - stub' 
    > N E X T F L O W  ~  version 25.10.3
    > Launching `/home/ubuntu/modules/.nf-test-99a8b8c53b6aaed2546b53281583255d.nf` [adoring_lalande] DSL2 - revision: 61228c27d2
    > WARN: There's no process matching config selector: CHECKM2_PREDICT
    > WARN: There's no process matching config selector: GAWK
    > [31/ccd739] Submitted process > GALAH (test)
    PASSED (5.342s)


SUCCESS: Executed 3 tests in 459.855s

@FriederikeHanssen FriederikeHanssen merged commit 447f7bc into master Feb 2, 2026
48 of 96 checks passed
@FriederikeHanssen FriederikeHanssen deleted the feat/topics-version-batch1 branch February 2, 2026 11:37
github-merge-queue bot pushed a commit that referenced this pull request Feb 4, 2026
#9756)

* Fix variable redeclaration errors in strict syntax mode

Remove variable redeclarations by renaming closure parameters and local
variables that shadow input parameters or previously declared variables.

Fixed modules:
- bbmap/bbsplit: Renamed 'index' closure parameter to 'idx' in eachWithIndex
- bedtools/groupby: Renamed local 'summary_col' to 'summary_col_opt'
- mafft/align: Renamed stub section variables with '_stub' suffix
- spaceranger/count: Renamed option variables with '_opt' suffix

This addresses variable redeclaration errors flagged by nextflow lint in
strict syntax mode (NXF_SYNTAX_PARSER=v2).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Fix additional variable redeclaration errors (part 2)

Continue fixing variable redeclaration errors for strict syntax mode:

- krakentools/extractkrakenreads: Use distinct meta parameter names (meta2, meta3)
- mafft/align: Rename local option variables with _opt suffix
- blobtk/plot: Add def keyword to script section variables
- cellranger/count: Add def keyword to script section variables
- diann: Add def keyword to script section variables
- flash: Add def keyword to script section variables
- scds: Add def keyword to script section variables

In strict syntax mode, all variable declarations must use the def keyword
explicitly, and variables cannot be redeclared within the same scope.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Fix variable redeclaration errors (part 3)

Continue fixing variable redeclarations:

- peka, pureclip: Use distinct meta parameter names for multiple tuple inputs
- svtyper/svtyper: Fix duplicate meta2 in inputs, rename vcf shadow variable
- svtyper/svtypersso: Rename vcf and fasta shadow variables with _opt suffix
- cellrangerarc/mkref: Rename reference_config to reference_config_name
- eklipse: Rename ref_gb to ref_gb_path
- ichorcna/createpon: Rename exons to exons_opt, add def to prefix
- igvreports: Rename fasta to fasta_opt

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Fix final variable redeclaration errors (part 4)

Complete fixing all variable redeclaration errors for strict syntax mode:

- pureclip: Use distinct meta parameter name (meta3) for third input
- jvarkit/sam2tsv, jvarkit/vcf2table: Rename regions_file to regions_opt
- hmmer/hmmfetch: Rename index to index_opt
- chewbbaca/createschema: Rename prodigal_tf and cds to *_opt
- cnvnator/cnvnator: Add def to prefix in script section
- crumble: Rename bedout to bedout_opt in both script and stub
- gmmdemux: Rename skip and type_report to *_opt
- happy/sompy: Rename bams to bams_opt
- salsa2: Rename gfa and dup to *_opt
- sam2lca/analyze: Rename database to database_path
- segemehl/align: Rename reads to reads_opt, add def to prefix
- svanalyzer/svbenchmark: Rename bed to bed_opt

All 53 variable redeclaration errors have been resolved.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Fix variable redeclarations in response to PR review

Addresses review comments from @SPPearce on PR #9684:

1. Remove 'def' from prefix variables used in output blocks:
   - cnvnator/cnvnator: Remove def from prefix (script & stub)
   - diann: Remove def from prefix (script & stub)
   - flash: Remove def from prefix (script & stub)
   - ichorcna/createpon: Remove def from prefix (script only)
   - segemehl/align: Remove def from prefix (script only)

2. Remove unnecessary variable declarations:
   - mafft/align: Remove unused stub variables (args_stub, add_stub,
     addfragments_stub, addfull_stub, addprofile_stub, addlong_stub)
   - mafft/align: Remove def from prefix (script) and prefix_stub

3. Improve code formatting:
   - spaceranger/count: Add backslash continuation between image option
     variables for better readability

All changes verified with:
- nextflow lint (all modules pass)
- NXF_SYNTAX_PARSER=v2 nextflow inspect (all modules pass)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Update meta.yml files to match renamed meta variables

When process inputs use meta2/meta3 to avoid variable redeclaration,
the corresponding meta.yml files must document these renamed variables.

Updated:
- krakentools/extractkrakenreads: meta → meta2, meta3
- peka: meta → meta2
- pureclip: meta → meta2, meta3
- svtyper/svtyper: meta2 → meta3 (for fai input)

All modules verified with:
- nextflow inspect (standard parser)
- NXF_SYNTAX_PARSER=v2 nextflow inspect (strict parser)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Apply suggestions from code review

* Fix blobtk/plot prefix variable scope

Remove `def` from prefix variable to make it accessible to the
tag directive which is evaluated before the script block.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix test infrastructure issues

- hmmer/hmmfetch: Update test data URL from master to 0.9 tag
  (master branch no longer has the HMM file)
- peka: Bump version from 1.0.0 to 1.0.2
  (1.0.0 requires numpy 1.17.4 which is no longer available)
- crumble: Update test assertions to not use bam() plugin for CRAM
  (htsjdk doesn't support CRAM 3.1 format)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix hmmer/hmmrank test data URL

Update barrnap test data URL from master to 0.9 tag.
The master branch no longer has the HMM database files.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update test snapshots

- peka: Update versions.yml hash for 1.0.2
- svtyper/svtypersso: Update VCF output hash

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix template variable access in cellranger/count and scds

Remove `def` from variables that need to be accessible to templates.
Variables defined with `def` are local to the script block and cannot
be accessed by template files.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix scds test data URL

Update to use existing test data from modules branch.
The scdownstream branch does not exist.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix test failures for hmmer/hmmrank, svtyper/svtypersso, and scds

- Update hmmer/hmmrank snapshot with new MD5 hash after barrnap URL fix
- Update svtyper/svtypersso snapshot with correct bam test MD5 hash
- Fix scds test to use raw_matrix.sce.rds which has required "counts" assay

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Revert scds changes - test data issue pre-exists this PR

The SCDS test data at scdownstream/samples/SAMN14430801_... doesn't exist.
This is a pre-existing issue, not caused by this PR. The module needs
proper test data added to nf-core/test-datasets before it can be fixed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix scds: update test data URL and remove def from prefix

- Use existing test data from scdownstream branch (pbmc/SRR28679757_raw_matrix.sce.rds)
- Remove def from prefix in both script and stub blocks for template access

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update scds test data URL to modules branch filtered matrix

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix svtyper/svtypersso snapshots and keep scds prefix fix

- Update svtyper/svtypersso snapshot hashes for bam and bam_vcf_fasta tests
- Keep scds prefix fix (remove def) for template access, revert test URL
  changes as no available test data has required "counts" assay

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix scds test data URL and svtyper snapshots

- Use commit-pinned URL for scds test data with counts assay
- Update svtyper/svtypersso snapshot hashes for bam and bam_vcf_fasta tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add def back to args in cellranger/count

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Various fixes

* Add extra pins to gmmdemux as in #9796

* Eklipse doesn't seem to work on conda, so fail it

* Use different eklipse bioconda version

* Fix svtypersso and try pinning eklipse

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Simon Pearce <24893913+SPPearce@users.noreply.github.com>
cavenel pushed a commit to cavenel/modules that referenced this pull request Feb 5, 2026
…/build to topic versions (nf-core#9796)

* Convert untar, pigz/compress, kallisto/index, samtools/index, bowtie2/build to topic-based version emission

Converts these modules from versions.yml file emission to the modern
topic: versions channel emission pattern.

Changes per module:
- main.nf: Replace `path "versions.yml", emit: versions` with
  `tuple val("${task.process}"), val('<tool>'), eval('<version_cmd>'),
  emit: versions_<tool>, topic: versions`
- main.nf: Remove heredoc version blocks from script and stub sections
- meta.yml: Remove old versions output, add versions_<tool> output and
  topics section
- tests: Update assertions to use findAll for versions outputs
- tests: Regenerate snapshots with structured version tuples

Updates 11 subworkflows to remove explicit version collection for
converted modules (versions now collected via topic channel):
- archive_extract
- bam_sort_stats_samtools
- bam_dedup_stats_samtools_umitools
- bam_markduplicates_picard
- bam_dedup_stats_samtools_umicollapse
- fastq_align_bwaaln
- bam_split_by_region
- fastq_remove_rrna
- fasta_index_dna
- fasta_index_bismark_bwameth
- fasta_index_methylseq

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Remove SAMTOOLS_INDEX version collection from additional subworkflows

Remove .out.versions references from fastq_align_dedup_bismark and
fastq_align_dedup_bwamem since samtools/index now uses topic-based
version emission.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Remove SAMTOOLS_INDEX version collection from fastq_align_dedup_bwameth

Remove aliased SAMTOOLS_INDEX_ALIGNMENTS and SAMTOOLS_INDEX_DEDUPLICATED
.out.versions references since samtools/index now uses topic-based
version emission.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci: Temporarily increase max_shards to 32 to reduce disk space issues

This change doubles the number of shards from 16 to 32 to reduce disk
pressure on CI runners. This should be reverted before merging.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update snaps

* style: Fix prettier formatting in bowtie2/build meta.yml

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Bump shards again

* Update more snaps

* Revert "Bump shards again"

This reverts commit 590763d.

* ci: Increase max_shards to 48 to reduce disk space issues

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* This module has variable outputs

* Couple more snaps

* fix(gmmdemux): Pin dependencies to match container versions

Pin Python and key dependencies to versions matching the biocontainers
image to prevent pandas compatibility issues with newer Python versions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(pypgx): Pin dependencies to match container versions

Pin Python and key dependencies to versions matching the biocontainers
image to prevent pandas compatibility issues with newer Python versions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(emu/abundance): Pin dependencies to match container versions

Pin Python and key dependencies to versions matching the Wave container
to prevent compatibility issues with different conda resolutions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(orthofinder): Pin dependencies to match container versions

Pin Python, numpy, and scipy to versions matching the biocontainers
image to prevent numpy 2.x compatibility issues.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(checkm2/predict): Pin dependencies to match container versions

Pin Python, TensorFlow, Keras, and other dependencies to versions
matching the Wave container to prevent hanging issues caused by
TensorFlow/numpy version mismatches.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test: Update GPU test snapshots for bwamem and bwameth subworkflows

Update versions.yml entries in GPU test snapshots to match current
module outputs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test: Fix additional GPU test snapshots for bwamem subworkflow

Update remaining versions.yml entries in GPU test snapshots.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test: Fix SE skip deduplication GPU snapshots for bwamem

Update versions.yml entries and structured versions for SE skip
deduplication tests (both regular and stub).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(galah): Pin fastani version for conda/Docker consistency

Pin fastani=1.34 to match the biocontainer, ensuring consistent
ANI calculation results between conda and Docker profiles.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test: Fix remaining GPU snapshots for bwamem subworkflow

Update versions arrays for:
- SE deduplicate: add PICARD_ADDORREPLACEREADGROUPS version
- SE skip deduplication stub: correct versions and parsed process list

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test: Make GPU test versions output deterministic

- Add .collect().sort() to workflow.out.versions in all GPU tests
- Fix stub test name from "skip deduplication" to "deduplicate" to
  match actual behavior (input[4] = false means DO deduplication)
- Sort versions arrays inside workflow.out for stub test stability
- Sort parsed yaml collection for consistent ordering

The versions channel uses .mix() which has non-deterministic ordering
depending on process completion order. Sorting ensures stable snapshots.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: Remove .first() from subworkflow versions and sort by md5

The GPU tests were non-deterministic because:
1. .first() was called on subworkflow version channels that contained
   multiple versions mixed from parallel processes (FLAGSTAT, IDXSTATS)
2. .first() randomly selected whichever version emerged first from the
   channel, causing single md5 values to flip between runs

Fix:
- Remove .first() from subworkflow version channels in main.nf
  (should only be used on process outputs, not subworkflow outputs)
- Sort versions by md5 hash for deterministic test ordering
- Simplify stub tests to match non-stub pattern

[skip ci]

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* update snap

* [automated] Update gpu snapshot

* fix(galah): Make GAWK transform robust for checkm2 genome names

The GAWK transformation now handles multiple checkm2 output formats:
- stem only -> adds .fna.gz
- stem.fna -> adds .gz
- stem.fna.gz -> unchanged

This fixes conda test failures where checkm2 outputs genome names
differently than docker due to dependency version differences.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci: Increase max shards to 64

* ci: Add galah to conda skip list

* revert: Remove galah fixes (added to CI skip list instead)

* fix: Address PR review comments

- Add conda-forge:: prefix and sort dependencies alphabetically in
  checkm2, emu, gmmdemux, and orthofinder environment.yml files
- Remove unused ch_versions from fastq_align_bwaaln subworkflow

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: Remove versions references from fastq_align_bwaaln tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* reset shards

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: nf-core-bot <core@nf-co.re>
cavenel pushed a commit to cavenel/modules that referenced this pull request Feb 5, 2026
nf-core#9756)

* Fix variable redeclaration errors in strict syntax mode

Remove variable redeclarations by renaming closure parameters and local
variables that shadow input parameters or previously declared variables.

Fixed modules:
- bbmap/bbsplit: Renamed 'index' closure parameter to 'idx' in eachWithIndex
- bedtools/groupby: Renamed local 'summary_col' to 'summary_col_opt'
- mafft/align: Renamed stub section variables with '_stub' suffix
- spaceranger/count: Renamed option variables with '_opt' suffix

This addresses variable redeclaration errors flagged by nextflow lint in
strict syntax mode (NXF_SYNTAX_PARSER=v2).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Fix additional variable redeclaration errors (part 2)

Continue fixing variable redeclaration errors for strict syntax mode:

- krakentools/extractkrakenreads: Use distinct meta parameter names (meta2, meta3)
- mafft/align: Rename local option variables with _opt suffix
- blobtk/plot: Add def keyword to script section variables
- cellranger/count: Add def keyword to script section variables
- diann: Add def keyword to script section variables
- flash: Add def keyword to script section variables
- scds: Add def keyword to script section variables

In strict syntax mode, all variable declarations must use the def keyword
explicitly, and variables cannot be redeclared within the same scope.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Fix variable redeclaration errors (part 3)

Continue fixing variable redeclarations:

- peka, pureclip: Use distinct meta parameter names for multiple tuple inputs
- svtyper/svtyper: Fix duplicate meta2 in inputs, rename vcf shadow variable
- svtyper/svtypersso: Rename vcf and fasta shadow variables with _opt suffix
- cellrangerarc/mkref: Rename reference_config to reference_config_name
- eklipse: Rename ref_gb to ref_gb_path
- ichorcna/createpon: Rename exons to exons_opt, add def to prefix
- igvreports: Rename fasta to fasta_opt

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Fix final variable redeclaration errors (part 4)

Complete fixing all variable redeclaration errors for strict syntax mode:

- pureclip: Use distinct meta parameter name (meta3) for third input
- jvarkit/sam2tsv, jvarkit/vcf2table: Rename regions_file to regions_opt
- hmmer/hmmfetch: Rename index to index_opt
- chewbbaca/createschema: Rename prodigal_tf and cds to *_opt
- cnvnator/cnvnator: Add def to prefix in script section
- crumble: Rename bedout to bedout_opt in both script and stub
- gmmdemux: Rename skip and type_report to *_opt
- happy/sompy: Rename bams to bams_opt
- salsa2: Rename gfa and dup to *_opt
- sam2lca/analyze: Rename database to database_path
- segemehl/align: Rename reads to reads_opt, add def to prefix
- svanalyzer/svbenchmark: Rename bed to bed_opt

All 53 variable redeclaration errors have been resolved.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Fix variable redeclarations in response to PR review

Addresses review comments from @SPPearce on PR nf-core#9684:

1. Remove 'def' from prefix variables used in output blocks:
   - cnvnator/cnvnator: Remove def from prefix (script & stub)
   - diann: Remove def from prefix (script & stub)
   - flash: Remove def from prefix (script & stub)
   - ichorcna/createpon: Remove def from prefix (script only)
   - segemehl/align: Remove def from prefix (script only)

2. Remove unnecessary variable declarations:
   - mafft/align: Remove unused stub variables (args_stub, add_stub,
     addfragments_stub, addfull_stub, addprofile_stub, addlong_stub)
   - mafft/align: Remove def from prefix (script) and prefix_stub

3. Improve code formatting:
   - spaceranger/count: Add backslash continuation between image option
     variables for better readability

All changes verified with:
- nextflow lint (all modules pass)
- NXF_SYNTAX_PARSER=v2 nextflow inspect (all modules pass)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Update meta.yml files to match renamed meta variables

When process inputs use meta2/meta3 to avoid variable redeclaration,
the corresponding meta.yml files must document these renamed variables.

Updated:
- krakentools/extractkrakenreads: meta → meta2, meta3
- peka: meta → meta2
- pureclip: meta → meta2, meta3
- svtyper/svtyper: meta2 → meta3 (for fai input)

All modules verified with:
- nextflow inspect (standard parser)
- NXF_SYNTAX_PARSER=v2 nextflow inspect (strict parser)
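
The renaming described above can be sketched as follows, assuming a process that takes three tuple inputs, each with its own meta map. The process name `DESCRIBE_INPUTS`, the input names, and the ids are hypothetical; `val` inputs are used instead of `path` so the sketch does not depend on any files.

```nextflow
// Hypothetical process, for illustration only.
process DESCRIBE_INPUTS {
    input:
    tuple val(meta), val(sample)       // primary input keeps the name `meta`
    tuple val(meta2), val(reference)   // second tuple uses `meta2`, not a second `meta`
    tuple val(meta3), val(index)       // third tuple uses `meta3`

    output:
    stdout

    script:
    """
    echo "${meta.id} ${meta2.id} ${meta3.id}"
    """
}

workflow {
    DESCRIBE_INPUTS(
        Channel.of([[id: 'sample1'], 'reads']),
        Channel.of([[id: 'genome'], 'fasta']),
        Channel.of([[id: 'genome_index'], 'fai'])
    )
}
```

Each name used here (`meta`, `meta2`, `meta3`) would then need a matching entry in the module's `meta.yml` input section, which is what this commit updates for the listed modules.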

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Apply suggestions from code review

* Fix blobtk/plot prefix variable scope

Remove `def` from the prefix variable to make it accessible to the
tag directive, which is evaluated before the script block.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix test infrastructure issues

- hmmer/hmmfetch: Update test data URL from master to 0.9 tag
  (master branch no longer has the HMM file)
- peka: Bump version from 1.0.0 to 1.0.2
  (1.0.0 requires numpy 1.17.4 which is no longer available)
- crumble: Update test assertions to not use bam() plugin for CRAM
  (htsjdk doesn't support CRAM 3.1 format)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix hmmer/hmmrank test data URL

Update barrnap test data URL from master to 0.9 tag.
The master branch no longer has the HMM database files.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update test snapshots

- peka: Update versions.yml hash for 1.0.2
- svtyper/svtypersso: Update VCF output hash

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix template variable access in cellranger/count and scds

Remove `def` from variables that need to be accessible to templates.
Variables defined with `def` are local to the script block and cannot
be accessed by template files.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix scds test data URL

Update to use existing test data from modules branch.
The scdownstream branch does not exist.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix test failures for hmmer/hmmrank, svtyper/svtypersso, and scds

- Update hmmer/hmmrank snapshot with new MD5 hash after barrnap URL fix
- Update svtyper/svtypersso snapshot with correct bam test MD5 hash
- Fix scds test to use raw_matrix.sce.rds which has required "counts" assay

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Revert scds changes - test data issue pre-exists this PR

The SCDS test data at scdownstream/samples/SAMN14430801_... doesn't exist.
This is a pre-existing issue, not caused by this PR. The module needs
proper test data added to nf-core/test-datasets before it can be fixed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix scds: update test data URL and remove def from prefix

- Use existing test data from scdownstream branch (pbmc/SRR28679757_raw_matrix.sce.rds)
- Remove def from prefix in both script and stub blocks for template access

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update scds test data URL to modules branch filtered matrix

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix svtyper/svtypersso snapshots and keep scds prefix fix

- Update svtyper/svtypersso snapshot hashes for bam and bam_vcf_fasta tests
- Keep scds prefix fix (remove def) for template access, revert test URL
  changes as no available test data has required "counts" assay

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix scds test data URL and svtyper snapshots

- Use commit-pinned URL for scds test data with counts assay
- Update svtyper/svtypersso snapshot hashes for bam and bam_vcf_fasta tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add def back to args in cellranger/count

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Various fixes

* Add extra pins to gmmdemux as in nf-core#9796

* Eklipse doesn't seem to work on conda, so fail it

* Use different eklipse bioconda version

* Fix svtypersso and try pinning eklipse

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Simon Pearce <24893913+SPPearce@users.noreply.github.com>