Update to new versions capture for checkm2 databasedownload and checkm2 predict #9744
mashehu merged 21 commits into nf-core:master
assertAll(
    { assert process.success },
    { assert snapshot(process.out.versions).match() },
    { assert snapshot(process.out.findAll { key, val -> key.startsWith('versions') }).match() },
Any specific reason why process.out.database is not captured?
Local dev now captures database via this assertion:
{ assert path(process.out.database.get(0).get(1)).exists() },
Unable to snapshot the file as it's 16 GB.
Open to other suggestions for assertions on the database.
Once tests pass and snapshots are updated for this and the other subworkflow/module setups that use checkm2, I will update the PR.
…--user-agent and jobs fail
An update: ongoing dev is required here. Zenodo now blocks CheckM2 database downloads made with the aria2c default user agent. I hope to get this wrapped up today, but it has been a bit of a process.
How big is the data? Can it be added to https://github.com/nf-core/test-datasets/?
16 GB, so it can't go on test-datasets.
Iteration time for testing is super slow because each run has to download the 16 GB file. Additionally, the user-agent mode that works is via wget; the default, which gets blocked by Zenodo, parallelizes the download. wget takes 15-30 minutes to download this database, but doesn't get blocked by Zenodo.
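For anyone reproducing this locally, here is a minimal sketch of what a single-connection download with an explicit user agent could look like. It is not the module's actual command: `build_wget_cmd` is a hypothetical helper, the URL and agent string are placeholders, and whether the module calls wget itself or only sends a wget-style user agent is not stated here. The helper only prints the command, so nothing is downloaded.

```shell
# Hypothetical helper: prints (does not run) a wget command that sends an
# explicit User-Agent header and downloads over a single connection.
build_wget_cmd() {
    # $1 = download URL, $2 = output file name, $3 = User-Agent string
    printf 'wget --user-agent="%s" --output-document="%s" "%s"\n' "$3" "$2" "$1"
}

# Placeholder values for illustration only; none of them come from this PR.
build_wget_cmd "https://example.org/checkm2_db.tar.gz" "checkm2_db.tar.gz" "example-agent/1.0"
```

Printing the command first makes it easy to check the flags before committing to a 15-30 minute download.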
Any way to make this database smaller to make testing easier?
Well, there are two modules: one to download the database and one to run. In theory we could make the checkm2 run module use a test database, but this doesn't help the checkm2/databasedownload module.
Looks like other nf-core people are already on the case: nf-core/mag#915
I don't really understand how to get these remaining tests to pass, or why they are failing in the first place, @mashehu.
The CI picks up all differences between
That makes sense. Thanks! All passing now :)
modify version capture code Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>
Tests are a bit iffy. They have passed now, but I don't think they are particularly stable, potentially due to rate limiting from Zenodo; see https://blog.zenodo.org/2026/01/28/2026-01-28-improvements-and-support-expectations/
Now the database is also available at
Should we tweak this to use the nf-core AWS database?
Also, for future reference, Galah (which uses checkm2 in its test) sometimes produces a different snapshot, so its output is unstable. @prototaxites to address the Galah test data. Thanks!
…m2 predict (nf-core#9744) Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com> Co-authored-by: Julio DiazCaballero <jdiaz@eit.org>
Updating checkm2 modules to migrate versions to topic channels.
Also fixed a bug in checkm2/databasedownload where Zenodo blocks the download's user agent, which causes tests to fail. Tests fail even on the main branch.
PR checklist
topic: versions - See version_topics
label
nf-core modules test <MODULE> --profile docker
nf-core modules test <MODULE> --profile singularity
nf-core modules test <MODULE> --profile conda
nf-core subworkflows test <SUBWORKFLOW> --profile docker
nf-core subworkflows test <SUBWORKFLOW> --profile singularity
nf-core subworkflows test <SUBWORKFLOW> --profile conda