Tests were changed to only assert versions and the run-independent portion of the header line, as the computed scores differ slightly between architectures (observed here from the fourth decimal onward). This was noted by the developers of anarcii as well: "[...] However, we have observed that they show minor variation on different architectures and versions of python/torch in a small number of sequences. These differences are minimal." (source: https://github.com/oxpig/ANARCII/wiki/FAQs#understanding-sequence-scores, retrieved 30.03.26). |
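To picture why an exact snapshot of the scores cannot be stable, the sketch below compares two sets of scores that differ only from the fourth decimal onward. It is an illustration only and not what the module test does (the test asserts versions and the header line). The score values and the `scores_match` helper are hypothetical, and the tolerance of 1e-3 is an assumption derived from the fourth-decimal observation above.

```python
import math


def scores_match(expected, observed, abs_tol=1e-3):
    """Return True if both score lists have equal length and agree within abs_tol."""
    if len(expected) != len(observed):
        return False
    return all(
        math.isclose(e, o, rel_tol=0.0, abs_tol=abs_tol)
        for e, o in zip(expected, observed)
    )


# Hypothetical scores for the same sequences on two architectures.
run_a = [27.4312, 31.0058, 19.8761]
run_b = [27.4315, 31.0051, 19.8764]

exact_equal = run_a == run_b                 # what a snapshot/md5sum effectively checks
tolerant_equal = scores_match(run_a, run_b)  # what a tolerance-based check would see
```

An exact comparison (and therefore any md5sum of the file) treats these runs as different, while the tolerance-based comparison treats them as equal.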
| assertAll( | ||
| { assert snapshot( | ||
| process.out.versions, | ||
| file(process.out.anarcii.get(0).get(1)).readLines()[0].contains("Name,Chain,Score,Query start,Query end") |
Maybe you can make this assertion easier to read by using nft-csv and checking the header? https://github.com/lukfor/nft-csv#columnnames
modules/nf-core/anarcii/main.nf
| conda "${moduleDir}/environment.yml" | ||
| container "${ workflow.containerEngine == 'singularity' ? | ||
| 'oras://community.wave.seqera.io/library/python_pip_anarcii:702a76f5b5d01657' : |
Please use an https URL instead of an oras URL here: https://nf-co.re/docs/tutorials/nf-core_components/using_seqera_containers
Thanks for reviewing, both done :)
|
Did you already manage to add the test-data to the test-datasets repository? :) |
| def anarciiResults = path(process.out.anarcii.get(0).get(1)).csv | ||
| assert "Name" in anarciiResults.columnNames | ||
| assert "Chain" in anarciiResults.columnNames | ||
| assert "Score" in anarciiResults.columnNames | ||
| assert "Query start" in anarciiResults.columnNames | ||
| assert "Query end" in anarciiResults.columnNames |
| def anarciiResults = path(process.out.anarcii.get(0).get(1)).csv | |
| assert "Name" in anarciiResults.columnNames | |
| assert "Chain" in anarciiResults.columnNames | |
| assert "Score" in anarciiResults.columnNames | |
| assert "Query start" in anarciiResults.columnNames | |
| assert "Query end" in anarciiResults.columnNames | |
| assert path(process.out.anarcii[0][1]).csv.columnNames == ["Name", "Chain", "Score", "Query start", "Query end"] |
Or are there optionally more column names?
There are 128+ other columns corresponding to the numbering, which depends on the dataset. I can find out which exact numbering (includes alternatives sometimes, like 112A, 112B etc) this test data set produces, but I thought that including all of those would be hard to read and not representative of the tool, only of this specific test dataset. However, if you feel I should include them, it is possible.
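A middle ground between listing every numbering column and checking names one by one would be to assert only that the fixed columns lead the header, whatever follows. The sketch below shows that idea in plain Python rather than nf-test. The `header_starts_with` helper and the sample numbering columns (`1`, `2`, `112`, `112A`, `112B`) are hypothetical, and it assumes the five fixed columns come first, in this order.

```python
import csv
import io

# The run-independent columns named in this thread.
REQUIRED = ["Name", "Chain", "Score", "Query start", "Query end"]


def header_starts_with(csv_text, required=REQUIRED):
    """Return True if the first CSV row begins with the required column names."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    return header[: len(required)] == list(required)


# Hypothetical output: fixed columns followed by dataset-dependent numbering columns.
sample = (
    "Name,Chain,Score,Query start,Query end,1,2,112,112A,112B\n"
    "seq1,H,27.43,0,120,E,V,A,-,-\n"
)
header_ok = header_starts_with(sample)
```

This keeps the assertion independent of the test dataset, since extra numbering columns after the fixed ones do not affect the result.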
Are you still struggling with unstable md5sums even after updating the test data?
Yes, unfortunately. While the scores look better, they are still not 100% stable across systems.
Co-authored-by: Famke Bäuerle <45968370+famosab@users.noreply.github.com>
nf-core/modules pull request
PR checklist
Closes #10805
Adds a new module: ANARCII for auto-numbering TCRs and antibodies using a language model
Tool github: https://github.com/oxpig/ANARCII
topic: versions - See version_topics
label
nf-core modules test <MODULE> --profile docker
nf-core modules test <MODULE> --profile singularity
nf-core modules test <MODULE> --profile conda