Add a common name higher-rank-fallback flag/option and column indicating the common name rank retrieved #37
Record the rank used for each common-name hit so downstream consumers can distinguish specific names from higher-rank fallbacks. Preserve the existing fallback behavior by default while allowing callers to opt out.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
This PR extends the common-names post-processing workflow to (1) optionally disable higher-rank fallback when looking up vernacular names and (2) make fallback behavior explicit in outputs by recording the rank where a common name was found.
Changes:
- Adds `--higher-rank-fallback` / `--no-higher-rank-fallback` to `taxonopy common-names`, preserving the existing default fallback behavior while allowing strict “no climb” mode.
- Adds a `common_name_rank` output column and threads rank reporting through hierarchical lookup + merge logic.
- Expands unit/integration test coverage and updates user/docs guidance to reflect the new flag and output schema.
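A paired on/off flag like the one described above can be sketched with the standard library's `argparse.BooleanOptionalAction` (Python 3.9+). This is a minimal illustration only: the parser name `common-names-sketch` and the help text are assumptions, and the actual wiring in `src/taxonopy/cli.py` may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real `common-names` subcommand parser.
    parser = argparse.ArgumentParser(prog="common-names-sketch")
    parser.add_argument(
        "--higher-rank-fallback",
        # BooleanOptionalAction also registers --no-higher-rank-fallback.
        action=argparse.BooleanOptionalAction,
        default=True,  # keep the existing climbing behavior unless disabled
        help="Climb to coarser ranks when the finest rank has no common name.",
    )
    return parser


parser = build_parser()
default_args = parser.parse_args([])
strict_args = parser.parse_args(["--no-higher-rank-fallback"])
```

With this shape, omitting the flag leaves `higher_rank_fallback` set to `True`, and passing `--no-higher-rank-fallback` sets it to `False`.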
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/taxonopy/resolve_common_names.py` | Implements `higher_rank_fallback` behavior and emits `common_name_rank`; wires the option through the common-names pipeline. |
| `src/taxonopy/cli.py` | Exposes the new `--higher-rank-fallback` / `--no-higher-rank-fallback` flag on the `common-names` subcommand. |
| `tests/test_resolve_common_names.py` | Adds/updates tests for rank reporting and strict no-fallback behavior across ranks and merge scenarios. |
| `src/taxonopy/config.py` | Adds `higher_rank_fallback` to central config and allows it to be set via CLI args dict. |
| `docs/user-guide/quick-reference.md` | Updates quick-reference documentation and sample output to include `common_name_rank` and flag semantics. |
| `AGENTS.md` | Updates agent guidance for the new output column and fallback flag. |
Comments suppressed due to low confidence (1)
src/taxonopy/resolve_common_names.py:528
`main()` references `args.cache_dir`, but this argparse parser does not define a `--cache-dir` option. Running this module as a script (when `annotation_dir`/`output_dir` are None) will raise `AttributeError: Namespace has no attribute 'cache_dir'`. Add the argument to this parser (or remove the conditional) so script mode works.
```python
# Update config if cache-dir was provided
if args.cache_dir:
    from taxonopy.config import config
    config.cache_dir = args.cache_dir
    Path(config.cache_dir).mkdir(parents=True, exist_ok=True)
```
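One way to address the suppressed comment is sketched below, assuming the fix is to declare `--cache-dir` on the script-mode parser and to read it defensively. The names `build_script_parser` and `resolve_cache_dir` are hypothetical helpers for illustration, not functions from the TaxonoPy codebase, and the sketch returns the path rather than mutating `taxonopy.config`.

```python
import argparse
from pathlib import Path
from typing import Optional


def build_script_parser() -> argparse.ArgumentParser:
    # Hypothetical script-mode parser; the real one lives in resolve_common_names.py.
    parser = argparse.ArgumentParser(prog="resolve-common-names-sketch")
    parser.add_argument("--cache-dir", default=None, help="Optional cache directory.")
    return parser


def resolve_cache_dir(args: argparse.Namespace) -> Optional[Path]:
    # getattr guards against parsers that never defined --cache-dir.
    cache_dir = getattr(args, "cache_dir", None)
    if not cache_dir:
        return None
    path = Path(cache_dir)
    path.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    return path
```

Either change alone (declaring the argument, or using `getattr` with a default) would avoid the `AttributeError`; doing both keeps script mode working even if the parser changes later.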
It looks great, but I feel there might be another issue that hasn't been described. What we observed is that some common names in the original TOL-10M are more specific (to the genus or species level), but they are updated to the family level in the revision. I feel that is more extreme than the intended |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
That's true, the original TOL-10M uses more sources for identifying common names, so it has a higher hit rate at granular taxa than the TaxonoPy However that coverage question is separate from this PR, which makes the current rank fallback behavior explicit (by adding a record of the Improving common name granular coverage is a work in progress. |
vimar-gu
left a comment
I see! That makes sense.
Addresses #36
Specifically:
- `taxonopy common-names --higher-rank-fallback` (default, matches existing behavior which climbs from species toward kingdom to find an available vernacular name) and `--no-higher-rank-fallback` (queries only the finest non-null rank in the row's resolved lineage and leaves the common name empty on a miss).
- `common_name_rank` output column that records the taxonomic rank at which the returned vernacular name was found, or null when no common name is available.

These preserve the existing default fallback behavior while making coarser fallback names explicit and optionally avoidable.
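The two modes and the rank column described above can be illustrated with a small lookup sketch. It assumes a seven-rank lineage ordered from species to kingdom and a plain dictionary of vernacular names keyed by `(rank, taxon)`; the function name `lookup_common_name`, the rank list, and the sample data are illustrative assumptions, not the implementation in `resolve_common_names.py`.

```python
from typing import Mapping, Optional, Tuple

# Assumed rank order, finest first.
RANKS_FINE_TO_COARSE = (
    "species", "genus", "family", "order", "class", "phylum", "kingdom",
)


def lookup_common_name(
    lineage: Mapping[str, Optional[str]],
    vernacular: Mapping[Tuple[str, str], str],
    higher_rank_fallback: bool = True,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (common_name, common_name_rank), or (None, None) on a miss."""
    for rank in RANKS_FINE_TO_COARSE:
        taxon = lineage.get(rank)
        if taxon is None:
            continue  # skip null ranks until the finest non-null one
        name = vernacular.get((rank, taxon))
        if name is not None:
            return name, rank
        if not higher_rank_fallback:
            break  # strict mode: query only the finest non-null rank
    return None, None


# Illustrative data: only the family has a vernacular name available.
lineage = {"species": "Puma concolor", "genus": "Puma", "family": "Felidae"}
vernacular = {("family", "Felidae"): "cats"}

with_fallback = lookup_common_name(lineage, vernacular)
strict = lookup_common_name(lineage, vernacular, higher_rank_fallback=False)
```

In this example the default mode returns the family-level name together with the rank `"family"`, while strict mode leaves both values empty because the species itself has no entry.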
Tests for:
- `common_name_rank` column.

Documentation updates for:
- `common-names` quick-reference section and sample output table.

Docs preview: https://imageomics.github.io/TaxonoPy/pr-37
Claude Opus 4.7 was used to generate a plan and solution, based on my specification of the problem and a human-in-the-loop workflow. I prepared context, provided iterative feedback, and reviewed and, where needed, revised all output for quality.