From 08d394f904c810092b2e18941c288f8958ecbcc4 Mon Sep 17 00:00:00 2001 From: Laura Mason <95650976+lauramason326@users.noreply.github.com> Date: Fri, 15 May 2026 12:40:28 -0600 Subject: [PATCH 01/13] Update Introduction.md --- docs/Introduction.md | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/docs/Introduction.md b/docs/Introduction.md index 21a0aeae..2d77a70d 100644 --- a/docs/Introduction.md +++ b/docs/Introduction.md @@ -1,36 +1,37 @@ -# Welcome to the wiki for Distilling and Refining Annotations of Metabolism 2 (DRAM2)! +# DRAM2 +## Welcome to the wiki for Distilling and Refining Annotations of Metabolism 2 (DRAM2)! -# **As of May 11, 2026, we will not be making public changes to DRAM2 code ahead of our upcoming publication. We appreciate your patience & stay tuned!** +## **As of May 11, 2026, we will not be making public changes to DRAM2 code ahead of our upcoming publication. We appreciate your patience & stay tuned!** -## Introduction +### Introduction DRAM2 (Distilling and Refining Annotations of Metabolism, version 2) is a tool for annotating genomic and metagenomic assemblies (e.g., scaffolds or contigs) as well as predicted genes (nucleotide or amino acid sequences). It organizes genome annotations into metabolic functions across three levels of increasing interpretation: (1) **ANNOTATE**, (2) **SUMMARIZE**, and (3) **VISUALIZE**. The **ANNOTATE** output contains all database hits for every gene in each genome, generating a comprehensive output of most annotation pipelines. DRAM2 extends beyond this by organizing (**SUMMARIZE**) and visualizing (**VISUALIZE**) annotations into ecosystem-relevant functional categories, enabling more interpretable comparisons across genomes and ecosystems. The DRAM2 workflow enables the analysis of large numbers of microbial genomes or metagenomes, highlighting functional guilds and supporting inference of organismal metabolism across datasets. 
-## DRAM2 Overview +### DRAM2 Overview > _Here provide we an overview on the DRAM2 workflow A quick start guide can be found here https://dramit.readthedocs.io/en/latest/usage.html A complete list of pipeline configuration parameters can be found here: [Parameters API](https://dramit.readthedocs.io/en/latest/params_doc.html) -## DRAM2 workflow +### DRAM2 workflow 1) Gene calling 2) Gene annotation 3) Summarize gene annotations based on curated datasets to ascribe function to MAGs 4) Generate an interactive heatmap of ecosystem-relevant MAG level metabolic function -## 1) Gene calling +#### 1) Gene calling DRAM2 uses Prodigal to find open reading frames (ORFs) from genomes, Metagenome Assembled Genomes, (MAGs), or assemblies for downstream annotation. Alternatively, users can supply genes called using another platform. -## 2) Gene annotation +#### 2) Gene annotation After gene-calling in Prodigal, DRAM2 annotates genes in each genome (or Metagenome Assembled Genome (MAG)) using a suite of user-defined databases, including [KEGG](https://www.kegg.jp/) (if provided by the user), [UniRef90](https://www.uniprot.org/), [PFAM](https://pfam.xfam.org/), [dbCAN3](http://bcb.unl.edu/dbCAN2/), [RefSeq Viral](https://www.ncbi.nlm.nih.gov/genome/viruses/), [VOGDB](http://vogdb.org/), [MEROPS](https://www.ebi.ac.uk/merops/), and optional user-defined databases. A full list of available annotation databases can be found here: [WrightonLabCSU/dram pipeline parameters](https://dramit.readthedocs.io/en/latest/params_doc.html#pipeline-steps). ANNOTATE then integrates results across all databases, increasing annotation coverage and yielding ~25% more database hits than commonly used annotators such as DFAST, MetaERG, and Prokka. The output of this step (“raw-annotations.tsv”) contains all database annotations. 
DRAM2 also generates ANNOTATE folder containing: (1) the annotated nucleotide and amino acid fasta files of all genes, (2) genome quality data generated via QUAST, (3) .gff files for each genome, and (4) database-specific files produced during the gene annotation process (i.e. HMMsearch output, MMseq2s output, dbcan3-hmm and dbcan3SUB-hmm etc). -## 3) Summarize gene annotations based on curated datasets to ascribe function to MAGs +#### 3) Summarize gene annotations based on curated datasets to ascribe function to MAGs After genes have been annotated, users can SUMMARIZE this information into a user-friendly excel workbook (metabolism_summary.xlsx), which contains consolidated gene counts for the most informative genes for specific metabolic functions(*Energy acquisition/bioenergetics, Assimilation & Cofactor Metabolism, Cellular Machinery & Environmental Interaction & Adaptation*). This information can be further refined using a user-defined ecosystem (*agriculture, engineered systems, biogeochemistry, and gut*) to provide the user with counts for a refined set of genes directly related to their ecosystem of interest. In addition to the metabolism_summary excel workbook, the SUMMARIZE which contains three key files: (1) A genome statistics table which includes all statistics required to meet MIMAG criteria (genome_stats.tsv), (2) a metabolism summary sheet which gives gene counts of functional genes across a wide variety of metabolisms (summarized_genomes.tsv), and (3) a traits table which provides users with the information on the presence and absence of environmentally relevant pathways (traits.xlsx) -## 4) Generate an interactive heatmap of ecosystem-relevant MAG level metabolic function +#### 4) Generate an interactive heatmap of ecosystem-relevant MAG level metabolic function Users can also generate an interactive heatmap depicting the presence of specific metabolic functions. 
DRAM2 automatically generates this heatmap for each ecosystem if indicated by the user in addition to a generic heatmap of metabolic function by MAG if no ecosystem is defined From 5b1f6518a858f33c9d0b253b28d9924bcbec49a9 Mon Sep 17 00:00:00 2001 From: Laura Mason <95650976+lauramason326@users.noreply.github.com> Date: Fri, 15 May 2026 12:43:47 -0600 Subject: [PATCH 02/13] Update README.md --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index f9e779cd..61d90a1b 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,7 @@ # DRAM2 +## As of May 11, 2026, we will not be making public changes to DRAM2 ahead of our upcoming publication. We appreciate your patience! + ## Welcome to the wiki for Distilling and Refining Annotations of Metabolism 2 (DRAM2)! Here you will find give you basic instructions for running DRAM2, but for full documentation, please see the official DRAM2 webpage: [Read-the-docs](https://dramit.readthedocs.io/en/latest) From c3d2f6e304e8ed978f13aeb26d7138031719ad16 Mon Sep 17 00:00:00 2001 From: Laura Mason <95650976+lauramason326@users.noreply.github.com> Date: Fri, 15 May 2026 12:45:44 -0600 Subject: [PATCH 03/13] Update index.md --- docs/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/index.md b/docs/index.md index 3dac4f7d..0181123d 100644 --- a/docs/index.md +++ b/docs/index.md @@ -8,7 +8,7 @@ maxdepth: 2 --- Installation Introduction -Usage/ Quick Start +Usage Parameter API Output Rules_parser From 65fdf7438c7be6ad34335f33d61ad2a7f28ee40c Mon Sep 17 00:00:00 2001 From: Laura Mason <95650976+lauramason326@users.noreply.github.com> Date: Fri, 15 May 2026 12:49:57 -0600 Subject: [PATCH 04/13] Update Introduction.md --- docs/Introduction.md | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/docs/Introduction.md b/docs/Introduction.md index 2d77a70d..4939f4ce 100644 --- a/docs/Introduction.md +++ b/docs/Introduction.md @@ -1,37 +1,38 @@ # DRAM2 
-## Welcome to the wiki for Distilling and Refining Annotations of Metabolism 2 (DRAM2)! -## **As of May 11, 2026, we will not be making public changes to DRAM2 code ahead of our upcoming publication. We appreciate your patience & stay tuned!** +Welcome to the wiki for Distilling and Refining Annotations of Metabolism 2 (DRAM2)! + + **As of May 11, 2026, we will not be making public changes to DRAM2 code ahead of our upcoming publication. We appreciate your patience & stay tuned!** ### Introduction DRAM2 (Distilling and Refining Annotations of Metabolism, version 2) is a tool for annotating genomic and metagenomic assemblies (e.g., scaffolds or contigs) as well as predicted genes (nucleotide or amino acid sequences). It organizes genome annotations into metabolic functions across three levels of increasing interpretation: (1) **ANNOTATE**, (2) **SUMMARIZE**, and (3) **VISUALIZE**. The **ANNOTATE** output contains all database hits for every gene in each genome, generating a comprehensive output of most annotation pipelines. DRAM2 extends beyond this by organizing (**SUMMARIZE**) and visualizing (**VISUALIZE**) annotations into ecosystem-relevant functional categories, enabling more interpretable comparisons across genomes and ecosystems. The DRAM2 workflow enables the analysis of large numbers of microbial genomes or metagenomes, highlighting functional guilds and supporting inference of organismal metabolism across datasets. 
-### DRAM2 Overview +### DRAM2 Overview & Workflow -> _Here provide we an overview on the DRAM2 workflow -A quick start guide can be found here https://dramit.readthedocs.io/en/latest/usage.html -A complete list of pipeline configuration parameters can be found here: [Parameters API](https://dramit.readthedocs.io/en/latest/params_doc.html) +Here provide we an overview on the DRAM2 workflow: +- A quick start guide can be found here https: [Usage](//dramit.readthedocs.io/en/latest/usage.html) +- A complete list of pipeline configuration parameters can be found here: [Parameters API](https://dramit.readthedocs.io/en/latest/params_doc.html) -### DRAM2 workflow +**DRAM2 workflow** 1) Gene calling 2) Gene annotation 3) Summarize gene annotations based on curated datasets to ascribe function to MAGs 4) Generate an interactive heatmap of ecosystem-relevant MAG level metabolic function -#### 1) Gene calling +**1) Gene calling** DRAM2 uses Prodigal to find open reading frames (ORFs) from genomes, Metagenome Assembled Genomes, (MAGs), or assemblies for downstream annotation. Alternatively, users can supply genes called using another platform. -#### 2) Gene annotation +**2) Gene annotation** After gene-calling in Prodigal, DRAM2 annotates genes in each genome (or Metagenome Assembled Genome (MAG)) using a suite of user-defined databases, including [KEGG](https://www.kegg.jp/) (if provided by the user), [UniRef90](https://www.uniprot.org/), [PFAM](https://pfam.xfam.org/), [dbCAN3](http://bcb.unl.edu/dbCAN2/), [RefSeq Viral](https://www.ncbi.nlm.nih.gov/genome/viruses/), [VOGDB](http://vogdb.org/), [MEROPS](https://www.ebi.ac.uk/merops/), and optional user-defined databases. A full list of available annotation databases can be found here: [WrightonLabCSU/dram pipeline parameters](https://dramit.readthedocs.io/en/latest/params_doc.html#pipeline-steps). 
ANNOTATE then integrates results across all databases, increasing annotation coverage and yielding ~25% more database hits than commonly used annotators such as DFAST, MetaERG, and Prokka. The output of this step (“raw-annotations.tsv”) contains all database annotations. DRAM2 also generates ANNOTATE folder containing: (1) the annotated nucleotide and amino acid fasta files of all genes, (2) genome quality data generated via QUAST, (3) .gff files for each genome, and (4) database-specific files produced during the gene annotation process (i.e. HMMsearch output, MMseq2s output, dbcan3-hmm and dbcan3SUB-hmm etc). -#### 3) Summarize gene annotations based on curated datasets to ascribe function to MAGs +**3) Summarize gene annotations based on curated datasets to ascribe function to MAGs** After genes have been annotated, users can SUMMARIZE this information into a user-friendly excel workbook (metabolism_summary.xlsx), which contains consolidated gene counts for the most informative genes for specific metabolic functions(*Energy acquisition/bioenergetics, Assimilation & Cofactor Metabolism, Cellular Machinery & Environmental Interaction & Adaptation*). This information can be further refined using a user-defined ecosystem (*agriculture, engineered systems, biogeochemistry, and gut*) to provide the user with counts for a refined set of genes directly related to their ecosystem of interest. 
In addition to the metabolism_summary excel workbook, the SUMMARIZE which contains three key files: (1) A genome statistics table which includes all statistics required to meet MIMAG criteria (genome_stats.tsv), (2) a metabolism summary sheet which gives gene counts of functional genes across a wide variety of metabolisms (summarized_genomes.tsv), and (3) a traits table which provides users with the information on the presence and absence of environmentally relevant pathways (traits.xlsx) -#### 4) Generate an interactive heatmap of ecosystem-relevant MAG level metabolic function +**4) Generate an interactive heatmap of ecosystem-relevant MAG level metabolic function** Users can also generate an interactive heatmap depicting the presence of specific metabolic functions. DRAM2 automatically generates this heatmap for each ecosystem if indicated by the user in addition to a generic heatmap of metabolic function by MAG if no ecosystem is defined From efd05054e275f01cbcb779f585e44b75823c34cd Mon Sep 17 00:00:00 2001 From: Laura Mason <95650976+lauramason326@users.noreply.github.com> Date: Fri, 15 May 2026 13:13:16 -0600 Subject: [PATCH 05/13] Create dram2_dictionary.md --- docs/dram2_dictionary.md | 3 +++ 1 file changed, 3 insertions(+) create mode 100644 docs/dram2_dictionary.md diff --git a/docs/dram2_dictionary.md b/docs/dram2_dictionary.md new file mode 100644 index 00000000..c7c6ddf3 --- /dev/null +++ b/docs/dram2_dictionary.md @@ -0,0 +1,3 @@ +## DRAM2 dictionary - Coming soon + + From 2c9d89bfdd19135818519e7e9841dc122be10be5 Mon Sep 17 00:00:00 2001 From: Laura Mason <95650976+lauramason326@users.noreply.github.com> Date: Fri, 15 May 2026 13:13:36 -0600 Subject: [PATCH 06/13] Update index.md --- docs/index.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/index.md b/docs/index.md index 0181123d..f4f3e89c 100644 --- a/docs/index.md +++ b/docs/index.md @@ -14,6 +14,7 @@ Output Rules_parser Contributing Changelog_include +DRAM2 Dictionary ``` # Indices and 
tables From 8020f4dcf2c1e5f7343e35a811258bebbd3e3f8a Mon Sep 17 00:00:00 2001 From: Laura Mason <95650976+lauramason326@users.noreply.github.com> Date: Fri, 15 May 2026 14:16:35 -0600 Subject: [PATCH 07/13] Update output.md --- docs/output.md | 53 ++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 51 insertions(+), 2 deletions(-) diff --git a/docs/output.md b/docs/output.md index 8cdb7c10..baf262b2 100644 --- a/docs/output.md +++ b/docs/output.md @@ -1,5 +1,54 @@ # DRAM Output -## Introduction +This is a work in progress, but here is an updated list of output files for DRAM2 as of May 11, 2026: -Work in progress +
+ +ANNOTATE/ + +- `raw-annotations.tsv` — Initial gene annotations +- `raw_rrna_scan.tsv` — rRNA scan results +- `collected_rrnas.tsv` — Filtered rRNAs +- `collected_trnas.tsv` — Filtered tRNAs + +**Subdirectories:** +- `HMM_SEARCH/` — HMM-based functional annotation results +- `MMSEQ2/` — Sequence similarity search results +- `PRODIGAL/` — Gene predictions +- `QUAST/` — Assembly quality metrics +- `RENAMED_GFFS/` — Standardized GFF files +- `RENAMED_HEADERS/` — Renamed FASTA headers + +
+ +
+multiqc/ + +- `multiqc_report.html` — Aggregated QC report (HTML) +- `multiqc_data/` — subdirectory containing tool information and multiqc log + +
+ +
+ +pipeline_info/ + - contains logs, pipeline parameters, and execution traces + +
+ +
+SUMMARIZE/ + +- `metabolism_summary.xlsx` — excel workbook showing counts per MAG of curated gene sets +- `genome_stats.tsv` — Per-genome statistics +- `summarized_genomes.tsv` — text file with same information as the metabolism summary sheet +- `traits.xlsx` — Per-MAG Traits + +
+ +
+VISUALIZE/ + +- `product.html` - Interactive heatmaps — Trait-based visualizations + +
From 368b0fd739bf9c1e2c60ed646483d26ff10676da Mon Sep 17 00:00:00 2001 From: Laura Mason <95650976+lauramason326@users.noreply.github.com> Date: Fri, 15 May 2026 14:18:14 -0600 Subject: [PATCH 08/13] Update Introduction.md --- docs/Introduction.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/Introduction.md b/docs/Introduction.md index 4939f4ce..a6bd3921 100644 --- a/docs/Introduction.md +++ b/docs/Introduction.md @@ -23,15 +23,15 @@ Here provide we an overview on the DRAM2 workflow: **1) Gene calling** -DRAM2 uses Prodigal to find open reading frames (ORFs) from genomes, Metagenome Assembled Genomes, (MAGs), or assemblies for downstream annotation. Alternatively, users can supply genes called using another platform. +DRAM2 uses Prodigal(v2.6.3) to find open reading frames (ORFs) from genomes, Metagenome Assembled Genomes, (MAGs), or assemblies for downstream annotation. Alternatively, users can supply genes called using another platform. **2) Gene annotation** -After gene-calling in Prodigal, DRAM2 annotates genes in each genome (or Metagenome Assembled Genome (MAG)) using a suite of user-defined databases, including [KEGG](https://www.kegg.jp/) (if provided by the user), [UniRef90](https://www.uniprot.org/), [PFAM](https://pfam.xfam.org/), [dbCAN3](http://bcb.unl.edu/dbCAN2/), [RefSeq Viral](https://www.ncbi.nlm.nih.gov/genome/viruses/), [VOGDB](http://vogdb.org/), [MEROPS](https://www.ebi.ac.uk/merops/), and optional user-defined databases. A full list of available annotation databases can be found here: [WrightonLabCSU/dram pipeline parameters](https://dramit.readthedocs.io/en/latest/params_doc.html#pipeline-steps). ANNOTATE then integrates results across all databases, increasing annotation coverage and yielding ~25% more database hits than commonly used annotators such as DFAST, MetaERG, and Prokka. The output of this step (“raw-annotations.tsv”) contains all database annotations. 
DRAM2 also generates ANNOTATE folder containing: (1) the annotated nucleotide and amino acid fasta files of all genes, (2) genome quality data generated via QUAST, (3) .gff files for each genome, and (4) database-specific files produced during the gene annotation process (i.e. HMMsearch output, MMseq2s output, dbcan3-hmm and dbcan3SUB-hmm etc). +DRAM2 annotates genes in each genome (or Metagenome Assembled Genome (MAG)) using a suite of user-defined databases, including [KEGG](https://www.kegg.jp/) (if provided by the user), [UniRef90](https://www.uniprot.org/), [PFAM](https://pfam.xfam.org/), [dbCAN3](http://bcb.unl.edu/dbCAN2/), [RefSeq Viral](https://www.ncbi.nlm.nih.gov/genome/viruses/), [VOGDB](http://vogdb.org/), [MEROPS](https://www.ebi.ac.uk/merops/), and optional user-defined databases. A full list of available annotation databases can be found here: [WrightonLabCSU/dram pipeline parameters](https://dramit.readthedocs.io/en/latest/params_doc.html#pipeline-steps). ANNOTATE then integrates results across all databases, increasing annotation coverage and yielding ~25% more database hits than commonly used annotators such as DFAST, MetaERG, and Prokka. The output of this step (“raw-annotations.tsv”) contains all database annotations. DRAM2 also generates ANNOTATE folder containing: (1) the annotated nucleotide and amino acid fasta files of all genes, (2) genome quality data generated via QUAST, (3) .gff files for each genome, and (4) database-specific files produced during the gene annotation process (i.e. HMMsearch output, MMseq2s output, dbcan3-hmm and dbcan3SUB-hmm etc). 
**3) Summarize gene annotations based on curated datasets to ascribe function to MAGs** -After genes have been annotated, users can SUMMARIZE this information into a user-friendly excel workbook (metabolism_summary.xlsx), which contains consolidated gene counts for the most informative genes for specific metabolic functions(*Energy acquisition/bioenergetics, Assimilation & Cofactor Metabolism, Cellular Machinery & Environmental Interaction & Adaptation*). This information can be further refined using a user-defined ecosystem (*agriculture, engineered systems, biogeochemistry, and gut*) to provide the user with counts for a refined set of genes directly related to their ecosystem of interest. In addition to the metabolism_summary excel workbook, the SUMMARIZE which contains three key files: (1) A genome statistics table which includes all statistics required to meet MIMAG criteria (genome_stats.tsv), (2) a metabolism summary sheet which gives gene counts of functional genes across a wide variety of metabolisms (summarized_genomes.tsv), and (3) a traits table which provides users with the information on the presence and absence of environmentally relevant pathways (traits.xlsx) +After genes have been annotated, users can SUMMARIZE this information into a user-friendly excel workbook (metabolism_summary.xlsx), which contains consolidated gene counts for the most informative genes for specific metabolic functions(*Energy acquisition/bioenergetics, Assimilation & Cofactor Metabolism, Cellular Machinery & Environmental Interaction & Adaptation*). This information can be further refined using a user-defined ecosystem (*agriculture, engineered systems, biogeochemistry, and gut*) to provide the user with counts for a refined set of genes directly related to their ecosystem of interest. 
In addition to the metabolism_summary excel workbook, the SUMMARIZE which contains three key files: (1) A genome statistics table which includes all statistics required to meet MIMAG criteria (genome_stats.tsv), (2) a metabolism summary sheet which containing gene counts of functional genes across a curated set of metabolisms and ecosystems (summarized_genomes.tsv), and (3) a traits table which provides users with the information on the presence and absence of environmentally relevant pathways (traits.xlsx) **4) Generate an interactive heatmap of ecosystem-relevant MAG level metabolic function** From c53d6e576591899db22d455a922385b65294de47 Mon Sep 17 00:00:00 2001 From: Laura Mason <95650976+lauramason326@users.noreply.github.com> Date: Fri, 15 May 2026 14:24:07 -0600 Subject: [PATCH 09/13] Update Introduction.md --- docs/Introduction.md | 42 +++++++++++++++++++++++++++++++++++++++--- 1 file changed, 39 insertions(+), 3 deletions(-) diff --git a/docs/Introduction.md b/docs/Introduction.md index a6bd3921..d6470a81 100644 --- a/docs/Introduction.md +++ b/docs/Introduction.md @@ -1,19 +1,25 @@ -# DRAM2 +# Introduction to DRAM2 -Welcome to the wiki for Distilling and Refining Annotations of Metabolism 2 (DRAM2)! - **As of May 11, 2026, we will not be making public changes to DRAM2 code ahead of our upcoming publication. We appreciate your patience & stay tuned!** +**Welcome to the wiki for Distilling and Refining Annotations of Metabolism 2 (DRAM2)!** + + +**As of May 11, 2026, we will not be making public changes to DRAM2 code ahead of our upcoming publication. We appreciate your patience, & stay tuned for more!** + ### Introduction DRAM2 (Distilling and Refining Annotations of Metabolism, version 2) is a tool for annotating genomic and metagenomic assemblies (e.g., scaffolds or contigs) as well as predicted genes (nucleotide or amino acid sequences). 
It organizes genome annotations into metabolic functions across three levels of increasing interpretation: (1) **ANNOTATE**, (2) **SUMMARIZE**, and (3) **VISUALIZE**. The **ANNOTATE** output contains all database hits for every gene in each genome, generating a comprehensive output of most annotation pipelines. DRAM2 extends beyond this by organizing (**SUMMARIZE**) and visualizing (**VISUALIZE**) annotations into ecosystem-relevant functional categories, enabling more interpretable comparisons across genomes and ecosystems. The DRAM2 workflow enables the analysis of large numbers of microbial genomes or metagenomes, highlighting functional guilds and supporting inference of organismal metabolism across datasets. + ### DRAM2 Overview & Workflow Here provide we an overview on the DRAM2 workflow: - A quick start guide can be found here https: [Usage](//dramit.readthedocs.io/en/latest/usage.html) - A complete list of pipeline configuration parameters can be found here: [Parameters API](https://dramit.readthedocs.io/en/latest/params_doc.html) + + **DRAM2 workflow** 1) Gene calling @@ -21,18 +27,48 @@ Here provide we an overview on the DRAM2 workflow: 3) Summarize gene annotations based on curated datasets to ascribe function to MAGs 4) Generate an interactive heatmap of ecosystem-relevant MAG level metabolic function + **1) Gene calling** DRAM2 uses Prodigal(v2.6.3) to find open reading frames (ORFs) from genomes, Metagenome Assembled Genomes, (MAGs), or assemblies for downstream annotation. Alternatively, users can supply genes called using another platform. 
+ **2) Gene annotation** DRAM2 annotates genes in each genome (or Metagenome Assembled Genome (MAG)) using a suite of user-defined databases, including [KEGG](https://www.kegg.jp/) (if provided by the user), [UniRef90](https://www.uniprot.org/), [PFAM](https://pfam.xfam.org/), [dbCAN3](http://bcb.unl.edu/dbCAN2/), [RefSeq Viral](https://www.ncbi.nlm.nih.gov/genome/viruses/), [VOGDB](http://vogdb.org/), [MEROPS](https://www.ebi.ac.uk/merops/), and optional user-defined databases. A full list of available annotation databases can be found here: [WrightonLabCSU/dram pipeline parameters](https://dramit.readthedocs.io/en/latest/params_doc.html#pipeline-steps). ANNOTATE then integrates results across all databases, increasing annotation coverage and yielding ~25% more database hits than commonly used annotators such as DFAST, MetaERG, and Prokka. The output of this step (“raw-annotations.tsv”) contains all database annotations. DRAM2 also generates ANNOTATE folder containing: (1) the annotated nucleotide and amino acid fasta files of all genes, (2) genome quality data generated via QUAST, (3) .gff files for each genome, and (4) database-specific files produced during the gene annotation process (i.e. HMMsearch output, MMseq2s output, dbcan3-hmm and dbcan3SUB-hmm etc). + **3) Summarize gene annotations based on curated datasets to ascribe function to MAGs** After genes have been annotated, users can SUMMARIZE this information into a user-friendly excel workbook (metabolism_summary.xlsx), which contains consolidated gene counts for the most informative genes for specific metabolic functions(*Energy acquisition/bioenergetics, Assimilation & Cofactor Metabolism, Cellular Machinery & Environmental Interaction & Adaptation*). This information can be further refined using a user-defined ecosystem (*agriculture, engineered systems, biogeochemistry, and gut*) to provide the user with counts for a refined set of genes directly related to their ecosystem of interest. 
In addition to the metabolism_summary excel workbook, the SUMMARIZE which contains three key files: (1) A genome statistics table which includes all statistics required to meet MIMAG criteria (genome_stats.tsv), (2) a metabolism summary sheet which containing gene counts of functional genes across a curated set of metabolisms and ecosystems (summarized_genomes.tsv), and (3) a traits table which provides users with the information on the presence and absence of environmentally relevant pathways (traits.xlsx) + **4) Generate an interactive heatmap of ecosystem-relevant MAG level metabolic function** Users can also generate an interactive heatmap depicting the presence of specific metabolic functions. DRAM2 automatically generates this heatmap for each ecosystem if indicated by the user in addition to a generic heatmap of metabolic function by MAG if no ecosystem is defined + + +**Basic usage:** + +Below is an example of basic DRAM2 usage. This code is for annotating a directory of genomes, renaming them for downstream use, calling genes and annotating them using all available databases, performing quality control, summarizing and visualizing with particular ecosystems in mind and assigning genome-level traits to the organisms. The command is submitted on the command line and will run in the background. + +``` bash +nextflow run WrightonLabCSU/DRAM --input_fasta [INPUT_FASTA] --outdir [OUTPUT_DIR] --rename --call --annotate --anno_dbs all --qc --summarize --sum_ecos 'eng_sys,ag' --visualize --traits -profile singularity -resume --slurm -bg +``` +Please note that '--input_fasta [INPUT_FASTA]' should be a directory of genomes or MAGs in .fa or .fna format. It is also worth noting that all Nextflow options are specified with a single dash `-`, while all DRAM2-specific options are specified with a double dash `--`. 
+ +All available Nextflow options can be seen by running: +`nextflow run -help` + + +**Other DRAM products** + +- [DRAM webinar](https://www.youtube.com/watch?v=-Ky2fz2vw2s) +- [DRAM in KBase publication (2023)](https://pubmed.ncbi.nlm.nih.gov/36857575/) + + +**Citing DRAM** + +If DRAM2 helps you in your research, please cite: +[DRAM publication in Nucleic Acids Research (2020)](https://academic.oup.com/nar/article/48/16/8883/5884738) + From 9eda0536ee029bdf97ab875d59d9860bd1f324f5 Mon Sep 17 00:00:00 2001 From: Laura Mason <95650976+lauramason326@users.noreply.github.com> Date: Fri, 15 May 2026 14:25:52 -0600 Subject: [PATCH 10/13] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 61d90a1b..27d40773 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ ## As of May 11, 2026, we will not be making public changes to DRAM2 ahead of our upcoming publication. We appreciate your patience! -## Welcome to the wiki for Distilling and Refining Annotations of Metabolism 2 (DRAM2)! +### Welcome to the wiki for Distilling and Refining Annotations of Metabolism 2 (DRAM2)! Here you will find give you basic instructions for running DRAM2, but for full documentation, please see the official DRAM2 webpage: [Read-the-docs](https://dramit.readthedocs.io/en/latest)
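
The basic-usage command added to `docs/Introduction.md` in patch 09 could be wrapped in a small dry-run helper, sketched below. This is only an illustration, not part of DRAM2: the function names `has_genomes` and `dram2_command` are hypothetical, the flags are copied verbatim from the documented command, and the script prints the command instead of launching Nextflow.

```bash
#!/usr/bin/env sh
# Sketch only: has_genomes and dram2_command are hypothetical helpers,
# not part of DRAM2. Flags are copied from the documented example command.

# Succeed only if the directory holds at least one .fa or .fna file,
# the input formats the documentation asks for.
has_genomes() {
  for f in "$1"/*.fa "$1"/*.fna; do
    [ -f "$f" ] && return 0
  done
  return 1
}

# Print (do not run) the documented command for an input and output directory.
# Single-dash options belong to Nextflow; double-dash options belong to DRAM2.
dram2_command() {
  printf '%s\n' "nextflow run WrightonLabCSU/DRAM --input_fasta $1 --outdir $2 --rename --call --annotate --anno_dbs all --qc --summarize --sum_ecos 'eng_sys,ag' --visualize --traits -profile singularity -resume --slurm -bg"
}

# Dry run when called as: sh dram2_dry_run.sh <genome_dir> <out_dir>
if [ "$#" -eq 2 ]; then
  if has_genomes "$1"; then
    dram2_command "$1" "$2"
  else
    echo "No .fa or .fna files found in $1" >&2
    exit 1
  fi
fi
```

Printing the command first makes it easy to confirm that the input directory is the one intended before a long run is submitted in the background.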

From 283847a60595c58fae090c52bc1fb45a481ad9b6 Mon Sep 17 00:00:00 2001 From: Madeline Scyphers Date: Fri, 15 May 2026 16:28:41 -0600 Subject: [PATCH 11/13] Fix index.md links --- docs/index.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/index.md b/docs/index.md index f4f3e89c..60f85266 100644 --- a/docs/index.md +++ b/docs/index.md @@ -6,15 +6,15 @@ The WrightonLabCSU/dram documentation is split into the following pages: --- maxdepth: 2 --- -Installation -Introduction -Usage +installation +introduction +usage Parameter API -Output -Rules_parser -Contributing -Changelog_include -DRAM2 Dictionary +output +rules_parser +contributing +changelog_include +dram2_dictionary ``` # Indices and tables From ecc8ba3c67cf38c8a948cc2f82c4579f47657176 Mon Sep 17 00:00:00 2001 From: Laura Mason <95650976+lauramason326@users.noreply.github.com> Date: Sat, 16 May 2026 15:36:56 -0600 Subject: [PATCH 12/13] Update Introduction.md I think this is the fix --- docs/Introduction.md | 61 +++++++++++++++++++++----------------------- 1 file changed, 29 insertions(+), 32 deletions(-) diff --git a/docs/Introduction.md b/docs/Introduction.md index d6470a81..01e20cf6 100644 --- a/docs/Introduction.md +++ b/docs/Introduction.md @@ -1,74 +1,71 @@ # Introduction to DRAM2 +> As of May 11, 2026, we will not be making public changes to DRAM2 code ahead of our upcoming publication. We appreciate your patience, & stay tuned for more! -**Welcome to the wiki for Distilling and Refining Annotations of Metabolism 2 (DRAM2)!** - - -**As of May 11, 2026, we will not be making public changes to DRAM2 code ahead of our upcoming publication. We appreciate your patience, & stay tuned for more!** +---- +Welcome to the wiki for Distilling and Refining Annotations of Metabolism 2 (DRAM2)! 
### Introduction DRAM2 (Distilling and Refining Annotations of Metabolism, version 2) is a tool for annotating genomic and metagenomic assemblies (e.g., scaffolds or contigs) as well as predicted genes (nucleotide or amino acid sequences). It organizes genome annotations into metabolic functions across three levels of increasing interpretation: (1) **ANNOTATE**, (2) **SUMMARIZE**, and (3) **VISUALIZE**. The **ANNOTATE** output contains all database hits for every gene in each genome, generating a comprehensive output of most annotation pipelines. DRAM2 extends beyond this by organizing (**SUMMARIZE**) and visualizing (**VISUALIZE**) annotations into ecosystem-relevant functional categories, enabling more interpretable comparisons across genomes and ecosystems. The DRAM2 workflow enables the analysis of large numbers of microbial genomes or metagenomes, highlighting functional guilds and supporting inference of organismal metabolism across datasets. - ### DRAM2 Overview & Workflow -Here provide we an overview on the DRAM2 workflow: -- A quick start guide can be found here https: [Usage](//dramit.readthedocs.io/en/latest/usage.html) -- A complete list of pipeline configuration parameters can be found here: [Parameters API](https://dramit.readthedocs.io/en/latest/params_doc.html) - +Here we provide an overview of the DRAM2 workflow: +- A quick start guide can be found here: [Usage](https://dramit.readthedocs.io/en/latest/usage.html) +- A complete list of pipeline configuration parameters can be found here: [Parameters API](https://dramit.readthedocs.io/en/latest/params_doc.html) **DRAM2 workflow** -1) Gene calling -2) Gene annotation -3) Summarize gene annotations based on curated datasets to ascribe function to MAGs -4) Generate an interactive heatmap of ecosystem-relevant MAG level metabolic function - +1) Gene calling +2) Gene annotation +3) Summarize gene annotations based on curated datasets to ascribe function to MAGs +4) Generate an interactive heatmap of 
ecosystem-relevant MAG level metabolic function **1) Gene calling** -DRAM2 uses Prodigal(v2.6.3) to find open reading frames (ORFs) from genomes, Metagenome Assembled Genomes, (MAGs), or assemblies for downstream annotation. Alternatively, users can supply genes called using another platform. - +DRAM2 uses Prodigal(v2.6.3) to find open reading frames (ORFs) from genomes, Metagenome Assembled Genomes (MAGs), or assemblies for downstream annotation. Alternatively, users can supply genes called using another platform. **2) Gene annotation** -DRAM2 annotates genes in each genome (or Metagenome Assembled Genome (MAG)) using a suite of user-defined databases, including [KEGG](https://www.kegg.jp/) (if provided by the user), [UniRef90](https://www.uniprot.org/), [PFAM](https://pfam.xfam.org/), [dbCAN3](http://bcb.unl.edu/dbCAN2/), [RefSeq Viral](https://www.ncbi.nlm.nih.gov/genome/viruses/), [VOGDB](http://vogdb.org/), [MEROPS](https://www.ebi.ac.uk/merops/), and optional user-defined databases. A full list of available annotation databases can be found here: [WrightonLabCSU/dram pipeline parameters](https://dramit.readthedocs.io/en/latest/params_doc.html#pipeline-steps). ANNOTATE then integrates results across all databases, increasing annotation coverage and yielding ~25% more database hits than commonly used annotators such as DFAST, MetaERG, and Prokka. The output of this step (“raw-annotations.tsv”) contains all database annotations. DRAM2 also generates ANNOTATE folder containing: (1) the annotated nucleotide and amino acid fasta files of all genes, (2) genome quality data generated via QUAST, (3) .gff files for each genome, and (4) database-specific files produced during the gene annotation process (i.e. HMMsearch output, MMseq2s output, dbcan3-hmm and dbcan3SUB-hmm etc). 
- +DRAM2 annotates genes in each genome (or Metagenome Assembled Genome (MAG)) using a suite of user-defined databases, including [KEGG](https://www.kegg.jp/) (if provided by the user), [UniRef90](https://www.uniprot.org/), [PFAM](https://pfam.xfam.org/), [dbCAN3](http://bcb.unl.edu/dbCAN2/), [RefSeq Viral](https://www.ncbi.nlm.nih.gov/genome/viruses/), [VOGDB](http://vogdb.org/), [MEROPS](https://www.ebi.ac.uk/merops/), and optional user-defined databases. A full list of available annotation databases can be found here: [WrightonLabCSU/dram pipeline parameters](https://dramit.readthedocs.io/en/latest/params_doc.html#pipeline-steps). ANNOTATE then integrates results across all databases, increasing annotation coverage and yielding ~25% more database hits than commonly used annotators such as DFAST, MetaERG, and Prokka. The output of this step (“raw-annotations.tsv”) contains all database annotations. DRAM2 also generates an ANNOTATE folder containing: (1) the annotated nucleotide and amino acid fasta files of all genes, (2) genome quality data generated via QUAST, (3) .gff files for each genome, and (4) database-specific files produced during the gene annotation process (e.g., HMMsearch output, MMseqs2 output, dbcan3-hmm, and dbcan3SUB-hmm). **3) Summarize gene annotations based on curated datasets to ascribe function to MAGs** -After genes have been annotated, users can SUMMARIZE this information into a user-friendly excel workbook (metabolism_summary.xlsx), which contains consolidated gene counts for the most informative genes for specific metabolic functions(*Energy acquisition/bioenergetics, Assimilation & Cofactor Metabolism, Cellular Machinery & Environmental Interaction & Adaptation*). This information can be further refined using a user-defined ecosystem (*agriculture, engineered systems, biogeochemistry, and gut*) to provide the user with counts for a refined set of genes directly related to their ecosystem of interest. 
In addition to the metabolism_summary excel workbook, the SUMMARIZE which contains three key files: (1) A genome statistics table which includes all statistics required to meet MIMAG criteria (genome_stats.tsv), (2) a metabolism summary sheet which containing gene counts of functional genes across a curated set of metabolisms and ecosystems (summarized_genomes.tsv), and (3) a traits table which provides users with the information on the presence and absence of environmentally relevant pathways (traits.xlsx) - +After genes have been annotated, users can SUMMARIZE this information into a user-friendly Excel workbook (metabolism_summary.xlsx), which contains consolidated gene counts for the most informative genes for specific metabolic functions (*Energy acquisition/bioenergetics, Assimilation & Cofactor Metabolism, Cellular Machinery & Environmental Interaction & Adaptation*). This information can be further refined using a user-defined ecosystem (*agriculture, engineered systems, biogeochemistry, and gut*) to provide the user with counts for a refined set of genes directly related to their ecosystem of interest. In addition to the metabolism_summary Excel workbook, the SUMMARIZE output contains three key files: (1) a genome statistics table that includes all statistics required to meet MIMAG criteria (genome_stats.tsv), (2) a metabolism summary sheet that contains gene counts of functional genes across a curated set of metabolisms and ecosystems (summarized_genomes.tsv), and (3) a traits table that reports the presence or absence of environmentally relevant pathways (traits.xlsx). **4) Generate an interactive heatmap of ecosystem-relevant MAG level metabolic function** -Users can also generate an interactive heatmap depicting the presence of specific metabolic functions. 
DRAM2 automatically generates this heatmap for each ecosystem if indicated by the user in addition to a generic heatmap of metabolic function by MAG if no ecosystem is defined +Users can also generate an interactive heatmap depicting the presence of specific metabolic functions. DRAM2 automatically generates this heatmap for each ecosystem if indicated by the user in addition to a generic heatmap of metabolic function by MAG if no ecosystem is defined. +--- -**Basic usage:** +### Basic usage -Below is an example of basic DRAM2 usage. This code is for annotating a directory of genomes, renaming them for downstream use, calling genes and annotating them using all available databases, performing quality control, summarizing and visualizing with particular ecosystems in mind and assigning genome-level traits to the organisms. The command is submitted on the command line and will run in the background. +Below is an example of basic DRAM2 usage. This code is for annotating a directory of genomes, renaming them for downstream use, calling genes and annotating them using all available databases, performing quality control, summarizing and visualizing with particular ecosystems in mind and assigning genome-level traits to the organisms. The command is submitted on the command line and will run in the background. -``` bash +```bash nextflow run WrightonLabCSU/DRAM --input_fasta [INPUT_FASTA] --outdir [OUTPUT_DIR] --rename --call --annotate --anno_dbs all --qc --summarize --sum_ecos 'eng_sys,ag' --visualize --traits -profile singularity -resume --slurm -bg ``` -Please note that '--input_fasta [INPUT_FASTA]' should be a directory of genomes or MAGs in .fa or .fna format. It is also worth noting that all Nextflow options are specified with a single dash `-`, while all DRAM2-specific options are specified with a double dash `--`. + +Please note that `--input_fasta [INPUT_FASTA]` should be a directory of genomes or MAGs in .fa or .fna format. 
It is also worth noting that all Nextflow options are specified with a single dash `-`, while all DRAM2-specific options are specified with a double dash `--`. All available Nextflow options can be seen by running: -`nextflow run -help` +`nextflow run -help` +--- -**Other DRAM products** +## Other DRAM products -- [DRAM webinar](https://www.youtube.com/watch?v=-Ky2fz2vw2s) -- [DRAM in KBase publication (2023)](https://pubmed.ncbi.nlm.nih.gov/36857575/) +DRAM webinar: https://www.youtube.com/watch?v=-Ky2fz2vw2s +DRAM in KBase publication (2023): https://pubmed.ncbi.nlm.nih.gov/36857575/ +--- -**Citing DRAM** +## Citing DRAM If DRAM2 helps you in your research, please cite: -[DRAM publication in Nucleic Acids Research (2020)](https://academic.oup.com/nar/article/48/16/8883/5884738) - +DRAM publication in Nucleic Acids Research (2020): +https://academic.oup.com/nar/article/48/16/8883/5884738 From fabfaf5b05434e14a3633606d2e7be5eec93183c Mon Sep 17 00:00:00 2001 From: Madeline Scyphers Date: Mon, 18 May 2026 09:07:43 -0600 Subject: [PATCH 13/13] Rename Introduction.md back to introduction.md --- docs/{Introduction.md => introduction.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename docs/{Introduction.md => introduction.md} (100%) diff --git a/docs/Introduction.md b/docs/introduction.md similarity index 100% rename from docs/Introduction.md rename to docs/introduction.md
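Note on the Basic usage text in PATCH 12/13: the docs state that `--input_fasta` should point to a directory of genomes or MAGs in .fa or .fna format. A minimal sketch of a pre-flight check for that requirement might look like the following. The function name `count_fasta_inputs` is hypothetical and is not part of DRAM2 or Nextflow, and the sketch assumes the FASTA files sit directly in the directory rather than in subdirectories.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helper; not part of DRAM2 or Nextflow.
# Prints how many regular files ending in .fa or .fna sit directly
# inside the directory given as the first argument.
count_fasta_inputs() {
    local dir="$1"
    local count=0
    local f
    for f in "$dir"/*.fa "$dir"/*.fna; do
        # An unmatched glob stays literal, so confirm the path is a real file.
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Calling `count_fasta_inputs` on the intended input directory before the `nextflow run` command, and stopping when it prints 0, would catch an empty or mistyped directory before a pipeline run is submitted.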