This workflow constructs Metagenome-Assembled Genomes (MAGs) using metaSPAdes or MEGAHIT as the assembler, followed by binning with four different tools (SemiBin, CONCOCT, MaxBin2, and MetaBAT2) and bin refinement with Binette. The resulting MAGs are dereplicated across the entire input sample set, then annotated and evaluated for quality. For assembly and binning you can provide pooled reads (for co-assembly/binning), individual read sets, or a combination of both. Separately, provide the reads of the original individual samples, which are used to estimate MAG abundance. In all cases, reads should already be quality-trimmed, free of adapters, and cleaned of host or other contaminant sequences before processing.
Inputs
ID | Name | Description | Type |
---|---|---|---|
AMRFinderPlus Database for Bakta | AMRFinderPlus Database for Bakta | AMRFinderPlus database for Bakta. Ideally use the newest version installed on the server. | |
ANI threshold for dereplication | ANI threshold for dereplication | ANI threshold used to form secondary clusters during dereplication. As a rule of thumb, an ANI of ≥95-96% indicates genomes of the same species, an ANI of ≥98-99% suggests the same strain, and values below about 95% typically indicate different species. | |
Bakta Database | Bakta Database | Bakta database. Ideally use the newest version installed on the server. | |
CheckM2 Database | CheckM2 Database | CheckM2 reference database, used by Binette, by dRep, and for quality assessment of the final bins. | |
Choose Assembler | Choose Assembler | Assembler to use: MEGAHIT or metaSPAdes. Select `custom assembly` to supply your own assemblies instead. | |
Contamination weight (Binette) | Contamination weight (Binette) | Weight applied to contamination when scoring bins. A low weight favors more complete bins over less contaminated ones (`--contamination_weight`). | |
Custom Assemblies | Custom Assemblies | Optional custom assemblies to use as input. If provided, select `custom assembly` as the assembler and provide one assembly for each group of trimmed input reads. | |
Environment for the built-in model (SemiBin) | Environment for the built-in model (SemiBin) | Environment for SemiBin's built-in model. Options: human_gut, dog_gut, ocean, soil, cat_gut, human_oral, mouse_gut, pig_gut, built_environment, wastewater, chicken_caecum, global. | |
GTDB-tk Database | GTDB-tk Database | GTDB-Tk database used for taxonomic classification of bins. | |
Maximum MAG contamination percentage | Maximum MAG contamination percentage | Maximum MAG contamination (percent) for dereplication. | |
Minimum MAG completeness percentage | Minimum MAG completeness percentage | Minimum MAG completeness (percent) for bin refinement (Binette) and dereplication (dRep). | |
Minimum MAG length | Minimum MAG length | Minimum MAG length for dereplication. | |
Minimum length of contigs to output | Minimum length of contigs to output | Minimum length of contigs to output (MEGAHIT only). | |
Read length (CONCOCT) | Read length (CONCOCT) | CONCOCT requires the read length to compute coverage. Use FastQC to estimate the mean read length. | |
Run Bakta on MAGs | Run Bakta on MAGs | Bakta can take a long time to run but provides comprehensive genome annotation. | |
Run GTDB-Tk on MAGs | Run GTDB-Tk on MAGs | GTDB-Tk can take a long time and requires a lot of memory. Run it only if taxonomic placement is required. | |
Trimmed reads | Trimmed reads | Reads from the individual original samples, used to estimate MAG abundance in those samples. | |
Trimmed reads from grouped samples | Trimmed reads from grouped samples | Quality-controlled, trimmed reads for assembly and binning. These can be pooled read groups, individual samples, or a mix of both. | |
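To illustrate how the contamination weight and the MAG quality thresholds interact, here is a minimal Python sketch. It assumes that bins are scored as completeness minus weight times contamination, which is how Binette's scoring is commonly described; treat that formula as an assumption rather than a statement of the tool's internals. The `Bin` class, the helper functions, the example bins, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bin:
    name: str
    completeness: float   # percent, 0-100
    contamination: float  # percent, 0-100
    length: int           # total bases in the bin


def bin_score(b: Bin, contamination_weight: float) -> float:
    """Assumed scoring form: completeness - weight * contamination."""
    return b.completeness - contamination_weight * b.contamination


def passes_filters(b: Bin, min_completeness: float,
                   max_contamination: float, min_length: int) -> bool:
    """Apply the three MAG thresholds exposed as workflow inputs."""
    return (b.completeness >= min_completeness
            and b.contamination <= max_contamination
            and b.length >= min_length)


# Hypothetical bins, for illustration only.
bins = [
    Bin("bin_a", 95.0, 8.0, 3_200_000),
    Bin("bin_b", 85.0, 1.0, 2_900_000),
    Bin("bin_c", 40.0, 0.5, 900_000),
]

# A low weight prefers the more complete bin; a high weight prefers the cleaner one.
best_low = max(bins, key=lambda b: bin_score(b, 0.5)).name
best_high = max(bins, key=lambda b: bin_score(b, 3.0)).name

# Illustrative thresholds: 50% completeness, 10% contamination, 1 Mbp.
kept = [b.name for b in bins if passes_filters(b, 50.0, 10.0, 1_000_000)]

print(best_low, best_high, kept)
```

With these made-up numbers, the low weight selects the highly complete but more contaminated `bin_a`, while the high weight selects the cleaner `bin_b`. The small, incomplete `bin_c` is dropped by the filters in both cases.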
Steps
ID | Name | Description |
---|---|---|
18 | Map parameter value | toolshed.g2.bx.psu.edu/repos/iuc/map_param_value/map_param_value/0.2.0 |
19 | Map parameter value | toolshed.g2.bx.psu.edu/repos/iuc/map_param_value/map_param_value/0.2.0 |
20 | Unzip collection | __UNZIP_COLLECTION__ |
21 | MEGAHIT | toolshed.g2.bx.psu.edu/repos/iuc/megahit/megahit/1.2.9+galaxy2 |
22 | metaSPAdes | toolshed.g2.bx.psu.edu/repos/nml/metaspades/metaspades/4.1.0+galaxy0 |
23 | Pick parameter value | toolshed.g2.bx.psu.edu/repos/iuc/pick_value/pick_value/0.2.0 |
24 | Quast | toolshed.g2.bx.psu.edu/repos/iuc/quast/quast/5.3.0+galaxy0 |
25 | Bowtie2 | toolshed.g2.bx.psu.edu/repos/devteam/bowtie2/bowtie2/2.5.3+galaxy1 |
26 | CONCOCT: Cut up contigs | toolshed.g2.bx.psu.edu/repos/iuc/concoct_cut_up_fasta/concoct_cut_up_fasta/1.1.0+galaxy2 |
27 | Samtools sort | toolshed.g2.bx.psu.edu/repos/devteam/samtools_sort/samtools_sort/2.0.5 |
28 | SemiBin | toolshed.g2.bx.psu.edu/repos/iuc/semibin/semibin/2.0.2+galaxy1 |
29 | CONCOCT: Generate the input coverage table | toolshed.g2.bx.psu.edu/repos/iuc/concoct_coverage_table/concoct_coverage_table/1.1.0+galaxy2 |
30 | Calculate contig depths | toolshed.g2.bx.psu.edu/repos/iuc/metabat2_jgi_summarize_bam_contig_depths/metabat2_jgi_summarize_bam_contig_depths/2.17+galaxy0 |
31 | Converts genome bins in fasta format | toolshed.g2.bx.psu.edu/repos/iuc/fasta_to_contig2bin/Fasta_to_Contig2Bin/1.1.7+galaxy1 |
32 | CONCOCT | toolshed.g2.bx.psu.edu/repos/iuc/concoct/concoct/1.1.0+galaxy2 |
33 | MaxBin2 | toolshed.g2.bx.psu.edu/repos/mbernt/maxbin2/maxbin2/2.2.7+galaxy6 |
34 | MetaBAT2 | toolshed.g2.bx.psu.edu/repos/iuc/metabat2/metabat2/2.17+galaxy0 |
35 | CONCOCT: Merge cut clusters | toolshed.g2.bx.psu.edu/repos/iuc/concoct_merge_cut_up_clustering/concoct_merge_cut_up_clustering/1.1.0+galaxy2 |
36 | Converts genome bins in fasta format | toolshed.g2.bx.psu.edu/repos/iuc/fasta_to_contig2bin/Fasta_to_Contig2Bin/1.1.7+galaxy1 |
37 | Converts genome bins in fasta format | toolshed.g2.bx.psu.edu/repos/iuc/fasta_to_contig2bin/Fasta_to_Contig2Bin/1.1.7+galaxy1 |
38 | CONCOCT: Extract a fasta file | toolshed.g2.bx.psu.edu/repos/iuc/concoct_extract_fasta_bins/concoct_extract_fasta_bins/1.1.0+galaxy2 |
39 | Converts genome bins in fasta format | toolshed.g2.bx.psu.edu/repos/iuc/fasta_to_contig2bin/Fasta_to_Contig2Bin/1.1.7+galaxy1 |
40 | Build list | __BUILD_LIST__ |
41 | Binette | toolshed.g2.bx.psu.edu/repos/iuc/binette/binette/1.0.5+galaxy1 |
42 | Pool Bins from all samples | __FLATTEN__ |
43 | checkm2 | toolshed.g2.bx.psu.edu/repos/iuc/checkm2/checkm2/1.0.2+galaxy0 |
44 | Text reformatting | toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/9.5+galaxy0 |
45 | dRep dereplicate | toolshed.g2.bx.psu.edu/repos/iuc/drep_dereplicate/drep_dereplicate/3.5.0+galaxy1 |
46 | GTDB-Tk Classify genomes | toolshed.g2.bx.psu.edu/repos/iuc/gtdbtk_classify_wf/gtdbtk_classify_wf/2.4.0+galaxy0 |
47 | checkm2 | toolshed.g2.bx.psu.edu/repos/iuc/checkm2/checkm2/1.0.2+galaxy0 |
48 | CheckM lineage_wf | toolshed.g2.bx.psu.edu/repos/iuc/checkm_lineage_wf/checkm_lineage_wf/1.2.3+galaxy0 |
49 | CoverM genome | toolshed.g2.bx.psu.edu/repos/iuc/coverm_genome/coverm_genome/0.7.0+galaxy0 |
50 | Quast | toolshed.g2.bx.psu.edu/repos/iuc/quast/quast/5.3.0+galaxy0 |
51 | Bakta | toolshed.g2.bx.psu.edu/repos/iuc/bakta/bakta/1.9.4+galaxy0 |
52 | Column join | toolshed.g2.bx.psu.edu/repos/iuc/collection_column_join/collection_column_join/0.0.3 |
53 | MultiQC | toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.27+galaxy3 |
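The Description column above holds Galaxy tool identifiers. ToolShed-hosted tools appear to follow the pattern `host/repos/owner/repository/tool/version`, while built-in collection operations use names such as `__FLATTEN__`. The following Python sketch splits an identifier into those parts. The `ToolRef` type and `parse_tool_id` function are hypothetical helpers, and the pattern is inferred from the rows in this table rather than taken from a specification.

```python
from typing import NamedTuple, Optional


class ToolRef(NamedTuple):
    host: str
    owner: str
    repo: str
    tool: str
    version: str


def parse_tool_id(tool_id: str) -> Optional[ToolRef]:
    """Split a ToolShed-style identifier; return None for built-in tools."""
    parts = tool_id.split("/")
    # Expected layout: host / "repos" / owner / repo / tool / version
    if len(parts) != 6 or parts[1] != "repos":
        return None
    host, _, owner, repo, tool, version = parts
    return ToolRef(host, owner, repo, tool, version)


megahit = parse_tool_id(
    "toolshed.g2.bx.psu.edu/repos/iuc/megahit/megahit/1.2.9+galaxy2"
)
builtin = parse_tool_id("__FLATTEN__")

# The version string combines the upstream release and the Galaxy wrapper revision.
upstream, _, wrapper = megahit.version.partition("+")

print(megahit.owner, megahit.tool, upstream, wrapper, builtin)
```

Parsing the identifiers this way makes it easy to list which upstream tool versions a workflow release pins, for example when reporting methods.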
Outputs
ID | Name | Description | Type |
---|---|---|---|
Assembly Report | Assembly Report | n/a | |
Dereplicated Bins | Dereplicated Bins | n/a | |
Full MultiQC Report | Full MultiQC Report | n/a | |
Version History
v0.1 (earliest) Created 30th Apr 2025 at 03:02 by WorkflowHub Bot
Updated to v0.1
Frozen
v0.1
7986c0c
