Workflow Type: Galaxy

This workflow constructs Metagenome-Assembled Genomes (MAGs) using metaSPAdes or MEGAHIT as the assembler, followed by binning with four different tools (SemiBin, CONCOCT, MaxBin2, and MetaBAT2) and refinement using Binette. The resulting MAGs are dereplicated across the entire input sample set, then annotated and evaluated for quality. For assembly and binning you can provide pooled reads (for co-assembly/binning), individual read sets, or a combination of both. The original per-sample reads must also be provided, because they are used for abundance estimation. In all cases, reads should be trimmed, with adapters removed, and cleaned of host or other contaminants before processing.

Inputs

ID Name Description Type
AMRFinderPlus Database for Bakta AMRFinderPlus Database for Bakta AMRFinderPlus Database for Bakta - ideally use the newest installed on the server
  • string
ANI threshold for dereplication ANI threshold for dereplication ANI threshold to form secondary clusters of dereplication. An ANI value of ≥95-96% indicates genomes belong to the same species, an ANI of ≥98-99% suggests they are the same strain, and an ANI of ≤90-95% typically points to genomes from different genera.
  • float
Bakta Database Bakta Database Bakta Database - ideally use the newest installed on the server
  • string
CheckM2 Database CheckM2 Database CheckM2 reference database used for quality assessment in Binette and dRep, and for quality assessment of the final bins
  • string
Choose Assembler Choose Assembler The workflow can use either MEGAHIT or metaSPAdes as the assembler
  • string
Contamination weight (Binette) Contamination weight (Binette) Weight applied to contamination when scoring the bins. A low weight favors complete bins over bins with low contamination (--contamination_weight)
  • int
Custom Assemblies Custom Assemblies This workflow allows using a custom assembly as input. If provided, select `custom assembly` as Assembler. Provide one assembly for each group of trimmed input reads.
  • File[]?
Environment for the built-in model (SemiBin) Environment for the built-in model (SemiBin) Environment for the built-in model (SemiBin), options are: human_gut, dog_gut, ocean, soil, cat_gut, human_oral, mouse_gut, pig_gut, built_environment, wastewater, chicken_caecum, global
  • string
GTDB-tk Database GTDB-tk Database GTDB-tk Database used for Bin Taxonomy Classification
  • string
Maximum MAG contamination percentage Maximum MAG contamination percentage Maximum MAG contamination percentage for dereplication
  • int
Minimum MAG completeness percentage Minimum MAG completeness percentage Minimum MAG completeness percentage for bin refinement (Binette) and dereplication (dRep)
  • int
Minimum MAG length Minimum MAG length Minimum MAG length for dereplication
  • int
Minimum length of contigs to output Minimum length of contigs to output Minimum length of contigs to output (only for MEGAHIT).
  • int
Read length (CONCOCT) Read length (CONCOCT) CONCOCT requires the read length to compute coverage. FastQC is a good way to estimate the mean value.
  • int
Run Bakta on MAGs Run Bakta on MAGs Bakta can take a long time, but will give comprehensive genome annotation
  • boolean
Run GTDB-Tk on MAGs Run GTDB-Tk on MAGs GTDB-Tk can take a long time and requires a lot of memory. Run it only if taxonomic placement is required.
  • boolean
Trimmed reads Trimmed reads Reads from the original individual samples, used to estimate MAG abundance in those samples.
  • File[]
Trimmed reads from grouped samples Trimmed reads from grouped samples Provide quality-controlled, trimmed reads for assembly and binning. These can be pooled read groups, individual samples, or a mix of both.
  • File[]
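
The threshold inputs above interact in two places: the contamination weight shapes how bins are ranked, and the completeness, contamination, and length limits decide which MAGs enter dereplication. The following Python sketch illustrates that logic only. It assumes the score takes the form completeness minus weight times contamination, which matches the description of the contamination weight but is not taken from the workflow itself. The `BinQuality` record, the function names, the inclusive comparisons, and all example values are hypothetical.

```python
from dataclasses import dataclass

# Environment names copied from the SemiBin input description above.
SEMIBIN_ENVIRONMENTS = frozenset({
    "human_gut", "dog_gut", "ocean", "soil", "cat_gut", "human_oral",
    "mouse_gut", "pig_gut", "built_environment", "wastewater",
    "chicken_caecum", "global",
})


@dataclass(frozen=True)
class BinQuality:
    """Quality summary for one bin (hypothetical record, not a workflow output)."""
    name: str
    completeness: float   # percent, 0-100
    contamination: float  # percent, 0-100
    length: int           # total bin length in base pairs


def weighted_score(bin_q: BinQuality, contamination_weight: float) -> float:
    """Assumed score form: completeness minus weighted contamination."""
    return bin_q.completeness - contamination_weight * bin_q.contamination


def passes_filter(bin_q: BinQuality, min_completeness: float,
                  max_contamination: float, min_length: int) -> bool:
    """Check a bin against the three dereplication thresholds.

    Inclusive comparisons are an assumption of this sketch.
    """
    return (bin_q.completeness >= min_completeness
            and bin_q.contamination <= max_contamination
            and bin_q.length >= min_length)


# Two made-up bins: one more complete, one less contaminated.
bin_a = BinQuality("bin_a", completeness=95.0, contamination=8.0, length=3_000_000)
bin_b = BinQuality("bin_b", completeness=85.0, contamination=1.0, length=2_500_000)

# A low weight ranks the more complete bin first; a high weight reverses that.
low_weight_best = max((bin_a, bin_b), key=lambda b: weighted_score(b, 1))
high_weight_best = max((bin_a, bin_b), key=lambda b: weighted_score(b, 3))

# Only bins within all three limits would be passed on to dereplication.
kept = [b.name for b in (bin_a, bin_b)
        if passes_filter(b, min_completeness=75, max_contamination=5,
                         min_length=50_000)]
```

With these example values, `bin_a` ranks first at weight 1 and `bin_b` ranks first at weight 3, which is the trade-off the contamination weight description refers to.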

Steps

ID Name Description
18 Map parameter value toolshed.g2.bx.psu.edu/repos/iuc/map_param_value/map_param_value/0.2.0
19 Map parameter value toolshed.g2.bx.psu.edu/repos/iuc/map_param_value/map_param_value/0.2.0
20 Unzip collection __UNZIP_COLLECTION__
21 MEGAHIT toolshed.g2.bx.psu.edu/repos/iuc/megahit/megahit/1.2.9+galaxy2
22 metaSPAdes toolshed.g2.bx.psu.edu/repos/nml/metaspades/metaspades/4.1.0+galaxy0
23 Pick parameter value toolshed.g2.bx.psu.edu/repos/iuc/pick_value/pick_value/0.2.0
24 Quast toolshed.g2.bx.psu.edu/repos/iuc/quast/quast/5.3.0+galaxy0
25 Bowtie2 toolshed.g2.bx.psu.edu/repos/devteam/bowtie2/bowtie2/2.5.3+galaxy1
26 CONCOCT: Cut up contigs toolshed.g2.bx.psu.edu/repos/iuc/concoct_cut_up_fasta/concoct_cut_up_fasta/1.1.0+galaxy2
27 Samtools sort toolshed.g2.bx.psu.edu/repos/devteam/samtools_sort/samtools_sort/2.0.5
28 SemiBin toolshed.g2.bx.psu.edu/repos/iuc/semibin/semibin/2.0.2+galaxy1
29 CONCOCT: Generate the input coverage table toolshed.g2.bx.psu.edu/repos/iuc/concoct_coverage_table/concoct_coverage_table/1.1.0+galaxy2
30 Calculate contig depths toolshed.g2.bx.psu.edu/repos/iuc/metabat2_jgi_summarize_bam_contig_depths/metabat2_jgi_summarize_bam_contig_depths/2.17+galaxy0
31 Converts genome bins in fasta format toolshed.g2.bx.psu.edu/repos/iuc/fasta_to_contig2bin/Fasta_to_Contig2Bin/1.1.7+galaxy1
32 CONCOCT toolshed.g2.bx.psu.edu/repos/iuc/concoct/concoct/1.1.0+galaxy2
33 MaxBin2 toolshed.g2.bx.psu.edu/repos/mbernt/maxbin2/maxbin2/2.2.7+galaxy6
34 MetaBAT2 toolshed.g2.bx.psu.edu/repos/iuc/metabat2/metabat2/2.17+galaxy0
35 CONCOCT: Merge cut clusters toolshed.g2.bx.psu.edu/repos/iuc/concoct_merge_cut_up_clustering/concoct_merge_cut_up_clustering/1.1.0+galaxy2
36 Converts genome bins in fasta format toolshed.g2.bx.psu.edu/repos/iuc/fasta_to_contig2bin/Fasta_to_Contig2Bin/1.1.7+galaxy1
37 Converts genome bins in fasta format toolshed.g2.bx.psu.edu/repos/iuc/fasta_to_contig2bin/Fasta_to_Contig2Bin/1.1.7+galaxy1
38 CONCOCT: Extract a fasta file toolshed.g2.bx.psu.edu/repos/iuc/concoct_extract_fasta_bins/concoct_extract_fasta_bins/1.1.0+galaxy2
39 Converts genome bins in fasta format toolshed.g2.bx.psu.edu/repos/iuc/fasta_to_contig2bin/Fasta_to_Contig2Bin/1.1.7+galaxy1
40 Build list __BUILD_LIST__
41 Binette toolshed.g2.bx.psu.edu/repos/iuc/binette/binette/1.0.5+galaxy1
42 Pool Bins from all samples __FLATTEN__
43 checkm2 toolshed.g2.bx.psu.edu/repos/iuc/checkm2/checkm2/1.0.2+galaxy0
44 Text reformatting toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/9.5+galaxy0
45 dRep dereplicate toolshed.g2.bx.psu.edu/repos/iuc/drep_dereplicate/drep_dereplicate/3.5.0+galaxy1
46 GTDB-Tk Classify genomes toolshed.g2.bx.psu.edu/repos/iuc/gtdbtk_classify_wf/gtdbtk_classify_wf/2.4.0+galaxy0
47 checkm2 toolshed.g2.bx.psu.edu/repos/iuc/checkm2/checkm2/1.0.2+galaxy0
48 CheckM lineage_wf toolshed.g2.bx.psu.edu/repos/iuc/checkm_lineage_wf/checkm_lineage_wf/1.2.3+galaxy0
49 CoverM genome toolshed.g2.bx.psu.edu/repos/iuc/coverm_genome/coverm_genome/0.7.0+galaxy0
50 Quast toolshed.g2.bx.psu.edu/repos/iuc/quast/quast/5.3.0+galaxy0
51 Bakta toolshed.g2.bx.psu.edu/repos/iuc/bakta/bakta/1.9.4+galaxy0
52 Column join toolshed.g2.bx.psu.edu/repos/iuc/collection_column_join/collection_column_join/0.0.3
53 MultiQC toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.27+galaxy3
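
Most entries in the Description column of the steps table are Tool Shed identifiers, which appear to follow the pattern host/repos/owner/repository/tool/version, while built-in collection operations use names such as `__FLATTEN__`. As a minimal sketch under that assumption, the identifiers can be split into their parts to list tool versions. The `ToolShedId` type and `parse_tool_id` function are hypothetical helpers, not part of Galaxy.

```python
from typing import NamedTuple, Optional


class ToolShedId(NamedTuple):
    """Parts of a Tool Shed identifier (field names chosen for this sketch)."""
    host: str
    owner: str
    repository: str
    tool: str
    version: str


def parse_tool_id(tool_id: str) -> Optional[ToolShedId]:
    """Split a Tool Shed identifier; return None for built-in tools."""
    parts = tool_id.split("/")
    # Expect exactly: host / "repos" / owner / repository / tool / version
    if len(parts) != 6 or parts[1] != "repos":
        return None
    host, _, owner, repository, tool, version = parts
    return ToolShedId(host, owner, repository, tool, version)


# Identifiers copied from the steps table above.
megahit = parse_tool_id(
    "toolshed.g2.bx.psu.edu/repos/iuc/megahit/megahit/1.2.9+galaxy2")
awk = parse_tool_id(
    "toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_awk_tool/9.5+galaxy0")
builtin = parse_tool_id("__UNZIP_COLLECTION__")
```

Here `megahit.version` holds the tool version together with its Galaxy wrapper suffix, and `builtin` is `None` because the built-in operation has no Tool Shed path.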

Outputs

ID Name Description Type
Assembly Report Assembly Report n/a
  • File
Dereplicated Bins Dereplicated Bins n/a
  • File
Full MultiQC Report Full MultiQC Report n/a
  • File

Version History

v0.1 (earliest) Created 30th Apr 2025 at 03:02 by WorkflowHub Bot

Updated to v0.1


Frozen v0.1 7986c0c