• Taxonomy Profiling and Visualization with Krona
  • Pathogen Detection PathoGFAIR Samples Aggregation and Visualisation
  • Nanopore Preprocessing
  • Gene-based Pathogen Identification
  • Allele-based Pathogen Identification
  • bacterial_genome_annotation
  • amr_gene_detection
  • Quality and Contamination Control For Genome Assembly
  • Genome assembly with Flye
  • Assembly polishing with long reads
  • Bacterial Genome Assembly using Shovill
  • Mass spectrometry: LC-MS preprocessing with XCMS
  • Mass spectrometry: GCMS with metaMS
  • QIIME2 Ia: multiplexed data (single-end)
  • QIIME2 Ib: multiplexed data (paired-end)
  • QIIME2 Ic: Demultiplexed data (single-end)
  • QIIME2 Id: Demultiplexed data (paired-end)
  • QIIME2 IIa: Denoising (sequence quality control) and feature table creation (single-end)
  • QIIME2 IIb: Denoising (sequence quality control) and feature table creation (paired-end)
  • QIIME2-III-V-Phylogeny-Rarefaction-Taxonomic-Analysis
  • QIIME2 VI: Diversity metrics and estimations
  • dada2 amplicon analysis pipeline - for paired end data
  • MetaProSIP OpenMS 2.8
  • Clinical Metaproteomics Quantitation
  • Create GRO and TOP complex files
  • dcTMD calculations with GROMACS
  • MMGBSA calculations with GROMACS
  • Fragment-based virtual screening using rDock for docking and SuCOS for pose scoring
  • Generic variation analysis on WGS PE data
  • Generic variation analysis reporting
  • Segmentation and counting of cell nuclei in fluorescence microscopy images
  • Parallel Accession Download
  • sra_manifest_to_concatenated_fastqs_parallel
  • Pox Virus Illumina Amplicon Workflow from half-genomes
  • scRNA-seq_preprocessing_10X_cellPlex
  • scRNA-seq_preprocessing_10X_v3_Bundle
  • Velocyto-on10X-from-bundled
  • Velocyto-on10X-filtered-barcodes
  • baredSC_1d_logNorm
  • baredSC_2d_logNorm
  • COVID-19: variation analysis on WGS PE data
  • COVID-19: variation analysis reporting
  • COVID-19: consensus construction
  • COVID-19: variation analysis on ARTIC PE data
  • SARS-CoV-2 Illumina Amplicon pipeline - iVar based
  • COVID-19: variation analysis on WGS SE data
  • COVID-19: variation analysis of ARTIC ONT data
  • ATACseq
  • ChIPseq_SR
  • Get Confident Peaks From ChIP_SR replicates
  • Get Confident Peaks From ChIP_PE replicates
  • Get Confident Peaks From ATAC or CUTandRUN replicates
  • Hi-C_fastqToCool_hicup_cooler
  • cHi-C_fastqToCool_hicup_cooler
  • Hi-C_juicermediumtabixToCool_cooler
  • Hi-C_fastqToPairs_hicup
  • CUTandRUN
  • ChIPseq_PE
  • Average Bigwig between replicates
  • Purge-duplicate-contigs-VGP6
  • Assembly-Hifi-only-VGP3
  • Scaffolding with Hi-C data VGP8
  • Mitogenome-Assembly-VGP0
  • Assembly-Hifi-HiC-phasing-VGP4
  • Generate Nx and Size plots for multiple assemblies
  • Purging-duplicates-one-haplotype-VGP6b
  • Assembly-Hifi-Trio-phasing-VGP5
  • Scaffolding-BioNano-VGP7
  • Assembly-decontamination-VGP9
  • kmer-profiling-hifi-trio-VGP2
  • kmer-profiling-hifi-VGP1
  • RNAseq_PE
  • RNAseq_SR
  • BREW3R
  • Repeat masking with RepeatModeler and RepeatMasker

Taxonomy Profiling and Visualization with Krona

Microbiome - Taxonomy Profiling

name:Collection, name:microGalaxy, name:PathoGFAIR, name:IWC

Pathogen Detection PathoGFAIR Samples Aggregation and Visualisation

Report generation and visualization of the pathogens detected across all samples

name:Collection, name:microGalaxy, name:PathoGFAIR, name:IWC

Nanopore Preprocessing

Microbiome - QC and Contamination Filtering

name:Collection, name:microGalaxy, name:PathoGFAIR, name:Nanopore, name:IWC

Gene-based Pathogen Identification

Nanopore dataset analysis: phylogenetic identification, antibiotic resistance gene detection and contig building

name:Collection, name:PathoGFAIR, name:IWC, name:microGalaxy

Allele-based Pathogen Identification

Microbiome - Variant calling and Consensus Building

name:Collection, name:microGalaxy, name:PathoGFAIR, name:IWC

bacterial_genome_annotation

Annotation of assembled bacterial genomes to detect genes, potential plasmids, integrons and insertion sequence (IS) elements.

Genomics, fasta, ABRomics, bacterial-genomics, Annotation, genome-annotation

amr_gene_detection

Antimicrobial resistance gene detection from assembled bacterial genomes

fasta, Genomics, ABRomics, antibiotic-resistance, antimicrobial-resistance-genes, antimicrobial resistance, bacterial-genomics, AMR, AMR-detection

Quality and Contamination Control For Genome Assembly

Short paired-end read analysis providing quality analysis, read cleaning and taxonomic assignment

Genomics, fastq, bacterial-genomics, taxonomy-assignment, paired-end, quality, ABRomics, trimming

Genome assembly with Flye

Assemble long reads with Flye, then view assembly statistics and assembly graph


Assembly polishing with long reads

Four rounds of Racon polishing with long reads

Bacterial Genome Assembly using Shovill

Assembly of bacterial paired-end short read data with generation of quality metrics and reports

fastq, Genomics, bacterial-genomics, paired-end, assembly, quality, ABRomics

Mass spectrometry: LC-MS preprocessing with XCMS

This workflow is built on the XCMS R package (Smith, C.A. 2006), which can extract, filter, align and gap-fill peaks, with the option to annotate isotopes, adducts and fragments using the CAMERA R package (Kuhl, C. 2012). https://training.galaxyproject.org/training-material/topics/metabolomics/tutorials/lcms-preprocessing/tutorial.html

metabolomics, MS, LC-MS, workflow4metabolomics, xcms, GTN

Mass spectrometry: GCMS with metaMS

This workflow is built on the XCMS R package (Smith, C.A. 2006) for peak extraction and the metaMS R package (Wehrens, R. 2014) for untargeted metabolomics. https://training.galaxyproject.org/training-material/topics/metabolomics/tutorials/gcms/tutorial.html

metabolomics, MS, workflow4metabolomics, GC-MS, GTN, metaMS

QIIME2 Ia: multiplexed data (single-end)

Importing single-end multiplexed data (not demultiplexed yet)


QIIME2 Ib: multiplexed data (paired-end)

Importing paired-end multiplexed data (not demultiplexed yet)


QIIME2 Ic: Demultiplexed data (single-end)

Importing demultiplexed data (single-end)


QIIME2 Id: Demultiplexed data (paired-end)

Importing demultiplexed data (paired-end)


QIIME2 IIa: Denoising (sequence quality control) and feature table creation (single-end)

Use DADA2 for sequence quality control. DADA2 is a pipeline for detecting and correcting (where possible) errors in Illumina amplicon sequence data. As implemented in the q2-dada2 plugin, this quality control process additionally filters out any phiX reads (commonly present in marker gene Illumina sequence data) that are identified in the sequencing data, and filters out chimeric sequences.

QIIME2 IIb: Denoising (sequence quality control) and feature table creation (paired-end)

Use DADA2 for sequence quality control. DADA2 is a pipeline for detecting and correcting (where possible) errors in Illumina amplicon sequence data. As implemented in the q2-dada2 plugin, this quality control process additionally filters out any phiX reads (commonly present in marker gene Illumina sequence data) that are identified in the sequencing data, and filters out chimeric sequences.

QIIME2-III-V-Phylogeny-Rarefaction-Taxonomic-Analysis

This workflow performs three steps:
- Phylogeny reconstruction (inserting fragments into a reference)
- Alpha rarefaction analysis
- Taxonomic analysis

QIIME2 VI: Diversity metrics and estimations

The first step in hypothesis testing in microbial ecology is typically to look at within- (alpha) and between-sample (beta) diversity. We can calculate diversity metrics, apply appropriate statistical tests, and visualize the data using the q2-diversity plugin.


dada2 amplicon analysis pipeline - for paired end data

dada2 amplicon analysis for paired-end data. The workflow has three main outputs:
- the sequence table (output of makeSequenceTable)
- the taxonomy (output of assignTaxonomy)
- the counts, which track the number of sequences in the samples through the steps (output of sequence counts)

name:amplicon

MetaProSIP OpenMS 2.8

Automated inference of stable isotope incorporation rates in proteins for functional metaproteomics


Clinical Metaproteomics Quantitation

Clinical Metaproteomics 4: Quantitation

name:clinicalMP

Create GRO and TOP complex files

dcTMD calculations with GROMACS

Perform dcTMD free energy simulations and calculations


MMGBSA calculations with GROMACS

MMGBSA simulation and calculation


Fragment-based virtual screening using rDock for docking and SuCOS for pose scoring

Virtual screening of the SARS-CoV-2 main protease with rDock and pose scoring


Generic variation analysis on WGS PE data

Workflow for variant analysis against a reference genome in GenBank format

mpxv, generic

Generic variation analysis reporting

This workflow takes a VCF dataset of variants produced by any of the variant calling workflows in https://github.com/galaxyproject/iwc/tree/main/workflows/sars-cov-2-variant-calling and generates tabular lists of variants by Samples and by Variant, and an overview plot of variants and their allele frequencies.

mpvx, generic

Segmentation and counting of cell nuclei in fluorescence microscopy images

This workflow performs segmentation and counting of cell nuclei using fluorescence microscopy images. The segmentation step is performed using Otsu thresholding (Otsu, 1979). The workflow is based on the tutorial: https://training.galaxyproject.org/training-material/topics/imaging/tutorials/imaging-introduction/tutorial.html


Parallel Accession Download

Downloads fastq files for sequencing run accessions provided in a text file using fasterq-dump. Creates one job per listed run accession.


sra_manifest_to_concatenated_fastqs_parallel

This workflow takes as input an SRA_manifest from SRA Run Selector and generates one fastq file or one pair of fastq files for each experiment (concatenating multiple runs if necessary). Outputs are relabelled to match the column specified by the user.

Pox Virus Illumina Amplicon Workflow from half-genomes

A workflow for the analysis of pox virus genomes sequenced as half-genomes (for ITR resolution) in a tiled-amplicon approach

pox, virology

scRNA-seq_preprocessing_10X_cellPlex

This workflow processes the CMO fastqs with CITE-seq-Count and includes the translation step required for cellPlex processing. In parallel, it processes the Gene Expression fastqs with STARsolo, filters cells with DropletUtils and reformats all outputs so they can be read directly by the function 'Read10X' from Seurat.

#single-cell

scRNA-seq_preprocessing_10X_v3_Bundle

This workflow processes the Gene Expression fastqs with STARsolo, filters cells with DropletUtils and reformats all outputs so they can be read directly by the function 'Read10X' from Seurat.

#single-cell

Velocyto-on10X-from-bundled

Run velocyto to get a loom file with spliced and unspliced counts. The 'barcodes' are extracted from the bundled outputs.

name:single-cell

Velocyto-on10X-filtered-barcodes

Run velocyto to get a loom file with spliced and unspliced counts.

name:single-cell

baredSC_1d_logNorm

Run baredSC in 1 dimension in logNorm for 1 to N gaussians and combine models.


baredSC_2d_logNorm

Run baredSC in 2 dimensions in logNorm for 1 to N gaussians and combine models.


COVID-19: variation analysis on WGS PE data

This workflow performs paired-end read mapping with bwa-mem followed by sensitive variant calling across a wide range of allele frequencies (AFs) with lofreq

COVID-19, covid19.galaxyproject.org, iwc, emergen_validated

COVID-19: variation analysis reporting

This workflow takes a VCF dataset of variants produced by any of the *-variant-calling workflows in https://github.com/galaxyproject/iwc/tree/main/workflows/sars-cov-2-variant-calling and generates tabular lists of variants by Samples and by Variant, and an overview plot of variants and their allele frequencies.

COVID-19, covid19.galaxyproject.org

COVID-19: consensus construction

Build a consensus sequence from FILTER PASS variants with intrasample allele-frequency above a configurable consensus threshold. Hard-mask regions with low coverage (but not consensus variants within them) and ambiguous sites.

COVID-19, covid19.galaxyproject.org
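
The masking rule described for consensus construction can be sketched in Python as follows. This is a minimal illustration restricted to single-nucleotide variants, not the workflow's actual implementation. The function name `build_consensus`, the dictionary layout of a variant, and the default thresholds are all hypothetical. In particular, treating "ambiguous sites" as PASS variants whose allele frequency lies between `ambiguous_min_af` and the consensus threshold is an assumption made for this example.

```python
from typing import Dict, List


def build_consensus(
    reference: str,
    variants: List[Dict],
    depth: List[int],
    min_af: float = 0.7,            # hypothetical consensus threshold
    min_depth: int = 5,             # hypothetical low-coverage cutoff
    ambiguous_min_af: float = 0.25  # hypothetical lower bound for "ambiguous"
) -> str:
    """Apply PASS SNVs above min_af, then hard-mask with N."""
    seq = list(reference)
    consensus_sites = set()
    ambiguous_sites = set()

    for variant in variants:
        if variant["filter"] != "PASS":
            continue  # only FILTER PASS variants are considered
        index = variant["pos"] - 1  # VCF positions are 1-based
        if variant["af"] >= min_af:
            seq[index] = variant["alt"]
            consensus_sites.add(index)
        elif variant["af"] >= ambiguous_min_af:
            ambiguous_sites.add(index)  # assumption: see lead-in

    for index in range(len(seq)):
        low_coverage = depth[index] < min_depth
        # Low-coverage positions are masked unless they carry a consensus variant.
        if index in ambiguous_sites or (low_coverage and index not in consensus_sites):
            seq[index] = "N"

    return "".join(seq)


example = build_consensus(
    reference="ACGTACGT",
    variants=[
        {"pos": 2, "alt": "T", "af": 0.90, "filter": "PASS"},
        {"pos": 3, "alt": "A", "af": 0.95, "filter": "PASS"},     # low coverage, kept
        {"pos": 6, "alt": "G", "af": 0.40, "filter": "PASS"},     # ambiguous, masked
        {"pos": 8, "alt": "C", "af": 0.90, "filter": "LowQual"},  # ignored
    ],
    depth=[10, 10, 2, 2, 10, 10, 10, 10],
)
```

In this example the variant at position 3 survives even though its coverage is low, while the neighbouring reference base at position 4 is masked.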

COVID-19: variation analysis on ARTIC PE data

The workflow for Illumina-sequenced ARTIC data builds on the RNASeq workflow for paired-end data, using the same steps for mapping and variant calling, but adds extra logic for trimming ARTIC primer sequences off reads with the ivar package. In addition, this workflow uses ivar to identify amplicons affected by ARTIC primer-binding site mutations and tries to exclude reads derived from such tainted amplicons when calculating allele frequencies of other variants.

COVID-19, ARTIC, covid19.galaxyproject.org

SARS-CoV-2 Illumina Amplicon pipeline - iVar based

Find and annotate variants in ampliconic SARS-CoV-2 Illumina sequencing data and classify samples with pangolin and nextclade

COVID-19, ARTIC, iwc

COVID-19: variation analysis on WGS SE data

This workflow performs single-end read mapping with bowtie2 followed by sensitive variant calling across a wide range of allele frequencies (AFs) with lofreq

COVID-19, covid19.galaxyproject.org

COVID-19: variation analysis of ARTIC ONT data

This workflow for ONT-sequenced ARTIC data is modeled after the alignment/variant-calling steps of the [ARTIC pipeline](https://artic.readthedocs.io/en/latest/). It performs essentially the same steps as that pipeline's minion command, i.e. read mapping with minimap2 and variant calling with medaka. Like the Illumina ARTIC workflow, it uses ivar for primer trimming. Since ONT-sequenced reads have a much higher error rate than Illumina-sequenced reads and are therefore more prone to false-positive variant calls, this workflow makes no attempt to handle amplicons affected by potential primer-binding site mutations.

COVID-19, ARTIC, ONT, covid19.galaxyproject.org

ATACseq

This workflow takes as input a collection of paired fastqs. It removes low-quality bases and adapters with cutadapt, then maps with Bowtie2 in end-to-end mode. It removes reads on MT, non-concordant pairs, pairs with mapping quality below 30, and PCR duplicates. It computes the pile-up on the 5' end +/- 100 bp, calls peaks, and counts the number of reads falling in the 1 kb region centered on each summit. It computes two normalizations for coverage: per million reads and per million reads in peaks. It also plots the number of reads for each fragment length.

ATACseq

ChIPseq_SR

This workflow takes as input a collection of fastqs (single reads). It removes adapters with cutadapt, maps with bowtie2 and keeps reads with MAPQ of at least 30. Peaks are called with MACS2 on the BAM, using either a fixed extension or a model.

ChIP

Get Confident Peaks From ChIP_SR replicates

This workflow takes as input SR BAM files from ChIP-seq. It calls peaks on each replicate and intersects them. In parallel, each BAM is subset to the smallest number of reads. Peaks are called using all subsets combined. Only peaks called on the combination of all subsets whose summits intersect the intersection of at least x replicates are kept.

ATAC-seq

Get Confident Peaks From ChIP_PE replicates

This workflow takes as input PE BAM files from ChIP-seq. It calls peaks on each replicate and intersects them. In parallel, each BAM is subset to the smallest number of reads. Peaks are called using all subsets combined. Only peaks called on the combination of all subsets whose summits intersect the intersection of at least x replicates are kept.

ATAC-seq

Get Confident Peaks From ATAC or CUTandRUN replicates

This workflow takes as input BAM files from ATAC-seq or CUT&RUN. It calls peaks on each replicate and intersects them. In parallel, each BAM is subset to the smallest number of reads. Peaks are called using all subsets combined. Only peaks called on the combination of all subsets whose summits intersect the intersection of at least x replicates are kept.

ATAC-seq
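
The replicate filter shared by the three "Get Confident Peaks" workflows can be sketched in Python as follows. This is an illustration only, under simplifying assumptions: a single chromosome, half-open intervals, and no overlapping peaks within one replicate. The function names `covered_by_at_least` and `keep_confident_peaks` and the tuple layouts are hypothetical and do not come from the workflows.

```python
from typing import List, Tuple

Interval = Tuple[int, int]   # half-open [start, end) on one chromosome
Peak = Tuple[int, int, int]  # (start, end, summit position)


def covered_by_at_least(replicates: List[List[Interval]], x: int) -> List[Interval]:
    """Return the regions covered by peaks from at least x replicates."""
    events = []
    for peaks in replicates:
        for start, end in peaks:
            events.append((start, 1))
            events.append((end, -1))
    # At equal positions, ends (-1) sort before starts (+1), so touching
    # intervals are not counted as overlapping.
    events.sort()

    regions = []
    depth = 0
    region_start = None
    for position, delta in events:
        depth += delta
        if depth >= x and region_start is None:
            region_start = position
        elif depth < x and region_start is not None:
            regions.append((region_start, position))
            region_start = None
    return regions


def keep_confident_peaks(combined: List[Peak], regions: List[Interval]) -> List[Peak]:
    """Keep combined-subset peaks whose summit falls inside a shared region."""
    return [
        peak for peak in combined
        if any(start <= peak[2] < end for start, end in regions)
    ]


replicate_peaks = [
    [(100, 200), (500, 600)],
    [(150, 250)],
    [(180, 300), (550, 650)],
]
shared = covered_by_at_least(replicate_peaks, x=2)
confident = keep_confident_peaks(
    [(90, 310, 190), (480, 520, 505), (540, 660, 580)], shared
)
```

Here the second combined peak is dropped because its summit at 505 lies in a region supported by only one replicate.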

Hi-C_fastqToCool_hicup_cooler

This workflow takes as input a collection of paired fastqs. It uses HiCUP to go from fastq to validPair file, using the middle of the fragment as coordinates. The pairs are filtered for MAPQ and sorted by cooler to generate a tabix dataset. Cooler is used to generate a balanced cool file at the desired resolution.

Hi-C

cHi-C_fastqToCool_hicup_cooler

This workflow takes as input a collection of paired fastqs. It uses HiCUP to go from fastq to validPair file. The pairs are filtered for MAPQ and for the captured region. Then, they are sorted by cooler to generate a tabix dataset. Cooler is used to generate a balanced cool file at the desired resolution.

Hi-C

Hi-C_juicermediumtabixToCool_cooler

This workflow takes as input a collection of juicer medium tabix files and a genome name. It builds a balanced cool file at the desired resolution.

Hi-C

Hi-C_fastqToPairs_hicup

This workflow takes as input a collection of paired fastqs. It uses HiCUP to go from fastq to validPair file. First, the fastqs are truncated using the cutting sequence to guess the fill-in. The truncated fastqs are then mapped. Pairs are assigned to fragments and filtered to remove self-ligated, dangling-end and internal pairs (a size filter can also be applied). Duplicates are then removed. The output is converted to be compatible with juicebox or cooler, using the middle of the fragment as coordinates. Finally, pairs are filtered for mapping quality.

Hi-C

CUTandRUN

This workflow takes as input a collection of paired fastqs. It removes adapters with cutadapt and maps pairs with bowtie2, allowing dovetailing. It keeps concordant pairs with MAPQ of at least 30, converts the BAM to BED, and calls peaks with MACS2 using "ATAC" parameters.

CUTnRUN

ChIPseq_PE

This workflow takes as input a collection of paired fastqs. It removes adapters with cutadapt and maps pairs with bowtie2. It keeps concordant pairs with MAPQ of at least 30 and calls peaks with MACS2 on the paired BAM.

ChIP

Average Bigwig between replicates

We assume the identifiers of the input list are like: sample_name_replicateID. The identifiers of the output list will be: sample_name
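
The naming convention above can be illustrated with a short Python sketch. It assumes the replicate ID is whatever follows the last underscore in each identifier. The function name `group_by_sample` and the example identifiers are hypothetical and are not part of the workflow.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_sample(identifiers: List[str]) -> Dict[str, List[str]]:
    """Group identifiers shaped like sample_name_replicateID by sample_name."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for identifier in identifiers:
        # Assumption: the replicate ID is the part after the LAST underscore.
        sample_name, _, _replicate_id = identifier.rpartition("_")
        if not sample_name:
            # No underscore found: treat the whole identifier as the sample name.
            sample_name = identifier
        groups[sample_name].append(identifier)
    return dict(groups)


groups = group_by_sample(["wt_liver_rep1", "wt_liver_rep2", "ko_liver_rep1"])
```

Each key of `groups` corresponds to one identifier in the output list, and its values are the replicates that would be averaged together.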


Purge-duplicate-contigs-VGP6

Purge contigs marked as duplicates by purge_dups (could be haplotypic duplication or overlap duplication). This workflow is the 6th workflow of the VGP pipeline. It is meant to be run after one of the contigging steps (Workflow 3, 4, or 5)

VGP_curated

Assembly-Hifi-only-VGP3

VGP, Reviewed

Scaffolding with Hi-C data VGP8

Scaffolding using HiC data with YAHS.

VGP_curated

Mitogenome-Assembly-VGP0

Reviewed, VGP

Assembly-Hifi-HiC-phasing-VGP4

VGP, Reviewed

Generate Nx and Size plots for multiple assemblies

Purging-duplicates-one-haplotype-VGP6b

VGP_curated

Assembly-Hifi-Trio-phasing-VGP5

VGP, Reviewed

Scaffolding-BioNano-VGP7

VGP_curated

Assembly-decontamination-VGP9

VGP_curated

kmer-profiling-hifi-trio-VGP2

Create Meryl Database used for the estimation of assembly parameters and quality control with Merqury. Part of the VGP pipeline.

Reviewed, VGP

kmer-profiling-hifi-VGP1

Reviewed, VGP

RNAseq_PE

This workflow takes as input a list of paired-end fastqs. Adapters and low-quality bases are removed with cutadapt. Reads are mapped with STAR using ENCODE parameters; genes are counted at the same time, and normalized coverage (per million mapped reads) is computed on uniquely mapped reads. The counts are reprocessed to be similar to HTSeq-count output. FPKM values are computed with cufflinks and/or StringTie. The unstranded normalized coverage is computed with bedtools.

RNAseq

RNAseq_SR

This workflow takes as input a list of single-read fastqs. Adapters and low-quality bases are removed with cutadapt. Reads are mapped with STAR using ENCODE parameters; genes are counted at the same time, and normalized coverage (per million mapped reads) is computed on uniquely mapped reads. The counts are reprocessed to be similar to HTSeq-count output. FPKM values are computed with cufflinks and/or StringTie. The unstranded normalized coverage is computed with bedtools.

RNAseq

BREW3R

This workflow takes a collection of BAM (output of STAR) and a gtf. It extends the input gtf using de novo annotation.


Repeat masking with RepeatModeler and RepeatMasker