Quality and Contamination Control For Genome Assembly
Short paired-end read analysis providing quality control, read cleaning, and taxonomic assignment
- Author(s):
- Release: 1.1.4
- License: GPL-3.0-or-later
- UniqueID: eb1c667d-639d-4403-a2b4-1cb6683e0fa5
Quality and Contamination control workflow for paired-end data (v1.0)
This workflow takes paired-end Illumina fastq(.gz) files as input and runs the following steps:
- Quality control and trimming
  - fastp for quality control and trimming
- Taxonomic assignment on trimmed data
  - Kraken2 for taxonomic assignment of reads
  - Bracken to re-estimate abundance at the species level
  - Recentrifuge to generate a Krona chart
- Aggregating outputs into a single JSON file
  - ToolDistillator to extract and aggregate information from the different tool outputs into parsable JSON files
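For readers who want to see roughly what these steps look like outside the workflow engine, the following is a minimal sketch that only builds the command lines for fastp, Kraken2, Bracken, and Recentrifuge (`rcf`). It is not the workflow's actual configuration: the function name `build_commands`, the output file names, the `read_length` value, and the database paths are all assumptions made for illustration, and the workflow itself may run these tools with different parameters.

```python
import shlex
from pathlib import Path


def build_commands(r1, r2, kraken_db, taxdump_dir,
                   outdir="qc_out", read_length=150, threads=4):
    """Return one argument list per step; nothing is executed here.

    All output file names below are hypothetical. Each list can be passed
    to subprocess.run(cmd, check=True) once the tools are installed.
    """
    out = Path(outdir)
    trimmed_r1 = out / "trimmed_R1.fastq.gz"
    trimmed_r2 = out / "trimmed_R2.fastq.gz"
    kraken_report = out / "kraken2_report.txt"
    kraken_output = out / "kraken2_output.krk"

    # Step 1: quality control and trimming of the raw read pair.
    fastp = [
        "fastp",
        "-i", str(r1), "-I", str(r2),
        "-o", str(trimmed_r1), "-O", str(trimmed_r2),
        "--json", str(out / "fastp.json"),
        "--html", str(out / "fastp.html"),
    ]

    # Step 2: taxonomic assignment of the trimmed reads.
    kraken2 = [
        "kraken2",
        "--db", str(kraken_db),
        "--threads", str(threads),
        "--paired", "--gzip-compressed",
        "--report", str(kraken_report),
        "--output", str(kraken_output),
        str(trimmed_r1), str(trimmed_r2),
    ]

    # Step 3: species-level abundance re-estimation from the Kraken2 report.
    # The database must contain Bracken files built for this read length.
    bracken = [
        "bracken",
        "-d", str(kraken_db),
        "-i", str(kraken_report),
        "-o", str(out / "bracken_species.tsv"),
        "-r", str(read_length),
        "-l", "S",
    ]

    # Step 4: interactive chart from the per-read Kraken2 output.
    # taxdump_dir is assumed to hold the NCBI nodes.dmp and names.dmp files.
    recentrifuge = [
        "rcf",
        "-n", str(taxdump_dir),
        "-k", str(kraken_output),
        "-o", str(out / "recentrifuge.html"),
    ]

    return [fastp, kraken2, bracken, recentrifuge]


if __name__ == "__main__":
    for cmd in build_commands("sample_R1.fastq.gz", "sample_R2.fastq.gz",
                              "kraken2_db", "taxdump"):
        print(shlex.join(cmd))
```

Building the commands as lists rather than running them directly keeps the sketch easy to inspect and makes the data flow visible: the trimmed reads feed Kraken2, the Kraken2 report feeds Bracken, and the per-read Kraken2 output feeds Recentrifuge.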
Inputs
- Paired-end Illumina raw reads in fastq(.gz) format.
Outputs
- Quality control:
  - Quality report
  - Trimmed reads
- Taxonomic assignment:
  - Tabular report of identified species
  - Tabular file with reads assigned to a taxonomic level
  - Krona chart to illustrate the species diversity of the sample
- Aggregating outputs:
  - JSON file with information about the outputs of fastp, Kraken2, Bracken, and Recentrifuge
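To illustrate the idea behind the aggregated JSON output, here is a minimal sketch that merges a JSON report and a tab-separated report into one dictionary keyed by tool name. This is not ToolDistillator's implementation or its output schema: the `aggregate` and `load_tabular` helpers, the file names, and the resulting layout are hypothetical, and the sketch assumes the tabular file has a header row (as Bracken's species table does).

```python
import csv
import json


def load_tabular(path):
    """Read a tab-separated file with a header row into a list of dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def aggregate(json_reports, tabular_reports):
    """Merge per-tool reports into one dict keyed by tool name.

    json_reports:    mapping of tool name -> path to a JSON report
    tabular_reports: mapping of tool name -> path to a TSV with a header row
    Values read from TSV files stay as strings; no type conversion is done.
    """
    merged = {}
    for tool, path in json_reports.items():
        with open(path) as handle:
            merged[tool] = json.load(handle)
    for tool, path in tabular_reports.items():
        merged[tool] = load_tabular(path)
    return merged


def write_summary(merged, path):
    """Write the merged dictionary to a single JSON file."""
    with open(path, "w") as handle:
        json.dump(merged, handle, indent=2)
```

A call such as `write_summary(aggregate({"fastp": "fastp.json"}, {"bracken": "bracken_species.tsv"}), "summary.json")` would then produce one file that downstream scripts can load with a single `json.load`, which is the practical benefit of the aggregation step.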