Nanopore Preprocessing
Microbiome - QC and Contamination Filtering
- Author(s):
- Release: 0.1
- License: MIT
- UniqueID: 0b976704-e1e0-4cc3-9e75-38bda67c01ff
Before starting any analysis, it is always a good idea to assess the quality of your input data and to discard poor-quality base content by trimming and filtering reads.
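As a rough illustration of what quality filtering means for reads in fastqsanger format (Phred+33 quality encoding), the sketch below keeps only reads that meet a minimum length and a minimum mean quality. It is not the tool chain used by this workflow. The function names and the thresholds `min_length` and `min_mean_q` are hypothetical, and the mean is a simple arithmetic mean of Phred scores, which dedicated Nanopore tools may compute differently.

```python
from typing import Iterable, Iterator, Tuple

# (header, sequence, quality string) for one FASTQ record
Read = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[Read]:
    """Yield records from well-formed, 4-line-per-record FASTQ text."""
    it = iter(lines)
    # zip over the same iterator four times groups lines into records;
    # an incomplete trailing record is silently dropped.
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def mean_phred(qual: str) -> float:
    """Arithmetic mean of Phred scores, assuming Phred+33 (Sanger) encoding."""
    if not qual:
        return 0.0
    return sum(ord(c) - 33 for c in qual) / len(qual)


def filter_reads(
    reads: Iterable[Read],
    min_length: int = 1000,    # hypothetical threshold
    min_mean_q: float = 10.0,  # hypothetical threshold
) -> Iterator[Read]:
    """Keep reads that are long enough and have a high enough mean quality."""
    for header, seq, qual in reads:
        if len(seq) >= min_length and mean_phred(qual) >= min_mean_q:
            yield header, seq, qual
```

For a plain-text FASTQ file, `filter_reads(parse_fastq(open(path)))` would iterate over the retained reads; a `fastqsanger.gz` file could be opened with `gzip.open(path, "rt")` instead.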
Generally, we are not interested in the host sequences, only in those originating from the pathogen itself. It is therefore important to remove all host sequences and retain only the sequences that might originate from a pathogen, both to speed up further steps and to keep host sequences from compromising the analysis.
Input Datasets
- Collection of sequenced Nanopore reads of all samples to be analysed, in fastqsanger or fastqsanger.gz format.
Output Datasets
- Collection of pre-processed sequenced reads of all samples, ready for further analysis with the other workflows, in fastqsanger or fastqsanger.gz format.
- Tables indicating the total number of reads before and after host sequence removal, and the percentage of host sequences found in each sample.
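The relationship between the read counts and the host percentage in these tables can be sketched as follows. This is a minimal example that assumes the percentage is the share of the original reads removed as host; the exact definition used by the workflow is not stated here, and the function name and dictionary keys are hypothetical.

```python
def host_summary(total_before: int, total_after: int) -> dict:
    """Summarise host removal for one sample.

    Assumes the host percentage is (reads removed) / (reads before) * 100.
    """
    if total_before < 0 or total_after < 0:
        raise ValueError("read counts must be non-negative")
    if total_after > total_before:
        raise ValueError("cannot have more reads after removal than before")

    host_reads = total_before - total_after
    # Avoid dividing by zero for an empty sample.
    host_pct = 100.0 * host_reads / total_before if total_before else 0.0
    return {
        "reads_before": total_before,
        "reads_after": total_after,
        "host_reads": host_reads,
        "host_percent": host_pct,
    }
```

For example, a sample with 2000 reads before and 1500 reads after host removal has 500 host reads, which is 25% of the input.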
If you're unsure how to use this workflow, or if you want to see it in action with test datasets, it is included in our detailed training material for foodborne pathogen detection and tracking. You can find step-by-step instructions and practical examples in the following GTN tutorial