Nanopore Preprocessing
microbiome-pathogen-detection-from-nanopore-foodborne-data/nanopore-preprocessing
flowchart TD
  0["ℹ️ Input Parameter\nsamples_profile"]; style 0 fill:#ded,stroke:#393,stroke-width:4px;
  1["ℹ️ Input Collection\ncollection_of_all_samples"]; style 1 stroke:#2c3143,stroke-width:4px;
  2["Porechop"]; 1 -->|output| 2;
    34ea26db-11cb-41ee-85c3-75af8a53a2c0["Output\nporechop_output_trimmed_reads"]; 2 --> 34ea26db-11cb-41ee-85c3-75af8a53a2c0; style 34ea26db-11cb-41ee-85c3-75af8a53a2c0 stroke:#2c3143,stroke-width:4px;
  3["NanoPlot"]; 1 -->|output| 3;
    15ecf5b1-e0eb-405a-ac3a-359feb66d4cd["Output\nnanoplot_qc_on_reads_before_preprocessing_nanostats"]; 3 --> 15ecf5b1-e0eb-405a-ac3a-359feb66d4cd; style 15ecf5b1-e0eb-405a-ac3a-359feb66d4cd stroke:#2c3143,stroke-width:4px;
    304110f9-60d0-4ba2-8b3b-fae0e2a49554["Output\nnanoplot_on_reads_before_preprocessing_nanostats_post_filtering"]; 3 --> 304110f9-60d0-4ba2-8b3b-fae0e2a49554; style 304110f9-60d0-4ba2-8b3b-fae0e2a49554 stroke:#2c3143,stroke-width:4px;
    f2bd0a1f-cd60-4a36-a7d0-8025cc19ea2e["Output\nnanoplot_qc_on_reads_before_preprocessing_html_report"]; 3 --> f2bd0a1f-cd60-4a36-a7d0-8025cc19ea2e; style f2bd0a1f-cd60-4a36-a7d0-8025cc19ea2e stroke:#2c3143,stroke-width:4px;
  4["FastQC"]; 1 -->|output| 4;
    d0a64624-05d0-4068-835b-a025fc011760["Output\nfastqc_quality_check_before_preprocessing_html_file"]; 4 --> d0a64624-05d0-4068-835b-a025fc011760; style d0a64624-05d0-4068-835b-a025fc011760 stroke:#2c3143,stroke-width:4px;
    e61fef5d-1bc8-4c8e-be6a-f74e210e9920["Output\nfastqc_quality_check_before_preprocessing_text_file"]; 4 --> e61fef5d-1bc8-4c8e-be6a-f74e210e9920; style e61fef5d-1bc8-4c8e-be6a-f74e210e9920 stroke:#2c3143,stroke-width:4px;
  5["fastp"]; 2 -->|outfile| 5;
    a2219483-50b7-4aed-98dd-333ad2e12eb8["Output\nnanopore_sequenced_reads_processed_with_fastp_after_host_removal"]; 5 --> a2219483-50b7-4aed-98dd-333ad2e12eb8; style a2219483-50b7-4aed-98dd-333ad2e12eb8 stroke:#2c3143,stroke-width:4px;
    2a9a8b4d-458b-40e7-9a21-fb7108d5bbe4["Output\nnanopore_sequenced_reads_processed_with_fastp_after_host_removal_html_report"]; 5 --> 2a9a8b4d-458b-40e7-9a21-fb7108d5bbe4; style 2a9a8b4d-458b-40e7-9a21-fb7108d5bbe4 stroke:#2c3143,stroke-width:4px;
  6["MultiQC"]; 4 -->|text_file| 6;
    ebffe782-a56c-431f-8af4-c0cb8d7a02fc["Output\nmultiQC_stats_before_preprocessing"]; 6 --> ebffe782-a56c-431f-8af4-c0cb8d7a02fc; style ebffe782-a56c-431f-8af4-c0cb8d7a02fc stroke:#2c3143,stroke-width:4px;
    0f92196d-047d-4918-819d-c0ff7cd3ae85["Output\nmultiQC_html_report_before_preprocessing"]; 6 --> 0f92196d-047d-4918-819d-c0ff7cd3ae85; style 0f92196d-047d-4918-819d-c0ff7cd3ae85 stroke:#2c3143,stroke-width:4px;
  7["Map with minimap2"]; 0 -->|output| 7; 5 -->|out1| 7;
    9d7bb3b7-09a1-401f-a132-bb35a53375ea["Output\nbam_map_to_host"]; 7 --> 9d7bb3b7-09a1-401f-a132-bb35a53375ea; style 9d7bb3b7-09a1-401f-a132-bb35a53375ea stroke:#2c3143,stroke-width:4px;
  8["NanoPlot"]; 5 -->|out1| 8;
    b5899290-4c57-4662-ad22-860654652ade["Output\nnanoplot_qc_on_reads_after_preprocessing_html_report"]; 8 --> b5899290-4c57-4662-ad22-860654652ade; style b5899290-4c57-4662-ad22-860654652ade stroke:#2c3143,stroke-width:4px;
    949bfdf5-3d79-4dad-bdd8-c3a25e6af4cf["Output\nnanoplot_on_reads_after_preprocessing_nanostats_post_filtering"]; 8 --> 949bfdf5-3d79-4dad-bdd8-c3a25e6af4cf; style 949bfdf5-3d79-4dad-bdd8-c3a25e6af4cf stroke:#2c3143,stroke-width:4px;
    42db7f93-919e-4bbb-81a1-06411a9da410["Output\nnanoplot_qc_on_reads_after_preprocessing_nanostats"]; 8 --> 42db7f93-919e-4bbb-81a1-06411a9da410; style 42db7f93-919e-4bbb-81a1-06411a9da410 stroke:#2c3143,stroke-width:4px;
  9["FastQC"]; 5 -->|out1| 9;
    09306471-afa0-4106-9cc7-259b93dfc862["Output\nfastqc_quality_check_after_preprocessing_text_file"]; 9 --> 09306471-afa0-4106-9cc7-259b93dfc862; style 09306471-afa0-4106-9cc7-259b93dfc862 stroke:#2c3143,stroke-width:4px;
    084f982f-20f1-457e-8012-91ebbb85633d["Output\nfastqc_quality_check_after_preprocessing_html_file"]; 9 --> 084f982f-20f1-457e-8012-91ebbb85633d; style 084f982f-20f1-457e-8012-91ebbb85633d stroke:#2c3143,stroke-width:4px;
  10["Split BAM by reads mapping status"]; 7 -->|alignment_output| 10;
    14a53fe2-f296-43aa-86b7-243278c1050c["Output\nnon_host_sequences_bam"]; 10 --> 14a53fe2-f296-43aa-86b7-243278c1050c; style 14a53fe2-f296-43aa-86b7-243278c1050c stroke:#2c3143,stroke-width:4px;
    3b1e626f-6bc1-484c-be01-366534361b73["Output\nhost_sequences_bam"]; 10 --> 3b1e626f-6bc1-484c-be01-366534361b73; style 3b1e626f-6bc1-484c-be01-366534361b73 stroke:#2c3143,stroke-width:4px;
  11["Select"]; 9 -->|text_file| 11;
    a809853b-119f-44d2-986b-8d2006439fbe["Output\ntotal_sequences_before_hosts_sequences_removal"]; 11 --> a809853b-119f-44d2-986b-8d2006439fbe; style a809853b-119f-44d2-986b-8d2006439fbe stroke:#2c3143,stroke-width:4px;
  12["Samtools fastx"]; 10 -->|mapped| 12;
    10d4eaec-81d8-444e-8075-7b77a1fb6870["Output\nhost_sequences_fastq"]; 12 --> 10d4eaec-81d8-444e-8075-7b77a1fb6870; style 10d4eaec-81d8-444e-8075-7b77a1fb6870 stroke:#2c3143,stroke-width:4px;
  13["Samtools fastx"]; 10 -->|unmapped| 13;
    0c2dd74d-ac4f-45cf-839c-50386a7ece28["Output\nnon_host_sequences_fastq"]; 13 --> 0c2dd74d-ac4f-45cf-839c-50386a7ece28; style 0c2dd74d-ac4f-45cf-839c-50386a7ece28 stroke:#2c3143,stroke-width:4px;
  14["Collapse Collection"]; 11 -->|out_file1| 14;
  15["Filter failed datasets"]; 12 -->|output| 15;
  16["Kraken2"]; 13 -->|output| 16;
    203d303e-8f3a-4242-971f-b345842ebdb8["Output\nkraken2_with_kalamri_database_output"]; 16 --> 203d303e-8f3a-4242-971f-b345842ebdb8; style 203d303e-8f3a-4242-971f-b345842ebdb8 stroke:#2c3143,stroke-width:4px;
    843afd4d-23a8-46e7-b945-8b67dd7ae341["Output\nkraken2_with_kalamri_database_report"]; 16 --> 843afd4d-23a8-46e7-b945-8b67dd7ae341; style 843afd4d-23a8-46e7-b945-8b67dd7ae341 stroke:#2c3143,stroke-width:4px;
  17["Cut"]; 14 -->|output| 17;
    d07be9f1-d250-4008-91ee-59a68521eb56["Output\nquality_retained_all_reads"]; 17 --> d07be9f1-d250-4008-91ee-59a68521eb56; style d07be9f1-d250-4008-91ee-59a68521eb56 stroke:#2c3143,stroke-width:4px;
  18["FastQC"]; 15 -->|output| 18;
    b0ee6e31-0eb1-437d-8c04-fc3640b9a0b7["Output\nhosts_qc_text_file"]; 18 --> b0ee6e31-0eb1-437d-8c04-fc3640b9a0b7; style b0ee6e31-0eb1-437d-8c04-fc3640b9a0b7 stroke:#2c3143,stroke-width:4px;
    b72ff57b-0921-43bf-a817-6cd444c8f3cb["Output\nhosts_qc_html"]; 18 --> b72ff57b-0921-43bf-a817-6cd444c8f3cb; style b72ff57b-0921-43bf-a817-6cd444c8f3cb stroke:#2c3143,stroke-width:4px;
  19["Krakentools: Extract Kraken Reads By ID"]; 5 -->|out1| 19; 16 -->|report_output| 19; 16 -->|output| 19;
    57e3b725-8e13-40b2-9acc-31fd56ebc80a["Output\ncollection_of_preprocessed_samples"]; 19 --> 57e3b725-8e13-40b2-9acc-31fd56ebc80a; style 57e3b725-8e13-40b2-9acc-31fd56ebc80a stroke:#2c3143,stroke-width:4px;
  20["Select"]; 18 -->|text_file| 20;
    3ba35c71-32f0-4741-98d4-ea8522e27500["Output\ntotal_sequences_after_hosts_sequences_removal"]; 20 --> 3ba35c71-32f0-4741-98d4-ea8522e27500; style 3ba35c71-32f0-4741-98d4-ea8522e27500 stroke:#2c3143,stroke-width:4px;
  21["Collapse Collection"]; 20 -->|out_file1| 21;
  22["Cut"]; 21 -->|output| 22;
    cef36c68-4549-4fd6-b7c8-71fb21df012f["Output\nquality_retained_hosts_reads"]; 22 --> cef36c68-4549-4fd6-b7c8-71fb21df012f; style cef36c68-4549-4fd6-b7c8-71fb21df012f stroke:#2c3143,stroke-width:4px;
  23["Column join"]; 17 -->|out_file1| 23; 22 -->|out_file1| 23;
  24["Compute"]; 23 -->|tabular_output| 24;
  25["Column Regex Find And Replace"]; 24 -->|out_file1| 25;
    470892ee-dab9-48d7-ad97-45dbd52afaa7["Output\nremoved_hosts_percentage_tabular"]; 25 --> 470892ee-dab9-48d7-ad97-45dbd52afaa7; style 470892ee-dab9-48d7-ad97-45dbd52afaa7 stroke:#2c3143,stroke-width:4px;
  26["MultiQC"]; 9 -->|text_file| 26; 25 -->|out_file1| 26;
    0b1b5a73-36ee-42a2-a220-1ced6ec7378b["Output\nmultiQC_html_report_after_preprocessing"]; 26 --> 0b1b5a73-36ee-42a2-a220-1ced6ec7378b; style 0b1b5a73-36ee-42a2-a220-1ced6ec7378b stroke:#2c3143,stroke-width:4px;
    13cbf6c7-6954-4458-aa66-a5b020c63822["Output\nmultiQC_stats_after_preprocessing"]; 26 --> 13cbf6c7-6954-4458-aa66-a5b020c63822; style 13cbf6c7-6954-4458-aa66-a5b020c63822 stroke:#2c3143,stroke-width:4px;
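The last branch of the flowchart (Select, Collapse Collection, Cut, Column join, Compute, Column Regex Find And Replace) ends in `removed_hosts_percentage_tabular`. The expression used in the Compute step is not shown on this page, so the following Python is only a minimal sketch of the idea. It assumes that the two Select steps keep the `Total Sequences` line of the FastQC text output and that the percentage is host reads divided by all quality-retained reads. The function names are hypothetical.

```python
def total_sequences(fastqc_text: str) -> int:
    """Return the read count from the 'Total Sequences' line of FastQC text output."""
    for line in fastqc_text.splitlines():
        if line.startswith("Total Sequences"):
            # FastQC separates the field name and its value with a tab.
            return int(line.split("\t")[1])
    raise ValueError("no 'Total Sequences' line found")


def removed_hosts_percentage(all_reads: int, host_reads: int) -> float:
    """Percentage of reads that mapped to the host (assumed formula, see lead-in)."""
    if all_reads <= 0:
        raise ValueError("all_reads must be positive")
    return 100.0 * host_reads / all_reads


# Made-up FastQC snippets, for illustration only.
after_preprocessing = ">>Basic Statistics\tpass\nTotal Sequences\t2000\n"
host_only = ">>Basic Statistics\tpass\nTotal Sequences\t500\n"

pct = removed_hosts_percentage(
    total_sequences(after_preprocessing),
    total_sequences(host_only),
)
print(f"{pct:.1f}% of reads were assigned to the host")
```

In the workflow itself this arithmetic is done on tabular datasets, one row per sample, before the result is passed to MultiQC.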
Inputs
Input | Label |
---|---|
Input parameter | samples_profile |
Input dataset collection | collection_of_all_samples |
Outputs
From | Output | Label |
---|---|---|
toolshed.g2.bx.psu.edu/repos/iuc/porechop/porechop/0.2.4+galaxy0 | Porechop | |
toolshed.g2.bx.psu.edu/repos/iuc/nanoplot/nanoplot/1.42.0+galaxy1 | NanoPlot | |
toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.74+galaxy0 | FastQC | |
toolshed.g2.bx.psu.edu/repos/iuc/fastp/fastp/0.23.4+galaxy0 | fastp | |
toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy1 | MultiQC | |
toolshed.g2.bx.psu.edu/repos/iuc/minimap2/minimap2/2.28+galaxy0 | Map with minimap2 | |
toolshed.g2.bx.psu.edu/repos/iuc/nanoplot/nanoplot/1.42.0+galaxy1 | NanoPlot | |
toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.74+galaxy0 | FastQC | |
toolshed.g2.bx.psu.edu/repos/iuc/bamtools_split_mapped/bamtools_split_mapped/2.5.2+galaxy2 | Split BAM by reads mapping status | |
Grep1 | Select | |
toolshed.g2.bx.psu.edu/repos/iuc/samtools_fastx/samtools_fastx/1.15.1+galaxy2 | Samtools fastx | |
toolshed.g2.bx.psu.edu/repos/iuc/samtools_fastx/samtools_fastx/1.15.1+galaxy2 | Samtools fastx | |
toolshed.g2.bx.psu.edu/repos/iuc/kraken2/kraken2/2.1.1+galaxy1 | Kraken2 | |
Cut1 | Cut | |
toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.74+galaxy0 | FastQC | |
toolshed.g2.bx.psu.edu/repos/iuc/krakentools_extract_kraken_reads/krakentools_extract_kraken_reads/1.2+galaxy1 | Krakentools: Extract Kraken Reads By ID | |
Grep1 | Select | |
Cut1 | Cut | |
toolshed.g2.bx.psu.edu/repos/galaxyp/regex_find_replace/regexColumn1/1.0.3 | Column Regex Find And Replace | |
toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy1 | MultiQC | |
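The `From` column mixes two kinds of tool identifiers: Tool Shed IDs of the form `shed/repos/owner/repository/tool/version`, and short built-in IDs such as `Grep1` and `Cut1`. As a rough illustration of how the parts of a Tool Shed ID line up, here is a small Python sketch. The `ToolShedId` type and the `parse_tool_id` helper are hypothetical names introduced for this example, and the six-part layout is inferred from the rows above.

```python
from typing import NamedTuple, Optional


class ToolShedId(NamedTuple):
    shed: str
    owner: str
    repository: str
    tool: str
    version: str


def parse_tool_id(tool_id: str) -> Optional[ToolShedId]:
    """Split a Tool Shed tool ID into its parts; return None for built-in tools."""
    parts = tool_id.split("/")
    # Expected layout: shed / "repos" / owner / repository / tool / version
    if len(parts) != 6 or parts[1] != "repos":
        return None
    shed, _, owner, repository, tool, version = parts
    return ToolShedId(shed, owner, repository, tool, version)


porechop = parse_tool_id(
    "toolshed.g2.bx.psu.edu/repos/iuc/porechop/porechop/0.2.4+galaxy0"
)
print(porechop.owner, porechop.repository, porechop.version)
print(parse_tool_id("Grep1"))  # built-in tool, so there is nothing to split
```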
Tools
To use this workflow in Galaxy, either click the download link to save the workflow file, or right-click the link and copy its URL, which you can then paste into Galaxy's workflow import form.
Importing into Galaxy
Below are the instructions for importing these workflows directly into your Galaxy server of choice so you can start using them.
Hands-on: Importing a workflow
- Click on Workflow on the top menu bar of Galaxy. You will see a list of all your workflows.
- Click on Import at the top-right of the screen
- Provide your workflow
  - Option 1: Paste the URL of the workflow into the box labelled “Archived Workflow URL”
  - Option 2: Upload the workflow file in the box labelled “Archived Workflow File”
- Click the Import workflow button
Version History
Version | Commit | Time | Comments |
---|---|---|---|
5 | cdd93376a | 2024-06-06 12:00:29 | adding tags to some of the workflow outputs, updating the training with the latest PathoGFAIR workflows updates |
4 | e230001f4 | 2024-05-29 11:33:18 | updating preprocessing workflow and allele based workflow with a single user input parameter and adjusting the md file accodingly |
3 | 211b69394 | 2024-05-26 09:45:27 | adding workflow reports to the workflows of the training to match the latest version of the IWC PR |
2 | d320748c5 | 2024-05-20 18:17:48 | Foodborne training update 2024 |
1 | 0e0a2f2cc | 2024-01-10 15:47:09 | Rename metagenomics topic to microbiome |
For Admins
Installing the workflow tools
wget https://training.galaxyproject.org/training-material/topics/microbiome/tutorials/pathogen-detection-from-nanopore-foodborne-data/workflows/nanopore_preprocessing.ga -O workflow.ga
workflow-to-tools -w workflow.ga -o tools.yaml
shed-tools install -g GALAXY -a API_KEY -t tools.yaml
workflow-install -g GALAXY -a API_KEY -w workflow.ga --publish-workflows
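To give admins a feel for what the `workflow-to-tools` step works from, the Python sketch below reads a `.ga` file (which is JSON) and collects the distinct Tool Shed repositories referenced by its steps. This is a simplified illustration, not the Ephemeris implementation. It assumes each Tool Shed step carries a `tool_shed_repository` entry with `name`, `owner`, `tool_shed` and `changeset_revision` keys. The embedded workflow is a made-up, heavily trimmed stand-in, and its changeset revision is a placeholder.

```python
import json
from typing import Dict, List

# Hypothetical, trimmed stand-in for a .ga file. The changeset revision is a
# placeholder, not a real Tool Shed revision.
PORECHOP_STEP = {
    "tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/porechop/porechop/0.2.4+galaxy0",
    "tool_shed_repository": {
        "name": "porechop",
        "owner": "iuc",
        "tool_shed": "toolshed.g2.bx.psu.edu",
        "changeset_revision": "placeholder0001",
    },
}
EXAMPLE_GA = json.dumps({
    "steps": {
        "0": {"tool_id": None, "type": "parameter_input"},
        "1": PORECHOP_STEP,
        "2": {"tool_id": "Grep1"},  # built-in tool: no Tool Shed repository
        "3": PORECHOP_STEP,         # same repository used a second time
    }
})


def list_shed_repositories(ga_text: str) -> List[Dict[str, str]]:
    """Return the distinct Tool Shed repositories referenced by a workflow."""
    workflow = json.loads(ga_text)
    seen = set()
    repositories = []
    for step in workflow.get("steps", {}).values():
        repo = step.get("tool_shed_repository")
        if not repo:
            continue  # inputs and built-in tools need no installation
        key = (repo["tool_shed"], repo["owner"], repo["name"],
               repo.get("changeset_revision"))
        if key in seen:
            continue
        seen.add(key)
        repositories.append({
            "name": repo["name"],
            "owner": repo["owner"],
            "tool_shed_url": repo["tool_shed"],
            "revision": repo.get("changeset_revision", ""),
        })
    return repositories


repos = list_shed_repositories(EXAMPLE_GA)
for repo in repos:
    print(repo["owner"], repo["name"], repo["revision"])
```

For a real installation, rely on the `tools.yaml` produced by `workflow-to-tools`; a listing like this is only useful as a quick review of what will be installed.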