QC + Mapping + Counting (single+paired) - Ref Based RNA Seq - Transcriptomics - GTN
flowchart TD 0["ℹ️ Input Collection\nsingle fastqs"]; style 0 stroke:#2c3143,stroke-width:4px; 1["ℹ️ Input Collection\npaired fastqs"]; style 1 stroke:#2c3143,stroke-width:4px; 2["ℹ️ Input Dataset\nDrosophila_melanogaster.BDGP6.32.109_UCSC.gtf.gz"]; style 2 stroke:#2c3143,stroke-width:4px; 3["Cutadapt: remove bad quality bp"]; 0 -->|output| 3; 4["Flatten paired collection for FastQC"]; 1 -->|output| 4; 5["Cutadapt"]; 1 -->|output| 5; 6["Get gene length"]; 2 -->|output| 6; 077640cc-edbb-4185-9eb1-d11b522774af["Output\nGene length"]; 6 --> 077640cc-edbb-4185-9eb1-d11b522774af; style 077640cc-edbb-4185-9eb1-d11b522774af stroke:#2c3143,stroke-width:4px; 7["convert gtf to bed12"]; 2 -->|output| 7; 8["STAR: map single reads"]; 2 -->|output| 8; 3 -->|out1| 8; 9["Merge fastqs for FastQC"]; 4 -->|output| 9; 0 -->|output| 9; 10["Merge Cutadapt reports"]; 5 -->|report| 10; 3 -->|report| 10; 11["STAR: map paired reads"]; 2 -->|output| 11; 5 -->|out_pairs| 11; 12["count reads per gene for SR"]; 8 -->|mapped_reads| 12; 2 -->|output| 12; 13["FastQC check read qualities"]; 9 -->|output| 13; 14["Combine cutadapt results"]; 10 -->|output| 14; cab760db-5c9d-4a3c-b768-998bfbac6b57["Output\nmultiqc_cutadapt_html"]; 14 --> cab760db-5c9d-4a3c-b768-998bfbac6b57; style cab760db-5c9d-4a3c-b768-998bfbac6b57 stroke:#2c3143,stroke-width:4px; 15["Merge STAR logs"]; 11 -->|output_log| 15; 8 -->|output_log| 15; 16["Merge STAR counts"]; 8 -->|reads_per_gene| 16; 11 -->|reads_per_gene| 16; 17["count fragments per gene for PE"]; 11 -->|mapped_reads| 17; 2 -->|output| 17; 1527b5d7-1681-4934-9d9e-3a5f86ae0fee["Output\nfeatureCounts_gene_length"]; 17 --> 1527b5d7-1681-4934-9d9e-3a5f86ae0fee; style 1527b5d7-1681-4934-9d9e-3a5f86ae0fee stroke:#2c3143,stroke-width:4px; 18["Merge STAR BAM"]; 11 -->|mapped_reads| 18; 8 -->|mapped_reads| 18; 802017f4-fb1a-4243-b50d-2ed46f746f11["Output\nSTAR_BAM"]; 18 --> 802017f4-fb1a-4243-b50d-2ed46f746f11; style 802017f4-fb1a-4243-b50d-2ed46f746f11 
stroke:#2c3143,stroke-width:4px; 19["merge coverage unique strand 1"]; 8 -->|signal_unique_str1| 19; 11 -->|signal_unique_str1| 19; 20["merge coverage unique strand 2"]; 8 -->|signal_unique_str2| 20; 11 -->|signal_unique_str2| 20; 21["Combine FastQC results"]; 13 -->|text_file| 21; 8d0ce9ee-e4e4-4c0c-8261-420ce756ecfd["Output\nmultiqc_fastqc_html"]; 21 --> 8d0ce9ee-e4e4-4c0c-8261-420ce756ecfd; style 8d0ce9ee-e4e4-4c0c-8261-420ce756ecfd stroke:#2c3143,stroke-width:4px; 22["Combine STAR Results"]; 15 -->|output| 22; 204e3f6c-6f54-46f0-b07c-1f31113265e7["Output\nmultiqc_star_html"]; 22 --> 204e3f6c-6f54-46f0-b07c-1f31113265e7; style 204e3f6c-6f54-46f0-b07c-1f31113265e7 stroke:#2c3143,stroke-width:4px; 23["Remove statistics from STAR counts"]; 16 -->|output| 23; 24["Determine library strandness with STAR"]; 16 -->|output| 24; fe7b84dd-4466-4fe7-94a8-408f4ac7ed1a["Output\nmultiqc_star_counts_html"]; 24 --> fe7b84dd-4466-4fe7-94a8-408f4ac7ed1a; style fe7b84dd-4466-4fe7-94a8-408f4ac7ed1a stroke:#2c3143,stroke-width:4px; 25["merge counts from featureCounts"]; 12 -->|output_short| 25; 17 -->|output_short| 25; c82388f8-cb09-4fdf-8a0e-03cdad579f37["Output\nfeatureCounts"]; 25 --> c82388f8-cb09-4fdf-8a0e-03cdad579f37; style c82388f8-cb09-4fdf-8a0e-03cdad579f37 stroke:#2c3143,stroke-width:4px; 26["merge featureCounts summary"]; 12 -->|output_summary| 26; 17 -->|output_summary| 26; 27["Determine library strandness with Infer Experiment"]; 18 -->|output| 27; 7 -->|bed_file| 27; 940ec3ec-dd2e-4d50-bbc4-756945eb16b2["Output\ninferexperiment"]; 27 --> 940ec3ec-dd2e-4d50-bbc4-756945eb16b2; style 940ec3ec-dd2e-4d50-bbc4-756945eb16b2 stroke:#2c3143,stroke-width:4px; 28["Read Distribution"]; 18 -->|output| 28; 7 -->|bed_file| 28; 29["Compute read distribution statistics"]; 18 -->|output| 29; 7 -->|bed_file| 29; 30["sample BAM"]; 18 -->|output| 30; 31["Get reads number per chromosome"]; 18 -->|output| 31; 32["Remove duplicates"]; 18 -->|output| 32; 33["Determine library strandness with 
STAR coverage"]; 19 -->|output| 33; 20 -->|output| 33; 2 -->|output| 33; 89e1b053-03c2-467a-95a0-d2dc404670ec["Output\npgt"]; 33 --> 89e1b053-03c2-467a-95a0-d2dc404670ec; style 89e1b053-03c2-467a-95a0-d2dc404670ec stroke:#2c3143,stroke-width:4px; 34["Select unstranded counts"]; 23 -->|outfile| 34; bce755be-ac3b-4346-9ac5-1128a287bf00["Output\ncounts_from_star"]; 34 --> bce755be-ac3b-4346-9ac5-1128a287bf00; style bce755be-ac3b-4346-9ac5-1128a287bf00 stroke:#2c3143,stroke-width:4px; 35["Sort counts to get gene with highest count on feature Counts"]; 25 -->|output| 35; 6aeb4dd1-445f-4c66-b1ce-4bb8faac53db["Output\nfeatureCounts_sorted"]; 35 --> 6aeb4dd1-445f-4c66-b1ce-4bb8faac53db; style 6aeb4dd1-445f-4c66-b1ce-4bb8faac53db stroke:#2c3143,stroke-width:4px; 36["Combine read asignments statistics"]; 26 -->|output| 36; fc72242a-f23c-4ceb-9a8b-5280343ea5d6["Output\nmultiqc_featureCounts_html"]; 36 --> fc72242a-f23c-4ceb-9a8b-5280343ea5d6; style fc72242a-f23c-4ceb-9a8b-5280343ea5d6 stroke:#2c3143,stroke-width:4px; 37["Combine read distribution on known features"]; 29 -->|output| 37; 07dca732-0ac7-432e-9e61-2b77f921a23b["Output\nmultiqc_read_distrib"]; 37 --> 07dca732-0ac7-432e-9e61-2b77f921a23b; style 07dca732-0ac7-432e-9e61-2b77f921a23b stroke:#2c3143,stroke-width:4px; 38["Get gene body coverage"]; 30 -->|outputsam| 38; 7 -->|bed_file| 38; 39["Combine results on reads per chromosome"]; 31 -->|output| 39; 7bfa8ae7-8ffd-46a1-a56e-815ed2c9f1cf["Output\nmultiqc_reads_per_chrom"]; 39 --> 7bfa8ae7-8ffd-46a1-a56e-815ed2c9f1cf; style 7bfa8ae7-8ffd-46a1-a56e-815ed2c9f1cf stroke:#2c3143,stroke-width:4px; 40["Combine results of duplicate reads"]; 32 -->|metrics_file| 40; 66553d0f-e851-458b-82c2-f9b30e394bac["Output\nmultiqc_dup"]; 40 --> 66553d0f-e851-458b-82c2-f9b30e394bac; style 66553d0f-e851-458b-82c2-f9b30e394bac stroke:#2c3143,stroke-width:4px; 41["Sort counts to get gene with highest count on STAR"]; 34 -->|out_file1| 41; 
383df008-0ccb-4d67-98dd-33fa5e2db81e["Output\ncounts_from_star_sorted"]; 41 --> 383df008-0ccb-4d67-98dd-33fa5e2db81e; style 383df008-0ccb-4d67-98dd-33fa5e2db81e stroke:#2c3143,stroke-width:4px; 42["Combine gene body coverage"]; 38 -->|outputtxt| 42; 8544ea5c-faf2-44c9-85d6-40658fc9b9eb["Output\nmultiqc_gene_body_cov"]; 42 --> 8544ea5c-faf2-44c9-85d6-40658fc9b9eb; style 8544ea5c-faf2-44c9-85d6-40658fc9b9eb stroke:#2c3143,stroke-width:4px;
Inputs
Input | Label |
---|---|
Input dataset collection | single fastqs |
Input dataset collection | paired fastqs |
Input dataset | Drosophila_melanogaster.BDGP6.32.109_UCSC.gtf.gz |
Outputs
From | Output | Label |
---|---|---|
toolshed.g2.bx.psu.edu/repos/iuc/length_and_gc_content/length_and_gc_content/0.1.2 | Gene length and GC content | Get gene length |
toolshed.g2.bx.psu.edu/repos/iuc/gtftobed12/gtftobed12/357 | Convert GTF to BED12 | convert gtf to bed12 |
toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy1 | MultiQC | Combine cutadapt results |
toolshed.g2.bx.psu.edu/repos/iuc/featurecounts/featurecounts/2.0.3+galaxy2 | featureCounts | count fragments per gene for PE |
__MERGE_COLLECTION__ | Merge collections | Merge STAR BAM |
toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy1 | MultiQC | Combine FastQC results |
toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy1 | MultiQC | Combine STAR Results |
toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy1 | MultiQC | Determine library strandness with STAR |
__MERGE_COLLECTION__ | Merge collections | merge counts from featureCounts |
toolshed.g2.bx.psu.edu/repos/nilesh/rseqc/rseqc_infer_experiment/5.0.3+galaxy0 | Infer Experiment | Determine library strandness with Infer Experiment |
toolshed.g2.bx.psu.edu/repos/nilesh/rseqc/rseqc_read_distribution/5.0.3+galaxy0 | Read Distribution | |
toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_MarkDuplicates/3.1.1.0 | MarkDuplicates | Remove duplicates |
toolshed.g2.bx.psu.edu/repos/iuc/pygenometracks/pygenomeTracks/3.8+galaxy2 | pyGenomeTracks | Determine library strandness with STAR coverage |
Cut1 | Cut | Select unstranded counts |
toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sort_header_tool/9.3+galaxy1 | Sort | Sort counts to get gene with highest count on feature Counts |
toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy1 | MultiQC | Combine read asignments statistics |
toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy1 | MultiQC | Combine read distribution on known features |
toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy1 | MultiQC | Combine results on reads per chromosome |
toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy1 | MultiQC | Combine results of duplicate reads |
toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sort_header_tool/9.3+galaxy1 | Sort | Sort counts to get gene with highest count on STAR |
toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy1 | MultiQC | Combine gene body coverage |
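The "From" column above mixes Tool Shed tool IDs with built-in Galaxy tools such as `__MERGE_COLLECTION__` and `Cut1`. As a minimal sketch, assuming Tool Shed IDs follow the layout `<shed>/repos/<owner>/<repository>/<tool>/<version>` seen in the table, the helper below splits an ID into its parts. The names `ToolShedId` and `parse_tool_id` are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple, Optional


class ToolShedId(NamedTuple):
    """Parts of a Tool Shed tool ID (hypothetical helper type)."""
    shed: str
    owner: str
    repository: str
    tool: str
    version: str


def parse_tool_id(tool_id: str) -> Optional[ToolShedId]:
    """Split a Tool Shed tool ID; return None for built-in tools."""
    parts = tool_id.split("/")
    # Assumed layout: <shed>/repos/<owner>/<repository>/<tool>/<version>
    if len(parts) != 6 or parts[1] != "repos":
        return None
    shed, _, owner, repository, tool, version = parts
    return ToolShedId(shed, owner, repository, tool, version)


# IDs copied from the Outputs table.
infer = parse_tool_id(
    "toolshed.g2.bx.psu.edu/repos/nilesh/rseqc/rseqc_infer_experiment/5.0.3+galaxy0"
)
merge = parse_tool_id("__MERGE_COLLECTION__")

print(infer)
print(merge)
```

Separating the repository from the tool name matters for entries such as RSeQC, where one repository (`rseqc`) provides several tools.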
Tools
To use these workflows in Galaxy, either click a download link to save the workflow file, or right-click the link and copy its URL, which can then be pasted into Galaxy's workflow import form.
Importing into Galaxy
Below are the instructions for importing these workflows directly into your Galaxy server of choice.
Hands-on: Importing a workflow
- Click on Workflow on the top menu bar of Galaxy. You will see a list of all your workflows.
- Click on Import at the top-right of the screen
- Provide your workflow
  - Option 1: Paste the URL of the workflow into the box labelled “Archived Workflow URL”
  - Option 2: Upload the workflow file in the box labelled “Archived Workflow File”
- Click the Import workflow button
Version History
Version | Commit | Time | Comments |
---|---|---|---|
10 | 9a19075e2 | 2024-10-18 13:22:04 | Update ref-based workflows |
9 | a1251f286 | 2024-07-05 09:38:54 | Removed 'comments' tags |
8 | d804d52ac | 2024-07-05 09:22:56 | Updated tools in 'QC + Mapping + Counting (single+paired)' workflow |
7 | 41dead43e | 2023-05-02 10:31:07 | add mo orcid to workflows |
6 | 36eb5cf82 | 2023-04-28 17:26:00 | update workflows and tests |
5 | 8fc9c9026 | 2023-04-25 07:46:15 | add creators and licence to workflows |
4 | dc21d9ddb | 2023-04-22 08:29:08 | update images and results, rearrange workflow for part1 |
3 | 9921a8623 | 2023-04-21 12:37:10 | Update first part of the tutorial |
2 | 4d2f611a6 | 2022-04-28 15:20:51 | subset BAM before gene body coverage |
1 | 8bf6877e4 | 2022-04-15 11:16:13 | add workflow for PE and SE in parallel |
For Admins
Installing the workflow tools
wget https://training.galaxyproject.org/training-material/topics/transcriptomics/tutorials/ref-based/workflows/qc-mapping-counting-paired-and-single.ga -O workflow.ga
workflow-to-tools -w workflow.ga -o tools.yaml
shed-tools install -g GALAXY -a API_KEY -t tools.yaml
workflow-install -g GALAXY -a API_KEY -w workflow.ga --publish-workflows
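The `workflow-to-tools` step derives a tool list from the downloaded workflow file. The sketch below illustrates the idea only and is not how that command is implemented. It assumes the `.ga` file is JSON with a `steps` mapping, and that steps using Tool Shed tools carry a `tool_shed_repository` entry with `tool_shed`, `owner`, and `name` keys. The function name `tool_repositories` and the small `example` workflow are hypothetical.

```python
from typing import Dict, List


def tool_repositories(workflow: Dict) -> List[Dict[str, str]]:
    """Collect the distinct Tool Shed repositories referenced by a workflow."""
    seen = set()
    repos = []
    for step in workflow.get("steps", {}).values():
        repo = step.get("tool_shed_repository")
        if not repo:
            continue  # inputs and built-in tools reference no repository
        key = (repo["tool_shed"], repo["owner"], repo["name"])
        if key in seen:
            continue  # same repository used by an earlier step
        seen.add(key)
        repos.append(
            {"tool_shed": repo["tool_shed"], "owner": repo["owner"], "name": repo["name"]}
        )
    return repos


# Hypothetical, heavily trimmed stand-in for a parsed workflow file.
# With the real file, use: json.load(open("workflow.ga")) after `import json`.
shed = "toolshed.g2.bx.psu.edu"
example = {
    "steps": {
        "0": {"tool_id": None},  # input step
        "1": {"tool_shed_repository": {"tool_shed": shed, "owner": "iuc", "name": "multiqc"}},
        "2": {"tool_shed_repository": {"tool_shed": shed, "owner": "iuc", "name": "multiqc"}},
        "3": {"tool_shed_repository": {"tool_shed": shed, "owner": "iuc", "name": "featurecounts"}},
        "4": {"tool_id": "Cut1"},  # built-in tool
    }
}

repos = tool_repositories(example)
for repo in repos:
    print(repo["owner"], repo["name"])
```

Deduplicating by repository reflects that this workflow calls MultiQC many times, yet the repository only needs to be installed once.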