
GO Enrichment Analysis on Single-Cell RNA-Seq Data

Published: Sep 17, 2024
Last Updated: Sep 17, 2024

scRNA-Seq data analysis roadmap

.image-100[slide5]

Speaker Notes Here is a typical workflow for analyzing single-cell RNA sequencing data. We can break this process down into three main sections:


Ontology

.center[A standardized vocabulary for expressing knowledge within a specific domain.] .image-100[slide6]

Speaker Notes


Gene Ontology (GO): Unifying Biology

.image-100[slide7]

Speaker Notes Gene Ontology has three main classifications: Biological Process, Molecular Function, and Cellular Component. Together they allow scientists to precisely describe what a gene does, how it does it, and where it happens in the cell.


GO Hierarchy

.image-100[slide8]

Speaker Notes

.image-100[slide9]

Speaker Notes


Example describing gene functions before and after Gene Ontology

.image-100[slide10]

Speaker Notes This simple example illustrates how Gene Ontology (GO) adds clarity and standardization when describing gene functions.


.center[How can we use GO to perform GO Enrichment Analysis on scRNA-Seq data?]

Speaker Notes Now that we understand what GO means, let’s explore what GO enrichment analysis is in the context of scRNA-Seq data.


Enrichment analysis of scRNA-Seq data

.image-100[slide12] .center[Enrichment analysis is a form of functional annotation: it associates biological functions with genes or cells based on expression data.]

Speaker Notes


GO Enrichment Analysis

.image-100[slide13]

Speaker Notes


Steps of GO enrichment analysis

.left[1- Select the marker genes for each cell cluster / condition] .image-60[slide14]

Speaker Notes We start the analysis by selecting a list of marker genes, that is, genes that are differentially expressed between two conditions or between different cell types.
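
As a minimal sketch of this selection step, the snippet below filters a small table of differential-expression results. The gene names, values, and both cutoffs (`min_abs_log2fc`, `max_adj_p`) are hypothetical and chosen only for illustration; the slides do not prescribe specific thresholds.

```python
# Hypothetical differential-expression results for one cluster.
# Gene names, fold changes, and adjusted p-values are made up.
de_results = [
    {"gene": "GeneA", "log2_fold_change": 2.1, "adj_p_value": 0.001},
    {"gene": "GeneB", "log2_fold_change": 0.2, "adj_p_value": 0.600},
    {"gene": "GeneC", "log2_fold_change": 1.5, "adj_p_value": 0.020},
    {"gene": "GeneD", "log2_fold_change": -1.8, "adj_p_value": 0.004},
]


def select_marker_genes(results, min_abs_log2fc=1.0, max_adj_p=0.05):
    """Keep genes that pass both the effect-size and significance cutoffs.

    Both cutoffs are illustrative assumptions, not values from the slides.
    """
    return [
        row["gene"]
        for row in results
        if abs(row["log2_fold_change"]) >= min_abs_log2fc
        and row["adj_p_value"] <= max_adj_p
    ]


marker_genes = select_marker_genes(de_results)
print(marker_genes)
```

GeneB is dropped because it fails both cutoffs, while the other three genes become the marker list used in the following steps.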


.left[2- Tag the genes with GO terms] .image-60[slide14]

Speaker Notes Each gene can be “tagged” with one or more GO terms, similar to how books are categorized by genre or topic.
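
The tagging step can be pictured as a lookup from gene to a set of GO terms. The sketch below assumes a hypothetical in-memory annotation table (`gene_to_go`) with placeholder gene and term names; in practice the annotations would come from a GO annotation resource.

```python
# Hypothetical gene-to-GO annotation table with placeholder labels.
gene_to_go = {
    "GeneA": {"GO term A", "GO term B"},
    "GeneB": {"GO term A"},
    "GeneC": {"GO term C"},
    "GeneX": {"GO term B"},
}

marker_genes = ["GeneA", "GeneB", "GeneC"]

# Tag each marker gene with its GO terms; a gene with no annotation
# gets an empty set instead of raising an error.
tagged = {gene: gene_to_go.get(gene, set()) for gene in marker_genes}

for gene, terms in tagged.items():
    print(gene, sorted(terms))
```

Using sets reflects the point made above: one gene can carry several GO terms, and each term is attached to a gene at most once.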


.left[3- Count How Many Times Each GO Term Appears] .image-60[slide14]

Speaker Notes We then count the number of times each GO term shows up in the list of genes. For example, GO term A is attached to 2 out of 3 genes.
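
The counting step can be sketched with `collections.Counter`. The tagged genes below are hypothetical placeholders arranged so that GO term A is attached to 2 of the 3 genes, matching the example in the notes.

```python
from collections import Counter

# Hypothetical marker genes already tagged with placeholder GO terms.
tagged = {
    "GeneA": {"GO term A", "GO term B"},
    "GeneB": {"GO term A"},
    "GeneC": {"GO term C"},
}

# Count, for each GO term, how many genes in the list carry it.
term_counts = Counter(term for terms in tagged.values() for term in terms)

print(term_counts["GO term A"], "of", len(tagged))  # prints "2 of 3"
```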


.left[4- Compare to the whole background gene set] .image-60[slide15]

Speaker Notes
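
One simple way to express the comparison with the background is a fold-enrichment ratio: the fraction of marker genes carrying a term divided by the fraction of background genes carrying it. The sketch below assumes made-up counts (a background of 100 genes, 10 of them carrying the term); the function name `fold_enrichment` is introduced here for illustration only.

```python
def fold_enrichment(hits_in_list, list_size, hits_in_background, background_size):
    """Ratio of the term's frequency in the gene list to its background frequency."""
    list_fraction = hits_in_list / list_size
    background_fraction = hits_in_background / background_size
    return list_fraction / background_fraction


# Hypothetical counts: the term is on 2 of 3 marker genes,
# and on 10 of 100 genes in the background set.
ratio = fold_enrichment(2, 3, 10, 100)
print(round(ratio, 2))
```

A ratio well above 1 suggests the term is more frequent in the marker list than expected from the background, but on its own it does not say whether the difference could be due to chance.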


.left[5- Is It Significant? Basic interpretation:] .image-60[slide16]

Speaker Notes


.left[6- Statistical testing:] .image-60[slide17]

Speaker Notes
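
The text of the slides does not name a specific test. As an assumption, the sketch below uses the one-sided hypergeometric test, a common choice for over-representation analysis, implemented with the standard library only (`math.comb`, Python 3.8+). All counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(hits_in_list, list_size, hits_in_background, background_size):
    """P(X >= hits_in_list) when drawing list_size genes without replacement
    from a background in which hits_in_background genes carry the term."""
    total = comb(background_size, list_size)
    upper = min(list_size, hits_in_background)
    tail = sum(
        comb(hits_in_background, i)
        * comb(background_size - hits_in_background, list_size - i)
        for i in range(hits_in_list, upper + 1)
    )
    return tail / total


# Hypothetical counts: 2 of 3 marker genes carry the term,
# versus 10 of 100 genes in the background.
p_value = hypergeom_enrichment_p(2, 3, 10, 100)
print(f"p = {p_value:.4f}")
```

In a real analysis many GO terms are tested at once, so the resulting p-values would normally be adjusted for multiple testing before interpretation.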


.left[7- Interpret the results:] .image-60[slide18]

Speaker Notes At this point we have transformed the long list of marker genes into a short list of biological themes in the form of GO terms. We can now interpret the results in three ways. First, we can visualize the most common themes to identify patterns or relationships between GO terms. Second, we can analyze the GO hierarchy, where higher-level categories (parent terms) provide broader biological context, while lower-level categories (child terms) offer more specific insights. Third, we can relate the enriched GO terms to existing biological knowledge.
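
To illustrate how the hierarchy can support interpretation, the sketch below walks from enriched child terms up to their parent terms and reports the broader terms they share. The `parents` map and all term names are hypothetical placeholders, not real GO identifiers.

```python
# Hypothetical child -> parents map; a term may have more than one parent.
parents = {
    "child term 1": {"parent term"},
    "child term 2": {"parent term", "other parent term"},
    "parent term": {"root term"},
    "other parent term": {"root term"},
    "root term": set(),
}


def ancestors(term, parent_map):
    """Collect every broader term reachable by following parent links."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parent_map.get(current, set()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


# Broader terms shared by two enriched child terms.
enriched = ["child term 1", "child term 2"]
shared = set.intersection(*(ancestors(term, parents) for term in enriched))
print(sorted(shared))  # prints ['parent term', 'root term']
```

Shared parent terms give a broader label for a group of specific enriched terms, which can make a long result list easier to summarize.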


Example 1: GO Enrichment Analysis of Platelet Proteins in Early-Stage Cancer

.image-90[slide19]

Speaker Notes


Biological interpretation of the results

.image-100[slide20]

Speaker Notes


Example 2: Identifying Potential Biomarkers of Non-Small Cell Lung Cancer Through GO Enrichment Analysis of scRNA-Seq Data

.image-100[slide21]

Speaker Notes


Results interpretation

.image-100[slide22]


Purpose and Importance

.image-100[slide23]

Speaker Notes


Thank you!

This material is the result of a collaborative work. Thanks to the Galaxy Training Network and all the contributors! Galaxy Training Network Tutorial Content is licensed under Creative Commons Attribution 4.0 International License.