Inferring single cell trajectories with Scanpy

Author(s)	Marisa Loach Wendi Bacon Julia Jakiela Mehmet Tekman
Editor(s)	Pavankumar Videm Helena Rasche
Reviewers

Overview
Questions:

How can I infer lineage relationships between single cells based on their RNA, without a time series?

Objectives:

Execute multiple plotting methods designed to identify lineage relationships between cells

Interpret these plots

Requirements:

Introduction to Galaxy Analyses

tutorial Hands-on: Generating a single cell matrix using Alevin

tutorial Hands-on: Combining single cell datasets after pre-processing

tutorial Hands-on: Filter, plot and explore single-cell RNA-seq data with Scanpy

Time estimation: 3 hours

Supporting Materials:

Datasets

Workflows

galaxy-history-input Input Histories

UseGalaxy.eu

help How to Use This

galaxy-history-answer Answer Histories

UseGalaxy.eu
2023-12-14

UseGalaxy.eu-ARCHIVED
2024-10-12

help How to Use This

instances Available on these Galaxies

Known Working

UseGalaxy.eu ✅ ⭐️

Possibly Working

UseGalaxy.org (Main)

UseGalaxy.org.au

Published: Dec 8, 2023

Last modification: Mar 11, 2025

License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT

purl PURL: https://gxy.io/GTN:T00379

rating Rating: 3.7 (3 recent ratings, 3 all time)

version Revision: 13

You’ve done all the hard work of preparing a single-cell matrix, processing it, plotting it, interpreting it, and finding lots of lovely genes. Now you want to infer trajectories, or relationships between cells… you can do that here, using the Galaxy interface, or head over to the Jupyter notebook version of this tutorial to learn how to perform the same analysis using Python.

Traditionally, we thought that differentiating or changing cells jumped between discrete states, so ‘Cell A’ became ‘Cell B’ as part of its maturation. However, most data shows otherwise. Generally, there is a spectrum (a ‘trajectory’, if you will…) of small, subtle changes along a pathway of that differentiation. Trying to analyse cells every 10 seconds can be pretty tricky, so ‘pseudotime’ analysis takes a single sample and assumes that those cells are all on slightly different points along a path of differentiation. Some cells might be slightly more mature and others slightly less, all captured at the same ‘time’. These cells are sorted accordingly along these pseudotime paths of differentiation to build a continuum of cells from one state to the next. We therefore ‘assume’ or ‘infer’ relationships from this continuum of cells.

We will use the same sample from the previous three tutorials, which contains largely T-cells in the thymus. We know T-cells differentiate in the thymus, so we would assume that we would capture cells at slightly different time points within the same sample. Furthermore, our cluster analysis alone showed different states of T-cells. Now it’s time to look further!

Comment: Tutorial from Scanpy

Please note, this tutorial is largely based on the trajectories tutorial found on the Scanpy site itself (Alex Wolf 2023).

Agenda

In this tutorial, we will cover:

Important tips for easier analysis

Prepare datasets

Get data

Filtering for T-cells

Force-directed graph

Calculate force-directed graph

Plot the FDG

Diffusion maps

Draw diffusion map

Re-calculate Nearest Neighbours

Re-draw the FDG

Partition-based Graph Abstraction (PAGA)

Re-draw force-directed graph (again!)

Exploring the results

Plotting by Genotype

Plotting Gene Expression

Diffusion pseudotime

Conclusion

Important tips for easier analysis

Tools are frequently updated to new versions. Your Galaxy may have multiple versions of the same tool available. By default, you will be shown the latest version of the tool. This may NOT be the same tool used in the tutorial you are accessing. Furthermore, if you use a newer tool in one step, and try using an older tool in the next step… this may fail! To ensure you use the same tool versions of a given tutorial, use the Tutorial mode feature.

Open your Galaxy server

Click on the curriculum icon on the top menu, this will open the GTN inside Galaxy.

Navigate to your tutorial

Tool names in tutorials will be blue buttons that open the correct tool for you

Note: this does not work for all tutorials (yet)

You can click anywhere in the grey-ed out area outside of the tutorial box to return back to the Galaxy analytical interface

Warning: Not all browsers work!

We’ve had some issues with Tutorial mode on Safari for Mac users.

Try a different browser if you aren’t seeing the button.

Did you know we have a unique Single Cell Omics Lab with all our single cell tools highlighted to make it easier to use on Galaxy? We recommend this site for all your single cell analysis needs, particularly for newer users.

The Single Cell Omics Lab is a different view of the underlying Galaxy server that organises tools and resources better for single-cell users! It also provides a platform for communities to engage and connect; distribute more targeted news and events; and highlight community-specific funding sources.

Try it out!

subdomain Europe: Single Cell Omics Lab

subdomain USA: Single Cell Omics Lab

subdomain Australia: Single Cell Omics Lab

When something goes wrong in Galaxy, there are a number of things you can do to find out what it was. Error messages can help you figure out whether it was a problem with one of the settings of the tool, or with the input data, or maybe there is a bug in the tool itself and the problem should be reported. Below are the steps you can follow to troubleshoot your Galaxy errors.

Expand the red history dataset by clicking on it.

Sometimes you can already see an error message here

View the error message by clicking on the bug icon galaxy-bug

Check the logs. Output (stdout) and error logs (stderr) of the tool are available:

Expand the history item

Click on the details icon

Scroll down to the Job Information section to view the 2 logs:

Tool Standard Output

Tool Standard Error

For more information about specific tool errors, please see the Troubleshooting section

Submit a bug report! If you are still unsure what the problem is.

Click on the bug icon galaxy-bug

Write down any information you think might help solve the problem

See this FAQ on how to write good bug reports

Click galaxy-bug Report button

Ask for help!

Where?

In the User community chatspace in Slack in our #single-cell-users channel

In the GTN Matrix Channel

In the Galaxy Matrix Channel

Browse the Galaxy Help Forum to see if others have encountered the same problem before (or post your question).

When asking for help, it is useful to share a link to your history

Prepare datasets

Get data

We’ve provided you with experimental data to analyse from a mouse dataset of fetal growth restriction Bacon et al. 2018. This is the full dataset generated from this tutorial (see the study in Single Cell Expression Atlas and the project submission). You can find the final dataset in this input history or download from Zenodo below.

Hands-on: Choose Your Own Tutorial

This is a "Choose Your Own Tutorial" (CYOT) section (also known as "Choose Your Own Analysis" (CYOA)), where you can select between multiple paths. Click one of the buttons below to select how you want to follow the tutorial

Importing via History is quickest. Works only on Galaxy EU for now.

History_import Zenodo_import

Hands On: Option 1: Data upload - Import history

Import history from: input history

Open the link to the shared history

Click on the Import this history button on the top left

Enter a title for the new history

Click on Copy History

Rename galaxy-pencil the the history to your name of choice.

Hands On: Option 2: Data upload - Add to history
Create a new history for this tutorial
Import the AnnData object from Zenodo
https://zenodo.org/record/7075718/files/Final_cell_annotated_object.h5ad
Copy the link location

Click galaxy-upload Upload Data at the top of the tool panel

Select galaxy-wf-edit Paste/Fetch Data

Paste the link(s) into the text field

Press Start

Close the window
Rename galaxy-pencil the .h5ad object as Final cell annotated object

Click on the galaxy-pencil pencil icon for the dataset to edit its attributes

In the central panel, change the Name field to Final cell annotated object

Click the Save button

Check that the datatype is h5ad

Click on the galaxy-pencil pencil icon for the dataset to edit its attributes

In the central panel, click galaxy-chart-select-data Datatypes tab on the top

In the galaxy-chart-select-data Assign Datatype, select h5ad from “New Type” dropdown

Tip: you can start typing the datatype into the field to filter the dropdown menu

Click the Save button

Filtering for T-cells

One problem with our current dataset is that it’s not just T-cells: we found in the previous tutorial that it also contains macrophages. This is a problem, because trajectory analysis will generally try to find relationships between all the cells in the sample. We need to remove those cell types to analyse the trajectory.

Hands On: Removing macrophages

Manipulate AnnData ( Galaxy version 0.7.5+galaxy1) with the following parameters:

param-file “Annotated data matrix”: Final cell annotated object (Input dataset)

“Function to manipulate the object”: Filter observations or variables

“What to filter?”: Observations (obs)

“Type of filtering?”: By key (column) values

“Key to filter”: cell_type

“Type of value to filter”: Text

“Filter”: not equal to

“Value”: Macrophages -

Rename galaxy-pencil output h5ad T-cell_object.h5ad

You should now have 8569 cells, as opposed to the 8605 you started with. You’ve only removed a few cells (the contaminants!), but it makes a big difference in the next steps.

Force-directed graph

First, we will calculate a force-directed graph (FDG), as an alternate to tSNE, which will likely work better for trajectory analysis.

Calculate force-directed graph

Hands On: Draw FDG

Scanpy RunFDG ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: T-cell_object.h5ad (output of Manipulate AnnData tool)

“Use programme defaults”: galaxy-toggle No

“Graph layout”: fa

Comment: Graph Layout

We’re using the fa or ForceAtlas2 layout for our FDGs. It is the same layout used in the Jupyter notebook version of this tutorial and works well for our data. As well as choosing the fa layout when we create the FDGs, we will also specify the draw_graph_fa embedding when drawing the plots.

Plot the FDG

And now time to plot it!

Hands On: Plot the FDG

Scanpy PlotEmbed ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: FDG object Anndata (output of Scanpy RunFDG tool)

“name of the embedding to plot”: draw_graph_fa

“color by attributes, comma separated texts”: cell_type

“Use raw attributes”: galaxy-toggle No

“Location of legend”: On data

Question

What has the FDG done to our clusters of T-cells and what might this suggest about the relationships between these groups?

Well now this is exciting! Our DP-late is more clearly separating, and we might also suppose that DP-M1, DP-M2, and DP-M3 are actually earlier on in the differentiation towards mature T-cells. And we’re only just getting started!

Open image in new tab

Figure 1: FDG Plot of our T-cell dataset

Diffusion maps

We’ll now perform an optional step, that basically takes the place of the standard Principle Component Analysis (PCA). Instead of using PCs, we can use diffusion maps.

Draw diffusion map

Hands On: Draw the Diffusion Map

Scanpy DiffusionMap ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: FDG object Anndata (output of Scanpy RunFDG tool)

“Number of diffusion components to calculate”: 15

Comment: Choosing the number of diffusion components

We could change the number of diffusion components and end up with a slightly different plot - a bit like if we changed the number of principal components used in the PCA we ran in the Filter, Plot and Explore tutorial. 15 seems to work well for this dataset and matches the number used in the Jupyter version of this tutorial, so we’ll stick with that.

Re-calculate Nearest Neighbours

Now that we have our diffusion map, we need to re-calculate neighbors using the diffusion map instead of the PCs.

Hands On: Compute neighbours using diffusion map

Scanpy ComputeGraph ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: DiffusionMap Anndata (output of Scanpy DiffusionMap tool)

“Use programme defaults”: param-toggle No

“Use the indicated representation”: X_diffmap

Comment

If you’re using the latest versions of these tools (e.g. Scanpy ComputeGraph ( Galaxy version 1.9.3+galaxy0), rather than the ones suggested in the tutorial (e.g. Scanpy ComputeGraph ( Galaxy version 1.8.1+galaxy9) then you may need to change one more parameter here to set the Number of PCs to use to 15. These are the 15 diffusion components we just calculated, rather than actual PCs.

Re-draw the FDG

Now that we’ve re-calculated the nearest neighbours, we can use these new neighbours to re-draw the FDG to see how this changes the plot.

Hands On: Plot a new FDG

Scanpy RunFDG ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: Graph object Anndata (output of Scanpy ComputeGraph tool)

“Use programme defaults”: param-toggle No

“Graph layout”: fa

Scanpy PlotEmbed ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: FDG object Anndata (output of Scanpy RunFDG tool)

“name of the embedding to plot”: draw_graph_fa

“color by attributes, comma separated texts”: cell_type

“Use raw attributes”: No

“Location of legend”: On data

Question

Does this plot seem better or worse than before? Remember that we’re trying to understand the relationships between our groups of cells in time.

Oh dear! This doesn’t look great. Maybe the DP-M4 cells are a whole other trajectory? That doesn’t seem right. Saying that, this does spread out our T-mature cells, which makes a lot more sense when it comes to T-cell biology (we expect T-cells to differentiate into two types of T-cells, Cd8+Cd4- and Cd4+Cd8-). If you wanted to, you could also re-cluster your cells (since you’ve changed the neighborhood graph on which the clusterisation depends). However, we tried that, and it called far too many clusters given the depth of sequencing in this dataset. Let’s stick with our known cell types and move from there.

Open image in new tab

Figure 2: FDG Plot after recalculating neighbours from the diffusion map

If you are working in a group, you can now divide up a decision here with one control and the rest can vary in numbers so that you can compare results throughout the tutorials.

Control

Go straight to the Partition-based Graph Abstraction (PAGA) section

Everyone else:

you could recluster your cells using Scanpy FindCluster ( Galaxy version 1.8.1+galaxy0) at a different resolution, perhaps lower than the 0.6 we used before (Take a look at the Cell clusters step in the Filter, Plot and Explore tutorial if you need help with this.) Please note that in this case, you will want to change the PAGA step Scanpy PAGA to group by louvain rather than cell_type. You can certainly still plot both, we only didn’t because with using our old Louvain calls, the cell_type and louvain categories are identical.

you could undo the optional diffusion map step by recalculating the neighbours again using X_pca instead of X_diffmap

you could also try changing the number of neighbors used in that step when running Scanpy ComputeGraph ( Galaxy version 1.8.1+galaxy9)

Everyone else: You will want to compare FREQUENTLY with your control team member.

Partition-based Graph Abstraction (PAGA)

PAGA is used to generalise relationships between groups, or likely clusters, in this case. It will make it much easier to see the trajectories between our clusters of T-cells.

Hands On: Plot PAGA

Scanpy PAGA ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: FDG object Anndata (output of Scanpy RunFDG tool)

“Name of the clustering”: cell_type

Comment: Plotting gene expression

We can now draw our PAGA plot and we might also be interested in colouring our plot by genes as well. In this case, remembering that we are dutifully counting our genes by their EnsemblIDs rather than Symbols (which do not exist for all EnsemblIDs), we have to look up our genes of interest (CD4, CD8a) and plot the corresponding IDs in the next step.

Scanpy PlotTrajectory ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: PAGA object Anndata (output of Scanpy PAGA tool)

“Layout functions”: ForceAtlas2

“Location of legend”: On data

“Use programme defaults”: param-toggle No

“Name of cell annotation or gene that is used to color the nodes”: cell_type,ENSMUSG00000023274,ENSMUSG00000053977

Comment: Choosing the layout

We’re going to pick the ForceAtlas2 layout function for our PAGA plots as this is the same type of layout that we used for our FDG.

Question

How have the relationships between our cell clusters changed now?

Which clusters are expressing our genes of interest, Cd4 and Cd8, at the highest levels?

The way the clusters are arranged has changed a bit now. The M4 cluster is right in the middle of the M1-3 clusters, rather than heading off on its own. The M1 cluster is looking like it is driving towards differentiation, which is not something we had necessarily been able to specify before by just looking at our cluster graphs or applying our biological knowledge.

Cd4 and Cd8 expression appear highest in the DP-L cluster. The expression of both Cd4 and Cd8 also appears higher than we might expect in the DP-M4 cluster - perhaps this is a sign that it is closer to the DP-L cluster than it seems in this simple plot.

Open image in new tab

Figure 3: PAGA plots coloured by cell type, Cd4 expression, and Cd8 expression

Re-draw force-directed graph (again!)

Force-directed graphs can be initialised randomly, or we can prod it in the right direction. We’ll prod it with our PAGA calculations. Note that you could also try prodding it with tSNE or UMAP. A lot of these tools can be used on top of each other or with each other in different ways, this tutorial is just one example. Similarly, you could be using any obs information for grouping, so could do this for louvain or cell_type for instance.

Hands On: Initialise FDG using PAGA

Scanpy RunFDG ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: Plotted PAGA Anndata (output of Scanpy PlotTrajectory tool)

“Use programme defaults”: param-toggle No

“Method to initialise embedding, any key for adata.obsm or choose from the preset methods”: paga

“Graph layout”: fa

Scanpy PlotEmbed ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: FDG object Anndata (output of Scanpy RunFDG tool)

“name of the embedding to plot”: draw_graph_fa

“color by attributes, comma separated texts”: cell_type

“Use raw attributes”: No

“Location of legend”: On data

Question

How has basing our FDG on the PAGA plot changed the relationships between our cells?

The interesting change here occurs between the Double Negative (DN) and Double Positive Module 4 (DP-M4) cells. Our DP-M4 cells are now heading on a clear trajectory towards differentiation. It looks like we’ve got the correct ordering of cells from DN through the DP groups and on towards T-mature. We didn’t see this in our previous plots.

Open image in new tab

Figure 4: FDG plot initiated from the PAGA plot

Exploring the results

The experiment that produced this data used two different groups of mice - the control or wildtype group and the knockout mice that were missing a gene involved in placental development, which impacted thymus development. Since we know the genotype of the mice from which each sample was collected, we can colour in our plots to see if there are any differences in the cells present in wildtype and knockout mice.

Plotting by Genotype

The easiest way to do this is just to rerun galaxy-refresh the previous step, but change the attribute we want to use to colour the FDG plot.

Hands On: Plot by genotype

Scanpy PlotEmbed ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: FDG object Anndata (output of Scanpy RunFDG tool)

“name of the embedding to plot”: draw_graph_fa

“color by attributes, comma separated texts”: genotype

“Use raw attributes”: No

“Location of legend”: Right margin

Question

Are there any differences in the distribution of the wildtype and knockout cells?

We’re seeing a clear trajectory issue whereby the knockout cells are not found along the trajectory into T-mature (which, well, we kind of already figured out with just the cluster analysis, but we can feel even more confident about our results!)

Open image in new tab

Figure 5: FDG plot coloured by genotype

Plotting Gene Expression

We’re also interested in the expression of the two genes that are known to be markers of the two different types of mature T-cells: Cd4 and Cd8. We can colour in our plot to show which cells are expressing these genes.

Hands On: Plot for gene expression

Scanpy PlotEmbed ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: FDG object Anndata (output of Scanpy RunFDG tool)

“name of the embedding to plot”: draw_graph_fa

“color by attributes, comma separated texts”: ENSMUSG00000023274,ENSMUSG00000053977

“Use raw attributes”: No

“Location of legend”: On data

Comment: Gene Symbols

We’re using the EnsemblIDs during this tutorial, as discussed above. If you like, you could change the names of these plots to the gene symbols by filling in the optional “Figure title” field with Cd4,Cd8. Make sure that the order of your figure titles matches the order of the EnsemblIDs in the colour by field. ENSMUSG00000023274 is Cd4 and ENSMUSG00000053977 is Cd8.

Question

Does the expression pattern of these genes tell us anything about our cells?

It’s clear that both Cd4 and Cd8 are being expressed mainly in the later stages of T-cell development, as they head towards the DP-L cluster - although Cd4 expression is a bit more widespread. This is what we would expect to see in genes that are associated with mature T-cells. You might also be able to spot some differences in the expression of the two genes in our mature T-cell group, but there doesn’t seem to be a very clear division between the cells that express Cd4 and those that express Cd8. This is a bit disappointing, as we know that there are two types of mature T-cells, which each express a different gene.

Open image in new tab

Figure 6: FDG plot showing expression of Cd4 and Cd8

Diffusion pseudotime

Now that we have a reasonable FDG plot for our cells, based on the diffusion map (if used) and PAGA plot, we can place our cells into pseudotime. Pseudotime lets us imagine that instead of looking at a sample of cells taken at a single timepoint, we are looking at cells moving through time. Our sample included cells at different stages of their development, but we can use pseudotime to think of these as different timepoints in the journey of individual cells.

We know that our cells are initialising at DN. We can feed that information into our algorithms by naming DN as the root cell type to then calculate a trajectory starting from these cells.

If you called new clusters using Scanpy FindCluster ( Galaxy version 1.8.1+galaxy0), you might want to choose one of those clusters to be your root cell instead, so change the cell_type for louvain and then name the cluster number. Use the plots you created to help you pick the number!

On to the diffusion pseudotime, where we infer multiple time points within the same piece of data!

Hands On: DPT Plot

Scanpy DPT ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: FDG object Anndata (output of Scanpy RunFDG tool)

“Name of attribute that defines clustering”: cell_type

“Name of the clustering that defines the root cell type”: DN

Scanpy PlotEmbed ( Galaxy version 1.8.1+galaxy9) with the following parameters:

param-file “Input object in AnnData/Loom format”: Diffusion pseudotime inference Anndata (output of Scanpy DPT tool)

“name of the embedding to plot”: draw_graph_fa

“color by attributes, comma separated texts”: cell_type,dpt_pseudotime

“Use raw attributes”: No

“Location of legend”: On data

Question

Does the pseudotime plot match with your expectations of which cells represent earlier or later stages of T-cell development?

When we look at the cell type and DPT plots, we can see that there’s a clear progression from our root DN cells, through the various groups of DP cells, into DP-L and then T-mat. This matches with our expectations that the DP-L and DP-mat clusters represent later stages in T-cell development. The DPT plot also confirms that the DP-M4 cluster is heading towards differentiation, which makes sense given its position on our FDG plot.

Open image in new tab

Figure 7: FDG plots showing cell types and pseudotime

This is nice, as it supports our conclusions thus far on the trajectory of the T-cell differentiation. With single-cell, the more ways you can prove to yourself what you’re seeing is real, the better! If we did not find consistent results, we would need to delve in further to see if the cause is the algorithm (not all algorithms fit all data!) or the biology.

Where might we go from here? We might consider playing with our louvain resolutions, to see if we can get the two groups of Cd4+ and Cd8+ cells to be called as different clusters, and then comparing them to each other for gene differences or genotype differences. We might also use different objects (for instance, what if we regressed out cell cycle genes?) and see if that changes the results. What would you do?

Look at each others images! How do yours differ, what decisions were made? Previously, when calling clusters in the Filter, Plot and Explore Single-cell RNA-seq Data tutorial, the interpretation at the end is largely consistent, no matter what decisions are made throughout (mostly!). Is this the case with your trajectory analyses? You may find that it is not, which is why pseudotime analysis even more crucially depends on your understanding of the underlying biology (we have to choose the root cells, for instance, or recognise that DN cells should not be found in the middle of the DPs) as well as choosing the right analysis. That’s why it is a huge field! With analysing scRNA-seq data, it’s almost like you need to know about 75% of your data and make sure your analysis shows that, for you to then identify the 25% new information.

Conclusion

Congratulations! You’ve made it to the end! You might be interested in the workflowworkflow for this tutorial or this galaxy-historyExample History which shows the results you should expect to see if you follow this tutorial.

In this tutorial, you moved from called clusters to inferred relationships and trajectories using pseudotime analysis. You found an alternative to PCA (diffusion map), an alternative to tSNE (force-directed graph), a means of identifying cluster relationships (PAGA), and a metric for pseudotime (diffusion pseudotime) to identify early and late cells. If you were working in a group, you found that such analysis is slightly more sensitive to your decisions than the simpler filtering/plotting/clustering is. We are inferring and assuming relationships and time, so that makes sense!

You've Finished the Tutorial

Key points

Trajectory analysis is less robust than pure plotting methods, as such ‘inferred relationships’ are a bigger mathematical leap

As always with single-cell analysis, you must know enough biology to deduce if your analysis is reasonable, before exploring or deducing novel insight

Frequently Asked Questions

Have questions about this tutorial? Have a look at the available FAQ pages and support channels

Useful literature

Further information, including links to documentation and original publications, regarding the tools, analysis techniques and the interpretation of results described in this tutorial can be found here.

References

Bacon, W. A., R. S. Hamilton, Z. Yu, J. Kieckbusch, D. Hawkes et al., 2018 Single-Cell Analysis Identifies Thymic Maturation Delay in Growth-Restricted Neonatal Mice. Frontiers in Immunology 9: 10.3389/fimmu.2018.02523
Alex Wolf, S. R., Fidel Ramirez, 2023 Trajectory inference for hematopoiesis in mouse — scanpy-tutorials 1.4.7.dev38+g3e4e434 documentation — scanpy-tutorials.readthedocs.io. [Accessed 11-09-2024]. https://scanpy.readthedocs.io/en/stable/tutorials/trajectories/paga-paul15.html

Feedback

Did you use this material as an instructor? Feel free to give us feedback on how it went.
Did you use this material as a learner or student? Click the form below to leave feedback.

Citing this Tutorial

Marisa Loach, Wendi Bacon, Julia Jakiela, Mehmet Tekman, Inferring single cell trajectories with Scanpy (Galaxy Training Materials). https://training.galaxyproject.org/training-material/topics/single-cell/tutorials/scrna-case_trajectories/tutorial.html Online; accessed TODAY
Hiltemann, Saskia, Rasche, Helena et al., 2023 Galaxy Training: A Powerful Framework for Teaching! PLOS Computational Biology 10.1371/journal.pcbi.1010752
Batut et al., 2018 Community-Driven Data Analysis Training for Biology Cell Systems 10.1016/j.cels.2018.05.012

@misc{single-cell-scrna-case_trajectories,
author = "Marisa Loach and Wendi Bacon and Julia Jakiela and Mehmet Tekman",
	title = "Inferring single cell trajectories with Scanpy (Galaxy Training Materials)",
	year = "",
	month = "",
	day = "",
	url = "\url{https://training.galaxyproject.org/training-material/topics/single-cell/tutorials/scrna-case_trajectories/tutorial.html}",
	note = "[Online; accessed TODAY]"
}
@article{Hiltemann_2023,
	doi = {10.1371/journal.pcbi.1010752},
	url = {https://doi.org/10.1371%2Fjournal.pcbi.1010752},
	year = 2023,
	month = {jan},
	publisher = {Public Library of Science ({PLoS})},
	volume = {19},
	number = {1},
	pages = {e1010752},
	author = {Saskia Hiltemann and Helena Rasche and Simon Gladman and Hans-Rudolf Hotz and Delphine Larivi{\`{e}}re and Daniel Blankenberg and Pratik D. Jagtap and Thomas Wollmann and Anthony Bretaudeau and Nadia Gou{\'{e}} and Timothy J. Griffin and Coline Royaux and Yvan Le Bras and Subina Mehta and Anna Syme and Frederik Coppens and Bert Droesbeke and Nicola Soranzo and Wendi Bacon and Fotis Psomopoulos and Crist{\'{o}}bal Gallardo-Alba and John Davis and Melanie Christine Föll and Matthias Fahrner and Maria A. Doyle and Beatriz Serrano-Solano and Anne Claire Fouilloux and Peter van Heusden and Wolfgang Maier and Dave Clements and Florian Heyl and Björn Grüning and B{\'{e}}r{\'{e}}nice Batut and},
	editor = {Francis Ouellette},
	title = {Galaxy Training: A powerful framework for teaching!},
	journal = {PLoS Comput Biol}
}

                   

Congratulations on successfully completing this tutorial!

Do you want to extend your knowledge?
Follow one of our recommended follow-up trainings:

tutorial Hands-on: Importing files from public atlases

slides Slides: GO Enrichment Analysis on Single-Cell RNA-Seq Data

tutorial Hands-on: GO Enrichment Analysis on Single-Cell RNA-Seq Data

You can use Ephemeris's shed-tools install command to install the tools used in this tutorial.

shed-tools install [-g GALAXY] [-a API_KEY] -t <(curl https://training.galaxyproject.org/training-material/api/topics/single-cell/tutorials/scrna-case_trajectories/tutorial.json | jq .admin_install_yaml -r)

Alternatively you can copy and paste the following YAML

---
install_tool_dependencies: true
install_repository_dependencies: true
install_resolver_dependencies: true
tools:
- name: scanpy_compute_graph
  owner: ebi-gxa
  revisions: 6417cccad133
  tool_panel_section_label: Single-cell
  tool_shed_url: https://toolshed.g2.bx.psu.edu/
- name: scanpy_compute_graph
  owner: ebi-gxa
  revisions: 22bd5861706d
  tool_panel_section_label: Single-cell
  tool_shed_url: https://toolshed.g2.bx.psu.edu/
- name: scanpy_find_cluster
  owner: ebi-gxa
  revisions: ad6995afd2e6
  tool_panel_section_label: Single-cell
  tool_shed_url: https://toolshed.g2.bx.psu.edu/
- name: scanpy_plot_embed
  owner: ebi-gxa
  revisions: 35f6611fb8c3
  tool_panel_section_label: Single-cell
  tool_shed_url: https://toolshed.g2.bx.psu.edu/
- name: scanpy_plot_trajectory
  owner: ebi-gxa
  revisions: f489ccdd423c
  tool_panel_section_label: Single-cell
  tool_shed_url: https://toolshed.g2.bx.psu.edu/
- name: scanpy_run_diffmap
  owner: ebi-gxa
  revisions: e2ca1846080f
  tool_panel_section_label: Single-cell
  tool_shed_url: https://toolshed.g2.bx.psu.edu/
- name: scanpy_run_dpt
  owner: ebi-gxa
  revisions: 2e06dabb3f2f
  tool_panel_section_label: Single-cell
  tool_shed_url: https://toolshed.g2.bx.psu.edu/
- name: scanpy_run_fdg
  owner: ebi-gxa
  revisions: 7c6581b57ad4
  tool_panel_section_label: Single-cell
  tool_shed_url: https://toolshed.g2.bx.psu.edu/
- name: scanpy_run_paga
  owner: ebi-gxa
  revisions: 4e7d55438238
  tool_panel_section_label: Single-cell
  tool_shed_url: https://toolshed.g2.bx.psu.edu/
- name: anndata_manipulate
  owner: iuc
  revisions: 3d748954434b
  tool_panel_section_label: Single-cell
  tool_shed_url: https://toolshed.g2.bx.psu.edu/

t{ hist[0] | to_stars }} 2

t{ hist[0] | to_stars }} 1

September 2024

3 stars: Liked: I liked it, I just think I could do better. Disliked: my patience

4 stars: Liked: The instructions! Disliked: Getting started was the normal nightmare - being in the right place, however the members of the breakout room I was in were super helpful

4 stars: Liked: The illustrations of what to expect with each graph output allowed me to trace back an error I'd made that still allowed the process to be run, and so didn't flag as an error.