Phylogenetics - Back to Basics - Introduction

Published: May 10, 2024
Last Updated: May 10, 2024

Hand drawn phylogenetic tree from Charles Darwin’s notebook with clades A, B, C, D branching from a common ancestor. Prefaced with handwriting that reads ‘I think’ and annotated with other illegible notes.

Charles Darwin, 1837 notebook entry

Model of the external structure of the SARS-CoV-2 virion. Source: https://en.wikipedia.org/wiki/Coronavirus; CC BY-SA 4.0

Two-part figure showing a phylogenetic tree of SARS-CoV-2 strains on the left and the genomic epidemiology of SARS-CoV-2, with subsampling focused globally over the 6 months leading up to November 2023, on the right. Source: https://nextstrain.org/ncov/gisaid/global/6m

COVID-19, latest tree from Nextstrain

Phylogenetic tree for SARS-CoV-2 strains coloured by strain grouping

Terminology

Schematic of a phylogenetic tree where features such as nodes/taxa, edges/branches are annotated and colour coded. The root of the tree is at the top of the image and the tree branches into two clades as you move towards the bottom of the image. The clades are formed of hypothetical common ancestors and five extant taxa which are labelled as the ‘in group’. Two additional taxa are appended to the right hand side of the tree and are labelled as the outgroup.
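The terms in this schematic can be made concrete with a small data structure. The following is a minimal sketch, assuming a rooted tree stored as nested tuples in which a string is a leaf (an extant taxon) and a tuple is an internal node (a hypothetical common ancestor). The taxon names `A` to `E`, `X` and `Y`, and the exact branching order, are illustrative placeholders and are not taken from the figure.

```python
# Hypothetical example tree: five in-group taxa plus a two-taxon outgroup.
# A string is a leaf (extant taxon); a tuple is an internal node
# (hypothetical common ancestor). The outermost tuple is the root.
ingroup = (("A", "B"), ("C", ("D", "E")))
outgroup = ("X", "Y")
tree = (ingroup, outgroup)


def leaves(node):
    """Return the taxa (leaf labels) below a node, left to right."""
    if isinstance(node, str):
        return [node]
    found = []
    for child in node:
        found.extend(leaves(child))
    return found


def count_internal(node):
    """Count internal nodes (ancestors), including the node itself."""
    if isinstance(node, str):
        return 0
    return 1 + sum(count_internal(child) for child in node)


def to_newick(node):
    """Write the topology in Newick notation, without branch lengths."""
    if isinstance(node, str):
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"


print(leaves(ingroup))         # taxa in the in-group clade
print(count_internal(tree))    # number of ancestral nodes, root included
print(to_newick(tree) + ";")   # text form commonly read by tree viewers
```

Each tuple corresponds to a clade: calling `leaves` on any internal node lists exactly the taxa that descend from that ancestor.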

Phylogenetic tree of hexapods

Circular phylogenetic tree of hexapods (insects). Clades are colour coded and labelled with common names, e.g. ‘Fleas’. Silhouettes of representative species are shown around the outside of the tree. Source: https://doi.org/10.1371/journal.pone.0109085; CC BY 4.0 license

Sequence alignment

Screenshot of sequence visualisation output from Galaxy. Fifteen Anolis DNA sequences are arranged in rows. The nucleotides are colour coded and arranged in columns: A (blue), T (green), C (pink), G (orange). The top half of the image shows approximately 50 bases of each sequence. The lower half of the image shows a zoomed-out, heatmap-like image of a larger portion of the sequences.
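Because an alignment places comparable sites in the same column, two sequences can be compared column by column. As a minimal sketch, the function below computes the proportion of differing sites (often called the p-distance), skipping any column where either sequence has a gap. The three short sequences and the name `p_distance` are made up for illustration; they are not the Anolis data, and a real analysis would normally use a substitution model rather than raw proportions.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites among gap-free aligned columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore columns with a gap in either sequence
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return differing / compared


# Hypothetical toy alignment (not real data).
alignment = {
    "taxon1": "ACGTACGTAC",
    "taxon2": "ACGTACGTTC",
    "taxon3": "ACG-ACGAAC",
}

names = sorted(alignment)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        d = p_distance(alignment[first], alignment[second])
        print(first, second, round(d, 3))
```

Collecting these pairwise values for every pair of sequences gives a distance matrix of the kind used by distance-based tree building.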

Building trees from distances

Flow chart illustrating how sequence alignment data or dis/similarity measures are used to calculate phylogenetic distances and build a tree. Colours and shapes distinguish the sections of the flowchart, guiding the viewer through each step from left to right. The flowchart begins with Sequence Alignment or Dis/Similarity Measures. These are used to form a distance matrix (D), which is used to select two nodes (x and y) that are joined to form a new node z. The distance matrix is updated with the new node z, and the process repeats until no further nodes can be formed.
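The loop in the flowchart (select x and y, form z, update D, repeat) can be sketched in a few lines. The example below uses UPGMA-style average linkage as one simple instance of this scheme; this is an assumption made for illustration, and other distance methods such as neighbour joining follow the same loop with different rules for choosing the pair and updating D. The function name `upgma`, the taxon names, and the distances are all hypothetical, and branch lengths are omitted to keep the sketch short.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by average linkage and return a Newick topology.

    dist maps frozenset({a, b}) to the distance between labels a and b.
    """
    sizes = {name: 1 for name in names}  # cluster label -> number of leaves
    d = dict(dist)                       # working copy of the matrix D
    while len(sizes) > 1:
        # Select the two closest nodes x and y.
        x, y = min(combinations(sorted(sizes), 2),
                   key=lambda pair: d[frozenset(pair)])
        # Form the new node z that joins them.
        z = f"({x},{y})"
        # Update D: distance from z to each other node is the
        # size-weighted average of the distances from x and from y.
        for other in sizes:
            if other in (x, y):
                continue
            d[frozenset((z, other))] = (
                sizes[x] * d[frozenset((x, other))]
                + sizes[y] * d[frozenset((y, other))]
            ) / (sizes[x] + sizes[y])
        sizes[z] = sizes[x] + sizes[y]
        del sizes[x], sizes[y]
    return next(iter(sizes)) + ";"


# Hypothetical distances between four taxa (not real data).
example = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
    frozenset(("A", "D")): 10.0,
    frozenset(("B", "D")): 10.0,
    frozenset(("C", "D")): 10.0,
}

print(upgma(["A", "B", "C", "D"], example))
```

With these toy distances, A and B are joined first because they are the closest pair, then C joins that cluster, and D is attached last.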

Searching for trees

Screenshot of a phylogenetic tree of Anolis species. The root of the tree is on the left and the species are listed vertically on the right. The tree consists of multiple branching events and clades and includes bootstrap values.

Phylogenetic Networks

Screenshot of a phylogenetic network of Anolis species. The root of the network is at the centre of the image and clades radiate outwards forming a circular network.

Let’s begin


Thank you!

This material is the result of collaborative work. Thanks to the Galaxy Training Network and all the contributors! Tutorial content is licensed under the Creative Commons Attribution 4.0 International License.