DNA sequence data has become an indispensable tool for molecular biology and evolutionary biology. Studies in these fields now require a genome sequence to work from, called a 'reference sequence', and one has to be built for each species. This is done by genome assembly. De novo genome assembly is the process of reconstructing the original DNA sequence from the sequenced fragments (reads) alone.
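To make the idea of reconstructing a sequence from reads alone more concrete, here is a minimal sketch in Python. It uses greedy overlap merging, which is a toy stand-in chosen for brevity and not the de Bruijn graph approach used by assemblers such as Velvet or SPAdes. The function names (`overlap`, `greedy_assemble`), the `min_len` parameter, and the example reads are all hypothetical, and the sketch assumes error-free reads from a single strand with no repeats.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Toy assumptions: reads are error-free, come from one strand,
    and the underlying sequence contains no repeats.
    """
    # Remove exact duplicates, then reads fully contained in another read.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs
               if not any(r != other and r in other for other in contigs)]

    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            # Nothing left to merge: what remains are the contigs.
            return contigs
        # Join the two sequences, keeping the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping reads sampled from the made-up sequence ATGGCCTAAGTCGA.
reads = ["AGTCGA", "ATGGCCT", "CCTAAGT"]
print(greedy_assemble(reads))  # ['ATGGCCTAAGTCGA']
```

Real data breaks each of the toy assumptions: reads contain sequencing errors, come from both strands, and genomes contain repeats longer than the reads. This is a large part of why practical assemblers rely on graph-based methods and error correction rather than simple pairwise merging.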
Before diving into this topic, we recommend that you have a look at:
|Lesson||Slides||Hands-on||Input dataset||Workflows||Galaxy tour||Galaxy instances|
|De Bruijn Graph Assembly||slides||tutorial||zenodo_link||-||-||instances|
|Introduction to Genome Assembly||slides||tutorial||zenodo_link||workflow||interactive_tour||instances|
|Making sense of a newly assembled genome||-||tutorial||zenodo_link||-||interactive_tour||instances|
| ||slides||tutorial||zenodo_link||workflow||interactive_tour||-|
You can use a public Galaxy instance that has been tested for the availability of the tools used in these tutorials. These instances are listed alongside the tutorials above.
This material is maintained by:
For any questions about this topic and its content, you can contact them or visit our Gitter channel.
This material was contributed to by:
D.R. Zerbino and E. Birney: Velvet: algorithms for de novo short read assembly using de Bruijn graphs.
Velvet: Sequence assembler for very short reads
Daniel R. Zerbino, Gayle K. McEwen, Elliott H. Margulies, Ewan Birney: Pebble and Rock Band: Heuristic Resolution of Repeats and Scaffolding in the Velvet Short-Read de Novo Assembler
Heuristic Resolution of Repeats and Scaffolding in the Velvet Short-Read de Novo Assembler
Anton Bankevich, Sergey Nurk, Dmitry Antipov, Alexey A. Gurevich, Mikhail Dvorkin, Alexander S. Kulikov, Valery M. Lesin, Sergey I. Nikolenko, Son Pham, Andrey D. Prjibelski, Alexey V. Pyshkin, Alexander V. Sirotkin, Nikolay Vyahhi, Glenn Tesler, Max A. Alekseyev, and Pavel A. Pevzner: SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing
SPAdes: assembler for both single-cell and standard (multicell) assembly.
Andrews, Simon: FastQC: a quality control tool for high throughput sequence data.
FastQC: quality control tool
Ryan R. Wick, Louise M. Judd, Claire L. Gorrie, Kathryn E. Holt: Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads
Unicycler: a tool for bacterial genome assembly.
Alexey Gurevich, Vladislav Saveliev, Nikolay Vyahhi, Glenn Tesler: QUAST: quality assessment tool for genome assemblies
QUAST: a quality assessment tool for evaluating and comparing genome assemblies.
Torsten Seemann: Prokka: rapid prokaryotic genome annotation
Prokka: a software tool that fully annotates a draft bacterial genome in about 10 minutes.
Sergey I Nikolenko, Anton I Korobeynikov and Max A Alekseyev: BayesHammer: Bayesian clustering for error correction in single-cell sequencing
BayesHammer: error correction tool including novel algorithms based on Hamming graphs and Bayesian subclustering.
Bruce J. Walker, Thomas Abeel, Terrance Shea, Margaret Priest, Amr Abouelliel, Sharadha Sakthikumar, Christina A. Cuomo, Qiandong Zeng, Jennifer Wortman, Sarah K. Young, Ashlee M. Earl: Pilon: An Integrated Tool for Comprehensive Microbial Variant Detection and Genome Assembly Improvement
Pilon: fully automated, all-in-one tool for correcting draft assemblies and calling sequence variants of multiple sizes, including very large insertions and deletions.