De novo sequence assemblers
De novo sequence assemblers are programs that assemble short nucleotide or
amino acid sequences into longer sequences without the use of a reference sequence. They
are most commonly used in bioinformatic studies to assemble genomes or transcriptomes (link
to de novo transcriptome assembly).
Types of de novo assemblers
Two types of algorithms are commonly used by these assemblers: greedy algorithms
(linky), which aim for local optima, and de Bruijn graph algorithms (linky), which aim
for global optima. Different assemblers are tailored to particular needs, such as the
assembly of small bacterial genomes, large eukaryotic genomes, or transcriptomes
(linky).
Greedy algorithm assemblers find local optima in alignments of smaller reads: at each
step they merge the reads with the best overlap, without reconsidering earlier merges.
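The greedy strategy can be illustrated with a minimal sketch. It is not the method of any particular assembler: the function names, the `min_len` overlap threshold, and the toy reads are assumptions made for illustration, and sequencing errors and reverse complements are ignored.

```python
from itertools import permutations


def overlap(a, b, min_len=1):
    """Return the length of the longest suffix of a that is a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_len, best_a, best_b = 0, None, None
        for a, b in permutations(reads, 2):
            n = overlap(a, b, min_len)
            if n > best_len:
                best_len, best_a, best_b = n, a, b
        if best_len == 0:
            break  # no remaining overlaps: the reads stay as separate contigs
        reads.remove(best_a)
        reads.remove(best_b)
        reads.append(best_a + best_b[best_len:])  # locally best merge
    return reads


contigs = greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTA"])
```

Because each merge is chosen only for being the best available at that step, repeats in the sequence can lead this kind of approach to a misassembly that a later step never revisits.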
De Bruijn graph assemblers build a de Bruijn graph (linky) to guide the assembly.
During construction of the graph, reads are broken into smaller fragments of a
specified size, k. These k-mers then become "nodes" that are connected by "edges."
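The graph construction described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used by any named assembler: k-mers are treated as nodes, an edge is drawn between k-mers that are adjacent within a read, and the helper names (`kmers`, `build_graph`, `walk_unambiguous_path`) are hypothetical.

```python
def kmers(read, k):
    """Break a read into all of its overlapping fragments of length k."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_graph(reads, k):
    """Map each k-mer (node) to the set of k-mers that follow it (edges)."""
    graph = {}
    for read in reads:
        fragments = kmers(read, k)
        for fragment in fragments:
            graph.setdefault(fragment, set())  # register every node
        for left, right in zip(fragments, fragments[1:]):
            graph[left].add(right)  # adjacent k-mers overlap by k - 1 bases
    return graph


def walk_unambiguous_path(graph, start):
    """Extend a contig from start while each node has exactly one successor."""
    contig, node, seen = start, start, {start}
    while len(graph[node]) == 1:
        (nxt,) = graph[node]
        if nxt in seen:
            break  # stop on a cycle
        contig += nxt[-1]  # each step adds one new base
        seen.add(nxt)
        node = nxt
    return contig


graph = build_graph(["ATGGC", "GGCGT"], k=3)
contig = walk_unambiguous_path(graph, "ATG")
```

In this toy example the two reads share the k-mer GGC, so they are joined through that node without ever being aligned to each other directly.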
Common programs
SPAdes (linky)
SPAdes is a de Bruijn graph assembler that is designed to assemble
small genomes, such as bacterial genomes. It uses a multisized de Bruijn
graph to guide assembly.
Ray (linky)
Ray is a suite of programs that includes Ray (de novo assembly of single
genomes), RayMeta (de novo assembly of metagenomes), RayCommunities
(microbe abundance and taxonomic profiling), RayOntologies (gene ontology
profiling), and RaySurveyor (comparison of genomic content between samples).
Ray also has a web interface, called Ray Cloud Browser.
ABySS (linky)
ABySS is a de novo, parallel, paired-end sequence assembler designed for the
assembly of short reads. There are two versions: ABySS (genomic) and Trans-ABySS
(transcriptomic).
AllPaths-LG (linky)
Trinity (linky)
The Assemblathon
The Assemblathon is a periodic, collaborative effort to test and improve the numerous
assemblers available. Thus far, two Assemblathons have been completed (2011, linky and 2013,
linky) and a third is in progress (linky). Teams of researchers from across the world choose
a program and assemble genomes of model organisms whose genomes have been previously assembled
and annotated. The assemblies are then compared and evaluated using numerous metrics.
Assemblathon 1
- Participants/software
-
- Results of selected metrics
- N50 analysis
- Fragment analysis
- Gene length analysis
- Bacterial contamination
- PCA of metrics
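Of the metrics listed above, N50 is simple enough to show directly. N50 is the contig length at which contigs of that length or longer account for at least half of the total assembly length. The sketch below assumes the input is a plain list of contig lengths; the function name and example values are illustrative and are not taken from the Assemblathon results.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given the lengths of its contigs."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length  # accumulate from the longest contig downward
        if running >= half_total:
            return length


example = n50([80, 70, 50, 40, 30, 20])
```

A larger N50 indicates a more contiguous assembly, but it says nothing about whether the contigs are correct, which is one reason it is reported alongside other metrics.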
Assemblathon 2
- Participants/software
-
- Results of selected metrics