
Exploiting sparseness in de novo genome assembly

Chengxi Ye, Zhanshan Sam Ma, Charles H. Cannon, Mihai Pop, Douglas W. Yu

Research output: Contribution to journal, Article, peer-review

182 Citations (Scopus)

Abstract

Background:
The very large memory requirements for the construction of assembly graphs for de novo genome assembly limit current algorithms to super-computing environments.

Methods:
In this paper, we demonstrate that constructing a sparse assembly graph which stores only a small fraction of the observed k-mers as nodes and the links between these nodes allows the de novo assembly of even moderately-sized genomes (~500 M) on a typical laptop computer.
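
The idea of keeping only a fraction of the k-mers can be illustrated with a minimal sketch. It is not SparseAssembler's actual algorithm: the fixed sampling step `g`, the function names, and the choice to restart sampling at the first base of every read are all assumptions made for illustration. Each kept k-mer becomes a node, and each link records the bases that lead from one kept k-mer to the next, so no sequence is lost between nodes.

```python
def dense_kmers(reads, k):
    """All distinct k-mers in the reads, i.e. the node set of a full de Bruijn graph."""
    return {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}


def sparse_kmer_graph(reads, k, g):
    """Keep one k-mer every g bases along each read and link consecutive kept k-mers.

    Returns a dict mapping each kept k-mer to a set of (next_kmer, extension)
    pairs, where extension is the g bases that extend the node to reach next_kmer.
    Illustrative assumption: sampling restarts at offset 0 of every read.
    """
    graph = {}
    for read in reads:
        prev = None  # start offset of the previously kept k-mer in this read
        for start in range(0, len(read) - k + 1, g):
            kmer = read[start:start + k]
            graph.setdefault(kmer, set())
            if prev is not None:
                extension = read[prev + k:start + k]
                graph[read[prev:prev + k]].add((kmer, extension))
            prev = start
    return graph


# Hypothetical toy input: a single error-free read.
reads = ["ACGTTGCAAGGCTTAC"]
print(len(dense_kmers(reads, k=5)), "k-mers in the dense node set")
print(len(sparse_kmer_graph(reads, k=5, g=4)), "k-mers kept as sparse nodes")
```

With a step of `g`, roughly one in every `g` k-mers of a read is stored as a node, which is where the memory saving in this toy version comes from. A practical assembler must also make the sampled k-mers agree across overlapping reads, which this sketch does not attempt.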

Results:
We implement this sparse graph concept in a proof-of-principle software package, SparseAssembler, utilizing a new sparse k-mer graph structure evolved from the de Bruijn graph. We test our SparseAssembler with both simulated and real data, achieving ~90% memory savings and retaining high assembly accuracy, without sacrificing speed in comparison to existing de novo assemblers.

Original language: English
Article number: S1
Number of pages: 8
Journal: BMC Bioinformatics
Volume: 13
Issue number: Suppl 6
Publication status: Published - 19 Apr 2012
