Fast de novo genome assembly using hierarchical indexing
An indexed-reads approach can dramatically reduce the runtime of de novo genome assembly. The goal of this project is to establish this as a viable approach, making de novo genome assembly fast and inexpensive – thereby enabling new biology. Key participants on this project are Dr. C-S Chin and Dr. A Khalak.
- May 2019: Shared this concept at the 2019 SFAF Meeting with a presentation.
- July 2019: Preprint published on bioRxiv
- Aug 2019: This work was reported in GenomeWeb
- Oct 2019: This work was integrated into a human genome haplotype-phasing pipeline (preprint)
- Oct 2019: This work was applied towards obtaining a phased assembly of the human MHC region. Poster at ASHG and presentation slides
- Oct 2019: The method and de novo performance were presented at ASHG – link to slides
The tool has been demonstrated on a few genomes and is shared under a Creative Commons license. Using it well still requires some assembly experience, but we are actively working to make it easier to use and adopt.
The source code for this project is available here, which includes some usage guidelines.
There is also a Docker image for the project on Docker Hub here, which lets potential users try out and use the tool without having to build it from source.