Succinct colored de Bruijn graphs

Martin D. Muggli, Alexander Bowe, Noelle R. Noyes, Paul S. Morley, Keith E. Belk, Robert Raymond, Travis Gagie, Simon J. Puglisi, Christina Boucher

Research output: Contribution to journal > Article > peer-review

70 Scopus citations

Abstract

Motivation: In 2012, Iqbal et al. introduced the colored de Bruijn graph, a variant of the classic de Bruijn graph, which is aimed at 'detecting and genotyping simple and complex genetic variants in an individual or population'. Because they are intended to be applied to massive population level data, it is essential that the graphs be represented efficiently. Unfortunately, current succinct de Bruijn graph representations are not directly applicable to the colored de Bruijn graph, which requires additional information to be succinctly encoded as well as support for non-standard traversal operations.

Results: Our data structure dramatically reduces the amount of memory required to store and use the colored de Bruijn graph, with some penalty to runtime, allowing it to be applied in much larger and more ambitious sequence projects than was previously possible.

Availability and Implementation: https://github.com/cosmo-team/cosmo/tree/VARI

Contact: martin.muggli@colostate.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Original language: English (US)
Pages (from-to): 3181-3187
Number of pages: 7
Journal: Bioinformatics
Volume: 33
Issue number: 20
State: Published - Oct 15 2017
Externally published: Yes

Bibliographical note

Funding Information:
MDM, NRN, PSM, KEB, and CB are funded by USDA-NIFA grant 2016-67012-24679. NRN is also funded by USDA-NIFA grant 2015-68003-23048. SJP and TG are funded by Academy of Finland grants 294143 and 268324, respectively.

Publisher Copyright:
© The Author 2017. Published by Oxford University Press. All rights reserved.
