AbSNP: RNA-Seq SNP calling in repetitive regions via abundance estimation

Shunfu Mao, Soheil Mohajer, Kannan Ramachandran, David Tse, Sreeram Kannan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Variant calling, and in particular calling SNPs (single nucleotide polymorphisms), is a fundamental task in genomics. While existing packages offer excellent performance on calling SNPs that have uniquely mapped reads, they suffer in loci where the reads are multiply mapped and are unable to make any reliable calls there. Variants in multiply mapped loci can arise, for example, in long segmental duplications, and can play an important role in evolution and disease. In this paper, we develop a new SNP caller named abSNP, which offers three innovations. (a) abSNP calls SNPs from RNA-Seq data. Since RNA-Seq data is primarily sampled from gene regions, this method is inexpensive. (b) abSNP is able to make calls on repetitive gene regions by carefully exploiting the quality scores of multiply mapped reads. (c) abSNP exploits a specific feature of RNA-Seq data, namely the varying abundance of different genes, in order to identify which repetitive copy a particular read is sampled from. We demonstrate that the proposed method offers significant performance gains on repetitive regions in simulated data. In particular, the algorithm is able to achieve near-perfect sensitivity on high-coverage SNPs, even when multiply mapped.
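
The idea in points (b) and (c), combining base quality scores with gene abundance to decide which repetitive copy a multiply mapped read came from, can be illustrated with a small Bayesian sketch. This is not the abSNP implementation: the function names, the independent-error model, and the uniform miscall assumption are all illustrative choices made here, and the abundances are assumed to be given rather than estimated.

```python
def phred_to_error(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10.0 ** (-q / 10.0)


def read_likelihood(read, quals, ref):
    """Probability of observing `read` if it was sampled from `ref`.

    Assumption (illustrative): base errors are independent, and a
    miscalled base is equally likely to be any of the other three bases.
    """
    p = 1.0
    for base, q, ref_base in zip(read, quals, ref):
        e = phred_to_error(q)
        p *= (1.0 - e) if base == ref_base else e / 3.0
    return p


def copy_posteriors(read, quals, copies, abundances):
    """Posterior probability that `read` came from each candidate copy.

    `copies` are the reference sequences at the candidate mapping
    locations; `abundances` are hypothetical relative expression levels
    of the genes containing them, used as the prior.
    """
    weights = [
        a * read_likelihood(read, quals, c)
        for c, a in zip(copies, abundances)
    ]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all candidate copies have zero weight")
    return [w / total for w in weights]


if __name__ == "__main__":
    # Two identical copies: only the abundance prior separates them.
    print(copy_posteriors("ACGT", [30] * 4, ["ACGT", "ACGT"], [9.0, 1.0]))
    # Copies differing in one base: the quality-weighted likelihood dominates.
    print(copy_posteriors("ACGT", [30] * 4, ["ACGT", "ACGA"], [1.0, 1.0]))
```

When the two copies are identical over the read, the posterior simply follows the abundance ratio (0.9 and 0.1 for abundances 9 and 1). When the copies differ at a high-quality base, the likelihood term pulls almost all of the probability toward the matching copy.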

Original language: English (US)
Title of host publication: 17th International Workshop on Algorithms in Bioinformatics, WABI 2017
Editors: Knut Reinert, Russell Schwartz
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
ISBN (Electronic): 9783959770507
DOIs
State: Published - Aug 1 2017
Externally published: Yes
Event: 17th International Workshop on Algorithms in Bioinformatics, WABI 2017 - Boston, United States
Duration: Aug 21 2017 to Aug 23 2017

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 88
ISSN (Print): 1868-8969

Other

Other: 17th International Workshop on Algorithms in Bioinformatics, WABI 2017
Country/Territory: United States
City: Boston
Period: 8/21/17 to 8/23/17

Bibliographical note

Funding Information:
The work of SK and SM was supported, in part, by U.S. National Institutes of Health grant 5R01HG008164-02 (SK and SM) and U.S. National Science Foundation CAREER grant 1651236 (SK). The work of DNT was supported in part by the Center for the Science of Information and in part by NIH grant R01HG008164.

Publisher Copyright:
© Shunfu Mao, Soheil Mohajer, Kannan Ramachandran, David Tse, and Sreeram Kannan.

Keywords

  • Abundance Estimation
  • Multiply Mapped Reads
  • RNA-Seq
  • Repetitive Region
  • SNP Calling
