We study genotype calling algorithms for high-throughput single-nucleotide polymorphism (SNP) arrays. Building on the novel SNP-robust multi-chip average preprocessing approach and the state-of-the-art corrected robust linear model with Mahalanobis distance (CRLMM) approach for genotype calling, we propose a simple modification that uses empirical Bayes modeling to better model and combine information across multiple SNPs, and that can often significantly improve the genotype calls of CRLMM. Through applications to the HapMap Trio data set and a non-HapMap test set of high-quality SNP chips, we illustrate the competitive performance of the proposed method.
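As a rough illustration of the general idea of borrowing strength across SNPs, the sketch below shrinks per-SNP cluster centers toward their across-SNP mean and then calls a genotype by the smallest variance-scaled distance, a one-dimensional stand-in for the Mahalanobis distance. It is not the method of the paper: the function names `shrink_centers` and `call_genotype`, the normal-normal model, the known `noise_var`, and the method-of-moments prior estimate are all assumptions made for illustration.

```python
from statistics import fmean


def shrink_centers(centers, counts, noise_var):
    """Shrink per-SNP cluster centers toward the across-SNP mean.

    Hypothetical normal-normal model: each observed center is the true
    center plus noise with variance noise_var / n, where n is the number
    of samples that contributed to it.
    """
    mu0 = fmean(centers)
    # Method-of-moments prior variance: observed spread minus average noise.
    total_var = fmean([(c - mu0) ** 2 for c in centers])
    avg_noise = fmean([noise_var / n for n in counts])
    tau2 = max(total_var - avg_noise, 0.0)
    shrunk = []
    for c, n in zip(centers, counts):
        se2 = noise_var / n
        w = tau2 / (tau2 + se2)  # weight on the SNP's own estimate
        shrunk.append(w * c + (1.0 - w) * mu0)
    return shrunk


def call_genotype(x, centers, variances):
    """Return the genotype label with the smallest variance-scaled distance."""
    return min(centers, key=lambda g: (x - centers[g]) ** 2 / variances[g])


# Illustrative (made-up) centers for one genotype cluster across four SNPs;
# the last SNP has only two samples, so its center is pulled in the most.
observed = [1.0, 1.2, 0.8, 3.0]
counts = [50, 50, 50, 2]
shrunk = shrink_centers(observed, counts, noise_var=0.5)

label = call_genotype(
    0.9,
    {"AA": 1.0, "AB": 0.0, "BB": -1.0},
    {"AA": 0.04, "AB": 0.04, "BB": 0.04},
)
```

In this toy setting, a center estimated from few samples is moved strongly toward the pooled mean, while well-supported centers barely change, which is the behavior that makes pooling across SNPs helpful for sparsely observed genotype clusters.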
Bibliographical note
Funding Information:
This research was supported in part by a Biomedical Informatics and Computational Biology grant from the University of Minnesota-Mayo-IBM Collaboration, and by NIH grants GM083345 and CA134848. We would like to thank two anonymous referees for their constructive comments, which have improved the presentation of the paper.
- SNP arrays
- empirical Bayes
- genotype calling algorithm
- mixture model