Motivation: Aligning spectra to correct for experimental variation in mass-to-charge ratio is a recurring problem in mass spectrometry (MS). Most MS-based proteomic data analysis methods take a two-step approach: they first identify peaks, and then perform alignment and statistical inference on the identified peaks only. However, the peak identification step relies either on prior information about the proteins of interest or on a peak detection model, both of which are subject to error. In addition, simple peak detection discards features such as peak shape and peak width, which are informative for correcting mass variation in the alignment step.
Results: Here, we present a novel Bayesian approach to aligning complete spectra. The approach is based on a parametric model that assumes the spectrum and the alignment function are Gaussian processes, with the alignment function constrained to be monotone. We show how to use the expectation-maximization algorithm to find the posterior mode of the set of alignment functions and the mean spectrum for a patient population. After alignment, we conduct tests at the level of the peaks identified from the absolute difference between the mean spectra of two patient populations, while controlling for error attributable to multiple comparisons.
Funding: National Institutes of Health (grant P01-AI074340); University of Minnesota graduate dissertation fellowship.