Multiple peak alignment in sequential data analysis: A scale-space-based approach

Weichuan Yu, Xiaoye Li, Junfeng Liu, Baolin Wu, Kenneth R. Williams, Hongyu Zhao

Research output: Contribution to journal › Article › peer-review

18 Scopus citations

Abstract

In this paper, we address the multiple peak alignment problem in sequential data analysis with an approach based on the Gaussian scale-space theory. We assume that multiple sets of detected peaks are the observed samples of a set of common peaks. We also assume that the locations of the observed peaks follow unimodal distributions (e.g., normal distribution) with their means equal to the corresponding locations of the common peaks and variances reflecting the extension of their variations. Under these assumptions, we convert the problem of estimating locations of the unknown number of common peaks from multiple sets of detected peaks into a much simpler problem of searching for local maxima in the scale-space representation. The optimization of the scale parameter is achieved using an energy minimization approach. We compare our approach with a hierarchical clustering method using both simulated data and real mass spectrometry data. We also demonstrate the merit of extending the binary peak detection method (i.e., a candidate is considered either as a peak or as a nonpeak) with a quantitative scoring measure-based approach (i.e., we assign to each candidate a possibility of being a peak).
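The core idea in the abstract (pool the detected peaks, smooth them with a Gaussian kernel, and read common peaks off the local maxima) can be illustrated with a minimal sketch. This is not the authors' Matlab code: the function name `common_peaks`, the evaluation grid, and the `grid_step` parameter are assumptions made for illustration, and the scale `sigma` is fixed by hand here, whereas the paper optimizes it through energy minimization.

```python
import numpy as np

def common_peaks(peak_sets, sigma, grid_step=0.01):
    """Estimate common peak locations at a single, hand-picked scale.

    peak_sets: list of 1-D sequences of detected peak locations (hypothetical input format).
    sigma:     Gaussian scale; assumed given, not optimized as in the paper.
    """
    # Pool the detected peaks from all sets into one array of locations.
    locations = np.concatenate([np.asarray(p, dtype=float) for p in peak_sets])

    # Evaluation grid covering all peaks with a 3-sigma margin on each side.
    grid = np.arange(locations.min() - 3 * sigma,
                     locations.max() + 3 * sigma,
                     grid_step)

    # Scale-space representation at this sigma: a sum of Gaussians,
    # one centered on each detected peak, evaluated on the grid.
    diffs = grid[:, None] - locations[None, :]
    density = np.exp(-0.5 * (diffs / sigma) ** 2).sum(axis=1)

    # Strict local maxima of the smoothed signal are the common-peak estimates.
    interior = density[1:-1]
    is_max = (interior > density[:-2]) & (interior > density[2:])
    return grid[1:-1][is_max]

# Three peak sets that are noisy observations of peaks near 10, 20, and 30.
peak_sets = [[10.0, 20.1, 30.0], [10.1, 19.9], [9.9, 20.0, 30.2]]
peaks = common_peaks(peak_sets, sigma=0.5)
```

In this sketch the result depends strongly on `sigma`: a value that is too small leaves each detected peak as its own maximum, and a value that is too large merges distinct common peaks. That sensitivity is why the scale has to be chosen carefully, which the paper does with its energy minimization step.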

Original language: English (US)
Article number: 1668020
Pages (from-to): 208-219
Number of pages: 12
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 3
Issue number: 3
DOIs
State: Published - Jul 2006

Bibliographical note

Funding Information:
This work was supported with federal funds from NHLBI/NIH contract N01-HV-28186, NIDA/NIH grant P30 DA018343-01, NIGMS grant R01-59507, and US National Science Foundation grant DMS-0241160. The authors would like to thank Dr. Gusfield and Dr. Skiena and the anonymous reviewers for the very detailed comments, which greatly helped to improve the manuscript. The replicate data, ovarian cancer data, and the demonstration Matlab code are accessible at http://bioinformatics.med.yale.edu/MSDATA.

Keywords

  • Biomarker discovery
  • Energy minimization
  • Multiple peak alignment
  • Parameter optimization
  • Peak identification
  • Prior information
  • Scale-space
