Motivation: Comprehensive two-dimensional gas chromatography mass spectrometry (GC × GC-MS) offers greatly increased separation capacity, chemical selectivity and sensitivity for metabolomics, and provides more accurate information about metabolite retention times and mass spectra. However, retention times always shift in both columns, which makes it difficult to compare metabolic profiles obtained from multiple samples exposed to different experimental conditions. Results: Existing peak alignment algorithms for GC × GC-MS data use the peak distance and the spectral similarity sequentially and require a predefined distance-based window, a predefined spectral similarity-based window, or both. To overcome these limitations, we developed an optimal peak alignment method using a novel mixture similarity that employs the peak distance and spectral similarity measures simultaneously, without any variation windows. In addition, we examined the effect of four distance measures (Euclidean, Maximum, Manhattan and Canberra) on peak alignment. The performance of the proposed algorithm was compared with that of existing alignment methods on two sets of GC × GC-MS data. Our analysis showed that the Canberra distance performed better than the other distances and that the proposed mixture similarity peak alignment algorithm outperformed all methods reported in the literature.
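For readers unfamiliar with the four distance measures named above, the following is a minimal sketch of their standard textbook definitions, applied to a pair of peaks described by their first- and second-dimension retention times. The function names and the example retention times are hypothetical and chosen only for illustration; the sketch does not reproduce the mixture similarity or the alignment procedure, whose formulas are not given in this abstract.

```python
import math

def euclidean(p, q):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def maximum(p, q):
    """Largest absolute coordinate difference (Chebyshev distance)."""
    return max(abs(a - b) for a, b in zip(p, q))

def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def canberra(p, q):
    """Sum of absolute differences, each scaled by |a| + |b|.

    Terms whose denominator is zero are skipped, a common convention.
    """
    total = 0.0
    for a, b in zip(p, q):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total

# Hypothetical peaks: (first-dimension, second-dimension) retention
# times in seconds. These values are illustrative, not from the paper.
peak_a = (1200.0, 2.0)
peak_b = (1203.0, 2.5)

for name, fn in [("Euclidean", euclidean), ("Maximum", maximum),
                 ("Manhattan", manhattan), ("Canberra", canberra)]:
    print(f"{name}: {fn(peak_a, peak_b):.4f}")
```

Because the Canberra distance divides each coordinate difference by the magnitude of the coordinates, a shift in the short second-dimension retention time contributes on a relative scale rather than being dominated by the much larger first-dimension values.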
Funding: grant 1RO1GM087735-02 from the National Institute of General Medical Sciences (NIGMS) within the National Institutes of Health (NIH), and grant DE-EM0000197 from the Department of Energy (DOE).