We present a method for individual and integrative analysis of high dimension, low sample size data. The method builds on a recurring theme in multivariate analysis: projecting high-dimensional data onto a few meaningful directions that solve a generalized eigenvalue problem. We propose a general framework, called SELP (Sparse Estimation with Linear Programming), for obtaining a sparse estimate of a solution vector of a generalized eigenvalue problem. We demonstrate the utility of SELP by applying it to canonical correlation analysis in an integrative analysis of methylation and gene expression profiles from a breast cancer study. The analysis identifies some genes known to be associated with breast carcinogenesis, which indicates that the proposed method is capable of generating biologically meaningful insights. Simulation studies suggest that the proposed method performs competitively with some existing methods in identifying true signals under various underlying covariance structures.
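For readers unfamiliar with the formulation, the standard way of writing canonical correlation analysis as a generalized eigenvalue problem can be sketched as follows. The symbols here ($M$, $S$, $\Sigma_{xx}$, $\Sigma_{yy}$, $\Sigma_{xy}$, $\alpha$, $\beta$, $\rho$) are notation assumed for illustration and are not taken from the abstract; the sketch shows only the classical, non-sparse problem, not the SELP estimator itself.

```latex
% Generic generalized eigenvalue problem (assumed notation):
% M symmetric, S symmetric positive definite.
M v = \lambda S v

% Classical CCA for two data types x and y: find directions (alpha, beta)
% maximizing the correlation between the projections of x and y.
\max_{\alpha, \beta} \; \alpha^{\top} \Sigma_{xy} \beta
\quad \text{subject to} \quad
\alpha^{\top} \Sigma_{xx} \alpha = 1, \;\;
\beta^{\top} \Sigma_{yy} \beta = 1

% The stationarity conditions of this problem form a generalized
% eigenvalue problem with eigenvalue rho (the canonical correlation):
\begin{pmatrix} 0 & \Sigma_{xy} \\ \Sigma_{yx} & 0 \end{pmatrix}
\begin{pmatrix} \alpha \\ \beta \end{pmatrix}
= \rho
\begin{pmatrix} \Sigma_{xx} & 0 \\ 0 & \Sigma_{yy} \end{pmatrix}
\begin{pmatrix} \alpha \\ \beta \end{pmatrix}
```

When the number of variables exceeds the sample size, the sample versions of $\Sigma_{xx}$ and $\Sigma_{yy}$ are singular, which is one reason regularized or sparse estimates of the solution vector are of interest in this setting.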
Bibliographical note
Funding Information:
The authors thank the two anonymous reviewers and the Associate Editor for their useful comments that helped improve the manuscript. Sandra Safo’s work was supported in part by NIH grant K12HD085850, and Yongho Jeon’s by the Basic Science Research Program of the National Research Foundation of Korea (NRF-2015R1A1A1A05001180) funded by the Korean government. The content is the responsibility of the authors and does not represent the views of NIH or NRF.
© 2018, The International Biometric Society
- Canonical Correlation Analysis
- Data Integration
- Generalized Eigenvalue Problem
- High Dimension
- Low Sample Size