Sparse linear discriminant analysis in structured covariates space

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Classification with high-dimensional variables is a common goal in many modern statistical studies. Fisher's linear discriminant analysis (LDA) is a common and effective tool for classifying entities into existing groups. It is well known that classification using Fisher's discriminant for high-dimensional data can be as bad as random guessing, because the many noise features increase the misclassification rate. It has recently been acknowledged that complex biological mechanisms occur through multiple features working together, though individually these features may contribute to noise accumulation in the data. In view of this, it is important to perform classification with discriminant vectors that use a subset of important variables, while also utilizing prior biological relationships among features. We tackle this problem in this article and propose methods that incorporate variable selection into the classification problem, for the identification of important biomarkers. Furthermore, we incorporate prior information on the relationships among variables into the LDA problem, using undirected graphs, in order to identify functionally meaningful biomarkers. We compare our methods to existing sparse LDA approaches via simulation studies and real data analysis.
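
The ingredients named in the abstract (Fisher's discriminant direction, variable selection, and an undirected graph encoding prior relationships among features) can be illustrated with a minimal Python sketch. This is not the authors' estimator, which the abstract does not specify. It is a generic illustration that assumes two classes, a ridge-stabilized pooled covariance, a graph Laplacian smoothness penalty, and soft-thresholding for sparsity. The function names and the parameters `ridge`, `graph_weight`, and `tau` are hypothetical.

```python
import numpy as np


def soft_threshold(v, tau):
    """Shrink each entry toward zero by tau; entries with |v| <= tau become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def graph_laplacian(p, edges):
    """Unnormalized Laplacian L = D - A of an undirected graph on p features."""
    A = np.zeros((p, p))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A


def sparse_graph_lda_direction(X, y, edges, ridge=1.0, graph_weight=1.0, tau=0.1):
    """Illustrative two-class discriminant vector (assumed recipe, not the paper's).

    1. Fisher direction: solve (S_w + ridge*I + graph_weight*L) w = mu_1 - mu_0,
       where S_w is the pooled within-class covariance and L the graph Laplacian.
    2. Rescale w so its largest absolute entry is 1.
    3. Soft-threshold at tau to zero out weak (likely noise) features.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    n, p = X.shape

    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    C0, C1 = X0 - mu0, X1 - mu1
    S_w = (C0.T @ C0 + C1.T @ C1) / (n - 2)  # pooled within-class covariance

    L = graph_laplacian(p, edges)
    M = S_w + ridge * np.eye(p) + graph_weight * L  # positive definite for ridge > 0
    w = np.linalg.solve(M, mu1 - mu0)

    scale = np.max(np.abs(w))
    if scale > 0:
        w = w / scale
    return soft_threshold(w, tau)


if __name__ == "__main__":
    # Toy data: features 0 and 1 carry signal and are linked in the graph.
    rng = np.random.default_rng(0)
    n_per_class, p = 100, 20
    X0 = rng.normal(size=(n_per_class, p))
    X1 = rng.normal(size=(n_per_class, p))
    X1[:, :2] += 2.0
    X = np.vstack([X0, X1])
    y = np.r_[np.zeros(n_per_class, dtype=int), np.ones(n_per_class, dtype=int)]

    w = sparse_graph_lda_direction(X, y, edges=[(0, 1)], tau=0.3)
    print("selected features:", np.flatnonzero(w))
```

The Laplacian term penalizes differences between the coefficients of connected features, so linked features tend to be selected or dropped together, while the threshold removes features whose contribution is small relative to the strongest one.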

Original language: English (US)
Title of host publication: Proceedings - 3rd IEEE International Conference on Data Science and Advanced Analytics, DSAA 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 772-781
Number of pages: 10
ISBN (Electronic): 9781509052066
DOIs
State: Published - Dec 22 2016
Event: 3rd IEEE International Conference on Data Science and Advanced Analytics, DSAA 2016 - Montreal, Canada
Duration: Oct 17 2016 to Oct 19 2016

Publication series

Name: Proceedings - 3rd IEEE International Conference on Data Science and Advanced Analytics, DSAA 2016

Other

Other: 3rd IEEE International Conference on Data Science and Advanced Analytics, DSAA 2016
Country/Territory: Canada
City: Montreal
Period: 10/17/16 to 10/19/16

Bibliographical note

Funding Information:
Sandra Safo's work was supported by NIH grant K12HD085850. Qi Long's work was supported by NIH grants R03CA173770 and R03CA183006. The content is solely the responsibility of the authors and does not represent the views of the NIH.

Publisher Copyright:
© 2016 IEEE.

Keywords

  • Biological information
  • Classification
  • High dimensional data
  • Linear discriminant analysis
  • Pathway analysis
  • Sparsity
