Variable Selection Diagnostics Measures for High-Dimensional Regression

Ying Nan, Yuhong Yang

Research output: Contribution to journal › Article › peer-review

32 Scopus citations

Abstract

Many exciting results have been obtained on model selection for high-dimensional data, in both efficient algorithms and theoretical developments. Powerful penalized regression methods can give sparse representations of the data even when the number of predictors is much larger than the sample size. One important question then is: how do we know when a sparse pattern identified by such a method is reliable? In this work, besides investigating the instability of model selection methods in terms of variable selection, we propose variable selection deviation measures that give a proper sense of how many predictors in the selected set are likely trustworthy in certain aspects. Simulations and a real data example demonstrate the utility of these measures in applications.
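To make the notion of selection instability concrete, the sketch below refits a lasso on random subsamples and records how often each predictor is selected and how far each subsample's selected set is from the full-data selection (size of the symmetric difference). This is a generic resampling illustration, not the deviation measures defined in the paper; the function names `lasso_cd` and `selection_instability`, the penalty level `lam`, the subsample fraction, and the simulated data are all assumptions made for the example.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate-descent lasso for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    Hypothetical helper written for this illustration only.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # x_j' x_j / n for each column
    resid = y.astype(float).copy()         # residual y - X beta (beta = 0 at start)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of x_j with the partial residual that excludes x_j.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            # Soft-thresholding update for coordinate j.
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            if new != beta[j]:
                resid -= X[:, j] * (new - beta[j])
                beta[j] = new
    return beta


def selection_instability(X, y, lam, n_resamples=20, frac=0.5, seed=0):
    """Return (full-data selection mask, per-predictor selection frequency
    over subsamples, mean symmetric-difference size vs. the full-data set)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    full_mask = lasso_cd(X, y, lam) != 0
    m = int(frac * n)
    freq = np.zeros(p)
    diffs = []
    for _ in range(n_resamples):
        idx = rng.choice(n, size=m, replace=False)   # subsample without replacement
        mask = lasso_cd(X[idx], y[idx], lam) != 0
        freq += mask
        diffs.append(int(np.sum(mask != full_mask)))  # |selected XOR full selection|
    return full_mask, freq / n_resamples, float(np.mean(diffs))


# Simulated example (assumed setup): 60 observations, 50 predictors,
# only the first three predictors carry signal.
rng = np.random.default_rng(1)
X = rng.standard_normal((60, 50))
y = 3.0 * X[:, :3].sum(axis=1) + rng.standard_normal(60)

full_mask, freq, mean_diff = selection_instability(X, y, lam=0.3)
print("selected on full data:", np.flatnonzero(full_mask))
print("selection frequency of first 5 predictors:", freq[:5])
print("mean symmetric difference vs. full-data set:", mean_diff)
```

A large mean symmetric difference relative to the size of the selected set, or many selected predictors with low selection frequency, would suggest that the sparse pattern should be read with caution.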

Original language: English (US)
Pages (from-to): 636-656
Number of pages: 21
Journal: Journal of Computational and Graphical Statistics
Volume: 23
Issue number: 3
State: Published - Jul 3 2014

Bibliographical note

Funding Information:
The authors appreciate comments from Wei Pan, Lan Wang, Yi Yang, and Hui Zou. We also sincerely thank two referees, the AE, and the Editor for very helpful suggestions on improving our work in both theoretical and numerical aspects. This research was partially supported by NSF grant DMS-1106576.

Publisher Copyright:
© 2014 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.

Keywords

  • Model selection diagnostics
  • Model selection instability
  • Variable selection deviation
