Variable Selection Diagnostics Measures for High-Dimensional Regression

Ying Nan; Yuhong Yang

doi:10.1080/10618600.2013.829780

Statistics (Twin Cities)

Research output: Contribution to journal › Article › peer-review

32 Scopus citations

Abstract

Many exciting results have been obtained on model selection for high-dimensional data in both efficient algorithms and theoretical developments. The powerful penalized regression methods can give sparse representations of the data even when the number of predictors is much larger than the sample size. One important question then is: How do we know when a sparse pattern identified by such a method is reliable? In this work, besides investigating instability of model selection methods in terms of variable selection, we propose variable selection deviation measures that give one a proper sense on how many predictors in the selected set are likely trustworthy in certain aspects. Simulation and a real data example demonstrate the utility of these measures for application.

Original language	English (US)
Pages (from-to)	636-656
Number of pages	21
Journal	Journal of Computational and Graphical Statistics
Volume	23
Issue number	3
DOIs	https://doi.org/10.1080/10618600.2013.829780
State	Published - Jul 3 2014

Bibliographical note

Funding Information:
The authors appreciate comments from Wei Pan, Lan Wang, Yi Yang, and Hui Zou. We also sincerely thank two referees, the AE, and the Editor for very helpful suggestions on improving our work in both theoretical and numerical aspects. This research was partially supported by NSF grant DMS-1106576.

Publisher Copyright:
© 2014 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.

Keywords

Model selection diagnostics
Model selection instability
Variable selection deviation

Cite this

@article{21c786da49ad47da97618884ecfe98df,
title = "Variable Selection Diagnostics Measures for High-Dimensional Regression",
abstract = "Many exciting results have been obtained on model selection for high-dimensional data in both efficient algorithms and theoretical developments. The powerful penalized regression methods can give sparse representations of the data even when the number of predictors is much larger than the sample size. One important question then is: How do we know when a sparse pattern identified by such a method is reliable? In this work, besides investigating instability of model selection methods in terms of variable selection, we propose variable selection deviation measures that give one a proper sense on how many predictors in the selected set are likely trustworthy in certain aspects. Simulation and a real data example demonstrate the utility of these measures for application.",
keywords = "Model selection diagnostics, Model selection instability, Variable selection deviation",
author = "Ying Nan and Yuhong Yang",
note = "Funding Information: The authors appreciate comments from Wei Pan, Lan Wang, Yi Yang, and Hui Zou. We also sincerely thank two referees, the AE, and the Editor for very helpful suggestions on improving our work in both theoretical and numerical aspects. This research was partially supported by NSF grant DMS-1106576. Publisher Copyright: {\textcopyright} 2014 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.",
year = "2014",
month = jul,
day = "3",
doi = "10.1080/10618600.2013.829780",
language = "English (US)",
volume = "23",
pages = "636--656",
journal = "Journal of Computational and Graphical Statistics",
issn = "1061-8600",
publisher = "American Statistical Association",
number = "3",
}

TY  - JOUR
T1  - Variable Selection Diagnostics Measures for High-Dimensional Regression
AU  - Nan, Ying
AU  - Yang, Yuhong
N1  - Funding Information: The authors appreciate comments from Wei Pan, Lan Wang, Yi Yang, and Hui Zou. We also sincerely thank two referees, the AE, and the Editor for very helpful suggestions on improving our work in both theoretical and numerical aspects. This research was partially supported by NSF grant DMS-1106576. Publisher Copyright: © 2014 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.
PY  - 2014/7/3
Y1  - 2014/7/3
N2  - Many exciting results have been obtained on model selection for high-dimensional data in both efficient algorithms and theoretical developments. The powerful penalized regression methods can give sparse representations of the data even when the number of predictors is much larger than the sample size. One important question then is: How do we know when a sparse pattern identified by such a method is reliable? In this work, besides investigating instability of model selection methods in terms of variable selection, we propose variable selection deviation measures that give one a proper sense on how many predictors in the selected set are likely trustworthy in certain aspects. Simulation and a real data example demonstrate the utility of these measures for application.
AB  - Many exciting results have been obtained on model selection for high-dimensional data in both efficient algorithms and theoretical developments. The powerful penalized regression methods can give sparse representations of the data even when the number of predictors is much larger than the sample size. One important question then is: How do we know when a sparse pattern identified by such a method is reliable? In this work, besides investigating instability of model selection methods in terms of variable selection, we propose variable selection deviation measures that give one a proper sense on how many predictors in the selected set are likely trustworthy in certain aspects. Simulation and a real data example demonstrate the utility of these measures for application.
KW  - Model selection diagnostics
KW  - Model selection instability
KW  - Variable selection deviation
UR  - http://www.scopus.com/inward/record.url?scp=84925959153&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84925959153&partnerID=8YFLogxK
U2  - 10.1080/10618600.2013.829780
DO  - 10.1080/10618600.2013.829780
M3  - Article
AN  - SCOPUS:84925959153
SN  - 1061-8600
VL  - 23
SP  - 636
EP  - 656
JO  - Journal of Computational and Graphical Statistics
JF  - Journal of Computational and Graphical Statistics
IS  - 3
ER  -
