A Note on Cross-Validation for Lasso Under Measurement Errors

Abhirup Datta; Hui Zou

doi:10.1080/00401706.2019.1668856

Statistics (Twin Cities)

Research output: Contribution to journal › Comment/debate › peer-review

2 Scopus citations

Abstract

Variants of the Lasso or ℓ1-penalized regression have been proposed to accommodate the presence of measurement errors in the covariates. Theoretical guarantees for these estimates have been established for some oracle values of the regularization parameters, which are not known in practice. Data-driven tuning such as cross-validation has not been studied when covariates contain measurement errors. We demonstrate that in the presence of error-in-covariates, even when using a Lasso variant that adjusts for measurement error, application of naive leave-one-out cross-validation to select the tuning parameter can be problematic. We provide an example where such a practice leads to estimation inconsistency. We also prove that a simple correction to the cross-validation procedure restores consistency. Finally, we study the risk consistency of the two cross-validation procedures and offer guidelines on the choice of cross-validation based on the measurement error distributions of the training and the prediction data. The theoretical findings are validated using simulated data. Supplementary materials for this article are available online.
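The abstract does not spell out the correction, so the following is only a rough illustration of why a naive validation loss can mislead when covariates are measured with error, not the authors' procedure. It assumes additive error W = X + U with a known error covariance and checks numerically the standard identity E(y - W'b)^2 = E(y - X'b)^2 + b' Sigma_u b for a fixed coefficient vector b. All names (naive_loss, corrected_loss, Sigma_u, beta_true) and all numeric settings are hypothetical choices made for this sketch.

```python
import numpy as np

def naive_loss(y, W, beta):
    """Mean squared prediction error computed directly on the supplied covariates."""
    r = y - W @ beta
    return float(np.mean(r ** 2))

def corrected_loss(y, W, beta, Sigma_u):
    """Naive loss minus beta' Sigma_u beta, the inflation caused by additive error."""
    return naive_loss(y, W, beta) - float(beta @ Sigma_u @ beta)

rng = np.random.default_rng(0)
n, p = 200_000, 5
beta_true = np.array([2.0, -1.0, 0.0, 0.0, 0.5])   # assumed coefficients
Sigma_u = 0.5 * np.eye(p)                           # assumed known error covariance

X = rng.normal(size=(n, p))                         # true covariates (unobserved in practice)
U = rng.normal(scale=np.sqrt(0.5), size=(n, p))     # measurement error with covariance Sigma_u
W = X + U                                           # observed, error-contaminated covariates
y = X @ beta_true + rng.normal(size=n)              # response generated from the true covariates

clean = naive_loss(y, X, beta_true)                 # loss if X were observed
naive = naive_loss(y, W, beta_true)                 # loss on contaminated covariates
corrected = corrected_loss(y, W, beta_true, Sigma_u)

print(f"clean={clean:.3f}  naive={naive:.3f}  corrected={corrected:.3f}")
```

Under these assumed values the expected gap between the naive and clean losses is b' Sigma_u b = 2.625, and it grows with the size of b. A tuning criterion built on the naive loss therefore penalizes larger coefficient vectors beyond what prediction error on clean covariates would justify, while subtracting the quadratic term removes that inflation.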

Original language	English (US)
Pages (from-to)	549-556
Number of pages	8
Journal	Technometrics
Volume	62
Issue number	4
DOIs	https://doi.org/10.1080/00401706.2019.1668856
State	Published - Oct 1 2020

Bibliographical note

Funding Information:
Zou’s research is supported in part by NSF grant DMS-1915842. We thank the editor, the associate editor, and the anonymous reviewers for their helpful feedback which helped to greatly improve the article. We thank Dr. Po-Ling Loh for sharing R and Matlab codes for computing the NCL estimator.

Publisher Copyright:
© 2019 American Statistical Association and the American Society for Quality.

Keywords

Cross-validation
Inconsistency
Lasso
Measurement errors

Cite this

@article{0050ef2efdc44c119bd14d31b75c6d71,

title = "A Note on Cross-Validation for Lasso Under Measurement Errors",

abstract = "Variants of the Lasso or $\ell_1$-penalized regression have been proposed to accommodate the presence of measurement errors in the covariates. Theoretical guarantees for these estimates have been established for some oracle values of the regularization parameters, which are not known in practice. Data-driven tuning such as cross-validation has not been studied when covariates contain measurement errors. We demonstrate that in the presence of error-in-covariates, even when using a Lasso variant that adjusts for measurement error, application of naive leave-one-out cross-validation to select the tuning parameter can be problematic. We provide an example where such a practice leads to estimation inconsistency. We also prove that a simple correction to the cross-validation procedure restores consistency. Finally, we study the risk consistency of the two cross-validation procedures and offer guidelines on the choice of cross-validation based on the measurement error distributions of the training and the prediction data. The theoretical findings are validated using simulated data. Supplementary materials for this article are available online.",

keywords = "Cross-validation, Inconsistency, Lasso, Measurement errors",

author = "Abhirup Datta and Hui Zou",

note = "Funding Information: Zou{\textquoteright}s research is supported in part by NSF grant DMS-1915842. We thank the editor, the associate editor, and the anonymous reviewers for their helpful feedback which helped to greatly improve the article. We thank Dr. Po-Ling Loh for sharing R and Matlab codes for computing the NCL estimator. Publisher Copyright: {\textcopyright} 2019 American Statistical Association and the American Society for Quality.",

year = "2020",

month = oct,

day = "1",

doi = "10.1080/00401706.2019.1668856",

language = "English (US)",

volume = "62",

pages = "549--556",

journal = "Technometrics",

issn = "0040-1706",

publisher = "American Statistical Association",

number = "4",

}

TY - JOUR

T1 - A Note on Cross-Validation for Lasso Under Measurement Errors

AU - Datta, Abhirup

AU - Zou, Hui

N1 - Funding Information: Zou’s research is supported in part by NSF grant DMS-1915842. We thank the editor, the associate editor, and the anonymous reviewers for their helpful feedback which helped to greatly improve the article. We thank Dr. Po-Ling Loh for sharing R and Matlab codes for computing the NCL estimator. Publisher Copyright: © 2019 American Statistical Association and the American Society for Quality.

PY - 2020/10/1

Y1 - 2020/10/1

N2 - Variants of the Lasso or ℓ1-penalized regression have been proposed to accommodate the presence of measurement errors in the covariates. Theoretical guarantees for these estimates have been established for some oracle values of the regularization parameters, which are not known in practice. Data-driven tuning such as cross-validation has not been studied when covariates contain measurement errors. We demonstrate that in the presence of error-in-covariates, even when using a Lasso variant that adjusts for measurement error, application of naive leave-one-out cross-validation to select the tuning parameter can be problematic. We provide an example where such a practice leads to estimation inconsistency. We also prove that a simple correction to the cross-validation procedure restores consistency. Finally, we study the risk consistency of the two cross-validation procedures and offer guidelines on the choice of cross-validation based on the measurement error distributions of the training and the prediction data. The theoretical findings are validated using simulated data. Supplementary materials for this article are available online.

AB - Variants of the Lasso or ℓ1-penalized regression have been proposed to accommodate the presence of measurement errors in the covariates. Theoretical guarantees for these estimates have been established for some oracle values of the regularization parameters, which are not known in practice. Data-driven tuning such as cross-validation has not been studied when covariates contain measurement errors. We demonstrate that in the presence of error-in-covariates, even when using a Lasso variant that adjusts for measurement error, application of naive leave-one-out cross-validation to select the tuning parameter can be problematic. We provide an example where such a practice leads to estimation inconsistency. We also prove that a simple correction to the cross-validation procedure restores consistency. Finally, we study the risk consistency of the two cross-validation procedures and offer guidelines on the choice of cross-validation based on the measurement error distributions of the training and the prediction data. The theoretical findings are validated using simulated data. Supplementary materials for this article are available online.

KW - Cross-validation

KW - Inconsistency

KW - Lasso

KW - Measurement errors

UR - http://www.scopus.com/inward/record.url?scp=85074557648&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85074557648&partnerID=8YFLogxK

U2 - 10.1080/00401706.2019.1668856

DO - 10.1080/00401706.2019.1668856

M3 - Comment/debate

AN - SCOPUS:85074557648

SN - 0040-1706

VL - 62

SP - 549

EP - 556

JO - Technometrics

JF - Technometrics

IS - 4

ER -
