Adaptive minimax regression estimation over sparse l_q-hulls

Zhan Wang; Sandra Paterlini; Fuchang Gao; Yuhong Yang

Statistics (Twin Cities)

Research output: Contribution to journal › Article › peer-review

26 Scopus citations

Abstract

Given a dictionary of M_n predictors, in a random design regression setting with n observations, we construct estimators that target the best performance among all the linear combinations of the predictors under a sparse l_q-norm (0 ≤ q ≤ 1) constraint on the linear coefficients. Besides identifying the optimal rates of convergence, our universal aggregation strategies by model mixing achieve the optimal rates simultaneously over the full range of 0 ≤ q ≤ 1 for any M_n and without knowledge of the l_q-norm of the best linear coefficients to represent the regression function. To allow model misspecification, our upper bound results are obtained in a framework of aggregation of estimates. A striking feature is that no specific relationship among the predictors is needed to achieve the upper rates of convergence (hence permitting basically arbitrary correlations between the predictors). Therefore, whatever the true regression function (assumed to be uniformly bounded), our estimators automatically exploit any sparse representation of the regression function (if any), to the best extent possible within the l_q-constrained linear combinations for any 0 ≤ q ≤ 1. A sparse approximation result in the l_q-hulls turns out to be crucial to adaptively achieve minimax rate optimal aggregation. It precisely characterizes the number of terms needed to achieve a prescribed accuracy of approximation to the best linear combination in an l_q-hull for 0 ≤ q ≤ 1. It offers the insight that the minimax rate of l_q-aggregation is basically determined by an effective model size, which is a sparsity index that depends on q, M_n, n, and the l_q-norm bound in an easily interpretable way based on a classical model selection theory that deals with a large number of models.
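
For orientation, the constraint sets named in the abstract can be written out as below. This is a sketch in assumed notation, not quoted from the paper: f_1, ..., f_{M_n} stand for the dictionary predictors, θ for the coefficient vector, and t and k are illustrative bounds on the l_q-norm and on the number of nonzero coefficients.

```latex
% Assumed notation (not taken from this page): f_1, ..., f_{M_n} are the
% dictionary predictors, \theta is the coefficient vector, and t > 0 and k
% are illustrative bounds.
\[
  f_\theta = \sum_{j=1}^{M_n} \theta_j f_j, \qquad
  \|\theta\|_q = \Bigl( \sum_{j=1}^{M_n} |\theta_j|^q \Bigr)^{1/q} \quad (0 < q \le 1), \qquad
  \|\theta\|_0 = \#\{\, j : \theta_j \neq 0 \,\}.
\]
\[
  \mathcal{F}_q(t) = \{\, f_\theta : \|\theta\|_q \le t \,\} \quad (0 < q \le 1), \qquad
  \mathcal{F}_0(k) = \{\, f_\theta : \|\theta\|_0 \le k \,\}.
\]
```

Under this reading, q = 0 limits how many predictors enter the linear combination, while 0 < q ≤ 1 limits the overall size of the coefficients; for q < 1 the quantity is a quasi-norm rather than a norm.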

Original language	English (US)
Pages (from-to)	1675-1711
Number of pages	37
Journal	Journal of Machine Learning Research
Volume	15
State	Published - May 2014

Keywords

High-dimensional sparse learning
Minimax rate of convergence
Model selection
Optimal aggregation

Cite this

@article{8dbb06338bc84023b352a47ed97ad337,

title = "Adaptive minimax regression estimation over sparse lq-hulls",

abstract = "Given a dictionary of Mn predictors, in a random design regression setting with n observations, we construct estimators that target the best performance among all the linear combinations of the predictors under a sparse lq-norm (0 ≤ q ≤ 1) constraint on the linear coefficients. Besides identifying the optimal rates of convergence, our universal aggregation strategies by model mixing achieve the optimal rates simultaneously over the full range of 0 ≤ q ≤ 1 for any Mn and without knowledge of the lq-norm of the best linear coefficients to represent the regression function. To allow model misspecification, our upper bound results are obtained in a framework of aggregation of estimates. A striking feature is that no specific relationship among the predictors is needed to achieve the upper rates of convergence (hence permitting basically arbitrary correlations between the predictors). Therefore, whatever the true regression function (assumed to be uniformly bounded), our estimators automatically exploit any sparse representation of the regression function (if any), to the best extent possible within the lq-constrained linear combinations for any 0 ≤ q ≤ 1. A sparse approximation result in the lq-hulls turns out to be crucial to adaptively achieve minimax rate optimal aggregation. It precisely characterizes the number of terms needed to achieve a prescribed accuracy of approximation to the best linear combination in an lq-hull for 0 ≤ q ≤ 1. It offers the insight that the minimax rate of lq-aggregation is basically determined by an effective model size, which is a sparsity index that depends on q, Mn, n, and the lq-norm bound in an easily interpretable way based on a classical model selection theory that deals with a large number of models.",

keywords = "High-dimensional sparse learning, Minimax rate of convergence, Model selection, Optimal aggregation",

author = "Zhan Wang and Sandra Paterlini and Fuchang Gao and Yuhong Yang",

year = "2014",

month = may,

language = "English (US)",

volume = "15",

pages = "1675--1711",

journal = "Journal of Machine Learning Research",

issn = "1532-4435",

publisher = "Microtome Publishing",

}

TY - JOUR

T1 - Adaptive minimax regression estimation over sparse lq-hulls

AU - Wang, Zhan

AU - Paterlini, Sandra

AU - Gao, Fuchang

AU - Yang, Yuhong

PY - 2014/5

Y1 - 2014/5

N2 - Given a dictionary of Mn predictors, in a random design regression setting with n observations, we construct estimators that target the best performance among all the linear combinations of the predictors under a sparse lq-norm (0 ≤ q ≤ 1) constraint on the linear coefficients. Besides identifying the optimal rates of convergence, our universal aggregation strategies by model mixing achieve the optimal rates simultaneously over the full range of 0 ≤ q ≤ 1 for any Mn and without knowledge of the lq-norm of the best linear coefficients to represent the regression function. To allow model misspecification, our upper bound results are obtained in a framework of aggregation of estimates. A striking feature is that no specific relationship among the predictors is needed to achieve the upper rates of convergence (hence permitting basically arbitrary correlations between the predictors). Therefore, whatever the true regression function (assumed to be uniformly bounded), our estimators automatically exploit any sparse representation of the regression function (if any), to the best extent possible within the lq-constrained linear combinations for any 0 ≤ q ≤ 1. A sparse approximation result in the lq-hulls turns out to be crucial to adaptively achieve minimax rate optimal aggregation. It precisely characterizes the number of terms needed to achieve a prescribed accuracy of approximation to the best linear combination in an lq-hull for 0 ≤ q ≤ 1. It offers the insight that the minimax rate of lq-aggregation is basically determined by an effective model size, which is a sparsity index that depends on q, Mn, n, and the lq-norm bound in an easily interpretable way based on a classical model selection theory that deals with a large number of models.

KW - High-dimensional sparse learning

KW - Minimax rate of convergence

KW - Model selection

KW - Optimal aggregation

UR - http://www.scopus.com/inward/record.url?scp=84902815139&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84902815139&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84902815139

SN - 1532-4435

VL - 15

SP - 1675

EP - 1711

JO - Journal of Machine Learning Research

JF - Journal of Machine Learning Research

ER -
