Partially linear additive quantile regression in ultra-high dimension

Ben Sherwood; Lan Wang

doi:10.1214/15-AOS1367

Statistics (Twin Cities)

Research output: Contribution to journal › Article › peer-review

91 Scopus citations

Abstract

We consider a flexible semiparametric quantile regression model for analyzing high dimensional heterogeneous data. This model has several appealing features: (1) By considering different conditional quantiles, we may obtain a more complete picture of the conditional distribution of a response variable given high dimensional covariates. (2) The sparsity level is allowed to be different at different quantile levels. (3) The partially linear additive structure accommodates nonlinearity and circumvents the curse of dimensionality. (4) It is naturally robust to heavy-tailed distributions. In this paper, we approximate the nonlinear components using B-spline basis functions. We first study estimation under this model when the nonzero components are known in advance and the number of covariates in the linear part diverges. We then investigate a nonconvex penalized estimator for simultaneous variable selection and estimation. We derive its oracle property for a general class of nonconvex penalty functions in the presence of ultra-high dimensional covariates under relaxed conditions. To tackle the challenges of nonsmooth loss function, nonconvex penalty function and the presence of nonlinear components, we combine a recently developed convex-differencing method with modern empirical process techniques. Monte Carlo simulations and an application to a microarray study demonstrate the effectiveness of the proposed method. We also discuss how the method for a single quantile of interest can be extended to simultaneous variable selection and estimation at multiple quantiles.
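The penalized estimator described in the abstract can be illustrated with a minimal sketch. It assumes the standard quantile check loss and uses the SCAD penalty as one common example of a nonconvex penalty; the abstract itself only refers to a general class of such penalties. The function names, the default a = 3.7, and the precomputed B-spline basis matrix B are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty, applied elementwise (assumed example of a nonconvex penalty)."""
    b = np.abs(np.asarray(beta, dtype=float))
    small = lam * b                                       # |beta| <= lam
    middle = (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1))  # lam < |beta| <= a*lam
    large = np.full_like(b, (a + 1) * lam**2 / 2)         # |beta| > a*lam
    return np.where(b <= lam, small, np.where(b <= a * lam, middle, large))

def penalized_objective(y, X, B, beta, xi, tau, lam):
    """Average check loss plus a penalty on the linear coefficients only.

    X    : linear-part covariates (n x p)
    B    : assumed precomputed B-spline basis for the nonlinear part (n x k)
    beta : linear coefficients (penalized)
    xi   : spline coefficients (left unpenalized in this sketch)
    """
    resid = y - X @ beta - B @ xi
    return check_loss(resid, tau).mean() + scad_penalty(beta, lam).sum()
```

Because the check loss is nonsmooth and the penalty is nonconvex, this objective cannot be handed to a gradient-based optimizer as is; the sketch only evaluates it for given coefficients.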

Original language	English (US)
Pages (from-to)	288-317
Number of pages	30
Journal	Annals of Statistics
Volume	44
Issue number	1
DOIs	https://doi.org/10.1214/15-AOS1367
State	Published - Feb 1 2016

Bibliographical note

Publisher Copyright:
© Institute of Mathematical Statistics, 2016.

Keywords

High dimensional data
Nonconvex penalty
Partial linear
Quantile regression
Variable selection

Cite this

@article{312da9f4c03c4a1db787800c99820496,

title = "Partially linear additive quantile regression in ultra-high dimension",

abstract = "We consider a flexible semiparametric quantile regression model for analyzing high dimensional heterogeneous data. This model has several appealing features: (1) By considering different conditional quantiles, we may obtain a more complete picture of the conditional distribution of a response variable given high dimensional covariates. (2) The sparsity level is allowed to be different at different quantile levels. (3) The partially linear additive structure accommodates nonlinearity and circumvents the curse of dimensionality. (4) It is naturally robust to heavy-tailed distributions. In this paper, we approximate the nonlinear components using B-spline basis functions. We first study estimation under this model when the nonzero components are known in advance and the number of covariates in the linear part diverges. We then investigate a nonconvex penalized estimator for simultaneous variable selection and estimation. We derive its oracle property for a general class of nonconvex penalty functions in the presence of ultra-high dimensional covariates under relaxed conditions. To tackle the challenges of nonsmooth loss function, nonconvex penalty function and the presence of nonlinear components, we combine a recently developed convex-differencing method with modern empirical process techniques. Monte Carlo simulations and an application to a microarray study demonstrate the effectiveness of the proposed method. We also discuss how the method for a single quantile of interest can be extended to simultaneous variable selection and estimation at multiple quantiles.",

keywords = "High dimensional data, Nonconvex penalty, Partial linear, Quantile regression, Variable selection",

author = "Ben Sherwood and Lan Wang",

note = "Publisher Copyright: {\textcopyright} Institute of Mathematical Statistics, 2016.",

year = "2016",

month = feb,

day = "1",

doi = "10.1214/15-AOS1367",

language = "English (US)",

volume = "44",

pages = "288--317",

journal = "Annals of Statistics",

issn = "0090-5364",

publisher = "Institute of Mathematical Statistics",

number = "1",

}

TY - JOUR

T1 - Partially linear additive quantile regression in ultra-high dimension

AU - Sherwood, Ben

AU - Wang, Lan

N1 - Publisher Copyright: © Institute of Mathematical Statistics, 2016.

PY - 2016/2/1

Y1 - 2016/2/1

N2 - We consider a flexible semiparametric quantile regression model for analyzing high dimensional heterogeneous data. This model has several appealing features: (1) By considering different conditional quantiles, we may obtain a more complete picture of the conditional distribution of a response variable given high dimensional covariates. (2) The sparsity level is allowed to be different at different quantile levels. (3) The partially linear additive structure accommodates nonlinearity and circumvents the curse of dimensionality. (4) It is naturally robust to heavy-tailed distributions. In this paper, we approximate the nonlinear components using B-spline basis functions. We first study estimation under this model when the nonzero components are known in advance and the number of covariates in the linear part diverges. We then investigate a nonconvex penalized estimator for simultaneous variable selection and estimation. We derive its oracle property for a general class of nonconvex penalty functions in the presence of ultra-high dimensional covariates under relaxed conditions. To tackle the challenges of nonsmooth loss function, nonconvex penalty function and the presence of nonlinear components, we combine a recently developed convex-differencing method with modern empirical process techniques. Monte Carlo simulations and an application to a microarray study demonstrate the effectiveness of the proposed method. We also discuss how the method for a single quantile of interest can be extended to simultaneous variable selection and estimation at multiple quantiles.

AB - We consider a flexible semiparametric quantile regression model for analyzing high dimensional heterogeneous data. This model has several appealing features: (1) By considering different conditional quantiles, we may obtain a more complete picture of the conditional distribution of a response variable given high dimensional covariates. (2) The sparsity level is allowed to be different at different quantile levels. (3) The partially linear additive structure accommodates nonlinearity and circumvents the curse of dimensionality. (4) It is naturally robust to heavy-tailed distributions. In this paper, we approximate the nonlinear components using B-spline basis functions. We first study estimation under this model when the nonzero components are known in advance and the number of covariates in the linear part diverges. We then investigate a nonconvex penalized estimator for simultaneous variable selection and estimation. We derive its oracle property for a general class of nonconvex penalty functions in the presence of ultra-high dimensional covariates under relaxed conditions. To tackle the challenges of nonsmooth loss function, nonconvex penalty function and the presence of nonlinear components, we combine a recently developed convex-differencing method with modern empirical process techniques. Monte Carlo simulations and an application to a microarray study demonstrate the effectiveness of the proposed method. We also discuss how the method for a single quantile of interest can be extended to simultaneous variable selection and estimation at multiple quantiles.

KW - High dimensional data

KW - Nonconvex penalty

KW - Partial linear

KW - Quantile regression

KW - Variable selection

UR - http://www.scopus.com/inward/record.url?scp=85013158377&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85013158377&partnerID=8YFLogxK

U2 - 10.1214/15-AOS1367

DO - 10.1214/15-AOS1367

M3 - Article

AN - SCOPUS:85013158377

SN - 0090-5364

VL - 44

SP - 288

EP - 317

JO - Annals of Statistics

JF - Annals of Statistics

IS - 1

ER -
