SURE information criteria for large covariance matrix estimation and their asymptotic properties

Danning Li; Hui Zou

doi:10.1109/TIT.2016.2530090

Statistics (Twin Cities)

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

Consider $n$ independent and identically distributed $p$-dimensional Gaussian random vectors with covariance matrix Σ. The problem of estimating Σ when $p$ is much larger than $n$ has received a lot of attention in recent years. Yet, little is known about the information criterion for covariance matrix estimation. How to properly define such a criterion and what are the statistical properties? We attempt to answer these questions in this paper by focusing on the estimation of bandable covariance matrices when $p > n$ but $\log(p) = o(n)$. Motivated by the deep connection between Stein's unbiased risk estimation (SURE) and Akaike information criterion (AIC) in regression models, we propose a family of generalized SURE (SURE$_c$) indexed by $c$ for covariance matrix estimation, where $c$ is some constant. When $c$ is 2, SURE$_2$ provides an unbiased estimator of the Frobenius risk of the covariance matrix estimator. Furthermore, we show that by minimizing SURE$_2$ over all possible banding covariance matrix estimators, we attain the minimax optimal rate of convergence under the Frobenius norm, and the resulting estimator behaves like the covariance matrix estimator obtained by the so-called oracle tuning. When the true covariance matrix is exactly banded, we prove that by minimizing SURE$_{\log(n)}$, we select the true bandwidth with probability tending to one. Therefore, our analysis indicates that SURE$_2$ and SURE$_{\log(n)}$ can be regarded as the AIC and Bayesian information criterion for large covariance matrix estimation, respectively.
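
As a rough illustration of the idea in the abstract, the Python sketch below bands a sample covariance matrix and chooses the bandwidth by minimizing a SURE-type criterion. The exact form of the criterion is an assumption and is not taken from the paper: it adds the squared off-band entries of the sample covariance to c times an estimate of the in-band variance, where the variance estimate is the standard unbiased one under Gaussian sampling with divisor n - 1. Under that assumption, c = 2 gives an unbiased estimate of the Frobenius risk up to a term that does not depend on the bandwidth. The function names `band`, `sure_c`, and `select_bandwidth` are hypothetical.

```python
import math


def band(cov, k):
    """Banding estimator: keep entries with |i - j| <= k, zero the rest."""
    p = len(cov)
    return [[cov[i][j] if abs(i - j) <= k else 0.0 for j in range(p)]
            for i in range(p)]


def sure_c(cov, n, k, c):
    """Assumed criterion: off-band residual + c * estimated in-band variance.

    `cov` is the sample covariance with divisor n - 1, computed from n
    Gaussian observations. This form is an illustrative assumption.
    """
    p, m = len(cov), n - 1
    total = 0.0
    for i in range(p):
        for j in range(p):
            sq = cov[i][j] ** 2
            if abs(i - j) > k:
                # Entry is zeroed by banding: contributes squared bias.
                total += sq
            else:
                # Unbiased estimate of Var(s_ij) under Gaussian sampling.
                var_hat = ((m - 2) * sq + m * cov[i][i] * cov[j][j]) / (
                    (m + 2) * (m - 1))
                total += c * var_hat
    return total


def select_bandwidth(cov, n, c):
    """Return the bandwidth k in {0, ..., p - 1} minimizing sure_c."""
    return min(range(len(cov)), key=lambda k: sure_c(cov, n, k, c))


# Toy input: a 3 x 3 sample covariance with a strong first off-diagonal,
# treated as if it had been computed from n = 100 observations.
S = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.5],
     [0.0, 0.5, 1.0]]
k_aic = select_bandwidth(S, n=100, c=2.0)            # AIC-like choice
k_bic = select_bandwidth(S, n=100, c=math.log(100))  # BIC-like choice
print(k_aic, k_bic, band(S, k_aic))
```

A larger c penalizes wide bands more heavily, which mirrors the AIC versus BIC contrast drawn in the abstract: the c = 2 choice targets risk, while the c = log(n) choice targets recovery of the bandwidth.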

Original language	English (US)
Article number	7407414
Pages (from-to)	2153-2169
Number of pages	17
Journal	IEEE Transactions on Information Theory
Volume	62
Issue number	4
DOIs	https://doi.org/10.1109/TIT.2016.2530090
State	Published - Apr 2016

Bibliographical note

Funding Information:
H. Zou was supported by the National Science Foundation under Grant DMS 1505111.

Publisher Copyright:
© 1963-2012 IEEE.

Keywords

Covariance matrix
High-dimensional asymptotics
Information criteria
Risk optimality
Selection consistency

Cite this

@article{d0792b97dd644a8da066baf39f1fbc42,

title = "SURE information criteria for large covariance matrix estimation and their asymptotic properties",

abstract = "Consider $n$ independent and identically distributed $p$-dimensional Gaussian random vectors with covariance matrix Σ. The problem of estimating Σ when $p$ is much larger than $n$ has received a lot of attention in recent years. Yet, little is known about the information criterion for covariance matrix estimation. How to properly define such a criterion and what are the statistical properties? We attempt to answer these questions in this paper by focusing on the estimation of bandable covariance matrices when $p > n$ but $\log(p) = o(n)$. Motivated by the deep connection between Stein's unbiased risk estimation (SURE) and Akaike information criterion (AIC) in regression models, we propose a family of generalized SURE (SURE$_c$) indexed by $c$ for covariance matrix estimation, where $c$ is some constant. When $c$ is 2, SURE$_2$ provides an unbiased estimator of the Frobenius risk of the covariance matrix estimator. Furthermore, we show that by minimizing SURE$_2$ over all possible banding covariance matrix estimators, we attain the minimax optimal rate of convergence under the Frobenius norm, and the resulting estimator behaves like the covariance matrix estimator obtained by the so-called oracle tuning. When the true covariance matrix is exactly banded, we prove that by minimizing SURE$_{\log(n)}$, we select the true bandwidth with probability tending to one. Therefore, our analysis indicates that SURE$_2$ and SURE$_{\log(n)}$ can be regarded as the AIC and Bayesian information criterion for large covariance matrix estimation, respectively.",

keywords = "Covariance matrix, High-dimensional asymptotics, Information criteria, Risk optimality, Selection consistency",

author = "Danning Li and Hui Zou",

note = "Funding Information: H. Zou was supported by the National Science Foundation under Grant DMS 1505111. Publisher Copyright: {\textcopyright} 1963-2012 IEEE.",

year = "2016",

month = apr,

doi = "10.1109/TIT.2016.2530090",

language = "English (US)",

volume = "62",

pages = "2153--2169",

journal = "IEEE Transactions on Information Theory",

issn = "0018-9448",

publisher = "IEEE",

number = "4",

}

TY - JOUR

T1 - SURE information criteria for large covariance matrix estimation and their asymptotic properties

AU - Li, Danning

AU - Zou, Hui

PY - 2016/4

Y1 - 2016/4

N2 - Consider $n$ independent and identically distributed $p$-dimensional Gaussian random vectors with covariance matrix Σ. The problem of estimating Σ when $p$ is much larger than $n$ has received a lot of attention in recent years. Yet, little is known about the information criterion for covariance matrix estimation. How to properly define such a criterion and what are the statistical properties? We attempt to answer these questions in this paper by focusing on the estimation of bandable covariance matrices when $p > n$ but $\log(p) = o(n)$. Motivated by the deep connection between Stein's unbiased risk estimation (SURE) and Akaike information criterion (AIC) in regression models, we propose a family of generalized SURE (SURE$_c$) indexed by $c$ for covariance matrix estimation, where $c$ is some constant. When $c$ is 2, SURE$_2$ provides an unbiased estimator of the Frobenius risk of the covariance matrix estimator. Furthermore, we show that by minimizing SURE$_2$ over all possible banding covariance matrix estimators, we attain the minimax optimal rate of convergence under the Frobenius norm, and the resulting estimator behaves like the covariance matrix estimator obtained by the so-called oracle tuning. When the true covariance matrix is exactly banded, we prove that by minimizing SURE$_{\log(n)}$, we select the true bandwidth with probability tending to one. Therefore, our analysis indicates that SURE$_2$ and SURE$_{\log(n)}$ can be regarded as the AIC and Bayesian information criterion for large covariance matrix estimation, respectively.

AB - Consider $n$ independent and identically distributed $p$-dimensional Gaussian random vectors with covariance matrix Σ. The problem of estimating Σ when $p$ is much larger than $n$ has received a lot of attention in recent years. Yet, little is known about the information criterion for covariance matrix estimation. How to properly define such a criterion and what are the statistical properties? We attempt to answer these questions in this paper by focusing on the estimation of bandable covariance matrices when $p > n$ but $\log(p) = o(n)$. Motivated by the deep connection between Stein's unbiased risk estimation (SURE) and Akaike information criterion (AIC) in regression models, we propose a family of generalized SURE (SURE$_c$) indexed by $c$ for covariance matrix estimation, where $c$ is some constant. When $c$ is 2, SURE$_2$ provides an unbiased estimator of the Frobenius risk of the covariance matrix estimator. Furthermore, we show that by minimizing SURE$_2$ over all possible banding covariance matrix estimators, we attain the minimax optimal rate of convergence under the Frobenius norm, and the resulting estimator behaves like the covariance matrix estimator obtained by the so-called oracle tuning. When the true covariance matrix is exactly banded, we prove that by minimizing SURE$_{\log(n)}$, we select the true bandwidth with probability tending to one. Therefore, our analysis indicates that SURE$_2$ and SURE$_{\log(n)}$ can be regarded as the AIC and Bayesian information criterion for large covariance matrix estimation, respectively.

KW - Covariance matrix

KW - High-dimensional asymptotics

KW - Information criteria

KW - Risk optimality

KW - Selection consistency

UR - http://www.scopus.com/inward/record.url?scp=84963821068&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84963821068&partnerID=8YFLogxK

U2 - 10.1109/TIT.2016.2530090

DO - 10.1109/TIT.2016.2530090

M3 - Article

AN - SCOPUS:84963821068

SN - 0018-9448

VL - 62

SP - 2153

EP - 2169

JO - IEEE Transactions on Information Theory

JF - IEEE Transactions on Information Theory

IS - 4

M1 - 7407414

ER -
