Accelerating Large-Scale Statistical Computation With the GOEM Algorithm

Xiao Nie, Jared Huling, Peter Z.G. Qian

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Large-scale data analysis problems have become increasingly common across many disciplines. While a large volume of data offers more statistical power, it also brings computational challenges. The orthogonalizing expectation–maximization (EM) algorithm by Xiong et al. is an efficient method for dealing with large-scale least-squares problems from a design point of view. In this article, we propose a reformulation and generalization of the orthogonalizing EM algorithm. Computational complexity and convergence guarantees are established. The reformulation of the orthogonalizing EM algorithm leads to a reduction in computational complexity for least-squares problems and penalized least-squares problems. The reformulation, named the GOEM (generalized orthogonalizing EM) algorithm, can incorporate a wide variety of convex and nonconvex penalties, including the lasso, group lasso, and minimax concave penalty (MCP). The GOEM algorithm is further extended to a wider class of models, including generalized linear models and Cox's proportional hazards model. Synthetic and real data examples are included to illustrate its use and efficiency compared with standard techniques. Supplementary materials for this article are available online.
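The article's own derivations are not reproduced in this abstract, but the flavor of the OEM-style update for the lasso can be sketched: the iteration resembles a proximal (soft-thresholding) step with a fixed scalar step size d chosen as an upper bound on the largest eigenvalue of X'X. The sketch below is an illustrative ISTA-style approximation under that assumption, not the authors' GOEM implementation; the function names and stopping rule are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    # Elementwise soft-thresholding: proximal operator of t * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def oem_style_lasso(X, y, lam, n_iter=500):
    # Illustrative OEM-flavored lasso iteration (assumed form, not the
    # paper's GOEM): a proximal-gradient step with scalar step size d,
    # where d bounds the largest eigenvalue of X'X.
    n, p = X.shape
    d = np.linalg.norm(X, 2) ** 2  # squared spectral norm = max eigenvalue of X'X
    beta = np.zeros(p)
    for _ in range(n_iter):
        u = beta + X.T @ (y - X @ beta) / d  # gradient step on 0.5*||y - X beta||^2
        beta = soft_threshold(u, lam / d)    # proximal step for the lasso penalty
    return beta
```

Swapping the soft-threshold step for the proximal operator of another penalty (e.g., group lasso or MCP) gives the analogous iteration for those penalties, which is the sense in which the GOEM framework accommodates a range of convex and nonconvex penalties.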

Original language: English (US)
Pages (from-to): 416-425
Number of pages: 10
Journal: Technometrics
Volume: 59
Issue number: 4
DOIs: Yes
State: Published - Oct 2 2017
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2017 American Statistical Association and the American Society for Quality.

Keywords

  • Big data
  • Computational statistics
  • OEM
  • Optimization
  • Proximal algorithm
