Variable selection in penalized model-based clustering via regularization on grouped parameters

Benhuai Xie; Wei Pan; Xiaotong Shen

doi:10.1111/j.1541-0420.2007.00955.x

Research output: Contribution to journal › Article › peer-review

29 Scopus citations

Abstract

Penalized model-based clustering has been proposed for high-dimensional but small sample-sized data, such as arising from genomic studies; in particular, it can be used for variable selection. A new regularization scheme is proposed to group together multiple parameters of the same variable across clusters, which is shown both analytically and numerically to be more effective than the conventional L₁ penalty for variable selection. In addition, we develop a strategy to combine this grouping scheme with grouping structured variables. Simulation studies and applications to microarray gene expression data for cancer subtype discovery demonstrate the advantage of the new proposal over several existing approaches.
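The contrast between penalizing each cluster-specific parameter separately and penalizing all of a variable's parameters as one group can be illustrated with a small sketch. The abstract does not give the penalty's exact form, so the code below assumes a standard group-lasso-style L2 norm over each variable's cluster means; the function names are hypothetical, and the cluster variances, mixing proportions, and EM iterations of a full penalized mixture model are omitted.

```python
import math

def soft_threshold(mu, lam):
    """Elementwise L1 step: each cluster mean is shrunk toward zero on its own."""
    return [[math.copysign(max(abs(m) - lam, 0.0), m) for m in row] for row in mu]

def group_threshold(mu, lam):
    """Groupwise step (assumed L2 group penalty): the means of variable k in
    all clusters are shrunk by a common factor, so they hit zero together."""
    n_clusters, n_vars = len(mu), len(mu[0])
    out = [[0.0] * n_vars for _ in range(n_clusters)]
    for k in range(n_vars):
        norm = math.sqrt(sum(mu[i][k] ** 2 for i in range(n_clusters)))
        scale = max(1.0 - lam / norm, 0.0) if norm > 0 else 0.0
        for i in range(n_clusters):
            out[i][k] = scale * mu[i][k]
    return out

def selected_variables(mu):
    """A variable is kept if its mean is nonzero in at least one cluster."""
    return [k for k in range(len(mu[0])) if any(row[k] != 0.0 for row in mu)]

# Rows are clusters, columns are variables (data assumed centered at zero).
mu = [[2.0, 0.3, 0.6],
      [-2.0, 0.2, -0.1]]

l1_means = soft_threshold(mu, 0.5)
grouped_means = group_threshold(mu, 0.5)
print(selected_variables(l1_means), l1_means)
print(selected_variables(grouped_means), grouped_means)
```

Under the elementwise step, the third variable ends up zero in one cluster and nonzero in the other; under the groupwise step, each variable is either removed from every cluster or retained in every cluster, which is the all-or-nothing behavior that makes grouping natural for variable selection.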

Original language	English (US)
Pages (from-to)	921-930
Number of pages	10
Journal	Biometrics
Volume	64
Issue number	3
DOIs	https://doi.org/10.1111/j.1541-0420.2007.00955.x
State	Published - Sep 2008

Keywords

BIC
Diagonal covariance
EM algorithm
High-dimension but low-sample size
Microarray gene expression
Mixture model
Penalized likelihood

Cite this

@article{6f90d6a22d0545d6840ec027456bd899,
  title = "Variable selection in penalized model-based clustering via regularization on grouped parameters",
  abstract = "Penalized model-based clustering has been proposed for high-dimensional but small sample-sized data, such as arising from genomic studies; in particular, it can be used for variable selection. A new regularization scheme is proposed to group together multiple parameters of the same variable across clusters, which is shown both analytically and numerically to be more effective than the conventional L1 penalty for variable selection. In addition, we develop a strategy to combine this grouping scheme with grouping structured variables. Simulation studies and applications to microarray gene expression data for cancer subtype discovery demonstrate the advantage of the new proposal over several existing approaches.",
  keywords = "BIC, Diagonal covariance, EM algorithm, High-dimension but low-sample size, Microarray gene expression, Mixture model, Penalized likelihood",
  author = "Benhuai Xie and Wei Pan and Xiaotong Shen",
  year = "2008",
  month = sep,
  doi = "10.1111/j.1541-0420.2007.00955.x",
  language = "English (US)",
  volume = "64",
  pages = "921--930",
  journal = "Biometrics",
  issn = "0006-341X",
  publisher = "Wiley-Blackwell",
  number = "3",
}

TY  - JOUR
T1  - Variable selection in penalized model-based clustering via regularization on grouped parameters
AU  - Xie, Benhuai
AU  - Pan, Wei
AU  - Shen, Xiaotong
PY  - 2008/9
Y1  - 2008/9
N2  - Penalized model-based clustering has been proposed for high-dimensional but small sample-sized data, such as arising from genomic studies; in particular, it can be used for variable selection. A new regularization scheme is proposed to group together multiple parameters of the same variable across clusters, which is shown both analytically and numerically to be more effective than the conventional L1 penalty for variable selection. In addition, we develop a strategy to combine this grouping scheme with grouping structured variables. Simulation studies and applications to microarray gene expression data for cancer subtype discovery demonstrate the advantage of the new proposal over several existing approaches.
AB  - Penalized model-based clustering has been proposed for high-dimensional but small sample-sized data, such as arising from genomic studies; in particular, it can be used for variable selection. A new regularization scheme is proposed to group together multiple parameters of the same variable across clusters, which is shown both analytically and numerically to be more effective than the conventional L1 penalty for variable selection. In addition, we develop a strategy to combine this grouping scheme with grouping structured variables. Simulation studies and applications to microarray gene expression data for cancer subtype discovery demonstrate the advantage of the new proposal over several existing approaches.
KW  - BIC
KW  - Diagonal covariance
KW  - EM algorithm
KW  - High-dimension but low-sample size
KW  - Microarray gene expression
KW  - Mixture model
KW  - Penalized likelihood
UR  - http://www.scopus.com/inward/record.url?scp=49749148013&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=49749148013&partnerID=8YFLogxK
U2  - 10.1111/j.1541-0420.2007.00955.x
DO  - 10.1111/j.1541-0420.2007.00955.x
M3  - Article
C2  - 18162109
AN  - SCOPUS:49749148013
SN  - 0006-341X
VL  - 64
SP  - 921
EP  - 930
JO  - Biometrics
JF  - Biometrics
IS  - 3
ER  -
