Sparse Linear Discriminant Analysis for Multi-view Structured Data

Sandra E. Safo; Eun Jeong Min; Lillian Haine

Biostatistics

Research output: Contribution to journal › Article

Abstract

Classification methods that leverage the strengths of data from multiple sources (multi-view data) simultaneously have enormous potential to yield more powerful findings than two-step methods: association followed by classification. We propose two methods, sparse integrative discriminant analysis (SIDA) and SIDA with incorporation of network information (SIDANet), for joint association and classification studies. The methods consider the overall association between multi-view data and the separation within each view in choosing discriminant vectors that are associated and optimally separate subjects into different classes. SIDANet is among the first methods to incorporate prior structural information in joint association and classification studies. It uses the normalized Laplacian of a graph to smooth coefficients of predictor variables, thus encouraging selection of predictors that are connected and behave similarly. We demonstrate the effectiveness of our methods on a set of synthetic and real datasets. Our findings underscore the benefit of joint association and classification methods if the goal is to correlate multi-view data and to perform classification.

Original language	Undefined/Unknown
Journal	arXiv
State	Published - Nov 13 2019

Bibliographical note

40 pages, 4 figures

Keywords

stat.ME

Access

1911.05643v2

Cite this

@article{acd66d00bc434e37aae17cf388810830,

title = "Sparse Linear Discriminant Analysis for Multi-view Structured Data",

abstract = "Classification methods that leverage the strengths of data from multiple sources (multi-view data) simultaneously have enormous potential to yield more powerful findings than two-step methods: association followed by classification. We propose two methods, sparse integrative discriminant analysis (SIDA) and SIDA with incorporation of network information (SIDANet), for joint association and classification studies. The methods consider the overall association between multi-view data and the separation within each view in choosing discriminant vectors that are associated and optimally separate subjects into different classes. SIDANet is among the first methods to incorporate prior structural information in joint association and classification studies. It uses the normalized Laplacian of a graph to smooth coefficients of predictor variables, thus encouraging selection of predictors that are connected and behave similarly. We demonstrate the effectiveness of our methods on a set of synthetic and real datasets. Our findings underscore the benefit of joint association and classification methods if the goal is to correlate multi-view data and to perform classification.",

keywords = "stat.ME",

author = "Safo, {Sandra E.} and Min, {Eun Jeong} and Haine, Lillian",

note = "40 pages, 4 figures",

year = "2019",

month = nov,

day = "13",

language = "Undefined/Unknown",

journal = "arXiv",

}

TY - JOUR

T1 - Sparse Linear Discriminant Analysis for Multi-view Structured Data

AU - Safo, Sandra E.

AU - Min, Eun Jeong

AU - Haine, Lillian

N1 - 40 pages, 4 figures

PY - 2019/11/13

Y1 - 2019/11/13

N2 - Classification methods that leverage the strengths of data from multiple sources (multi-view data) simultaneously have enormous potential to yield more powerful findings than two-step methods: association followed by classification. We propose two methods, sparse integrative discriminant analysis (SIDA) and SIDA with incorporation of network information (SIDANet), for joint association and classification studies. The methods consider the overall association between multi-view data and the separation within each view in choosing discriminant vectors that are associated and optimally separate subjects into different classes. SIDANet is among the first methods to incorporate prior structural information in joint association and classification studies. It uses the normalized Laplacian of a graph to smooth coefficients of predictor variables, thus encouraging selection of predictors that are connected and behave similarly. We demonstrate the effectiveness of our methods on a set of synthetic and real datasets. Our findings underscore the benefit of joint association and classification methods if the goal is to correlate multi-view data and to perform classification.

AB - Classification methods that leverage the strengths of data from multiple sources (multi-view data) simultaneously have enormous potential to yield more powerful findings than two-step methods: association followed by classification. We propose two methods, sparse integrative discriminant analysis (SIDA) and SIDA with incorporation of network information (SIDANet), for joint association and classification studies. The methods consider the overall association between multi-view data and the separation within each view in choosing discriminant vectors that are associated and optimally separate subjects into different classes. SIDANet is among the first methods to incorporate prior structural information in joint association and classification studies. It uses the normalized Laplacian of a graph to smooth coefficients of predictor variables, thus encouraging selection of predictors that are connected and behave similarly. We demonstrate the effectiveness of our methods on a set of synthetic and real datasets. Our findings underscore the benefit of joint association and classification methods if the goal is to correlate multi-view data and to perform classification.

KW - stat.ME

M3 - Article

JO - arXiv

JF - arXiv

ER -