Nonlinear dimensionality reduction for discriminative analytics of multiple datasets

Jia Chen; Gang Wang; Georgios B. Giannakis

doi:10.1109/TSP.2018.2885478

Electrical and Computer Engineering

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

Principal component analysis (PCA) is widely used for feature extraction and dimensionality reduction, with documented merits in diverse tasks involving high-dimensional data. PCA copes with one dataset at a time, but it is challenged when it comes to analyzing multiple datasets jointly. In certain data science settings, however, one is often interested in extracting the most discriminative information from one dataset of particular interest (a.k.a. target data) relative to the other(s) (a.k.a. background data). To this end, this paper puts forth a novel approach, termed discriminative (d) PCA, for such discriminative analytics of multiple datasets. Under certain conditions, dPCA is proved to be least-squares optimal in recovering the latent subspace vector unique to the target data relative to background data. To account for nonlinear data correlations, (linear) dPCA models for one or multiple background datasets are generalized through kernel-based learning. Interestingly, all dPCA variants admit an analytical solution obtainable with a single (generalized) eigenvalue decomposition. Finally, substantial dimensionality reduction tests using synthetic and real datasets are provided to corroborate the merits of the proposed methods.
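
The abstract states that dPCA reduces to a single generalized eigenvalue decomposition but does not spell out the objective. The following is a minimal sketch, assuming the ratio form in which a direction u maximizes target variance relative to background variance, i.e. the top generalized eigenvectors of the pair (C_xx, C_yy). The function name `dpca`, the `ridge` parameter, and the synthetic data are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def dpca(target, background, k=1, ridge=1e-6):
    """Hypothetical helper: top-k directions u maximizing
    (u^T Cxx u) / (u^T Cyy u), an assumed form of the dPCA objective."""
    X = target - target.mean(axis=0)
    Y = background - background.mean(axis=0)
    Cxx = X.T @ X / X.shape[0]
    # A small ridge keeps Cyy positive definite (an assumption of this sketch).
    Cyy = Y.T @ Y / Y.shape[0] + ridge * np.eye(Y.shape[1])

    # Solve Cxx u = lam * Cyy u by whitening with the Cholesky factor of Cyy.
    L = np.linalg.cholesky(Cyy)          # Cyy = L @ L.T
    Linv = np.linalg.inv(L)
    M = Linv @ Cxx @ Linv.T
    M = (M + M.T) / 2                    # symmetrize against round-off
    evals, V = np.linalg.eigh(M)         # eigenvalues in ascending order

    order = np.argsort(evals)[::-1][:k]  # indices of the k largest ratios
    U = Linv.T @ V[:, order]             # map back: u = L^{-T} v
    U /= np.linalg.norm(U, axis=0)       # unit-norm columns
    return U, evals[order]


# Synthetic data: both sets share a dominant direction (axis 0);
# only the target has extra variance along axis 1.
rng = np.random.default_rng(0)
n = 2000
background = rng.normal(size=(n, 3)) * np.array([5.0, 1.0, 1.0])
target = rng.normal(size=(n, 3)) * np.array([5.0, 3.0, 1.0])

U, ratios = dpca(target, background, k=1)

# Plain PCA on the target alone, for contrast.
pca_dir = np.linalg.eigh(np.cov(target, rowvar=False))[1][:, -1]

print("dPCA direction:", np.round(U[:, 0], 3))
print("PCA direction: ", np.round(pca_dir, 3))
```

In this toy setup, plain PCA on the target follows the shared high-variance axis, while the ratio-based direction aligns with the axis whose variance is specific to the target. The kernel variants mentioned in the abstract are not covered by this sketch.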

Original language	English (US)
Article number	8565879
Pages (from-to)	740-752
Number of pages	13
Journal	IEEE Transactions on Signal Processing
Volume	67
Issue number	3
DOIs	https://doi.org/10.1109/TSP.2018.2885478
State	Published - Feb 1 2019

Bibliographical note

Publisher Copyright:
© 1991-2012 IEEE.

Keywords

Principal component analysis
discriminative analytics
kernel learning
multiple background datasets

Cite this

@article{e34ec13b5a5a423ea11c242ea1c3583b,

title = "Nonlinear dimensionality reduction for discriminative analytics of multiple datasets",

abstract = "Principal component analysis (PCA) is widely used for feature extraction and dimensionality reduction, with documented merits in diverse tasks involving high-dimensional data. PCA copes with one dataset at a time, but it is challenged when it comes to analyzing multiple datasets jointly. In certain data science settings however, one is often interested in extracting the most discriminative information from one dataset of particular interest (a.k.a. target data) relative to the other(s) (a.k.a. background data). To this end, this paper puts forth a novel approach, termed discriminative (d) PCA, for such discriminative analytics of multiple datasets. Under certain conditions, dPCA is proved to be least-squares optimal in recovering the latent subspace vector unique to the target data relative to background data. To account for nonlinear data correlations, (linear) dPCA models for one or multiple background datasets are generalized through kernel-based learning. Interestingly, all dPCA variants admit an analytical solution obtainable with a single (generalized) eigenvalue decomposition. Finally, substantial dimensionality reduction tests using synthetic and real datasets are provided to corroborate the merits of the proposed methods.",

keywords = "Principal component analysis, discriminative analytics, kernel learning, multiple background datasets",

author = "Jia Chen and Gang Wang and Giannakis, {Georgios B.}",

note = "Publisher Copyright: {\textcopyright} 1991-2012 IEEE.",

year = "2019",

month = feb,

day = "1",

doi = "10.1109/TSP.2018.2885478",

language = "English (US)",

volume = "67",

pages = "740--752",

journal = "IEEE Transactions on Signal Processing",

issn = "1053-587X",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "3",

}

TY - JOUR

T1 - Nonlinear dimensionality reduction for discriminative analytics of multiple datasets

AU - Chen, Jia

AU - Wang, Gang

AU - Giannakis, Georgios B.

PY - 2019/2/1

Y1 - 2019/2/1

N2 - Principal component analysis (PCA) is widely used for feature extraction and dimensionality reduction, with documented merits in diverse tasks involving high-dimensional data. PCA copes with one dataset at a time, but it is challenged when it comes to analyzing multiple datasets jointly. In certain data science settings however, one is often interested in extracting the most discriminative information from one dataset of particular interest (a.k.a. target data) relative to the other(s) (a.k.a. background data). To this end, this paper puts forth a novel approach, termed discriminative (d) PCA, for such discriminative analytics of multiple datasets. Under certain conditions, dPCA is proved to be least-squares optimal in recovering the latent subspace vector unique to the target data relative to background data. To account for nonlinear data correlations, (linear) dPCA models for one or multiple background datasets are generalized through kernel-based learning. Interestingly, all dPCA variants admit an analytical solution obtainable with a single (generalized) eigenvalue decomposition. Finally, substantial dimensionality reduction tests using synthetic and real datasets are provided to corroborate the merits of the proposed methods.

AB - Principal component analysis (PCA) is widely used for feature extraction and dimensionality reduction, with documented merits in diverse tasks involving high-dimensional data. PCA copes with one dataset at a time, but it is challenged when it comes to analyzing multiple datasets jointly. In certain data science settings however, one is often interested in extracting the most discriminative information from one dataset of particular interest (a.k.a. target data) relative to the other(s) (a.k.a. background data). To this end, this paper puts forth a novel approach, termed discriminative (d) PCA, for such discriminative analytics of multiple datasets. Under certain conditions, dPCA is proved to be least-squares optimal in recovering the latent subspace vector unique to the target data relative to background data. To account for nonlinear data correlations, (linear) dPCA models for one or multiple background datasets are generalized through kernel-based learning. Interestingly, all dPCA variants admit an analytical solution obtainable with a single (generalized) eigenvalue decomposition. Finally, substantial dimensionality reduction tests using synthetic and real datasets are provided to corroborate the merits of the proposed methods.

KW - Principal component analysis

KW - discriminative analytics

KW - kernel learning

KW - multiple background datasets

UR - http://www.scopus.com/inward/record.url?scp=85051188014&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85051188014&partnerID=8YFLogxK

U2 - 10.1109/TSP.2018.2885478

DO - 10.1109/TSP.2018.2885478

M3 - Article

AN - SCOPUS:85051188014

SN - 1053-587X

VL - 67

SP - 740

EP - 752

JO - IEEE Transactions on Signal Processing

JF - IEEE Transactions on Signal Processing

IS - 3

M1 - 8565879

ER -
