Joint and individual variation explained (JIVE) for integrated analysis of multiple data types

Eric F. Lock; Katherine A. Hoadley; J. S. Marron; Andrew B. Nobel

doi:10.1214/12-AOAS597

Biostatistics

Research output: Contribution to journal › Article › peer-review

325 Scopus citations

Abstract

Research in several fields now requires the analysis of data sets in which multiple high-dimensional types of data are available for a common set of objects. In particular, The Cancer Genome Atlas (TCGA) includes data from several diverse genomic technologies on the same cancerous tumor samples. In this paper we introduce Joint and Individual Variation Explained (JIVE), a general decomposition of variation for the integrated analysis of such data sets. The decomposition consists of three terms: a low-rank approximation capturing joint variation across data types, low-rank approximations for structured variation individual to each data type, and residual noise. JIVE quantifies the amount of joint variation between data types, reduces the dimensionality of the data and provides new directions for the visual exploration of joint and individual structures. The proposed method represents an extension of Principal Component Analysis and has clear advantages over popular two-block methods such as Canonical Correlation Analysis and Partial Least Squares. A JIVE analysis of gene expression and miRNA data on Glioblastoma Multiforme tumor samples reveals gene-miRNA associations and provides better characterization of tumor types.
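
The three-term decomposition described in the abstract can be written compactly. The following is a sketch only: the symbols X_i, J_i, A_i, E_i, p_i, n, k, r and r_i are notation assumed here for illustration and do not appear on this page.

```latex
% Assumed notation: X_i is the p_i x n matrix for data type i (i = 1, ..., k);
% its columns index the n objects shared by all data types.
\[
  X_i = J_i + A_i + E_i, \qquad i = 1, \dots, k,
\]
% J_i: joint structure, A_i: individual structure, E_i: residual noise.
% Low-rank conditions (the ranks r and r_i are assumed symbols):
\[
  \operatorname{rank}\!\begin{pmatrix} J_1 \\ \vdots \\ J_k \end{pmatrix} = r,
  \qquad
  \operatorname{rank}(A_i) = r_i .
\]
```

Here the stacked joint terms share a single rank r across all data types, while each individual term A_i has its own rank r_i. An orthogonality constraint between the joint and individual terms (for example, J A_i^T = 0 for the stacked joint matrix J) is one common way to make such a split identifiable; that detail is an assumption here and is not stated in the abstract.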

Original language	English (US)
Pages (from-to)	523-542
Number of pages	20
Journal	Annals of Applied Statistics
Volume	7
Issue number	1
DOIs	https://doi.org/10.1214/12-AOAS597
State	Published - Mar 2013

Keywords

Data fusion
Data integration
Multi-block data
Principal component analysis

Cite this

@article{e7581a8864364705a56bd198c6a1f2ab,
title = "Joint and individual variation explained (JIVE) for integrated analysis of multiple data types",
abstract = "Research in several fields now requires the analysis of data sets in which multiple high-dimensional types of data are available for a common set of objects. In particular, The Cancer Genome Atlas (TCGA) includes data from several diverse genomic technologies on the same cancerous tumor samples. In this paper we introduce Joint and Individual Variation Explained (JIVE), a general decomposition of variation for the integrated analysis of such data sets. The decomposition consists of three terms: a low-rank approximation capturing joint variation across data types, low-rank approximations for structured variation individual to each data type, and residual noise. JIVE quantifies the amount of joint variation between data types, reduces the dimensionality of the data and provides new directions for the visual exploration of joint and individual structures. The proposed method represents an extension of Principal Component Analysis and has clear advantages over popular two-block methods such as Canonical Correlation Analysis and Partial Least Squares. A JIVE analysis of gene expression and miRNA data on Glioblastoma Multiforme tumor samples reveals gene-miRNA associations and provides better characterization of tumor types.",
keywords = "Data fusion, Data integration, Multi-block data, Principal component analysis",
author = "Lock, {Eric F.} and Hoadley, {Katherine A.} and Marron, {J. S.} and Nobel, {Andrew B.}",
year = "2013",
month = mar,
doi = "10.1214/12-AOAS597",
language = "English (US)",
volume = "7",
pages = "523--542",
journal = "Annals of Applied Statistics",
issn = "1932-6157",
publisher = "Institute of Mathematical Statistics",
number = "1",
}

TY  - JOUR
T1  - Joint and individual variation explained (JIVE) for integrated analysis of multiple data types
AU  - Lock, Eric F.
AU  - Hoadley, Katherine A.
AU  - Marron, J. S.
AU  - Nobel, Andrew B.
PY  - 2013/3
Y1  - 2013/3
N2  - Research in several fields now requires the analysis of data sets in which multiple high-dimensional types of data are available for a common set of objects. In particular, The Cancer Genome Atlas (TCGA) includes data from several diverse genomic technologies on the same cancerous tumor samples. In this paper we introduce Joint and Individual Variation Explained (JIVE), a general decomposition of variation for the integrated analysis of such data sets. The decomposition consists of three terms: a low-rank approximation capturing joint variation across data types, low-rank approximations for structured variation individual to each data type, and residual noise. JIVE quantifies the amount of joint variation between data types, reduces the dimensionality of the data and provides new directions for the visual exploration of joint and individual structures. The proposed method represents an extension of Principal Component Analysis and has clear advantages over popular two-block methods such as Canonical Correlation Analysis and Partial Least Squares. A JIVE analysis of gene expression and miRNA data on Glioblastoma Multiforme tumor samples reveals gene-miRNA associations and provides better characterization of tumor types.
AB  - Research in several fields now requires the analysis of data sets in which multiple high-dimensional types of data are available for a common set of objects. In particular, The Cancer Genome Atlas (TCGA) includes data from several diverse genomic technologies on the same cancerous tumor samples. In this paper we introduce Joint and Individual Variation Explained (JIVE), a general decomposition of variation for the integrated analysis of such data sets. The decomposition consists of three terms: a low-rank approximation capturing joint variation across data types, low-rank approximations for structured variation individual to each data type, and residual noise. JIVE quantifies the amount of joint variation between data types, reduces the dimensionality of the data and provides new directions for the visual exploration of joint and individual structures. The proposed method represents an extension of Principal Component Analysis and has clear advantages over popular two-block methods such as Canonical Correlation Analysis and Partial Least Squares. A JIVE analysis of gene expression and miRNA data on Glioblastoma Multiforme tumor samples reveals gene-miRNA associations and provides better characterization of tumor types.
KW  - Data fusion
KW  - Data integration
KW  - Multi-block data
KW  - Principal component analysis
UR  - http://www.scopus.com/inward/record.url?scp=84876058478&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84876058478&partnerID=8YFLogxK
U2  - 10.1214/12-AOAS597
DO  - 10.1214/12-AOAS597
M3  - Article
C2  - 23745156
AN  - SCOPUS:84876058478
SN  - 1932-6157
VL  - 7
SP  - 523
EP  - 542
JO  - Annals of Applied Statistics
JF  - Annals of Applied Statistics
IS  - 1
ER  -
