Integrative factorization of bidimensionally linked matrices

Jun Young Park, Eric F. Lock

Research output: Contribution to journal › Article › peer-review

Abstract

Advances in molecular “omics” technologies have motivated new methodologies for the integration of multiple sources of high-content biomedical data. However, most statistical methods for integrating multiple data matrices only consider data shared vertically (one cohort on multiple platforms) or horizontally (different cohorts on a single platform). This is limiting for data that take the form of bidimensionally linked matrices (e.g., multiple cohorts measured on multiple platforms), which are increasingly common in large-scale biomedical studies. In this paper, we propose bidimensional integrative factorization (BIDIFAC) for integrative dimension reduction and signal approximation of bidimensionally linked data matrices. Our method factorizes data into (a) globally shared, (b) row-shared, (c) column-shared, and (d) single-matrix structural components, facilitating the investigation of shared and unique patterns of variability. For estimation, we use a penalized objective function that extends nuclear norm penalization for a single matrix. As an alternative to the complicated rank selection problem, we use results from random matrix theory to choose tuning parameters. We apply our method to integrate two genomics platforms (messenger RNA and microRNA expression) across two sample cohorts (tumor samples and normal tissue samples) using the breast cancer data from The Cancer Genome Atlas. We provide R code for fitting BIDIFAC, imputing missing values, and generating simulated data.
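
As a rough illustration of the single-matrix building block that the abstract says BIDIFAC extends, the sketch below applies nuclear norm penalization to one simulated matrix. It is not the authors' R code and does not implement the four-component factorization. The function name `soft_threshold_svd`, the simulated data, and the penalty `sqrt(m) + sqrt(n)` (a standard random-matrix bound on the expected largest singular value of an m × n matrix of independent unit-variance Gaussian noise) are assumptions made for this example only.

```python
import numpy as np


def soft_threshold_svd(X, lam):
    """Minimize 0.5 * ||X - A||_F^2 + lam * ||A||_* over A.

    The minimizer is obtained by soft-thresholding the singular
    values of X at lam (hypothetical helper, for illustration).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - lam, 0.0)
    # Scale the columns of U by the shrunken singular values, then recombine.
    return (U * s_shrunk) @ Vt, s_shrunk


# Simulated data (assumption): a rank-2 signal plus unit-variance Gaussian noise.
rng = np.random.default_rng(0)
m, n, rank = 60, 40, 2
signal = 3.0 * rng.normal(size=(m, rank)) @ rng.normal(size=(rank, n))
X = signal + rng.normal(size=(m, n))

# Tuning parameter motivated by random matrix theory (assumes noise variance 1):
# singular values at or below this level are treated as noise and set to zero.
lam = np.sqrt(m) + np.sqrt(n)

A_hat, s_shrunk = soft_threshold_svd(X, lam)
est_rank = int(np.sum(s_shrunk > 0))
print("estimated rank:", est_rank)
print("relative error:", np.linalg.norm(A_hat - signal) / np.linalg.norm(signal))
```

In this sketch the rank of the estimate is a by-product of the threshold rather than something chosen beforehand, which is the sense in which a fixed, theory-driven penalty can stand in for explicit rank selection.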

Original language: English (US)
Pages (from-to): 61-74
Number of pages: 14
Journal: Biometrics
Volume: 76
Issue number: 1
State: Published - Mar 1 2020

Bibliographical note

Funding Information:
We would like to thank the coeditor, the associate editor, and two anonymous reviewers for their constructive comments. This research was supported by the National Institutes of Health (NIH) under grant R21CA231214 and by the Minnesota Supercomputing Institute (MSI).

Publisher Copyright:
© 2019 The International Biometric Society

Keywords

  • BIDIFAC
  • bidimensional data
  • cancer genomics
  • data integration
  • dimension reduction
  • principal component analysis
