Turbo-SMT: Parallel coupled sparse matrix-Tensor factorizations and applications

Evangelos E. Papalexakis; Tom M. Mitchell; Nicholas D. Sidiropoulos; Christos Faloutsos; Partha Pratim Talukdar; Brian Murphy

doi:10.1002/sam.11315

Electrical and Computer Engineering

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

How can we correlate the neural activity in the human brain, as it responds to typed words, with properties of these terms (like ‘edible’, ‘fits in hand’)? In short, we want to find latent variables that jointly explain both the brain activity and the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem. Can we enhance any CMTF solver so that it can operate on potentially very large datasets that may not fit in main memory? We introduce Turbo-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm, parallelizes it, and produces sparse and interpretable solutions (up to 65 fold). Additionally, we improve upon ALS, the workhorse algorithm for CMTF, with respect to efficiency and robustness to missing values. We apply Turbo-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Turbo-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. Finally, we demonstrate the generality of Turbo-SMT by applying it to a FACEBOOK dataset (users, ‘friends’, wall-postings); there, Turbo-SMT spots spammer-like anomalies.

Original language	English (US)
Pages (from-to)	269-290
Number of pages	22
Journal	Statistical Analysis and Data Mining
Volume	9
Issue number	4
DOIs	https://doi.org/10.1002/sam.11315
State	Published - Aug 1 2016

Bibliographical note

Publisher Copyright:
© 2016 Wiley Periodicals, Inc.

Keywords

algorithm
coupled matrix-tensor factorization
fMRI data
neurosemantics
parallel
sparse
speedup
tensor

Access

10.1002/sam.11315

https://pureadmin.qub.ac.uk/ws/files/124764278/paper.pdf

Cite this

@article{6042330524ed481eacfb896edc2f88ff,

title = "Turbo-SMT: Parallel coupled sparse matrix-Tensor factorizations and applications",

abstract = "How can we correlate the neural activity in the human brain, as it responds to typed words, with properties of these terms (like {\textquoteleft}edible{\textquoteright}, {\textquoteleft}fits in hand{\textquoteright})? In short, we want to find latent variables that jointly explain both the brain activity and the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem. Can we enhance any CMTF solver so that it can operate on potentially very large datasets that may not fit in main memory? We introduce Turbo-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm, parallelizes it, and produces sparse and interpretable solutions (up to 65 fold). Additionally, we improve upon ALS, the workhorse algorithm for CMTF, with respect to efficiency and robustness to missing values. We apply Turbo-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Turbo-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. Finally, we demonstrate the generality of Turbo-SMT by applying it to a FACEBOOK dataset (users, {\textquoteleft}friends{\textquoteright}, wall-postings); there, Turbo-SMT spots spammer-like anomalies.",

keywords = "algorithm, coupled matrix-tensor factorization, fMRI data, neurosemantics, parallel, sparse, speedup, tensor",

author = "Papalexakis, {Evangelos E.} and Mitchell, {Tom M.} and Sidiropoulos, {Nicholas D.} and Faloutsos, Christos and Talukdar, {Partha Pratim} and Murphy, Brian",

note = "Publisher Copyright: {\textcopyright} 2016 Wiley Periodicals, Inc.",

year = "2016",

month = aug,

day = "1",

doi = "10.1002/sam.11315",

language = "English (US)",

volume = "9",

pages = "269--290",

journal = "Statistical Analysis and Data Mining",

issn = "1932-1864",

publisher = "John Wiley and Sons Inc.",

number = "4",

}

TY - JOUR

T1 - Turbo-SMT

T2 - Parallel coupled sparse matrix-Tensor factorizations and applications

AU - Papalexakis, Evangelos E.

AU - Mitchell, Tom M.

AU - Sidiropoulos, Nicholas D.

AU - Faloutsos, Christos

AU - Talukdar, Partha Pratim

AU - Murphy, Brian

PY - 2016/8/1

Y1 - 2016/8/1

N2 - How can we correlate the neural activity in the human brain, as it responds to typed words, with properties of these terms (like ‘edible’, ‘fits in hand’)? In short, we want to find latent variables that jointly explain both the brain activity and the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem. Can we enhance any CMTF solver so that it can operate on potentially very large datasets that may not fit in main memory? We introduce Turbo-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm, parallelizes it, and produces sparse and interpretable solutions (up to 65 fold). Additionally, we improve upon ALS, the workhorse algorithm for CMTF, with respect to efficiency and robustness to missing values. We apply Turbo-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Turbo-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. Finally, we demonstrate the generality of Turbo-SMT by applying it to a FACEBOOK dataset (users, ‘friends’, wall-postings); there, Turbo-SMT spots spammer-like anomalies.

AB - How can we correlate the neural activity in the human brain, as it responds to typed words, with properties of these terms (like ‘edible’, ‘fits in hand’)? In short, we want to find latent variables that jointly explain both the brain activity and the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem. Can we enhance any CMTF solver so that it can operate on potentially very large datasets that may not fit in main memory? We introduce Turbo-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm, parallelizes it, and produces sparse and interpretable solutions (up to 65 fold). Additionally, we improve upon ALS, the workhorse algorithm for CMTF, with respect to efficiency and robustness to missing values. We apply Turbo-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Turbo-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. Finally, we demonstrate the generality of Turbo-SMT by applying it to a FACEBOOK dataset (users, ‘friends’, wall-postings); there, Turbo-SMT spots spammer-like anomalies.

KW - algorithm

KW - coupled matrix-tensor factorization

KW - fMRI data

KW - neurosemantics

KW - parallel

KW - sparse

KW - speedup

KW - tensor

UR - http://www.scopus.com/inward/record.url?scp=84978520160&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84978520160&partnerID=8YFLogxK

U2 - 10.1002/sam.11315

DO - 10.1002/sam.11315

M3 - Article

C2 - 27672406

AN - SCOPUS:84978520160

SN - 1932-1864

VL - 9

SP - 269

EP - 290

JO - Statistical Analysis and Data Mining

JF - Statistical Analysis and Data Mining

IS - 4

ER -
