The Tucker decomposition is a higher-order analogue of the singular value decomposition and is a popular method of performing analysis on multi-way data (tensors). Computing the Tucker decomposition of a sparse tensor is demanding in terms of both memory and computational resources. The primary kernel of the factorization is a chain of tensor-matrix multiplications (TTMc). State-of-the-art algorithms accelerate the underlying computations by trading off memory to memoize the intermediate results of TTMc in order to reuse them across iterations. We present an algorithm based on a compressed data structure for sparse tensors and show that many computational redundancies during TTMc can be identified and pruned without the memory overheads of memoization. In addition, our algorithm can further reduce the number of operations by exploiting an additional amount of user-specified memory. We evaluate our algorithm on a collection of real-world and synthetic datasets and demonstrate up to 20.7X speedup while using 28.5X less memory than the state-of-the-art parallel algorithm.
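To make the TTMc kernel concrete, the following is a minimal sketch for a third-order sparse tensor stored as a list of coordinates and values. It is not the algorithm from the paper. The function names, the coordinate-list storage, and the grouping by (i, j) fibers are assumptions chosen for illustration. The first function performs one outer product per nonzero. The second shows one kind of redundancy that a compressed layout can expose: nonzeros sharing the same (i, j) prefix are summed along the last mode first, so only one outer product per fiber is needed.

```python
from collections import defaultdict


def ttmc_mode0(coords, vals, B, C):
    """Naive TTMc along mode 0 for a 3rd-order sparse tensor.

    Computes Y[i][a][b] = sum over nonzeros (i, j, k) of x * B[j][a] * C[k][b],
    performing one full outer product per nonzero.
    """
    r2, r3 = len(B[0]), len(C[0])
    Y = defaultdict(lambda: [[0.0] * r3 for _ in range(r2)])
    for (i, j, k), x in zip(coords, vals):
        out = Y[i]
        for a in range(r2):
            xb = x * B[j][a]
            for b in range(r3):
                out[a][b] += xb * C[k][b]
    return dict(Y)


def ttmc_mode0_fibers(coords, vals, B, C):
    """Same result, but nonzeros sharing an (i, j) prefix are combined first.

    Illustrative assumption: grouping by (i, j) stands in for what a
    compressed, tree-like sparse tensor layout would provide directly.
    """
    r2, r3 = len(B[0]), len(C[0])

    # Step 1: for each (i, j) fiber, accumulate sum_k x_ijk * C[k][:].
    fibers = defaultdict(lambda: [0.0] * r3)
    for (i, j, k), x in zip(coords, vals):
        acc = fibers[(i, j)]
        for b in range(r3):
            acc[b] += x * C[k][b]

    # Step 2: one outer product with B[j][:] per fiber instead of per nonzero.
    Y = defaultdict(lambda: [[0.0] * r3 for _ in range(r2)])
    for (i, j), acc in fibers.items():
        out = Y[i]
        for a in range(r2):
            for b in range(r3):
                out[a][b] += B[j][a] * acc[b]
    return dict(Y)


if __name__ == "__main__":
    coords = [(0, 0, 0), (0, 0, 1), (1, 1, 0)]
    vals = [1.0, 2.0, 3.0]
    B = [[1.0, 2.0], [3.0, 4.0]]  # factor matrix for mode 1
    C = [[1.0, 0.0], [0.0, 1.0]]  # factor matrix for mode 2
    print(ttmc_mode0(coords, vals, B, C))
    print(ttmc_mode0_fibers(coords, vals, B, C))
```

Both functions return the same slices of the output. The second performs fewer outer products whenever several nonzeros share a fiber, without storing intermediate results across iterations.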
|Original language||English (US)|
|Title of host publication||Euro-Par 2017|
|Subtitle of host publication||Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Proceedings|
|Editors||Francisco F. Rivera, Tomas F. Pena, Jose C. Cabaleiro|
|Number of pages||16|
|State||Published - 2017|
|Event||23rd International Conference on Parallel and Distributed Computing, Euro-Par 2017 - Santiago de Compostela, Spain|
Duration: Aug 28 2017 → Sep 1 2017
|Name||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Other||23rd International Conference on Parallel and Distributed Computing, Euro-Par 2017|
|City||Santiago de Compostela|
|Period||8/28/17 → 9/1/17|
Bibliographical note
Funding Information:
Acknowledgments. The authors would like to thank Oguz Kaya for sharing the HyperTensor source code, Muthu Baskaran for providing the Alzheimer tensor, Jee W. Choi for providing the synthetic tensor generator, and anonymous reviewers for their valuable feedback. This work was supported in part by NSF (IIS-0905220, OCI-1048018, CNS-1162405, IIS-1247632, IIP-1414153, IIS-1447788), Army Research Office (W911NF-14-1-0316), a University of Minnesota Doctoral Dissertation Fellowship, Intel Software and Services Group, and the Digital Technology Center at the University of Minnesota. Access to research and computing facilities was provided by the Digital Technology Center and the Minnesota Supercomputing Institute.
© 2017, Springer International Publishing AG.