On the Fundamental Limits of Coded Data Shuffling

Adel Elmahdy, Soheil Mohajer

Research output: Chapter in Book/Report/Conference proceedingConference contribution

9 Scopus citations

Abstract

We consider the data shuffling problem, in which a master node is connected to a set of worker nodes, via a shared link, in order to communicate a set of files to the worker nodes. The master node has access to a database of files. In every shuffling iteration, each worker node processes a new subset of files, and has excess storage to partially cache the remaining files. We characterize the exact rate-memory trade-off for the worst-case shuffling under the assumption that cached files are uncoded, by deriving the minimum communication rate for a given storage capacity per worker node. As a byproduct, the exact rate-memory trade-off for any random shuffling is characterized when the number of files is equal to the number of worker nodes. We propose a novel deterministic and systematic coded shuffling scheme, which improves the state of the art. Then, we prove the optimality of our proposed scheme by deriving a matching lower bound and showing that the placement phase of the proposed coded shuffling scheme is optimal over all shuffles.

Original languageEnglish (US)
Title of host publication2018 IEEE International Symposium on Information Theory, ISIT 2018
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages716-720
Number of pages5
ISBN (Print)9781538647806
DOIs
StatePublished - Aug 15 2018
Event2018 IEEE International Symposium on Information Theory, ISIT 2018 - Vail, United States
Duration: Jun 17 2018Jun 22 2018

Publication series

NameIEEE International Symposium on Information Theory - Proceedings
Volume2018-June
ISSN (Print)2157-8095

Other

Other2018 IEEE International Symposium on Information Theory, ISIT 2018
Country/TerritoryUnited States
CityVail
Period6/17/186/22/18

Bibliographical note

Publisher Copyright:
© 2018 IEEE.

Fingerprint

Dive into the research topics of 'On the Fundamental Limits of Coded Data Shuffling'. Together they form a unique fingerprint.

Cite this