TY - GEN
T1 - Large-scale neural modeling in MapReduce and Giraph
AU - Yang, Shuo
AU - Spielman, Nicholas D.
AU - Jackson, Jadin C.
AU - Rubin, Brad S.
PY - 2014
Y1 - 2014
N2 - One of the most crucial challenges in scientific computing is scalability. Hadoop, an open-source implementation of the MapReduce parallel programming model developed by Google, has emerged as a powerful platform for performing large-scale scientific computing at very low cost. In this paper, we explore the use of Hadoop to model large-scale neural networks. A neural network is most naturally modeled by a graph structure with iterative processing. We first present an improved graph algorithm design pattern in MapReduce called Mapper-side Schimmy. Experiments show that applying our design pattern, combined with current best practices, can reduce the running time of a simulation on a neural network with 100,000 neurons and 2.3 billion edges by 64%. MapReduce, however, is inherently inefficient for iterative graph processing. To address this limitation of the MapReduce model, we then explore the use of Giraph, an open-source large-scale graph processing framework that sits on top of Hadoop, to implement graph algorithms with a vertex-centric approach. We show that our Giraph implementation boosted performance by 91% compared to a basic MapReduce implementation and by 60% compared to our improved Mapper-side Schimmy algorithm.
UR - http://www.scopus.com/inward/record.url?scp=84906559156&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84906559156&partnerID=8YFLogxK
U2 - 10.1109/EIT.2014.6871824
DO - 10.1109/EIT.2014.6871824
M3 - Conference contribution
AN - SCOPUS:84906559156
SN - 9781479947744
T3 - IEEE International Conference on Electro Information Technology
SP - 556
EP - 561
BT - 2014 IEEE International Conference on Electro/Information Technology, EIT 2014
PB - IEEE Computer Society
T2 - 2014 IEEE International Conference on Electro/Information Technology, EIT 2014
Y2 - 5 June 2014 through 7 June 2014
ER -