Scalable computational geometry in MapReduce

Yuan Li, Ahmed Eldawy, Jie Xue, Nadezda Knorozova, Mohamed F. Mokbel, Ravi Janardan

Research output: Contribution to journalArticlepeer-review

7 Scopus citations

Abstract

Hadoop, employing the MapReduce programming paradigm, has been widely accepted as the standard framework for analyzing big data in distributed environments. Unfortunately, this rich framework has not been exploited for processing large-scale computational geometry operations. This paper introduces CG_Hadoop; a suite of scalable and efficient MapReduce algorithms for various fundamental computational geometry operations, namely polygon union, Voronoi diagram, skyline, convex hull, farthest pair, and closest pair, which present a set of key components for other geometric algorithms. For each computational geometry operation, CG_Hadoop has two versions, one for the Apache Hadoop system and one for the SpatialHadoop system, a Hadoop-based system that is more suited for spatial operations. These proposed algorithms form the nucleus of a comprehensive MapReduce library of computational geometry operations. Extensive experimental results run on a cluster of 25 machines over datasets of size up to 3.8B records show that CG_Hadoop achieves up to 14x and 115x better performance than traditional algorithms when using Hadoop and SpatialHadoop systems, respectively.

Original languageEnglish (US)
Pages (from-to)523-548
Number of pages26
JournalVLDB Journal
Volume28
Issue number4
DOIs
StatePublished - Aug 1 2019

Bibliographical note

Publisher Copyright:
© 2019, Springer-Verlag GmbH Germany, part of Springer Nature.

Keywords

  • Computational Geometry
  • Distributed Systems
  • Hadoop
  • MapReduce
  • Output-sensitive Algorithms

Fingerprint

Dive into the research topics of 'Scalable computational geometry in MapReduce'. Together they form a unique fingerprint.

Cite this