Big data frequent pattern mining

David C. Anastasiu; Jeremy Iverson; Shaden Smith; George Karypis

doi:10.1007/978-3-319-07821-2_10

Computer Science and Engineering

Research output: Chapter in Book/Report/Conference proceeding › Chapter

20 Scopus citations

Abstract

Frequent pattern mining is an essential data mining task, with a goal of discovering knowledge in the form of repeated patterns. Many efficient pattern mining algorithms have been discovered in the last two decades, yet most do not scale to the type of data we are presented with today, the so-called Big Data. Scalable parallel algorithms hold the key to solving the problem in this context. In this chapter, we review recent advances in parallel frequent pattern mining, analyzing them through the Big Data lens. We identify three areas as challenges to designing parallel frequent pattern mining algorithms: memory scalability, work partitioning, and load balancing. With these challenges as a frame of reference, we extract and describe key algorithmic design patterns from the wealth of research conducted in this domain.

Original language	English (US)
Title of host publication	Frequent Pattern Mining
Publisher	Springer International Publishing
Pages	225-259
Number of pages	35
ISBN (Electronic)	9783319078212
ISBN (Print)	3319078208, 9783319078205
DOIs	https://doi.org/10.1007/978-3-319-07821-2_10
State	Published - Jul 1 2014

Bibliographical note

Publisher Copyright:
© 2014 Springer International Publishing Switzerland. All rights are reserved.

Keywords

Data mining
Frequent graph mining
Frequent pattern mining
Frequent sequence mining
Load balancing
Memory scalability
Motif discovery
Parallel algorithms
Work partitioning

Cite this

@incollection{ab3a51edd2f44dc587e376b78fff99a5,

title = "Big data frequent pattern mining",

abstract = "Frequent pattern mining is an essential data mining task, with a goal of discovering knowledge in the form of repeated patterns. Many efficient pattern mining algorithms have been discovered in the last two decades, yet most do not scale to the type of data we are presented with today, the so-called Big Data. Scalable parallel algorithms hold the key to solving the problem in this context. In this chapter, we review recent advances in parallel frequent pattern mining, analyzing them through the Big Data lens. We identify three areas as challenges to designing parallel frequent pattern mining algorithms: memory scalability, work partitioning, and load balancing. With these challenges as a frame of reference, we extract and describe key algorithmic design patterns from the wealth of research conducted in this domain.",

keywords = "Data mining, Frequent graph mining, Frequent pattern mining, Frequent sequence mining, Load balancing, Memory scalability, Motif discovery, Parallel algorithms, Work partitioning",

author = "Anastasiu, David C. and Iverson, Jeremy and Smith, Shaden and Karypis, George",

note = "Publisher Copyright: {\textcopyright} 2014 Springer International Publishing Switzerland. All rights are reserved.",

year = "2014",

month = jul,

day = "1",

doi = "10.1007/978-3-319-07821-2_10",

language = "English (US)",

isbn = "3319078208",

pages = "225--259",

booktitle = "Frequent Pattern Mining",

publisher = "Springer International Publishing",

}

TY - CHAP

T1 - Big data frequent pattern mining

AU - Anastasiu, David C.

AU - Iverson, Jeremy

AU - Smith, Shaden

AU - Karypis, George

PY - 2014/7/1

Y1 - 2014/7/1

AB - Frequent pattern mining is an essential data mining task, with a goal of discovering knowledge in the form of repeated patterns. Many efficient pattern mining algorithms have been discovered in the last two decades, yet most do not scale to the type of data we are presented with today, the so-called Big Data. Scalable parallel algorithms hold the key to solving the problem in this context. In this chapter, we review recent advances in parallel frequent pattern mining, analyzing them through the Big Data lens. We identify three areas as challenges to designing parallel frequent pattern mining algorithms: memory scalability, work partitioning, and load balancing. With these challenges as a frame of reference, we extract and describe key algorithmic design patterns from the wealth of research conducted in this domain.

KW - Data mining

KW - Frequent graph mining

KW - Frequent pattern mining

KW - Frequent sequence mining

KW - Load balancing

KW - Memory scalability

KW - Motif discovery

KW - Parallel algorithms

KW - Work partitioning

UR - http://www.scopus.com/inward/record.url?scp=84930330824&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84930330824&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-07821-2_10

DO - 10.1007/978-3-319-07821-2_10

M3 - Chapter

AN - SCOPUS:84930330824

SN - 3319078208

SN - 9783319078205

SP - 225

EP - 259

BT - Frequent Pattern Mining

PB - Springer International Publishing

ER -
