Minits-AllOcc: An Efficient Algorithm for Mining Timed Sequential Patterns

Somayah Karsoum, Clark Barrus, Le Gruenwald, Eleazar Leal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Sequential pattern mining aims to find the subsequences in a sequence database that appear together in the order of timestamps. Although there exist sequential pattern mining techniques, they ignore the temporal relationship information between the itemsets in the subsequences. This information is important in many real-world applications. For example, even if healthcare providers know that symptom Y frequently occurs after symptom X, it is also valuable for them to be able to estimate when Y will occur after X so that they can provide treatment at the right time. Considering temporal relationship information for sequential pattern mining raises new issues to be solved, such as designing a new data structure to save this information and traversing this structure efficiently to discover patterns without re-scanning the database. In this paper, we propose an algorithm called Minits-AllOcc (MINIng Timed Sequential Pattern for All-time Occurrences) to find sequential patterns and the transition time between itemsets based on all possible occurrences of a pattern in the database. We also propose a parallel multicore CPU version of this algorithm, called MMinits-AllOcc (Multicore Minits-AllOcc), to deal with Big Data. Extensive experiments on real and synthetic datasets show the advantages of this approach over the brute-force method. Also, the multicore CPU version of the algorithm is shown to outperform the single-core version on Big Data by 2.5X.
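
To make the idea of "transition time based on all possible occurrences" more concrete, here is a minimal Python sketch. It is a brute-force enumeration over a tiny in-memory database, not the Minits-AllOcc algorithm or its data structure. The toy database, the function names `transition_times` and `mine_pair`, and the min/max/mean summary are all assumptions made for illustration; the abstract does not say how transition times are aggregated.

```python
from statistics import mean

# Assumed representation: a sequence is a list of (timestamp, itemset)
# pairs sorted by strictly increasing timestamp.
# Hypothetical toy database: three patients, symptoms as items.
DATABASE = [
    [(0, {"X"}), (3, {"Y"}), (5, {"X"}), (9, {"Y"})],
    [(1, {"X", "Z"}), (4, {"Y"})],
    [(2, {"Z"}), (6, {"Y"})],
]


def transition_times(sequence, first, second):
    """Return the time gap of every occurrence of `first` followed later by `second`."""
    gaps = []
    for i, (t1, items1) in enumerate(sequence):
        if not first <= items1:  # `first` must be a subset of this itemset
            continue
        for t2, items2 in sequence[i + 1:]:
            if second <= items2:
                gaps.append(t2 - t1)
    return gaps


def mine_pair(database, first, second):
    """Count supporting sequences and collect the gaps of all occurrences."""
    per_sequence = [transition_times(seq, first, second) for seq in database]
    support = sum(1 for gaps in per_sequence if gaps)
    all_gaps = [g for gaps in per_sequence for g in gaps]
    return support, all_gaps


support, gaps = mine_pair(DATABASE, {"X"}, {"Y"})
summary = (min(gaps), max(gaps), mean(gaps)) if gaps else None
print(support, gaps, summary)
```

In this toy data the pattern X followed by Y is supported by two of the three sequences, and the first sequence alone contributes three occurrences. This kind of repeated enumeration is the sort of work the abstract says the proposed approach is designed to avoid by not re-scanning the database.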

Original language: English (US)
Title of host publication: Advances in Knowledge Discovery and Data Mining - 25th Pacific-Asia Conference, PAKDD 2021, Proceedings
Editors: Kamal Karlapalem, Hong Cheng, Naren Ramakrishnan, R. K. Agrawal, P. Krishna Reddy, Jaideep Srivastava, Tanmoy Chakraborty
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 668-685
Number of pages: 18
ISBN (Print): 9783030757618
State: Published - 2021
Event: 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2021 - Virtual, Online
Duration: May 11, 2021 to May 14, 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12712 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2021
City: Virtual, Online
Period: 5/11/21 to 5/14/21

Bibliographical note

Publisher Copyright:
© 2021, Springer Nature Switzerland AG.

Keywords

  • Multicore
  • Parallel sequential pattern mining
  • Sequential pattern mining
  • Timed sequential pattern
