Local trend discovery on real-time microblogs with uncertain locations in tight memory environments

Abdulaziz Almaslukh, Amr Magdy, Ahmed M. Aly, Mohamed F. Mokbel, Sameh Elnikety, Yuxiong He, Suman Nath, Walid G. Aref

Research output: Contribution to journalArticlepeer-review

Abstract

This paper presents GeoTrend+; a system approach to support scalable local trend discovery on recent microblogs, e.g., tweets, comments, online reviews, and check-ins, that come in real time. GeoTrend+ discovers top-k trending keywords in arbitrary spatial regions from recent microblogs that continuously arrive with high rates and a significant portion has uncertain geolocations. GeoTrend+ distinguishes itself from existing techniques in different aspects: (1) Discovering trends in arbitrary spatial regions, e.g., city blocks. (2) Considering both exact geolocations, e.g., accurate latitude/longitude coordinates, and uncertain geolocations, e.g., district-level or city-level, that represents a significant portion of past years microblogs. (3) Promoting recent microblogs as first-class citizens and optimizes different components to digest a continuous flow of fast data in main-memory while removing old data efficiently. (4) Providing various main-memory optimization techniques that are able to distinguish useful from useless data to effectively utilize tight memory resources while maintaining accurate query results on relatively large amounts of data. (5) Supporting various trending measures that effectively capture trending items under a variety of definitions that suit different applications. GeoTrend+ limits its scope to real-time data that is posted during the last T time units. To support its queries efficiently, GeoTrend+ employs an in-memory spatial index that is able to efficiently digest incoming data and expire data that is beyond the last T time units. The index also materializes top-k keywords in different spatial regions so that incoming queries can be processed with low latency. In peak times, the main-memory optimization techniques are employed to shed less important data to sustain high query accuracy with limited memory resources. Experimental results based on real data and queries show the scalability of GeoTrend+ to support high arrival rates and low query response time, and at least 90+% query accuracy even under limited memory resources.

Original languageEnglish (US)
Pages (from-to)301-337
Number of pages37
JournalGeoInformatica
Volume24
Issue number2
DOIs
StatePublished - Apr 1 2020
Externally publishedYes

Bibliographical note

Publisher Copyright:
© 2019, Springer Science+Business Media, LLC, part of Springer Nature.

Keywords

  • Adaptive memory optimization
  • Indexing
  • Microblogs
  • Query processing
  • Real-time
  • Spatial
  • Trend
  • Uncertain location
  • Uncertainty

Fingerprint

Dive into the research topics of 'Local trend discovery on real-time microblogs with uncertain locations in tight memory environments'. Together they form a unique fingerprint.

Cite this