Abstract
Many Big Data applications deal with data streams. A data stream is a sequence of timestamped data points characterized by transiency, infiniteness, uncertainty, concept drift, and multi-dimensionality. In this paper we propose Orion, an outlier detection technique that addresses all of these characteristics. Orion uses an evolutionary algorithm to search for a projected dimension of the multi-dimensional data points, and identifies a data point as an outlier if it resides in a low-density region along that dimension. Experiments on both real and synthetic datasets show that Orion achieves, on average, 7X the precision and 5X the recall of existing techniques, with a competitive execution time.
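The idea of projecting onto one dimension and testing density there can be illustrated with a minimal Python sketch. This is not Orion's actual algorithm, which the abstract does not specify. The sketch assumes a fixed window of recent stream points, a simple neighbor-count density estimate, and a basic mutate-and-select evolutionary loop. All function names, parameters, and thresholds are hypothetical.

```python
import math
import random


def project(point, direction):
    """Scalar projection of a point onto a direction vector."""
    return sum(p * d for p, d in zip(point, direction))


def normalize(vector):
    """Scale a vector to unit length (left unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def density(value, projected, bandwidth):
    """Fraction of projected window points within `bandwidth` of `value`."""
    return sum(1 for p in projected if abs(p - value) <= bandwidth) / len(projected)


def fitness(direction, point, window, bandwidth):
    """Assumed fitness: higher when the point is more isolated along `direction`."""
    projected = [project(w, direction) for w in window]
    return -density(project(point, direction), projected, bandwidth)


def evolve_direction(point, window, bandwidth, rng,
                     population=20, generations=30, sigma=0.3):
    """Evolve unit vectors; keep the fitter half and mutate it each generation."""
    dim = len(point)
    pop = [normalize([rng.gauss(0, 1) for _ in range(dim)])
           for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=lambda d: fitness(d, point, window, bandwidth), reverse=True)
        parents = pop[: population // 2]
        children = [
            normalize([x + rng.gauss(0, sigma) for x in rng.choice(parents)])
            for _ in range(population - len(parents))
        ]
        pop = parents + children
    return max(pop, key=lambda d: fitness(d, point, window, bandwidth))


def is_outlier(point, window, bandwidth=0.5, threshold=0.05, seed=0):
    """Flag `point` if its density along the evolved direction is at most `threshold`."""
    rng = random.Random(seed)
    direction = evolve_direction(point, window, bandwidth, rng)
    projected = [project(w, direction) for w in window]
    return density(project(point, direction), projected, bandwidth) <= threshold
```

In a streaming setting, `window` would be refreshed as new points arrive, so that the density estimate tracks concept drift. The bandwidth and threshold here are illustrative values, not tuned ones.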
Original language | English (US) |
---|---|
Title of host publication | Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016 |
Editors | Ronay Ak, George Karypis, Yinglong Xia, Xiaohua Tony Hu, Philip S. Yu, James Joshi, Lyle Ungar, Ling Liu, Aki-Hiro Sato, Toyotaro Suzumura, Sudarsan Rachuri, Rama Govindaraju, Weijia Xu |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 512-521 |
Number of pages | 10 |
ISBN (Electronic) | 9781467390040 |
State | Published - 2016 |
Event | 4th IEEE International Conference on Big Data, Big Data 2016 - Washington, United States; Duration: Dec 5 2016 → Dec 8 2016 |
Publication series
Name | Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016 |
---|
Other
Other | 4th IEEE International Conference on Big Data, Big Data 2016 |
---|---|
Country/Territory | United States |
City | Washington |
Period | 12/5/16 → 12/8/16 |
Bibliographical note
Publisher Copyright: © 2016 IEEE.
Keywords
- Data Mining
- Data Streams
- Outlier Detection