In this paper, we study scanning activity directed at a large campus network using a month-long NetFlow traffic trace. Based on the novel notion of “gray” IP space (namely, the collection of IP addresses within our campus network that are not assigned to any “active” host during a certain period of time), we identify and extract potential outside scanners and their associated activities. We then apply data mining and machine learning techniques to analyze the scanning patterns of these scanners and classify them into a few groups (e.g., focused hitters, random address scanners, and blockwise scanners). The goal is to infer the scanners' strategies and thereby assess the potential harmfulness of their activity: for example, whether the observed scanning is simply part of the background radiation of global random scanning or is more focused scanning targeted at our campus network. This is ongoing work; we report some preliminary, yet promising, results obtained so far.
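As a rough illustration of the gray-space idea, the following Python sketch derives a gray address set from a list of flow records and then collects the outside sources that contacted it. It is a minimal sketch under stated assumptions, not the method used in the paper: the rule that a host is "active" if it originated at least one flow, the `min_gray_targets` threshold, the function names, and all addresses are hypothetical choices made for this example.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network


def gray_addresses(flows, campus):
    """Return campus addresses that never appear as a flow source.

    Assumption (not from the paper): a host counts as "active" in the
    observation window if it originated at least one flow.
    """
    active = {src for src, _ in flows if ip_address(src) in campus}
    return {str(addr) for addr in campus.hosts()} - active


def candidate_scanners(flows, campus, min_gray_targets=2):
    """Map each outside source to the set of gray addresses it contacted.

    min_gray_targets is an illustrative threshold, not a value from the paper.
    """
    gray = gray_addresses(flows, campus)
    touched = defaultdict(set)
    for src, dst in flows:
        if ip_address(src) not in campus and dst in gray:
            touched[src].add(dst)
    return {s: d for s, d in touched.items() if len(d) >= min_gray_targets}


# Toy trace of (source, destination) pairs; every address is made up.
campus = ip_network("10.0.0.0/29")
flows = [
    ("10.0.0.1", "198.51.100.7"),  # campus host 10.0.0.1 originates traffic
    ("203.0.113.5", "10.0.0.2"),   # outsider contacts a gray address
    ("203.0.113.5", "10.0.0.3"),   # and a second gray address
    ("203.0.113.9", "10.0.0.1"),   # outsider contacts only an active host
]
scanners = candidate_scanners(flows, campus)
```

In this toy trace only `203.0.113.5` is kept as a candidate, because it is the only outside source that reached gray addresses. Grouping candidates into classes such as focused hitters or blockwise scanners would need further features and is not attempted here.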
|Original language||English (US)|
|State||Published - 2007|
|Event||2nd Workshop on Tackling Computer Systems Problems with Machine Learning Techniques, SysML 2007, co-located with NSDI 2007, Cambridge, United States|
|Period||Apr 10 2007 → …|
Bibliographical note
Funding Information:
This work was supported in part by the NSF grants CNS-0435444 and CNS-0626812, a University of Minnesota Digital Technology Center DTI grant, a Cisco gift grant and an IBM Faculty Partnership Award.