Hypertext transfer protocol (HTTP) has become the main protocol to carry out malicious activities. Attackers typically use HTTP for communication with command-and-control servers, click fraud, phishing and other malicious activities, as they can easily hide among the large amount of benign HTTP traffic. The user-agent (UA) field in the HTTP header carries information on the application, operating system (OS), device, and so on, and adversaries fake UA strings as a way to evade detection. Motivated by this, we propose a novel grammar-guided UA string classification method in HTTP flows. We leverage the fact that a number of 'standard' applications, such as web browsers and iOS mobile apps, have well-defined syntaxes that can be specified using context-free grammars, and we extract OS, device and other relevant information from them. We develop association heuristics to classify UA strings that are generated by 'non-standard' applications that do not contain OS or device information. We provide a proof-of-concept system that demonstrates how our approach can be used to identify malicious applications that generate fake UA strings to engage in fraudulent activities.
Bibliographical notePublisher Copyright:
Copyright © 2015 John Wiley & Sons, Ltd.