To clean or not to clean: Malware removal strategies for servers under load

Sherwin Doroudi; Thanassis Avgerinos; Mor Harchol-Balter

doi:10.1016/j.ejor.2020.10.036

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

We consider how to best schedule reparative downtime for a customer-facing online service that is vulnerable to cyber attacks such as malware infections. These infections can cause performance degradation (i.e., a slower service rate) and facilitate data theft, both of which have monetary repercussions. Infections may go undetected and can only be removed by time-consuming cleanup procedures, which require temporarily taking the service offline. From a security-oriented perspective, cleanups should be undertaken as frequently as possible. From a performance-oriented perspective, frequent cleanups are desirable because they maintain faster service, but they are simultaneously undesirable because they lead to more frequent downtimes and subsequent loss of revenue. We ask when and how often cleanups should happen. In order to analyze various downtime scheduling policies, we combine queueing-theoretic techniques with a revenue model to capture the problem's tradeoffs. Unlike classical repair problems, this problem necessitates the analysis of a quasi-birth-death Markov chain, tracking the number of customer requests in the system and the (possibly unknown) infection state. We adapt a recent analytic technique, Clearing Analysis on Phases (CAP), to determine the exact steady-state distribution of the underlying Markov chain, which we then use to compute revenue rates and make recommendations. Prior work on downtime scheduling under cyber attacks relies on heuristic approaches, with our work being the first to address this problem analytically.

Original language	English (US)
Pages (from-to)	596-609
Number of pages	14
Journal	European Journal of Operational Research
Volume	292
Issue number	2
DOIs	https://doi.org/10.1016/j.ejor.2020.10.036
State	Published - Jul 16 2021
Externally published	Yes

Bibliographical note

Publisher Copyright: © 2020 Elsevier B.V.

Keywords

Computer security
Maintenance
Malware
Markov processes
Queueing

Cite this

@article{ec1383bf176d4444967d1060db7bdce8,
title = "To clean or not to clean: Malware removal strategies for servers under load",
abstract = "We consider how to best schedule reparative downtime for a customer-facing online service that is vulnerable to cyber attacks such as malware infections. These infections can cause performance degradation (i.e., a slower service rate) and facilitate data theft, both of which have monetary repercussions. Infections may go undetected and can only be removed by time-consuming cleanup procedures, which require temporarily taking the service offline. From a security-oriented perspective, cleanups should be undertaken as frequently as possible. From a performance-oriented perspective, frequent cleanups are desirable because they maintain faster service, but they are simultaneously undesirable because they lead to more frequent downtimes and subsequent loss of revenue. We ask when and how often cleanups should happen. In order to analyze various downtime scheduling policies, we combine queueing-theoretic techniques with a revenue model to capture the problem's tradeoffs. Unlike classical repair problems, this problem necessitates the analysis of a quasi-birth-death Markov chain, tracking the number of customer requests in the system and the (possibly unknown) infection state. We adapt a recent analytic technique, Clearing Analysis on Phases (CAP), to determine the exact steady-state distribution of the underlying Markov chain, which we then use to compute revenue rates and make recommendations. Prior work on downtime scheduling under cyber attacks relies on heuristic approaches, with our work being the first to address this problem analytically.",
keywords = "Computer security, Maintenance, Malware, Markov processes, Queueing",
author = "Sherwin Doroudi and Thanassis Avgerinos and Mor Harchol-Balter",
note = "Publisher Copyright: {\textcopyright} 2020 Elsevier B.V.",
year = "2021",
month = jul,
day = "16",
doi = "10.1016/j.ejor.2020.10.036",
language = "English (US)",
volume = "292",
pages = "596--609",
journal = "European Journal of Operational Research",
issn = "0377-2217",
publisher = "Elsevier",
number = "2",
}

TY  - JOUR
T1  - To clean or not to clean
T2  - Malware removal strategies for servers under load
AU  - Doroudi, Sherwin
AU  - Avgerinos, Thanassis
AU  - Harchol-Balter, Mor
PY  - 2021/7/16
Y1  - 2021/7/16
N2  - We consider how to best schedule reparative downtime for a customer-facing online service that is vulnerable to cyber attacks such as malware infections. These infections can cause performance degradation (i.e., a slower service rate) and facilitate data theft, both of which have monetary repercussions. Infections may go undetected and can only be removed by time-consuming cleanup procedures, which require temporarily taking the service offline. From a security-oriented perspective, cleanups should be undertaken as frequently as possible. From a performance-oriented perspective, frequent cleanups are desirable because they maintain faster service, but they are simultaneously undesirable because they lead to more frequent downtimes and subsequent loss of revenue. We ask when and how often cleanups should happen. In order to analyze various downtime scheduling policies, we combine queueing-theoretic techniques with a revenue model to capture the problem's tradeoffs. Unlike classical repair problems, this problem necessitates the analysis of a quasi-birth-death Markov chain, tracking the number of customer requests in the system and the (possibly unknown) infection state. We adapt a recent analytic technique, Clearing Analysis on Phases (CAP), to determine the exact steady-state distribution of the underlying Markov chain, which we then use to compute revenue rates and make recommendations. Prior work on downtime scheduling under cyber attacks relies on heuristic approaches, with our work being the first to address this problem analytically.
AB  - We consider how to best schedule reparative downtime for a customer-facing online service that is vulnerable to cyber attacks such as malware infections. These infections can cause performance degradation (i.e., a slower service rate) and facilitate data theft, both of which have monetary repercussions. Infections may go undetected and can only be removed by time-consuming cleanup procedures, which require temporarily taking the service offline. From a security-oriented perspective, cleanups should be undertaken as frequently as possible. From a performance-oriented perspective, frequent cleanups are desirable because they maintain faster service, but they are simultaneously undesirable because they lead to more frequent downtimes and subsequent loss of revenue. We ask when and how often cleanups should happen. In order to analyze various downtime scheduling policies, we combine queueing-theoretic techniques with a revenue model to capture the problem's tradeoffs. Unlike classical repair problems, this problem necessitates the analysis of a quasi-birth-death Markov chain, tracking the number of customer requests in the system and the (possibly unknown) infection state. We adapt a recent analytic technique, Clearing Analysis on Phases (CAP), to determine the exact steady-state distribution of the underlying Markov chain, which we then use to compute revenue rates and make recommendations. Prior work on downtime scheduling under cyber attacks relies on heuristic approaches, with our work being the first to address this problem analytically.
KW  - Computer security
KW  - Maintenance
KW  - Malware
KW  - Markov processes
KW  - Queueing
UR  - http://www.scopus.com/inward/record.url?scp=85100575185&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85100575185&partnerID=8YFLogxK
U2  - 10.1016/j.ejor.2020.10.036
DO  - 10.1016/j.ejor.2020.10.036
M3  - Article
AN  - SCOPUS:85100575185
SN  - 0377-2217
VL  - 292
SP  - 596
EP  - 609
JO  - European Journal of Operational Research
JF  - European Journal of Operational Research
IS  - 2
ER  - 
