A Game-Theoretic Framework for Detecting Advanced Persistent Threats
| dc.contributor.advisor | Poovendran, Radha | |
| dc.contributor.author | Sahabandu, Dinuka | |
| dc.date.accessioned | 2023-08-14T17:04:26Z | |
| dc.date.available | 2023-08-14T17:04:26Z | |
| dc.date.issued | 2023-08-14 | |
| dc.date.submitted | 2023 | |
| dc.description | Thesis (Ph.D.)--University of Washington, 2023 | |
| dc.description.abstract | Advanced Persistent Threats (APTs) are stealthy and long-term attacks on cyber systems that threaten the security and privacy of sensitive information. The interactions of APTs with a victim system introduce information flows that are recorded in system logs. Dynamic Information Flow Tracking (DIFT) is a promising mechanism that examines the usage of information flows for detecting APTs. DIFT taints (tags) information flows originating at system entities that are susceptible to an attack, tracks the propagation of tainted flows, and authenticates the tainted flows at certain system components (traps) according to a pre-defined security policy. The deployment of DIFT to defend against APTs in cyber systems is limited by heavy resource and performance overhead. The effectiveness of detecting APTs depends on the False-Negatives (FNs) and False-Positives (FPs) associated with DIFT’s security analysis. FNs and FPs of DIFT arise due to the inability of DIFT’s pre-defined security policies to detect the stealthy behavior of APTs. In this dissertation, we use game theory to develop an interaction model between DIFT and APT. We use this model to study the trade-off between resource efficiency and the effectiveness of detection. Our game-theoretic framework incorporates several parameters that characterize the interaction, including costs of performing security analysis, false positives, and false negatives. We use system log data and postmortem (offline) analysis of that data to construct an Information Flow Graph (IFG) that captures the victim system’s events during execution. We model and evaluate our game-theoretic frameworks on the IFGs extracted from a set of real-world attack datasets. We summarize the contributions of this dissertation below. First, we consider simplified models for DIFT and APT in order to draw insights into their interactions.
Specifically, we assume DIFT does not incur any false negatives or false positives when performing a security analysis on a tagged information flow. The objective of DIFT is to identify the best set of system locations for tagging information flows that incur minimum resource overhead and enable a high probability of APT detection by ensuring the tagged flows reach a set of predefined traps. We consider an APT whose goal is to evade detection and reach its target(s) in the victim system (single-stage attacks). We formulate a nonzero-sum, imperfect information game (DIFT-APT game) that models the interactions between the DIFT and APT. We characterize equilibrium strategies for both the defense and adversary, and design efficient algorithms for computing the strategies. Then, we extend the DIFT-APT game model to incorporate multi-stage APT attacks where the adversary sequentially passes through a set of intermediate targets in the system before reaching its final target. We model the best responses of the DIFT and APT using a shortest path problem and a submodular optimization problem, respectively. For a special case of the problem where the attack is a single-stage attack, we show a Nash equilibrium can be computed using a min-cut problem. We provide a polynomial-time algorithm to compute a correlated equilibrium for the multi-stage attack case. An equilibrium policy of DIFT identifies the best set of tag sources, trap locations, and pre-defined security rules that incur minimum resource cost while enabling a high detection probability of an APT. Next, we model the detection of multiple APTs using resource-constrained DIFT. Given the attackers’ strategies, we prove that finding an optimal defense strategy is equivalent to maximizing an increasing DR-submodular function. Given a defense strategy and strategies of other attackers, we show that finding an optimal attacker strategy is equivalent to solving a shortest path problem.
Then, we incorporate the false negatives and false positives associated with DIFT’s security analysis by modeling the strategic interactions between DIFT and APT as a stochastic game. We prove that the best response of the APT is a maximal reachability probability problem. We formulate the best response of the defense as a linear optimization problem. We present a nonlinear programming based polynomial-time algorithm to find an ϵ-Nash equilibrium (NE) of the discounted stochastic DIFT-APT game. Next, we provide a model-free reinforcement learning algorithm to compute an NE of the discounted stochastic DIFT-APT game when the underlying false negatives and false positives of the DIFT are unknown. Specifically, we use an actor-critic algorithm that combines value-based and policy-based methods for faster convergence rates and smaller convergence errors. Then, we formulate the interactions between DIFT and APT as an average reward stochastic game to capture the long-term behavior of the APTs. We show the existence of an Average Reward Nash Equilibrium (ARNE) in the DIFT-APT game. We propose a reinforcement learning algorithm, RL-ARNE, to learn an ARNE of the DIFT-APT game and prove the convergence of the RL-ARNE algorithm to an ARNE of an average reward stochastic game. Finally, we study the problem of Instruction Set Architecture (ISA) identification using partial binaries to facilitate DIFT in detecting known malicious patterns recorded in the program binaries. We use two different datasets with binaries from 12 ISAs and 23 ISAs to show that byte-level (1, 2, 3)-gram TF-IDF features yield high accuracy (∼98%) compared to the existing byte-histogram and signature-based features (∼91%). Additionally, we show that character-level (1, 2, 3)-gram TF-IDF features extracted from encoded binaries yield high accuracy with 16× fewer features compared to the number of byte-level (1, 2, 3)-gram TF-IDF features.
For future research, we propose to investigate interpreting a Nash equilibrium of the average reward stochastic game between DIFT and APT using graph theory. Additionally, we provide insights on integrating signature-based and anomaly-based detection into our game framework. We believe this line of research is a promising endeavor to enable resource-efficient DIFT for detecting APTs. | |
| dc.embargo.terms | Open Access | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.other | Sahabandu_washington_0250E_25810.pdf | |
| dc.identifier.uri | http://hdl.handle.net/1773/50377 | |
| dc.language.iso | en_US | |
| dc.rights | CC BY | |
| dc.subject | Game Theory | |
| dc.subject | Machine Learning | |
| dc.subject | Reinforcement Learning | |
| dc.subject | Electrical engineering | |
| dc.subject | Computer science | |
| dc.subject | Engineering | |
| dc.subject.other | Electrical and computer engineering | |
| dc.title | A Game-Theoretic Framework for Detecting Advanced Persistent Threats | |
| dc.type | Thesis |
Files
Original bundle
- Name: Sahabandu_washington_0250E_25810.pdf
- Size: 5.3 MB
- Format: Adobe Portable Document Format
