Modeling memory processes in phishing decision making using instance based learning and natural language processing

Xu, Tianhao

Files

Xu_washington_0250E_26203.pdf (7.49 MB)

Date

2023-09-27

Authors

Xu, Tianhao

Abstract

Phishing is a type of social engineering attack that uses psychological manipulation to induce people to reveal their personal information. Despite advancements in security technologies, phishing attacks continue to be rampant and successful because they are primarily rare events, and discriminating phishing emails from legitimate emails continues to be challenging. Furthermore, attackers now exploit our personal information on the Internet to generate personalized phishing attacks, known as spear-phishing attacks. Current automation (e.g., spam filters) successfully filters most phishing emails but is poor at detecting spear-phishing attacks. Therefore, the onus is on the recipient to detect attacks that automation misses. However, in most instances, people who are the targets of spear-phishing attacks fall victim to them. Past research has mostly blamed inattention for human susceptibility to phishing. But what is the underlying cognitive process that prevents people from paying attention to key indicators in phishing and spear-phishing messages? The answer to this question could lie in human working memory, because our attention is inextricably linked with the contents of our memory and the memory processes that govern our decision-making. Despite the large body of research on phishing attacks, there is a significant lack of work explaining the role of memory processes in end-user susceptibility to phishing and spear-phishing attacks. This dissertation addresses this research gap through laboratory experiments grounded in the principles of instance-based learning and the development of cognitive models of human decision-making. A novel multi-player, human-in-the-loop simulation environment called SpearSim was developed to study human decision-making in spear-phishing attacks from both the attackers' and the end-users' perspectives.
Results from the experiment show that access to more personal information about targets can enable attackers to produce spear-phishing attacks involving contextually meaningful impersonation and narratives, making end-users more vulnerable to spear-phishing attacks. Data from the experiment conducted using SpearSim was used to train Instance-Based Learning (IBL) models and natural language processing models (LSA, GloVe, and BERT) to predict and explain the role of working memory processes behind the human response to phishing and spear-phishing attacks. Results from my experiments with IBL models of phishing decision-making show that, compared to representations that only consider the semantic properties of emails, representations that consider higher-order contextual meanings assigned by humans could enable IBL agents to predict human responses with high accuracy. Furthermore, I found evidence that IBL models of phishing decision-making performed better in predicting responses in situations where participants made quick, System 1-like decisions, suggesting that instance-based learning satisfies the conditions for describing intuitive decision-making. A follow-up study focused on end-user phishing decision-making to further test the insights obtained from the previous experiment and to understand how people encode emails into memory. The study used an eye tracker to monitor participants' eye movements as they processed the emails presented to them and to study how end-users' attention may influence their decision-making. As in the previous experiment, data from the study was used to develop IBL models of phishing decision-making. I once again found that representations that consider higher-order contextual meanings assigned by humans enable IBL agents to predict human responses more accurately than input representations that consider human attention to words and phrases along with the semantic properties of the emails.
I also found more evidence from eye-tracking data revealing that instance-based learning models effectively predict human responses in situations involving intuitive decision-making. This insight is crucial to cyber security defense because people are more likely to fall victim to phishing and spear-phishing attacks when they rely on intuitive decision-making, and my work shows that models grounded in IBL can inform interventions to mitigate phishing threats. Findings from this dissertation are expected to advance our understanding of the cognitive processes associated with detecting phishing attacks and could facilitate the development of personalized anti-phishing training solutions. The findings are also expected to contribute to our understanding of cognitive models and how to apply them to analyze human decision-making in cyber security.
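
For readers unfamiliar with instance-based learning, the following is a minimal, generic sketch of how an IBL agent might classify an email from remembered instances. It is not the dissertation's implementation: the abstract does not give the models' parameters or exact representations. The `Instance` class, the two-dimensional feature vectors (standing in for an email embedding such as one derived from LSA, GloVe, or BERT), the use of cosine similarity for partial matching, and all parameter values are assumptions made for illustration. Activation noise is omitted so the example is deterministic; the noise parameter is used only to set the retrieval temperature.

```python
import math
from dataclasses import dataclass


@dataclass
class Instance:
    features: list    # hypothetical email embedding
    decision: str     # "phishing" or "legitimate"
    outcome: float    # 1.0 if that decision was correct, else 0.0
    timestamps: list  # trial numbers at which the instance was observed


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def activation(inst, probe, now, decay=0.5, mismatch=1.0):
    """Base-level activation (recency and frequency) minus a mismatch penalty."""
    base = math.log(sum((now - t) ** -decay for t in inst.timestamps))
    return base - mismatch * (1.0 - cosine(inst.features, probe))


def blended_values(memory, probe, now, decay=0.5, mismatch=1.0, noise=0.25):
    """Blend remembered outcomes per decision, weighted by retrieval probability."""
    tau = noise * math.sqrt(2)  # retrieval temperature
    values = {}
    for decision in {inst.decision for inst in memory}:
        candidates = [inst for inst in memory if inst.decision == decision]
        weights = [
            math.exp(activation(inst, probe, now, decay, mismatch) / tau)
            for inst in candidates
        ]
        total = sum(weights)
        values[decision] = sum(
            w / total * inst.outcome for w, inst in zip(weights, candidates)
        )
    return values


def predict(memory, probe, now):
    """Choose the decision with the highest blended value."""
    values = blended_values(memory, probe, now)
    return max(values, key=values.get)


# Toy memory: feature [1, 0] stands for "phishing-like", [0, 1] for "benign-like".
memory = [
    Instance([1.0, 0.0], "phishing", 1.0, [1]),
    Instance([0.0, 1.0], "phishing", 0.0, [2]),
    Instance([1.0, 0.0], "legitimate", 0.0, [3]),
    Instance([0.0, 1.0], "legitimate", 1.0, [4]),
]

print(predict(memory, [0.9, 0.1], now=5))
print(predict(memory, [0.1, 0.9], now=5))
```

In this sketch, a new email that resembles past phishing emails retrieves mostly instances in which the "phishing" decision was correct, so that decision receives the higher blended value. Swapping the feature vectors for a different representation changes which instances are retrieved, which is one way to compare how well different representations predict human responses.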