Computing and Software Systems (UW Bothell)
Permanent URI for this collection: https://digital.lib.washington.edu/handle/1773/19652
Recent Submissions
Robust Prediction and Biomarker Discovery in Rare Cancers Using Interpretable Machine Learning (2026-02-05)
Madasamy, Dhurka Rohini; Kim, Wooyoung

Rare cancers such as Glioblastoma Multiforme (GBM), a brain cancer, pose persistent challenges in computational oncology due to limited data, biological noise, and difficulty in isolating disease-specific molecular signatures. Given these constraints, this work began with the expectation that rare-cancer models would perform poorly. However, machine learning approaches on genomic data achieved unexpectedly strong accuracy, motivating investigation into whether this separability reflected genuine biology or artifactual signal. This thesis develops an interpretable machine learning framework that evaluates predictive robustness and isolates biologically meaningful biomarkers under extreme class imbalance. Cascade Learning systematically removes broad cancer pathways and reveals biomarkers uniquely associated with the rare cancer, while SHAP-based interpretability aligns these genes with experimentally reported glioma biology. Complementary Tab2Image visualizations provide spatial confirmation of class separability, strengthening biological trust in the learned signal. Overall, this work provides a robust, biologically grounded, and ethically aligned pathway for rare-cancer biomarker discovery that emphasizes transparency, fairness, and accountability in scarce-data environments.

Understanding Neural Burst Patterns Through Graph Neural Network Explainability in Simulated Neuronal Networks (2025-10-02)
Dhanasekaran, Hari Priya; Stiber, Michael

Spontaneous bursting activity in neural networks represents a fundamental mode of information processing in the brain, yet the mechanisms triggering these synchronized events remain poorly understood.
While graph-based representations of neural networks are established, discovering the specific connectivity and activity patterns that predict burst initiation remains a significant challenge. This work uses graph neural networks (GNNs) to classify and explain burst initiation in Graphitti-simulated cortical networks by representing neurons as nodes with temporal firing statistics and synapses as edges, thereby integrating activity patterns with network architecture. To move beyond black-box classification, we applied GNNExplainer to identify the minimal connectivity patterns driving model predictions. This explainability analysis revealed which specific neurons and synaptic connections the model deemed most critical for each prediction. This work demonstrates how explainable AI can transform our understanding of complex neural dynamics, providing insights that pure predictive modeling cannot offer. By combining the representational power of graph neural networks with explainability techniques, we bridge the gap between prediction and understanding. Our findings challenge prevailing views of burst initiation as a localized phenomenon, instead revealing the critical role of distributed precursor patterns in driving network-wide synchronization. This methodology opens new avenues for investigating emergent behaviors in complex networks.

An Enhancement of Distributed Graph Queries in an Agent-Based Graph Database (2025-10-02)
Prajapati, Aatman Rajeshkumar; Fukuda, Munehiro

Graph databases have become essential in domains requiring real-time querying of highly interconnected data. This work enhances the querying capabilities of a distributed agent-based graph database. Built using the MASS (Multi-Agent Spatial Simulation) Java library, the system combines the computing paradigm of agent-based modeling (ABM) with the rich data modeling of the property graph model.
This system leverages autonomous agents to query a distributed in-memory graph across multiple computing nodes, enabling scalable and parallel graph operations. This work focuses on integrating the Cypher WHERE clause, a critical feature that enables data filtering, into the current system. To implement this functionality, we adopted a modular approach that begins with an abstract syntax tree (AST) for parsing Boolean expressions. On top of this, we developed two evaluation methods designed to handle constraints efficiently. These extensions improve both the flexibility and performance of read operations while retaining the system's agent-based execution model. As a result, the system's practical scope is broadened, and a basis is established for supporting more complex query patterns and future mutation operations.

Exploring Quantum Machine Learning-Enhanced Models for EEG Data Classification (2025-08-01)
Murray, Stephanie Anne; Parsons, Erika

Electroencephalography (EEG) records brain activity linked to both executed and imagined movements, but separating true motor signals from background noise in high-dimensional EEG data remains a challenge. Reliable classifiers are therefore vital for accurately tracking patient progress over time. This work is part of a larger initiative, the Smart NeuroRehab Ecosystem, which has two primary goals: (1) to propose innovative physical-rehabilitation strategies for neurologic conditions such as stroke, using emerging technologies that make therapy more accessible, and (2) to collect and analyze EEG data using machine learning (ML) models that classify movement-related brain signals. EEG data are complex and often difficult to interpret. In this research, we explore the use of quantum machine learning as an alternative approach for EEG signal classification.
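The WHERE-clause integration described in the agent-based graph database abstract above rests on a standard technique: parsing a Boolean filter predicate into an abstract syntax tree (AST) and evaluating it against each entity's properties. The actual system is written in Java on the MASS library; the sketch below is a minimal, hypothetical Python analogue with invented property names, not the thesis's code.

```python
# Minimal AST for Boolean filter expressions, in the spirit of a Cypher
# WHERE clause. Hypothetical illustration only -- the thesis's system is
# Java on the MASS library, and these node/property names are invented.

class Comparison:
    """Leaf node: compare a named property against a literal value."""
    def __init__(self, prop, op, value):
        self.prop, self.op, self.value = prop, op, value

    def evaluate(self, properties):
        actual = properties.get(self.prop)
        if actual is None:          # missing property: filter rejects
            return False
        ops = {"=": lambda a, b: a == b,
               "<": lambda a, b: a < b,
               ">": lambda a, b: a > b}
        return ops[self.op](actual, self.value)

class And:
    """Inner node: short-circuit conjunction of two subtrees."""
    def __init__(self, left, right):
        self.left, self.right = left, right
    def evaluate(self, properties):
        return self.left.evaluate(properties) and self.right.evaluate(properties)

class Or:
    """Inner node: short-circuit disjunction of two subtrees."""
    def __init__(self, left, right):
        self.left, self.right = left, right
    def evaluate(self, properties):
        return self.left.evaluate(properties) or self.right.evaluate(properties)

# WHERE n.age > 30 AND n.city = "Seattle"
predicate = And(Comparison("age", ">", 30), Comparison("city", "=", "Seattle"))
nodes = [{"age": 35, "city": "Seattle"}, {"age": 25, "city": "Seattle"}]
matches = [n for n in nodes if predicate.evaluate(n)]  # only the first node
```

In a distributed setting like the one described, each agent would evaluate such a tree locally against the node properties it carries, which is what makes the filter parallelizable.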
Compared to classical ML strategies, quantum methods may offer a fundamentally different way of representing and processing data, potentially improving classification performance or computational efficiency. We implement and analyze a ten-qubit Variational Quantum Classifier (VQC) and compare its performance to a tuned Random Forest baseline using EEG data from a publicly available 64-channel dataset. The task is to classify each EEG time window as either a movement or a rest condition. Across 40 preliminary runs, the VQC achieves a macro-F1 score of approximately 0.75, accuracy of 0.76, and AUROC of 0.83, outperforming the Random Forest (macro-F1 ≈ 0.71, AUROC ≈ 0.79). In addition to higher macro-F1 and AUROC scores, the VQC also demonstrated significantly better precision and recall on the movement class, based on paired statistical tests. Most experiments were conducted on a quantum simulator, with a subset tested on a cloud-based quantum processor. These findings suggest that hybrid quantum-classical models can match or exceed the performance of tuned classical pipelines without increasing computational complexity. Within the scope of the Smart NeuroRehab project, this work demonstrates that quantum approaches may offer a practical path to continuous EEG monitoring in clinical settings. Future improvements in quantum hardware may expand the range of practical applications in biomedical signal analysis.

Network Behavior Analysis of Spike Timing Dependent Plasticity (STDP) in Simulated Neural Networks (2025-08-01)
Arndorfer, Vanessa; Stiber, Michael

The machine learning landscape is rapidly evolving, with researchers often turning toward nature for inspiration. Understanding the development of neural networks in vivo contributes significant transferable insight for advancing both neuroscience and computational research.
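For reference, the macro-F1 metric headlined in the EEG classification abstract above is the unweighted mean of the per-class F1 scores, so the minority class counts as much as the majority class. A small self-contained computation, with invented confusion counts rather than the study's data:

```python
def f1(tp, fp, fn):
    """Per-class F1 from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def macro_f1(per_class_counts):
    """Unweighted mean of per-class F1 scores: every class weighs equally,
    which is why macro-F1 is the headline metric under class imbalance."""
    scores = [f1(tp, fp, fn) for tp, fp, fn in per_class_counts]
    return sum(scores) / len(scores)

# Hypothetical (tp, fp, fn) counts for the movement and rest classes:
counts = [(40, 10, 10), (35, 10, 15)]
score = macro_f1(counts)
```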
This project applies a multiplicative Spike Timing Dependent Plasticity (STDP) model to the weighted graph output of neural growth simulations and analyzes the resulting spike and weight changes over time. This preliminary investigation establishes a baseline process for understanding the effects of STDP on a neural network and provides a framework for characterizing the resulting network behavior. Through rigorous data analysis, we examine bursting behavior during the refinement phase, analyze the progressive effects of STDP on synapse weights, and compare how network behavior changes between the growth and refinement phases of neural development.

StackBERT-Enhancer: A Dual-Layer BERT-Based Framework for Enhancer Identification and Strength Classification in Genomic Data (2025-08-01)
Tran, Phat; Kim, Wooyoung

Accurately identifying and classifying enhancers, the crucial regulatory DNA sequences, is a significant challenge: traditional computational methods often struggle with their complex, context-dependent nature and lack interpretability. This thesis introduces StackBERT-Enhancer, a novel deep learning framework that addresses these limitations, focusing on two primary tasks: distinguishing enhancer sequences from non-enhancer sequences and classifying identified enhancers by their activity levels. The proposed framework employs multiple transformer-based language models, each independently trained on DNA sequences tokenized with different k-mer sizes, allowing sequence dependencies to be captured across multiple scales. These individual models are then integrated into a stacking ensemble architecture, which significantly boosts classification accuracy, robustness, and generalization, achieving state-of-the-art results of 83.5% in enhancer identification and 99.0% in enhancer strength classification.
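The multiplicative STDP rule named in the plasticity abstract above updates a synapse by an amount that scales with the weight itself, so weights stay bounded in [0, w_max]. The thesis's exact parameterization is not reproduced here; the sketch below is a textbook multiplicative form with invented constants:

```python
import math

def stdp_update(w, dt, w_max=1.0, a_plus=0.05, a_minus=0.055,
                tau_plus=20.0, tau_minus=20.0):
    """One multiplicative STDP weight update (illustrative constants).
    dt = t_post - t_pre in ms. Potentiation scales with the remaining
    headroom (w_max - w); depression scales with the current weight w,
    which is what keeps weights inside [0, w_max]."""
    if dt > 0:    # pre fires before post: potentiate
        return w + a_plus * (w_max - w) * math.exp(-dt / tau_plus)
    elif dt < 0:  # post fires before pre: depress
        return w - a_minus * w * math.exp(dt / tau_minus)
    return w      # simultaneous spikes: no change

w = 0.5
w_pot = stdp_update(w, dt=10.0)   # slightly above 0.5
w_dep = stdp_update(w, dt=-10.0)  # slightly below 0.5
```

Iterating this rule over the spike pairs produced by a growth simulation is what drives the gradual weight drift that the refinement-phase analysis tracks.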
The framework utilizes distributed multi-GPU systems for efficient model training and incorporates interpretability techniques, including SHapley Additive exPlanations (SHAP) for feature importance and attention-score analysis for sequence motif discovery, bridging predictive power with biological insight. This approach offers a robust and interpretable tool for enhancer analysis, with strong potential for applications in disease modeling and broader biomedical research.

Data-Centric Preprocessing for Multivariate Biological Data: From Conventional Pipelines to Novel Pair-Based Approach (2025-08-01)
Selke, William D; Kim, Wooyoung

High-throughput biological datasets, such as those from CAR-T cell therapy, are challenging to analyze due to high dimensionality, heterogeneity, and noise. Standard algorithms often fail not from lack of complexity, but from ignoring the biological structure of the data. This thesis introduces Paired Vector Centralization (PVC), a normalization method designed for paired experimental designs, such as responder vs. toxicity comparisons. PVC re-centers feature vectors around biologically meaningful contrasts, correcting for baseline drift and concentration effects. Applied to protein–protein interaction (PPI) data from pre-infusion CAR-T assays, PVC improves classification and embedding quality over conventional methods. Additional experiments explored hybrid approaches, including Tab2Img and a Convolutional Neural Network–Random Forest (CNN-RF) ensemble, which offered insights but further underscored the value of biologically informed preprocessing. Overall, the findings support a shift from treating biological data as static snapshots to modeling them as dynamic transitions between paired states. Embedding domain knowledge directly into preprocessing enhances signal recovery and interpretability.
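The multi-scale k-mer tokenization underlying StackBERT-Enhancer, described in the abstract above, is simple to state: slide a window of length k over the DNA string with stride 1, producing overlapping tokens. A minimal sketch with a toy sequence rather than real enhancer data:

```python
def kmer_tokenize(sequence, k):
    """Split a DNA sequence into overlapping k-mer tokens (stride 1),
    the standard tokenization used by DNA language models."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

# The same sequence yields different token streams at different scales,
# which is what lets per-k models capture dependencies at multiple ranges:
seq = "ATGCGT"
tokens_3 = kmer_tokenize(seq, 3)  # ['ATG', 'TGC', 'GCG', 'CGT']
tokens_4 = kmer_tokenize(seq, 4)  # ['ATGC', 'TGCG', 'GCGT']
```

In the stacking design described, one transformer is trained per k, and a meta-classifier combines their outputs.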
This pairing-based perspective may offer broader value across immunology and other high-dimensional fields, especially where understanding relationships between conditions matters more than analyzing isolated measurements.

Evaluating Vision-Language-Action Models in Robotic Manipulation: Performance, Implementation, and Comparison with Deterministic Systems (2025-08-01)
Kemple, Jake; Chen, Min

Robotic manipulation systems typically use deterministic policies for perception, decision-making, and task planning, which achieve millimeter-level precision but require extensive specialized development and cannot easily generalize to new tasks. Emerging vision-language-action (VLA) foundation models promise to reduce this specialized effort and inflexibility through learned multimodal reasoning. However, their real-world practicality and associated development costs remain largely unknown. This thesis presents a real-world comparison of a strong open-source VLA foundation model (OpenVLA-7B) against a fine-tuned deterministic control system. Both systems are evaluated on identical hardware: a WidowX 250 6-DoF robotic arm, an Intel RealSense D415 camera, an NVIDIA Jetson AGX Orin edge computer, and the ROS 2 (Robot Operating System 2) framework. Each system repeatedly executes a pick-and-place task under randomized initial conditions. Performance is measured using goal-oriented, object-centric metrics of accuracy, repeatability, and cycle time, adapted from the ISO 9283 standard. Additionally, a qualitative analysis examines the installation effort and configuration challenges associated with each system.
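The PVC abstract above does not spell out the method's formula, so the sketch below is only one plausible minimal reading of "re-centering feature vectors around the pair": subtract the pair's own mean from each member, so any baseline drift shared by both conditions cancels. Treat this as an illustration of the idea, not the published method:

```python
def pair_centralize(x_a, x_b):
    """One plausible minimal reading of pair-based re-centering (NOT
    necessarily the thesis's exact PVC formulation): express each member
    of a paired measurement relative to the pair's own mean, so shared
    baseline drift and concentration effects cancel."""
    center = [(a + b) / 2.0 for a, b in zip(x_a, x_b)]
    return ([a - c for a, c in zip(x_a, center)],
            [b - c for b, c in zip(x_b, center)])

# A baseline offset shared by both conditions disappears after centering:
drift = 100.0
cond_a = [1.0 + drift, 2.0 + drift]
cond_b = [3.0 + drift, 1.0 + drift]
centered_a, centered_b = pair_centralize(cond_a, cond_b)  # drift is gone
```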
The primary contributions of this research are: (i) a comparative evaluation of performance and setup complexity between robotic systems using a VLA-based control policy and conventional deterministic control logic; (ii) documentation of hardware, software, and configuration challenges encountered during VLA system implementation; and (iii) qualitative insights from real-world deployment, emphasizing usability and adaptability. Results indicate that current VLA foundation models underperform deterministic control systems in accuracy, repeatability, and cycle time, limiting their immediate viability for production-level robotic tasks. However, the inherent flexibility of VLA models suggests strong potential as future replacements for deterministic approaches, contingent on fine-tuning, further optimization, enhanced integration frameworks, and better overall performance. These findings offer practical insights and set realistic expectations for developers considering a transition from deterministic robotics systems to VLA-based implementations.

Abstractions for Code Migration from CPU to GPU in Simulation Domain (2025-08-01)
Wadhwa, Avikant; Stiber, Michael

Simulations are crucial in science, enabling the modeling of complex phenomena that are difficult to study experimentally. As simulations scale, they demand greater performance and efficiency. To meet this need, computing has shifted toward heterogeneous architectures that combine CPUs and GPUs. While effective, this shift introduces software engineering challenges, making abstraction an increasingly important tool for improving programmability. Abstractions hide low-level implementation details behind clean interfaces, improving clarity and reducing complexity. This thesis reviews existing abstractions for heterogeneous architectures, analyzing their integration effort, performance trade-offs, and limitations.
It uses the insights from that review to present the design and implementation of DeviceVector, a lightweight abstraction that unifies host and device memory management in Graphitti, a high-performance graph-based simulation platform. DeviceVector improves programmability by reducing code duplication, introducing a clear CPU–GPU data relationship, and abstracting CUDA boilerplate behind an interface that closely mirrors a standard C++ container. The thesis also discusses design approaches for extending support to object hierarchies and general function-level abstractions, further minimizing logic duplication between host and device code. Overall, this work highlights how thoughtful abstraction design can bridge the usability-performance gap in heterogeneous computing systems.

ARM64 for Serverless Computing: Performance Modeling and Analysis to Understand Implications of Architecture Adoption (2025-01-23)
Chen, Xinghan; Lloyd, Wes

The recent availability of ARM64 architectures on serverless computing platforms presents both opportunities and challenges. To encourage adoption of ARM64 processors, AWS discounts the price of Function-as-a-Service (FaaS) compute time by 20% compared to x86_64. Before adopting ARM64 processors for serverless functions, understanding their performance implications can help developers plan and prioritize codebase migration to minimize refactoring and conversion effort. ARM64 processors also predominate on edge devices, so a better understanding of ARM64 function runtime can inform workload deployments across edge, fog, and cloud infrastructure. To help developers understand the implications of adopting ARM64 architectures, this thesis addresses the critical need to understand and accurately predict serverless function runtimes on ARM64 processors based on profiling on x86_64.
As cloud computing evolves, with ARM64 processors gaining prominence for their efficiency and performance, there is a pressing need to bridge the knowledge gap in runtime behavior between ARM64 and traditional x86_64 processors. Our research helps fill this void by offering insights and methodologies for developers and organizations navigating the transition to ARM-based serverless computing. In this thesis, we investigate the efficacy of cross-architecture performance models for serverless FaaS platforms. Specifically, we create and evaluate models that predict serverless function runtime on ARM64 processors by utilizing resource utilization profiling data from function execution on x86_64 processors. We train regression-based function-specific models, as well as generalized performance models, using Linux CPU profiling data. We evaluate the accuracy of serverless function runtime predictions for both seen functions and unseen functions, those not included in the training data. We leveraged 18 distinct serverless function workloads (11 seen and 7 unseen), encompassing over 144,000 serverless function calls in total. We evaluate three generalized performance models for unseen predictions: All-in-one, where all training data is combined into one model; Resource-bound, where separate models are trained for CPU-bound vs. I/O-bound functions; and ARM-speed, where three separate models are trained based on ARM64 speed relative to x86_64. Using a separate classification model, we automate selection of the appropriate ARM-speed model for making predictions. For seen workloads on ARM64 processors, we predict function runtime with a mean absolute percentage error (MAPE) of only ∼1.17%. Using our ARM-speed generalized performance models, we predict function runtime with a MAPE of only ∼10.29% for unseen workloads and ∼3.04% for seen workloads.
Our performance modeling techniques can support a broadly applicable tool that predicts serverless function runtime on ARM64 processors by profiling unseen functions on x86_64 to provide inference data for model inputs. This research has the potential to contribute to the improvement of serverless function performance and optimization in cloud computing environments.

Metapath of Thoughts: Verbalized Metapaths in Heterogeneous Graph as Contextual Augmentation to LLM (2025-01-23)
Singh, Jyoti Arvind; Teredesai, Ankur

Heterogeneous graph neural networks (HGNNs) excel at capturing graph topology and structural information. However, they are ineffective at processing the textual components present in nodes and edges, producing suboptimal performance in downstream tasks such as node classification. Additionally, HGNNs lack explanatory power and are considered black boxes. Large language models (LLMs), by contrast, are good at processing textual information, but utilizing them for tasks like node prediction is non-trivial: it is difficult to identify the ideal graphical context and present it in a form suitable for LLMs to consume effectively. We introduce a framework that combines the strengths of both models by leveraging the context obtained through metapaths, which are generated during the training of HGNNs. This approach enables the understanding of complex and indirect relationships between different types of nodes. Our novel framework enhances the prediction accuracy of HGNNs and the transparency of their decision-making process through natural-language explanations provided by LLMs.
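MAPE, the error metric quoted throughout the ARM64 abstract above, averages the per-sample absolute errors expressed as percentages of the true values. A self-contained sketch with invented runtimes, not the thesis's measurements:

```python
def mape(actual, predicted):
    """Mean absolute percentage error: average of |actual - predicted|
    as a percentage of the actual value, over all samples."""
    assert len(actual) == len(predicted) and len(actual) > 0
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical function runtimes in ms (not from the thesis's data):
actual_ms    = [100.0, 200.0, 400.0]
predicted_ms = [ 99.0, 204.0, 392.0]
err = mape(actual_ms, predicted_ms)  # mean of 1%, 2%, 2% -> ~1.67%
```

Because each error is normalized by the true runtime, MAPE lets fast and slow functions contribute comparably, which matters when workloads span orders of magnitude in runtime.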
We demonstrate that our proposed framework outperforms FastGTN (the state of the art on heterogeneous node classification tasks), an HGNN tailored for heterogeneous graph data, on two network datasets (the DBLP citation graph and the Goodreads graph dataset), improving F1 score from the baseline's 0.81 and 0.66 to 0.9 and 0.91, respectively. Furthermore, the framework's efficacy in generating explanations was assessed through human evaluation on metrics such as helpfulness and factual correctness.

A Comprehensive Categorization Framework for Interactive Fiction Games (2024-10-16)
Liu, Hongyang; Sung, Kelvin

Interactive Fiction (IF) games are digital experiences that merge storytelling with interactive gameplay, allowing players to navigate and influence story-driven adventures. These games have evolved significantly, integrating advanced visual and interactive elements alongside traditional textual narratives, making them an intriguing area of study. However, few structured frameworks currently exist for the systematic classification of IF games, and analyzing these games holistically can be challenging. This thesis presents a comprehensive categorization framework for IF games, designed to facilitate systematic classification and analysis. Based on features derived from common video game features, including human-computer interface, game genres, game mechanics, and business model, the framework supports the classification of IF games into distinct categories. This structured approach allows feature-based examination and facilitates holistic analysis of IF games and their evolution. Validation of the proposed framework involved three rounds of sampling and categorizing IF games. The first round sampled popular IF games built on well-established game engines to demonstrate the fundamental validity of the framework.
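The core move in the metapath abstract above, turning typed graph paths into text an LLM can consume, can be illustrated with a tiny template. The node names, relation labels, and template below are invented; the thesis's actual verbalization scheme may differ:

```python
def verbalize_metapath(path):
    """Turn a typed path -- a list of (source, relation, target) triples --
    into a single natural-language line suitable as LLM context.
    Hypothetical template; the thesis's verbalization may differ."""
    return "; ".join(f"{src} {relation} {dst}"
                     for src, relation, dst in path) + "."

# An invented author-paper metapath from a citation-style graph:
p = [("Alice", "authored", "Paper-42"), ("Paper-42", "cites", "Paper-7")]
context = verbalize_metapath(p)
# "Alice authored Paper-42; Paper-42 cites Paper-7."
```

Verbalized lines like this, one per relevant metapath, would then be prepended to the node-classification prompt as graphical context.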
The second round sampled popular IF games over time for insight into potential trends as IF games continue to develop and evolve. The third round sampled popular IF games developed by the same studio, to examine trends after removing developer bias. The three rounds of sampling and categorizing reveal patterns and trends that enhance the understanding of IF games. Key insights include the shift from text-only to image-based or even animation-based output, the move from little or no support toward more sophisticated support for stats and resource management, and the potential overlapping and merging of IF and action-adventure games. These insights can serve as references for future IF game development. The findings demonstrate that the proposed framework is an effective tool for systematic analysis that can offer valuable insights into the development and trends of IF games. Since classification involves subjectivity, future work should repeat the process with stakeholders from distinct backgrounds, e.g., publishers, developers, and gamers. Additionally, the proposed framework is only a first step and should be continuously reviewed and refined.

Development of Personality Adaptive Conversational AI for Mental Health Therapy Using LLMs (2024-10-16)
Jaiswal, Sugam; Si, Dong

Many individuals with mental health issues cannot access professional help for reasons such as lack of awareness, limited availability, and high costs. Conversational agents present a viable alternative for delivering mental health support that is accessible, affordable, and scalable. However, the effectiveness of these agents can vary among users, as different users have different personality traits, such as extroversion and agreeableness, which influence how they interact with chatbots. Therefore, it is important to develop therapy chatbots that adapt to individual personalities.
In this study, we highlight the significant role of Personality Adaptive Conversational Agents (PACAs) in mental healthcare. We designed an architecture around traditional ML models and open-source LLMs to build a PACA for mental health, based on the existing iCare project at the DAIS research group at UW Bothell. We used the architecture to build a functional prototype and conducted a user study, which concluded that personality adaptiveness is a critical feature for mental health chatbots. The prototype is currently live and freely available at http://test.icare.uw.edu:3010/.

Innovative Rehabilitation Approach for Upper Limb Neurologic Conditions Using Mixed-Reality Simulation and EEG/EMG Biofeedback (2024-10-16)
Raj, Arsheya; Parsons, Erika

Recent advances in Augmented Reality (AR), Mixed Reality (MR), Electroencephalography (EEG), and Electromyography (EMG) offer significant opportunities in medicine and neuroscience. This research aims to use these technologies to aid stroke patients with upper-limb weakness. This work extends the Edge Computing Ecosystem for Neuroscience Patients' Rehabilitation, part of the Stroke Rehabilitation Project by University of Washington Bothell Engineering, University of Washington Seattle Neuroscience, and Rehabilitation Medicine at Harborview Medical Center (UWHM). Recently, UW Bothell's CSSE has also contributed, focusing on solutions for stroke patient rehabilitation. Traditional motor rehabilitation is costly, resource-intensive, and often monotonous, reducing patient engagement. Using augmented and mixed reality, interactive environments can be created on mobile devices, providing engaging and motivating experiences for patients. AR simulates real-world scenarios, offering a safe and enjoyable way to practice tasks and aid rehabilitation.
We used EEG and EMG sensors to conduct experiments and collect data in a controlled environment, targeting a reduced set of representative motor tasks. The data were processed using various signal processing and statistical techniques, which, in combination with the MR/AR simulation, can be used to build a novel feedback and guidance system. This system is a building block of our "NeuroRehab" ecosystem, which will use various ML models and algorithms for sequential prediction, with the aim of guiding patients through an optimal rehabilitation path within the 3D simulation environment. Results showed that the combination of frequency filtering, ICA, and ERP with a CNN model yields, so far, the best accuracy for classifying motor tasks in EEG and EMG data. These findings contribute to the field of stroke rehabilitation for upper-limb weakness. The work also contributes to a larger project aimed at better understanding and rehabilitating other neurological ailments, by offering insights into different hand gestures using EMG and EEG data and by creating a framework for data processing and feedback systems.

Enhancing the Performance of GNN and Utilizing 3D Instance Segmentation for Ligand Binding Site Prediction (2024-10-16)
Gavali, Esha Rajesh; Si, Dong

This study addresses the challenge of accurately predicting ligand binding sites (LBS) on proteins, a critical aspect of structure-based drug design. Ligand binding site prediction is crucial for designing effective drugs and understanding protein functions, benefiting pharmaceutical companies, biotechnologists, and researchers by accelerating drug discovery and improving therapeutic interventions. We employ and improve Graph Neural Networks (GNNs) and innovative 3D point-cloud instance segmentation to refine and advance LBS prediction methods. This research demonstrates significant enhancements in predictive accuracy by evaluating these methods on widely used datasets.
Our novel clustering algorithm, which combines density-based and fuzzy clustering, notably improves the definition and identification of ligand binding sites without prior knowledge of the number of clusters. This methodology allows more precise predictions, effectively managing the overlapping nature of binding sites. Instance segmentation further delineates individual binding pockets, offering a more granular understanding of ligand-protein interactions. The results show that our approaches match the current state of the art for ligand binding site prediction and support their potential utility in real-world pharmaceutical applications. Future work will focus on refining these methods and extending their application to molecular docking studies.

GraphConv: Geometric Deep Learning for Multiple Conformation Generation from Electron Density Images (2024-10-16)
Jayakumar, Saurav; Si, Dong

In cryo-electron microscopy (cryo-EM) structural analysis, precise prediction of molecular conformations within datasets is essential. Despite strides in deep learning methodologies, existing solutions often yield volumes of suboptimal quality. Addressing this limitation, our research introduces GraphConv, an encoder model designed to embed particle images into a latent space, substituting the conventional encoder used by CryoDRGN. This approach employs a Graph Neural Network (GNN) architecture featuring multiple GraphConv and convolutional layers, aimed at capturing richer information from particle images and precisely reconstructing the corresponding 3D volumes. Testing across two real datasets and three simulated datasets underscores the efficacy of our model, showing marked enhancements in reconstruction quality. Specifically, our findings reveal improvements in resolution of up to 20% compared to CryoDRGN.
By harnessing the power of GNNs, our methodology shows promise for significant advancements in the fidelity and accuracy of output volumes, contributing to the ongoing refinement of cryo-EM structural analysis methodologies.

A Study on The Effectiveness of Education and Fear Appeal to Prevent Spear Phishing of Online Users (2024-09-09)
Alsulami, Saja Faham; Dupuis, Marc

Spear phishing attacks are considered one of the most elaborate forms of social engineering. An attacker designs a scam to obtain the personal information of specific users from their social media accounts, conducting a preliminary analysis of targeted users and their online behaviors in order to persuade them that a malicious link or attachment was sent by a trusted person. The attack exploits the fact that human beings are the weakest link in a security system: their vulnerabilities can be exploited. The most detrimental consequences of spear phishing attacks are financial losses, network compromises, loss of login credentials, and malware installation. This quantitative study used Protection Motivation Theory (PMT) to examine the impact of education and fear appeals on users' knowledge of, and ability to identify, spear phishing attacks. Three interventions were implemented: an education intervention, a fear appeal intervention, and a combined education and fear appeal intervention; a control group was used for comparison. The study was conducted as an online experiment managed via the Qualtrics platform. It had 726 participants, who were randomly assigned to the four groups; after the interventions, a spear phishing test evaluated participants' knowledge of and ability to identify spear phishing attacks. The spear phishing test was administered to compare the efficacy of each intervention group (education, fear appeal, and combined education and fear appeal) against the control group.
The experiment findings revealed no statistically significant differences in mean test scores across the four groups. The PMT findings revealed that high perceived threat vulnerability, high self-efficacy, and low response cost can enhance participants' knowledge of spear phishing attacks. The results indicate that further research is needed to develop an effective intervention program that would considerably enhance users' knowledge of spear phishing attacks and their resilience to them.

Evaluating the Effectiveness of the Convolutional LSTM Neural Network for Simulations in Computational Fluid Dynamics (2024-09-09)
Castillo Tosi, Agustín; Parsons, Erika

Computational Fluid Dynamics (CFD) is an important part of engineering design, with applications in diverse areas. Although its practical application is widespread, its computational cost hinders its utilization. This research evaluates the effectiveness of the Convolutional LSTM (ConvLSTM) neural network for CFD simulations in creating Reduced Order Models (ROMs) and simulating turbulent fluid flows interacting with an obstacle. We propose a novel end-to-end artificial neural network (ANN) architecture based entirely on ConvLSTM that can successfully predict the spatiotemporal evolution of a fluid flow. This data-driven approach achieves results similar to a classical CFD method with direct numerical simulation, with a mean squared error of 1.107 × 10⁻⁵, in a quarter of its execution time. The model could be used to accelerate CFD simulations, leading to a faster engineering development process. By providing rapid preliminary results for prototype testing, engineers can explore more design ideas without waiting days or weeks for simulation results.

Real-Time Rendering of Atmospheric Clouds (2024-09-09)
Ford, Parker; Sung, Kelvin

Rendering realistic clouds is an important aspect of creating believable virtual worlds.
The detailed shapes and complex light interactions present in clouds make this a daunting task for a real-time application. Our solution, based on Schneider's cloud modeling and Fong's volumetric rendering frameworks for low-altitude cloudscapes, supports both realism and real-time performance. For efficient approximation of radiance measurements, we adopt Hillaire's energy-conserving integration method for light scattering. To simulate the effect of multiple light scattering, we follow Wrenninge's approach for computing the multi-bounce diffusion of light within a volume. To capture the details of light interreflecting off microscopic water droplets, the complex behavior of Mie scattering is approximated with Jendersie and d'Eon's phase function modeling technique. To capture these details at nominal computational cost, we introduce a temporal anti-aliasing strategy that unifies the sampling strategy for the area over a pixel and the interval of volumetric participating media. The resulting system is capable of rendering scenes consisting of expansive cloudscapes well within real-time requirements, achieving frame times between 2 and 3 milliseconds on a typical machine. Users can adjust parameters to control various types of low-altitude cloud formations and weather conditions, with presets available for easily transitioning between settings. Our unique combination of techniques in the volumetric rendering process enhances both efficiency and visual fidelity, with the novel volumetric temporal anti-aliasing approach efficiently and effectively unifying the sampling of pixel areas and volumetric intervals. Looking forward, this technique could be adapted for real-time applications such as video games or flight simulators. Further improvements could refine the cloud modeling system, incorporating procedural generation for high-altitude clouds, thus broadening the range of cloudscapes that can be represented.
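As a rough illustration of the energy-conserving scattering integration this abstract refers to: over a ray segment with constant extinction and constant in-scattered radiance, the scattered contribution can be integrated analytically rather than accumulated step by step. The sketch below is a minimal Python version of that idea under those constant-property assumptions; the function names and fallback threshold are ours, not the thesis implementation.

```python
import math

def integrate_scattering(S, sigma_t, d):
    """Analytically integrate in-scattered radiance S over a ray segment
    of length d with constant extinction sigma_t (energy-conserving, in
    the spirit of Hillaire's method; names are illustrative).
    Returns (integrated scattering, segment transmittance)."""
    transmittance = math.exp(-sigma_t * d)
    if sigma_t < 1e-8:
        # Optically thin limit: the integral degenerates to S * d.
        return S * d, transmittance
    # Closed form of  ∫₀ᵈ S · exp(-sigma_t · x) dx
    return S * (1.0 - transmittance) / sigma_t, transmittance

def integrate_scattering_numeric(S, sigma_t, d, steps=10000):
    """Brute-force Riemann sum of the same integral, for comparison."""
    dx = d / steps
    total, T = 0.0, 1.0
    for _ in range(steps):
        total += S * T * dx            # accumulate attenuated scattering
        T *= math.exp(-sigma_t * dx)   # attenuate along the segment
    return total
```

Because the analytic form never exceeds S / sigma_t (its limit as d grows), the result stays energy-conserving regardless of step size, which is what makes it attractive for real-time volumetric rendering.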
Additionally, our volumetric rendering framework could be paired with recent investigations into voxel-based cloud rendering.

Item type: Item , Advancing Deep Packet Inspection in SDNs: A Comparative Analysis of P4 and OpenFlow Programmability(2024-09-09) Bustamante Suarez, Anthony Jesus; Lagesse, Brent
This thesis undertakes a critical examination of Deep Packet Inspection (DPI) capabilities within Software-Defined Networking (SDN) frameworks, emphasizing the comparative efficacy of the P4 programming language against the conventional OpenFlow protocol. OpenFlow, while foundational in SDN's evolution, exhibits notable constraints in the DPI domain, primarily due to its limited packet inspection depth, confined largely to the Transport, Network, Data Link, and Physical layers. In contrast, this research advocates for the adoption of P4 for its unique flexibility and programmability, potentially extending DPI functionality to the application layer (Layer 7) and thereby addressing, and potentially surpassing, OpenFlow's limitations. Employing a methodical approach, this study harnesses Open vSwitch and BMv2 (Behavioral Model version 2) switches to emulate real-world network scenarios. These emulations facilitate a head-to-head comparison of OpenFlow and P4 in executing DPI tasks, particularly focusing on HTTP and SQL protocols, common vectors for network threats. Through a comprehensive suite of protocols including OpenFlow, gRPC (Google Remote Procedure Call), and P4Runtime, the research crafts a robust DPI framework, further complemented by a custom-developed controller designed for the BMv2 and P4 ecosystem. The research culminates in three different implementations of Deep Packet Inspection within the SDN domain, each benchmarked to measure its advantages and disadvantages.
With these implementations and benchmarks, we aim not only to validate P4's superiority over OpenFlow in managing DPI tasks but also to dynamically adapt packet-processing techniques to the ever-evolving landscape of network threats. By advancing SDN functionality beyond traditional layer boundaries, this thesis contributes significantly to the discourse on network security, management, and optimization, paving the way for future innovations in increasingly complex network environments.
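To make the Layer-7 distinction above concrete: OpenFlow matching stops at header fields, whereas a P4/BMv2 pipeline (or its controller) can act on application-layer payload bytes. The Python sketch below mimics the kind of payload-signature check such a DPI pipeline performs; the patterns and function name are illustrative stand-ins, not the rule set used in the thesis.

```python
import re

# Illustrative SQL-injection payload signatures (hypothetical examples;
# a real DPI rule set is far larger and protocol-aware).
SQLI_PATTERNS = [
    re.compile(rb"(?i)union\s+select"),
    re.compile(rb"(?i)or\s+1\s*=\s*1"),
    re.compile(rb"(?i);\s*drop\s+table"),
]

def inspect_http_payload(payload: bytes) -> bool:
    """Return True if the application-layer payload matches a known
    signature, the kind of Layer-7 inspection that P4 programmability
    can reach but classic OpenFlow match fields cannot."""
    return any(p.search(payload) for p in SQLI_PATTERNS)
```

For example, `inspect_http_payload(b"GET /?id=1 OR 1=1 HTTP/1.1")` flags the request, while an ordinary `GET /index.html` passes; an OpenFlow rule, limited to L2-L4 match fields, sees both packets as identical TCP flows.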
