Computer science and engineering

Permanent URI for this collection: https://digital.lib.washington.edu/handle/1773/4909

Recent Submissions

Now showing 1 - 20 of 562
  • IVF Singular Search: Agent-Based Implementation of Vector Search on GPU
    (2026-02-05) Rakhmatullaev, Akbarbek Azamatovich; Fukuda, Munehiro
    Vector search plays a crucial role in large-scale similarity search applications, with IVF (Inverted File Index) being a widely used indexing method due to its balance between accuracy and efficiency. However, traditional vector search algorithms that use IVF as an indexing method, such as IVF Flat and IVFPQ, yield results by brute-force searching within each cluster/list. This paper introduces a new IVF-based vector search algorithm, called IVF Singular Search, which searches within each cluster/list by arranging the data differently and traversing it with binary search. To accelerate the development phase, the author used MASS CUDA to handle the searching component, which allowed the implementation to stay at a high level of abstraction. We evaluated IVF Singular Search, implemented for GPUs using MASS CUDA, against two other algorithms, IVF Flat and IVFPQ, demonstrating the significant speed advantage of the approach. The findings suggest IVF Singular Search can make vector search more efficient and robust on infrastructure that requires immediate responses, such as navigation systems or robots.
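The abstract does not give the exact data arrangement, but the core idea (replacing per-cluster brute force with a sorted ordering plus binary search) can be sketched in plain Python. Everything below (sorting each cluster by distance to its centroid, the `window` parameter) is an illustrative assumption, not the paper's actual design:

```python
import bisect


def dist(a, b):
    """Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def build_ivf(vectors, centroids):
    """Assign each vector to its nearest centroid; within each cluster,
    store (distance-to-centroid, vector) pairs sorted by that distance."""
    clusters = {i: [] for i in range(len(centroids))}
    for v in vectors:
        i = min(range(len(centroids)), key=lambda c: dist(v, centroids[c]))
        clusters[i].append((dist(v, centroids[i]), v))
    for entries in clusters.values():
        entries.sort(key=lambda e: e[0])
    return clusters


def search(query, centroids, clusters, window=8):
    """Probe the nearest cluster; instead of brute-forcing the whole list,
    binary-search on distance-to-centroid and scan only a small window."""
    i = min(range(len(centroids)), key=lambda c: dist(query, centroids[c]))
    entries = clusters[i]
    keys = [d for d, _ in entries]
    pos = bisect.bisect_left(keys, dist(query, centroids[i]))
    lo, hi = max(0, pos - window), min(len(entries), pos + window)
    return min((v for _, v in entries[lo:hi]), key=lambda v: dist(query, v))
```

The window scan trades a small accuracy risk for far fewer distance computations than IVF Flat's full-cluster scan, which is the trade-off the paper's GPU evaluation targets.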
  • Retrofitting automated verification to systems code by scaling symbolic evaluation
    (2026-02-05) Nelson, Luke Robert; Wang, Xi
    Formal verification is a technique for eliminating classes of bugs in systems software by formally proving that a system's implementation meets its intended specification. While effective at systematically preventing hard-to-catch bugs, formal verification demands significant effort from developers in the form of manual proofs. Automated verification techniques reduce this burden by leveraging automated reasoning to avoid the need for manual proofs. But as a result, they sacrifice generality and require developers to build bespoke verification tools and to carefully design systems with automated verification in mind. This dissertation explores how to make it easier to build and reuse automated verifiers, and how to retrofit systems to automated verification. To do so, we built Serval, a framework for writing automated verifiers for systems code. To use Serval, developers write an interpreter for a language; Serval then leverages the Rosette programming language to lift the interpreter into a verifier via symbolic evaluation. Serval also comes with a set of techniques and optimizations to help overcome verification bottlenecks. We use Serval to develop automated verifiers for RISC-V, x86, Arm, LLVM IR, and BPF. We apply these verifiers to retrofit automated verification to two existing security monitors previously formally verified using other techniques: CertiKOS, an OS kernel with strict process isolation, and Komodo, a monitor that implements secure enclaves. We port these two systems to RISC-V, modifying their interfaces for automated verification and to improve security. We write specifications amenable to automation, and compare our efforts with those of the original systems. To demonstrate applicability to systems beyond security monitors, we use Serval to build Jitterbug, a framework for writing and verifying just-in-time (JIT) compilers for the Berkeley Packet Filter (BPF) language in the Linux kernel. We develop a specification of compiler correctness suitable for these JITs. Using this approach, we found and fixed more than 30 new bugs in the JITs in the Linux kernel and developed a new, verified BPF JIT for 32-bit RISC-V.
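Serval's recipe, writing an interpreter and lifting it into a verifier, can be illustrated with a toy in the spirit of Jitterbug's compiler-correctness checking. The tiny accumulator language, its three operations, and the exhaustive enumeration (a stand-in for Rosette's one-shot symbolic evaluation) are all illustrative assumptions, not Serval's actual machinery:

```python
def interp(prog, x):
    """Interpret a tiny 8-bit accumulator language:
    ('add', k), ('mul', k), and ('and', k) update the accumulator."""
    acc = x
    for op, k in prog:
        if op == 'add':
            acc = (acc + k) & 0xFF
        elif op == 'mul':
            acc = (acc * k) & 0xFF
        elif op == 'and':
            acc = acc & k
    return acc


def verify_equiv(spec, prog, bits=8):
    """Check prog against a Python specification over all inputs.
    Symbolic evaluation would cover the input space in one shot;
    enumeration is a stand-in that works for tiny state spaces.
    Returns a counterexample input, or None if equivalent."""
    for x in range(1 << bits):
        if interp(prog, x) != spec(x):
            return x
    return None
```

A returned counterexample plays the role of the concrete bug reports that led to the 30+ JIT fixes described above.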
  • Steps Towards the Pluralistic Alignment of Language Models
    (2026-02-05) Sorensen, Taylor John; Choi, Yejin
    AI alignment is concerned with ensuring that AI systems understand and adhere to human values and preferences. However, most prior alignment work makes the simplifying assumption that preferences are monolithic. In reality, human values and preferences can vary between and within individuals, groups, and societies. In this dissertation, I formalize and advance the study of pluralistic alignment, or aligning AI systems with diverse human values, perspectives, and preferences. Specifically, I use large language models (LLMs) as a test-bed for pluralistic alignment. I first motivate the need for pluralism in alignment, outlining failure modes and risks of either assuming that value variation doesn't exist or ignoring such variation. I propose a concrete framework for pluralistic alignment, including three definitions of how models and benchmarks can each be pluralistic. Based on this framework, I propose a roadmap with recommendations and directions for further empirical and methodological work in the area. This framework has been widely adopted by the community, and serves as an agenda for the remainder of the dissertation. Next, I focus on improving LLMs' ability to properly model and steer to varied human values. I introduce a large-scale dataset for value pluralism (Value Prism), and conduct a human study to understand whose values are represented. With this dataset, I train Value Kaleidoscope, a model for assessing the relevance of values to a particular situation and giving contextual judgments based on a value description. I find that the model is sensitive to situational changes and that it helps to explain human variation. I then propose an autoencoder-based approach for inferring the values that could have led to a particular individual's judgments (called value profiles). I find that our value profile approach is able to preserve >70% of the predictive information found in the rater demonstrations on which they are based, and offers benefits in terms of interpretability and steerability. Based on value profiles, I propose a novel rater clustering method for assigning individuals to a fixed number of clusters. I find that these clusters are far more predictive than demographic groupings of the same size, and that the clusters enable dataset-specific analysis of the dimensionality of rater variation. Generalizing beyond textual value descriptions, I focus on language model post-training for general tasks and abilities. I find that current instruction-tuning techniques reduce pluralism in many ways, harming LLMs' ability to steer to subjective judgments and diverse generation distributions, leading to mode collapse on queries with many valid answers, and reducing distributional alignment. Pretrained models are better at steering and matching distributions, but are less usable as a result of being poor at following instructions. To improve instruction-following while also improving pluralism, I compile a large-scale resource from >40 datasets in a unified format that require inferring and steering to diverse generation functions in-context (Spectrum Suite). With this data, I introduce Spectrum Tuning, a simple and scalable post-training method which improves instruction-following concurrently with several modes of pluralism, leading to more steerable models which also avoid mode collapse. Based on Spectrum Tuning, I further design a system for steering to individuals, which achieves state-of-the-art performance at individual subjective judgment modeling. To conclude, I survey related work in the community building on our pluralistic alignment framework and methodologies and outline directions for future work.
  • Towards Interpretable and Robust ML Systems
    (2026-02-05) Verma, Sahil; Bilmes, Jeffery JB; Shah, Chirag CS
    Recent advancements in ML have taken strides in enabling models to accomplish unprecedented tasks, ranging from simple binary classification for loan applications to intrinsically complex self-driving. As the models have become better, faster, and more powerful, they have also become larger and more opaque. This has happened because of the widespread use of neural networks, which enable capturing and expressing incredibly complex representations but are uninterpretable to humans. This phenomenon raises the question of trust -- as humans who want to be in the position of control, how do we trust the model to make correct decisions? In this thesis, I aim to answer this question by making models more interpretable, examining their robustness, and ensuring they are safe for us as a society to rely on.
  • Improving Online Community Governance at Web Scale
    (2026-02-05) Weld, Galen Cassebeer; Althoff, Tim; Zhang, Amy X
    Nearly two out of every three people on the planet are members of an online community, and this number is forecast to keep growing. These communities have an incredible diversity of topic, size, and structure, and they offer unique ways to connect their users and bring people together. Unfortunately, online communities have also been associated with significant offline harms, including the mental health crisis, abuse and harassment, interference with free and democratic elections, and radicalization and political polarization. Almost all online communities rely on some form of governance to set and enforce rules, model good behavior, and generally lead the community. The forms that this governance takes vary widely from community to community. On some platforms, moderators' work is conducted in the background, while in many others, community leaders are volunteers who take a more visible role. Many communities' governance also relies on a range of complex technical tools. Some communities operate on a pseudodemocratic basis, with nominations and regular elections, while others operate on a consensus model, and still others are effectively autocracies. It is very difficult to know how best to govern an online community, given different community needs, the enormous range of available governance strategies, and the challenge of empirically measuring governance and outcomes. In this dissertation, I conduct research that makes online communities better through data-driven analyses of community values, moderation practices, and experiments with new tools. My work focuses on three important research activities: (1) I characterize communities' values in community members' own words to build a foundational understanding of communities' needs and what 'better' actually means. (2) I assess existing moderation practices and community affordances such as voting at a massive scale across hundreds of thousands of communities in order to identify which practices are most promising. (3) I deploy interventions and best practices in partnership with community leaders to maximize real-world impact. Much of my research is conducted on Reddit, one of the largest platforms for online communities, and a platform where I am a longtime moderator of several subreddits and a member of the Reddit Moderator Council. My dissertation makes several key contributions. My theoretical contributions include the first-ever taxonomy of community values, based on the largest-to-date surveys of community members. My methodological contributions include a new method for scalably measuring community outcomes by quantifying how community members talk about their moderators, and a new method for classifying the rules enforced by communities. Finally, I make artifact contributions by publishing classifiers for discussions of moderators and rules, and datasets of anonymized survey results, community rules, and news-sharing behavior.
  • Hybrid Static-Dynamic Feature-Weighted Analysis for IoT Botnet Malware Detection
    (2026-02-05) Lemak, Colleen; Thamilarsu, Geetha
    As the Internet of Things (IoT) domain continues to evolve, IoT devices face escalating security challenges. Recent waves of IoT botnets have exploited device vulnerabilities to launch dangerous large-scale Distributed Denial of Service (DDoS) attacks from compromised, resource-constrained devices. These networks of infected devices pose a unique threat, putting modern infrastructure, homes, schools, medical facilities, and transportation systems at heightened risk of malicious exploitation. This paper proposes a novel hybrid framework that combines static and dynamic analysis techniques for IoT botnet malware detection without relying on complex Machine Learning (ML) models. By extracting and weighing the importance of key features from malware binaries based on their relevance to DDoS behavior, the framework maintains statistical adaptability to observed data while avoiding large memory usage and opaque black-box decision processes. Designed for interpretability and efficiency, this malware detection framework bridges code-level structure and runtime behavior, offering a transparent and practical botnet detection strategy for diverse resource-constrained IoT ecosystems.
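A minimal sketch of the feature-weighted idea: combine binary static features and normalized dynamic features under fixed weights and a threshold, with no ML model involved. The feature names, weights, and threshold below are entirely hypothetical; the paper derives its weights from observed relevance to DDoS behavior:

```python
# Hypothetical features and weights, for illustration only.
STATIC_WEIGHTS = {"has_telnet_scanner_strings": 0.30, "packed_binary": 0.15}
DYNAMIC_WEIGHTS = {"syn_rate_per_sec": 0.35, "distinct_dst_ips": 0.20}


def botnet_score(static_feats, dynamic_feats):
    """Combine static (boolean) and dynamic (normalized to [0, 1])
    features into a single interpretable score in [0, 1]."""
    score = 0.0
    for name, w in STATIC_WEIGHTS.items():
        score += w * float(bool(static_feats.get(name, False)))
    for name, w in DYNAMIC_WEIGHTS.items():
        score += w * min(1.0, max(0.0, dynamic_feats.get(name, 0.0)))
    return score


def classify(static_feats, dynamic_feats, threshold=0.5):
    """Flag a sample as botnet-like when its weighted score crosses the threshold."""
    return botnet_score(static_feats, dynamic_feats) >= threshold
```

Because every contribution to the score is a named feature times a fixed weight, a flagged sample can be explained term by term, which is the interpretability property the abstract emphasizes.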
  • Benchmarking TenSEAL’s Homomorphic Encryption Through Predicting Encrypted RNA Sequencing Data
    (2026-02-05) Choi, Logan; Kim, Wooyoung
    This study addresses the growing need to protect sensitive healthcare data as digital technologies and cloud-based analytics become integral to modern medical research and care delivery. Healthcare data, such as clinical or genomic information, holds immense potential to enhance disease understanding and improve diagnostics through machine learning models; however, adopting third-party cloud technologies increases the risks of data breaches and noncompliance with regulations such as the Health Insurance Portability and Accountability Act (HIPAA). To address these concerns, this research investigates homomorphic encryption, a cryptographic method that allows computations on encrypted data without exposing sensitive information. The study benchmarks the TenSEAL library to evaluate its performance in encrypting healthcare test datasets and executing predictions through a pre-trained machine learning model, while also evaluating memory utilization and encryption time. The findings show that TenSEAL’s CKKS encryption scheme effectively enables data encryption and secure machine learning inference on genomic datasets for breast, lung, and prostate cancers, achieving an average accuracy of 90% across all datasets. However, our results also highlight a key trade-off: as encryption strength and dataset size increase, computational overhead rises sharply. Thus, medical professionals and data scientists must carefully balance the need for security with practical deployment in real-world healthcare systems.
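To make "computations on encrypted data" concrete, here is a toy implementation of Paillier, a much simpler additively homomorphic scheme than the CKKS scheme TenSEAL implements; it only illustrates the principle that ciphertexts can be combined without decryption. The fixed Mersenne primes are far too small for real use and serve only to keep the demo fast:

```python
import math
import random


def paillier_keygen(p=2147483647, q=2305843009213693951):
    """Generate a Paillier keypair. The default primes (2^31-1 and 2^61-1)
    are insecure toy parameters; real deployments use >= 2048-bit keys."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # Carmichael lambda(n)
    g = n + 1
    # mu = L(g^lam mod n^2)^-1 mod n, where L(x) = (x - 1) // n
    mu = pow(((pow(g, lam, n * n) - 1) // n) % n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pk, m):
    """Encrypt integer m < n; randomness r makes encryption probabilistic."""
    n, g = pk
    r = random.randrange(1, n)  # should be coprime to n; overwhelmingly likely here
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pk, sk, c):
    n, _ = pk
    lam, mu = sk
    return (((pow(c, lam, n * n) - 1) // n) * mu) % n
```

Multiplying two ciphertexts modulo n² yields an encryption of the sum of the plaintexts, so a cloud server can aggregate encrypted values it cannot read; CKKS extends this idea to approximate arithmetic over real-valued vectors, which is what enables encrypted model inference.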
  • Understanding Aging at Multi-scale Using Explainable AI
    (2026-02-05) Qiu, Wei; Lee, Su-In
    As human lifespans increase, understanding the biological and clinical mechanisms that shape aging has become increasingly important. This dissertation presents a set of explainable AI (XAI) frameworks that illuminate aging at multiple scales, ranging from population-level health data to bulk transcriptomics and single-cell gene expression. I begin with IMPACT, an XAI framework for all-cause mortality prediction in the NHANES dataset. IMPACT improves prediction accuracy over traditional models and uses XAI methods to reveal previously underappreciated risk factors and clinically meaningful feature interactions. Building on this foundation, ENABL Age extends the IMPACT framework to model biological age. ENABL Age combines machine learning with XAI to estimate biological age and to quantify how specific lifestyle, clinical, and biochemical factors contribute to accelerated or slowed aging. This framework provides individualized insights into modifiable components of aging and supports the development of interpretable precision aging tools. At the molecular scale, DeepProfile learns biologically meaningful latent representations from 50,211 cancer transcriptomes across 18 tumor types. It identifies universal immune activation signals, cancer-type specific subtype structure, and mechanistic links among mutation burden, cell-cycle activity, antigen presentation, and patient survival. By studying cancer across many organs, DeepProfile also offers insight into organ health and organ aging, illustrating how unsupervised learning can uncover clinically relevant biology from large transcriptomic datasets. Finally, ACE is an explainable deep generative model for single-cell RNA sequencing data that isolates aging-related gene expression changes from dominant background variation, enabling the study of cellular aging.
Applied to mouse, fly, and human datasets, ACE recovers tissue and cell-type specific aging signatures, identifies conserved aging pathways across species, predicts biological age at cellular resolution, and prioritizes novel regulators such as Uba52, whose relevance is validated through lifespan-shortening RNAi experiments in C. elegans. Together, these contributions form an integrated XAI-driven framework for understanding aging across multiple scales and advance both mechanistic aging biology and transparent approaches for improving human healthspan.
  • Building Flexible Data Center Network Stacks for the Terabit Era
    (2026-02-05) Shashidhara, Rajath; Peter, Simon
    Modern data center workloads demand end-host network stacks that sustain terabit-scale bandwidth alongside microsecond-scale latency, overwhelming traditional software TCP stacks with high CPU overheads. ASIC-based transport offloads deliver high performance and energy efficiency but sacrifice flexibility, hindering customization to diverse application and deployment needs. This thesis explores flexible stateful TCP offload using emerging programmable in-network accelerators. It tackles the core challenge of mapping TCP’s complex, stateful processing onto the restrictive programming models of resource-constrained hardware, enabling fine-grained data-path parallelization. We present FlexTOE and Laminar, two novel TCP stack offloads built on Network Processing Unit (NPU) and Reconfigurable Match-Action Table (RMT) architectures. Both eliminate all host TCP data-path CPU overheads, integrate transparently with existing applications, remain robust under realistic network dynamics, and crucially, retain software programmability. The design principles developed generalize beyond TCP and extend naturally to other accelerator architectures. Through extensive evaluation, we demonstrate that these practical designs achieve a meaningful balance of high performance, energy efficiency, and flexibility, surpassing state-of-the-art software stacks and offering a viable, adaptable alternative to rigid hardware transports.
  • Navigating the Ocean of Language Model Training Data
    (2026-02-05) Liu, Jiacheng; Choi, Yejin; Hajishirzi, Hannaneh
    One crucial step toward understanding large language models (LLMs) is to understand their training data. Modern LLMs are trained on text corpora with trillions of tokens, which makes those corpora hard to analyze. In this thesis, I discuss my research on making these massive text corpora efficiently searchable and revealing insights into the connection between LLMs and their training data. First, I developed infini-gram, a search engine system that enables fast string counting and document retrieval. With infini-gram, I indexed four open text corpora commonly used for LLM pretraining, totaling 5 trillion tokens. A by-product was the biggest n-gram language model ever built as of the date of publication, which I combined with neural LLMs to greatly improve their perplexity. Next, on top of infini-gram, I led the development of a system for tracing LLM generations into their multi-trillion-token training data in real time, named OLMoTrace. OLMoTrace shows long verbatim matches between LLM outputs and the full training data, enabling us to do fact-checking, trace "creative expressions", understand LLMs' math capabilities, and much more. Finally, to enable searching in even bigger, Internet-scale corpora with a limited budget, more storage-efficient indexing techniques are needed. To that end, we developed infini-gram mini, a search system with 12x less storage requirement than the original infini-gram, conceptually allowing us to index the entirety of Common Crawl (the main source of training data for LLMs). We indexed 83TB of text, including the Common Crawl snapshots between January and July 2025, making it the largest body of searchable text in the open-source community. With infini-gram mini, we revealed that many crucial LLM evaluation benchmarks are heavily contaminated, and we are hosting a public bulletin to continuously monitor this dire evaluation crisis. Together, my research enables everyone to inspect and understand LLM training data at scale, and paves the way toward comprehending and debugging LLM behaviors from a data perspective.
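The counting primitive behind a system like infini-gram can be illustrated with a naive suffix array: all occurrences of an n-gram form one contiguous range of sorted suffixes, found with two binary searches. The O(n² log n) construction below is a toy stand-in for the trillion-token-scale indexing machinery the thesis actually builds:

```python
def build_suffix_array(text):
    """Naive suffix array: start offsets of every suffix, in sorted order.
    (A real index builds this far more efficiently and stores it on disk.)"""
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text, sa, query):
    """Count occurrences of `query` with two binary searches over suffix
    prefixes, O(|query| * log |text|) per lookup."""
    m = len(query)

    def lower():
        lo, hi = 0, len(sa)
        while lo < hi:
            mid = (lo + hi) // 2
            if text[sa[mid]:sa[mid] + m] < query:
                lo = mid + 1
            else:
                hi = mid
        return lo

    def upper():
        lo, hi = 0, len(sa)
        while lo < hi:
            mid = (lo + hi) // 2
            if text[sa[mid]:sa[mid] + m] <= query:
                lo = mid + 1
            else:
                hi = mid
        return lo

    return upper() - lower()
```

Because lookups touch only O(log n) positions of the index, query time stays interactive even when the underlying corpus is enormous, which is what makes real-time tracing like OLMoTrace feasible.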
  • Delivering Predictable Tail Latency in Data Center Networks
    (2026-02-05) Zhao, Kevin; Anderson, Thomas E
    Modern web services decompose a user request into thousands of RPCs whose slowest 1% dominate end-to-end latency, costing revenue and straining user patience. Operators codify expectations as tail latency SLOs, but meeting them is difficult even in well-run data center networks. Although such networks expose configuration parameters that have a large impact on tail latency, like switch weights, congestion windows, and switch marking thresholds, operators typically set these parameters once and rarely revisit them. When workload characteristics shift, for example in burstiness, traffic mix, or demand patterns, the resulting mismatch between the workload and the network can degrade user-observed performance and cause SLO violations, even in networks that deploy congestion control, traffic engineering, and class-based scheduling. A natural response is to adapt network parameters when workloads change, but existing methods adjust parameters by trial and error, risking intermediate violations and slow convergence in high-dimensional, noisy settings. This dissertation argues that prediction-guided control is an effective technique for delivering predictable tail latency in data center networks. It makes two contributions. First, Parsimon is a scalable tail-latency estimator. Through a series of approximations, Parsimon decouples links and simulates them in parallel, allowing it to run orders of magnitude faster than full-fidelity simulators while retaining distribution-level accuracy. Second, Polyphony embeds such estimators in a closed-loop control system to improve network performance. It treats predictions as priors, fuses them with live measurements, and searches safely inside a trust region that resets as conditions drift. In a small testbed on real machines, Polyphony meets tail latency SLOs within minutes, whereas a state-of-the-art model-free tuner fails to converge after an hour.
Together, fast prediction and prediction-guided control form a promising toolkit for steering large networks toward better performance for latency-sensitive applications, reducing the cost of provisioning and the risk of unsafe exploration.
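A toy of the prediction-guided, trust-region idea behind a system like Polyphony: consult a fast predictor before every live probe, trust the real measurement over the prediction, and shrink the trust region when candidates stop helping. The single parameter, one-directional search, and shrink factor are illustrative assumptions, not the dissertation's actual controller:

```python
def tune(param, predict, measure, slo, step=1.0, shrink=0.5, iters=20):
    """Prediction-guided parameter search: probe a candidate on the live
    system only if the predictor says it stays within the SLO."""
    best, best_lat = param, measure(param)
    trust = step
    for _ in range(iters):
        cand = best + trust
        if predict(cand) > slo:   # predicted violation: shrink, never probe
            trust *= shrink
            continue
        lat = measure(cand)       # safe to probe for real
        if lat < best_lat:
            best, best_lat = cand, lat
        else:
            trust *= shrink       # no improvement: narrow the trust region
    return best, best_lat
```

The key property the sketch shows is that candidates the predictor flags as SLO-violating are never tried on the live system, which is how prediction reduces the risk of unsafe exploration.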
  • Facilitating FPGA Prototyping with Hardware OS Primitives
    (2026-02-05) Lim, Katherine; Anderson, Thomas; Kasikci, Baris
    Both data center operators and the research community have embraced hardware accelerators because of their potential for significant improvements in performance and energy efficiency. There have now been several large-scale deployments of accelerators in datacenters from companies such as Google, Facebook, and Microsoft. FPGAs have become a compelling acceleration platform because their reconfigurability allows them to be repurposed as the application mix changes. Both Microsoft and Amazon have deployed FPGAs throughout their datacenters, both to rent to consumers and to accelerate their own services. Microsoft in particular attaches the FPGAs it uses to accelerate its own workloads directly to the network. Directly attaching the FPGA to the network further reduces latency, improves cost-performance, and reduces energy use relative to mediating network communications with CPUs. However, building accelerated applications or services for direct-attached FPGAs is challenging, especially with the complex I/O and multi-accelerator capacity of modern FPGAs. This thesis argues that direct-attached accelerator systems can be built in a modular manner that preserves the benefits of a direct-attached accelerator while also reducing the engineering burden. We first describe a design and prototype for Apiary, a microkernel operating system for direct-attached FPGA accelerators based on message passing over a network-on-chip (NoC) architecture. The key idea in Apiary is to raise the level of abstraction for accelerated application code, with isolation, threaded execution, and interprocess communication provided by a portable hardware OS layer in order to ease development difficulties. We propose specific hardware OS primitives to provide these services and abstractions. We then conduct an end-to-end case study of Apiary by prototyping a selection of these primitives to evaluate how well they serve Apiary’s design goals. We then describe Beehive, a hardware network stack we designed and prototyped for Apiary based around message passing over a NoC. We show that our architecture is better able to support the complexity of a software datacenter network stack by providing replication of elements and applications and standard TCP and UDP interoperation. At the same time, direct-attached accelerators using Beehive can achieve a 4x improvement in end-to-end RPC tail latency for Linux UDP clients versus a CPU-attached accelerator.
  • Generative Keyframing
    (2026-02-05) Wang, Xiaojuan; Seitz, Steven M.; Curless, Brian
    Keyframing is a fundamental element of animation creation and video editing. It involves defining specific frames, i.e., keyframes, that mark important moments of change and guide how the intermediate frames are filled or interpolated. In early hand-drawn animation, a keyframe is a visual drawing created by animators, with assistants manually drawing the in-between frames. With the advent of digital animation and video editing software, a keyframe became a set of parameters that define the state of the rendered character or object at specific times, with in-between transitions produced by interpolating these parameters. However, such parametric approaches rely heavily on manually designed controls and artist-crafted heuristics, making them difficult to capture complex, nuanced, and realistic motions. Furthermore, they do not naturally generalize to real image and video domains. The rapid progress of visual generative models that are trained on large collections of visual data and capable of learning rich appearance and motion patterns, has made it possible to generate high-fidelity imagery and realistic motion. Building on these advances, this thesis investigates generative keyframing, a data-driven, non-parametric, image-based approach to the keyframing process. To this end, I present a series of works in this thesis that collectively develop and explore this idea. I begin with the basic aspect: using generative models to synthesize transitions directly from images, and even to fully generate in-between motions. I first present a GAN-based technique for smoothing jump cuts in talking head videos, synthesizing seamless transitions between the cuts even in challenging cases involving large head movement. I then introduce a method for generating in-between videos with dynamic motion between more distant key frames by adapting a pretrained large-scale image-to-video diffusion model with minimal fine-tuning effort. 
Beyond automatically generating transitions between keyframes, I further explore multi-scale keyframing for achieving very deep zoom. Specifically, I introduce a multi-scale joint sampling diffusion approach for generating consistent images (keyframes) across different spatial scales while adhering to their respective input text prompts. This enables deep semantic zoom, and a continuous zoom video can be rendered from these images. When working with multiple keyframes, one important question is how they should be ordered in the final video. I address this in the context of dance video generation, specifically music-synchronized and choreography-aware animal dance video, where unordered keyframes representing distinct animal poses are arranged via graph optimization to satisfy a specified choreography pattern of beats that defines the long-range structure of a dance. Finally, I conclude with discussions and directions for future work.
  • High-Performance Transaction Processing in Disk-based Databases
    (2026-02-05) Hwang, Deukyeon; Peter, Simon
    Achieving high-performance transaction processing in disk-based databases has long required system designers to choose between lock-based concurrency control methods, which suffer from CPU overhead and reduced parallelism, and timestamp-based methods, which provide superior concurrency but incur prohibitive I/O overhead when timestamp metadata is stored on disk. Modern high-speed storage devices like NVMe SSDs exacerbate this trade-off, as the CPU becomes the bottleneck for lock-based methods while disk-based timestamp storage wastes the storage device’s speed on frequent small metadata operations. This dissertation introduces a novel approach that eliminates this fundamental trade-off through approximate timestamp storage and demonstrates that timestamp-based concurrency control protocols—specifically Strict Timestamp Ordering (STO), Multi-Version Timestamp Ordering (MVTO), and TicToc—can maintain correctness (serializability) even when timestamps are overapproximated for inactive keys, as long as active keys maintain exact timestamps throughout their transaction lifetime. This key insight enables the design of FPSketch, a hybrid data structure combining a hash table for exact timestamps of active keys with a probabilistic sketch for approximate upper bounds of inactive keys. The first contribution is the design, implementation, and evaluation of FPSketch integrated with STO, MVTO, and TicToc in the SplinterDB key-value store. FPSketch achieves nearly the idealized performance while requiring only minimal memory—as little as 32KiB for an 80GB database—by eliminating the need to access timestamp metadata from disk during normal operation. Experimental evaluation on modern NVMe SSDs demonstrates that TicToc with FPSketch achieves up to 14x higher goodput than traditional two-phase locking and up to 5.9x higher goodput than disk-based timestamp storage.
The second contribution is a comprehensive analytical and experimental study evaluating FPSketch across the entire storage performance spectrum, from traditional hard disk drives with millisecond latencies to emerging CXL-based storage approaching DRAM-like speeds. The evaluation reveals that FPSketch’s benefits scale with the fundamental gap between local memory and remote storage access, ensuring its continued relevance as storage technology evolves. On slow storage (HDDs and SATA SSDs), FPSketch enables timestamp-based protocols to outperform traditional concurrency control methods: on SATA SSD, TicToc with FPSketch achieves up to 6.89× and 2.52× higher goodput than two-phase locking (2PL) and KR-OCC, respectively, while on HDD it reaches up to 1.8× the goodput of KR-OCC. FPSketch also eliminates the prohibitive overhead of timestamp disk accesses, achieving improvements of up to 569% over disk-based timestamp storage. On fast storage (simulated CXL-based SSDs), where systems transition from I/O-bound to CPU-bound, FPSketch continues to provide substantial benefits by keeping timestamp metadata in fast local memory, enabling timestamp-based protocols to significantly outperform traditional approaches. Together, these contributions establish that approximate, in-memory metadata management enables high-performance transaction processing for disk-based databases. FPSketch demonstrates that approximate metadata management can unlock advanced concurrency control designs that would otherwise be impractical, providing a practical solution that enables efficient timestamp-based concurrency control across diverse storage technologies while requiring only minimal memory overhead.
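The FPSketch idea (exact timestamps for active keys, grow-only over-approximations for inactive ones) can be sketched as a toy in-memory structure. The slot count, the promote/retire API, and the single hashed array are illustrative assumptions, not the actual SplinterDB integration:

```python
class TimestampTable:
    """Toy FPSketch-style store: a hash table holds exact timestamps for
    active keys; inactive keys fall back to a small array of hashed slots
    whose values only grow, so a lookup can over-report but never
    under-report a timestamp (the property serializability needs)."""

    def __init__(self, slots=64):
        self.active = {}           # key -> exact last-write timestamp
        self.sketch = [0] * slots  # slot -> max timestamp ever folded in

    def begin_access(self, key):
        """Promote a key to exact tracking while a transaction touches it."""
        self.active.setdefault(key, self.read(key))

    def write(self, key, ts):
        self.active[key] = ts

    def retire(self, key):
        """Fold the exact value into the sketch when the key goes inactive."""
        ts = self.active.pop(key, 0)
        slot = hash(key) % len(self.sketch)
        self.sketch[slot] = max(self.sketch[slot], ts)

    def read(self, key):
        """Exact timestamp if active, otherwise a safe upper bound."""
        if key in self.active:
            return self.active[key]
        return self.sketch[hash(key) % len(self.sketch)]
```

Hash collisions only inflate the upper bound for inactive keys, which may abort a few extra transactions but never admits a serializability violation; that is why the structure can stay tiny (kilobytes for a multi-gigabyte database in the dissertation's evaluation).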
  • Item type: Item ,
    Extending Human Capabilities with Deep Learning-Powered Wearables
    (2026-02-05) Kim, Maruchi; Gollakota, Shyamnath
    Deep learning-powered wearables have the potential to seamlessly extend human capabilities by enhancing perception and interaction in everyday environments. In this dissertation, I present three wearable systems that integrate deep neural networks with custom hardware to enable real-time audio enhancement, vision-based smart interactions, and visual intelligence through wireless earbuds. First, I present ClearBuds, a wireless earbud system that performs real-time speech enhancement using a synchronized binaural microphone array and a lightweight dual-channel neural network. The system achieves high-precision synchronization and low-latency processing on mobile devices, enabling robust noise suppression and background speech removal in diverse real-world conditions. Second, I introduce IRIS, a vision-enabled smart ring that fits within the size and power constraints of the ring form factor to enable context-aware smart home interactions. By combining scene semantics with detected objects, IRIS achieves instance-level device recognition and outperforms voice commands in speed, precision, and social acceptability. Third, I present VueBuds, the first vision-enabled wireless earbuds integrating low-power cameras with vision language model interaction. VueBuds addresses fundamental challenges in embedding cameras into earbuds—strict power and form-factor constraints, facial occlusion from ear-level positioning, and real-time multimodal processing over Bluetooth. Through a stereo camera system operating at under 5 mW and end-to-end system optimizations, VueBuds achieves visual question-answering performance comparable to commercial smart glasses while leveraging a significantly more ubiquitous form factor. Together, these systems demonstrate how deep learning-powered wearables can extend human capabilities with on-the-go intelligence, establishing new platforms for intuitive, responsive, and enhanced human-computer interaction.
  • Item type: Item ,
    Exploring protein-protein interactions using high-throughput datasets and deep learning
    (2026-02-05) La Fleur, Alyssa Marie; Seelig, Georg
    Protein-protein interactions (PPIs) are fundamental to cellular function. Understanding which proteins interact—and how sequence variation alters these interactions—is essential for advancing therapeutic discovery and protein engineering. High-throughput sequencing technologies enable the large-scale measurement of PPIs, but the resulting datasets are complex and require error correction, modeling, and interpretation to yield meaningful insights. This thesis presents work across the process of designing, executing, and making use of high-throughput data, including (1) designing and modeling mutant protein libraries for large-scale PPI measurement, (2) developing PPI-specific sequencing analysis pipelines, (3) training models on limited structural features for PPI prediction in specific protein families, and (4) applying feature attribution techniques to interpret sequence-to-function models. Together, this work supports the continued development of experimental and computational tools to deepen our understanding of protein-protein interactions.
  • Item type: Item ,
    Algorithmic Design-for-Manufacturing and Programmability of Metamaterials
    (2025-10-02) Revier, Daniel; Lipton, Jeffrey I
    Modern digital design tools offer vast creative freedom, yet a persistent gap remains between conceptual designs and their physical realization, particularly for advanced materials and designs. This challenge is acute for metamaterials, whose novel properties are derived from complex, fine-scale architecture that is often difficult to design for and manufacture. This dissertation argues that design tools should directly translate a user's high-level intent into physical form by managing complexity on two fronts: abstracting complex material behavior into simple, programmable controls, and embedding the physics of the fabrication process directly into the design environment. This approach treats physical realization not as a downstream constraint but as an integral design parameter, enabling more expressive, accessible, and efficient workflows. To substantiate this claim, this thesis presents three interlocking contributions that integrate computational design with advanced manufacturing: First, to address the design complexity of metamaterials, I introduce a framework that abstracts complex mechanical behaviors into simple, programmable primitives. This work uses compliant straight-line mechanisms (SLMs) as reconfigurable building blocks that explicitly encode zero-energy deformation modes. By coordinating these SLMs with planar symmetries, I demonstrate the ability to deterministically program and smoothly interpolate between all 2D extremal classes (nullmode, unimode, bimode, and trimode), enabling in-situ, reversible tuning of emergent properties like Poisson's ratio and chirality without costly re-computation. Second, I present Fabrication-Directed Entanglement (FDE), which combines the previous work on simplifying metamaterial design with topology optimization and viscous thread printing (VTP) to computationally design and fabricate monolithic foam metamaterials with spatially patterned entanglement. 
By translating intended directions of compliance and rigidity into optimized density fields and controlled VTP coiling patterns, FDE produces single-filament structures exhibiting targeted anisotropy, extreme Poisson's ratios, and significant chirality-driven normal-shear coupling—behaviors previously inaccessible in uniform or multi-material entangled foams. Third, to demonstrate the generality of this fabrication-aware approach beyond mechanics, I extend the VTP-based design methodology to the optical domain. I present a computational pipeline for creating foam-based lithophanes where light transmission is controlled by spatially varying the foam's porosity. This work leverages a calibrated physical simulation of the VTP process and a photorealistic rendering pipeline to automate the translation of digital images into manufacturable, uniform-thickness structures with programmed optical properties. Collectively, these projects demonstrate that abstracting low-level fabrication and material physics into process-aware computational tools expands the space of what can be designed and built. By algorithmically integrating the means of production into the act of creation, this work provides a pathway toward more powerful and intuitive design systems for complex physical objects.
  • Item type: Item ,
    Exploring Classification Methods for Motor Imagery and Execution EEG Signal Fluctuations
    (2025-10-02) Dode, Pragati; Parsons, Erika
    Brain-Computer Interfaces (BCIs) offer promising applications in neurological rehabilitation through motor imagery (MI)-based training, which is the "intent" of performing an action. This research addresses the challenge of accurately classifying MI and motor execution (ME) based on Electroencephalography (EEG) signals. This kind of data is often limited by subject variability, non-stationarity, environmental noise during data collection, EEG device quality, and small dataset sizes. For our study, we make use of a large external dataset with data from 103 subjects (compared to the 9–12-subject datasets used in prior work). One of the main goals of this research is to integrate multiple feature extraction techniques spanning the time, frequency, and spatial domains. Effective EEG channel selection was guided by fMRI studies identifying MI- and ME-relevant Brodmann areas, combined with EEG-based statistical analysis, resulting in a refined set of 12 informative electrodes. Several machine learning models (SVM, RF, KNN, XGBoost, MLP) are evaluated, achieving up to 80% accuracy with improved robustness across subjects. These findings demonstrate enhanced generalizability and support the development of more reliable BCI applications for real-world rehabilitation scenarios.
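One of the evaluated baselines, k-nearest neighbors, is simple enough to sketch end to end. The feature vectors below are hypothetical stand-ins for extracted EEG features (the abstract does not publish its data), and this stdlib-only Python sketch is illustrative rather than the study's actual pipeline:

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training
    examples under Euclidean distance. `train` is a list of
    (feature_vector, label) pairs."""
    neighbors = sorted(train, key=lambda fl: math.dist(fl[0], query))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# Toy 2-D "features" standing in for per-trial EEG band powers
# (hypothetical values; real feature vectors would be much longer).
train = [((0.1, 0.2), "MI"), ((0.2, 0.1), "MI"),
         ((0.9, 0.8), "ME"), ((0.8, 0.9), "ME")]
```

In practice each of the models listed in the abstract would be fit on the same extracted features and compared by held-out accuracy; KNN is shown here only because it needs no training step.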
  • Item type: Item ,
    Improved XOR Lemmas for Communication Complexity
    (2025-10-02) Iyer Vaidyanathan, Siddharth; Rao, Anup
    We give communication lower bounds for computing the $n$-fold XOR of a given Boolean function $f$, denoted $f^{\oplus n}(x,y) := f(x_1,y_1)\oplus\ldots\oplus f(x_n,y_n)$, in both the deterministic and the randomized settings. In addition, we give deterministic communication lower bounds for computing the composition of two functions, $g\circ f(x,y) := g(f(x_1,y_1),\ldots,f(x_n,y_n))$. For some absolute constant $C_0 > 0$ and all $C > C_0$, we show the following:
    1. Randomized XOR Lemma. If $f$ requires $C$ bits to be computed with some constant success probability, then computing $f^{\oplus n}$ with probability at least $1/2 + \exp(-\Omega(n))$ requires $\tilde\Omega(C\sqrt{n})$ bits.
    2. Deterministic XOR Lemma. If $f$ requires $C$ bits to be computed deterministically, then computing $f^{\oplus n}$ deterministically requires $\Omega(n\sqrt C)$ bits.
    3. Lifting Theorem. For any function $g$ with sensitivity $s$ and degree $d$, and any $f$ requiring $C$ bits to be computed deterministically, computing $g\circ f$ deterministically requires $\Omega(\min\{s,d\}\cdot\sqrt C)$ bits.
    We prove the above results using information theory. In particular, the randomized XOR lemma is proved using a new notion of information that we call marginal information.
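The $n$-fold XOR composition $f^{\oplus n}$ defined above is easy to make concrete in code. The helper names below are my own, and the base function chosen here (inner product mod 2, a standard hard communication problem) is purely illustrative, not one studied in the abstract:

```python
from functools import reduce

def xor_n(f, xs, ys):
    """n-fold XOR composition: f^{(+)n}(x, y) = f(x1,y1) XOR ... XOR f(xn,yn),
    where xs and ys are the lists of per-coordinate inputs."""
    return reduce(lambda acc, xy: acc ^ f(*xy), zip(xs, ys), 0)

# Example base function: inner product mod 2 on bit-vectors.
def ip(x, y):
    return sum(a & b for a, b in zip(x, y)) % 2
```

The XOR lemmas then ask how the communication cost of `xor_n(f, ...)` grows with `n` relative to the cost of a single evaluation of `f`.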
  • Item type: Item ,
    A Qualitative Approach to Agile Hardware Design
    (2025-10-02) Ruelas-Petrisko, Daniel; Taylor, Michael B; Oskin, Mark H
    The recent, dramatic rise of small hardware startups illustrates the demand for rapid, low-cost ASIC design. While researchers borrow Agile principles from software engineering, such as quick iteration, aggressive reuse, and continuous integration, their methodologies have struggled to escape small labs. Unfortunately, there remain clear challenges to broader adoption. Effective research evaluation techniques often do not scale to larger, more complex design flows. Meanwhile, risk aversion means that traditional hardware design methodologies can be rigid and slow to adapt. The result is that despite measurable quantitative improvements, traditional Agile design methodologies cannot be practically applied in complex SoC designs. This thesis presents a comprehensive approach to Agile Hardware Design through three tools: BSG Pearls, BlackParrot, and ZynqParrot. All three projects are open-source, silicon-proven, and available for immediate use under a permissive BSD-3 License. Hardware designers can leverage these efforts to make Agile Hardware Design qualitatively more feasible across a wide variety of research and commercial projects.