Electrical and computer engineering
Permanent URI for this collection: https://digital.lib.washington.edu/handle/1773/49166
Recent Submissions
Feedback Loops in Interactive Machine Learning: Online Weakly-Submodular Learning and Probing for Missing Labels (2026-04-20)
Narang, Adhyyan; Fazel, Maryam; Ratliff, Lillian
Machine learning systems are increasingly deployed as interactive services that obtain data not by sampling from a fixed distribution, but through direct and indirect interaction with an environment, users, and other learners. In recommender engines, language model services, and online platforms, this interaction makes the learner's information environment endogenous: the learner's own actions or current state determine what feedback and data become available to it. This dissertation studies two distinct channels through which interactivity induces endogeneity, and develops principled algorithms with provable guarantees for each. Chapter I addresses history-dependent feedback in repeated interaction. When a learner constructs a set of choices over time (for instance, recommending movies sequentially), the value of each future action depends on what has already been selected: a sequel gains value if the original was recommended, while similar items exhibit diminishing returns. The learner's past actions shape the structure of its own future feedback, creating combinatorial utilities that are neither purely submodular nor purely supermodular. We extend Gaussian Process contextual bandits to objectives that are BP-decomposable (a sum of monotone submodular and supermodular terms) or weakly submodular. We introduce a novel separate-feedback framework where observations are available independently for each component, and integrate Nyström sketching to ensure scalability. We prove sublinear regret bounds in all cases, demonstrating that richer utility structures can be optimized online with theoretical guarantees. Chapter II addresses choice-driven data allocation in multi-learner markets.
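The diminishing-returns structure described for Chapter I can be made concrete with a toy example. The sketch below is not the dissertation's Gaussian Process bandit algorithm; it is the classic greedy heuristic for monotone submodular maximization, run on an invented movie catalog in which a redundant item (a sequel) adds no marginal value.

```python
# Toy illustration of diminishing returns (submodularity) in set selection.
# This is NOT the dissertation's GP-bandit algorithm; it is the standard
# greedy heuristic for monotone submodular maximization, applied to a
# coverage objective where each "item" covers a set of user interests.

def coverage(selected, universe_map):
    """f(S) = number of interests covered; monotone and submodular."""
    covered = set()
    for item in selected:
        covered |= universe_map[item]
    return len(covered)

def greedy_select(universe_map, k):
    """Pick k items, each maximizing the marginal gain f(S + e) - f(S)."""
    selected = []
    for _ in range(k):
        best, best_gain = None, -1
        for item in universe_map:
            if item in selected:
                continue
            gain = coverage(selected + [item], universe_map) - coverage(selected, universe_map)
            if gain > best_gain:
                best, best_gain = item, gain
        selected.append(best)
    return selected

# Hypothetical catalog: each movie satisfies a set of interests.
catalog = {
    "original": {"scifi", "action"},
    "sequel":   {"scifi", "action"},   # fully redundant with "original"
    "romcom":   {"romance", "comedy"},
    "doc":      {"history"},
}
picks = greedy_select(catalog, 2)
```

For monotone submodular objectives under a cardinality constraint, this greedy rule carries the classic (1 - 1/e) approximation guarantee; the BP-decomposable and weakly submodular settings studied in the dissertation relax exactly this structure.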
When multiple learners compete for the same pool of users, who choose based on predictive quality and inherent preferences (e.g., brand loyalty), the data each learner observes becomes a function of its own performance, creating a second form of endogeneity. We characterize an overspecialization trap: as learners optimize for users who already prefer them, they become less attractive to others, further restricting their data and leading to arbitrarily poor global performance, even when models with low full-population loss exist. Inspired by knowledge distillation, we propose Peer Probing, an algorithm that queries peer models to obtain synthetic labels for users outside the learner's organic base. We prove that this procedure converges almost surely to a stationary point with bounded full-population risk when probing sources are sufficiently informative. Together, these contributions show that accounting for the endogeneity inherent in interactive learning, through richer function classes and richer data sources, yields algorithms that are both theoretically principled and practically effective.

Extracting Clinical Information from Unstructured EHRs using Language Models, and its Role in Disease Prediction (2026-04-20)
Zhou, Sitong; Ostendorf, Mari; Yetisgen, Meliha
Clinical unstructured data contain critical information for clinical decision making, such as symptoms and radiology findings, that can complement structured EHRs and often add greater detail. However, clinically relevant information can be buried in abundant unstructured EHR notes, making thorough review challenging for physicians. In addition, large volumes of text can include information irrelevant to secondary machine learning applications. We aim to develop language model–based information extraction (IE) methods to extract clinically critical information from EHR texts, supporting human review and secondary clinical decision applications.
We first develop robust event extraction methods using supervised learning to identify clinical events at the sentence level, and improve their generalization across different domain shifts. In one study on symptom event extraction, we demonstrate that two strategies, adaptive pretraining on unstructured EHRs and masking frequent symptoms during training, improve domain generalization when using an encoder-only language model. In a second study on radiological findings extraction, we show that generative LMs generalize better than encoder-only models in categorizing minority classes, and that further training them on decomposed, simpler subtasks improves generalization to complex tasks when subtask dependencies are shifted across domains. In addition to event extraction from isolated reports, we present longitudinal summarization of radiology reports as an additional IE task to track radiological findings and capture temporal changes not reflected in individual reports. We frame longitudinal summarization as a timeline generation task that groups related findings across time, introduce RadTimeline as an evaluation dataset, and propose an LLM-based approach that achieves good recall of lung findings and human-comparable grouping of gold-standard findings without training data. Finally, we apply information extraction to extract risk factors from longitudinal EHRs for a secondary-use clinical application, a lung cancer prediction task. We create a lung cancer case-control cohort, where each patient has a 5-year longitudinal EHR history and a lung cancer outcome within three years. We find that COPD, smoking status, and radiology abnormality information extracted from unstructured notes can complement the structured EHRs and improve lung cancer risk prediction performance.
Using a transformer-based risk prediction model, we further compare different representations of longitudinal risk factors across model variants and input orderings, finding no benefit from including findings from reports beyond a 6-month window.

Identifiable Bayesian Representations for Heterogeneous Medical Imaging (2026-04-20)
Wang, Xin; Shapiro, Linda; Yuan, Chun
Medical images exhibit pervasive heterogeneity arising from acquisition protocols, scanner properties, reconstruction pipelines, modality and contrast mechanisms, and anatomical variability across subjects and scan coverage. While deep learning has achieved strong performance in many medical image analysis tasks, robustness under compounded heterogeneity remains fragile. This dissertation argues that such fragility reflects a representational limitation: when task-relevant generative properties and observational variability are not organized in an identifiable manner, models may rely on unstable observational cues as surrogates, leading to degraded generalization as heterogeneity intensifies. To address this challenge, we develop a unified perspective based on Bayesian representation learning and explicit latent role specification. Using latent variable models and variational inference, we construct mechanisms that preserve task-relevant invariants while suppressing observational variability, a requirement termed identifiable invariant preservation. We show that strengthening the identifiability of latent organization provides a practical pathway to both interpretability and improved predictive performance across progressively more demanding regimes of heterogeneity. The dissertation substantiates this thesis through three projects. First, we study supervised intracranial arterial calcification segmentation from multi-contrast brain MRI under intensity-level appearance heterogeneity.
Because calcification is dark and often weakly expressed in MRI, segmentation depends on fragile contextual cues that are easily perturbed by scanner- and protocol-dependent fluctuations. A variational Bayesian formulation that restricts representational complexity yields more stable internal organization and improved segmentation accuracy. Second, we address unsupervised multimodal groupwise image registration under compounded contrast/modality variability and registration-compatible geometric heterogeneity. We formulate registration as hierarchical Bayesian inference that disentangles common anatomy from image-specific geometry, enabling intrinsic multimodal similarity and stable alignment without intensity-based heuristics. Third, we study unsupervised domain adaptation for segmentation in a correspondence-free regime with unpaired source and target domains. We introduce a probabilistic anatomical manifold that provides global canonicalization through a structured latent decomposition, inducing architecture-emergent adaptation without an explicit alignment loss and yielding a unified procedure across source-accessible and source-free settings. Together, these contributions demonstrate that interpretable, identifiable latent organization is not merely an explanatory preference, but a practical mechanism for robust medical image learning under increasing heterogeneity. 
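The variational inference machinery referenced throughout this abstract includes a KL-divergence regularizer between the approximate posterior and the prior. As a minimal, self-contained anchor (not code from the dissertation), the sketch below evaluates the closed-form KL between a diagonal Gaussian posterior and a standard normal prior, with invented numbers:

```python
# Closed-form KL divergence KL( N(mu, sigma^2) || N(0, 1) ), the usual
# regularizer in variational Bayesian objectives. This is a generic
# illustration; the means and variances below are invented.
import math

def kl_gaussian_std_normal(mu, sigma):
    """KL from a diagonal Gaussian posterior to a standard normal prior,
    summed over independent latent dimensions."""
    return sum(
        0.5 * (s * s + m * m - 1.0 - math.log(s * s))
        for m, s in zip(mu, sigma)
    )

# A posterior that matches the prior exactly has zero KL ...
assert kl_gaussian_std_normal([0.0, 0.0], [1.0, 1.0]) == 0.0
# ... and the KL grows as the posterior drifts or changes scale.
kl = kl_gaussian_std_normal([1.0, -0.5], [0.5, 2.0])
```

This is the term that penalizes latent dimensions for carrying information, which is one standard mechanism for restricting representational complexity as described above.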
By developing Bayesian, disentangled formulations that progressively strengthen latent role specification across tasks, this dissertation provides a unified methodological pathway that improves both generalization and semantic interpretability in challenging real-world imaging regimes.

Generalizable Object Tracking in Complex Real-World Scenes with Contextual Cues and Memory (2026-04-20)
Yang, Cheng-Yen; Hwang, Jenq-Neng
Multiple-Object Tracking (MOT) serves as a cornerstone of computer vision, yet achieving robust data association remains a significant challenge in dynamic environments characterized by frequent occlusions. This dissertation investigates the strategic integration of spatial contextual cues and hierarchical memory to enhance tracking stability. To bridge the gap between camera and image space, we first analyze three distinct spatial perspectives: an extrinsic approach leveraging drone metadata for maritime scenarios, an intrinsic method utilizing self-calibration for multi-camera consistency, and a depth-aware modality to prioritize non-occluded objects in dense crowds. Building upon these spatial foundations, we leverage vision foundation models to introduce SAMURAI, a motion-aware zero-shot tracker, and SAMURAI++, a unified framework that reconciles tracking-by-detection and tracking-by-query paradigms. By maintaining dual-horizon memory—short-term and long-term—for each tracklet, this work achieves superior identity preservation and cross-domain generalizability without the need for task-specific fine-tuning.
Collectively, these contributions demonstrate that the synergy of temporal memory and spatial context provides a robust trajectory toward generalizable object tracking in complex, real-world scenes.

Action Reasoning Models that can Reason in Space (2026-04-20)
Lee, Jason; Hwang, Jenq-Neng; Fox, Dieter
Reasoning is central to purposeful action, yet most robotic foundation models map perception and instructions directly to control, which limits adaptability, generalization, and semantic grounding. We introduce Action Reasoning Models (ARMs), a class of robotic foundation models that integrate perception, planning, and control through a structured three-stage pipeline. Our model, MolmoAct, encodes observations and instructions into depth-aware perception tokens, generates mid-level spatial plans as editable trajectory traces, and predicts precise low-level actions, enabling explainable and steerable behavior. MolmoAct-7B-D achieves strong performance across simulation and real-world settings: 70.5% zero-shot accuracy on SimplerEnv Visual Matching tasks, surpassing closed-source Pi-0 and GR00T N1.5; 86.6% average success on LIBERO, including an additional 6.3% gain over ThinkAct on long-horizon tasks; and in real-world fine-tuning, an additional 10% (single-arm) and an additional 22.7% (bimanual) task progression over Pi-0-FAST. It also outperforms baselines by an additional 23.3% on out-of-distribution generalization and achieves top human-preference scores for open-ended instruction following and trajectory steering. Furthermore, we release, for the first time, the MolmoAct Dataset, a mid-training robot dataset comprising over 10,000 high-quality robot trajectories across diverse scenarios and tasks. Training with this dataset yields an average 5.5% improvement in general performance over the base model.
We release all model weights, training code, our collected dataset, and our action reasoning dataset, establishing MolmoAct as both a state-of-the-art robotics foundation model and an open blueprint for building ARMs that transform perception into purposeful action through structured reasoning.

FPGA-Based System for Radiation Energy Histogram Computation using Time-Over-Threshold (2026-04-20)
Guntha, Pavan Sai; Hauck, Scott
Personalized dosimetry is essential for optimizing radiopharmaceutical therapy, yet current clinical practice relies on serial hospital imaging that is expensive and limits the number of time-point samples. This thesis presents the FPGA-based system at the core of the Portable Dosimetry Device (PODD), a compact wireless gamma spectroscopy system for monitoring 177Lu therapy, focusing on the design and implementation of real-time energy histogram computation. The FPGA processes time-over-threshold encoded signals from 16 parallel detector channels, achieving 2.5 ns time binning for pulse width measurement and maintaining 512-bin histograms per channel. The digital processing core integrates with GAGG:Ce/SiPM scintillation detectors and communicates wirelessly via Bluetooth to an Android application. Validation with 177Lu and 22Na radioactive sources demonstrated clearly resolved photopeaks at 113 keV, 208 keV, and 511 keV.
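The histogramming arithmetic described in this abstract is easy to sketch in software. The Python model below mirrors only the binning logic (2.5 ns bins, 512 bins per channel, 16 channels, as stated above); the real system implements this in FPGA hardware, and the pulse widths used here are invented:

```python
# Software sketch of the per-channel histogram logic described above.
# The actual system is implemented on an FPGA; this Python model mirrors
# only the arithmetic: time-over-threshold pulse widths are quantized into
# 2.5 ns bins and accumulated into a 512-bin histogram per channel.

BIN_WIDTH_NS = 2.5
NUM_BINS = 512

def update_histogram(hist, pulse_width_ns):
    """Quantize one pulse width and increment its bin (overflows saturate
    into the top bin rather than being dropped)."""
    bin_idx = min(int(pulse_width_ns / BIN_WIDTH_NS), NUM_BINS - 1)
    hist[bin_idx] += 1

# One histogram per detector channel (16 channels in the front end).
channels = [[0] * NUM_BINS for _ in range(16)]

# Invented pulse widths (ns) arriving on channel 0.
for width in [5.0, 6.1, 1279.9, 5000.0]:
    update_histogram(channels[0], width)
```

Saturating the top bin is one plausible overflow policy for a fixed-width histogram memory; the thesis itself may handle out-of-range pulses differently.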
The PODD’s compact form factor and wireless operation enable accessible point-of-care dosimetry for personalized radiopharmaceutical therapy.

Natural Language Processing for Education Research: Exploring Strategic Use of Traditional and Large Language Topic Models (2026-04-20)
Kardam, Neha; Wilson, Denise
Engineering education research increasingly relies on qualitative analysis of short, open-ended survey responses to understand student experiences across courses and institutions, but extracting reliable themes from these texts at scale requires methods that balance computational efficiency with interpretive rigor. While Natural Language Processing (NLP) has been applied in education for automated grading and sentiment analysis, its systematic integration with qualitative thematic analysis for short, prompt-guided educational research texts has received limited attention. This dissertation addresses that gap by comparing five topic modeling methods on short student feedback on instructional support and by developing the NLP-Assisted Thematic Analysis framework, a six-stage workflow that embeds domain expert judgment from data preparation through final validation. Three survey datasets of undergraduate engineering student responses on faculty support, teaching assistant (TA) support, and peer support (1,667, 1,592, and 1,376 responses, respectively, for approximately 4,600 total) were processed through a standardized preprocessing pipeline and evaluated against expert-coded themes. Five methods were compared: k-means clustering, Latent Dirichlet Allocation (LDA), Non-Negative Matrix Factorization (NMF), BERTopic with MiniLM and MPNet sentence embeddings, and zero-shot classification (ZSC). Performance was evaluated using accuracy, macro and weighted F1, topic coherence, and inter-rater reliability (Cohen’s κ).
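To make the reported metrics concrete, the sketch below computes macro-F1 and Cohen's κ from scratch on a tiny invented two-rater labeling; it illustrates the metric definitions only and is not the dissertation's evaluation code:

```python
# From-scratch computation of two metrics named above: macro-F1 and
# Cohen's kappa. The two "raters" and their codes are invented for
# illustration.
from collections import Counter

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / len(f1s)

def cohens_kappa(y_a, y_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(y_a)
    p_obs = sum(a == b for a, b in zip(y_a, y_b)) / n
    ca, cb = Counter(y_a), Counter(y_b)
    p_chance = sum(ca[c] * cb[c] for c in ca) / n ** 2
    return (p_obs - p_chance) / (1 - p_chance)

# Hypothetical codes assigned by two raters to eight survey responses.
rater1 = ["faculty", "ta", "peer", "peer", "ta", "faculty", "peer", "ta"]
rater2 = ["faculty", "ta", "peer", "ta", "ta", "faculty", "peer", "ta"]
```

With these invented codes the raters agree on 7 of 8 responses and κ comes out near 0.81, in the same range as the 0.72–0.75 reliability reported below.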
Ground truth was established by two approaches: (a) a machine-led approach in which topic model keywords guided manual coding of the data; and (b) a human-led approach in which a domain expert coded the data independently. Results varied by dataset, with no single method performing best across all three corpora. BERTopic MiniLM performed best on the concise, low-ambiguity peer support corpus (85% accuracy, 77% macro-F1), with LDA second at 78% accuracy and 67% macro-F1. BERTopic MPNet led on faculty support, where over one quarter of responses addressed overlapping themes (76.8% accuracy, 65.7% macro-F1), with NMF close behind on accuracy (76.76% accuracy, 62.82% macro-F1). TA support was the most challenging dataset due to higher thematic ambiguity and misalignment between model-generated topics and expert-identified themes. ZSC, applied to the peer support dataset, reached 85% accuracy and 60% weighted F1 when prompts used mainstream language, compared to 82% accuracy and 56% weighted F1 with domain-specific prompts. The NLP-Assisted Thematic Analysis framework structures domain expert involvement across six stages of the analysis pipeline, from data preparation through final validation. Expert review consolidated nine algorithmic topics into five research themes, with inter-rater reliability (Cohen’s κ) between 0.72 and 0.75 across all three datasets. Targeted interventions, including domain-specific stopword curation, hyperparameter selection, topic-to-theme bridging, and review of algorithmically uncertain responses, improved macro-F1 by up to 14 percentage points. The largest single gain arose from BERTopic outlier review on the TA support dataset, raising macro-F1 from 54.2% to 69.3%.
These results establish performance benchmarks for five NLP methods on short educational research text, identify where domain expert involvement has the greatest impact on accuracy and interpretive quality, and provide the NLP-Assisted Thematic Analysis framework as a reproducible, decision-guided protocol for researchers applying topic modeling to qualitative survey data in education and related fields.

Numerical Simulation of Liquid Oxygen Droplet Combustion in Hydrogen under Microgravity Conditions (2026-02-05)
Davis, Benjamin Lu; Hermanson, James C.; Raiti, John
This work presents a lean, custom-built numerical simulator developed to study the combustion of a liquid oxygen (LOX) droplet in hydrogen gas (H2) under microgravity conditions. Motivated by drop-tower tests conducted at ZARM (the Center of Applied Space Technology and Microgravity) in Bremen, Germany, the quasi-static evaporation framework reproduces key coupled processes (flame dynamics, Stefan flow, droplet regression, and surface ice formation) within a computationally minimalist yet physically faithful model. The governing reaction-diffusion equations were solved using finite-difference methods incorporating time-dependent, spatially homogeneous Stefan velocity fields generated by real-time evaporative feedback from the flame. The simulation achieves strong quantitative agreement with experimental and computational benchmarks, reproducing flame stand-off ratios (F/D ≈ 2–3.5) and peak adiabatic flame temperatures (Tpeak ≈ 3000 K) consistent with previous work. Diffusive heat transfer dominates the total energetic flux, contributing 80–85% of the total heat input (Qmax ≈ 0.3–0.5 W), while radiative effects remain secondary, in accordance with previous estimates. Parametric sweeps over surface ice coverage fraction ψ reveal compensating feedback between evaporative impedance and geometric flame shape contraction.
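As a minimal illustration of the finite-difference approach mentioned in this abstract (not the simulator itself), the sketch below advances a 1-D advection-diffusion equation, u_t = D u_xx − v u_x, with a forward-Euler update; all parameters are invented, the advective term stands in loosely for a Stefan-flow velocity, and no chemistry source term is included:

```python
# Minimal forward-Euler, finite-difference sketch of 1-D advection-
# diffusion, u_t = D*u_xx - v*u_x, in the spirit of the reaction-diffusion
# solver described above. All numbers are illustrative, not the
# simulator's parameters.

def step(u, dx, dt, D, v):
    """One explicit time step with fixed (Dirichlet) boundary values."""
    new = u[:]
    for i in range(1, len(u) - 1):
        diff = D * (u[i + 1] - 2 * u[i] + u[i - 1]) / dx ** 2
        adv = v * (u[i + 1] - u[i - 1]) / (2 * dx)
        new[i] = u[i] + dt * (diff - adv)
    return new

# Hot spot in the middle of a cold domain; explicit stability needs
# D*dt/dx**2 <= 0.5 (here it is 0.2).
u = [0.0] * 21
u[10] = 1.0
for _ in range(50):
    u = step(u, dx=1.0, dt=0.2, D=1.0, v=0.1)
```

After 50 steps the initial spike has diffused into a smooth bump whose peak drifts slowly downstream with the advective velocity, which is the qualitative behavior an explicit scheme of this kind should show.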
A single predominant global reaction mechanism, augmented by equilibrium radical generation at the reactive flame front, suffices to reproduce thin flame-sheet behavior within the high-Damköhler limit. The resulting simulator balances interpretability, stability, and physical fidelity, requiring no HPC infrastructure and running interactively in Google Colab. Beyond LOX–H2 combustion, this framework offers a transparent, extensible platform for general coupled parabolic PDEs, bridging the gap between high-overhead CFD and simplistic static equilibrium tools.

Neuromechanical Modeling of Nematode C. elegans via Modular Integration and Deep Learning (2026-02-05)
Kim, Jimin; Shlizerman, Eli
Neural circuits within the nervous system use coordinated activities to control behavior. The mediation of neural activities by individual neuron dynamics and their integration within the nervous system represents a fundamental question in neuroscience. Computational approaches that integrate modeling of the nervous system, muscles, and the body can assist in investigating functional pathways that guide neural activities and movement. Such approaches are referred to as neuromechanical models, as they incorporate models of the nervous system and biomechanics to achieve simultaneous simulation of neural activities and behavior. The nematode Caenorhabditis elegans (C. elegans) is considered a viable framework for studying neuromechanics due to advances in the resolution of its nervous system connectomics, biomechanics, and electrophysiological recordings of neuronal activity. The availability of data allows for the construction of neuromechanical model candidates with varying scopes and modalities. In my PhD research, I proposed key methods for the identification, construction, and extension of neuromechanical models for C. elegans.
In particular, I proposed the modular integration approach and its implementation, modWorm, for modeling and simulating neuromechanical model candidates. The modWorm software allows for the construction of a model as an integrated series of configurable and exchangeable modules, each describing specific biophysical processes. Using modWorm, I proposed an initial candidate for the integrated neuromechanical model of C. elegans. The model integrates the complete connectome and 7 biophysical modules, including intra- and extra-cellular neural dynamics, translation of neural dynamics to muscle dynamics, muscle dynamics to body postures, and proprioceptive feedback from the environment. The model recapitulates i) known natural behavioral responses, such as forward and backward locomotion in response to associated neural stimuli or external forces, and ii) transitional behaviors, such as avoidance and turns, through timed stimuli. We performed computational ablation studies on neurons to infer novel neural circuits involved in sensorimotor behaviors (e.g., touch response). Variations of the model’s modules, such as more detailed intra- and extra-cellular dynamics, connectome mappings, and optimizations of associated parameters, can delineate possible mechanisms of locomotion and directions in which the model can be improved to fit experimental findings. For an extension of modWorm modality, I developed mod-SenseWorm to incorporate environmental stimuli during the simulation of C. elegans behavior (e.g., chemotaxis). In particular, mod-SenseWorm incorporates the dynamic translation of external stimuli into neural stimulation to achieve a closed-loop simulation between neuromechanics and the surrounding environment. The translation algorithms employed by individual neurons can be configured by setting their stimulus encoding properties (e.g., tonic, phasic) and anatomical locations in the body (e.g., anterior, posterior). We applied mod-SenseWorm to study C.
elegans O2 aerotaxis behavior and showed that the proposed model, in conjunction with the simulation of an O2 environment, recapitulates empirically observed avoidance behaviors associated with increased O2 levels. Furthermore, through the analysis of simulated neural activities, we show the use of mod-SenseWorm to infer potential functional circuits associated with chemotactic responses. Deep learning methods can assist in extending the scope of the proposed neuromechanical model by inferring the parameters of biologically detailed modules associated with empirical data. This led me to develop ElectroPhysiomeGAN (EP-GAN), a deep generative method for the estimation of biophysical neuron parameters associated with neuron models from recorded electrophysiological responses. Trained with simulation data, EP-GAN learns the translation from recorded neuron responses (e.g., membrane potential responses, steady-state currents) to biophysical model parameters associated with the detailed Hodgkin-Huxley (HH) model. Validation of EP-GAN by estimating HH-model parameters for 200 simulated non-spiking neurons, followed by 9 experimentally recorded neurons in C. elegans, showed EP-GAN’s advantages in the accuracy of the estimated parameters and inference speed compared to existing estimation methods. Control strategies can further extend the modality of the neuromechanical model by inferring supplemental mechanisms of neural circuits associated with behavior. In particular, I have introduced a possible employment of deep reinforcement learning (DeepRL) methods to develop control strategies for both neural stimulation (neuromodulatory control) and neural connection mapping (connectome control) that are applied on top of the proposed neuromechanical model to achieve aimed behaviors.
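As a minimal tabular stand-in for the reinforcement-learning loop described here (not the dissertation's DeepRL setup), the sketch below runs plain Q-learning on an invented four-state chain where only the last state is rewarded; exhaustive sweeps over all state-action pairs replace an exploration policy:

```python
# Tabular stand-in for the control-strategy learning described above:
# plain Q-learning on an invented four-state chain; only the top state is
# rewarded. Exhaustive sweeps over state-action pairs stand in for an
# exploration policy, since the toy environment is deterministic.

N_STATES = 4
ACTIONS = (-1, 1)                    # move down or up the chain
ALPHA, GAMMA = 0.5, 0.9              # learning rate, discount factor

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Deterministic toy environment: bounded walk, reward at the top."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

for _ in range(200):                 # repeated sweeps until values settle
    for s in range(N_STATES):
        for a in ACTIONS:
            nxt, r = step(s, a)
            best_next = max(Q[(nxt, b)] for b in ACTIONS)
            Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

# Greedy policy: the learned strategy moves toward the goal in every state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)}
```

The learned Q-values settle at the discounted fixed point (the rewarded state is worth 1/(1 − γ) = 10 under the greedy policy), which is the same temporal-difference principle that DeepRL methods scale up with function approximation.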
The strategies learned by DeepRL can be used to identify dynamic neuromodulatory inputs between neurons (e.g., neuropeptidic currents) and perturbations of the connection wiring map for a local neural circuit, which result in empirically observed chemotactic behavior (e.g., attraction) in response to environmental stimuli. The results highlight the potential of utilizing DeepRL methods in conjunction with the neuromechanical model to infer potential neural interactions and circuitry that lead to specific behaviors.

Learning Representations from Neural Population Dynamics: Addressing Neural Variability Across Scales (2026-02-05)
Le, Trung; Shlizerman, Eli
Interactions between individual neurons, each characterized by distinct intrinsic physiological properties, collectively give rise to population responses underlying complex animal behaviors. These responses exhibit variable dynamics across trials, recording sessions, and behavioral contexts—arising from stochastic spiking at the trial level, electrode drift and neural plasticity across sessions, and task- or state-dependent modulation across behavioral contexts. This multiscale variability complicates the reliable extraction of scientific insights from population activity. Consequently, modeling and decoding from population activity necessitate methods capable of learning stable representations that capture the underlying structure of neuronal activity in the presence of neural variability caused by noise, partial observability, and domain shifts inherent in population recordings. In this dissertation, I present my studies that aim to extract useful information from population dynamics while addressing neural variability across different scales: single trials, recording sessions, and behavioral contexts. In the first study, I developed a spatiotemporal transformer to learn stable neural representations underlying the stochastic firing activity of neural populations on a single-trial basis.
In the second study, I introduced a self-supervised framework for extracting time-invariant representations of individual neurons by modeling their dynamics across partially overlapping populations over multiple recording sessions. In the third study, I developed a lightweight adaptive framework for online neural decoding, enabling rapid and robust generalization in unseen sessions with minimal unlabeled calibration trials and no model fine-tuning. In the fourth study, I exploited the dependence of population dynamics on behavioral contexts and presented a decoding framework leveraging context-aware representations for effective decoding of speech from population activity. Together, these studies advance a representation-centric paradigm for neural population analysis—delivering generalizable abstractions that are robust across contexts, scale to large recordings, and leverage inductive biases embedded in the population—thereby enabling effective extraction of scientific insights from population analysis and paving the way toward high-performing and robust brain–computer interfaces.

Integrated Acousto-optic Beam Steering for Advanced Free Space Optical Applications (2026-02-05)
Lin, Qixuan; Li, Mo
Optical beam steering underpins numerous technologies, including light detection and ranging (LiDAR), biomedical imaging, remote sensing, and utility-scale quantum computing with optically addressed qubits. However, established beam-steering approaches are constrained by slow response times, system complexity, and limited control over beam dynamics, preventing their widespread deployment in practical, high-performance optical systems. This thesis introduces integrated acousto-optic beam steering (AOBS), a solid-state beam-steering technology based on enhanced light–sound interactions in thin-film lithium niobate (TFLN).
Gigahertz surface acoustic waves (SAWs), generated piezoelectrically and controlled by RF signals, produce moving refractive-index gratings that dynamically reshape the phase front of guided light. When the acoustic and optical modes satisfy the phase-matching condition, the guided light is efficiently scattered into free space, enabling agile beam steering on chip. The first part of the thesis establishes the principles and device-design considerations for AOBS on TFLN. I examine the piezoelectric and acoustic properties of lithium-niobate-on-insulator (LNOI), which form the foundation for efficient SAW generation. I then present theoretical models and simulation frameworks for acousto-optic interactions, identifying the key mechanisms that govern scattering efficiency and guiding the design of high-performance AOBS devices. The second part demonstrates LiDAR and multi-beam free-space communication enabled by the unique properties of AOBS. The Brillouin scattering process not only allows the steering angle to be controlled by the acoustic frequency using a single transducer, but also imprints a distinct frequency shift on each steered beam. This enables frequency–angular resolving (FAR) LiDAR, in which a single coherent receiver extracts the angular position of a target directly from the frequency of the returned signal. The coherent nature of the process further supports simultaneous transmission of microwave-encoded data streams at different acoustic frequencies to spatially separated targets, enabling multiple-input multiple-output (MIMO) free-space optical communication. In the final part, I show that co-confining acoustic and optical modes in micrometer-scale rib waveguides not only boosts the efficiency and agility of AOBS, but also enables seamless integration within broader photonic integrated circuits (PICs). 
This opens the door to compact, high-performance, and multifunctional free-space optical systems that combine acousto-optic, electro-optic, and nonlinear photonic functionalities on a single TFLN platform.

Knowledge Transfer from Deep Electronic Networks to Optical Neural Networks (2026-02-05)
Xiang, Jinlin; Shlizerman, Eli
Optical Neural Networks (ONNs) offer a promising alternative to electronic networks for artificial intelligence computing by leveraging the speed of light, providing lower power consumption and latency. However, implementing ONNs remains challenging due to the high energy cost of nonlinear operations and the precise alignment required for multi-layer optical systems. Previous research introduced hybrid approaches that combine an optical frontend for fast computation with an electronic backend for nonlinear processing. While end-to-end optimization for hybrid ONNs has been demonstrated on specific datasets and optical configurations, these approaches typically lack generalization across tasks and hardware designs. This is primarily due to the optical frontend's inability to reliably mimic the feature extraction capabilities of state-of-the-art electronic networks. In my research, I proposed to transfer knowledge from electronic networks to hybrid electro-photonic convolutional neural networks, enabling the optical frontend to capture features similar to electronic networks while simplifying the model architecture. I trained the hybrid network using a teacher-student transfer learning framework, where a nonlinear electronic teacher network guided the optical frontend to learn features while circumventing nonlinearity. Next, I collaborated with colleagues to compress the convolutional layers of electronic networks (e.g., AlexNet) into a single layer, reducing the need for precise optical alignment and lowering computational costs.
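The teacher-student feature transfer described in this abstract can be sketched generically: a linear "student" (standing in for an optical frontend with no nonlinearity) is fit by gradient descent to match a fixed nonlinear "teacher's" features. All shapes, data, and rates below are invented for illustration; this is not the dissertation's training code:

```python
# Generic sketch of teacher-student feature matching: a linear student
# (a stand-in for a linear optical frontend) is trained to reproduce a
# fixed nonlinear teacher's features via a mean-squared feature-matching
# loss. All dimensions and hyperparameters are illustrative.
import math
import random

random.seed(0)

def teacher(x):
    """Fixed nonlinear 'electronic' feature extractor."""
    return [math.tanh(2 * xi) for xi in x]

def student(x, w):
    """Linear map, one weight per input; no nonlinearity available."""
    return [wi * xi for wi, xi in zip(w, x)]

def feature_loss(w, batch):
    """Mean squared error between student and teacher features."""
    return sum(
        sum((s - t) ** 2 for s, t in zip(student(x, w), teacher(x)))
        for x in batch
    ) / len(batch)

# Fit the student weights by plain gradient descent on the matching loss.
batch = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(32)]
w = [0.0, 0.0, 0.0]
for _ in range(200):
    grad = [0.0, 0.0, 0.0]
    for x in batch:
        t, s = teacher(x), student(x, w)
        for i in range(3):
            grad[i] += 2 * (s[i] - t[i]) * x[i]
    w = [wi - 0.05 * g / len(batch) for wi, g in zip(w, grad)]
```

Because the loss targets the teacher's features rather than task labels, the linear student absorbs as much of the nonlinear mapping as its capacity allows, which is the basic mechanism the distillation framework above builds on.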
Compared with previous works, this approach reduced latency and power consumption while improving feature alignment via transfer learning. Furthermore, considering a continual learning setting, I introduced a novel tangent kernel loss as an effective approach for a transfer learning framework. Then, I integrated the approach based on tangent kernel loss into ONNs to form a unified pipeline, Neural Tangent Knowledge Distillation (NTKD). This task-agnostic and hardware-agnostic framework supports image classification and segmentation across diverse optical systems. Experiments on multiple datasets and hardware configurations show that the NTKD pipeline consistently enhances accuracy and enables practical deployment in both pre-fabrication simulations and physical implementations.

Item: Wireless Energy Systems in Extreme Environments: New Solutions for Earth- and Space-Based Wireless Power Transmission and Low-Power Wireless Communications Systems (2026-02-05). Garman, Shanti M.; Smith, Joshua R.

As global space agencies and private companies pursue ambitious lunar exploration programs, including NASA's Artemis initiative, establishing sustainable operations on the Moon requires innovative solutions to overcome extreme environmental conditions, power scarcity, and radio frequency (RF) interference constraints. This dissertation addresses critical technological challenges for lunar and planetary missions through four interconnected research contributions in wireless power transfer, RF energy harvesting, and low-power and low-emissions communications. First, this work investigates magnetic coupling behavior in wireless power transfer systems operating in the presence of lunar regolith simulant enriched with iron nanoparticles. Findings reveal that the particle size and skin depth of the metallic iron content are critical parameters for electromagnetic coupling, with implications for modeling accuracy in future lunar missions.
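The particle-size versus skin-depth comparison implied above can be sketched directly from the classical skin-depth formula. The frequency and the material properties for metallic iron below are assumed, order-of-magnitude values (they vary widely with purity and field level):

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability [H/m]

def skin_depth(freq_hz, sigma, mu_r):
    """Classical skin depth delta = 1 / sqrt(pi * f * mu0 * mu_r * sigma) [m]."""
    return 1.0 / math.sqrt(math.pi * freq_hz * MU0 * mu_r * sigma)

# Assumed, illustrative properties for metallic iron:
SIGMA_FE = 1.0e7   # conductivity [S/m]
MU_R_FE = 200.0    # relative permeability

# At an assumed inductive-WPT frequency of 100 kHz, the skin depth in
# iron comes out in the tens of micrometers -- far larger than
# nanoparticle radii, so nm-scale iron inclusions are penetrated
# almost uniformly by the magnetic field.
delta = skin_depth(100e3, SIGMA_FE, MU_R_FE)
```

The qualitative point is the ratio: when the particle radius is much smaller than the skin depth, eddy-current shielding inside each particle is weak, so particle size and skin depth must be modeled together to predict the coupling.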
These results extend beyond space applications to benefit terrestrial systems including ground-penetrating radar and wireless power networks. Second, a high-power RF energy harvesting prototype is demonstrated, delivering nearly 3 W of DC power (orders of magnitude beyond traditional RF harvesters) using an array architecture operating in the ultra-high frequency band, with validation through real-world cellular site demonstrations. Third, a low-power wireless communication system using modulated Johnson noise (MJN) is developed and tested with WISP 6 RFID tags, achieving 100% data transmission accuracy up to 10 cm without requiring a generated RF carrier, thereby reducing system power requirements. Finally, MJN-based wireless communication is extended to lunar surface vehicles by investigating electromagnetic field coupling with lunar regolith and its effects on MJN system performance. This work aims to enable autonomous lunar rovers to communicate wirelessly while minimizing RF interference near sensitive radioastronomy installations and maximizing mission longevity in resource-constrained environments. An important contribution of this work is the development of new analytical models for range scaling of noise power, signal-to-noise ratio (SNR), and channel capacity for MJN wireless communications systems.

Item: Articutool: Proactive Verification and Decoupled Control for Robust Robot-Assisted Feeding (2026-02-05). Jaime Martinez, Jose; Srinivasa, Siddhartha S; Burden, Sam.

For individuals with motor impairments, general-purpose assistive robots can offer increased independence. However, the practical utility of such systems can be undermined if they are unable to reliably handle common foods, leading to spillage that negatively impacts the user’s dining experience. We propose that decoupling gross arm transport from fine-grained tool manipulation can enhance reliability.
To this end, this paper introduces the Articutool: a modular, untethered, and locally intelligent 2-DOF wrist that a 6-DOF arm can temporarily equip to form a decoupled 8-DOF system. This decoupled approach separates the concerns of gross arm transport from fine-grained tool manipulation, empowering the arm’s planner to find robust paths while the tool’s onboard controller maintains utensil orientation. Our “plan-then-verify” control methodology proactively checks the arm’s plans against the Articutool’s kinematic and dynamic limitations to reduce the likelihood of spills before they happen. Our large-scale simulation benchmark, which isolates the challenging constrained-transport phase of feeding, demonstrates that this decoupled approach achieves a 96.0% transport planning success rate with a median planning time of 4.0 seconds. While monolithic baselines can achieve comparable success rates given sufficient computation time, they are orders of magnitude slower (median 75.7 s for 8-DOF), rendering them impractical for real-time interaction. Physical experiments confirm these findings, showing that the system can successfully acquire challenging foods such as noodles and liquids, and achieves a 70.0% meaningful success rate (delivering a spill-free bite that meets an empirically defined mass threshold) on the end-to-end feeding task, a task on which the baseline’s meaningful success rate was only 10.0%. This work serves as a critical step toward an ecosystem of intelligent, task-specific tools for more capable, general-purpose assistive robots.

Item: Hybrid-integrated photonics platform for quantum networks based on defects in diamond (2026-02-05). Yama, Nicholas Sako; Fu, Kai-Mei C.

Optically active defects in solid-state hosts such as diamond are a promising platform for quantum technologies.
In such a platform, quantum information is encoded in the localized spin states of the defect system while photons are utilized to mediate the long-range transfer of information over the quantum network. The efficient interfacing of network photons and individual defects is consequently a fundamental requirement. This may be achieved by integrating defects into photonic circuits and leveraging the enhanced control afforded by cavity quantum electrodynamics (QED). At the same time, this approach also offers a straightforward means of scalability as these circuits can be densely packed with all the necessary functionalities onto a single chip. The development of a photonics platform capable of realizing such functionality and scalability, however, remains elusive. In this thesis we develop a scalable hybrid-integrated quantum photonics platform based on gallium phosphide (GaP)-on-diamond. We demonstrate the first integrated photonic devices in boron-doped GaP --- a scalable source of GaP grown commercially at 12-inch wafer scale --- defining a path forward for the development of large-scale GaP-on-diamond photonics. Leveraging the strong nonlinear properties of GaP, we describe novel nonlinear photonic devices which utilize resonant enhancement to achieve high-efficiency frequency conversion in compact and scalable integrated devices. We then develop a technique for enhancing the bandwidth of these devices, enabling them to be utilized in practical quantum networking applications. Finally, we develop a cavity-QED platform integrating single silicon-vacancy (SiV) centers with one-dimensional GaP-on-diamond photonic crystal (PhC) cavities. These cavities are integrated by stamp transfer and do not require any additional diamond substrate processing, enabling straightforward scalability. We then specialize PhC cavity design principles to hybrid-integrated devices, improving the PhC design metrics by nearly two orders of magnitude. 
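As context for the cavity-QED results above and below, the standard cooperativity figure of merit for an emitter-cavity system can be sketched as follows. The rates used are illustrative placeholders, not measured parameters of the GaP-on-diamond devices:

```python
def cooperativity(g, kappa, gamma):
    """Emitter-cavity cooperativity C = 4 g^2 / (kappa * gamma), with the
    coupling rate g, cavity decay rate kappa, and emitter linewidth gamma
    all expressed in the same frequency units. C > 1 is the usual
    threshold for the high-cooperativity regime, where the cavity-coupled
    spin imposes a strong, state-dependent response on scattered light."""
    return 4.0 * g ** 2 / (kappa * gamma)

# Purely illustrative numbers (assumed, in Hz-like units):
g = 2e9        # single-photon coupling rate
kappa = 50e9   # cavity energy decay rate
gamma = 0.1e9  # emitter linewidth

C = cooperativity(g, kappa, gamma)
```

The design leverage mentioned in the text follows directly from this expression: improving PhC design metrics that raise g or lower kappa (e.g., smaller mode volume, higher quality factor) increases C quadratically and linearly, respectively.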
We fabricate these devices and demonstrate spin-dependent scattering in the high-cooperativity regime: a requirement for cavity-QED-based devices. Altogether, these results demonstrate the necessary components for realizing scalable photonic interfaces for defect qubit systems, establishing GaP-on-diamond as a promising platform for the development of quantum technologies.

Item: CARE - Clinician Augmented Reality Environment: Developing an Apple Vision Pro Framework for Image-guided Surgeries (2026-02-05). Wang, Ze Xia Lucas; Hussein, Rania.

In the evolving landscape of surgical specialties, the shift toward minimally invasive procedures has become increasingly prominent, with most surgeries now being guided by imaging techniques such as laparoscopy, endoscopy, and fluoroscopy. Monitors have become essential for these procedures, yet their placement and visibility present significant challenges in terms of user experience and ergonomics, contributing to musculoskeletal disorders affecting 50-85% of practicing surgeons. Operating room staff continually seek innovations that simplify the environment, improve the surgical experience, and enhance patient outcomes. Augmented Reality (AR) tools offer a promising solution by allowing optimal placement of virtual monitors, reducing the physical constraints and ergonomic challenges of traditional monitor placement. We are developing the Clinician Augmented Reality Environment (CARE) platform, a wireless streaming platform paired with an Apple Vision Pro AR headset software application, to enable wireless casting of any video source directly into the surgeon's field of view. Our implementation achieved streaming latencies under 71 milliseconds for 1080p video, meeting established requirements for real-time surgical applications.
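One way to reason about an end-to-end latency target like this is a per-stage budget. The stage names and numbers below are hypothetical assumptions for illustration, not measurements from the CARE platform:

```python
# Hypothetical per-stage latency budget [ms] for a 1080p wireless
# casting pipeline. Every entry is an assumed value, chosen only to
# illustrate how a sub-71 ms end-to-end target can be apportioned:
budget_ms = {
    "capture": 8.0,    # frame grab from the video source
    "encode": 12.0,    # hardware video encode
    "network": 20.0,   # wireless transport, including jitter margin
    "decode": 10.0,    # headset-side decode
    "render": 16.0,    # compositing and display refresh (~60 Hz frame)
}

total_ms = sum(budget_ms.values())
assert total_ms < 71.0, "budget exceeds the real-time target"
```

Framing the target this way makes clear which stage dominates (here, assumed to be the network hop) and therefore where optimization effort buys the most headroom.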
This thesis establishes a user-centered design methodology grounded in comprehensive interviews with 23 clinical and engineering stakeholders, analyzes the technical performance of our platform in surgical contexts, and evaluates the potential for clinical deployment and commercialization of AR-based surgical technologies.

Item: Applications of Metasurfaces in Endoscope and Hyperspectral Imaging (2026-02-05). Xie, Ningzhi; Böhringer, Karl F.; Majumdar, Arka.

Metasurfaces are ultrathin arrays of subwavelength scatterers that offer versatile control over light within a wavelength-scale thickness. Their ability to condense complex optical functions into compact forms makes them particularly promising for miniaturized imaging systems, where conventional optics face challenges of bulk and limited scalability. This thesis presents several research projects that explore the integration of metasurfaces into endoscopic imaging and hyperspectral imaging, addressing key challenges in device miniaturization, resolution, and spectral functionality. The first part of the thesis focuses on scanning fiber endoscopes (SFEs), which are among the most compact scanning-based endoscopes. Two projects demonstrate the replacement of conventional refractive lens assemblies with metasurface-based flat lenses (metalenses). A monochromatic near-infrared metalens was designed, fabricated, and experimentally validated, achieving diffraction-limited performance and significantly reducing optical track length compared to refractive optics. Building on this, a polychromatic metalens was developed to enable tri-color RGB imaging, overcoming the intrinsic dispersion of conventional metalenses and delivering near-diffraction-limited resolution across multiple wavelengths. These studies highlight the potential of metalenses to enable highly compact endoscopic systems with improved imaging performance.
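A metalens design is usually specified by a target phase profile. A minimal sketch of the ideal hyperbolic profile and the resulting diffraction-limited spot estimate follows; the wavelength, focal length, and aperture are assumed illustrative values, not the fabricated device's parameters:

```python
import math

def metalens_phase(r, focal_length, wavelength):
    """Ideal hyperbolic metalens phase profile (radians):
    phi(r) = -(2*pi/lambda) * (sqrt(r^2 + f^2) - f),
    chosen so that rays from every radius r arrive at the focus in phase."""
    return -(2.0 * math.pi / wavelength) * (
        math.sqrt(r * r + focal_length * focal_length) - focal_length
    )

# Illustrative (assumed) design point:
wl = 850e-9             # near-infrared design wavelength [m]
f = 1.0e-3              # focal length [m]
aperture_radius = 0.25e-3

# Numerical aperture and a standard diffraction-limited spot estimate:
na = aperture_radius / math.sqrt(aperture_radius ** 2 + f ** 2)
spot = 0.5 * wl / na    # ~ lambda / (2 * NA)
```

The subwavelength scatterers of the metasurface then only need to realize this phase map (modulo 2 pi) at each position, which is what allows a flat, micrometers-thick element to replace a refractive lens stack.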
The second part of the thesis extends metasurface functionality beyond focusing to spectral encoding. A metasurface–Fabry–Pérot cavity array was designed as a spatial-to-spectral encoder, enabling the transmission of multi-pixel image information through a single fiber core without scanning. This proof-of-concept demonstrates the feasibility of spectrally encoded, non-scanning endoscopic imaging, offering a pathway to surpass the resolution limits imposed by fiber pixel density in endoscopic imaging. The final part explores hyperspectral imaging, where metasurfaces are used as spectral code masks for compressive sensing–based reconstruction. A metasurface code mask was optimized to encode full hyperspectral datasets into single-shot grayscale images, which were computationally reconstructed to recover high-resolution hyperspectral information. Experimental demonstrations validate this system as a compact, efficient, and high-speed alternative to conventional hyperspectral imagers. Together, these works establish metasurfaces as powerful optical platforms for advancing miniaturized endoscopy and hyperspectral imaging, demonstrating the transformative potential of metasurfaces in next-generation biomedical and imaging technologies.

Item: Design and Radiometric Modeling of a Portable EEM Fluorescence Sensor for ppb-Level Detection of Pesticide Mixtures in Water (2026-02-05). Holterhoff, Nathan Paul; Mamishev, Alexander.

According to the U.S. Geological Survey (USGS), pesticide contamination of American waters is widespread, with typical samples containing mixtures of 10 to 20 active compounds. Recent studies show that agricultural runoff and seasonal application patterns are two of the most common sources of this contamination. Environmental monitoring studies help improve understanding of the distribution and persistence of pesticides in natural water systems.
Enhanced detection tools are critical for environmental monitoring studies focused on collecting and analyzing water quality data. Traditional pesticide measurement methods include solvent extraction and chromatographic separation, which introduce problems such as 1) high cost per sample, 2) slow turnaround time, and 3) limited suitability for field deployment. Recent advancements in fluorescence spectroscopy have allowed for the development of various portable measurement techniques in different applications. However, environmental agencies still rely on laboratory-based analysis rather than portable optical measurement tools, which demonstrates that there is significant room for improvement in this field. The detection of pesticides using excitation-emission matrix (EEM) fluorescence requires accurate photon throughput calculation using component-based or radiometric modeling techniques. This thesis is a study of the design, modeling, and validation of an EEM fluorescence system based on multi-wavelength excitation theory. The system was designed, modeled, and evaluated in pesticide detection applications using three representative compounds: zeta-cypermethrin, myclobutanil, and glyphosate. The compounds were tested at five concentration levels to characterize the system across different detection scenarios. When compared to model predictions, the experimental results showed detection limits of 10-100 ppb for strongly fluorescent pesticides, approximately one order of magnitude above predicted values due to lower LED power than modeled. Based on the results and validation from the radiometric model, the use of compact EEM fluorescence systems in portable applications has the potential to improve the frequency and cost-effectiveness of pesticide screening.

Item: Automated Analog Layout Methodologies for Mixed-Signal SoC Implementation (2025-10-02). Liu, Xindi; Shi, C.-J. Richard.

This dissertation presents automated analog layout methodologies for high-performance mixed-signal SoCs. A template-based generator produces block-specific layouts in minutes, coupled with batch-mode verification for fast optimization. The LEGO methodology builds a process-portable analog standard cell library, in which verified fixed-dimension cells can be tiled to form larger systems. Using this approach, two interface designs, an Advanced Interface Bus (AIB) and an Electronic Integrated Circuit (EIC), demonstrate modularity, tight matching, and rapid integration. These methodologies enable efficient analog layout generation while preserving critical performance and symmetry in complex AMS designs.

Item: Contact-less Object Handling: Manipulation and Sensing Methods for Acoustic Levitation Systems (2025-10-02). Nakahara, Jared; Smith, Joshua R.

This dissertation will discuss the design, control, and sensing capabilities of acoustic levitation systems for laboratory automation, providing a flexible and programmable method for contact-less sample manipulation. Conventional tools like pipettes and well plates rely on surface contact to handle small volumes of liquids and reagents. This introduces risks of contamination, sample loss, and measurement uncertainty. Assays and laboratory experiments can require hundreds or even thousands of steps, which can be labor-intensive if done by hand. Automated liquid handling systems can help streamline labor-intensive workflows, alleviate labor shortages, and free up scientists to focus on tasks that benefit from individual training and expertise. These traditional tools also generate significant plastic waste and often require separate, siloed systems for manipulation and sensing, increasing operational complexity and benchtop footprint.
Acoustic levitation offers an alternative by using high-intensity ultrasonic waves to trap, lift, move, mix, and measure solids, liquids, and even living organisms in mid-air without any physical contact. This eliminates contamination risk and enables continuous, real-time monitoring of samples from the moment of mixing through analysis. Advances in computing power, acoustic field modeling, and holographic beamforming have dramatically expanded the capabilities of acoustic levitation systems. This work draws from fields such as ultrasonic holography, robotics, and sensor fusion to create programmable systems capable of vessel-free containment, automated liquid handling, and embedded sensing. The system architecture discussed in this dissertation integrates control algorithms, custom hardware, and on-board sensing to enable a contact-less liquid handling instrument: one that merges manipulation and measurement in a unified platform. This dissertation will discuss contributions in acoustic levitator sensing, control and manipulation, including the ability to pick up objects from acoustically reflective surfaces, weigh droplets and particles in air using trap dynamics, translate levitated objects via frequency modulation, dispense droplets of liquid into the levitator with an acoustic launching system and control the shape of levitated liquid droplets. When applied to disciplines like chemistry, pharmaceuticals, and molecular biology, this approach could provide a contact-less robotic liquid handling solution that is programmable while minimizing contamination, eliminating consumables, reducing human error, and enabling new opportunities for automation. Acoustic levitation systems thus represent not just a new tool, but a possible shift in how laboratories can conduct precise, clean, and scalable research.
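The idea of weighing droplets and particles via trap dynamics, mentioned above, can be sketched under a simple harmonic-trap assumption. The trap stiffness and oscillation frequency below are illustrative assumptions, not calibrated values from this system:

```python
import math

def mass_from_trap_dynamics(osc_freq_hz, trap_stiffness):
    """Treating the acoustic trap as a harmonic potential, a levitated
    object oscillates at f = (1/(2*pi)) * sqrt(k/m); inverting gives
    m = k / (2*pi*f)^2. In practice the stiffness k would come from
    calibration and f from observing the trapped object's motion."""
    omega = 2.0 * math.pi * osc_freq_hz
    return trap_stiffness / omega ** 2

# Example with assumed numbers: a trap stiffness of 1e-3 N/m and an
# observed vertical oscillation at 100 Hz imply a mass on the order
# of a few micrograms.
m = mass_from_trap_dynamics(100.0, 1e-3)
```

This is the contact-free analogue of a spring balance: halving the observed oscillation frequency at fixed stiffness implies four times the mass, so tracking trap dynamics alone can serve as an in-air scale.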
