Knowledge Transfer from Deep Electronic Networks to Optical Neural Networks
Abstract
Optical Neural Networks (ONNs) offer a promising alternative to electronic networks for artificial intelligence computing: by leveraging the speed of light, they provide lower power consumption and latency. However, implementing ONNs remains challenging due to the high energy cost of nonlinear operations and the precise alignment required for multi-layer optical systems. Previous research introduced hybrid approaches that combine an optical frontend for fast computation with an electronic backend for nonlinear processing. While end-to-end optimization of hybrid ONNs has been demonstrated on specific datasets and optical configurations, these approaches typically lack generalization across tasks and hardware designs, primarily because the optical frontend cannot reliably mimic the feature extraction capabilities of state-of-the-art electronic networks.

In my research, I proposed transferring knowledge from electronic networks to hybrid electro-photonic convolutional neural networks, enabling the optical frontend to capture features similar to those of electronic networks while simplifying the model architecture. I trained the hybrid network in a teacher-student transfer learning framework, in which a nonlinear electronic teacher network guided the optical frontend to learn features while circumventing nonlinearity. Next, I collaborated with colleagues to compress the convolutional layers of electronic networks (e.g., AlexNet) into a single layer, reducing the need for precise optical alignment and lowering computational cost. Compared with previous work, this approach reduced latency and power consumption while improving feature alignment via transfer learning. Furthermore, in a continual learning setting, I introduced a novel tangent kernel loss as an effective objective for transfer learning. I then integrated this tangent-kernel-based approach into ONNs to form a unified pipeline, Neural Tangent Knowledge Distillation (NTKD).
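The tangent kernel loss described above can be illustrated with a small sketch (my own illustration under toy assumptions, not code from the thesis): for a one-hidden-layer ReLU network, the empirical neural tangent kernel is the Gram matrix of per-example parameter Jacobians, and a tangent-kernel loss can penalize the distance between the student's and teacher's kernels. All network sizes and variable names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def jacobian(x, W, v):
    # Jacobian of the scalar model f(x) = v^T relu(W x) with respect to
    # all parameters (W, v), flattened into one vector for this input.
    h = np.maximum(W @ x, 0.0)            # hidden activations
    mask = (W @ x > 0).astype(float)      # ReLU derivative
    dW = np.outer(v * mask, x)            # df/dW
    dv = h                                # df/dv
    return np.concatenate([dW.ravel(), dv])

def empirical_ntk(X, W, v):
    # Empirical NTK on a batch: Gram matrix of per-example Jacobians.
    J = np.stack([jacobian(x, W, v) for x in X])
    return J @ J.T

# Toy batch plus teacher and student parameters (illustrative assumptions).
X = rng.standard_normal((8, 5))
W_t = rng.standard_normal((6, 5)); v_t = rng.standard_normal(6)  # teacher
W_s = rng.standard_normal((6, 5)); v_s = rng.standard_normal(6)  # student

K_t = empirical_ntk(X, W_t, v_t)
K_s = empirical_ntk(X, W_s, v_s)

# One possible tangent-kernel loss: Frobenius distance between the kernels.
ntk_loss = float(np.linalg.norm(K_s - K_t, ord="fro"))
```

In a training loop, this loss would be minimized with respect to the student's parameters alongside the usual task loss; here it is only evaluated once to show the quantity being matched.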
This task-agnostic and hardware-agnostic framework supports image classification and segmentation across diverse optical systems. Experiments on multiple datasets and hardware configurations show that the NTKD pipeline consistently improves accuracy and enables practical deployment in both pre-fabrication simulations and physical implementations.
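The teacher-student idea behind the framework can be sketched minimally (an assumption-laden illustration, not the thesis implementation): a nonlinear "electronic teacher" produces target features, and a purely linear student, standing in for the nonlinearity-free optical frontend, is fit to reproduce them by minimizing a feature-matching mean squared error.

```python
import numpy as np

rng = np.random.default_rng(0)

def teacher_features(x, W1, W2):
    # Nonlinear electronic teacher: two layers with a ReLU in between.
    return np.maximum(x @ W1, 0.0) @ W2

# Toy inputs and teacher weights (illustrative assumptions).
X = rng.standard_normal((256, 32))
W1 = rng.standard_normal((32, 64)) / np.sqrt(32)
W2 = rng.standard_normal((64, 16)) / np.sqrt(64)
T = teacher_features(X, W1, W2)       # teacher feature targets

# Student: a single linear map, a stand-in for the linear optical frontend.
# Least squares gives the exact minimizer of the feature-distillation MSE.
W_s, *_ = np.linalg.lstsq(X, T, rcond=None)
S = X @ W_s                           # student features

mse = float(np.mean((S - T) ** 2))    # residual feature-matching error
```

The residual error shows what a linear frontend cannot capture of the nonlinear teacher; the thesis's distillation and tangent-kernel machinery are aimed at closing exactly this kind of gap.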
Description
Thesis (Ph.D.)--University of Washington, 2025
