Interpretation and Optimization of Recurrent Neural Network Performance Through Lyapunov Exponents Methodology

dc.contributor.advisor: Shlizerman, Eli
dc.contributor.author: Vogt, Ryan Andrew
dc.date.accessioned: 2023-08-14T17:01:49Z
dc.date.available: 2023-08-14T17:01:49Z
dc.date.issued: 2023-08-14
dc.date.submitted: 2023
dc.description: Thesis (Ph.D.)--University of Washington, 2023
dc.description.abstract: Recurrent Neural Networks (RNNs) are common deep learning models for learning from multivariate time series data. These models are ubiquitous computing systems that have been studied for decades. The propagation of gradients over long time sequences can make training RNNs particularly challenging and difficult to interpret. The hidden-state dynamics of RNNs can be viewed as a non-autonomous dynamical system and analyzed with dynamical systems tools. In this work, we leverage Lyapunov Exponents, a dynamical systems tool that measures the rate at which nearby trajectories expand or contract over time, to analyze the propagation of information in RNNs and relate these properties to RNN training and performance. We show that several statistics of the Lyapunov spectrum have moderate correlation with network loss on both classification and regression tasks, and that these statistics emerge early in training. We also train an autoencoder to learn the relation between the full Lyapunov spectrum and an RNN's loss on given tasks. The latent representation of the autoencoder distinguishes between high- and low-accuracy networks across a variety of network hyperparameters, including initialization parameter, network size, and network architecture, more effectively than direct statistics of the Lyapunov spectrum. To further analyze the Lyapunov Exponents of RNNs from a theoretical perspective, we derive a direct expression for the gradient in terms of the components of the RNN's Lyapunov Exponent computation, which measure the directions (Q vectors) and factors (R scalars) of expansion and contraction over a sequence. We find that the Q vectors associated with the greatest degree of expansion become increasingly aligned with the dominant directions of the gradient extracted by singular value decomposition. Furthermore, we show that the predictions generated by an RNN are most affected by input perturbations at the moments at which the R values are maximal. These results showcase the correlation between dynamical systems stability theory for RNNs, network performance, and loss gradients. This may open the way to designing hyperparameter optimization algorithms and adaptive training methods that account for state-space dynamics, as measured by Lyapunov Exponents, to improve computation. It may also provide a unifying dynamical systems framework to study RNN performance across network architectures and tasks.
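For readers unfamiliar with the methodology named in the abstract, the following is a minimal NumPy sketch of how the Lyapunov spectrum of an RNN can be estimated: an orthonormal frame is pushed through the hidden-state Jacobians, and a QR decomposition at each step yields the expansion directions (Q) and expansion factors (R) whose accumulated logarithms give the exponents. The vanilla tanh RNN, the random weights and inputs, and the name `lyapunov_spectrum` are illustrative assumptions for this sketch, not the code used in the thesis.

```python
import numpy as np

def lyapunov_spectrum(W, U, b, inputs, h0=None):
    """Estimate the Lyapunov exponents of h_{t+1} = tanh(W h_t + U x_t + b)."""
    n = W.shape[0]
    h = np.zeros(n) if h0 is None else h0
    Q = np.eye(n)            # orthonormal frame tracking expansion directions
    log_r = np.zeros(n)      # running sum of log expansion factors
    for x in inputs:
        h = np.tanh(W @ h + U @ x + b)
        J = (1.0 - h**2)[:, None] * W      # Jacobian dh_{t+1}/dh_t for the tanh RNN
        Q, R = np.linalg.qr(J @ Q)         # re-orthonormalize; diag(R) holds stretch factors
        log_r += np.log(np.abs(np.diag(R)) + 1e-12)
    # Average log stretch per step, sorted from largest to smallest exponent
    return np.sort(log_r / len(inputs))[::-1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_hidden, n_input, T = 64, 8, 2000
    W = rng.normal(scale=1.0 / np.sqrt(n_hidden), size=(n_hidden, n_hidden))
    U = rng.normal(scale=1.0 / np.sqrt(n_input), size=(n_hidden, n_input))
    b = np.zeros(n_hidden)
    xs = rng.normal(size=(T, n_input))
    les = lyapunov_spectrum(W, U, b, xs)
    print("largest Lyapunov exponent:", les[0])
```

A negative largest exponent indicates contracting (stable) hidden-state dynamics, while a positive one indicates expansion of nearby trajectories; the thesis relates statistics of this spectrum to training loss and gradients.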
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Vogt_washington_0250E_25670.pdf
dc.identifier.uri: http://hdl.handle.net/1773/50205
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: Artificial Intelligence
dc.subject: Dynamical Systems
dc.subject: Machine Learning
dc.subject: Recurrent Neural Networks
dc.subject: Mathematics
dc.subject: Computer science
dc.subject.other: Applied mathematics
dc.title: Interpretation and Optimization of Recurrent Neural Network Performance Through Lyapunov Exponents Methodology
dc.type: Thesis

Files

Original bundle

Name: Vogt_washington_0250E_25670.pdf
Size: 23.07 MB
Format: Adobe Portable Document Format