Interpretation and Optimization of Recurrent Neural Network Performance Through Lyapunov Exponents Methodology

dc.contributor.advisor: Shlizerman, Eli
dc.contributor.author: Vogt, Ryan Andrew
dc.date.accessioned: 2023-08-14T17:01:49Z
dc.date.available: 2023-08-14T17:01:49Z
dc.date.issued: 2023-08-14
dc.date.submitted: 2023
dc.description: Thesis (Ph.D.)--University of Washington, 2023
dc.description.abstract: Recurrent Neural Networks (RNNs) are common deep learning models for learning from multivariate time series data. These models are ubiquitous computing systems that have been studied for decades. The propagation of gradients over long time sequences can make training RNNs particularly challenging and difficult to interpret. The hidden-state dynamics of RNNs can be viewed as a non-autonomous dynamical system and analyzed with dynamical systems tools. In this work, we leverage Lyapunov Exponents, a dynamical systems tool that measures the rate at which nearby trajectories expand or contract over time, to analyze the propagation of information in RNNs and relate these properties to RNN training and performance. We show that several statistics of the Lyapunov spectrum have moderate correlation with network loss on both classification and regression tasks, and that these statistics emerge early in training. We also train an autoencoder to learn the relation between the full Lyapunov spectrum and an RNN's loss on given tasks. The latent representation of the autoencoder distinguishes between high- and low-accuracy networks across a variety of network hyperparameters, including initialization parameter, network size, and network architecture, more effectively than direct statistics of the Lyapunov spectrum. To further analyze the Lyapunov Exponents of RNNs from a theoretical perspective, we derive a direct expression for the gradient in terms of the components of the RNN's Lyapunov Exponent computation, which measure the directions (Q vectors) and factors (R scalars) of expansion and contraction over a sequence. We find that the Q vectors associated with the greatest degree of expansion become increasingly aligned with the dominant directions of the gradient extracted by singular value decomposition. Furthermore, we show that the predictions generated by an RNN are most affected by input perturbations at the moments at which the R values are maximal. These results showcase the correlation between dynamical systems stability theory for RNNs, network performance, and loss gradients. This may open the way to designing hyperparameter optimization algorithms and adaptive training methods that account for state-space dynamics, as measured by Lyapunov Exponents, to improve computation. It may also provide a unifying dynamical systems framework to study RNN performance across network architectures and tasks.
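For readers unfamiliar with the methodology named in the abstract, the following is a minimal NumPy sketch of how the Lyapunov spectrum of an RNN can be estimated: an orthonormal frame is pushed through the hidden-state Jacobians, and a QR decomposition at each step yields the expansion directions (Q) and expansion factors (R) whose accumulated logarithms give the exponents. The vanilla tanh RNN, the random weights and inputs, and the name `lyapunov_spectrum` are illustrative assumptions for this sketch, not the code used in the thesis.

```python
import numpy as np

def lyapunov_spectrum(W, U, b, inputs, h0=None):
    """Estimate the Lyapunov exponents of h_{t+1} = tanh(W h_t + U x_t + b)."""
    n = W.shape[0]
    h = np.zeros(n) if h0 is None else h0
    Q = np.eye(n)            # orthonormal frame tracking expansion directions
    log_r = np.zeros(n)      # running sum of log expansion factors
    for x in inputs:
        h = np.tanh(W @ h + U @ x + b)
        J = (1.0 - h**2)[:, None] * W      # Jacobian dh_{t+1}/dh_t for the tanh RNN
        Q, R = np.linalg.qr(J @ Q)         # re-orthonormalize; diag(R) holds stretch factors
        log_r += np.log(np.abs(np.diag(R)) + 1e-12)
    # Average log stretch per step, sorted from largest to smallest exponent
    return np.sort(log_r / len(inputs))[::-1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_hidden, n_input, T = 64, 8, 2000
    W = rng.normal(scale=1.0 / np.sqrt(n_hidden), size=(n_hidden, n_hidden))
    U = rng.normal(scale=1.0 / np.sqrt(n_input), size=(n_hidden, n_input))
    b = np.zeros(n_hidden)
    xs = rng.normal(size=(T, n_input))
    les = lyapunov_spectrum(W, U, b, xs)
    print("largest Lyapunov exponent:", les[0])
```

A negative largest exponent indicates contracting (stable) hidden-state dynamics, while a positive one indicates expansion of nearby trajectories; the thesis relates statistics of this spectrum to training loss and gradients.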
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Vogt_washington_0250E_25670.pdf
dc.identifier.uri: http://hdl.handle.net/1773/50205
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: Artificial Intelligence
dc.subject: Dynamical Systems
dc.subject: Machine Learning
dc.subject: Recurrent Neural Networks
dc.subject: Mathematics
dc.subject: Computer science
dc.subject.other: Applied mathematics
dc.title: Interpretation and Optimization of Recurrent Neural Network Performance Through Lyapunov Exponents Methodology
dc.type: Thesis

Files

Original bundle

Name: Vogt_washington_0250E_25670.pdf
Size: 23.07 MB
Format: Adobe Portable Document Format