Computational methods for system identification and data-driven forecasting

Rudy, Samuel

Files

Rudy_washington_0250E_19997.pdf (9.28 MB)

Date

2019-08-14

Abstract

This thesis develops several novel computational tools for system identification and data-driven forecasting. The material is divided into four chapters: data-driven identification of partial differential equations, neural network interpolation of velocity field data from trajectory measurements, smoothing of high-dimensional nonlinear time series, and an application of data-driven forecasting in biology. We first develop a novel computational method for identifying partial differential equations (PDEs) from measurements in the spatio-temporal domain. Building on past methods in sparse regression, we formulate a regression problem to select the active terms of a PDE from a large library of candidate basis functions. In contrast to many data-driven forecasting methods, the proposed algorithm yields exact representations of the dynamics. This has the advantage of allowing for future state prediction from novel initial and boundary conditions as well as rigorous mathematical analysis. The method is also extended to the case where coefficients vary in either space or time. We demonstrate the ability to accurately learn the correct active terms and their magnitudes on a variety of canonical PDEs. We also develop a method for interpolating the velocity fields of smooth dynamical systems using neural networks. We specifically focus on addressing the issue of learning from noisy and limited data. We construct a cost function for training neural network interpolations of velocity fields from trajectory measurements that explicitly accounts for measurement noise. The need to numerically differentiate data is avoided by placing the neural network interpolation of velocity within an explicit timestepping scheme and training as a flow map rather than directly on the velocity field. The proposed framework is shown to be capable of learning accurate forecasting models even when data is corrupted by significant levels of noise.
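The term-selection step described above for PDE identification can be illustrated with a small sketch. It is not taken from the thesis: it assumes sequentially thresholded least squares as the sparse regression heuristic, a hypothetical six-term candidate library, finite-difference derivatives, and noise-free synthetic data from the heat equation u_t = D u_xx with D = 0.5. The names stlsq, D_TRUE, and library are illustrative only.

```python
import numpy as np

def stlsq(theta, target, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares (assumed heuristic, not the thesis code)."""
    xi = np.linalg.lstsq(theta, target, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0                      # prune weak candidate terms
        active = ~small
        if not active.any():
            break
        # refit using only the surviving terms
        xi[active] = np.linalg.lstsq(theta[:, active], target, rcond=None)[0]
    return xi

# Synthetic data: exact two-mode solution of u_t = D u_xx on a periodic domain.
D_TRUE = 0.5
x = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)
t = np.linspace(0.0, 1.0, 101)
dx, dt = x[1] - x[0], t[1] - t[0]
X, T = np.meshgrid(x, t, indexing="ij")
u = np.exp(-D_TRUE * T) * np.sin(X) + 0.5 * np.exp(-4.0 * D_TRUE * T) * np.sin(2.0 * X)

# Finite-difference derivatives: periodic stencils in x, np.gradient in t.
u_t = np.gradient(u, dt, axis=1)
u_x = (np.roll(u, -1, axis=0) - np.roll(u, 1, axis=0)) / (2.0 * dx)
u_xx = (np.roll(u, -1, axis=0) - 2.0 * u + np.roll(u, 1, axis=0)) / dx**2

# Hypothetical candidate library; the first and last time samples are dropped
# because np.gradient uses less accurate one-sided differences there.
library = {"1": np.ones_like(u), "u": u, "u_x": u_x,
           "u_xx": u_xx, "u^2": u**2, "u*u_x": u * u_x}
names = list(library)
keep = slice(1, -1)
theta = np.column_stack([col[:, keep].ravel() for col in library.values()])

xi = stlsq(theta, u_t[:, keep].ravel())
for name, coef in zip(names, xi):
    if coef != 0.0:
        print(f"u_t term: {coef:.4f} * {name}")
```

On this clean data only the u_xx term should survive the thresholding, with a coefficient close to D. With noisy measurements, both the numerical differentiation and the threshold would need more care than this sketch takes.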
We also consider some limitations of using neural networks as forecasting models for dynamical systems. Using test problems with known dynamics, we show that neural networks are able to accurately interpolate a vector field only where data is collected and generally exhibit high generalization error. Some guidelines are proposed regarding the contexts in which neural networks may or may not be useful in practice. For datasets where dynamics are known either completely or up to a set of parameters, we develop a novel smoothing technique based on soft adherence to governing equations. The proposed method may be applicable to smoothing data from deterministic dynamical systems where high dimensionality or nonlinearity makes sequential Bayesian methods impractical. We test the method on several canonical problems from data assimilation and show that it is robust to exceptionally high levels of noise, to noise with non-zero mean, and to temporally autocorrelated noise. The last section of this thesis develops a data-driven forecasting model for the half-sarcomere, a small component of skeletal muscle tissue. Current models of the half-sarcomere require computationally expensive Monte Carlo simulations to resolve the effects of filament compliance. We seek to replicate the dynamic behavior realized by Monte Carlo simulation of the half-sarcomere at a lower cost. Drawing inspiration from surrogate and reduced order modeling, we apply a coarse graining to the variables tracked by the Monte Carlo simulation and learn a dynamics model on the coarse-grained variables using data. We find that the resulting data-driven model effectively reproduces force traces and dynamics of the coarse-grained state when given novel input parameters. Taken together, the innovations presented in this thesis represent a modest contribution to the field of data-driven methods for system identification and forecasting.
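The idea of smoothing by soft adherence to known governing equations can be illustrated on a toy problem. The sketch below is an assumption-laden simplification, not the thesis method: it assumes fully known linear dynamics x_{j+1} = A x_j (a planar rotation), so that minimizing sum_j ||x_j - y_j||^2 + lam * sum_j ||x_{j+1} - A x_j||^2 reduces to a single linear least-squares solve. The function name smooth_soft_dynamics and the weight lam are hypothetical.

```python
import numpy as np

def smooth_soft_dynamics(y, A, lam):
    """Minimize sum ||x_j - y_j||^2 + lam * sum ||x_{j+1} - A x_j||^2 over the x_j.

    Assumes known linear dynamics, so the problem is linear least squares.
    """
    n, d = y.shape
    eye = np.eye(d)
    # Each block row of G encodes the dynamics residual x_{j+1} - A x_j.
    G = np.zeros(((n - 1) * d, n * d))
    for j in range(n - 1):
        rows = slice(j * d, (j + 1) * d)
        G[rows, j * d:(j + 1) * d] = -A
        G[rows, (j + 1) * d:(j + 2) * d] = eye
    M = np.vstack([np.eye(n * d), np.sqrt(lam) * G])
    rhs = np.concatenate([y.ravel(), np.zeros((n - 1) * d)])
    z = np.linalg.lstsq(M, rhs, rcond=None)[0]
    return z.reshape(n, d)

# Toy system: rotation by a fixed angle per step (illustrative choice).
angle = 0.1
A = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
n = 200
x_true = np.empty((n, 2))
x_true[0] = [1.0, 0.0]
for j in range(n - 1):
    x_true[j + 1] = A @ x_true[j]

# Noisy measurements with noise comparable in size to the signal.
rng = np.random.default_rng(0)
y = x_true + 0.5 * rng.standard_normal((n, 2))

x_smooth = smooth_soft_dynamics(y, A, lam=100.0)

def rmse(a):
    return float(np.sqrt(np.mean((a - x_true) ** 2)))

print("noisy RMSE:   ", rmse(y))
print("smoothed RMSE:", rmse(x_smooth))
```

Larger values of lam pull the estimate toward an exact trajectory of the assumed dynamics, while smaller values trust the measurements more. For nonlinear dynamics the same objective would require an iterative optimizer instead of one linear solve.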
In the concluding chapter, we highlight several exciting directions that build upon and improve the research presented in this thesis.