Irreversibility in Stochastic Dynamic Models and Efficient Bayesian Inference

Ma, Yian

Files

Ma_washington_0250E_17586.pdf (6.34 MB)

Date

2017-08-11

Abstract

This thesis is the summary of an excursion around the topic of reversibility. We start the journey from a classical mechanical view of the “time reversal symmetry”: we look into the details to track the movements of all particles at all times and ask whether the entire system remains the same if both time and momentum flip signs. This description of a reversible process is the exact reflection of classical mechanics with a quadratic kinetic energy, which generates Boltzmann’s equilibrium thermodynamics. Unfortunately, it depends heavily on the coordinate system the variables reside in and automatically excludes processes with dissipation and/or fluctuation from being reversible. A related but slightly more relaxed scenario is that the dynamics conserve certain quantities. Fortunately, we are able to generalize thermodynamics to this broader range of systems. For the discussion of reversibility, however, we veer towards a direction that requires much less scrutiny and provides far more generality. We follow Kolmogorov’s footsteps and only study the statistics of the variables in question. Reversibility in that realm dictates that the probability of observing a path forward equals that of seeing a path backward. Interestingly, though, the aforementioned conservative dynamics are the source of irreversibility in stationarity. We then realize that the general Markov process can be decomposed into reversible and irreversible components, each preserving the entire process’ stationary distribution. This realization lets us continue along the path to develop thermodynamic theory for general stochastic processes and confirm the universal ideal behavior in Ornstein-Uhlenbeck processes. The realization also prompts us to continue our excursion further into applications. On the modeling side, we discover a way to analyze noise-induced phenomena in reaction-diffusion equations.
Stability and bifurcation analysis is brought into the stochastic models through the bridge of “effective dynamics”. We are able to quantitatively explain the onset of pattern formation introduced by chemical reaction noise. Looking over to the Bayesian inference side (for the learning of model parameters from data), we find ourselves in the position of digging into a critical problem: computation with stochasticity. As the de facto approaches for Bayesian inference, Markov chain Monte Carlo (MCMC) methods have always been criticized for their slow convergence (mixing rates) and the huge amount of computation required for large data sets (scalability). It has been discovered that the introduction of irreversibility increases the mixing of Markov processes. Using the decomposition of general Markov processes, we reparametrize the space of viable Markov processes for sampling purposes, so that the search for the correct MCMC algorithm turns into a game of plug and play with two matrices (or transition probabilities) to choose from. Irreversibility is automatically incorporated as one of the components to specify. Digging even deeper into a new world of scalable Bayesian inference, we start to make use of stochastic gradient techniques for excessively large data sets. With independent and identically distributed data, our previous results with continuous Markov processes can be revised to provide a complete recipe to construct new stochastic gradient MCMC algorithms. Within our recipe, we pick some of the nice attributes of the previous methods and combine them to form an algorithm that excels at learning topics in Wikipedia entries in a streaming manner. With correlated data, we find a huge void space to explore. As the first step, we visit time dependent data and harness the memory decay to generalize the stochastic gradient MCMC methods to hidden Markov models.
We find our method about 1,000 times faster than the traditional sampling method for an ion channel recording containing 209,634 observations.
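
The "two matrices" idea mentioned in the abstract can be illustrated with a small sketch. It is not taken from the thesis itself; it assumes the commonly stated form of the recipe, in which a symmetric positive definite matrix D sets the reversible (diffusion) part and a skew-symmetric matrix Q sets the irreversible part, and it further assumes both are constant so that no state-dependent correction term is needed. The function name `sample_with_recipe`, its parameters, and the use of exact gradients with a plain Euler-Maruyama step are all illustrative choices.

```python
import numpy as np


def sample_with_recipe(grad_H, D, Q, z0, step, n_steps, rng):
    """Euler-Maruyama discretization of dz = -(D + Q) grad H(z) dt + sqrt(2 D) dW.

    Assumptions (not from the source): D and Q are constant matrices,
    D is symmetric positive definite, Q is skew-symmetric, and grad_H
    returns the exact gradient of H = -log(target density) row by row.
    z0 has shape (n_chains, dim); all chains are advanced in parallel.
    """
    D = np.asarray(D, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if not np.allclose(D, D.T):
        raise ValueError("D must be symmetric")
    if not np.allclose(Q, -Q.T):
        raise ValueError("Q must be skew-symmetric")

    A = D + Q                      # combined drift matrix
    L = np.linalg.cholesky(D)      # L @ L.T == D, used to shape the noise
    z = np.array(z0, dtype=float)

    for _ in range(n_steps):
        xi = rng.standard_normal(z.shape)
        drift = -(grad_H(z) @ A.T)                  # row-wise -(D + Q) grad H(z)
        noise = np.sqrt(2.0 * step) * (xi @ L.T)    # covariance 2 * step * D
        z = z + step * drift + noise
    return z
```

As a usage example under the same assumptions, a standard normal target has `grad_H = lambda z: z`. Choosing `D = np.eye(2)` and `Q = np.zeros((2, 2))` gives a reversible Langevin sampler, while `Q = np.array([[0.0, 1.0], [-1.0, 0.0]])` adds a rotational, irreversible component that leaves the target distribution unchanged up to discretization error.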