Robust dynamic optimization: theory and applications

dc.contributor.advisor: Ghate, Archis
dc.contributor.author: Sinha, Saumya
dc.date.accessioned: 2018-11-28T03:14:41Z
dc.date.available: 2018-11-28T03:14:41Z
dc.date.issued: 2018-11-28
dc.date.submitted: 2018
dc.description: Thesis (Ph.D.)--University of Washington, 2018
dc.description.abstract: Many applications in decision-making use a dynamic optimization framework to model a system evolving uncertainly in discrete time, and an agent who chooses actions/controls from a set of available choices in order to minimize a suitable cost function. An important aspect of model formulation is the choice of input parameters. These are traditionally estimated from historical data and prior domain knowledge, and treated as known quantities in the decision-making process. This approach ignores any estimation errors or misspecification in the problem data, leading to potentially suboptimal solutions. Robust optimization addresses this issue by treating the parameters themselves as unknown quantities, known only to lie within some set of plausible values called the ‘uncertainty set’. The decision-maker then follows a conservative approach and minimizes a ‘worst-case’ cost over all possible values of the parameter. Problems of this nature are the subject of this dissertation.

The first chapter provides background on infinite-horizon Markov decision processes (MDPs) and the newsvendor model. MDPs are sequential decision-making problems with infinitely many decision epochs. At the end of every epoch, the next state of the system is prescribed via a transition probability depending on the current state and the action chosen. The robust formulation allows these transition probabilities to be unknown, and the decision-maker minimizes the maximum expected total discounted cost. A detailed analytical treatment of robust MDPs with bounded immediate costs, along with robust versions of the standard solution methods of value iteration and policy iteration, is available in the literature. However, these methods cannot be implemented when the state-space is countable. Further, no theoretical framework is available for the case when costs are unbounded. These issues are addressed in Chapters 2 to 4.
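The robust Bellman recursion sketched above can be illustrated in the simplest finite setting. The following is a hypothetical toy implementation, in which each uncertainty set U(s, a) is taken to be a finite list of candidate transition distributions; it is a sketch for illustration only, not the algorithms developed in this dissertation.

```python
import numpy as np

def robust_value_iteration(costs, uncertainty, gamma=0.9, tol=1e-9, max_iter=100_000):
    """costs[s][a]: immediate cost of action a in state s.
    uncertainty[s][a]: a finite list of probability vectors over next states.
    Returns the robust value function and a greedy robust policy."""
    n = len(costs)
    V = np.zeros(n)
    for _ in range(max_iter):
        V_new = np.empty(n)
        for s in range(n):
            # adversary maximizes expected cost-to-go within U(s, a);
            # the decision-maker then minimizes over actions
            V_new[s] = min(
                costs[s][a] + gamma * max(float(np.dot(p, V)) for p in dists)
                for a, dists in enumerate(uncertainty[s])
            )
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    # extract a policy that is greedy with respect to the robust values
    policy = [
        min(
            range(len(uncertainty[s])),
            key=lambda a: costs[s][a]
            + gamma * max(float(np.dot(p, V)) for p in uncertainty[s][a]),
        )
        for s in range(n)
    ]
    return V, policy
```

Because the robust Bellman operator is a contraction in this bounded-cost finite setting, the iterates converge geometrically to the unique fixed point of the robust Bellman equations.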
The newsvendor model is a classical framework for inventory management over a finite horizon under demand ambiguity, and the robust formulation described in Chapter 5 circumvents the need to assume distributional information on this demand.

Robust nonstationary MDPs: In the second chapter, I consider an infinite-horizon robust MDP whose immediate costs are time-dependent but uniformly bounded, and whose uncertainty sets vary with time. The state- and action-spaces are assumed to be finite. The optimal value function can be obtained from the robust Bellman equations [28], but the nonstationarity of the data results in an infinite system of equations to be solved. I provide a policy iteration algorithm that uses finite-dimensional approximations to policy evaluation and policy improvement, so that each step of the algorithm requires a finite amount of memory and computation and can therefore be used in practice. These approximations are chosen adaptively to guarantee that the algorithm achieves sufficient improvement in each iteration, so that the values of the policies it generates converge monotonically and pointwise to the optimal value function. The policies converge subsequentially to an optimal policy.

Robust countable-state MDPs with bounded costs: In the third chapter, I generalize the above setup to solve robust stationary MDPs with countable state-spaces. Immediate costs as well as the uncertainty sets are time-invariant in this case. The costs are non-negative and bounded, and the action-spaces are finite. Here too, an as-is execution of the existing policy iteration method is not possible, for three main reasons. The first issue arises from the countable state-space, which necessitates the solution of an infinite system of equations, and is addressed via state-space truncation.
The other two complications arise from the nonlinearity of the robust evaluation operator and the need to solve the so-called inner problems to arbitrary accuracy. These are addressed by successive approximation and a careful selection of uncertainty sets. Thus, I present an approximate policy iteration algorithm that can be used in practice. Value functions of the policies generated by the algorithm converge to the optimal, while the policies themselves converge subsequentially to an optimal policy. Robust MDPs with interval uncertainty sets, robust MDPs with bounded state-transitions, and a robust equipment replacement model are presented as examples where the algorithm can be implemented.

Robust countable-state MDPs with unbounded costs: The fourth chapter further widens the scope by allowing the immediate cost functions to be unbounded. A theoretical treatment of these MDPs is not available in the literature, and I develop such a framework here. Standard assumptions for unbounded-cost MDPs are generalized to the robust case. The robust Bellman operator is shown to be a J-step contraction mapping, which guarantees the existence of a unique solution to the robust Bellman equations. Optimality of the robust Bellman equations is also established.

A robust multi-period newsvendor model with inventory balance constraints: In the fifth chapter, I study a different approach to dynamic optimization by means of an application in inventory control. A seller managing the inventory of a single product over multiple periods must determine the optimal order quantity per period in the face of uncertain demand. This problem is solved via a newsvendor model, and the optimal solution is a function of the purchase, shortage and holding costs as well as the revenue earned per unit. Here, I formulate a robust multi-period newsvendor model to address the ambiguity in demand, and the seller maximizes his ‘worst-case’ total profit.
Closed-form expressions for robust optimal order quantities are provided, and their relationship with the various cost parameters is analyzed. Explicit optimal solutions to the inner problems are obtained for a large class of uncertainty sets. Additionally, a numerical comparison of the robust model with a stochastic one is presented for benchmarking.
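The worst-case reasoning behind the robust newsvendor can be sketched in a hypothetical single-period setting with interval demand, which is a simplification for illustration and not the multi-period model with inventory balance constraints studied in Chapter 5. For a fixed order quantity q, profit is concave piecewise-linear in demand, so the worst demand sits at an interval endpoint; assuming unit revenue r > unit cost c and nonnegative holding cost h and shortage penalty s, the maximin order quantity equates the profits at the two endpoints.

```python
def robust_order_quantity(r, c, h, s, d_lo, d_hi):
    """Maximin order quantity when demand is only known to lie in
    [d_lo, d_hi]. Assumes r > c and h, s >= 0.

    profit(q, d) = r*min(q, d) - c*q - h*max(q - d, 0) - s*max(d - q, 0)
    is concave piecewise-linear in d, so the adversary's worst demand is
    d_lo or d_hi; the optimal q makes these two endpoint profits equal."""
    return ((r + h) * d_lo + s * d_hi) / (r + h + s)

def worst_case_profit(q, r, c, h, s, d_lo, d_hi):
    """Worst-case profit of ordering q against demand in [d_lo, d_hi]."""
    def profit(d):
        return r * min(q, d) - c * q - h * max(q - d, 0) - s * max(d - q, 0)
    return min(profit(d_lo), profit(d_hi))
```

Note that with no holding or shortage costs (h = s = 0) the formula returns d_lo: if overstocking carries no penalty beyond the purchase cost and understocking none beyond lost sales, the worst case is low demand, so the robust seller orders the minimum.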
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Sinha_washington_0250E_19157.pdf
dc.identifier.uri: http://hdl.handle.net/1773/42949
dc.language.iso: en_US
dc.rights: CC BY-NC-SA
dc.subject: dynamic programming
dc.subject: Markov decision processes
dc.subject: robust optimization
dc.subject: Operations research
dc.subject: Applied mathematics
dc.subject.other: Applied mathematics
dc.title: Robust dynamic optimization: theory and applications
dc.type: Thesis

Files

Original bundle

Name: Sinha_washington_0250E_19157.pdf
Size: 647.89 KB
Format: Adobe Portable Document Format