Dynamic, convex, and robust optimization with Bayesian learning for response-guided dosing
Medical treatment commonly involves administering drug doses at multiple time-points. Intuitively, higher doses increase the likelihood of disease control, but also the risk of adverse effects and the logistical inconvenience. Since an individual patient's response to treatment is uncertain, the need to balance this trade-off effectively pervades all of medicine. In response-guided dosing (RGD), the goal is to adaptively tailor doses to each individual patient's stochastic evolution of disease condition over multiple treatment sessions. Several clinical experts have commented, in editorial and review papers, that despite a strong surge of interest in RGD, a quantitative, dynamic decision-making framework has been missing. The research objective of this dissertation is to apply stochastic dynamic programming (DP), convex optimization, and Bayesian learning methods to develop such a mathematically rigorous framework for dosing decisions in RGD. The ultimate goal of this framework is to administer the right dose to the right patient at the right time.

RGD for rheumatoid arthritis: The first chapter presents a stochastic DP framework for RGD in rheumatoid arthritis, which adapts biologic doses over the treatment course based on each patient's observed evolution of the 28-joint disease activity score (DAS28). The goal is to balance the DAS28 attained at the end of the course against the weighted total dose administered. Numerical experiments and sensitivity analyses using data from the OPTION trial are performed, and the resulting optimal dosing policies are found to be monotone.

A general stochastic DP formulation for RGD: The rheumatoid arthritis formulation is then generalized to other diseases. The DP allows for an arbitrary dose-response function, and balances the disutility of the doses administered against the disutility of the disease condition reached.
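The backward-induction solution of such a dosing DP can be sketched as follows. All numbers here are hypothetical placeholders (a generic disease-activity grid, normalized dose levels, and a made-up pmf on a random dose-response effect), not the dissertation's calibrated DAS28 model; the sketch only illustrates the Bellman recursion that trades off dose disutility against the terminal disease state.

```python
import numpy as np

# Hypothetical discretization: disease states (a DAS28-like score) and doses.
states = np.linspace(0.0, 6.0, 61)    # disease-activity grid (0 = remission)
doses = np.array([0.0, 0.5, 1.0])     # normalized dose levels
T = 4                                 # number of treatment sessions
w = 0.8                               # weight on dose disutility

# Illustrative stochastic dose-response: next state = state - dose * effect,
# with a discrete pmf on the random effect parameter (all numbers made up).
effects = np.array([0.5, 1.0, 1.5])
pmf = np.array([0.25, 0.5, 0.25])

def next_state_index(s, d, e):
    """Clip the post-treatment state onto the grid and return its index."""
    s_next = np.clip(s - d * e, states[0], states[-1])
    return int(np.argmin(np.abs(states - s_next)))

# Backward induction on the Bellman equations: terminal cost is the disease
# state reached; each stage adds the weighted dose administered.
V = np.zeros((T + 1, len(states)))
V[T] = states                         # disutility of terminal disease state
policy = np.zeros((T, len(states)))
for t in range(T - 1, -1, -1):
    for i, s in enumerate(states):
        q = [w * d + sum(p * V[t + 1, next_state_index(s, d, e)]
                         for p, e in zip(pmf, effects))
             for d in doses]
        policy[t, i] = doses[int(np.argmin(q))]
        V[t, i] = min(q)
```

With these placeholder numbers the computed policy behaves as the monotonicity results suggest: patients already in remission receive no dose, while patients in the worst states receive the full dose.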
We prove that, under suitable assumptions on the underlying functions, there exists an optimal dosing policy that is monotone with respect to the patient's state, and we provide several examples in which these conditions are met.

Robust RGD: We then study a robust counterpart of the preceding stochastic DP model, in which the probability mass function (pmf) of the stochastic dose-response parameter is unknown but is assumed to belong to an interval uncertainty set. We show that the inner maximization problem of the robust Bellman equations is a linear program with a closed-form solution. We prove monotonicity of the optimal dose with respect to both the disease state and an ambiguity parameter, and illustrate this monotonicity via simulation.

Optimal Bayesian learning of dose-response parameters from a cohort: In this chapter, we study the problem of finding optimal RGD policies while learning the unknown distribution of a stochastic dose-response parameter from a cohort of patients. We provide a Bayesian stochastic DP formulation, although exact solution of the Bellman equations is computationally intractable. We therefore present two approximate control schemes and analyze the monotonicity, stationarity, and separability structure of the resulting dosing strategies, which we exploit in an efficient, approximate solution of the problem. Numerical experiments are completed, and the results are compared against non-Bayesian methods.

Optimal stopping for RGD: In the final chapter, we consider an optimal stopping variant of RGD in which the decision-maker is allowed to end treatment early. This could occur, for example, when a patient responds so quickly and so well that further treatment is unnecessary. We numerically demonstrate that, for some problems, it is optimal to stop treatment in states better than a certain threshold.
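The inner maximization of a robust Bellman equation over a box (interval) uncertainty set on a pmf is the linear program max_q Σ q_i v_i subject to lo_i ≤ q_i ≤ hi_i and Σ q_i = 1, which admits a greedy closed-form solution: start every probability at its lower bound, then assign the remaining mass to the outcomes with the largest continuation disutility first. A minimal sketch of this standard construction follows; the function name and example bounds are illustrative, not taken from the dissertation.

```python
import numpy as np

def worst_case_expectation(v, lo, hi):
    """Closed-form (greedy) solution of the linear program
         max_q  sum_i q_i * v_i
         s.t.   lo_i <= q_i <= hi_i,  sum_i q_i = 1,
    i.e., the worst-case expected continuation disutility over an
    interval uncertainty set on the dose-response pmf."""
    q = lo.astype(float).copy()
    slack = 1.0 - q.sum()
    # The box must intersect the probability simplex.
    assert slack >= -1e-12 and hi.sum() >= 1.0 - 1e-12, \
        "uncertainty set contains no valid pmf"
    for i in np.argsort(-v):          # outcomes with largest disutility first
        add = min(hi[i] - q[i], slack)
        q[i] += add
        slack -= add
    return q, float(q @ v)
```

Setting lo = hi to a nominal pmf recovers the nominal expectation, while widening the interval (e.g., via an ambiguity parameter) makes the adversary's choice, and hence the dose that hedges against it, more conservative.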
- Applied mathematics