Optimizing Distributed Systems using Machine Learning

dc.contributor.advisor: Krishnamurthy, Arvind
dc.contributor.author: Cano, Ignacio Agustin
dc.date.accessioned: 2019-05-02T23:18:28Z
dc.date.available: 2019-05-02T23:18:28Z
dc.date.issued: 2019-05-02
dc.date.submitted: 2019
dc.description: Thesis (Ph.D.)--University of Washington, 2019
dc.description.abstract: Distributed systems consist of many components that interact with each other to perform one or more tasks. Traditionally, many of these systems base their decisions on rules or configurations defined by operators, as well as on handcrafted analytical models. However, creating those rules or engineering such models is a challenging task. First, the same system should be able to work under a combinatorial number of conditions on top of heterogeneous hardware. Second, it should support different types of workloads and run in potentially widely different settings. Third, it should be able to handle time-varying resource needs. These factors make reasoning about the performance of distributed systems far from trivial. In this thesis, we propose optimizing distributed systems using machine learning (ML). Our main contribution is the design, implementation, augmentation, and evaluation of three distributed systems that illustrate the impact of these ML-based optimizations: 1) Curator, a framework that safeguards the health and performance of distributed storage systems by scheduling and executing background maintenance tasks; 2) AdaRes, an adaptive system that dynamically adjusts virtual machine resources in virtual execution environments; and 3) Pulpo, a federated system that efficiently trains machine learning models across different data centers. Each system instantiates appropriate ML models for the task at hand, relieving system designers of manually tuning rules and handcrafting complex analytical models. Our evaluations on real clusters show that our ML formulations improve the efficiency and performance of distributed systems.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Cano_washington_0250E_19622.pdf
dc.identifier.uri: http://hdl.handle.net/1773/43659
dc.language.iso: en_US
dc.rights: none
dc.subject: contextual bandits
dc.subject: distributed systems
dc.subject: machine learning
dc.subject: optimization
dc.subject: reinforcement learning
dc.subject: Computer science
dc.subject.other: Computer science and engineering
dc.title: Optimizing Distributed Systems using Machine Learning
dc.type: Thesis

Files

Original bundle

Name: Cano_washington_0250E_19622.pdf
Size: 1.96 MB
Format: Adobe Portable Document Format