Optimizing Distributed Systems using Machine Learning

Cano, Ignacio Agustin

dc.contributor.advisor	Krishnamurthy, Arvind
dc.contributor.author	Cano, Ignacio Agustin
dc.date.accessioned	2019-05-02T23:18:28Z
dc.date.available	2019-05-02T23:18:28Z
dc.date.issued	2019-05-02
dc.date.submitted	2019
dc.description	Thesis (Ph.D.)--University of Washington, 2019
dc.description.abstract	Distributed systems consist of many components that interact with each other to perform one or more tasks. Traditionally, many of these systems base their decisions on sets of rules or configurations defined by operators, as well as on handcrafted analytical models. However, creating those rules or engineering such models is challenging. First, the same system should work under a combinatorial number of conditions on top of heterogeneous hardware. Second, it should support different types of workloads and run in potentially widely different settings. Third, it should handle time-varying resource needs. These factors make reasoning about the performance of distributed systems far from trivial. In this thesis, we propose optimizing distributed systems using machine learning (ML). Our main contribution is the design, implementation, augmentation, and evaluation of three distributed systems that illustrate the impact of these ML-based optimizations: 1) Curator, a framework that safeguards the health and performance of distributed storage systems by scheduling and executing background maintenance tasks; 2) AdaRes, an adaptive system that dynamically adjusts virtual machine resources in virtual execution environments; and 3) Pulpo, a federated system that efficiently trains machine learning models across different data centers. Each system instantiates appropriate ML models for the task at hand, relieving system designers of manually tuning rules and handcrafting complex analytical models. Our evaluations on real clusters show that our ML formulations improve the efficiency and performance of distributed systems.
dc.embargo.terms	Open Access
dc.format.mimetype	application/pdf
dc.identifier.other	Cano_washington_0250E_19622.pdf
dc.identifier.uri	http://hdl.handle.net/1773/43659
dc.language.iso	en_US
dc.rights	none
dc.subject	contextual bandits
dc.subject	distributed systems
dc.subject	machine learning
dc.subject	optimization
dc.subject	reinforcement learning
dc.subject	Computer science
dc.subject.other	Computer science and engineering
dc.title	Optimizing Distributed Systems using Machine Learning
dc.type	Thesis

Files

Original bundle

Name: Cano_washington_0250E_19622.pdf
Size: 1.96 MB
Format: Adobe Portable Document Format

Collections

Computer science and engineering