Improving Fault Tolerance and Performance of Data Center Networks
dc.contributor.advisor | Anderson, Thomas E | |
dc.contributor.advisor | Krishnamurthy, Arvind | |
dc.contributor.author | Liu, Vincent | |
dc.date.accessioned | 2017-02-14T22:38:06Z | |
dc.date.available | 2017-02-14T22:38:06Z | |
dc.date.submitted | 2016-09 | |
dc.identifier.other | Liu_washington_0250E_16576.pdf | |
dc.identifier.uri | http://hdl.handle.net/1773/38103 | |
dc.description | Thesis (Ph.D.)--University of Washington, 2016-09 | |
dc.description.abstract | Data center networks are a key component of the explosive growth of cloud computing: they enable the use of tens to hundreds of thousands of co-located servers for large-scale computing and services. As applications and data sets continue to grow rapidly, the challenge for data center networks is to keep pace by providing enough bandwidth while also lowering costs, increasing flexibility, and maintaining reliability. My thesis is that a key part of the answer is the network's wiring topology: topology has foundational cross-layer effects, and a small amount of intentional asymmetry in the topology can help data center networks meet that challenge. I present two complementary innovations that demonstrate this. The first, F10, is a co-design of the network topology and failover protocols that provides efficient, near-instantaneous, fine-grained, and localized recovery and rebalancing for common-case network failures. My results show that, following network link and switch failures, F10 has one-seventh the packet loss of current schemes. The second innovation, Subways, proposes and evaluates a new method of adding network capacity by connecting multiple network links per server in an overlapping topology. Using a simulation-based methodology, my work shows that Subways offers substantial performance benefits for popular application workloads: up to a 3.1x speedup in MapReduce and a 2.5x throughput improvement in memcache for a fixed average request latency, relative to an equivalent-bandwidth network that differs only in its wiring. | |
dc.format.mimetype | application/pdf | |
dc.language.iso | en_US | |
dc.rights | | |
dc.subject | | |
dc.subject.other | Computer science | |
dc.subject.other | computer science and engineering | |
dc.title | Improving Fault Tolerance and Performance of Data Center Networks | |
dc.type | Thesis | |
dc.embargo.terms | Open Access |