Understanding and Improving Database-Backed Applications

Authors

Yan, Cong

Abstract

From online shopping to social media, modern web applications are used everywhere in daily life. These applications are often structured in three tiers: the front end, developed with HTML or JavaScript; the application server, developed with an object-oriented programming language such as Java, Python, or Ruby; and the back-end database, which accepts SQL queries. Latency is critical for these applications. However, our study shows that many open-source web applications suffer from serious performance issues, with many slow pages as well as pages whose generation time scales superlinearly with the data size. Prior work has focused on improving the performance of each tier individually, but this is often not enough to reduce latency to the level developers expect. In this thesis, I present a different optimization strategy that enables much further improvement of database-backed applications. Rather than looking into each tier separately, I focus on optimizing one tier based on how the other tiers process and consume the data. In particular, I describe five projects, each implementing one example of such optimization. On the database tier, I present 1) CHESTNUT, a data layout designer that generates a customized data layout based on how the application tier consumes the query result; and 2) CONSTROPT, a query rewrite optimizer that performs query optimization by leveraging data constraints inferred from the application code. On the application tier, I present 3) QURO, a query reorder compiler that changes the query order to reduce database locking time when processing transactions; and 4) POWERSTATION, an IDE plugin that fixes performance anti-patterns in the application code that result in slow or redundant database queries. Beyond cross-tier information, there is more that can be used to improve the application, so I also propose 5) HYPERLOOP, a framework that leverages not only application knowledge but also the developer’s insights to make performance tradeoffs. In each project, by looking into the interaction with other tiers and even with the developers, I show that many optimizations that may not be viable or effective when optimizing each tier alone can improve overall application performance significantly. I show the performance gains brought by these optimizations using real-world applications, then discuss future work in this direction and conclude.

Description

Thesis (Ph.D.)--University of Washington, 2020
