Leveraging Usage History to Enhance Database Usability

Khoussainova, Nodira

dc.contributor.advisor	Balazinska, Magdalena	en_US
dc.contributor.author	Khoussainova, Nodira	en_US
dc.date.accessioned	2013-02-25T18:01:31Z
dc.date.available	2013-02-25T18:01:31Z
dc.date.issued	2013-02-25
dc.date.submitted	2012	en_US
dc.description	Thesis (Ph.D.)--University of Washington, 2012	en_US
dc.description.abstract	More so than ever before, large datasets are being collected and analyzed across a variety of disciplines. Examples include social networking data, software logs, scientific data, web clickstreams, and sensor network data. As a result, a wide range of users interact with these large datasets, including scientists, data analysts, sociologists, and market researchers. These users are experts in their domains and understand their data extensively, but they are not database experts. Database systems are scalable and efficient, but are notoriously difficult to use. In this work, we aim to address this challenge by leveraging usage history. From usage history, we can extract knowledge about the multitude of users' experiences with the database. This knowledge, in turn, allows us to build smarter systems that better cater to the users' needs. We address different aspects of the database usability problem and develop three complementary systems. First, we aim to ease the query formulation process. We build the SnipSuggest system, which is an autocompletion tool for SQL queries. It provides on-the-go, context-aware assistance in the query composition process. The second challenge we address is that of query debugging. Query debugging is a painful process, in part because executing queries directly over a large database is slow, while manually creating small test databases is burdensome to users. We present the second contribution of this dissertation: SIQ (Sample-based Interactive Querying). SIQ is a system for automatically selecting a 'good' small sample of the underlying input database to allow queries to execute in real time, thus supporting interactive query debugging. Third, once a user has successfully constructed the right query, they must execute it. However, executing and understanding the performance of a query on a large-scale, parallel database system can be difficult even for experts.
Our third contribution, PerfXplain, is a tool for explaining the performance of a MapReduce job running on a shared-nothing cluster. Namely, it aims to answer the question of why one job was slower than another. PerfXplain analyzes the MapReduce log files from past runs to better understand the correlation between different properties of pairs of jobs and their relative runtimes.	en_US
dc.embargo.terms	No embargo	en_US
dc.format.mimetype	application/pdf	en_US
dc.identifier.other	Khoussainova_washington_0250E_11006.pdf	en_US
dc.identifier.uri	http://hdl.handle.net/1773/22014
dc.language.iso	en_US	en_US
dc.rights	Copyright is held by the individual authors.	en_US
dc.subject	databases; queries; usability	en_US
dc.subject.other	Computer science	en_US
dc.subject.other	Computer science and engineering	en_US
dc.title	Leveraging Usage History to Enhance Database Usability	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Khoussainova_washington_0250E_11006.pdf
Size: 3.86 MB
Format: Adobe Portable Document Format

Collections

Computer science and engineering