ResearchWorks Archive

Scalable Query Evaluation over Complex Probabilistic Databases

dc.contributor.advisor Suciu, Dan en_US
dc.contributor.author Jha, Abhay en_US
dc.date.accessioned 2012-09-13T17:32:58Z
dc.date.available 2012-09-13T17:32:58Z
dc.date.issued 2012-09-13
dc.date.submitted 2012 en_US
dc.identifier.other Jha_washington_0250E_10602.pdf en_US
dc.description Thesis (Ph.D.)--University of Washington, 2012 en_US
dc.description.abstract The age of Big Data has brought with it datasets that are not just big but also much more complicated. These datasets are constructed from disparate, unreliable, and noisy sources, often in an ad hoc way, because careful data cleaning and integration is too time-consuming and no longer always necessary. Representing the uncertainty hidden in these datasets is necessary to get meaningful query answers, and Probabilistic Databases have emerged as arguably the most popular solution to this problem. Their application to practical problems, though, has been held back because (i) the common models they use are not rich enough to capture the dependencies in these problems, and (ii) unlike in traditional databases, query evaluation in probabilistic databases can be very expensive and unpredictable. This dissertation addresses these challenges by first proposing a new model for probabilistic databases that is rich enough to capture the dependencies found in most practical applications, while still allowing for a translation to considerably simpler and well-studied models. Our model leverages existing models from the AI literature that combine probability theory with logic. The main challenge of query evaluation over probabilistic databases is that it requires solving probabilistic inference, which is a notoriously hard problem. This dissertation studies this problem via both (i) foundational results that give new theoretical insights into existing probabilistic inference algorithms, such as Read-Once Formulas, Tree-Decompositions, Binary Decision Diagrams, and Negation Normal Forms, when applied to the setting of probabilistic databases, which, as we will see, has its own distinct challenges and expectations, and (ii) building a robust system in which these ideas are leveraged for efficient and reliable query evaluation. en_US
dc.format.mimetype application/pdf en_US
dc.language.iso en_US en_US
dc.rights Copyright is held by the individual authors. en_US
dc.subject en_US
dc.subject.other Computer science en_US
dc.subject.other Computer science and engineering en_US
dc.title Scalable Query Evaluation over Complex Probabilistic Databases en_US
dc.type Thesis en_US
dc.embargo.terms No embargo en_US
