
Field | Value | Language
dc.contributor.advisor | Suciu, Dan | en_US
dc.contributor.author | Jha, Abhay | en_US
dc.date.accessioned | 2012-09-13T17:32:58Z |
dc.date.available | 2012-09-13T17:32:58Z |
dc.date.issued | 2012-09-13 |
dc.date.submitted | 2012 | en_US
dc.identifier.other | Jha_washington_0250E_10602.pdf | en_US
dc.identifier.uri | http://hdl.handle.net/1773/20750 |
dc.description | Thesis (Ph.D.)--University of Washington, 2012 | en_US
dc.description.abstract | The age of Big Data has brought with it datasets that are not just big, but also much more complicated. These datasets are constructed from disparate, unreliable, and noisy sources, often in an ad hoc way, because careful data cleaning and integration are too time-consuming and no longer always necessary. Representing the uncertainty hidden in these datasets is necessary to get meaningful query answers, and probabilistic databases have emerged as arguably the most popular solution to this problem. Their application to practical problems, however, has been held back because (i) the common models they use are not rich enough to capture the dependencies in these problems, and (ii) unlike in traditional databases, query evaluation for probabilistic databases can be very expensive and unpredictable. This dissertation addresses these challenges by first proposing a new model for probabilistic databases that is rich enough to capture the dependencies found in most practical applications, while still allowing for a translation to considerably simpler and well-studied models. Our model leverages existing models from the AI literature that combine probability theory with logic. The main challenge of query evaluation over probabilistic databases is that it requires probabilistic inference, which is a notoriously hard problem. This dissertation studies this problem via both (i) foundational results that give new theoretical insights into existing probabilistic inference techniques, such as Read-Once Formulas, Tree-Decompositions, Binary Decision Diagrams, and Negation Normal Forms, when applied to the setting of probabilistic databases, which, as we will see, have their own distinct challenges and expectations, and (ii) building a robust system in which the above ideas are leveraged for efficient and reliable query evaluation. | en_US
dc.format.mimetype | application/pdf | en_US
dc.language.iso | en_US | en_US
dc.rights | Copyright is held by the individual authors. | en_US
dc.subject |  | en_US
dc.subject.other | Computer science | en_US
dc.subject.other | Computer science and engineering | en_US
dc.title | Scalable Query Evaluation over Complex Probabilistic Databases | en_US
dc.type | Thesis | en_US
dc.embargo.terms | No embargo | en_US

