Automated Reasoning of Database Queries
From booking air tickets to analyzing astronomy datasets, database queries are pervasive in people’s work and life. This thesis describes Cosette, the first tool for automated reasoning about the equivalence of SQL queries. The core of Cosette is a formal semantics of SQL based on semirings. This semantics covers major SQL features, including sophisticated ones such as grouping, aggregation, correlated subqueries, and integrity constraints. The semantics is denotational and extends semirings with only a few equational axioms to interpret SQL. To check equivalence, Cosette uses this semantics to encode a pair of input SQL queries in both an interactive theorem prover and a constraint solver. Cosette then either certifies the equivalence of the queries, using a sound decision procedure that is implemented in the theorem prover and covers the known decidable fragment of SQL, or shows their inequivalence by providing a counterexample. Empirical studies show that Cosette can decide equivalence or provide a counterexample for a wide range of practical SQL queries collected from the database literature, real-world optimizer rules and bugs, and homework assignments from a data management class at UW.
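As a rough illustration of the semiring view of queries, the sketch below models a relation as a map from tuples to multiplicities in the semiring of natural numbers (bag semantics): joins multiply multiplicities and projections add them. It is not Cosette's implementation; the helper names (`select`, `join`, `project`, `distinct`) and the sample data are hypothetical, and agreement on a single instance is only evidence of equivalence, whereas a single disagreeing instance is a genuine counterexample.

```python
from collections import Counter
from itertools import product

# A bag relation maps each tuple to its multiplicity, an element of the
# natural-number semiring. All names here are hypothetical illustrations.

def select(rel, pred):
    """Keep the tuples satisfying pred; multiplicities are unchanged."""
    return Counter({t: n for t, n in rel.items() if pred(t)})

def join(r, s):
    """Cross product: multiplicities multiply (semiring product)."""
    out = Counter()
    for (t1, n1), (t2, n2) in product(r.items(), s.items()):
        out[t1 + t2] += n1 * n2
    return out

def project(rel, cols):
    """Projection: multiplicities of collapsed tuples add (semiring sum)."""
    out = Counter()
    for t, n in rel.items():
        out[tuple(t[c] for c in cols)] += n
    return out

def distinct(rel):
    """Duplicate elimination: every present tuple gets multiplicity 1."""
    return Counter({t: 1 for t, n in rel.items() if n > 0})

# Sample instance (assumed data): R(id, name) and S(id) as bags.
R = Counter({(1, "a"): 2, (2, "b"): 1})
S = Counter({(1,): 3})

# Selection pushdown: filtering after the join vs. before it.
q1 = select(join(R, S), lambda t: t[0] == 1)
q2 = join(select(R, lambda t: t[0] == 1), S)
pushdown_agrees = (q1 == q2)

# SELECT id FROM R vs. SELECT DISTINCT id FROM R: this instance
# separates the two queries, so it serves as a counterexample.
q3 = project(R, [0])
q4 = distinct(project(R, [0]))
is_counterexample = (q3 != q4)
```

On this instance the two pushdown variants produce the same bag, while the projection with and without duplicate elimination differ in the multiplicity of the tuple `(1,)`. A prover-based check establishes the first kind of fact for all instances rather than for one.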