Understanding Queries by Conditional Instances

Abstract

A powerful way to understand a complex query is by observing how it operates on data instances. However, specific database instances are not ideal for such observations. They often include largeamounts of superfluous details that are not only irrelevant to understanding the query but also causes cognitive overload; and onespecific database may not be enough. Given a relational query, is it possible to provide a simple andgeneric “representative” instance that (1) illustrates how the query can be satisfied, (2) summarizes allspecific instances that would satisfy the query in the same way byabstracting away unnecessary details? Furthermore, is it possibleto find a collection of such representative instances that together completely characterize all possible ways in which the query can be satisfied? This paper takes initial steps towards answering these questions. We design what these representative instances look like, define what they stand for, and formalize what it means for them to satisfy a query in “all possible ways.” We argue that this problemis undecidable for general domain relational calculus queries, anddevelop practical algorithms for computing a minimum collection of such instances subject to other constraints. We evaluate the efficiency of our approach experimentally, and show its effectivenessin helping users debug relational queries through a user study.

Publication
In SIGMOD