Towards web-scale how-provenance

Abstract

Annotating data with meta-data, and propagating that meta-data through data-intensive computation in a way that follows the transformations the data undergoes (“how-provenance”), has many applications. These include explaining computation results, assessing their trustworthiness and proving their correctness, evaluation in the presence of incomplete or probabilistic information, and view maintenance. As data gets bigger, its transformations become more complex, and both are relegated to the cloud, the role of provenance in these applications becomes even more crucial. At the same time, the overhead of provenance computation, in terms of time, space and communication, may limit the scalability of how-provenance management systems. We envision an approach to this problem based on selective tracking of how-provenance, where the selection criteria are partly based on the meta-data itself. We illustrate use cases in the web context and highlight some challenges in this respect.

Publication
In DESWeb Workshop, ICDE