Computing Provenance for Database Updates and Transactions

Time

-

Locations

SB 107

Speaker

Boris Glavic
Assistant Professor of Computer Science, Illinois Institute of Technology
http://cs.iit.edu/~glavic/

Description

Abstract: Data provenance, information about the origin and creation process of data, has been used to debug queries and clean data in data warehouses, to understand and correct complex data integration transformations, for auditing, and to assess the quality of data in Big Data analytics and Data Science. Automatic provenance generation is of immense importance in Big Data and data science, where data size, heterogeneity, and the time constraints on analysis results make it infeasible to generate provenance information manually. Most of the literature on database provenance has focused on tracing the provenance of queries, i.e., mapping each output row of a query to all rows from the query's input that were used to compute that output row. However, use cases such as auditing need to trace the origin of a row through database updates, which are usually executed as part of a transaction to preserve consistency under concurrent access and to support recovery from failures. In this talk I give an overview of my group's research on computing provenance for updates and transactions. Similar to most approaches for computing the provenance of queries, we use query rewrite techniques to generate queries that compute provenance as a side effect. Our approach is based on transaction time histories for tables and an encoding of updates as queries over past states of tables. This work is partially supported by Oracle.
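
To give a flavor of the idea of encoding an update as a query over a past state of a table, the following is a minimal sketch in Python, not the speaker's actual system or rewrite rules: the table, its columns, and the helper function are invented for illustration. It models an UPDATE as a computation over the previous (transaction-time) version of a table that also records, for each output row, the input row it was derived from, which is the row-level provenance of the update.

```python
# Illustrative sketch only: re-express an UPDATE as a "query" over a past
# version of a table while recording row-level provenance as a side effect.
# All names here are hypothetical examples, not part of any real system.

from dataclasses import dataclass, replace
from typing import Callable

@dataclass(frozen=True)
class Row:
    rid: int          # surrogate row identifier used for provenance
    owner: str
    balance: float

def reenact_update(old_version: list[Row],
                   predicate: Callable[[Row], bool],
                   set_clause: Callable[[Row], Row]) -> list[tuple[Row, int]]:
    """Compute the new table version from the old one.

    Each output tuple is paired with the rid of the input row it was
    derived from, i.e., the provenance of that output row.
    """
    new_version = []
    for row in old_version:
        new_row = set_clause(row) if predicate(row) else row
        new_version.append((new_row, row.rid))   # (output row, provenance)
    return new_version

# Past state of the table, as would be recovered from a transaction-time history.
accounts_v1 = [
    Row(rid=1, owner="Alice", balance=100.0),
    Row(rid=2, owner="Bob",   balance=250.0),
]

# Corresponds to: UPDATE accounts SET balance = balance + 50 WHERE owner = 'Alice'
accounts_v2 = reenact_update(
    accounts_v1,
    predicate=lambda r: r.owner == "Alice",
    set_clause=lambda r: replace(r, balance=r.balance + 50.0),
)

for new_row, prov_rid in accounts_v2:
    print(new_row, "derived from rid", prov_rid)
```

In the approach described in the talk, this kind of reenactment is expressed declaratively as rewritten SQL queries over time-travel versions of the tables rather than in application code; the sketch above only illustrates the input-to-output mapping that such rewrites make explicit.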

Event Topic

Data Science
