Communication Cost in Parallel Query Processing

Time

-

Locations

Stuart Building 111

Distinguished Lecture Series

Host

Computer Science

Description

Suciu considers the following problem: What is the amount of communication required to compute a query in parallel on p servers, over a large database instance? He defines the Massively Parallel Communication (MPC) model, where the computation proceeds in rounds consisting of local computations followed by a global reshuffling of the data. Servers have unlimited computational power and are allowed to exchange any datal the only cost parameters are the number of rounds and the maximum amount of communication per server. Surprisingly, any multi-join query can be computed in a single communication round; however, the price to pay is that the amount of data being reshuffled exceeds the input data. Suciu will describe tight bounds on the amount of communication for the case of single round algorithms on non-skewed data, and discuss some partial results for multiple rounds and for skewed data.

Joint work with Paul Beame and Paris Koutris

About the speaker

Suciu received his Ph.D. from the University of Pennsylvania in 1995, was a principal member of the technical staff at AT&T Labs, and joined the University of Washington in 2000. Suciu is conducting research in data management, with an emphasis on topics related to Big Data and data sharing, such as probabilistic data, data pricing, parallel data processing, and data security. He is a co-author of two books, Data on the Web: From Relations to Semistructured Data and XML (1999) and Probabilistic Databases (2011). He is a fellow of the ACM, holds 12 U.S. patents, received the best paper award in SIGMOD 2000 and ICDT 2013, the ACM PODS Alberto Mendelzon Test of Time Award in 2010 and in 2012, the 10 Year Most Influential Paper Award in ICDE 2013, and the VLDB 10 Year Best Paper Award in 2014, and is a recipient of the NSF Career Award and of an Alfred P. Sloan Fellowship. Suciu serves on the VLDB (Very Large Databases) Board of Trustees; is an associate editor for the VLDB Journal, ACM TWEB, and Information Systems; and is a past associate editor for ACM TODS and ACM TOIS. Suciu's Ph.D. students Gerome Miklau and Christopher Re received the ACM SIGMOD Best Dissertation Award in 2006 and 2010, respectively, and Nilesh Dalvi was a runner-up in 2008.

Event Topic

Distinguished Lecture Series