Production Run Failure Diagnosis for Concurrency Bugs

Locations

CS Conference Room

Host

Department of Computer Science



Description

Failures caused by software bugs are widespread in production runs, causing severe losses for end users. Unfortunately, diagnosing production-run failures, especially failures caused by concurrency bugs in multi-threaded software, is challenging. Existing work cannot satisfy privacy, run-time overhead, diagnosis capability, and diagnosis latency requirements all at once.

This talk will present a series of attempts from our group to address these challenges. Our first attempt, called CCI, applies the cooperative bug isolation (CBI) approach, which was originally designed for sequential bugs, to concurrency bugs. Our carefully designed interleaving predicates and sampling schemes allow CCI to diagnose a wide variety of concurrency-bug failures with acceptable overhead. Our second attempt, called PBI, improves on the performance of CCI while preserving its diagnosis capability through a novel use of hardware performance counters. Our final attempt, called LXR, addresses the long diagnosis latency of CCI and PBI.

Unlike CCI and PBI, which both obtain run-time information through sampling, LXR obtains run-time information through hardware support that maintains recent execution history with negligible overhead. I will conclude the talk by discussing other research in my group that tackles concurrency bugs and performance bugs.

Shan Lu is an Associate Professor of Computer Science at the University of Chicago. Her main research interest is software reliability. She was named an Alfred P. Sloan Research Fellow in 2014, and received the Distinguished Alumni Educator Award from the Department of Computer Science at the University of Illinois in 2013 and an NSF CAREER Award in 2010. Her co-authored papers won the Best Paper Award at USENIX FAST in 2013, an ACM-SIGPLAN CACM Research Highlight Nomination in 2011, and IEEE Micro Top Picks in 2006. She currently serves as the Information Director of ACM-SIGOPS and the Program Chair of the 2015 USENIX Annual Technical Conference. Prior to joining the University of Chicago, Shan was the Clare Boothe Luce Assistant Professor of Computer Sciences at the University of Wisconsin-Madison for five years. She received her Ph.D. from the University of Illinois at Urbana-Champaign in 2008, and her bachelor's degree from the University of Science and Technology of China in 2003.
