Illinois Tech Computer Science’s Culotta and University of Michigan’s Hemphill Create Cyberbullying Early-Response System

What if we could predict cyberbullying and intervene before it escalates? Researchers have created a tool that uses linguistic and social features from earlier comments to forecast the presence and intensity of hostility on Instagram. The tool can predict cyberbullying up to 10 hours before it occurs, giving schools, parents, law enforcement officials, and others time to intervene.

The work was done by Aron Culotta, director of the Text Analysis in the Public Interest Lab and assistant professor of computer science at Illinois Tech; Libby Hemphill, director of the Resource Center for Minority Data and associate professor of information at the University of Michigan; and their students Ping Liu, a Ph.D. candidate in computer science, and Josh Guberman (Psych ’17). Their research was originally funded by the Nayar Prize at Illinois Tech.

Culotta and Hemphill chose to focus on cyberbullying among teenagers. One in three U.S. teens experiences cyberbullying, which can lead to violence, depression, and substance abuse. In interviews with stakeholders such as parents, school administrators, and police, Culotta and Hemphill learned that top cyberbullying concerns included situations in which teens who know each other get into offline fights that escalate online and can result in physical altercations. The stakeholders also said Instagram is one of the social media platforms that teens commonly use.

Culotta and Hemphill created a natural language processing algorithm to comb through some 15 million Instagram comments from more than 400,000 Instagram posts. They also manually annotated 30,000 comments for the presence of cyberbullying to train and validate their forecasting models. Their best model was highly accurate (AUC 0.82) at predicting cyberbullying based on earlier comments in a post. It relies on cues such as patterns of hostile conversation and certain key words and acronyms that often precede cyberbullying, including gender-specific hostile terms and a commonly used acronym telling the poster to be quiet.
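
For readers unfamiliar with the metric, AUC (area under the ROC curve) can be read as the probability that a model scores a randomly chosen positive case above a randomly chosen negative one, so 0.5 is chance and 1.0 is perfect. The sketch below computes it in plain Python on made-up numbers. The function name `auc`, the labels, and the scores are illustrative assumptions, not the researchers' code or data.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by comparing every
    positive example against every negative example.

    labels: 1 if hostility later appeared in the thread, else 0 (toy labels).
    scores: a model's predicted risk for each thread (made-up numbers).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not taken from the study.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.8]
toy_auc = auc(toy_labels, toy_scores)
print(round(toy_auc, 3))  # 8 of 9 positive/negative pairs are ordered correctly
```

On this toy data the one misordered pair is the hostile thread scored 0.6 and the non-hostile thread scored 0.7. A reported AUC of 0.82 means that, on average, roughly 82 percent of such pairs are ranked in the right order.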

Most prior work on cyberbullying has focused on identifying hostile messages after they have been posted. Culotta and Hemphill’s work forecasts the presence and intensity of future hostile comments, making it more useful as an intervention tool. The two created a web interface for their tool and plan to invite parent groups to opt in to the application.