Learning by Doing Through Research Opportunities
Saikiran Yerraguntla (CS/AMAT ’20) discovered plenty of research opportunities as an undergraduate at Illinois Institute of Technology and made research an integral part of his educational experience.
He found himself working with faculty on projects that included developing models to detect cancer cells based on medical imaging results, building a predictive model on the duration of the Ebola outbreak in West Africa, and detecting social media bot troll farms.
Yerraguntla also found non-faculty research opportunities. He spent a summer implementing a data pipeline with the Array of Things and Waggle Team at Argonne National Laboratory, and worked with a journalist to develop a pattern analysis of Chicago parking tickets.
Through a course project, Robert Ellis, associate professor of applied mathematics at Illinois Tech, introduced Yerraguntla and his team of fellow student researchers to David Eads, a data journalist with ProPublica. Yerraguntla says Eads provided the team with data sets on parking tickets issued over the last 30 years, including demographic, ward, and weather pattern data.
“David’s team had found some interesting results, but he hoped we would expatiate upon it with our analysis, observations, results, and conclusions,” Yerraguntla says. “He was also providing us with assistance throughout the way, in terms of acquiring more data from ProPublica or addressing our concerns whenever we encountered strange, dubious results.”
Yerraguntla says the experience taught him a lot about working with large data sets, specifically in the areas of data gathering and processing.
He says he learned the importance of gathering data from multiple sources, and how exhaustive the data collection process should be.
“It is important to note this process is not a one-time task,” Yerraguntla says. “Data analysis is a cyclic procedure that requires constant gathering and processing of data throughout the research. You and your machine must be prepared to handle this efficiently and effectively.”
Learning how to process large amounts of data took a lot of trial and error, Yerraguntla says, and he warns that the work demands care.
“Do not attempt to load and parse such files locally on your computer if you do not have the necessary resources, and have multiple processes running in the background,” he says. “It can freeze up the computer and worse, crash it.”
The solution was to load the data onto a virtual machine and run a script there to break the data into subsets small enough for a local computer to handle.
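The article does not describe the script itself, so the following is only a minimal sketch of the general idea, assuming the data arrives as one large CSV file with a header row. The function name `split_csv`, the output file naming, and the default chunk size are all hypothetical. The file is streamed one row at a time, so it never has to fit in memory.

```python
import csv
from pathlib import Path


def split_csv(source, out_dir, rows_per_file=100_000):
    """Stream `source` row by row and write it out as numbered CSV parts.

    Hypothetical helper: assumes a CSV with a single header row.
    Every part repeats the header. Returns the list of part paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    out_file = None
    writer = None
    with open(source, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            return parts  # empty input, nothing to split
        try:
            for i, row in enumerate(reader):
                # Start a new part every `rows_per_file` data rows.
                if i % rows_per_file == 0:
                    if out_file is not None:
                        out_file.close()
                    path = out_dir / f"part_{len(parts):05d}.csv"
                    out_file = open(path, "w", newline="", encoding="utf-8")
                    writer = csv.writer(out_file)
                    writer.writerow(header)
                    parts.append(path)
                writer.writerow(row)
        finally:
            if out_file is not None:
                out_file.close()
    return parts
```

A call such as `split_csv("tickets.csv", "parts", rows_per_file=50_000)`, where `tickets.csv` is a placeholder file name, would produce files like `parts/part_00000.csv` that can each be opened and analyzed separately.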
Illinois Tech not only offers many on-campus research opportunities in data analysis, but its Chicago location also helps students access data sets for their own research. Yerraguntla says a favorite source is the City of Chicago’s data portal.
He says the portal holds millions of free, publicly available data sets on a variety of aspects of city life, including crime, inflation, public health and safety, education, DNA testing, weather, and more.
“The Chicago data portal is highly beneficial for students performing research in multiple fields and disciplines,” Yerraguntla says. “I personally have used the CDP for many other research and hackathon projects. I think it is a great resource for all students coming from different skills and experiences.”
Yerraguntla says he will be working full time as a rotational software engineer for U.S. Cellular after graduation.
Photo: College of Science student Saikiran Yerraguntla (provided)