Transforming Data Interaction with NSF Grant
The National Science Foundation has granted Illinois Institute of Technology researchers $4 million to explore a new vision in managing the increasing complexity and scale of data in modern scientific pursuits.
Xian-He Sun, Ron Hochsprung Endowed Chair of Computer Science, and Antonios Kougkas, assistant research professor of computer science, are receiving the bulk of a $5 million NSF grant to pursue IOWarp. IOWarp aims to reduce the amount of data that needs to be transferred through optimization techniques and data transformation, as well as provide a unified platform that can handle a wide range of data sources and formats, simplifying data management for scientists.
“I’m particularly energized by how IOWarp isn’t just another generic data platform,” Kougkas says. “It’s been carefully designed through direct collaboration with scientists across diverse fields such as materials science, cosmology, and biomedicine. This means we’re not just building technology in isolation. We’re creating solutions that directly address the complex challenges scientists face in their daily work.”
IOWarp is a direct evolution of Hermes, a multi-tiered distributed input/output (I/O) buffering system that was previously developed by Sun, Kougkas, and their research team. IOWarp incorporates the key features from Hermes while adding new functionalities and optimizations to handle the complexities of modern scientific workflows, especially those incorporating artificial intelligence.
“This is the third multi-million-dollar NSF grant that our group has received over the last six years,” Sun says. “From the Hermes data management and transfer systems for high performance computing, to the ChronoLog data systems for cloud computing, to the current IOWarp system for AI applications, we have extended our research horizon from foundation to application and established our leading position in the nation. This award is a great recognition for us and for Illinois Tech.”
The new system has the potential to ease data management in three main ways. It addresses the challenges of managing the diverse data types and formats required across the different workflow stages of modern scientific pursuits. It aims to reduce the amount of data transferred through mechanisms such as tiered content organization and content operators. And it supports a wide variety of data sources.
“When I think about how IOWarp could accelerate breakthrough discoveries in fields ranging from atmospheric science to biomedical research by streamlining how researchers work with complex datasets, it’s hard not to be enthusiastic about the impact this could have on scientific progress as a whole,” Kougkas says.
IOWarp features a new natural language interface driven by WarpGPT, a suite of AI technologies being developed by the team to assist scientists in exploring data dynamics using natural language, the ultimate interface.
WarpGPT makes complex analyses and explorations as easy as asking a question, which democratizes data access and analysis by reducing coding barriers, unlocking complex insights, automating data management, and creating a more transparent and reproducible way to document data analysis steps compared to complex code.
“IOWarp’s natural language interface is not just a tool—it’s a vision for the future of scientific data management,” Kougkas says. “By enabling scientists to interact with their data in a way that feels natural and intuitive, it empowers them to unlock the full potential of their research and accelerate discoveries that benefit us all.”
Another key differentiator is IOWarp’s novel data representation termed “content,” which acts like a universal adapter to streamline and simplify data management. It does this by taking data from different sources, such as complex scientific instruments or simulations, and transforming it into a standardized format that any application can understand. It also reduces data bottlenecks and allows researchers to ask complex questions in natural language rather than code, making it easier for AI to understand the data and extract valuable insights from it.
“The biggest hurdle for the IOWarp team is convincing the scientific community to adopt this new platform, especially since many researchers already rely on established data management solutions,” Kougkas says. “The team needs to showcase IOWarp’s advantages and directly address concerns that researchers may have about things such as compatibility with their current systems, the time it takes to learn a new platform, and how easily IOWarp can mesh with their existing tools.”
He says a key part of this challenge lies in building a strong and active open-source community around IOWarp. A vibrant community will drive the project’s ongoing development, ensure it stays up to date with technological advances, and keep it relevant in the fast-paced world of scientific computing.
“As someone passionate about advancing scientific discovery, I see tremendous potential in how IOWarp can help scientists spend less time wrestling with data management and more time focusing on their core research questions,” Kougkas says. “The open-source nature of the project adds another layer of excitement—it means we’re not just building a tool but fostering a community that can continuously evolve and improve the platform.”
Sun and Kougkas are working with researchers at the University of Utah on IOWarp, as well as The HDF Group, which will help build the software.
You can check out the team’s GitHub, follow its developments, or join the IOWarp community.
Disclaimer: Research reported in this publication is supported by the National Science Foundation under Award Number 2411318. This content is solely the responsibility of the authors and does not necessarily represent the official views of the National Science Foundation.