Department of Energy Announces $23.9 Million for Research on Next-Generation Data Management and Scientific Data Visualization

Research will advance big data movement and analysis, develop tools for visually exploring data and communicating results

Line Pouchard, a senior researcher with the Computational Science Initiative's Center for Data-Driven Discovery, is Principal Investigator of Scalable Metadata and Provenance Services for Reproducible Hybrid Workflows and will oversee all technical research related to the project.

The following news release was issued by the U.S. Department of Energy (DOE), announcing $11.4 million for next-generation data management research. The awards are part of a nearly $24 million funding package to support foundational research in data management, seeking solutions to manage and use the increasingly massive data sets produced by scientific experiments and DOE supercomputers every year. Line Pouchard, of Brookhaven Lab’s Computational Science Initiative (CSI), will lead the Scalable Metadata and Provenance Services for Reproducible Hybrid Workflows project that includes scientists from Argonne National Laboratory, Texas State University, and CSI. The project aims to make hybrid, artificial intelligence-high-performance computing (AI-HPC) workflows and other AI-enabled applications easily reproducible while preserving key aspects of Findable, Accessible, Interoperable, and Reusable (FAIR) data. Currently, data sets and workflows rarely meet FAIR criteria in the AI-HPC context. Such reproducibility will support broader use of AI applications and hybrid workflows at scale in science, engendering trust in their results.

WASHINGTON, D.C.—Today, the U.S. Department of Energy (DOE) announced $23.9 million in funding for ten projects in advanced scientific data management and visualization.

Foundational research in data management will address challenges stemming from the increasingly massive data sets produced by scientific experiments and supercomputers. Innovative and intuitive data visualization approaches will support scientific discovery, decision-making, and communication based on that data.

“The new capabilities in data management and visualization these projects develop will help make the most of the deluge of data generated by modern scientific experiments and simulations,” said Barbara Helland, DOE Associate Director of Science for Advanced Scientific Computing Research. “These efforts will enable data to be processed and stored at higher rates across the edge, cloud, and high-performance computing environments, and develop new visualization methods to explore that data, form hypotheses, and convey conclusions to a broad spectrum of audiences.”

Improvements in data management are expected to facilitate discovery in a wide range of fields, from materials science and chemistry to climate modeling, the development of new clean energy sources, and new approaches to increasing energy efficiency and reducing energy consumption. Projects include research on optimizing the management of massive amounts of data that must be moved and reproducibly analyzed using sophisticated mathematical techniques, including machine learning, in systems that provide both speed and flexibility. Supported research will also advance innovative techniques that exploit smart storage and networking hardware, which may provide breakthroughs in addressing the data challenges scientists and engineers face.

Advances in visualization techniques will better enable interdisciplinary collaboration and enhance communication across domains. Projects include research on new techniques and theory needed to develop informative and interactive visualizations of complex scientific data of interest to DOE’s mission space, from data describing cosmological, climate, or Earth systems to data describing advanced manufacturing processes. The development and deployment of visualization tools that incorporate human-centric and interoperable design for scientific computing and simulations is key to avoiding bespoke solutions that limit the engagement of a broader range of domain scientists. Supported advances will address the rapid expansion of data generation and availability, the complexity of data types, new visualization technologies becoming available at the edge, and the demand for decisions to be made at the edge.

The projects were selected by competitive peer review under the DOE Funding Opportunity Announcements (FOAs) “Management and Storage of Scientific Data” and “Data Visualization for Scientific Discovery, Decision-Making, and Communication.”

Total funding is $23.9 million for projects lasting up to three years, with $12.1 million in Fiscal Year 2022 dollars and outyear funding contingent on congressional appropriations. The list of projects and more information can be found here.
