Light Sources Form Data Solution Task Force
New collaboration between scientists at the five U.S. Department of Energy light source facilities will develop flexible software to easily process big data
February 12, 2020
The five DOE light sources: Brookhaven National Laboratory's National Synchrotron Light Source II (NSLS-II), Lawrence Berkeley National Laboratory's Advanced Light Source (ALS), Argonne National Laboratory's Advanced Photon Source (APS), and SLAC National Accelerator Laboratory's Stanford Synchrotron Radiation Lightsource (SSRL) and Linac Coherent Light Source (LCLS).
Light source facilities are tackling some of today’s biggest scientific challenges, from designing new quantum materials to revealing protein structures. But as these facilities continue to become more technologically advanced, processing the wealth of data they produce has become a challenge of its own. By 2028, the five U.S. Department of Energy (DOE) Office of Science light sources will produce data at the exabyte scale, or on the order of billions of gigabytes, each year. Now, scientists have come together to develop synergistic software to solve that challenge.
With funding from DOE for a two-year pilot program, scientists from the five light sources have formed a Data Solution Task Force that will demonstrate, build, and implement software, cyberinfrastructure, and algorithms that address needs common to all five facilities. These needs range from real-time data analysis capabilities to data storage and archival resources.
“It is exciting to see the progress that is being made by all the light sources working together to produce solutions that will be deployed across the whole DOE complex,” said Stuart Campbell, leader of the data acquisition, management and analysis group at the National Synchrotron Light Source II (NSLS-II), a DOE Office of Science user facility at DOE’s Brookhaven National Laboratory.
In addition, the new software will be designed to facilitate multimodal research—studies that combine data collected from multiple experimental stations, called beamlines. Typically, each beamline at a light source uses custom-built data acquisition software that is incompatible with another beamline’s, making it difficult for scientists to collect and compare data from multiple experimental stations. The task force aims to develop flexible software that can be deployed at multiple beamlines across all five facilities, expanding the possibilities for scientific collaboration.
Members of the task force met at NSLS-II for a project kickoff meeting in August of 2019.
To develop the new software, the task force will start by building on existing solutions already in use at the five light sources. Two of the key components are Bluesky, open-source software created at NSLS-II, and Xi-CAM, which was developed at the Advanced Light Source (ALS) and the Center for Advanced Mathematics for Energy Research Applications, both at DOE’s Lawrence Berkeley National Laboratory. Together, Bluesky and Xi-CAM will provide capabilities like live visualization and interactivity, data processing tools, and the ability to export data in real time into nearly any file format.
Each of the five light sources in the task force is bringing unique tools and skill sets to help develop a more robust and scalable solution for the nation’s light sources to extract scientific knowledge from data.
“There is tremendous enthusiasm at the light sources for solving the data challenge,” said Alexander Hexemer, senior scientist and computing program lead at ALS. “We strongly believe this will be the path forward for light sources to work together in the future.”
With the task force in its early stages, researchers have begun running test experiments on beamlines at NSLS-II and installing Bluesky and Xi-CAM at the Advanced Photon Source, a DOE Office of Science user facility at DOE’s Argonne National Laboratory.
By the end of the two-year pilot project, “we plan to deliver a set of tools that will provide an end-to-end software solution for the targeted scientific areas that can be deployed and used on different beamlines across all the DOE light sources,” Campbell said.
Alongside the task force pilot, the five light sources are working with DOE to develop data systems solutions that will scale to the unprecedented data rates that will be produced in the near future, using the new generation of “exascale” computers being built by DOE.
Brookhaven National Laboratory is supported by the U.S. Department of Energy’s Office of Science. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.