Brookhaven Lab to Lead Software Development Project and Partner on Data Co-Design Center for DOE's Exascale Computing Project

Scientists will help build the software infrastructure required for exascale computing


(Clockwise from left) Wei Xu, Klaus Mueller, Shinjae Yoo, and Kerstin Kleese van Dam of Brookhaven Lab's Computational Science Initiative are partnering with Argonne and Oak Ridge National Laboratories and several universities on CODAR, one of four co-design center projects that the U.S. Department of Energy's Exascale Computing Project (ECP) recently funded. Not pictured: Brookhaven team members Barbara Chapman (who is also leading an ECP-funded software development project called SOLLVE) and Abid Malik, both of the Computational Science Initiative.

UPTON, NY—The U.S. Department of Energy's (DOE) Exascale Computing Project (ECP) just awarded $34 million in first-year funding to 35 software development proposals and $12 million to four co-design center proposals. Scientists at the DOE's Brookhaven National Laboratory are leading one of these software development projects, SOLLVE, and contributing to one of these co-design centers, CODAR.

Exascale computing refers to computing systems that are at least 50 times faster than the nation's most powerful supercomputers used today. The ECP is responsible for the planning, execution, and delivery of technologies—including software, applications, hardware, and early testbed platforms—needed for the nation to effectively design and run the exascale systems of the future.    

"The funding of these software development projects and co-design centers, following our recent announcement for application development awards, signals the momentum and direction of ECP as we bring together the necessary ecosystem and infrastructure to drive the nation's exascale imperative," said ECP Director Paul Messina.

In September 2016, ECP announced the selection of 15 fully funded and seven seed-funded application development projects, with a total funding of $39.8 million. Brookhaven Lab is a partner on two of the fully funded projects: one in computational chemistry, and the other in computational physics.

"Brookhaven is proud to contribute at all levels to this exciting capability development that will enable DOE to meet its scientific and national security mission needs over the next decade," said Kerstin Kleese van Dam, director of Brookhaven's Computational Science Initiative, which is addressing the challenges of exascale computing as part of its research portfolio.

SOLLVE: a programming model for exascale computing applications

Most ECP application development projects plan to use OpenMP (Open Multiprocessing), an industry standard for programming shared-memory parallel computers, as part of their strategy for reaching exascale performance levels. However, with respect to the requirements of these applications, there are critical gaps in OpenMP functionality that have resulted from the extremely rapid changes in hardware technology and programming languages relevant to exascale computing.

Here is where the Brookhaven-led software development project called SOLLVE—for Scaling OpenMP via LLVM (Low-Level Virtual Machine) for Exascale Performance and Portability—comes in.

"Obtaining exascale performance on tomorrow's largest computer systems will only be possible if the resources in each compute node are well utilized. OpenMP is key to achieving node resource efficiency on today's computers. In this development project, we intend to get OpenMP ready for the challenges of tomorrow," said project lead Barbara Chapman, director of Brookhaven's Computer Science and Mathematics Group, who is also a professor in the Applied Mathematics and Statistics Department and the Computer Science Department at Stony Brook University. Chapman's team includes collaborators from Argonne, Lawrence Livermore, and Oak Ridge National Laboratories, as well as Rice University and the University of Illinois at Urbana-Champaign.

The SOLLVE team will design, implement, and help standardize key OpenMP functionality features identified by ECP application developers. These features include the ability to program new kinds of memory, efficiently move complex data structures between different memories, create portable code that maintains high performance across various exascale architectures, and support the latest standards in programming languages—especially C++, which ECP applications are trending toward. The team will also ensure that OpenMP's full feature set is available to ECP participants.
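For readers unfamiliar with OpenMP, the following is a minimal sketch of how the existing standard (version 4.5) already expresses node-level parallelism and data movement between memories. It is an illustration only, not a SOLLVE extension: the function name saxpy and its parameters are hypothetical, and the loop falls back to the host if no accelerator is available or if the compiler ignores the pragma.

```cpp
#include <cassert>
#include <vector>

// Computes y = a * x + y element by element.
// Hypothetical example function; not part of SOLLVE or any ECP code.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const long n = static_cast<long>(x.size());
    const float* xp = x.data();
    float* yp = y.data();

    // map(to: ...) copies x into the device's memory before the loop;
    // map(tofrom: ...) copies y in and copies the result back afterwards.
    // If the code is built without OpenMP support, the pragma is ignored
    // and the loop runs serially on the host with the same result.
    #pragma omp target teams distribute parallel for \
        map(to: xp[0:n]) map(tofrom: yp[0:n])
    for (long i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

The map clauses here handle flat arrays; moving more complex data structures and targeting new kinds of memory are among the gaps the paragraph above describes.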

CODAR: an exascale co-design process focused on data analysis and reduction methods

Designing an exascale computer system involves the integration of many components: hardware, software, numerical methods, programming models, algorithms, and applications. A coordinated development process in which developers co-design these components within the context of constraints such as development costs, power efficiency, and scalability is critical to reaching exascale performance. "Co-design centers" help guide this process.

The co-design center that Brookhaven is partnering on—Co-Design Center for Online Data Analysis and Reduction at the Exascale, or CODAR—is focused on developing and implementing services to analyze and reduce data online before the data are written to disk for possible further offline analysis.

Exascale systems are projected to provide unprecedented increases in computational speed, but the input/output rates of transferring the computed results to storage disks are not expected to keep pace. Given this disparity, it will be infeasible at the exascale for scientists to save all of their scientific results for offline analysis.

With its partners—Argonne (project lead) and Oak Ridge National Laboratories and Brown, Rutgers, and Stony Brook universities—Brookhaven will develop the services required to output just the data needed by the application to enable scientific insight at the exascale. CODAR team members, many of whom are partners on ECP application and software development projects, will develop data analysis and reduction methods tailored to the needs of these projects.
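As a rough illustration of what reducing data online can mean, the sketch below keeps a running summary (count, mean, and variance) of values as a simulation produces them, so that only the summary, and not every value, needs to be written to disk. This is a generic streaming technique (Welford's algorithm) chosen for illustration; it is not taken from CODAR, and the RunningStats type is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Streaming summary of a sequence of values (Welford's algorithm).
// Hypothetical type for illustration; not a CODAR service.
struct RunningStats {
    std::size_t count = 0;
    double mean = 0.0;
    double m2 = 0.0;  // sum of squared deviations from the current mean

    // Fold one newly computed value into the summary; the value itself
    // does not need to be stored afterwards.
    void update(double value) {
        ++count;
        const double delta = value - mean;
        mean += delta / static_cast<double>(count);
        m2 += delta * (value - mean);
    }

    // Sample variance; requires at least two values.
    double variance() const {
        assert(count >= 2);
        return m2 / static_cast<double>(count - 1);
    }

    double stddev() const { return std::sqrt(variance()); }
};
```

A simulation loop would call update() on each value as it is computed and write out only the three fields at the end, trading the full data set for a fixed-size summary.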

"The Brookhaven team will focus its efforts on developing machine learning solutions for data reduction and analysis, creating visual analytics tools to assess and steer the reduction and analysis process, and enabling the reproducibility of this process so that the integrity of scientific results can be maintained. Underpinning these efforts will be the development of runtime system support for the data-intensive applications," said Kleese van Dam, who is leading Brookhaven's efforts on the project. Barbara Chapman, Abid Malik, Klaus Mueller, Wei Xu, and Shinjae Yoo are on the Brookhaven team.

The ECP is a collaborative effort of two DOE organizations—the Office of Science and the National Nuclear Security Administration. As part of President Obama's National Strategic Computing Initiative, ECP was established to develop a capable exascale ecosystem, encompassing applications, system software, hardware technologies and architectures, and workforce development to meet the scientific and national security mission needs of DOE in the mid-2020s timeframe.

Brookhaven National Laboratory is supported by the Office of Science of the U.S. Department of Energy. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.

