Publication Date
1 November 2020

A Performance-Portable Nonhydrostatic Atmospheric Dycore for the Energy Exascale Earth System Model Running at Cloud-Resolving Resolutions

Subtitle
HOMMEXX-NH: A portable non-hydrostatic atmosphere dycore for E3SM.
Image
Achieved simulated years per day (SYPD) for different implementations and resolutions.
Science

We developed a single implementation of E3SM's non-hydrostatic atmosphere dynamical core (dycore) that runs on a variety of supercomputer architectures, including many-core CPUs and general-purpose graphics processing units (GPGPUs).

Impact

This work is the foundation on which the physics parametrizations in E3SM (i.e., packages that approximate atmospheric processes that the dycore does not fully resolve at the given grid scale) can build to obtain a code base that runs efficiently on a variety of architectures. A performance-portable implementation of the full atmosphere component of E3SM will permit some of the first decade-long cloud-resolving climate simulations, allowing scientists to address the uncertainties that arise from approximating cloud systems.

Summary

The non-hydrostatic atmosphere dynamical core (HOMME-NH) of the Energy Exascale Earth System Model (E3SM) was rewritten from the original, CPU-centric Fortran 90 implementation into a C++ implementation (HOMMEXX-NH) that uses the C++ library Kokkos to handle on-node threaded parallelism. Kokkos allowed us to maintain a single implementation capable of running on a variety of HPC architectures, including conventional CPUs, many-core CPUs, and GPUs. To test performance on all three, we chose the Next Generation Global Prediction System (NGGPS) benchmark, a community benchmark for cloud-resolving atmosphere models that includes 10 passive tracers. On CPU systems, the new implementation proved to be as fast as the original, and sometimes slightly faster. More importantly, when running on the GPUs of the full OLCF Summit supercomputer at 3 km horizontal resolution, HOMMEXX-NH achieved 0.97 Simulated Years Per Day (SYPD). This throughput is roughly one order of magnitude higher than the SYPD obtained when running only on the Summit CPUs or on the full NERSC Cori KNL supercomputer.
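
To make the throughput figure concrete, the short sketch below converts between SYPD and wall-clock time. It is only an illustration of the arithmetic, not part of HOMMEXX-NH: the function names `sypd_from_run` and `wallclock_days` are hypothetical, and the 365-day model year is an assumption (the calendar used in the benchmark is not stated here).

```python
SECONDS_PER_DAY = 86400.0


def sypd_from_run(simulated_days: float, wallclock_seconds: float,
                  days_per_year: float = 365.0) -> float:
    """Simulated years per wall-clock day for a single run.

    days_per_year=365.0 is an assumed model calendar, not a value
    taken from the benchmark description.
    """
    if wallclock_seconds <= 0:
        raise ValueError("wallclock_seconds must be positive")
    simulated_years = simulated_days / days_per_year
    wallclock_days_used = wallclock_seconds / SECONDS_PER_DAY
    return simulated_years / wallclock_days_used


def wallclock_days(simulated_years: float, sypd: float) -> float:
    """Wall-clock days needed to simulate `simulated_years` at throughput `sypd`."""
    if sypd <= 0:
        raise ValueError("sypd must be positive")
    return simulated_years / sypd


if __name__ == "__main__":
    # A decade-long simulation at the reported 0.97 SYPD.
    days = wallclock_days(10.0, 0.97)
    print(f"10 simulated years at 0.97 SYPD: {days:.1f} wall-clock days")
```

At the reported 0.97 SYPD, a decade-long simulation would occupy the machine for a little over ten days of wall-clock time, whereas a throughput ten times lower would stretch the same run to more than a hundred days.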

Point of Contact
Luca Bertagna
Institution(s)
Sandia National Laboratories (SNL)