CMDV-SM: A Global Climate Model Software Modernization Surge

CMDV-SM started in September 2016. The project is managed as nine tasks, each of which supports the overall project goals.

Early progress supports the value propositions in the proposal. The incorporation of people, libraries, and processes from the broader DOE computational science community is adding to productivity. As of September 2017, here are some task highlights:

  • One highlight for the PIs was a remark by a climate scientist that because of this project’s efforts to develop targeted testing of individual physics modules, she could now work on her laptop instead of logging into supercomputers. This will lead to a huge productivity gain for developers. 
  • The development of a full-featured Single Column Model with a library of documented configurations is already having an impact on the broader community developing atmospheric physics parameterizations.
  • We are beginning to deploy climate reproducibility tests to automatically distinguish round-off-level code changes, such as those often introduced by performance optimizations, from true climate-changing bugs.
  • Our verification effort has found four code or algorithm issues in a physics parameterization that affect precipitation predictions, and has also fixed discrepancies in energy definitions between separate libraries.
  • We have added a capability to compute atmosphere dynamics and physics concurrently on different cores, exposing much more parallelism in the model to increase the model throughput.
  • We are making quick progress towards deploying a new accurate, scalable, and maintainable Coupler code based on the MOAB mesh database library. 
  • We have made good progress on rewriting the atmosphere dycore code in C++ to use a single performance-portable code implementation that runs well on CPUs, Phis, and GPUs.
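
The climate reproducibility idea in the highlights above can be illustrated with a small statistical sketch. It assumes each code version has been run as a short ensemble and reduced to one scalar per member (for example, a global-mean temperature), and it uses a two-sample Kolmogorov-Smirnov test as a stand-in. The function names, the 5% significance level, and the sample data are all hypothetical, and this is not necessarily the test the project deploys.

```python
import math

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, then compare the CDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_critical(n, m, alpha=0.05):
    """Large-sample critical value for the two-sample KS test."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))

def same_climate(baseline, candidate, alpha=0.05):
    """True when the two ensembles are statistically indistinguishable."""
    d = ks_statistic(baseline, candidate)
    return d <= ks_critical(len(baseline), len(candidate), alpha)

if __name__ == "__main__":
    # Hypothetical ensembles of 30 global-mean values (illustrative numbers only).
    baseline = [287.0 + 0.01 * k for k in range(30)]
    roundoff = [x + 1e-12 for x in baseline]  # round-off-level change
    buggy = [x + 0.5 for x in baseline]       # systematic shift
    print("round-off change accepted:", same_climate(baseline, roundoff))
    print("shifted climate accepted:", same_climate(baseline, buggy))
```

A bit-for-bit comparison would reject both candidate ensembles, whereas a distribution-level comparison lets the round-off-level change pass and still flags the shifted one.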

A key success of ACME-SM, over and above the reaching of task milestones, is the incorporation of new staff who are developing skills in areas that are critical to the long-term health of the ACME project. We have multiple developers working on the coupler, atmosphere performance, and software infrastructure. We are also training new integrators and exposing several staff to new levels of project leadership.

The E3SM Software Modernization project under the CMDV program (CMDV-SM) takes advantage of a one-time, three year pulse of funding to tackle software infrastructure issues that would otherwise prevent the E3SM model from reaching its potential. Specifically, we propose changes that improve the trustworthiness of the model, that prepare the code for exascale architectures, and that allow E3SM to tap into cutting-edge computational science.

One focus area of our efforts is increased and improved testing, which is necessary for improved reliability and the ability to rapidly prototype cleaner and more efficient code implementations. Within three years, we intend to provide the E3SM team with a more flexible and efficient (CMake-based) build system, a robust unit-testing capability, a reliable and efficient means for determining whether answer-changing modifications have a practical effect on model solutions, and a greatly expanded set of tests. Another focus of our work aims to verify that the existing code is satisfying the intended set of equations. We will target three components of the atmospheric physics for intensive analysis (MG2 microphysics, CLUBB turbulence/macrophysics/convection, and MAM aerosols), with the rest of the code receiving a more cursory evaluation. The verification effort is expected to provide many new tests which will be incorporated into the testing effort. A single-column model configuration will be needed by both the testing and verification efforts; another task in this project will harden this capability and make it available to the E3SM community.
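
As a rough illustration of what unit testing an isolated physics routine can look like, the sketch below tests a small stand-in function on its own, with no model build or supercomputer involved. The function (the Bolton 1980 approximation for saturation vapor pressure over liquid water), its name, and the chosen checks are assumptions made for illustration; they are not taken from the E3SM code base or its test suite.

```python
import math
import unittest

def saturation_vapor_pressure_hpa(temp_c):
    """Hypothetical stand-in physics routine (Bolton 1980 approximation).

    temp_c is the air temperature in degrees Celsius; the result is in hPa.
    """
    return 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))

class SaturationVaporPressureTest(unittest.TestCase):
    def test_reference_value_at_zero_celsius(self):
        # The exponent vanishes at 0 C, so the formula returns its leading constant.
        self.assertAlmostEqual(saturation_vapor_pressure_hpa(0.0), 6.112, places=12)

    def test_increases_with_temperature(self):
        values = [saturation_vapor_pressure_hpa(t) for t in range(-40, 41, 5)]
        self.assertEqual(values, sorted(values))

    def test_is_positive(self):
        for t in range(-40, 41, 5):
            self.assertGreater(saturation_vapor_pressure_hpa(t), 0.0)

def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SaturationVaporPressureTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)

if __name__ == "__main__":
    run_tests()
```

Checks of this kind (a known reference value, monotonicity, physical bounds) run in well under a second, which is what makes them practical on a laptop.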

We have also targeted three specific areas of the code for intensive rewriting and modernization. The first of these is the interface between atmospheric physical parameterizations and fluid dynamics. By computing physics and dynamics in parallel, the model will run faster, scale to larger core counts, and efficiently handle more sophisticated parameterizations without a time penalty. We will also put substantial effort into modernizing the coupler. In particular, we will revise the way communication between components occurs to take advantage of new capabilities in our mesh database library. The driver will be redesigned to allow for more efficient and flexible specification of numerical coupling algorithms, grids, and data structures. Additionally, we will develop a dynamic load balancing/automated decomposition optimization capability for the coupler and will improve the coupler's regridding capability. Our third target for code modernization is the atmospheric spectral element dycore, which we will rewrite in C++ using Kokkos and other Trilinos libraries. The resulting code will be performance portable, will allow for better uncertainty quantification, and will allow for continued leverage of advanced ASC and ASCR technologies.
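
The data dependency behind computing physics and dynamics in parallel can be shown with a toy sketch. In sequential splitting, physics operates on the state already updated by dynamics, so it must wait. In parallel splitting, both tendencies are computed from the same input state and then summed, so the two computations can proceed at the same time. The one-dimensional "dynamics" and relaxation "physics" below, along with every name and constant, are hypothetical placeholders rather than the model's actual interface. Python threads are used only to express the concurrency; they will not speed up pure-Python arithmetic.

```python
from concurrent.futures import ThreadPoolExecutor

REFERENCE = 0.0  # hypothetical value the toy physics relaxes toward
TAU = 10.0       # hypothetical relaxation time scale

def dynamics_tendency(state):
    """Toy dynamics: first-order upwind advection on a periodic 1-D grid."""
    return [-(state[i] - state[i - 1]) for i in range(len(state))]

def physics_tendency(state):
    """Toy physics: relax every grid point toward REFERENCE."""
    return [-(x - REFERENCE) / TAU for x in state]

def step_sequential(state, dt):
    """Sequential splitting: physics sees the state after the dynamics update."""
    after_dyn = [x + dt * t for x, t in zip(state, dynamics_tendency(state))]
    return [x + dt * t for x, t in zip(after_dyn, physics_tendency(after_dyn))]

def step_parallel(state, dt, pool):
    """Parallel splitting: both tendencies use the same input state."""
    dyn = pool.submit(dynamics_tendency, state)
    phys = pool.submit(physics_tendency, state)
    return [x + dt * (d + p)
            for x, d, p in zip(state, dyn.result(), phys.result())]

if __name__ == "__main__":
    initial = [1.0, 0.0, 0.0, 0.0]
    with ThreadPoolExecutor(max_workers=2) as pool:
        print("parallel  :", step_parallel(initial, 0.1, pool))
    print("sequential:", step_sequential(initial, 0.1))
```

The two schemes give slightly different answers after one step, and the difference shrinks as the time step is reduced. Because concurrent execution changes the answers in this way, it is the kind of modification that reproducibility and verification tests are meant to evaluate.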

A theme which cuts across all aspects of our proposal is to establish good programming practices within the E3SM community. We will do this in part by setting a good example for code verification and testing with our own efforts. We will also make it easy to follow best-practices approaches by creating robust, easy-to-use testing and verification frameworks. Our proposal also contains an explicit education component aimed at encouraging developers to improve their coding knowledge.

In total, these efforts will put the E3SM code base and development team on a much-improved footing for conducting world-class science in support of DOE missions.

Project Term: 2016 to 2019
Project Type: Laboratory Funded Research
