Climate models are the most commonly used tools in climate change studies; however, model biases can make it difficult to employ these models for scientific and stakeholder applications. These biases must therefore be assessed to select models suitable for producing credible projections. As one of the highest-impact extreme weather events, drought influences our society significantly and has the potential to become more dangerous under a warming climate. We therefore put forward a comprehensive drought-feature-based evaluation system and apply it to various climate datasets and hydrologic regions to gain insights into the underlying model biases.
Impacts from drought are related to temporal continuity features like consecutive duration and probability of occurrence, which are ignored in most model evaluation studies. Our proposed evaluation system aims to fill this gap. By using statistical hypothesis testing, the system further builds in a standard to assess the absolute performance of models for each metric, instead of their relative performance. The system also enables the analysis of global models as well as dynamical and statistical downscaling products. Such a feature-based evaluation system can also be developed for other weather events.
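As a minimal sketch of what temporal continuity features could look like in practice, the following Python snippet derives consecutive duration and probability of occurrence from a drought index time series. The function names, the threshold of -1.0, and the example values are illustrative assumptions and are not taken from the evaluation system itself.

```python
import numpy as np

def drought_runs(index, threshold=-1.0):
    """Lengths of consecutive runs where the index is below the threshold.

    The threshold of -1.0 is an assumed, illustrative cutoff.
    """
    dry = np.asarray(index, dtype=float) < threshold
    # Pad with False on both sides so every run has a start and an end.
    padded = np.concatenate(([False], dry, [False]))
    changes = np.diff(padded.astype(int))
    starts = np.flatnonzero(changes == 1)   # first dry step of each run
    ends = np.flatnonzero(changes == -1)    # first non-dry step after each run
    return ends - starts

def drought_features(index, threshold=-1.0):
    """Summarize occurrence probability and duration of dry spells."""
    dry = np.asarray(index, dtype=float) < threshold
    runs = drought_runs(index, threshold)
    return {
        "probability": float(dry.mean()),
        "mean_duration": float(runs.mean()) if runs.size else 0.0,
        "max_duration": int(runs.max()) if runs.size else 0,
        "n_events": int(runs.size),
    }

# Hypothetical monthly drought index values, for illustration only.
example = [0.2, -1.3, -1.8, -0.4, -1.1, -1.5, -2.0, 0.6]
features = drought_features(example)
```

Two series with the same fraction of dry months can differ sharply in how those months cluster, which is why duration is computed separately from probability of occurrence.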
Because of drought’s significant impact and its increasing potential to cause damage under a warming climate, there is a growing need for a standardized suite of metrics addressing how well models capture this phenomenon. In this study, we present a widely applicable set of metrics for evaluating the agreement between climate datasets and observations in the context of drought. Our evaluation system makes two notable advances. First, statistical hypothesis testing is employed to normalize individual scores against the threshold for statistical significance. Second, within each evaluation region and dataset, principal feature analysis is used to select the most descriptive of 11 metrics that capture essential features of drought. Applying the evaluation system to the most advanced reanalysis dataset shows that it can identify high-performing datasets. We show results from applying our metrics package to three characteristically distinct regions in the conterminous US and to several commonly employed climate datasets (CMIP5/6, LOCA, and CORDEX). Analysis of these results yields insights into the underlying drivers of model bias in global climate models, regional climate models, and statistically downscaled models.
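The idea of normalizing a score against the threshold for statistical significance can be sketched as follows. The specific tests used in the evaluation system are not stated here, so this example assumes a two-sample Kolmogorov-Smirnov test as a stand-in and uses the large-sample critical value at the 5% level (coefficient 1.358); the function names are hypothetical. A normalized score below 1 then means the dataset is not distinguishable from observations at that level.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    pooled = np.concatenate((a, b))
    # Empirical CDF of each sample evaluated at every pooled data point.
    cdf_a = np.searchsorted(a, pooled, side="right") / a.size
    cdf_b = np.searchsorted(b, pooled, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def normalized_score(model, obs, c_alpha=1.358):
    """KS statistic divided by its asymptotic critical value.

    c_alpha = 1.358 corresponds to a 5% significance level (assumed here).
    Scores below 1 indicate no significant difference from observations.
    """
    n, m = len(model), len(obs)
    critical = c_alpha * np.sqrt((n + m) / (n * m))
    return ks_statistic(model, obs) / critical

# Illustrative synthetic samples, not real climate data.
obs = np.arange(50, dtype=float)
score_same = normalized_score(obs, obs)           # identical distributions
score_diff = normalized_score(obs + 100.0, obs)   # fully separated samples
```

Because every metric is divided by its own significance threshold, scores from different metrics share a common reference point at 1, which allows an absolute rather than purely relative judgment of performance.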