We use simple pattern scaling and time-shift to emulate changes in a set of climate extreme indices under future scenarios, and we evaluate the emulators' accuracy. We propose an error metric that separates systematic emulation errors from discrepancies between emulated and target values due to internal variability, taking advantage of the availability of climate model simulations in the form of initial-condition ensembles. We compute the error metric at grid-point scale and present the results both geographically resolved and aggregated as global averages. We use a range of scenarios spanning global temperature increases by the end of the century of 1.5 °C and 2.0 °C compared to a pre-industrial baseline, and two higher trajectories, RCP4.5 and RCP8.5. With this suite of scenarios, we can test how the size of the temperature gap between the emulation origin and target scenarios affects the error.
We find that in the emulation of most indices the dominant source of discrepancy is internal variability. For at least one index, however, which counts exceedances of a high-temperature threshold, significant portions of the globally aggregated discrepancy and of its regional pattern originate from the systematic emulation error. The metric also highlights a fundamental difference between the two methods in how they represent internal variability, whose magnitude is significantly altered by simple pattern scaling. This aspect needs to be considered when using these methods in applications where preserving variability for uncertainty quantification is important.
We propose our metric as a diagnostic tool that facilitates the formulation of scientific hypotheses on the reasons for the error. At the same time, we show that for many impact-relevant indices these two well-established emulation techniques perform accurately when measured against internal variability, establishing the fundamental condition for using them to represent climate drivers in impact modeling.
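As an illustration of the ideas summarized above, the following is a minimal sketch in Python with NumPy, not the implementation used in this study. It assumes that pattern scaling multiplies a grid-point change field by the ratio of global mean warming levels, that time-shift selects the first time step at which the origin scenario reaches the target warming, and that the discrepancy is expressed in units of the target ensemble's standard deviation. The function names, array layouts, and the specific normalization are hypothetical choices made for this example.

```python
import numpy as np


def pattern_scale(origin_change, origin_dT, target_dT):
    """Scale a grid-point change field by the ratio of global warming levels.

    Assumption: local change is proportional to global mean warming.
    """
    return np.asarray(origin_change, dtype=float) * (target_dT / origin_dT)


def time_shift_index(origin_gmt, target_dT):
    """Return the first time index at which the origin scenario's global
    mean warming reaches target_dT (hypothetical selection rule)."""
    hits = np.nonzero(np.asarray(origin_gmt) >= target_dT)[0]
    if hits.size == 0:
        raise ValueError("origin scenario never reaches the target warming")
    return int(hits[0])


def standardized_error(emulated_members, target_members):
    """Ensemble-mean discrepancy divided by the target ensemble's spread.

    Both inputs have shape (n_members, n_gridpoints). Values well below 1
    in absolute terms suggest the discrepancy is small relative to
    internal variability (illustrative normalization only).
    """
    emulated_members = np.asarray(emulated_members, dtype=float)
    target_members = np.asarray(target_members, dtype=float)
    diff = emulated_members.mean(axis=0) - target_members.mean(axis=0)
    sigma = target_members.std(axis=0, ddof=1)  # sample std across members
    return diff / sigma
```

In such a sketch, averaging the absolute standardized error over grid points (with area weights in practice) would give a global summary, while the per-grid-point values would give the geographically resolved view.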