Figure 3. Projected changes in global mean temperature (top
row) and energy balance at the TOA (N) (bottom row). Each panel shows
changes in the AOGCM (x-axis) against the EBM2 emulation (y-axis). Each
point represents an annual mean during 1915-2014.
3.4 Future near-surface temperature projections
We compare temperature emulations for the twenty-first century from EBM2
based on the different methods for calibrating λ and ε (Figure 4).
Results are shown for five of the eight AOGCMs, those for which the most
complete CMIP6 data are available. Results for other models and
experiments are shown in Figure S4.
The performance of the abrupt-4xCO2 calibration varies greatly between
the AOGCMs and is typically worse than that of the step model (Figure S4).
For four of the AOGCMs, the emulations of SSP2-4.5 deteriorate during
the twenty-first century. The errors in the emulations are correlated
with the magnitude of the forcing and peak near the end of the
twenty-first century for total and GHG forcing and early in the
twenty-first century for aerosol forcing. The exception is MIROC6, for
which the abrupt-4xCO2-calibrated EBM2 performs well throughout
1850-2100 and across the three simulations. For NorESM2-LM, SSP2-4.5 is
relatively closely emulated but SSP2-4.5-AER is not. Optimization of the
λ and ε parameters (the “1850-2100” calibration in Figure 4) yielded
close emulations for all of the AOGCMs and across the three experiments.
Similarly close emulations were also achieved by minimizing the RMSE
over 2015-2100 (not shown). Minimizing the RMSE for the later years of
the projection, when the temperature anomalies are largest, is key.
The “1850-2014” calibration yields a close emulation of temperatures
to 2014 but errors increase strongly after the calibration period.
Extending the calibration period from 1850-2014 to 1850-2040 (not shown)
does improve the emulation to 2040 but not always after 2040.
Importantly, it does not mitigate the risk of large emulation errors
outside the calibration period and its impact varies greatly between
AOGCMs and between different experiments for the same AOGCM.
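The dependence on the calibration window can be sketched in code. The example below is a minimal illustration, not the calibration code used for this study. It assumes a common two-layer EBM form with an efficacy factor ε, with λ taken as a positive feedback parameter. The heat capacities C and Cd, the exchange coefficient gamma, the grid-search ranges, and the synthetic forcing and "AOGCM" target are all hypothetical stand-ins. A simple grid search replaces whatever optimizer was actually used.

```python
import math
import random

def ebm2(forcing, lam, eps, C=8.0, Cd=100.0, gamma=0.7):
    """Integrate a two-layer EBM with annual forward-Euler steps.

    Assumed form (an illustration, not taken from the text):
      C  dT/dt  = F - lam*T - eps*gamma*(T - Td)
      Cd dTd/dt = gamma*(T - Td)
    Returns the surface temperature anomaly for each year.
    """
    T = Td = 0.0
    temps = []
    for F in forcing:
        dT = (F - lam * T - eps * gamma * (T - Td)) / C
        dTd = gamma * (T - Td) / Cd
        T, Td = T + dT, Td + dTd
        temps.append(T)
    return temps

def rmse(a, b, start, stop):
    """RMSE between two series over the index window [start, stop)."""
    n = stop - start
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(start, stop)) / n)

def calibrate(forcing, target, start, stop):
    """Grid-search lam and eps to minimise the RMSE over [start, stop)."""
    best = None
    for i in range(41):                      # lam: 0.50 ... 2.50
        lam = round(0.5 + 0.05 * i, 2)
        for j in range(31):                  # eps: 0.50 ... 2.00
            eps = round(0.5 + 0.05 * j, 2)
            err = rmse(ebm2(forcing, lam, eps), target, start, stop)
            if best is None or err < best[0]:
                best = (err, lam, eps)
    return best[1], best[2]

# Hypothetical stand-ins: a smooth forcing ramp and a noisy "AOGCM" series.
years = list(range(1850, 2101))
forcing = [5.0 * (k / 250.0) ** 2 for k in range(len(years))]
rng = random.Random(0)
target = [t + rng.gauss(0.0, 0.1) for t in ebm2(forcing, 1.3, 1.2)]

i2014 = years.index(2014) + 1                # exclusive end of 1850-2014
full = calibrate(forcing, target, 0, len(years))
early = calibrate(forcing, target, 0, i2014)

# Report in-window (1850-2014) and post-2014 RMSE for each calibration.
for name, (lam, eps) in (("1850-2100", full), ("1850-2014", early)):
    emu = ebm2(forcing, lam, eps)
    print(name, lam, eps,
          round(rmse(emu, target, 0, i2014), 3),
          round(rmse(emu, target, i2014, len(years)), 3))
```

By construction, each calibration has the lowest RMSE over its own window among the grid candidates. Nothing constrains its error outside that window, which is the risk described above.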
To investigate the impact of using a calibration from one experiment for
a different experiment, the “1850-2100” calibration from SSP2-4.5 was
applied to the SSP2-4.5-GHG and SSP2-4.5-AER experiments (the
“SSP2-4.5” calibration in Figure 4). For both SSP2-4.5-GHG and
SSP2-4.5-AER, the error for the “SSP2-4.5” calibration is greater than
for the “1850-2100” calibration. The impact varies between models and
experiments in both its size and its temporal behaviour. For CanESM5,
for instance, the difference in temperature emulation becomes evident
early in the twentieth century for SSP2-4.5-AER but only early in the
twenty-first century for SSP2-4.5-GHG. Bespoke
parameter calibrations for different ERF scenarios are necessary,
therefore, to achieve close emulations throughout 1850-2100. This result
is important because it demonstrates that emulator performance can be
poor for out-of-sample predictions, yet there is no clear a priori way
to know if this will be the case. This poses a problem since the value
of emulators lies in their use for creating out-of-sample scenarios
where AOGCM simulations do not exist and cannot be readily performed.
The average of the emulations for individual models (Figure 4 “Ensemble
mean”) has relatively small RMSEs (except for the 1850-2014
calibration). This is due, in part, to averaging of interannual
variability across the ensemble of emulations. Further, the ensemble
mean generally has smaller RMSEs than an emulation in which the ensemble
mean ERF is used to emulate the ensemble temperature projection (Figure
4 “Ensemble emulation”).
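The effect of averaging on the RMSE can be illustrated with a short sketch. The data below are synthetic and hypothetical: each "AOGCM" series is a smooth trend plus independent interannual noise, and each "emulation" is the smooth trend alone. The sketch only shows the arithmetic of comparing the ensemble-mean RMSE with the individual RMSEs. It does not reproduce the values in Figure 4.

```python
import math
import random

def rmse(a, b):
    """RMSE between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def ensemble_mean(series):
    """Average a list of equal-length series year by year."""
    return [sum(vals) / len(vals) for vals in zip(*series)]

rng = random.Random(1)
n_models, n_years = 5, 251

# Hypothetical smooth warming per model (the part an emulator can capture).
emul = [[0.01 * k * (1 + 0.1 * m) for k in range(n_years)]
        for m in range(n_models)]
# Hypothetical AOGCM output: the same trend plus independent interannual noise.
aogcm = [[x + rng.gauss(0.0, 0.12) for x in e] for e in emul]

individual = [rmse(e, a) for e, a in zip(emul, aogcm)]
ens = rmse(ensemble_mean(emul), ensemble_mean(aogcm))

print("individual RMSEs:", [round(v, 3) for v in individual])
print("ensemble-mean RMSE:", round(ens, 3))
```

Whatever the data, the RMSE of the ensemble mean cannot exceed the quadratic mean of the individual RMSEs, because the square of an averaged error is at most the average of the squared errors. Independent interannual variability makes the reduction larger.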
Finally, while the optimization method yields unique parameter solutions,
there is a near-linear trade-off between the λ and ε parameters when
minimizing the RMSE (Figure S5). For the same RMSE, there are solutions
with a strong feedback (λ) and a weak pattern effect (ε), and solutions
with a weak feedback and a strong pattern effect. This shows that
optimized values for the λ and ε parameters may not be robust estimates
of climate feedback or the AOGCM pattern effect.
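One possible way to see why such a trade-off can arise is sketched below. It assumes a common two-layer EBM form with efficacy, in which λ is a positive feedback parameter, γ is a deep-ocean heat exchange coefficient, C is the upper-layer heat capacity and T_d is the deep-layer temperature. The symbols γ, C and T_d are assumptions introduced for illustration and may not match the EBM2 formulation exactly.

```latex
% Assumed surface-layer equation of a two-layer EBM with efficacy \varepsilon
\begin{align}
C \frac{dT}{dt} &= F - \lambda T - \varepsilon\gamma\,(T - T_d) \\
                &= F - (\lambda + \varepsilon\gamma)\,T + \varepsilon\gamma\,T_d
\end{align}
% To the extent that T_d remains small relative to T, the surface response
% depends mainly on the sum \lambda + \varepsilon\gamma, so that
\begin{equation}
\lambda + \varepsilon\gamma \approx \text{const}
\quad\Longrightarrow\quad
\frac{d\varepsilon}{d\lambda} \approx -\frac{1}{\gamma}
\end{equation}
```

Under these assumptions, a stronger feedback can be offset by a weaker pattern effect at nearly the same RMSE, which is consistent with the near-linear relationship described above.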