2 Methods and data

2.1 Impulse-response step model

We use a step model (Good et al. 2011) to provide a benchmark of EBM emulator performance for temperature projections. The step-response function for each AOGCM was derived by dividing the projected temperature changes from a single realization of a CMIP6 abrupt-4xCO2 simulation by the radiative forcing for 4xCO2 (Byrne & Goldblatt 2013). The step-response function was smoothed using cubic splines, and linear regression over years 121-150 was used to extrapolate beyond the 150 years of the abrupt-4xCO2 simulations. Temperature projections from the step model were produced by convolving the annual changes in ERF with the step-response function.
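The convolution can be illustrated with a minimal sketch. It assumes annual time series, a step response already normalized by the 4xCO2 forcing, and a first response value that corresponds to the year of the forcing step. The function names and the indexing convention are illustrative assumptions rather than the implementation used here, and the cubic-spline smoothing is omitted.

```python
import numpy as np


def extend_step_response(step_response, n_years, fit_start=120):
    """Extend a step response by linear regression on its final years.

    fit_start=120 is the zero-based index of year 121, so a 150-year
    response is fitted over years 121-150 (assumed indexing convention).
    """
    r = np.asarray(step_response, dtype=float)
    if n_years <= r.size:
        return r[:n_years]
    years = np.arange(r.size)
    slope, intercept = np.polyfit(years[fit_start:], r[fit_start:], 1)
    extra_years = np.arange(r.size, n_years)
    return np.concatenate([r, slope * extra_years + intercept])


def step_model_projection(erf, step_response):
    """Convolve annual ERF increments with a normalized step response.

    erf           : ERF (W m-2), one value per year, relative to a zero baseline
    step_response : warming per unit forcing (K per W m-2) in each year
                    after a step change in forcing
    """
    erf = np.asarray(erf, dtype=float)
    r = extend_step_response(step_response, erf.size)
    if r.size < erf.size:
        raise ValueError("step response is shorter than the ERF series")
    d_erf = np.diff(erf, prepend=0.0)  # annual changes in ERF
    # T[t] = sum over s <= t of d_erf[s] * r[t - s]
    return np.convolve(d_erf, r)[: erf.size]
```

Under this convention a constant forcing applied from the first year returns the step response scaled by that forcing, which is a quick consistency check on the indexing.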

2.2 Two-layer EBM

In the two-layer EBM (EBM2) (Held et al. 2010; Geoffroy et al. 2013a) the upper layer represents the Earth’s atmosphere, land surface and ocean mixed layer, and the lower layer represents the deep ocean. The rate of temperature change in each model layer is determined from:
\(C_{1}\frac{dT_{1}}{dt}=F+\lambda T_{1}-\varepsilon\gamma(T_{1}-T_{0})\) (1)
\(C_{0}\frac{dT_{0}}{dt}=\gamma(T_{1}-T_{0})\) (2)
where C represents heat capacity, T temperature, F ERF, λ the climate feedback parameter and γ the heat transfer coefficient between the upper layer (layer 1) and the lower layer (layer 0). We follow the formulation of Geoffroy et al. (2013b), which includes an efficacy parameter for deep ocean heat uptake (ε) to account for the forced pattern effect in surface temperature (Stevens et al. 2016). As is commonplace (Geoffroy et al. 2013a, b; Gregory et al. 2015; Cummins et al. 2020), the EBM2 parameters (Table S1) were calibrated for each AOGCM using a single realization of a CMIP6 abrupt-4xCO2 simulation.
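As an illustration of equations (1) and (2), the following sketch integrates the two layers with a forward Euler scheme. The integration scheme, the sub-annual time step, the units (heat capacities in W yr m-2 K-1, time in years) and the parameter values in the usage line are assumptions made for the example and are not the calibrated values of Table S1. With the sign convention of equation (1), λ is negative for a stable climate.

```python
import numpy as np


def run_ebm2(erf, lam, gamma, eps, c1, c0, substeps=10):
    """Integrate the two-layer EBM of equations (1) and (2).

    erf   : annual ERF series F (W m-2)
    lam   : climate feedback parameter (W m-2 K-1), negative when stable
    gamma : heat transfer coefficient between layers (W m-2 K-1)
    eps   : efficacy of deep ocean heat uptake (dimensionless)
    c1,c0 : upper- and lower-layer heat capacities (W yr m-2 K-1)
    Returns end-of-year temperatures (T1, T0) as two arrays.
    """
    erf = np.asarray(erf, dtype=float)
    dt = 1.0 / substeps  # time step in years
    t1 = t0 = 0.0
    t1_out = np.empty(erf.size)
    t0_out = np.empty(erf.size)
    for i, f in enumerate(erf):
        for _ in range(substeps):
            exchange = gamma * (t1 - t0)
            dt1_dt = (f + lam * t1 - eps * exchange) / c1  # equation (1)
            dt0_dt = exchange / c0                         # equation (2)
            t1 += dt * dt1_dt
            t0 += dt * dt0_dt
        t1_out[i] = t1
        t0_out[i] = t0
    return t1_out, t0_out


# Illustrative (not calibrated) parameters with a constant 4 W m-2 forcing.
t1, t0 = run_ebm2(np.full(150, 4.0), lam=-1.0, gamma=0.7, eps=1.2,
                  c1=8.0, c0=100.0)
```

Under constant forcing both layers approach the equilibrium temperature -F/λ, with the lower layer lagging the upper layer because of its larger heat capacity.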

2.3 Calibration of EBM2 using linear optimization

As an alternative to abrupt-4xCO2 calibration, we use a linear optimization algorithm (scipy.optimize.minimize, SciPy v1.6.2) to optimize the λ and ε parameters by minimizing the root mean squared error (RMSE) of the emulated temperatures relative to the AOGCM temperatures. The temperature projections are less sensitive to changes in the other EBM2 parameters (i.e., C0, C1, and γ), so these parameters are unchanged from their abrupt-4xCO2 calibrations. We also applied the linear optimization methodology to the abrupt-4xCO2 simulations and affirmed the calibrated parameter values of Geoffroy et al. (2013b).
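A minimal sketch of this calibration step is given below. It fits λ and ε to a target temperature series with scipy.optimize.minimize while holding C0, C1 and γ fixed. The solver method (Nelder-Mead), the starting values, the forward Euler integrator and the synthetic target series are all assumptions made for illustration; the text above does not specify the solver settings, and a real application would use the AOGCM temperatures as the target.

```python
import numpy as np
from scipy.optimize import minimize


def ebm2_surface_temperature(erf, lam, eps, gamma, c1, c0, substeps=4):
    """Upper-layer temperature from equations (1) and (2), forward Euler."""
    dt = 1.0 / substeps
    t1 = t0 = 0.0
    out = np.empty(len(erf))
    for i, f in enumerate(erf):
        for _ in range(substeps):
            exchange = gamma * (t1 - t0)
            t1, t0 = (t1 + dt * (f + lam * t1 - eps * exchange) / c1,
                      t0 + dt * exchange / c0)
        out[i] = t1
    return out


def rmse(a, b):
    """Root mean squared error between two series."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))


def calibrate_lambda_eps(target, erf, gamma, c1, c0, x0):
    """Optimize (lambda, epsilon) by minimizing the RMSE against target."""
    def cost(x):
        lam, eps = x
        emulated = ebm2_surface_temperature(erf, lam, eps, gamma, c1, c0)
        return rmse(emulated, target)

    # Nelder-Mead is an assumed choice; it needs no gradient information.
    result = minimize(cost, x0=np.asarray(x0, dtype=float),
                      method="Nelder-Mead")
    return result.x[0], result.x[1], result.fun


# Synthetic check: generate a target with known parameters, then recover them.
erf = np.minimum(0.04 * np.arange(1, 151), 4.0)  # ramp to 4 W m-2, then flat
fixed = dict(gamma=0.7, c1=8.0, c0=100.0)        # held at illustrative values
target = ebm2_surface_temperature(erf, lam=-1.0, eps=1.3, **fixed)
lam_fit, eps_fit, fit_rmse = calibrate_lambda_eps(target, erf,
                                                  x0=(-1.5, 1.0), **fixed)
```

Fitting only two parameters keeps the search well conditioned; the remaining parameters enter the cost function as constants, mirroring the choice to leave them at their abrupt-4xCO2 calibrations.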

2.4 Three-layer EBM

We use a three-layer EBM (EBM3) (Cummins et al. 2020) as a second benchmark for EBM2 performance. We follow the method of Cummins et al. (2020) to calibrate EBM3 parameters for each AOGCM using a single realization of a CMIP6 abrupt-4xCO2 simulation.

2.5 Data

We use projections of global annual mean near-surface temperature and radiative fluxes at the top of atmosphere (TOA) from the CMIP6 archive. We emulate temperatures for eight AOGCMs, selected because data were available for the CMIP6 experiments of interest. For projections of recent and future climate change, the Historical and SSP experiments are used. The Detection and Attribution Model Intercomparison Project (DAMIP) experiments (Gillett et al. 2016) are used for projections of temperature change attributed to different sources of ERF. RFMIP experiments (Pincus et al. 2016; Smith et al. 2021) are used for estimates of ERF during the historical period and ERF projected to 2100 under SSP2-4.5. Following Forster et al. (2013), unforced drift is removed from the AOGCM projections using the preindustrial control simulation.
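The drift correction can be sketched as follows, assuming that drift is estimated as a linear trend fitted to the segment of the preindustrial control simulation that runs parallel to the forced experiment. This is one common approach and is an assumption of the example; the exact procedure is not detailed in this section, and the function name is hypothetical.

```python
import numpy as np


def remove_drift(forced, control):
    """Subtract a linear fit to the parallel control segment.

    forced  : annual global-mean series from a forced experiment
    control : annual series from the preindustrial control simulation,
              aligned so that control[0] matches the start of forced
    Returns the forced series as an anomaly from the fitted control line.
    """
    forced = np.asarray(forced, dtype=float)
    control = np.asarray(control, dtype=float)
    years = np.arange(control.size)
    slope, intercept = np.polyfit(years, control, 1)  # linear drift estimate
    drift = slope * np.arange(forced.size) + intercept
    return forced - drift
```

Because the fitted line includes the control mean, the result is both drift-corrected and expressed relative to the preindustrial baseline.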