Figure 11. Precipitation skill scores (top) and bias scores
(bottom) vs. StageIV for 6-hr CONUS precipitation in three versions of
C-SHiELD, for events exceeding each of three 6-hr accumulation
thresholds (0.1, 5.0, and 25.0 mm). Skill scores are given
for both Equitable Threat Score (ETS; Hogan and Mason 2012) and
Fractions Skill Score (FSS; Roberts and Lean 2008). C-SHiELD 2017 is
validated from May 2017 to May 2018; C-SHiELD 2018 is validated from
April 2018 to May 2019; C-SHiELD 2019 is validated from January to
December 2019. Validation is performed on the 4-km StageIV grid using
3x3 neighborhoods, corresponding to a 12-km radius.
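The neighborhood verification described in the caption can be illustrated with a minimal sketch of the Fractions Skill Score. This is not the authors' verification code: the function names, the treatment of points outside the domain as non-events, and the toy 5x5 grids are assumptions made for illustration, and plain Python lists are used only to keep the example dependency-free.

```python
def neighborhood_fraction(field, threshold, half_width=1):
    """Fraction of points exceeding `threshold` in each square window.

    half_width=1 gives a 3x3 neighborhood (12 km wide on a 4-km grid).
    """
    ny, nx = len(field), len(field[0])
    window = (2 * half_width + 1) ** 2
    fractions = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            count = 0
            for dj in range(-half_width, half_width + 1):
                for di in range(-half_width, half_width + 1):
                    jj, ii = j + dj, i + di
                    # Assumption: points outside the domain count as non-events.
                    if 0 <= jj < ny and 0 <= ii < nx and field[jj][ii] > threshold:
                        count += 1
            fractions[j][i] = count / window
    return fractions


def fractions_skill_score(forecast, observed, threshold, half_width=1):
    """FSS = 1 - sum((Pf - Po)^2) / (sum(Pf^2) + sum(Po^2))."""
    pf = neighborhood_fraction(forecast, threshold, half_width)
    po = neighborhood_fraction(observed, threshold, half_width)
    pairs = [(f, o) for rf, ro in zip(pf, po) for f, o in zip(rf, ro)]
    mse = sum((f - o) * (f - o) for f, o in pairs)
    ref = sum(f * f + o * o for f, o in pairs)
    return 1.0 - mse / ref if ref else float("nan")


# Toy example: a single 6-mm event, forecast one grid point east of the observation.
observed = [[0.0] * 5 for _ in range(5)]
forecast = [[0.0] * 5 for _ in range(5)]
observed[2][2] = 6.0
forecast[2][3] = 6.0
fss_point = fractions_skill_score(forecast, observed, 5.0, half_width=0)
fss_3x3 = fractions_skill_score(forecast, observed, 5.0, half_width=1)
```

In this toy case the point-by-point comparison (half_width=0) gives no credit for the near miss, whereas the 3x3 neighborhood gives partial credit, which is the motivation for neighborhood-based scores.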
Precipitation forecast skill (Figure 11, top panels) is similar among
all three versions of C-SHiELD. The 2019 version has the smallest
overall bias (Figure 11, bottom panels); earlier versions produced too
much light and too little heavy precipitation. The 2019 version reduced
the diurnal cycle in the bias of light and moderate precipitation,
although a diurnal cycle was still apparent in the bias score for heavy
precipitation, and the 2019 version still had a prominent high bias of
heavy precipitation during the first 30 hours. We
speculate that the re-configuration of the numerical diffusion, which
improved storm placement, and the revised settings for the GFDL
microphysics, which improved structure and evolution of the storms,
combined to improve the biases in the 2019 version.
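For readers unfamiliar with the two point-based scores discussed above, the following is a minimal sketch of the Equitable Threat Score and the conventional frequency bias computed from a 2x2 contingency table. The function names and sample values are illustrative assumptions. The text does not state the exact bias definition used in the figures, so the conventional ratio (forecast events divided by observed events, with 1 meaning unbiased) is assumed here.

```python
def contingency(forecast, observed, threshold):
    """Count hits, misses, false alarms, and correct negatives
    for events defined as accumulation > threshold."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = f > threshold, o > threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def equitable_threat_score(hits, misses, false_alarms, correct_negatives):
    """ETS: threat score adjusted for hits expected by random chance."""
    total = hits + misses + false_alarms + correct_negatives
    if total == 0:
        return float("nan")
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denom = hits + misses + false_alarms - hits_random
    return (hits - hits_random) / denom if denom else float("nan")


def frequency_bias(hits, misses, false_alarms):
    """Ratio of forecast events to observed events (1 = unbiased)."""
    observed_events = hits + misses
    return (hits + false_alarms) / observed_events if observed_events else float("nan")


# Made-up 6-hr accumulations (mm) at eight grid points, scored at 5.0 mm.
forecast = [0.0, 6.0, 7.0, 0.2, 30.0, 0.0, 5.5, 0.0]
observed = [0.0, 8.0, 1.0, 0.0, 26.0, 6.0, 0.0, 0.0]
h, m, fa, cn = contingency(forecast, observed, 5.0)
ets = equitable_threat_score(h, m, fa, cn)
bias = frequency_bias(h, m, fa)
```

Repeating the calculation at each threshold (0.1, 5.0, and 25.0 mm) and each forecast lead time would give curves of the kind the figure describes.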
We use the surrogate severe technique of Sobash et al. (2011) to
validate our 2–5 km updraft helicity (UH) fields against storm reports
from the Storm Prediction Center. This is a well-established method used
for evaluation of convective-scale prediction models (cf.
https://hwt.nssl.noaa.gov/sfe/2018/docs/HWT_SFE_2018_Prelim_Findings_v1.pdf).
We create surrogate severe fields and validate against observed severe
fields to compute FSS and bias scores in C-SHiELD and plot the results
as a function of UH threshold and smoothing radius (Figure 12), similar
to Figure 17 in Sobash et al. (2016). For all versions of C-SHiELD the
highest FSS is found at the largest smoothing radius, 240 km, and for
UH thresholds of 150–200 m² s⁻², with slightly higher or lower
thresholds giving similar skill scores.
The UH threshold giving the best score for C-SHiELD is higher than in
many other convective-scale models due to the significantly higher
updraft helicities in FV3-based models (Potvin et al. 2019). This in
turn is likely due to the emphasis on vorticity in the horizontal
discretization as described in Harris2019.
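The threshold-and-smoothing sweep described above can be sketched as follows. This is a simplified illustration of the general surrogate-severe idea (threshold the UH field, smooth the resulting binary field with a Gaussian kernel, do the same for gridded storm reports, and compare with FSS), not the authors' implementation. The function names, the expression of the smoothing radius as a Gaussian sigma in grid lengths, the 3-sigma kernel truncation, and the toy 9x9 grids are all assumptions.

```python
import math


def gaussian_smooth(binary, sigma):
    """Spread each event over its surroundings with a normalized 2-D Gaussian.

    `sigma` is in grid lengths (assumption); the kernel is truncated at 3 sigma.
    """
    ny, nx = len(binary), len(binary[0])
    reach = int(math.ceil(3 * sigma))
    kernel = {
        (dj, di): math.exp(-(dj * dj + di * di) / (2.0 * sigma * sigma))
        for dj in range(-reach, reach + 1)
        for di in range(-reach, reach + 1)
    }
    norm = sum(kernel.values())
    smoothed = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            if not binary[j][i]:
                continue
            for (dj, di), weight in kernel.items():
                jj, ii = j + dj, i + di
                if 0 <= jj < ny and 0 <= ii < nx:
                    smoothed[jj][ii] += weight / norm
    return smoothed


def surrogate_field(values, threshold, sigma):
    """Binary exceedance of `threshold`, smoothed into a probability-like field."""
    binary = [[1 if v >= threshold else 0 for v in row] for row in values]
    return gaussian_smooth(binary, sigma)


def fss(forecast_prob, observed_prob):
    """Fractions Skill Score between two smoothed fields."""
    pairs = [(f, o) for rf, ro in zip(forecast_prob, observed_prob)
             for f, o in zip(rf, ro)]
    mse = sum((f - o) * (f - o) for f, o in pairs)
    ref = sum(f * f + o * o for f, o in pairs)
    return 1.0 - mse / ref if ref else float("nan")


def sweep(uh_max, reports, uh_thresholds, sigmas):
    """FSS for every (UH threshold, sigma) pair; `reports` is a 0/1 grid."""
    scores = {}
    for sigma in sigmas:
        observed = surrogate_field(reports, 1, sigma)
        for thr in uh_thresholds:
            scores[(thr, sigma)] = fss(surrogate_field(uh_max, thr, sigma), observed)
    return scores


# Toy case: one strong UH track next to one storm report.
uh_max = [[0.0] * 9 for _ in range(9)]
reports = [[0] * 9 for _ in range(9)]
uh_max[4][4] = 180.0
reports[4][5] = 1
scores = sweep(uh_max, reports, [100.0, 200.0], [0.5, 2.0])
```

In this toy case a threshold above the simulated UH maximum yields no surrogate reports and hence no skill, while a wider smoothing kernel forgives the one-grid-point displacement. That is qualitatively the kind of dependence on threshold and smoothing radius that such a plot is meant to expose.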
The maximum FSS in the 2018 and 2019 versions is about 0.8, on par with
operational and research convective-scale models (cf. Sobash et al.
2019) and significantly higher than in the 2017 version. There is a
uniform over-prediction bias for all but the highest UH thresholds
(Figure 12, bottom row). This bias was substantial in the 2017 version
but decreased with each subsequent version for most threshold-radius
combinations; for the highest-FSS combination it decreased from 0.47 in
2017 to 0.22 in 2019.
C-SHiELD 2019 still has a high frequency bias except for the very
highest UH thresholds, as it is still too aggressive at creating strong
storms.