Figure 2. Sunburst chart of feature contributions in the
satellite AOD retrieval from Landsat imagery using the SHapley Additive
exPlanations (SHAP) approach of eXplainable AI (XAI).
Evaluation and uncertainty analysis
Model cross-validation
We initially employed three independent cross-validation techniques to
assess the performance of the proposed model for the Landsat global
aerosol retrieval. Our model demonstrates strong performance in
estimating AOD from Landsat images across the world, agreeing moderately
well with measurements at approximately 78% of the sites (sample-based
CV-R > 0.5), with median biases within ±2% for about 77%
of the sites (Figure 3). Higher levels of accuracy are noted in populous
areas characterized by elevated levels of air pollution, including
Southern Africa, India, and East Asia (Table 1), where correlations
surpass 0.8. The retrieval uncertainties at most sites remain
consistently low, with approximately 84% and 76% of sites having
MAE and RMSE values below 0.08 and 0.1, respectively. Exceptions are
observed at a few sites in North Africa, the Middle East, and eastern
China (Table 1), where larger absolute errors are primarily associated
with high AOD levels resulting from frequent heavy sand/dust emissions
or anthropogenic activities. Overall, at over 87% of the sites, more
than 70% of the retrievals fall within the EE envelope, indicating
considerable accuracy. Furthermore, 68% of the sites have at least
40% of retrievals meeting the GCOS requirements. Similar spatial
patterns are observed in the spatial (temporal) CV results (Figures S3
and S4), although performance is generally poorer than in the
sample-based CV. Nevertheless, in the station-based (month-based) CV,
approximately 73% (72%) of the sites still show moderate correlations
(CV-R > 0.5), 73% (78%) show low MAE (< 0.08), and 66% (72%) show low
RMSE (< 0.1) between the retrievals and ground-truth values.
Additionally, approximately 64% (77%) of the sites on land have more
than 70% of retrievals within the EE envelope, and 48% (57%) have more
than 40% of retrievals meeting the GCOS requirements.
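
The site-level statistics reported above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the fold count, the synthetic data, and in particular the EE envelope form ±(0.05 + 20% × AOD) are assumptions made for illustration, since this section does not state them. Sample-based CV would pass one unique key per sample, whereas station-based and month-based CV would pass station IDs and calendar months as the grouping keys.

```python
import numpy as np


def group_folds(groups, n_folds=10, seed=0):
    """Assign each row to a fold so that rows sharing a group share a fold.

    ASSUMPTION: n_folds=10 is a placeholder; the fold count is not given here.
    """
    groups = np.asarray(groups)
    unique, inverse = np.unique(groups, return_inverse=True)
    rng = np.random.default_rng(seed)
    # Shuffle the groups, then deal them round-robin into folds.
    fold_of_group = rng.permutation(len(unique)) % n_folds
    return fold_of_group[inverse]


def site_metrics(obs, ret, ee_abs=0.05, ee_rel=0.20):
    """Return R, MAE, RMSE and the fraction of retrievals within the EE envelope.

    ASSUMPTION: EE = +/-(ee_abs + ee_rel * obs); the defaults are hypothetical.
    """
    obs = np.asarray(obs, dtype=float)
    ret = np.asarray(ret, dtype=float)
    err = ret - obs
    return {
        "R": float(np.corrcoef(obs, ret)[0, 1]),
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "within_EE": float(np.mean(np.abs(err) <= ee_abs + ee_rel * obs)),
    }


def share_of_sites(values, predicate):
    """Fraction of sites whose metric value satisfies the predicate."""
    return float(np.mean(predicate(np.asarray(values, dtype=float))))


# Synthetic demonstration only: station IDs, AOD values and noise are made up.
rng = np.random.default_rng(42)
station = rng.integers(0, 20, size=1000)
obs = rng.uniform(0.02, 1.0, size=1000)
ret = obs + rng.normal(0.0, 0.05, size=1000)

# Station-based folds: every sample of a station lands in the same fold.
folds = group_folds(station, n_folds=10)

per_site = [site_metrics(obs[station == s], ret[station == s])
            for s in np.unique(station)]
print("sites with R > 0.5:",
      share_of_sites([m["R"] for m in per_site], lambda v: v > 0.5))
print("sites with MAE < 0.08:",
      share_of_sites([m["MAE"] for m in per_site], lambda v: v < 0.08))
print("sites with > 70% within EE:",
      share_of_sites([m["within_EE"] for m in per_site], lambda v: v > 0.70))
```

The share of sites meeting the GCOS requirements could be computed in the same way by replacing the EE test with the applicable GCOS error criterion, which is likewise not defined in this section.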