Figure 1. Online framework of the
Atmospheric Radiative Transfer (ART)-GeoChronoTransformers (GCT) model
used for global aerosol retrievals from Landsat imagery
on the Google Earth Engine (GEE) platform.
Validation method
To comprehensively assess the
performance of the proposed ART-GCT-GEE model for global Landsat AOD
retrievals, we employ two distinct categories of independent validation
methods. One is the widely used ten-fold cross-validation (10-CV), a
standard approach for validating AI regression models (Rodriguez et al.,
2010). We conduct it at the
sample, station, and month levels by randomly selecting 90% of the data
samples, ground monitoring stations, or months of the year, respectively,
for model training and reserving the remaining 10% for
validation (J. Wei et al., 2023).
This process ensures that the
training samples are independent of the testing samples at the overall,
spatial, and temporal scales. The cycle is repeated 10 times so
that every data sample is used for testing.
Together, these three schemes
evaluate the overall accuracy of AOD estimates at monitoring stations
(sample level) and the predictive accuracy at locations (station level)
and on dates (month level) where ground measurements are not available.
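As an illustration only, the three 10-CV schemes described above can be sketched as one grouped fold generator, where the grouping key is the sample, the station, or the month. The function name `grouped_k_folds`, the record layout, and the synthetic records are assumptions made for this sketch; they are not taken from the ART-GCT-GEE implementation.

```python
import random
from collections import defaultdict


def grouped_k_folds(keys, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs in which all samples that share
    a key (sample id, station id, or month) fall in the same fold."""
    groups = defaultdict(list)
    for i, key in enumerate(keys):
        groups[key].append(i)
    unique = sorted(groups)
    random.Random(seed).shuffle(unique)  # random assignment of groups to folds
    for f in range(k):
        held_out = set(unique[f::k])  # roughly 10% of the groups when k=10
        test = [i for key in unique if key in held_out for i in groups[key]]
        train = [i for key in unique if key not in held_out for i in groups[key]]
        yield train, test


# Hypothetical records: (sample id, station id, month of the year)
records = [(i, f"station_{i % 25}", 1 + i % 12) for i in range(300)]

sample_folds = list(grouped_k_folds([r[0] for r in records]))   # sample level
station_folds = list(grouped_k_folds([r[1] for r in records]))  # station level
month_folds = list(grouped_k_folds([r[2] for r in records]))    # month level
```

With only 12 months, a fold at the month level holds out one or two months, so the 90%/10% proportion is approximate in this sketch.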
The other validation method
comprises two parts. First, considering the
unbalanced distribution of ground monitors and the pronounced
spatiotemporal clustering of AOD, we evaluate the model’s
predictive capability by withholding entire temporal and spatial units.
Specifically, we hold out, in turn, each
year from 2013 to 2022 and each of the 10 geographical continents
[defined in Figure S1, following J. Wei et al. (2019a)]
for independent validation (withholding one year or one continent at a time).
This is accomplished by
sequentially selecting all data samples from a single year or a single
continent as the validation set while utilizing data samples from the
remaining 9 years or 9 continents for model training.
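The withholding procedure above amounts to a leave-one-group-out loop. The following is a minimal sketch under the assumption that every sample carries a year and a continent label; the helper name `leave_one_group_out` and the placeholder labels are hypothetical.

```python
def leave_one_group_out(keys):
    """Yield (held_out_key, train_idx, test_idx), holding out one group
    (a single year or a single continent) at a time."""
    for held_out in sorted(set(keys)):
        test = [i for i, key in enumerate(keys) if key == held_out]
        train = [i for i, key in enumerate(keys) if key != held_out]
        yield held_out, train, test


# Synthetic labels for 200 samples (placeholders, not real data)
years = [2013 + i % 10 for i in range(200)]
continents = [f"continent_{(i // 10) % 10}" for i in range(200)]

year_splits = list(leave_one_group_out(years))            # 10 splits, one per year
continent_splits = list(leave_one_group_out(continents))  # 10 splits, one per continent
```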
Second, we employ data samples
from the middle 6 years (i.e., 2015–2020) for the model training and
utilize the two initial years (i.e., 2013 and 2014) and the two final
years (i.e., 2021 and 2022) for validation.
This split results in
approximately 65% of the samples for training and 35% for testing.
This design tests the model’s capacity both to reconstruct historical
AOD levels (2013 and 2014) and to forecast future ones (2021 and 2022).
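The period-based split can be sketched as follows. The constant and function names are hypothetical, and the years are synthetic with equal counts per year.

```python
TRAIN_YEARS = set(range(2015, 2021))     # middle 6 years: 2015 to 2020
TEST_YEARS = {2013, 2014, 2021, 2022}    # two initial and two final years


def split_by_period(years):
    """Return (train_idx, test_idx) for the middle-years training split."""
    train = [i for i, y in enumerate(years) if y in TRAIN_YEARS]
    test = [i for i, y in enumerate(years) if y in TEST_YEARS]
    return train, test


# Synthetic example with the same number of samples in every year
years = [2013 + i % 10 for i in range(200)]
train_idx, test_idx = split_by_period(years)
train_fraction = len(train_idx) / len(years)
```

With equal sample counts per year this toy split is 60/40. The roughly 65/35 split reported above would then imply more samples in the middle years, which is an inference rather than something the text states.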
To quantitatively assess the
model’s accuracy and facilitate model comparison, several statistical
indicators are used, namely, the Pearson correlation coefficient (R),
median bias (MB), mean absolute error (MAE), and root-mean-square error
(RMSE). Additionally, to assess
the uncertainties of satellite AOD retrievals, we use the expected
error (EE) envelope of the MODIS Deep Blue AOD retrievals over
land (Equation 8) (Hsu et al., 2013) and the AOD accuracy requirement
of the Global Climate Observing System (GCOS) (Equation 9) (GCOS,
2010).
\(\mathrm{EE}=\pm\left(0.05+20\%\times\tau_{\text{observation}}\right)\) (8)
\(\mathrm{GCOS}=\pm\max\left(0.03,\ 10\%\times\tau_{\text{observation}}\right)\) (9)
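For concreteness, the indicators and the two envelopes in Equations 8 and 9 could be computed as in the sketch below. It assumes that \(\tau_{\text{observation}}\) is the ground-observed AOD, that bias is defined as retrieval minus observation, and that MB is the median of those differences as stated in the text. The function name `evaluate` and the sample values are illustrative only.

```python
import math
from statistics import mean, median


def evaluate(retrieved, observed):
    """Return R, MB, MAE, RMSE and the fractions of retrievals that fall
    within the EE (Equation 8) and GCOS (Equation 9) envelopes."""
    n = len(observed)
    diffs = [r - o for r, o in zip(retrieved, observed)]  # assumed sign convention
    mean_r, mean_o = mean(retrieved), mean(observed)
    cov = sum((r - mean_r) * (o - mean_o) for r, o in zip(retrieved, observed))
    var_r = sum((r - mean_r) ** 2 for r in retrieved)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    in_ee = sum(abs(d) <= 0.05 + 0.20 * o for d, o in zip(diffs, observed))
    in_gcos = sum(abs(d) <= max(0.03, 0.10 * o) for d, o in zip(diffs, observed))
    return {
        "R": cov / math.sqrt(var_r * var_o),          # Pearson correlation
        "MB": median(diffs),                          # median bias, as in the text
        "MAE": mean([abs(d) for d in diffs]),
        "RMSE": math.sqrt(mean([d * d for d in diffs])),
        "within_EE": in_ee / n,
        "within_GCOS": in_gcos / n,
    }


# Illustrative AOD values only (not measurements)
observed = [0.1, 0.2, 0.4, 0.8]
retrieved = [0.12, 0.15, 0.50, 0.80]
scores = evaluate(retrieved, observed)
```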
Results and discussion
Feature contribution analysis using XAI
DL models are commonly seen as black boxes, but with the emergence of
XAI, their internal workings can be unveiled. Here, we select the
advanced SHapley Additive exPlanation (SHAP) method to investigate and
understand the driving factors in the Landsat AOD retrieval by assessing
the contribution of input variables through the computation of Shapley
values (Figure 2). SHAP is model-agnostic and
offers both local and global interpretability, thereby supporting
transparency and fairness across diverse AI applications,
especially DL (Lundberg and Lee, 2017). Our findings
show that the coastal aerosol channel in the deep-blue wavelength range
(Band 1) makes the largest contribution, with the highest SHAP
value of approximately 36% among all features, followed by the blue
channel (Band 2), accounting for ~9%. The contributions
of discrete channels to the AOD retrieval tend to gradually decrease as
wavelengths increase, consistent with the decreasing sensitivity of the
aerosol signal to apparent reflectance (Figure S2). Nevertheless, these
contributions remain substantial, and the total contribution of all
channels, spanning from visible to shortwave infrared wavelengths,
amounts to approximately 58%, underscoring the considerable importance
of multi-band information in aerosol retrievals. Observation angles,
especially the solar zenith and scattering angles, also strongly affect
the AOD retrieval (total SHAP = 19%). Furthermore, multi-dimensional
information encompassing space, time, and altitude, as well as the surface
NDVI\(_{\text{SWIR}}\), also plays an important role (SHAP = 2–11%) in
enhancing the AOD retrieval within the Transformers model. These results
illustrate the rationale behind our feature selection, contributing to a
deeper understanding of the physical interpretability of DL.
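The text does not state how the percentage contributions are derived from the Shapley values. One common convention, assumed here purely for illustration, is to normalize each feature's mean absolute SHAP value by the sum over all features. The sketch below takes a precomputed matrix of SHAP values (rows are samples, columns are features); the function name, feature names, and numbers are hypothetical.

```python
def relative_contributions(shap_values, feature_names):
    """Convert per-sample SHAP values into percentage contributions using
    the normalized mean absolute SHAP value (an assumed convention)."""
    n = len(shap_values)
    mean_abs = [
        sum(abs(row[j]) for row in shap_values) / n
        for j in range(len(feature_names))
    ]
    total = sum(mean_abs)
    return {name: 100.0 * m / total for name, m in zip(feature_names, mean_abs)}


# Toy SHAP matrix: 2 samples x 3 features (illustrative numbers only)
features = ["band_1", "band_2", "solar_zenith"]
toy_shap = [[0.5, -0.25, 0.25],
            [-0.5, 0.25, -0.25]]
shares = relative_contributions(toy_shap, features)

# Group totals, such as the share of all spectral channels, are sums of members
channel_share = shares["band_1"] + shares["band_2"]
```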