Keywords:
Landsat imagery; AOD retrieval; Transformers; Google Earth Engine (GEE);
Explainable Artificial Intelligence (XAI)
Introduction
Atmospheric aerosols are suspended
particulate matter consisting of solid or liquid particles with
diameters ranging from 10 nm to 100 μm suspended in the atmosphere,
chiefly in the lower atmosphere (Kaufman et al., 1997).
These aerosols originate from
natural processes such as volcanic eruptions, wildfires, and dust
storms, and anthropogenic activities, including the combustion of fuels,
vehicular emissions, and industrial and agricultural processes.
Aerosols can alter the
Earth-atmosphere radiative balance by absorbing and scattering solar
radiation, directly impacting ecosystems and climate (Ramanathan et al.,
2001; Kaufman et al., 2002;). They also indirectly influence climate
change through interactions with clouds and participation in both
physical responses and chemical reactions (Andreae et al., 2004;
Rosenfeld et al., 2008, 2014; Z. Li et al., 2011, 2016;).
Furthermore, in the form of fine
and ultra-fine particulate matter like PM2.5 and
PM1, aerosols significantly deteriorate air quality and
pose substantial threats to public health (Fuller et al., 2022; Fuzzi et
al., 2015; Kim et al., 2015; Z. Li et al., 2017).
Therefore, accurate determination
and assessment of aerosol quantity and characteristics is of utmost
significance in monitoring atmospheric environmental pollution and
understanding climate change dynamics.
Aerosol optical depth (AOD) is
defined as the integral of a medium’s extinction coefficient in the
vertical direction and serves as a critical optical parameter for
characterizing aerosol content and assessing atmospheric turbidity
(Kaufman et al., 2002). Satellite
remote sensing offers an efficient means to acquire AOD distributions
across extended timeframes and broad spatial extents.
At present, a multitude of mature
aerosol retrieval algorithms have been established, e.g., Dark Target
(Levy et al., 2013), Deep Blue (Hsu et al., 2013), and Multi-Angle
Implementation of Atmospheric Correction (Lyapustin et al., 2018), using
atmospheric radiative transfer (ART) models.
These algorithms have been
successfully and operationally applied to medium- and
low-spatial-resolution sun-synchronous satellites, such as the MODIS and
VIIRS, as well as high-temporal-resolution geostationary satellites,
including Himawari and the GOES-R Series (S. Li et al., 2019; She et
al., 2019; W. Wang et al., 2022), providing daily or hourly AOD products
covering the entire Earth with considerable accuracy.
These datasets are invaluable for
real-time air quality monitoring, tracking, forecasting, and a variety
of related studies, particularly in the retrieval of particulate matter
(Ma et al., 2022; J. Wei et al., 2021; Zhang et al., 2021).
However, for
high-spatial-resolution satellites like the Landsat series and
Sentinel-2, there are considerably fewer studies dedicated to aerosol
retrieval. Satellite-derived AOD
serves as the primary and most crucial input for atmospheric correction,
directly impacting the accuracy of quantitative remote sensing
information extraction (Doxani et al., 2023; Z. B. Li et al., 2019;
Vermote et al., 2016).
Traditionally, physically based
methods informed by an ART model have been employed to retrieve AODs at
a 30-m resolution from Landsat imagery, with a particular emphasis on
urban areas (Yang et al., 2022).
Luo et al. (2015) developed a
simple ImAero-Landsat algorithm without using look-up tables and
retrieved AODs from Landsat 7 imagery, which was then applied to improve
PM10 estimates in Beijing, China.
Tian et al. (2018) retrieved AOD
from Landsat 8 imagery in Beijing City by estimating surface reflectance
with a bidirectional reflectance distribution function (BRDF) using the
RossThick-LiSparse model. Lin et
al. (2021a) improved AOD retrievals from Landsat 8 images by developing
three schemes for estimating land surface reflectance and by considering
four aerosol types in Beijing and Wuhan, China.
They then further refined a
fusion retrieval algorithm specifically designed for urban areas using
Landsat 8 and Sentinel-2 imagery (Lin et al., 2021b).
Kumar and Mehta (2023) introduced
a simple iterative approach to enhance AOD retrievals from Landsat 8 and
Sentinel 2 data by using visible and near-infrared spectral signals
across diverse land surface types.
Our group has also made notable
contributions to facilitating AOD retrievals from Landsat 4–8 images
through the construction of priori land surface reflectance databases
and the determination of aerosol types using time series analyses of
historical ground measurements in Beijing, China (L. Sun et al., 2016;
J. Wei et al., 2017).
Traditional physical methods for
coupled unknown variables associated with the Atmosphere-Earth system
typically depend on empirical assumptions.
However, the varying sensitivity
of satellite signals to diverse underlying surfaces, coupled with the
high spatiotemporal variability of aerosols and their compositions,
introduces inherent uncertainties that have a substantial impact on the
accuracy of aerosol retrievals (Z. Li et al., 2009; X. Wei et al.,
2020). The growing capacity for
robust information mining and the capability to tackle intricate,
non-linear problems have led to the increased applications of artificial
intelligence (AI) techniques in satellite aerosol retrievals across a
range of satellite platforms (Jia et al., 2022; Su et al., 2020; Tao et
al., 2023; Yeom et al., 2022).
However,
its application to Landsat imagery is still in its early experimental
stages, and only a handful of studies have been conducted thus far.
For example, Liang et al. (2022)
applied a range of tree-based machine learning models to retrieve AOD
from Landsat 7 and 8 scenes covering the Beijing area, utilizing prior
knowledge from multiple sources as inputs.
She et al. (2022) demonstrated
the effectiveness of a deep neural network (DNN) model to retrieve AOD
from Landsat 8 data over land, showing strong correlations with
ground-based measurements.
The Landsat series of satellites
have been providing long-term, high-spatial-resolution, and high-quality
global-scale imagery, continuously improving and updating their
capabilities since the 1970s with over 50 years of continuity (Crawford
et al., 2023; Wulder et al., 2022).
These advantages have made
Landsat an indispensable tool in a variety of applications, such as
natural resources mapping, land cover change analysis, and ecological
environmental monitoring (Masek et al., 2020; Wulder et al., 2019).
To achieve greater reliability
and accuracy in these applications, it is crucial to acquire accurate
surface reflectance data by removing atmospheric effects through
atmospheric correction, underscoring the urgent need for AOD products.
However, given the relatively
long revisit periods (~16 days) with the associated
challenges in real-time air quality monitoring, a significant gap
persists in the availability of widely accepted global aerosol retrieval
algorithms and products specifically designed for Landsat data.
With these challenges in mind,
this study introduces a state-of-the-art powerful deep-learning model
called Transformers, developed to tackle the intricate nonlinear
problems associated with Atmosphere-Earth decoupling in aerosol
retrievals for Landsat imagery.
To further enhance the model’s
performance, multi-dimensional information, including spatial
variability, temporal sequences, and the topographical impact of air
pollution, are incorporated into the model, resulting in the novel
GeoChronoTransformers (GCT) model.
Furthermore, we employ an ART
model to determine the key input features influencing aerosol retrievals
from a physical perspective and utilize the eXplainable Artificial
Intelligence (XAI) technique to dissect the individual contributions of
various features to aerosol retrievals.
Lastly, we automate the complete
process of data preprocessing and accomplish global-scale aerosol
retrievals from Landsat imagery on the Google Earth Engine (GEE)
platform. We assess the model’s
AOD retrieval performance using a variety of independent validation
methods and compare our results with those from widely used machine and
deep-learning models. We also
verify the model’s robustness by testing it in four representative
regions selected from around the world.
- Materials and methods
- Data sources
- Landsat 8/9 imagery
The
Landsat Operational Land Imager (OLI) is a multispectral instrument
onboard the eighth satellite of the Landsat Data Continuity Mission,
which was launched successfully on 11 February 2013.
Landsat 8 (L8) OLI provides
satellite data covering the spectrum from visible to shortwave infrared
channels (Roy et al., 2014). L8
OLI introduces new bands for coastal aerosol (deep blue) and cirrus
cloud detection bands in comparison to its predecessor, the Landsat 7
Enhanced Thematic Mapper Plus, which ceased operations in May 2021.
It also boasts improvements in
imaging width (185 km × 180 km) and daily image captures (725 scenes per
day), enriching the multispectral radiometric dataset and expanding the
potential for capturing cloud-free scenes globally.
To minimize data gaps within the
Landsat record, Landsat 9 (L9) was launched on 27 September 2021,
carrying the OLI-2 sensor. It closely aligns with L8 OLI in terms of
spectral bands, spatiotemporal coverage, radiometry, and geometric
precision, further enhancing measurement accuracy and stability while
offering extended imaging capabilities (Masek et al., 2020).
L9 operates in orbit alongside
L8, with an 8-day orbital offset, resulting in an updated data revisit
time of every 8 days and acquiring over 1400 scenes daily.
To facilitate user access and
utilization, the GEE remote sensing cloud platform provides the USGS
Landsat 8/9 Collection 2 Tier 1 TOA Reflectance dataset (Crawford et
al., 2023). This dataset
comprises 11 discrete spectral bands, 4 observation angles, and 2
quality assessment bands (see details in Table S1). Here, we have
collected all available images from L8 (~19,462
cloud-screened scenes from 2013 to 2022) and L9 (~1,293
cloud-screened scenes from 2021 to 2022) worldwide since their
respective launch times, matching with ground monitoring stations over
land.
Ground-based measurements
The AErosol RObotic NETwork
(AERONET) is a global ground-based aerosol observation network capable
of automated data collection (Holben et al., 1998). It comprises over
1600 monitoring stations since 1992, each equipped with a CIMEL
Electronique CE318 multi-band sun photometer. AERONET provides
multi-channel AOD observations every 15 minutes, with an average
uncertainty of ± 2%. These
observations undergo different quality controls and are categorized into
three quality levels: Level 1.0 (unscreened), Level 1.5 (cloud
screened), and Level 2.0 (cloud screened and quality assured) (Giles et
al., 2019). AERONET AOD data has
been extensively employed as ground truth for the development and
validation of aerosol retrieval studies (Hsu et al., 2013; Levy et al.,
2013; Lyapustin et al., 2018; J. Wei et al., 2019a; X. Wei et al.,
2020).
Given the limited number of
AERONET stations in China, the Sun-sky Radiometer Observation Network
(SONET) has been incorporated to enrich the observation samples.
SONET is closely aligned to
AERONET in terms of observation instruments, spectral bands, and data
quality level categorization. It
provides AOD observations at 16 long-term ground stations across China,
with an average uncertainty of approximately 0.25–0.5% (Z. Q. Li et
al., 2018). These observations
have also been widely used in studies related to the development of
satellite aerosol algorithms and uncertainty analysis in China (He et
al., 2021; J. Wei et al., 2019b).
In this study, we have collected
AERONET and SONET Level 2.0 AOD ground observations at a total of 470
land sites from 2013 to 2022 (Figure S1).
However, AERONET and SONET do not
provide direct AOD observations at 550 nm, so they are interpolated
using the Ångström algorithm according to our previous studies (J. Wei
et al. 2019a, 2020b).
ART-GCT-GEE aerosol
retrieval cloud framework
In this study, we propose a novel
global aerosol retrieval cloud framework for Landsat imagery, consisting
of three main steps: feature selection (input variables) for aerosol
retrieval based on the atmospheric radiative transfer (ART) model,
atmospheric-surface decoupling using the GeoChronoTransformers (GCT)
model, and the automated global aerosol retrieval facilitated by the
Google Earth Engine (GEE) platform, referred to as the ART-GCT-GEE
model.
Atmospheric Radiative Transfer (ART)
model
Based on the ART model, the top-of-the-atmosphere (TOA) reflectance
(ρ* ) at wavelength \(\lambda\) received by
satellite sensors can be approximately represented as a function of
atmospheric path (aerosol) reflectance (ρAer ),
Rayleigh scattering reflectance (ρRay ), and
surface reflectance (ρ ) when assuming a uniform Lambertian
surface:
\(\rho^{*}\left(\lambda,\theta_{s},\theta_{v},\varphi\right)\approx\rho_{\text{Aer}}\left({\lambda,\theta}_{s},\theta_{v},\varphi,\tau\right)+\rho_{\text{Ray}}\left({\lambda,\theta}_{s},\theta_{v},\varphi\right)+\frac{\rho}{1-\rho*S_{\lambda}}T\left(\theta_{s}\right)T\left(\theta_{v}\right),\)(1)
where \(\tau\) is the aerosol optical depth, S is the hemispheric
albedo, and \(T\left(\theta_{s}\right)\) and \(T(\theta_{v})\) are the
atmospheric downwelling and upwelling transmittances, respectively.
Angles \(\theta_{s}\), \(\theta_{v}\), and \(\varphi\) represent the
solar zenith angle (SZA), viewing zenith angle (VZA), and relative
azimuth angle, respectively.
The complete process of
atmospheric-surface decoupling, i.e., separation between atmospheric and
surface contributions to TOA reflectance, forms the foundational
principle of satellite aerosol retrieval. This is calculated at
wavelength \(\lambda\), and based on simulations using the 6S model
under specific observational conditions (\(\theta_{s}\) = 30,\(\theta_{v}\) = 30, and \(\varphi\) = 100), we found that under low
surface reflectance conditions, TOA reflectance gradually increases as
AOD increases (Figure S2). This
effect is particularly pronounced in the visible channels, showing
strong positive non-linear correlations. However, as the wavelength
increases, the relationship between AOD and TOA reflectance gradually
weakens. In addition, as surface
reflectance increases, the impact of AOD on TOA reflectance gradually
diminishes, and in longer wavelengths (bands 6–7), an inverse
relationship is observed. This
indicates that aerosols significantly and diversely impact TOA
reflectance across different wavelengths. As a result, seven discrete
spectral channels (bands 1–7) are selected as the primary input
features for the deep learning (DL) model.
Notably, the incorporation of
multi-band information into the DL model can also enhance its ability to
diagnose various aerosol types over land (J. Wei et al., 2018).
Furthermore, we have included geographical location (latitude,
longitude) and monthly time series data for spatial points to improve
the AOD modelling by accounting for the spatiotemporal variations in air
pollution. To better capture
monthly variations, particularly the seasonal cycles of AOD, we
represented the temporal information using three helix-shape
trigonometric month sequence vectors [T1,
T2, T3] (Equations 2–4) (Sun et al.,
2022; J. Wei et al., 2023).
Leveraging the strong data mining
capabilities of DL, this approach can utilize implicit aerosol
information among multiple channels and aid in resolving complex
atmospheric-surface decoupling processes to improve the capability for
aerosol retrieval.
\(T_{1}=\frac{\text{Month}}{12}\) (2)
\(T_{2}=cos(2\pi\frac{\text{Month}}{12})\) (3)
\(T_{3}=\sin(2\pi\frac{\text{Month}}{12})\) (4)
In
addition, the ART model is influenced by the observational geometry, as
satellite signals can vary depending on different observation angles.
Therefore, four angles, i.e.,
SZA, the solar azimuth angle (SAA), VZA, and the viewing azimuth angle
(VAA), are also included as supplementary features.
We also incorporated the
scattering angle (\(\Theta\)) and the ground elevation (z) into our DL
model because they are crucial inputs for calculating the primary
parameters of the Rayleigh scattering phase function
(PRay(Θ)) and Rayleigh optical depth
(τRay(z)) in Rayleigh reflectance using empirical
formulas (Equations 5–6) (J. Wei et al., 2018).
More importantly, elevation is a
critical factor characterizing surface terrain changes, particularly in
rugged terrain and at high altitudes where aerosol retrievals face great
challenges (Y. Wang et al., 2019; Wei et al., 2020b).
Last, we employed the shortwave
infrared NDVI (NDVISWIR, Equation 7), which is minimally
affected by aerosols, to discern land use cover and changes.
\(P_{\text{Ray}}(\Theta)\approx\frac{3}{4}(1+\operatorname{}\Theta)\)(5)
\(\tau_{\text{Ray}}\left(\lambda,z=Z\right)\approx 0.00877{[\lambda(z=0)exp(Z/34)]}^{-4.05}\exp\left(\frac{-Z}{8.5}\right)\)(6)
\(\text{NDVI}_{\text{SWIR}}=\frac{{\rho^{*}}_{1.24}-{\rho^{*}}_{2.12}}{{\rho^{*}}_{1.24}+{\rho^{*}}_{2.12}}\)(7)
In summary, leveraging the ART
model, this study has selected a total of 19 predictive factors related
to aerosol remote sensing retrievals for use as input features in the DL
model. These factors encompass
information related to various aspects, including spectral radiance
(Landsat 8/9 OLI Bands 1–7 TOA reflectances), observational geometry
(i.e., SAA, SZA, VAA, VZA, and \(\Theta\)), a digital elevation model,
NDVISWIR, as well as spatial (latitude, longitude) and
temporal (three monthly vectors) attributes.
GeoChronoTransformers (GCT) model
Conventional remote sensing
approaches suffer from large uncertainties in establishing relationships
between retrieved and input quantities by resorting to assumptions such
as Lambertian surface reflection and fixed aerosol types, as well as the
use of fixed thresholds of vegetated index or land cover classification
products in distinguishing underlying surfaces (Hsu et al., 2013; Levy
et al., 2013; Lyapustin et al., 2018).
They could introduce
uncertainties into the aerosol retrieval process due to the nonlinear
relationships and the challenges posed by variations across different
times and locations. However,
artificial intelligence, particularly DL, offers a promising solution to
ameliorate and overcome these nonlinear challenges. More importantly,
contemporary XAI technologies enable us to analyze DL models from a
physical perspective (see details in Section 3.1).
Therefore,
this study introduces an advanced DL model called Transformers (Vaswani
et al., 2017) to minimize these effects and enhance the capability to
handle nonlinear problems.
Transformers, developed by Google researchers in 2017, has been
recognized as the fourth major category of DL models, following
multilayer perceptron (MLP), the Convolutional Neural Network (CNN), and
the recurrent neural network (RNN).
It was originally designed for
machine translation tasks and has demonstrated outstanding performance
in various domains such as image classification and text summarization
(Liu & Lapata, 2019; Dong et al., 2022).
However, it has seldom been
applied to the field of geoscience, especially in atmospheric science.
Unlike other DL models,
Transformers relies solely on attention mechanisms and introduces the
use of multi-head self-attention to simulate the multi-channel output of
CNNs. This mechanism combines
different feature representations to obtain richer and more accurate
information, effectively enhancing the model’s information
representation and generalization capabilities.
Additionally, Transformers
introduces the concepts of residual connections and encoder-decoder
structures, resulting in significant breakthroughs in aspects such as
operational efficiency, model flexibility, and scalability (see the
model structure in Figure S3).
To enhance the applicability of
the model for aerosol retrieval tasks, this study introduces
improvements and incorporates spatiotemporal information into the
traditional Transformer structure, resulting in a novel model named
GeoChronoTransformers (GCT).
Specifically, we input
spatiotemporal features of AOD and auxiliary variables, including
longitude, latitude, and monthly time series information (Equations
2–4), into the encoder section of the Transformers.
During this process, we replace
the original positional encoding module, typically employed in machine
translation tasks, with fully connected layers.
Subsequently, we stack multiple
Transformer encoders and directly yield output through fully connected
layers, which not only enables us to thoroughly extract hidden aerosol
information from multidimensional features but also effectively
mitigates the overfitting issue by complex models in this specific task.
Last, during model training, we
adopt a trial-and-error approach to fine-tune core parameters and
utilize an early stopping strategy to pinpoint the retrieval model that
performs best, which can also enhance both the model’s training
efficiency and convergence speed.
Google Earth Engine (GEE) cloud
platform
The GEE is a cloud-computing
platform designed specifically to enhance the parallel processing of
petabyte-scale satellite remote sensing imagery and other geographical
spatial data, leveraging the powerful computational infrastructure
(Gorelick et al., 2017; Tamiminia et al., 2020).
Moreover, the platform offers
application programming interfaces (APIs) and integrated development
environments in both JavaScript and Python, allowing users to engage in
customized development according to their requirements.
Users can leverage the cloud
computing capabilities of GEE’s backend data center to perform a variety
of computations and analyses, saving a huge amount of time required for
local data downloading and processing.
In this study, all data
preprocessing tasks are conducted via the secondary development of the
GEE Python API. First, Landsat 8/9 Collection 2 Tier 1 TOA reflectance
data on the GEE platform is filtered based on the central coordinates of
AERONET and SONET sites, retaining only valid observations that coincide
with the satellite overpass time within a ±30-minute window.
All valid observations are
averaged to establish ground-truth values for each site. To further
enhance the quality of data samples, only Landsat images with a low
cloud-cover fraction (< 20%) are collected.
Subsequently, we utilize the
Landsat quality assessment (QA) band to mask out all unsuitable
retrieval pixels, including those affected by clouds, cloud shadows, and
ice/snow. We then proceed with
the extraction (e.g., spectral reflectance and angle information),
computation (e.g., NDVISWIR), and pixel-to-pixel value
matching of selected input features to complete the generation of
training samples. To establish a
robust global aerosol retrieval model, we collected all satellite images
from the launch of Landsat 8 and 9 that matched ground-based monitors,
encompassing a total of 20,755 cloud-screened images (Table S2). All
data samples are input into the GCT model to train the final model, and
the constructed model is then applied to perform aerosol retrieval on
the GEE platform, ultimately generating the final output map.
Last, a 5 × 5 window median
filter is applied to minimize spatial noise and enhance the quality of
the retrieval, following our previous study (J. Wei et al., 2017).
Figure 1 illustrates the
flowchart of our proposed ART-GCT-GEE model for global aerosol
retrievals from Landsat imagery.