<Fig. 1>
2.1 Study area
The study area is located in southern Jiangsu Province of southeastern
China, including three prefecture-level cities (Suzhou, Wuxi, and
Changzhou) and two county-level cities (Danyang and Dantu), with a total
land area of 18,700 km2 (Fig. 1a). The area has a
subtropical monsoon climate with a mean annual rainfall of 900-1200 mm,
and an average temperature of 15.6 °C (Song et al., 2019). Most parts of
the study area are lowlands with elevations ranging from 0 to 10 m above
sea level, while the mountains with elevations >100 m are
mainly located in the southwest (Fig. 1b). Paddy soil (Stagnic
Anthrosols) is the dominant soil type, and fluvo-aquic soil (Aquic
Cambosols) and yellow brown soil (Ferri-Udic Argosols) are the other two
major soil types (Fig. 1c). The parent materials for paddy soils in the
area were mainly composed of alluvial materials, loess, and lacustrine
deposits, while the fluvo-aquic soils in the northern parts of the study
area were developed from the Yangtze River alluvium. Yellow brown soils
derived from residual and slope deposits were mainly distributed in the
low-mountain areas.
Southern Jiangsu Province has thousand years of agriculture planting
history, while the rapid urbanization only began in the early 1980s. The
rapid urbanization process and economic growth due to industrial
development have greatly influenced the soil quality in the area (Liu et
al., 2010).
2.2 Soil sampling and
analysis
Cropland topsoil data (0-20 cm) in 1980 were collected from the soil
survey reports of each city within the study area (Fig. 1a). These
legacy data have been further checked for quality control, such as the
correction of likely wrong samples and outlier detection (Schillaci et
al., 2019), resulting in a total of 399 data on soil organic matter
(SOM) being collected. The SOM in 1980 was determined by using the
potassium dichromate oxidation method with external heating (Institute
of Soil Science Chinese Academy of Sciences, 1978), and was converted to
SOC by the SOM×0.58 based on the Van Bemmelen factor (Yan et al., 2011)
(denoted as SOC1980 hereafter). Soil data recorded in
these soil survey reports, including the text descriptions of the soil
type, topography, bulk density, rock fragments with the fraction
>2 mm (coarse fraction), and sampling locations, were the
findings of the second National Soil Survey of China. So far, these
legacy data were still the most comprehensive and detailed historical
data that were available for extracting information about the soil
properties of the study area in 1980.
In 2000, a total of 413 topsoil samples were collected for better
coverage of the study area. These sampling locations were chosen to (i)
match the 1980’s sampling locations as closely as possible, (ii) be on
the identical soil types and topographic characteristics, and (iii) have
the most typical cropping systems around the sites. In 2015, the
sampling locations were determined by the same rules as the sampling
campaign in 2000. However, because of the significant changes in land
use (i.e., cropland converted to urban land over the period of
2000-2015), some sampling locations in 2015 were slightly different from
those in 2000, resulting in a total of 407 soil samples being collected
in 2015. Hand-held GPS (global positioning system, positioning error
<10 meters) was used to record the sample locations in 2000
and 2015, and the related information such as soil types, land use, and
topographic characteristics were also recorded in detail. The average
sampling density over the three sampling campaigns is approximately one
sample per 33 km2. And the SOC contents in 2000 and
2015, which were denoted as SOC2000 and
SOC2015 hereafter, were all determined by the same
method as SOC1980.
2.3 SOC change mapping and uncertainty
assessment
The geostatistical ordinary kriging (OK) may reflect the mechanism of
spatial variations in soil properties and is relatively transparent and
straightforward, compared to the complicated algorithm such as machine
learning (Veronesi & Schillaci, 2019). Therefore, the OK method was
employed to map the spatial distributions of SOC1980,
SOC2000, and SOC2015, and then, the
spatial patterns of the SOC changes over the three time periods were
determined by subtracting the OK-predicted SOC1980 from
the OK-predicted SOC2000, the OK-predicted
SOC2000 from the OK-predicted SOC2015,
and the OK-predicted SOC1980 from the OK-predicted
SOC2015, respectively.
During the OK prediction, the experimental variogram that summarized the
spatial relations in SOC data was first calculated. For a set of SOC
data z(xi ) at location xi(i=1,2,…n), the semivariance γ (h ) for a spatial
distance h can be estimated by using the following equation
(Veronesi & Schillaci, 2019):
(1),
where N (h ) is the number of pair observations
z(xi ) and z(xi +h )
that separated by the spatial distance h . The experimental
variogram can be fitted by some theoretical variogram models such as
spherical, exponential, and Gaussian models. The sill of the fitted
model is the semivariance value at which the fitted variogram model
first flattens out (represents the spatial variance of the data), and
the range is the distance at which the variogram reaches the sill
(represents the distance at which the data are no longer
auto-correlated), while the nugget is the model-fitted semivariance
value when the distance is zero (represents the influences of
measurement error and/or small-scale variations).
In this study, the variogram models were fitted to the experimental
variograms of SOC1980, SOC2000, and
SOC2015, respectively, and then the optimal model for
each of the three sampling dates was chosen according to the goodness of
fit (the value of R2) of spherical, exponential, and
Gaussian models (Goovaerts, 1997). Sequential Gaussian Simulation (SGS)
was used to generate equiprobable realizations of SOC for quantifying
the uncertainties associated with the OK-predicted SOC. Based on the
best-fitted variogram models from the OK method, the SGS
was first conducted 1000 times for each of the SOC1980,
SOC2000, and SOC2015 values, and then,
random subtractions of the 1000 times SOC1980 maps from
the 1000 times SOC2000 maps were used to obtain the 1000
equiprobable realizations of the SOC changes during 1980-2000. This
process was repeated to generate the SOC change realizations for the
periods of 2000-2015 and 1980-2015. Finally, the realizations of the SOC
changes were used to quantify the uncertainty intervals of the SOC
changes.
Ensure that the sample data are normal or approximately normal
distribution, and the SGS algorithm used in this study was summarized as
follows (Webster & Oliver, 2007):
- Model the variogram model of the SOC data.
- Define a random path visiting all grid nodes across the study area.
- For each un-sampled node, determine a searching window to obtain the
conditional dataset (sample data and simulated values within the
searching window) for the node.
- Estimate the mean and variance of SOC for the node location using
kriging and the conditional dataset, and build the conditional
cumulative distribution function (ccdf) of SOC based on the assumption
of Gaussian behavior;
- Draw a simulated value randomly from the ccdf and add the simulated
value to the conditioning dataset;
- The calculation proceeds to the
next node until all grid nodes have been simulated.
Since the sampling locations in 1980 were recorded by text descriptions,
some location errors may have occurred; thus, the assumption of a 500 m
location description error was applied during the SGS of
SOC1980 based on our field-work experiences in the area.
Specifically, the SOC1980 sampling locations were first
perturbed randomly within a buffer circle of 500 m in diameter, and then
the SGS was conducted to generate 200 SOC1980realizations. A total of 1000 SGS-generated SOC1980realizations were obtained by repeating this procedure 5 times (the
sampling locations in SOC1980 were randomly perturbed 5
times). Thus, the uncertainties resulting from the location description
error of SOC1980 were considered in this study. While
for SOC2000 and SOC2015, the SGS method
was directly used.