Materials and Methods
Data Compilation
We searched Google Scholar for peer-reviewed papers using a combination
of the keywords “fung*” or “bacteria*”, “ratio”, and “terrestrial” or
“soil”. Papers were selected using the following criteria: 1) at least
one of fungal biomass, bacterial biomass, or F:B ratio was reported,
with clearly stated units; 2) the data were extractable from tables and
text or from figures (using Engauge Digitizer Version 10.7); 3) the
study sites were not affected by disturbances such as fire, mining, or
heavy metal contamination; and 4) the reported data covered the 0-30 cm
topsoil. Geographical information of the sampling sites was recorded and
used to locate the sites on the global map (Fig. 1). We also collected
any available data on soil pH, mean annual precipitation (MAP), mean
annual temperature (MAT), SOC and total nitrogen (TN) concentration, and
soil texture, to validate the data extracted from global datasets.
Fungal and bacterial biomass were measured using a number of methods,
including phospholipid fatty acid (PLFA), direct microscopy (DM), colony
forming units (CFU), substrate-induced respiration (SIR), and
glucosamine and muramic acid (GMA). Additionally, we included
experimental data (n = 214) measured using PLFA from a global topsoil
dataset; detailed information about this dataset can be found in Bahram
et al. (2018). To examine potential biases in the measurement of fungal
and bacterial biomass, we compared these methods (Table 1, Table S1). To
compare fungal (FBC) and bacterial (BBC) biomass C measured using
different methods, we applied the conversion factors reported by
previous studies for PLFA (Frostegård & Bååth 1996; Klamer & Bååth
2004), SIR (Beare et al. 1990), CFU (Aon et al. 2001), DM (Birkhofer et
al. 2008), and GMA (Jost et al. 2011). Across biomes, FBC, BBC, and F:B
ratio generally followed similar patterns regardless of method. However,
we found large variations in measured FBC and BBC among methods.
Specifically, compared with PLFA, SIR, and GMA, CFU indicated a stronger
dominance of fungi over bacteria, while DM indicated a stronger
dominance of bacteria over fungi, suggesting that DM may underestimate
FBC while CFU may overestimate FBC. In addition, GMA yielded overall
higher FBC and BBC, which differed largely from the measurements
obtained with the other methods. Therefore, combining data generated by
multiple methods in one analysis might be problematic, and we used only
PLFA data for this analysis. This selection was made for two reasons: 1)
PLFA was the most widely used approach (Materials and Methods), with
PLFA-derived FBC and BBC measurements accounting for 73% of the whole
dataset; and 2) PLFA has been evaluated and shown to be the most
appropriate approach for estimating FBC and BBC simultaneously (Waring
et al. 2013).
The final database included the fungal and bacterial biomass data
measured using PLFA from publications spanning the late 1960s to 2018.
Collectively, 1323 data points from unvegetated ground and 11 biomes
(i.e., boreal forest, temperate forest, tropical/subtropical forest,
grassland, shrub, savanna, tundra, desert, natural wetlands, cropland,
and pasture) across the globe were included in the database (Fig. 1).
Forest, grassland, and cropland contributed approximately 39%, 22%, and
19% of the dataset, respectively, with all other biomes together
contributing 20%. A majority of the field sites are located in North
America, Europe, and Asia. Relatively few observations are from South
America, Africa, Russian Asia, Australia, and Antarctica. All soil
samples are from the 0-30 cm soil profile. For data points without
reported coordinates, we derived the geographical coordinates from the
location of the study site, city, state, and country. The geographical
information was then used to locate the sampling points on the global
map and to extract climate, edaphic properties, plant productivity, and
long-term soil microclimate data from global datasets.
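The extraction of gridded values at each sampling point can be sketched as a simple cell lookup. This is a minimal illustration rather than the procedure used in the study: the function names are hypothetical, and the grid is assumed to be a regular north-up latitude/longitude raster whose first row starts at 90° N and whose first column starts at 180° W.

```python
import math


def grid_index(lat, lon, cell_deg):
    """Return (row, col) of the grid cell that contains (lat, lon).

    Assumption: a global, regular grid with square cells of `cell_deg`
    degrees, row 0 at 90 N and column 0 at 180 W.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")
    n_rows = round(180.0 / cell_deg)
    n_cols = round(360.0 / cell_deg)
    # Points on the southern or eastern edge are assigned to the last cell.
    row = min(int(math.floor((90.0 - lat) / cell_deg)), n_rows - 1)
    col = min(int(math.floor((lon + 180.0) / cell_deg)), n_cols - 1)
    return row, col


def extract(grid, lat, lon, cell_deg):
    """Read the value of a 2-D grid (list of rows) at a sampling point."""
    row, col = grid_index(lat, lon, cell_deg)
    return grid[row][col]
```

For a 30 arc-second product the cell size would be `1 / 120` degree, and for a 0.5-degree product it would be `0.5`.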
Climate, Plant, and Soil Data
MAT and MAP at a spatial resolution of 30 arc-seconds for 1970-2000 were
obtained from the WorldClim database version 2
(http://worldclim.org/version2). In addition, monthly mean soil moisture
(SM) and soil temperature (ST) for 1979-2014 were obtained from the
NCEP/DOE AMIP-II Reanalysis
(https://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis2.gaussian.html).
The global vegetation distribution data were from a spatial map of 11
major biomes (boreal forest, temperate forest, tropical/subtropical
forest, mixed forest, grassland, shrub, tundra, desert, natural
wetlands, cropland, and pasture), which has been used in our previous
publication (Xu et al. 2013). The spatial distribution of soil
properties, including soil pH, sand, silt, clay, and SOC, was taken from
the Harmonized World Soil Database (HWSD,
https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=1247), while soil bulk
density and TN data were taken from the IGBP-DIS dataset (IGBP,
https://daac.ornl.gov/SOILS/guides/igbp-surfaces.html) because TN is not
available in HWSD. Because TN in IGBP-DIS is reported for the 100-cm
profile as a whole, we scaled it to the top 0-30 cm using a factor
calculated from the fraction of SOC in the top 0-30 cm in HWSD. Because
SOC and soil TN exhibit large spatial heterogeneity, fine-scale
variation in edaphic properties is underrepresented in global datasets.
To better account for the edaphic effects on fungal and bacterial
distribution, we examined the relationships of FBC, BBC, and F:B ratio
with SOC, TN, and C:N ratio using the data directly extracted from the
literature. Owing to the poor correlation between bulk density extracted
from HWSD and bulk density reported in the literature, we used the same
soil bulk density values for the entire top 100 cm soil profile from
IGBP, assuming no difference in bulk density between the 0-30 cm and
30-100 cm soil profiles. Root C density (Croot) data were extracted from
a global dataset at 0.5-degree resolution based on observation data
(Ruesch & Gibbs 2008; Song et al. 2017). Annual net primary productivity
(NPP) was obtained from the MODIS gridded dataset at a spatial
resolution of 30 arc-seconds for 2000-2015
(http://files.ntsg.umt.edu/data/NTSG_Products/MOD17/GeoTIFF/MOD17A3/GeoTIFF_30arcsec/).
We then compared the data directly extracted from the literature with
those extracted from global datasets, and found them to be consistent
for a majority of the dataset (Fig. S1).
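The scaling of whole-profile TN to the topsoil described above can be written as a one-line proportion. The sketch below is one plausible reading of that step, not a confirmed reproduction of it: it assumes that SOC is available as additive stocks for the 0-30 cm and 30-100 cm layers, and the function and argument names are hypothetical.

```python
def topsoil_soc_fraction(soc_0_30, soc_30_100):
    """Fraction of the 0-100 cm SOC stock held in the top 0-30 cm.

    Assumption: both inputs are stocks in the same units, so they can
    be summed to obtain the whole-profile stock.
    """
    total = soc_0_30 + soc_30_100
    if total <= 0:
        raise ValueError("total SOC must be positive")
    return soc_0_30 / total


def scale_tn_to_topsoil(tn_0_100, soc_0_30, soc_30_100):
    """Estimate 0-30 cm TN by assuming TN is distributed like SOC."""
    return tn_0_100 * topsoil_soc_fraction(soc_0_30, soc_30_100)
```

For example, a cell with 60% of its SOC in the top 30 cm would be assigned 60% of its whole-profile TN.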
Model Selection and Validation
Considering the clear biogeographic patterns of FBC, BBC, and F:B ratio,
we developed generalized linear models with climate (MAP and MAT), soil
microclimate (ST and SM), plant (NPP and Croot), and edaphic properties
(clay, sand, soil pH, bulk density, SOC, and TN) to disentangle the
factors controlling fungal and bacterial distribution. These models
explained over 70% of the variation in FBC, BBC, and F:B ratio, and FBC
and BBC were better explained than F:B ratio (Fig. 2).
Considering the higher proportion of missing data in FBC (14.8%) and
BBC (16.3%) relative to F:B ratio (1.9%), we built an empirical model
for F:B ratio with 75% of the dataset. With the generalized linear
model of F:B ratio, we performed principal component analysis to select
the important factors explaining the variation in F:B ratio. Based on
the variation explained by each component and the cumulative variation
explained, we used stepwise regression to select the 31 most important
factors, with an emphasis on climate; together they explained 33.0% of
the variation in F:B ratio (Fig. S7; Table S2). The selected empirical
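model had the formula:
$$
\begin{aligned}
\log_{10}(\text{F:B ratio}) ={} & 0.6789 - 0.03402\,\text{MAT} - 0.000058\,\text{MAP} + 0.003772\,\text{ST} + 1.542\,\text{SM} \\
& - 0.00099\,\text{NPP} + 0.01553\,C_{\text{root}} + 0.1226\,\text{bulk density} + 0.05991\,\text{soil pH} \\
& - 0.03631\,\text{clay} - 0.0045\,\text{sand} + 0.002878\,\text{SOC} - 0.01607\,\text{TN} \\
& + 0.000177\,\text{MAT} \times \text{ST} - 0.03955\,\text{MAT} \times \text{SM} - 0.000015\,\text{MAP} \times \text{ST} - 0.000335\,\text{MAP} \times \text{SM} \\
& + 0.000005\,\text{MAT} \times \text{NPP} - 0.001615\,\text{MAT} \times C_{\text{root}} + 0.000001\,\text{MAP} \times \text{NPP} + 0.000007\,\text{MAP} \times C_{\text{root}} \\
& + 0.02201\,\text{MAT} \times \text{bulk density} - 0.003794\,\text{MAT} \times \text{soil pH} + 0.002188\,\text{MAT} \times \text{clay} + 0.000137\,\text{MAT} \times \text{sand} \\
& - 0.000061\,\text{MAT} \times \text{SOC} + 0.00513\,\text{MAT} \times \text{TN} - 0.000029\,\text{MAP} \times \text{soil pH} + 0.000001\,\text{MAP} \times \text{clay} \\
& + 0.000003\,\text{MAP} \times \text{sand} - 0.000001\,\text{MAP} \times \text{SOC} - 0.000043\,\text{MAP} \times \text{TN}.
\end{aligned}
$$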
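The fitted equation can be evaluated directly once the 12 predictors are available for a site or grid cell. The sketch below copies the coefficients from the formula above; the dictionary keys (for example `"BD"` for bulk density and `"pH"` for soil pH) are hypothetical names chosen for illustration, and the inputs are assumed to be in the same units as the data used to fit the model, which are not restated here.

```python
INTERCEPT = 0.6789

# Main effects: coefficient for each predictor.
MAIN = {
    "MAT": -0.03402, "MAP": -0.000058, "ST": 0.003772, "SM": 1.542,
    "NPP": -0.00099, "Croot": 0.01553, "BD": 0.1226, "pH": 0.05991,
    "clay": -0.03631, "sand": -0.0045, "SOC": 0.002878, "TN": -0.01607,
}

# Two-way interactions of MAT and MAP with the other predictors.
INTERACTIONS = {
    ("MAT", "ST"): 0.000177, ("MAT", "SM"): -0.03955,
    ("MAP", "ST"): -0.000015, ("MAP", "SM"): -0.000335,
    ("MAT", "NPP"): 0.000005, ("MAT", "Croot"): -0.001615,
    ("MAP", "NPP"): 0.000001, ("MAP", "Croot"): 0.000007,
    ("MAT", "BD"): 0.02201, ("MAT", "pH"): -0.003794,
    ("MAT", "clay"): 0.002188, ("MAT", "sand"): 0.000137,
    ("MAT", "SOC"): -0.000061, ("MAT", "TN"): 0.00513,
    ("MAP", "pH"): -0.000029, ("MAP", "clay"): 0.000001,
    ("MAP", "sand"): 0.000003, ("MAP", "SOC"): -0.000001,
    ("MAP", "TN"): -0.000043,
}


def log10_fb_ratio(x):
    """Evaluate the empirical model for one record `x` (a dict of predictors)."""
    value = INTERCEPT
    for name, coef in MAIN.items():
        value += coef * x[name]
    for (a, b), coef in INTERACTIONS.items():
        value += coef * x[a] * x[b]
    return value


def fb_ratio(x):
    """Back-transform the model output from log10 scale to the F:B ratio."""
    return 10.0 ** log10_fb_ratio(x)
```

Keeping the coefficients in two tables makes it easy to check that the 12 main effects and 19 interactions add up to the 31 selected factors.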
After the model was developed, we validated it with the remaining 25% of
the data, which were not used in model development; the validation
showed high consistency (Fig. S8a). We then evaluated the model
performance for F:B ratio by comparing simulated and observed data in
each biome (Fig. S9). Simulated and observed log-scaled F:B ratios were
consistent overall, with a relatively poor fit in deserts. Given the
much lower BBC and FBC in deserts, this inconsistency does not introduce
a large bias into our large-scale estimation. Additionally, we found a
slight overestimation of F:B ratio in croplands and pastures, indicating
large uncertainties in managed systems caused by human activities.
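The hold-out validation described above can be sketched as follows. This is an illustrative outline only: the text does not state how the 75% and 25% subsets were drawn, so a seeded random split is assumed here, and the coefficient of determination is used as one possible measure of consistency between simulated and observed values.

```python
import random


def split_75_25(records, seed=0):
    """Randomly split records into a 75% development and 25% validation set.

    Assumption: a simple random split; the seed only makes it reproducible.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.75 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

In this setting `observed` and `predicted` would both be log10-transformed F:B ratios for the validation subset.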
Mapping Global Bacterial and Fungal Biomass Carbon
We compared the microbial biomass C in Xu et al. (2013) with the sum of
FBC and BBC in this study and found good agreement between the two
(Fig. S8b; R² = 0.91), indicating that the sum of FBC and BBC
constitutes a constant proportion of microbial biomass and providing a
feasible way to estimate FBC and BBC. Based on the microbial biomass C
dataset in Xu et al. (2013) and the global map of F:B ratio, we
generated the global maps of FBC and BBC and estimated the global
storage of FBC and BBC. The auxiliary data used included the global
vegetation distribution (Xu et al. 2013) and a global land area database
supplied by the surface data map generated for the Community Land Model
4.0
(https://svn-ccsm-models.cgd.ucar.edu/clm2/trunk_tags/clm4_5_1_r085/models/lnd/clm/tools/clm4_5/mksurfdata_map/).
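One way to read the mapping step is as a partition of microbial biomass C into its fungal and bacterial parts using the F:B ratio. The sketch below illustrates that arithmetic under stated assumptions: the function name is hypothetical, and `proportion` stands for the constant share of microbial biomass C made up by FBC plus BBC, whose value is not given in this section and must be supplied by the user.

```python
def partition_biomass(mbc, fb_ratio, proportion):
    """Split microbial biomass C into fungal and bacterial biomass C.

    Assumptions: FBC + BBC = proportion * mbc, and FBC / BBC = fb_ratio.
    Solving the two equations gives the expressions below.
    """
    if fb_ratio < 0 or not (0.0 < proportion <= 1.0):
        raise ValueError("fb_ratio must be >= 0 and proportion in (0, 1]")
    total = proportion * mbc
    bbc = total / (1.0 + fb_ratio)
    fbc = total - bbc  # equals total * fb_ratio / (1 + fb_ratio)
    return fbc, bbc
```

Applied cell by cell to a microbial biomass C map and an F:B ratio map, this yields FBC and BBC maps whose sum stays a fixed share of microbial biomass C.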
Uncertainty Analysis
To estimate the parameter-induced uncertainties in fungal and bacterial
biomass distribution and storage, we used an improved Latin Hypercube
Sampling (LHS) approach to estimate the variation in F:B ratio. The LHS
approach efficiently produces a random ensemble of parameter
combinations and has been widely used in the modeling community to
estimate uncertainties in model output (Haefner 2005; Xu 2010; Xu et al.
2014). First, we assumed that all parameters follow a normal
distribution. We then used LHS to randomly select an ensemble of 3000
parameter sets using “improvedLHS” in R (Table S2). Finally, we
calculated the 95% confidence interval of fungal and bacterial biomass C
density and storage for reporting (Table 2).
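The sampling idea can be illustrated with a basic Latin Hypercube scheme. Note that this is plain stratified LHS written with the Python standard library, not the "improvedLHS" algorithm used in the study; the function names, the means and standard deviations passed in, and the percentile-based interval are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, quantiles


def latin_hypercube_normal(means, sds, n_samples, seed=0):
    """Draw n_samples parameter sets with one draw per probability stratum.

    Each parameter is assumed to be normally distributed (sd > 0) and
    independent of the others.
    """
    rng = random.Random(seed)
    columns = []
    for mean, sd in zip(means, sds):
        dist = NormalDist(mean, sd)
        # One uniform draw inside each of n equal-probability strata.
        probs = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(probs)  # decouple the strata across parameters
        # Clamp to the open interval (0, 1) required by inv_cdf.
        probs = [min(max(p, 1e-12), 1.0 - 1e-12) for p in probs]
        columns.append([dist.inv_cdf(p) for p in probs])
    return [list(row) for row in zip(*columns)]


def interval_95(values):
    """Return the 2.5th and 97.5th percentiles of a list of model outputs."""
    cuts = quantiles(values, n=40)  # cut points every 2.5%
    return cuts[0], cuts[-1]
```

Each sampled parameter set would be passed through the F:B ratio model, and `interval_95` applied to the resulting ensemble of outputs.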
Statistical Analysis
Because FBC, BBC, and F:B ratio in our dataset did not follow a normal
distribution, we log-transformed them to approximate normal
distributions for subsequent statistical analysis. The mean and 95%
confidence boundaries of FBC, BBC, and F:B ratio were transformed back
to the original scale for reporting. To understand the variation in FBC,
BBC, and F:B ratio, we fitted generalized linear models to investigate
the relationships of FBC, BBC, and F:B ratio with long-term climate (MAP
and MAT), soil microclimate (ST and SM), plant (NPP and Croot), and
edaphic properties (clay, sand, soil pH, bulk density, SOC, and TN). We
then used the Akaike information criterion (AIC) as the selection
criterion, i.e., the smaller the AIC, the better the regression. Before
fitting the generalized linear models, we tested for multicollinearity
among the variables within and among the variable groups, i.e., climate,
soil microclimate, edaphic properties, and plant, and did not find
significant multicollinearity (VIF < 5). All statistical analyses were
carried out and relevant figures were plotted with R 3.5.3 on Mac OS X.
Fig. 1 and Fig. 3 were produced with NCAR Command Language (version
6.3.0) and ArcGIS (version 10.5), respectively.
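The log-transformation and back-transformation of the mean and its 95% confidence boundaries can be sketched as below. This is a minimal illustration under assumptions that the text does not specify: a base-10 logarithm is used, the interval is based on the normal quantile rather than a t quantile, and the function name is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def back_transformed_mean_ci(values, level=0.95):
    """Mean and confidence bounds computed on log10 scale, then back-transformed.

    Returns (lower, center, upper) on the original scale; the center is
    the geometric mean of the input values, which must all be positive.
    """
    logs = [math.log10(v) for v in values]
    center = mean(logs)
    se = stdev(logs) / math.sqrt(len(logs))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    return (10.0 ** (center - z * se),
            10.0 ** center,
            10.0 ** (center + z * se))
```

Because the interval is symmetric on the log scale, the back-transformed bounds are asymmetric around the reported mean on the original scale.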