2.3 Soil microbial community (SMC) analysis
Soil microbial community (SMC) composition was analyzed through Illumina® MiSeq whole genome sequencing (Chen et al., 2017). Soil DNA was extracted from the fresh soil samples stored at -20oC (N=75; Section 2.2) using the FastDNATM SPIN kit (MP Biomedicals, Solon, OH, USA) and following the manufacturer’s instructions. DNA concentration and purity were measured with the NanoDrop One kit (Thermo Fisher Scientific, MA, USA). The V4–V5 region of the bacterial 16S rRNA genes were amplified using the primer sets 515F (5’-GTGCCAGCMGCCGCGGTAA-3’) and 907R (CCGTCAATTCMTTTRAGTTT) from Invitrogen (Invitrogen, Carlsbad, CA, USA). PCR reactions were carried out using the BioRad S1000 apparatus (Bio-Rad Laboratory, CA, USA). PCR products were mixed in ratios of equal density as specified by the GeneTools Analysis Software (Version4.03.05.0, SynGene). Accordingly, a 50 µl volume of PCR mix contained 25 μl Premix Taq (Takara Biotechnology, Dalian Co. Ltd., China), 1 μl of each rRNA primer (10 M) and 3 μl of DNA (20 ng μl-1) template. The PCR mix was amplified following the thermos-cycling protocol: (i) 5 min at 94°C for initialization; (ii) 30 cycles of 30 s at 94°C for denaturation, 30 s at 52°C for annealing, and 30 s at 72°C for extension; (iii) 10 min at 72°C for final elongation. The PCR products were purified with the EZNA Gel Extraction Kit (Omega, USA).
A whole genome sequencing library was generated using the NEBNext® Ultra™ DNA Library Prep Kit for Illumina® (New England Biolabs, MA, USA) following the manufacturer’s recommendations. The quality of the library was assessed with the Qubit@ 2.0 Fluorometer (Thermo Fisher Scientific, MA, USA) and the Agilent Bioanalyzer 2100 system (Agilent Technologies, Waldbron, Germany). Then, the generated library was sequenced on the Illumina Hiseq 2500 platform from which 250 paired-end, base pair (bp) reads were created (Guangdong Magigene Biotechnology Co., Ltd. Guangzhou, China). Paired-end raw reads were filtered and quality-checked following the Trimmomatic v0.33 tool (http://www.usadellab.org/cms/?page=trimmomatic). Paired-end clean reads were merged using the FLASH v1.2.11 software (https://ccb.jhu.edu/software/FLASH/). Genome sequences were assigned to each soil sample (Section 2.2) based on their unique barcode and primer using the Mothur v1.35.1 software (http://www.mothur.org).
The detection of potential chimaera and clustering of the retrieved rRNA sequences were performed with the usearch v10 software (http://www.drive5.com/usearch/). Operational taxonomic unit (OTU) were defined as clusters of sequences with ≥97% similarity. For each sequence, the silva database (https://www.arb-silva.de/) was used to annotate taxonomic information, setting the confidence threshold to default at ≥0.5. According to the taxonomic results, OTU relative abundance diagrams were built from which rich infrared images were obtained using Origin 9.1 (OriginLab, Northampton, MA, USA).
Data availability: The raw sequence data for 16S rRNA gene have been deposited under NCBI Sequence Read Archive with accession number PRJNA674491.

2.4 Molecular ecological networks (MENs)

To investigate the effects of reclamation time on interactions and key species of SMC, phylogenetic molecular ecological networks (MENs; Zhou et al., 2011) were established per reclamation site (r8, r11, r14 and r17) using the information retrieved from the SMC sequence data (Section 2.3). MENs were built using the computational pipeline MENA (http://ieg4.rccc.ou.edu/mena) and plotted using the Cytoscape v3.7.1 software (Doncheva et al., 2018) for further analysis and characterization. Only those OTUs detected in all the analyzed soil samples were considered to build the MENs. Each MEN was characterized in the light of its topology (Tu et al., 2020) – i.e. the total number of nodes (species or OTUs) and links between the nodes. The quantity of nodes within a network was quantified by establishing a power-law distribution (Deng et al., 2012), for which we quantified the coefficient of determination (R 2, between 0 and 1) to classify the network into scale-free (i.e. most nodes have few neighbors; Girvan and Newman 2002; Tu et al., 2020). The links between the nodes were characterized in the light of the average node degree or average connectivity (avgK), which was used to classify the network into a small-world (i.e. the average distance between two nodes is short, meaning that the nodes in the network are closely related with each other; Amaral et al., 2000; Zhou et al., 2010). The average geodesic distance (GD) between nodes was also calculated (Deng et al., 2012). On one hand, AvgK refers to the number of nodes directly connected to a particular node, where ‘higher average degree’ (connectivity) indicates a more complex network (Deng et al., 2012). Nodes with higher degrees were considered to be the central nodes in the network structure (Layeghifard et al., 2017). Accordingly, nodes with higher connectivity (degree) in the network were defined as top nodes. On other hand, GD is the shortest path between two nodes. When GD is low indicates that the nodes within a network are close to each other, indicating that the network is complex (Deng et al., 2012). We used the metric ‘average clustering coefficient’ (avgCC) to describe how well a node was connected to its neighbors using a scale ranging from 0 to 1, where 1 denotes a fully connected node and 0 an unconnected node (Deng et al., 2012). Further MEN attributes were retrieved by establishing/detecting ‘Modules’ within each MEN. A ‘Module’ was defined using the metric ‘modularity’ (i.e. ranging for 0 to 1, it is the degree to which a network can be divided into communities or modules) as a group of OTUs with high connectivity within a MEN but with few connections outside the group. The higher the modularity is, the more modules a network can be divided into and, thus, the less complex a network is (Olesen et al., 2007; Deng et al., 2012). To estimate ‘modularity’, we used the ‘greedy modularity optimization approach (Deng et al., 2012). This approach enabled us to retrieve module connectivity metrics such as ‘within-module’ connectivity (Zi) and ‘among-module’ connectivity (Pi), considering all the nodes within the modules. These metrics were used to classify the nodes within a given MEN according to their role in the network (Newman, 2006; Deng et al., 2012) -i.e. peripherals (Zi ≤ 2.5, Pi ≤ 0.62), connectors (Zi ≤ 2.5, Pi > 0.62), module hubs (Zi > 2.5, Pi ≤ 0.62), and network hubs (Zi > 2.5, Pi > 0.62).

2.5 Statistical Analysis

Statistically significant differences between reclamation sites in terms of the studied key soil attributes were evaluated with the one-way ANOVA test at the 95 % and 99 % confidence levels, following normality testing through Shapiro-Wilk test and T test. The correlation between the OTU’s abundance profile and the studied key soil attributes (Section 2.2.) was assessed through Pearson’s correlation tests (r2).
Differences between the MEN retrieved for each reclamation site were evaluated through similarity matrices (Zhang and Horvath, 2005; Horvath and Dong, 2008). We also generated similarity matrices to evaluate the correlation between different MENs. To do so, Spearman’s rank correlation coefficients were used and similarity thresholds ranging between 0.01 and 0.99 were tried to then choose an optimum similarity threshold (Newman, 2004). The optimum similarity threshold was found through an iterative process that evaluated whether the statistical distribution of the eigenvalues’ spacing contained within the eigenvectors originated from each similarity matrix followed a Poison distribution (Newman, 2006; Deng et al., 2012). This test was carried out using the computational pipeline MENA (Deng et al., 2012 and 2016). We also investigated the relationship between modules (with >5 nodes) of MENs and soil attributes through Pearson’s correlation tests (r2) (Guimera et al., 2007). All other statistical tests were undertaken using the statistical software SPSS v20 (IBM, USA).