Sequence Data Processing and Statistical Analysis
(1) The DNA matrix was calculated. The UPARSE program (Edgar, 2013) was
used to cluster the sequences into operational taxonomic units (OTUs) at
a minimum identity threshold of 97%. The RDP-Classifier software v2.11
(https://sourceforge.net/projects/rdp-classifier/) was used to classify
and annotate the sequences of each leaf sample, with the confidence
parameter set to 50%. Microorganisms corresponding to sequences with a
genetic distance of less than 3% are generally considered to belong to
the same microbiota (Wang et al., 2007).
(2) The data were homogenized, the number of OTUs at different
taxonomic levels (genus and phylum) was calculated, and the
corresponding rarefaction curves were plotted. To assess the alpha
diversity of sequences in each sample, the Chao1, ACE, Shannon, and
Simpson indices were used.
(3) Phyllosphere microbiota in leaf samples from different regions were
compared by principal component analysis (PCA) (Gewersand et al., 2021)
and nonmetric multidimensional scaling (NMDS); the correlations used in
the PCA were linear.
(4) A Kruskal–Wallis (KW) rank-sum test was performed using the linear
discriminant analysis effect size (LEfSe) algorithm to compare the
distribution of OTUs among the leaves of rubber trees from different
regions.
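
As an illustration of the alpha diversity indices named in step (2), the following minimal Python sketch computes Chao1, Shannon, and Simpson values from a vector of per-OTU read counts. It is not the software used in this study. The function names and the example counts are hypothetical, the Chao1 form shown is the bias-corrected estimator, and the Simpson value is the dominance form D = sum(p_i^2); some tools report 1 - D or a finite-sample variant instead. ACE is omitted for brevity.

```python
import math


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-OTU read counts."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)  # OTUs seen exactly once
    doubletons = sum(1 for c in counts if c == 2)  # OTUs seen exactly twice
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i), using the natural logarithm."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson dominance index D = sum(p_i^2) (assumed form, see lead-in)."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


# Hypothetical read counts for the OTUs detected in one leaf sample.
sample_counts = [10, 5, 2, 1, 1, 0]
print(chao1(sample_counts), shannon(sample_counts), simpson(sample_counts))
```

Higher Chao1 and Shannon values indicate greater estimated richness and diversity, whereas a higher Simpson dominance value indicates a community dominated by fewer OTUs.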
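
The rank-sum comparison in step (4) can be sketched for a single OTU as follows. This is a minimal Python illustration of the tie-corrected Kruskal–Wallis H statistic only, not the LEfSe implementation, which additionally applies further tests and a linear discriminant analysis to estimate effect sizes. The function name and the relative abundance values are hypothetical.

```python
def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic.

    `groups` is a list of lists, one list of observations per region.
    """
    # Pool all observations, remembering which group each came from.
    pooled = sorted((value, g) for g, values in enumerate(groups) for value in values)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0

    i = 0
    while i < n_total:
        # Find the run of tied values starting at position i.
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        average_rank = (i + 1 + j) / 2
        for _, g in pooled[i:j]:
            rank_sums[g] += average_rank
        ties = j - i
        tie_term += ties**3 - ties
        i = j

    h = 12 / (n_total * (n_total + 1)) * sum(
        r * r / len(values) for r, values in zip(rank_sums, groups)
    ) - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total**3 - n_total)
    # If every observation is identical, the groups cannot differ.
    return h / correction if correction > 0 else 0.0


# Hypothetical relative abundances (%) of one OTU in leaves from three regions.
abundance_by_region = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
print(kruskal_wallis_h(abundance_by_region))
```

Under the null hypothesis, H approximately follows a chi-square distribution with (number of groups - 1) degrees of freedom, so larger values indicate stronger evidence that the OTU is distributed differently among regions.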