Population genetic analysis in P. oryzae
To detect genetic lineages among the 46 P. oryzae isolates for
which we had full-genome information, we combined these data with 48
worldwide rice-infecting P. oryzae genomes published by
Gladieux et al. (2018). That study described six rice-infecting lineages,
two of which (lineages 5 and 6) were each represented by a single isolate.
Since then, two studies based on larger sets of fully sequenced and/or
genome-wide genotyped isolates (Latorre et al., 2020; Thierry et al.,
2021) showed that lineages 5 and 6 are in fact part of
lineage 1. We used the phylogenetic network approach neighbor-net as
implemented in SplitsTree 4.13 (Bryant & Moulton, 2004). This allowed
us to visualize evolutionary relationships while taking
into account the possibility of recombination within or between
lineages. We also assessed the genealogical relationships among the 46
fully sequenced YYT P. oryzae isolates by analyzing
pseudo-assembled genomic sequences (i.e. genomic sequences generated
from the table of SNPs and reference sequences) with RAxML
(Stamatakis, 2014). We
used the General Time-Reversible model of nucleotide substitution with
the Γ model of rate heterogeneity, and performed 100 bootstrap
replicates to assess branch support. To assess population
subdivision in the genomic data of the 46 P. oryzae isolates, sNMF
analyses were performed using a genlight object generated in the R
statistical environment. To assess population subdivision without
considering a priori information about the origin of samples, and without
assuming random mating, discriminant analyses of principal components
(DAPC) were conducted on the microsatellite data for the 512 YYT and 45
worldwide isolates. DAPC analyses were performed using the adegenet
package in R (Jombart, Devillard, & Balloux, 2010), varying the number
of inferred genetic clusters (K) from 2 to 10.
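
The pseudo-assembled genomic sequences mentioned above (sequences generated from the table of SNPs and the reference) can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the procedure used in this study: the function names `pseudo_assemble` and `to_fasta`, the layout of the SNP table, and the choice to write `N` for isolates without a call are all hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical SNP table layout: (1-based position, {isolate: allele}).
SnpTable = Iterable[Tuple[int, Dict[str, str]]]


def pseudo_assemble(reference: str, snps: SnpTable,
                    isolates: Iterable[str],
                    missing: str = "N") -> Dict[str, str]:
    """Return one pseudo-assembled sequence per isolate.

    Each sequence is a copy of the reference in which every SNP position
    carries the allele called for that isolate, or `missing` when the
    isolate has no call at that position (an assumption of this sketch).
    """
    isolates = list(isolates)
    seqs = {name: list(reference) for name in isolates}
    for pos, calls in snps:
        if not 1 <= pos <= len(reference):
            raise ValueError(f"SNP position {pos} lies outside the reference")
        for name in isolates:
            seqs[name][pos - 1] = calls.get(name, missing)
    return {name: "".join(chars) for name, chars in seqs.items()}


def to_fasta(seqs: Dict[str, str]) -> str:
    """Format the sequences as FASTA text, one record per isolate."""
    return "".join(f">{name}\n{seq}\n" for name, seq in seqs.items())
```

For a reference `ACGTACGT` and two SNPs at positions 2 and 7, an isolate called `T` and `A` at those positions would receive `ATGTACAT`. The resulting FASTA alignment is the kind of input a phylogenetic program such as RAxML reads.
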
For microsatellite data, a distance-based neighbour-joining tree was
generated with Populations (Langella, 2008), and
within-population diversity and linkage equilibrium parameters were
estimated using the poppr package in R (Kamvar, Tabima, &
Grünwald, 2014). Nucleotide diversity (π) within the lineages identified
using full-genome sequencing data was estimated using the package
EggLib 3.0.0b10 (De Mita & Siol, 2012), and divergence among lineages
was estimated in 10 kb windows using
the dxy statistic as implemented in scikit-allel
(https://zenodo.org/record/4759368#.YW1duxBBzwQ). LD decay along the
genome was assessed within each genetic lineage using PopLDdecay
(Zhang, Dong, Xu, He, & Yang, 2019). We determined the mating type of
each resequenced isolate by BLAST searches for the Mat1 and Mat2
idiomorph sequences in each genome, assembled de novo using ABySS 2.0
with default parameters (Jackman et al., 2017). The ancestral
relationships and admixture among the YYT and
worldwide lineages were assessed with TreeMix (Pickrell & Pritchard,
2012), assuming various admixture events. The input files
were extracted from the VCF file using in-house scripts, and the
results were plotted in R using the script provided with the software.
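
As an illustration of the kind of conversion such an input-preparation step involves, the sketch below turns haploid genotype calls into the per-population allele-count lines that TreeMix reads (a header of population names followed by one line of `ref,alt` counts per SNP). It is not the in-house script used here: the function name `treemix_lines`, the genotype encoding (0, 1, -1 for missing) and the population labels are assumptions made for the example, and parsing of the VCF file itself is left out.

```python
from typing import List, Sequence


def treemix_lines(genotypes: Sequence[Sequence[int]],
                  sample_pops: Sequence[str]) -> List[str]:
    """Build TreeMix-style allele-count lines from haploid genotype calls.

    genotypes[i][j] is the call for SNP i in sample j: 0 (reference),
    1 (alternate) or -1 (missing; ignored in the counts).
    sample_pops[j] is the population (lineage) label of sample j.
    """
    pops = sorted(set(sample_pops))
    lines = [" ".join(pops)]  # header: space-delimited population names
    for row in genotypes:
        if len(row) != len(sample_pops):
            raise ValueError("each SNP needs one call per sample")
        # counts[pop] = [number of reference calls, number of alternate calls]
        counts = {pop: [0, 0] for pop in pops}
        for call, pop in zip(row, sample_pops):
            if call in (0, 1):
                counts[pop][call] += 1
        lines.append(" ".join(f"{counts[p][0]},{counts[p][1]}" for p in pops))
    return lines
```

With four samples labelled `L1, L1, L2, L2` and two SNPs called `[0, 0, 1, 1]` and `[0, 1, -1, 1]`, the function returns the lines `L1 L2`, `2,0 0,2` and `1,1 0,1`. TreeMix expects these lines in a gzip-compressed file, so they would be written out with Python's `gzip` module before running the program.
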