Four lineages can be identified in the A. viridiflora complex
Individuals from twenty populations of the A. viridiflora complex were sampled for resequencing. Additionally, to test whether the species complex, with its various phenotypes, could be regarded as a monophyletic group, we also sampled other wild columbine species sympatric with the A. viridiflora complex. Whole-genome sequencing of 80 individuals from nine species was performed, and after filtering, we obtained 1,064,089 high-quality biallelic single nucleotide polymorphisms (SNPs). We reconstructed the phylogenetic relationships among the Aquilegia species with the maximum likelihood (ML) and neighbor-joining (NJ) methods based on nuclear genome SNPs. Both topologies indicated, with strong support, that A. viridiflora, A. kamelinii and A. hebeica shared a most recent common ancestor (MRCA) and that these species were closely related to each other (Figure S4). Therefore, 672,439 high-quality biallelic SNPs in the 20 populations of the A. viridiflora complex were used to assess their evolution. Functional annotation indicated that 55.604% of the SNPs were located upstream or downstream of genes, 17.454% were in intronic regions, and 7.07% were in exonic regions. The ML tree inferred from these SNPs indicated that the individuals of the A. viridiflora complex were divided into four lineages (NE, EL, CN and NW): NE comprised A. kamelinii and the individuals of A. viridiflora distributed in northeastern China; EL comprised the individuals of A. viridiflora and A. hebeica distributed in the East Shandong and South Liaoning area; the individuals of A. hebeica distributed in North China formed a single lineage (CN); and the individuals of A. viridiflora distributed in northwestern China formed a single lineage (NW). In this case, the A. viridiflora complex showed a paraphyletic pattern: NE and EL formed sister clades, and the other two lineages, CN and NW, were closely related (Figure 1C). This revealed an evolutionary history different from the clusters based on phenotypes.
Analysis of the population genetic structure of the A. viridiflora complex indicated that K = 4 ancestral clusters was optimal according to the cross-validation error rate (Figure S5). The inferred ancestry was consistent with the geographical distribution of the 20 populations and with the phylogenetic relationships detailed above. Individuals of the SZ, LT, YM, XW, HL, HH and HD populations, located in the contact regions between lineages, showed multiple ancestral components (Figure 1A and 1B), which might reflect recent gene flow between these lineages. In the PCA plot, the first principal component (PC1) and second principal component (PC2) explained 6.61% and 4.04% of the observed variation, respectively. The PCA also showed four distinct lineages among the 20 populations, with the individuals of mixed ancestry occupying intermediate positions between the lineages (Figure S2B). The neighbor-net phylogenetic network reflected these patterns: NE and EL were clearly differentiated, whereas CN and NW were not, and the network also supported a mixed genetic background for the individuals noted above (Figure 2A). In addition, we detected a significant signal of hybridization in these populations at the individual level, in tests in which P1 and P2 did not belong to the same group as the hybrids (Table S2). Unsurprisingly, the neighbor-net phylogenetic network based on 190 polymorphic sites in the chloroplast genome differed slightly from that based on nuclear genome polymorphisms, likely as a result of hybridization and backcrossing of hybrid lineages, but it still showed a paraphyletic pattern (Figure S6A). Taken together, chloroplast and nuclear genome polymorphisms support four lineages across the sampled A. viridiflora complex.
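The two summary steps in this paragraph, choosing K from the cross-validation error and reporting the share of variation explained by each principal component, can be illustrated with a minimal Python sketch. The function names and all numeric inputs below are hypothetical placeholders, not values or tools from this study; the sketch only shows the arithmetic involved.

```python
def best_k(cv_errors):
    """Return the K with the smallest cross-validation error.

    cv_errors: dict mapping K (int) to its cross-validation error (float).
    """
    return min(cv_errors, key=cv_errors.get)


def explained_variance(eigenvalues):
    """Return the proportion of total variance carried by each component.

    eigenvalues: eigenvalues of the genotype covariance matrix, one per PC.
    """
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


# Hypothetical cross-validation errors for K = 2..5 (illustrative only).
cv = {2: 0.61, 3: 0.58, 4: 0.55, 5: 0.57}
chosen = best_k(cv)

# Hypothetical eigenvalues (illustrative only).
props = explained_variance([2.0, 1.0, 1.0])
```

With genome-wide SNP data the leading components often explain only a few percent of the variance each, because the total is spread over as many components as there are individuals.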
To understand the diversity patterns, the nucleotide diversity (π) of the NE, EL, CN and NW lineages was calculated across the genome in 50 kb sliding windows with a 10 kb step. Among the four lineages, NW showed the highest nucleotide diversity and EL showed the lowest (Figure 2B). Based on chloroplast genome polymorphisms, we detected 27 haplotypes among the 66 A. viridiflora complex individuals. The haplotype diversity (hd) and nucleotide diversity (π) across all individuals were 0.971 and 0.00022, respectively. Among the four lineages, NW had the highest haplotype diversity and nucleotide diversity (Table S3). Moreover, the NW haplotypes were scattered throughout the haplotype network, whereas the haplotypes of the other lineages were confined to limited parts of it (Figure S6B, Table S4).
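For readers unfamiliar with these statistics, the following Python sketch shows one common way to compute windowed nucleotide diversity and haplotype diversity. It is an illustration under stated assumptions, not the pipeline used in this study: the function names are hypothetical, sites are assumed biallelic and coded 0/1, every base in a window is assumed callable, and the toy inputs are invented.

```python
from collections import Counter


def site_pi(alleles):
    """Mean pairwise difference at one biallelic site (alleles coded 0/1)."""
    n = len(alleles)
    if n < 2:
        return 0.0
    k = sum(alleles)  # number of chromosomes carrying allele 1
    return 2.0 * k * (n - k) / (n * (n - 1))


def windowed_pi(positions, haplotypes, seq_len, window=50_000, step=10_000):
    """Per-base nucleotide diversity in sliding windows.

    positions:  1-based SNP coordinates.
    haplotypes: one list of 0/1 alleles per sampled chromosome.
    Assumes every base in a window is callable (an assumption of this sketch).
    Returns a list of (start, end, pi) tuples.
    """
    per_site = [site_pi([h[i] for h in haplotypes])
                for i in range(len(positions))]
    results = []
    start = 1
    while start <= seq_len:
        end = min(start + window - 1, seq_len)
        total = sum(p for pos, p in zip(positions, per_site)
                    if start <= pos <= end)
        results.append((start, end, total / (end - start + 1)))
        if end == seq_len:
            break
        start += step
    return results


def haplotype_diversity(labels):
    """Nei's haplotype diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(labels)
    counts = Counter(labels)
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))


# Toy data (invented): four chromosomes, two SNPs on a 100 kb sequence.
toy_haps = [[0, 1], [1, 1], [0, 0], [1, 0]]
windows = windowed_pi([5, 60_000], toy_haps, seq_len=100_000)
hd = haplotype_diversity(["a", "a", "b", "c"])
```

Dividing by the full window length understates π wherever part of a window is uncallable, so a production analysis would normally divide by the number of callable sites instead.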