4. Discussion

4.1. Ultraconserved elements successfully delimit species in all investigated cases

In all six complexes of wild bee species examined here, UCEs provided robust species hypotheses and clearly outperformed COI for species delimitation. The main results of our study can be summarized as follows. First, we provide strong evidence of mitochondrial introgression in two species pairs (Andrena barbareae and A. cineraria, Lasioglossum cupromicans and L. bavaricum): UCEs were in agreement with morphology but not with COI, which suggests that barcode sharing occurs in these species pairs. Second, three species complexes presented multiple mitochondrial DNA barcodes in a single biological species (i.e. Andrena amieti, A. propinqua, A. trimmerana); for all three species, UCEs recovered strongly supported monophyletic groups that were in agreement with morphology, while the two mitochondrial barcodes within each species formed a paraphyletic assemblage from which another species arose (A. allosa, A. dorsata and A. carantonica, respectively), resulting in the absence of a barcode gap and unresolved mitochondrial species delimitation. Third, our results suggest that the two mitochondrial clades observed within Andrena bicolor probably represent two distinct cryptic species. In addition, UCE-based species delimitation resolved long-standing controversies in the taxonomy of Central European bees: the two generations within each of A. rosae and A. trimmerana do not appear to represent distinct species, and the distinctiveness of the other species pairs or triplets investigated here is strongly confirmed by the UCE data.

4.2. DNA barcoding errors

COI-based barcoding is subject to two types of errors. The first error (similar to type I error) occurs when one biological species is associated with two distinct DNA barcodes, as observed for A. amieti, A. propinqua and A. trimmerana. Type I errors ultimately lead to erroneous detection of two hypothetical species within a single biological species. Most often, these errors are triggered by deep within-species divergences or artefacts such as nuclear insertions (Song, Buhay, Whiting, & Crandall, 2008). The second error (i.e. type II error) occurs when DNA barcoding fails to recognize two distinct species because of barcode sharing, as observed between the pairs Andrena barbareae/cineraria and Lasioglossum cupromicans/bavaricum.
Identifying the exact biological mechanism behind these barcoding errors can be tedious, but often they are linked to incomplete lineage sorting, hybridization followed by introgression, demographic disparities, Wolbachia infections or sex-biased asymmetries (i.e. male-biased dispersal, mating behaviour or sex-biased offspring production) (Toews & Brelsford, 2012). Most often these events occur in recently diverged species and are not necessarily mutually exclusive (Mutanen et al., 2016). In this study, the low number of specimens sampled and sequenced makes it difficult to investigate the underlying mechanism. A more complete sampling across the entire distribution would be necessary to separate incomplete lineage sorting from the other mechanisms. Indeed, incomplete lineage sorting is most often not associated with any biogeographical pattern (Funk & Omland, 2003; Toews & Brelsford, 2012). In contrast, events such as hybridization/introgression often leave biogeographical footprints because they are unidirectional, which implies that the gene flow is directed from the native taxon towards the colonized taxon (Currat, Ruedi, Petit, & Excoffier, 2008; Nevado, Fazalova, Backeljau, Hanssens, & Verheyen, 2011; Pons et al., 2014). Therefore, introgression levels are highest at the hybridization zone and fade away over the colonized distribution zone (Toews & Brelsford, 2012). Further work with a wider geographic coverage would be necessary to unravel the cases of DNA barcoding errors documented here.
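The two error types described above can be made concrete with a barcode-gap check. The following is a minimal sketch in Python, not the analysis used in this study: it assumes uncorrected p-distances, and the species labels (X, Y, P, Q, R, S) and ten-base sequences are hypothetical toy data chosen only for illustration. A barcode gap is taken to exist when the smallest interspecific distance exceeds the largest intraspecific distance.

```python
from itertools import combinations


def p_distance(a, b):
    """Uncorrected p-distance: proportion of differing sites,
    skipping positions where either sequence is not A, C, G or T."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def barcode_gap(records):
    """records: list of (species_label, sequence) tuples.
    Returns (max intraspecific, min interspecific, gap_present)."""
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = p_distance(s1, s2)
        (intra if sp1 == sp2 else inter).append(d)
    max_intra = max(intra) if intra else 0.0
    min_inter = min(inter)
    return max_intra, min_inter, min_inter > max_intra


# Hypothetical toy data (not real COI sequences).
# Clean case: two species separated by a barcode gap.
clean = [("R", "AAAAAAAAAA"), ("R", "AAAAAAAAAT"),
         ("S", "CCCCAAAAAA"), ("S", "CCCCAAAAAT")]

# Type I-like case: species X carries two divergent barcodes,
# one of which is closer to species Y than to the other X barcode.
split = [("X", "AAAAAAAAAA"), ("X", "AAAAAAAAAA"),
         ("X", "AAAACCCCAA"), ("Y", "AAAACCCGAA")]

# Type II-like case: species P and Q share identical barcodes.
shared = [("P", "AAAAAAAAAA"), ("P", "AAAAAAAAAT"),
          ("Q", "AAAAAAAAAA"), ("Q", "AAAAAAAAAT")]

for name, data in [("clean", clean), ("split", split), ("shared", shared)]:
    max_intra, min_inter, gap = barcode_gap(data)
    print(name, round(max_intra, 2), round(min_inter, 2), gap)
```

In this toy setting only the first dataset shows a barcode gap. In the second, deep within-species divergence exceeds the distance to the neighbouring species; in the third, barcode sharing reduces the minimum interspecific distance to zero. Real analyses would typically use a substitution model and many more specimens.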

4.3. Could the UCEs have overlooked additional levels of cryptic diversity?

Given the low rate of evolution of UCEs, an important question in our study, and more generally in the use of UCEs for species delimitation, is whether they can successfully uncover variation between recently diverged species. It could be argued that the cases of mitochondrial paraphyly (i.e. A. amieti, A. trimmerana and A. propinqua) in fact represent additional, overlooked instances of cryptic species, and that the rate of evolution of UCEs is too low to recover these divergences. At least for A. amieti, our sampling across the entire known distribution of this species enables us to exclude this scenario. We included specimens from the Alps and from the Apennines in southern Italy, some 600 km from the nearest Alpine population; the Apennine specimens are morphologically slightly divergent from the Alpine populations (Praz et al., 2019). In the COI tree (Figure 1), the southern Italian specimens all clustered in one mitochondrial clade, while the Alpine specimens were distributed over both mitochondrial clades. The UCEs recovered two strongly supported clades within A. amieti, one corresponding to the southern Italian population and the other including all Alpine specimens (Figure 1). This result strongly contradicts the hypothesis of two separate lineages corresponding to the two mitochondrial clades. Rather, UCE results agree with the strong geographic separation of the Alpine and Apennine populations and with their slight morphological differentiation.
In the two other cases of mitochondrial paraphyly investigated in our study (i.e. A. trimmerana and A. propinqua), the presence of additional cryptic species cannot be completely rejected. However, we deem this scenario highly unlikely, since near-cryptic species in bees are almost exclusively associated with some level of morphological differentiation in highly variable characters such as pile colour or punctuation (McKendrick et al., 2017; Pauly et al., 2019; Praz et al., 2019). In our study, such morphological variation was not observed in the divergent specimens (admittedly, it was also not observed between the two clades within A. bicolor, although more specimens of both sexes and both generations are needed to address this question thoroughly). In addition, the specimens that were divergent in the mitochondrial trees were nested within the clades of conspecific specimens in the UCE trees. We speculate that such high within-species divergences in mitochondrial barcodes will prove more common than previously expected once barcoding with continental-scale sampling is achieved (Hinojosa et al., 2019; Schmidt et al., 2015).
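The contrast between mitochondrial paraphyly and UCE monophyly discussed in this section can be illustrated with a simple monophyly test on rooted trees. The sketch below is an illustration only and does not reproduce the trees of this study: trees are written as nested Python tuples, the tip labels (X_a1, Y_1, etc.) are hypothetical, and the two toy topologies are assumptions constructed to mimic a species X carrying two mitochondrial barcodes from which a second species Y arises.

```python
def leaves(node):
    """Return the set of tip labels below a node (tips are strings)."""
    if isinstance(node, str):
        return {node}
    tips = set()
    for child in node:
        tips |= leaves(child)
    return tips


def is_monophyletic(tree, group):
    """True if some clade of the rooted tree contains exactly the tips in group."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Hypothetical specimens of species X (barcodes a and b) and species Y.
species_x = {"X_a1", "X_a2", "X_b1"}
species_y = {"Y_1", "Y_2"}

# Toy mitochondrial tree: Y arises from within X, so X is paraphyletic.
mito_tree = (("X_a1", "X_a2"), ("X_b1", ("Y_1", "Y_2")))

# Toy nuclear tree: the divergent X specimen is nested among conspecifics.
uce_tree = ((("X_a1", "X_a2"), "X_b1"), ("Y_1", "Y_2"))

print(is_monophyletic(mito_tree, species_x))  # False
print(is_monophyletic(uce_tree, species_x))   # True
```

The test depends on the rooting of the tree, and in practice it would be combined with branch support values rather than applied to topology alone.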

4.4. Comparison of different species delineation methods

The method of species delimitation that provided the results most in agreement with current morphological hypotheses was BPP. By contrast, results from both GMYC and bGMYC were less congruent with morphological species hypotheses, and in several cases tended to inflate the number of species. Compared to BPP, GMYC analyses can overestimate or underestimate the number of species (Carstens et al., 2013; Luo, Ling, Ho, & Zhu, 2018), especially in the presence of high intraspecific variation (Talavera, Dincă, & Vila, 2013). In our particular case, specimens with low-quality input DNA yielded high levels of missing values, which led to longer branches in the trees. In most cases, the GMYC analyses split specimens with long branches into singleton species (Supplementary information S8), which ultimately inflated the overall species number. Therefore, GMYC analyses should be interpreted with caution when applied to UCE data.

4.5. Concluding remarks on the use of UCEs for species delimitation

Our results confirm that UCEs can provide sufficient variation at shallow time scales in insects to enable species discrimination, adding to previous evidence gathered in vertebrates (Harvey, Smith, Glenn, Faircloth, & Brumfield, 2016; Zarza et al., 2018). Harvey et al. (2016) comprehensively compared the utility of sequence capture methods, specifically using UCEs as in our study, and RAD-seq for shallow phylogenies. They found that both techniques resulted in similar phylogenetic hypotheses and branch support values, and that RAD-seq provided more overall information while sequence capture provided higher per-locus information. They also suggested that the high amount of information typical of RAD-seq was not necessarily an advantage when the underlying question was phylogeography, phylogeny or species delimitation. Harvey et al. (2016) concluded that sequence capture is more useful in systematics because of its repeatability, the possibility of using low-quality samples, the ease of read orthology assessment, and the higher per-locus information.
Our results build upon this early work and largely confirm these predictions. RAD-seq datasets would have been nearly impossible to gather for the species investigated here due to low DNA quality or quantity. The possibility of processing specimens belonging to three different families simultaneously and the possibility of assembling datasets iteratively are particularly promising advantages of UCE capture methods for species delimitation. Future work should focus on very recently diverged taxa to further determine the level of divergence that can be recovered with these conserved markers. In addition, whether UCEs will enable the detection of hybrids, and to what extent the presence of these hybrids impacts tree reconstruction or species delimitation analyses, should be investigated. Lastly, our analyses strongly suggest the presence of two cryptic species within one of the most common European bees, Andrena bicolor. Enlarging our dataset to the entire geographical range of A. bicolor will be necessary to further untangle this remarkable case of cryptic species in bees.