4.1 Macroinvertebrate identification based on DNA barcoding
Species identification by DNA barcoding rests on the observation that the genetic distance between species is much greater than that within a species. A threshold of 2% has been proposed for species delimitation and, in general, the average genetic distance between species is more than 10 times that within a species (Hebert et al., 2003; Ward et al., 2009). In this study, the average interspecific K2P distance (16.37%) was 21-fold higher than the average intraspecific K2P distance (0.78%), which satisfies the criterion that the average interspecific distance be more than 10 times the average intraspecific distance. The distribution histogram of intraspecific and interspecific distances shows that 85.19% and 97.04% of the intraspecific distances were less than 1% and 2%, respectively, and 97.88% of the interspecific distances were greater than 6%, implying very little overlap between intraspecific and interspecific genetic distances. In the Barcode Gap analysis, the minimum interspecific distance to the nearest neighbor (NN) was larger than the maximum intraspecific distance for 187 species (98.9% of all species). For only two species (P. laetum and P. bullum), the maximum intraspecific distance overlapped with the NN distance, so no barcode gap was present. These results show that DNA barcoding based on the COI gene is an effective method for species identification of benthic macroinvertebrates in the transboundary rivers of northwest China.
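The two quantities discussed above, the K2P distance and the barcode gap, can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the input layout (a list of pairwise distances plus a specimen-to-species mapping) and the toy values are assumptions made for illustration. The K2P distance uses the standard Kimura two-parameter formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions among compared sites.

```python
import math

# Transitions are purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        n += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    if n == 0:
        raise ValueError("no comparable sites")
    p, q = ts / n, tv / n
    term1, term2 = 1 - 2 * p - q, 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


def barcode_gap(pairs, species_of):
    """For each species, compare the maximum intraspecific distance with
    the minimum distance to any other species (nearest neighbor).

    pairs: iterable of (specimen_a, specimen_b, distance)
    species_of: dict mapping specimen id -> species name
    Returns: dict species -> (max_intra, min_nn, has_gap)
    """
    max_intra, min_nn = {}, {}
    for a, b, d in pairs:
        sa, sb = species_of[a], species_of[b]
        if sa == sb:
            max_intra[sa] = max(max_intra.get(sa, 0.0), d)
        else:
            for s in (sa, sb):
                min_nn[s] = min(min_nn.get(s, math.inf), d)
    result = {}
    for s in set(species_of.values()):
        intra = max_intra.get(s, 0.0)
        nn = min_nn.get(s, math.inf)
        result[s] = (intra, nn, nn > intra)
    return result


# Hypothetical toy data: species X has a barcode gap, species Y does not.
species_of = {"s1": "X", "s2": "X", "s3": "Y", "s4": "Y"}
pairs = [
    ("s1", "s2", 0.01), ("s3", "s4", 0.05),
    ("s1", "s3", 0.04), ("s1", "s4", 0.16),
    ("s2", "s3", 0.15), ("s2", "s4", 0.17),
]
gaps = barcode_gap(pairs, species_of)

# Ratio of the mean distances reported in the text (about 21-fold).
fold = 16.37 / 0.78
```

In this toy case, species Y lacks a barcode gap because its maximum intraspecific distance (0.05) exceeds its NN distance (0.04), which is the situation described above for P. laetum and P. bullum.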
Based on the NJ method, ABGD method and BIN analysis, high intraspecific genetic distances and multiple genetic lineages were observed in nine taxa (Radix auricularia, Epeorus sp5, Rhithrogena tianshanica, Ameletus montanus, Atherix sp. XJ, Glyptotendipes sp. XJ, Euryhapsis sp., Dicranota guerini and Cricotopus ornatus), suggesting the presence of cryptic species diversity among benthic macroinvertebrates in these transboundary rivers of northwest China. Although Hebert and Ward proposed a threshold value of 2% for species differentiation based on DNA barcoding (Hebert et al., 2003; Ward et al., 2009), genetic differentiation can occur among geographical populations of the same species, so that intraspecific genetic distance can exceed the 2% threshold used for species classification (Tajiama et al., 1983; Hickerson et al., 2006; Ward et al., 2009). In the present study, the nine species exhibited high intraspecific genetic distance and multiple genetic lineages, which is consistent with the conclusions of Ward et al. (2009). Our results also support the conclusion that the genetic distance between different geographical populations of conspecifics can exceed 2% (Hebert et al., 2003b; Ward et al., 2005). Notably, the two or three molecular lineages/clusters observed in each of R. auricularia, Glyptotendipes sp. XJ, C. ornatus, D. guerini, Atherix sp. XJ, R. tianshanica, Epeorus sp5 and A. montanus corresponded to different geographical areas, implying that biogeographic events have produced large intraspecific divergence in these species. Geographical distance may therefore play an important role in the formation of high intraspecific genetic distance or cryptic species.
Likewise, genetic differentiation within a species occurred among sampling sites and at different geographic scales in the Irtysh River, Emin River and Ili River. For instance, the sampling site of Bieliezeke in the Irtysh River is nearly 700 km away from that of Qiaoerma in the Ili River. For Ameletus montanus, four individuals from Bieliezeke showed high divergence (up to 15.07%) from 23 individuals from Qiaoerma, suggesting that the divergence reaches an interspecific level. Moreover, the BIN and ABGD analyses divided them into two different groups, and the NJ tree also formed two main branches. On rechecking the specimens, we did not find any morphological feature that would indicate different species. However, the altitude of Qiaoerma (2294 m) is much higher than that of Bieliezeke (640 m). The habitats of the two sites also differ markedly, as do the nuptial flight and breeding time of the two populations. Long evolutionary time and these habitat differences may thus provide conditions favourable to cryptic speciation in the two populations. Similarly, D. guerini showed high intraspecific divergence (up to 7.26%) between the Irtysh River and Ili River populations. Even within the same river (Ili River), R. auricularia yielded an intraspecific divergence of 4.3% between the Nileke and Zhaosu sites, which are separated by a distance of 150 km. These two sites are situated on two different tributaries in the upper reaches of the Ili River and are isolated by the Wusun Mountain. In the present study, DNA barcoding proved effective for species identification of benthic macroinvertebrates in most cases. However, because DNA barcodes provide only a preliminary hypothesis of species classification, they can be supplemented by morphological, ecological, nuclear DNA and other data with respect to the existence of cryptic species.
DNA barcoding has been widely used for species identification (Andrea et al., 2016; Versteirt et al., 2015). However, whether DNA barcoding can distinguish individuals from different geographical populations, subspecies or biotypes remains unknown. The NJ tree showed that conspecific barcode sequences from the present study first clustered together, and then clustered with those from other areas (Germany, United States, Mexico, Canada, Norway, Italy, Finland etc.). In the NJ tree, both S. striata and C. pallidivittatus comprised two subclusters, which corresponded to the sampling locations. Individuals from the same geographical population clustered together with high support values, and the phylogenetic tree indicated that the divergence of geographical populations was related to geographical distance. We therefore inferred that the population differentiation of benthic macroinvertebrates in these four rivers can be ascribed to geographical isolation. It has been reported that COI is not sensitive enough to resolve intraspecific variation, especially when populations have not been geographically separated long enough to form a distinct pattern (Verheyen et al., 2003; Aliabadian and Kaboli, 2008). This phenomenon was also observed in our study. In the analysis of genetic structure among geographic populations, common haplotypes were shared among three adjacent geographic populations of three mayfly species, and the populations showed a certain degree of gene flow together with intra-population and inter-population genetic divergence. In the NJ tree, these geographical populations did not separate into different branches according to location. The lack of genetic differentiation among populations left the COI gene unable to distinguish infraspecific categories effectively.
However, if geographical isolation and the ecological environment lead to the accumulation of genetic differentiation among populations, DNA barcoding has the potential to distinguish geographical populations, subspecies or biotypes (Monaghan et al., 2006). Although the COI gene has great potential for identification at the species level, its evolutionary rate may be too limited for infraspecific identification because it is a protein-coding gene. COI is therefore not sensitive enough to distinguish populations with very small genetic differentiation, where the geographical locations are adjacent and geographical isolation has not persisted long enough. In such cases, additional approaches (e.g., increasing the length of the DNA barcode) should be considered, especially non-protein-coding markers with faster evolutionary rates.
4.2 Environmental and biodiversity assessment
Phylogenetic diversity (PD) represents the adaptive potential of a community, in that higher genetic diversity means greater adaptability to environmental change (Vellend et al., 2011). In addition, PD indices measure the evolutionary relationships among species and can reflect aspects of biodiversity that are not captured by traditional indices based on taxonomic diversity (e.g., Petchey & Gaston, 2002; Cianciaruso et al., 2009; Vellend et al., 2011; Weiher, 2011). PD metrics therefore have great potential for use in the biomonitoring of aquatic ecosystems (Vandewalle et al., 2010).
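To make the idea of a PD index concrete, the sketch below computes Faith's PD, one widely used index defined as the sum of the branch lengths connecting a community's taxa to the root of the tree. This section does not state which PD index was calculated, so the choice of Faith's PD is an assumption for illustration, as are the tree encoding (a dictionary mapping each node to its parent and branch length) and the toy tree itself.

```python
def faith_pd(tree, taxa):
    """Faith's PD: total branch length of the subtree linking `taxa` to the root.

    tree: dict mapping node -> (parent, branch_length); the root has parent None.
    taxa: iterable of tip names present in the community.
    """
    visited = set()
    total = 0.0
    for taxon in taxa:
        node = taxon
        # Walk towards the root, counting each branch only once.
        while node is not None and node not in visited:
            parent, length = tree[node]
            visited.add(node)
            if parent is not None:
                total += length
            node = parent
    return total


# Hypothetical toy tree: tips A and B share an internal node n1; C joins at the root.
toy_tree = {
    "root": (None, 0.0),
    "n1": ("root", 1.0),
    "A": ("n1", 0.5),
    "B": ("n1", 0.5),
    "C": ("root", 2.0),
}

pd_close = faith_pd(toy_tree, ["A", "B"])     # two closely related taxa
pd_distant = faith_pd(toy_tree, ["A", "C"])   # two distantly related taxa
```

Both toy communities contain two taxa, yet the one with distantly related members has the larger PD (3.5 versus 2.0). This is the sense in which PD captures information that a taxon count alone does not.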
Our results demonstrated that human disturbance led to a decrease in PD, since values at reference sites were significantly higher than those at disturbed sites (Fig. 5). Lower PD values at disturbed sites suggest that human activities affect not only the taxonomic diversity of rivers, but also the evolutionary history shared by the component species. The PD metrics responded to environmental impact and complemented the information provided by classical metrics. This suggests that PD metrics can reflect environmental stress and can thus be used as a metric in the Index of Biotic Integrity (IBI) to reflect the degree of disturbance in river systems.
We evaluated the water quality of the Irtysh River and Ili River based on a combination of macroinvertebrate assemblages and biotic indices. Moreover, PD was included among the candidate parameters of the IBI. The results of the water quality assessment of the Irtysh River through chemical measurements were identical to those of Liu et al. (2002). Thus, the proposed IBI system is appropriate for the studied rivers and can serve as an effective measure for environmental monitoring in river systems. In summary, DNA barcoding can provide a quantitative method to differentiate between Good and Bad water quality. Moreover, our study showed that species identifications based on DNA barcoding have the potential to detect subtle changes in stream condition (e.g., taxa abundance, cryptic species and multiple species lineages).
In conclusion, our study constructed, for the first time, a DNA barcoding reference library of benthic macroinvertebrates in the transboundary rivers of northwest China, covering 1227 sequences, 189 species and nine taxa of macroinvertebrates. Barcode Gap analysis, tree-based methods, ABGD analysis and BIN analysis were integrated, and their results were compared with those of morphological identification. Our results demonstrated that DNA barcoding based on COI is an effective method to clarify species boundaries and quantitatively evaluate species diversity, and that it can be used to evaluate lineage diversity and phylogenetic structure, as well as to assess biodiversity and environmental condition in specific areas.