Prediction of fish species diversity from OTU accumulation curves
Since both approaches (assigned taxa and OTUs) underestimated the level of taxonomic diversity within fish families with high uncertainty, we modeled accumulation curves from the diversity of species and OTUs found across our 92 samples. The modeled asymptote of the assigned species reached 429 species, a value very close to the 394 sequenced species present in the Bird’s Head peninsula, but 3.7 times lower than the 1,611 species in the regional checklist (Fig. 4a). Meanwhile, the OTU accumulation curve reached an asymptote of 1,531, a value very close to the 1,611 fish species in the Bird’s Head.
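The saturating function used for the accumulation curves is not named here. As an illustration only, the sketch below assumes a Michaelis-Menten form, S(n) = S_max · n / (K + n), and estimates the asymptote S_max by least squares. The function name `fit_saturating_curve`, the search bounds and the synthetic data are all hypothetical and are not taken from the study.

```python
import math

def fit_saturating_curve(n_samples, richness, k_bounds=(1e-3, 1e4), iterations=200):
    """Least-squares fit of S(n) = s_max * n / (k + n); returns (s_max, k).

    Assumed model (Michaelis-Menten); the study may have used another one.
    For a fixed k the model is linear in s_max, so s_max has a closed form
    and only k needs a one-dimensional search (golden-section on log k).
    """
    def profile(log_k):
        k = math.exp(log_k)
        x = [n / (k + n) for n in n_samples]
        s_max = sum(xi * si for xi, si in zip(x, richness)) / sum(xi * xi for xi in x)
        sse = sum((si - s_max * xi) ** 2 for xi, si in zip(x, richness))
        return sse, s_max, k

    lo, hi = math.log(k_bounds[0]), math.log(k_bounds[1])
    ratio = (math.sqrt(5) - 1) / 2
    for _ in range(iterations):
        mid_lo = hi - ratio * (hi - lo)
        mid_hi = lo + ratio * (hi - lo)
        # Keep the sub-interval that contains the smaller squared error.
        if profile(mid_lo)[0] < profile(mid_hi)[0]:
            hi = mid_hi
        else:
            lo = mid_lo
    _, s_max, k = profile((lo + hi) / 2)
    return s_max, k

# Synthetic, noise-free curve over 92 samples (illustrative values only).
samples = list(range(1, 93))
mean_richness = [1500.0 * n / (40.0 + n) for n in samples]
asymptote, half_saturation = fit_saturating_curve(samples, mean_richness)
print(f"estimated asymptote: {asymptote:.1f}, half-saturation: {half_saturation:.1f}")
```

With real data the input would be the mean cumulative richness over many random orderings of the samples, and the choice of model would need checking against the observed curve shape.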
Applying this method to the 15 fish families with more than 10 OTUs and more than 10 species in the checklist allowed us to assess the ability of eDNA-based accumulation curves to predict regional fish richness. For instance, the OTU accumulation curves for the Gobiidae, Labridae and Pomacentridae, the three richest families (51, 54 and 53 OTUs respectively), produced asymptotes, and thus predictions of fish diversity, much lower than those in the regional checklist, with 107.5, 66.1 and 76.2 OTUs, i.e. 47.5%, 81.7% and 69.6% of the checklist richness respectively (Fig. 4b, c, d).
We then tested the ability of the assigned taxa, the OTUs and the OTU accumulation curve approaches to predict fish species richness within families of the regional checklist, i.e. the predictive power of linear or proportional relationships. The total number of assigned taxa per family in our samples was a significant but weak predictor of the number of fish species per family in the checklist (R² = 0.60, p < 0.001, Fig. 5a), with the richness of some families being largely underestimated (e.g. 87.4% of net difference with the checklist for the Gobiidae, Fig. 5a, d). The number of OTUs per family was a better predictor of the family species richness in the checklist (R² = 0.80, p < 0.001) but left 20% of the variation among families unexplained, with a still marked underestimation (73.3% of net difference with the checklist for the Gobiidae, Fig. 5b, e). Using the asymptotes of OTU accumulation curves, we obtained a high predictive accuracy (R² = 0.92, p < 0.001) for the species richness within families, with less bias for the Gobiidae (43.7% of net difference with the checklist) (Fig. 5c, f).
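The comparison of predictors can be illustrated with an ordinary least-squares fit and its R². In the sketch below, the per-family values are invented for illustration, the helper names are hypothetical, and the net difference is assumed to be the checklist richness minus the predicted richness, expressed as a percentage of the checklist richness; that definition is a reading of the text rather than a stated formula.

```python
def linear_fit_r2(predicted, checklist):
    """Ordinary least squares of checklist richness on predicted richness.

    Returns (slope, intercept, r_squared).
    """
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(checklist) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, checklist))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(predicted, checklist))
    ss_tot = sum((y - mean_y) ** 2 for y in checklist)
    return slope, intercept, 1.0 - ss_res / ss_tot

def net_difference_pct(predicted, checklist):
    """Assumed definition: share of checklist richness missed by the prediction."""
    return 100.0 * (checklist - predicted) / checklist

# Invented per-family values (not the study's data).
predicted_richness = [12.0, 30.0, 45.0, 66.0, 108.0]
checklist_richness = [15.0, 41.0, 60.0, 81.0, 190.0]

slope, intercept, r_squared = linear_fit_r2(predicted_richness, checklist_richness)
print(f"slope={slope:.2f} intercept={intercept:.2f} R2={r_squared:.2f}")
for pred, ref in zip(predicted_richness, checklist_richness):
    print(f"predicted={pred:.1f} checklist={ref:.0f} net difference={net_difference_pct(pred, ref):.1f}%")
```

A proportional relationship would correspond to the same fit with the intercept fixed at zero.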
In addition, we observed that the net difference between the number of assigned taxa per family and the number of species per family in the checklist was not related to the number of species in the families (Fig. 5d), suggesting an absence of systematic bias towards the underestimation of species-rich families. By contrast, the net difference between the number of OTUs per family and the number of species per family in the checklist increased significantly (R² = 0.35, p = 0.02) with the number of species per family (Fig. 5e). This bias towards the underestimation of species richness within species-rich families was nonetheless avoided when using the asymptotes of OTU accumulation curves (p = 0.24, Fig. 5f). Thus, asymptotes of OTU accumulation curves are the most accurate and least biased eDNA-based predictors of fish species diversity within families in this marine biodiversity hotspot.
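One way to check for a richness-dependent bias is to test whether the net difference correlates with family richness. The statistical test behind the reported p-values is not specified here, so the sketch below swaps in a simple permutation test on the Pearson correlation. The function names, the number of permutations and the example values are hypothetical.

```python
import random

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the correlation between x and y.

    Swapped-in technique: y is shuffled repeatedly and the share of
    shuffles with |r| at least as large as the observed |r| is returned.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Invented values for 8 families: checklist richness and net difference (%).
family_richness = [12, 18, 25, 40, 55, 80, 110, 190]
net_difference = [20.0, 35.0, 28.0, 45.0, 52.0, 49.0, 66.0, 73.0]

r = pearson_r(family_richness, net_difference)
p = permutation_p_value(family_richness, net_difference)
print(f"R2={r * r:.2f}, permutation p={p:.4f}")
```

A small p-value would indicate that underestimation grows with family richness, as described for the raw OTU counts, while a large one would be consistent with the absence of such a bias.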