Coen Westerduin

and 5 more

The development of DNA-based methods in recent decades has opened the door to numerous new lines of research in the biological sciences. While their speed and accuracy are clearly beneficial, the sensitivity of these methods has the adverse effect of increased susceptibility to false positives resulting from contamination in the field or the lab. Here, we present findings from a metabarcoding study on the diet of, and food availability for, several insectivorous birds, in which multiple lepidopteran species not known to occur locally were discovered. After describing the pattern of occurrences of these non-local species in the samples, we discuss various potential origins of these sequences. First, we find that the taxonomic assignments appear reliable and that local occurrence of many of the species can plausibly be ruled out. Then, we look into the possibility of natural environmental contamination, judging it to be unlikely, though impossible to rule out entirely. Finally, while the pattern of occurrences did not suggest lab contamination, we find overlap with material handled in the same lab, which was undoubtedly not coincidental. Even so, not all exact sequences were accounted for in these locally conducted studies, nor was it clear whether these and other sequences could remain detectable years later. Although the full explanation for the observations of non-local species remains inconclusive, these findings highlight the importance of critical examination of metabarcoding results, and showcase how species-level taxonomic assignments utilizing comprehensive reference libraries may be a tool for detecting potential contamination events, and false positives in general.
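The screening idea in this abstract, comparing species-level assignments against what is known to occur locally, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `flag_non_local`, the sample identifiers, and the species names are invented placeholders, and a real analysis would also have to weigh read counts, negative controls, and assignment confidence.

```python
from collections import defaultdict


def flag_non_local(assignments, regional_checklist):
    """Return {species: [sample ids]} for species absent from the checklist.

    assignments: iterable of (sample_id, species_name) pairs (hypothetical format).
    regional_checklist: set of species names known to occur locally.
    """
    flagged = defaultdict(list)
    for sample_id, species in assignments:
        if species not in regional_checklist:
            flagged[species].append(sample_id)
    return dict(flagged)


# Invented placeholder data, not taken from the study.
regional_checklist = {"Genus localis", "Genus communis"}
assignments = [
    ("faecal_01", "Genus localis"),
    ("faecal_02", "Genus exoticus"),
    ("trap_07", "Genus exoticus"),
    ("trap_09", "Genus communis"),
]

flagged = flag_non_local(assignments, regional_checklist)
for species, samples in flagged.items():
    print(f"{species}: found in {len(samples)} sample(s): {', '.join(samples)}")
```

A flagged species is only a candidate for follow-up. As the abstract notes, distinguishing lab contamination from environmental contamination or a genuine new local record requires further evidence.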

Tomas Roslin

and 96 more

To associate specimens identified by molecular characters with other biological knowledge, we need reference sequences annotated by Linnaean taxonomy. In this paper, we (1) report the creation of a comprehensive reference library of DNA barcodes for the arthropods of an entire country (Finland), (2) publish this library, and (3) deliver a new identification tool based on this resource. The reference library contains mtDNA COI barcodes for 11,275 (43%) of 26,437 arthropod species known from Finland, including 10,811 (45%) of 23,956 insect species. To quantify the improvement in identification accuracy enabled by the current reference library, we ran 1,000 Finnish insect and spider species through the Barcode of Life Data system (BOLD) identification engine. Of these, 91% were correctly assigned to a unique species when compared to the new reference library alone, 85% were correctly identified when compared to BOLD with the new material included, and 75% with the new material excluded. To capitalize on this resource, we used the new reference material to train a probabilistic taxonomic assignment tool, FinPROTAX, which achieved high accuracy. For the full-length barcode region, the accuracy of taxonomic assignments at the level of classes, orders, families, subfamilies, tribes, genera, and species reached 99.9%, 99.9%, 99.8%, 99.7%, 99.4%, 96.8%, and 88.5%, respectively. The FinBOL arthropod reference library and FinPROTAX are available through the Finnish Biodiversity Information Facility (www.laji.fi). Overall, the FinBOL investment represents a massive transfer of capacity from the taxonomic community of Finland to all sectors of society.
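The coverage percentages quoted above follow directly from the species counts in the abstract. As a quick arithmetic check (the helper name `coverage_percent` is introduced here purely for illustration):

```python
def coverage_percent(barcoded, known):
    """Share of known species with a reference barcode, rounded to a whole percent."""
    return round(100 * barcoded / known)


# Counts as stated in the abstract.
arthropods = coverage_percent(11_275, 26_437)
insects = coverage_percent(10_811, 23_956)

print(f"Arthropod coverage: {arthropods}%")
print(f"Insect coverage: {insects}%")
```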