Abstract
Metazoa-level Universal Single-Copy Orthologs (mzl-USCOs) are
universally applicable markers for DNA taxonomy in animals which can
replace or supplement single-gene barcodes. Whereas mzl-USCOs from
target enrichment data were previously shown to reliably distinguish species,
here we tested whether USCOs are an evenly distributed, representative
sample of a given metazoan genome and therefore able to cope with past
hybridization events and incomplete lineage sorting. This is relevant
for coalescent-based species delimitation approaches, which critically
depend on the assumption that the investigated loci do not exhibit
autocorrelation due to physical linkage. Based on 239 assessed
chromosome-level assembled genomes, we confirmed that mzl-USCOs are
genetically unlinked for practical purposes and a representative sample
of a genome in terms of reciprocal distances between USCOs on a
chromosome and of distribution across chromosomes. We tested the
suitability of mzl-USCOs extracted from genomes for species delimitation
and phylogeny in four case studies: Anopheles mosquitos, Drosophila
fruit flies, Heliconius butterflies, and
Darwin’s finches. In almost all instances, USCOs allowed delineating
species and yielded phylogenies that correspond to those generated from
whole genome data. Our phylogenetic analyses demonstrate that USCOs may
complement single-gene DNA barcodes and provide more accurate taxonomic
inferences. Combining USCOs from sources that used different versions of
ortholog reference libraries to infer marker orthology may be
challenging and may at times affect taxonomic conclusions. However, we
expect this problem to become less severe as the rapidly growing number
of reference genomes provides a better representation of the number and
diversity of organismic lineages.