SWARM clustering workflow
Sequences were merged using vsearch. We then used cutadapt for demultiplexing and primer trimming, and vsearch to remove sequences containing ambiguous bases. SWARM was run with a distance threshold of d = 1 (at most one mismatch between neighbouring sequences) to build clusters. Once OTUs were generated, the most abundant sequence within each cluster was used for taxonomic assignment with ecotag. The same bioinformatic filters described previously were applied to the OTUs to remove PCR-related errors. A post-clustering curation algorithm (LULU; Frøslev et al. 2017, S4) was then applied to the data. This algorithm uses sequence similarity and co-occurrence patterns to detect and remove erroneous OTUs produced by the clustering algorithm. Following the authors' recommendations, we set the thresholds for identifying errors at 84% sequence similarity and 95% co-occurrence.
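The curation criteria described above can be illustrated with a minimal Python sketch. It is not the LULU implementation (LULU is an R package that takes a precomputed match list); it only shows how the two thresholds, 84% similarity and 95% co-occurrence, interact with an abundance ratio test. The function names `percent_identity` and `curate`, the `min_ratio` parameter and the use of an edit-distance identity in place of a real aligner are all assumptions made for illustration.

```python
from typing import Dict, List


def percent_identity(a: str, b: str) -> float:
    """Percent identity derived from the Levenshtein distance.

    Assumption: a stand-in for the pairwise matches normally produced
    by an external aligner before curation.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return 100.0 * (longest - prev[-1]) / longest


def curate(seqs: Dict[str, str], counts: Dict[str, List[int]],
           min_match: float = 84.0, min_cooccurrence: float = 0.95,
           min_ratio: float = 1.0) -> Dict[str, str]:
    """Map each OTU to the OTU it is merged into (itself if it is kept)."""
    # Rank OTUs by number of samples occupied, then by total read count.
    ranked = sorted(
        seqs,
        key=lambda o: (sum(c > 0 for c in counts[o]), sum(counts[o])),
        reverse=True,
    )
    parent_of: Dict[str, str] = {}
    for rank, child in enumerate(ranked):
        parent_of[child] = child
        present = [i for i, c in enumerate(counts[child]) if c > 0]
        if not present:
            continue
        # Only higher-ranked OTUs can act as parents.
        for cand in ranked[:rank]:
            if percent_identity(seqs[child], seqs[cand]) < min_match:
                continue
            shared = [i for i in present if counts[cand][i] > 0]
            if not shared or len(shared) / len(present) < min_cooccurrence:
                continue
            # The candidate must be consistently more abundant than the child.
            ratios = [counts[cand][i] / counts[child][i] for i in shared]
            if min(ratios) > min_ratio:
                parent_of[child] = parent_of[cand]
                break
    return parent_of
```

In this sketch, an OTU is flagged as erroneous only when a more widespread OTU is similar enough, is present in nearly all samples where the candidate error occurs, and is more abundant in each of those samples. OTUs that fail any one of these tests are kept as distinct.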
  1. Schnell, I.B., Bohmann, K. & Gilbert, T.P. (2015) Tag jumps illuminated – reducing sequence-to-sample misidentifications in metabarcoding studies. Molecular Ecology Resources, 15: 1289–1303. DOI: 10.1111/1755-0998.12402.
  2. Ficetola, G.F., Pansu, J., Bonin, A., Coissac, E., Giguet-Covex, C., De Barba, M., Gielly, L., Lopes, C.M., Boyer, F., Pompanon, F., Rayé, G. & Taberlet, P. (2015) Replication levels, false presences and the estimation of the presence/absence from eDNA metabarcoding data. Molecular Ecology Resources, 15: 543–556. DOI: 10.1111/1755-0998.12338.
  3. Thomsen, P.F., Møller, P.R., Sigsgaard, E.E., Knudsen, S.W., Jørgensen, O.A. & Willerslev, E. (2016) Environmental DNA from seawater samples correlate with trawl catches of subarctic, deepwater fishes. PLoS One, 11: e0165252. DOI: 10.1371/journal.pone.0165252.
  4. Frøslev, T.G., Kjøller, R., Bruun, H.H., Ejrnæs, R., Brunbjerg, A.K., Pietroni, C. & Hansen, A.J. (2017) Algorithm for post-clustering curation of DNA amplicon data yields reliable biodiversity estimates. Nature Communications, 8: 1188. DOI: 10.1038/s41467-017-01312-x.
Table S1. Proportion of fish taxa for which the 12S mitochondrial rDNA is sequenced and referenced in the EMBL database. The checklists of fishes from Indonesia were retrieved from FishBase (www.fishbase.de) and the checklist of fishes from the Bird’s Head Peninsula was obtained from Kulbicki et al. 2013.