SWARM clustering workflow
Sequences were merged using vsearch. We then used cutadapt for
demultiplexing and primer trimming, and vsearch to remove sequences
containing ambiguous bases. SWARM was run with a local clustering
threshold of d = 1 (at most 1 mismatch between linked sequences) to
build clusters. Once OTUs were generated, the most abundant
sequence within each cluster was used for taxonomic assignment with
ecotag. The same bioinformatics filters previously described were
applied to the OTUs to remove PCR-related errors. Then, a
post-clustering curation algorithm (LULU; (S4) Frøslev et al. 2017) was
applied to curate the data. This algorithm uses sequence similarity and
co-occurrence patterns to detect and remove erroneous OTUs produced by
the clustering algorithm. Following the authors’ recommendations, we set the
thresholds at 84% sequence similarity and 95% co-occurrence to
identify errors.
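The two selection rules described above (most abundant sequence per cluster, then similarity plus co-occurrence curation) can be illustrated with a minimal Python sketch. It is not the SWARM, ecotag or LULU implementation: the function names, the input layout (read counts per sample, precomputed pairwise similarities in percent) and the extra requirement that the putative parent be more abundant than the putative error in shared samples are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def pick_representatives(clusters: Dict[str, Dict[str, int]]) -> Dict[str, str]:
    """For each cluster, return the sequence with the highest read count."""
    return {cid: max(seqs, key=seqs.get) for cid, seqs in clusters.items()}


def cooccurrence(daughter: List[int], parent: List[int]) -> float:
    """Fraction of the samples containing the daughter that also contain the parent."""
    present = [i for i, n in enumerate(daughter) if n > 0]
    if not present:
        return 0.0
    return sum(1 for i in present if parent[i] > 0) / len(present)


def flag_errors(
    table: Dict[str, List[int]],
    similarity: Dict[Tuple[str, str], float],
    min_match: float = 84.0,          # % sequence similarity (threshold from the text)
    min_cooccurrence: float = 0.95,   # co-occurrence threshold from the text
) -> Dict[str, str]:
    """Map each OTU flagged as erroneous to its putative parent OTU (simplified)."""
    # Assumption: more abundant OTUs are examined first and can act as parents.
    ranked = sorted(table, key=lambda otu: sum(table[otu]), reverse=True)
    parents: Dict[str, str] = {}
    for rank, otu in enumerate(ranked):
        for cand in ranked[:rank]:
            if cand in parents:  # already flagged as an error itself
                continue
            sim = similarity.get((otu, cand), similarity.get((cand, otu), 0.0))
            if sim < min_match:
                continue
            if cooccurrence(table[otu], table[cand]) < min_cooccurrence:
                continue
            # Assumption: parent outnumbers daughter wherever both occur.
            shared = [(d, p) for d, p in zip(table[otu], table[cand]) if d > 0 and p > 0]
            if all(p > d for d, p in shared):
                parents[otu] = cand
                break
    return parents


# Hypothetical toy data: three OTUs across four samples.
table = {
    "otu_a": [100, 80, 50, 0],
    "otu_b": [5, 3, 2, 0],
    "otu_c": [0, 0, 0, 40],
}
similarity = {("otu_b", "otu_a"): 97.0, ("otu_c", "otu_a"): 90.0}
print(flag_errors(table, similarity))  # {'otu_b': 'otu_a'}
```

In this toy example, otu_b is similar to otu_a and occurs only where otu_a is present and more abundant, so it is flagged; otu_c passes the similarity threshold but never co-occurs with otu_a, so it is kept.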
- Schnell, I.B., Bohmann, K. & Gilbert, T.P. (2015) Tag jumps
illuminated – reducing sequence-to-sample misidentifications in
metabarcoding studies. Molecular Ecology Resources, 15: 1289–1303. DOI:
10.1111/1755-0998.12402.
- Ficetola, G.F., Pansu, J., Bonin, A., Coissac, E., Giguet-Covex, C.,
De Barba, M., Gielly, L., Lopes, C.M., Boyer, F., Pompanon, F., Rayé,
G. & Taberlet, P. (2015) Replication levels, false presences and the
estimation of the presence/absence from eDNA metabarcoding data.
Molecular Ecology Resources, 15: 543–556. DOI:
10.1111/1755-0998.12338.
- Thomsen, P.F., Møller, P.R., Sigsgaard, E.E., Knudsen, S.W., Jørgensen,
O.A. & Willerslev, E. (2016) Environmental DNA from seawater samples
correlate with trawl catches of subarctic, deepwater fishes. PLoS One, 11: e0165252. DOI:
10.1371/journal.pone.0165252.
- Frøslev, T.G., Kjøller, R., Bruun, H.H., Ejrnæs, R., Brunbjerg, A.K.,
Pietroni, C. & Hansen, A.J. (2017) Algorithm for post-clustering
curation of DNA amplicon data yields reliable biodiversity estimates. Nature Communications, 8: 1188. DOI:
10.1038/s41467-017-01312-x.
Table S1. Proportion of fish taxa for which the 12S
mitochondrial rDNA has been sequenced and referenced in the EMBL database. The
checklists of fishes from Indonesia were retrieved from FishBase
(www.fishbase.de), and the checklist of fishes from the Bird’s Head
Peninsula was obtained from Kulbicki et al. 2013.