Results
Sequencing results
Sequencing of all samples generated 6.9 Gb of raw data as
FASTQ-formatted reads. The raw data have been deposited in the open
repository Zenodo (Amen et al. 2022). The 27 stomach samples yielded a
total of 19,531,074 raw sequences, ranging from 19,226 to 867,093 per
individual sample (\(M\text{ean}=335,020\); \(SD=234,220\)). Of these,
18,357,160 sequences remained after quality filtering, ranging from
16,703 to 828,735 per individual
(\(M\text{ean}=314,421\); \(SD=222,465\); see Supplementary File S4
for details).
Additionally, a series of negative (library and DNA extraction) and
positive controls, treated identically to the other samples from
initial processing through library preparation, were included to further
filter the data and validate the obtained taxa. The positive control
was a stomach content sample from a fish (C. compressirostris) raised in
the laboratory on a known diet. Library controls accounted for 0.013 %
and DNA extraction controls for 0.019 % of the total sequence pool
generated.
Taxonomic classification
Kraken2 classification of the filtered sequences against the custom
marker database assigned the reads to one of eight taxonomic levels
(domain, kingdom, phylum, class, order, family, genus, and species) or
left them unclassified. The percentage of
unclassified reads reached 33.22 % of the pooled sequences of all
samples. On the individual level, the percentage of unclassified reads
ranged from 8.38 % to 71.67 %. The high percentage of unclassified
reads is attributed to sequences outside the range of our custom marker
database (eukaryotic COI and rbcL only). When the entire NCBI nucleotide
database (www.ncbi.nlm.nih.gov/nucleotide/) was used instead of our
custom database, the percentage of unclassified reads dropped to
10.11 % of the pooled sequences of all samples. However, to keep the
focus on diet rather than on prokaryotes of the gut microbiome, we
limited the analysis to the taxonomic classifications obtained with the
eukaryote COI/rbcL custom database.
Comparing the read distributions of the food taxa classified at each
taxonomic level reveals that most reads are assigned to a few food
taxa, while many of the classified taxa are represented by only a few
reads. To determine which food taxa should be included in subsequent
diet analyses, we subtracted the number of sequences of each taxon
present in the negative controls from the sequence abundance of that
taxon in the samples, following Nguyen et al. (2015). Further, we
excluded records from primates and birds, which were presumably
contamination. Taxa represented by less than 0.01 % of reads were also
excluded, as they may constitute contamination or background noise
(Alberdi et al. 2018). This approach substantially decreased the
sequence yield of the C. numenius sample, so this sample was excluded
from the subsequent analyses.
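The filtering steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `filter_taxa`, the toy read counts, and the choice to apply the 0.01 % threshold to each sample's total after control subtraction are all assumptions made for the example.

```python
def filter_taxa(sample_counts, control_counts, excluded=(), min_fraction=0.0001):
    """Hypothetical helper: clean one sample's per-taxon read counts."""
    # Step 1: subtract the reads seen in the negative controls, clipping at zero.
    corrected = {
        taxon: max(count - control_counts.get(taxon, 0), 0)
        for taxon, count in sample_counts.items()
    }
    # Step 2: drop taxa treated as contamination (e.g. primates, birds).
    corrected = {t: c for t, c in corrected.items() if t not in excluded}
    # Step 3: drop taxa below the abundance threshold (0.01 % = 0.0001).
    # Assumption: the threshold is relative to this sample's remaining reads.
    total = sum(corrected.values())
    if total == 0:
        return {}
    return {
        t: c for t, c in corrected.items()
        if c > 0 and c / total >= min_fraction
    }


# Toy counts, invented for illustration only.
sample = {"Diptera": 90000, "Coleoptera": 9950, "Primates": 50, "Araneae": 5}
controls = {"Diptera": 10, "Primates": 5}
filtered = filter_taxa(sample, controls, excluded={"Primates"})
print(filtered)
```

In this toy case Araneae falls below the threshold and is removed, mirroring how low-abundance taxa were treated as possible background noise.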
The remaining identified prey items in each sample were assigned to
different taxonomic levels, i.e., to 20 classes, 34 orders, 90 families,
115 genera, and 127 species. For the species with larger sample sizes
(C. compressirostris and C. tshokwe), the classified reads
are visualized using the metagenomics data explorer Pavian and tabulated
in Tables S5.1 and S5.2 (Supplementary File S5).
Relative abundance
Campylomormyrus and G. petersii had a broad spectrum of prey items. At
the class level, Insecta dominated by far: they were found in all
samples and represented more than 90 % of the total reads (Figure 1).
Other classes, such as Clitellata, Arachnida, Malacostraca, and
Hexanauplia, were also found in all samples but accounted for a smaller
percentage of the total reads. At the order level, the most abundant
prey items were Diptera, Coleoptera, and Hymenoptera (all belonging to
the class Insecta), all of which occurred in all samples (Figure 1).
Haplotaxida (Clitellata) and Araneae (Arachnida) were also found in all
samples. Further insect orders such as Lepidoptera, Trichoptera,
Ephemeroptera, and Hemiptera, as well as Decapoda (Crustacea), were
found in more than 83 % of the samples.
The Relative Read Abundances (RRA) of the primary food taxa (excluding
taxa with \(\text{RRA}<0.01\%\), which potentially stem from secondary
predation) for the three grouping categories (species, EOD duration, and
snout length) are depicted in Figure 2. Most groups share similar food
taxa, albeit in different proportions. The dominant prey taxa are
insects (Diptera, Coleoptera, Hymenoptera, and Lepidoptera), annelid
worms (Haplotaxida), spiders (Araneae), crustaceans (Amphipoda,
Diplostraca, and Decapoda), copepods (Calanoida), and plants
(Cupressaceae). The RRA values for individual samples are given in
Supplementary File S6.
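As an illustration of how RRA is typically derived from read counts, the sketch below converts one sample's per-taxon counts into percentages. The function name `relative_read_abundance` and the counts are hypothetical; how reads were pooled across samples within a group is not stated in the text and is not modelled here.

```python
def relative_read_abundance(counts):
    """Return each taxon's share of a sample's reads, in percent."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


# Toy counts, invented for illustration only.
sample = {"Diptera": 600, "Coleoptera": 300, "Haplotaxida": 100}
rra = relative_read_abundance(sample)
print(rra)
```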
The RRA data were used to assess dietary niche width by calculating
the Shannon diversity index (Figure 3 and Table S7.1 in
Supplementary File S7). The Shannon diversity index differed
considerably among the studied groups, pointing towards different
dietary niche widths between the two species C. compressirostris and
C. tshokwe and in relation to EOD duration and snout length.
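For readers unfamiliar with the measure, the Shannon diversity index can be sketched as below. The natural logarithm is assumed here (the text does not state the base), and the two example diets are invented to contrast a narrow with a broad dietary niche.

```python
import math


def shannon_index(rra):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with p_i > 0."""
    total = sum(rra)
    if total <= 0:
        raise ValueError("RRA values must sum to a positive number")
    proportions = [v / total for v in rra if v > 0]
    return -sum(p * math.log(p) for p in proportions)


# Toy RRA profiles (percent), invented for illustration only.
narrow = shannon_index([97.0, 2.0, 1.0])          # diet dominated by one taxon
broad = shannon_index([25.0, 25.0, 25.0, 25.0])   # four equally abundant taxa
print(narrow, broad)
```

A diet spread evenly across more taxa yields a higher index, which is why the index serves as a proxy for dietary niche width.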
Diet overlap
The diets of the groups within each of the three categories (species,
EOD duration, and snout length) overlap significantly at the class and
order levels (Table 1). At these levels, Pianka index values showed
statistically significant niche overlap based on comparison with 1,000
null models (see Supplementary File S8 for more details), and the
Schöner index exceeded 0.6 for all groups. At lower taxonomic levels
(family, genus, species), there was much less dietary overlap among
species. The degree of diet overlap is further confirmed by the
Bray–Curtis dissimilarity index (0: identical; 1: completely
dissimilar), as shown in Table S7.2 in Supplementary File S7.
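The three overlap measures named above can be sketched with their standard textbook definitions. The function names and the two toy diet profiles are assumptions for illustration. The null-model significance test for the Pianka index is not reproduced here.

```python
import math


def normalize(values):
    """Scale non-negative abundances so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def pianka(p, q):
    """Pianka overlap: sum(p*q) / sqrt(sum(p^2) * sum(q^2)); 1 = full overlap."""
    num = sum(a * b for a, b in zip(p, q))
    den = math.sqrt(sum(a * a for a in p) * sum(b * b for b in q))
    return num / den


def schoener(p, q):
    """Schoener overlap: 1 - 0.5 * sum(|p - q|); 1 = full overlap."""
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity: sum(|p - q|) / sum(p + q); 0 = identical."""
    return sum(abs(a - b) for a, b in zip(p, q)) / sum(a + b for a, b in zip(p, q))


# Toy diet profiles over the same three taxa, invented for illustration only.
group_a = normalize([60, 30, 10])
group_b = normalize([30, 60, 10])
print(pianka(group_a, group_b), schoener(group_a, group_b), bray_curtis(group_a, group_b))
```

When both profiles are proportions summing to 1, the Bray–Curtis dissimilarity equals one minus the Schoener overlap, so the two measures carry the same information on such data.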
As many reads could not be assigned to family, genus, or species level
(see Table S4.1 in Supplementary File S4), we performed further
statistical analyses on read assignments at the order level. A perMANOVA
on the Bray–Curtis dissimilarity index data derived from the RRA values
at order level indicated significant dietary differences between C.
compressirostris and C. tshokwe (\(F=4.64\), \(r^{2}=0.215\),
\(p\leq 0.001\), \(df=1\)). A perMANOVA on the EOD duration groups
(long vs. short EOD), based on the same order-level Bray–Curtis data,
likewise indicated significant dietary differences between the two
groups (\(F=5.34\), \(r^{2}=0.182\), \(p\leq 0.001\), \(df=1\)).
Similarly, for snout morphology (short vs. medium vs. long), a perMANOVA
on the order-level Bray–Curtis data showed significant dietary
differences among the three groups (\(F=2.50\), \(r^{2}=0.182\),
\(p\leq 0.01\), \(df=2\)). Post hoc pairwise perMANOVAs with
Bonferroni-corrected p-values indicated that only the long versus
medium and the long versus short snout differences are statistically
significant (\(p\leq 0.001\) and \(p\leq 0.05\), respectively).
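The logic behind these tests can be sketched as a distance-based pseudo-F statistic with a label-permutation p-value and a Bonferroni adjustment. This is a simplified stand-in written for illustration, not the software used in the study (which the text does not name). The function names, the toy profiles, the group labels, and the 999 permutations are assumptions.

```python
import random


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance profiles."""
    return sum(abs(a - b) for a, b in zip(u, v)) / sum(a + b for a, b in zip(u, v))


def pseudo_f(dist, labels):
    """Return (pseudo-F, R^2) from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(
            dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:]
        ) / len(idx)
    ss_among = ss_total - ss_within
    f = (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))
    return f, ss_among / ss_total


def permanova(dist, labels, n_perm=999, seed=0):
    """Permutation p-value: share of shuffled labelings with F >= observed F."""
    rng = random.Random(seed)
    f_obs, r2 = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled)[0] >= f_obs:
            hits += 1
    return f_obs, r2, (hits + 1) / (n_perm + 1)


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    return [min(1.0, p * len(p_values)) for p in p_values]


# Toy order-level RRA profiles, invented for illustration only.
samples = [
    [70, 20, 10], [65, 25, 10], [72, 18, 10], [68, 22, 10],  # hypothetical "long" group
    [20, 70, 10], [25, 65, 10], [18, 72, 10], [22, 68, 10],  # hypothetical "short" group
]
labels = ["long"] * 4 + ["short"] * 4
dist = [[bray_curtis(u, v) for v in samples] for u in samples]
f_obs, r2, p = permanova(dist, labels)
print(f_obs, r2, p)
```

With only eight toy samples the attainable p-values are coarse; the example is meant to show the mechanics, not to reproduce the reported statistics.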
DAPC on group-specific diet discriminated across species (only C.
compressirostris and C. tshokwe), EOD duration, and snout length
(see Supplementary File S9). It also identified the significant
contributors to dietary differences (1) between C. compressirostris and
C. tshokwe (the orders Calanoida, Ephemeroptera, and Diptera),
(2) between fish with long and short EOD duration (Calanoida,
Diplostraca, Diptera, and Ephemeroptera), and (3) between fish with long
and short snouts (Ephemeroptera, Calanoida, and Hymenoptera;
Supplementary File S9).
The patterns of dietary difference among samples are visualized in
Figure 4 by ordinating the Bray–Curtis dissimilarity index values in
two dimensions using NMDS. The stress level for the NMDS was 0.152,
which indicates a good representation (Clarke 1993). The NMDS plot shows
the segregation of samples based on EOD and the degree of diet
overlap/dissimilarity based on species and snout length.
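For reference, the stress value reported for an NMDS is commonly Kruskal's stress-1, given below as a sketch. The text does not state which stress formula was used, so reading 0.152 as stress-1 is an assumption. Here \(d_{ij}\) is the distance between samples \(i\) and \(j\) in the two-dimensional ordination, and \(\hat{d}_{ij}\) is the corresponding disparity, i.e. the value fitted by monotone regression of the ordination distances on the Bray–Curtis dissimilarities.

\[
S = \sqrt{\frac{\sum_{i<j}\left(d_{ij}-\hat{d}_{ij}\right)^{2}}{\sum_{i<j} d_{ij}^{2}}}
\]

Lower values mean that the rank order of the dissimilarities is better preserved in the ordination.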