Results

Sequencing results

Sequencing of all samples generated 6.9 Gb of raw data as FASTQ-formatted reads. The raw data have been deposited in the open repository Zenodo (Amen et al. 2022). The total number of raw sequences from the 27 stomach samples is 19,531,074, ranging from 19,226 to 867,093 per individual sample (\(\text{mean}=335{,}020\); \(\text{SD}=234{,}220\)). Of these, a total of 18,357,160 sequences were retained after quality filtering, ranging from 16,703 to 828,735 per individual (\(\text{mean}=314{,}421\); \(\text{SD}=222{,}465\); see Supplementary File S4 for details).
Additionally, a series of negative controls (library and DNA extraction) and positive controls, treated identically to the other samples from initial processing through library preparation, was included to further filter the data and validate the obtained taxa. The positive control was a stomach content sample of a fish (C. compressirostris) raised in the laboratory and fed a known food. Of the total sequence pool generated, 0.013 % came from library controls and 0.019 % from DNA extraction controls.

Taxonomic classification

Taxonomic classification of the filtered sequences against the custom marker database using Kraken2 assigned all reads either to one of eight taxonomic levels (domain, kingdom, phylum, class, order, family, genus, and species) or to the unclassified fraction. Unclassified reads made up 33.22 % of the pooled sequences of all samples. At the individual level, the percentage of unclassified reads ranged from 8.38 % to 71.67 %. The high percentage of unclassified reads is attributed to sequences outside the range of our custom marker database (eukaryotic COI and rbcL only). When the entire nucleotide database (www.ncbi.nlm.nih.gov/nucleotide/) was used instead of our custom database, the percentage of unclassified reads dropped to 10.11 % of the pooled sequences of all samples. We nevertheless limited the analysis to the taxonomic classifications obtained with the eukaryote COI/rbcL custom database, to keep the focus on the diet analysis (rather than on prokaryotes in the gut microbiome).
Comparing the distributions of the reads for the food taxa classified at each phylogenetic level reveals that most of the reads are assigned to a few food taxa, while many of the classified taxa are represented by only a few reads. To determine which food taxa should be included in subsequent diet analyses, we subtracted the number of sequences of each taxon present in the negative controls from the sequence abundance of that taxon in the samples, following Nguyen et al. (2015). Further, we excluded records from primates and birds, which were presumably contamination. Taxa represented by less than 0.01 % of reads were also excluded, as they may constitute contamination or background noise (Alberdi et al. 2018). This approach substantially decreased the sequence yield of the C. numenius sample, so this sample was eliminated from the subsequent analyses.
The remaining identified prey items in each sample were assigned to different taxonomic levels, i.e., to 20 classes, 34 orders, 90 families, 115 genera, and 127 species. For the species with larger sample sizes (C. compressirostris and C. tshokwe), the classified reads are visualized using the metagenomics data explorer Pavian and tabulated in Tables S5.1 and S5.2 (Supplementary File S5).
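The two count-based filtering steps described above (subtraction of negative-control reads and the 0.01 % threshold) can be illustrated with a minimal sketch. The function name `filter_taxa`, the dictionary layout, and the example counts are hypothetical and not taken from the study's pipeline. The sketch also assumes that the threshold is applied per sample, relative to the control-corrected read total, which the text does not specify. The exclusion of primate and bird records is not covered here.

```python
def filter_taxa(sample_counts, control_counts, min_fraction=0.0001):
    """Subtract negative-control reads, then drop low-abundance taxa.

    sample_counts / control_counts: dict mapping taxon name -> read count.
    min_fraction: 0.0001 corresponds to the 0.01 % threshold in the text.
    Assumption: the threshold is relative to the corrected per-sample total.
    """
    # Step 1: subtract control reads per taxon, never going below zero.
    corrected = {
        taxon: max(n - control_counts.get(taxon, 0), 0)
        for taxon, n in sample_counts.items()
    }
    total = sum(corrected.values())
    if total == 0:
        return {}
    # Step 2: keep only taxa at or above the relative threshold.
    return {t: n for t, n in corrected.items() if n / total >= min_fraction}


# Made-up counts, for illustration only.
sample = {"Diptera": 90000, "Coleoptera": 9990, "Araneae": 52,
          "Rotifera": 5, "Nematoda": 4}
controls = {"Araneae": 2, "Rotifera": 5}
kept = filter_taxa(sample, controls)
print(kept)  # {'Diptera': 90000, 'Coleoptera': 9990, 'Araneae': 50}
```

In this toy example, Rotifera is removed because all of its reads are matched by the controls, and Nematoda falls below the 0.01 % threshold.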

Relative abundance

Campylomormyrus and G. petersii had a broad spectrum of prey items. At the class level, Insecta dominated by far: they were found in all samples and represented more than 90 % of the total reads (Figure 1). Other classes, such as Clitellata, Arachnida, Malacostraca, and Hexanauplia, were also found in all samples but accounted for a lower percentage of the total reads. At the order level, the most abundant prey items were Diptera, Coleoptera, and Hymenoptera (all belonging to the class Insecta), all of which were present in all samples (Figure 1). Haplotaxida (Clitellata) and Araneae (Arachnida) were also found in all samples. Further insect orders such as Lepidoptera, Trichoptera, Ephemeroptera, and Hemiptera, as well as Decapoda (Crustacea), were found in more than 83 % of the samples.
The Relative Read Abundances (RRA) of the primary food taxa (excluding taxa with \(\text{RRA}<0.01\%\), which potentially stem from secondary predation) for the three grouping categories are depicted in Figure 2. Most groups share similar food taxa, albeit in different proportions. The most dominant prey taxa are insects (Diptera, Coleoptera, Hymenoptera, and Lepidoptera), annelid worms (Haplotaxida), spiders (Araneae), crustaceans (Amphipoda, Diplostraca, and Decapoda), copepods (Calanoida), and plants (Cupressaceae). The RRA values for individual samples are given in the supplement (Supplementary File S6).
The RRA data were used to assess the dietary niche width by calculating the Shannon diversity index (Figure 3 and Table S7.1 in Supplementary File S7). The Shannon diversity index differed considerably among the studied groups, pointing towards different dietary niche widths between the two species C. compressirostris and C. tshokwe and in relation to EOD duration and snout length.
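As an illustration of how RRA values translate into a niche-width estimate, the following sketch computes RRA from read counts and then the Shannon index \(H' = -\sum_i p_i \ln p_i\). The function names and example counts are hypothetical, and the use of the natural logarithm is an assumption, since the text does not state the logarithm base.

```python
import math


def relative_read_abundance(counts):
    """Convert read counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(proportions):
    """Shannon diversity H' = -sum(p * ln p), skipping zero proportions.

    Assumption: natural logarithm; other bases only rescale the index.
    """
    return -sum(p * math.log(p) for p in proportions if p > 0)


# Made-up counts, for illustration only.
counts = {"Diptera": 500, "Coleoptera": 300, "Haplotaxida": 200}
rra = relative_read_abundance(counts)
h = shannon_index(rra.values())
print(rra, round(h, 3))
```

A diet dominated by a single taxon yields a value near 0, whereas reads spread evenly over S taxa yield the maximum, ln S.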

Diet overlap

The diets of the groups within each of the three categories (species, EOD duration, and snout length) overlap significantly at the class and order levels (Table 1). At these levels, Pianka index values showed statistically significant niche overlap based on comparison with 1,000 null models (see Supplementary File S8 for more details), and the Schöner index exceeded 0.6 for all groups. At lower taxonomic levels (family, genus, species), there was much less overlap in the diet among species. The degree of diet overlap is further confirmed by the Bray–Curtis dissimilarity index (0: identical; 1: completely dissimilar), as shown in Table S7.2 in Supplementary File S7.
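For reference, the three overlap measures named above can be sketched from their standard definitions, applied to two RRA profiles. This is not the study's code: the function names and the example proportions are hypothetical, and the null-model test of the Pianka index is not reproduced.

```python
import math


def pianka(p, q):
    """Pianka overlap: sum(p_i * q_i) / sqrt(sum(p_i^2) * sum(q_i^2))."""
    taxa = set(p) | set(q)
    num = sum(p.get(t, 0.0) * q.get(t, 0.0) for t in taxa)
    den = math.sqrt(sum(v * v for v in p.values()) *
                    sum(v * v for v in q.values()))
    return num / den


def schoener(p, q):
    """Schoener overlap: 1 - 0.5 * sum(|p_i - q_i|), for proportions."""
    taxa = set(p) | set(q)
    return 1.0 - 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in taxa)


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity: sum(|p_i - q_i|) / sum(p_i + q_i)."""
    taxa = set(p) | set(q)
    num = sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in taxa)
    den = sum(p.get(t, 0.0) + q.get(t, 0.0) for t in taxa)
    return num / den


# Made-up RRA profiles for two groups, for illustration only.
group_a = {"Diptera": 0.6, "Coleoptera": 0.4}
group_b = {"Diptera": 0.4, "Coleoptera": 0.6}
print(pianka(group_a, group_b), schoener(group_a, group_b),
      bray_curtis(group_a, group_b))
```

When both profiles are proportions summing to 1, the Bray–Curtis dissimilarity equals one minus the Schoener overlap, so the two measures carry the same information on different scales.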
As many reads could not be assigned to family, genus, or species level (see Table S4.1 in Supplementary File S4), we performed further statistical analyses on read assignments at the order level. A perMANOVA on the Bray–Curtis dissimilarity index data derived from the RRA values at order level indicates significant dietary differences between C. compressirostris and C. tshokwe (F = 4.64, r² = 0.215, p ≤ 0.001, df = 1).
A perMANOVA on the EOD duration groups (long vs. short EOD) indicated significant dietary differences between the two groups (F = 5.34, r² = 0.182, p ≤ 0.001, df = 1; Bray–Curtis dissimilarity index data derived from the RRA values at order level). Similarly, for snout morphology (short vs. medium vs. long), a perMANOVA on the Bray–Curtis dissimilarity index data (derived from the RRA values at order level) showed significant dietary differences among the three groups (F = 2.50, r² = 0.182, p ≤ 0.01, df = 2). The post hoc pairwise perMANOVA with Bonferroni correction of the p-values indicated that only the differences between long and medium snouts and between long and short snouts are statistically significant (p ≤ 0.001 and p ≤ 0.05, respectively).
DAPC on group-specific diet discriminated among species (only C. compressirostris and C. tshokwe), EOD duration groups, and snout length groups (see Supplementary File S9). It also identified the significant contributors to dietary differences (1) between C. compressirostris and C. tshokwe (the orders Calanoida, Ephemeroptera, and Diptera), (2) between fish with long and short EOD duration (Calanoida, Diplostraca, Diptera, and Ephemeroptera), and (3) between fish with long and short snouts (Ephemeroptera, Calanoida, and Hymenoptera; Supplementary File S9).
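To make the reported F, r², and p values easier to interpret, the following is a minimal sketch of a one-factor perMANOVA on a precomputed distance matrix, using the usual partition of squared distances into within-group and among-group sums. It is not the software used in the study. The function names, the permutation count, the seed, and the toy distance matrix are all assumptions for illustration.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """Return (pseudo-F, R^2) for a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    a = len(groups)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2
                         for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    f = (ss_among / (a - 1)) / (ss_within / (n - a))
    return f, ss_among / ss_total


def permanova(dist, labels, n_perm=999, seed=0):
    """Permutation test: shuffle labels and compare pseudo-F values."""
    rng = random.Random(seed)
    f_obs, r2 = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        f_perm, _r2 = pseudo_f(dist, shuffled)
        if f_perm >= f_obs:
            hits += 1
    # The observed arrangement counts as one permutation.
    return f_obs, r2, (hits + 1) / (n_perm + 1)


# Toy data: six samples on a line, two well-separated groups of three.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(x - y) for y in points] for x in points]
labels = ["A", "A", "A", "B", "B", "B"]
f_obs, r2, p = permanova(dist, labels)
print(f_obs, r2, p)
```

Here r² is the share of the total sum of squares explained by the grouping. For pairwise post hoc tests, a Bonferroni correction multiplies each p-value by the number of comparisons and caps the result at 1.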
The patterns of dietary difference among samples are visualized in Figure 4 by ordinating the Bray–Curtis dissimilarity index values in two dimensions using NMDS. The stress level for the NMDS was 0.152, which indicates a good representation (Clarke 1993). The NMDS plot shows the segregation of samples based on EOD and the degree of diet overlap/dissimilarity based on species and snout length.