Figure 2 Number of annotated unigenes in 6 databases.
In the Nr database, the E-value distribution showed that 62.71% of unigenes exhibited homology (E-value < 1e-30) with previously reported sequences, and the similarity distribution showed that 82.65% of the unigenes had more than 60% similarity to previously reported sequences (Figure 3A, B). Most unigenes (12,105, 65.20%) had their best matches in crustaceans, primarily Penaeus vannamei (59.17%), Hyalella azteca (4.15%), and Procambarus clarkii (1.88%) (Figure 3C).
A total of 14,129 unigenes were classified into the three GO categories: biological processes (BP), cellular components (CC), and molecular functions (MF) (Figure 4). Among the BPs, “cellular process”, “metabolic process”, and “biological regulation” were the top three terms. Among the CCs, “cell part”, “membrane part”, “organelle”, “protein-containing complex”, and “organelle part” were the dominant terms. Among the MFs, most unigenes were assigned to “binding” and “catalytic activity”. Moreover, GO analysis identified 83 unigenes related to olfaction and chemosensation (Table S4).
In the COG database, a total of 14,479 unigenes were assigned to 23 COG categories (Figure 5). Category S, representing “function unknown”, formed the largest group (8,329 unigenes, 57.52%), followed by category O, representing “posttranslational modification, protein turnover, and chaperones” (1,192 unigenes, 8.23%), and category J, representing “translation, ribosomal structure, and biogenesis” (1,150 unigenes, 7.94%).
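The percentages reported for the three largest COG categories follow directly from the counts given above. As a minimal sketch (not the authors' pipeline; the variable and function names are hypothetical, and only the counts stated in the text are used), the arithmetic can be checked as follows:

```python
# Counts taken from the text; names below are illustrative assumptions.
TOTAL_COG_UNIGENES = 14479

cog_counts = {
    "S": 8329,  # function unknown
    "O": 1192,  # posttranslational modification, protein turnover, and chaperones
    "J": 1150,  # translation, ribosomal structure, and biogenesis
}


def percent_of_total(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


cog_percentages = {
    category: percent_of_total(count, TOTAL_COG_UNIGENES)
    for category, count in cog_counts.items()
}

for category, pct in cog_percentages.items():
    print(f"Category {category}: {pct:.2f}% of COG-annotated unigenes")
```

The denominator here is the number of COG-annotated unigenes (14,479), not the full unigene set, which is why category S alone accounts for more than half of the assignments.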
In our study, 9,176 unigenes were assigned to 334 KEGG pathways and classified into six specific pathway groups (Figure 6). Among these, the largest was “translation” (1,346 unigenes), followed by “signal transduction” (1,181 unigenes) and “transport and catabolism” (761 unigenes). These results can help elucidate the olfactory-related gene expression profile during the MP of crayfish and provide theoretical data for gene mining in crayfish.