Figure 1 The red swamp crayfish, Procambarus clarkii.
Total RNA extraction and detection
Library construction and transcriptome sequencing were performed by Majob Technology Co. Ltd (Shanghai, China). Total RNA was extracted using the Trizol reagent (Invitrogen, Shanghai, China). RNA concentration and purity were measured using a Nanodrop 2000 spectrophotometer (Invitrogen, Massachusetts, USA). RNA degradation was checked on a 1% agarose gel, and the RNA integrity number (RIN) was assessed using an Agilent Bioanalyzer 2100 (Agilent Technologies, USA).
Library construction and sequencing
mRNAs were enriched using Oligo(dT) magnetic beads and randomly fragmented using fragmentation buffer. The fragmented mRNAs served as templates for first-strand cDNA synthesis by reverse transcriptase with random hexamer primers. Second-strand cDNA was then synthesized using DNA polymerase I and RNase H. An end-repair mix (end-repair enzyme mix and end-repair buffer) was added to repair the cohesive ends of the double-stranded cDNA, followed by the addition of a tail and sequencing adapters. The cDNA was then amplified by PCR, and the amplification products were purified with AMPure XP beads to obtain the cDNA library. The QuantiFluor dsDNA System and Quantus™ Fluorometer (Promega, Madison, Wisconsin, USA) were used to determine the concentration and insert size of the library, respectively. The effective concentration of the library was quantified by quantitative polymerase chain reaction (qPCR). Finally, the libraries were sequenced on the Illumina HiSeq X Ten/NovaSeq 6000 sequencing platform.
De novo assembly and annotation
Clean reads were obtained by using SeqPrep (https://github.com/jstjohn/SeqPrep) and Sickle (https://github.com/najoshi/sickle) to remove sequencing adapters, primer sequences, and low-quality sequences from the raw reads. The clean reads were then assembled de novo into unigenes, which represent the coding sequences, using the Trinity software (Grabherr et al., 2011). The corresponding amino acid sequences were examined with TransRate (http://hibberdlab.com/transrate/) to obtain comprehensive information on gene function.
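As an illustration of what quality filtering of raw reads involves, the following minimal Python sketch trims the 3' end of a read with a sliding window over Phred scores. It is a simplified stand-in, not the implementation used by SeqPrep or Sickle; the function names, the Phred+33 encoding, the window size of 5, and the thresholds of Q20 and 20 bp are all assumptions chosen for the example.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality_string]


def trim_3prime(seq, qual, window=5, min_avg_q=20, min_len=20):
    """Cut the read at the first window whose mean quality drops below min_avg_q.

    Returns (trimmed_seq, trimmed_qual), or None if the read is empty or
    shorter than min_len after trimming. All defaults are illustrative.
    """
    scores = phred_scores(qual)
    if not scores:
        return None
    cut = len(scores)
    # Slide the window from the 5' end; stop at the first low-quality window.
    for start in range(max(len(scores) - window + 1, 1)):
        chunk = scores[start:start + window]
        if sum(chunk) / len(chunk) < min_avg_q:
            cut = start
            break
    if cut < min_len:
        return None
    return seq[:cut], qual[:cut]


# Hypothetical read: 25 high-quality bases (Q40) followed by 5 poor ones (Q2).
read_seq = "A" * 30
read_qual = "I" * 25 + "#" * 5
print(trim_3prime(read_seq, read_qual))
```

Reads that fall below the length threshold after trimming are discarded (returned as None here), which mirrors the general idea of keeping only reads that remain informative after low-quality bases are removed.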
The unigene sequences were compared against six databases: the National Center for Biotechnology Information (NCBI) Non-Redundant Protein Sequence Database (Nr) (ftp://ftp.ncbi.nlm.nih.gov/blast/db/), Protein family (Pfam) (http://pfam.xfam.org/), Gene Ontology (GO) (http://www.geneontology.org), Swiss-Prot (http://web.expasy.org/docs/swiss-prot_guideline.html), Cluster of Orthologous Groups of proteins (COG) (http://www.ncbi.nlm.nih.gov/COG/), and Kyoto Encyclopedia of Genes and Genomes (KEGG) (http://www.genome.jp/kegg/). Functional annotation was performed using the Blast2GO software (Conesa et al., 2005).
DEG enrichment analysis
Clean reads were mapped to the unigene library using the Bowtie software (Langmead et al., 2009), and expression levels were estimated with RSEM (Li and Dewey, 2011). The expression of unigenes in the two libraries was calculated as FPKM (fragments per kilobase of transcript per million mapped fragments) (Mortazavi et al., 2008). DEGs were then identified using the DESeq2 software (Varet et al., 2016). The resulting P-values were adjusted to control the false discovery rate (FDR). DEGs were filtered using thresholds of FDR < 0.01 and |log2(Fold Change)| ≥ 1. Subsequently, GO and KEGG pathway enrichment analyses were conducted using the Goatools software (Klopfenstein et al., 2018).
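The quantification and filtering steps above can be sketched in a few lines of Python. This is a minimal illustration, not the RSEM or DESeq2 implementation: it assumes the Benjamini-Hochberg procedure for the FDR adjustment (the DESeq2 default, although the adjustment method is not named in the text), and the function names and the example unigenes with their values are hypothetical.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def filter_degs(genes, fdr_cutoff=0.01, lfc_cutoff=1.0):
    """Keep genes with FDR < fdr_cutoff and |log2 fold change| >= lfc_cutoff.

    genes: list of (name, log2_fold_change, p_value) tuples.
    """
    padj = benjamini_hochberg([p for _, _, p in genes])
    return [
        (name, lfc, q)
        for (name, lfc, _), q in zip(genes, padj)
        if q < fdr_cutoff and abs(lfc) >= lfc_cutoff
    ]


# Hypothetical values, for illustration only.
example = [
    ("unigene_a", 2.5, 0.0001),
    ("unigene_b", 0.4, 0.0002),
    ("unigene_c", -1.2, 0.004),
    ("unigene_d", 3.0, 0.2),
]
print(fpkm(500, 1000, 10_000_000))            # 50.0
print([g[0] for g in filter_degs(example)])   # ['unigene_a', 'unigene_c']
```

In the example, unigene_b is excluded despite a small adjusted P-value because its fold change is below the cutoff, and unigene_d is excluded because its adjusted P-value is too high; both criteria must hold for a gene to be retained.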
Real-time qPCR (RT-qPCR)
To verify the results of our sequencing analyses, we selected eight olfactory-related genes from the male antennae of red swamp crayfish for RT-qPCR analysis. Total RNA was reverse transcribed into first-strand cDNA using the PrimeScript™ 1st Strand cDNA Synthesis Kit (TaKaRa, Shanghai, China), and the newly synthesized cDNA was used as the template for RT-qPCR. Specific primers were designed using the Primer 5.0 software. The β-actin gene was used as the internal normalization control. The primer sequences of the eight DEGs used for RT-qPCR are listed in Table S1. The RT-qPCR reactions (20 μl) contained 2 μl cDNA, 1 μl of each primer, 1 μl dNTP Mix, 10 μl 2×SYBR real-time PCR premixture, and 6 μl RNase-free sterilized ultrapure water. The reaction procedure was as follows: 95 °C for 5 min, followed by 40 cycles of 95 °C for 15 s and 60 °C for 30 s. To confirm reproducibility, the RT-qPCR reaction for each sample was performed with three technical replicates and three biological replicates. The expression level of the selected genes was calculated using the 2^(-ΔΔCt) method (Livak and Schmittgen, 2001). The expression of each target gene was compared among samples using independent-samples t-tests (SPSS, version 25.0).
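For readers unfamiliar with the 2^(-ΔΔCt) calculation, the following minimal Python sketch shows the arithmetic. The function name and the Ct values are hypothetical; "sample" and "calibrator" stand for the two groups being compared, and the reference gene corresponds to β-actin in this study. Technical replicates are averaged before the calculation, which is a common convention assumed here rather than a detail stated in the text.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of Ct values from technical replicates.
    """
    # dCt: target gene normalized to the reference gene within each group.
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    # ddCt: sample relative to calibrator.
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values (three technical replicates each).
fold = relative_expression(
    ct_target_sample=[24.0, 24.0, 24.0],
    ct_ref_sample=[18.0, 18.0, 18.0],
    ct_target_calibrator=[26.0, 26.0, 26.0],
    ct_ref_calibrator=[18.0, 18.0, 18.0],
)
print(fold)  # 4.0
```

Here the target gene crosses the threshold two cycles earlier in the sample than in the calibrator after normalization, which corresponds to a four-fold higher relative expression under the method's assumption of roughly 100% amplification efficiency.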
RESULTS
Assembly and splicing
A total of 133,462,976 and 125,017,008 raw reads were obtained from the NMP and MP groups, respectively. After quality control, we obtained at least 5.98 Gb of reads per sample. A total of 130,343,776 clean reads were obtained from the NMP group and 122,633,780 from the MP group, with a Q30 score > 92.04%. The G + C content ranged from 47.83% to 49.05% in the NMP group and from 44.38% to 47.45% in the MP group (Table S2). The clean reads were assembled into 78,138 transcripts with an average length of 1147.85 bp and an N50 length of 2379 bp. The clean reads were further assembled into 59,218 unigenes with an average length of 1056.41 bp and an N50 length of 2229 bp. The length distribution of transcripts and unigenes is shown in Table 1 and Figure S1. These results indicated that the data were of high quality and that the unigenes were suitable for further annotation analysis. The raw data files have been uploaded to the NCBI Sequence Read Archive (BioProject accession number: PRJNA508983).
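The N50 length reported above is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. A minimal Python sketch of this calculation follows; the function name and the example lengths are hypothetical and are not taken from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Add sequences from longest to shortest until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical transcript lengths: total 30 bp, half is 15 bp.
print(n50([2, 3, 4, 5, 6, 10]))  # 6
```

Because N50 weights long sequences more heavily than the arithmetic mean does, it is typically larger than the average length, as is the case for both the transcripts and the unigenes in this assembly.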
Table 1 The length distribution of assembled transcripts and unigenes.