2.4 De novo assembly and polishing of the genome
PacBio long subreads were first corrected with Canu v1.6 (Koren et al. 2017). The genome was then assembled with WTDBG v1.2.8 using the error-corrected reads. The PacBio subreads were subsequently mapped back to the raw contigs with Blasr v5.1, and the contigs were polished with Arrow v2.1.0 (Chin et al. 2013). Because PacBio raw long reads have a high error rate, Illumina short reads were mapped to the improved contigs, which were then further polished with Pilon v1.20 (Walker et al. 2014). In addition, we applied GC depth analysis to check for potential contamination introduced during sequencing and to evaluate the coverage of the assembly. The analysis showed that the average GC content of the genome was 33.23% and that the distribution curve was single-peaked (Additional file 1: Figures S2 and S3). Taken together, the GC depth analysis and the sequencing depth of the genome indicated that there was no contamination from other species (Additional file 1: Figure S4).
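For illustration, the idea behind a GC depth analysis can be sketched as follows. This is a minimal sketch and not the pipeline used in this study: it assumes that per-base sequencing depth has already been extracted from the read alignments into a list, and the function names (`gc_fraction`, `gc_depth_windows`, `gc_histogram`), the window size, and the number of bins are hypothetical choices made only for the example.

```python
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """Return the G+C fraction among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Windows made only of ambiguous bases (e.g. N) are reported as 0.0.
    return gc / total if total else 0.0


def gc_depth_windows(
    seq: str, depth: List[int], window: int
) -> List[Tuple[float, float]]:
    """Pair GC fraction with mean depth for each non-overlapping window."""
    if len(seq) != len(depth):
        raise ValueError("sequence and depth must have the same length")
    if window <= 0:
        raise ValueError("window must be positive")
    points = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        cov = depth[start:start + window]
        points.append((gc_fraction(chunk), sum(cov) / len(cov)))
    return points


def gc_histogram(points: List[Tuple[float, float]], bins: int = 10) -> List[int]:
    """Count windows per GC bin; one dominant peak suggests a single source."""
    counts = [0] * bins
    for gc, _ in points:
        # A GC fraction of exactly 1.0 is placed in the last bin.
        idx = min(int(gc * bins), bins - 1)
        counts[idx] += 1
    return counts


if __name__ == "__main__":
    # Toy contig and toy depth values, invented for the example.
    contig = "GGCCAATT"
    per_base_depth = [10] * len(contig)
    windows = gc_depth_windows(contig, per_base_depth, window=4)
    print(windows)
    print(gc_histogram(windows))
```

In such a plot of GC fraction against mean depth, windows from a contaminating organism would typically form a separate cluster with a different GC content or depth, whereas a single cluster and a single-peaked GC distribution are consistent with sequence from one species.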