2.8 Gene family analysis
To identify gene families, protein sequences from S. peregrina and nine other dipteran species (Sarcophaga bullata, Stomoxys calcitrans, M. domestica, L. cuprina, Ceratitis capitata, Bactrocera oleae and D. melanogaster, with A. aegypti and A. gambiae of the family Culicidae as the outgroup) were first aligned using
DIAMOND v0.9.30 (Buchfink et al. 2015), and the aligned results
were then clustered using OrthoFinder v2.7 (Emms & Kelly 2015; Xu et al. 2019). To resolve the phylogenetic relationships among S. peregrina and the nine other species, single-copy gene families were aligned with MAFFT v7.0 (Kazutaka & Standley 2013) and then trimmed with Gblocks v0.91b (Castresana
2000). The phylogenetic tree was inferred using a maximum likelihood
method as implemented in RAxML v8.2 with the GTRGAMMA model and 100
bootstrap replicates (Alexandros 2006; Stamatakis 2014). Divergence times were then estimated under a relaxed clock model using the MCMCTREE program implemented in PAML v4.9e (Yang 2007). The divergence time ranges of the family Culicidae (105.91-234.53 Mya) and of Stomoxys calcitrans and M. domestica (26.97-36.96 Mya) were used for fossil calibration.
Based on the gene families and phylogenetic relationships, expanded and contracted gene families were identified using Computational Analysis of Gene Family Evolution (CAFE) v4.2 (Tijl et al. 2006). To identify positively selected genes in S. peregrina, we retained orthologous groups among S. peregrina and the remaining seven species (after removal of
the outgroup in evolutionary analysis) using Blastall (Camacho et
al. 2009). Subsequently, we performed likelihood ratio tests for selection (P < 0.05) using Codeml with the branch-site model as implemented in the PAML package (Yang 2007). In addition, we analyzed chromosome synteny between S. peregrina and D.
melanogaster based on genome-scale ortholog alignment using MCScanX
(Wang et al. 2012).
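
For readers unfamiliar with the likelihood ratio test mentioned above for the branch-site analysis, the following is a minimal sketch of how one such test could be evaluated from the log-likelihoods of a null and an alternative model. It is not the pipeline used in this study. The chi-square reference distribution with one degree of freedom is an assumption (a common convention for this test), the function name branch_site_lrt is hypothetical, and the log-likelihood values are invented for illustration; only the 0.05 threshold comes from the text.

```python
import math


def branch_site_lrt(lnl_null, lnl_alt, alpha=0.05):
    """Likelihood ratio test between a null and an alternative model.

    Assumes the test statistic follows a chi-square distribution with
    one degree of freedom (an assumption, not stated in the text).
    Returns (statistic, p_value, is_significant).
    """
    # Test statistic: twice the log-likelihood gain of the alternative
    # model, floored at zero in case optimisation left it slightly negative.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, p_value < alpha


# Hypothetical log-likelihoods for one orthologous group.
stat, p_value, significant = branch_site_lrt(lnl_null=-100.0, lnl_alt=-96.0)
print(f"2*dlnL = {stat:.2f}, P = {p_value:.4g}, significant = {significant}")
```

In practice the two log-likelihoods would be read from the Codeml output of the null and alternative runs for each orthologous group, and a multiple-testing correction may be appropriate when many groups are tested.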