Phylogenetic analysis
To generate clean mitochondrial and nuclear genomes for the phylogenetic
analysis, we removed regions derived from past transposition of
cytoplasmic mitochondrial DNA into the nuclear genome (numts). To do so,
we discarded read pairs in which the reads mapped to both the
mitochondrial and the nuclear genome. First, trimmed
reads were mapped to the mitochondrial reference genome using BWA (Li
and Durbin 2009) with default parameters. We extracted paired-end reads
that aligned to the mitochondrial genome from each BAM file. GSNAP (Wu
and Watanabe 2005) was used to realign these reads to the mitochondrial
reference genome and nuclear genome separately following the pipeline
from MToolBox (Calabrese et al. 2014). Read pairs that mapped to both
the mitochondrial and nuclear genomes were removed from downstream
analysis.
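The removal step can be illustrated with a minimal Python sketch. It is not the MToolBox implementation. It assumes that the two realignments are available as SAM text, that a pair counts as mapped when at least one of its reads lacks the unmapped flag (0x4), and that pairs are matched by read name; the function names are hypothetical.

```python
from typing import Iterable, List, Set


def mapped_pair_names(sam_lines: Iterable[str]) -> Set[str]:
    """Collect names of read pairs with at least one mapped read.

    Assumption: a pair is treated as mapped if either mate is mapped.
    """
    names: Set[str] = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if not flag & 0x4:  # 0x4 is the SAM "segment unmapped" bit
            names.add(qname)
    return names


def keep_mito_specific(mito_sam: Iterable[str], nuclear_sam: Iterable[str]) -> List[str]:
    """Return pair names mapped to the mitochondrial genome only."""
    mito = mapped_pair_names(mito_sam)
    nuclear = mapped_pair_names(nuclear_sam)
    # Pairs present in both sets are putative numt-derived and are dropped.
    return sorted(mito - nuclear)
```

Here the names returned by keep_mito_specific would be the pairs retained for the mitochondrial analysis, and the intersection of the two sets corresponds to the pairs removed.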
We generated phylogenetic trees for the nuclear and mitochondrial genomes
separately. For the nuclear genome, a phylogenetic tree was constructed
using the consensus genomic sequences for each NZ population and D. pulex / D. pulicaria clones.
Consensus sequences for each population/clone were generated using
Samtools (Li et al. 2009) with the command:
samtools mpileup -uf ref.fa aln.bam | bcftools call -c | vcfutils.pl vcf2fq > cns.fq
We randomly selected 1% of the consensus
sequences and repeated this 1000 times, constructing maximum-likelihood
trees for each subset of data using IQ-TREE with the GTR+I model and
ultrafast bootstrap (Nguyen et al. 2014; Hoang et al. 2017). The
consensus tree was generated using the Consense program, and branch
lengths were estimated with the Dnaml program in the PHYLIP package
(Felsenstein 1993). A separate phylogenetic tree based on mitochondrial
data was constructed with the maximum-likelihood method using MEGA 7
with 100 bootstrap replicates (Kumar et al. 2016).
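The repeated 1% subsampling of the nuclear consensus sequences described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the script used in the study: it assumes that the consensus sequences form an alignment of equal-length strings, that "1%" refers to alignment columns drawn without replacement, and that the same columns are taken from every population/clone. The function names and the seed are hypothetical.

```python
import random
from typing import Dict, Iterator, Optional


def subsample_columns(alignment: Dict[str, str], fraction: float = 0.01,
                      rng: Optional[random.Random] = None) -> Dict[str, str]:
    """Draw a random subset of alignment columns, shared by all sequences."""
    if rng is None:
        rng = random.Random()
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    length = lengths.pop()
    n_cols = max(1, round(length * fraction))
    # Sample without replacement and keep the original column order.
    cols = sorted(rng.sample(range(length), n_cols))
    return {name: "".join(seq[i] for i in cols)
            for name, seq in alignment.items()}


def replicates(alignment: Dict[str, str], n_replicates: int = 1000,
               fraction: float = 0.01, seed: int = 0) -> Iterator[Dict[str, str]]:
    """Yield n_replicates independent column subsamples (seed is arbitrary)."""
    rng = random.Random(seed)
    for _ in range(n_replicates):
        yield subsample_columns(alignment, fraction, rng)
```

Each yielded subset would then be written to a file and passed to the tree-building step; a generator is used so that the 1000 subsets need not be held in memory at once.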