Sequence data processing
Quality and adapter trimming were carried out on raw reads, discarding
reads < 50 bp or with an average quality score < 20
in a sliding window of five bases. Since the coral holobiont is
associated with high densities of prokaryotes and algal endosymbionts,
reads were filtered with the following steps: First, reads were compared
to an rRNA database (Silva132_LSU, Silva132_SSU) and matches (i.e.,
e-values ≤ 10⁻⁵) were removed using the program SortMeRNA (Kopylova et al., 2012). Second, reads were compared to
the algal endosymbiont genome (genus Cladocopium, symC_scaffold_40.fasta (Shoguchi et al., 2018)) and matches were
removed using bbduk. The remaining reads were used to create a de novo assembly for each offspring group and a combined de novo assembly for all four offspring groups using Trinity (Grabherr et al., 2011). Small transcripts of <
400 bp were removed from the assemblies (Kenkel & Bay, 2017), and the
longest isoform of each Trinity transcript was obtained. Mitochondrial
genes were identified using BLASTn searches against the A. tenuis mitochondrial genome (NC_003522.1.fasta (van Oppen et al., 2002)) and
were retained in the analysis. The remaining transcripts were then
identified by BLASTx searches against the most complete coral
gene model (A. digitifera,
GCF_000222465.1_Adig_1.1_protein.faa (Shinzato et al., 2011)) and
NCBI’s nonredundant (nr) protein database, with an e-value cutoff of ≤
10⁻⁵.
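The assembly filtering described above (dropping transcripts shorter than 400 bp and keeping the longest isoform per Trinity transcript group) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, whose tooling for this step is not stated. The function names are hypothetical, and the sketch assumes the standard Trinity identifier layout in which the isoform is marked by a trailing `_i<N>` suffix (e.g., `TRINITY_DN1_c0_g1_i2`).

```python
import re
from typing import Dict, Iterable, Iterator, Tuple

MIN_LENGTH = 400  # transcripts shorter than this (in bp) are dropped, per the text


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def trinity_gene_id(header: str) -> str:
    """Strip the trailing isoform suffix (_i<N>) from a Trinity identifier.

    Assumes the usual Trinity naming scheme; other assemblers differ.
    """
    seq_id = header.split()[0]
    return re.sub(r"_i\d+$", "", seq_id)


def longest_isoforms(
    records: Iterable[Tuple[str, str]], min_length: int = MIN_LENGTH
) -> Dict[str, Tuple[str, str]]:
    """Drop short transcripts, then keep the longest isoform per gene ID."""
    best: Dict[str, Tuple[str, str]] = {}
    for header, seq in records:
        if len(seq) < min_length:
            continue  # removes transcripts of < min_length bp
        gene = trinity_gene_id(header)
        if gene not in best or len(seq) > len(best[gene][1]):
            best[gene] = (header, seq)
    return best


if __name__ == "__main__":
    # Toy input: two isoforms of one gene and one transcript that is too short.
    demo = [
        ">TRINITY_DN1_c0_g1_i1", "A" * 450,
        ">TRINITY_DN1_c0_g1_i2", "A" * 600,
        ">TRINITY_DN2_c0_g1_i1", "A" * 399,
    ]
    for gene, (header, seq) in longest_isoforms(parse_fasta(demo)).items():
        print(gene, header, len(seq))
```

In this sketch, ties in length are resolved in favour of the isoform encountered first. A real pipeline would need to define tie-breaking explicitly.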
Gene names and gene ontologies (GO) of the transcripts were assigned
using BLASTx searches against the UniProt Knowledgebase Swiss-Prot
database (The UniProt Consortium, 2015). Transcript abundance of the
samples was then estimated by aligning reads with bowtie2 (Langmead & Salzberg,
2012) and quantifying with RSEM, an alignment-based method (B. Li & Dewey, 2011).
For gene expression comparison between hybrids and parental purebreds,
we estimated transcript abundance using both the parental assembly of A. loripes and the combined assembly produced using all
offspring groups. The two methods revealed very similar results (Figure
S1), and the results presented here are based on transcript abundance
estimated using the parental assembly of A. loripes. Due to the
small number of samples available for the parental purebred A.
tenuis (Table S1), a de novo assembly was not conducted or
tested as a basis for transcript abundance estimation. For gene expression
comparison between treatments within an offspring group, the de
novo assembly of each offspring group was used to estimate transcript
abundance. Treatment comparison was not conducted for A. tenuis purebreds due to an insufficient number of samples (Table S1).