Materials and Methods
Sample collection and Apibacterisolation
The worker bees of Apis cerana were collected from Sichuan, Jilin
and Qinghai Provinces in China (Dataset S1). A total of six guts were
dissected, homogenized and frozen in glycerol (25%, vol/vol) at –80°C.
Frozen stocks of homogenized guts were streaked out on the heart
infusion agar (Oxoid) or Columbia agar (Oxoid) supplemented with 5%
sheep blood (Solarbio). The plates were incubated at 35°C in 5 %
CO2 for 2–3 days. Colonies were screened by PCR using
universal primers 27F and 1492R for the16S rRNA gene and Sanger
sequencing.
Fluorescence in situ hybridization
(FISH) microscopy
Ten adult worker bees of A. cerana were collected from a single
colony in Beijing, China in May 2018. The whole guts of bees were
dissected and frozen in RNAlater (Qiagen) at -80 ºC. The FISH protocol
was adapted from Yuval et al. (Gottlieb et al., 2006). In brief, the
guts preserved in RNAlater were fixed in the fixative solution (Ethanol:
Chloroform: Acetic acid = 6:3:1) and kept overnight at the room
temperature, followed by rinsing with 1 × PBS. The guts were then
treated with 1 mg/mL proteinase K solution at 56 °C for 20 minutes,
followed by rinsing with 1 × PBS. The guts were incubated with 6%
H2O2 ethanol solution for 2 hours,
followed by rinsing twice with 1 × PBS. Finally, the guts were permeated
with 0.1% Triton for 2 hours, followed by rinsing for 2 or 3 times in 1
× PBS buffer.
Binding FISH probes targeting 16S rRNA genes were designed specifically
for each of the genera Apibacter , Snodgrasella andGilliamella , following Martinson et al. (Martinson et al., 2012)
(Table S1). Probe hybridization was performed overnight at 37 °C,
followed by rinsing in 1 × PBS for 3 times. Spectral imaging was used to
visualize gut sections on a ZEISS LSM780 confocal microscope.
Autofluorescence was assayed for each tissue type as the negative
control. Apibacter were labeled with florescent together with
either S. alvi or G. apicola .
Estimation of bacterial abundance using
qPCR
Worker bees were collected from 2 colonies at the same apiary in
Beijing, China in September 2018. Samples were stored at -80°C. Gut
segments (i.e., midgut, ileum, and rectum) were dissected and separated.
DNA was extracted following Powell et al. (Powell et al., 2014).
Universal 16S primers 27F and 355R for bacteria were employed andApibacter -specific primers Apiq9-F and Apiq9-R were designed in
this study (Table S1). Standard curves were created and reactions were
carried out on a LightCycler 480 (Roche Applied Science, Indianapolis,
IN). Significance in differences within and between samples was
determined using Mann-Whitney (Wilcoxon-rank) nonparametric U tests.
In vivo colonization
experiments
Microbiota-depleted bees were obtained following Zheng et al. (Zheng,
Powell, et al., 2017). Late stage pupae of both A. cerana orA. mellifera were removed from brood frames and incubated for
24-36 h in sterile plastic bins at 35 °C and 75% humidity. Newly
emerged bees were kept in cup cages provided with sterilized sucrose
syrup (0.5 M) and bee bread. For inoculation, bacteria strain of theApibacter sp. B3706 was grown on the heart infusion agar (Oxoid),
which was supplemented with 5% sheep blood from glycerol stocks, for
two days. Cultivated strains were scraped and suspended in 1 × PBS to
reach an OD600 of 1.0. Batches of 25 bees were placed in
a 50 ml conical tube, and 50 μl sucrose syrup was added. The tube was
rotated gently so that the syrup was coated on the surface of bees. The
tube was rotated again after the addition of 50 μl of theApibacter sp. B3706 suspension (~2 ×
106 cells per bee). The bacteria were inoculated into
the bee guts as a result of auto- and allogrooming. Inoculated bees were
reared in cup cages. Three replicate enclosures were set up for bothA. cerana or A. mellifera with 20 bees in each cage. The
inoculated bees were fed with sucrose syrup and sterilized pollen
throughout the experiment. Colonization levels were determined at day 6
using qPCR as described previously.
Genome sequencing, assembly and
annotation
Genomic DNA of Apibacter isolates were extracted using a
bead-beating method as previously described (Powell et al., 2014). Two
strains (Apibacter sp. B2966 and B3706) were sequenced with the
PacBio RS (PacBio) at Nextomics Biosciences Co. Ltd., China, and
assembled using OLC algorithm of the Celera (Chaisson & Tesler, 2012).
Three strains (Apibacter sp. B3239, B3546 and B2912) were
sequenced with a BGISEQ-500RS (BGI) at BGI-Qingdao, China. The other
nine strains (Apibacter sp. B3813, B3883, B3887, B3889, B3912,
B3913, B3918, B3924 and B3935) were sequenced with a HiSeq X-Ten
(Illumina) at Novogene Co., Ltd, China. The twelve strains except forApibacter sp. B2966 and B3706 were assembled withSOAPdenovo-Trans (version 2.04, -K 51 -m 91 –R for 150PE reads;
-K 31 -m 63 –R for 100PE reads) (Xie et al., 2014), SOAPdenovo(only for 150PE reads) and SPAdes (version 3.13.0, -k
33,55,77,85) (Bankevich et al., 2012). The quality trimmed reads were
mapped back to the assembled contigs using minimap2-2.9 (Li, 2018) to
examine assembly quality. The bam file generated by samtools(version 1.8) (Li et al., 2009) and the assemblies were processed byBamDeal(https://github.com/BGI-shenzhen/BamDeal,
version 0.19) to calculate and visualize sequencing coverage and GC
contents of assembled contigs. Spurious contaminants, contigs with low
depths, unnormal GC contents and those apparently differing from the
cluster, were removed from the draft genome. Assembled genomes were then
annotated using PROKKA 1.13.3 (Seemann, 2014).
Comparisons of genome structure, genome divergence, and gene
contents
The contigs of each assembly were re-ordered according to the single
circular genome of strain B3706 using the ‘Contig Mover’ tool of Mauve
version 2.4.0 (Darling, Mau, Blattner, & Perna, 2004; Rissman et al.,
2009). Pairwise average nucleotide identity (ANI) was calculated with
JSpeciesWS (Richter, Rosselló-Móra, Oliver Glöckner, & Peplies, 2016)
using the BLASTN algorithm (ANIb). Genome completeness was estimated
with CheckM (Parks et al., 2015), which was available at KBase online
(Arkin, Stevens, Cottingham, Maslov, & Perez, 2015) using recommended
parameters. The genome structures were compared using the R-package
genoPlotR (Guy, Kultima, Andersson, & Quackenbush, 2011).
Phylogenetic inference
Gene orthology was determined using OrthoMCL (Li et al., 2003) for all
genomes used in this study. All steps of the OrthoMCL pipeline were
executed as recommended in the manual and the mcl program was conducted
using parameters ‘-abc -I 1.5’.
Protein sequences of the identified single-copy orthologous were aligned
using Mafft-linsi (Katoh, Kuma, Miyata, & Toh, 2005). Alignment columns
only containing gaps were removed and the alignments were concatenated.
The phylogeny was reconstructed using RAxML v8.2.10 (Stamatakis, 2014)
with the PROTGAMMAIJTT model and 100 bootstrap replicates.
Gene flux analysis
Gene gain and loss analyses and inferences for gene contents of LCAs
(last common ancestors) were conducted using Count (Csurös, 2010).
Standard methods used in previous works were employed in the present
study (Segers et al., 2017). For each gene family, Wagner parsimony with
a gene gain/loss penalty of 2 (Zaremba-Niedzwiedzka et al., 2013) was
used to infer the most parsimonious ancestral states. Parameter choices
followed a previous publication (Oyserman et al., 2016).
Analysis of functional gene contents
Gene contents were categorized based on COG and eggNOG (Huerta-Cepas et
al., 2019) functions. Significantly differential genes and pathways were
enriched by OrthoVenn (Yi, Colemanderr, Chen, & Gu, 2015). For gene
families subset of interest, BLASTP(Altschul et al., 1997) was used to
query against the NCBI’s nr database and TIGRFAM Hidden Markov Model
(HMM) (Haft, Selengut, & White, 2003). The phylogenies of narGand narH were produced from protein sequences obtained by blastp
against NCBI’s nr database with default parameters. And the hits aligned
using Mafft-linsi, trimmed with trimAL (Capella-Gutiérrez,
Silla-Martínez, & Gabaldón, 2009) for sites with over 50% gaps and
phylogenetic trees were constructed using RAxML PROTGAMMALG with 100
bootstraps. The parameters were based on previous publication (Neuvonen
et al., 2016).
Carbohydrate-active enzyme (CAZymes) gene families were identified for
all analyzed genomes using the command-line version of dbCAN (Database
for automated Carbohydrate-active enzyme Annotation) (Yin et al., 2012),
following authors’ instruction. The PULs (Polysaccharide utilization
loci) were identified using the TIGRfam (Potter et al., 2018) and Pfam
(Finn et al., 2014) models, following Terrapon et al. (Terrapon,
Lombard, Gilbert, & Henrissat, 2015). Antibiotic resistance genes were
identified by querying all the genomes against the Comprehensive
Antibiotic Resistance Database (CARD) (McArthur et al., 2013).