Discussion
Comparisons of 118,617 metagenome assembled genomes from 287 gut
bacterial species enabled the identification of genes targeted by
balancing selection in the human gut. Results revealed that multidrug
efflux pumps (MEPs) display the strongest signatures of balancing
selection of any gut bacterial core open reading frames (CORFs). MEPs
from a diversity of prominent gut bacterial species, including Bacteroides and Bifidobacterium, displayed evidence of
balancing selection (Table S1, Figure 2), suggesting that adaptive
allelic variation within these loci has been maintained in parallel in
multiple bacterial lineages. MEPs were also overrepresented among the
CORFs displaying the highest Tajima’s D values, further supporting the view that
balancing selection shapes allelic variation within this functional
category of loci.
MEPs serve myriad functions for bacteria, including the extrusion of
antibiotics that are commonly used as medicines. Previous work has shown
that antibiotic therapies can act as harsh selective agents in the gut,
reshaping the community composition of the gut microbiota (Modi et al.,
2014) as well as the adaptive trajectories of individual gut bacterial
lineages (Banerjee et al., 2021; Card et al., 2021). The findings
reported here are consistent with the possibility that medical
antibiotic use also contributes to the maintenance of allelic variation
within multiple prominent gut bacterial species. In particular, the
observation that the CORF displaying the highest Tajima’s D value was a
homolog of the AcrB subunit of the RND superfamily of multidrug efflux
pumps (Figure 2) suggests medical antibiotic usage as an agent of
balancing selection. In Escherichia coli, the periplasmic distal
binding pocket of the AcrB subunit binds minocycline, a tetracycline
antibiotic, and erythromycin A, a macrolide antibiotic (Du et al.,
2018). Moreover, allelic variation at this locus in E. coli has
been shown to contribute to antibiotic resistance (Okusu et al., 1996;
Blair et al., 2015). However, MEPs are widely distributed among
bacterial genomes and serve ancient functions that predate the usage of
antibiotics in medical contexts (Blanco et al., 2016), including the
extrusion of heavy metals, organic pollutants, plant-produced compounds,
and bacterial metabolites. Therefore, it is possible that selective
agents other than medical antibiotics may contribute to the maintenance
of allelic variation in MEP loci displaying positive Tajima’s D values.
If medical antibiotics are in fact driving balancing selection in the
MEP loci identified, the results presented here imply that these treatments
may be among the most influential selective agents maintaining allelic
variation in the human gut.
Positive Tajima’s D values are consistent with a history of balancing
selection, but they can also be caused by fluctuations in population
size. In particular, recent population contractions can generate
positive Tajima’s D values in the absence of balancing selection. Here,
Tajima’s D was estimated genome-wide for each CORF in each bacterial
species analyzed, allowing identification of loci that deviated
substantially from the genomic background. This approach provided tests
for balancing selection that accounted for genome-wide patterns of
nucleotide variation caused by demographic processes. For example,
genome-wide Tajima’s D values differed significantly among bacterial
clades, with species within Bifidobacterium displaying the
most negative values and species within Bacteroides displaying the most positive values. These differences among genome-wide
Tajima’s D values likely reflect differences among the demographic
histories of these clades, whereas loci with Tajima’s D values that
deviate from the genomic background are more likely to represent targets
of balancing selection.
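The statistic and the comparison against the genomic background can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the function names `tajimas_d` and `background_z_scores` are hypothetical, the input is assumed to be a gap-free alignment of equal-length sequences, and a z-score is only one possible way to express how far a locus deviates from the genome-wide distribution.

```python
from collections import Counter
from math import nan, sqrt
from statistics import mean, stdev


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences.

    Assumption: every character is treated as an allele, so gaps and
    ambiguous bases should be removed beforehand.
    """
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n = 2 and n = 3.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must have equal length")

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    n_pairs = n * (n - 1) / 2
    seg_sites = 0   # S: number of segregating sites
    pi = 0.0        # mean number of pairwise differences
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            identical_pairs = sum(c * (c - 1) / 2 for c in counts.values())
            pi += (n_pairs - identical_pairs) / n_pairs

    if seg_sites == 0:
        return nan  # D is undefined without variation
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


def background_z_scores(d_by_locus):
    """Express each locus's D relative to the genome-wide mean and SD."""
    values = list(d_by_locus.values())
    mu, sd = mean(values), stdev(values)
    if sd == 0:
        raise ValueError("no variation in D among loci")
    return {locus: (d - mu) / sd for locus, d in d_by_locus.items()}
```

In this sketch, an alignment in which each variant is carried by half of the sequences (for example `["AAAA", "AAAA", "TTTT", "TTTT"]`) yields a positive D, whereas an alignment in which every variant is a singleton yields a negative D. Loci with large positive z-scores are those that stand out from the demographic signal shared across the genome.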
Multidrug efflux pumps were significantly enriched among the CORFs
displaying the highest Tajima’s D values, but this set of CORFs
also included a diversity of other functional categories of proteins.
Magnesium transporters, helicases, and various synthases and hydrolases
were all represented among the CORFs with the highest Tajima’s D values
(Table S1). Allelic variation in these enzymes may be maintained by
cyclic fluctuations in the availabilities of different substrates. The
CORFs with Tajima’s D values greater than three (Table S1) represent
excellent candidates for experimental study of the functional
consequences of allelic variation within these loci.
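Overrepresentation of a functional category among the highest-scoring CORFs, as described for MEPs above, is commonly assessed with a one-sided hypergeometric test. The sketch below illustrates that calculation only. The text does not state which enrichment test was applied, so the choice of test is an assumption, and the function name `enrichment_p_value` and the counts in the usage note are hypothetical.

```python
from math import comb


def enrichment_p_value(total, category_total, top_n, category_in_top):
    """One-sided hypergeometric p-value, P(X >= category_in_top).

    total            number of CORFs tested
    category_total   CORFs annotated to the category of interest
    top_n            size of the high-D set
    category_in_top  category members observed in the high-D set
    """
    upper = min(category_total, top_n)
    favourable = sum(
        comb(category_total, k) * comb(total - category_total, top_n - k)
        for k in range(category_in_top, upper + 1)
    )
    return favourable / comb(total, top_n)
```

With hypothetical counts of 10 CORFs, 5 of them in the category, and all 5 falling in a top set of 5, the function returns 1/252, the probability of drawing that set by chance.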
Interestingly, the bacterial species that contained CORFs with the
highest Tajima’s D values also tended to be the most abundant bacterial
species in the human gut based on metagenomic data (Figure 3). This
positive association suggests a relationship between balancing selection
and fitness in human gut bacteria. One possible explanation for this
pattern is that balancing selection is more effective in more abundant
bacterial species, given that selection is in general expected to be
more efficient in larger populations than in smaller populations (Lanfear
et al., 2014). Alternatively, the CORFs identified as targets of
balancing selection may confer fitness benefits to bacterial species,
increasing their competitive advantage over other species in the gut.
This hypothesis is supported by the observation that the relationship
between balancing selection and relative abundance in the gut remained
evident after controlling for bacterial phylogenetic history (Figure
3B). Under this scenario, the allelic variation in MEPs displaying
evidence of balancing selection may underlie the success in the human
gut of lineages like Bacteroides spp., which are overrepresented
in industrialized human populations relative to non-industrialized
populations and non-human primates (Yatsunenko et al., 2011; Moeller et
al., 2014; Sonnenburg and Sonnenburg, 2019).