Introduction
The populations of bacteria that reside within the gastrointestinal tracts of humans experience a diversity of oscillating environmental variables. Within and among hosts, gut bacterial populations face fluctuating selection pressures imposed by variation in host diet (David et al., 2014), drug use (e.g., antibiotics) (Modi et al., 2014), and immunity (Schluter et al., 2020) as well as by variation in biotic interactions with other constituents of the microbiota (Coyte and Rakoff-Nahoum, 2019). These cyclic changes in the adaptive landscapes on which gut bacterial populations evolve may promote the maintenance of genetic diversity within gut bacterial species. However, the degree to which balancing/diversifying selection operates on gut bacterial genomes has not been widely investigated, and the genomic loci that represent the targets of such selective forces have not been identified.
Theory predicts that genes under balancing selection will display more pairwise sequence differences between copies in a population than expected under neutral evolution given the number of polymorphic sites in the population. The difference between these two values, the observed average number of pairwise differences (πo) and the number of pairwise differences expected under neutrality given the number of segregating sites (πe), scaled by its expected standard deviation, provides a test statistic termed Tajima’s D (Tajima, 1989). Positive values of D indicate an excess of intermediate-frequency variants, as expected under balancing selection. Recent advances in assembling bacterial genomes directly from metagenomic sequence data have generated unprecedented opportunities for interrogating the strength and genomic targets of balancing selection in the human gut microbiota (Pasolli, 2019). Metagenome-assembled genomes are now available for nearly all of the bacterial species detected at appreciable abundances in the human gut microbiota and, for many species, multiple genomes from a diversity of strains have been assembled from metagenomes of numerous host populations and individuals.
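To make the statistic concrete, the following is a minimal sketch of how Tajima’s D could be computed from a set of aligned gene copies, using the standard constants from Tajima (1989). It is an illustration only and not the pipeline used in this study. The function name `tajimas_d` and the toy sequences are hypothetical, and the sketch assumes equal-length, gap-free alignments in which every differing character counts as one difference.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return Tajima's D for a list of equal-length, aligned sequences.

    Illustrative sketch: gaps and ambiguous bases are NOT handled specially.
    """
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) alignment columns.
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi_o: observed mean number of differences over all sequence pairs.
    n_pairs = n * (n - 1) / 2
    pi_obs = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in combinations(seqs, 2)
    ) / n_pairs

    # pi_e: pairwise differences expected under neutrality given S.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    pi_exp = s_sites / a1

    # Variance of (pi_o - pi_e) under neutrality (Tajima, 1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * s_sites + e2 * s_sites * (s_sites - 1)

    return (pi_obs - pi_exp) / sqrt(variance)


# Toy data (hypothetical): two divergent haplotypes at equal frequency,
# versus a sample in which every variant is a singleton.
balanced = ["AAAAA"] * 3 + ["TTTTT"] * 3
rare = ["TAAAA", "ATAAA", "AATAA", "AAATA", "AAAAT", "AAAAA"]
print(tajimas_d(balanced), tajimas_d(rare))
```

In this toy example, the sample with two haplotypes at intermediate frequency yields a positive D, whereas the sample dominated by singleton variants yields a negative D. This matches the intuition that balancing selection inflates πo relative to πe.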
Here, we analyzed 118,617 metagenome-assembled genomes (MAGs) to identify the targets of balancing selection in 288 species of human gut bacteria. We found that gut bacterial genomes evolve primarily under purifying selection. However, a subset of loci displayed significant population genetic evidence of balancing selection. In multiple prominent gut bacterial species, these loci included coding regions for components of multidrug efflux pumps, which were overrepresented among the gene functions displaying the most significant evidence of balancing selection. Integrating comparative genomic analyses with metagenomic measurements of microbiota composition revealed that bacterial species whose genomes contain targets of balancing selection tend to be more abundant in the human gut than other bacterial species, suggesting a relationship between the loci under selection and fitness. Cumulatively, these findings reveal adaptive genomic diversity maintained by balancing selection within gut bacterial species.