Analyses of protein structure and chemistry
To visualize the putative tertiary structure of proteins encoded by
CORFs under balancing selection, the Protein Data Bank was searched for
homologs with resolved structures at
www.rcsb.org (Burley et al., 2021). Top hits
from Escherichia coli were used for downstream analysis. To
identify the putative location of CORFs under balancing selection within
the E. coli protein complex, CORF sequences from the
representative MAGs for SGBs were aligned against the sequence of the
E. coli protein and visualized with the Mol* 3D Viewer. Disorder
was estimated along the protein sequence with IUPred2 (Mészáros et al.,
2018) and hydrophobicity was calculated using the Kyte and Doolittle
method (Kyte and Doolittle, 1982) with a window size of 21.
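
For illustration, the windowed hydropathy calculation can be sketched as follows. This is a minimal sketch, not the implementation used in the analysis: the function name kyte_doolittle_profile is hypothetical, the input is assumed to contain only the 20 standard amino acids, and each score is assumed to be the unweighted mean of the Kyte and Doolittle hydropathy values over one window of 21 residues.

```python
# Kyte and Doolittle (1982) hydropathy index for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def kyte_doolittle_profile(sequence, window=21):
    """Return the mean hydropathy of every full window along the sequence.

    Element i of the result covers residues i .. i + window - 1 (0-based).
    Assumption: the score is attributed to the window centre, i + window // 2,
    when plotted; the source text does not specify this.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")

    # Raises KeyError on non-standard residues (e.g. X, B, Z, *).
    scores = [KD_SCALE[aa] for aa in seq]

    # Running sum so each step adds one residue and drops one.
    running = sum(scores[:window])
    profile = [running / window]
    for i in range(window, len(scores)):
        running += scores[i] - scores[i - window]
        profile.append(running / window)
    return profile
```

A sequence of length N yields N - 20 scores with the default window, so the first and last 10 residues have no centred value under this assumption.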