Conclusion
In the present study, a comprehensive analysis of the amino acid and nucleotide Hsp60 sequences from 19 phyla was carried out. For this purpose, two databases were built: database 1, consisting of 19220 amino acid Hsp60 sequences, and database 2, consisting of 1925 nucleotide Hsp60 sequences.
Multiple alignment, calculation of the percent identity (PID) of the amino acid sequences from database 1, and subsequent clustering of the obtained data were performed to establish the level of conservation of Hsp60. The results show that Hsp60 cannot be considered a highly conserved protein, since the average PID values vary widely, from 10.1±0.5% (Chlamydiae #1 / Apicomplexa #2) to 97.9±0.0% (Mollusca #2 / Mollusca #3). However, the component analysis indicates a relatively constant amino acid composition of Hsp60, with a high content of aliphatic (Ala, Val, Gly, Leu, and Ile), charged (Glu, Lys, and Asp), and polar (Thr) amino acid residues. Thus, it can be assumed that the functional features of Hsp60 are determined not only by its sequence but also by its amino acid composition.
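As an illustration of the PID metric mentioned above, the following is a minimal sketch in Python. PID has several definitions in the literature; this sketch assumes one common choice (identical residues divided by the number of alignment columns in which neither sequence has a gap), which is not necessarily the definition used in this study. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two rows taken from the same multiple alignment.

    Assumption: columns where either sequence has a gap ('-') are ignored,
    and the denominator is the number of remaining (compared) columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1

    # Two sequences with no shared ungapped column get a PID of 0.
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Hypothetical toy alignment: 4 ungapped columns, 3 identical residues.
    print(percent_identity("MKA-L", "MRAGL"))
```

Averaging such pairwise values over all sequence pairs drawn from two clusters would give a cluster-to-cluster mean PID of the kind reported above.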
The nucleotide sequences from database 2 were analyzed to determine the genetic and evolutionary characteristics of Hsp60 genes from 17 phyla using conventional metrics. The GC content of the analyzed Hsp60 genes was comparable to or higher than the corresponding genomic values. It can be assumed that the Hsp60 genes are tightly controlled by DNA repair systems, which provide high resistance to spontaneous mutations in the third codon position and thereby increase their GC content. It was further found that natural selection plays a dominant role in the evolution of Hsp60 genes. According to the neutrality plot analysis, the contribution of mutational pressure to the codon usage of Hsp60 genes does not exceed 20% in 16 phyla; the exception is Euryarchaeota, whose Hsp60 genes are characterized by high mutational pressure, with a neutrality value of 32.4%. In addition, the direction of the mutational pressure affecting the third codon position was determined. The Hsp60 genes from Apicomplexa, Chlamydiae, Firmicutes, Streptophyta, Nematoda, Bacteroidetes, Mollusca, and Cyanobacteria are under AT mutational pressure, whereas those from Euryarchaeota, Ascomycota, Euglenozoa, Basidiomycota, Chlorophyta, and Actinobacteria are under GC mutational pressure. The Hsp60 genes from Chordata, Arthropoda, and Proteobacteria cannot be assigned to either group. However, further division by class showed an interesting result for Chordata: the Hsp60 genes from Fish were found to be under GC mutational pressure, while the Hsp60 genes from the other classes were AT-biased. Also noteworthy are four representatives of the Hyperoartia, Mammalia, and Leptocardii classes, whose Hsp60 genes fall into the Fish subgroup. This may be because they are all aquatic animals, which relates them to Fish.
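The neutrality plot analysis referred to above can be sketched as follows. In its usual form, the mean GC content of the first and second codon positions (GC12) is regressed on the GC content of the third position (GC3), and the regression slope is read as the relative contribution of mutational pressure (a slope of 0.324 would correspond to 32.4%). The sketch below is a simplified illustration, not the pipeline used in this study: it assumes in-frame coding sequences, uses plain GC3 rather than GC3 restricted to synonymous sites, and all function names are hypothetical.

```python
def gc_by_position(cds: str) -> tuple[float, float]:
    """Return (GC12, GC3) for an in-frame coding sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")

    gc_counts = [0, 0, 0]
    for i in range(usable):
        if seq[i] in "GC":
            gc_counts[i % 3] += 1

    n_codons = usable // 3
    gc1, gc2, gc3 = (count / n_codons for count in gc_counts)
    return (gc1 + gc2) / 2, gc3


def regression_slope(xs: list[float], ys: list[float]) -> float:
    """Ordinary least-squares slope of y on x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    denominator = sum((x - mean_x) ** 2 for x in xs)
    if denominator == 0:
        raise ValueError("all x values are identical; slope is undefined")
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return numerator / denominator


def neutrality_slope(coding_sequences: list[str]) -> float:
    """Slope of GC12 regressed on GC3 across a set of genes."""
    points = [gc_by_position(cds) for cds in coding_sequences]
    gc12_values = [gc12 for gc12, _ in points]
    gc3_values = [gc3 for _, gc3 in points]
    return regression_slope(gc3_values, gc12_values)
```

Under this reading, a slope close to 0 indicates that GC12 is largely independent of GC3 (selection dominates), while a slope close to 1 indicates that all three positions follow the same mutational bias.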
The effective number of codons (ENC) and the relative synonymous codon usage (RSCU) were used to assess codon usage and the level of codon bias. According to the ENC values, a moderate or high level of Hsp60 gene expression was observed, which is expected since Hsp60 is a ubiquitous protein. At the same time, the direction of mutational pressure in the Hsp60 genes did not affect the size of the “codon dictionary” used to encode the genes. However, the analysis of synonymous codon bias showed that the average RSCU values for A/T-ending codons were higher for Hsp60 genes under AT mutational pressure, and those for G/C-ending codons were higher for genes under GC mutational pressure. The TAA codon was the most preferred stop codon for the Hsp60 genes. When the Hsp60 genes were grouped by the direction of mutational pressure and the average RSCU values of the groups were compared, the number of over-represented (RSCU > 1.5) and under-represented (RSCU < 0.5) codons was greater for the Hsp60 genes under GC mutational pressure than for the AT-biased Hsp60 genes. It can be assumed that an increase in the GC content of Hsp60 genes ensures the optimization of codon usage.
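For reference, RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. A minimal Python sketch is given below. It assumes the standard genetic code and an in-frame coding sequence, treats the three stop codons as one synonymous family, and uses the 1.5 and 0.5 thresholds quoted above; the function names `rscu` and `classify_codons` are hypothetical and do not come from this study.

```python
from collections import Counter

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds: str) -> dict[str, float]:
    """RSCU per codon; families absent from the sequence are omitted."""
    seq = cds.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

    # Group codons into synonymous families by encoded amino acid.
    families: dict[str, list[str]] = {}
    for codon, amino_acid in CODON_TABLE.items():
        families.setdefault(amino_acid, []).append(codon)

    values: dict[str, float] = {}
    for codons in families.values():
        total = sum(counts[codon] for codon in codons)
        if total == 0:
            continue  # amino acid not present; RSCU is undefined
        for codon in codons:
            # observed count / (family total / number of synonymous codons)
            values[codon] = counts[codon] * len(codons) / total
    return values


def classify_codons(values: dict[str, float], high: float = 1.5, low: float = 0.5):
    """Split codons into over- and under-represented lists."""
    over = sorted(codon for codon, value in values.items() if value > high)
    under = sorted(codon for codon, value in values.items() if value < low)
    return over, under
```

Note that single-codon amino acids (Met and Trp) always receive an RSCU of 1 in this scheme and therefore carry no information about bias.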
Thus, the present study demonstrates that Hsp60 is a protein inherent in all living organisms, characterized by a relatively constant amino acid composition and low sequence conservation. The evolution of its gene is dominated by natural selection, and the gene shows high resistance to spontaneous mutations in the third (synonymous) codon position.