ABSTRACT
As a novel protein post-translational modification, lysine succinylation is widely involved in metabolic regulation. In this study, we focused on the distribution of lysine succinylation sites and their physiological functions in Saccharopolyspora erythraea. Using high-resolution 4D label-free mass spectrometry, we identified a large, global protein succinylome in the hypersuccinylated strain E3ΔsucC. The results showed that succinylated proteins are predominantly involved in protein synthesis-related pathways (e.g., ribosomes, tRNA) and metabolic pathways such as the TCA cycle. Proteins in these pathways generally have a higher lysine content, suggesting that lysine succinylation may have a greater regulatory role in biochemical reactions involving acidic substrates. Motif analysis revealed that charged amino acids (D, E, K and R) and tryptophan (W) display a more regular distribution around acylation sites, implying that the polar effect between residues may be the key factor influencing lysine succinylation. Based on predicted protein structures, we highlighted the potential impact of lysine succinylation on enzyme activity in the TCA cycle. In conclusion, this study offers valuable insights into the regulation of lysine succinylation and contributes to a comprehensive understanding of its physiological functions in actinomycetes.
Statement of significance
This study provides a comprehensive protein succinylome of Saccharopolyspora erythraea, aiming to explore the integrated impact of succinylation on cellular primary and secondary metabolism. The analysis not only reveals the functional distribution of succinylated proteins in S. erythraea, but also explores how lysine proportion, amino acid residue polarity, protein structure, and other factors affect the probability of lysine succinylation. These findings should both enhance understanding of the physiological functions of protein acylation and help to uncover the potential regulatory mechanisms of lysine succinylation.
Adapting to external environmental challenges and maintaining internal metabolic stability are crucial for cell survival, and both require dynamic diversity in the proteome. Although transcription and translation can achieve this, they are costly in time and energy. Post-translational modifications (PTMs), in contrast, can effectively enhance the heterogeneity of protein function and are considered an elegant mechanism for cells to regulate metabolism dynamically [1]. To date, over 600 PTMs have been identified, with approximately half of all proteinogenic amino acids capable of undergoing modification [2]. Notably, lysine acylation can respond rapidly to intracellular metabolism, making it a prominent modification conserved across phylogeny [3].
Nε-lysine acylation is a process in which an acyl group is transferred to the ε-amino group of lysine, with acyl-CoAs serving as the main intracellular acyl donors [4]. Depending on the acyl group, lysine acylation can be classified into different types, such as acetylation, malonylation, and succinylation. Mass spectrometry has enabled the detection of thousands of acylation sites in various organisms, which highlights the prominent role that lysine acylation plays in regulating cell growth and metabolic functions [5]. Nevertheless, the intracellular distribution patterns of lysine acylation sites remain poorly understood. Identifying the key factors that affect lysine acylation could provide valuable insights into its physiological functions.
Actinomycetes are a rich source of natural products, including polyketides and terpenes, which are synthesized from acyl-coenzyme A (acyl-CoA) precursors [6]. Saccharopolyspora erythraea is one such organism; it produces erythromycin, a polyketide synthesized from propionyl-CoA and methylmalonyl-CoA [7]. Succinyl-CoA plays a critical role in the tricarboxylic acid (TCA) cycle and is a crucial source of these two acyl-CoAs. It is therefore considered a key node linking primary and secondary metabolic pathways in S. erythraea. Notably, succinyl-CoA also functions as an acyl donor in protein succinylation, so studying this modification might be an effective way to reveal the regulatory mechanisms of primary and secondary metabolism in S. erythraea.
We conducted a high-resolution mass spectrometry-based proteomic analysis of the engineered strain E3ΔsucC, which has an elevated level of intracellular protein succinylation, to profile the succinylome of S. erythraea systematically [8]. We identified a total of 5,531 succinylation sites on 1,654 proteins at a 1% false discovery rate (FDR) (supplementary Figure S1, Table S1). On average, each modified protein carried 3.34 succinylation sites, and 23.12% (1,654/7,154) of all proteins carried at least one detected succinylation site. Among the modified proteins, 658 (39.8%) had only one succinylation site, while 90 (5.4%) had more than ten (Figure 1A, B). In particular, five proteins had 30 or more succinylation sites: 2-oxoglutarate dehydrogenase E1 component (36), DNA-directed RNA polymerase subunit beta rpoC (35), chaperone protein dnaK (32), elongation factor G fusA (32), and 60 kDa chaperonin groL (30). Subcellular localization analysis showed that 63.5% of succinylated proteins were located in the cytoplasm, while 14% were membrane-associated (Figure 1C). Previous studies in other microorganisms, namely Escherichia coli, Streptomyces coelicolor, and Bacillus subtilis, identified 2580/670, 673/427, and 2150/634 succinylation sites/succinylated proteins, respectively [9-11]. In comparison, our study identified a larger succinylome with broader protein coverage. A high density of succinylation sites within a sample can be crucial for investigating the distribution patterns of these sites and exploring their physiological functions. For instance, machine learning methods have recently been employed to predict acylation sites, and a more comprehensive acylation site library might support the training and validation of such methods [12].
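The summary statistics in this paragraph follow directly from the reported counts. The short sketch below only recomputes them from the numbers given in the text; the variable names are illustrative and are not taken from the authors' analysis pipeline.

```python
# Counts as reported in the text above.
total_sites = 5531            # succinylation sites identified
modified_proteins = 1654      # proteins with at least one site
proteome_size = 7154          # total proteins used as the denominator
single_site_proteins = 658    # proteins with exactly one site
over_ten_site_proteins = 90   # proteins with more than ten sites

summary = {
    # Mean number of sites per modified protein.
    "sites_per_protein": round(total_sites / modified_proteins, 2),
    # Share of the proteome that is succinylated, in percent.
    "proteome_coverage_pct": round(100 * modified_proteins / proteome_size, 2),
    # Share of modified proteins with a single site, in percent.
    "single_site_pct": round(100 * single_site_proteins / modified_proteins, 1),
    # Share of modified proteins with more than ten sites, in percent.
    "over_ten_sites_pct": round(100 * over_ten_site_proteins / modified_proteins, 1),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The recomputed values agree with those stated in the paragraph (3.34 sites per protein, 23.12%, 39.8%, and 5.4%).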
To shed light on the function of succinylation in cellular processes, we conducted a GO classification analysis. The classification results (supplementary Figure S2) for the molecular function, biological process, and cellular component categories showed that the largest groups of succinylated proteins are associated with catalytic activity, organonitrogen compound biosynthetic processes, and cytoplasm, which account for 24%, 33%, and 34% of the total succinylated proteins, respectively. Protein domains are the structural basis of protein function. Therefore, to further identify the functions associated with succinylation, we annotated the domains of the identified succinylated proteins and then carried out enrichment analysis (supplementary Table S2). As depicted in supplementary Figure S3, the enriched domains included DEAD/DEAH box helicase, histidine phosphatase superfamily (branch 1), cold-shock DNA-binding domain, and helix-turn-helix, which are predominantly associated with nucleotide binding. The KEGG enrichment results (Figure 2) highlighted that the succinylated proteins were mainly engaged in pathways such as ribosome, RNA degradation, citrate cycle (TCA cycle), and aminoacyl-tRNA biosynthesis. Previous research suggested that proteins in these pathways usually contain a higher proportion of basic amino acids (K, R), which aids in binding acidic substrates [13]. We therefore calculated the proportion of lysine residues in proteins within the above-mentioned KEGG pathways (supplementary Figure S4). Most of these pathways had a higher proportion of lysine residues, with ribosomal proteins exhibiting the highest lysine content at 8.38%, which is 4.76 times the average level of 1.76%. This implies that lysine succinylation might play a more important regulatory role in biochemical reactions involving acidic substrates.
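The lysine-content comparison can be illustrated with a minimal sketch. It assumes that lysine content is the fraction of residues equal to K in each protein and that a pathway value is the unweighted mean over its proteins; the text does not state the averaging scheme, so this is an assumption. The function names and toy sequences are hypothetical.

```python
def lysine_fraction(sequence: str) -> float:
    """Fraction of residues in a protein sequence that are lysine (K)."""
    residues = sequence.strip().upper()
    if not residues:
        raise ValueError("empty sequence")
    return residues.count("K") / len(residues)


def mean_lysine_fraction(sequences) -> float:
    """Unweighted mean of per-protein lysine fractions (assumed averaging scheme)."""
    fractions = [lysine_fraction(s) for s in sequences]
    return sum(fractions) / len(fractions)


# Hypothetical toy sequences, not real S. erythraea proteins.
toy_pathway = ["MKKAK", "MAKR"]
toy_mean = mean_lysine_fraction(toy_pathway)  # (3/5 + 1/4) / 2

# Ratio of the two percentages reported in the text:
# ribosomal proteins (8.38%) versus the average level (1.76%).
fold_vs_average = round(8.38 / 1.76, 2)

print(f"toy pathway mean lysine fraction: {toy_mean:.3f}")
print(f"ribosomal vs average lysine content: {fold_vs_average}-fold")
```

A length-weighted alternative (total K divided by total residues across the pathway) would give slightly different values when protein lengths vary widely.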
Two pathways that are components of erythromycin synthesis, the "biosynthesis of 12-, 14- and 16-membered macrolides" pathway and the "polyketide sugar unit biosynthesis" pathway, were also enriched for succinylated proteins (supplementary Figure S5). These results suggest that succinylated proteins may play a direct role in the regulation of erythromycin synthesis. However, both pathways exhibited relatively low lysine contents. Because protein expression level may also affect acylation probability, we used transcriptome data from S. erythraea E3 under similar growth conditions to estimate the average expression level of proteins within each pathway. The data indicated that these two pathways are expressed at a higher level in S. erythraea, potentially raising the probability of lysine succinylation for these proteins (supplementary Figure S4, Table S3). Taken together, we speculate that the proportion of lysine residues in proteins and the intracellular protein concentration may be important factors affecting the distribution of protein acylation.
Furthermore, we analyzed flanking protein sequences (10 amino acids upstream and downstream of succinylated lysine sites) using Motif-X to explore the amino acid preferences adjacent to modified lysine residues (supplementary Table S4). The heat map displayed the significance of amino acids present around modification sites, with a noticeable degree of symmetry in their distribution (Figure 3). The acidic amino acids aspartic acid (D) and glutamic acid (E) appeared more frequently around the modification site, while basic amino acids had a lower frequency nearby but a significantly higher frequency further away. Nonpolar amino acids were less prevalent around the modification site, possibly due to differing levels of residue hydrophobicity and distinct positions within the protein.
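The window extraction that precedes motif analysis can be sketched as follows. This is not Motif-X itself, which tests enrichment against a background set; the sketch only extracts the ±10-residue windows and tallies residues by position. The function names, the padding character, and the toy sequence are assumptions made for illustration.

```python
from collections import Counter


def flanking_window(sequence: str, site: int, flank: int = 10, pad: str = "_") -> str:
    """Return the window of `flank` residues on each side of a modified lysine.

    `site` is the 1-based position of the lysine. Windows that run past
    either end of the protein are padded with `pad`.
    """
    idx = site - 1
    if sequence[idx] != "K":
        raise ValueError(f"position {site} is {sequence[idx]!r}, not K")
    left = sequence[max(0, idx - flank):idx]
    right = sequence[idx + 1:idx + 1 + flank]
    return left.rjust(flank, pad) + "K" + right.ljust(flank, pad)


def position_counts(windows, flank: int = 10, pad: str = "_"):
    """Count residues at each offset (-flank..-1, +1..+flank) from the site."""
    counts = {off: Counter() for off in range(-flank, flank + 1) if off != 0}
    for window in windows:
        for off in counts:
            residue = window[off + flank]
            if residue != pad:
                counts[off][residue] += 1
    return counts


# Hypothetical toy protein with succinylated lysines at positions 4 and 9.
toy_sequence = "MDEKRAEDKLV"
windows = [flanking_window(toy_sequence, site, flank=3) for site in (4, 9)]
counts = position_counts(windows, flank=3)
print(windows)
print(counts[-1])  # residues immediately upstream of the modified lysines
```

Turning these raw counts into a significance heat map would additionally require a background frequency for each residue, for example from windows around unmodified lysines.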
Previous acylome analyses have identified the common presence of the polar amino acids E, K, and R in acylation site motifs, but the underlying reasons have not been clearly described [11, 14]. In this study, motif analysis showed that charged amino acids (D, E, K and R) and tryptophan (W) exhibit a more regular distribution around acylation sites. Based on this result, we speculate that the polar effect between residues may be the key factor influencing lysine succinylation. The negative charge carried by the acidic amino acids D and E could interact with the lysine side chain, enhancing its nucleophilicity and reactivity and thereby increasing its susceptibility to react with succinyl-CoA. Notably, the longer side chain of E might extend its polar effect on lysine residues, explaining its higher occurrence frequency even at more distant positions. Similarly, the lower occurrence frequency of basic amino acids adjacent to acylation sites could result from a depolarization effect. However, the frequency of basic amino acids a few residues away from the acylation site increases significantly, possibly because the depolarization effect weakens and the recruitment of succinyl-CoA by basic residues becomes more prominent. The different distributions of K and R might also be attributed to the broader range of the depolarization effect caused by the longer side chain of R. Given the significant impact of neighboring amino acids on lysine succinylation, modifying the amino acids surrounding a lysine residue could be a novel strategy for regulating the level of lysine succinylation and enzyme activity.
In addition, we explored the distribution pattern of lysine succinylation on TCA cycle protein structures. Using the NetSurfP 3.0 algorithm, we predicted the secondary structure of succinylated proteins in the TCA cycle; the results are shown in supplementary Table S5 and Figure S6 [15]. Modified and unmodified lysine residues showed similar probability distributions across α-helix and β-sheet. The Wilcoxon test found no significant difference in solvent accessibility between modified and unmodified lysine residues (supplementary Figure S7). According to the 3D structures predicted by AlphaFold, most of the succinylated lysine residues were located on the protein surface, consistent with the high solvent accessibility of lysine residues. For multi-subunit protein complexes, these modifications could affect the binding between subunits and subsequently influence their function. We then used POCASO to predict possible active pockets of five single-chain proteins in the TCA cycle: citrate synthase, isocitrate dehydrogenase, aconitase, fumarate dehydrogenase, and malate dehydrogenase [16]. Several succinylated lysine residues were located around the active pockets or active sites, indicating a potential impact on protein activity (supplementary Figure S8). This analysis suggests that the function of multiple proteins in the TCA cycle might be regulated by succinylation.
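The solvent-accessibility comparison can be illustrated with a minimal rank-sum sketch. It assumes the unpaired (Mann-Whitney) form of the Wilcoxon test with a normal approximation and no tie or continuity correction; the text does not specify which variant or software was used. The function names and accessibility values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (U statistic for sample a, z score, p value). The variance
    carries no tie correction, so p values are approximate.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, z, p


# Hypothetical relative solvent accessibility values (0 to 1).
modified = [0.62, 0.71, 0.55, 0.80, 0.66]
unmodified = [0.60, 0.74, 0.52, 0.78, 0.69, 0.58]
u, z, p = rank_sum_test(modified, unmodified)
print(f"U = {u}, z = {z:.3f}, p = {p:.3f}")
```

For small samples or data with many ties, an exact or tie-corrected implementation from a statistics package would be preferable to this approximation.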
In summary, by employing high-resolution 4D label-free mass spectrometry and a highly succinylated strain, we identified a relatively large protein succinylome in S. erythraea. Our analysis revealed that succinylation tends to occur in biochemical pathways whose proteins bind acidic substrates, as these proteins have a higher lysine content. Additionally, polar amino acids surrounding acylation sites display a more regular distribution pattern, potentially due to the polar effect between amino acid residues. Lastly, we investigated the distribution of acylation sites in TCA cycle proteins based on predicted structures, highlighting the potential impact of acylation on enzyme activity. These findings may contribute to a comprehensive understanding of lysine succinylation regulation and provide valuable omics data for further research.