Most proteins with diurnal changes in abundance fluctuate independently of their transcript levels and belong to specific functional networks
To determine which cellular and physiological processes show diurnal protein abundance dynamics, we grouped all significantly changing proteins with similar accumulation profiles into clusters and then subjected these clusters to gene set enrichment analysis (GSEA). Each of the resulting six clusters (CL1 to CL6) is enriched for proteins involved in specific processes (P value ≤ 0.01, gene set size ≥ 2) (Figure 1A-B; Supplemental Data 1-6). Cluster CL1 contains proteins involved in RNA splicing that decrease before dawn, while CL2 is enriched in proteins that peak early in the light period and have roles in nitrogen metabolism, iron homeostasis, responses to gravity and chloroplast stroma protein import. CL5 contains proteins with peak abundance before dawn and lower abundance before dusk that have specific functions in aerobic respiration and proteasome complex formation, while proteins in CL3 have functions in membrane-related processes and ribosome biogenesis. The CL3 abundance profile is complex, with a sharp minimum during the second half of the light period that is also found at the transcript level for selected proteins in this group (see below). Clusters CL4 and CL6 exhibit distinct and opposing wave-form diurnal changes in protein abundance, with proteins in CL4 peaking during the light period and proteins in CL6 peaking early in the dark period. CL4 is enriched for proteins involved in nitrogen metabolism and photosynthesis, which are required for light-dependent carbon assimilation to support growth, while CL6 is enriched for proteins involved in metabolic and RNA-related processes that indicate a systemic change in the plant cell environment.
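The enrichment filter applied to each cluster (P value ≤ 0.01, gene set size ≥ 2) can be illustrated with a minimal sketch. It is not the GSEA procedure used for the analysis: a one-sided hypergeometric over-representation test is swapped in as a simpler stand-in, and "gene set size" is assumed here to mean the number of cluster proteins annotated to a term. All protein identifiers, gene set contents and function names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(cluster, gene_set, background):
    """One-sided P(X >= k), where k is the overlap of cluster and gene set.

    Both inputs are restricted to the background (all quantified proteins).
    """
    background = set(background)
    cluster = set(cluster) & background
    gene_set = set(gene_set) & background
    n_bg, n_set, n_cl = len(background), len(gene_set), len(cluster)
    k = len(cluster & gene_set)
    upper = min(n_set, n_cl)
    tail = sum(
        comb(n_set, i) * comb(n_bg - n_set, n_cl - i)
        for i in range(k, upper + 1)
    )
    return tail / comb(n_bg, n_cl)


def enriched_sets(cluster, gene_sets, background, p_max=0.01, min_size=2):
    """Return {gene set name: P value} for sets passing both cutoffs."""
    hits = {}
    for name, members in gene_sets.items():
        overlap = set(cluster) & set(members) & set(background)
        if len(overlap) < min_size:  # assumed meaning of "gene set size >= 2"
            continue
        p = hypergeom_enrichment_p(cluster, members, background)
        if p <= p_max:
            hits[name] = p
    return hits


# Hypothetical toy data: 20 quantified proteins, one cluster of five.
background = [f"P{i}" for i in range(1, 21)]
cluster = ["P1", "P2", "P3", "P4", "P5"]
gene_sets = {
    "RNA splicing": ["P1", "P2", "P3", "P4"],
    "single overlap": ["P1", "P10", "P11", "P12"],
    "weak overlap": ["P1", "P2", "P10", "P11", "P12", "P13", "P14", "P15"],
}
hits = enriched_sets(cluster, gene_sets, background)
print(hits)
```

In this toy example only the "RNA splicing" set passes both cutoffs; the other two sets fail the overlap-size and P value filters, respectively.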
We then compared the abundance profiles of the proteins in CL1 to CL6 with their corresponding transcript expression profiles, using transcriptome data from whole Arabidopsis rosettes grown and harvested under comparable conditions and at similar time-points (Figure 1C). This revealed that the dynamics of CL1 to CL6 protein changes are not strictly correlated with the diurnal abundance changes of their transcripts (Figure 1C; Supplemental Data 1-6), consistent with previous studies (Baerenfaller et al., 2012; Abraham et al., 2016; Graf et al., 2017; Seaton et al., 2018). We also determined the subcellular compartmentalization of proteins in each cluster using the consensus localization predictor SUBAcon (SUBA3; http://suba3.plantenergy.uwa.edu.au; Figure 1D) (Tanz et al., 2013).
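One simple way to quantify how closely a protein profile tracks its transcript profile is a Pearson correlation across matched time-points. The sketch below is illustrative only: the text does not state which measure was used for the comparison, and the time-course values and variable names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long time-course profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return float("nan")  # a flat profile has no defined correlation
    return sum(a * b for a, b in zip(dx, dy)) / denom


# Hypothetical relative abundances at six matched time-points.
protein = [1.0, 1.2, 1.5, 1.3, 1.0, 0.9]
transcript_concordant = [2 * v + 1 for v in protein]  # same shape, rescaled
transcript_opposed = [-v for v in protein]            # mirrored profile

r_concordant = pearson(protein, transcript_concordant)
r_opposed = pearson(protein, transcript_opposed)
print(r_concordant, r_opposed)
```

A coefficient near 1 indicates that protein and transcript change together, while values near 0 or below indicate that protein abundance fluctuates independently of, or in opposition to, the transcript.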
Next, we built functional association networks between the proteins in each cluster using STRING-DB (http://string-db.org; Figure 2). STRING-DB scoring and Cytoscape visualization allowed us to estimate association confidence between protein nodes, while subcellular localization information resolved co-localized nodes at the protein level. Second-level nodes not found in our data were also included to better depict the broader relationships between significantly changing proteins. This analysis strategy resolved multiple protein hubs that vary in their degree of interconnectedness and belong to relevant biological processes, with some processes complementing those enriched by GSEA. Proteins with no known connections above the set threshold for the association networks were removed for network visualization, but they may also serve on an individual basis to further connect the GSEA and STRING-DB analyses. Using our STRING-DB analysis approach, we defined network structures for proteins belonging to: RNA splicing (CL1) and processing (CL6; RNA helicases and binding proteins), chloroplast-related processes (CL4 and 5, light detection; CL1 and CL5, carbohydrate/starch metabolism; CL2, redox regulation), cell metabolism (CL4, nitrogen and fatty acid metabolism), secretion and intracellular transport (CL2), cell wall biosynthesis (CL5) as well as cytosolic (CL1, 3 and 5), mitochondrial (CL3) and plastidial (CL4 and 5) protein translation (Figure 2). Taken together, our GSEA and functional association analyses indicate that the complementary use of system-wide analysis approaches provides a more comprehensive view of the types of proteins whose abundances are diurnally modulated in cellular processes.
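The network filtering described above (keeping associations above a confidence threshold, dropping unconnected proteins, and identifying hubs) can be sketched as follows. This is a minimal illustration on a hand-written edge list, not a reproduction of the STRING-DB and Cytoscape workflow. The node names, the scores on a 0 to 1 scale, the 0.7 cutoff and the hub definition (degree ≥ 2) are all assumptions; the threshold used in the study is not stated in this section.

```python
from collections import defaultdict


def build_network(edges, min_score):
    """Keep associations scoring >= min_score; unconnected nodes drop out."""
    adjacency = defaultdict(set)
    for a, b, score in edges:
        if a != b and score >= min_score:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def hubs(adjacency, min_degree):
    """Nodes with at least min_degree neighbours, most connected first."""
    selected = [n for n, nb in adjacency.items() if len(nb) >= min_degree]
    return sorted(selected, key=lambda n: (-len(adjacency[n]), n))


# Hypothetical association scores between proteins A to G.
edges = [
    ("A", "B", 0.95),
    ("A", "C", 0.80),
    ("A", "D", 0.72),
    ("B", "C", 0.91),
    ("E", "F", 0.30),  # below the cutoff: E and F are removed
    ("D", "G", 0.40),  # below the cutoff: G is removed
]
network = build_network(edges, min_score=0.7)  # assumed cutoff
hub_nodes = hubs(network, min_degree=2)        # assumed hub definition
print(network)
print(hub_nodes)
```

Raising or lowering the assumed cutoff changes which proteins remain in the network, which is why proteins removed at one threshold may still be informative when considered individually.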