Comparative Quantitative Proteomic Analysis Reveals Differentially Expressed Proteins Involved in Grain-Filling Rate of Rice at the Early Ripening Stage

Abstract

Background
Grain-filling ability is a determinant of potential yield in rice. Grain filling is a biological process of starch accumulation that involves a large number of major enzymes of carbohydrate metabolism in the developing rice endosperm, and it is governed by a complex balance between the sink (grains) and the source (assimilates). This study was carried out to analyze the proteomic profile of rice grains and to identify proteins associated with a high grain-filling rate during the early ripening period.

Results
TMT labeling and LC-MS/MS were used to analyze the proteomic profiles of grains from two rice cultivars with contrasting grain-filling rates, in order to identify proteins associated with a high grain-filling rate during the early ripening stage. The two cultivars differed significantly in grain-filling rate during the period of 0–3 days after full heading, indicating that the cultivars were appropriately selected for quantitative proteomic analysis. A total of 219 differentially expressed proteins were identified in grains between the two cultivars, providing a database for quantitative proteomics of rice grains during the grain-filling stage. Elevated expression of many enzymes involved in starch and sucrose metabolism was observed during grain filling. GO and KEGG analyses revealed that the largest portion of differentially expressed proteins was associated with carbohydrate metabolism and transport, suggesting a more specific function of those proteins in rice grain development. Starch, sucrose, fatty acid and amino acid biosynthesis and metabolism were significantly enriched pathways.

Conclusions
Analysis of protein-protein interactions indicated that the same proteins are implicated in several biological processes, such as carbohydrate transport and metabolism, fatty acid metabolism, energy production and conversion, and secondary metabolite biosynthesis. The data provide valuable information about the roles of biosynthesis, transport and metabolism of carbohydrates and amino acids in rice grain filling and development. The results represent a valuable foundation for further studies of the roles of differentially expressed proteins underlying the grain-filling stage in rice and their potential impacts on rice productivity.

Background
Rice (Oryza sativa L.) is one of the most important food crops worldwide, particularly in Asia and the Pacific region. Improving rice productivity is therefore of great importance to keep pace with rapid population growth and economic development and to ensure a sustainable supply of human food and animal feed [1]. Grain yield, one of the essential yield parameters, is defined as the product of yield sink capacity and filling efficiency [2]. Grain filling is a pivotal determinant of grain weight and has a great impact on potential rice yield. High-yielding rice cultivars developed in many rice-producing countries are characterized by extra-large sink capacity but differ greatly in grain-filling ability [3,4]. Although both grain-filling rate and duration are associated with grain weight, high grain weight has been attributed to a high grain-filling rate [5]. High-yielding cultivars that have large and/or extra-heavy panicles with a large number of spikelets per panicle have become available, e.g., the New Plant Type, hybrid rice and 'super' rice [4,6,7]; however, because of their poor grain filling, these cultivars do not reach their high yield potential. Grain filling is governed by a complex balance between the sink (grains) and the source (assimilates) [4].
Modern high-yielding rice cultivars with large panicles have a large sink capacity; however, an insufficient carbohydrate source and export might cause poor filling of inferior spikelets, reduce the seed-setting rate and ultimately reduce grain yield [2]. On the other hand, efficient coordination of source capacity (assimilates), carbohydrate transport efficiency (flow) and sink activity is crucial for increasing grain-filling rate and hence rice yield [8,9]. Photoassimilates and the nonstructural carbohydrates (NSCs) stored in the culms and sheaths before heading are the most important sources of assimilated carbon during rice grain filling [10,11]. Enhancing NSC accumulation in the culms and NSC transport to the grains improves sink capacity, grain-filling rate and grain yield [2,11].
Therefore, it is of great importance to dissect the molecular mechanisms of culm NSC accumulation and transport in order to maximize rice yield potential. Unloading of sugars into grains during grain filling is affected by the photosynthetic capacity of the plant and by the efficiency of NSC flow from the culms to the panicle. However, the abundance of NSCs enhances sink activity and promotes grain filling at the early stage of grain filling [10]. Transport of sucrose from the source to the sink is influenced by three factors, i.e., phloem loading, transport distance and phloem unloading. Transport of assimilates from the source to the sink is also influenced by the function and structure of the vascular bundles, including their size, number, development and flow capacity [12]. Environmental factors that reduce photosynthetic capacity and thereby affect the accumulation and transport of culm NSCs, such as heat stress, water deficit, low radiation and nutrient deficiency, enhance the transport of culm NSCs [13–16].
Starch, which contributes 80–90% of the dry weight of unpolished rice grains, is the main storage form of NSCs in the sheaths and culms; it accumulates during the vegetative growth period and declines sharply during grain filling [17,18]. Culm NSCs contribute 10–40% of grain yield in rice; however, great variation among rice cultivars in the accumulation of NSCs in the culms and their transport to the grains has been observed [2,19].
Although the findings of previous studies are clearly helpful in understanding the physiological basis of grain filling in rice, more effort is needed to dissect the underlying molecular mechanisms. Therefore, the current study was conducted to analyze the proteomic profile of rice grains and to identify proteins associated with a high grain-filling rate during the early ripening period.

Analysis of grain-filling rate
The cultivar Xiangzaoxian 42 significantly surpassed the cultivar Xiangzaoxian 24 in grain-filling rate. The grain-filling process was well fitted by the logistic equation for both cultivars, with coefficients of determination (R²) of 0.991 and 0.987 for Xiangzaoxian 24 and Xiangzaoxian 42, respectively (Fig. 1a). The grain-filling rate, calculated from the logistic equations, was more than 2.0 times higher in Xiangzaoxian 42 than in Xiangzaoxian 24 during the period of 0–3 days after full heading (Fig. 1b).
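The exact logistic form used for the fits is not given in this section. As a minimal sketch, assuming the common three-parameter form W(t) = A / (1 + B·exp(−k·t)), the grain-filling rate is the derivative dW/dt, and the mean rate over 0–3 days follows from the difference in fitted grain weights. The parameter values below are hypothetical placeholders, not the fitted values for either cultivar.

```python
import math


def grain_weight(t, a, b, k):
    """Assumed logistic growth curve W(t) = a / (1 + b * exp(-k * t))."""
    return a / (1.0 + b * math.exp(-k * t))


def filling_rate(t, a, b, k):
    """Instantaneous grain-filling rate dW/dt of the assumed logistic curve."""
    e = math.exp(-k * t)
    return a * b * k * e / (1.0 + b * e) ** 2


def mean_filling_rate(t0, t1, a, b, k):
    """Mean grain-filling rate between days t0 and t1."""
    return (grain_weight(t1, a, b, k) - grain_weight(t0, a, b, k)) / (t1 - t0)


# Hypothetical parameters (illustration only, not taken from the study).
slow_cultivar = {"a": 25.0, "b": 40.0, "k": 0.25}
fast_cultivar = {"a": 25.0, "b": 20.0, "k": 0.30}

ratio = mean_filling_rate(0, 3, **fast_cultivar) / mean_filling_rate(0, 3, **slow_cultivar)
print(f"Mean rate ratio over 0-3 days (fast/slow): {ratio:.2f}")
```

With fitted parameters for each cultivar, the same ratio would give the fold difference in early grain-filling rate reported in Fig. 1b.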

Protein profiles revealed differentially expressed proteins between the two rice cultivars during the early grain-filling stage
To further uncover the molecular mechanisms underlying the early grain-filling period in the two rice cultivars, TMT labeling and LC-MS/MS were employed to characterize the proteomic profiles of grains during the early ripening period (3 days after full heading). Quality control filtering resulted in a total of 8143 and 6808 highly reproducible proteins that could be quantified in Xiangzaoxian 24 and Xiangzaoxian 42, respectively. These highly reproducible proteins were used in the analysis of differentially abundant proteins during the early period of grain filling. Protein profiling identified 219 differentially expressed proteins in grains between Xiangzaoxian 24 and Xiangzaoxian 42 at 3 days after full heading. The molecular weight of those proteins ranged from 5.51 to 123.03 kDa, and the number of peptides per protein ranged from 1 to 27. Of the 219 differentially expressed proteins, 112 were up-regulated in Xiangzaoxian 24 and down-regulated in Xiangzaoxian 42, while the remaining 107 were down-regulated in Xiangzaoxian 24 and up-regulated in Xiangzaoxian 42 (Suppl. Table 2). The expression ratios of the differentially expressed proteins ranged from 0.11 to 55.55.

GO and KEGG annotations of differentially expressed proteins
All differentially expressed proteins were annotated against the Gene Ontology (GO) database and classified by biological process (BP), cellular compartment (CC) and molecular function (MF).
In the BP category, the proteins were mostly involved in metabolic processes, cellular processes and single-organism processes. Differentially abundant proteins assigned to the MF category were mainly implicated in catalytic activity and binding. The most prominent terms in the CC category were cell, membrane and organelle (Fig. 2).
Analysis of the 219 differentially expressed proteins against the KEGG database revealed that they are mainly associated with general function prediction (20), post-translational modification, protein turnover and chaperones (17), energy production and conversion (12), carbohydrate transport and metabolism (12), and secondary metabolite biosynthesis, transport and catabolism (9) (Fig. 3). Comparing the two cultivars for differential expression of the proteins involved in carbohydrate transport and metabolism and in secondary metabolite biosynthesis indicated that, with respect to biological process, these proteins were detected in Xiangzaoxian 42 (Suppl. Table 3).

Functional enrichment of differentially expressed proteins
After the differentially expressed proteins were classified into categories, their enrichment was assessed using the −log10(P-value). According to subcellular location, proteins exhibiting increased expression during the early ripening period of the grain-filling stage were mainly enriched in the chloroplast, cytoplasm, nucleus, mitochondria and plasma membrane, whereas proteins exhibiting decreased expression were mainly enriched in the chloroplast, cytoplasm, extracellular compartment, mitochondria, nucleus and plasma membrane (Suppl. Table 2).
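The statistical test behind the −log10(P-value) is not named in this section. A minimal sketch, assuming a one-sided hypergeometric (Fisher-type) test for over-representation, is given below. The background size and the number of annotated background proteins are hypothetical; only the counts of 219 differentially expressed proteins and 12 carbohydrate-related proteins come from the text.

```python
import math


def hypergeom_upper_tail(k, n, big_k, big_n):
    """P(X >= k) when n proteins are drawn from big_n, of which big_k carry the annotation."""
    total = math.comb(big_n, n)
    upper = min(n, big_k)
    favourable = sum(
        math.comb(big_k, i) * math.comb(big_n - big_k, n - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def neg_log10(p):
    """Transform a P-value into the -log10 scale used for enrichment plots."""
    return -math.log10(p)


# 219 DEPs and 12 carbohydrate-related DEPs are from the text;
# the background of 8000 proteins with 150 annotated is a hypothetical assumption.
p_value = hypergeom_upper_tail(k=12, n=219, big_k=150, big_n=8000)
print(f"P = {p_value:.3g}, -log10(P) = {neg_log10(p_value):.2f}")
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for large backgrounds, at the cost of speed for very large category sizes.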

Classification of the identified proteins based on subcellular location
Subcellular localization annotation was used to determine the abundance of differentially expressed proteins in each subcellular location (Fig. 4; Suppl. Table 2). Proteins associated with the chloroplast, cytoplasm and nucleus were mainly induced during the early ripening period of grain filling in rice. Notably, differential expression of proteins was also detected in the plasma membrane, extracellular compartment and mitochondria during the grain-filling period. Subcellular localization indicated that 1, 2, 3, 3, 10, 10, 14, 36, 48 and 92 differentially expressed proteins are localized to the vacuolar membrane, endoplasmic reticulum, cytoskeleton, peroxisome, mitochondria, plasma membrane, extracellular compartment, nucleus, cytoplasm and chloroplast, respectively. Protein domain analysis indicated that the Alpha/Beta hydrolase fold, the Glycoside hydrolase superfamily and the Glycoside hydrolase, catalytic domain were the top three upregulated protein domain groups, mapping to 8, 7 and 6 proteins, respectively. For the downregulated proteins, START-like and aspartic peptidase domains were the most abundant groups (Fig. 5; Suppl. Table 3).
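As a quick consistency check, the per-compartment counts listed above can be tallied against the 219 differentially expressed proteins. The short sketch below uses only the counts given in the text; the helper name `share` is illustrative.

```python
# Counts of differentially expressed proteins per compartment, as listed in the text.
localization_counts = {
    "vacuolar membrane": 1,
    "endoplasmic reticulum": 2,
    "cytoskeleton": 3,
    "peroxisome": 3,
    "mitochondria": 10,
    "plasma membrane": 10,
    "extracellular compartment": 14,
    "nucleus": 36,
    "cytoplasm": 48,
    "chloroplast": 92,
}

total_proteins = sum(localization_counts.values())


def share(compartment):
    """Fraction of all differentially expressed proteins assigned to one compartment."""
    return localization_counts[compartment] / total_proteins


largest = max(localization_counts, key=localization_counts.get)
print(f"Total: {total_proteins}; largest group: {largest} ({share(largest):.1%})")
```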

Functional enrichment-based clustering
KEGG pathway enrichment analysis identified 22 significantly enriched pathways among the highly expressed proteins, including starch and sucrose metabolism, galactose metabolism, fatty acid biosynthesis, fatty acid elongation, MAPK signaling pathway (plant), fructose and mannose metabolism, amino acid metabolism, cyanoamino acid metabolism and oxidative phosphorylation. Likewise, 22 significantly enriched pathways were identified among the low expressed proteins, including starch and sucrose metabolism, glycolysis, biosynthesis of secondary metabolites, glycerophospholipid metabolism, pantothenate and CoA biosynthesis, pyruvate metabolism, amino sugar and nucleotide sugar metabolism, and the metabolism of purine, pyrimidine, glycine, serine, threonine, arginine, proline, tryptophan and glutathione (Fig. 6; Suppl. Table 3).
Based on the transformed Z score, quantile-based analysis classified the proteins into four groups (Q1, Q2, Q3 and Q4) ranked by −log10(P-value), and each quantified protein was assigned to its respective quantile. Accordingly, quantified proteins were allocated to four quantiles Q1 (> 1. (1/1.2), and Q4 (< 1/1.3) (Fig. 8a&b). The Q1 group is enriched in hydrolase activity acting on glycosyl bonds in the MF category. The Q2 group is enriched in lipid transport in the BP category, and in metal ion binding and cation binding in the MF category. The Q3 group is enriched in formaldehyde catabolic process and hexose metabolic process in the BP category, and in transferase activity (transferring glycosyl groups), S-formylglutathione hydrolase activity and ATP phosphoribosyltransferase activity in the MF category. The Q4 group, however, is enriched in regulation of proteolysis and negative regulation of hydrolase activity in the BP category, and in peptidase inhibitor activity, peptidase regulator activity, ADP binding, endopeptidase regulator activity, endopeptidase inhibitor activity, enzyme inhibitor activity, enzyme regulator activity and molecular function regulator in the MF category (Fig. 8a&b).
KEGG pathway clustering of the four protein groups (Q1, Q2, Q3 and Q4) was also carried out (Fig. 8d). The Q1 group is enriched in the ubiquinone and other terpenoid-quinone biosynthesis pathway. The Q2 group is enriched in the pentose and glucuronate interconversions and the biosynthesis of secondary metabolites pathways. The Q3 group is enriched in the biosynthesis of secondary metabolites pathway. However, no KEGG pathways were assigned to the Q4 group (Fig. 8c).

Functional protein interaction networks of differentially abundant proteins
The STRING database v10.5 identified 22 functional protein interaction networks, 9 of which are downregulated and the remaining 13 upregulated (Fig. 9). Seven protein interaction networks are located in the chloroplast, 6 in the cytoplasm, 5 in the nucleus, 2 in the mitochondria, and one each in the cytoskeleton and the extracellular compartment (Fig. 8; Table 1). The seven interaction proteins of the chloroplast network comprise 4 downregulated proteins, i.e., glutathione S-transferase, heterogeneous nuclear ribonucleoprotein A1/A3, small subunit ribosomal protein S17e and xylanase inhibitor, and 3 upregulated proteins, i.e., glutathione synthase, transferase CAF17 (mitochondrial) and tropinone reductase I. The cytoplasm protein interaction networks comprise one downregulated interaction, i.e., dUTP pyrophosphatase, and 5 upregulated interactions, i.e., peptide chain release factor subunit 1, small subunit ribosomal protein S2e, large subunit ribosomal protein L24e, tyrosine aminotransferase and glycine hydroxymethyltransferase.
The nucleus protein interaction networks include two downregulated interactions, i.e., dUTP pyrophosphatase and translation initiation factor 4A, and three upregulated interactions, i.e., translation initiation factor 3 subunit B, large subunit ribosomal protein L19e and large subunit ribosomal protein L27Ae (Fig. 8; Table 1). Additional information on the functional protein interaction networks is shown in Suppl. Table 3.

Discussion
Proteomic profiling is a powerful and efficient molecular tool for exploring the mechanisms underlying various biological processes in living organisms. However, its efficiency in identifying the proteins underlying a given biological process depends greatly on the proteomics procedure applied and on the experimental system [28–33]. Although proteomic analyses have been widely employed to dissect the molecular basis of various biological processes in plants, in particular stress tolerance [29,32,34–39], to the best of our knowledge this is the first comprehensive study conducted to dissect the proteins involved in the grain-filling rate of rice. Seed development in rice is a complex biological process that directly affects potential yield and is regulated by complex networks involving several transcription factors [40]. In the current study, two rice cultivars, Xiangzaoxian 24 and Xiangzaoxian 42, with contrasting grain-filling rates during the early ripening period were used to identify proteins differentially expressed during grain filling. The coefficients of determination (R²) of the logistic fits to the grain-filling process were 0.991 and 0.987 for Xiangzaoxian 24 and Xiangzaoxian 42, respectively, suggesting that the logistic equation reliably describes the grain-filling process in rice, which is consistent with previous reports [5,41]. The grain-filling rate of Xiangzaoxian 42 during the period of 0–3 days after full heading, as estimated from the logistic fits, was 2 times higher than that of Xiangzaoxian 24, indicating that the cultivars were appropriately selected for quantitative proteomic analysis.
TMT-labeled quantitative LC-MS/MS proteomics and LC-MS/MS metabolomics, which rely on high-throughput quantitative detection, were employed for quantitative proteomic analysis of the grain samples harvested from the two rice cultivars 3 days after heading. Quality control filtering identified a total of 8,143 and 6,808 highly reproducible proteins that were quantified in Xiangzaoxian 24 and Xiangzaoxian 42, respectively, and these were used in the analysis of differentially expressed proteins. The TMT procedure is a powerful approach in quantitative proteomic analysis that offers higher sensitivity [36]. Through annotation and identification of differentially expressed proteins, the key genes/proteins and pathways underlying the early period of the grain-filling stage in rice were dissected. The analysis aimed to obtain a more comprehensive understanding of the differences in grain protein profiles between the two phenotypically contrasting cultivars during the early ripening period of grain filling, and hence to further explore the molecular mechanisms underlying rice yield. Further bioinformatics analyses uncovered proteins that are crucial for the metabolic network and its regulation during the early period of the grain-filling stage. These findings provide a theoretical basis for dissecting the molecular mechanisms of grain filling and their impact on potential yield. Revealing the abundance of protein expression during grain filling provides essential information on grain formation and development and on starch accumulation, as well as on improving crop productivity. Our data revealed elevated expression of many enzymes involved in starch and sucrose metabolism during rice grain filling. The results of the GO and KEGG analyses revealed that the largest portion of differentially expressed proteins during grain filling was associated with metabolism.
Most of these proteins are implicated in carbohydrate transport and metabolism, energy production and conversion, biosynthesis and transport of secondary metabolites, and catabolism and metabolism of amino acids (Fig. 3; Suppl. Table 2). Moreover, most of those proteins are differentially expressed between the two cultivars during grain filling, indicating that primary metabolism is enhanced to facilitate grain filling. This is consistent with previous studies [42–46] showing that upregulation of metabolic proteins enhances seed development. The abundance of some metabolic proteins in plants, such as those involved in glycolysis, carbohydrate metabolism and amino acid synthesis, may reflect enhanced plant growth and development [47,48]. Our data showed that proteins involved in glycolysis and carbohydrate metabolism (e.g., Glycoside hydrolase superfamily, Xyloglucan endotransglucosylase/hydrolase, Starch synthase, chloroplastic/amyloplastic, Mannose-6-phosphate isomerase and Glycoside hydrolase superfamily; Suppl. Table 2) are among the most differentially regulated proteins during the early stage of grain filling, suggesting a more specific function of those proteins in grain development. Amino acid metabolism proteins, which are essential for the formation of the seed coat, endosperm and embryo, are also among the most highly regulated in rice grains. Several proteins involved in transport, storage, stress tolerance and defense (e.g., glucose-1-phosphate adenylyltransferase, heat shock proteins, MAPK signaling proteins, pathogenesis-related protein PRB1, granule-bound starch synthase and β-amylase) exhibited differential expression patterns during grain filling, which is consistent with previous findings in other crop plants and suggests a biological function of those proteins in rice grain filling [34,35].
According to the KEGG pathway enrichment analysis, fatty acid biosynthesis, fructose and mannose metabolism, glutathione metabolism, starch and sucrose metabolism, biosynthesis of secondary metabolites, and amino acid biosynthesis and metabolism were significantly enriched pathways (Fig. 6; Suppl. Table 3). The synthesis, metabolism and transport of carbohydrates and amino acids, which play crucial roles in starch and protein accumulation, were enhanced to promote grain filling and to accelerate the conversion between different substances [34,35,49]. In addition, domain enrichment analysis showed that the Alpha/Beta hydrolase fold, the Glycoside hydrolase superfamily, the Glycoside hydrolase, catalytic domain, START-like and Aspartic peptidase were the top protein domain groups with different dynamic expression patterns (Fig. 7; Suppl. Table 3).
To dissect the molecular mechanisms underlying protein expression during the early stage of grain filling, we used the online STRING database (v10.5) to obtain protein-protein interaction (PPI) information. The nodes in directed and undirected PPI networks represent individual proteins, and the lines show their relationships. The degree of each node and the combined score between each pair of nodes were used to evaluate the nodes of the interaction networks derived from the differentially expressed proteins. Analysis of protein-protein interactions indicated that the same proteins might be implicated in several biological processes, such as carbohydrate transport and metabolism, fatty acid metabolism, energy production and conversion, and secondary metabolite biosynthesis.
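To illustrate how node degree and combined score can be used together, the following minimal sketch counts, for each protein, the number of interaction partners whose combined score passes a cutoff. The edge list and protein labels are hypothetical, and the 0.7 cutoff is an assumption (it is the level STRING labels as high confidence); the cutoff applied in this study is not stated here.

```python
from collections import Counter


def node_degrees(edges, min_score=0.7):
    """Count interaction partners per protein, keeping edges with score >= min_score."""
    degrees = Counter()
    for protein_a, protein_b, combined_score in edges:
        if combined_score >= min_score:
            degrees[protein_a] += 1
            degrees[protein_b] += 1
    return degrees


# Hypothetical edges: (protein, protein, combined score on a 0-1 scale).
example_edges = [
    ("P1", "P2", 0.92),
    ("P1", "P3", 0.75),
    ("P2", "P3", 0.40),  # dropped: below the assumed cutoff
    ("P3", "P4", 0.81),
]

degrees = node_degrees(example_edges)
for protein, degree in sorted(degrees.items()):
    print(protein, degree)
```

Proteins with the highest degree after filtering are the candidate hub nodes of the network.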

Conclusions
TMT labeling and LC-MS/MS were employed to analyze the proteomic profiles of grains at the early ripening stage from two rice cultivars with contrasting grain-filling rates, and to further explore the molecular mechanisms underlying rice grain yield. Protein profiling identified 219 differentially expressed proteins in grains between the two cultivars at 3 days after full heading, providing a database for quantitative proteomics of rice grains during the grain-filling stage. Integrated analysis of the proteomic data revealed that the differentially expressed proteins covered the three functional categories, i.e., biological process (BP), cellular compartment (CC) and molecular function (MF), indicating the comprehensiveness of the proteomic profiling in this study. GO and KEGG pathway analyses showed that the largest groups of differentially abundant proteins are involved in carbohydrate transport and metabolism, energy production and conversion, secondary metabolite biosynthesis, transport and catabolism, and amino acid metabolism, indicating that primary metabolism is enhanced to facilitate grain filling. Analysis of functional protein interaction networks revealed that the same proteins are involved in several biological processes, such as carbohydrate transport and metabolism, fatty acid metabolism, energy production and conversion, and secondary metabolite biosynthesis. Our data provide valuable information about the roles of biosynthesis, transport and metabolism of carbohydrates and amino acids in rice grain filling and development. The results of this study also represent a valuable foundation for further studies of the roles of differentially expressed proteins underlying the grain-filling stage in rice and their potential impacts on rice productivity.

Plant materials and experiments
Two rice cultivars, Xiangzaoxian 24 and Xiangzaoxian 42, identified and provided by the Nongfeng Seed Industry Co., Ltd., Changsha, China, which exhibit contrasting grain-filling phenotypes during the early ripening period, were used in the current study. No voucher specimen of this material has been deposited in any publicly available herbarium. Xiangzaoxian 24 exhibits slower grain filling than Xiangzaoxian 42 during the early ripening period [5,41]. A field experiment was conducted in a randomized complete-block design with three replicates and a plot size of 40 m² in Yongan (28°09′ N, 113°37′ E, 43 m asl), Hunan Province, China, in the early rice-growing season of 2018. Soil physical and chemical analyses were performed on samples taken from the 0–20 cm layer before the experiment began. The basic physical and chemical characteristics of the experimental field soil are shown in Table 2.
Seeds were sown in trays on March 26, 2018, and 25-day-old seedlings were transplanted into the field on April 20 using a high-speed rice transplanter (PZ80-25, Dongfeng Iseki Agricultural Machinery Co., Ltd., Xiangyang, China). Seedlings were transplanted at a spacing of 25 cm between rows and 11 cm between plants within rows. To ensure a uniform plant population, missing plants were re-transplanted by hand 7 days later. Nitrogen was applied in three doses: 67.5 kg N ha⁻¹ as basal fertilizer (one day before transplanting), 27.0 kg N ha⁻¹ at early tillering (7 days after transplanting) and 40.5 kg N ha⁻¹ at panicle initiation. Phosphorus was applied at 67.5 kg P₂O₅ ha⁻¹ as a basal dose. Potassium was applied at 135 kg K₂O ha⁻¹, split equally between the basal dose and the panicle initiation dose. The field was kept flooded from transplanting until 7 days before maturity. Chemical treatments were applied to control diseases, insects and weeds.
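For orientation, the stated rates and spacing can be converted to a seasonal nitrogen total, per-plot amounts and a planting density. The sketch below uses only values given in the text (split doses, 40 m² plots, 25 cm × 11 cm spacing); the function names are illustrative.

```python
PLOT_AREA_M2 = 40.0      # plot size stated in the text
M2_PER_HA = 10_000.0
CM2_PER_M2 = 10_000.0

# Nitrogen doses in kg N per hectare, as stated in the text.
n_doses_kg_ha = {
    "basal": 67.5,
    "early tillering": 27.0,
    "panicle initiation": 40.5,
}


def per_plot_kg(rate_kg_ha, area_m2=PLOT_AREA_M2):
    """Convert a per-hectare rate to the amount needed for one plot."""
    return rate_kg_ha * area_m2 / M2_PER_HA


def hills_per_m2(row_spacing_cm, plant_spacing_cm):
    """Planting density implied by the row and within-row spacing."""
    return CM2_PER_M2 / (row_spacing_cm * plant_spacing_cm)


total_n_kg_ha = sum(n_doses_kg_ha.values())
print(f"Total N: {total_n_kg_ha:.1f} kg/ha, {per_plot_kg(total_n_kg_ha):.2f} kg per plot")
print(f"Density: {hills_per_m2(25, 11):.1f} hills per m^2")
```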

Protein extraction
Three days after heading, 3 tagged panicles from each plot were sampled for protein extraction. The grains in the middle part of the sampled panicles were hand threshed, frozen in liquid nitrogen for 1 min and stored at −80 °C until assayed.
Each grain sample was ground in liquid nitrogen and transferred to a 5-ml centrifuge tube. Lysis buffer (8 M urea, 10 mM dithiothreitol, 1% Triton X-100 and 1% Protease Inhibitor Cocktail) was added to the powder, followed by three rounds of sonication on ice using a high-intensity ultrasonic processor (Scientz). The remaining debris was removed by centrifugation at 20,000 g at 4 °C for 10 min. The protein was precipitated with cold 20% TCA for 2 h at −20 °C. After centrifugation at 12,000 g at 4 °C for 10 min, the supernatant was discarded and the remaining precipitate was washed three times with cold acetone. The protein was dissolved in 8 M urea, and its concentration was determined with a BCA kit according to the manufacturer's instructions.

Trypsin digestion and TMT labeling
Sample digestion was performed as described previously [29]. In brief, the protein solution was reduced with 5 mM dithiothreitol for 30 min at 56 °C and alkylated with mM iodoacetamide for 15 min at room temperature in the dark. The protein samples were then diluted with 100 mM TEAB. Trypsin was added at a trypsin-to-protein mass ratio of 1:50 for the first digestion overnight and at 1:100 for a second 4-h digestion. Peptides were desalted on a Strata X C18 SPE column (Phenomenex) and vacuum-dried. The peptides were reconstituted in 0.5 M TEAB and processed according to the TMT kit manufacturer's protocol. After incubation for 2 h, the peptide mixtures were desalted and dried by vacuum centrifugation at room temperature.

Protein fractionation and identification using LC-MS/MS analysis
Tryptic peptides were fractionated by high-pH reverse-phase HPLC on an Agilent 300Extend C18 column. A gradient of 8–32% acetonitrile (pH 9.0) was applied over 60 min to separate the peptides into 60 fractions. The peptides were then combined into 18 fractions and dried by vacuum centrifugation. The tryptic peptides were dissolved in 0.1% formic acid (solvent A) and loaded onto a reversed-phase analytical column (15 cm length, 75 µm i.d.). The gradient consisted of an increase from 6% to 23% solvent B (0.1% formic acid in 98% acetonitrile) over 26 min, from 23% to 35% over 8 min, a climb to 80% in 3 min and a hold at 80% for the last 3 min, all at a constant flow rate of 400 nL/min on an EASY-nLC 1000 UPLC system.
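The analytical gradient described above can be written as a piecewise profile of solvent B against time. The following minimal sketch assumes linear ramps within each segment, since only the segment endpoints and durations are given; the names `SEGMENTS` and `percent_b` are illustrative.

```python
# (duration in min, %B at segment start, %B at segment end), from the text.
SEGMENTS = [
    (26.0, 6.0, 23.0),
    (8.0, 23.0, 35.0),
    (3.0, 35.0, 80.0),
    (3.0, 80.0, 80.0),  # hold at 80% B
]

TOTAL_RUN_MIN = sum(duration for duration, _, _ in SEGMENTS)


def percent_b(t_min):
    """Percentage of solvent B at time t_min, assuming linear ramps per segment."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    segment_start = 0.0
    for duration, b_start, b_end in SEGMENTS:
        segment_end = segment_start + duration
        if t_min <= segment_end:
            fraction = (t_min - segment_start) / duration
            return b_start + (b_end - b_start) * fraction
        segment_start = segment_end
    raise ValueError("time is beyond the end of the gradient")


for t in (0, 13, 26, 34, 37, 40):
    print(f"t = {t:>2} min: {percent_b(t):.1f}% B")
```

Summing the segment durations gives the length of the analytical run implied by the description.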
The peptides were subjected to an NSI source followed by tandem mass spectrometry (MS/MS) on a Q Exactive™ Plus (Thermo Fisher Scientific, Waltham, MA, USA). The applied electrospray voltage was 2.0 kV. The m/z scan range was 350 to 1800 for the full scan, and intact peptides were detected in the Orbitrap at a resolution of 70,000. Peptides were selected for MS/MS using an NCE setting of 28, and the fragments were detected in the Orbitrap at a resolution of 17,500. A data-dependent procedure that alternated between one MS scan and 20 MS/MS scans was used, with a dynamic exclusion of 15.0 s. Automatic gain control (AGC) was set to 5E4, and the fixed first mass was set to 100 m/z.
The resulting MS/MS data were processed using the MaxQuant search engine (v.1.5.2.8). Tandem mass spectra were searched against XXX database concatenated with a reverse decoy database. Trypsin/P was specified as the cleavage enzyme, allowing up to 2 missed cleavages. The mass tolerance for the precursor ions was set to 20 ppm in the first search and 5 ppm in the main search, and the mass tolerance for the fragment ions was set to 0.02 Da. Carbamidomethyl on Cys was specified as a fixed modification and oxidation on Met as a variable modification. FDR was adjusted to < 1% and the minimum score for peptides was set to > 40.
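Because the precursor tolerances are given in ppm while the fragment tolerance is given in Da, it can help to see what a ppm tolerance means in absolute terms. The snippet below is an illustrative sketch, not part of the MaxQuant workflow. The function names are hypothetical, and the example m/z value is arbitrary.

```python
def ppm_window(mz: float, tol_ppm: float) -> tuple[float, float]:
    """Return the (lower, upper) m/z bounds for a tolerance given in ppm."""
    delta = mz * tol_ppm / 1e6  # absolute tolerance in m/z units
    return mz - delta, mz + delta


def within_ppm(observed_mz: float, theoretical_mz: float, tol_ppm: float) -> bool:
    """True if the observed m/z lies within tol_ppm of the theoretical m/z."""
    lower, upper = ppm_window(theoretical_mz, tol_ppm)
    return lower <= observed_mz <= upper


if __name__ == "__main__":
    # Tolerances from the text: 20 ppm (first search) and 5 ppm (main search).
    for tol in (20.0, 5.0):
        lower, upper = ppm_window(1000.0, tol)
        print(f"{tol:>4} ppm at m/z 1000: {lower:.4f} to {upper:.4f}")
```

At m/z 1000, for instance, 20 ppm corresponds to ±0.02 and 5 ppm to ±0.005 m/z units.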

Proteomic analysis and identification of differentially abundant proteins
Proteomic analysis was performed by PTM BioLab Co., Ltd. (Hangzhou, China), using tandem mass tags coupled to liquid chromatography-tandem mass spectrometry (LC-MS/MS). Proteins with relative quantification p-values < 0.05 and fold-changes > 1.30 or < 1/1.30 were selected as differentially expressed. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database was employed to identify differentially expressed proteins related to starch and sucrose metabolism.
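The selection rule above can be written as a small filter. This is a minimal sketch under the stated thresholds. The function name, the labels "up", "down" and "unchanged", and the example accessions are hypothetical and do not come from the study.

```python
FOLD_CHANGE_CUTOFF = 1.30  # threshold stated in the text
P_VALUE_CUTOFF = 0.05      # threshold stated in the text


def classify_protein(ratio: float, p_value: float) -> str:
    """Label a protein from its between-cultivar abundance ratio and p-value."""
    if p_value >= P_VALUE_CUTOFF:
        return "unchanged"
    if ratio > FOLD_CHANGE_CUTOFF:
        return "up"
    if ratio < 1.0 / FOLD_CHANGE_CUTOFF:  # about 0.769
        return "down"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical accessions and values, for illustration only.
    examples = {"protA": (1.50, 0.01), "protB": (0.70, 0.03), "protC": (1.20, 0.001)}
    for accession, (ratio, p_value) in examples.items():
        print(accession, classify_protein(ratio, p_value))
```

Note that the lower bound is the reciprocal of the upper bound, so an equally large change in either direction is treated symmetrically.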

Gene ontology (GO), InterPro domain and KEGG pathway annotations and subcellular localization of differentially abundant proteins
Gene Ontology proteome annotation was derived from the UniProt-GOA database (http://www.ebi.ac.uk/GOA/). Each identified protein ID was converted into a UniProt ID and then mapped to GO IDs. Identified proteins that could not be annotated through the UniProt-GOA database were assigned GO functions using the InterProScan software (http://www.ebi.ac.uk/interpro/, v. 5.14-53.0), which relies on protein sequence alignment. GO annotation classified the proteins into three categories, i.e., biological process, cellular compartment and molecular function. InterProScan was also used with the InterPro domain database to annotate the functional description of the identified protein domains based on protein sequence similarity.
Protein pathways were then annotated using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The KEGG descriptions of the proteins were annotated using the KEGG online service tool KAAS. The annotation results were then mapped to the KEGG pathway database using the KEGG online service tool KEGG Mapper. The subcellular localization prediction software WolfPSORT (http://wolfpsort.seq.cbrc.jp/) was used to predict the subcellular localization of the differentially abundant proteins.

Enrichment of GO, KEGG pathway and protein domain analyses
GO annotation classified the differentially expressed proteins into three categories, i.e., biological process, cellular compartment and molecular function. To test the significance of the enrichment of differentially abundant proteins against all identified proteins in each category, a two-tailed Fisher's exact test was performed.
Enrichment was considered significant when the corrected p-value of a given GO term was below 0.05. The two-tailed Fisher's exact test was also used to identify significantly enriched pathways in the KEGG database. A pathway with a corrected p-value below 0.05 was considered significant. These pathways were classified into hierarchical categories according to the KEGG website. The InterPro database was searched, and the two-tailed Fisher's exact test was also performed to test the significance of the enrichment of the differentially abundant proteins in each protein domain category. Protein domains with a p-value below 0.05 were considered significant.
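To make the enrichment test concrete, the sketch below computes a two-tailed Fisher's exact test for one category from a 2 × 2 table. It is an illustration under assumptions rather than the pipeline actually used: the table layout (differentially abundant proteins versus the remaining identified proteins, inside versus outside the category) is one common way to set up the comparison, the function name is hypothetical, and the counts in the example are invented. The multiple-testing correction is not shown because the text does not name the method.

```python
from math import comb


def fisher_exact_two_tailed(a: int, b: int, c: int, d: int) -> float:
    """Two-tailed Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Assumed layout (not specified in the text):
      a = differentially abundant proteins in the category
      b = differentially abundant proteins outside the category
      c = other identified proteins in the category
      d = other identified proteins outside the category
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_probability(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = table_probability(a)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed
    # one; the small relative tolerance guards against floating-point ties.
    p_value = sum(
        p
        for p in (table_probability(x) for x in range(x_min, x_max + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Invented counts, for illustration only.
    print(fisher_exact_two_tailed(8, 2, 1, 5))
```

The function uses only the standard library; a statistics package would normally be preferred for large-scale analyses.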

Enrichment and quantiles-based clustering
For further hierarchical clustering based on the different protein functional classifications (e.g., GO, Domain, Pathway and Complex), all protein categories obtained after enrichment were collated along with their P values. Categories that were enriched in at least one of the groups with a P value below 0.05 were retained.
The filtered P values were first transformed according to the function X = −log10 (P value), and the mean and standard deviation of the X values were estimated. These X values were then Z-transformed for each functional category. The Z scores were clustered by one-way hierarchical clustering (Euclidean distance, average linkage clustering) in Genesis. Cluster memberships were visualized as a heat map using the "heatmap.2" function in the "gplots" R package.
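The transformation described above can be sketched for a single functional category as follows. This is a minimal illustration with assumptions: the function name is hypothetical, the P values in the example are invented, and the sample standard deviation is used because the text does not say whether the sample or the population form was applied.

```python
import math
from statistics import mean, stdev


def z_scores_from_p_values(p_values: list[float]) -> list[float]:
    """Convert enrichment P values for one category into Z scores.

    Step 1: X = -log10(P value).
    Step 2: Z = (X - mean(X)) / sd(X), using the sample standard deviation
            (an assumption; the text does not specify which form was used).
    """
    x_values = [-math.log10(p) for p in p_values]
    mu = mean(x_values)
    sigma = stdev(x_values)
    if sigma == 0:
        # All groups are equally enriched; every Z score is zero.
        return [0.0] * len(x_values)
    return [(x - mu) / sigma for x in x_values]


if __name__ == "__main__":
    # Invented P values for one category across three groups.
    print(z_scores_from_p_values([0.1, 0.01, 0.001]))
```

Each row of the resulting matrix (one row per category) is then centered on zero, which makes categories with very different raw P values comparable in the heat map.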

Protein-protein interaction network
The accessions or sequences of the differentially expressed proteins were searched against the STRING database v 10.5 for protein-protein interactions. Only interactions between proteins belonging to the searched dataset were considered, thereby excluding external candidates. STRING defines a metric called the "confidence score" to express the confidence of an interaction. Interactions with a confidence score > 0.7 (high confidence) were selected. The interaction network obtained from STRING was visualized using the R package "networkD3".
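The two selection rules above (both partners in the query set, confidence score > 0.7) amount to a simple edge filter. The sketch below illustrates this with invented identifiers and scores. The function name is hypothetical, and it assumes the scores are already on a 0 to 1 scale.

```python
Edge = tuple[str, str, float]  # (protein_a, protein_b, confidence_score)


def filter_interactions(
    edges: list[Edge], query_ids: set[str], min_score: float = 0.7
) -> list[Edge]:
    """Keep edges whose two proteins are both in the query set and whose
    confidence score exceeds min_score (scores assumed to be on a 0-1 scale)."""
    return [
        (a, b, score)
        for a, b, score in edges
        if a in query_ids and b in query_ids and score > min_score
    ]


if __name__ == "__main__":
    # Invented identifiers and scores, for illustration only.
    query = {"P1", "P2", "P3"}
    edges = [
        ("P1", "P2", 0.95),  # kept
        ("P1", "P3", 0.60),  # dropped: score too low
        ("P2", "X9", 0.99),  # dropped: X9 is not in the query set
    ]
    print(filter_interactions(edges, query))
```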

Statistical analysis
A three-replicate experimental design was followed for all experiments. Analysis of variance (ANOVA) in R was employed to statistically analyze all estimated parameters, e.g., GO functional enrichment and KEGG pathway.