RNA interactome capture in Brachypodium reveals a flowering plant core RBPome

RNA binding proteins regulate gene expression at the post-transcriptional level by controlling the fate of RNA in processes such as mRNA localization, translation, splicing and stability. The annotation of RNA binding proteins is mainly based on well-known RNA binding domains and motifs. However, novel RNA binding proteins without such conventional domains have been identified in different species using in vivo RNA interactome capture. To find support for novel conserved RNA binding proteins in plants, we applied an optimized RNA interactome capture to the monocot model Brachypodium distachyon. We provide experimental evidence for 203 RNA binding proteins isolated from Brachypodium shoot tissue and leaf mesophyll protoplasts, and grouped these into classic RNA binding proteins with recognizable RNA binding domains and motifs, and candidate RNA binding proteins without such domains. Compared to RNA binding proteins captured in Arabidopsis thaliana, candidate RNA binding proteins involved in carbon fixation and carbon metabolic pathways are highly conserved. We tried to validate the RNA binding proteins captured in this research with a silica-based method, but this method appears not to be efficient for plants. This may indicate that optimized methods to validate high-throughput RNA binding proteomes are required for plants. Our results provide classic and candidate RNA binding proteins in Brachypodium distachyon and conserved RNA binding proteins in flowering plants. Future functional characterization should reveal the significance of RNA binding for the function of these proteins.

Drosophila [13] to the plant Arabidopsis [14,15]. RIC yielded 797 RBPs in HEK293 cells, but 245 of these captured proteins were not annotated as RNA binding [10]. In Drosophila, 523 RBPs were identified using RIC, but over half of them were not known to bind RNA [13]. Notable RBPs lacking classic RBDs, such as the enzymes ENO1 and SHMT2 in HEK293 cells [10], TRIM proteins in embryonic stem cells [16] and GAPDH in T cells [17], were confirmed to bind RNA. The growing number of unconventional RBPs may open a new era in the study of RBP function.
In Arabidopsis, RIC has been applied to different tissues, such as suspension cells (913 RBPs) and seedlings (236 RBPs) [14], leaf mesophyll protoplasts (325 RBPs) [18], and etiolated seedlings (746 RBPs) [15]. It is difficult to compare the yield of RBPs between experiments because different criteria for protein identification were used. In these Arabidopsis experiments, RBPs with classic RBDs were enriched after RIC as expected, but hundreds of novel RBPs without known RBDs were again identified, such as DUF domain proteins and LIM domain proteins [15]. In this research, we optimized the UV dosage and captured RBPs using RIC from the monocot model plant Brachypodium distachyon. By comparing the results to Arabidopsis, we hope to contribute support for conserved novel RBPs in flowering plants, which may improve the identification and functional understanding of RBPs in the future.

Results
High UV dosage does not degrade RNA but yields fewer RBPs
Although RIC has been successfully applied to Arabidopsis cell suspensions, etiolated plants, seedlings and leaf mesophyll protoplasts, the number of identified proteins is lower than that in mammalian cells [24]. This could be due to a difference in crosslinking efficiency between plants and mammalian cells. Plant tissues contain a waxy cuticle and light acceptors in the chloroplast, which can absorb short-wavelength UV light [25] and could lower the crosslinking efficiency. The leaf anatomy of the model plant Arabidopsis differs from that of monocot plants like wheat and rice [26][27][28], which may mean that UV crosslinking efficiency differs between species.
Therefore, we set out to optimize the UV dosage in the monocot model Brachypodium to increase the efficiency of RIC in monocot plants (Fig. 1A). We used Brachypodium leaf mesophyll protoplasts and 2-week old seedlings as starting materials. For leaf mesophyll protoplasts, we applied the same method as previously used for Arabidopsis leaf mesophyll protoplasts [18]. For seedlings, the UV dosage for Brachypodium was optimized in order to increase the efficiency of crosslinking (Fig. 1A). Different UV treatments and dosages were used, including 0.9 J/cm², two treatments of 0.9 J/cm², continuous 10 J/cm², two treatments of 5 J/cm² with the plants turned over, and four treatments of 5 J/cm² (Fig. 1B). As derived from the RIN numbers obtained by Bioanalyzer analysis, none of the UV treatments or dosages resulted in RNA degradation, as the average RIN number from three independent biological replicates was above 9 for all conditions (Fig. 1C). After RIC was applied to all seedling samples with different UV treatments, visualization of the proteins on a silver-stained SDS-PAGE gel suggested that the CL samples were enriched in proteins compared to the noCL samples (Fig. 1B). In addition, a higher UV intensity treatment gave more pronounced protein bands on silver staining, which suggests a possibly higher concentration of proteins (Fig. 1B).
After mass spectrometry detection and protein analysis, we identified 106 RBPs under two exposures to 0.5 J/cm² UV and 55 RBPs under four exposures to 5 J/cm² UV, of which 39 RBPs overlapped (Fig. 1D). Of these overlapping proteins, 85% contain RNA binding domains or motifs (Table S2). Surprisingly, the number of RBPs identified in the lower UV condition was higher than in the higher UV condition, which is not consistent with the staining intensity on the SDS-PAGE gel (Fig. 1B). One possibility is that UV overexposure activates a repair mechanism for transcripts, which degrades RNA-protein complexes [29][30][31]. In this case, only abundant RNA-protein complexes can be captured using RIC, which is consistent with most of the overlapping proteins containing RNA binding domains or motifs.
In addition, we identified 128 RBPs in leaf mesophyll protoplasts. We pooled the RBPs from shoots under the different UV dosage conditions (123 RBPs in total) and compared them with the RBPs from leaf mesophyll protoplasts (Fig. 1D). Forty-seven RBPs overlapped between shoots and protoplasts, and 85% of these contain RNA binding domains or motifs. For further analysis, we pooled the identified RBPs from the different samples and UV treatments, giving 203 RBPs detected in Brachypodium in total (Table S2).

Conserved RNA binding domains distinguish classic from candidate RBPs
Of the 203 RBPs identified in Brachypodium, 112 proteins (55%) with known RBDs were grouped as classic RBPs. Within the classic RBP group, proteins with RRM domains form the largest subgroup, followed by ribosomal proteins. The other 91 identified proteins (45%), which lack these classic domains or motifs, were defined as candidate RBPs. Within the candidate RBP group, the largest subgroup comprises proteins involved in photosynthesis, such as photosystem I reaction center subunit D and chlorophyll a/b binding proteins (Fig. 2). However, whether these proteins are involved in RNA metabolism is unclear.
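The domain-based grouping can be illustrated with a minimal sketch. The domain list below is a hypothetical, abbreviated stand-in built from domain names mentioned in this article, and the protein identifiers are invented; the actual grouping relied on the full set of RBDs and motifs from the review cited in the methods.

```python
# Hypothetical, abbreviated set of classic RNA-binding domain labels.
# The real analysis used the complete list of known RBDs and motifs.
CLASSIC_RBDS = {"RRM", "YTH", "Alba", "LSM", "PPR", "Znf-CCCH", "Znf-RanBP2"}


def classify_rbp(domains):
    """Return 'classic' if any annotated domain is a known RBD, else 'candidate'."""
    return "classic" if CLASSIC_RBDS & set(domains) else "candidate"


def summarize(annotations):
    """Count classic and candidate RBPs in a {protein_id: [domains]} mapping."""
    counts = {"classic": 0, "candidate": 0}
    for domains in annotations.values():
        counts[classify_rbp(domains)] += 1
    return counts


def percent(part, total):
    """Share of `part` in `total`, rounded to a whole percentage."""
    return round(100 * part / total)


# Invented example annotations (not proteins from this study).
TOY_ANNOTATIONS = {
    "protein_A": ["RRM"],
    "protein_B": ["DUF"],
    "protein_C": ["UspA"],
    "protein_D": ["YTH", "ATPase"],
}

if __name__ == "__main__":
    print(summarize(TOY_ANNOTATIONS))
    # Shares reported in the text: 112 and 91 out of 203 RBPs.
    print(percent(112, 203), percent(91, 203))
```

A protein counts as classic as soon as one of its domains is on the list, so a protein that combines a classic RBD with unrelated domains still falls in the classic group.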

RBPs with conserved RBDs were identified in the classic RBP group and novel RBPs were found in the candidate RBP group
The detailed RBDs in both the classic and candidate RBP groups are listed in Fig. 2 and Table S2. In the classic RBP group (Fig. 3), the largest subgroup was proteins containing RRM domains (41). In Arabidopsis, proteins with RRM domains were also the largest group of canonical RBPs captured through RIC [14,15]. We identified classic RNA binding zinc finger proteins, such as the Znf-CCCH type and Znf-RanBP2 (Fig. 3). The Znf-RanBP2 protein, also captured in Arabidopsis etiolated seedlings [15], has been reported to bind single-stranded RNA in human [32]. Other emerging and new plant RBPs were identified in this research, such as two YTH-domain proteins and three Alba proteins, which were also found in Arabidopsis [14,15]. YTH domain proteins (ECT2/3) in Arabidopsis have been shown to recognize m6A through binding to the 3'UTR of target RNA, and to regulate trichome morphogenesis and leaf development by facilitating the degradation of ECT2-bound transcripts [33,34]. The Alba domain is closely related to the ancient RNA-binding IF3-C fold [35], and Alba proteins are regarded as multifunctional proteins that participate in genome organization, translation and RNA metabolism [36]. 28 proteins containing a YTH domain were predicted in Brachypodium through Interpro (http://www.ebi.ac.uk/interpro/entry/IPR007275/proteinsmatched?taxonomy=15368), but their function has not been reported in Brachypodium.
Besides the classic RBPs, 45% of the RBPs are novel RNA binding proteins in Brachypodium, for which no RNA binding function has been proposed previously. Eleven proteins involved in photosystems were strongly enriched in the candidate RBP group. In addition, proteins containing DUF, ATPase and UspA domains were identified. Actin family proteins were identified in Brachypodium, as they were in Arabidopsis etiolated seedlings [15].
Similar to RBPs captured through RIC in other species, enzymes were captured in Brachypodium as well, such as serine fructose-bisphosphate aldolase (FBPase) and transketolase, which are involved in the Calvin cycle for carbon fixation. Together with the photosystem proteins, this suggests that chloroplast proteins are enriched among the captured RBPs. Glutamine synthetase, glycosyltransferase, pectinesterase, transferase and cellulose synthase were also captured in Brachypodium. Whether the enzymes captured in this research really bind RNA remains an interesting open question.

GO and KEGG enrichment analysis
According to the molecular function analysis (GO analysis), the enriched proteins among the classic RBPs were annotated as having RNA binding activity (p < 0.05), including mRNA binding and structural constituent of ribosome (Fig. 3A and Table S3). The subcategories of mRNA binding were RNA binding, nucleic acid binding, organic cyclic compound binding and heterocyclic compound binding (Fig. 2A). The enriched RNA binding function among the classic RBPs demonstrates the high efficiency of this interactome capture and the accuracy of the domain search method used for the analysis. For the candidate RBPs, the most enriched molecular functions were molecular function, catalytic activity and chlorophyll binding (Fig. 2B). However, when comparing the two groups, the GO enrichment of the classic group is much higher than that of the candidate group.
In addition, KEGG pathway enrichment was analyzed for the classic and candidate RBPs (Table S3). Apart from ribosomal proteins, the most enriched KEGG pathways for the classic RBPs were the mRNA surveillance pathway, spliceosome, RNA transport and RNA degradation. As with the GO enrichment analysis, this result indicates that classifying RBPs into classic and candidate groups based on domain information is efficient and accurate. For the candidate RBPs, the most enriched KEGG pathways were metabolic pathways, carbon metabolism, photosynthesis, glyoxylate and dicarboxylate metabolism, fructose and mannose metabolism, biosynthesis of secondary metabolites and biosynthesis of amino acids. It appears that more enzymes are detected among the candidate RBPs. However, validation of the candidate RBPs is required to test their RNA binding ability.
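For readers unfamiliar with enrichment analysis, the general idea behind GO and KEGG over-representation can be sketched with a one-sided hypergeometric test. This is an illustration only: the enrichment in this study was computed with external tools, which may use a different statistic and multiple-testing correction, and the numbers in the example are invented.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided over-representation p-value, P(X >= k).

    N: proteins in the background proteome
    K: background proteins annotated with the term
    n: proteins in the captured set
    k: captured proteins annotated with the term
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of observing k, k+1, ..., upper annotated proteins.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Invented toy numbers: 10 background proteins, 3 carry the term,
    # 2 proteins captured, both carry the term.
    print(hypergeom_enrichment_p(2, 2, 3, 10))
```

A small p-value means the term occurs in the captured set more often than expected from random sampling of the background, which is how terms such as mRNA binding stand out for the classic RBPs.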

A core RBPome identifies RBPs in plant-specific processes
RIC was first applied to the model plant Arabidopsis, and RBPs were identified in Arabidopsis leaf mesophyll protoplasts (325) [18], cell suspension (913) and leaves (236) [14], and etiolated seedlings (746) [15] (summarized in Fig. 4A). The different criteria and methods used to identify RBPs in each article make it difficult to combine these data. Among all RBPs identified in Arabidopsis, the largest overlap is between leaf mesophyll protoplasts and etiolated seedlings (129 RBPs). Since shoot tissue and leaf mesophyll protoplasts were used as materials to identify RBPs in Brachypodium, it is meaningful to compare them with RBPs identified in the same tissues of Arabidopsis. Thus, we pooled the RBP data from Arabidopsis leaf and leaf mesophyll protoplasts in order to compare them with the RBPs identified in Brachypodium.
Using Inparanoid analysis to compare orthologs between Arabidopsis and Brachypodium, 10689 clusters from both species were extracted as a fundamental database for further analysis. 343 protein clusters from the Arabidopsis RBPs of leaf and leaf mesophyll protoplasts were matched against the Arabidopsis fundamental database, while 130 protein clusters of Brachypodium RBPs were matched against the Brachypodium fundamental database. When comparing the orthologs of these subsets, 57 clusters were shared, which include 82 RBPs from Arabidopsis and 62 RBPs from Brachypodium. We termed these RBPs core RBPs (Fig. 4B and Table S4). KEGG analysis was used to compare the core RBPs in Brachypodium and Arabidopsis (Fig. 4C and Table S4). As expected, the largest number of RBPs belongs to ribosomes, followed by carbon fixation in photosynthetic organisms and carbon metabolism. These enzymes are highly conserved between Arabidopsis and Brachypodium. In addition, the proteins involved in the mRNA surveillance pathway, RNA transport and the spliceosome are highly conserved in both species. It will be interesting to explore the roles of these conserved core RBPs in plants. Notably, RNA degradation is absent from the Arabidopsis core RBPs, which may indicate that RIC cannot capture all RBPs in plants.
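The cluster-based comparison described above can be sketched as follows. This is a minimal illustration under assumptions: the cluster table layout, the function names and all identifiers are hypothetical, and the real analysis used the Inparanoid ortholog clusters for the two species.

```python
def clusters_hit(rbps, clusters, species):
    """IDs of ortholog clusters that contain at least one captured RBP."""
    return {cid for cid, members in clusters.items() if members[species] & rbps}


def core_rbpome(clusters, rbps_a, rbps_b,
                species_a="arabidopsis", species_b="brachypodium"):
    """Return shared cluster IDs and the captured RBPs of each species in them."""
    shared = (clusters_hit(rbps_a, clusters, species_a)
              & clusters_hit(rbps_b, clusters, species_b))
    core_a = {p for cid in shared for p in clusters[cid][species_a] & rbps_a}
    core_b = {p for cid in shared for p in clusters[cid][species_b] & rbps_b}
    return shared, core_a, core_b


# Invented toy clusters: each cluster lists its members per species.
TOY_CLUSTERS = {
    1: {"arabidopsis": {"At1", "At2"}, "brachypodium": {"Bd1"}},
    2: {"arabidopsis": {"At3"}, "brachypodium": {"Bd2", "Bd3"}},
    3: {"arabidopsis": {"At4"}, "brachypodium": {"Bd4"}},
}
TOY_RBPS_AT = {"At1", "At2", "At4"}
TOY_RBPS_BD = {"Bd1", "Bd2"}

if __name__ == "__main__":
    shared, core_at, core_bd = core_rbpome(TOY_CLUSTERS, TOY_RBPS_AT, TOY_RBPS_BD)
    print(shared, sorted(core_at), sorted(core_bd))
```

Because a cluster can hold several paralogs per species, the number of shared clusters need not equal the number of core RBPs in either species, which is why 57 clusters can correspond to 82 Arabidopsis and 62 Brachypodium proteins.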

Validation methods in the plant RBPs need further optimization
To validate the candidate RBPs identified in this research, we applied silica-based solid-phase extraction followed by a western blot [23]. First, we transformed the selected RBPs fused to a green fluorescent protein (GFP) tag (35S:RBP:GFP) into Brachypodium leaf mesophyll protoplasts and applied UV treatment after incubation. Silica matrix-based columns were used for nucleic acid purification, in which the crosslinked nucleic acid-protein complexes are retained. The RNA-protein interaction can then be visualized by western blot, using an anti-GFP antibody to detect the GFP tag. To confirm that the silica columns capture RNA-protein complexes, we visualized the proteins isolated from CL and noCL treated leaf mesophyll protoplasts with silver staining (Fig. 5A). The CL samples were enriched in proteins compared to the noCL samples, which indicates that the combination of silica column and UV crosslinking can be used for validation. We randomly selected classic RBPs (Bradi1g20440.1 (RRM-domain protein), Bradi2g61835 (LSM-domain), Bradi5g17360.1 (PPR-domain)) as positive controls and a candidate RBP (Bradi4g08800.1 (carbon fixation)). As seen on the western blot (Fig. 5B), the GFP-tagged proteins were well expressed in protoplasts, but the target RBPs could not be detected after silica purification of the RNA-protein complexes from crosslinked samples. Possible reasons are discussed below.

Discussion
RNA interactome capture has been widely used in different species to explore their repertoires of RNA-binding proteins [10][11][12][13][14][15]. An interesting follow-up question is whether different phylogenetic groups of species, such as plants in this instance, have a derived repertoire of RBPs or not. A second, key question in these studies is whether the identified RBPs without a canonical RNA binding domain actually bind RNA or whether they represent some form of contamination. To shed light on these questions, we investigated the conservation of interactome-captured proteins between two plants from divergent lineages, Arabidopsis (a eudicot) and Brachypodium (a monocot), and attempted to validate a subset of these proteins in vivo.
In Brachypodium, we identified 203 RBPs from leaf mesophyll protoplasts and seedling tissue. The method we applied to classify the identified RBPs into classic and candidate groups is based on known RNA binding domains and motifs. The strong enrichment of RNA binding terms in the GO and KEGG pathway analyses supports that this method is efficient and accurate. 55% of the RBPs were assigned to the classic RBPs with known RNA binding domains and motifs, while 45% of the RBPs were novel RBPs without these domains and motifs. The highly enriched photosynthesis proteins in the candidate RBP group could indicate that these proteins have a thus far uncharacterized direct interaction with RNA. Alternatively, the applied method could be misleading in some ways. One possibility is that under UV exposure, the light reaction process is strongly active and abundant photosystem proteins are easily captured while being translated from mRNA. Another possibility is that these proteins are contaminants and illustrate a limitation of UV crosslinking, which could crosslink free mRNA to the abundant photosynthesis proteins. A comparison of protein abundance in the general proteome with that in the RBPome could shed some light on these possibilities. To avoid this capture of chloroplast proteins during UV treatment, RIC was also applied to Arabidopsis etiolated plants and cell suspension materials, which seems to yield more RBPs [14,15]. This may indicate that the UV crosslinking efficiency is lower in green tissues. Therefore, etiolated plants and cell suspensions may be useful to capture RBPs more efficiently. However, plants grown in darkness are similar to phytochrome-deficient mutants [37] and show senescence [38]. Therefore, etiolated plants and cell suspension materials cannot be used for all research questions.
Of the 203 RBPs captured in Brachypodium, 62 (30%) were conserved with RBPs captured in Arabidopsis seedlings and leaf mesophyll protoplasts. Among these core RBPs, proteins involved in carbon fixation are highly conserved in both species. However, it remains to be validated whether the enzymes identified through RIC and conserved between Arabidopsis and Brachypodium can bind RNA. Although over 50% of the RBPs were found to harbor conserved RBDs or motifs, 45% of the RBPs identified in this research lack these known characteristics. Here we describe the potential RNA binding capacity of some selected proteins. CLU domain proteins contain a tetratricopeptide repeat-containing domain (TPR domain), which can bind mRNAs of multiple genes, such as OsPAL (OsPAL1-7) in rice, to promote the turnover of these mRNAs [39]. This suggests that CLU domain proteins may bind RNA. Another interesting protein is the UspA domain protein, which is upregulated by stress to protect DNA from damage in bacteria [40]. The identification of a UspA domain protein through mRNA interactome capture may reveal a regulatory mechanism by which it enhances cell survival under stress. We also captured actin family proteins in Brachypodium, which were identified in Arabidopsis etiolated seedlings as well [15]. Actin proteins, among the most abundant proteins in eukaryotic cells, are well known to support cell mobility and to act as a scaffold in the cytosol [41]. Nuclear actin proteins, however, are well documented to play a role in transcription, chromatin remodeling, histone modification and the DNA damage response [42]. Therefore, it is meaningful to validate these captured RBPome proteins.
The failure to validate RBPs in vivo in our research makes it difficult to establish whether these novel RBPs have RNA binding ability. We attempted to validate the RBPs captured in this research in vivo using a silica column-based method [23]. The failure to detect the expected bands by western blot in the silica samples could be explained by several reasons relating to sensitivity. First, the efficiency of UV crosslinking in leaf mesophyll protoplasts could be too low, so that not enough RNA-protein complexes are present to be detected. Second, the RNA bound by the RBPs may not be abundant enough to be detected, even though the tagged proteins are overexpressed. Indeed, in the original article describing the method, the yield of RNA after silica elution reached up to 12.5 µg [23]. In comparison, the RNA yield from protoplasts only reaches a maximum of 2 µg with around 10⁸ cells per sample. To overcome these problems, a large number of protoplasts is required. However, the low yield of protoplasts in Brachypodium (40 to 60 seedlings produce 1.7 × 10⁷ cells [19]) appears unsuitable for validating several RBPs. Therefore, alternative validation strategies are needed. Although Marondedze et al. [14] validated DEAD-box RBPs via oligo(dT) beads and western blot, this approach is not fully independent because the same method is used for both RBP capture and validation. Compared to mammalian cells or yeast, which can be transformed and crosslinked easily, it appears more difficult to validate high-throughput proteomics data in plants. Besides in vivo methods, there are possible in vitro methods for validation, such as the electrophoretic mobility shift assay (EMSA). However, validating a high-throughput proteome in this way is time- and labor-consuming.
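The scale of the protoplast problem can be made concrete with a short calculation. It is a rough sketch that assumes the figures quoted above (1.7 × 10⁷ cells from 40 to 60 seedlings, and about 10⁸ cells per sample) and assumes that the protoplast yield scales linearly with the number of seedlings; the constant and function names are illustrative.

```python
# Figures quoted in the discussion; linear scaling is an assumption.
CELLS_PER_BATCH = 1.7e7          # protoplasts obtained from one batch
SEEDLINGS_PER_BATCH = (40, 60)   # seedlings needed for one batch (low, high)
CELLS_PER_SAMPLE = 1e8           # approximate cells used per sample


def seedlings_needed(cells_needed):
    """Estimated (low, high) number of seedlings for a given cell count."""
    batches = cells_needed / CELLS_PER_BATCH
    return tuple(round(batches * s) for s in SEEDLINGS_PER_BATCH)


if __name__ == "__main__":
    low, high = seedlings_needed(CELLS_PER_SAMPLE)
    print(f"about {low} to {high} seedlings per sample")
```

Under these assumptions a single sample already needs a few hundred seedlings, and every validated RBP requires at least a CL and a noCL sample, which illustrates why the approach does not scale to several candidates.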
An obvious limitation of RIC is that it can only capture RBPs bound to poly(A) RNA. XRNAX, named for protein-crosslinked RNA extraction, can characterize proteins bound to coding and noncoding RNA [43]. Applying the XRNAX method to plant cells could unravel the cellular RNA interactome more comprehensively, beyond the limitation of capturing mature mRNAs only. Another issue for RBPomes is the data analysis method used to compare the proteins in CL versus noCL samples. The CL/noCL ratio has been used to quantify the significance of proteins present in both CL and noCL samples [10,13,15]. However, for proteins captured only in CL, this method cannot be used because noCL data are missing. In Arabidopsis etiolated seedlings, proteins captured only in CL were catalogued as candidate RBPs [15]. In Drosophila, a semi-quantitative analysis based on the number of peptide occurrences was used to characterize these proteins [13], and the proteins identified only in CL were also recognized as significant. In our research, most of the captured proteins were present only in CL samples and absent from noCL samples. We used the limma t-test to characterize the proteins present in both CL and noCL, whereas such a test is meaningless for proteins identified only in CL. We grouped proteins based on their domains and motifs, but more detailed and accurate methods are required to quantify these RBPomes.
In addition, discovering key regulators and understanding their functions in stress signalling pathways remains valuable for improving crop stress tolerance in the future. As previous studies of individual RBPs have already demonstrated diverse roles in stress tolerance [44][45][46], it is interesting to globally detect how RBPomes are shaped under certain stresses and thereby to find stress-sensitive RBPs. Recently, RIC has been applied to identify drought stress RBPs, and 150 out of 546 responded significantly to drought [47]. While the application of RIC in plants has been limited, using RIC in abiotic and biotic stress experiments can reveal novel RBPs, which may disclose new regulatory mechanisms.

Conclusions
In this study, we identified 203 RNA binding proteins from Brachypodium shoot tissue and leaf mesophyll protoplasts, as well as RNA binding proteins conserved between Arabidopsis and Brachypodium. These conserved RNA binding proteins were found to be involved in carbon fixation and carbon metabolic pathways. Using a modified RNA interactome capture, we provide a set of conserved RNA binding proteins in flowering plants, which paves the way to explore more functional RNA binding proteins in different plant species.

Plant materials and growth condition
Brachypodium distachyon accession Bd21-3 was grown in a 12/12 h 22 °C growth room and 4-week old plant leaves were used to isolate leaf mesophyll protoplasts for the RIC assay. Seedlings from Brachypodium distachyon accession Bd21 were grown in a 16/8 h, 22 °C growth room and 2-week old plants were used for the RIC assay. For validation experiment, Bd21 plants were grown in a 16/8 h, 22 °C growth room and the leaf of 2-week old seedlings was used to isolate the leaf mesophyll protoplasts. All replicates used in each experiment were taken at the same time and randomly from different pots.

UV crosslinking and RNA interactome capture in seedlings
For RIC applied to seedlings, 2 g of shoot tissue (without the root) per sample was exposed to 254 nm UV and irradiated at different UV intensities, including 5 J/cm² four times, 5 J/cm² two times with the plants turned over, continuous 10 J/cm², and 0.9 J/cm². RNA was isolated from the different UV treatments using TRIsure™ and RNA quality was tested using a Bioanalyzer. Three independent biological replicates of CL and noCL samples were used for RNA quality control. Four independent biological replicates exposed to 5 J/cm² UV four times, with the plants turned over each time, and to 0.5 J/cm² UV two times were used for the RIC assay. After UV exposure, samples were immediately flash frozen and used for RIC.
A modified RIC method was applied to seedling samples. After UV crosslinking, the samples were ground for 30 minutes in liquid nitrogen and lysed in 30 ml lysis buffer (500 mM LiCl, 0.5% (w/v) Lithium Dodecyl Sulphate (LiDS), 5 mM DTT, 20 mM Tris-HCl, pH 7.5, and 1 mM EDTA, pH 8.0). After centrifugation, the supernatant was incubated with 4 mg Oligo-d(T)25 beads at RT for 1 h. The RNA-protein-bead complexes were washed two times with both washing buffer I (500 mM LiCl, 0.1% (w/v) LiDS, 5 mM DTT, 20 mM Tris-HCl, pH 7.5, and 1 mM EDTA, pH 8.0) and washing buffer II (500 mM LiCl, 5 mM DTT, 20 mM Tris-HCl, pH 7.5, and 1 mM EDTA, pH 8.0), and one time with low salt buffer (200 mM LiCl, 20 mM Tris-HCl, pH 7.5, and 1 mM EDTA, pH 8.0). Then the mRNA-protein complexes were collected with 1 ml elution buffer (20 mM Tris-HCl, pH 7.5, and 1 mM EDTA, pH 8.0) after being incubated at 55 °C for 3 minutes. The capture was repeated two times from the same lysate. After pooling all eluates together, 200 µl was used for RNA quality control after 1 µl Proteinase K (NEB) treatment at 37 °C for 1 h. After treating the remaining 1.8 ml eluate with 1 µl RNase Cocktail™ Enzyme mix (ThermoFisher Scientific) for 1 h, samples were concentrated to 75 µl using Amicon® Ultra Centrifugal Filters. 25 µl was loaded on an SDS-PAGE gel for visualization and the rest was analyzed by LC-MS/MS after digestion.

Protein digestion and mass spectrometry
Protein samples were digested and detected using LC-MS/MS as previously described [18]. Briefly, gel lanes were hydrated and dehydrated with 50 µL 100 mM NH4HCO3 and CH3CN for 10 min, respectively. Proteins in the gel pieces were digested overnight at 37 °C. Then the tryptic peptides were extracted using 80 µL of 50 mM NH4HCO3 and twice with 80 µL of 50% (w/v) CH3CN and 5% (v/v) formic acid (FA) for 30 min each. The samples were dried, dissolved in 25 µL solution containing 2% (v/v) CH3CN and 0.1% (v/v) aqueous trifluoroacetic acid (TFA), and desalted with Millipore ZipTip C18 columns. The final eluent containing purified peptides was dissolved in 4 µL 60% (v/v) CH3CN and 0.1% (v/v) FA and dried again.
The LC-MS analysis was performed on a Q Exactive™ Hybrid Quadrupole-Orbitrap™ Mass Spectrometer (Thermo Scientific, Bremen, Germany), coupled online to an Ultimate 3000 ultra-high-performance liquid chromatography (UHPLC) instrument (Thermo Scientific, San Jose, CA). 5 µL of sample was loaded on a precolumn (Acclaim PepMap100, C18, 75 µm x 20 mm, 3 µm, 100 Å; Thermo Fisher Scientific) at a flow rate of 5 µL/min. Further, the sample was separated at a flow rate of 300 nl/min on an analytical column integrated with the nano-electrospray ion source (EASY-Spray, PepMap RSLC, C18, 75 µm x 150 mm, 3

Identification of mRNA binding proteins for Brachypodium distachyon
Four independent biological replicates of seedlings and three independent biological replicates of leaf mesophyll protoplasts for both CL and noCL were used for RIC capture and label-free quantification (LFQ) LC-MS/MS (Table S1). The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD018076.
We used MaxQuant and applied an FDR < 1% for peptide and protein identification with fixed and variable modifications. The Brachypodium distachyon protein database from Uniprot was used for protein identification. LFQ intensity was used for statistical analysis. Proteins with at least one peptide in two biological replicates were considered for further analysis. Proteins detected only in noCL samples were excluded from further statistical analysis, and proteins detected only in CL samples were considered positive. For proteins present in both noCL and CL samples, we applied a limma moderated t-test (R limma package) on LFQ intensities, and proteins with an adjusted P-value < 1% were considered positive.
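The decision rule described in this section can be summarized in a short sketch. It makes several assumptions that are not stated in the text: the detection threshold (at least one peptide in two replicates) is applied per condition, the adjusted P-value is supplied from a separate limma run rather than computed here, and the direction of the CL versus noCL difference is not checked. Function names and return labels are illustrative.

```python
def detected(peptide_counts, min_reps=2):
    """True if at least one peptide was seen in at least `min_reps` replicates."""
    return sum(count >= 1 for count in peptide_counts) >= min_reps


def call_rbp(cl_counts, nocl_counts, adj_p=None, alpha=0.01):
    """Classify one protein from per-replicate peptide counts.

    cl_counts / nocl_counts: peptide counts per biological replicate.
    adj_p: adjusted P-value from an external moderated t-test, if available.
    """
    in_cl = detected(cl_counts)
    in_nocl = detected(nocl_counts)
    if not in_cl:
        # Covers proteins found only in noCL or not reproducibly found at all.
        return "excluded"
    if not in_nocl:
        return "positive_cl_only"
    if adj_p is None:
        return "needs_test"
    return "positive_tested" if adj_p < alpha else "not_significant"


if __name__ == "__main__":
    print(call_rbp([2, 1, 0, 3], [0, 0, 0, 0]))              # CL only
    print(call_rbp([1, 1, 1, 1], [1, 2, 0, 1], adj_p=0.004))  # tested, passes
    print(call_rbp([0, 0, 1, 0], [1, 1, 0, 0]))              # noCL only
```

Keeping the CL-only proteins as a separate label makes it easy to report how many positives rest on the statistical test and how many rest on absence from the noCL control.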

Domain analysis and Inparanoid comparison of identi ed RBPs
Proteins detected only in CL samples and proteins with a P-value < 1% in both CL and noCL samples were annotated against the Interpro and Pfam databases. Based on the known RBDs, motifs and RNA binding characteristics listed in the review by [22], we grouped the identified RBPs into a classic group and a candidate group. GO analysis and enrichment were performed using PANTHER (http://www.geneontology.org/page/go-enrichment-analysis). STRING (https://string-db.org/cgi/input.pl?sessionId=8BeWlyUSyKaF&input_page_show_search=on) and KEGG mapper (https://www.kegg.jp/) were used to analyze the biological pathways and KEGG enrichment for the RBPs identified in this research. The orthologs between Arabidopsis thaliana and Brachypodium distachyon were downloaded from Inparanoid8 (http://inparanoid.sbc.su.se/cgi-bin/index.cgi). The identified RBPs were then compared with the orthologs of these two species.

Validation for RBPs identified using silica column and Brachypodium leaf mesophyll protoplasts
To validate the RBPs in vivo, we tried to apply a silica column treatment to Brachypodium leaf mesophyll protoplasts transformed with GFP-tagged RBPs. First, the coding sequence of the selected RBPs (present in both shoot and protoplasts) was PCR-amplified without the stop codon from Bd21 cDNA and inserted into the HBT95 expression vector in frame with a GFP tag. The construct with the GFP protein sequence was driven by the 35S promoter and terminated by the NOS terminator. All constructs were confirmed by sequencing (LGC Genomics, Berlin). Then we applied PEG-Ca²⁺-mediated transfection to protoplasts (cells × 10⁻⁷ = 1) with 30 µg GFP-construct plasmid DNA. Protoplasts were incubated in dim light (light intensity around 8 µmol m⁻² s⁻¹) for 12 hours.
The validation method we applied is based on Asencio et al. [23]. After UV treatment (0.13 J/cm²) of the incubated protoplasts, a commercial RNA extraction kit (InviTrap Spin Plant RNA Mini Kit) was used. The columns in this kit can retain RNA and RNA-protein complexes [23]. After elution, RNase cocktail enzyme mix (ThermoFisher Scientific) was used to free the proteins, and a western blot with anti-GFP antibody was used to detect the GFP-tagged target proteins.

Competing interests
The authors declare that the research was conducted in the absence of any commercial and financial relationships that could be construed as a potential conflict of interest.

Funding
We gratefully acknowledge the funding support. ML was supported by the China Scholarship Council (CSC) for 4 years of study at KU Leuven. BS was supported by a FWO SB fellowship 1S06517N. The Geuten lab was supported by KU Leuven grant C24/17/037. The funders did not play any role in the design, analysis, and interpretation of this study or relevant data.

Author contributions
ML, ZhZh and KG designed the experiment. ML designed and carried out the shoot RIC assay, analyzed all the data and wrote the manuscript. ZhZh performed the leaf mesophyll protoplasts RIC assay. SB performed the RNA quality assay and VA cloned all the constructs in this experiment. PB and KB carried out the MS Spectrum assay. KG revised the manuscript. All authors contributed to the final manuscript.