Genome-Wide Exploration of Thaumatin-like Protein (TLP) Gene Family in Cereals

Background: TLP genes are members of the conserved pathogenesis-related protein 5 (PR-5) gene family. They play a role in abiotic stress response, hormone signaling, cell death, cold tolerance, enzyme inactivation, fruit maturation and seed germination. In this study, we characterized the TLP gene family in barley with specific emphasis on germination and malting. Results: We identified 19 TLP genes in the reference genome of Hordeum vulgare L. cv. Morex and 37, 28 and 35 TLP genes in the Oryza sativa, Brachypodium distachyon and Sorghum bicolor genomes, respectively. Comparative phylogenetic analysis and thaumatin domain organization of TLPs using the conserved region classified the TLP family into nine groups. The data revealed that localized gene duplications contributed to the expansion of the TLP gene family in cereals, with diverse exon/intron structures. In the barley genome, most HvTLPs were localized on chromosome 5H. The differential spatiotemporal expression pattern of HvTLP genes in barley indicates that TLPs are expressed predominantly in the embryo, developing grains, root and shoot tissues. Additionally, transcript abundance of HvTLP genes was measured between 16 h and 96 h of grain germination. Differential expression of HvTLP14, HvTLP17 and HvTLP18 in the malting variety (Morex), as compared to the feed variety (Steptoe), at different stages of seed germination indicates their possible role in malting. Conclusion: The barley genome contains a higher number (19) of TLP genes than previously thought (8). This study provides a description of the TLP gene family in barley and its differential expression between 16 and 96 h of germination. The results indicate their possible involvement in the malting process.


Background
Thaumatin-like proteins (TLPs) are part of a large PR (pathogenesis-related) gene family involved in a broad range of defense and developmental processes in plants, fungi, and animals (Brandazza et al., 2004). In plants, TLPs are members of the PR-5 gene family, which includes permatin, osmotin and osmotin-like proteins (OLPs), and their synthesis is mainly triggered in response to biotic and abiotic stress. However, their synthesis is also developmentally regulated during seed germination [1] and fruit ripening [2], so these proteins perform both defense- and development-related functions [3]. TLPs have high sequence similarity with the sweet-tasting, disulfide-containing protein thaumatin, initially identified in the West African shrub Thaumatococcus daniellii [4]. TLPs are highly conserved 24-34 kDa proteins whose polypeptides comprise 225-319 amino acid residues [5]. Further, they have 5 to 8 disulfide linkages, depending on the number of cysteine residues, which ranges from 10 to 16. These disulfide bonds stabilize the proteins against extremes of pH, heat-induced denaturation, and protease degradation [6]. TLPs having 10 conserved cysteines are designated as small TLPs and have been identified in various monocotyledonous and coniferous plant species [5,7,8].
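The relationship between cysteine count and disulfide linkages stated above (10 to 16 cysteines giving 5 to 8 linkages) amounts to at most one bond per pair of cysteines. A minimal Python sketch of that arithmetic follows; the function name and the toy sequence are hypothetical illustrations, not part of the study's pipeline, and the result is only an upper bound because not every cysteine need be bonded.

```python
def cysteine_summary(protein_seq: str) -> dict:
    """Count cysteines and the upper bound on disulfide bonds (one per cysteine pair)."""
    seq = protein_seq.upper()
    n_cys = seq.count("C")
    return {
        "length": len(seq),
        "cysteines": n_cys,
        "max_disulfides": n_cys // 2,  # upper bound only; assumes every pair is bonded
    }


# Hypothetical toy sequence, not a real TLP
print(cysteine_summary("MACCGTCAAC"))
```

Applied to a protein with 16 cysteines, this gives the upper limit of 8 linkages quoted in the text.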
TLPs are involved in the plant defense against numerous biotic and abiotic stresses [5].
Lower β-glucan content is considered a desirable trait for malting quality [17]. A barley TLP (HvTLP8), which has a carbohydrate binding motif (CQTGDCGG), has been implicated in a redox-regulated interaction with β-glucan [1]. In addition, barley TLPs have been found to be involved in antifungal activity [18], antimicrobial activity [19] and carbohydrate binding [20].
Considering the importance of TLP genes, which are associated with various defense, developmental and physiological responses, and the diversity of TLP gene members in different plant species, we set out to investigate the global status and evolution of the TLP gene family in barley and in other cereals used in the brewing industry, especially sorghum (a cereal used for gluten-free beer). There is a lack of information about the status of TLP gene family members in cereal grains. We are specifically interested in TLPs that possess a carbohydrate binding domain, which may interact with different polysaccharide moieties during the germination and malting process. Our data provide the global status of the TLP gene family and its expansion in rice, Brachypodium, sorghum, and barley. The availability of updated genome sequence databases for several plant species enabled us to explore the status of the TLP gene family in cereals.

Materials And Methods
Sequence Retrieval and Identification of Thaumatin-like Proteins in Cereals
The TLP domain sequence was retrieved from the conserved domain database (CDD) (http://www.ncbi.nlm.nih.gov/guide/domains-structures/) and used as a query sequence for a protein BLAST (BLASTp) search (https://blast.ncbi.nlm.nih.gov/Blast) in the NCBI (www.ncbi.nlm.nih.gov) database.
BLASTp was performed against the non-redundant protein sequences database using an E-value cut-off of 1e-10 to retrieve TLP protein sequences for Hordeum vulgare, Oryza sativa, Sorghum bicolor and Brachypodium distachyon. Only the longest gene models were selected for further analysis. In addition, the identified barley HvTLP gene sequences were verified using the IPK barley database (http://webblast.ipk-gatersleben.de/barley/) and the Ensembl Plants database (http://plants.ensembl.org/Hordeum_vulgare/Info/Index) to retrieve their gene IDs. Furthermore, screening was performed, and only those genes having the thaumatin family signature were selected for further analysis. The pipeline for the bioinformatic analysis is shown in Fig. 1B. Identified genes were also confirmed as TLPs using SMART (http://smart.embl-heidelberg.de) and Pfam. Chromosomal location, intron/exon structures, and predicted alternative splice variants of the HvTLPs were determined using the H. vulgare Ensembl database (http://plants.ensembl.org/Hordeum_vulgare/Info/Index). Initially, 32 TLP proteins with the thaumatin signature were identified in GenBank using BLASTp as described above; upon searching these sequences against the Ensembl database, we determined that these 32 amino acid sequences aligned to only 19 genetic loci. In all subsequent analyses, only the first gene isoform, normally denoted with a ".1", was used. [...] are provided in Additional file 2: Table S2 [...] developmental stages, tissues, and inflorescence treatments.
CAR5 expression data were collected from developing grains with bracts removed (5 DPA); ROO from roots of seedlings (10 cm shoot stage); LEA from shoots of seedlings (10 cm shoot stage); EMB from 4-day embryos dissected from germinating grains; CAR15 from developing grains with bracts removed (15 DPA); INF1 from young developing inflorescences (5 mm); INF2 from developing inflorescences (1-1.5 cm); and NOD from developing tillers at the 6-leaf stage, third internode. The heat map of HvTLP transcript abundance was generated using the online MeV tool (http://www.tm4.org/mev.html) with the average hierarchical clustering method.
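The hit filtering and locus collapsing described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' actual pipeline: it assumes BLAST results exported in the standard 12-column tabular format (subject ID in column 2, E-value in column 11), and the function names, protein IDs and locus IDs are hypothetical. The sketch keeps the longest protein per locus; the text also mentions using the first isoform (".1"), which would need a different selection rule.

```python
from typing import Dict, Iterable, List, Tuple

EVALUE_CUTOFF = 1e-10  # cut-off stated in the text


def filter_hits(blast_lines: Iterable[str], cutoff: float = EVALUE_CUTOFF) -> List[str]:
    """Return unique subject IDs whose E-value is at or below the cut-off.

    Assumes 12-column tab-separated BLAST output (an assumption of this sketch).
    """
    kept: List[str] = []
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        subject_id, evalue = fields[1], float(fields[10])
        if evalue <= cutoff and subject_id not in kept:
            kept.append(subject_id)
    return kept


def longest_per_locus(proteins: Iterable[Tuple[str, str, int]]) -> Dict[str, str]:
    """Collapse proteins to loci, keeping the longest protein for each locus.

    Input tuples are (protein_id, locus_id, length); all IDs are hypothetical.
    """
    best: Dict[str, Tuple[str, int]] = {}
    for protein_id, locus_id, length in proteins:
        if locus_id not in best or length > best[locus_id][1]:
            best[locus_id] = (protein_id, length)
    return {locus: pid for locus, (pid, _) in best.items()}


# Hypothetical example: three hits, one above the cut-off
example = [
    "query\tP1\t90.0\t200\t5\t0\t1\t200\t1\t200\t1e-50\t300",
    "query\tP2\t40.0\t80\t30\t2\t1\t80\t1\t80\t1e-5\t50",
    "query\tP3\t85.0\t210\t8\t0\t1\t210\t1\t210\t2e-45\t280",
]
print(filter_hits(example))
print(longest_per_locus([("P1", "locusA", 200), ("P3", "locusA", 250)]))
```

Collapsing by locus is the step that would reduce a redundant protein list (such as the 32 barley sequences) to a smaller set of genetic loci (19 in this study).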

Plant Material and Growth Conditions
Two barley varieties (malt: Morex; feed: Steptoe) were used as experimental material. Mature barley seeds were surface sterilized with 20% NaOCl and 70% ethanol. They were germinated in the dark on wet filter paper in sterile petri dishes at 21°C. Germinating seeds were collected at 16, 48, and 96 h, flash frozen in liquid nitrogen, and stored at -80°C until further processing.
Total RNA Isolation and cDNA Synthesis
Total RNA was extracted from the germinating seed tissue using the Spectrum Plant Total RNA kit according to the manufacturer's instructions (Sigma-Aldrich). cDNA was synthesized from 500 ng of DNaseI-treated total RNA using the AffinityScript QPCR cDNA synthesis kit according to the manufacturer's instructions (Agilent Technologies, USA).

RT-PCR analysis
Primers were designed for five HvTLPs using the IDT PrimerQuest tool (https://www.idtdna.com/PrimerQuest/Home/Index) (Additional file 3: Table S3). Transcript abundance was determined by RT-PCR analysis using GoTaq® G2 Green Master Mix. The amplification conditions were 95°C for 2 minutes, followed by 30 cycles of 95°C for 30 seconds, annealing for 30 seconds (temperature adjusted according to the primers provided in Additional file 2: Table S2) and 72°C for 30 seconds, with a final extension at 72°C for 5 minutes. β-ACTIN was used as an expression control [28].

Identification of TLP Gene Family in Cereals
A total of 19, 28, 35 and 37 TLP genes were identified in barley, Brachypodium, sorghum and rice, respectively (Fig. 1A). The TLP domain was retrieved from CDD and used as a query to perform BLASTp in NCBI to identify TLP genes in barley, rice, sorghum and Brachypodium. Sequences of TLP candidate genes were confirmed by Pfam (Domain number: PF00314) and SMART for the presence of the thaumatin family signature. Using this approach, 32 TLP genes were identified for barley; using the Ensembl barley BLAST database, it was determined that these 32 genes corresponded to only 19 genetic loci. The primary amino acid sequence associated with each locus was used for further analysis. Of the 19 genes in barley, 8 have been previously reported (HvTLP1-8) [1]. The coding sequences of the eleven new HvTLP genes ranged from 522 to 1080 base pairs.

Gene Structure Analysis and Identification of Thaumatin Signature in Barley TLPs
Classification of the 19 barley TLPs by number of exons resulted in four groups (I-IV) (Fig. 3). The number of exons in HvTLP genes ranged from 1 to 4. Eleven TLPs (HvTLP2, 3, 4, 6, 8, 10, 11, 12, 13 and 14) having one exon were classified in Group I, and Group II includes six HvTLPs (HvTLP1, 5, 6, 9, 16, 17 and 18). Group III (HvTLP18) and Group IV (HvTLP19) each contained one gene (Fig. 1B). It is worth noting that only 8 TLP genes have been reported previously in barley [1]. The major focus of our study was the genome-wide exploration of TLPs in barley, especially during germination, because our previous work established that a TLP, HvTLP8, is differentially expressed during germination in malt and feed varieties and plays an important role in sequestering β-glucan during the malting process [1]. Barley grain contains β-glucan as a non-starch polysaccharide, and a higher quantity of it in the grain affects the brewing process [29]. HvTLP8 possesses a carbohydrate binding domain with the binding motif CQTGDCGG, which allows it to bind β-glucan in a redox-dependent manner [1]. In our data, we identified two other barley TLP genes that also contain the binding motif, which suggests that they interact with carbohydrate moieties; this needs further investigation. Previously, 44 TLP genes were reported in rice [15]. However, our careful analysis indicates that only 37 rice genes are true TLPs, as judged by the presence of the thaumatin family signature.
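Screening protein sequences for the carbohydrate binding motif CQTGDCGG mentioned above can be sketched in a few lines of Python. This is an illustrative exact-match scan only; the function name and the test sequence are hypothetical, and the study's own motif analysis may have used different tools or allowed degenerate positions.

```python
import re

# Exact motif reported for HvTLP8 in the text
CARB_BINDING_MOTIF = re.compile("CQTGDCGG")


def find_motif(protein_seq: str) -> list:
    """Return 1-based start positions of the motif in a protein sequence."""
    return [m.start() + 1 for m in CARB_BINDING_MOTIF.finditer(protein_seq.upper())]


# Hypothetical sequence fragment, not a real HvTLP
print(find_motif("AAACQTGDCGGAAA"))  # → [4]
```

An empty list means the motif is absent, so filtering a set of candidate TLPs reduces to keeping those sequences for which `find_motif` returns at least one position.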
As described above, we identified 11 new TLPs in barley. Previously, the eight known barley TLPs were classified into two groups (group 1 and 2) based on the number of cysteine residues (10 and 16) and were reported to be localized only on chromosomes 4H, 5H and 7H [1]. However, our data show that TLP genes can also be assigned to chromosomes 1H and 3H (Fig. 6). TLPs are generally considered cysteine-rich proteins, and a maximum of 16 cysteines has been reported. However, our data revealed a higher number of cysteine residues in some TLPs; for example, HvTLP17 contains 24 cysteine residues (Table 1). It is well documented that cysteine residues form disulfide linkages, which stabilize proteins, especially when exposed to extremes of pH and temperature or to protease degradation [6]. Plant TLPs are documented as proteins of 21-26 kDa. However, the calculated molecular weight of the new TLPs identified in the present study reaches up to 41 kDa. Moreover, TLP genes have been identified in a wide range of plants, from mosses to wheat, which has a complex hexaploid genome [5,30], suggesting that this gene family has expanded in different plant species during evolution. To better understand the diversification of this gene family in small grain cereals, we performed a phylogenetic analysis of predicted TLP protein sequences from rice, sorghum, Brachypodium and barley (Fig. 2A). A total of 119 TLP proteins from the four plant species were classified into nine groups. The largest number (34) of TLP proteins clustered in group nine (Fig. 2B). Previously, 44 TLP genes were reported in rice [15]. That identification was based on keyword searches of the rice genome, and some of the reported genes did not even have the thaumatin family signature.
However, using our approach to gene identification, as explained in the Materials and Methods section, we found only 37 true TLP genes having the thaumatin family signature in rice. Based on our analysis, the Oryza sativa genome has nearly twice as many TLP genes (37) as the barley genome (19). This could be due to recent segmental and whole genome duplication events in rice [31,32]. However, our data suggest that the expansion in rice, barley, Brachypodium and sorghum is probably due to localized gene duplications, since many small TLP groups located in close proximity on the same chromosome have high sequence similarity. This situation can be clearly observed for the TLPs located on chromosome 5H (Fig. 6). Localized gene duplication events may therefore be the main reason for TLP gene family expansion in barley.
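The positional part of this argument, genes lying close together on the same chromosome, can be illustrated with a short Python sketch that groups neighboring genes into candidate clusters. Everything here is an assumption for illustration: the function name, the gene IDs, the coordinates and the 200 kb distance threshold are hypothetical, and the text does not state the criteria actually used. Sequence similarity within each cluster would still have to be checked separately.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def proximity_clusters(
    genes: Iterable[Tuple[str, str, int]], max_gap: int = 200_000
) -> List[Tuple[str, List[str]]]:
    """Group genes on the same chromosome whose start positions lie within max_gap bp.

    Input tuples are (gene_id, chromosome, start). Only clusters with at least
    two genes are returned. The 200 kb default is an arbitrary illustrative value.
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom, start in genes:
        by_chrom[chrom].append((start, gene_id))

    clusters: List[Tuple[str, List[str]]] = []
    for chrom in sorted(by_chrom):
        current: List[str] = []
        last_start = None
        for start, gene_id in sorted(by_chrom[chrom]):
            # A gap larger than max_gap closes the current cluster
            if current and start - last_start > max_gap:
                if len(current) > 1:
                    clusters.append((chrom, current))
                current = []
            current.append(gene_id)
            last_start = start
        if len(current) > 1:
            clusters.append((chrom, current))
    return clusters


# Hypothetical coordinates, not taken from the barley genome
toy_genes = [
    ("geneA", "5H", 1_000),
    ("geneB", "5H", 50_000),
    ("geneC", "5H", 5_000_000),
    ("geneD", "1H", 10),
]
print(proximity_clusters(toy_genes))
```

With these toy coordinates only geneA and geneB form a cluster; geneC is too far away and geneD is alone on its chromosome.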
Intron structures are also very important in determining the complexity of the genetic structure of eukaryotic organisms [33]. Notably, 10 HvTLP genes were found without introns, whereas the remaining HvTLP genes contained at least one intron (Fig. 3). This result suggests that variation in the number of introns/exons among HvTLP genes might play an important role in controlling their function during growth and development of barley.
Ambient temperature causes alternative splicing that functions as a molecular thermometer in plants.
Recently, alternative splicing of SPL genes was identified in barley [34], and the splice variants were shown to accumulate at different levels during the transition from vegetative to reproductive phases. Alternative splicing is also involved in seed germination in barley [35]. Further, alternative splicing of FT genes, which regulate flowering, was identified in Brachypodium distachyon [36]. ARF8.4, a splice variant of AUXIN RESPONSE FACTOR 8, is involved in stamen development in Arabidopsis [37]. Likewise, we investigated alternative splicing events in HvTLP genes and found that about 83% of HvTLP genes produce splice variants, implying possible diverse roles in barley growth and development (Additional file 4: Table S4).
Gene expression gives a clue to the possible functions of genes in the absence of mutants. Therefore, we examined the spatio-temporal expression patterns of HvTLPs in eight different tissues of barley (Fig. 4). The heatmap-based transcript profiles showed that HvTLP expression differed among tissues; however, HvTLP1, 2, 4, 5, 6, 7, 8, 9, 18 and 19 had higher expression in EMB (Fig. 4), which suggests a possible role in embryo development.
Seed germination is a key step in the process of malting. In our experiments, HvTLP4 and HvTLP7 had elevated levels of expression at all stages in both malt and feed varieties, suggesting a possible role during seed germination. However, expression of HvTLP5 and HvTLP6 was higher at 48 h of germination only in the malting variety, which indicates that these genes may specifically influence the malt biochemistry of the seed (Fig. 5) and could be considered new candidate genes for breeding barley varieties useful for malting and brewing. Previously, we identified important genes involved in germination, dormancy [38] and the malting process [1]. The publicly available Morex RNA-seq database currently contains no data for germinating grains. We therefore validated HvTLP gene expression by measuring transcript abundance in germinating barley grains of malt and feed varieties at different stages.
Our motivation to conduct these experiments includes our recent data, in which differential expression of HvTLP8 was associated with the level of β-glucan in germinating barley grains [1].
Reduction of β-glucan is an important objective in breeding malting varieties of barley.
The association of TLPs with β-glucan could also lead to the development of high-value, high-fiber cereals by knocking out their expression using new CRISPR-based approaches.

Conclusion
Our results provide novel information about the status of the TLP gene family in cereals, for which knowledge is scarce. Owing to the availability of sequencing data and new tools, we were able to identify previously unknown TLPs. Interestingly, some of these TLPs possess a higher number of cysteine residues than previously thought. One of the cysteine-rich TLPs, HvTLP8, has been found to be associated with β-glucan, and the interaction was found to be dependent on the [...]

II extracted the sequences, analyzed them, and wrote the paper; RKT carried out the RNA-seq analysis, wrote parts of the manuscript and edited it; OW conceived the project along with JS, wrote parts of the paper and edited it; JS conceived and supervised the project, and helped to write the paper.

Transcript abundance of HvTLP genes at different stages of seed germination. Gel representation of the expression profiles of HvTLPs (1, 2, 4, 5, 6, 7, 9, 13, 14, 16, 17 and 18) and the housekeeping genes β-actin and GAPDH at 16 h, 48 h and 96 h of germination in malting (MX = Morex) and feed (ST = Steptoe) varieties of barley using RT-PCR. A 1.2% agarose gel was used to resolve the amplified bands.