Identification and evaluation of suitable reference genes for gene expression analysis in rubber tree leaf

Gene expression profiles are increasingly applied to investigate molecular mechanism for which, normalization with suitable reference genes is critical. Previously we have reported several suitable reference genes for laticifer samples from rubber tree, however, little is known in leaf. The main objective of this current study was to identify some stable expression reference genes at various developmental stages of leaf, as well as during abiotic (high and low temperature extremes) and biotic stresses (pathogen stress). Gene expression profilings identified the ubiquitin–proteasome system as excellent potential as reference genes for rubber tree leaf. Among a total of 30 tested genes investigated, 24 new candidate (including 11 genes involved in the ubiquitin–proteasome system), 4 previously identified and 2 specific genes, were further evaluated using quantitative real-time PCR. Our results indicated that the new candidate genes had better expression stability comparing with others. For instance, an ubiquitin conjugating enzyme (RG0099) and three ubiquitin-protein ligases (RG0928, RG2190 and RG0118) expressed stably in all samples, and were confirmed to be suitable reference genes for rubber tree leaf under four different conditions. Finally, we suggest that using more than one reference gene may be appropriate in gene expression studies when employing different software to normalize gene expression data. Our findings have significant implications for the reliability of data obtained from genomics studies in rubber tree and perhaps in other species.


Introduction
Gene expression profiles are widely used to investigate plant processes, including metabolic pathways and regulatory networks that are crucial for plant growth & development as well as in stress responses. These types of genomics-based investigation necessitate appropriate reference genes to normalize gene expression data. Incorrect choice of reference genes in techniques such as real-time quantitative PCR (qRT-PCR), RT-PCR, Northern blotting, ribonuclease protection assay and gene chip can result in the misinterpretation of gene expression data [10,19,47]. An optimal reference gene should be stably expressed in various tissues and under different experimental conditions without being influenced by the experimental conditions used [3,15]. Common reference genes that are used to normalize gene expression data include 18S ribosomal RNA (18S rRNA), actin (ACT), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), tubulin (TUB). However, various studies have indicated that they may be unsuitable as reference genes for normalization due to the effects of certain experimental conditions on their expression [36,52,55]. Many studies have Xiangyu Long and Jilai Lu contributed equally to this work.

Electronic supplementary material
The online version of this article (https ://doi.org/10.1007/s1103 3-020-05288 -8) contains supplementary material, which is available to authorized users. been conducted to screen and identify new reference genes that show stable expression patterns and to determine the appropriateness of genes traditionally used as reference genes in specific species and experimental conditions [2, 9, 20-23, 26, 27, 50, 53].
In rubber trees (Hevea brasiliensis), natural rubber synthesis is the main metabolic activity in laticifers which are highly specialized cells that directly influence the yield of rubber [45]. Most studies in molecular biology, especially those using gene expression profiles, have paid close attention to rubber biosynthesis in laticifers [5, 29-31, 43, 44]. Previously, we have evaluated the 22 reference genes across abundant laticifer samples, including latex regenerations, hormone treatments, tappings, individual trees and genotypes of Hevea [24,32]. As mentioned earlier, the availability of reference genes is critical for gene expression studies involving glycometabolism and hormone regulation of rubber biosynthesis in laticifers [4,30,31,39].
Leaves as photosynthetic tissues (sources) provide the nutrition and energy for rubber biosynthesis [43]. In recent years, more and more attention has been paid to leaf research, including leaf development and disease-resistance. For instance, due to the fact that sucrose is the precursor for rubber biosynthesis, enzymological characterization and gene expression of sucrose metabolizing enzymes were systematically investigated to understand the regulation of sucrose metabolism during leaf development in the rubber tree [59]. In addition, leaf development has been widely studied within the context of leaf diseases, especially South American leaf blight (SALB), which directly influences rubber yield [11,25]. However, as mentioned previously, little is known about reliable reference genes in leaves in order to normalize gene expression data in leaves.
In this study, two transcriptome sequencing profiles of leaves were screened for appropriate reference genes [11,18]. The expression of selected genes was verified under four different experimental conditions using qRT-PCR. Our results demonstrated that the ubiquitin-proteasome system had significant potential as reference gene in rubber leaf tissues, and which genes had different expression stability under four experiments. Another important finding from this study was that the choice of algorithm used to normalize gene expression data had a significant impact on the results. Our findings were also discussed within the context of using the appropriate reference gene to normalize gene expression data, which has the potential to significantly affect results from gene expression profile studies.

Plant materials and treatments
The Hevea cultivars in Hainan Province, Reyan7-33-97 (synonym: CATAS7-33-97 or RY7-33-97), were provided by the experimental plantation of CATAS (Chinese Academy of Tropical Agricultural Sciences, Danzhou, Hainan, China). One-year-old tissue cultured Reyan7-33-97 plants were selected for the collection of six developmental stages of leaf, including bud, bronze, color-change, pale-green, bright green (mature) and aging. In addition, 6-month-old tissue cultured Reyan7-33-97 plants were used for our biotic and abiotic stress experiments. For the biotic stress experiments, leaves were collected at 0 h, 12 h, 24 h, 48 h, 3days and 5days after inoculating with the pathogenic fungus Corynespora cassiicola HccYN57. In the case of abiotic stress experiments, plants were treated with low (4 °C) and high (45 °C) temperatures in a growth cabinet with relative humidity of 75-90% and lighting of 16 h per day, and their leaves were collected at 0 h, 3 h, 6 h, 9 h, 12 h, and 24 h after the imposition of stress. Each treatment had biological replicates, and all collected leaves were frozen and stored at − 70 °C until they were used for RNA isolation.

Total RNA isolation and cDNA synthesis
Total RNA was extracted from leaf tissue using PureLink™ Plant RNA Reagent (Thermo, USA) following the manufacturer's instructions. Genomic DNA was removed after digestion with DNase 1 (Thermo, USA) also according to manufacturer's protocols. RNA concentration and its integrity were determined using a spectrophotometer NanoDrop 2000 and using agarose gel electrophoresis. cDNA was synthesized by reverse transcription using PrimeScript™ RT reagent Kit with gDNA Eraser (Perfect Real Time, Takara, Japan), as per the manufacturer's protocols. Newly synthesized cDNA was diluted tenfold for use in qRT-PCR experiments.

Collecting candidate reference genes from transcriptome sequencing data
In this study, we focused on two reported transcript profiles for screening new candidate reference genes for leaf from rubber tree database. And the one transcript profile had four different stages of leaf development, including bronze, color change, pale-green and bright green [11]. The other one was temperature treatment of leaf that the rubber trees were treated with 4 °C and 45 °C, for 0 h, 6 h and 12 h [18]. Genes were selected as candidate reference genes if they showed 1 3 reads per kilobase per million mapped reads (RPKM) fold change < 1.5. Functional annotations and classifications of these genes were carried out using BLAST NCBI (https :// blast .ncbi.nlm.nih.gov/Blast .cgi) and KEGG (https ://www. genom e.jp/kegg/pathw ay.html). Finally, 24 genes were chosen as new candidate reference genes to further validate expression stability using qRT-PCR.

Primer design and qRT-PCR assays
A total of 30 tested primer pairs were designed using the OligoAnalyzer web service (https ://sg.idtdn a.com/calc/analy zer), for the 24 new candidate reference genes, 4 previously used reference genes (Eukaryotic translation initiation factor 1A, eIF1Aa; DEAD/DEAH box helicase, RH8; Ubiquitinprotein ligase, UBC2a and UBC2b) and 2 specific genes (ATP citrate lyase, RG1588154 and RG65560). PCR reactions were performed on a CFX96 Touch Real-Time PCR (Bio-Rad, USA), using the SYBR® Premix Ex TaqTM II (Perfect Real Time, TaKaRa, China). Each sample had three technical replicates, and the PCR reaction conditions were as previously described [32]. Data analysis was carried out using CFX Manager (Bio-Rad, USA), including the melting curve, quantification cycle values (Ct), PCR efficiency (E) and correlation coefficients (R 2 ).

Data analysis
The arithmetic mean, geometric mean, standard deviation and variation coefficient of quantification cycle values (Ct) were calculated and statistically analyzed using SPSS 13 (https ://www.spss.com/). Four software with different algorithm, NormFinder, GeNorm, Bestkeeper and comparative delta Ct method, were used to analyze expression stability of reference genes, according to their instructions (RefFinder, https ://reffi nder.net/) [1,38,41,48]. Minitab 15 software (https ://www.minit ab.com/) was used to calculate the Pearson correlation coefficients for ranking results from four different evaluation algorithms.

Selection of candidate reference genes
In this study, we used two previously reported [11,18] transcript profiles of rubber tree in order to select new candidate reference genes from leaf, including developmental stages (bronze, color change, pale-green and bright green) and abiotic stress (high and low temperature). When the gene expression data was analyzed for those genes which the expression change less than 1.5 fold, we observed that there were 3,673 and 1,987 transcripts that could be selected from the two experiments, respectively. Among those, 82 genes were observed to exhibit little or no change of expression in both experiments (Fig. S1). In order to identify additional suitable candidate genes, BLASTx was used for gene functional annotation against NCBI and KEGG databases, which resulted in 62 annotated genes involved in five pathways (Table S1). Within those pathways, the genetic information processing pathway was the most represented (39 genes, 47.56%), and included transcription machinery (16 genes, 19.51%), ubiquitin system (13 genes, 15.85%), membrane trafficking (6 genes, 7.32%), chromosome (3 genes, 3.66%) and protein processing (1 gene, 1.22%). Others that were represented included metabolism (11 genes, 13.41%), environmental information (6 genes, 7.32%), organismal system (3 genes, 3.66%) and cellular processes (3 genes, 3.66%) (Fig. 1). These results provided us with the identities of several genes that could be further analyzed using qRT-PCR to narrow down those that may serve as reliable reference genes.

Expression profiles of candidate reference genes
Based on the expression and annotation results, 24 genes were selected as candidate reference genes which were involved in basic metabolic pathways or had the most stable expression in transcript profiles. These genes represented the transcription machinery (9 genes), ubiquitin system (11 genes), specific protein (1 gene), amino acid metabolism (1 gene), nucleotide metabolism (1 gene) and uncharacterized protein (1 gene). The identities of the genes selected for detailed characterization of expression profiles are provided in Table 1. In addition, 4 previously used reference genes (Eukaryotic translation initiation factor 1A, eIF1Aa; DEAD/DEAH box helicase, RH8; Ubiquitin-protein ligase, UBC2a and UBC2b) and 2 specific genes (ATP citrate lyase, RG1588154 and RG65560) were analyzed as control to ascertain new candidate genes ability to serve as reliable reference genes (Table 1). Thus, a total of 30 genes were evaluated for their expression stability in 60 leaf samples including various developmental stages, high and low temperature stress, and pathogen stress.
qRT-PCR results showed that the amplification efficiencies (E) for these genes ranged from 0.85 to 1.05, and the correlation coefficients (R 2 ) ranged from 0.997 to 1.000 (Table 1). After removing missing and inconsistent data, the mean Ct value for each gene was calculated for all samples to determine expression levels ( in expression levels between these two genes at either end of the expression spectrum. With respect to variation in levels of expression between different samples, the gene RG1039 exhibited the lowest variability (CV, 5.95) and the gene RG65560 was observed to be the most variable (CV, 11.15).

Assessment of delta Ct, BestKeeper, normfinder and genorm
Four different software packages with different algorithms, Delta Ct [41], BestKeeper [38], Normfinder [1] and Genorm [48], were used to evaluate the expression data for the 30 genes in all samples. Our results suggested that Delta Ct, Normfinder and Genorm provided very similar results for gene expression, specifically the same ranking from 19 to 30. Furthermore, these three software packages exhibited high correlation coefficients (R 2 ), ranging from 0.9259 to 0.9829 further illustrating the fact that they generally yield similar results (Fig. 3). BestKeeper had the lowest R 2 ranging from 0.066 to 0.080 when compared with other software packages, which indicated an inconsistent ability to accurately calculate gene expression data (Fig. 3). In conclusion, among the software packages evaluated in this study, Delta Ct, Normfinder and Genorm provided more reliable quantification of gene expression when compared with BestKeeper.

Expression stability of candidate reference genes
Our results suggested that there were obvious differences in gene expression during different developmental stages, high and low temperature challenges, as well as pathogen stresses for all genes tested, when Delta Ct, Normfinder and Genorm were used for analysis of the data (Table 3). Based on ranking of expression stability, high and low temperature stress experiments were clustered together, indicating they have similar expression patterns (Fig. S2). Furthermore, in temperature stress experiments, the five most stable genes were RG0623, RG0099, RG1206, RG1293 and RG0470 in high temperature stress treatment, and rUBC2A, RG0427, RG0099, RG1200 and RG1121 in low temperature stress treatment (Table 3). In contrast, the expression of both RG0239 and RG0544 was unstable under temperature stress (Table 3). In disease stress, RG0677, RG1039, RG2190, RG0623 and RG0118 were found to be the most stably expressed genes, whereas RG1376, RG0544, RG1478, RG0470 and RG1206 were the least stable genes. Finally, during leaf development, RG0782 showed the most stability in expression, followed by rEIF1Aa, RG0427, RH8 and RG1121 and the genes RG1478, RG0544, RG1293, RG65560 and RG1206 exhibited lower stability of expression (Table 3).
Taken together, the gene expression data from our four experiments indicate that the genes RG0099, RG0427, RG0928, RG2190, RG0118, RG0677, rEIF1Aa, RG1121, RG0623 and RG0354 were the top ten most stable candidate reference genes whereas the genes RG1478, RG1588154, RG0544, RG1206 and RG65560 were the least stable (Table 3). In top ten stable genes, four experiments only shared one gene, RG0099, that was ranked 2nd in high temperature stress, 3rd in low temperature   (Fig. 4, Table 3). In contrast, RG0554 was considered to be unstable in expression, ranking 30th in high temperature stress, 29th in low temperature stress, 27th in developmental stages and disease stress (Fig. 4, Table 3). Among the previously used four reference genes, rEIF1Aa showed a better stability of gene expression in all samples with an overall ranking of 7th. Two ATP citrate lyase genes, RG1588154 and RG65560, were identified to be the least stable, ranking as 27th and 30th respectively when all four experiments are considered.

Normalization using selected reference genes
To validate further, two most and least suitable reference genes were selected as inter control to normalize two specific genes (RG1558415 and RG65560) expression under development and temperature treatments of leaves. Among three treatments, the RG1558415 and RG65560 expressions with the most stable reference genes normalization had similar pattern with transcriptome data, but had different or opposite pattern with unsuitable normalization (Fig. 5). The results showed that the RG1558415 and RG65560 expression presented earlier increase and later decrease trends  following leaves development, and slightly fall down under high temperature stress, and no change (< 1.5 folds) under low temperature stress. These results had again proved it is necessary to select suitable reference genes for normalization to gene expression analysis in different treatments.

Discussion
qRT-PCR is increasingly and widely used for the analysis of gene expression because of the convenience, rapid turnaround and capacity for high throughput analysis [19,47]. However, in order to ensure the accuracy of gene expression information obtained through qRT-PCR analysis, it is critical that appropriate reference genes are used to normalize gene expression [47]. Significantly different results have been obtained for the same reference genes in different studies, suggesting that not choosing appropriate genes as reference genes may lead to poor results from gene expression experiments [12,16,19,48]. Therefore, the identification, selection, and validation of reference genes for use in qRT-PCR experiments are extremely important. In this study, 82 genes were preliminarily selected from leaf transcriptome profiles [11,18], as candidate reference genes for gene expression analysis in rubber tree (Fig. S1). Among the annotated genes, two broad categories were observed to stably express during leaf development and temperature stresses (Table S1). These involved the ubiquitin-proteasome system (13 genes) and transcription machinery (15 genes) (Fig. 1). The ubiquitin-proteasome system is a conserved pathway present in all eukaryotes that mediates ubiquitination and subsequent proteasomal degradation of target proteins [13,40,56]. Three key enzymes belonging to this pathway, ubiquitin activating enzyme, ubiquitin conjugating enzyme and ubiquitin-protein ligase are increasingly being considered as candidate reference genes because of their regulatory functions in wider biological processes including cell cycle, growth and apoptosis [28,35,37,42,49,57]. In the case of genes associated with transcription machinery, some genes related to the translation of mRNA are considered to be appropriate reference genes in many species. These genes include those coding for ribosomal proteins and eukaryotic translation initiation factor [14,17,51]. Surprisingly, results from our study did not include other traditionally used reference genes, for example, actin, tubulin and glyceraldehyde-3-phosphate dehydrogenase (GAPDH).
Compared with specific and traditional genes, there was no doubt that most of new reference genes exhibited  Table 3). As stated earlier, among the top five most stable reference genes, four were from the ubiquitin-proteasome system, one ubiquitin conjugating enzyme (RG0099) and three ubiquitin-protein ligases (RG0928, RG2190 and RG0118). Additionally, a traditional reference gene, eIF1Aa, was also observed to stably express in all samples (ranked 7th), especially in leaf developmental stages (ranked 2nd). In our previous studies, it has been observed that four eukaryotic translation initiation factors and five ubiquitin-protein ligases exhibited higher stability of expression in latex [24,32]. In other species, ubiquitin conjugating enzyme, ubiquitin-protein ligases and eukaryotic translation initiation factors have also been identified as appropriate reference genes for various expression experiments, including in different tissues, different developmental stages, biotic and abiotic stresses in Arabidopsis [8], Oryza [34], Elaeis [54], Eucommia [57], Populus [46], Eleusine [6]. These observations in the literature further support the utility of these genes as reliable reference genes in similar studies. As stated above, our results suggest that ubiquitin-proteasome and eukaryotic translation initiation factor have the potential as reference genes for gene expression studies in rubber tree. The expression of two specific ATP citrate lyase genes, RG1588154 and RG65560, as control, fluctuated significantly in leaf samples (Fig. 5), presumably due to the ATP citrate lyase playing an important role in substance and energy metabolism [58]. So, both of them had low-ranking of expression stability compared with new and traditional reference genes. In addition, the results from our current study indicate that those selected stable reference genes had obviously different stability of expression in different leaf developmental stages, high and low temperature stress, stress. Abbreviation B. Normalization using two better stable reference genes; Abbreviation W. Normalization using two worse stable reference genes; Abbreviation T. Expression from transcriptomes and disease stress, implying that there were no universal and common reference genes that could be used in different experiments. Similar observations have been made in both Arabidopsis and Capsicum, where it was observed that many genes outperformed traditional reference genes, and had variably stable expression in different developmental stages and environmental conditions [7,8]. It is absolutely necessary that the expression stability should be evaluated to select the suitable reference gene according to particular experiment, and better to choose multiple reference genes for normalization to improve the reliability of expression analysis [1,19,33].
It is interesting to note that three softwares Delta Ct, Normfinder and Genorm provided more similar ranking of expression stability, and their outcomes were more reliable than BestKeeper. Previous research also had reported that they were considered as useful and common softwares for selecting the suitable reference genes [32,33]. Both GeNorm and Delta Ct method use pair-wise comparison for standard deviation between one and all other ones to rank stability of expression, and could not distinguish co-regulation reference genes [41,48]. However, NormFinder takes into account intra-group and inter-group variation to evaluate the expression stability, and could control the influence of co-regulation [1]. Therefore, it is better to use Delta Ct, Normfinder and Genorm together to evaluate expression stability of reference genes, because each type of softwares has its own risks and benefits according to different algorithms.

Conclusions
The most interesting finding from our current study was that ubiquitin-proteasome system had great potential as reference genes in rubber tree, with little influence in leaf developmental stages, high and low temperature stress. New reference genes were isolated and evaluated from leaf, and had better stability of expression comparing with traditional ones. One ubiquitin conjugating enzyme (RG0099) and three ubiquitin-protein ligase (RG0928, RG2190 and RG0118) stably expressed in all detected samples, and were confirmed as suitable reference genes in leaf. In addition, the results of this study also respectively provided optimum reference genes for leaf developmental stages, high and low temperature stress, and disease stress. Another important finding was that Delta Ct, Normfinder and Genorm provided more reliable outcomes than BestKeeper, mainly because their own risks and benefits according to different algorithms. Finally, to increase the reliability of expression results, it is suggestion that more than one reference gene are selected and evaluated for normalization using different assessing softwares.