Identification and testing of reference genes for qRT-PCR analysis during pear fruit development

Quantitative real-time PCR (qRT-PCR) is currently one of the most reliable and improved tools for analyzing gene expression. Various studies have shown that housekeeping genes vary with cultivars, tissues and treatment. Reliable and stable reference genes were necessarily identified and evaluated according to different experimental requirements. In this study, 10 candidate reference genes were initially screened based on transcriptome sequencing data of four pear fruit development stages for three pear cultivars, including a candidate housekeeping gene PbrTUB. Furthermore, we ranked the expression stability of 10 candidate reference genes using GeNorm, NormFinder, BestKeeper and ReFinder algorithms. The results showed that Pbr028511, Pbr038418 and Pbr041114 were the most stable reference genes in Cuiguan, Housui and Xueqing fruit, respectively. Taken together, these results serve as a useful reference for gene function investigations and molecular mechanism studies in fruit development and ripening for various pear cultivars.


Introduction
Gene expression analysis is used to verify mRNA transcription levels of target genes and to explore novel gene functions in various biological processes, such as growth, development, signal transduction, stress responses and metabolism. Compared with other gene expression detection methods, quantitative real-time PCR (qRT-PCR) has become one of the most reliable and improved upon for studying gene transcript levels on account of its high accuracy, sensitivity and specificity (Bustin et al. 2005). However, the accuracy and reliability of this technology are affected by a variety of other factors, such as RNA quality, number of replicates, primer amplification efficiency, and suitability of reference genes (Derveaux et al. 2010;Die and Roman 2012;Svec et al. 2015). The most general approaches of qRT-PCR normalization that enhance the assay accuracy include application of a normalization step and internal reference genes or housekeeping genes (Guenin et al. 2009).
Ideal reference genes should exhibit relatively stable and consistent expression levels in different cultivars, tissues and conditions. However, no absolute universal reference genes have been reported until now. In previous studies, the traditional housekeeping genes, such as tubulin (TUB), actin (ACT ), ubiquitin (UBQ), elongation factor 1-α (EF1-α), 18S ribosomal RNA (18S rRNA), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), were directly selected to standardize the results of the qRT-PCR assay (Chen et al. 2015;Sarwar et al. 2020;He et al. 2021). Nevertheless, these housekeeping genes do not always show constant expression in variable conditions. Invalid or unstable reference genes can lead to erroneous conclusions in certain situations (Jain et al. 2006;Guenin et al. 2009). For example, several reports have shown that EF1-α, ACT and TUB were not stably expressed consistently (Nicot et al. 2005;Gutierrez et al. 2008;Hong et al. 2008). Housekeeping genes may exhibit varied expression levels in different developmental stages in Lycopersicon esculentum Mill. (Exposito-Rodriguez et al. 2008). In addition, some studies pointed out that a single reference gene cannot satisfy with the experimental requirements (Tong et al. 2009;Chen et al. 2011). Therefore, it is necessary to screen and validate stable reference genes in order to accurately quantify target genes in diverse cultivars and experimental backgrounds during qRT-PCR normalization analysis.
Pear (Pyrus) is identified as one of the most important temperate fruit species with high economic value in the world (Wu et al. 2013). Multiple internal and external factors participate in the characteristics and biological processes of pear, including pollen growth (Wang et al. 2016;Tang et al. 2020), self-incompatibility (Wang et al. 2017;Chen et al. 2018), seed germination (Qi et al. 2019), fruit development, and senescence (Hao et al. 2018;Gu et al. 2020). These factors also affect the expressions of related genes. Therefore, accurate and reliable analysis of expression patterns helps to reveal gene functions and related molecular mechanisms. However, characterization of reference genes in pear has only been reported in limited tissues and cultivars (Wu et al. 2012;Chen et al. 2015;Liu et al. 2018;Wang et al. 2018;Wang et al. 2019;Chen et al. 2020a;Chen et al. 2020b). Accordingly, there is demand for identification of appropriate reference genes in pear to obtain reliable and accurate gene expression analysis data.
In this study, we measured the expression stability of 10 candidate reference genes, employing the RNA-seq data for four development stages of Cuiguan, Housui and Xueqing pear fruit. In addition, three software packages, including geNorm, NormFinder, BestKeeper, along with RefFinder, an online tool, were utilized to evaluate the expression stability of the 10 candidate reference genes in three pear varieties at different developmental stages. Our results provide reliable reference genes for qRT-PCR normalization analysis in Cuiguan, Housui and Xueqing pear fruit. This will contribute to both expression pattern analysis of targeted genes and discovery of the breeding molecular mechanism.

Plant materials and experimental treatments
All 36 samples of the three pear cultivars were collected from the pear germplasm orchard of the Pear Engineering Technology Research Center of Nanjing Agricultural University in Nanjing, China. The fruits of pear cultivars Cuiguan (cg), Housui (hs) and Xueqing (xq) were collected from the fruitlet to ripening stages, including 10 cg (C1-10), 13 hs (H1-13) and 13 xq (X1-13). Four fruits at different stages (fruitlet, early enlargement, later enlargement, and ripening) were used for transcriptome sequencing. The flesh was ground into powder, then frozen in liquid nitrogen and stored at −80 °C until use.

Candidate reference gene selection and primer designing
Based on fruit transcriptome sequencing data of different pear cultivars from our laboratory, 10 candidate reference genes, including nine novel genes and one housekeeping gene, were applied to normalize and validate qRT-PCR experiments according to their RPKM and fold change values ( Table 1). The primers of the candidate reference genes were designed using Primer Premier 5.0 software with the following parameters: annealing temperature (Tm) of 60-65 °C, GC percent of 45-60%, primer length of 18-28 bp and product length of 150-210 bp. Lin-RegPCR was used to calculate the qRT-PCR primer amplification efficiency of the 10 pairs of primers (Table 1) (Ramakers et al. 2003). All primers were synthesized by Sangon (Nanjin, China); the primer sequences are listed in Table 1. To assay the expression specificity and efficiency of all primers, PCR was performed and the products were analyzed on a 2.0% agarose gel.

Total RNA extraction and cDNA preparation
Total RNA was isolated using the RNAprep Pure Plant Kit (Tiangen, Beijing, China) according to instructions. The Nanodrop ND1000 spectrophotometer (Thermo Scientific) was used to determine total RNA concentration and purity, and RNA samples were then assessed with OD 260/280 > 2.0 and OD 260/230 > 1.8. For each sample, 500 ng of the extracted total RNA were reverse-transcribed with TransScript® One-Step gDNA Removal and cDNA Synthesis SuperMix Kit (TransGen, Beijing, China). RNA extraction and cDNA synthesis from all samples were performed with three biological replicates. Then, a reference gene was used to verify the quality of all cDNA samples by PCR before carrying out qRT-PCR. The results of 2.0% agarose gel electrophoresis showed that all cDNA templates had no genomic DNA contamination.

Quantitative real-time PCR
Quantitative real-time PCR amplification reactions were carried out by Light Cycler 480 (Roche, USA). 20 μL reaction mixtures contained 10 μL SYBR Green I Mix, 5 ng cDNA, ddH 2 O, and a final primer concentration of 0.4 μM. Reaction mixtures were incubated for 10 min at 95 °C for preincubation, followed by 45 amplification cycles of 15 s at 95 °C, 15 s at 60 °C and 20s at 72 °C. After that, the specificity of the primer amplicons was checked by melting curve analysis. All samples had three independent biological replicates with three technical replicates each. Lin-RegPCR was used to estimate the amplification efficiency of the 10 pairs of primers in qRT-PCR (Ramakers et al. 2003). Expression levels of the 10 candidate genes in all samples were determined by their cycle threshold (Ct) values.

Statistical analysis
To rank the expression stability of the 10 candidate reference genes in pear fruit, four publicly available Microsoft Excelbased methods were used: geNorm analysis (Vandesompele et al. 2002), NormFinder analysis (Andersen et al. 2004), BestKeeper analysis (Pfaffl et al. 2004), and comparative Ct methods (Silver et al. 2006). Finally, to select the most stable reference genes according to different algorithms, we compared the candidates based on the web-based comprehensive tool RefFinder (http:// www. leonx ie. com/ refer enceg ene. php) (Xie et al. 2011).
The geNorm analysis enabled the selection of the optimal set of genes using a gene-stability measure (M). The two most stable genes, with the lowest M values, were ranked on the right. On the contrary, the least stable genes with the highest M values were ranked on the left. An  M value of no more than 1.5 for reference gene was the default limit (Vandesompele et al. 2002). The Normfinder software, similar to geNorm, is another Visual Basic Application (VBA) for identifying and ranking the optimal normalization genes from the candidates (Andersen et al. 2004). NormFinder provides a stability value for each gene according to intergroup and intragroup expression variation. For the estimated expression, enabling variation evaluated the systematic error introduced when using a gene for normalization.
BestKeeper, another Excel-based tool, determines the most stable genes according to the coefficient of Genes with the lowest standard deviation (SD) values are the most stable (Pfaffl et al. 2004).
RefFinder is a web-based comprehensive online program that calculates the geometric mean, taking into account the rankings from each aforementioned program to determine the overall final ranking.

Screening of stably expressed genes using transcriptome sequencing data
A total of 28,331 expressed genes were detected (RPKM > 0) for at least one developmental stage in the fruit tanscriptome data of three pear cultivars. Genes with FPKM values lower than 5 were considered to be poor qRT-PCR references because of the difficulty in detecting and quantifying their expression (Stanton et al. 2017). After their removal, 10,236 genes in pear were evaluated in the subsequent studies. The coefficients of variation (CV) of the 10,236 gene expression showed a right-skewed distribution (Fig. 1A). The CV% was distributed between 3.3 and 172, including 2142 genes with CV% < 20 (Fig. 1A), which had relatively stable expression in all developmental stages. The value of CV% < 20 was the basic requirement for reference genes Chen et al. 2020a). Based on transcript abundance and the CV values for gene expression, potential reference genes with FPKM > 40 and CV% < 20 were selected for testing. Finally, a set of 10 candidate genes were selected, including one common housekeeping gene PbrTUB (Pbr042345) and nine novel genes (Pbr002841, Pbr028511, Pbr038418, Pbr016129, Pbr027964, Pbr041114, Pbr000214, Pbr016048 and Pbr018827) (Supplementary Table 1). The expression variability of the selected genes was analyzed in a log 2 RPKM box plot (Fig. 1B).

Identification and characterization of candidate reference genes
A total of 10 candidate reference genes were identified for qRT-PCR normalization from fruit transcriptome sequencing data of various pear cultivars. The details of gene ID, primer sequences, amplicon size, and annealing temperature were listed in Table 1. To identify amplification specificity of primers, agarose gel electrophoresis was performed following PCR. The candidate housekeeping gene PbrTUB was used for comparison. The result showed that all primer pairs had single bands at the expected positions (Fig. 2).

Ct values of candidate reference genes
The average cycle threshold (Ct) values were used to calculate transcript levels of the candidate reference genes in different stages of fruit development. The 10 candidate reference genes displayed a relatively wide range of Ct values, from 22.73 for Pbr002841 to 31.83 for Pbr016129 in the 36 tested samples. In addition, each candidate gene maintained a relatively consistent expression level in all samples (Fig. 3A). Moreover, PbrTUB transcript levels were the most variable with a CT value of 5.83, while Pbr002841 showed the least variability with a 1.54 Ct value (Fig. 3B). Since gene transcript levels were negatively correlated with Ct values, Pbr002841 had higher expression in pear fruit than the other candidate reference genes.

geNorm analysis
Gene expression stability was verified through geNorm analysis, with average expression stability represented by  (Vandesompele et al. 2002). The M values of the 10 candidate reference genes were lower than 1.5, the geNorm threshold considered stable, for all samples (Fig. 4). In the Cuiguan group, Pbr028511 and Pbr027964 had the most stable expressions through developmental stages C1-10, while PbrTUB had the most unstable expression (Fig. 4A). In the Housui group, Pbr002841 and Pbr027964 were the two most stable genes through developmental stages H1-13, while Pbr016048 was the most unstable gene (Fig. 4B). Similarly, in the Xueqing group, Pbr041114 and Pbr018827 genes showed the most stability through developmental stages X1-13 (Fig. 4C).

NormFinder and BestKeeper analysis
NormFinder's geNorm software was used to determine the most suitable internal reference gene. The ΔCt method was performed to directly evaluate the expression stability of candidate reference genes (Andersen et al. 2004). The smaller the value, the better the stability of the reference gene. Pbr002841 had the lowest Ct value in the Cuiguan group, indicating the highest stability (Fig. 5A). Pbr038418 exhibited the lowest Ct value, indicating the highest stability, in both the Housui and Xueqing groups ( Fig. 5B and C). PbrTUB had the highest Ct value in both Cuiguan and Xueqing groups, and was thus the most unreliable candidate ( Fig. 5A and C).
BestKeeper was used to estimate the stability of candidate reference genes via standard deviation (SD) (Pfaffl et al. 2004). In the Cuiguan, Housui and Xueqing groups, the lowest SD values were 0.49, 0.39, and 0.44 for Pbr002841, Pbr038418, and Pbr038418 respectively (Fig. 6). The results showed that these three genes were the most stable in their respective groups. Meanwhile, PbrTUB, Pbr016048, and PbrTUB were the most variable reference genes with the highest SD values of 0.79, 0.56 and 0.74 in the Cuiguan, Housui and Xueqing groups respectively (Fig. 6). These results were consistent with the analysis of NormFinder.

RefFinder analysis
RefFinder is an online tool used to comprehensively integrate the results of geNorm, NormFinder, BestKeeper, and the ΔCt. It ranks candidates on the basis of their geomean constancy. Consistent with the three tools discussed above, the lowest ranking value corresponded to the highest stability. The comprehensive ranking was displayed in Table 2. This integrated tool indicates that Pbr028511, Pbr038418, and Pbr041114 were the most stable reference genes in Cuiguan, Housui and Xueqing fruit, respectively.

qRT-PCR validation
To test the applicability of candidate reference genes experimentally by qRT-PCR, eight genes were selected with contrasting expression patterns in the RNA-seq data (increased or decreased expression patterns) in the three cultivars (Supplementary Table 2). The specific primer pairs of eight selected genes for qRT-PCR validation are listed in Supplementary Table 3. The genes of Pbr028511, Pbr038418, and Pbr041114 were used as reference genes in Cuiguan, Housui and Xueqing fruit, respectively. As expected, the eight genes had nearly identical expression profiles by qRT-PCR than by RNA-Seq analyses (Fig. 7), showing that Pbr028511, Pbr038418, and Pbr041114 are reliable reference genes for gene expression studies.

Discussion
Gene expression patterns were widely used to better analyze gene expression levels and understand their biological functions. Recognized as an effective tool for performing accurate and rapid quantification of target gene expression, qRT-PCR was commonly performed in bioresearch. Generally, traditional housekeeping genes were used to standardize the transcriptional accumulations of target genes such as TUB, ACT , UBQ, EF1-α, and GAPDH. Nevertheless, common reference genes are not consistently stably expressed and thus cannot be applied to all species (Hong et al. 2008;Gutierrez et al. 2008). Based on previous studies, reference genes for fruit development vary among different genera and cultivars, such as TEF2, UBQ10, and RP II in peach (Tong et al. 2009), EF1-α, CKL, and WD40 in apple (Zhu et al. 2019), BPS1 and ICDH1 in pear (Chen et al. 2020b), and RPT6A and RPN5A in strawberry . Therefore, the selection of optimal reference genes for data normalization was critical in qRT-PCR assays. In this study, we identified the appropriate reference genes for fruit development and ripening in the pear cultivars Cuiguan, Housui, and Xueqing.
Transcriptome sequencing analysis is a high-throughput sequencing technology, which provides unbiased test transcripts and increased test specificity Hao et al. 2018;Pei et al. 2020). Therefore, transcriptome sequencing data provides a new resource for screening reference genes at the genome level. Candidate reference genes have been identified via transcriptome data in diverse plant species, such as Oryza sativa L. (Narsai et al. 2010), Fagopyrum esculentum Moench (Demidenko et al. 2011), Brassica napus L. (Yang et al. 2014), and Euscaphis konishii Hayata (Liang et al. 2018). In this study, we selected 10 relatively stable candidate reference genes based on the transcriptome data of three pear cultivars in four different developmental stages (Table 1).
Studies on pear, which is identified as the third largest temperate fruit tree, have progressed with the release of pear (Pyrus bretschneideri Rehd.) genome datasets (Wu et al. 2013). In previous studies, several traditional or novel reference genes were identified and evaluated under various biotic or abiotic stresses and at each stage of development in different tissues of diverse pear cultivars. Ubiquitous housekeeping genes may exhibit inconsistent stability under different conditions (Wu et al. 2012;Imai et al. 2014;Li et al. 2015). For instance, EF1α and TUB-b2 were the most stable in different pear cultivars, GAPDH and EF1α were the most suitable in different tissues, while TUB-b2 and GAPDH were the most stable in different developmental stages (Wu et al. 2012). In addition, it has been reported that different groups of pear tissues have different optimal reference genes depending on the experimental purpose. In pollen, the PbrGAPDH, PbrPP2A, and PbrUBI were suitable reference genes for low temperature, NaCl treatment, and CuCl 2 treatment conditions, respectively. PbrEF1α was a stable reference gene for all abiotic stresses (Chen et al. Table 2 The comprehensive ranking of 10 candidate reference genes in Cuiguan, Housui and Xueqing samples analysed by ReFinder   . In leaf, SKD1 and ARM were the most apropos single reference genes . In pear peel, ACT6/7/8/9 and NAP1 were recommended as the optimal reference genes (Chen et al. 2020a). Of the more well-researched reference genes in pear fruit, PbPDI.F1 displayed the highest expression stability during pear fruit development , the housekeeping gene EF1α members displayed a clearly unstable expression in pear fruit at different developmental stages , SOX2 and PP2A displayed stable expression in pear fruit , and BPS1 and ICDH1 were the high and stable expressed genes (Chen et al. 2020b). In this study, we identified stable and novel reference genes in the fruits of three different pear varieties. Therefore, reliable and accurate reference genes were identified according to different experimental requirements. We used three analysis algorithms (geNorm, NormFinder and BestKeeper) to evaluate the expression stability of 10 candidate genes in different stages of 10 Cuiguan, 13 Housui, and 13 Xueqing pear fruit samples. Although different statistical algorithms and analytical procedures may lead to different stability rankings, most results were consistent according to these three algorithms. Finally, the online tool RefFinder was used to comprehensively integrate all of the rankings ( Table 2). The result showed that Pbr028511, Pbr038418, and Pbr041114 were the most stable reference genes in Cuiguan, Housui, and Xueqing fruit, respectively. In addition, each of the above reference genes possesses more stable expression than PbrTUB in pear fruit in all stages of fruit development and ripening.

Conclusion
In this study, three genes were screened and identified as the most reliable and stable reference genes based on a series of stability analyses in Cuiguan, Housui, and Xueqing pear fruit. Pbr028511 was the most stable reference gene in Cuiguan, Pbr038418 was the most stable reference gene in Housui, and Pbr041114 was the most stable reference gene in Xueqing pear fruit. Therefore, our study provides a series of stable and valuable reference genes that can be applied to exploring gene functions and molecular mechanisms in pear fruit development and ripening.