Evaluation And Validation of Reliable Reference Genes For Quantitative Real-Time PCR Analysis of The Gene Expression In Macadamia Integrifolia


 Background: Macadamia integrifolia, a new economically important crop, the kernel oil is rich in bioactive compound and monounsaturated fatty acid. Gene expression analysis of qRT-PCR is beneficial to understand the complex regulatory networks of macadamia.Results: In this study, the expression stability of 11 traditional housekeeping genes including α-tubulin (TUBa), β–tubulin (TUBb), malate dehydrogenase (MDH), 18S ribosome RNA (18S), glyceraldehyde-3- phosphate dehydrogenase (GAPDH), α-elongation factor 1 (EF1a), β- elongation factor 1 (EF1b), ubiquitin (UBQ), ubiquitin-conjugating enzyme (UBC), cyclophilin (CYP) and actin (ACT) were accessed by qRT-PCR in macadamia seedlings under different experimental conditions and tissues. The expression stability of the 11 reference genes was evaluated by the online tool RefFinder, which include ΔCt, geNorm, NormFinder, BestKeeper four commonly software, and then determinated a comprehensive expression stability ranking by integrating above four ranking results based on the geometric mean. Our results show that ACT was the best stable genes for all samples, cold stress, NaCl sress, PEG stress, ABA treatment, MeJA treatment, stem and leaf tissue samples; EF1b is the most stable gene in GA treatment and heat stress samples; UBC and CYP were respectively ranked top in ethylene treatment and root tissue samples. Finally, the reliability of these results was further validated with a target gene SAD by qRT-PCR. Conclusions: In summary, this study evaluated and validated the suitable reference genes for qRT-PCR under different experiment treatment and tissues, and will be useful for further gene expression studies on the molecular mechanisms in Macadamia.


Background
Macadamia is a genus that belongs to the Proteaceae family, is a long-lived evergreen tree of subtropical and tropical origin, and native to Australia. The commercial cultivars is composed of Macadamia integrifolia Maiden and Bethche, Macadamia tetraphylla and their hybrids [1]. Macadamia produce edible dried fruit [2], and also are consumed as an ingredient in making pastry, chocolate, oil, cosmetic and pharmaceutical industry. Therefore macadamia is the most economically important crop. Macadamias have also been expanded in South Africa, Kenya, China, the United State, Guatemala, Malawi, Brazil, Vietnam, Mexico and Thailand currently. In China the planting area reach to 295856 hm 2 until 2018, and account for more than 65% of the total area of the world grow, becoming the world's largest country with planting area.
The shells of Macadamia nut is very hard to crack as the high strength and toughness, it is considered as an interesting research material to study for developing of impact-and puncture-resistant engineering material [3]. Moreover, tocotrienols [4] and squalene [5] these bioactive compounds were detected in macadamia oil, and these compounds with high antioxidant ability can be bene cial to health maintenance and disease prevention [6]. Diets containing macadamia nuts reduce plasma LDL cholesterol levels and decrease the risk of CVD [7]. The kernel of macadamia nut is rich of lipids. Macadamia oil contains higher monounsaturated fatty acid than other vegetable oil [5,8], and hence oil content as a major indicator of macadamia kernel quality, but the mechanisms accumulating the unsaturated fatty acid are still unknown. Gene expression analysis of fatty acid synthesis is bene cial to understand the complex regulatory networks of macadamia oil accumulation. In macadamia, soluble acy-ACP desaturases [9][10] and thiesterases [11]which were responsible for accumulation of unsaturated fatty acids were cloned and characterized, but the transcriptional level regulation need to be further elucidate. Quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR) is one of the most frequently-used technologies to study the expression pattern of genes [12][13]. And it was widely used in the analysis of gene expression levels with high sensitivity, speci city, reproducibility and accuracy [14]. The accuracy of qRT-PCR results depends largely on the selection of internal reference genes. Selecting suitable reference genes as internal control under different experimental conditions is very necessary because the starting amount of material, RNA integrity, reverse transcription e ciency, cDNA quality, and different tissue and cell type can signi cantly in uence the accuracy of gene expression [15][16]. Studies have shown that the expression of housekeeping genes is not constant in different cell types and physiological states. The expression of housekeeping genes will change with the change of experimental conditions and organ speci city. If the change of internal reference gene expression is not considered, the accuracy of qRT-PCR results will be reduced or even the opposite or wrong conclusion will be reached. Therefore, in order to decipher the molecular mechanisms of UFAs biosynthesis (accumulation) promoted by SAD and FAT genes in Macadamia integrifolia kernel, the need for selecting suitable reference gene for qRT-PCR in macadamia is urgent .

Expression stability analysis of candidate reference genes
In order to select the best reference genes, RefFinder online tool was used to rank the expression stability values. This method include comparative ΔCt, geNorm, NormFinder and BestKeeper, four popular algorithms, determinates a comprehensive expression stability ranking combined with above four data generated from different algorithms( Fig. 2; Table 2). The comparative ΔCt algorithm ranks the potential candidate reference genes expression stabilities based on the standard deviation values(Supplemental Table 3) [17]. As determined by ΔCt, ACT was consistently identi ed as the most stable gene through the total samples(Supplemental Fig. 3). CYP was the most stable reference gene in macadamia root and leaf, while ACT was the most stable reference gene in the stem in the different organs dataset from macadamia. In the subset of cold stress, NaCl stress, PEG stress, ABA treatment and MeJA treatment, ACT was still considered as the best stable gene in the 11 candidate reference genes, and TUBa was the least stable gene under the cold stress, NaCl stress, PEG stress, ABA treatment and MeJA treatment. Under the heat stress EF1b ranked top than ACT gene, and was determined the best stable reference gene. UBQ was ranked as the stable reference gene, while 18S was the least stable reference gene under the GA treatment. In the ethylene treatment, UBC was the most highly ranked, and EF1a performanced the least stable gene( Fig. 2; Table 2, Supplemental Table 3).
geNorm Analysis The geNorm program used the M-value to assess stability of the reference gene expression. The M-value was calculated at each step during stepwise exclusion of the least stable reference gene until two best gens were obtained[18]. The top two stable candidate reference gene were ACT / MDH for all samples(Supplemental Fig. 4), leaf and MeJA treatment, ACT and EF1b under cold stress, heat stress, NaCl stress and PEG stress, MDH / UBC in the GA treatment and ABA treatment, ACT and UBC for stem and ethylene treatment show the most stability. Additionally, as show in Fig. 2, Table 2 and Supplemental Table 4, the least stably expressed was TUBa under cold stress, NaCl stress, ABA treatment, MeJA treatment, stem and all samples, UBC under heat stress and leaf, TUBb under PEG stress, 18S under the GA treatment and root, EF1a under the ethylene treatment.

NormFinder Analysis
The NormFinder algorithm ranking candidate reference gene stability based on a variance estimation approach [19]. The NormFinder ranking orders are similar to the geNorm analysis. According this analysis( Figure S4; Table 2; Supplemental Table 5), the most stable gene were ACT for cold stress, ABA treatment, MeJA treatment, stem and all samples, EF1b for heat stress, CYP for leaf, ethylene treatment and NaCl stress, UBC for PEG stress, UBQ for GA treatment, EF1a for root. (Supplemental Fig. 5)

BestKeeper Analysis
The BestKeeper algorithm determined the stabilities of candidate reference genes based on the CV ± SD values [20]. The stability rankings by BestKeeper was quite different from the ranking order determined by ΔCt, geNorm and NormFinder algorithms(Supplemental Fig. 6). ACT was considered as the best stable gene under all samples, PEG stress, and leaf. TUBb in heat stress, UBC for cold stress, MeJA treatment and stem subset, 18S for NaCl stress and root, MDH for ABA treatment and ethylene treatment, EF1b for GA treatment were all ranked in top stable genes in different situation. Additionally, 18S and TUBa were determined as the least stable reference gene in the most samples by BestKeeper algorithm (Fig. 2; Table 2; Supplemental Table 6).

Comprehensive Ranking
In order to obtain a consensus result of the best stable reference genes as recommended by the four above approached according to the RefFinder [21], the geometric mean of four algorithms corresponding rankings for each candidate gene were calculated (Table 2;  Tables 7 and 8). ACT was ranked the rst stable gene in all samples, cold stress, NaCl stress, PEG stress, ABA treatment, MeJA treatment, stem tissue and leaf, EF1b for heat stress and GA treatment, CYP for root, UBC for ethylene treatment ( Fig.  2; Table 2).

Validation of the reference genes with the SAD gene
To further validate the selected reference genes, the relative expression level of a target gene SAD in macadamia under different experimental conditions were evaluated using qRT-PCR. It was normalized using the most stable reference genes and two least stable reference genes as the internal control both singly and combination under different treatment subsets (Fig. 3). The relative expression abundance of SAD showed substantial divergence when normalized to different kinds of reference genes. In this study, the water treatment was designed as a negative control, when the most stable genes ACT and CYP were employed respectively and together to normalized expression levels of SAD gene, the expression abundant was not signi cantly effect; when the least stable genes TUBa were used to normalized expression level of target gene, the expression abundant was up-regulation dramatically. When the EF1a was used as the reference, the expression pattern of SAD gene was similar with the most stable genes normalization results. The expression level of SAD gene gradually declined under the cold stress using ACT and EF1b as the internal control, while the expression level was increased dramatically at 6h and 24h when the least stable reference genes TUBa was used, and increased sharply at 24h for UBQ normalized as internal reference. Under heat stress, the expression pattern of SAD gene were similar when the most stable reference genes (ACT and EF1b) and the least stable reference genes( UBC and UBQ) were used as controls, but the highest expression level at 12h was 2.5-fold higher by using UBQ normalizations than using the most stable reference gene ACT normalization. In PEG stress, the expression level of SAD gene was up-regulated at 24h and then down-regulated at 48h when using ACT and UBC genes normalization, while the expression appeared overestimated when normalized using GAPDH and underestimated for TUBb. Similarly, in NaCl stress, the expression pro les of the SAD showed same trends when stable reference genes ACT and CYP were used as internal control, when the least stable reference gene TUBb and TUBa were used as reference control, the expression level of target gene was signi cantly overestimated and underestimated respectively. Under ET and ABA treatment, the relative expression trends of SAD gene showed a similarly trend, appeared slightly descent at 6h and then increased at 24h when the most stable genes were used in combination and independent for normalization, while appeared down-regulated at 12h and 24h when normalized by using the least stable reference genes. In MeJA stress, a similar expression pattern was observed when either ACT or MDH was used alone and combination for normalization. While the expression levels of SAD gene were overestimated when the least stable genes TUBa and EF1a were used for normalization. In GA stress, the relative expression of SAD gene increased at 6h when using EF1b and 18S as internal control, while declined slightly at 6h when normalized by using TUBa and UBQ as reference genes. Above all, the result showed that ACT and EF1b were more suitable for different treatment subset and TUBa failed standardized the expression data effectively.
In addition, to validate the selected reference genes, 4 genes involved fatty acid accumulation including FATA, FAD, Oleosin and SAD in qRT-PCR were compared with the expression pattern in RNA-seq results. In the qRT-PCR results, Actin was used as the normalizer to quantify the expression. As show in the Fig. 4, all the selected genes revealed similar expression le in the both methods. Therefore, our result provide the reliable reference gene in Macadamia.

Discussion
Gene expression quanti cation which as an important way for charactering gene function, has been accepted and applied widely in the eld of genetic research [22]. qRT-PCR is commonly regarded as an practical method for accurate analysis of gene expression pro ling with high sensitivity, speci city, reproducibility and less samples as MIQE(minimum information for publication of quantitative real-time PCR experiments, MIQE) a precondition [23]. There is no universally suitable reference genes under each experimental condition, and the reference genes expression is tissue-speci c and various based on the physiological status of the organs or experimental condition [24][25]. Numerous studies have reported that the expression of housekeeping gene may be constant in some experiment condition, but varied considerably in other cases [26][27][28]. The nature protocols suggest that using a speci cally and stably reference for normalization could ensure accurate results [29]. Therefore suitable reference genes selection and evaluation is critically mandatory for data normalization in gene expression analysis [30]. Suitable reference genes selection has been studied in the plant eld recent years, including cucumber [31], Setaria viridis [32], sugarcane [33], wheat [34], soybean [35] conditions and speci c organisms. In present study, we systematically validated the stability of 11 traditional reference gene under the speci c environmental condition or treatment as well as the in different tissues of Macadamia integrifolia seedlings. Traditional housekeeping genes involved in cytoskeleton structure(ACT, TUBa, TUBb), protein synthesis (EF1a, EF1b, 18S), biological metabolic process(GAPDH and UBQ), and multifunctional proteins (CYP and MDH), are selected and the open reading frame sequences of these genes were cloned from Macadamia integrifolia.
It is recommended that a comprehensive RefFinder algorithms could provide a reliable assessment of the expression stability of candidate reference gene under various experiment condition for real-time qRT-PCR analysis [40]. This statistical algorithms four methods, includingΔCt, geNorm, NormFinder and BestKeeper, that have been developed to assess the expression stability of candidate reference genes for accurate normalization in gene expression studies were used to rank the reference gene. The ranking results by different algorithms are relatively coincident mostly. In the heat treatment subset, EF1b and ACT are top two genes inΔCt, NormFinder and geNorm analysis, and ranked third and fourth in bestkeeper analysis. In ABA treatment, NaCl, stem, and all samples sets, TUBa was always ranked the least stable gene by four different algorithms. But sometimes the stability ranking showed signi cant differences due to the complementary different statistical program ( Table 2). For example, under NaCl stress, ACT was ranked rst by ΔCt and geNorm, while it was ranked sixth by BestKeeper. CYP was ranked rst by NormFinder, and followed by ACT. The ΔCt method by comparing relative expression of pairs of genes within each sample to con dently identify stability of reference genes [17]. geNorm is one of the most commonly used method for systematic validation procedures of reference gene stability [18]. geNorm identi es two reference genes according to the similarity of expression pro le in each sample and the lowest intra-group variation [41][42], and not suitable for distinguishing the gene of similar expression pattern. Andersen developed the NormFinder approach [19], which takes both the iner-and intra-group variations into account, and combines them into a stability value, and nally ranks the top genes with minimal inter-and intra-group variation. Most of time the results of NormFinder is similar to the geNorm results. For instance in the heat stress samples, EF1b and ACT were ranked top two stable reference genes, and TUBa and UBQ were the least stable reference genes, other genes ranked order have slightly change by geNorm and NormFinder. The BestKeeper software analysis is not restricted to the stability of reference genes but also target gene [20], it determines the stability ranking of the reference genes according to their SD and CV values [43][44]. Compare with geNorm and NormFinder, the BestKeeper results have signi cant difference owing to the different calculation strategy. For example, in the subset of all samples, ACT was ranked rst genes by all the programs. Under cold stress, ACT was determined the most stable genes byΔCt, geNorm, NormFinder and comprehensive ranking, excepted BestKeeper. As these programs ranked different most stable reference genes [45][46][47], we selected the most reliable reference gene by using a comprehensive tool RefFinder, which that rank the reference genes based on the geometric mean of the weights of every gene calculated by each program [21]. The comprehensive results obtained on the basis of statistical analysis byΔCt, geNorm, NormFinder and BestKeeper showed mostly consistency in this study.
ACT is commonly known to be a reliable reference gene [48]. Although there are studies have provided evidence that ACT vary considerably in Zea mays [49], our nding support that ACT is the most stable reference gene for the relative quanti cation in Macadamia among all samples, cold stress, salt stress, PEG stress, ABA treatment and MeJA treatment samples. Elongation factors involved in translocation of aminoacyl-transfer RNA to the ribosome during protein synthesis. EF1a and EF1b are highly conserved in sequence and expression in eukaryotes, usually used as internal genes. EF1b was established as the most appropriate reference gene under GA treatment and heat treatment in this study, while EF1a is considered the most stable gene under complex developmental process in Populus [50]. UBC ranked rst than ACT gene in ethylene treatment which consistent with the result in platycladus orientali [43]. CYP(Cyclophilin), a speci c cytosolic binding protein, which is ranked the top stable gene in root in macadamia. TUBa mostly was not satisfactory for qRT-PCR analysis in present study, but it was su ciently stable in Hererosigma akashiwo (Raphidophyceae) [51], which has been selected as reference gene in Ulva linza [52] and the diatom Pseudo-nitzschia multistriata [53].
Our previous studies found thatΔ 9 -Stearoyl-ACP desaturases(SAD) is the key of unsaturated fatty acid synthesis and accumulation in macadamia oil (no publish data). The expression pro le of SAD and the fatty acid level were affected in Arabidopsis Crown Galls under drought and hypoxia stress conditions [54]. In this study, we quanti ed the SAD gene expression using two most stable reference genes both singly and combined as well as two least stable genes as an internal control in qRT-PCR analysis to validate the selected reference genes in macadamia. When the least stable genes TUBb was validated as a reference gene for normalization, the expression pattern of SAD gene was obviously overestimated, and underestimated using TUBa to reference under the NaCl treatment (Fig. 3). The SAD gene expression was induced using the stable reference gene, while down-regulation when normalization with the two least reference gene TUBa and EF1a after the 6h ET treatment. So these results demonstrated that the suitable selection of internal control is critically important for the normalization, and used reference without validation would be leading misinterpretation.
RNA-seq is an approach to analysis the transcriptome pro ling of various species, and recently was used to search for candidate reference genes. In this study, the target gene expression pro le consistent both in qRT-PCR results and RNA-seq results, and they supported each other. So, our nding indicated that the result of this experiment are credible. The reference genes evaluated in this study would be very useful for further gene expression analysis in molecular mechanism of fatty acid accumulation under different treatment in Macadamia.

Conclusions
The reference genes selected in current study will be helpful for accurate normalization of qRT-PCR data in M. integrifolia. In this study, we selected the most stable reference genes in different tissue of macadamia seedlings and under different treatment. We found that ACT shows a good stability in all samples. UBC could be used to normalization when treated by ethylene. In the GA treatment samples, EF1b is the best reference genes. The identi cation of the suitable reference genes in this study will facilitate the future work on gene expression studies in Macadamia to improve our standing on the molecular mechanisms of fatty acid synthesis and accumulation under different experiment condition. For experiments, each treatment group was set up in triplicate. Control group was treated with water. The organ-speci c series of samples (root, stem, leaf) were collected from the seedlings. For drought treatment, 20% PEG-6000 solution(w/v, polythyleneglycol) was applied to incubate the plants for 0, 12, 24, 48 and 72h. For cold and heat stress, plants in the pots were placed at chamber with the temperature of 4℃ and 42℃ respectively for 0, 2, 6, 12 and 24h. For salinity treatment, seedling were transferred to 1/2 SD medium with 150 mmol NaCl for 0, 12, 24, 48, 72h. For hormone treatment, plants were imposed in 100 μmol/L methyl jasmonate(MeJA), abscisic acid(ABA), gibberrellins(GA) or ethylene (ETH) for 0, 2, 6, 12 and 24h. After that, the roots, leaves and stems were sampled separately at different periods for each treatment used for expression analysis. The detailed information of samples collected from various tissues/experimental conditions were also listed in Table S1. All samples were frozen in liquid nitrogen and then stored at -80℃ prior to RNA isolation.

Methods
Total RNA Isolation and cDNA Synthesis The frozen samples were ground to a ne powder in liquid nitrogen using a pestle and mortar. To veri ed the speci city of each primer, a melting-curve analysis was include. The mean ampli cation e ciency of each primer pair was checked by the LinRegPCR program [55].
Analysis of real-time PCR data The comparative ΔCt algorithm assesses the expression stability of reference genes by the mean standard deviation values from the test sample [17]. The ΔCt value indicated the variably transcription of candidates.
The geNorm VBA applet determines the best reference based on the geometric mean and pair wise variation of each gene from all the candidate reference genes in the total sample. Two parameters were used to evaluate the expression stability of the reference genes; the average expression stability value(M value), based on the pairwise variation between a particular gene compared to all others, and the pairwise variation(Vn/n+1), which determines the required number of genes to result in a more accurate normalization [18]. A cutoff value of Vn/n+1 <0.15 indicates that an additional reference gene makes no signi cant contribution to the normalization.
While geNorm performs a stepwise exclusion of the least stably expressed genes, NormFinder uses a model-based approach, which calculates both inter-and intra-group variability to estimate the stability of gene expression. NormFinder identi ed the optimal normalization genes among a panel of candidates according to their expression stability in a given sample set or given experimental designs. This algorithm evaluates not only the overall expression variation of the candidate reference gens, but also the variation between subgroup of samples. In addition to data from real-time PCR , NormFinder can also analyze expression data obtained through other quantitative gene expression methods, such as microarray. Ranking the stability based on the stability value (SV) where lower stability value represents higher gene expression stability and vice versa [19].
Bestkeeper is used to determinate the stability of reference genes based on the coe cient of variance (CV) and the standard deviation(SD) of the average Ct values. The lower of CV±SD values, the more stable the candidate reference genes, and gene with SD >1 would be considered unacceptable and should be excluded [20].
RefFinder is a user-friendly web-based comprehensive tool developed for evaluating and screening reference genes from extensive experimental datasets. It integrates the currently available major computational programs (geNorm, Norm nder, BestKeeper, and the comparative ΔCt method) to compare and rank the tested candidate reference genes according to the geometric mean of four algorithms corresponding rankings. [21]. We analysis the qRT-PCR data on the RefFinder mirror site(http://blooge.cn/RefFinder/) .

Evaluating Reference Genes Expression
The expression level of the target gene SAD was analyzed using the most stable and least stable reference genes after normalization across all the experimental sets. The ampli cation e ciencies of the target genes were also estimated by the LinRegPCR program.
The average Ct value was calculated from three biological and technical replicates and used for relative expression analyses.

Consent for publication
This manuscript has been reviewed and approved by all authors.

Availability of data and materials
Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

Competing interests
The authors declare that they have no competing interests.