Selection of oat candidate reference genes
Due to the absence of genomic sequencing for oat, the exclusive released transcriptome of oat seeds [1] was used as the BLAST database, and the sequences of up to 19 RGs published in previous articles [37-40] or collected online were used as query data in TBLASTN to search their homologs in oat seeds transcriptome. However, several query sequences were not found with any BLAST hits in that oat transcriptome. Among matched subject transcripts, some of them were too short for qPCR primer designing. Consequently, a total of 11 candidate RGs, namely Protein Phosphatase 2A Subunit A3 (PP2A), Polyubiquitin 10 (UBQ10), Ubiquitin-Conjugating Enzyme 21 (UBC21), Elongation factor 1-Alpha (EF1A), Glyceraldehyde-3-phosphate Dehydrogenase C Subunit 1 (GAPDH1), 18S ribosomal RNA (18S), Heterogeneous nuclear ribonucleoprotein 27C (HNR), Expressed protein (EP), TBC1 domain family member 22A (TBC), Tubulin alpha-6 chain (TUA6) and Eukaryotic initiation factor 4A-3 (EIF4A), were identified as candidates for qPCR primer designing and their sequences were listed in Additional file 1. In hexaploid oat, it was not surprising that duplicated genes in its transcriptome was identified. Among 11 candidate RGs, 18S and GAPDH1 were matched with two copies, while TUA6 and UBC21 had three and four transcripts respectively, and others only had one best BLAST hit (Table 1).
Considering the limited information from one set of oat transcriptome data, the genes with only one BLAST hit were insufficient to be considered as single-copy genes. Potential additional copies of them and their expression differentiation among tissues might affect their validity as RGs. Therefore, the genes with multiple hits were also included in the test of their feasibility as RGs. Clearly, it was impractical to design qPCR primers for duplicate genes which share relatively low similarities (i.e. less than 60%). Thus, the sequence similarities of these duplicated genes and their expression levels were characterized first. Sequence alignment showed that two 18S genes had up to 99.84% similarity in their coding regions, followed by 97% similarity among four UBC21 genes and 83.33% between two GAPDH1 genes, and three TUA6 genes merely displayed approximately 72% similarity. Additionally, the expression level indicated by the RPKM (Reads Per Kilobase per Million mapped reads) value of each transcript from the same homologs could be similar to each other or vary considerably (Table 1). In details, the expression levels of two 18S genes were almost the same at 1,067 and 1,053. Contrary to 18S genes, among four UBC21 genes, two of them showed two times more mRNA accumulation than other two, and the RPKM value of one TUV6 gene was even seven-fold higher than the other two. These results suggested that duplicated genes could be differentially expressed and one transcript of them might not represent them all at least at the expression level, even only in one specific organ. Therefore, primers for qPCR of these duplicated genes were designed in their identical regions.
Primer verification and PCR amplification efficiency
The primer specificity for candidate RGs was verified by both regular PCR and qPCR, and the cDNA of oat shoots from three-leaf stage seedlings was used as templates. Based on agarose gel electrophoresis, the amplification product sizes ranged from 87 bp of EF1A to 288 bp of GAPDH1 (Fig. 1, Table 1). Specific amplicon was amplified by most pairs of primers, apart from those of TUA6 (Fig. 1). Similar conclusion was drawn by the number of peaks in melting curve analysis. Only the melt curve of TUA6 amplicon contained an evident peak noise, which further confirmed the inevitable mispairing of this pair of primers. Meanwhile, other primer pairs produced specific amplificons based on the single peak in their melt curves (Fig. 2). In primers designed for qPCR, over two different mispairing nucleotides on primer can lead to a distinction of two similar sequences [41]. Because that there is no other identical region on three TUA6 transcripts as an alternative priming position, TUA6 was excluded in following experiments.
The amplification efficiencies of other ten RG primers varied from 92.7% for TBC to 112.4% for EP, all of which were in the reliable section from 90% to 115% [42]. The correlation coefficient values (R2) ranged from 0.993 for EIF4A and UBC21 to 1.000 for HNR, which indicated that these primer pairs were highly specific to their targeted region. Other information including the primer sequences and primer characteristics of candidate RGs were all summarized in Table 1.
Expression stabilities of candidate RGs
To evaluate whether the ten candidate RGs are suitable for qPCR analysis in various organs and tissues of oat, 18 samples were named as four experimental sets: all samples, shoot and root (seedlings), developing seed (seeds) and developing endosperm (endosperms).
The qPCR results were firstly displayed using Ct values in boxplot analysis (Fig. 3) and then evaluated by algorithms including the ΔCt method, geNorm, NormFinder and BestKeeper (Table 2). Corresponding index values for determining gene expression stability were listed in the brackets of each RG (Table 2). The lower these index values are, the higher the gene expression stabilities are. In all sample set, the Ct values ranged from 7.05 of 18S to 29.23 of TBC (Fig. 3a). The 18S displayed the highest expression and the least variation, whereas GADPH1 showed the least stability (Table 2). The geNorm and NormFinder analyses both exhibited that HNR and EIF4A were the top two stable RGs while 18S was the least. But similar to the results of the ΔCt method, 18S ranked the first in the BestKeeper analysis.
In seedling set, the STDEV values of all candidate RGs were obviously lower than any other sample sets (Fig. 3b; Table 2). The Ct values varied from 7.01 of 18S to 27.76 of TBC. Based on the ΔCt method and the BestKeeper analysis, UBC21 had the most stable expression, followed by GAPDH1 and EP. However, GAPDH1 and 18S were the least stably expressed RGs calculated by geNorm and NormFinder. Furthermore, HNR performed relatively better than most of RGs in all four algorithms.
In seed set, candidate RGs displayed the most variation and the least stability among four experimental sets (Fig. 3c; Table 2), which indicated the complicated regulation network of gene expressions during developing oat seeds. 18S still had the minimum Ct value of 7.13, which indicated its highest expression level among different stages of oat seeds and accorded with the highest RPKM values in oat seed transcriptome (Table 1). GAPDH1 ranked last in the ΔCt method and BestKeeper analysis, and ranked the second last in geNorm and NormFinder, respectively. Additionally, the rankings of candidate RGs in geNorm and NormFinder were exactly the same, among which EF1A and EP were the most two stable RGs.
In endosperm set, the distribution of Ct values was similar to that of oat seeds, but the variations and stabilities were distinctly different between them (Fig. 3d; Table 2). Similar conclusions were drawn by the ΔCt method and the BestKeeper analyses, that 18S and EIF4A were the most reliable RGs while GAPDH1 was the least. In the geNorm analysis, though HNR was the greatest RG, the M value of it was very close to that of PP2A, UBC21 and EIF4A.
Previous research had proven that the conventional use of a single RG for normalization led to relatively large errors in qPCR results [25]. The pairwise variation analysis provided an accurate standard to select the minimum number of RGs by comparing the Vn/Vn+1 values with 0.15. Once the Vn/Vn+1 value was lower than 0.15, the top n RGs should be combined as the internal standard gene set. As shown in Fig. 4, the V2/V3 values of four experimental sets were all less than 0.15, which indicated that the two RG combinations were reliable enough for result normalization in them. According to this conclusion, the geometric means of all candidate RGs’ ranking values given by four algorithms were calculated, and the best two RGs for each sample set were listed in Table 3. Meanwhile, the least stable RGs of four sample sets were also displayed. To be precise, EIF4A + HNR was the most stable RG set across all samples and for developing endosperms. The best RG set for seedling samples was UBC21 + HNR, and the optimal RG set for developing seeds was EP + EF1A. However, GAPDH1 was the least recommended RG for both seeds and endosperms, and 18S was particularly unstable for shoots and roots of seedling.
Validation of candidate reference genes
PKP1 (Plastidial Pyruvate Kinase 1) and AGPL2 (ADP-glucose pyrophosphorylase large subunit 2) play important roles in glycolysis [43] and starch biosynthesis [44], respectively. Homologs of SGT1 [suppressor of the G2 allele of SKP1 (S-phase kinase-associated protein 1)] and SCL14 (Scarecrow-Like 14) in wheat were reported in wheat seedling growth studies [45, 46]. To confirm the reliability of the selected best RG sets after the comprehensive analysis above, the expression levels of AsPKP1 and AsAGPL2 were detected in developing seeds and developing endosperms, while those of AsSGT1 and AsSCL14 were detected in shoots and roots of seedlings, respectively. In the meantime, the RG with the comprehensively lowest stability in each sample set was used for gene expression normalization as a negative control (Table 3; Fig. 5).
In developing oat seeds, it was evident that the expression patterns of AsPKP1 and AsAGPL2 normalized by EP + EF1A, EP and EF1A were mainly similar, and were obviously different from that by GAPDH1 (Fig. 5a; b). When the EP + EF1A set was used for normalization, the relative expression of AsPKP1 was 2.2 times higher at stage C than that of stage B, and increase slightly at stage D subsequently, followed by continuous decrease till stage J. However, almost no difference was shown in the expression trend of GAPDH1 normalization results from stage B to D (Fig. 5a). The expression level of AsGPL2 normalized by EP, EF1A and their combination showed an evident increase from stage B to C, while that normalized by GAPDH1 kept decreasing in same developing period (Fig. 5b). Similar differences were also found in developing endosperms (Fig. 5c; d). When the most stable RG set “EIF4A + HNR” was used for normalization together and separately, the transcript abundance of AsPKP1 dropped steadily during the whole developmental stages of endosperms. Conversely, the relative expression of AsPKP1 showed a sharp increase at stage C compared to that of stage B when GAPDH1 was used for normalization (Fig. 5c). The expression level of AsGPL2 varied much more when normalized by GAPDH1 compared with that normalized by EIF4A and HNR together or separately (Fig. 5d).
In two-leaf and three-leaf stages of oat seedlings, the relative expressions of AsSGT1 and AsSCL14 normalized by optimal UBC21, HNR and their combination showed almost the same levels, which were evidently different from those normalized by the least stable RG 18S (Fig. 5e; f). When UBC21 and HNR were used for normalization, AsSGT1 in seedling roots always showed tiny decreases compared with that in shoots at two different stages. However, the relative expression level of AsSGT1 even displayed 2 fold more change between shoots and roots at three-leaf stage, when 18S was used as RG (Fig. 5e). Thus, these results confirmed the validation and the reliability of the identified RG sets.