The justification and results of every step of the procedure for selection of RGs is discussed as follows.
Step 1. Selection of genes as candidates for reference genes and amplification tests
Genes were selected according to procedures specified in Material and Methods. Single amplicons for each CHG were obtained whose specificity was confirmed by the observation of single-peak melting curves of the qPCR products or by the presence of a single band of expected size for each primer pair in agarose gel electrophoresis after PCRs employing either gDNA (data not shown) or cDNA as templates [see Additional file 2]). No primer dimers or other products resulting from non-specific amplification were found.
Step 2. Calculation of Cq´s
Calculated amplification efficiencies ranged from 1.855 to 1.935 (Table 2), values appropriated to be subsequently utilised in Cq´s calculation. The different abundance of each RG transcript affect the normalized results [31], but in the present case no considerable differences in transcript accumulation levels were observed. Furthermore, suitable RGs should be equivalent in transcript abundance to that of the target gene (TG), whose Cqs should be between 15 and 30 [26], limits within which the Cqs obtained in this analysis fits (data not shown; See Fig. 1 for a visual inspection of Cq´s).
Fig. 1. Corrected quantification cycle (Cq’) values for the seven candidate housekeeping genes. Mean of biological groups is represented, with upper semi-bars indicating Cq’ SD for biological groups (n=6=3 assays x 2 RT-level replicates), and bottom semi-bars indicating Cq’ mean SD for samples (3 samples with 2 RT-level replicates each)
Step 3. Estimation of transcript level variations:
The theoretically optimal way to identify the more transcriptionally stable candidate genes is trough estimation of both overall and inter-group variations in transcript levels under the employed experimental conditions [5]. With this purpose, two consecutive procedures were performed, as follows:
Displaying of Cq´ means and standard deviation for experimental groups.
The sample maximization design here used for plate arrangement avoids inter-run variability among samples, frequently underestimated [25]. Cq´ values for the amplicons of the seven CHGs were depicted (Fig. 1) to allow a visual evaluation of the transcript level stabilities and common trends among genes along the different factors (time, treatment). Hsp resulted as the most unstable CHG, with a general opposite trend to ACT, EF, GADPH, H2B and TUA, all of them genes showing a high degree of co-expression, v.g., co-regulation, in spite of they are functionally unrelated in a direct way. Co-regulation may be induced by stress or multi-stress [32,33], and could bias RG selection. In spite of their co-regulation, those genes show an opposite trend to the remaining and also stable gene, OUB, a behaviour that can be associated to its pre-proteolitic function, i.e., it is expressed when protein replacement is needed. The observed intra-group (time x treatment) variations were moderate except for Hsp, to what contributes a heterogeneous group-dependent variability among biological samples, since its intra-group inter-replicate variabilities were low (Fig. 1).
Calculation of transcriptional gene stability values
CV and F statistics estimates, respectively, the overall and inter-group variation of transcript levels of single CHGs without counterbalancing with the rest of CHGs. Thus, CV and F estimators provide stability rankings not so biased by concomitant, systematic biological variations that may be associated to an experimental condition, leading to a group of genes to follow a similar expression trend -such as the variation due to co-regulation. Thus, CV and F provide in most cases a good classification of the stability of CHGs. It is worth to point out that the purpose of CV and F statistics is the generation of a transcriptional gene stability ranking rather than gene transcriptional stability values, and not at all the obtaining of significant differences. Hence, inference statistics pre-requisites such as normal distribution are not mandatory for the pursued goal. F was calculated for bi- (time x treatment) and trifactorial (time x treatment x assay) cases (thus F was named as F2 and F3, respectively).
Step 4. Ranking of candidate housekeeping genes according to their stability
The more transcriptionally stable genes should have a more or less parallel classification in both CV and F descriptive parameters [5]. In the CV/F method, preference is given to CV (overall variation) ranking [5], thereby overcoming possible large discrepancies caused by opposite and extreme values of CV and F. The rankings of CHGs according to CV/F method -so as used software algorithms- are shown in Table 3.
All replicates
|
Time x Treatment
|
Time x Treatment x Assay
|
RFi algorithms
|
CV
|
F2
|
independent NFi
|
F3
|
independent NFi
|
BKr2
|
BKS
|
CoD
|
GNo
|
NFi
|
RFi
|
All tested
housekeeping genes
|
EF (0.67)
|
H2B (0.34)
|
H2B (1.07)
|
EF/GADPH
|
OUB (0.24)
|
H2B (1.57)
|
H2B (0.29)
|
Hsp (4.84)
|
OUB (0.24)
|
H2B (11.66)
|
OUB (0.27)
|
ACT (0.57)
|
OUB (0.38)
|
ACT (1.12)
|
(0.59)
|
H2B (0.32)
|
OUB (2.45)
|
OUB (0.34)
|
H2B (7.70)
|
H2B (0.26)
|
OUB (14.59)
|
H2B (0.28)
|
OUB (0.55)
|
ACT (0.58)
|
OUB (1.14)
|
H2B (0.69)
|
ACT (0.38)
|
EF (2.83)
|
ACT(0.43)
|
OUB( 8.49)
|
ACT (0.31)
|
ACT (26.05)
|
ACT (0.33)
|
GADPH (0.54)
|
EF (0.68)
|
EF (1.16)
|
TUA (0.74)
|
EF (0.66)
|
ACT (3.08)
|
EF (0.56)
|
ACT(10.79)
|
EF (0.45)
|
TUA (33.03)
|
EF (0.46)
|
TUA (0.43)
|
GADPH (0.76)
|
GADPH (1.17)
|
ACT (0.79)
|
GADPH (0.76)
|
GADPH (3.34)
|
GADPH(0.63)
|
EF (13.53)
|
GADPH (0.49)
|
EF (34.86)
|
GADPH (0.5)
|
H2B (0.38)
|
TUA (0.80)
|
TUA (1.23)
|
OUB (0.82)
|
TUA (0.88)
|
TUA (5.42)
|
TUA (0.67)
|
TUA (17.89)
|
TUA (0.58)
|
GADPH (38.6)
|
TUA (0.56)
|
Hsp (0.30)
|
Hsp (2.01)
|
Hsp (2.79)
|
Hsp (1.38)
|
Hsp (2.74)
|
Hsp (7.00)
|
Hsp (3.04)
|
GADPH (19.02)
|
Hsp (1.43)
|
Hsp (1237.12)
|
Hsp (1.37)
|
Best 6 tested
housekeeping genes
|
EF (0.86)
|
H2B (0.34)
|
H2B (0.74)
|
EF/GADPH (0.58)
|
H2B (0.42)
|
H2B (1.32)
|
H2B (0.29)
|
H2B (7.70)
|
H2B (0.21)
|
H2B (11.66)
|
H2B (0.21)
|
GADPH (0.85)
|
OUB (0.38)
|
GADPH (0.80)
|
GADPH (0.57)
|
GADPH (2.11)
|
OUB (0.34)
|
OUB (8.49)
|
ACT (0.28)
|
OUB (14.59)
|
ACT (0.30)
|
TUA (0.80)
|
ACT (0.58)
|
EF (0.82)
|
H2B (0.70)
|
EF (0.58)
|
EF (2.45)
|
ACT (0.43)
|
ACT (10.79)
|
EF (0.31)
|
ACT (26.05)
|
EF (0.32)
|
H2B (0.64)
|
EF (0.68)
|
ACT (0.82)
|
TUA (0.74)
|
ACT (0.59)
|
ACT (3.94)
|
EF (0.56)
|
EF (13.53)
|
GADPH (0.32)
|
TUA (33.03)
|
GADPH (0.32)
|
ACT (0.60)
|
GADPH (0.76)
|
TUA (0.86)
|
ACT (0.79)
|
TUA (0.64)
|
OUB (4.56)
|
GADPH (0.63)
|
TUA (17.89)
|
OUB (0.34)
|
EF (34.86)
|
OUB (0.36)
|
OUB (0.37)
|
TUA (0.80)
|
OUB (0.88)
|
OUB (0.82)
|
OUB (0.70)
|
TUA (4.95)
|
TUA (0.67)
|
GADPH (19.02)
|
TUA (0.37)
|
GADPH (38.6)
|
TUA (0.37)
|
Table 3. Transcriptional stability rankings for transcript accumulation of candidate housekeeping genes for the whole panel of experimental conditions1.
1According to BestKeeper Pearson coefficient of correlation (BKr), and the stability parameters for BestKeeper SD (BKS), comparative ΔCt method (CoD), GeNorm (GNo), NormFinder (NFi), RefFinfer (RFi) and CV/F method. A lower value for stability parameters indicates higher transcript accumulation stability. F and NF were calculated for different sample groups. Underlayed are the genes contributing to the 2-genes normalization factor with maximum stability according to NFi independent software. In bold, genes selected as reference genes (see Fig. 2). In cursive, the discarded CHG Hsp.
2All tested genes with P<0.01 for Pearson correlation coefficient for the correspondent BestKeeper normalization factor, according to Pfaffl et al (2004).
Step 5. Discarding of transcriptionally unstable genes
As it is also evident from visual distribution of CV and F parameters (Fig. 1), Hsp resulted in the more transcriptionally unstable CHG for all methods used. Moreover, in Table 3, Hsp ranked as the less stable CHG for CV and F3, and its first position in F2 is an effect of an intolerably high intra-group Hsp instability. Consequently, Hsp was discarded and removed for subsequent re-rankings.
Step 6. Re-rankings of remaining genes
For CV/F method, the exclusion of Hsp does not change stability values and the order of the rest of CHGs. Thus, omitting Hsp, the order H2B, OUB, ACT and EF is clear for CV/F method, and, taking into account the preference given to CV ranking over that from F, those genes may be followed by GADPH and, finally, TUA. GADPH showed the lowest inter-group stability according to both calculated F rankings, and had a relatively low score according to CV, being ranked as 5th for this parameter.
Step 7. Ranking with software methods as support and comparison with CV/F ranking
Selection of functionally distant CHGs, such as those from the group here tested (Table 2) and some previously considered as suitable RGs for plant in vitro growth and rooting studies ([9] and references therein), diminish the risk that most of them be transcriptionally affected in a similar way by the experimental conditions. In spite of this, it have been advised to evaluate each individual CHG stability by other means previously to the software usage [5]. CV has been used as a main parameter for normalization [34], frequently complementing normalization software [5,35,36].
Software methods employed in the present study (except BKS algorithm implemented in RFi) use statistics that relate transcript levels among all used CHGs, and consequently rely under the hard to support assumption that a majority of the evaluated CHGs do not show real biological concomitant, systematic variations on their transcript accumulation among sample groups, i.e., the measured variation is assumed to be due to technical error for most of the tested genes. In other words, all those programmes are sensitive to concomitant variations such as co-regulation [37], especially CoD [38] and gNo [39]. Furthermore, NFi software, although less affected by co-regulation, does not account for inter-group systematic errors associated to sample preparation [38] –in contrast to F-. On the other hand, BestKeeper requirements are too strict with the inter-sample variance [40]. Moreover, rankings offered by all those methods are frequently conflicting ([5–9], reviewed in [10]). The RFi programme uses the geometric mean of the four statistical approaches (BKS in the case of BestKeeper, and overall variation for NFi) to rank the CHGs. RFi weighting may help to overcome some pitfalls, but it is obviously also influenced by the systematic variations affecting the integrated algorithms.
In spite of these commented pitfalls, with the exception of gNo, which algorithm more strongly ranks according to similar gene transcript accumulation trends –likely a biasing effect-, the rest of used software agreed with CV/F method in including H2B, OUB and ACT into the three or four firstly ranked CHGs, although without concordance in the exact order positions (Table 3). : H2B ranked from 1st to 2nd position, OUB also (except for CoD method, where it ranked third), and ACT ranked between 2nd and 4th positions. The remaining CHGs, in a sort of parallelism to CV/F rank, usually oscillated between positions 3th and 5th (EF) or 4th and 6th (GADPH and TUA), except in the case of gNo software, which considered EF and GADPH as the optimal option.
The removal of the highly unstable Hsp affected the ranking of methods for which transcript accumulation or Cq’ values of each CHG algorithmically interact with those from the others. Table 3 shows that H2B is still well ranked for all algorithms. Nonetheless, in those algorithms into RFi where the gene ranking is dependent on interactions among genes (CoD, gNo, and NFi), ACT and OUB pass to be into the last three positions. The removal of Hsp, which strongly affected average trends (see changes in BKr values), changed the rankings into CoD and gNo, two algorithms highly biased by co-regulation, a fact to consider especially in cases of plant multi-stress induction (such as the present one) even in cases of normally non-regulated genes [32,33]. The NFi integrated in RFi is also sensible to inter-sample systematic variations: in the present case, it is sensible to co-regulation. The use of the remaining six CHGs for subsequent estimations is into the limit of the advised number of CHGs to be used in ranking algorithm NFi [26].
After Hsp removal, all methods showed H2B as the most stable CHG (or 3rd most stable for gNo) (Table 3). In other words, H2B is the only selected CHG common for all methods. H2B was previously recommended to be used as RG due to its high stability in in vitro rooted Eucalyptus globulus microcuttings [9]. ACT usually ranks between 2nd and 4th positions. This ranking reinforces its 2nd place for CV/F. OUB, 3rd for CV/F ranking, ranks in the last two positions for those algorithms more susceptible to co-regulation effects. Since this is due to its transcript accumulation trend opposite to the rest of the five remaining CHGs, in a way its behaviour compensates the opposite trend of both H2B and ACT, stabilizing the correspondent NF, as happens when the selection of two genes is configured for independent NFi. It should be noticed that when the selection of only two genes is switched for this software, OUB –together with H2B- is suggested to compose the NF.
Selection of RGs composing NFs is performed after establishment of a stability ranking of CHGs. In spite of the commented preference for CV/F method in order to establish a stability ranking for CHGs, it is worth to compare if NFs obtained from each method can be similarly valid. There is a wide consensus in accepting different criteria for establishment of a proper NF after CHG ranking, but they have not been applied together in a systematic way. According to the 2nd round classifications (after discarding Hsp) of the prompted CHGs, such criteria were adapted for an optimal selection of RGs into each ranking methodology, as follows in next Step.
Step 8. Selection of RG sets
Calculation of pairwise variations between possible NFs
A proper NF should be i) as stable among samples as possible as long as ii) its variations mainly reflect the real technical errors in order these errors be compensated. The first criterion, NF stability, is usually procured through selection of a number of the most stable CHGs –i.e., RGs, which compose the NF. According to this, six NFs were calculated for each gene ranking method, being each NF composed from the most stable n CHGs, being n=1 to 6, and ordered according to n (Fig. 2).
The replicate pairwise variation (V) [23] indicates the variability between NF pairs. If V is low, both NFs are considered equivalent for normalization. If the compared NFs are composed each by the n and the n+1 most stable CHGs, and V is low, both NFs may be valid for normalization. If the pairwise comparisons are made from n=1 to the total number of CHGs, the first low V found is considered to correspond to the optimal NF pair, and usually the NF composed with the minor number of genes is selected. A V threshold value of 0.15 is accepted as low enough, but this consensus should not be strict [11], especially when considering complex designs of various factors as in case presented in this article. Also, a trend of changing V values when adding new genes for the calculation of the NF is recognized to be equally informative [31,41,42], and in fact values higher than 0.15 have been accepted [32,43,44]. Concretely, the best option for a decreasing V trend are the NFs corresponding to the V lowest value. Moreover, if after a decreasing trend, even to less than 0.15, V starts to grow again by the addition to the NF of a worse ranked CHG, more instability may be being introduced, and consequently more error. Thus, the inclusion of such a CHG as RG is not advisable.
The methodology for V calculation has been used so far only for comparisons of pairs of NFs composed by n and n+1 CHGs as ranked according to the CHG stability obtained with gNo, since this algorithm is implemented in that software [23]. In the present work, such an algorithm was adapted to determine V between any pair of possible NFs, hereby obtaining left graphics of Fig. 2.
Fig. 2.Pairwise variation (left) and descriptive parameters (right) for the normalization factors. Nomalization factors are composed from the indicated candidate housekeeping genes, determined according to the indicated methods. Descriptive parameters are CV and F for bifactorial –F2– and trifactorial –F3– cases. Discontinuous lines indicate the maximum value of the parameter into the candidate housekeeping genes contributing to the indicated normalization factors. Arrows indicates the selected NFs after the determination of representative candidate NFs (left) and the assessment of congruent stability (right) (see text, Step 8).
Determination of representative candidate NFs
Normalization against more than one gene is a priori recommended [45]. At least three RGs to calculate NF are recommendable [46,47]. Three non-physiologically related genes are more representative, as long as NF is stable.
Assessment of congruent stability and selection of NFs for each gene ranking method
Selected NF among those constructed for each ranking method should be as stable as possible as long as the previous criteria are kept. Maximum or high NF stability underlies the method proposed in [48] for choosing NFs, but this may overlook CHG representation. Thus, the overall CHG representation in the candidate NFs and the congruency of these NFs –according to the previous criteria of stability and compensation of errors– were tested by assessing the stability of the candidate NFs regarding discarded NFs and individual CHGs, a matter that tends to be ignored. To inspect the suitability of the candidate NFs, the next rules were introduced:
- i) Ideally, important variability parameters (CV and F) for NFs should be lower than those of the CHG –composing the NF- having the maximum values for such parameters (see Fig. 2 right). Otherwise, this worse ranked CHG may be unnecessarily contributing to additional NF instability.
- ii) In this regard, as in general, when selecting NFs the ranking provided by overall variability (CV) should ideally have priority to that given by inter-group variability (F).
iii) F would be more representative of the whole experimental panel if it spans a higher number of factors: in the present case, F3 should have more priority than F2.
- iv) Economical criteria may be taken into account additionally when two or more NFs may be similarly valid according to all above criteria, consequently selecting the RG set with the minimum number of CHGs.
In Table 3 (genes in bold) and Fig. 2, the proposed and selected RGs (genes which contribute to the NFs) for different algorithms are represented after discarding Hsp as highly unstable gene and applying the above described criteria. In all cases those rules brought NFs composed from three RGs. Results show three possible NFs, depending on the ranking method: H2B, OUB and ACT (for CV/F and BKS methods); H2B, ACT and EF (for independent NFi algorithms), and H2B, EF and GADPH (for the rest of the methods). There is a consensus for the selection of these three CHGs amongst RFi and their implemented methods, except for BKS. This apparent congruence can be an effect of the bias by co-regulation. In the case of independent NFi, which takes into account biological groups, the GADPH 2nd position resulting from those algorithms is substituted by ACT, but this can be still an effect of the influence of co-regulation. BKS and CV/F methods, not affected by the commented pitfalls, change the ACT of the selected NF of independent NFi by OUB. The different trend of this gene with respect to the majority of CHGs increases the stability of the resulting NF. Only H2B, the most stable CHG for all methods except for gNo (where ranked 3rd), constituted a consensus among the selected NFs.
In order to test the possible validity of any of the three selected NFs, V values between them were calculated. V values were always much higher than 0.15 (0.27 was the lowest one –data not shown), thus there being important significant differences between the correspondent NFs. The commented deficiencies of algorithms suggest the CV/F method as a reasonable solution, and its use at least as a first approach. Only BKS, unaffected by co-regulation since uses a similar way to CV/F, had analogous results to CV classification. If being not too strict with high standard deviation for BestKeeper, BKS classification could be a good combination with F to classify stable genes. In fact, according to [49], the BestKeeper approach may be useful to narrow down a search if no specific genes are known to be plausible candidates.
Step 9. Error compensation versus stability: inspection of the quality of the selected NFs
The quality of NFs may be partially assessed by inspecting: i) in what measure they compensate the technical errors of transcript levels measurement, and ii) if they can also be valid for inter-assay (biological replicates in the present case) comparisons.
The total variation of the measurement of the transcriptional levels comprises the real biological variation plus technical errors. These errors are systematic when they are due to factors of experimental design (e.g. in the case of samples from a given experimental condition which are all more prone to RNA degradation), and may increase from RNA isolation with associated effects like carried over effects (such as analytical processing of experimental groups at different times or in different conditions). Carried over effects are minimized when using a sample maximization design [25] –as in the present case–, e.g., run-to-run variation. On the other hand, non-systematic, i.e., non-experiment-associated technical errors should weigh proportionally less among experimental groups (formed by, for example, different treatments or time points) than among biological, equivalent replicates. In the present case, since assay is a factor fixed by rooting potential, the assay-associated systematic variation (and consequently the included systematic error) should weigh proportionally more in the overall inter-assay variation than in assays for which the rooting potential would have been similar. Thus, the inter-assay systematic errors should be compensated as possible by the selected NF avoiding the overcompensation, i.e., the compensation of the real biological inter-assay variation. This convenience is kept for the NFs selected in the analyses (Fig. 2) for all the indicated CHG ranking methods, since inter-assay variation of these NFs is moderate or low with respect to those non-selected and the individual CHGs (Fig. 3). What is more, except for the independent NFi, the inter-assay variation of the selected NFs is similar to that of the CHG, among those composing the NF, with maximal inter-assay variation. Thus, selection criterium i for Step 8 (Assessment of congruent stability) is also kept for inter-assay case. These considerations can lead to the conclusion that the selected NFs, for both bi- and trifactorial cases, are also valid for inter-assay comparisons.
Fig. 3. Error compensation and stability of normalization factors. Mean distances between replicates (MD), and compensations (standard deviations of mean distances) (SDC) (graphics on the left), and inter-assay and inter-replicate variations (graphics on the right) for the normalization factors composed from the indicated candidate housekeeping genes, obtained according to the indicated methods. MD and SDC are shown in Cq’s. Discontinuous lines indicate the maximum value of the indicated parameter into the candidate housekeeping genes contributing to the indicated normalization factors.
Obviously, NFs should also compensate non-systematic analytical errors. The inter-replicate variation (Fig. 3) virtually only accounts for the technical error accumulated during RT-replicates preparation and later associated effects, i.e., a non-systematic, non-factor associated error. Inter-replicate variation tends to decrease asymptotically with the number of RGs to a theoretically precise average value, as can be confirmed in Fig. 3. All genes are in reality absolutely stable for technical replicates, and consequently inter-replicate error would be optimally compensated by a NF composed from a large number of RGs. Nevertheless, the contribution to the NFs of such a large number of RGs would compromise other requirements such as previously indicated criteria on this regard, so as a good compensation for systematic errors. Thus, a moderate NF inter-replicate variation value, of course lower than the NF-composing CHG with maximum value for this parameter, would be an acceptable deal. This is the case of selected NFs. This means a partial compensation of inter-replicate error, without compromising systematic error compensations and a reasonable stabilization of real transcript levels represented by the NF.
The inter-replicate mean distance of the NFs composed from the all seven CHGs (Ct’=0.0042) specifically accounts of average error due to RT-replicate preparation. In exchange, the individual CHG inter-replicate mean-distances include all non-systematic technical errors, carried over or not. Selected NFs are again well ranked for this descriptor, having an intermediate-high value (0.0072 corrected cycles), and then overcompensating RT average error (0.0042 corrected cycles) and thereby accounting of later errors too (see replicate compensation ranking, given by standard deviations of the individual CHG inter-replicate mean distances) (Fig. 3).
Step10. Determinationof the optimal normalization factor for the complete bi- and trifactorial panels
For CV/F method, NF3, composed from H2B, OUB and ACT, is a good compromise to reduce V between two consecutive CHG groups (Fig. 2), having a major representation than only two CHGs and keeping the stability conditions stated above. OUB has a slight opposite trend to both H2B and ACT transcript accumulations, then compensating bifactorial (time x treatment) and trifactorial (time x treatment x assay) inter-group variations for NF3. NF3 offers low inter-group (bi- or trifactorial) variation (Fig. 2 right) and intermediate inter-replicate variation (Fig. 3 right), thus being reasonably stable for group-associated systematic variations without compromising the compensation of non-systematic technical errors. The inclusion of additional genes introduces more instability (Fig. 2 right). Furthermore, the addition of one or two worse ranked genes is not advisable, since they rarely are selected when testing experimental conditions separately (see next Step). As a general conclusion, this three-gene NF represents the optimal combination of CHGs, amongst those tested, for the general panel of assayed experimental conditions. Slight instability among groups may account for systematic variations.
Step 11. Normalization factors for more specific experimental conditions
Additionally, the validity of NF3 for specific treatments inside of a time and along time inside of a treatment was checked. In these cases, the experimental condition groups consisted of the different levels of time or treatment (thus giving a one factor design) or in combination with the assays (thus giving a two factor design). Single CHGs were subjected to graphic inspection and ranked for each time or treatment according to the same algorithms used for the complete bi- or trifactorial panels (Table 4). After inspection of the graphics (Fig. 1), Hsp, due to its high instability, was removed from further consideration for IBA, IBA+SHAM, 4 h and 24 h. The genes ranked in last position for F were not removed as always ranked in the first four positions according to CV (Table 4). After determination of a CHG ranking according to CV/F method – the method discussed above to be considered in general more suitable - the before described RG selection criteria were followed (Fig. 4).
Table 4. Stability rankings for transcript accumulation of candidate housekeepig genes for the indicated experimental conditions1
All replicates (Treatment)
|
Time
|
Time x Assay
|
Level
|
BKS
|
CoD
|
GNo
|
NFi
|
RFi
|
CV
|
F
|
NFi
|
F
|
NFi
|
Control
Treatment
|
H2B
|
ACT
|
ACT/OUB
|
H2B
|
ACT (1.57)
|
H2B (0.18)
|
Hsp (3.85)
|
OUB (0,22)
|
H2B (6,6)
|
H2B (0,29)
|
OUB
|
H2B
|
ACT
|
H2B (1.57)
|
OUB (0.26)
|
H2B (4.78)
|
H2B (0,26)
|
OUB (13,55)
|
OUB (0,29)
|
ACT
|
OUB
|
H2B
|
OUB
|
OUB (2.06)
|
ACT (0.36)
|
OUB (5.86)
|
ACT (0,32)
|
EF (24,01)
|
ACT (0,3)
|
EF
|
EF
|
EF
|
EF
|
EF (4.00)
|
EF (0.41)
|
GADPH (7.27)
|
GADPH (0,47)
|
GADPH (27,98)
|
GADPH (0,5)
|
GADPH
|
GADPH
|
GADPH
|
GADPH
|
GADPH (5.00)
|
GADPH (0.42)
|
EF (8.93)
|
EF (0,5)
|
TUA (40,48)
|
EF (0,52)
|
TUA
|
TUA
|
TUA
|
TUA
|
TUA (6.00)
|
TUA (0.54)
|
TUA (9.63)
|
TUA (0,64)
|
ACT (42,01)
|
TUA (0,62)
|
Hsp
|
Hsp
|
Hsp
|
Hsp
|
Hsp (7.00)
|
Hsp (1.32)
|
ACT (22.53)
|
Hsp (1,01)
|
Hsp (442,64)
|
Hsp (1,08)
|
IBA
|
OUB
|
OUB
|
EF | GADPH
|
OUB
|
OUB (1.32)
|
H2B (0.32)
|
OUB (5.51)
|
OUB (0,39)
|
OUB (13,16)
|
OUB (0,43)
|
H2B
|
GADPH
|
GADPH
|
GADPH (2.21)
|
OUB (0.34)
|
H2B (7.13)
|
H2B (0,44)
|
H2B (14,69)
|
H2B (0,44)
|
ACT
|
H2B
|
OUB
|
H2B
|
H2B (2.91)
|
ACT (0.49)
|
ACT (12.54)
|
TUA (0,47)
|
TUA (29,5)
|
TUA (0,46)
|
EF
|
TUA
|
H2B
|
TUA
|
EF (3.16)
|
TUA (0.62)
|
Hsp (12.54)
|
EF (0,48)
|
ACT (34,59)
|
EF (0,51)
|
TUA
|
EF
|
TUA
|
EF
|
TUA (4.47)
|
EF (0.64)
|
EF (16.73)
|
ACT (0,49)
|
EF (53,02)
|
GADPH (0,52)
|
GADPH
|
ACT
|
ACT
|
ACT
|
ACT (5.05)
|
GADPH (0.67)
|
GADPH (20.47)
|
GADPH (0,49)
|
GADPH (53,55)
|
ACT (0,53)
|
Hsp (2.27)
|
TUA (24.77)
|
Hsp (2664,02)
|
IBA
+
SHAM
|
H2B
|
H2B
|
H2B | TUA
|
H2B
|
H2B (1.00)
|
H2B (0.25)
|
Hsp (4.04)
|
ACT (0,31)
|
H2B (8,68)
|
H2B (0,33)
|
OUB
|
ACT
|
ACT
|
TUA (2.45)
|
OUB (0.32)
|
ACT (7.51)
|
H2B (0,34)
|
ACT (13,65)
|
ACT (0,34)
|
ACT
|
TUA
|
ACT
|
TUA
|
ACT (2.45)
|
ACT (0.35)
|
TUA (8.69)
|
TUA (0,5)
|
OUB (14,68)
|
TUA (0,51)
|
TUA
|
GADPH
|
GADPH
|
GADPH
|
GADPH (4.23)
|
GADPH (0.42)
|
H2B (12.06)
|
GADPH (0,53)
|
GADPH (14,69)
|
GADPH (0,51)
|
GADPH
|
EF
|
EF
|
EF
|
OUB (4.56)
|
EF (0.42)
|
EF (13.78)
|
EF (0,63)
|
EF (17,63)
|
EF (0,62)
|
EF
|
OUB
|
OUB
|
OUB
|
EF (5.23)
|
TUA (0.53)
|
OUB (15.85)
|
OUB (0,66)
|
TUA (19,87)
|
OUB (0,67)
|
Hsp (1.97)
|
GADPH (22.17)
|
Hsp (1107,17)
|
All replicates (Time)
|
Treatment
|
Treatment x Assay
|
Level
|
BKS
|
CoD
|
GNo
|
NFi
|
RFi
|
CV
|
F
|
NFi
|
F
|
NFi
|
4 h
|
GADPH
|
GADPH
|
EF | GADPH
|
GADPH
|
GADPH (1.00)
|
H2B (0.16)
|
Hsp (3.19)
|
H2B (0,2)
|
TUA (4,11)
|
H2B (0,18)
|
OUB
|
EF
|
TUA
|
EF (2.06)
|
TUA (0.18)
|
TUA (3.92)
|
GADPH (0,2)
|
OUB (15,79)
|
GADPH (0,22)
|
EF
|
TUA
|
TUA
|
EF
|
TUA (2.91)
|
OUB (0.22)
|
EF (5.28)
|
TUA (0,2)
|
GADPH (20,15)
|
EF (0,3)
|
TUA
|
H2B
|
H2B
|
H2B
|
H2B (4.43)
|
GADPH (0.25)
|
H2B (5.51)
|
EF (0,28)
|
EF (21,49)
|
TUA (0,3)
|
ACT
|
ACT
|
ACT
|
ACT
|
OUB (4.56)
|
EF (0.37)
|
ACT (8.01)
|
ACT (0,33)
|
H2B (24,26)
|
ACT (0,35)
|
H2B
|
OUB
|
OUB
|
OUB
|
ACT (5.00)
|
ACT (0.39)
|
OUB (12.15)
|
OUB (0,45)
|
ACT (38.25)
|
OUB (0,45)
|
Hsp (1.53)
|
GADPH (13.54)
|
Hsp (1408,8)
|
1 d
|
H2B
|
EF
|
EF | OUB
|
EF
|
EF (1.41)
|
ACT (0.15)
|
ACT(1,22)
|
EF (0,15)
|
ACT (1,58)
|
EF (0,22)
|
ACT
|
OUB
|
OUB
|
OUB (1.86)
|
H2B (0.2)
|
H2B(2,85)
|
OUB (0,17)
|
H2B (1,98)
|
OUB (0,23)
|
OUB
|
ACT
|
ACT
|
ACT
|
ACT (2.71)
|
OUB (0.27)
|
OUB(3,15)
|
H2B (0,25)
|
GADPH (8,59)
|
H2B (0,27)
|
EF
|
TUA
|
TUA
|
TUA
|
H2B (3.34)
|
EF (0.33)
|
TUA(7,6)
|
ACT (0,3)
|
EF (10,25)
|
ACT (0,32)
|
GADPH
|
H2B
|
H2B
|
H2B
|
TUA (4.43)
|
GADPH (0.39)
|
EF(10,14)
|
TUA (0,31)
|
OUB (11,74)
|
TUA (0,37)
|
TUA
|
GADPH
|
GADPH
|
GADPH
|
GADPH (5.73)
|
TUA (0.52)
|
GADPH(34,04)
|
GADPH (0,34)
|
TUA (20,03)
|
GADPH (0,39)
|
Hsp (1.19)
|
Hsp(145,76)
|
Hsp (41,6)
|
2d
|
H2B
|
H2B
|
ACT/OUB
|
H2B
|
H2B (1.32)
|
H2B (0.37)
|
EF (6.19)
|
OUB (0,19)
|
OUB (18,46)
|
H2B (0,23)
|
OUB
|
OUB
|
OUB
|
OUB (1.68)
|
OUB (0.49)
|
H2B (7.79)
|
H2B (0,22)
|
H2B (28,88)
|
OUB (0,26)
|
ACT
|
ACT
|
H2B
|
ACT
|
ACT (2.28)
|
ACT (0.52)
|
OUB (9.76)
|
EF (0,23)
|
TUA (33,74)
|
ACT (0,29)
|
GADPH
|
GADPH
|
GADPH
|
GADPH
|
GADPH (4)
|
EF (0.53)
|
ACT (13.31)
|
ACT (0,26)
|
ACT (37,23)
|
EF (0,35)
|
TUA
|
EF
|
EF
|
EF
|
EF (5.23)
|
TUA (0.59)
|
GADPH (17.13)
|
GADPH (0,43)
|
EF (40,52)
|
GADPH (0,37)
|
EF
|
TUA
|
TUA
|
TUA
|
TUA (5.73)
|
GADPH (0.61)
|
TUA (19.76)
|
TUA (0,49)
|
GADPH (126,8)
|
TUA (0,46)
|
Hsp
|
Hsp
|
Hsp
|
Hsp
|
Hsp (7.00)
|
Hsp (0.7)
|
Hsp (29.99)
|
Hsp (1,09)
|
Hsp (291,95)
|
Hsp (1,11)
|
4d
|
H2B
|
H2B
|
ACT/H2B
|
H2B
|
H2B (1.00)
|
EF (0.16)
|
GADPH (1.21)
|
ACT (0,12)
|
OUB (4,12)
|
H2B (0,14)
|
EF
|
ACT
|
ACT
|
ACT (2.00)
|
H2B (0.19)
|
H2B (2.16)
|
H2B (0,14)
|
H2B (5,36)
|
EF (0,21)
|
OUB
|
EF
|
EF
|
EF
|
EF (2.71)
|
OUB (0.24)
|
ACT (2.29)
|
EF (0,15)
|
EF (5,66)
|
ACT (0,25)
|
ACT
|
GADPH
|
GADPH
|
GADPH
|
GADPH (4.23)
|
ACT (0.24)
|
TUA (2.68)
|
GADPH (0,21)
|
ACT (6,45)
|
GADPH (0,35)
|
GADPH
|
OUB
|
OUB
|
OUB
|
OUB (4.4)
|
GADPH (0.32)
|
Hsp (5.65)
|
OUB (0,32)
|
GADPH (8,63)
|
OUB (0,4)
|
TUA
|
TUA
|
TUA
|
TUA
|
TUA (6.00)
|
TUA (0.35)
|
EF (6.92)
|
TUA (0,34)
|
TUA (9,82)
|
TUA (0,44)
|
Hsp
|
Hsp
|
Hsp
|
Hsp
|
Hsp (7.00)
|
Hsp (0.48)
|
OUB (6.99)
|
Hsp (0,37)
|
Hsp (33,83)
|
Hsp (0,5)
|
1According to stability parameters for BestKeeper SD (BKS), comparative ΔCt method (CoD), GeNorm (GNo), NormFinder (NFi), RefFinfer (RFi) and CV/F method. A lower value indicates higher transcript accumulation stability. F and NFi were calculated for different sample groups. Underlayed are the genes contributing to the 2-genes normalization factor with maximum stability according to NFi independent software. In bold, genes selected as reference genes (see Fig. 4).
Fig. 4.Pairwise variation (left) and descriptive parameters (right) for the normalization factors. Normalization factors are composed from the indicated candidate housekeeping genes, determined for CV ranking. Descriptive parameters are CV and F for bifactorial –F2– and trifactorial –F3– cases. Discontinuous lines indicate the maximum value of the parameter into the CHGs contributing to the indicated normalization factors. Arrows indicates the selected normalization factors after the determination of representative candidate NFs (left) and the assessment of congruent stability (right) (see text, Step 8).
To see a detailed selection of RGs for groups see Additional file 3. Summarizing (Fig. 4), except for 96 h, where the combination H2B, ACT and EF is the optimal, H2B and OUB (NF2) belong to all selected NFs for the cases of specific experimental conditions (one given treatment with time point as a factor or a given time point with treatment as a factor), whatever assay is included or not as a second factor. In some cases, the addition of EF (Control treatment) or TUA (4 h) to NF2 forms the optimal set for normalization. In most of the cases (IBA, IBA+SHAM, 24 h and 48 h), the addition of ACT to NF2 (and then constituting NF3) improves normalization according to the above pointed criteria. As previously indicated, for the bi- and trifactorial whole panels (treatment x time and treatment x time x assay, respectively), NF3 (H2B, OUB and ACT) is proposed as the optimal NF.
For more specific cases (a given treatment at a given time point, or even biological or technical replicates in the same experimental conditions of the same assay), checking of the variability of CHGs among biological or technical replicates would be required, following a similar methodology to that exposed. With the present transcript accumulation dataset, the total number of replicates inside of a given experimental condition (6 = 3 assays x 2 technical replicates) may be too low to perform a consistent test. By this reason, if a NF should be estimated from the present datasets, the results obtained for the level-associated bifactorial cases (treatment level x assay or time level x assay) can be more valid. In spite of this, the decomposition of bi- and trifactorial whole panels for separate normalization and statistical comparison into the more simple level-associated mono- or bifactorial panels, respectively, is not advisable if a considerable loss of statistical power may be involved. Anyway, in such cases, CV and F values for NF3 are always low, which support the general use of NF3 as the most practical NF for most situations. Nevertheless, the obtained results coincide with those of other reports where differences in RG stability depend on characteristics of biological material or environmental conditions ([50] and references therein). The complexity level of an experimental panel diminishes normalization accuracy, as can be expected a priori.
Step 12. Evaluation of NF3 by comparison with NF2
It was argued above (Steps 10 and 11) the proposition of NF3 as the best possible option for normalization of the whole panel, and also for several more concrete experimental conditions here assayed. In spite of this, NF3 may be sub-optimal for the rest of the cases and could be used as a less accurate solution. Nevertheless, according to the results on overall and intergroup variability for restricted panels (Fig. 4), NF2 (which is like NF3 without including the less stable gene, ACT), although less representative, could constitute also a good option, especially for more concrete experimental situations, such as monofactorial case IBA+SHAM (Step 11). NF3 is evaluated bellow by comparison with NF2 when normalizing a hypothetical highly stable gene and a real TG with low stability.
NF3/NF2 comparison by normalizing a stable gene
The theoretical results on an ideally completely stable TG (without any variation) regarding NF3 normalization were compared with those that would be obtained by normalising with NF2 on the same TG. The results are equivalent to normalize NF3 against NF2, and provide accurate differences between both NFs. Fig. 5 shows the deviations for such normalization after equalizing both NFs for time 0 (which constitutes the same situation for all treatments). Differences in corrected cycles (Ct´) were only between 0.10 and 0.55, which are in the ranges of the intra-group standard deviations even of the more stable genes (Fig. 1), thus indicating that normalization of a real stable gene (which would show intra-group standard deviations) with NF2 would easily trough non-significant differences regarding NF3. In other words, in the practice, NF2 or NF3 could be used for less stable genes normalization without obtaining too much differences. In fact, V for both NFs is about the threshold value of 0.15 accepted as low enough.
Fig. 5. Estimated marginal means of an ideally stable gene according to NF3 when NF2 is applied. NF3: normalization factor composed from H2B, OUB and ACT. NF2: normalization factor composed from H2B and OUB.
In spite of it, since all differences gave higher transcript levels than time 0, NF2 normalization may lead to overestimation of relative transcript levels obtained after NF3 normalization (Table 5). According to the above justifications for adding a third RG, it is supposed that the introduction of ACT in NF2 to compose NF3 corrects not only punctual non-systematic technical errors but also some systematic, group associated errors. Among these, can be detected a decrease of “NF2- transcript accumulation” 4 days after the experiment initiation or in samples from not-SHAM treated explants (with stress palliated by AOX) 4 h after the experiment initiation.
NF3/NF2 comparison by normalizing an unstable gene
According to the previous conclusions, to evaluate the applicability of NF2 or NF3, significant differences in transcript levels of a real unstable gene (Hsp) were determined according to both NFs. As shown in Fig. 1 and reflected in CV and F2 values (Table 3), Hsp displayed high bifactorial intragroup variation but low bifactorial intergroup variation. Thus, despite showing instability, Hsp resulted not highly modulated by different conditions in the system. Assumptions to perform ANOVAs were tested using Fligner-Killeen test so as graphics of residuals and observed and predicted values. The differences found when normalizing against NF2 and NF3 obtained after Bonferroni transformation are in table and graphics in Additional file 4. The use of NF3 renders lower calculated transcript levels than the use of NF2, as explained above, and thereby generated slight small number of significant differences among groups, i.e., does not throw some differences generated by F2. Thus, in both bifactorial (time levels inside of treatment) –11 for NF2 and 10 for NF3– and trifactorial –6 for NF2 and 5 for NF3– cases. Bifactorial (treatment levels inside of time) case showed a total of 5 significant differences for both NFs. Thus, globaly, NF3 presented slightly more conservative results than NF2. Summarizing, only inside of IBA+SHAM treatment, NF2 generated significant differences between 1d and 2d that NF3 did not do (P =0.062), and the same happens in BA for assays I and III.
Combined evaluation.
After applying both comparisons, it can be concluded that the more TG-transcript level variations appear (higher intra-group standard deviations and more evident inter-group variation), the less discrepancies about significant differences in gene transcript accumulation are generated by the application of different NFs. Thus, NF2, here considered as sub-optimal, can generate the same significant differences than NF3 when the TG is unstable enough, including when its expression depends on experimental conditions.