2.1. Curation of the Dataset and Descriptive Analysis
A total of three lysates were collected from each passage from both MCF-7 cultures A1 and A2. Only two lysates were evaluated for passage 28 (p28) and passage 31 (p31) from culture A1. Amplification for each of the 12 reference genes produced a dataset with 900 Cq values. As shown in Table 1, RNA18S showed the highest expression in both cultures A1 and A2 (Cq mean = 7.93 and 8.26 respectively), closely followed by RNA28S (Cq mean = 8.27 and 8.35 respectively). Both genes showed high amplification levels, appearing close to seven cycles earlier than any other gene in both cultures (ACTB Cq mean = 15.98 and 16.11 respectively). CCSER2 presented the least expression levels in both cultures (Cq mean = 26.58 and 26.56 respectively) while GAPDH showed the least standard deviation (S.D = 0.30) in culture A1. ACTB showed the least standard deviation (S.D = 0.32) in culture A2. The largest variation between Cq values was shown by HNRNPL (S.D = 0.35) in culture A1 and PGK1 (S.D = 0.61) in culture A2.
Table 1. Descriptive Statistics of the Reference Genes Cq (Quantification Cycle) Values
Gene
|
N *
|
Mean Cq ± S.D **
|
Minimum Cq
|
Maximum Cq
|
|
A1
|
A2
|
A1
|
A2
|
A1
|
A2
|
A1
|
A2
|
ACTB
|
30
|
45
|
15.98 ± 0.26
|
16.11 ± 0.32
|
15.38
|
15.53
|
16.45
|
16.62
|
GAPDH
|
30
|
45
|
17.13 ± 0.17
|
17.07 ± 0.39
|
16.78
|
16.43
|
17.52
|
17.67
|
RPL13A
|
30
|
45
|
21.03 ± 0.29
|
20.74 ± 0.43
|
20.71
|
20.14
|
21.64
|
21.70
|
PGK1
|
30
|
45
|
21.08 ± 0.22
|
21.09 ± 0.61
|
20.79
|
20.29
|
21.76
|
22.11
|
HSPCB
|
30
|
45
|
20.40 ± 0.29
|
20.42 ± 0.51
|
20.02
|
19.70
|
20.98
|
21.77
|
RNA28S
|
30
|
45
|
8.27 ± 0.23
|
8.35 ± 0.40
|
7.61
|
7.83
|
8.64
|
9.58
|
RNA18S
|
30
|
45
|
7.93 ± 0.27
|
8.26 ± 0.49
|
7.45
|
7.58
|
8.47
|
9.83
|
PUM1
|
30
|
45
|
23.14 ± 0.23
|
23.08 ± 0.39
|
22.73
|
22.35
|
23.68
|
24.23
|
CCSER2
|
30
|
45
|
26.58 ± 0.21
|
26.56 ± 0.39
|
26.04
|
26.03
|
26.95
|
27.32
|
HNRNPL
|
30
|
45
|
22.73 ± 0.35
|
22.91 ± 0.56
|
22.11
|
21.97
|
23.52
|
23.99
|
PCBP1
|
30
|
45
|
22.13 ± 0.22
|
22.17 ± 0.37
|
21.78
|
21.65
|
22.76
|
22.94
|
SF3A1
|
30
|
45
|
23.39 ± 0.34
|
23.51 ± 0.43
|
22.99
|
22.78
|
24.28
|
24.68
|
* N – Total number of Cq values from all triplicates of 3 lysates. ** S.D – Standard Deviation.
2.2. Coefficient of Variation (CV%)
As shown in Table 2, across both cultures, the CV% was in the range of 14.89% to 40.11% indicating that all the candidate reference genes were characterized by stable expression. The CV analysis revealed that PCBP1 was the only gene to be in the top three stable genes (CV% = 14.89% (A1) and 24.90% (A2)) in both the cultures. Apart from PCBP1, the other two top stable genes in culture A1 were GAPDH (CV% = 12.35%) and PGK1 (CV% = 14.86%), while in culture A2 those were ACTB (CV% = 22.98%) and RNA28S (CV% = 24.52%). In stark contrast, PGK1 in culture A2 was the least stable gene (CV = 40.11%) showing variations in gene expression between the two cultures.
Table 2. Mean 2-Cq, Standard Deviation 2-Cq and CV% of the candidate reference genes
Candidate Reference Gene
|
MCF-7 Culture
|
Mean 2-Cq
|
S.D. 2-Cq
|
CV (S.D/Mean)
|
ACTB
|
A1
|
1.6E-05
|
2.9E-06
|
18.81%
|
A2
|
1.4E-05
|
3.3E-06
|
22.98%
|
GAPDH
|
A1
|
7.0E-06
|
8.7E-07
|
12.35%
|
A2
|
7.5E-06
|
2.0E-06
|
26.84%
|
RPL13A
|
A1
|
4.7E-07
|
8.7E-08
|
18.39%
|
A2
|
5.9E-07
|
1.6E-07
|
26.71%
|
PGK1
|
A1
|
4.5E-07
|
6.7E-08
|
14.86%
|
A2
|
4.9E-07
|
2.0E-07
|
40.11%
|
HSPCB
|
A1
|
7.4E-07
|
1.4E-07
|
19.34%
|
A2
|
7.5E-07
|
2.4E-07
|
31.31%
|
RNA28S
|
A1
|
3.3E-03
|
5.8E-04
|
17.66%
|
A2
|
3.1E-03
|
7.7E-04
|
24.52%
|
RNA18S
|
A1
|
4.2E-03
|
8.0E-04
|
19.17%
|
A2
|
3.4E-03
|
9.9E-04
|
28.97%
|
PUM1
|
A1
|
1.1E-07
|
1.7E-08
|
15.65%
|
A2
|
1.2E-07
|
2.9E-08
|
25.03%
|
CCSER2
|
A1
|
1.0E-08
|
1.6E-09
|
15.55%
|
A2
|
1.0E-08
|
2.7E-09
|
25.65%
|
HNRNPL
|
A1
|
1.5E-07
|
3.5E-08
|
23.70%
|
A2
|
1.4E-07
|
5.2E-08
|
38.21%
|
PCBP1
|
A1
|
2.2E-07
|
3.3E-08
|
14.89%
|
A2
|
2.2E-07
|
5.5E-08
|
24.90%
|
SF3A1
|
A1
|
9.3E-08
|
2.0E-08
|
21.20%
|
A2
|
8.7E-08
|
2.4E-08
|
27.76%
|
2.3. Relative Mean Changes in Expression Profiles of Selected Candidate Reference Genes
Relative mean changes in expression profiles were analyzed to study the gene expression stability and variations over successive passages in the both cultures. Passage 28 (p28) was selected as the experimental calibrator for culture A1 while Passage 25 (p25) was selected as the calibrator for culture A2, since they represented the initial investigated passages in both cultures. The fold change was then calculated using 2-Cq with the other passages in both cultures (Figure 1 and 2).
To determine significant relative expression changes, one-way ANOVA was used (Supplementary Table S1, see Additional file 1). In culture A1, only CCSER2 (Figure 2C) and PCBP1 (Figure 2E) showed no significant expression changes over successive passages (ANOVA P > 0.05). In culture A2, all the genes selected, including CCSER2 and PCBP1 showed significant expression changes over successive passages. Both ACTB (Figure 1A) and GAPDH (Figure 1B) showed significant expression differences (ANOVA P < 0.01) in both cultures over successive passages, providing evidence using these together as endogenous reference genes in MCF-7 cell lines should be avoided, if possible.
2.4. NormFinder Analysis of MCF-7 Sub-clones
GAPDH showed the highest expression stability in both cultures A1 and A2 (S.D = 0.14 and 0.16 respectively) as shown in Figure 3. Followed by PCBP1 and CCSER2 in culture A1 (S.D = 0.17) while in culture A2, it was matched by RNA18S (S.D = 0.16). The algorithm was further used to evaluate the pair of genes best suited to work together as reference genes as shown in Supplementary Table S2 (see Additional file 1) (only the top 10 pairs for both the cultures with their pairwise standard deviations shown). The algorithm ranked ACTB and PUM1 as the most stable pair (S.D = 0.07) in culture A1, while in culture A2, two pairs of genes were equally most stable– RPL13A – SF3A1 and GAPDH – SF3A1 (S.D = 0.07).
2.5. geNorm Analysis and Determination of Optimal Number of Genes Needed for Normalization of Dataset
As shown in Figure 4, there were noticeable differences in the obtained results for both cultures. In culture A1, ACTB-HSPCB was the most stable gene (M = 0.169; Figure 4A) while in culture A2, RNA18S-RNA28S tied for the most stable gene (M = 0.177; Figure 4B). geNorm was further used to determine the optimal number of reference genes needed for an accurate estimation of normalization of expression data, as shown in Figure 5. For both the cultures A1 and A2, the V2/3 (0.00437 and 0.00468 respectively) was less than the recommended cutoff (Vn/Vn+1 < 0.15), indicating the addition of a third reference gene would not make a difference in the normalization results.
2.6. BestKeeper Analysis
In the present study, first we considered the standard deviation with crossing points (C.P) for both cultures (Figure 6A and 6B). As seen in the two previous methods (Section 2.4 and 2.5), BestKeeper also showed deviations in results for both cultures A1 and A2. While GAPDH (S.D = 0.14) and CCSER2 (S.D = 0.18) were the top two genes in culture A1, ACTB (S.D= 0.28) and RNA28S (S.D = 0.30) were top in culture A2. The standard deviation with changes in x-fold were also examined as shown in Supplementary Table S3 (see Additional file 1). The minimum and maximum fold change (x-fold) obtained from BestKeeper were in concordance with our results of relative mean fold changes (Section 2.3; Figure 1 and 2)
Finally, coefficient of correlation (r) given as Pearson’s correlation by BestKeeper was evaluated to look for pairwise gene expression stability in both cultures (Figure 7A and 7B). The top five gene pairs with high positive coefficient of correlation (r > 0.5) in culture A1 (Figure 7A) were HSPCB-ACTB (r = 0.833), RPL13A-RNA18S (r = 0.825), PGK1-HSPCB (r = 0.775), PUM1-PCBP1 (r = 0.704) and PUM1-SF3A1 (r = 0.598). Similarly, upon analysis of gene pairs in culture A2 (Figure 7B), the top five gene pairs with r > 0.5 were, PGK1-HNRNPL (r = 0.948), RNA28S- RNA18S and PGK1-GAPDH (both pairs with r = 0.946), PCBP1-SF3A1 (r = 0.933) and PGK1-HSPCB (r = 0.908).
2.7. Pairwise Comparative ΔCt Analysis of the MCF-7 Sub-clones
In culture A1, GAPDH, CCSER2 and PCBP1 were the top three genes ranked by the algorithm while in culture A2, RNA28S, GAPDH and PCBP1 were the top three genes (Supplementary Table S4, see Additional file 1). It is interesting to note that in culture A1, the three top genes ranked by Comparative ΔCt were ranked in the same position and order by BestKeeper (Figure 6A). In culture A2, only RNA28S was reported in the top three by both these algorithms (Supplementary Table S4; Figure 6A).
2.8. RefFinder Analysis
RefFinder [44] is a comprehensive, user friendly, web based tool (https://www.heartcure.com.au/reffinder/) that uses NormFinder, geNorm, BestKeeper and Comparative ΔCt to rank and compare candidate reference genes. The rankings from RefFinder are summarized in Supplementary Table S5 (see Additional file 1). The top three genes in culture A1 were reported to be GAPDH, CCSER2 and PCBP1. For culture A2, RNA28S, GAPDH and PCBP1 were the top three most stable genes.
2.9. Moving Towards Selection of Reference Gene/s Based on an Integrated Approach
Based on the results from above reported algorithms and approaches, we came to a conclusion that for culture A1, GAPDH and CCSER2 were suitable reference genes as they were both constantly ranked in top three in NormFinder (Figure 3A), BestKeeper (Figure 6A), Comparative ΔCt (Supplementary Table S4, see Additional file 1) and RefFinder (Supplementary Table S5, see Additional file 1). Furthermore, CCSER2 had no significant relative mean expression change over successive passages (Figure 2C) while GAPDH had shown the least CV% (Table 2) in culture A1. Also, the Pearson Correlation between GAPDH-CCSER2 (r = 0.515) was significant (P = 0.004), thereby indicating their positive correlation (Figure 7A) and corroborating their selection. Based on a similar approach, for culture A2, GAPDH and RNA28S were selected as suitable reference genes. Interestingly, to avoid excluding a potential candidate as a reference gene, as a result of the strict selection of only two reference genes, as suggested by geNorm (Figure 5) an experimental candidate with consistent high rankings in all the algorithms for both cultures was also chosen. PCBP1 was identified as a strong contender and hence was included in the validation of selected reference genes (Section 2.10 to Section 2.11). PCBP1 was one of two genes to have no significant relative expression changes in culture A1 (Figure 2E) and claimed the top spot in NormFinder (Figure 3), BestKeeper (Figure 6), Comparative ΔCt (Supplementary Table S4) and RefFinder (Supplementary Table S5). It also consistently ranked in the top five genes as per the geNorm analysis for both cultures (Figure 4A and 4B).
2.10. Generation of Data for Two Genes of Interest (GOIs) & Validation of Reference Genes Using Normalization
To further evaluate the four selected reference genes (GAPDH, RNA28S, CCSER2 and PCBP1) and their utility as reference genes, two simulated datasets were created. The genes simulated were referred to as Gene of Interests (GOI) 1 and 2. Both were assigned some random Cq values (triplicates for 3 lysates over 4 passages for culture A1 and over 5 passages for culture A2). The criteria for the assignment of Cq values, as well as the datasets are presented in Supplementary table S6 and S7 (see Additional file 1).
For GOI 1 in culture A1, as suggested by the algorithms used in this study, the GAPDH – CCSER2 pair proved its utility when used for normalization of the simulated GOI 1 (Supplementary Figure S1A, see Additional file 2 and Figure 8A). After normalization with two other pairs, namely GAPDH – PCBP1 and CCSER2 – PCBP1, there were no statistically significant fold changes (ANOVA P > 0.05), in culture A1, GOI 1, as seen in Figure 8A (also see Supplementary Figure S1A, Additional file 2). In culture A2, the most stable pair suggested was GAPDH-RNA28S, which when used for normalization of GOI 1 showed statistically significant fold changes (ANOVA P < 0.05) for passage 25/28 and passage 25/30. Furthermore, all other reference genes pairs in culture A2 showed statistically significant fold changes when used for normalization (Supplementary Figure S1B, see Additional file 2 and Figure 8B).
Normalization was then performed in pairs of three reference genes to investigate whether any pairing can be used for normalization, as shown in Supplementary table S8 (see Additional file 1) and Figure 8C. In culture A1, all the different combinations of reference genes possible showed successful normalization while in culture A2, only GAPDH-PCBP1-CCSER2 triplet achieved successful normalization (Figure 8D and Supplementary table S8, Additional file 1).
On similar lines, for GOI 2 in culture A1, GAPDH – CCSER2 pair produced statistically significant results in fold change (ANOVA P = 0.035) after normalization of GOI 2, indicating the pair may not after all be able to handle large differences in Cq values (greater than +/ 0.5 Cq) as seen in Figure 8A and Supplementary Figure S2A (see Additional file 2). Interestingly, all the other reference gene pairs proved their utility and produced insignificant changes (ANOVA P > 0.05) after normalization of GOI 2 in culture A1 (Figure 8A and Supplementary Figure S2A, see Additional file 2). In this instance, for culture A2, GAPDH-RNA28S pair produced statistically significant fold changes (ANOVA P = 0.07), indicating it not to be a suitable pair for normalization. For all pairs in culture A2, statistically significant changes were recorded in only one passage, p25/p29, as seen in Figure 8B (also see Supplementary Figure S2B, Additional file 2), and hence further normalization with three reference genes was undertaken in a bid for successful normalization.
After performing further normalization of GOI 2 for cultures A1 and A2 in pairs of three reference genes (Supplementary Table S9, see Additional file 1), our analysis showed no suitable pair of reference genes for GOI 2 in culture A2 (Figure 8D) but for culture A1, GAPDH-PCBP1-CCSER2 was the only triplet to yield successful normalization (Figure 8C).
2.11. Validation of Selected Reference Genes with Normalization of qPCR Data
To support and continue perusing the selected reference gene pairs, two genes of interest were also used, namely AURKA (Aurora Kinase A) and KRT19 (Keratin 19). The qPCR was performed, and the dataset was normalized using the reference gene pairs as shown in Supplementary Figure S3-S4 (see Additional file 2). Statistical significance and methodology were the same as in Section 2.10. The Cq range for AURKA was from 23.36 (min) to 24.51 (max) with mean Cq of 23.92 ± 0.39 for culture A1 while for culture A2, the Cq range was from 23.19 (min) to 24.32 (max) with mean Cq of 23.74 ± 0.31. The Cq range for KRT19 was from 18.90 (min) to 20.13 (max) with mean Cq of 19.29 ± 0.29 for culture A1 while for culture A2, the Cq range was from 18.65 (min) to 21.19 (max) with mean Cq of 19.96 ± 0.77.
Upon normalization of AURKA, none of the gene pairs in culture A1 were able to normalize AURKA adequately (P < 0.05) as seen in Figure 8A and Supplementary Figure S3A (see Additional file 2). In culture A2, however, four reference gene pairs were able to normalize the AURKA (Figure 8B; Supplementary Figure S3B, see Additional File 2). For KRT19, in both cultures A1 and A2, no gene pair could yield successful normalization (Figure 8A and 8B; Supplementary Figure S4A and S4B, see Additional file 2). Again, further investigations were done using genes in triplets.
For AURKA, in culture A1, GAPDH-PCBP1-CCSER2 and GAPDH-CCSER2-RNA28S yielded successful normalization (Figure 8C and Supplementary table S10, see Additional file 1). In culture A2, GAPDH-PCBP1-CCSER2 and PCBP1-CCSER2-RNA28S pairs showed adequate normalization of AURKA (Figure 8D and Supplementary table S10, see Additional file 1). For KRT19, in both cultures A1 and A2, only GAPDH-PCBP1-CCSER2 triplet showed successful normalization (Figure 8C and 8D; Supplementary table S11, see Additional file 1).
2.12. Combined Analysis of Dataset from MCF-7 Sub-clones A1 and A2: Analysis of MCF-7 Cell Line
From the extensive analysis provided above, the biological replicate cultures of MCF-7 (sub-clones) does not necessarily depict similar phenotypic characteristics and gene expression and hence, the reproducibility of the results among sub-clones may not be 100% efficient. Therefore, to further investigate the reference gene/s which show least variance overall in the MCF-7 cell line, the dataset from both sub-clones A1 and A2 were combined and analyzed. CV% was determined to evaluate the stability of reference genes (Table 3). As before, comprehensive analysis was conducted for the combined dataset using NormFinder, geNorm, BestKeeper, Comparative ΔCt and RefFinder. The gene rankings were determined using each of these algorithms as summarized in Table 3.
Table 3. Comprehensive analysis of candidate reference gene ranking with dataset from both culture A1 and A2
CV%
|
NormFinder
|
geNorm
|
Candidate Gene
|
CV%
|
Rank
|
Candidate Gene
|
Group S.D
|
Rank
|
Candidate Gene
|
Stability value M
|
Rank
|
PCBP1
|
22.13
|
1
|
GAPDH
|
0.15
|
1
|
CCSER2
|
0.23
|
1
|
ACTB
|
22.28
|
2
|
PCBP1
|
0.16
|
2
|
PCBP1
|
0.23
|
1
|
RNA28S
|
22.74
|
3
|
RNA28S
|
0.21
|
3
|
GAPDH
|
0.25
|
2
|
PUM1
|
22.99
|
4
|
CCSER2
|
0.21
|
3
|
ACTB
|
0.26
|
3
|
CCSER2
|
23.17
|
5
|
ACTB
|
0.21
|
3
|
HSPCB
|
0.28
|
4
|
GAPDH
|
23.62
|
6
|
HSPCB
|
0.24
|
4
|
RNA28S
|
0.29
|
5
|
SF3A1
|
26.49
|
7
|
PGK1
|
0.27
|
5
|
PGK1
|
0.30
|
6
|
RNA18S
|
27.97
|
8
|
HNRNPL
|
0.28
|
6
|
HNRNPL
|
0.31
|
7
|
RPL13A
|
28.11
|
9
|
PUM1
|
0.29
|
7
|
SF3A1
|
0.32
|
8
|
HSPCB
|
28.65
|
10
|
RPL13A
|
0.29
|
7
|
PUM1
|
0.33
|
9
|
HNRNPL
|
35.10
|
11
|
RNA18S
|
0.30
|
8
|
RPL13A
|
0.34
|
10
|
PGK1
|
35.61
|
12
|
SF3A1
|
0.30
|
8
|
RNA18S
|
0.35
|
11
|
BestKeeper
|
Comparative ΔCt
|
RefFinder
|
Candidate Gene
|
S.D
|
Rank
|
Candidate Gene
|
Average S.D
|
Rank
|
Candidate Gene
|
Geomean
|
Rank
|
RNA28S
|
0.25
|
1
|
GAPDH
|
0.30
|
1
|
GAPDH
|
1.97
|
1
|
PUM1
|
0.26
|
2
|
PCBP1
|
0.30
|
1
|
PCBP1
|
2.00
|
2
|
ACTB
|
0.26
|
3
|
CCSER2
|
0.33
|
2
|
CCSER2
|
2.91
|
3
|
PCBP1
|
0.27
|
4
|
RNA28S
|
0.33
|
2
|
RNA28S
|
2.91
|
3
|
GAPDH
|
0.27
|
4
|
ACTB
|
0.33
|
2
|
ACTB
|
4.16
|
4
|
CCSER2
|
0.28
|
5
|
HSPCB
|
0.34
|
3
|
PUM1
|
6.34
|
5
|
SF3A1
|
0.32
|
6
|
PGK1
|
0.36
|
4
|
HSPCB
|
6.51
|
6
|
RPL13A
|
0.33
|
7
|
HNRNPL
|
0.37
|
5
|
PGK1
|
7.84
|
7
|
RNA18S
|
0.34
|
8
|
PUM1
|
0.38
|
6
|
HNRNPL
|
8.85
|
8
|
HSPCB
|
0.35
|
9
|
SF3A1
|
0.38
|
6
|
SF3A1
|
9.32
|
9
|
PGK1
|
0.41
|
10
|
RPL13A
|
0.38
|
6
|
RPL13A
|
9.92
|
10
|
HNRNPL
|
0.43
|
11
|
RNA18S
|
0.38
|
6
|
RNA18S
|
10.93
|
11
|
As reported in Table 3, CV% analysis revealed that the least variable gene is PCBP1 (CV% = 22.13%) closely followed by ACTB and RNA28S (CV% = 22.28% and 22.74% respectively). NormFinder, Comparative ΔCt and RefFinder had almost identical rankings where GAPDH and PCBP1 were consistently reported as the most stable genes. Furthermore, in Comparative ΔCt and RefFinder, both CCSER2 and RNA28S were shown to be the next most stable genes. In contrast, NormFinder ranked RNA28S before CCSER2, however both were reported in top four genes. geNorm on the other hand ranked CCSER2 – PCBP1 duo as the most stable genes (M value = 0.23). It further ranked GAPDH as the third most stable gene. geNorm was further used to calculate Vn/Vn+1 to estimate the number of reference genes required for normalization of GOIs. geNorm reported that two reference genes should be used (V2/3 = 0.005 is less than the cutoff of 0.15). BestKeeper, however ranked RNA28S and PUM1 as the two most stable genes whilst ranking PCBP1 as the fourth most stable gene.
Finally, BestKeeper was used to report the Pearson’s correlation (r) among genes. A high positive correlation was reported for PCBP1-CCSER (r = 0.757). A moderate positive correlation was reported for PCBP1-GAPDH and CCSER-GAPDH pairs (r = 0.643 and 0.666 respectively). Whilst RNA28S showed moderate positive correlation with GAPDH (r = 0.623) and PCBP1 (r = 0.686), it portrayed a low positive correlation with CCSER2 (r = 0.496). Two genes that were constantly ranked as the least stable (most variable) were RNA18S and RPL13A except by CV% and BestKeeper.
2.13. Transcriptomic Analysis of the Reference Genes from TCGA Database
The TCGA (The Cancer Genome Atlas) database was used for transcriptomic validation of the expression of the reference genes analysed in the study. Based on the PAM50 BRCA subtype classifications, data of only luminal A subtype tumor patients were analysed (others like Normal, luminal B, Undertermined, Basal etc. sub-types were removed). In the dataset, selected reference genes were looked for. Normalised gene expression data was available for 9 of the 12 reference genes (data not available for ACTB, RNA18S and RNA28S) as presented in Supplementary Table S12 (see Additional file 1). For better data visualization and understanding gene expression of the reference genes, the “normalized_count” was converted to the log scale using log2 (normalized_count +1) as presented in Supplementary Figure S5 (see Additional file 2). CCSER2 is the most stable gene (from our selected reference genes) identified in the database for luminal A sub-type breast cancer. Furthermore, PCBP1 was found to be quite stably expressed. Interestingly, GAPDH which was identified as the most stable gene in both cultures, was identified as the least stable gene in the transcriptomic analysis.
To evaluate the TPM (transcripts per million), the TCGAbiolinks was used again to retrieve the clinical, morphological and expression data of luminal A sub-type breast cancer patients. In case of TPM, the data was available for 10 of the 12 selected candidate reference genes (except for RNA28S and RNA18S) as shown in Supplementary Table S13 (see Additional file 1). For better visualization and data analysis, the TPM values were also converted to logarithmic scale using log2(TPM).
All the selected reference genes from the database showed low variance with a S.D. (log2 TPM) of less than 1. Also, they all showed medium to high expression level since the mean (log2 TPM) was greater than 5 for all genes (except for CCSER2, which showed lower expression levels). The log2 (TPM) ranges of all the genes are shown in Supplementary Figure S6 (see Additional file 2).
2.14. Relationship Between Cq Values from RT-qPCR and log2 (TPM) of TCGA RNA-Seq
To facilitate an estimation of the relationship between the Cq values obtained from RT-qPCR for both cultures A1 and A2 and the log2 (TPM) values obtained from TCGA database, a correlation analysis was performed, as seen in Supplementary Figure S7 (see Additional file 2). Although there exists no formal relationship or formula between Cq and Log2 (TPM), the formulas (y intercept and R2) presented in Supplementary Figure S7, provide an estimation of the Ct for each reference gene prior to RT-qPCR experiments based on RNA-Seq data of luminal A sub-type breast cancer which may be extended to other luminal A cell lines. Pearson’s correlation was also calculated between Cq values from both cultures A1 and A2 (and combined mentioned as MCF-7) and log2 (TPM) as seen in Table 4. Statistical significance was set at P < 0.05. No significant correlation was found in any of the gene in both cultures when Cq values were compared with Log2 (TPM) RNA-Seq values.
Table 4. Pearson’s Correlation between Cq values from culture A1 and A2 from RT-qPCR and log2 (TPM)
Candidate Reference Gene
|
Pearson Correlation r
(Culture A1 vs log2 TPM)
|
Pearson Correlation r
(Culture A2 vs log2 TPM)
|
Pearson Correlation r
(MCF-7 vs log2 TPM)
|
ACTB
|
0.215 (P = 0.253)
|
- 0.028 (P = 0.852)
|
0.115 (P = 0.323)
|
GAPDH
|
- 0.316 (P = 0.088)
|
- 0.315 (P = 0.837)
|
0.078 (P = 0.504)
|
RPL13A
|
- 0.018 (P = 0.923)
|
- 0.132 (P = 0.385)
|
- 0.071 (P = 0.272)
|
PGK1
|
- 0.308 (P = 0.097)
|
0.002 (P = 0.988)
|
0.028 (P = 0.812)
|
HSPCB
|
- 0.053 (P = 0.780)
|
- 0.013 (P = 0.927)
|
- 0.004 (P = 0.970)
|
PUM1
|
0.229 (P = 0.221)
|
0.119 (P = 0.432)
|
0.083 (P = 0.478)
|
CCSER2
|
- 0.015 (P = 0.936)
|
- 0.049 (P = 0.747)
|
- 0.029 (P = 0.798)
|
HNRNPL
|
- 0.150 (P = 0.428)
|
0.079 (P = 0.601)
|
0.069 (P = 0.554)
|
PCBP1
|
0.061 (P = 0.745)
|
- 0.126 (P = 0.407)
|
- 0.045 (P = 0.702)
|
SF3A1
|
0.310 (P = 0.095)
|
0.261 (P = 0.082)
|
0.132 (P = 0.256)
|
2.15. Nutrient Stress – Fold Changes of Reference Genes
To demonstrate and validate the reference gene triplet (GAPDH-CCSER2-PCBP1), samples from MCF-7 cells cultured in four different culturing conditions were analyzed (Section 5.2). The cell cultures were named as B5, D5, E5 and R5. We first calculated the fold change in expression of reference genes in control (cultures A1 and A2) versus nutrient stress (cultures B5, D5, E5 and R5). Additionally, we examined the potential differences in fold change between control cultures. Similar to Section 2.3, we selected p28 from culture A1 and p25 from culture A2 for comparisons of fold change as shown in Supplementary Table S14 (see Additional file 1). A well defined threshold for acceptable range of tolerable fold change does not exist, however, a commonly suggested arbitary limit of ≥ 2x fold change was used in the present study to decide whether or not the reference gene was a good internal control [38]. Accordingly, our analysis revealed that there was more than 2x fold change for ACTB (all nutrient stress cultures), PGK1 (B5, D5 and E5), GAPDH (E5) and HNRNPL (E5) when compared with control MCF-7 cell line (A1+A2).
2.16. Nutrient Stress – Expression Differences of GOIs when Normalized using GAPDH-CCSER2-PCBP1
Finally, the normalized fold change in expression for AURKA and KRT19 was evaluated in all four nutrient stress cultures and normalized against 3 controls – p28 from culture A1, p25 from culture A2 and p25 with p28 for overall MCF-7 cell line as shown in Figure 9. The triplet pair successfully normalized the expression fold change for both genes of interest in all stress cultures except for B5/p28 (culture A1), where, although the fold change was < 2, it was found to be statistically significant (P = 0.045). Fold change due to nutrient stress varied from -0.593 (D5 and E5) to -1.533 (B5) for AURKA and from -1.292 (R5) to -1.615 (D5) for KRT19.