For the normalization of gene expression across a group of samples, the expression stability of a reference gene must be validated under the experimental conditions before the gene is used in a study. In the literature, most works use the programs GeNorm and NormFinder to search for the most appropriate reference genes. However, although these tools can analyze endogenous genes in different groups to try to identify group-specific behavior, they do not do so in a completely satisfactory way 41, 46–49.
Since this work included four distinct groups (pediatric ALL, adult ALL, adult AML, and control), checking the data with these tools showed that when all groups were analyzed together, intra-group variability was lost; conversely, when the analyses were performed separately by group, intergroup variability was lost. Thus, it was not possible to select the most appropriate endogenous control for the proposed study using these tools. For this reason, additional analyses were performed using the R software 32,44,45,50.
Currently, the most frequently used reference genes for general expression studies are β-actin (ACTB), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), and hypoxanthine-guanine phosphoribosyltransferase 1 (HPRT1) 5, 51–55.
The GAPDH gene is involved in many cell processes, such as membrane transport and membrane fusion, microtubule assembly, nuclear RNA export, protein phosphotransferase/kinase reactions, DNA replication, and DNA repair. With this in mind, GAPDH expression would be expected to vary, as the gene has a diverse range of functions unrelated to its glycolytic activity 56.
The analysis in this study determined that the GAPDH gene presented the most unstable behavior among the analyzed endogenous controls. Our findings agree with several other studies that have scrutinized the stability of the commonly used reference gene GAPDH and have demonstrated that it should be used with caution, as its expression varied considerably and it was consequently unsuitable as a reference gene in some cases 4, 56–59. However, some studies have shown different results regarding the expression stability of GAPDH, as it was identified as one of the best housekeeping genes in the analysis of a great variety of tissue types 60–62.
The HPRT gene is also widely used as an endogenous control in many studies of gene expression in different types of cancer. Its product is found in all cells as a soluble cytoplasmic enzyme. Although HPRT is found in all types of somatic cells, significantly higher levels are found in the central nervous system 63,64. Many studies have shown that the HPRT gene behaves as a good reference gene, both when used alone and when combined with other genes such as TBP and GAPDH, among others 3,57, 65–68.
However, our study identified that the HPRT gene presents low stability in samples from patients with acute leukemias and is not suitable as an endogenous control. Some other studies have also reported that the HPRT gene exhibits high expression variability and have classified it as an inadequate reference gene, corroborating the data found in this work 69,70.
Using the previously described method, the endogenous controls GAPDH and HPRT were removed from the analysis due to poor performance. Both genes presented a high standard deviation and high variability between the analyzed groups, which is poor behavior for reference genes. Therefore, the study proposes the set of endogenous controls ACTB, ABL, TBP and RPLPO as the most appropriate for expression assays of acute leukemia samples.
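The two exclusion criteria named above (high standard deviation and high variability between groups) can be illustrated with a minimal Python sketch. It is not the R analysis used in this study: the function name `stability_summary`, the gene labels, and all Ct values are hypothetical and chosen only to show how a stable and an unstable candidate differ on these two measures.

```python
from statistics import mean, stdev

def stability_summary(ct_by_group):
    """Summarize Ct variability for one candidate reference gene.

    ct_by_group maps a group name to a list of Ct values (hypothetical data).
    Returns the largest within-group SD and the range of the group means.
    """
    group_means = {g: mean(v) for g, v in ct_by_group.items()}
    group_sds = {g: stdev(v) for g, v in ct_by_group.items()}
    return {
        "max_within_group_sd": max(group_sds.values()),
        "between_group_range": max(group_means.values()) - min(group_means.values()),
    }

# Invented Ct values for two hypothetical candidates (not data from this study).
candidates = {
    "gene_stable": {"control": [20.0, 20.2, 19.8], "AML": [20.1, 20.3, 19.9]},
    "gene_unstable": {"control": [18.0, 19.0, 20.0], "AML": [22.0, 23.0, 24.0]},
}

summaries = {gene: stability_summary(ct) for gene, ct in candidates.items()}

# Rank candidates so that the smallest between-group shift comes first.
ranking = sorted(summaries, key=lambda g: summaries[g]["between_group_range"])

for gene in ranking:
    print(gene, summaries[gene])
```

In this sketch a candidate is penalized both for scattering within a group and for shifting between groups; any cut-off applied to either measure would be a choice of the analyst rather than a fixed rule.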
The ACTB gene encodes an abundant and highly conserved cytoskeletal structural protein that is widely distributed in all eukaryotic cells and plays critical roles in multiple cell processes. It is usually regarded as a constitutive housekeeping gene, on the assumption that its expression is normally unaffected by most experimental or physiological conditions. Therefore, ACTB has been widely used as a reference gene for expression analysis in many types of tissues 2,24,71,72.
In this study, the ACTB gene was identified as one of the most stable endogenous controls analyzed, presenting low mean and standard deviation values both within each group and across all four groups. This gene has also been reported as a good reference gene in other studies of different types of cancer, especially in breast cancer expression analysis 37, 73–75.
However, the ACTB gene was found to be differentially expressed under certain conditions in many different types of cancer, such as liver, melanoma, renal, colorectal, gastric, pancreatic, esophageal, lung, breast, prostate, and ovarian cancers, leukemia, and lymphoma. This suggests that it might be an unsuitable endogenous control for expression analysis 1,72,76,77. Some studies reported ACTB as an unsuitable reference gene 53,57,59,78,79.
Two other protein-coding genes, ABL and TBP, also showed stable expression according to their mean values and standard deviations. The ABL gene is a proto-oncogene implicated in cell-cycle regulation, stress responses, integrin signaling and neural development 80–82. The TATA-binding protein (TBP), in turn, is considered a universal transcription factor that is required for transcription initiation by all three nuclear RNA polymerases. It associates with a variety of factors that play important roles in the regulation of gene expression 2,83.
The ABL gene was consistently expressed in the peripheral blood of healthy individuals at levels comparable to other reference genes analyzed in different studies, including expression studies in chronic myelogenous leukemia (CML) 84–86. Furthermore, Weisser et al. (2004) reported that the ABL gene was a suitable endogenous control for monitoring minimal residual disease in acute myeloid leukemia patients 87.
Altogether, the data from many different studies support the expression stability of TBP, indicating that it is a suitable reference gene to be used as a control in studies of various kinds of diseases, including some types of cancer such as bladder cancer and glioblastoma. However, the majority of these studies also showed that TBP performed even better when used together with other reference genes 77,78, 88–92.
RPLPO is a ribosomal protein that is responsible for recruiting both translation factors and other ribosomal proteins to the ribosomal complexes, facilitating protein synthesis. Usually, it is strongly expressed in normal lymph nodes, skin, spleen, and fetal brain tissue, expressed at lower levels in normal lung, bladder, and placenta, and not expressed in normal colon, kidney, and bone marrow 93–95.
The behavior of the RPLPO gene observed in this study matches what is usually reported in other endogenous control validation studies for gene expression techniques: the RPLPO gene presents relatively good expression stability in several studies, but it is not the most suitable reference gene 34, 96–100.
Given these findings, this study suggests that the main set of endogenous controls for the analysis of gene expression in peripheral blood and bone marrow samples from patients with acute leukemias is composed of the ACTB, ABL, TBP and RPLPO genes. In addition, statistical analysis is indispensable in this type of study. It is important to verify the variation of endogenous control expression between groups. Once the delta Ct is calculated, the variability of the endogenous control is transferred to the target. If the endogenous control varies strongly between groups, differences observed in the target may therefore not reflect true target variation, but rather the influence of the endogenous control. For this reason, reference genes must be validated for any gene expression study, since the endogenous control used influences the reliability and accuracy of the results.
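The transfer of endogenous variability to the target through delta Ct can be made concrete with a short Python sketch. It assumes the usual definition, delta Ct = Ct(target) − Ct(reference), and, as an additional assumption not stated in this work, converts the difference between groups to a fold change with the common 2^(−ΔΔCt) approach, which presumes roughly 100% amplification efficiency. The function names and all Ct values are hypothetical.

```python
from statistics import mean

def delta_ct(target_ct, reference_ct):
    """Per-sample delta Ct: target Ct minus reference Ct."""
    return [t - r for t, r in zip(target_ct, reference_ct)]

def delta_delta_ct(target_case, ref_case, target_ctrl, ref_ctrl):
    """Mean delta Ct of the case group minus mean delta Ct of the control group."""
    return mean(delta_ct(target_case, ref_case)) - mean(delta_ct(target_ctrl, ref_ctrl))

# Invented Ct values: the target is identical in both groups.
target_ctrl = [25.0, 25.2, 24.8]
target_case = [25.0, 25.2, 24.8]

# A reference gene that is stable between groups.
stable_ref_ctrl = [20.0, 20.1, 19.9]
stable_ref_case = [20.0, 20.1, 19.9]

# A reference gene whose Ct is 2 cycles higher in the case group.
shifted_ref_case = [22.0, 22.1, 21.9]

ddct_stable = delta_delta_ct(target_case, stable_ref_case, target_ctrl, stable_ref_ctrl)
ddct_shifted = delta_delta_ct(target_case, shifted_ref_case, target_ctrl, stable_ref_ctrl)

# Apparent fold change of the target under the 2^(-ddCt) assumption.
fold_stable = 2 ** (-ddct_stable)
fold_shifted = 2 ** (-ddct_shifted)

print(f"stable reference:  ddCt = {ddct_stable:.2f}, fold change = {fold_stable:.2f}")
print(f"shifted reference: ddCt = {ddct_shifted:.2f}, fold change = {fold_shifted:.2f}")
```

With the stable reference the target shows essentially no change, whereas the 2-cycle shift in the reference alone produces an apparent change of about fourfold in a target whose Ct values never changed.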