In this study, we analyzed the AML samples from the TCGA database using the WGCNA method and obtained 22 AML gene modules. These modules were then combined with the clinical characteristics of the samples to analyze their correlation with age, gender, race, and other traits. The module most positively correlated with the clinical characteristics was the grey60 module, and its correlation with death was the strongest. The genes in the selected module were further screened with a Cox proportional hazards regression model, with the aim of achieving the best predictive performance with as few gene variables as possible. To our knowledge, this is the first study to combine these two methods to construct a prognostic model, which allows a more comprehensive understanding of the prognosis of AML. This screening strategy offers a convenient way to identify prognostic genes and suggested five potential prognostic genes: PTCRA, SLC4A3, SERPINF1, PPM1J, and MB.
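The module-trait step summarized above can be illustrated with a minimal sketch. It assumes that each module is represented by an eigengene (one value per sample) and that Pearson correlation is used to relate it to a clinical trait; the function names, the module values, and the death labels below are hypothetical placeholders and are not data or code from this study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rank_modules_by_trait(eigengenes, trait):
    """Return (module, r) pairs sorted by decreasing absolute correlation."""
    scores = [(name, pearson(values, trait)) for name, values in eigengenes.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Toy, made-up eigengene values for six hypothetical samples (NOT study data).
toy_eigengenes = {
    "grey60": [0.9, 0.7, 0.8, -0.6, -0.9, -0.9],
    "blue": [0.1, -0.2, 0.3, 0.2, -0.1, -0.3],
}
toy_death = [1, 1, 1, 0, 0, 0]  # 1 = deceased, 0 = alive (illustrative only)

ranking = rank_modules_by_trait(toy_eigengenes, toy_death)
top_module, top_r = ranking[0]
print(top_module, round(top_r, 3))
```

In a real analysis the eigengenes would come from the WGCNA output and the significance of each correlation would also be assessed; this sketch only shows how a module can be ranked against a single trait.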
The relevant literature shows that only a few studies have investigated the above-mentioned genes in AML, although they deserve more attention and further evaluation. In the time-dependent ROC analysis of these genes, PPM1J displayed the largest area under the curve and, in the nomogram, a significant association with the survival time of AML patients, indicating that PPM1J may be a valuable biomarker for AML and useful for assessing patient prognosis. The PPM1J gene encodes a serine/threonine protein phosphatase and belongs to the protein phosphatase 2C gene family[14, 15]. Li et al. reported a single nucleotide polymorphism (SNP) at the PPM1J locus as a new locus associated with the development of renal function. Whisenant et al. developed a method combining expert knowledge and machine learning to find new 'D-sites' in the genome and identified PPM1J as a new candidate target of c-Jun N-terminal kinase (JNK), which is of great significance for the study of cell biology and the cancer protease network. Yan et al. used comprehensive bioinformatics approaches, such as the ESTIMATE algorithm and PPI analysis, to mine the AML TCGA database and found that the expression of PPM1J is closely related to the prognosis of AML, which is consistent with our results.
The other co-expressed genes have also shown potential importance for the occurrence of cancer and for patient survival. PTCRA is a single-pass membrane protein that is expressed in immature T cells and regulates T cell development, and it is essential for the initiation of methylation during embryonic development[18, 19]. By performing whole-exome sequencing of patients with chronic myelogenous leukemia, Russian researchers found that different genetic variants of PTCRA may be a leading cause of the failure of targeted tumor therapy. Proteins of the SLC4 gene family, to which SLC4A3 belongs, include bicarbonate transporters that are involved in ion transport in most cells[21, 22]. Kimi et al. identified SLC4A3 as a significant radiation-sensitive gene in breast cancer and head and neck cancer datasets. SERPINF1 (serpin peptidase inhibitor, clade F, member 1), located on chromosome 17p13.3, encodes pigment epithelium-derived factor (PEDF). As a member of the serine protease inhibitor family, PEDF has a high affinity for type I collagen in bone tissue. The high expression of PEDF in osteoblasts and in active areas of bone formation suggests that it plays an important role in bone angiogenesis and matrix remodeling. PEDF is also widely involved in antitumor-related cellular processes, such as antiangiogenesis, tumor growth inhibition, and anti-metastasis[27, 28]. Given these antitumor effects, it is reasonable to examine aberrant SERPINF1 expression in relation to clinical endpoints, including recurrence, metastasis, and death. Myoglobin (MB) is a monomeric heme protein present in striated muscle cells that accounts for 1%-2% of the total weight of skeletal muscle. It facilitates the transport of stored oxygen to mitochondria[29, 30], an essential cellular process that has been linked to chemotherapy resistance in AML[31, 32].
With the development of high-throughput screening, a growing number of genes related to human diseases have been discovered. This is of high biological significance, as it provides new directions for investigating clinical problems in various cancers, including their pathological mechanisms and the discovery of effective molecular targeted therapies.
Although we identified target genes that were significantly related to the survival of AML patients, this study still has certain limitations. First, the study is based solely on the AML samples in the TCGA database, so the results might be affected by ethnic differences in the source population. In future studies, AML samples from China and the rest of Asia need to be collected for further verification. Second, additional rigorous experiments need to be performed on the key genes revealed in this study. As a follow-up, further in vitro and in vivo experiments, such as RT-PCR, western blot, and animal experiments, should be conducted to biologically verify the key genes identified.