1. Expression of the ALDH family across all TCGA cancers.
Expression data for all ALDH family genes across all cancers were obtained from the TCGA database. As shown in Figure 1, the expression levels of the ALDH family genes differ widely across cancers (Figure 1.A), and the expression of the ALDH family genes is highly correlated (Figure 1.B). In detail, ALDH9A1, ALDH18A1, ALDH3A2, ALDH1A1, ALDH1B1 and ALDH2 are expressed at higher levels than the other ALDH family genes in all cancers, and the expression of ALDH2 and ALDH8A1 is highly positively correlated. We further analyzed the six ALDH family genes most highly expressed in cancer. Comparing normal and tumor tissues, the expression of ALDH3B1, ALDH18A1, etc., increased in a variety of cancers, but ALDH9A1 and ALDH2 showed the opposite trend (Figure 1.C) in UCEC. Except for ALDH3B1, which was statistically significant, the expression of the other 5 ALDH family members decreased in tumor tissues. These results show that the expression of ALDH family genes changes in a variety of cancers, and that these changes may be involved in the regulation of tumors.
2. Prognostic significance of the ALDH family.
To further confirm the effect on UCEC patients of the ALDH family genes that are highly expressed in cancer, we retrieved the survival information of each patient from the TCGA database and combined it with the expression analysis of the ALDH family genes. Changes in other ALDH family genes do not affect the overall survival of UCEC patients; only changes in the expression of ALDH2 significantly affect overall survival (Figure 2.A). The results showed that overall survival was lower in the low-expression groups of ALDH2 and ALDH18A1. For the ALDH2 low-expression group, disease-specific survival (DSS), disease-free interval (DFI), and progression-free interval (PFI) matched the overall survival result, all showing worse survival (Figure 2.B). For the ALDH18A1 low-expression group, DSS, DFI, and PFI did not differ from the high-expression group (Figure 2.C). These results indicate that low ALDH2 expression is associated with worse prognosis across the different survival endpoints. In addition to survival time and survival status, TCGA data also contain complete clinical information for each patient. The clinical analysis found that ALDH2 expression was not significantly correlated with the age or race of the patients (Figure 2.D).
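The survival comparison between expression groups can be illustrated with a minimal Kaplan-Meier sketch. The source does not state which estimator or software produced the curves in Figure 2, so the product-limit estimator here is an assumption, and the function name and toy follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data for one expression group (not from the study)
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Running the same function on the high- and low-expression groups gives the two curves that a log-rank test would then compare.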
3. GO, KEGG, and GSEA functional enrichment analysis.
In this study, we downloaded the expression data of all UCEC samples in the TCGA database. The samples were divided into 2 groups according to the median ALDH2 expression, and the differentially expressed genes between the groups were analyzed. A total of 477 differential genes were identified, of which 322 were up-regulated and 155 were down-regulated (Figure 3.A, B). Enrichment analyses were then performed to understand the possible regulatory functions of these differential genes. The six most strongly enriched GO terms are (Figure 3.C): humoral immune response mediated by circulating immunoglobulin; complement activation, classical pathway; complement activation; immunoglobulin mediated immune response; B cell mediated immunity; and protein activation cascade. The six most strongly enriched KEGG pathways are (Figure 3.D): Cell adhesion molecules, Type I diabetes mellitus, Intestinal immune network for IgA production, Hematopoietic cell lineage, Th1 and Th2 cell differentiation, and Phagosome. GSEA found that the main GO processes in the ALDH2 up-regulated group included AZUROPHIL GRANULE LUMEN, RESPONSE TO INTERFERON GAMMA, MYELOID LEUKOCYTE MIGRATION, IMMUNE RECEPTOR ACTIVITY, and TETRAPYRROLE METABOLIC PROCESS (Figure 3.E). The GO processes in the down-regulated group included MITOCHONDRIAL GENOME MAINTENANCE, REGULATION OF DNA RECOMBINATION, CELL CYCLE CHECKPOINT, DNA REPLICATION CHECKPOINT, and SULFUR AMINO ACID BIOSYNTHETIC PROCESS (Figure 3.F). These enrichment results indicate that altered ALDH2 expression may be related to cellular immunity. Therefore, the immune infiltration score of each UCEC patient was calculated in R with the CIBERSORT package. The results showed that patients in the ALDH2 low-expression group had lower scores for CD8+ T cells and plasma cells (Figure 3.G).
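The median-based grouping that opens this section can be sketched as follows. This is a minimal illustration only: the sample identifiers and expression values are hypothetical, and assigning samples exactly at the median to the low group is an assumption, since the source does not say how ties were handled.

```python
from statistics import median


def split_by_median(expression):
    """Split samples into high/low groups around the median expression.

    expression : dict mapping sample ID -> expression value of one gene
    Samples exactly at the median go to the low group (an assumption).
    """
    cutoff = median(expression.values())
    high = sorted(s for s, v in expression.items() if v > cutoff)
    low = sorted(s for s, v in expression.items() if v <= cutoff)
    return cutoff, high, low


# Hypothetical toy expression values (not from the study)
toy_expression = {"S1": 5.2, "S2": 7.9, "S3": 6.1, "S4": 8.4}
toy_cutoff, toy_high, toy_low = split_by_median(toy_expression)
```

The two resulting sample lists are what a differential expression analysis would then compare.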
4. PPI network construction.
The PPI network of the DEGs was constructed with the STRING network tool. After hiding all disconnected nodes, the network contains 392 nodes and 295 edges (Figure 4.A). The six genes with the most interactions are ORM2, C3, CTSH, HLA-DPA1, HLA-DPB1, and HLA-DQA1 (Figure 4.B). The results were then imported into Cytoscape, and the MCODE and cytoHubba apps were used to compute the core subnetwork (Figure 4.C) and core genes (Figure 4.D) of the PPI network.
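Ranking genes by their number of interactions amounts to counting node degree in the edge list. A minimal sketch is given below, assuming an undirected edge list of gene pairs; the node names and edges are placeholders, not the STRING output used in the study.

```python
from collections import Counter


def top_degree_nodes(edges, n):
    """Return the n nodes with the most edges as (node, degree) pairs.

    edges : iterable of (node_a, node_b) pairs from an undirected network
    Ties are broken alphabetically so the output is deterministic.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Placeholder edge list (hypothetical, for illustration only)
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]
toy_top = top_degree_nodes(toy_edges, 2)
```

Degree is only one centrality measure; the core-gene ranking from cytoHubba may rely on a different one.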
5. Construction of the UCEC risk score model.
Our data show that the ALDH2 high-expression group has better OS, DSS, and PFI (Figure 2). Therefore, a prognostic model of overall survival based on ALDH2 expression was constructed. The UCEC expression profiles in the TCGA database were randomly divided into a training group and a test group of equal size. Univariate Cox analysis showed that a total of 62 genes were significantly related to the overall survival of UCEC patients. LASSO regression was then used to screen out 15 genes to prevent the model from overfitting (Figure 5.A, B). Finally, a 6-factor overall survival risk score model was established using stepwise Cox regression. The risk score for overall survival was: CDKN2A × 0.18793 + WNT10A × 0.23536 + AQP5 × (-0.21242) + STX18 × (-0.30617) + GZMA × (-0.44271) + LCN2 × (-0.15148). Multivariate Cox analysis found that all six factors were highly significantly related to prognosis (Figure 5.C).
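The risk score is a weighted sum of the expression of the six genes, which can be written as a short function. The coefficients are those reported above; the function name, the input format, and the scale of the expression values are assumptions, since the source does not state how expression was normalized.

```python
# Coefficients of the six-gene overall survival model as reported in the text
COEFFICIENTS = {
    "CDKN2A": 0.18793,
    "WNT10A": 0.23536,
    "AQP5": -0.21242,
    "STX18": -0.30617,
    "GZMA": -0.44271,
    "LCN2": -0.15148,
}


def risk_score(expression):
    """Return the weighted sum of gene expression for one patient.

    expression : dict mapping gene symbol -> expression value
                 (normalization is an assumption; use the model's own scale)
    Raises KeyError if any of the six genes is missing.
    """
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


# Hypothetical patient with unit expression for every gene
toy_patient = {gene: 1.0 for gene in COEFFICIENTS}
toy_score = risk_score(toy_patient)
```

Genes with positive coefficients (CDKN2A, WNT10A) raise the score as their expression increases, while the four genes with negative coefficients lower it.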
Next, the test group was used to evaluate the model. The risk scores of the 268 UCEC patients in the test group are shown in Figure 5.D. The survival status plot shows more deaths in the high-risk group (Figure 5.E), and the survival prognosis of the high-risk group is worse (Figure 5.F). The C-index of the model is 0.798. The AUC at 1 and 3 years is 0.738, and the AUC at 5 years is 0.764 (Figure 5.G). These results show that the model has good predictive ability. We therefore drew a nomogram (Figure 5.H) based on the risk score model; its validation is shown in Figure 5.I.
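For readers unfamiliar with the C-index, the sketch below shows one common definition (Harrell's concordance index): the fraction of comparable patient pairs in which the patient with the higher risk score has the shorter survival. The source does not say which implementation produced the reported value, so this is an illustration under that assumption; pairs with tied survival times are skipped for simplicity, and the toy data are hypothetical.

```python
def concordance_index(times, events, scores):
    """Harrell's C-index for risk scores (higher score = higher risk).

    times  : follow-up time for each patient
    events : 1 if death was observed, 0 if censored
    scores : predicted risk score for each patient
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died before patient j's time
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not from the study)
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.3, 0.5, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_scores)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.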
6. Effect of ALDH2 overexpression on tumor progression in vitro and in vivo.
To determine whether ALDH2 expression is reduced in endometrial cancer cell lines, we measured it by Western blot and qRT-PCR. Compared with normal human endometrial epithelial cells (hEEC), ALDH2 expression is reduced in ISHIKAWA, SPEC-2, and KLE cells (Figure 6.A, B). We then constructed a lentiviral plasmid, pLKO.1-ALDH2, to overexpress ALDH2. The virus was collected after transfection of 293T cells, concentrated with PEG8000, and used to infect the ISHIKAWA and KLE cell lines. The overexpression efficiency was verified by Western blot and qRT-PCR (Figure 6.C, D). Next, the relationship between ALDH2 expression and the proliferation ability of endometrial cancer cells was assessed by CCK-8 and colony formation assays (Figure 6.E, F). The results showed that restoring ALDH2 expression through overexpression reduces the proliferation ability of endometrial cancer cell lines. In addition, a tumor xenograft model was used to determine whether ALDH2 overexpression has the same effect in vivo. The results showed that tumor mass and volume in mice of the ALDH2 overexpression group were significantly smaller than in the control group (Figure 6.G).
7. Hsa-miR-135b-3p binds to the 3'UTR of ALDH2.
Changes in ALDH2 expression play a crucial role in the survival and immune infiltration of UCEC patients, so there is an urgent need to find a way to regulate ALDH2. Analysis of gene mutation frequency found that, in most patients, the loss of the ALDH2 effect is not caused by mutation (Figure 7.A, B).
Numerous studies have found that miRNAs can silence gene expression by targeting the 3'UTR of a gene. We speculated that ALDH2 may also be regulated by this mechanism, and therefore screened for differentially expressed miRNAs between UCEC patients and the normal group. The volcano plot showed that a total of 122 up-regulated miRNAs were identified (Figure 7.C). In parallel, the bioinformatics prediction website TargetScan predicted 312 miRNAs that bind to the 3'UTR of ALDH2. A total of 10 miRNAs were selected from the intersection, and the correlation between their expression and ALDH2 expression is shown in Figure 7.D, E. The three miRNAs with the strongest negative correlation are hsa-miR-301b-5p, hsa-miR-3187-3p and hsa-miR-135b-3p.
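The candidate selection described above (intersect the up-regulated miRNAs with the predicted binders, then rank by correlation with ALDH2) can be sketched as follows. The source does not name the correlation measure, so Pearson correlation is an assumption here, and all miRNA names and expression values in the toy data are placeholders.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rank_candidates(upregulated, predicted, mirna_expr, target_expr):
    """Intersect two miRNA sets and rank by correlation with the target.

    Returns (miRNA, r) pairs, most negative correlation first.
    """
    shared = set(upregulated) & set(predicted)
    scored = [(name, pearson(mirna_expr[name], target_expr)) for name in shared]
    return sorted(scored, key=lambda kv: (kv[1], kv[0]))


# Placeholder data (hypothetical, for illustration only)
toy_up = {"miR-a", "miR-b", "miR-c"}
toy_predicted = {"miR-b", "miR-c", "miR-d"}
toy_mirna_expr = {"miR-b": [1, 2, 3, 4], "miR-c": [1, 3, 2, 4]}
toy_target_expr = [4, 3, 2, 1]
toy_ranked = rank_candidates(toy_up, toy_predicted, toy_mirna_expr, toy_target_expr)
```

A strongly negative correlation only suggests a regulatory relationship, so candidates ranked this way still need experimental confirmation.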
To verify which miRNA regulates the expression of ALDH2, we constructed a luciferase reporter plasmid carrying the ALDH2 3'UTR. The dual-luciferase reporter assay showed that only the hsa-miR-135b-3p transfection group had a decrease in relative fluorescence intensity, whereas transfection of hsa-miR-301b-5p or hsa-miR-3187-3p produced no significant difference (Figure 7.F). In the mutation group, the mutation eliminated the inhibitory effect of hsa-miR-135b-3p on ALDH2 (Figure 7.G). RIP and RNA pull-down assays also found that hsa-miR-135b-3p enriched ALDH2, whereas the control group did not show this effect (Figure 7.H, I). qRT-PCR showed that after transfection of the hsa-miR-135b-3p mimic, the expression of hsa-miR-135b-3p increased significantly (Figure 7.J), while Western blot and qRT-PCR showed that the expression of ALDH2 decreased (Figure 7.K, L). In addition, TCGA survival data showed that the hsa-miR-135b-3p high-expression group had a worse prognosis (Figure 7.M). Together, these results indicate that hsa-miR-135b-3p can down-regulate the expression of ALDH2.