Analysis of P4HA1 expression in cancers
In this work, we carried out a comprehensive analysis of the potential biological role of P4HA1 (NM_001017962.3 for mRNA or NP_001017962.1 for protein, Figure S1A) across tumors. First, we found that the P4HA1 protein structure was conserved among distinct species and universally consisted of the P4Ha_N (pfam08336) domain and the 2OG-FeII_Oxy_3 (cll7304) domain (Figure S1B). We also analyzed the evolutionary relationship of the P4HA1 protein among distinct species and displayed it as a phylogenetic tree (Figure S2). Together, these results indicated that P4HA1 might play a vital role in various biological processes.
Then, we evaluated the relative expression level of P4HA1 in distinct normal tissues and cell lines. On the basis of the combined Human Protein Atlas (HPA), GTEx and Functional Annotation of the Mammalian Genome 5 (FANTOM5) datasets, P4HA1 presented the highest expression level in the vagina, followed by skeletal muscle and breast (Figure S3A), showing a tissue-enhanced expression pattern (vagina). We evaluated the HPA/GTEx/FANTOM5 datasets and found that P4HA1 was ubiquitously expressed in immune cells, and its mRNA expression showed low immune cell specificity (Figure S3B). In addition, we employed the HPA/Monaco/Schmiedel datasets to evaluate P4HA1 expression in distinct blood cells. The results indicated that P4HA1 showed low RNA blood cell type specificity (Figure S3C).
Furthermore, the TIMER 2.0 tool was applied to assess the expression pattern of P4HA1 across tumors. As displayed in Figure 1A, the relative expression level of P4HA1 in the tumor tissues of bladder cancer (BLCA), BRCA, cholangiocarcinoma (CHOL), colon adenocarcinoma (COAD), esophageal carcinoma (ESCA), GBM, head and neck squamous cell carcinoma (HNSC), KIRC, LUAD, lung squamous cell carcinoma (LUSC), prostate adenocarcinoma (PRAD), rectum adenocarcinoma (READ), stomach adenocarcinoma (STAD), thyroid carcinoma (THCA) and UCEC was markedly higher than that in adjacent normal tissues, whereas the expression level of P4HA1 in the tumor tissues of kidney chromophobe (KICH) was clearly lower.
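The tumor-versus-normal comparison above can be illustrated with a rank-based two-sample test. The sketch below assumes a two-sided Wilcoxon rank-sum test with a normal approximation (no tie or continuity correction). The expression values are hypothetical toy numbers, not data from this study, and the function names are illustrative.

```python
import math


def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their mean rank."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def rank_sum_test(tumor, normal):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns the U statistic of the tumor group, the z score and the p value.
    """
    n1, n2 = len(tumor), len(normal)
    ranks = average_ranks(list(tumor) + list(normal))
    rank_sum_tumor = sum(ranks[:n1])
    u_tumor = rank_sum_tumor - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_tumor - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u_tumor, z, p_value


# Hypothetical log2 expression values, for illustration only
toy_tumor = [5.1, 6.3, 5.8, 6.9, 6.0, 7.2]
toy_normal = [4.2, 4.8, 5.0, 4.5, 5.3, 4.9]
u, z, p = rank_sum_test(toy_tumor, toy_normal)
print(f"U = {u:.1f}, z = {z:.2f}, p = {p:.4f}")
```

A positive z score indicates higher ranks (higher expression) in the tumor group. For real cohorts, an established statistics package would be preferable to this hand-rolled version.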
Considering that normal tissue data for several tumor types were not available in TCGA, the GEPIA tool was employed to further compare the expression level of P4HA1 between tumor tissues and normal controls. The results demonstrated that the P4HA1 expression level was increased in lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), brain lower grade glioma (LGG) and uterine carcinosarcoma (UCS) tumor tissues, but decreased in acute myeloid leukemia (LAML) (Figure 1B). Additionally, the expression of P4HA1 in adrenocortical carcinoma (ACC), OV, pheochromocytoma and paraganglioma (PCPG), sarcoma (SARC), testicular germ cell tumors (TGCT) and thymoma (THYM) did not differ significantly from that in normal controls (Figure S4A).
We also employed the CPTAC database to evaluate the total protein level of P4HA1 in breast cancer, OV, colon cancer, ccRCC, UCEC and LUAD. The results indicated that the total protein level of P4HA1 in these 6 tumor tissues was much higher than that in negative controls (Figure 1C). The Oncomine database was employed to perform a pooled analysis of distinct studies. The results demonstrated that P4HA1 was upregulated in brain and CNS cancer, breast cancer, CRC, head and neck cancer, kidney cancer, lung cancer, pancreatic cancer and sarcoma in comparison with normal tissues (Figure S5).
Moreover, the “Pathological Stage Plot” module of GEPIA2 was used to evaluate the relationship between P4HA1 expression level and the pathological stages of tumors. We found that the expression level of P4HA1 was associated with pathological stage in several cancers, including ACC, CESC, HNSC, KICH, KIRP, LUAD and TGCT (Figure 1D), but not in others (Figure S5B-E).
Survival analysis of P4HA1
To determine the relationship between P4HA1 expression level and prognosis across distinct cancers, we divided the tumor cases into high-expression and low-expression cohorts according to the expression level of P4HA1 and analyzed the relationship using the TCGA and GEO databases, respectively. The results showed that a high expression level of P4HA1 was associated with poor overall survival (OS) for BLCA, CESC, KICH, KIRP, LUAD, MESO, PAAD, SARC, THCA and UVM within the TCGA project (Figure 2A). Disease-free survival (DFS) analysis indicated a correlation between high expression of P4HA1 and poor prognosis for the TCGA cases of ACC, KICH, KIRP, LUAD, LUSC, MESO, PAAD, PCPG and UVM (Figure 2B). Additionally, low expression of P4HA1 was associated with poor OS for KIRC (Figure 2A), which was inconsistent with its mRNA and protein expression patterns.
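The high/low grouping and survival comparison described above can be sketched as follows. This is a minimal illustration that assumes a median expression cutoff and a two-group log-rank test; the cutoff actually used is not stated here. All values are hypothetical toy numbers and the function names are illustrative.

```python
import math
from statistics import median


def split_by_median(expression):
    """Label each case 'high' or 'low' relative to the cohort median."""
    cutoff = median(expression)
    return ["high" if x > cutoff else "low" for x in expression]


def logrank_test(times, events, groups):
    """Two-group log-rank test.

    events: 1 = event observed, 0 = censored.
    Returns the chi-square statistic (1 degree of freedom) and its p value.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [(g, tt, e) for g, tt, e in zip(groups, times, events) if tt >= t]
        n = len(at_risk)
        n_high = sum(1 for g, _, _ in at_risk if g == "high")
        d = sum(1 for _, tt, e in at_risk if tt == t and e == 1)
        d_high = sum(1 for g, tt, e in at_risk if g == "high" and tt == t and e == 1)
        observed_minus_expected += d_high - d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square tail, 1 d.f.
    return chi2, p_value


# Hypothetical cohort: expression, follow-up in months, event indicator
toy_expression = [2.1, 8.4, 3.0, 7.7, 1.5, 9.2, 2.8, 6.9]
toy_times = [60, 12, 55, 20, 70, 8, 48, 15]
toy_events = [0, 1, 1, 1, 0, 1, 0, 1]

toy_groups = split_by_median(toy_expression)
chi2, p = logrank_test(toy_times, toy_events, toy_groups)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

The sketch does not guard against a zero variance (for example, when one group is empty), so a dedicated survival analysis library would be the safer choice in practice.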
Furthermore, we used the Kaplan-Meier plotter dataset to evaluate the survival data and found that a high P4HA1 expression level was associated with poor OS, distant metastasis-free survival (DMFS) and relapse-free survival (RFS) for breast cancer (Figure S6A), poor OS and progression-free survival (PFS) for ovarian cancer (Figure S6B) and poor OS for lung cancer (Figure S6C). In contrast, a low expression level of P4HA1 was associated with poor OS, first progression (FP) and post-progression survival (PPS) for gastric cancer (Figure S6D) and poor PFS for liver cancer (Figure S6E). We conducted a meta-analysis to further verify the correlation between P4HA1 expression level and prognosis for breast cancer, ovarian cancer, lung cancer, gastric cancer and liver cancer (Figure S7). In addition, a series of subgroup analyses using selected clinical factors was performed, and the conclusions differed among subgroups (Table S1-S5).
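To illustrate how hazard ratios from several datasets can be combined in a meta-analysis, the sketch below assumes fixed-effect inverse-variance pooling on the log scale. The model actually used is not specified here, and a random-effects model is an equally plausible choice. The hazard ratios and confidence intervals are hypothetical toy values.

```python
import math


def pool_hazard_ratios(studies):
    """Fixed-effect inverse-variance pooling of hazard ratios.

    studies: list of (hr, ci_low, ci_high) tuples with 95% confidence intervals.
    Returns the pooled hazard ratio and its 95% confidence interval.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for hr, ci_low, ci_high in studies:
        log_hr = math.log(hr)
        # Recover the standard error of log(HR) from the 95% interval width
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
        weight = 1 / se ** 2
        weighted_sum += weight * log_hr
        total_weight += weight
    pooled_log_hr = weighted_sum / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    return (
        math.exp(pooled_log_hr),
        math.exp(pooled_log_hr - 1.96 * pooled_se),
        math.exp(pooled_log_hr + 1.96 * pooled_se),
    )


# Hypothetical per-dataset hazard ratios (high vs. low expression)
toy_studies = [(1.5, 1.1, 2.05), (1.3, 0.9, 1.88), (1.8, 1.2, 2.7)]
hr, low, high = pool_hazard_ratios(toy_studies)
print(f"pooled HR = {hr:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Studies with narrower confidence intervals receive larger weights, so the pooled estimate is pulled towards the most precise datasets.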
Genetic alterations analysis of P4HA1
The cBioPortal tool (https://www.cbioportal.org/) was utilized to explore the genetic alterations of P4HA1 [19, 20]. As shown in Figure 3A, “mutation” was the primary alteration type in SKCM cases (>4% frequency). “Amplification” was the primary type in stomach tumor patients, with a frequency of 2%. Notably, the prostate cancer cases with genetic alteration (>1% frequency) had copy number deletion of P4HA1. The 3D structure of the P4HA1 protein is presented in Figure 3B. The sites, types and case numbers of the P4HA1 genetic alterations are displayed in Figure 3C. We found that missense mutation of P4HA1 was the major type of genetic alteration, and the R399H/C alteration was detected in 3 cases of UCEC, 2 cases of SKCM, 1 case of GBM and 1 case of PRAD (Figure 3C). Moreover, the relationship between genetic alteration of P4HA1 and prognosis across TCGA cancers was investigated. The results indicated that tumor cases without P4HA1 alteration showed better prognosis in DSS and PFS, but not in OS and DFS, compared with cases with P4HA1 alteration.
DNA methylation analysis of P4HA1
We evaluated the association between P4HA1 expression level and DNA methylation across all TCGA tumors. There was a positive correlation between P4HA1 expression and four methyltransferases in ACC, CESC, KICH, KIRP, LGG, PRAD, READ, SKCM, TGCT, THCA, UCEC and UVM (Figure S8). In addition, we used the MEXPRESS approach to evaluate the potential correlation between P4HA1 DNA methylation and the pathogenesis of distinct cancers. We found a negative correlation between P4HA1 gene expression and DNA methylation in BLCA, SARC and BRCA, but a positive association in TGCT (Figure S9).
Immune infiltration analysis in cancers
Moreover, we analyzed the association between P4HA1 expression and the level of immune cell infiltration across diverse tumor types. As presented in Figure S10, P4HA1 expression was markedly correlated with infiltrating immune cells in most cancers (top three cancers: CESC, KIRC and LGG). Next, we analyzed the correlation of P4HA1 expression with StromalScore, ImmuneScore and ESTIMATEScore across TCGA tumors. The top 3 tumors in which these scores were most strongly correlated with P4HA1 expression were PCPG, LGG and COAD (StromalScore) and CESC, UCEC and LGG (ImmuneScore and ESTIMATEScore) (Figure S11A). In addition, P4HA1 expression level was positively associated with StromalScore in KIRP and KIRC (Figure S11B).
Moreover, the TIMER, EPIC, MCPCOUNTER, XCELL, TIDE, CIBERSORT, CIBERSORT-ABS and QUANTISEQ algorithms were applied to evaluate the potential correlation between P4HA1 gene expression and the infiltration level of CAFs and CD8+ T cells in various cancers. We found a significant positive correlation between P4HA1 expression and the immune infiltration of CAFs for the TCGA tumors of ESCA, HNSC, HNSC-HPV- and OV (Figure 4). On the basis of all or most algorithms, we also found a statistically significant positive relationship between P4HA1 expression level and the estimated infiltration value of CD8+ T cells in HNSC and HNSC-HPV-, but a statistically significant negative correlation in UVM (Figure 5).
Furthermore, we carried out a correlation analysis between P4HA1 expression level and immune checkpoint gene expression. P4HA1 expression was positively correlated with TNFSF15, CD80, VTCN1, TNFSF18 and CD200R1 in multiple types of tumors, but negatively correlated with CD40LG and CD244 in multiple cancers. However, the association of P4HA1 expression with LAG3, ICOS, CTLA4, CD48, TNFSF14 and CD27 was inconsistent among distinct cancer types (Figure S12A). Additionally, we investigated the correlation between P4HA1 expression and tumor mutational burden (TMB)/microsatellite instability (MSI). As shown in Figure S12B-C, we found a positive association between P4HA1 expression and TMB for BRCA, COAD, KICH, LUAD, PAAD, PRAD, SKCM, STAD, THYM and UCEC, but a negative correlation for THCA. Furthermore, P4HA1 expression level was positively correlated with MSI in COAD, KICH, READ and STAD, but negatively correlated with MSI in BRCA, LUAD, PCPG and SKCM. Overall, the results suggested that P4HA1 might play a vital role in regulating tumor immunity, which might explain its influence on the prognosis and survival of tumor patients.
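The TMB and MSI associations above are correlations between P4HA1 expression and a per-sample score. The coefficient used is not stated here; assuming a Spearman rank correlation, which is a common choice for such analyses, a minimal sketch could look like this. The sample values are hypothetical and the function names are illustrative.

```python
import math


def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their mean rank."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample values, for illustration only
toy_expression = [3.2, 5.1, 4.4, 6.8, 2.9, 7.5]
toy_tmb = [1.0, 2.6, 1.9, 4.8, 1.2, 3.9]
print(f"Spearman rho = {spearman(toy_expression, toy_tmb):.2f}")
```

Because the coefficient depends only on ranks, it is insensitive to monotone transformations such as log scaling of the expression values.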
Enrichment analysis of P4HA1-related genes in cancers
To explore the potential mechanism of P4HA1 in tumorigenesis, we screened for P4HA1-binding proteins and genes correlated with P4HA1 expression for pathway enrichment analyses. According to the STRING database, 50 potential P4HA1-binding proteins were obtained, and the interaction network is shown in Figure 6A. Then, we utilized the GEPIA2 tool to combine all the tumor expression data of TCGA and obtained the top 100 genes correlated with the expression of P4HA1. The results revealed that P4HA1 expression was positively correlated with that of BNIP3L (BCL2 interacting protein 3 like) (R = 0.49, p<0.001), FUT11 (fucosyltransferase 11) (R = 0.55, p<0.001), LDHA (lactate dehydrogenase A) (R = 0.53, p<0.001), PGK1 (phosphoglycerate kinase 1) (R = 0.60, p<0.001) and RPL17P50 (ribosomal protein L17 pseudogene 50) (R = 0.56, p<0.001) (Figure 6B). The corresponding heatmap analysis also indicated a positive correlation between P4HA1 and these 5 genes (BNIP3L, FUT11, LDHA, PGK1 and RPL17P50) in the majority of 32 types of cancers (Figure 6C). We performed an intersection analysis of the two groups and found 5 shared genes: PLOD1 (procollagen-lysine,2-oxoglutarate 5-dioxygenase 1), PLOD2, LOXL2 (lysyl oxidase like 2), EGLN1 (egl-9 family hypoxia inducible factor 1) and EGLN3 (Figure 6D).
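The intersection analysis of the two gene groups amounts to a set intersection. The sketch below uses truncated, illustrative lists: the gene symbols named in the text are real, while the entries beginning with PLACEHOLDER are hypothetical stand-ins for the members of each list that are not named here.

```python
def shared_genes(group_a, group_b):
    """Return the alphabetically sorted gene symbols present in both groups."""
    return sorted(set(group_a) & set(group_b))


# Truncated stand-in for the 50 STRING binding proteins
string_binding_proteins = [
    "PLOD1", "PLOD2", "LOXL2", "EGLN1", "EGLN3",
    "PLACEHOLDER_A", "PLACEHOLDER_B",  # hypothetical entries
]

# Truncated stand-in for the 100 GEPIA2 expression-correlated genes
gepia2_correlated_genes = [
    "BNIP3L", "FUT11", "LDHA", "PGK1", "RPL17P50",
    "PLOD1", "PLOD2", "LOXL2", "EGLN1", "EGLN3",
    "PLACEHOLDER_C",  # hypothetical entry
]

print(shared_genes(string_binding_proteins, gepia2_correlated_genes))
```

With the full lists, the same call would be expected to return the 5 shared genes reported in Figure 6D.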
Moreover, the two datasets were combined to conduct KEGG and GO enrichment analyses. The results of the GO enrichment analysis revealed that most of these genes were linked to carbohydrate binding, monosaccharide binding, oxidoreductase activity and other terms (Figure 6E). The KEGG data suggested that “HIF-1 signaling pathway”, “Diabetic cardiomyopathy”, “Lysine degradation” and other pathways might be involved in the effect of P4HA1 on tumorigenesis.
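Over-representation analyses of the KEGG/GO type are commonly based on a hypergeometric test, which asks how likely it is to see at least the observed number of query genes in a pathway by chance. The sketch below assumes this test. The query size of 145 follows from combining the 50 and 100 genes with 5 shared, whereas the background size, pathway size and overlap are hypothetical.

```python
from math import comb


def enrichment_p_value(background_size, pathway_size, query_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of pathway genes found when query_size genes are drawn
    without replacement from a background of background_size genes, of which
    pathway_size belong to the pathway.
    """
    max_overlap = min(pathway_size, query_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 20000 background genes, a 100-gene pathway,
# 145 query genes, 8 of which fall in the pathway
p = enrichment_p_value(20000, 100, 145, 8)
print(f"p = {p:.2e}")
```

Enrichment tools usually follow this step with a multiple-testing correction across all tested terms, which is omitted here.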
Then, GSEA was conducted to analyze the functional enrichment associated with high and low P4HA1 expression (Figure S13). KEGG terms showed that high expression of P4HA1 was primarily related to cell cycle, oocyte meiosis and one carbon pool by folate. HALLMARK terms showed that high expression of P4HA1 was mainly associated with mTORC1 signaling, G2M checkpoint and glycolysis.
P4HA1 could promote the proliferation, migration and invasion of RCC
To further confirm the relative expression level of P4HA1 in RCC tissues, qRT-PCR and western blot assays were performed. The results indicated that P4HA1 was upregulated in RCC tissues at both the mRNA and protein levels (Figure 7A, B). Then, we transfected RCC cells with a P4HA1 overexpression plasmid (OE-P4HA1) and found that OE-P4HA1 markedly elevated the expression of P4HA1 (Figure 7C, D). Moreover, EdU and Transwell assays were conducted to verify the biological function of P4HA1 in RCC. As shown in Figure 7E-G, overexpression of P4HA1 significantly promoted the proliferation, migration and invasion of RCC cells.
P4HA1 regulated RCC progression via EMT
To explore the potential mechanism of P4HA1 in RCC, we transfected 786-O and ACHN cells with the P4HA1 overexpression plasmid and detected the mRNA and protein levels of EMT-related genes by qRT-PCR, western blot and immunofluorescence assays. The results revealed decreased expression of E-cadherin and increased expression of N-cadherin and vimentin in the OE-P4HA1 group (Figure 8A-C). These results suggested that P4HA1 might promote RCC progression by promoting EMT.