Identification of hub genes and their clinical value for predicting the development of prostate cancer from benign prostate hyperplasia by bioinformatic analysis

doi:10.21203/rs.3.rs-840637/v3

Research Article

This work is licensed under a CC BY 4.0 License

Prostate cancer (PCa) and benign prostate hyperplasia (BPH) are common diseases in men. Studies have shown that genetic factors contribute to both diseases, but the genetic association between them remains unclear. We used datasets from the Gene Expression Omnibus (GEO) database to identify the differentially expressed genes (DEGs) between BPH and PCa. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used to find the pathways in which the DEGs were enriched. The STRING database was used to build a protein–protein interaction (PPI) network and to find hub genes within it. GEPIA was used to analyze expression and survival data for the hub genes, and R software was used to perform regression analysis. Finally, the results were tested in other databases, clinical samples, and PCa cells. Fifteen up-regulated and forty-five down-regulated genes were identified from the GEO datasets, and seven hub genes were found in the PPI network. Hub gene expression was tested on The Cancer Genome Atlas (TCGA) data. Except for CXCR4, all hub genes were expressed differently between tumor and normal samples. With the exception of CXCR4, the hub genes had diagnostic value for predicting PCa, and alterations in these genes were associated with the risk of PCa. The expression of CSRP1, MYL9, and SNAI2 changed across tumor stages. CSRP1 and MYH11 affected disease-free survival (DFS). The same results were observed in different databases. The expression and function of MYC, MYL9, and SNAI2 were validated in clinical samples and PCa cells. In conclusion, the seven hub genes among the sixty DEGs are potential targets for predicting which BPH patients may later develop PCa.

Benign prostate hyperplasia

Prostate cancer

Hub gene

Clinical value

Bioinformatic analysis

Benign prostate hyperplasia (BPH) is common among elderly men over 70. Prostate cancer (PCa) still has one of the highest incidence rates of all cancers and is a major cause of death in elderly men[1]. Although BPH and PCa are different diseases (BPH is a benign disease that arises in the transitional zone, whereas PCa is a malignant tumor that arises mainly in the peripheral zone), they are related. The relationship between BPH and PCa was first noted in the 1950s in studies of prostate glands, and over the past 60 years several studies have reported associations between the two. Sommers first studied BPH and PCa in cadavers and found BPH in 80% of cadavers with PCa and in 45% of those without PCa[2]. A later study reported similar results[3]. In 2002, Hammarsten and Hogstedt reported that fast-growing BPH may be a risk factor for developing clinical PCa[4]. Another study suggested that prostate volume may be related to the aggressiveness of PCa, with PCa located in small glands being more aggressive than PCa located in larger glands[5]. This suggests that BPH may affect the degree of malignancy of PCa. Orsted et al. followed 3,009,258 Danish men from 1980 to 2006. Over 27 years of follow-up, they found that clinical BPH was linked with an increased risk of PCa and a higher risk of PCa-related death[6].

In 2001, Luo et al. investigated the genetic relationship between BPH and PCa and found that 3,215 genes were expressed differently between BPH and PCa samples[7]. Some studies have reported that gene expression could be a causal factor in the development of PCa from BPH and may even affect the degree of malignancy of PCa[8, 9]. Microarray technology and bioinformatic analysis have been widely used to analyze differentially expressed genes (DEGs) and to find functional pathways that help us better understand PCa[10]. Through gene expression profiling, some investigations have found many DEGs that play critical roles in the development of PCa from BPH[8, 9]. However, the genetic changes that distinguish PCa from BPH are still unclear.

The Gene Expression Omnibus (GEO) database provides many microarray datasets. Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, and protein–protein interaction (PPI) network analysis are commonly used to understand the potential mechanisms underlying the occurrence and progression of diseases. We therefore used bioinformatic analysis to find key genes that may be important for the development of PCa from BPH. Then, based on genetic and clinical data from The Cancer Genome Atlas (TCGA) database, a diagnostic model was built to assess the diagnostic value of the hub genes for PCa. Tumor staging and survival time were also analyzed. Logistic regression was used to assess whether alterations in the hub genes are associated with PCa. The expression of the hub genes was validated in other databases. Finally, we tested the expression and function of three hub genes, MYC, MYL9, and SNAI2, in PCa using clinical specimens and C4-2 PCa cells.

Microarray data

The GEO database (http://www.ncbi.nlm.nih.gov/geo/) at the National Center for Biotechnology Information (NCBI) is a public repository of gene expression, chip, and microarray data[11]. The criteria for including GSE datasets in the study were as follows: 1. The GSE samples have complete gene expression data from high-throughput sequencing that can be downloaded from the GEO database. 2. The GSE dataset includes both BPH samples and PCa samples. 3. BPH and PCa samples are clearly defined. Three datasets, GSE5377, GSE104749, and GSE30994, met these criteria and were downloaded from the GEO database. GSE5377 included 3 BPH samples and 17 PCa samples. GSE104749 included 4 BPH samples and 4 PCa samples. GSE30994 included 3 BPH samples and 3 PCa samples. Overall, 10 BPH and 24 PCa samples were enrolled in our study.

Data handling and DEGs searching

The raw data were obtained and normalized using R software. Using the platform annotation files, the probe IDs in the expression matrix were replaced by the corresponding gene IDs, and if multiple probes corresponded to the same gene, their average value was calculated in R and used for further study[12]. All genes in each dataset were then analyzed using the limma R package, and genes with an adjusted P-value<0.05 and |log2 fold change (FC)|>1 were considered DEGs. We then used the online Venn diagram web tool (http://bioinformatics.psb.ugent.be/webtools/Venn/) to find the integrated DEGs. The lists of up-regulated and down-regulated genes were downloaded for further study.
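The selection rules in this paragraph can be sketched in a few lines. The example below is a minimal illustration in Python, not the authors' R/limma workflow: it assumes the log2 fold changes and adjusted P-values have already been computed, and every probe name, gene name, and number in it is hypothetical.

```python
from statistics import mean

def collapse_probes(probe_values, probe_to_gene):
    """Average the expression of all probes that map to the same gene."""
    by_gene = {}
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:  # skip probes without an annotated gene
            by_gene.setdefault(gene, []).append(value)
    return {gene: mean(values) for gene, values in by_gene.items()}

def classify_degs(results, p_cutoff=0.05, lfc_cutoff=1.0):
    """Split genes into up- and down-regulated sets.

    results maps gene -> (log2 fold change, adjusted P-value).
    A gene is a DEG when adjusted P < p_cutoff and |log2FC| > lfc_cutoff.
    """
    up, down = set(), set()
    for gene, (log2_fc, adj_p) in results.items():
        if adj_p < p_cutoff and abs(log2_fc) > lfc_cutoff:
            (up if log2_fc > 0 else down).add(gene)
    return up, down

# Hypothetical toy values, not taken from the GEO datasets.
probe_values = {"p1": 7.0, "p2": 9.0, "p3": 5.5}
probe_to_gene = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_B"}
gene_expression = collapse_probes(probe_values, probe_to_gene)

dataset_1 = {"GENE_A": (1.8, 0.01), "GENE_B": (-2.1, 0.003), "GENE_C": (0.4, 0.02)}
dataset_2 = {"GENE_A": (1.2, 0.04), "GENE_B": (-1.5, 0.20), "GENE_C": (-1.3, 0.01)}

up_1, down_1 = classify_degs(dataset_1)
up_2, down_2 = classify_degs(dataset_2)

# The integrated DEGs are the genes shared by every dataset (the Venn overlap).
shared_up = up_1 & up_2
shared_down = down_1 & down_2
```

In this toy input only GENE_A passes both cutoffs in both datasets, so it is the single shared up-regulated gene; no down-regulated gene is shared.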

GO and KEGG pathway analysis of DEGs

The Database for Annotation, Visualization, and Integrated Discovery (DAVID) v6.8 (https://david.ncifcrf.gov/) was used to perform GO functional and KEGG pathway analyses of the integrated DEGs[13]. The GO functional analysis of the integrated DEGs covered three categories: biological processes (BP), cell components (CC), and molecular functions (MF). P<0.05 was considered statistically significant[14].
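Enrichment tools of this kind ask whether a pathway's genes appear among the DEGs more often than chance would predict. The sketch below illustrates the idea with a plain hypergeometric upper-tail test; DAVID reports a modified Fisher exact P-value, so this is a simplified stand-in, and the gene counts are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(total_genes, term_genes, deg_count, overlap):
    """P(X >= overlap) when deg_count genes are drawn without replacement
    from total_genes background genes, term_genes of which carry the term."""
    max_k = min(term_genes, deg_count)
    numerator = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, deg_count - k)
        for k in range(overlap, max_k + 1)
    )
    return numerator / comb(total_genes, deg_count)

# Hypothetical example: 20 background genes, 5 annotated to a pathway,
# 4 DEGs, 2 of which fall in the pathway.
p_value = hypergeom_upper_tail(total_genes=20, term_genes=5, deg_count=4, overlap=2)
enriched = p_value < 0.05
```

With these toy counts the overlap is not significant at the P<0.05 threshold used in the text.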

PPI network and module analysis

A PPI network of the integrated DEGs was constructed with the online Search Tool for the Retrieval of Interacting Genes (STRING) database using the default medium confidence score (0.4) (http://www.string-db.org/)[15]. It helped us find the key genes and critical gene modules that participate in the progression from BPH to PCa. Cytoscape software was used to reconstruct the PPI network, and module and GO analyses were carried out with two Cytoscape plug-ins, Molecular Complex Detection (MCODE) and the Biological Network Gene Ontology tool (BiNGO), to clarify the biological significance of the gene modules involved in the progression from BPH to PCa. P<0.05 indicated a significant difference, and the genes in the significant module were designated as hub genes[16].
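As a rough illustration of how a confidence cutoff shapes the network, the sketch below drops edges scoring below 0.4 and ranks genes by the number of remaining partners. Degree ranking is a much simpler heuristic than the MCODE clustering described above and is used here only as a stand-in; the edge list and scores are hypothetical.

```python
from collections import Counter

def rank_by_degree(edges, min_score=0.4):
    """Count interaction partners per gene after dropping low-confidence edges.

    edges is an iterable of (gene_a, gene_b, combined_score) tuples.
    Returns (gene, degree) pairs sorted from most to least connected.
    """
    degree = Counter()
    for gene_a, gene_b, score in edges:
        if score >= min_score:
            degree[gene_a] += 1
            degree[gene_b] += 1
    return degree.most_common()

# Hypothetical edges and scores for illustration only.
edges = [
    ("GENE_A", "GENE_B", 0.95),
    ("GENE_A", "GENE_C", 0.90),
    ("GENE_B", "GENE_C", 0.85),
    ("GENE_A", "GENE_D", 0.60),
    ("GENE_E", "GENE_F", 0.30),  # below the 0.4 cutoff, so it is ignored
]
ranking = rank_by_degree(edges)
most_connected_gene, most_connected_degree = ranking[0]
```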

Construction of risk prediction model and survival analysis

Hub gene expression in normal prostate specimens and PCa tissues was compared using Gene Expression Profiling Interactive Analysis (GEPIA; http://www.gepia.cancer-pku.cn/), which is based on the TCGA database[17]. Logistic regression was performed to estimate the hazard ratios associated with hub gene alterations leading to PCa. A nomogram was built to predict the risk value of the hub genes, and a forest plot was used to show the hazard ratios more intuitively. Moreover, the prognostic value of each gene was evaluated with GEPIA, and overall survival (OS) and disease-free survival (DFS) were analyzed.
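For a single binary predictor, the coefficient of a univariable logistic regression equals the log odds ratio of the corresponding 2x2 table, so the effect size shown in a forest plot can be illustrated without fitting a model. The sketch below assumes, purely for illustration, that expression has been split into "altered" and "unaltered" groups; the counts are hypothetical, and the interval is a standard Wald 95% CI.

```python
from math import exp, log, sqrt

def odds_ratio_with_ci(altered_cases, altered_controls,
                       unaltered_cases, unaltered_controls, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table."""
    odds_ratio = (altered_cases * unaltered_controls) / (
        altered_controls * unaltered_cases
    )
    # Standard error of the log odds ratio.
    se_log_or = sqrt(
        1 / altered_cases + 1 / altered_controls
        + 1 / unaltered_cases + 1 / unaltered_controls
    )
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 30 PCa and 10 normal samples with altered expression,
# 20 PCa and 40 normal samples without.
estimate, ci_lower, ci_upper = odds_ratio_with_ci(30, 10, 20, 40)
```

An interval that excludes 1 corresponds to a significant association at the two-tailed 0.05 level.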

Construction of the diagnostic model and decision curve analysis

To further analyze the hub genes’ diagnostic value for PCa, we collected the gene expression of hub genes and clinical data from TCGA databases (https://portal.gdc.com)[18]. GraphPad Prism 7 (GraphPad Software, Inc., San Diego, CA) was used to draw the receiver operating characteristic (ROC) curve and decision curve analysis (DCA) was carried out by R software.
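The two quantities named here can be computed directly from their definitions. The sketch below is a minimal Python illustration, not the GraphPad or R procedure the authors used: the AUC is the probability that a randomly chosen case scores higher than a randomly chosen control, and the decision-curve net benefit at threshold probability pt is TP/n - (FP/n) * pt / (1 - pt). All scores and labels are hypothetical.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def net_benefit(labels, predicted_risk, threshold):
    """Decision-curve net benefit at one threshold probability."""
    n = len(labels)
    true_pos = sum(1 for y, p in zip(labels, predicted_risk) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(labels, predicted_risk) if p >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)

# Hypothetical predicted risks for three PCa samples and three normal samples.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]
auc = auc_from_scores(cases, controls)

labels = [1, 1, 1, 0, 0, 0]
benefit_at_half = net_benefit(labels, cases + controls, threshold=0.5)
```

For a gene that is down-regulated in tumors, the scores would need to be negated (or the groups swapped) so that a higher score still points toward PCa.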

Expression of hub genes at different tumor stages

The Tumor-Node-Metastasis (TNM) classification of malignant tumors is commonly used to assess tumor severity[19]. Hub gene expression at different TNM stages of PCa was analyzed using TCGA data.

Validation of hub gene expression based on Chinese PCa patients and different databases

We downloaded the RNA-sequencing data of Chinese PCa patients from the Chinese Prostate Cancer Genome and Epigenome Atlas (CPGEA; http://www.cpgea.com/) and tested the expression of the hub genes in these patients. We further analyzed hub gene expression in normal prostate samples and PCa specimens using the UALCAN database (http://ualcan.path.uab.edu/) and The Human Protein Atlas (https://www.proteinatlas.org/)[20, 21].

Clinical specimen collection

The PCa patients who participated in the study had a previously confirmed diagnosis of BPH. The methods used for collecting the samples were approved by the Ethics Committee of Tongji Hospital, School of Medicine, Tongji University (SBKT-2021-220). Patients who provided the samples were familiar with the experimental process and gave informed consent.

Cell culture and transfection

C4-2 PCa cells were purchased from the Chinese Academy of Science Cell Bank (Shanghai, China). C4-2 cells were cultured in Roswell Park Memorial Institute (RPMI) 1640 medium (Sigma, Darmstadt, Germany, Catalog No. R8758) with 10% fetal bovine serum (FBS) (Gibco, Thermo Fisher Scientific, Waltham, MA, USA, Catalog No. 10091). C4-2 cells were transfected with MYC knockdown (shMYC) plasmids, MYL9 overexpression (oeMYL9) plasmids, and SNAI2 overexpression (oeSNAI2) plasmids (constructed by fenghbio company, Hunan, China) using Lipofectamine 2000 (Thermo Fisher Scientific, Catalog No. 11668019) according to the manufacturer’s instructions. The shMYC plasmid sequence is: CCTGAGACAGATCAGCAACAA.

Antibodies

Rabbit monoclonal antibodies against c-MYC (Catalog No. ab32072) and MYL-9 (Catalog No. ab191312) were purchased from abcam company (Cambridge, UK). Mouse monoclonal antibodies against SNAI2 (Catalog No. ab51772) and GAPDH (Catalog No. ab8245) were also purchased from abcam company. HRP AffiniPure Goat Anti-Rabbit IgG (Catalog No. A0216) and HRP AffiniPure Goat Anti-Mouse IgG (Catalog No. A0208) were purchased from Beyotime Biotechnology Company (Shanghai, China).

RNA extraction and qRT-PCR

Total RNA was extracted from the tumor and para-cancerous tissues of patients and from cells using TRIzol Reagent (Sigma–Aldrich, St. Louis, MO, USA, Catalog No. T9424). cDNA was synthesized using a reverse transcription kit (Advantage® RT-for-PCR Kit, Takara Bio Inc., Kusatsu, Japan, Catalog No. 639505). Finally, cDNA was quantified by real-time PCR with a commercial kit (TB Green® Premix Ex Taq™ II, Takara Bio Inc., Catalog No. RR420A) according to the manufacturer’s instructions. The primers for c-MYC, MYL9, SNAI2, and GAPDH are shown in Table 1. The 2^−ΔΔCt method was used to quantify mRNA expression levels.
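The 2^−ΔΔCt calculation can be written out explicitly. The sketch below is a minimal illustration that assumes GAPDH serves as the reference gene and para-cancerous tissue as the control condition; the Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_reference_sample,
                        ct_target_control, ct_reference_control):
    """Relative mRNA level of a target gene by the 2^-(delta delta Ct) method."""
    # Normalize the target gene to the reference gene within each condition.
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    # Compare the sample with the control condition.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** -delta_delta_ct

# Hypothetical Ct values: target gene and reference gene in tumor tissue,
# then in para-cancerous tissue.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
```

Here the target gene crosses the threshold two cycles earlier in the tumor sample after normalization, which corresponds to a four-fold higher expression.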

Western blot

Protein was extracted from tissues and cells with RIPA lysis buffer. Protein samples were treated with Dual Color Protein Loading Buffer (Thermo Fisher Scientific, Waltham, MA, USA, Catalog No. NP0007). SDS–PAGE gels (10% and 15%) were used to separate proteins, followed by transfer to nitrocellulose membranes (Merck KGaA, Darmstadt, Germany, Catalog No. 71078). Protein-Free Rapid Blocking Buffer (Thermo Fisher Scientific, Catalog No. 37584) was used to block the membranes. The membranes were then incubated overnight at 4°C with primary antibodies against c-MYC (1:1000), MYL9 (1:1000), SNAI2 (1:1000), and GAPDH (1:1000). The next day, the membranes were washed three times with 1x TBST (10 min each) and then incubated at room temperature for 1 h with a matched secondary antibody (1:1000). Lastly, the membranes were exposed to X-ray film (FluorChem R, Protein Sample, California, USA).

Immunohistochemistry (IHC)

The expression of MYC, MYL9, and SNAI2 in clinical specimens was detected by IHC. Tumor samples were fixed in formalin and embedded in paraffin. Four-micrometer-thick sections were cut from the samples and fixed. Antigen retrieval and immunostaining were performed as described[22]. The primary antibodies were anti-MYC (1:1000), anti-MYL9 (1:400), and anti-SNAI2 (1:500). Two experienced pathologists who were unaware of the tissue information independently evaluated and scored the IHC staining intensity.

Cell invasion assay

After 48 h of transfection, approximately 1×10⁵ C4-2 cells in 150 µL of RPMI 1640 medium containing 2% FBS were placed in the upper chamber, and RPMI 1640 medium containing 10% FBS was placed in the lower chamber. After 48 h, the cells were fixed with 4% paraformaldehyde, stained with crystal violet, and observed with an Olympus microscope (Olympus Corp., Tokyo, Japan). ImageJ was used to count the cells.

Cell proliferation assay

After 48 h of transfection, about 1000 C4-2 cells were seeded in each well of a 96-well plate. Each group was set up in triplicate. Cell proliferation at 0, 24, 48, and 72 h was measured with Cell Counting Kit-8 (CCK-8) (Solarbio, Beijing, China, Catalog No. CA1210). The optical density (OD) at 450 nm was measured with a microplate reader (LD942, Beijing, China).

Statistical analysis

The matrix data were processed with R version 4.0.2 (Institute for Statistics and Mathematics, Vienna, Austria; https://www.r-project.org). For descriptive statistics, mean±standard deviation was used for normally distributed continuous variables, whereas the median (range) was used for continuous variables that were not normally distributed. Categorical variables were described by counts and percentages. Hazard ratios (HRs), 95% confidence intervals (95% CI), and P values were used as statistical metrics. A two-tailed P<0.05 was considered statistically significant.

Identification of DEGs

Gene expression datasets GSE5377, GSE104749, and GSE30994 were acquired from the GEO database. The GSE5377 dataset included 547 DEGs, with 167 up-regulated genes and 380 down-regulated genes. The GSE104749 dataset included 3790 DEGs, consisting of 833 up-regulated genes and 2957 down-regulated genes. The GSE30994 dataset contained 3790 DEGs, including 1429 up-regulated genes and 1872 down-regulated genes. The up-regulated and down-regulated DEGs are shown in Venn diagrams (Figure 1). In total, 15 up-regulated and 45 down-regulated integrated DEGs were identified.

Hub gene finding in PPI network

The PPI network of the DEGs was constructed to help find the hub genes. The most significant module was obtained using Cytoscape software, and 7 hub genes, MYC, CXCR4, CSRP1, SNAI2, MYL9, ACTG2, and MYH11, were found (Figure 2A). The interactions among the 7 hub genes were also analyzed (Figure 2B).

GO and KEGG pathway analysis of DEGs

To find the pathways in which the DEGs were enriched, we used DAVID. In the KEGG pathway analysis, the genes were mainly enriched in proteoglycans in cancer (Figure 2C). In the GO analysis, the genes were mainly enriched in inositol 1,4,5-trisphosphate binding and negative regulation of nitric oxide biosynthetic process (Figure 2D).

Expression of hub genes in TCGA database

To further confirm that the hub genes could be important factors leading to PCa, we used the GEPIA database to compare the hub genes’ expression in normal and PCa specimens. We found the hub genes were expressed differently in normal samples and PCa specimens, except for MYC and CXCR4 (Figure 3).

ROC curve and decision curve analyses

The ROC curve was used to evaluate the ability of the hub genes to diagnose PCa. CXCR4 had poor diagnostic efficacy, with an area under the curve (AUC) of 0.5198 (95% CI, 0.4316–0.6079; P=0.6419). The other hub genes had good diagnostic value (P<0.05). MYC had an AUC of 0.7553 (95% CI, 0.6773–0.8332), CSRP1 had an AUC of 0.8764 (95% CI, 0.8294–0.9234), SNAI2 had an AUC of 0.7399 (95% CI, 0.7830–0.8968), MYL9 had an AUC of 0.8300 (95% CI, 0.7727–0.8873), ACTG2 had an AUC of 0.8499 (95% CI, 0.7956–0.9042), and MYH11 had an AUC of 0.8651 (95% CI, 0.8103–0.9198) (Figure 4A and Table 2). This means that almost all the hub genes were meaningful for the diagnosis of PCa. In addition, we performed DCA to assess the overall value of these hub genes in predicting PCa (Figure 4B).

Hub gene expression at different tumor stages

Using the clinical data downloaded from the TCGA database, we analyzed hub gene expression at different stages of the TNM classification of malignant tumors. The expression of some hub genes changed as PCa progressed. For example, the expression of CXCR4 increased significantly from tumor stage T2 to T3 (P=0.028). The expression of CSRP1 and MYL9 decreased as the tumor progressed from T2 to T4 (P=0.032 and P=0.047) (Figure 5A). The expression of CXCR4 and SNAI2 changed when node metastasis occurred (P=0.022 and P=0.012) (Figure 5B).

Risk prediction model and survival analysis

Logistic regression is a predictive model that can be used to predict disease progression. Here, we used logistic regression to assess how hub gene expression relates to the occurrence of PCa. A nomogram was constructed to estimate the probability that alterations in the hub genes lead to PCa (Figure 6A), and a calibration curve was used to verify the nomogram (Figure 6B). Single-factor and multi-factor regression showed the risk of PCa associated with alterations in the hub genes (Figure 6C-D). In the single-factor logistic regression forest plot, all hub genes except CXCR4 (P=0.848) may be risk factors for the occurrence of PCa (Figure 6C). However, in the multi-factor logistic regression forest plot, only SNAI2 (P=0.04) and MYH11 (P=0.024) may be risk factors for PCa (Figure 6D). The residuals plot and the normal P–P plot of standardized residuals of the logistic regression were used to test the fit of the regression model (Figure S1A-B). We further found that the hub genes CSRP1 and MYH11 can affect patients’ DFS (Figure 7). However, no hub gene had an effect on OS (Figure S2).
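Survival comparisons such as the DFS and OS results above are typically based on Kaplan-Meier curves for high- and low-expression groups, compared with a log-rank test. The sketch below shows only the Kaplan-Meier estimate for a single group, as a minimal illustration rather than the GEPIA implementation; the follow-up times and event flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time for each patient
    events : True if the event (e.g. recurrence) occurred, False if censored
    Returns [(time, survival_probability)] at each time with at least one event.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        event_count = sum(1 for time, event in zip(times, events) if time == t and event)
        leaving = sum(1 for time in times if time == t)
        if event_count:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1 - event_count / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        at_risk -= leaving
    return curve

# Hypothetical follow-up in months for five patients.
times = [5, 8, 8, 12, 15]
events = [True, True, False, True, False]
curve = kaplan_meier(times, events)
```

Censored patients do not lower the curve themselves, but they shrink the risk set, which makes each later event count for more.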

Hub gene expression in Chinese PCa patients and different databases

Because the TCGA data come mainly from Western populations, we also analyzed hub gene expression in Chinese PCa patients using RNA-sequencing data downloaded from the CPGEA database. All the hub genes, including CXCR4, were expressed differently in Chinese patients (Figure 8). We further used the UALCAN and The Human Protein Atlas databases to compare hub gene expression in normal and tumor specimens. Consistent with the results above, the 6 hub genes other than CXCR4 were expressed differently in normal prostate tissue and PCa samples (Figure S3).

Validation in clinical specimens

We then analyzed hub gene expression in clinical specimens. Of the hub genes, MYC and CXCR4 were up-regulated in tumor tissues compared with normal tissues. However, CXCR4 showed no difference in expression in GEPIA and UALCAN, so we chose MYC as the up-regulated gene for experimental validation. The other 5 hub genes were down-regulated, and all of them were expressed differently in TCGA. In the ROC curve analysis, MYL9 and SNAI2 had the lowest AUCs among the 5 down-regulated hub genes, which means that these two genes may be the least reliable of the 5 for predicting PCa. We therefore chose MYL9 and SNAI2 as the down-regulated genes for further study. We included 18 patients who were diagnosed with PCa and had a previous diagnosis of BPH. MYC was increased at both the mRNA and protein levels in PCa tissues (Figure 9A-B). MYL9 was expressed at lower mRNA and protein levels in tumor tissues than in normal tissues (Figure 9C-D). SNAI2 was also down-regulated in PCa tissues compared with para-cancerous normal tissues (Figure 9E-F). The IHC results showed the same trend as the qRT-PCR and western blot results (Figure 9G-L).

Hub genes influence on C4-2 prostate cancer cell lines

To further verify the function of the three hub genes in PCa cells, we constructed MYC knockdown (shMYC), MYL9 overexpression (oeMYL9), and SNAI2 overexpression (oeSNAI2) plasmids. MYC was successfully knocked down in C4-2 cells (Figure 10A-B), and the oeMYL9 and oeSNAI2 C4-2 cells were also constructed successfully (Figure 10C-F). We then measured the invasion and proliferation of C4-2 cells after transfection with the different plasmids. After transfection with shMYC, oeMYL9, or oeSNAI2 plasmids, the invasion ability of C4-2 cells decreased (Figure 10G), and cell proliferation decreased as well (Figure 10H).

PCa and BPH are commonly encountered diseases that affect nearly seventy percent of men older than 70[23]. In recent decades, because of changes in the global population structure and aging, the number of elderly men has increased[24, 25]. Both PCa and BPH are therefore serious threats to the health of older men, and understanding the relationship between them may help to better predict the occurrence of PCa and relieve pressure on the medical system. PCa and BPH have important features in common, such as hormone-dependent growth and responsiveness to antiandrogen therapy[26]. Moreover, inflammation could be an underlying cause of both BPH and PCa. In a study of 180 men with suspected PCa who were biopsied at baseline and after 5 years of follow-up, the 5-year PCa incidence was 20% for men whose baseline biopsy specimens showed inflammation compared with 6% for men with no evidence of inflammation in baseline biopsies[27]. In 1974, Armenian et al. found that patients with BPH had a 4 to 5 times increased risk of PCa[28]. A study by Chokkalingam et al. of about 87,000 men found that patients with BPH had a 1.2- to 1.7-fold increased risk of PCa incidence and mortality[3]. All these studies indicate that BPH is a risk factor for developing PCa.

Over recent decades, bioinformatic analyses of microarray data have focused broadly on the occurrence of PCa and have revealed some of the mechanisms that may lead to PCa. However, the genetic changes underlying the progression from BPH to PCa are still unclear. To address this, we adopted an integrated bioinformatics approach to directly compare gene expression between BPH and PCa. A total of 60 DEGs were identified from the GEO datasets. The DEGs were mainly enriched in proteoglycans in cancer, inositol 1,4,5-trisphosphate binding, and negative regulation of nitric oxide biosynthetic process, which have been reported to be important in the occurrence of cancers[29-31].

Seven hub genes, MYC, CXCR4, CSRP1, SNAI2, MYL9, ACTG2, and MYH11, were found. This result indicates that alterations in these genes may play significant roles in the development of PCa from BPH. The functions of these genes in cancer have been widely reported. MYC (MYC proto-oncogene) is involved in PCa progression promoted by a high-fat diet and plays a positive role in regulating the androgen receptor and androgen receptor splice variants in PCa[32, 33]. CXCR4 (C-X-C motif chemokine receptor 4) may promote PCa metastasis through regulation of phosphatidylinositol 4-kinase IIIα (PI4KIIIα) and SAC1 phosphatase[34]. In addition, CXCL12/CXCR4 can increase the malignancy of breast cancer and cervical cancer[35, 36]. SNAI2 (snail family transcriptional repressor 2) can regulate prostate tumor progression, angiogenesis, and metastasis, potentially by modulating the GSK-3β/β-catenin signaling pathway[37]. SNAI2 also participates in regulating the initiation and metastasis of breast cancer cells[38]. MYL9 (myosin light chain 9) can predict malignant progression and poor biochemical recurrence-free survival in PCa[39]. MYL9 is also associated with the recurrence of colorectal cancer[40]. ACTG2 (actin gamma 2, smooth muscle) and MYH11 (myosin heavy chain 11) have an effect on the development of PCa[41, 42]. ACTG2 can also affect hepatocellular carcinoma cell migration and tumor metastasis, and MYH11 may play a pivotal role in the progression of lung cancer and bladder cancer[43, 44].

When analyzing hub gene expression, we found that the 2 up-regulated hub genes, MYC and CXCR4, were not expressed differently between normal prostate samples and PCa samples in the GEPIA database. This may be because the normal samples in GEPIA include data from both the TCGA and GTEx databases. We therefore examined hub gene expression in other databases and found that all the hub genes except CXCR4 may contribute to the occurrence of PCa. In addition, some hub genes were expressed differently at different tumor stages, which means that they could serve as indicators of disease progression. Because a previous study reported that patients who develop PCa from BPH have higher cancer-related mortality, we carried out a survival analysis[6]. We found that CSRP1 and MYH11 can influence patients’ DFS. These results suggest that changes in the expression of these hub genes may be signals of disease occurrence and progression.

Finally, we used clinical specimens and C4-2 PCa cells to validate the functions of the hub genes in PCa development. We found that the hub genes MYC, MYL9, and SNAI2 can indeed influence PCa progression. This means that the hub genes we found are important in the occurrence of PCa.

However, our study had some limitations. First, although we included 3 datasets, the number of samples was still small; there were only 10 BPH samples and 24 PCa samples. Such a small sample size may make the study less representative. Second, although we validated hub gene expression in different databases and clinical samples, we did not study the mechanisms by which these hub genes lead from BPH to PCa. Thus, our results require further validation. Nevertheless, our study is the first to address genes differentially expressed in BPH and PCa by bioinformatic analysis.

We used bioinformatic analysis to identify significant genes in the development of PCa from BPH and validated their roles in PCa. The seven hub genes we found are potential targets for predicting which BPH patients may later develop PCa.

Acknowledgement

Not applicable.

Funding

This study was supported by the Natural Science Foundation of Shanghai Municipal Science and Technology Committee (22ZR1456800, 21ZR1458300, 19ZR1448700), the Clinical Research Plan of SHDC (No. SHDC2020CR3074B), the New Frontier Technology Joint Research Project of Shanghai Municipal Hospital (No. SHDC12019112), and the Clinical Project of Shanghai Municipal Health Commission (No. 20184Y0263, No. 2018Y0105).

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

All authors contributed to the study conception and design. Xi Chen and Junjie Ma put forward the idea of the article, wrote the manuscript, and analyzed the data. Licheng Wang and Yicong Yao collected the data from the GEO and TCGA databases. Chengdang Xu and Xinan Wang performed the RT-qPCR, western blot, and IHC experiments. Xi Chen and Tong Zi performed the cell invasion and proliferation experiments. Cuidong Bian and Denglong Wu helped collect the clinical specimens. Denglong Wu and Gang Wu supervised the progress of the experiments and revised the manuscript. All authors read and approved the final manuscript.

Ethics approval and consent to participate

This study was performed in line with the principles of the Declaration of Helsinki. The study was approved by the Ethics Committee of Tongji Hospital, School of Medicine, Tongji University (SBKT-2021-220). Each participant volunteered to join and signed the informed consent form.

Consent for publication

Not applicable.

Orsted, D.D. and S.E. Bojesen, The link between benign prostatic hyperplasia and prostate cancer. Nat Rev Urol, 2013. 10(1): p. 49-54.
Sommers, S.C., Endocrine changes with prostatic carcinoma. Cancer, 1957. 10(2): p. 345-58.
Chokkalingam, A.P., et al., Prostate carcinoma risk subsequent to diagnosis of benign prostatic hyperplasia: a population-based cohort study in Sweden. Cancer, 2003. 98(8): p. 1727-34.
Hammarsten, J. and B. Hogstedt, Calculated fast-growing benign prostatic hyperplasia--a risk factor for developing clinical prostate cancer. Scand J Urol Nephrol, 2002. 36(5): p. 330-8.
Briganti, A., et al., Prostate volume and adverse prostate cancer features: fact not artifact. Eur J Cancer, 2007. 43(18): p. 2669-77.

Table 1. Primers used for the qRT-PCR

Gene Name	Primer sequence
c-MYC	Forward: 5′-AGCGACTCTGAGGAGGAACAA-3′
c-MYC	Reverse: 5′-TGGGCTGTGAGGAGGTTTG-3′
MYL9	Forward: 5′-AACATGTCCAGCAAACGTGC-3′
MYL9	Reverse: 5′-GCGAAGACATTGGAGGTGG-3′
SNAI2	Forward: 5′-GGACTAGTATGCCGCGCTCCTTCCTGGTC-3′
SNAI2	Reverse: 5′-CGGAATTCTCAGTGTGCTACACAGCAGCCAGATTC-3′
GAPDH	Forward: 5′-GGAGCGAGATCCCTCCAAAAT-3′
GAPDH	Reverse: 5′-GGCTGTTGTCATACTTCTCATGG-3′

Table 2. The diagnostic value of hub genes in PCa

Gene	P value	AUC (95% CI)
MYC	0.0027	0.7553 (0.6773-0.8332)
CXCR4	0.6419	0.5198 (0.4316-0.6079)
CSRP1	0.0009	0.8764 (0.8294-0.9234)
SNAI2	0.0079	0.7399 (0.7830-0.8968)
MYL9	0.0153	0.8300 (0.7727-0.8873)
ACTG2	0.0178	0.8499 (0.7956-0.9042)
MYH11	1.76E-07	0.8651 (0.8103-0.9198)
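
The source does not state how the AUC values and 95% confidence intervals in Table 2 were computed. As a minimal sketch of one common approach, the Python below estimates the AUC with the Mann-Whitney statistic and derives an approximate interval from the Hanley-McNeil standard error. The function names and the expression values are hypothetical and are not taken from the study; the authors' R workflow may have used a different method.

```python
import math


def auc_mann_whitney(pos, neg):
    """AUC = P(random positive scores above random negative); ties count 0.5."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate 95% CI using the Hanley-McNeil standard error (an assumption,
    not necessarily the method behind Table 2). Bounds are clipped to [0, 1]."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc * auc)
        + (n_neg - 1) * (q2 - auc * auc)
    ) / (n_pos * n_neg)
    se = math.sqrt(variance)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Hypothetical expression values for one gene (illustration only).
tumor_expr = [2.1, 3.4, 1.9, 4.0, 3.3]
normal_expr = [1.0, 2.0, 1.5, 2.5, 0.8]

auc = auc_mann_whitney(tumor_expr, normal_expr)
ci_low, ci_high = hanley_mcneil_ci(auc, len(tumor_expr), len(normal_expr))
print(f"AUC = {auc:.4f}, 95% CI = ({ci_low:.4f}-{ci_high:.4f})")
```

For a gene that is down-regulated in tumors, the two groups would be swapped (or the score negated) so that the AUC is reported above 0.5.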

No competing interests reported.

FigS1.tif
Figure S1: (A) Residuals plot of the logistic regression. (B) Normal P–P plot of the standardized residuals of the logistic regression.
FigS2.tif
Figure S2: Overall survival (OS) for the 7 hub genes in PCa patients, evaluated with Kaplan-Meier curves from GEPIA. (A) MYC (B) CXCR4 (C) CSRP1 (D) SNAI2 (E) MYL9 (F) ACTG2 (G) MYH11
FigS3.tif
Figure S3: Expression of the hub genes on TCGA data in different databases. (A) Expression of the 7 hub genes in PCa according to the UALCAN database. (B) Expression of 6 hub genes in PCa according to The Human Protein Atlas.


Editorial decision: Major revision
24 Apr, 2022
Reviews received at journal
20 Apr, 2022
Reviewers agreed at journal
11 Apr, 2022
Reviewers agreed at journal
08 Apr, 2022
Reviewers invited by journal
30 Mar, 2022
Editor assigned by journal
30 Mar, 2022
Editor invited by journal
29 Mar, 2022
Submission checks completed at journal
29 Mar, 2022
First submitted to journal
21 Mar, 2022
