Transcription Factors E2F2/3/4 As Possible Colorectal Cancer Prognostic Biomarkers

Wen-Da Wang, Department of Colorectal and Anal Surgery, Shanxi Cancer Hospital
Li-Chun Wang, Department of Colorectal and Anal Surgery, Shanxi Cancer Hospital
Mao-Xi Liu, Department of Colorectal and Anal Surgery, Shanxi Cancer Hospital
Yan-Jun Lu, Department of Colorectal and Anal Surgery, Shanxi Cancer Hospital
Bo Jiang (13834567839@163.com), Department of Colorectal and Anal Surgery, Shanxi Cancer Hospital


Introduction
The E2F transcription factors comprise an eight-member family. E2Fs contain one or more conserved DNA-binding domains that interact with target promoters to regulate gene expression (1,2). E2Fs comprise two opposing functional classes: activators (E2F1-3) and repressors (E2F4-8). In addition, the activities of E2Fs are context-dependent (3,4). Most E2Fs and their target genes coordinate the oscillatory nature of the cell cycle (5). However, E2Fs also participate in cell proliferation, tissue homeostasis, differentiation, angiogenesis, metabolism, autophagy, mitochondrial functions, DNA damage response, apoptosis, and tumorigenesis (6-11). Aberrant expression of E2Fs and alterations in E2F function coincide with poor prognosis in cancers; these findings emphasize the importance of E2Fs in the cancer phenotype (12).
Colorectal cancer (CRC) is the third most commonly diagnosed malignant tumor and the second leading cause of cancer-related mortality worldwide (13). Despite improved early diagnosis and treatment strategies (surgery, chemotherapy, radiotherapy, targeted therapy), the prognosis of CRC is still unsatisfactory (14). Because CRC tumors are heterogeneous, a broader and deeper understanding of the mechanisms underlying CRC occurrence and progression is required for individualized treatment and improved prognosis.
Deregulated expression of E2Fs is a common phenomenon in human cancer (15). This deregulated expression may promote or suppress tumor progression depending on cell type and tissue context. Most CRC studies have focused on only a single E2F and its important function in CRC progression (16-20).
A comprehensive analysis of the simultaneous activities of all the E2Fs in CRC development and progression is still lacking. Big data and bioinformatics technologies now provide novel means to examine cancer mechanisms. In this study, we used public datasets and bioinformatics tools to investigate the functions of E2Fs in CRC. Our findings provide an improved understanding of the functions of E2Fs in CRC, and our report will contribute to identifying clinical implications for diagnosis, prognosis, and treatment strategy design.

Oncomine database analysis
The Oncomine database (http://www.oncomine.org) (21,22) is a web-based data mining platform that collects, standardizes, analyzes, and delivers transcriptomic cancer data for biomedical research. We used the Oncomine database to compare the mRNA expression of E2Fs in various types of cancers and their matched normal tissues. The thresholds were set as follows: p-value ≤ 1E-4 and fold change ≥ 2.
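The two thresholds act together as a simple filter on each tumor-versus-normal comparison. The following is a minimal Python sketch of that filter, not the Oncomine implementation: the record layout, the gene labels, and the numeric values are hypothetical, and the fold change is assumed to be expressed as a magnitude on a linear scale in the direction being tested.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    """One tumor-versus-normal comparison (hypothetical record layout)."""
    gene: str
    p_value: float
    fold_change: float  # assumed: linear-scale magnitude in the tested direction


def passes_threshold(c, max_p=1e-4, min_fold=2.0):
    """Return True when both cut-offs stated in the text are met."""
    return c.p_value <= max_p and c.fold_change >= min_fold


# Illustrative values only; these are not results from the study.
records = [
    Comparison("GENE_A", 5e-6, 2.4),  # meets both cut-offs
    Comparison("GENE_B", 3e-3, 3.1),  # p-value too large
    Comparison("GENE_C", 1e-5, 1.6),  # fold change too small
]

selected = [c.gene for c in records if passes_threshold(c)]
print(selected)
```

Requiring both conditions keeps only comparisons that are statistically strong and large in magnitude; relaxing either cut-off would admit more datasets at the cost of specificity.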

TIMER database analysis
TIMER (https://cistrome.shinyapps.io/timer/) is a web-based data mining platform for systematic analysis of immune infiltration levels and gene expression (23). We used TIMER to compare the mRNA expression of E2Fs between CRC tissues and matched normal tissues. A p-value < 0.01 was considered statistically significant.

GEPIA database analysis
Gene Expression Profiling Interactive Analysis (GEPIA) (http://gepia.cancer-pku.cn/index.html) was used to generate survival curves for overall survival (OS) and disease-free survival (DFS) based on RNA sequencing expression data from the TCGA database (24). We used GEPIA to identify differences in E2F mRNA expression between CRC and matched normal tissues; a p-value < 0.05 was considered statistically significant. GEPIA was also used to identify correlations between E2F mRNA expression and survival in CRC. Hazard ratios and log-rank p-values were calculated, and a p-value < 0.05 was considered statistically significant. To identify correlations among E2Fs in CRC, we obtained the correlation coefficient (R) and p-values; a p-value < 0.01 was considered statistically significant. An absolute value of R of 0-0.09 indicated no correlation, 0.1-0.3 weak correlation, 0.3-0.5 medium correlation, and 0.5-1.0 strong correlation. For the comparison of E2F mRNA expression in CRC across pathological stages, a p-value < 0.05 was considered statistically significant.
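The correlation categories above can be written as a small lookup. This Python sketch is illustrative only: the function name is hypothetical, and because the stated ranges share their end points (0.3 and 0.5) and leave 0.09-0.1 unspecified, it assumes that each boundary value belongs to the stronger category and that any |R| below 0.1 counts as no correlation.

```python
def correlation_strength(r):
    """Map a correlation coefficient R to the categories given in the text.

    Assumption: boundary values (0.1, 0.3, 0.5) fall into the stronger
    category, and |R| < 0.1 is treated as no correlation.
    """
    magnitude = abs(r)
    if magnitude > 1:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if magnitude < 0.1:
        return "none"
    if magnitude < 0.3:
        return "weak"
    if magnitude < 0.5:
        return "medium"
    return "strong"


# Illustrative values only; the sign of R does not affect the category.
for r in (0.05, -0.42, 0.3, 0.75):
    print(r, correlation_strength(r))
```

The category describes only the magnitude of R; whether a pair is treated as correlated still depends on the separate p-value < 0.01 criterion.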

PrognoScan database analysis
PrognoScan (http://www.abren.net/PrognoScan/) is an online database with clinical annotation and a web-based tool for assessing the relationships between gene expression and prognostic information such as overall survival (OS), disease-specific survival (DSS), and disease-free survival (DFS) in various types of cancers (25). In addition, this tool automatically calculates the p-value, hazard ratio, and 95% confidence interval for a given gene's expression. We used the PrognoScan database to identify correlations between E2F mRNA expression and survival in CRC, with an adjusted Cox p-value < 0.05 considered statistically significant.

TCGA database analysis
The expression of E2F mRNAs in CRC was selected for further analyses with cBioPortal (http://www.cbioportal.org/index.do?session_id=5b4c1773498eb8b3d566f7b8). The genomic profiles included mutations, putative copy number alterations, and mRNA expression Z scores (RNA-seq v.2 RSEM). Genes co-expressed with E2Fs were identified according to cBioPortal's online instructions and analyzed using the STRING database (www.string-db.org/). Finally, the resulting protein-protein interaction (PPI) network was displayed with CentiScaPe 2.2.

GO functional annotation and KEGG pathway enrichment analyses
The Database for Annotation, Visualization, and Integrated Discovery (DAVID) v.6.8 (https://david.ncifcrf.gov) (26) was used to perform Gene Ontology (GO) (27) functional annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of genes co-expressed with E2Fs. The human genome was selected as the background list, and a p-value < 0.05 was considered statistically significant.
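The idea behind an enrichment p-value can be illustrated with a one-sided hypergeometric test: given a background of N genes, K of which carry an annotation, it asks how likely it is that a list of n genes contains at least k annotated genes by chance. The Python sketch below is a simplified stand-in and not DAVID's own calculation (DAVID reports a more conservative, modified Fisher exact statistic); the function name and the counts are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    N: genes in the background list
    K: background genes annotated with the term
    n: genes in the query list (here, co-expressed genes)
    k: query genes annotated with the term
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Toy numbers only: 10 background genes, 4 annotated, 3 queried, 2 hits.
p = enrichment_p_value(k=2, n=3, K=4, N=10)
print(round(p, 4))
```

A term would be called enriched under the criterion above when this kind of p-value falls below 0.05; in practice, tools also report multiple-testing corrections alongside the raw value.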

GEO dataset analysis
The Gene Expression Omnibus (GEO) repository (29) distributes microarray, next-generation sequencing, and other forms of high-throughput functional genomic data, and allows users to analyze GEO data with GEO2R, an R-based web application (30). Differences in miRNA expression between CRC tissues and corresponding normal tissues were analyzed using the GEO dataset GSE115513 (http://www.ncbi.nlm.nih.gov/geo/). A p-value < 0.05 and an absolute log2 fold change ≥ 0.58 (fold change ≥ 1.5) were considered statistically significant. The miRNAs common to the StarBase analysis and the GEO dataset analysis were recognized as validated targets.
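The values 0.58 and 1.5 are linked by the base-2 logarithm, since log2(1.5) ≈ 0.585. The Python sketch below illustrates the filtering and the overlap step under stated assumptions: the logFC column is taken to be a log2 fold change (as is usual for GEO2R output), and the miRNA names, values, and the StarBase prediction set are hypothetical placeholders.

```python
import math

P_CUTOFF = 0.05
LOG2FC_CUTOFF = 0.58  # approximately log2(1.5), i.e. a 1.5-fold change

print(round(math.log2(1.5), 3))


def is_differential(log2_fc, p_value):
    """Apply both cut-offs; the sign of log2_fc gives the direction of change."""
    return p_value < P_CUTOFF and abs(log2_fc) >= LOG2FC_CUTOFF


# Hypothetical (log2FC, p-value) pairs; not data from GSE115513.
geo_results = {
    "miR-hypo-1": (1.20, 0.001),   # upregulated, significant
    "miR-hypo-2": (-0.80, 0.020),  # downregulated, significant
    "miR-hypo-3": (0.30, 0.001),   # change too small
    "miR-hypo-4": (0.95, 0.200),   # p-value too large
}
geo_hits = {name for name, (fc, p) in geo_results.items() if is_differential(fc, p)}

# Hypothetical set of miRNAs predicted to target the genes of interest.
starbase_predicted = {"miR-hypo-2", "miR-hypo-3", "miR-hypo-5"}

# Only miRNAs supported by both sources are kept.
validated = geo_hits & starbase_predicted
print(sorted(validated))
```

Taking the intersection is a conservative choice: a miRNA must be both differentially expressed in the dataset and predicted to target the gene before it is carried forward.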

The mRNA expression levels of E2Fs in CRC
To compare E2F expression in tumor and normal tissues, we extracted the E2F mRNA levels in multiple cancer types from the Oncomine, TIMER, and GEPIA databases. In one or more of the datasets, the mRNA expression levels of E2F1, E2F3, E2F4, E2F5, E2F6, E2F7, and E2F8 were significantly upregulated in CRC patients. The mRNA expression levels of E2F2 were significantly downregulated in CRC patients based on the Oncomine and TIMER datasets. See the summary in Figure 1, Figure 2A, and Figure 2B.

The IHC results of E2Fs in CRC
We obtained the immunohistochemistry (IHC) results of E2Fs in human colon tissues. In parallel with the increased mRNA expression levels revealed in the Oncomine, TIMER, and GEPIA datasets, the IHC expression levels of E2F1, 3, 4, 5, 6, 7, and 8 were significantly higher in carcinoma tissues (Figure 3).
Positive correlations between E2F expression levels are shown in Figure 4. E2F2 had a significant negative correlation with E2F5.

Prognosis values of E2Fs in CRC
The Kaplan-Meier curves and log-rank tests from the GEPIA analyses of E2Fs in CRC revealed that increased expression of E2F3 and E2F4 was associated with poor overall survival (OS) (p < 0.05) [Figure 5]. The prognostic values of E2Fs in CRC determined by PrognoScan revealed that increased expression of E2F1, E2F2, and E2F7 was associated with improved disease-free survival (DFS), OS, and disease-specific survival (DSS) (p < 0.05) [Table ]. The correlations of E2Fs with tumor stage indicated that expression of E2F1, E2F3, and E2F5 varied as a function of tumor stage; however, expression of E2F2, E2F4, E2F6, E2F7, and E2F8 did not vary with tumor stage [Figure 6A]. Further, we did not find any significant differences in expression of the E2Fs between the non-metastasis and metastasis groups [Figure 6B]. E2F4 expression was higher in the male group than in the female group [Figure 6C]. E2F8 expression was higher in the mucinous adenocarcinoma group than in the adenocarcinoma group [Figure 6D].

Alterations of E2Fs in CRC
E2Fs were altered in 175 of 524 patients with CRC (33%), and alterations of E2Fs were more frequent in mucinous adenocarcinomas than in adenocarcinomas. In addition, E2F1 had the highest mutation frequency with amplification [Figure 7]. The network constructed from the E2Fs and the 69 most frequently altered co-expressed genes showed that cell cycle-related genes were closely associated with E2F alterations [Figure 8].

Differential miRNA expression between CRC tissues and corresponding normal tissues
To assess the potential targeting of E2F2, E2F3, and E2F4 by miRNAs, we compared miRNA expression between CRC tissues and corresponding normal tissues in the GEO dataset GSE115513. We identified 148 differentially expressed miRNAs (90 upregulated and 58 downregulated) between CRC tissues and corresponding normal tissues.

Discussion
The E2Fs are encoded by eight genes whose protein products form a core transcriptional axis crucial for coordinating the oscillatory nature of the cell cycle, proliferation, tissue homeostasis, differentiation, angiogenesis, metabolism, autophagy, mitochondrial functions, DNA damage response, apoptosis, and tumorigenesis (6-11). Deregulated expression of E2Fs has been observed in many types of cancers (31-33). In addition, restoring the balance between E2F1 and E2F7 is a therapeutic strategy in head and neck squamous cell carcinomas (34). Although the functions of single E2F genes in tumorigenesis and progression of CRC have been partially confirmed (16-20), a bioinformatics analysis of these transcription factors in CRC had not been performed. Thus, we investigated the mRNA and protein expression, prognostic values, and potential biological functions of E2Fs in CRC, which will contribute to identifying clinical implications for diagnosis, prognosis, and treatment strategy.
We found that the mRNA and protein levels of E2F1, E2F3, E2F4, E2F5, E2F6, E2F7, and E2F8 were upregulated in CRC tissues, whereas the expression level of E2F2 was downregulated. In addition, the expression of E2Fs showed complex, intertwined positive correlations with each other; however, E2F2 showed a negative correlation with E2F5. We also found that E2F expression levels were altered in CRC tissues. These results may indicate that synergy or mutual antagonism among E2Fs promotes or suppresses CRC tumor progression. The expression levels of different E2Fs were related to OS and clinicopathological parameters, and E2F expression varied with tumor stage. These results suggest that E2Fs could serve as markers for CRC progression.
In colon cancer, E2F1 is the most extensively studied of the eight E2F genes. CDCA3 mediates p21-dependent proliferation by regulating E2F1 expression in colorectal cancer (35). For E2F2, miR-155 regulates the proliferation and cell cycle of colorectal carcinoma cells by targeting E2F2 mRNA (36). For E2F3, miR-503 inhibits cell proliferation and induces apoptosis in colorectal cancer cells by targeting E2F3 mRNA (37). MicroRNA-449b inhibits proliferation of SW1116 colon cancer stem cells by downregulating CCND1 and E2F3 expression (38). The circular RNA circPRMT5 promotes proliferation of colorectal cancer by sponging miR-377 to induce E2F3 expression (39). For E2F5, miRNA-34a targets FMNL2 and E2F5 and suppresses the progression of colorectal cancer (40). MicroRNA-32 inhibits the proliferation, migration, and invasion of human colon cancer cell lines by targeting E2F5 mRNA (41). Given that miRNAs target E2Fs in CRC, we searched for candidates and found that several miRNAs may target E2F2, E2F3, and E2F4. These miRNAs may therefore act on E2F2, E2F3, and E2F4 mRNAs to regulate CRC tumors, but the mechanism needs further study.
Some studies have shown that E2Fs may exert their functions by regulating cell signaling pathways. Knockdown of E2F8 suppressed CRC cell proliferation through the NF-κB pathway (42). Upregulated miR-1258 regulates the cell cycle and inhibits cell proliferation by directly targeting E2F8 in CRC (43). In our research, the network constructed from the E2Fs and the 69 most frequently altered co-expressed genes showed that cell cycle-related genes were associated with E2F alterations.
The GO functional annotation and KEGG pathway enrichment analyses showed that the major functions of E2Fs were regulation of the cell cycle, cell division, proliferation, and DNA replication and repair. The cell cycle was the most enriched pathway. We also found new functions and pathways for E2Fs in CRC, such as angiogenesis, dynein binding, annealing helicase activity, and the anaphase-promoting complex; these new associations are worthy of further study.
We have systematically analyzed the relationship between E2F transcription factors and CRC. We suggest that these transcription factors could be markers of CRC. Although functions of some E2Fs in CRC have been reported, the mutual regulation of these factors is rarely reported for CRC, and the regulatory issue should be further investigated. Our bioinformatics analysis showed potential miRNA targets for E2F2, E2F3, and E2F4; these particular factors are related mainly to cell cycle signaling pathways.