A Research of the Predictive Values of E2F Factors in Colorectal Cancer

Background: The E2F family of transcription factors regulates the cell cycle and apoptosis, and its members can either activate or repress transcription in human cancers. Their role in colorectal cancer (CRC), which includes colon adenocarcinoma (COAD) and rectum adenocarcinoma (READ), has been widely investigated, but with contradictory and ambiguous results. Methods: In this study, we explored the expression of the eight E2F genes and their predictive value for overall survival (OS) and disease-free survival (DFS) in CRC patients using bioinformatic databases, namely ONCOMINE, GEPIA, DAVID and cBioPortal. Results: In the GEPIA database, the transcriptional levels of E2F1/2/3/5/7/8 were significantly higher in CRC patients than in normal samples, whereas E2F4/6 showed no significant difference. The transcriptional levels of the eight E2F factors did not vary significantly across the four tumor stages, except for E2F3, which varied significantly in COAD patients. Survival analysis based on the GEPIA database indicated that lower E2F3 and E2F4 mRNA expression was significantly associated with better OS in COAD patients. Conclusions: This study suggests that E2F1/2/3/5/7/8 are potential biomarkers for CRC, and that E2F3 could be used as a predictive factor of tumor stage in COAD patients. E2F3 and E2F4 might be prognostic markers for COAD patients.


Introduction
The E2F family comprises transcription factors involved in several cellular functions related to the cell cycle and apoptosis. Traditionally, the eight members of the family (E2F1-8) are grouped into "activators" and "suppressors". E2F1, E2F2 and E2F3 are classified as "activators" and are thought to promote cell proliferation through the same promoter binding element [1,2], whereas E2F4-8 are classified as "suppressors" because their main function is to repress the transcription of some E2F target genes [3]. E2Fs play a pivotal role in transcriptional regulation and in the G1/S transition of mammalian cells [4]. A close association between E2Fs and several malignant tumors, such as lung cancer [5], prostate cancer [6], breast cancer [7] and colorectal cancer (CRC) [8], has been widely reported.
CRC is one of the most common malignant tumors worldwide, with high morbidity and mortality, and tumor metastasis is the primary cause of relapse in CRC patients [9]. CRC is divided into two main types based on location: colon adenocarcinoma (COAD) and rectum adenocarcinoma (READ). Although there have been major breakthroughs in CRC research and clinical treatment, the high death rate and low five-year survival rate remain discouraging. CRC is the third most common cause of cancer death in both men and women in the United States [10,11]. These problems will not be solved until numerous difficulties are overcome. Many CRC patients are already at a middle or late stage when they are first diagnosed, because precise biomarkers for early diagnosis are scarce. Therefore, it is of great importance to investigate the underlying mechanisms of CRC and to seek early biomarkers for diagnosis and for distinguishing between tumor stages.
Most of the eight E2F factors have been implicated in the development of CRC.
Fang reported that overexpression of E2F1 promoted migration and invasion in CRC cell lines [12]. Furthermore, CRC patients with low E2F1 expression had a better prognosis, and E2F1 protein levels were positively related to lymph node metastasis, TNM stage and distant metastasis [12]. Guo [13] found that rs3829295 at E2F7 was significantly associated with the risk of CRC. Zhang reported that E2F8 expression was significantly upregulated in CRC tissues compared with adjacent normal tissues, and that the upstream miR-1258 regulates the cell cycle and inhibits cell proliferation by directly targeting E2F8 [14].
Overall, the literature above has not adequately explained the root causes of this disease, and the relationship between the expression levels of the E2F factors and the progression and survival of CRC has not been reported. In this study, we sought suitable biomarkers using bioinformatic tools. Drawing on the large amount of gene expression and survival data available online, we analyzed the expression patterns of the E2F factors and their correlation with prognosis and survival in CRC patients in detail. We expect this bioinformatic analysis to provide a new direction for subsequent research.

Materials and Methods
This study was approved by the ethics committee of The First Affiliated Hospital of Hunan University of Chinese Medicine. All data were collected from online datasets, so informed consent had already been obtained from the patients in the original studies.

Oncomine Analysis
ONCOMINE is a cancer microarray database with comprehensive cancer mutation spectrum and gene expression data; we used it to analyze the transcription levels of E2Fs in different cancers.
The mRNA expression of the E2F genes in CRC patients was compared with that in normal controls, with a P value of 10^-4 and a fold change of 2 set as the significance thresholds.

GEPIA (Gene Expression Profiling Interactive Analysis) dataset
GEPIA (http://gepia.cancer-pku.cn/) is a newly developed interactive web server for analyzing the RNA sequencing expression data of 9,736 tumors and 8,587 normal samples from the TCGA and GTEx projects, using a standard processing pipeline.
GEPIA provides customizable functions such as tumor/normal differential expression analysis, profiling according to cancer types or pathological stages, patient survival analysis, similar gene detection, correlation analysis and dimensionality reduction analysis [15].

cBioPortal
The cBioPortal [16,17] Colorectal Adenocarcinoma (TCGA, Provisional) dataset includes 640 samples in total. The genomic profiles included mutations, putative copy-number alterations (CNA) from GISTIC, mRNA expression Z-scores (RNA Seq V2 RSEM) and protein expression Z-scores (RPPA). Co-expression and networks were calculated with the built-in tools of cBioPortal.

DAVID (The Database for Annotation, Visualization and Integrated Discovery)
DAVID provides a comprehensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes [18,19].
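
The Oncomine significance thresholds stated above (P value of 10^-4 and fold change of 2) can be illustrated with a minimal Python sketch. The record layout, the function name and the example rows are hypothetical and are not taken from the study; applying the fold-change cutoff to the absolute value is an assumption, made so that under-expression (a negative fold change) can also pass the filter.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    gene: str
    dataset: str
    fold_change: float  # signed: negative means lower in tumor than in normal
    p_value: float


def is_significant(c, p_cutoff=1e-4, fc_cutoff=2.0):
    """Apply the P value and fold-change thresholds stated in the text.

    Assumption: the fold-change cutoff is applied to the absolute value.
    """
    return c.p_value <= p_cutoff and abs(c.fold_change) >= fc_cutoff


# Hypothetical rows for illustration only; these are not values from the study.
rows = [
    Comparison("GENE_A", "dataset_1", 2.5, 3e-6),
    Comparison("GENE_B", "dataset_1", 1.4, 1e-8),   # fails the fold-change cutoff
    Comparison("GENE_C", "dataset_2", -2.1, 5e-5),  # passes as under-expression
    Comparison("GENE_D", "dataset_2", 3.0, 2e-3),   # fails the P value cutoff
]

significant = [c.gene for c in rows if is_significant(c)]
print(significant)
```

Only comparisons that satisfy both thresholds are retained, which is why a gene with a very small P value but a modest fold change is excluded.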

Transcriptional levels of E2Fs in CRC patients vs normal population
All eight E2F factors have been studied in humans, and some of them have been characterized in detail. In this study, we compared the transcription levels of the E2F factors in CRC patients with those in normal samples by mining the Oncomine database (Fig. 1). The significant changes in E2F expression at the transcriptional level are summarized in Table 1. The mRNA expression of E2F1 was significantly upregulated in patients with CRC compared with normal samples in three datasets. In Hong's dataset [20], E2F1 was overexpressed in colorectal carcinoma with a fold change of 2.068. In The Cancer Genome Atlas dataset, E2F1 was also overexpressed in rectal adenocarcinoma with a fold change of 2.056. For E2F2, however, Skrzypczak's dataset showed significantly lower expression in colon adenoma epithelia and colon carcinoma epithelia, with fold changes of -2.095 and -2.139, respectively.
Similarly, E2F3 was significantly overexpressed in rectosigmoid adenocarcinoma and colorectal carcinoma, with fold changes of 2.109 and 2.940 in Kaiser's and Hong's datasets, respectively. The mRNA expression of E2F4 was not available in the Oncomine database. In Skrzypczak's and Sabates-Bellver's datasets, E2F5 was upregulated in colorectal adenocarcinoma with a fold change of 2.061 and in colon adenoma with a fold change of 2.132. The mRNA expression levels of E2F6, E2F7 and E2F8 were also significantly increased in patients with colon and rectal adenocarcinoma. The transcription levels of E2F6 in colon and rectal adenocarcinoma were significantly higher than in normal samples, with fold changes of 2.527 and 2.328, respectively. A similar trend was seen for E2F7 in Skrzypczak's and Gaedcke's datasets, in which E2F7 was significantly upregulated in colon carcinoma and rectal adenocarcinoma with fold changes of 8.952 and 2.194, respectively. In Sabates-Bellver's dataset, the transcription levels of E2F8 in colon and rectal adenoma were significantly higher than in normal samples.

Expression analysis of eight E2F factors and their comparison among different clinicopathological parameters
The GEPIA (Gene Expression Profiling Interactive Analysis) dataset (http://gepia.cancer-pku.cn/) was used to compare the mRNA expression levels of the E2F factors in COAD and READ patients with those in normal colorectal samples. The transcription levels of E2F1, E2F2, E2F3, E2F4, E2F5, E2F6, E2F7 and E2F8 were higher in COAD and READ samples than in normal colorectal samples (Fig. 2A). However, the difference between COAD or READ samples and normal colorectal samples was not statistically significant for E2F4 and E2F6 (Fig. 2B). We then compared the expression of the E2F factors among the four tumor stages (stage I-IV) for COAD and READ.
The expression of the eight E2F factors did not differ significantly among the four tumor stages, except for E2F3, which varied significantly in COAD patients (Fig. 3).

Associations of mRNA expression of E2F factors with the prognosis of CRC patients
We then investigated the predictive value of the mRNA expression of E2F factors for the survival of CRC patients. The Survival Plots function of GEPIA was used to assess the relationship between E2F mRNA levels and the survival of CRC patients. The log-rank test revealed that lower E2F3 and E2F4 mRNA expression was significantly associated with better overall survival (OS) in COAD patients (Fig. 4A). In READ patients, however, neither OS nor disease-free survival (DFS) differed between high and low mRNA expression of any E2F factor (Fig. 4B).
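
For readers unfamiliar with the log-rank test used in the survival comparison above, the following is a minimal pure-Python sketch of the standard two-group version. It is not GEPIA's implementation. The function name and the example follow-up times are hypothetical, and how patients are split into high and low expression groups (for example, at the median) is an assumption that the text does not specify.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*: follow-up times; events_*: 1 if the event (e.g. death) was
    observed, 0 if the patient was censored. Returns (chi2, p_value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    observed_minus_expected = 0.0
    variance = 0.0
    # Walk through each distinct time at which at least one event occurred.
    for t in sorted({t for t, e, _ in data if e}):
        at_risk_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        at_risk_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        events_in_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        events_total = sum(1 for ti, e, _ in data if ti == t and e)
        n = at_risk_a + at_risk_b
        # Observed events in group A minus those expected under the null.
        observed_minus_expected += events_in_a - events_total * at_risk_a / n
        if n > 1:
            variance += (events_total * (at_risk_a / n) * (at_risk_b / n)
                         * (n - events_total) / (n - 1))

    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical follow-up times (months); every patient has an observed event.
high_times, high_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
low_times, low_events = [10, 11, 12, 13, 14], [1, 1, 1, 1, 1]
chi2, p_value = logrank_test(high_times, high_events, low_times, low_events)
print(f"chi2={chi2:.2f}, p={p_value:.4f}")
```

A small P value indicates that the two survival curves differ; it does not by itself say which group fares better, which has to be read from the curves or hazard ratio.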

Detailed changes of E2F factors and their interactive genes in CRC patients
The alteration types of the E2Fs and their correlations and interactions with neighboring genes were analyzed for CRC patients using the online cBioPortal dataset (http://www.cbioportal.org/) (TCGA, PanCancer Atlas). In this analysis, 594 samples showed altered E2Fs in 524 CRC patients. Across the four subtypes of CRC, increased mRNA expression of E2Fs accounted for a large proportion of alterations (more than 50% of samples) (Fig. 5A). We then explored the mutual relationships among the eight E2F genes by analyzing their mRNA expression (RNA Seq V2 RSEM). Most of these genes were positively correlated with each other, whereas E2F2 was negatively correlated with E2F5 and E2F6, E2F5 with E2F7, and E2F6 with E2F8 (Fig. 5B). Next, a network of the E2F factors and the 50 most frequently altered neighboring genes was constructed. We found that CDK8, SMAD2, RB1 and ELF1 were closely associated with the E2F alterations (Fig. 5C).
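
The pairwise co-expression analysis described above can be sketched as a correlation computed for every pair of genes across samples. The sketch below assumes Pearson correlation, since the text does not state which coefficient was used; the function names, gene labels and expression values are hypothetical and serve only to show how positive and negative relationships arise.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def coexpression(expr):
    """Return {(gene1, gene2): r} for every unordered pair of genes."""
    genes = sorted(expr)
    return {
        (g1, g2): pearson_r(expr[g1], expr[g2])
        for i, g1 in enumerate(genes)
        for g2 in genes[i + 1:]
    }


# Hypothetical per-sample expression values, not data from the study.
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],  # rises with GENE_A
    "GENE_C": [4.0, 3.0, 2.0, 1.0],  # falls as GENE_A rises
}

for pair, r in coexpression(expr).items():
    print(pair, round(r, 3))
```

A coefficient near +1 corresponds to a positive relationship between two genes and a coefficient near -1 to a negative one.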
Enrichment analysis and functional annotation of the E2Fs and the genes significantly related to them were performed with DAVID (The Database for Annotation, Visualization, and Integrated Discovery) (https://david.ncifcrf.gov/summary.jsp), using GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes). GO annotations were classified into three categories to predict the function of target genes: biological processes, cellular components and molecular functions. The terms significantly enriched for these E2F alterations were GO:0045944 (positive regulation of transcription from RNA polymerase II promoter), GO:0000122 (negative regulation of transcription from RNA polymerase II promoter) and GO:0045893 (positive regulation of transcription, DNA-templated) in biological processes (Fig. 6A); GO:0005654 (nucleoplasm), GO:0005634 (nucleus) and GO:0005667 (transcription factor complex) in cellular components (Fig. 6B); and GO:0008134 (transcription factor binding), GO:0003700 (transcription factor activity, sequence-specific DNA binding) and GO:0003677 (DNA binding) in molecular functions (Fig. 6C).
KEGG analysis in DAVID identified the signaling pathways associated with the alterations in the E2F genes and their most significantly altered neighboring genes (Fig. 7). As shown in Fig. 6D, the top 10 of the 22 signaling pathways most related to the significant changes of E2F genes in CRC are presented.
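
The idea behind GO and KEGG enrichment results such as those above can be illustrated with a plain over-representation test. The sketch below computes the upper tail of the hypergeometric distribution (a one-sided Fisher exact test). DAVID itself reports a modified, more conservative Fisher exact P value (the EASE score), so this is an approximation of the principle rather than DAVID's exact calculation. The function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(hits, list_size, term_size, background_size):
    """P(X >= hits) when drawing list_size genes without replacement from a
    background of background_size genes, of which term_size carry the term."""
    if not (0 <= hits <= list_size <= background_size):
        raise ValueError("require 0 <= hits <= list_size <= background_size")
    if not (0 <= term_size <= background_size):
        raise ValueError("require 0 <= term_size <= background_size")
    upper = min(list_size, term_size)
    favorable = sum(
        comb(term_size, i) * comb(background_size - term_size, list_size - i)
        for i in range(hits, upper + 1)
    )
    return favorable / comb(background_size, list_size)


# Hypothetical counts: 8 of 58 listed genes carry a term that annotates
# 200 of 20,000 background genes.
p = enrichment_p_value(hits=8, list_size=58, term_size=200, background_size=20000)
print(f"p = {p:.3e}")
```

A term is called enriched when the gene list contains more annotated genes than would be expected by chance given the background; in practice the P values are also corrected for the number of terms tested.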

Discussion
The E2F family of transcription factors has been reported to regulate the cell cycle and apoptosis [3]. Some members can drive quiescent cells from G0 phase into S phase [21], and they may be indispensable determinants of CRC cell survival, as E2F knockdown slows the G1/S transition and the proliferation rate of human intestinal epithelial cells [22]. Nevertheless, survival analysis of CRC patients with bioinformatic tools has not been widely reported so far. In this study, we integrated several online resources on the expression of E2F factors and evaluated their prognostic value for OS and DFS.
Among the E2F family, E2F1 is the most extensively investigated member in CRC [23-25]. One study reporting a suppressive function of E2F1 showed that absence of the E2F1 gene led to an increased incidence and progression of tumorigenesis induced by PTEN loss [23]. Others found that E2F1 was overexpressed in CRC patients [25] and that E2F1 could promote the aggressiveness of human colorectal cancer by activating the ribonucleotide reductase small subunit M2 [12]. In this study, the mRNA expression of E2F1 was significantly higher in both COAD and READ based on the ONCOMINE and TCGA datasets. E2F2 is generally regarded as an "activator" that acts on downstream proteins to promote proliferation and cell growth. In Table 1, the expression of E2F2 was significantly lower in CRC tissues, which accords with Xanthoulis's report that E2F2 was expressed at low levels in 41 cases [26]. However, the transcriptional levels of E2F2 in the GEPIA analysis were significantly higher in both COAD and READ. This discrepancy might be explained by Johnson's proposal that the ability of some E2F family members to behave as both oncogenes and tumor suppressor genes can be reconciled by putting E2F into context [27]. Neither OS nor DFS was associated with differential E2F1 or E2F2 expression over a follow-up of 150 months. In addition, the clinical stages of COAD and READ patients were not correlated with E2F1 or E2F2 expression.
The overexpression of E2F3 in CRC tissues and cell lines has been confirmed by RT-qPCR [28]. In both in vitro and in vivo experiments, Zhang demonstrated that high miR-34a expression inhibited its direct target gene E2F3 and thereby inhibited the growth of CRC cells [29]. CircPRMT5 is an upstream regulator of E2F3 and can promote the proliferation of CRC by inducing E2F3 expression [30]. Compared with other E2F members, E2F4 has been studied less. Garneau found that E2F4 is required for cell cycle progression not only in normal intestinal crypt cells but also in CRC cells [22]. In an immunohistochemical study of 100 cases, Xanthoulis found nuclear E2F4 immunopositivity in all cases [26]. In our study, the expression of E2F3 was significantly higher in CRC tissues than in normal tissues, whereas E2F4 expression data were not available in Oncomine. The expression of E2F3 was also correlated with clinical stage in COAD patients, implying that E2F3 could be used as a reference index for tumor staging in this disease. In addition, higher E2F3 and E2F4 expression was significantly related to poor OS, but not DFS, in COAD patients. These two factors could therefore be used to evaluate the prognosis of COAD patients.
E2F5 has been widely reported to be upregulated in several tumors and to play a crucial part in cell proliferation, migration and invasion in these diseases [31,32]. Yu found that higher E2F5 expression significantly increased CRC cell proliferation and could also reverse the inhibition of CRC cell proliferation induced by silencing of the upstream SNHG6 [33]. Similarly, the facilitating effect of E2F5 on CRC cell progression could be suppressed by the upstream miR-34a [34]. Higher expression of E2F6 has been reported in gastric and breast cancers [35,36]. However, to our knowledge, no study has examined the expression and prognostic role of E2F6 in CRC patients. Liu reported that E2F7 was significantly higher in CRC cells and that reducing E2F7 expression with all-trans retinoic acid could reverse the progression of CRC cells [37]. A low-frequency missense variant in E2F7 was reported to be significantly associated with CRC risk, indicating an important role of E2F7 in the development of this tumor [13]. Similarly, E2F8 expression was significantly higher in CRC patients [14], and CRC cells with E2F8 knockdown showed lower growth rates [38]. In this study, the expression of E2F5, E2F6, E2F7 and E2F8 was significantly higher in CRC tissues than in normal tissues. However, the expression levels of these four E2F members were related neither to clinical stage nor to OS and DFS in COAD and READ patients.
In summary, we explored the expression and survival associations of the E2F members in CRC patients and used bioinformatic tools as a new route to analyzing the complexity of CRC. E2F1/2/3/5/7/8 were significantly higher in CRC patients than in normal samples, and E2F3 could be used as a biomarker for clinical staging in COAD patients. In addition, lower E2F3 and E2F4 expression predicted better OS, but not DFS, in COAD patients.

The summary of eight E2F factors in different cancers in transcription levels (Oncomine database).

Figure 3
E2F expression in different tumor stages in COAD and READ patients (from GEPIA).

Figure 4
The associations of mRNA expression of E2F factors with OS and DFS in CRC patients (from GEPIA).

Figure 5
The analysis of expression and mutation of E2F genes in CRC (cBioPortal).

Figure 6
Enrichment analysis and functional annotation of the E2Fs and of the genes significantly associated with E2F alterations, by DAVID (A, B and C) and KEGG (D).