Multi-Omics Analysis Reveals E2F5 as a Prognostic Biomarker in Diffuse Large B-Cell Lymphoma


Diffuse large B-cell lymphoma (DLBCL) is a highly aggressive and often fatal hematological malignancy. Few biomarkers are available to predict the survival of DLBCL patients, so new biological targets are urgently needed to improve prognostic prediction and diagnostic sensitivity. The E2F family plays an essential role in tumorigenesis; however, its role in DLBCL remains obscure. We screened the mRNA expression of E2F transcription factor family members (E2Fs) in DLBCL and nonmalignant samples using GEPIA, CCLE and EMBL-EBI. The associated regulatory pathways in DLBCL were established using the GeneMANIA, Metascape and SMART App databases. Transcriptional analysis indicated that E2F1/4/5/8 mRNA expression was significantly higher in patients and in cell lines. Moreover, patients with high E2F5/8 expression had significantly lower survival rates. Further functional analysis showed that E2F1/3/5 were hypomethylated in DLBCL, which may be associated with chemoresistance. These genes, together with their co-expressed genes, mainly formed transcription factor complexes, regulated the G1/S transition of the mitotic cell cycle, and participated in DLBCL tumorigenesis through the TGF-beta signaling pathway. These results suggest that E2F5 is a potential prognostic biomarker for survival in DLBCL patients.


Introduction
DLBCL is a molecularly heterogeneous malignancy that can be classified into at least three distinct subtypes (ABC, GCB, PMBL) 1. Biomarkers including MYC/BCL2 rearrangements and TP53 mutations are associated with a worse prognosis 2. The R-CHOP regimen has improved overall survival, but more than 30% of patients ultimately relapse 3. The cell-of-origin phenotype and gene mutations have been suggested as predictors in DLBCL; however, owing to the heterogeneity and complexity of lymphoma, many oncogenes and tumor suppressor genes remain to be discovered. Thus, finding reliable biomarkers that enable accurate detection of DLBCL is necessary.
The E2F family is considered an important regulator of the cell cycle and plays vital roles in regulating transcription and tumor suppressor proteins. E2F family members are divided into transcriptional activators and repressors 4. E2F1, E2F2 and E2F3a, as activator proteins, drive the transition from the G1 to the S phase and enhance cellular proliferation 5 6. E2F3b and E2F4/5/6, which form the 'repressive' E2F branch, repress transcription in quiescent and early G1 cells 7. As atypical repressors, E2F7/8 suppress E2F1-induced cell cycle progression 8. E2F mRNA expression is aberrant in several human malignancies, such as breast cancer 9, gastric cancer 10, hepatocellular carcinoma 11, lung adenocarcinoma 12 and colon cancer 13. To date, however, the overall biological role and clinical significance of the E2F factors in DLBCL have not been fully elucidated.
With the recent evolution of sequencing technologies, a vast amount of sequencing data has been uploaded to online repositories 14. Here, we used bioinformatics to analyze the expression patterns of the eight E2Fs in DLBCL patients and their significance for survival prognosis.

Ethics statement
This research was approved by the Ethics Committee of the First Affiliated Hospital of Xiamen University (Fujian, China). The TCGA and GTEx data were retrieved from published literature, and all procedures were implemented in accordance with relevant guidelines and regulations.

GEPIA Dataset Analysis
GEPIA (http://gepia.cancer-pku.cn/) is an interactive gene expression analysis server that provides clinical and sequencing data from the TCGA and GTEx projects and visualizes these data through customizable functions 15.

SMART App
The SMART (Shiny Methylation Analysis Resource Tool) App combines multi-omics and clinical data with comprehensive DNA methylation analysis. It visualizes the relationship between gene expression and methylation in cancer 18.
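
As an illustration of the kind of expression-methylation relationship described above, the following minimal Python sketch computes a Pearson correlation coefficient between per-sample methylation beta values and expression values. The function name `pearson_r` and the toy numbers are hypothetical and are not taken from the SMART App or from this study; the correlation method actually applied by the tool may differ.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy data: methylation beta values (0 to 1) and expression
# values for the same four samples. These are NOT data from the study.
beta_values = [0.10, 0.25, 0.40, 0.70]
expression = [8.1, 7.4, 6.9, 5.2]
r = pearson_r(beta_values, expression)  # negative when expression falls as methylation rises
```

A negative coefficient corresponds to the pattern reported here for E2F1/3/5, where higher expression accompanies lower methylation.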

Metascape
Metascape (http://metascape.org) is a reliable, effective and intuitive tool that enables visual analysis of multi-omics data. It was used to conduct KEGG and GO (including MF, BP and CC) analyses of E2Fs 19.
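
Enrichment analyses of this kind are commonly scored with a hypergeometric test on the overlap between the input gene list and a term's gene set, followed by multiple-testing correction. The sketch below assumes that setup with a Benjamini-Hochberg adjustment; it is an illustration only, not a reproduction of Metascape's implementation, and the function names and all numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background, term_size, list_size, overlap):
    """P(X >= overlap) when drawing list_size genes from a background of
    `background` genes, of which term_size belong to the ontology term."""
    upper = min(term_size, list_size)
    total = comb(background, list_size)
    hits = sum(
        comb(term_size, i) * comb(background - term_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


# Hypothetical example: 10 input genes, 6 of which fall in a 120-gene term,
# against an assumed background of 20,000 genes.
p_term = hypergeom_enrichment_p(20000, 120, 10, 6)
```

The "Log10(P)" and "Log10(q)" columns of an enrichment table would then be the base-10 logarithms of such p-values and adjusted p-values.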

GeneMANIA analysis
GeneMANIA (http://www.genemania.org) uses genomics and proteomics data to predict genes with similar functions. Here it was employed to predict the function of E2Fs and their co-expressed genes in DLBCL 20.

Statistical analysis
The GSE31312 dataset was analyzed using RStudio (version 1.2.1335) to verify the association between E2F expression and OS and PFS in lymphoma patients. *p < 0.05, **p < 0.01 and ***p < 0.001 indicate statistically significant differences.
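
The survival comparison and the significance notation above can be sketched as follows. This is a minimal Python illustration, not the R code used in the study: it assumes that patients are split into high and low expression groups at the median and compared with a two-group log-rank test, neither of which is stated in the text. All function names are hypothetical, and the "ns" label for non-significant results is an added convention.

```python
import math
from statistics import median


def split_by_median(expression):
    """Return True for samples whose expression is above the cohort median."""
    cutoff = median(expression)
    return [value > cutoff for value in expression]


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. events are 1 (event observed) or 0 (censored).
    Returns (chi-square statistic, p-value) with one degree of freedom."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for tt, _, g in data if tt >= t and g == 0)
        at_risk_b = sum(1 for tt, _, g in data if tt >= t and g == 1)
        deaths_a = sum(1 for tt, e, g in data if tt == t and e and g == 0)
        deaths = sum(1 for tt, e, _ in data if tt == t and e)
        n = at_risk_a + at_risk_b
        # Observed minus expected events in group A at this event time.
        obs_minus_exp += deaths_a - deaths * at_risk_a / n
        if n > 1:
            variance += (deaths * (at_risk_a / n) * (at_risk_b / n)
                         * (n - deaths) / (n - 1))
    chi2 = obs_minus_exp ** 2 / variance if variance > 0 else 0.0
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def significance_stars(p):
    """Map a p-value to the star notation used in the text."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"  # assumption: label for non-significant results
```

In practice, `split_by_median` would be applied to the expression of one E2F gene, and the resulting groups' OS or PFS times and event indicators passed to `logrank_test`.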

Results
E2Fs were highly expressed in DLBCL at the mRNA level
To identify differentially expressed genes (DEGs) of the E2F family in DLBCL, we used the GEPIA dataset to compare mRNA expression between DLBCL patient samples from TCGA (n=47) and healthy tissues from GTEx (n=337). We used log2 (TPM + 1) to quantify the mRNA expression of E2Fs and found that E2F1/4/5/6/8 were highly expressed in DLBCL, while E2F2/3/7 were not significantly different (Fig. 1).
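
The log2 (TPM + 1) quantification can be illustrated with a short Python sketch. The fold-change definition used here (difference of group medians on the log scale) is an assumption made for illustration, and the function names and TPM values are hypothetical rather than taken from GEPIA or from this study.

```python
import math
from statistics import median


def log2_tpm(tpm_values):
    """Transform TPM values to log2(TPM + 1); the +1 keeps zero counts finite."""
    return [math.log2(value + 1.0) for value in tpm_values]


def log2_fold_change(tumor_tpm, normal_tpm):
    """Difference between the median log2(TPM + 1) of tumor and normal samples.
    Assumption: medians on the log scale are used to summarize each group."""
    return median(log2_tpm(tumor_tpm)) - median(log2_tpm(normal_tpm))


# Hypothetical TPM values for one gene in a few tumor and normal samples.
tumor = [31.0, 15.0, 63.0]
normal = [3.0, 1.0, 7.0]
lfc = log2_fold_change(tumor, normal)  # positive when the gene is higher in tumor
```

A positive value indicates higher expression in tumor samples; whether the difference is significant still requires a statistical test across the two groups.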

Expression of E2F transcription factors in lymphoma cell lines
After determining the expression of E2Fs in clinical lymphoma patients, we analyzed E2F mRNA expression in cell lines using the CCLE and EMBL-EBI databases. By assembling CCLE, we found that

Functional enrichment analysis of E2F1/4/5/8 and co-expression genes in patients with DLBCL
To reveal the function and potential mechanism of E2F1/4/5/8, we constructed a network of E2F1/4/5/8 and their neighboring genes (TFDP1/2, RBL1/2, E2F2) using the GeneMANIA database (Fig. 5A). Metascape analysis showed that E2F1/4/5/8 and their neighboring genes were mainly enriched in the G1/S cell cycle transition. These genes may form protein-DNA complexes that alter DNA structure and thereby enhance the interaction between RNA polymerase and specific promoters, promoting target gene expression (Fig. 5B and Table 2).
The top KEGG pathways were the cell cycle and TGF-beta signaling pathways, which are involved in the development of multiple tumors and participate in the tumorigenesis and pathogenesis of lymphoma (Fig. 5C and Table 3). In addition, to further understand the role of E2Fs in lymphoma, we conducted a protein-protein interaction (PPI) analysis (Fig. 5D).

Discussion
DLBCL is the most common form of lymphoma. Current biological and clinical research shows that it is a highly heterogeneous and aggressive malignant tumor, with heterogeneity in clinical outcome, genetic characteristics and cell of origin 23. Although the clinical application of the R-CHOP treatment strategy has significantly improved the cure rate of DLBCL, 30-40% of patients still relapse 24. Therefore, studies seeking biomarkers that predict clinical prognosis in DLBCL are necessary. Studies have suggested that the E2F transcription factor family plays a key role in cell cycle network regulation, regulating cell proliferation, differentiation and apoptosis and participating in physiological and pathological processes 25,26. However, the multifaceted roles of E2Fs in the development, metastasis and prognostication of DLBCL remain to be clarified. Our study analyzed the mRNA expression levels of E2Fs, their potential pathways and their prognostic (OS/PFS) value in DLBCL.
E2F1 preferentially binds to the retinoblastoma protein (pRB) in a cell cycle-dependent manner 27. It can simultaneously mediate cell proliferation and p53-dependent or p53-independent apoptosis 28. Our results indicated that higher E2F1 mRNA expression was associated with longer OS and PFS. Møller found that E2F1 acts as a tumor suppressor gene in DLBCL and that low E2F1 expression was associated with treatment failure, suggesting that it may serve as a prognostic marker for DLBCL patients 29. These findings are consistent with ours. However, Samaka found that high E2F1 expression was associated with shorter survival in DLBCL and that upregulation of E2F1 indicated a more malignant tumor 30. Therefore, more detailed research is needed to establish the relationship between E2F1 and the survival of DLBCL patients.
As an oncogene, E2F5 can cooperate with other oncogenes to promote malignant cell transformation. E2F5 is up-regulated and shows a significant correlation with pathological variables and tumorigenesis 31. E2F5 functions as a repressor and an effector of RB, controlling cell proliferation and apoptosis 8. Previous studies have revealed that E2F5 is highly expressed in several tumors 32, such as glioblastoma 33, breast cancer 34 and prostate cancer 35. In this study, we found that elevated expression of E2F5 was significantly correlated with shortened OS/PFS in DLBCL patients.
As a tumor suppressor, E2F8 mediates transcriptional suppression by blocking cells from entering S phase 36. Over-expression of E2F8 activates target genes that may promote cell cycle, mitosis, immune and other cancer-related functions in Burkitt's lymphoma (BURK) and mantle cell lymphoma (MCL) 37. Our research showed that elevated expression of E2F8 was significantly correlated with shortened OS in DLBCL.
In our research, we found that E2F1/4/5/8 mRNA expression levels were up-regulated in DLBCL patients and cell lines. Further analysis showed that E2F2/3/4/5, RBL1/2, TFDP1/2 and CDK2 mainly formed transcription factor complexes participating in the cell cycle and the TGF-beta signaling pathway in DLBCL. Previous studies have shown that aberrant gene methylation affects the regulation of gene expression and patient survival, and that gene expression is associated with methylation in DLBCL 38. We found that E2F1/3/5 expression was negatively correlated with DNA methylation. Emerging studies have provided evidence that E2F and other transcription factors regulate DNMT expression through E2F-Rb-HDAC-dependent and -independent pathways, which potentially contribute to tumor progression 39. In our study, we found that elevated E2F1, E2F5, TFDP1, HDGF and DNMT1 were negatively correlated with DNA methylation in DLBCL patients. TFDP1 and E2F factors jointly participate in normal cell cycle progression. Somatic mutations in TFDP1 uncouple the normal biological processes of the E2F pathway, which could lead to tumorigenesis 40. HDGF is upregulated in a variety of malignant tumors and is closely related to tumor stage, grade and proliferation 41. Studies have found that cancers harboring mutant p53 genes are accompanied by high expression of HDGF 42. In DLBCL patients, p53 mutations are accompanied by dysregulated expression of related genes, leading to poor prognosis 43. In our research, we found that TFDP1 and HDGF were positively correlated with E2F5, whereas E2F5 was negatively correlated with DNMT1 expression in DLBCL patient databases. This result is also consistent with observations in immortalized cell lines indicating that transient overexpression of E2F5 can stimulate cell proliferation only when it is co-expressed with TFDP1 31. Our findings suggest that DNMT1-mediated DNA methylation of the E2F5/TFDP1/HDGF ternary complex influences prognosis in DLBCL patients. This deregulation alters the cell cycle and has prognostic significance in DLBCL.

Conclusions
We comprehensively analyzed the role of E2F transcription factors in the occurrence, development and prognosis of DLBCL through bioinformatics. We found that E2F1/4/5/8 were over-expressed in DLBCL cell lines and patients. E2F2/3/4/5, RBL1/2, TFDP1/2 and CDK2 mainly formed transcription factor complexes regulating the cell cycle and the TGF-beta signaling pathway in DLBCL. Analysis of E2F expression and DNA methylation status showed that E2F1/3/5 expression might be negatively related to DNA methylation in DLBCL. These genes, together with DNA methylation genes, were correlated with TFDP1 and HDGF and may participate in the pathogenesis of DLBCL. Furthermore, our study indicated that increased expression of E2F5 was associated with a worse prognosis in DLBCL, so E2F5 may be a potential target for individualized treatment of DLBCL patients. In conclusion, this research demonstrates for the first time that E2F5 is a potential prognostic biomarker for survival in DLBCL patients.

Tables

Table I
Cox regression

Top 8 clusters with their representative enriched terms (one per cluster). "Count" is the number of genes in the user-provided lists with membership in the given ontology term. "%" is the percentage of all of the user-provided genes that are found in the given ontology term (only input genes with at least one ontology term annotation are included in the calculation). "Log10(P)" is the p-value in log base 10. "Log10(q)" is the multi-test adjusted p-value in log base 10.

Top 4 clusters with their representative enriched terms (one per cluster). "Count" is the number of genes in the user-provided lists with membership in the given ontology term. "%" is the percentage of all of the user-provided genes that are found in the given ontology term (only input genes with at least one ontology term annotation are included in the calculation). "Log10(P)" is the p-value in log base 10. "Log10(q)" is the multi-test adjusted p-value in log base 10.

Figures
Figure 1
The E2F family members' mRNA levels in DLBC patients (GEPIA). The box plots compare the expression of E2Fs in tumor tissues and normal tissues; * indicates p < 0.05.

Figure 3
Intersection of the predicted results in the GEPIA, CCLE and EMBL-EBI.
