HLF is a Potential Prognostic Biomarker in Head and Neck Squamous Cell Carcinoma Based on Bioinformatic Analysis and Experimental Validation

Background Head and neck squamous cell carcinoma (HNSCC) is one of the most frequent cancers worldwide, and its incidence is increasing. However, the molecular mechanisms underlying HNSCC remain poorly understood. Method Five original datasets (GSE23558, GSE13601, GSE30784, GSE9844, GSE78060) of HNSCC were selected from the Gene Expression Omnibus (GEO) database to identify differentially expressed genes (DEGs) between HNSCC and adjacent tissues. The common DEGs were obtained with a Venn diagram. The sensitivity and specificity of HLF were determined by receiver operating characteristic (ROC) curves. To further confirm the relationship between HLF and the prognosis of HNSCC patients, the expression and survival analysis of HLF was performed with Gene Expression Profiling Interactive Analysis (GEPIA), followed by cell culture, reverse transcription polymerase chain reaction (RT-PCR), western blotting and immunohistochemical staining. Results Seventeen DEGs were screened from five HNSCC gene expression series in the GEO datasets. The bioinformatics analysis indicated that low expression of HLF might be correlated with poor prognosis in HNSCC patients. The results of cell culture, RT-PCR, western blotting and immunohistochemical staining confirmed that a low level of HLF expression correlated with poor prognosis in HNSCC patients. Conclusion The study revealed useful information about the relationship between low HLF expression and HNSCC. In summary, we identified HLF as a potential prognostic biomarker and therapeutic target for HNSCC.


Introduction
Head and neck squamous cell carcinoma (HNSCC) is one of the most common malignancies, with >450,000 patients diagnosed every year [1]. Despite the integrated application of therapies, the overall 5-year survival rate of HNSCC patients has remained below 50% over the last 20 years [2][3]. Although breakthrough findings have been made on the pathogenesis of HNSCC, no prognostic biomarkers have yet been established to improve diagnosis and treatment strategies [4].
Tumor occurrence and development proceed through multiple stages and are affected by many factors. Even the same tumor may express different molecules at different lesion sites or at different disease stages in the same patient [5]. Comprehensive studies are therefore needed to provide a basis for identifying reliable prognostic biomarkers. Differentially expressed genes (DEGs) or differentially expressed proteins (DEPs) between tumor and normal tissues can be identified by bioinformatics methods, and the correlation between patient survival and these DEGs or DEPs can then be analyzed to screen for potential tumor prognostic markers. Some scholars demonstrated that the expression of DDB2 (damaged DNA binding protein 2) inhibited the expression of endogenous hypoxia markers, promoted tumor angiogenesis, and correlated with the infiltration and prognosis of HNSCC [6]. Zhao et al. [7] reported that SPP1, ITGA6, TMPRSS11D, MMP1, LAMC2, FAT1, ACTA1, SERPINE1 and CEACAM1 played important roles in tumor development and may serve as prognostic markers and potential therapeutic targets for HNSCC. However, these conclusions and the mechanisms linking these genes to the tumor require further experimental support. Another study reached the opposite conclusion, suggesting that the expression of SPP1 did not affect tumor development or prognosis [8]. Therefore, more valuable prognostic markers need to be identified.
In this study, we downloaded the raw data (GSE23558, GSE13601, GSE30784, GSE9844, GSE78060) from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo) and identified 17 differentially expressed genes between tumor samples and normal tissues. Clinical data in the TCGA database further confirmed that weak expression of HLF was closely related to poor prognosis and shorter patient survival. The results of RT-PCR, western blotting and immunohistochemical staining indicated a relationship between low HLF expression and HNSCC.

Identification of DEGs and clinical significance
Five GEO series were used in the present study (GSE23558, GSE13601, GSE30784, GSE9844 and GSE78060) and were analyzed separately using the online tool GEO2R with default parameters (https://www.ncbi.nlm.nih.gov/geo/geo2r/). DEGs meeting the cutoff criteria of |log2 fold change (logFC)| > 2 and adj.P.value ≤ 0.05 were considered significantly different. A Venn diagram identifying the DEGs common to these five series was created with a web tool (http://bioinformatics.psb.ugent.be/webtools/Venn/). Subsequently, the common genes highly related to HNSCC were defined as hub genes. In addition, the sensitivity and specificity of HLF were determined by receiver operating characteristic (ROC) curves.
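The screening, overlap and ROC steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GEO2R pipeline itself: the function names, the (gene, logFC, adj.P) tuple layout and all toy values are hypothetical, and the area under the ROC curve is computed with the rank-based (Mann-Whitney) formulation rather than with whichever software produced the ROC curves in this study. Because a marker that is lower in tumors ranks tumors below normals, the toy example uses the negated expression value as the score.

```python
from functools import reduce


def screen_degs(rows, lfc_cutoff=2.0, padj_cutoff=0.05):
    """Return gene symbols with |logFC| > lfc_cutoff and adj.P <= padj_cutoff."""
    return {
        gene
        for gene, log_fc, adj_p in rows
        if abs(log_fc) > lfc_cutoff and adj_p <= padj_cutoff
    }


def common_degs(series):
    """Intersect the per-series DEG sets (the centre region of a Venn diagram)."""
    return reduce(set.intersection, (screen_degs(rows) for rows in series.values()))


def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in scores_pos
        for n in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))


# Toy, made-up values purely for illustration (not data from the study).
toy_series = {
    "series_A": [("GENE1", -2.6, 0.001), ("GENE2", 3.1, 0.002), ("GENE3", 0.4, 0.60)],
    "series_B": [("GENE1", -2.2, 0.010), ("GENE2", 2.5, 0.030), ("GENE3", 2.4, 0.20)],
    "series_C": [("GENE1", -3.0, 0.004), ("GENE2", 1.8, 0.001), ("GENE3", -2.9, 0.01)],
}
shared = common_degs(toy_series)

# Hypothetical expression values of a marker that is lower in tumor samples.
tumor_expr = [1.0, 2.0, 3.0]
normal_expr = [2.5, 4.0, 5.0]
auc = roc_auc([-x for x in tumor_expr], [-x for x in normal_expr])

print(sorted(shared), round(auc, 3))
```

Representing each series as a set makes the five-way overlap a single intersection, which is the same quantity read from the centre of the Venn diagram.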
To further test the accuracy of HLF as a prognostic factor, the expression and survival analysis of HLF was performed with Gene Expression Profiling Interactive Analysis (GEPIA, http://gepia.cancer-pku.cn/). The results are shown as box plots and Kaplan-Meier survival curves [9].
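For readers unfamiliar with Kaplan-Meier curves, the estimator behind such plots can be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the follow-up times and the two expression groups are hypothetical, and the code does not reproduce how GEPIA groups patients or tests for differences between curves.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (death) was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Made-up follow-up data (months) for two hypothetical expression groups.
low_group = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
high_group = kaplan_meier([10, 15, 22, 30, 30], [0, 1, 0, 1, 0])

for label, curve in (("low", low_group), ("high", high_group)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

Censored patients leave the risk set without lowering the curve, which is why the estimate only steps down at observed event times.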

Patients and clinical samples
This is a prospective study from February 2008 to December 2015, which has been approved by the ethical committee of the Stomatological Hospital, Southern Medical University. All participants were asked to provide signed informed consent to participate in the trial. The present study was performed in accordance with the guidelines of the World Medical Association Declaration of Helsinki Ethical Principles. All tumor tissues were obtained from patients who did not receive chemoradiotherapy.

Immunohistochemistry
The tissue samples were fixed in neutral buffered formalin and embedded in paraffin, followed by H&E staining or immunohistochemical staining. The sections were soaked in 3% hydrogen peroxide for 10 min and then treated with EDTA buffer (pH 8.0) for antigen retrieval. Next, the sections were treated with 1% bovine serum to block nonspecific background binding. Anti-HLF antibody (1 mg/mL; Thermo Fisher Scientific, Inc.) and secondary antibody were applied sequentially to the tissue sections at room temperature. The sections were divided into two groups: a low HLF expression group (≤60%) and a high HLF expression group (>60%). All slides were assessed independently and blindly by two pathologists.

Western blot analysis
Thirty frozen tissue samples, including 15 tumor tissues and 15 normal tissues, were homogenized in tissue lysis buffer, and total protein was extracted. The protein lysates were separated by 10% sodium dodecyl sulfate-polyacrylamide gel electrophoresis and then transferred to polyvinylidene difluoride membranes. The membranes were incubated overnight with anti-HLF antibody (1:2000; Thermo Fisher Scientific, USA) at 4℃, followed by incubation with secondary antibody (Thermo Fisher Scientific, USA) for 1 h at room temperature. Western blots were developed following the manufacturer's instructions. Protein band intensity was quantified with ImageJ (National Institutes of Health). Actin served as a control.
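Band intensities exported from densitometry are typically expressed relative to the control band from the same lane before groups are compared. The sketch below illustrates that arithmetic under stated assumptions: the function name and all intensity values are hypothetical, and the source does not describe the exact normalization or statistical test that was applied.

```python
from statistics import mean


def normalize_bands(target_intensity, control_intensity):
    """Divide each target band by the control band measured in the same lane."""
    if len(target_intensity) != len(control_intensity):
        raise ValueError("each lane needs one target and one control intensity")
    return [t / c for t, c in zip(target_intensity, control_intensity)]


# Made-up densitometry readings in arbitrary units (not data from the study).
tumor_ratio = normalize_bands([1200.0, 900.0, 1500.0], [6000.0, 6000.0, 6000.0])
normal_ratio = normalize_bands([4200.0, 3600.0, 4800.0], [6000.0, 6000.0, 6000.0])

tumor_mean = mean(tumor_ratio)
normal_mean = mean(normal_ratio)
print(round(tumor_mean, 3), round(normal_mean, 3))
```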

Cell culture and Real-Time PCR
In this study, the oral squamous cell carcinoma cell line Tca8113 was purchased from the American Type Culture Collection (ATCC) cell bank and cultured in DMEM/F12 medium (Corning, USA) containing 10% fetal bovine serum (Gibco, USA) at 37℃ with 5% CO2. The HOMK100 cell line, which is derived from human oral mucosa, was purchased from Cell Research Corporation (Singapore) and maintained in medium supplemented with EpiLife defined growth supplement (Gibco, USA).
Total RNA was extracted from Tca8113 cells or HOMK100 cells with the RNeasy Mini Kit (Qiagen) and treated with RNase-free DNase I (Invitrogen) according to the manufacturer's protocol. cDNA was then synthesized using oligo(dT)15 primers and SuperScript Reverse Transcriptase (Invitrogen). Expression levels of HLF mRNA were quantified using TaqMan probes (HLF, Hs00171406_m1, Applied Biosystems) with the TaqMan gene expression master mix (Applied Biosystems). The data from the TaqMan Gene Expression Assays were analysed with the SDS v2.2 software (Applied Biosystems).
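The source does not state which relative quantification method or reference gene was used. As an illustration only, the widely used 2^(-ΔΔCt) calculation is sketched below; the choice of this method, the function name, the reference gene and all Ct values are assumptions and are not taken from the study.

```python
def relative_expression(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of the target gene in the test sample versus the control sample.

    Each Ct is first corrected by the reference-gene Ct from the same sample
    (delta Ct); the difference between the two delta Ct values (delta-delta Ct)
    is then converted to a fold change, assuming the product doubles every cycle.
    """
    delta_test = ct_target_test - ct_ref_test
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(delta_test - delta_ctrl)


# Made-up Ct values: test = tumor cell line, ctrl = normal mucosa cell line,
# with a hypothetical reference gene measured at Ct 18 in both samples.
fold_change = relative_expression(
    ct_target_test=28.0, ct_ref_test=18.0,
    ct_target_ctrl=25.0, ct_ref_ctrl=18.0,
)
print(fold_change)
```

A fold change below 1 indicates lower target expression in the test sample than in the control sample.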

Results
Five data series (GSE23558, GSE13601, GSE30784, GSE9844, GSE78060), all associated with HNSCC, were obtained from the NCBI-GEO datasets and analyzed with GEO2R. Based on the thresholds of |logFC| > 2 and adj.P.value ≤ 0.05, a total of 17 common HNSCC DEGs were screened, including 11 upregulated DEGs (including RSAD2, MMP1, MMP3, MMP10, IFI6, ISG15, PTHLH, PLAU, …
The results of immunohistochemical staining showed that HLF expression in HNSCC was lower than that in normal tissue (Fig. 5). Western blot results demonstrated that HLF protein expression was low in tumor tissues (P<0.05), in keeping with the results of the immunohistochemical staining. Further RT-PCR analysis showed that HLF mRNA expression was lower in Tca8113 cells than in HOMK100 cells (P<0.05).

Discussion
Hepatic leukemia factor (HLF) is a member of the proline and acidic amino acid-rich family of transcription regulatory proteins [11]. The 5' breakpoints in E2A may lead to type II rearrangements, in which E2A exon 12 is fused directly to HLF exon 4 [12]. E2A-HLF has the potential to promote the development of leukemia [13][14]. Moreover, Shu Chen et al. found that HLF inhibited the proliferation, metastasis and radioresistance of glioma cells by binding to the BS1 site of the promoter to enhance the expression of miR-132, which in turn targets and inhibits TTK [15]. Previous studies have shown that Jun is an important oncogene. Xiang et al. [16] demonstrated that HLF transactivates c-Jun to promote tumour initiating cell (TIC) generation and enhance TIC-like properties, thus driving tumour initiation and progression. In addition, HLF forms a positive feedback loop with the pluripotency factors OCT4 and SOX2, and the feedforward system composed of OCT4, SOX2 and NANOG also regulates the transcription of their own coding genes by forming an interconnected self-regulatory network. Recent studies have shown that SOX2, OCT4 and NANOG are involved in regulating the proliferation of HNSCC cancer stem cells and promoting tumor growth [17][18][19][20][21][22][23]. Therefore, it may be hypothesised that HLF plays a major role in the development of HNSCC. However, little is known about the relationship between HLF and HNSCC, and this hypothesis needs more experimental data for confirmation.

Conclusion
In this study, we used bioinformatics methods to identify HLF as a prognostic biomarker of HNSCC and verified its reliability as a prognostic marker with experimental and clinical data. HLF expression was lower in HNSCC tumor cells than in normal tissues, and low expression of HLF was also linked to shortened overall survival in patients with HNSCC. However, this study has some limitations owing to the preset research objectives and limited time. The molecular mechanism of HLF in HNSCC remains unclear, and more research is needed.

Declarations
Authors' contributions WF and XQY initiated and designed the work and prepared the manuscript. DW performed the experiments. JC, WQM, and ZL contributed to the acquisition of patients and tissue specimens and to the analysis and interpretation of data. All authors read and approved the final manuscript.

Funding
This work was supported by a grant from the Stomatological Hospital of Southern Medical University Projects (No. PY2020021).

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Ethics approval and consent to participate
The study was approved by the Stomatological Hospital, Southern Medical University, Guangzhou, China, and all participants were asked to provide signed informed consent to participate in the trial.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Figure 1 A total of 17 common HNSCC DEGs were screened based on the data of the five HNSCC gene expression series from GEO datasets.