Erythropoietin receptor is a risk factor for prognosis: A potential biomarker in lung adenocarcinoma.

Lung cancer has the highest mortality rate of all cancers, and LUAD's survival rate is particularly poor. Erythropoietin receptor (EPOR) can be detected in lung adenocarcinoma (LUAD), however, the expression levels and prognostic value of EPOR in LUAD are still unclear. In our study, clinicopathological data of 92 LUAD patients between January 2008 and June 2016, multiple bioinformatics databases and immunohistochemistry were used to explore the EPOR expression, the mutant genes affecting EPOR expression, and the correlation of EPOR expression with oxidative stress - related genes, prognosis, immune microenvironment. All statistical analyses were performed in the R version 4.1.1. The study found that EPOR expression might be down-regulated at the mRNA levels and significantly up-regulated at the protein levels in LUAD, which indicates that the mRNA and protein levels of EPOR are inconsistent. The muTarget showed that the expression of EPOR was significantly different between the mutant group and the wild group of 15 genes, including DDX60L and C1orf168. Importantly, we found that EPOR was associated with VEGF and HIF family members, and had significant positive correlation with oxidative stress - related genes such as CCS, EPX and TXNRD2. This suggests that EPOR may be involved in the regulation of oxidative stress. The Kaplan-Meier Plotter and PrognoScan databases consistently concluded that EPOR was associated with prognosis in LUAD patients. Our clinicopathological data showed that high EPOR expression was associated with poorer overall survival (29.5 vs 46 months) and had a good predictive ability for 4-year and 5-year survival probability. EPOR is expected to be a potential new prognostic marker for LUAD.


Introduction
According to the latest statistics, cancer remains the leading cause of human death. Compared to other cancers, lung cancer has a relatively low survival rate, 80% of which is non-small cell lung cancer (NSCLC) [1]. Adenocarcinoma (AD) is the most common histological subtype of NSCLC [2,3]. In recent years, despite the achievements in medical treatment and the inclusion of targeted drugs in the treatment of patients with this type of cancer, the survival rate of lung adenocarcinoma (LUAD) is still very poor, only 4-17% [4]. This may be due to the high heterogeneity of LUAD and the lack of effective prognostic markers [5,6]. The prognosis of LUAD has been a hot topic of medical research in recent years [7][8][9][10], it is not di cult to see the urgent pursuit of improving survival rate, so there is still a long way to go to explore the prognostic markers of LUAD. Erythropoietin receptor (EPOR) is a member of the cytokine class I receptor family, that mediates erythropoietin (EPO) -induced proliferation and differentiation of erythroblasts, with a structural motif consisting of two extracellular immunoglobulin-like domains, four similarly spaced cysteine residues and the sequence WSXWS, lacking tyrosine kinase activity and binding to JAK kinase, forming homodimers, heterodimers or heterotrimer complexes [11]. Upon EPO stimulation, binding to its homologous dimer receptor complex, induces the activation of Janus kinase 2 (JAK2), a non-receptor tyrosine kinase [12]. Activation of JAK2 results in the activation of speci c downstream effectors, such as STAT1, STAT3, or STAT5 [13], the signal transducer and activator of transcription 5 (STAT5) usually refers to STAT5A and STAT5B proteins [14], thus activates phosphatidylinositol 3-kinase (PI3K), protein kinase B (AKT), mitogen-activated protein kinase (MAPK) and extracellular signalregulated kinase 1/2 (ERK1/2) [15,16]. EPOR was originally discovered and described in erythroid progenitor cells, but it is also present in non-hematopoietic cells (tissues, organs) such as adipose tissues [17], bone progenitor cells [18], neurons [19], endothelial cells [20] and intestinal tract [21]. It is also widely present in various cancer cells and tumor tissues, such as head and neck squamous cell carcinoma [22], diffuse large B-cell lymphoma [23], rhabdomyosarcoma [24], breast cancer [25], hepatocellular carcinoma [26] and laryngeal malignancy [27]. Studies have shown that it is upregulated in gastric adenocarcinoma [28], liver cancer [29], prostate cancer [30], glioma [31] and other cancers. EPOR expression was also detected in non-small cell lung cancer cell lines [32,33], however, tumor immunohistochemistry (IHC) data based on anti-EPOR antibodies have been scarce, and Brown et al. [34] thought that inappropriate use of EPOR antibodies could result in false-positive results.In locally advanced squamous cell carcinoma of the head and neck, EPOR expression was an independent prognostic factor for OS, and improved OS was signi cantly associated with the absence of EPOR expression [35]. There was no signi cant difference in overall survival rate and recurrence-free survival rate among patients with different EPOR expression in cervical cancer [36]. Våtsveen et al. [37] found high levels of EPOR in myeloma cells to be associated with a better survival prognosis and suggested that EPOR expression may be a novel prognostic marker in primary myeloma.Thus, it was evident that the prognosis of EPOR expression is very different in various cancers.In NSCLC, many studies have concluded that EPO/EPOR co-expression or co-overexpression is associated with poor prognosis [38][39][40], however, there is a lack of evidence regarding the prognostic impact of EPOR alone on patients. Given that EPOR is expressed in NSCLC, including LUAD, but evidence at the tissue level is limited, whether EPOR is upregulated or downregulated in LUAD tumor tissue relative to normal tissue needs to be further explored, and an accurate correlation between EPOR signaling and prognosis in LUAD has not been established.Traditional clinical analyses, such as TNM staging and assessment of pathological parameters, do not accurately or dynamically re ect the progression of LUAD, and the continuous discovery of new prognostic markers would be very bene cial in predicting patient prognosis and could contribute to improving the survival rate of patients.The aim of this study was to explore the expression of EPOR in LUAD and related issues such as prognosis by combining various bioinformatics methods, immunohistochemical method, and clinicopathological data of the patients.

Materials And Methods
Tumor Immune Estimation Resource (TIMER) TIMER (https://cistrome.shinyapps.io/timer/) is a web server for comprehensive analysis of tumorin ltrating immune cells. In this study, the expression of EPOR in a variety of cancers was assessed by the "Diff Exp" module. The "Gene" module can study the relationship between EPOR expression in LUAD and the levels of immune cell in ltration (B cells, CD8+ T cells, CD4+ T cells, neutrophils, macrophages and dendritic cells) by using TCGA database. Using the "Correlation" module, the relationship between EPOR expression and different gene marker sets in immune cells was investigated. The correlation between EPOR expression and immune in ltration could be evaluated together with Spearman's ρ value and statistical signi cance after adjusting for tumor purity. The "SCNA" module can compare the level of immune in ltration with different somatic copy number alternations of EPOR gene in LUAD, including "deep deletion", "arm-level deletion", "diploid/normal", "arm-level gain" and "high ampli cation". In ltration level for each SCNA category was compared with normal level using a two-sided Wilcoxon rank sum test.
Oncomine Oncomine (https://www.oncomine.org/resource/login.html) is currently the world's largest oncogene microarray database and integrated data mining platform designed to help data mine the transcriptional expression of genes in various cancers. The p-value was set to 0.05, fold change was set to 1.5, and gene rank was set to all. UALCAN UALCAN (http://ualcan.path.uab.edu/) is a comprehensive, user-friendly and interactive web resource for analyzing cancer OMICS data, providing easy access to publicly available cancer OMICS data (TCGA, MET500 and CPTAC), expression pro les of coding proteins and evaluating epigenetic regulation of gene expression by promoter methylation and so on. The "TCGA" module can explore the expression of EPOR in LUAD.
Gene Expression Pro ling Interactive Analysis (GEPIA) GEPIA (http://gepia.cancer-pku.cn/index.html) is a interactive web server that integrates both TCGA and GTEx data for gene expression analysis. In this study, EPOR expression analysis was evaluated using the TCGA-LUAD dataset. In the "Expression DIY" module, we used TCGA Normal and GTEx data matching method and log 2 (TPM+1) for log-scale to study the expression of EPOR in LUAD and normal lung tissue specimens. Co-expressed genes were screened in the "Similar Genes" module.

MuTarget
The muTarget (https://www.mutarget.com/) is an online tool for linking mutation status to gene expression changes in solid tumors. "All somatic mutations" was selected as the somatic mutation type and the mutation prevalence was set at least 2%.

Analysis of EPOR-interacting genes and proteins
The GeneMANIA database (https://genemania.org/) was applied to construct the EPOR gene-gene interaction network. The STRING online database (https://string-db.org/) was applied to construct the EPOR protein-protein interaction (PPI) network.
Gene Ontology (GO) term, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis and gene set enrichment analysis (GSEA) RNAseq data for EPOR in LUAD were obtained from The Cancer Genome Atlas (TCGA;https://tcgadata.nci.nih.gov/tcga/), and analysis of differentially expressed genes (DEGs) for EPOR was performed by the R package DESeq2.GO and KEGG analyses were applied to explore the potential biological functions of DEGs in LUAD, complemented by GSEA. GO enrichment analysis contains three levels: molecular function (MF), cellular component (CC) and biological process (BP). GO, KEGG and GSEA were performed by the R package ClusterPro ler. Normalized enrichment score (NES) >1.5, false discovery rate (FDR) <0.25 and p.adjust<0.05 were considered to be of signi cant signi cance in GSEA.

Kaplan-Meier Plotter
KM Plotter (http://kmplot.com/analysis/), containing gene expression data and survival information in GEO, EGA and TCGA databases. The samples of LUAD patients were divided into two groups (high and low expression) according to the best cutoff and different Affymetrix IDs (37986_at, 216999_at, 209962_at, 209963_s_at, 396_f_at) were selected to explore the prognostic value of EPOR in LUAD. The relationship of overall survival (OS) and progression-free survival (PFS) with hazard ratio (HR) was analyzed using 95% con dence interval (95% CI) and the log-rank test p value. PrognoScan PrognoScan (http://dna00.bio.kyutech.ac.jp/PrognoScan/index.html), a new database for meta-analysis of prognostic value of genes. PrognoScan employs the minimum P-value approach for grouping patients and searches the biological relationship between gene expression and patient progonsis across a large collection of publicly available cancer microarray datasets with clinical annotation. To explore the relationship between EPOR expression and patient prognosis in LUAD, the screening parameters of the datasets were set as follows: "cancer type" was lung cancer, and "subtype" was "adenocarcinoma". Five datasets were nally obtained, including GSE13213, GSE31210, jacob-00182-CANDF and so on, in which the GSE13213 dataset contained 117 patients with lung adenocarcinoma with high probability of recurrence and the GSE31210 dataset contained 204 patients with pathological stage I-II lung adenocarcinoma. HR (95% CI) was calculated, and the threshold was adjusted to corrected P-value < 0.05. cBioPortal The cBioPortal (https://www.cbioportal.org/) contains a large-scale cancer genome dataset with visualization, download, analysis and other functions. We selected three lung cancer datasets with 855 cases and three LUAD datasets with 718 cases, respectively, for further analysis using cBioPortal. The type and frequency of EPOR gene alterations in LUAD tissues were analyzed by the "OncoPrint" module and the "Cancer Types Summary" module. The OS of EPOR genes in the altered and non-altered groups was analyzed by the "Comparison/Survival" module.

Immunohistochemistry (IHC)
Lung adenocarcinoma 180 point tissue microarrays (HLugA180Su04, XT16-032, Outdo Biotech, Shanghai, China) were placed in an oven and baked at 63 ℃ for 1 h (including 88 pairs of lung adenocarcinoma and adjacent normal lung tissue specimens, and tumor tissues had 4 extra points). Dewaxing was followed by antigen repair and subsequently placed in distilled water at room temperature and allowed to cool naturally for at least 10 min. Rinsed by PBS buffer, adding anti-EPOR primary antibody working solution (1:1000, bs-1424R, Bioss, Beijing, China), and at 4 °C overnight. Then it was incubated with biotin-labeled secondary antibodies for 45min at room temperature and cleaned with PBS. The slides were placed into an automatic immunohistochemistry instrument (Dako, United States), and the blocking, secondary antibody binding, and 3,3'-diaminobenzidine (DAB) color development programs were performed according to the "Autostainer Link 48 User's Guide". Hematoxylin staining for 1 min, submerged in 0.25% hydrochloric alcohol (400 ml 70% alcohol + 1 ml concentrated hydrochloric acid) for about 10 s, and rinsed with tap water for 5 min. Dried at room temperature and sealed the lm with neutral resin. IHC staining score can be divided into two parts: staining intensity and percentage of positive tumor cells. Staining intensity was classi ed as 0 (no staining), 1 (weak staining), 2 (moderate staining), and 3 (strong staining). Percentage of positive tumor cells was classi ed as 1 (≤ 25%), 2 (26% -50%), 3 (51% -75%) and 4 (> 75%). The nal expression scores were calculated by multiplying the two variables together, with a maximum score of 12 [41,42].

Validation of EPOR prognostic value based on clinicopathological data of patients
To verify the prognostic value of EPOR expression in LUAD patients, we performed survival analysis using detailed clinicopathological data from 92 lung adenocarcinoma tissue microarrays (HLugA180Su04) with a follow-up period of 3 -8.5 years (operation time: January, 2008 -July, 2013, follow-up time: June, 2016), which contained the uorescence in situ hybridization (FISH) data of anaplastic lymphoma kinase (ALK) and epidermal growth factor receptor (EGFR) and immunohistochemical data of programmed death ligand 1 (PD-L1). After excluding 14 cases of lost follow-up and 7 cases of undetectable EPOR (no cancer/no point), of which 1 overlapped, the 72 cases of lung adenocarcinoma were nally divided into two groups using cut-off value of EPOR immunohistochemical staining score (≥ 5: high expression; < 5: low expression), and 95% CI and log-rank P value were used to explore the effect of EPOR on OS of LUAD patients. After deleting the pathological data of 14 patients with missing clinical stages, the 58 cases were nally used to perform univariate Cox analysis on Gender, Age, AJCC stage, T/N/M stage, EPOR, ALK, EGFR and PD-L1 characteristics, followed by multivariate Cox analysis to explore the risk/protective factors for survival in LUAD patients. Finally, nomogram was created based on the results of multivariate Cox analysis to construct a scoring system capable of assessing patients' 5-year OS, and calibration curve based on the Hosmer-Lemeshow test were used to assess the consistency of the actual results with the predicted results of the nomogram. The above was performed by R package rms, survminer and survival.

Statistical Analysis
The results generated in Oncomine were displayed as p-values, fold changes and gene ranks. Kaplan-Meier Plotter and PrognoScan results were shown by log-rank test for HRs (95% CIs) and p-values. EPOR expression levels of ALK, EGFR and PD-L1 high and low groups (grouped by median expression) were compared by RNAseq data of LUAD in TCGA database. Paired-samples T-test or Wilcoxon rank sum test were used for comparison between groups. Correlation heat map of interacting proteins in the STRING database was displayed using Spearman's correlation and the R package ggcorrplot and corrplot.Single gene co-expression heat map was constructed with the R package ggplot2. Forest maps of the value of EPOR prognostic were plotted using the R package forestplot. The R package estimate was used to calculate the relationship between EPOR expression in LUAD and the tumor microenvironment (TME), and gene expression was considered signi cantly correlated when P < 0.5 and |R| > 0.20 were both satis ed [43]. Each P value < 0.05 was considered statistically signi cant.

EPOR expression in LUAD patients
The mRNA expression of EPOR in LUAD tissues was analyzed by TIMER database for the rst time. EPOR expression was higher in bladder urothelial carcinoma (BLCA), cholangiocarcinoma (CHOL), head and neck squamous cell carcinoma (HNSC), renal clear cell carcinoma (KIRC), gastric adenocarcinoma (STAD) and thyroid carcinoma (THCA) tissues than in the adjacent normal tissues, and lower in breast invasive carcinoma (BRCA) and lung squamous cell carcinoma (LUSC) tissues than in normal tissues ( Fig. 1a). In the UALCAN database, EPOR expression was not signi cantly different in LUAD primary tumor tissues from normal lung tissues (Fig. 1b). However, in GEPIA, EPOR expression was lower in LUAD tissues than in normal lung tissues (Fig. 1c). Subsequently, EPOR expression in LUAD tissues and in adjacentnormal tissues was analyzed using data obtained directly from TCGA. The expression of EPOR in LUAD was signi cantly decreased (Fig. 1d). After sample matching, 57 paired LUAD tumor specimens and adjacent normal tissue specimens were obtained, and EPOR expression was signi cantly downregulated in LUAD tissues (Fig. 1e). In addition, the EPOR mRNA expression was further detected by Oncomine database, and the expression of EPOR in LUAD tissues of six cohorts was downregulated compared to normal lung tissues, including "Bhattacharjee Lung" and "Stearman Lung" and so on ( Supplementary Fig. 1). These ndings were based on mRNA levels, but protein levels seems closer to the real situation. Therefore, we further studied the expression of EPOR protein in LUAD tissues and found that the protein levels of EPOR in LUAD tissues was higher than that in paired normal lung tissues (t = -10.184, P < 0.001) (Fig. 2a, b).

Gene mutations affecting EPOR expression and the correlation of EPOR expression with ALK, EGFR and PD-L1 in LUAD
We were curious about the gene mutations affecting EPOR expression in LUAD, and to this end, explored which gene mutations in LUAD lead to downregulation/upregulation of EPOR expression. The muTarget database showed that mutations of DDX60L, LGR6, POTEB3, RIF1 and SOX5 genes resulted in the downregulation of EPOR expression (Fig. 3a), and mutations of C1orf168, DBX2, EIF5B, FNDC1, KIAA0430, LRRC16A, MGAT3, PTPRM, TPH2 and UNC80 genes resulted in the upregulation of EPOR expression (Fig. 3b). The presence of ALK gene rearrangements and EGFR mutations in LUAD, both of which are tumor driver genes with targeted drug therapy and are of great signi cance to LUAD patients. PD-L1 is often closely associated with immune escape in cancer, and PD-L1 is also one of the common immunotherapy targets in lung cancer. Based on our clinicopathological data of tissue microarrays containing FISH data of ALK and EGFR and IHC data of PD-L1, we rst compared the expression of EPOR in the ALK/EGFR/PD-L1 groups with high and low expression using the TCGA database. The results showed that the expression level of EPOR was signi cantly higher in the ALK high expression group than in the ALK low expression group, and there was no signi cant difference in EGFR and PD-L1 groups between high and low expression.In our pathological data, no difference was found in the expression of EPOR between ALK/EGFR negative and positive groups and between PD-L1 high and low expression groups (Fig. 2c).

EPOR-interacting genes and proteins, EPOR co-expressed genes and genetic alterations
The PPI network of EPOR was generated using the STRING database. There were 46 edges and 11 nodes, including EPO, JAK2 and STAT5B, etc (Fig. 4a). We also constructed correlations between EPOR genes and genes expressing these proteins. EPOR genes were more correlated with STAT5B and PTPN6 genes (R > 0.2), and was correlated with EPO, KIT and GRB2 (Fig. 4c). The EPOR-interacting genes network was constructed by GeneMANIA. EPOR interacted physically with EPO, CCDC150 and CFAP161, with the strongest interactions with EPO, and had weak genetic interaction with PCNA (Fig. 4b). The top 40 EPOR co-expressed genes were screened by GEPIA database and a single gene co-expression heat map was constructed (Fig. 4d). The three datasets, LUAD (TCGA, Nature 2014), LUSC (TCGA, Nature 2012) and NSCLC (TRACERx, NEJM&Nature 2017), were rst analyzed in the cBioPortal database, and the results showed that the alteration frequencies of EPOR in LUSC and NSCLC were more than 2% and less than 1% in LUAD (Supplementary Fig. 2a). The EPOR gene alterations in the three LUAD datasets were further analyzed, and the average alteration frequency was only 0.7%, with "Deep Deletion" being the more common type (Supplementary Fig. 2b, d). Due to the lack of alteration data, the Kaplan-Meier plotter and log-rank test could only obtain the correlation between OS and EPOR altered/unaltered groups, and the result showed that the difference is not statistically signi cant ( Supplementary Fig. 2c).

Correlation of EPOR expression with tumor purity, immune cells, SCNA and tumor microenvironment
In the TIMER database, we analyzed the correlation of EPOR expression with tumor purity and six in ltrating immune cells including B cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils, and dendritic cells (DCs).The results showed that EPOR expression levels were not signi cantly correlated with tumor purity and in ltration of macrophages, neutrophils, DCs, negatively correlated with in ltration of CD8+ T cells, and positively correlated with in ltration of B cells and CD4+ T cells, where the correlation coe cientwith CD4+ T cells was close to 0.3 (Fig. 5a).We further analyzed the correlation of EPOR expression with gene markers of immune cells in the TIMER database.The results after adjusting for tumor purity showed that EPOR expression correlated with some gene markers of B cells, M1 macrophages, neutrophils, natural killer (NK) cells, DCs, type 1 T -helper (Th1) cells, type 2 T -helper (Th2) cells, type 17 T -helper (Th17) cells, follicular helper T (Tfh) cells, regulatory T (Treg) cells, effector memory T (Tem) cells and natural killer T (NKT) cells. Among them, the correlation coe cientwith CD19 (B cell), CD11c (DC), STAT6 (Th2), BCL6 (Tfh) and TGFB1 (Treg) was more than 0.2, but the overall showed a weak correlation between EPOR expression and immune cell markers in LUAD ( Table 1). The results of the SCNA module showed that signi cant differences in in ltration levels of B cells, CD4+ T cells, macrophages, neutrophils and DCs at the "Arm-level Deletion" somatic copy number state of EPOR in LUAD compared to normal levels,in addition, the in ltration levels of macrophages in the "Arm-level Gain" somatic copy number state of EPOR was signi cantly different compared to the normal level (Fig.  5b). In addition, we further investigated the correlation between EPOR expression and tumor microenvironment (TME) in LUAD. TME can be evaluated by ImmuneScore, StromalScore and ESTIMATEScore, which indicate tumor immune cell in ltration, presence of tumor tissue mesenchyme and tumor purity, respectively. The results showed no signi cant correlation between EPOR mRNA expression levels and the three scores ,with |R| ≤ 0.2 (Fig. 5c). Table 1 Correlation

analysis between EPOR and gene markers of immune cells in TIMER Description
Gene axon terminus. In BPs, signal release, response to metal ion, neurotransmitter transport, hormone metabolic process and amine transport dominated (Fig. 6a). And DEGs were enriched in neuroactive ligand-receptor interaction pathway and protein digestion and adsorption pathway (Fig. 6b). To make the enrichment pathways more comprehensive, we further performed gene set enrichment analysis (GSEA) on all DEGs. The results showed that they were signi cantly enriched in KEGG pathways such as calcium signaling pathway, pathways in cancer, cytokine-cytokine receptor interaction, MAPK signaling pathway and neuroactive ligand-receptor interaction (Fig. 6c) and signi cantly enriched in Wiki pathways such as focal adhesion PI3K/Akt/mTOR signaling pathway, sudden infant death syndrome-sids susceptibility pathways, MAPK signaling pathway, PI3K/Akt signaling pathway and nuclear receptors meta pathway (Fig. 6d).

Prognostic analysis of EPOR in LUAD patients
We rst analyzed the prognosis of EPOR in LUAD patientsin KM Plotter database. In OS, 37986_at and 216999_at probes showed that patients with high EPOR mRNA expression in LUAD had a better prognosis, and 209962_at, 209963_s_at and 396_f_at probes showed that patients with high EPOR expression in LUAD had a worse prognosis (Fig. 7a). In PFS, the 37986_at, 209962_at and 209963_s_at probes showed that patients with high EPOR mRNA expression in LUAD had a better prognosis, and the 396_f_at probe showed that patients with high EPOR expression in LUAD had a worse prognosis (Fig. 7b).
We then performed a further prognostic analysis in the PrognoScan database, which showed that high EPOR expression was associated with better OS in the GSE13213, GSE31210, jacob-00182-MSK and jacob-00182-UM LUAD datasets, and in the jacob-00182-CANDF dataset, high EPOR expression was associated with poorer OS (Fig. 7c).

Validation of EPOR prognostic value based on clinicopathological data of patients
To validate the prognostic value of EPOR expression in LUAD patients, we performed a prognostic analysis using detailed clinicopathological data from our 92 cases of lung adenocarcinoma tissue microarrays (HLugA180Su04). KM analysis showed poorer OS in the EPOR high expression group (29.5 vs 46 months) (Fig. 8a).To test the independence of EPOR as a prognostic factor, we performed the Cox risk regression analysis.Multivariate Cox risk regression analysis showed that T3-T4 stage, AJCC stage III/IV, high expression of EPOR and ALK (+) were associated with OS (Table 2). KM analysis and multivariate Cox analysis demonstrated better agreement on EPOR being a risk factor for OS in LUAD.Subsequently, based on the results of multivariate Cox risk regression analysis, we constructed a prediction model, and the nomogram more visually demonstrated the effect of EPOR on 5-year survival rate (C-index = 0.711) (Fig. 8b). The calibration curve showed good agreement between our results and predicted values in the 5-year survival rate (Fig. 8c).

Discussion
Among cancers, lung cancer has a high incidence and mortality rate [1], among which the survival rate of lung adenocarcinoma is not ideal due to the heterogeneity of tumor and the lack of prognostic markers [4][5][6]. Meanwhile, because the expression of EPOR and its correlation with patient prognosis in LUAD are not well de ned, we explored the expression of EPOR in LUAD and the possibility of EPOR as a prognostic marker by combining various bioinformatics methods, immunohistochemical methods, and patients clinicopathological data. In this study, using TIMER, UALCAN, GEPIA, Oncomine and TCGA bioinformatics databases and immunohistochemistry we found that the EPOR mRNA expression in LUAD tissues was possibly downregulated compared with that in normal lung tissues, but the EPOR protein expression in LUAD tissues was higher than that in paired normal lung tissues (Fig. 1, 2; Supplementary Fig. 1). This showed that the mRNA and protein levels of EPOR may beinconsistent, complementing the EPOR expression data in LUAD. To this end, we explored the muTarget database and found that mutations in ve genes in LUAD, DDX60L, LGR6, POTEB3, RIF1, and SOX5, cause downregulation of EPOR expression, and mutations in 10 genes including C1orf168, DBX2 and EIF5B lead to upregulation of EPOR expression (Fig. 3). The expression of ALK genes may also affect EPOR expression (Fig. 2). We also found that EPOR-interacting proteins include EPO, JAK2, STAT5 and so on (Fig. 4), this fully illustrated that EPOR interacts with EPO and is closely associated with the activation of JAK2 and STAT5 [12,13]. Based on patient samples in the cBioPortal database, approximately 0.7% of LUAD patients had EPOR genetic alterations, with "Deep Deletion" being the more common type and EPOR alteration frequency in NSCLC was also only 2% ( Supplementary Fig. 2). This was consistent with previous reports, which EPOR mutations were very rare [44]. The TIMER database showed that EPOR expression in LUAD was negatively correlated with in ltration of CD8+ T cells and positively correlated with in ltration of B cells and CD4+ T cells (Fig. 5a). EPOR expression correlated with CD19 (B cell), CD11c (DC), STAT6 (Th2), BCL6 (Tfh) and TGFB1 (Treg) gene markers with coe cients above 0.2, but overall there were weak correlation with immune cell markers ( Table 1). The in ltration levels of B cells, CD4+ T cells, macrophages, neutrophils and DCs in the "Arm-level Deletion" somatic copy number state of EPOR were signi cantly different compared to normal levels (Fig. 5b). In addition, EPOR expression in LUAD was negatively correlated with ImmuneScore, StromalScore and ESTIMATEScore (P < 0.05), but not signi cantly( |r| ≤ 0.2) (Fig. 5c). These results illustrated that EPOR expression in LUAD correlated weakly with immune cell in ltration. In the DEGs enrichment analyses, we enriched the information related to the nervous system, which validated the expression of EPOR on neurons [19]. And we also found that EPOR was closely associated with pathways such as MAPK signaling pathway and PI3K/Akt signaling pathway, which con rmed that activation of MAPK, PI3K and Akt cannot be achieved without the expression of EPOR [15,16]. (Fig. 6) Finally, the KM Plotter database showed that different prognostic results may be obtained when using different probes to detect EPOR, and the PrognoScan database also showed different prognostic results in different LUAD datasets, but consistently concluded that EPOR was associated with prognosis in LUAD patients (Fig. 7). Our clinicopathological data showed that high EPOR expression was associated with poorer OS, and that high EPOR expression was an independent risk factor for prognosis in LUAD patients and had a good predictive power for 5-year survival ( Fig. 8; Table  2). These evidences suggested that EPOR may be an independent and novel prognostic marker for LUAD.
EPOR is present not only in hematopoietic cells but also in non-hematopoietic cells such as neurons [19], endothelial cells [20], skeletal muscle cells [45]  EPO plays a role in promoting proliferation in hematopoietic progenitor cells and may also promote the growth of tumor cells, thus promoting tumor progression and metastasis, leading to poor prognosis of patients. And as the receptor of EPO, EPOR's effect on prognosis of cancer patients has been also discussed. On the one hand, high EPOR expression in locally advanced squamous cell carcinoma of the head and neck was considered to be an independent prognostic factor for OS and was associated with poorer OS [35], and activation of EPOR in melanoma was thought to promote tumor progression and contributed to survival of tumor cells [52], meanwhile, inhibition of EPOR gene expression in NSCLC reduced the growth of NSCLC cells under hypoxia [53]; on the one hand, there was no signi cant difference in survival rates between patients with different EPOR expression in gastric and cervical adenocarcinoma [28, 36]; on the other hand, recurrence-free survival was signi cantly improved in ER+/EPOR+ breast cancer patients with untreated tamoxifen in breast cancer [54], and in the breast cancer cell lines, RAMA 37 cells (low EPOR expression) had a stronger proliferation ability than RAMA 37-28 cells (high EPOR expression), suggesting that high EPOR expression can reduce the ability of cells to divide [55], in addition, high levels of EPOR mRNA in myeloma were associated with better survival [37]. In some data, EPO/EPOR co-expression or co-overexpression in NSCLC was associated with poor prognosis [38][39][40], but independent evidence on the prognostic impact of EPOR on patients was lacking. Our results showed that high EPOR expression was associated with poorer prognosis, which may contradict the conclusions of Rozsas et al. [56].
This study improves our understanding of the relationship between EPOR and LUAD, but there are still some limitations. First, this study was based on bioinformatics analyses and combined with immunohistochemistry and some clinicopathological data, but lacks experimental depth. Second, most of our analyses were based on the mRNA levels of EPOR, and the inclusion of more evidence at the protein levels would be closer to the truth and more convincing. Third, due to our limited funds, we could only rely on bioinformatics databases for the detection of EPOR mRNA expression levels, which could not be veri ed by RT-PCR assays. Fourth, the prognostic results in the bioinformatics analysis were not very uniform, which may be related to factors such as the study subjects; for example, the GSE13213 dataset included only patients with adenocarcinoma with a high probability of recurrence, and the GSE31210 dataset included only patients with stage I-II lung adenocarcinoma.Although our clinicopathological data validation results showed that high EPOR expression was associated with poorer prognosis, prospective trials with large clinical samples are still needed for validation.

Conclusions
EPOR expression may be downregulated at the mRNA levels and signi cantly upregulated at the protein levels in LUAD. The high expression of EPOR is associated with poor prognosis and may be a potential new prognostic marker for LUAD. Overall, our study will provide a solution ideas to the hot issue of poor survival rate that has been a problem in LUAD and provide new evidence for the study of EPOR in cancer.At the same time, our ndings agree on the need to further explore the prognosis of EPOR in LUAD using a multicenter, large-sample clinical study.  EPOR expression scores were calculated by multiplying staining intensity scores and positive rate scores.
The paired plot depicts EPOR expression in 73 pairs of LUAD tissues and adjacent normal lung tissues.
(c) The dot plots in the rst line describe the EPOR expression between the high and low expression groups of ALK, EGFR and PD-L1 in TCGA database (grouped by median expression).The dot plots in the second line are based on the FISH data of ALK and EGFR and the immunohistochemical data of PD-L1 from the pathological dataof tissue microarrys. EPOR expression score was used as ordinate to describe the expression of EPOR in each group. ns: p > 0.05, **p < 0.01, ***p < 0.001.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.