Genomic analysis and filtration of novel prognostic biomarkers based on metabolic and immune subtypes in pancreatic cancer

Patients with pancreatic cancer (PC) can be classified into various molecular subtypes and benefit from some precise therapy. Nevertheless, the interaction between metabolic and immune subtypes in the tumor microenvironment (TME) remains unknown. We hope to identify molecular subtypes related to metabolism and immunity in pancreatic cancer Unsupervised consensus clustering and ssGSEA analysis were utilized to construct molecular subtypes related to metabolism and immunity. Diverse metabolic and immune subtypes were characterized by distinct prognoses and TME. Afterward, we filtrated the overlapped genes based on the differentially expressed genes (DEGs) between the metabolic and immune subtypes by lasso regression and Cox regression, and used them to build risk score signature which led to PC patients was categorized into high- and low-risk groups. Nomogram were built to predict the survival rates of each PC patient. RT-PCR, in vitro cell proliferation assay, PC organoid, immunohistochemistry staining were used to identify key oncogenes related to PC High-risk patients have a better response for various chemotherapeutic drugs in the Genomics of Drug Sensitivity in Cancer (GDSC) database. We built a nomogram with the risk group, age, and the number of positive lymph nodes to predict the survival rates of each PC patient with average 1-year, 2-year, and 3-year areas under the curve (AUCs) equal to 0.792, 0.752, and 0.751. FAM83A, KLF5, LIPH, MYEOV were up-regulated in the PC cell line and PC tissues. Knockdown of FAM83A, KLF5, LIPH, MYEOV could reduce the proliferation in the PC cell line and PC organoids The risk score signature based on the metabolism and immune molecular subtypes can accurately predict the prognosis and guide treatments of PC, meanwhile, the metabolism-immune biomarkers may provide novel target therapy for PC


Introduction
Pancreatic cancer (PC) is a highly malignant tumor that beyond 50% of PC patients are diagnosed with distant metastasis and the five-year relative survival rate of all stages is merely 9% [1].
Guangyu Chen, Yueze Liu and Dan Su contributed equally.
Based on demographic and annual death rates, approximately 63,000 patients will die within 88,000 new cases of PC and become the second cause of cancer-related death over the next decades in the U.S [2].PC patients can be treated with curative surgical resection accounts for less than 20%, most patients with resectable tumors experience recurrent risk [3,4].Though the substantial progress on various reagents and surgery techniques, the prognosis of PC patients has not improved significantly.In clinical practice, classification of PC on the basis of histopathological type is much common and treatment options of different types of patients with PC are diverse.An increasing number of molecular profiling studies have illuminated that PC patients can benefit from some specific therapy based on the effective molecular subtype [5][6][7].Hence, it is crucial for PC patients to carry out individual precision therapy and perfect the genomic landscape in the future.
Cancer cell growth involves alteration of energy metabolism referred to as reprogramming energy metabolism which is a major hallmark of cancer [8].Whatever anomalous glucose, lipid, and amino acids metabolism play vital roles in pancreatic cancer development, immunosuppression, chemoresistance, etc. [9].Sivakumar et al. indicated that PC could be classified into 3 subtypes (Hedgehog, Notch, and cell cycle), among them, the cell cycle subtype was characterized with the largest mutational burden, scarce immune infiltration, and metabolic adaption [10,11].Recently, PC was identified by 4 subtypes (quiescent, glycolytic, cholesterogenic, and mixed metabolic subtypes), and the cholesterogenic subtype had the best prognosis, while the glycolytic subtype was opposite [12].Furthermore, the progress on metabolism-related molecular subtyping may predict which patients may respond to effective therapy [13].Therefore, the subtyping of metabolic alterations is greatly significant and can provide novel personalized therapy and evaluate the outcome for PC patients.
In our study, we generated the metabolic and immune subtypes for PC.Further, we elaborated that diverse metabolic patterns were distinctly associated with prognosis and tumor immune microenvironment (TIME).The aims of our study are expected to explore novel prognostic markers and guide treatments based on the metabolism and immune molecular subtypes for PC.

Data acquisition and preparation
RNA-seq profiles and matched clinical information of PC patients were extracted from The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC).RNA-seq profiles of normal pancreatic tissues were extracted from the Genotype-Tissue Expression Project (GTEx).We used the sva package to remove the batch effects, and normalized the data by the limma package.Patients with missing clinical information or less than 30 days of survival time were removed for further analysis.The curated gene sets related to metabolism were downloaded from the molecular signatures database (MSigDB v7.4).Then, the univariate Cox regression analysis was utilized to screen the genes related to the overall survival (OS) of PC patients for further analysis.Usage of data complied with the data access policy of TCGA, ICGC, GTEx, and MSigDB.The clinical characteristics of the PC patients were showed in Table S1.

Screening of differentially expressed genes in PC
Firstly, the differentially expressed genes (DEGs) were screened between PC and normal samples utilizing the limma R package [14].The false discovery rate (FDR) < 0.05 and |log 2 fold change (FC)| > 0.5 were set as the threshold criteria.Secondly, we screened the DEGs between metabolic and immune subtypes with threshold criteria of combination with FDR < 0.05 and |log 2 (FC)| > 0.8.

Clustering analysis
The ConsesusClusterPlus package was used to explore a metabolic molecular classification of PC patients based on metabolism-related genes.The optimal subtyping was selected by the relative change in the area under the cumulative distribution function (CDF) curves and the consensus heatmap.We calculated the enrichment scores of the 29 immune genes set in each PC sample by the single-sample gene-set enrichment analysis (ssGSEA) [15].Next, Ultimately, hierarchical clustering was used to identify immune molecular subtypes of PC patients based on the enrichment scores of the 29 immune gene set.

Survival analysis
Kaplan-Meier (K-M) survival analyses were used to evaluate the relation between gene expression level and survival time of PC patients.The survminer package was used to get the optimal cut-off values of each data.

Gene set variation analysis (GSVA)
To evaluate the most distinctly enriched KEGG pathways between the two metabolic subtypes, we used the GSVA and limma package that analyzed differential enrichment scores of KEGG pathways.The adjusted P value < 0.05 was considered the most differentially enriched pathways between the two metabolic subtypes of cluster 1 and cluster 2.

Evaluation of TME cell infiltration in PC
We utilized 'Estimation of Stromal and Immune cells in MAlignant Tumours using Expression data' (ESTIMATE) to estimate the TME components including immune cell infiltration, stromal content, and tumor purity based on the gene expression profiles of PC [16].

Assessment of tumor mutation burden (TMB)
The somatic mutation data of PC patients were obtained from the TCGA database.The TMB was assessed by dividing the total count of variants by the whole length of exons.Somatic mutation data were visualized via the Maftools package [17].

Construction of the predictive signature
We used the univariate Cox regression analysis and the LASSO binary logistic regression model to screen 6 selected prognostic genes.Then, the linear combination of the regression coefficients (β) was utilized to construct the predictive signature.Risk score = (β1 × expression level of AHNAK2) + (β2 × expression level of LIPH) + (β3 × expression level of KLF5) + (β4 × expression level of MYEOV) + (β5 × expression level of SFTA2) + (β6 × expression level of FAM83A).All PC patients were stratified into high-and low-risk based on the median risk score.TCGA and ICGC datasets were utilized to be training and validation cohorts respectively.

Calculation of the metabolic score
First, the Boruta algorithm was conducted for the feature selection of the metabolic genes based on 85 metabolismrelated prognostic genes.Then, we performed principal component analysis (PCA) to construct metabolic gene signatures which were termed as the metabolic gene signatures A and B respectively.Furthermore, principal component 1 was extracted as the metabolic score by the PCA.Finally, we performed a method similar to previous research to assess the metabolic score: that was similar to the previous research [18,19].The formula for calculating the metabolic score for each sample was as follows: Metabolic score =∑PC1 A -PC1 B .

Screening of the clinical factors
To screen the clinical factors which were related to the prognosis of PC patients, we performed univariate Cox regression analyses as a preliminary screening.Next, the multivariate Cox regression analysis was used to narrow the confounding clinical factors.

Building the nomogram
We used all screened independent prognostic factors to build a nomogram.ROC curve analysis was performed to assess the discrimination performance of the nomogram involved with the prognosis of PC patients at 1, 2, 3 years.Calibration curves were performed to assess the predictive capacity in which the 45° line represented the best predictive power.

Gene set enrichment analysis (GSEA)
GSEA software was performed to identify the biological pathways between the high-and low-risk groups.The pathways with P value < 0.05 were considered to be significantly enriched.

Prediction of immunotherapy and drug sensitivity prediction
The Tumor Immune Dysfunction and Exclusion (TIDE, http:// tide.dfci.harva rd.edu/) could model the tumor immune escape and help predict the response of immune checkpoint blockade (ICB) of cancer patients by testing the expression of each gene in tumors [20].Therefore, The TIDE algorithm was used to predict the clinical response to immunotherapy of PC patients using the gene expression profiles of PC samples.Meanwhile, we obtained the immunophenoscore (IPS) of PC patients from The Cancer Immunome Database (TCIA, https:// tcia.at/ home).IPS was calculated by z-scores based on evaluating gene expression in four cell types (effector cells, MHC molecules, immunosuppressor cells, and immune modulators) [21].Furthermore, we predicted the chemotherapeutic drug sensitivity based on the Genomics of Drug Sensitivity in Cancer (GDSC) database (https:// www.cance rrxge ne.org/).pRRophetic package was used to estimate the half maximal inhibitory concentration (IC50).

Cell culture
All pancreatic cell lines were purchased from the American Type Culture Collection (ATCC).HPNE was cultured in high glucose Dulbecco's modified Eagle's medium (DMEM, HyClone).AsPC-1 was cultured in RPMI-1640 medium (HyClone) supplemented with 10% fetal calf serum (HyClone).All cell lines were maintained at 37 °C in a humidified incubator containing 5% CO2.

RNA extraction and RT-PCR
Total RNA was extracted from PC cells by Trizol reagent (Invitrogen, 15,596,018).The cDNA was synthesized by Pri-meScriptTM RT Master Mix (TaKaRa, RR036A).Quantitative PCR was conducted by the TB Green Fast qPCR Mix (TaKaRa, RR430S).The mRNA expression level was normalized to GAPDH using the 2 −Δct .The primer sequences were shown in Table S2.

In vitro cell proliferation assay
To conduct the CCK-8 assay (Donjindo Laboratories, Japan), cells were seeded at a density of 3000 cells per well in 96-well plates, and cell viability was assessed.The plates were incubated at 37℃ for 2 h, and the absorbance was measured at 450 and 630 nm.The optical density (OD) value was calculated as OD 450 nm-OD 630 nm.

Generation of patient-derived PC organoid
The specimens were placed into basal organoid media and sent to the laboratory within 2 h for further processing and organoid generation.The isolation and culture of organoids were based on Tuveson's protocols [22].

Statistical analysis
Each experiment was replicated a minimum of three times.The data were presented as mean ± standard deviation.Statistical analysis was performed using the unpaired t-test for comparison between two groups, and one-way ANOVA followed by Tukey's test for comparison among three or more groups.Correlations between two groups were assessed using the Spearman rank-order correlation analysis.Kaplan-Meier method was used to generate survival curves, which were compared using the log-rank test.A p value of less than 0.05 was considered statistically significant.

Filtration of metabolism-related DEGs with distinct prognosis
The flow diagram of our study is shown in Fig. S1.The curated gene sets were downloaded from the molecular signatures database (MSigDB v7.4).We extracted the metabolism-related gene sets and RNA-seq expression profiles containing 935 genes from the MSigDB v7.4 and TCGA databases.We got the RNA-seq data of 178 PC samples and 171 normal pancreatic tissues (4 from TCGA, and 167 from GTEx).Moreover, we screened the 465 DEGs (254 upregulated and 211 downregulated) with the cutoff value that were |logFC| > 0.5 and FDR < 0.05.The volcano plot of DEGs was shown in Fig. 1A.Furthermore, the heatmap displayed that the distribution of DEGs in pancreatic neoplastic and normal tissues (Fig. 1B).
Next, the KEGG pathway and GO functional enrichment analyses were utilized to investigate the potential biological functions of DEGs.In the KEGG pathway enrichment analysis, the top 6 enriched pathways were 'Purine metabolism', 'Glycerophospholipid metabolism', 'Arachidonic acid metabolism', 'Carbon metabolism', 'Drug metabolism-cytochrome P450', and 'Drug metabolism-other enzymes' (Fig. 1C).The GO functional enrichment analysis of biological processes (BP), cellular components (CC), and molecular function (MF) was shown in Fig. 1D.The top three of BP were 'Small molecule catabolic process', 'Organic acid biosynthetic process', and 'Purine-containing compound metabolic process'.The top three of CC were 'Mitochondrial matrix', 'Peroxisome', and 'Microbody'.In addition, the top three of MF were 'Coenzyme binding', 'Lyase activity', and 'Oxidoreductase activity, acted in paired donors, with incorporation or reduction of molecular oxygen'.Finally, we utilized the univariate Cox regression analysis to screen the genes related to prognosis for further analysis, 85 metabolism-related genes which were significantly related to the overall survival (OS) of PC patients were selected (P < 0.01).

Identification of two molecular metabolic subtypes and correlation with outcomes, clinical features, and functional mechanism
According to the above 85 metabolism-related genes, we used unsupervised consensus clustering to identify the metabolic molecular subtypes in PC patients from the TCGA database.As shown in Fig. 2A-B, we calculated the cumulative distribution function (CDF) curves and consensus heatmap at k value = 2 to 9 (Fig. S2) and found k value = 2 was the optimal number of clusters.Finally, a total of 171 PC patients were classified into two clusters of which cluster 1 was composed of 46 patients and cluster 2 had 125 patients.Meanwhile, K-M survival analysis exhibited that patients in cluster 1 had significantly longer OS than those in cluster 2 (P < 0.001; Fig. 2C).Then, we visualized the expression patterns of two metabolic subtypes and analyzed the clinical information (Stage, grade, gender, age, and survival status) between two clusters of PC patients in Fig. 2D.The results indicated that the expression levels of different metabolic genes were characterized with two distinct distributions in cluster 1 and cluster 2. Among the various clinical parameters, the proportion of PC patients with pathological grade 1 to grade 2 accounted for 79% in cluster 1 which was significantly higher than 67% of PC patients with pathological grade 1 to grade 2 in cluster 2 (P < 0.05; Fig. 2E).Moreover, 69% of PC patients were alive during the follow-up period in cluster 1 that had better outcomes than 35% of alive PC patients (P < 0.001; Fig. 2E).However, clinical stage, gender, and age were no statistical differences between the two clusters.
Subsequently, GSVA analysis was used to explore the underlying molecular mechanisms.There were 3 pathways including glycosphingolipid biosynthesis ganglio series, olfactory transduction, and neuroactive ligand-receptor interaction were activated in cluster 1.In addition, 35 pathways were positively correlated with cluster 2 (Fig. 2F).The pathways of cluster 2 were mainly enriched in tumorigenic pathways (including cell cycle, tight junction, and notch signaling pathway), metabolic pathways (including glutathione, histidine, and drug metabolism), and DNA damage repair pathways (including base excision repair and homologous recombination).

Identification of two immune subtypes and exploration between TIME patterns and metabolic subtypes
We used ssGSEA analysis to calculate the enriched score of diverse immune cell types, functions, and signaling pathways based on 29 immune-related gene sets in the PC samples [15].Ultimately, PC patients were categorized into two subtypes as high and low immunity by hierarchical clustering (Fig. 3A).38 (21.3%)PC patients were defined as immune hot tumors, whereas, the rest of 140 (78.7%)patients were defined as immune cold tumors.Next, the ESTIMATE algorithm was utilized to assess the stromal levels, the levels of immune cell infiltration, ESTIMATE score, and tumor purity for each PC patient.The results in Fig. 3B indicated that the immune, stromal and ESTIMATE scores were higher in the high immunity subtype than in the low immunity subtype, meanwhile, tumor purity was reverse (all P < 0.001), illustrating affluent enrichment of immune and stromal cells in high immunity subtype.Subsequently, we used the CIBERSORT algorithm to quantify the contents of 22 immune cells (Fig. 3C).The fractions of naïve B cells, plasma cells, CD8 + T cells, activated memory CD4 + T cells, and gamma delta T cells in the high immunity subtype were larger than those in the low immunity subtype (all P < 0.05).Besides, the low immunity subtype contained the larger fractions of resting memory CD4 + T cells, activated NK cells, M0 macrophages, and M2 macrophages than the high immunity subtype (all P < 0.05).Furthermore, cluster 1 had the higher stromal score, immune score, and ESTIMATE score, moreover, lower tumor purity than cluster 2 (Fig. 3D; All P < 0.001).As shown in Fig. 3E, the higher contents of plasma cells, CD8 + T cells, activated memory CD4 + T cells, and monocytes were enriched in cluster 1 (all P < 0.05).Cluster 2 had higher fractions of M0 macrophages and activated dendritic cells than cluster 1 (all P < 0.05).

Characteristics of a metabolic scoring system and TMB landscape in metabolic subtypes
The above analyses of molecular subtyping described the landscape of population distribution of PC patients which could not exactly evaluate the metabolic patterns in individual patients.Therefore, we utilized the PCA to construct a metabolic scoring system to quantify the metabolic score of each PC.All PC patients were stratified into the high and low metabolic score groups according to the median metabolic score.
K-M survival analysis showed that the patients in the high metabolic score group had worse outcomes than the low metabolic score group (Fig. 4A). Figure 4B demonstrated that the distribution of metabolic score groups and survival states of PC patients in the two metabolic clusters by the Sankey diagram.
Next, we analyzed the differences of somatic mutation between low and high metabolic score groups, moreover, the top mutated genes of those two metabolic score groups were separately shown in Fig. 4C and D. The proportion of patients with KRAS mutations in the high metabolic score group (80%) was distinctly higher than it in the low metabolic score group (22%).The mutation frequencies of TP53 (72%), CDKN2A (28%), SMAD4 (25%) in the high metabolic score group were higher compared with the frequencies (32%, 5%, and 8% separately) in the low metabolic score group.Besides, other proportions of differentiated mutated genes (TTN, RNF43, USH2A, BTBD11, etc.) were also higher in the high metabolic score group than the low metabolic score group.As shown in Fig. 4E, the high metabolic score group possessed higher TMB than the low metabolic score group (P = 2.1 × 10 -5 ).Simultaneously, The metabolic score and TMB presented a significant positive correlation (Fig. 4F; R = 0.37, P = 9.5 × 10 -6 ).In addition, the PC patients with high TMB were significantly correlated with a shorter OS by K-M survival analysis (Fig. 4G).Further, we found that the patients with low TMB and the low metabolic score had the best prognosis compared to other patients with the combination of high/low TMB and high metabolic score (Fig. 4H, all P < 0.05).

Construction of the prognostic risk signature related to the metabolic and immune subtypes in PC
TCGA cohort was considered into the training set.Firstly, we intersected the DEGs between the DEGs of two metabolic and immunity subtypes and got 331 genes for further analyses (Fig. 5A).We utilized the univariate Cox regression analysis to narrow the candidate genes and identified the 43 genes associated with OS (P < 0.05) (Table S3).Moreover, the six screened genes (AHNAK2, LIPH, KLF5, MYEOV, SFTA2, and FAM83A) whose coefficients were not zero and were required for appearing 1000 times of 1000 repetitions in the LASSO logistic regression model (Fig. 5B, C).Meanwhile, those six genes were all related to the outcomes of PC patients distinctly (Fig. S3).Meanwhile, the result of K-M survival analysis indicated that the OS of PC patients with high-risk scores was poorer than the patients with low-risk scores (P = 3.641 × 10 − 7 ; Fig. 5D).
The scatter plot of risk scores and survival status of each PC patient, and the heatmap of six selected genes expression profiles were visualized in Fig. 5F, H. Furthermore, the prognostic signature showed well accuracy in predicting 1-, 2-and 3-year survival rates of which AUC values were 0.736, 0.739, and 0.746 respectively in the TCGA training cohort by the time-dependent ROC analysis (Fig. 6A).
To further evaluate the predictive capacity of the risk signature, we used the same manner to predict prognosis in the ICGC cohort regarded as an independent external validation cohort.184 PC patients in the ICGC cohort were divided into the low-risk and high-risk groups with the same median risk score.Figure 5G and I displayed that the scattergram of risk scores and survival status and heatmap plot of each selected gene expression.In accord with the previous results, the PC patients from the high-risk group had a distinctly worse OS than the low-risk group (P = 1.269 × 10 − 2 ; Fig. 5E).In addition, the time-dependent AUCs with the prognostic risk signature exhibited the favorable predictive ability of which the 1-, 2-and 3-year survival rates were 0.677, 0.714, and 0.687 respectively in the ICGC validation cohort (Fig. 6B).
Additionally, we analyzed the proportion of high/low metabolic score group, immunity subtype, and TMB in the risk groups of PC patients.91% of patients with high metabolic scores were in the high-risk group, meanwhile, the proportion of low immunity subtype in the high-risk group accounted for 94% both of which were significantly higher than the low-risk group (Fig. 6C, D).It was shown in Fig. 6E that 62% of PC patients in the high-risk group possessed high TMB which was higher than 45% of patients with low TMB in the low-risk group.Moreover, we also analyzed the relationship between the risk groups and the clinical variables (age, gender, T stage, N stage) that all had distinctly statistical significance (Fig. S4).Finally, we performed the GSEA to analyze the underlying molecular mechanisms by comparing various signaling pathways between the high-risk group and low-risk group of PC patients.The top 5 signaling pathways were exhibited in Fig. 6F and G.The signaling pathways of 'Amyotrophic lateral sclerosis', 'Calcium signaling pathway', 'Fc epsilon RI signaling pathway', 'Long term depression', and 'Neuroactive ligand-receptor interaction' were enriched in the low-risk group.Furthermore, the top 5 enrichment signaling pathways of the high-risk group were 'Base excision repair', 'Cell cycle', 'Glycosphingolipid biosynthesis lacto-and neolacto-series', 'P53 signaling pathway', and 'Pyrimidine metabolism'.In summary, the enrichment of those signaling pathways indicated the potential roles of the DEGs between metabolic and immune subtypes in the progression of PC.

Construction and evaluation of the nomogram for PC patients
First of all, we identified the valid clinical parameters relevant to prognosis by the univariate and multivariate Cox regression analyses.As a consequence of univariate Cox regression analysis in Fig. 7A, we demonstrated that age, N stage, number of positive lymph nodes, and high/ low-risk group were correlated with OS (all P < 0.05).Besides, we used the multivariate Cox regression analysis to further infiltration and draw the conclusion that age, number of positive lymph nodes and, high/low-risk group were independent prognostic clinical variants (all P < 0.05) which were selected to generate the nomogram for PC patients (Fig. 7B).Next, we constructed a nomogram using those independent clinical factors and performed the calibration curves to assess the accuracy of predicting 1-, 2-, and 3-year survival rates (Fig. 7C and D).The 45-degree line of the calibration curve meant the optimal capacity of prediction so that we could demonstrate that this nomogram had a valid and accurate capacity for predicting the prognosis.In addition, we used the ROC curve analysis to validate the predictive abilities of 1-, 2and 3-year survival rates, with AUC values of 0.792, 0.752, and 0.751 severally which were all superior to single usage of age, the number of positive lymph nodes, and risk score (Fig. 7E).

Exploration of the therapeutic response based on risk score group for PC
We utilized Genomics of Drug Sensitivity in Cancer (https:// www.cance rrxge ne.org) to predict the chemotherapy response of high-and low-risk groups to screen the valid chemotherapy drugs.Figure 8A showed that 8 drugs (Erlotinib, tipifarnib, dasatinib, docetaxel, doxorubicin, paclitaxel, rapamycin, bortezomib) sensitively responded to the PC patients in the high-risk group of which the estimated IC50 were all lower than the low-risk group (all P < 0.05).Furthermore, we assessed the immunogenicity of high-and low-risk groups was by IPS analysis, but the scores of IPS, IPS-CTLA4, IPS-PD1, and IPS-PD1-CTLA4 were no statistical significance between high-risk and low-risk (Fig. S5A-5D).The results indicated that PC patients with low-or high-risk might not present effective responses for immunotherapy.Meanwhile, we used the TIDE algorithm to evaluate the capacity of risk score served as a biomarker to predict the clinical response to immune-checkpoint blockade (ICB) therapy.Unfortunately, the results showed that there were also no differences between the TIDE scores of the high-and low-risk group, indicating undesirable responses to ICB therapy (Fig. S5E).
The above results indicated that the therapeutic effects of immunotherapies in PC patients requested further studies to verify due to the poor prediction results.

Expression of six genes in pancreatic cells and HPA database
To verify the expression of the six genes in the risk signature, we analyzed the protein and mRNA expression from the HPA database and pancreatic cells.AHNAK2, KLF5, LIPH, and MYEOV were positive in PC tissues compared with the normal pancreatic tissue (Fig. 8B and E).However, SFTA2 and FAM83A were not found in the HPA database.Meanwhile, we detected the mRNA expression of six genes in the PC cell line AsPC-1 and normal human pancreatic ductal epithelium cell line HPNE.Figure 8F showed that FAM83A, KLF5, LIPH, and MYEOV were up-regulated in AsPC-1.While AHNAK2 was down-regulated in AsPC-1.The expression of SFTA2 was not statistically different between PANC-1 and HPNE.

Experimental validation
After analyzing the expression of six genes in PC cell lines and the HPA database, we selected FAM83A, KLF5, LIPH, and MYEOV for further investigation of their role in PC cell line and organoid proliferation.The CCK-8 assay revealed that knockdown of these four genes significantly reduced proliferation in the AsPC-1 cell line (Fig. 9A and D).We then established PC organoid lines using patient tissues obtained from surgical resection and found that knockdown of FAM83A, KLF5, LIPH, and MYEOV resulted in a significant reduction in the size and number of PC organoids (Fig. 9E and I).Additionally, we examined the expression of these four genes in 15 human pancreatic tumor tissues and their corresponding normal tissues in the Peking Union Medical College Hospital (PUMCH) cohort.Based on the IHC score, we found that these genes were expressed at higher levels in tumor tissues compared to normal tissues (Fig. 9K, J).Overall, our experimental validation suggests that FAM83A, KLF5, LIPH, and MYEOV play a significant role in the growth of PC cell lines and patient-derived organoids(PDOs).

Discussion
Aberrant metabolic reprogramming is considered a contribution to the progression of pancreatic [9].Emerging studies demonstrated that metabolism took a crucial role in immunity, tumor microenvironment as well as drug resistance in PC [23].The integration of malignant cells, immune cells, stromal cells, and extracellular components constitute a tumor immune microenvironment of PC [24].Recent years, some bioinformatic studies were conducted and tried to explore metabolic panels and predict the prognosis of patients suffering from PC [25][26][27][28].While, the characterizations of special correlation between immune cells infiltration and metabolism are not comprehensively recognized.Therefore, identification of the distinct roles of metabolic patterns in the TIME will be beneficial to provide novel directions for therapeutic strategies in PC.
Here, we screened 85 metabolism-related genes and revealed two distinct metabolic subtypes.Then, we used ssGSEA and hierarchical clustering analysis to identify two immune subtypes of high and low immunity.The higher infiltration of T cells (both CD8 + and CD4 + ) and B cells were in the high immunity subtype compared with the low immunity subtype which had been partially reported before [29,30].The different metabolic subtypes impacted the prognosis and pathological grades of PC and had significantly obvious immune cell infiltration characterization.Cluster 1 was characterized by the higher content of plasma cells, CD8 + T cells, activated memory CD4 + T cells, monocytes, and lower content of tumor purity and M0 macrophages than cluster 2. Tumor-infiltrating lymphocytes (TIL) in tumor stroma are regarded as the crucial inhibitor in the progression of PC [31].Previous studies had shown that PC patients with increased infiltration of CD4 + , CD8 + T cells, or plasma cells were associated with prolonged survival [32].As far as we know, B cells (e.g.plasma cells) could present antigens to CD4 + and CD8 + T cells involved with immune responses in the tumor microenvironment.Meanwhile, the B cells played an antitumoral role in differentiating into plasma cells which could correlate with improved CD8 + T cell infiltration in tertiary lymphoid structures (TLS) [33,34].Gunderson AJ et al. found that PC patients with abundant TLS had optimal immune memory to suppress disease progression inducing long-term survival following surgery [35].Furthermore, TLS induction could enhance the antitumor capacity of chemotherapy in PC [36].Moreover, M0 macrophages were involved in pancreatic tumor suppression via TNF-α secretion and were an independent predictive factor of a poor prognosis for PC patients [37].Therefore, the results indicated that the patients in cluster 1 with the higher of TLS and lower density of M0 macrophages in TME might lead to a better prognosis than cluster 2.
Next, we established a metabolic scoring system to evaluate the metabolic pattern of individual PC patients.The survival of patients with high metabolic scores was poorer than with low metabolic scores.Further, prior studies indicated that glycolysis had a significant correlation with tumor mutational burden (TMB) in some cancers [38].Meanwhile, the underlying relationship between the high TMB and checkpoint blockade immunotherapy was established and TMB could predict the therapeutic response as a sensitive biomarker in immunotherapy [39].Existing studies elaborated that TMB could significantly affect the prognosis of PC and was associated with immune cells infiltration in the tumor microenvironment [40].Then, we found that metabolic scores of PC patients were positively correlated with TMB, and the patients in combination with low TMB and low metabolic scores had the best prognosis.Hence, these findings could indicate potential differences in the immunotherapy response between the high and low metabolic score groups.Afterward, we selected 6 genes (AHNAK2, LIPH, KLF5, MYEOV, SFTA2, and FAM83A) from the overlapped DEGs between the metabolic and immune subtypes.Crucial roles of 6 genes had been reported in previous studies.The expression of AHNAK2 was higher in pancreatic ductal adenocarcinoma than in the normal pancreas and up-regulated AHNAK2 was associated with poor prognosis [41,42].Additionally, AHNAK2 was involved in tumorigenesis via promoting epithelial-mesenchymal transition (EMT), cancer cell stemness, and TGF-β/Smad3 Pathway [43,44].Meanwhile, AHNAK2 was prominently correlated with the infiltration of diverse immune cells [45,46].LIPH overexpression was a carcinogenic biomarker correlated with immune suppression or evasion in PC [47].In terms of molecular mechanisms, LIPH was regulated by HIF-1α that could promote migration and metastasis by EMT in papillary thyroid carcinoma [48].Besides, LIPH knockdown significantly decreased both the ratio of CD44 + /CD24 − stem-like cells and the oxygen consumption rate of the mitochondria in triple-negative breast cancer [49].KLF5 was significantly associated with pancreatic cancer risk and could bind to the PLA2G16 promoter as a transcription factor to enhance glycolysis and growth in PC [50,51].Li J et al. demonstrated that KLF5 promoted infiltration of myeloid cells and exclusion of T cells to affect antitumor immunity in TME of pancreatic cancer [52].MYEOV enhanced glycolysis and SOX9 transactivity to promote pancreatic cancer progression [53,54].Specifically, MYEOV expression was correlated with infiltrating levels of B cells, T cells, dendritic cells, and neutrophils in non-small cell lung cancer [55].SFTA2 was up-regulated and significantly correlated with poor OS in patients with PC by bioinformatic analysis from TCGA and GEO database [56].FAM83A was a direct target of MIR454 and up-regulated TSPAN1 through the WNT-CTNNB1 signaling pathway to promote autophagy flux in PC [57].Furthermore, overexpression of FAM83A markedly promoted cancer stem cell-like phenotype via Wnt/β-catenin and TGF-β/Smad pathways and enhanced chemoresistance in PC [58].FAM83A was correlated with TILs in smokingrelated lung adenocarcinoma (LUAD) [59].In addition, FAM83A could promote PD-L1 through the ERK pathway which led to immune escape of LUAD [60].Therefore, in accord with the results in our study that overexpression of AHNAK2, LIPH, KLF5, MYEOV, SFTA2, and FAM83A could predict unfavorable outcomes and provide potentially significant targets to elucidate molecular mechanisms of PC related to metabolism and immunity.Furthermore, the nomogram model has been widely utilized to predict prognosis in clinical practice [61].We constructed a nomogram for the predicting prognosis of individual PC patients in our study.As shown in the calibration plots and ROC curves, the nomogram had an excellent predictive capacity.This provides a hint that the nomogram with the combination of sequencing data and clinical parameters might be superior to translational factors for the clinical application prospect of predicting prognosis in PC.
The resistance to chemotherapy of PC causes a dismal prognosis [62].Thus, we explored different treatment strategies based on the above high-and low-risk groups.In our current study, the patients in the high-risk group seemed to benefit from multiple drugs indicating that the selected markers from risk signature could further identify PC patients who were sensitive to chemotherapy.
Certainly, there are some limitations to this study.Firstly, the risk model and nomogram should be validated in a multicenter and prospective study to further evaluate the practicality and robustness.Secondly, the screened genes are needed to explore and elucidate the molecular mechanisms in vivo and in vitro in PC.Thirdly, all of the sequencing data we used were extracted only from TCGA and ICGC databases, the interpatient variability from the different race or geographical regions is wide.Lastly, the other clinical characteristics used in the nomogram were insufficient due to limited information in the study cohorts.We are sure that a better prognostic nomogram with complete clinical information and sequencing data will be constructed in the future.
In conclusion, we identified two metabolic and immune subtypes which had different tumor microenvironments in PC patients.And we built a risk score system to assess the prognosis of PC patients and select sensitive patients to chemotherapy, moreover, we generated a nomogram to provide promising clinical value to predict the OS for PC patients.Therefore, our prognostic prediction model related to metabolism and immunity may improve the prognosis and even guide the new therapeutic strategies for PC patients.

Fig. 1
Fig. 1 Identification of DEGs related to the metabolism and functional enrichment analysis.A Volcano plot of DEGs related to the metabolism between 185 PC and 171 normal samples.B Heatmap of the metabolic-related DEGs in PC and normal pancreatic tis-

Fig. 2
Fig. 2 Identification of two metabolic subtypes with distinct survival outcomes, clinical features, and functional annotations.A k = 2 was set as the optimal cluster number with consensus clustering matrix.B CDF curves of the consensus score (k = 2 to 9).C Kaplan-Meier survival curves for two metabolic subtypes showed that cluster 2 in PC patients had poorer OS than cluster 1 in the TCGA cohort.D The

Fig. 3
Fig. 3 Identification of two immune subtypes and the TIME features.A The immune subtypes of PC patients were categorized into the 'immune cold' and 'immune hot' groups based on the immune activity of PC.B Comparisons of the infiltration level of stromal, immune cells, the ESTIMATE score, and tumor purity in two immune sub-

Fig. 4
Fig.4 Features of the metabolic scoring system and TMB in metabolic subtypes.A Kaplan-Meier survival curves showed that the high metabolic score group had shorter OS than the metabolic score.b A Sankey diagram showed the distribution of metabolic clusters, metabolic scores, and survival status.Waterfall plots showed the top frequently mutated genes in the low (C) and high metabolic score group (D).and the validation cohort (D).E Box plot exhibited the differences of TMB in the high and low metabolic score groups.F Correlation analysis between the metabolic score and the TMB in different metabolic clusters.G Kaplan-Meier curves showed the PC patients with high TMB had a poorer prognosis than those with low TMB.H Kaplan-Meier curves of survival time for PC patients with 'low TMB & low metabolic score', 'low TMB & high metabolic score', 'high TMB & low metabolic score', and 'high TMB & high metabolic score'

Fig. 5
Fig.5 Construction of a prognostic risk signature related to the metabolism and immunity for PC.A Venn diagrams showed the intersection of DEGs between two metabolic and immune subtypes.B Screening of the optimal tuning parameter (lambda) selection in the LASSO regression with ten-fold cross-validation by the maximum criteria.C Six genes were regarded as candidate genes with nonzero coefficients to build the risk signature by appearing 1000 times of

Fig. 6 Fig. 7
Fig. 6 Prognostic performance, proportion, and GSEA of the prognostic risk signature in PC.The prognostic performances of the prognostic risk signature by the time-dependent ROC curve for predicting the 1-, 2-, and 3-year OS rates in the TCGA training cohort (A) and GEO validation

Fig. 8
Fig.8 The estimation of chemotherapy response of high-and low-risk groups and the expression of six genes in the risk signature.A Boxplot indicated the difference of the IC50 of various drugs based on the GDSC database between the high-and low-risk groups.The protein expression of AHNAK2 (B), KLF5 (C), LIPH (D), and MYEOV (E) in the normal and PC tissues.F The difference of mRNA expression of six genes between HPNE and AsPC-1.*P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, ns, no statistically significant