Identi cation of MARCO and PCSK6 As Key Genes Promoting Hepatic Metastasis in Pancreatic Cancer via Bioinformatics Analysis


 Background: Hepatic metastasis occurred frequently in pancreatic cancer and leads to a poor prognosis. Genes promoting hepatic metastasis would help finding out the way to improve survivals of patients. Methods and results: GSE42952 and GSE71729 out of 181 data serials from Gene Expression Omnibus (GEO) with samples of both hepatic metastasis and primary tumor of pancreatic cancer were chosen to generate differentially expressing genes (DEGs). 217 up-regulated and 257 down-regulated genes overlapped between the two data serials and were defined as hepatic metastasis related DEGs (H-DEGs), among which 27 up-regulated and 2 down-regulated H-DEGs were verified by both tumor tissues and cell lines of pancreatic cancer. Gene ontology (GO) and KEGG pathway analysis were performed and showed reprogramming of tumor metabolism might promote hepatic metastasis. The prognostic significance of H-DEGs were investigated. Up-regulation of MARCO and PCSK6 were both correlated to decreased overall survivals in pancreatic cancer. Immune cell infiltration analysis indicated MARCO involved in regulating tumor immune microenvironment. Gene set enrichment analysis found PCSK6 was related to activations of critical pathways and regulation of glycolysis. Conclusions: MARCO and PCSK6 might promote hepatic metastasis of pancreatic cancer via modulating tumor immune microenvironment and tumor metabolism respectively.


Introduction
Pancreatic cancer was one of the most aggressive and malignant tumors in alimentary system [1]. The 5year survival rate for patients with pancreatic cancer was less than 10% despite great advance in diagnosis and therapeutic regimen [2]. liver was one of the most common sites of distant metastasis and hepatic metastasis was a major reason for impeding curative resections of pancreatic cancer [3].
Investigation of the genes promoting hepatic metastasis will provide unique biomarkers for early diagnosis and promising molecules for targeting therapies, to improve the overall survivals of patients with pancreatic cancer [4,5].
Process of hepatic metastasis consisted of several stages including detachment of tumor cell from primary tumor, dissemination of tumor cell through circulation and settlement in liver [6]. Previous studies had found a serial of novel genes and canonical pathways diving tumor progression, via microarray analysis or high throughput sequencing with samples of tumor tissue, peripheral blood or tumor cell line [7][8][9], however the understanding of hepatic metastasis of pancreatic cancer remained inadequate.
Tumor cell gaining or losing biological properties were dependent on the microenvironment during the process of metastasis and accompanied by variation of expression pro le [10,11]. Due to heterogeneity of primary tumor and dynamic change of microenvironment, investigation of DEGs between primary tumor and adjacent normal tissue may not focus on the genes most correlated to hepatic metastasis of pancreatic cancer.
In this study, we made a comparison between samples of hepatic metastasis and primary tumor to nd out key genes related to hepatic metastasis. The generated DEGs were further veri ed with data from public database and analyzed through online tools or R packages. Promising genes were identi ed and interaction networks were established to explore the potential functions and downstream signaling pathways.

Materials And Methods
Microarray data Data serials were collected from GEO (www.ncbi.nlm.nih.gov/geo/), a public functional genomics data repository. The searching criteria was de ned as pancreatic cancer OR pancreatic tumor OR pancreatic ductal adenocarcinoma (keyword), Homo sapiens (organisms), and expression pro ling by array (study type). Data serials including samples of both hepatic metastasis and primary tumor from surgical resections were chosen for further analysis.

Identi cation of DEGs
Genes expressing in samples of hepatic metastasis were compared to genes expressing in samples of primary tumor in each data serial to generate DEGs respectively by GEO2R (www.ncbi.nlm.nih.gov/geo/geo2r), using the parameters including adjusted p value and Benjamini & Hochberg (False discovery rate). Log transformation was applied to the data. DEGs with Log2 FC ≥1 or ≤-1 and adjusted p value <0.05 were considered as statistically signi cant. DEGs from each data serial were analyzed to obtain overlapped genes by Venn diagrams (bioinformatics.psb.ugent.be/webtools/Venn), which were presented as up-regulated and down-regulated DEGs respectively.

Validation of DEGs in tumor tissues and cell lines
Both up-regulated and down-regulated DEGs were validated in samples of primary tumor compared to samples of adjacent normal tissue from the Cancer Genome Atlas (TCGA) and GTEx through an online tool GEPIA2 (gepia2.cancer-pku.cn/#index). Adjusted p value <0.01 and Log2 FC ≥1 or ≤-1 were de ned as cutoff values.
To exclude the in uence of mesenchymal components, the obtained DEGs were veri ed in cell lines of pancreatic cancer from Cancer Cell Line Encyclopedia (CCLE) through an online tool DepMap Portal (depmap.org/portal/). The mRNA expressing pro les of 13 cell lines from CCLE (ASPC1, BXPC3, CAPAN1, CAPAN2, CFPAC1, HPAC, HPAFII, HS766T, MIAPACA2, PANC1, SU8686, SUIT2, SW1990) were applied. The mRNA expression of each gene was presented as Transcripts Per Kilobase Million (TPM). Log2(TPM+1) <2 was de ned as negative expression. The DEGs in previous step with Log2(TPM+1) <2 in all 13 cell lines were excluded. The veri ed DEGs were presented by heat plot using an online tool Heatmapper (www.heatmapper.ca/). GO and KEGG pathway analysis of DEGs R package clusterPro ler and org.Hs.eg.db were employed to performed GO and KEGG pathway analysis. The cutoff of p value was de ned as <0.05 for both GO and KEGG pathway analysis. Categories of gene ontology consisted of biological process (BP), cell component (CC) and molecular function (MF).

Investigating prognostic DEGs
Cox proportional hazards regression analysis of DEGs was carried out basing on data from TCGA through the online tool GEPIA2. Cutoff of adjusted p value was de ned as <0.05. For overall survival analysis, cohorts were grouped into high and low expressions of DEGs using the median expressions. Kaplan-Meier survival plots were drawn and Log-rank test were employed. Cutoff of p values was de ned as <0.05. For tumor stages analysis, cohorts were grouped by TNM stage and One way ANOVA was adopted. Cutoff of p value was de ned as <0.05.

Exploring interaction networks of DEGs
The corresponding proteins of DEGs were analyzed according to interaction frequency and interaction intensity through an online protein database String (string-db.org/). The minimum required interaction score was set to >0.04 and protein-protein interaction (PPI) networks were generated. PPI networks were visualized by Cytoscape (version 3.8.2). The expressions of corresponding proteins of DEGs in para n embedded samples were validated via The Human Protein Atlas (www.proteinatlas.org/).

Immune cell in ltration analysis
Tumor in ltrating lymphocytes (TILs) in pancreatic cancer were investigated using an online tool (cis.hku.hk/TISIDB/). The correlations between mRNA expressions of MARCO and abundance of TILs in pancreatic cancer were analyzed by Spearman correlation test. The cutoff of p value was de ned as <0.01.

Gene set enrichment analysis
The primary tumors of pancreatic cancer from TCGA were separated into group PCSK6_high and group PCSK6_low according to the median expression of PCSK6. Hallmark gene sets (including 50 gene sets) from Molecular Signatures Database (MSigDB v7.4 www.gsea-msigdb.org/gsea/msigdb) were used to performed Gene sets enrichment analysis (GSEA) between group PCSK6_high and group PCSK6_low. Normalized Enrichment Score (NES) was generated and normalized p value of gene set <0.05 was considered statistically signi cant.

Results
Microarray data information 181 data serials in total were collected from GEO, among which 8 data serials (Supplementary table 1) included samples of both primary tumor and adjacent normal tissue, and 2 data serials (GSE42952 and GSE71729) included samples of both hepatic metastasis and primary tumor (table 1).
Data serials GSE42952 and GSE71729 were selected for following analysis. GSE42952 including 10 primary tumor samples and 7 hepatic metastasis samples from surgical resections, was processed on array platform GPL570. GSE71729 including 145 primary tumor samples and 25 hepatic metastasis samples from surgical resections, was processed on array platform GPL20769.
DEGs in hepatic metastasis of pancreatic cancer Data serials GSE42952 and GSE71729 were analyzed through GEO2R and volcano plots of DEGs in each data serial were showed ( gure 1A and 1B). 1490 DEGs were identi ed between group of hepatic metastasis and group of primary tumor in GSE42952, including 588 up-regulated and 902 down-  The corresponding protein of MARCO and PCSK6 were examined by immunohistochemistry ( gure 6B).
The representative stainings of PCSK6 in pancreatic cancer, normal pancreas tissue and normal liver tissue were showed, while the stainings of MARCO were negative in current samples of pancreatic cancer, normal pancreas tissue and normal liver tissue in database of The Human Protein Atlas.

Expression of MARCO and TILs in pancreatic cancer
As the expression of MARCO might be related to TILs, correlation between mRNA expression of MARCO and 8 types of TILs including Neutrophil, Macrophage, natural killer cell (NK), myeloid derived suppressor cell (MDSC), regulatory T cell (Treg), activated B cell (Act_B), activated CD4 T cell (Act_CD4) and activated CD8 T cell (Act_CD8) in pancreatic cancer were analyzed ( gure 6C). 5 types of TILs including Macrophage, MDSC, NK, Treg and Act_CD4 were positively correlated to expression of MARCO (p value <0.01 and rho>0). Treg, MDSC, Macrophage, NK and Act_CD4 were ranked in decreasing order of rho coe cients.

Functional gene sets correlated to expression of PCSK6
Totally 172 samples of pancreatic adenocarcinoma from TCGA were separated into group PCSK6_high (n=86) and group PCSK6_low (n=86). The enrichment list of gene sets (top 20) was ranked by NES (table 3). There were 9 signi cant gene sets (normalized P<0.05) included Notch signaling, Mitotic spindle, TGF beta signaling, Glycolysis, Estrogen response early, Peroxisome, Estrogen response late, Apoptosis and P53 pathway. The enrichment plots of signi cant gene sets were showed ( gure 7A). Top 50 genes in each group were plotted by heatmap ( gure 7B). The correlations between NES and p value of gene sets were visualized ( gure 7C).

Discussion
Liver was one of the most common sites that pancreatic cancer metastasized to and around 40-50% of the rst admitted patients were found with hepatic metastasis, whose prognosis were extremely poor [12,13]. Besides, hepatic metastasis arose early and frequently in patients even if curative resections of tumors had been performed [14][15][16]. It indicated that hepatic metastasis initiated at very early stage and tumor cells had already settled down in liver before operations, while no sign of hepatic metastasis would be found by radiologists. Therefore, it was challenged to improve overall survivals of patients with pancreatic cancer.
In this study we established a comparison between hepatic metastasis and primary tumor to explore DEGs, focusing on the genes associated with hepatic metastasis. We found 588 and 445 up-regulated genes in two data serials respectively, meanwhile 902 and 608 down-regulated genes were found respectively. Overlapped DEGs including 217 up-regulated H-DEGs and 257 down-regulated H-DEGs showed consistent differential expressions across two data serials.
Due to requirements for adaptations to different microenvironments, the transcriptional pro les within primary tumor might change within metastatic lesions of distant organs [17,18]. If the H-DEGs showed differential expressions only in hepatic metastasis (compared to primary tumor), these H-DEGs known as passenger genes might accommodate the colonized tumor cells to a new microenvironment. Instead, if the H-DEGs showed differential expressions in both hepatic metastasis (compared to primary tumor) and primary tumor (compared to adjacent normal tissue), these H-DEGs known as driver genes might initiate metastasis, promote metastasis and establish tumor colonies in liver. In the latter condition, the colonized tumor cells in liver would show an enrichment of subclones from the primary tumor with driver H-DEGs.
The driver H-DEGs functioning in both primary tumor and hepatic metastatic lesion, would be promising therapeutic targets and diagnostic markers in pre-metastasis or post-metastasis timescale. Therefore, a further comparison between H-DEGs and P-DEGs was undertaken to explore the driver H-DEGs. Subsequently a veri cation in CCLE database was adopt to minimize the in uence of mesenchymal components in the tissue samples. Totally, 27 up-regulated H-DEGs and 2 down-regulated H-DEGs were obtained.
It revealed that the H-DEGs participated in serials of biological regulations including regulation of coagulation, lipid metabolism, secretion of particles and amino acid metabolism through GO and KEGG pathway analysis, instead of classical processes such as cell proliferation, apoptosis, cell migration, EMT and so on in previous studies [19,20]. Reprogramming of tumor metabolism played a signi cant role in tumor metastasis [21,22]. Plenty of tumor metabolism related genes had been discovered [23][24][25][26]. Among the H-DEGs, iron metabolism related genes including GOT1 and CP [27,28], glutamine metabolism related gene GOT1 [29], pyruvate metabolism related gene PC [30], proline metabolism related gene ALDH4A1 [31], and so on were found in this study. Alterations of tumor metabolism including shift of energy source, glucose metabolism through pentose phosphate pathway, lipid synthesis, increase of glutamine consumption and regulations of redox, were compatible with variations of tumor microenvironment, to meet the demands for metastasis. Tumor metabolism related H-DEGs in this study might be a novel sight in hepatic metastasis of pancreatic cancer.
In current study MARCO and PCSK6 were found to be prognostic genes. MARCO was a receptor with collagenous structure, related to removal of pathogens and tumor progression, however it was mainly expressing on macrophage subset [32]. It was unclear whether MARCO functioned in mesenchymal cells, tumor cells, or both during the process of hepatic metastasis in pancreatic cancer. No obvious staining of MARCO protein was found in pancreatic cancer and it was consistent to the result that most of the cell lines of pancreatic cancer didn't express MARCO except BxPC-3 in this study. It indicated that upregulation of MARCO in pancreatic cancer might attribute to mesenchymal components. Tumor in ltrating lymphocytes in tumor microenvironment were analyzed in this study. Treg, MDSC and Macrophage were the top three TILs correlated to up-regulation of MARCO in pancreatic cancer. We subsequently established a PPI interaction network including MARCO and 10 predicted proteins. VSIG4 was a B7 family-related macrophage protein and promoted tumor progression by inhibiting T cell activation [33]. C1QA, C1QB and C1QC were correlated to immune activity of tumor microenvironment and in uenced patient's survivals [34,35]. These evidences suggested MARCO might play a role in regulating tumor immune microenvironment and more investigations would be needed.
PCSK6 was one member of the novel family of proprotein convertases, Ca 2+ dependent serine proteases, which were capable of processing various protein precursors into their active products, including hormones, tyrosine phosphatases, growth factors, metalloproteinases and adhesion molecules [36]. PCSK6 had been found involving in migration of smooth muscle cells and multiple biological behaviors of tumor cells [37][38][39], but the role of PCSK6 in tumor metastasis especially hepatic metastasis remained elusive. In this study PCSK6 was found up-regulated in hepatic metastasis of pancreatic cancer and overexpression of PCSK6 was correlated to decreased overall survivals. Signi cant staining of PCSK6 was veri ed in pancreatic cancer and it was mainly expressing in cytoplasm of tumor cells, in accordance with that PCSK6 was expressing in 10 out of 13 cell lines of pancreatic cancer. Among the PPI interaction network of PCSK6, NGF a nerve growth factor secreted from pancreatic cancer cells and stellate cells was identi ed as a promising therapeutic target in pancreatic cancer [40,41]. Inhibiting PCSK9 increased the expression of MHC I on tumor cell surface and promoted intra-tumor in ltration of cytotoxic T cells, potentially enhancing immune checkpoint therapy [42]. Proprotein convertases participated in the process of lipid metabolism and regulated clearance of plasma low density lipoprotein [43]. Notch signaling, Mitotic spindle, TGF beta signaling, Glycolysis and Estrogen response early were the top 5 gene sets generated by the GSEA analysis in this study, indicating that over-expression of PCSK6 was correlated to activations of these biological process in pancreatic cancer. Therefore, up-regulation of PCSK6 might activate various enzymes, substrates, receptors, chemokines and transcription factors contributing to regulation of critical signaling pathway and reprogramming of tumor metabolism, leading to hepatic metastasis, but elucidations of detailed mechanisms would be needed in the future.

Conclusions
In this study two data serials with samples of hepatic metastasis and primary tumor in pancreatic cancer were chosen to generate a cluster of hepatic metastasis related DEGs. These H-DEGs were further veri ed using samples of tumor tissue and cell lines of pancreatic cancer in public database. Totally 27 upregulated and 2 down-regulated H-DEGs were obtained. The enrichment results of GO and KEGG pathway analysis showed reprogramming of tumor metabolism might play a role in hepatic metastasis of pancreatic cancer. Among the H-DEGs, MARCO and PCSK6 were prognostic genes and correlated to tumor immune microenvironment and reprogramming of tumor metabolism. The H-DEGs without signi cant prognostic indications obtained in this study were also deserved to further explorations. However, there were limitations in this study and further investigations might be performed to nd out the detailed mechanism how these H-DEGs regulated hepatic metastasis of pancreatic cancer.

Competing interests
The authors declared that they had no relevant nancial or non-nancial interests.

Data availability
The original ndings of current study were presented in the article and further inquiries will be available from the corresponding author.

Ethics approval
Ethical review and approval were not required in this study.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.