1. Identification of ECM subtypes in pancreatic cancer
Fourteen pathways related to the pancreatic cancer ECM were selected from the Gene Ontology (GO) and REACTOME pathway databases, including the biological processes of collagen, hyaluronic acid, and laminin, as well as ECM organization, construction, and interaction with cancer cells. GSVA was performed in 174 patients in the TCGA-PAAD cohort to obtain a score for each pathway. By performing hierarchical clustering according to GSVA scores, we obtained 4 patient categories and 2 pathway categories (Fig. 1a). The Type 1 pathways include collagen synthesis, metabolism, binding, ECM construction, and binding of hyaluronic acid, while the Type 2 pathways mainly include the biological processes of hyaluronic acid and laminin. A significant difference in survival was observed among the four groups (P = 0.0021). Cluster 1 (n=57, median survival = NR, 95% Cl 596-NR) and Cluster 3 (n=46, median survival = 913, 95% Cl 568-NR) had better survival than Cluster 2 (n=37, median survival = 518, 95% Cl 353-738) and Cluster 4 (n=34, median survival = 418, 95% Cl 366-732) (Fig. 1b). The contribution of the GSVA score to survival in the Type 1 and Type 2 pathway categories was then analyzed. The results indicated that there was no significant difference in survival between patients with high and low scores in Type 1 pathways, while the GSVA score of Type 2 pathways was significantly correlated with survival (Fig. 1c).
2. Construction of the pancreatic cancer ECM scoring model
The PECMS model was constructed based on the GSVA scores of the Type 2 pathways. Through limma analysis, we screened 31 genes significantly different at the transcriptional level between the Type 2 score-high group and the Type 2 score-low group (Fig. 2a, Supplementary Table S1). Based on these genes, a prediction model was constructed by Lasso regression. When the mean squared error (MSE) of Lasso regression was taken as the minimum value and when one standard error (1 SE) of the MSE was used, 17 and 8 genes were selected for model construction, respectively (Fig. 2b, Supplementary Table S2). The AUCs of the two sets of parameters were 0.93 and 0.90, respectively. Cutoff values for PECMS were decided according to the maximum Youden index (Supplementary Fig. S2). The two sets of parameters had similar AUCs and accuracy rates, but the scoring model based on 1SE MSE significantly reduced the number of variables and had a more balanced sample size between groups in the TCGA-PAAD cohort (81 in the PECMS-high group and 93 in the PECMS-low group) than the model based on the minimum MSE (64 in the PECMS-high group and 110 in the PECMS-low group). Based on the parameters obtained from the 1SE MSE of Lasso regression, a total of 8 genes (COL17A1, AREG, KLHL32, CDA, POSTN, SLC2A1, FN1, and IHNBA) were included to obtain PECMS. The PECMS-high group contained many Cluster 2 and Cluster 4 patients, while the PECMS-low group contained a high proportion of Cluster 1 and Cluster 3 patients (Fig. 2c). Among the solid tumors included in TCGA database, pancreatic cancer ranked 3rd in PECMS and was the highest among all kind of adenocarcinomas (Fig. 2d). Another pancreatic cancer cohort in TCGA, CPTAC-3, was used to verify the above results. The PECMS scores of patients in the PCTAC-3 cohort were close to those of patients in the TCGA-PAAD cohort (Fig. 2e). Except for KLHL32, the remaining feature genes had higher transcriptional levels in the PECMS-high group, while the expression of KLHL32 showed the opposite trend (Fig. 2f). In both cohorts, the median survival of the patients in the PECMS-low group was significantly better than that of patients in the PECMS-high group (TCGA-PAAD: 913 d, 95% Cl 603-NR vs. 481 d, 95% Cl 381-634, P=0.00049, CPTAC-3: 743 d, 95% Cl 599-902 vs. 398 d, 95% 303-603, P=0.0015) (Fig. 2g). In the TCGA-PAAD and CPTAC-3 datasets, the clinical characteristics of gender, age, American Joint Committee on Cancer (AJCC) stage, smoking history and drinking history were similar between the two groups. The PECMS score, site of the primary lesion, and AJCC stage were prognostic risk factors in univariate Cox analysis combining the two datasets. In multivariate Cox regression analysis, the PECMS score was an independent risk factor for prognosis (Fig. 2h).
3. PECMS was associated with ECM characteristics
In both the TCGA-PAAD and CPTAC-3 cohorts, KRAS, TP53, CDKN2A and SMAD4 had high mutation rates. The major mutation type of KRAS was missense mutations, and almost all PECMS-high patients had KRAS mutations, while the proportion of KRAS mutations in patients with PECMS-low was relatively low (Fig. 3a). In both cohorts, the TMB of patients in the PECMS-high group was significantly higher than that of patients in the PECMS-low group (Fig. 3b). In the differential pathway analysis based on the GO database, pathways were enriched in cell-matrix adhesion, ECM structure construction, collagen aggregation, and laminin complex (Supplementary Fig. S3). In addition to the ECM-related pathways used for clustering, two classes of pathways were enriched in pathway pairing analysis: the acute inflammatory response-related pathway and the hypoxia response-related and signal transduction pathway (Fig. 3c). Among the 156 representative genes of metabolic characteristics (Supplementary Table S3) , PECMS-high patients had higher transcriptional levels of glycolysis genes, while PECMS-low patients had higher transcriptional levels of fatty acid synthesis, β-oxidation and glutaminolysis genes (Fig. 3d). This suggests that there may be higher oxidative pressure in the pancreatic cancer tissues of PECMS-high patients. This result was verified in the analysis of the mRNA expression levels of oxidative stress marker genes . Almost all these genes had significantly higher mRNA levels in the PECMS-high group (Fig. 3e). Interestingly, although there is a strong relationship between hypoxia and angiogenesis, no general significant differences were observed between the two groups in the expression of angiogenesis marker genes, including HIF-1α (HIF1A, a key gene for signaling in hypoxia) (Supplementary Fig. S4a). At the pathway level, there was also no significant difference in the GSVA score of angiogenesis between the two groups (Supplementary Fig. S4b). This suggests that the downstream signaling of hypoxia is at least partly blocked.
We then analyzed the immune characteristics of the different PECMS groups. In terms of the normalized GSVA scores, PECMS-low patients had higher scores in MHC-I biosynthesis and antigen presentation, which suggests a more active antigen recognition and presentation in these patients. Moreover, the chemokine signaling pathway, which plays an important role in immune cell infiltration and activation, was also upregulated in PECMS-low patients. In contrast, PECMS-high patients had noticeably higher levels of the IFN-γ, TGF-β and WNT/β-catenin pathways (Fig. 3f, g, the gene markers  in Fig. 3f are listed in Supplementary Table S4). These pathways are regarded as negative regulators of immune responses. In CIBERSORT prediction of immune cell infiltration, the PECMS-low group had higher infiltration of immune effector cells, including significantly more CD8-positive T cells, activated CD4 T cells and activated NK cells. This finding agrees with the results of immune characteristics in the GSVA pathway analysis. The PECMS-high group had higher infiltration of M0 macrophages, activated DCs, etc. (Fig. 3h and Supplementary Fig. S5).
4. PEMCS affect drug sensitive in pancreatic cancer
PECMS-high patients had higher transcription levels of immunotherapy predictors, including CD274 (PD-L1), TNFRSF9, CD80, and CD86. However, multiple immune depletion markers, including PDCD1 (PD-1), CTLA-4, TIGHT, and CD27, were significantly higher in the PECMS-low group (Fig. 4a), suggesting a high proportion of depleted T cells in the infiltrated immune cells of patients in the PECM-low group.
To further discuss the relationship between PECMS and the response to ICB therapy, we introduced an independent cohort (IMvigor-210) in which renal cell carcinoma patients were treated with ICB (atezolizumab, an anti-PD-L1 antibody). After aligning the mRNA data to TCGA-PAAD, 237 patients were divided into the PECMS-low group, and 111 patients were divided into the PECMS-high group. The median survival of the PECMS-low group was 10.4 months (95% Cl 8.02-13.31 months), which was significantly longer than that of the PECMS-high group (6.7 months, 95% Cl 5.65-9.49 months, P=0.022) (Fig. 4b). A total of 198 patients in the PECMS-low group had response evaluation data. Twenty patients (10.1%) had the best outcome of complete response (CR), 32 (16.2%) had partial response (PR), 46 (23.2%) had stable disease (SD), and 100 (50.5%) had progressive disease (PD). In the 100 patients with response evaluation data in the PECMS-high group, the numbers of patients with the best efficacy of CR, PR, SD and PD were 5 (5%), 11 (11%), 46 (46%) and 67 (67%), respectively. Patients in the PECMS-low group had a significant response rate (P = 0.047) (Fig. 4c).
Among the main types of chemotherapeutic drugs approved for pancreatic cancer in the National Comprehensive Cancer Network (NCCN) guidelines, there was no significant difference between the PECMS-high and PECMS-low groups in the sensitivity to fluorouracil (5-FU or gemcitabine) and platinum. However, for taxane (paclitaxel and docetaxel), the PECMS-low group had a significantly higher sensitivity prediction. For all the selected chemotherapeutic drugs, PEMCS was negatively correlated with the drug sensitivity score. This partly explains the poor survival of patients in the PECMS-high group. The relationship between the PECMS score and sensitivity to phenformin (a biguanide hypoglycemic drug) was then analyzed. The results suggested that PECMS-high patients had a higher predicted sensitivity to phenformin, and PECMS showed a significant positive correlation with phenformin sensitivity (Fig. 4d).
Among the targeted medicines, we screened 17 drugs with significant differences in sensitivity prediction between the two groups. Seven drugs were predicted to be more efficient in the PECM-low group, including AICAR (AMPK agonist), MBS-754807 (IGF1R inhibitor), dasatinib (ABL inhibitor), thapsigargin (sarcoendoplasmic reticulum Ca2+-ATPase inhibitor), and trametinib (MEK inhibitor), while CPT724714 (ERBB2 inhibitor), crizotinib (ALK inhibitor), GT-2580 (CSF1R inhibitor), masitinib (KIT inhibitor), OSI-027 (mTOR inhibitor), parthenolide (NFKB1 inhibitor), ruxolitinib (JAK inhibitor), shikonin (PKM2 inhibitor), and vorinostat (histone acetylase inhibitor) showed higher predictive sensitivity in the PECMS-high group (Supplementary Fig. S6).
5. PECMS predicts survival in our retrospective cohort
In a retrospective cohort containing 20 patients from our single center, we further validated the associations of the PECMS score with survival and drug sensitivity (Supplementary Table S5). All patients were diagnosed with pancreatic cancer and underwent radical operation in 2018. The transcriptome data of these patients were obtained using surgical specimens. According to the scoring results, 10 patients were included in the PECMS-high group, and 10 patients were included in the PECMS-low group. All patients received albumin paclitaxel and gemcitabine (AG) as first-line therapy after recurrence. The median OS of patients in the PECMS-low group (20 months, 95% Cl 13-NR) was significantly better than that of patients in the PECMS-high group (13 months, 95% Cl 11-NR, P=0.042) (Fig. 4e). For the response to first-line AG therapy, 4 (40%) patients in the PECMS-high group had PD in the first-effect evaluation based on the RECIST 1.1 standard, while the number in the PECMS-low group was 1 (10%). Two patients in the PECMS-low group achieved PR, however, only 1 patient in the PECMS-high group achieved PR. The response to AG first-line therapy in the PECMS-low group was significantly better than that in the PECMS-high group (Fig. 4f).
6. KLHL32 is a key gene modeling the ECM of pancreatic cancer
We analyzed the correlation between eight molecules involved in the PECMS score and different clinical and tumor stromal features. KLHL32 showed distinct characteristics from the other molecules (Fig. 5a, b). In terms of survival, KLHL32 expression was associated with better prognosis in both univariate and multivariate Cox regression analyses, while COL17A1, AREG, FN1 and INHBA were associated with worse prognosis. We then set the median mRNA levels of each feature gene of PECMS as the cutoff value to divide the patients in the TCGA-PAAD cohort. In survival analysis, we found that for COL17A1, AREG, CDA, POSTN, SLC2A1, FN1 and IHNBA, the median survival of the high expression group was significantly lower than that of the low expression group. However, for KLHL32, higher expression suggests better survival (Supplementary Fig. S7). This is consistent with the trend of univariate Cox analysis. Among the pathway correlations, KLHL32 showed the most obvious negative correlation with the TGF-β pathway. In drug sensitivity prediction, the expression of KLHL32 was positively correlated with the sensitivity to fluorouracil, platinum, and taxol. KLHL32 was negatively correlated with most of the oxidative stress markers, which was quite different from all the other molecular markers. In the CIBERSORT score of immune cell infiltration, KLHL32 was positively correlated with the infiltration of a variety of immune cells, including CD8-positive T cells, while it was negatively correlated with immune depletion markers (PDCD1, CTLA4, TIGHT, and CD27) (Fig. 5a).
In our retrospective cohort, the patients were divided into two groups using the median KLHL32 transcription level as the cutoff value. We then analyzed the relationship between KLHL32 levels and ECM phenotypes using GLUT-1 (SLC2A1) as a marker of oxidative stress and energy transport, CD31 as a marker of angiogenesis, CD8 as a marker of T-cell infiltration, and PD-L1 as a marker of immune escape. In the IHC staining of continuous sections of surgical specimens, low levels of CD31 and PD-L1 expression were observed in both groups. GLUT-1 expression was significantly lower in the KLHL32-high group than in the KLHL32-low group, deepening the relationship between KLHL32 and glucose transport in the ECM of pancreatic cancer and highlighting the role of GLUT1 in energy utilization in pancreatic cancer. Moreover, patients in the KLHL32-high group had more infiltrating CD8-positive cells, suggesting that KLHL32 plays a positive role in pancreatic cancer immune cell infiltration and the immune response (Fig. 5c, d and Supplementary Fig. S8).