Immune micro-environment analysis and establishment of response prediction model for PD-1 blockade immunotherapy in glioblastoma based on transcriptome deconvolution

Only a small proportion of patients obtain survival benefit from PD-1 blockade immunotherapy due to the highly heterogeneous and suppressed immune micro-environment of GBM. We aimed at revealing the characteristics of tumor micro-environment (TME) of GBM related to response to PD-1 inhibitors and constructing a response prediction model for screening patients possibly benefit from PD-1 inhibitors. Based on the composition and expression profiles of cell subpopulations calculated by deconvoluting the GBM bulk RNA-seq, differentially expressed gene analysis and gene set enrichment analysis (GSEA) were performed to explore genes and pathways related to response to PD-1 inhibitors. Further by combining least absolute shrinkage and selection operator (LASSO) regression and expression correlation with PD-L1, the response prediction genes of PD-1 inhibitors were identified and the response prediction model was constructed through binary logistic regression. The comparison of abundances of tumor infiltrating immune cells showed that the abundance of M0 macrophages of responders was lower, while the abundance of activated dendritic cells (DCs) was higher before PD-1 inhibitors treatment; the abundances of plasma cells and M0 macrophages of responders were lower after PD-1 inhibitors treatment. In addition, GSEA showed that the main up-regulation pathways in the tumor micro-environment of responders before PD-1 inhibitors treatment included the regulation of T-helper 1 type immune response, the positive regulation of natural killer cell-mediated cytotoxicity, p53 signaling pathway, homotypic cell–cell adhesion, etc., the main down-regulation pathways included the activation of microglia and myeloid leukocytes, Ras signaling pathway, etc. Afterward, ITGAX, LRRFIP1 and FMN1 were identified as the key response prediction genes of PD-1 inhibitors and the response prediction model based on them showed good predictive performance with potential value of clinical application in its validation and verification. ITGAX, LRRFIP1 and FMN1 were identified as the response prediction genes of PD-1 inhibitors and the response prediction model based on them was proved to have potential clinical value.


Introduction
Glioblastoma, the most common tumor in central nervous system with an annual incidence rate of 3.19/100,000 (Ostrom et al. 2013), also the most malignant glioma classified as WHO grade IV, is almost an incurable cancer with a median overall survival down to 14.6 months (Stupp et al. 2005). In recent years, PD-1/PD-L1 inhibitors showed incomparable curative effect in melanoma, lung cancer, primary hepatocellular carcinoma, etc., while the response rate of PD-1 inhibitors was reported only approximately 8% in GBM (Ott et al. 2019). Nevertheless, recurrent patients responded to PD-1 inhibitors showed a significant improvement in median overall survival, reaching 14.3 months according a retrospective clinical study (Zhao et al. 2019). Besides, two newly diagnosed patients involved in the research of Schalper et al. (2019) survived over 33 months and 28 months, respectively, following neoadjuvant treatment with nivolumab. Studies further showed that it was the highly heterogeneous and suppressed immune micro-environment of GBM that mainly influenced the efficacy of PD-1/PD-L1-based immunotherapy (Bikfalvi et al. 2023) which indicated the vital importance of revealing the TME of GBM. RNA-seq has become an effective method to analyze TME at transcription level for its advantages of high throughput and low requirements for samples. However, bulk RNA-seq shows only the average expression of mixed cells in bulk tissue and masks heterogeneous expression among cells and subpopulations, scRNA-seq can precisely reveal cellular heterogeneity of TME at single cell level, its application is restricted by high cost though. Deconvolution algorithms represented by Cibersortx (Newman et al. 2019), developed to calculate abundances and expression of cell subpopulations within bulk RNA-seq, compensated for the limitations of bulk RNA-seq to a certain degree and provided a new approach to analyzing characteristics of GBM immune micro-environment related to response to PD-1 inhibitors. Cibersortx has been wildly used in the analyses of micro-environment of various cancers and has been proved to be one of the most advanced transcriptome deconvolution algorithms (Barroso-Sousa et al. 2022;Newman et al. 2019;Zhang et al. 2022).
Screening patients possibly benefit from immune checkpoint inhibitors (ICIs) by pre-treatment evaluation based on biomarkers is of great importance for improving clinical efficiency of immune checkpoint blockade therapy, while there is no reliable ICI response biomarker for GBM currently. In the present study, we deconvoluted the bulk RNA-seq of GBM tissue from patients treated with PD-1 inhibitors. On this basis, we sought the characteristics of GBM immune micro-environment that might influence the efficacy of PD-1 inhibitor treatment, screened response prediction genes and constructed a prediction model.

Data acquisition
The bulk RNA-seq data of GBM from 17 patients receiving PD-1 inhibitors and related clinical information used in this study were provided by Zhao et al. (2019), which included 16 pre-treatment samples and 9 post-treatment samples.
The scRNA-seq data of GBM and related metadata used in this study was obtained from Gene Expression Omnibus database (GEO) by "GSE84465".
The bulk RNA-seq data of GBM and related clinical information of the external cohort in the survival analysis of this study were provided by Cloughesy et al. (2019).

scRNA-seq analysis
This part of the study was mainly conducted by the "Seurat" R package V4.1.0. Genes and cells were firstly filtrated based on the standards-genes detected in less than 3 cells and cells with less than 50 detected genes. Then, further filtration was performed to remove genes and cells with outlier of nCount RNA, nFeature RNA, percent.MT or percent. ERCC by the standards-100 < nFeatureRNA < 7500, 10,000 < nCount RNA < 1,300,000 and percent. ERCC < 25%, which were determined by plotting the distribution map of these parameters. Then, the gene expression of the remaining cells was normalized by "LogNormalize" method to eliminate the influence of the sequencing depth among different cells. Next, the top 1500 highly variable genes were identified by "vst" method to perform principal component analysis (PCA) following Z-score conversion and 20 initial principal components (PCs) were determined to be involved in the dimensionality reduction of t-distributed stochastic neighbor embedding (tSNE), as a result, all the cells were classified into 13 clusters. To identify the cell type of each cluster, the differentially expressed gene analysis among all the genes within each cluster was performed by Wilcoxon test with the standard-adjusted p value < 0.05 and |log2 (Fold Change)|> 0.5. On this basis, cell type of each cluster was identified by comparing top differentially expressed genes of each cluster with cell markers from the CellMarker database (Hu et al. 2023).

Deconvolution analysis of bulk RNA-seq
Deconvolution analysis in this study was conducted by Cibersortx (Newman et al. 2019), which aimed to analyze bulk RNA-seq (Mixture) into composition and expression of cell subpopulations taking a transcriptome map (Signature Matrix) of corresponding type of cells as reference.
The expression profile of the GBM scRNA-seq was annotated with cell types based on the previous analysis and then uploaded to Cibersortx to create Signature Matrix. Meanwhile, the expression profile of the bulk RNA-seq based on the GBM tissue from the patients with PD-1 inhibitor therapy was firstly normalized from Count to CPM (Counts per million) and then uploaded to Cibersortx as Mixture. Afterward, the composition and expression of cell subpopulations of the Mixture were respectively calculated using Cibersortx taking the created Signature Matrix of GBM and the transcriptome map of lymphocytes and myeloid cells merged from the transcriptome profile of 22 types of human immune cells (LM22) as reference.

Differentially expressed gene analysis
Differentially expressed gene analysis based on deconvolution was performed with the "Limma" R package V3.50.3. The selection criteria for differentially expressed genes is |log2 FoldChange|> 1 and p < 0.05.

Gene set enrichment analysis (GSEA)
GSEA was performed by the "clusterProfiler" R package V4.2.2. Genes sorted in descending order by log2 Fold-Change based on differentially expressed gene analysis were enriched taking the gene sets of Gene Ontology (GO)-Biological Process (BP) and Kyoto Encyclopedia of Genes and Genomes (KEGG) as reference with the following criterion: p.adj (Benjamin Hochberg) < 0.25.

Selection of response prediction genes
The response prediction genes of PD-1 inhibitors were firstly screened by least absolute shrinkage and selection operator (LASSO) regression with threefold cross-validation to determine the optimal regularization parameter (λ.min). Then, the response prediction genes were further selected by the expression correlation analysis with PD-L1 on the online analysis platform of GEPIA2 (Tang et al. 2019) with the following criteria: |Spearman correlation coefficient > 0.3 and Spearman correlation p value < 0.05. Next, the expression correlation between the response prediction genes of the immune cells in GBM was conducted to remove gene with multi-collinearity under the same criteria.

Construction of the response prediction model
The response prediction model was constructed through binary logistic regression using the "glm" function of R software with the "family" parameter of "binomial". Variance inflation factor (VIF) of each gene in the regression model was calculated using the "car" R package V3.0-13 to detect genes with multi-collinearity. The nomogram of the model was plotted using the "regplot" R package V1.1.
Kaplan-Meier curves of the two groups divided by the median response probability according to the model were plotted using the R package "survival" V3.3-1 and "survminer" V0.4.9. Log-rank test was performed to compare the differences of survival distributions between the two groups.

Differences in abundances of infiltrating immune cells in GBM between PD-1 inhibitor responders and non-responders
To explore the relationship between abundances of infiltrating immune cells in GBM and response to PD-1 inhibitors, the component fractions of 22 types of immune cells in the TME of each patient before and after anti-PD-1 treatment were calculated by deconvoluting the bulk RNA-seq taking the transcriptome map of 22 types of human immune cells (Newman et al. 2019) as reference.
The comparison of abundances of infiltrating immune cells between PD-1 inhibitor responders (n = 9) and non-responders (n = 7) before treatment showed that the abundance of M0 macrophages of responders was lower, while the abundance of activated DCs was higher ( Fig. 1A; Wilcoxon test; M0 macrophages, p = 0.047; activated DCs, p = 0.029), which suggested that low abundance of M0 macrophages and high abundance of activated DCs in the TME of GBM might be predictive factors for response to PD-1 inhibitors.
The comparison of abundances of infiltrating immune cells between PD-1 inhibitor responders (n = 7) and nonresponders (n = 2) after PD-1 inhibitor treatment showed that the abundances of plasma cells and M0 macrophages Fig. 1 Comparison of abundances of infiltrating immune cells between PD-1 inhibitor responders and non-responders before (A) and after (B) treating. Wilcoxon test, * represents p < 0.05

Gene expression and pathway regulations related to response to PD-1 inhibitors
Differentially expressed gene analysis was considered to further excavate genes that are predictive for response to PD-1 inhibitors. Taking into consideration of the high heterogeneity of the micro-environment of GBM, we first constructed the TME transcriptome map of nine types of cells by analyzing a scRNA-seq data from four GBM patients (Darmanis et al. 2017) (Fig. 2).
Combing that and the transcriptome map of 22 types of human immune cells as reference, the expression profiles of neoplastic cells, lymphocytes, and myeloid cells within the bulk RNA-seq of GBM treated with PD-1 inhibitors was calculated by deconvolution to reduce the interference caused by the heterogeneity between cell subpopulations. With that, the differentially expressed gene analysis was performed and 384 differentially expressed genes of neoplastic cells (Fig. 3A), 81 differentially expressed genes of lymphocytes ( Fig. 2B) and 205 differentially expressed genes of myeloid cells (Fig. 3C) were identified.
Gene set enrichment analysis (GSEA) was performed to explore changes of pathways in the micro-environment of responders compared to non-responders before PD-1 blockade treatment based on the differential gene expression analysis. GSEA of neoplastic cells showed that the upregulated pathways of the responders mainly included the regulation of T-helper 1 type immune response, the positive regulation of natural killer cell mediated cytotoxicity and p53 signaling pathway; the down-regulated pathways mainly included the activation of microglia and myeloid leukocytes and Ras signaling pathway (Fig. 4A). GSEA of lymphocytes showed that the up-regulated pathways of the responders mainly included the positive regulation of cell population proliferation and the regulation of cellular amide metabolic process (Fig. 4B). GSEA of myeloid cells showed that the up-regulated pathways of the responders mainly included cell-substrate junction, integrin mediated signaling pathway, homotypic cell-cell adhesion and leukocyte-mediated cytotoxicity (Fig. 4C).

Identification of predictive genes of response to PD-1 inhibitors in GBM
With the expression profiles of immune cells of 27 pre-treatment samples from 17 patients (all pre-samples of the same patient were included to expand the sample size) based on deconvoluting the GBM bulk RNA-seq and the differentially expressed genes of the immune cells, 15 predictive genes of response to PD-1 inhibitors with nonzero coefficients-ASS1, NR1H3, ITGAX, CPEB1, LRRFIP1, FKBP5, DNAJC5B, PI4K2A, TESK2, USB1, HIAT1, NCOA7, FMN1, GALC and SPNS3, were screened by least absolute shrinkage and selection operator (LASSO) regression with threefold crossvalidation (Fig. 5).
Preclinical and clinical evidences suggested a strong expression correlation between PD-L1 and response to PD-1 inhibitors in multiple malignant tumors (Davis and Patel 2019;Lantuejoul et al. 2020), according to which four predictive genes highly correlated to PD-L1 at transcription level were further screened by expression correlation Fig. 5 Selection of predictive genes of response to PD-1 inhibitors through least absolute shrinkage and selection operator (LASSO) regression. A Coefficient profiles exhibited by LASSO regression based on the log(Lambda) sequence. B The LASSO regression model was cross-validated threefold to determine the optimal regularization parameter (λ.min) and 15 genes with nonzero coefficients were screened out by the LASSO regression model with λ.min (0.0117). Lambda: λ analysis (Table1). The four predictive genes were ITGAX, LRRFIP1, PI4K2A and FMN1, PI4K2A of which was next removed in order to select genes as independent factors of response to PD-1 inhibitors due to the multi-collinearity between PI4K2A and LRRFIP1 that was found by calculating the variance inflation factor (ITGAX, VIF = 1.482; LRRFIP1, VIF = 5.879 > 5; FMN1, VIF = 2.147; PI4K2A, VIF = 8.122 > 5) of each gene in the linear regression and analyzing the expression correlation between them (Fig. 6).

Establishment and validation of response prediction model for PD-1 inhibitors in GBM
Three genes, ITGAX, LRRFIP1 and FMN1, were finally determined as the key predictive genes of response to PD-1 inhibitors in GBM and the response prediction model was constructed then by binary multiple logistic regression (Table 2, Fig. 7).
The probability for GBM patients to respond to PD-1 inhibitors can be calculated by the following formula according to the model: Transcriptome expression value of ITGAX, LRRFIP1 and FMN1 in the formula can be calculated according to the expression profiles of immune cells from deconvoluting the bulk RNA-seq of GBM tumor tissue or directly according to the bulk RNA-seq of immune cells sorted from GBM tumor tissue, following the max-min normalization as below based on the minimum value min{x i } and the maximum value max{x i } of x i -the expression value of ITGAX, LRRFIP1 and FMN1: Then, we performed leave-one-out cross-validation (LOOCV) to test the model. The area under curve (AUC) of receiver operating characteristic (ROC) was 0.922 (95% CI 0.791-1), suggesting a good discriminative ability (Fig. 8A). The p-value of the Hosmer-Lemeshow (H-L) test was 0.574 and the calibration curve demonstrated a high consistency between prediction and actual observation, indicating a high prediction accuracy (Fig. 8B).

Clinical utility of the response prediction model
The clinical utility of the model was evaluated by the decision curve analysis (DCA) and the clinical impact curve (CIC). The DCA showed that carrying out PD-1 blockade immunotherapy according to the model when the threshold probability was lower than 82% provided greater net benefit than the "treat all" and the "treat none" strategies, indicating a high clinical value of the model (Fig. 9A).
The CIC showed that the population responsive to PD-1 inhibitors identified by the model was highly consistent with the actual population responded to PD-1 inhibitors when the threshold probability was higher than 70%, confirming a high clinical efficiency of the model (Fig. 9B).
Combining the DCA and CIC, the model was expected to be capable of predicting response to PD-1 inhibitors in GBM and showed an ideal clinical gain when the threshold probability ranged between 70 and 82%.

Survival differences between patients with PD-1 blockade treatment classified by the model
The ultimate significance of drug response prediction models lies in whether patients can benefit from their application in survival time and survival outcomes. Therefore, we performed survival analysis to evaluate the survival significance of the model. The probabilities of response to PD-1 inhibitors of each patient in the internal cohort and the external cohort were calculated according to the bulk RNA-seq of tumor tissue sampled before (internal cohort) or on the early stage (external cohort) of the treatment by the model. The two cohorts were next respectively divided into low predictive response group and high predictive response group by the median response probability to plot survival curves using Kaplan-Meier method and compare differences of distributions of survival time using Log-rank test. The survival analysis of the internal cohort (n = 16) showed that the survival rate of the high predictive response group was higher than that of the low predictive response group (Log rank test, p = 0.033) (Fig. 10A). The survival analysis of the external cohort (n = 26) showed that the survival rate of the high predictive response group was markedly higher than that of the low predictive response group (Log rank test, p = 0.005) with a median survival time two times over that of the low predictive response group (Fig. 10B).
The differences of the distributions of survival time between the two groups were consistent with the prediction from the model both in the survival analyses of the internal cohort and the external cohort which indicated a certain predictive value for survival in GBM patients with PD-1 inhibitor treatment.

Discussion
GBM is one of the most malignant refractory tumors with rapid growth, strong invasiveness and poor prognosis. PD-1 blockade immunotherapy was proved effective in extending survival time of GBM patients in multiple clinical researches while the response rate was unsatisfactory (Omuro et al. Searching for biomarkers predictive for response to PD-1 inhibitors is a major problem demanding prompt solution in the clinical practice of treating GBM. Further research showed that the efficacy of ICIs was associated with the special micro-environment of GBM (Bikfalvi et al. 2023), which was proved heterogeneous, immune-suppressed and complex, while relevant mechanisms were still remaining to be investigated. With the development of high-throughput sequencing technology, large amounts of transcriptome data were generated to study TME which greatly contributed to developing molecular markers and targets for tumor therapy, driving the growth and implementation of precision medicine.
Transcriptome deconvolution by Cibersortx (Newman et al. 2019), aimed to analyze bulk RNA-seq into compositions and expression of cell subpopulations, showed excellent calculation performance in the analyses of micro-environment of multiple tumors, for instance, melanoma, liver cancer and breast cancer (Barroso-Sousa et al. 2022;Newman et al. 2019;Zhang et al. 2022). Moreover, the outstanding calculation accuracy and robustness of Cibersortx were proved in several studies by comparing it with mainstream deconvolution algorithms and verifying it through flow cytometry and scRNA-seq in analyzing the cell compositions of peripheral blood mononuclear cells Fig. 9 Clinical utility of the prediction model. A The decision curve tested by leave-one-out cross-validation. The red curve represents the predicting performance of the model. Besides, there are two lines representing two extreme cases respectivelythe gray curve represents the hypothesis that all patients are responsive to PD-1 inhibitors, the black horizontal line represents the hypothesis that no patient is responsive to PD-1 inhibitors. B The clinical impact curve tested by leave-oneout cross-validation. The red curve represents the number of people determined to respond to PD-1 inhibitors by the model among 1000 hypothetical GBM patients at different threshold probabilities, the blue curve represents the number of people not only determined to respond to PD-1 inhibitors by the model but also actually responded to PD-1 inhibitors at different threshold probabilities 1 3 from healthy people, lymph nodes from patients with follicular lymphoma, micro-environment of colorectal cancer, squamous cell carcinoma of head and neck (HNSCC) and melanoma (Newman et al. 2015;Wang et al. 2023), which indirectly supported for the reliability of deconvolution calculation based on the GBM scRNA-seq analysis in our research.
The response prediction model performed well in the validation and testing of discrimination, calibration, clinical utility and survival significance, showing potential clinical value. However, what should be noted is that the limitation of the model lies in lack of validation in large independent cohort which is key to improving its clinical predictive capability. With extensive development of GBM research, more and more RNA-seq of GBM is going to be performed and released, which will provide more powerful support for the improvement of the model.
In order to extend the application of the model, we evaluated it in melanoma and lung cancer cohorts with PD-1 blockade treatment, but unfortunately, the model showed no prediction effect (AUC = 0.5), which could be because the huge differences among these cancers, such as gene expression, protein function, tissue morphology, etc.
Cibersortx provided a new approach to analyze the immune micro-environment of GBM related to response to PD-1 inhibitors. Nevertheless, it still could not analyze bulk RNA-seq into single cell level comparing to scRNAseq which was recognized as the best means to study the heterogeneity of TME at transcription level. We expect that scRNA-seq of large cohort will depict the micro-environment of GBM more comprehensively and accurately in the future.

Conclusions
In the present study, the comparison of abundances of infiltrating immune cells between responders and non-responders suggested that high abundances of M1 macrophages and activated DCs in the micro-environment of GBM might be related to response to PD-1 inhibitors. In addition, the gene set enrichment analysis indicated that the potential mechanisms of PD-1 inhibitors in treating GBM might involve the down-regulation of Ras pathway, the up-regulation of p53 pathway and inhibiting tumor development by regulating tumor immune micro-environment and affecting adhesion between tumor cells and extracellular matrix mediated by integrin and cadherin, aside from activating T cells through binding PD-L1. Furthermore, ITGAX, LRRFIP1 and FMN1 were identified as the key response prediction genes and the response prediction model based on them was validated and verified to have good predictive performance and potential clinical value.

Author contributions
The manuscript and all the research work included were done by David Wong.
Funding None.

Data availability
The data included in this study are available from the lead author upon reasonable request.

Declarations
Competing interests The authors have no competing interests.