Identify growth-related genes of GBM by CRISPR-Cas9 technology
Flowchart depicting the entirety of this study (Fig. 1). To screen genes essential for GBM growth, we performed genome-wide screening using DepMap, which revealed 61 GBM cell lines with 17,386 genes. We further screened 699 genes with CERES scores < -1 in over 80% of GBM cell lines. Based on the expression of these 699 genes in the TCGA cohort, a univariate COX analysis was conducted, which resulted in the selection of twelve growth-related genes (Table S2) that had significant prognostic value. Notably, The HR values of these genes were greater than 1, indicating that they were associated with a poor prognosis. (Fig. 2A).
Subtype Classification Based On Growth-related Genes
The expression of growth-related genes in the TCGA cohort was used to construct a matrix. The consensus clustering was performed via the ConsensusClusterPlus package[21] based on this matrix, and the samples were initially divided into 2–9 clusters (k = 2–9). According to the comprehensive analysis of cumulative distribution function (CDF) curves and consensus clustering matrix heatmap (Fig. 2B, S1A), we selected the optimal k value of 3. According to the optimal k value, all samples were divided into three growth-related subtypes (GGSs). Principal Component Analysis (PCA) indicated that significant differences in subtypes were validated at the growth gene level (Fig. 2C). Meanwhile, the expression of growth-related genes was highest in GGS1, followed by GGS2, and lowest in GGS3 (Fig. 2D). As expected, GGS1 had the worst prognosis, GGS3 had the best prognosis, and GGS2 was intermediate (Fig. 2E).
Validation In Microarray Cohort And In-house Rnaseq Cohort
We performed validation in two microarray cohorts (platform GPL570) and one in-house cohort. Using the expression of feature genes (Table S2), we utilized the NTP algorithm to predict the subtypes of the validation sets. To ensure the accuracy of the results, samples with FDR greater than 0.05 were filtered out. Each validated dataset is divided into three subtypes, GGS1-3. The Kaplan-Meier prognostic analysis was the same result as the TCGA cohort (Fig. 2F-H). According to the classification results, the feature genes had similar expressions in the corresponding subtypes of the validation sets (Figure S1B-D). Besides, the submap analysis also showed a high similarity of common signature genes between the TCGA cohort and the validation cohort (Fig. 2I-K).
The Differences In Copy Number Variations And Frequency Among Ggss
We observed that among the top 15 segments with the highest frequency of copy number variations, amplified segments were predominantly located at chr 7 and deleted segments were predominantly located at chr 10 (Fig. 3A). To explore the overall picture of copy number variations, we calculated the GISTIC score and copy number alteration frequency for each subtype using the GISTIC2 algorithm[16]. We found that the frequency of alterations in chr 7 amplification was highest in GGS1, slightly lower in GGS2, and lowest in GGS3 (Fig. 3B). The frequency of chr 10 loss was lowest in GGS3(Fig. 3B). These findings were consistent with the trend observed in the row copy number, and we also found GGS1 has a higher frequency of chr 7 gain & chr 10 loss than GGS2 (Figure S2A-C). In addition, the frequency of chr 19 & 20 co-gain was highest in GGS2, followed by GGS3, and lowest in GGS1 (Fig. 3B). To further analyze the distribution of samples in the three subtypes. After referring to previous studies[22], we discovered that the proportion of samples with chr 7 gain and chr 10 loss was highest in GGS1, followed by GGS2, while GGS3 had the lowest proportion (Fig. 3C). Chr 19 & 20 co-gain exhibits the highest proportion of samples at GGS2, followed by GGS3, with GGS1 exhibiting the least (Fig. 3D).
The Underlying Biological Explanation For The Poor Prognosis Of Ggs1
Our analysis focused on investigating the biological functions of GGS1, which is associated with the worst prognosis. Referring to previous findings[22], we observed that TERT expression status accounted for a relatively high proportion of cases in GGS1 (Figure S2D). At the same time, TERT expression status is closely related to TERT promoter status, and TERT expression status with TERT promoter mutation is mostly present[22]. As expected, the proportion of TERT promoter mutation in GGS1 was the lowest (Figure S2E). However, due to insufficient information on TERT promoter mutation in the TCGA cohort, we validated it in the in-house cohort (Figure S2F). In addition, we found that both EGFR and GABP were highly expressed in GGS1 (Fig. 4A), and EGFR could promote cell proliferation and immortality by GABP acting on TERT-promoter mutated regions (Fig. 4B) by previous findings[23]. Next, we performed GSEA enrichment analysis, which revealed that pathways promoting proliferation and immortality were enriched in GGS1, including regulation of cell cycle phase transition, telomere maintenance, and regulation of DNA replication (Fig. 4C). Besides, the protein expression level of EGFR was highly expressed in GGS1 (Fig. 4D). To further compare differences in pathway activity between subtypes, GSVA enrichment analysis was performed. Enriched pathways are divided into four categories, metabolism, stemness, proliferation, and signaling. We discovered that GGS1 is activated and highly expressed in all four classes of functions, whereas GGS2 and GGS3 are not (Fig. 4E).
Comparison With Recognized Characteristics Of Gbm
In this study, we examined the distribution of various known features in GBM, including age, gender, Karnofsky Performance Status (KPS), grade, IDH status, 1p/19q co-deletion, chr 7 gain & chr 10 loss, chr 19 & 20 co-gain, TERT promoter status, TERT expression status, MGMT promoter methylation, and transcriptome subtypes (Fig. 5A). We found that the proportion of IDHwt-non-codel was significantly increasing and the proportion of chr 7 gain & chr 10 loss was significantly decreasing from GGS1 to GGS3. At the same time, GGS1 had the highest proportion of older patients (Figure S2G) and the lowest proportion of MGMT methylation (Figure S2H). In addition, The IDH of GGS1 was all wild type with grade 4. Compared to the previous classification of GBM, GGS1 has the highest proportion of classical subtypes and does not contain neural subtypes (Fig. 5B). We conducted both univariate and multivariate Cox regression analyses, which further confirmed the prognostic value of these features. We subsequently identified GGS1 as an independent factor associated with poor prognosis (Fig. 5C-D).
Discovery Of Potentially Specific Drugs For Ggs1
Using drug sensitivity data and gene expression data from PRISM and CTRP, we identified potential therapeutic agents for GGS1 (Fig. 6A). A ridge regression model was employed via the pRRophetic package to deduce the AUC value corresponding to the drug. A lower AUC value indicated more sensitivity to the drug. A lower AUC value indicated more sensitivity to the drug. TMZ is the first-line drug for glioma treatment, and previous studies[24] have shown that low ADM expression improves glioma sensitivity to TMZ. Our results showed that samples with low ADM expression had lower AUC values, confirming the rationality of our method in glioma (Fig. 6B). We divided the samples into two groups according to the level of ADM expression, and the results showed that samples with low ADM expression had lower AUC values both in CTRP and PRISM, which were more sensitive to TMZ. Following the same approach, we identified potential drugs CCT128930 and Foretinib for GGS1 (Fig. 6C-D).