Proteomic landscape of primary and metastatic brain tumors for heterogeneity discovery

Despite recent advances in our understanding of driver gene mutations and heterogeneity within brain tumors, whether primary or metastatic (also known as secondary), our comprehension of proteomic changes remains inadequate. The aim of this study is to provide an informative resource for brain tumor research and to distinguish primary brain tumors from secondary brain tumors of extracranial origin based on proteomic analysis.


INTRODUCTION
The brain, as the major organ of the central nervous system (CNS), governs and orchestrates a majority of the body's physiological processes.
Glioblastomas (GBMs) and brain metastases (BrMs), which are considered to be the most malignant types of brain tumors, are among the deadliest cancers with poor prognosis and short survival [1]. Effective treatment of patients with malignant brain tumors has long posed a challenge for oncologists.
Gliomas are among the most prevalent primary brain tumors, with disproportionate mortality and morbidity among subtypes. The most commonly occurring types of adult gliomas are astrocytoma, oligodendroglioma, and GBM [2,3]. Glioma alterations, such as isocitrate dehydrogenase (IDH) 1 and 2 mutations as well as 1p/19q codeletion, generally occur in low-grade gliomas (WHO grade 2 or 3) and are associated with a better prognosis than IDH wildtype tumors [4,5].
Despite the implementation of conventional therapies such as surgical resection, radiation and chemotherapy, patient outcomes remain unsatisfactory. The median survival of glioma patients remains stubbornly low, ranging from years (WHO grade 2) to months (WHO grade 4) [6].
BrMs, secondary brain tumors from extracranial primary tumors (such as lung, breast, melanoma, and colorectal cancers), are 10 times more common than primary brain tumors with 10%-30% incidence in adults, and they have an even lower survival rate that is typically measured in months [7,8]. Fewer than 10% of all BrMs are found prior to the diagnosis of primary cancer. Whether extracranial tumors have developed BrMs is mainly determined by cranial imaging [magnetic resonance imaging (MRI) or computed tomography (CT)], which involves a severe time lag from diagnosis to treatment, causing the optimal therapeutic window to be missed [9,10]. In addition, although these two representative primary and secondary brain tumors exhibit markedly different modes of antigen presentation and tumor microenvironment [11], there is no effective molecular marker to assist in distinguishing them.
The immunohistochemical confirmation of gliomas and BrMs is complex and time-consuming, which hinders timely intervention and precise treatment selection. For instance, glial fibrillary acidic protein (GFAP), pancytokeratin (AE1/AE3) and cytokeratin are frequently used to differentiate gliomas from BrMs [12,13].
However, the diagnosis of different origins of BrMs and different types of gliomas requires specific immunohistochemical markers, making the entire process lengthy and dependent on expertise. Consequently, rapid and accurate preoperative discrimination of BrM and glioma is imperative for individualized therapeutic decision-making.
Markers found in the blood and tissue samples have been utilized for the diagnosis of the primary disease and to guide treatment.
Recent studies applying immunohistochemistry, genome-wide transcriptomics, and single-cell transcriptomics to investigate BrMs and gliomas have had a profound impact on cancer biology [14][15][16][17][18]. Klemm et al. constructed a high-dimensional, multi-omics characterization of the brain tumor microenvironment, allowing elucidation of the disease- and cell type-specific expression patterns of gliomas and BrMs [19].
However, obstacles remain that impede the translation of these discoveries into novel and efficacious therapies. Potential explanations for the disconnect between genomics-based studies and clinical trials include the lack of protein information and a weak correlation between protein and mRNA expression (0.54) [20,21]. As the main carriers and executors of vital biological processes, proteins exhibit a more direct connection with the onset and progression of diseases. Despite some transcriptomic studies, the precise proteomic composition of these two distinct human brain tumors, especially BrMs, remains unclear.
Thus, an integrated and in-depth proteomic analysis is required to fully comprehend these brain cancers.
Mass spectrometry (MS)-based proteomics is an integral part of cancer research, shedding light on the functional profile of cancer cells.
The present study provides, for the first time, a systematic proteomic analysis of two typical brain tumors, namely BrMs and gliomas.
We generated and analyzed a comprehensive catalog of the disease type-specific protein expression patterns as a valuable resource for the research community, and we also investigated their interrelationships.

Patient and sample collection
The present study was conducted in accordance with the guidelines of the Declaration of Helsinki with the approval of the Research Ethics Committee from Ruijin Hospital, Shanghai Jiao Tong University School of Medicine (1.0/2019-10-1). Written informed consent was obtained from all patients or their legal representatives prior to their participation in the study.
A total of 29 samples from glioma and brain metastasis (BrM) patients were obtained from Ruijin Hospital, Shanghai Jiao Tong University School of Medicine between October 2020 and September 2021. Clinical details of the subjects are shown in Figure 2A and Table S1. Tumor tissues were collected after surgery and washed three times with PBS. The tissues were then immediately transferred to liquid nitrogen, collected into 2 mL cryogenic storage vials (Corning, New York, USA), and stored at −80 °C for later use.

Cell culture and collection
The human cervical cancer cell line, HeLa (ATCC, USA), was cultured at 37

Sample preparation based on iST kit
Glioma and BrM tissues from patients as well as HeLa cells were prepared according to the iST kit manufacturer's instructions [22]. Briefly, for sample lysis, reduction and alkylation, tissue samples were cut and weighed, and 1 mg of each tissue sample was placed into a clean 1.5-mL protein LoBind tube followed by the addition of 100 μL of LYSE and incubation at 95 °C for 30 min with shaking (1000 rpm). HeLa cells for the experimental control were treated simultaneously. The sample was sheared using a sonicator (10 cycles; 30 s ON/OFF), and protein concentration was determined using the Pierce BCA Protein Assay Kit. An equal amount of protein (100 μg) for each sample was used for digestion. Then, 210 μL of RESUSPEND was added to dissolve the DIGEST followed by shaking at room temperature for 10 min at 500 rpm. For protein digestion, 50 μL of DIGEST was added to the sample and heated using a pre-heated heating block at 37

Database search
All raw data were analyzed by Peaks Online (X build 1.7.2022-03-24_170623).

Data analysis
The proteome data were filtered for 50% valid intensity values in each group. The protein and peptide intensities were then quantile normalized and log2-transformed for downstream statistical and bioinformatics analysis. Quantified proteins with fold change > 2 or < 0.5 and adjusted P-value < 0.05 (Benjamini-Hochberg FDR method) were considered differentially expressed proteins (DEPs). In the figures, experimental data are shown with the standard error of the mean.
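The filtering and DEP thresholds described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study. The function names (`filter_valid`, `benjamini_hochberg`, `call_deps`) are hypothetical, the valid-value rule is assumed to mean "at least 50% non-missing values in every group", and the raw P-values are taken as given rather than computed here.

```python
import numpy as np

def filter_valid(intensity, groups, min_frac=0.5):
    """Keep proteins (rows) with >= min_frac non-missing values in every group.

    Assumption: missing intensities are encoded as NaN.
    """
    intensity = np.asarray(intensity, dtype=float)
    groups = np.asarray(groups)
    keep = np.ones(intensity.shape[0], dtype=bool)
    for g in np.unique(groups):
        valid = ~np.isnan(intensity[:, groups == g])
        keep &= valid.mean(axis=1) >= min_frac
    return keep

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest P-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted

def call_deps(log2_fc, pvals, fc_cutoff=2.0, alpha=0.05):
    """Boolean mask of proteins passing the fold-change and FDR cutoffs."""
    log2_fc = np.asarray(log2_fc, dtype=float)
    padj = benjamini_hochberg(pvals)
    # Fold change > 2 or < 0.5 is equivalent to |log2 FC| > 1.
    return (np.abs(log2_fc) > np.log2(fc_cutoff)) & (padj < alpha)
```

Working on the log2 scale makes the two fold-change cutoffs symmetric, so a single absolute-value test covers both up- and downregulated proteins.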
Metascape [23] was used for functional enrichment and protein-protein interaction network analysis. P-values for the functional enrichments were calculated by a hypergeometric test and corrected by the Benjamini-Hochberg FDR method. Cytoscape [24] software was used for reorganizing and visualizing the interaction networks.
The proportional Venn diagrams were drawn using an online bioinformatics tool (http://www.bioinformatics.com.cn/srplot). The artwork was created with BioRender.com. MetaboAnalyst 5.0 [25] was used for the statistical analysis and biomarker discovery of DEPs, including unsupervised clustering, PCA, Pearson correlation analysis, and machine learning. The complete clustering with violin and dot plot was performed on the 'Wu Kong' platform (https://www.omicsolution.com/wkomics/main/).
For machine learning, ROC curves were generated using MetaboAnalyst 5.0.

Overall proteomic analysis of brain tumors
To holistically analyze the proteome of both BrMs and gliomas, we assembled a diverse range of tumor types and grades (Figure 1). In addition, the missed cleavage rate was as low as 10% on average (Figure S1H), indicating well-controlled sample preparation.

Cell proliferation and immune response are conserved features for BrMs
Hierarchical clustering identified three clusters across the BrM samples (Figure 3A, Figure S2A, and Table S3). Functional analysis of the three clusters revealed marked enrichment in cell proliferation for colonization and immune response to modulate the microenvironment (Figure 3B and Figure S2B), which is consistent with the transcriptomic results of BrMs [18]. Specifically, protein groups in cluster 1 were relevant to extracellular matrix (ECM) organization, such as focal adhesion, actin cytoskeleton organization and collagen formation (Figure 3C). Cluster 2 was characterized by complement and coagulation cascades for cell proliferation and immune response to the microenvironment. The complement components (e.g., C3, C5, and C9) were highly enriched in cluster 2 (Figure 3D). Cluster 3 encompassed a series of translation factors in the VEGFA-VEGFR2 signaling pathway for tumor proliferation (Figure 3E). Ribosomal proteins (RPs) of the 60S (e.g., RPL13A, RPL5, and RPL27) and 40S (e.g., RPS6 and RPS11) subunits were also enriched in cluster 3.

Glioma subtypes correlate with respective protein patterns
Although the heterogeneity of gliomas has been described in terms of histochemistry and prognosis [4,26], knowledge of the different grades of glioma remains rudimentary, and there is no effective diagnostic method based on proteomics [28,29]. In response to the poor diagnostic outcomes in different subtypes of gliomas, we next focused on elucidating differences in protein patterns among gliomas. Integrated proteomic analysis of different grades of glioma showed independent and well-separated clusters. Relatively lower grade gliomas (WHO grade 2, astrocytoma, IDH1-mutant; WHO grade 3, oligodendroglioma, IDH1-mutant and 1p/19q-codeleted) exhibited a close correlation and differed from grade 4 gliomas (including astrocytoma, IDH1-mutant and GBM, IDH1-wildtype) (Figure 4A). In addition, gliosarcoma (GSM), a rare form of grade 4 glioma (2%) that has both sarcomatous and malignant glial components [19], was found as a relatively separate entity in principal component analysis (PCA). However, in WHO grade 4 gliomas, no separate clustering between IDH1 mutation (Grade 4-Mut) and wildtype (Grade 4-WT) was observed (Figure 4B).

FIGURE 1 Schematic diagram of the workflow. Proteomic samples were prepared and analyzed by mass spectrometry. The protein profiles of the two types of brain tumors were discussed separately, followed by comparative analysis to find potential biomarkers that could distinguish between the two tumors of different origin.

Comparative proteomics for primary and secondary brain tumors
Gliomas and BrMs are the predominant primary and secondary tumors of the human brain, respectively, and share a range of malignant traits. We next investigated the proteomic heterogeneity of these two types of brain tumors. These two typical brain tumors are composed of unique cell types, anatomical structures, metabolic constraints, and immune environment [19], which may explain the tendency of some extracranial cancers to migrate toward the brain and provide new insights into the blood-brain barrier (BBB) changes during tumor metastasis. For comparative proteomic analysis, primary (gliomas, n = 13) and secondary (BrMs, n = 14) brain tumors were used.
In total, 407 proteins differed significantly between cohorts (adjusted P < 0.05), of which 171 proteins were upregulated and 236 proteins were downregulated in BrMs versus gliomas (Figure 5A, Figure S4A, and Table S4), and these DEPs were highly correlated with neuron projection development (Figure 5B). In addition, the 407 DEPs effectively distinguished the two types of tumors in unsupervised clustering and PCA based on component 1 (52.2%) and component 2 (7%) (Figure S4B,C). Subsequently, we conducted functional enrichment analysis separately for the upregulated and downregulated proteins (Figure 5C).
To a certain extent, the fundamental cellular and molecular pathways were similar in both types of brain tumors. However, proteins in some distinct pathways were upregulated in BrMs, such as desmosome organization, neutrophil degranulation and formation of the cornified envelope. Specific processes involved in gliomas were correlated with neuron development and migration, including nervous system development, gliogenesis, regulation of cytoskeleton organization, and cell adhesion molecules. Of note, the two tumors were enriched in distinctly different tissue- and cell-specific gene patterns, with gliomas predominantly composed of brain components such as the cerebellum, whereas BrMs were related to the components of their primary sites, such as the lung, colon and breast cell (Figure 5D). These findings prompted further exploration of candidate biomarkers for distinguishing primary and secondary brain tumors.

Diagnostic classification model for gliomas and BrMs
To illustrate the capacity of proteomic profiling as a powerful prognostic tool for discriminating glioma and BrM tumors, we built a diagnostic classification model. To construct a more precise model, malignant brain tumors from primary (WHO grade 4 gliomas, glioma (grade 4), n = 10) and secondary (metastases from LC, BrM (LC), n = 10) tumors were used for further analysis, which resulted in 278 DEPs (Figure S4D, Table S4). The DEPs in BrM (LC) versus glioma (grade 4) and BrM versus glioma showed high overlap (Figure S4E). Consistent with the aforementioned GSEA results, the most prominent pathways among the 278 DEPs were associated with cellular locomotion, such as regulation of cytoskeleton organization, formation of the cornified envelope and supramolecular fiber organization (Figure S4F). These results were consistent with the migratory and invasive characteristics of malignant cells.
We next used multivariate receiver operating characteristic (ROC) curve analysis based on partial least squares discriminant analysis (PLS-DA).The above 278 differentially expressed variables were analyzed to obtain the optimal and most economical biomarker combination.Five variables (TBR1, MUC1, LAMB3, SFN, and GPRC5A) reached the most economical and optimal area under the curve (AUC) of 0.991 (95% confidence interval [CI] = 0.914−1) (Figure 6A).
To evaluate the reliability of the machine-learning strategy, confusion matrices were generated, and the results demonstrated that different samples were correctly classified with 93% accuracy (Figure 6B,C).Notably, these five proteins were significantly upregulated in the BrM (LC) samples compared to the glioma (grade 4) samples (Figure 6D).

DISCUSSION
BrMs and gliomas, representing two distinct types of brain tumors, are mostly fatal tumors and are accompanied by poor prognosis. In addition, clinical and biological variability is thought to exist within each type and each grade of tumor, suggesting that the identification of molecular factors that contribute to this variation is invaluable for the development of targeted therapies.
To reveal the common denominators of brain colonization by widely different types of BrMs, we summed up three clusters with distinct protein patterns by complete clustering.Proteins related to tumor proliferation and immune response were recognized as commonalities for metastatic cells to colonize the brain.Among them, collagen proteins (e.g., COL18A1, COL4A1, and COL6A1) in ECM organization, which have been recognized as diagnostic tumor markers, were enriched in cluster 1 (Figure 3C).The accumulation of collagens can establish tumorigenesis and metastasis [32][33][34].Additionally, the laminin family (e.g., LAMA2, LAMA4 and LAMA5) also plays significant roles in tumor invasion and colonization [35].In cluster 2, abundant subunits of complement components were observed, which function in inflammatory processes and induce diseases such as renal disease, lung disease and Alzheimer's disease [36].For instance, complement component 3 (C3) has been verified to promote brain metastasis by disrupting the blood-cerebrospinal fluid barrier [37].Moreover, the VEGFA-VEGFR2 signaling pathway was mainly present in cluster 3. Vascular endothelial growth factor (VEGF)-related pathways stimulate angiogenesis for tumor colonization, and they have been observed in many tumors, including BrMs [17,38,39].In cluster 3, a series of eukaryotic initiation factors (eIFs) that cooperate with ribosomes for mRNA translation was also detected (Figure 3E).Because mis-regulated mRNA expression is a common feature of tumor growth, eIFs are aberrantly expressed in many human cancers and serve as potential drug targets in cancer therapy [40].
Among gliomas, highly expressed proteins in Grade 4-Mut subgroups were enriched in RNA processes such as mRNA metabolic process, rRNA metabolic process and histone modification, suggesting malignant tumor proliferation (Figure 4C). Simultaneously, immune system processes including interferon signaling and immune response-activating signaling pathway were also activated.

The transmembrane glycoprotein Mucin 1 (MUC1) is aberrantly glycosylated and overexpressed in multiple cancers, including triple-negative breast cancer, and recognized as a major target for the design and development of a universal cancer vaccine [47][48][49]. In particular, MUC1 is also frequently overexpressed in metastatic cancers and has been used as a diagnostic marker for metastatic progression [50]. Similarly, the upregulation of laminin subunit beta-3 (LAMB3) promotes invasive and metastatic abilities of certain types of cancer, including colon, pancreas and lung [51]. Moreover, the 14-3-3 protein sigma (SFN), which acts as a tumor suppressor in some malignancies, has also been considered a metastasis-related protein in human lung squamous carcinoma [52,53]. Particularly, the orphan receptor GPRC5A functions as a lung tumor suppressor and low levels of GPRC5A were found in lung cancer patients [54]. Conversely, our BrM (LC) samples exhibited high expression levels of GPRC5A. Moreover, a recent study detected exclusive overexpression of GPRC5A in both primary and metastatic high-grade serous ovarian cancer cells, indicating an association with chemotherapy resistance and poor survival [55]. In summary, these five biomarkers have been validated as dysregulated in other cancers, and the present study has demonstrated their potential capacity for distinguishing between glioma (grade 4) and BrM (LC).
Here, we conducted a comprehensive and comparative proteomic analysis for both BrMs and gliomas for the first time, revealing the distinct proteome patterns of these two typical tumors in the brain.
These findings hold promising implications for the development of targeted therapies specific to BrM and glioma treatment.Further endeavors should prioritize biomarker validation using an expanded pool of clinical samples.

ASSOCIATED DATA
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the iProX partner repository [56] with the dataset identifier PXD033881.
For the Peaks Online database search (X build 1.7.2022-03-24_170623), the following parameters were used: MS1 tolerance, 10 ppm; MS2 tolerance, 0.02 Da; and searched database, UniProt human database (20,375 entries, downloaded on Dec. 2, 2021) for the tumor tissue as well as HeLa samples. A false discovery rate (FDR) lower than 1% was used as the cutoff value for the peptide, protein, and peptide spectrum match (PSM) identification based on the target-decoy strategy. Carbamidomethylation of cysteine was considered a fixed modification, and protein N-terminal acetylation, oxidation of methionine, and deamidation of asparagine and glutamine were considered variable modifications.
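To make the two mass tolerances concrete, the short Python sketch below shows how a relative tolerance in ppm (MS1) and an absolute tolerance in Da (MS2) translate into matching windows. The function names are hypothetical and the sketch does not reproduce how Peaks Online applies these settings internally.

```python
def ppm_window(mz, ppm=10.0):
    """Return the (low, high) m/z window for a relative tolerance in ppm."""
    delta = mz * ppm * 1e-6
    return mz - delta, mz + delta

def within_ms1(observed, theoretical, ppm=10.0):
    """True if a precursor m/z matches within the ppm tolerance."""
    return abs(observed - theoretical) <= theoretical * ppm * 1e-6

def within_ms2(observed, theoretical, tol_da=0.02):
    """True if a fragment m/z matches within the absolute Da tolerance."""
    return abs(observed - theoretical) <= tol_da
```

A ppm tolerance scales with m/z, so the 10 ppm window is about 0.01 wide on each side at m/z 1000 but only about 0.005 at m/z 500, whereas the 0.02 Da fragment window is constant.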

Multivariate ROC curves were generated by Monte Carlo cross-validation (MCCV) using balanced sub-sampling. In each MCCV, two-thirds of the samples were used to evaluate the feature importance. The top 2, 3, 5, 10 . . . 100 (max) important features were then used to build classification models, which were validated using the remaining one-third of the samples. The procedure was repeated multiple times to calculate the performance and confidence interval of each model. PLS-DA was used as the classification method, and the T statistic was selected as the feature-ranking method with two latent variables. Feature selection was based on the ROC curve results, and the top 5, 10, 15, 25, 50, and 100 proteins were used for predictive accuracy assessment.
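The MCCV procedure with balanced sub-sampling can be sketched in a few lines of Python. This is an illustration under stated assumptions, not MetaboAnalyst's implementation: a simple nearest-centroid projection is substituted for PLS-DA, features are ranked by a Welch-type t statistic computed on the training split only, the AUC is obtained from the Mann-Whitney statistic, and the interval is a plain percentile interval over iterations. All function names are hypothetical, and class labels are assumed to be coded 0/1.

```python
import numpy as np

def balanced_split(y, rng, train_frac=2 / 3):
    """Draw the same number of training samples from each class."""
    y = np.asarray(y)
    classes = np.unique(y)
    n_train = int(min(np.sum(y == c) for c in classes) * train_frac)
    train = np.concatenate(
        [rng.permutation(np.flatnonzero(y == c))[:n_train] for c in classes]
    )
    test = np.setdiff1d(np.arange(y.size), train)
    return np.sort(train), test

def auc(scores, y):
    """AUC from the Mann-Whitney statistic (ties count one half)."""
    pos, neg = scores[y == 1], scores[y == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size)

def mccv_auc(X, y, n_features=5, n_iter=50, seed=0):
    """Mean held-out AUC and 2.5/97.5 percentiles over MCCV iterations."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    aucs = []
    for _ in range(n_iter):
        tr, te = balanced_split(y, rng)
        Xtr, pos, neg = X[tr], y[tr] == 1, y[tr] == 0
        m1, m0 = Xtr[pos].mean(axis=0), Xtr[neg].mean(axis=0)
        v1, v0 = Xtr[pos].var(axis=0, ddof=1), Xtr[neg].var(axis=0, ddof=1)
        # Rank features on the training split only to avoid information leakage.
        t = (m1 - m0) / np.sqrt(v1 / pos.sum() + v0 / neg.sum() + 1e-12)
        top = np.argsort(-np.abs(t))[:n_features]
        # Stand-in classifier: project onto the difference of class centroids.
        w = m1[top] - m0[top]
        midpoint = (m1[top] + m0[top]) / 2
        scores = (X[te][:, top] - midpoint) @ w
        aucs.append(auc(scores, y[te]))
    return float(np.mean(aucs)), np.percentile(aucs, [2.5, 97.5])
```

With 10 samples per class, as in the BrM (LC) versus glioma (grade 4) comparison, each iteration of this sketch trains on 6 samples per class and evaluates on the remaining 4 per class. Calling `mccv_auc(X, y, n_features=k)` for several values of `k` mirrors the idea of comparing models built from the top-ranked features.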

For the study design, we implemented two quality controls at the experimental and instrumental levels. Firstly, during the MS acquisition, four commercial HeLa digests (200 ng each) were evenly interspersed for instrumental quality control (QC), which showed constant stability across the entire process of sample analysis (Figure 2C, Figure S1A,B). Secondly, three HeLa samples (H) were treated in parallel with tumor tissues from sample preparation to data analysis as experimental quality control. The two types of tumors were analyzed alternately at random, and a HeLa sample was inserted every 10 samples, resulting in an average of approximately 5200 identified proteins and 4499 co-quantified proteins in HeLa cells (Figure S1C,D). Furthermore, a similar distribution of peptide intensities, low coefficient of variation (CV), and high correlation (0.99) of label-free quantification (LFQ) protein intensities were observed in the three HeLa samples (Figure 2D, Figure S1E-G). Overall, we performed robust proteomic analysis for two types of brain tumors using rigorous experimental controls. As a result, 27 of the 29 cohort samples successfully passed the quality control filters with more than 50% overlap of quantified proteins in each sample. In total, 8165 protein groups and 101,656 peptides were quantified across all samples, with averages of approximately 4200 proteins and 33,000 peptides per sample (Figure 2E, Table S2).

The following four essential hallmarks of metastatic cells during invasion and brain metastasis have been previously proposed: (a) motility and invasion; (b) modulation of the microenvironment; (c) plasticity; and (d) colonization [27]. LC, BC, and CC are the most prevalent types of cancer that metastasize to the brain. Here, we performed complete clustering and searched for robust clustering of distinct features of the proteome correlated with all BrMs to increase our understanding of the general mechanisms of BrMs. Hierarchical clustering identified 3 clusters across all samples, resulting in 1112 proteins in cluster 1, 481 proteins in cluster 2, and 2261 proteins in cluster 3 (Figure 3A, Figure S2A, and Table S3).

Grade 4-Mut and Grade 4-WT gliomas, which accounted for 662 highly expressed proteins (fold change > 2 or < 0.5), of which 430 proteins were in Grade 4-Mut and 232 proteins were in Grade 4-WT

FIGURE 2 System-wide analysis of clinical samples. (A) Heatmap describing samples collected from 29 patients, including gliomas from WHO grade 2 to 4, with IDH1 mutations and wildtypes, as well as BrMs from LC, BC, OC to CC. (B) Contrast-enhanced MRI of BrMs and gliomas. Patients with BrMs from LC, OC, and BC as well as gliomas of WHO grade 3 with IDH1 mutation (Grade 3-Mut), grade 4 with IDH1 mutation (Grade 4-Mut), and gliosarcoma (Grade 4-GSM) are shown. (C) Number of quantified proteins segregated by binned average protein intensities in four commercial HeLa digests for mass spectrometry quality control (QC). (D) Distribution of peptide intensities in parallel HeLa (H) treatments (n = 3) for experimental quality control. (E) Number of protein group, peptide and PSM identifications for BrMs (n = 14) and gliomas (n = 13). Values above columns indicate average numbers of identification. Error bars indicate standard error of mean based on biological replicates.

FIGURE 3 Distinct features of BrM proteome profile. (A) Complete clustering of the expression level of 3854 LFQ proteins from BrM samples (n = 14). Violin and dot plot showed the log2-transformed protein and site intensity distributions for each region. Pearson correlation was used for distance measurement. (B) GSEA of proteins in the three clusters. The top 20 annotations of GO-BP, Reactome, WikiPathways, and KEGG pathways are shown (Benjamini-Hochberg FDR method, adjusted P < 0.01). (C-E) Protein enrichment networks in 3 clusters based on the top functional enriched terms from GSEA.

FIGURE 4 Unsupervised clustering of protein patterns distinguishes distinct grades of gliomas. (A) Multigroup heatmap with dendrogram of 3774 LFQ proteins across 13 glioma samples. Ward's method was performed for clustering, and Pearson correlation was used for distance measurement. (B) PCA of all glioma samples (n = 13) based on LFQ proteome profiles; 95% confidence regions are displayed. (C) Heatmap of the fold changes between WHO grade 4 (Grade 4-Mut, n = 4) and 2/3 (Grade 2/3-Mut, n = 3) IDH1 mutant gliomas and their GSEA results. Fold change > 2 or < 0.5. The top 20 annotations of GO, Reactome, WikiPathways, and KEGG pathways are shown (Benjamini-Hochberg FDR method, adjusted P < 0.01). (D) Volcano plot of the statistical significance in WHO grade 4 with IDH1 mutation (Grade 4-Mut, n = 4) versus IDH1 wildtype (Grade 4-WT, n = 4). Two-tailed Student's t-test, Benjamini-Hochberg FDR method, adjusted P < 0.05 and fold change > 2 or < 0.5. (E) The log2-transformed intensity of co-detected DEPs (KDM3B and IER3IP1) in two comparisons. Error bars indicate standard error of mean based on biological replicates.

FIGURE 5 Proteomic differences between primary and metastatic brain tumors. (A) Heatmap of 407 DEPs between BrMs (n = 14) and gliomas (n = 13). Complete method was performed for clustering and Pearson correlation was used for distance measurement. (B) Protein enrichment networks of upregulated and downregulated proteins in BrMs (n = 14) versus gliomas (n = 13). (C) GSEA results of downregulated and upregulated proteins in the two groups. The top 20 annotations of GO-BP, Reactome, WikiPathways, and KEGG pathways are shown (Benjamini-Hochberg FDR method, adjusted P < 0.01). (D) The enrichment of tissue- and cell-specific gene patterns based on PaGenBase (Benjamini-Hochberg FDR method, adjusted P < 0.01).

In the Grade 2/3-Mut subgroups, by contrast, the highly expressed proteins were mainly concentrated in energy metabolism and cell development, such as organophosphate catabolic process, propanoate metabolism, amino acid metabolic process and protein catabolic process. Likewise, when comparing the Grade 4-Mut with the Grade 4-WT group, the GSEA results of highly expressed proteins in Grade 4-Mut were correlated with RNA processes and immune response (Figure S3C). The Grade 4-WT group showed activity in biological regulation, for instance, regulation of proteolysis, negative regulation of endopeptidase activity, and response to oxygen levels. In addition, comparison of the DEPs in the grade 4 versus grade 2/3 group (Grade 4-Mut vs. Grade 2/3-Mut) and in the IDH1 mutant versus wildtype group (Grade 4-Mut vs. Grade 4-WT) revealed a low overlap between the two comparisons. The two overlapping proteins (KDM3B and IER3IP1) were uniquely highly expressed in Grade 4-Mut (Figure 4E and Figure S3E). The function of lysine-specific demethylase 3B (KDM3B), a specific regulator of H3K9 methylation [41], has been poorly studied in glioma. However, an increase of H3K9 methylation has been detected in IDH mutants compared with wildtypes [42], suggesting a potential function of KDM3B in IDH mutant gliomas. KDM3B may also have tumor suppressor activity like its homolog demethylase KDM3A [43]. Immediate early response 3 interacting protein 1 (IER3IP1) is an endoplasmic reticulum protein potentially involved in brain development and the secretion of ECM proteins [44]. We envision that its potential function in gliomas can be uncovered. In short, these results suggest that the proteomics results are reliable and have the potential to discriminate different types of gliomas.

The present study utilized BrMs and gliomas to comprehensively analyze primary and secondary brain tumors. The proteome-based results clearly illustrated that BrM colonization in the brain depends on tumorigenesis and multiple interactions of metastatic cancer cells with the brain microenvironment, whereas gliomas, as one of the representative primary tumors in the brain, maintain a high tendency to invade. Notably, microenvironment analysis has shown that BrM samples have a more pronounced accumulation of lymphocytes and neutrophils compared to gliomas, whereas gliomas are dominated by microglia [19]. Furthermore, the proteomic differences between BrM (LC) and glioma (grade 4) were utilized for precise disease classification. Using machine learning, five proteins (TBR1, MUC1, LAMB3, SFN, and GPRC5A) were further selected to classify these two tumors with an accuracy of 93%, all of which were highly expressed in BrMs (Figure 5D). T-box brain protein 1 (TBR1) is a transcriptional repressor involved in multiple aspects of brain development like cortical brain malformations [45]. Notably, accumulations of TBR1 methylation have been observed in the BrMs of renal cell cancer [46].

FIGURE 6 Determination of biomarker combinations for distinguishing primary and secondary tumors. (A) Receiver operating characteristic (ROC) curves for all biomarker combination models for discriminating BrMs (LC) from gliomas (grade 4) based on Monte Carlo cross-validation (MCCV). Partial least squares discriminant analysis (PLS-DA) was used as the classification method, and PLS-DA built-in was selected as the feature-ranking method with two set latent variables. (B) Predictive accuracies with different features (top 5, 10, 15, 25, 50, and 100 proteins) based on the ROC curves (A). (C) Predicted class probabilities (average of the cross-validation) for each sample using a 5-biomarker combination model. Due to balanced subsampling, the classification boundary is at the center (x = 0.5, dotted line). (D) Selected frequency of five markers (TBR1, MUC1, LAMB3, SFN, and GPRC5A) in both groups.