Four Key Genes are Biomarkers Associated with Immunity in Neuroglioma

Background: Glioma is the most common intracranial tumor, and glioblastoma is its most malignant form. Treatment options remain limited, and targeted therapy is an important avenue for progress. Methods: Numerous genes are differentially expressed during the progression of glioma, some of which may play a key role. To find key genes, we analyzed three multi-sample microarray datasets (GSE4290, GSE54004, and GSE29796) from the GEO database and took the intersection of their differentially expressed genes (DEGs). We entered all DEGs into the STRING database and visualized their protein-protein interaction (PPI) networks with Cytoscape. We also used the GEPIA2 and CGGA databases to predict the relationship between key genes and the prognosis of glioma patients. Results: A total of 222 up-regulated genes and 127 down-regulated genes were identified. Expression of four genes (FN1, LAMB1, FAM20C, and COL6A1) was significantly negatively correlated with survival in malignant glioma. Expression levels of the four genes increased with glioma grade. All four genes were more highly expressed in IDH wild-type glioma and were enriched in the mesenchymal subtype (AUC > 0.8). In addition, they can be defined as hazard factors for glioma. We found that these genes were co-expressed and jointly involved in the infiltration of immune cells in tumors. Conclusion: FN1, LAMB1, FAM20C, and COL6A1 are associated with poor prognosis in glioma patients. These genes might be clinical targets for glioma immunotherapy.

Background
Glioblastoma (GBM) is the most lethal primary malignant brain tumor in adults, with poor survival because of acquired therapeutic resistance and rapid recurrence [1]. At present, temozolomide is the first-line drug and has a good therapeutic effect in MGMT-positive patients. Differential correlation analysis of glioblastoma shows that immune cell interactions can predict patient survival, and existing immunotherapies have made some advances [2,3]. In addition, physical therapy and tumor electric field therapy have entered the research stage [4]. With the advent of the big data era, gene sequencing has produced a wealth of gene chip data. With appropriate methods, these data can be used for bioinformatics analysis to study the key genetic changes and epigenetic characteristics of glioma at the molecular level [5,6]. The treatment of high-grade glioma remains a difficult problem, mainly because of its strong heterogeneity and its propensity for invasion and metastasis. Therefore, to find genes that may play a key role in tumor progression, we analyzed the differential genes between LGG and GBM.

GO and KEGG
We analyzed the target genes in the three Gene Ontology (GO) categories, cellular component (CC), molecular function (MF), and biological process (BP), and identified the genes associated with each. The KEGG database (https://www.kegg.jp/kegg/pathway.html) catalogs the pathways in which genes participate. We used R (4.0.2) to perform enrichment analysis of the key genes and determine the functions and pathways in which they are enriched.

TIMER
TIMER (https://cistrome.shinyapps.io/timer/) contains 10,897 tumors across 32 cancer types. It provides six major analysis modules that allow users to interactively explore links between immune infiltration and a wide range of factors, including gene expression, clinical outcomes, somatic mutations, and somatic copy number changes. We queried the correlation between the key genes and immune cells and assessed the major risk factors. TIMER automatically outputs the Cox regression results, including hazard ratios and statistical significance.

Screening of DEGs
Three gene expression microarray datasets (GSE4290, GSE54004, and GSE29796) were downloaded from the GEO database.
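The intersection step can be illustrated with a minimal Python sketch. The per-dataset values, the placeholder genes GENE_X and GENE_Y, and the cutoffs (|log2FC| > 1, adjusted p < 0.05) are assumptions chosen for illustration; they are not values or thresholds reported in this study.

```python
from typing import Dict, Set, Tuple

# Hypothetical per-dataset results: gene -> (log2 fold change, adjusted p-value).
# All numbers are illustrative and are not taken from the study.
results: Dict[str, Dict[str, Tuple[float, float]]] = {
    "GSE4290": {"FN1": (2.1, 0.001), "LAMB1": (1.4, 0.01),
                "GENE_X": (0.3, 0.20), "GENE_Y": (-1.8, 0.002)},
    "GSE54004": {"FN1": (1.7, 0.003), "LAMB1": (1.2, 0.02),
                 "GENE_X": (1.5, 0.01), "GENE_Y": (-1.3, 0.01)},
    "GSE29796": {"FN1": (2.4, 0.0005), "LAMB1": (1.1, 0.04),
                 "GENE_X": (1.9, 0.03), "GENE_Y": (-2.0, 0.001)},
}


def select_degs(table: Dict[str, Tuple[float, float]],
                lfc_cutoff: float = 1.0,
                p_cutoff: float = 0.05) -> Tuple[Set[str], Set[str]]:
    """Split one dataset into up- and down-regulated gene sets (assumed cutoffs)."""
    up = {g for g, (lfc, p) in table.items() if lfc > lfc_cutoff and p < p_cutoff}
    down = {g for g, (lfc, p) in table.items() if lfc < -lfc_cutoff and p < p_cutoff}
    return up, down


def intersect_degs(all_results: Dict[str, Dict[str, Tuple[float, float]]]
                   ) -> Tuple[Set[str], Set[str]]:
    """Keep only genes that are up (or down) in every dataset."""
    ups, downs = zip(*(select_degs(t) for t in all_results.values()))
    return set.intersection(*ups), set.intersection(*downs)


common_up, common_down = intersect_degs(results)
print("up in all datasets:", sorted(common_up))
print("down in all datasets:", sorted(common_down))
```

Intersecting the up- and down-regulated sets separately keeps a gene only when its direction of change agrees across all three datasets.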

Protein interaction network construction and key gene screening
We used the STRING web tool to explore the interactions between the proteins encoded by the DEGs. The resulting PPI network, with 348 nodes and 834 edges, was visualized with Cytoscape (Fig. 1b). Cluster 2 was chosen for further study, and its genes were ranked with cytoHubba (Fig. 1c, Table 2).
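Hub-gene ranking in a PPI network can be sketched as follows. cytoHubba offers several scoring methods, and the one applied here is not stated in this section, so the sketch assumes the simplest one, node degree. The edge list and the placeholder node GENE_Z are hypothetical and do not come from the STRING output of this study.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical undirected PPI edges, for illustration only.
edges: List[Tuple[str, str]] = [
    ("FN1", "LAMB1"),
    ("FN1", "COL6A1"),
    ("FN1", "FAM20C"),
    ("LAMB1", "COL6A1"),
    ("COL6A1", "GENE_Z"),
]


def degree_ranking(edge_list: Iterable[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Rank nodes by number of distinct interaction partners (degree)."""
    neighbours: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edge_list:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Sort by descending degree, then alphabetically to break ties.
    return sorted(((gene, len(partners)) for gene, partners in neighbours.items()),
                  key=lambda item: (-item[1], item[0]))


ranking = degree_ranking(edges)
for gene, degree in ranking:
    print(gene, degree)
```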

Survival analysis of DEGs
Our results revealed that 4 of the 21 genes were significantly related to patient prognosis. To explore the prognostic value of the four genes in LGG and GBM patients, we used the GEPIA2 online survival analysis tool to draw overall survival curves for the key genes (Fig. 2). The survival analyses of FN1, LAMB1, FAM20C, and COL6A1 showed statistically significant differences (log-rank p < 0.05).
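For readers unfamiliar with the log-rank test behind these survival curves, a minimal two-group sketch follows. It is not the GEPIA2 implementation. The survival times, the event indicators, and the split into high- and low-expression groups are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def logrank_test(times_a: Sequence[float], events_a: Sequence[int],
                 times_b: Sequence[float], events_b: Sequence[int]
                 ) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square statistic, p-value, 1 df).

    events are 1 for an observed death and 0 for a censored observation.
    """
    data: List[Tuple[float, int, int]] = (
        [(t, e, 0) for t, e in zip(times_a, events_a)]
        + [(t, e, 1) for t, e in zip(times_b, events_b)]
    )
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, group B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n  # observed minus expected deaths in A
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical survival times (months) and event flags for two expression groups.
high_times, high_events = [5, 8, 12, 14, 20], [1, 1, 1, 1, 0]
low_times, low_events = [18, 25, 30, 34, 40], [1, 0, 1, 0, 0]

chi2, p_value = logrank_test(high_times, high_events, low_times, low_events)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

At each event time the test compares the deaths observed in one group with the number expected if both groups shared the same hazard, so censored patients still contribute to the at-risk counts until they leave the study.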

Analysis of gene expression by GEPIA
GEPIA was used to further analyze the expression of each gene in LGG and GBM. In LGG, FN1 was highly expressed, LAMB1 expression was lower than in normal tissues, and FAM20C and COL6A1 showed no significant difference in expression from normal tissues. In GBM, FN1, FAM20C, and COL6A1 were all expressed at higher levels than in normal tissues, whereas LAMB1 showed no differential expression relative to normal tissues. In general, the expression of these four genes increased during the progression from LGG to GBM (Fig. 3).

Key gene verification analysis
We validated the key roles of FN1, LAMB1, FAM20C, and COL6A1 in glioma using the CGGA database. High expression of these four genes was associated with shorter survival in glioma patients (p < 0.05) (Fig. 4a). More importantly, they were significantly associated with poor prognosis in GBM patients (p < 0.05) (Fig. 4b). Gene expression also increased with the WHO grade of glioma (Fig. 5a). Expression of the four genes was higher in IDH wild-type gliomas than in IDH-mutant gliomas (Fig. 5b). The area under the curve (AUC) for predicting the mesenchymal subtype in the CGGA database was greater than 0.8 (Fig. 5c and d).
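The AUC can be read as the probability that a randomly chosen mesenchymal sample has higher expression of the gene than a randomly chosen non-mesenchymal sample. A minimal sketch of that pairwise definition follows. The expression values and subtype labels are hypothetical and are not CGGA data.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical expression values; label 1 = mesenchymal subtype, 0 = other.
expression = [8.2, 7.9, 7.1, 6.5, 6.9, 5.8, 5.5, 7.0]
is_mesenchymal = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(expression, is_mesenchymal)
print(f"AUC = {auc:.3f}")
```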

Enrichment analysis of four key genes
In the BP category, the four genes were mainly enriched in neutrophil degranulation, neutrophil activation involved in immune response, and neutrophil activation; in the MF category, they were enriched in receptor interaction. In the CC category, the key genes were mainly enriched in the collagen-containing extracellular matrix, focal adhesion, cell-substrate junction, and cytokine-cytokine, etc. In addition, KEGG pathway analysis showed that the DEGs were predominantly enriched in the cytokine-cytokine receptor interaction signaling pathway (Fig. 8, Table 4).
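Enrichment analyses of this kind are commonly based on a hypergeometric (over-representation) test. The sketch below shows that statistic under stated assumptions. It is not the R workflow used in this study, and the background size, term size, and overlap are hypothetical numbers.

```python
from math import comb


def enrichment_p_value(n_background: int, n_in_term: int,
                       n_query: int, n_overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes in the annotated universe
    n_in_term:    background genes annotated to the GO term or pathway
    n_query:      genes in the query list
    n_overlap:    query genes annotated to the term
    """
    upper = min(n_in_term, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 21 query genes, 6 of which fall in a 300-gene term,
# against an assumed background of 20,000 annotated genes.
p_value = enrichment_p_value(20000, 300, 21, 6)
print(f"enrichment p = {p_value:.3e}")
```

In practice the p-values from many terms are then corrected for multiple testing, a step omitted here for brevity.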

Co-expression and immune cell infiltration
Co-expression analysis shows the relationship between different genes. Using the CGGA database, GEPIA, and TIMER, we analyzed the co-expression of FN1, LAMB1, FAM20C, and COL6A1. These four genes were closely related to each other (Fig. 7). The GO and KEGG results revealed that they are related to immunity. The immune infiltration associated with the key genes in LGG and GBM was analyzed with TIMER (Fig. 9a and b). In LGG, CD8+ T cells, macrophages, FAM20C, and COL6A1 were defined as hazard factors (Table 5, P < 0.05), and FAM20C and COL6A1 were associated with poor prognosis (Fig. 9c). In GBM, dendritic cells and CD4+ T cells were defined as hazard factors (Table 5, P < 0.05), and COL6A1 was associated with poor prognosis (Fig. 9d).
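Co-expression between two genes is typically quantified with a correlation coefficient. The sketch below assumes Pearson correlation; the online tools may use Spearman correlation or purity-adjusted variants instead. The expression vectors are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression of two genes across five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(gene_a, gene_b)
print(f"r = {r:.2f}")
```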

Discussion
The mortality rate of glioma is very high, and current treatment options remain unsatisfactory. Many genotype-oriented principles of disease classification have been introduced, which have opened broad research directions for medical treatment [7]. Using a systematic analysis of three microarray datasets from the GEO database, we targeted genes that participate throughout the development of glioma and extracted four key genes. These genes play a vital role in the development of glioma. Immunotherapy is an emerging therapeutic approach that can inhibit the tumor and act on it specifically, serving as an adjuvant therapy [8]. Part of the treatment of brain tumors has shifted toward immunomodulatory interventions. Gliomas are infiltrated by a variety of immune cell types, such as neutrophils, macrophages, and T cells [9,10]. Microglia and macrophages are enriched in the microenvironment, and significant interactions between these cells promote the malignant progression of gliomas [11]. At present, many recognized immune markers are known to play important immunoregulatory roles in gliomas. IDH1 (R132H) is a neoantigen that triggers immune responses in IDH1 (R132H)-mutant gliomas [12]. Programmed cell death protein 1 and its ligand (PD-1 and PD-L1) and cytotoxic T lymphocyte-associated antigen 4 (CTLA-4) may be key factors that allow tumor cells to evade immune attack [13]. Unfortunately, single-target immunotherapy has not yet shown a significant effect, and patient survival has not increased significantly [14,15]. We hope to identify meaningful immune research targets and promote the progress of immunotherapy.
Fibronectin 1 (FN1) is a central component of the extracellular matrix (ECM), which shapes the tumor microenvironment (TME) and participates in the invasion, migration, immune infiltration, and metabolism of tumor cells [16,17]. A comparison of genetic differences between grade III and grade IV gliomas found that ELAV-like protein 1 (ELAVL1) and FN1 may participate in glioma growth through the PI3K-Akt signaling pathway and that the ECM can promote tumor invasion [18]. Consistent with our results, COL3A1, FN1, MMP9, and other genes are considered to play an important role in glioblastoma, and these genes are also mainly located in the ECM [19]. Guanylate binding protein 2 (GBP2) is a large interferon-induced GTPase involved in immunity against microorganisms. Studies have identified a role for the GBP2/Stat3/FN1 signaling cascade in GBM invasion [8]. Many genes in the TME of malignant gliomas are related to patient prognosis, including LAMB1, FN1, ACTN1, TRIM, SERPINH1, CYBA, LAIR1, and LILRB2 [20]. In addition, MIR-1 and MIR-1271 inhibit FN1, which can ultimately improve the effect of chemotherapy. Low expression of both is related to poor prognosis in glioma patients [21,22].
During tumor epithelialization and metastasis, LAMB1 (laminin β-1) is activated to promote the epithelial-mesenchymal transition (EMT) process [23]. In breast cancer, protein phosphorylation levels change markedly, and the secreted phosphoproteome may reflect the progression and subtype of the disease. Among these proteins, CD44, OPN, FSTL3, LAMB1, and STC2 are of great significance [24]. In colorectal cancer, LAMA1, LAMA3, LAMB1, and LAMB4 are more abundant [25]. LAMB1 is superior to carcinoembryonic antigen (CEA) in distinguishing colorectal cancer patients from controls, and the combined measurement of LAMB1 and CEA may improve the accuracy of colorectal cancer diagnosis [26]. Silencing LAMB1 and CACNA1D in prostate tissue can also reduce tumor cell infiltration [27]. These candidate genes may assist diagnosis and treatment and may help predict the risk of metastasis at an early stage of tumor development.
FAM20C is a recently characterized kinase that phosphorylates secreted proteins and proteoglycans. It phosphorylates hundreds of secreted proteins, is activated by the pseudokinase Fam20A, and is closely related to the metabolism of substances in the Golgi apparatus [28,29]. Its substrates include many extracellular proteins, among them the small integrin-binding ligand, N-linked glycoproteins [30]. Some studies suggest that activators of Fam20C may be beneficial in cancer. In addition, activators of G-CK/Fam20C may provide a new treatment tool in the fields of biomineralization and low-phosphate diseases [31,32]. In lung cancer, FAM20C, MYLIP, and COL7A1 have been identified as key hypoxia-related genes in the LUAD process and are regulated by DNA methylation [33,34]. Triple-negative breast cancer (TNBC) cells in which FAM20C is activated show a strong anti-proliferative effect, with increased apoptosis and decreased migration [35]. There are few studies on Fam20C in gliomas. In this study, we found for the first time that FAM20C expression is also up-regulated in GBM. We speculate that FAM20C may also exert an anti-proliferative effect, acting as an antagonist of glioma evolution. Its expression may be up-regulated in response to the up-regulation of tumor-promoting genes.
COL6A1 (collagen type VI α1) is located on chromosome 21 and helps maintain the integrity of various tissues [36]. COL6A1 expression differs significantly between normal glial cells and both low-grade (grade I, II) and high-grade (grade III, IV) astrocytoma, and the difference is more pronounced in high-grade samples [37,38]. Many studies have used this protein as a marker of epithelial-mesenchymal transition, and it plays an important role in the tumor ECM-receptor interaction and focal adhesion pathways [39,40].
Compared with differentiated glioblastoma cells (DGCs), cancer stem cells (CSCs) show increased expression of 10 proteins that interact with the ECM (COL6A1, COL6A3, FN1, ITGA2, ITGA5, ITGAV, ITGB1, ITGB3, LAMB1, and LAMC1), indicating that CSCs may be highly aggressive [12]. Therefore, three of our genes (FN1, LAMB1, and COL6A1) are also involved in tumor recurrence, which is one of the characteristics of glioma stem cells. Given the close connections between these three genes, exploring their specific interactions in glioma would be very meaningful. Our research screened out genes related to the prognosis of GBM as potential therapeutic targets. Moreover, during glioma evolution, glioma cells can shape a microenvironment that suits them, increasing their capacity for proliferation and invasion [41]. In general, the expression of the four genes increased during the progression from LGG to GBM.
If the expression of the key microenvironment genes FN1, LAMB1, COL6A1, and FAM20C can be regulated at an early stage, a good therapeutic effect may be obtained. Moreover, the results of the enrichment and co-expression analyses make it reasonable to think that these genes may also be involved in immune-related processes. COL6A1 and FAM20C may be risk factors associated with immune cell infiltration and with shorter survival in glioma patients.

Conclusion
Our bioinformatics analysis was based on microarray gene expression data from the GEO database and screened for DEGs between GBM and LGG samples. Ultimately, 21 possible hub genes were identified. According to the final survival analysis, overexpression of four genes (FN1, LAMB1, FAM20C, and COL6A1) was associated with a poorer prognosis in LGG and GBM patients. These four genes are also associated with immune processes in neuroglioma. Further research is needed to explore the biological functions of these genes and the mechanisms by which they contribute to the pathogenesis of glioma.

List Of Abbreviations
LGG

Declarations
Ethics approval and consent to participate
Not applicable.

Availability of data and materials
The data and materials used to support the findings of this study are available from the corresponding author upon request.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.