Preprocessing and preliminary screening of pseudogene methylation site data in glioma
The clinicopathological features of patients with glioma were downloaded from TCGA, and the methylation data were obtained from the TCGA.GBMLGG.sampleMap/HumanMethylation450 dataset (https://xenabrowser.net/datapages/?dataset=TCGA.GBMLGG.sampleMap%2FHumanMethylation450&host=https%3A%2F%2Ftcga.xenahubs.net&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443). To generate a prognostic signature based on pseudogene methylation, we used the clinical data and methylation levels of patients in the public TCGA database. The β-value (0 to 1) represented the methylation level of each probe. Finally, 649 glioma samples were randomly divided into a training dataset (325 samples) and a validation dataset (324 samples).
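The random division into training and validation datasets can be illustrated with a minimal sketch. The sample identifiers, the seed, and the helper name `split_samples` are assumptions made for illustration; the original analysis does not state how the random split was implemented.

```python
import random

def split_samples(sample_ids, n_train, seed=0):
    """Randomly partition sample IDs into training and validation lists."""
    rng = random.Random(seed)        # fixed seed so the split is reproducible
    shuffled = list(sample_ids)      # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical identifiers standing in for the 649 glioma samples.
sample_ids = [f"sample_{i:03d}" for i in range(649)]
train_ids, valid_ids = split_samples(sample_ids, n_train=325)
print(len(train_ids), len(valid_ids))  # 325 324
```

Fixing the seed is a design choice that makes the partition repeatable, which matters because the median cut-off and all downstream validation depend on which samples fall in the training dataset.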
Identification of methylated pseudogenes and construction of the signature of six-methylated pseudogenes
Heatmaps of differential methylation were drawn to systematically compare the levels of pseudogene methylation in LGG and GBM. Under the filtering condition of P < 0.05, 22 prognostic pseudogenes were obtained. Subsequently, univariate Cox regression analysis was used to screen the methylated pseudogenes, with the P-value cut-off set at 0.05. Next, multivariate stepwise Cox regression analysis was carried out to build the prognostic risk model. Ultimately, six methylated pseudogenes were used to construct the prognostic model. The risk score was calculated using the following formula: Risk score = coef(AZGP1P1) × promoter methylation level of AZGP1P1 + coef(SUMO1P1) × promoter methylation level of SUMO1P1 + coef(INGX) × promoter methylation level of INGX + coef(KRT19P2) × promoter methylation level of KRT19P2 + coef(SBF1P1) × promoter methylation level of SBF1P1 + coef(CES1P1) × promoter methylation level of CES1P1. Glioma patients were ranked according to their risk scores and divided into high-risk and low-risk groups, using the median risk score of the training dataset as the cut-off.
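The risk score formula and the median split can be sketched as follows. The coefficient values and β-values below are placeholders chosen for illustration, not the fitted coefficients of the model, and assigning a score equal to the median to the low-risk group is an assumption, since the text does not say how ties are handled.

```python
from statistics import median

GENES = ["AZGP1P1", "SUMO1P1", "INGX", "KRT19P2", "SBF1P1", "CES1P1"]

def risk_score(coefs, beta):
    """Linear combination of Cox coefficients and promoter methylation levels."""
    return sum(coefs[g] * beta[g] for g in GENES)

def assign_groups(scores, cutoff):
    """Scores strictly above the cut-off are 'high'; all others are 'low' (assumed tie rule)."""
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Placeholder coefficients (NOT the values from the fitted model).
coefs = {"AZGP1P1": 0.5, "SUMO1P1": -1.0, "INGX": 0.25,
         "KRT19P2": 0.75, "SBF1P1": -0.5, "CES1P1": 1.0}

# Placeholder beta-values for three hypothetical training patients.
training = {
    "p1": {g: 0.2 for g in GENES},
    "p2": {g: 0.5 for g in GENES},
    "p3": {g: 0.8 for g in GENES},
}

scores = {pid: risk_score(coefs, beta) for pid, beta in training.items()}
cutoff = median(scores.values())   # median of the training dataset
groups = assign_groups(scores, cutoff)
print(groups)
```

The same `cutoff`, derived from the training dataset only, would then be applied unchanged to the validation dataset.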
Evaluation of the Prognostic Model
Kaplan-Meier survival curves were used to compare the overall survival (OS) of the high-risk and low-risk groups. To determine whether this model can serve as an independent prognostic factor for patients with glioma, univariate and multivariate Cox regression analyses were performed on the prognostic factors. Receiver operating characteristic (ROC) curves were then used to estimate the predictive power of the prognostic model.
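For readers unfamiliar with the Kaplan-Meier estimator used above, a minimal sketch is given here. The function name and the toy follow-up data are assumptions for illustration; in practice the curves would be produced with a dedicated survival analysis package.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed death.

    times  : follow-up durations
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed   # deaths and censored patients both leave the risk set
    return curve

# Toy data: four patients, the second one censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)  # [(1, 0.75), (3, 0.375), (4, 0.0)]
```

Censored patients do not lower the curve themselves, but they shrink the risk set, which is why the step at time 3 is larger than the step at time 1.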
Consensus clustering based on pseudogene methylation status to analyze prognostic molecular subtypes
Consensus clustering was performed with the ConsensusClusterPlus package in R to identify glioma subgroups. The samples were clustered into subtypes based on the six-methylated-pseudogene signature, to further explore the impact of the subtypes on the prognosis of glioma patients. A resampling method was used to draw subsets of the samples. Different numbers of clusters (k) were specified, and the suitability of each k was assessed. The number of subclasses was determined according to the following criteria: first, consistency within clusters was relatively high, with no significant increase in the area under the cumulative distribution function (CDF) curve; second, the heatmap of the sample consensus matrix, rearranged according to the final clustering results, was clearly delineated. This heatmap made it easy to assess the quality of the clustering results and the internal structure of the sample similarity matrix. In addition, we performed a survival analysis. To examine survival among the glioma subgroups, Kaplan-Meier (K-M) curves were used to compare overall survival between the high-risk and low-risk groups. Finally, a heatmap containing the pseudogene methylation level, sample type, radiation, gender, age, grade, and survival status (fustat) was generated to show the differences in pseudogene methylation levels.
Analysis by Bioinformatics Methods
A nomogram was constructed using the "rms" package in R (version 4.1.0) to predict the 1-year, 3-year, and 5-year overall survival of patients. Calibration plots were then used to graphically evaluate the predictive performance of the nomogram. Finally, the prognostic nomogram was validated in the validation dataset and in the entire dataset. Differentially methylated sites were investigated by Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis, and GSEA, which we used to assess the cellular functions related to the risk factors based on the prognostic characteristics of the six methylated pseudogenes.
Tumor Immune cell infiltration
To determine whether and how pseudogene methylation influenced the tumor immune microenvironment, related plots and a violin plot were drawn to show the contribution of immune cell infiltration in the two datasets [27]. The EpiDISH package infers the proportions of a priori known cell types in a sample containing a mixture of such cell types. In this study, it was used to infer the proportions of six immune cell types (natural killer (NK) cells, B cells, monocytes, CD4+ T cells, CD8+ T cells, and neutrophils) in samples with different pseudogene methylation levels.
Cell Culture
H4, SF126, LN18, SNB19, T98G, U251, and SW1088 glioma cells and one normal human astrocyte cell line (HEB) were obtained from the Cell Bank of the Chinese Academy of Sciences (Shanghai, China) and the cell bank of Sun Yat-Sen University (Guangzhou, China), respectively. All cells were cultured in Dulbecco's modified Eagle medium (DMEM, 319-00518, Gibco, USA) containing 10% fetal bovine serum (FBS, 10270-106, Gibco, USA) in an incubator at 37.0℃ with saturated humidity and 5% CO2. When confluence reached 80% or more, the cells were digested with trypsin (C0201, Beyotime, China) and passaged.
Silencing and Transfection of pseudogenes
siRNAs were used to silence SBF1P1 and SUMO1P1. The SBF1P1 siRNA (U0804, RiboBio, Guangzhou, China) comprised three sequences (siSBF1P1-1: CCTCTCAGATACAGCTTCA, siSBF1P1-2: GCAAGAACAAGAACCTGTA, and siSBF1P1-3: CACATTCCAGCTGCTGAAA), and the SUMO1P1 siRNA (U0804, RiboBio, Guangzhou, China) comprised another three sequences (siSUMO1P1-1: CCTCTCAAGAAACTCAAGA, siSUMO1P1-2: GAGGTCATTCAACAGTTTA, and siSUMO1P1-3: TAACGACTAACTCCAAAGA). First, cells in the logarithmic growth phase were seeded evenly into 6-well plates, and transfection was performed when the cells reached 40 to 50% confluence. Second, 10 µL of siRNA or control RNA was mixed with 120 µL of 1× riboFECT™ CP Buffer and 12 µL of riboFECT™ CP Reagent. The reagents were mixed thoroughly and incubated at room temperature for 15 minutes, and the riboFECT™ CP mixture was then added to the 6-well plates. Finally, 24 h after transfection, the original medium in the 6-well plates was discarded, and the cells were washed twice with PBS and digested with trypsin for subsequent use.
RT-qPCR and Western blot analysis
The RT-qPCR primers were as follows: GAPDH, forward 5'-AATGGGCAGCCGTTAGGAAA-3', reverse 5'-GCGCCCAATACGACCAAATC-3'; SBF1P1, forward 5'-ATTCCCCCAGCTGTTTTGCC-3', reverse 5'-TTTCCTGCTCCCAGAAGGTCAAG-3'; SUMO1P1, forward 5'-TGAGGCATAGCGGAAGTGAC-3', reverse 5'-CAGACATGGTGATGGGGCAT-3'. Total RNA was extracted from LN18 and T98G cells using TRIzol reagent (Takara, Japan). RNA was reverse transcribed into cDNA according to the instructions of the PrimeScript RT Master Mix reverse transcription kit (AK51812A, Takara, Japan). RT-qPCR was then carried out with TB Green Premix Ex Taq II (Tli RNaseH Plus) (AK51812A, Takara, Japan), using cDNA as the template, with 1.6 µL of substrate, 3.2 µL of primer, and 5.2 µL of TB Green. The RT-qPCR program was as follows: pre-denaturation at 95.0°C for 30 s (1 cycle), followed by 40 cycles of 95.0°C for 10 s and 60.0°C for 31 s. The melting curve was recorded over one cycle of 95.0°C for 15 s, 60.0°C for 1 minute, and 95.0°C for 15 s. Relative expression, normalized to the reference gene, was calculated using the 2^(-ΔΔCt) method. For Western blot analysis, U251 and T98G cells were lysed in RIPA protein extraction reagent containing a phosphatase inhibitor and a protease inhibitor cocktail. Protein concentrations were measured with a BCA Protein Assay Kit (Beyotime, China). The antibodies used are listed in Supplementary Table 3.
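The 2^(-ΔΔCt) calculation used for the RT-qPCR data can be sketched as follows. The Ct values are invented for illustration, and treating GAPDH as the reference gene is an assumption based on its primers being listed above.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene by the 2^(-ddCt) method."""
    delta_treated = ct_target_treated - ct_ref_treated   # dCt in the siRNA group
    delta_control = ct_target_control - ct_ref_control   # dCt in the control group
    delta_delta = delta_treated - delta_control          # ddCt
    return 2 ** -delta_delta

# Invented Ct values: target gene vs. reference gene (assumed GAPDH).
fold_change = relative_expression(
    ct_target_treated=26.0, ct_ref_treated=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
print(fold_change)  # 0.25
```

In this toy case the target amplifies two cycles later after knockdown at an unchanged reference Ct, which corresponds to a quarter of the control expression.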
Cell Viability Assay
An MTT kit (Beyotime, Shanghai, China) was used to measure cell viability. T98G and U251 cells were transfected with si-SBF1P1 and si-SUMO1P1, respectively. After 24 hours, the cells were digested and counted, and 6000 cells were seeded into each well of a 96-well plate for culture. Then, 10 µL of MTT reagent was added to each well, and the plate was kept in the dark for 4 hours. Next, the medium was carefully discarded and 140 µL of DMSO was added to each well. The 96-well plate was then gently shaken in the dark for 10 min. Finally, the plate was read with a microplate reader.
Cell Migration and Invasion Assay
At 24 h after transfection, T98G and U251 cells were collected from the 6-well plates. A transwell chamber was hydrated with 200 µL of serum-free DMEM for 1 h. Next, the cells were carefully counted using cell counting plates, and 2 × 10⁴ cells were seeded into each chamber. DMEM (600 µL) containing 30% fetal bovine serum was added to cover the bottom of the chambers. After 24 hours of culture, the cells were fixed with 4% paraformaldehyde for 30 min and then stained with 0.1% crystal violet for 30 min. To evaluate invasion ability, Matrigel® (Gibco, USA) was added to the upper chamber, and 10 × 10⁴ cells were seeded in the upper chamber and cultured for 48 hours under the same conditions. The Matrigel® was then carefully wiped from the chamber. Finally, an inverted phase-contrast microscope (Olympus, Japan; magnification, 100×) was used to count the cells that had passed through the membrane at the bottom of the chamber.
Colony Formation Assay
A colony formation assay was performed to measure the proliferation ability of glioma cells. Cells from the experimental and control groups were collected and counted using a cell counting plate under an inverted phase-contrast microscope. Then, 800-900 cells were added to each well of 6-well plates and incubated in a humidified incubator at 37℃ with 5% CO2 (Model no. 370, Thermo Scientific, USA). After one week of culture, cell colonies had gradually formed. The colonies were fixed with 4% paraformaldehyde (Beyotime, Shanghai, China) for 30 minutes and stained with 0.1% crystal violet (Beyotime, Shanghai, China) for 20 minutes. Photographs were taken under white light, and the number of colonies per well was counted.
Statistical Analysis
All data were statistically analyzed using GraphPad Prism 9.0 (La Jolla, USA). Results are expressed as the mean ± SD of at least three independent experiments. An unpaired t-test was used to compare differences between two groups, and one-way analysis of variance (ANOVA) was used to compare means among more than two groups. P < 0.05 was considered statistically significant.
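The mean ± SD summary and the unpaired t-test statistic can be sketched as follows. This is an illustration only: the analysis itself was run in GraphPad Prism, the replicate values below are invented, and the equal-variance (Student) form of the test is an assumption. The P-value would be obtained from the t distribution with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev, variance

def mean_sd(values):
    """Format replicate measurements as 'mean ± SD' (sample SD, n - 1)."""
    return f"{mean(values):.2f} ± {stdev(values):.2f}"

def unpaired_t(a, b):
    """Student's unpaired t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

# Invented triplicate measurements for a control and a knockdown group.
control = [2.0, 3.0, 4.0]
knockdown = [1.0, 2.0, 3.0]

summary = mean_sd(control)
t_stat, dof = unpaired_t(knockdown, control)
print(summary, round(t_stat, 4), dof)
```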