Pre-processing and preliminary screening of pseudogene methylation site data in glioma
The clinicopathological features of patients with glioma were downloaded from TCGA, while the methylation data were obtained from the 450K methylation array dataset (https://xena.ucsc.edu). To generate the pseudogene methylation-related prognostic signature, we combined the clinical data and methylation levels of patients from the public TCGA database. The β-value (0 to 1) represented the methylation level of each probe. Finally, 649 glioma samples were randomly divided into a training dataset (325 samples) and a validation dataset (324 samples).
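The random partition described above can be sketched as follows. This is a minimal illustration, not the authors' script: the sample identifiers, the function name `split_samples`, and the seed value are assumptions introduced only for the example.

```python
import random


def split_samples(sample_ids, n_train, seed=0):
    """Randomly partition sample IDs into a training and a validation set."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(sample_ids)    # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers standing in for the 649 glioma samples.
sample_ids = [f"sample_{i:03d}" for i in range(649)]
train_ids, valid_ids = split_samples(sample_ids, n_train=325, seed=2021)
print(len(train_ids), len(valid_ids))
```

Fixing the seed keeps the training and validation sets identical across reruns, which matters because the risk-score cutoff is later derived from the training set alone.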
Identification of methylated pseudogenes and construction of the six-methylated-pseudogene signature
Heatmaps were drawn to systematically compare the levels of pseudogene methylation in LGG and GBM. With a filtering threshold of P < 0.05, 22 prognostic pseudogenes were obtained. Subsequently, univariate Cox regression analysis was used to filter the methylated pseudogenes, with the P-value cutoff set at 0.05. Next, multivariate stepwise Cox regression analysis was carried out to build the prognostic risk model. Ultimately, six methylated pseudogenes were used as candidates to construct the prognostic model. The risk score was calculated using the following formula: Risk score = coef(AZGP1P1) × promoter methylation level of AZGP1P1 + coef(SUMO1P1) × promoter methylation level of SUMO1P1 + coef(INGX) × promoter methylation level of INGX + coef(KRT19P2) × promoter methylation level of KRT19P2 + coef(SBF1P1) × promoter methylation level of SBF1P1 + coef(CES1P1) × promoter methylation level of CES1P1. Glioma patients were ranked according to their risk scores and divided into high- and low-risk groups, using the median risk score of the training dataset as the cutoff.
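As a worked illustration of the risk-score formula and the median split, the sketch below uses placeholder coefficients and β-values. The numbers in `COEFS` are hypothetical and are not the fitted Cox coefficients; the handling of a score exactly equal to the cutoff (assigned to the low-risk group) is also an assumption.

```python
from statistics import median

# Hypothetical coefficients for illustration only; the real values come
# from the multivariate stepwise Cox regression.
COEFS = {
    "AZGP1P1": 0.5,
    "SUMO1P1": -0.8,
    "INGX": 0.3,
    "KRT19P2": 1.1,
    "SBF1P1": -0.4,
    "CES1P1": 0.7,
}


def risk_score(beta_values, coefs=COEFS):
    """Sum of coefficient x promoter methylation level over the six pseudogenes."""
    return sum(coefs[gene] * beta_values[gene] for gene in coefs)


def assign_groups(scores, cutoff):
    """Label each patient 'high' if the score exceeds the cutoff, else 'low'."""
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Made-up beta-values (0 to 1) for three training patients.
training = {
    "P1": {g: 0.2 for g in COEFS},
    "P2": {g: 0.5 for g in COEFS},
    "P3": {g: 0.9 for g in COEFS},
}
train_scores = {pid: risk_score(b) for pid, b in training.items()}
cutoff = median(train_scores.values())   # median of the training set only
print(assign_groups(train_scores, cutoff))
```

The same `cutoff` would then be applied unchanged to the validation dataset, so that group assignment there does not depend on the validation scores themselves.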
Evaluation of the Prognostic Model
Kaplan-Meier survival curves were used to compare the overall survival (OS) of the high- and low-risk groups. To determine whether this model could serve as an independent prognostic factor for patients with glioma, univariate and multivariate Cox regression analyses were performed on the candidate prognostic factors. Receiver operating characteristic (ROC) curves were then used to estimate the predictive power of the prognostic model.
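For readers unfamiliar with the Kaplan-Meier estimator, a minimal sketch is given below. It is written from the textbook product-limit definition and is not the software used in this study; the function name and the toy follow-up data are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) at each time with a death.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving   # deaths and censored patients leave the risk set
    return curve


# Toy data: four patients, the second one censored at t = 2.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Computing one such curve for the high-risk group and one for the low-risk group gives the two curves that are compared in the survival analysis.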
Consensus clustering based on pseudogene methylation status to analyze prognostic molecular subtypes
Consensus clustering was performed with the ConsensusClusterPlus package in R to identify glioma subgroups. The samples were clustered into subtypes based on the six-methylated-pseudogene signature, in order to further explore the impact of different subtypes on the prognosis of glioma patients. A resampling method was used to repeatedly draw subsets of the samples. Different cluster numbers (k) were specified, and the suitability of each k was evaluated. The number of subclasses was determined according to the following criteria: first, relatively high consistency within clusters and no significant rise in the area under the cumulative distribution function (CDF) curve; second, a distinct heatmap of the sample consensus matrix after rearrangement according to the final clustering results. The heatmap made it easy to assess the quality of the sample clustering results and the internal structure of the sample similarity matrix. In addition, survival analysis was performed: to examine patient survival status among the different glioma subgroups, Kaplan-Meier (K-M) curves were used to compare overall survival between high- and low-risk groups. Finally, a heatmap containing pseudogene methylation level, sample type, radiation, gender, age, grade and follow-up status (fustat) was generated to show the differences in pseudogene methylation levels.
Analysis by Bioinformatics Methods
A nomogram was constructed using the "rms" package in R software (version 4.1.0) to predict 1-year, 3-year, and 5-year overall survival. Calibration plots were then used to graphically evaluate the calibration of the nomogram, that is, the agreement between predicted and observed survival. Finally, the prognostic nomogram was validated in the validation dataset and the entire dataset. Differentially methylated sites were investigated by Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis and gene set enrichment analysis (GSEA), which we used to assess the cell functions related to the risk factors based on the prognostic characteristics of the six methylated pseudogenes.
Tumor Immune cell infiltration
To determine whether and how pseudogene methylation influenced the tumor immune microenvironment, related plots and a violin plot were drawn to show the contribution of immune cell infiltration in the two datasets [27]. The EpiDISH package infers the proportions of a priori known cell types in a sample containing a mixture of such cell types. In this study, it was used to infer the proportions of six immune cell types (natural killer (NK) cells, B cells, monocytes, CD4+ T cells, CD8+ T cells and neutrophils) in samples with different pseudogene methylation levels.
Cell Culture
H4, SF126, LN18, SNB19, T98G, U251 and SW1088 glioma cells and one normal human astrocyte cell line (HEB) were obtained from the Cell Bank of the Chinese Academy of Sciences (Shanghai, China) and the cell bank of Sun Yat-Sen University (Guangzhou, China), respectively. All cells were cultured in Dulbecco's modified Eagle medium (DMEM, 319-00518, Gibco, USA) containing 10% fetal bovine serum (FBS, 10270-106, Gibco, USA) and kept in an incubator at 37.0 °C with saturated humidity and 5% CO2. When confluence reached 80% or more, the cells were digested with trypsin (C0201, Beyotime, China) and passaged.
Silencing and Transfection of pseudogenes
siRNAs were used to silence SBF1P1 and SUMO1P1. SBF1P1 siRNA (U0804, RiboBio, Guangzhou, China) included three sequences (siSBF1P1-1 CCTCTCAGATACAGCTTCA, siSBF1P1-2 GCAAGAACAAGAACCTGTA, and siSBF1P1-3 CACATTCCAGCTGCTGAAA), while SUMO1P1 siRNA (U0804, RiboBio, Guangzhou, China) included another three sequences (siSUMO1P1-1 CCTCTCAAGAAACTCAAGA, siSUMO1P1-2 GAGGTCATTCAACAGTTTA, and siSUMO1P1-3 TAACGACTAACTCCAAAGA). First, cells in the logarithmic growth phase were selected and evenly seeded into 6-well plates, and transfection was performed when the cells reached 40 to 50% confluence. Second, 10 µL siRNA/con-RNA was diluted with 120 µL 1X riboFECT™ CP Buffer and 12 µL riboFECT™ CP Reagent, respectively. The reagents were thoroughly mixed and incubated at room temperature for 15 minutes, and the riboFECT™ CP mixture was then added to the 6-well plates. Finally, 24 h after transfection, the original medium in the 6-well plates was discarded and the cells were washed twice with PBS before trypsin digestion for subsequent use.
RT-qPCR
The RT-qPCR primers were as follows: GAPDH, forward 5'-AATGGGCAGCCGTTAGGAAA-3', reverse 5'-GCGCCCAATACGACCAAATC-3'; SBF1P1, forward 5'-ATTCCCCCAGCTGTTTTGCC-3', reverse 5'-TTTCCTGCTCCCAGAAGGTCAAG-3'; SUMO1P1, forward 5'-TGAGGCATAGCGGAAGTGAC-3', reverse 5'-CAGACATGGTGATGGGGCAT-3'. Total RNA was extracted from LN18 and T98G cells with TRIzol reagent (Takara). RNA was reverse transcribed into cDNA with the PrimeScript RT Master Mix (AK51812A, Takara, Japan) according to the kit instructions. RT-qPCR was then carried out with TB Green Premix Ex Taq II (Tli RNaseH Plus) (AK51812A, Takara, Japan), using cDNA as the template with 1.6 µL substrate, 3.2 µL primer and 5.2 µL TB Green. The cycling program was as follows: pre-denaturation at 95.0 °C for 30 s (1 cycle), followed by 40 cycles of 95.0 °C for 10 s and 60.0 °C for 31 s. The melting curve was acquired at 95.0 °C for 15 s, 60.0 °C for 1 minute and 95.0 °C for 15 s (1 cycle). Relative expression, normalized to the reference gene, was calculated using the 2^(-ΔΔCt) method.
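The 2^(-ΔΔCt) calculation can be illustrated with the short sketch below. The Ct values are made up for the example, and the use of GAPDH as the reference gene and of the con-RNA group as the calibrator are assumptions inferred from the primer list and transfection description rather than stated explicitly.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed within each sample
    ddCt = dCt(treated) - dCt(control)
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2 ** (-delta_delta)


# Made-up Ct values: the target gene in siRNA-treated vs control cells,
# with GAPDH assumed as the reference gene.
fold = relative_expression(
    ct_target_treated=26.0, ct_ref_treated=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
print(fold)  # 0.25
```

In this toy case the target Ct rises by two cycles after knockdown while the reference Ct is unchanged, so the expression falls to one quarter of the control level.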
Cell Viability Assay
The MTT kit (Biyuntian, Shanghai) was used to measure cell viability. T98G and U251 cells were transfected with si-SBF1P1 and si-SUMO1P1, respectively. After 24 hours, the cells were digested and counted, and 6000 cells were seeded in each well of a 96-well plate for culture. Then 10 µL of MTT reagent was added to each well, and the plate was kept in the dark for 4 hours. Next, the medium was carefully discarded and 140 µL DMSO was added to each well. The 96-well plate was then gently shaken in the dark for 10 min. Finally, the plate was read with a microplate reader.
Cell Migration and Invasion Assay
Twenty-four hours after transfection, T98G and U251 cells were collected from 6-well plates. A transwell chamber was hydrated with 200 µL of serum-free DMEM for 1 h. Next, after careful counting with a cell counting plate, 2 × 10^4 cells were seeded in each chamber, and 600 µL DMEM containing 30% fetal bovine serum was added to the bottom of the chambers. After 24 hours of culture, the cells were fixed with 4% paraformaldehyde for 30 min and then stained with 0.1% crystal violet for 30 min. To evaluate the invasion ability of the cells, Matrigel® (Gibco, USA) was added to the upper chamber, and 10 × 10^4 cells were seeded in the upper chamber and cultured for 48 hours under the same culture conditions. The Matrigel® was then carefully wiped from the chamber. Finally, an inverted phase contrast microscope (Olympus, Japan; magnification, 100×) was used to count the cells that had passed through the membrane at the bottom of the chamber.
Colony Formation Assay
A colony formation assay was performed to measure the proliferation ability of glioma cells. Cells from the experimental and control groups were first collected and then counted with a cell counting plate under an inverted phase contrast microscope. Next, 800-900 cells were added to each well of 6-well plates and incubated in a humidified incubator at 37 °C with 5% CO2 (Model no. 370, Thermo Scientific, USA). After one week of culture, cell colonies had gradually formed. The colonies were fixed with 4% paraformaldehyde (Shanghai, Biyuntian) for 30 minutes and stained with 0.1% crystal violet (Shanghai, Biyuntian) for 20 min. Photos were taken under white light, and the number of colonies per well was counted.
Statistical Analysis
All data were statistically analyzed using GraphPad Prism 9.0 (La Jolla, USA). Results were expressed as the mean ± SD of at least three independent experiments. An unpaired t-test was used to compare differences between two groups, and one-way analysis of variance (ANOVA) was used to compare means among multiple groups. P < 0.05 was considered statistically significant.
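To make the summary statistics concrete, the sketch below computes the mean ± SD and an unpaired t statistic for two groups of triplicate measurements. It assumes the equal-variance (Student) form of the test and uses made-up values; it is an illustration of the calculation, not a reproduction of the GraphPad Prism analysis, and the P-value lookup from the t distribution is left out.

```python
from math import sqrt
from statistics import mean, stdev, variance


def mean_sd(values):
    """Return (mean, sample standard deviation) of replicate measurements."""
    return mean(values), stdev(values)


def unpaired_t(a, b):
    """Student's unpaired t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Equal variances are assumed.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Made-up triplicate measurements for a control and a knockdown group.
control = [4.0, 5.0, 6.0]
knockdown = [1.0, 2.0, 3.0]
print(mean_sd(control), mean_sd(knockdown))
print(unpaired_t(knockdown, control))
```

The resulting t statistic and degrees of freedom would then be compared against the t distribution to obtain the P-value that is judged against the 0.05 threshold.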