Data collection
Transcriptome data of cervical samples and complete clinical data sets were obtained from TCGA (https://portal.gdc.cancer.gov/). In total, RNA sequencing data with survival follow-up times for 306 CESC samples and sequencing data for 3 adjacent normal tissue samples were collected. The workflow is shown in Figure 1.
Screening for differentially expressed genes
The genome-wide transcriptional expression profile was obtained from the TCGA RNA-seq counts data. The limma package [26] in R version 3.6.3 (http://www.r-project.org) was used to screen for differentially expressed genes (DEGs). Genes with log2(fold change) ≥ 2.00 and a false discovery rate (FDR) < 0.01 were considered DEGs.
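The screening criteria above can be illustrated with a minimal Python sketch. It is not the limma pipeline itself: it assumes that per-gene log2 fold changes and raw p-values are already available, and that the FDR is obtained by Benjamini-Hochberg adjustment. The function names, the `use_absolute` switch, and the example genes and values are hypothetical; the text states the threshold on log2(fold change) as written, so the absolute-value variant is offered only as an option.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR), in input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone (each one is capped by the next larger rank).
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2_fc, p_values,
                lfc_cutoff=2.0, fdr_cutoff=0.01, use_absolute=False):
    """Keep genes with log2FC >= lfc_cutoff and FDR < fdr_cutoff.

    use_absolute=True applies the cutoff to |log2FC| instead (an assumption,
    not something stated in the text).
    """
    fdr = benjamini_hochberg(p_values)
    selected = []
    for gene, lfc, q in zip(genes, log2_fc, fdr):
        effect = abs(lfc) if use_absolute else lfc
        if effect >= lfc_cutoff and q < fdr_cutoff:
            selected.append(gene)
    return selected


# Hypothetical input values, for illustration only.
example_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
example_lfc = [3.0, 2.5, -3.0, 1.0]
example_p = [0.0001, 0.02, 0.0002, 0.00001]
example_degs = select_degs(example_genes, example_lfc, example_p)
```

With the default settings only up-regulated genes pass the filter; setting `use_absolute=True` would also retain strongly down-regulated genes.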
Co-expression analysis of DEGs in CESC
Weighted gene co-expression network analysis (WGCNA) is a widely used method in systems biology designed for multivariate data (e.g., gene expression, DNA methylation, metabolites) [27]. It can reveal correlations among genes and identify significantly related gene modules. In this study, we used the "WGCNA" package [28,29] to construct a co-expression network from the results of the preceding differential expression analysis and to identify candidate genes in CC.
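As a rough illustration of how a weighted co-expression network is built, the sketch below computes an unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^β, from a small expression table. It is a simplified Python stand-in for the first step of WGCNA, not the package itself. The soft-threshold power is not reported in the text, so `beta=6` is only a placeholder, and the gene names and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta.

    expression maps gene name -> list of expression values across samples.
    beta is a placeholder; the study's actual power is not given in the text.
    """
    genes = list(expression)
    adjacency = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            weight = abs(pearson(expression[g], expression[h])) ** beta
            adjacency[(g, h)] = weight
            adjacency[(h, g)] = weight  # the network is symmetric
    return adjacency


# Hypothetical expression values across four samples.
example_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [4.0, 3.0, 2.0, 1.0],
    "GENE_C": [1.0, 3.0, 2.0, 4.0],
}
example_adjacency = soft_threshold_adjacency(example_expression)
```

Raising the correlation to a power suppresses weak correlations while keeping strong ones close to 1, which is why modules of tightly co-expressed genes stand out in the resulting network.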
Identification of clinically significant modules
Two methods were used to identify modules related to clinical features. First, based on the linear regression between clinical traits and gene expression, the log10 transformation of the P value was defined as the gene significance (GS), and the module significance (MS) was calculated as the average GS of all genes in a module. The module with the largest absolute MS was generally considered the one most closely related to the clinical characteristics. Second, module eigengenes (MEs), each the first principal component of a gene module, were related to clinical traits to determine relevant modules. Modules that were highly relevant to a given clinical feature were selected for further analysis.
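The GS and MS definitions above can be sketched in a few lines of Python. This assumes that a regression p-value per gene is already available for the clinical trait of interest. The module names, gene names, and p-values are hypothetical and serve only to show the arithmetic.

```python
from math import log10
from statistics import mean


def gene_significance(p_value):
    """GS: log10 transformation of the regression p-value for one gene."""
    return log10(p_value)


def module_significance(modules):
    """MS: average GS over all genes in each module.

    modules maps module name -> {gene name: regression p-value}.
    """
    return {
        name: mean([gene_significance(p) for p in gene_p.values()])
        for name, gene_p in modules.items()
    }


def most_relevant_module(ms_by_module):
    """Module with the largest absolute MS."""
    return max(ms_by_module, key=lambda name: abs(ms_by_module[name]))


# Hypothetical modules and p-values.
example_modules = {
    "blue": {"GENE_A": 0.001, "GENE_B": 0.01},
    "brown": {"GENE_C": 0.1, "GENE_D": 0.5},
}
example_ms = module_significance(example_modules)
example_top = most_relevant_module(example_ms)
```

Because p-values lie between 0 and 1, GS values defined this way are negative, which is why the comparison between modules is made on the absolute value of MS.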
Protein-protein interaction (PPI) network establishment and hub gene identification
The STRING protein database, version 11.0 (http://string-db.org/), was used to construct a PPI network from the genes in the significant modules. The results were then exported from STRING and imported into Cytoscape [30], where the cytoHubba plug-in was used to identify hub genes.
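The idea of ranking nodes in a PPI network to find hub genes can be illustrated with a short Python sketch. The text does not state which cytoHubba ranking method was applied, so node degree is used here purely as an assumption; the function name and the edge list are hypothetical and do not reproduce cytoHubba or any STRING output.

```python
from collections import Counter


def rank_by_degree(edges, top_n=10):
    """Rank genes by the number of distinct interaction partners.

    edges is an iterable of (gene, gene) pairs; duplicate and reversed
    pairs are counted once, and self-interactions are ignored.
    """
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for edge in unique_edges:
        for gene in edge:
            degree[gene] += 1
    # Sort by descending degree, then by name so that ties are reproducible.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical interaction pairs, for illustration only.
example_edges = [
    ("SOX9", "GENE_A"),
    ("SOX9", "GENE_B"),
    ("SOX9", "GENE_C"),
    ("GENE_A", "GENE_B"),
    ("GENE_B", "GENE_A"),  # reversed duplicate, counted once
]
example_hubs = rank_by_degree(example_edges, top_n=2)
```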
Functional annotation and pathway analysis
To explore the functions of genes in the significant modules, we used the DAVID online analysis tool [31,32] to perform GO and KEGG enrichment analyses. The enrichment results were then downloaded and visualized in R.
Basic expression of hub genes in normal and cancer tissues
Further verification and survival analysis of the hub genes were performed using the Gene Expression Profiling Interactive Analysis (GEPIA) database (http://gepia.cancer-pku.cn/index.html; Z. Tang et al., 2017). We plotted the overall survival and disease-free survival curves of the hub genes in GEPIA, with p < 0.05 considered significant. We also used the Human Protein Atlas database (https://www.proteinatlas.org/) to verify the protein expression of the hub genes.
Tissue collection
From October 2020 to January 2021, 16 samples of cervical squamous cell carcinoma and adjacent normal tissue were collected from Zhuzhou Hospital affiliated with Xiangya Medical College. All specimens were assessed by immunohistochemistry and confirmed by two independent pathologists. Fresh tissue specimens from the corresponding patients were collected at the same time and stored in an ultra-low temperature freezer for Q-PCR. The study was approved by the Research and Clinical Trial Ethics Committee of Zhuzhou Hospital, and all eligible participants provided written informed consent. All clinical procedures were carried out in compliance with the ethical standards of the Declaration of Helsinki and relevant Chinese policies.
Validation of the hub gene
After identifying the hub gene through the co-expression network, we validated it by quantitative real-time polymerase chain reaction (Q-PCR). Total RNA was extracted using RNAiso Plus (9109; Takara, Dalian, China). cDNA was synthesized from 1 μg of total RNA using the Hifair® Ⅱ 1st Strand cDNA Synthesis Kit (11119ES60; Yeasen Biotechnology, Shanghai, China). The following program was used for mRNA amplification: 85°C for 5 minutes, followed by 40 cycles of 95°C for 10 s and 60°C for 30 s. ACTB was used as the internal (housekeeping) control to normalize SOX9 expression, and the comparative Ct (2^−ΔCt) method was used for the analysis. Table 1 shows the primers for SOX9 and ACTB.
Table 1. Primers of SOX9 and ACTB
Primer | Sequence (5'-3') | Product (bp)
SOX9-F2 | CCCGCTCACAGTACGACTAC |
SOX9-R2 | CTGAGCGGGGTTCATGTAGG | 113
ACTB-F2 | AGACCTGTACGCCAACACAG |
ACTB-R2 | CGCTCAGGAGGAGCAATGAT | 132
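The comparative Ct calculation used for the Q-PCR analysis can be written out as a short Python sketch: ΔCt is the Ct of the target gene (SOX9) minus the Ct of the reference gene (ACTB), and the relative expression is 2 raised to the power −ΔCt. The function name and the Ct values below are hypothetical and are not measurements from this study.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression by the comparative Ct (2^-dCt) method.

    ct_target:    Ct of the gene of interest (e.g. SOX9)
    ct_reference: Ct of the housekeeping gene (e.g. ACTB)
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for one tumor sample and its adjacent normal tissue.
tumor_level = relative_expression(ct_target=24.0, ct_reference=20.0)
normal_level = relative_expression(ct_target=26.0, ct_reference=20.0)
tumor_vs_normal = tumor_level / normal_level
```

A lower Ct for the target relative to the reference gives a larger 2^−ΔCt value, so a higher value corresponds to higher normalized expression.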