Datasets
The gene expression and clinicopathological data of 310 CC patients and 3 adjacent noncancerous tissues (ANTs) were downloaded from TCGA (https://portal.gdc.cancer.gov/)[17, 18]. According to the TCGA publication guidelines (https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga), these mRNA sequencing data have no restrictions on publication, and no additional approval by an ethics committee was required to publish the use of the data.
With the Ensembl platform (http://www.ensembl.org/), we separated the mRNAs from all the TCGA genes. Genes that had missing values in over 50% of the samples were removed, leaving 12,084 genes in the study. Samples without data on survival status and survival time were also removed. The final dataset comprised 291 CC tissues, of which 167 were early-stage (FIGO 2009 IA2-IIA2), and 3 ANTs. For the early-stage samples, missing data on lymphovascular space invasion (LVSI) and corpus involvement were recorded as nonoccurrence (the median of the available data).
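The missing-value filter described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the dictionary layout, and the toy gene names are all hypothetical, and a gene with exactly 50% missing values is kept because the text removes only genes missing in "over 50%" of samples.

```python
def filter_genes_by_missingness(expr, max_missing_fraction=0.5):
    """Keep genes whose fraction of missing values does not exceed the threshold.

    expr maps a gene name to a list of per-sample values, where None marks a
    missing value (a hypothetical in-memory layout used only for illustration).
    """
    kept = {}
    for gene, values in expr.items():
        missing = sum(v is None for v in values)
        # Remove the gene only when MORE than the threshold fraction is missing.
        if missing / len(values) <= max_missing_fraction:
            kept[gene] = values
    return kept


# Toy data: three made-up genes measured in four samples.
toy = {
    "GENE_A": [1.2, 3.4, None, 2.2],    # 25% missing
    "GENE_B": [None, None, None, 0.5],  # 75% missing
    "GENE_C": [None, None, 1.0, 2.0],   # exactly 50% missing
}
filtered = filter_genes_by_missingness(toy)
print(sorted(filtered))
```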
Four CC datasets from Oncomine (version 4.5) (https://www.oncomine.org/)[19] were used to validate the results obtained from TCGA.
Kaplan-Meier (KM), univariate Cox, Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and protein-protein interaction (PPI) analyses
The prognostic value of each gene was assessed by KM and univariate Cox analyses in the early-stage cohort. A total of 416 genes with P < 0.05 in both the KM and univariate Cox analyses were defined as early-stage prognosis-related genes and were kept for further analyses. GO (biological process, cellular component and molecular function) and KEGG pathway analyses and PPI network construction were performed with Metascape (http://metascape.org/gp/index.html), using a false discovery rate (FDR) q-value < 0.05 as the threshold for statistical significance.
Differential expression analyses (DEA)
To identify genes that are more highly expressed in early-stage CC than in ANTs, we performed a DEA of prognosis-related genes between 167 early-stage CC patients and 3 ANTs with the R package “DESeq2”. The differentially expressed mRNAs with |log2FC| > 1.5 and P-adjusted < 0.05 were considered to be significant. Hierarchical clustering analysis was applied to categorize the data into two groups with similar expression patterns between early-stage CC and ANTs.
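As a rough illustration of the significance thresholds, the sketch below filters a table of results by fold change and adjusted P value. It assumes the fold-change criterion is applied to the absolute value of the log2 fold change, and it skips genes with no adjusted P value (DESeq2 can report NA there). The function name, the tuple layout and the gene labels are hypothetical.

```python
def select_significant(results, lfc_cutoff=1.5, padj_cutoff=0.05):
    """Return gene names passing both the fold-change and adjusted-P thresholds.

    results is a list of (gene, log2_fold_change, adjusted_p) tuples;
    adjusted_p may be None when no adjusted P value was reported.
    """
    selected = []
    for gene, log2fc, padj in results:
        if padj is None:
            continue  # cannot be called significant without an adjusted P value
        if abs(log2fc) > lfc_cutoff and padj < padj_cutoff:
            selected.append(gene)
    return selected


# Toy results with made-up values.
toy_results = [
    ("G1", 2.0, 0.01),    # passes both thresholds
    ("G2", -1.8, 0.03),   # passes both thresholds (lower in tumor)
    ("G3", 1.6, 0.20),    # fails the adjusted P threshold
    ("G4", 0.4, 0.001),   # fails the fold-change threshold
    ("G5", 3.0, None),    # no adjusted P value
]
significant = select_significant(toy_results)
print(significant)
```

To keep only genes that are more highly expressed in tumor tissue, the condition would additionally require a positive log2 fold change.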
Coexpression analyses
Coexpression analyses were conducted on the cBioPortal website (https://www.cbioportal.org/). Using Spearman’s correlation analysis, genes with an FDR q-value < 0.05 were regarded as coexpressed with DSG2. GO biological process analysis and oncogenic signature analysis were then conducted with Metascape, separately for the positively correlated genes (Spearman's correlation > 0) and the negatively correlated genes (Spearman's correlation < 0).
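For readers unfamiliar with the statistic, Spearman's correlation is the Pearson correlation of the ranks of the two variables. The sketch below is a plain-Python illustration with average ranks for ties. It is not the cBioPortal implementation, it does not compute P or q values, and all function names are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy expression vectors (made-up values) for two genes across five samples.
rho = spearman([1, 2, 3, 4, 5], [2, 4, 8, 16, 32])
print(rho)
```

A gene with rho > 0 would fall into the positively correlated set and one with rho < 0 into the negatively correlated set, provided its q-value also passes the threshold.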
Tissue sample collection
A total of 150 CC tissues, 6 ANTs and 30 normal cervical tissues (NCTs) collected from January 2006 to October 2012 were obtained from the archives of the Pathology Department and Gynecology Department of the First Affiliated Hospital of Sun Yat-sen University. All enrolled CC patients had stage IA2 to IIA2 disease and underwent radical hysterectomy and lymphadenectomy. Only patients with no preoperative radiotherapy or chemotherapy and with available clinical follow-up data were enrolled. The 30 NCTs were collected from patients who underwent hysterectomy for nonmalignant conditions. Written informed consent was obtained from each patient. All specimens were handled according to legal and ethical standards.
Cell lines and cell culture
In this study, SiHa, HeLa, C33A, CaSki, MS751 and ME180 cells were purchased from the American Type Culture Collection (ATCC, Rockville, MD, USA) and cultured according to ATCC guidelines in a humidified atmosphere with 5% CO2 at 37°C. The SiHa, HeLa and ME180 cell lines were cultured in DMEM (Thermo Fisher, USA). The CaSki cell line was cultured in RPMI 1640 medium (Thermo Fisher, USA). The C33A and MS751 cell lines were cultured in Eagle’s minimum essential medium (Thermo Fisher, USA). All media were supplemented with 10% fetal bovine serum (Life Technologies, USA) and 1% antibiotics (100 U/ml penicillin and 100 µg/ml streptomycin) (Life Technologies, USA).
Immunohistochemistry (IHC)
For IHC, 4-µm paraffin-embedded sections were baked at 60°C for 1 h, deparaffinized with xylene, rehydrated with a series of graded alcohols, and microwaved in EDTA antigen retrieval buffer. Then, the sections were blocked with 10% goat serum before incubation with a primary antibody at 4°C overnight, followed by HRP-conjugated secondary antibody incubation for 30 min at room temperature. DAB was added to detect antibody binding. Once brown color appeared, the sections were immersed in distilled water to stop the reaction. The sections were counterstained with hematoxylin, dehydrated in graded alcohols and mounted. The primary antibodies were rabbit anti-human DSG2 monoclonal antibody (ab150372, Abcam, Britain) and mouse anti-human D2-40 monoclonal antibody (MAB-0567, MXB, China). The DSG2 staining results were scored based on the following criteria: (i) percentage of positive tumor cells in the tumor tissue: 0 (0%), 1 (1-10%), 2 (11-50%), 3 (51-70%) and 4 (71-100%); and (ii) staining intensity: 0 (none), 1 (weak), 2 (moderate), and 3 (strong). The staining index was calculated as the staining intensity score × the proportion of positive tumor cells (range from 0 to 12). The staining score of 6 was defined as the cutoff. Thus, patients with different positive staining levels of DSG2 expression were divided into low- and high-staining groups.
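The scoring scheme above can be written out as a small calculation. The sketch below is illustrative only. The function names are hypothetical, non-integer percentages are assigned to the bin whose upper bound they do not exceed, and a staining index equal to the cutoff of 6 is placed in the high-staining group. That last choice is an assumption, because the text does not say on which side of the cutoff a score of exactly 6 falls.

```python
def proportion_score(percent_positive):
    """Map the percentage of positive tumor cells to a 0-4 score."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive == 0:
        return 0
    if percent_positive <= 10:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 70:
        return 3
    return 4


def staining_index(percent_positive, intensity):
    """Staining index = intensity score (0-3) x proportion score (0-4)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return intensity * proportion_score(percent_positive)


def staining_group(index, cutoff=6):
    """Assign low/high staining; treating index == cutoff as high is an assumption."""
    return "high" if index >= cutoff else "low"


# Example: 60% positive cells with moderate intensity.
idx = staining_index(60, 2)
print(idx, staining_group(idx))
```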
RNA extraction and quantitative real-time PCR (qRT-PCR)
Total RNA was extracted using TRIzol reagent (TAKARA, Japan) according to the manufacturer’s instructions, and the RNA concentration of each sample was measured with a NanoDrop ND-2000 spectrophotometer. RNA was reverse transcribed into cDNA using PrimeScript RT Master Mix (TAKARA, Japan). cDNA was amplified and quantified using a 7500 Fast Real-Time PCR system (Applied Biosystems, USA) and SYBR Premix Ex Taq (TAKARA, Japan). The PCR cycling conditions were 95°C for 2 min, followed by 39 cycles of 95°C for 20 sec, 58°C for 30 sec and 72°C for 30 sec. The DSG2 primer sequences were 5’-CTCAGGTGTGCAGCCTACTC-3’ (forward) and 5’-GTGGTGTTCCTAGCCGTCAT-3’ (reverse), while the GAPDH primer sequences were 5’-TGCACCACCAACTGCTTAGC-3’ (forward) and 5’-GGCATGGACTGTGGTCATGAG-3’ (reverse). qRT-PCR was repeated at least three times. mRNA expression was defined based on Ct, and relative expression levels were calculated using the comparative Ct (2^-ΔΔCt) method after normalization to the housekeeping gene GAPDH.
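The comparative Ct calculation can be illustrated with a short sketch. The Ct values below are made up, the function name is hypothetical, and the calculation assumes roughly 100% amplification efficiency, as the 2^-ΔΔCt method does.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the comparative Ct (2^-ddCt) method.

    dCt  = Ct(target gene) - Ct(reference gene), computed per condition
    ddCt = dCt(sample) - dCt(control)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    ddct = delta_sample - delta_control
    return 2 ** (-ddct)


# Made-up Ct values: target gene vs. GAPDH in a test sample and a control sample.
fold = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,    # dCt(sample)  = 6
    ct_target_control=26.0, ct_ref_control=18.0,  # dCt(control) = 8
)
print(fold)  # ddCt = -2, so expression is 4-fold higher than in the control
```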
Western blot assay
Total protein was extracted with cold RIPA lysis buffer and fractionated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and then transferred onto a 0.45-μm PVDF membrane (Millipore, America). The membranes were blocked with 5% skimmed milk and incubated with the primary antibody at 4°C overnight, followed by secondary antibody incubation for 1 h at room temperature. Bound antibodies were detected with Immobilon Western Chemiluminescent HRP Substrate (Millipore, America). Rabbit anti-human DSG2 monoclonal antibody (ab150372, Abcam, Britain) and rabbit anti-human GAPDH antibody (XS20180808002, Bioworld, China) were used in this study.
siRNA-mediated knockdown of DSG2
SiHa and HeLa cells were transfected with control siRNA (GenePharma, Shanghai, China) or DSG2-specific siRNA (GenePharma, China) using Lipofectamine RNAiMAX Reagent (Invitrogen, USA) and Opti-MEM medium (Life Technologies, USA) at the time of cell culture. Two DSG2 siRNAs were used. The siRNA393 sequences were 5’-CCAAUUGCCAAGAUACAUUTT-3’ (forward) and 5’-AAUGUAUCUUGGCAAUUGGTT-3’ (reverse). The siRNA613 sequences were 5’-CCUUAGAGCUACGCAUUAATT-3’ (forward) and 5’-UUAAUGCGUAGCUCUAAGGTT-3’ (reverse). The negative control sequences (siRNA-NC) were 5’-UUCUCCGAACGUGUCACGUTT-3’ (forward) and 5’-ACGUGACACGUUCGGAGAATT-3’ (reverse).
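As a quick consistency check on the listed sequences, the sketch below verifies that each strand pair forms a duplex, that is, that the second strand is the reverse complement of the first. It assumes the trailing "TT" on each strand is a 3' overhang that does not take part in base pairing, which the text does not state explicitly. The function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence (A/U/G/C only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def strip_overhang(strand, overhang="TT"):
    """Drop the assumed 3' overhang so only the base-paired core remains."""
    return strand[:-len(overhang)] if strand.endswith(overhang) else strand


def is_duplex(first, second):
    """True when the cores of the two strands are reverse complements."""
    return reverse_complement_rna(strip_overhang(first)) == strip_overhang(second)


# Strand pairs exactly as listed in the text (5' to 3').
pairs = {
    "siRNA393": ("CCAAUUGCCAAGAUACAUUTT", "AAUGUAUCUUGGCAAUUGGTT"),
    "siRNA613": ("CCUUAGAGCUACGCAUUAATT", "UUAAUGCGUAGCUCUAAGGTT"),
    "siRNA-NC": ("UUCUCCGAACGUGUCACGUTT", "ACGUGACACGUUCGGAGAATT"),
}
checks = {name: is_duplex(a, b) for name, (a, b) in pairs.items()}
print(checks)
```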
Cell Counting Kit-8 (CCK-8) assay
For the CCK-8 assay, 5 × 10^3 SiHa and HeLa cells were seeded into each well of 96-well plates. Timing started once the cells had adhered to the well surface, and the cells were then transfected with siRNA. Cell viability was measured at specific time points with CCK-8 (DOJINDO, Japan). The absorbance at 450 nm was read with a microplate reader (Tecan Sunrise, Tecan Group Ltd.).
Migration assay
The stable cell lines SiHa siRNA-NC, SiHa siRNA393, SiHa siRNA613, HeLa siRNA-NC, HeLa siRNA393 and HeLa siRNA613 were counted and then 10 × 10^4 stably infected SiHa cells and 20 × 10^4 stably infected HeLa cells in 250 µl of serum-free medium were separately plated into the upper chamber of 8-µm transwell inserts (BD Biosciences, Franklin Lakes, NJ), while 500 µl of medium containing 10% bovine serum albumin was added to the lower chamber. After 24 h of incubation at 37°C, SiHa siRNA cells in the upper chamber were removed carefully. After 48 h of incubation at 37°C, HeLa siRNA-NC and HeLa siRNA cells in the upper chamber were removed. Migrated cells on the lower membrane surface were fixed in 4% paraformaldehyde (Solarbio, Beijing, China) for 10 min and then stained with 0.1% crystal violet (KeyGEN biotech, Nanjing, China) for 10 min. The number of cells was counted in 5 randomly selected visual fields (100×) per well under an inverted microscope DMI4000B (Leica, Wetzlar, Germany).
Statistical analyses
Statistical analyses were performed using SPSS 22.0 statistical software (Chicago, IL, USA) and R version 3.6.0. The differences between two groups were analyzed by Student’s t test. The differences among more than two groups were analyzed by ANOVA. The chi-square test and Fisher’s exact test were used to analyze the relationship between DSG2 expression and the clinicopathological characteristics. Survival data were evaluated using univariate and multivariate Cox regression analyses. Survival curves were plotted by the KM method and compared using the log-rank test. In all cases, P < 0.05 was considered statistically significant.
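To make the KM method concrete, the sketch below computes the product-limit survival estimate in plain Python. It is an illustration only, not the SPSS or R routine used in the study. The function name and the toy follow-up data are hypothetical, and no log-rank test or confidence interval is computed.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0  # everyone with follow-up time t leaves the risk set
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Toy cohort of five patients (made-up follow-up times in months).
curve = kaplan_meier(times=[5, 8, 8, 12, 15], events=[1, 0, 1, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

In this toy cohort the estimate drops at months 5, 8 and 12, while the patients censored at months 8 and 15 reduce the number at risk without producing a step.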