Microarray data. The Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) is a public functional genomics data repository that supports MIAME-compliant data submission. It archives and freely distributes microarray, next-generation sequencing, and other forms of high-throughput functional genomics data submitted by the research community. GSE27262 was based on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array) and included 25 matched pairs of LUAD and normal samples. GSE74706 was based on the GPL13497 platform (Agilent-026652 Whole Human Genome Microarray 4x44K v2) and included 10 LUAD samples and 18 normal samples. GSE75037 was based on the GPL6884 platform (Illumina HumanWG-6 v3.0 expression beadchip) and included 83 matched pairs of LUAD and normal samples. All three datasets are publicly available online.
Identification of DEGs. GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/) is an online tool for comparing two or more groups of samples to identify DEGs. The cutoff criteria were an adjusted P-value <0.05 and |log2FC| >1.5; probe sets without exact gene symbols were excluded. The DEG lists from the three datasets were then intersected, and the shared genes were used for the subsequent analysis. The numbers of up- and downregulated genes in each dataset and the counts of shared genes were displayed using a Venn diagram.
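The filtering and intersection steps can be sketched as follows. This is only an illustration under stated assumptions: the row layout (gene symbol, adjusted P-value, log2FC), the function names, and the toy gene names are hypothetical, and intersecting up- and downregulated genes separately is an assumption about how the shared genes were counted.

```python
# Hypothetical GEO2R-style rows: (gene_symbol, adjusted_p_value, log2_fold_change).
# All gene names and values below are invented for illustration only.

def filter_degs(rows, p_cutoff=0.05, lfc_cutoff=1.5):
    """Return (up, down) sets of gene symbols passing the cutoff criteria."""
    up, down = set(), set()
    for symbol, adj_p, log2fc in rows:
        if not symbol:
            continue  # drop probe sets without a gene symbol
        if adj_p >= p_cutoff or abs(log2fc) <= lfc_cutoff:
            continue  # fails adjusted P < 0.05 or |log2FC| > 1.5
        (up if log2fc > 0 else down).add(symbol)
    return up, down

def shared_degs(datasets):
    """Intersect up- and downregulated genes across all datasets (assumed per direction)."""
    per_dataset = [filter_degs(rows) for rows in datasets]
    shared_up = set.intersection(*(up for up, _ in per_dataset))
    shared_down = set.intersection(*(down for _, down in per_dataset))
    return shared_up, shared_down

dataset_1 = [("GENE_A", 0.001, 2.3), ("GENE_B", 0.02, -1.8),
             ("GENE_C", 0.2, 3.0), ("", 0.001, 2.5)]
dataset_2 = [("GENE_A", 0.01, 1.9), ("GENE_B", 0.03, -2.1), ("GENE_D", 0.001, 2.2)]
dataset_3 = [("GENE_A", 0.004, 1.6), ("GENE_B", 0.001, -1.2), ("GENE_D", 0.01, 1.7)]

shared_up, shared_down = shared_degs([dataset_1, dataset_2, dataset_3])
print(sorted(shared_up), sorted(shared_down))
```

In this toy input, GENE_B drops out of the shared downregulated set because its fold change in the third dataset does not exceed the cutoff.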
Protein-Protein Interaction (PPI) Network Construction and Hub Genes Verification. To detect potential associations among the DEGs, we constructed a PPI network using the Search Tool for the Retrieval of Interacting Genes (STRING) database (https://string-db.org/); a minimum required interaction score of 0.4 was considered significant. Subsequently, we used Cytoscape version 3.8.2 to visualize the molecular interaction network of the DEGs. In addition, we used the Cytoscape plug-in Molecular Complex Detection (MCODE) to search for candidate modules with a degree cutoff of 2, a node score cutoff of 0.2, a k-core of 2, and a maximum depth of 100. Modules with a score greater than 5 were selected and analyzed using another Cytoscape plug-in, cytoHubba, to identify hub genes with scores greater than 10.
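As a rough illustration of the thresholding logic, the sketch below keeps edges at or above the interaction score cutoff and flags nodes by degree. It does not reproduce MCODE or cytoHubba: node degree is used here only as an assumed stand-in for the hub score, and the edge list, node names, and function names are hypothetical.

```python
from collections import defaultdict

def build_network(edges, min_score=0.4):
    """Build an undirected adjacency map from (node_a, node_b, score) edges."""
    adjacency = defaultdict(set)
    for a, b, score in edges:
        if a != b and score >= min_score:  # keep edges meeting the score cutoff
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency

def hub_nodes(adjacency, min_score=10):
    """Return nodes whose degree (assumed hub score) is greater than min_score."""
    return {node for node, neighbors in adjacency.items() if len(neighbors) > min_score}

# Invented toy network: one node linked to 11 others, plus one low-confidence edge.
edges = [("HUB", f"N{i}", 0.9) for i in range(11)]
edges += [("HUB", "WEAK", 0.2), ("N0", "N1", 0.5)]

network = build_network(edges)
hubs = hub_nodes(network)
print(hubs)
```

The low-confidence edge to WEAK is discarded before any scoring, which mirrors how the interaction score cutoff shapes the network that later tools see.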
GO functional enrichment and KEGG pathway analysis. GO functional enrichment and KEGG pathway analyses were conducted using the R package “clusterProfiler” version 4.0.5. GO terms or KEGG pathways with a P-value <0.05 were considered statistically significant and were visualized with “ggplot2” and “enrichplot”.
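Over-representation analyses of this kind are typically based on a hypergeometric test. The sketch below computes that P-value for a single term using only the standard library; it is an illustration of the statistic, not the clusterProfiler implementation, and the function name, parameter names, and counts are hypothetical.

```python
from math import comb

def enrichment_pvalue(overlap, n_query, n_annotated, n_background):
    """P(X >= overlap) for X ~ Hypergeometric.

    n_background: genes in the background set
    n_annotated:  background genes annotated to the term
    n_query:      genes in the query list (e.g. the shared DEGs)
    overlap:      query genes annotated to the term
    """
    upper = min(n_query, n_annotated)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_query - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total

# Invented toy counts: 3 of 4 query genes hit a term with 5 of 20 background genes.
p_value = enrichment_pvalue(overlap=3, n_query=4, n_annotated=5, n_background=20)
print(p_value, p_value < 0.05)
```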
Survival Analysis. The Gene Expression Profiling Interactive Analysis (GEPIA) database (http://gepia.cancer-pku.cn/) is a free online visualization platform that provides key interactive and customizable functions, including survival analysis. The top 12 hub genes were entered into GEPIA to examine the relationship between these genes and the survival time (in months) of LUAD patients. The log-rank P-value and the hazard ratio (HR) with 95% confidence interval are presented on each plot. P <0.05 was considered statistically significant.
Co-expression analysis among CCNA2, HMMR and AURKA. The cBioPortal for Cancer Genomics (https://www.cbioportal.org/) is an open-source platform for the visualization, exploration and analysis of cancer genomics data sets. We used the co-expression section of cBioPortal to verify and visualize the correlations among CCNA2, HMMR and AURKA.
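For readers unfamiliar with co-expression statistics, the sketch below computes Pearson's correlation coefficient between two expression vectors. It is an assumption that this is a suitable stand-in: the text does not state which coefficient was read from cBioPortal, and the function name and expression values are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Invented expression values for two genes across four samples.
gene_1 = [1.0, 2.0, 3.0, 4.0]
gene_2 = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(gene_1, gene_2)
print(r)
```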
Protein Expression of CCNA2, HMMR and AURKA. The University of ALabama at Birmingham CANcer data analysis Portal (UALCAN) (https://ualcan.path.uab.edu/) is a comprehensive, user-friendly, free online platform for analyzing cancer data. We used UALCAN to perform and visualize the protein expression analysis of CCNA2, HMMR and AURKA, using data that UALCAN obtains from the Clinical Proteomic Tumor Analysis Consortium (CPTAC). CPTAC is a centralized repository containing clinical data and proteomic sequence datasets for different types of cancer. In addition, immunohistochemistry (IHC) staining of CCNA2, HMMR and AURKA in LUAD and normal tissues was compared using the Human Protein Atlas database v.21 (HPA, https://www.proteinatlas.org/). HPA is dedicated to providing information on the tissue and cellular distribution of human proteins. IHC-based protein expression patterns are commonly used to assess the relative location and abundance of proteins [9].
Analysis of Tumor-Infiltrating Immune Cells. To examine the correlation between the expression of CCNA2 and HMMR and the abundance of six types of tumor-infiltrating immune cells (B cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils and dendritic cells), we used the Tumor Immune Estimation Resource (TIMER, https://cistrome.shinyapps.io/timer/) database to analyze and visualize the results. TIMER is a public resource for the systematic analysis of immune infiltrates across diverse cancers.
Heatmap analysis of EMT-related genes. A large number of samples were downloaded from the TCGA database and combined with the gene list from the Epithelial-Mesenchymal Transition Gene Database (dbEMT2.0, http://dbemt.bioinfo-minzhao.org/) to analyze the relationship between CCNA2 and EMT-related genes. The heatmap analysis was conducted and visualized using the R package “pheatmap”.
Immunohistochemical staining and scoring method. Paraffin-embedded specimens of surgically resected lung adenocarcinoma tissues were prepared by the Department of Pathology of Ruijin Hospital, Shanghai, China, and immunostaining was carried out by Boerfu Biotechnology company (Wuhan, China). Brown staining was considered positive, and protein expression levels were scored independently by two pathologists. First, a proportion score for positive staining was assigned on a four-level scale for both tumor cells and peritumoral cells (1, ≤25% positive cells; 2, 25–50% positive cells; 3, 50–75% positive cells; and 4, >75% positive cells). Next, an intensity score was assigned on a four-level scale according to the intensity of staining (1, very weak; 2, weak; 3, intermediate; and 4, strong). The proportion and intensity scores were then added to obtain a total score ranging from 2 to 8. A total score ≥5 was considered high expression and a score ≤4 was considered low expression. The scoring criteria were based on the assessments reported by Meng et al. and Kawai et al. [10,11].
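The scoring rule described above can be written out as a small function. This is a minimal sketch: the function names are hypothetical, and assigning values that fall exactly on 25%, 50% or 75% to the lower category is an assumption, since the stated ranges share those boundaries.

```python
INTENSITY_SCORES = {"very weak": 1, "weak": 2, "intermediate": 3, "strong": 4}

def proportion_score(percent_positive):
    """Map the percentage of positive cells to a proportion score of 1 to 4."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:   # boundary handling is an assumption
        return 2
    if percent_positive <= 75:
        return 3
    return 4

def ihc_total_score(percent_positive, intensity):
    """Return (total score, expression level) for one specimen."""
    total = proportion_score(percent_positive) + INTENSITY_SCORES[intensity]
    level = "high" if total >= 5 else "low"
    return total, level

print(ihc_total_score(80, "strong"))
print(ihc_total_score(10, "very weak"))
```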
Ethics approval and consent to participate. Informed consent was obtained from all patients before surgery, and the procedures for tissue collection were approved by the Ethics Committee of Ruijin Hospital, Shanghai JiaoTong University School of Medicine, Shanghai, China. All procedures complied with the guidelines and ethical principles outlined in the Declaration of Helsinki.
Cell culture and transfection. The human NSCLC cell line PC9 was kindly provided by Shanghai Cancer Institute and cultured in RPMI 1640 (Shanghai BasalMedia Technologies Co, Shanghai, China) supplemented with 10% fetal bovine serum (FBS) (Lonsera, Uruguay), 100 U/ml penicillin and 100 µg/ml streptomycin at 37°C in a humidified incubator with 5% CO2. For shRNA-mediated knockdown, specific short hairpin RNAs (shRNAs) targeting CCNA2 (shRNA-CCNA2-1, target sequence: 5’-GCTGACCCATACCTCAAGTAT-3’; shRNA-CCNA2-2, target sequence: 5’-CCTTAGGGAAATGGAGGTTAA-3’; shRNA-CCNA2-3, target sequence: 5’-CCTCTTGATTATCCAATGGAT-3’; and shRNA-CCNA2-4, target sequence: 5’-GCCTGAATCATTAATACGAAA-3’) and the corresponding negative control (pLKO.1-copGFP-PURO-NC) were purchased from Tsingke Biotechnology Co., Shanghai. 293T cells were seeded into 6-well plates and transfected with 3 µg of plasmid mixture (shRNA plasmid + psPAX2 + pMD2.G) and 9 µl of PEI transfection reagent. After 5 h of transfection, the medium containing the transfection reagent was replaced with fresh DMEM (Shanghai BasalMedia Technologies Co, Shanghai, China) supplemented with 10% FBS and antibiotics. The cells were then cultured for 2 days before PC9 cells were transfected.
Cell proliferation and Colony formation assays. Stably transfected PC9 cells were seeded into 96-well plates at a density of 3000 cells per well to determine cell proliferation. After incubation for 0, 1, 2 and 3 days, Cell Counting Kit-8 (CCK-8) (Bimake, Texas, USA, cat no. B34302) solution was added to each well according to the manufacturer's protocol. The optical density of each well was measured at 450 nm. For colony formation assays, PC9 and shCCNA2-PC9 cells were seeded in 6-well plates at a density of 1000 cells per well. One week later, the cells were stained with crystal violet (0.2%) for 30 min and the colonies were counted.
qPCR Analysis. Total RNA was extracted from PC9 and shCCNA2-PC9 cells using TRIzol reagent (Invitrogen, USA) following the manufacturer’s instructions and was then quantified using a NanoDrop (Thermo Fisher Scientific, Massachusetts, USA). cDNA synthesis was performed using the Hifair® Ⅱ 1st Strand cDNA Synthesis Kit (Yeasen, Shanghai, China), and the resulting complementary DNA was used for real-time qPCR (LightCycler® 480 II) with SYBR Green (Yeasen, Shanghai, China) and primers specific for CCNA2 (forward: 5’-GGATGGTAGTTTTGAGTCACCAC-3’, reverse: 5’-CACGAGGATAGCTCTCATACTGT-3’). Primers were synthesized by Tsingke Biotechnology Co., Shanghai.
Western Blot Analysis. Whole-cell lysates were prepared using SDS buffer containing protease and phosphatase inhibitors. After the protein concentration was determined using a Protein Quantification Kit (BCA Assay), equal amounts of protein were separated by SDS-PAGE and transferred onto PVDF membranes. The primary antibodies used were as follows: Cyclin A2 (Proteintech, Chicago, USA, cat no. 18202-1-AP), E-cadherin (Cell Signaling Technology, Danvers, USA, cat no. 3195), Vimentin (Cell Signaling Technology, Danvers, USA, cat no. 5741), and HSP90 (ABclonal, Wuhan, China, cat no. A5006).
Migration and Wound Healing assays. Cell migration was measured using Transwell chambers (Corning Costar, Corning, USA, cat no. 3422). Medium with 20% FBS was added to the lower chamber as a chemoattractant. About 8 × 10⁵ transfected cells were seeded in 200 µl of serum-free medium in the upper chamber and incubated at 37°C with 5% CO2. After 24 h, the cells were fixed with paraformaldehyde fixative solution (Servicebio, Wuhan, China, cat no: G1101) and stained with crystal violet staining solution (Servicebio, Wuhan, China, cat no: G1014). For wound healing assays, cells were seeded in 6-well plates and a sterile 200 µl micropipette tip was used to make a scratch wound. Wound healing was observed after 24 h of culture under standard conditions and photographed using phase-contrast microscopy.