2.1 Data collection
GEO (http://www.ncbi.nlm.nih.gov/geo)[7] is a gene expression database created by NCBI, which contains high-throughput gene expression data submitted by research institutes worldwide. Three microarray datasets (GSE8401, GSE46517, and GSE7956) were downloaded from it (Affymetrix GPL96 platform, Affymetrix [HG-U133A] Affymetrix Human Genome U133A Array). The GSE8401 dataset includes 31 primary melanoma samples and 52 metastatic melanoma samples. GSE46517 consists of 31 primary melanoma samples and 73 metastatic melanoma samples. GSE7956 contains 10 poorly metastatic melanoma samples and 29 highly metastatic melanoma samples.
2.2 Identification of DEGs
The DEGs between metastatic melanoma and primary melanoma samples were screened via GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r). GEO2R is an interactive web tool that allows users to compare two or more datasets in a GEO series in order to identify DEGs across experimental conditions[8]. The adjusted P-value (adj. P-value) using Benjamini and Hochberg false discovery rate were applied to discover statistically significant genes while false‐positive results corrected. Probe sets without corresponding gene symbols or genes with more than one probe set were removed or averaged, respectively. LogFC (fold change) > 0.5 and adj. P-value < 0.05 was considered statistically significant.
2.3 Enrichment analyses of DEGs
DAVID 6.8 (https://david.ncifcrf.gov/)[9] was used for enrichment analyses to investigate DEGs at the molecular and functional level. DAVID is a gene functional classification tool that integrates a set of functional annotation tools for investigators to analyze biological functions behind massive genes. In addition, the functional enrichment analysis tool (Funrich)[10] was used to analyze the biological pathways of DEGs. Funrich software is an open-access software that facilitates the analysis of proteomics data, providing tools for functional enrichment and interaction network analysis of genes and proteins. KOBAS 3.0 (http://kobas.cbi.pku.edu.cn)[11] was used to evaluate the pathway enrichment analyses of DEGs. KOBAS 3.0 can annotates an input set of genes with putative pathways based on mapping to genes with known annotations from 5 pathway databases (KEGG PATHWAY, PID, BioCyc, Reactome and Panther). P-value < 0.05 was considered significant.
2.4 PPI network construction and hub genes selection and analyses
Search Tool for the Retrieval of Interacting Genes (STRING; http://string-db.org) (version 10.0)[12], which is an online database of known and predicted protein interactions, was applied to predict the PPI network of DEGs. Only interactions with a combined score > 0.4 were considered statistically significant. Cytoscape (version 3.6.1, http://www.cytoscape.org) was used to visualize molecular interaction networks[13], with its CytoNCA plug-in analyzing the topological properties of nodes in the PPI network and setting parameters to no weight[14]. By ranking the scores of each node, we obtained important nodes involved in protein interactions within the network. Considering most networks were scale-free, the hub genes were chosen with degrees ≥ 10. The enrichment analysis of biological processes was through Metascape (https://metascape.org)[15]. Metascape is a web-based portal designed to provide a comprehensive gene list annotation and analysis resource for experimental biologists. The pathway enrichment analyses for the genes were conducted by KOBAS 3.0. Besides, a network of genes and their co-expression genes was analyzed via GeneMANIA (http://www.genemania.org/)[16], which is a flexible, user-friendly web interface for generating hypotheses about gene function, analyzing gene lists and prioritizing genes for functional assays. Finally, Drug-Gene Interaction database (DGIdb) 2.0 (http://www.dgidb.org/), which mines existing resources and generates assumptions about how genes are therapeutically targeted or prioritized for drug development[17], was used in the study to predict drugs based on the module genes. The parameters were set as: preset filters: FDA approved; antineoplastic; all the default. All the drug-gene relationship pairs related to the module genes were predicted, and the network map was formed by Cytoscape.
2.5 Validation of hub genes expression in TCGA databases
The mRNA expression of identified hub genes was verified using TCGA data which contains 102 primary melanomas and 369 metastatic melanomas (https://tcga-data.nci.nih.gov/tcga/). The comparison between the two sets of data uses the Mean T test. P-value < 0.05 was considered significant.
2.6 Patients and Ethical approval
From March 2016 to June 2020, a total of 72 patients from the Department of Plastic Surgery of the Second Affiliated Hospital of Soochow University obtained 36 primary cutaneous melanomas and 36 metastatic melanomas. No radiotherapy or chemotherapy was received before surgery. Intraoperative specimens of primary skin melanoma and metastatic melanoma were collected and fixed with 4% paraformaldehyde. Clinical data can be obtained from hospital records. This study was supported by the Independent Ethics Committee (IEC) of the Second Affiliated Hospital of Soochow University, and all patients were fully aware of the preservation and upcoming use of their removed specimens for further research.
2.7 Immunohistochemistry (IHC)
Sections were deparaffinized in xylene for 15 min and then rehydrated in graded alcohols and water. Endogenous peroxidase activity was blocked by treatment with 3% H2O2-methanol for 30 min at room temperature. After blocking nonspecific binding with serum for 40 min at 37℃, sections were incubated with rabbit anti-CXCL8 polyclonal antibody (1:500; 27095-1-AP), anti-THBS1 a ntibody (1:500; 18304-1-AP) and anti-KIT antibody (1:500; 18696-1-AP) in a humid chamber at 4℃ overnight. After three washes with PBS, sections were incubated with biotinylated secondary antibodies (SA00004-1, Proteintech) for 30 min, washed three times in PBS, and incubated with streptavidin-conjugated peroxidase (ab7403, Abcam) for 30 min. Slides were rinsed in PBS, exposed to diaminobenzidine (SK4100, Vector Laboratories) and counterstained with Mayer’s hematoxylin (ab128990, Abcam; negative control = omission of the primary antibody). The percentages of cells that express CXCL8, THBS1 or KIT were assessed by quantitative histomorphometry (Olympus X71-F22PH; Olympus Corporation, Tokyo, Japan). Two experienced pathologists independently assessed the positive or negative staining of a protein in one FFPE slide and were supervised by a clinician. Depending on the level of staining intensity (no staining, weak staining, medium staining and strong staining), the score ranges from 0 to 3. The staining range of immunoreactive tumor cell coverage (0%, 1-25%, 26-50%, 51-75%, 76-100%) is 0-4. According to the multiples of the staining intensity and degree score, the IHC score is comprehensively scored from 0 to 12. Negative staining is grade 0 to 4, and positive staining is grade 5 to 12.