GEO data identified 24430 abnormally expressed genes in osteosarcoma comparing to normal bone samples
Based on the transcriptome data of the four GEO profiles, 12168, 17090, 9173 and 3062 genes were identified to be abnormally expressed in GSE12865, GSE16088, GSE28424 and GSE42352 respectively. Besides the genes that were shared in multiple profiles, a total of 24430 genes were identified to be abnormally expressed in osteosarcoma comparing to normal bone samples (Table S2).
Considering the heterogeneity nature of osteosarcoma between individuals from different research groups, we analyzed the four profiles genes separately. The genes of all four profiles were divided into four groups according to their expression level, namely the genes with expression discrepancy level <2 fold, 2~4 fold, 4~8 fold and >8 fold respectively.
And the results showed that in GSE12865, the expression change of 6050 genes were <2-fold, 4864 genes were 2~4 fold, 962 genes were 4~8 fold and 292 genes were >8 fold in osteosarcoma comparing to normal bone tissues (Figure 1A). In GSE16088, the gene number was 9177, 5923, 1464 and 526 in each group respectively (Figure 1B). In GSE28424, the gene number was 7543, 1194, 288 and 148 in each group (Figure 1C). Meanwhile, in GSE42352, the gene number was 2286, 528, 166 and 82 in <2 fold, 2~4fold, 4~8 fold and >8 fold group respectively (Figure 1D). A total of 24430 AEGS were identified and the expression of 815 genes was >8 fold (Figure 1E, 1F, Table S2).
GO/KEGG interpretation of the abnormally expressed genes revealed a specific location phenomenon
To preliminary interpret the biological functions of the abnormally expressed genes and explore the potential difference among gene groups with diverse expression, GO and KEGG analysis were performed. And the result revealed an inspiring phenomenon that more different the genes expression are in osteosarcoma comparing to normal samples, their cellular location tend to be more outwards from the cell nuclear. To be more specific, the function analysis revealed that the cellular location of <2 fold genes were mainly in nuclear, and 2~4 fold as well as 4~8 fold genes were mostly locating in the cytoplasm, meanwhile >8 fold genes were tend to locate on the cell membrane and in extracellular region. Moreover, despite the genetic heterogeneity among different individuals, the phenomenon was shown in all four GEO patients profiles (Figure 2A-2H). Considering the commonsense that a characteristic feature of osteosarcoma is the formation of specific osteoid matrix, we mainly focused on the > 8 fold genes which are not only more feasibly to be further tested using IHC experiment but also mainly locating in extracellular region for next step analysis.
PPI network and function module analysis to screen the extracellular matrix construction related gene cluster
Based on the analysis of four GEO expression profiles, the expression of 292, 526, 148 and 82 genes were identified to be >8 fold in GSE12865, GSE16088, GSE28424 and GSE42352 respectively, and 67 of the genes were identified in at least two of four profiles indicating they are more credible abnormally expressed gene in osteosarcoma comparing to normal control samples. To further understand the correlation of the multiple genes and screen the potential key gene, PPI network of the 67 >8 fold genes was constructed to firstly observe the interaction among individual genes and secondly to perform function module analysis thus exploring promising gene clusters that potentially relate with osteosarcoma extracellular matrix construction (Figure 3A).
Eventually, three top gene clusters were identified based on the network, and the first module contains 29 genes which were mainly related with G protein coupled receptors and cellular communication (Figure 3B). The second module contains 22 genes covering mostly extracellular locating and matrix construction related genes (Figure 3C). And the third module includes 14 genes mainly associating with ribosome structure and ubiquitinated protease regulation (Figure 3D). Given the specific potential connection between extracellular locating genes with the osteosarcoma specific osteoid matrix construction, the 22 genes in second module were highly payed attention for further survival analysis.
Survival analysis identified STC2 as a promising risk indicator in osteosarcoma development
To further scale down the candidate genes and identify the promising responsible key gene during osteosarcoma development, KM survival analysis was performed using TARGET osteosarcoma transcriptome data to analyze the overall survival of the 22 extracellular matrix regulation related genes, and the result indicated that 3/22 genes were associated with both osteosarcoma overall survival and disease-free survival, namely FGF2 (Figure 4A, 4D), PRKCSH (Figure 4B, 4E) and STC2 (Figure 4C, 4F).
GeneCards was then used to interpret the basic information of the three genes and the result revealed that STC2 is short for stanniocalcin 2 and has been reported to play a role in the regulation of cellular calcium and phosphate homeostasis, its over expression might result in reduced bone and skeletal muscle growth in mice models. And FGF2 is a member of the fibroblast growth factor (FGF) family and has been implicated in diverse biological processes, such as limb and nervous system development, wound healing, and tumor growth. Meanwhile, PRKCSH is an acidic phosphoprotein known to be a substrate for protein kinase C and related to the signaling pathways including small molecular transporting to cellular golgi and subsequent modification and innateimmune system.
Given the importance of the specific osteoid extracellular matrix formation in clinical diagnosis of osteosarcoma, we mainly focused on STC2 gene for further analysis.
Basic physicochemical properties of STC2 gene
Before further exploring the detailed biological function of STC2 gene in osteosarcoma progression, ProtParam and ProtScale were combine used to interpret the basic physicochemical property of STC2 gene. And the results revealed that STC2, which is short for stanniocalcin2 locating in 5q35.2 encodes a protein composed of 302 amino acids including 35 positively charged amino acid residues (Arg+Lys) and 36 negatively charged amino acid residues (ASP+Glu). The molecular weight of the protein is estimated to be 33.2KD with the theoretical isoelectric point computed as 6.96.
Moreover, the protein half-life time of STC2 is computed to be 30 hours in mammals and the estimated instability index is 38.27 indicating it works as a cellular stable protein. Meanwhile, ProtParam analysis computed the hydrophobic value of STC2 as 70.79 and average hydrophilicity as -0.518 which is consistent with ProtScale analysis which supported STC2 harbors several hydrophilic regions and shall be classified as a hydrophilic protein (Figure 4G).
Cellular location prediction of STC2 gene
Given the feasibility of further clinical medical testing of the gene by IHC experiment, cNLS-mapper, TMHMM database and Human Protein Atlas were in succession used to predict the cellular location of STC2 protein in human cells. And the result of cNLS-mapper analysis indicated that there’s no nuclear localization regions in STC2 protein (Figure 4I), meanwhile, TMHMM database analysis ruled out the potential transmembrane domains existing in STC2 protein structure (Figure 4J), followed by Human Protein Atlas analysis which revealed that STC2 mostly locates in cellular endoplasmic reticulum and extracellular regions (Figure 4H).
To sum up above protein location analyzing and physicochemical properties predicting results which revealed that STC2 works neither through transmembrane nor nuclear locating dependent ways, it’s highly probable that STC2 works as a hydrophilic secretary protein in human cells and participates in signaling transduction related biological processes.
Aberrant STC2 gain of expression in osteosarcoma comparing to normal bone or other types of sarcomas
To validate the aberrant changed expression of STC2 in osteosarcoma, bioinformatic analysis based on both cancer tissue samples and cell lines were conducted. Firstly, the analysis result of CCLE which has been a well known online database for cancer cell lines researches also supported that STC2 expresses much higher in chondrosarcoma and osteosarcoma cell lines comparing to other cancers and sarcomas (Figure 5A).
Moreover, the Oncomine analysis result which database was constructed based on human tissue samples revealed that although STC2 expression various in multiple sarcomas, the expression was not only almost all up regulated in sarcomas including chondrosarcoma, liposarcoma, leiomyosarcoma, poorly differentiated sarcomas, osteosarcoma and so on comparing to normal tissues, but also the expression was the highest in osteosarcoma comparing to the other sarcomas. Meanwhile, as for the STC2 expression in different subtypes of osteosarcoma, the result indicated that it expresses higher in chondroblastic and osteoblastic osteosarcoma than fibroblastic and telangiectatic osteosarcoma (Figure 5B).
The association between STC2 gene and osteosarcoma clinical features
Besides the expression analysis based on online datasets, the immunohistochemistry (IHC) experiment was also performed using local hospital patients samples. Not only the result verified the aberrant STC2 gain of expression in osteosarcoma comparing to normal osteoblasts based on the experiment fact that 41 of the 43 tissues samples showed positive staining (positive ratio 95.3%), but also the association between STC2 expression and osteosarcoma clinical parameters was explored which result revealed that STC2 tends to expresses higher in patients with high Ki67 value and combined with tumor recurrence indicating it’s a potential disease risk indicator, meanwhile, no significance relationship was found between STC2 expression and patients age, gender, bone location, tumor volume nor necrosis (Table 1).
A fact worth emphasizing during the IHC experiment was that 38/41 cases were nuclear staining (Figure 5E-5F), meanwhile, the other 3/41 cases were membrane and cytoplasm staining. The possibility of IHC technical elements was rules out based on the same experiment which showed that STC2 stains in perinuclear endoplasmic reticulum in other cancer samples (Figure 5D). Given above cNLS-mapper, TMHMM and Human Protein Atlas database predicting result which revealed that no nuclear localization regions nor transmembrane domains exist in STC2 protein structure, it is reasonable to infer that STC2 locates in octeosarcoma cellular nuclear through interacting with other elements, for instance nuclear locating transcription factors or regulators indicating a specific regulatory role of STC2 in osteosarcoma development.
STC2 gene centered biological functions and related signaling pathways
To preliminary explore the potential biological functions of STC2 gene in osteosarcoma and the probable signaling pathways involved, STRING was used to construct the STC2 centering PPI network for revealing the surrounding potentially connecting genes followed by GO and KEGG analyzing the probable signaling pathways these genes enriched in. And GO results showed that the biological processes STC2 gene participated in were mainly focused on cellular response to hypoxia, cellular calcium and phosphate metabolism regulation, response to hormone related biological processes (Figure 6A).
Meanwhile, KEGG analysis revealed the signaling pathways FGF1 gene involved were mostly ribosome regulation related signaling pathway, HIF-1 signaling pathway, growth hormone synthesis, secretion and action related signaling pathway, as well as cell cycle modulation related pathways (Figure 6B).
Above results shall provide promising directions for further exploring the mechanism of STC2 regulation on bone osteosarcoma development.