Extract GMPSs in OS for metastases and non-metastases patients
We built an integrated and computational pipeline based on REO approach using gene expression profiles (Fig. 1A). First, 245 metastases-related genes were identified and they would be used to follow analysis. Second, a 0–1 matrix for a pair of genes (Gm and Gn) in metastases and non-metastases OS patients according to expression pattern of the gene pair. Then, if the frequency of metastases OS patients accord with a particular REO pattern was measured by fisher’s exact test compared with non-metastases controls. The particular REO pattern was expression Gm > Gn or Gm < Gn. Lastly, MGPSs were identified if the P values of Fisher’s exact test were smaller than 0.05. We performed this pipeline based on dataset from TARGET. The distribution of P values of MGPSs with fluctuation is between 0.01 and 0.05 (Fig. 1B). We divide all P values into five regions: 0 to 0.01, 0.01 to 0.02, 0.02 to 0.03, 0.03 to 0.04 and 0.04 to 0.05. The amount in each region had no great difference (Fig. 1C). In order to ensure the accuracy and stability of MGPSs for OS, another dataset GSE33382 from GEO was used for identifying MGPSs. The P values of GSE33382 showed similar distribution to TARGET dataset (Fig. 1D, 1E). Lastly, we got 138 common MGPSs in both dataset TARGET and GSE33382 for OS (Fig. 1F).
Some MGPSs showed strong correlation in metastases-specific co-expressed network for OS
The correlation values of MGPSs would showed great difference between metastases and non-metastases OS patients. According to the inference, a metastases-specific co-expressed network was constructed (Fig. 2A). The network contained 71 nodes and 57 edges. More than half MGPSs showed great difference (absolute values of difference for PCCs > 0.25) between metastases and non-metastases OS patients. There were 50.88% and 5.26% MGPSs were both positive and negative correlated in metastases and non-metastases OS patients (Fig. 2B). The correlated direction of 43.86% MGPSs were reversed in metastases and non-metastases OS patients. The global network showed scale-free distribution (R square = 0.75) which is a specific topological feature of transcriptional regulatory network (Fig. 2C). We found metastases-related gene RPL27A had highest degree in this metastases-specific co-expressed network (degree = 19). Most of the interacted strength between RPL27A and other genes had great difference in metastases and non-metastases OS patients. For example, gene RPL27A and MYL5 had strong positive correlations (Cor = 0.75, P < 0.001) in metastatic OS patients (Fig. 2D). However, gene RPL27A and MYL5 were not correlated in non-metastatic OS patients (Cor = 0.02, P = 0.86, Fig. 2E). The results indicated that these MGPSs maybe play key and specific roles in metastases process for OS patients.
Some genes in MGPSs were associated with prognosis for OS patients
As we known, time to metastases and number of lesions are the most important prognostic factors for OS patients [19]. Thus it’s essential to consider the prognostic roles of MGPSs in OS patients. 13 genes were associated with survival in above metastases-specific co-expressed network for OS (Fig. 3A). For example, prognosis-related MGPSs CYBB and P4HA1 showed strong negative correlations in metastases OS patients (Cor= -0.35, Fig. 3B). Prognosis-related MGPSs CYBB and P4HA1 also showed strong negative correlations in metastases OS patients (Cor= -0.25, Fig. 3C). Gene P4HA1, CYBB and CD53 were all related with survival (Fig. 3D, E, F). In addition, samples with high expression for these genes usually showed worse survival. All the results indicated that some genes in MGPSs were associated with prognosis for OS patients.
Prognosis-related genes in MGPSs could classify metastases and non-metastases OS patients
We constructed a prognosis-related MGPS network which extracted from metastases-specific co-expressed network. This network contained prognosis-related genes and their direct neighbored genes in metastases-specific co-expressed network (Fig. 4A). There were 25 nodes (13 prognosis-related genes) and 18 edges in this prognosis-related MGPS network. We found three MGPSs including P4HA1-CYBB, P4HA1-EVI2B and P4HA1-CD53 were all associated with survival. In order to validate if these prognosis genes in MGPSs could become biomarkers for metastases of OS patients, we classify metastases and non-metastases OS patients using gene expression profile follow a consensus clustering method. The prognosis-related genes in MGPSs could distinguish all samples into diverse groups. Final number of group was 4 based on area under CDF curve plot (Fig. 4B and C). Each OS patients group had a consensus expression pattern and could distinguish clearly (Fig. 4D). Most OS patients could be classified accurately (Chi-square test, P < 0.001) and especially the last group (C4) were all matched with the non-metastases OS patients (Fig. 4E). Collectively, all the above results indicated that the expression of prognosis-related genes in MGPSs could be served as specific biomarkers to distinguish metastases and non-metastases OS patients.
Functional characterizations and drug repurposing candidates of MGPSs in OS
Functional enrichment analyses were performed to characterize the functions of MGPSs in OS. For GO enrichment analyses, these MGPSs were enrichment in some critical biological functions such as regulation of microtubule polymerization, negative regulation of autophagy and negative regulation of smooth muscle cell migration (Fig. 5A). Dynamics of acting filament cooperated with microtubules could drive cell motility process. Many studies revealed that microtubule dynamics were necessary for promoting epithelial-mesenchymal transition (EMT) [20–22]. Expansion of dormant tumor cells into metastases or anti-tumor inflammatory responses would be restricted and promoted by autophagy. On the contrary, metastasis would be promoted by self-eating based on strengthening fitness of tumor cell environmental stresses response including anoikis during metastatic progression [23]. For KEGG pathway enrichment analyses, these MGPSs were enrichment in some key pathways associated with cancer or metastasis such as Natural killer cell mediated cytotoxicity, B cell receptor signaling pathway and MAPK signaling pathway (Fig. 5B). Natural killer cells participate to immune response against metastasis [24]. These functional enrichment analysis showed these GMPSs were associated with cancer metastasis.
We also constructed a drug-related MGPS network to explore drug repurposing candidates for OS (Fig. 5C). We found Proline (DB00172) was a drug repurposing candidate and its target gene were P4HA1 and PYCR2. Proline is one of the twenty amino acids in organism, which is a component of protein. Normal functions of jonts and tendons and maintain and strengthen all reply on proline at a great extent [25]. Azacytidine is clinically used to treat myelodysplastic syndrome, a group of heterogeneous bone marrow stem cell diseases [26]. In our analysis, Azacitidine was considered as a drug candidate for OS. Azacitidine is a cytidine nucleoside analogue, which has the clinical activity of myelodysplastic syndrome and the potential activity of solid tumor.