1. Relationship between high CCT6A expression and survival in ES patients
After initial screening, 32 samples in the GSE17618 dataset met our screening criteria. After the subsequent three-step screening process, CCT6A was the only gene that met all of the criteria. We performed Kaplan-Meier survival analysis, dividing the samples into two groups at the median CCT6A expression value: samples with CCT6A expression above the median formed the high-expression group, and samples with expression at or below the median formed the low-expression group. Survival curves were plotted from the survival times of the two groups (Figure 1). Patients with high CCT6A expression had a significantly lower probability of survival than patients with low CCT6A expression (P=0.024).
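The median split and Kaplan-Meier comparison described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the pipeline used in the study: the expression values, follow-up times, and event flags are invented toy data, the function names are hypothetical, and the log-rank test is an assumed choice for comparing the two curves (the text reports only the P value).

```python
import math
from statistics import median


def split_by_median(expression):
    """Label a sample 'high' if its expression is above the median, else 'low'."""
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    curve = []
    survival = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def logrank_p(times, events, groups, target="high"):
    """Two-group log-rank test; returns the P value (chi-square, 1 df)."""
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == target)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d1 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e and g == target)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


# Toy data (hypothetical): expression, follow-up time, event flag (1 = death).
expression = [2.1, 3.4, 5.6, 7.8, 8.9, 9.5, 1.2, 6.7]
times = [60, 55, 48, 10, 14, 8, 70, 20]
events = [0, 1, 0, 1, 1, 1, 0, 1]

groups = split_by_median(expression)
high_curve = kaplan_meier(
    [t for t, g in zip(times, groups) if g == "high"],
    [e for e, g in zip(events, groups) if g == "high"],
)
p_value = logrank_p(times, events, groups)
print(groups, high_curve, p_value)
```

Samples exactly at the median fall into the low-expression group, matching the "less than or equal to" rule stated in the text.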
2. Univariate and multivariate Cox regression analysis for CCT6A
We used univariate and multivariate Cox regression to analyze the relationship between CCT6A and survival status and survival time. In the univariate analysis (Figure 2.A), CCT6A was significantly associated with survival status and survival time (P<0.001, HR=7.953); HR>1 indicates that CCT6A is a high-risk factor, so the higher the CCT6A expression, the poorer the patient's prognosis. Figure 2.A also shows that EFS was strongly associated with survival status and survival time (P<0.001, HR=0.218), with HR<1 indicating that EFS is a low-risk factor. In the forest plot from the multivariate analysis (Figure 2.B), CCT6A remained significantly associated with survival (P<0.001, HR=9.513) and was a high-risk factor for ES prognosis. Age (P=0.002, HR=0.269) and EFS (P<0.001, HR=0.196) were also strongly associated with survival, and both were low-risk factors for ES prognosis. Because CCT6A remained significant in the multivariate model, it can be considered an independent prognostic biomarker, and measuring its expression level may help predict a patient's prognosis.
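To make the link between a Cox coefficient and the reported hazard ratios concrete, the sketch below fits a one-covariate Cox model by Newton-Raphson on the partial likelihood and reports HR = exp(beta). It is an illustration under stated assumptions, not the study's analysis: the data are invented, the function names are hypothetical, ties are handled with the Breslow approximation, and a multivariate model would need the matrix form of the same update.

```python
import math


def fit_univariate_cox(times, events, x, max_iter=50, tol=1e-9):
    """Fit a one-covariate Cox model; return (beta, hazard_ratio)."""
    beta = 0.0
    for _ in range(max_iter):
        gradient = 0.0
        information = 0.0
        for t_i, e_i, x_i in zip(times, events, x):
            if not e_i:
                continue  # censored samples contribute only through risk sets
            # Risk set: every sample still under observation at time t_i.
            risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
            weights = [math.exp(beta * x_j) for x_j in risk]
            s0 = sum(weights)
            s1 = sum(w * x_j for w, x_j in zip(weights, risk))
            s2 = sum(w * x_j * x_j for w, x_j in zip(weights, risk))
            mean = s1 / s0
            gradient += x_i - mean
            information += s2 / s0 - mean * mean
        if information == 0:
            break
        step = gradient / information
        beta += step
        if abs(step) < tol:
            break
    return beta, math.exp(beta)


def risk_label(hazard_ratio):
    """HR > 1 marks a high-risk factor, HR < 1 a low-risk factor."""
    if hazard_ratio > 1:
        return "high-risk"
    if hazard_ratio < 1:
        return "low-risk"
    return "neutral"


# Toy data (hypothetical): x = 1 for high expression, 0 for low expression.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
x = [1, 1, 0, 1, 0, 0]

beta, hr = fit_univariate_cox(times, events, x)
print(beta, hr, risk_label(hr))
```

Read the same way, the reported univariate HR of 7.953 for CCT6A corresponds to a positive coefficient (ln 7.953 is about 2.07), and the HR of 0.218 for EFS to a negative one.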
3. Expression of CCT6A in different subgroups
We compared CCT6A expression across clinical subgroups. By age, median CCT6A expression was higher in the >20-year-old group than in the <=20-year-old group, but the difference was not statistically significant (P>0.05; Figure 3.A). By EFS, median CCT6A expression was significantly higher in patients with EFS<=5 years than in patients with EFS>5 years (P=0.01; Figure 3.B). By tumor origin, the recurrent group had the highest median CCT6A expression, followed by the metastatic group and then the primary group; the difference between the primary and recurrent groups was significant (P<0.01; Figure 3.C). By sex, median CCT6A expression was significantly higher in females than in males (P<0.05; Figure 3.D).
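A two-group comparison of this kind can be sketched as below. The text does not name the test behind the subgroup P values, so the Mann-Whitney U test (Wilcoxon rank-sum) used here is an assumption, as are the function name and the toy expression values. The P value uses the normal approximation without tie or continuity correction, which is crude for small groups.

```python
import math


def mann_whitney_u(group_a, group_b):
    """Return (U statistic for group_a, two-sided P value)."""
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(r for r, (_, label) in zip(ranks, pooled) if label == 0)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_a, p_value


# Toy data (hypothetical): CCT6A expression in two EFS subgroups.
efs_short = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]  # EFS <= 5 years
efs_long = [1.0, 2.0, 3.0, 4.0, 4.5, 4.8]    # EFS > 5 years

u_stat, p_value = mann_whitney_u(efs_short, efs_long)
print(u_stat, p_value)
```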
4. Screening of differentially expressed genes, enrichment analysis and visualization, and construction of protein interaction networks
We screened for differentially expressed genes, performed GO and KEGG enrichment analysis with visualization, and constructed a protein-protein interaction network. Screening according to our criteria yielded 188 differentially expressed genes, of which 106 were up-regulated and 82 were down-regulated; the heat map and volcano plot are shown in Figure 4.A and Figure 4.B. A total of 106 genes were positively correlated with CCT6A and 82 genes were negatively correlated with CCT6A (Figure 5). GO enrichment analysis of the differentially expressed genes (Figure 6.A) shows the top 10 terms in each of the biological process (BP), cellular component (CC), and molecular function (MF) categories. KEGG enrichment analysis (Figure 6.B) shows the enriched pathways. Finally, we imported the differentially expressed genes into the STRING database to obtain the protein-protein interaction network, which was then visualized in Cytoscape (Figure 7).
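The screening step that separates up-regulated from down-regulated genes can be sketched as a simple threshold filter. The cutoffs below (absolute log2 fold change of at least 1, adjusted P below 0.05) are common defaults used here as assumptions, since this section does not restate the screening criteria; the gene names, values, and function name are hypothetical.

```python
def classify_degs(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Split (gene, log2 fold change, adjusted P) rows into up and down lists."""
    up_regulated = []
    down_regulated = []
    for gene, log_fc, adj_p in results:
        if adj_p >= p_cutoff:
            continue  # not significant under the assumed cutoff
        if log_fc >= lfc_cutoff:
            up_regulated.append(gene)
        elif log_fc <= -lfc_cutoff:
            down_regulated.append(gene)
    return up_regulated, down_regulated


# Toy differential expression table (hypothetical genes and values).
results = [
    ("GENE_A", 2.3, 0.001),   # up-regulated
    ("GENE_B", -1.8, 0.010),  # down-regulated
    ("GENE_C", 0.4, 0.002),   # significant, but fold change too small
    ("GENE_D", 3.1, 0.200),   # large fold change, but not significant
    ("GENE_E", -1.0, 0.049),  # exactly at the fold-change cutoff
]

up_regulated, down_regulated = classify_degs(results)
print(len(up_regulated), "up;", len(down_regulated), "down")
```

The resulting gene lists are the kind of input that would then be passed to GO and KEGG enrichment tools and to STRING.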