Molecular and clinical characterization of CCT2 expression at transcriptional level via 2994 samples of breast cancer

Background Molecular chaperones play important roles in regulating various cellular processes and malignant transformation. Expression of some subunits of the molecular chaperonin CCT/TRiC complex has been reported to correlate with cancer development and patient survival. However, little is known about the expression and prognostic significance of Chaperonin Containing TCP1 Subunit 2 (CCT2), a gene encoding a molecular chaperone that is a member of the chaperonin containing TCP1 complex (CCT), also known as the TCP1 ring complex (TRiC). Method Using the Cancer Genome Atlas (TCGA) and Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) databases, we systematically reviewed a total of 2994 cases with transcriptome data and analyzed the functional annotation of CCT2 by Gene Ontology (GO) and KEGG analysis. Univariate and multivariate survival analyses were performed to investigate the prognostic value of CCT2 in breast cancer.


Introduction
According to the "global cancer statistics" released by the World Health Organization (WHO) in 2015, approximately 1.15 million new cases of breast cancer are diagnosed every year, accounting for 23% of all female malignancies, and there are approximately 410,000 deaths every year, accounting for 14% of cancer deaths in women worldwide [1]. Although breast cancer is among the solid tumors with the best prognosis and outcome, these figures differ significantly among subtypes, and many problems remain to be solved urgently. With the beginning of the new era of precision medicine, more emphasis should be placed on individualized and accurate diagnosis and treatment of breast cancer. Therefore, finding novel and promising biomarkers for diagnosis and treatment, as well as effective therapeutic targets, is a major and pressing issue.
Although the hardened armor of cancer, such as genomic instability, uncontrolled proliferation and metastasis, makes it a well-equipped army able to fight against our various therapeutics [2,3], it does have a soft spot: its dependency on major cellular processes like transcription, translation, splicing, protein degradation and protein folding [4]. In these processes, the proteostasis network (PN), which keeps the proteome balanced, plays an important role in maintaining the native function of proteins and safeguarding the health of the cell and organism. Chaperonins are central components of the PN and key players in it [5]. Cancer cells must produce the proteins involved in proliferation, angiogenesis, survival and migration, which are essential for tumor formation, progression and metastasis. Compared with healthy cells, they are therefore more dependent on molecular chaperones, because overexpression of oncogenes and chromosomal abnormalities cause more imbalances [6].
Apart from the HSP90 inhibitors, which were discovered two decades ago and then abandoned because of incomplete inhibition of HSP90, dose-limiting toxicity and insufficient downregulation of client proteins [7,8], another class of protein-folding complexes, the chaperonins, has attracted attention in recent years. CCT is a type II chaperonin: a large hetero-oligomeric, ATP-dependent complex built from two rings stacked back to back, each forming a central chamber that sequesters and folds newly synthesized or misfolded substrate polypeptides [9][10][11][12]. CCT is composed of eight paralogous subunits, CCT1-8, also known as CCT α, β, γ, δ, ε, ζ, η, θ [13]. Approximately 10% of newly synthesized proteins in eukaryotic cells are bound and folded with the assistance of CCT [14], and this figure is thought to be higher in cancer cells, where the substrates include oncogenic proteins and mediators such as STAT3 and KRAS [15][16][17][18]. Given the evidence that CCT facilitates neoplastic transformation, it is a newly emerging and promising candidate that could serve as a diagnostic marker as well as a therapeutic target.
Because many previous studies treated CCT as a single complex without taking into account that it is built from eight different subunits, the importance of any single subunit, for example chaperonin containing TCP1 subunit 2 (CCT2 or CCTβ), has remained largely undetermined.
According to a limited number of published studies, increased expression of CCT2 has been observed in various tumor cell lines compared with normal tissues, including liver, prostate, gallbladder, lung, colorectal and breast cancers [15][19][20][21][22][23]. For breast cancer, although several studies have illustrated the relationship between CCT2 expression and the growth of breast cancer cells, no comprehensive and detailed conclusion based on clinical data has addressed the different biological, clinical and molecular characteristics of each distinct subtype [19,24,25]. Therefore, many open questions regarding the expression and prognostic significance of CCT2 in breast cancer must be clarified.
In the present study, we assessed CCT2 expression status and related biological processes by characterizing transcriptome data from two comprehensive genomic databases comprising a total of 2994 breast cancer samples. We also explored the relationships among the members of the CCT gene family and their prognostic value. To the best of our knowledge, this is the largest and most comprehensive study characterizing CCT2 expression in breast tumors of all grades.

Methods And Materials
Data acquisition
The TCGA dataset on breast invasive carcinoma was downloaded and processed using GDCRNATools (access date: Feb 01, 2020) [26]. Raw count data were normalized with TMM as implemented in edgeR [27] and then transformed with voom in limma [28]; only genes with cpm > 1 in more than half of the samples were kept. Filtered TCGA breast cancer clinical data were kindly provided by Dr. Hai Hu and Dr. Jianfang Liu of the Chan Soon-Shiong Institute of Molecular Medicine at Windber. HER2 status was re-called using DNA copy number for cases without an IHC or FISH status. Standardized survival data from the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR) [29] were used in this study. The METABRIC dataset [30] on breast cancer (METABRIC, Nature 2012) was acquired from cBioPortal (http://www.cbioportal.org/) (access date: Feb 01, 2020). CCT2 expression data in the GSE15852, GSE54002, GSE45827 and GSE42568 datasets were collected from the GENT2 database [31] (http://gent2.appex.kr/gent2/), a newly updated platform for exploring gene expression patterns across tumor and normal tissues. Gene expression patterns of CCT2 across tumor and normal tissues were also assessed using the GENT2 database.
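The expression filter described above (keep a gene only if its cpm exceeds 1 in more than half of the samples) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used edgeR and limma in R: the function names and the toy count table are hypothetical, and the sketch uses raw library sizes rather than TMM-scaled effective library sizes.

```python
def counts_per_million(counts):
    """Convert a {gene: [count per sample]} table to counts per million (CPM).

    Assumption: library size is the plain column sum; edgeR would instead use
    TMM-scaled effective library sizes.
    """
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [row[j] * 1e6 / lib_sizes[j] for j in range(n_samples)]
        for gene, row in counts.items()
    }


def filter_low_expression(counts, min_cpm=1.0):
    """Keep genes whose CPM is above min_cpm in more than half of the samples."""
    cpm = counts_per_million(counts)
    n_samples = len(next(iter(counts.values())))
    return {
        gene: counts[gene]
        for gene, values in cpm.items()
        if sum(v > min_cpm for v in values) > n_samples / 2
    }


# Hypothetical toy table: each of the four samples sums to exactly 1,000,000 reads,
# so CPM equals the raw count.
toy_counts = {
    "HOUSEKEEPING": [999_990, 999_997, 999_995, 999_999],
    "GENE_LOW": [0, 1, 0, 1],   # never above 1 CPM, so it is dropped
    "GENE_MID": [10, 2, 5, 0],  # above 1 CPM in 3 of 4 samples, so it is kept
}
kept = filter_low_expression(toy_counts)
print(sorted(kept))
```

The strict "more than half" comparison mirrors the wording of the Methods; a gene expressed in exactly half of the samples would be removed under this reading.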

Kaplan-Meier Plotter database analysis
The Kaplan-Meier plotter database [32] can assess the effect of 54k genes on survival in 21 cancer types; breast cancer is its largest dataset, with a total of 6,234 samples. The effect of CCT2 expression on survival in breast cancer, together with the hazard ratio (HR) with 95% confidence intervals and the log-rank P-value, was estimated with Kaplan-Meier plotter (http://kmplot.com/analysis).

TIMER database analysis
The TIMER database (https://cistrome.shinyapps.io/timer/) is a comprehensive web platform containing 10897 samples for systematic analysis of immune infiltrates across 32 cancer types from the TCGA database [33]. The "DiffExp" module was used to explore the differential expression of CCT2 between tumor and adjacent normal tissues, and the Wilcoxon test was applied to determine the statistical significance of differential expression.

Functional enrichment analysis
GO [34] and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment [35] analyses were performed using the clusterProfiler package in the statistical software R, version 3.6.0 (http://www.r-project.org/). GO terms and KEGG pathways with an adjusted P-value less than 0.05 were considered statistically significant. Dot plots of enriched KEGG pathways were drawn using the clusterProfiler package [36].
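The text does not state which multiple-testing adjustment was applied to the enrichment P-values. As an illustration only, the sketch below assumes the Benjamini-Hochberg procedure, which is a common choice for enrichment analyses; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order.

    Assumption: BH is used here for illustration; the paper only says
    "adjusted P-value" without naming the method.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for four pathways.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [i for i, p in enumerate(adj_p) if p < 0.05]
print(adj_p, significant)
```

With these toy values all four terms pass the 0.05 threshold after adjustment, which shows that the adjusted values, not the raw ones, decide which terms are reported.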

Statistical analyses
Chi-square tests were performed to assess possible associations between CCT2 expression and clinicopathological characteristics. One-way analysis of variance (ANOVA) or the t-test was used to determine differences in CCT2 expression between clinicopathological groups. Survival was estimated using the Kaplan-Meier method, and differences in survival were evaluated with the log-rank test. Univariate and multivariable Cox proportional hazards regression was used to assess associations with OS. Gene expression correlation was analyzed using the Pearson correlation coefficient. All statistical tests were performed using R software version 3.6.0. A P-value < 0.05 was considered statistically significant.
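To make the Kaplan-Meier method mentioned above concrete, here is a minimal Python sketch of the product-limit estimator. It is an illustration under stated assumptions, not the R code used in the study: the function name and the follow-up data are hypothetical, and no confidence intervals or log-rank test are computed.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up data for five patients.
toy_curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 1, 0])
print(toy_curve)
```

Patients censored at the same time as an event are still counted as at risk at that time, which is the usual convention for this estimator.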

Expression pattern of CCT2 in various cancers
To determine the mRNA levels of CCT2 in multiple human cancers, we analyzed CCT2 expression using RNA-sequencing (RNA-seq) data derived from the TCGA database. The expression of CCT2 in tumor and adjacent normal tissues across all tumors in TCGA is shown in Fig. 1a. CCT2 expression was significantly higher in BLCA (bladder urothelial carcinoma), BRCA (breast invasive carcinoma), CHOL (cholangiocarcinoma), COAD (colon adenocarcinoma), ESCA (esophageal carcinoma), HNSC (head and neck cancer), KIRP (kidney renal papillary cell carcinoma), LIHC (liver hepatocellular carcinoma), LUAD (lung adenocarcinoma), LUSC (lung squamous cell carcinoma), PRAD (prostate adenocarcinoma), READ (rectum adenocarcinoma), STAD (stomach adenocarcinoma) and UCEC (uterine corpus endometrial carcinoma) than in adjacent normal tissues. In contrast, CCT2 expression was significantly lower in only two cancer types, KICH (kidney chromophobe) and KIRC (kidney renal clear cell carcinoma). To validate the expression pattern of CCT2 in various cancers, we further analyzed CCT2 expression in 72 paired tissues across more than 68,000 samples using the GENT2 database. Results from both the GPL570 and GPL96 microarray platforms revealed that global CCT2 expression was higher in tumor tissues than in normal tissues (Fig. 1b and c). CCT2 expression was higher in most tumor tissues than in the corresponding normal tissues. In particular, the global expression of CCT2 in breast cancer tissues was higher than in normal tissues.

Association between CCT2 expression and clinical characteristics of breast cancer patients
Expression of CCT2 was dichotomised into low- and high-expression groups using the median as the cutoff value. We analyzed the associations between CCT2 expression and clinical characteristics in both the TCGA cohort (n = 1090) and the METABRIC cohort (n = 1904); the results are given in Table 1 and Table 2. In both cohorts, CCT2 expression was significantly associated with HER2 status. CCT2 expression was associated with American Joint Committee on Cancer (AJCC) stage and age in the METABRIC cohort, but not in the TCGA cohort. In the TCGA cohort, CCT2 expression was significantly associated with TNM stage, but not with ER status. Moreover, in the METABRIC cohort, CCT2 expression was associated with tumor grade, but not with tumor size or ER status.
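The median split and the subsequent association test can be sketched as follows. This is a hedged illustration rather than the authors' R code: the function names, the sample identifiers and the 2x2 counts are hypothetical, samples equal to the median are assigned to the low group by assumption, and the chi-square statistic is computed without a continuity correction.

```python
from statistics import median


def dichotomise(expression):
    """Label each sample 'high' or 'low' relative to the cohort median.

    Assumption: values equal to the median are labelled 'low'.
    """
    cutoff = median(expression.values())
    return {s: ("high" if v > cutoff else "low") for s, v in expression.items()}


def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table ((a, b), (c, d)),
    without continuity correction (1 degree of freedom)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical expression values for four samples.
groups = dichotomise({"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0})

# Hypothetical counts: rows = low/high CCT2, columns = HER2 negative/positive.
statistic = chi_square_2x2(((10, 20), (20, 10)))
print(groups, round(statistic, 4))
```

A statistic above 3.84 corresponds to P < 0.05 at one degree of freedom, so the toy table would count as a significant association.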

CCT2 mRNA expression pattern in breast cancer
We further explored the differences in CCT2 expression between clinicopathological groups. CCT2 expression was significantly higher in the PR-negative (PR-) group (p = 0.013) and the HER2-positive (HER2+) group (p = 0.014) (Fig. 2a and b); CCT2 overexpression in the HER2+ group was also validated in the TCGA cohort, whereas that in the PR- group was not (Fig. 2e and f). In the METABRIC cohort, CCT2 expression was higher in the basal, HER2-enriched and luminal B (LumB) groups than in the normal-like group (Fig. 2c). CCT2 expression was significantly higher in Grade 3 than in Grade 1 tumors (P < 0.0001) (Fig. 2d). In the TCGA cohort, elevated expression of CCT2 was found in higher T stages and in more aggressive subtypes. CCT2 expression was significantly higher in tumor tissues than in normal tissues (P < 0.0001) (Fig. 2i), and this result was further validated in four independent microarray datasets derived from the GEO database (Fig. 3a-d).

Association of CCT2 expression and patient survival in breast cancer
We explored the prognostic value of CCT2 expression using the KM-plotter database, which contains a total of 6,234 breast cancer samples. Kaplan-Meier analysis revealed that higher CCT2 expression was associated with worse overall survival (OS), relapse-free survival (RFS) and distant metastasis-free survival (DMFS), but not with post-progression survival (PPS) (Fig. 4a-b and d-e). The significant correlation between higher CCT2 expression and worse OS was further validated in the independent METABRIC and TCGA cohorts (Fig. 4c and f). Furthermore, we assessed the prognostic value of CCT2 expression at the subtype level and found that higher expression significantly predicted worse OS in the luminal A group in both the KM-plotter database (P < 0.0001) and the TCGA cohort (p < 0.0001) (Fig. 5a and e), but not in the luminal B, HER2 or basal groups (Fig. 5b-d and f).

Univariate and multivariate analyses
The expression level of CCT2 mRNA was a significant factor in univariate analysis of both the TCGA (HR, 1.306; 95% CI, 1.037-1.644; p = 0.023) and METABRIC (HR, 1.18; 95% CI, 1.075-1.295; p < 0.001) datasets (Table 3). CCT2 was also an independent significant prognostic factor for breast cancer in multivariate analysis of the TCGA cohort after adjusting for age, AJCC stage, ER status, PR status and HER2 status (Fig. 6a). Interestingly, CCT2 expression was also an independent prognostic factor for breast cancer in multivariate analysis of the METABRIC cohort after adjusting for age, AJCC stage, grade, ER status, PR status and HER2 status (Fig. 6b).

CCT2-related signaling pathways identified using functional enrichment analysis
To explore the potential functional role of CCT2, genes correlated with CCT2 expression (Pearson |R| >= 0.4) were screened out (n = 140) (Table_S1). These genes were then used for functional enrichment analysis in R with the clusterProfiler package [35]. Interestingly, GO analysis revealed that these genes were mainly involved in protein folding and binding biological processes (Table_S2). KEGG enrichment analysis revealed that these genes were significantly enriched in cell cycle, oocyte meiosis, progesterone-mediated oocyte maturation, RNA transport and the p53 signaling pathway (Fig. 7).
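The correlation screen described above (genes with Pearson |R| >= 0.4 against CCT2) can be sketched in Python as follows. The function names, gene labels and expression vectors are hypothetical and serve only to illustrate the selection rule; the study itself performed this step in R.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlated_genes(target, others, threshold=0.4):
    """Return {gene: r} for genes whose |r| with the target meets the threshold."""
    hits = {}
    for gene, values in others.items():
        r = pearson_r(target, values)
        if abs(r) >= threshold:
            hits[gene] = r
    return hits


# Hypothetical expression of CCT2 and three other genes across five samples.
cct2 = [1, 2, 3, 4, 5]
other_genes = {
    "GENE_UP": [2, 4, 6, 8, 10],    # perfectly positively correlated
    "GENE_DOWN": [5, 4, 3, 2, 1],   # perfectly negatively correlated
    "GENE_FLAT": [1, -1, 0, -1, 1], # uncorrelated with the target
}
hits = correlated_genes(cct2, other_genes)
print(sorted(hits))
```

Because the threshold is applied to the absolute value, both positively and negatively correlated genes are retained, in line with the |R| notation used in the text.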

Correlations between CCTs gene family and prognostic value
We calculated the pairwise correlations among the CCT genes by analyzing their mRNA expression in the TCGA cohort. Interestingly, almost all CCT genes were significantly positively correlated with each other, including CCT1, CCT2, CCT3, CCT4, CCT5, CCT6A, CCT7 and CCT8, but not CCT6B (Fig. 8). Furthermore, we systematically assessed the prognostic value of the CCT gene family using univariate analyses in both the TCGA and METABRIC cohorts (Table 4). CCT4 expression data were not available in the METABRIC dataset, so its prognostic value in the METABRIC cohort is unknown. In summary, only CCT2 and CCT5 were significantly correlated with OS in both cohorts.

Discussion
Our work revealed that CCT2 tends to be overexpressed in tumor tissues compared with normal tissues, and that it is expressed at higher levels in more malignant grades and molecular subtypes of breast cancer. Genes correlated with CCT2 expression were mainly enriched in the cell cycle pathway and in the p53 signaling pathway. Clinically, our results indicated that CCT2 expression was independently associated with worse prognosis in patients with breast cancer, especially in the luminal A subtype. Additionally, we explored potential relationships among the members of the CCT gene family and their prognostic role in breast cancer.
Many previous studies have focused on colorectal cancer, gallbladder cancer, liver cancer, prostate cancer, small cell lung cancer and so on. For example, Park et al. found that human colorectal cancer tissues showed greater CCT2 expression than normal colon tissues, and that higher CCT2 expression in tumor tissues from colorectal cancer patients was associated with a reduced survival rate. In addition, Zou et al. reported that, in gallbladder cancer, positive expression of PDIA3 and CCT2 was significantly associated with clinicopathological features of both squamous carcinoma/adenosquamous carcinoma (SC/ASC) and adenocarcinoma (AC) specimens, including lymph node metastasis and high TNM stage [22]. Despite these valuable findings, much more work on BLCA, ESCA, HNSC, STAD, UCEC and renal tumors remains to be done, which should lead to a more comprehensive understanding of the function of CCT2 in numerous cancers.
With regard to breast cancer, only three previous studies addressing CCT2 have been published. The first, by AH Charpentier et al. in 2000, showed that Pescadillo and chaperonin CCT2 were two putative autocrine/paracrine factors with a potential function in regulating the growth of breast cancer cells, and that both were highly upregulated by E2 (17beta-estradiol) [24]. The second, by Stephen T. Guest et al., presented some unique new findings: CCT1 and CCT2 were necessary for the growth and survival of breast cancer cells in vitro and were determinants of overall survival in breast cancer patients [19]. The third, by Anne E. Showalter et al., was published this year. By depleting or overexpressing the subunit in breast cancer and breast epithelial cells, they found that increasing CCT2 in cells by 1.3-1.8-fold also increased the levels of other CCT subunits (CCT3, CCT4 and CCT5), while silencing CCT2 expression by ~50% caused other CCT subunits to decrease. Their study also showed that cells expressing higher CCT2 were more invasive and had a higher proliferative index, and that depletion of CCT2 in a syngeneic murine model of triple-negative breast cancer (TNBC) had the potential to prevent tumor growth [25].
Although all these previous studies emphasized the significance of CCT2 in breast cancer, they focused only on the growth and survival of breast cancer cells. None reached a comprehensive and detailed conclusion about the different biological, clinical and molecular characteristics of each distinct subtype. More importantly, the transcriptome data used in this study were derived from the two largest independent breast cancer databases, which makes our results more comprehensive and reliable.
As for other functions of CCT2, Park et al. found that a reduction in CCT2 inhibited tumor induction by Gli-1, and that ubiquitination-mediated Gli-1 degradation by β-TrCP occurred during incomplete folding of Gli-1 in hypoxia. The correlation between CCT2 and Gli-1 expression is an important determinant of survival in colorectal cancer patients. In addition, Lu et al. discovered that phosphoribosylformylglycinamidine synthase (PFAS), an essential enzyme in de novo purine synthesis, interacted with several proteins that play physiological roles in tumor development, including CAD, CCT2, PRDX1 and PHGDH; PFAS was also able to deamidate PHGDH and to induce other post-translational modifications in CAD, CCT2 and PRDX1 [37]. Previous studies have also reported valuable findings on other subunits of the CCT complex. In various cancers, the expression levels of different CCT subunits were upregulated to varying degrees: CCT3 in hepatocellular carcinoma [38], and CCT8 in hepatocellular carcinoma and glioblastoma [39,40]. In the study by Hallal et al., extracellular vesicles from neurosurgical aspirates identified CCT6A as a potential glioblastoma biomarker with prognostic significance [41]. Another group found that overexpression of CCT1 in yeast had no effect on the levels of assembled complex, but that the CCT1 subunits that remained soluble in the cytosol had inherent protein-folding activity [42]. Regarding CCT subunits acting as monomers, CCT4 was found to produce a protrusion phenotype by interacting with microtubules and p150glued [43,44]. CCT5 and CCT8 could colocalize with actin fibers outside of the oligomer, and CCT5 also played a key role in the transcriptional regulation of actin [45]. A previous study also linked CCT5 to breast cancer. Ooe A et al. discovered that CCT5, RGS3 and YKT6 mRNA expression, which was upregulated in p53-mutated breast cancers, might be involved in resistance to docetaxel and might be clinically useful for identifying the subset of breast cancer patients who may or may not benefit from docetaxel therapy [46]. CCT5 has also been closely linked to lung cancer. Gao H et al. showed that CCT5 could induce an autoantibody response in non-small cell lung cancer (NSCLC) sera and showed higher expression in NSCLC tissues by Western blot and immunohistochemistry [47]. Knockdown experiments have also been reported for CCT5, PIP4K2A, EXO1, CMBL, OPN3 and KMO, genes located within 200 kb up- or downstream of three SNPs associated with small cell lung cancer (SCLC) overall survival [48]. In addition, CCT5 participates in replication of the hepatitis C virus genome through interaction with the viral NS5B protein [49]. However, the role of CCT in many diseases, including cancer, is far from fully characterized and requires much more research.
Consistent with our results, some studies have also reported the potential of inhibiting cancer cells by targeting CCTs. For instance, Showalter Anne E et al. discovered a CCT inhibitor named CT20p, which was able to kill cancer cells in a CCT-dependent manner. Cancer cells in which CCT was inhibited were resistant to CT20p killing, whereas cells in which CCT expression was increased were susceptible [15,50]. However, given the complexity of CCT and its multiple subunits, as well as the lack of a complete understanding of CCT substrate selectivity in vivo, there are inevitably challenges that impede the development of feasible and effective therapeutics like CT20p [25]. In summary, we discussed the role of CCT2 in tumors together with current research on the CCT gene family. Future research focused on the molecular mechanisms by which CCT2 promotes cancer might yield novel insights into possible treatments targeting CCT2.

Conclusion
In conclusion, we found that CCT2 overexpression was independently associated with worse prognosis in patients with breast cancer, especially in the luminal A subtype. Moreover, these findings may expand the understanding of potential anti-CCT2 treatments. To our knowledge, this is the largest and most comprehensive study characterizing the expression pattern of CCT2 together with its prognostic value in breast cancer.

Declarations
The data generated or analyzed during this study are included in this article, or if absent are available from the corresponding author upon reasonable request.

Figure 5
Kaplan-Meier survival curves comparing high and low expression of CCT2 in breast cancer molecular subtypes. (a-d) Survival analysis of CCT2 in breast cancer molecular subtypes using the KMplotter database. (e-h) Survival analysis of CCT2 in TCGA molecular subtypes. OS denotes overall survival.

Figure 6
Multivariate analysis of CCT2 expression adjusting for ER, PR, HER2, AJCC stage and age in the TCGA cohort (a) as well as the METABRIC cohort (b).

Figure 7
Functional enrichment analysis shows KEGG enriched pathways of CCT2-related genes.
