High Expression of TTK As Colon Stem Cell Marker Correlated With M1 Macrophages Inltration

Background: Tumor stemness-related genes promote tumor progression and resistance to immunotherapy, but the relation between tumor-inltrating immune cells and colon adenocarcinoma (COAD) stemness-related genes remains unclear. Methods: The mRNAsi scores, a stemness index derived from transcriptomic data of the TCGA COAD cohort were used for integrated bioinformatics analysis. The Weighted Gene Co-expression Network Analysis (WGCNA) was applied to classify the modules correlated with mRNAsi scores. The Gene Expression Omnibus (GEO) cohorts GSE76402 was used to verify the DEGs between colon cancer and normal tissues. Real-time PCR and Flow-cytometry were used to validate the expression of genes and cancer stem cells marker in the spheres of colon cancer cells. The Single-sample Gene Set Enrichment Analysis (ssGSEA) and CIBERSORT algorithm were used to explore the correlation between signicant genes and the immune cells in tumor microenvironment. Result: The mRNAsi score was associated with tumor development and progression. TTK were identied as COAD stemness-related genes based on WGCNA analysis. TTK were negatively correlated with immunity. Interestingly, M1 Macrophages had a higher inltration in COAD with high TTK expression. Higher inltration of M1 Macrophages would result in the worse survival rate of COAD patients. Conclusion: TTK may have the potential as a prognostic biomarker and adjuvant target to promote immunotherapy sensitivity and improve outcomes of COAD patients.


Introduction
Colon adenocarcinoma (COAD) ranks the third and fourth of cancer-related incidence and mortality in the world, accounting for more than 700,000 death each year [1,2]. Immunotherapy has become a powerful clinical strategy for cancer treatment. However, in contrast to Melanoma and non-small-cell lung cancer, immunotherapy has not yet become a relevant part of the treatment landscape of unselected colon cancer and it remains an unmet treatment needs for COAD patients [2]. Immunotherapy in previouslytreated CRC patients resulted in no improvement post-treatment, also its use in other CRC subtypes has been either unsuccessful, or not extensively explored [3,4]. Hence, it is vital to further unravel the in-depth mechanism of the resistance of immunotherapy for COAD.
Effective immunotherapy is limited in most patients by the immunosuppressive tumor environment.
Cancer stem cells (CSCs), a dynamically proliferating population in the COAD microenvironment, play an immunosuppressive role in regulating Tumor in ltrating immune cells (TIICs) [5,6]. Failure of immunotherapy in COAD patients are mainly attributed to CSCs self-renewal and immune surveillance [7,8]. These features might be attributed to genes related to CSCs [9,10]. The genes related to CSCs stemness maintenance not only promote proliferation of CSCs, but also might regulate TIICs [11]. CD44, as CSCs biomarker in colon cancer, would regulate T helper type 1 (Th1) cell survival and promote optimal survival of effector CD4 + T cells engaged in immune response [12]. The genes related to CSCs and tumor stemness are critical for developing individualized treatment strategies. But the comprehensive landscape of immune cells in ltrating of COAD with high stemness has not yet been elucidated, and the association between TIICs and CSCs related genes is unclear.
At present, with the rapid development of high-throughput genome sequencing technology, the genome and transcriptome characteristics of tumor cells are considered closely related to tumorigenesis and maintenance of cancer stemness. Malta et al. developed a machine learning algorithm to compare genome maps and epigenetic feature sets of tumor cells and embryonic stem cells [13]. This algorithm calculated the stemness index (mRNAsi scores) by analyzing the expression of characteristic genes related to the tumor stemness maintenance. The mRNAsi scores have been used to evaluate the correlations with tumor metastasis and survival rate [14]. However, the correlation of mRNAsi scores with the in ltration of TIICs remains unknown.
Weighted Gene Expression Network Analysis (WGCNA) is a systematic biological method, which could cluster the modules of highly correlated genes and relate modules to external sample traits [15]. WGCNA was widely used to explore the associations between gene sets and clinical features, and to identify candidate biomarkers [16].
In our study, we identi ed key genes related to the stemness of COAD with WGCNA analysis and analyze the correlation between stemness of COAD and the in ltration of TIICs. Also, we investigated that the in ltration of activated memory CD4 + T cells was higher, while Tregs was lower in COAD patients with higher mRNAsi scores. Taken together, our result has identi ed genes that play a critical role in the biological functions of COAD stemness and reveal the correlation between stemness of COAD and in ltrating immune cells. The ow chart of this study is shown in supplement Fig. 1.

Cell culture and Real-time (RT) PCR
The human colon cancer cell lines including HT29, RKO cells and human normal colonic epithelial cells FHC were purchased from American Type Culture Collection (ATCC, Manassas, VA). All cell lines had been identi ed after purchasing and lack of contamination before the experiment. Cells were cultured in Dulbecco's modi ed Eagle's medium (DMEM, Gibco, Lot:8120287) with 10% fetal bovine serum (FBS, ExCell Bio, Lot:11l032). Total RNA was extracted from cells by the RNAiso Plus (TaKaRaBio Technology, Lot:9109) and real-time RT-PCR were performed by the Reverse Transcription Kit and SYBR Green Master Mix (Vazyme, Lot: Q341-02). The Ct values were normalized to the GAPDH expression level and 2 − ΔΔCt method was used to measure the relative expression. RT-PCR was repeated at least 3 times to ensure reproducibility. The sequence of the primers was shown in Table 1.

Correlation between mRNAsi scores and clinical characteristic
The comprehensive clinical features and clinical subtypes of samples were comprehensively analyzed and correlated with mRNAsi scores. The high scores group and low scores group were distinguished with median mRNAsi scores as the cut-off value, then the survival rate of patients was displayed with the Kaplan-Meier curve. We detected the protein expression of DEGs in the normal colon tissues and colon cancer tissues from The Human Protein Atlas (HPA) database (https://www.proteinatlas.org/).

Weighted Correlation Network Analysis (WGCNA) analysis and identi cation of signi cant modules and genes
The WGCNA algorithm eliminates multiple hypothesis testing and correction issues by clustering thousands of genes with similar expression patterns and then screen gene sets associated with phenotypes [15]. The co-expression network was constructed by the WGCNA package with the DEGs from the above data and mRNAsi scores. Pearson's correlation between genes and the power values were used to construct the weighted adjacency matrix and scale-free co-expression network Topological overlap matrix (TOM). Based on TOM analysis and clustering of genes with similar functions, the module with a minimum size of 50 genes was built for further construction of module.
To ensure the accuracy of the module, the module eigengenes (MEs) as an expression pattern with a distinct expression feature was used to calculate the correlation between different modules. Similar modules were aggregated with cutoff (ME scores < 0.25) into one module. The gene signi cance (GS) was de ned as the correlation between DEGs expression and sample traits. The module membership (MM) was calculated to identify the correlation between genes in a certain module and gene expression pro les. The key genes and signi cant modules were con rmed by cutoff (cor.gene MM > 0.8 and cor.gene GS > 0.5).
2.6 Functional Annotation of modules genes R packages cluster "Pro ler", "enrichplot", and "ggplot2" were used to analyze gene ontology (GO) functional annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG). The content of GO analysis consisted of cellular component (CC), molecular function (MF), biological process (BP).

Protein-Protein Interactions (PPI) network and Coexpression analysis of genes
To visualize the interactions among proteins in signi cant modules, we used the STRING website(https://string-db.org/) and the Cytoscape software to construct the PPI network. The STRING database was used to predict protein interaction and construct an interaction network. The "corplot" package was used to visualize the correlation between genes.
2.8 Immune scores evaluation, immunity stage classi cation and immune cell in ltration analysis in COAD ESTIMATE package was applied to infer the immune score with RNA-Seq expression data. The "ggExtra" and "ggpubr" packages were applied to calculate the correlation between mRNAsi scores and immune

Statistical analysis
All statistical analyses were performed on R (3.6.2) and corresponding packages. The difference of mRNAsi scores between cancer samples and normal samples and the in ltration of the immune cells in COAD with different mRNAsi scores were analyzed by the Wilcoxon rank sum test the Pearson correlation analysis was used for co-expression analysis of genes. The Kaplan Meier analysis was conducted to the overall survival of each group using the "survimer" package. For DEGs analysis, the difference expressed genes between groups were analyzed by the Wilcoxon rank-sum test. The correlation between miRNA scores and immune scores of samples was calculated by Spearman correlation analysis. The p-value < 0.05 was considered statistically signi cant.

Association of mRNAsi scores with COAD development and progression
To explore the correlation of COAD stemness with patient's clinical characteristics, we rst analyzed the difference of mRNAsi score in COAD and colon tissues. As shown in Fig. 1A, the mRNAsi scores were much higher in COAD than in normal tissues.
Moreover, the Kaplan Meier survival curve showed that there was no difference in the overall survival rate between COAD patients with high mRNAsi scores and patients with low mRNAsi scores (Fig. 1B). However, the mRNAsi scores in COAD without distant metastasis was signi cantly higher than that in distant metastatic COAD (Fig. 1C). Meanwhile, with the progression of COAD lymph node metastasis, the mRNAsi scores in COAD decreased rst and then increased (Fig. 1D). The above result suggested that the mRNAsi scores were signi cantly higher than that in normal colon tissue and increased with COAD metastasis.
3.2 Identi cation of the key modules and genes related to mRNAsi score via WGCNA analysis Tumor stemness related genes are signi cantly higher in the tumor, which are important in maintaining tumor self-renewal properties [14]. To screen the genes related to mRNAsi score, we rst analyzed and ltered the genes between COAD and normal colon tissues to identify differential expression genes (DEGs). We identi ed 6477 DEGs between normal tissues and COAD, consisted of 1916 down-regulated DEGs and 4561 up-regulated DEGs. we further used WGCNA analysis to identify biologically signi cant modules correlated with mRNAsi scores. For accurate analysis, 4 outliers were removed and the DEGs were divided into different modules with the clusters. Then the appropriate power value (β = 4, scale-free R2 = 0.950) was selected as a soft threshold for the scale-free network and MEDissThres as 0.25 was set to merge similar modules into one module ( Fig. 2A). Furthermore, 11 modules were screened out and the heat map of the modules was drawn to reveal the correlation scores between COAD characteristics and the patient's mRNAsi scores and EREG-mRNAs scores (Fig. 2B).
Given that mRNAsi showed better consistency for most tumors compared to the EREG-mRNAsi, based on both RNA expression and epigenetics, elucidates the discrepancy between mRNAsi and shows a positive correlation with mRNAsi [13]. The green module was most positively correlated with the mRNAsi scores of COAD patients (Fig. 2C), while the brown module and the red module exhibited a signi cantly negative correlation with the mRNAsi scores of COAD patients (Fig. 2D-E). In addition, the brown and green modules exhibited a high correlation with mRNAsi scores. Thus, we chose the genes (GS > 0.5 and MM > 0.8) in the brown and green modules for subsequent analyses. The genes in the brown modules were de ned as genes related mRNAsi score that was positively correlated with stemness of COAD.

Functional enrichment and co-expression analysis of COAD stemness related genes
In order to explore whether the genes in the module have similar functions and internal connections, we calculated the correlation between genes and performed GO and KEGG pathway enrichment analysis on these genes in the brown and green modules. As shown in Fig. 3A, GO analysis revealed that the genes in the brown module were associated with contractile ber and actin binding, while the genes in the green module were associated with mitotic nuclear division, nuclear division, organelle ssion, regulation of cell cycle phase transition, chromosomal region and protein serine/threonine kinase activity (Fig. 3B). In terms of KEGG analysis, the genes in the brown module mainly focused on cGMP − PKG signaling and the genes in the green module were associated with cell cycle (Fig. 3C-D). As shown in Fig. 3E-F, there was a strongly positive correlation between the genes in the green and the brown module.

Validation of COAD stemness related genes and construction of PPI network
In order to demonstrate that the genes related mRNAsi scores are higher in COAD with high stemness, we performed real-time RT-PCR to detect the expression of PLK4, TTK, CHEK1, KIF18A and BUB1 in HT29 and RKO colon cancer cell lines, which was derived from carcinoma and commonly [17,18]. The RT-PCR result showed that the expression level of these COAD stemness related genes was signi cantly increased in HT29 and RKO cells than in normal colon epithelial cells FHC (Fig. 4A). Also, the expression of PLK4, TTK, CHEK1, KIF18A and BUB1, as well as cancer stem cell marker genes, including CD44, CD90 CD133 and WNT-3A, was higher in the secondary spheres of HCT116 (Fig. 4B). In the GSE76402 cohort, PLK4, TTK, CHEK1 and BUB1 were higher expression in colon cancer while KIF18A was higher expression in colon tissues (Fig. 4C). Furthermore, the spheres of HCT116 were capable of the sphere-forming ability and expressed high CD133 and CD44 (Fig. 4D-E). According to the protein expression in colon cancer and normal tissues from HPA database, colon cancer has higher protein expression of TTK, PLK4 and KIF18A (Fig. 4F). BUB1 and CHEK1 were not reported in HPA database. These results were indicating that these COAD stemness related genes might be biomarker for colon cancer stemness. Interplays among the genes in the same module were analyzed by the online tool STRING and constructed with the PPI network. As shown in Supplementary Fig. 2A, 124 nodes and 182 edges in the brown module were formed in the network and the number of remarkable nodes was shown in bar plot. Moreover, 5 nodes were formed in the green module and each node had a connection with another ( Supplementary Fig. 2B).

Correlation of COAD stemness-related genes with immune cells in ltration.
Mounting evidence showed that high stemness of tumor is negatively correlated with antitumor immunity, leading to poor immunotherapeutic e cacy [19,20]. We rst calculated the immune cell purity in COAD using an estimation algorithm based on RNA-seq expression data and analyzed the correlation between COAD stem related genes and immune scores. As shown in Supplementary Fig. 3, immune scores were also inversely correlated with the expression of the COAD stemness related genes and mRNAsi scores. Moreover, we also interrogated the different level of COAD stemness related genes and mRNAsi scores in different immune stages (Fig. 5A). Interestingly, the expression of TTK and mRNAsi scores decreased with the immune stages ( Fig. 5B-C), there was no difference of expression of PLK4, CHEK1, KIF18A and BUB1 with the immune stage (Fig. 5D-G).
To further explore the association between the TIICs and COAD stemness, we analyzed the difference of in ltrating immune cells in different TTK level patients. CIBERSORT algorithm analysis showed that M1 Macrophages were signi cantly increased in patients with high TTK expression than in patients with low TTK expression, while memory resting CD4 + T cells were reduced in high TTK expression of patients than in low TTK expression of patients (Fig. 6A-B).
Finally, the Kaplan Meier curve was established to analyze the overall survival rate of COAD patients in different in ltration of immune cells and TTK expression. In Fig. 6C, the expression of TTK not affect overall survival of COAD patients. There was no statistical difference in survival rate between COAD patients in a different proportion of memory resting CD4 + T cells (Fig. 6D), but patients with low M1 Macrophages had a better prognosis (Fig. 6E). As shown in Supplementary Fig. 4, M1 Macrophages were signi cantly higher in patients with high COAD stemness related genes. This result con rmed that COAD stemness and related genes would affect the in ltration of immune cells.

Discussion
COAD is the fourth most deadly cancer in the world, with a mortality rate of approximately 50% every year worldwide [2]. Surgical treatment and chemotherapy of early COAD can signi cantly improve the survival time of patients, but advanced COAD with chemotherapy and radiotherapy often have drug resistance and liver metastasis. With the in-depth research on the tumor microenvironment, CSCs, a type of cell subset, in COAD have been found to play an important role in the drug resistance of patients and the regulating immune cells in the tumor microenvironment [7,8]. Tumor with high stemness has the characteristics of CSCs, which inhibit T cells and NK cell activation and nally leads to a poor outcome of immunotherapy [21][22][23][24]. But the in-depth mechanism of COAD stemness related genes resisting immunotherapy is not clear. In this study, we focus on identifying the genes related the COAD stemness and evaluating the correlation between COAD stemness-related genes and the immune cells in tumor environment.
We identi ed that as normal colon tissue developed into tumor tissue and deteriorates into lymph node metastasis status, mRNAsi scores become higher. Tumors with high stemness are more likely to migrate to other organs and lead to disease progression in patients [14], which is consistent with the increased mRNAsi scores in COAD. The result indicated that the stemness of COAD might be related to COAD progression and metastasis, but COAD patients with different mRNAsi scores had no signi cant difference in survival rate.
WGCNA analysis was applied to hierarchical cluster. the modules related the COAD stemness, and the brown and green modules were de ned as key modules. Functional annotations of genes from the brown module mainly focused on contractile ber, myo bril, actin binding and cGMP-PKG signaling pathway, which is down-expressed in COAD and closely related to COAD metastasis and recurrence [25,26].
Besides, functional analysis of key genes from the green module was only enriched in cell cycle. The genes from the green module were highly positively correlated with mRNAsi scores, which might remain the stemness of COAD. The PPI network and co-expression diagrams indicated that genes in the same module had a high co-expression and interaction correlation with each other. The genes in green module had a highly positive relation with COAD stemness, and they were up-expressed in the spheres of HCT116. Hence, PLK4, TTK, CHEK1, KIF18A and BUB1 were identi ed as COAD stemness-related genes from the green module.
The functions of COAD stemness-related genes were mainly enriched in the cell cycle, which explained the phenomenon that CSCs promote COAD in a highly self-renewing state [27]. The Polo-like kinases (PLKs) play a critical role in centrosome replication and cell division and checkpoint regulation of mitosis, and Plk4 over-expression or activity can cause chromosomal instability and mitotic errors which would promote colon cancer development [28,29]. TTK is an essential kinase to regulate chromosome alignment and error correction, which is positively correlated with the worse prognosis in several cancer [30,31]. High expression level of TTK in colon cancer might regulate the functions of spindle assembly checkpoint and mitochondria to enhance the survival of aneuploid cancer [32]. BUB1 mitotic checkpoint serine/threonine kinase (BUB1) could regulate the chromosome segregation and was reported over-expressing in many cancers and was also positively correlated with poor prognosis [33,34]. Bub1 insu ciency accelerates the rate at which whole chromosomes are lost or gained in ApcMin/+ mice, thereby perhaps accelerating Apc LOH and driving colonic tumorigenesis [35]. KIF18A, a member of the kinesin superfamily of molecular motor proteins, is a microtubule depolymerase and plays an important role in regulating the stability of microtubule plus ends in the mitotic spindle, and colon cancer cells with high expression of KIF18A show stronger migration [36].
Previous studies have found that colon CSCs inhibit immune cells activity in the tumor microenvironment and led to the poor outcome of immunotherapy [37][38][39]. we used the ESTIMATE algorithm to infer immune score and explored the correlation between immune score and mRNAsi score and COAD stemness-related genes [40]. The immune scores were found to be negatively correlated with the mRNAsi scores and COAD stemness-related genes. This would suggest that high expression of COAD stemnessrelated genes prevents the immunity of patients, which might cause the failure of the clinical effectiveness of immunotherapy.
Subsequently, we analyzed the association between TIICs and COAD stemness-related genes, and high TTK in COAD might lead to less in ltration of memory resting CD4 + T cells and more in ltration of M1 Macrophages. CD4 + T-cell activation is required for effective CD8 T-cell responses, and the higher in ltration of resting CD4 + T cells might lead to low CD8 T cells responses and poor immunotherapy [41].
This suggested that COAD with high stemness promoted poor outcome of immunotherapy by affecting the inactivation of memory CD4 + T cells. Interestingly, in COAD with high expression of TTK, the in ltration of M1 Macrophages was higher. Macrophages as a key driver of tumor in ammation, contribute to nurturing cancer stem cells and modulate CSC formation and maintenance by transferring into the M1 phenotype [42,43]. This evidence suggests that inactivation of CD4 + T cells and low in ltration of M1 Macrophages might be detrimental to cancer stemness related gene expression and immune escape. Overall survival analysis had elucidated signi cant roles for high M1 Macrophages in predicting prognosis for COAD patients, the COAD patients with low M1 Macrophages in ltration were considered to have a better prognosis. The result above demonstrated that the stemness of COAD was related to in ltrating immune cell in the tumor microenvironment.
In conclusion, PLK4, TTK, CHEK1, KIF18A and BUB1 were positively correlated with stemness of COAD, and TTK might be associated with CD4 T cells activation and M1 Macrophages in ltration in the tumor microenvironment. Also, the increased in ltration of memory activated CD4 + T cells and M1 Macrophages might further contribute to formation and immune escape of COAD with high stemness. TTK may have the potential as a prognostic biomarker and adjuvant target to promote immunotherapy sensitivity to improve outcomes in COAD. But conclusion derived from our study was mainly based on bioinformatic analysis, but it was not enough to re ect the gene expression in COAD with high stemness. Publicly available datasets were analyzed in this study, data can be download from TCGA(https://portal.gdc.cancer.gov/) and GEO (https://www.ncbi.nlm.nih.gov/geo/) database.

Competing interests
There is not any other con ict of interest by the all of the authors.   coe cient ranges from -1 to 1. Red represents the correlation coe cient is greater than 0, and blue represents the correlation coe cient is less than 0. Scatter plot of genes in modules shown in green(C), brown(D), red(E).   TTK and mRNAsi scores are negatively correlated to immunity stage. (A) 29 immune-related gene sets were used to cluster and classify the COAD patients into high immunity stage (n =22), middle immunity stage (n=372) and low immunity stage (n=88),respectively;(B-E) The difference of the expression level of BUB1, CHEK1, KIF18A and PLK4 in low immunity stage and high immunity stage(p>0.05);(F)The difference of the expression level of BUB1, CHEK1, KIF18A and PLK4 in low immunity stage and high immunity stage(p<0.05); (G) The difference of the mRNAsi scores in in low immunity stage and high immunity stage(p<0.001).