Dynamic immunity gene and lncRNA regulatory cooperative pairs revealed prognostic signatures in the development of colon cancer


Background: The pathological development of colon cancer is a complex process that depends on multiple alterations of coding and non-coding genes. Colon cancer cells are challenged by the immune system and acquire features to evade its surveillance throughout tumor initiation and development. Therefore, it is important to capture the immune regulatory events during colon cancer progression and to identify reliable markers for predicting clinical outcomes in patients. Methods: Here, a standardized computational procedure was developed to evaluate immune cell population-associated gene (immuneCPa gene)-lncRNA relationships in different stages (I, II, III and IV) of colon cancer based on gene and lncRNA expression. Stage-specific immuneCPa genes and lncRNAs were identified in each colon cancer stage. Results: Dynamic stage-specific immuneCPa gene-lncRNA regulatory networks were constructed and characterized in colon cancer development. An immuneCPa gene-lncRNA activity profile across different stages revealed that lncRNAs were highly stage-selective in regulating immuneCPa genes in colon cancer. Specificity scores indicated that different kinds of immune cells were temporally specific in colon cancer. NK CD56bright cells showed the strongest stage-common features in colon cancer. Further survival analysis indicated that some stage-specific immuneCPa gene-lncRNA relationships may have potential for predicting colon cancer prognosis. Conclusions: Collectively, our study provides a novel starting point for future functional explorations, the identification of immune-related biomarkers, and lncRNA-based targeted therapy for colon cancer.


Background
Colon cancer is the third most frequent cancer in the world, with 147,950 new cases worldwide in 2020 [1]. Even when patients are diagnosed at a localized stage, about 30% of those with nodal involvement (stage III) experience recurrence due to micrometastatic spread [2]. Colon cancer is a potentially preventable disease, and early-stage colon cancer is usually curable [3]. Although the survival of colon cancer patients has improved with the combination of different therapeutic approaches, including surgery, endoscopic ablation, radiation and chemotherapy, outcomes vary greatly between individual patients. In particular, patients with advanced cancer show a worse prognosis. Therefore, it is important to understand the mechanisms regulating the progression of colon cancer and to identify efficacious therapeutic targets and prognostic/predictive biomarkers for predicting clinical outcomes in patients.
Recent years have witnessed promising clinical results for anti-cancer immunotherapies. Immune cells contribute to invasion by secreting a cornucopia of inflammatory factors that promote epithelial-to-mesenchymal transition and remodeling of the stroma [4]. A previous study reported that the immunoscore is a robust and validated clinical assay that leverages immune scoring to predict the recurrence risk of colon cancer patients [5]. Immune-related genes play regulatory roles in the immune system and are involved in the initiation and progression of colon cancer [6]. Exploring the roles of immunity in different stages of colon cancer could therefore support immunotherapy for colon cancer. However, most current research focuses only on immune-related coding genes in cancers.
In recent years, emerging evidence has shown the importance of long non-coding RNAs (lncRNAs) as new regulators of many physiological and pathological processes [7,8]. lncRNAs also play essential roles in colon cancer. For example, LINC00662 overexpression promoted the occurrence and development of colon cancer [9]. LINC00460 might function as an oncogenic lncRNA in colon cancer development and could be explored as a potential biomarker and therapeutic target for colon cancer. Functions of lncRNAs in the immune response to cancer have also been identified. lncRNAs can modulate immune checkpoint molecules in cancer [10]. For example, lncRNA NKILA promotes tumor immune evasion by sensitizing T cells to activation-induced cell death [11]. Interference with lncRNA SNHG1 could inhibit the differentiation of Treg cells by promoting miR-448 expression, thereby impeding the immune escape of breast cancer [12]. These studies have significantly enhanced our understanding of lncRNA mechanisms in immunity and in the underlying disease progression. Unfortunately, only limited work has studied the dynamic immune-related gene-lncRNA interactions involved in colon cancer development.
To address this issue, the present study designed an integrated computational approach to identify immuneCPa gene-lncRNA pairs in different stages of colon cancer based on gene and lncRNA expression. Stage-specific immuneCPa genes and lncRNAs were extracted in each colon cancer stage. Dynamic stage-specific immuneCPa gene-lncRNA regulatory networks were constructed and characterized in colon cancer development. An immuneCPa gene-lncRNA activity profile across different stages revealed that lncRNAs were highly stage-selective in regulating immuneCPa genes in colon cancer. Specificity scores indicated that different kinds of immune cells were temporally specific in colon cancer. NK CD56bright cells showed the strongest stage-common features in colon cancer. Further survival analysis indicated that some stage-specific immuneCPa gene-lncRNA relationships may have potential for predicting prognosis in a specific stage. In summary, our systematic analysis not only sheds new light on the dynamic regulatory mechanisms of immuneCPa gene-lncRNA interactions, but may also help in colon cancer prognosis stratification and the discovery of therapeutic targets.

Materials And Methods
Obtaining high-throughput expression profiles of lncRNAs and genes for colon cancer
Gene and lncRNA expression profiles (Level 2) were obtained from The Cancer Genome Atlas (TCGA, https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga). A total of 464 colon cancer tissues and 101 tumor-adjacent normal control tissues were obtained, and clinical follow-up information, including cancer stage and survival data, was retained for further analysis. Finally, 90, 226, 160 and 81 samples from patients with stage I, II, III and IV colon cancer were obtained, respectively. Genes and lncRNAs whose expression values were 0 in all samples were removed. All expression values were log2-transformed to approximate a normal distribution.
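The filtering and transformation step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `preprocess`, the toy feature names and the pseudocount of 1 (added so that log2 is defined for zero values) are all assumptions, since the text does not state how zeros were handled.

```python
import math

def preprocess(expr, pseudocount=1.0):
    """Drop features that are 0 in every sample, then log2-transform.

    expr maps a feature name (gene or lncRNA) to a list of expression
    values, one per sample. The pseudocount is an assumption made here
    so that log2(0) never has to be evaluated.
    """
    kept = {}
    for name, values in expr.items():
        if all(v == 0 for v in values):
            continue  # not expressed in any sample: remove the feature
        kept[name] = [math.log2(v + pseudocount) for v in values]
    return kept

# Hypothetical toy matrix: three features measured in three samples.
toy = {
    "GENE_A": [0.0, 3.0, 7.0],
    "LNC_B": [0.0, 0.0, 0.0],
    "LNC_C": [1.0, 0.0, 15.0],
}
filtered = preprocess(toy)
```

Here `LNC_B` is removed because it is zero in all samples, while the remaining features are kept on a log2 scale.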

Collecting multiple kinds of immune cell population-associated genes
Seventeen immune cell populations were selected: B cells, Eosinophils, Macrophages, Mast cells, NK CD56bright cells, NK CD56dim cells, Neutrophils, T helper cells, Tcm cells, Tem cells, Tfh cells, aDC, iDC, Activated CD8 T cells, Gamma delta T cells, Regulatory T cells and Cytotoxic cells [13]. The genes associated with these immune cell populations (immuneCPa genes) were obtained from previous studies [14,15]. All immuneCPa genes were retained for further analysis.

Identifying colon cancer stage-specific immuneCPa genes and lncRNAs based on expression profiles
To identify colon cancer stage-specific immuneCPa genes and lncRNAs, a t-test was used to perform differential expression analysis between the expression profiles of colon cancer samples and normal control samples in each of the four stages. Significant immuneCPa genes (P<0.05) were considered colon cancer stage-specific immuneCPa genes. Significant lncRNAs (P<0.01) were considered colon cancer stage-specific lncRNAs. Only the colon cancer stage-specific immuneCPa genes and lncRNAs were used in the subsequent analyses.
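As an illustration of the per-feature comparison between tumor and normal samples, the sketch below computes a two-sample t statistic for one gene or lncRNA. The text does not say which variant of the t-test was used; Welch's unequal-variance form is assumed here, and the function name `welch_t` and the toy values are hypothetical. Converting the statistic into a P value requires the t distribution, which a statistics package would normally supply (for instance `scipy.stats.ttest_ind` with `equal_var=False`).

```python
import math
from statistics import mean, variance

def welch_t(tumor, normal):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n1, n2 = len(tumor), len(normal)
    # Squared standard error contributed by each group.
    v1, v2 = variance(tumor) / n1, variance(normal) / n2
    t = (mean(tumor) - mean(normal)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical log2 expression of one feature in one stage.
tumor = [1.0, 2.0, 3.0, 4.0, 5.0]
normal = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = welch_t(tumor, normal)
```

In a full analysis this would be repeated for every immuneCPa gene and lncRNA in each stage, and features would be kept when the resulting P value falls below the stated cutoff (0.05 for genes, 0.01 for lncRNAs).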

Construction of stage-specific immuneCPa gene-lncRNA regulatory networks in colon cancer development
For each kind of immune cell population, Pearson correlation coefficients (PCCs) were calculated between each colon cancer stage-specific immuneCPa gene and each stage-specific lncRNA, separately for colon cancer patients in each stage. The PCC and P value were obtained for each stage-specific immuneCPa gene-lncRNA pair. Only the stage-specific immuneCPa gene-lncRNA pairs with absolute PCC values greater than 0.5 were used for the subsequent analyses and network construction. The stage-specific immuneCPa gene-lncRNA regulatory networks were constructed with Cytoscape 3.3.0 (https://cytoscape.org/). Degree analysis was also performed with Cytoscape 3.3.0.
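The co-expression filter can be sketched in a few lines. This is an illustrative outline only: the helper names `pearson` and `build_edges`, the toy expression vectors and the choice to return 0 for a constant vector are assumptions, and the degree count stands in for the degree analysis that the study performed in Cytoscape. The P value of each correlation is not computed here.

```python
import math
from collections import Counter

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: treat a constant vector as uncorrelated
    return sxy / math.sqrt(sxx * syy)

def build_edges(genes, lncrnas, cutoff=0.5):
    """Keep gene-lncRNA pairs whose |PCC| exceeds the cutoff."""
    edges = []
    for gene, gx in genes.items():
        for lnc, lx in lncrnas.items():
            r = pearson(gx, lx)
            if abs(r) > cutoff:
                edges.append((gene, lnc, r))
    return edges

# Hypothetical expression across four patients of one stage.
genes = {"GENE_A": [1.0, 2.0, 3.0, 4.0]}
lncrnas = {
    "LNC_UP": [2.0, 4.0, 6.0, 8.0],
    "LNC_DOWN": [4.0, 3.0, 2.0, 1.0],
    "LNC_FLAT": [1.0, -1.0, -1.0, 1.0],
}
edges = build_edges(genes, lncrnas)
degree = Counter(gene for gene, _, _ in edges)  # edges per gene
```

Both positively and negatively correlated lncRNAs pass the filter because the cutoff is applied to the absolute value, so `GENE_A` ends up with two edges.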

The specificity score of stage-specific immuneCPa gene-lncRNA pairs for colon cancer
The specificity of each stage-specific immuneCPa gene-lncRNA pair across different stages of colon cancer was determined by the specificity score (see Equation 1 in the Supplemental Files), where N is the number of colon cancer stages and PCC_i is the component for stage i, normalized to the maximum PCC value. The specificity score was used to evaluate the stage specificity of each stage-specific immuneCPa gene-lncRNA pair in colon cancer.

Survival analyses for stage-specific immuneCPa gene-lncRNA pairs in colon cancer
To evaluate the prognostic value of stage-specific immuneCPa gene-lncRNA pairs in colon cancer, we performed survival analysis for the four stage groups. To verify whether these pairs were associated with survival, we used the regression coefficient of each immuneCPa gene and lncRNA in a stage-specific immuneCPa gene-lncRNA pair, estimated from gene and lncRNA expression data in relation to patient survival. First, the colon cancer patients in each stage were randomly divided into two independent groups, one for constructing the model and one for validating it. Second, a multivariate Cox regression model was applied to each immuneCPa gene and its interacting lncRNAs in each colon cancer stage to obtain standardized Cox regression coefficients in the first group. Age, cancer stage and sex were also considered as confounders in this process. Third, a risk score formula was constructed for the held-out group based on the expression values of each immuneCPa gene and its interacting lncRNAs, weighted by the regression coefficients estimated in the above multivariate Cox regression analysis. Fourth, the median risk score was used as the threshold to divide the colon cancer patients into high-risk and low-risk groups. Finally, Kaplan-Meier (K-M) survival analysis was performed for the two risk groups. Statistical significance was assessed using the log-rank test. All analyses were performed within the R 2.6.6 framework.
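The third and fourth steps, scoring held-out patients and splitting them at the median, can be sketched as below. The coefficients are hypothetical placeholders standing in for values that would come from the Cox model fitted on the first group; the function names, patient identifiers and the rule that patients exactly at the median are assigned to the low-risk group are likewise assumptions, since the text does not specify them. Fitting the Cox model and running the Kaplan-Meier and log-rank analyses are outside this sketch.

```python
from statistics import median

def risk_scores(expr_by_patient, coefs):
    """Risk score = sum over features of coefficient * expression."""
    return {
        pid: sum(coefs[f] * values[f] for f in coefs)
        for pid, values in expr_by_patient.items()
    }

def split_by_median(scores):
    """Divide patients into high- and low-risk groups at the median."""
    cut = median(scores.values())
    high = sorted(p for p, s in scores.items() if s > cut)
    low = sorted(p for p, s in scores.items() if s <= cut)  # ties: low risk
    return cut, high, low

# Hypothetical coefficients for one immuneCPa gene and one partner lncRNA.
coefs = {"GENE_A": 0.5, "LNC_B": -0.25}
# Hypothetical expression values for four held-out patients.
held_out = {
    "P1": {"GENE_A": 2.0, "LNC_B": 4.0},
    "P2": {"GENE_A": 6.0, "LNC_B": 0.0},
    "P3": {"GENE_A": 4.0, "LNC_B": 4.0},
    "P4": {"GENE_A": 8.0, "LNC_B": 8.0},
}
scores = risk_scores(held_out, coefs)
cutoff, high_risk, low_risk = split_by_median(scores)
```

Because the coefficients are estimated on one group and applied unchanged to the other, the resulting high-risk and low-risk labels provide an independent check of the model.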

Results
Some stage-specific immuneCPa genes and lncRNAs were identified in colon cancer
We first identified colon cancer-specific genes among the 17 kinds of immuneCPa genes using patients of all stages. In 15 of these kinds, more than 70% of the immuneCPa genes were differentially expressed (Figure 1A). In particular, 96.97% of B cell-associated genes were differentially expressed across all colon cancer patients. This result indicated that immuneCPa genes play important roles in colon cancer. Next, stage-specific immuneCPa genes were identified in the different colon cancer stages. For each kind of immuneCPa gene, the percentage of differentially expressed genes showed a similar pattern across stages (Figure 1B). The largest number of stage-specific immuneCPa genes was found in patients with stage II colon cancer, which is a critical stage of cancer progression. This result suggested that stage II is an active stage in which the immune system fights cancer cells. Different stages also shared some common immuneCPa genes (Figure 1C). For example, the four stages had 14, 9 and 10 common immuneCPa genes in B cells, Gamma delta T cells and Eosinophils, respectively. We also extracted differentially expressed lncRNAs (61%) for all colon cancer patients (Figure 1D). Similar to immuneCPa genes, the largest number of differentially expressed lncRNAs was found in stage II of colon cancer. The four stages shared more common differentially expressed lncRNAs than immuneCPa genes (Figure 1E). We inferred that some key lncRNAs may have essential functions in specific colon cancer stages.

Identification of stage-specific immuneCPa gene-lncRNA regulatory pairs in colon cancer development
Two major factors, interactions and similar expression, were used to extract stage-specific immuneCPa gene-lncRNA regulatory pairs. Experimentally verified gene-lncRNA interactions were considered candidate regulatory pairs. Because an interaction does not directly imply that an immuneCPa gene-lncRNA pair is actually regulated under a given condition, exploring gene regulation by lncRNAs through co-expression analysis can offer useful information for identifying active immuneCPa gene-lncRNA relationships in different colon cancer stages. PCCs were calculated for each potential immuneCPa gene-lncRNA pair based on their expression profiles at different stages (Figure 2A). Stages II and III showed the most similar distributions of PCCs for gene-lncRNA pairs. Most PCCs of gene-lncRNA pairs were concentrated between 0.5 and 1 or between -0.5 and -1 (Figure 2B). This indicated that most gene-lncRNA pairs were strongly co-expressed. The numbers of co-expressed gene-lncRNA pairs differed among the kinds of immune cells (Figure 2C). For example, Tem and Tfh cells had the most co-expressed gene-lncRNA pairs. In addition, the most co-expressed gene-lncRNA pairs were present in stage III for almost all kinds of immune cells (Figure 2D). The different sizes of these stage-specific sets of immuneCPa gene-lncRNA pairs indicate the heterogeneity of immune-associated genes and lncRNAs in the development of colon cancer.

Construction of dynamic stage-specific immuneCPa gene-lncRNA regulatory networks in colon cancer development
We used P<0.05 as the threshold to identify a link between immuneCPa genes and lncRNAs in the regulatory networks. There were 91,962, 106,834, 165,589 and 118,431 immuneCPa gene-lncRNA pairs in stages I, II, III and IV, respectively (Figure 3A). We then used absolute PCC values > 0.9 as the threshold to construct tighter stage-specific immuneCPa gene-lncRNA regulatory networks in colon cancer development (Figure 3B). Specifically, 19,929 edges between 408 immuneCPa genes and 3,768 lncRNAs, 75,259 edges between 408 immuneCPa genes and 4,920 lncRNAs, 52,635 edges between 409 immuneCPa genes and 3,983 lncRNAs and 118,356 edges between 407 immuneCPa genes and 4,389 lncRNAs were constructed for stage I, II, III and IV colon cancer patients, respectively. In these stage-specific immuneCPa gene-lncRNA regulatory networks, there were some key hub nodes, such as immuneCPa gene DCSTAMP, KIR3DL3 and SND1-IT1, which could interact with more than 400 lncRNAs. In all four stage-specific immuneCPa gene-lncRNA regulatory networks, immuneCPa genes had higher degrees than lncRNAs (Figure 3C). The degree analysis indicated that a common immuneCPa gene could be regulated by a large number of diverse lncRNAs in colon cancer development. lncRNAs may play specific regulatory roles for immuneCPa genes in different stages of colon cancer.

The dynamic activity profiles of immuneCPa gene-lncRNA regulatory pairs
Although stage-specific networks share common topological properties, the immuneCPa gene-lncRNA regulatory interactions may change in different stages of colon cancer. To evaluate the proportion of common and specific immuneCPa gene-lncRNA regulatory interactions during colon cancer progression, we explored the overlaps of immuneCPa gene-lncRNA regulatory relationships among the four stage-specific networks (Figure 4A). Most immuneCPa gene-lncRNA regulatory relationships were stage-specific. Stage III had the maximum number of immuneCPa gene-lncRNA regulatory relationships (Figure 4B). We inferred that stage III is a key stage in which lncRNAs participate in immune regulation in colon cancer. Among all immuneCPa gene-lncRNA regulatory relationships, 378,658 were present in only one stage. The four stages shared only 249 common immuneCPa gene-lncRNA regulatory relationships, indicating that the relationships were temporally specific in colon cancer (Figure 4C). The immuneCPa gene-lncRNA regulatory relationships changed dynamically during colon cancer development (Figure 4D). To provide an overview of all possible immuneCPa gene-lncRNA relationships and their dynamic regulatory status, we built an activity profile for immuneCPa gene-lncRNA relationships across different stages of colon cancer. The activity score is the standardized value of the PCCs. Based on the activity scores, these immuneCPa gene-lncRNA relationships were grouped by the K-means clustering method (Figure 4E). Different groups of immuneCPa gene-lncRNA relationships were apparently activated at one or more stages. The patterns of immuneCPa gene-lncRNA relationships differed between groups. These groups may serve as stage-specific biomarkers for colon cancer.
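The overlap analysis among the four stage-specific networks amounts to counting, for each gene-lncRNA edge, the number of stages in which it appears. The sketch below shows one way to do this with sets; the function name `stage_overlap` and the toy edges are hypothetical and are not taken from the study's data.

```python
from collections import Counter

def stage_overlap(stage_edges):
    """Return (edges seen in exactly one stage, edges seen in all stages)."""
    counts = Counter()
    for edges in stage_edges.values():
        counts.update(set(edges))  # count each edge once per stage
    n_stages = len(stage_edges)
    specific = {e for e, c in counts.items() if c == 1}
    shared = {e for e, c in counts.items() if c == n_stages}
    return specific, shared

# Hypothetical (gene, lncRNA) edges for the four stages.
stage_edges = {
    "I": {("G1", "L1"), ("G2", "L2")},
    "II": {("G1", "L1"), ("G3", "L3")},
    "III": {("G1", "L1"), ("G3", "L3"), ("G4", "L4")},
    "IV": {("G1", "L1")},
}
specific, shared = stage_overlap(stage_edges)
```

In this toy input two edges are stage-specific and one is common to all four stages; the edge found in stages II and III only falls into neither category.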

Specificity score evaluates the specificity of immuneCPa gene-lncRNA relationships in colon cancer development
The specificity score was designed to evaluate the specificity of immuneCPa gene-lncRNA relationships in colon cancer development. Most immuneCPa gene-lncRNA relationships showed lower specificity scores, which indicated that these relationships were stage-specific in colon cancer (Figure 5A). To extract more stage-specific immuneCPa genes, we used the mean specificity score of all immuneCPa gene-lncRNA relationships of a gene to represent the specificity of that immuneCPa gene. These stage-specific immuneCPa genes could interact with different numbers of lncRNAs in colon cancer (Figure 5B). The mean specificity scores for each kind of immune cell were also obtained. Mast cells and cytotoxic cells showed the strongest stage specificity in colon cancer (Figure 5C). The specificity scores of most immuneCPa genes in these two kinds of immune cells were concentrated between 0.06 and 0.08 (Figure 5D). NK CD56bright cells showed the strongest stage-common features in colon cancer, which indicated that they may participate in the entire immune process during colon cancer development. These results also revealed that different kinds of immune cells were temporally specific in colon cancer.

Some immuneCPa gene-lncRNA relationships in colon cancer development have specific prognostic potential
To evaluate the potential value of immuneCPa gene-lncRNA pairs as prognostic biomarkers in colon cancer of different stages, we created a risk-score formula according to the expression of each immuneCPa gene and its corresponding lncRNAs to predict overall survival (OS) (see the Materials And Methods section). We used the median risk score as the cut-off to divide the colon cancer patients of each stage into high-risk and low-risk groups and compared their survival. Three immuneCPa gene-lncRNA clusters, associated with the Macrophage, iDC and Neutrophil cell populations, were identified (Figure 6A). These three clusters, two stage II-specific and one stage III-specific, were significantly associated with survival and could serve as prognostic biomarkers (Figure 6B).

Discussion
In the past few years, a number of studies have successfully characterized the roles of lncRNAs based on the general assumption that lncRNAs are key regulators of biological processes in many kinds of cancers. Although many studies have characterized the complex and multiple functions of lncRNAs, including in cell growth [16], apoptosis [17], autophagy and epithelial-mesenchymal transition [18], little is known about the roles of lncRNAs in immune regulation in colon cancer.
Preliminary studies have shown that lncRNAs play important roles in the immune system and have become a focus of immunology [19]. Thus, it is urgent to comprehensively characterize the functions and mechanisms of lncRNAs in immune regulation in cancers. In the present study, we used immune-related genes as a bridge to identify immune-related lncRNAs in colon cancer.
The pathological development of colon cancer is a complex process that depends on multiple alterations of coding and non-coding genes. Although our understanding of colon cancer has increased, the precise mechanisms of immune regulation underlying this complex disease are still not fully known. Therefore, it is important to uncover the immune regulatory events during the progression of colon cancer and to identify reliable markers for predicting the clinical outcome of patients. In particular, exploring the regulatory mechanisms of lncRNAs in different stages of colon cancer could help us understand the dynamic changes of the immune system across these stages. The present study indicated that stage II was a key immune regulatory stage, in which the largest number of dysregulated immuneCPa genes was found. Most immuneCPa gene-lncRNA relationships were present in stage III, which suggested that this stage was a critical period for lncRNAs to participate in regulation. In short, stages II and III were crucial periods for immunotherapy of colon cancer.
To evaluate whether immuneCPa gene-lncRNA pairs could serve as stage-specific prognostic biomarkers in colon cancer, we constructed risk scores to identify survival-related immuneCPa gene-lncRNA pairs. In this process, we divided the patients of each colon cancer stage into two independent sets. One set was used for Cox regression analysis and the other, held-out set was used for validating the model. This approach helps to avoid overfitting. These survival-related immuneCPa gene-lncRNA pairs could serve as potential stage-specific prognostic biomarkers for colon cancer.

Conclusion
Collectively, the present study developed a standardized computational procedure to evaluate immuneCPa gene-lncRNA relationships in different stages of colon cancer based on gene and lncRNA expression. Stage-specific immuneCPa genes and lncRNAs were identified in each colon cancer stage. Dynamic stage-specific immuneCPa gene-lncRNA regulatory networks were constructed and analyzed in colon cancer development. An immuneCPa gene-lncRNA activity profile across different stages revealed that lncRNAs were highly stage-selective in regulating immuneCPa genes in colon cancer. In summary, our study provides a novel starting point for future functional explorations, the identification of immune-related biomarkers, and lncRNA-based targeted therapy for colon cancer.

Supplementary Files
Supplementary file associated with this preprint: Equation1.pdf