Bioinformatic Analysis: The Role of LINC00634 in Colorectal Carcinoma

Background: LINC00634 is highly expressed in esophageal cancer, and its depletion can suppress the viability and induce the apoptosis of esophageal cancer cells. However, there is a lack of studies examining the relationship between LINC00634 expression and the clinicopathological features, survival outcomes, prognostic factors and tumor immune cell infiltration of colorectal carcinoma (CRC) patients. Objective: We aimed to investigate the role of LINC00634 in colorectal carcinoma. Methods: We obtained data from The Cancer Genome Atlas (TCGA) public database, the Genotype-Tissue Expression (GTEx) database and clinical samples. The Wilcoxon rank-sum test, Kruskal-Wallis test and logistic regression analysis were employed to assess the relationship between LINC00634 expression and the clinicopathological characteristics of CRC patients. A receiver operating characteristic (ROC) curve was constructed to evaluate the ability of LINC00634 to distinguish between CRC patients and normal subjects based on the area under the curve (AUC). Univariate and multivariate analyses were conducted to evaluate the association between prognostic factors and survival outcomes. Kaplan-Meier curves and Cox regression analysis were employed to determine the contribution of LINC00634 expression to the prognosis of colorectal carcinoma patients. Immune infiltration analysis and Gene Set Enrichment Analysis (GSEA) were conducted to identify the functions in which LINC00634 is significantly involved. Finally, a nomogram was constructed for internal verification based on the Cox regression data. Results: The expression of LINC00634 was upregulated in CRC patients and was markedly associated with N stage, residual tumor, pathological stage, and overall survival (OS) event. The ROC curve showed that LINC00634 had strong diagnostic and prognostic abilities (AUC = 0.74). High expression of LINC00634 predicted poor disease-specific survival (DSS; P = 0.008) and poor overall survival (OS; P < 0.01). The expression of LINC00634 was independently associated with OS in CRC patients (P = 0.019). GSEA and immune infiltration analysis demonstrated that LINC00634 expression was involved in gene transcription, epigenetic regulation and the functions of certain types of immune infiltrating cells. The c-index of the nomogram was 0.772 (95% CI: 0.744-0.799). Conclusions: Our study reveals that LINC00634 can serve as a potential prognostic biomarker for CRC patients.


Introduction
Colorectal carcinoma (CRC) is a common malignant tumor. Global cancer statistics for 2020 show that CRC ranks third among cancers worldwide, and its mortality rate ranks second among cancer-related deaths in both men and women (1). The increasing incidence of CRC is mainly attributed to changes in lifestyle and eating habits. Specifically, a high consumption of animal-derived food and a sedentary lifestyle can lead to a decrease in physical activity and an increase in body weight, which are independently associated with the risk of CRC (2). Other risk factors include heavy drinking, smoking, and consumption of red or processed meats (3). Despite advances in diagnostic technology and treatment, predicting the prognosis of CRC patients is still a major challenge. Therefore, finding new molecular markers associated with CRC is urgently needed to improve the diagnosis and prognosis of this disease.
Long intergenic non-coding RNAs (LINCRNAs) are transcripts longer than 200 nt that are not capable of encoding proteins. However, LINCRNAs can control gene expression at the epigenetic, transcriptional, and post-transcriptional levels (4). In recent years, LINCRNAs have been widely studied as potential key factors in cancer cell regulation. LINCRNAs are abnormally expressed in almost all cancers, and they play pivotal roles in promoting and maintaining the occurrence and development of tumors. This suggests the clinical potential of LINCRNAs as biomarkers and therapeutic targets (5,6). Several studies have found that LINCRNAs can interact with RNA, proteins and lipids, and act as key mediators in cancer-related signal transduction pathways, which eventually affects the angiogenesis, proliferation, migration and invasion of tumor cells (7)(8)(9). A recent study reported that LINC00634 is highly expressed in esophageal cancer and activates BCL2L1 by sponging miR-342-3p to regulate cell viability and apoptosis, thus promoting the malignant progression of esophageal cancer cells (10). However, there are only a few reports about the relationship between LINC00634 and colorectal carcinoma.
In this research, the expression data were retrieved from The Cancer Genome Atlas (TCGA) public database, the GTEx database and clinical samples to compare the expression levels of LINC00634 between tumor tissues and normal colorectal tissues and to examine the association between LINC00634 expression and the clinicopathological features of CRC patients. Subsequently, the potential of LINC00634 for predicting the prognosis of CRC patients was evaluated. Moreover, bioinformatic analysis was conducted on the LINC00634 upregulation and downregulation groups to reveal the underlying biological functions. Furthermore, we determined the association between LINC00634 expression and immune cell infiltration, and explored the underlying mechanism of LINC00634 in regulating the occurrence and development of CRC.

Data collection and bioinformatics analysis
The clinical data of TCGA CRC tissues (n = 647) and matched normal colorectal tissues (n = 51) were obtained from the TCGA database (https://genome-cancer.ucsc.edu/). To detect LINC00634 expression, the data for tumor tissues (n = 383) were obtained from the TCGA database, while those for normal colorectal tissues (n = 359) combined normal tissues from the TCGA and GTEx databases. The validation dataset was composed of tumor tissues from 40 CRC patients who received radical surgery and 40 adjacent tissues from the specimen library of Guangzhou First People's Hospital. None of the patients had received radiotherapy or chemotherapy before surgery. The data were converted from level 3 HTSeq fragments per kilobase per million (FPKM) format to transcripts per million (TPM) format for further analyses.
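The FPKM-to-TPM conversion mentioned above rescales each sample so that its values sum to one million. The following is a minimal sketch in Python, assuming one list of FPKM values per sample; the function name and the toy values are illustrative and are not taken from the study's pipeline.

```python
def fpkm_to_tpm(fpkm):
    """Convert the FPKM values of ONE sample to TPM.

    TPM_i = FPKM_i / sum(FPKM) * 1e6, so every sample sums to one million.
    """
    total = sum(fpkm)
    if total <= 0:
        raise ValueError("FPKM values must sum to a positive number")
    return [value / total * 1e6 for value in fpkm]


# Toy sample with three genes (hypothetical values).
sample_fpkm = [10.0, 30.0, 60.0]
sample_tpm = fpkm_to_tpm(sample_fpkm)
print(sample_tpm)
```

Because the conversion is applied per sample, TPM values are comparable across samples in a way that raw FPKM values are not.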

Real-time quantitative reverse transcription PCR (qRT-PCR) analysis
Trizol reagent (Invitrogen) was used to isolate total RNA from the cells. qRT-PCR was performed according to a previously described method (11). All data were normalized to the internal control GAPDH.

Functional enrichment analysis for LINC00634
Gene Set Enrichment Analysis (GSEA) has been widely used to determine whether a set of genes shows a significant difference between two biological states (12). To assess the variation between the LINC00634 upregulation and downregulation groups, GSEA was performed using the package 'clusterProfiler' in R v3.6.0 (12,13). Based on the default settings, the procedure was repeated 5,000 times for each analysis, and c2.all.v7.2.symbols.gmt (Curated) in the MSigDB Collections was selected as the reference gene collection. A false discovery rate (FDR) < 0.25 and an adjusted p-value < 0.05 were deemed to indicate significant enrichment.
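To illustrate the statistic behind GSEA, the sketch below computes a weighted running-sum enrichment score for one gene set over a ranked gene list. It is written in Python for illustration only; the study itself used the R package 'clusterProfiler', and the gene names, ranking scores and function name here are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: gene names sorted from most up- to most down-regulated.
    scores: ranking metric for each gene, in the same order.
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_weights = [abs(s) ** weight if h else 0.0 for s, h in zip(scores, hits)]
    total_hit = sum(hit_weights)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the list without covering it")

    running, best = 0.0, 0.0
    for w, is_hit in zip(hit_weights, hits):
        # Step up at gene-set members, step down at all other genes.
        running += w / total_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


genes = ["A", "B", "C", "D", "E", "F"]          # hypothetical ranked genes
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]      # hypothetical ranking metric
es_top = enrichment_score(genes, metric, {"A", "B"})
es_bottom = enrichment_score(genes, metric, {"E", "F"})
print(es_top, es_bottom)
```

A set concentrated at the top of the ranking gives a positive score and a set concentrated at the bottom gives a negative one. In a full analysis the score is normalized and compared with permutation-derived null scores to obtain the NES and FDR.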

Immune cell infiltration analysis
Single-sample GSEA (ssGSEA) was conducted to analyze the immune cell infiltration of CRC by using the package 'GSVA' in R v3.6.0 (http://www.bioconductor.org/packages/release/bioc/html/GSVA.html) (14). The RNA-seq and clinical data were retrieved from the TCGA (https://portal.gdc.cancer.gov/) COADREAD project. All data were filtered to remove adjacent tissues and converted into transcripts per million (TPM), followed by log2 conversion for subsequent analysis. To determine the relationship between LINC00634 and the infiltration levels of 24 immune cell types, the P-value was calculated by Spearman's rank correlation and Wilcoxon rank-sum tests.
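The correlation step can be sketched as follows: Spearman's rho is the Pearson correlation of the ranks of the two variables, with tied values receiving their average rank. This is a minimal standard-library Python sketch; the expression values and infiltration scores are hypothetical and the function names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical log2(TPM + 1) expression and ssGSEA scores for five samples.
expression = [1.2, 2.5, 3.1, 4.8, 5.0]
tcm_score = [0.9, 0.7, 0.6, 0.4, 0.1]
rho = spearman_rho(expression, tcm_score)
print(rho)
```

A negative rho, as in this toy example, corresponds to lower infiltration scores in samples with higher expression.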

Statistical analysis
All statistical tests were carried out with R v3.6.3. Chi-square, logistic regression, Fisher's exact and Wilcoxon rank-sum tests were employed to assess the relationship between LINC00634 expression and the clinicopathological characteristics of CRC patients. The survival package was used for the statistical analysis of survival data, the survminer package was applied for visualization, and the Kaplan-Meier method was employed to calculate the disease-specific survival (DSS) and overall survival (OS) rates of CRC patients in the TCGA datasets. To explore the association between clinical characteristics and OS rates, univariate and multivariate analyses were conducted using the Cox proportional hazards model. A P-value of < 0.05 was considered statistically significant. Finally, R was used to draw a nomogram and establish a predictive model.
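As an illustration of the Kaplan-Meier method named above, the sketch below computes the product-limit survival estimate from follow-up times and event indicators. It is a standard-library Python sketch, not the R 'survival' code used in the study, and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times: follow-up time per patient.
    events: 1 if the event (e.g. death) was observed, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up (months) for five patients; 0 marks censoring.
follow_up = [5, 8, 8, 12, 15]
status = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(follow_up, status)
print(km_curve)
```

Censored patients lower the number at risk without producing a step in the curve, which is why the estimate changes only at times 5, 8 and 12 in this example.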

High expression of LINC00634 is related to the clinicopathological features of CRC patients
The expression levels of LINC00634 in 480 CRC tissues and 41 neighbouring tissues were examined, and we observed that LINC00634 was highly expressed in CRC tissues (P < 0.001, Fig. 1A). We also analyzed the expression of LINC00634 in 50 CRC tissues and their matched adjacent tissues. The results demonstrated that the expression of LINC00634 was also higher in CRC tissues than in control tissues (P < 0.01, Fig. 1B). Because of the limited number of normal tissues, data obtained from GTEx were added to the control group, and the results further confirmed that LINC00634 was highly expressed in CRC tissues compared with normal tissues (P < 0.01, Fig. 1C). The validation dataset yielded the same result (P < 0.01, Fig. 2A). Collectively, these findings indicate that LINC00634 is markedly overexpressed in CRC tissues.
Next, the ability of LINC00634 expression to distinguish between CRC patients and normal subjects was evaluated based on the receiver operating characteristic (ROC) curve. The area under the curve (AUC) of LINC00634 was 0.740, indicating that LINC00634 could serve as a good biomarker for distinguishing CRC from normal tissue (Fig. 1D).
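The AUC can be read as the probability that a randomly chosen tumor sample has a higher expression value than a randomly chosen normal sample. The sketch below computes it directly from that definition in standard-library Python; the expression values are hypothetical and are not the study data.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical expression values for four tumor and four normal samples.
tumor = [2.1, 3.4, 1.8, 4.0]
normal = [1.0, 2.0, 1.9, 2.5]
auc = roc_auc(tumor, normal)
print(auc)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.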
The clinicopathological features of CRC patients are listed in Table 1. As mentioned above, the clinical and gene expression data of 644 primary CRC patients were obtained from the TCGA database. Based on the mean expression of LINC00634, the CRC patients were assigned to LINC00634 upregulation (n = 322) and downregulation (n = 322) groups. Subsequently, we evaluated the association between LINC00634 expression and the clinicopathological features of CRC patients. The results of the Chi-square or Fisher's exact test indicated that LINC00634 expression was associated with primary treatment outcome (P < 0.05), pathologic stage (P < 0.05), N stage (P < 0.05), residual tumor (P < 0.01) and OS event (P < 0.05). Logistic regression analysis was performed to further verify the relationship between LINC00634 expression and the clinicopathological features of CRC patients. The findings demonstrated that LINC00634 expression was associated with N stage (P = 0.016), pathological stage (P = 0.008) and residual tumor (P = 0.018; Fig. 3; Table 2).
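The comparison of a clinical feature between the two expression groups can be sketched as a chi-square test on a 2x2 contingency table. The Python sketch below uses the Pearson statistic without continuity correction and the exact one-degree-of-freedom tail probability; the counts are hypothetical and do not come from Table 1.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    table: [[a, b], [c, d]] of observed counts.
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows = low / high expression group,
# columns = feature absent / present.
counts = [[30, 10], [20, 20]]
chi2, p = chi_square_2x2(counts)
print(chi2, p)
```

When expected counts are small, Fisher's exact test is the usual alternative, which matches the choice between the two tests described in the text.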

LINC00634 expression is related to poor prognosis in CRC patients
Kaplan-Meier survival curves were used to determine the correlation between LINC00634 expression and the OS/DSS of CRC patients. The results indicated that high LINC00634 expression was associated with poor OS and DSS in CRC patients (P < 0.01, Fig. 4A and P = 0.008, Fig. 4B, respectively).

Identification of prognostic factors for OS in CRC
The Cox univariate and multivariate analyses of prognostic factors for OS in CRC patients are shown in Table 3. Notably, the significant univariate variables were age (P < 0.01), LINC00634 (P = 0.039), T/N/M stage (P < 0.01) and residual tumor (P < 0.001). Multivariate analysis further showed that age (P = 0.002), LINC00634 (P = 0.019), T stage (P < 0.05), M stage (P < 0.05) and residual tumor (P < 0.05) were independent prognostic factors for OS in CRC patients. Based on the Cox regression analyses, a nomogram was constructed for internal verification, and a predictive model was established (Fig. 5). In internal verification, the c-index of the nomogram was 0.772 (95% CI: 0.744-0.799).
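The c-index reported for the nomogram measures how often the model assigns the higher risk to the patient who experiences the event earlier. The sketch below is a simplified Harrell-type concordance calculation in standard-library Python; it ignores tied survival times, and the follow-up data and risk scores are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Simplified Harrell's c-index.

    A pair (i, j) is comparable when patient i had an observed event
    before patient j's follow-up time. The pair is concordant when
    patient i also has the higher predicted risk; tied risks count half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data for four patients; 0 marks censoring.
months = [2, 4, 6, 8]
died = [1, 1, 0, 1]
predicted_risk = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(months, died, predicted_risk)
print(c_index)
```

A value of 0.5 indicates predictions no better than chance and 1.0 indicates perfect ranking of patients by risk.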

LINC00634 expression and its significantly involved functions
GSEA was carried out to explore the biological functions related to LINC00634 expression. There were 55 datasets with significantly differential enrichment in the LINC00634 upregulation group, and we identified 6 target datasets with high normalized enrichment scores (NES). Most of these pathways were associated with gene transcription and gene methylation (Fig. 6).

LINC00634 expression and immune cell infiltration
The relationship between LINC00634 expression and immune cell infiltration was examined by ssGSEA (Fig. 7A). Our findings demonstrated that LINC00634 expression was negatively associated with the infiltration levels of T central memory (Tcm) cells (P < 0.01, Fig. 7B

Discussion
In this study, we sought to determine the expression of LINC00634 in CRC and its association with the diagnosis and prognosis of CRC. Thus far, there are only a few studies about the relationship between LINC00634 and tumors. A recent study found that LINC00634 was upregulated in esophageal cancer tissues, and its expression was related to the TNM staging of esophageal cancer patients. Mechanistically, LINC00634 sponged miR-342-3p to relieve the inhibition of its downstream target gene BCL2L1, which promoted the proliferation of esophageal cancer cells and inhibited cell apoptosis (10). Our results indicated that LINC00634 was highly expressed in CRC tissue and was related to the poor prognosis of the patients. Many studies have reported the key roles of DNA methylation and RNA polymerase I in human cancers (15,16). Here, we also focused on predicting the potential mechanism of LINC00634 in regulating the occurrence of CRC. GSEA functional enrichment results showed that LINC00634 was involved in the processes of DNA methylation and RNA polymerase I transcription, and affected gene expression at the transcription level. These results corroborate that LINC00634 is abnormally expressed in CRC and may affect the progression of CRC by participating in the processes of DNA methylation and RNA polymerase I transcription.
An extensive body of research has shown that LINCRNAs are becoming key molecules in immune regulation (17)(18)(19)(20) and play functional roles in regulating cancer immunity and the tumor immune microenvironment (TME) (21). Furthermore, the correlation between LINCRNAs and immune cell infiltration has also been explored in certain cancers, which implies the potential of LINCRNAs for evaluating the immune cell infiltration of tumors (22,23). Another purpose of this study was to explore the association between LINC00634 expression and the infiltration levels of various immune cells in CRC.
Our results demonstrated a correlation between LINC00634 expression and the infiltration levels of Tcm and NK CD56 bright cells in CRC. Moreover, LINC00634 may inhibit the functions of Tcm cells and promote the function of NK CD56 bright cells, thereby exerting its potential functions in CRC.
Cox regression analysis revealed that LINC00634 expression, T stage, M stage, residual tumor and age were independent prognostic factors for colorectal carcinoma. Interestingly, Kaplan-Meier survival analysis showed that patients with LINC00634 upregulation exhibited lower OS and DSS than those with LINC00634 downregulation. Based on the results of the Cox regression analysis, we drew a nomogram and performed internal verification in order to predict the 3-year and 5-year survival rates of colorectal carcinoma patients more accurately, and this model has some practical significance for the development of clinical treatment. In internal verification, the c-index of the nomogram was 0.772 (95% CI: 0.744-0.799). Therefore, the prediction model had good accuracy.
However, there are some limitations that need to be addressed. First, the current research data were mainly obtained from public datasets, and the results of the bioinformatics analysis should be further validated through experimental research. Second, it is necessary to obtain data from different cohorts in order to improve the credibility of this study. Finally, retrospective research still has many shortcomings, especially regarding the accuracy of data and the lack of some information. Therefore, future experimental research is warranted to validate our findings.

Conclusion
In summary, LINC00634 expression was increased in CRC tissues and was related to poor OS and DSS. Moreover, LINC00634 was involved in the development of CRC by affecting gene transcription levels and the functions of immune infiltrating cells. Our findings indicate that LINC00634 can serve as a promising biomarker for the diagnosis and prognosis of CRC.

Declarations
Acknowledgements: We acknowledge the TCGA and GTEx databases for the free use of gene expression and clinical datasets. Author contributions: LF organized, wrote and critically modified the manuscript; NYQ modified the manuscript; FFYH drafted the manuscript and was responsible for the acquisition of data; LF participated in the data analysis; LF contributed to the literature search. All authors read and approved the manuscript, and agree to be accountable for all aspects of the research and manuscript.

[Table residue; recoverable column headers: Characteristic | Low expression of LINC00634]
[Figure caption: Most relevant enrichment pathway based on the GSEA results.]