LncRNAs Signature Associated with Chemoradiotherapy Response and Prognosis in Locally Advanced Rectal Cancer

Yiyi Zhang, First Affiliated Hospital of Fujian Medical University; Binjie Guan, Fujian Medical University Union Hospital; Yong Wu, First Affiliated Hospital of Fujian Medical University; Fan Du, First Affiliated Hospital of Nanchang University; Jinfu Zhuang, First Affiliated Hospital of Fujian Medical University; Yuanfeng Yang, First Affiliated Hospital of Fujian Medical University; Xing Liu, First Affiliated Hospital of Fujian Medical University; Guoxian Guan (fjxhggx@163.com), First Affiliated Hospital of Fujian Medical University

Our findings showed that DBET, LINC00909, and FLJ33534 could serve as novel biomarkers for predicting NCRT response and prognosis in CRC patients. In addition, LINC00909 could be a novel therapeutic target for enhancing the NCRT response.

Background
Preoperative neoadjuvant chemoradiotherapy (NCRT) followed by radical surgery has become the standard of care for locally advanced rectal cancer (LARC) patients [1]. The benefits of this multimodality therapy have been well documented, including tumor downsizing and downstaging, an increased radical resection rate, and reduced local recurrence [2][3][4]. However, rectal cancer patients show heterogeneous responses to NCRT. Approximately 15-45% of rectal cancers develop resistance to NCRT, so these patients can be exposed to NCRT-related toxicities without oncological benefit and may even experience treatment failure [5]. Therefore, the identification of valid biomarkers for resistance to NCRT has become imperative.
Long non-coding RNAs (lncRNAs) are transcripts longer than 200 nucleotides (nt) that lack protein-coding ability [6]. LncRNAs play critical roles in many biological processes by affecting transcriptional modulation, splicing regulation, and post-transcriptional processing [7][8][9]. LncRNAs are also involved in the proliferation, invasion, progression, and metastasis of cancers, including CRC [10][11][12][13]. Recently, lncRNAs have been reported to act as diagnostic and prognostic biomarkers for several cancers, including CRC [14][15][16][17][18][19]. Several studies reported that lncRNAs can also act as effective biomarkers of chemotherapy resistance in mCRC patients [20,21]. Recently, Li et al. [22] reported that several biomarkers, including mRNAs and lncRNAs, can effectively predict NCRT response in LARC patients. However, studies of lncRNAs associated with resistance to NCRT remain limited. In addition, combining the expression of multiple lncRNAs is needed to improve the accuracy of predicting NCRT response and prognosis in LARC patients.
In this context, this study aimed to screen lncRNAs relevant to NCRT response using our previous gene expression profile. The lncRNAs were then verified in internal and external datasets containing patient tissue samples, and a risk factor model to predict disease-free survival was built based on Cox regression analysis. Finally, we investigated the function of the most powerful lncRNA, LINC00909, in vivo and in vitro.
Materials and Methods

Data preprocessing
A total of 31 LARC patients who received preoperative NCRT and radical surgery between March 2016 and December 2016 at Fujian Medical University Union Hospital, China, were enrolled in this study. The inclusion criteria, exclusion criteria, treatment protocols, and follow-up protocols were described in our previous study [23]. The raw data can be obtained from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/, GSE145037) and were used as an internal dataset for screening effective lncRNAs. Moreover, a total of 138 CRC patients without preoperative therapy, treated between January 2017 and December 2017, were used for building the risk score model and validating lncRNA expression in cancerous and adjacent non-cancerous tissues; this cohort was named the risk score training dataset, and the samples were collected after surgery. In addition, a total of 58 LARC patients who received NCRT during 2017 were included for external validation of predictive efficiency; this cohort was named the external validation dataset, and the samples were collected at diagnosis by colonoscopy. All patients provided written informed consent.
The workflow of this study is shown in Fig. 1. The hub lncRNAs in the microarray were screened as follows. To distinguish the function of the genes in the microarray, we downloaded the human annotation file from the Ensembl database [24] and annotated the function of each gene from the microarray. Genes were classified into two categories: non-coding RNAs and protein-coding RNAs. A total of 241 lncRNAs were found in the microarray. Differentially expressed lncRNAs were screened out at P < 0.05. Finally, based on the FDR and P value, we selected the four most relevant lncRNAs: DBET, LINC00909, FLJ33534, and HSD52. Moreover, based on the differentially expressed protein-coding genes and the hypothesis that lncRNAs directly interact with miRNAs and regulate the activity of mRNAs by acting as miRNA sponges, a lncRNA-miRNA-mRNA competing endogenous RNA (ceRNA) network of the above lncRNAs was constructed. First, the differentially expressed lncRNAs and mRNAs were selected from the microarray. Then, the lncRNA-miRNA and miRNA-mRNA interactions were predicted. Based on the miRcode online tool (http://www.mircode.org), miRDB (http://www.mirdb.org/), miRTarBase (http://mirtarbase.mbc.nctu.edu.tw//), and TargetScan (http://www.targetscan.org//), the miRNAs negatively regulated by lncRNAs and mRNAs were selected to construct the ceRNA network. In addition, the differentially expressed mRNAs (DEmRNAs) from the ceRNA network were selected to perform the KEGG and GO analyses.
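The screening step above filters lncRNAs by P value and FDR, but the text does not state which FDR procedure was used. The following is a minimal sketch that assumes Benjamini-Hochberg adjustment; the lncRNA names and P values are hypothetical placeholders, not data from this study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-lncRNA P values (resistant vs. sensitive comparison).
pvals = {"lncA": 0.001, "lncB": 0.020, "lncC": 0.030, "lncD": 0.600}
names = list(pvals)
fdr = benjamini_hochberg([pvals[name] for name in names])

# Keep lncRNAs passing both the raw P value and the FDR threshold.
selected = [
    name for name, q in zip(names, fdr)
    if pvals[name] < 0.05 and q < 0.05
]
print(selected)
```

Both thresholds (0.05) in the sketch are assumptions for illustration; only the P < 0.05 cut-off is stated in the text.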

Real-time quantitative polymerase chain reaction (RT-qPCR)
Total RNA from patient tissues was isolated using TRIzol reagent (Invitrogen) according to the manufacturer's instructions, and 1 µg of total RNA was used for reverse transcription with M-MLV Reverse Transcriptase (Promega). RT-qPCR was performed using an ABI 7500 real-time PCR system (Applied Biosystems; Thermo Fisher Scientific, Inc., Foster City, CA, USA). LncRNA levels were assessed by RT-qPCR with GAPDH as an internal control. PCR amplification was performed with denaturation at 94˚C for 5 seconds and annealing and extension at 62˚C for 40 seconds, for 40 cycles. The relative expression level of lncRNAs was calculated using the ΔCt method. In brief, the ΔCt value was defined as the lncRNA Ct value minus the GAPDH Ct value, so a high ΔCt value indicates relatively low expression of the lncRNA in a sample. All PCR amplifications were performed in triplicate and repeated in three independent experiments. The primers used for RT-qPCR are listed in Supplementary Table 1.
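As an illustration of the ΔCt calculation, the sketch below reads ΔCt as Ct(lncRNA) minus Ct(GAPDH), the direction under which a high ΔCt corresponds to low expression. The Ct values are hypothetical, and averaging the triplicate wells before subtraction is an assumption.

```python
from statistics import mean


def delta_ct(ct_lncrna, ct_gapdh):
    """ΔCt = mean Ct of the lncRNA minus mean Ct of GAPDH (triplicate wells)."""
    return mean(ct_lncrna) - mean(ct_gapdh)


# Hypothetical triplicate Ct values for one sample.
sample_lncrna_ct = [28.1, 28.3, 28.2]
sample_gapdh_ct = [18.0, 18.2, 18.1]

dct = delta_ct(sample_lncrna_ct, sample_gapdh_ct)
# A larger ΔCt means more cycles were needed relative to GAPDH,
# i.e. lower relative expression of the lncRNA.
print(round(dct, 2))
```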

Internal and external validation for the hub lncRNAs
We first verified the expression of the hub lncRNAs between the NCRT-resistant and NCRT-sensitive groups in the microarray data. Then, we evaluated the expression of the hub lncRNAs in cancerous tissues and adjacent non-cancerous tissues in the external data. Additionally, the expression of the hub lncRNAs was analyzed in patients receiving NCRT. The receiver operating characteristic (ROC) curve was plotted and the area under the ROC curve (AUC) was calculated to evaluate the predictive ability of the hub genes.
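The AUC used here can be understood as the probability that a randomly chosen resistant case has a higher marker value than a randomly chosen sensitive case. The sketch below computes it by direct pairwise comparison; the function name, marker values, and labels are hypothetical, and the study itself used SPSS and R rather than this code.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (e.g. NCRT-resistant), 0 otherwise.
    Ties between a positive and a negative score count as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical marker values and response labels for four patients.
marker = [0.9, 0.8, 0.4, 0.3]
resistant = [1, 0, 1, 0]
print(roc_auc(marker, resistant))  # 3 of 4 pairs ranked correctly -> 0.75
```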

Overexpression of the LINC00909 with the lentivirus
To overexpress LINC00909, the sequences of LINC00909 and the control (CON) were as follows:

Colony formation assay
The colony formation assay was carried out as described previously [25]. Briefly, cells were plated in 6-well plates (500 cells per plate) and cultured for 24 h before exposure to 4 Gy radiotherapy. The cells were then cultured for 14 days, fixed with 4% paraformaldehyde for 15 min, and stained with 1% crystal violet for 10 min. Colonies with diameters of more than 1.5 mm were counted.

Cell resistance to 5-FU
Anchorage-dependent cell growth was evaluated with a CCK-8 Kit (Dojindo Laboratories, Japan) according to the manufacturer's instructions. Cells were plated in 96-well plates at 3 × 10^3 cells per well. When cells reached 60% confluence, the medium was removed and replaced with fresh medium containing varying concentrations of 5-FU, and the cells were incubated for 48 h. The optical density was measured at 450 nm using a microplate reader, and the cell viability was calculated.

Tumor xenografts in nude mice
Male athymic nude mice (15-20 g, 6-8 weeks of age) were purchased from Shanghai SLAC Laboratory Animal Co., Ltd. (China). Care and treatment of all experimental mice were carried out in accordance with institutional guidelines (No. 2019-0023). Tumor xenografts were established by subcutaneous injection of 100 µl of cell suspension (1 × 10^7 /ml; DLD ov-LINC00909 group vs. DLD con-LINC00909 group; SW620 ov-LINC00909 group vs. SW620 con-LINC00909 group) into each foreleg of the nude mice. The long diameter of each tumor was then measured weekly for 4 weeks and recorded as the tumor size.

Statistical analysis
All statistical analyses were performed using SPSS software (version 23, SPSS Inc, Chicago, IL) and R software (version 3.4.1). The optimal cut-off values for lncRNA expression were determined using the X-tile program (http://www.tissuearray.org/rimmlab/). Survival outcomes were assessed using the Kaplan-Meier method and the log-rank test. A Cox proportional hazards model was used to identify risk factors for disease-free survival (DFS) [26]. A LASSO Cox regression model was applied to determine the ideal coefficient for each prognostic feature and to estimate the likelihood deviance [27][28][29][30]. The corresponding risk scores for the samples from the validation datasets were calculated using a risk score system. Based on cut-off values determined by ROC analysis, patients were divided into high-risk and low-risk groups. Specifically, the entire patient cohort was divided into two subgroups according to patient outcome (dead or alive), ROC curves were plotted based on the risk scores and survival status, and the risk score at which the AUC reached its maximum was selected as the cut-off value. Kaplan-Meier and Cox regression analyses were performed to compare DFS risk between the high-risk and low-risk groups. The performance of the model was evaluated by time-dependent ROC analysis. Decision curve analysis (DCA), a method for evaluating and comparing the predictive value of different prediction models [31,32], was performed to evaluate the clinical utility of the model for disease recurrence. The x-axis of the DCA represents the threshold probability (as a percentage), and the y-axis represents the net benefit of the predictive model. The net benefit was calculated according to the following formula: Net benefit = (true positives/n) − (false positives/n) × (pt/(1 − pt)), where n is the total number of patients and pt is the threshold probability. P < 0.05 was considered statistically significant.
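The net benefit formula can be made concrete with a short sketch. It classifies a patient as positive when the predicted probability is at or above the threshold probability pt; the predicted probabilities and outcomes below are hypothetical, and the function name is illustrative only.

```python
def net_benefit(probabilities, events, threshold):
    """Net benefit = TP/n - FP/n * (pt / (1 - pt)) at threshold probability pt.

    probabilities: predicted risk of the event for each patient.
    events: 1 if the event (e.g. recurrence) occurred, 0 otherwise.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(events)
    true_pos = sum(
        1 for p, e in zip(probabilities, events) if p >= threshold and e == 1
    )
    false_pos = sum(
        1 for p, e in zip(probabilities, events) if p >= threshold and e == 0
    )
    return true_pos / n - false_pos / n * (threshold / (1.0 - threshold))


# Hypothetical predicted recurrence risks and observed outcomes.
risk = [0.9, 0.7, 0.6, 0.2, 0.1]
recurred = [1, 1, 0, 0, 1]

# At pt = 0.5: TP = 2, FP = 1, n = 5, so 2/5 - (1/5) * 1 = 0.2.
print(net_benefit(risk, recurred, 0.5))
```

Evaluating this function over a grid of thresholds and plotting the result against the threshold gives a decision curve of the kind described above.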

Results

Cluster analysis, GO enrichment and KEGG analysis
The gene microarray was used to examine gene expression profiles in primary tumor cells. A total of 18419 genes were detected, including 241 lncRNAs. Supervised hierarchical cluster analysis of the lncRNA expression profiling data showed a clustering trend between the two groups (Fig. 2A, 2B). Moreover, a total of 16 differentially expressed lncRNAs (DElncRNAs) were found between the two groups, and the expression of the DElncRNAs was higher in the NCRT-resistant group (P < 0.05).
GO enrichment analysis was performed to investigate the molecular mechanism by which the DElncRNAs are involved in NCRT response in LARC patients. As shown in Fig. 2C, the top three significant GO terms were positive regulation of transcription from RNA polymerase II promoter, negative regulation of transcription from RNA polymerase II promoter, and positive regulation of transcription, DNA-templated. KEGG pathway analysis demonstrated that the top three KEGG pathways were pathways in cancer, the MAPK signaling pathway, and the neurotrophin signaling pathway (Fig. 2D). Moreover, we selected the top 6 lncRNAs to construct a ceRNA network based on the differentially expressed mRNAs in the gene chip (Fig. 2E).
Moreover, LASSO analysis was performed to explore the significant predictors for disease. The results demonstrated that DBET, LINC00909, FLJ33534, and HSD52 were the significant factors (Fig. 2F and G).

Validation of the four lncRNAs in internal data
To validate the expression of the four lncRNAs in the internal data, we examined DBET, LINC00909, FLJ33534, and HSD52 expression in rectal cancer tissues from NCRT-resistant and NCRT-sensitive cases in our microarray datasets. The relative expression of the four lncRNAs was significantly increased in NCRT-resistant tissues. KEGG pathway analysis demonstrated that the top three KEGG pathways were related to vasopressin-regulated water reabsorption, the RAS signaling pathway, and the glioma signaling pathway (Fig. 3F). Moreover, a ceRNA network relevant to the four lncRNAs was constructed based on the differentially expressed mRNAs in the gene chip (Fig. 3G).
Figure 3 verified the three lncRNAs in the R2 platform and Oncomine database. High expression of DBET, LINC00909, and FLJ33534 was associated with worse prognosis in the TCGA, Sveen, and Marisa datasets (all P < 0.05). Moreover, DBET, LINC00909, and FLJ33534 showed higher expression in CRC tissues than in adjacent non-cancerous tissues in a meta-analysis of eight datasets in the Oncomine database.

Hub gene validation in the external data without preoperative therapy
To independently validate the hub genes, we analyzed the expression level of the hub genes in cancerous tissues and adjacent non-cancerous tissues using qPCR (Fig. 4A).
A total of 138 patients were included in the validation set. The clinicopathological characteristics of the CRC patients are summarized in Table 1. As seen in Supplementary Fig. 1, X-tile plots identified 7.4, 10.6, 11.8, and 6.7 as cut-off values for DBET, LINC00909, FLJ33534, and HSD52, respectively. Based on these cut-off points, we divided the entire cohort into low and high subgroups in terms of DFS. Lower expression of DBET and LINC00909 was associated with better DFS and overall survival (OS) in CRC patients (both P < 0.01, Fig. 4C, D, G, and H). Higher expression of FLJ33534 was associated with worse DFS (P < 0.01, Fig. 4E) and OS (P = 0.06, Fig. 4I) in CRC patients. High expression of HSD52 tended to be associated with worse prognosis, but the differences in DFS (P = 0.12, Fig. 4F) and OS (P = 0.09, Fig. 4J) were not statistically significant.

(Fig. 5A). Furthermore, we analyzed the predictive ability of each hub lncRNA in patients receiving NCRT before surgery. The hub gene with the greatest predictive power was LINC00909 (AUC = 0.82, P < 0.01, Fig. 5C). The predictive abilities of the other lncRNAs, DBET (AUC = 0.65, P = 0.07), FLJ33534 (AUC = 0.67, P = 0.04), and HSD52 (AUC = 0.66, P = 0.06), are shown in Fig. 5B, D, and E.
Moreover, we analyzed the relationship between the four lncRNAs and prognosis in LARC patients. As shown in Fig. 5F-M, high expression of DBET, LINC00909, FLJ33534, and HSD52 was associated with worse DFS in LARC patients following NCRT (P = 0.02, P < 0.02, P = 0.02, and P = 0.06, respectively). However, we did not find similar results for OS (P = 0.77, P = 0.33, P = 0.06, and P = 0.71, respectively).

Construction of a risk factor model and validation
To explore the prognostic impact of the hub lncRNAs on DFS in CRC patients, we performed a Cox regression analysis. Using the cut-off value of 0.89 for risk scores generated from ROC curves, the patients were divided into high-risk and low-risk groups (Fig. 6A). Patients in the low-risk group had better DFS and OS than those in the high-risk group (both log-rank P < 0.001, Fig. 6B and C). Moreover, the risk score was also applied to the LARC patients, and the results demonstrated that the risk score could also predict prognosis in these patients: DFS (P < 0.01, Fig. 6E) and OS (P = 0.08, Fig. 6D).
As depicted in Fig. 6F, time-dependent AUC curves showed that LINC00909 had the most powerful predictive ability among the hub lncRNAs. The Cox model showed a stronger ability to predict DFS for CRC patients than any single hub lncRNA. To further explore the ability of the risk score to predict NCRT response, ROC curve analysis was performed in the LARC patients. The results demonstrated that the risk score had better predictive power compared with any hub gene (AUC = 0.75, P = 0.01, Fig. 6G).
Association of risk score with patient characteristics and prognosis in CRC patients
All patients (n = 138) were equally divided into the low-risk score group (n = 68) and the high-risk score group (n = 68). A higher pathological M stage (P = 0.017) and nerve invasion (P = 0.040) were found in the high-risk group. No statistical differences were observed between the two groups in terms of gender, age, American Society of Anaesthesiology (ASA) grade, tumor location, histopathology, tumor differentiation, pathological T stage, pathological N stage, postoperative hospital stay (days), lymph nodes retrieved, metastatic lymph nodes, and tumor size, as shown in Supplementary Table 3.
To further determine the prognostic factors in CRC patients, Cox regression analysis was performed. On univariate analysis, higher pathological M stage (HR = 13.670, P < 0.001) and higher risk score (HR = 2.549, P < 0.001) were associated with OS in CRC patients. Multivariate Cox regression analysis demonstrated that higher pathological M stage (HR = 4.441, P = 0.006) and higher risk score (HR = 2.110, P < 0.001) remained significantly associated with OS, as shown in Table 1.
On univariate analysis, higher pathological N stage (HR = 2.465, P < 0.001), vascular invasion (HR = 2.387, P = 0.040), higher pathological T stage (HR = 2.348, P = 0.008), and higher risk score (HR = 2.625, P < 0.001) were associated with DFS in CRC patients. Multivariate Cox regression analysis demonstrated that higher risk score (HR = 1.224, P < 0.001) and higher pathological N stage (HR = 2.128, P = 0.001) remained significantly associated with an increased risk of local recurrence, as shown in Table 2.
We further explored the association between the risk score and clinicopathological parameters. In the early pathological stage (stage 0-II), the low-risk score group had better DFS and OS than the high-risk score group (P < 0.01, Fig. 7A and B). In the advanced pathological stage (stage III-IV), the low-risk group also had a better prognosis than the high-risk group (all P = 0.01, Fig. 7C and D). DCA was used to evaluate the performance of the risk score. As shown in Fig. 7E, the risk score provided more benefit than the individual lncRNAs in either the disease-free scheme or the disease recurrence scheme.
The clinical impact curve (Fig. 7F) showed the predicted risk stratification of 1,000 patients using a resampling bootstrap method. "Number high risk" indicates the number of patients classified as positive (high risk) by the risk score at each threshold probability. "Number high risk with the event" is the number of true positive patients at each threshold probability.
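The two quantities plotted in a clinical impact curve can be sketched as simple counts scaled to a notional 1,000 patients. This is a simplified illustration with hypothetical data and an illustrative function name; it omits the bootstrap resampling used for Fig. 7F.

```python
def clinical_impact(probabilities, events, thresholds, population=1000):
    """For each threshold, return (threshold, number high risk,
    number high risk with the event), scaled to `population` patients."""
    n = len(events)
    rows = []
    for t in thresholds:
        high_risk = sum(1 for p in probabilities if p >= t)
        high_risk_event = sum(
            1 for p, e in zip(probabilities, events) if p >= t and e == 1
        )
        rows.append(
            (
                t,
                round(population * high_risk / n),
                round(population * high_risk_event / n),
            )
        )
    return rows


# Hypothetical predicted risks and observed events for five patients.
risk = [0.9, 0.7, 0.6, 0.2, 0.1]
event = [1, 1, 0, 0, 1]
for row in clinical_impact(risk, event, [0.5, 0.8]):
    print(row)
```

The gap between the two counts at a given threshold corresponds to patients flagged as high risk who did not experience the event.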

Validation of the lncRNAs in the R2 platform and Oncomine database
Based on the above results, DBET, LINC00909, and FLJ33534 were selected for further verification in external databases. We verified the three lncRNAs in CRC tissues in the R2 platform and Oncomine database. As shown in Supplementary Fig. 1, high expression of DBET, LINC00909, and FLJ33534 was associated with worse prognosis in three independent datasets in the R2 platform, which was similar to our results. In the Oncomine database, a meta-analysis of eight datasets showed that DBET, LINC00909, and FLJ33534 were highly expressed in CRC tissues compared with adjacent non-cancerous tissues. These results support our finding that DBET, LINC00909, and FLJ33534 act as oncogenes in CRC patients.

Overexpression of LINC00909 is associated with NCRT resistance in vivo and in vitro
Based on the previous results, we found that LINC00909 was the most powerful lncRNA for predicting NCRT response and prognosis in CRC patients. To further investigate the function of LINC00909 in CRC cell lines, we constructed two LINC00909-overexpressing CRC cell lines. As shown in Fig. 8A, we successfully constructed the LINC00909-overexpressing cell lines DLD-over-LINC00909 and SW620-over-LINC00909 (all P < 0.01). Moreover, we assessed the resistance of the LINC00909-overexpressing CRC cell lines to NCRT. The IC50 for 5-FU was 112.80 ± 20.76 µg/ml in the DLD-CON group and 1104.74 ± 50.74 µg/ml in the DLD-over group; the IC50 for 5-FU was 94.89 ± 9.887 µg/ml in the SW620-CON group and 845.62 ± 35.24 µg/ml in the SW620-over group (Fig. 8B and C). We also analyzed the sensitivity of the LINC00909-overexpressing cell lines to 4 Gy radiotherapy (Fig. 8D and E). The colony number was 19.33 ± 10.12 in the DLD-CON group and 62.33 ± 15.04 in the DLD-over group (P < 0.01); the colony number was 21.33 ± 11.37 in the SW620-CON group and 101.67 ± 25.03 in the SW620-over group (P < 0.01). Tumor xenografts were used to explore the function of LINC00909 in vivo. As shown in Fig. 9F-J, the tumor sizes in the two groups verified that LINC00909 acts as an oncogene in vivo (all P < 0.01).

Discussion
In this study, four lncRNAs (DBET, LINC00909, FLJ33534, and HSD52) were identified, and DBET, LINC00909, and FLJ33534 were validated as hub genes correlated with NCRT response and prognosis in CRC patients. A three-lncRNA based risk model was constructed to predict NCRT response and prognosis of CRC patients. Moreover, we verified the function of LINC00909 in CRC through in vivo and in vitro experiments.
LncRNAs have been reported to act as potent biomarkers for the diagnosis and for the prediction of prognosis and progression in CRC patients [33][34][35]. Li et al. revealed that several biomarkers, including mRNAs and lncRNAs such as KRAS, PDPK1, PPP2R5C, PPP2R1B, and YES1, can effectively predict the NCRT response in LARC [22]. However, the predictive value of lncRNAs, and of a lncRNA-based prediction model, for NCRT resistance is still unclear. To explore the role of lncRNAs in LARC patients receiving NCRT, we re-analyzed the microarray and classified the genes in the transcriptome according to their coding function. Then, based on the LASSO analysis, we selected the four most effective lncRNAs in our microarray data (DBET, LINC00909, FLJ33534, and HSD52) to predict NCRT response and prognosis in CRC patients. DBET, LINC00909, and FLJ33534 showed good power to predict NCRT response in both the internal and external datasets. The results indicated that the hub lncRNAs were effective biomarkers for predicting NCRT response and prognosis.
Several of the above hub lncRNAs have already been reported in previous studies. LINC00909 has been reported in human glioma, where it could act as an oncogenic lncRNA in tumorigenesis [36]. Moreover, in the study by Xu et al. [37], high expression of serum LINC00909 could serve as an effective diagnostic biomarker for CRC. HSD52 gene expression is associated with body mass index (BMI) in obese Korean women, including overweight patients [38]. In addition, Ahmad et al. [39] demonstrated that FLJ33534 has an intronic variant, rs140133294, associated with BMI variance.
Currently, no studies have investigated the biological functions of the DBET gene. To further explore the function and mechanism of the hub lncRNAs in NCRT patients, Pearson analysis was performed and a ceRNA network was constructed to select relevant mRNAs [40]. The ceRNA network was constructed based on the hypothesis that lncRNAs directly interact with miRNAs and regulate the activity of mRNAs by acting as miRNA sponges [41]. GO enrichment and KEGG analyses were performed on the associated mRNAs, and the results demonstrated that the hub lncRNAs are involved in the RAS signaling pathway and transcriptional activator activity, which have been reported in previous studies [42][43][44]. Moreover, we successfully constructed LINC00909-overexpressing CRC cell lines to verify that overexpression of LINC00909 could enhance resistance to NCRT in CRC. The in vivo and in vitro results were consistent with our microarray and qPCR results.
The microarray is a useful tool for selecting hub lncRNAs. However, there is no denying that microarray results may contain several false-positive lncRNAs, which may then fail verification in an external dataset. Thus, to further verify the hub lncRNAs screened by microarray profiling, we examined the expression of the hub lncRNAs in cancerous and adjacent non-cancerous tissues from 138 CRC patients. The results demonstrated that the three lncRNAs acted as oncogenes in CRC and that high expression of the three lncRNAs was associated with a shorter DFS.
Risk factor models have been utilized for prognostication in several tumors, such as liver, lung, and colon cancers [10,45,46]. However, to the best of our knowledge, no such study has focused on NCRT response and prognosis in CRC patients. In the present study, we successfully constructed a risk factor model based on a three-lncRNA signature that had a powerful ability to predict the prognosis of CRC patients. Moreover, the time-dependent ROC curve demonstrated that the risk factor model had a higher AUC value than any single lncRNA in predicting DFS in CRC patients. Additionally, to further explore the relationship between the three lncRNAs and NCRT response in CRC patients, we screened out 36 LARC patients who received NCRT before surgery in an external dataset. The results of ROC analysis revealed that LINC00909, FLJ33534, and the risk factor score had a powerful ability to predict NCRT response in LARC patients. Moreover, in both the internal and external datasets, the risk score model had better predictive power than any single lncRNA. In summary, the risk factor model based on the four lncRNAs had a strong ability to predict prognosis and NCRT response in CRC patients.
In clinical practice, a patient's prognosis is usually influenced by a variety of clinical factors. DCA is a useful tool that can assist clinical decision-making. In the present study, we found that the risk score based on the four lncRNAs provided a greater benefit in estimating patients' disease recurrence rate. Thus, we employed the risk score to analyze prognosis in CRC patients. The pathological TNM stage has been considered the most useful factor for predicting the prognosis of CRC patients in many studies [47,48]. As is well known, ypCR-Stage II patients have a better prognosis, and such patients without risk factors are not recommended for postoperative chemotherapy according to the current NCCN guidelines [49]. However, the recurrence rates of the early-stage patients were 83.1-88.7% [50,51]. Thus, identifying these patients is an urgent task. In the present study, we analyzed the prognosis of the early-stage patients, and the results demonstrated that a higher risk score was associated with a higher risk of disease recurrence in CRC patients. To sum up, the risk score model not only acted as an effective predictive biomarker in CRC patients but could also distinguish the ypCR-Stage II CRC patients at risk of disease recurrence.
There were some limitations to the current study. Firstly, the small gene chip sample size was a major limitation. Due to the study design, we included only LARC patients at diagnosis who had not received any treatment before colonoscopic biopsy, which limited the sample size. We will continue to expand the sample size in future studies. Secondly, the pathways involving the hub lncRNAs were inferred from lncRNA microarray profiling and bioinformatics methods, and they need to be further validated by in vitro and in vivo experimental studies.
In conclusion, we identified and validated the three hub lncRNAs as new effective predictors of NCRT response and as prognostic factors for CRC patients. Moreover, based on the three lncRNAs, we constructed a risk factor model with strong power to predict NCRT response and prognosis in CRC patients. These results may help to discriminate CRC patients who are candidates for NCRT. In addition, the risk score can distinguish the ypCR-Stage II CRC patients who have a higher disease recurrence rate, and early-stage patients with a high risk score could be considered for postoperative chemotherapy. We also identified the function of the most powerful lncRNA, LINC00909, in resistance to NCRT. Nevertheless, the underlying molecular mechanisms warrant further study.

Fig. 1 Workflow diagram of data preparation, processing, analysis, and validation in this study.