Prognostic Value of the Long Non-Coding RNA MALAT1 in Colorectal Cancer Patients: A Pooled Analysis of Two Cohorts.


Background: Previous research has shown that aberrant expression of Metastasis associated in lung adenocarcinoma transcript 1 (MALAT1) in tumour tissues may serve as a biomarker for colorectal cancer (CRC) prognosis. However, these previous studies had small sample sizes and lacked validation in independent external populations. We therefore aimed to clarify the prognostic value of MALAT1 expression status in CRC patients using a large cohort and to validate the findings in another large external cohort. Methods: The prognostic association between MALAT1 expression status and CRC outcomes was evaluated initially in a prospective cohort in China (n = 164) and then validated in an external TCGA population (n = 596). In the initial cohort, MALAT1 expression levels were quantified by quantitative reverse transcriptase polymerase chain reaction. A propensity score (PS) adjustment method was used to control for potential confounding bias. Prognostic significance was reported as the PS-adjusted hazard ratio (HR) with the corresponding 95% confidence interval (CI). Results: There was no statistically significant association between MALAT1 expression status and overall survival (OS) or disease-free survival (DFS) of CRC patients in either the initial cohort or the external validation cohort. When the two populations were combined, the results did not change materially. The pooled PS-adjusted HRs were 1.010 (95% CI, 0.752-1.355, P = 0.950) and 1.170 (95% CI, 0.910-1.502, P = 0.220) for OS and DFS, respectively. Extensive sensitivity analyses demonstrated the robustness of these results. Conclusions: MALAT1 expression status was not associated with prognostic outcomes in CRC patients. Our findings did not support a prognostic association of MALAT1 expression with CRC outcomes.

diagnosis. Consequently, about 45% of patients suffer from recurrence or metastases after lesion resection. 6 To date, the pathological tumour staging system and specific histological characteristics have been reported as the most common prognostic predictors for CRC patients after surgery. However, patients with similar clinical/pathological features often experience different clinical outcomes. 7 Therefore, an effective biomarker for predicting the outcomes of the disease is urgently needed.
In our previous studies, we developed a series of blood-based DNA methylation biomarkers for CRC prognosis. 8,9 Long non-coding RNAs (lncRNAs) are becoming a hotspot in tumour biomarker research.
Metastasis associated in lung adenocarcinoma transcript 1 (MALAT1) is a well-studied lncRNA. 10 MALAT1, which is about 8.5 kb in length and located at 11q13, was first identified in a study of early-stage non-small-cell lung cancer. 11 Subsequent mechanistic studies have demonstrated a vital function of MALAT1 in the development and progression of various cancers, including CRC. [12][13][14] Recently, several studies have shown that aberrant expression of MALAT1 in tumour tissues may serve as a biomarker for CRC prognosis. [15][16][17][18][19] However, these previous studies had small sample sizes.
None of these studies validated their results in external populations. To clarify whether the expression status of MALAT1 in tumour tissues is associated with CRC prognosis or with the clinical characteristics of CRC patients, we performed this prospective cohort analysis with a relatively large sample size and a long-term follow-up period. We further used the colon and rectum adenocarcinoma datasets from The Cancer Genome Atlas (TCGA) as an independent external cohort to validate the findings.

Patient samples and inclusion criteria
This study was approved by the Harbin Medical University Ethics Review Board (Harbin, China). The study design and patient selection strategy have been published previously. 8,9 Briefly, in our initial prospective cohort of CRC patients, a total of 168 patients were included according to the inclusion criteria. All patients provided written informed consent. The inclusion and exclusion criteria were as follows: (1) all patients were newly diagnosed with stage I-IV primary CRC, and their diagnosis was histologically confirmed by a senior pathologist (HL); (2) fresh frozen tumour tissues were collected from all patients; (3) patients with any other concurrent type of cancer were excluded (n = 3); (4) patients with a family history of CRC in first-degree relatives were excluded (n = 5); (5) patients who received anti-cancer therapy before surgery were excluded (n = 11).
All CRC patients were diagnosed and underwent surgery at the First Affiliated Hospital and the Third Affiliated Hospital of Harbin Medical University between May 2010 and December 2012. The tumour specimens were staged according to the seventh edition (2009) of the AJCC TNM staging system. The clinical characteristics and medical records were collected. The primary outcome was overall survival (OS), defined as the time from surgery to death from any cause. The secondary outcome was disease-free survival (DFS), defined as the time from surgery to local or regional relapse, distant metastasis, or CRC-specific death, whichever came first. Outcomes were observed during the follow-up period through March 15, 2018 via an established protocol. Postoperative patients were followed up at 3-6 month intervals for the first year and then annually. We used a telephone-administered follow-up questionnaire to collect information on the date and cause of death of CRC patients. The recorded date and cause of death of each CRC patient were validated using the medical certification of death and the Harbin Death Registration system. Four cases lacked follow-up data and were excluded from this analysis. Among the 164 eligible CRC patients, the median follow-up period was 61.1 months (range, 4.9 to 80.8 months), and 75 patients died.

RNA extraction and qRT-PCR assays
Fresh tumour tissue samples were collected and immediately stored at -80 °C. Total RNA was extracted from fresh frozen tissues (0.5 g) using TRIzol reagent (Invitrogen). cDNA was reverse transcribed from 2 µg of total RNA using MultiScribe™ reverse transcriptase (Applied Biosystems). The RNA and cDNA concentrations were measured using a NanoDrop 2000c (ThermoFisher, USA). cDNA was then amplified by quantitative PCR. Melting curve analysis was used to monitor the specificity of the PCR reactions. The resulting data were analysed using the Gene Scanning and TM Calling modules (Roche). Two co-authors (HL and YXZ), blinded to outcomes, independently recorded the results. The relative expression level of MALAT1 was determined using the 2^(−ΔCt) method. The ΔCt value of each sample was calculated by subtracting the average Ct value of MALAT1 from the average Ct value of GAPDH. According to the median value of 2^(−ΔCt), patients were categorized into higher or lower MALAT1 expression groups.
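The relative quantification and median split described above can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis code: the Ct values are hypothetical, the function names are invented for this example, and the sketch assumes the conventional sign ΔCt = Ct(MALAT1) − Ct(GAPDH), so that a larger 2^(−ΔCt) corresponds to higher expression (with the opposite sign the group labels would simply swap). Assigning values equal to the median to the higher group is also an assumption, borrowed from the rule stated for the external cohort.

```python
from statistics import median


def relative_expression(ct_target, ct_reference):
    """Return 2^(-dCt), assuming dCt = Ct(target) - Ct(reference)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def median_split(values):
    """Label each value 'higher' (>= median) or 'lower' (< median)."""
    cutoff = median(values)
    return ["higher" if v >= cutoff else "lower" for v in values]


# Hypothetical mean Ct values per sample: (MALAT1, GAPDH).
samples = [(24.0, 18.0), (22.0, 18.0), (26.0, 19.0), (23.0, 18.5)]
expression = [relative_expression(t, r) for t, r in samples]
groups = median_split(expression)

for (ct_malat1, ct_gapdh), value, group in zip(samples, expression, groups):
    print(f"Ct MALAT1={ct_malat1}, Ct GAPDH={ct_gapdh}: "
          f"2^-dCt={value:.4f} -> {group}")
```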

External validation dataset
The colorectal dataset (CORD) from TCGA was used as the external validation population. The MALAT1 expression profile data, the clinicopathologic information, and the survival data were downloaded from the TCGA database and the UCSC Xena resource. 20,21 After excluding patients without MALAT1 expression data (n = 102) or survival data (n = 30), a total of 596 patients were included in our analyses, comprising 475 patients with colon cancer and 121 with rectal cancer. The median follow-up period for these 596 patients was 22.5 months (range, 0.2 to 150.1 months), and 121 patients died.
The gene expression RNA-seq HTSeq-FPKM-UQ dataset for TCGA colon and rectum adenocarcinoma was processed using the UCSC Xena website tools and then used in our analyses. The relative quantification of the MALAT1 expression level was presented as an N-fold difference, termed 'N_Malat1', which was determined by dividing the MALAT1 expression value by the GAPDH expression value. Patients were then categorized into the higher (≥ median of N_Malat1) or lower (< median of N_Malat1) expression groups.

Statistical analysis
We used a Cox proportional hazards regression model to calculate the sample size. Given a pre-estimated overall survival rate of 50% in the initial cohort population, a sample size of 128 cases was required to achieve 90% power to detect an estimated hazard ratio (HR) of 1.5 at a two-sided 5% level of statistical significance. Finally, we included an additional 25% of patients and targeted a total sample size of 164 patients. The sample size was estimated using PASS software (version 11.0.7, NCSS LLC., USA).
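As a rough cross-check of the figure of 128, the calculation can be sketched with a formula commonly used for Cox regression sample size (Hsieh and Lavori). The text does not state which formula or covariate settings were used in PASS, so the covariate standard deviation of 1 and the absence of correlation with other covariates (R² = 0) below are assumptions made only for this illustration; the function name is likewise hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def cox_sample_size(hazard_ratio, event_rate, alpha=0.05, power=0.90,
                    covariate_sd=1.0, r_squared=0.0):
    """Total sample size for a Cox model covariate (Hsieh-Lavori form).

    N = (z_{1-alpha/2} + z_{power})^2
        / (event_rate * (1 - R^2) * sd^2 * ln(HR)^2)

    covariate_sd and r_squared are assumed values, not taken from the text.
    """
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)
    z_beta = normal.inv_cdf(power)
    log_hr = log(hazard_ratio)
    n = (z_alpha + z_beta) ** 2 / (
        event_rate * (1 - r_squared) * covariate_sd ** 2 * log_hr ** 2
    )
    return ceil(n)


# HR = 1.5, 50% survival (so 50% events), two-sided alpha = 0.05, 90% power.
required = cox_sample_size(hazard_ratio=1.5, event_rate=0.5)
print(required)
```

Under these assumed settings the sketch returns 128, which agrees with the sample size reported above; other covariate settings would give a different number.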
We reported means (standard deviations) and counts (frequencies) for continuous and categorical variables, respectively. To minimise covariate differences between groups, we performed a PS-based analysis. 22 Group differences were assessed using standardised differences, with a standardised difference ≥ 25% indicating significant imbalance. The PS value was calculated with MALAT1 expression level as the dependent variable using a multivariate logistic regression model that included demographic factors and clinical/pathological characteristics. We used the PS-adjustment method in order to incorporate all patients in our analysis. 23 Survival curves were estimated by the Kaplan-Meier method, and differences in survival between groups were examined with log-rank tests. Univariate and PS-adjusted multivariate Cox proportional hazards regression models were used to assess prognostic significance, and the results were reported as hazard ratios (HRs) and 95% confidence intervals (CIs). The associations between MALAT1 expression status and the clinical/pathological covariates were reported as odds ratios (ORs) and 95% CIs. Statistical significance was defined as a two-sided P < 0.05. All statistical analyses were conducted with SPSS Statistics (v.23.0, IBM, USA).
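The balance check based on standardised differences can be illustrated as follows. This is a minimal sketch using the usual textbook definitions for continuous and binary covariates; the function names and all covariate values are hypothetical and are not taken from the study data.

```python
from math import sqrt


def smd_continuous(mean_a, sd_a, mean_b, sd_b):
    """Standardised difference for a continuous covariate."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


def smd_binary(p_a, p_b):
    """Standardised difference for a binary covariate (proportions)."""
    pooled_sd = sqrt((p_a * (1 - p_a) + p_b * (1 - p_b)) / 2)
    return (p_a - p_b) / pooled_sd


def is_imbalanced(smd, threshold=0.25):
    """Flag imbalance when the absolute standardised difference is >= 25%."""
    return abs(smd) >= threshold


# Hypothetical covariates in the higher vs lower expression groups.
age = smd_continuous(mean_a=62.0, sd_a=10.0, mean_b=60.0, sd_b=10.0)
stage_iii_iv = smd_binary(p_a=0.50, p_b=0.30)

print(f"age: {age:.3f}, imbalanced={is_imbalanced(age)}")
print(f"stage III-IV: {stage_iii_iv:.3f}, "
      f"imbalanced={is_imbalanced(stage_iii_iv)}")
```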

Sensitivity analysis
Several predesigned sensitivity analyses were performed to explore the robustness of the results. Firstly, we compared the univariate HR and the PS-adjusted HR using the confounding RR, 24 which was calculated to evaluate the relative impact of the PS adjustment on the results. Secondly, we performed a conventional multivariate Cox regression analysis as a sensitivity analysis. Additionally, for the external cohort population, we performed a post hoc sensitivity analysis by excluding patients with a short follow-up duration (≤ 1 or ≤ 3 months) in order to explore the potential confounding impact. Finally, we performed extensive post hoc subgroup analyses according to clinical/pathological factors. In the post hoc subgroup analyses, we used the Bonferroni adjustment method to correct the level of statistical significance.
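Two of these steps reduce to simple arithmetic, sketched below. The sketch assumes that the confounding RR is the ratio of the crude (univariate) HR to the adjusted HR, which is one common definition; the exact form used in the cited reference is not given in the text. The HR values are hypothetical, and the number of subgroup comparisons (four) is an assumption chosen only because it yields a corrected level of 0.0125.

```python
def confounding_rr(hr_crude, hr_adjusted):
    """Ratio of crude to adjusted HR (assumed definition).

    Values above 1 suggest that confounding inflated the crude estimate;
    values below 1 suggest that it attenuated the crude estimate.
    """
    return hr_crude / hr_adjusted


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical crude and PS-adjusted hazard ratios.
ratio = confounding_rr(hr_crude=1.20, hr_adjusted=1.00)
print(f"confounding RR = {ratio:.2f}")

# Assumed four subgroup comparisons at an overall alpha of 0.05.
corrected = bonferroni_alpha(alpha=0.05, n_comparisons=4)
print(f"corrected alpha = {corrected}")
print("P = 0.029 significant after correction:", 0.029 < corrected)
```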

Meta-analysis
In order to better understand the current evidence for the association between MALAT1 expression and CRC prognosis, we systematically reviewed the relevant studies and performed a meta-analysis. We systematically searched PubMed, Embase, and ProQuest through May 25, 2020 for eligible studies assessing the prognostic significance of MALAT1 expression for CRC patient outcomes. The inclusion criteria were as follows: (1) prospective cohort studies addressing the prognostic association between MALAT1 and CRC outcomes; (2) studies that reported effect estimates including HRs with corresponding CIs; (3) studies with a sample size of more than 50 participants; (4) no restriction on language, race, or any other participant characteristics. Data extraction was conducted independently by two co-authors (HL and YXZ). The maximally adjusted effect sizes and 95% CIs were extracted and summarised using random-effects models. The Q test and the I² statistic were used to assess between-study heterogeneity.
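A random-effects pooling step of this kind can be sketched as below. The text does not name the estimator, so the DerSimonian-Laird method is assumed here as one widely used choice; the function names and the input HRs and standard errors are hypothetical and do not correspond to the included studies.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Standard error of log(HR) recovered from a reported CI."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return (log(upper) - log(lower)) / (2 * z)


def random_effects_pool(hrs, ses):
    """DerSimonian-Laird random-effects pooling of hazard ratios."""
    y = [log(h) for h in hrs]
    w = [1 / s ** 2 for s in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q and the I^2 statistic for between-study heterogeneity.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Re-weight each study with the between-study variance added.
    w_star = [1 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se_pooled = sqrt(1 / sum(w_star))
    z = NormalDist().inv_cdf(0.975)
    return {
        "hr": exp(pooled),
        "ci": (exp(pooled - z * se_pooled), exp(pooled + z * se_pooled)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical study-level HRs with standard errors of log(HR).
result = random_effects_pool(hrs=[1.0, 2.0], ses=[0.1, 0.1])
print(f"pooled HR = {result['hr']:.3f}, I2 = {result['i2']:.1f}%")
```

When only a HR and its 95% CI are reported, `se_from_ci` recovers the standard error on the log scale before pooling.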
The pooled effect estimates were presented as forest plots. We performed an E-value analysis, 25 as a post hoc sensitivity analysis, to explore whether an unmeasured confounding factor could explain the observed associations.
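The E-value calculation can be sketched with the standard formula for a risk ratio, E = RR + sqrt(RR × (RR − 1)). The sketch below treats the HR directly as a risk ratio, which is an approximation (a conversion is usually recommended when the outcome is common), so its output need not match the values reported in the supplementary tables. The function names are hypothetical; the inputs are the pooled DFS estimate from the meta-analysis results.

```python
from math import sqrt


def e_value(rr):
    """E-value for a risk ratio; estimates below 1 are inverted first."""
    if rr < 1:
        rr = 1 / rr
    return rr + sqrt(rr * (rr - 1))


def e_value_ci(lower, upper):
    """E-value for the confidence limit closest to the null.

    Returns 1.0 when the interval includes the null value of 1.
    """
    if lower <= 1 <= upper:
        return 1.0
    return e_value(lower if lower > 1 else upper)


# Pooled DFS estimate from the meta-analysis, treated as a risk ratio.
point = e_value(1.784)
limit = e_value_ci(1.021, 3.118)
print(f"E-value (point estimate) = {point:.2f}")
print(f"E-value (CI limit) = {limit:.2f}")
```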

MALAT1 and Patient Outcomes
In the initial cohort, we analysed MALAT1 expression levels in a series of 164 tumour tissues from primary CRC patients with known clinical/pathological status and long-term follow-up outcomes. After PS adjustment, all covariates were balanced between groups (standardised mean difference < 0.25, eTable 1). There was no prognostic association between MALAT1 expression status and CRC patient outcomes. The univariate HRs were 1. (Table 1). We then combined the PS-adjusted HRs from the initial and external populations using random-effects models, and found no prognostic significance of MALAT1 expression status for CRC patient outcomes. The pooled PS-adjusted HRs were 1.010 (95% CI, 0.752-1.355, P = 0.950) and 1.170 (95% CI, 0.910-1.502, P = 0.220) for OS and DFS, respectively. The pooled effect estimates for subgroup populations showed similar results (Table 3).

Meta-analysis of MALAT1 and Patient Outcomes
To further assess the robustness of the results, we performed a systematic meta-analysis. The pooled results are shown in Fig. 1. Briefly, another three eligible studies were included in this meta-analysis. When these results were pooled, we still did not find a positive prognostic association for OS, with a summary HR of 1.683 (95% CI, 0.917-3.087; P = 0.093). For DFS, there was a marginally positive association between higher MALAT1 expression and worse DFS, with a summary HR of 1.784 (95% CI, 1.021-3.118; P = 0.042).

MALAT1 and Clinical/Pathological Characteristics
We examined associations between MALAT1 expression level and clinical/pathological characteristics in CRC patients (eTable 6 and eTable 7). In the initial cohort population, MALAT1 overexpression was significantly and positively associated with a higher CA19-9 level (P = 0.016) and a higher T stage (P = 0.030). In the external cohort, a strong and significant association between MALAT1 overexpression and overweight or obesity (≥ 25 kg/m²) was observed (P = 0.001). No positive relationship was found for the other clinical/pathological factors.

Discussion
In our initial cohort, there was no prognostic association between MALAT1 expression status and CRC patient outcomes. This finding was confirmed in the external TCGA cohort. Furthermore, the consistency across extensive sensitivity analyses supported the robustness of the results. To the best of our knowledge, the present study is the largest cohort study addressing the prognostic effect of MALAT1 on CRC patient outcomes. We initially performed a prospective cohort analysis with a long-term follow-up period of 7 years. We then used the CORD patient cohort from TCGA as an external validation dataset. No association of MALAT1 expression status with OS or DFS of CRC patients was found in our analysis, which was inconsistent with the findings of several previous studies.
A total of eight relevant studies assessed the association between MALAT1 expression and CRC prognosis. [13][14][15][16][17][18][19] Most of these studies reported that CRC patients with higher MALAT1 expression in tumour tissues had worse clinical outcomes, with shorter OS or DFS. However, the sample sizes of these previous studies were all small, ranging from 30 to 146 cases. In addition, univariate Cox regression models were used in most of these studies, and only Zheng 15 and Li 16 took the potential impact of multiple confounders into consideration. None of these studies conducted PS-based analyses. The PS-based method is a powerful statistical tool for controlling confounding bias and is often more practical and statistically more efficient than conventional multivariate strategies; 22, 23 it has been increasingly used to reduce the impact of confounders in observational studies, especially those with small sample sizes.
Based on the inclusion criteria, three eligible studies were finally included in the meta-analysis. The pooled results supported the notion that there was no association between MALAT1 expression and OS of CRC patients. For DFS, a marginally positive prognostic significance was observed; however, the E-values of both the point estimate and the lower CI limit of the pooled results were small, suggesting that a hypothetical residual confounding factor could fully explain the observed association for DFS (eTable 8).
Future large cohort studies are needed to further validate this issue for DFS. Given the rigour of the PS methods used in this study and their better performance in controlling confounders, our conclusions were drawn mainly from the findings in our initial and external validation populations.
Subgroup analyses by AJCC stage revealed a marginally better OS in stage II CRC patients with higher MALAT1 expression than in those with lower expression. The PS-adjusted HR was 0.556 (95% CI, 0.328-0.943) with a P-value of 0.029, which did not reach statistical significance after Bonferroni correction (α = 0.0125). In the subgroup with higher CA19-9 levels, CRC patients with higher MALAT1 expression had a longer OS than those with lower expression. However, this finding could not be validated in the TCGA external cohort population because of the lack of eligible data. Therefore, the findings from subgroup analyses should be interpreted with caution.

Strengths and limitations
This study had several major strengths, including the novel PS-based analysis, a relatively large population, validation in an external cohort population, and validation by meta-analysis.
However, our present study had certain limitations. Firstly, confounding bias was the major limitation, owing to the nature of the observational cohort study design. The findings of the confounding RR analyses suggested that confounders could overstate the prognostic association of MALAT1 with CRC patient outcomes. As a conservative approach, we used the PS-adjustment method to maximally control for the impact of potential confounders on the results. The PS method is known to be a powerful statistical tool for reducing the likelihood of confounding bias in observational studies. Another limitation is the lack of detailed information about adjuvant chemotherapy in both our initial cohort and the external cohort.

Conclusions
In summary, our findings did not support a prognostic association of MALAT1 expression with CRC patient outcomes.

Declarations
Ethics approval and consent to participate: This work has been approved by the Medical Ethics Committee of Harbin Medical University. All participants in the initial cohort provided written informed consent.
Consent for publication: Not applicable.