Demographic and clinical characteristics of TNBC patients without gBRCAm
A total of 98 TNBC patients without gBRCAm were enrolled in our study (Supplementary Table S1). The detailed demographic and clinical characteristics of these patients are summarized in Table 1. The vast majority of patients (78.57%) were younger than 65 years, and most (65.31%) were postmenopausal women. Almost all patients had early or locally advanced disease (TNM stage I, II, or III; 98.98%). Sixty percent of the TNBC patients were randomly selected as the training cohort (N = 59), and all patients (N = 98) served as the validation cohort (also called the entire cohort). The demographic and clinical characteristics were similar between the two cohorts.
Generation of a prognostic lncRNA signature from the training cohort
Using univariate Cox regression analysis, a set of 173 prognostic lncRNAs was identified in the training cohort (P < 0.05). A LASSO Cox regression model was then applied to the top 20 lncRNAs to generate a prognostic signature (Table 2). As a result, we identified a new 8-lncRNA signature that was highly associated with OS in TNBC patients without gBRCAm (Figure 1A, 1B). As shown in Table 3, all 8 lncRNAs had positive coefficients, meaning that higher expression of each was associated with poorer survival. Based on the expression of these 8 lncRNAs, we established a risk-score formula for OS prediction: Risk score = (0.12701757 × expression level of HAGLROS) + (0.205120473 × expression level of AL139002.1) + (0.217051763 × expression level of AL391244.2) + (0.257176862 × expression level of AP000696.1) + (0.453822184 × expression level of AL391056.1) + (0.57841541 × expression level of AL513304.1) + (0.625591281 × expression level of TONSL.AS1) + (2.872258676 × expression level of AL031008.1).
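The risk-score formula above is a weighted sum, which can be sketched in a few lines of Python. This is only an illustration: the coefficients are copied from the formula, while the function name `risk_score`, the dictionary layout, and the example expression values are assumptions made here and are not taken from the study.

```python
# Coefficients copied from the risk-score formula in the text.
COEFFICIENTS = {
    "HAGLROS": 0.12701757,
    "AL139002.1": 0.205120473,
    "AL391244.2": 0.217051763,
    "AP000696.1": 0.257176862,
    "AL391056.1": 0.453822184,
    "AL513304.1": 0.57841541,
    "TONSL.AS1": 0.625591281,
    "AL031008.1": 2.872258676,
}


def risk_score(expression):
    """Return the weighted sum of the eight lncRNA expression levels.

    `expression` maps each lncRNA name to its expression level for one
    patient (hypothetical input layout, chosen for this sketch).
    """
    missing = [gene for gene in COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


# Made-up patient whose eight lncRNAs all have expression level 1.0;
# the score then equals the sum of the coefficients.
example_patient = {gene: 1.0 for gene in COEFFICIENTS}
print(round(risk_score(example_patient), 6))
```

Because every coefficient is positive, raising the expression of any of the eight lncRNAs can only raise the score, which matches the interpretation that each is associated with poorer survival.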
Prognostic ability of the 8-lncRNA signature in the training cohort
We calculated the 8-lncRNA signature risk score for every patient in the training cohort. Using the median risk score as the cut-off point, the patients were divided into a low-risk group (N = 30) and a high-risk group (N = 29). Kaplan-Meier survival analysis showed that overall survival was lower in the high-risk group, and the difference between the two groups was statistically significant (P = 0.00018, Figure 2A). The prognostic ability of the 8-lncRNA signature was also evaluated by calculating the AUC of the time-dependent ROC curve. The ROC curve reflects both the sensitivity and the specificity of the model; an AUC > 0.7 is taken to indicate good discrimination, and the higher the AUC, the better the predictive performance of the signature. For 1-, 5-, and 8-year survival, the AUCs of the 8-lncRNA signature in the training cohort were 1.000, 1.000, and 0.908, respectively (Figure 3A).
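The median split used for risk grouping can be sketched as follows. The text does not say how a score exactly equal to the median is handled; this sketch assumes such a patient goes to the low-risk group, a convention that gives 30 low-risk and 29 high-risk patients for 59 distinct scores. The function name and the patient scores are hypothetical.

```python
from statistics import median


def split_by_median(risk_scores):
    """Assign each patient to "high" or "low" using the median as cut-off.

    Assumption of this sketch: a score equal to the median counts as "low".
    """
    cutoff = median(risk_scores.values())
    groups = {
        patient: "high" if score > cutoff else "low"
        for patient, score in risk_scores.items()
    }
    return cutoff, groups


# Made-up risk scores for five hypothetical patients.
scores = {"P1": 0.8, "P2": 2.4, "P3": 1.1, "P4": 3.0, "P5": 1.9}
cutoff, groups = split_by_median(scores)
print(cutoff, groups)
```

With an even number of distinct scores the two groups are the same size, as in the 49/49 split of the entire cohort.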
Validation of the prognostic ability of the 8-lncRNA signature in the validation cohort
To confirm the power of the 8-lncRNA signature in predicting the OS of TNBC patients without gBRCAm, we validated our results in the entire cohort. Using the same classification method, patients were divided into a high-risk group (N = 49) and a low-risk group (N = 49). Consistent with the findings in the training cohort, patients in the high-risk group had significantly worse OS than those in the low-risk group (P = 0.0068, Figure 2B). For 1-, 5-, and 8-year survival, the AUCs of the 8-lncRNA signature in the entire cohort were 0.785, 0.790, and 0.892, respectively (Figure 3B). These results indicate that the 8-lncRNA signature predicts OS with high sensitivity and specificity, and that its performance varies with the time point assessed.
Functional enrichment analysis of lncRNA-related mRNAs
Based on the mRNA expression data from the TCGA database, 531 mRNAs were found to be closely related to the 8-lncRNA signature, using Pearson correlation with |COR| > 0.3 and P < 0.05 as the cutoff (Supplementary Table S2). The top ten related mRNAs were CALML6, BRICD5, ADCK5, C10orf143, GLI4, FBXL6, KIFC2, CDCP2, CCDC154, and TSTA3, with COR > 0.46. The functions of these mRNAs were analyzed with Metascape. The top 20 clusters of significantly enriched terms are shown in Figure 4. The most significant term was GO:0034660 (ncRNA metabolic process), which included 26 genes: CDK9, BRF1, POLR2J, SNAPC4, RRP1, CPSF4, RRS1, BOP1, PTCD1, CPSF1, SIRT7, SRRT, YBEY, EXOSC4, DDX4, DDX56, INTS11, OSGEP, DUS1L, TUT1, MRM1, ELL3, PUS1, CTU1, TRMT61A, and TYW3. The top five GO biological processes were ncRNA metabolic process, mRNA 3'-end processing, snRNA metabolic process, regulation of DNA repair, and oxidoreduction coenzyme metabolic process. The KEGG analysis revealed that these genes were mainly involved in the mRNA surveillance pathway, sulfur relay system, phototransduction, and ABC transporters. More details about the enriched terms and genes are shown in Supplementary Table S3.
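The correlation-strength part of this filter (|COR| > 0.3) can be sketched with the standard library alone. Two caveats apply. First, the text does not specify which vector each mRNA was correlated against, so `reference` here is a placeholder. Second, the P < 0.05 criterion is deliberately left out, because it requires a t-distribution that the Python standard library does not provide. All function names, gene labels, and data values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def related_mrnas(reference, mrna_expression, min_abs_r=0.3):
    """Return {gene: r} for mRNAs whose |r| with `reference` exceeds the cutoff.

    Only the |r| criterion is applied; the P-value criterion is not computed.
    """
    hits = {}
    for gene, values in mrna_expression.items():
        r = pearson_r(reference, values)
        if abs(r) > min_abs_r:
            hits[gene] = r
    return hits


# Made-up data for five hypothetical samples.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
mrna_expression = {
    "GENE_A": [2.0, 4.1, 5.9, 8.2, 10.0],  # rises with the reference
    "GENE_B": [3.0, 1.0, 3.0, 1.0, 3.0],   # uncorrelated with the reference
}
print(sorted(related_mrnas(reference, mrna_expression)))
```

Taking the absolute value keeps both positively and negatively correlated mRNAs, which is what a cutoff written as |COR| implies.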
Overexpression of lncRNAs TONSL-AS1 and HAGLROS in triple-negative breast cancers without gBRCAm and their clinical significance
To gain general insight into the clinical relevance of lncRNAs TONSL-AS1 and HAGLROS in triple-negative breast cancers without gBRCAm, qPCR was used to determine the expression levels of these lncRNAs in 30 breast cancer samples and their paired adjacent non-tumorous tissues. Overexpression (defined as a greater than 10-fold increase in tumor tissue) was observed in 10/30 (33.3%) of the primary breast tumors for lncRNA TONSL-AS1 and in 16/30 (53.3%) for lncRNA HAGLROS (Figure 5; P < 0.0001).
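The overexpression criterion (a greater than 10-fold increase in tumor relative to paired non-tumorous tissue) amounts to counting pairs whose expression ratio exceeds 10. The sketch below assumes that each pair is already available as two relative expression values; how those values were derived from the qPCR readout is not described in the text. The function name and the numbers are made up.

```python
def overexpression_rate(pairs, fold_cutoff=10.0):
    """Return (count, fraction) of pairs with tumor/normal ratio above the cutoff.

    `pairs` is a list of (tumor, normal) relative expression values
    (hypothetical input layout). Pairs with a non-positive normal value are
    counted as not overexpressed, since their ratio is undefined.
    """
    if not pairs:
        raise ValueError("at least one tumor/normal pair is required")
    count = sum(
        1 for tumor, normal in pairs
        if normal > 0 and tumor / normal > fold_cutoff
    )
    return count, count / len(pairs)


# Three made-up pairs: ratios of 25, 3, and 11.
example_pairs = [(25.0, 1.0), (3.0, 1.0), (22.0, 2.0)]
print(overexpression_rate(example_pairs))
```

Applied to 30 pairs of which 10 exceed the cutoff, the fraction is 10/30, or 33.3%, as reported for TONSL-AS1.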
Clinicopathological association analysis found that overexpression of lncRNAs TONSL-AS1 (Table 4) and HAGLROS (Table 5) was significantly associated with patient prognosis, tumor grade, and invasion.