Potential role of long non-coding RNA LINC00968 in luminal A and B breast cancers

Purpose Breast Cancer (BC) is the most frequent malignancy among women worldwide. ER + breast cancers (luminal A and B subtypes) comprise up to 70% of all BC patients. Long non-coding RNAs (lncRNAs) are regulatory non-coding transcripts and longer than 200 nucleotides. LncRNAs can affect many biological and pathological processes and dysregulation of them is related to many human cancers. The potential role of LINC00968 lncRNA in luminal A and B breast cancer pathogenesis is still unclear. Methods Seventy-one pairs of tumor and adjacent non-tumor tissue specimens of luminal A and B breast cancer were used to analyze the expression of LINC00968. Furthermore, two luminal A cell lines, MCF7 and T47D, were used to evaluate the expression of LINC00968 compared with a non-malignant breast cell line, MCF10A. Moreover, we have done multiple bioinformatic analyses for a better understanding of the potential roles of LINC00968 in luminal BC. Results Our data revealed the signicant downregulation of LINC00968 in luminal tumor tissues and cell lines. LINC00968 expression was negatively associated with tumor stage and lymph node metastasis. Bioinformatic analyses indicated that LINC00968 might be involved in blood vessel development, angiogenesis, and might be participated in interaction of ECM constituents with cancer cells. LINC00968 might have functions in some cancer-related signaling pathways, like PI3K/Akt, ECM-receptor interaction signaling pathway, and PPAR signaling pathway. Conclusions Downregulation of LINC00968 might promote carcinogenesis of luminal BC. LINC00968 might act as a tumor suppressor gene and might also promote invasion and metastasis of luminal BC.


Introduction
Breast Cancer (BC) is a heterogeneous disease and the most frequent malignant disease among women around the globe [1]. Also, this type of cancer is the leading cause of cancer mortality in female patients [1]. The incidence of BC is increasing in many developing countries [2]. Microarray-based gene expression pro ling analysis was used to classify BC into ve intrinsic/molecular subtypes based on the expression pattern of estrogen receptor [ER], progesterone receptor [PR], human epidermal growth factor receptor 2 [HER2], and a proliferation index marker, Ki67 [1]. These subtypes are included in luminal A, luminal B, HER2 enriched, triple-negative (basal-like), and normal-like that display different biological behavior and progression [1]. Luminal A is the most frequent molecular subtype, which expresses ER and PR but not HER2, and shows low expression of Ki67 [1]. Luminal B is de ned with ER positive, PR positive or negative, HER2 positive or negative, and high expression of Ki67 [3]. The two types of luminal B (with HER2 + or HER2 -) have worse prognosis and higher proliferation rate than luminal A subtype and are more aggressive [3]. Luminal A and B constitute approximately 70% of all breast cancers and show better prognosis than hormone-receptor negative BC [2].
The molecular mechanisms of BC have been broadly studied, but some problems still exist such as delayed diagnosis, recurrence and metastasis [4]. Currently, molecular markers can improve the diagnosis, prognosis, and guidance of new therapies for BC.
Recent advances in sequencing technologies revealed that approximately 90% of the human genome is transcribed actively [5]. Most of the human RNA transcripts are non-coding RNAs (≥80%) [5,6] which perform important functions in different biological processes [5]. Long non-coding RNAs (lncRNAs) are regulatory non-coding RNAs longer than 200 nucleotides [7]. The number of lncRNAs in mammalian genome is higher than protein-coding mRNAs, but they have lower expression levels and tissue-speci c expression [6]. Many of lncRNAs are associated with a variety of diseases such as cancers [8]. Long ncRNAs can interact with DNA, RNAs and protein macromolecules in cells, and regulate tumor suppressor or oncogenic pathways through epigenetic, transcriptional, post-transcriptional and translational mechanisms that can lead to the cancer phenotype [9,10]. Some lncRNAs affect proliferation, apoptosis, drug resistance, invasion, metastasis, cell-cycle regulation, and DNA damage response of cancer cells [11]. Non-coding RNAs show differential expression between tumor tissues and normal tissues [7]. LncRNAs can be categorized into tumor suppressor genes and oncogenes according to their functions and the expression patterns in the tumor tissues [12]. Some lncRNAs are differentially expressed in the molecular subtypes of breast cancers [7]. Exploring diverse functions of lncRNAs in cancers like breast cancer may lead to new ideas about novel biomarkers and targeted therapies and can be useful in early detection [8].
Long Intergenic Non-Protein Coding RNA 968 (LINC00968) is located on chromosome 8q12.1 including 3 exons. Studies indicated that this lncRNA acted as an oncogene in osteosarcoma [13]. Also, this gene was upregulated in lung squamous cell carcinoma (LUSC) and played key roles in the tumorigenesis of this malignancy [14].
In this study, we rst screened differentially expressed lncRNAs by analyzing 420 luminal A and B tissue samples of TCGA-BRCA RNA sequencing dataset obtained from GDC data portal. After selection of LINC00968 as a target gene, we conducted an in vitro assay to evaluate the expression level of LINC00968, and to assess the dysregulation of this lncRNA in luminal A and B tissues, as well as MCF7 and T47D cell lines. Also, the association between LINC00968 expression and clinicopathological features of patients was assessed. In addition to the experimental study, multiple bioinformatic analyses were performed to explore more about the functions of LINC00968 in luminal subtypes of BC.

Screening for differentially expressed lncRNAs in luminal A and B breast cancer
In the rst step, we used Lnc2Cancer 3.0 database (http://www.bio-bigdata.com [15]) to de ne a list of novel lncRNAs associated with all cancers. Lnc2Cancer 3.0 provides broad experimentally association (4989 lncRNA-cancer associations) between 1614 lncRNAs and 165 human cancer categories. Also, a TCGA invasive breast carcinoma (BRCA) RNA sequencing dataset was obtained from the Genomic Data Commons (GDC) Data Portal (https://portal.gdc.cancer.gov [16]). Next, a total of 420 paired tumor/normal tissue samples containing 230 luminal A and 190 luminal B were chosen from the TCGA-BRCA RNA sequencing dataset. Afterwards, differentially expressed lncRNAs (DElncRNAs) were de ned through the comparison of tumoral and normal tissues across the luminal subtypes of BC. This RNA-seq data was analyzed using DESeq2 package in R statistical software. The thresholds of selection for DElncRNAs obtained from Lnc2Cancer and GDC databases were set at adj p-value < 0.05 and│log 2 FC│> 2. Subsequently, literature review was done for the selected DElncRNAs to de ne lncRNAs on which no study of luminal A and B subtypes of BC was conducted. We determined the expression level of the selected lncRNA in luminal A and B subtypes of invasive breast carcinoma dataset using the Gene Expression Pro ling Interactive Analysis 2 (GEPIA2) web server (http://www.GEPIA2.cancer-pku.cn [17] [18]. All samples were immediately snap-frozen in liquid nitrogen and after transportation, stored at -80˚C. This study was approved by the Ethics Committee of Tehran University of Medical Sciences (TUMS) (approval no. IR.TUMS.MEDICINE.REC.1398.659). Written informed consent was obtained from all patients prior to their inclusion in the study. The clinicopathological data of patients was recorded from medical documents.

Bioinformatic analysis 2.3.1 PAM50 molecular intrinsic subtypes differential expression analysis
The expression level of LINC00968 across PAM50 molecular subtypes of breast cancer was retrieved from TANRIC database (https://www.tanric.org [19]). TANRIC characterized the expression pro le of lncRNAs across 20 cancer types from TCGA and other datasets.

Detection of the co-expressed genes with LINC00968 in luminal subtypes of BC
The data of 420 luminal BC tissue samples containing 230 luminal A and 190 luminal B tissue samples were downloaded as FPKM le from the GDC data portal. Then, the co-expressed genes with LINC00968 were obtained across the mentioned samples. Pearson correlation analysis of these co-expressed genes has been computed using a standard method.

Annotation enrichment, protein-protein interaction (PPI) network, and pathway analyses
Gene ontology (GO) term enrichment analysis was performed using DAVID version 6.8 (https://david.ncifcrf.gov [23,24]). This web server was used to provide three GO term categories based on a large list of LINC00968 co-expressed genes, including biological processes, cellular components, and molecular functions. Also, GO terms obtained from DAVID were applied to the REVIGO web server (http://revigo.irb.hr [25]). REVIGO web server can take long lists of Gene Ontology terms and summarize them by removing redundant GO terms. The XGMML les retrieved from REVIGO were imported into the Cytoscape version 3.8.0 to increase their visualization. Also, Enrichr database (http://amp.pharm.mssm.edu/Enrichr [26,27]) was used to perform the pathway enrichment analysis of genes co-expressed with LINC00968, using pathway databases such as KEGG, Reactome, and WikiPathways. Besides, the protein-protein interaction (PPI) network of LINC00968 co-expressed genes were obtained by the STRING database (https://string-db.org [28]), and the hub genes were determined based on degree method in cytoHubba plugin of Cytoscape software [29].

Statistical analysis
The qPCR data analysis was performed using IBM SPSS version 24 software (IBM Co., Armonk, NY, USA). Data were expressed as mean ± standard deviation. Paired samples t-test was used to examine the signi cant differences between gene expression in tumoral and adjacent non-tumor tissues. Normal distributions of ∆Ct values of LINC00968 in tumoral and adjacent non-tumor tissue samples were approved using the Kolmogorov-Smirnov Test. The association between LINC00968 expression and the clinicopathological data was performed by χ 2 test. The Pearson correlation coe cient ≥ 0.4 and p-values <0.05 were considered as signi cant.  (Fig. 1a). Moreover, the quantitative real time PCR results of in vitro assay indicated that there was a highly signi cant downregulation of LINC00968 (p <0.001) in luminal A and B breast cancer tissues compared with adjacent non-tumor tissues (Fig. 1b). LINC00968 expression levels were remarkably lower in 79% (56/71) of tumor samples in comparison to those in adjacent non-cancerous tissues (Fig. 1c). Also, the qRT-PCR results in T47D and MCF7 cell lines revealed that LINC00968 expression (p <0.001) was remarkably lower in these two luminal A cell lines than that in MCF10A cell line (Fig. 1d). Thus, the results of experimental assay are similar to the bioinformatic results and concordant with them.

The association of LINC00968 expression with clinicopathological data
The clinicopathological characteristics of patients with breast cancer (including luminal A and B) were shown in Table 1. These clinicopathological features were included in luminal A and B groups, ER/PR/HER2 status, tumor size, lymph node metastasis, tumor stages and grades, age at diagnosis and p53 protein expression.
According to the median levels of LINC00968 expression in luminal A and B breast cancer tissues, patients were classi ed into high and low expression groups. Low expression of LINC00968 in luminal A and B was signi cantly associated with advanced tumor stage (p=0.016) and lymph node metastasis (p=0.022). LINC00968 expression was lowest in stage III (Fig. 1e).

Expression level of LINC00968 in PAM50 molecular subtypes of BC
The expression analysis across PAM50 molecular intrinsic subtypes was performed through TANRIC database. This database revealed that LINC00968 had the lowest expression in luminal B subtype and the highest expression in normal-like subtype (P= 0.000061729) (Fig. 2).

Genetic alterations of LINC00968 in luminal subtypes of breast cancer
Based on the cBioPortal database, copy number variations (CNVs) of LINC00968 among luminal subtypes of four invasive breast carcinoma studies were assessed. The OncoPrint from cBioPortal revealed that there were genetic alterations in 10% (383/3653) of invasive breast carcinoma patients (Fig.   3a). Also, LINC00968 was mostly ampli ed in the luminal subtypes of the mentioned studies and deletion existed only in the MBC project (Fig. 3b).
According to the ICGC data portal, there are 206 mutations in LINC00968 across three breast cancer projects (BRCA-EU, BRCA-FR, BRCA-UK) (Fig. 3c, Supporting information le 1). About 193 of 206 LINC00968 mutations occur in luminal subtypes of breast cancer (BRCA-EU project). Most of the LINC00968 mutations in luminal BC are substitutions. The majority of LINC00968 mutations take place in the intronic site (Fig. 3d).
3.6 GO functional annotation, PPI network, and pathway enrichment analysis for LINC00968 coexpressed genes To better understand the potential functions of LINC00968 in luminal A and B breast cancer, co-expressed genes with this lncRNA were rst retrieved through analysis of 420 luminal A and B tissues. The coexpression analysis of LINC00968 indicated that this lncRNA is signi cantly co-expressed with 1032 genes (R ≥ 0.4) according to the luminal A and B subtypes of invasive breast carcinoma dataset (Supporting information le 2).
A list of GO terms for three categories including biological process (BP), cellular component (CC), and molecular function (MF) were retrieved from DAVID database. The top 10 enriched GO terms (p-value< 0.05) of BP, CC, MF for LINC00968 co-expressed genes were illustrated in Table 2. LINC00968 coexpressed genes were mainly involved in some biological processes such as 'vasculature development', 'blood vessel development', and 'cardiovascular system development'. Moreover, some signi cant GO cellular component terms of LINC00968 co-expressed genes were included in 'proteinaceous extracellular matrix', 'extracellular matrix', and 'extracellular matrix component'. Additionally, LINC00968 co-expressed genes were mainly enriched in some molecular functions like 'glycosaminoglycan binding', 'heparin binding', 'calcium ion binding', and 'extracellular matrix structural constituent'.
Summarization of Gene Ontology terms and removing redundant GO terms using REVIGO indicated GO terms similar to the GO term enrichment analysis retrieved from DAVID based on the co-expressed genes with LINC00968 (Fig. 4 a, b).
The results of the pathway enrichment analysis (KEGG, WikiPathways and Reactome) of the coexpressed genes using Enrichr indicated the potential pathways of LINC00968 functions in luminal A and B breast cancer. The co-expressed genes with LINC00968 might have roles in some cancer-associated signaling pathways such as PI3K/Akt/mTOR signaling pathway, ECM-receptor interaction signaling pathway, and PPAR signaling pathway (Fig. 5a-c). Additionally, focal adhesion, proteoglycans in cancer, and cell adhesion molecules (CAMs) are included in enriched pathways for co-expressed genes.
The protein-protein interaction (PPI) network of the LINC00968 co-expressed genes was retrieved from the STRING database. This network indicated that 622 of 1032 co-expressed genes had interaction score> 0.4 with each other (Supplementary le 3). The top 48 hub genes in the PPI network of the LINC00968 coexpressed genes were obtained based on degree method (node degree ≥ 30) and the sub-PPI network was generated using the cytoHubba plugin in cytoscape software (Fig. 6, Table 3). Approximately 73% of the top mentioned hub genes were signi cantly downregulated in luminal A and B breast cancer, based on the GEPIA2 web server (Table 4).

Discussion
Extensive studies reported that most of the lncRNAs are abnormally expressed in tumor tissues, and each lncRNA exerts its function through diverse mechanisms in different cancers. According to the previous studies, LINC00968 was upregulated in osteosarcoma and lung squamous cell carcinoma [13,30].
LINC00968 increased proliferation of osteosarcoma through PI3K/Akt signaling pathway [13]. Some signaling pathways such as PI3K/Akt are important pathways in breast cancer and LINC00968 might perform its function through this mechanism, but more studies are needed.
There is not any study about the expression levels and the roles of LINC00968 in luminal A and B subtypes of breast cancer. Thus, the combination of in vitro and bioinformatic analyses of gene expression can be useful for deciphering a part of the potential roles of LINC00968 in luminal BC.
In the present study, LINC00968 was signi cantly downregulated in luminal A and B breast cancer tissues, and T47D and MCF7 cell lines (luminal A cell lines) compared with adjacent non-tumor tissues and MCF10A control cell line, respectively.
Studies on human tissues indicated that the expression of lncRNAs is variable among human tissues of different persons, and the interindividual expression variations of lncRNAs are more than mRNAs [31]. In the current study, LINC00968 was downregulated in 79% (56/71) of tumor tissues and this may be due to the interindividual expression variability of LINC00968 in luminal breast cancer patients.
The low expression levels of LINC00968 in tumor samples are signi cantly associated with advanced tumor stage and lymph node metastasis. It is considered that LINC00968 may act as a tumor suppressor gene in the luminal breast cancer based on tissue-speci c manner, and dysregulation of this lncRNA maybe lead to the invasion and metastasis of luminal BC. Thus, LINC00968 might be involved in the luminal A and B breast cancer carcinogenesis.
The data retrieved from the ICGC data portal reported that a great number of mutations occur in LINC00968 across ER + /HER2breast cancer project. Most of the LINC00968 mutations are of substitution type and mostly take place in the intronic regions. The somatic mutations directly or indirectly change gene expression [32] and lncRNA expression pro les in cancers. Some somatic mutations that were located on lncRNA transcription factor binding sites affected the lncRNA expression in cancers [32]. Somatic mutations can affect RNA secondary structure and the regulation of gene expression [33]. Somatic mutations in introns can disrupt splice sites or intronic splicing regulatory sequences, activate cryptic splice sites, and change transcriptional enhancer or silencer binding sequence. Therefore, the intronic mutations in LINC00968 may affect transcription levels and RNA splicing.
The data obtained from the cBioPortal database illustrated that LINC00968 is mostly ampli ed in luminal subtypes of four breast cancer studies. The roles and effects of copy number variations in gene deregulation have not been investigated in whole genome, and only some regions in genome have been studied [34]. Also, expression of some genes has correlation with gene dosage in some tissues, but not in other tissues [35]. It is supposed that ampli cation of a gene might not always lead to upregulation and some of these ampli ed genes might be repressed by epigenetics mechanisms [34]. Therefore, it is assumed that ampli ed LINC00968 might be silenced by mechanisms such as epigenetics in luminal breast cancer that may lead to downregulation, but future studies are needed.
The "guilt by association" principle states one gene might show the same regulatory mechanisms and roles with its co-expressed genes in the related processes and functions [36]. Thus, considering this principle and according to the gene ontology (GO) terms enrichment analyses, LINC00968 might be involved in some biological processes like vasculature development, blood vessel development, cardiovascular system development, extracellular matrix organization, and angiogenesis. Angiogenesis is an important process in the development of new blood vessels, and invasion and metastasis of tumor cells [37]. The vascular ECM is consisted of the basement membrane and the interstitial ECM that have major functions in vascular homeostasis and tumor angiogenesis [38].
Also, the functional analyses indicated that LINC00968 is mostly involved in glycosaminoglycan binding, heparin binding, calcium ion binding, and extracellular matrix structural constituent. Glycosaminoglycans (GAGs) are components of the extracellular matrix (ECM) that are contributed in the regulation of cell functions [37]. ECM constituents are critical in molecular and cellular mechanisms of cancer cells. Interaction between molecules expressed by cancer cells and factors in tumor microenvironment can affect tumor cell survival, proliferation of tumor cells, adhesion, migration of tumor cells, and therefore, tumor invasion and metastasis [39]. Calcium and some ECM components like Glycosaminoglycans (GAGs) have interaction with each other and imbalance of extracellular concentrations of some ions like calcium can promote or suppress cancer cell migration [37].
According to the Enrichr database, LINC00968 is involved in some cancer-associated signaling pathways, like PI3K/Akt signaling pathway, ECM-receptor interaction signaling pathway, PPAR signaling pathway. Also, other enriched pathways for LINC00968 are focal adhesion, proteoglycans in cancer, and cell adhesion molecules (CAMs) that are associated with cancers. Focal adhesions (FAs) provide attachment of cells to the ECM, and focal adhesion kinase (FAK) has signi cant roles in some human cancers like breast cancer invasion and metastasis [40].
According to the STRING database, about 60% of the LINC00968 co-expressed genes have signi cant interaction with each other. Also, considering the PPI network of the co-expressed genes with LINC00968 and GEPIA2 web server, most of the hub genes (73%) are signi cantly downregulated in luminal A and B breast cancer tissues in comparison to normal breast tissues (data not shown). Thus, these results con rmed that LINC00968 and its co-expressed genes are involved in common biological processes.

Conclusion
Based on the data collected from previous research and this study, dysregulation of lncRNAs expression is thought to be involved in carcinogenesis. In the current study, downregulation of LINC00968 was observed in luminal A and B tumor tissues and luminal A cell lines. Downregulation of LINC00968 was remarkably associated with lymph node metastasis and advanced tumor stage. Thus, LINC00968 might act as tumor suppressor gene in luminal BC. Moreover, it is considered that LINC00968 might be involved in the invasion and metastasis processes of luminal BC. Moreover, LINC00968 might have functions in some cancer-related processes such as the blood vessel formation and angiogenesis. Furthermore, this novel lncRNA might play crucial roles in several cancer-associated signaling pathways in luminal BC like PI3K/Akt/mTOR signaling pathway, ECM-receptor interaction signaling pathway, and PPAR signaling pathway. Therefore, LINC00968 might have impacts on tumorigenesis of luminal A and B breast cancer. Compliance with ethical standards Con ict of interest The authors declare that they have no con ict of interests.
Ethics approval and consent to participate The present study was approved by the Ethics Committee of Tehran University of Medical Sciences (TUMS) (approval no. IR.TUMS.MEDICINE.REC.1398.659) and informed consent was obtained from all patients.
Consent for publication Not applicable.