Long non-coding RNA landscapes in benign and malignant thyroid neoplasms of distinct histological subtypes


Background: The main types of thyroid neoplasms, follicular adenoma (FA), follicular thyroid carcinoma (FTC), classical and follicular variants of papillary carcinoma (clPTC and fvPTC), and anaplastic thyroid carcinoma (ATC), differ in prognosis, rate of progression and metastatic behavior. Specific patterns of lncRNAs may be involved in the development of their clinical and morphological features. The lncRNA landscapes of distinct benign and malignant histological variants of thyroid neoplasms are unknown. Methods: A comprehensive set of microarray and RNA-Seq datasets was analyzed for the expression of lncRNAs in FA, FTC, fvPTC, clPTC and ATC. Potential biological functions were evaluated via coexpression and enrichment analysis. Results and conclusion: Aberrant expression of lncRNAs in FA, FTC, fvPTC, clPTC and ATC was established. LncRNAs common for benign and malignant neoplasms, specific for papillary carcinomas, and specific for clPTC, fvPTC and ATC were determined. These common and specific lncRNAs are putatively involved in L1CAM interactions; processing of capped intron-containing pre-mRNA; tryptophan metabolism; the PCP/CE pathway and beta-catenin independent WNT signaling; extracellular matrix organization; and cell cycle and mitotic pathways. The patterns of lncRNA expression in FA and FTC appeared similar, with no genes significantly differentially expressed between these subtypes. Previously known oncogenic and suppressor lncRNAs (NR2F1-AS1, LINC00511, SLC26A4-AS1, CRNDE, LINC01116, RMST) were found to be aberrantly expressed in thyroid carcinomas. The findings enhance the understanding of the lncRNA landscape in thyroid neoplasms and its role in thyroid cancer progression.

proteins to the gene promoter, decoying miRNAs and proteins, or interfering with protein post-translational modification [5,6,7,8]. Relative to coding genes, lncRNAs can be classified as intergenic (lincRNA); antisense (on the opposite strand of a protein-coding locus); sense intronic or sense overlapping (on the same strand, either transcribed within an intron of a coding gene or containing a coding gene in its own intron); retained intron (an alternatively spliced transcript containing intronic sequence); bidirectional (originating from the promoter region of a protein-coding gene, with transcription proceeding in the opposite direction on the other strand); or 3-prime overlapping (overlapping the 3'UTR of a protein-coding locus on the same strand).
In thyroid cancer, several lncRNAs have been shown to have pathogenic and predictive roles, including BANCR, FALEC, CNALPTC1, PVT1, NAMA, PTCSC1, PTCSC2, PTCSC3, TNRC6C-AS1 and others [10][11][12][13][14][15][16][17][18][19][20][21]. However, all of these studies considered only PTC, and most did not take into account the difference between clPTC and fvPTC. There are no published studies describing the landscapes of lncRNAs in ATC, FTC and FA. Nevertheless, lncRNAs differentially expressed in ATC could reflect anaplastic features and be strong prognostic factors. As the morphology and behavior of FTC differ from those of PTC, the landscape of lncRNAs in FTC can be expected to differ from that of PTC. Investigating lncRNAs common and specific for FA and FTC is important for understanding their relationship and revealing differential diagnostic markers.
This study aimed to identify lncRNAs specific and common for the main types of thyroid neoplasms (FA, FTC, fvPTC, clPTC and ATC). Expression data from microarray technology (8 datasets) and RNA-Seq technology (the PRJEB11591 dataset and TCGA transcriptome data) were analyzed.
CEL files were downloaded and normalization was performed using the gcrma R package. Microarray probes were annotated with Ensembl version 93 using the biomaRt package [22].

RNA-Seq datasets
RNA-Seq dataset PRJEB11591 of Yoo SK et al. [23] was selected from the EBI European Nucleotide Archive database (https://www.ebi.ac.uk/ena/data/view/PRJEB11591). FASTQ files were downloaded, and alignment was performed with hisat2 [24]. Counts were calculated using featureCounts (Rsubread package) with Ensembl version 93 annotation and the Ensembl gene ID as the grouping attribute [25]. Genes with low counts (fewer than 2 counts in a number of samples exceeding the size of the smallest sample group) were filtered out, and TMM normalization (edgeR package) and the voom method of the limma R package were applied.
From the TCGA transcriptomic data, 58 NT, 356 clPTC and 101 fvPTC samples were selected. Samples of metastases and of other minor histological subtypes were excluded. Raw counts (HTSeq-Counts workflow type; briefly, STAR 2-pass alignment followed by gene expression counting with HTSeq) were downloaded from the Genomic Data Commons Data Portal (GDC, https://portal.gdc.cancer.gov/). Genes with low counts (fewer than 1 count in a number of samples exceeding the size of the smallest sample group) were filtered out, followed by TMM normalization (edgeR package) and voom analysis in limma [26].
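The low-count filter described for both RNA-Seq datasets can be sketched as follows. This is a minimal Python illustration, not the R code used in the study. It assumes the rule is read in the conventional way: a gene is kept when it reaches the count threshold in at least as many samples as the smallest group contains. The function name `filter_low_counts`, the toy matrix and the group labels are hypothetical.

```python
import numpy as np

def filter_low_counts(counts, groups, min_count):
    """Return a boolean mask of genes (rows) to keep.

    Assumed reading of the rule: keep a gene if it has at least
    `min_count` reads in at least as many samples as there are
    samples in the smallest group.
    """
    counts = np.asarray(counts)
    groups = np.asarray(groups)
    # Size of the smallest sample group
    _, group_sizes = np.unique(groups, return_counts=True)
    smallest_group = group_sizes.min()
    # Number of samples per gene that reach the threshold
    n_expressed = (counts >= min_count).sum(axis=1)
    return n_expressed >= smallest_group

# Toy example: 3 genes (rows) x 5 samples (columns), hypothetical values
toy_counts = [[0, 0, 0, 0, 5],
              [3, 4, 0, 0, 0],
              [10, 12, 9, 8, 11]]
toy_groups = ["NT", "NT", "PTC", "PTC", "PTC"]
keep = filter_low_counts(toy_counts, toy_groups, min_count=2)
```

Under this reading, `min_count=2` corresponds to the PRJEB11591 threshold and `min_count=1` to the TCGA threshold; TMM normalization and voom would then be applied to the retained genes.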

Selection of lncRNA genes
Protein-coding genes and genes attributed to Havana biotypes not related to lncRNA were filtered out of the count matrices. Genes of the following Havana biotypes were included in the analyses: lincRNA, antisense, 3-prime overlapping ncRNA, bidirectional promoter lncRNA, misc RNA, processed transcript, sense intronic, sense overlapping.
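The biotype-based selection can be illustrated with a short Python sketch. The exact spelling of the biotype labels below is an assumption about how the annotation encodes them, and the gene IDs and the function name `select_lncrna_genes` are hypothetical.

```python
# Biotypes retained as lncRNA-related; the label spelling is an assumption
# about the annotation file and may need adjusting.
LNCRNA_BIOTYPES = frozenset({
    "lincRNA",
    "antisense",
    "3prime_overlapping_ncRNA",
    "bidirectional_promoter_lncRNA",
    "misc_RNA",
    "processed_transcript",
    "sense_intronic",
    "sense_overlapping",
})

def select_lncrna_genes(gene_biotypes):
    """Return gene IDs whose biotype belongs to the lncRNA-related set.

    `gene_biotypes` maps a gene ID to its biotype label.
    """
    return [gene for gene, biotype in gene_biotypes.items()
            if biotype in LNCRNA_BIOTYPES]

# Hypothetical annotation for four genes
annotation = {
    "GENE_A": "lincRNA",
    "GENE_B": "protein_coding",
    "GENE_C": "antisense",
    "GENE_D": "miRNA",
}
lncrna_ids = select_lncrna_genes(annotation)
```

The resulting ID list would be used to subset the rows of each count or expression matrix before differential expression analysis.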

Statistical analysis
To identify differentially expressed lncRNAs, linear modelling using the limma package was performed [27]. Genes with an FDR-adjusted P-value ≤ 0.01 and a fold change (FC) ≥ 2.0 were considered differentially expressed. Hierarchical clustering heatmap analysis of the differentially expressed genes was performed using the coolmap function of limma.
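The selection thresholds can be sketched in Python as below. The sketch covers only the final filtering step, not the limma model fit. It assumes that the FDR adjustment is Benjamini-Hochberg and that FC ≥ 2.0 applies to changes in either direction (absolute log2 fold change ≥ 1). The function names and the toy values are hypothetical.

```python
import numpy as np

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Raw adjustment: p * n / rank, for ranks 1..n in ascending P-value order
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest P-value
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted

def select_differentially_expressed(pvalues, log2_fc,
                                    max_fdr=0.01, min_fold_change=2.0):
    """Boolean mask of genes passing both the FDR and fold-change cut-offs."""
    adjusted = benjamini_hochberg(pvalues)
    abs_lfc = np.abs(np.asarray(log2_fc, dtype=float))
    return (adjusted <= max_fdr) & (abs_lfc >= np.log2(min_fold_change))

# Toy values for four genes (hypothetical)
toy_p = [0.001, 0.004, 0.03, 0.5]
toy_lfc = [1.5, -0.5, 2.0, 3.0]
is_de = select_differentially_expressed(toy_p, toy_lfc)
```

In the toy example only the first gene passes: the second fails the fold-change cut-off and the last two fail the FDR cut-off.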

Validation
For clPTC and fvPTC, the sets of genes found to be significantly differentially expressed at the previous step in the microarray, RNA-Seq PRJEB11591 and RNA-Seq TCGA datasets were intersected. Genes found in all three datasets, and genes found in both RNA-Seq datasets but not represented by microarray probes, were considered validated.
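The validation rule amounts to simple set operations, as in the following Python sketch. The function name `validated_genes`, its argument names and the single-letter gene IDs are hypothetical; `microarray_probed` stands for the set of genes that have at least one annotated microarray probe.

```python
def validated_genes(de_microarray, de_prjeb11591, de_tcga, microarray_probed):
    """Return the genes considered validated.

    A gene is validated if it is differentially expressed in all three
    datasets, or in both RNA-Seq datasets while having no microarray probe.
    """
    in_both_rnaseq = de_prjeb11591 & de_tcga
    in_all_three = in_both_rnaseq & de_microarray
    rnaseq_only_unprobed = in_both_rnaseq - microarray_probed
    return in_all_three | rnaseq_only_unprobed

# Toy example (hypothetical gene IDs)
validated = validated_genes(
    de_microarray={"A", "B"},
    de_prjeb11591={"A", "C", "D"},
    de_tcga={"A", "C", "D", "E"},
    microarray_probed={"A", "B", "C"},
)
```

Here gene A is validated by all three datasets and gene D by both RNA-Seq datasets in the absence of a probe, whereas gene C is not validated because it has a probe but is not differentially expressed on the microarray.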

Evaluation of potential biological functions
To identify genes positively and negatively coexpressed with the differentially expressed lncRNAs, pairwise Pearson correlation between each lncRNA and all genes was calculated. Genes with an absolute r ≥ 0.7 and a significant correlation (P-value < 0.05) were considered coexpressed. Enrichment of Gene Ontology terms was evaluated with Enrichr [28,29]. Terms with an adjusted P-value from Fisher's exact test ≤ 0.05 were considered significantly enriched.

LncRNAs differentially expressed in thyroid neoplasms
Clustering analysis was performed for the microarray and PRJEB11591 datasets. It showed strong clustering of ATC, clustering of clPTC and weak clustering of fvPTC. Surprisingly, there was no clustering within the groups of FTC and FA (Fig. 2).

LncRNAs common for benign and malignant thyroid neoplasms
LINC02555 and LINC02471 are among the top 5 differentially expressed lncRNAs in all studied thyroid neoplasms, including FA (Fig. 3, 4, Table 2). These lncRNAs were validated in clPTC and fvPTC and are differentially expressed in papillary carcinomas compared to FA.

LncRNAs common for differentiated thyroid carcinomas
There are 32 lncRNAs differentially expressed in all studied histological subtypes of differentiated carcinomas (FTC, clPTC, fvPTC) but not in FA (Fig. 3). Of these, 6 lncRNAs were validated and significantly differentially expressed in clPTC and fvPTC compared to FA (Fig. 4, Table 2). None of the 32 lncRNAs was differentially expressed in FTC compared to FA.

LncRNAs specific for papillary carcinomas
There are 22 genes that are differentially expressed in both clPTC and fvPTC but not in follicular neoplasms (Fig. 3), and that were validated and significantly differentially expressed compared to FA and FTC (Fig. 4, Table 3). These lncRNAs are associated with papillary features in thyroid carcinomas.

LncRNAs specific for histological subtypes of differentiated carcinomas
There are 20 lncRNAs aberrantly expressed in FTC but not in the other studied neoplasms, and significantly differentially expressed compared to PTC (Table 4). However, none of these lncRNAs was differentially expressed compared to FA.
Thirty-two genes were differentially expressed in clPTC but not in the other differentiated carcinomas or FA, validated, and significantly differentially expressed compared to fvPTC, FTC and FA; these are the lncRNAs specific for clPTC (Fig. 3, 4, Table 5).
Of the 29 genes differentially expressed in fvPTC but not in the other differentiated carcinomas or FA (Fig. 3), only ENSG00000257647 is specific for fvPTC, that is, validated and significantly differentially expressed in fvPTC compared to FA, FTC and clPTC.

LncRNAs specific for ATC
ATC samples were available only in the microarray dataset, which also included the two variants of PTC. Of 376 lncRNAs differentially expressed in ATC, 252 were not differentially expressed in the other investigated histological subtypes, and 185 were significantly differentially expressed compared to clPTC and fvPTC; these are the lncRNAs specific for ATC. The top 30 genes are shown in Table 6; the full list is in Additional file 5.

Potential biological functions of aberrantly expressed lncRNAs
The putative functions were as follows (Fig. 5): lncRNAs common for all studied thyroid neoplasms: L1CAM interactions; specific for FTC: processing of capped intron-containing pre-mRNA; specific for papillary carcinomas: tryptophan metabolism; specific for fvPTC: PCP/CE pathway and beta-catenin independent WNT signaling; specific for clPTC: extracellular matrix organization; specific for ATC: cell cycle and mitotic pathways.

Putative biological processes involving the common and specific lncRNAs were established. LncRNAs common for all studied thyroid neoplasms might be involved in L1CAM interactions, and those common for follicular and classical variants of papillary carcinoma in tryptophan metabolism. Tryptophan degradation to kynurenine by indoleamine 2,3-dioxygenase 1 (IDO1) is a well-characterized immunosuppressive mechanism in cancer progression, including thyroid cancer [39]. [45].
For lncRNAs specific for ATC, there is strong enrichment of cell cycle and mitotic pathways, which possibly reflects the involvement of these lncRNAs in the loss of differentiation and the high proliferation rate characteristic of ATC.

Conclusion
LncRNAs common for thyroid neoplasms, common for carcinomas with papillary features, and specific for clPTC, fvPTC, FTC and ATC were identified by analyzing the most comprehensive dataset (a combination of a microarray dataset and two RNA-Seq datasets). Similarity of the lncRNA landscapes in FTC and FA was revealed.
LncRNAs found to be specific for ATC are probably associated with anaplastic features and cancer progression.

Declarations
Ethics approval and consent to participate. Not applicable

Consent for publication. Not applicable
Availability of data and materials. The datasets used and analysed during the current study are available in the following repositories:

Figure 4
Results of the selection of common and specific lncRNA.

Putative biological process employing aberrantly expressed lncRNAs in thyroid neoplasms. En
