LncRNA-sequencing data analysis
We characterized the lncRNA expression landscape by performing deep RNA-seq on three CN and three HG-induced HUVEC samples. Quality assessment with SEQTK showed that more than 33 million raw reads were obtained for each sample and that the proportion of bases with a quality value greater than 20 (Q20) was >94%, indicating that the sequencing quality was acceptable (Table 2). After adaptor sequences and low-quality reads were filtered out, clean reads accounted for 94% of the raw reads in both groups. HISAT2 was used to map the clean reads to the human reference genome. As shown in Table 2, approximately 97% of the clean reads mapped to the reference genome. In total, we identified 40,380 lncRNAs across the six samples, comprising 387 novel and 39,993 known lncRNAs, of which 36,550 were detected in both the HG and CN HUVEC groups (Figure 1A, Supplemental Table S1). Most of the identified lncRNAs were transcribed from protein-coding exons (sense and antisense); the others originated from introns and intergenic regions (Figure 1B). Furthermore, 24,304 lncRNA transcripts were distributed across all chromosomes, with the majority located on chromosome 1 (Figure 1C).
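As an illustration of the Q20 metric reported above, the following minimal sketch computes the fraction of bases at or above a Phred quality threshold from FASTQ quality strings. It assumes Phred+33 encoding and counts scores of 20 or above, which is the usual convention for Q20; the function name and the example quality strings are hypothetical and are not taken from the study's pipeline.

```python
def q20_fraction(quality_strings, threshold=20, offset=33):
    """Return the fraction of bases whose Phred score is >= threshold.

    Assumes Phred+33 encoded quality strings (one string per read).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            # Decode the ASCII character into a Phred quality score.
            if ord(ch) - offset >= threshold:
                passing += 1
    return passing / total if total else 0.0


# Hypothetical quality lines from two reads: 'I' = Q40, '5' = Q20, '!' = Q0.
example_qualities = ["IIII5555", "!!II"]
example_q20 = q20_fraction(example_qualities)
```

In this toy input, 10 of the 12 bases reach the threshold, so the function returns 10/12.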
Identification of differentially expressed lncRNAs
The edgeR package was used to identify differentially expressed lncRNAs (DELs) between the HG-induced and CN HUVEC groups. Of these lncRNAs, 214 were significantly upregulated (log2(fold-change) > 1, FDR < 0.05) and 197 were significantly downregulated (log2(fold-change) < −1, FDR < 0.05) in response to HG exposure (Figure 2). Additionally, several DELs had a fold-change value of positive or negative infinity, meaning that these lncRNAs were completely switched on or off by HG induction. The top five upregulated DELs were NONHSAT180405.1, MSTRG.31780.5, NONHSAT086922.2, NONHSAT022138.2, NONHSAT094345.2, NONRATT027551.2; the top five downregulated DELs were NONHSAT056661.2, NONHSAT204850.1, NONHSAT217441.1, MSTRG.9798.2, and NONHSAT152502.1 (Supplemental Table S2).
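The thresholds above can be expressed as a simple classification rule. The sketch below is illustrative only: it applies the stated cutoffs (|log2 fold-change| > 1, FDR < 0.05) to values that are assumed to have been computed already, and it uses a plain ratio of group means to show how an infinite fold-change arises when a transcript is detected in only one group. It does not reproduce edgeR's statistical model, and the function names are hypothetical.

```python
import math


def log2_fold_change(mean_treated, mean_control):
    """Plain log2 ratio of group means (not edgeR's model-based estimate)."""
    if mean_treated == 0 and mean_control == 0:
        return 0.0
    if mean_control == 0:
        return math.inf   # expressed only after treatment: "switched on"
    if mean_treated == 0:
        return -math.inf  # expressed only in controls: "switched off"
    return math.log2(mean_treated / mean_control)


def classify_del(log2_fc, fdr, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Label a transcript 'up', 'down', or 'not significant'."""
    if fdr >= fdr_cutoff:
        return "not significant"
    if log2_fc > lfc_cutoff:
        return "up"
    if log2_fc < -lfc_cutoff:
        return "down"
    return "not significant"
```

For example, `classify_del(log2_fold_change(12.0, 0.0), 0.001)` returns `"up"`, because a transcript absent from the control group has a fold-change of positive infinity.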
qRT-PCR verification of DELs
To verify our findings, six DELs were randomly selected for qRT-PCR analysis, with three repeats per group and five repeats per sample. The expression trends of these lncRNAs were consistent with the sequencing results, indicating that the sequencing data were reliable (Figure 3).
Regulatory analysis of DELs and DEGs
lncRNAs exert their biological functions through cis- and trans-regulation of target genes. To evaluate the regulatory pathways associated with the lncRNAs, we assessed the differentially expressed genes (DEGs) in the same HUVEC samples. Of the 28,431 protein-coding genes screened, 778 were upregulated and 998 were downregulated by HG treatment. Comparison of the DELs and DEGs predicted a total of 945 matched lncRNA-mRNA pairs involving 126 DELs and 201 DEGs. Of these, 26 lncRNA-mRNA interactions were cis-regulatory, with either positive or negative correlations between the lncRNAs and their predicted target genes. An additional 715 interactions were trans-regulatory, including 2 that were both cis- and trans-regulatory (Supplemental Table S3).
To further understand the regulatory functions of the differentially expressed lncRNAs, all predicted target genes were annotated with GO terms and pathways using clusterProfiler. Among the enriched GO terms (Figure 4A), the most abundant in the biological process category were Mitotic cell cycle, Cell cycle, Cell division, Microtubule cytoskeleton organization, DNA replication, Chromosome segregation, Spindle organization, Cytoskeleton organization, Cholesterol biosynthetic process, and Centromere complex assembly. The most abundant GO terms in the cellular component category were Centromeric region, Chromosome, Spindle, Chromosome, Replication fork, Nuclear chromosome, Condensed nuclear chromosome, Microtubule, Microtubule cytoskeleton, Cytoskeleton, and Nucleoplasm. Among the enriched pathways (Figure 4B), the most abundant were beta-Alanine metabolism; Primary immunodeficiency; Carbohydrate digestion and absorption; Arginine and proline metabolism; Histidine metabolism; Fatty acid elongation; Homologous recombination; Colorectal cancer; Mucin type O-Glycan biosynthesis; Arrhythmogenic right ventricular cardiomyopathy (ARVC); Aldosterone-regulated sodium reabsorption; Cardiac muscle contraction; Hypertrophic cardiomyopathy (HCM); Endocrine and other factor-regulated calcium reabsorption; Dilated cardiomyopathy; Valine, leucine and isoleucine degradation; Fatty acid metabolism; Apoptosis; NF-kappa B signaling pathway; Endometrial cancer; Adrenergic signaling in cardiomyocytes; and Hippo signaling pathway.
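Enrichment analyses of this kind are commonly based on an over-representation (hypergeometric) test, which asks whether a term is annotated to more target genes than expected by chance. The following minimal sketch computes that one-sided p-value with the standard library only. It is an illustration under that assumption, not the exact procedure or settings used in this study; the function name and the toy numbers are hypothetical, and multiple-testing correction is omitted.

```python
from math import comb


def enrichment_p_value(universe, annotated, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    universe  : number of background genes
    annotated : background genes annotated with the term
    selected  : number of target genes tested
    overlap   : target genes annotated with the term
    """
    upper = min(annotated, selected)
    denominator = comb(universe, selected)
    # Sum the probabilities of observing `overlap` or more annotated genes.
    numerator = sum(
        comb(annotated, k) * comb(universe - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return numerator / denominator


# Toy numbers: 10 background genes, 4 annotated, 3 selected, 2 overlapping.
toy_p = enrichment_p_value(universe=10, annotated=4, selected=3, overlap=2)
```

With these toy numbers the p-value is 40/120, that is, one third.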
lncRNA-mRNA co-expression network
To visualize the co-expression network, lncRNA–mRNA pairs with a Pearson correlation coefficient (PCC) > 0.99 and p < 0.01 were plotted using Cytoscape. As shown in Figure 5, the lncRNA–mRNA network comprised 354 lncRNA nodes, 1,167 mRNA nodes, and 9,735 edges. Ranked by node degree, the 10 lncRNAs connected to the largest number of protein-coding genes were ENST00000600527 (degree = 241), NONHSAT037576.2 (degree = 234), NONHSAT135706.2 (degree = 233), ENST00000602127 (degree = 226), NONHSAT200243.1 (degree = 221), NONHSAT217282.1 (degree = 219), NONHSAT176260.1 (degree = 216), NONHSAT199075.1 (degree = 204), NONHSAT067063.2 (degree = 197), and NONHSAT058417.2 (degree = 192). The network also contained several known lncRNAs that were each connected to more than 100 protein-coding genes, including MUC20-OT1 (ENST00000600527 and ENST00000602127), TIMM23B-AGAP6 (ENST00000444438), DAPK1-IT1 (ENST00000431813), and ZNF528-AS1 (ENST00000594119). Taking ENST00000431813 (DAPK1-IT1) as an example, we constructed a network diagram of its interactions with mRNAs (Figure 6).
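The network construction described above can be sketched as follows: compute the PCC for every lncRNA-mRNA pair, keep pairs above the cutoff as edges, and count the edges per lncRNA to obtain its degree. This is a minimal illustration using only the standard library. The p-value filter (p < 0.01) is omitted for brevity, and the transcript names and expression values are hypothetical toy data, not values from the study.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: correlation is undefined, treat as 0
    return sxy / math.sqrt(sxx * syy)


def build_edges(lnc_expr, mrna_expr, cutoff=0.99):
    """Return (lncRNA, mRNA) pairs whose PCC exceeds the cutoff."""
    return [
        (lnc, gene)
        for lnc, lnc_values in lnc_expr.items()
        for gene, gene_values in mrna_expr.items()
        if pearson(lnc_values, gene_values) > cutoff
    ]


def lncrna_degrees(edges):
    """Count how many mRNAs each lncRNA is connected to."""
    return Counter(lnc for lnc, _ in edges)


# Hypothetical expression values across six samples.
toy_lnc = {"lncA": [1, 2, 3, 4, 5, 6], "lncB": [6, 1, 5, 2, 4, 3]}
toy_mrna = {
    "gene1": [2, 4, 6, 8, 10, 12],
    "gene2": [1.1, 2.0, 3.1, 3.9, 5.0, 6.1],
    "gene3": [5, 3, 6, 1, 2, 4],
}
toy_edges = build_edges(toy_lnc, toy_mrna)
toy_degrees = lncrna_degrees(toy_edges)
```

In the toy data, only `lncA` passes the cutoff, with edges to `gene1` and `gene2`, so its degree is 2. An edge list of this form could then be exported to a network tool for visualization.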