Flowchart
A flowchart summarizing our study is shown in Fig. 1. We screened 497 DEGs, constructed co-expression networks for the IPH and non-IPH groups, and detected gene modules. We then selected hub genes from the intersection of the DEGs and the gene modules by LASSO regression analysis, retrieved the upstream microRNAs of the hub genes from a non-coding RNA database, and identified differentially expressed miRNAs (DEMs) in a GEO dataset. MicroRNAs that are both differentially expressed and upstream of hub genes were considered to be strongly associated with carotid atherosclerotic plaque instability.
Data preprocessing and identification of DEGs
The expression matrix of the GEO dataset included 27 samples with IPH and 16 samples without IPH. A total of 497 DEGs (276 upregulated and 221 downregulated) met the criteria of FDR < 0.05 and fold change > 2 and were retained for further investigation. The volcano plot and heatmap of the DEGs are shown in Fig. 2A and 2B. We identified the five most strongly up-regulated and the five most strongly down-regulated genes; FBLN5 showed the most pronounced difference.
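The DEG selection rule above can be illustrated with a minimal Python sketch. It assumes that "fold change more than two" is applied symmetrically, i.e. |log2 fold change| > 1, which the text does not state explicitly. The function name `classify_deg` and the gene records are hypothetical and are not taken from the study data.

```python
import math

def classify_deg(log2_fc, fdr, fc_cutoff=2.0, fdr_cutoff=0.05):
    """Return 'up', 'down', or None for a single gene.

    Assumption: the fold-change cutoff is symmetric on the log2 scale.
    """
    if fdr >= fdr_cutoff:
        return None
    threshold = math.log2(fc_cutoff)  # fold change of 2 -> log2 threshold of 1
    if log2_fc > threshold:
        return "up"
    if log2_fc < -threshold:
        return "down"
    return None

# Hypothetical records: (gene, log2 fold change, FDR)
toy_results = [
    ("GENE_A", 1.8, 0.001),
    ("GENE_B", -1.4, 0.020),
    ("GENE_C", 2.5, 0.200),  # large change but not significant
    ("GENE_D", 0.3, 0.001),  # significant but small change
]

degs = {}
for gene, log2_fc, fdr in toy_results:
    direction = classify_deg(log2_fc, fdr)
    if direction is not None:
        degs[gene] = direction

print(degs)
```

Only genes that pass both cutoffs are kept, so a large but non-significant change (GENE_C) and a significant but small change (GENE_D) are both excluded.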
Evaluation of the training set's quality and the creation of a co-expression network
We matched the disease status of each sample to its expression profile. The 43 samples were clustered, and the resulting sample dendrogram and trait heatmap are shown in Additional file 1. To construct a weighted network that satisfies the scale-free topology criterion, the soft-thresholding power was set to 17, the first value at which the scale-free fit reached R² = 0.85 (Additional file 2).
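The rule for choosing the soft-thresholding power (take the first power whose scale-free fit reaches the R² cutoff) can be sketched as follows. The function `pick_soft_threshold` is a hypothetical helper, and the fit values are invented for illustration; they are not the values behind Additional file 2.

```python
def pick_soft_threshold(fit_by_power, r2_cutoff=0.85):
    """Return the smallest power whose scale-free fit R^2 meets the cutoff.

    fit_by_power maps candidate power -> scale-free topology fit R^2.
    Returns None if no candidate reaches the cutoff.
    """
    for power in sorted(fit_by_power):
        if fit_by_power[power] >= r2_cutoff:
            return power
    return None

# Invented fit values, for illustration only
toy_fit = {1: 0.10, 5: 0.42, 9: 0.63, 13: 0.79, 17: 0.86, 20: 0.88}

chosen_power = pick_soft_threshold(toy_fit)
print(chosen_power)
```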
Identification of clinically significant modules
The dynamic tree cutting method identified two modules (Fig. 2C); Table 1 shows the number of genes in each module. After relating the modules to traits, both modules correlated strongly with disease status (IPH or no IPH), as illustrated in Fig. 2D. The blue and turquoise modules were considered clinically significant and eligible for further investigation because their p-values were below 0.05.
Table 1. The total number of genes in each of the two modules
Module color | Number of genes
blue | 1189
turquoise | 2155
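Relating a module to the disease trait is commonly done by correlating a per-sample module summary value with a binary trait. The sketch below assumes a Pearson correlation between a module eigengene and an IPH indicator (1 = IPH, 0 = no IPH); the text does not say which correlation measure was used, and the eigengene values are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Invented module eigengene values for six samples, and their IPH status
toy_eigengene = [0.30, 0.25, 0.10, -0.20, -0.15, -0.30]
toy_trait = [1, 1, 1, 0, 0, 0]

module_trait_r = pearson(toy_eigengene, toy_trait)
print(round(module_trait_r, 3))
```

A correlation close to +1 or -1 indicates that the module's overall expression tracks disease status; significance would still need a separate p-value calculation.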
Identification of critical genes and GO annotation and KEGG pathway enrichment analysis
From the clinically significant modules in the co-expression network, 3344 genes were selected. The intersection of the DEGs with the genes in the two clinical modules yielded 463 genes, as shown in Figure 4. To gain deeper insight into the biological roles of these critical genes, we conducted GO annotation and KEGG pathway enrichment analyses. Fig. 2F lists the top 10 enriched GO terms and KEGG pathways of the clinically significant genes. According to the GO biological process (BP) analysis, the common hub genes were significantly enriched in positive regulation of secretion, signal release, response to steroid hormone, and neutrophil-mediated immunity. The MAPK signaling pathway, neuroactive ligand-receptor interaction, the PPAR signaling pathway, and the calcium signaling pathway were the four most markedly enriched pathways (Fig. 2F). Together, the GO and KEGG results suggest that the hub genes may play a crucial role in immunological and inflammatory responses triggered by hormone-mediated factors.
Establishment of the prediction model
Using LASSO regression analysis, we identified eight optimal prognostic genes and incorporated them into the prognostic risk model: FBLN5, FMOD, GAL, GEM, SLC14A1, SPTBN1, TMEM119 and GREM1 (Fig. 3A and 3B). To assess the relevance of the hub genes, we calculated a risk score for each patient from the gene mRNA levels and the corresponding regression coefficients. The formula for the calculation is explained in the Methods section. Risk score = (-0.31774005 × expression of FBLN5) + (-1.58406823 × expression of FMOD) + (0.04517572 × expression of GAL) + (-2.44920986 × expression of GEM) + (0.04109032 × expression of SLC14A1) + (-0.12978159 × expression of SPTBN1) + (-0.33417587 × expression of TMEM119) + (0.25663856 × expression of GREM1).
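The risk score is a weighted sum of the eight expression values, which can be written as a short Python sketch. The coefficients are copied from the formula above; the function name `risk_score` and the example expression profile are hypothetical and serve only to show the arithmetic.

```python
# Coefficients as reported in the risk score formula
COEFFICIENTS = {
    "FBLN5": -0.31774005,
    "FMOD": -1.58406823,
    "GAL": 0.04517572,
    "GEM": -2.44920986,
    "SLC14A1": 0.04109032,
    "SPTBN1": -0.12978159,
    "TMEM119": -0.33417587,
    "GREM1": 0.25663856,
}

def risk_score(expression):
    """Weighted sum of expression values for the eight model genes."""
    missing = sorted(set(COEFFICIENTS) - set(expression))
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())

# Hypothetical profile: every gene at expression level 1.0,
# so the score equals the sum of the coefficients
toy_profile = {gene: 1.0 for gene in COEFFICIENTS}
print(round(risk_score(toy_profile), 6))
```

Because six of the eight coefficients are negative, higher expression of genes such as GEM and FMOD lowers the score in this model.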
Identification of hub genes
ROC curve analysis was performed to evaluate the diagnostic significance of the prognostic genes in patients with ruptured or stable human atheromatous lesions (Fig. 3C). TMEM119 is not shown in the ROC curve because it was absent from this dataset. The mRNA levels of these genes distinguished IPH tissues accurately, indicating high predictive accuracy. This suggests that the genes selected by LASSO regression analysis are potential diagnostic biomarkers for IPH.
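The area under the ROC curve for a single gene equals the probability that a randomly chosen IPH sample has a higher value than a randomly chosen non-IPH sample, with ties counted as one half. The following sketch computes it directly from that definition. The expression values are invented, and the function name `auc` is a hypothetical helper rather than the tool used in the study.

```python
def auc(positive_values, negative_values):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5, which matches the Mann-Whitney U formulation.
    """
    wins = 0.0
    for p in positive_values:
        for n in negative_values:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_values) * len(negative_values))

# Invented values for one gene in IPH and non-IPH samples
toy_iph = [0.9, 0.8, 0.4]
toy_non_iph = [0.7, 0.3, 0.2]

gene_auc = auc(toy_iph, toy_non_iph)
print(round(gene_auc, 3))
```

For a gene that is lower in IPH, this definition gives an AUC below 0.5; its discriminative power is then read as one minus that value.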
Analysis and identification of microRNA of diagnostic genes
We retrieved 392 upstream microRNAs of the eight hub genes from starBase version 3.0. GSE11794, a dataset of ncRNAs in carotid atherosclerosis, was downloaded and subjected to differential expression analysis, which yielded 638 up-regulated and 244 down-regulated ncRNAs (P ); the volcano plot is shown in Fig. 3D. We intersected the DEMs with the upstream microRNAs and obtained 13 core upstream regulators (Fig. 3E): hsa-miR-769-5p, hsa-miR-532-3p, hsa-miR-501-3p, hsa-miR-4739, hsa-miR-455-5p, hsa-miR-421, hsa-miR-339-5p, hsa-miR-331-3p, hsa-miR-330-3p, hsa-miR-23c, hsa-miR-193a-3p, hsa-miR-133b and hsa-miR-128. These miRNAs and their closely related hub genes were imported into Cytoscape to draw an interaction network (Fig. 3F). miR-532-3p was linked to the largest number of hub genes, suggesting that it is closely related to rupture of the carotid plaque. Among the intersection genes, FBLN5 expression differed significantly after plaque rupture (Fig. 3G). In addition, we used GSE28829 and GSE100927 to draw box plots after data standardization in order to examine the difference in FBLN5 expression between IPH and non-IPH samples. FBLN5 showed high specificity.
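The two steps described here, intersecting the DEMs with the upstream miRNAs and then ranking the surviving miRNAs by how many hub genes they connect to, can be sketched as follows. All miRNA names, gene names, and edges in this example are placeholders; they do not reproduce the starBase results or the network in Fig. 3F.

```python
from collections import defaultdict

# Placeholder (miRNA, hub gene) pairs standing in for database predictions
upstream_edges = [
    ("miR-A", "GENE_1"),
    ("miR-A", "GENE_2"),
    ("miR-A", "GENE_3"),
    ("miR-B", "GENE_1"),
    ("miR-C", "GENE_2"),
    ("miR-C", "GENE_3"),
]

# Placeholder set of differentially expressed miRNAs
dems = {"miR-A", "miR-C", "miR-D"}

# Keep only edges whose miRNA is also differentially expressed
targets = defaultdict(set)
for mirna, gene in upstream_edges:
    if mirna in dems:
        targets[mirna].add(gene)

# Rank by number of linked hub genes, breaking ties by name
ranked = sorted(targets.items(), key=lambda item: (-len(item[1]), item[0]))
core_regulators = [mirna for mirna, _ in ranked]

print(core_regulators)
print(len(ranked[0][1]))
```

Here miR-B is dropped because it is not differentially expressed, and miR-D is dropped because it has no predicted hub gene target.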
Single-Cell Genomics reveals FBLN5 distribution during atherosclerosis
To explore the mechanism further, we compared the expression of the hub genes between normal and atherosclerotic arteries at the cellular level, calculating hub gene expression in individual cells from scRNA-seq data. High-quality cells were selected by applying limits on the number of mitochondrial genes, the number of genes per cell, and the number of erythrocyte genes (Additional file 3). SMC-derived cells, macrophages, CD8+ T cells, smooth muscle cells (SMCs), NK cells, and lymphatic endothelial cells were identified in the normal group (Fig. 4A). SMCs, SMC-derived cells, macrophages, common myeloid progenitors, lymphatic endothelial cells, and CD4+ T cells were identified in the disease group (Fig. 4B). Projecting FBLN5 expression onto the cell clusters showed that, in the normal group, FBLN5 was expressed with high specificity in SMC-derived cells and SMCs (Fig. 4C-D). In the disease group, however, FBLN5 was also highly expressed in endothelial cells in addition to these two cell types (Fig. 4E-F).
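The cell quality screen can be illustrated with a small filter over per-cell metrics. The text does not give the cutoffs used, so every threshold below is an assumption chosen for illustration, as is the use of percentages for the mitochondrial and erythrocyte (hemoglobin) gene content. The class `CellQC` and the example cells are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CellQC:
    barcode: str
    n_genes: int      # number of genes detected in the cell
    pct_mito: float   # percent of counts from mitochondrial genes
    pct_hb: float     # percent of counts from erythrocyte (hemoglobin) genes

# Assumed cutoffs; the study's actual limits are not stated in the text
MIN_GENES = 200
MAX_GENES = 5000
MAX_PCT_MITO = 10.0
MAX_PCT_HB = 1.0

def passes_qc(cell):
    """True if the cell satisfies all three assumed quality limits."""
    return (
        MIN_GENES <= cell.n_genes <= MAX_GENES
        and cell.pct_mito <= MAX_PCT_MITO
        and cell.pct_hb <= MAX_PCT_HB
    )

toy_cells = [
    CellQC("cell_1", 1500, 3.2, 0.1),
    CellQC("cell_2", 120, 2.0, 0.0),    # too few genes
    CellQC("cell_3", 2400, 25.0, 0.2),  # high mitochondrial content
    CellQC("cell_4", 1800, 4.0, 6.5),   # likely erythrocyte contamination
]

kept = [cell.barcode for cell in toy_cells if passes_qc(cell)]
print(kept)
```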