Our experimental process is shown in Supplementary 1.
3.1. Identification of DEGs.
A total of 806 DEGs were identified in the aortic dissection dataset GSE52093, including 45 up-regulated and 341 down-regulated genes relative to normal aortic tissue (Fig. 1a, 1b). In the atherosclerosis dataset GSE28829, 571 DEGs were identified, with 410 up-regulated and 161 down-regulated genes in the comparison between early-stage and late-stage arterial plaques (Fig. 1c, 1d). Volcano plots and clustering heatmaps of the DEGs showed clear separation between the compared groups in both datasets (Fig. 1).
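As an illustration of how DEGs are typically split into up- and down-regulated sets, the following minimal sketch filters genes by fold change and adjusted p-value. The cutoffs (|log2FC| >= 1, adjusted P < 0.05), the function name, and the toy values are assumptions made for this example; this section does not state the thresholds that were actually applied.

```python
# Hypothetical cutoffs; the thresholds actually used are not stated in this section.
LOG2FC_CUTOFF = 1.0
ADJ_P_CUTOFF = 0.05


def classify_degs(stats, log2fc_cutoff=LOG2FC_CUTOFF, adj_p_cutoff=ADJ_P_CUTOFF):
    """Split genes into up- and down-regulated sets.

    stats maps gene symbol -> (log2 fold change, adjusted p-value).
    """
    up, down = set(), set()
    for gene, (log2fc, adj_p) in stats.items():
        if adj_p >= adj_p_cutoff:
            continue  # not statistically significant
        if log2fc >= log2fc_cutoff:
            up.add(gene)
        elif log2fc <= -log2fc_cutoff:
            down.add(gene)
    return up, down


# Toy values for illustration only, not taken from GSE52093 or GSE28829.
toy_stats = {
    "GENE_A": (2.3, 0.001),
    "GENE_B": (-1.8, 0.004),
    "GENE_C": (0.4, 0.020),   # significant, but the effect is too small
    "GENE_D": (-2.5, 0.300),  # large effect, but not significant
}
up_genes, down_genes = classify_degs(toy_stats)
print(sorted(up_genes), sorted(down_genes))
```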
3.2. Discovering Common Gene Patterns in AD and AS
Intersecting the DEGs of the aortic dissection and atherosclerosis datasets yielded 23 genes that were up-regulated in both and 46 genes that were down-regulated in both, for a total of 69 shared DEGs (Fig. 2).
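The intersection step can be sketched as plain set operations, keeping only genes regulated in the same direction in both datasets. The function name and the placeholder gene symbols are hypothetical and do not correspond to the real DEG lists.

```python
def shared_degs(up_a, down_a, up_b, down_b):
    """Return genes regulated in the same direction in both datasets."""
    return up_a & up_b, down_a & down_b


# Placeholder symbols, not the real DEG lists.
ad_up, ad_down = {"G1", "G2", "G3"}, {"G4", "G5"}
as_up, as_down = {"G2", "G3", "G5"}, {"G4", "G6"}

shared_up, shared_down = shared_degs(ad_up, ad_down, as_up, as_down)
print(sorted(shared_up), sorted(shared_down))
```

In this toy input, G5 is down-regulated in one dataset and up-regulated in the other, so it appears in neither shared set.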
3.3. Enrichment Function Analysis of DEGs.
GO analysis of the 69 shared genes showed that, for biological process (BP), they were primarily enriched in muscle organ development, muscle system process, and regulation of vascular-associated smooth muscle cell proliferation, among others. For cellular component (CC), enrichment was observed mainly in the actin cytoskeleton, sarcomere, and Z disc. For molecular function (MF), the genes were primarily enriched in actin binding, phosphoric ester hydrolase activity, and phosphoric diester hydrolase activity (Fig. 3a). Pathway enrichment analysis indicated that the shared genes mainly participated in vascular smooth muscle contraction, cortisol synthesis and secretion, and ECM-receptor interaction, among other pathways (Fig. 3b).
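Enrichment of a gene list in a GO term or pathway is commonly assessed with a hypergeometric (over-representation) test. The sketch below computes the upper-tail probability of seeing at least the observed overlap. It is a generic illustration, not necessarily the procedure of the tool used here; the background size, term size, and overlap in the usage lines are hypothetical, and only the query size of 69 comes from the text.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_query, n_overlap):
    """P(X >= n_overlap) for a hypergeometric draw.

    n_background: genes in the background universe
    n_term:       background genes annotated to the term
    n_query:      genes in the query list
    n_overlap:    query genes annotated to the term
    """
    upper = min(n_term, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# 69 shared genes (from the text); the other three numbers are assumed.
p_value = hypergeom_enrichment_p(
    n_background=20000, n_term=150, n_query=69, n_overlap=6
)
print(f"{p_value:.3g}")
```

In practice the resulting p-values would also be corrected for multiple testing across all tested terms.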
3.4. PPI Network Analysis and Hub Gene Selection.
The 69 shared genes were analyzed for protein-protein interactions (PPIs) using the STRING online database (https://string-db.org/cgi/input.pl), yielding a network of 69 nodes and 194 edges. The MCODE plugin of Cytoscape was then used to extract the core network (Fig. 4a). According to the CytoHubba plugin of Cytoscape, the top 9 genes were CNN1, SPP1, SMTN, MYOCD, SLMAP, JAK2, CFL2, JAK2, and MYO1F (Fig. 4b). In the two validation datasets, GSE147026 and GSE43292, the expression of these 9 genes differed notably between the disease and control groups.
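Hub-gene ranking in a PPI network can be illustrated with the simplest topological score, node degree. CytoHubba provides several scoring methods and the text does not state which one was used, so degree is an assumption here; the function name and the placeholder edge list are also hypothetical and are not STRING interactions.

```python
from collections import Counter


def rank_by_degree(edges, top_n):
    """Rank nodes by their number of distinct interaction partners."""
    # Treat edges as undirected and drop duplicates such as (A, B) / (B, A).
    unique_edges = {frozenset(edge) for edge in edges}
    degree = Counter()
    for edge in unique_edges:
        if len(edge) != 2:  # skip self-loops such as ("G1", "G1")
            continue
        for node in edge:
            degree[node] += 1
    # Sort by descending degree, then alphabetically to break ties.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:top_n]


# Placeholder edge list, not real protein-protein interactions.
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G3", "G2")]
hubs = rank_by_degree(toy_edges, top_n=2)
print(hubs)
```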
3.5. Weighted Gene Co-Expression Network Analysis.
A total of 27 modules were identified in GSE52093 by WGCNA, and 3 modules were identified in GSE28829. The module-trait relationship heat map was drawn using the Spearman correlation coefficient to evaluate the correlation between the modules and the disease. Among the 27 modules in GSE52093, the lightgreen and midnightblue modules showed a high correlation with AD and were determined to be AD-related modules (lightgreen module: r = 0.81, p = 0.001; midnightblue module: r = 0.86, p = 4e-04) (Fig. 5b). Among the 3 modules identified in GSE28829, the turquoise module showed a high correlation with AS and was determined to be an AS-related module (turquoise module: r = 0.76, p = 2e-06) (Fig. 5d). The lightgreen module contained 384 genes, the midnightblue module contained 1062 genes, and the turquoise module contained 119 genes. All three modules were positively correlated with the disease (Fig. 5).
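The module-trait relationship can be sketched as a Spearman correlation between a module eigengene and a binary disease label. The version below uses only the standard library (average ranks for ties, then Pearson correlation on the ranks). The eigengene values and labels are toy numbers, and the function names are hypothetical rather than part of the WGCNA package.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy module eigengene per sample and disease label (1 = disease, 0 = control).
eigengene = [-0.4, -0.3, -0.2, 0.1, 0.3, 0.5]
trait = [0, 0, 0, 1, 1, 1]
module_trait_r = spearman(eigengene, trait)
print(round(module_trait_r, 3))
```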
3.6. Identification of the Final Candidate Genes.
After the three disease-associated modules were identified by WGCNA, the lightgreen and midnightblue modules were found to share 156 genes, and the midnightblue and turquoise modules shared 9 genes. The lightgreen and turquoise modules shared two genes, RGS1 and BCAT1. Four genes, CNN1, CNTN4, CKS2, and MYO1F, were identified as hub genes associated with both AD and AS in the intersection of the lightgreen, midnightblue, and turquoise modules (Fig. 6a). We then intersected the 9 hub genes from the PPI network with the 4 hub genes from WGCNA, which identified CNN1 and MYO1F as potential biomarker genes for AD and AS (Fig. 6b).
3.7. Validation of Candidate Gene Expression by RT-qPCR and in the Validation Datasets.
To confirm our findings, we verified the expression of CNN1 and MYO1F in the validation datasets GSE147026 and GSE43292. In both datasets, the expression of CNN1 and MYO1F differed significantly between the disease and control groups (Fig. 7a, 7b, 7d, and 7f). We then performed qPCR validation on 8 aortic dissection samples and 7 healthy aortic tissues. CNN1 showed low expression in aortic dissection, consistent with the dataset results (Fig. 7c), whereas the expression of MYO1F did not differ notably between aortic dissection tissues and normal arterial tissues (Fig. 7d).
3.9. ROC curve.
We therefore selected CNN1 as the candidate diagnostic gene for aortic dissection and atherosclerosis. To assess the accuracy of CNN1 as a diagnostic biomarker for AD and AS, we plotted its ROC curve. In the two discovery datasets (GSE52093 and GSE28829), the qPCR data, and the two validation datasets (GSE147026 and GSE43292), the AUC of CNN1 was 1.0, 0.805, 1.0, and 0.877, respectively (Fig. 8). This indicates that CNN1 has good diagnostic value as a biomarker for AD and AS.
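The AUC can be read as the probability that a randomly chosen disease sample receives a higher score than a randomly chosen control sample. The sketch below computes it directly from that pairwise definition. Because CNN1 is reported as lower in disease, the negated expression is used as the score; that choice, the function name, and the expression values are assumptions for illustration, not the measured data.

```python
def auc(scores_pos, scores_neg):
    """Probability that a positive outscores a negative (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy expression values, not measured data.
disease_expr = [2.1, 2.5, 3.0]
control_expr = [3.2, 4.0, 2.8]

# Lower expression should indicate disease, so negate it to form the score.
cnn1_auc = auc([-x for x in disease_expr], [-x for x in control_expr])
print(round(cnn1_auc, 3))
```

An AUC of 1.0 in this formulation means that every disease sample has lower expression than every control sample, which is more easily reached in small cohorts.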
3.10. GSEA Analysis.
To explore the function of CNN1, single-gene GSEA was performed in the two discovery datasets. In the CNN1 low-expression group of GSE52093, 19 pathways were identified. In the CNN1 low-expression group of GSE28829, 20 pathways were identified. CYTOSOLIC_DNA_SENSING_PATHWAY was enriched in both datasets. We speculate that the low expression of CNN1 in AD and AS may be related to CYTOSOLIC_DNA_SENSING_PATHWAY (Fig. 9).
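The core quantity in GSEA is the enrichment score (ES), a running sum over a ranked gene list that rises at genes in the pathway and falls otherwise. The sketch below implements the simplified unweighted form of that statistic; standard GSEA additionally weights each hit by its ranking metric and estimates significance by permutation, both omitted here. The gene ranking, for example by differential expression between low and high CNN1 groups, and the placeholder symbols are assumptions.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (maximum deviation from zero)."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up at pathway genes, step down at all other genes.
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Placeholder ranking (most up-regulated first) and placeholder gene sets.
ranking = ["A", "B", "C", "D", "E", "F"]
es_top = enrichment_score(ranking, {"A", "B"})     # set concentrated at the top
es_bottom = enrichment_score(ranking, {"E", "F"})  # set concentrated at the bottom
print(es_top, es_bottom)
```

A positive ES indicates that the pathway genes cluster near the top of the ranking, and a negative ES indicates clustering near the bottom.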
3.11. Prediction of Transcription Factors (TFs) And Chemicals-Gene Networks.
According to the ALGGEN-PROMO (upc.es) database, 84 candidate transcription factors were predicted to bind the CNN1 promoter region (Supplementary Fig. 2). Next, the correlation between the expression of CNN1 and that of these 84 transcription factors was examined in the discovery datasets. In GSE52093, 23 transcription factors met the criteria |R| > 0.5 and P < 0.05, of which 14 were positively and 9 negatively correlated with CNN1. In GSE28829, 8 transcription factors met the same criteria, of which 2 were positively and 6 negatively correlated with CNN1. In both the AD and AS discovery datasets, the transcription factors RGR and AR were positively correlated with CNN1. In AD, the correlations with CNN1 were R = 0.786, P = 0.00243 for RGR and R = 0.672, P = 0.0167 for AR. In AS, they were R = 0.777, P = 7.19e-07 for RGR and R = 0.505, P = 0.00516 for AR (Fig. 10). We speculate that the transcription factors RGR and AR likely affect the expression of CNN1 in AD and AS.
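The selection of transcription factors that correlate positively with CNN1 in both datasets can be sketched as a threshold filter followed by an intersection. The R and P values for RGR and AR are those reported above; the entry TF_X, the function name, and the data layout are hypothetical and included only to show a factor being filtered out.

```python
def consistent_positive_tfs(results_by_dataset, r_cutoff=0.5, p_cutoff=0.05):
    """Return TFs with R > r_cutoff and P < p_cutoff in every dataset."""
    passing = [
        {tf for tf, (r, p) in results.items() if r > r_cutoff and p < p_cutoff}
        for results in results_by_dataset.values()
    ]
    return set.intersection(*passing)


correlations = {
    "AD": {
        "RGR": (0.786, 0.00243),
        "AR": (0.672, 0.0167),
        "TF_X": (-0.61, 0.03),   # hypothetical entry
    },
    "AS": {
        "RGR": (0.777, 7.19e-07),
        "AR": (0.505, 0.00516),
        "TF_X": (0.55, 0.004),   # hypothetical entry
    },
}
shared_tfs = consistent_positive_tfs(correlations)
print(sorted(shared_tfs))
```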
Using the CTD database, 125 small-molecule chemicals were predicted to interact with CNN1. Of these, 56 were found to increase the mRNA or protein expression of CNN1, whereas 58 were found to reduce it. We selected the small-molecule chemicals that had been tested more than three times and drew a gene-chemical interaction network using Cytoscape. In this network, six small-molecule chemicals increase and nine decrease the mRNA or protein expression of CNN1 (Fig. 11).