Soft-threshold calculation, co-expression matrix construction, and module partitioning
Assuming a scale-free network topology, similarity and dissimilarity coefficients between genes were calculated, and a hierarchical clustering tree of the genes was constructed. Using the WGCNA package in R, the optimal soft threshold was determined to be 8 and was used to define the co-expression modules. With the soft threshold fixed, the dynamic tree cut method was used to identify preliminary modules. The modules were then clustered, and those close to each other were merged into new modules. The minimum number of genes per co-expression module was set to 30, and 14 modules were obtained, of which the gray module contained the genes that could not be assigned to any other module (Figure 1).
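As an illustration of what the soft threshold does, the following is a minimal Python sketch, not the WGCNA implementation used in the study. It assumes an unsigned network, in which the adjacency between two genes is the absolute Pearson correlation raised to the power of the soft threshold (here 8); the network type is not stated in the text. The function names and the toy expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(expr, beta=8):
    """Unsigned soft-threshold adjacency: a_ij = |cor(gene_i, gene_j)| ** beta.

    expr maps a gene name to its expression values across samples.
    Assumes every gene has non-zero variance.
    """
    genes = list(expr)
    return {
        g: {h: abs(pearson(expr[g], expr[h])) ** beta for h in genes}
        for g in genes
    }


# Hypothetical toy data: three genes measured in four samples.
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [1.0, 3.0, 2.0, 4.0],
}
adjacency = soft_threshold_adjacency(toy_expr, beta=8)
```

Raising the correlation to a high power pushes weak correlations toward zero while leaving strong ones comparatively intact, which is why a larger soft threshold yields a sparser, more scale-free-like network.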
Screening of high-risk pathogenic gene modules for AS
Using the WGCNA package, the module eigengene (feature vector) of each module was used to build a cluster tree of the modules and the disease phenotype, and the Pearson correlation coefficient between each module and AS was calculated. The yellow module (582 genes; hereafter the Classical Module; r=0.43, P=1.4e-27) and the grey60 module (59 genes; hereafter the Hematological Module; r=0.2, P=0.13) showed the strongest correlations with AS (Figure 2).
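The module-trait calculation can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the WGCNA code: the feature vector (eigengene) of a module is taken as the first principal component of the standardized expression profiles, approximated here by power iteration, and the trait is coded as 0 (control) or 1 (AS). All names and the toy data are hypothetical, and no P value is computed.

```python
import math


def standardize(row):
    """Center a gene's profile and scale it to unit sample variance."""
    n = len(row)
    mean = sum(row) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in row) / (n - 1))
    return [(v - mean) / sd for v in row]


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def module_eigengene(expr_rows, n_iter=200):
    """Approximate the first principal component across samples.

    expr_rows is a list of gene profiles (genes x samples), each with
    non-zero variance. Power iteration is applied to Z^T Z, where Z is
    the standardized matrix, starting from the first gene's profile.
    """
    z = [standardize(r) for r in expr_rows]
    n_genes, n_samples = len(z), len(z[0])
    v = list(z[0])
    for _ in range(n_iter):
        scores = [sum(z[i][j] * v[j] for j in range(n_samples)) for i in range(n_genes)]
        v = [sum(z[i][j] * scores[i] for i in range(n_genes)) for j in range(n_samples)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    # Orient the eigengene so that it follows the average standardized profile.
    mean_profile = [sum(z[i][j] for i in range(n_genes)) / n_genes for j in range(n_samples)]
    if sum(a * b for a, b in zip(v, mean_profile)) < 0:
        v = [-a for a in v]
    return v


# Hypothetical toy module (3 genes x 6 samples) and a binary disease trait.
toy_module = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 7.0],
]
toy_trait = [0, 0, 0, 1, 1, 1]
eigengene = module_eigengene(toy_module)
r_trait = pearson(eigengene, toy_trait)
```

Summarizing each module by a single eigengene reduces the module-trait comparison to one correlation per module, which is what allows the modules to be ranked against the phenotype.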
Screening of hub genes for AS
The genes in the Classical Module were analyzed with 11 different scoring methods, and the 20 top-scoring genes were identified: LOC653773, MRPL32, DPM1, DPY30, MRPL1, MRPS33, CWC15, RWDD1, COMMD6, LSM1, CETN3, SNRPG, C15orf15, ITGB3BP, HINT1, NDUFB2, LSM5, RPL23, TMEM126B, and NDUFA4 (Table 1). In the same way, the 20 top-scoring genes in the Hematological Module were identified: TSPAN9, MGLL, ABLIM3, ITGB3, ITGB5, SH3BGRL2, TREML1, SAMD14, CTTN, NAT8B, C6orf21, RBPMS2, ACRBP, GUCY1A3, AQP10, CDKN1A, GP9, ESAM, Septin 5, and MYL9 (Table 2).
Biological function annotation of AS hub genes
The genes of the Classical Module and the Hematological Module were uploaded to the DAVID website for GO and KEGG analysis to explore the biological functions of differentially expressed genes. The GO analysis showed that SRP-dependent cotranslational protein targeting to membrane, ribosome, and NADH dehydrogenase (ubiquinone) activity were significantly enriched in the Classical Module, and that platelet activation, integrin complex, and extracellular matrix binding were enriched in the Hematological Module. The KEGG analysis showed that genes in the Classical Module were associated with Parkinson's disease, Huntington's disease, and Alzheimer's disease, whereas genes in the Hematological Module were associated with platelet activation, tight junction, and ECM-receptor interaction (Figures 3 and 4; Tables 3 and 4).
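The idea behind a term-enrichment test can be sketched in Python as below. This is a simplification: DAVID reports a modified Fisher exact score (the EASE score), whereas the sketch computes the plain one-sided hypergeometric tail probability. The function name, parameter names, and toy counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_module, n_overlap):
    """One-sided hypergeometric P value for term over-representation.

    Returns P(X >= n_overlap), where X is the number of term-annotated
    genes among n_module genes drawn without replacement from a
    background of n_background genes, n_annotated of which carry the term.
    """
    upper = min(n_annotated, n_module)
    total = comb(n_background, n_module)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_module - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy counts: 10 background genes, 5 annotated with the term,
# a module of 5 genes, all 5 of which carry the annotation.
p_toy = hypergeom_enrichment_p(10, 5, 5, 5)
```

In practice the P values for all tested terms would also be corrected for multiple testing; that step is omitted here.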
miRNA-mRNA interaction networks
The miRNA analysis was performed with FunRich software. A total of 38 miRNAs were predicted to target genes in the Classical Module, including hsa-miR-22-3p, hsa-miR-32-5p, and hsa-miR-320c, and 64 miRNAs were identified as targeting genes in the Hematological Module, including hsa-let-7b-5p, hsa-let-7c-5p, and hsa-let-7e-5p (Supplementary Tables 1 and 2). The miRNA-mRNA interaction networks of the genes in the selected modules were also plotted (Figures 5A and 5B).