Immune cell expression
Using the GSE104948 microarray gene expression data set, CIBERSORT was used to estimate the relative abundance of 22 immune cell subtypes (Figure 1). Immune cell types with low estimated abundance were removed because they were essentially absent from these samples. Pearson correlation analysis was then performed on the remaining immune cell types (Figure 2). T cells CD4 memory resting were significantly positively correlated with T cells CD8 (r = 0.84, p < 0.001), B cells resting with B cells activated (r = 0.92, p < 0.001), and NK cells activated with NK cells resting (r = 0.86, p < 0.001).
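The filtering and pairwise correlation steps can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the cell fractions are made-up numbers, and the helper `drop_sparse_cell_types` with its 0.5 cutoff is a hypothetical choice, since the text does not state the filtering threshold.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def drop_sparse_cell_types(fractions, min_nonzero=0.5):
    """Keep cell types with a non-zero estimate in at least `min_nonzero`
    of the samples. The cutoff is an assumption, not taken from the text."""
    kept = {}
    for cell_type, values in fractions.items():
        nonzero_share = sum(v > 0 for v in values) / len(values)
        if nonzero_share >= min_nonzero:
            kept[cell_type] = values
    return kept

# Toy CIBERSORT-style fractions for four samples (made-up numbers).
fractions = {
    "T cells CD8": [0.10, 0.15, 0.20, 0.30],
    "T cells CD4 memory resting": [0.12, 0.18, 0.22, 0.35],
    "Eosinophils": [0.0, 0.0, 0.0, 0.0],
}

kept = drop_sparse_cell_types(fractions)
names = sorted(kept)
# Pairwise correlation between every pair of retained cell types.
corr = {(a, b): pearson_r(kept[a], kept[b]) for a in names for b in names}
```

Filtering first also avoids a division by zero in `pearson_r`, because a cell type that is zero in every sample has no variance.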
Construction of the WGCNA network
The 5000 genes with the largest expression differences were selected for WGCNA. A co-expression network satisfying the scale-free criterion (R² > 0.9) was constructed with a soft-threshold power of β = 16. Eight modules (green, magenta, blue, grey, greenyellow, royalblue, grey60, and midnightblue) were identified by average-linkage hierarchical clustering (Figure 3).
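To illustrate what the soft-threshold power does, the sketch below raises absolute gene-gene correlations to the power β to obtain a weighted adjacency, then sums each row to get a gene's connectivity. It assumes an unsigned network (the text does not say which network type was used), the correlation values are made up, and the scale-free fit R² itself is not computed here.

```python
def soft_threshold_adjacency(corr, beta):
    """Unsigned weighted adjacency a_ij = |cor_ij| ** beta.
    The diagonal is set to 0 here so that connectivity excludes self-links."""
    n = len(corr)
    return [
        [abs(corr[i][j]) ** beta if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]

def connectivity(adj):
    """Connectivity of each gene: the sum of its adjacency weights."""
    return [sum(row) for row in adj]

# Toy 3-gene correlation matrix (made-up numbers).
corr = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, -0.3],
    [0.2, -0.3, 1.0],
]

adj = soft_threshold_adjacency(corr, beta=16)
k = connectivity(adj)
```

With a power as high as 16, a correlation of 0.9 still leaves a noticeable weight while weak correlations such as 0.2 are pushed to almost zero, which is how soft thresholding suppresses noise without a hard cutoff.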
Correlation between modules and immune cell types
Correlation analysis was conducted between each module and the immune cell types retained above: B cells naive, B cells memory, plasma cells, T cells CD8, T cells CD4 memory resting, T cells follicular helper, T cells regulatory (Tregs), NK cells resting, NK cells activated, Macrophages M1, Macrophages M2, Dendritic cells resting, Dendritic cells activated, Mast cells activated, and Neutrophils. The greenyellow module was positively correlated with T cells CD4 memory resting (r = 0.46, p = 0.003), whereas the blue module was negatively correlated with T cells CD4 memory resting (r = −0.5, p < 0.001) (Figure 4).
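A module-to-cell-type correlation of this kind is commonly computed by summarizing each module as an eigengene (the first principal component of its standardized expression) and correlating that profile with the cell fraction across samples. The sketch below assumes that approach; the text does not spell out the exact procedure. The expression values and fractions are made up, and `module_eigengene` is a hypothetical helper that uses plain power iteration instead of a full SVD.

```python
from math import sqrt

def standardize(values):
    """Center to mean 0 and scale to sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def pearson_r(x, y):
    """Pearson correlation via the product of standardized values."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

def module_eigengene(expr, iterations=200):
    """First principal component across samples, found by power iteration.
    `expr` is a list of gene rows, each holding one value per sample."""
    z = [standardize(row) for row in expr]
    n_samples = len(z[0])
    # Start from the mean standardized profile so that the sign of the
    # eigengene follows the average expression of the module.
    vec = [sum(row[j] for row in z) / len(z) for j in range(n_samples)]
    for _ in range(iterations):
        # Apply Z^T Z without forming it: gene scores first, then samples.
        scores = [sum(g * v for g, v in zip(row, vec)) for row in z]
        vec = [sum(row[j] * s for row, s in zip(z, scores))
               for j in range(n_samples)]
        norm = sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return vec

# Toy module of three genes measured in five samples (made-up numbers).
module_expr = [
    [1, 2, 3, 4, 5],
    [2, 4, 5, 8, 10],
    [1, 3, 2, 5, 6],
]
# Toy estimated fraction of one immune cell type in the same samples.
cell_fraction = [0.10, 0.20, 0.25, 0.40, 0.50]

eigengene = module_eigengene(module_expr)
r = pearson_r(eigengene, cell_fraction)
```

Repeating the last two lines for every module and every retained cell type fills a module-by-cell-type table of correlation coefficients.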
Genes in the greenyellow module were selected for GO and KEGG functional enrichment analysis using the DAVID online tool (Figure 5). The most enriched biological processes were innate immune response, inflammatory response, mitotic nuclear division, leukocyte migration, and cell division. In the cellular component category, plasma membrane, cytosol, membrane, extracellular exosome, and integral component of plasma membrane were significantly enriched. In the molecular function category, the significantly enriched terms were protein binding, ATP binding, receptor activity, protein kinase binding, and microtubule binding. The KEGG pathways were mainly tuberculosis, phagosome, natural killer cell mediated cytotoxicity, osteoclast differentiation, and Staphylococcus aureus infection.
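The idea behind term enrichment can be illustrated with a hypergeometric upper-tail probability: given a background of genes, some of which carry an annotation, how likely is it to see at least the observed overlap in a module by chance? The sketch below shows the plain hypergeometric test only; DAVID itself reports a more conservative modified Fisher exact p-value (the EASE score), and the counts in the example are made up.

```python
from math import comb

def enrichment_p_value(population, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the term."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total

# Toy example: 20000 background genes, 300 annotated to a term,
# a module of 100 genes, 12 of which carry the term (made-up numbers).
p = enrichment_p_value(population=20000, annotated=300, selected=100, overlap=12)
```

Under these toy counts only about 1.5 annotated genes would be expected in the module by chance, so an overlap of 12 yields a very small p-value.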
PPI network and identification of hub genes
The STRING database was used to analyze interactions among the genes in the greenyellow module. The resulting network contained 99 nodes and 1193 edges (Figure 6). The PPI network was then ranked with the MCC algorithm in CytoHubba to identify the top ten hub genes.
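As a rough illustration of MCC (maximal clique centrality) ranking, the sketch below scores each node by summing (|C| − 1)! over the maximal cliques C that contain it, which is our reading of the published MCC definition. It is not CytoHubba's implementation: the clique enumeration is a basic Bron-Kerbosch recursion, and the four-node graph with generic labels is a toy example unrelated to the 99-node network.

```python
from math import factorial

def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques

def mcc_scores(adj):
    """MCC(v) = sum of (|C| - 1)! over maximal cliques C containing v.
    Single-node cliques are skipped so that isolated nodes score 0."""
    scores = {v: 0 for v in adj}
    for clique in maximal_cliques(adj):
        if len(clique) < 2:
            continue
        for v in clique:
            scores[v] += factorial(len(clique) - 1)
    return scores

# Toy network: a triangle A-B-C with one extra node D attached to C.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
scores = mcc_scores(build_adjacency(edges))
# Rank nodes from highest to lowest score; the first entries are the hubs.
ranking = sorted(scores, key=lambda v: (-scores[v], v))
```

In the toy graph, C belongs to both the triangle and the C-D edge, so it ranks first. When a node's neighbors share no edges, each maximal clique is a single edge and the score reduces to the node's degree.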
Hub gene validation
The GSE109108 data set was used to validate the hub genes. PBK, CEP55, CCNB1, and BUB1B were elevated in AAN patients (Figure 7). In addition, we used the Nephroseq database to compare the levels of these hub genes between AAN patients and normal controls. PBK, CEP55, CCNB1, and BUB1B were again significantly higher in AAN patients than in healthy controls (Figure 8).