Weighted Gene Correlation Network Analysis Applied to Identify the Immune Cell-related Hub Genes in ANCA Nephritis

Background: Antineutrophil cytoplasmic antibody (ANCA)-associated vasculitis (AAV) is the most common cause of rapidly progressive glomerulonephritis worldwide, but the molecular mechanisms of ANCA-associated nephritis (AAN) have not been fully elucidated. We therefore aimed to explore the potential molecular pathogenesis of AAN using bioinformatics. Results: Four hub genes, PBK, CEP55, CCNB1 and BUB1B, were identified. Expression of these four hub genes was verified to be higher in AAN than in normal controls. Conclusion: The four genes identified by integrated bioinformatics analysis may play a critical role in AAN and may offer new insights and potential therapeutic targets for AAN.


Background
The essential feature of antineutrophil cytoplasmic antibody (ANCA)-associated vasculitis (AAV) is systemic necrotizing vasculitis, which often involves blood vessels throughout the body and can develop at any age [1]. The risk of death in patients with AAV is about three times that of the general population [2]. AAV is the most common cause of rapidly progressive glomerulonephritis worldwide, with rapid progression and poor renal prognosis [3]. About 80% of AAV patients develop kidney damage during the course of the disease [4]. Despite advances in medical treatment, about 20 to 30 percent of patients still progress to kidney failure, with an unfavourable prognosis and poor quality of daily life [5]. In ANCA-associated nephritis (AAN), the pathological lesions are divided into four types: focal (more than 50% normal glomeruli), sclerotic (more than 50% globally sclerotic glomeruli), crescentic (more than 50% glomeruli with crescents) and mixed (no predominant lesion type). The most common pathological type is crescentic nephritis [3,6]. It has been reported that extracellular glomerular myeloperoxidase (MPO) deposition and activation of antigen-specific T and B cells ultimately lead to crescentic nephritis in AAN patients [7].
However, the pathophysiological mechanism of AAN remains unclear, and increasing evidence shows that immune cells play a vital role in AAN [7,8]. As AAN progresses, neutrophils and macrophages accumulate in the glomerulus [9]. Neutrophils infiltrate most prominently in necrotic glomeruli, whereas macrophages aggregate most prominently in crescents [9]. Experimental results showed that antigen-specific effector CD4+ T cells were present in peripheral blood during the acute phase of AAV, and that depletion of CD4+ T cells alleviated disease progression [10]. B cells, plasma cells and Treg cells in the circulating blood of AAV patients have been reported to be higher than those in healthy controls [11]. Monocytes have also been shown to accumulate in the crescents of necrotizing glomerulonephritis [12]. However, the mechanism by which immune inflammatory cells infiltrate renal tissue has not been fully elucidated [13]. The molecular mechanisms linking immune cells and AAN need further study, both to elucidate the pathophysiology of AAN and to guide the search for potential therapeutic targets.
Weighted gene co-expression network analysis (WGCNA) is an effective method for analysing the complex relationship between genes and phenotypes [14]. It has been used to screen for key genes in many diseases, including tumors [15,16], immune diseases [17] and kidney disease [18]. Mining hub genes within specific modules greatly narrows the range of candidate genes and ultimately identifies key genes or markers related to phenotypes, improving the precise localization of genes related to key traits [19]. In this paper, a co-expression network of immune cells and AAN genes was constructed with the WGCNA algorithm, and enrichment and protein-protein interaction (PPI) analyses were then performed on the selected module to screen out hub genes for further verification. This study is expected to provide new perspectives and potential high-quality targets for the therapy of AAN.

Construction of the WGCNA network
The 5000 most variable genes, ranked by median absolute deviation of expression, were selected for WGCNA. A scale-free co-expression network (R² > 0.9) was constructed with a soft-threshold power β = 16. Eight modules (green, magenta, blue, grey, greenyellow, royalblue, grey60 and midnightblue) were identified by average-linkage hierarchical clustering (Figure 3).

Enrichment analysis
Genes in the greenyellow module were selected for GO and KEGG functional enrichment analysis using the DAVID online tool (Figure 5). The most enriched biological processes were innate immune response, inflammatory response, mitotic nuclear division, leukocyte migration and cell division. In addition, plasma membrane, cytosol, membrane, extracellular exosome and integral component of plasma membrane were significantly enriched among cellular components. For molecular function, the significantly enriched terms were protein binding, ATP binding, receptor activity, protein kinase binding and microtubule binding. KEGG pathway analysis showed enrichment mainly in tuberculosis, phagosome, natural killer cell mediated cytotoxicity, osteoclast differentiation and Staphylococcus aureus infection.

PPI network identification of hub genes
The STRING database was used to analyse the interactions among the genes in the greenyellow module. A network of 99 nodes and 1193 edges was obtained (Figure 6). The PPI network was then processed with the CytoHubba MCC method to identify the top ten genes.
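To make the MCC ranking step concrete, the following is a minimal sketch of Maximal Clique Centrality as we understand its definition: the score of a node is the sum of (|C| - 1)! over all maximal cliques C that contain it. The node labels and edges are hypothetical and are not taken from the STRING network of this study, and the function names are ours rather than part of CytoHubba.

```python
from math import factorial


def maximal_cliques(neighbors):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)
            return
        for node in list(candidates):
            expand(current | {node},
                   candidates & neighbors[node],
                   excluded & neighbors[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(neighbors), set())
    return cliques


def mcc_scores(edges):
    """MCC(v) = sum of (clique size - 1)! over maximal cliques containing v."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    scores = {node: 0 for node in neighbors}
    for clique in maximal_cliques(neighbors):
        for node in clique:
            scores[node] += factorial(len(clique) - 1)
    return scores


# Hypothetical toy network: a triangle G1-G2-G3 with a tail G3-G4-G5.
toy_edges = [("G1", "G2"), ("G2", "G3"), ("G1", "G3"), ("G3", "G4"), ("G4", "G5")]
scores = mcc_scores(toy_edges)
# Rank nodes from highest to lowest MCC score.
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy network G3 ranks first because it belongs both to the triangle and to an additional edge. Densely interconnected genes accumulate factorially larger scores, which is why MCC tends to favour members of tight clusters.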

Hub gene validation
The GSE109108 dataset was used to validate the hub genes. The results showed that PBK, CEP55, CCNB1 and BUB1B were elevated in AAN patients (Figure 7). In addition, we used the Nephroseq database to compare the expression of these hub genes between AAN patients and normal controls. The results also showed that PBK, CEP55, CCNB1 and BUB1B were significantly higher in AAN than in the healthy group (Figure 8).

Discussion
AAV is an autoimmune disease in which proteinase 3 (PR3) and MPO, located in the neutrophil cytoplasm, are the main autoantigens, leading to destructive inflammation of vessels throughout the body [9]. Extensive crescent formation with glomerular necrosis is characteristic of renal involvement [25] and often leads to kidney failure within a short time. Despite numerous reports, the etiology and molecular pathogenesis of AAN are not yet fully clarified. It is therefore critical to strengthen research on the etiology and pathophysiological mechanisms of AAN, to reveal the underlying causes of its pathogenesis, and to explore possible therapeutic targets.
In this study, dataset GSE104948 was downloaded from the GEO database, and CIBERSORT was used to estimate immune cell composition. The module most strongly correlated with an immune cell type was then determined by WGCNA. Eight modules were identified, and the greenyellow module showed the strongest correlation with resting memory CD4 T cells. The selected module was then subjected to enrichment analysis. GO enrichment showed that genes in the selected module were mainly enriched in the innate immune response and the inflammatory response. KEGG pathway analysis showed that the genes were mainly enriched in innate immune response and inflammatory response pathways. Through PPI network analysis and the CytoHubba plugin of Cytoscape, the top 10 hub genes ranked by the MCC method were screened. Of these, the expression of PBK, CEP55, CCNB1 and BUB1B differed significantly in AAN. The potential relationship between these four genes and AAN has not been thoroughly studied and deserves further research. Gene expression was also verified in the Nephroseq database: the expression levels of PBK, CEP55, CCNB1 and BUB1B were higher in AAN patients than in normal controls and were negatively correlated with serum creatinine, which was consistent with the bioinformatics analysis.
PBK (PDZ-binding kinase), also known as T-lymphokine-activated killer cell-originated protein kinase (TOPK), is a novel mitotic serine/threonine protein kinase [26]. It is involved in multiple biological functions, such as cell proliferation and transformation, cell cycle regulation, tumorigenesis and anti-apoptotic effects [27,28]. TOPK/PBK reduces UV-induced inflammation by enhancing the stability of MKP1, thereby negatively regulating the activity of p38 [27]. Some studies have found that PBK mutations may be related to the occurrence of kidney stones [29]. TOPK/PBK expression can be detected in a variety of malignancies and is associated with aggressive tumor phenotypes; it is considered a potential therapeutic target [30,31] and is also regarded as a metastasis-promoting kinase in cancer [32].

CEP55 (centrosomal protein 55) is a vital component of cell cycle progression and plays a critical role in the final stage of cytokinesis, regulating the physical separation of the two daughter cells [33]. Recent reports suggest that CEP55 is overexpressed in a large number of tumors and is associated with poor prognosis [34]. Overexpression of CEP55 can up-regulate the PI3K/AKT signaling pathway to promote tumor migration and invasion [35].
CCNB1, a cyclin, is a regulatory subunit of cyclin-dependent kinase 1 (CDK1) and promotes mitotic cell division [36]. CCNB1 is related to cell proliferation and differentiation, and inhibition of CCNB1 expression can inhibit the proliferation of gonocytes and spermatogonia and promote apoptosis [37]. When CCNB1 is abnormally expressed, the immune system interprets it as a tumor antigen and activates humoral and cellular immunity [38].
BUB1B, also known as mitotic checkpoint serine/threonine kinase B, is a member of the spindle assembly checkpoint (SAC) protein family [39]. Loss of the spindle checkpoint and a marked increase in severe chromosome segregation defects have been shown to be associated with BUB1B inactivation [40]. BUB1B mutations result in reduced BUB1B expression, increased vulnerability to oxidative stress, and premature ovarian failure [41]. In a variety of cancers, malignant cell proliferation and poor clinical prognosis are closely related to the overexpression of BUB1B [42][43][44].
Our study has some limitations. First, it is based on public GEO data with a small sample size, and more clinical samples are needed to validate the findings. Second, the association between the selected genes and AAN has not previously been reported in the literature, so the functional importance and mechanistic role of these genes in this pathological setting still need to be clearly validated.

Conclusions
In conclusion, the results of these bioinformatics analyses and the identified hub genes provide new insight into the mechanisms of AAN pathogenesis. Further research is needed to elucidate the regulatory roles of these genes and to determine their value as clinical biomarkers or therapeutic targets.

Data acquisition and processing
Dataset GSE104948 (platform GPL22945), comprising 22 AAN patients and 18 healthy controls, was downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). The R package "WGCNA" was used to construct the co-expression network from the 5000 genes with the highest median absolute deviation in GSE104948. The series matrix file of GSE108109, comprising AAN patients (n = 15) and healthy controls (n = 6), was used for subsequent validation.
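The gene filtering step, ranking genes by median absolute deviation (MAD) and keeping the top 5000, can be sketched as follows. This is an illustrative Python version with made-up expression values and hypothetical gene names, not the R code used in the study. It uses the unscaled MAD. R's mad() multiplies by a constant of 1.4826 by default, which does not change the ranking.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median (unscaled)."""
    center = median(values)
    return median(abs(v - center) for v in values)


def top_variable_genes(expression, n_top):
    """Return the n_top gene names with the largest MAD across samples.

    expression maps a gene name to a list of expression values,
    one value per sample.
    """
    ranked = sorted(
        expression,
        key=lambda gene: median_absolute_deviation(expression[gene]),
        reverse=True,
    )
    return ranked[:n_top]


# Toy matrix with made-up values: 4 hypothetical genes x 5 samples.
toy_expression = {
    "GENE_A": [5.0, 5.1, 4.9, 5.0, 5.2],  # nearly constant
    "GENE_B": [2.0, 8.0, 3.0, 9.0, 1.0],  # highly variable
    "GENE_C": [6.0, 6.5, 7.0, 5.5, 6.2],
    "GENE_D": [1.0, 4.0, 2.0, 5.0, 3.0],
}
selected = top_variable_genes(toy_expression, n_top=2)
```

With n_top set to 5000 and the full expression matrix, the same ranking would yield the gene set passed on to network construction. MAD is less sensitive to outlying samples than the variance, which is one common reason for preferring it as a filter.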

Immune cell correlation
Based on the gene expression matrix of the GSE104948 dataset, the CIBERSORT algorithm (https://cibersort.stanford.edu/) was used to estimate the immune cell composition of each sample. The algorithm estimates the relative proportions of 22 immune cell types [20].
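The idea behind expression deconvolution can be illustrated with a deliberately simplified sketch. CIBERSORT itself fits a nu-support vector regression against a signature matrix of 22 cell types. The code below swaps in plain non-negative least squares solved by projected gradient descent, so it shows the general principle only and is not the CIBERSORT algorithm. The signature matrix, mixture and function name are hypothetical.

```python
def estimate_fractions(signature, mixture, iterations=2000):
    """Non-negative least squares by projected gradient descent.

    signature: list of rows, one per gene; each row holds the reference
               expression of that gene in every cell type.
    mixture:   bulk expression values, one per gene.
    Returns cell-type fractions normalised to sum to 1.
    """
    n_genes = len(signature)
    n_types = len(signature[0])
    # 1 / squared Frobenius norm is a safe step size: it never exceeds
    # 1 / (largest eigenvalue of S^T S).
    step = 1.0 / sum(value * value for row in signature for value in row)
    weights = [0.0] * n_types
    for _ in range(iterations):
        # Residual r = S w - m for the current weights.
        residual = [
            sum(signature[g][t] * weights[t] for t in range(n_types)) - mixture[g]
            for g in range(n_genes)
        ]
        # Gradient of 0.5 * ||S w - m||^2 is S^T r; clip at zero after the step.
        for t in range(n_types):
            gradient = sum(signature[g][t] * residual[g] for g in range(n_genes))
            weights[t] = max(0.0, weights[t] - step * gradient)
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else weights


# Hypothetical signature: 4 marker genes x 2 cell types.
toy_signature = [[10.0, 1.0], [1.0, 10.0], [5.0, 5.0], [8.0, 2.0]]
true_fractions = [0.7, 0.3]
# Noise-free bulk profile built from the known fractions.
toy_mixture = [
    sum(s * f for s, f in zip(row, true_fractions)) for row in toy_signature
]
fractions = estimate_fractions(toy_signature, toy_mixture)
```

On this noise-free toy input the estimate should recover the fractions used to build the mixture. Real bulk data are noisy and the reference profiles are correlated, which is the setting the more robust regression in CIBERSORT is intended to handle.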

WGCNA
The co-expression network of differential genes and immune cells was constructed [19]. A soft-threshold power β = 16 was selected, giving a scale-free fit of R² = 0.95. The DEGs were then clustered into modules, and the adjacency matrix was transformed into a topological overlap matrix (TOM). The minimum module size was set to 30 genes, and similar modules were merged at a height cutoff of 0.25. The correlation between each module and each immune cell type was then calculated, and the module with the highest correlation was selected for further analysis.
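The two matrix transformations named above can be sketched in a few lines. The example assumes an unsigned network, in which the adjacency is a_ij = |cor(x_i, x_j)|^β and the topological overlap is TOM_ij = (Σ_u a_iu a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij), with k_i the sum of adjacencies of gene i. The source does not state whether a signed or unsigned network was used, the expression profiles are made up, and the function names are ours rather than those of the WGCNA package.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def adjacency_matrix(profiles, beta):
    """Unsigned soft-threshold adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    n = len(profiles)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = abs(pearson(profiles[i], profiles[j])) ** beta
    # The diagonal stays 0 so a gene is not counted as its own neighbour.
    return adj


def topological_overlap(adj):
    """TOM_ij = (shared neighbours + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    connectivity = [sum(row) for row in adj]
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            denom = min(connectivity[i], connectivity[j]) + 1.0 - adj[i][j]
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / denom
    return tom


# Made-up profiles: genes 0 and 1 are perfectly correlated, gene 2 has r = -0.8.
profiles = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
adj = adjacency_matrix(profiles, beta=16)
tom = topological_overlap(adj)
```

Raising |r| = 0.8 to the power 16 gives an adjacency of about 0.03, which shows how strongly β = 16 suppresses moderate correlations. The overlap between genes 0 and 2 is roughly twice that adjacency because they share gene 1 as a neighbour. In WGCNA, 1 - TOM is then used as the dissimilarity for hierarchical clustering into modules.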

PPI network and hub gene identification
The genes in the selected module were imported into STRING, the Search Tool for the Retrieval of Interacting Genes.

Authors' contributions
DZ and JJ designed the experiments. DZ wrote the paper. WFH and JSZ analyzed the data and prepared figures. All authors read and approved the final manuscript.