Bioinformatics-based Identication of the Mechanism Whereby Astragalus Membranaceus Inhibits Inammation and Autophagy in Lupus Nephritis

Background. Lupus nephritis (LN) is an autoimmune disease, and effective treatment can improve the prognosis of patients. Several studies have demonstrated that Astragalus membranaceus can effectively suppress the progression of LN, but the underlying mechanism is still unclear. Thus, this study aimed to investigate the mechanism whereby Astragalus membranaceus exerts this therapeutic effect against LN. Method. Astragalus targets and LN-related targets were obtained from the TCMSP database and WGCNA analysis, respectively. Then we obtain the compound targets and constructed the protein interaction network of the compound targets. The top 10 core targets were identied. And We have validated it. We performed GO and KEGG enrichment analyses on the compound targets to identify the functional and genomic pathways enriched among these targets. Molecular docking of Astragalus membranaceus with the core targets was performed . Result. We identied 214 AM targets and 102 LN-related targets. Among them, we obtained 10 core targets, IL-1β, EGF, CCND1, CASP3, STAT1, PTGS2, and PPARγ were found to have high diagnostic values for LN. In the validation dataset GSE99339, all the core targets were signicantly expressed, except for EGF deletion. The results of the KEGG enrichment analysis showed that 7 out of 23 valid targets were signicantly enriched in the mitogen-activated protein kinase pathway (p=1.14E-05). The molecular-docking results showed that AM with IL-1β, CASP3, STAT1, and PPARγ had good binding properties. Conclusion. The therapeutic effect of AM against LN might be related to suppression of inammation and autophagy, and the bioinformatics-based results presented here provide a theoretical basis for the clinical application of AM in LN.


Introduction
Systemic lupus erythematosus (SLE) is an autoimmune disease with multi-organ involvement 1 . Lupus nephritis (LN) is one of the most severe complications of SLE. Approximately 5-20% of LN cases progress to end-stage renal disease (ESRD), a signi cant cause of death in SLE 2,3 . The treatment of LN mainly relies on hormones and immunosuppressants. However, the majority of patients are still not effectively controlled and progress to ESRD. Therefore, it is crucial to identify effective targets to suppress the progression of LN.
Astragalus membranaceus (AM) is a legume that has a variety of bioactive components such as astragaloside, astragalus polysaccharide and astragalus avonoids 4,5 . And it is one of the most widely used herbal remedies for kidney diseases. Studies have proved that it has the effect of reducing proteinuria, reducing creatinine, anti-in ammation and regulating immunity. AM signi cantly reduces the expression of in ammatory genes and cytokines in LN and suppresses the progression of this disease 6, 7 .
However, the factors involved and their speci c mechanisms remain unclear.
Weighted gene co-expression network analysis (WGCNA) is a systematic biological analysis that identi es the core genes associated with a disease. Network pharmacology has unique advantages in analyzing herbal medicines with multiple components and targets. Therefore, this study aimed to investigate the core genes and possible mechanisms whereby AM suppresses LN. The ow diagram of this study was showed in Fig. 1.We present the protocol in accordance with the STARD reporting checklist 2. Materials And Methods

Acquisition of the therapeutic targets of AM
The bioactive ingredients of AM and their corresponding target genes were retrieved from the Traditional Chinese Medicine Systematic Pharmacology Database and Analysis Platform (TCMSP, http://lsp.nwu.edu.cn/tcmsp.php). The screening criteria for the bioactive ingredients were to meet both oral bioavailability (OB) ≥ 30% and drug-likeness (DL) ≥ 0.18. We removed duplicate, non-human, meaningless targets, and the remaining targets were entered into the Uniprot database and converted into genes to obtain the therapeutic targets of AM.

Lupus nephritis microarray data collection and preprocessing
We searched for "lupus nephritis" in the National Center for Biotechnology Information Gene Expression Database (GEO, http: //www.ncbi.nlm.nih.gov/GEO/) and downloaded the data that are from gene microarrays and gene array platforms and that met the criteria. Then the microarray data were annotated for the next step of WGCNA. The screening criteria were as follows: 1) the study species was human, and the microarray data were derived from the kidney, not blood or urine; 2) the sample size was > 20; and 3) both LN group and healthy control group were included, and the complete gene microarray data and platform data were available.

Construction of the weighted gene co-expression network
Signi cant outlier samples were rst identi ed and removed by calculating the inter-array correlation of the dataset. We calculated the Pearson correlation coe cient between gene expressions for the remaining samples by applying the power adjacency function in the "WGCNA" package of R (3.6. 3) software. Afterward, we ltered the best soft threshold (β) by the "pick Soft Threshold" function to construct a scale-free gene co-expression network.
The "dissimilarity" function in the "WGCNA" package was used to rst cluster the genes in the microarray dataset and construct a hierarchical clustering tree to cluster the genes with similar functions into the same gene module. Then, we set the minimum number of genes in each module to 50 and the cut height to 0.25 and used the Dynamic tree cut method to divide the gene modules. We calculated the module eigengene (ME), modular membership (MM), and gene signi cance (GS), and the modular genes with high GS and high MM were selected as the LN-related targets.
2.4 Construction of protein-protein interaction (PPI) and "drug-bioactive ingredients-compound targets-disease" interaction networks We applied the "Venn Diagram" package in R3.6.3 software to match the targets of AM with the targets related to LN to obtain the AM-LN compound targets. The compound targets were imported into the STRING database (https://www.string-db.org), and the species was restricted to "Homo sapiens" to obtain the PPI network. The PPI network was visualized and analyzed using the Cytoscape 3.8.2 software. The top 10 target genes were selected as the core targets by using degree value. The compound targets and their corresponding bioactive ingredients were also imported into Cytoscape 3.8.2 to generate a "drugbioactive ingredients-compound targets-disease" network.

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses
We used the "Cluster Pro ler GO" and "Cluster Pro ler KEGG" packages in R to perform GO and KEGG enrichment analyses on the compound targets. Then, we generated histograms and bubble plots.
2.6 Clinical internal validation of the core genes GSE32591 clinical data from the European Renal Disease Database (https://www.nephroseq.org/resource/login.html) were downloaded. After screening for the variables via logistic regression analysis, a columnar plot of the risk of developing LN was constructed and validated. Then, we calculated the area under the ROC curve (AUC) for 10 core genes by using the "pROC" package of R to assess the diagnostic value of each gene. We identi ed the genes with AUC > 0.7 as clinically signi cant for LN.

External veri cation
The GEO database (http://www.ncbi.nlm.nih.gov/geo) was searched again for "lupus nephritis" to identify another LN dataset-GSE99339. We again compared the expression levels of the 10 core genes between the LN group and the healthy control group in GSE99339 microarray data to further validate the clinical signi cance of the 10 core genes for LN.

Molecular docking
We identi ed four key genes after combining the degree and AUC values, and we downloaded the crystal structures of the proteins encoded by these key targets and the 3D structures of the bioactive ingredients corresponding to the most bioactive targets from the Protein Data Bank (PDB, http://www.pdb.org/) and PubChem database, respectively. We processed the 3D structures and bioactive ingredients by using the PyMol software. Then, we performed molecular docking by using the AutoDock Tools 1.5.6 software after adding hydrogen atoms and calculating the charges. A Lamarckian genetic algorithm was used to record the lowest binding energy and the best corresponding conformation 8-10 . Binding energy < 0 indicated that the ligand can bind to its receptor spontaneously, and a binding energy ≤ -5.0 kJ·mol -1 was considered to be an excellent binding property.

Acquisition of AM targets
We identi ed 87 bioactive ingredients of AM from the TCMSP database. Among them, 17 satis ed the conditions of OB ≥ 30% and DL ≥ 0.18 (Table 1). The obtained action targets were entered into the Uniprot database (https://www.uniprot.org/) to eliminate any meaningless target. Finally, we obtained 302 target genes as the action targets of AM.

Weighted gene co-expression network analysis
We performed WGCNA on 46 samples and 12548 genes. After normalization, we detected no signi cant outlier samples. A scale-free network was constructed by choosing β = 5 (scale-free R 2 = 0.99) (Fig. 2).
The minimum number of genes per module was set to 50, and the shearing height was 0.25 for dynamic shearing. Consequently, 12 gene modules were obtained (Fig. 3). The number of genes contained in each module is shown in Table 2. By calculating ME, MM, and GS, the modules were correlated with clinical features. We considered the blue module (R = 0.84 and P = 3e-13) as the central module (Fig. 4), and the module gene as the most closely related to LN. g.s 5A and 5B show the connectivity alongside the importance of the genes within the blue modules in the LN and healthy groups, respectively, with p < 0.05.

PPI network and "drug-bioactive ingredients-compound targets-disease" interaction network
We screened 24 AM and LN compound targets by using the R software VennDiagram data package (Fig. 6) and imported 24 compound targets into the STRING database for protein-interaction analysis, PPI network diagram (Fig. 7A), and string interactions. We imported the tsv les into the Cytoscape 3.8.2 software, and the top 10 targets, namely IL-1β, EGF, CASP3, CCND1, STAT1, PTGS2, PPARγ, AR, CXCL10, and KDR (ordered according to degree value), were identi ed as the core targets of AM in the treatment of LN (Table 3 and Fig. 7B). We used Cytoscape 3.8.2 to construct the "drug-bioactive ingredients-compound targets-disease" network diagram. The nodes correspond to the drug name, the bioactive ingredient, core target, and disease (Fig. 8). The full names of the bioactive ingredients and their corresponding abbreviations are shown in Table 4. HQ17 (quercetin) connects with the most targets (n = 153), followed by HQ13 (kaempferol) (n = 85). Therefore, quercetin and kaempferol are the rst and second main bioactive components of AM, respectively.

GO and KEGG enrichment analyses
GO and KEGG enrichment analyses were performed on 24 compound targets by using the "Bioconductor" function in R. The GO enrichment results showed that 24 compound targets were involved in cell growth and regulation and were related to biological processes such as endothelial cell proliferation (Fig. 9A, B).
To gain further insight into the pharmacological mechanisms of AM in LN, we performed the KEGG pathway analysis. Finally, 88 biologically signi cant pathways were obtained (q-value < 0.01 and corrected p-value < 0.01), among which 7 of the 23 genes (CASP3, EGF, KDR, IL-1β, IKBKB, INSR, and PRKCB) were enriched in the mitogen-activated protein kinase (MAPK) signaling pathway with p-value = 1.14E-05, suggesting that AM may interfere with this signaling pathway in the treatment of LN (Fig. 9C, D).

Clinical internal validation of the core genes
The nomogram (Fig. 10A) showed good distinction and calibration in predicting the risk of LN, with the AUC value of 0.886 (Fig. 10B). The calibration curve showed good agreement between the LN risk predicted by the nomogram and the actual risk of LN (Fig. 10C).
The AUC values of the top 10 core genes identi ed from the GSE32591 dataset were calculated using the "pROC" function in the R package to investigate the diagnostic performance of the core genes for LN. The results showed that IL1Β, EGF, CCDN1, CASP3, STAT1, PTGS2, AR, and PPARγ had diagnostic values for LN. Among the screened core genes, CCDN1, STAT1, and PPARγ had the highest diagnostic values (Fig. 11).

External validation
Another LN dataset, GSE99339, was identi ed from the GEO database to compare the expression levels of the core genes between the LN and healthy groups. The results showed that the core genes, except for EGF, (IL1, CCND1, CASP3, STAT1, PTGS2, PPARγ, AR, CXCL10, and KDR) were signi cantly upregulated in GSE99339 (Fig. 12).

Molecular docking
We used the Autodock software to molecularly dock four key targets (STAT1, PPARγ, IL-1β, and CASP3) with quercetin, the rst major bioactive component of AM. Low binding energy between a ligand and the corresponding receptor indicates a stable conformation and a high possibility of action. The docking results showed that the binding energies of the 4 key targets were < -5 kJ·mol -1 , indicating that quercetin binds well to the key target protein (Table 5 and Fig. 13).

Discussion
The results of our study showed that there are 24 compound targets of AM in LN. After PPI network analysis, we obtained 10 core targets according to degree value. Further, we performed ROC analysis to validate the core targets internally, and we found that the 10 core genes had high diagnostic values for LN, among which four key target genes, namely CASP3, IL-1β, STAT1, and PPARγ, had the highest diagnostic values. We also performed differential gene expression analysis on the external dataset GSE99339, which showed that the 10 core genes had signi cant differential expression levels in LN. Further, molecular docking revealed that four key targets in LN had the highest binding properties to quercetin, the rst bioactive component of AM. We performed GO and KEGG analyses on the 24 compound targets separately and found that the key target genes CASP3 and IL-1β were enriched in the MAPK signaling pathway. Therefore, we speculated that AM suppresses LN by regulating in ammation at least through the MAPK signaling pathway.
Autophagy is an intracellular catabolic modality that maintains cellular homeostasis. Defects in autophagy are involved in the pathogenesis of SLE 7,11,12 . Studies have shown that autophagy is involved in the process of foot-cell injury in LN 13 . The MAPK cascade is a key signaling pathway regulating autophagy, cell proliferation and differentiation, apoptosis, and cellular stress response 14,15 .
This cascade inhibits autophagy by activating mTOR 16, 17 . Our ndings revealed that the key target genes of AM in LN are enriched in the MAPK signaling pathway, and therefore we hypothesized that AM suppresses LN by regulating the MAPK signaling pathway. Indeed, autophagy is generally accompanied by apoptosis.
Caspase-3 is an essential member of the caspase family, an executor of the apoptotic pathway 18 . Several studies have shown activated caspase-3 in glomerular cells and in ltrated in ammatory cells (neutrophils and macrophages) in LN, and the increase in the number of apoptotic cells correlates with the severity of the glomerular lesions, demonstrating the involvement of apoptosis in the associated kidney injury 19 . Similarly, our analysis of the differentially expressed genes in LN revealed that caspase-3 expression increases in the kidneys of LN patients, and external datasets con rmed this result. We, therefore, hypothesize that regulation of caspase-3 expression may reduce apoptosis and thus treat LN.
In addition, our results showed increased expression of IL-1β in LN. Studies have demonstrated that the development of LN is closely related to in ammation 20 . IL-1β is involved in the pathogenesis of autoimmune diseases 21 , and it acts as a pro-in ammatory cytokine and coordinates the in ammatory response by activating the NF-κB signaling pathway and promoting the release of cyclooxygenase-2 and interferon-γ 22 . In patients with LN, renal tubular epithelial cells play an essential role in the tubular interstitial in ammation by producing several in ammatory mediators, including interleukin (IL)-6 and IL- inactivates in ammatory cytokines 27 . In ammation is closely related to autophagy; in ammatory mediators can inhibit autophagy, and impaired autophagy can also induce in ammatory responses. Therefore, we hypothesized that AM plays a role in regulating the expression of the key target gene IL-1β, which may be associated with the regulation of autophagy, to attenuate the in ammatory response and thereby improves the prognosis of LN.
Our study found that PPARγ and STAT1 were also signi cantly upregulated and had diagnostic values for LN. Peroxisome proliferator-activated receptor γ (PPARγ), a nuclear transcription factor, can regulate thylakoid cell proliferation, prevent trans-differentiation, inhibit broblast proliferation and macrophage in ltration, and suppress in ammatory mediators. Activated PPARγ can ameliorate renal in ammation and brosis by inhibiting the TLR4/MyD88/NF-κB pathway 28 . Studies have shown that STAT1 can accelerate the progression of LN by activating interferons, regulating the expansion and differentiation of T and B cells, and disrupting immune tolerance. Furthermore, the STAT1 expression in peripheral blood mononuclear cells was elevated in patients with SLE, and this elevation correlates with disease activity 29 . STAT1 has good binding properties for the rst bioactive ingredient of AM. Therefore, we speculate that AM may also suppress LN by interfering with PPARγ and STAT1. However, the exact mechanism needs to be further explored.
In this study, we found that the key target genes of AM that are involved in the treatment of LN are all associated with quercetin, which is a vital bioactive component of AM and has anti-oxidative, antiapoptotic, and anti-in ammatory properties 30,31 . Quercetin can suppress hepatic brosis and atherosclerosis through signaling pathways, such as the MAPK and SIRT1 pathways, in hepatic and cardiovascular diseases 32,33 . Studies have demonstrated that quercetin can play a therapeutic role in LN by inhibiting the activation of the nuclear factor-κB signaling pathway and suppressing the excessive proliferation of thylakoid cells. Accordingly, our results indicate that quercetin regulates the expression of caspase-3 and IL-1β through the MAPK signaling pathway, consequently regulating autophagy and suppressing apoptosis and cellular in ammatory response to treat LN.
This study has some limitations. First, the small sample size of the microarray data may have had some in uence on the results. Second, although we performed clinical validation, the results of this study should be veri ed via extensive clinical studies and experimental validation.

Conclusion
In this study, we predicted that AM for LN might be related to in ammation and autophagy by combining network pharmacology, WGCNA, and molecular docking method, which provides a theoretical basis for clinical application of AM in the treatment of LN. The ow diagram of the study.    The connectivity alongside the importance of the genes within the blue modules in the LN(A) and healthy groups (B), p < 0.05.

Figure 6
The Venn diagram of genes among of AM-LN-related targets.  "Drug-bioactive ingredients-compound targets-disease" interaction network.