Integrated bioinformatics analysis for the screening of hub genes and therapeutic drugs in severe acute respiratory syndrome corona virus 2 infection/COVID 19

Severe acute respiratory syndrome corona virus 2 (SARS-CoV-2) infection (COVID 19) is a progressive viral infection that has been investigated extensively. However, the genetic features and molecular pathogenesis underlying SARS-CoV-2 infection remain unclear. Here we used bioinformatics to investigate candidate genes associated with the molecular pathogenesis of SARS-CoV-2 infection. An expression profiling by high throughput sequencing dataset (GSE149273) was downloaded from the Gene Expression Omnibus (GEO), and the differentially expressed genes (DEGs) between remdesivir treated SARS-CoV-2 infection samples and non treated SARS-CoV-2 infection samples, with an adjusted P-value < 0.05 and a |log fold change (FC)| > 1.3, were first identified with the limma package in R. Next, pathway and Gene Ontology (GO) enrichment analyses of these DEGs were performed. Then, the hub genes were identified with the Network Analyzer plugin, and other bioinformatics approaches, including protein-protein interaction (PPI) network analysis, module analysis, and construction of target gene - miRNA and target gene - TF regulatory networks, were also performed. Finally, receiver operating characteristic (ROC) analyses were performed to assess the diagnostic value of the hub genes. A total of 909 DEGs were identified, including 453 up regulated genes and 457 down regulated genes. In the pathway and GO enrichment analysis, the up regulated genes were mainly linked with influenza A and defense response, whereas the down regulated genes were mainly linked with drug metabolism - cytochrome P450 and reproductive process. Additionally, 10 hub genes (VCAM1, IKBKE, STAT1, IL7R, ISG15, E2F1, ZBTB16, TFAP4, ATP6V1B1 and APBB1) were identified. ROC analysis showed that hub genes (CIITA, HSPA6, MYD88, SOCS3, TNFRSF10A, ADH1A, CACNA2D2, DUSP9, FMO5 and PDE1A) had good diagnostic values. In summary, the data may provide new insights into the pathogenesis and treatment of SARS-CoV-2 infection.
Hub genes and candidate drugs may improve individualized diagnosis and therapy for SARS-CoV-2 infection in the future.


Introduction
In December 2019, a novel corona virus, called severe acute respiratory syndrome corona virus 2 (SARS-CoV-2) or novel corona virus 2019 (2019-nCoV), a single-stranded, nonsegmented, enveloped RNA virus, emerged and spread rapidly from its origin in China to the rest of the globe [1]. Symptoms of this viral infection range in severity from those of the common cold to severe illness, and the infection can finally lead to death. Although great progress has been made in antivirals and vaccination for SARS-CoV-2 infection, survival rates remain poor. Since the precise molecular pathogenesis of SARS-CoV-2 infection (virus replication and dissemination) remains unknown, it is essential to examine the molecular pathogenesis and to develop effective therapeutic strategies to control the disease [2].

Gene ontology (GO) enrichment analysis for DEGs
ToppGene (ToppFun) (https://toppgene.cchmc.org/enrichment.jsp) [17] was used to perform GO enrichment analyses of the DEGs. GO annotations (http://www.geneontology.org) [18] were used through this online tool to characterize the functions of the DEGs. Terms from the biological process (BP), cellular component (CC) and molecular function (MF) categories were recorded for each set of genes. A p < 0.05 was considered statistically significant for all analyses.
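The statistical test that ToppGene applies is not described in this section. As an illustration only, over-representation of a GO term in a gene list is commonly scored with a one-sided hypergeometric test, sketched below. The function name, its parameters and the toy counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_deg, n_overlap):
    """Return P(X >= n_overlap) for a hypergeometric draw.

    n_background: genes in the background set (assumed annotation universe)
    n_term:       background genes annotated to the GO term
    n_deg:        genes in the DEG list
    n_overlap:    DEGs annotated to the GO term
    """
    max_k = min(n_term, n_deg)
    total = comb(n_background, n_deg)
    # Sum the probability mass for overlaps at least as large as observed.
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_deg - k)
        for k in range(n_overlap, max_k + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 20 background genes, 5 annotated to the term,
# 4 DEGs, 3 of which carry the annotation.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
is_enriched = p_value < 0.05  # same cutoff as stated in the text
```

In practice the raw p-values of all tested terms would also be corrected for multiple testing; that step is omitted here.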

Validation of hub genes
In order to identify the diagnostic value of up and down regulated hub genes in SARS-CoV-2 infection,  Table 1. Heatmaps, shown in Fig. 3 and Fig. 4, respectively, indicated that these up and down regulated genes distinguished remdesivir treated SARS-CoV-2 infection samples from non treated SARS-CoV-2 infection samples well.
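The diagnostic value of a single gene in an ROC analysis is usually summarized by the area under the curve (AUC). The following is a minimal sketch of that quantity, assuming that one group of samples is treated as the positive class and that a gene's expression value is used directly as the score. The function name and the expression values are hypothetical and do not come from GSE149273.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative sample (ties count as 0.5)."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical expression values of one hub gene in the two sample groups.
group_a = [5.1, 6.3, 7.0]  # assumed positive class
group_b = [4.2, 5.5, 4.9]  # assumed negative class
auc = roc_auc(group_a, group_b)  # 8 of the 9 pairs are ordered correctly
```

An AUC of 0.5 corresponds to no discrimination between the groups, and 1.0 to perfect separation.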

Pathway enrichment analysis for DEGs
To further understand the function and mechanism of the identified up and down regulated genes, pathway enrichment analyses were performed using the ToppGene web tool. The main pathways that were particularly enriched by up regulated genes were pyrimidine deoxyribonucleosides degradation, tryptophan degradation to 2-amino-3-carboxymuconate semialdehyde, influenza A, cytokine-cytokine receptor interaction, IL23-mediated signaling events, direct p53 effectors, cytokine signaling in immune system, interferon signaling, C21 steroid hormone metabolism, purine metabolism, genes encoding secreted soluble factors, ensemble of genes encoding ECM-associated proteins including ECM-affiliated proteins, ECM regulators and secreted factors, toll receptor signaling pathway, inflammation mediated by chemokine and cytokine signaling pathway, JAK-STAT signaling, purine metabolic, steroidogenesis and pyrimidine metabolism; these are listed in Table 2. Similarly, down regulated genes were notably enriched in pyridoxal 5'-phosphate salvage, glutamine degradation/glutamate biosynthesis, drug metabolism - cytochrome P450, chemical carcinogenesis, signaling events mediated by the hedgehog family, glypican 2 network, GPCR ligand binding, phase 2 - plateau phase, glycolysis, gluconeogenesis, type III secretion system, genes encoding secreted soluble factors, ensemble of genes encoding ECM-associated proteins including ECM-affiliated proteins, ECM regulators and secreted factors, notch signaling pathway, TGF-beta signaling pathway, notch signaling, wnt signaling, sulfate/sulfite metabolism and leukotriene C4 synthesis deficiency; these are listed in Table 3.

Gene ontology (GO) enrichment analysis for DEGs
GO term enrichment analyses were performed using the ToppGene web tool. Table 4 and Table 5 show the functions of the identified up and down regulated genes. In the BP category, up regulated genes were associated with defense response and response to external biotic stimulus, whereas down regulated genes were associated with reproductive process and positive regulation of transcription by RNA polymerase II. In the CC category, up regulated genes were associated with cell surface and external side of plasma membrane, whereas down regulated genes were associated with intrinsic component of plasma membrane and nuclear chromatin. In the MF category, up regulated genes were associated with cytokine activity and receptor ligand activity, whereas down regulated genes were associated with transporter activity and cation transmembrane transporter activity.

PPI network construction and module analysis
The PPI network of up regulated genes, consisting of 206 nodes and 412 edges, was constructed from the IMEX database (Fig. 5). The top hub genes were selected with the Network Analyzer (Table 6) and included VCAM1, IKBKE, STAT1, IL7R, ISG15, PML, NOS2, FBXO6, IRF1, IRF7, ADAM8, SBK1, ARL14 and TGM2. Scatter plots of the node degree distribution, betweenness centrality, stress centrality, closeness centrality and clustering coefficient are displayed in Fig. 6A-6E. Enrichment analyses revealed that hub genes in this PPI network were mainly associated with malaria, influenza A, defense response, cytokine-cytokine receptor interaction, cytokine signaling in immune system, direct p53 effectors, ATF-2 transcription factor network, adaptive immune system, IL6-mediated signaling events, measles, innate immune system and ensemble of genes encoding ECM-associated proteins including ECM-affiliated proteins, ECM regulators and secreted factors.
Similarly, the PPI network of down regulated genes, consisting of 206 nodes and 412 edges, was constructed from the IMEX database (Fig. 7). The top hub genes were selected with the Network Analyzer (Table 6) and included E2F1, ZBTB16, TFAP4, ATP6V1B1, APBB1, ELF5, CBX2, USP2, ERP27, DSCAML1, KCNF1, DLX3, EGFL6 and AMIGO1. Scatter plots of the node degree distribution, betweenness centrality, stress centrality, closeness centrality and clustering coefficient are displayed in Fig. 8A-8E. Enrichment analyses revealed that hub genes in this PPI network were mainly associated with notch-mediated HES/HEY network, map kinase inactivation of SMRT corepressor, positive regulation of transcription by RNA polymerase II, iron uptake and transport, positive regulation of RNA metabolic process, nuclear chromatin, reproductive process, positive regulation of developmental process, de novo pyrimidine ribonucleotides biosynthesis, neuronal system, transcription regulatory region sequence-specific DNA binding, signaling receptor binding and molecular function regulator.
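To illustrate the topological measures used to rank hub genes, the sketch below computes node degree, closeness centrality, betweenness centrality and clustering coefficient for a small undirected network. It is a simplified stand-in written for illustration, not the Network Analyzer implementation. The node labels and edges are hypothetical, stress centrality is omitted, and betweenness is left unnormalized.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Undirected adjacency sets from (node_a, node_b) pairs; self-loops dropped."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def degree(adj):
    return {v: len(nbrs) for v, nbrs in adj.items()}


def clustering_coefficient(adj):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    out = {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            out[v] = 0.0
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        out[v] = 2.0 * links / (k * (k - 1))
    return out


def closeness_and_betweenness(adj):
    """Closeness (reciprocal mean distance to reachable nodes) and
    unnormalized betweenness via Brandes' algorithm on an unweighted graph."""
    closeness = {}
    betweenness = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}
        sigma = defaultdict(float)  # number of shortest paths from s
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        total = sum(dist.values())
        closeness[s] = (len(dist) - 1) / total if total else 0.0
        delta = defaultdict(float)
        for w in reversed(order):  # accumulate path dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                betweenness[w] += delta[w]
    # Each undirected shortest path was counted once from each endpoint.
    return closeness, {v: b / 2.0 for v, b in betweenness.items()}


# Hypothetical toy network; the labels are placeholders, not genes from the study.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("C", "E")]
toy_adj = build_graph(toy_edges)
toy_degree = degree(toy_adj)
toy_clustering = clustering_coefficient(toy_adj)
toy_closeness, toy_betweenness = closeness_and_betweenness(toy_adj)
hub = max(toy_degree, key=toy_degree.get)  # node with the highest degree
```

In this toy network node "A" has the highest degree and also lies on the most shortest paths, which is the kind of node a degree- or betweenness-based ranking would flag as a hub.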
The PEWCC1 Cytoscape plugin was used to identify modules in the PPI networks. A total of 423 modules were created from the PPI network of up regulated genes. Four significant modules were identified, namely module 1 (44 nodes and 173 edges), module 6 (24 nodes and 69 edges), module 12 (20 nodes and 38 edges) and module 16 (18 nodes and 33 edges), and are shown in Fig. 9. Enrichment analyses revealed that hub genes in these modules were mainly associated with influenza A, measles, chemokine signaling pathway, cytokine signaling in immune system, defense response, response to external biotic stimulus and innate immune response. A total of 219 modules were created from the PPI network of down regulated genes. Four significant modules were identified, namely module 4 (87 nodes and 86 edges), module 5 (77 nodes and 76 edges), module 13 (41 nodes and 41 edges) and module 16 (29 nodes and 28 edges), and are shown in Fig. 10. Enrichment analyses revealed that hub genes in these modules were mainly associated with multi-organism reproductive process, iron uptake and transport, neuroactive ligand-receptor interaction and cell-cell signaling.
Construction of the target gene - miRNA regulatory network and the target gene - TF regulatory network for up and down regulated genes proved useful for analyzing the target genes involved in SARS-CoV-2 infection. Target genes such as CD7 [178] and ELOVL7 [179] have been implicated in the advancement of HIV infection, and these genes may also be associated with the progression of SARS-CoV-2 infection. In general, our findings suggest that novel biomarkers such as SOD2, APOL6, NPR1, NTNG2, VAV3, ZNF703, FAXC (failed axon connections homolog, metaxin like GST domain), GPR137C, ZNF704, ABCA17P, REEP1 and TRAM1L1 may play key roles in the mechanism of SARS-CoV-2 infection.
In conclusion, we conducted a comprehensive bioinformatics analysis of high throughput sequencing expression data from SARS-CoV-2 infection. Pivotal DEGs (up and down regulated genes) and pathways were identified and screened to provide a theoretical basis for potential drug target discovery and for understanding the molecular pathogenesis of SARS-CoV-2 infection. Ten hub genes, namely CIITA, HSPA6, MYD88, SOCS3, TNFRSF10A, ADH1A, CACNA2D2, DUSP9, FMO5 and PDE1A, were found to differentiate remdesivir treated SARS-CoV-2 infection from non treated SARS-CoV-2 infection. Nevertheless, additional investigations are needed to further confirm the identified up and down regulated genes and pathways in SARS-CoV-2 infection.