Microarray analysis of novel genes involved in HSV2 infection

Background: Herpes simplex virus type 2 infects the body and becomes an incurable and recurring disease. The pathogenesis of HSV-2 infection is not completely clear. Methods: We analyze the GSE18527 dataset in the GEO database in this paper to obtain distinctively displayed genes(DDGs)in the total sequential RNA of the biopsies of normal and lesioned skin groups, healed skin and lesioned skin groups of genital herpes patients, respectively.The related data of 3 cases of normal skin group, 4 cases of lesioned group and 6 cases of healed group were analyzed.The histospecic gene analysis , functional enrichment and protein interaction network analysis of the differential genes were also performed, and the critical components were selected. Results: 40 up-regulated genes and 43 down-regulated genes were isolated by differential performance assay.


Background
Herpes simplex virus type 2 (HSV-2) infection is a widespread infectious disease that is the primary cause of most clinical cases of genital ulcers worldwide; it can be activated repeatedly after infection. HSV-2 can cause meningitis, encephalitis, sacral radiculitis and myelitis.HSV-2 can cause multi-system diseases after infecting the body, and the etiology is complicated [1]. However, the pathogenesis of HSV-2 infection is not fully understood.
IFN-β plays a key role in antiviral activity in early-induced defenses to limit viral replication and transmission after HSV-2 enters the genital epithelium; previous studies con rm the inhibition of IFNβ production during HSV-2 infection. Multiple viral components during HSV-2 infection produce immune escape by inhibiting type I interferon expression through inhibition of the DNA receptor pathway, resulting in lifelong infection of the organism [2].
There was a large in ltration of T cells, monocytes/macrophages, numerous myeloid cells and a small number of plasmacytoid dendritic cells, low expression of type I interferons (IFNα and IFNβ), as well as detection of large amounts of HSV-2 antigen and HSV-2 RNA within skin lesion biopsy specimens in the acute phase of HSV-2 infection [3].Data analysis and exploration of genes associated with low expression of type I interferon during HSV-2 infection are still insu cient.Genes associated with low expression of type I interferon after HSV-2 infection were selected in this paper by database analysis and data exploration.
Therefore, we used statistical analysis and data mining to reveal the genes responsible for the low expression of type I interferon associated with HSV-2 infection. We used the GSE18527 dataset created by whole genome analysis analysis of skin biopsies during HSV-2 activation in the skin mucosa by Peng T et al. to obtain DDGs between normal and lesioned skin groups, healed skin and lesioned skin group, and then the integration of the 2 groups is taken.
This study contributes to the understanding of the etiology of immune escape after HSV-2 infection in the organism and provide new thinking for the clinical management of HSV-2 infection.

Distinctively displayed genes
The GSE18527 dataset was imported at the GEO database and differentially expressed mRNAs were analyzed in R software using the Limma package (version: 3.40.2). Adjusted P values were analyzed in GEO to correct for false positive results; "Adjusted -P < 0.05 and log2 (fold change) > 4" was de ned as the threshold to screen for DDGs. 49 upper-regulated genes and 192 lower-regulated genes were isolated in the normal and lesioned skin groups; 110 upper-regulated genes and 43 lower-regulated genes were isolated in the healed and lesioned skin groups.40 upper-regulated genes and 43 lower-regulated genes were recognized after taking crossovers for the 2 subgroups. As Table 1 shows, 36 of the 83 DDGs (43.4%, 31 up-regulated and 5 down-regulated) were mainly distributed in the skin, and most of the above genes were concerned with epidermal development, skin development, keratin-forming cell differentiation and other functions; 21 genes (fcer1a, cd1a, fpr1, fcn1, etc.) were relevant to the immune system. CXCL10 gene expression was signi cantly down-regulated with a signi cantly lower P value (P < 0.0001). Figure 1 displays the Venn diagram and heat map of the DDGs.

Histospeci c Gene analysis
Histospeci c representation of DDGs were investigated by genecards database. The most histospeci c expressed systems were the skin system (43.4%, 36/83); the blood/immune system was next (25.3%, 21/83); the nervous system (7%, 6/83); the reproductive system (2%, 2/83); and the visceral and digestive systems were the least enriched (1%, 1/83) ( Table 2). Functional and pathway enrichment analysis of DDGs GO and KEGG enrichment assay of underlying mRNAs was performed using the ClusterPro ler suite of programs in R software.Expression heat maps were presented by the R package pheatmap. 3 functional enrichment aliases, 328 enriched GO species and 21 KEGG pathways were isolated. Table 3 lists the enrichment aliases with P < 0.000005; they include Viral protein interaction with cytokine and cytokine receptor (P = 1.51065E-09) and IL-17 signaling pathway (P = 4.47045E-07 ); the MF class of chemokine receptor binding (P = 1.85516E-08 ) and G protein-coupled receptor binding (P = 5.19828E-07 ). In addition, nine enriched keywords were included, including epidermis development (1.19541E-17); response to virus (5.02351E-12), etc. Figure 2 illustrates the P values and gene counts of the 12 enriched functional items. PPI network analysis of DDGs A PPI network having 82 nodes as well as 429 edges was computed from the STRING database; the interaction score of this network was > 0.4 (Fig. 3A). The dots stand for genes and the edges stand for the linkage between genes. Red means upper-regulated genes and blue means lower-regulated genes.
We used MCODE,cytoHubba in Cytoscape to carry out gene clustering to recognize critical PPI network segments. As displayed in Fig. 3B,C,D, 3 critical segments were isolated. The above 3 critical segments are mainly concerned with the response to the virus, the development of the epidermis, and the structural components of the cytoskeleton( Table 3).

Characterization of genes of interest
We used the MCODE plugin in cytoscape to identify the top 20 genes by the MCC method, of which 8 down-regulated genes (IFIT2,IFIT3,RSAD2,gbp1,i 44l,gbp4,GBP5,cxcl8 ) were connected to the immune system.

Discussion
In this research, we based on the GSE18527 dataset in the GEO database to recognize the distinctively displayed genes(DDGs) of the total RNA after sequencing of normal skin of controls and lesioned skin of genital herpes patients, healed skin biopsies and lesioned skin, respectively; and took the crossover of the 2 sets of samples to minimize and enhance the potential pathogenic genes in HSV-2, selecting a couple of novant genes to be correlated with this disease that have not been investigated yet.
Previous studies have shown that RNA sequencing of lesion biopsies from the same sites during healing of HSV-2-infected skin tissue and at 2 and 4 weeks after healing showed no HSV-2 nucleic acids or antigens; however, IFN-γ was consistently expressed and IFN-β and IFN-α levels were very low; HSV-2 preventing the innate immune system from producing type I interferon may be a major factor in allowing the virus to break through the host mucosal defenses [3].In our study, we found a signi cant decrease in the expression level of type I interferon-related genes, which proves the accuracy of the method of this study.
We identi ed 83 DEGs in HSV-2-infected patients, including 40 up-regulated and 43 down-regulated genes, with the most differentially expressed genes in the cutaneous system, followed by the immune, nervous and reproductive systems, and the differential tissue expression could explain the life history after HSV-2 infection of the organism.HSV-2 virus invades through exposure to supraepithelial cells to produce initial infection and duplicates within the supraepithelial cells, after which HSV-2 ascends through the periaxonal sheaths of sensory nerves to the sacral ganglia of the host nervous system, where it becomes a reservoir for future outbreaks and subclinical genital viral shedding [4].
The differential genes were most expressed in the cutaneous system, suggesting that the skin is the primary barrier against HSV-2 infection and exerts a crucial role in innate and acquired immunity.Although HSV-2 can infect skin epithelial cells, immune cells and neuronal cells, it rst infects mucosal epithelial cells during sexual transmission; keratinocytes are the main cells involved in HSV-2 clinical damage.Keratin-forming cells in the skin epidermis are considered to be immunologically active cells that not only act as a physical barrier, but also recognize the source of infection and initiate the innate immune response, including secretion of various cytokines (e.g., interleukins, interferons, clonestimulating factors, growth factors, etc.), expression of cytokine receptors, recruitment of immune cells, enhancement of acquired immunity, and indirect elimination of the initial infection and prevention of subsequent infections [5].
PPI network and critical components analysis selected four genes (gbp1, gbp4, i t3, rsad2) associated with type I interferon expression. the GBPS family is induced by both type I and type II IFNs, and according to the uniprot database annotation gbp1 hydrolyzes GTP to GMP; gbp4 hydrolyzes GTP by binding GTP, GDP and GMP. The cGAS-STING pathway performs a core role in the generation of type I interferon after identi cation of double-stranded DNA virus; the cyclic GMP-AMP (CGAMP) synthase (CGAS) upstream of the pathway is a cytoplasmic DNA sensor that triggers downstream STING by catalyzing the synthesis of cyclic GMP-AMP (CGAMP) from ATP and GTP, which ultimately leads to transcriptional elicitation of type I interferon [6].Studies have con rmed that endogenous GBPS levels can induce the release of DNA from exposed bacteria in the cytoplasm, which in turn is sensed by cGAS [7].
i t3 is an IFN-inducible antiviral protein that enhances the MAVS-mediated host antiviral response by acting as an articulator linking TBK1 to MAVS, leading to TBK1 activation and IRF3 phosphorylation, with phosphorylated IRF3 translocating into the nucleus to promote antiviral gene transcription [8].Also known as interferon-induced iron-sulfur (4FE-4S) cluster-binding antiviral protein, rsad2 functions predominantly in the type I and type II interferon-induced cellular anti-virus status and inhibits a broad diversity of DNA and RNA viruses [9].These information indicate that disorders of gbp1, gbp4, i t3, and rsad2 may contribute to the suppression of type I interferon expression leading to the production of HSV-2 immune escape.
Myelitis and encephalitis caused by HSV-2 infection are common in HSV-2 infection. We identi ed four upregulated genes ephb6,n x,mlana,tyrp1 and two downregulated genes wars1,mx1 by screening for genes speci cally expressed in nervous system tissues.ephb6 regulates cell adhesion and migration by binding to ephrin-B1 and ephrin-B2 [10].NFIX identi es and combines the palindromic sequence 5'-TTGGCNNNNGCCAA-3' present in viral and cytosolic promoters as with the adenoviral type 2 replication motif [11]. Mlana is engaged in melanin production and plays a crucial role in the production, maintenance, export and machining of the melanocyte protein PMEL, which is essential for the generation of phase II melanosomes [12].Tyrp1 plays a role in melanin biosynthesis and can regulate or in uence the type of melanin that is synthesized [13].Studies have con rmed that melanocytes are melanin-producing cells with emerging innate immune functions, including the expression of antiviral type I interferon cytokines [14]. Tryptophanyl-tRNA synthetase 1 (WARS1) is an endogenous ligand of mammalian Toll-like receptors (TLR2 and TLR4). Microarray data, using mRNA from WARS1-treated human peripheral blood mononuclear cells (PBMCs), had indicated WARS1 to mainly activate innate in ammatory responses [15].Mx1 synthesizes interferon-inducible kinesin-like GTPases with antiviral activity against a variety of RNA viruses and some DNA viruses [16].Their imbalance may contribute signi cantly in the etiology of HSV-2-induced CNS infections.
HSV-2 infection is mainly transmitted through sexual contact, and we identi ed the pla2g2a and mamdc2 gene to be distributed mainly in the genital system.pla2g2a synthesizes secreted calcium-dependent phospholipases that primarily target extracellular phospholipids and are associated with host antimicrobial defense, in ammatory response and tissue regeneration; contributing to lipid remodeling of cell membranes and production of lipid mediators involved in pathogen clearance [17]. Also, by disrupting the integrity of the mitochondrial membrane, it promotes the release of effective damage-related molecular pattern molecules in the circulation, thereby activating the innate immune response [18].Singlecell RNA sequencing analysis con rms host-length non-coding RNA MAMDC2-AS1 as a cofactor for HSV-1 nuclear transport [19].The role of pla2g2a and mamdc2 in the pathogenesis of HSV-2 infection needs to be con rmed by further experiments.
Genes associated with type I interferon expression were identi ed from GO enrichment analysis, including IFIT2, IFIT3, ISG20, MMP12, MX1, ISG15, and RSAD2.RSAD2 was signi cantly downregulated 3.246-fold and 3.103-fold in the control-lesioned and healed-lesioned groups, respectively.RSAD2 is an ancient mechanism of protection against viral infection and serves an essential and many-sided role in the innate autoimmune reaction to viral infection; RSAD2 protein interacts with viral proteins and RSAD2 protein fosters the antiviral response by directly challenging viral proteins for decay and also contributes to the ubiquitin-dependent proteasomal decay of some of the antiviral proteins with which it interacts [20].In one study, immunohistochemical staining of vaginal tissue after systemic injection of 2'3'-cGAMP in mice revealed strong expression of RSAD2 protein in cells in the induced mesenchyme; epithelial cells close to the basement membrane were positive for RSAD2 protein, while epithelial cells located on the luminal side of the canal expressed less. Vaginal tissues of mice locally infected with HSV2 had strong RSAD2 protein expression in cells at the site of infection. A few cells co-expressed RSAD2 protein and HSV-2 antigen, but cells surrounding HSV-2 infected cells expressed high levels of RSAD2 protein suggesting that RSAD2 protein is involved in the body's immune response to HSV2 infection [21].ISG15 was signi cantly downregulated 4.401-fold and 2.203-fold in control-lesioned and healed-lesioned, respectively.ISG15 can inhibit viral translation, replication and release.ISG15 was found to preferentially co-translationally bind to newly synthesized proteins, suggesting that ISGization may be a general, nonspeci c host defense mechanism [22].A study found that isg15-/-mice were prone to infection with in uenza A and B, HSV-1 and Sindbis virus [23].
In this study, we employed nding-driven analysis to identify DDGs and selected four genes (gbp1, gbp4, i t3, rsad2) associated with type I interferon expression by building a PPI network and recognizing critical components. The association of these genes with HSV-2 infection needs to be further investigated.

Conclusion
This research used the GSE18527 dataset of the GEO database to screen distinctively displayed genes in the overall RNA of skin biopsy sequences from genital herpes patients at different stages, followed by histospeci c Gene analysis, feature enrichment and protein-protein interaction network analysis to de ne critical components. The most enriched system for histospeci c Gene expression was the skin, and then the immune system and the nervous system. Functional and pathway enrichment analysis of differential genes was performed for nine enrichment keywords, including epidermal development; response to viruses, etc. We used Cytoscape for gene clustering to pinpoint critical PPI network components and de ned three key components. We used the MCODE plugin in Cytoscape to identify the top 20 genes, and on basis of the genecards database, we arti cially pinpointed four genes associated with type I interferon expression.

Microarray data
The GSE18527 dataset created from Peng T et al. was downloaded at GEO database.The dataset was derived from the GPL6255 Illumina humanRef-8 v2.0 expression beadchip platform,the experiment contained 19 samples, including 3 pre-treatment normal skin, 4 pre-treatment lesioned skin, 6 posthealing group lesioned skin biopsies and 6 post-healing group normal skin biopsies. Since this is a public dataset, age, health status and medication use of these individuals are not available. Annotated les for GPL6255 were also downloaded from GEO.

Differential expression analysis
The data were downloaded in MINiML format. Differentially expressed mRNAs were studied using the Limma software package (version: 3.40.2) of R. Adjusted P values were analyzed in GEO to correct for false positive results, and differentially expressed mRNAs were screened at a threshold de ned by Adjusted P < 0.05 and logFC > 4. We divided the samples into 2 groups, normal skin biopsy and lesioned skin group, and healed skin and lesioned skin group, and gain the DDGs after taking the crossovers for the two subgroups.Venn diagrams of DDGs were produced with web-based facility Venny (https://www.xiantao.love), and expression heat maps were presented by the R package pheatmap.

Histospeci c Gene analysis
Histospeci c expression of differential genes were analyzed by genecards database. Transcripts from individual tissues that meet the following quasi-assays are considered extremely histospeci c: based on RNA-seq from GTEx reads to obtain a list of tissues with positive differential gene expression, ploidy change values were calculated for each sample using DESeq software, and each sample read was compared to all GTEx sample reads, and genes with ploidy change values > 4 in the tissue were de ned as being positively differentially expressed in that tissue. Genes with maximum read counts below 5 across tissues were excluded from the calculation.

Functional enrichment Analysis of DDGs
Latent functions of Latent targets were identi ed through functional enrichment analysis of the data. Gene ontology (GO) containing molecular functions (MF), biological pathways (BP) and cellular components (CC) is widely used to provide indications of genes with de ned functions.kEGG enrichment analysis is practical available for analyzing gene functions as well as related high-level genomic functional information. The ClusterPro ler in R was utilized to pro le the GO function of latent mRNAs and enrich the KEGG pathway. Box line plots were performed by the R package ggplot2; PCA plots were performed by the R package ggord.
Protein interaction (PPI) network pro ling PPI network pro ling was performed via the STRING database. The DDGs were uploaded to the STRING database and the threshold of the score of the correlation was adjusted to > 0.4.
In addition, PPI networks were visualized and constructed using Cytoscape(v 3.6.0) software, and the node with the highest number of interactions with neighboring nodes was called the central node.
To identify critical PPI network components, gene network clustering analysis was performed using cytohubba and MCODE in the Cytoscape software package. Critical components were de ned by sensitivity thresholds of P < 0.05.

Statistical analysis
All statistical analyses were performed with R statistical software. For all assays, two-sided P value of less than 0.05 indicated a statistically signi cant difference. Declarations 20. Ghosh S, Marsh ENG. Viperin: An ancient radical SAM enzyme nds its place in modern cellular metabolism and innate immunity. J Biol Chem.     Figure 1 a Hea t map display of differential genes between control normal skin and lesioned skin group.b Heat map display of differential genes in the healed and lesioned skin groups.c Venn diagram Expression pro les of total RNA in control normal and lesioned groups, healed skin and lesioned groups were taken to intersect.UP represents upregulated genes,and down represents downregulated genes. Bubble diagram.12 groups of representative enrichment function terms. The different colors represent the signi cance of differential enrichment results, larger values represent smaller fdr values, and the size of the circle represents the number of enriched genes, with larger circles representing a larger number.