Bioinformatics Approach to Investigate the Genes Manifesting Alopecia Areata


 The authors have withdrawn this preprint due to author disagreement.


Background
Alopecia areata (AA) is a common type of alopecia, or hair loss, in humans. It is an autoimmune disorder with a variable course, which can be either relapsing or persistent with extensive hair loss (1). This non-scarring alopecia follows different patterns in males and females and is the second most prevalent type of alopecia (2). The peak incidence occurs at 15-29 years of age (3). According to the National Alopecia Areata Foundation (NAAF), it affects approximately 2% of the overall population (4).
It has an incidence rate of 4% in China, 2-3% in the UK and USA, and around 0.7% in India (5). Alessandro Sette and others used the IEDB (Immune Epitope Database) to study autoimmune epitope data related to AA (6). Bioinformatics approaches have been used extensively for identifying diagnostic and therapeutic biomarkers in various clinical conditions; however, in the case of alopecia, no systematic bioinformatics analysis has been reported. Due to the lack of data analytic approaches, progress in this field is very slow. In the present study, an extensive bioinformatics-based approach has been used to fill the gap in alopecia data analysis. To make this approach holistic, two AA datasets were taken from a microarray data repository. Up and down regulated genes of both datasets were selected. These were further analyzed with the DAVID and Enrichr tools for cellular component, molecular function and biological process information. The dual analytic approach of model organism data along with patient data analysis has yielded plausible pathway-based novel biomarkers.

Methodology
Data Collection and Organization: In GEO (Gene Expression Omnibus) from NCBI (National Center for Biotechnology Information), AA was selected as the disease name and 4 datasets were displayed.
Two datasets, GDS5274 and GDS5272, with a similar experimental framework were selected from the GEO database. The complete workflow for the present analysis is detailed in Figure 1. The enrichment analysis includes evaluating the enrichment significance of gene ontology (GO) terms. A p-value of <0.05 was selected as the threshold. Data retrieved in common from both enrichment tools were selected for construction of networks among the queried genes and their neighbors by PathwayLinker 2.0 (http://pathwaylinker.org/).
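The thresholding and "common to both tools" steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each tool's output is assumed to have been reduced to a mapping from term name to p-value, term names are assumed to match exactly between tools (in practice they usually need normalizing), and the toy values below are hypothetical, not results from this study.

```python
P_THRESHOLD = 0.05  # significance threshold stated in the Methodology


def significant_terms(results, threshold=P_THRESHOLD):
    """Return the set of terms whose p-value is strictly below the threshold."""
    return {term for term, p_value in results.items() if p_value < threshold}


def common_significant_terms(results_a, results_b, threshold=P_THRESHOLD):
    """Return the terms that pass the threshold in both result sets."""
    return significant_terms(results_a, threshold) & significant_terms(results_b, threshold)


# Hypothetical toy p-values standing in for the output of two enrichment tools.
tool_a = {"pathway_1": 0.001, "pathway_2": 0.20, "pathway_3": 0.03}
tool_b = {"pathway_1": 0.01, "pathway_3": 0.04, "pathway_4": 0.002}

print(sorted(common_significant_terms(tool_a, tool_b)))
```

Only terms that pass the threshold in both result sets survive, which mirrors the requirement that data be retrieved in common from both enrichment tools before network construction.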

Results And Discussion
Bioinformatics tools have greatly paved the way for exploring new vistas in finding biomarkers and targets for various diseases (7)(8)(9). In the present study, GEO, the genomic data repository, was used for data sourcing. Two GEO datasets, GDS5272 (H. sapiens) and GDS5274 (M. musculus), were extracted. Out of the 45101 genes of the model organism (M. musculus) and the 54675 genes of AA patients (H. sapiens) investigated, the top 100 up and down regulated genes were selected in each dataset and further analyzed, as shown in the Venn diagrams depicted in the workflow (Figure 1).
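The selection of the top up and down regulated genes can be illustrated with a short sketch. The text does not state which ranking statistic was used, so ranking by a signed log fold change is an assumption here, and the function name `top_regulated` and the toy values are hypothetical.

```python
def top_regulated(log_fold_changes, n=100):
    """Return (up, down): the n genes with the largest positive and the n genes
    with the most negative log fold change."""
    ranked = sorted(log_fold_changes.items(), key=lambda item: item[1], reverse=True)
    up = [gene for gene, value in ranked if value > 0][:n]
    down = [gene for gene, value in reversed(ranked) if value < 0][:n]
    return up, down


# Hypothetical toy values; a real run would use one value per gene in a dataset.
toy = {"gene_a": 3.2, "gene_b": -2.5, "gene_c": 0.4, "gene_d": -0.1, "gene_e": 1.7}
up, down = top_regulated(toy, n=2)
print(up)    # ['gene_a', 'gene_e']
print(down)  # ['gene_b', 'gene_d']
```

With `n=100` the same function would return the two lists of 100 genes per dataset that feed the Venn diagram comparison.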
In order to find the commonalities and corroboration of patient data with experimental model organism data, both datasets were analyzed in a combinatorial approach (10). Gene enrichment analysis was done to interpret the functional annotation of differentially expressed genes (11). In order to find out whether these genes were related to a particular biological process or molecular function (12), the enrichment p-value was calculated by comparing the observed frequency of an annotation term with the frequency expected by chance. Only annotations with a p-value less than 0.05 (p-value<0.05) were deemed enriched. Supplementary Figure S1.1 shows the pathways for up regulated genes of M. musculus from Enrichr and, in panel (b), those from DAVID. Five common pathways of up regulated genes from both tools were selected, as shown in Table 1. The genes associated with these pathways were further screened. The chemokine signaling pathway was identified by both tools; hence it was chosen along with the CXCL10, ITK, CXCL11, CXCL9, STAT1, STAT2, CCL5, CCL2, CXCR6 and CCR5 genes, which were validated by both tools. The up regulated genes of M. musculus mostly included chemokines. CXCL10 (C-X-C Motif Chemokine Ligand 10) is an antimicrobial gene encoding a chemokine of the CXC subfamily and a ligand for the receptor CXCR3. The present data also suggest that in AA lesions the infiltration of CXCR3+ Th1 cells around the hair bulbs might be induced by the increased activity of CXCL10, which is a Th1 chemokine, as shown in Table 1 (13).
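The idea of comparing the observed frequency of an annotation term with the frequency expected by chance, as described in the paragraph above, is commonly formalized as a hypergeometric (one-sided Fisher-type) tail probability. The sketch below implements that plain tail; it is not claimed to be the exact statistic computed by DAVID or Enrichr, which apply their own variants, and the numbers in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, overlap):
    """Probability of seeing at least `overlap` annotated genes when `sample`
    genes are drawn without replacement from `population` genes, of which
    `annotated` carry the term (upper hypergeometric tail)."""
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie between 0 and population")
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(max(overlap, 0), upper + 1)
    )
    return tail / total


# Hypothetical example: 20000 background genes, 200 carry the term,
# 100 genes are in the query list and 10 of them carry the term.
# The overlap expected by chance is 100 * 200 / 20000 = 1 gene.
p = enrichment_p_value(population=20000, annotated=200, sample=100, overlap=10)
print(p < 0.05)
```

An observed overlap far above the chance expectation yields a small tail probability, which is what the p-value<0.05 cut-off screens for.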
The innate immune response relies on distinctive receptors called PRRs (pattern-recognition receptors), which recognize conserved microbial structures called PAMPs (pathogen-associated molecular patterns). PRRs allow the innate immune response to discriminate between self and non-self antigens. Toll-like receptors (TLRs) are a group of PRRs that play an active role in the identification of danger and the initiation of the immune response (14). In the present study, as depicted in Table 1, the CXCL10, CXCL11, CXCL9, STAT1, CCL5 and SPP1 genes might be involved in the Toll-like receptor signaling pathway. Alzolibani and others confirmed that the gene expression of intracellular TLRs (TLR-3, 7, 8, 9) is higher in AA patients than in healthy individuals. Dysregulated expression of TLR-3, TLR-7, TLR-8 and TLR-9 in peripheral blood cells of AA patients, through their signaling cascade, leads to dysregulation of Th-1, Th-17 and regulatory T-cell cytokines (15). They therefore deduced that poorly regulated expression of TLRs and cytokines, arising from improper activation of TLRs, plays an important role in the pathogenesis of AA.
The present study also shows IFNG as an up regulated gene. IFNG is the interferon gamma gene, encoding a soluble cytokine that is a member of the type II interferon class. It is involved in pathways such as Herpes simplex infection and cytokine-cytokine receptor interaction, and is annotated to the extracellular space and the external side of the plasma membrane, as shown in Table 1. It is produced by lymphocytes and plays an important role in immunoregulatory functions. Duncan and others stated that feeding high levels of dietary vitamin A to mice accelerated the onset of AA, which was further associated with a decreased IFNG level (16). Results from a few other studies have also suggested that although IFNG is important for the onset of AA, the level of IFNG in the skin drops as AA advances.
Down regulated genes of M. musculus were investigated by KEGG pathway analysis, as shown in Table 2. The circadian entrainment, gastric acid secretion, insulin secretion, glutamatergic synapse, salivary secretion and GABAergic synapse pathways have been shown to be indirectly related to AA (17,18). Another study found that, in AA skin, cytokine genes controlling complex immune responses (CXCL10, CX3CL1, CCL5, CXCL1) were overexpressed (19). CX3CL1 is induced by IFN-gamma and is responsible for amplifying the polarized T-helper 1 response, indicating a Th1 type of response in AA skin. Increased expression of CCL5, driven by IL-1 beta and TNF-alpha, is also associated with heightened Th1 pathway activity (20). Bellavista and others proposed that during Herpes zoster (HZ) infection, pain could act as a stress factor that potentially triggers recurrent AA. One possibility is that HZ produces cutaneous inflammatory reactions such as the Koebner phenomenon, which in turn induces AA (21). The Koebner phenomenon, also known as koebnerization or the isomorphic response, is the formation of skin lesions on parts of the body that are not typically involved; that is, lesions appear in areas other than those usually affected by a cutaneous disease such as psoriasis (22)(23)(24)(25)(26)(27)(28).
Another up regulated gene, Granzyme B (GZMB), encodes a member of the granzyme subfamily of proteins, part of the peptidase S1 family of serine proteases, and is involved in allograft rejection and type I diabetes mellitus (Table 3). It participates in inducing apoptosis of the target cells of NK (natural killer) cells, which are part of the innate immune system, and of cytotoxic CD8+ lymphocytes. Boivin and others have revealed the role of GZMB in cleaving ECM (extracellular matrix) proteins, autoantigens, and the receptors NOTCH1 and FGFR1 (29).
This impacts the hair follicle in AA skin and changes the structure of the connective tissue layer and the signaling within the hair follicle stem cells and dermal papilla. If the extracellular matrix is damaged, its loss may lead to cell death, making room for immune cells to infiltrate the follicular space and break down its immune privilege. Thus, altered immunolocalization of GZMB by vitamin A can also cause considerable cellular damage at different follicular sites.
In GO Cellular Component, a very important and significant finding of the current investigation was the genes HLA-DRB1 and HLA-DQB1. Both genes are involved in notable pathways such as type 1 diabetes mellitus, Influenza A and toxoplasmosis, and in clathrin-coated endocytic vesicle and MHC (major histocompatibility complex) class II receptor activity, as shown in Table 3. The human leukocyte antigen (HLA) system is a gene complex responsible for the production of MHC proteins, which act at the cell surface and mediate the immune functions of the cell. In the present study, HLA genes are involved in all three GO components and pathways, as shown in Table 3. The HLA-DRB1*11:04 allele has been strongly associated with alopecia in Iraqi Arab Muslim patients and is highly associated with early onset and severe patchy AA (30). The HLA-DRB1*04 allele group is a risk factor for the development of AA, whereas the allele DRB1*0401 predominates in the populations of Belgium and Germany (31). This shows that genetic effects of the HLA system play a crucial role in familial cases of AA (32); similar findings have been corroborated by the present study. Further studies in the UK and North America have also linked DRB1*04 to the risk of AA (32,33).
The HLA-DQB1*03 allele was found in 80% of AA patients and in 92% of patients with total or universal AA (34). KEGG pathway analysis of down regulated genes of H. sapiens indicated that the HLA-DRB4, FGG, CNTNAP2, COMP and FGF18 genes are involved in the most significantly enriched pathways, as shown in Table 4.
Table 4. KEGG and GO analysis of down regulated genes of H. sapiens (p-value<0.05)
After critically examining all the data, four genes were found to be common to both organisms and to function in triggering AA. These genes were further analyzed using the PathwayLinker tool to assess their inter-relationship. The network and pathway analysis revealed the partners of these crucial genes, which are CXCL9, CXCL10, STAT1 and CCL5. All four genes were found to be up regulated (Figure 2).
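Finding the genes common to the mouse and human gene lists amounts to a set intersection once the symbols are made comparable. The sketch below matches symbols by upper-casing them, which is a naive assumption made only for illustration: mouse and human orthologs do not always share a symbol, so a real analysis would use an ortholog mapping. The input lists are hypothetical stand-ins, with `Gene_x` and `GENE_Y` as placeholder names.

```python
def shared_genes(mouse_symbols, human_symbols):
    """Return the sorted symbols present in both lists after upper-casing.

    Matching by upper-cased symbol is a simplification; it is not an
    ortholog mapping.
    """
    mouse = {symbol.strip().upper() for symbol in mouse_symbols}
    human = {symbol.strip().upper() for symbol in human_symbols}
    return sorted(mouse & human)


# Hypothetical input lists; only the four shared genes are taken from the text.
mouse_up = ["Cxcl9", "Cxcl10", "Stat1", "Ccl5", "Gene_x"]
human_up = ["CXCL9", "CXCL10", "STAT1", "CCL5", "GENE_Y"]

print(shared_genes(mouse_up, human_up))  # ['CCL5', 'CXCL10', 'CXCL9', 'STAT1']
```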
The suggested model for disease progression was constructed by taking the common nodes among all four genes, as given in the inset (Figure 3). The anagen hair follicle in AA expresses the STAT1, CXCL9, CCL5 and CXCL10 genes along with IFNG in every part of the follicular epithelium, including the area adjacent to the dermal papilla of the hair follicle (Figure 3).
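Taking the common nodes among the four genes can be read as intersecting their neighbor sets in the interaction network. The following is a minimal sketch of that reading, assuming the network has been exported as an adjacency mapping from each gene to its neighbors; the export format, the function name `common_neighbors` and the neighbor names `node_1` to `node_3` are hypothetical and do not reflect PathwayLinker output.

```python
def common_neighbors(adjacency, query_genes):
    """Return the nodes adjacent to every query gene, excluding the query genes."""
    queries = set(query_genes)
    neighbor_sets = [set(adjacency.get(gene, ())) - queries for gene in query_genes]
    if not neighbor_sets:
        return set()
    return set.intersection(*neighbor_sets)


# Hypothetical adjacency lists; node_1..node_3 are placeholders, not real partners.
network = {
    "CXCL9": ["node_1", "node_2"],
    "CXCL10": ["node_1", "node_3"],
    "STAT1": ["node_1", "node_2", "CXCL10"],
    "CCL5": ["node_1"],
}

print(common_neighbors(network, ["CXCL9", "CXCL10", "STAT1", "CCL5"]))  # {'node_1'}
```

Query genes are removed from the neighbor sets so that only shared partners, rather than the four genes themselves, are reported.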
The hair follicle is itself a complex mini-organ with a specific immune and hormonal microenvironment. Immune privilege is the most interesting aspect of hair follicle integrity. AA occurs on the collapse of the MHC class I based immune privilege of anagen hair follicles, which can be further prompted predominantly by these genes. The hair follicle enters the anagen phase of the hair cycle, in which active melanogenesis, i.e. the formation of melanin, occurs. Consequently, hair follicle autoantigens are recognised by the intrafollicular infiltrate of CD8+ T cells. Finally, an attack by CD8+ T cells on the anagen hair follicular epithelium, in the presence of a perifollicular infiltrate of CD4+ T cells, results in hair loss. The pathway generated from the analysis using the Enrichr, DAVID and PathwayLinker tools is shown in Figure 3. The pathway depicts the regulatory phenomenon of anagen, which could play a critical role in hair follicle damage in alopecia.
Figure 3. Diagrammatic representation of the alopecia areata anagen hair follicle. Inset: linked pathway.

Supplementary Files
This is a list of supplementary files associated with this preprint: Supplementarydata.pdf