Insights into Molecular Mechanisms Underlying NAFLD by Protein-Protein Interaction Network Analysis of the Differentially Expressed Genes in NASH Patients.

The efficient treatment of diseases such as non-alcoholic fatty liver disease (NAFLD), a global concern, is impeded by the limited understanding of disease complexity. We applied a systems-biology framework for a functional analysis of disease development using candidate genes. We identified the key dysregulated pathways and hub genes driving the disease. Moreover, we constructed a molecular disease network by integrating the pathways, providing an intuitive view of disease development. In addition, the druggability of the genes and their potential application in marker discovery or drug discovery were examined. Our analyses identify a series of highly overexpressed genes that also feature high centrality, yet have received little attention in the context of NAFLD development in previous research. We also found several genes with relatively low overexpression but a central place in the PPI network topology, whose potential involvement in non-alcoholic steatohepatitis might be overlooked without network biology. Our results reveal that the biological processes related to disease progression are complex. Indeed, our pathway studies identified hub genes that may reinforce poly-pharmacological cross-talk during disease progression to irreversible stages. Holistic approaches based on systems biology tools help identify more reliable targets and disease-mediating genes that can be integrated into translational studies and drug development in this area.


Background
With the emergence of extensive data, the discovery of disease-related targets in complex diseases requires a holistic view of disease ontology. To this end, biological networks can uncover complex systems-level properties. Systems biology combines experimental and computational biology to decipher the complexity of biological systems and to develop effective targeted therapeutic strategies for multi-faceted and poorly studied disorders like NAFLD. In this sense, biological networks are valuable tools for understanding disease prognosis at the molecular level. The complexity of biology, however, has made single-targeted drugs very rare. Instead, recent omics efforts have aimed to explore poly-pharmacology strategies to predict more practical multi-target therapeutics.
Generally, NAFLD is the liver manifestation of metabolic dysregulation. The disease ranges from simple steatosis (NAFL) to non-alcoholic steatohepatitis (NASH) with necro-inflammatory lesions. NASH development stems from complex interactions of metabolic reactions and stress pathways in liver cells, initiated by chronic excessive lipid accumulation, with inflammatory processes driven by various immune cell populations. Moreover, due to the disease's complicated nature, pharmacotherapy of such ailments calls for a holistic multi-targeted approach. This emerging approach takes advantage of high-content transcriptomic analysis and network pharmacology.
During the past decade, rapid advances in high-throughput technologies have brought unprecedented opportunities for the large-scale analysis of NAFLD genes/proteins. These datasets are often heterogeneous and multi-dimensional; integrating and arranging them to ascertain the key molecular mechanisms and to transform the data into meaningful biological phenomena is a major task and challenge. To meet this demand, pathway- and network-based analyses have become an integral and influential approach for elucidating the biological implications underlying complex diseases. Recent efforts have been dedicated to understanding the molecular mechanisms underlying NAFLD, mainly through the identification of genes or pathways associated with disease initiation and progression. For example, pathways such as insulin resistance, lipid metabolism, apoptosis, endoplasmic reticulum stress, mitochondrial dysfunction, and immune response have been found to contribute to NASH pathogenesis.
Additionally, with the development of high-throughput analysis technology, more and more genes/proteins have been linked to NAFLD, providing a valuable resource for analyzing the functions, biochemical pathways, and networks of candidate genes related to NAFLD. Bioinformatics analyses have also identified many hub genes that are essential to steatohepatitis development. For example, RPL36A, RPL14, UBE2A, UBE2B, PRKCA, EGFR, CDC42, VEGFA, FCGR2A, and LARP1B were identified in the GSE89632, GSE33814, and GSE48452 datasets (1,2). Hub genes such as UBQLN4, APP, SHBG, CTNNB1, COL1A1, COL1A2, CD24, COL3A1, CXCL6, DCN, EHF, FAP, LUM, PCOLCE2, and SOX9 were identified in a NAFLD-related fibrosis dataset (3,4). Recently, HNF4A was identified as central to NASH pathogenesis in the integrated NASH-related datasets GSE17470, GSE24807, GSE37031, and GSE89632 (5). Although previous transcriptomic signatures of NAFLD identify genes that are abnormally expressed or mutated and differentiate stages from healthy obesity to severe fibrosis, most studies did not precisely characterize the transcriptional pattern of these genes through disease transitions. The interplay of genes toward critical conditions such as fibrosis, and the network structure of such a complicated disorder, have also received less attention.
Existing studies were based mainly on target-based genes, and both a framework for and comprehensive examinations of driver genes are lacking. More importantly, stage-based studies of NAFLD etiology and pathology have been lacking. This is due to limited diagnostic information on the transformation from severe steatohepatitis to irreversible conditions such as fibrosis and HCC. In this study, building on the recent analyses, we aimed to identify more novel gene markers that describe how the disease progresses to fibrosis or hepatocellular carcinoma. A comprehensive analysis of the NASH-related candidate genes within a systematic framework may thus provide important insights into the molecular mechanisms underlying NASH.
In the present study, we first retrieved NASH-related microarray gene expression datasets. We analyzed differentially expressed genes (DEGs) between the healthy control (HC) and NASH groups. We then conducted Kyoto Encyclopedia of Genes and Genomes (KEGG) functional enrichment analysis and protein-protein interaction (PPI) network analysis to unravel the molecular and biochemical mechanisms underlying disease progression and to recognize their potential therapeutic relevance. We also suggest that the modular framework used here to study disease progression could be applied to other disease models. We identified a NASH transcriptomic signature strongly enriched in genes controlling immune-inflammatory processes and metabolism. Moreover, we constructed a schematic molecular network for NASH by integrating the pathways and network, providing an intuitive view of disease development.

Methods
Data source and identification of DEGs
We retrieved the gene expression dataset GSE37031 (6) from the Gene Expression Omnibus (7).
Additionally, their studies demonstrate that increased hepatic KLF6 expression directly activates TGFβ1 and its receptors; this may play a crucial role in the progression from steatosis to steatohepatitis with progressive fibrosis (11). In a systematic review of NAFLD's genetic associations, Kayleigh L. Wood et al. describe the function of KLF6 as a tumor suppressor, which confers susceptibility to non-alcoholic fatty liver disease and non-alcoholic steatohepatitis (12). Another study by the same researchers considers LEPR as the gene best poised to act as a metabolic hormone receptor (for leptin, specifically) due to its adaptive responses related to insulin resistance and increased gluconeogenesis (12).
According to a broad review of NAFLD pathophysiology by Petta S. et al., leptin expression is induced by endothelin (EDN1) expression and by the effect of angiotensin II. These events, which are generated by proinflammatory cytokines (TNF-α and IL-6) and adhesion molecules (VCAM-1 and ICAM-1), are significantly higher in patients with NAFLD than in the control group and are therefore correlated with NAFLD severity (13). JUN is a proto-oncogene whose expression is reported to cause inflammatory responses and cancerous behavior in fatty liver contexts (13). Lou Y. et al. dissected the gene regulatory network across NAFLD stages with a focus on transcription factors. Based on their study, the top 10 putative co-regulators in the gene expression data (GSE48452) included EGR1, JUN, FOS, JUNB, FOSB, and NR4A1, which have dominant significance in liver physiology (14).
Cysteine-rich angiogenic inducer 61 (CYR61) is a secreted, extracellular matrix (ECM)-associated signaling protein of the CCN family. In adults, CYR61 contributes to inflammation and tissue repair. It is associated with diseases that involve chronic inflammation, including rheumatoid arthritis, atherosclerosis, diabetes-related nephropathy and retinopathy, and many different forms of cancer.
However, recent studies demonstrate that CYR61 is also associated with hypomethylated and differentially methylated regions, and that its expression is upregulated in the livers of patients with advanced NAFLD (15). Research by Linling Ju et al. revealed that the overexpression of CYR61 is related to the overexpression of fatty acid metabolism-associated genes. In addition, in vitro experiments on murine primary hepatocytes indicate that CCN family member 1 is partially responsible for increasing intracellular TG content, pro-inflammatory cytokines, and the expression level of apoptosis-associated proteins in steatosis (16).

Reconstruction and analysis of the PPI network of DEGs
Although differential expression analysis helps describe the contribution of genes to disease progression in terms of their expression level, expression level is not the only factor that determines the importance of a gene in a disease. Given that most biological phenotypes emerge from the interaction of gene products, a more in-depth understanding of disease at the genetic level can be gained by analyzing the protein-protein interactions (PPI) of genes active in the patients. Herein, we constructed the PPI network of genes that are highly overexpressed in NAFLD patients. The network includes 623 proteins (nodes) interacting through 2933 edges. The obtained PPI network is characterized by a 5, average path length of 2.197, and a network centralization of 0.164.
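The global descriptors quoted above can be illustrated with a minimal sketch. It assumes an undirected, unweighted edge list and the degree-based (Freeman) definition of network centralization; the text does not state which tool or definition produced the reported values, so the function names and the toy star graph below are hypothetical and serve only to show the arithmetic.

```python
from collections import deque

def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def network_summary(edges):
    """Density, average path length, and degree centralization of a graph."""
    adj = build_adjacency(edges)
    n = len(adj)
    degrees = [len(nbs) for nbs in adj.values()]
    m = sum(degrees) // 2
    density = 2 * m / (n * (n - 1))
    # Average shortest-path length over all reachable ordered node pairs.
    total, pairs = 0, 0
    for node in adj:
        dist = bfs_distances(adj, node)
        total += sum(dist.values())
        pairs += len(dist) - 1
    avg_path_length = total / pairs
    # Assumed definition: Freeman degree centralization.
    k_max = max(degrees)
    centralization = sum(k_max - k for k in degrees) / ((n - 1) * (n - 2))
    return {"nodes": n, "edges": m, "density": density,
            "avg_path_length": avg_path_length,
            "centralization": centralization}

# Hypothetical toy network: one hub "H" linked to four leaves.
toy_edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D")]
summary = network_summary(toy_edges)
```

Under the same density formula, the reported 623 nodes and 2933 edges correspond to a density of about 0.015, that is, a sparse network.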
We further used social network analysis to identify the central genes of the network. Degree, betweenness, and closeness centralities were used concomitantly as a combined filter to identify the hub genes. As a result, we identified 131 genes as the network hubs. The hub genes were ranked based on their degree, betweenness, and closeness centralities. The genes common to the top 50 genes under each parameter were identified as the network's key hubs (Table 2). These genes may hold other aspects of importance that are not reflected in the expression level.
Notably, PPARA, SOD2, IL1B, FOXO, PRKAA1 (AMPK), RAF1, INSR, NF-κB, and CREBBP, which were absent from the list of our top 50 DEGs, are among the identified hub genes. The significance of these genes in the context of NAFLD initiation/progression is well documented (3). These facts demonstrate the potential of the network biology approach for identifying important disease-driving genes that may not be uncovered by relying on expression level alone.
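The combined centrality filter described in this section can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: it treats the network as undirected and unweighted, computes betweenness with Brandes' algorithm, and keeps the nodes that appear in the top k of all three rankings. The function names, the alphabetical tie-breaking rule, and the toy graph are hypothetical.

```python
from collections import deque

def build_adjacency(edges):
    """Undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_centrality(adj):
    """Number of neighbours of each node."""
    return {v: len(nbs) for v, nbs in adj.items()}

def closeness_centrality(adj):
    """Reachable nodes divided by the sum of shortest-path distances to them."""
    scores = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[s] = (len(dist) - 1) / total if total else 0.0
    return scores

def betweenness_centrality(adj):
    """Brandes' algorithm for unweighted, undirected graphs (unnormalized)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in adj}     # shortest-path predecessors
        sigma = dict.fromkeys(adj, 0)    # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):        # accumulate dependencies back to s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: b / 2 for v, b in bc.items()}

def top_k(scores, k):
    """Names of the k highest-scoring nodes (ties broken alphabetically)."""
    return set(sorted(scores, key=lambda v: (-scores[v], v))[:k])

def key_hubs(adj, k):
    """Nodes ranked in the top k by degree, betweenness, and closeness alike."""
    return (top_k(degree_centrality(adj), k)
            & top_k(betweenness_centrality(adj), k)
            & top_k(closeness_centrality(adj), k))

# Hypothetical toy network, not real protein interactions.
toy = build_adjacency([("H", "A"), ("H", "B"), ("H", "C"),
                       ("H", "D"), ("D", "E"), ("E", "F")])
hubs = key_hubs(toy, k=2)
```

Taking the intersection rather than a single ranking favors nodes that are simultaneously well connected, bridging, and close to the rest of the network, which matches the intent of a combined filter.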

Structural clustering of Hub genes network
The integrated network of the hub genes was reconstructed using the STRING server. As can be observed from Table 3, this network has a higher density, clustering coefficient, and average path length than the initial network. Using the community detection algorithm in Gephi 0.9.2 (www.gephi.com), an open-source network mapping tool (Figure 4), the network was clustered into three distinct and two small modules. For each module, the centrality parameters were calculated. Table 4 presents the lists of top-ranked genes in each module. Top-ranked genes attributed to module 1 include proto-oncogenes (MET, MYC, RAF1, MDM2) and tumor suppressors (PTEN, ATM, ERN1, BCL6). Module 2 contains genes related to the metabolic aspect of the disease, for example PPARA (with a critical role in the regulation of fatty acid uptake and beta-oxidation), INSR (a receptor tyrosine kinase that mediates the pleiotropic actions of insulin), Foxo1, G6PC, and DUSP1 (which may play an essential role in the human cellular response to environmental stress as well as in the negative regulation of cellular proliferation); together they constitute a particular characteristic of metabolic disease. These genes range from those with an energy-sensing component to those that activate compensatory response modulators.
Module 3 contains genes associated with inflammation and immune response, such as IL-1B (a member of the interleukin 1 family of cytokines, which is an important mediator of the inflammatory response and is involved in a variety of cellular activities, including cell proliferation, differentiation, and apoptosis), CXCL12 (whose encoded protein functions as the ligand for the G-protein coupled receptor chemokine (C-X-C motif) receptor 4 and plays a role in numerous diverse cellular functions, including embryogenesis, immune surveillance, inflammation response, tissue homeostasis, and tumor growth and metastasis), ICAM1 (which encodes a cell surface glycoprotein typically expressed on endothelial cells and cells of the immune system), and PTGS2 (which is responsible for the production of inflammatory prostaglandins). A further module contains liver cytochrome genes such as CYP2B6, CYP1A1, and CYP3A4, which encode members of the cytochrome P450 superfamily of enzymes. They catalyze many reactions in drug and xenobiotic metabolism and in the synthesis of cholesterol, steroids, and other lipids. The last module contains HNRNPM, HNRNPU, SNRPA1, and SRSF6, which are involved in the spliceosome and belong to a family of proteins that bind nucleic acids and function in the formation of ribonucleoprotein complexes. It is widely believed that changes in splicing may be an early event in cancer initiation. Moreover, several studies indicate a role for these changes in the manifestation of NAFLD (17). Hence, overexpression of splicing factors might be considered an early event for detecting the progression of NAFLD.
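The module structure reported in this section rests on community detection. The section does not specify the algorithm settings; Gephi's built-in community detection is modularity-based, so the sketch below assumes that the quantity being optimized is the standard Newman modularity Q of a partition. It only scores a given partition and does not search for the best one; the function name and the toy graph are hypothetical.

```python
def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of [ L_c / m - (d_c / (2 m))^2 ], where m is
    the number of edges, L_c the number of edges inside c, and d_c the sum
    of the degrees of the nodes in c.
    """
    m = len(edges)
    internal = {}     # L_c per community
    degree_sum = {}   # d_c per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Hypothetical toy network: two triangles joined by a single bridge edge.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
two_modules = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
one_module = dict.fromkeys("abcdef", 1)

q_split = modularity(toy_edges, two_modules)
q_merged = modularity(toy_edges, one_module)
```

A partition that follows the dense triangles scores higher than lumping all nodes together, which is the sense in which densely interconnected hub genes are grouped into one module.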

Gene set enrichment analysis
We conducted functional enrichment on two sets of genes: significant genes and hub genes. We searched for enriched pathways of NASH-related genes and identified 212 significantly enriched pathways for the hub genes specifically. The KEGG terms with the highest combined scores are shown in Figure 5. The Foxo signaling, PI3K-Akt signaling, and AGE-RAGE signaling pathways scored highest among the hub gene sets presented in this study. Figure 6 portrays the distribution of hub genes in the highest-ranked pathways of the disease. We also compared pathway enrichment in the community of significant genes and in our hub nodes before clustering into modules. There were negligible discrepancies, which implies that our selected hub nodes behaved in the same manner (though on a smaller scale) as the significant genes. However, we also found that some critical pathways present in our hub gene enrichment, such as "focal adhesion," "glucagon signaling pathway," and "rheumatoid arthritis," were absent or less significant in our significant gene enrichment. This illustrates how finding hub nodes among significant genes can bring out subtle details related to disease characteristics.
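As an illustration of what a pathway enrichment P-value measures, the following sketch computes a one-sided hypergeometric (over-representation) test. The section does not name the enrichment tool or how its combined score is calculated, so the choice of test, the function name, and the numbers in the example are assumptions for illustration only.

```python
from math import comb

def enrichment_p_value(n_universe, n_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) for a hypergeometric draw.

    n_universe: number of background genes
    n_pathway:  background genes annotated to the pathway
    n_selected: genes in the query list (e.g. hub genes)
    n_overlap:  query genes that are annotated to the pathway
    """
    upper = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
               for k in range(n_overlap, upper + 1))
    return tail / total

# Hypothetical numbers: 10 background genes, 4 in the pathway,
# 3 query genes of which 2 fall in the pathway.
p_small_example = enrichment_p_value(10, 4, 3, 2)
```

A small P-value means the query list contains more pathway members than expected by chance. Because many pathways are tested at once, such values are normally corrected for multiple testing before being called significant.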

Network characterization of the healthy subjects and patients
The network parameters of the healthy subjects and the patients were calculated separately with Cytoscape (version 3.6.0) (Table 5). We noticed that the network indicators of the healthy subjects and the patients are close to each other. This suggests that the network structure changes negligibly in the disease and that NAFLD can be considered a systemic disease.

Discussion
A combination of mechanisms, including persistent inflammation, metabolic dysfunction, and insulin resistance, is responsible for non-alcoholic steatohepatitis (NASH). Thus, the multifactorial nature of NASH requires a systems biology approach to decipher the complexity associated with its initiation and progression. The present study aimed to gain a holistic PPI-level insight into the biological processes driving NASH progression. To this end, we first compared the gene expression profiles of NASH patients' samples with those of healthy subjects to identify the genes significantly overexpressed in the patients. Next, we reconstructed and analyzed the PPI network of the DEGs to identify key NAFLD-associated genes at the network level.
Although the number of our samples was limited, we used previous gene expression studies to gather gene signatures within the different stages of the disease, from obesity to progressive NASH.
Several genes related to immunity, oxidative stress responses, and metabolic functions were determined to play a role in NAFLD (12,18-20). We found a satisfactory agreement between our results and previous findings on the role of genes in the disease's modulation. Also consistent with our results, biological processes such as focal adhesion, inflammatory response, fibrosis, and cellular response to chemical stimuli are highlighted in the literature (19-21).
The centrality analysis of the DEGs' PPI network identified the network's hub genes, including PPARA, CREBBP, CCL2, SERPINE1, ABCB11, SOD2, Foxo1, IL-1B, INSR, and NF-κB. There is both theoretical and experimental evidence that PPI network hubs often play important roles in regulating the biological processes involved in disease (4,22,23). Previous studies consistently show the significance of our identified hub genes as critical modulators of NAFLD initiation and progression (12,18,22,24). Interestingly, several genes absent from the list of top DEGs, including RPS6KA5, CFLAR, TNFSF11, ADRB2, RAF1, RAGE, MEF2A, NFE2L2, ID2, GART, and ZFP36, were found among the high-score hub genes. Among them, four proteins stand out as relatively unknown factors in NASH development.
Most notable among the hubs is Zfp36, which encodes tristetraprolin (TTP), a zinc finger transcription factor that plays a significant role in negatively regulating TNFα production by destabilizing its mRNA (25). A recent study has demonstrated that tristetraprolin post-transcriptionally regulates systemic insulin sensitivity and hepatic metabolism through the modulation of liver-derived FGF21 (26). Another study found that insulin resistance in obese mice was associated with enhanced Zfp36 expression in hepatic macrophages (M1). The study of Caracciolo V. et al. revealed that the myeloid-specific deletion of Zfp36 protects against insulin resistance and fatty liver in mice with diet-induced obesity (27). Moreover, a significant correlation has been reported between TTP and hepatocarcinogenesis (28).
Another less explored gene is NFE2L2, which encodes nuclear factor erythroid 2-related factor 2 (NRF2), a transcription factor that mediates protection against oxidative damage triggered by injury or inflammation. Lu C. et al. demonstrate the role of curcumin in increasing NRF2 expression (29). Due to its antioxidant and anti-inflammatory properties, the natural polyphenol curcumin has long been proposed as a potentially viable treatment for NAFLD. A recent systematic review and meta-analysis of randomized controlled trials demonstrated that curcumin supplementation has favorable effects on metabolic markers and anthropometric parameters in patients with NAFLD (30). Chambel S.S. has reported that NRF2 also mediates the impact of lipid metabolism on antioxidant defense, as observed in NAFLD experimental models (31). In another study, the pharmacological activation of NRF2 in obese and insulin-resistant mice was found to reverse insulin resistance, suppress hepatic steatosis, and mitigate NASH and liver fibrosis (19). Hence, the pharmacological induction of NRF2 appears to be a promising strategy for NAFLD prevention and treatment.
Another gene that has received little attention in NAFLD is the connective tissue growth factor (CTGF), a component of the extracellular matrix-associated heparin-binding proteins involved in many biological processes, including cell adhesion, migration, proliferation, angiogenesis, and wound healing. The overexpression of CTGF has been considered a hallmark of fibrosis (32,33). Besides, aberrant CTGF expression is associated with many types of malignancies, diabetic nephropathy and retinopathy, arthritis, and cardiovascular diseases. Yoshino J. showed that CTGF overexpression was associated with adipose tissue expansion and multi-organ insulin resistance in obese subjects (34).
Lastly, CFLAR (a regulator of apoptosis that is structurally similar to caspase-8) functions as a suppressor of steatohepatitis and its metabolic disorders. According to Wang P.X. et al., CFLAR attenuates steatohepatitis progression in both mice and monkeys (36). Liu Y. et al. demonstrate that silibinin, a flavonolignan from milk thistle, performs its function by activating the CFLAR-JNK pathway (37). In so doing, it regulates downstream target genes involved in lipid metabolism (PPARα, SREBP-1C, and PNPLA3), glucose uptake (PI3K-Akt), oxidative stress (NRF2, CYP2E1, and CYP4A), and the inflammatory response. Analyses of treated HepG2 cells have confirmed its potential use in improving various symptoms of NASH (38). Silymarin is the extract of Silybum marianum, or milk thistle, which has been used to treat various liver disorders, particularly chronic liver diseases, cirrhosis, and hepatocellular carcinoma, because of its antioxidant and anti-inflammatory properties (39).
The translational relevance of this research is supported by our validation of the PPI networks of central genes, our interpretation of their rationale, and the broad mechanisms we have drawn for the progression of the disease from a previously healthy state to an irreversible stage of non-alcoholic steatohepatitis.
Our biological function enrichment analysis enabled us to narrow the functional spectrum of NASH-related genes. Summarizing the results from our pathway and network analyses, we were able to pinpoint the pathways vital to metabolic perturbation, stress-related responses, and fibrosis initiation. The Foxo signaling pathway, the PI3K-Akt signaling pathway, and select pathways in cancer produced the highest combined scores. This result is consistent with previous research and thus offers valuable evidence for studying the complicated connections underlying NASH. Accordingly, several lines of research on NAFLD progression place greater emphasis on the AMPK signaling pathway, the insulin resistance pathway, and the Foxo signaling pathway (21,40). Also included are several pathways involved in the cellular physiological processes linked to tumor cell proliferation, such as apoptosis, necroptosis, adherens junction, and cell cycle. In line with previous research, some pathways are related to HCC signaling, such as the transforming growth factor-beta signaling pathway, the NF-κB signaling pathway, and the Hippo and JAK-STAT signaling pathways.
The clustering of the interactions between hub genes resulted in the dissection of the PPI network into three distinct modules. According to previous studies, dense interactions between a particular set of proteins may underlie the biological functions coordinated by those proteins (41). Therefore, clustering the PPI network provides insights into the modules' biological functions and processes that might otherwise be challenging to uncover. Pathway enrichment analysis is broadly used to interpret PPIs in terms of their biological functions and processes. Discovering the pathways and processes most likely to be coordinated by each PPI module in a particular disease can reveal the molecular mechanisms that drive specific diseases. Owing to the complex and multifaceted nature of NASH, we could not attribute the individual clusters to specific aspects of the disease. Nevertheless, the groups of hub genes in each module constitute a mixture of specific characteristics. We can find genes that are metabolically related to oncogenesis/apoptosis or inflammation/fibrogenesis (Figures 7 and 8).
We found that the first cluster, whose members are topologically remarkable genes of the entire hub gene network (Table 4), received the highest enrichment scores for the apoptosis signaling pathway (P-value: 3.000E-10), the PI3K-Akt signaling pathway (P-value: 9.799E-10), the Foxo signaling pathway (P-value: 2.46E-07), and other cancer-related pathways (P-value: 6.935E-08) (Figure 8). This cluster analysis confirms that the majority of the functional genes in this module are involved in the interaction between tumor-suppressing and oncogenic signals through the PI3K-Akt signaling pathway. This cluster also explains the prognosis of severe liver fibrosis or hepatocellular carcinoma (42). Many studies have demonstrated that the PI3K-Akt signaling pathway balances oncogenesis and cell survival signals by regulating pro- and anti-apoptotic genes (40,43). Conversely, the activation of pathways such as the apoptosis and Foxo signaling pathways in this module may indicate the activation of cellular defensive mechanisms that counter the disease's progressive traits.
Moreover, enrichment of these pathways with a significant P-value can imply a dysfunction of Akt in regulating its downstream pathways to maintain balance. In this specific module, we found tumor suppressors (e.g., PTEN, PHLPP1, ATM, BCL6, KLF2, ERN1) and proto-oncogenes (e.g., MDM2, XIAP, ACTB, RAF1, CFLAR, and MCL1) among our top hub genes, all of which are overexpressed. Furthermore, they may play a prominent role in exacerbating the disease and insulin resistance, a hallmark of NASH progression.
It is worth noting that Akt is also considered a master regulator of insulin-mediated glucose homeostasis. Evidence indicates that Akt is negatively regulated by tumor-suppressing proteins whose activity interferes with insulin-mediated glucose homeostasis. The hub genes PHLPP1 and PTEN in this cluster are both tumor suppressors and negative regulators of Akt. Recent studies suggest that the overexpression of PHLPP1 may contribute to type 2 diabetes by interfering with Akt-mediated insulin signaling (44-46). Similarly, PTEN overexpression is induced and mediated by high levels of free fatty acids and inflammatory cytokines (47-49); it can be involved in controlling cell division, but it has also been reported to exacerbate insulin unresponsiveness. Consequently, although several findings demonstrate that inhibition of Akt's downstream signaling by these suppressors might decrease oxidative stress and the DNA damage response, this appears to conflict with the glucose homeostasis function of Akt (43,50). The clustering of all of these genes in a single module may reflect the difficulties associated with treating insulin resistance with an antagonizing approach, a well-known challenge in managing cancers and diabetes.
A study of the pathways enriched in the second highly populated cluster highlights insulin unresponsiveness and impaired metabolic balance, which account for another molecular aspect of NASH. This is reflected in the high enrichment scores of the AMPK signaling pathway (P-value = 8.187E-14), the Foxo signaling pathway (P-value = 8.187E-14), the insulin resistance pathway (P-value = 6.149E-09), and the adipocytokine signaling pathway (P-value = 2.272E-6) (Fig. 8). Remarkably, we found that almost all of these enriched pathways might shed light on feedback against the cells' disorganized bioenergetics. This disorganization can arise through nutrient-sensing signals and glucose homeostasis. The activation of these pathways reinforces the view that several mechanisms, including elevated gluconeogenesis, lipolysis, and increased fatty acid oxidation, are active at the onset of NASH development. These pathways serve to maintain cell metabolism; nevertheless, the combination of prolonged insulin unresponsiveness, accumulated stress, and metabolic perturbation can lead to pathologic conditions, making them "double-edged swords" in the pathology of the disease.
Another sub-network constructed in this module elaborates on the complicated link between metabolic perturbation and inflammatory responses. This linkage is highlighted by the enrichment of pathways such as the MAPK signaling pathway (P-value = 3.603E-08) and the AGE/RAGE signaling pathway (P-value = 9.832E-06) in this specific module, which may account for the stress that originates from insulin-mediated lipogenesis and subsequently gives rise to the progression of NASH. Meanwhile, several lines of evidence demonstrate a disruption in the balance between hepatic FFA input and output, manifested in active ROS generation (51). Hyperlipidemia due to obesity and hyperglycemia in NASH cases, together with inflammation and oxidative stress, may cause the formation of AGE products through the glycosylation process (52). Generally, we can interpret this cluster's biological functions as a reflection of the mechanisms of NAFLD transformation described in the two-hit hypothesis. This theory of NASH incidence implicates excessive fatty acids and insulin resistance as a "first hit," which are supposed to be connected to mitochondrial dysfunction and oxidative stress (36).
A modular study of the third module explains the movement of the disease toward the steato-apoptotic stage. Our analysis indicates that two pathways, the TNF signaling pathway (P-value = 4.908E-07) and the NF-kappa B signaling pathway (P-value = 4.908E-07), were also enriched with genes involved in NASH development (Figure 8). These two pathways have been studied extensively for their role in inflammation and immune responses (24). The AGE/RAGE signaling pathway (P-value = 2.120E-07) and its downstream genes are prevalent in our hub gene enrichment and appear to contribute to the progression of the disease. The enrichment of these pathways highlights the over-activity of pro-inflammatory components, which triggers positive feedback mechanisms via AGE/RAGE.
These inter-related activities, which are linked to insulin resistance, systemic complications, and the progression of NASH to fibrosis, are represented in our clustering analysis of the disease PPI network. RAGE ligands (AGEs), which are generated by the non-enzymatic glycation of accumulated lipids and glucose, trigger oxidative stress pathways and cause the over-activation of receptors through a positive feedback loop. This phenomenon has detrimental effects on hepatic insulin resistance, steatosis, fibrosis, ischemic and non-ischemic liver disease, and the growth and metastasis of HCC. Consequently, receptor blockade or the restriction of dietary AGEs appears to be an influential therapeutic strategy for these progressive hepatic disorders, as supported by numerous studies (53-56).
The network-based analysis not only offers important insights about functional genes (e.g., RAF1, PTGS2, ATM, TNFSF11, CXCR4, ICAM-1, PTEN, SMARCA2, H6PD, SERPINE1, GLS, NAMPT) but can also help to identify novel candidate targets and markers. On the one hand, some genes exhibit pathological characteristics found in other disease contexts, including inflammatory diseases (e.g., rheumatoid arthritis, diabetes, metabolic syndrome, and various cancers), and can be repurposed as pharmacological targets in a new disease. This complexity can make it challenging to determine the roles these genes play in the context of fatty liver disease. On the other hand, the fact that some hub genes targeted by FDA-approved drugs, such as ADRB2, PTGS2, and AGTR1, are related to different diseases (e.g., cardiovascular disease) creates opportunities for repositioning and poly-pharmacological strategies for NASH. As illustrated in the pathway enrichment, two interconnected mechanisms, RAS activation and lipolysis regulation, appear to be stimulated by the sympathetic system. The beta-receptor (ADRB2) is among the top-ranked hub genes, and its activity in the adrenergic system may lead to RAS activation through multiple signaling steps. RAS is a hallmark of several manifestations of NASH (including ROS formation and fibrosis initiation) (57).
Similarly, the angiotensin receptor, AGTR1, is among the network hubs, and its activity drives aldosterone biosynthesis. This result is reflected in the high score of the aldosterone synthesis pathway in the functional enrichments. Pathways pertaining to lipolysis regulation in adipocytes, which correlate with beta-receptor overactivity, are also highly enriched. Thus, beta-receptor and angiotensin II receptor inhibitors, which are widely available on the market, might be considered for future therapeutic use to mitigate the devastating effects of fatty liver on the body.
In recent decades, growing evidence has highlighted the pathological role of the AGE/RAGE axis in various diseases, such as diabetes and fatty liver disease. The mechanisms through which the AGE-RAGE pathway influences inflammatory reactions include increased oxidative stress and downstream inflammatory responses. Four proteins involved in this pathway (SMAD4, ICAM-1, EDN1, SERPINE1) can be considered key players for further translational applications in the context of NASH. The first key hub gene, SMAD4, is part of the protein complexes in this study. This hub gene encodes a protein that acts as a signal transducer in the activation of the fibrogenic pathway and apoptosis through HSC activation and can lead to organ damage such as diabetic nephropathy (58)(59)(60). One study has highlighted the role of SMAD4 as a risk factor for fibrosis in conjunction with BMI, TG, LDL-C, ALT, and AST (58). Studies in which SMAD4 was deleted have shown improvements in lipid metabolism, liver function, inflammation, or fibrosis, which supports its role as a target for ameliorating fibrosis in the NASH context.
Similarly, the second hub gene, ICAM-1, is associated with increased leukocyte adhesion via elevated intercellular adhesion molecule-1 in the membranes of leukocytes and endothelial cells. As a result, more immune responses are activated, and more reactive oxygen species are generated. Our data agree with previous findings that hepatic steatosis is associated with increased hepatic ICAM-1 expression. Therefore, using this glycoprotein as a target for graft protection in fatty liver subjects (and as a biomarker for susceptibility to organ injury, as discussed in previous studies) appears promising.
The overexpression of the endothelin-1 gene (EDN1) due to hypoxia-induced activation of its transcriptional regulator could explain the irregularity of oxygen homeostasis in the NASH state (61). Previous studies have found that higher levels of endothelin-1 in NASH patients correlate with the grade of their hepatic fibrosis.
Such studies have also demonstrated the role of EDN1 as an angiogenic factor in tumor metastasis, and studies in zebrafish have reported that liver-specific expression of EDN1 induced HCC. Hence, commercially available endothelin-1 receptor antagonists might be an excellent therapeutic option both for hypoxia-induced fibrosis and for the tumor growth that may occur in NASH. The activation of the AGE/RAGE axis could also produce an extrahepatic reaction. Studies indicate that the overexpression of SERPINE1 downstream of the AGE/RAGE axis predisposes patients with fatty liver to severe fibrosis and systemic vascular complications (including atherosclerosis), which further complicates the condition (62).
Additionally, the microvascular dysfunction that occurs due to the overexpression of ICAM-1 in endothelial cells may facilitate atherosclerosis formation. As a result, these two genes might be considered CVD risk factors, which could direct drug development efforts toward treating the extrahepatic complications of NAFLD. In agreement with other studies, we confirmed the involvement of these four genes in NASH pathogenesis.
Studying the PPI network could also highlight genes that are promising pharmacological targets for further investigation. For instance, recent literature has suggested that BCL6 (a master immune system regulator) could play a role in metabolic regulation that is not directly related to its function in the immune response (63,64). A 2014 study of knockout mice illustrated that BCL6 deletion decreased lipogenesis through alterations in SREBP1c, Fasn, and Scd1. It also reduced adipogenesis and fatty acid oxidation (via PPAR) (64). The BCL6 hub gene, which is co-expressed with hub genes like Foxo1, KLF2, and ATM in our study, supports new findings that BCL6 acts as an intermediary relating the stress response to metabolic irregularity. As the co-expression illustrates, it may form a bridge with other hub genes such as JUN, IL-1b, REL, CCL2, and SERPINE1. Likewise, considering the significance of BCL6 in the cross-talk between apoptosis and metabolic regulation, the inhibition of BCL6 could be a plausible strategy for the future treatment of fatty liver disease and insulin resistance (65,66).
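To illustrate how a hub or bridging gene can be flagged from network topology, the following minimal sketch ranks nodes by normalized degree centrality. It is not the pipeline used in this study: the edge list is a hypothetical toy example assembled only from gene names mentioned above, the function names `degree_centrality` and `top_hubs` are our own, and degree is only one of several centrality measures that could be used.

```python
from collections import defaultdict

# Hypothetical toy edges for illustration only; these are NOT the
# interactions or co-expression links reported in this study.
TOY_EDGES = [
    ("BCL6", "JUN"),
    ("BCL6", "IL1B"),
    ("BCL6", "REL"),
    ("BCL6", "CCL2"),
    ("BCL6", "SERPINE1"),
    ("JUN", "IL1B"),
    ("FOXO1", "BCL6"),
    ("FOXO1", "KLF2"),
    ("ATM", "FOXO1"),
]


def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def top_hubs(edges, k=3):
    """Return the k highest-degree nodes; ties are broken alphabetically."""
    scores = degree_centrality(edges)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


if __name__ == "__main__":
    for gene, score in top_hubs(TOY_EDGES):
        print(f"{gene}\t{score:.3f}")
```

In this toy network BCL6 ranks first simply because the example edges were drawn around it; with real interaction data, the same ranking step would be applied to the full DEG network, ideally alongside other measures such as betweenness.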
Our pathway analysis also revealed that the disease's progression through the oxidative response ultimately leads to defects in cell bioenergetics. Some changes can be understood as secondary to metabolic reactions related to energy homeostasis (e.g., purine and glutamine metabolism), which are thought to compensate for failures in energy balance. Among the dominant hub genes related to metabolic signaling pathways, those most strongly aligned with metabolism were GART, GLS, and H6PD.
GART, which is involved in the de novo biosynthesis of purine, causes noticeable changes in purine metabolism. Many studies have identified an elevation in uric acid in NAFLD patients (67,68); this appears to occur after the upstream elevation of purine biosynthesis and the activation of additional metabolic signals through purine metabolism, which produces uric acid as a byproduct. However, some studies have suggested that the elevated formation of uric acid could be a compensatory mechanism against disease progression. Other studies have linked high uric acid levels to increased insulin resistance and lipogenesis (69). In addition, many studies point out that purine metabolism plays a significant role in cancer prognosis (70,71). Based on this evidence, it is likely that alterations to this metabolic pathway in favor of cell proliferation contribute to the disease's cancer-like progression.
Ultimately, we can conclude that the enzymes encoded by GART, rather than uric acid, are likely to be promising biomarkers for the earlier detection of NAFLD progression.
Overexpression of the GLS gene represents a metabolic alteration in glutamine catabolism. A recent study by J. Simón et al. revealed that glutaminase (GLS), regulated by the c-myc proto-oncogene, is overexpressed in the late stages of NASH and the early stages of HCC (72). They concluded that the enzyme's inhibition could deactivate the Krebs cycle and the electron transport chain (ETC) and decrease β-oxidation, which contributes to reduced ROS formation (73). Another study by Miller RA et al. showed that increased activity of hepatic glutaminase (GLS2) is closely related to carbon generation. This carbon is needed for gluconeogenesis through the stimulation of mitochondrial anaplerotic reactions in response to glucagon (74).
This study indicates the role of glutamine degradation in the gluconeogenic pathway under insulin resistance conditions. Hyperammonemia can also occur when glutaminase activity increases. Javier Ampuero et al. demonstrated the role of metformin as a hepatic glutaminase inhibitor. This drug can therefore be proposed primarily to treat encephalopathic cirrhosis caused by excess ammonia in the brain (75). A more consistent occurrence is the dysregulated expression of liver aminotransferases, such as ornithine transaminase produced through OAT expression. We may therefore find overexpression of OAT in our hub gene sets as a scavenger that responds to the hyperammonemic state (76). Our study hypothesizes that GLS2 holds promise not only for the treatment of hyperglycemic diabetes but also for the termination of hyper-gluconeogenic activity due to insulin resistance in fatty liver disease (74).
The hub gene H6PD, whose protein product is known to activate 11β-HSD1 by generating NADPH (77), influences the rise in corticosteroid-associated metabolic activity (e.g., gluconeogenesis, biosynthesis, insulin resistance, accumulation of visceral fat, vascular reactivity, vascular remodeling, and sodium reabsorption). Since excess glucocorticoids promote obesity, hyperlipidemia, and insulin resistance, an antagonizing approach in the therapeutic strategies for metabolic syndrome could be promising in the treatment of fatty liver disease.
Despite decades of research, NAFLD remains one of the most complex diseases with no efficient cure. While previous studies have demonstrated this disease's multifactorial nature, our study sheds more light on its complexity at the molecular level. Our study confirms many genes that previous reports have associated with NASH. It also identified several genes that are less studied in the NAFLD context, the inspection of which in future research may help to better characterize the disease and develop more effective treatments. Our gene expression, PPI network, and enrichment analyses also provide useful insights into the development of NASH at the pathway level. Indeed, an extensive body of current knowledge about the pathways contributing to NASH was reflected in our network and enrichment analyses. Clustering analysis of the DEGs' PPI network also revealed deeper associations among genes that support disease progression, which would otherwise be challenging to identify. Our research demonstrates the usefulness of network and systems biology in integrating separate NAFLD molecular data into a coherent framework, which enables consistent interpretation of the data in the context of molecular disease mechanisms and therapeutic target identification.

Declarations
Availability of data and materials
All relevant raw data on which the conclusions of the paper rely are available.

Ethics approval and consent to participate
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Authors' contributions
MM was responsible for conceiving, designing, and conducting the study and contributed to interpreting the results and drafting and revising the manuscript. NA made a significant contribution to the analysis, interpretation, and validation of the data and to the preparation of the draft manuscript. AM contributed to parts of the analysis and visualization. TS was involved in the verification of the results.