HFE Gene Regulation of T Lymphocyte Activity Can Be a Potential Target of PCOS


Background: Polycystic ovary syndrome (PCOS) may be characterized by skeletal muscle abnormalities. Pioglitazone has a certain therapeutic effect on skeletal muscle metabolism in PCOS patients. Objective: To investigate other effects of pioglitazone on skeletal muscle and their mechanism. Methods: We used weighted gene co-expression network analysis (WGCNA) to identify potential key genes and signaling pathways involved in the effect of pioglitazone on skeletal muscle in PCOS patients. First, the microarray data of GSE8157 were downloaded from the GEO database. Then, modules related to genes expressed in skeletal muscle of PCOS patients before and after pioglitazone treatment were screened by WGCNA, and core gene screening or differential gene analysis was chosen for each module according to the correlation between the module and the clinical traits. Finally, enrichment analysis was performed on the selected genes. Results: Four significantly up-regulated genes were identified by differential gene analysis: MTBP, MAPK14, RBBP6 and PTPRC. They are associated with cell cycle regulation and immune cell (T cell) antigen receptor signaling. Conclusions: The effect of pioglitazone on skeletal muscle of PCOS patients is mainly reflected in the regulation of abnormal expression of cell cycle and immune cell-related genes. Among them, the HFE gene has the most obvious regulatory effect on T lymphocyte activity and may be a potential gene target for the treatment of PCOS.


Introduction
Polycystic ovary syndrome (PCOS) is the most common disorder of reproductive dysfunction and endocrine-metabolic disturbance among women of reproductive age. Its characteristic phenotypes include elevated androgen levels, irregular menstruation and polycystic ovaries (Manti et al. 2020). The manifestations associated with hyperandrogenism in PCOS include dyslipidemia, insulin resistance, type-2 diabetes mellitus (DM2), obesity, cancer, infertility, and coronary heart disease (Kogure et al. 2020). At present, there is evidence to support a key role of elevated androgen levels in PCOS pathogenesis. Elevated androgen affects the function of various tissues of the body, and skeletal muscle is one of its main targets. As a key metabolically active organ, skeletal muscle is the most effective organ for insulin-stimulated glucose uptake in the body. Additional studies have found that an important subgroup of patients with PCOS may have autoantibodies that differ in their pathophysiological autoimmune components. Autoantibodies previously documented in PCOS include anti-nuclear, anti-thyroid, anti-SM, anti-histone, anti-ovarian, and anti-islet cell antibodies (Mobeen et al. 2016). Antibodies purified from PCOS subjects with high gonadotropin-releasing hormone receptor antibody levels have been shown to activate the gonadotropin-releasing hormone receptor in transfected cells in a dose-dependent manner; this in vitro activity was antagonized by the gonadotropin-releasing hormone receptor antagonist Siltra c (Kem et al. 2020). This finding supports the involvement of gonadotropin-releasing hormone receptor antibodies in the pathogenesis of PCOS. Further studies have found an effect of gonadotropin-releasing hormone receptor antibodies on the insulin receptor/PI3K/Akt/Glut signaling pathway in liver and skeletal muscle.
Therefore, skeletal muscle is still closely related to the pathogenesis and clinical manifestations of PCOS patients, and the abnormality of immune components may become an important potential mechanism. At present, most studies on skeletal muscle abnormalities in PCOS patients still focus on hormone levels, while ignoring the abnormal gene expression in skeletal muscle itself.
Thiazolidinediones, including pioglitazone, have been shown to improve metabolic disorders in patients with PCOS (Street et al. 2020). A recent microarray study demonstrated significant differential expression of several genes in skeletal muscle of women with PCOS after pioglitazone treatment, and showed that pioglitazone treatment significantly reduced fasting serum insulin in PCOS subjects and improved insulin-stimulated Rd, glucose oxidation, and NOGM (Al-Muzafar et al. 2021). That study confirmed that pioglitazone has a certain effect on skeletal muscle abnormalities in PCOS patients, but it focused on hormone levels and did not explore the gene-level changes further. Therefore, the present study mined potential key genes and signaling pathways related to the influence of pioglitazone on skeletal muscle of PCOS patients through a comprehensive WGCNA analysis. The analysis explored the effects of pioglitazone on skeletal muscle of PCOS patients at the genetic level, and the results showed that these effects are reflected not only in hormone levels but also in immune cells and the cell cycle, which should not be ignored. The results of this study may reveal a new association between pioglitazone and PCOS and provide value for subsequent studies.

Data Retrieval and Preprocessing
The GSE8157 gene expression profile data and clinical information used for analysis were obtained from the GEO database. The platform is GPL570. Four groups of samples were included: a pioglitazone pretreatment group, a pioglitazone post-treatment group, a placebo group and a blank control group. This study used the R programming language to download and preprocess the data, including constructing the gene expression matrix and converting probe names into gene names, finally obtaining a gene matrix and a clinical information matrix suitable for WGCNA analysis.

Construction of co-expression network and identification of modules
To study the expression network of genes related to the effects of pioglitazone in the muscle of PCOS patients, we used the WGCNA package in R to construct a weighted gene co-expression network. A soft-thresholding power β was chosen according to the scale-free topology criterion, and the correlation matrix was transformed into an adjacency matrix. Modules were then identified with the dynamic tree cut method, and the gene dendrogram was drawn. Modules are defined as branches cut from the tree, and each module is marked with a unique color. The module eigengene (ME) is defined as the first principal component of the module. Finally, the WGCNA package was used to determine the modules most relevant to the clinical information.
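The transformation from correlation matrix to adjacency matrix described above can be illustrated with a minimal sketch. It assumes an unsigned network, in which the adjacency between two genes is the absolute Pearson correlation raised to the power β. The gene names and expression values are hypothetical toy data, and the code is a plain-Python illustration of the idea rather than the WGCNA implementation used in this study.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(expr, beta):
    """Unsigned adjacency: a_ij = |cor(gene_i, gene_j)| ** beta (assumed form)."""
    genes = list(expr)
    adjacency = {g: {} for g in genes}
    for gi in genes:
        for gj in genes:
            if gi == gj:
                adjacency[gi][gj] = 1.0
            else:
                adjacency[gi][gj] = abs(pearson(expr[gi], expr[gj])) ** beta
    return adjacency

def connectivity(adjacency):
    """Connectivity k of each gene: sum of its adjacencies to all other genes."""
    return {
        g: sum(value for other, value in row.items() if other != g)
        for g, row in adjacency.items()
    }

# Hypothetical toy expression profiles (4 samples per gene).
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with geneA
    "geneC": [1.0, 3.0, 2.0, 4.0],  # correlation 0.8 with geneA
}

adjacency = soft_threshold_adjacency(toy_expr, beta=5)
k = connectivity(adjacency)
```

Raising the correlation to a power keeps strong correlations close to 1 while pushing weak ones towards 0, which is why a larger β yields a sparser, more scale-free-like network.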

Identification of core genes
After determining the correlation between the modules and clinical information, we selected the appropriate module and screened the core genes. We first constructed the PPI network of the selected module and imported it into Cytoscape to screen out the 30 genes with the highest Degree using cytoHubba. Meanwhile, we imported the data exported from the STRING database into Excel, and the top 30 genes were screened by counting how often each gene was repeated. Finally, we took the intersection of the genes from the two methods as the core genes.
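The two rankings and their intersection can be sketched as follows. This is an illustrative sketch, not the Cytoscape or Excel workflow itself: the edge list, the gene names and the helper functions are hypothetical, and the "repeat rate" is assumed to mean the number of times a gene name occurs in one column of the exported interaction table.

```python
from collections import Counter

def rank_top(counts, n):
    """Return the n names with the highest count (ties broken alphabetically)."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:n]]

def top_by_degree(edges, n):
    """Degree ranking: number of interaction partners of each gene."""
    degree = Counter()
    for gene_a, gene_b in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    return rank_top(degree, n)

def top_by_frequency(names, n):
    """Frequency ranking: how often each gene name occurs in a column."""
    return rank_top(Counter(names), n)

# Hypothetical PPI edge list (one row per interaction).
toy_edges = [
    ("G1", "G2"),
    ("G1", "G3"),
    ("G1", "G4"),
    ("G2", "G3"),
    ("G4", "G5"),
]

degree_top = top_by_degree(toy_edges, n=3)
# Assumption: the frequency count is taken over the first column only.
frequency_top = top_by_frequency([a for a, _ in toy_edges], n=3)

# Core genes are those selected by both rankings.
core_genes = sorted(set(degree_top) & set(frequency_top))
```

In the study the cut-off was 30 genes per ranking; the toy example uses 3 only to keep the data small.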

Gene Clustering and Differential Expression Analysis
To explore genes associated with different stages, we used gene clustering and differential expression analysis. The results of the differential expression analysis were visualized with the ggplot2 package. Briefly, the read counts were entered to build the analysis objects. DEGs at different stages were illustrated using a volcano plot drawn with the ggplot2 package. Genes with a p-value < 0.05 and an absolute log fold change ≥ 1 were regarded as differentially expressed genes.
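The threshold rule above can be written as a small classification function. The sketch below assumes that each gene already has a log fold change and a p-value from an upstream statistical test; the gene names and numbers are hypothetical and are not taken from GSE8157.

```python
def classify_genes(stats, p_cutoff=0.05, lfc_cutoff=1.0):
    """Label each gene as 'up', 'down' or 'ns' (not significant).

    stats maps a gene name to a (log_fold_change, p_value) pair.
    A gene is called differentially expressed when p < p_cutoff and
    |log fold change| >= lfc_cutoff.
    """
    labels = {}
    for gene, (log_fc, p_value) in stats.items():
        if p_value < p_cutoff and abs(log_fc) >= lfc_cutoff:
            labels[gene] = "up" if log_fc > 0 else "down"
        else:
            labels[gene] = "ns"
    return labels

# Hypothetical test statistics for four genes.
toy_stats = {
    "geneA": (1.8, 0.001),   # large change, significant
    "geneB": (-1.2, 0.03),   # large negative change, significant
    "geneC": (2.5, 0.20),    # large change, not significant
    "geneD": (0.4, 0.01),    # significant, but change too small
}

labels = classify_genes(toy_stats)
up_regulated = sorted(g for g, label in labels.items() if label == "up")
```

These labels are what a volcano plot color-codes: log fold change on the x-axis, the negative logarithm of the p-value on the y-axis.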

Core Gene GO and KEGG analysis
To further study the molecular mechanism of the core genes in the effect of pioglitazone on skeletal muscle of PCOS patients, we used the clusterProfiler package of R to perform GO and KEGG signaling pathway enrichment analysis of these core genes. Before analysis, we converted gene names from "symbol" to "entrezid" using the org.Hs.eg.db package (V3.8.2) (Lian et al. 2021).
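Over-representation analysis of this kind is commonly based on a hypergeometric test, which asks how likely it is to see at least the observed overlap between a gene list and a GO term or pathway by chance. The sketch below shows that calculation under this assumption; the function name and the numbers are hypothetical, and it does not reproduce the multiple-testing correction or annotation handling of an enrichment package.

```python
from math import comb

def hypergeom_enrichment_p(overlap, list_size, term_size, background_size):
    """P(X >= overlap) for a hypergeometric distribution.

    overlap:          genes in the query list annotated to the term
    list_size:        number of genes in the query list
    term_size:        genes in the background annotated to the term
    background_size:  number of genes in the background
    """
    total = comb(background_size, list_size)
    upper = min(list_size, term_size)
    tail = sum(
        comb(term_size, i) * comb(background_size - term_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: background of 10 genes, 4 annotated to the term,
# query list of 3 genes of which 2 are annotated to the term.
p_value = hypergeom_enrichment_p(overlap=2, list_size=3, term_size=4, background_size=10)
```

A small p-value means the term is represented in the gene list more often than random sampling from the background would explain.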

Results
Data preprocessing
The GSE8157 data set was retrieved and downloaded, with a total of 43 samples and 1728 gene probes. In this data set, the samples were divided into 4 groups: 10 cases in the pretreatment group, 10 cases in the post-treatment group, 13 cases in the placebo group and 10 cases in the blank control group. We then converted the probe IDs and processed the matrix, finally obtaining 1006 expressed genes. The rows of the clinical information matrix were named by sample, and the columns were MPP, MPAP, MPC and mpc (representing the pretreatment, post-treatment, placebo and blank control groups, respectively).

WGCNA module construction and association with clinical phenotypes
WGCNA correlation analysis was performed on all the processed genes. The co-expression network must satisfy the scale-free condition, that is, the logarithm of the connectivity of a node, lg k, must be negatively correlated with the logarithm of the probability of that connectivity, lg[p(k)], with a correlation coefficient > 0.8. To meet this prerequisite, we used the WGCNA package in R to set the candidate range of the network construction parameter and calculate the scale-free topology fit. The package selected a soft threshold of β = 5. We then calculated the dissimilarity between genes to obtain a hierarchical clustering tree (minModuleSize = 30, mergeCutHeight = 0.25). Figure 1 shows the resulting module dendrogram, in which each color represents a different module. After screening, a total of 4 modules besides the gray module were obtained, and the correlation between each module and the clinical phenotypes was then calculated. Figure 2 shows the heat map of the correlations between the color modules and the clinical phenotypes. Blue indicates negative correlation and red indicates positive correlation. The results showed that three modules had a significant correlation with clinical phenotypes: brown, blue and yellow. After calculation with the hypergeometric enrichment algorithm, the genes screened and classified in the previous step were mapped into each module (Figure 3). The pie chart shows the distribution of genes across modules (the gray module is the collection of genes that could not be assigned to any other module and is not included in the pie chart). The module information was sorted, and the stable modules with good P values and correlations were selected; the results are shown in Table 1.
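The scale-free criterion described above (a linear relationship between lg k and lg[p(k)] with a negative slope and a high coefficient of determination) can be checked with a short regression. The sketch below is a simplified illustration: it assumes integer connectivities and uses their raw frequencies, whereas a weighted network has continuous connectivities that are usually binned first. The function name and toy data are hypothetical.

```python
import math
from collections import Counter

def scale_free_fit(degrees):
    """Fit lg p(k) against lg k by least squares.

    Returns (slope, r_squared). A scale-free network gives a negative
    slope and an r_squared close to 1.
    """
    counts = Counter(k for k in degrees if k > 0)
    total = sum(counts.values())
    xs = [math.log10(k) for k in counts]
    ys = [math.log10(c / total) for c in counts.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared

# Hypothetical degree list following p(k) proportional to 1/k:
# 100 nodes with k = 1, 10 nodes with k = 10, 1 node with k = 100.
toy_degrees = [1] * 100 + [10] * 10 + [100] * 1

slope, r_squared = scale_free_fit(toy_degrees)
```

In practice this fit is repeated for a range of candidate β values, and the smallest β whose fit exceeds the chosen cut-off (0.8 in this study) is used.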
The results showed that genes were significantly enriched in the brown, blue and yellow modules. Combining this with the relationship between the genes and clinical phenotypes in these modules, we used the genes in the blue and yellow modules as target genes and performed the following analysis (the blue module contained 240 genes, while the yellow module contained 196 genes).
We found that the yellow module is highly correlated with both the pretreatment and post-treatment groups: it is positively correlated with pretreatment and negatively correlated with post-treatment. Therefore, we screened the core genes of the yellow module and performed enrichment analysis on them. The blue module is highly positively correlated with both pretreatment and post-treatment, so we analyzed the differential genes of the blue module.

Screening of core genes
To further screen the core genes of the selected modules, we first uploaded the 240 genes within the yellow module to the STRING database and constructed a PPI network (using a low confidence score of 0.150 as the screening criterion). The obtained data were imported into Cytoscape, and the top 30 genes by Degree were screened with the CytoHubba plug-in (Figure 4). Meanwhile, we imported the STRING data into Excel and used the COUNTIF function to screen out the top 30 genes with the highest repeat frequency. Finally, the intersection of the two methods (27 genes) (Table 2) was taken as the core genes for subsequent analysis (Figure 5).

GO and KEGG enrichment analysis of core genes
To further explore the molecular biological mechanism of the core genes in the effect of pioglitazone on skeletal muscle of PCOS patients, we performed GO and KEGG enrichment analysis of the core genes in the yellow module. GO enrichment analysis (Figure 6) showed that these 22 potential key genes are mainly enriched in enzyme activities that affect the cell (such as ubiquitin protein ligase activity and protein dimer activity), and are associated with muscle movement (such as cellular activity and excitatory protein filament binding) and the cell cycle (RNA polymerase II transcription factor binding). Meanwhile, KEGG enrichment analysis (Figure 7) showed that these 27 potential key genes are mainly related to the cell division cycle and neuromuscular disease.

Differential gene analysis
Because the blue module is highly correlated with both the pretreatment and post-treatment groups, we analyzed the differential expression of genes in the blue module to explore the genes that changed greatly between samples before and after pioglitazone treatment. We downloaded the GSE8157 data from the GEO database for differential gene analysis, and the genes in the blue module were screened and extracted (109 genes were extracted). We clustered the resulting data. The differential expression results were then screened and drawn as a volcano plot using the ggplot2 package (parameters set to logFC > 1, P value < 0.05) (Figure 8). Four significantly upregulated differential genes (MTBP, MAPK14, RBBP6 and PTPRC) were obtained for subsequent analysis. We also extracted the expression data corresponding to these differential genes from the original gene expression matrix and drew the cluster graph (Figure 9).

Discussion
Polycystic ovary syndrome (PCOS) is a common cause of infertility in women of childbearing age. This disease is often associated with endocrine disorders. Due to its unclear pathogenesis and the diversity of its complications, PCOS patients suffer from multiple physiological and psychological pressures. Evidence shows that pioglitazone can ameliorate metabolic disorders in PCOS patients and correct the preexisting downregulation of pathways in skeletal muscle (pioglitazone Pio repeat) in PCOS patients. To explore other effects of pioglitazone on genes expressed in skeletal muscle of PCOS patients, the skeletal muscle microarray dataset GSE8157, from PCOS patients before and after pioglitazone treatment, was used for bioinformatic analysis. Using WGCNA, modules highly correlated with clinical phenotypes were found, and different analysis methods were selected according to the characteristics of the correlation between each module and the clinical phenotypes. GO analysis of the core genes in the yellow module showed that these 27 genes were mainly related to the cell cycle and cellular enzymes. In addition, most of these genes were found to be related to immune cells (Table 3). Among them, some genes are related to the activation of T cells, such as GO:2001185 and GO:0002710, and some genes are related to antigen presentation, such as GO:0002483 and GO:0019885. These genes were found to be correlated with the HFE gene, which is directly targeted by cytotoxic TCR αβ T lymphocytes (Costa et al. 2015). Meanwhile, HFE positivity can select a CD8-positive T cell subtype, which supports the theory that the immune system is involved in the control of iron metabolism (Gong et al. 2018a). Studies have shown that the systemic immunity of PCOS patients is dominated by Th1-type immunity, and that Th1/Th2 imbalance is associated with obesity, especially abdominal obesity, and may be one of the underlying mechanisms of PCOS (Gong et al. 2018b).
All of these results suggest that the potential impact of pioglitazone on the immune cells of patients with PCOS may play a larger role in the treatment process and can be reflected in skeletal muscle cells. In addition to obesity in PCOS being associated with immune responses, insulin resistance is also associated with immune responses, which can stimulate the onset of low-grade inflammation (Zhou et al. 2021). PCOS and other high-risk reproductive disorders are associated with the NOD-like receptor protein 3 (NLRP3) inflammasome, a key regulator of the host immune response. Current research suggests that the NLRP3 inflammasome plays an important role in female reproductive disorders and provides new insights into the treatment of PCOS patients (Kumagai and Dunphy 2020).
KEGG analysis of the core genes showed that they are correlated with the cell cycle, for example through DNA replication and nucleotide excision repair. According to KEGG, the gene ID for base excision repair, DNA replication and nucleotide excision repair is POLE4 in each case, indicating that the POLE4 pathway plays an important role in the regulation of the cell cycle. Its specific function still needs further discussion.
Analysis of the differential genes of the blue module showed that the MTBP, MAPK14, RBBP6 and PTPRC genes were significantly up-regulated, which suggests that these four genes may be closely related to gene expression in skeletal muscle after treatment with pioglitazone. Studies have also shown that MTBP has at least 30,000 binding sites, most of which are located in regions related to transcriptional regulation. These regions are also the preferred regions for replication initiation, and these MTBP binding sites are functionally important for DNA replication and can provide an additional level of specificity. H3K4 methylation is involved in DNA replication (Rondinelli et al. 2015) and is highly relevant to MTBP binding sites. In addition, MTBP knockout can promote apoptosis and inhibit clonogenesis, and when

Conclusions
Current research evidence indicates that most of the changes in gene expression in skeletal muscle of PCOS patients treated with pioglitazone are reflected in effects on the cell cycle and immune cells, whether core genes or differential genes are analyzed. In other words, the therapeutic effect of pioglitazone on PCOS patients is well reflected in skeletal muscle through the related cell cycle and immune cell genes, but further studies are needed to investigate the cascade reactions of the cell cycle and immune cells in skeletal muscle of PCOS patients. Among them, the HFE gene has the most obvious regulatory effect on T lymphocyte activity and may be a potential gene target for the treatment of PCOS. This study may provide new research ideas for follow-up studies on the therapeutic effect of pioglitazone and on the treatment monitoring and prognosis of PCOS patients. However, this study still has deficiencies. We conducted the study only through bioinformatic analysis, and the role of some genes and the potential role of the drug still need to be verified by experiments.

