Identication of Key Genes and Pathways Associated with Differences of Subchondral Bone in Osteoarthritis

Background Osteoarthritis (OA) is a common chronic disease worldwide. Subchondral bone is an important pathological change in OA and responds more rapidly to adverse loading and events compared to cartilage. However, the pathogenic genes and pathways of subchondral bone are largely unclear. Objective This study aimed to identify signature differences in genes involved in knee lateral tibial (LT) and medial tibial (MT) plateaus of subchondral bone tissue while exploring their potential molecular mechanisms via bioinformatics analysis. Methods First, the gene expression data of GSE51588 was downloaded from the GEO database. Differentially expressed genes (DEGs) between knee LT and MT were identied, and functional enrichment analyses were performed. Then, a protein-protein interactive network was constructed in order to acquire the hub genes, and modules analysis was conducted using STRING and Cytoscape for further analysis. The enriched hub genes were queried in DGIdb database to nd suitable drug candidates in OA. A total of 202 DEGs (112 upregulated genes and 84 downregulated genes) were determined. In the PPI network, ten hub genes were identied. Five signicant modules were identied using the MCODE plugin unit. Functional enrichment analysis revealed the most important signaling pathways. Six of the ten hub genes were targetable by a total of 35 drugs, suggesting their possible therapeutic use for OA . The identied hub genes and functional enrichment pathways were implicated in the development and progression of subchondral bone in OA, thus improving our understanding of OA and offering molecular targets for future therapeutic modalities. into different cell lineages, like muscle, bone, cartilage, or fat [36]. These studies tested the importance in the pathway of regulation of lipolysis in adipocytes for OA. Furthermore, in a study on the overexpression of miR-10a-5p inducing human primary chondrocyte, they also enriched important terms including PPAR signaling pathway [37]. This nding is also consistent with the of Li [38] and Yi [39] those who performed enrichment analysis using the tissue of OA synovial membranes. The α subunit of AMPK exists in two isoforms in the including AMPKα1 AMPKα2, where both serve as catalytic subunits Increased NR4A1 acts on to trigger leading to mitochondrial transition pore related chondrocyte cell


Introduction
Osteoarthritis (OA) is one of the most common forms of arthritis in the elderly population, which affects more than 6-32% of people worldwide [1]. In previous studies, researchers have primarily focused on cartilage tissue and chondrocytes in regard to OA treatment and their associated mechanisms [2,3]. Due to rapid advances in medical technologies, synovial tissue as well as subchondral bone has been further investigated [4]. Unfortunately, existing therapies are unable to prevent or reverse the development of OA apart from providing pain relief [5]. Eventually, patients with OA in advanced stages commonly receive joint replacement surgery. Hence, there is an urgent need to explore potential biomarkers and therapeutic targets for OA.
Subchondral bone located under articular cartilage is an important part of the joint as it is a composite tissue composed of mineral, organic matrix (non-collagenous proteins, collagen, and lipids), cells, and water [6]. Anatomically, subchondral bone consists of both the underlying subchondral trabecular bone and a layer of corticalized subchondral bone plate. Due to subchondral bone's specialized structure, it may effectively spare the overlying articular cartilage by dispersing axial loads across the joint [7]. The major function of subchondral bone is to absorb stress and maintain the shape of the joint cartilage [9]. By visualizing the whole knee's structure via magnetic resonance imaging (MRI), the severity of knee OA is associated with increased quantities of tibial subchondral bone [10]. As subchondral sclerosis increases, a reduction in the density of the underlying subchondral bone and exibility of the osteochondral junction is present. The texture of subchondral bone may generally be regarded as an important biomarker to predict the progression of OA [11].
Subchondral bone is currently considered a challenge in view of achieving osteochondral regeneration and self-repair [12]. The aforementioned studies provide an important foundation for further exploration of the development of OA. However, the mechanism of aberrant subchondral bone formation and articular cartilage degeneration remains unclear in OA.
Due to strides in biomedicine, numerous studies have shown that the occurrence and development of OA may be mediated by multiple genes and signaling pathways [13]. In addition, microarray technologies have been used to perform multiple gene expression pro ling studies on human knee OA tissue. Integrated bioinformatics techniques combined with expression pro ling may offer additional advantages compared to independent studies [14]. The majority of biomedical studies using gene expression microarray pro ling thus far have focused on human articular cartilage [15], synovium [16], or synovial broblasts [17]. Consequently, none have been performed on human subchondral bone tissue, particularly in regard to LT and MT.
This study attempts to identify the biomarkers, pathways and therapeutic targets involved in LT and MT of knee subchondral bone tissue via bioinformatics analysis and microarray gene expression pro ling in order to elucidate the subchondral bone tissue mechanism involved in OA and develop more effective treatment options.

Microarray Data Information.
The NCBI Gene Expression Omnibus (GEO) is a public genomics data repository (http://www.ncbi.nlm.nih.gov/geo) that stores gene expression datasets, platform records and original series [18]. Gene expression pro le dataset GSE51588 [19] was obtained from the GEO database, which was expressed on the GPL13497 platform [Agilent-026652 Whole Human Genome Microarray 4 × 44K v2 (Probe Name version), Agilent Technologies Manufacturer]. The GSE51588 dataset, which was the overlying articular cartilage or synovium cells, contained total RNA from the human knee (40 OA and 10 non-OA) from regions of LT and MT. Only 40 samples with OA were selected in our study for further investigation.

Data Processing and Identi cation of DEGs.
The common statistical method entailed in DEG analysis is to select genes at signi cantly different expression levels between samples [20]. We used the R software (version 3.6.1) and limma packages [21] to identify DEGs between knee lateral tibial (LT) and medial tibial (MT) plateaus in patients with OA. Then, the P-values were adjusted for comparisons using the false discovery rate (FDR) of the Benjamini and Hochberg (BH) test [22] in the limma package. The criteria for DEG was |log2-fold change (FC)| ≥2 and adjusted P-values < 0.01.

Functional and pathway enrichment analysis
The Database for Annotation, Visualization, and Integrated Discovery (DAVID; https://david.ncifcrf.gov/) [23] integrates biological data and analysis tools that provide systematic information for the list of DEGs. Gene Ontology (GO, www.geneontology.org) is widely used in the annotation of cell components (CCs), molecular functions (MFs), and Biological processes (BPs) of DEGs. The KEGG (Kyoto Encyclopedia of Genes and Genomes; www.genome.ad.jp/kegg) is a database resource used to extract pathway information from molecular interaction networks [24]. In the present study, GO and KEGG pathway enrichment analysis of DEGs were conducted using the DAVID online tool. The p-value <0.05 and gene count >2 were set as the cutoff point.

Construction of the PPI network analysis
The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database (http://www.string-db.org/) is an online biological database and website designed to construct the PPI networks in molecular biology [25]. The DEGs (species: Homo) were mapped to STRING(PPI score >0.4), and then Cytoscape software 3.7.1 (http://www.cytoscape.org/) was used to visualize the PPI network. Afterward, the hub genes were identi ed according to degree of connectivity.

Functional module analysis of PPI network
Functional module analysis of PPI networks was performed using the molecular complex detection (MCODE) to cluster the algorithm accompanied with Cytoscape [26]. The analysis parameters were set by default. Additionally, the module analysis results were executed for the functional and pathway enrichment analyses in each module using DAVID, with p<0.05 set as the threshold indicating a statistically signi cant difference.

Drug-gene interactions.
This study employed the Drug-gene Interaction Database (DGIdb; http://www.dgidb.org), which was used as a potential target to search for existing drugs [27] in order to drug-gene interactions between the nal hub genes. Such drugs targeting genes or pathways may reveal potential treatments of OA.

Functional and pathway enrichment analysis
To further explore the potential targets of these DEGs between LT and MT, we performed GO and KEGG pathway analyses (Figs. 3 and 4). As illustrated in Fig. 3, we classi ed GO analysis into 3 functional categories: including top ten of BP (cellular response to tumor necrosis factor, response to tumor necrosis factor, temperature homeostasis, etc.), top three of CC (collagen − containing extracellular matrix, lipid droplet, collagen trimer), and the top ten of MF (receptor-ligand activity, extracellular matrix structural constituent, cytokine activity, etc.). The top ten of KEGG pathway (PPAR signaling pathway, Tyrosine metabolism, and Cytokine-cytokine receptor interaction, etc.) were described in Fig. 4.
We also respectively show annotation of the up-regulated genes as well as down-regulated genes (Tables 2 and 3). In the BP group, the up-regulated DEGs were mainly enriched for genes involved in the triglyceride metabolic process, chemical homeostasis, and acylglycerol metabolic process. The downregulated DEGs were enriched genes that participated in epithelial cell proliferation, negative regulation of developmental processes, and locomotion. In the CC group, the upregulated DEGs were mainly enriched for genes associated with lipid particles, intrinsic components of the plasma membrane, and integral components of the plasma membrane. Moreover, the downregulated DEGs were enriched for genes associated with the proteinaceous extracellular matrix, extracellular matrix, and collagen trimer. In the MF group, the up-regulated DEGs were mainly enriched from genes associated with protein homodimerization activity, protein dimerization activity, and alcohol dehydrogenase activity; and the down-regulated DEGs were enriched for genes encoding proteins for receptor binding, integrin binding, and cytokine receptor binding. In the KEGG pathway group (P < 0.005), the up-regulated DEGs were enriched for genes associated with a total of nine terms (Regulation of lipolysis in adipocytes, PPAR signaling pathway, Tyrosine metabolism, etc.) but down-regulated DEGs only one term (protein digestion and absorption). (Table 4)

Construction of the PPI network analysis
Based on the data in the STRING database, we constructed a PPI network through the Cytoscape software, which contained 130 nodes and 362 edges (Fig. 5). Among the 130 genes, the top 10 hub genes were identi ed according to degree connectivity, including LEP, PNPLA2, LIPE, PLIN1, ADIPOQ, FABP4, LPL, SLC2A4, DGAT2, and CIDEC. LEP demonstrated the highest degree score (score = 31). (Table 5)

. Functional module analysis of PPI networks
To further analyze the interactions of proteins, 5 modules were detected using the Cytoscape plugin MCODE (Table 6 and Fig. 5). In addition, functional enrichment analyses for these modules were performed. The pathway enrichment analysis showed that Module 1 was mainly associated with PPAR signaling pathway, AMPK signaling pathway, and Regulation of lipolysis in adipocytes. Module 2 was found to be mainly associated with Chemokine signaling pathway, cAMP signaling pathway, and Neuroactive ligand-receptor interaction. Furthermore, Module 3 was mainly associated with Tyrosine metabolism, Drug metabolism -cytochrome P450, and Fatty acid degradation.  Using the10 hub genes the PPI network analysis revealed the potential targets. Finally, only ve genes were explored as potential drugs: LEP (9 drugs), ADIPOQ (3 drugs), LPL (8 drugs), SLC2A4 (13 drugs), and LIPE and DGAT2 (1 drug each). A list of the 35 drugs was compiled, which were all approved for use in humans by the FDA, satis ed the standards for being possible drug treatments for OA (Table III and Fig. 6).

Discussion
Despite demonstrated risk factors such as obesity, mechanical abnormality, genetic predisposition, and age, the precise pathogenesis underlying OA remains unclear; which leads to the present therapeutic program for OA is only primarily relieving symptoms not speci c treatment. Several prior OA studies emphasized only synovium [28], articular cartilage [29], and meniscus [30] while ignoring the role of subchondral bone in OA development [31]. Recently, increasing evidence has indicated that the degeneration of articular cartilage was associated with abnormal metabolism of subchondral bone [32,33]. Microarray technology and bioinformatics techniques have been widely used in predicting the potential therapeutic targets. This study attempted to further understand the molecular mechanisms involved in knee subchondral bone tissue of LT and MT using Microarray technology and bioinformatics.
To the best of our knowledge, this is the rst study that identi es the key genes and pathways between LT and MT from knee subchondral bone tissue using bioinformatics analysis. Data was extracted from GSE51588, including 40 OA (20 LT and 20 MT from human knee subchondral bone tissue) and 10 non-OA samples (5 LT and 5 MT from human knee subchondral bone tissue). Only 40 OA samples were enrolled in the present study, and a total of 202 DEGs were identi ed between LT and MT of knee subchondral bone, including 119 upregulated and 84 downregulated genes. Moreover, 10 genes were further screened with higher degree scores (> 15) and the most signi cant expression changes, including LEP, PNPLA2, LIPE, PLIN1, ADIPOQ, FABP4, LPL, SLC2A4, DGAT2, and CIDEC. As numerous studies have demonstrated that coexpressed genes are frequently related to those with similar expression pro les that are also commonly involved in similar biological processes [34], to better reveal the functions of the identi ed DEGs, we performed GO and KEGG enrichment analyses as well as constructed of the PPI network analysis.
The KEGG enrichment analysis of the DEGs demonstrated that the hub genes PLIN1, FABP4, PNPLA2, and LIPE were mainly enriched in regulation of lipolysis in the adipocytes; LPL, PLIN1, FABP4, and ADIPOQ were mainly involved in PPAR signaling pathway, while LEP, SLC2A4, ADIPOQ, and LIPE were mainly enriched in AMPK signaling pathway. Adipocytes are derived from multipotent mesenchymal stem cells (MSCs), and various molecular events commit MSCs to the adipocyte lineage [35]. However, MSCs are treated as cellular sources for multiple tissue regeneration, which can differentiate into different cell lineages, like muscle, bone, cartilage, or fat [36]. These studies tested the importance in the pathway of regulation of lipolysis in adipocytes for OA. Furthermore, in a study on the overexpression of miR-10a-5p inducing human primary chondrocyte, they also enriched important terms including PPAR signaling pathway [37]. This nding is also consistent with the teams of Li [38] and Yi [39] those who performed enrichment analysis using the tissue of knee OA synovial membranes. The α subunit of AMPK exists in two isoforms in the mammalian, including AMPKα1 and AMPKα2, where both serve as catalytic subunits [40]. Increased NR4A1 acts on the AMPK/dynaminrelated protein 1 (Drp1) pathway to trigger mitochondrial ssion, leading to mitochondrial permeability transition pore related chondrocyte cell death [41]. Such results are consistent with that of the present study and highlight the credibility of the enrichment analysis.
Moreover, PPI networks involved with DEGs were constructed, where the top 10 hub genes were observed to be LEP, PNPLA2, LIPE, PLIN1, ADIPOQ, FABP4, LPL, SLC2A4, DGAT2, and CIDEC. LEP has mitogenic activity on vascular endothelial cells and promotes angiogenesis [42]. In the early stage of OA, the high expression level of LEP may promote the phosphorylation of AKT [43], increasing bone remodeling via triggering the downstream signaling pathway. Prior studies have shown that skeletal muscle dysfunction and decline in mobility are associated with the PNPLA2/ATGL pathway [44]. LIPE is in white adipose tissue of retinyl ester hydrolase; LIPE-null mice have shown decreased levels of retinaldehyde, retinol and all-trans RA as well as increased levels of retinyl esters [45]. PLIN proteins in mammalian involved a conserved PAT domain (Plin1, Plin2 and Plin3) which appears to be important in lipid droplet targeting [46]. Early clinical studies have shown that the expression level of ADIPOQ are correlated with disease severity of knee OA [47]. ADIPOQ may be treated as a potential biomarker for knee damage in OA [48]. Six hub genes (LEP, LIPE, LPL, SLC2A4, FABP4,and ADIPOQ) corroborate the ndings of Gu et al. (2019) who used weighted gene co-expression network analysis to identify the biomarkers of subchondral bone [49].

Conclusions
In the present study, a bioinformatics analysis was performed to investigate knee LT and MT plateaus of subchondral bone tissue in OA. The identi ed functional enrichment pathways and 10 hub genes were identi ed providing further insight into the underlying pathogenesis and facilitate the diagnosis and treatment of OA. However, a limitation was that the screened genes and pathways lacked additional experimentation. Therefore, further studies are required to con rm the corresponding results. Abbreviations OA, Osteoarthritis; LT, lateral tibial;MT, medial tibial; GEO, Gene Expression Omnibus; DEGs, Differentially expressed genes; PPI, protein-protein interactive; DGIdb, Drug-Gene Interaction database; FDR, false discovery rate; BH, Benjamini and Hochberg; DAVID, Database for Annotation, Visualization, and Integrated Discovery; CCs, cell components; MFs, molecular functions; BPs, Biological processes; STRING, Search Tool for the Retrieval of Interacting Genes/Proteins; KEGG, Kyoto Encyclopedia of Genes and Genomes; MCODE, molecular complex detection Declarations Authors' contributions RGY performed comparative analysis using bioinformatics tools. JYZ, EYF, HYW, YGZ, WLW, ZLL, LQL,YTH, and CLW participated in data analysis and discussion, RGY interpreted data and wrote the manuscript. YYZ organized and supervised the project. All authors read and approved the nal manuscript. RGY was responsible for manuscript review. Figure 1 Volcanic map of all genes. Red dots indicate upregulated genes, blue dots indicate downregulated genes and gray dots indicate genes that are not regulated.