Potential Genes and Pathways Associated with Heterotopic Ossification, Derived from Analyses of Gene Expression Profiles


 Background: Heterotopic ossification (HO) represents pathological lesions, referred to the development of heterotopic bone in extraskeletal tissues around joints. This study will investigate the genetic characteristics of bone marrow mesenchymal stem cells (BMSCs) from HO tissues and explore the potential pathways involved. Methods: The gene expression profiles (GSE94683) was obtained from the Gene Expression Omnibus (GEO), including 9 normal specimens and 7 HO specimens, and differentially expressed genes (DEGs) were identified. Then, the protein‑protein interaction (PPI) networks, Gene Ontology (GO) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) enrichment analysis were performed for further analysis. Results: Totally 275 DEGs were differentially expressed, of which 153 were upregulated and 122 were downregulated. In the biological process (BP), the majority of DEGs, including EFNB3, UNC5C, TMEFF2, PTH2, KIT, FGF13 and WISP3, were intensively enriched in cell signal transmission items, including axon guidance, negative regulation of cell migration, peptidyl-tyrosine phosphorylation and cell-cell signaling. Moreover, KEGG analysis indicated that the majority of DEGs, including EFNB3, UNC5C, FGF13, MAPK10, DDIT3, KIT, COL4A4 and DKK2, primarily involved in mitogen-activated protein kinase (MAPK) signaling pathway, Ras signaling pathway, phosphatidylinositol-3-kinase/protein kinase B (PI3K/Akt) signaling pathway and Wnt signaling pathway. 10 hub genes were identified, including CX3CL1, CXCL1, ADAMTS3, ADAMTS16, ADAMTSL2, ADAMTSL3, ADAMTSL5, PENK, GPR18, CALB2.Conclusions: This study presents a novel insight into the pathogenesis of HO. 10 hub genes and most of the DEGs intensively involved in enrichment analyses may be the new candidate targets for the prevention and treatment of HO in the future.


Background
Heterotopic ossi cation (HO) represents pathological lesions, referred to the development of heterotopic bone in extraskeletal tissues around joints, which often occurs in the elbow, thigh, pelvis, and shoulder. (1, 2) The clinical manifestations were altered with the progression of the disease. Local pain, tenderness, swelling and stiffness of the involved joints were presented in the early stage of HO. At the end of the disease, it even presents with complete ankyloses, which seriously damages the quality of life. (3) The speci c causes of HO were still uncertain. It is generally believed that the occurrence is related to genetic susceptibility and severe trauma. Progressive ossifying brous dysplasia (FOP), a genetic hereditary form of HO, is extremely rare with the prevalence of about 1:2000000 in the population.(4-7) However, there was a much higher incidence of HO following soft-tissue trauma, amputations, central nervous system injury, such as cerebral anoxia, encephalitis, traumatic brain injuries and spinal cord lesions. (8,9) At present, the only effective way to treat HO is surgical resection of ossi ed tissue. Since HO is prone to recurrence, this method is only temporarily effective. (10) Moreover, when the ossi cation tissue invades the large blood vessels and nerves, the complication rate caused by resection is also increased. Studies have shown that HO is a disease with unknown pathogenesis, inadequate prevention and treatment, and high disability rate. (11) Therefore, in order to nd a more effective therapeutic method, we need to better understand the pathogenesis of HO. (12) Recently, the bioinformatics analysis of gene expression, which served as a key technology in mechanism exploration, play an important role in screening gene mutations and studying genetic behavior. Our study screened out the genes differentially expressed in bone marrow mesenchymal stem cells (BMSCs) from HO, which were identi ed from a public dataset. In this work, we seek to synthesize the protein-protein interaction (PPI) networks analysis, Gene Ontology (GO) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) enrichment analysis among DEGs to explore the potential mechanism of HO and candidate gene targets, which may be used to prevent, alleviate and even reverse the progress of HO.

Microarray data resource
A gene expression pro le dataset (GSE94683), The GSE94683,(13) a gene expression pro le dataset obtained from a public genomics data repository (GEO database, http://www.ncbi.nlm.nih.gov/geo/), was produced on the GPL10630 Agilent-021531 Whole Human Genome Oligo Microarray 4x44K. According to the annotation information in the platform, the probes were alternated into corresponding gene symbols. GSE94683 contains 9 normal specimens and 7 HO specimens. HO specimens were obtained from patients with injuries at Garches Hospital, Garches, France. Healthy specimens were obtained from patients after total hip surgery at Blois Hospital (Blois, France) and at Centre de Transfusion Sanguine des Armées (Percy Hospital, Clamart, France).

Data processing and Identi cation of DEGs
Principal components analysis (PCA) and the normalization were performed by ggbiplot and the preprocessCore package in R. After the preprocessing of the datasets, the identi cation of DEGs were processed using Multiple Linear Regression Models via limma package of R language. A probe without any corresponding gene symbol was eliminated and a gene with a plenty of probes was averaged. All DEGs and top 30 DEGs were demonstrated via volcano plot, which was performed by ggplot2 package.
DEGs were screened by using classical t test and |Log2(FoldChange)| > 2 and adj. P < 0.05 was de ned as a threshold and criteria for identi cation of statistically signi cant DEGs. The relative expression values of total DEGs and top 30 upregulated and downregulated DEGs, which were extracted from gene expression pro le, were demonstrated by the hierarchical clustering heatmaps via pheatmap package of R language.

Functional enrichment analysis for DEGs
KEGG is a data reservoir for unravelling advanced biological functions and pathways involved with genomic information. (14) GO is a tool for de ning concepts or classifying gene biological function by analyzing a large list of genes. (15) The Database for Annotation, Visualization and Integrated Discovery (DAVID), a web-based database resource that explores gene data, facilitates researchers to reveal the potential genetic signi cance.(16) The DAVID performed the GO and KEGG enrichment analyses. The functional enrichment analyses were processed via ggplot2 package in R and demonstrated by the bubble plots.

Integration of PPI network analysis
Search Tool for the Retrieval of Interacting Genes (STRING), an original online program that predicts the relationships and interactions among proteins involved, (17) was applied to construct functional network among proteins. Cytoscape, an open-accessed visual tool, was able to visualize processes and interactions among proteins. Molecular Complex Detection (MCODE), a plugin of Cytoscape, was used to explore intensively communications and discover the most signi cant module. "MCODE scores ≥ 5", "kscore = 2", "Max depth = 100", "node score cut-off = 0.2" and "degree cut-off = 2" were selected as criterion for identi cation of signi cant modules.

Hub genes selection and analyses
CytoHubba, a plugin of Cytoscape applied with 12 topological analysis methods, was used to detect more comprehensive hub genes and avoid missing. According to MCC, MNC, DMNC and Degree, the top 25 genes were predicted. Venn diagrams were used to screen the hub genes by identifying the overlapping genes.

Data source and data preprocessing
As large amounts of data were integrated in a gene expression pro le (GSE94683), the original data was downloaded from GEO database. PCA of datasets demonstrated that there were signi cant differences between HO samples and control samples (Fig. 1). Normalization of the data was presented in a boxplot (Fig. 2). Functional enrichment analysis for DEGs DAVID was used to perform functional enrichment analyses to explore the biological classi cation of DEGs and the top 10 items were shown in Fig. 5. In the biological process (BP) ontology, the majority of DEGs, including EFNB3, UNC5C, TMEFF2, PTH2, KIT, FGF13 and WISP3 were intensively enriched in cell signal transmission items, including axon guidance (9 genes), negative regulation of cell migration (7 genes), peptidyl-tyrosine phosphorylation (6 genes) and cell-cell signaling (10 genes). Moreover, KEGG analysis indicated that the majority of DEGs, including EFNB3, UNC5C, FGF13, MAPK10, DDIT3, KIT, COL4A4 and DKK2, primarily involved in axon guidance (7 genes), mitogen-activated protein kinase (MAPK) signaling pathway (9 genes), Ras signaling pathway (7 genes), phosphatidylinositol-3kinase/protein kinase B (PI3K/Akt) signaling pathway (9 genes) and Wnt signaling pathway (5 genes).

PPI network construction and hub gene selection
The PPI network has been constructed after the data of the functional relationships and interactions obtained from the STRING were imported into the Cytoscape, and the top module was identi ed (Fig. 6). According to MCC, MNC, DMNC and Degree, the top 25 genes (10%) were predicted. Venn diagrams were used to select the hub genes, and CX3CL1, CXCL1, ADAMTS3, ADAMTS16, ADAMTSL2, ADAMTSL3, ADAMTSL5, PENK, GPR18, CALB2, were identi ed as the hub genes (Fig. 7).

Discussion
HO refers to a pathological process, involving a variety of etiology, location, mechanism and cell origin. (3) It is reported that early diagnosis of HO is di cult due to lack of evident signs and symptoms and limited treatment. (18,19) How to manipulate the in ammatory cascades to control HO formation is just beginning to be understood.(3) Therefore, it is urgent to explore the mechanisms of the formation of HO at the molecular level, so as to nd novel, adequate and effective treatment methods to alleviate, delay, or even reverse the development of HO.
The whole-genome microarray and bioinformatic analysis facilitate us to detect genetic differences in the development of HO, and provide a better way to explore the pathogenesis and identify novel candidates for early diagnosis and precise treatment. In current study, 275 DEGs were identi ed, including 122 downregulated genes and 153 upregulated genes.
After that, the functional enrichment analysis was performed, and the relationships and interactions among DEGs were predicted. The majority of DEGs, including KIT, FGF13, EFNB3, UNC5C, TMEFF2, WISP3 and PTH2 were intensively enriched in cell signal transmission items, including axon guidance, negative regulation of cell migration, peptidyl-tyrosine phosphorylation and cell-cell signaling. Moreover, KEGG analysis indicated that the majority of DEGs, including KIT, DDIT3, FGF13, EFNB3, UNC5C, MAPK10 and DKK2, primarily involved in axon guidance, MAPK signaling pathway, Ras signaling pathway, PI3K-Akt signaling pathway and Wnt signaling pathway. This is consistent with the recent research ndings that muscle-derived mesenchymal stem cells in the soft tissue migrate to the area of trauma and in ammation, differentiate into osteoblasts and form heterotopic bone. (20) The PPI network of the DEGs has been constructed. The results indicates that CX3CL1, CXCL1, ADAMTS3, ADAMTS16, ADAMTSL2, ADAMTSL3, ADAMTSL5, PENK, GPR18, CALB2 are hub genes.
Chemokines are able to mediate the migration and localization of immunocyte in the process of in ammation, which plays a vital role in manipulating the immune system. According to the conserved cysteine motifs, chemokines in mankind are divided into four families (C, CC, CXC, and CX3C). Most chemokines are involved in the differentiation of osteoblasts and/or osteoclasts to varying degrees. CX3CL1, belonging to the CX3C subgroup, is a combination of the properties of chemotactic agents and adhesion molecules, which has been shown to continuously control reshaping bone matrix by regulating bone remodeling at the cellular level. (21,22) Current studies have reported that osteoblasts are able to express CX3CL1, and osteoclast progenitors are able to produce CX3C receptor 1 (CX3CR1). (23) Cytokines originated from in ammation are able to signi cantly induce the expression of CX3CL1 in osteoblasts. Meanwhile, CX3CR1 are identi ed as a candidate for screening osteoclasts with in ammatory reaction. (24)(25)(26) It was reported that the interaction between membrane-bound CX3CL1 on osteoblasts and CX3CR1 expressed by osteoclast progenitors is able to promote the progress of terminal differentiation of osteoclast progenitors.(23) CX3CR1-de cient mice showed moderate but signi cantly increased trabecular bone mass, which was mainly due to the decrease in the number of osteoclasts. (27) In vitro experiments suggested that this phenotype can be explained by the decreased ratio of receptor activator of nuclear factor-kappa B ligand/osteoprotegerin (RANKL/OPG), and the defect of spontaneousformation of osteoclasts from CX3CR1-de cient bone marrow cells. (27) In general, it indicates that CX3CL1 may be involved in osteoclast-mediated bone loss. CXCL1, belonging to the CXC subgroup, is a kind of growth factors that could send signals through CXC receptor 2 (CXCR2).(28) CXCR2, a G proteincoupled receptor, is found remarkably expressed in osteoclast precursors, while it cannot be detected in the osteoblast lineage predominantly. Cell culture studies have con rmed that recombinant CXCL1 is able to stimulate the migration of osteoclast precursors in a dose-dependent manner. (29) Moreover, CXCL1 has been proven to promote osteoclast formation in vitro. (30) The superfamily of ADAM metallopeptidase with thrombospondin type 1 motif (ADAMTS) comprises 19 distinct ADAMTS, which are consist of secreted enzymes and seven ADAMTS-like proteins (ADAMTSLs) without enzymatic activity. (31)(32)(33) Most of them are involved in the generation and degradation of extracellular matrix (ECM) molecules, and participate in the formation and remodeling of connective tissue and the occurrence and development of diseases. (32,34) The ECM of normal cartilage maintains a dynamic balance between generation and degradation, which is in a state of equilibrium. There is a loss of balance between the proteases and their inhibitors that degrade the ECM in pathological cartilage. ADAMTS3 has been proven to be involved in the formation of type II brous collagen in articular cartilage.(35) ADAMTS16, there is no known speci c function for articular cartilage yet. Current studies have shown that the expression of ADAMTS16 is increased in cartilage and synovium from osteoarthritis patients compared with the normal.(36, 37) The steady increase of expression of ADAMTS16 will cause the inhibition of cell proliferation, migration and adhesion, and a decreased expression of matrix metalloproteinase-13 (MMP13) in chondrosarcoma cells.(38) Studies have indicated that ADAMTSLs possess speci c extracellular ligands and several of them are ECM-binding proteins that act at the cellmatrix interface. (39)(40)(41) It is well known that brillin micro brils are able to bind to ADAMTSLs. (42)(43)(44)(45) Therefore, ADAMTSLs can be regarded as matricellular proteins, which are a kind of non-structural proteins expressedin ECM dynamically and with regulatory effects. PENK, a classically identi ed opioid gene, was initially shown to be expressed almost exclusively in the mature nervous and neuroendocrine systems. Current studies have revealed that the expression of PENK is selectively increased in mineralized cultures, and it is essential for the formation and remodeling of bone structure.(46) The expression of PENK in osteoblasts is regulated by bone-targeting hormones, which makes a valuable contribution toward bone development. (47) The interaction among bone formation and hub genes GPR18 and CALB2 have not been reported yet. GPR18 is a receptor for endocannabinoid N-arachidonyl glycine (NAGly).(48, 49) GPR18 may be involved in the regulation of immune system, whose activity is mediated by G proteins that can inhibit adenylyl cyclase.(48) CALB2, a member of the troponin C superfamily, is an intracellular calcium-binding protein and is abundant in auditory neurons and functions as a modulator of neuronal excitability.

Conclusions
In summary, the current study aims to determine potential biomarkers that might be related to the development of HO and to explore the potential mechanism of HO. Totally, 275 DEGs and 8 hub genes were identi ed, which provided a new promising perspective for HO diagnosis and treatment. However, the biological functions of these biomarkers in the pathogenesis of HO need to be further studied.

Consent for publication
Not applicable.

Availability of data and materials
All data generated or analyzed during this study are included in published articles.

Competing interests
All authors declare they have no con icts of interest or competing interests.

Funding
This research did not receive any speci c grant from funding agencies in the public, commercial, or notfor-pro t sectors.
Authors' contributions YZY and SB conceived and designed the study. YZY, LDL and LX analyzed the data and wrote the manuscript. The original text was drafted and modi ed by GR and WYW. All authors read and approved the nal manuscript. Figure 1 Principal components analysis of datasets. Red represents samples from normal cohort. Blue represents samples from heterotopic ossi cation.        The most signi cant module was identi ed with 15 nodes and 50 edges. Small-sized or brightly colored circles and lines mean a lower value of combined score.