Potential Common Key Genes Associated with Chronic Periodontitis and Low Birth Weight: A Case Control Study Using Bioinformatics Analysis of Pooled mRNA Expression Datasets

Background: Chronic periodontitis (CP) is a multifactorial disease associated with many systemic diseases. However, the precise association between CP and low birth weight (LBW) remains unclear. Therefore, this study aimed to elucidate common differentially expressed genes (DEGs), biomarker candidates, and upstream regulators related to key genes between CP and LBW. Methods: We investigated molecular relations and biomarker candidates using pooled microarray datasets of CP (GSE12484) and LBW (GSE29807) in the Gene Expression Omnibus (GEO). Datasets were analyzed for common DEGs using GEO2R, an R-based web application for GEO data analysis. Common DEGs, biomarker candidates, and upstream regulators in DEGs between CP and LBW were analyzed using the Database for Annotation, Visualization and Integrated Discovery (DAVID), Search Tool for the Retrieval of Interacting Genes (STRING), and QIAGEN’s Ingenuity Pathway Analysis (IPA). Results: Three significantly upregulated and 20 significantly downregulated common DEGs between CP and LBW were identified. Some biological processes and pathways of these downregulated genes were associated with the cell cycle. Biomarker candidates among common DEGs were proline-rich coiled-coil 2A (PRRC2A), topoisomerase (DNA) II alpha (TOP2A), neural cell adhesion molecule 1 (NCAM1), and calcium channel, voltage-dependent, alpha 2/delta subunit 3 (CACNA2D3). Many upstream regulators of these biomarker candidates were factors associated with inflammation, immunity, the cell cycle, and growth and development, or were hormones related to pregnancy. Conclusions: The results of this study suggest that PRRC2A, TOP2A, NCAM1, and CACNA2D3 are common biomedical key genes between CP and LBW. The expression states of these genes, which are related to inflammation, hormones, the cell cycle, and growth and development, were common to both CP and LBW in blood.

Background
Chronic periodontitis (CP) is characterized by chronic destructive inflammation in periodontal tissues, such as gingiva, cementum, periodontal ligament, and alveolar bone. CP is an inflammatory disease caused by multiple factors, such as immunological response, hormone-regulated gestation, oral bacterial infections, genetic factors, environmental factors, and systemic diseases. Recently, epidemiological studies have reported that CP is related to systemic diseases and conditions such as diabetes, rheumatism, cardiovascular disease, chronic kidney disease, premature birth, and low birth weight (LBW; LBW infants are those weighing < 2,500 g at birth). The mechanisms of CP development and progression are complex and have not been completely elucidated. In periodontology, it is important to carry out research on the associations between CP and these systemic diseases to elucidate the mechanisms of CP and identify potential targets for clinical treatment.
We devised a study to investigate the relation between CP and LBW using gene expression profiling in blood. Gene expression profiling is a powerful tool that can elucidate the mechanisms of and relations between multifactorial diseases, such as CP and LBW. Analyzing pooled gene expression datasets for CP or LBW, even when they were obtained under different experimental conditions with different subjects, is useful for investigating common genetic factors.
Data sharing and the integration of pooled omics data as screening tools to investigate the mechanisms of and relations between multifactorial diseases have received increasing attention. The use of pooled microarray gene expression datasets is an effective method for reducing high-throughput hybridization costs and compensating for insufficient amounts of mRNA sampling [32-34]. Thus, the National Center for Biotechnology Information developed the Gene Expression Omnibus (GEO) to promote the pooling and sharing of publicly available transcriptomic data and facilitate biomedical research [35-49].
This study aimed to investigate the common genetic factors and relations between CP and LBW through a bioinformatics analysis of differentially expressed genes (DEGs) using pooled microarray datasets from the GEO database.

Methods
In this study, we used pooled microarray gene expression datasets from the public GEO database as a screening tool to investigate the molecular factors common to CP and LBW.

Selection Of Microarray Datasets From The GEO Database
We selected datasets based on the experimental conditions of CP (GSE12484) and LBW (GSE29807) in the National Center for Biotechnology Information (NCBI) GEO database (http://www.ncbi.nlm.nih.gov/geo/). These datasets, each of which includes more than two healthy controls and more than two diseased patients, were selected based on the typical clinical condition. Samples from peripheral or umbilical cord blood were used, along with a microarray platform (GPL96: Affymetrix GeneChip™ Human Genome U133A Array [HG-U133A] or GPL570: Affymetrix GeneChip™ Human Genome U133 Plus 2.0 Array [HG-U133 Plus 2]) (Table 1). The CP or LBW DEGs between patients and normal healthy controls were identified using GEO2R, an R-based web application for analyzing GEO data. Cutoff values of p < 0.05 and an absolute log fold change (|logFc|) > 1 were used. Common up/downregulated DEGs between CP and LBW were then extracted.
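The thresholding and extraction step described above can be illustrated with a minimal Python sketch. It is not the GEO2R workflow itself: the function names `split_degs` and `common_degs`, the table layout (gene symbol mapped to a p-value and a logFc), and all gene names and values are hypothetical and serve only to show how the two cutoffs and the intersection of same-direction DEGs fit together.

```python
from typing import Dict, Set, Tuple

# Hypothetical layout: gene symbol -> (p-value, logFc), e.g. parsed from an exported table.
DegTable = Dict[str, Tuple[float, float]]


def split_degs(table: DegTable, p_cutoff: float = 0.05,
               lfc_cutoff: float = 1.0) -> Tuple[Set[str], Set[str]]:
    """Return (upregulated, downregulated) genes passing p < p_cutoff and |logFc| > lfc_cutoff."""
    up = {g for g, (p, lfc) in table.items() if p < p_cutoff and lfc > lfc_cutoff}
    down = {g for g, (p, lfc) in table.items() if p < p_cutoff and lfc < -lfc_cutoff}
    return up, down


def common_degs(first: DegTable, second: DegTable) -> Tuple[Set[str], Set[str]]:
    """Intersect DEGs that change in the same direction in both datasets."""
    up_1, down_1 = split_degs(first)
    up_2, down_2 = split_degs(second)
    return up_1 & up_2, down_1 & down_2


if __name__ == "__main__":
    # Made-up values for illustration only; these are not results from GSE12484 or GSE29807.
    cp = {"GENE_A": (0.01, 1.8), "GENE_B": (0.03, -1.4), "GENE_C": (0.20, 2.5)}
    lbw = {"GENE_A": (0.02, 1.2), "GENE_B": (0.001, -2.1), "GENE_C": (0.01, 1.9)}
    print(common_degs(cp, lbw))
```

In this toy input, GENE_C is excluded because it fails the p-value cutoff in one dataset, which mirrors the requirement that a common DEG be significant in both CP and LBW.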

Functional And Pathway Enrichment Analysis Of Common DEGs
The results of the functional enrichment analysis of common up- or downregulated DEGs based on the biological process (BP) category of Gene Ontology (GO), and of the pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome pathway databases, both performed using DAVID, are shown in Tables 4-6. Predominantly, common downregulated DEGs were related to the cell cycle and metabolic processes in BP, and to the cell cycle, cell composition, erythrocyte function, and transport pathways for small molecules in the KEGG and Reactome pathways.
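The idea behind such an enrichment analysis can be sketched with a plain hypergeometric upper-tail test. This is an illustrative assumption, not the exact DAVID procedure: DAVID reports an EASE score, which is a modified (more conservative) Fisher exact test. The function name and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Probability of seeing at least k annotated genes in a list of n genes.

    N: genes in the background, K: background genes carrying the annotation
    (e.g. a cell cycle term), n: genes in the DEG list, k: DEGs carrying the annotation.
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the hypergeometric probabilities for k, k + 1, ..., upper.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 7 of 20 DEGs annotated, 600 of 13000 background genes annotated.
    print(hypergeom_enrichment_p(k=7, n=20, K=600, N=13000))
```

A small p-value from this test indicates that the annotation occurs in the DEG list more often than expected by chance given the background.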

Constructed Protein-Protein Interaction (PPI) Networks Of Common DEGs
PPI networks of common up- and downregulated genes were constructed using the Search Tool for the Retrieval of Interacting Genes (STRING) (https://string-db.org/cgi/about.pl).
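Once a PPI edge list is available, the most connected gene can be found by counting interaction partners. The following Python sketch assumes an edge list entered by hand; it contains only the TOP2A interactions with CCNB2, TTK, NEK2, and CENPA that are reported later in this paper, so it is not a full STRING export, and the helper name `node_degrees` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def node_degrees(edges: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count distinct interaction partners per gene in an undirected edge list."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {gene: len(partners) for gene, partners in neighbors.items()}


# Hand-entered edges limited to interactions stated in the text (an assumption,
# since the complete network is not reproduced here).
EDGES = [("TOP2A", "CCNB2"), ("TOP2A", "TTK"), ("TOP2A", "NEK2"), ("TOP2A", "CENPA")]

if __name__ == "__main__":
    degrees = node_degrees(EDGES)
    hub = max(degrees, key=degrees.get)
    print(hub, degrees[hub])
```

Using sets of neighbors rather than raw counts keeps duplicated edges from inflating the degree of a node.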

Elucidation Of Common Molecular Biomarker Candidates
Common molecular biomarker candidates between CP and LBW were identified using IPA software (Table 7). Biomarker candidates are used to characterize disease states for purposes such as diagnosis, evaluation of efficacy, monitoring of disease progression, and prognosis. Proline-rich coiled-coil 2A (PRRC2A) was a significant common upregulated DEG, whereas calcium channel, voltage-dependent, alpha 2/delta subunit 3 (CACNA2D3), neural cell adhesion molecule 1 (NCAM1), and TOP2A were significant common downregulated DEGs.

Analysis Of Upstream Regulators Of Dominant Common DEGs
Upstream regulators of common DEGs were analyzed using comparison analysis in IPA software. DEGs were uploaded into the IPA software, and genetic networks were analyzed in the Ingenuity Knowledge Base. We also analyzed the functional annotations of upstream regulators.

Results
We investigated common DEGs, BP and pathway analyses, biomarker candidates, and upstream regulators between CP and LBW through gene expression profiling of the GEO datasets.

Identification Of Common DEGs
Common DEGs were identified from among respective DEGs investigated using GEO2R, and genes involved in the pathogenesis of CP and LBW were elucidated. Three significantly upregulated and 20 significantly downregulated common DEGs between CP and LBW were identified, as shown in Tables 2 and 3 (p < 0.05, |logFc| > 1). Venn diagrams representing the overlap of DEGs between CP (GSE12484) and LBW (GSE29807) are shown in Fig. 1.

Upstream Regulators Of Dominant Common Biomarker Candidates
Upstream regulators of common DEGs between CP and LBW, such as PRRC2A, TOP2A, NCAM1, and CACNA2D3, were revealed using IPA software.
In this study, upstream regulators of PRRC2A and CACNA2D3 showed inhibitory reactions, while those of TOP2A and NCAM1 showed inhibitory or active reactions (Table 8).

Discussion
In this study, we focused on common genetic factors and molecular interactions between CP and LBW by performing gene expression analyses with pooled datasets from the GEO database.
Microarray analysis is a powerful tool to identify new candidate genes involved in the gene expression profiling of multifactorial diseases. Gene expression profiling involves the comprehensive study of gene expression levels; these can be used to diagnose a disease or predict treatment effects. The NCBI GEO database is the largest public repository for high-throughput biological assays generated by the research community [35-44].
In addition, data sharing and the integration of pooled omics data for investigations of biomedical mechanisms and multifactorial disease relations have gained increasing attention. Using pooled microarray gene expression datasets from the GEO is a method that reduces high-throughput hybridization costs and compensates for insufficient amounts of mRNA sampling [32-34].
In this study, we analyzed microarray gene expression datasets from the GEO database to elucidate the association between CP and preterm LBW. Although the two datasets differ in experimental conditions, subjects, and diseases, their combined analysis can serve as a screening tool to identify candidate genetic factors and biological interactions common to multifactorial diseases such as CP and LBW.
Common genetic factors, molecular pathways, genetic interactions, and biomarker candidates between CP and LBW were analyzed using DAVID, STRING, and IPA. DAVID is a web-accessible program that provides a comprehensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes [68]. STRING is a database of known and predicted PPIs of multiple proteins [69]. IPA is an application built on a large, curated knowledge database. It is a powerful application for the discovery of upstream regulators and biomarker candidates from omics data such as microarray data, and it identifies new biomarkers within the context of biological systems [70,71].
The aim of this study was to elucidate key genes and biological interactions between CP and LBW using bioinformatics analysis of microarray datasets in the GEO database.
We examined important common factors and their functions related to CP and LBW. The functions of the genes were considered while referring to the information in the NCBI GEO database [72].
Our analysis of CP and LBW gene expression profiles identified three significantly upregulated DEGs and 20 significantly downregulated DEGs. The three upregulated DEGs had no significant relation with each other. Among the three upregulated DEGs, PRRC2A can be assumed to be associated with inflammation and immunity, as it is localized in the vicinity of the genes for tumor necrosis factors alpha and beta [72]. PRRC2A is associated with rheumatoid arthritis and the age at onset of insulin-dependent diabetes mellitus [72]. Some downregulated DEGs, such as CCNB2, CDKN1C, CENPA, CEP76, NEK2, TOP2A, and TTK, were found to be related to the cell cycle from the functional analysis of the BP and pathway databases. Based on the PPI networks, TOP2A had direct interactions with the downregulated DEGs CCNB2, TTK, NEK2, and CENPA.
Based on the upstream regulator analysis, catenin beta 1 (CTNNB1) and interleukin-5 (IL-5) were found to be the upstream regulators suppressing PRRC2A, which is one of the upregulated DEGs. CTNNB1 is involved in the bonding of cell adhesion molecules, the homeostasis of living organisms, and intracellular messenger activity [72]. IL-5 is a hematopoietic cytokine that plays an important role in the differentiation, maturation, mobilization, and activation of eosinophils [72].
TOP2A, NCAM1, and CACNA2D3 were identified as common downregulated DEGs, while beta-estradiol, transforming growth factor beta 1 (TGFB1), trichostatin A, and decitabine were identified as common upstream regulators showing inhibitory reactions to TOP2A and NCAM1. Sirolimus was found to be an upstream regulator showing active reactions to TOP2A and NCAM1.
CACNA2D3 shared no upstream regulators with TOP2A and NCAM1. Adenylate cyclase-activating polypeptide 1 (ADCYAP1), musculoaponeurotic fibrosarcoma oncogene homolog B (MAFB), achaete-scute homolog 1 (ASCL1), nuclear receptor subfamily 3 group C member 2 (NR3C2), and pancreas transcription factor 1 subunit alpha (PTF1A) were found to be active upstream regulators of CACNA2D3. ADCYAP1 is a signal transduction molecule, MAFB is involved in the differentiation of hematopoietic stem cells into monocytes and macrophages, ASCL1 is a transcription factor required when cells differentiate into neurons, and NR3C2 is a nuclear receptor for steroids [72].
The results of this study revealed that PRRC2A, TOP2A, NCAM1, CACNA2D3, CTNNB1, IL5, ASCL1, NR3C2, ADCYAP1, and MAFB are genes commonly associated with CP and LBW, and that upstream regulators such as lipopolysaccharide and pregnancy-associated hormones are dominant regulators commonly associated with CP and LBW. These key genes and regulators are related to not only inflammation and immunity, but also the cell cycle, the bonding of cell adhesion molecules, intercellular messenger activity, the homeostasis of living organisms, and cell differentiation.
Previously reported genes and regulators related to both CP and LBW in the PubMed database are shown in Table 9. In this study, BCL2/adenovirus E1B 19 kDa protein-interacting protein 3-like (BNIP3L) and cyclin-dependent kinase inhibitor 1A (CDKN1A) were found to be activated upstream regulators of TOP2A. Beta-estradiol, CD24, erb-b2 receptor tyrosine kinase 2 (ERBB2), estrogen, lipopolysaccharide, peroxisome proliferator-activated receptor alpha (PPARA), TGFB1, and tretinoin are inhibited upstream regulators of TOP2A. Of these, beta-estradiol, TGFB1, and tretinoin are inhibited upstream regulators common to TOP2A and NCAM1; NCAM1 is itself a common downregulated DEG and biomarker. Many of the listed genes and regulators are related to cell generation, development, and organization. Some genes and regulators were found to be indirectly related to immunology and inflammation, while some hormones related to pregnancy and fetal growth were found to influence CP and LBW as upstream regulators. The pooled omics data microarray analysis carried out in this study revealed that several genes related to CP and LBW have functions relevant to cell morphology, organ morphology, and skeletal and muscular diseases, in addition to inflammation and immunity.

Conclusions
Investigations of the relation between phenotype and gene expression levels are important to elucidate the biological factors involved in various diseases. The integrated analysis of omics data, such as microarray mRNA expression datasets, serves as a screening tool that enables DEGs, genetic networks, common biomarker candidates, and upstream regulators to be identified, which is important not only for prognosis, diagnosis, and medical treatment, but also for the elucidation of the molecular mechanisms of multifactorial diseases.
The results of this study suggest that PRRC2A, TOP2A, NCAM1, and CACNA2D3 may be important common key genes related to CP and LBW. To our knowledge, this is the first study to report the relations between PRRC2A, TOP2A, and CACNA2D3 and CP and LBW. These key genes are related to the cell cycle, cell composition, erythrocyte function, and transport pathways for small molecules. Among the upstream regulators of key genes, hormones such as beta-estradiol, estrogen, and progesterone had indirect effects on inflammation and immunity, whereas others had direct effects.
Therefore, inflammation-related factors caused by CP may influence gene expression associated with fetal growth. Conversely, female hormones related to pregnancy may affect the progress and development of CP. These predicted molecular key genes obtained from bioinformatics analysis should be further validated in future experimental research.

Consent for publication
Not applicable.

Availability of data and materials
The datasets generated and analyzed in this study are available in the GEO datasets repository at https://www.ncbi.nlm.nih.gov/gds.

Competing interests
The authors declare that they have no competing interests.

Funding
Not applicable.

Authors' contributions
AS conceived this study, participated in the design, and performed the statistical analysis. TH, AN, and EK participated in the design and helped draft the manuscript. YN helped draft the manuscript. All authors read and approved the final manuscript.

Figure legend: Up- and downregulated genes and biomarkers. RPA2 and PRRC2A are upregulated genes; ABCB6, ADD2, CA1, CACNA2D3, CCNB2, CDKN1C, CENPA, CEP76, CPOX, ENPP4, NCAM1, NEK2, RHAG, RHOBTB1, SDAD1, SLC35F6, TOP2A, TTK, and ZNF184 are downregulated genes. PRRC2A, CACNA2D3, NCAM1, and TOP2A are biomarker candidates. TOP2A had direct interactions with CCNB2, TTK, NEK2, SDAD1, and CENPA, which were downregulated genes. There is no protein-protein interaction among the upregulated genes such as RPS2 and PRRC2A. There are, however, some protein-protein interactions among the downregulated genes with low degree.