Discovery of potential drugs to treat aggressive pituitary adenomas through text mining

Background Aggressive growth hormone-secreting pituitary adenomas (GHSPAs) account for 20–45% of GHSPAs. Although they are benign, treatment of GHSPAs is usually unsatisfactory. We wished to identify existing gene–drug interactions and expand the potential indications for new drugs to treat GHSPAs. Methods We used text mining with the keywords “growth hormone” and “visual disturbance” to obtain a common set of genes. These genes were analyzed using the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes databases, as well as protein–protein interaction (PPI) networks. Finally, important genes clustered in PPI networks were selected for analyses of gene–drug interactions to identify potential drugs. Results

Aggressive GHSPAs can secrete excessive amounts of growth hormone, leading to acromegaly, an increased risk of cardiovascular disease, and premature death [2]. Invasion of a GHSPA into the surrounding tissue can cause visual disturbances, cavernous sinus syndrome, and other symptoms. Acromegaly can significantly increase the risk of various systemic diseases and malignant tumors. Studies have shown that the mortality rate of people suffering from acromegaly is at least twice that of healthy people [3]. Resection is the first-line treatment for aggressive GHSPAs, but its outcome is usually suboptimal. Some studies have shown that the recurrence rate of aggressive GHSPAs may be as high as 10-30% even if the GHSPA is resected completely [4].
Although they are benign, treatment of GHSPAs is usually unsatisfactory.
Increasingly, bioinformatics is being considered a promising medical technology and has been applied in several areas of clinical research. "Text mining" in particular has been used for the identification of potential key gene targets, the confirmation of pathways, the analysis of copy number, and guidance for drug use. However, compared with bioinformatics research in cancer, few studies have used text mining to investigate aggressive GHSPAs of the nervous system.
In the present study, we first used text-mining bioinformatics strategies to identify the genes common to "growth hormone" and "visual disturbance". Second, gene ontology (GO) and pathway-enrichment analyses were conducted using the Database for Annotation, Visualization and Integrated Discovery (DAVID; https://david.ncifcrf.gov/). Then, we placed those genes in protein-protein interaction (PPI) networks and identified important module genes with more interactions. Finally, the drug-gene interactions of module genes were identified in the Drug Gene Interaction Database (DGIdb; www.dgidb.org/). In this way, we aimed to find some existing drugs and provide new ideas and a basis for the prevention and treatment of invasive GHSPAs. By analyzing their biological functions and pathways, we could outline the development of aggressive GHSPAs at the molecular level and identify potential candidate genes for their diagnosis, prognosis, and therapeutic targets, thereby providing new clues for drug development.

Text mining
First, the open-access website pubmed2ensembl (http://pubmed2ensembl.ls.manchester.ac.uk) was used for text mining. Upon entering a keyword, the pubmed2ensembl website can retrieve and extract all the gene symbols found in PubMed articles related to that keyword [5]. We inputted the two keywords "growth hormone" and "visual disturbance" into pubmed2ensembl and obtained the genes associated with each. Then, we removed repeated genes from each list and took the intersection of the two lists; the resulting gene set constituted the "text-mining genes". Figure 1 shows the framework of the text-mining process in our study.
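The intersection step described above can be sketched as follows. This is only a minimal illustration, assuming the two keyword queries have been exported as plain lists of gene symbols; the function name `common_genes` and the toy input lists are hypothetical and are not output of pubmed2ensembl.

```python
def common_genes(genes_a, genes_b):
    """Return the sorted, de-duplicated intersection of two gene-symbol lists."""
    # Normalise case and whitespace so that "igf1 " and "IGF1" count as one gene.
    set_a = {g.strip().upper() for g in genes_a if g.strip()}
    set_b = {g.strip().upper() for g in genes_b if g.strip()}
    return sorted(set_a & set_b)


# Toy input for illustration only (not the real query results).
growth_hormone_hits = ["IGF1", "LEP", "igf1", "GH1", "VEGFA"]
visual_disturbance_hits = ["VEGFA", "TNF", "IGF1 "]
print(common_genes(growth_hormone_hits, visual_disturbance_hits))
```

Using sets removes repeated symbols before the comparison, which mirrors the "non-repeated genes" step in the text.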
GO and pathway-enrichment analyses
GO is a useful method for annotating genes and gene products [6]. GO can also be employed to identify the biological significance of the characteristics of the relevant genomes [7]. GO terms are divided into "biological processes" (BP), "cellular components" (CC), and "molecular functions" (MF). Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database for systematic analyses of gene function that links individual genomic information with high-level functional information. The KEGG database is an open-access resource maintained in Japan and is designed to explain the biological functions of organic systems using data generated from gene chips and high-throughput experiments [8]. DAVID is an online analytics site that provides gene annotation, visualization, and analyses of genetic attributes. GO and pathway-enrichment analyses were undertaken through DAVID, with P < 0.05 as the cutoff criterion for significance [9].

Protein interactions and module analyses
The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database v11.0 (https://string-db.org/) describes the interactions between proteins. First, we uploaded the common genes to the STRING database and set a minimum interaction score of > 0.4 (medium confidence) as the significance threshold.
Then, a tab-separated value file of PPIs was downloaded. A PPI network was constructed using Cytoscape v3.7.1 (https://cytoscape.org/) [10]. To identify groups of densely connected genes in the large PPI network, such as molecular complexes or clusters, the Cytoscape plug-in MCODE was used with its default parameter criteria.
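As a rough illustration of the thresholding step, the sketch below filters a tab-separated interaction table by score and counts the interactions per gene. The column names (`node1`, `node2`, `combined_score`), the function names, and the toy scores are assumptions made for this example and may differ from the actual STRING export. It does not reproduce the MCODE clustering algorithm; node degree is used only as a simple stand-in for "genes with more interactions".

```python
import csv
import io
from collections import Counter


def filter_edges(tsv_text, min_score=0.4):
    """Keep interactions whose combined score exceeds min_score."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        (row["node1"], row["node2"], float(row["combined_score"]))
        for row in reader
        if float(row["combined_score"]) > min_score
    ]


def node_degrees(edges):
    """Count how many retained interactions each gene takes part in."""
    degrees = Counter()
    for gene_a, gene_b, _score in edges:
        degrees[gene_a] += 1
        degrees[gene_b] += 1
    return degrees


# Toy table with made-up scores, for illustration only.
example = (
    "node1\tnode2\tcombined_score\n"
    "VEGFA\tTNF\t0.95\n"
    "VEGFA\tIGF1\t0.80\n"
    "TNF\tPTH\t0.30\n"
)
edges = filter_edges(example)
degrees = node_degrees(edges)
print(degrees.most_common())
```

Genes that lose all of their edges at the threshold (PTH in the toy table) no longer appear in the degree count, which corresponds to hiding disconnected nodes in the network view.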

Drug-gene interactions of potential genes
The target module genes were pasted into the DGIdb. First, the preset filters for selecting gene-drug interactions were set: "Approved", "Antineoplastic", and "Immunotherapies" were employed to search for existing targeted drugs. We obtained the targeted genes and their matching drugs, and undertook functional-enrichment analyses.
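The filtering and grouping of drug-gene records can be sketched as below. The record layout (`gene`, `drug`, `flags`), the function name `drugs_by_gene`, and the placeholder genes and drugs are all hypothetical; this is not the DGIdb data model or API. The sketch also assumes that a record must carry every requested flag, which the text does not specify.

```python
from collections import defaultdict


def drugs_by_gene(records, required_flags):
    """Group drug names by gene, keeping records that carry every required flag."""
    required = set(required_flags)
    grouped = defaultdict(set)
    for rec in records:
        # Subset test: all required flags must be present on the record.
        if required <= set(rec["flags"]):
            grouped[rec["gene"]].add(rec["drug"])
    return {gene: sorted(drugs) for gene, drugs in grouped.items()}


# Placeholder records for illustration only (no real genes or drugs).
records = [
    {"gene": "GENE_A", "drug": "drug_1", "flags": ["approved", "antineoplastic"]},
    {"gene": "GENE_A", "drug": "drug_2", "flags": ["antineoplastic"]},
    {"gene": "GENE_B", "drug": "drug_3", "flags": ["approved", "immunotherapy"]},
]
result = drugs_by_gene(records, ["approved"])
print(result)
```

Genes with no record passing the filter drop out of the result, which is one way a module of 17 genes could shrink to a smaller set of druggable genes.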

Statistical analyses
Fisher's exact test was employed for the GO and pathway-enrichment analyses.
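For readers unfamiliar with the test, a one-sided Fisher's exact test for term enrichment is equivalent to the upper tail of a hypergeometric distribution. The sketch below is a plain textbook version written with the Python standard library; the function name and argument names are our own, and it is not claimed to reproduce the exact statistic that DAVID reports.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    k: genes in the query list annotated with the term
    n: size of the query list
    K: genes in the background annotated with the term
    N: size of the background
    Returns P(X >= k) when n genes are drawn at random from the background.
    """
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible combinations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Tiny worked case: background of 4 genes, 2 annotated; both query genes annotated.
print(enrichment_p_value(2, 2, 2, 4))
```

In the tiny case, only one of the six possible two-gene draws contains both annotated genes, so the p-value is 1/6.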

Text mining
Our text-mining retrieval strategy revealed 2848 genes related to growth hormone and 220 genes related to visual disturbance. After screening, 157 genes were found to be common genes (Table 1).
Table 1 The 157 common genes between "growth hormone" and "visual disturbance"

GO and pathway-enrichment analyses
To undertake the GO and pathway-enrichment analyses of the 157 common genes, we annotated their functions using DAVID. The first six significantly enriched terms for BP, CC, and MF, and the first six signaling pathways of the common genes, are displayed in Fig. 2 and Table 2. The BP category was enriched mainly in "Response to oxygen-containing compound", "Response to external stimulus", and "Cell death". The CC category was significantly enriched in "Extracellular space", "Extracellular region", and "Extracellular region part". In the MF category, significant enrichment was noted for "Glycoprotein binding", "Serine hydrolase activity", and "Endopeptidase activity" (Fig. 2A). With regard to enrichment of signaling pathways, the main ones were "Pathways in cancer", "PI3K-Akt signaling pathway", "Proteoglycans in cancer", and "Cytokine-cytokine receptor interaction" (Fig. 2B).
The first six terms for BP, CC, and MF in GO and the first six terms for common gene pathways.

Protein interactions and module analyses
First, all the common genes were pasted into STRING and analyzed using Cytoscape. We selected the following active interaction sources for the PPI network: "Text mining"; "Experiment"; "Databases"; "Co-expression"; "Neighborhood"; "Gene fusion"; "Co-occurrence". After disconnected nodes were hidden, the network comprised 142 nodes and 995 edges. Using MCODE, the most important gene module was identified (Fig. 3).

Drug-gene interactions of potential genes
First, 17 genes clustered in important gene modules were selected for analyses of drug-gene interactions.
Finally, 12 genes were found to meet the screening conditions, and 46 potential drugs were obtained for these 12 genes. The interaction score, interaction type, and direction of these drugs were recorded (Table 3). Each drug-gene interaction was evaluated in the context of the gene's relationship to GHSPAs and the drug-gene relationship to ensure that any hypothesized drug would have a corresponding effect on the treatment and prevention of the disease. The reliability of reports was tracked through their linked sources, such as approved-drug status and route of administration. Drugs in the resulting list that met the criterion of targeting one of the candidate genes through an interaction were collected.

Discussion
Invasive GHSPAs lead to excessive secretion of growth hormone and result in a series of endocrine and other systemic symptoms. If the tumor invades surrounding tissues, it can also cause visual disturbances, visual-field defects, and other symptoms. Therefore, understanding the molecular mechanism of invasive GHSPAs is very important for their diagnosis and treatment.
We employed text mining to find the genes associated with aggressive GHSPAs and potential therapeutic drugs. Through pubmed2ensembl, the genes associated with growth hormone and visual disturbance were screened out. Then, GO and pathway-enrichment analyses were used for network analyses. Finally, PPI networks were used for analyses of gene clusters. Seventeen genes related to growth hormone and visual disturbance were screened out using MCODE: APOE, IGF1, CAT, TRH, MMP2, ACE, CXCL8, LEP, MMP1, ALB, VEGFA, EDN1, TNF, PTH, REN, VWF, and CRP. These genes were pasted into the DGIdb, and the preset filters ("Approved", "Antineoplastic", and "Immunotherapies") were selected. Finally, we obtained 12 genes: APOE, TRH, MMP2, ACE, CXCL8, MMP1, ALB, VEGFA, EDN1, TNF, PTH, and VWF. All of these genes were associated with aggressive pituitary tumors and were targeted by 46 existing drugs that are potential treatments for aggressive pituitary tumors.
ApoE-E4 has a clear relationship with late-onset Alzheimer's disease [11] and is the strongest genetic risk factor for advanced Alzheimer's disease [12]. TRH has several roles in the human body beyond regulating the secretion of thyroid hormone. It plays a key part in the normal function of the thyroid axis under different physiological conditions (e.g., low-temperature stress and changes in nutritional status) [13,14]. MMPs are involved in various stages of tumor development. Downregulation of expression of the MMP2 signaling pathway can inhibit the growth of prostate cancer cells, making MMP2 a possible therapeutic target for prostate cancer [15,16]. ACE inhibits the growth and development of tumor cells. The ACE phenotype in biopsy of the prostate gland may be a reliable method for the diagnosis of early prostate cancer, and may be a method for the differential diagnosis of benign prostatic hyperplasia and prostate cancer [17]. CXCL8 (also known as interleukin-8) and its receptors are associated with a wide variety of tumor types. An increase in the CXCL8 level in the tumor microenvironment can promote the development of bladder cancer, and promote the angiogenesis and proliferation of tumor cells [18]. MMP1 is an interstitial collagenase that acts on the extracellular matrix; it is involved in tumor behavior and can promote the occurrence and metastasis of colorectal cancer through endothelial-mesenchymal transition and the protein kinase B signaling pathway [19]. ALB has an antioxidant effect in the blood vessels of people suffering from benign paroxysmal positional vertigo (BPPV), and a decrease in the serum level of ALB is related to BPPV pathogenesis [20].
Among the 12 genes, VEGFA accounted for the largest proportion. VEGF is a growth factor that plays an important part in angiogenesis [21,22]. VEGF-mediated pathogenicity is due mainly to its influence on vascular permeability and angiogenesis. VEGF has an anti-apoptotic role in tumor cells during chemotherapy, and may become a potential target for improving chemotherapy. Hypoxia-inducible factor regulates VEGF expression in the hypoxic state [22]. VEGF overexpression is associated with the invasion, vascular density, and metastasis of tumor cells [23]. It is expressed not only in vascular endothelial cells, but also in aggressive GHSPAs to promote the growth of pituitary cells [24].
EDN1 is an effective vasoconstrictor in vivo and a well-known inflammatory marker. EDN1 function is mediated mainly by the EDN type-A receptor. Some studies have shown that EDN1 is associated with persistent pulmonary hypertension in neonates [25]. TNF is a proinflammatory multifunctional cytokine and plays an important part in the formation and maintenance of granulomas [26]. PTH is an important regulator of bone turnover, and its activity in vivo is reduced by oxidation. Determination of the PTH level is important for evaluating the indicators of secondary hyperparathyroidism in patients with chronic kidney disease. The most important function of VWF is to attract platelets to the site of vascular injury during hemostasis. VWF has an obvious role in valve stenosis and bleeding in patients after implantation of heart valves [27].

Conclusions
The development of invasive GHSPAs is a chronic process, and secretion of growth hormone is one of the important causes. We found that APOE, TRH, MMP2, ACE,

Figure 1 Framework of the text-mining process. Overview of the data-mining results. Text mining: using the search terms "growth hormone" and "visual impairment", the text was mined using pubmed2ensembl, and a total of 157 common genes were found. On the one hand, further enrichment was obtained through molecular network analysis using STRING, and 142 important genes were enriched, among which 17 significantly clustered genes were obtained through MCODE. The final list of 17 enriched genes was used for interactions with 46 known drugs using the Drug Gene Interaction Database. On the other hand, the 157 common genes were annotated for GO and KEGG analyses on the DAVID website.