Expression and Gene Regulation Network of NUDT21 in Lung Adenocarcinoma and Prediction of Anticancer Components of Pinellia Ternata Based on Data Mining

NUDT21 belongs to the NUDT family and has in recent years been thought to play an essential role in cancer growth and progression. Abnormal NUDT21 expression is closely related to lung adenocarcinoma (LUAD). However, the expression level, gene regulation network, and prognostic value of NUDT21 in LUAD remain unclear, as do the active compounds of Pinellia ternata against LUAD. An in-depth study of the expression and gene regulation network of NUDT21 is therefore of great theoretical significance and clinical relevance for discovering new targets and strategies for treating LUAD and for further improving its therapeutic outcome. We also sought the active ingredients of Pinellia ternata that target NUDT21, to provide a theoretical basis for its clinical application in the treatment of LUAD. Our results revealed the expression and potential regulatory network of NUDT21 in LUAD, laying a foundation for further research on the role of NUDT21 in cancer. Furthermore, we offer new therapeutic targets and prognostic biomarkers for reference. Finally, we identify potential therapeutic drugs from traditional Chinese medicine for the treatment of LUAD. GO function and KEGG enrichment of NUDT21 and its top 50 altered neighboring genes in LUAD were analyzed with Metascape. Our results showed that the cellular components related to NUDT21 and its neighboring genes involved in the extracellular plasma


Introduction
Lung cancer is one of the deadliest malignancies in the world [1]. According to statistics, lung cancer caused as many as 1.8 million deaths, accounting for 18.4% of all cancer deaths [2]. Non-small cell lung cancer accounts for about 85% of lung cancers and comprises three histological subtypes: squamous carcinoma, large cell undifferentiated carcinoma, and adenocarcinoma [3]. Among them, lung adenocarcinoma (LUAD) is the most common form of lung cancer. The pathogenesis of LUAD is extraordinarily complex and has not been fully elucidated. Studies have shown that many factors are involved in the development of LUAD, including diet, smoking, and genetic susceptibility [4][5][6]. Treatment for LUAD has progressed from surgery, radiotherapy, and chemotherapy to molecular-targeted therapy and immune-targeted therapy [7,8]. These innovations may significantly prolong the overall survival of patients, but the survival prognosis for LUAD remains poor. The main reasons include unknown target localization, difficulty in early diagnosis, a high risk of cancer recurrence, and a high rate of early metastasis [9]. However, the discovery of targets in various subtypes of lung cancer and the introduction of targeted therapy have changed the prognosis of patients with lung cancer by incorporating tumor genotyping into treatment decisions [10]. The results from the European EUHER2 study showed that HER2-targeted drugs had an excellent therapeutic effect in non-small-cell lung cancer patients with a known HER2 exon-20 insertion [11]. Therefore, targeted therapy is expected to become a crucial method of treating LUAD in the future and to bring good news for patients with LUAD. NUDT21 is a member of the NUDT family, which exists in nearly all organisms, and acts as a tumor suppressor gene in the progression of cancer [12]. To date, its biological function remains unclear.
As an mRNA precursor 3′-end modification factor, NUDT21 mainly regulates 3′UTR shortening. It participates in the normal physiological processes of cell proliferation, differentiation, and apoptosis [13]. Previous studies have found that NUDT21 plays an essential role in cancer growth and progression [14,15]. NUDT21 can inhibit the growth of various tumors, including glioblastoma, breast cancer, and cervical cancer [16][17][18]. However, overexpression of NUDT21 has been found in multiple cancers, including hepatocellular carcinoma and leukemia [19,20]. In this work, we characterized the overexpression and gene regulation network of NUDT21 in patients with LUAD by integrating clinical trial data. NUDT21 may therefore be a potential therapeutic target and prognostic biomarker for patients with LUAD.
Pinellia ternata is a plant of the Araceae family whose dried tuber has been used as a drug in China for thousands of years. In traditional Chinese medicine, Pinellia ternata is described as pungent, warm, and toxic, and as acting on the spleen, stomach, and lung meridians. It is documented in the Chinese Pharmacopoeia (2015 edition) as a commonly used Chinese medicine for the treatment of infection, inflammation, cough, and vomiting [21,22].
In addition to its anti-asthmatic, anti-tussive, anti-inflammatory, anti-emetic, and sedative-hypnotic pharmacological effects, a growing number of recent studies have found that Pinellia ternata has a therapeutic effect on various cancers [23,24]. To date, few studies have reported whether Pinellia ternata has potential therapeutic effects on LUAD.
In this study, to identify the overexpression and gene regulation network of NUDT21 in LUAD, we used The Cancer Genome Atlas (TCGA) and various public databases to explore the expression of NUDT21 and its differences in LUAD patients. We also explored the potential active components of Pinellia ternata in the treatment of LUAD using network pharmacology and molecular docking methods. Finally, we hope that this approach will provide a new idea for the targeted treatment of other diseases and will help screen the active ingredients of Pinellia ternata as potential new drugs against LUAD.

Oncomine analysis
Oncomine (www.oncomine.org) is a large tumor gene chip database that provides translational bioinformatics services for scientific researchers [25]. In our study, we registered on the Oncomine platform and set the screening criteria as follows: (1) Gene: NUDT21; (2) Cancer type: lung adenocarcinoma; (3) Analysis type: cancer vs normal analysis; (4) Data type: mRNA; (5) Threshold setting conditions: P = 0.05, fold change = 2, and gene rank = top 10%. Student's t-test was used to analyze the difference in NUDT21 expression in LUAD.
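The three thresholds listed above act together as a filter on each cancer vs normal comparison. The following is a minimal sketch of that logic only; the `Comparison` record, its field names, and the example values are hypothetical and are not part of the Oncomine interface or of our data.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    """One cancer-vs-normal comparison (hypothetical record layout)."""
    dataset: str
    p_value: float
    fold_change: float
    gene_rank_percent: float  # gene's rank among all genes, in percent (smaller = higher rank)


def passes_thresholds(c, p_cutoff=0.05, min_fold_change=2.0, top_rank_percent=10.0):
    """Return True when a comparison meets all three screening thresholds."""
    return (
        c.p_value <= p_cutoff
        and c.fold_change >= min_fold_change
        and c.gene_rank_percent <= top_rank_percent
    )


# Illustrative values only, not taken from any dataset.
examples = [
    Comparison("hypothetical_A", p_value=0.008, fold_change=2.4, gene_rank_percent=5.0),
    Comparison("hypothetical_B", p_value=0.200, fold_change=3.0, gene_rank_percent=5.0),
]
kept = [c.dataset for c in examples if passes_thresholds(c)]
print(kept)
```

Only the first hypothetical comparison is kept, because the second fails the P-value cutoff even though its fold change is larger.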

UALCAN analysis
UALCAN (http://ualcan.path.uab.edu/analysis.html) is a portal for tumor subgroup gene expression and survival analyses [26]. We chose the "Expression Analysis" module of UALCAN to analyze TCGA gene expression in this study, and the screening criteria were set as follows: (1) Gene: NUDT21; (2) Dataset: lung adenocarcinoma; (3) Threshold setting conditions: P-value cutoff = 0.05. Student's t-test was used for comparative analysis.

The Human Protein Atlas analyses
The Human Protein Atlas (https://www.proteinatlas.org/) is a platform that provides information on the distribution of all 24,000 human proteins across cells, tissues, and organs and offers free public access. In this study, we analyzed the protein expression levels of NUDT21 in the lung tissue of patients with LUAD.

GEPIA analysis
GEPIA (http://gepia.cancer-pku.cn/index.html) is a platform for analyzing the RNA sequencing expression data of 9,736 tumours and 8,587 normal samples from the TCGA and the GTEx projects [27].
A variety of analytical methods were used in this study, including differential mRNA expression analysis, pathological stage analysis, and correlative prognostic analysis. We set the screening criteria as follows: (1) Gene: NUDT21; (2) Dataset: LUAD; (3) Threshold setting conditions: P-value cutoff = 0.05. Student's t-test was used to analyze the expression of NUDT21 and its relation to the pathological stage of LUAD. The Kaplan-Meier curve was used to analyze the prognosis of LUAD.
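For readers unfamiliar with the Kaplan-Meier curve mentioned above, the estimator can be sketched in a few lines. This is an illustration of the general product-limit method, not GEPIA's own implementation; the function name and the toy follow-up data are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up duration of each patient
    events -- 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= removed
    return curve


# Toy data (hypothetical): four patients, the second one censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

In a prognostic analysis, one such curve is estimated for the high-expression group and one for the low-expression group, and the two curves are then compared with a significance test.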

cBioPortal analysis
The cBioPortal (http://cbioportal.org) is an open-source platform for interactive exploration of multidimensional cancer genomics datasets that provides a visualization tool for studying and analyzing cancer genetic data [28]. In our study, genetic alterations and coexpression of NUDT21 were analyzed with cBioPortal. The screening criteria were set as follows: (1) 230 samples of lung adenocarcinoma were analyzed; (2) mRNA expression z scores relative to all samples (log RNA Seq V2 RSEM) were obtained using a z score threshold of ± 2.0; (3) Gene: NUDT21.
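The z score threshold above can be illustrated as follows. This is a minimal sketch assuming that each sample's expression is standardized against all samples using the population standard deviation; the exact normalization used by cBioPortal is not restated here, and the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values against the mean and population SD of all samples."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical; z scores are undefined")
    return [(v - mu) / sigma for v in values]


def altered_samples(expression_by_sample, threshold=2.0):
    """Return {sample_id: z} for samples whose |z| exceeds the threshold."""
    sample_ids = list(expression_by_sample)
    zs = z_scores([expression_by_sample[s] for s in sample_ids])
    return {s: z for s, z in zip(sample_ids, zs) if abs(z) > threshold}


# Hypothetical log-expression values: nine similar samples and one outlier.
expression = {f"s{i}": 10.0 for i in range(1, 10)}
expression["s10"] = 20.0
print(altered_samples(expression))
```

With these toy values only `s10` exceeds the ± 2.0 threshold, so it would be the only sample counted as having altered mRNA expression.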

STRING analysis
STRING (https://string-db.org/cgi/input.pl) is a platform used to construct protein-protein interaction (PPI) networks between target proteins [29]. In this study, the PPI network was built with a medium confidence score (0.4) as the screening condition and Homo sapiens as the defined species.

GeneMANIA analysis
GeneMANIA (http://www.genemania.org) is a tool for building protein-protein interaction (PPI) networks, generating hypotheses about gene function, analyzing gene lists, and prioritizing genes for functional assays [30]. In this study, we constructed interaction networks to analyze the role of NUDT21 and its top 50 altered neighboring genes.

Metascape analysis
Metascape (https://metascape.org) is a simple and powerful gene function annotation and analysis tool that lets users apply widely used bioinformatics methods to batches of genes and proteins in order to understand their function [31]. In our study, GO function and KEGG pathway enrichment of NUDT21 and its top 50 altered neighboring genes in LUAD were analyzed using Metascape.

TRRUST analysis
TRRUST is a manually curated database of human and mouse transcriptional regulatory networks that contains 8,444 and 6,552 TF-target regulatory relationships of 800 human TFs and 828 mouse TFs, respectively [32]. In this study, we used TRRUST to identify the key regulatory factors of NUDT21 and its top 50 altered neighboring genes in LUAD.

LinkedOmics analysis
LinkedOmics (http://www.linkedomics.org/) is a publicly available platform that includes multi-omics data from all 32 TCGA cancer types [33]. It provides methods for analyzing and comparing cancer multi-omics data within and across tumour types. In our study, kinase target enrichment, miRNA target enrichment, and analysis of genes differentially expressed in correlation with NUDT21 were conducted using the "LinkInterpreter" module. The screening criteria were set as follows: (1) a minimum number of genes (size) of 3; (2) cancer type: LUAD; (3) 500 simulations; (4) search attribute: NUDT21 and its top 50 altered neighboring genes; (5) target dataset: RNAseq (data type).

TIMER analysis
TIMER (https://cistrome.shinyapps.io/timer/) is a comprehensive resource for systematic analysis of tumor-infiltrating immune cells, including B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells [34]. In our study, the correlation among clinical outcome, NUDT21 expression, and the infiltration of immune cells was evaluated with the "Survival module". The correlation between NUDT21 expression level and the infiltration of immune cells was assessed with the "Gene module".

TCMSP analysis
TCMSP (http://lsp.nwu.edu.cn/tcmsp.php) is a pharmacology platform of Chinese herbal medicines that captures the relationships between drugs, targets, and diseases [35]. In this study, the active ingredients of Pinellia ternata with oral bioavailability (OB) ≥ 30% and drug-likeness (DL) ≥ 0.18 were obtained.
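The OB and DL screening step amounts to a simple two-condition filter, assuming the usual reading of these thresholds as lower bounds (OB of at least 30% and DL of at least 0.18). The sketch below uses a hypothetical record layout and placeholder compounds; it does not query TCMSP and the values are not taken from it.

```python
def filter_active_compounds(compounds, min_ob=30.0, min_dl=0.18):
    """Keep compounds whose OB (%) and DL both reach the screening thresholds."""
    return [c for c in compounds if c["ob"] >= min_ob and c["dl"] >= min_dl]


# Placeholder records (hypothetical names and values, for illustration only).
candidates = [
    {"name": "compound_A", "ob": 45.2, "dl": 0.25},
    {"name": "compound_B", "ob": 12.0, "dl": 0.40},  # fails the OB threshold
    {"name": "compound_C", "ob": 60.1, "dl": 0.05},  # fails the DL threshold
]
active = filter_active_compounds(candidates)
print([c["name"] for c in active])
```

A compound has to satisfy both conditions at once, which is why only a small fraction of the compounds listed for a herb typically remains after screening.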

Molecular docking analysis
AutoDock Vina (http://vina.scripps.edu/) is an open-source program for molecular docking that significantly improves the average accuracy of binding mode predictions compared to AutoDock 4 [36]. The Protein Data Bank (PDB, https://www.rcsb.org/) was used to acquire the protein crystal structure of the target gene (NUDT21). 3D structures of active compounds were obtained from PubChem (https://pubchem.ncbi.nlm.nih.gov/). AutoDock Vina 1.1.2 was used for molecular docking. Finally, the conformation with the lowest score was selected and plotted with PyMOL 2.4 for analysis.
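Selecting the conformation with the lowest score can be automated by reading the per-pose score lines that Vina writes into its PDBQT output. The sketch below assumes that each pose carries a `REMARK VINA RESULT:` line whose first number is the binding energy in kcal/mol; the helper name and the sample text are hypothetical, and apart from the -8.8 kcal/mol value reported for baicalein the scores are invented for illustration.

```python
def best_pose(pdbqt_text):
    """Return (pose_number, score) for the lowest-scoring pose in Vina output text."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK, VINA, RESULT:, score, rmsd_lb, rmsd_ub
            scores.append(float(line.split()[3]))
    if not scores:
        raise ValueError("no 'REMARK VINA RESULT:' lines found")
    index = min(range(len(scores)), key=scores.__getitem__)
    return index + 1, scores[index]


# Abbreviated, hypothetical output with three poses (atom records omitted).
sample_output = """MODEL 1
REMARK VINA RESULT:      -8.8      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -8.1      1.732      2.514
ENDMDL
MODEL 3
REMARK VINA RESULT:      -7.6      2.103      5.870
ENDMDL
"""
print(best_pose(sample_output))
```

Because a more negative score indicates a more favourable predicted binding energy, the minimum over all poses is the conformation carried forward for visualization.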

NUDT21 expression in LUAD
Comparing the transcription levels of NUDT21 in lung tissue of patients with LUAD and of normal controls using Oncomine, we found that the transcriptional levels of NUDT21 were significantly increased in patients with LUAD (Fig. 1). These results were consistent with Table 1, which showed that the mRNA level of NUDT21 was higher in LUAD than in normal controls. In the dataset of Yamagata N, the transcriptional level of NUDT21 in LUAD was significantly increased (fold change = 2.417 and p = 0.008) [37]. Analogously, the transcriptional levels of NUDT21 in LUAD were significantly up-regulated in the dataset of Okayama H (fold change = 1.841 and p = 7.96E-17) [38]. The fold change of NUDT21 expression in LUAD was 1.852 (p = 2.49E-4) in the dataset of Landi MT [39]. Furthermore, we compared the expression levels of NUDT21 in LUAD tissue and normal tissue using UALCAN. Our results showed that the transcriptional levels of NUDT21, stratified by gender and stage, were significantly up-regulated in LUAD tissues (P = 0.05) (Fig. 2). The relative expression level of NUDT21 in LUAD tissues was evaluated with GEPIA, and the results showed that it was increased (Fig. 3). In addition, we compared the protein expression levels of NUDT21 in lung tissue of LUAD patients and normal controls using the Human Protein Atlas and found that they were up-regulated in LUAD (Fig. 4). We then assessed the correlation between differential expression of NUDT21 and pathological stage in patients with LUAD. Our results showed a significant correlation between the expression of NUDT21 and pathological stage (P = 0.00106) (Fig. 5). Moreover, we evaluated the prognostic value of NUDT21 in LUAD patients with GEPIA. LUAD patients with low NUDT21 expression had a better prognosis than those with high NUDT21 expression (P = 0.0029) (Fig. 6).

Interaction network of NUDT21 alterations in LUAD
The interaction network of NUDT21 and its molecular characteristics in LUAD were analyzed. First, the genetic alteration of NUDT21 was evaluated using TCGA data. We found that NUDT21 was altered in 12% of samples (Fig. 7A). However, the promoter methylation level of NUDT21 in LUAD was lower than in normal controls (Fig. 7B). Moreover, the NUDT21-neighboring genes (the 50 most frequently altered neighbor genes) were altered at frequencies > 17% in LUAD (Table 2). The most frequent alterations among the NUDT21-neighboring genes were in TTN2 (64.29%), TP53 (60.71%), and MUC16 (57.14%). To explore the potential interactions of NUDT21 and its neighboring genes, we established their PPI network with STRING. The PPI network contained 48 nodes and 572 edges (Fig. 7C). The GeneMANIA results also showed that regulatory region DNA binding, transcription regulatory region DNA binding, regulatory region nucleic acid binding, and regulation of homeostatic process were the primary functions of NUDT21 and its neighboring genes (Fig. 7D).

Gene ontology (GO) and KEGG pathway enrichment analysis
The GO function and KEGG pathway enrichment of NUDT21 and its top 50 altered neighboring genes in LUAD was explored with Metascape. Our results showed that the cellular components related to NUDT21 and its neighboring genes mainly involved the extracellular matrix, apical plasma membrane, Z disc, cell cortex, and postsynapse (Fig. 8A). Moreover, the release of sequestered calcium ion into cytosol by sarcoplasmic reticulum, cardiac chamber development, and protein tetramerization were the main biological processes (Fig. 8B). The molecular functions of NUDT21 and its top 50 altered neighboring genes mainly included calcium ion binding, structural molecule activity, and protein domain specific binding (Fig. 8C). The KEGG pathways of NUDT21 and its neighboring genes were mainly the apelin signaling pathway, PI3K-Akt signaling pathway, and axon guidance (Fig. 8D). We also found that recurrent non-small cell lung cancer, mixed tumor, and sarcomatoid renal cell carcinoma were the main diseases related to NUDT21 and its neighboring genes (Fig. 8E).

Transcription factor targets, kinase targets, and miRNA targets of NUDT21 in LUAD
The potential transcription factor targets, kinase targets, and miRNA targets of NUDT21 in LUAD were obtained from the TRRUST and LinkedOmics databases. The results showed that DNA (cytosine-5-)methyltransferase 1 (DNMT1), histone deacetylase 1 (HDAC1), and v-myc myelocytomatosis viral oncogene homolog (MYC) were the key transcription factor targets involved in the network of NUDT21 and its neighboring genes (P < 0.05) (Table 3). We also identified the top three kinase targets and miRNA targets of the NUDT21 network with LinkedOmics (Table 4). Kinase CDK1, Kinase ATM and Kinase PLK1 were the top three targets in the NUDT21 kinase-target network (P 0.00). The NUDT21 miRNA-target network was associated with (ATGTTAA) MIR-302C, (TAGCTTT) MIR-9, and (TGCTTTG) MIR-330 (P 0.00).

Active compounds of Pinellia ternata and molecular docking
A total of 116 compounds were found in the TCMSP database. Among them, 13 active compounds were retrieved based on OB and DL (Table 5). To verify the binding ability of the active compounds of Pinellia ternata to NUDT21, we conducted molecular docking with AutoDock Vina. In this analysis, the Vina score (binding energy) indicates the binding activity between a protein and a compound. A binding energy ≤ -5.0 kcal/mol was taken to indicate suitable binding between the compound and the protein. Our results showed that baicalein bound best to NUDT21 (score = -8.8 kcal/mol) (Fig. 12A). Van der Waals, Pi-cation, conventional hydrogen bond, Pi-Pi stacked, carbon hydrogen bond, unfavorable donor-donor, and Pi-alkyl interactions were the main interaction modes between them (Fig. 12B).

Discussion
NUDT21 is a tumor suppressor gene that can inhibit the occurrence and development of various cancers, including small cell lung cancer, bladder cancer, and breast cancer. NUDT21 is a 3'-terminal processing factor of mRNA precursors (the 3' ends of the targeted mRNA precursors contain multiple poly-A sites) and regulates mRNA expression through alternative polyadenylation [40]. Ultimately, NUDT21 is involved in regulating the normal physiological processes of proliferation, differentiation, and apoptosis. However, the overexpression of tumor suppressor genes in cancer patients has attracted increasing attention, and its mechanism has not yet been elucidated. Studies have found that it might be closely related to genetic alterations and DNA methylation [41,42]. However, the expression and gene regulation network of NUDT21 in patients with LUAD has rarely been reported.
In our study, the expression level of NUDT21 and the correlation between differential expression of NUDT21 and pathological stage were first explored in LUAD patients. We found that NUDT21 expression was up-regulated in patients with LUAD compared with normal controls. In addition, our results showed a significant correlation between the expression of NUDT21 and pathological stage. Furthermore, LUAD patients with low NUDT21 expression had a better prognosis than those with high NUDT21 expression. In brief, overexpression of NUDT21 may play an essential role in the survival of LUAD patients. This contrasts with other studies, in which NUDT21 showed low expression in various cancers [43]. To further explore the mechanism of NUDT21 overexpression in LUAD patients, we examined its genetic alteration and found that it was high (12%). Moreover, the promoter methylation level of NUDT21 in LUAD was lower than in normal controls. Therefore, genetic alteration and methylation of NUDT21 may be the leading causes of NUDT21 overexpression. Moreover, NUDT21-neighboring genes (the 50 most frequently altered neighbor genes) were altered at frequencies > 17% in LUAD. We then explored the potential interactions and functions of NUDT21 and its neighboring genes. We found that NUDT21 and its neighboring genes form complex and tight networks of connections. Their primary functions mainly include regulatory region DNA binding, transcription regulatory region DNA binding, and regulatory region nucleic acid binding. This evidence suggests that they might affect the processes of gene binding, transcription, and regulation involved in the occurrence and development of cancer.
Moreover, GO enrichment analysis revealed that these genes' functions are mainly connected with the inflammatory response caused by the activation of inflammatory factors. As expected, the expression of NUDT21 was positively associated with immune cell infiltration, including CD8+ T cells, macrophages, neutrophils, and dendritic cells. Immunotherapy has been recognized as the fourth pillar of cancer therapy, and a large number of preclinical and clinical studies have revealed the efficacy of immunotherapy for lung cancer [44]. Targeting and regulating the tumor immune microenvironment can redirect the immune system toward anticancer activity and increase sensitivity to established chemotherapy. Chemokines and their receptors are crucial for inflammation and anti-tumor immunity, thus influencing angiogenesis, tumor occurrence, progression, metastasis, therapeutic efficacy, and patient outcomes [45][46][47]. Studies have shown that the CXC family of chemokines and their receptors are associated with tumour metastasis and therapy resistance [48,49]. In addition, our results showed that the KEGG pathways of NUDT21 and its neighboring genes involved the apelin signaling pathway, PI3K-Akt signaling pathway, and axon guidance. Apelin is an endogenous ligand of the G protein-coupled receptor APJ. It is widely expressed in many tissues, especially the lungs. Growing evidence indicates that the apelin signaling pathway is closely associated with the occurrence of respiratory diseases [50]. Studies have found that inhibiting apelin effectively remodeled the tumor microenvironment, reduced angiogenesis, and inhibited tumor growth [51]. Importantly, apelin prevents resistance to antiangiogenic receptor tyrosine kinase (RTK) inhibitor therapy in lung cancer [51]. PI3K-Akt signaling pathways play a key role in the regulation of cell proliferation, differentiation, and apoptosis. AKT is a crucial regulator of cell survival and apoptosis [52,53].
AKT is activated by PDK1 through phospholipid binding and activation loop phosphorylation at Thr308 [52,53]. AKT inhibits apoptosis and promotes survival of malignant cells by phosphorylating and inactivating multiple targets, including Bad, C-Raf and Caspase-9 [54]. In recent years, research has shown that the axon guidance cue molecule Slit2 could suppress lung cancer cell invasion and growth [55].
We also focused on their targets and regulators. We explored the transcription factor targets, kinase targets, and miRNA targets of NUDT21 and its neighboring genes, and found that DNMT1, HDAC1, and MYC were their key regulatory factors in LUAD. DNMT1 is a major epigenetic enzyme that maintains genomic stability and the epigenetic state of DNA by copying CpG methylation markers and generating heritable methylation patterns through cell metabolism [56,57]. Approaches for inhibiting DNMT1 may become novel strategies for treating cancers [58]. In recent years, abnormal expression of HDAC1 has been found to have potential clinical prognostic value in various cancers. Studies have shown that downregulation of HDAC1 can inhibit cell proliferation, migration, invasion, and angiogenesis and induce cell apoptosis [59]. MYC is a transcription factor that is overexpressed in tumors and helps prevent immune cells from attacking tumor cells by inducing the expression of PD-L1. MYC expression may serve as an alternative marker for evaluating the treatment of non-small cell lung cancer [60]. Therefore, targeting DNMT1, HDAC1, and MYC may be a promising approach to the treatment of LUAD.
We also found that CDK1, ATM, and PLK1 were the top kinase targets and that MIR-302C, MIR-9, and MIR-330 were the main miRNA targets of the NUDT21 network. The miR-9 family includes three members: miR-9-1, miR-9-2, and miR-9-3. Researchers recently found that the promoter region of miR-9-3 was hypermethylated in lung cancer, leading to down-regulation of miR-9-3 and poor prognosis of patients, and that miR-9 may be a potential lung cancer biomarker [64,65]. MiR-330 acts as a tumor-suppressor microRNA (miRNA) in various cancers and is a promising candidate for miRNA replacement therapy in lung cancer patients [66]. In conclusion, the transcription factor targets, kinase targets, and miRNA targets of NUDT21 and its neighboring genes may be potential therapeutic targets and biomarkers for LUAD.
Pinellia ternata is a commonly used Chinese medicine for treating infection, inflammation, cough, and vomiting, as documented in the Chinese Pharmacopoeia (2015 edition) [21,22]. In addition, a growing number of studies have found that Pinellia ternata has a therapeutic effect on various cancers [23,24]. In our study, 13 active compounds of Pinellia ternata were retrieved from TCMSP based on OB and DL. Among them, baicalein bound best to NUDT21 (score = -8.8 kcal/mol). Hydrogen bonding was their main interaction force (six hydrogen bonds were found in our study). Although our study has not further verified the anti-LUAD effect of baicalein in cell or animal experiments, our results preliminarily indicate that it may act on NUDT21 against LUAD.
In conclusion, by showing the overexpression and gene regulation network of NUDT21 in LUAD, we expect to provide clinical workers with a new perspective on drug selection from the standpoint of immunotherapy for LUAD. Furthermore, we identify new therapeutic targets and prognostic biomarkers that may help predict the survival of patients with LUAD more accurately. We also provide potential therapeutic drugs from traditional Chinese medicine for the treatment of LUAD.

Declarations
Yong-li SITU performed data analysis work and aided in writing the manuscript. Hong NIE and Zhixin FANG designed the study and assisted in writing the manuscript. Yong-li SITU, Li-na LONG, and Hai-jian LI edited the manuscript. All authors read and approved the final manuscript.

Funding
This work was supported by Natural Science Foundation of China (No.81673634 and No.81861138042). This funding source had no role in the design of this study and will not have any role during its execution, analyses, interpretation of the data, draft of the manuscript, or decision to submit results.

Availability of data and materials
The datasets used and analyzed in this study are available from the public platforms described in the methods section of the manuscript.

Consent to publication
None.

Ethics approval and consent to participate
In this study, relevant data were obtained from publicly available platforms for retrospective analysis. All data in these platforms were obtained with informed consent signed by the patients or their families and were published by the relevant institutions or individuals.

Competing interests

The relative level of NUDT21 in LUAD (GEPIA).
The prognostic value of NUDT21 in LUAD patients in the overall survival curve (GEPIA).
The correlation between NUDT21 and immune cell infiltration in LUAD (TIMER).

Figure 12
Molecular docking models of NUDT21 with baicalein. A. Molecular docking of NUDT21 with baicalein; B.