Bioinformatics analysis identifies NDRG1 gene variants that may be relevant to thyroid cancer risk and prognosis

doi:10.21203/rs.3.rs-1651283/v1

Studying genetic alterations in genes that regulate normal cellular processes, and that thereby give tumor cells growth advantages and metastatic capabilities, is fundamental to understanding the molecular nature of different types of cancer and to revealing new therapeutic targets. The NDRG1 gene encodes a protein whose expression has been associated with tumor development and progression, but data across cancer types remain inconclusive and controversial. To explore a possible relationship between NDRG1 gene polymorphisms and the susceptibility, clinical characteristics and evolution of patients with thyroid tumors, we undertook a comprehensive bioinformatics investigation using a series of in silico tools. Our analysis indicates that the NDRG1 SNPs rs201348291 and rs151322132 are potentially important candidates that may influence the pathological development of neoplasms, including thyroid neoplasms. Further validation in patients with thyroid nodules, together with functional studies, may confirm their clinical utility in the management of these patients.

NDRG1

in silico analysis

bioinformatics

thyroid

Approximately 90% of the sequence variation in human genetic material occurs at a single base in DNA; these variants are called single nucleotide polymorphisms (SNPs) (Collins 1998). Most are neutral, but some lead to functional changes in proteins that can cause a number of disorders. Hence, SNPs may function as genetic markers of disease (Elkhattabi 2019). Missense variants (nonsynonymous SNPs, nsSNPs) are particularly important because they change the translated sequence of amino acid residues (Dabhi 2014). nsSNPs can alter protein function, structure and stability, for example by reducing solubility or destabilizing the fold, and can also affect gene regulation by modifying transcription and translation (Dabhi 2014). They are associated with many hereditary diseases in the human population and have been implicated in diverse types of tumors (Dabhi 2014, AbdulAzeez 2016).

The NDRG1 gene encodes the NDRG1 protein, which is highly conserved in multicellular organisms and ubiquitously expressed in tissues in response to cellular stress signals (Ellen 2008, Sun 2013). The protein comprises 394 amino acids, is present in the cytosol and is expressed in epithelial cells; its cellular localization depends on the tissue in which it is found (Kovacevic 2006). It is associated with growth suppression, cellular differentiation and the modulation of signaling pathways (Bandyopadhyay 2003, Gerhard 2010, Chua 2007, Cheng 2011), and also acts in vesicular transport and in the regulation of membrane proteins (Ellen 2008). However, the role of the NDRG1 gene in tumor development and progression is still uncertain and controversial. Whereas some evidence indicates that its activation has anti-metastatic action, other studies suggest that its induction is related to poorer tumor differentiation, more aggressive metastatic potential and, therefore, worse prognosis (Gerhard 2010, Mosquera 2013). In breast, colorectal and prostate neoplasms, the gene appears to be related to a lower incidence of metastasis and less tumor progression (Bandyopadhyay 2003, Gerhard 2010), but in hepatocarcinoma it indicates a worse prognosis (Chua 2007, Cheng 2011). The NDRG1 protein appears to be involved in proliferation, migration and invasion through MMP-2 and MMP-9 mediation; in metastasis, by acting in conjunction with E-cadherin, EMT and MMP-9; and in aberrant methylation (Bandyopadhyay 2003).

Some studies have shown increased immunohistochemical expression of NDRG1 in primary and metastatic thyroid carcinomas compared with normal thyroid, nodular goiter and follicular adenoma. In addition, NDRG1 expression was associated with more advanced TNM stage and an AMES high-risk category in patients with thyroid carcinoma, suggesting a role in thyroid tumor progression (Gerhard 2010). Another study demonstrated that, in esophageal squamous cell carcinoma tissues, NDRG1 overexpression was associated with poor overall survival (Ai 2016). Moreover, a novel somatic alteration of NDRG1, p.G136D, was found in a patient with papillary thyroid cancer (PTC) (Chang 2018). A previous whole-exome sequencing study suggested that variants of this gene might be important in PTC tumorigenesis, serving as a marker of risk and possibly of tumor aggressiveness and, therefore, as a prognostic marker (Chun-Chi 2018). However, the role of NDRG1 nsSNPs in thyroid nodules is still poorly understood, even though they may cause deleterious modifications in the protein and could be associated with the thyroid carcinogenic process.

Data Selection

NDRG1 gene SNPs were obtained from the NCBI dbSNP (https://www.ncbi.nlm.nih.gov/snp/?term=NDRG1). The primary sequences of the proteins encoded by the gene were taken from the UniProt database (https://www.uniprot.org/uniprot/Q92597) under the code Q92597. The three available FASTA sequences were used for analysis.

Data analysis in bioinformatics

Bioinformatics tools were used to determine the effects of the NDRG1 gene SNPs on the NDRG1 protein. These effects were evaluated on the following platforms:

Prediction of deleterious nsSNPs

PredictSNP1.0 (http://loschmidt.chemi.muni.cz/predictsnp1/) (Bendl 2014) was used to predict the effect of each SNP on protein function and structure. This resource is a consensus classifier that provides access to nine top-performing prediction tools: SIFT, PolyPhen-1, PolyPhen-2, MAPP, PhD-SNP, SNAP, PANTHER, PredictSNP, and nsSNPAnalyzer. SIFT (Sorting Intolerant from Tolerant) predicts whether an amino acid substitution affects protein function based on sequence homology and the physical properties of the amino acids (Ng 2003). It takes a query sequence and uses multiple alignment information to predict tolerated and deleterious substitutions at each position of that sequence. The PolyPhen-1 (Polymorphism Phenotyping) tool uses a specialized set of empirical rules to predict the possible impact of amino acid substitutions, while PolyPhen-2 (Polymorphism Phenotyping v2) predicts the potential effect of an amino acid substitution on the structure and function of a human protein using multiple sequence alignment and structural information. MAPP (Multivariate Analysis of Protein Polymorphism) analyzes the physicochemical variation present in each column of a protein sequence alignment and predicts the impact of amino acid substitutions on protein function (Stone 2005). PhD-SNP (Predictor of Human Deleterious Single Nucleotide Polymorphisms) is a support vector machine (SVM)-based predictor used to identify nsSNPs associated with human genetic disease (Capriotti 2006). SNAP (Screening for Non-Acceptable Polymorphisms) is a neural network-based method used to predict the functional effects of nsSNPs using in silico-derived protein information (Bromberg 2007). PANTHER (Protein ANalysis THrough Evolutionary Relationships) estimates the probability that a given nsSNP will cause a functional effect on the protein using position-specific evolutionary preservation (Tang 2016). nsSNPAnalyzer uses a machine learning method called random forest to predict whether an nsSNP has a phenotypic effect, based on multiple sequence alignment and 3D structure information (Bao 2005). Finally, PredictSNP1.0 displays the confidence scores generated by each tool and a consensus prediction as percentages, using their observed precision values to simplify comparisons (Bendl 2014).

Prediction of change in protein stability

The effect of an amino acid change on protein stability was predicted using I-Mutant2.0 (http://folding.biofold.org/cgi-bin/i-mutant2.0), a support vector machine (SVM)-based web server for the automatic prediction of protein stability changes upon single-site mutations. It reports the predicted free energy change (DDG) and the sign of the prediction (increased or decreased stability). The DDG value is calculated as the unfolding Gibbs free energy of the mutated protein minus that of the wild-type protein, in kcal/mol (Capriotti 2005).

Stability was also assessed with the MUpro tool (http://mupro.proteomics.ics.uci.edu/). This server is based on two machine learning methods: support vector machines and neural networks. Both were trained on a large mutation dataset and showed accuracy above 84%. A confidence score < 0 indicates that the mutation decreases the stability of the protein, while a confidence score > 0 indicates that the mutation increases it (Elkhattabi 2019).
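The sign conventions described for I-Mutant2.0 and MUpro can be made explicit with a minimal sketch. The function names below are hypothetical and are not part of either server; the code only restates the DDG definition and the sign rule given in the text.

```python
def ddg(dg_unfold_mutant: float, dg_unfold_wild_type: float) -> float:
    """Return DDG in kcal/mol: mutant unfolding free energy minus wild type."""
    return dg_unfold_mutant - dg_unfold_wild_type


def stability_direction(value: float) -> str:
    """Map a signed DDG (or MUpro confidence score) to a stability call."""
    if value < 0:
        return "decrease"
    if value > 0:
        return "increase"
    return "no change"


# Illustrative numbers only, not measured values.
print(stability_direction(ddg(3.2, 4.0)))
```

A negative value therefore corresponds to the downward arrows and a positive value to the upward arrows used when reporting stability results.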

PROVEAN was also used. This tool predicts whether an amino acid change affects the function of a protein. To maximize the separation of deleterious and neutral variants across all four classes of human protein variants, the default score threshold for binary classification (deleterious vs. neutral) is currently set at -2.5; variants scoring at or below the threshold are predicted to be deleterious. To increase detection sensitivity (finding more deleterious variants, including those with lower confidence), a higher score threshold can be used (Elkhattabi 2019).
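The PROVEAN binary classification amounts to a single comparison against the cutoff. The sketch below is illustrative: `provean_call` is a hypothetical helper, and the scores passed to it are example values rather than output retrieved from the server.

```python
PROVEAN_DEFAULT_CUTOFF = -2.5  # default threshold stated in the text


def provean_call(score: float, cutoff: float = PROVEAN_DEFAULT_CUTOFF) -> str:
    """Return "D" (deleterious) if the score is at or below the cutoff, else "N"."""
    return "D" if score <= cutoff else "N"


for score in (-3.326, -2.484, 0.951):
    print(score, provean_call(score))
```

Raising the cutoff, for example `provean_call(-2.484, cutoff=-2.0)`, reclassifies borderline variants as deleterious, which is the sensitivity trade-off mentioned above.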

Prediction of linkage disequilibrium

Some pairs of SNP alleles along a haplotype are inherited together more often than would be expected by chance. This non-random association is called linkage disequilibrium: the frequency of an allele at one locus conveys information about the allele at another locus. The Haploview software estimates linkage disequilibrium statistically from genotype data (Barrett 2005, Jr 2021). Elevated levels of linkage disequilibrium between two SNPs indicate that the statistical information of these polymorphisms is redundant and not suitable for association analysis (Jr 2021). Haploview is used to (1) analyze HapMap data and choose target SNPs, (2) assess the quality of disease genotype data, (3) test for association, and (4) evaluate a region for tracking an association (Jr 2021).
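To make the notion of linkage disequilibrium concrete, the sketch below computes the standard pairwise measures D, |D'| and r² from haplotype and allele frequencies. It is a textbook calculation under the assumption that phased haplotype frequencies are already known; it is not Haploview's implementation, which estimates these quantities from genotype data, and the function name is hypothetical.

```python
def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b
    # The maximum attainable |D| depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Complete disequilibrium: allele A always travels with allele B.
print(linkage_disequilibrium(0.5, 0.5, 0.5))
# Equilibrium: the haplotype frequency equals the product of allele frequencies.
print(linkage_disequilibrium(0.25, 0.5, 0.5))
```

Values of |D'| and r² near 1 indicate that two SNPs carry largely redundant information, whereas values near 0 indicate that they behave independently.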

Protein-protein interaction analysis

STRING (https://string-db.org/cgi/about) is a database of known and predicted protein-protein interactions. Interactions include direct (physical) and indirect (functional) associations; these interactions originate from computational predictions, from the transfer of knowledge between organisms, and from aggregated interactions of other (primary) databases.
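Interaction partners can also be requested programmatically rather than through the web interface. The sketch below only builds a request URL for the public STRING REST endpoint `interaction_partners`; the NCBI taxon 9606 for human and the partner limit are assumptions chosen for illustration, the helper name is hypothetical, and this is not necessarily how the interaction data in this study were retrieved.

```python
from urllib.parse import urlencode

STRING_API_BASE = "https://string-db.org/api"


def string_partners_url(gene: str, species: int = 9606, limit: int = 10,
                        fmt: str = "tsv") -> str:
    """Build a STRING API URL listing interaction partners of one gene."""
    query = urlencode({"identifiers": gene, "species": species, "limit": limit})
    return f"{STRING_API_BASE}/{fmt}/interaction_partners?{query}"


print(string_partners_url("NDRG1"))
```

The resulting URL can be fetched with any HTTP client; the response is a tab-separated table with one row per predicted partner.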

Probability analysis of the impact of nsSNPs on the function of the studied protein

We employed the SNPs3D resource (http://www.SNPs3D.org), a tool that has three main modules. A first module identifies which genes are candidates for involvement in a specific disease. A second module provides information on the relationships between sets of candidate genes. The third module analyzes the likely impact of non-synonymous SNPs on protein function. Disease/candidate gene relationships and gene-gene relationships are derived from the literature using simple but effective text profiles. SNP/protein function relationships are derived by two methods, one using principles of protein structure and stability, the other based on sequence conservation. Entries for each gene include a series of links to other data such as expression profiles, path context, mouse knockout information, and roles. Gene-gene interactions are presented in an interactive graphical interface, providing quick access to underlying information as well as convenient web browsing.

SNP dataset

Data on the NDRG1 gene SNPs (NCBI Gene ID: 10397) investigated in this work were retrieved in September 2021 from the dbSNP database (https://www.ncbi.nlm.nih.gov/snp/?term=ndrg1). It contained a total of 16,383 SNPs. Of these, 319 were missense, which were further evaluated since they could lead to amino acid change.

Prediction of deleterious nsSNPs

Of the 319 missense polymorphisms obtained from dbSNP, 147 showed an amino acid change, and some had more than one annotated change, so the three FASTA primary sequences were used to analyze the amino acid changes. These sequences correspond to the three protein isoforms encoded by the gene and were taken from the UniProt database (https://www.uniprot.org/uniprot/Q92597) under accession Q92597.
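Because the same SNP maps to a different residue position in each isoform (for example, rs201348291 appears as P298S, P232S and P217S in Table 1), it helps to check every substitution against the isoform sequence before submitting it to a predictor. The sketch below is a hypothetical helper that validates the wild-type residue and returns the mutated sequence; the toy sequence is invented for illustration and is not the NDRG1 sequence.

```python
import re

# Substitution notation: wild-type residue, 1-based position, mutant residue.
_CHANGE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence: str, change: str) -> str:
    """Apply a change such as "R56Q" after checking the wild-type residue."""
    match = _CHANGE.match(change)
    if match is None:
        raise ValueError(f"unrecognized change: {change!r}")
    wild, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if position < 1 or position > len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild:
        raise ValueError(
            f"expected {wild} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + mutant + sequence[position:]


toy_sequence = "MKTAYIAR"  # invented example, not a real isoform
print(apply_substitution(toy_sequence, "M1V"))
```

A mismatch between the expected and observed wild-type residue raises an error, which flags a substitution paired with the wrong isoform.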

According to SIFT, 51 of 109 nsSNPs (46.78%) were predicted to be deleterious; SNAP indicated 32 (29.35%) deleterious nsSNPs; PolyPhen-2 indicated 53 (48.62%); PhD-SNP indicated 52 (47.70%); PolyPhen-1 identified 38 (34.86%); and MAPP indicated 47 (43.11%). In the PredictSNP analysis, 13 nsSNPs (11.92%) were considered deleterious by all the integrated tools. These nsSNPs, listed in table 1, were selected for further analysis with the MUpro, I-Mutant2.0, PROVEAN and PANTHER tools, as shown in table 2.
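The selection of nsSNPs called deleterious by all the integrated tools can be sketched as a simple filter over the per-tool calls. The three rows below are copied from the FASTA 1 entries of Table 1; treating "-" as a missing prediction that is ignored is an assumption made for this illustration, and the function name is hypothetical.

```python
# Per-tool calls in Table 1 column order (the PredictSNP consensus is excluded).
TOOLS = ("MAPP", "PhD-SNP", "PolyPhen-1", "PolyPhen-2", "SIFT", "SNAP")


def unanimous_deleterious(calls) -> bool:
    """True when every tool that returned a call predicted "D"."""
    made = [call for call in calls if call != "-"]  # assumption: "-" means no prediction
    return len(made) > 0 and all(call == "D" for call in made)


ROWS = [
    ("rs201348291", "P298S", ("D", "D", "D", "D", "D", "D")),
    ("rs151322132", "R56Q", ("-", "D", "D", "D", "D", "D")),
    ("rs2233319", "M1V", ("N", "N", "N", "D", "D", "N")),
]

selected = [rs_id for rs_id, _, calls in ROWS if unanimous_deleterious(calls)]
print(selected)  # ['rs201348291', 'rs151322132']
```

A stricter variant could require a call from every tool, which would exclude rows with missing predictions.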

Four SNPs (rs2272646, rs3779941, rs3088599 and rs2977497) from intronic regions found in literature were also analyzed, but were found to be mostly neutral, as shown in table 3.

An extensive literature search was performed for each of the deleterious nsSNPs selected by these tools, but we were unable to find any description or mention of any of them in connection with thyroid nodules or thyroid tumors.

Table 1.

nsSNP	AA Change	FASTA	PredictSNP	MAPP	PhD-SNP	PolyPhen-1	PolyPhen-2	SIFT	SNAP
rs2233319	M1V	1	N	N	N	N	D	D	N
rs138555940	M5T	1	N	N	N	D	N	D	D
rs145871479	A11T	1	N	N	N	N	N	N	N
rs377527660	G25R	1	N	-	N	N	N	N	N
rs61755063	Q33R	1	N	N	N	N	N	N	D
rs200046476	I37F	1	D	D	N	D	D	D	N
rs2233318	H41R	1	N	N	N	N	N	N	N
rs200593999	V44I	1	N	N	N	N	N	N	N
rs368056707	N55K	1	N	N	D	N	N	N	N
rs151322132	R56Q	1	D	-	D	D	D	D	D
rs142426003	Y62C	1	D	N	D	D	D	D	D
rs2233319	M67V	1	D	D	D	N	N	D	D
rs373590447	H69Y	1	N	N	D	N	N	N	N
rs368584363	Y73H	1	N	D	D	N	N	D	N
rs61755063	Q87R	1	N	N	N	N	N	N	N
rs200433822	G102S	1	D	N	D	D	D	D	N
rs374160497	A103T	1	D	D	D	D	D	D	N
rs144310406	A108T	1	N	N	N	N	N	N	N
rs146310821	M111R	1	N	N	N	N	N	N	N
rs2233328	M111L	1	N	N	N	N	N	N	N
rs2233319	M121V	1	N	N	D	N	D	D	N
rs200593999	V125I	1	N	N	N	N	N	N	N
rs202118022	I135V	1	N	D	N	N	N	N	N
rs139220402	Y144C	1	D	N	D	D	D	D	D
rs376555145	R148Q	1	D	N	D	D	D	D	N
rs370228850	L162V	1	D	D	D	D	D	D	N
rs148821566	P188S	1	N	N	N	N	D	N	N
rs138839833	V205M	1	N	N	N	N	N	N	N
rs138285479	R212C	1	D	D	D	D	D	D	D
rs147184316	M219K	1	N	N	D	N	D	D	N
rs143549909	N220K	1	D	D	D	N	D	D	D
rs199995009	G222S	1	N	N	N	N	N	N	N
rs137993172	N229S	1	N	N	N	N	N	N	N
rs368404338	R241C	1	D	D	D	D	D	D	D
rs148821566	P254S	1	N	N	N	N	D	N	N
rs201759485	V259A	1	D	D	D	N	D	D	N
rs201348291	P298S	1	D	D	D	D	D	D	D
rs373637595	A302T	1	N	N	N	N	D	N	N
rs367925853	A304V	1	D	D	D	D	D	D	D
rs368404338	R322C	1	D	N	D	D	D	D	D
rs144714216	R343H	1	D	-	D	N	D	N	D
rs146613168	R343C	1	D	-	D	D	D	D	D
rs111835070	T356A	1	N	N	N	D	N	D	N
rs144379016	G359D	1	N	N	N	N	N	D	N
rs181121989	R363H	1	N	N	N	N	N	N	N
rs367925853	A370V	1	N	-	N	N	N	D	N
rs373815729	S378L	1	N	-	N	N	N	D	N
rs373590447	H3Y	2	N	N	D	N	N	N	N
rs368584363	Y7H	2	D	D	N	N	N	D	D
rs200433822	G36S	2	D	N	D	D	D	D	N
rs374160497	A37T	2	D	D	D	D	D	D	N
rs144310406	A42T	2	N	N	N	N	N	N	N
rs146310821	M45R	2	N	D	N	N	N	N	N
rs2233328	M45L	2	N	N	N	N	N	N	N
rs200593999	V59I	2	N	D	N	N	N	N	N
rs202118022	I69V	2	N	D	N	N	N	N	N
rs139220402	Y78C	2	D	N	D	D	D	D	N
rs376555145	R82Q	2	D	D	D	D	D	D	N
rs150796527	A84S	2	D	D	D	N	D	D	N
rs370228850	L96V	2	D	D	D	D	D	D	N
rs369577648	Q119H	2	N	N	N	N	N	N	N
rs138839833	V139M	2	N	N	N	N	N	N	N
rs138285479	R146C	2	D	D	D	D	D	D	D
rs147184316	M153K	2	D	D	D	N	D	N	N
rs143549909	N154K	2	D	D	D	N	D	D	D
rs199995009	G156S	2	N	N	N	N	N	N	N
rs137993172	N163S	2	N	N	N	N	N	N	N
rs201759485	V193A	2	D	D	D	N	D	N	N
rs373637595	A221T	2	D	N	D	D	D	D	D
rs201348291	P232S	2	D	D	D	D	D	D	D
rs373637595	A236T	2	N	N	N	N	D	N	N
rs368404338	R256C	2	D	D	D	D	D	D	D
rs144714216	R277H	2	D	D	D	N	D	N	D
rs146613168	R277C	2	D	D	D	D	D	D	D
rs111835070	T290A	2	N	-	N	D	N	D	N
rs144379016	G293D	2	N	-	N	N	N	D	D
rs181121989	R297H	2	N	-	N	N	N	N	N
rs373815729	S312L	2	D	D	N	N	N	D	D
rs200433822	G21S	3	D	N	D	D	D	D	N
rs374160497	A22T	3	D	D	D	D	D	D	N
rs144310406	A27T	3	N	N	N	N	N	N	N
rs146310821	M30R	3	N	D	N	N	N	N	N
rs2233328	M30L	3	N	N	N	N	N	N	N
rs202118022	I54V	3	N	D	N	N	N	N	N
rs139220402	Y63C	3	D	D	D	D	D	D	D
rs376555145	R67Q	3	D	D	D	D	D	D	N
rs370228850	L81V	3	D	D	D	D	D	D	N
rs111835070	T103A	3	D	D	D	N	N	D	N
rs369577648	Q104H	3	N	N	N	N	N	N	N
rs138839833	V124M	3	N	N	N	N	N	N	N
rs138285479	R131C	3	D	D	D	D	D	D	D
rs138839833	V135M	3	N	N	N	N	N	N	N
rs147184316	M138K	3	N	D	D	N	D	N	N
rs143549909	N139K	3	D	D	D	N	D	D	N
rs199995009	G141S	3	N	N	N	N	N	N	N
rs137993172	N148S	3	N	N	N	N	N	N	N
rs148821566	P173S	3	N	D	N	N	D	N	N
rs201759485	V178A	3	D	D	D	N	D	N	N
rs201759485	V189A	3	D	D	D	N	D	D	D
rs201348291	P217S	3	D	D	D	D	D	D	D
rs144714216	R262H	3	D	D	D	N	D	N	D
rs146613168	R262C	3	D	D	D	D	D	D	D
rs111835070	T275A	3	N	-	N	D	N	D	N
rs144379016	G278D	3	N	-	N	N	N	D	N
rs367925853	A289V	3	N	-	N	N	N	D	N
rs373815729	S297L	3	N	D	N	N	N	D	N
rs367925853	A300V	3	N	N	N	N	N	N	D
rs373815729	S308L	3	D	D	N	D	D	D	D

Table 2.

nsSNP	AA Change	FASTA	PANTHER	MUpro result	MUpro DDG	I-Mutant2.0 result	I-Mutant2.0 RI	PROVEAN prediction	PROVEAN score
rs2233319	M1V	1	D	↓	-0.56817112	↓	8	N	-0.830
rs138555940	M5T	1	D	↓	-1.3274808	↓	7	N	-2.413
rs145871479	A11T	1	D	↓	-1.3697944	↓	6	N	1.737
rs377527660	G25R	1	N	↓	-0.750936	↓	6	N	0.951
rs61755063	Q33R	1	N	↓	-0.72907334	↓	4	N	-1.520
rs200046476	I37F	1	D	↓	-0.9407013	↓	7	D	-3.498
rs2233318	H41R	1	N	↓	-0.41592254	↓	6	D	-4.461
rs200593999	V44I	1	N	↓	-0.83166145	↓	5	N	-0.131
rs368056707	N55K	1	N	↓	-1.1398728	↓	2	N	-2.209
rs151322132	R56Q	1	D	↓	-1.1018879	↓	7	D	-3.326
rs142426003	Y62C	1	D	↓	-1.6684893	↓	1	D	-6.718
rs2233319	M67V	1	D	↓	-1.0797346	↓	8	N	-1.062
rs373590447	H69Y	1	N	↓	-0.59961522	↓	2	D	-2.785
rs368584363	Y73H	1	D	↓	-0.96511263	↓	7	D	-3.593
rs61755063	Q87R	1	N	↓	-0.76343932	↓	2	N	-1.814
rs200433822	G102S	1	D	↓	-0.62604267	↓	7	D	-4.752
rs374160497	A103T	1	D	↓	-1.1349019	↓	8	D	-3.379
rs144310406	A108T	1	N	↓	-0.66255687	↓	9	N	0.560
rs146310821	M111R	1	N	↓	-1.0415918	↓	4	N	0.200
rs2233328	M111L	1	N	↓	-0.25970482	↓	2	N	-1.451
rs2233319	M121V	1	D	↓	-1.260296	↓	5	D	-2.625
rs200593999	V125I	1	N	↓	-0.38010461	↑	0	N	-0.531
rs202118022	I135V	1	N	↓	-2.2686826	↓	7	N	-0.757
rs139220402	Y144C	1	D	↓	-0.58692472	↓	3	D	-7.430
rs376555145	R148Q	1	D	↓	-1.0203947	↓	7	D	-3.071
rs370228850	L162V	1	D	↓	-0.54583977	↓	8	D	-2.606
rs148821566	P188S	1	N	↓	-1.6292528	↓	8	D	-3.817
rs138839833	V205M	1	N	↓	-1.2819704	↓	7	N	0.147
rs138285479	R212C	1	D	↓	-0.58824678	↓	5	D	-6.762
rs147184316	M219K	1	D	↓	-2.073559	↓	7	N	-2.484
rs143549909	N220K	1	D	↓	-1.1516959	↓	6	D	-5.022
rs199995009	G222S	1	N	↓	-0.58146123	↓	9	N	0.756
rs137993172	N229S	1	N	↓	-1.1513627	↓	7	N	-1.880
rs368404338	R241C	1	D	↓	-0.71066593	↓	2	D	-6.327
rs148821566	P254S	1	N	↓	-0.92148153	↓	8	D	-5.611
rs201759485	V259A	1	D	↓	-1.646295	↓	9	D	-3.746
rs201348291	P298S	1	D	↓	-0.83801011	↓	7	D	-7.117
rs373637595	A302T	1	N	↓	-0.8127235	↓	8	N	1.744
rs367925853	A304V	1	D	↓	-0.98119679	↓	4	D	-3.570
rs368404338	R322C	1	D	↓	-0.60820748	↑	1	D	-7.075
rs144714216	R343H	1	N	↓	-0.9352455	↓	5	N	-2.214
rs146613168	R343C	1	D	↓	-0.6833013	↑	1	D	-4.481
rs111835070	T356A	1	D	↓	-1.0621421	↓	6	N	-1.306
rs144379016	G359D	1	D	↓	-0.7183422	↓	2	N	-0.728
rs181121989	R363H	1	N	↓	-0.9352455	↓	3	N	-1.071
rs367925853	A370V	1	D	↓	-0.84050858	↓	1	N	-1.094
rs373815729	S378L	1	D	↑	0.15723187	↑	5	N	-1.151
rs373590447	H3Y	2	-	-	-	-	-	D	-2.577
rs368584363	Y7H	2	D	↓	-1	↓	6	D	-3.441
rs200433822	G36S	2	D	↓	-0.47786176	↓	7	D	-4.617
rs374160497	A37T	2	D	↓	-1	↓	8	D	-3.274
rs144310406	A42T	2	-	↓	-0.48662768	↓	9	N	0.497
rs146310821	M45R	2	-	↓	-0.46292434	↓	4	N	0.214
rs2233328	M45L	2	N	↓	-0.072420624	↓	2	N	-1.462
rs200593999	V59I	2	-	↓	-0.62188057	↑	0	N	-0.475
rs202118022	I69V	2	-	↓	-0.4873603	↓	7	N	-0.719
rs139220402	Y78C	2	-	↑	0.54315229	↓	3	D	-7.400
rs376555145	R82Q	2	-	↓	-0.69129085	↓	7	D	-2.919
rs150796527	A84S	2	-	↓	-0.90784296	↓	9	N	-2.419
rs370228850	L96V	2	-	↓	-1	↓	8	N	-2.478
rs369577648	Q119H	2	-	↑	0.33917239	↓	4	N	0.006
rs138839833	V139M	2	-	↓	-0.1314981	↓	7	N	0.163
rs138285479	R146C	2	-	↓	-0.61815908	↓	5	D	-6.522
rs147184316	M153K	2	-	↓	-1	↓	7	N	-2.312
rs143549909	N154K	2	-	↓	-1	↓	6	D	-4.767
rs199995009	G156S	2	-	↓	-0.31205687	↓	9	N	0.701
rs137993172	N163S	2	N	↓	-0.31254304	↓	7	N	-1.883
rs201759485	V193A	2	-	↓	-1	↓	9	D	-3.531
rs373637595	A221T	2	-	↓	-0.81701552	↓	2	D	-2.884
rs201348291	P232S	2	-	↓	-1	↓	7	D	-6.817
rs373637595	A236T	2	-	↓	-0.8999881	↓	8	N	1.711
rs368404338	R256C	2	-	↑	0.014066878	↑	1	D	-7.212
rs144714216	R277H	2	-	↓	-0.031492764	↓	5	N	-2.449
rs146613168	R277C	2	-	↓	-0.17422903	↑	1	D	-4.713
rs111835070	T290A	2	-	↑	0.1363743	↓	6	N	-1.403
rs144379016	G293D	2	-	↓	-0.19900539	↓	2	N	-0.859
rs181121989	R297H	2	-	↓	-0.031492764	↓	3	N	-1.156
rs373815729	S312L	2	-	↑	1	↑	5	N	-1.454
rs200433822	G21S	3	D	↓	-0.62604267	↓	7	D	-4.623
rs374160497	A22T	3	D	↓	-1.1349019	↓	8	D	-3.266
rs144310406	A27T	3	N	↓	-0.66255687	↓	9	N	0.499
rs146310821	M30R	3	N	↓	-1.0415918	↓	4	N	0.258
rs2233328	M30L	3	N	↓	-0.25970482	↓	2	N	-1.458
rs202118022	I54V	3	N	↓	-2.2686826	↓	7	N	-0.709
rs139220402	Y63C	3	D	↓	-0.58692472	↓	3	D	-7.411
rs376555145	R67Q	3	D	↓	-1.0203947	↓	7	D	-2.919
rs370228850	L81V	3	D	↓	-0.54583977	↓	8	D	-2.528
rs111835070	T103A	3	D	↓	-0.15286417	↓	6	D	-3.822
rs369577648	Q104H	3	N	↓	-0.36884022	↓	4	N	0.023
rs138839833	V124M	3	D	↓	-1.2819704	↓	7	N	0.163
rs138285479	R131C	3	D	↓	-0.5882467	↓	5	D	-6.505
rs138839833	V135M	3	D	↓	-0.71988461	↓	7	N	-0.621
rs147184316	M138K	3	D	↓	-2.073559	↓	7	N	-2.312
rs143549909	N139K	3	D	↓	-1.1516959	↓	6	D	-4.767
rs199995009	G141S	3	N	↓	-0.58146123	↓	9	N	0.701
rs137993172	N148S	3	N	↓	-1.1513627	↓	7	N	-1.883
rs148821566	P173S	3	N	↓	-0.92148153	↓	8	D	-5.437
rs201759485	V178A	3	D	↓	-1.646295	↓	9	D	-3.531
rs201759485	V189A	3	D	↓	-0.57637608	↓	9	D	-3.176
rs201348291	P217S	3	D	↓	-0.83801011	↓	7	D	-6.817
rs144714216	R262H	3	D	↓	-0.9352455	↓	5	D	-2.575
rs146613168	R262C	3	D	↓	-0.6833013	↑	1	D	-4.908
rs111835070	T275A	3	N	↓	-1.0621421	↓	6	N	-1.467
rs144379016	G278D	3	N	↓	-0.7183422	↓	2	N	-0.570
rs367925853	A289V	3	-	↓	-0.84050858	↓	1	N	-1.006
rs373815729	S297L	3	-	↑	0.15723187	↑	5	N	-0.965
rs367925853	A300V	3	-	↓	-0.37400668	↓	4	N	-0.424
rs373815729	S308L	3	D	↑	0.21987013	↑	6	N	-1.649

Table 3.

Prediction of linkage disequilibrium

Five SNPs predicted as most deleterious by the PredictSNP tools from the FASTA 1 sequence were evaluated with the Haploview tool (https://www.broadinstitute.org/haploview/haploview). The tool predicted that there is no linkage disequilibrium among any of the five evaluated SNPs (rs146613168, rs368404338, rs201348291, rs143549909 and rs151322132), as shown in figure 1.

Protein-Protein Interaction Analysis

The STRING database (https://string-db.org/cgi/network?taskId=b022 LqlKmSV8&sessionId=brixGRS2HYsO) was used to obtain data on the interactions of NDRG1; the results are shown in figure 2. The server indicated that NDRG1 interacts with 11 proteins, including cellular tumor antigen p53 (TP53), N-myc proto-oncogene protein (MYCN), serine/threonine-protein kinase (SGK1), serine/threonine-protein kinase 38-like (STK38L), NDRG2 protein (NDRG2), Myc proto-oncogene protein (MYC), WNT1-inducible signaling pathway protein 1 (WISP1), RAC-alpha serine/threonine-protein kinase (AKT1), serum/glucocorticoid regulated kinase family, member 3 (SGK3), and cadherin 1 (CDH1).

Probability analysis of the impact of nsSNPs on the function of the investigated protein

The SNPs3D resource (http://www.SNPs3D.org), used to assess probable diseases related to the NDRG1 gene, indicated a probability of association with prostate cancer, with possible interference in tumor aggressiveness and worse prognosis. The NDRG1 gene was not described as a possible influencer in any thyroid involvement (Yue 2006).

More than 300 missense polymorphisms have been reported in the human NDRG1 gene, and that number is still growing. However, not all of these polymorphisms include amino acid changes, and not all changes significantly affect the structure or function of the corresponding protein. To select those with the greatest potential to cause deleterious effects and influence the pathophysiology of thyroid cancer, we undertook a systematic bioinformatics analysis of NDRG1 SNPs. Our approach illustrates the importance of using several algorithms with different predictive capabilities to estimate the effect of variations at the structural, functional and stability levels. Our results indicate that rs201348291 and rs151322132, two SNPs that are still poorly characterized, are potential markers of malignancy and may be related to the characteristics of thyroid cancer, deserving further validation.

The NDRG1 gene encodes a protein present in the cytosol and expressed in epithelial cells. The expression of NDRG1 is associated with cell growth arrest as well as terminal cell differentiation, but the role of NDRG1 in tumor development and progression is still uncertain and controversial (Mao 2013). While some evidence indicates that its activation has anti-metastatic action, other studies suggest that expression of the protein is related to poorer tumor differentiation, more aggressive metastatic potential and, therefore, worse prognosis (Kovacevic 2006, Gerhard 2010, Mosquera 2013). In fact, NDRG1 gene expression in breast (Mao 2013, Nagai 2011), liver (Mao 2013, Cheng 2011, Akiba 2008), lung (Azuma 2012) and cervical cancer (Nishio 2008) was positively correlated with disease recurrence and considered an indicator of poor prognosis for patient survival (Mao 2013). On the other hand, the expression of NDRG1 in prostate (Song 2010), colon (Strzelczyk 2009) and esophagus tumors (Ando 2006) was associated with a favorable clinical evolution. Data on the role of nsSNPs of the NDRG1 gene in the development of thyroid neoplasms, and on their association with the clinical presentation and outcomes of thyroid nodules, have not yet been described in the literature.

In conclusion, the NDRG1 SNPs rs201348291 and rs151322132 emerge as potential and important candidates for markers of clinical utility in determining the risk and evolution of thyroid neoplasms, in addition to having a possible therapeutic utility, since they can influence the process of pathological development of neoplasm.

Acknowledgments

The authors thank the group of the Laboratory of Cancer Molecular Genetics (GEMOCA) of the School of Medical Sciences. This study received financial support from the São Paulo Research Foundation (FAPESP), grant #2020/16557-8. Author G.B.B. has received research support from FAPESP.

Competing Interests

The authors have no relevant financial or non-financial interests to disclose.

Data Availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

AbdulAzeez S, Borgio JF. In-Silico Computing of the Most Deleterious nsSNPs in HBA1 Gene. PloS one. 2016;11(1):e0147702.
Ai R, Sun Y, Guo Z, Wei W, Zhou L, Liu F, Hendricks DT, Xu Y, Zhao X. NDRG1 overexpression promotes the progression of esophageal squamous cell carcinoma through modulating Wnt signaling pathway. Cancer biology & therapy 2016; 17, 943–954.
Akiba J, Ogasawara S, Kawahara A, Nishida N, Sanada S, et al. (2008) N-myc downstream regulated gene 1 (NDRG1)/Cap43 enhances portal vein invasion and intrahepatic metastasis in human hepatocellular carcinoma. Oncol Rep 20: 1329–1335.
Ando T, Ishiguro H, Kimura M, Mitsui A, Kurehara H, et al. (2006) Decreased expression of NDRG1 is correlated with tumor progression and poor prognosis in patients with esophageal squamous cell carcinoma. Dis Esophagus 19: 454–458
Azuma K, Kawahara A, Hattori S, Taira T, Tsurutani J, et al. (2012) NDRG1/ Cap43/Drg-1 may predict tumor angiogenesis and poor outcome in patients with lung cancer. J Thorac Oncol 7: 779–789.
Bandyopadhyay S, Pai SK, Gross SC, Hirota S, Hosobe S, Miura K, et al. The Drg-1 gene suppresses tumor metastasis in prostate cancer. Cancer research. 2003 Apr;63(8):1731–6.
Bao L, Zhou M, Cui Y. nsSNPAnalyzer: identifying disease-associated nonsynonymous single nucleotide polymorphisms. Nucleic acids research. 2005 Jul;33(Web Server issue):W480-2.
Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics (Oxford, England). 2005 Jan;21(2):263–5.
Bendl J, Stourac J, Salanda O, Pavelka A, Wieben ED, Zendulka J, et al. PredictSNP: Robust and Accurate Consensus Classifier for Prediction of Disease-Related Mutations. PLOS Computational Biology [Internet]. 2014 Jan 16;10(1):e1003440. Available from: https://doi.org/10.1371/journal.pcbi.1003440
Bromberg Y, Rost B. SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic acids research. 2007;35(11):3823–35.
Capriotti E, Calabrese R, Casadio R. Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics (Oxford, England). 2006 Nov;22(22):2729–34.
Capriotti E, Fariselli P, Casadio R. I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure. Nucleic acids research. 2005 Jul;33(Web Server issue):W306-10.
Chang CC, Chang YS, Huang HY, Yeh KT, Liu TC, Chang JG. Determination of the mutational landscape in Taiwanese patients with papillary thyroid cancer by whole-exome sequencing. Hum Pathol. 2018 Aug;78:151–158.
Cheng J, Xie HY, Xu X, Wu J, Wei X, et al. (2011) NDRG1 as a biomarker for metastasis, recurrence and of poor prognosis in hepatocellular carcinoma. Cancer Lett 310: 35–45.
Chua M-S, Sun H, Cheung ST, Mason V, Higgins J, Ross DT, et al. Overexpression of NDRG1 is an indicator of poor prognosis in hepatocellular carcinoma. Modern pathology: an official journal of the United States and Canadian Academy of Pathology, Inc. 2007 Jan;20(1):76–83.
Chun-Chi Chang, Ya-Sian Chang, Hsi-Yuan Huang, Kun-Tu Yeh, Ta-Chih Liu, Jan-Gowth Chang, Determination of the mutational landscape in Taiwanese patients with papillary thyroid cancer by whole-exome sequencing.. Yhupa (2018), doi:10.1016/j.humpath.2018.04.02
Collins FS, Brooks LD, Chakravarti A. A DNA polymorphism discovery resource for research on human genetic variation. Genome research. 1998 Dec;8(12):1229–31.
Dabhi B, Mistry KN. In silico analysis of single nucleotide polymorphism (SNP) in human TNF-α gene. Meta Gene. 2014 Dec 1;2:586–95.
Elkhattabi L, Morjane I, Charoute H, Amghar S, Bouafi H, Elkarhat Z, et al. In Silico Analysis of Coding/Noncoding SNPs of Human RETN Gene and Characterization of Their Impact on Resistin Stability and Structure. Peterson JM, editor. Journal of Diabetes Research [Internet]. 2019;2019:4951627. Available from: https://doi.org/10.1155/2019/4951627
Ellen TP, Ke Q, Zhang P, Costa M. NDRG1, a growth and cancer related gene: regulation of gene expression and function in normal and disease states. Carcinogenesis. 2008 Jan;29(1):2–8.
Gerhard R, Nonogaki S, Fregnani JHTG, Soares FA, Nagai MA. NDRG1 protein overexpression in malignant thyroid neoplasms. Vol. 65, Clinics. scielo ; 2010. p. 757–62.
Jr J. Using Stata with PHASE and Haploview: Commands for Importing and Exporting Data. Stata Journal. 2010 Sep 1;10:359–68.
Hao X, Zhang S, Hu W, Li J, Sun J, Zheng M. Characterization of the prognostic values of the NDRG family in gastric cancer. Therapeutic Advances in Gastroenterology. 2019 Jul 1;12:175628481985850.
Kovacevic Z, Richardson DR. The metastasis suppressor, Ndrg-1: a new ally in the fight against cancer. Carcinogenesis. 2006 Dec;27(12):2355–66.
Mao Z, Sun J, Feng B, Ma J, Zang L, Dong F, et al. The Metastasis Suppressor, N-myc Downregulated Gene 1 (NDRG1), Is a Prognostic Biomarker for Human Colorectal Cancer. PLoS ONE. 2013;8(7):e68206.
Mosquera JM, Beltran H, Park K, MacDonald TY, Robinson BD, Tagawa ST, et al. Concurrent AURKA and MYCN gene amplifications are harbingers of lethal treatment-related neuroendocrine prostate cancer. Neoplasia (New York, NY). 2013 Jan;15(1):1–10.
Nagai MA, Gerhard R, Fregnani JH, Nonogaki S, Rierger RB, et al. (2011) Prognostic value of NDRG1 and SPARC protein expression in breast cancer patients. Breast Cancer Res Treat 126: 1–14.
Nishio S, Ushijima K, Tsuda N, Takemoto S, Kawano K, et al. (2008) Cap43/ NDRG1/Drg-1 is a molecular target for angiogenesis and a prognostic indicator in cervical adenocarcinoma. Cancer Lett 264: 36–43.
Ng PC, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic acids research. 2003 Jul;31(13):3812–4.
Rajasekaran R, Sudandiradoss C, Doss CGP, Sethumadhavan R. Identification and in silico analysis of functional SNPs of the BRCA1 gene. Genomics. 2007 Oct 1;90(4):447–52.
Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, et al. dbSNP: the NCBI database of genetic variation. Nucleic acids research. 2001 Jan;29(1):308–11.
Song Y, Oda Y, Hori M, Kuroiwa K, Ono M, et al. (2010) N-myc downstream regulated gene-1/Cap43 may play an important role in malignant progression of prostate cancer, in its close association with E-cadherin. Hum Pathol 41: 214–222.
Stone EA, Sidow A. Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity. Genome research. 2005 Jul;15(7):978–86.
Strzelczyk B, Szulc A, Rzepko R, Kitowska A, Skokowski J, et al. (2009) Identification of high-risk stage II colorectal tumors by combined analysis of the NDRG1 gene expression and the depth of tumor invasion. Ann Surg Oncol 16: 1287–1294.
Sun J, Zhang D, Bae D-H, Sahni S, Jansson P, Zheng Y, et al. Metastasis suppressor, NDRG1, mediates its activity through signaling pathways and molecular motors. Carcinogenesis. 2013 Sep;34(9):1943–54.
Tang H, Thomas PD. PANTHER-PSEP: predicting disease-causing genetic variants using position-specific evolutionary preservation. Bioinformatics (Oxford, England). 2016 Jul;32(14):2230–2.
Yue P, Melamud E, Moult J. SNPs3D: Candidate gene and SNP selection for association studies. BMC Bioinformatics. 2006;7:166. Accessed 21 July 2021. Available from: http://www.snps3d.org/modules.php?name=Candidate&disease=THYROID&uid=21/07/31/14:28:17:3
Zhang X, Feng B, Zhu F, Yu C, Lu J, Pan M, et al. N-myc downstream-regulated gene 1 promotes apoptosis in colorectal cancer via up-regulating death receptor 4. Oncotarget. 2017 Oct;8(47):82593–608.
