Genome Mining, Phylogenetic and Structural Analysis of Bacterial Nitrilases for the Biodegradation of Nitrile Compounds

Microbial nitrilases play a vital role in the biodegradation of nitrile-containing contaminants in the treatment of pollutants and effluents from the chemical and textile industries, as well as in the biosynthesis of IAA from tryptophan in plants. However, the lack of structural information hinders the correlation of their activity with substrate specificity. Here, we screened bacterial genomes for nitrilases with unassigned functions, including those annotated as hypothetical, uncharacterized, or putative. The genomic annotations revealed four predicted nitrilase-encoding genes belonging to an uncharacterized subgroup of the nitrilase superfamily. Further, the annotation of these nitrilases revealed relatedness with nitrilase hydratases and cyanoalanine hydratases. Motif analysis of these protein sequences predicted a single motif of 20-28 aa, with glutamate (E), lysine (K) and cysteine (C) residues forming the catalytic triad along with several active site residues. Structural analysis of the modeled nitrilases revealed a geometrical, closely packed conformation of α-helices and β-sheets arranged in a sandwich structure. The catalytic residues constituted the substrate binding pocket and exhibited a wide nitrile substrate spectrum covering both aromatic and aliphatic nitrile-containing compounds. The aromatic amino acid residue Y159 in the active site was predicted to be important for substrate specificity. Substitution of Y159 with the non-aromatic residue alanine completely disrupted the catalytic activity for indole-3-acetonitrile. The present study reports several uncharacterized nitrilases that have not so far been described for their role in the biodegradation of pollutants and xenobiotics and that could find applications in industry.


Introduction
Nitrile (−C≡N) containing organic compounds such as cyanoglycosides, cyanolipids, ricinine and phenylacetonitrile, along with other synthetic derivatives, are abundant in the environment (Gong et al., 2012). Most of these compounds are toxic, mutagenic and carcinogenic due to the presence of the cyano group in their structure. Enzymes that catalyze the transformation of these nitrile compounds into their corresponding amides and carboxylic acids, such as nicotinic acid, glycolic acid and others, are used in industrial production (Shaw et al., 2003; Mylerova and Martinkova, 2003; Banerjee et al., 2004; Panova et al., 2007). The biodegradation of toxic nitrile compounds from environmental wastes and contaminants using nitrile-converting enzymes is one of the efficient methods to convert them into their corresponding acid and ammonia (Gong et al., 2012). Two pathways for the enzymatic breakdown of nitriles have been identified. The first pathway involves nitrilases (E.C. 3.5.5.1), which catalyse the transformation of nitrile compounds to their corresponding acids. The second pathway involves nitrile hydratases, which convert nitriles into amides, followed by amidases (E.C. 3.5.1.4), which hydrolyse the amides into ammonia and carboxylic acids. The respective enzymes for both pathways have been reported from bacteria, fungi and plants (Egelkamp et al., 2017; Martínková et al., 2009; Rustler and Stolz, 2007; Piotrowski, 2008).
For the transformation of toxic nitrile compounds from environmental wastes, nitrilases are preferred over chemical methods due to their ecofriendly, efficient and cost-effective nature (Kumar et al., 2011). In addition, nitrilases are of particular interest due to their potential use in food additives (Tang et al., 2019), surface modification of polyacrylonitrile (PAN) fibers (Zhua et al., 2013), the pharmaceutical and agrochemical sectors (Zhang et al., 2019), degradation of bromoxynil, ioxynil and acrylonitrile (Shen et al., 2020) in contaminated waste bodies such as those near gold mining and electroplating industries, and production of indole acetic acid (IAA) from tryptophan (Salwan et al., 2020). With the advent of genome and metagenome sequencing, over 5000 nitrilase-encoding genes have now been deposited in the GenBank database (Shen et al., 2020). Although nitrilases were first reported from plants, several have since been characterized from bacteria and fungi (Thimann and Mahadevan, 1964; Hook and Robinson, 1964; Gong et al., 2012). Most bacterial nitrilases have been explored for their role in chemical synthesis or degradation, but the biological role of nitrilases is largely unknown (DeSantis et al., 2002). Moreover, the lack of structural data limits the correlation of amino acid sequence, specificity and activity of nitrilases towards substrates (Podar et al., 2005). Therefore, considering the importance of nitrilases, the present study examines the distribution and diversity of bacterial nitrilases at the genomic level, their phylogenetic and structural characterization for motifs and domains, and their interaction with substrates. Here, we describe the in silico annotation of nitrilases followed by structure prediction of promising biocatalysts to propose their role in the degradation of aliphatic and aromatic nitrile-containing compounds.

Analysis of amino acid sequence
The amino acid sequences were subjected to BLASTp analysis. Sequences with unassigned functions, including those annotated as uncharacterized, putative or hypothetical, were selected and compared in order to identify new nitrilases. The amino acid composition of these nitrilases was analyzed to determine molecular weight, pI, the content of acidic, basic, polar and non-polar amino acids, as well as the GRAVY and aliphatic index, using the Expasy ProtParam tool.
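The GRAVY and aliphatic index reported by ProtParam follow simple published formulas, so they can be reproduced directly from a sequence. The following is a minimal sketch, assuming the Kyte-Doolittle hydropathy scale for GRAVY and Ikai's coefficients for the aliphatic index; the function names are illustrative and the code is not part of the original analysis.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue."""
    seq = seq.upper()
    counts = Counter(seq)
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * counts[aa] / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))
```

A negative GRAVY indicates a protein that is hydrophilic on average, and a higher aliphatic index reflects a larger relative volume of aliphatic side chains.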

Phylogenetic analysis
The neighbor-joining phylogenetic tree was constructed using MEGA 7 (Kumar et al., 2016). Multiple alignment of the sequences retrieved from the NCBI database was performed using the CLUSTALW program.
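For readers unfamiliar with the method, the neighbor-joining procedure can be sketched in a few lines. The code below is an illustrative implementation, not the MEGA 7 code path: it assumes a simple p-distance (the text does not state the substitution model used), breaks ties by taking the first minimal pair, and uses hypothetical function names.

```python
def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned sequences,
    skipping columns where either sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def neighbor_joining(names, dist):
    """Neighbor joining on a symmetric distance matrix (needs >= 2 taxa).

    Returns (joins, final_edge): joins is a list of
    (node_a, node_b, branch_len_a, branch_len_b) in the order the pairs
    were merged; final_edge is (node_a, node_b, length) for the last two nodes.
    """
    nodes = list(names)
    d = {a: {b: float(dist[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        total = {a: sum(d[a][b] for b in nodes) for a in nodes}
        # Pick the pair that minimizes the Q criterion
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - total[a] - total[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to the new internal node
        len_a = 0.5 * d[a][b] + (total[a] - total[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        joins.append((a, b, len_a, len_b))
        # Distances from the new node to every remaining node
        new = f"({a},{b})"
        d[new] = {new: 0.0}
        for c in nodes:
            if c != a and c != b:
                d[new][c] = d[c][new] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c != a and c != b] + [new]
    a, b = nodes
    return joins, (a, b, d[a][b])
```

In practice the distance matrix would be filled by calling `p_distance` (or a model-corrected distance) on every pair of rows of the CLUSTALW alignment, and bootstrap support would be obtained by resampling alignment columns and repeating the procedure.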

Structural features
The catalytic domain of each nitrilase was modeled using the user-template approach, based on nitrilase-2 from Mus musculus (PDB code 2W1V_A), in SWISS-MODEL (http://swissmodel.expasy.org/), and 3D structures of the predicted models were prepared in PyMOL (Schwede et al., 2003). The quality of the final models was assessed with QMEANclust (Benkert et al., 2009). PROCHECK analysis of the 3D models was performed to check their stereochemical quality. The secondary structures were predicted using the POLYVIEW server (http://polyview.cchmc.org/).

Annotation of genomes for nitrilases
In this study, a total of 16 genomes were annotated using the RAST server for nitrilase-encoding genes. Genome size ranged from 1.7 to 9.15 Mbp, with GC content from 35.2 to 70%. The total number of predicted genes per genome varied from 1900 in Pyrococcus horikoshii UBA8834 to 9063 in Bradyrhizobium diazoefficiens SEMIA 5080 (Table 1). Genome mining revealed the highest number of nitrilase-encoding genes in Bradyrhizobium diazoefficiens SEMIA 5080, followed by Acidovorax sp. MR-S7, A. oryzae RS-1, Acinetobacter sp. ATCC27244, Cupriavidus necator B9, C. necator 5, Rhodococcus opacus PD630, Sphingomonas sp. S17, Sphingomonas sp. LH128 and Streptomyces AC40, whereas the genomes of Bacillus cereus W, B. cereus G9241, Bacillus sp. SBA12, Pyrococcus horikoshii UBA8834, Rhodococcus rhodochrous BKS6-46, R. rhodochrous TRN7, Sphingomonas sp. KC8 and Streptomyces sp. AC30 lack nitrilase-encoding genes (Table 1). The nitrilase predicted in Streptomyces AC40 was only 119 aa long, possibly because the genome sequence is a draft (Salwan et al., 2020). Further, identification of the amino acid sequences grouped the nitrilases into C-N hydrolase, amidase, glutamine amidotransferase, nitrile hydratase and periplasmic nitrile proteins (Table S1). Among these, four nitrilases showing identity to the uncharacterized subgroup of the nitrilase superfamily were characterized to identify newer sources of nitrilases. These nitrilases ranged in size from 208 to 345 amino acids and shared 24-37% sequence identity with various nitrilases (Table 2). Further, the evolutionary and phylogenetic history of the nitrilases revealed relatedness of the uncharacterized nitrilases AcNit, As7nit, CnB9 and Cn5 to nitrilases belonging to the C_N hydrolase domain, with which they grouped as a separate cluster (Fig. 1). The comparative amino acid composition revealed molecular weight and pI in the ranges ~24-54 kDa and 4.76-7.81, respectively, which are close to those of previously characterized aliphatic or aromatic nitrilases.
The lower pI value is probably due to the higher content of acidic amino acids (30%) and lower content of basic amino acids (25%); the GRAVY and aliphatic index were −0.032 and 93, respectively. Besides this, a greater number of negatively charged amino acids on the surface, a higher content of non-polar (55%) and polar (22%) amino acids, and fewer Pro (4%) and more Gly (9%) residues in the nitrilases may provide flexibility to the protein structures.
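The link between acidic and basic residue content and pI can be illustrated with a small charge model. The sketch below estimates net charge with the Henderson-Hasselbalch relation and finds the pI by bisection. The pKa values are rounded, textbook-style assumptions, not the set used by ProtParam, so absolute values will differ from those reported here; only the trend is meant to carry over.

```python
# Assumed, rounded pKa values for illustration only (not ProtParam's set)
PKA_POSITIVE = {"Nterm": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"Cterm": 2.0, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def net_charge(seq, ph):
    """Approximate net charge of a peptide at a given pH."""
    seq = seq.upper()
    # Protonated (positively charged) fraction of basic groups
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE["Nterm"]))
    for aa in "KRH":
        positive += seq.count(aa) / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
    # Deprotonated (negatively charged) fraction of acidic groups
    negative = 1.0 / (1.0 + 10 ** (PKA_NEGATIVE["Cterm"] - ph))
    for aa in "DECY":
        negative += seq.count(aa) / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return positive - negative


def isoelectric_point(seq, tol=1e-4):
    """pH at which the net charge is zero, found by bisection on [0, 14].

    Net charge decreases monotonically with pH, so bisection converges.
    """
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Under this model a sequence with more Asp and Glu than Lys and Arg has an acidic pI, in line with the explanation given above for the lower pI values.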

Domain and motif analysis
The domain analysis of nitrilases AcNit, As7Nit, Cn5Nit and Cn9Nit revealed a conserved catalytic domain (Fig. 2b). Catalytic residues were found to be conserved in all the predicted proteins (Table 3). [...] (Fig. 2a). The Ramachandran plot showed 88-92% of amino acid residues in the most favored regions, 7.5-10.4% in additional allowed regions, and 0.4-1.3% in disallowed conformations (Fig. 3). The 3D structures of the four nitrilases were built from the best-matching templates, which share 32-38% identity with nitrilase-2 of Mus musculus (PDB: 2W1V_A), 26% identity with the nitrilase of Staphylococcus aureus subsp. aureus (PDB: 3P8K) and 31% identity with the nitrilase of Nesterenkonia sp. 10004 (PDB: 3HKX_A). The modeling and superimposition of modeled structures AcNit, As7Nit, Cn5Nit and [...] prokaryotes. The residues of the catalytic triad EKC lie far apart in the primary structure and form the substrate binding pocket along with other residues, a characteristic of the nitrilase superfamily (Fig. 4). [...] (Shen et al., 2020). Therefore, the selected nitrilase AcNit contains the catalytic C158, which acts as a nucleophile; E40 activates the sulfhydryl group of C158 by acting as a base, and K113 stabilizes the intermediates formed in the reaction. The conserved residues N96 and E128 also help in providing stability and activating the catalytic triad by forming hydrogen bonds with E40 and K113, respectively.
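Checking that the residues named above sit at the stated positions is a simple lookup once a sequence is in hand. The following sketch assumes 1-based residue numbering as used in the text; the helper names are hypothetical, and the toy sequence is a glycine placeholder built only to exercise the check, not the real AcNit sequence.

```python
# Residues reported for AcNit in the text (1-based positions)
ACNIT_SITES = {40: "E", 96: "N", 113: "K", 128: "E", 158: "C"}


def check_residues(seq, expected):
    """Return {position: (expected, found)} for every mismatch.

    An empty dict means all expected residues are present. Positions
    beyond the end of the sequence are reported with found=None.
    """
    mismatches = {}
    for pos, aa in expected.items():
        found = seq[pos - 1] if 0 < pos <= len(seq) else None
        if found != aa:
            mismatches[pos] = (aa, found)
    return mismatches


def toy_sequence(length, residues):
    """Placeholder sequence of glycines with the given residues inserted."""
    seq = ["G"] * length
    for pos, aa in residues.items():
        seq[pos - 1] = aa
    return "".join(seq)


if __name__ == "__main__":
    placeholder = toy_sequence(200, ACNIT_SITES)
    print(check_residues(placeholder, ACNIT_SITES))
```

Run against each modeled sequence, or against alignment-mapped positions in the homologs, the same check flags any protein in which a catalytic residue is substituted.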

Protein ligand interaction based on docking
Nitrilases are known for hydrolytic activity against aromatic, aliphatic and arylacetonitrile substrates. To investigate whether the aromatic amino acid Y159 in the nitrilase plays an important role in determining substrate specificity, saturation mutagenesis was performed. The introduction of a non-aromatic alanine residue in place of Y159 completely disrupted the catalytic activity for indole-3-acetonitrile (IAN), as indicated by a binding energy of +5.69 kcal/mol for IAN (Fig. 4d). The strength of protein-ligand binding is judged from the estimated binding energy: the lower the binding energy, the stronger the affinity for the substrate. Therefore, the presence of hydrophilic residues in the substrate binding pocket also justifies the activity of the modeled nitrilase AcNit towards aromatic substrates.
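The sign of a binding energy can be made concrete through the standard thermodynamic relation ΔG = RT ln Kd. The sketch below converts a binding free energy into an apparent dissociation constant, assuming the docking score can be read as an estimate of ΔG and a temperature of 298.15 K; neither assumption is stated in the text, and docking energies are approximate scores rather than measured free energies.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_kcal, temp_k=298.15):
    """Apparent Kd (in M, relative to a 1 M standard state) from a binding
    free energy in kcal/mol, using delta_G = R * T * ln(Kd)."""
    return math.exp(delta_g_kcal / (R_KCAL_PER_MOL_K * temp_k))


if __name__ == "__main__":
    # Value reported for the Y159A variant with IAN (Fig. 4d)
    kd = dissociation_constant(5.69)
    print(f"Apparent Kd for +5.69 kcal/mol: {kd:.3g} M")
```

A positive ΔG gives an apparent Kd above 1 M, meaning the complex is unfavorable relative to the unbound state, whereas a negative ΔG gives a Kd below 1 M.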
Genes for xenobiotic degradation, secondary metabolites and clavulanic acid were also predicted. The presence of these genes suggests a role for nitrilases in the biodegradation of pollutants and xenobiotic compounds. It could also prove useful for production.

Conclusions
Owing to their broad substrate specificities, nitrilases have been explored for their role in the transformation of nitrile-containing toxic, mutagenic and carcinogenic organic compounds into their corresponding amide and carboxylic acid derivatives in industrial production. Nitrilases also play a vital role in IAA production through the tryptophan-dependent pathway. Microbes are a promising source of nitrilases and have several industrial, agricultural and environmental applications. Among all the nitrilases studied, AcNit could serve as a strong template for protein engineering aimed at enhanced production and expression for suitable industrial applications. Further characterization and detailed crystallographic studies, together with in vitro analysis, could also demonstrate their biological role and the scope for novel enzymes.
Figure 1 Phylogenetic tree depicting the evolutionary relationships of nitrilases retrieved from NCBI and PDB. The tree was prepared in MEGA version 7.0. Bootstrap values are shown next to branches.