Isolation, Partial Purification and Characterization of a Novel Restriction Enzyme from Pseudomonas anguilliseptica

Type II restriction enzymes (REs), which cleave double-stranded DNA in a sequence-specific manner, have many applications in recombinant DNA technology and are considered the workhorses of molecular biology. Soil and water samples were screened to isolate bacteria harboring restriction enzymes. Cell lysates of the isolated bacteria were incubated with unmethylated λ DNA and analyzed by agarose gel electrophoresis. Distinct banding patterns indicated the presence of REs. Nine putative RE-harboring isolates were characterized morphologically and molecularly using 16S rRNA analysis and belonged to four genera (Acinetobacter, Lysinibacillus, Pseudomonas, and Brevibacillus). A HindIII-like restriction digestion profile was observed in the lysate of a soil bacterium belonging to the genus Pseudomonas. Based on 16S rRNA analysis, the bacterial species was identified as P. anguilliseptica. The enzyme was partially purified, and the optimum conditions for enzyme activity and its recognition sequence were determined. The enzyme showed optimum activity at 40 °C and was stable at 40 °C for 20 minutes without the DNA substrate. The recognition sequence of the enzyme was found to be 5′-AAGCTT-3′, indicating that it is an isoschizomer of HindIII. The whole genome of the Pseudomonas isolate was sequenced, and the coding sequence of the gene for the putative HindIII isoschizomer was identified together with other genes encoding putative REs. The gene coding for the HindIII isoschizomer was analyzed in silico, and its homology and evolutionary relationship to other known isoschizomers of HindIII were determined. The enzyme was tentatively designated PanI.

Type II REs are the most numerous class of REs characterized to date. Over 300 Type II REs, with > 200 different sequence specificities, are commercially available (Loenen et al. 2014b, a). They constitute a very diverse group of proteins in terms of size, amino acid sequence, domain organization, subunit composition, co-factor requirements for enzymatic activity, and reaction mechanisms (Pingoud et al. 2005; Loenen et al. 2014a).
Type II REs are fundamental tools in DNA manipulation and play an important role in genetic engineering and molecular biology (Williams 2003) because they cleave DNA at defined positions close to or within their recognition sequences, producing distinct, reproducible banding patterns on agarose gels (Roberts 1976). Generally, they recognize short, palindromic sequences of 4-8 base pairs (bp) (Pingoud and Jeltsch 2001; Pingoud et al. 2014) and cleave DNA in the presence of Mg2+. Type II REases are the group of enzymes most commonly used in laboratories for gene cloning, DNA analysis, library preparation, diagnostic purposes, etc. They typically range from 250 to 350 amino acids in size and are the simplest and smallest of the REases. REases that recognize the same DNA sequence, irrespective of the site at which they cleave the DNA, are designated 'isoschizomers' (Pingoud et al. 2014), while the first enzyme shown to recognize a given sequence is known as the prototype.
Among the Type II REases, HindIII was isolated from the bacterium Haemophilus influenzae, serotype d (Pingoud et al. 2014), and is used extensively in genetic engineering and molecular biology. Isoschizomers of HindIII reported to date are listed in the REBASE database (Kazennova et al. 1982; Mise and Nakajima 1984). In this study, an isolate of Pseudomonas anguilliseptica was identified from a soil sample and was found to produce an isoschizomer of HindIII. The enzyme was extracted from the bacterium, partially purified, functionally characterized and tentatively designated PanI. Furthermore, the putative gene encoding the enzyme was identified through whole-genome sequencing and characterized using different in silico tools.

Sample collection, Isolation of bacteria from soil samples, Screening isolated bacteria for REs and Characterization of restriction enzyme-producing bacteria.
Soil samples were collected in sterile 50 mL conical tubes from different regions of the country, transported to the laboratory and stored at 4 °C until bacteria were isolated. One gram of each soil sample was vigorously mixed with 5 mL of sterile PBS (1X: 137 mM NaCl, 2.7 mM KCl, 8 mM Na2HPO4, and 2 mM KH2PO4), and a 100 µL aliquot was serially diluted (10⁻¹ to 10⁻¹⁰) in PBS. An aliquot (100 µL) of each dilution was spread on Luria-Bertani (LB) agar plates, incubated at 37 °C, and observed daily for a week. Well-isolated colonies were picked for screening.
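The dilution arithmetic behind this plating scheme can be sketched as follows. This is an illustrative calculation only: the study picked colonies for screening and does not report colony counts, so the count of 42 colonies and the function name `cfu_per_gram` are hypothetical. The volumes (5 mL of PBS per gram of soil, 100 µL plated) are taken from the protocol above.

```python
def cfu_per_gram(colonies, dilution, plated_ml=0.1, suspension_ml=5.0, soil_g=1.0):
    """Estimate colony-forming units per gram of soil from a single plate count.

    colonies      -- colonies counted on the plate (hypothetical input)
    dilution      -- dilution of the plated aliquot, e.g. 1e-4 for the 10^-4 tube
    plated_ml     -- volume spread on the plate (100 uL in the protocol)
    suspension_ml -- PBS volume used to suspend the soil (5 mL in the protocol)
    soil_g        -- mass of soil suspended (1 g in the protocol)
    """
    if colonies < 0 or not 0 < dilution <= 1:
        raise ValueError("colonies must be >= 0 and dilution must be in (0, 1]")
    # Cells per mL of the undiluted soil suspension.
    cfu_per_ml = colonies / (plated_ml * dilution)
    # Scale to the whole suspension, then express per gram of soil.
    return cfu_per_ml * suspension_ml / soil_g


if __name__ == "__main__":
    # Hypothetical example: 42 colonies on the plate spread from the 10^-4 dilution.
    print(f"{cfu_per_gram(42, 1e-4):.2e} CFU per gram of soil")
```

Plates from dilutions that give well-separated colonies are the ones suitable for picking isolates, which is why the full 10⁻¹ to 10⁻¹⁰ series is plated.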
Well-isolated colonies were separately inoculated into 5 mL of LB broth and grown overnight. The cultures were then centrifuged at 3500 rpm for 30 minutes, and the cell pellets were washed with TME buffer (50 mM Tris-HCl, 20 mM MgCl2 and 0.1 mM EDTA, pH 7.5) and centrifuged again under the same conditions. The cell pellets were then resuspended in 500 µL of TME buffer containing 0.5 µL of β-mercaptoethanol. The cell suspensions were disrupted by cold sonication (three 2-second pulses with 15-second intervals) and centrifuged (18,000 × g, 3 min, 4 °C), and the supernatant was carefully separated.
The supernatant of each cell lysate was then screened for REs as follows. A 30 µL reaction mixture (digestion mixture) containing 5.0 µL of cell lysate, 1X Multicore (MC) buffer (Promega), 0.5 µg of unmethylated λ DNA and deionized water was incubated at 37 °C for 3 h. Aliquots (15 µL) of the reaction mixtures were electrophoresed on a 0.8% agarose gel and visualized under a UV transilluminator for the presence of distinct banding patterns.
Several isolates displayed distinct DNA banding patterns on agarose gel electrophoresis. The colony and cellular morphology, including the shape, height, margin, surface refraction, opacity and color of the bacterial colony, was observed as described elsewhere (Bhumbla and Bhumbla 2018). An isolate designated 'MatS1' was selected for further analysis.
The MatS1 isolate was identified at the species level by 16S rRNA gene sequence analysis, using the universal primers RNABR1 (AGAGTTTGATCCTGGCTCAG) and RNABR2 (AAGGAGGTGATCCAGCC) in a standard PCR amplification reaction (Weisburg et al. 1991) followed by sequence verification of the amplicon.
2.2 Partial purification of MatS1 lysate, Determination of the recognition sequence, Endonuclease activity assays, Enzyme activity in commercially available buffers, Determination of optimum temperature, Thermal stability of the partially purified enzyme.
Partial purification of the MatS1 lysate by size-exclusion filtration was carried out using molecular weight cut-off filters (Vivaspin 6, Sigma-Aldrich) according to the manufacturer's instructions. Briefly, a cell-free lysate of MatS1 was prepared as described above for screening (Sect. 2.3), 2 mL was filtered through a 50 kDa molecular weight cut-off filter, and the enzyme activity of the retentate and filtrate was determined. The filtrate was then passed through a 30 kDa molecular weight cut-off filter, and enzyme activity was assayed in both the filtrate and the retentate.
To determine the recognition sequence, λ DNA (0.5 µg) was incubated with 5.0 µL of MatS1 lysate at 37 °C for 3 h. Several reactions were carried out, and the digestion mixtures were pooled, phenol-chloroform extracted, precipitated and resuspended in TE buffer (pH 7.9). An aliquot of this DNA was treated with Taq polymerase (5 U) in a reaction mixture containing 2 mM dNTPs and 1X PCR buffer for 20 minutes at 72 °C to fill in any 3′ recessed ends and to add an 'A' residue to the 3′ ends of the DNA fragments to facilitate TA cloning. The DNA fragments were then resolved on an agarose gel, and fragments smaller than 2000 bp were excised from the gel. Excised fragments were purified using the Wizard® SV Gel and PCR Clean-Up System (Promega) according to the manufacturer's instructions. Purified fragments were then ligated into the pGEM-T Easy vector (Promega) and transformed into E. coli (JM109) competent cells. Recombinant plasmid DNA was prepared from recombinant clones, and the ends of the inserts were sequenced using the dideoxy method (Macrogen, Korea).
Activity assays were carried out using the partially purified enzyme (5 µL) and unmethylated λ DNA (0.7 µg) as the substrate in a 30 µL reaction volume for 2 h. The cleavage products were then resolved on a 0.8% agarose gel, and the relative activities were evaluated based on the intensities of the DNA bands.
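One way to turn band intensities into relative activities is sketched below. It is a minimal illustration, not the authors' procedure: the text does not state how intensities were normalized, so scaling to the condition with the highest mean intensity is an assumption, and the condition names and intensity values are hypothetical.

```python
from statistics import mean, stdev


def relative_activity(intensities_by_condition):
    """Return {condition: (mean %, SD %)} from replicate band intensities.

    Assumption: activity is expressed relative to the condition with the
    highest mean intensity, which is set to 100 %.
    """
    means = {cond: mean(vals) for cond, vals in intensities_by_condition.items()}
    reference = max(means.values())
    if reference <= 0:
        raise ValueError("at least one condition must have a positive intensity")
    result = {}
    for cond, vals in intensities_by_condition.items():
        percents = [100.0 * v / reference for v in vals]
        # Sample SD needs at least two replicates; report 0.0 otherwise.
        spread = stdev(percents) if len(percents) > 1 else 0.0
        result[cond] = (mean(percents), spread)
    return result


if __name__ == "__main__":
    # Hypothetical densitometry readings (arbitrary units), three replicates each.
    readings = {
        "condition_1": [1180.0, 1215.0, 1198.0],
        "condition_2": [860.0, 905.0, 842.0],
    }
    for cond, (pct, sd) in relative_activity(readings).items():
        print(f"{cond}: {pct:.1f} % (SD {sd:.1f})")
```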
Enzyme activity in several commercially available buffers (Promega, Table 1) was determined as follows. λ DNA (0.7 µg) was digested with 5.0 µL of the purified lysate in the respective buffer (1X) for 2 h at 37 °C in a 30 µL reaction mixture. The resulting DNA fragments were then resolved on a 0.8% agarose gel, and the intensity of the resolved bands was determined using ImageJ software (https://imagej.nih.gov/ij/download.html). The relative enzyme activity was determined based on the intensities of the cleaved DNA fragments.
A homolog coding for a HindIII-like endonuclease was identified by using BLAST to align the contiguous sequences of the MatS1 whole genome against the NCBI GenBank sequence database. The corresponding amino acid sequence of the homolog was obtained using Unipro UGENE software (Okonechnikov et al. 2012), and substrate (DNA) binding sites were predicted by the COACH online server (Yang et al. 2013). The predicted protein sequence was compared with its homologs using the Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) and EMBOSS Needle (https://www.ebi.ac.uk/Tools/psa/emboss_needle/) sequence alignment programs.
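The percent identity reported by pairwise alignment tools can be illustrated with a short calculation over two already-aligned sequences. This is a sketch under stated assumptions: identity is counted over the full alignment length including gap columns, the sequences shown are toy strings rather than PanI or its homologs, and percent similarity is not computed because it additionally depends on a substitution matrix.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Identical, non-gap columns are counted and divided by the alignment
    length (gap columns included in the denominator).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Toy alignment (not real PanI data): 4 identical columns out of 6.
    print(f"{percent_identity('MKT-LV', 'MKSALV'):.1f} % identity")
```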
With the objective of investigating the evolutionary relationship with its counterpart molecules, phylogenetic analysis of the identified sequence (MatS1-HindIII) was carried out using the neighbor-joining method with bootstrap values taken from 1000 replicates, using Molecular Evolutionary

Results And Discussion
When the cell-free extracts of the bacterial isolates were separately incubated with unmethylated λ DNA (Promega), one isolate produced a banding pattern similar to that of λ DNA cleaved with commercially available HindIII (Fig. 1). This isolate was designated MatS1 and selected for further analysis.
MatS1 was found to be a Gram-negative, rod-shaped bacterium. Its colonies were dark orange, sticky, shiny, circular and raised, with an entire margin, and showed swarming. BLAST analysis of the 16S rRNA gene sequence confirmed the isolate to be a strain of Pseudomonas anguilliseptica, as evidenced by 99.6% sequence identity with Pseudomonas anguilliseptica strain VITEPRRL6 (GenBank ID: KR149276). Therefore, the HindIII-like enzyme of MatS1 was tentatively designated PanI. This is the first report of a HindIII-like enzyme characterized from the genus Pseudomonas.
Partial purification of the MatS1 lysate was carried out by size exclusion using molecular weight cut-off filters. λ DNA cleavage by lysate fractionated with the 50 and 30 kDa molecular weight cut-off filters displayed two different banding patterns (Fig. 2). A banding pattern similar to that of λ DNA cleaved by HindIII was observed with the retentate of the 30 kDa filter and the filtrate of the 50 kDa filter, suggesting that the molecular weight of the HindIII-like enzyme of MatS1 lies between 30 and 50 kDa. Moreover, the cleavage pattern of λ DNA observed with the retentate of the 50 kDa filter indicates the presence of other restriction enzymes with different specificities and molecular weights above 50 kDa. This observation was supported by the whole-genome sequence analysis of MatS1, which indicated the presence of other genes encoding putative restriction enzymes.
The recognition sequence of the RE of MatS1 (PanI) was determined by cloning and sequencing the ends of the fragments produced by digestion of λ DNA with partially purified PanI. The recognition sequence was found to be 5′-AAGCTT-3′.
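The relationship between a recognition sequence and the resulting banding pattern can be sketched with a small in silico digest. The following is illustrative only: the input is a toy sequence rather than λ DNA, and the cut position (after the first A, as for HindIII, A^AGCTT) is assumed for PanI because the text establishes the recognition sequence but not the cleavage position. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """True if the site reads the same 5'->3' on both strands."""
    return site.upper() == reverse_complement(site)


def digest_linear(seq, site="AAGCTT", cut_offset=1):
    """Fragment lengths from cutting a linear sequence at every occurrence of site.

    cut_offset is the number of bases of the site left on the upstream fragment
    (1 for A^AGCTT, the HindIII cut position; an assumption for PanI).
    """
    seq, site = seq.upper(), site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy = "GGGAAGCTTCCCCAAGCTTGG"  # toy sequence with two AAGCTT sites
    print(is_palindromic("AAGCTT"))  # the site is its own reverse complement
    print(digest_linear(toy))        # fragment lengths, 5' to 3'
```

Applied to a full substrate sequence, the sorted fragment lengths are what would be compared against the bands of a λ/HindIII marker lane.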
Partially purified PanI was active in most of the commercially available buffers, including buffer B, which is the optimum buffer for commercially available HindIII (Promega). However, optimum activity of PanI was observed in Multicore (MC) buffer (Fig. 3), which contains 10 mM Mg2+ as the divalent cation and 100 mM K+ as the monovalent cation at a pH of 7.5 (Table 1).
The optimum temperature of the partially purified enzyme appeared to lie between 37 °C and 50 °C (Fig. 4). The highest band intensity was observed at 40 °C. Above 70 °C the activity decreased, although detectable activity was observed even at 80 °C, suggesting that the enzyme remains functional at extreme temperatures. EcoVIII, a previously reported isoschizomer of HindIII, showed optimal activity at 48 °C, which is consistent with our observations (Mise and Nakajima 1984).
To determine the thermal stability of PanI in the absence of the DNA substrate, the partially purified enzyme was pre-incubated at different temperatures. The enzyme was found to be stable up to 40 °C for 20 minutes without the DNA substrate (Fig. 5).
Whole-genome sequence analysis of MatS1 revealed a complete coding sequence of a gene with a high degree of similarity to known genes encoding HindIII family enzymes reported in the NCBI GenBank database, further supporting the conclusion that the partially purified PanI of MatS1 is an isoschizomer of HindIII. The sequence information of PanI was deposited in the NCBI GenBank database (GenBank ID: MW140018). The complete putative ORF of PanI is 912 bp and encodes a protein of 304 amino acids with a predicted molecular weight of ~34.6 kDa and a theoretical isoelectric point of 6.18. This predicted molecular weight agrees with the empirically estimated molecular weight range (30-50 kDa) of partially purified PanI. The protein sequence has a region that resembles the partially conserved HindIII endonuclease superfamily signature (residues 13-292) and several conserved substrate (DNA) binding sites, as predicted by the COACH online server (Fig. 8; Yang et al. 2013). Based on pairwise sequence alignment, PanI shared significant sequence relatedness with its bacterial counterparts, with a maximum similarity of 89.1% and identity of 76.3% with the homolog from Cylindrospermopsis raciborskii, validating its homology to known HindIII counterparts (Table 2). In the phylogenetic reconstruction, PanI clustered with known bacterial HindIII family enzymes (Fig. 7). According to the tree topology, PanI clustered closely with Pseudomonas HindIII homologs. However, its closest evolutionary relationship was to its counterpart in C. raciborskii, with which it formed a sub-clade with high bootstrap support (85) within the main cluster that also harbors the HindIII family enzymes of Pseudomonas species.
This pattern of clustering suggests a possible horizontal gene transfer event between P. anguilliseptica and C. raciborskii involving the gene encoding the HindIII-like protein. This possibility is reinforced by the higher sequence identity of PanI with the C. raciborskii HindIII family protein than with those from the other two Pseudomonas species (Table 2). However, further investigation is warranted to validate this possibility.
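The consistency check between the ORF length, the protein size and the filtration window can be written out as simple arithmetic. This is a rough sketch, not the method used to obtain the ~34.6 kDa prediction: it relies on the common rule of thumb of about 110 Da per amino acid residue (an assumption), and the function names are hypothetical. A sequence-based calculation, as used for the reported value, will differ somewhat from this estimate.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average mass per residue (assumption)


def codons_in_orf(orf_bp):
    """Number of codons in an ORF whose length is a multiple of three."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_bp // 3


def approx_mass_kda(n_residues):
    """Approximate protein mass in kDa from the residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def within_window(mass_kda, low_kda=30.0, high_kda=50.0):
    """True if a mass falls inside the window implied by the cut-off filters."""
    return low_kda <= mass_kda <= high_kda


if __name__ == "__main__":
    codons = codons_in_orf(912)          # ORF length reported in the text
    estimate = approx_mass_kda(304)      # residue count reported in the text
    print(codons, f"{estimate:.2f} kDa", within_window(estimate), within_window(34.6))
```

Both the rule-of-thumb estimate and the reported prediction fall inside the 30-50 kDa window defined by the retentate of the 30 kDa filter and the filtrate of the 50 kDa filter.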
The I-TASSER online server predicted the tertiary structure of PanI based on 10 threading templates identified from the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank. The normalized Z-scores of the threading alignments ranged from 2.27 to 11.92, supporting the credibility of each alignment. The most reliable model, with substantial global accuracy as measured by a C-score of 1.00, an estimated TM-score of 0.85 ± 0.08 and an RMSD of 4.2 ± 2.8 Å, was selected for visualization in PyMOL.
According to the generated model, PanI consists of 15 α-helices and one β-pleated sheet with five strands (Fig. 8A). In comparison, the monomeric form of the empirically determined crystal structure of HindIII is made up of 16 α-helices and two β-pleated sheets, one with two strands and the other with five strands (Watanabe et al. 2009). Thus, comparison of these two structures suggests that the five-stranded β-pleated sheet is conserved in PanI. Moreover, in agreement with the HindIII crystal structure, the first strand of the five-stranded β-sheet is oriented parallel to the fifth strand in the predicted tertiary structure of PanI. As part of the computer-based simulation performed by I-TASSER, potential DNA substrate binding sites were predicted from the modeled tertiary structure of PanI, along with the three-dimensional structure of the enzyme-cognate DNA complex, using the COBOLT and COFACTOR algorithms (Yang et al. 2013).
The most reliable prediction, with the highest C-score (0.55) and cluster size (35), consists of 8 potential DNA-binding residues, namely Ser-31, Thr-69, Asp-67, Lys-72, Ala-127, Asn-129, Lys-131 and Lys-284, each with a binding probability of over 0.5 (Fig. 8B). The repeated occurrence of the positively charged amino acid Lys in the binding site indicates potentially strong interactions between the negatively charged DNA and the enzyme through the formation of ionic bonds. As expected, the cognate stretch of DNA that overlaps the active site of the enzyme was predicted to bear the consensus HindIII recognition sequence 'AAGCTT', supporting the empirically determined recognition sequence of PanI.

Conclusions
REs are powerful tools used in molecular biology and genetic engineering. In this study, several soil and water samples were screened to isolate restriction enzyme-producing bacteria. Potent HindIII-like activity was observed in the cell-free extract of an isolate designated MatS1, which was identified as a novel strain of Pseudomonas anguilliseptica through 16S rRNA analysis. The restriction enzyme isolated from this organism was designated PanI. The enzyme was partially purified and characterized with respect to its recognition sequence and the optimum reaction conditions for DNA digestion. The recognition sequence was found to be 5′-AAGCTT-3′, revealing PanI to be an isoschizomer of HindIII. The whole genome of MatS1 was sequenced, and the gene for PanI was identified and characterized.

Declarations

Figure 3 Determination of the activity of partially purified PanI in different commercially available buffers (Table 1). Percentage (%) activity of the partially purified PanI in different commercially available buffers was determined based on the intensities of the cleaved DNA fragments resolved on a 0.8% agarose gel and analysed using ImageJ software. Error bars represent SD (n = 3).

Figure 4 Determination of the optimum temperature of partially purified PanI. Percentage (%) activity of the partially purified enzyme at temperatures from 0 to 90 °C was determined by analysing the intensities of the cleaved DNA fragments resolved on a 0.8% agarose gel using ImageJ software. Error bars represent SD (n = 3).

Figure 5 Determination of the thermal stability of the partially purified enzyme. Lane 1: λ/HindIII marker; Lanes 2 to 9: fragments of λ DNA produced by restriction digestion reactions performed at 40 °C using partially purified PanI that had been pre-incubated at 0, 20, 37, 40, 50, 60, 70, 80 and 90 °C for 20 minutes, respectively.

Figure 6 Multiple protein sequence alignment of selected bacterial HindIII-like REs including PanI. Completely conserved residues are denoted by (*) and highlighted in gray, and partially conserved residues are denoted by (:). The HindIII superfamily signature is boxed on MatS1 HindIII. Blue arrows depict the predicted substrate-binding sites.

Table 2 Figure 8 In