When cell-free extracts of the bacterial isolates were separately incubated with unmethylated lambda (λ) DNA (Promega), one isolate produced a banding pattern similar to that of λ DNA cleaved with commercially available HindIII (Fig. 1). This isolate was designated MatS1 and selected for further analysis.
MatS1 was found to be a Gram-negative, rod-shaped bacterium. Its colonies were dark orange, sticky, shiny, circular and raised, with an entire margin and swarming growth. BLAST analysis of the 16S rRNA gene sequence confirmed the isolate to be a strain of Pseudomonas anguilliseptica, as evidenced by 99.6% sequence identity with Pseudomonas anguilliseptica strain VITEPRRL6 (GenBank ID - KR149276). Therefore, the HindIII-like enzyme of MatS1 was tentatively designated PanI. This is the first report of a HindIII-like enzyme characterized from the genus Pseudomonas.
The HindIII-like enzyme of MatS1 was partially purified by size exclusion using molecular weight cut-off filters. Cleavage of λ DNA by lysate concentrated with 50 kDa and 30 kDa molecular weight cut-off filters produced two different banding patterns (Fig. 2). A banding pattern similar to that of λ DNA cleaved by HindIII was observed with the retentate of the 30 kDa filter and the filtrate of the 50 kDa filter, suggesting that the molecular weight of the HindIII-like enzyme of MatS1 lies between 30 and 50 kDa. Moreover, the cleavage pattern of λ DNA observed with the retentate of the 50 kDa filter indicates the presence of other restriction enzymes with different specificities and molecular weights above 50 kDa. This observation was supported by whole-genome sequence analysis of MatS1, which indicated the presence of other genes encoding putative restriction enzymes.
The recognition sequence of the restriction enzyme (RE) of MatS1 (PanI) was determined by cloning and sequencing the ends of the fragments produced by digesting λ DNA with partially purified PanI. The recognition sequence was found to be “AAGCTT”.
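The consequence of an “AAGCTT” recognition sequence for a banding pattern can be illustrated with a minimal Python sketch that locates the sites in a linear DNA sequence and derives the expected fragment lengths. The function names and the toy sequence are hypothetical, and the cut position (after the first A, as known for HindIII) is an assumption, since the text reports only the recognition sequence of PanI.

```python
RECOGNITION_SITE = "AAGCTT"
# Assumption: PanI cuts like HindIII (A^AGCTT); the text reports only the site.
CUT_OFFSET = 1


def find_sites(seq, site=RECOGNITION_SITE):
    """Return 0-based start positions of every occurrence of the site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq, site=RECOGNITION_SITE, cut_offset=CUT_OFFSET):
    """Return fragment lengths after complete digestion of a linear molecule."""
    cuts = [pos + cut_offset for pos in find_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Toy sequence for illustration only (not lambda DNA).
toy_sequence = "GGGAAGCTTCCCCAAGCTTGG"
print(find_sites(toy_sequence))
print(fragment_lengths(toy_sequence))
```

Because the site is palindromic, scanning one strand is sufficient. Applying the same functions to the λ genome sequence would give the fragment sizes expected for a complete digest, which could then be compared with the observed bands.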
Partially purified PanI was active in most of the commercially available buffers, including buffer B, which is the optimum buffer for commercially available HindIII (Promega). However, optimum activity of PanI was observed in multi-core (MC) buffer (Fig. 3), which contains 10 mM Mg2+ as the divalent cation and 100 mM K+ as the monovalent cation at pH 7.5 (Table 1).
The optimum temperature of the partially purified enzyme appeared to lie between 37 °C and 50 °C (Fig. 4), with the highest band intensity observed at 40 °C. Activity decreased above 70 °C, but detectable activity remained even at 80 °C, suggesting that the enzyme is functional even at extreme temperatures. EcoVIII, a previously reported isoschizomer of HindIII, also showed optimal activity at 48 °C, which is consistent with our observations (Mise and Nakajima 1984).
To determine the thermal stability of PanI in the absence of the DNA substrate, the partially purified enzyme was pre-incubated at different temperatures. Without the DNA substrate, the enzyme was stable at temperatures up to 40 °C for 20 minutes (Fig. 5).
Whole-genome sequence analysis of MatS1 revealed a complete coding sequence with a high degree of similarity to known genes encoding HindIII family enzymes reported in the NCBI GenBank database, further confirming that the partially purified PanI of MatS1 is an isoschizomer of HindIII. The sequence of PanI was deposited in the NCBI GenBank database (GenBank ID - MW140018). The complete putative ORF of PanI is 912 bp and encodes a protein of 304 amino acids with a predicted molecular weight of ~34.6 kDa and a theoretical isoelectric point of 6.18. This predicted molecular weight agrees with the empirically estimated molecular weight range (30–50 kDa) of partially purified PanI. The protein sequence has a region that resembles the partially conserved HindIII endonuclease superfamily signature (residues 13–292) and several conserved DNA substrate binding sites, as predicted by the COACH online server (Fig. 8; Yang et al. 2013). Based on pairwise sequence alignment, PanI shared significant sequence relatedness with its bacterial counterparts, with a maximum similarity of 89.1 % and identity of 76.3 % to the homolog from Cylindrospermopsis raciborskii, supporting its homology to known HindIII counterparts (Table 2).
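The agreement between the ORF length and the filter-based size estimate can be checked with simple arithmetic, sketched below in Python. The average residue mass of 110 Da is a common rule-of-thumb assumption and not the method used to obtain the reported ~34.6 kDa. Whether the 912 bp count includes a stop codon is not stated in the text, so the sketch simply divides the length by three. All function names are hypothetical.

```python
# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVERAGE_RESIDUE_MASS_DA = 110.0


def codon_count(orf_length_bp):
    """Number of codons in an ORF whose length is a multiple of three."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_bp // 3


def approximate_mass_kda(n_residues, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Rough protein mass in kDa from the residue count."""
    return n_residues * residue_mass_da / 1000.0


def within_cutoff_window(mass_kda, lower_kda=30.0, upper_kda=50.0):
    """True if the mass falls between the two filter cut-offs."""
    return lower_kda <= mass_kda <= upper_kda


codons = codon_count(912)               # ORF length reported in the text
estimate = approximate_mass_kda(codons)
print(codons, round(estimate, 1), within_cutoff_window(estimate))
```

The rough estimate of about 33.4 kDa is close to the reported ~34.6 kDa, and both values fall within the 30–50 kDa window inferred from the cut-off filters.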
Table 2. Percentage similarity and identity of PanI with different homologues.
| Species name | Restriction endonuclease (RE) | NCBI GenBank accession number | Length (amino acids) | % Identity | % Similarity |
|---|---|---|---|---|---|
| Cylindrospermopsis raciborskii | HindIII family | WP071241953 | 304 | 76.3 | 89.1 |
| Pseudomonas amygdali | HindIII family | WP057425488 | 304 | 73.1 | 85.6 |
| Pseudomonas meliae | HindIII family | WP054991699 | 304 | 73.1 | 85.2 |
| Thermoflexibacter ruber | HindIII family | WP091549217 | 304 | 69.2 | 82.6 |
| Chlorobi bacterium OLB4 | HindIII family | KXK03935 | 304 | 66.9 | 82.6 |
| Bacteroidetes bacterium RIFCSPLOWO2 | RE | OFY69087 | 304 | 66.6 | 82.6 |
| Ignavibacteria bacterium | RE | OIO24181 | 304 | 65.2 | 82.3 |
| Bacteroidales bacterium Barb6XT | HindIII family | WP066182127 | 304 | 62.3 | 80.7 |
| Planktothrix sp. PCC 11201 | HindIII family | WP079679536 | 304 | 61.3 | 78.7 |
| Geminocystis herdmanii | HindIII family | WP017294927 | 304 | 60.3 | 78.0 |
| Arthrospira | HindIII family | WP006621132 | 304 | 60.3 | 77.0 |
| Oscillatoria acuminata | HindIII family | WP015150672 | 304 | 59.0 | 77.7 |
| Chlorobi bacterium OLB7 | HindIII family | KXK52534 | 303 | 58.6 | 79.3 |
| Enterobacteriaceae | HindIII family | WP015059042 | 307 | 55.8 | 73.2 |
| Aquamicrobium aerolatum | HindIII family | WP091525013 | 304 | 55.1 | 71.8 |
| Escherichia coli | HindIII family | WP024235892 | 308 | 54.4 | 72.2 |
| Escherichia coli (Plasmid) | EcoVIII | AAA91203 | 333 | 51.5 | 67.6 |
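How percentage identity and similarity values such as those in Table 2 are derived from a pairwise alignment can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used for Table 2: the conservative-substitution groups are a hypothetical simplification (alignment tools normally use a scoring matrix such as BLOSUM62 to decide which pairs count as similar), the full alignment length including gaps is assumed as the denominator, and the function name and toy sequences are invented.

```python
# Hypothetical conservative-substitution groups, used here in place of a
# full scoring matrix to decide which mismatches count as "similar".
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def percent_identity_similarity(aligned_a, aligned_b):
    """Return (% identity, % similarity) for two gapped, aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = 0
    similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns count toward neither total
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    # Assumption: the denominator is the alignment length, gaps included.
    length = len(aligned_a)
    return 100.0 * identical / length, 100.0 * similar / length


# Toy alignment for illustration only (not PanI).
identity, similarity = percent_identity_similarity("MKT-LV", "MRTALI")
print(round(identity, 1), round(similarity, 1))
```

Because identical positions are a subset of similar positions, similarity is never lower than identity, as is the case for every row of Table 2.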
In the phylogenetic reconstruction, PanI clustered with known bacterial HindIII family enzymes (Fig. 7). According to the tree topology, PanI clustered closely with the Pseudomonas HindIII homologs. However, its closest evolutionary relationship was with its counterpart in C. raciborskii: the two formed a sub-clade, with high bootstrap support (85), within the main cluster that also harbors the HindIII family enzymes of the Pseudomonas species.
This clustering pattern suggests a possible horizontal transfer of the HindIII-like protein-coding gene between P. anguilliseptica and C. raciborskii. This possibility is further supported by the higher sequence identity of PanI with the C. raciborskii HindIII family protein than with those of the other two Pseudomonas species (Table 2). However, further investigation is needed to validate it.
The I-TASSER online server predicted the tertiary structure of PanI based on 10 threading templates identified from the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank. The normalized Z-scores of the threading alignments ranged from 2.27 to 11.92, supporting the reliability of each alignment. The most reliable model, with a C-score of 1.00, an estimated TM-score of 0.85 ± 0.08 and an estimated RMSD of 4.2 ± 2.8 Å, was selected for visualization in PyMOL.
According to the generated model, PanI consists of 15 α-helices and one β-pleated sheet with five strands (Fig. 8A). In contrast, the monomeric form of the empirically determined crystal structure of HindIII comprises 16 α-helices and two β-pleated sheets, one with two strands and the other with five strands (Watanabe et al. 2009). Comparison of the two structures therefore suggests that the five-stranded β-pleated sheet is conserved in PanI. Moreover, consistent with the HindIII crystal structure, the first strand of the five-stranded β-sheet runs parallel to the fifth strand in the predicted tertiary structure of PanI.
As part of the computer-based simulation performed by I-TASSER, potential DNA substrate binding sites were predicted from the modeled tertiary structure of PanI, together with the three-dimensional structure of the enzyme-cognate DNA complex, using the COACH and COFACTOR algorithms (Yang et al. 2013). The most reliable prediction, with the highest C-score (0.55) and cluster size (35), consists of 8 potential DNA binding residues with a binding probability of over 0.5. Listed from the N terminus, these are Ser-31, Asp-67, Thr-69, Lys-72, Ala-127, Asn-129, Lys-131 and Lys-284 (Fig. 8B). The repeated occurrence of the positively charged amino acid Lys in the binding site indicates potentially strong ionic interactions between the enzyme and the negatively charged DNA. As expected, the cognate stretch of DNA that overlaps with the active site of the enzyme was predicted to bear the consensus HindIII recognition sequence ‘AAGCTT’, in agreement with the empirically determined recognition sequence of PanI.