Identification and selection of a GH10-encoding gene from a termite gut microbiome
A previous study of ours on contigs assembled from shotgun metagenomic sequencing of termite gut microbiomes (Cortaritermes fulviceps and Nasutitermes aquilinus) revealed that GH10 was the most abundant family involved in hemicellulose degradation (Romero Victorica et al. 2020). From the assembled contigs, we selected a predicted GH10-encoding gene, KBCPBGKF_14734 (hereafter termed Xyl10B), according to the criteria described in that study.
Sequence analysis and molecular modeling of Xyl10B
An initial sequence analysis of Xyl10B was performed with BLASTP against the RefSeq protein collection of the NCBI database. The Xyl10B amino acid sequence was then aligned with 32 bacterial GH10 reference sequences available in the NCBI database (BLASTP expect value < 1E-64), using the MUSCLE method as implemented in MEGA X (v. 10.2.6) (Kumar et al. 2018). This multiple alignment was used to build a maximum likelihood phylogenetic tree in MEGA under the Jones-Taylor-Thornton (JTT) substitution model for proteins, with 1000 bootstrap replicates.
Homology modeling by Iterative Threading Assembly Refinement (I-TASSER) (Skerman 1989) (http://zhanglab.ccmb.med.umich.edu/I-TASSER/) was used to generate three-dimensional models of Xyl10B. The models were constructed by retrieving the structures of proteins with similar folds from the PDB (Protein Data Bank) library (http://www.rcsb.org) with LOMETS (Wu and Zhang 2007). The fragments from the templates with the highest significance (by Z-score) in the alignments were reassembled into a full-length model. The final model resulted from a second simulation round, in which the global topology was refined.
The confidence of each model was quantitatively measured by C-score (confidence score), TM-score (template modeling score), and RMSD (root-mean-square deviation). The C-score is typically in the range of -5 to 2, where higher scores indicate higher model confidence. A C-score > -1.5 indicates a model of correct topology. The TM-score takes values between 0 and 1, where 1 indicates a perfect match between two structures (Roy et al. 2010). A TM-score > 0.5 indicates a model of correct topology, while a TM-score < 0.17 indicates random similarity. RMSD is the root-mean-square deviation of atomic positions, a measure of the average distance between the atoms of superimposed proteins. An alternative set of molecular structures was generated with AlphaFold using its online Colab notebook (Jumper et al. 2021). AlphaFold improves the accuracy of structure predictions by incorporating novel neural network architectures and training procedures based on the evolutionary, physical and geometric constraints of protein structures. The quality of the models was further evaluated by QMEAN analysis (https://swissmodel.expasy.org/qmean/) (Benkert et al. 2010). Model manipulation and imaging were performed with Chimera software (Pettersen et al. 2004).
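The score definitions above can be illustrated with a short Python sketch. It computes the RMSD between two already superimposed coordinate sets and applies the C-score and TM-score cut-offs quoted in the text. The function names and the "intermediate" label for TM-scores between 0.17 and 0.5 are assumptions made for illustration only; this is not the scoring code used by I-TASSER.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized, already superimposed sets of (x, y, z) atoms."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def interpret_scores(c_score, tm_score):
    """Apply the cut-offs quoted in the text to a C-score and a TM-score."""
    c_score_ok = c_score > -1.5  # C-score > -1.5: correct topology expected
    if tm_score > 0.5:
        tm_label = "correct topology"
    elif tm_score < 0.17:
        tm_label = "random similarity"
    else:
        tm_label = "intermediate"  # hypothetical label, not defined in the text
    return c_score_ok, tm_label


# Hypothetical values, for illustration only.
print(interpret_scores(c_score=-0.5, tm_score=0.72))
print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 2)]))
```

The RMSD function assumes that the superposition has already been carried out; it does not perform the structural alignment itself.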
Cloning, heterologous expression and protein purification of Xyl10B
Total DNA was extracted from N. aquilinus gut samples using the QIAamp DNA Stool kit (Qiagen) following the manufacturer's instructions, with modifications. Briefly, pooled gut samples from six individuals were heated at 95 °C in 1 mL of lysis buffer and then ground following a FastPrep protocol (3 cycles of 20 s at 6,000 rpm) using glass beads (300 mg; 150–212 μm beads; Sigma, USA). An additional purification step was performed with 1.5 volumes of a magnetic bead solution (Agencourt AMPure XP magnetic beads, Beckman Coulter, USA) for 5 min, followed by two washes with 80% ethanol.
The Xyl10B sequence was amplified without the native signal peptide, using specific primers designed from the assembled contig sequence and containing the BamHI and XhoI restriction sites (Xyl10B-F: 5′ GGATCCTACAACGCCCCCG 3′; Xyl10B-R: 5′ CTCGAGTTACTTAACCAGTTCCC 3′; restriction sites are shown underlined), to subsequently generate an N-terminal fusion to a 6xHis tag. The amplification product was first cloned into the pGEM-T Easy vector using E. coli DH5α competent cells. The inserts from selected colonies were then subcloned into the pET28b(+) vector (BamHI/XhoI) (Novagen, Birmingham, United Kingdom) and transformed into competent E. coli Rosetta (DE3) cells (Novagen, Birmingham, United Kingdom).
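As a quick sanity check on the primer design, the following Python sketch locates the BamHI (GGATCC) and XhoI (CTCGAG) recognition sequences in the two primers given above and prints the reverse complement of the reverse primer. The primer sequences are taken from the text; the helper names are illustrative assumptions.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "Xyl10B-F": "GGATCCTACAACGCCCCCG",
    "Xyl10B-R": "CTCGAGTTACTTAACCAGTTCCC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(primer):
    """Map each enzyme whose site occurs in the primer to its 0-based start position."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))

# Reverse primer as read on the coding strand.
print(reverse_complement(PRIMERS["Xyl10B-R"]))
```

In both primers the recognition sequence starts at position 0, that is, at the 5′ end.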
Xyl10B protein expression was induced with 1 mM IPTG for 16 h at 28 °C. The cells were subjected to lysis and sonication (six pulses of 10 s, 28% amplitude) and the recombinant protein was purified from the soluble fraction with a Ni-NTA agarose resin (Qiagen), using 50 mM NaH2PO4, 300 mM NaCl, 250 mM imidazole, pH 8, as elution buffer. The concentration of the purified protein was estimated using Bradford Reagent (Bio-Rad) with BSA as standard. The yield was 2.6 mg of purified, soluble and active recombinant protein from 50 mL of induced E. coli culture.
Protein electrophoresis and Western blot assays
The molecular weight of the protein fractions was assessed by standard sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE). The purified enzyme samples were mixed with an equal volume of 2X cracking buffer, boiled at 100 °C for 5 min and electrophoretically separated on 12% SDS-PAGE gels. The proteins fixed in the gels were visualized with Coomassie Brilliant Blue R-250 and de-stained with a solution of 50% methanol, 10% acetic acid and 40% H2O for 3 h.
Western blotting was performed by transferring the SDS-PAGE separated proteins onto a Hybond C-Extra nitrocellulose membrane (GE Healthcare Life Science, Rahway, USA) using a Mini Trans-Blot system (Bio-Rad, Irvine, USA) according to the manufacturer’s specifications. The separated recombinant proteins were detected with an anti-His mouse antibody (GE Healthcare Life Science, Rahway, USA) and an anti-mouse AP-conjugated goat antibody (Sigma-Aldrich, St. Louis, USA) using the BCIP/NBT substrate. A PageRuler Plus prestained protein ladder (10–250 kDa) (Thermo Scientific, Waltham, USA) included in the SDS-PAGE was used as a molecular weight marker.
The endoxylanase and endoglucanase activities were assayed with 1% (w/v) beechwood xylan and 1% (w/v) carboxymethyl cellulose (CMC) (Sigma-Aldrich, St. Louis, USA), respectively, in a final reaction volume of 0.1 mL, at 50 °C and 400 rpm for 20 min in a Thermomixer (Eppendorf, Hamburg, Germany). The reducing sugars released from the hydrolysis of xylan and CMC were measured using the 3,5-dinitrosalicylic acid (DNS) assay at 540 nm (Miller 1959), with xylose or glucose as standards, respectively.
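Converting a DNS absorbance reading into an amount of reducing sugar relies on a standard curve. A minimal Python sketch of that step is given below, assuming a linear relationship between absorbance at 540 nm and the amount of xylose over the working range. The standard-curve values are hypothetical and serve only to make the example runnable; they are not data from this study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical xylose standards (umol per reaction) and their A540 readings.
xylose_umol = [0.0, 0.1, 0.2, 0.4]
a540 = [0.00, 0.15, 0.30, 0.60]

slope, intercept = fit_line(xylose_umol, a540)


def absorbance_to_umol(absorbance):
    """Invert the standard curve to estimate umol of reducing sugar."""
    return (absorbance - intercept) / slope


print(absorbance_to_umol(0.45))
```

Readings outside the range covered by the standards should be repeated with a diluted sample rather than extrapolated.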
The arabinofuranosidase, β-glucosidase and xylosidase activities were assayed using 4-nitrophenyl-α-L-arabinofuranoside (pNPA) (Megazyme, Bray, Ireland), 4-nitrophenyl-β-D-glucopyranoside (pNPG) and 4-nitrophenyl-β-D-xylopyranoside (pNPX) (Sigma-Aldrich, St. Louis, USA), respectively. Reactions of 0.1 mL containing 2.5 mM of each substrate and the properly diluted enzyme solution were prepared in 50 mM sodium phosphate buffer (pH 8). The mixtures were incubated at 50 °C for 20 min and the reaction was then stopped by adding 0.5 mL of 2% Na2CO3. The concentration of released p-nitrophenol (pNP) was calculated from a standard curve by measuring the absorbance at 410 nm. The enzyme activities were expressed as IU/mg of protein. For all enzymatic assays, one international unit (IU) was defined as the amount of enzyme that released 1 μmol of product per minute under the specified assay conditions.
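The unit definition above translates directly into a short calculation. The Python sketch below converts an absorbance reading at 410 nm into nmol of pNP through a linear standard curve and then into specific activity (IU/mg). The slope, absorbance and protein amount are hypothetical placeholders, and the function names are assumptions for illustration.

```python
def pnp_nmol_from_absorbance(a410, slope_per_nmol, intercept=0.0):
    """Invert a linear pNP standard curve: A410 = slope * nmol + intercept."""
    return (a410 - intercept) / slope_per_nmol


def specific_activity_iu_per_mg(product_umol, minutes, protein_mg):
    """IU/mg, where 1 IU releases 1 umol of product per minute."""
    if minutes <= 0 or protein_mg <= 0:
        raise ValueError("minutes and protein_mg must be positive")
    return product_umol / minutes / protein_mg


# Hypothetical reading: A410 = 0.50 with a standard-curve slope of 0.01 per nmol.
pnp_nmol = pnp_nmol_from_absorbance(0.50, slope_per_nmol=0.01)

# 20 min incubation (from the text) with a hypothetical 0.001 mg of enzyme.
activity = specific_activity_iu_per_mg(pnp_nmol / 1000.0, minutes=20, protein_mg=0.001)
print(activity)
```

Expressing the standard curve in amount of pNP per reaction (rather than concentration) avoids having to correct for the dilution introduced by the Na2CO3 stop solution, provided the standards are processed in the same final volume.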
Optimum pH, temperature and thermal stability
The optimal pH was assessed using sodium citrate (pH 3-4), sodium phosphate (pH 5-8) and glycine-NaOH (pH 8.5-10) buffers at 50 °C. The temperature effect was evaluated by incubation at pH 9 and at temperatures ranging between 20 °C and 70 °C.
The thermal stability was evaluated by pre-incubating the enzyme at 40 °C and 50 °C for 0 to 16 h. In addition, kinetic parameters were determined under optimal assay conditions using 0–50 mg/mL of beechwood xylan as substrate, by fitting models to the data with GraphPad Prism v. 8.0 (http://www.graphpad.com/scientific-software/prism/).
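The text does not state which kinetic model was fitted. Assuming a Michaelis-Menten model, v = Vmax·[S] / (Km + [S]), the parameter estimation can be sketched in Python as follows. Note that this sketch uses the Hanes-Woolf linearization ([S]/v against [S]) so that it runs with the standard library only; it is a stand-in for, not a reproduction of, the nonlinear regression that GraphPad Prism performs. The substrate concentrations and rates are synthetic.

```python
def michaelis_menten(s, vmax, km):
    """Reaction rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def hanes_woolf_fit(s_values, v_values):
    """Estimate (Vmax, Km) from the line [S]/v = [S]/Vmax + Km/Vmax."""
    points = [(s, s / v) for s, v in zip(s_values, v_values) if s > 0 and v > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx                     # equals 1 / Vmax
    intercept = mean_y - slope * mean_x   # equals Km / Vmax
    vmax = 1.0 / slope
    return vmax, intercept * vmax


# Synthetic, noise-free data: substrate in mg/mL within the 0-50 range of the text,
# generated with hypothetical Vmax = 100 and Km = 5.
substrate = [1.0, 2.5, 5.0, 10.0, 25.0, 50.0]
rates = [michaelis_menten(s, vmax=100.0, km=5.0) for s in substrate]

vmax_est, km_est = hanes_woolf_fit(substrate, rates)
print(vmax_est, km_est)
```

With real, noisy measurements a direct nonlinear least-squares fit is generally preferable, because linearizations distort the error structure of the data.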
Effect of metals and reagents
The effects of various metal ions (CaCl2, CuSO4, NiCl2, MgCl2, MnSO4, ZnSO4 at 1 mM and 10 mM), chemical reagents (EDTA, SDS, Tween-40, DMSO, β-mercaptoethanol at 1 mM, 10 mM or 0.5%) and NaCl (1 M, 3 M and 5 M) on Xyl10B activity were determined by adding the individual reagent in sodium phosphate buffer (pH 9.0) into the standard reaction and incubating at 50 °C for 20 min. Beechwood xylan (1%) was used as substrate. The activity assayed in the absence of metal ions or reagents was recorded as 100% (control).
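The normalization to the untreated control can be written as a one-line calculation. The Python sketch below averages replicate readings and expresses the treated activity as a percentage of the control; the readings and the use of replicate means are illustrative assumptions, not details given in the text.

```python
from statistics import mean


def relative_activity(treated_readings, control_readings):
    """Activity of a treated sample as a percentage of the untreated control (100%)."""
    control = mean(control_readings)
    if control <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * mean(treated_readings) / control


# Hypothetical replicate activities (IU/mg) with and without an additive.
control = [2.4, 2.4]
with_additive = [1.0, 1.4]
print(relative_activity(with_additive, control))
```

Values above 100% would indicate activation by the additive, and values below 100% would indicate inhibition.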
Determination of the mode of action of Xyl10B and ligand interaction analysis
The hydrolysis patterns were qualitatively analyzed by thin layer chromatography (TLC) on silica gel plates (GE Healthcare Life Science, Rahway, USA), using ethanol/acetic acid/water (2:1:1) as the solvent system. The spots were revealed with a water/ethanol/sulfuric acid (20:70:3) solution containing 1% (v/v) orcinol, followed by heating over a flame. Xylose (X1) (Sigma-Aldrich, St. Louis, USA), xylobiose (X2), xylotriose (X3), xylotetraose (X4) and xylopentaose (X5) (Megazyme, Bray, Ireland) were used as standards. The β-xylanase activity was followed over time in 100 µL reaction mixtures containing beechwood xylan (1%) and 0.65 mg/mL of enzyme. The reaction was stopped after 20 min, 40 min, 1 h, 2 h, 3 h and 16 h. The mode of action of Xyl10B was assessed using different xylooligosaccharides (XOS), namely xylobiose (X2), xylotriose (X3), xylotetraose (X4) and xylopentaose (X5) (10 mM each), as well as beechwood xylan (1%), and then subjecting the hydrolysates to TLC analysis. Both assays were performed at 50 °C and pH 9.
The in silico ligand interaction analysis started from a structural alignment of Xyl10B with a PDB model of the GH10 xylanase CbXyn10C complexed with xyloheptaose (PDB 5OFK), performed with Chimera software (Pettersen et al. 2004). This step positioned the exogenous ligand in the catalytic groove of Xyl10B; after removal of CbXyn10C, the remaining Xyl10B-xyloheptaose complex was energetically optimized at its interacting atoms. The resulting complex was uploaded to the PDBsum server (Laskowski et al. 2018), which runs a program for automatically plotting protein-ligand interactions (LIGPLOT) (Wallace et al. 1995).
Nucleotide and amino acid sequence accession numbers
The nucleotide and amino acid sequences of Xyl10B were deposited in the GenBank database with the accession numbers OK617332 and UKT5974, respectively.