Chemicals and substrates
Analytical grade substrates were used. The pre-packed 5 mL HisTrap™ HP affinity column, PD-10 desalting columns (Sephadex G-25 resin), and size exclusion chromatography column (HiLoad® Superdex 75 pg) used for protein purification were obtained from GE Healthcare. Amorphous phosphoric acid swollen cellulose (PASC) was prepared from Avicel according to a previously described method [53]. The high-purity substrates barley β-glucan, lichenan, konjac glucomannan, wheat arabinoxylan, birchwood xylan, and tamarind xyloglucan were purchased from Megazyme. The model crystalline cellulose substrate Avicel PH-101, carboxymethyl cellulose (CMC), gum arabic and standard cello-oligomers were purchased from Sigma-Aldrich.
Sample collection and identification of genes
Specimens of the adult shipworm Psiloteredo megotara were collected from Norway spruce (Picea abies) wooden panels that had been submerged for about 8–9 months in the Arctic Sea near Tromsø, Norway (N 69°46’47.515”; E 18°23’53.143”). The sampling was done in accordance with the Norwegian Marine Resource Act. The shipworm specimen was initially rinsed with sterile water and dissected on a clean bench to separate the specialised gill tissue containing endosymbionts. Bacterial enrichment was performed using crushed gill tissue in a medium supplemented with cellulose as a carbon source, and the cultures were incubated for several months. DNA was isolated from the enrichment culture using the DNeasy Blood & Tissue Kit (QIAGEN). Metagenome sequencing was carried out using Illumina MiSeq 300 paired-end chemistry at the Norwegian Sequencing Centre (www.sequencing.uio.no). Contigs were assembled, annotated and uploaded to the GenBank sequence database (accession number grp 8783669). Full details of the metagenomic dataset will be published elsewhere. Genes coding for carbohydrate-active enzymes were mined using the dbCAN meta server [54]. A gene with accession number OP793796 (3312 bp), located on BankIt2638814 Contig82, was chosen for further studies due to its novel multi-domain architecture.
Gene cloning, expression, and protein purification
A gene (OP793796) encoding the multidomain protein TwCel5-6 (accession number WAK85940.1), codon optimized for E. coli expression using the OptimumGene™ PSO algorithm, was synthesized by GenScript Biotech (Piscataway, NJ 08854, USA). Gene fragments encoding TwCel5CAT and TwCel5CBM were generated by PCR using Q5 DNA polymerase (New England Biolabs, Ipswich, MA) and the primers described in Table S1. Both gene fragments were amplified without the putative signal peptide (additional file: Fig. S4), as predicted using the SignalP-5.0 prediction tool [55]. PCR products were purified using a PCR clean-up kit as per the manufacturer’s instructions (Macherey-Nagel, Germany), followed by agarose (1% w/v) gel electrophoresis. Prior to cloning, the DNA concentration was determined using a nano UV spectrophotometer (Thermo Scientific, San Jose, CA, USA). The DNA fragment encoding TwCel5CAT (amino acid residues 26–320) was cloned into the pNIC-CH expression vector (AddGene, Cambridge, MA), which adds a C-terminal polyhistidine-tag (6xHis-tag) to the protein, as per the manufacturer’s instructions. Similarly, the DNA fragment encoding TwCel5CBM (amino acid residues 30–412) was cloned into the directional Champion pET151/D-TOPO™ expression vector (Invitrogen, Carlsbad, CA), which adds a cleavable N-terminal V5-6xHis-tag to the protein, as per the manufacturer’s instructions.
The recombinant vectors were transformed into chemically competent OneShot™ E. coli TOP10 (Invitrogen, Carlsbad, CA) cells using a heat shock transformation method. Cells were grown in SOC medium for 60 minutes prior to plating on lysogeny broth (LB) agar plates supplemented with the antibiotic appropriate for the vector (50 µg/mL kanamycin for TwCel5CAT and 100 mg/mL ampicillin for TwCel5CBM), followed by overnight incubation at 37°C. Colonies on the LB plates were picked and screened by colony PCR using the pair of T7 primers: T7, 5´-TAATACGACTCACTATAGGG-3´ and T7 reverse, 5´-TAGTTATTGCTCAGCGGTGG-3´. Positive clones were inoculated in liquid LB containing the appropriate antibiotic (as mentioned above), and the cultures were incubated overnight at 37°C with shaking at 200 rpm. The recombinant plasmids were isolated from the E. coli cells using the Zymo MiniPrep Kit (Zymo Research). Prior to transformation into expression cells, the correct integration and DNA sequence of the genes were confirmed using Sanger sequencing (GATC Biotech, Constance, Germany). The expression vectors were transformed into chemically competent OneShot BL-21 Star™ (DE3) and ArcticExpress (DE3) E. coli expression cells, for TwCel5CAT and TwCel5CBM, respectively, using the heat shock method described above.
To produce recombinant proteins, E. coli transformants were inoculated in Terrific Broth medium supplemented with the appropriate antibiotic and incubated at 37°C with horizontal shaking (200 rpm) until the optical density (OD600nm) reached 0.6–0.8. Expression was then induced by adding 0.3 mM isopropyl-β-D-thiogalactopyranoside (IPTG), followed by 24 h incubation at 15°C with horizontal shaking (200 rpm). Cultures were harvested by centrifugation at 10,000 x g for 15 min at 8°C, using a Beckman Coulter centrifuge (Brea, CA, USA). Cells were stored frozen at -20°C until further use. For protein purification, a cell-free extract was prepared by re-suspending about 5 g wet cell biomass in 50 mL of 50 mM Tris-HCl buffer, pH 7.4, supplemented with 200 mM NaCl, 10% glycerol and 30 mM imidazole (lysis buffer). Prior to cell disruption, the suspension was supplemented with 0.5–1 mg of cOmplete™, EDTA-free protease inhibitor cocktail (Roche) and lysozyme (0.5 mg/mL). Cells were disrupted using a cooled high-pressure homogenizer (LM20 Microfluidizer, Microfluidics). Cell debris was removed by centrifugation at 27,000 x g for 30 min at 4°C. The resulting cell-free extracts, containing cytosolic soluble proteins, were filtered using a sterile 0.45 µm filter (Sarstedt, Nümbrecht, Germany).
The filtered cell-free extract was subjected to immobilized metal affinity chromatography using an ÄKTA pure chromatography system equipped with a 5-mL HisTrap HP column (GE Healthcare) equilibrated with lysis buffer (see above). After sample loading, the HisTrap column was washed extensively with 50 mM Tris-HCl, pH 7.4, supplemented with 200 mM NaCl, 10% glycerol and 70 mM imidazole (wash buffer) until the UV absorbance dropped and stabilised at baseline level. Bound proteins were eluted using the same buffer supplemented with 500 mM imidazole (elution buffer). Eluted proteins were first analysed by SDS-polyacrylamide gel electrophoresis (SDS-PAGE) on TGX Stain-Free precast gels (Bio-Rad, Hercules, CA, USA). The molecular weight of the recombinant proteins was estimated using the Invitrogen™ BenchMark™ Pre-stained Protein Ladder (Fisher Scientific, Waltham, Massachusetts, USA). Eluted proteins were concentrated using Vivaspin® 10,000 MWCO centrifugal filter units (Sartorius, Göttingen, Germany). Proteins were purified to homogeneity using a size-exclusion chromatography column (HiLoad® Superdex 75 pg) pre-equilibrated with 50 mM sodium phosphate buffer (pH 7.4) containing 150 mM NaCl. Finally, the buffer was exchanged to 50 mM sodium citrate (pH 5.6) using a PD-10 desalting column. The protein concentration was determined by UV absorbance at 280 nm (A280) using theoretical molar extinction coefficients (TwCel5CAT: 71390 M⁻¹·cm⁻¹ and TwCel5CBM: 96745 M⁻¹·cm⁻¹) estimated using the ProtParam tool [56]. Purified enzymes were stored at 4°C.
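The conversion from A280 to protein concentration follows the Beer-Lambert law, c = A / (ε·l). The short Python sketch below illustrates the arithmetic with the extinction coefficients quoted above; the 1 cm path length, the optional dilution factor and the example absorbance reading are assumptions made for illustration and are not values reported in this work.

```python
# Theoretical molar extinction coefficients at 280 nm (M^-1 cm^-1), as quoted in the text.
EXTINCTION_COEFFICIENTS = {
    "TwCel5CAT": 71390.0,
    "TwCel5CBM": 96745.0,
}


def molar_concentration(a280, epsilon, path_cm=1.0, dilution=1.0):
    """Return the protein concentration in mol/L from an A280 reading.

    Beer-Lambert law: c = A / (epsilon * l). The path length (1 cm) and the
    dilution factor are illustrative assumptions, not values from the paper.
    """
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 * dilution / (epsilon * path_cm)


if __name__ == "__main__":
    # Hypothetical absorbance reading, used only to show the calculation.
    reading = 0.50
    conc = molar_concentration(reading, EXTINCTION_COEFFICIENTS["TwCel5CAT"])
    print(f"TwCel5CAT: {conc * 1e6:.2f} µM")
```

The same function can be used to work out the dilution needed to reach the 50 nM or 1 µM enzyme concentrations used in the assays.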
Biochemical characterization of recombinant TwCel5
The standard assays were performed in a 300 µL reaction volume containing 50 mM sodium phosphate buffer (pH 7.5), 0.5 M NaCl, 0.5% (w/v) β-glucan and 50 nM of the purified enzyme. The reaction was started by the addition of either enzyme or substrate, followed by incubation at 30°C in a thermomixer with horizontal agitation (500 rpm; Eppendorf, Hamburg, Germany) for 60 minutes (unless stated otherwise). To determine the pH stability and pH optima, reactions were carried out in 50 mM buffer systems (sodium citrate, pH 3.0–6.0; potassium phosphate, pH 6.5–8.0; and glycine-NaOH, pH 9.6–10.6) containing 0.5 M NaCl, 0.5% (w/v) β-glucan and 50 nM of the purified enzyme. To determine the thermal stability and optimal temperature, reactions were performed in 50 mM sodium phosphate buffer (pH 7.5), 0.5 M NaCl, 0.5% (w/v) β-glucan and 50 nM of the purified enzyme at temperatures ranging from 5 to 70°C. The effect of the NaCl concentration on activity was determined using 50 mM sodium phosphate buffer (pH 7.5), 0.5% (w/v) β-glucan, 50 nM of the purified enzyme and NaCl concentrations ranging from 0 to 1.5 M. Aliquots were collected at different time intervals over a period of 60 minutes; reactions were stopped by mixing the samples immediately with DNSA reagent. Product formation was determined by quantifying reducing end sugars with the 3,5-dinitrosalicylic acid (DNSA) assay [57], using glucose as a standard. The absorbance (A540nm) was recorded using a Varioskan™ LUX multimode microplate reader (Thermo Scientific, San Jose, CA, USA). All assays were performed in triplicate.
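Quantification against a glucose standard amounts to fitting a calibration line to the A540 readings of the standards and inverting it for each sample. The Python sketch below shows one way to do this with an ordinary least-squares fit; the standard concentrations and absorbance values are hypothetical placeholders, and the assumption of a linear response over the calibrated range is ours rather than a detail given in the text.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired calibration points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reducing_sugar(a540, slope, intercept):
    """Invert the calibration line to get glucose equivalents (same unit as the standards)."""
    return (a540 - intercept) / slope


def mean(values):
    """Arithmetic mean, e.g. for triplicate measurements."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical glucose standards (mM) and their A540 readings, for illustration only.
    glucose_mm = [0.0, 0.5, 1.0, 2.0, 4.0]
    a540 = [0.05, 0.15, 0.25, 0.45, 0.85]
    slope, intercept = fit_line(glucose_mm, a540)

    # Hypothetical triplicate sample readings.
    triplicate = [0.40, 0.42, 0.41]
    estimates = [reducing_sugar(a, slope, intercept) for a in triplicate]
    print(f"mean reducing ends: {mean(estimates):.2f} mM glucose equivalents")
```

Readings that fall outside the range spanned by the standards should be diluted and re-measured rather than extrapolated.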
Lignocellulose substrate specificity
The substrate specificity of purified TwCel5CAT and TwCel5CBM was evaluated using a wide variety of complex lignocellulosic substrates, both soluble and insoluble. The insoluble substrates included the model crystalline polysaccharides Avicel PH-101 and Whatman® cellulose filter paper (0.5 µm particle size); phosphoric acid swollen cellulose (PASC) was used as an amorphous substrate. The soluble lignocellulosic substrates included β-glucan, birchwood xylan, carboxymethyl cellulose (CMC), wheat arabinoxylan, konjac glucomannan, xyloglucan, and lichenan. Soluble substrates were dissolved according to the supplier’s protocol. The standard reactions were carried out in a 300 µL reaction volume in 50 mM sodium phosphate buffer, pH 7.5, containing 0.5 M NaCl, using 50 nM enzyme for soluble substrates (0.5% w/v) and 1 µM enzyme for insoluble substrates (1% w/v). Reactions were incubated at 30°C with horizontal agitation (500 rpm) for 60 minutes (soluble substrates) or 24 hours (insoluble substrates). Aliquots were taken at different intervals; reactions were stopped by mixing the samples immediately with DNSA reagent. Release of reducing end sugars was measured using the DNSA assay, with glucose as a standard, as described above. For insoluble substrates, samples were filtered before measurement.
Cellulase-LPMO synergy experiment
Purified CelS2 from Streptomyces coelicolor was a kind gift from Dr. Zarah Forsberg [20]. The cellulase-LPMO synergy was assessed by performing reactions with crystalline Avicel (1% w/v) in 50 mM sodium phosphate buffer (pH 6.0), incubated at 30°C in a thermomixer (Eppendorf, Hamburg, Germany) with horizontal agitation (1000 rpm). Experiments to determine the synergy were conducted at fixed enzyme concentrations, using 1 µM copper-saturated CelS2 and/or 1 µM of one of the TwCel5 cellulase variants. The reactions were started by first adding 1 mM ascorbic acid (final concentration) to all reaction mixtures, immediately followed by addition of the enzyme. Reactions were incubated for 48 hours, and aliquots were taken at different time intervals and filtered through a 0.45 µm filter to remove the insoluble substrate and stop the reaction. Reducing ends generated by the action of the cellulase variants on oligomeric products solubilized by the LPMO could give a false impression of synergy. To check for this, control reactions were performed in which Avicel was first incubated with the LPMO for 48 hours, after which the products were treated with the TwCel5 variants. Cellulose saccharification was assessed using the reducing end assay described above, and all experiments were performed in triplicate.
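The text does not state how synergy was scored. One convention commonly used for enzyme cocktails is the degree of synergy: the product released by the combined enzymes divided by the sum of the products released by each enzyme alone, with values above 1 indicating synergy. The Python sketch below implements that convention as an illustration only; the function name and the example numbers are hypothetical and should not be read as the calculation used in this study.

```python
def degree_of_synergy(combined, cellulase_alone, lpmo_alone):
    """Ratio of product from the combined reaction to the sum of the individual reactions.

    All three arguments must be in the same unit (e.g. mM reducing ends at the
    same time point). A value > 1 suggests synergy, ~1 additivity, < 1 antagonism.
    This metric is a common convention, assumed here; it is not taken from the text.
    """
    expected = cellulase_alone + lpmo_alone
    if expected <= 0:
        raise ValueError("sum of individual activities must be positive")
    return combined / expected


if __name__ == "__main__":
    # Hypothetical reducing-end values (mM) at one time point, for illustration only.
    print(degree_of_synergy(combined=3.0, cellulase_alone=1.0, lpmo_alone=0.5))
```

When applying such a ratio, the signal from the control reaction described above (LPMO products subsequently treated with the cellulase) would need to be considered before interpreting a value above 1 as true synergy.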
Product analysis by HPAEC-PAD (ICS-6000)
Hydrolytic products generated from PASC were analysed by high-performance anion exchange chromatography with pulsed amperometric detection (HPAEC-PAD) on a Dionex ICS-6000 system (Thermo Scientific, San Jose, CA, USA) equipped with a CarboPac™ PA200 IC analytical column. Eluent A (0.1 M NaOH) and eluent B (0.1 M NaOH and 1 M sodium acetate) were applied using the following gradient program: 0–5.5% B for 3 min, 5.5–15% B for 6 min, 15–100% B for 11 min, 100–0% B for 6 s, 0% B for 6 min. The eluent flow rate was set to 0.5 mL/min. Cello-oligosaccharides with a degree of polymerization from one to five (DP1–DP5) were used as standards for product identification. The data were analysed using Chromeleon 7.2.9 software.
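The gradient program can be written as a piecewise function of time, which makes it easy to check the total run time or the eluent composition at a given retention time. The Python sketch below assumes that the listed steps run consecutively and that each step is a linear ramp between its start and end values; the gradient curve shape is not specified in the text, so linearity is an assumption.

```python
# Gradient steps as (duration in min, %B at start, %B at end), taken from the text.
# The 6 s step is written as 0.1 min.
GRADIENT = [
    (3.0, 0.0, 5.5),
    (6.0, 5.5, 15.0),
    (11.0, 15.0, 100.0),
    (0.1, 100.0, 0.0),
    (6.0, 0.0, 0.0),
]


def total_run_time(gradient=GRADIENT):
    """Total program length in minutes."""
    return sum(duration for duration, _, _ in gradient)


def percent_b(t_min, gradient=GRADIENT):
    """Return %B at time t_min, assuming linear ramps within each step."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    start = 0.0
    for duration, b_start, b_end in gradient:
        end = start + duration
        if t_min <= end:
            fraction = (t_min - start) / duration
            return b_start + fraction * (b_end - b_start)
        start = end
    # After the program ends, hold the final composition.
    return gradient[-1][2]


def sodium_acetate_mm(t_min):
    """Sodium acetate concentration (mM): eluent B contains 1 M, eluent A none."""
    return percent_b(t_min) * 10.0


if __name__ == "__main__":
    for t in (0, 3, 9, 14.5, 20, 25):
        print(f"t = {t:>4} min: {percent_b(t):5.1f}% B")
    print(f"total run time: {total_run_time():.1f} min")
```

At a flow rate of 0.5 mL/min, multiplying the total run time by the flow rate gives the eluent volume consumed per injection.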
Product analysis by MALDI-TOF MS
Hydrolytic products generated from β-glucan and konjac glucomannan were identified using an UltrafleXtreme matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometer (Bruker Daltonics GmbH, Bremen, Germany) equipped with a 337-nm nitrogen laser. Samples were prepared by mixing 1 µL of sample with 2 µL of 2,5-dihydroxybenzoic acid (DHB) solution (9 mg/mL) and applying the mixture directly onto an MTP 384 ground steel target plate (Bruker Daltonics). Sample spots were allowed to dry on the plate under a flow of dry hot air. Data were acquired using Bruker flexControl and flexAnalysis software. Products were identified based on m/z values.
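Assigning products from m/z values comes down to comparing observed peaks with the masses expected for oligosaccharides of a given degree of polymerization (DP). The Python sketch below computes monoisotopic m/z values for unmodified hexose oligomers, assuming singly charged sodium adducts ([M+Na]+), which are commonly observed for neutral carbohydrates in DHB matrix; the adduct type, the matching tolerance and the function names are assumptions, and modifications such as acetylation of glucomannan fragments are not handled.

```python
# Monoisotopic masses in Da.
HEXOSE_RESIDUE = 162.05282  # anhydrohexose unit, C6H10O5
WATER = 18.01056            # H2O added for the free reducing and non-reducing ends
SODIUM_ION = 22.98922       # Na+ (atomic mass of Na minus one electron)


def sodium_adduct_mz(dp):
    """Expected m/z of [M+Na]+ for a linear hexose oligomer with the given DP."""
    if dp < 1:
        raise ValueError("DP must be at least 1")
    return dp * HEXOSE_RESIDUE + WATER + SODIUM_ION


def assign_dp(observed_mz, max_dp=10, tolerance=0.5):
    """Return the DP whose [M+Na]+ matches the observed m/z, or None if no match.

    The tolerance (in m/z units) is an illustrative choice, not a value from the text.
    """
    for dp in range(1, max_dp + 1):
        if abs(observed_mz - sodium_adduct_mz(dp)) <= tolerance:
            return dp
    return None


if __name__ == "__main__":
    for dp in range(2, 7):
        print(f"DP{dp}: [M+Na]+ = {sodium_adduct_mz(dp):.3f}")
```

Because glucose and mannose are both hexoses, this mass-based assignment gives the DP of a product but cannot distinguish the sugar composition of glucomannan fragments.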
Crystallization, data collection and analysis
Crystallization experiments were performed with a stock solution of the purified protein at 12.4 mg/mL, as estimated by A280, in 6 mM NaCl, 20 mM Tris-HCl at pH 8.0. Initial crystallization experiments were performed using the sitting drop vapour diffusion method, set up with a Phoenix crystallization robot (Art Robbins Instruments). The experiments were set up with 60 µL of reservoir solution and sitting drops containing equal amounts of reservoir solution and protein stock solution in a total drop volume of 1 µL. The plates were incubated at 20°C. Crystals appeared in 1–2 weeks in conditions containing 0.2 M MgCl2, 0.1 M Tris-HCl, pH 8.5, and 25% (w/v) polyethylene glycol 3350 (PEG 3350). Crystals were harvested, transferred to a cryoprotectant solution consisting of the reservoir solution supplemented with 15% ethylene glycol, and flash cooled in liquid N2. X-ray diffraction data were collected at beamline 14.1 at BESSY II (Berlin, Germany). Data collection and processing statistics are presented in Table 1, and structure determination and refinement statistics are presented in Table 2. The crystal structure was solved by molecular replacement using MolRep in the CCP4 program package [58] with 1egz.pdb as the search model [45]. The initial refinement was executed in Refmac [59], followed by automated model improvement in Buccaneer [60]. Manual model building was done in Coot [61], interspersed with cycles of refinement in Refmac, and resulted in a final Rwork/Rfree of 11.42/13.39. The atomic coordinates and structural details have been deposited in the Protein Data Bank with the accession code 8C10. Figures presented in the results section were generated using PyMOL v4.60 (www.pymol.org).