Identification and Characterization of a Novel Carboxylesterase EstQ7 from a Soil Metagenomic Library

Zhenzhen Yan Nanjing Agricultural University https://orcid.org/0000-0001-9356-7425 Liping Ding Nanjing Agricultural University Dandan Zou Nanjing Agricultural University Luyao Wang Nanjing Agricultural University Yuzhi Tan Nanjing Agricultural University Shuting Guo Nanjing Agricultural University Yingchen Zhang Nanjing Agricultural University Zhihong Xin (xzhfood@njau.edu.cn) Nanjing Agricultural University


Introduction
Soil is a natural medium for microorganisms: 1 g of soil usually contains 10⁶-10⁹ microorganisms, making it the most abundant microbial resource and a valuable reservoir for mining novel genes and biocatalysts (Bachar et al. 2010). However, more than 99% of microbes in the environment are currently unculturable under laboratory conditions (Rappe and Giovannoni 2003), which greatly limits the exploration and utilization of novel enzyme resources. The metagenomic strategy provides access to the genomes of uncultured microorganisms without a culturing step and has been applied over the past decades to screen novel biocatalysts for industrial applications (Handelsman 2004). Various novel biocatalysts have been successfully identified from soil metagenomic libraries, such as lipolytic enzymes (Li et al. 2018; Park et al. 2020), proteases (Gong et al. 2017) and cellulases (Garg et al. 2016; Yang et al. 2016), as well as novel natural products (Katz et al. 2016).
Lipolytic enzymes (EC 3.1.1.x) belong to the α/β hydrolase family and are ubiquitous in animals, plants and microorganisms (Jaeger and Eggert 2002). Based on their substrate preferences and structural characteristics, lipolytic enzymes are classified into carboxylesterases (EC 3.1.1.1), lipases (EC 3.1.1.3) and various phospholipases (Arpigny and Jaeger 1999). Both carboxylesterases and lipases can catalyze the hydrolysis of p-nitrophenyl esters (pNPEs), but the former tend to hydrolyze short- to medium-chain, water-soluble fatty acid esters, whereas the latter show specificity toward long-chain esters with low water solubility. Such enzymes are among the most important biocatalysts and can catalyze ester hydrolysis, esterification and transesterification (Bommarius 2015). Because of their broad substrate specificity, high efficiency and stability, lipolytic enzymes have found a diverse range of applications in the food, pharmaceutical, fine chemical, detergent, textile, paper, environmental remediation and biodiesel industries (Bornscheuer 2002; Gurung et al. 2013; Gerits et al. 2014; Ramnath et al. 2017). As a result, lipolytic enzymes have long attracted great attention, and a large number of novel esterolytic enzymes have been identified. Nevertheless, only a minority of the isolated enzymes have been successfully applied in industry (Ferrer et al. 2016), so researchers remain keen to screen various environmental samples for novel lipolytic enzymes with improved properties to meet growing demand.
In this study, a fosmid metagenomic library was constructed with environmental DNA (eDNA) extracted from a cornfield soil sample, and a novel lipolytic gene, termed estq7, was identified. We describe the cloning and overexpression of this novel gene and the biochemical characterization of the purified recombinant enzyme EstQ7. The sequences of the novel lipolytic enzyme and its homologs were analyzed, and a three-dimensional (3D) model was constructed, followed by molecular docking to analyze the interactions between the receptor and the ligand.
Construction of a soil metagenomic library and screening of lipolytic genes
eDNA was extracted from a cornfield soil sample collected in Shangqiu, Henan Province, as described previously (Brady 2007). The CopyControl™ Fosmid Library Production Kit (Epicentre) was used to construct the metagenomic library according to the manufacturer's instructions. For function-driven screening of lipolytic genes, transformants were incubated on Luria-Bertani (LB) agar containing 1% (v/v) emulsified tributyrin (C4) as substrate, chloramphenicol (12.5 μg/mL) and 0.1% (w/v) arabinose.
Colonies forming clear halos on the screening medium were picked as candidate producers of lipolytic enzymes, and the colony showing the highest activity was selected for further study.

Subcloning and bioinformatic analysis
The plasmid of the positive clone was partially digested with Sau3A I. The resulting DNA fragments of 1-5 kb were ligated into BamH I-digested pUC118 vector using T4 DNA ligase. The recombinant plasmids were transformed into E. coli DH5α competent cells, and the transformants were screened for lipolytic activity. Positive clones showing clear halos were confirmed by DNA sequencing.
Open reading frames (ORFs) encoding carboxylesterases/lipases were analyzed using the online server ORF Finder (https://www.ncbi.nlm.nih.gov/orffinder/). Sequence similarity was investigated with the online Basic Local Alignment Search Tool (BLAST; https://blast.ncbi.nlm.nih.gov/Blast.cgi). Multiple alignments of sequences with high similarity were carried out using the Clustal Omega online program (https://www.ebi.ac.uk/Tools/msa/clustalo/) and illustrated with the ESPript 3.0 online server (http://espript.ibcp.fr/ESPript/ESPript/). A phylogenetic tree was constructed with the neighbor-joining algorithm using MEGA 6.0 software with 1000 bootstrap replicates. The SignalP 4.0 server was used to predict the existence of signal peptides, and the theoretical molecular weight (MW) was calculated with the ExPASy online server.

Overexpression and purification
The putative esterase-encoding gene was amplified by PCR using the following primers: Est-F/BamH I: CGCGGATCCATGCAGCTGATGTTGTGG, Est-R/Xho I: GCCCTCGAGTTTTGGATTTGTGAAGTC, which contain the BamH I (GGATCC) and Xho I (CTCGAG) recognition sites, respectively. The PCR product was digested with BamH I and Xho I and ligated into pET28a (+) vector digested with the same restriction enzymes. The recombinant plasmid (pET28a-estq7) was transformed into E. coli BL21(DE3) competent cells for overexpression.
A positive transformant was incubated at 37 °C in 100 mL LB medium containing 50 μg/mL kanamycin. When the optical density at 600 nm (OD600) of the culture reached 0.6-0.8, 0.5 mM IPTG was added to induce protein overexpression, and the cells were incubated for a further 24 h at 16 °C. Cells were harvested by centrifugation (10,000 × g, 2 min) and resuspended in binding buffer (50 mM NaH₂PO₄ at pH 8.0 containing 300 mM NaCl). Resuspended cells were lysed by sonication (1 s pulses at 2 s intervals) for 30 min in an ice-water bath, and the crude cell lysate was centrifuged at 12,000 rpm and 4 °C for 20 min. The supernatant was loaded onto a column packed with nickel-nitrilotriacetic acid (Ni-NTA) resin (Sangon Biotech, Shanghai) that had been equilibrated with binding buffer. After the resin was washed with buffers containing 20-100 mM imidazole, the recombinant protein, which binds Ni-NTA with high affinity, was eluted slowly with a buffer containing 250 mM imidazole. All binding and washing buffers used for purification were prepared in accordance with the manufacturer's protocol. Protein samples were analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE).
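As a quick sanity check on the primer design, the recognition sites can be located programmatically. The following is a minimal sketch, not part of the original workflow; the helper names `locate_site` and `mark_site` are hypothetical and introduced only for illustration.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"BamH I": "GGATCC", "Xho I": "CTCGAG"}

# Primer sequences copied from the text (5' to 3'), paired with their enzyme.
PRIMERS = {
    "Est-F": ("BamH I", "CGCGGATCCATGCAGCTGATGTTGTGG"),
    "Est-R": ("Xho I", "GCCCTCGAGTTTTGGATTTGTGAAGTC"),
}


def locate_site(primer: str, site: str) -> int:
    """Return the 0-based index of the first occurrence of site, or -1."""
    return primer.upper().find(site.upper())


def mark_site(primer: str, site: str) -> str:
    """Return the primer with the recognition site wrapped in brackets."""
    i = locate_site(primer, site)
    if i < 0:
        return primer
    return primer[:i] + "[" + primer[i:i + len(site)] + "]" + primer[i + len(site):]


if __name__ == "__main__":
    for name, (enzyme, seq) in PRIMERS.items():
        print(name, enzyme, mark_site(seq, SITES[enzyme]))
```

In both primers the site follows a three-base 5' overhang, which is consistent with the usual practice of adding protective bases upstream of a restriction site.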

Enzyme activity assays
The catalytic activity of the recombinant enzyme was determined by quantifying the amount of pNP released from pNP-acetate (pNPA) by photometric measurement at 405 nm. The reactions were performed in 1 mL of 50 mM Tris-HCl buffer (pH 8.0) with 500 μM pNPA as substrate and an appropriate dilution of the recombinant enzyme EstQ7, and were incubated for 2 min at 50 °C. Reactions were terminated by placing the tubes on ice for 3 min. Controls without enzyme were measured simultaneously to correct for spontaneous hydrolysis of the substrate. All measurements were carried out in triplicate for statistical analysis.
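The conversion from absorbance to enzyme activity can be sketched with the Beer-Lambert law. This is an illustrative calculation only: the text does not state the extinction coefficient of pNP or a unit definition, so the coefficient is left as a parameter that must be calibrated for the actual buffer, pH and temperature, and one unit (U) is assumed here to mean 1 μmol of pNP released per minute. The function name and the example readings are hypothetical.

```python
def pnp_activity_units(a_sample: float, a_blank: float, epsilon_m_cm: float,
                       volume_ml: float, minutes: float,
                       path_cm: float = 1.0) -> float:
    """Return activity in U, assuming 1 U = 1 micromol pNP released per minute.

    epsilon_m_cm is the molar extinction coefficient of pNP at 405 nm
    (M^-1 cm^-1). It is NOT given in the text and must be determined from a
    pNP standard curve under the assay conditions.
    """
    # Subtract the no-enzyme control to remove spontaneous hydrolysis.
    delta_a = a_sample - a_blank
    # Beer-Lambert law: concentration (mol/L) = A / (epsilon * path length).
    conc_molar = delta_a / (epsilon_m_cm * path_cm)
    # Convert concentration to micromoles in the reaction volume.
    micromol = conc_molar * (volume_ml / 1000.0) * 1e6
    return micromol / minutes


if __name__ == "__main__":
    # Hypothetical readings and a placeholder coefficient, for illustration.
    units = pnp_activity_units(a_sample=0.50, a_blank=0.05,
                               epsilon_m_cm=15000.0,
                               volume_ml=1.0, minutes=2.0)
    print(f"activity = {units:.4f} U")
```

The 1 mL volume and 2 min incubation in the example match the assay described above; the absorbance values and the coefficient do not come from the study.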

Substrate specificity
Specificity for fatty acid esters of various chain lengths was investigated using pNPEs (pNP-acetate, C2; pNP-butyrate, C4; pNP-octanoate, C6; pNP-decanoate, C8; pNP-laurate, C12) as substrates. The reaction mixture consisted of 1 mL of 50 mM Tris-HCl buffer (pH 8.0), 500 μM pNPE and 0.5 μg purified enzyme, while blank reactions were conducted without enzyme. The degree of hydrolysis of each reaction mixture was measured at 405 nm after incubation for 2 min at 50 °C.

Effect of pH and temperature
The optimal pH of the recombinant enzyme was assayed at 37 °C in the following buffers: sodium phosphate (pH 6.0-7.5), Tris-HCl (pH 7.2-8.9) and glycine-NaOH (pH 9.6-10.6). pH stability was assessed by incubating the enzyme in different buffer systems (pH 6.0-9.0) for 3 h at 4 °C, and residual activity was then measured at 37 °C. The optimal temperature of the enzyme was investigated at temperatures ranging from 20 to 70 °C in Tris-HCl buffer (pH 8.0). Enzyme thermostability was evaluated by preincubating the enzyme at temperatures between 20 and 60 °C for 1 h before measuring the residual activity. The thermostability assay used the same reaction mixture and substrate as the optimal-temperature assay and was run in parallel with the control tests.

Effect of additives and organic solvents
The effects of different additives and organic solvents on the lipolytic activity of EstQ7 were investigated. The effects of metal ions and surfactants on enzymatic activity were measured at a final concentration of 1 mM or 1% (w/v). The effects of organic solvents on enzymatic activity were determined at final concentrations of 10% and 30% (v/v). The activity measured in the absence of additives and organic solvents was defined as 100%. All quantitative assays were performed in triplicate for statistical analysis.
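The normalization described above can be sketched as follows. This is an assumed procedure, since the text does not specify how replicates were combined: each treated replicate is expressed as a percentage of the mean untreated control, and the mean and sample standard deviation of those percentages are reported. The function name and the readings are hypothetical.

```python
from statistics import mean, stdev


def relative_activity(treated, control):
    """Return (mean %, sample SD %) of treated replicates vs. the control mean.

    The mean of the untreated control replicates is defined as 100%.
    """
    reference = mean(control)
    percentages = [100.0 * value / reference for value in treated]
    return mean(percentages), stdev(percentages)


if __name__ == "__main__":
    # Hypothetical triplicate readings (blank-corrected absorbance changes).
    control = [0.50, 0.52, 0.48]
    with_additive = [0.40, 0.41, 0.39]
    avg, sd = relative_activity(with_additive, control)
    print(f"relative activity = {avg:.1f} ± {sd:.1f} %")
```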

Kinetic parameters
Kinetic parameters of the recombinant enzyme were investigated using pNPEs of varying acyl chain length (C2-C12) as substrates at concentrations ranging from 0.05 to 2 mM. Assays were conducted under optimized conditions (50 °C and pH 8.0). Initial reaction rates were calculated and then fitted to Lineweaver-Burk plots using Excel. The kinetic parameters (Km, Vmax, and kcat/Km) were calculated using the Michaelis-Menten equation (Biver and Vandenbol 2013).
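The Lineweaver-Burk fit linearizes the Michaelis-Menten equation as 1/v = (Km/Vmax)(1/[S]) + 1/Vmax, so Km and Vmax follow from the slope and intercept of a straight-line fit. The sketch below reproduces that calculation in Python rather than Excel, using ordinary least squares; the function name and the synthetic data are hypothetical and serve only to show that the fit recovers known parameters.

```python
def lineweaver_burk(substrate_mM, rates):
    """Estimate (Km, Vmax) from a least-squares fit of 1/v against 1/[S]."""
    x = [1.0 / s for s in substrate_mM]
    y = [1.0 / v for v in rates]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                     # slope = Km / Vmax
    intercept = mean_y - slope * mean_x   # intercept = 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


if __name__ == "__main__":
    # Synthetic, noise-free rates generated from the Michaelis-Menten equation.
    # Km = 0.17 mM is the value reported for pNPA; Vmax = 100 is an arbitrary
    # placeholder in arbitrary rate units.
    true_km, true_vmax = 0.17, 100.0
    concentrations = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]  # mM, range from the text
    rates = [true_vmax * s / (true_km + s) for s in concentrations]
    km, vmax = lineweaver_burk(concentrations, rates)
    print(f"Km = {km:.3f} mM, Vmax = {vmax:.1f}")
```

With real data the double-reciprocal transform gives the lowest substrate concentrations the most weight, so a direct nonlinear fit of the Michaelis-Menten equation is often used as a cross-check.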

Acyltransferase assay
The acyltransferase activity of the recombinant enzyme was determined using purified EstQ7 (0.2 mg/mL), 200 mM vinyl acetate and 50 mM benzyl alcohol in 1 mL of 50 mM Tris-HCl buffer (pH 8.2). The reaction mixtures were shaken at 1400 rpm at room temperature (25 °C) for 20 min. The reactions were quenched by extraction with methyl tert-butyl ether, rapidly vortexed and then centrifuged to separate the phases (>13,000 × g, 1 min). The organic phases were dried over anhydrous sodium sulfate before gas chromatography-mass spectrometry (GC-MS) analysis. The samples were passed through a 0.22 μm organic filter membrane before analysis on a 7890GC/5975MSD GC-MS (Agilent, USA) equipped with an HP-5 column. The GC-MS conditions were as described previously (Muller et al. 2021).

Results
Lipolytic genes screening and subcloning
A fosmid metagenomic library was constructed with environmental DNA (eDNA) extracted from an agricultural soil sample. The library consisted of 30,000 clones and contained about 1.2 Gb of genomic DNA with an average insert size of about 40 kb. Function-based screening was conducted, and the colony that showed the highest activity was selected for further study. To identify the gene responsible for the lipolytic activity, the fosmid was digested and subcloned, and the positive subclone was then sequenced at General Biosystems Co., Ltd. (Anhui, China).
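The reported library size follows directly from the clone count and the average insert size. A short check of that arithmetic, with a hypothetical helper name:

```python
def library_size_gb(clones: int, avg_insert_kb: float) -> float:
    """Return the total cloned DNA in gigabases (1 Gb = 1,000,000 kb)."""
    return clones * avg_insert_kb / 1_000_000


if __name__ == "__main__":
    # Values taken from the text: 30,000 clones, inserts of about 40 kb.
    print(f"library size = {library_size_gb(30_000, 40):.1f} Gb")
```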

Sequence alignment and phylogenetic analysis of EstQ7
The complete length of the subclone DNA sequence was 1960 bp, comprising 11 open reading frames (ORFs). An 1113-bp ORF was identified as the putative lipolytic gene, designated estq7, which encodes a 370-amino-acid protein with a predicted molecular weight of 41.5 kDa. SignalP 4.0 was used to predict whether EstQ7 contains a signal peptide (Petersen et al. 2011). The C, S and Y scores were all greater than 0.5, indicating that the enzyme contains a signal peptide and that EstQ7 is probably an extracellular protein. The sequence of EstQ7 was analyzed with BlastP against the UniProtKB/SwissProt database, the Non-Redundant (NR) protein sequences database and the Protein Data Bank (PDB). UniProtKB/SwissProt analysis indicated that the closest sequence homolog of EstQ7 is a carbohydrate acetyl esterase from Prevotella ruminicola (sequence identity 29.80%, coverage 38%, Accession no. D5EXZ4). Other hits from this database included an S-formylglutathione hydrolase from Escherichia coli (sequence identity 32.92%, coverage 38%, Accession no. Q0TKS8) and an esterase from Mycobacterium tuberculosis (sequence identity 26.85%, coverage 52%, Accession no. P9WM38). NR analysis showed that EstQ7 is most similar to an esterase from an uncultured bacterium (sequence identity 65.96%, coverage 96%, Accession no. AIT69749), which has not been characterized functionally.
Analysis against PDB revealed that the closest sequence homolog of EstQ7 is a putative esterase from Bacteroides intestinalis (identity 34.46%, coverage 38%, PDB code: 5VOL), which has been characterized both biochemically and structurally (Daniel et al. 2017). Thus, estq7 was judged to be a novel lipolytic gene.
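The relation between the 1113-bp ORF and the 370-residue protein reported above can be checked with a short calculation. The sketch assumes the ORF length includes the stop codon, which the text does not state explicitly; the function name is hypothetical.

```python
def protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of encoded amino acids for an ORF of orf_bp bases."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon terminates translation and encodes no residue.
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    residues = protein_length(1113)
    print(f"{residues} amino acids")
```

Under that assumption, 1113 bp corresponds to 371 codons, one of which is the stop codon, leaving 370 residues as reported.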
Multiple alignment of the amino acid sequence of EstQ7 with closely related enzymes was conducted using the Clustal Omega program and ESPript 3.0. The results indicated that EstQ7 contains a putative catalytic triad comprising residues Ser174, Asp306, and His344, which is absolutely conserved in the sequences of its homologs (Fig. 1A). EstQ7 also contains the GXSXG pentapeptide motif, which is characteristic of lipolytic enzymes (Bornscheuer 2002), and Ser174, located within the motif, acts as the nucleophile (Fig. 1A). Phylogenetic analysis revealed that EstQ7 appears to belong to a previously unidentified family of lipolytic enzymes (Fig. 1B). Although the GHSMG motif is a consensus sequence for lipolytic enzymes belonging to the EstA family, EstQ7 did not align with this family except at the pentapeptide motif. Moreover, three uncharacterized putative esterases showing sequence similarity were found to cluster with EstQ7 (Fig. 1B).

Overexpression and characterization of EstQ7
The putative lipolytic gene was overexpressed in E. coli BL21(DE3). The recombinant enzyme fused with a 6×His tag at its C-terminus was purified by affinity chromatography. The purified EstQ7 showed a single band on SDS-PAGE with a MW of 41 kDa, which is consistent with the theoretical value (Fig. 2).
Substrate specificity of EstQ7 toward acyl chains of various lengths was tested using pNPEs (C2-C12). EstQ7 exhibited a preference for short acyl chains (C2-C4), with the strongest activity toward pNPA, while its activity was barely detectable for C12 (Fig. 3). This demonstrated that EstQ7 acts as a carboxylesterase rather than a lipase (Hitch and Clavel 2019).
The effects of pH on the activity of EstQ7 were investigated over the pH range of 6-10. The enzyme had maximal activity at pH 8.2 and was stable within the pH range of 6-8 (Fig. 4A-B). In terms of temperature, EstQ7 had optimal activity at 50 °C (Fig. 4C) and maintained a residual activity of more than 70% up to 60 °C, but its activity decreased dramatically at temperatures above 60 °C (Fig. 4C). Based on these results, it can be concluded that EstQ7 is a moderately thermophilic enzyme.
Kinetic parameters of EstQ7 were determined following Michaelis-Menten kinetics using pNPEs as substrates, and the results are displayed in Table 1. EstQ7 showed the highest affinity and the highest catalytic efficiency toward pNPA, with a Km value of 0.17 mM and a kcat/Km value of 11234 mM⁻¹·s⁻¹. However, the affinity and catalytic efficiency declined as the substrate chain length increased, and the catalytic activity of EstQ7 against C12 was almost undetectable (Table 1).
The effects of various organic solvents on the activity of EstQ7 were tested at final concentrations of 10 and 30% (v/v). In the presence of 10% methanol, ethanol, isopropanol, or N,N-dimethylformamide (DMF), the catalytic activity of EstQ7 was slightly decreased, whereas the addition of 10% acetone or acetonitrile resulted in a 40% reduction of activity, and the reduction was even more pronounced in the presence of 10% 1-butanol (Table 2). The enzymatic activity of EstQ7 decreased in a dose-dependent manner when the solvent concentration was raised from 10 to 30% (v/v) (Table 2).

Effects of additives on the activity of EstQ7
The effects of metal ions and surfactants on the activity of EstQ7 were investigated. The enzymatic activity of EstQ7 was only slightly affected by most of the tested mono- and divalent metal ions at a final concentration of 1 mM (Fig. 5A). The addition of Ca²⁺ slightly enhanced the catalytic activity of EstQ7, while Ba²⁺, Cu²⁺, Mn²⁺, Mg²⁺, Na⁺, NH₄⁺, Fe³⁺ and Co²⁺ caused a minor decrease in its hydrolytic activity of 5-15%, and Zn²⁺ inhibited the activity of EstQ7 by ~35% (Fig. 5A). EstQ7 retained 41% and 69% of its activity in the presence of 1% (v/v) Tween 80 and Triton X-100, respectively, but its activity declined dramatically in the presence of 1% (w/v) SDS or CTAB (Fig. 5B). The enzymatic activity of EstQ7 was completely inhibited by phenylmethylsulfonyl fluoride (PMSF), a known serine hydrolase inhibitor, indicating that a serine residue, conserved in the GXSXG pentapeptide motif, is present in its active site (Sameh et al. 2007; Jayanath et al. 2018).

Acyltransferase assay of EstQ7
Whether EstQ7 has acyltransferase activity was investigated using benzyl alcohol and vinyl acetate as substrates. The reaction mixture containing EstQ7 turned white and turbid after the reaction finished, while the control without EstQ7 remained clear (Fig. 6A). Based on this observation, GC-MS analysis was conducted. As shown in Fig. 6B and C, benzyl acetate was detected, demonstrating that EstQ7 has transesterification activity. Fig. 6D shows the proposed scheme of the acyltransferase reaction catalyzed by EstQ7.

Homology modeling and molecular docking of EstQ7
Three-dimensional models of EstQ7 were created using SWISS-MODEL and I-TASSER. The models obtained were compared, and the one with the highest scores as checked by SAVES was selected for further study. The 3D model of EstQ7 was visualized and examined in PyMOL (Fig. 7A). Molecular docking of the receptor EstQ7 and the ligand pNPA was carried out using AutoDock 4.2. The conformation with the lowest free energy was selected to analyze the protein-ligand interactions. The docking results showed that three hydrogen bonds were formed between the enzyme and the substrate. One hydrogen bond was formed between the putative catalytic residue His344 and the carbonyl oxygen of pNPA (2.59 Å), and the other two were formed between Arg349 and the nitro group of the substrate (2.73 and 3.09 Å). Residues within the active cavity, Thr343, His173, Gly345, and Ile348, likely stabilize substrate binding by forming hydrophobic interactions with the substrate (Fig. 7B, C).

Discussion
The fosmid metagenomic library of the cornfield soil sample was screened for novel lipolytic enzymes using a function-driven strategy, and a novel lipolytic gene, termed estq7, was identified. The amino acid sequence of EstQ7 was novel, showing identities of <66% to lipolytic enzymes from various cultured and uncultured microorganisms. Analysis of the amino acid sequence of EstQ7 indicated that it contains a putative catalytic triad composed of Ser174-Asp306-His344, with the nucleophilic residue Ser174 embedded in the conserved pentapeptide motif GXSXG, which is a typical characteristic of α/β-hydrolases (Arpigny and Jaeger 1999; Buller and Townsend 2013) (Fig. 1A). Although the pentapeptide motif "GHSMG" of EstQ7 is consistent with the EstA esterase family (Chu et al. 2008), EstQ7 cannot be placed in the EstA family branch (Fig. 1B). The updated esterase classification methods also failed to classify it (Sood et al. 2018; Hitch and Clavel 2019), which indicates that EstQ7 and its closely homologous hypothetical esterases cluster together and probably constitute a new family of lipolytic enzymes.
EstQ7 had a preference for short-chain pNPEs, with optimal activity toward pNPA (C2) (Fig. 3). The relative activity of EstQ7 was measured in various buffers and was optimal at pH 8.2, and over 60% of the relative activity was retained from pH 6.5 to 8 after incubation for 3 h (Fig. 4A, B). EstQ7 had optimal activity at 50 °C, and its activity remained stable at temperatures below 40 °C (Fig. 4C, D). Given these results, EstQ7 is considered a moderately thermophilic and alkaliphilic carboxylesterase (Fig. 4).
Biocatalysts that can retain high levels of catalytic activity when exposed to various denaturing conditions are of high importance for industrial applications (Kawata and Ogino 2010; Alex et al. 2014).
Because industrial bioconversions catalyzed by lipolytic enzymes are usually conducted in non-aqueous media (Oh et al. 2012), researchers either search for solvent-tolerant enzymes (Zarafeta et al. 2016; Park et al. 2020) or adopt directed evolution to obtain enzymes that are stable in hydrophilic organic solvents (Hyun et al. 2012). EstQ7 showed moderate tolerance to a range of organic solvents (Table 2). When exposed to 10% (v/v) methanol, ethanol, isopropanol, or DMF, the enzyme retained over 80% of its activity. EstQ7 retained less than 40% of its original activity when exposed to the different solvents at 30% (v/v), indicating a dose-dependent effect (Table 2). EstQ7 exhibited broad tolerance to common metal ions and retained over 90% of its original activity when exposed to Ba²⁺, Mg²⁺, Na⁺, and NH₄⁺. In addition, Ca²⁺ had a slightly stimulatory effect on the catalytic activity of EstQ7, while Zn²⁺ inhibited its activity by about 35% (Fig. 5A). The addition of surfactants inhibited its enzymatic activity to different degrees. The participation of a serine residue in catalysis was demonstrated by the complete loss of enzymatic activity upon addition of the serine hydrolase-specific inhibitor PMSF (Zarafeta et al. 2016) (Fig. 5B).
Acyltransferase activity of EstQ7 was indicated by the appearance of white turbidity, which results from the low water solubility of the reaction product, benzyl acetate (Mestrom et al. 2019). This approach led to the discovery that EstQ7, a novel carboxylesterase, has promiscuous acyltransferase activity. At present, most of the reported lipolytic enzymes with transesterification activity are from the hormone-sensitive lipase family, with a few from the family VIII carboxylesterases (Muller et al. 2021), and EstQ7 is the first member of the GHSMG-motif subfamily shown to exhibit promiscuous acyltransferase activity.

Conclusions
Soil offers abundant microbial diversity for the exploration of novel biocatalysts. A novel carboxylesterase was isolated from a farmland soil metagenomic library and may represent a novel lipolytic enzyme family. The recombinant enzyme was characterized as a moderately thermophilic and alkaliphilic esterase with promiscuous acyltransferase activity. These properties make EstQ7 a candidate worth further attention for industrial applications. This study suggests that the metagenomic approach is a powerful tool for mining novel enzymes with high potential for industrial applications and for broadening our view of biodiversity.
Declarations
Figure 1 Sequence alignment and phylogenetic analysis of EstQ7. (A) Multiple amino acid sequence alignment of EstQ7 and its homologs from cultured and uncultured microorganisms. The absolutely conserved amino acids are highlighted in red, while similar residues are highlighted in yellow. The putative catalytic residues are indicated by blue triangles. Elements of predicted secondary structure of EstQ7 are denoted as α (α-helix), β (β-sheet), η (random coil) and T (β-turn). (B) Phylogenetic tree of EstQ7 and other lipolytic enzymes. The tree was constructed using amino acid sequences of lipases/esterases from the eight different families proposed by Arpigny and Jaeger (Arpigny and Jaeger 1999), the EstA family, and hypothetical proteins showing high identity to EstQ7.
Substrate specificity of EstQ7 towards p-nitrophenyl esters. Cell extracts were assayed against pNP esters of fatty acids with acyl chain lengths varying from C2 to C12. The relative enzyme activity was determined spectrophotometrically at 405 nm. The error bars represent mean ± SD (n = 3).