Construction of a mutant Bacillus subtilis strain for high purity poly-γ-glutamic acid production

We aimed to construct a Bacillus subtilis strain that produces poly-γ-glutamic acid (γ-PGA) of improved purity. Strain GH16 was constructed by knocking out five genes and one operon encoding extracellular proteins in Bacillus subtilis G423. The amount of protein impurities in the γ-PGA produced by the resulting strain GH16/pHPG decreased from 1.48 to 1.39%. Subsequently, the fla-che operon, PBSX, and the yrpD, ywoF and yclQ genes were knocked out successively, yielding the mutant strains GH17, GH18 and GH19. Ultimately, the amount of protein impurities was reduced from 1.48 to 0.83%. In addition, the amount of polysaccharide impurities in the γ-PGA decreased from 2.21 to 1.93% after the epsA-O operon was knocked out. A high-purity γ-PGA producer was thus constructed, and the resulting strain is a promising platform for manufacturing other highly pure extracellular products and secretory proteins.


Introduction
Poly-γ-glutamic acid (γ-PGA), a polypeptide-type biopolymer, is formed by the polymerization of D- and L-glutamate through γ-amide bonds (Ashiuchi et al. 2004). γ-PGA has a broad range of potential applications in medicine, food, cosmetics, agriculture and environmental protection because it is highly water soluble, biodegradable, nontoxic, water-retentive and biocompatible (Shih and Van 2001; Buescher and Margaritis 2007; Yu et al. 2011; Park et al. 2019). However, immunogenic impurities such as proteins and polysaccharides must be completely removed for parenteral applications in order to eliminate the risk of anaphylaxis.
At present, γ-PGA is mainly produced by Bacillus subtilis and related species (Min et al. 2019). γ-PGA is an extracellular product that is secreted into the fermentation broth together with other secreted products such as proteins and polysaccharides. General methods for extracting γ-PGA from the fermentation broth include organic solvent precipitation, salting out and membrane filtration (Goto and Kunioka 1992; Manocha and Margaritis 2010; Kumar and Pal 2015). However, because of their similar macromolecular properties, none of these methods can completely separate proteins and polysaccharides from γ-PGA. As a result, a certain amount of proteins and polysaccharides remains in the final product, and the problem seems difficult to solve by downstream processing alone. Currently, the purity of commercially available γ-PGA for cosmetics is between 92 and 95%, but the content of protein and polysaccharide impurities is not indicated.
Bacillus subtilis can secrete more than 300 extracellular proteins (Tjalsma et al. 2000, 2004), and it shows significant differences in the types and quantities of extracellular proteins under different fermentation conditions (Yamane et al. 2000). It is generally accepted that TasA, Hag, YlqB, XlyA, XkdM, XkdG, YolA, YclQ, Csn, WprA, LipA, FlgK, Epr, Bpr and YncM are abundant extracellular proteins in the fermentation broth (Yamane et al. 2000; Tjalsma et al. 2004). However, the amounts of these highly abundant extracellular proteins have not been determined in the background of γ-PGA synthesis. It may therefore be possible to improve the purity of γ-PGA by removing extracellular proteins that are abundant in the fermentation broth. In our previous work (Ran et al. 2018), we modified B. subtilis by deleting the genes encoding neutral metalloprotease NprE, alkaline serine protease AprE, xylanase XynA, chitosanase Csn and proteins of unknown function (YolA, YolB and YncM). Based on the physiological functions and 2D-PAGE of the extracellular proteins (Yamane et al. 2000; Tjalsma et al. 2004), it could be confirmed that the biofilm constituent proteins TasA and YqxM, lipase LipA, the serine proteases Epr and Bpr, as well as the proteins of unknown function YlqB and YweA, were abundant extracellular proteins in the culture supernatant of B. subtilis. Therefore, the genes encoding these proteins should be deleted in the γ-PGA-producing strain to prevent their synthesis.
Furthermore, the fla-che operon of B. subtilis, consisting of 31 coding sequences, encodes the hook-basal body (HBB), the RNA polymerase recognition factor SigD and other proteins related to chemotaxis (Estacio et al. 1998). In addition, the flgM operon, hag gene and yvyC-fliT operon are responsible for synthesizing the filament proteins and junction proteins (Estacio et al. 1998). These flagellar proteins are abundant in the fermentation broth of B. subtilis (Yamane et al. 2000; Tjalsma et al. 2004), which may affect the purity of the γ-PGA product. Notably, the expression of the flgM operon, hag gene and yvyC-fliT operon is dependent on the presence of the SigD factor (Mukherjee and Kearns 2014). Therefore, the genes encoding flagellar proteins would be silenced if the fla-che operon were deleted.
PBSX is a prophage integrated into the chromosome of B. subtilis 168, with a size of 28 kb (Krogh et al. 1996). Some proteins encoded by PBSX genes, such as XkdG, XkdK and XlyA, were also detected in the fermentation broth of B. subtilis (Yamane et al. 2000; Tjalsma et al. 2004). Therefore, it might be necessary to delete part of the PBSX sequence to reduce the amount of extracellular proteins.
In addition to proteins, exopolysaccharides (EPS) are also abundant impurities that affect the quality of γ-PGA. EPS mainly include polysaccharides that make up the biofilm of B. subtilis (Nagorska 2010). EPS is a complex mixture of polysaccharides with a molecular weight between 500 and 2000 kDa (Sutherland 2001; Marvasi et al. 2010), and its synthesis is related to the epsA-O operon, which is approximately 15 kb in length and contains 15 genes encoding a variety of glycosyltransferases (Guttenplan 2010). It is difficult to remove EPS using downstream purification technology, but preventing upstream EPS synthesis by gene knockout may increase the purity of γ-PGA.
In this work, we aimed to reduce the amounts of extracellular proteins and polysaccharides by knocking out relevant biosynthetic genes from the genome of B. subtilis. The strain construction process is illustrated in Fig. 1. The effects of the deletion of these genes on cell growth, γ-PGA synthesis and γ-PGA purity were also investigated. The results obtained here may be useful for the industrial production of γ-PGA. Finally, we provide perspectives for the application of the constructed B. subtilis strains as cell factories for the synthesis of highly pure secreted proteins.

Gene knockout procedure
The marker-free gene deletion was conducted according to standard procedures (Liu 2008). Competent cells were prepared using the classical method of Anagnostopoulos and Spizizen (1961). The specific genetic manipulation technique used in this study was described previously (Zhu et al. 2015). All fragments were amplified using the genomic DNA of B. subtilis TD01m as the template. The oligonucleotide primers used in this study are listed in Supplementary Table 2.
As an example, the ywoF gene was deleted as follows. First, the 1.2 kb upstream homologous fragment (U) of the ywoF gene was amplified by PCR using the primer pair ywoUP1/ywoUP2, and the 0.75 kb internal homologous fragment (D) was amplified using the primer pair ywoDN1/ywoDN2. Second, the 2.1 kb selection marker cassette (CR, a fragment composed of the chloramphenicol resistance gene cat and the arabinose operon repressor gene araR) was amplified using the primer pair ywoCR1/ywoCR2, and the 1.0 kb downstream homologous fragment (G) was amplified using the primer pair ywoG1/ywoG2. The four fragments were then joined in the order U-D-CR-G by overlap-extension PCR using the primer pair ywoUP1/ywoG2, and the resulting fragment UDCRG was used to transform competent cells of B. subtilis. A chloramphenicol-resistant colony was selected and incubated in LB medium for 4 h, during which an intragenomic single-crossover event caused the CRG fragment to be ejected. Finally, the ywoF-deficient strain was selected based on neomycin resistance.
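The fragment order and nominal size of the ywoF deletion construct can be tallied with a minimal sketch. The fragment names and lengths come from the text; the `Fragment` class and `assemble` helper are illustrative names only, and the simple sum ignores the short primer overlaps consumed during overlap-extension PCR, so the real product is expected to differ slightly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    name: str
    length_kb: float


# Fragment sizes as stated in the text for the ywoF deletion construct.
U = Fragment("U", 1.2)     # upstream homologous fragment
D = Fragment("D", 0.75)    # internal homologous fragment
CR = Fragment("CR", 2.1)   # cat-araR selection marker cassette
G = Fragment("G", 1.0)     # downstream homologous fragment


def assemble(fragments):
    """Join fragments in the given order; return (label, nominal length in kb)."""
    label = "".join(f.name for f in fragments)
    total_kb = round(sum(f.length_kb for f in fragments), 2)
    return label, total_kb


# Order used in the text: U-D-CR-G.
construct = assemble([U, D, CR, G])
print(construct)
```

A nominal length of this kind is mainly useful as a sanity check on the band expected after the fusion PCR.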
In special cases, the available homology arms had to be short because the target gene sequence was also short (< 600 bp), which would result in a low rate of homologous recombination. In such cases, the target gene sequence was replaced with the erythromycin-resistance gene emr, which in turn was knocked out using the marker-free deletion method.
The concentration of γ-PGA in the fermentation broth was measured using high performance liquid chromatography (HPLC) on a 1260 instrument (Agilent, USA) with a PL aquagel-OH column (60, 8 μm; Agilent) and a 1260 G1362A refractive index detector (Agilent). Sodium nitrate solution (8.5 g sodium nitrate l⁻¹) was used as the mobile phase at a flow rate of 1.0 ml/min at 30 °C. The fermentation broth was diluted 60-fold with deionized water before analysis, and the injection volume was 50 μl. For quantitative analysis of the γ-PGA content, a series of γ-PGA standard solutions (0.25, 0.50, 0.75, 1.00 and 1.25 g γ-PGA l⁻¹) were analyzed under the same conditions, and the resulting peak areas and sample concentrations were fitted to obtain a linear equation (r² = 0.998). The concentrations of γ-PGA in the fermentation broth were calculated from this linear equation, where A is the peak area (× 10⁻⁵) of γ-PGA.
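The calibration step can be illustrated with a minimal least-squares sketch. The standard concentrations and the 60-fold dilution come from the text; the peak areas below are hypothetical placeholders, not measured data, and multiplying by the dilution factor after the fit is an assumption, since the text does not say whether the dilution was folded into the fitted coefficients.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Standard concentrations from the text (g gamma-PGA per litre).
STANDARDS = [0.25, 0.50, 0.75, 1.00, 1.25]
# HYPOTHETICAL peak areas (already scaled by 1e-5); not measured data.
PEAK_AREAS = [1.02, 2.01, 3.05, 3.98, 5.03]
# The broth was diluted 60-fold before injection, per the text.
DILUTION = 60

# Fit concentration as a function of peak area.
SLOPE, INTERCEPT = fit_line(PEAK_AREAS, STANDARDS)


def broth_concentration(peak_area):
    """gamma-PGA in the undiluted broth (g/l) from a diluted sample's peak area."""
    return (SLOPE * peak_area + INTERCEPT) * DILUTION


print(round(broth_concentration(3.0), 2))
```

Samples whose peak area falls outside the range spanned by the standards would need a different dilution rather than extrapolation.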

Preparation of the γ-PGA product
Ethanol precipitation was used to extract γ-PGA from the fermentation broth as described previously (Goto and Kunioka 1992). To remove small-molecule impurities, 5 ml of the crude γ-PGA product was transferred to a dialysis bag (YA1073; Solarbio, China) and dialyzed for 20 h against 1.0 l of ultrapure water. Then, the retained solution was vacuum-dried to obtain the γ-PGA product. The obtained product was used as the γ-PGA standard in this study.

Analysis of protein impurities in γ-PGA product
An aqueous solution of the produced γ-PGA with a concentration of 5 g γ-PGA l⁻¹ was prepared as the test sample. The protein content of the sample was measured using a Bradford protein concentration assay kit (PC0015, Solarbio). For quantitative analysis of the protein content, a series of bovine serum albumin (BSA) standard solutions (0.04, 0.08, 0.12, 0.16 and 0.20 g BSA l⁻¹) were analyzed. The obtained absorbance at 595 nm (A₅₉₅) and sample concentration were then fitted to obtain a linear equation. The amount of protein impurities in the γ-PGA product was calculated according to the formula:
$$Y\,(w/w) = \frac{0.3039A - 0.0087}{5} \times 100\%$$
where A is the A₅₉₅ value of the sample, r² = 0.993.

Analysis of polysaccharide impurities in the γ-PGA product
An aqueous solution of the γ-PGA product with a concentration of 5 g l⁻¹ was prepared as the test sample. The polysaccharide content was measured using the phenol-sulfuric acid method (DuBois et al. 1956). For quantitative analysis of the polysaccharide content, a series of glucose standard solutions (0.01, 0.02, 0.04, 0.06 and 0.08 g glucose l⁻¹) were analyzed. The obtained A₄₉₀ value and sample concentration were fitted to obtain a linear equation. Considering the addition of water molecules when polysaccharides are hydrolyzed, the amount of polysaccharide impurities in the γ-PGA was calculated according to the following formula:
$$Z\,(w/w) = \frac{0.0685A + 0.0007}{5} \times 0.9 \times 100\%$$
where A is the A₄₉₀ value of the sample, and 0.9 is the conversion factor related to the addition of water molecules, r² = 0.999.

SDS-PAGE analysis of the extracellular protein extracts
Strains without plasmids were cultured for 40 h, and extracellular proteins in the fermentation broth were extracted by ethanol precipitation. The pellet was dissolved in deionized water and concentrated 40-fold. Then, 20 μl of the resulting protein solution was mixed with an equal volume of 2 × loading buffer (P1019, Solarbio) and boiled for 12 min, after which 15 μl of the mixed sample was loaded and subjected to sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE).

Mass spectrometry analysis of extracellular proteins
The extracellular protein extract was separated by SDS-PAGE, followed by excision of the protein bands and in-gel digestion as described previously (Yu et al. 2005). Then, the samples were analyzed by ultra-performance liquid chromatography (UPLC) on an Easy nLC 1200 system (Thermo Scientific, USA) with an Easy-spray column (C18, 2 μm × 75 μm × 15 cm) and a Q-Exactive HF mass spectrometer (Thermo Scientific, USA). Solution A (0.1% formic acid/H₂O) and solution B (0.1% formic acid/acetonitrile) were used as the mobile phase at a flow rate of 600 nl min⁻¹, and the gradient elution was performed as shown in Supplementary Table 4. The mass spectrometry parameters were set as follows: ion spray voltage, 2.3 kV; capillary temperature, 250 °C; analysis mode, Full MS/dd-MS2. The software Mascot 2.5.1 (Matrix Science, USA) was used to analyze the mass spectral data based on the UniProt database for Bacillus subtilis (strain 168).
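The two impurity formulas from the methods (Y for protein, Z for polysaccharide) translate directly into code. The sketch below uses the coefficients, the 5 g l⁻¹ sample concentration and the 0.9 conversion factor exactly as given in the text; the function names and the example absorbance values are illustrative assumptions.

```python
# Sample concentration of the gamma-PGA solution, as stated in the text (g/l).
GAMMA_PGA_G_PER_L = 5.0
# Conversion factor for water added on polysaccharide hydrolysis, from the text.
HYDRATION_FACTOR = 0.9


def protein_impurity_pct(a595):
    """Protein impurity Y (% w/w) from the Bradford absorbance at 595 nm."""
    protein_g_per_l = 0.3039 * a595 - 0.0087
    return protein_g_per_l / GAMMA_PGA_G_PER_L * 100


def polysaccharide_impurity_pct(a490):
    """Polysaccharide impurity Z (% w/w) from the phenol-sulfuric A490 value."""
    glucose_equiv_g_per_l = 0.0685 * a490 + 0.0007
    return glucose_equiv_g_per_l / GAMMA_PGA_G_PER_L * HYDRATION_FACTOR * 100


# Illustrative absorbance readings (hypothetical, not measured values).
print(round(protein_impurity_pct(0.3), 4))
print(round(polysaccharide_impurity_pct(1.0), 4))
```

Both functions assume the reading lies within the linear range of the corresponding standard curve.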

Results and discussion
Effects of the sequential knockout of genes encoding secreted proteins
The tasA-yqxM operon and the lip, epr, bpr, ylqB and yweA genes were successively knocked out from strain G423 to obtain the deficient strain GH16. After the γ-PGA expression plasmid pHPG was introduced into GH16, shake flask fermentation of G423/pHPG and GH16/pHPG showed that the deletion of these genes had no significant effect on cell growth or γ-PGA synthesis (Fig. 2A). Furthermore, the conversion rate, defined as the ratio of γ-PGA production to glucose and glutamate consumption, changed only slightly (Fig. 3A; Table 1). However, the amount of protein impurities in γ-PGA produced by G423/pHPG and GH16/pHPG was 1.48% and 1.39%, respectively, representing a decrease of 6.08% (Fig. 3B; Table 1). SDS-PAGE analysis of the extracellular proteins extracted from the fermentation broth of G423 and GH16 showed no significant difference in the protein bands (Fig. 4). Consequently, TasA, YqxM, Epr, Bpr, LipA, YlqB and YweA were not the major extracellular proteins, and deleting the corresponding genes improved the purity of γ-PGA only slightly.

Effects of deleting the fla-che operon
The flagellar protein-deficient strain GH17 was constructed by deleting the first 30 genes of the fla-che operon in strain GH16, while the terminal ylxL gene was retained. The deletion of the fla-che operon reduced the amount of protein impurities in γ-PGA to 1.24% and promoted the synthesis of γ-PGA (Table 1; Fig. 3A). Compared to the GH16/pHPG strain, the amount of protein impurities decreased by 10.8% (Fig. 3B), indicating that the deletion of the fla-che operon can help reduce the synthesis of extracellular proteins and improve the purity of the γ-PGA product.
The biomass of strain GH17 in shake-flask fermentation was slightly higher than that of strain GH16 (Fig. 2A). The lack of flagella was likely to reduce the metabolic load of the cells, resulting in a higher biomass. Accordingly, the glucose conversion rate based on γ-PGA production increased from 72.33 to 80.22% (Table 1). At the same time, the conversion rate of glutamic acid increased from 94.67 to 97% (Table 1), which might be related to the inhibition of flagellin synthesis. In addition, the GH17 cells exhibited a filamentous phenotype during the exponential growth phase (Fig. 5), but they gradually became shorter and returned to normal in the stationary phase. This phenomenon may be associated with autolysin, which is encoded by the lytA-lytF genes in Bacillus subtilis. The SigD factor is necessary for lytA-lytF expression, and the separation of the daughter cells during division requires autolysin (Mukherjee and Kearns 2014). Because SigD is absent in strain GH17, the lytA-lytF genes could not be expressed, and thus the daughter cells could not separate normally. Furthermore, the mechanism by which cell morphology returns to normal in the stationary phase may be related to the function of the peptidoglycanase gene cwlO (Mitsui et al. 2011), which should be investigated in future studies.

Effects of the partial deletion of PBSX
A total of 26.1 kb of PBSX sequence, from gene xkdB to xlyA, was deleted from strain GH17, resulting in strain GH18. The biomass of GH18 was slightly lower than that of GH17 throughout the fermentation process (Fig. 2B), and the maximum biomass of GH18 was lower than that of GH17 by 1.37 OD₆₀₀ units. At the same time, the γ-PGA yield of GH18/pHPG also decreased slightly, by 2.7% (Fig. 3B; Table 1). These changes could be attributed to the large-scale absence of PBSX, which reflects the complexity of the physiological relationship between the prophage and the host. However, the amount of protein impurities in γ-PGA extracted from the fermentation broth of GH18/pHPG was 1.07%, a decrease of 13.7%. Overall, PBSX deletion is necessary to reduce the amount of extracellular proteins and improve the purity of the γ-PGA product, and its mild negative effects can be ignored.

The extracellular proteins secreted by the GH18 strain
Approximately 1.0% of protein still remained in the γ-PGA extracted from the fermentation broth of the GH18/pHPG strain. Therefore, high-resolution mass spectrometry was used to analyze the extracellular proteins of strain GH18. A total of 449 different proteins were identified in the culture supernatant. The abundance of each protein was characterized by the peak area of its mass spectrum, and the results showed that YrpD was the most abundant. We then used the peak area of YrpD as a benchmark and defined the relative abundance of each protein as the ratio of its peak area to that of YrpD. By this measure, 60 proteins had a relative abundance of more than 0.2% (Supplementary Table 4). Among them, some proteins had a relative abundance of more than 3%, including YxaL, YwoF, YjdB, L-lactate dehydrogenase Ldh and YoqM. However, YjdB, Ldh and YoqM had never been isolated or identified by 2D-PAGE (Yamane et al. 2000; Tjalsma et al. 2004). In addition, the order of protein spot sizes in the previously reported 2D gel map was YxaL > YrpD > YwoF, which differed from our observation. It is worth noting that the proteins whose encoding genes had been deleted in strain GH18, such as Hag, FlgK, FliD, XkdG, XkdK and XlyA, were not detected. This confirmed the validity of the preceding gene knockouts at the proteomic level.
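The relative-abundance definition used here (peak area of a protein divided by the peak area of the most abundant protein) can be sketched as follows. The peak areas are hypothetical placeholders chosen only for illustration, and `ProteinX` is an invented name; the 0.2% cutoff is the one stated in the text.

```python
def relative_abundance(peak_areas):
    """Relative abundance (%) of each protein against the largest peak area."""
    benchmark = max(peak_areas.values())
    return {name: area / benchmark * 100 for name, area in peak_areas.items()}


def above_threshold(rel, threshold_pct=0.2):
    """Names of proteins above the cutoff, most abundant first."""
    kept = [name for name, value in rel.items() if value > threshold_pct]
    return sorted(kept, key=lambda name: rel[name], reverse=True)


# HYPOTHETICAL peak areas; "ProteinX" is an invented placeholder name.
areas = {"YrpD": 5.0e10, "YxaL": 2.0e9, "Ldh": 1.65e9, "ProteinX": 5.0e7}

rel = relative_abundance(areas)
print(above_threshold(rel))
```

Normalizing to the top peak makes the ranking independent of the absolute signal, but it does not correct for differences in ionization efficiency between proteins.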
As can be seen in Supplementary Table 4, a number of non-secretory proteins with high abundance were identified by mass spectrometry, such as L-lactate dehydrogenase (relative abundance 3.3%), acetolactate synthase (relative abundance 1.8%) and enolase (relative abundance 1.4%). These are intracellular proteins involved in central carbon metabolism, and were not previously reported to be present in the culture supernatant of B. subtilis. We assumed that they leaked from autolyzed or otherwise damaged cells under the fermentation conditions.

Deletion of genes encoding the extracellular proteins YrpD, YwoF and YclQ
Based on the results of mass spectrometry, the genes encoding the three proteins of unknown function YrpD, YwoF and YclQ were selected for deletion in strain GH18, and the resulting strains GH19 and GH19/pHPG were both fermented. The fermentation results showed that the biomass of strain GH19 was slightly higher than that of GH18 and slightly lower than that of GH17 (Fig. 2B). There was also no adverse effect on the γ-PGA yield or the conversion rate of glucose and glutamic acid (Table 1). The amount of protein impurities in γ-PGA produced by the GH19/pHPG strain was 22.4% lower than that of GH18/pHPG. From the perspective of the relative abundance of proteins, the decrease in protein impurities in the γ-PGA product of GH19/pHPG was mainly attributed to the absence of YrpD and YwoF. As shown in Fig. 4, the SDS-PAGE results for the extracellular proteins support this conclusion: the protein bands around 17 kDa and 58 kDa almost disappeared. The sizes of these two protein bands were close to those of YrpD and YwoF, whose theoretical molecular masses are 24.9 kDa and 51.4 kDa, respectively.
The high-resolution mass spectrometry thus provided precise and reliable data. The purity of the γ-PGA product is expected to improve further if the synthesis of additional extracellular proteins is reduced on the basis of these data.
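The stepwise and cumulative reductions in protein impurities reported across these subsections can be checked with simple arithmetic. The impurity values are those given in the text (0.83% for GH19/pHPG is the final value stated in the abstract); the helper name `relative_decrease` is illustrative.

```python
# Protein impurity (% w/w) in gamma-PGA per strain, as reported in the text.
# The GH19/pHPG value is the final figure quoted in the abstract.
PROTEIN_PCT = [
    ("G423/pHPG", 1.48),
    ("GH16/pHPG", 1.39),
    ("GH17/pHPG", 1.24),
    ("GH18/pHPG", 1.07),
    ("GH19/pHPG", 0.83),
]


def relative_decrease(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100


# Decrease at each knockout step, relative to the preceding strain.
steps = [
    (prev_name, next_name, round(relative_decrease(prev_pct, next_pct), 1))
    for (prev_name, prev_pct), (next_name, next_pct)
    in zip(PROTEIN_PCT, PROTEIN_PCT[1:])
]

# Cumulative decrease from the starting strain to the final strain.
overall = round(relative_decrease(PROTEIN_PCT[0][1], PROTEIN_PCT[-1][1]), 1)

for prev_name, next_name, pct in steps:
    print(f"{prev_name} -> {next_name}: {pct}%")
print(f"overall: {overall}%")
```

Each step is expressed relative to its immediate parent strain, so the individual percentages do not add up to the overall figure.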

Deletion of the epsA-O polysaccharide synthesis operon
Using the phenol-sulfuric acid method, it was found that the γ-PGA product contained 2.21% polysaccharides (Table 1), which apparently originated from extracellular polysaccharides (EPS) secreted by B. subtilis. EPS are among the major extracellular polysaccharides (Nagorska et al. 2010), and are the main component of bacterial biofilms. In this study, the EPS synthesis operon epsA-O was knocked out.
The 15 kb epsA-O operon was deleted in strain GH18 to obtain the EPS-deficient strains GH21 and GH21/pHPG. Shake flask fermentation showed that the deletion of the epsA-O operon did not affect cell growth (Fig. 2B). The amount of polysaccharide impurities in the γ-PGA product was reduced to 1.93%, representing a decrease of 11%. At the same time, the γ-PGA yield of GH21/pHPG was unexpectedly reduced by 9.8%, and the protein impurity content was almost unchanged compared to that of GH18/pHPG (Table 1; Fig. 3B). Therefore, because deleting the epsA-O operon did not effectively reduce the polysaccharide content of the γ-PGA product and also reduced the γ-PGA yield, it was not necessary to knock out the epsA-O operon in the γ-PGA production strain.
In addition, colonies of GH21 still showed the characteristic glossy and moist surface, with no difference compared to strain GH18. These results differed from previous observations in Bacillus amyloliquefaciens (Feng et al. 2015) and suggest that, in addition to EPS, B. subtilis 168 secretes a considerable amount of other polysaccharides. The types of extracellular polysaccharides that affect the purity of γ-PGA should therefore be explored in more detail in the future.

Conclusions
To the best of our knowledge, this is the first report of a systematic modular gene engineering method for improving the purity of γ-PGA. We reduced the amount of protein impurities in the γ-PGA produced by B. subtilis by 43.9% (from 1.48 to 0.83%), which significantly improved the purity of γ-PGA. This should reduce the difficulty of γ-PGA purification and facilitate its application in the medical field. The multigene-knockout strain constructed in this study is also suitable for the production of other valuable proteins with strict purity requirements. Therefore, the strategy of constructing a cell factory with a simple background secretome is of great significance for biomanufacturing.