Gene identification and cloning
A putative polysaccharide deacetylase gene from Bacillus megaterium (Gene ID: NZ_CP009920.1) was used as the reference gene. The Bacillus aryabhattai whole genome (Genome ID: NZ_JYOO01000001.1) was searched for this gene using the genome BLAST server. The search returned a hit with 100% query coverage and 97% identity. The sequence was annotated in NCBI, in the third-party section of the DDBJ/ENA/GenBank databases, under accession number TPA: BK010747 (Pawaskar et al. 2021).
Using the designed primers, BaCDA (~765 bp) was amplified under optimized PCR conditions. The BaCDA sequence was confirmed by Sanger sequencing from the T7 promoter and terminator regions of the vector. After confirmation, the vector construct was transformed into E. coli Rosetta pLysS cells for expression.
Extracellular expression of BaCDA
In TB basal medium, glucose was completely depleted within 24 h of fermentation. This was followed by a second lag phase of 4 h, after which lactose consumption and auto-induction began in a second log phase that lasted a further 20 h. The maximum biomass yield and total CDA activity, 22.26 ± 0.98 g/L and 84.67 ± 0.56 U/L respectively, were reached in the stationary phase at 52 h of fermentation (Pawaskar et al. 2021).
Process optimization of expression of extracellular recombinant enzyme chitin deacetylase in E. coli using central composite design (CCD)
In the current study, five factors were considered: induction temperature (A), agitation rate (B), induction time (C), glucose concentration (D), and lactose concentration (E) (Table 1). A, B, C, D, and E were treated as the independent (exogenous) factors, while the total activity of extracellular recombinant chitin deacetylase was chosen as the response (endogenous factor). The expression of extracellular recombinant chitin deacetylase in E. coli Rosetta pLysS cells was quantified and is reported as the “total activity (U/L)” of recombinant chitin deacetylase (Table 2). The biomass yield, CDA expression, total protein content, and specific activity for all experimental runs are shown in Table 3. Expression was estimated by SDS-PAGE and quantified using ImageJ software. The total activity at each condition was determined using the acetate assay and is shown alongside the SDS-PAGE results in Fig. 1.
Table 3
Biomass yield, CDA expression, total protein content, and specific activity for all experimental runs of the central composite design.
Run order | Biomass (g/L) | Expression | Total protein (mg/L) | Specific activity (U/mg) |
1 | 11.60 ± 0.30 | 1.000 | 2280.20 ± 103.90 | 0.022 |
2 | 10.20 ± 0.60 | 0.992 | 2083.50 ± 99.75 | 0.023 |
3 | 16.54 ± 1.27 | 0.887 | 2697.80 ± 107.40 | 0.010 |
4 | 14.80 ± 0.90 | 0.923 | 2639.70 ± 65.35 | 0.015 |
5 | 11.88 ± 0.94 | 1.115 | 2288.40 ± 35.20 | 0.026 |
6 | 13.36 ± 1.18 | 0.739 | 2618.00 ± 98.10 | 0.004 |
7 | 20.98 ± 0.99 | 0.973 | 4766.00 ± 84.70 | 0.009 |
8 | 17.88 ± 0.06 | 0.810 | 3430.80 ± 28.90 | 0.003 |
9 | 9.30 ± 0.35 | 0.830 | 1794.00 ± 92.26 | 0.013 |
10 | 8.60 ± 0.70 | 1.051 | 1706.60 ± 22.00 | 0.034 |
11 | 17.20 ± 0.90 | 0.848 | 2916.00 ± 77.16 | 0.009 |
12 | 12.60 ± 0.20 | 1.229 | 2353.00 ± 50.18 | 0.042 |
13 | 14.80 ± 0.60 | 1.420 | 2652.00 ± 99.88 | 0.048 |
14 | 12.72 ± 1.36 | 0.919 | 2369.40 ± 49.80 | 0.016 |
15 | 18.04 ± 0.48 | 1.042 | 3741.70 ± 99.65 | 0.015 |
16 | 17.94 ± 1.03 | 0.894 | 3528.30 ± 74.93 | 0.010 |
17 | 6.60 ± 0.70 | 0.517 | 1663.00 ± 55.44 | 0.001 |
18 | 16.62 ± 0.69 | 0.653 | 2842.20 ± 58.60 | 0.001 |
19 | 9.66 ± 0.67 | 1.160 | 1816.10 ± 101.65 | 0.034 |
20 | 18.46 ± 0.77 | 1.088 | 3810.60 ± 73.75 | 0.016 |
21 | 11.02 ± 0.99 | 0.718 | 2176.80 ± 50.40 | 0.002 |
22 | 12.76 ± 0.62 | 0.965 | 2391.60 ± 36.16 | 0.018 |
23 | 12.94 ± 1.03 | 0.913 | 2460.90 ± 118.65 | 0.015 |
24 | 12.20 ± 0.90 | 1.184 | 2308.50 ± 100.25 | 0.030 |
25 | 10.98 ± 0.51 | 1.333 | 2167.20 ± 99.90 | 0.053 |
26 | 13.04 ± 0.48 | 1.522 | 2544.10 ± 44.85 | 0.079 |
27 | 12.58 ± 0.71 | 1.313 | 2338.80 ± 97.50 | 0.048 |
28 | 12.30 ± 0.85 | 1.131 | 2358.80 ± 111.50 | 0.052 |
29 | 12.80 ± 0.60 | 1.282 | 2321.80 ± 55.19 | 0.057 |
30 | 12.45 ± 0.78 | 1.818 | 2345.80 ± 94.39 | 0.053 |
31 | 12.65 ± 0.68 | 1.133 | 2319.80 ± 56.09 | 0.053 |
32 | 12.72 ± 0.64 | 1.003 | 2342.80 ± 45.74 | 0.050 |
All runs were performed in triplicate; values are reported as mean ± standard deviation.
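Table 3 lists specific activity and total protein but not the total activity itself. As a rough cross-check, and assuming the usual definition of specific activity as total activity divided by total protein (the text does not state the formula used), the total activity of a run can be back-calculated. The function name below is illustrative only, and because specific activity is reported to three decimals the result is approximate.

```python
def total_activity_u_per_l(specific_activity_u_per_mg, total_protein_mg_per_l):
    """Back-calculate total activity (U/L).

    Assumption (not stated in the text):
    specific activity (U/mg) = total activity (U/L) / total protein (mg/L).
    """
    return specific_activity_u_per_mg * total_protein_mg_per_l


# (run order, specific activity in U/mg, mean total protein in mg/L), from Table 3
runs = [(13, 0.048, 2652.00), (17, 0.001, 1663.00), (26, 0.079, 2544.10)]

for run, spec_act, protein in runs:
    activity = total_activity_u_per_l(spec_act, protein)
    print(f"Run {run}: approx. {activity:.1f} U/L")
```

Under this assumption, the run with the highest specific activity (run 26) also has the highest back-calculated total activity of the three runs shown.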
Initially, the full second-order regression model obtained for the total activity of recombinant chitin deacetylase was significant, with a high coefficient of determination (R²) of 0.9595. However, the predicted R² was 0%, indicating that the model lacked predictive ability. The regression equation is given below:
Z1 (Total activity, U/L) = -1844 + 99.7 A + 2.64 B + 36.33 C + 808 D + 55 E - 1.842 A² - 0.00925 B² - 0.3801 C² - 26676 D² + 234.9 E² + 0.0562 A x B - 0.609 A x C + 44.8 A x D - 10.32 A x E - 0.0203 B x C + 0.83 B x D - 0.979 B x E + 28.2 C x D + 3.10 C x E + 432 D x E. (2)
Where A: Induction temperature, °C; B: Agitation rate, rpm; C: Induction time, h; D: Glucose concentration, %(w/v); and E: Lactose concentration, % (w/v).
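To make the structure of Equation 2 concrete, the sketch below evaluates it at one set of factor settings. It assumes the equation is expressed in the natural (uncoded) units listed above, which the magnitudes of the coefficients suggest but the text does not state. The function name is illustrative, and the settings used are the optimal conditions reported later in the validation section.

```python
def z1_total_activity(A, B, C, D, E):
    """Full second-order model (Equation 2) for total activity, U/L.

    Assumed units: A in deg C, B in rpm, C in h, D and E in % (w/v).
    Coefficients are copied from the rounded values printed in Equation 2.
    """
    return (
        -1844 + 99.7 * A + 2.64 * B + 36.33 * C + 808 * D + 55 * E
        - 1.842 * A**2 - 0.00925 * B**2 - 0.3801 * C**2
        - 26676 * D**2 + 234.9 * E**2
        + 0.0562 * A * B - 0.609 * A * C + 44.8 * A * D - 10.32 * A * E
        - 0.0203 * B * C + 0.83 * B * D - 0.979 * B * E
        + 28.2 * C * D + 3.10 * C * E + 432 * D * E
    )


# Reported optimum: 22 deg C, 128 rpm, 30 h, 0.058% glucose, 1% lactose
print(round(z1_total_activity(22, 128, 30, 0.058, 1.0), 1))
```

With the rounded coefficients, this evaluates to about 195 U/L. The value is not expected to match the reported prediction of 190.85 U/L exactly, since that prediction comes from the reduced model (Equation 3).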
The results of the analysis of variance (ANOVA) of the full model (Equation 2) are given in Table 4.
Table 4
Analysis of variance (ANOVA) for the full model.
Source | DF | Seq SS | Contribution, % | Adj SS | Adj MS | F-value | P-value | |
Model | 20 | 70183.3 | 95.95 | 70183.3 | 3509.2 | 13.03 | 0 | S |
Linear | 5 | 4790.8 | 6.55 | 4790.8 | 958.2 | 3.56 | 0.037 | S |
A | 1 | 214.6 | 0.29 | 214.6 | 214.6 | 0.8 | 0.391 | NS |
B | 1 | 267.3 | 0.37 | 267.3 | 267.3 | 0.99 | 0.341 | NS |
C | 1 | 301.6 | 0.41 | 301.6 | 301.6 | 1.12 | 0.313 | NS |
D | 1 | 2395.8 | 3.28 | 2395.8 | 2395.8 | 8.89 | 0.012 | S |
E | 1 | 1611.5 | 2.2 | 1611.5 | 1611.5 | 5.98 | 0.032 | S |
Square | 5 | 53958.5 | 73.77 | 53958.5 | 10791.7 | 40.07 | 0 | S |
A² | 1 | 20884 | 28.55 | 25489.5 | 25489.5 | 94.64 | 0 | S |
B² | 1 | 4699.7 | 6.43 | 6426.6 | 6426.6 | 23.86 | 0 | S |
C² | 1 | 16785.4 | 22.95 | 17360 | 17360 | 64.45 | 0 | S |
D² | 1 | 9000.3 | 12.3 | 8154 | 8154 | 30.27 | 0 | S |
E² | 1 | 2589.2 | 3.54 | 2589.2 | 2589.2 | 9.61 | 0.01 | S |
2-Way Interaction | 10 | 11434 | 15.63 | 11434 | 1143.4 | 4.25 | 0.013 | S |
A x B | 1 | 1295.1 | 1.77 | 1295.1 | 1295.1 | 4.81 | 0.051 | NS |
A x C | 1 | 6078.9 | 8.31 | 6078.9 | 6078.9 | 22.57 | 0.001 | S |
A x D | 1 | 321.5 | 0.44 | 321.5 | 321.5 | 1.19 | 0.298 | NS |
A x E | 1 | 1090.8 | 1.49 | 1090.8 | 1090.8 | 4.05 | 0.069 | NS |
B x C | 1 | 677 | 0.93 | 677 | 677 | 2.51 | 0.141 | NS |
B x D | 1 | 11.1 | 0.02 | 11.1 | 11.1 | 0.04 | 0.843 | NS |
B x E | 1 | 981.1 | 1.34 | 981.1 | 981.1 | 3.64 | 0.083 | NS |
C x D | 1 | 510.7 | 0.7 | 510.7 | 510.7 | 1.9 | 0.196 | NS |
C x E | 1 | 392.9 | 0.54 | 392.9 | 392.9 | 1.46 | 0.252 | NS |
D x E | 1 | 74.8 | 0.1 | 74.8 | 74.8 | 0.28 | 0.609 | NS |
Error | 11 | 2962.8 | 4.05 | 2962.8 | 269.3 | | | |
Lack of Fit | 6 | 2739.9 | 3.75 | 2739.9 | 456.7 | 10.25 | 0.011 | |
Pure Error | 5 | 222.8 | 0.3 | 222.8 | 44.6 | | | |
Total | 31 | 73146 | 100 | | | | | |
Where A: Induction temperature, °C; B: Agitation rate, rpm; C: Induction time, h; D: Glucose concentration, % (w/v); E: Lactose concentration, % (w/v). S, Significant; and NS, Not significant. R² = 95.95%; Adj R² = 88.58%; and Predicted R² = 0.00%.
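The derived columns of Table 4 follow from the sums of squares and degrees of freedom by standard ANOVA arithmetic: a mean square is SS/DF, an F-value is the ratio of an effect mean square to the error mean square (or, for lack of fit, to the pure-error mean square), and the contribution is the sequential SS as a percentage of the total SS. The sketch below recomputes a few entries; the function names are illustrative and the inputs are copied from Table 4.

```python
def mean_square(ss, df):
    """Mean square = sum of squares / degrees of freedom."""
    return ss / df


def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F-value = effect mean square / error mean square."""
    return mean_square(ss_effect, df_effect) / mean_square(ss_error, df_error)


# Values copied from Table 4 (full model)
SS_TOTAL = 73146.0
SS_ERROR, DF_ERROR = 2962.8, 11
SS_PURE_ERROR, DF_PURE_ERROR = 222.8, 5

model_f = f_ratio(70183.3, 20, SS_ERROR, DF_ERROR)
interaction_f = f_ratio(11434.0, 10, SS_ERROR, DF_ERROR)
# Lack of fit is tested against pure error, not against the total error
lack_of_fit_f = f_ratio(2739.9, 6, SS_PURE_ERROR, DF_PURE_ERROR)
contribution_a2 = 100 * 20884.0 / SS_TOTAL  # Seq SS of the A-squared term

print(f"Model F = {model_f:.2f}")
print(f"2-way interaction F = {interaction_f:.2f}")
print(f"Lack-of-fit F = {lack_of_fit_f:.2f}")
print(f"A-squared contribution = {contribution_a2:.2f}%")
```

The p-values are not recomputed here because that would need an F-distribution routine from a statistics library.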
Further, to improve the predictability of the model, the least significant terms (A, B, C, A x D, B x C, B x D, C x D, C x E, and D x E) were removed from Equation 2. The final reduced regression model obtained for the total activity of recombinant chitin deacetylase is shown below:
Z2 (Total activity, U/L) = 121.81 + 9.99 D + 8.19 E - 29.48 A² - 14.80 B² - 24.33 C² - 16.67 D² + 9.40 E² + 9.00 A x B - 19.49 A x C - 8.26 A x E - 7.83 B x E. (3)
The predicted values of the total activity were calculated by the software from the modified regression model (Equation 3) and are presented in Table 5. In addition, the normality assumption is satisfied: the residuals are normally distributed, i.e., the data points lie close to the straight line in the normal probability plot (Fig. 2), which supports using the model to optimize the expression of recombinant chitin deacetylase. Hence, the modified model (Equation 3) was used to identify the optimal factor levels and the design space for the process.
Table 5
Analysis of variance (ANOVA) for the reduced model, test of significance for the expression of recombinant chitin deacetylase in E. coli Rosetta pLysS cells.
Source | DF | Seq SS | Contribution, % | Adj SS | Adj MS | F-value | P-value | |
Model | 11 | 67411.7 | 92.16 | 67411.7 | 6128.3 | 21.37 | 0 | S |
Linear | 2 | 4007.2 | 5.48 | 4007.2 | 2003.6 | 6.99 | 0.028 | S |
D | 1 | 2395.8 | 3.28 | 2395.8 | 2395.8 | 8.36 | 0.009 | S |
E | 1 | 1611.5 | 2.2 | 1611.5 | 1611.5 | 5.62 | 0.028 | S |
Square | 5 | 53958.5 | 73.77 | 53958.5 | 10791.7 | 37.64 | 0 | S |
A² | 1 | 20884 | 28.55 | 25489.5 | 25489.5 | 88.9 | 0 | S |
B² | 1 | 4699.7 | 6.43 | 6426.6 | 6426.6 | 22.41 | 0 | S |
C² | 1 | 16785.4 | 22.95 | 17360 | 17360 | 60.55 | 0 | S |
D² | 1 | 9000.3 | 12.3 | 8154 | 8154 | 28.44 | 0 | S |
E² | 1 | 2589.2 | 3.54 | 2589.2 | 2589.2 | 9.03 | 0.007 | S |
2-Way Interaction | 4 | 9446 | 12.91 | 9446 | 2361.5 | 8.24 | 0 | S |
A x B | 1 | 1295.1 | 1.77 | 1295.1 | 1295.1 | 4.52 | 0.046 | S |
A x C | 1 | 6078.9 | 8.31 | 6078.9 | 6078.9 | 21.2 | 0 | S |
A x E | 1 | 1090.8 | 1.49 | 1090.8 | 1090.8 | 3.8 | 0.065 | NS |
B x E | 1 | 981.1 | 1.34 | 981.1 | 981.1 | 3.42 | 0.079 | NS |
Error | 20 | 5734.3 | 7.84 | 5734.3 | 286.7 | | | |
Lack of Fit | 15 | 5511.5 | 7.53 | 5511.5 | 367.4 | 8.24 | 0.14 | NS |
Pure Error | 5 | 222.8 | 0.3 | 222.8 | 44.6 | | | |
Total | 31 | 73146 | 100 | | | | | |
Where A: Induction temperature, °C; B: Agitation rate, rpm; C: Induction time, h; D: Glucose concentration, % (w/v); E: Lactose concentration, % (w/v). S, Significant; and NS, Not significant. R² = 92.16%; Adj R² = 87.85%; and Predicted R² = 72.83%.
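The R² and adjusted R² values quoted for the full and reduced models can be reproduced from the error and total rows of Tables 4 and 5 using the standard definitions R² = 1 - SS_error/SS_total and Adj R² = 1 - (SS_error/DF_error)/(SS_total/DF_total). The sketch below does this for both models; the function names are illustrative. The predicted R² is not recomputed because it depends on the PRESS statistic, which is not reported here.

```python
def r_squared(ss_error, ss_total):
    """Coefficient of determination as a fraction."""
    return 1 - ss_error / ss_total


def adjusted_r_squared(ss_error, df_error, ss_total, df_total):
    """R-squared penalized for the number of model terms."""
    return 1 - (ss_error / df_error) / (ss_total / df_total)


SS_TOTAL, DF_TOTAL = 73146.0, 31

# (error SS, error DF) from Table 4 (full model) and Table 5 (reduced model)
models = {"full": (2962.8, 11), "reduced": (5734.3, 20)}

summary = {}
for name, (ss_err, df_err) in models.items():
    r2 = 100 * r_squared(ss_err, SS_TOTAL)
    adj = 100 * adjusted_r_squared(ss_err, df_err, SS_TOTAL, DF_TOTAL)
    summary[name] = (r2, adj)
    print(f"{name}: R2 = {r2:.2f}%, Adj R2 = {adj:.2f}%")
```

Dropping nine terms lowers R² by a few percentage points but leaves the adjusted R² almost unchanged, which is consistent with the removed terms explaining little of the variation.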
The impact of interactions among the independent factors was visualized by the two-dimensional contour plots for the expression of recombinant chitin deacetylase (Fig. 3).
Validation of model
The response optimizer tool in MINITAB 17.0 (trial version) was used to solve the reduced regression model (Equation 3) and to find the optimal conditions for enhanced total activity and expression of recombinant chitin deacetylase in E. coli Rosetta pLysS. The model (Fig. 3) was validated at the optimal process conditions: induction temperature, 22°C; agitation rate, 128 rpm; induction time, 30 h; glucose concentration, 0.058% (w/v); and lactose concentration, 1% (w/v). The optimal values of all factors except factor E (lactose concentration) lay within the selected factor levels (Table 6 and Fig. 4). The predicted and experimental total activities of recombinant chitin deacetylase at these optimal conditions were 190.85 U/L and 202.39 ± 0.31 U/L, respectively (Table 6).
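The idea behind maximizing Equation 3 can be illustrated with a coarse grid search. This is not the algorithm used by the MINITAB response optimizer; it is a minimal sketch that assumes Equation 3 is written in coded factor levels spanning -2 to +2. That assumption is suggested by the scale of its coefficients and by the reference to a +2 level for lactose, but the text does not confirm it. The function names and the grid step are illustrative choices.

```python
from itertools import product


def z2_total_activity(a, b, c, d, e):
    """Reduced model (Equation 3) for total activity, U/L.

    Assumption: a..e are coded factor levels, not natural units.
    """
    return (
        121.81 + 9.99 * d + 8.19 * e
        - 29.48 * a**2 - 14.80 * b**2 - 24.33 * c**2
        - 16.67 * d**2 + 9.40 * e**2
        + 9.00 * a * b - 19.49 * a * c - 8.26 * a * e - 7.83 * b * e
    )


def grid_search(step=0.5, low=-2.0, high=2.0):
    """Return the grid point in [low, high]^5 with the highest predicted activity."""
    n_steps = int(round((high - low) / step))
    levels = [low + i * step for i in range(n_steps + 1)]
    return max(product(levels, repeat=5), key=lambda p: z2_total_activity(*p))


best = grid_search()
best_value = z2_total_activity(*best)
print("coded optimum (A, B, C, D, E):", best)
print("predicted total activity, U/L:", round(best_value, 2))
```

Because the E² coefficient is positive, the predicted response is convex in E, so under this assumption the search places the lactose level on the boundary of the range rather than at an interior point. Converting the coded optimum to natural units would require the factor levels from Table 1, and a coarse grid with rounded coefficients need not reproduce the reported prediction of 190.85 U/L.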
Table 6
The optimum process conditions for increased total activity and expression of recombinant chitin deacetylase in E. coli Rosetta pLysS.
Parameters | Symbol | Optimal value | Predicted total activity (U/L) | Experimental total activity (U/L) |
Induction temperature, °C | A | 22 | 190.85 | 202.39 ± 0.31 |
Agitation rate, rpm | B | 128 | | |
Induction time, h | C | 30 | | |
Glucose concentration, % (w/v) | D | 0.058 | | |
Lactose concentration, % (w/v) | E | 1.0 | | |
The experiment was performed in triplicate; the value is reported as mean ± standard deviation. The predicted and experimental total activities apply to the full set of optimal conditions.
The optimal lactose concentration given by the response optimizer tool was 1% (w/v), which corresponds to the +2 level in the model. Therefore, the impact of higher lactose concentrations (1–2.5% (w/v)) on expression and total enzyme activity was studied. The total enzyme activity was 201.840 ± 1.92 U/L, 201.900 ± 1.95 U/L, 202.186 ± 1.59 U/L, and 202.173 ± 2.09 U/L with 1%, 1.5%, 2%, and 2.5% lactose, respectively. There was no significant difference in total activity, and SDS-PAGE analysis showed that the expression level was the same at all lactose concentrations (Fig. 5).
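The absence of a significant difference can be illustrated with a one-way ANOVA computed from the reported means and standard deviations. This is a sketch only: it assumes three replicates per lactose concentration, as stated for the other experiments, and it is not necessarily the test the authors applied. The function name is illustrative.

```python
def one_way_anova_from_summary(means, sds, n):
    """One-way ANOVA F-value from group means and SDs with equal group size n."""
    k = len(means)
    grand_mean = sum(means) / k  # valid because all groups have the same size
    ss_between = n * sum((m - grand_mean) ** 2 for m in means)
    ss_within = (n - 1) * sum(sd ** 2 for sd in sds)
    df_between, df_within = k - 1, k * (n - 1)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Total activity (U/L) at 1%, 1.5%, 2% and 2.5% lactose, from the text
means = [201.840, 201.900, 202.186, 202.173]
sds = [1.92, 1.95, 1.59, 2.09]

# n = 3 is an assumption (triplicates)
f_value, df1, df2 = one_way_anova_from_summary(means, sds, n=3)
print(f"F({df1}, {df2}) = {f_value:.3f}")
```

Under these assumptions the F-value is far below 1, well under the 5% critical value for (3, 8) degrees of freedom of about 4.07, which agrees with the statement that total activity did not differ across lactose concentrations.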