Real-time and on-line parallel detection of key fermentation process parameters by near-infrared spectroscopy in different environments

Yang Chen, State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology
Lingli Chen, State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology
Meijin Guo, State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology
Xu Li, State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology
Jinsong Liu, SDIC Biotech Investment Co. Ltd., Beijing 100000, China
Xiaofeng Liu, SDIC Biotech Investment Co. Ltd., Beijing 100000, China
Zhongbing Chen, Zhejiang Biok Co.Ltd, Zhongguan Industrial Park
Xiaojun Tian, SDIC Biotech Investment Co. Ltd., Beijing 100000, China
Haoyue Zheng, SDIC Biotech Investment Co. Ltd., Beijing 100000, China
Xiwei Tian (tahfy@163.com), East China University of Science and Technology, https://orcid.org/0000-0003-2157-0427
Ju Chu, State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology
Yingping Zhuang, State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology; Frontiers Science Center for Materiobiology and Dynamic Chemistry, East China University of Science and Technology, Shanghai 200237, China


Introduction
Fermentation is a dynamic and complex biochemical reaction process. Through real-time control of environmental parameters, the metabolic state of the cells can be regulated flexibly to obtain high titer, productivity and yield (Cai et al., 2002; Tain et al., 2018; Wang, 2019). Thus, bioprocess parameter detection is the basis of fermentation optimization (Zhang et al., 2004). With the development of sensing technology, on-line measurement of many physical, chemical and physiological parameters has been implemented (Zhang et al., 2014). However, key index parameters such as substrate and product concentrations in the broth are still commonly determined by off-line methods, which are time-consuming and laborious from sampling through analysis (Lorena et al., 2015).
Near-infrared (NIR) spectroscopy is a form of molecular vibrational spectroscopy. NIR absorption arises from the overtones (frequency doubling) and combination bands of the fundamental vibrations of hydrogen-containing groups with X-H bonds (X = C, O, N, S, etc.) (Costa et al., 2019; Peng et al., 2019). NIR spectroscopy is an accurate, on-line and non-invasive technique that does not require complicated sample preparation. Therefore, it has been widely applied to quantitative and qualitative analyses in food, medicine and other fields (Quintelas et al., 2018; Pinto et al., 2015).
Generally, an NIR detection system consists of three parts: hardware, chemometric software and a mathematical model. From the collected spectral information, a quantitative functional relationship between the spectra and the composition and content of the sample can be established.
In previous studies, applications of NIR in biological processes have been divided into three kinds according to the sampling method: off-line, at-line and on-line determination (Scarff et al., 2007). According to whether the NIR probe is in direct contact with the fermentation broth, detection can also be divided into contact and non-contact types (Navrátil et al., 2005; Olarewaju et al., 2019; Svendsen et al., 2016). Wang et al. (2020) established a method to quickly identify the quality of Japanese fermented soy sauce based on NIR spectroscopy and chemometrics, which enabled rapid and economical classification of soy sauce. Puvendran et al. (2018) used NIR technology to monitor the process parameters of hyaluronic acid fermentation in real time, and the quantitative analysis model established with partial least squares regression was then applied to other hyaluronic acid-producing processes with recombinant strains. Do Nascimento et al. (2017) used NIR spectroscopy to detect ethanol, glucose, biomass and glycerol during ethanol fermentation, used chemometrics to associate the spectra with off-line data, and finally realized real-time, on-line detection in the ethanol fermentation process.
At present, NIR spectroscopy combined with chemometrics is widely used in fermentation processes. However, differences in the fermentation environment have a great impact on NIR detection. Cervera et al. (2010) analyzed the application of NIR in cell growth and fermentation processes, and found that structural differences in the bioreactor (agitator paddle, baffle, etc.) lead to non-uniform bubble size and distribution in the fermentation broth. In addition, cell morphology, substrates and products affect the properties of the broth (viscosity, color, etc.), thus influencing the spectral absorption. Although NIR detection may work well in a homogeneous environment, complex fermentation systems remain a big challenge. In the fermentation of sophorolipids (SLs), oil, solid particles and bubbles are mixed in the broth, so that gas, liquid and solid phases are all present, which makes parameter detection very difficult. The fermentation of filamentous fungi is characterized by adherence, clump formation and mycelium breakage, which strongly influence the rheological properties of the broth and, in turn, substrate and product detection. Therefore, research on the application of NIR in different fermentation environments plays an important role in expanding its applicable range.
In this study, an experimental platform for real-time, on-line monitoring of substrate and product concentrations by NIR technology was developed for the fermentation of L-lactic acid (L-LA) by Lactobacillus paracasei, SLs by Candida bombicola and sodium gluconate (SG) by Aspergillus niger. Subsequently, a quantitative analysis model based on partial least-squares regression (PLSR) and internal cross-validation was established for the spectral data collected during the fermentation processes. Finally, the feasibility of applying NIR technology to fermentation broths with different types of microbial strains and different rheological properties was verified.
C. bombicola ATCC 22214, purchased from Guangdong Culture Collection Center (China), was used to produce SLs and was stored at -80℃ in 20% glycerol solution. The initial fermentation medium contained (g/L): glucose 100, KH2PO4 1, (NH4)2SO4 4, MgSO4·7H2O 0.5, corn steep liquor 10. The medium was sterilized at 115℃ for 30 min. The SLs fermentation was carried out in a 5 L bioreactor with a 2.5 L initial working volume. The operating conditions were as follows: inoculum of 2.9% (optical density of 80 at 600 nm), fermentation temperature of 25℃, aeration of 0.5 vvm and initial agitation of 200 rpm. The process pH was maintained at 3.5 by adding 4 M NaOH solution. The dissolved oxygen (DO) was controlled above 40% of saturation during 0-36 h and above 25% of saturation after 36 h by adjusting the agitation stepwise. Rapeseed oil was fed continuously and its concentration in the broth was kept below 10 g/L during the SLs fermentation. Solid glucose was added every 24 h to keep its concentration between 30 and 80 g/L.

Materials and methods
A. niger, kindly provided by Shangdong Fu Yang Biotechnology Co., LTD, was used to produce sodium gluconate (SG). The initial fermentation medium contained (g/L): glucose 250, KH2PO4 0.5, (NH4)2SO4 2.355, (NH4)2HPO4 O 1.8, corn steep liquor 10. The SG fermentation was carried out in a 5 L bioreactor with a 3 L initial working volume. The operating conditions were as follows: inoculum of 10%, fermentation temperature of 38℃, aeration of 1.2 vvm and agitation of 700 rpm. The process pH was maintained at 5.3 by adding 7.5 M NaOH solution.
Analytical methods

Off-line determination of glucose and L-LA in lactic acid fermentation
L-LA and glucose were measured with an SBA-40C biosensor analyzer (Shandong Province Academy of Sciences, China).

Off-line determination of glucose, oil and SLs concentrations in SLs fermentation
Glucose concentration in the broth was analyzed by the method mentioned above. Oil and SLs concentrations were determined by the weighing method and by high-performance liquid chromatography (HPLC), respectively, as described in our previous work (Chen et al., 2019b). Briefly, for oil content determination, three parallel broth samples were extracted twice with an equal volume of n-hexane. The upper layer was then transferred to another tube and dried in an oven for 24 h to constant weight. For SLs, 2 mL of fermentation broth was withdrawn, 2 mL of KOH/MeOH (4 M) solution was added, and the mixture was heated at 80℃ for 15 min. After cooling to room temperature, methanol was added to a total volume of 10 mL and NaH2PO4 buffer (0.2 M) was used to neutralize the solution. Finally, the sample was diluted to an appropriate concentration for HPLC analysis. HPLC was performed with a mobile phase of ammonium acetate (0.02 mol/L), formic acid (1% v/v) and methanol (75% v/v), a C18 column (4.6 mm×250 mm, Acchrom), a refractive index detector (RID) and a flow rate of 0.9 mL/min. The injection volume was 20 μL, and the column and detector temperatures were controlled at 50℃ and 35℃, respectively.

Off-line determination of glucose, SG, NH4+ and soluble phosphorus concentrations in SG fermentation
Glucose concentration in the broth was analyzed by the method mentioned above. SG concentration in the broth was determined by HPLC (SPD-20A, Shimadzu, Japan) at 210 nm with a C18 column (4.6×250 mm, no. 336-1101, Sepax Technologies, Inc, USA) as described previously. The mobile phase was prepared from equal volumes of methanol solution (3 M) and phosphate solution (0.25 M). The flow rate and the column temperature were set at 1.0 mL/min and 28℃, respectively.
Soluble phosphorus (P) concentration in the broth was determined by the ammonium molybdate reduction method at 825 nm as described by Durge and Paliwal (1967). NH4+ was determined by phenol-sodium hypochlorite colorimetry at 625 nm as described by Broderick et al. (1980).
A DA7440 on-line NIR analyzer manufactured by Perten (Sweden) was used in the fermentation processes. Light emitted from the instrument passes through the glass jar and, after multiple reflections and refractions, the returned light is collected by the instrument (Fig. 1). The analyzer is a non-invasive diffuse reflection detector, and the Unscrambler 10.3 quantitative analysis software was used for spectral preprocessing, spectral region selection (removing the water vapor interference in the 1350-1410 nm wavelength range) and removal of abnormal samples. The spectral acquisition speed is about 30 full-spectrum measurements per second. All received spectral signals are stored and displayed on a connected computer.
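The spectral region selection described above amounts to dropping the wavelength points that fall in the water vapor band. A minimal sketch of that step is shown here; the function name `drop_band`, the inclusive treatment of the band limits and the example values are assumptions for illustration, not the behavior of the instrument software.

```python
def drop_band(wavelengths, absorbances, low=1350.0, high=1410.0):
    """Remove spectral points whose wavelength (nm) lies in [low, high]."""
    # Assumption: both band limits are treated as inclusive.
    kept = [(w, a) for w, a in zip(wavelengths, absorbances)
            if not (low <= w <= high)]
    return [w for w, _ in kept], [a for _, a in kept]


# Hypothetical five-point spectrum (wavelength in nm, absorbance).
wavelengths = [1300.0, 1350.0, 1380.0, 1410.0, 1450.0]
absorbances = [0.10, 0.20, 0.30, 0.40, 0.50]
kept_wavelengths, kept_absorbances = drop_band(wavelengths, absorbances)
```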

NIR spectral modeling
To eliminate the interference of changing environmental conditions on the spectral measurement, methods such as first-order derivative, five-point smoothing, standard normal variate (SNV) and de-trending were adopted to preprocess the spectral data, so as to improve the detection accuracy and reliability in this study.
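To make these pretreatments concrete, the following is a minimal, dependency-free sketch of three of them: SNV, five-point smoothing and a first-order derivative. It is an illustration under stated assumptions rather than the procedure of the analysis software: the smoothing is taken to be a plain moving average, the derivative a central difference, and the nine-point spectrum is invented.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: scale one spectrum by its own mean and SD."""
    m = mean(spectrum)
    s = stdev(spectrum)
    return [(a - m) / s for a in spectrum]


def moving_average(spectrum, window=5):
    """Moving-average smoothing; points too close to the edges are unchanged."""
    half = window // 2
    out = list(spectrum)
    for i in range(half, len(spectrum) - half):
        out[i] = sum(spectrum[i - half:i + half + 1]) / window
    return out


def first_derivative(spectrum, step=1.0):
    """Central-difference derivative (one-sided at both ends).

    Needs at least two points; `step` is the wavelength spacing.
    """
    n = len(spectrum)
    result = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        result.append((spectrum[hi] - spectrum[lo]) / ((hi - lo) * step))
    return result


# Hypothetical spectrum and the same spectrum with a constant baseline offset.
raw = [0.50, 0.52, 0.55, 0.61, 0.70, 0.66, 0.60, 0.56, 0.53]
shifted = [a + 0.2 for a in raw]

derivative_raw = first_derivative(raw)
derivative_shifted = first_derivative(shifted)
```

Both the derivative and SNV give the same result for `raw` and `shifted`, which is why such pretreatments are used against baseline drift.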
PLSR is a multivariate statistical data analysis method (Rehman et al., 2018). The model established by this algorithm regresses one or more dependent variables Y on multiple independent variables X. PLSR combines features of principal component analysis (PCA), canonical correlation analysis (CCA) and multiple linear regression analysis (Chen et al., 2019a). The mathematical model is given by equations (1) and (2):
$$X = TP^{T} + E \qquad (1)$$
$$Y = UQ^{T} + F \qquad (2)$$
where the matrices X and Y are the independent and dependent variable matrices, respectively, T and U are the score matrices of X and Y, P and Q are the loading matrices of X and Y, and E and F are the error matrices. PLSR decomposes the spectral matrix and the concentration matrix at the same time, and takes the relationship between them into account during the decomposition to strengthen their correspondence, so as to ensure the best calibration model.
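As an illustration of this decomposition, here is a minimal sketch of PLS regression for a single response (PLS1) using the NIPALS algorithm in plain Python. It is not the implementation used by the analysis software; the function names and the synthetic one-component data are assumptions. In the code, `t` and `p` are one column of the score matrix T and the loading matrix P of X, and `q` is the corresponding loading of Y.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def pls1_fit(X, y, n_factors):
    """Fit a single-response PLS model with NIPALS on mean-centred data."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                # y residual
    factors = []
    for _ in range(n_factors):
        columns = list(zip(*E))
        w = [dot(col, f) for col in columns]       # weight vector, X^T f
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:                           # nothing left to model
            break
        w = [a / norm for a in w]
        t = [dot(row, w) for row in E]             # scores
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in columns]  # X loadings
        q = dot(f, t) / tt                         # y loading
        # Deflate X and y by the part explained by this factor.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        factors.append((w, p, q))
    return x_mean, y_mean, factors


def pls1_predict(model, x):
    """Predict the response for one spectrum x."""
    x_mean, y_mean, factors = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in factors:
        t = dot(e, w)
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat


# Synthetic data: each spectrum is a concentration times one pure spectrum.
pure = [0.1, 0.4, 0.9, 0.4, 0.1]
concentrations = [10.0, 20.0, 30.0, 40.0, 50.0]
spectra = [[c * a for a in pure] for c in concentrations]

model = pls1_fit(spectra, concentrations, n_factors=1)
predicted = pls1_predict(model, [25.0 * a for a in pure])
```

Because the synthetic spectra vary along a single direction, one factor is enough here; real broth spectra would need more factors, with the number chosen by cross-validation.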
Cross-validation (CV) is a commonly used statistical method that splits the data samples into smaller subsets. In a given modeling sample, most samples are used for modeling and a small number are left out to evaluate the established model (Ryan et al., 2013). In this study, the internal cross-validation method was used to determine the optimal number of factors in the mathematical model. In K-fold CV, the sample set is divided into K sub-samples. A single sub-sample is retained as validation data and the other K-1 sub-samples are used for model training. The CV is repeated K times, with each sub-sample used for validation exactly once, and the K results are averaged or otherwise combined to obtain a single estimate (Afendras and Markatou, 2019). The external validation procedure, in contrast, uses a set of validation samples that do not belong to the calibration set.
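The K-fold procedure can be sketched as follows. To keep the example short and self-contained, the model being validated is a one-variable least-squares line rather than PLSR; the interleaved fold assignment, the function names and the data are all assumptions for illustration.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k interleaved, nearly equal folds."""
    return [list(range(start, n_samples, k)) for start in range(k)]


def cross_validate(xs, ys, k, fit, predict):
    """Return the root mean square error of cross-validation (RMSECV)."""
    squared_errors = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)  # trained on the other K-1 folds
        squared_errors += [(predict(model, xs[i]) - ys[i]) ** 2 for i in fold]
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


def fit_line(xs, ys):
    """Least-squares straight line; stands in for the real calibration model."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


# Hypothetical absorbance readings and exactly linear reference values.
absorbance = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
concentration = [5.0 + 100.0 * a for a in absorbance]
rmsecv = cross_validate(absorbance, concentration, 3, fit_line, predict_line)
```

To choose the number of PLSR factors, the same loop would be run once per candidate factor number and the number giving the lowest RMSECV kept.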

NIR model evaluation
To evaluate the prediction performance of NIR for fermentation process parameters, the root mean square error of prediction (RMSEP) and the correlation coefficient R² of the prediction set were adopted. They are calculated by equations (3) and (4) (Rodrigues et al., 2008; Dong et al., 2018). RMSEP represents the deviation between the predicted value and the reference value. R² represents the correlation between the predicted value and the reference value. The smaller the RMSEP and the larger the R², the higher the accuracy of the NIR model.
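A small sketch of the two metrics follows. It assumes the commonly used definitions, RMSEP as the square root of the mean squared difference between predicted and reference values, and R² as one minus the ratio of the residual sum of squares to the total sum of squares of the reference values; the exact form used in the study may differ, and the numbers are invented.

```python
def rmsep(predicted, reference):
    """Root mean square error of prediction."""
    n = len(reference)
    return (sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n) ** 0.5


def r_squared(predicted, reference):
    """R² as 1 - SS_res / SS_tot (assumed definition)."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Hypothetical off-line reference values and NIR predictions (g/L).
reference = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 31.0, 39.0]

error = rmsep(predicted, reference)
fit_quality = r_squared(predicted, reference)
```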

Statistical analysis
All experiments were performed in triplicate and all data are presented as the mean with standard deviation (SD). Statistical analysis was performed using one-way analysis of variance (ANOVA), and the t-test (P<0.05) was used to test whether there was any significant difference among treatments (SPSS 22.0, SPSS Inc., USA).

Quantitative calibration models by NIR method
The L-LA, SLs and SG fermentation parameters were collected on the NIR experimental platform described above to establish the calibration models of the NIR spectra. The off-line sampling data of glucose, L-LA, SLs, rapeseed oil, SG, NH4+ and P are shown in Table 1. The collected samples were divided into a calibration set and a verification set in a ratio of 2:1. To ensure the reliability of the spectral models, the maximum and minimum values of the off-line data in each fermentation process were included in the calibration set. Differences in substrates or products during fermentation cause significant changes in the spectral data (Fig. 2 a, b, c). In addition, compared with L-LA fermentation, the SLs and SG fermentation broths are characterized by multiple phases, viscosity changes and mycelium interference, so the spectral baseline drift was relatively large. To enhance spectral features, the original spectra need to be pretreated before being used to construct calibration models. The first and second derivatives can eliminate the baseline drift related to changes in concentration; Fig. 3 a, b, c and Fig. 4 a, b, c show the spectra processed by the first and second derivatives. Regression coefficients were established from the off-line data and spectral data of each fermentation component, which indicated that the absorption signal intensity of the different components in the broth differed with wavelength (Fig. S1-S9). Note that a spectrum does not define a substance unambiguously, but rather represents the absorbance of different groups of a substance at each wavelength.
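One way to obtain a 2:1 split that always keeps the extreme values in the calibration set is sketched below. The text does not state how individual samples were assigned, so the rank-based rule used here (every third sample by reference value goes to the verification set, never the lowest or highest) is an assumption, as are the function name and the example values.

```python
def split_calibration_validation(reference_values):
    """Return (calibration_idx, validation_idx) in roughly a 2:1 ratio."""
    n = len(reference_values)
    # Sample indices ordered from lowest to highest reference value.
    order = sorted(range(n), key=lambda i: reference_values[i])
    # Every third ranked sample goes to validation, starting at rank 1 and
    # stopping before the last rank, so the minimum and the maximum always
    # stay in the calibration set.
    validation = sorted(order[rank] for rank in range(1, n - 1, 3))
    held_out = set(validation)
    calibration = [i for i in range(n) if i not in held_out]
    return calibration, validation


# Hypothetical off-line glucose values (g/L) over one batch.
glucose = [100.0, 85.0, 72.0, 60.0, 48.0, 35.0, 22.0, 10.0, 3.0]
calibration_idx, validation_idx = split_calibration_validation(glucose)
```

Picking the verification samples by rank also spreads them over the whole concentration range instead of clustering them at one stage of the batch.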

NIR model construction
The spectral calibration models were established by the PLSR and CV methods, and were then validated with the known validation data set (Table 2). L-LA fermentation was relatively simple, and the NIR model predicted the glucose and L-LA concentrations in the broth well, with both R² values above 0.96 (Fig. 5 a and b). For SLs fermentation, the NIR model also predicted the SLs and residual oil contents well, with R² over 0.99 and 0.98, respectively (Fig. 6 b and c). However, it is worth noting that the R² for glucose concentration only reached 0.90 (Fig. 6 a), which might be ascribed to the intermittent addition of solid glucose to the broth, leading to wide fluctuations of glucose concentration within a relatively short period. In the SG fermentation process, the predictions of glucose, NH4+ and P contents, with R² of 0.92, 0.84 and 0.91, respectively, were better than that of SG concentration (R²=0.66) (Fig. 7 a, b, c and d). The poorer prediction of SG could be attributed to the significant change of SG concentration during the fermentation process.

NIR model validation
Three independent batches of L-LA, SLs and SG fermentation were conducted to validate the established NIR models. As shown in Fig. 8 a, b and c, the lines are the real-time values predicted by the NIR models and the scattered points are the reference values from off-line detection.
The real-time on-line detection results of the NIR models were highly correlated with the off-line reference values (Table S1). Apart from SG, with an R² of 0.80, the R² values of all other components were above 0.90, showing a good correlation (Fig. 8). In detail, the on-line predicted values at high and low SG concentrations differed significantly from the off-line reference values. On the other hand, the RMSEP obtained during validation was lower than that obtained during model establishment, which means that the error was reduced and the model was more accurate during the verification process. This result further indicates that the NIR model has good prediction accuracy for the detection of fermentation process parameters.

Discussion
Fermentation processes are characterized by complexity and uncertainty, and the basis of fermentation regulation is to capture fermentation parameters in real time. However, at present, the detection of process parameters is still relatively limited, and multi-dimensional measurement of cell metabolism cannot be achieved. Moreover, some sensors may be affected by the characteristics of the strain, the rheological properties of the fermentation broth and the aseptic requirement, so that on-line detection of various substrates and products in the broth cannot be realized. A variety of on-line detection technologies have been used in microbial fermentation, including near-infrared spectroscopy, low-field nuclear magnetic resonance, Raman spectroscopy, viable cell sensors, electronic noses and so on (Zhao et al., 2016; Schalk et al., 2019). Wang et al. (2016) used low-field nuclear magnetic resonance to realize real-time and accurate dynamic analysis of cellular lipids of Chlorella protothecoides. Chen et al. (2019b) further realized real-time detection of SLs and oil in fermentation broth with low-field nuclear magnetic resonance. However, on-line detection technologies may have the following problems. First, the detectable parameters are limited, often to only one or two. In contrast, NIR technology can achieve multi-parameter detection. NIR has been used in ethanol, lactic acid and other fermentation processes, mainly for the detection of biomass, glucose, ethanol, lactic acid and other substances (Sandor et al., 2013; Bence et al., 2019). Second, most NIR and Raman spectroscopy techniques use contact probes, which have to withstand high-temperature, high-pressure sterilization and corrosion by the fermentation broth, placing higher requirements on the probe.
In this study, non-contact NIR spectroscopy was used. Because the instrument has no direct contact with the fermentation broth, it does not need sterilization, and problems such as microbial contamination or interference with cell metabolism do not arise. The application of NIR in complex fermentation environments, especially in SLs fermentation broth with its gas-liquid-solid three-phase system, has not been reported before. In this study, real-time and on-line detection by NIR technology was established for different fermentation processes, which is of great significance for future applications of NIR. However, the current on-line NIR spectroscopy technology may be difficult to apply on a large scale. The main reasons are the high cost of the NIR instrument and the modeling system, and the fact that modeling is affected by many factors, so that the models cannot be unified. The technology therefore needs further development.

Conclusion
An NIR detection platform has been established for real-time, on-line detection of multiple process parameters in different fermentation systems. The platform showed good applicability even in complex fermentation environments, across different rheological properties (homogeneous and multi-phase inhomogeneous systems) and different parameter types (substrates, products and nutrients). Finally, validation showed that the NIR models have good predictability and reliability, which provides a solid technical basis for subsequent fermentation regulation.

Declarations
Author Contribution Statement

Consent for publication
All authors have read and approved the manuscript before submitting it to Bioresources and Bioprocessing.

Availability of data and materials
All data generated or analyzed during this study are included in this published article.
Zhao H T, Pang K Y, Lin W L, Wang Z J, Gao D Q, Guo M J, Zhuang Y P. (2016) Optimization of the n-propanol concentration and feedback control strategy with electronic nose in erythromycin fermentation processes. Process Biochem, 51 (2) 195-203.

Tables
Table 1 Parameters of fermentation process

Figure 1 Test platform based on NIR analyzer for fermentation process parameters

Spectral model of SG fermentation (a) Glucose, (b) SG, (c) NH4+, (d) P. The X-axis is the reference value from off-line detection and the Y-axis is the predicted value based on NIR data. The blue color in the figure represents the calibration set data obtained by the CV method.