Artificial Neural Networks (ANNs) and Response Surface Methodology (RSM) Approach for Modeling and Optimization of Pectin Extraction from Banana Peel


In the present study, the influence of three independent variables on the yield and degree of esterification of banana peel pectin obtained by acid extraction was investigated and optimized using an artificial neural network and response surface methodology. The results revealed that a properly trained artificial neural network model predicts more accurately than the response surface methodology model. The optimum conditions were a temperature of 82 °C, a pH of 2 and an extraction time of 102 min, with a desirability of 0.977. The yield of pectin and degree of esterification under these optimum conditions were 15.64% and 65.94, respectively. Temperature, extraction time and pH had a significant (p < 0.05) effect on the pectin yield and degree of esterification. The extracted banana peel pectin was categorized as high methoxyl pectin, based on its high methoxyl content and degree of esterification. In general, the findings of the study show that banana peel can be explored as a promising alternative source for the commercial production of pectin.


Introduction
The cultivated dessert banana and plantain (Musa sp.) are considered among the most important food crops in tropical and subtropical regions and play an important role in food security and the economy (Waghmare and Arya, 2016). In Ethiopia, banana is the most popular fruit crop and the most widely grown and consumed. It covers about 59.64% of the overall fruit area, approximately 68.00% of the entire fruits produced, and about 38.30% of the total fruit [...]
In the present work, RSM and ANN linked genetic algorithm-based models have been developed to predict the relationship between the input variables and the output variables. Subsequently, the results predicted by the ANN and RSM techniques were compared statistically using the coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), standard error of prediction (SEP%), and absolute average deviation (AAD%), based on the validation data set, to assess their predictive and generalization capabilities. An effective RSM model and a feed-forward neural network trained with back-propagation were developed from the experimental data, and the efficiency of both models was compared.
It is important to identify appropriate extraction conditions to obtain the maximum possible yield of pectin from banana peel. Therefore, this study was conducted to investigate the effect of the extraction conditions, namely pH, temperature and time, on the yield and degree of esterification of banana peel pectin, and to optimize these conditions with ANN and RSM so as to extract the maximum possible amount of pectin.

Materials
Bananas (varieties: Dwarf Cavendish and Giant Cavendish) were collected from selected hotels, juice processing houses and restaurants in Jimma, Ethiopia. All chemicals used for the extraction process were of analytical reagent grade.

Raw material preparation
The fresh banana peels were segregated according to their type, chopped into pieces of approximately 1 cm² using a stainless steel knife for easy drying, and washed with water three times. The samples were dried in an oven at 60 °C for 48 hours to obtain an easily crushable material. The dried peel was milled using a mechanical grinder, screened through a 60-mesh sieve, and packed in an airtight, moisture-proof bag at room temperature until the extraction process.

Preparation of alcohol-insoluble solids
The banana peel powder samples were homogenized in boiling ethanol (solid-liquid ratio of 1:10, w/v) with a final ethanol concentration of 80% (v/v) at 70 °C for 20 min in a shaking water bath to inactivate possible endogenous enzymes and remove alcohol-soluble solids. Thereafter, the resulting residue, the alcohol-insoluble solids (AIS), was washed with distilled water and air-dried at 50 °C.

Pectin extraction
In this study, pectin was extracted according to the methodology proposed by do Nascimento Oliveira et al. (2018), with a few modifications. The alcohol-insoluble solids were mixed with the extraction solution in a conical flask at a solid-liquid ratio of 1:40 (w/v). Pectin was extracted from the alcohol-insoluble solids by varying three extraction conditions to study the effect of each condition on the pectin yield. The extraction was done at different temperatures (52.5, 60, 71, 82, and 89.5 °C), pH values (1.66, 2, 2.5, 3 and 3.34) and extraction times (44.7, 60, 82.5, 105 and 120 min). The hot acid extracts were separated from the alcohol-insoluble solids residue by filtering through nylon/muslin cloth, cooled immediately with chilled water, dispersed in an equal volume of 96% ethanol, stirred for 5 min for proper mixing and allowed to stand for 3 h. The precipitate was washed 2-3 times with 70% acidic ethanol (0.5% HCl), then with 70% ethanol and finally with 95% ethanol. The precipitate was then dried at 40 °C in a hot air oven overnight until a constant weight was reached. The dried pectin was ground, and the powder was kept in an airtight container. According to Ranganna (1995), the percentage yield of the extracted pectin was determined from the weight of the dried pectin relative to the dry weight of the peel sample.
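The yield calculation can be sketched as follows. This is a minimal illustration that assumes the commonly used definition (mass of dried pectin divided by the dry mass of the starting sample, times 100); the function name and the sample masses are hypothetical and are not taken from the study.

```python
def pectin_yield_percent(dried_pectin_g: float, dry_sample_g: float) -> float:
    """Percentage yield: dried pectin mass relative to dry sample mass."""
    if dry_sample_g <= 0:
        raise ValueError("dry_sample_g must be positive")
    return 100.0 * dried_pectin_g / dry_sample_g


# Hypothetical masses, chosen only to illustrate the arithmetic.
example_yield = pectin_yield_percent(dried_pectin_g=0.78, dry_sample_g=5.0)
print(f"{example_yield:.2f}%")
```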

Analytical methods
A sample of the dried banana peel pectin was subjected to quantitative tests in order to determine its physicochemical characteristics. The conditions that gave the optimum yield were used for the subsequent analysis (Ranganna, 1995).

Experimental design and statistical analysis
In the present work, extraction of pectin was studied to determine the optimum conditions for the pectin yield and degree of esterification (DE). RSM is a collection of mathematical and statistical techniques that uses quantitative information from an appropriate experimental design to identify optimum conditions. The influence of temperature, pH and extraction time was determined through RSM with a central composite design (CCD) requiring a total of 20 experimental runs to find the best combination of parameters for the extraction process. The responses and the process variables were modeled and optimized with RSM, and analysis of variance (ANOVA) was used to estimate the statistical parameters. The ranges of the independent process variables were selected based on Fakayode and Abobi (2018). For three factors, the CCD comprises eight factorial points, six axial points and six center points. The alpha factor was 1.682, which is consistent with the axial levels used (for example, 52.5 and 89.5 °C around a center of 71 °C with a step of 11 °C). All factors were adjusted at five coded levels (-α, -1, 0, +1, +α) (Nahar et al. 2017). The coded levels correspond to the actual temperatures, pH values and extraction times listed under Pectin extraction. The total number of experiments is given by
$$N = 2^{m} + 2m + m_c$$
where N is the total number of experiments required, m is the number of variables and $m_c$ is the number of replicates at the center point. The relationship between the variables and each response was described by a second-order polynomial (quadratic) regression equation.
$$Y = b_0 + \sum_{i=1}^{n} b_i x_i + \sum_{i=1}^{n} b_{ii} x_i^{2} + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} b_{ij} x_i x_j$$
where Y is the predicted response (i.e. yield and DE), n is the number of independent variables, $b_0$ is the constant coefficient, $b_i$ are the linear coefficients, $b_{ij}$ are the second-order interaction coefficients, $b_{ii}$ are the quadratic coefficients, and $x_i$ and $x_j$ are the coded values of the independent variables.
The outcomes were summarized and statistically analyzed using Design Expert version 11 software (Stat-Ease Inc., Minneapolis, USA) and the Neural Network Toolbox of MATLAB version 8.1 (R2013a). The ANOVA test was employed to estimate the statistical significance of the regression model. The coefficient of determination R², adjusted R², and predicted coefficient [...]
Note: n is the number of variables for any particular experiment; here n = 3.
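The structure of the design can be illustrated with a short sketch that builds a three-factor CCD in coded units and maps it to actual values. It assumes a rotatable design, for which α = (2^m)^(1/4) ≈ 1.682 when m = 3; this value reproduces the axial levels listed under Pectin extraction. The center values and step sizes are read off those listed levels, and the function and variable names are illustrative only.

```python
import itertools

import numpy as np


def central_composite_design(n_factors: int, n_center: int):
    """Return (coded design matrix, alpha) for a rotatable CCD."""
    alpha = (2 ** n_factors) ** 0.25
    # Factorial points: every combination of -1 and +1.
    factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=n_factors)))
    # Axial points: one factor at -alpha or +alpha, the others at 0.
    axial = np.zeros((2 * n_factors, n_factors))
    for i in range(n_factors):
        axial[2 * i, i] = -alpha
        axial[2 * i + 1, i] = alpha
    # Replicated center points.
    center = np.zeros((n_center, n_factors))
    return np.vstack([factorial, axial, center]), alpha


coded, alpha = central_composite_design(n_factors=3, n_center=6)

# Center values and steps for temperature (deg C), pH and time (min),
# inferred from the five levels given in the extraction section.
center_vals = np.array([71.0, 2.5, 82.5])
steps = np.array([11.0, 0.5, 22.5])
actual = center_vals + coded * steps

print(coded.shape)      # (20, 3): 8 factorial + 6 axial + 6 center runs
print(round(alpha, 3))  # 1.682
print(actual.min(axis=0).round(2), actual.max(axis=0).round(2))
```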

Artificial Neural Network Modeling
In the present study, an ANN was developed to describe the extraction conditions of pectin and to enhance the yield. The data generated from the experimental design planned through CCD (Table 3) were used to establish the optimal architecture of the ANN. ANNs have recently been introduced into engineering studies as a tool for modeling and optimizing the variables involved in a particular process. The ANN was applied to the same experimental data used for RSM.
The neural network architectures were trained with the Levenberg-Marquardt back-propagation algorithm. The network architecture consisted of an input layer of three neurons (temperature, pH, and extraction time), an output layer of two neurons (pectin yield and DE), and a hidden layer. Of the data points, 60% were selected for training to develop the neural network, 20% for validation and 20% for testing. Using more data sets in training reduces processing time in ANN learning and improves the generalization capability of the models. This step makes it possible to assess the generality of the ANN model. The number of neurons in the hidden layer can be calculated from the expression 2(n + m)^0.5 to 2n + 1, where n is the number of neurons in the input layer and m is the number of neurons in the output layer (Sundarraj et al. 2018a). Several networks were built, each was trained separately, and the best network was selected based on the accuracy of its predictions in the testing phase. The Levenberg-Marquardt back-propagation network is presented in Figure 1.
The inputs and targets were normalized using the minimum, maximum and actual values of the data (X_min, X_max and X_Ac, respectively). The normalization was performed to avoid overflows that may appear due to very large or very small weights. The training process was run until the MSE reached a minimum in the validation process. The performance of the trained network was estimated from its accuracy on the test data. The feed-forward network with back-propagation is one of the most common neural networks used in solving engineering problems. All calculations were done using the Neural Network Toolbox of MATLAB version 8.1 (R2013a) (Joel et al. 2018).
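The data preparation steps described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure used in the study: the normalization is assumed to be min-max scaling to [0, 1] (the toolbox's own default scaling may differ), the 60/20/20 split is assumed to be random, and all function names are hypothetical.

```python
import math

import numpy as np


def hidden_neuron_range(n_inputs: int, n_outputs: int):
    """Range 2*(n + m)**0.5 to 2*n + 1 for the hidden-layer size."""
    return 2 * math.sqrt(n_inputs + n_outputs), 2 * n_inputs + 1


def min_max_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each column to [0, 1] using its own minimum and maximum."""
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    return (x - x_min) / (x_max - x_min)


def split_indices(n_samples: int, train: float = 0.6, val: float = 0.2, seed: int = 0):
    """Randomly split sample indices into training, validation and test sets."""
    order = np.random.default_rng(seed).permutation(n_samples)
    n_train = int(round(train * n_samples))
    n_val = int(round(val * n_samples))
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]


low, high = hidden_neuron_range(n_inputs=3, n_outputs=2)
print(round(low, 2), high)

# The five temperature levels from the extraction section, as one column.
temperature = np.array([[52.5], [60.0], [71.0], [82.0], [89.5]])
print(min_max_normalize(temperature).ravel().round(3))

train_idx, val_idx, test_idx = split_indices(20)
print(len(train_idx), len(val_idx), len(test_idx))
```

For the 20 runs of the design, this split gives 12 training, 4 validation and 4 test points.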

Comparative analysis of RSM and ANN models
The goodness of fit and prediction accuracy of the constructed models were evaluated using the root mean square error (RMSE), mean absolute error (MAE), coefficient of determination (R²), standard error of prediction (SEP), and absolute average deviation (AAD), all calculated between the experimental and predicted data. These error measures were calculated using equations (5) to (9) (Liew et al. 2014b). To study the modeling abilities of the RSM and ANN models, the values predicted by each model were plotted against the corresponding experimental values.
In these equations, $Y_{i,e}$ is the experimental data, $Y_{i,p}$ is the predicted data obtained from either RSM or ANN, $\bar{Y}_e$ is the mean value of the experimental data and n is the number of experimental data points. The final network was selected based on the lowest error on the training data and on the test data.
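For illustration, the five error measures can be computed as in the sketch below. The definitions used are the forms commonly found in RSM and ANN comparison studies and are an assumption here, since they may differ in detail from equations (5) to (9); the data are hypothetical and are not results of this study.

```python
import numpy as np


def error_metrics(y_exp, y_pred):
    """Return RMSE, MAE, R2, SEP (%) and AAD (%) for paired data."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_exp.size
    residual = y_pred - y_exp
    mean_exp = y_exp.mean()
    # RMSE = sqrt( sum (Y_p - Y_e)^2 / n )
    rmse = np.sqrt(np.sum(residual ** 2) / n)
    # MAE = sum |Y_p - Y_e| / n
    mae = np.sum(np.abs(residual)) / n
    # R2 = 1 - sum (Y_p - Y_e)^2 / sum (Y_e - mean(Y_e))^2
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((y_exp - mean_exp) ** 2)
    # SEP% = 100 * RMSE / mean(Y_e)
    sep = 100.0 * rmse / mean_exp
    # AAD% = 100 / n * sum |Y_p - Y_e| / |Y_e|
    aad = 100.0 * np.sum(np.abs(residual) / np.abs(y_exp)) / n
    return {"RMSE": rmse, "MAE": mae, "R2": r2, "SEP%": sep, "AAD%": aad}


# Hypothetical experimental and predicted values, for illustration only.
metrics = error_metrics([10.0, 12.0, 14.0, 16.0], [10.5, 11.5, 14.5, 15.5])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Lower RMSE, MAE, SEP and AAD, together with an R² closer to 1, indicate the better-performing model.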

Pectin yield
The yield and degree of esterification of the pectin extracted from banana peel powder using 1 M H2SO4 ranged from 10.52 to 15.87% (of the dry weight of peel) and from 61.27 to 65.95, respectively, depending on the extraction conditions. RSM has been widely adopted to investigate the effects of several design factors on a response by varying them simultaneously in a limited set of experiments. Thus, temperature, pH and extraction time were examined as factors using CCD to investigate the correlation between the process variables and the pectin yield and DE. The complete design matrix of the experimental variables, together with the values of the experimental responses, is presented in Table 3. The analysis of variance was carried out to investigate the model terms, to select a suitable model, and to determine the significance of the model equation.

RSM model fitting
The statistical significance of the regression model terms (linear, interaction, and quadratic) for all response variables was checked by the Fisher statistical test (F-test) in the ANOVA. The ANOVA was used to calculate the coefficient of determination and the significance of the linear, interaction, and lack-of-fit effects. The statistical analyses show that quadratic models fit the data for both responses very well. The probability value (p-value) was used to check the significance of each coefficient, which also indicates the interaction between the independent variables (Table 4). Table 5 shows that the coefficient of variation (CV %) and standard deviation for the two responses in this study were reasonably low and acceptable, indicating good precision and reliability of the experiment. Thus, the model is adequate for predictions within the range of the experimental variables. The goodness of fit of the models was further scrutinized using the R² value.
It has been suggested that the R² value should be at least 0.80 for a good fit of a model. The regression models were highly significant, with R² values of 0.9888 for pectin yield and 0.9857 for DE, indicating close agreement between the observed values and the theoretical values predicted by the model equations (Figures 2 and 3). Moreover, the adjusted R² values for pectin yield and DE were 0.9787 and 0.9729, respectively, which confirmed that the models were highly significant and in good agreement with the experimental values of the response variables.
Adjusted R² and predicted R² should be within 0.2 (20%) of each other to be in good agreement, as suggested by Owolabi et al. (2018). This requirement is satisfied in this study, with predicted R² values of 0.9200 for pectin yield and 0.9236 for DE. From the above analysis, it can be concluded that these models are suitable for predicting the pectin yield and DE from banana peel powder within the limits of the experiment.
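The agreement check can be reproduced with a short calculation. The sketch below uses the standard adjusted R² formula and assumes 20 runs and a full quadratic model with nine terms besides the intercept; the number of terms in the fitted models is an assumption, and the function name is illustrative.

```python
def adjusted_r2(r2: float, n_runs: int, n_terms: int) -> float:
    """Adjusted R^2 for a model with n_terms regressors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n_runs - 1) / (n_runs - n_terms - 1)


# Reported R^2 and predicted R^2 values for the two responses.
reported = [("yield", 0.9888, 0.9200), ("DE", 0.9857, 0.9236)]

for name, r2, pred_r2 in reported:
    adj = adjusted_r2(r2, n_runs=20, n_terms=9)
    gap = adj - pred_r2
    print(name, round(adj, 4), round(gap, 4), gap < 0.2)
```

Under these assumptions the calculation gives about 0.9787 for yield and 0.9728 for DE, close to the reported adjusted values, and both differences from the predicted R² are below 0.06.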

Development of regression model equation
The experimental results for the pectin yield and DE based on the CCD are presented in Table 3. To create a simple model with a minimum of equation terms and to prevent over-fitting, the insignificant coefficients were excluded, and the resulting models for pectin yield and DE are given by Eqs. (10) and (11), respectively.
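How a second-order model of this kind is fitted can be sketched with ordinary least squares. The example uses synthetic, noise-free data in coded units, and the coefficients are invented for illustration; they are not the coefficients of Eqs. (10) and (11). The function name is hypothetical.

```python
import itertools

import numpy as np


def quadratic_design_matrix(x: np.ndarray) -> np.ndarray:
    """Columns: intercept, linear, squared, then pairwise interaction terms."""
    n = x.shape[1]
    cols = [np.ones(len(x))]
    cols += [x[:, i] for i in range(n)]
    cols += [x[:, i] ** 2 for i in range(n)]
    cols += [x[:, i] * x[:, j] for i, j in itertools.combinations(range(n), 2)]
    return np.column_stack(cols)


# Synthetic coded settings for 20 runs of 3 factors (not the study's data).
rng = np.random.default_rng(0)
x = rng.uniform(-1.682, 1.682, size=(20, 3))

# Invented coefficients: b0, three linear, three quadratic, three interaction.
true_coef = np.array([14.0, 0.5, -1.0, 0.3, -0.4, -0.6, -0.2, 0.1, 0.0, 0.0])
design = quadratic_design_matrix(x)
y = design @ true_coef

# Least-squares estimate of the ten model coefficients.
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
print(coef.round(3))
```

In practice, terms whose coefficients are not statistically significant would then be dropped and the reduced model refitted.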

Effect of extraction conditions on the pectin yield
The present work was carried out under the different experimental conditions (temperature, pH and extraction time) shown in Table 3. The pectin yield obtained in this experiment was in the range of 10.52-15.87% (Table 3), which is comparable to that of ripe mango peel pectin. The pH value had the most significant effect on the pectin yield, with an F value of 475.98, followed by extraction temperature and time (Table 4). Figure 4 shows a 3D response surface plot of the pectin yield as a function of temperature and pH at a fixed extraction time. Increasing extraction temperature and pH together generally decreased the pectin yield, as can be seen from Figure 4; the highest yield was achieved when both variables were at the minimum point. A relatively long exposure to high temperature and extended extraction time would cause thermal degradation of the extracted pectin, thus decreasing the amount precipitable by alcohol. The effects of temperature, pH and time in this study are similar to those reported by Oliveira et al. (2016).

Effect of process variables on the degree of esterification
The degree of esterification obtained in the experiment was in the range of 61.27-65.95 (Table 3). Based on the degree of esterification, pectin can be classified as low methoxyl pectin (DE ≤ 50%) or high methoxyl pectin (DE > 50%). The presence of high methoxyl pectin (DE > 50%) in the extracted banana peel pectin was evident. The interaction between temperature and pH exhibited a strongly significant (p < 0.0014) effect on the DE of pectin. The absence of interaction indicates that the factors act independently, whereas the presence of interaction indicates that the difference in DE between levels of one factor is not the same at all levels of another factor. It should be noted that the interactions between temperature and extraction time, as well as between pH and time, did not exhibit a significant effect on DE (Table 4). To visualize the relationship between the response and the process variables, 3D response surface plots were generated from the model equation (Eq. 11) developed in this study. The 3D response surface showed the mutual interaction between pH and temperature, as shown in Figure 5.
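The classification rule stated above amounts to a single threshold, sketched here with the lowest and highest DE values observed in the experiment. The function name is illustrative.

```python
def classify_pectin(de_percent: float) -> str:
    """Classify pectin by degree of esterification (DE, in percent)."""
    return "high methoxyl" if de_percent > 50.0 else "low methoxyl"


# Lowest and highest DE values reported for the experimental runs.
for de in (61.27, 65.95):
    print(de, classify_pectin(de))
```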

Artificial neural network based results
A feed-forward back-propagation neural network with a 3-10-2 architecture was used in the present investigation to train on the experimental data given in Table 3. In the present study, different [...] line, indicating remarkable compatibility between the experimental output data and the values predicted by the ANN. Therefore, the ANN prediction for training, validation, and testing is highly satisfactory in terms of correlation and implies that the model was precise in predicting the responses.
The value of MSE obtained from the ANNs for both batch and continuous modes was 0.00098, which is close to the MSE target of zero. The closeness of the training and testing errors validates the accuracy of the model. The linear regression analysis of the values predicted by ANN and RSM showed that the values predicted by the ANN model were much closer to the experimentally measured data, suggesting that the ANN model had better modeling ability for both simulated and predicted values. Therefore, for data sets with a limited number of observations that regression models fail to capture reliably, advanced soft computing approaches such as ANN may be preferred. The ANN model fitted the experimental data with excellent accuracy.
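The shape of a 3-10-2 network and the MSE criterion can be illustrated with a short forward-pass sketch. The tanh hidden layer and linear output layer are assumptions, since the transfer functions are not stated; the weights are random and untrained, the inputs are synthetic, and Levenberg-Marquardt training is not implemented here.

```python
import numpy as np


def forward(x, w1, b1, w2, b2):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = np.tanh(x @ w1 + b1)
    return hidden @ w2 + b2


def mean_squared_error(target, predicted):
    """Mean of the squared differences over all outputs and samples."""
    return float(np.mean((target - predicted) ** 2))


rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(3, 10)), np.zeros(10)  # input -> hidden
w2, b2 = rng.normal(size=(10, 2)), np.zeros(2)   # hidden -> output

x = rng.uniform(-1.0, 1.0, size=(20, 3))  # synthetic normalized inputs
out = forward(x, w1, b1, w2, b2)
n_params = w1.size + b1.size + w2.size + b2.size

print(out.shape, n_params)  # (20, 2) 62
print(mean_squared_error(out, out))  # 0.0 when prediction equals target
```

The network has 62 adjustable weights and biases for 20 experimental runs, which is one reason a separate validation set is used to decide when to stop training.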

Comparative evaluation of ANN and RSM models
The predictive competence of the ANN and RSM models was determined and compared based on prediction accuracy, using parameters such as RMSE, R², SEP, MAE and AAD.
The moisture content of the pectin extracted in this experiment was 7.87%, which is slightly higher than that of pectin from banana peels of different varieties (4.54-6.24%) and apple pomace (4.54%) but slightly lower than that of citrus peel pectin (7.92%) (Khamsucharit et al. 2018). Low moisture content is necessary for safe storage because it inhibits the growth of microorganisms and the activity of pectinase enzymes that adversely affect pectin quality (Mohamadzadeh et al. 2010).
The ash content of the pectin extracted from banana peel was 1.44% (Table 8), which is in a similar range to that of the conventional pectin sources, apple pomace (1.96%) and citrus peel (3.46%). The current finding agrees with an earlier finding for pectin from various banana peel varieties (1.43-2.76%) (Khamsucharit et al. 2018). An ash content below the maximum limit of 10% is one of the criteria for good gel formation (Manh et al. 2019). Lower ash content means higher purity. Therefore, the ash content found in this experiment indicates the purity of the pectin.
The anhydrouronic acid (AUA) content of the pectin extracted from banana peel was 67.43% (Table 8), which is comparable to that of pectin extracted from banana peels of different varieties (34.56-66.67%) but lower than that of citrus peel and apple pomace pectin (Khamsucharit et al. 2018). The AUA content indicates the purity of the extracted pectin, with a recommended value of not less than 65% for pectin used as a food additive or for pharmaceutical purposes (May, 1990). This requirement has limited the potential sources of food and pharmaceutical pectin. The AUA content obtained in this study lies within the acceptable limits of pectin purity. Because the AUA content of the extracted pectin was higher than 65%, it met the criterion for commercial pectin; thus, banana peel can be an alternative source of high methoxyl pectin.
Methoxyl content is an important factor in controlling the setting time of pectin and its ability to form gels (Constenla and Lozano, 2003). The methoxyl content of the pectin extracted from banana peel was 8.52% (Table 8), which is comparable to that of pectin extracted from pomelo peel (8.57%), passion fruit (8.81-9.61%) (Azad, 2014) and banana peels of different varieties (3.86-8.46%), lower than that of citrus peel pectin (9.06%) and higher than that of apple pomace pectin (7.92%) (Khamsucharit et al. 2018). The spreading quality and sugar-binding capacity of pectin increase with increasing methoxyl content (Azad, 2014). The methoxyl content found in this study indicates that the banana peel pectin can be categorized as high methoxyl (HM) pectin. HM pectin requires a minimum amount of soluble solids and a pH within a narrow range, around 2.0-3.5, in order to form gels (Azad, 2014).
The equivalent weight of the pectin extracted from banana peel was 956.49, which is higher than that of citrus peel (577) and apple pomace (551) pectin but comparable to that of other varieties of banana peel pectin (943-1456) (Khamsucharit et al. 2018) and lemon pomace peel pectin (368-1632) (Azad, 2014). The viscosity of the pectin extracted from banana peel was 6.53 × 10⁻³ N s m⁻² (Table 8). The physicochemical characteristics of pectin depend mainly on the raw material source and the conditions selected for isolation and purification of the pectin.

Validation of the optimized condition by response surface modeling
The main objective of this study was to determine the optimal operating parameters for the maximum pectin yield and DE from banana peel using sulfuric acid. The numerical optimization of the pectin extraction was performed with the Design Expert 11.0 statistical package (Stat-Ease, Inc., Minneapolis, MN, USA, trial version) by setting the desired goal for each process variable and response. Pectin yield and DE were set to be maximized, while the process variables were kept within the range under study. To validate the statistical experimental strategy, duplicate experiments were performed under the predicted process conditions (Table 9).
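The idea behind this kind of numerical optimization can be sketched with a Derringer-type desirability function. The sketch assumes that both responses are maximized with unit weights and that the lower and upper limits are the observed minimum and maximum of each response (10.52-15.87% for yield, 61.27-65.95 for DE); whether the software was run with exactly these settings is not stated. The function name is illustrative.

```python
import math


def desirability_maximize(y: float, low: float, high: float, weight: float = 1.0) -> float:
    """Desirability of a response to be maximized between low and high."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** weight


# Responses reported at the optimum conditions: yield 15.64%, DE 65.94.
d_yield = desirability_maximize(15.64, low=10.52, high=15.87)
d_de = desirability_maximize(65.94, low=61.27, high=65.95)

# Overall desirability is the geometric mean of the individual values.
overall = math.sqrt(d_yield * d_de)
print(round(d_yield, 3), round(d_de, 3), round(overall, 3))
```

Under these assumptions the overall desirability comes out at about 0.977, which agrees with the value reported for the optimum conditions in the abstract.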