Choosing the best embryogenesis medium in carrot by data mining technology

Plant cell, tissue and organ culture (PCTOC) is used extensively to propagate plants faster and in larger numbers, to produce virus-free plants, and to produce secondary metabolites. This requires optimizing PCTOC conditions for each plant and final aim. Optimizing micropropagation is time-consuming and costly because the conditions differ between plants and even between varieties. In addition, the type and concentration of hormones and the type of explant must be optimized for each variety at the callogenesis, embryogenesis, shooting and rooting stages. Hence, researchers now use data mining with artificial neural networks (ANNs) to predict the best conditions for tissue culture, saving considerable time and money. In this research, a radial basis function (RBF) model was used to predict the best conditions for carrot tissue culture. The results showed that the highest and the lowest sensitivity were related to variety and to the percentage of agar per liter, respectively. The RBF model predicted an embryogenesis percentage of 62.5%, whereas the percentage obtained in the laboratory was 75%. The results showed that the RBF model is a good model for predicting the results, with a high dependence between the RBF model and the laboratory results. The correlation between the results of the RBF model and the results of the laboratory was 67%.


Introduction
Plant biological systems are characterized by non-linear and non-deterministic developmental processes. The main factors controlling developmental patterns in these complex systems are environmental and genetic (1).
These two key factors vary widely, both between and within themselves, which ultimately produces a unique developmental process. A similar situation is seen in plant tissue culture. Conventional analytical techniques for modeling growth and development in in vitro culture are costly and time-consuming, and sometimes become ineffective because of the complexity of the system. Therefore, to circumvent these two main limitations (cost and time) during in vitro culture, computational strategies that can consider growth kinetics together with the thermodynamic constraints of culture conditions are required (2). Many factors, such as macronutrients, micronutrients, plant growth regulators, variety and plant growth stage, affect tissue culture. In preliminary studies for the development of new tissue culture media, the combinations or concentrations of micronutrients and macronutrients were changed. This process is costly and time-consuming (3,4,5).
Recently, Multilayer Perceptron (MLP) and neuro-fuzzy logic have been used for modeling and predicting in vitro culture processes such as shoot proliferation of Prunus rootstocks (1,6), in vitro rooting of Prunus rootstocks (7), in vitro sterilization of chrysanthemum (8), the effect of medium macronutrients on the in vitro performance of pear rootstocks (OHF and Pyrodwarf) (9), prediction and optimization of plant hormone concentrations and combinations for in vitro proliferation of the vegetative rootstock Garnem (G × N15) (1), and in vitro rooting and acclimatization of Vitis vinifera L. (10). Different artificial neural networks (ANNs), such as MLP, Generalized Regression Neural Network (GRNN), Probabilistic Neural Network (PNN) and radial basis function (RBF), can be used to interpret and process different data. However, there is no report of using RBF in plant tissue culture. RBF is a three-layered feed-forward network applicable to various regression and classification problems (10).
Research showed that the RBF model performs better than the investigated constitutive equation (lower RMSE) (11). There is an excellent fit between the predicted and experimental flow curves when using the developed RBF model. This, together with the very low RMSE value, showed the robustness of the developed RBF model for prediction (11).
Tissue culture media must be optimized because commonly used media such as MS are not suitable for all plants and explants. There are many optimization algorithms, one of which is the Genetic Algorithm (GA) (12).
Research indicates that GA is a powerful tool for designing optimized culture media for proliferation and for predicting micropropagation from different amounts of ingredients.
Today, vaccine production is very important worldwide. Carrot is a candidate plant for the production of vaccines and recombinant proteins (13).
Carrot (Daucus carota L.) is grown around the world and is one of the most popular root vegetables (13). Carrots are a rich source of minerals, antioxidants and dietary fiber (13). For these reasons, it is necessary to optimize carrot tissue culture for maximum efficiency.
Regeneration of carrot is affected by plant growth regulators (BAP, 2,4-D, NAA, kin), explant type (root, shoot, hypocotyl, leaf), variety, and macronutrients and micronutrients. All of these factors also affect the embryogenesis of carrot.
Somatic embryogenesis of plants, a key feature of in vitro propagation, is a difficult and complex process. The embryogenesis process is reported to be affected by various factors, ranging from endogenous levels of biochemical components such as plant growth regulators to physical conditions. In this study, we employed a data mining strategy using RBF to assess the effect and importance of plant tissue culture media composition in the embryogenesis of carrot.

Data
In the first step, several datasets were selected from previous reports (14,15). In summary, the datasets include different carrot varieties (Monarch, Nantes improved, Tam Tam, Vilmorn, US-Harumakigosum, US-Haru+ A85 makigosum), different concentrations of 2,4-D, BAP and kin, and several essential mineral elements, which were used as inputs (Table 1).

Radial basis function (RBF)
Radial basis function networks have a wide range of applications, including function approximation, time series prediction, classification and system control (16,17,18).
The RBF network has three layers: an input layer, a nonlinear hidden layer and a linear output layer. The input can be modeled as a vector of real numbers X ∈ R^n. The network output is a scalar function of the input vector, ℓ: R^n → R, and is calculated as a weighted sum of the radial basis functions in the hidden layer. The aim is to obtain the most important parameters after an initial review of the parameters.
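As an illustration of the three-layer structure described above, the following is a minimal sketch of the forward pass of an RBF network. It assumes Gaussian basis functions and an output of the form f(x) = b + Σ w_i · exp(−‖x − c_i‖² / (2σ_i²)), which is a common choice but is not confirmed by this study. The function names, the centers, the widths and the weights are all hypothetical.

```python
import math


def gaussian_rbf(x, center, width):
    """Gaussian basis function: exp(-||x - c||^2 / (2 * width^2)).

    Assumption: the study does not state which basis function was used.
    """
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def rbf_predict(x, centers, widths, weights, bias=0.0):
    """Forward pass: nonlinear hidden layer followed by a linear output layer."""
    # Hidden layer: one activation per (center, width) pair.
    hidden = [gaussian_rbf(x, c, s) for c, s in zip(centers, widths)]
    # Output layer: weighted sum of the hidden activations plus a bias.
    return bias + sum(w * h for w, h in zip(weights, hidden))


if __name__ == "__main__":
    # Hypothetical two-input network with two hidden units.
    centers = [[0.0, 0.0], [1.0, 1.0]]
    widths = [1.0, 0.5]
    weights = [2.0, -1.0]
    print(rbf_predict([0.5, 0.5], centers, widths, weights, bias=0.1))
```

In practice the centers, widths and output weights are fitted to the training data; that step is omitted here.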
In this research, the inputs of the RBF model were three different concentrations of plant growth regulators (2,4-D, BAP, kin), three different concentrations of three ions (MgSO4, CaCl2, MnSO4) and six varieties of carrot (Monarch, Nantes improved, Tam Tam, Vilmorn, US-Harumakigosum and US-Haru+ A85 makigosum). The output was embryogenesis. For this purpose, 75% of the data were used for training and 25% for testing. In addition, 5-fold cross-validation (K = 5) was applied to the data of this experiment. R2 (coefficient of determination), RMSE (root mean square error) and MBE (mean bias error) were used to assess the appropriateness of the RBF model (18).

Genetic Algorithm Optimization
Genetic Algorithm (GA) is a very powerful optimization tool. It is a parallel, iterative optimization algorithm that is capable of learning and that repeats steps such as evolution, selection, mutation and crossover. An evaluation criterion is computed in order to optimize each value individually.
To obtain the best fitness, the roulette wheel method was used to select the best population. An initial population of 100, 1000 repeats, a mutation rate of 0.7 and a crossover rate of 0.05 were set. The crossover function and the mutation function used in this experiment were Two-Point Crossover and Uniform, respectively.
The optimal values of the inputs (variety, MgSO4, CaCl2, MnSO4, 2,4-D, BAP and kin) that give the best output (embryogenesis) were determined by GA.
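The GA settings described above (roulette wheel selection, two-point crossover, uniform mutation) can be sketched as follows. This is an illustrative outline rather than the implementation used in the study: the fitness function is a hypothetical stand-in for the trained RBF model, fitness values are assumed to be non-negative, the mutation rate is assumed to apply per gene, and all function names are invented for the example.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running > pick:
            return individual
    return population[-1]


def two_point_crossover(a, b, rng):
    """Swap the segment between two random cut points."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def uniform_mutation(individual, bounds, rate, rng):
    """Resample each gene uniformly within its bounds with probability `rate`."""
    return [rng.uniform(lo, hi) if rng.random() < rate else gene
            for gene, (lo, hi) in zip(individual, bounds)]


def run_ga(fitness, bounds, pop_size=100, generations=1000,
           crossover_rate=0.05, mutation_rate=0.7, seed=0):
    """Maximize `fitness`; the default rates follow the values given in the text."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        fits = [fitness(ind) for ind in pop]
        nxt = []
        while len(nxt) < pop_size:
            p1 = roulette_select(pop, fits, rng)
            p2 = roulette_select(pop, fits, rng)
            c1, c2 = list(p1), list(p2)
            if rng.random() < crossover_rate:
                c1, c2 = two_point_crossover(p1, p2, rng)
            nxt.append(uniform_mutation(c1, bounds, mutation_rate, rng))
            nxt.append(uniform_mutation(c2, bounds, mutation_rate, rng))
        pop = nxt[:pop_size]
        # Keep the best individual seen so far across generations.
        best = max(pop + [best], key=fitness)
    return best
```

A call such as `run_ga(model_fitness, bounds)` would return the input combination with the highest predicted output, where `model_fitness` (hypothetical) wraps the trained model and `bounds` lists the allowed range of each input.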

Sensitivity Analysis
Sensitivity analysis was performed to determine which of the input parameters, such as the variety and the ions, have the most impact on the model output.
For this experiment, 8 candidate treatments were selected (Table 2) and 4 different explants (root, shoot, leaf and nodal) were used; the variety used was Nantes.
The statistical analysis was performed by ANOVA, and the experiments were based on a Completely Randomized Design (CRD) with 3 replications. Data were analyzed using SPSS software, and significantly different means were identified using Tukey's test (P = 0.05).

Result
ANN Modeling and Optimization, and Sensitivity Analysis
ANN Modeling and Evaluation
Estimation of the predicted outputs with the ANN model showed that there is a significant difference between the observed and predicted growth factor in both the training and test sets (Table 3). Simple regression lines represent a high correlation between the observed values and the predicted values for ion compounds, plant growth regulator concentration, six varieties and three types of culture media, for both the training and testing sets. Using the fit with the highest squared correlation coefficient and the ANN models obtained, a graph was created to present the variation in ion compounds, plant growth regulator concentration, six varieties and three types of culture media (Fig. 1). The correlation coefficients (R2) for the training data and testing data were 0.833 and 0.666, respectively. RMSE and MBE were 7.879734 and 0.311607, respectively, for the training data, and 10.23974 and 3.78125, respectively, for the testing data. The RBF model is sufficiently accurate, because the correlation coefficient for the validation data set is sufficient. As can be seen, the model is able to predict embryogenesis (R2 = 0.833) (Table 3, Fig. 1).
Based on the results of this research, the ANN can be considered one of the most effective methods for analyzing data obtained from tissue culture (embryogenesis) parameters in order to predict the optimization factors required at the proliferation stage.
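The three evaluation criteria reported above can be computed from paired observed and predicted values as in the sketch below. The definitions are assumptions, since the formulas are not given in the text: R2 is taken as 1 − SS_res/SS_tot, and MBE as the mean of predicted minus observed, so a positive value means over-prediction. The function names are illustrative.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination, assumed to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mbe(observed, predicted):
    """Mean bias error, assumed sign convention: predicted minus observed."""
    n = len(observed)
    return sum(p - o for o, p in zip(observed, predicted)) / n


if __name__ == "__main__":
    # Hypothetical embryogenesis percentages, not data from the study.
    obs = [10.0, 20.0, 30.0, 40.0]
    pred = [12.0, 18.0, 33.0, 41.0]
    print(r_squared(obs, pred), rmse(obs, pred), mbe(obs, pred))
```

Reporting these values separately for the training and testing sets, as in Table 3, shows how far accuracy drops on unseen data.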

Model Optimization and Sensitivity Analysis of the Models
The ANN-GA predicted the optimized amounts of the variable parameters (mineral nutrients, plant growth regulators, variety and tissue culture media). The results of the optimization are shown in Table 4. They indicate that the best of the six varieties used in this experiment was Nantes improved. The other optimized values are also shown in Table 4.
The sensitivity analysis indicated that the highest sensitivity was related to variety and the lowest to the percentage of agar per liter (Table 5).
The results of the confirmation analysis showed that the combination ANN-GA is an efficient method for predicting and optimizing the combination of elements for in vitro embryogenesis in shoot explants.
The results of the ANN-GA model were confirmed in the laboratory. The highest number of embryos was produced in MS medium containing 195.23 mg/l MgSO4, 330.07 mg/l CaCl2 and 18.3 mg/l MnSO4, together with the plant growth regulators 0.46 mg/l 2,4-D, 0.03 mg/l BAP and 0.88 mg/l kin (Fig. 2). Shoot explants were the best explants (Fig. 3). The RBF model predicted an embryogenesis percentage of 62.5%, whereas the percentage obtained in the laboratory was 75% (Table 5, Fig. 3).
The embryos formed five weeks after culture of the explants. The analysis indicated that the effect of the 8 treatments on embryogenesis was significant at the 5% level for root, leaf and shoot explants, but was not significant for nodal explants (Fig. 3).

Discussion
Designing a new culture medium for tissue culture requires a lot of time and money, because many factors, such as the type and concentration of hormones, the type and amount of ions, and the variety, should be tested alone and in combination. There are many studies in which a large number of factors have been used to design a new culture medium. This design is different for each plant and even each variety (19,20,21,22,23).
To solve these problems, data mining has been proposed as a new method. In this method, data from previous research are collected and analyzed in software, and the appropriate model and the best components are obtained in order to produce a new culture medium.
In this work, we used an ANN-RBF model, as a new data mining technique, in order to achieve a better [...] 28,29). High concentrations of 2,4-D can disrupt natural genetic and physiological processes (30,31,32,33). One study shows that application of 22.5 µM 2,4-D in Glycine max L. affects embryo morphology and development (36). Another result indicates that a high concentration of 2,4-D (68 µM) causes morphological abnormalities in the resulting plants (37). However, another study shows a positive effect of 2,4-D on somatic embryogenesis of Arabidopsis (38), and a further study reports a positive effect on somatic embryogenesis in a carrot cell suspension culture (39).
Kinetin is a type of cytokinin, a class of plant hormone that promotes cell division. Kinetin is often used in plant tissue culture to induce callus formation (in conjunction with auxin) and to regenerate shoot tissues from callus (with a lower auxin concentration). One study shows that the highest embryogenesis in Coffea canephora, a species of coffee, was obtained using 5 µM kinetin (40). In contrast, another study shows that the highest somatic embryogenesis in lulo (Solanum quitoense Lam.) was obtained with 50 µM NAA. 6-Benzylaminopurine (benzyl adenine, BAP or BA) is a first-generation synthetic cytokinin that elicits plant growth and development responses, setting blossoms and stimulating fruit richness by stimulating cell division. One study shows that application of 1, 5 and 15 µM benzyladenine (BA) induced somatic embryogenesis in Desmocladus flexuosus (34). Another result shows that the highest somatic embryogenesis from hypocotyl explants of A. archeri and A. appressipila was obtained with 4.4 µM 6-benzylaminopurine (BAP) (35). Sane et al. (2012) showed that 0.5 mg/l BAP is needed for somatic embryogenesis (2).
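The optimized medium in this study is given in mg/l, whereas several of the cited studies report concentrations in µM. The sketch below converts between the two units so that the values can be compared. The molar masses are standard reference values and are not taken from this study; the function names are illustrative.

```python
# Molar masses in g/mol (standard reference values, not from the study).
MOLAR_MASS_G_PER_MOL = {
    "2,4-D": 221.04,    # 2,4-dichlorophenoxyacetic acid
    "BAP": 225.25,      # 6-benzylaminopurine
    "kinetin": 215.21,
}


def mg_per_l_to_um(name, mg_per_l):
    """Convert mg/l to micromolar: (mg/l) / (g/mol) gives mmol/l, x1000 gives µM."""
    return mg_per_l / MOLAR_MASS_G_PER_MOL[name] * 1000.0


def um_to_mg_per_l(name, um):
    """Convert micromolar to mg/l."""
    return um * MOLAR_MASS_G_PER_MOL[name] / 1000.0


if __name__ == "__main__":
    # Growth regulator levels of the optimized medium, expressed in µM.
    for name, mg in [("2,4-D", 0.46), ("BAP", 0.03), ("kinetin", 0.88)]:
        print(name, round(mg_per_l_to_um(name, mg), 2), "µM")
```

For example, the 22.5 µM 2,4-D cited above corresponds to roughly 5 mg/l, about ten times the 0.46 mg/l of the optimized medium.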
Interestingly, we found here that variety had the most important effect on embryogenesis, and the model confirms that Nantes improved has the highest embryogenesis. The results offered by the RBF model agree with those previously found by traditional statistical analysis, suggesting that the effect of variety is strong (15).
Previous research showed that the efficient culture medium for embryogenesis was MSm medium. The results of the RBF model and of the traditional statistical analysis are similar, which confirms that the RBF model is appropriate for predicting embryogenesis (15). In this study, RBF clearly pinpointed the effect of the parameters on the specific growth parameter (embryogenesis).
The work reported in this article illustrates a data mining strategy coupled with ANN technologies as an additional, novel and accurate approach to evaluating the effect of several growth conditions (in this case, ion and plant growth regulator concentration, variety and culture media) on plant growth. This technology is very useful for identifying interaction effects. It is less time-consuming and more helpful because varying types of data, including historical data, can be analyzed. The ANN-RBF model allows us to answer questions about which combination of elements gives the best results. This could be very useful for designing new and effective optimization experiments. Finally, the knowledge generated in this research can be readily increased by including additional information (inputs and outputs) from previous research results, such as ions and plant growth regulators, variety and plant culture media, as they become available.

Conclusion
In this study, data mining was used to make tissue culture easier and to reduce its cost. The model used was RBF. With this model we can measure the effect of different factors on tissue culture, and the results showed that it is a very suitable model for predicting the effects of different factors on tissue culture. We examined the embryogenesis of carrot in this study; in the future, this model can be used to study the effect of different factors on other aspects of tissue culture.

Percentage of embryogenesis after 5 weeks in different media