A Multi-Mode Reasoning For Predicting Tobacco Baking Quality

Abstract: To reduce the number of defective products caused by unreasonable baking times in tobacco production, this paper proposes a method for establishing a multi-mode reasoning model for predicting tobacco baking quality. Data mining and analysis are carried out on the indicator data of the original tobacco, and the data that affect baking quality are screened out. To reduce the complexity of the model and eliminate the influence of different dimensions, the data are standardized. The normalized data are then explored for a multi-input multi-output mapping relationship. Finally, a mapping matrix is given for this relationship so as to establish the tobacco baking quality prediction model. The test results show that the predicted values of the model are basically consistent with the actual values, with a prediction accuracy of more than 90%, and that the cured tobacco leaves basically match the expected yellowing values of actual curing. The model provides a practical guiding method for tobacco baking and has practical value in actual tobacco baking.


Introduction
With the development of informatization and digitization, the manufacturing industry in many countries, and indeed worldwide, faces a new challenge: "data explosion, lack of information". Production data in every industry are growing explosively, yet existing computational methods capture too little of the information they contain. Extracting the useful information in such big data therefore requires more efficient data mining and analysis methods, and it forces researchers and industries to rethink their computing solutions so that useful insights can be drawn intelligently from the massive data 1 generated in real production and everyday scenarios. With a growing population and increasingly tight resources, research on tobacco baking quality prediction models aimed at saving resources and improving efficiency is of great significance. Compared with other industries, the tobacco industry has a higher level of automation, and tobacco baking is an important step in tobacco leaf production. With advances in data collection technology, the volume of data obtained during tobacco baking has exploded. Using these data efficiently to make production safer, cheaper and more efficient has become an urgent problem. An intelligent control system for tobacco status monitoring was designed in 2, based on real-time images of tobacco leaves, to monitor the changes in leaf status during baking. In that work, a neural network was used to identify and predict the set values of the dry-bulb and wet-bulb temperatures and the time to the next set value. However, neural networks may converge to local optima or over-fit, and the stability of the system needs further study. In recent years, with the continuous development of machine learning, ensemble learning methods such as boosting 4 have attracted great attention from researchers in industrial product prediction 3.
Shi et al. 5 proposed a tobacco quality prediction model based on an ensemble learning algorithm and data mining. They first used the random forest algorithm to reduce the dimensionality of the data features, and then used the XGBoost algorithm to establish a tobacco quality prediction model. The experimental results showed that the model predicted tobacco quality well, but its prediction stability on high-dimensional feature data sets was poor. Some scholars have also explored the relationship between the appearance of tobacco leaves and their chemical composition 6,7. The appearance quality of tobacco leaves is closely related to their internal chemical composition, and the main determinant of the quality of flue-cured tobacco lies in its chemical composition. Some scholars 8 therefore used a BP neural network to determine the appearance quality of tobacco leaves and, from it, indirectly judged the quality of flue-cured tobacco. However, appearance quality characterizes only part of tobacco leaf quality and cannot directly indicate the quality of the leaf. Lu et al. 10 used tobacco leaf color as the input index and the sensory quality of flue-cured tobacco as the output index to build a neural network that predicts and evaluates sensory quality. Yang et al. 11 explored the application of machine vision to flue-cured tobacco quality prediction, using machine vision to distinguish the color differences of tobacco leaves after baking and thereby predict leaf quality. Li Zheng, Lu Xiaochong, Yang Xiaoliang and others all used appearance to predict the quality of flue-cured tobacco indirectly, and did not directly involve its quality indicators.
Fan et al. 9 designed a cascade control method for moisture control in the re-drying of tobacco leaves. The method improved response speed and robustness, with marked improvements in performance and residuals. However, the re-drying process is a follow-up step to tobacco leaf baking, and this method had no obvious effect on predicting tobacco leaf quality during baking. At present, most predictions of flue-cured tobacco quality are based on neural network prediction from appearance, without considering a direct judgment of leaf quality from the tobacco baking data. Analyzing the tobacco leaf data is an effective way to improve predictive efficiency during baking and to predict leaf quality directly. This paper analyzes the collected tobacco baking data by data mining and establishes a multi-mode reasoning tobacco baking quality prediction model, which is used to monitor the state of tobacco leaves in real time during baking. The quality prediction model is established by finding the best mapping relationship between input data and output data. Moreover, to avoid the influence of different dimensions and data ranges on the model, the input and output data before and after tobacco baking are standardized. Finally, model evaluation, experiments and a comparison with other prediction methods show that the tobacco baking quality prediction model obtained by multi-mode reasoning ensures accurate, fast and interference-resistant prediction.

Tobacco baking prediction model establishment
The effectiveness of the tobacco quality prediction model bears on a series of major issues such as "quality production prediction, prediction accuracy, precise control of tobacco quality, and intelligent decision-making of tobacco products". Using data for tobacco quality prediction reduces the labor intensity of production personnel, supports green and environmentally friendly production, reduces resource waste, saves manpower, material and financial resources, and promotes the rapid development of the enterprise economy. To ensure the success of system construction, the top-level design must be completed before system development formally starts. The overall scheme of model establishment is shown in Figure 1.
Figure 1. Tobacco baking data model establishment overall plan
As shown in Figure 1, the tobacco baking data are first analyzed quantitatively and qualitatively to find the correlations among them, and the main factor data are mined and extracted. The mined data are then standardized, because the types, units and characteristics of the data are not unified. Next, the relationship between the input and output data is analyzed, and the proportions contributed by the main and secondary input factors to the output are mined. A multi-input multi-output mapping relationship matrix is then established, which constitutes the input-output model. Finally, this relationship matrix is tested: the user's input is combined with the relationship matrix, so that from the relevant index values of the input tobacco the user obtains the predicted baking index values together with the change curves of the predicted index values.

Standardized processing of baking data

Data analysis
The input and output data are analyzed, including their type, dimension, number of inputs and outputs, numerical characteristics, data comprehensiveness and output expectations. According to the characteristics of the data, multiple models of the data relationships are proposed and tested experimentally, and a reasonable algorithm is finally determined. The specific mode of inference prediction answers the question: "If the input $x$ is $A$, what should the $B$ corresponding to the output $y$ be?"
We can define $B = A \circ (A \to B)$, that is, the conclusion $B$ is obtained by combining $A$ with the inference relationship from $A$ to $B$. Here $(A \to B)$ represents the mapping relationship by which $B$ is obtained from $A$, and $\circ$ represents the operator combining $A$ and $(A \to B)$.
With the basic modeling idea in place, the input and output indicators of tobacco leaves must also be screened. From consultation with the material providers and from Liu et al. 12, we learned that the input indicators affecting tobacco baking are the dry-bulb and wet-bulb temperatures, absolute temperature, relative temperature and absolute humidity; the quality indicators of the baked leaves are red, green, blue and moisture. Once the data indicators are selected, the data can be modeled.
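The rule that a conclusion B is obtained by combining an input A with the relationship from A to B can be illustrated with a small sketch. The text does not specify the combining operator, so the code below assumes the max-min composition commonly used in fuzzy reasoning. The function name `max_min_compose`, the membership values and the relation matrix are all hypothetical.

```python
import numpy as np

def max_min_compose(a, R):
    """Compose a membership vector a with a relation matrix R.

    Assumed operator: b_j = max_i min(a_i, R_ij). The paper does not
    state which composition it uses, so this is only one possible choice.
    """
    a = np.asarray(a, dtype=float)
    R = np.asarray(R, dtype=float)
    # Broadcast a down the columns of R, take element-wise minima,
    # then the maximum over the input index.
    return np.max(np.minimum(a[:, None], R), axis=0)

# Hypothetical membership degrees of one input over three input classes.
a = np.array([0.2, 0.8, 0.5])
# Hypothetical relation from three input classes to two output classes.
R = np.array([[0.9, 0.1],
              [0.4, 0.7],
              [0.6, 0.3]])
b = max_min_compose(a, R)
print(b)
```

If the operator is instead an ordinary matrix product, the same structure applies with `a @ R` in place of the max-min step.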

Standardized processing
In actual production, the collected data differ in dimension and value range. Modeling them directly leads to an uneven spatial distribution of the sample data, which affects the prediction results of the model. The collected data therefore need to be preprocessed first. Among the various preprocessing methods, the conventional choice is to normalize or standardize the data, and there are many standardization methods, such as deviation standardization and 0-1 standardization. Here we use the maximum-value comparison method to standardize the data. The standardized data give the different indicators the same value range, eliminate the influence of different dimensions, and normalize the data while retaining their complete information.
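As a minimal sketch of the scaling step described above, the following Python code normalizes a few hypothetical baking readings. It assumes that the comparison method divides each indicator by its largest absolute value; the exact formula is an assumption here. The function name `max_scale` and the sample values are illustrative only.

```python
import numpy as np

def max_scale(data):
    """Scale every column by its largest absolute value (assumed formula)."""
    data = np.asarray(data, dtype=float)
    col_max = np.max(np.abs(data), axis=0)
    col_max[col_max == 0] = 1.0  # leave all-zero columns unchanged
    return data / col_max, col_max

# Hypothetical readings: dry-bulb temperature, wet-bulb temperature, humidity.
raw = np.array([[38.0, 36.0, 40.0],
                [42.0, 37.0, 35.0],
                [54.0, 39.0, 20.0]])
scaled, col_max = max_scale(raw)

# Keeping col_max allows predictions to be mapped back to physical units.
restored = scaled * col_max
print(scaled)
```

Returning the scale factors alongside the scaled data is a design choice that makes the inverse transformation explicit when outputs are standardized as well.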

Establishment of multi-modal reasoning model
According to the input and output data of the baking process, a state equation for the expected output is established on the basis of the actual output state values. In this state equation, $G(k)$ is the process interference distribution matrix generated by the input deviation in the tobacco baking process. To facilitate modeling and processing, it is assumed that there is no control input at the beginning of tobacco baking, that the input deviation in the baking process (the process noise) is additive zero-mean white noise, and that the interference distribution matrix $G(k)$ is known. In the resulting noise statistics, $Q(k)$ is the positive definite covariance matrix of the zero-mean process noise, $\delta_{kj}$ is the Kronecker delta, and $j$ is any moment different from $k$.
The estimated output data are the existing measurement output data stored in an Excel table, and the measurement equation is established as follows:
$$Z(k) = h[k, X(k)] + V(k)$$
where $Z(k)$ is the actual tobacco index value vector output at time $k$, $h[k, X(k)]$ is the nonlinear function of the measurement output, and $V(k)$ is the deviation interference vector measured during the tobacco baking process. For convenience of standardized processing, $V(k)$ is taken to be additive zero-mean white noise, that is:
$$E[V(k)] = 0, \qquad E[V(k)V^{T}(j)] = R(k)\,\delta_{kj}$$
where $R(k)$ is the positive definite covariance matrix of the zero-mean measurement noise and $\delta_{kj}$ is the Kronecker delta. It is assumed that the process interference vector and the measurement deviation interference vector are independent of each other, and that an initial state estimate $\hat{X}(0|0)$ is available. The estimate at time $k$, $\hat{X}(k|k)$, is then an approximate conditional mean, and the associated covariance is $P(k|k)$. Because $\hat{X}(k|k)$ is not an exact conditional mean, $P(k|k)$ is, strictly speaking, an approximate mean square error rather than a covariance. $Z^{k}$ represents the actual output vector at time $k$.
The state prediction for the tobacco baking process from time $k$ to time $k+1$ is obtained first, followed by the prediction error equation and the covariance associated with the prediction error. The covariance expression involves the covariance at the current time and the first-order derivative of the state function, where the subscript $X$ indicates that the derivative of the function is taken with respect to $x$.
Similarly, for second-order filtering, the predicted value of the actual measurement during the tobacco baking process is obtained, with the measurement function treated in the same way as the vector $f$. The filter gain is then computed, where $S^{-1}(k+1)$ represents the inverse of $S(k+1)$.
The state update equation for the tobacco baking process is then formed, in which $w(k+1)$ represents the measurement residual. The filtering error covariance updating equation corresponding to equation (13) is established in the same way. Equations (13) and (15) provide the measures for evaluating tobacco quality, color and degree of processing during baking, and their outputs are also used to control some parameter indexes and attribute values of the baking process.
Further, the final actual output vector of tobacco baking and its error are obtained; this error is the residual of the actual output.
Thus, the error covariance of the final actual output of tobacco baking is given by equation (20), where $S(k+1)$ is the covariance matrix of the actual output.
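The prediction, gain and update steps described above follow the pattern of an extended Kalman filter. The sketch below shows one such step in Python under stated assumptions: it implements only the standard first-order form and omits the second-order correction terms, and the toy functions `f` and `h`, their Jacobians, and all numerical values are hypothetical rather than taken from the paper.

```python
import numpy as np

def ekf_step(x, P, z, f, h, F_jac, H_jac, G, Q, R):
    """One first-order extended Kalman filter step (assumed standard form)."""
    # State prediction and its covariance.
    x_pred = f(x)
    F = F_jac(x)
    P_pred = F @ P @ F.T + G @ Q @ G.T
    # Measurement prediction, residual and residual covariance S.
    H = H_jac(x_pred)
    residual = z - h(x_pred)
    S = H @ P_pred @ H.T + R
    # Filter gain uses the inverse of S.
    K = P_pred @ H.T @ np.linalg.inv(S)
    # State and covariance update.
    x_new = x_pred + K @ residual
    P_new = P_pred - K @ S @ K.T
    return x_new, P_new, residual, S

# Hypothetical two-state model with one nonlinear measurement.
f = lambda x: 0.9 * x + 0.1 * np.tanh(x)
F_jac = lambda x: np.diag(0.9 + 0.1 * (1.0 - np.tanh(x) ** 2))
h = lambda x: np.array([x[0] + x[1] ** 2])
H_jac = lambda x: np.array([[1.0, 2.0 * x[1]]])

G = np.eye(2)                # interference distribution matrix (assumed known)
Q = 0.01 * np.eye(2)         # process noise covariance
R = np.array([[0.04]])       # measurement noise covariance
x0 = np.array([0.5, 0.2])    # initial state estimate
P0 = 0.1 * np.eye(2)         # initial covariance
z = np.array([0.6])          # one hypothetical measurement

x1, P1, residual, S = ekf_step(x0, P0, z, f, h, F_jac, H_jac, G, Q, R)
print(x1, residual)
```

A second-order filter would add Hessian-based terms to the predicted state, the predicted measurement and the matrix `S`; they are left out here to keep the sketch short.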
Finally, a complete multi-input multi-output prediction model is established: the relationship from the input vector $X$ to the output vector $Y$ is determined as a specific mapping relationship $T_R$ through a function transformation $T$. Given an input $A$ and a mapping relationship $R$ from $X$ to $Y$, a function transformation from $X$ to $Y$ is determined, that is:
$$T_R(A) = A \circ R$$
where $T_R(A)$ represents the result of the synthetic operation on the input matrix $A$ and the input-output relationship matrix $R$, that is, the prediction matrix obtained by operating on $A$ with the mapping relationship $R$; $A$ represents the input matrix, $R$ represents the mapping relationship matrix from input to output, and $\circ$ represents the synthesis operator.
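To make the idea of a mapping matrix between an input matrix A and an output matrix concrete, the sketch below estimates a relation matrix R from training pairs and applies it to new inputs. It assumes that the synthesis operator is an ordinary matrix product and that R is fitted by least squares. Neither assumption is stated in the paper, and the data are synthetic.

```python
import numpy as np

def fit_mapping(A, Y):
    """Estimate R in Y ~ A @ R by least squares (assumed fitting method)."""
    R, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return R

def predict(A, R):
    """Apply the mapping matrix to an input matrix."""
    return A @ R

rng = np.random.default_rng(0)
# Synthetic standardized data: 500 samples, 5 inputs, 4 outputs.
A_train = rng.uniform(0.1, 1.0, size=(500, 5))
R_true = rng.uniform(-0.5, 0.5, size=(5, 4))
Y_train = A_train @ R_true   # noise-free targets, for illustration only

R_hat = fit_mapping(A_train, Y_train)
Y_pred = predict(A_train[:3], R_hat)
print(R_hat.shape, Y_pred.shape)
```

The five input columns and four output columns mirror the number of indicators named in the text, but the values themselves carry no physical meaning.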

Evaluation of model establishment
The evaluation method given in this paper begins by establishing the factor set and ends with the synthetic operation of the weight value of each factor and the single-factor evaluation matrix, from which the comprehensive evaluation $B$ is obtained. Each column of the single-factor evaluation matrix enters this operation, and the comprehensive evaluation matrix of the model is calculated from it. According to the evaluation criteria of high-quality tobacco products for tobacco color and moisture content, the tobacco baking quality prediction model established in this paper is theoretically feasible. Next, we verify the practical feasibility of the model through experiments.
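The combination of factor weights with a single-factor evaluation matrix can be sketched as follows. The code assumes a weighted-average operator, which is one common choice in comprehensive evaluation, whereas the paper's own operator is not recoverable from the text. The factor names, weights, grades and matrix entries are hypothetical.

```python
import numpy as np

def comprehensive_evaluation(weights, R):
    """Combine factor weights with a single-factor evaluation matrix.

    Assumed operator: weighted average, B = W @ R with weights summing to 1.
    """
    weights = np.asarray(weights, dtype=float)
    R = np.asarray(R, dtype=float)
    weights = weights / weights.sum()  # normalize the weights
    return weights @ R

# Hypothetical factor set: red, green, blue, moisture.
weights = [0.3, 0.2, 0.2, 0.3]
# Rows: factors. Columns: hypothetical grades (good, fair, poor).
R = np.array([[0.7, 0.2, 0.1],
              [0.6, 0.3, 0.1],
              [0.5, 0.4, 0.1],
              [0.8, 0.1, 0.1]])

B = comprehensive_evaluation(weights, R)
best_grade = int(np.argmax(B))  # index of the grade with the largest score
print(B, best_grade)
```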

Source of sample data
The experimental samples were tobacco leaves of the same grade provided by a tobacco processing company. The data cover various indexes of the tobacco leaves and include 553 records each for upper, middle and lower leaves. Because the model is established to explore the mapping relationship between input and output, that is, to infer the quality indexes after baking from the input data of the tobacco, the choice among upper, middle and lower leaf data has little impact on the experimental results. Therefore, only the upper leaf data are used to establish the model and verify the effect of the algorithm.

Construction of multi-mode reasoning tobacco baking prediction model and its application in tobacco baking process
From the upper leaf data, 550 samples were taken to establish the training set and test set with a ratio of 10:1: 500 samples were used as the model training set, and the remaining 50 samples were used as the test set to verify the feasibility of the model.
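The 500/50 division of the 550 upper leaf samples can be sketched as below. The paper does not say how samples were assigned to each set, so the random shuffle, the seed and the function name `split_samples` are assumptions made for illustration.

```python
import numpy as np

def split_samples(n_samples, n_train, seed=0):
    """Return disjoint index arrays for training and testing.

    A random permutation is assumed; the source does not state the
    selection rule.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    return order[:n_train], order[n_train:]

train_idx, test_idx = split_samples(550, 500)
print(len(train_idx), len(test_idx))
```

Fixing the seed keeps the split reproducible, which is helpful when comparing several prediction methods on the same test set.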
The multi-mode reasoning model is implemented in software. The system uses the MATLAB platform to build the input and output module interface. Through this interface, input indicators can be entered and selected conveniently, and the prediction results and the trend of each output quality index can be seen clearly. The developed reasoning model interface is shown in Figure 2. The left half of the interface is the input indicator display and operation button area, and the right half is the output indicator display area. After training, the model can predict either a single group of data or a batch of data. For a single group of data, the user enters the corresponding values in the corresponding fields; after clicking Run, the output box on the right displays the corresponding output indexes.
Figure 2. The interface of the developed predictive model software system
As shown in Figure 2, the model can also import the input parameter values from an Excel table to predict tobacco baking quality in batches, and the change curve of each quality index after baking is displayed clearly in the plots on the right side of the interface.
Before the model can predict normally, the model needs to be trained and tested first. When training the model, you only need to click the "Open" button, select 500 sets of training data from the pop-up data set, and then click the "Run" button to train the model. After the model is trained, the test set can be used to test the model training results. When performing model testing, select the corresponding test set and click "Run" to test the trained model.
Figure 3 shows a single data prediction performed with the trained model. A group of tobacco leaf data is entered in the indicator input boxes, and clicking Run displays the corresponding tobacco quality output indicators in the output box.
Figure 3. The output of a single set of data predictions
Comparing the single data prediction with the original data shows that the predicted output values are basically consistent with the given index values of the tobacco leaf. However, because so little data is processed at a time, single-group prediction is inefficient in actual tobacco production and has no practical significance, so it is used only as a development test. This paper mainly discusses the prediction of batch data.
Figure 4 shows the result of predicting one of the test sets after model training. The outputs of the various indicators for the last group of data are displayed in the corresponding boxes in the figure. In addition, the change trend of every indicator is displayed in the corresponding trend plot, and all predicted values are exported to an Excel table for subsequent use.
After analyzing the prediction results, we verify the accuracy of the model. Figure 5 compares the predicted values of red, green, blue and moisture with the given values of the original quality indicators. The comparison shows that the predicted output values of each index are basically consistent with the original values, and the prediction accuracy of the model exceeds 90%, which again demonstrates the validity of this model for tobacco baking quality prediction. The model not only has high prediction accuracy, but also has a certain anti-risk ability and strong robustness. The multi-mode reasoning method makes full use of all the information provided by the tobacco data, which makes the model more scientific and objective. In addition, when the input format is wrong, the developed system prompts the user to enter the correct data type, as shown in Figure 6. If the user runs the system directly without entering the training values, the system reports an error and reminds the user to select the input values.
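One way to quantify the agreement between predicted and given index values is sketched below. The paper does not define how its accuracy figure is computed, so the code assumes accuracy is one minus the mean relative error per index. The numbers are invented for illustration and are not the paper's results.

```python
import numpy as np

def prediction_accuracy(predicted, actual):
    """Per-index accuracy, assumed to be 1 minus the mean relative error."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    rel_err = np.abs(predicted - actual) / np.abs(actual)
    return 1.0 - rel_err.mean(axis=0)

# Hypothetical values; columns are red, green, blue, moisture.
actual = np.array([[120.0, 100.0, 40.0, 15.0],
                   [130.0, 110.0, 50.0, 16.0]])
predicted = np.array([[114.0, 105.0, 42.0, 15.0],
                      [136.5, 104.5, 47.5, 16.8]])

acc = prediction_accuracy(predicted, actual)
print(acc)
```

Under this definition, an accuracy above 0.9 for an index is equivalent to a mean relative error below 10% for that index.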

Comparison of the constructed predictive modeling method with existing methods
The control of the tobacco baking process is a mapping relationship between multiple inputs and multiple outputs, that is, if the input matrix is $m$, the mapped output matrix is $n$. For such control, it is difficult to establish a specific function that reflects the relationship between them, and the input parameters affecting the output indexes are coupled. To achieve better control and accurate prediction, each input must therefore be analyzed. In this paper, all input parameters and output indexes are taken as a whole to establish a relationship model between them, and a multi-mode reasoning state system method is proposed to establish the state model. The model is also evaluated comprehensively while it is being established, and three kinds of errors are corrected by the evaluation method, namely by the output index correction method, the mapping relationship matrix correction method and the curve correction method. Finally, through iterative machine learning on the samples, a more accurate tobacco baking prediction model system is established and developed. The model is tested over the value range of each input parameter, and the error of each output index is controlled below 10%, as shown in Figure 7.
Figure 7. Comparison of correct prediction rate between the proposed method and the existing prediction method
After theoretical and experimental verification, the data modeling method given in this article is applied to tobacco baking monitoring. The experimental results obtained from the model show a correct prediction rate of 90% or more. The prediction accuracy of this method is about 10% higher than that of the previous method, indicating that the model is effective and feasible.
In particular, the output changes significantly when a main factor is changed and only slightly when a secondary factor is changed. This is because the tobacco baking system is a multi-input multi-output functional relationship whose input parameters are coupled rather than independent of each other. For a given input, some input parameters play a major role, some play a secondary role, and some play no role. Therefore, when the input parameters are changed, the output indexes change more obviously if the changed parameter is a main factor.
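The distinction between main, secondary and inactive input factors can be probed with a simple perturbation test, sketched below. The code assumes a linear stand-in for the prediction model with a hypothetical mapping matrix, so it shows only the ranking idea and does not reproduce the coupling between inputs described in the text.

```python
import numpy as np

def input_sensitivity(predict, x, rel_step=0.05):
    """Output change caused by increasing each input by rel_step in turn."""
    x = np.asarray(x, dtype=float)
    base = predict(x)
    sens = np.zeros(x.size)
    for i in range(x.size):
        bumped = x.copy()
        bumped[i] *= 1.0 + rel_step
        sens[i] = np.linalg.norm(predict(bumped) - base)
    return sens

# Hypothetical mapping: input 0 is a main factor, input 1 is secondary,
# and input 2 has no effect on the outputs.
R = np.array([[0.80, 0.60],
              [0.05, 0.02],
              [0.00, 0.00]])
predict = lambda x: x @ R

sens = input_sensitivity(predict, np.array([1.0, 1.0, 1.0]))
ranking = np.argsort(sens)[::-1]  # most influential input first
print(sens, ranking)
```

With a nonlinear or coupled model, the same test would need to be repeated at several operating points, because the ranking can then depend on the current input values.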

Conclusion and future work
In this paper, a mapping relationship between multiple inputs and multiple outputs is established using the modeling method of multi-mode reasoning states. When establishing the model, the data are first filtered and irrelevant variables are eliminated. Then, to avoid the influence of variable dimensions, the maximum-value comparison method is used to normalize the data. During modeling, the expectation model is established for the first-order filter and the second-order filter respectively, and finally the complete prediction model is established. Compared with previous prediction methods, this model removes data redundancy, extracts the data information fully, obtains the relationship between input and output more accurately, and further improves prediction accuracy. The experimental results show that, compared with other algorithms, the algorithm established in this paper improves the accuracy of tobacco baking quality prediction, with the prediction accuracy of the model basically reaching more than 90%. The algorithm provides technical support for the intellectualization of tobacco production and a guiding method for quality monitoring in tobacco baking. In addition, the model can be used not only in the tobacco baking process but also as an algorithm for the baking of dried fruit, and it offers a reliable idea for automation in everyday life and production.