Process Prediction Using Machine Learning Techniques Applied to Cement Industry

Abstract

As part of reorganizing its means of production, the heavy cement industry is adopting the Industry 4.0 concept, which raises the efficiency and productivity of industrial processes through customization and flexibility while reducing costs and energy consumption. To this end, it applies process prediction through a digital transformation built around a 4.0 tool that monitors and analyzes temperature and pressure in real time. Sensors transmit their measurements to a computer platform for real-time analysis, and a predictive model anticipates failures in order to remedy the problem of preheater cyclone blockages. This technology reduces incidents and extends the life of equipment [1].

1. Introduction

The first industrial revolution began in 1765 with mechanical production driven by the steam engine. A second followed in 1870 with mass production powered by electricity and oil. The third, in 1969, brought electronics and computer technologies into production. Today we have reached the fourth generation, Industry 4.0, which introduces new technologies such as the Internet of Things, artificial intelligence, the cloud, big data and cyber-physical systems.

One of its goals is to reduce excessive scrap, which has led researchers to think about reuse strategies in a different way [2].

In today's production industries, there is a growing need for highly available equipment that can run non-stop, 24/7. Any failure, even a minor one, is therefore unacceptable, as it can significantly affect cost and production. This motivates predictive process analysis, a recently emerged discipline that aims to provide insights into business processes in modern organizations. It uses event logs, which capture process execution traces in the form of multidimensional sequence data, as a key input to build predictive models. These predictive models, often built on deep learning or machine learning techniques, can be used to make predictions about future states of business process execution [3].

On the other hand, the cement market is highly competitive, and industries operating in this sector find it necessary to improve their performance.

To this end, the heavy cement industry has established an Industry 4.0 plan, which consists of implementing tools that enable process prediction, in order to increase the efficiency of industrial processes and increase productivity while reducing costs and energy consumption through flexibility and customization.

2. Purpose And Delimitation

This paper presents the work done to predict preheater cyclone blockages. We started by collecting all the data with the help of analytical tools, which allowed us to identify the defects of the current preheating process that affect the equipment and, consequently, production performance. We then used a predictive model [4] based on machine learning that predicts temperatures and pressures with high efficiency, in order to understand what is happening inside the cyclone, to determine the maximum pressure that influences clogging, and to follow the pressure variation during the process. We also wrote a program that analyzes the predicted values: if the pressure exceeds the maximum condition or follows the blockage function, an alert is sent to the operator or to the maintenance department, together with the interventions and actions to carry out.
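The threshold part of this alerting logic can be sketched as follows. This is only an illustration under stated assumptions: the limit value, the names `check_predictions` and `Alert`, and the message text are hypothetical, since the paper does not give them, and the comparison assumes that a more negative depression signals a developing blockage. The check against the blockage function is not covered here.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical limit: the paper does not state the maximum depression.
MAX_DEPRESSION = -200.0


@dataclass
class Alert:
    minute: int        # position of the prediction in the horizon
    predicted: float   # predicted depression
    message: str       # action proposed to the operator


def check_predictions(predicted: List[float],
                      limit: float = MAX_DEPRESSION) -> List[Alert]:
    """Return one alert per predicted value that is more negative than the limit."""
    alerts = []
    for minute, value in enumerate(predicted):
        if value < limit:
            alerts.append(Alert(minute, value,
                                "Depression beyond limit: inspect the cyclone"))
    return alerts


# Illustrative predicted values only, not plant data.
predictions = [-132.7, -151.1, -205.4, -129.5]
alerts = check_predictions(predictions)
for alert in alerts:
    print(f"t+{alert.minute} min: {alert.predicted} -> {alert.message}")
```

In a real deployment the returned alerts would be forwarded to the operator or the maintenance department rather than printed.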

3. Predictive Process Monitoring

In recent years, organizations have tried to leverage historical process data to get data-driven insights from day-to-day business operations. One way to improve process performance by leveraging historical data is to train models based on different types of machine learning.

Predictive process monitoring (PPM) [5] takes historical process data (a set of completed business process executions) as input and uses machine learning techniques to predict a user-specified need during the runtime of a selected business process. Because the problem is highly complex, past studies have used a wide variety of setups, differing in algorithms, datasets, domains or prediction goals.

PPM aims to predict quantifiable values during a running process execution, whereas business process intelligence [6], for example, focuses on long-term predictions such as key performance indicators. To predict the outcome of running processes, PPM exploits historical data of already executed processes of the same type. The set of historical data consists of events that correspond to the execution of activities of each process instance. Based on process prediction, the idea is to enable the business to proactively improve process performance and mitigate risks.

4. Machine Learning

4.1. Definition

Machine learning [7] is not a new technology. The first artificial neural network, called "Perceptron", was invented in 1958 by the American psychologist Frank Rosenblatt.

Initially, Perceptron was intended to be a machine, not an algorithm. In 1960, it was used in the development of the Mark 1 Perceptron image recognition machine. Mark 1 Perceptron was the first computer to use artificial neural networks (ANNs) to simulate human thinking and learn by trial and error.

Due to the emergence of open-source libraries and frameworks and thanks to the multi-billion-fold increase in computer processing power between 1956 and 2021, machine learning has become widely democratized. Today, machine learning is everywhere: from stock trading to malware protection to marketing personalization [8].

Machine learning is a computer programming technique that uses statistical probabilities to give computers the ability to learn by themselves without explicit programming. The basic goal of machine learning is to "teach computers to learn" - and subsequently to act and react - as humans do, improving their learning style and knowledge autonomously over time. The ultimate goal would be for computers to act and react without being explicitly programmed for those actions and reactions. Machine learning uses development programs that adjust each time they are exposed to different types of input data. Regardless of its degree of complexity, machine learning can be classified into three broad categories: Supervised, unsupervised and reinforcement machine learning.

4.2. Steps of machine learning

The goal of a Machine Learning project is to develop efficient learning models from large data sets (datasets). To achieve this, it is recommended to follow a precise process [9]:

4.2.1. Defining the problem to be solved

The cement manufacturing process runs from the quarry department to shipping. During the preheating phase, preheater cyclone blockages can occur. They are caused by the interaction between the cyclone surface and the raw meal, which contains substances of different forms. These substances are retained and settle on the surface of the cyclone or on its lower walls. If there is a strong depression, the raw meal follows the direction of the depression (the opposite direction), which leads to clogging and interferes with the raw meal flow (120 t/h).

The operator cannot remove the clogging mechanically (by back pressure).

In order to avoid this problem, the plant team, as part of the transformation to Plant 4.0 [10], implemented a process prediction plan that helped reduce the losses caused by this phenomenon, thus saving time and money.

4.2.2. Collecting the necessary data

With the help of our connected sensors, we know the temperature and pressure in each cyclone (1, 1bus, 2, 3 and 4); these values are updated every minute. They give us an idea of the behavior of the raw meal and air inside the cyclones, so that the operator in the control center can increase or decrease the flow of raw meal and air to avoid several risks (temperature increase, clogging, etc.).

Table 1

Model data
Timestamp	Dep C4	Dep C3	Dep C2	Temp MC4	Temp EC4	Temp MC3	Temp SC3	Temp C2	Fan speed	Raw meal flow
07/04/19 09:00	-132.72	-132.72	-409.45	878.45	869.42	752.96	802.84	599.18	641.42	129.24
07/04/19 09:01	-133.85	-281.5	-412.48	879.14	871.16	753.94	802.67	599.28	641.31	130.52
07/04/19 09:02	-135.44	-278.76	-410.49	880.18	866.64	752.96	802.74	598.02	641.36	129.8
07/04/19 09:03	-151.09	-193.07	-416.94	878.45	864.21	751.33	800.59	596.26	641.54	129.89
07/04/19 09:04	-144.84	-278.82	-413.52	876.37	860.05	749.04	798.84	594.22	641.42	128.61
07/04/19 09:05	-129.45	-270.84	-404.47	872.2	859.35	746.43	796.64	592.79	641.45	128.65
07/04/19 09:06	-126.18	-267.78	-402.29	873.52	861.78	747.08	795.84	592.8	641.42	128.39
07/04/19 09:07	-127.49	-275.42	-406.42	879.84	870.12	748.72	798.52	594.15	641.47	128.81
07/04/19 09:08	-129.68	-273.5	-407.31	883.31	873.59	751	801.88	596.98	641.34	130.52
07/04/19 09:09	-131.21	-272.69	-406.49	880.88	869.77	751	802.07	879.2	641.51	131.38
07/04/19 09:10	-131.5	-270.87	-404.42	880.88	864.91	749.04	800.31	595.5	641.34	132.64
07/04/19 09:11	-138.38	-278.26	-408.82	878.45	861.09	747.08	797.04	593.43	641.36	133.32
07/04/19 09:12	-138.95	-274.05	-406.95	873.59	855.88	742.84	792.04	589.39	641.43	131.78
07/04/19 09:13	-134.44	-270.14	-404.62	868.38	853.1	739.58	788.6	585.67	641.41	131.25
07/04/19 09:14	-145.1	-281.99	-410.43	864.91	854.14	738.92	786.38	583.23	641.44	134.25
07/04/19 09:15	-145.28	-276.88	-409.49	863.17	851.02	737.29	783.6	579.31	641.42	132.64

With this table, we can get a general idea of each column to choose the input and output parameters of our model.

4.2.3. Prepare and clean the data

To obtain a good model, the data must be well prepared. We removed the missing values and the empty rows, and we converted the time into a numeric variable, since the model cannot directly handle a date structure such as 06.05.2021; we used a short script to solve this problem.
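A minimal sketch of this preparation step is given below, using only the Python standard library. The function name `clean`, the three-column row layout and the day/month/year reading of the timestamps are assumptions made for illustration; the authors' actual script is not shown in the paper. The second sample row has been deliberately blanked to show how incomplete rows are dropped.

```python
from datetime import datetime

# Illustrative rows in the layout of Table 1: timestamp, depression, temperature.
raw_rows = [
    ("07/04/19 09:00", "-132.72", "878.45"),
    ("07/04/19 09:01", "", "879.14"),         # missing value: row is dropped
    ("", "", ""),                             # empty row: dropped
    ("07/04/19 09:02", "-135.44", "880.18"),
]


def clean(rows, fmt="%d/%m/%y %H:%M"):
    """Drop incomplete rows and convert each timestamp to minutes
    elapsed since the first kept sample (a plain numeric variable)."""
    kept = [row for row in rows if all(field.strip() for field in row)]
    if not kept:
        return []
    start = datetime.strptime(kept[0][0], fmt)
    cleaned = []
    for stamp, depression, temperature in kept:
        elapsed = datetime.strptime(stamp, fmt) - start
        minutes = elapsed.total_seconds() / 60.0
        cleaned.append((minutes, float(depression), float(temperature)))
    return cleaned


cleaned = clean(raw_rows)
print(cleaned)
```

Expressing time as elapsed minutes is one possible encoding; other numeric encodings (for example a Unix timestamp) would serve the same purpose.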

Correlation expresses the strength of the relationship between two variables. Correlations are widely consulted by analysts and portfolio managers, since understanding them is part of risk management.

To calculate the correlation coefficient between two numerical variables, we summarize the link between them with a straight line; this is called a linear fit [11].
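As an illustration, the Pearson correlation coefficient between two measured columns can be computed as in the sketch below. The helper name `pearson` is ours, and the choice of the two columns (the first six values of the fourth and fifth data columns of Table 1) is only an example; the paper does not state which pairs were examined.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# First six rows of two temperature columns of Table 1.
temp_a = [878.45, 879.14, 880.18, 878.45, 876.37, 872.2]
temp_b = [869.42, 871.16, 866.64, 864.21, 860.05, 859.35]
r = pearson(temp_a, temp_b)
print(f"correlation = {r:.3f}")
```

A value close to 1 or -1 indicates a strong linear relationship, while a value close to 0 indicates that a straight line summarizes the link poorly.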

We chose the time, the raw meal flow and the fan speed as input variables; the output variables are the pressure and temperature of cyclone 4.

4.2.4. Determining the right model

4.2.4.1. Mlp model

Mlp defines a multilayer perceptron model, which can fit both classification and regression models. In this part, we split our data into X_Train and Y_Train, work with a size of 50, and repeat the training each time until we converge to the link function. In this case we obtained an efficiency of 0.010. The mlp model is therefore too weak to make predictions close to reality.

4.2.4.2. Knn model

The Knn (k-nearest neighbors) algorithm is a supervised machine learning model that predicts a target variable from multiple independent variables.

We follow the same approach as for the mlp model, except that we train with 30 samples each time. We obtained an efficiency of 0.93, which makes Knn a good candidate for predicting reality.
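The principle behind Knn regression can be sketched in a few lines: the prediction for a new point is the average target of its k closest training points. The function name `knn_predict`, the value k = 3 and the choice of inputs are assumptions for illustration; the paper does not report the number of neighbors or the implementation used. In practice the inputs would usually be scaled first so that no variable dominates the distance.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points closest to `query`."""
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Inputs: (fan speed, raw meal flow); target: Dep C4 (first six rows of Table 1).
train_x = [(641.42, 129.24), (641.31, 130.52), (641.36, 129.8),
           (641.54, 129.89), (641.42, 128.61), (641.45, 128.65)]
train_y = [-132.72, -133.85, -135.44, -151.09, -144.84, -129.45]

predicted = knn_predict(train_x, train_y, query=(641.40, 129.5))
print(f"predicted Dep C4 = {predicted:.2f}")
```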

4.2.4.3. Lasso model

Lasso is a regression analysis method that performs variable selection and regularization in order to make the resulting statistical model easier to interpret. With the same approach, we found an efficiency of 8.33.

4.2.4.4. Linear regression model

Linear regression defines a model that predicts numeric values from predictors using a linear function. With this model we found an efficiency of 9.28.
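For a single predictor, the linear function can be fitted in closed form by ordinary least squares, as in the sketch below. The paper's model uses several inputs, so this one-variable version (with the hypothetical helper `fit_line`) only illustrates the principle; the pairing of raw meal flow with a cyclone 4 temperature is our choice of example.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Raw meal flow and a cyclone 4 temperature, first five rows of Table 1.
flow = [129.24, 130.52, 129.8, 129.89, 128.61]
temp = [878.45, 879.14, 880.18, 878.45, 876.37]

slope, intercept = fit_line(flow, temp)
predicted = slope * 130.0 + intercept   # prediction for a flow of 130
print(f"slope = {slope:.3f}, intercept = {intercept:.2f}, "
      f"prediction = {predicted:.2f}")
```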

After testing and comparing these models, we retained Knn [12], which reached a score of 0.93 (93%) and allowed us to predict the pressure and temperature.

4.2.5. Training and evaluating the model

Of all the steps of machine learning, the training phase [13] remains the most characteristic. Fed with data, our model is trained over time to progressively improve its ability to react to a given situation, solve a complex problem or perform a task. For this learning phase, we used training data. Because the full set of collected information is often too heavy and too resource-intensive, we selected a part of the data set (sampling) in order to train the model more efficiently and improve its predictions.

4.2.6. Test and deploy the model

This last step of machine learning confronts the model with the reality of the field. In this test phase, the other part of the data, the test dataset, is used. This subset allows us to refine the model with scenarios or data that the computer has not yet encountered during the learning phase. In this way, we can evaluate the performance of the model in the context of our business.
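The split between training and test data, and the evaluation on the held-out part, can be sketched as follows. The 80/20 proportion, the helper names and the use of the coefficient of determination R² as the score are assumptions; the paper does not state how its scores were computed.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows reproducibly and hold out a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1.0 is a perfect fit, 0.0 is no better
    than always predicting the mean of the observed values."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


rows = list(range(10))                      # stand-in for the cleaned samples
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), "training rows,", len(test_rows), "test rows")
```

For time-ordered sensor data, a chronological split (training on the earlier samples and testing on the later ones) is a common alternative to shuffling.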

5. Solidworks Simulation

To better describe the phenomenon, we simulate the preheater cyclone in SolidWorks, a 3D modeler based on parametric design [14].

To represent the pre-calcination of the raw meal in the cyclone, we propose a model based on balance equations, which makes it possible to establish a process model from the basic laws of conservation of mass and energy and from the transfer laws.

5.1. Cyclone drawing

The overall design of our system, drawn with its real dimensions, is divided into two parts: the upper one, which ensures turbulence, and the other, which ensures the separation of the raw meal and the compressed air.

To describe the phenomenon we need the density, specific heat and thermal conductivity. The chemical characteristics of the samples, taken from the laboratory analyses, are given in the table below:

Table 2

Percentage of the chemical component
Timestamp	CaO (%)	SiO2 (%)	Al2O3 (%)	Fe2O3 (%)	MgO (%)	SO3 (%)	K2O (%)	Na2O (%)
01/06/21 06:00	44.75	12.08	3.27	1.81	0.8	0.06	0.35	0.1
01/06/21 08:00	43.34	13.39	3.52	1.96	0.8	0.05	0.38	0.1
01/06/21 10:00	43.92	13.02	3.43	1.9	0.81	0.06	0.38	0.1
01/06/21 12:00	44.19	12.54	3.42	1.92	0.8	0.05	0.36	0.09
01/06/21 14:00	44.33	12.56	3.43	1.87	0.8	0.07	0.36	0.09
01/06/21 16:00	44.02	12.8	3.53	1.92	0.8	0.06	0.37	0.1
01/06/21 18:00	44.12	12.89	3.49	1.92	0.81	0.05	0.39	0.09
01/06/21 20:00	44.02	12.71	3.4	1.89	0.8	0.08	0.35	0.1
01/06/21 22:00	43.92	12.93	3.43	1.9	0.8	0.05	0.35	0.09
02/06/21 00:00	44.46	12.89	3.43	1.88	0.81	0.07	0.34	0.1
02/06/21 02:00	44.22	12.9	3.44	1.94	0.81	0.06	0.34	0.09
02/06/21 04:00	44.23	12.82	3.43	1.89	0.79	0.05	0.38	0.09
02/06/21 06:00	43.8	12.97	3.34	1.89	0.79	0.07	0.33	0.1
02/06/21 08:00	44.44	12.3	3.27	1.93	0.79	0.08	0.32	0.09

5.2. Calculation execution

SolidWorks Flow Simulation automatically generates a computational mesh based on the parameters we provide. The mesh is created by dividing the computational domain into cells, which are further subdivided if necessary in order to correctly resolve the model geometry and the flow features.

5.3. Flow trajectories

The flow trajectories display the streamlines, which offer a clear and understandable representation of the flow particularities.

5.4. Results extraction

We extract the pressure values for several diameters until the preheater cyclone blockage is obtained.

Table 3

Pressure values as a function of diameter
Diameter (mm)	Depression (Pa)
20	-224.3
90	-211.7
500	-152.2
900	-130.8
1700	-120.9
2200	-115.4
3000	-111.7

Using the table above, we obtained the relationship between diameter and pressure:

The blockage function: \(P=A{e}^{-Bt}\), where P is the pressure, with A=-109.44 and B=0.01414

In short, we reproduced the blockage phenomenon in simulation in order to see its influence on the pressure and temperature, which led us to find the blockage function and the maximum pressure.

6. Improvements

To remedy these defects and drawbacks, the solution adopted is a digital transformation through the implementation of a process prediction tool. This tool aims to monitor pressure and temperature levels and to predict clogging, in order to reduce losses and equipment failures, minimize downtime, and thus increase the production rate.

This technology is part of an Industry 4.0 plan established by the heavy cement industry, which aims to turn the plant into a Plant 4.0. The plant has decided to install pressure and temperature sensors together with a predictive model to monitor and analyze cyclone pressures and temperatures in real time.

7. Deployment Of A Predictive Model

The objective of the deployment is to integrate a machine learning model into a production environment to make practical data-driven decisions.

The starting point for the design of our architecture must always be the business requirements and the environment in which it will be deployed. Our architecture is composed of three main steps, as shown in the following figure:

7.1. Automatic data acquisition

The first step in this process is to collect data. The temperature and pressure measurements must be collected and transferred to the database. This database is connected to the Power BI [15] platform, for automatic data acquisition.

7.2. Data processing

The machine learning model is loaded into Power BI, which can run Python scripts, making the process smooth. After the data are acquired, a batch is configured to process them with the already trained model, which then predicts new values of temperature and pressure.
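Independently of how the script is hosted, the batch step amounts to applying the trained model to each newly acquired row. The sketch below shows that loop in plain Python; `process_batch`, the feature order and `dummy_model` (a stand-in returning fixed illustrative values) are all hypothetical and do not reproduce the plant's actual script or any Power BI interface.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed feature order: (minutes, raw meal flow, fan speed).
Features = Tuple[float, float, float]
# Assumed model output: (pressure, temperature).
Prediction = Tuple[float, float]


def process_batch(model: Callable[[Features], Prediction],
                  batch: Iterable[Features]) -> List[Dict[str, float]]:
    """Apply an already trained model to every row of a batch."""
    results = []
    for features in batch:
        pressure, temperature = model(features)
        results.append({
            "minutes": features[0],
            "pressure": pressure,
            "temperature": temperature,
        })
    return results


def dummy_model(features: Features) -> Prediction:
    """Stand-in for the trained model; returns fixed illustrative values."""
    return (-130.0, 878.0)


new_rows = [(0.0, 129.24, 641.42), (1.0, 130.52, 641.31)]
for row in process_batch(dummy_model, new_rows):
    print(row)
```

Returning one record per row keeps the output easy to plot as a time series of predicted pressure and temperature.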

7.3. Communicating the results

The prediction results are displayed on a platform in the form of a graph showing the state of the machines and their behavior, as well as a graph showing the evolution of the output parameters (temperature and pressure) as a function of time. These graphs serve as indicators of cyclone degradation, which allow the operator to anticipate breakdowns and to take the necessary decisions.

8. Conclusion

In this paper, we presented process prediction as a tool based on machine learning techniques, applied to the cement industry. We detailed the steps for building a machine learning model that helps the company predict the key values of the process in order to know the time remaining before breakdowns occur. To show the effectiveness of this process prediction, we presented a case study of a project already carried out by the heavy cement industry team within the new concept of Industry 4.0, which gave relevant results.

Process prediction using machine learning can produce strong models that reduce losses of money and time within companies, whatever their field of application.

Declarations

Acknowledgements

The authors would like to thank all the stakeholders of this project and particularly the engineers and technicians of the LafargeHolcim plant.

Funding: The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Financial interests: The authors have no relevant financial or non-financial interests to disclose.
Availability of data and material: Not applicable
Code availability: Not applicable
Ethics approval: Not applicable
Consent to participate: Not applicable
Consent for publication: Not applicable

References

[1] Gallo T, Cagnetti C, Silvestri C, Ruggieri A (2021) Industry 4.0 tools in lean production: A systematic literature review. https://doi.org/10.1016/j.procs.2021.01.255
[2] Groumpos PP (2021) A Critical Historical and Scientific Overview of all Industrial Revolutions. IFAC-Pap 54:464–471. https://doi.org/10.1016/j.ifacol.2021.10.492
[3] Ali Y (2018) Artificial Intelligence Application in Machine Condition Monitoring and Fault Diagnosis
[4] Hanatani T, Fukuda N, Hiroyuki H (2007) Simulation of Network Agents Supporting Consumer Preference on Reuse of Mechanical Parts. In: Takata S, Umeda Y (eds) Advances in Life Cycle Engineering for Sustainable Manufacturing Businesses. Springer, London, pp 353–358
[5] Spree F (2020) Predictive Process Monitoring-A Use-Case-Driven Literature Review. In: EMISA Forum: Vol. 40, No. 1. De Gruyter
[6] Kim J, Comuzzi M, Dumas M et al (2022) Encoding resource experience for predictive process monitoring. Decis Support Syst 153:113669. https://doi.org/10.1016/j.dss.2021.113669
[7] Hey T, Butler K, Jackson S, Thiyagalingam J (2020) Machine Learning and Big Scientific Data. Philos Trans R Soc Math Phys Eng Sci 378:20190054. https://doi.org/10.1098/rsta.2019.0054
[8] Li B, Lee Y, Yao W et al (2020) Development and application of ANN model for property prediction of supercritical kerosene. Comput Fluids 209:104665. https://doi.org/10.1016/j.compfluid.2020.104665
[9] Praveena M, Jaiganesh V (2017) A Literature Review on Supervised Machine Learning Algorithms and Boosting Process. Int J Comput Appl 169:32–35
[10] Hassani A (2020) L’industrie 4.0 et les facteurs clés de succès de projet. Masters, Université du Québec à Trois-Rivières
[11] Speed T (2011) A Correlation for the 21st Century. Science 334:1502–1503. https://doi.org/10.1126/science.1215894
[12] Wang F, Zhen Z, Wang B, Mi Z (2018) Comparative Study on KNN and SVM Based Weather Classification Models for Day Ahead Short Term Solar PV Power Forecasting. Appl Sci 8:28. https://doi.org/10.3390/app8010028
[13] Kumar A (2020) Machine Learning Models Evaluation Infographics. In: Data Anal. https://vitalflux.com/machine-learning-models-evaluation-infographics/. Accessed 26 Sep 2022
[14] Jing W (2017) The Application of Solidworks in Scientific Research and Innovation. Comput Telecommun 1:74–75
[15] Becker LT, Gould EM (2019) Microsoft Power BI: Extending Excel to Manipulate, Analyze, and Visualize Diverse Data. Ser Rev 45:184–188. https://doi.org/10.1080/00987913.2019.1644891