A Prediction Model for Regional Carbon Emissions Based on GRU Networks

Abstract: In this study, a data-driven regional carbon emissions prediction model is proposed. The Grubbs criterion is used to eliminate gross errors in the carbon emissions sensor data. Then, based on nearby valid data, the exponential smoothing method is used to interpolate the missing values and generate a continuous sequence for model training. Finally, the GRU network, a deep learning method, is used to process these standardized sequential data to obtain the prediction model. The prediction model is trained and evaluated on the monitoring data set of a wireless carbon sensor network from August 2012 to April 2014 and compared with a prediction model based on a BP network. The experimental results demonstrate the feasibility of the research method and related technical approaches as well as the accuracy of the prediction model, which provides a methodological basis for the nowcasting of carbon emissions and other greenhouse gas environmental data.


Introduction
Global climate change affects the survival and development of mankind and poses serious challenges to the sustainable development of the economy and society. In this context, the "low-carbon economy" based on low energy consumption and low pollution has become a global hot spot. Carbon dioxide (CO2) is one of the main components of the atmosphere and one of the main greenhouse gases: it absorbs long-wave radiation and thereby prevents heat from dissipating from the surface. Therefore, monitoring carbon emissions and designing accurate, data-driven prediction methods will be of great significance to the development of a low-carbon economy and the analysis of future climate change trends.
At present, the conventional carbon emissions accounting methods adopted around the world mainly include the IPCC inventory method (Martin et al. 2006), the measurement method, the material balance algorithm and the model decomposition method (Hao et al. 2011). These methods have been applied to varying degrees in practical studies (Cheng et al. 2013). However, their results are affected by factors such as the production process, residents' lifestyle, calorific value, carbon content, carbon oxidation rate and calculation methods, resulting in greater uncertainty. Data statistics and prediction methods therefore remain widely weak in the field of carbon emissions, and it is urgent to establish scientific, transparent, accurate and reliable data statistics and prediction methods that meet international standards.
In recent years, the maturity of sensor network technology has made real-time, continuous acquisition of research data convenient (Wang 2021) and provides strong technical support for data-driven methods for predicting carbon emissions environmental data. At the same time, deep learning has made significant achievements in the field of artificial intelligence. By training on big data and mining the deep connections within it, this method can improve the accuracy of classification and prediction, which makes it an effective big data processing method. The typical deep learning model, the convolutional neural network (CNN), has been widely used, but CNNs are not completely suitable for learning time series. To adapt to the processing of time series data, the recurrent neural network (RNN) emerged (Huang et al. 2019). An ordinary RNN is equivalent to a multi-layer deep neural network (DNN) unrolled along the time series (Fan et al. 2017), and this model is prone to exploding or vanishing gradients. The long short-term memory (LSTM) neural network was first proposed by Hochreiter & Schmidhuber in 1997 and has since been improved and popularized by many researchers; it is now widely used due to its excellent performance (Duan et al. 2019). The Gated Recurrent Unit (GRU) network was proposed by Cho et al. (2014) and can be regarded as a variant of LSTM. It has a simpler structure and fewer parameters than LSTM, so it is slightly faster to train or requires less data to generalize, while still performing well (Chen et al. 2021). This makes GRU more suitable for dynamic process modeling (Wang et al. 2020). The variation trend of CO2 concentration is related to environmental factors such as temperature and air humidity. These data are naturally continuous in time, and values that are adjacent in the time series are strongly correlated and causally related.
Using the GRU network to achieve CO2 concentration prediction can not only use the correlation of the data in the time dimension but also automatically mine the potential correlations between the data and improve the accuracy of carbon emissions data prediction.
In this study, a data-driven regional carbon emissions prediction model is proposed. The model is based on comprehensive measurement and perception data of complex environmental systems and draws on relevant research results in the field of deep learning. Information and data on various aspects of the natural environment, industrial production and social life are collected to form a regional meso-social simulation model, on which a regional carbon emissions environmental data prediction method is then designed.

Proposed method

Model framework
Based on the above related theories and algorithms, a prediction model for regional carbon emissions is established. The algorithmic flow of the model is shown in figure 1, and the details of the algorithms involved are introduced in the following subsections.

Eliminate gross error data
In the data preprocessing stage, the feature data are processed according to the Grubbs criterion. Let the samples be $x_i$ ($i = 1, 2, 3, \ldots, n$), where $n$ is the total number of samples. Then, for the $i$-th sample value:
1) Arrange $x_i$ in ascending order.
2) Calculate the mean $\bar{x}$ and the variance.
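As a minimal sketch of how a Grubbs-style rejection loop can be applied to sensor readings, the example below repeatedly tests the sample farthest from the mean using the statistic G = |x - mean| / s, where s is the sample standard deviation. The function name `grubbs_filter`, the `critical_value` callable, the flat threshold of 2.0 and the readings themselves are all assumptions for illustration; in practice the critical value depends on the sample size and the chosen significance level and is taken from a Grubbs table.

```python
from statistics import mean, stdev


def grubbs_filter(samples, critical_value):
    """Drop the most extreme sample while its Grubbs statistic
    G = |x - mean| / s exceeds critical_value(n) for the current size n.

    critical_value is a caller-supplied function (hypothetical helper)
    that returns the tabulated Grubbs threshold for a sample of size n.
    """
    kept = list(samples)
    removed = []
    while len(kept) >= 3:
        centre = mean(kept)
        spread = stdev(kept)  # sample standard deviation (n - 1 denominator)
        if spread == 0:
            break  # all values identical, nothing to reject
        suspect = max(kept, key=lambda v: abs(v - centre))
        g = abs(suspect - centre) / spread
        if g <= critical_value(len(kept)):
            break  # most extreme value is acceptable, so stop
        kept.remove(suspect)
        removed.append(suspect)
    return kept, removed


# Made-up CO2 readings in ppm with one obvious gross error.
readings = [400, 402, 398, 401, 399, 403, 397, 400, 402, 900]
# Placeholder threshold; a real run would look this up per sample size.
kept, removed = grubbs_filter(readings, critical_value=lambda n: 2.0)
print(removed)  # [900]
```

Passing the threshold as a function keeps the sketch free of any statistics dependency and makes the dependence on sample size explicit.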

Data sequence interpolation
The cubic exponential smoothing method is used to obtain missing data values within a time series. First, the length of the missing data sequence is determined; then data points are inserted step by step, smoothed from the CO2 concentration values of the section of the time series that precedes the gap. The three smoothing constants in the related processing formulas all take values in [0,1] and are selected subjectively, and the predicted value for period $i+m$ is taken as the missing value of the CO2 concentration.
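The gap-filling idea can be sketched as follows. This example uses Brown's single-constant form of cubic (triple) exponential smoothing, which is an assumption: the text refers to three smoothing constants, so the variant used in the study may differ. The function name `cubic_smoothing_forecast`, the default `alpha` and the sample values are hypothetical.

```python
def cubic_smoothing_forecast(history, steps, alpha=0.5):
    """Forecast `steps` values after `history` with Brown's cubic
    exponential smoothing (assumed variant, single constant alpha)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not history:
        raise ValueError("history must contain at least one value")

    # Initialise all three smoothed series with the first observation.
    s1 = s2 = s3 = float(history[0])
    for x in history[1:]:
        s1 = alpha * x + (1 - alpha) * s1   # first smoothing
        s2 = alpha * s1 + (1 - alpha) * s2  # second smoothing
        s3 = alpha * s2 + (1 - alpha) * s3  # third smoothing

    k = alpha / (2 * (1 - alpha) ** 2)
    a = 3 * s1 - 3 * s2 + s3
    b = k * ((6 - 5 * alpha) * s1
             - 2 * (5 - 4 * alpha) * s2
             + (4 - 3 * alpha) * s3)
    c = k * alpha * (s1 - 2 * s2 + s3)

    # Value predicted m periods after the last valid observation.
    return [a + b * m + c * m * m for m in range(1, steps + 1)]


# Made-up CO2 readings (ppm) before a gap of two missing samples.
before_gap = [401.0, 401.4, 401.9, 402.3, 402.8, 403.2]
filled = cubic_smoothing_forecast(before_gap, steps=2, alpha=0.5)
```

Here `steps` plays the role of the length of the missing sequence, and each returned value corresponds to one missing period after the last valid reading.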

Model training method
Since different features have different dimensions and units, the data need to be standardized to reduce the dimensional influence between features.
The GRU network is an effective variant of the LSTM network. The LSTM network uses three gate functions, the input gate, the forget gate and the output gate, to control the input value, memory value and output value. A GRU network has only two gates: the update gate and the reset gate. The structure is shown in figure 2.
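To make the two-gate structure concrete, the following sketch standardizes a feature column and runs one GRU forward pass over a short sequence in plain Python. Several points are assumptions rather than details confirmed by the text: the z-score form of standardization, the omission of bias terms, the convention h_t = (1 - z) * h_prev + z * h_candidate (some references swap the roles of z and 1 - z), the random weights, and all function and variable names.

```python
import math
import random


def zscore(values):
    """Standardize a feature column to zero mean and unit variance
    (assumed z-score form)."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    sd = math.sqrt(var) or 1.0  # a constant column is left at zero
    return [(v - m) / sd for v in values]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(weights, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def gru_step(x, h_prev, W_r, W_z, W_h):
    """One GRU forward step without biases. Each weight matrix acts on
    the spliced (concatenated) vector [h_prev, x]."""
    concat = h_prev + x
    r = [sigmoid(a) for a in matvec(W_r, concat)]  # reset gate
    z = [sigmoid(a) for a in matvec(W_z, concat)]  # update gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + x
    h_cand = [math.tanh(a) for a in matvec(W_h, gated)]  # candidate state
    return [(1 - zi) * hi + zi * ci
            for zi, hi, ci in zip(z, h_prev, h_cand)]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
            for _ in range(rows)]


rng = random.Random(0)
hidden, features = 4, 3  # e.g. CO2, temperature, humidity
W_r = random_matrix(hidden, hidden + features, rng)
W_z = random_matrix(hidden, hidden + features, rng)
W_h = random_matrix(hidden, hidden + features, rng)

# Made-up, already standardized feature vectors for three time steps.
sequence = [[0.1, -0.3, 0.5], [0.2, -0.1, 0.4], [0.0, 0.2, 0.3]]
h = [0.0] * hidden
for x in sequence:
    h = gru_step(x, h, W_r, W_z, W_h)
```

Because the new state is a gate-weighted mix of the previous state and a tanh candidate, every component of `h` stays strictly between -1 and 1 when the initial state is zero.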
It can be seen from the formulas of the forward propagation process that the parameters to be learned are $W_r$, $W_z$, $W_{\tilde{h}}$ and $W_o$. The first three parameters are all spliced (concatenated) matrices, so they need to be divided during training. The input of the output layer is $W_o h$ (19).

The output of the output layer, the loss of a single sample at a certain moment and the loss of a single sample over all moments are then defined in turn. To train the network with the error backpropagation algorithm, the partial derivative of the loss function with respect to each parameter must be obtained first. After the partial derivatives have been calculated, the parameters are updated iteratively until the loss converges. The training process of the prediction model is shown in table 2.
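As a hedged illustration of the output, loss and update steps described above, one common formulation is sketched below. It is an assumption, not necessarily the form used in this study: the output activation $f$ (often the identity for regression), the squared-error loss against a target value $y_d$, the learning rate $\eta$, and the symbols $W_o$, $y_t^{o}$, $E_t$ and $E$ are all introduced here for illustration only.

```latex
% Output of the output layer at time step t (f is an assumed activation)
y_t^{o} = f\left(W_o h_t\right)

% Assumed squared-error loss of a single sample at time step t
E_t = \tfrac{1}{2}\left(y_d - y_t^{o}\right)^2

% Loss of a single sample over all T time steps
E = \sum_{t=1}^{T} E_t

% Gradient-descent update for any learned parameter W
W \leftarrow W - \eta \, \frac{\partial E}{\partial W}
```

Under these assumptions, training repeats the forward pass, the gradient computation and the update until $E$ stops decreasing.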

Experiment and model evaluation
Based on the requirements of the project, the research group designed a wireless carbon emissions monitoring sensor device. It mainly integrates an ARM control module, a carbon dioxide data acquisition module, a temperature data acquisition module, a humidity data acquisition module, a gas flow rate acquisition module, a GPS positioning module and a GPRS data transmission module. In addition, considering its working environment, the device is designed as a low-power, portable, professional embedded system that meets industrial standards and offers good stability and reliability. The prototype equipment is shown in figure 3.
The experiment uses the CO2 concentration, temperature, humidity and other data obtained by the wireless carbon sensor network from August 2012 to April 2014. These data are collected by the above-mentioned sensors and, at the set time interval, regularly transmitted to the server of the carbon data processing center through the 4G network. The processing center stores the data in the database and processes them to produce relevant data products. The data collection and processing process is shown in figure 4.
Since August 2010, the research group has set up 14 environmental monitoring sites in Genhe City, Hulunbeier City, Inner Mongolia Autonomous Region, and since April 2014, five monitoring sites have been set up in the new urban area of Huhehaote City, Inner Mongolia Autonomous Region. This number will increase in the future in order to obtain more complete data sets. The monitoring system interface and data of some sites are shown in figure 5.
For the prediction model, the smaller the error of the result, the higher the prediction accuracy. For a given prediction model, the setting of parameter values is the most influential factor, so the parameters need to be adjusted repeatedly to obtain a satisfactory model. As can be seen from the partial test results in table 3, the prediction ability of the model improves as the lookback value increases, and the prediction effect is best when its value is 3. As the value continues to increase, however, the accuracy of the model decreases. This indicates that when the lookback value is too small, the model fits poorly because it has insufficient correlation information between data, whereas when the lookback value is too large, the correlation between data weakens, which reduces both the accuracy and the generalization ability of the model. In addition, in theory, the larger the training set, the better the learning ability, but a training set that is too large also makes training more time-consuming. For this model, a training set size of 800 is enough to guarantee its accuracy; increasing the training set further does not improve the accuracy significantly, which indicates that the prediction accuracy becomes insensitive to the number of training samples.
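The role of the lookback value can be illustrated with a small windowing sketch: each training input consists of `lookback` consecutive readings, and the target is the reading that follows them. The function name `make_windows`, the sample readings and the split point are hypothetical and only meant to show the mechanics.

```python
def make_windows(series, lookback):
    """Turn a sequence into supervised pairs: each input holds
    `lookback` consecutive values and the target is the next value."""
    if lookback < 1 or lookback >= len(series):
        raise ValueError("lookback must be between 1 and len(series) - 1")
    count = len(series) - lookback
    inputs = [series[i:i + lookback] for i in range(count)]
    targets = [series[i + lookback] for i in range(count)]
    return inputs, targets


# Made-up CO2 readings in ppm.
co2 = [401.2, 402.0, 401.7, 403.1, 402.6, 404.0]
inputs, targets = make_windows(co2, lookback=3)

# Hypothetical split point; the study reports a training set size of 800.
train_size = 2
train_inputs, train_targets = inputs[:train_size], targets[:train_size]
test_inputs, test_targets = inputs[train_size:], targets[train_size:]
```

A larger lookback gives each input more context but leaves fewer windows for a series of the same length, which is one practical side of the trade-off discussed above.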
To demonstrate the performance of the prediction model based on the GRU network, it is necessary not only to compare the model with itself under different parameter settings but also to compare it with different models. The model is compared with a model based on the Back Propagation (BP) network on the same training set, and the results are shown in figure 6 and table 4. It can be seen from the figure and table that this prediction model is more accurate than the BP model. This shows that the model can extract deep features from high-dimensional data through various nonlinear operations and exploit the characteristics of time series data, so that it achieves better prediction and fitting.

Conclusion
The experimental results show that the prediction model based on the GRU network has high accuracy and fits the variation trend of CO2 concentration well. This also shows that the model makes full use of the advantages of the GRU network, as a deep learning method, for time series data. It can extract abstract, deep features from high-dimensional data through a variety of nonlinear operations, which gives the prediction model better prediction and fitting performance. The results also demonstrate the feasibility of this research method and related technical approaches, which provides a methodological basis for the nowcasting of carbon emissions and other greenhouse gas environmental data.