It was important for the development of a CO2 emission model for start-stop technology that the model could reflect the moments when the tested vehicle's engine is switched off, e.g., during momentary stops at junctions. Current microscale models for emission maps and for calculating instantaneous emission values do not produce such results (Eijk et al. 2018; Mądziel and Campisi 2022). Therefore, it is necessary to analyze instantaneous CO2 emission plots, emission maps, and residual graphs, and to validate the models using the R2 and MSE indices.
The input data for the development of the emission models were the parameters velocity, acceleration, and road gradient. These data were collected from the PEMS recording and were additionally recorded via OBD-II for verification of the values obtained. The velocity data for the tested route of the driving cycle are shown in Fig. 5.
The purpose of the velocity data collection was to cover values that characterize urban, rural, and highway driving; the relevant range spanned speeds between 0 and 130 km/h. It is important to collect enough data, as the dataset is later split into 80% for training and 20% for testing. With more data collected, a more accurate CO2 emission model can be created. However, it should be noted that testing with the PEMS system is relatively expensive, as a large amount of calibration gas is required to zero the instrument and prepare it for collecting correct road test data. Collecting a sufficient amount of data is crucial to the correct implementation of future predictions, as evidenced, for example, by the work of (Kan et al. 2018) and (Zhang et al. 2022).
Equally important as the velocity data are the data on vehicle acceleration. The acceleration recorded during the driving cycle is shown in Fig. 6.
Similarly to velocity, the acceleration parameter must be related to different driving characteristics. In urban conditions, acceleration often has to be dynamic, although in congested conditions it is somewhat reduced. In rural conditions, acceleration occurs at velocities of roughly 60–100 km/h; in these ranges it is generally less dynamic due to traffic conditions and vehicle capabilities, while differences in driving style are also relevant here. Acceleration is equally relevant for highway driving, where speeds range from 100 to 130 km/h. The importance of this dataset for the development of a good model is described, among others, in (Zhang et al. 2021a) and (Peng et al. 2022).
Another important parameter that provided an input value for the development of CO2 emission models using artificial intelligence techniques was the road gradient. The road gradient is a factor that greatly influences the results obtained for CO2 emissions. It also forms the input base for the emission models already developed. The importance of this parameter in relation to fuel consumption and emissions is indicated, among others, in the works (Rosero et al. 2021) and (Liu et al. 2019).
The data for velocity, acceleration, and gradient constituted the so-called independent variables, which served as input data to create a CO2 emission model for a vehicle with start-stop technology; the CO2 data were therefore treated as the dependent variable. For model creation with artificial intelligence techniques, in particular machine learning as a branch of artificial intelligence, it is important that the created models perform well and calculate with good accuracy on new, previously unused data. Therefore, the input data for the creation of the model are divided into two portions, often referred to as the train-test split (Salazar et al. 2022; Rashidi et al. 2019). The first portion, the training set, comprises 80% of the original data, while the second, the testing set, comprises the remaining 20%. The training set is used to create the prediction model, i.e., the trained model, which is then applied to the data of the testing set, treated as new unseen data (Valabas et al. 2019).
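A minimal sketch of such a split, assuming scikit-learn is available; the variable names and synthetic values below are illustrative stand-ins, not data from the study:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the recorded driving signals (hypothetical values).
rng = np.random.default_rng(42)
n = 1000
X = np.column_stack([
    rng.uniform(0, 130, n),   # velocity [km/h]
    rng.uniform(-3, 3, n),    # acceleration [m/s^2]
    rng.uniform(-6, 6, n),    # road gradient [%]
])
y = rng.uniform(0, 5, n)      # CO2 emission [g/s], placeholder target

# 80/20 train-test split, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(X_train.shape, X_test.shape)  # (800, 3) (200, 3)
```

The fixed `random_state` makes the split reproducible, which is useful when comparing several learning methods on the same partition.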
The first machine learning technique used was linear regression. Linear regression is used to find a linear relationship between the target and one or more predictors. Vehicle emission models are currently being developed using this method; examples include the work of (Wang et al. 2022) and (Madrazo and Clappier 2018).
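As a sketch, a linear model over the three predictors can be fitted with scikit-learn as follows; the data are synthetic and purely illustrative:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: columns stand for velocity, acceleration, gradient.
rng = np.random.default_rng(0)
X_train = rng.uniform(0, 1, (500, 3))
# Placeholder target with a known linear structure plus noise.
y_train = 2.0 * X_train[:, 0] + 0.5 * X_train[:, 1] + rng.normal(0, 0.05, 500)

model = LinearRegression().fit(X_train, y_train)
# The fitted coefficients approximate the generating weights (2.0, 0.5, 0.0).
print(model.coef_, model.intercept_)
y_pred = model.predict(X_train)
```

Because the model is a single linear function of the inputs, it cannot represent the discontinuity between engine-off (zero emission) and engine-on states, which foreshadows the weaknesses discussed later.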
The second method used was the random forest machine learning method, which is based on classification and regression tree algorithms (Speiser et al. 2019). Random forest, as the name suggests, is a method based on so-called decision trees, which in combination form a random forest. The number of decision trees is large, and they operate together as an ensemble. In classification, each individual tree in the random forest produces a class prediction, and the class with the most votes becomes the model's prediction (Sheykhmousa et al. 2020; Shonlau and Zou 2020). In fact, a large number of relatively uncorrelated tree models acting as an ensemble will outperform each of the individual component models in prediction accuracy (Balyan et al. 2022). The reason for the good accuracy of this method is that the trees protect each other from their individual errors.
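Since CO2 emission is a continuous quantity, the regression variant of random forest averages the individual tree predictions rather than taking a majority vote. A minimal sketch with scikit-learn (synthetic data, illustrative only) demonstrates exactly that averaging:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical scaled inputs (velocity, acceleration, gradient) and target.
rng = np.random.default_rng(1)
X_train = rng.uniform(0, 1, (400, 3))
y_train = np.sin(3 * X_train[:, 0]) + 0.1 * rng.normal(size=400)

# An ensemble of 100 decision trees trained on bootstrap samples.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# The forest prediction equals the mean of the individual tree predictions.
x_new = X_train[:1]
tree_preds = [tree.predict(x_new)[0] for tree in forest.estimators_]
print(forest.predict(x_new)[0], np.mean(tree_preds))
```

Each tree sees a different bootstrap sample and a random feature subset at each split, which is what keeps the component models relatively uncorrelated.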
The third method used is gradient boosting. Boosting is a method to convert weak learners into strong learners (Bentéjac et al. 2021). In boosting, each new tree is fit to a modified version of the original data set. A first decision tree is trained on the data; each subsequent tree then concentrates on the observations that the ensemble built so far predicts poorly. In gradient boosting specifically, each new tree is fit to the residual errors, i.e., the negative gradient of the loss function, of the current ensemble (Zhang et al. 2021b; Li et al. 2020). The idea is thus that each new tree improves the predictions of its predecessors. Gradient boosting trains many models in an additive, sequential, and gradual manner. One of its main motivations is that it allows a user-specified loss function to be optimized, which offers more control and corresponds better to real-world applications than a fixed default criterion (Punmiya and Choe 2019; Wang et al. 2020b).
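A minimal sketch with scikit-learn's `GradientBoostingRegressor` (synthetic data; the hyperparameter values are illustrative, not those of the study). The `staged_predict` loop exposes the additive, sequential character: the training error shrinks as trees are accumulated.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical inputs and a mildly nonlinear placeholder target.
rng = np.random.default_rng(2)
X_train = rng.uniform(0, 1, (400, 3))
y_train = 2 * X_train[:, 0] ** 2 + X_train[:, 1] + 0.05 * rng.normal(size=400)

# Shallow trees added sequentially; each fits the residuals of the ensemble so far.
gbm = GradientBoostingRegressor(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=0
).fit(X_train, y_train)

# Training MSE after each boosting stage: error decreases as trees accumulate.
stage_mse = [np.mean((y_train - p) ** 2) for p in gbm.staged_predict(X_train)]
print(stage_mse[0], stage_mse[-1])
```

The `learning_rate` scales each tree's contribution, trading more trees for better generalization; the default squared-error loss can be swapped via the estimator's `loss` parameter, which is the user-specified objective mentioned above.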
Figure 7 presents scatter plots of predicted versus observed CO2 emissions and residual plots for all investigated machine learning methods. Statistical analysis of the results was carried out based on the work of (Piñeiro et al. 2008). On the basis of the predicted versus observed graphs, it can be seen that the strongest correlation between the predicted and actual data occurs for the gradient boosting technique; the opposite applies to linear regression and the random forest technique. For the residual plots, the distance separating the predictions from the zero line and the symmetry of the resulting point cloud are evaluated. A positive residual means that the prediction was too low, a negative residual means that the prediction was too high, and a residual of 0 means that the prediction was perfect. Therefore, points located closer to the zero axis indicate that the model better reflects and calculates values close to the real ones (Kozak and Piepho 2018). As with the predicted versus observed graphs, the residual plots show the best results for the gradient boosting method, justified by the most symmetrical distribution of results, which cluster near the center of the graph.
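The residual convention used here can be sketched as follows; the observed and predicted values are hypothetical:

```python
import numpy as np

# Hypothetical observed and predicted instantaneous CO2 emissions [g/s].
y_obs  = np.array([0.0, 1.2, 2.5, 3.1, 0.0, 4.0])
y_pred = np.array([0.1, 1.0, 2.7, 3.0, 0.2, 3.8])

# Residual = observed - predicted:
#   > 0 -> prediction too low, < 0 -> prediction too high, 0 -> perfect.
residuals = y_obs - y_pred
print(residuals)  # approximately [-0.1, 0.2, -0.2, 0.1, -0.2, 0.2]

# A residual plot draws the predicted values on the x-axis against these
# residuals; a symmetric cloud centered on the zero line indicates low bias.
```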
Figure 8 shows a plot of instantaneous CO2 emissions for the real and model data for two selected machine learning methods: linear regression and gradient boosting. This comparison of instantaneous emission results for the studied dataset shows which method best reflects the emission values, with particular attention to the zero-emission values. These correspond to places where the vehicle is stopped and the engine is shut down, and not every machine learning technique is able to reflect such an engine operating state. It can be seen from Fig. 8 that the linear regression method does not estimate CO2 emissions correctly, erroneously indicating even values below 0. In contrast, the gradient boosting method satisfactorily reflects the CO2 emission estimate, even for states where the engine is turned off. An area with greater differences in CO2 emissions is the cold-engine state, where increased fuel consumption at the start of the road test led to increased CO2 emissions. For practical use of the model and for its simplicity, a separate model was not made for the so-called "cold start" state of the engine.
Figure 9 shows the emission maps for the real-world data and the predicted data for all the machine learning techniques analyzed.
Based on the visualization of CO2 emission maps for start-stop technology, it is possible to see which model has the best predictive ability. For the CO2 emission maps, the range on the scale was intentionally left unchanged, so as to better distinguish the analyzed cases. The biggest differences in the CO2 predictions are found for the linear regression method, which largely misestimates CO2 emissions: for the highway section, for example, the predicted emissions are high over the entire length of the highway, whereas in the real data increased CO2 emissions occur only in selected areas of highway driving. A similar misestimation characterizes the random forest method. The map for the gradient boosting technique is the closest to the real data, indicating very close values of CO2 emissions. This method has been shown to be very accurate both in terms of instantaneous emissions and CO2 emission maps for start-stop technology. It is also worth noting that, for the urban part, where there were many stopping points, this method performs a very accurate CO2 calculation, also indicating the zero-emission locations.
The validation of the models was also carried out using R2 and MSE. The coefficient of determination was calculated according to Eq. (2) (Ueki 2021):
$${R}^{2}=\frac{{SS}_{M}}{{SS}_{T}}=\frac{\sum _{t=1}^{n}{(\widehat{y}_{t}-\bar{y})}^{2}}{\sum _{t=1}^{n}{({y}_{t}-\bar{y})}^{2}}\quad (2)$$
where:
R2 – coefficient of determination,
SSM – sum of squares for the model,
SST – total sum of squares,
\({y}_{t}\) – actual value of the dependent variable,
\(\widehat{y}_{t}\) – predicted value of the dependent variable,
\(\bar{y}\) – average value of the actual dependent variable.
The coefficient of determination describes how much of the variation in the dependent variable is explained by the model (Chicco et al. 2021). It takes values in the range 0 to 1. Some sources state that the model fit is better the closer the R2 value is to unity; however, this criterion alone can be misleading, because a large number of observations by itself tends to lower the R2 value of a model, and the evaluation should then also be based on other model validation measures (Mohammad 2020). In the case under study, these are, for example, the instantaneous emission graphs and the CO2 emission maps.
The evaluation of the model's correctness is the most important part of the model development process, because on this basis it can be determined to what extent the prepared model fulfills its intended purpose. The validation of the obtained exhaust emission models was carried out on the basis of the instantaneous and emission map results, using data that were not used for model calibration. To validate the models, a widely used measure of prediction error, the MSE, was employed. The mean squared error measures the amount of error in statistical models (Wang and Lu 2018; Gao et al. 2020); it assesses the average squared difference between the observed and predicted values (Chani-Cahuana et al. 2018).
The mean squared error was calculated according to Eq. (3) (Chachlakis et al. 2021):
$$MSE=\frac{\sum _{t=1}^{n}{({y}_{t}-{y}_{t}^{P})}^{2}}{n}\quad (3)$$
where:
\({y}_{t}\) – actual value of the dependent variable,
\({y}_{t}^{P}\) – predicted value of the dependent variable,
n – number of observations.
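Equations (2) and (3) can be implemented directly; a small sketch with hypothetical values:

```python
import numpy as np

def r2_eq2(y_true, y_pred):
    """R^2 as in Eq. (2): model sum of squares over total sum of squares."""
    y_bar = np.mean(y_true)
    ss_m = np.sum((y_pred - y_bar) ** 2)
    ss_t = np.sum((y_true - y_bar) ** 2)
    return ss_m / ss_t

def mse_eq3(y_true, y_pred):
    """Mean squared error as in Eq. (3)."""
    return np.mean((y_true - y_pred) ** 2)

# Hypothetical observed and predicted CO2 values.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
print(mse_eq3(y_true, y_pred))  # approximately 0.025
print(r2_eq2(y_true, y_pred))   # approximately 0.9
```

Note that for ordinary least-squares linear regression with an intercept, this explained-variance form of R2 coincides with the more common definition 1 − SSres/SST.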
Table 3
MSE and R2 results for the training and test sets

Method             Training MSE   Training R2   Test MSE   Test R2
Linear regression  0.611291       0.549273      0.592783   0.56976
Random forest      0.596003       0.560545      0.584536   0.575746
Gradient boosting  0.286721       0.78859       0.355065   0.742295
The squaring of the differences in the MSE serves several purposes: it eliminates negative differences and ensures that the mean squared error is always greater than or equal to zero (Liu et al. 2018; Sun and Huang 2020). In practice, the MSE is almost always positive; only a perfect, error-free model produces an MSE of zero, which is unattainable in practice. The results shown in Table 3 confirm the earlier inferences about the methods analyzed for creating a CO2 model using artificial intelligence techniques. For both the training set and the test set, the gradient boosting method had the smallest prediction error. The same is true for R2, which for the gradient boosting method was 0.78 for the training set and 0.74 for the test set. These results indicate a very good representation of the real data by the model data. The worst of the analyzed methods turned out to be linear regression, which achieved the weakest results for both MSE and R2.