The performance of the proposed PPED-STA scheme is evaluated using four metrics: accuracy, precision, recall and F-measure. Accuracy measures the ability of the classifier to predict the correct class label; it is computed as the proportion of correctly predicted instances among all instances, as specified in Eq. (1).
Precision is the proportion of instances classified as positive that are actually positive, and it is computed based on Eq. (2).
Recall is the proportion of actual positive instances in the dataset that the classifier correctly identifies as positive, as calculated by Eq. (3).
In addition, the F-measure is computed as the weighted harmonic mean of recall and precision, as given in Eq. (4).
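The bodies of Eqs. (1) to (4) do not appear in this text. Assuming the standard confusion-matrix definitions, and assuming the equally weighted (F1) form of the harmonic mean because no weight is stated, they would read as follows:

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \tag{1}\\
\text{Precision} &= \frac{TP}{TP + FP} \tag{2}\\
\text{Recall}    &= \frac{TP}{TP + FN} \tag{3}\\
\text{F-measure} &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}
\end{align}
```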
In Eqs. (1) to (4), the terms are defined as follows.
TP (true positive): the instance is positive and the outcome of the classification is also positive.
TN (true negative): the instance is negative and the outcome of the classification is also negative.
FN (false negative): the instance is positive but the outcome of the classification is negative.
FP (false positive): the instance is negative but the outcome of the classification is positive.
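As an illustration of how the four metrics follow from the counts defined above, the sketch below computes them for a single positive class. It assumes the standard definitions and the equally weighted F-measure; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the experiments reported here.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, precision, recall and F-measure from confusion counts.

    Assumes the standard definitions; each ratio falls back to 0.0 when its
    denominator is zero, which is a convention chosen for this sketch.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # Equally weighted harmonic mean of precision and recall (F1).
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Illustrative counts only (not from the paper's dataset).
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Working from raw counts rather than from precomputed rates keeps the four metrics mutually consistent, since all of them derive from the same confusion matrix.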
Political event prediction is a significant task, particularly when the available data on political events are insufficient. In this setting, electronic media is a powerful source that can supply accurate data and remains useful throughout the research process. Data mining tools further help in handling the data and transforming it into an understandable format, from which information can be extracted to identify political event patterns and their relationships. In this prediction scheme, three machine learning algorithms, namely kNN, random forest and Markov Property-based Random Forest, were used for political event prediction on archive datasets. The results of these algorithms are investigated and compared with respect to the evaluation metrics of accuracy, precision, recall and F-measure.
Table 3 and Fig. 3 present the accuracy and precision of the proposed KNN-based PPED-STA scheme for different values of k (k set to 3, 5, 7 and 9). The accuracy of the scheme is highest at k = 9, since the discrimination process applied to the collected data played a key role in improving accuracy. In contrast, the precision is highest at k = 3, as this setting better separates relevant from irrelevant classes in the dataset under exploration. The accuracy at k = 9 reaches a maximum of 95.4%, a mean increase of 4.86% relative to the other values investigated. Likewise, the precision at k = 3 reaches a maximum of 94.2%, a mean increase of 5.62% relative to the other values investigated.
Table 3
Results of the proposed PPED-STA: Accuracy and Precision based on k-NN attained using Weka tool

| Value of k | Accuracy | Precision |
|---|---|---|
| 3-NN | 0.912 | 0.942 |
| 5-NN | 0.948 | 0.936 |
| 7-NN | 0.953 | 0.924 |
| 9-NN | 0.954 | 0.918 |
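To make the role of k concrete, the following is a minimal sketch of k-nearest-neighbour classification evaluated over the same values of k. It assumes Euclidean distance and simple majority voting; the experiments here were run in Weka, whose exact settings are not stated, so this is not the authors' implementation. The helper names `knn_predict` and `accuracy_for_k` and the toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Label `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean
    (an assumption of this sketch).
    """
    # Sort training points by distance to the query and keep the k closest.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    # most_common(1) returns the label with the highest vote count.
    return votes.most_common(1)[0][0]


def accuracy_for_k(train, test, k):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


# Toy, made-up data: two well-separated clusters labelled "A" and "B".
train = [
    ((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((0.2, 0.1), "A"),
    ((0.3, 0.3), "A"), ((0.2, 0.4), "A"),
    ((1.0, 1.0), "B"), ((0.9, 1.1), "B"), ((1.1, 0.9), "B"),
    ((1.2, 1.2), "B"), ((0.8, 0.9), "B"),
]
test = [((0.15, 0.15), "A"), ((1.05, 1.0), "B")]

for k in (3, 5, 7, 9):
    print(f"k={k}: accuracy={accuracy_for_k(train, test, k):.3f}")
```

Odd values of k, as used in Table 3, avoid tied votes in a two-class problem; with more classes a tie-breaking rule would also be needed.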
Table 4 and Fig. 4 show the recall and F-measure (macro- and micro-averaged) of the KNN-based PPED-STA scheme for different values of k (k set to 3, 5, 7 and 9). The recall of the scheme is highest at k = 3, mainly because relevant data are identified more completely from the full dataset under exploration. The recall at k = 3 reaches a maximum of 91.2%, an improvement of 8.32% over the other values investigated. The macro-averaged F-measure is highest at k = 9, as the features used to compute the mean of precision and recall reliably estimate the degree of regression between the data. Thus, the macro-averaged F-measure at k = 9 reaches a maximum of 91.2%, an improvement of 7.24% over the other values investigated. The micro-averaged F-measure is highest at k = 3, as the temporal classification involved in the analysis aided better exploration of the data. Hence, the micro-averaged F-measure at k = 3 reaches a maximum of 91.8%, an improvement of 6.92% over the other values investigated.
Table 4
Results of the proposed PPED-STA: Recall and F-measure based on k-NN attained using Weka tool

| Value of k | Recall | F-measure (Macro-Averaged) | F-measure (Micro-Averaged) |
|---|---|---|---|
| 3-NN | 0.912 | 0.842 | 0.918 |
| 5-NN | 0.908 | 0.854 | 0.904 |
| 7-NN | 0.903 | 0.886 | 0.896 |
| 9-NN | 0.896 | 0.912 | 0.893 |
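The distinction between the macro- and micro-averaged F-measure can be sketched as follows. Macro-averaging computes the F-measure per class and takes the unweighted mean, while micro-averaging pools the TP, FP and FN counts over all classes before computing a single F-measure. The function names, class labels and per-class counts below are hypothetical and serve only to illustrate the two averaging schemes.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """Equally weighted F-measure from counts: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(per_class: dict) -> tuple:
    """Return (macro, micro) F-measure.

    `per_class` maps a class label to its (tp, fp, fn) counts.
    """
    # Macro: average the per-class F-measures, each class weighted equally.
    macro = sum(f1(*counts) for counts in per_class.values()) / len(per_class)
    # Micro: pool the counts over all classes, then compute one F-measure.
    tp = sum(c[0] for c in per_class.values())
    fp = sum(c[1] for c in per_class.values())
    fn = sum(c[2] for c in per_class.values())
    micro = f1(tp, fp, fn)
    return macro, micro


# Made-up counts for three hypothetical event classes (illustration only).
counts = {
    "class_a": (50, 10, 5),
    "class_b": (5, 2, 8),
    "class_c": (20, 6, 5),
}
macro, micro = macro_micro_f1(counts)
print(f"macro-averaged F-measure: {macro:.3f}")
print(f"micro-averaged F-measure: {micro:.3f}")
```

Because the macro average weights every class equally, a poorly predicted minority class lowers it more than it lowers the micro average, which is one reason the two columns of a table such as Table 4 can rank the settings of k differently.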
Table 5 and Fig. 5 depict the accuracy and precision of the proposed k-Random Forest-based PPED-STA scheme for different values of k (k set to 10, 20, 30 and 40). Both accuracy and precision are highest at k = 10, since this setting aided the exploration of the spatial and temporal features present in the dataset under study. The accuracy at k = 10 reaches a maximum of 68.2%, a mean increase of 6.84% relative to the other values investigated. Likewise, the precision at k = 10 reaches a maximum of 62.2%, a mean increase of 4.59% relative to the other values investigated.
Table 5
Results of the proposed PPED-STA: Accuracy and Precision based on Random Forest attained using Weka tool

| Value of k | Accuracy | Precision |
|---|---|---|
| 10-Forest | 0.682 | 0.622 |
| 20-Forest | 0.676 | 0.618 |
| 30-Forest | 0.672 | 0.604 |
| 40-Forest | 0.668 | 0.602 |
Table 6 and Fig. 6 present the recall and F-measure (macro- and micro-averaged) of the k-Random Forest-based PPED-STA scheme for different values of k (k set to 10, 20, 30 and 40). The recall is highest at k = 10, as the scheme adopts an integrated feature selection and classification process. The recall at k = 10 reaches a maximum of 71.2%, an improvement of 5.68% over the other values investigated. The macro-averaged F-measure is also highest at k = 10, as the comprehensive set of features used for prediction is optimized before classification. It reaches a maximum of 56.2% at k = 10, an improvement of 6.42% over the other values investigated. The micro-averaged F-measure is highest at k = 10, as the spatial and temporal classification involved in the analysis is completely context-based and is also capable of handling missing data. Hence, the micro-averaged F-measure at k = 10 reaches a maximum of 61.2%, an improvement of 5.94% over the other values investigated.
Table 6
Results of the proposed PPED-STA: Recall and F-measure based on Random Forest attained using Weka tool

| Value of k | Recall | F-measure (Macro-Averaged) | F-measure (Micro-Averaged) |
|---|---|---|---|
| 10-Forest | 0.712 | 0.562 | 0.612 |
| 20-Forest | 0.706 | 0.541 | 0.604 |
| 30-Forest | 0.696 | 0.536 | 0.596 |
| 40-Forest | 0.684 | 0.526 | 0.582 |
Table 7 and Fig. 7 present the accuracy and precision of the proposed PPED-STA scheme when the kNN and k-Random Forest classifiers are integrated. Both accuracy and precision are highest with 3-NN and 10-Forest, since this configuration combines the benefits of kNN and random forest to the expected level. The accuracy with 3-NN and 10-Forest reaches 73.2%, a mean increase of 5.12% over the other k-NN and k-Forest values investigated. Likewise, the precision with 3-NN and 10-Forest reaches a maximum of 78.3%, a mean increase of 6.42% over the other k-NN and k-Forest values investigated.
Table 7
Results of the proposed PPED-STA: Accuracy and Precision based on kNN and k-Random Forest attained using Weka tool

| Value of k and n | Accuracy | Precision |
|---|---|---|
| 3-NN and 10-Forest | 0.732 | 0.783 |
| 5-NN and 20-Forest | 0.729 | 0.772 |
| 7-NN and 30-Forest | 0.718 | 0.764 |
| 9-NN and 40-Forest | 0.711 | 0.721 |
Table 8
Results of the proposed PPED-STA: Recall and F-measure based on kNN and k-Random Forest attained using Weka tool

| Value of k and n | Recall | F-measure (Macro-Averaged) | F-measure (Micro-Averaged) |
|---|---|---|---|
| 3-NN and 10-Forest | 0.852 | 0.671 | 0.664 |
| 5-NN and 20-Forest | 0.841 | 0.664 | 0.652 |
| 7-NN and 30-Forest | 0.834 | 0.661 | 0.641 |
| 9-NN and 40-Forest | 0.826 | 0.652 | 0.638 |
Table 8 and Fig. 8 present the recall and F-measure (macro- and micro-averaged) of the proposed PPED-STA scheme when the kNN and k-Random Forest classifiers are integrated. The recall is highest with 3-NN and 10-Forest, since the merits of k-NN and random forest are proportionally combined to retrieve a large number of features, which aids better exploration. The recall with 3-NN and 10-Forest reaches a maximum of 85.2%, an improvement of 6.56% over the other values investigated. The macro-averaged F-measure is also highest with 3-NN and 10-Forest, because of the reduction in the false positive rate achieved during discriminant analysis. Thus, the macro-averaged F-measure with 3-NN and 10-Forest reaches a maximum of 67.1%, an improvement of 5.92% over the other values investigated. The micro-averaged F-measure is highest with 3-NN and 10-Forest, as this configuration explores diverse dimensions of the data from different aspects. Hence, the micro-averaged F-measure with 3-NN and 10-Forest reaches a maximum of 66.4%, an improvement of 5.86% over the other benchmarked values investigated.