News is important because it keeps the public informed about activities and events in their immediate surroundings and beyond. Reports show that most adults now access news through digital channels such as social media and web search engines rather than traditional media. Fake news detection has received considerable attention (De Choudhury et al. 2014), and numerous methods have been suggested to detect fake news using various types of features and datasets. The authors in (Okoro et al. 2018) used a Machine-Human (MH) model to detect fake news on social media. The study (Khan et al. 2021) focused on the detection of fabricated opinions and compared GloVe embedding and character embedding features on three datasets of fake and real news. Two of them are standard datasets, and one is a combination of news on diverse topics distributed on social media. The models evaluated were Naïve Bayes, CNN, LSTM, Bi-LSTM, C-LSTM, Hierarchical Attention Network (HAN), Conv-HAN, and character-level C-LSTM. The study observed that n-gram features give promising results with the Naïve Bayes model, whose performance is almost equivalent to that of the CNN-based model. The authors in (Ozbay and Alatas 2020) used a mixture of text classification techniques and supervised artificial intelligence classifiers. The proposed model was tested on three different real-world datasets and evaluated using accuracy, precision, recall, and F-measure. The best mean values were obtained with the Decision Tree algorithm, while ZeroR and CV Parameter Selection (CVPS), each with a value of 1.000, appeared to be the best algorithms in terms of recall. The authors in (Gravanis et al. 2019) used an enhanced set of linguistic features for fake news detection and evaluated several classification models on five different datasets containing fake and real news. AdaBoost obtained 95% accuracy over all datasets, followed by Support Vector Machine (SVM) and Bagging algorithms.
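To make the n-gram and Naïve Bayes combination discussed above concrete, the following is a minimal sketch of a multinomial Naïve Bayes classifier over word unigrams and bigrams with add-one smoothing. It is not the implementation used in any of the cited studies; the function and class names, the whitespace tokenizer, and the four toy training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max (lowercased, whitespace tokens)."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        grams.extend(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return grams


class NgramNaiveBayes:
    """Multinomial Naive Bayes over n-gram counts with add-one smoothing (illustrative)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = word_ngrams(text)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for gram in word_ngrams(text):
                if gram in self.vocab:                # unseen n-grams are ignored
                    score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented purely for illustration.
train_texts = [
    "shocking miracle cure doctors hate",
    "you will not believe this shocking secret",
    "city council approves annual budget",
    "central bank holds interest rates steady",
]
train_labels = ["fake", "fake", "real", "real"]
model = NgramNaiveBayes().fit(train_texts, train_labels)
```

Calling `model.predict("shocking secret cure")` on this toy data selects the class whose n-gram counts best explain the input. A real experiment would need a proper tokenizer, a held-out test set, and far more data.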
The study (Ahmad et al. 2020) presented machine learning models and ensemble techniques for distinguishing fake from real social news. The data, collected from the web, contains fake and real news covering different domains. The authors extracted different textual features from the dataset and used them as input to machine learning models such as Logistic Regression (LR), SVM, MLP, KNN, and Random Forest (RF), and to ensemble models such as voting classifiers (RF, LR, KNN and LR, LSVM, CART), a bagging classifier (decision tree), and boosting classifiers (AdaBoost and XGBoost). The XGBoost ensemble model performed better than the other classifiers and ensemble models in terms of accuracy. The authors in (Ahmed et al. 2019) used n-grams and Part-of-Speech (POS) tagging, and suggested Deep Syntax Analysis using Probabilistic Context-Free Grammars (PCFG). The authors in (Ruchansky et al. 2017) proposed the hybrid CSI model for fake news detection. The CSI model comprises three modules. The first module captures the temporal pattern of user engagement with an article. The second module captures source characteristics present in the behavior of users, and the third module integrates the first two. Experiments on two datasets checked the robustness of the CSI model when labeled data is limited. The model also identifies suspicious user behavior. The CSI model does not make assumptions about the distribution of user behavior, the textual context of the data, or the underlying structure of the data. In the study (Mansouri et al. 2020), a combined method based on semi-supervised Linear Discriminant Analysis (LDA) and a convolutional neural network was used to detect fake news, with the unlabeled dataset being labeled for use by the convolutional neural network. The proposed method achieved 95.6% precision and 96.7% recall, which is reported to outperform existing methods for detecting fake news.
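The voting ensembles mentioned above combine the outputs of several base classifiers into a single decision. The sketch below shows hard (majority) voting only, under simplifying assumptions: the three base classifiers are hypothetical keyword and punctuation rules standing in for trained models such as RF, LR, or KNN, and the names `hard_vote` and `ensemble_predict` are illustrative rather than taken from any cited work.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the label predicted by the largest number of base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(classifiers, text):
    """Collect one prediction per base classifier and take the majority label."""
    return hard_vote([clf(text) for clf in classifiers])


# Hypothetical stand-ins for trained base models. With three voters and two
# labels, a tie cannot occur.
base_classifiers = [
    lambda t: "fake" if "shocking" in t.lower() else "real",
    lambda t: "fake" if "miracle" in t.lower() else "real",
    lambda t: "fake" if t.count("!") >= 2 else "real",
]

print(ensemble_predict(base_classifiers, "Shocking miracle cure found"))
```

In the second usage case below, one rule votes "fake" but is outvoted by the other two, which is the behavior that makes a voting ensemble less sensitive to the errors of any single base model.

```python
print(ensemble_predict(base_classifiers, "Council approves budget!!"))
```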
Another study (Najar et al. 2019) on fake news detection using Bayesian inference applied Bag of Words features with the Multinomial Model (MM), Dirichlet Compound Multinomial (DCM), Deterministic Annealing Expectation-Maximization (EDCM-DAEM), and EDCM-Bayesian. EDCM-Bayesian achieved better accuracy than the other classifiers, with a classification accuracy of 87.85 on the BS-Detector dataset. The studies (Jain et al. 2019; Reis et al. 2019) used different textual features, including language (syntax) features such as n-grams and part-of-speech tagging, lexical features (character- and word-level signals), psycholinguistic features, semantic features, and subjectivity and sentiment scores of a text. They classified with K-Nearest Neighbors (KNN), Naïve Bayes (NB), Random Forest (RF), Support Vector Machine with RBF kernel (SVM), and XGBoost (XGB). Random Forest and XGB performed best using handcrafted features on web-based networking media. The study in [14] compared Naïve Bayes and SVM classifiers with CNN. The results show that Naïve Bayes, SVM, and NLP techniques performed better than the other machine learning classifiers, with the proposed model reaching an accuracy of 93.50%.
Another study (Faustini and Covões 2019) on fake social media news used three datasets (Twitter, WhatsApp, and the Fake BR Corpus). It extracted fourteen textual features as input to the classifiers, such as the proportion of uppercase characters, exclamation marks, question marks, number of unique words, number of sentences, number of characters, words per sentence, proportion of adjectives, adverbs, and nouns, sentiment of the message, proportion of swear words, and proportion of spelling errors. The study (Hlaing and Kham 2020) presented multidimensional fake news detection (news content, social engagement, and news stance) using synonym-based features with three different classifiers: Decision Tree, AdaBoost, and Random Forest. Experimental results show that Random Forest performs better than the other two classifiers on a social media dataset.
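Several of the surface features listed above can be computed directly from the raw message text. The following sketch covers only the features that need no external resources; part-of-speech proportions, sentiment, swear words, and spelling errors would require taggers or lexicons and are omitted. The function name, the regular expressions, and the sentence-splitting rule are assumptions for illustration, not the cited authors' procedure.

```python
import re


def surface_features(message):
    """Compute simple surface features of a message (illustrative sketch)."""
    letters = [c for c in message if c.isalpha()]
    words = re.findall(r"[A-Za-z']+", message)
    # Assumption: a sentence ends at any run of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", message) if s.strip()]
    return {
        "uppercase_ratio": sum(c.isupper() for c in letters) / len(letters) if letters else 0.0,
        "exclamation_marks": message.count("!"),
        "question_marks": message.count("?"),
        "unique_words": len({w.lower() for w in words}),
        "num_sentences": len(sentences),
        "num_characters": len(message),
        "words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
    }


features = surface_features("BREAKING news! Is it true? Yes it is.")
```

The resulting dictionary can be turned into a fixed-order feature vector and passed to any of the classifiers discussed in this section.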
The study (Mahir et al. 2019) reported that SVM performed better than other classifiers, including Naïve Bayes, RNN/LSTM, and Logistic Regression, in recognizing fake news extracted from Twitter. The study (Al-Ash et al. 2019) used an Indonesian news dataset consisting of fake and real news documents to show the classification performance of Random Forest, SVM, and Naïve Bayes classifiers on this dataset with the associated classification approach. In the study (Katsaros et al. 2019), eight models were evaluated for classification: Linear Regression, SVM, MLP, Gaussian and Multinomial Naïve Bayes, Random Forests, Decision Trees, and CNN, on three publicly available datasets. The results showed that CNN was the best-performing algorithm.
The study (Choudhary et al. 2021) proposed a deep learning architecture called BerConvoNet for classifying fake and real news with marginal error. The proposed architecture is composed of two main blocks: a News Embedding Block (NEB) and a Multi-scale Feature Block (MSFB). The NEB uses BERT to extract word embeddings from news articles, which are then fed to the MSFB as input.
The study (Vogel and Meghana 2020) reported that SVM achieved the highest accuracy, 92%, in the classification of fake news. The authors used handcrafted features extracted from a news dataset, such as total words (tokens), unique words (types), type/token ratio, number of sentences, average sentence length, number of characters, average word length, nouns, prepositions, and adjectives. The classification models used were XGBoost, Random Forest, Naïve Bayes, KNN, Decision Tree, and SVM.
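The lexical statistics named above follow from simple counts over tokens and sentences. As a minimal sketch, assuming a regular-expression tokenizer and punctuation-based sentence splitting (neither is specified in the source), they could be computed as follows. Counts of nouns, prepositions, and adjectives need a part-of-speech tagger and are left out, and the function name is hypothetical.

```python
import re


def lexical_features(text):
    """Compute token-level lexical statistics for a document (illustrative sketch)."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    # Assumption: a sentence ends at any run of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    types = set(tokens)  # distinct word forms
    return {
        "tokens": len(tokens),
        "types": len(types),
        "type_token_ratio": len(types) / len(tokens) if tokens else 0.0,
        "num_sentences": len(sentences),
        "avg_sentence_length": len(tokens) / len(sentences) if sentences else 0.0,
        "num_characters": len(text),
        "avg_word_length": sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0,
    }


stats = lexical_features("The cat sat. The cat ran.")
```

For this toy input there are six tokens and four types, so the type/token ratio is 4/6. A low ratio indicates repetitive vocabulary, which is one reason the measure appears among handcrafted features.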