In the proposed study, machine learning models are trained to detect fake news. The workflow of the proposed approach, shown in Fig. 2, is divided into several phases, including data pre-processing, feature selection, and model training. The steps involved are outlined below:
Step 1: Data collection: The LIAR dataset is collected from Kaggle.
Step 2: Data preprocessing: Techniques such as tokenization, stemming, and removal of punctuation marks and stop words are applied to clean the dataset, eliminate features that are not needed, and handle ambiguous data.
Step 3: Feature selection: After cleaning the dataset, the next step is vectorization of the data. TF-IDF is applied to convert the text data into numerical vectors.
Step 4: Training and testing of machine learning algorithms.
Step 5: Evaluation of the trained model using performance metrics.
Step 6: Finally, the best-performing model is chosen on the basis of the evaluation metrics.
Dataset
The LIAR dataset is a publicly available dataset of fact-checked political statements. It comprises 10,240 statements. The dataset, obtained from Kaggle, contains two columns: a “statement” column and a label that takes one of six classes: “true”, “half-true”, “mostly true”, “barely true”, “false”, and “pants on fire”.
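As an illustrative sketch, the dataset can be loaded and inspected with pandas; the file name liar.csv and the column names “statement” and “label” are assumptions here and may differ in the actual Kaggle release:

```python
import pandas as pd

# "liar.csv" and the column names are assumptions; adjust to the actual Kaggle files.
df = pd.read_csv("liar.csv")

print(df.shape)                     # expected (10240, 2) per the description above
print(df["label"].value_counts())  # distribution over the six truthfulness classes
```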
Data preprocessing
Pre-processing was done using the NLTK toolkit, an open-source and widely used NLP library. It comes with built-in functions and algorithms such as the nltk.tokenize module (for tokenizing text) and the nltk.stem.porter.PorterStemmer class (the popular Porter stemming algorithm). This phase transforms the raw text into a format suitable for further processing. The input dataset is pre-processed in several steps to remove noise: lowercasing, removal of punctuation and special characters, tokenization, stop-word removal, and stemming.
The pre-processing of the dataset consists of the following steps.
Tokenization is the process of breaking text down into smaller units, or tokens, and it is typically the first step in data cleaning for Natural Language Processing (NLP) projects. NLTK, which stands for Natural Language Toolkit, is a collection of libraries, and its nltk.tokenize module can be used for tokenization.
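A minimal sketch of this step with NLTK, using a hypothetical example sentence:

```python
import nltk
from nltk.tokenize import word_tokenize

# One-time download of the tokenizer models
# (newer NLTK versions may instead require the "punkt_tab" resource).
nltk.download("punkt")

text = "BREAKING: Unemployment fell to 3.5% last quarter."
tokens = word_tokenize(text.lower())  # lowercase first, then split into tokens
print(tokens)
# ['breaking', ':', 'unemployment', 'fell', 'to', '3.5', '%', 'last', 'quarter', '.']
```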
Stop words are insignificant words that add little to a sentence's meaning or tell us little about the data; they create noise, and some can mislead the models rather than help them distinguish true from fake news, so they are removed. Articles, prepositions, conjunctions, and some pronouns are considered stop words. Commonly used stop words include the, of, I, you, it, and, a, about, an, are, as, at, be, by, for, from, how, in, is, on, or, that, these, this, too, was, what, when, where, who, will, etc. To eliminate stop words from a sentence, the text is split into words and each word is checked against the NLTK stop-word list; if a word appears in that list, it is removed.
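A sketch of stop-word removal against NLTK's English stop-word list, continuing the same approach:

```python
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

nltk.download("punkt")
nltk.download("stopwords")  # one-time download of the stop-word lists

stop_words = set(stopwords.words("english"))
tokens = word_tokenize("the economy is not what it was a decade ago".lower())

# Keep only the tokens that are not in the stop-word list.
filtered = [tok for tok in tokens if tok not in stop_words]
print(filtered)  # ['economy', 'decade', 'ago']
```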
Stemming helps to reduce words to their root form; for example, “running” is reduced to the root “run” and “cars” to “car”. For this purpose, the Porter stemmer, the most commonly used stemming algorithm, is employed. After preprocessing, the cleaned data must be converted into a numerical format for further tasks.
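A sketch of the Porter stemmer via NLTK; note that stems need not be dictionary words:

```python
from nltk.stem.porter import PorterStemmer

stemmer = PorterStemmer()
for word in ["running", "cars", "studies", "arguing"]:
    print(word, "->", stemmer.stem(word))
# running -> run, cars -> car, studies -> studi, arguing -> argu
```

Outputs such as “studi” and “argu” are valid stems even though they are not words themselves; what matters is that inflected variants collapse onto the same feature.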
Feature selection
The preprocessed data is passed to the vectorization process, which includes both the CountVectorizer and term frequency-inverse document frequency (TF-IDF) techniques. Both are used for text preprocessing in natural language processing tasks such as text classification, and both convert the text data into a numerical format. In this work, TF-IDF served as the primary feature selection method.
CountVectorizer is a useful tool in NLP for converting text data into token counts: it determines how often each word appears in the text and records a count for each word, creating a numerical representation of the text data.
For example, in the sentence “The dog in the hat sat on the floor”, the word “the” is counted three times and every other word once.
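A brief sketch of this on the example sentence, assuming scikit-learn (whose CountVectorizer and TfidfVectorizer classes match the technique names used here):

```python
from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(["The dog in the hat sat on the floor"])

# Vocabulary learned from the sentence, paired with per-word counts.
print(dict(zip(vectorizer.get_feature_names_out(), counts.toarray()[0])))
# {'dog': 1, 'floor': 1, 'hat': 1, 'in': 1, 'on': 1, 'sat': 1, 'the': 3}
```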
TF-IDF consists of two parts: term frequency and inverse document frequency. It is a statistical measure that evaluates how important a word is to a given document within a corpus, and it is used to identify the most important terms or keywords in a document or set of documents. This is useful for many natural language processing tasks, such as text classification, information retrieval, and content analysis: using TF-IDF, the most relevant and informative terms can be extracted automatically from a large collection of documents, making the content easier to understand and analyze. TF-IDF is the product of term frequency and inverse document frequency. Term frequency measures how often a term occurs in a particular document, i.e., the ratio of the number of times a word appears in a document to the total number of words in that document, while inverse document frequency measures how rare or unique the term is across all documents in the dataset.
TF = (number of times the word appears in a document) / (total number of words in the document)
IDF = log(total number of documents / number of documents that contain the word)
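As a brief worked example: if a term occurs 3 times in a 100-word document, TF = 3/100 = 0.03; if the term appears in 10 of 1,000 documents in the corpus, IDF = log(1000/10) = log(100) = 2 using base-10 logarithms, giving a TF-IDF score of 0.03 × 2 = 0.06. (Implementations vary slightly; for instance, scikit-learn's TfidfVectorizer uses a smoothed natural-log variant of the IDF formula.)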
The TF-IDF score of a term is calculated by multiplying its term frequency by its inverse document frequency; the higher the TF-IDF score of a term in a document, the more important the term is in that document. After the textual data is converted into numerical form, the machine learning models are trained on the numerical vectors and used to predict the labels of the test dataset. Naive Bayes, Logistic Regression, and Support Vector Machine algorithms are used to train the models on the dataset, as sketched below.
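A condensed sketch of this training stage, assuming scikit-learn and the dataframe df from the loading sketch above (its column names remain assumptions). Default hyperparameters are shown rather than the study's tuned values, and LinearSVC stands in as one common SVM formulation for text, since the exact SVM configuration is not specified here:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

# texts: preprocessed statements; labels: the six truthfulness classes.
texts, labels = df["statement"], df["label"]
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42)

# Fit TF-IDF on the training split only, then transform both splits.
tfidf = TfidfVectorizer()
X_train_vec = tfidf.fit_transform(X_train)
X_test_vec = tfidf.transform(X_test)

models = {
    "Naive Bayes": MultinomialNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": LinearSVC(),
}
for name, model in models.items():
    model.fit(X_train_vec, y_train)
    print(name, "test accuracy:", model.score(X_test_vec, y_test))
```

Fitting the vectorizer on the training split alone avoids leaking test-set vocabulary statistics into training.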
Evaluation metrics
Accuracy: Accuracy is the ratio of the number of correct predictions to the total number of instances: Accuracy = (TP + TN) / (TP + TN + FP + FN).
Precision: Precision is the ratio of the number of correctly predicted positive instances to the total number of predicted positive instances: Precision = TP / (TP + FP).
Recall: Recall is the ratio of the number of correctly predicted positive instances to the total number of actual positive instances: Recall = TP / (TP + FN).
F1-score: The F1 measure is the harmonic mean of precision and recall: F1 = 2 × (Precision × Recall) / (Precision + Recall).
Here, TP, TN, FP, and FN represent the numbers of true positives, true negatives, false positives, and false negatives, respectively.
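Continuing the sketch above, these metrics can be computed with scikit-learn; since the LIAR labels span six classes, precision, recall, and F1 must be averaged across classes, and macro averaging is shown here as one common choice:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# y_test and X_test_vec come from the training sketch; model is one trained classifier.
y_pred = model.predict(X_test_vec)

accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="macro")  # macro-average across the six classes
print(f"accuracy={accuracy:.3f}  precision={precision:.3f}  "
      f"recall={recall:.3f}  f1={f1:.3f}")
```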