Universiti Teknologi Malaysia Institutional Repository

Fake news detection on social media platforms using machine learning algorithms

Kee, Wee Boon (2022) Fake news detection on social media platforms using machine learning algorithms. Masters thesis, Universiti Teknologi Malaysia, Faculty of Engineering - School of Electrical Engineering.

Official URL: http://dms.library.utm.my:8080/vital/access/manage...

Abstract

The rapid growth of technology and social media platforms has changed the way people around the world retrieve information and news. However, some fake news is deliberately created to disseminate false information to the public, and its rapid spread has a significant negative impact on individuals, culture and the country. Existing fake news detection methods, such as manual fact-checking websites and tools, are time-consuming and cannot provide real-time detection. Because fake news can be disseminated rapidly through online social media platforms, an automated machine learning based approach is required. This project aims to propose and construct an effective hybrid machine learning based algorithm for automated fake news detection. The algorithm is intended to assist people in differentiating real news from fake news, reduce threats to national security by preventing the wide spread of fake news, and help preserve the integrity of the news ecosystem. The project begins with the collection of a dataset containing both real and fake news. This dataset is publicly available and was obtained from the Kaggle website. Next, text pre-processing techniques, i.e., lower case conversion, punctuation and stopword removal, and lemmatization, are applied to the raw data. Term Frequency-Inverse Document Frequency (TF-IDF) is then applied for text vectorization. Two machine learning algorithms, Random Forest (RF) and Support Vector Machine (SVM), are implemented to classify real and fake news. In addition, two hybrid ensemble models, namely a voting classifier and a stacking classifier, are constructed by combining three individual base classifiers, i.e., RF, SVM and Logistic Regression (LR). A hybrid ensemble model combines several models to improve prediction accuracy. These hybrid classifiers can detect fake news in real time and help prevent it from being disseminated widely through social media platforms. In this project, the stacking classifier performs better than the voting classifier, achieving an accuracy of 91.06% compared with the 90.41% obtained by the voting classifier.
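
The pre-processing, TF-IDF and voting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the stopword list, the example sentences, the function names (preprocess, tfidf, majority_vote) and the textbook weighting tf = count / document length, idf = ln(N / document frequency) are all assumptions made for illustration. Lemmatization is left out because it needs an external lexicon, and hard (majority) voting is assumed because the abstract does not say whether hard or soft voting was used.

```python
import math
import string
from collections import Counter

# Tiny illustrative stopword list (an assumption; the thesis does not list its own).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "on", "by"}


def preprocess(text):
    """Lowercase, strip ASCII punctuation, and drop stopwords.

    Lemmatization is omitted here because it needs an external lexicon.
    """
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = ln(N / document frequency),
    a textbook variant chosen for illustration.
    """
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def majority_vote(predictions):
    """Hard voting: return the label predicted by most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical example sentences, not taken from the thesis dataset.
docs = [
    "The senator is SHOCKED by the secret report!",
    "The senator presented the annual report.",
]
vectors = tfidf(docs)

# Hypothetical labels from three base classifiers (e.g. RF, SVM, LR) for one article.
final_label = majority_vote(["fake", "real", "fake"])
```

Terms that occur in every document, such as "senator" here, receive a weight of zero under this weighting, while terms unique to one document are weighted up. A stacking classifier differs from the voting step shown above in that it trains a further model on the outputs of the base classifiers instead of counting their votes.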

Item Type: Thesis (Masters)
Uncontrolled Keywords: fake news detection, Random Forest (RF), Support Vector Machine (SVM)
Subjects: T Technology > TK Electrical engineering. Electronics Nuclear engineering
Divisions: Faculty of Engineering - School of Electrical
ID Code: 99471
Deposited By: Yanti Mohd Shah
Deposited On: 27 Feb 2023 07:13
Last Modified: 27 Feb 2023 07:13
