Universiti Teknologi Malaysia Institutional Repository

Classification of terrorism based on tweet text post on twitter using term weighting schemes

Muhammad, Muhammad Fikri Arif (2018) Classification of terrorism based on tweet text post on twitter using term weighting schemes. Masters thesis, Universiti Teknologi Malaysia.

Official URL: http://dms.library.utm.my:8080/vital/access/manage...

Abstract

Social Network Services (SNS) have become the main platform for distributing information and sharing experience and knowledge. Twitter has gained popularity very quickly across all generations since it was founded. Its popularity has led to prominent media coverage, with instant news and advertisements from all over the world. However, the content of tweets posted on Twitter is not necessarily true and can sometimes be considered a threat to other users. Personnel involved in intelligence gathering constantly face difficulties arising from the increasing complexity of crime, human error and time constraints. It is therefore difficult to prevent undesired posts, such as terrorism posts intended to disseminate propaganda. Hence, this study investigates three term weighting schemes on two datasets in order to improve automated content-based classification techniques. The study aims to improve content-based classification accuracy on Twitter by comparing term weighting schemes for classifying terrorism content. In this project, three term weighting schemes, namely Entropy, Term Frequency Inverse Document Frequency (TF-IDF) and Term Frequency Relevance Frequency (TFRF), are used as the feature selection process for filtering Twitter posts. The performance of these schemes was examined on the datasets, and the accuracy of their results was measured using a Support Vector Machine (SVM). Entropy, TF-IDF and TFRF are judged on accuracy, precision, recall and F-score. The results showed that TFRF performed better than Entropy and TF-IDF. It is hoped that this study will give insight to other researchers, especially those who want to work in a similar area.
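
The three term weighting schemes named in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the formulas below are commonly used textbook variants (TF-IDF with a natural-log IDF, TF-RF with rf = log2(2 + a / max(1, c)), and log-entropy weighting), and the function names, the toy documents and the labels are all hypothetical, chosen only for illustration. The exact definitions, preprocessing and SVM configuration used in the thesis may differ.

```python
import math
from collections import Counter

# Hypothetical toy corpus: token lists with labels (1 = positive class, 0 = negative).
docs = [
    ["attack", "planned", "tonight"],
    ["football", "match", "tonight"],
    ["attack", "on", "the", "match"],
]
labels = [1, 0, 1]


def tf_idf(docs):
    """Weight = raw term frequency * ln(N / document frequency)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {t: tf * math.log(n / df[t]) for t, tf in Counter(doc).items()}
        for doc in docs
    ]


def tf_rf(docs, labels, positive=1):
    """Weight = raw term frequency * log2(2 + a / max(1, c)), where
    a = positive-class documents containing the term and
    c = negative-class documents containing the term (assumed variant)."""
    pos, neg = Counter(), Counter()
    for doc, label in zip(docs, labels):
        (pos if label == positive else neg).update(set(doc))
    return [
        {t: tf * math.log2(2 + pos[t] / max(1, neg[t]))
         for t, tf in Counter(doc).items()}
        for doc in docs
    ]


def log_entropy(docs):
    """Weight = ln(1 + tf) * (1 + sum_j p_j ln p_j / ln N), where p_j is the
    share of the term's total occurrences found in document j.
    Assumes more than one document, so that ln N is non-zero."""
    n = len(docs)
    counts = [Counter(doc) for doc in docs]
    gf = Counter()  # global frequency of each term across the corpus
    for c in counts:
        gf.update(c)

    def global_weight(term):
        s = 0.0
        for c in counts:
            if c[term]:
                p = c[term] / gf[term]
                s += p * math.log(p)
        return 1 + s / math.log(n)

    return [
        {t: math.log(1 + tf) * global_weight(t) for t, tf in c.items()}
        for c in counts
    ]


tfidf_weights = tf_idf(docs)
tfrf_weights = tf_rf(docs, labels)
entropy_weights = log_entropy(docs)
```

Unlike TF-IDF and log-entropy, the TF-RF weight uses the class labels, so a term concentrated in the positive class receives a larger weight than one spread evenly across both classes. Each resulting dictionary could then be converted into a feature vector for a classifier such as an SVM.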

Item Type: Thesis (Masters)
Uncontrolled Keywords: Social Network Service, Term Weighting Schemes
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computing
ID Code: 81564
Deposited By: Narimah Nawil
Deposited On: 10 Sep 2019 01:40
Last Modified: 10 Sep 2019 01:40
