Universiti Teknologi Malaysia Institutional Repository

Classification of online grooming on chat logs using two term weighting schemes

Sulaiman, Nur Rafeeqkha and Md. Siraj, Maheyzah (2019) Classification of online grooming on chat logs using two term weighting schemes. International Journal of Innovative Computing, 9 (2). pp. 43-50. ISSN 2180-4370

Official URL: https://dx.doi.org/10.11113/ijic.v9n2.239

Abstract

With the growth of the Internet, it has become not only a medium for obtaining information but also a platform for communication. Social Network Services (SNS) are among the main platforms on which Internet users communicate by distributing and sharing information and knowledge. Chatting has become a popular communication medium because users can communicate directly and privately with each other. However, because chat rooms and other chat media are private, the content of chat logs is neither monitored nor filtered, which makes it easier for cyber predators to prey on their victims. Cyber groomers are cyber predators who prey on children or minors to satisfy their sexual desires. Personnel involved in intelligence gathering face growing difficulty as crimes become more complex and as human error and time constraints take their toll. Hence, it is difficult to prevent undesired content, such as grooming conversations, in chat logs. This study aims to improve content-based classification accuracy on chat logs by comparing two term weighting schemes for classifying grooming content, evaluated on two datasets. The two schemes, Term Frequency – Inverse Document Frequency – Inverse Class Space Density Frequency (TF.IDF.ICSdF) and Fuzzy Rough Feature Selection (FRFS), are used as the feature selection step in filtering chat logs. The performance of these techniques was examined on the datasets using a Support Vector Machine (SVM) classifier, and TF.IDF.ICSdF and FRFS are compared on accuracy, precision, recall and F-score.
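The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes a binary setting in which label 1 marks a grooming chat log and label 0 a non-grooming one, and it assumes the "F score" is the balanced F1 score; the abstract specifies neither. The function name and the example labels are hypothetical and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumption: `positive` is the label for the class of interest
    (here, hypothetically, a grooming conversation).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    # Guard every ratio against a zero denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Balanced F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical true labels and classifier predictions for eight chat logs.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

In a comparison such as the one described, the same function would be applied to the SVM predictions obtained under each weighting scheme, so that the two schemes are scored on identical measures.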

Item Type: Article
Uncontrolled Keywords: TF.IDF.ICSdF, FRFS, SVM, online grooming, classification
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computing
ID Code: 85242
Deposited By: Fazli Masari
Deposited On: 17 Mar 2020 08:10
Last Modified: 17 Mar 2020 08:10