Universiti Teknologi Malaysia Institutional Repository

Experimental study of support vector machines and Naïve Bayes classifier on automated subject area classification

Hossain, Rajan and Ibrahim, Roliana and Dollah @ Md. Zain, Rozilawati and Mohamed Khaidzir, Khairul Anwar (2017) Experimental study of support vector machines and Naïve Bayes classifier on automated subject area classification. Journal of Information Systems Research and Innovation, 11 (3). pp. 7-13. ISSN 2289-1358

Full text not available from this repository.

Official URL: https://seminar.utmspace.edu.my/jisri/download/Vol...

Abstract

Subject area classification allows researchers to identify publications by discipline or research domain. When the number of documents is large, classifying publications becomes increasingly difficult. Moreover, manually covering a broad range of subject areas at a fine level of granularity is a critical problem. In recent years, machine learning has emerged as an effective approach to automated classification in domains such as text, images, and video. The problems of classifying large numbers of publications can be addressed by automating subject area classification with supervised machine learning. This paper presents an experimental study that used support vector machines and Naïve Bayes for automated classification of subject areas. Text classification is used to estimate the probability that a document belongs to a given category, based on the co-words in the document and their frequencies. The experiment consisted of two phases. In the first phase, a list of co-words was generated from a collection of documents in each of the selected subject areas using text pre-processing techniques. In the second phase, both Support Vector Machines (SVM) and Naïve Bayes classifiers were applied, and the performance of each method was observed. SVM was found to perform better than the Naïve Bayes classifier in multi-label classification.
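The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify their pre-processing steps or classifier settings, so the stopword list, the co-occurrence window, the toy documents, and the names `co_words` and `CoWordNaiveBayes` are all assumptions made for illustration. The sketch covers only the Naïve Bayes side, in a single-label setting, using the Python standard library; the SVM comparison and the multi-label setup from the paper are omitted.

```python
import math
import re
from collections import Counter

# Assumed stopword list; the paper's actual pre-processing is not given in the abstract.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "on", "to", "is", "with"}


def co_words(text, window=3):
    """Phase 1 (sketch): count unordered pairs of distinct tokens that occur
    within the next `window - 1` positions of each other."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    pairs = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + window]:
            if left != right:
                pairs[tuple(sorted((left, right)))] += 1
    return pairs


class CoWordNaiveBayes:
    """Phase 2 (sketch): multinomial Naive Bayes over co-word frequencies,
    with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = {label: Counter() for label in self.class_counts}
        for doc, label in zip(docs, labels):
            self.feature_counts[label].update(co_words(doc))
        # Vocabulary: every co-word pair seen in any training document.
        self.vocab = set().union(*self.feature_counts.values())
        return self

    def log_scores(self, doc):
        feats = co_words(doc)
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for feat, freq in feats.items():
                if feat in self.vocab:  # ignore pairs never seen in training
                    score += freq * math.log((counts[feat] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)


# Toy data, invented for illustration only.
train_docs = [
    "support vector machines for text classification",
    "naive bayes text classification with word frequency",
    "protein folding and gene expression in cells",
    "gene expression analysis of cancer cells",
]
train_labels = ["computing", "computing", "biology", "biology"]

model = CoWordNaiveBayes().fit(train_docs, train_labels)
prediction = model.predict("text classification using support vector machines")
print(prediction)
```

Extending this toward the multi-label setting mentioned in the abstract would typically mean training one binary classifier per subject area and assigning every label whose classifier fires, rather than taking a single highest-scoring class.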

Item Type: Article
Uncontrolled Keywords: Machine Learning, Text Classification, Feature Extraction, Co-words Analysis, Multi Label Classification
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computing
ID Code: 80618
Deposited By: Fazli Masari
Deposited On: 27 Jun 2019 06:10
Last Modified: 27 Jun 2019 06:10
