Universiti Teknologi Malaysia Institutional Repository

Islamic web pages filtering and categorization

Mohd. Zamry, Nurfazrina (2013) Islamic web pages filtering and categorization. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computing.

Official URL: http://dms.library.utm.my:8080/vital/access/manage...

Abstract

The Internet creates a world without boundaries in which people can obtain large amounts of information simply by browsing. However, some of that information is neither genuine nor correct. Practitioners of deviant teachings can exploit this to attract followers online, in particular to distort the beliefs of Muslims in Malaysia. Web filtering protects against inappropriate content and prevents misuse of the network; it can therefore be used to filter the content of suspicious websites and limit their dissemination. Currently, deviant teaching websites are blocked manually. In addition, few web filtering products filter religious content, and very few support the Malay language. This project aims to classify deviant teaching websites into three categories: deviant, suspicious and clean. The web filtering process consists of pre-processing, feature selection and classification. Pre-processing involves three steps, HTML parsing, stemming and stopping (stop-word removal), which together produce the deviant teaching keywords. Three existing term weighting schemes, namely TF, TFIDF and Modified Entropy, are used for feature selection, while a Support Vector Machine (SVM) is used for classification. Classification is validated using accuracy, precision, recall and F1. A total of 300 Web pages were collected from the Internet across the three categories: deviant teaching, suspicious and clean. The results show that M.Entropy is the most suitable term weighting scheme for Islamic web page filtering, compared with TFIDF and Entropy.
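The pre-processing, term weighting and validation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the stop-word list, the sample pages and all function names are hypothetical, stemming is omitted, TF-IDF uses one common textbook variant (idf = log(N / df)), and Modified Entropy and the SVM classifier are left out because the abstract does not specify their formulation or configuration.

```python
import math
from collections import Counter
from html.parser import HTMLParser

# Hypothetical stop-word list; the stop words used in the thesis are not given here.
STOP_WORDS = {"dan", "yang", "di", "the", "a", "of"}


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def preprocess(html):
    """HTML parsing and stop-word removal (stemming is omitted in this sketch)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    # Replace punctuation with spaces, then split into tokens.
    tokens = "".join(c if c.isalnum() else " " for c in text).split()
    return [t for t in tokens if t not in STOP_WORDS]


def tf(tokens):
    """Raw term frequency: number of occurrences of each term in one page."""
    return Counter(tokens)


def tfidf(docs):
    """TF-IDF weights per page, assuming idf = log(N / df)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [{t: c * math.log(n / df[t]) for t, c in tf(doc).items()} for doc in docs]


def precision_recall_f1(y_true, y_pred, label):
    """Per-class precision, recall and F1 for one category label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical sample pages, for illustration only.
pages = [
    "<html><body><p>Ajaran sesat dan ajaran baru</p><script>var x=1;</script></body></html>",
    "<p>Berita sukan dan ajaran</p>",
]
docs = [preprocess(p) for p in pages]
weights = tfidf(docs)
```

In this sketch a term that occurs in every page, such as "ajaran" in the two sample pages, receives a TF-IDF weight of zero, which is why TF and TF-IDF can rank the same keywords differently. In a full pipeline the weight vectors would be passed to an SVM classifier.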

Item Type: Thesis (Masters)
Additional Information: Thesis (Sarjana Sains Komputer (Keselamatan Maklumat)) - Universiti Teknologi Malaysia, 2013; Supervisor: Dr. Anazida Zainal
Uncontrolled Keywords: web sites design, web site development
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK5015.888 Web sites
Divisions: Computing
ID Code: 35863
Deposited By: Kamariah Mohamed Jong
Deposited On: 10 Mar 2014 12:07
Last Modified: 17 Jul 2017 13:05
