Universiti Teknologi Malaysia Institutional Repository

Classification of illicit web pages using neural network

Lee, Zhi Sam and Aizani, Mohd. and Selamat, Ali and Shamsuddin, Siti Mariyam (2007) Classification of illicit web pages using neural network. Jurnal Teknologi Maklumat, 19 (2). pp. 1-21. ISSN 0128-3790

Abstract

Illicit web content such as pornography, violence, and gambling has greatly polluted the minds of web users, especially children and teenagers. Because popular web filtering techniques such as Uniform Resource Locator (URL) blocking and Platform for Internet Content Selection (PICS) checking are of limited use against today's dynamic web content, content-based analysis techniques with an effective model are highly desirable. In this paper, we propose a textual content analysis model that uses an entropy term weighting scheme to classify pornography and sex education web pages. We compare the entropy scheme with two other common term weighting schemes, TFIDF and Glasgow. The schemes are examined extensively with an artificial neural network using a small class dataset. In this study, we found that our proposed model achieves better performance in terms of accuracy, convergence speed, and stability.
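The abstract does not state the weighting formulas, so the following is only a minimal sketch of how two of the named schemes are commonly computed. It assumes the widely used log-entropy formulation (local weight log(1 + tf) multiplied by a global weight of 1 plus the normalized entropy sum) and the plain tf * log(N / df) form of TFIDF; the paper's exact variants may differ, and the Glasgow scheme is not shown. The function names and the whitespace tokenizer are hypothetical, chosen for illustration.

```python
import math
from collections import Counter


def term_counts(docs):
    # Assumption: naive lowercase whitespace tokenization, for illustration only.
    return [Counter(doc.lower().split()) for doc in docs]


def tfidf_weights(docs):
    # Plain TFIDF: term frequency times log(N / document frequency).
    counts = term_counts(docs)
    n = len(counts)
    df = Counter(term for c in counts for term in c)
    return [{t: tf * math.log(n / df[t]) for t, tf in c.items()} for c in counts]


def log_entropy_weights(docs):
    # Log-entropy weighting (assumed formulation, not taken from the paper):
    #   global(t) = 1 + sum_j p_tj * log(p_tj) / log(N), with p_tj = tf_tj / gf_t
    #   weight(t, j) = log(1 + tf_tj) * global(t)
    counts = term_counts(docs)
    n = len(counts)
    gf = Counter()
    for c in counts:
        gf.update(c)  # global frequency of each term across all documents

    global_w = {}
    for term, total in gf.items():
        entropy_sum = 0.0
        for c in counts:
            tf = c.get(term, 0)
            if tf:
                p = tf / total
                entropy_sum += p * math.log(p)
        # With a single document log(N) is 0, so fall back to a neutral weight.
        global_w[term] = 1.0 + entropy_sum / math.log(n) if n > 1 else 1.0

    return [
        {t: math.log(1 + tf) * global_w[t] for t, tf in c.items()}
        for c in counts
    ]


if __name__ == "__main__":
    toy_docs = ["alpha beta alpha", "beta gamma", "beta delta"]
    for row in log_entropy_weights(toy_docs):
        print(row)
```

Under both schemes a term spread evenly over every document (here "beta") receives a weight of about zero, while a term concentrated in one document keeps a positive weight. The resulting per-document weight vectors are the kind of input a neural network classifier would consume.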

Item Type: Article
Uncontrolled Keywords: Artificial neural network, term weighting scheme, textual content analysis, web pages classification
Subjects: Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4050 Electronic information resources
          Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4450 Databases
Divisions: Computer Science and Information System (Formerly known)
ID Code: 8178
Deposited By: Norshiela Buyamin
Deposited On: 02 Apr 2009 06:30
Last Modified: 15 Jan 2014 08:35
