Shamsuddin, Siti Mariyam (2008) Enhance term weighting algorithm as feature selection technique for illicit web content classification. In: The Eighth International Conference on Intelligent Systems Design and Applications (ISDA'08), 2008, Kaohsiung City, Taiwan.
Official URL: http://dx.doi.org/10.4028/10.1109/ISDA.2008.171
The exponential growth of information on the Internet has raised the issue of information security. Pornographic Web content is one of the most harmful resources, polluting the minds of children and teenagers. Several Web content-based analysis approaches have been proposed to prevent children from accessing such illicit Web content. However, the implementation of each solution remains an issue. Most of these approaches are weak at classifying highly similar Web content, such as pornography and gynecology Web pages. In this study, we try to solve this issue by proposing a modified term weighting scheme that is used as a term feature selection technique for illicit Web page classification. We examine the performance of the proposed technique on three data sets, which represent three critical scenarios, and compare it with the original term weighting scheme. Based on our observations, the proposed technique shows its superiority for illicit Web page classification, achieving an accuracy rate above 90% on average. The experimental results also indicate that the proposed technique improves on the original term weighting scheme. We hope that this study will give insight to other researchers, especially those who work in similar areas.
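The abstract does not state the modified weighting formula, so the sketch below illustrates only the general idea of using a term weighting scheme for feature selection. It assumes plain TF-IDF as the weighting, whitespace tokenization, and a fixed cutoff of the k highest-weighted terms; the function names `tfidf_scores` and `select_features` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Map each term to its highest TF-IDF weight across all documents.

    Assumption: standard TF-IDF, not the paper's modified scheme.
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = {}
    for tokens in tokenized:
        tf = Counter(tokens)
        for term, count in tf.items():
            # Relative term frequency times inverse document frequency.
            weight = (count / len(tokens)) * math.log(n_docs / df[term])
            scores[term] = max(scores.get(term, 0.0), weight)
    return scores


def select_features(docs, k):
    """Keep the k terms with the highest weight as classifier features."""
    scores = tfidf_scores(docs)
    # Descending weight, then alphabetical order so ties are deterministic.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]
```

Terms that occur in every document receive a weight of zero under this weighting, so they rank last. That is the property that makes term weighting usable as a filter: vocabulary shared by two closely related page categories carries little discriminating power and is dropped before the classifier is trained.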
|Item Type:||Conference or Workshop Item (Paper)|
|Uncontrolled Keywords:||feature selection, neural network, term weighting scheme, text categorization, web filtering|
|Subjects:||Q Science > QA Mathematics > QA75 Electronic computers. Computer science|
|Divisions:||Computer Science and Information System|
|Deposited By:||Liza Porijo|
|Last Modified:||15 Dec 2011 07:49|