Universiti Teknologi Malaysia Institutional Repository

Analisis capaian web crawler dengan menggunakan algoritma genetik [Analysis of web crawler retrieval using a genetic algorithm]

Selamat, Ali and Lim, Yee Way and Ibrahim, Siti Nurkhadijah Aishah (2008) Analisis capaian web crawler dengan menggunakan algoritma genetik. Jurnal Teknologi, 49D . pp. 61-85. ISSN 0127-9696 (Print); 2180-3722 (Electronic)


Abstract

With the tremendous growth of the World Wide Web (WWW), finding relevant sources in this boundless space has become a challenging task. To make it easier for web users to find the information they want, several well-known search engines such as Google, LookSmart, Altavista and Yahoo have been introduced in recent years. One of the most crucial components of a search engine is the web crawler, also known as a web ant or web robot, which crawls resources and information on the WWW. Current search engine designs provide no communication between the web crawler and the users who dispatched it, which leads to imprecise results: most of the results found are outdated or incorrect. Therefore, an intelligent web crawler named UtmCrawler has been designed to address this imprecision. The UtmCrawler methodology consists of several phases: literature review, crawling, preprocessing, processing, testing and documentation. During the processing phase, a genetic algorithm (GA) performs keyword optimization, expanding the initial keywords up to an appropriate threshold. The experimental results show that a web crawler designed with GA achieves higher precision (95.19%) than a conventional crawler without GA (85.07%). In conclusion, UtmCrawler could provide better search results for current web users.
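
The abstract does not give implementation details, so the following is only a minimal Python sketch of the two ideas it names: precision as an evaluation measure, and a genetic algorithm that expands an initial keyword set. It is not UtmCrawler's actual method. The function names, the bit-string encoding, truncation selection, one-point crossover, and the reading of "threshold" as a cap on the number of added terms (`max_extra`) are all assumptions made for illustration.

```python
import random


def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    if not retrieved:
        return 0.0
    hits = sum(1 for doc in retrieved if doc in relevant)
    return hits / len(retrieved)


def expand_keywords(initial, candidates, fitness, max_extra=3,
                    pop_size=20, generations=40, mutation_rate=0.1, seed=0):
    """Hypothetical GA: pick candidate terms to add to the initial keywords.

    A chromosome is a list of 0/1 genes, one per candidate term.
    Requires at least two candidates and pop_size >= 4.
    """
    rng = random.Random(seed)
    n = len(candidates)

    def decode(bits):
        # Assumed threshold: keep at most max_extra of the selected terms.
        extra = [c for c, b in zip(candidates, bits) if b][:max_extra]
        return list(initial) + extra

    def score(bits):
        return fitness(decode(bits))

    population = [[rng.randint(0, 1) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]      # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)            # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each gene with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return decode(max(population, key=score))


# Toy usage: fitness counts how many keywords fall in a made-up target set.
target = {"crawler", "spider", "robot", "indexing"}
result = expand_keywords(
    ["crawler"],
    ["spider", "robot", "cooking", "indexing", "travel"],
    fitness=lambda kws: len(set(kws) & target),
)
print(result)
print(precision(["d1", "d2", "d3", "d4"], {"d1", "d2", "d3"}))  # 0.75
```

In a real crawler the fitness function would presumably score a keyword set by the quality of the pages it retrieves; the toy overlap count here only keeps the sketch self-contained.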

Item Type: Article
Uncontrolled Keywords: web crawler, genetic algorithms, search engine, agent, precision
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computer Science and Information System
ID Code: 11712
Deposited By: Zalinda Shuratman
Deposited On: 12 Jan 2011 06:02
Last Modified: 01 Nov 2017 04:17
