Selamat, Ali and Lim, Yee Way and Ibrahim, Siti Nurkhadijah Aishah (2008) Analisis capaian web crawler dengan menggunakan algoritma genetik [Analysis of web crawler retrieval using a genetic algorithm]. Jurnal Teknologi (49D). pp. 61-85. ISSN 0127-9696
Official URL: http://www.penerbit.utm.my/onlinejournal/49/D/49si...
With the tremendous growth of the World Wide Web (WWW), finding relevant sources in this boundless space has become a challenging task. To help web users find the information they want, several well-known search engines such as Google, LookSmart, Altavista and Yahoo have been introduced in recent years. One of the most crucial components of a search engine is the web crawler, also known as a web ant or web robot, which crawls resources and information on the WWW. Current search engine designs provide no communication between the web crawler and the users who dispatched it, which leads to imprecise results: most of the results returned are outdated or incorrect. Therefore, an intelligent web crawler named UtmCrawler has been designed to address this imprecision. The UtmCrawler methodology consists of several phases: literature review, crawling, preprocessing, processing, testing and documentation. During the processing phase, a genetic algorithm (GA) performs keyword optimization, expanding the initial keywords up to an appropriate threshold. The experimental results show that the web crawler with GA achieved higher precision (95.19%) than the usual crawler without GA (85.07%). In conclusion, UtmCrawler could provide better search results for current web users.
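The abstract does not describe how the GA encodes keywords, what fitness function it uses, or how the threshold is set. The Python sketch below is therefore only an illustration under stated assumptions, not UtmCrawler's implementation: `expand_keywords`, the per-keyword `score` table, and the `max_keywords` cap are all hypothetical names. It shows one plausible way a GA could expand an initial keyword list, alongside the standard definition of precision (relevant retrieved items divided by all retrieved items) used to compare crawlers.

```python
import random


def precision(retrieved, relevant):
    """Fraction of retrieved items that are also relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    return len(retrieved & set(relevant)) / len(retrieved)


def expand_keywords(initial, candidates, score, max_keywords,
                    generations=30, pop_size=20, mutation_rate=0.1, seed=0):
    """Hypothetical GA: choose extra keywords to add to `initial`.

    Assumptions (not from the paper): each chromosome is a bit list over
    the candidate pool, fitness is the sum of the chosen keywords' scores,
    and `max_keywords` plays the role of the expansion threshold.
    """
    rng = random.Random(seed)
    pool = [c for c in candidates if c not in initial]
    budget = max(0, max_keywords - len(initial))

    def repair(bits):
        # Switch off randomly chosen genes until the budget is respected.
        on = [i for i, b in enumerate(bits) if b]
        while len(on) > budget:
            bits[on.pop(rng.randrange(len(on)))] = 0
        return bits

    def fitness(bits):
        return sum(score[pool[i]] for i, b in enumerate(bits) if b)

    population = [repair([rng.randint(0, 1) for _ in pool])
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, len(pool)) if len(pool) > 1 else 0
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(repair(child))
        population = parents + children

    best = max(population, key=fitness)
    return list(initial) + [pool[i] for i, b in enumerate(best) if b]
```

For example, calling `expand_keywords(["crawler"], ["spider", "robot", "banana"], {"spider": 0.9, "robot": 0.8, "banana": 0.1}, max_keywords=3)` returns the initial keyword followed by at most two added terms. In a real crawler the scores would presumably come from the crawled documents rather than a fixed table.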
|Uncontrolled Keywords:||web crawler, genetic algorithms, search engine, agent, precision|
|Subjects:||Q Science > QA Mathematics > QA75 Electronic computers. Computer science|
|Divisions:||Computer Science and Information System|
|Deposited By:||Ms Zalinda Shuratman|
|Deposited On:||12 Jan 2011 06:02|
|Last Modified:||12 Jan 2011 06:02|