Universiti Teknologi Malaysia Institutional Repository

Finding English and translated Arabic documents similarities using GHSOM

Selamat, Ali and Ismail, Hanadi Hassen (2008) Finding English and translated Arabic documents similarities using GHSOM. In: Proceedings of the International Conference on Computer and Communication Engineering 2008, ICCCE08: Global Links for Human Development. Institute of Electrical and Electronics Engineers, New York, 460-465. ISBN 978-142441692-9

Full text not available from this repository.

Official URL: http://dx.doi.org/10.1109/ICCCE.2008.4580647

Abstract

The motivation for finding similar news across Arabic and English sources is to give the audience multiple views of the broadcast news, because reading the news from a single source may not always reflect what is happening around the world, owing to the different backgrounds, cultures, and opinions of readers and writers. Many techniques have been used to cluster documents with similar themes. In this paper, we analyze the similarity of the views expressed in English news texts and in translated Arabic news texts using a Self-organizing Map (SOM). However, we found that SOM has some difficulties that affect its performance. To address these performance problems, we used a Growing Hierarchical Self-organizing Map (GHSOM). The main advantage of such a mapping is the ease with which a user gains an idea of the structure of the data by analyzing the map. Thousands of news documents were collected from Arabic and English news sources on the web to train both algorithms. The experimental results show that GHSOM is better at clustering documents that share the same opinions.

Item Type: Book Section
Additional Information: ISBN: 978-142441692-9; International Conference on Computer and Communication Engineering 2008, ICCCE08: Global Links for Human Development; Kuala Lumpur; 13 May 2008 through 15 May 2008
Uncontrolled Keywords: chlorine compounds, conformal mapping, strength of materials, technology, clustering documents, communication engineering, growing hierarchical self-organizing map, human developments, international conferences, multiple views, news sources, self-organizing map, single sourcing, maps
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computer Science and Information System (Formerly known)
ID Code: 12570
Deposited By: Liza Porijo
Deposited On: 09 Jun 2011 04:07
Last Modified: 09 Jun 2011 04:07
