Universiti Teknologi Malaysia Institutional Repository

Towards stemming error reduction for Malay texts

Kassim, M. N. and Jalil, S. H. M. and Maarof, Z. A. and Zainal, A. (2019) Towards stemming error reduction for Malay texts. In: 5th International Conference on Computational Science and Technology, ICCST 2018, 29-30 Aug 2018, Kota Kinabalu, Malaysia.

Full text not available from this repository.

Official URL: http://www.dx.doi.org/10.1007/978-981-13-2622-6_2

Abstract

A text stemmer is a useful language preprocessing tool in information retrieval, text mining, and natural language processing. It maps morphological variants of words to their base forms. Most current text stemmers for the Malay language focus on removing affixes, clitics, and particles from affixed words. However, these stemmers still suffer from stemming errors because they do not sufficiently address the root causes of those errors. This paper investigates the root causes of stemming errors and proposes a stemming technique to address them. The proposed text stemmer uses an affix removal method and multiple dictionary lookups to address the various root causes of stemming errors. The experimental results showed promising stemming accuracy, with a reduction in the various possible stemming errors.
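The abstract does not give the paper's rules or dictionaries, so the following is only a minimal sketch of the general idea of combining affix removal with dictionary lookups, not the authors' algorithm. The names `ROOT_WORDS`, `MUTATED_FORMS`, `strip_options`, and `stem`, as well as the tiny word lists and affix lists, are assumptions chosen for illustration. The sketch checks the dictionaries before stripping anything, so a root such as "makan" is not overstemmed to "ma" by blindly removing the suffix "-kan".

```python
# Toy dictionaries (hypothetical, for illustration only; a real stemmer
# would load far larger word lists).
ROOT_WORDS = {"makan", "main", "buku", "ajar", "tulis", "pukul"}
# Derived words whose root cannot be recovered by plain stripping because
# the first letter of the root changes after the prefix (assumed examples).
MUTATED_FORMS = {"menulis": "tulis", "memukul": "pukul"}

# Small, incomplete affix lists (assumptions, longest prefixes first).
PREFIXES = ("meng", "mem", "men", "me", "ber", "per", "ter", "di", "ke", "se", "pe")
SUFFIXES = ("kan", "an", "i")
ENCLITICS = ("nya", "ku", "mu", "lah", "kah", "pun")


def strip_options(word, affixes, from_end):
    """Yield the word unchanged, then once per matching affix removed."""
    yield word
    for affix in affixes:
        if len(word) <= len(affix):
            continue
        if from_end and word.endswith(affix):
            yield word[: -len(affix)]
        elif not from_end and word.startswith(affix):
            yield word[len(affix):]


def stem(word):
    """Return a dictionary-confirmed root, or the word itself if none is found."""
    word = word.lower()
    # Lookup 1: the word is already a root, so nothing is stripped.
    if word in ROOT_WORDS:
        return word
    # Lookup 2: the word is a listed irregular derived form.
    if word in MUTATED_FORMS:
        return MUTATED_FORMS[word]
    # Otherwise try clitic/particle, suffix, and prefix removal, accepting
    # a candidate only when the root dictionary confirms it.
    for no_clitic in strip_options(word, ENCLITICS, from_end=True):
        for no_suffix in strip_options(no_clitic, SUFFIXES, from_end=True):
            for candidate in strip_options(no_suffix, PREFIXES, from_end=False):
                if candidate in ROOT_WORDS:
                    return candidate
    # No confirmed root: leave the word unchanged rather than guess.
    return word


if __name__ == "__main__":
    for w in ["makan", "makanan", "dimakan", "permainan", "bukunya", "menulis"]:
        print(w, "->", stem(w))
```

Accepting only dictionary-confirmed candidates trades coverage for precision: words whose roots are missing from the dictionary are returned unstemmed instead of being stripped to a wrong form.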

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: stemming errors, text stemmer, text stemming
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computing
ID Code: 88862
Deposited By: Narimah Nawil
Deposited On: 29 Dec 2020 04:38
Last Modified: 29 Dec 2020 04:38
