Al-Maimani, Maqbool Ramdhan Ibrahim (2015) Enhanced aspect level opinion mining knowledge extraction and representation. PhD thesis, Universiti Teknologi Malaysia, Faculty of Computing.
Full text not available from this repository.
Official URL: http://dms.library.utm.my:8080/vital/access/manage...
Abstract
There is a need to find more effective techniques to extract, classify, represent and summarize customers’ online opinions on products and services for better sentiment analysis. The aim of this thesis is to enhance aspect level opinion extraction and representation. This study uses the SentiWordNet lexical resource, which is specifically built for opinion mining and widely used in sentiment analysis. This research introduces an approach using adjectives, verbs, adverbs and nouns (AVAN) that analyses all opinion word types for sentiment analysis, rather than only adjectives and adverbs as has conventionally been done. SentiWordNet is used in this thesis to identify and analyze all word types for opinion extraction and representation. Opinion representation is enhanced by capturing key elements of opinions into predicates that consist of opinion word, strength, score and category, which improves opinion representation and classification. The thesis further enhances the mining by introducing opinion accounting, which summarizes opinion scores at various group levels. In addition, this thesis introduces a new concept called opinion strength, which classifies opinions into degrees. An enhanced score is assigned to each opinion based on the strength with which it is expressed. Furthermore, as opinions are fuzzy in nature, this study shows that fuzzy logic is an effective technique to address opinion vagueness, since human-like logic is fuzzy. This is important because opinions should not be categorized only into classical Boolean sentiments. This study identifies SentiWordNet, AVAN, Opinion Strength and fuzzy logic as classification features to classify customer reviews into a 5-class prediction model (Excellent, Good, Fair, Poor and Very Poor). The results show an accuracy of 92% using the Sequential Minimal Optimization classifier for these features, outperforming previous works that implemented Support Vector Machine and Logistic Regression. Moreover, the combination of AVAN, Opinion Strength and fuzzy logic outperformed SentiWordNet alone by 30% in accuracy.
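As a rough illustration of the opinion predicates and opinion accounting described in the abstract, the following Python sketch stores each opinion as a record of word, strength, score and category, then averages strength-adjusted scores per group. It is not the thesis's implementation: the `aspect` grouping key, the strength labels, the multiplier values in `STRENGTH_WEIGHT`, and the function names are all hypothetical, and the base scores are assumed to be lexicon-style polarities in [-1, 1].

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical strength multipliers; the abstract does not give actual values.
STRENGTH_WEIGHT = {"weak": 0.5, "normal": 1.0, "strong": 1.5}


@dataclass(frozen=True)
class OpinionPredicate:
    aspect: str    # assumed grouping key, e.g. a product feature
    word: str      # opinion word
    strength: str  # degree label; must be a key of STRENGTH_WEIGHT
    score: float   # assumed base polarity in [-1, 1], e.g. from a lexicon
    category: str  # word type: adjective, verb, adverb or noun


def enhanced_score(p: OpinionPredicate) -> float:
    """Scale the base score by the strength weight, clamped to [-1, 1]."""
    return max(-1.0, min(1.0, p.score * STRENGTH_WEIGHT[p.strength]))


def account_by_aspect(predicates):
    """Summarize opinions as the mean enhanced score per aspect."""
    buckets = defaultdict(list)
    for p in predicates:
        buckets[p.aspect].append(enhanced_score(p))
    return {aspect: sum(v) / len(v) for aspect, v in buckets.items()}


# Made-up example data, for illustration only.
reviews = [
    OpinionPredicate("battery", "excellent", "strong", 0.5, "adjective"),
    OpinionPredicate("battery", "drains", "normal", -0.25, "verb"),
    OpinionPredicate("screen", "dim", "weak", -0.5, "adjective"),
]
summary = account_by_aspect(reviews)
```

Grouping by a different key (for example product or reviewer) would give summaries at other levels; a fuzzy-logic variant would replace the fixed multipliers with membership functions.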
Item Type: | Thesis (PhD) |
---|---|
Additional Information: | Thesis (Ph.D (Sains Komputer)) - Universiti Teknologi Malaysia, 2015; Supervisor : Prof. Dr. Naomie Salim |
Uncontrolled Keywords: | adjectives, verbs, adverbs and nouns (AVAN), knowledge extraction and representation |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Computing |
ID Code: | 54773 |
Deposited By: | Fazli Masari |
Deposited On: | 13 May 2016 02:28 |
Last Modified: | 06 Nov 2020 16:19 |