Najafabadi, Maryam Khanian (2016) Improved collaborative filtering using clustering and association rule mining on implicit data. PhD thesis, Universiti Teknologi Malaysia, Advanced Informatics School.
Official URL: http://dms.library.utm.my:8080/vital/access/manage...
Abstract
Recommender systems have recently become more significant because they help users decide among appropriate choices. Collaborative Filtering (CF) is the most successful and most widely applied technique in recommender system design: items are recommended to an active user based on the past rating records of like-minded users. Unfortunately, CF may produce poor recommendations when user ratings on items are very sparse (an insufficient number of ratings) relative to the huge number of users and items in the user-item matrix. When user ratings on items are lacking, implicit feedback is used to profile a user’s item preferences. Implicit feedback can indicate users’ preferences by providing additional evidence and information through observations of users’ behavior. Data mining, which is the focus of this research, can predict a user’s future behavior without item evaluation and can also analyze the user’s preferences. To investigate the state of research in CF and implicit feedback, a systematic literature review was conducted on published studies in these topic areas. To investigate the user activities that influence a recommender system built on the CF technique, a critical observation of public recommendation datasets was carried out. To overcome the data sparsity problem, this research applies users’ implicit interaction records with items to efficiently process massive data by employing association rule mining (the Apriori algorithm). It uses item repetition within a transaction as an input for association rule mining, which can achieve high recommendation accuracy. To do this, a modified preprocessing step was employed to discover similar interest patterns among users. In addition, a clustering technique (hierarchical clustering) was used to reduce the size of the data and the dimensionality of the item space, since these affect the performance of association rule mining.
Then, similarities between items based on their features were computed to make recommendations. Experiments were conducted, and the results were compared with basic CF and other extended versions of CF techniques, including K-Means Clustering, Hybrid Representation, and Probabilistic Learning, using a public dataset, namely the Million Song Dataset. The experimental results demonstrate that the proposed technique improves Precision, Recall, and F-measure by an average of 20% compared with the basic CF technique. The technique also performs better than the other extended versions of CF techniques (an average improvement of 15% in Precision and Recall), even when the data is very sparse.
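The pipeline outlined in the abstract (implicit interaction records, association rule mining, then evaluation with Precision, Recall, and F-measure) can be illustrated with a minimal sketch. This is not the thesis implementation: the toy play counts, the `min_plays` threshold used as a stand-in for item repetition, and all function names are assumptions made for illustration. It mines rules between item pairs only, using the Apriori pruning idea that a pair can be frequent only if both of its items are, and it omits the hierarchical clustering step.

```python
from collections import Counter
from itertools import combinations


def to_transactions(plays, min_plays=2):
    """Turn play counts into one item set per user, keeping repeated items only.

    The min_plays threshold is an assumed stand-in for "item repetition".
    """
    return [
        {item for item, count in items.items() if count >= min_plays}
        for items in plays.values()
    ]


def mine_pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return {(antecedent, consequent): confidence} from frequent item pairs."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    # Apriori pruning: a pair can be frequent only if both of its items are.
    frequent = {i for i, c in item_counts.items() if c / n >= min_support}
    pair_counts = Counter(
        pair
        for t in transactions
        for pair in combinations(sorted(t & frequent), 2)
    )
    rules = {}
    for (x, y), c in pair_counts.items():
        if c / n < min_support:
            continue
        for antecedent, consequent in ((x, y), (y, x)):
            confidence = c / item_counts[antecedent]
            if confidence >= min_confidence:
                rules[(antecedent, consequent)] = confidence
    return rules


def recommend(rules, seen, top_n=3):
    """Rank unseen consequents of rules whose antecedent the user has seen."""
    scores = {}
    for (antecedent, consequent), confidence in rules.items():
        if antecedent in seen and consequent not in seen:
            scores[consequent] = max(scores.get(consequent, 0.0), confidence)
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


def precision_recall_f(recommended, relevant):
    """Precision, Recall, and F-measure of a recommendation list."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical toy data (user -> {song: play count}); not taken from the thesis.
plays = {
    "u1": {"a": 5, "b": 3, "c": 1},
    "u2": {"a": 2, "b": 4},
    "u3": {"a": 1, "b": 2, "d": 6},
    "u4": {"c": 3, "d": 2},
}

rules = mine_pair_rules(to_transactions(plays))
recommended = recommend(rules, seen={"a"})
print(rules)
print(recommended, precision_recall_f(recommended, relevant={"b", "d"}))
```

A full Apriori implementation would extend the same support-based pruning to itemsets larger than pairs. The sketch stops at pairs to keep the relationship between support, confidence, and the resulting recommendations easy to follow.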
Item Type: | Thesis (PhD) |
---|---|
Uncontrolled Keywords: | Collaborative Filtering (CF), data mining, CF techniques |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; T Technology > T Technology (General) > T58.5-58.64 Information technology |
Divisions: | Advanced Informatics School |
ID Code: | 98093 |
Deposited By: | Yanti Mohd Shah |
Deposited On: | 14 Nov 2022 09:52 |
Last Modified: | 14 Nov 2022 09:52 |