Universiti Teknologi Malaysia Institutional Repository

Comparison of different automatic text summarization systems using standard performance evaluations

Abd Munir, Nur Hafizah (2009) Comparison of different automatic text summarization systems using standard performance evaluations. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computer Science and Information System.

Many automatic summarization systems can produce a summary from a single text document. Different systems produce summaries with different content even when the percentage of sentences retained from the document is set to the same value. In this study, three automatic summarization systems are used to produce summaries: Microsoft Word Automatic Summarization, Shvoong Summarization, and Simple Text Summarization in PHP. The resulting summaries are evaluated using standard performance measures, namely recall, precision, and F-measure. The dataset consists of about 50 articles on the Iskandar Regional Development Authority (IRDA), collected from the online editions of the New Straits Times and The Star. Microsoft Word Automatic Summarization and Shvoong Summarization are existing systems; only Simple Text Summarization in PHP was coded for this study. It applies several operations, including stop-word removal, stemming, normalization, weighted term-frequency computation, and application of the summarization technique. The results from all three systems are stored in a database. The systems are then compared using the standard performance measures, and the evaluation is carried out entirely without human evaluators. A program written in Perl produces statistics for all summary results from the three systems. The experimental results indicate that Shvoong Summarization is the most effective of the three systems for summarizing a single text document.
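
The standard measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say what unit is compared or what the summaries are compared against, so this example assumes sentence-level overlap between a system summary and a reference set of sentences. The function name `evaluate_summary` and the sample sentence identifiers are hypothetical and are not taken from the thesis.

```python
def evaluate_summary(system_sentences, reference_sentences):
    """Return (precision, recall, f_measure) for one summary.

    Assumption: both arguments are collections of sentence identifiers
    (or sentence strings), and a "hit" is a sentence present in both.
    """
    system = set(system_sentences)
    reference = set(reference_sentences)
    overlap = len(system & reference)

    # Precision: share of selected sentences that are in the reference.
    precision = overlap / len(system) if system else 0.0
    # Recall: share of reference sentences that were selected.
    recall = overlap / len(reference) if reference else 0.0

    # F-measure: harmonic mean of precision and recall.
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical example: the system picked 4 sentences, the reference has 3,
# and 2 sentences are shared.
precision, recall, f_measure = evaluate_summary(
    ["s1", "s2", "s3", "s4"], ["s2", "s3", "s5"]
)
print(f"precision={precision:.3f} recall={recall:.3f} f={f_measure:.3f}")
```

Averaging these three values over all articles would give one per-system score of the kind a comparison like this needs, although the thesis itself may aggregate its statistics differently.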

Item Type: Thesis (Masters)
Additional Information: Supervisor: Assoc. Prof. Dr. Naomie Salim; Thesis (Master of Science (Computer Science)) - Universiti Teknologi Malaysia, 2009
Uncontrolled Keywords: text processing (Computer science), computational linguistics
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computer Science and Information System (Formerly known)
ID Code: 18202
Deposited By: Kamariah Mohamed Jong
Deposited On: 25 Feb 2014 06:48
Last Modified: 25 Feb 2014 06:52
