Universiti Teknologi Malaysia Institutional Repository

Replication effect over hadoop mapreduce performance using regression analysis

Shabbir, Aisha and Abu Bakar, Kamalrulnizam and Raja Mohd. Radzi, Raja Zahilah (2018) Replication effect over hadoop mapreduce performance using regression analysis. International Journal of Computer Applications, 181 (24). ISSN 0975-8887

Full text not available from this repository.

Official URL: http://dx.doi.org/10.5120/ijca2018918034

Abstract

Hadoop MapReduce is a widely accepted platform for processing very large datasets efficiently and cost-effectively. To cope with ever-growing datasets and shrinking time to analyze them, Hadoop MapReduce parallelizes computations across large distributed clusters of many machines. Careful consideration of the factors that affect Hadoop MapReduce can enhance its performance. Much research has been done on improving the total job execution time of MapReduce by optimizing different parameters. The effect of the replication factor on MapReduce job completion time, however, remains unexplored. This paper evaluates the effect of the data replication factor on MapReduce job completion time using regression analysis. The performance of a Hadoop MapReduce job, measured as total job completion time, is monitored experimentally while varying the replication factor. The evaluation results clearly show that job completion time depends on the replication factor. This dependence has been verified both analytically and experimentally.
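
The abstract does not give the regression model or any measurements. As a rough illustration only, the sketch below fits an ordinary least-squares line of job completion time against replication factor. The function name fit_line, the variable names, and every number are hypothetical placeholders chosen for this example; they are not taken from the paper, and neither the direction nor the size of the slope says anything about the paper's findings.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x.

    Returns (intercept a, slope b, coefficient of determination R^2).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                  # slope: change in time per unit of replication
    a = y_bar - b * x_bar          # intercept
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return a, b, r2


# Placeholder values only: NOT measurements from the paper.
replication_factor = [1, 2, 3, 4, 5]
completion_time_s = [100.0, 90.0, 80.0, 70.0, 60.0]

a, b, r2 = fit_line(replication_factor, completion_time_s)
print(f"time = {a:.1f} + {b:.1f} * replication (R^2 = {r2:.3f})")
```

With real measurements, the sign and size of the slope and the R^2 value would indicate how strongly completion time depends on the replication factor. On an actual cluster the replication factor is controlled by the HDFS dfs.replication property.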

Item Type: Article
Uncontrolled Keywords: hadoop mapreduce, big data, regression analysis, data replication, job optimization
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computing
ID Code: 82101
Deposited By: Yanti Mohd Shah
Deposited On: 30 Sep 2019 09:00
Last Modified: 26 Oct 2019 02:44
