Authors: | » Mohammed MERABET | |
Type: | International Journal | |
Journal name: | International Journal of Grid and High Performance Computing ISSN: | |
Volume: | Issue: | Pages: |
Link: » https://www.igi-global.com/article/a-predictive-map-task-scheduler-for-optimizing-data-locality-in-mapreduce-clusters/210172 | ||
Published on: | 13-05-2018 |
This article describes how data locality is becoming one of the most critical factors affecting the performance of MapReduce clusters, because network bisection bandwidth becomes a bottleneck. The task scheduler assigns the most appropriate map tasks to nodes. If map tasks are scheduled to nodes that do not hold their input data, these tasks issue remote I/O operations to copy the data to the local node, which increases the execution time of map tasks. In that case, a prefetching mechanism can be useful to preload the needed input data before tasks are launched. The key challenge, therefore, is how to accurately predict the execution time of map tasks so that data prefetching can be used effectively without any data access delay. This article proposes a Predictive Map Task Scheduler that assigns the most suitable map tasks to nodes ahead of time. A linear regression model is used for prediction and a data-locality-based algorithm for task scheduling. The experimental results show that the method can greatly improve both data locality and execution time of map tasks.
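The idea summarized above can be illustrated with a minimal Python sketch. It is not the article's implementation: the abstract does not state which features the regression uses or how the scheduler ranks tasks. The sketch assumes a single feature (input split size in MB), and the names `fit_line`, `MapTask`, `pick_next_task`, and `plan_prefetch` are hypothetical. It fits a one-variable least-squares line to past task durations, predicts how long the running task still needs, and picks the next task for the node with a preference for tasks whose input is already local.

```python
from dataclasses import dataclass


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


@dataclass
class MapTask:
    task_id: str
    input_mb: float            # assumed feature: size of the input split
    replica_nodes: frozenset   # nodes that hold a replica of the input


def pick_next_task(node, pending):
    """Prefer a pending task whose input is stored on `node`.

    Falls back to the first pending task, whose input would then
    have to be prefetched from a remote node.
    """
    for task in pending:
        if node in task.replica_nodes:
            return task, True
    return (pending[0], False) if pending else (None, False)


def plan_prefetch(node, running_task, pending, model, elapsed_s):
    """Return the next task for `node`, whether it is data-local,
    and the predicted seconds left before the node becomes free."""
    a, b = model
    predicted_total = a * running_task.input_mb + b
    remaining = max(predicted_total - elapsed_s, 0.0)
    next_task, is_local = pick_next_task(node, pending)
    return next_task, is_local, remaining


# Illustrative (made-up) history: split size in MB vs. duration in seconds.
history_mb = [64.0, 128.0, 256.0]
history_s = [10.0, 18.0, 34.0]
model = fit_line(history_mb, history_s)

running = MapTask("m1", 128.0, frozenset({"node1"}))
pending = [
    MapTask("m2", 64.0, frozenset({"node3"})),
    MapTask("m3", 128.0, frozenset({"node1", "node2"})),
]
next_task, is_local, remaining = plan_prefetch(
    "node1", running, pending, model, elapsed_s=6.0
)
print(next_task.task_id, is_local, remaining)
```

The predicted remaining time is the window available for prefetching: if the chosen task is not data-local, its input would have to be copied to the node within that window to avoid a data access delay.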