EEDIS Laboratory (Evolutionary Engineering and Distributed Information Systems)

A Predictive Map Task Scheduler for Optimizing Data Locality in MapReduce Clusters

Authors: Mohammed MERABET
Type: International journal
Journal: International Journal of Grid and High Performance Computing ISSN:
Volume: Issue: Pages:
Link: https://www.igi-global.com/article/a-predictive-map-task-scheduler-for-optimizing-data-locality-in-mapreduce-clusters/210172
Published: 13-05-2018

This article describes how data locality is becoming one of the most critical factors affecting the performance of MapReduce clusters, because network bisection bandwidth becomes a bottleneck. The task scheduler assigns the most appropriate map tasks to nodes. If map tasks are scheduled to nodes that do not hold their input data, these tasks issue remote I/O operations to copy the data to the local node, which increases their execution time. In that case, a prefetching mechanism can be useful to preload the needed input data before the tasks are launched. The key challenge is therefore how to accurately predict the execution time of map tasks, so that data prefetching can be used effectively without any data access delay. This article proposes a Predictive Map Task Scheduler that assigns the most suitable map tasks to nodes ahead of time. It uses a linear regression model for prediction and a data-locality-based algorithm for task scheduling. The experimental results show that the method can greatly improve both the data locality and the execution time of map tasks.
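The two ingredients named in the abstract, a linear regression predictor and a locality-aware choice of map task, can be illustrated with a minimal sketch. This is not the article's implementation. It assumes a single feature (input size in MB) for the regression, and the function names `fit_linear`, `predict_seconds` and `pick_map_task`, as well as the task and block identifiers, are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple


def fit_linear(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_seconds(model: Tuple[float, float], input_mb: float) -> float:
    """Predicted map task execution time for a given input size (assumed feature)."""
    slope, intercept = model
    return slope * input_mb + intercept


def pick_map_task(
    node: str,
    pending: List[Tuple[str, str]],          # (task_id, block_id) pairs
    block_locations: Dict[str, Set[str]],    # block_id -> nodes holding a replica
) -> Optional[Tuple[str, bool]]:
    """Return (task_id, is_local); prefer a task whose input block is on `node`."""
    for task_id, block_id in pending:
        if node in block_locations.get(block_id, set()):
            return task_id, True
    if pending:
        # No local task: the caller would prefetch this task's block,
        # using the predicted finish time of the running task as the deadline.
        return pending[0][0], False
    return None


# Hypothetical history: input size in MB versus observed execution time in seconds.
model = fit_linear([64.0, 128.0, 256.0], [10.0, 19.0, 37.0])
eta = predict_seconds(model, 128.0)
choice = pick_map_task(
    "n1",
    [("t1", "b1"), ("t2", "b2")],
    {"b1": {"n2"}, "b2": {"n1"}},
)
```

In this sketch the predicted time only tells the scheduler how much time remains before a node becomes free. Deciding when to start the prefetch within that window is left to the caller.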
