HAL : hal-00759546, version 1

Scheduling/Data Management Heuristics
Desprez F., Gault S., Suter F.
Computer Science / Distributed, Parallel, and Cluster Computing
Frédéric Desprez 1, Sylvain Gault 1, Frédéric Suter 1, 2
1 :  LIP Lyon / Inria Grenoble Rhône-Alpes - AVALON
CNRS : UMR5668 – INRIA – École Normale Supérieure (ENS) - Lyon – Laboratoire d'informatique du Parallélisme – Université Claude Bernard - Lyon I (UCBL)
ENS Lyon 46 allée d'Italie 69364 Lyon Cedex 07
2 :  CC IN2P3 - Centre de Calcul de l'inst. national de phy. nucléaire et de phy. des particules
CNRS : USR6402 – IN2P3
12-14, boulevard Niels Bohr 69622 VILLEURBANNE CEDEX
The volume of data produced by scientific applications is increasing rapidly; some applications are expected to produce several petabytes per year. Processing this amount of data requires the computing power of hundreds or thousands of machines working at the same time. One of the biggest challenges is then how to program these machines so that they collaborate on the same computation. One answer, proposed by Google, is the MapReduce paradigm. MapReduce is fairly simple for the user to program, and it handles on its own repetitive or complex tasks such as data transfers between nodes, task scheduling, and node failures. These automated tasks must be handled efficiently for the framework to be fast and scalable. This report presents our first studies towards efficient scheduling of MapReduce operations. More specifically, we focus on scheduling the data transfers together with the tasks. We present an interesting piece of related work on this topic and our algorithm, which improves on its results.


Deliverable D3.1 of MapReduce ANR project
Project reference: ANR-10-SEGI-001
Year: 2010
Project acronym: MapReduce
Project title: Traitement intensif de données à très grande échelle à l'aide du paradigme MapReduce sur des infrastructures de type cloud et hybrides (Data-intensive processing at very large scale using the MapReduce paradigm on cloud and hybrid infrastructures)

Files attached to this document:
MapReduce-D3.1.pdf(777.4 KB)