HAL : in2p3-00457105, version 1

3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003 (CCGrid2003), Tokyo, Japan (2003)
XtremWeb & Condor: sharing resources between Internet connected Condor pools
O. Lodygensky (1, 2), G. Fedak (2), F. Cappello (2), V. Neri (2), M. Livny, D. Thain

Grid computing presents two major challenges for deploying large-scale applications across wide area networks that gather volunteer PCs and clusters or parallel computers as computational resources: security and fault tolerance. This paper presents a lightweight Grid solution for deploying multi-parameter applications on a set of clusters protected by firewalls. The system uses a hierarchical design based on Condor to manage each cluster locally and XtremWeb to enable resource sharing among the clusters. We discuss the security and fault tolerance mechanisms used in this design and demonstrate the usefulness of the approach by measuring the performance of a multi-parameter biochemistry application deployed on two sites: the University of Wisconsin-Madison and Paris South University. This experiment shows that we can efficiently and safely harness the computational power of about 200 PCs distributed across two geographic sites.
1 :  LAL - Laboratoire de l'Accélérateur Linéaire
2 :  LRI - Laboratoire de Recherche en Informatique
Computer Science/Parallel, distributed, and shared computing

Computer Science/Performance and reliability