HAL : in2p3-00726735, version 1

Peer-to-peer multi-class boosting
Hegedűs I., Busa-Fekete R., Ormándi R., Jelasity M., Kégl B.
In Euro-Par 2012 Parallel Processing - 18th International European Conference on Parallel and Distributed Computing (Euro-Par 2012), Rhodes Island, Greece (2012) - http://hal.in2p3.fr/in2p3-00726735
Computer Science/Parallel, distributed and shared computing
Computer Science/Algorithms and data structures
I. Hegedűs, R. Busa-Fekete, R. Ormándi, M. Jelasity, B. Kégl 1, 2, 3
1 :  LAL - Laboratoire de l'Accélérateur Linéaire
CNRS : UMR8607 – IN2P3 – Université Paris XI - Paris Sud
Centre Scientifique d'Orsay B.P. 34 91898 ORSAY Cedex
2 :  LRI - Laboratoire de Recherche en Informatique
CNRS : UMR8623 – Université Paris Sud
LRI - Bâtiments 650-660 Université Paris-Sud 91405 Orsay Cedex
3 :  INRIA Saclay - Ile de France - TAO
INRIA – CNRS : UMR8623 – Université Paris XI - Paris Sud
DIGITEO Bat. Claude Shannon - Université de Paris-Sud, Bâtiment 660, 91190 Gif-sur-Yvette
We focus on the problem of data mining over large-scale fully distributed databases, where each node stores only one data record. We assume that a data record is never allowed to leave the node it is stored at. Possible motivations for this assumption include privacy or a lack of a centralized infrastructure. To tackle this problem, earlier we proposed the generic gossip learning framework (GoLF), but so far we have studied only basic linear algorithms. In this paper we implement the well-known boosting technique in GoLF. Boosting techniques have attracted growing attention in machine learning due to their outstanding performance in many practical applications. Here, we present an implementation of a boosting algorithm that is based on FilterBoost. Our main algorithmic contribution is a derivation of a pure online multi-class version of FilterBoost, so that it can be employed in GoLF. We also propose improvements to GoLF, with the aim of maximizing the diversity of the evolving models gossiped in the network, a feature that we show to be important. We evaluate the robustness and the convergence speed of the algorithm empirically over three benchmark databases. We compare the algorithm with the sequential AdaBoost algorithm and we test its performance in a failure scenario involving message drop and delay, and node churn.
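
As an illustration of the gossip learning setting described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a plain perceptron as the online learner (a stand-in for the basic linear algorithms mentioned above, not for the online multi-class FilterBoost derivation), uniform random peer selection, and no message drop, delay, or churn. The names `predict`, `perceptron_update`, and `gossip_learn` are hypothetical.

```python
import random


def predict(w, x):
    """Return +1 or -1 according to the sign of the linear score."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score >= 0 else -1


def perceptron_update(w, x, y):
    """Online update: change the weights only if the record is misclassified."""
    if predict(w, x) != y:
        return [wi + y * xi for wi, xi in zip(w, x)]
    return list(w)


def gossip_learn(records, cycles, seed=0):
    """Simulate gossip learning with one (x, y) record per node.

    In each cycle every node sends its current model to a random peer.
    The receiver updates the incoming model with its own record and keeps
    the result. Records never leave their node; only models travel.
    """
    rng = random.Random(seed)
    n = len(records)
    dim = len(records[0][0])
    models = [[0.0] * dim for _ in range(n)]
    for _ in range(cycles):
        # Visit the senders in a random order (assumption: one send per node per cycle).
        for sender in rng.sample(range(n), n):
            receiver = rng.randrange(n)
            x, y = records[receiver]
            models[receiver] = perceptron_update(models[sender], x, y)
    return models


# Toy data: the last feature is a constant bias term; labels are +1 or -1.
records = [([2.0, 1.0, 1.0], 1), ([1.5, 2.0, 1.0], 1),
           ([-1.0, -2.0, 1.0], -1), ([-2.0, -0.5, 1.0], -1)]
models = gossip_learn(records, cycles=20)
```

In this sketch the only state exchanged between nodes is the weight vector, which mirrors the constraint that a data record is never allowed to leave the node it is stored at.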

Conference papers with proceedings
Euro-Par 2012 Parallel Processing
Christos Kaklamanis, Theodore Papatheodorou, Paul G. Spirakis
Lecture Notes In Computer Science

18th International European Conference on Parallel and Distributed Computing (Euro-Par 2012)
Rhodes Island

LAL 12-310
ISBN: 978-3-642-32819-0 (Print) 978-3-642-32820-6 (Online)
List of files attached to this document:
EUROPAR2012.pdf (359.2 KB)