HAL : hal-00643000, version 1

A Robust Ranking Methodology based on Diverse Calibration of AdaBoost
Busa-Fekete R., Kégl B., Elteto T., Szarvas G.
European Conference on Machine Learning (ECML 2011), Athens, Greece (2011) - http://hal.inria.fr/hal-00643000
Computer Science/Machine Learning
Róbert Busa-Fekete [1], Balázs Kégl [1, 2, 3] (http://users.web.lal.in2p3.fr/kegl), Tamas Elteto [3], György Szarvas [4]
1 :  LAL - Laboratoire de l'Accélérateur Linéaire
http://www.lal.in2p3.fr/
CNRS : UMR8607 – IN2P3 – Université Paris XI - Paris Sud
Centre Scientifique d'Orsay B.P. 34 91898 ORSAY Cedex
France
2 :  LRI - Laboratoire de Recherche en Informatique
http://www.lri.fr/
CNRS : UMR8623 – Université Paris Sud
LRI - Bâtiments 650-660 Université Paris-Sud 91405 Orsay Cedex
France
3 :  INRIA Saclay - Ile de France - TAO
http://tao.lri.fr/tiki-index.php
INRIA – CNRS : UMR8623 – Université Paris XI - Paris Sud
DIGITEO Bat. Claude Shannon - Université de Paris-Sud, Bâtiment 660, 91190 Gif-sur-Yvette
France
4 :  Ubiquitous Knowledge Processing (UKP) Lab
Technische Universität Darmstadt
Germany
In subset ranking, the goal is to learn a ranking function that approximates a gold standard partial ordering of a set of objects (in our case, relevance labels of a set of documents retrieved for the same query). In this paper we introduce a learning-to-rank approach to subset ranking based on multi-class classification. Our technique can be summarized in three major steps. First, a multi-class classification model (AdaBoost.MH) is trained to predict the relevance label of each object. Second, the trained model is calibrated using various calibration techniques to obtain diverse class probability estimates. Finally, the Bayes-scoring function (which optimizes the popular Information Retrieval performance measure NDCG) is approximated through mixing these estimates into an ultimate scoring function. An important novelty of our approach is that many different methods are applied to estimate the same probability distribution, and all these hypotheses are combined into an improved model. It is well known that mixing different conditional distributions according to a prior is usually more efficient than selecting one "optimal" distribution. Accordingly, by using all the calibration techniques, our approach does not require the estimation of the best-suited calibration method and is therefore less prone to overfitting. In an experimental study, our method outperformed many standard ranking algorithms on the LETOR benchmark datasets, most of which are based on significantly more complex learning-to-rank algorithms than ours.
English
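The final mixing step described in the abstract can be illustrated with a minimal sketch. It assumes that each calibration technique has already produced a class probability estimate over the relevance labels 0..K-1 for every document, and that the Bayes-scoring function for NDCG is the expected gain, i.e. the sum over labels of p(label | document) * (2^label - 1). The function names, the uniform prior weights, and the toy probabilities are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence


def expected_gain(probs: Sequence[float]) -> float:
    """Expected NDCG gain of one document: sum_l p(l | x) * (2**l - 1)."""
    return sum(p * (2 ** label - 1) for label, p in enumerate(probs))


def mixed_score(estimates: Sequence[Sequence[float]],
                weights: Sequence[float]) -> float:
    """Mix several calibrated distributions for one document, then score it.

    estimates[k][l] is the probability of label l under calibration k.
    weights[k] is the (hypothetical) prior weight of calibration k.
    """
    total = sum(weights)
    n_labels = len(estimates[0])
    mixture = [
        sum(w * est[l] for w, est in zip(weights, estimates)) / total
        for l in range(n_labels)
    ]
    return expected_gain(mixture)


def rank_documents(per_doc_estimates: Sequence[Sequence[Sequence[float]]],
                   weights: Sequence[float]) -> List[int]:
    """Return document indices sorted by decreasing mixed score."""
    scores = [mixed_score(est, weights) for est in per_doc_estimates]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy query with three documents, three relevance labels (0, 1, 2),
# and two calibrations per document (made-up numbers).
docs = [
    [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]],
    [[0.1, 0.3, 0.6], [0.2, 0.2, 0.6]],
    [[0.3, 0.5, 0.2], [0.4, 0.4, 0.2]],
]
uniform_prior = [1.0, 1.0]
ranking = rank_documents(docs, uniform_prior)
```

Because no single calibration is selected, there is no model-selection step that could overfit; in this sketch a different prior only changes the `weights` argument.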

Conference paper with proceedings
20/11/2011
International
European Conference on Machine Learning (ECML 2011)
Athens
Greece
05/09/2011
09/09/2011

LAL 11-309
11024
List of files attached to this document:
PDF
final.pdf (323.8 KB)