HAL: hal-00642998, version 1

Algorithms for Hyper-Parameter Optimization
Bergstra J., Bardenet R., Bengio Y., Kégl B.
25th Annual Conference on Neural Information Processing Systems (NIPS 2011), Granada, Spain (2011) - http://hal.inria.fr/hal-00642998
Computer Science / Machine Learning
James Bergstra 1, R. Bardenet 2, 3, Yoshua Bengio 4, Balázs Kégl (http://users.web.lal.in2p3.fr/kegl) 2, 3, 5
1: The Rowland Institute
Harvard University (Cambridge, USA)
USA
2: LRI - Laboratoire de Recherche en Informatique
http://www.lri.fr/
CNRS: UMR8623 – Université Paris Sud
LRI - Bâtiments 650-660 Université Paris-Sud 91405 Orsay Cedex
France
3: INRIA Saclay - Ile de France - TAO
http://tao.lri.fr/tiki-index.php
INRIA – CNRS: UMR8623 – Université Paris XI - Paris Sud
France
4: DIRO - Département d'Informatique et de Recherche Opérationnelle [Montreal]
http://www.iro.umontreal.ca/
Université de Montréal
Département d'Informatique et de recherche opérationnelle Université de Montréal Pavillon André-Aisenstadt CP 6128 succ Centre-Ville Montréal QC H3C 3J7 Canada
Canada
5: LAL - Laboratoire de l'Accélérateur Linéaire
http://www.lal.in2p3.fr/
CNRS: UMR8607 – IN2P3 – Université Paris XI - Paris Sud
Centre Scientifique d'Orsay B.P. 34 91898 ORSAY Cedex
France
Several recent advances to the state of the art in image classification benchmarks have come from better configurations of existing techniques rather than novel approaches to feature learning. Traditionally, hyper-parameter optimization has been the job of humans because they can be very efficient in regimes where only a few trials are possible. Presently, computer clusters and GPU processors make it possible to run more trials and we show that algorithmic approaches can find better results. We present hyper-parameter optimization results on tasks of training neural networks and deep belief networks (DBNs). We optimize hyper-parameters using random search and two new greedy sequential methods based on the expected improvement criterion. Random search has been shown to be sufficiently efficient for learning neural networks for several datasets, but we show it is unreliable for training DBNs. The sequential algorithms are applied to the most difficult DBN learning problems from [1] and find significantly better results than the best previously reported. This work contributes novel techniques for making response surface models P(y|x) in which many elements of hyper-parameter assignment (x) are known to be irrelevant given particular values of other elements.
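As a concrete illustration of the sequential search the abstract describes, the following minimal sketch contrasts random search with an expected-improvement (EI) driven loop in the tree-structured Parzen estimator (TPE) style. The toy objective, the good/bad split fraction gamma, the candidate count, and the search bounds are all illustrative assumptions, not the authors' experimental settings:

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

def objective(x):
    # Stand-in for an expensive training run that returns a validation loss.
    return (x - 0.3) ** 2 + 0.05 * rng.normal()

def random_search(n_trials, low=0.0, high=1.0):
    xs = rng.uniform(low, high, size=n_trials)
    return xs, np.array([objective(x) for x in xs])

def tpe_search(n_trials, n_startup=10, gamma=0.25, n_candidates=64,
               low=0.0, high=1.0):
    xs, ys = [], []
    for t in range(n_trials):
        if t < n_startup:
            x = rng.uniform(low, high)  # warm-up trials are random
        else:
            # Split past trials into the best gamma fraction ("good")
            # and the rest ("bad"), then model each set with a density.
            order = np.argsort(ys)
            n_good = max(2, int(np.ceil(gamma * len(ys))))
            l = gaussian_kde(np.array(xs)[order[:n_good]])  # l(x), good
            g = gaussian_kde(np.array(xs)[order[n_good:]])  # g(x), bad
            # EI is maximized by maximizing the ratio l(x)/g(x),
            # so evaluate a batch of candidates and keep the best ratio.
            cand = rng.uniform(low, high, size=n_candidates)
            x = cand[np.argmax(l(cand) / np.maximum(g(cand), 1e-12))]
        xs.append(x)
        ys.append(objective(x))
    return np.array(xs), np.array(ys)

xs_r, ys_r = random_search(50)
xs_t, ys_t = tpe_search(50)
print("best loss, random search:", ys_r.min())
print("best loss, TPE-style EI :", ys_t.min())

In the paper the densities are adaptive Parzen estimators defined over tree-structured configuration spaces, which is what lets the model ignore hyper-parameters that are irrelevant given the values of others; the one-dimensional Gaussian KDE here is only meant to convey the shape of the sequential loop.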
English

Document type: Conference paper (with proceedings)
20/11/2011
Audience: international
Conference: 25th Annual Conference on Neural Information Processing Systems (NIPS 2011)
Location: Granada, Spain
Conference dates: 12/12/2011 - 15/12/2011
Editors: J. Shawe-Taylor, R.S. Zemel, P. Bartlett, F. Pereira, K.Q. Weinberger
Publisher: Neural Information Processing Systems Foundation
Book series: Advances in Neural Information Processing Systems
Volume: 24

LAL 11-308
11023
List of files attached to this document:
PDF
draft1.pdf (268.5 KB)