HAL : hal-00642998, version 1

25th Annual Conference on Neural Information Processing Systems (NIPS 2011), Granada : Espagne (2011)
Algorithms for Hyper-Parameter Optimization
James Bergstra1, R. Bardenet2, 3, Yoshua Bengio4, Balázs Kégl2, 3, 5

Several recent advances to the state of the art in image classification benchmarks have come from better configurations of existing techniques rather than novel approaches to feature learning. Traditionally, hyper-parameter optimization has been the job of humans because they can be very efficient in regimes where only a few trials are possible. Presently, computer clusters and GPU processors make it possible to run more trials and we show that algorithmic approaches can find better results. We present hyper-parameter optimization results on tasks of training neural networks and deep belief networks (DBNs). We optimize hyper-parameters using random search and two new greedy sequential methods based on the expected improvement criterion. Random search has been shown to be sufficiently efficient for learning neural networks for several datasets, but we show it is unreliable for training DBNs. The sequential algorithms are applied to the most difficult DBN learning problems from [1] and find significantly better results than the best previously reported. This work contributes novel techniques for making response surface models P(y|x) in which many elements of hyper-parameter assignment (x) are known to be irrelevant given particular values of other elements.
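The random-search baseline the abstract refers to can be sketched as follows. This is a minimal illustration, not the paper's implementation: the search space (a log-uniform learning rate and a categorical hidden-layer size) and the stand-in objective function are hypothetical placeholders for a real validation-error measurement.

```python
import math
import random

def objective(params):
    # Hypothetical stand-in for validation error; in the paper this
    # would be the result of training a neural network or DBN with
    # the given hyper-parameters. Minimized near lr=0.1, n_hidden=256.
    return (math.log10(params["lr"]) + 1.0) ** 2 \
        + (params["n_hidden"] / 256.0 - 1.0) ** 2

def sample_params(rng):
    # Draw each hyper-parameter independently from its prior,
    # e.g. log-uniform for the learning rate (a common choice).
    return {
        "lr": 10 ** rng.uniform(-4, 0),
        "n_hidden": rng.choice([64, 128, 256, 512]),
    }

def random_search(n_trials, seed=0):
    # Evaluate n_trials independent random configurations and
    # keep the one with the lowest loss.
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = sample_params(rng)
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss
```

The sequential methods in the paper go further: instead of sampling configurations independently, they fit a response surface model P(y|x) to past trials and propose the next configuration by maximizing expected improvement.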
1 :  The Rowland Institute
2 :  LRI - Laboratoire de Recherche en Informatique
3 :  INRIA Saclay - Ile de France - TAO
4 :  DIRO - Département d'Informatique et de Recherche Opérationnelle [Montreal]
5 :  LAL - Laboratoire de l'Accélérateur Linéaire
Files attached to this document:
draft1.pdf(268.5 KB)