P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352

B. Awerbuch and R. Kleinberg, Competitive collaborative learning, Journal of Computer and System Sciences, vol.74, issue.8, pp.1271-1288, 2008.
DOI : 10.1016/j.jcss.2007.08.004

S. Gelly, J. B. Hoock, A. Rimmel, O. Teytaud, and Y. Kalemkarian, The parallelization of Monte-Carlo planning, Proceedings of the Fifth International Conference on Informatics in Control, Automation and Robotics, pp.244-249, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00287867

I. Hegedüs, R. Busa-Fekete, R. Ormándi, M. Jelasity, and B. Kégl, Peer-to-peer multi-class boosting, International European Conference on Parallel and Distributed Computing (EURO-PAR), pp.389-400, 2012.

M. Jelasity, A. Montresor, and O. Babaoglu, Gossip-based aggregation in large dynamic networks, ACM Transactions on Computer Systems, vol.23, issue.3, pp.219-252, 2005.
DOI : 10.1145/1082469.1082470

M. Jelasity, S. Voulgaris, R. Guerraoui, A. Kermarrec, and M. van Steen, Gossip-based peer sampling, ACM Transactions on Computer Systems, vol.25, issue.3, p.8, 2007.
DOI : 10.1145/1275517.1275520

P. Joulani, Multi-armed bandit problems under delayed feedback, 2012.

D. Kempe, A. Dobra, and J. Gehrke, Gossip-based computation of aggregate information, Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, pp.482-491, 2003.
DOI : 10.1109/SFCS.2003.1238221

L. Kocsis and C. Szepesvári, Bandit Based Monte-Carlo Planning, Proceedings of the 17th European Conference on Machine Learning, pp.282-293, 2006.
DOI : 10.1007/11871842_29

T. L. Lai and H. Robbins, Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985.
DOI : 10.1016/0196-8858(85)90002-8

J. Langford and T. Zhang, The epoch-greedy algorithm for multi-armed bandits with side information, NIPS, 2007.

J. Langford, A. Smola, and M. Zinkevich, Slow Learners are Fast, Advances in Neural Information Processing Systems 22, pp.2331-2339, 2009.

R. Ormándi, I. Hegedüs, and M. Jelasity, Gossip learning with linear models on fully distributed data, Concurrency and Computation: Practice and Experience.

L. Xiao, S. Boyd, and S.-J. Kim, Distributed average consensus with least-mean-square deviation, Journal of Parallel and Distributed Computing, vol.67, issue.1, pp.33-46, 2007.
DOI : 10.1016/j.jpdc.2006.08.010