Adaptation of learning methods to U-statistics.

Authors
  • COLIN Igor
  • CL?MEN?ON St?phan
  • SALMON Joseph
  • BIANCHI Pascal
  • RICHTARIK Peter
  • JOHANSSON Mikael
  • ASPREMONT Alexandre d
Publication date
2016
Publication type
Thesis
Summary The recent explosion of available data volumes has made algorithmic complexity a central element of machine learning methods. Stochastic optimization algorithms as well as distributed and centralized methods have been widely developed during the last ten years. These methods have made it easier to scale up to optimize empirical risks whose formulation is comparable to the associated observations. However, in many statistical learning problems, the accurate estimation of risk is done using U-statistics, functions of data in the form of averages over d-tuples. We first address the problem of sampling for the minimization of the empirical risk. We show that the risk can be replaced by a Monte-Carlo estimator, called U-statistic incomplete, which is a very efficient way to minimize the risk. We show that the risk can be replaced by an incomplete Monte Carlo estimator, named U-statistic, based on only O(n) terms and allowing to keep a learning rate of the same order. We establish bounds on the approximation error of the U-process and the numerical simulations demonstrate the advantage of such a sampling technique. We then turn our attention to centralized estimation, where observations are now distributed over a related network. We develop so-called gossip algorithms, in both synchronous and asynchronous settings, that disseminate the observations while maintaining local estimators of the U-statistic to be estimated. We show the convergence of these algorithms with explicit data dependencies and network topology. Finally, we deal with the centralized optimization of functions of pairs of observations. As for the estimation, our methods are based on the concomitance of the propagation of observations and the local optimization of the risk. Our theoretical analysis underlines that these methods keep a convergence speed of the same order as in the centralized case. The numerical experiments confirm the practical interest of our approach.
Topics of the publication
Themes detected by scanR from retrieved publications. For more information, see https://scanr.enseignementsup-recherche.gouv.fr