BABICHEV Dmitry

Affiliations
  • 2018 - 2019
    Département d'Informatique de l'Ecole Normale Supérieure
  • 2018 - 2019
    Ecole normale supérieure Paris
  • 2018 - 2019
    Sciences Mathématiques de Paris Centre
  • 2018 - 2019
    Communauté d'universités et établissements Université de Recherche Paris Sciences et Lettres
  • 2019
    On efficient methods for high-dimensional statistical estimation.

    Dmitry BABICHEV, Francis BACH, Anatoli JUDITSKY, Olivier CAPPE, Arnak S. DALALYAN, Stephane CHRETIEN, Franck IUTZELER
    2019
    In this thesis, we examine several aspects of parameter estimation in statistics and machine learning, as well as optimization methods applicable to these problems. The goal of parameter estimation is to recover the unknown hidden parameters that govern the data, for example the parameters of an unknown probability density. Constructing an estimator is only part of the task: finding the optimal value of the parameter is typically itself an optimization problem that must be solved with suitable techniques. For a large class of problems these optimization problems are convex, and we can exploit their structure to obtain fast convergence rates.

    The first main contribution of the thesis develops moment-matching techniques for multi-index nonlinear regression problems. We consider the classical nonlinear regression problem, which is infeasible in high dimensions because of the curse of dimensionality. We combine two existing techniques, ADE (average derivative estimation) and SIR (sliced inverse regression), into a hybrid method that avoids some of the weaknesses of both.

    In the second main contribution, we use a particular type of averaging for stochastic gradient descent. We consider conditional exponential families (such as logistic regression), where the objective is to estimate the unknown value of the parameter. We propose averaging the moment parameters, which we call prediction functions. For finite-dimensional models, this type of averaging can lead to a negative error, i.e., it provides an estimator that is better than any linear estimator can ever be.

    The third main contribution concerns Fenchel-Young losses. We consider multi-class linear classifiers with losses of a certain type, such that their double conjugate is supported on a direct product of simplices. The corresponding convex-concave saddle-point formulation has a special form with a bilinear matrix term, and classical approaches suffer from time-consuming matrix multiplications. We show that for multi-class SVM losses, with efficient sampling techniques, our approach has sublinear iteration complexity: each iteration costs O(n + d + k), where k is the number of classes, d the number of features, and n the number of samples, while all existing techniques have a higher per-iteration cost.
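    The idea of averaging prediction functions rather than parameters can be sketched for logistic regression as follows. This is a minimal illustrative implementation, not the thesis code: the function names and step-size schedule are assumptions, and the comparison simply contrasts classical Polyak-Ruppert parameter averaging with averaging the per-iterate predicted probabilities.

```python
import numpy as np

def sigmoid(z):
    """Logistic link: the moment (prediction) function of the model."""
    return 1.0 / (1.0 + np.exp(-z))

def sgd_with_prediction_averaging(X, y, steps=2000, lr=0.5, seed=0):
    """Run SGD on the logistic loss and return two predictors:
    one using the averaged parameter (classical averaging), and one
    averaging the prediction functions of the iterates themselves.
    Hyperparameters here are illustrative choices, not from the thesis."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    theta_sum = np.zeros(d)
    iterates = []
    for t in range(steps):
        i = rng.integers(n)                     # sample one observation
        p = sigmoid(X[i] @ theta)
        grad = (p - y[i]) * X[i]                # stochastic gradient of logistic loss
        theta -= lr / np.sqrt(t + 1) * grad     # decaying step size (an assumption)
        theta_sum += theta
        iterates.append(theta.copy())
    theta_bar = theta_sum / steps
    iterates = np.array(iterates)

    def predict_param_avg(x):
        # plug the averaged parameter into the prediction function
        return sigmoid(x @ theta_bar)

    def predict_pred_avg(x):
        # average the prediction functions (moment parameters) over iterates
        return sigmoid(iterates @ x).mean()

    return predict_param_avg, predict_pred_avg

# Usage on synthetic data from a known logistic model:
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
theta_true = np.array([2.0, -1.0])
y = (rng.random(200) < sigmoid(X @ theta_true)).astype(float)
f_param, f_pred = sgd_with_prediction_averaging(X, y)
x_query = np.array([1.0, -1.0])                 # true probability is sigmoid(3) ~ 0.95
```

    Because the sigmoid is nonlinear, averaging the predictions over the iterates generally differs from predicting with the averaged parameter; the thesis analyzes when the former yields a strictly better estimator.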
Affiliations are detected from the signatures of publications identified in scanR. An author can therefore appear to be affiliated with several structures or supervisors according to these signatures. The dates displayed correspond only to the dates of the publications found. For more information, see https://scanr.enseignementsup-recherche.gouv.fr