Contributions to the theoretical study of variational inference and robustness.

Authors
Publication date
2020
Publication type
Thesis
Summary
This PhD thesis deals with variational inference and robustness in statistics and machine learning. Specifically, it focuses on the statistical properties of variational approximations and on the design of efficient algorithms to compute them sequentially, and it studies Maximum Mean Discrepancy (MMD) based estimators as learning rules that are robust to model misspecification.

In recent years, variational inference has been widely studied from a computational perspective; until very recently, however, the literature paid little attention to its theoretical properties. In this thesis, we study the consistency of variational approximations in various statistical models and the conditions under which it holds, in particular for mixture models and deep neural networks. We also give a theoretical justification for the ELBO maximization strategy, a numerical criterion widely used in the variational Bayes (VB) community for model selection and whose effectiveness has already been confirmed in practice.

In addition, Bayesian inference provides an attractive online learning framework for analyzing sequential data and offers generalization guarantees that remain valid under model misspecification and in the presence of adversaries. Unfortunately, exact Bayesian inference is rarely tractable in practice and approximation methods are usually employed; do these methods preserve the generalization properties of Bayesian inference? In this thesis, we show that this is indeed the case for some variational inference (VI) algorithms. We propose new online tempered algorithms and derive generalization bounds. Our theoretical result relies on the convexity of the variational objective, but we argue that it should hold more generally and present empirical evidence in support. This work provides theoretical justification for online algorithms that rely on approximate Bayesian methods.

Another question addressed in this thesis is the design of a universal estimation procedure. The question is of major interest because it leads to robust estimators, a topical issue in statistics and machine learning. We address universal estimation with a minimum distance estimator based on the Maximum Mean Discrepancy, and show that this estimator is robust to both dependence and the presence of outliers in the dataset. We also highlight its links with minimum distance estimators based on the L2 distance. Finally, we present a theoretical study of the stochastic gradient descent algorithm used to compute the estimator and support our findings with numerical simulations. We also propose a Bayesian version of the estimator, which we study from both a theoretical and a computational point of view.
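As a purely illustrative aside (not taken from the thesis itself), the sketch below shows what minimum-MMD estimation by stochastic gradient descent can look like in the simplest setting: a Gaussian location model fitted to contaminated data with a Gaussian kernel. The model family, kernel bandwidth, step-size schedule, and finite-difference gradient are all assumptions made for this example, not the procedure studied in the thesis.

# Illustrative sketch only: minimum-MMD estimation of a Gaussian location
# parameter by stochastic gradient descent, with a Gaussian kernel.
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel k(x, y) = exp(-(x - y)^2 / (2 * bandwidth^2))."""
    return np.exp(-(x - y) ** 2 / (2.0 * bandwidth ** 2))

def mmd_squared(model_sample, data_sample, bandwidth=1.0):
    """Plug-in (V-statistic) estimate of the squared MMD between two samples."""
    xx = gaussian_kernel(model_sample[:, None], model_sample[None, :], bandwidth).mean()
    yy = gaussian_kernel(data_sample[:, None], data_sample[None, :], bandwidth).mean()
    xy = gaussian_kernel(model_sample[:, None], data_sample[None, :], bandwidth).mean()
    return xx + yy - 2.0 * xy

# Data: mostly N(2, 1), plus a few gross outliers to mimic contamination.
data = np.concatenate([rng.normal(2.0, 1.0, 200), np.full(10, 50.0)])

theta, step, m = 0.0, 0.5, 200           # initial location, base step size, model sample size
for t in range(500):
    eps = rng.normal(0.0, 1.0, m)        # reparameterisation: model sample = theta + eps
    batch = rng.choice(data, size=50)    # mini-batch of observations
    h = 1e-3                             # finite-difference gradient of MMD^2 w.r.t. theta
    g = (mmd_squared(theta + h + eps, batch) - mmd_squared(theta - h + eps, batch)) / (2 * h)
    theta -= step / np.sqrt(t + 1) * g   # SGD update with decreasing step size

print("minimum-MMD location estimate:", theta)

On contaminated data of this kind, the estimate typically stays close to the bulk of the observations (around 2) rather than being dragged toward the outliers, which illustrates the kind of robustness to outliers that the summary refers to.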