GERMAIN Pascal

Affiliations
  • 2013 - 2020
    Université Laval
  • 2017 - 2020
    Model for data analysis and learning
  • 2014 - 2019
    Apprentissage statistique et parcimonie
  • 2016 - 2017
    Département d'Informatique de l'École Normale Supérieure
  • Landmark-based Ensemble Learning with Random Fourier Features and Gradient Boosting.

    Leo GAUTHERON, Pascal GERMAIN, Amaury HABRARD, Guillaume METZLER, Emilie MORVANT, Marc SEBBAN, Valentina ZANTEDESCHI
    European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases | 2020
    This paper jointly leverages two state-of-the-art learning strategies, gradient boosting (GB) and kernel Random Fourier Features (RFF), to address the problem of kernel learning. Our study builds on a recent result showing that one can learn a distribution over the RFF to produce a new kernel suited for the task at hand. For learning this distribution, we exploit a GB scheme expressed as ensembles of RFF weak learners, each of them being a kernel function designed to fit the residual. Unlike Multiple Kernel Learning techniques that make use of a pre-computed dictionary of kernel functions to select from, at each iteration we fit a kernel by approximating it from the training data as a weighted sum of RFF. This strategy allows one to build a classifier based on a small ensemble of learned kernel "landmarks" better suited for the underlying application. We conduct a thorough experimental analysis to highlight the advantages of our method compared to both boosting-based and kernel-learning state-of-the-art methods; a minimal code sketch follows.
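    As a concrete illustration of the scheme described above, here is a minimal sketch (ours, not the authors' code) of a gradient-boosting loop whose weak learners are single random Fourier features fitted to the residual; the Gaussian-kernel spectral sampling, the candidate pool size, and the squared loss are simplifying assumptions.

      import numpy as np

      def rff_gradient_boosting(X, y, T=50, n_candidates=100, gamma=1.0, seed=0):
          # Each weak learner is one random Fourier feature h(x) = cos(w.x + b),
          # with w drawn from N(0, 2*gamma*I), the spectral measure of the
          # Gaussian kernel exp(-gamma * ||x - y||^2).
          rng = np.random.default_rng(seed)
          n, d = X.shape
          F = np.zeros(n)                          # current ensemble prediction
          ensemble = []                            # learned (w, b, alpha) triples
          for _ in range(T):
              r = y - F                            # residual = negative gradient of the squared loss
              W = rng.normal(scale=np.sqrt(2 * gamma), size=(n_candidates, d))
              b = rng.uniform(0.0, 2 * np.pi, size=n_candidates)
              H = np.cos(X @ W.T + b)              # candidate features, shape (n, n_candidates)
              j = int(np.argmax(np.abs(H.T @ r)))  # candidate most correlated with the residual
              h = H[:, j]
              alpha = (h @ r) / (h @ h)            # closed-form line search for the squared loss
              F += alpha * h
              ensemble.append((W[j], b[j], alpha))
          return ensemble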
  • PAC-Bayes and domain adaptation.

    Pascal GERMAIN, Amaury HABRARD, Francois LAVIOLETTE, Emilie MORVANT
    Neurocomputing | 2020
    We provide two main contributions in PAC-Bayesian theory for domain adaptation where the objective is to learn, from a source distribution, a well-performing majority vote on a different, but related, target distribution. Firstly, we propose an improvement of the previous approach we proposed in Germain et al. (2013), which relies on a novel distribution pseudodistance based on a disagreement averaging, allowing us to derive a new tighter domain adaptation bound for the target risk. While this bound stands in the spirit of common domain adaptation works, we derive a second bound (introduced in Germain et al., 2016) that brings a new perspective on domain adaptation by deriving an upper bound on the target risk where the distributions’ divergence—expressed as a ratio—controls the trade-off between a source error measure and the target voters’ disagreement. We discuss and compare both results, from which we obtain PAC-Bayesian generalization bounds. Furthermore, from the PAC-Bayesian specialization to linear classifiers, we infer two learning algorithms, and we evaluate them on real data.
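    The empirical quantities this analysis revolves around (the Gibbs risk and the voters' disagreement) are cheap to compute; a minimal sketch, assuming ±1 voters and a finite posterior Q:

      import numpy as np

      def gibbs_risk(Q, votes, y):
          # Empirical Gibbs risk: the Q-weighted average of the voters' error rates.
          # votes: (m, n) matrix of m voters' +-1 predictions on n points; y: +-1 labels.
          return float(Q @ (votes != y).mean(axis=1))

      def expected_disagreement(Q, votes):
          # E_{h,h'~Q^2} Pr_x[h(x) != h'(x)]; with +-1 votes, 1[h != h'] = (1 - h*h')/2,
          # so the expectation reduces to (1 - (Q-average vote)^2) / 2.
          avg = Q @ votes
          return float(np.mean((1.0 - avg ** 2) / 2.0))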
  • Improved PAC-Bayesian Bounds for Linear Regression.

    Vera SHALAEVA, Alireza FAKHRIZADEH ESFAHANI, Pascal GERMAIN, Mihaly PETRECZKY
    Thirty-Fourth AAAI Conference on Artificial Intelligence | 2020
    In this paper, we improve the PAC-Bayesian error bound for linear regression derived in Germain et al. [10]. The improvements are twofold. First, the proposed error bound is tighter, and converges to the generalization loss with a well-chosen temperature parameter. Second, the error bound also holds for training data that are not independently sampled. In particular, the error bound applies to certain time series generated by well-known classes of dynamical models, such as ARX models.
  • Dichotomize and Generalize: PAC-Bayesian Binary Activated Deep Neural Networks.

    Gael LETARTE, Pascal GERMAIN, Benjamin GUEDJ, Francois LAVIOLETTE
    ML with guarantees -- NeurIPS 2019 workshop | 2019
    We present a comprehensive study of multilayer neural networks with binary activation, relying on the PAC-Bayesian theory. Our contributions are twofold: (i) we develop an end-to-end framework to train a binary activated deep neural network, (ii) we provide nonvacuous PAC-Bayesian generalization bounds for binary activated deep neural networks. Our results are obtained by minimizing the expected loss of an architecture-dependent aggregation of binary activated deep neural networks. Our analysis inherently overcomes the fact that the binary activation function is non-differentiable.
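    One closed-form fact that makes such aggregations tractable (a sketch under a unit-variance Gaussian posterior on the weights, not the paper's full construction): since w.x is Gaussian with mean mu.x and variance ||x||^2, the expected output of a binary unit sign(w.x) is an erf of the normalized pre-activation, which is differentiable in mu.

      import numpy as np
      from scipy.special import erf

      def expected_sign(mu, x):
          # E_{w ~ N(mu, I)}[sign(w . x)] = erf(mu.x / (sqrt(2) * ||x||)).
          # Smooth in mu, so gradients can flow despite the binary activation.
          return erf(mu @ x / (np.sqrt(2.0) * np.linalg.norm(x)))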
  • Interpreting Neural Networks as Majority Votes through the PAC-Bayesian Theory.

    Paul VIALLARD, Remi EMONET, Pascal GERMAIN, Amaury HABRARD, Emilie MORVANT
    Workshop on Machine Learning with guarantees @ NeurIPS 2019 | 2019
    We propose a PAC-Bayesian theoretical study of the two-phase learning procedure of a neural network introduced by Kawaguchi et al. (2017). In this procedure, a network is expressed as a weighted combination of all the paths of the network (from the input layer to the output one), which we reformulate as a PAC-Bayesian majority vote. Starting from this observation, their learning procedure consists in (1) learning a "prior" network for fixing some parameters, then (2) learning a "posterior" network by only allowing a modification of the weights over the paths of the prior network. This allows us to derive a PAC-Bayesian generalization bound that involves the empirical individual risks of the paths (known as the Gibbs risk) and the empirical diversity between pairs of paths. Note that similarly to classical PAC-Bayesian bounds, our result involves a KL-divergence term between a "prior" network and the "posterior" network. We show that this term is computable by dynamic programming without assuming any distribution on the network weights.
  • PAC-Bayesian Contrastive Unsupervised Representation Learning.

    Kento NOZAWA, Pascal GERMAIN, Benjamin GUEDJ
    2019
    Contrastive unsupervised representation learning (CURL) is the state-of-the-art technique to learn representations (as a set of features) from unlabelled data. While CURL has collected several empirical successes recently, theoretical understanding of its performance was still missing. In a recent work, Arora et al. (2019) provide the first generalisation bounds for CURL, relying on a Rademacher complexity. We extend their framework to the flexible PAC-Bayes setting, allowing us to deal with the non-i.i.d. setting. We present PAC-Bayesian generalisation bounds for CURL, which are then used to derive a new representation learning algorithm. Numerical experiments on real-life datasets illustrate that our algorithm achieves competitive accuracy, and yields generalisation bounds with non-vacuous values.
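    For intuition, a hedged sketch of the kind of contrastive objective such analyses cover (logistic loss with one negative sample per anchor; the paper's exact loss and sampling scheme may differ):

      import numpy as np

      def contrastive_logistic_loss(f_x, f_pos, f_neg):
          # Encourage the representation of x to be closer to that of a similar
          # point x+ than to that of a negative sample x-: logistic loss on the
          # margin f(x) . (f(x+) - f(x-)). Inputs are (n, d) arrays of features.
          margin = np.sum(f_x * (f_pos - f_neg), axis=1)
          return float(np.mean(np.log1p(np.exp(-margin))))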
  • Dichotomize and Generalize: PAC-Bayesian Binary Activated Deep Neural Networks.

    Gael LETARTE, Pascal GERMAIN, Benjamin GUEDJ, Francois LAVIOLETTE
    2019
    We present a comprehensive study of multilayer neural networks with binary activation, relying on the PAC-Bayesian theory. Our contributions are twofold: (i) we develop an end-to-end framework to train a binary activated deep neural network, overcoming the fact that the binary activation function is non-differentiable; (ii) we provide nonvacuous PAC-Bayesian generalization bounds for binary activated deep neural networks. Notably, our results are obtained by minimizing the expected loss of an architecture-dependent aggregation of binary activated deep neural networks. The performance of our approach is assessed on a thorough numerical experiment protocol on real-life datasets.
  • Multiview Boosting by Controlling the Diversity and the Accuracy of View-specific Voters.

    Anil GOYAL, Emilie MORVANT, Pascal GERMAIN, Massih-Reza AMINI
    Neurocomputing | 2019
    In this paper we propose a boosting-based multiview learning algorithm, referred to as PB-MVBoost, which iteratively learns (i) weights over view-specific voters, capturing view-specific information, and (ii) weights over views, by optimizing a PAC-Bayes multiview C-Bound that takes into account the accuracy of view-specific classifiers and the diversity between the views (see the bound sketched below). We derive a generalization bound for this strategy following the PAC-Bayes theory, which is a suitable tool to deal with models expressed as weighted combinations over a set of voters. Different experiments on three publicly available datasets show the efficiency of the proposed approach with respect to state-of-the-art models.
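    For reference, the classical single-view C-bound that such multiview variants build on (stated from the PAC-Bayesian literature, not the paper's multiview form):

      \[
        R(B_Q) \;\le\; 1 - \frac{\bigl(1 - 2\,R(G_Q)\bigr)^{2}}{1 - 2\,d_Q}
        \qquad \text{(valid when } R(G_Q) < \tfrac{1}{2}\text{)},
      \]

    where R(B_Q) is the majority-vote risk, R(G_Q) the Gibbs risk (the average voter error), and d_Q the expected pairwise disagreement between voters, so that accuracy and diversity enter the bound exactly as described above.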
  • Domain Adaptation from a Pre-trained Source Model: Application on fraud detection tasks.

    Luxin ZHANG, Christophe BIERNACKI, Pascal GERMAIN, Yacine KESSACI
    12th International Conference of the ERCIM WG on Computational and Methodological Statistics (CMStatistics 2019) | 2019
    No summary available.
  • Pseudo-Bayesian Learning with Kernel Fourier Transform as Prior.

    Gael LETARTE, Emilie MORVANT, Pascal GERMAIN
    The 22nd International Conference on Artificial Intelligence and Statistics | 2019
    We revisit Rahimi and Recht (2007)’s kernel random Fourier features (RFF) method through the lens of the PAC-Bayesian theory. While the primary goal of RFF is to approximate a kernel, we look at the Fourier transform as a prior distribution over trigonometric hypotheses. It naturally suggests learning a posterior on these hypotheses. We derive generalization bounds that are optimized by learning a pseudo-posterior obtained from a closed-form expression. Based on this study, we consider two learning strategies: The first one finds a compact landmarks-based representation of the data where each landmark is given by a distribution-tailored similarity measure, while the second one provides a PAC-Bayesian justification to the kernel alignment method of Sinha and Duchi (2016).
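    A minimal sketch of the underlying RFF construction (Gaussian kernel assumed, following Rahimi and Recht, 2007), where the spectral distribution plays the role of the prior over trigonometric hypotheses:

      import numpy as np

      def rff_features(X, D=200, gamma=1.0, seed=0):
          # For the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2), the
          # Fourier transform is N(0, 2*gamma*I); sampling w from it and mapping
          # x -> sqrt(2/D) * cos(Wx + b) yields phi with phi(x).phi(y) ~= k(x, y).
          rng = np.random.default_rng(seed)
          W = rng.normal(scale=np.sqrt(2 * gamma), size=(D, X.shape[1]))
          b = rng.uniform(0.0, 2 * np.pi, size=D)
          return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)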
  • Revisiting random Fourier features based on PAC-Bayesian learning via interest points.

    Leo GAUTHERON, Pascal GERMAIN, Amaury HABRARD, Gael LETARTE, Emilie MORVANT, Marc SEBBAN, Valentina ZANTEDESCHI
    CAp 2019 - Conférence sur l'Apprentissage automatique | 2019
    This paper summarizes and extends our recent work published at AISTATS 2019, in which we revisited the Random Fourier Features (RFF) method of Rahimi et al. (2007) through PAC-Bayesian theory. Although the main objective of RFF is to approximate a kernel function, here we consider the Fourier transform as a prior distribution over a set of trigonometric hypotheses. This naturally suggests learning a posterior distribution over this set of hypotheses. We derive generalization bounds that are optimized by learning a pseudo-posterior distribution obtained from a closed-form expression. From this study, we propose two landmark-based learning strategies: (i) the two-step procedure proposed in our previous paper, where a compact representation of the data is learned and then used to learn a linear model, and (ii) a new procedure, where the representation and the model are learned in a single step following a boosting-type approach.
  • Learning Landmark-Based Ensembles with Random Fourier Features and Gradient Boosting.

    Leo GAUTHERON, Pascal GERMAIN, Amaury HABRARD, Emilie MORVANT, Marc SEBBAN, Valentina ZANTEDESCHI
    2019
    We propose a Gradient Boosting algorithm for learning an ensemble of kernel functions adapted to the task at hand. Unlike state-of-the-art Multiple Kernel Learning techniques that make use of a pre-computed dictionary of kernel functions to select from, at each iteration we fit a kernel by approximating it as a weighted sum of Random Fourier Features (RFF) and by optimizing their barycenter. This allows us to obtain a more versatile method, easier to set up and likely to have better performance. Our study builds on a recent result showing one can learn a kernel from RFF by computing the minimum of a PAC-Bayesian bound on the kernel alignment generalization loss, which is obtained efficiently from a closed-form solution. We conduct an experimental analysis to highlight the advantages of our method w.r.t. both Boosting-based and kernel-learning state-of-the-art methods.
  • PAC-Bayes and Domain Adaptation.

    Pascal GERMAIN, Amaury HABRARD, Francois LAVIOLETTE, Emilie MORVANT
    2018
    We provide two main contributions in PAC-Bayesian theory for domain adaptation where the objective is to learn, from a source distribution, a well-performing majority vote on a different, but related, target distribution. Firstly, we propose an improvement of the previous approach we proposed in Germain et al. (2013), which relies on a novel distribution pseudodistance based on a disagreement averaging, allowing us to derive a new tighter domain adaptation bound for the target risk. While this bound stands in the spirit of common domain adaptation works, we derive a second bound (recently introduced in Germain et al., 2016) that brings a new perspective on domain adaptation by deriving an upper bound on the target risk where the distributions' divergence—expressed as a ratio—controls the trade-off between a source error measure and the target voters' disagreement. We discuss and compare both results, from which we obtain PAC-Bayesian generalization bounds. Furthermore, from the PAC-Bayesian specialization to linear classifiers, we infer two learning algorithms, and we evaluate them on real data.
  • A PAC-Bayesian bound in expectation and its extension to multiview learning.

    Anil GOYAL, Emilie MORVANT, Pascal GERMAIN
    Conférence Francophone sur l'Apprentissage Automatique (CAp) | 2017
    We propose a PAC-Bayesian theorem expressed as a bound in expectation, whereas classical PAC-Bayesian bounds are probabilistic bounds. Our main result is therefore a generalization bound on the expectation of the final majority vote. We then use this result to study multiview learning when we want to learn a model in two steps: (i) learning one or more majority votes for each view, and (ii) combining them in a second step. Finally, we empirically validate the interest of this PAC-Bayesian approach for multiview learning.
  • Domain Adaptation in Computer Vision Applications.

    Yaroslav GANIN, Evgeniya USTINOVA, Hana AJAKAN, Pascal GERMAIN, Hugo LAROCHELLE, Francois LAVIOLETTE, Mario MARCHAND, Victor LEMPITSKY
    Advances in Computer Vision and Pattern Recognition | 2017
    No summary available.
  • PAC-Bayesian Analysis for a two-step Hierarchical Multiview Learning Approach.

    Anil GOYAL, Emilie MORVANT, Pascal GERMAIN, Massih-Reza AMINI
    27th European Conference on Machine Learning | 2017
    No summary available.
  • PAC-Bayesian Theorems for Domain Adaptation with Specialization to Linear Classifiers.

    Pascal GERMAIN, Amaury HABRARD, Francois LAVIOLETTE, Emilie MORVANT
    2016
    In this paper, we provide two main contributions in PAC-Bayesian theory for domain adaptation where the objective is to learn, from a source distribution, a well-performing majority vote on a different target distribution. On the one hand, we propose an improvement of the previous approach proposed by Germain et al. (2013), that relies on a novel distribution pseudodistance based on a disagreement averaging, allowing us to derive a new tighter PAC-Bayesian domain adaptation bound for the stochastic Gibbs classifier. We specialize it to linear classifiers, and design a learning algorithm which shows interesting results on a synthetic problem and on a popular sentiment annotation task. On the other hand, we generalize these results to multisource domain adaptation allowing us to take into account different source domains. This study opens the door to tackle domain adaptation tasks by making use of all the PAC-Bayesian tools.
  • PAC-Bayesian Bounds based on the Rényi Divergence.

    Luc BEGIN, Pascal GERMAIN, Francois LAVIOLETTE, Jean-Francis ROY
    International Conference on Artificial Intelligence and Statistics (AISTATS 2016) | 2016
    We propose a simplified proof process for PAC-Bayesian generalization bounds, which divides the proof into four successive inequalities and thereby eases the "customization" of PAC-Bayesian theorems. We also propose a family of PAC-Bayesian bounds based on the Rényi divergence between the prior and posterior distributions, whereas most PAC-Bayesian bounds are based on the Kullback-Leibler divergence. Finally, we present an empirical evaluation of the tightness of each inequality of the simplified proof, for both the classical PAC-Bayesian bounds and those based on the Rényi divergence.
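    For reference, the divergence in question (standard definition, order α > 1):

      \[
        D_{\alpha}(Q \,\|\, P) \;=\; \frac{1}{\alpha - 1}
        \ln \mathbb{E}_{h \sim P}\!\left[ \left( \frac{Q(h)}{P(h)} \right)^{\alpha} \right],
      \]

    which recovers the Kullback-Leibler divergence KL(Q||P) in the limit α → 1.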
  • A New PAC-Bayesian Perspective on Domain Adaptation.

    Pascal GERMAIN, Amaury HABRARD, Francois LAVIOLETTE, Emilie MORVANT
    33rd International Conference on Machine Learning (ICML 2016) | 2016
    We study the issue of PAC-Bayesian domain adaptation: We want to learn, from a source domain, a majority vote model dedicated to a target one. Our theoretical contribution brings a new perspective by deriving an upper-bound on the target risk where the distributions’ divergence—expressed as a ratio—controls the trade-off between a source error measure and the target voters’ disagreement. Our bound suggests that one has to focus on regions where the source data is informative. From this result, we derive a PAC-Bayesian generalization bound, and specialize it to linear classifiers. Then, we infer a learning algorithm and perform experiments on real data.
  • PAC-Bayesian theorems for multi-view learning.

    Anil GOYAL, Emilie MORVANT, Pascal GERMAIN, Massih-Reza AMINI
    Conférence Francophone sur l'Apprentissage Automatique (CAp) | 2016
    No summary available.
  • PAC-Bayesian Theory Meets Bayesian Inference.

    Pascal GERMAIN, Francis BACH, Alexandre LACOSTE, Simon LACOSTE-JULIEN
    Neural Information Processing Systems (NIPS 2016) | 2016
    We exhibit a strong link between frequentist PAC-Bayesian risk bounds and the Bayesian marginal likelihood. That is, for the negative log-likelihood loss function, we show that the minimization of PAC-Bayesian generalization risk bounds maximizes the Bayesian marginal likelihood. This provides an alternative explanation to the Bayesian Occam's razor criteria, under the assumption that the data is generated by an i.i.d. distribution.
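    The link can be seen through the Gibbs variational formula (a sketch of the core identity with the temperature fixed to 1, not the paper's general statement): minimizing the KL-regularized empirical negative log-likelihood over posteriors Q is solved by the Bayesian posterior, and the optimal value is the negative log marginal likelihood.

      \[
        \min_{Q}\;\Bigl\{ \mathbb{E}_{\theta \sim Q}\Bigl[\sum_{i=1}^{n} -\ln p(x_i \mid \theta)\Bigr]
        + \mathrm{KL}(Q \,\|\, P) \Bigr\}
        \;=\; -\ln \int P(\theta) \prod_{i=1}^{n} p(x_i \mid \theta)\, d\theta,
      \]

    attained by Q*(θ) ∝ P(θ) ∏ p(x_i | θ), i.e. the Bayesian posterior.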
  • A New PAC-Bayesian View of Domain Adaptation.

    Pascal GERMAIN, Francois LAVIOLETTE, Amaury HABRARD, Emilie MORVANT
    NIPS 2015 Workshop on Transfer and Multi-Task Learning: Trends and New Perspectives | 2015
    We propose a new theoretical study of domain adaptation for majority vote classifiers (from a source to a target domain). We upper bound the target risk by a trade-off between only two terms: The voters’ joint errors on the source domain, and the voters’ disagreement on the target one. Hence, this new study is simpler than other analyses that usually rely on three terms. We also derive a PAC-Bayesian generalization bound leading to a DA algorithm for linear classifiers.
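    The two terms stem from an exact decomposition of the Gibbs risk (a schematic rendering in our notation; the paper's handling of the unobservable target term is not reproduced here):

      \[
        R_T(G_Q) \;=\; \tfrac{1}{2}\, d_T(Q) + e_T(Q),
        \qquad
        d_T(Q) = \mathop{\mathbb{E}}_{(h,h') \sim Q^2}\, \mathop{\mathbb{E}}_{x \sim T} \mathbf{1}\bigl[h(x) \neq h'(x)\bigr],
      \]

    where e_T(Q) is the expected joint error of pairs of voters on the target domain; the bound then controls e_T(Q) through the voters' joint errors measured on the source domain.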
  • A New PAC-Bayesian Perspective on Domain Adaptation.

    Pascal GERMAIN, Amaury HABRARD, Francois LAVIOLETTE, Emilie MORVANT
    2015
    We study the issue of PAC-Bayesian domain adaptation: We want to learn, from a source domain, a majority vote model dedicated to a target one. Our theoretical contribution brings a new perspective by deriving an upper-bound on the target risk where the distributions' divergence—expressed as a ratio—controls the trade-off between a source error measure and the target voters' disagreement. Our bound suggests that one has to focus on regions where the source data is informative. From this result, we derive a PAC-Bayesian generalization bound, and specialize it to linear classifiers. Then, we infer a learning algorithm and perform experiments on real data.
  • An Improvement to the Domain Adaptation Bound in a PAC-Bayesian context.

    Pascal GERMAIN, Amaury HABRARD, Francois LAVIOLETTE, Emilie MORVANT
    NIPS 2014 Workshop on Transfer and Multi-task learning: Theory Meets Practice | 2014
    This paper provides a theoretical analysis of domain adaptation based on the PAC-Bayesian theory. We improve the previous domain adaptation bound obtained by Germain et al. in two ways. First, we give a new generalization bound that is tighter and easier to interpret. Moreover, we provide a new analysis of the constant term appearing in the bound, which can be of high interest for developing new algorithmic solutions.
  • A PAC-Bayesian analysis of domain adaptation and its specialization to linear classifiers.

    Pascal GERMAIN, Amaury HABRARD, Francois LAVIOLETTE, Emilie MORVANT
    Conférence sur l'apprentissage automatique | 2013
    In this paper, we focus on the domain adaptation (DA) problem corresponding to the case where the training and test data are from different distributions. We propose a PAC-Bayesian analysis of this problem in the context of binary classification without supervised information on the test data. The PAC-Bayesian theory provides theoretical guarantees on the risk of a majority vote on a set of hypotheses. Our contribution to the DA framework relies on a new measure of divergence between distributions based on a notion of expectation of disagreement between hypotheses. This measure allows us to derive a first PAC-Bayesian bound for the stochastic Gibbs classifier. This bound has the advantage of being directly optimizable for any hypothesis space and we give an illustration in the case of linear classifiers. The algorithm proposed in this context shows interesting results on a toy problem as well as on a common opinion analysis task. These results open new perspectives for understanding the domain adaptation problem thanks to the tools offered by the PAC-Bayesian theory.
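    A natural empirical rendering of such a disagreement-based pseudodistance (a sketch in our notation: the gap between the Q-average pairwise disagreements measured on the two unlabelled samples):

      import numpy as np

      def domain_disagreement(Q, votes_src, votes_tgt):
          # |disagreement on source - disagreement on target|, where votes_* are
          # (m, n) matrices of m voters' +-1 predictions; the closed form
          # (1 - (Q-average vote)^2) / 2 equals E_{h,h'~Q^2} Pr_x[h(x) != h'(x)].
          def dis(votes):
              avg = Q @ votes
              return float(np.mean((1.0 - avg ** 2) / 2.0))
          return abs(dis(votes_src) - dis(votes_tgt))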
  • A PAC-Bayesian Approach for Domain Adaptation with Specialization to Linear Classifiers.

    Pascal GERMAIN, Amaury HABRARD, Francois LAVIOLETTE, Emilie MORVANT
    International Conference on Machine Learning 2013 | 2013
    We provide a first PAC-Bayesian analysis for domain adaptation (DA) which arises when the learning and test distributions differ. It relies on a novel distribution pseudodistance based on a disagreement averaging. Using this measure, we derive a PAC-Bayesian DA bound for the stochastic Gibbs classifier. This bound has the advantage of being directly optimizable for any hypothesis space. We specialize it to linear classifiers, and design a learning algorithm which shows interesting results on a synthetic problem and on a popular sentiment annotation task. This opens the door to tackling DA tasks by making use of all the PAC-Bayesian tools.
Affiliations are detected from the signatures of publications identified in scanR. An author can therefore appear to be affiliated with several structures or supervisors according to these signatures. The dates displayed correspond only to the dates of the publications found. For more information, see https://scanr.enseignementsup-recherche.gouv.fr