MARIN Jean Michel

Affiliations
  • 2012 - 2020
    Institut Montpelliérain Alexander Grothendieck
  • 2017 - 2018
    Université de Montpellier
  • 2017 - 2018
    Biologie computationnelle et quantitative
  • 2015 - 2019
    Centre de biologie pour la gestion des populations
  • 2017 - 2018
    Sélection de modèles en apprentissage statistique
  • 2013 - 2014
    Centre de recherche en économie et statistique de l'Ensae et l'Ensai
  • 2013 - 2014
    Centre de recherche en économie et statistique
  • 2000 - 2001
    Université Toulouse 3 Paul Sabatier
  • Extending approximate Bayesian computation with supervised machine learning to infer demographic history from genetic polymorphisms using DIYABC Random Forest.

    Francois David COLLIN, Ghislain DURIF, Louis RAYNAL, Eric LOMBAERT, Mathieu GAUTIER, Renaud VITALIS, Jean Michel MARIN, Arnaud ESTOUP
    Molecular Ecology Resources | 2021
    Simulation-based methods such as Approximate Bayesian Computation (ABC) are well adapted to the analysis of complex scenarios in the genetic history of populations and species. In this context, supervised machine learning (SML) methods provide attractive statistical solutions for conducting efficient inferences about scenario choice and parameter estimation. The Random Forest methodology (RF) is a powerful ensemble of SML algorithms used for classification or regression problems. RF allows inferences to be conducted at a low computational cost, without preliminary selection of the relevant components of the ABC summary statistics, and without deriving ABC tolerance levels. We have implemented a set of RF algorithms to process inferences using simulated datasets generated from an extended version of the population genetic simulator implemented in DIYABC v2.1.0. The resulting computer package, named DIYABC Random Forest v1.0, integrates two functionalities into a user-friendly interface: the simulation, under custom evolutionary scenarios, of different types of molecular data (microsatellites, DNA sequences or SNPs), and RF treatments including statistical tools to evaluate the power and accuracy of inferences. We illustrate the functionalities of DIYABC Random Forest v1.0 for both scenario choice and parameter estimation through the analysis of pseudo-observed and real datasets corresponding to pool-sequencing and individual-sequencing SNP datasets. Because of the properties inherent to the implemented RF methods and the large feature vector (including various summary statistics and their linear combinations) available for SNP data, DIYABC Random Forest v1.0 can efficiently contribute to the analysis of large SNP datasets to make inferences about complex population genetic histories.
Affiliations are detected from the signatures of publications identified in scanR. An author can therefore appear to be affiliated with several structures or supervisors according to these signatures. The dates displayed correspond only to the dates of the publications found. For more information, see https://scanr.enseignementsup-recherche.gouv.fr