PATILEA Valentin

Affiliations
  • 2020 - 2021
    Centre de recherche en économie et statistique
  • 2012 - 2018
    Institut de recherche mathématique de Rennes
  • Adaptive optimal estimation of irregular mean and covariance functions.

    Steven GOLOVKINE, Nicolas KLUTCHNIKOFF, Valentin PATILEA
    2021
    We propose straightforward nonparametric estimators for the mean and the covariance functions of functional data. Our setup covers a wide range of practical situations. The random trajectories are not necessarily differentiable, have unknown regularity, and are measured with error at discrete design points. The measurement error could be heteroscedastic. The design points could be either randomly drawn or common to all curves. The definition of our nonparametric estimators depends on the local regularity of the stochastic process generating the functional data. We first propose a simple estimator of this local regularity which takes strength from the replication and regularization features of functional data. Next, we use the "smoothing first, then estimate" approach for the mean and the covariance functions. The new nonparametric estimators achieve optimal rates of convergence. They can be applied to either sparsely or densely sampled curves, are easy to calculate and to update, and perform well in simulations. Simulations built upon a real data example on household power consumption illustrate the effectiveness of the new approach.
  • Clustering multivariate functional data using unsupervised binary trees.

    Steven GOLOVKINE, Nicolas KLUTCHNIKOFF, Valentin PATILEA
    Computational Statistics and Data Analysis | 2021
    We propose a model-based clustering algorithm for a general class of functional data for which the components could be curves or images. The random functional data realizations could be measured with error at discrete, and possibly random, points in the definition domain. The idea is to build a set of binary trees by recursive splitting of the observations. The number of groups is determined in a data-driven way. The new algorithm provides easily interpretable results and fast predictions for online data sets. Results on simulated datasets reveal good performance in various complex settings. The methodology is applied to the analysis of vehicle trajectories on a German roundabout.
  • Learning the smoothness of noisy curves with application to online curve estimation.

    Steven GOLOVKINE, Nicolas KLUTCHNIKOFF, Valentin PATILEA
    2021
    Combining information both within and across trajectories, we propose a simple estimator for the local regularity of the trajectories of a stochastic process. Independent trajectories are measured with errors at randomly sampled time points. Non-asymptotic bounds for the concentration of the estimator are derived. Given the estimate of the local regularity, we build a nearly optimal local polynomial smoother of the curves from a new, possibly very large sample of noisy trajectories. We derive non-asymptotic pointwise risk bounds uniformly over the new set of curves. Our estimates perform well in simulations. Real data sets illustrate the effectiveness of the new approaches.
  • Equivalent models for observables under the assumption of missing at random.

    Marian HRISTACHE, Valentin PATILEA
    Econometrics and Statistics | 2021
    No summary available.
  • Testing for lack-of-fit in functional regression models against general alternatives.

    Valentin PATILEA, Cesar SANCHEZ SELLERO
    Journal of Statistical Planning and Inference | 2020
    No summary available.
  • A likelihood-based approach for cure regression models.

    Kevin BURKE, Valentin PATILEA
    TEST | 2020
    No summary available.
  • Testing for the significance of functional covariates.

    Samuel MAISTRE, Valentin PATILEA
    Journal of Multivariate Analysis | 2020
    No summary available.
  • An equivalence result for moment equations when data are missing at random.

    Marian HRISTACHE, Valentin PATILEA
    Statistical Theory and Related Fields | 2019
    No summary available.
  • Nonparametric model checks of single-index assumptions.

    Samuel MAISTRE, Valentin PATILEA
    Statistica Sinica | 2018
    Semiparametric single-index assumptions are convenient and widely used dimension reduction approaches that represent a compromise between the parametric and fully nonparametric models for regressions or conditional laws. In a mean regression setup, the single-index model (SIM) assumption means that the conditional expectation of the response given the vector of covariates is the same as the conditional expectation of the response given a scalar projection of the covariate vector. In conditional distribution modeling, under the SIM assumption the conditional law of a response given the covariate vector coincides with the conditional law given a linear combination of the covariates. Several estimation techniques for single-index models are available and commonly used in applications. However, the problem of testing the goodness-of-fit seems less explored and the existing proposals still have some major drawbacks. In this paper, a novel kernel-based approach for testing SIM assumptions is introduced. The covariate vector need not have a density and only the index estimated under the SIM assumption is used in kernel smoothing. Hence the effect of high-dimensional covariates is mitigated while asymptotic normality of the test statistic is obtained. Irrespective of the fixed dimension of the covariate vector, the new test detects local alternatives approaching the null hypothesis slower than $n^{-1/2}h^{-1/4}$, where $h$ is the bandwidth used to build the test statistic and $n$ is the sample size. A wild bootstrap procedure is proposed for finite sample corrections of the asymptotic critical values. The small sample performance of our test compared to existing procedures is illustrated through simulations.
  • Powerful nonparametric checks for quantile regression.

    Samuel MAISTRE, Pascal LAVERGNE, Valentin PATILEA
    Journal of Statistical Planning and Inference | 2017
    We address the issue of lack-of-fit testing for a parametric quantile regression. We propose a simple test that involves one-dimensional kernel smoothing, so that the rate at which it detects local alternatives is independent of the number of covariates. The test has asymptotically Gaussian critical values, and wild bootstrap can be applied to obtain more accurate ones in small samples. Our procedure appears to be competitive with existing ones in simulations. We illustrate the usefulness of our test on birthweight data.
  • A new minimum contrast approach for inference in single-index models.

    Weiyu LI, Valentin PATILEA
    Journal of Multivariate Analysis | 2017
    No summary available.
  • A dimension reduction approach for conditional Kaplan–Meier estimators.

    Weiyu LI, Valentin PATILEA
    TEST | 2017
    No summary available.
  • Testing the Predictor Effect on a Functional Response.

    Valentin PATILEA, Cesar SANCHEZ SELLERO, Matthieu SAUMARD
    Journal of the American Statistical Association | 2016
    No summary available.
  • Scoring for credit risk: polytomous response variable, variable selection, dimension reduction, applications.

    Clement VITAL, Valentin PATILEA, Laurent ROUVIERE
    2016
    The aim of this thesis was to explore the theme of scoring in the context of its use in the banking world, and more particularly to control credit risk. Indeed, the diversification and globalization of banking activities in the second half of the 20th century led to the introduction of a certain number of regulations, in order to ensure that banking institutions have the necessary capital to cover the risk they take. This regulation thus requires the modeling of certain risk indicators, including the probability of default, which is, for a particular loan, the probability that the client will not be able to repay the amount owed. The modeling of this indicator involves the definition of a variable of interest called the risk criterion, denoting "good payers" and "bad payers". Translated into a more formal statistical framework, this means that we seek to model a variable with values in {0,1} by a set of explanatory variables. In practice, this problem is treated as a scoring issue. Scoring consists in the definition of functions, called score functions, which transfer the information contained in the set of explanatory variables into a real score. The objective of such a function is to give the same ordering of the individuals as the a posteriori probability of the model, so that the individuals with a high probability of being "good" have a high score, and conversely the individuals with a high probability of being "bad" (and thus a high risk for the bank) have a low score. Performance criteria such as the ROC curve and the AUC have been defined to quantify how relevant the ordering produced by the score function is. The reference method for obtaining score functions is logistic regression, which we present here.
    A major problem in credit risk scoring is the selection of variables. Indeed, banks have large databases containing all the information they have on their customers, both socio-demographic and behavioral, and not all of it can explain the risk criterion. In order to address this issue, we have chosen to consider the Lasso technique, based on the application of a constraint on the coefficients, so as to set the values of the least significant coefficients to zero. We considered this method in the context of linear and logistic regressions, as well as an extension called Group Lasso, which makes it possible to consider explanatory variables by groups. We then considered the case where the response variable is no longer binary, but polytomous, i.e. with several possible response levels. The first step was to present a definition of scoring equivalent to the one presented previously in the binary case. We then presented different regression methods adapted to this new case of study: a generalization of the binary logistic regression, semi-parametric methods, as well as an application of the Lasso principle to polytomous logistic regression. Finally, the last chapter is devoted to the application of some of the methods mentioned in the manuscript to real data sets, so as to confront them with the real needs of the company.
  • A significance test for covariates in nonparametric regression.

    Pascal LAVERGNE, Samuel MAISTRE, Valentin PATILEA
    Electronic Journal of Statistics | 2015
    We consider testing the significance of a subset of covariates in a nonparametric regression. These covariates can be continuous and/or discrete. We propose a new kernel-based test that smooths only over the covariates appearing under the null hypothesis, so that the curse of dimensionality is mitigated. The test statistic is asymptotically pivotal and the rate at which the test detects local alternatives depends only on the dimension of the covariates under the null hypothesis. We show the validity of wild bootstrap for the test. In small samples, our test is competitive compared to existing procedures.
  • Some contributions to the estimation of models defined by conditional estimating equations.

    Weiyu LI, Valentin PATILEA
    2015
    In this thesis, we study models defined by conditional moment equations. A large part of statistical models (regressions, quantile regressions, transformation models, instrumental variable models, etc.) can be defined in this form. We are interested in the case of models with a finite-dimensional parameter to be estimated, as well as in the case of semi-parametric models requiring the estimation of a finite-dimensional parameter and an infinite-dimensional parameter. In the class of semi-parametric models studied, we focus on single-index models, which strike a compromise between parametric modeling, which is simple and accurate but too rigid and therefore exposed to model error, and non-parametric estimation, which is very flexible but suffers from the curse of dimensionality. In particular, we study these semi-parametric models in the presence of random censoring. The main thread of our study is a contrast in the form of a U-statistic, which makes it possible to estimate the unknown parameters in general models.
  • Non-parametric conditional quantile estimation and semi-parametric learning: applications in insurance and actuarial science.

    Muhammad Anas KNEFATI, Farid BENINEL, Michel DELECROIX, Anne BERTRAND MATHIS, Christophe BIERNACKI, Pierre CHAUVET, Marian HRISTACHE, Valentin PATILEA, Ali GANNOUN
    2015
    The thesis is composed of two parts: one part dedicated to the estimation of conditional quantiles and another one to supervised learning. The part "Estimation of conditional quantiles" is organized in 3 chapters: Chapter 1 is devoted to an introduction on local linear regression, presenting the most used methods to estimate the smoothing parameter. Chapter 2 deals with existing methods of nonparametric estimation of the conditional quantile. These methods are compared, using numerical experiments on simulated and real data. Chapter 3 is devoted to a new estimator of the conditional quantile that we propose. This estimator is based on the use of an asymmetric kernel in x. Under certain assumptions, our estimator is more efficient than the usual estimators.
    The part "Supervised learning" is also composed of 3 chapters: Chapter 4 is an introduction to statistical learning and the basic notions used in this part. Chapter 5 is a review of conventional methods of supervised classification. Chapter 6 is devoted to the transfer of a semi-parametric learning model. The performance of this method is shown by numerical experiments on morphometric data and credit-scoring data.
  • Testing Second-Order Dynamics for Autoregressive Processes in Presence of Time-Varying Variance.

    Valentin PATILEA, Hamdi RAISSI
    Journal of the American Statistical Association | 2014
    Volatility modeling for autoregressive univariate time series is considered. A benchmark approach is the stationary ARCH model of Engle (1982). Motivated by real data evidence, processes with non-constant unconditional variance and ARCH effects have been recently introduced. We take into account such possible non-stationarity and propose simple testing procedures for ARCH effects. Adaptive McLeod and Li's portmanteau and ARCH-LM tests for checking for second-order dynamics are provided. The standard versions of these tests, commonly used by practitioners, suppose constant unconditional variance. We prove the failure of these standard tests with time-varying unconditional variance. The theoretical results are illustrated by means of simulated and real data.
  • Nonparametric tests in regression.

    Samuel MAISTRE, Valentin PATILEA, Pascal LAVERGNE
    2014
    In this thesis, we study tests of the type (H0): E[U | X] = 0 a.s. versus (H1): P{E[U | X] = 0} < 1, where U is the residual of modeling a variable Y as a function of X. In this framework and for several special cases (significance of variables, quantile regression, functional data, single-index model), we propose a test statistic whose critical values are obtained from an asymptotically pivotal distribution. In each case, we also give an appropriate bootstrap method for small sample sizes. We show the consistency of the proposed tests against local (Pitman-type) alternatives, provided this type of alternative does not tend too quickly to the null hypothesis. In each case, we verify through simulations under the null hypothesis and under a sequence of alternative hypotheses that the theoretical results are in agreement with practice.
  • Testing for the significance of functional covariates in regression models.

    Samuel MAISTRE, Valentin PATILEA
    2014
    Regression models with a response variable taking values in a Hilbert space and hybrid covariates are considered. This means two sets of regressors are allowed, one of finite dimension and a second one functional with values in a Hilbert space. The problem we address is the test of the effect of the functional covariates. This problem occurs for instance when checking the goodness-of-fit of some regression models for functional data. The significance test for functional regressors in nonparametric regression with hybrid covariates and scalar or functional responses is another example where the core problem is the test on the effect of functional covariates. We propose a new test based on kernel smoothing. The test statistic is asymptotically standard normal under the null hypothesis provided the smoothing parameter tends to zero at a suitable rate. The one-sided test is consistent against any fixed alternative and detects local alternatives à la Pitman approaching the null hypothesis. In particular we show that neither the dimension of the outcome nor the dimension of the functional covariates influences the theoretical power of the test against such local alternatives. Simulation experiments and a real data application illustrate the performance of the new test with finite samples.
  • Single index regression models in the presence of censoring depending on the covariates.

    Olivier LOPEZ, Valentin PATILEA, Ingrid VAN KEILEGOM
    Bernoulli | 2013
    Consider a random vector $(X',Y)'$, where $X$ is $d$-dimensional and $Y$ is one-dimensional. We assume that $Y$ is subject to random right censoring. The aim of this paper is twofold. First we propose a new estimator of the joint distribution of $(X',Y)'$. This estimator overcomes the common curse-of-dimensionality problem, by using a new dimension reduction technique. Second we assume that the relation between $X$ and $Y$ is given by a single index model, and propose a new estimator of the parameters in this model. The asymptotic properties of all proposed estimators are obtained.
  • Corrected portmanteau tests for VAR models with time-varying variance.

    V. PATILEA, H. RAISSI
    Journal of Multivariate Analysis | 2013
    The problem of testing the fit of Vector AutoRegressive (VAR) processes with unconditionally heteroscedastic errors is studied. The volatility structure is deterministic but time-varying and allows for changes that are commonly observed in economic or financial multivariate series. Our analysis is based on the residual autocovariances and autocorrelations obtained from Ordinary Least Squares (OLS), Generalized Least Squares (GLS) and Adaptive Least Squares (ALS) estimation of the autoregressive parameters. The ALS approach is the GLS approach adapted to the unknown time-varying volatility that is then estimated by kernel smoothing. The properties of the three types of residual autocovariances and autocorrelations are derived. In particular it is shown that the ALS and GLS residual autocorrelations are asymptotically equivalent. It is also found that the asymptotic distribution of the OLS residual autocorrelations can be quite different from the standard chi-square asymptotic distribution obtained in a correctly specified VAR model with iid innovations. As a consequence the standard portmanteau tests are unreliable in our framework. The correct critical values of the standard portmanteau tests based on the OLS residuals are derived. Moreover, modified portmanteau statistics based on ALS residual autocorrelations are introduced. Portmanteau tests with modified statistics based on OLS and ALS residuals and standard chi-square asymptotic distributions under the null hypothesis are also proposed. An extension of our portmanteau approaches to testing the lag length in a vector error correction type model for co-integrating relations is briefly investigated. The finite sample properties of the goodness-of-fit tests we consider are investigated by Monte Carlo experiments. The theoretical results are also illustrated using two U.S. economic data sets.
  • New estimating equation approaches with application to lifetime data analysis.

    Keming YU, Bing Xing WANG, Valentin PATILEA
    Annals of the Institute of Statistical Mathematics | 2013
    Estimating equation approaches have been widely used in statistical inference. Important examples of estimating equations are the likelihood equations. Since its introduction by Sir R. A. Fisher almost a century ago, maximum likelihood estimation (MLE) has remained the most popular estimation method for fitting probability distributions to data, including fitting lifetime distributions with censored data. However, MLE may produce substantial bias and even fail to yield valid confidence intervals when the sample size is not large enough or the data are censored. In this paper, based on nonlinear combinations of order statistics, we propose new estimating equation approaches for a class of probability distributions, which are particularly effective for skewed distributions with small sample sizes and censored data. The proposed approaches may possess a number of attractive properties such as consistency, sufficiency and uniqueness. Asymptotic normality of these new estimators is derived. The construction of the new estimating equations and their numerical performance under different censoring schemes are detailed via the Weibull distribution and the generalized exponential distribution.
  • Smooth minimum distance estimation and testing with conditional estimating equations: Uniform in bandwidth theory.

    Valentin PATILEA, Pascal LAVERGNE
    Journal of Econometrics | 2013
    To study the influence of a bandwidth parameter in inference with conditional moments, we propose a new class of estimators and establish an asymptotic representation of our estimator as a process indexed by a bandwidth, which can vary within a wide range including bandwidths independent of the sample size. We study its behavior under misspecification. We also propose an efficient version of our estimator. We develop a procedure based on a distance metric statistic for testing restrictions on parameters as well as a bootstrap technique to account for the bandwidth's influence. Our new methods are simple to implement, apply to non-smooth problems, and perform well in our simulations.
  • Contribution to the statistical analysis of functional data.

    Matthieu SAUMARD, Valentin PATILEA, Pascal SARDAT, James LEDOUX, Herve CARDOT, Andre MAS
    2013
    In this thesis, we focus on functional data. The generalization of the generalized linear functional model to the model defined by estimating equations is studied. We obtain a central limit theorem for the considered estimator. The optimal instruments are estimated, and we obtain uniform convergence of the estimators. We are then interested in different tests for functional data. These are non-parametric tests to study the effect of a functional random covariate on an error term, which can be directly observed as a response or estimated from a functional model like the functional linear model. In order to implement the different tests, we have proven a dimension reduction result that relies on projections of the functional covariate. We construct no-effect and goodness-of-fit tests using either kernel smoothing or nearest neighbor smoothing. A goodness-of-fit test in the functional linear model is proposed. All these tests are studied from a theoretical and practical point of view.
  • Contributions to survival analysis.

    Damien BOUSQUET, Jean-Pierre DAURES, Jean-Michel MARIN, Elodie PICCININI BRUNEL, Gilles DUCHARME, Agathe GUILLOUX, Valentin PATILEA, Pascal ROY
    2012
    In this work, we present new models for survival analysis.
  • Dimension reduction in the presence of censored data.

    Olivier LOPEZ, Michel DELECROIX, Valentin PATILEA
    2007
    We consider regression models where the explained variable is randomly right-censored. We propose new estimators of the regression function in parametric models, and we propose a non-parametric test of fit to these models. We extend these methods to the study of the semi-parametric "single-index" model, generalizing dimension reduction techniques used in the absence of censoring. We first consider models with stronger identifiability assumptions, before working in a framework where the explained variable and the censoring are conditionally independent given the explanatory variables. We develop a new dimension reduction approach for this type of problem.
Affiliations are detected from the signatures of publications identified in scanR. An author can therefore appear to be affiliated with several structures or supervisors according to these signatures. The dates displayed correspond only to the dates of the publications found. For more information, see https://scanr.enseignementsup-recherche.gouv.fr