BENHAMOU Eric

Affiliations
  • 2017 - 2021
    Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision
Publications
  • Adaptive Learning for Financial Markets Mixing Model-Based and Model-Free RL for Volatility Targeting.

    Eric BENHAMOU, David SALTIEL, Serge TABACHNIK, Sui kai WONG, Francois CHAREYRON
    SSRN Electronic Journal | 2021
    No summary available.
  • Deep Reinforcement Learning (DRL) for Portfolio Allocation.

    Eric BENHAMOU, David SALTIEL, Jean jacques OHANA, Jamal ATIF, Rida LARAKI
    Lecture Notes in Computer Science | 2021
    No summary available.
  • Shapley values for LightGBM model applied to regime detection.

    Eric BENHAMOU, J OHANA, S OHANA, D SALTIEL, B GUEZ
    2021
    We consider a gradient boosting decision trees (GBDT) approach to predict large S&P 500 price drops from a set of 150 technical, fundamental and macroeconomic features. We report an improved accuracy of GBDT over other machine learning (ML) methods on the S&P 500 futures prices. We show that retaining fewer and carefully selected features provides improvements across all ML approaches. Shapley values have recently been introduced from game theory to the field of ML. They allow for a robust identification of the most important variables predicting stock market crises, and a local explanation of the crisis probability at each date, through a consistent feature attribution. We apply this methodology to analyse in detail the March 2020 financial meltdown, for which the model offered a timely out-of-sample prediction. This analysis unveils in particular the contrarian predictive role of the tech equity sector before and after the crash.
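    A minimal sketch (not the authors' code) of the kind of pipeline this abstract describes: fit a gradient boosting classifier and attribute its predictions with Shapley values. The synthetic data, labels and every parameter below are illustrative assumptions.

    ```python
    import numpy as np
    import lightgbm as lgb
    import shap

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 150))                 # placeholder for 150 technical/fundamental/macro features
    y = (rng.uniform(size=1000) < 0.1).astype(int)   # placeholder label: 1 = large-drop regime

    model = lgb.LGBMClassifier(n_estimators=200, learning_rate=0.05)
    model.fit(X, y)

    # TreeExplainer computes exact Shapley values for tree ensembles,
    # giving a per-date, per-feature attribution of the crisis probability.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)
    sv = shap_values[1] if isinstance(shap_values, list) else shap_values

    # Global importance: mean absolute Shapley value per feature.
    importance = np.abs(sv).mean(axis=0)
    print("top 10 features:", np.argsort(importance)[::-1][:10])
    ```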
  • Is the Covid equity bubble rational? A machine learning answer.

    Eric BENHAMOU, Jean JACQUES OHANA, David SALTIEL, Beatrice GUEZ
    2021
    Is the Covid equity bubble rational? In 2020, stock prices ballooned, with the S&P 500 gaining 16% and the tech-heavy Nasdaq soaring 43%, while fundamentals deteriorated, with decreasing GDP forecasts, shrinking sales and revenue estimates, and higher government deficits. To answer this fundamental question with as little bias as possible, we explore a gradient boosting decision trees (GBDT) approach that enables us to crunch numerous variables and let the data speak. We define a crisis regime to distinguish specific downturns in stock markets from normal rising equity markets. We test our approach and report improved accuracy of GBDT over other ML methods. Thanks to Shapley values, we are able to identify the most important features, making this work innovative and a suitable answer to the question of whether current equity levels are justified.
  • Computation of the marginal contribution of Sharpe ratio and other performance ratios.

    Eric BENHAMOU, Beatrice GUEZ
    2021
    Computing the incremental contribution of performance ratios like the Sharpe, Treynor, Calmar or Sterling ratios is of paramount importance for asset managers. Leveraging Euler's homogeneous function theorem, we are able to prove that these performance ratios are indeed a linear combination of individual modified performance ratios. This allows us not only to derive a condition for a new asset to provide incremental performance for the portfolio, but also to identify the key drivers of these performance ratios. We provide various numerical examples of this performance ratio decomposition.
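    One way to see the decomposition, sketched here with generic notation not taken from the paper (weights w, excess returns μ, covariance Σ): the portfolio volatility σ(w) = √(wᵀΣw) is positively homogeneous of degree one, so Euler's theorem gives σ(w) = Σᵢ wᵢ ∂ᵢσ(w), and hence

    ```latex
    \[
    S(w) \;=\; \frac{w^\top \mu}{\sigma(w)}
          \;=\; \sum_i \frac{w_i\,\partial_i \sigma(w)}{\sigma(w)}
                \cdot \frac{\mu_i}{\partial_i \sigma(w)},
    \]
    % a weighted sum of "modified" per-asset ratios, the risk weights
    % w_i \partial_i \sigma / \sigma summing to one by Euler's theorem.
    ```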
  • Distinguish the indistinguishable: a Deep Reinforcement Learning approach for volatility targeting models.

    Eric BENHAMOU, David SALTIEL, Serge TABACHNIK, Sui kai WONG, Francois CHAREYRON
    2021
    Can an agent efficiently learn to distinguish extremely similar financial models in an environment dominated by noise and regime changes? Standard statistical methods based on averaging or ranking models fail precisely because of regime changes and noisy environments. Additional contextual information in Deep Reinforcement Learning (DRL) helps train an agent to distinguish between different financial models whose time series are very similar. Our contributions are four-fold: (i) we combine model-based and model-free Reinforcement Learning (RL), the model-free part allowing us to select among the different models; (ii) we present a concept called "walk-forward analysis", defined by successive training and testing on expanding periods, to assess the robustness of the resulting agent; (iii) we present a method for feature importance, based on feature sensitivities, that resembles the one used in gradient boosting methods; (iv) last but not least, we introduce the concept of statistical difference significance based on a two-tailed t-test, to highlight the ways in which our models differ from more traditional ones. Our experimental results show that our approach outperforms the benchmarks in almost all evaluation metrics commonly used in financial mathematics, namely net performance, Sharpe ratio, Sortino ratio, maximum drawdown, and maximum drawdown over volatility.
  • Distribution and statistics of the Sharpe Ratio.

    Eric BENHAMOU
    2021
    Because of the frequent use of the Sharpe ratio in asset management to compare and benchmark funds and asset managers, it is relevant to derive its distribution and some of its statistics. In this paper, we show that under the assumption of independent normally distributed returns, it is possible to derive the exact distribution of the Sharpe ratio. In particular, we prove that up to a rescaling factor, the Sharpe ratio follows a non-central Student distribution whose characteristics have been widely studied by statisticians. For a large number of observations, we can derive the asymptotic distribution and recover the result of Lo (2002). We also illustrate the fact that the empirical Sharpe ratio is asymptotically optimal in the sense that it achieves the Cramer-Rao bound. We then study the empirical Sharpe ratio under AR(1) assumptions and investigate the effect of the compounding period on the Sharpe ratio (computing the annual Sharpe ratio with monthly data, for instance). We finally provide a general formula for the case of heteroscedasticity and autocorrelation.
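    The two facts at the core of this abstract, stated as a sketch in generic notation (n i.i.d. normal returns, empirical Sharpe ratio computed from the sample mean and standard deviation):

    ```latex
    \[
    \sqrt{n}\,\widehat{SR} \;\sim\; t_{n-1}\big(\delta=\sqrt{n}\,SR\big)
    \qquad \text{(non-central Student, $n-1$ degrees of freedom),}
    \]
    \[
    \sqrt{n}\,\big(\widehat{SR}-SR\big) \;\xrightarrow{d}\;
    \mathcal{N}\Big(0,\; 1+\tfrac{1}{2}SR^2\Big)
    \qquad \text{(the asymptotic result of Lo (2002)).}
    \]
    ```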
  • Trade Selection with Supervised Learning and Optimal Coordinate Ascent (OCA).

    David SALTIEL, Eric BENHAMOU, Rida LARAKI, Jamal ATIF
    Lecture Notes in Computer Science | 2021
    No summary available.
  • Detecting crisis event with Gradient Boosting Decision Trees.

    Eric BENHAMOU, Jean OHANA, David SALTIEL, Beatrice GUEZ
    2021
    Allocation in financial markets is a difficult task, as the method needs to change its behavior dramatically when facing very rare black-swan events like crises that shift the market regime. To address this challenge, we present a gradient boosting decision trees (GBDT) approach to predict large price drops in equity indexes from a set of 150 technical, fundamental and macroeconomic features. We report an improved accuracy of GBDT over other machine learning (ML) methods on the S&P 500 futures prices. We show that retaining fewer and carefully selected features provides improvements across all ML approaches, and that the resulting model has strong predictive power. We train the model from 2000 to 2014, a period in which various crises were observed, and use a three-year validation period to tune hyperparameters. The fitted model forecasts the Covid crisis in a timely manner, giving us a planning method for early detection of potential future crises.
  • Regime change detection with GBDT and Shapley values.

    Eric BENHAMOU, Jean OHANA, David SALTIEL, Beatrice GUEZ
    2021
    Regime change detection in financial markets is well known to be hard to explain and interpret. Can an asset manager clearly explain the intuition behind a regime change prediction on the equity market? To answer this question, we consider a gradient boosting decision trees (GBDT) approach to predict regime changes on the S&P 500 from a set of 150 technical, fundamental and macroeconomic features. We report an improved accuracy of GBDT over other machine learning (ML) methods on the S&P 500 futures prices. We show that retaining fewer and carefully selected features provides improvements across all ML approaches. Shapley values have recently been introduced from game theory to the field of ML. This approach allows a robust identification of the most important variables predicting stock market crises, and a local explanation of the crisis probability at each date, through a consistent feature attribution. We apply this methodology to analyse in detail the March 2020 financial meltdown, for which the model offered a timely out-of-sample prediction. This analysis unveils in particular the contrarian predictive role of the tech equity sector before and after the crash.
  • Detecting and Adapting to Crisis Pattern with Context Based Deep Reinforcement Learning.

    Eric BENHAMOU, David SALTIEL, Jean jacques OHANA, Jamal ATIF
    SSRN Electronic Journal | 2020
    No summary available.
  • Time Your Hedge With Deep Reinforcement Learning.

    Eric BENHAMOU, David SALTIEL, Sandrine UNGARI, Abhishek MUKHOPADHYAY
    SSRN Electronic Journal | 2020
    No summary available.
  • Time your hedge with Deep Reinforcement Learning.

    Eric BENHAMOU, David SALTIEL, Sandrine UNGARI, Abhishek MUKHOPADHYAY
    2020
    Can an asset manager plan the optimal timing of her/his hedging strategies given market conditions? The standard approach, based on Markowitz or other more or less sophisticated financial rules, aims to find the best portfolio allocation using forecasted expected returns and risk, but fails to fully relate market conditions to hedging decisions. In contrast, Deep Reinforcement Learning (DRL) can tackle this challenge by creating a dynamic dependency between market information and hedging strategy allocation decisions. In this paper, we present a realistic and augmented DRL framework that: (i) uses additional contextual information to decide an action, (ii) has a one-period lag between observations and actions to account for the one-day rebalancing lag common among asset managers, (iii) is fully tested in terms of stability and robustness thanks to a repetitive train-test method called anchored walk-forward training, similar in spirit to k-fold cross-validation for time series, and (iv) allows managing the leverage of our hedging strategy. Our experiment for an augmented asset manager interested in sizing and timing his hedges shows that our approach achieves superior returns and lower risk.
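    A minimal sketch of the anchored walk-forward scheme the abstract refers to: the training window is anchored at the first observation and expands, while the test window slides forward. The window sizes below are illustrative assumptions, not the paper's settings.

    ```python
    def anchored_walk_forward(n_obs: int, initial_train: int, test_size: int):
        """Yield (train, test) index ranges with an expanding, anchored train set."""
        start_test = initial_train
        while start_test + test_size <= n_obs:
            train = range(0, start_test)                      # always anchored at t = 0
            test = range(start_test, start_test + test_size)  # next out-of-sample slice
            yield train, test
            start_test += test_size                           # slide the test window

    for train, test in anchored_walk_forward(n_obs=1000, initial_train=500, test_size=100):
        print(f"train [0, {train.stop}) -> test [{test.start}, {test.stop})")
    ```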
  • NGO-GM: Natural Gradient Optimization for Graphical Models.

    Eric BENHAMOU, Rida LARAKI, David SALTIEL, Jamal ATIF
    2020
    This paper deals with estimating model parameters in graphical models. We reformulate the problem as an information geometric optimization problem and introduce a natural gradient descent strategy that incorporates additional meta parameters. We show that our approach is a strong alternative to the celebrated EM approach for learning in graphical models. Indeed, our natural gradient based strategy leads to learning optimal parameters for the final objective function without artificially trying to fit a distribution that may not correspond to the real one. We illustrate our theoretical findings on the question of trend detection in financial markets and show that the learned model performs better than traditional practitioner methods and is less prone to overfitting.
  • AAMDRL: Augmented Asset Management With Deep Reinforcement Learning.

    Eric BENHAMOU, David SALTIEL, Sandrine UNGARI, Jamal ATIF, Abhishek MUKHOPADHYAY
    SSRN Electronic Journal | 2020
    No summary available.
  • Deep Reinforcement Learning (DRL) for portfolio allocation.

    Eric BENHAMOU, Jamal ATIF, Rida LARAKI, David SALTIEL, Jean jacques OHANA
    The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases | 2020
    No summary available.
  • BCMA-ES: a conjugate prior Bayesian optimization view.

    Eric BENHAMOU, David SALTIEL, Rida LARAKI, Jamal ATIF
    2020
    CMA-ES is one of the state-of-the-art evolutionary optimization methods because of its capacity to adapt the covariance to the information geometry. It uses prior information to form a best guess about the distribution of the minimum. We show that this can be reformulated as a Bayesian optimization problem for the sampling of the optimum. Thanks to the Normal-Inverse Wishart (NIW) distribution, which is a conjugate prior for the multivariate normal distribution, we can derive a numerically efficient algorithm, Bayesian CMA-ES, that obtains performance similar to the traditional CMA-ES on multiple benchmarks and provides a new justification for the CMA-ES update equations. This novel paradigm for Bayesian CMA-ES provides a powerful bridge between evolutionary and Bayesian optimization, showing the profound similarities and connections between these traditionally opposed methods and opening horizons for variations and mixed strategies on these methods.
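    For reference, the textbook NIW conjugate update that this construction relies on (standard formulas, not quoted from the paper): with prior NIW(μ₀, κ₀, ν₀, Ψ₀) and n observations with sample mean x̄ and scatter matrix S = Σₖ (xₖ − x̄)(xₖ − x̄)ᵀ, the posterior is NIW(μₙ, κₙ, νₙ, Ψₙ) with

    ```latex
    \[
    \mu_n = \frac{\kappa_0\,\mu_0 + n\,\bar{x}}{\kappa_0 + n},\qquad
    \kappa_n = \kappa_0 + n,\qquad
    \nu_n = \nu_0 + n,
    \]
    \[
    \Psi_n = \Psi_0 + S
           + \frac{\kappa_0\,n}{\kappa_0 + n}\,(\bar{x}-\mu_0)(\bar{x}-\mu_0)^\top .
    \]
    ```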
  • Bridging the gap between Markowitz planning and deep reinforcement learning.

    Eric BENHAMOU, David SALTIEL, Sandrine UNGARI, Abhishek MUKHOPADHYAY
    2020
    While researchers in the asset management industry have mostly focused on techniques based on financial and risk planning, like the Markowitz efficient frontier, minimum variance, maximum diversification or equal risk parity, in parallel another community in machine learning has started working on reinforcement learning, and more particularly deep reinforcement learning, to solve other decision-making problems for challenging tasks like autonomous driving, robot learning and, on a more conceptual side, game solving like Go. This paper aims to bridge the gap between these two approaches by showing that Deep Reinforcement Learning (DRL) techniques can shed new light on portfolio allocation thanks to a more general optimization setting that casts portfolio allocation as an optimal control problem that is not just a one-step optimization, but rather a continuous control optimization with a delayed reward. The advantages are numerous: (i) DRL maps market conditions directly to actions by design and hence should adapt to changing environments, (ii) DRL does not rely on traditional financial risk assumptions, like risk being represented by variance, (iii) DRL can incorporate additional data and be a multi-input method, as opposed to more traditional optimization methods. We present encouraging experimental results using convolutional networks.
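    For contrast with the one-step optimization discussed above, a minimal sketch of the Markowitz side (unconstrained mean-variance weights in closed form; μ, Σ and γ below are illustrative placeholders, not the paper's data):

    ```python
    # Maximizing w' mu - (gamma/2) w' Sigma w gives w* = (1/gamma) Sigma^{-1} mu.
    import numpy as np

    mu = np.array([0.05, 0.03, 0.04])          # expected excess returns (assumed)
    Sigma = np.array([[0.04, 0.01, 0.00],
                      [0.01, 0.09, 0.02],
                      [0.00, 0.02, 0.16]])     # return covariance (assumed)
    gamma = 5.0                                 # risk aversion (assumed)

    w = np.linalg.solve(Sigma, mu) / gamma      # one-step optimal weights
    print("weights:", w)
    ```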
  • Similarities between policy gradient methods (PGM) in reinforcement learning (RL) and supervised learning (SL).

    Eric BENHAMOU
    2020
    Reinforcement learning (RL) is about sequential decision making and is traditionally opposed to supervised learning (SL) and unsupervised learning (USL). In RL, given the current state, the agent makes a decision that may influence the next state, as opposed to SL (and USL), where the next state remains the same regardless of the decisions taken, in either batch or online learning. Although this difference between SL and RL is fundamental, there are connections that have been overlooked. In particular, we prove in this paper that the policy gradient method can be cast as a supervised learning problem where true labels are replaced with discounted rewards. We provide a new proof of policy gradient methods (PGM) that emphasizes the tight link with cross entropy and supervised learning. We provide a simple experiment where we interchange labels and pseudo rewards. We conclude that other relationships with SL could be made if we modify the reward functions wisely.
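    The identity behind the claim, sketched in standard notation: the REINFORCE policy gradient

    ```latex
    \[
    \nabla_\theta J(\theta)
    \;=\; \mathbb{E}_{\tau \sim \pi_\theta}\Big[\sum_t G_t\,
          \nabla_\theta \log \pi_\theta(a_t \mid s_t)\Big]
    \]
    % is the gradient of a cross-entropy loss in which the "label" for
    % (s_t, a_t) is the action actually taken, weighted by the discounted
    % return G_t: supervised learning with rewards standing in for labels.
    ```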
  • Estimating Individual Treatment Effects through Causal Populations Identification.

    Celine BEJI, Eric BENHAMOU, Michael BON, Florian YGER, Jamal ATIF
    28th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2020) | 2020
    Estimating the Individual Treatment Effect from observational data, defined as the difference between outcomes with and without treatment or intervention while only one of the two is observed, is a challenging problem in causal learning. In this paper, we formulate this problem as an inference problem with hidden variables and enforce causal constraints based on a model of four exclusive causal populations. We propose a new version of the EM algorithm, coined the Expected-Causality-Maximization (ECM) algorithm, and provide hints on its convergence under mild conditions. We compare our algorithm to baseline methods on synthetic and real-world data and discuss its performance.
  • Testing Sharpe ratio: luck or skill?

    Eric BENHAMOU, David SALTIEL, Nicolas PARIS, Beatrice GUEZ
    2020
    The Sharpe ratio (sometimes also referred to as the information ratio) is widely used in asset management to compare and benchmark funds and asset managers. It computes the ratio of the (excess) net return over the strategy's standard deviation. However, the elements needed to compute the Sharpe ratio, namely the expected returns and the volatilities, are unknown quantities that must be estimated statistically. This means that the Sharpe ratio used by funds is likely to be error prone because of statistical estimation errors. In this paper, we provide various tests to measure the quality of Sharpe ratios. By quality, we mean measuring whether a manager was indeed lucky or skillful. The test assesses this through the statistical significance of the Sharpe ratio. We not only look at the traditional Sharpe ratio but also compute a modified Sharpe ratio insensitive to the capital used. We provide various statistical tests that can be used to precisely quantify whether the Sharpe ratio is statistically significant. We illustrate in particular the number of trades at a given Sharpe level that provides statistical significance, as well as the impact of auto-correlation, by providing reference tables that give the minimum required Sharpe ratio for a given time period and correlation level. We also provide, for Sharpe ratios of 0.5, 1.0, 1.5 and 2.0, the skill percentage given the auto-correlation level. JEL classification: C12, G11.
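    A minimal sketch of the basic significance test behind such tables, for the i.i.d. case only (the paper also treats auto-correlation, which this sketch ignores): under i.i.d. returns, the t-statistic of the mean return equals the per-period Sharpe ratio times √n.

    ```python
    import numpy as np
    from scipy import stats

    def sharpe_pvalue(returns: np.ndarray) -> tuple[float, float]:
        n = len(returns)
        sr = returns.mean() / returns.std(ddof=1)   # per-period Sharpe ratio
        t_stat = sr * np.sqrt(n)                     # one-sample t-statistic
        p = 2 * stats.t.sf(abs(t_stat), df=n - 1)    # two-sided p-value
        return sr, p

    rng = np.random.default_rng(1)
    rets = rng.normal(0.0005, 0.01, size=750)        # illustrative daily returns
    print(sharpe_pvalue(rets))
    ```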
  • BCMA-ES II: revisiting Bayesian CMA-ES.

    Eric BENHAMOU, David SALTIEL, Nicolas PARIS, Beatrice GUEZ
    2020
    This paper revisits the Bayesian CMA-ES and provides the updates for the normal-Wishart prior. It emphasizes the difference between a normal-Wishart and a normal-inverse Wishart prior. After some computation, we prove that the only difference surprisingly lies in the expected covariance. We prove that the expected covariance should be lower in the normal-Wishart prior model because of the convexity of the inverse. We present a mixture model that generalizes both the normal-Wishart and the normal-inverse Wishart models. We finally present various numerical experiments to compare both methods as well as the generalized method.
  • Variance Reduction in Actor Critic Methods (ACM).

    Eric BENHAMOU
    2020
    After presenting Actor Critic Methods (ACM), we show that ACM are control variate estimators. Using the projection theorem, we prove that the Q and Advantage Actor Critic (A2C) methods are optimal in the sense of the L2 norm for control variate estimators spanned by functions conditioned on the current state and action. This straightforward application of the Pythagorean theorem provides a theoretical justification for the strong performance of QAC and AAC, most often referred to as A2C methods, in deep policy gradient methods. This enables us to derive a new formulation for Advantage Actor Critic methods that has lower variance and improves on the traditional A2C method.
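    The control-variate reading, sketched in generic notation: for any state-dependent baseline b(s), the policy gradient stays unbiased,

    ```latex
    \[
    \nabla_\theta J(\theta)
    \;=\; \mathbb{E}\Big[\big(Q^{\pi}(s_t,a_t) - b(s_t)\big)\,
          \nabla_\theta \log \pi_\theta(a_t \mid s_t)\Big],
    \quad\text{since } \mathbb{E}\big[b(s_t)\,\nabla_\theta \log \pi_\theta(a_t \mid s_t)\big]=0 .
    \]
    % Choosing b = V^pi yields the advantage A = Q - V, the choice the paper
    % justifies as an L2 projection onto functions of the current state and action.
    ```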
  • BCMA-ES: A Bayesian approach to CMA-ES.

    Eric BENHAMOU, David SALTIEL, Fabien TEYTAUD, Sebastien VEREL
    2020
    This paper introduces a novel, theoretically sound approach to the celebrated CMA-ES algorithm. Assuming that the parameters of the multivariate normal distribution for the minimum follow a conjugate prior distribution, we derive their optimal update at each iteration step. Not only does this Bayesian framework provide a justification for the updates of the CMA-ES algorithm, it also gives two new versions of CMA-ES, assuming either normal-Wishart or normal-inverse Wishart priors, depending on whether we parametrize the likelihood by its covariance or its precision matrix. We support our theoretical findings with numerical experiments that show fast convergence of these modified versions of CMA-ES.
  • Bridging the Gap Between Markowitz Planning and Deep Reinforcement Learning.

    Eric BENHAMOU, David SALTIEL, Sandrine UNGARI, Abhishek MUKHOPADHYAY
    SSRN Electronic Journal | 2020
    No summary available.
  • Omega and Sharpe ratio.

    Eric BENHAMOU, Nicolas PARIS, Beatrice GUEZ
    2020
    The Omega ratio, defined as the probability-weighted ratio of gains over losses at a given level of expected return, has been advocated as a better performance indicator than the Sharpe and Sortino ratios, as it depends on the full return distribution and hence encapsulates all information about risk and return. We compute the Omega ratio for the normal distribution and show that, under some distribution symmetry assumptions, the Omega ratio is oversold, as it does not provide any additional information compared to the Sharpe ratio. Indeed, for returns that have elliptic distributions, we prove that the optimal portfolio according to the Omega ratio is the same as the optimal portfolio according to the Sharpe ratio. As elliptic distributions are a weak form of symmetric distributions that generalize Gaussian distributions and encompass many fat-tailed distributions, this tremendously reduces the potential interest of the Omega ratio.
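    For reference, the standard definition under discussion: for a return X with distribution F and threshold θ,

    ```latex
    \[
    \Omega(\theta) \;=\;
    \frac{\int_{\theta}^{\infty}\big(1-F(x)\big)\,dx}
         {\int_{-\infty}^{\theta}F(x)\,dx}
    \;=\; \frac{\mathbb{E}\big[(X-\theta)^{+}\big]}{\mathbb{E}\big[(\theta-X)^{+}\big]}
    \;=\; 1 + \frac{\mathbb{E}[X]-\theta}{\mathbb{E}\big[(\theta-X)^{+}\big]},
    \]
    % so at a fixed threshold Omega is a monotone transform of a
    % mean/shortfall trade-off, consistent with the paper's claim that,
    % under elliptic distributions, it ranks portfolios as the Sharpe ratio does.
    ```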
  • AAMDRL: Augmented Asset Management with Deep Reinforcement Learning.

    Eric BENHAMOU, David SALTIEL, Sandrine UNGARI, Abhishek MUKHOPADHYAY, Jamal ATIF
    2020
    Can an agent learn efficiently in a noisy and self-adapting environment with sequential, non-stationary and non-homogeneous observations? Through trading bots, we illustrate how Deep Reinforcement Learning (DRL) can tackle this challenge. Our contributions are threefold: (i) the use of contextual information, also referred to as augmented state, in DRL, (ii) the impact of a one-period lag between observations and actions, which is more realistic for an asset management environment, (iii) the implementation of a new repetitive train-test method called walk-forward analysis, similar in spirit to cross-validation for time series. Although our experiment is on trading bots, it can easily be translated to other bot environments that operate in a sequential setting with regime changes and noisy data. Our experiment for an augmented asset manager interested in finding the best portfolio for hedging strategies shows that AAMDRL achieves superior returns and lower risk.
  • Efficient variable selection by coordinate descent with theoretical guarantees.

    David SALTIEL, Eric BENHAMOU
    2020
    No summary available.
  • Efficient variable selection by coordinate ascent with theoretical guarantees.

    David SALTIEL, Eric BENHAMOU
    2019
    Despite the advent of representation-based learning, mainly through deep learning, feature selection remains a key element of many machine learning scenarios. This paper presents a new, theoretically motivated method for feature selection. The approach addresses feature selection through coordinate optimization methods that take into account the dependencies between variables, materializing these dependencies into blocks. The low number of iterations (until the convergence of the method) attests to the efficiency of gradient boosting methods (e.g. the XGBoost algorithm) for these supervised learning problems. In the case of convex and smooth features, we can prove that the convergence rate is polynomial in terms of the dimension of the complete set of features. We compare the results obtained with state-of-the-art feature selection methods, Recursive Feature Elimination (RFE) and Binary Coordinate Ascent (BCA), to show that this new method is competitive.
  • Everything You Always Wanted to Know about Exponential Family and Their Efficient Training in Bayesian Conjugate Priors.

    Eric BENHAMOU
    SSRN Electronic Journal | 2019
    No summary available.
  • NGO-GM: Natural Gradient Optimization for Graphical Models.

    Eric BENHAMOU, Jamal ATIF, Rida LARAKI, David SALTIEL
    SSRN Electronic Journal | 2019
    This paper deals with estimating model parameters in graphical models. We reformulate the problem as an information geometric optimization problem and introduce a natural gradient descent strategy that incorporates additional meta parameters. We show that our approach is a strong alternative to the celebrated EM approach for learning in graphical models. Indeed, our natural gradient based strategy leads to learning optimal parameters for the final objective function without artificially trying to fit a distribution that may not correspond to the real one. We illustrate our theoretical findings on the question of trend detection in financial markets and show that the learned model performs better than traditional practitioner methods and is less prone to overfitting.
  • Trade Selection with Supervised Learning and OCA.

    David SALTIEL, Eric BENHAMOU
    2019
    In recent years, state-of-the-art methods for supervised learning have increasingly exploited gradient boosting techniques, with mainstream efficient implementations such as XGBoost or LightGBM. One of the key points in generating proficient methods is Feature Selection (FS), which consists in selecting the right valuable, effective features. When facing hundreds of features, it becomes critical to select the best ones. While filter and wrapper methods have come to some maturity, embedded methods are truly necessary to find the best feature set, as they are hybrid methods combining feature filtering and wrapping. In this work, we tackle the problem of finding, through machine learning, the best a priori trades from an algorithmic strategy. We derive this new method using coordinate ascent optimization with block variables. We compare our method to Recursive Feature Elimination (RFE) and Binary Coordinate Ascent (BCA). We show on a real-life example the capacity of this method to select good trades a priori. Not only does this method outperform the initial trading strategy, as it avoids taking losing trades, it also surpasses the other methods, having the smallest feature set and the highest score at the same time. The interest of this method goes beyond this simple trade classification problem, as it is a very general method for determining the optimal feature set using information about feature relationships as well as coordinate ascent optimization.
  • Variance Reduction in Actor Critic Methods (ACM).

    Eric BENHAMOU
    SSRN Electronic Journal | 2019
    After presenting Actor Critic Methods (ACM), we show that ACM are control variate estimators. Using the projection theorem, we prove that the Q and Advantage Actor Critic (A2C) methods are optimal in the sense of the L2 norm for control variate estimators spanned by functions conditioned on the current state and action. This straightforward application of the Pythagorean theorem provides a theoretical justification for the strong performance of QAC and AAC, most often referred to as A2C methods, in deep policy gradient methods. This enables us to derive a new formulation for Advantage Actor Critic methods that has lower variance and improves on the traditional A2C method.
  • Similarities Between Policy Gradient Methods (PGM) in Reinforcement Learning (RL) and Supervised Learning (SL).

    Eric BENHAMOU
    SSRN Electronic Journal | 2019
    Reinforcement learning (RL) is about sequential decision making and is traditionally opposed to supervised learning (SL) and unsupervised learning (USL). In RL, given the current state, the agent makes a decision that may influence the next state, as opposed to SL (and USL), where the next state remains the same regardless of the decisions taken, in either batch or online learning. Although this difference between SL and RL is fundamental, there are connections that have been overlooked. In particular, we prove in this paper that the policy gradient method can be cast as a supervised learning problem where true labels are replaced with discounted rewards. We provide a new proof of policy gradient methods (PGM) that emphasizes the tight link with cross entropy and supervised learning. We provide a simple experiment where we interchange labels and pseudo rewards. We conclude that other relationships with SL could be made if we modify the reward functions wisely.
  • Testing Sharpe Ratio: Luck or Skill?

    Eric BENHAMOU, David SALTIEL, Beatrice GUEZ, Nicolas PARIS
    SSRN Electronic Journal | 2019
    The Sharpe ratio (sometimes also referred to as the information ratio) is widely used in asset management to compare and benchmark funds and asset managers. It computes the ratio of the (excess) net return over the strategy's standard deviation. However, the elements needed to compute the Sharpe ratio, namely the expected returns and the volatilities, are unknown quantities that must be estimated statistically. This means that the Sharpe ratio used by funds is likely to be error prone because of statistical estimation errors. In this paper, we provide various tests to measure the quality of Sharpe ratios. By quality, we mean measuring whether a manager was indeed lucky or skillful. The test assesses this through the statistical significance of the Sharpe ratio. We not only look at the traditional Sharpe ratio but also compute a modified Sharpe ratio insensitive to the capital used. We provide various statistical tests that can be used to precisely quantify whether the Sharpe ratio is statistically significant. We illustrate in particular the number of trades at a given Sharpe level that provides statistical significance, as well as the impact of auto-correlation, by providing reference tables that give the minimum required Sharpe ratio for a given time period and correlation level. We also provide, for Sharpe ratios of 0.5, 1.0, 1.5 and 2.0, the skill percentage given the auto-correlation level. JEL classification: C12, G11.
  • A short note on the operator norm upper bound for sub-Gaussian tailed random matrices.

    Eric BENHAMOU, Jamal ATIF, Rida LARAKI
    2019
    This paper investigates an upper bound of the operator norm for sub-Gaussian tailed random matrices. A lot of attention has been given to uniformly bounded sub-Gaussian tailed random matrices with independent coefficients, but little has been done for sub-Gaussian tailed random matrices whose coefficient variances are not equal or whose coefficients are not independent. This is precisely the subject of this paper. After proving that random matrices with uniform sub-Gaussian tailed independent coefficients satisfy the Tracy-Widom bound, that is, their matrix operator norm remains bounded by O(√n) with overwhelming probability, we prove that a less stringent condition is that the matrix rows are independent and uniformly sub-Gaussian. This does not require all matrix coefficients to be independent, but only the rows, which is a weaker condition.
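    The flavor of bound at stake, in the classical independent-entries case (see e.g. Vershynin's High-Dimensional Probability): for an m × n matrix A with independent, mean-zero entries of sub-Gaussian norm at most K,

    ```latex
    \[
    \|A\|_{\mathrm{op}} \;\le\; C\,K\big(\sqrt{m}+\sqrt{n}+t\big)
    \qquad \text{with probability at least } 1-2e^{-t^{2}},
    \]
    % so for square matrices the operator norm is O(sqrt(n)) with
    % overwhelming probability; the paper relaxes entrywise independence
    % to independent, uniformly sub-Gaussian rows.
    ```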
  • BCMA-ES II: Revisiting Bayesian CMA-ES.

    Eric BENHAMOU, David SALTIEL, Beatrice GUEZ, Nicolas PARIS
    SSRN Electronic Journal | 2019
    This paper revisits the Bayesian CMA-ES and provides the updates for the normal-Wishart prior. It emphasizes the difference between a normal-Wishart and a normal-inverse Wishart prior. After some computation, we prove that the only difference surprisingly lies in the expected covariance. We prove that the expected covariance should be lower in the normal-Wishart prior model because of the convexity of the inverse. We present a mixture model that generalizes both the normal-Wishart and the normal-inverse Wishart models. We finally present various numerical experiments to compare both methods as well as the generalized method.
  • A few properties of sample variance.

    Eric BENHAMOU
    2019
    A basic result is that the sample variance for i.i.d. observations is an unbiased estimator of the variance of the underlying distribution (see for instance Casella and Berger (2002)). Another result is that the sample variance's variance is minimal compared to any other unbiased estimator (see Halmos (1946)). But what happens if the observations are neither independent nor identically distributed? What can we say? Can we in particular compute explicitly the first two moments of the sample variance and hence generalize the formulae provided in Tukey (1957a), Tukey (1957b)? We also know that the sample mean and variance are independent if they are computed from an i.i.d. normal distribution. This is one of the underlying assumptions used to derive the Student distribution (Student, alias W. S. Gosset (1908)). But does this result hold for any other underlying distribution? Can we still have independent sample mean and variance if the distribution is not normal? This paper precisely answers these questions and extends previous work of Cho, Cho, and Eltinge (2004). We are able to derive a general formula for the first two moments and the variance of the sample variance under no specific assumptions. We also provide a faster proof of a seminal result of Lukacs (1942) by using the log characteristic function of the unbiased sample variance estimator.
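    For context, the classical i.i.d. benchmark that this work generalizes (standard formulas, with μ₄ the fourth central moment of the underlying distribution):

    ```latex
    \[
    s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\big(X_i-\bar{X}\big)^2,
    \qquad
    \mathbb{E}\big[s^2\big]=\sigma^2,
    \qquad
    \operatorname{Var}\big(s^2\big)=\frac{\mu_4}{n}-\frac{\sigma^4\,(n-3)}{n\,(n-1)} .
    \]
    ```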
  • A discrete version of CMA-ES.

    Eric BENHAMOU, Jamal ATIF, Rida LARAKI
    2019
    Modern machine learning uses more and more advanced optimization techniques to find optimal hyperparameters. Whenever the objective function is non-convex, non-continuous and has potentially multiple local minima, standard gradient descent optimization methods fail. A last-resort and very different approach is to assume that the optimum (or optima, as it is not necessarily unique) is distributed according to a distribution, and to iteratively adapt this distribution according to the tested points. These strategies, which originated in the early 1960s under the name Evolution Strategies (ES), have culminated with CMA-ES (Covariance Matrix Adaptation ES). CMA-ES relies on a multivariate normal distribution and is considered state of the art for general optimization programs. However, it is far from optimal for discrete variables. In this paper, we extend the method to multivariate binomial correlated distributions. For such a distribution, we show that it shares features similar to those of the multivariate normal: independence and zero correlation are equivalent, and correlation is efficiently modeled by interaction between different variables. We discuss this distribution in the framework of the exponential family. We prove that the model can estimate not only pairwise interactions between two variables but is also capable of modeling higher-order interactions. This allows creating a version of CMA-ES that can accommodate discrete variables efficiently. We provide the corresponding algorithm and conclude.
  • Kalman filter demystified: from intuition to probabilistic graphical model to real case in financial markets.

    Eric BENHAMOU
    2019
    In this paper, we revisit Kalman filter theory. After giving the intuition on a simplified financial markets example, we revisit the maths underlying it. We then show that the Kalman filter can be presented in a very different fashion using graphical models. This enables us to establish the connection between the Kalman filter and Hidden Markov Models. We then look at their application in financial markets and provide various intuitions in terms of their applicability to complex systems such as financial markets. Although this paper is written as a self-contained work connecting the Kalman filter to Hidden Markov Models, and hence revisits well-known and established results, it contains new results and brings additional contributions to the field. First, leveraging the link between the Kalman filter and HMMs, it gives new inference algorithms for extended Kalman filters. Second, it presents an alternative to the traditional estimation of parameters using the EM algorithm, thanks to the use of CMA-ES optimization. Third, it examines the application of the Kalman filter and its Hidden Markov model version to financial markets, providing various dynamics assumptions and tests. We conclude by connecting the Kalman filter approach to trend-following technical analysis systems and showing their superior performance for trend detection.
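    A minimal sketch of the linear Kalman filter the paper builds its intuition on, for a scalar local-level model (the dynamics and the noise levels q, r below are illustrative assumptions, not the paper's):

    ```python
    import numpy as np

    def kalman_filter(ys, q=1e-4, r=1e-2):
        """Filter a 1-D series under x_t = x_{t-1} + w_t, y_t = x_t + v_t."""
        x, p = ys[0], 1.0                 # state estimate and its variance
        out = []
        for y in ys:
            p = p + q                     # predict: random walk adds process noise
            k = p / (p + r)               # Kalman gain
            x = x + k * (y - x)           # update: blend prediction and observation
            p = (1.0 - k) * p
            out.append(x)
        return np.array(out)

    rng = np.random.default_rng(2)
    trend = np.cumsum(rng.normal(0, 0.1, 500))    # latent trend (synthetic)
    obs = trend + rng.normal(0, 1.0, 500)         # noisy observations
    filtered = kalman_filter(obs)
    ```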
  • A new approach to learning in Dynamic Bayesian Networks (DBNs).

    Eric BENHAMOU, Jamal ATIF, Rida LARAKI
    2019
    In this paper, we revisit the parameter learning problem, namely the estimation of model parameters for Dynamic Bayesian Networks (DBNs). DBNs are directed graphical models of stochastic processes that encompass and generalize Hidden Markov Models (HMMs) and Linear Dynamical Systems (LDSs). Whenever we apply these models to economics and finance, we are forced to make some modeling assumptions about the state dynamics and the graph topology (the DBN structure). These assumptions may be incorrectly specified and contain additional noise compared to reality. Trying to use a best-fit approach through maximum likelihood estimation may miss this point and fit these models to the data at any price. We present here a new methodology that takes a radical point of view and instead focuses on the final efficiency of our model. Parameters are hence estimated in terms of their efficiency rather than their distributional fit to the data. The resulting optimization problem, which consists in finding the optimal parameters, is hard. We rely on the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) method to tackle this issue. We apply this method to the seminal problem of trend detection in financial markets. We see in numerical results that the resulting parameters seem less prone to overfitting than the traditional moving-average crossover trend detection and perform better. The method developed here for algorithmic trading is general and can be applied to other real-world applications whenever there is no physical law underlying our DBNs.
  • Connecting Sharpe ratio and Student t-statistic, and beyond.

    Eric BENHAMOU
    2019
    The Sharpe ratio is widely used in asset management to compare and benchmark funds and asset managers. It computes the ratio of the excess return over the strategy's standard deviation. However, the elements needed to compute the Sharpe ratio, namely the expected returns and the volatilities, are unknown quantities that must be estimated statistically. This means that the Sharpe ratio used by funds is likely to be error prone because of statistical estimation error. Lo (2002) and Mertens (2002) derive explicit expressions for the statistical distribution of the Sharpe ratio using standard asymptotic theory under several sets of assumptions (independent and identically normally distributed returns). In this paper, we provide the exact distribution of the Sharpe ratio for independent normally distributed returns. In this case, the Sharpe ratio statistic is, up to a rescaling factor, a non-central Student distribution whose characteristics have been widely studied by statisticians. The asymptotic behavior of our distribution recovers the result of Lo (2002). We also illustrate the fact that the empirical Sharpe ratio is asymptotically optimal in the sense that it achieves the Cramer-Rao bound. We then study the empirical Sharpe ratio under AR(1) assumptions and investigate the effect of the compounding period on the Sharpe ratio (computing the annual Sharpe ratio with monthly data, for instance). We finally provide a general formula for the case of heteroscedasticity and autocorrelation. JEL classification: C12, G11.
  • Seven proofs of the Pearson Chi-squared independence test and its graphical interpretation.

    Eric BENHAMOU, Valentin MELOT
    2019
    This paper revisits the Pearson Chi-squared independence test. After presenting the underlying theory with modern notation and showing a new way of deriving the proof, we describe an innovative and intuitive graphical presentation of this test. This enables not only interpreting the test visually but also measuring how close or far we are from accepting or rejecting the null hypothesis of independence.
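    For reference, the statistic under discussion: with observed counts O_ij in an r × c contingency table and expected counts E_ij = (row_i total)(col_j total)/N under independence,

    ```latex
    \[
    \chi^2 \;=\; \sum_{i=1}^{r}\sum_{j=1}^{c}
    \frac{\big(O_{ij}-E_{ij}\big)^2}{E_{ij}}
    \;\xrightarrow{d}\; \chi^2_{(r-1)(c-1)}
    \qquad \text{under the null hypothesis of independence.}
    \]
    ```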
  • Omega and Sharpe Ratio.

    Eric BENHAMOU, Beatrice GUEZ, Nicolas PARIS
    SSRN Electronic Journal | 2019
    No summary available.
  • Feature selection with optimal coordinate ascent (OCA).

    David SALTIEL, Eric BENHAMOU
    2019
    In machine learning, Feature Selection (FS) is a major part of an efficient algorithm. It fuels the algorithm and is the starting block for prediction. In this paper, we present a new method, called Optimal Coordinate Ascent (OCA), that allows us to select features among block and individual features. OCA relies on coordinate ascent to find an optimal solution for the gradient boosting method's score (the number of correctly classified samples). OCA takes into account the notion of dependencies between variables, which form blocks in our optimization. The coordinate ascent optimization solves the issue of the original NP-hard problem, where the number of combinations explodes rapidly, making a grid search unfeasible. It considerably reduces the number of iterations, changing this NP-hard problem into a polynomial search. OCA brings substantial differences and improvements compared to the previous coordinate ascent feature selection method: we group variables into blocks and individual variables instead of using a binary selection. Our initial guess is based on the k-best group variables, making our initial point more robust. We also introduce new stopping criteria, making our optimization faster. We compare these two methods on our data set and find that our method outperforms the initial one. We also compare our method to the Recursive Feature Elimination (RFE) method and find that OCA leads to the minimum feature set with the highest score. This is a nice byproduct of our method, as it empirically provides the most compact data set with optimal performance.
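    A minimal sketch of block coordinate ascent for feature selection in the spirit of OCA, not the authors' algorithm: flip one block of features at a time and keep the move only if the cross-validated score of a gradient boosting model improves. The blocks and the synthetic data are hypothetical.

    ```python
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import cross_val_score

    def block_coordinate_ascent(X, y, blocks, n_sweeps=3):
        mask = np.ones(X.shape[1], dtype=bool)            # start from all features
        clf = GradientBoostingClassifier(n_estimators=50)

        def score(m):
            return cross_val_score(clf, X[:, m], y, cv=3).mean() if m.any() else -np.inf

        best = score(mask)
        for _ in range(n_sweeps):                         # sweep over the coordinates (blocks)
            for block in blocks:
                mask[block] = ~mask[block]                # try flipping this block
                trial = score(mask)
                if trial > best:
                    best = trial                          # keep the improving move
                else:
                    mask[block] = ~mask[block]            # revert otherwise
        return mask, best

    rng = np.random.default_rng(3)
    X = rng.normal(size=(300, 12))
    y = (X[:, 0] + X[:, 1] + 0.5 * rng.normal(size=300) > 0).astype(int)
    blocks = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
    mask, best = block_coordinate_ascent(X, y, blocks)
    print("selected features:", np.where(mask)[0], "cv score:", round(best, 3))
    ```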
  • BCMA-ES: A Bayesian Approach to CMA-ES.

    Eric BENHAMOU, David SALTIEL, Sebastien VEREL, Fabien TEYTAUD
    SSRN Electronic Journal | 2019
    This paper introduces a novel, theoretically sound approach to the celebrated CMA-ES algorithm. Assuming that the parameters of the multivariate normal distribution for the minimum follow a conjugate prior distribution, we derive their optimal update at each iteration step. Not only does this Bayesian framework provide a justification for the updates of the CMA-ES algorithm, it also gives two new versions of CMA-ES, assuming either normal-Wishart or normal-inverse Wishart priors, depending on whether we parametrize the likelihood by its covariance or its precision matrix. We support our theoretical findings with numerical experiments that show fast convergence of these modified versions of CMA-ES.
  • Gram Charlier and Edgeworth Expansion for Sample Variance.

    Eric BENHAMOU
    SSRN Electronic Journal | 2018
    In this paper, we derive valid Edgeworth expansions for the Bessel-corrected empirical variance when data are generated by a strongly mixing process whose distribution can be arbitrary. The constraint of a strongly mixing process makes the problem non-trivial. Indeed, even for a strongly mixing normal process, the distribution is unknown. Here, we make no assumption other than a sufficiently fast decrease of the underlying distribution to make the Edgeworth expansion convergent. These results can obviously be applied to strongly mixing normal processes and provide an alternative to the work of Moschopoulos (1985) and Mathai (1982). Mathematics Subject Classification: 62E10, 62E15.
  • Feature Selection With Optimal Coordinate Ascent (OCA).

    David SALTIEL, Eric BENHAMOU
    SSRN Electronic Journal | 2018
    In machine learning, Feature Selection (FS) is a major part of an efficient algorithm. It fuels the algorithm and is the starting block for prediction. In this paper, we present a new method, called Optimal Coordinate Ascent (OCA), that allows us to select features among block and individual features. OCA relies on coordinate ascent to find an optimal solution for the gradient boosting method's score (the number of correctly classified samples). OCA takes into account the notion of dependencies between variables, which form blocks in our optimization. The coordinate ascent optimization solves the issue of the original NP-hard problem, where the number of combinations explodes rapidly, making a grid search unfeasible. It considerably reduces the number of iterations, changing this NP-hard problem into a polynomial search. OCA brings substantial differences and improvements compared to the previous coordinate ascent feature selection method: we group variables into blocks and individual variables instead of using a binary selection. Our initial guess is based on the k-best group variables, making our initial point more robust. We also introduce new stopping criteria, making our optimization faster. We compare these two methods on our data set and find that our method outperforms the initial one. We also compare our method to the Recursive Feature Elimination (RFE) method and find that OCA leads to the minimum feature set with the highest score. This is a nice byproduct of our method, as it empirically provides the most compact data set with optimal performance.
  • Incremental Sharpe and other performance ratios.

    Eric BENHAMOU, Beatrice GUEZ
    Journal of Statistical and Econometric Methods | 2018
    We present a new methodology for computing the incremental contribution to portfolio performance ratios like the Sharpe, Treynor, Calmar or Sterling ratios. Using Euler's homogeneous function theorem, we are able to decompose these performance ratios as a linear combination of individual modified performance ratios. This allows understanding the drivers of these performance ratios as well as deriving a condition for a new asset to provide incremental performance to the portfolio. We provide various numerical examples of this performance ratio decomposition. JEL classification: C12, G11.
  • Operator Norm Upper Bound for Sub-Gaussian Tailed Random Matrices.

    Eric BENHAMOU, Jamal ATIF, Rida LARAKI
    SSRN Electronic Journal | 2018
    No summary available.
  • Kalman Filter Demystified: From Intuition to Probabilistic Graphical Model to Real Case in Financial Markets.

    Eric BENHAMOU
    SSRN Electronic Journal | 2018
    In this paper, we revisit Kalman filter theory. After giving the intuition on a simplified financial markets example, we revisit the maths underlying it. We then show that the Kalman filter can be presented in a very different fashion using graphical models. This enables us to establish the connection between the Kalman filter and Hidden Markov Models. We then look at their application in financial markets and provide various intuitions in terms of their applicability to complex systems such as financial markets. Although this paper is written as a self-contained work connecting the Kalman filter to Hidden Markov Models, and hence revisits well-known and established results, it contains new results and brings additional contributions to the field. First, leveraging the link between the Kalman filter and HMMs, it gives new inference algorithms for extended Kalman filters. Second, it presents an alternative to the traditional estimation of parameters using the EM algorithm, thanks to the use of CMA-ES optimization. Third, it examines the application of the Kalman filter and its Hidden Markov model version to financial markets, providing various dynamics assumptions and tests. We conclude by connecting the Kalman filter approach to trend-following technical analysis systems and showing their superior performance for trend detection.
  • A Discrete Version of CMA-ES.

    Eric BENHAMOU, Jamal ATIF, Rida LARAKI, Anne AUGER
    SSRN Electronic Journal | 2018
    No summary available.
  • Trade Selection with Supervised Learning and OCA.

    David SALTIEL, Eric BENHAMOU
    SSRN Electronic Journal | 2018
    In recent years, state-of-the-art methods for supervised learning have increasingly exploited gradient boosting techniques, with mainstream efficient implementations such as XGBoost or LightGBM. One of the key points in generating proficient methods is Feature Selection (FS), which consists in selecting the right valuable, effective features. When facing hundreds of features, it becomes critical to select the best ones. While filter and wrapper methods have come to some maturity, embedded methods are truly necessary to find the best feature set, as they are hybrid methods combining feature filtering and wrapping. In this work, we tackle the problem of finding, through machine learning, the best a priori trades from an algorithmic strategy. We derive this new method using coordinate ascent optimization with block variables. We compare our method to Recursive Feature Elimination (RFE) and Binary Coordinate Ascent (BCA). We show on a real-life example the capacity of this method to select good trades a priori. Not only does this method outperform the initial trading strategy, as it avoids taking losing trades, it also surpasses the other methods, having the smallest feature set and the highest score at the same time. The interest of this method goes beyond this simple trade classification problem, as it is a very general method for determining the optimal feature set using information about feature relationships as well as coordinate ascent optimization.
  • Incremental Sharpe and Other Performance Ratios.

    Eric BENHAMOU, Beatrice GUEZ
    SSRN Electronic Journal | 2018
    We present a new methodology for computing the incremental contribution to portfolio performance ratios like the Sharpe, Treynor, Calmar or Sterling ratios. Using Euler's homogeneous function theorem, we are able to decompose these performance ratios as a linear combination of individual modified performance ratios. This allows understanding the drivers of these performance ratios as well as deriving a condition for a new asset to provide incremental performance to the portfolio. We provide various numerical examples of this performance ratio decomposition. JEL classification: C12, G11.
  • A Few Properties of Sample Variance.

    Eric BENHAMOU
    SSRN Electronic Journal | 2018
    A basic result is that the sample variance for i.i.d. observations is an unbiased estimator of the variance of the underlying distribution (see for instance Casella and Berger (2002)). Another result is that the sample variance's variance is minimal compared to any other unbiased estimator (see Halmos (1946)). But what happens if the observations are neither independent nor identically distributed? What can we say? Can we in particular compute explicitly the first two moments of the sample variance and hence generalize the formulae provided in Tukey (1957a), Tukey (1957b)? We also know that the sample mean and variance are independent if they are computed from an i.i.d. normal distribution. This is one of the underlying assumptions used to derive the Student distribution (Student, alias W. S. Gosset (1908)). But does this result hold for any other underlying distribution? Can we still have independent sample mean and variance if the distribution is not normal? This paper precisely answers these questions and extends previous work of Cho, Cho, and Eltinge (2004). We are able to derive a general formula for the first two moments and the variance of the sample variance under no specific assumptions. We also provide a faster proof of a seminal result of Lukacs (1942) by using the log characteristic function of the unbiased sample variance estimator.
  • Connecting Sharpe Ratio and Student T-Statistic, and Beyond.

    Eric BENHAMOU
    SSRN Electronic Journal | 2018
    The Sharpe ratio is widely used in asset management to compare and benchmark funds and asset managers. It computes the ratio of the excess return over the strategy's standard deviation. However, the elements needed to compute the Sharpe ratio, namely the expected returns and the volatilities, are unknown quantities that must be estimated statistically. This means that the Sharpe ratio used by funds is likely to be error prone because of statistical estimation error. Lo (2002) and Mertens (2002) derive explicit expressions for the statistical distribution of the Sharpe ratio using standard asymptotic theory under several sets of assumptions (independent and identically normally distributed returns). In this paper, we provide the exact distribution of the Sharpe ratio for independent normally distributed returns. In this case, the Sharpe ratio statistic is, up to a rescaling factor, a non-central Student distribution whose characteristics have been widely studied by statisticians. The asymptotic behavior of our distribution recovers the result of Lo (2002). We also illustrate the fact that the empirical Sharpe ratio is asymptotically optimal in the sense that it achieves the Cramer-Rao bound. We then study the empirical Sharpe ratio under AR(1) assumptions and investigate the effect of the compounding period on the Sharpe ratio (computing the annual Sharpe ratio with monthly data, for instance). We finally provide a general formula for the case of heteroscedasticity and autocorrelation. JEL classification: C12, G11.
  • Seven Proofs of the Pearson Chi-Squared Independence Test and its Graphical Interpretation.

    Eric BENHAMOU, Valentin MELOT
    SSRN Electronic Journal | 2018
    This paper revisits the Pearson Chi-squared independence test. After presenting the underlying theory with modern notation and showing a new way of deriving the proof, we describe an innovative and intuitive graphical presentation of this test. This enables not only interpreting the test visually but also measuring how close or far we are from accepting or rejecting the null hypothesis of independence.
  • Three remarkable properties of the Normal distribution for sample variance.

    Eric BENHAMOU, Nicolas PARIS, Beatrice GUEZ
    Theoretical Mathematics & Applications | 2018
    In this paper, we present three remarkable properties of the normal distribution: first, that if the sum of two independent random variables is normally distributed, then each random variable follows a normal distribution (which is referred to as the Levy-Cramer theorem); second, a variation of the Levy-Cramer theorem (new to our knowledge) stating that two independent symmetric random variables with finite variance whose sum and difference are independent are necessarily normal; and third, that the normal distribution can be characterized by the fact that it is the only distribution for which the sample mean and variance are independent, which is a central property for deriving the Student distribution and is referred to as the Geary theorem. The novelty of this paper is twofold. First, we provide an extension of the Levy-Cramer theorem. Second, for the two seminal theorems (the Levy-Cramer and Geary theorems), we provide new, quicker, or self-contained proofs. Mathematics Subject Classification: 62E10, 62E15.
  • Gram Charlier and Edgeworth expansion for sample variance.

    Eric BENHAMOU
    Theoretical Mathematics and Applications | 2018
    In this paper, we derive valid Edgeworth expansions for the Bessel-corrected empirical variance when data are generated by a strongly mixing process whose distribution can be arbitrary. The constraint of a strongly mixing process makes the problem non-trivial. Indeed, even for a strongly mixing normal process, the distribution is unknown. Here, we make no assumption other than a sufficiently fast decrease of the underlying distribution to make the Edgeworth expansion convergent. These results can obviously be applied to strongly mixing normal processes and provide an alternative to the work of Moschopoulos (1985) and Mathai (1982).
  • Three remarkable properties of the Normal distribution for sample variance.

    Eric BENHAMOU, Beatrice GUEZ, Nicolas PARIS
    Theoretical Mathematics and Applications | 2018
    In this paper, we present three remarkable properties of the normal distribution: first, that if the sum of two independent random variables is normally distributed, then each random variable follows a normal distribution (which is referred to as the Levy-Cramer theorem); second, a variation of the Levy-Cramer theorem (new to our knowledge) stating that two independent symmetric random variables with finite variance whose sum and difference are independent are necessarily normal; and third, that the normal distribution can be characterized by the fact that it is the only distribution for which the sample mean and variance are independent, which is a central property for deriving the Student distribution and is referred to as the Geary theorem. The novelty of this paper is twofold. First, we provide an extension of the Levy-Cramer theorem. Second, for the two seminal theorems (the Levy-Cramer and Geary theorems), we provide new, quicker, or self-contained proofs.
  • Three Remarkable Properties of the Normal Distribution.

    Eric BENHAMOU, Beatrice GUEZ, Nicolas PARIS
    SSRN Electronic Journal | 2018
    No summary available.
  • T-Statistic for Autoregressive Process.

    Eric BENHAMOU
    SSRN Electronic Journal | 2018
    In this paper, we discuss the distribution of the t-statistic under the assumption of a normal autoregressive distribution for the underlying discrete-time process. This result generalizes the classical result on the traditional t-distribution, where the underlying discrete-time process follows an uncorrelated normal distribution. For an AR(1) process, however, the underlying process is correlated. All traditional results break down, and the resulting t-statistic follows a new distribution that converges asymptotically to a normal. We give an explicit formula for this new distribution, obtained as the ratio of two dependent distributions (a normal and the distribution of the norm of another independent normal distribution). We also provide a modified statistic that follows a non-central t-distribution. Its derivation comes from finding an orthogonal basis for the initial circulant Toeplitz covariance matrix. Our findings are consistent with the asymptotic distribution of the t-statistic derived for the case of a large number of observations or zero correlation. This exact distribution has applications in multiple fields and in particular provides a way to derive the exact distribution of the Sharpe ratio under normal AR(1) assumptions.
  • Trend Without Hiccups - A Kalman Filter Approach.

    Eric BENHAMOU
    SSRN Electronic Journal | 2016
    No summary available.
  • Stochastic development and closed-form pricing for European options.

    Mohammed MIRI, Emmanuel GOBET, Eric BENHAMOU, Nicole EL KAROUI, Philippe BRIAND, Etienne KOEHLER, Jean pierre FOUQUE, Denis TALAY
    2009
    This thesis develops a new methodology for establishing analytical approximations of European option prices. Our approach cleverly combines stochastic expansions and Malliavin calculus to obtain explicit formulas and accurate error estimates. The interest of these formulas lies in their computation time, which is as fast as that of the Black-Scholes formula. Our motivation comes from the growing need for real-time calculations and calibration procedures, while controlling the numerical errors related to the model parameters. We treat four categories of models, performing specific parameterizations for each in order to better target the right proxy model and thus obtain easy-to-evaluate correction terms. The four parts treated are: diffusions with jumps, local volatility (Dupire) models, stochastic volatility models, and finally hybrid rate-equity models. It should also be noted that our approximation error is expressed as a function of all the parameters of the model in question and is also analyzed in terms of the regularity of the payoff.
Affiliations are detected from the signatures of publications identified in scanR. An author can therefore appear to be affiliated with several structures or supervisors according to these signatures. The dates displayed correspond only to the dates of the publications found. For more information, see https://scanr.enseignementsup-recherche.gouv.fr