Machine learning methods for discrete multi-scale flows: application to finance.

Authors
Publication date
2012
Publication type
Thesis
Summary This research addresses the problem of identifying and predicting the trends of a financial series considered in a multivariate framework. The framework of this problem, inspired by machine learning, is defined in Chapter I. The efficient markets hypothesis, which runs counter to the objective of trend prediction, is first recalled, and the different schools of thought in market analysis, which to some extent oppose the efficient markets hypothesis, are then presented. We explain the techniques of fundamental analysis, technical analysis and quantitative analysis, with particular attention to statistical learning techniques for computing predictions on time series. The difficulties of dealing with time-dependent and/or non-stationary factors are highlighted, as well as the usual pitfalls of overfitting and careless data manipulation. Extensions of the classical statistical learning framework, especially transfer learning, are presented. The main contribution of this chapter is the introduction of a research methodology for developing numerical models for trend prediction. This methodology is based on an experimental protocol consisting of four modules. The first module, Data Observation and Modeling Choices, is a preliminary module devoted to stating modeling choices, hypotheses and very general objectives. The second module, Database Construction, transforms the target variable and explanatory variables into factors and labels in order to train numerical trend prediction models. The third module, Model Building, builds the numerical trend prediction models. The fourth and final module, Backtesting and Numerical Results, evaluates the accuracy of the trend prediction models on a significant test set, using two generic backtesting procedures. The first procedure returns the recognition rates of upward and downward trends.
The second procedure constructs trading rules using the predictions computed on the test set. The result (P&L) of each trading rule is the cumulative gains and losses over the test period. These backtesting procedures are complemented by interpretation functions, which facilitate the analysis of the decision mechanism of the numerical models. These functions can be measures of the predictive ability of the factors, or measures of the reliability of the models and of the predictions they deliver. They contribute decisively to the formulation of hypotheses better adapted to the data, as well as to the improvement of the methods of representation and construction of databases and models. This is explained in Chapter IV. The numerical models, specific to each of the model building methods described in Chapter IV, and aimed at predicting the trends of the target variables introduced in Chapter II, are computed and backtested. The reasons for switching from one model building method to another are documented in detail. The influence of the choice of parameters at each stage of the experimental protocol on the conclusions drawn is also highlighted. The PPVR procedure, which does not require any additional calculation of parameters, has thus been used to reliably study the efficient markets hypothesis. Finally, new research directions for the construction of predictive models are proposed.
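The two backtesting procedures mentioned in the summary can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the +1/-1 trend encoding, the toy data and the long/short trading rule are all assumptions made for illustration, since the summary does not specify the actual trading rules.

```python
from typing import Dict, List, Sequence


def recognition_rates(actual: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Share of upward (+1) and downward (-1) trends that were predicted correctly.

    The +1/-1 encoding is an assumption of this sketch.
    """
    rates = {}
    for label, name in ((1, "up"), (-1, "down")):
        # Indices of the test points whose true trend carries this label.
        idx = [i for i, a in enumerate(actual) if a == label]
        hits = sum(1 for i in idx if predicted[i] == label)
        rates[name] = hits / len(idx) if idx else float("nan")
    return rates


def cumulative_pnl(returns: Sequence[float], predicted: Sequence[int]) -> List[float]:
    """Running P&L of a hypothetical rule: long on a +1 prediction, short on -1."""
    pnl, total = [], 0.0
    for r, p in zip(returns, predicted):
        total += p * r  # gain if the predicted sign matches the realized return
        pnl.append(total)
    return pnl


# Toy test set (invented values, for illustration only).
actual = [1, 1, -1, -1, 1]
predicted = [1, -1, -1, 1, 1]
returns = [0.5, 0.25, -0.5, -0.25, 1.0]

rates = recognition_rates(actual, predicted)
pnl = cumulative_pnl(returns, predicted)
print(rates)
print(pnl)
```

Reporting the upward and downward rates separately, rather than a single accuracy figure, shows whether a model merely follows the dominant trend of the test period.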