Browse our papers

Combining Model-Based and Model-Free RL for Financial Markets

Collaboration with the Lombard Odier team

Video: here

Link: SSRN paper 3830012

Authors: Eric Benhamou, David Saltiel, Serge Tabachnik, Sui Kai Wong and François Chareyron

Abstract: Model-free reinforcement learning has achieved great results in stable environments but has so far failed to generalize well in regime-changing environments like financial markets. In contrast, model-based RL is able to capture some fundamental and dynamical concepts of the environment but suffers from cognitive bias. In this work, we propose to combine the best of the two approaches by using model-free deep reinforcement learning to select among various model-based approaches. Beyond past performance and volatility, we include additional contextual information, such as macro and risk-appetite signals, to account for implicit regime changes. We also adapt traditional RL methods to account for the fact that, in real life, training always takes place in the past: we cannot use future information in our training data set, as K-fold cross validation implicitly does. Building on traditional statistical methods, we introduce "walk-forward analysis", defined by successive training and testing on expanding periods, to assess the robustness of the resulting agent. Last but not least, we present the concept of statistical difference significance based on a two-tailed T-test, to highlight the ways in which our models differ from more traditional ones. Our experimental results show that our approach outperforms traditional financial baseline portfolio models like Markowitz in almost all evaluation metrics commonly used in financial mathematics, namely net performance, Sharpe ratio, Sortino ratio, maximum drawdown and maximum drawdown over volatility. Read more
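The walk-forward idea, successive training and testing on expanding periods so that no future information ever leaks into training, can be sketched in a few lines (the function below is an illustration of the principle, not the authors' implementation; the split sizes are arbitrary):

```python
import numpy as np

def walk_forward_splits(n_samples, n_splits, min_train):
    """Yield (train_idx, test_idx) pairs with an expanding training window.

    Unlike K-fold cross validation, each test period strictly follows its
    training period, mimicking the fact that training takes place in the past.
    """
    test_size = (n_samples - min_train) // n_splits
    for k in range(n_splits):
        train_end = min_train + k * test_size
        test_end = min(train_end + test_size, n_samples)
        yield np.arange(0, train_end), np.arange(train_end, test_end)

# Example: 1000 daily observations, 4 successive train/test periods.
splits = list(walk_forward_splits(1000, 4, min_train=600))
for train, test in splits:
    assert train.max() < test.min()   # training always precedes testing
```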

Submitted: March 25, 2021

Explainable AI Models of Stock Crashes: A Machine-Learning Explanation of the Covid March 2020 Equity Meltdown

Collaboration with the Homa Capital team

Link: SSRN paper 3809308

Authors: Jean Jacques Ohana, Steve Ohana, Eric Benhamou, David Saltiel and Beatrice Guez

Abstract: We consider a gradient boosting decision trees (GBDT) approach to predict large S&P 500 price drops from a set of 150 technical, fundamental and macroeconomic features. We report an improved accuracy of GBDT over other machine learning (ML) methods on S&P 500 futures prices. We show that retaining fewer, carefully selected features provides improvements across all ML approaches. Shapley values have recently been introduced from game theory into the field of ML. They allow for a robust identification of the most important variables predicting stock market crises, and a local explanation of the crisis probability at each date, through a consistent feature attribution. We apply this methodology to analyze in detail the March 2020 financial meltdown, for which the model offered a timely out-of-sample prediction. This analysis unveils in particular the contrarian predictive role of the tech equity sector before and after the crash. Read more
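The Shapley attribution principle behind this analysis can be illustrated on a toy model. The value function and numbers below are invented for illustration; in practice the paper relies on efficient tree-based approximations (TreeSHAP) rather than this exact enumeration, which is exponential in the number of features:

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_values(value, n_features):
    """Exact Shapley attribution of a set-valued function `value`.

    `value(S)` returns the model output when only the feature subset S
    (a frozenset of indices) is present.
    """
    phi = np.zeros(n_features)
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                # Classic Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(len(S)) * factorial(n_features - len(S) - 1) / factorial(n_features)
                phi[i] += w * (value(frozenset(S) | {i}) - value(frozenset(S)))
    return phi

# Toy "crash probability" model: feature 0 contributes 0.3, feature 1
# contributes 0.1, and they interact (+0.2 when both are present).
def crash_prob(S):
    return 0.3 * (0 in S) + 0.1 * (1 in S) + 0.2 * ((0 in S) and (1 in S))

phi = shapley_values(crash_prob, 2)
# Consistency: attributions sum to the full-coalition output.
assert abs(phi.sum() - crash_prob(frozenset({0, 1}))) < 1e-12
```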

Submitted: March 21, 2021

Knowledge discovery with Deep RL for selecting financial hedges

Collaboration with the Société Générale team

Link: KDF 21 Workshop

See video: here

Authors: Eric Benhamou, David Saltiel, Sandrine Ungari, Abhishek Mukhopadhyay, Jamal Atif, and Rida Laraki 

Abstract: Can an asset manager gain knowledge from different data sources to select the right hedging strategy for his portfolio? We use Deep Reinforcement Learning (Deep RL or DRL) to extract information not only from the past performances of the hedging strategies but also from additional contextual information like risk aversion, correlation data, credit information and estimated earnings per share. Our contributions are threefold: (i) the use of contextual information, also referred to as an augmented state, in DRL, (ii) the impact of a one-period lag between observations and actions, which is more realistic for an asset management environment, (iii) the implementation of a new repetitive train-test method called walk-forward analysis, similar in spirit to cross validation for time series. Although our experiment is on trading bots, it can easily be translated to other bot environments that operate in sequential settings with regime changes and noisy data. Our experiment for an augmented asset manager interested in finding the best portfolio of hedging strategies achieves superior returns and lower risk. Read more
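Contribution (ii), the one-period lag between observations and actions, can be sketched as follows. The simulated returns and the equal-weight placeholder policy are illustrative assumptions standing in for the trained DRL agent:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_strategies = 250, 3
returns = rng.normal(0.0002, 0.01, size=(T, n_strategies))  # daily returns

def run_with_lag(weights_fn, lag=1):
    """Apply allocation decisions with a `lag`-day delay: weights chosen
    from information up to day t are only applied on day t + lag,
    mimicking a real asset manager's rebalancing turnover."""
    pnl = np.zeros(T)
    w = np.full(n_strategies, 1 / n_strategies)       # initial allocation
    pending = [w] * lag                               # decisions in flight
    for t in range(T):
        w = pending.pop(0)                            # decided `lag` days ago
        pnl[t] = returns[t] @ w
        pending.append(weights_fn(returns[:t + 1]))   # decide on past data only
    return pnl

# Equal-weight policy as a placeholder for the trained agent.
pnl = run_with_lag(lambda past: np.full(n_strategies, 1 / n_strategies))
```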

Submitted: February 9, 2021

Time your hedge with Deep Reinforcement Learning

Collaboration with the Société Générale team

Link: SSRN paper 3693614

Authors: Eric Benhamou, David Saltiel, Sandrine Ungari and Abhishek Mukhopadhyay

Abstract: Can an asset manager plan the optimal timing of her/his hedging strategies given market conditions? The standard approach based on Markowitz or other more or less sophisticated financial rules aims to find the best portfolio allocation from forecasted expected returns and risk, but fails to fully relate market conditions to hedging decisions. In contrast, Deep Reinforcement Learning (DRL) can tackle this challenge by creating a dynamic dependency between market information and hedging allocation decisions. In this paper, we present a realistic and augmented DRL framework that: (i) uses additional contextual information to decide an action, (ii) has a one-period lag between observations and actions, to account for the one-day turnover lag with which common asset managers rebalance their hedges, (iii) is fully tested in terms of stability and robustness thanks to a repetitive train-test method called anchored walk-forward training, similar in spirit to K-fold cross validation for time series, and (iv) allows managing the leverage of our hedging strategy. Our experiment for an augmented asset manager interested in sizing and timing his hedges shows that our approach achieves superior returns and lower risk. Read more
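Superior returns and lower risk are typically measured with the standard metrics used across these papers; a minimal sketch of their computation (annualization with 252 trading days is an assumption, and the simulated returns are purely illustrative):

```python
import numpy as np

def evaluation_metrics(returns, periods=252):
    """Annualized metrics commonly used to compare strategies: net
    performance, Sharpe ratio, Sortino ratio, maximum drawdown and the
    max-drawdown-over-volatility ratio."""
    mean = returns.mean() * periods
    vol = returns.std() * np.sqrt(periods)
    downside = returns[returns < 0].std() * np.sqrt(periods)
    wealth = np.cumprod(1 + returns)
    drawdown = 1 - wealth / np.maximum.accumulate(wealth)
    mdd = drawdown.max()
    return {"net_perf": mean, "sharpe": mean / vol, "sortino": mean / downside,
            "max_drawdown": mdd, "mdd_over_vol": mdd / vol}

rng = np.random.default_rng(0)
m = evaluation_metrics(rng.normal(0.0005, 0.01, size=1000))
```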

Submitted: November 9, 2020

Detecting and adapting to crisis pattern with context based Deep Reinforcement Learning

Collaboration with the Homa Capital team

Link: SSRN paper 3688353

Authors: Eric Benhamou, David Saltiel, Jean Jacques Ohana  and Jamal Atif

Abstract: Deep reinforcement learning (DRL) has reached superhuman levels in complex tasks like game solving (Go, StarCraft II) and autonomous driving. However, it remains an open question whether DRL can reach human level in applications to financial problems, in particular in detecting crisis patterns and consequently dis-investing. In this paper, we present an innovative DRL framework consisting of two sub-networks, fed respectively with the past performances and standard deviations of portfolio strategies, and with additional contextual features. The second sub-network plays an important role as it captures dependencies on common financial indicators such as risk aversion, the economic surprise index and correlations between assets, which allows taking context-based information into account. We compare different network architectures, either using layers of convolutions to reduce the network's complexity or LSTM blocks to capture time dependency, and examine whether previous allocations matter in the modeling. We also use adversarial training to make the final model more robust. Results on the test set show this approach substantially outperforms traditional portfolio optimization methods like Markowitz and is able to detect and anticipate crises like the current Covid one. Read more
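A minimal numpy sketch of the two-sub-network idea, with random weights and illustrative shapes standing in for the trained convolutional/LSTM layers (all dimensions below are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Illustrative shapes: 4 strategies, 20-day window, 5 context features.
n_strat, window, n_ctx, hidden = 4, 20, 5, 8
W1 = rng.normal(scale=0.1, size=(n_strat * window * 2, hidden))  # branch 1: perf + std
W2 = rng.normal(scale=0.1, size=(n_ctx, hidden))                 # branch 2: context
W_out = rng.normal(scale=0.1, size=(2 * hidden, n_strat))

def allocate(perf, std, context):
    """Two-branch network: strategy performances and standard deviations
    feed one sub-network, contextual indicators (risk aversion, surprise
    index, correlations) the other; the concatenated hidden states
    produce portfolio weights through a softmax head."""
    h1 = np.tanh(np.concatenate([perf.ravel(), std.ravel()]) @ W1)
    h2 = np.tanh(context @ W2)
    return softmax(np.concatenate([h1, h2]) @ W_out)

w = allocate(rng.normal(size=(n_strat, window)),
             rng.normal(size=(n_strat, window)),
             rng.normal(size=n_ctx))
assert abs(w.sum() - 1) < 1e-12      # valid long-only allocation
```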

Submitted: November 9, 2020

Bridging the gap between Markowitz planning and deep reinforcement learning

Collaboration with the Société Générale team

Link: SSRN paper 3702112

Authors: Eric Benhamou, David Saltiel, Sandrine Ungari and Abhishek Mukhopadhyay

Abstract: Researchers in the asset management industry have mostly focused on techniques based on financial and risk planning, like the Markowitz efficient frontier, minimum variance, maximum diversification or equal risk parity. In parallel, another community in machine learning has started working on reinforcement learning, and more particularly deep reinforcement learning, to solve other decision-making problems in challenging tasks like autonomous driving, robot learning and, on a more conceptual side, game solving like Go. This paper aims to bridge the gap between these two approaches by showing that Deep Reinforcement Learning (DRL) techniques can shed new light on portfolio allocation thanks to a more general optimization setting that casts portfolio allocation as an optimal control problem: not just a one-step optimization, but rather a continuous control optimization with a delayed reward. The advantages are numerous: (i) DRL maps market conditions directly to actions by design and hence should adapt to a changing environment, (ii) DRL does not rely on traditional financial risk assumptions, such as risk being represented by variance, (iii) DRL can incorporate additional data and be a multi-input method, as opposed to more traditional optimization methods. We present encouraging experimental results using convolutional networks. Read more
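For reference, the Markowitz-style minimum-variance baseline the paper compares against has a closed form; a minimal sketch (unconstrained, with an invented two-asset covariance matrix):

```python
import numpy as np

def min_variance_weights(cov):
    """Closed-form minimum-variance portfolio: w proportional to
    inv(Sigma) @ 1, normalized to sum to one (no short-sale constraint
    in this one-step sketch)."""
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)
    return w / w.sum()

# Illustrative covariance: asset 0 has 20% vol, asset 1 has 30% vol.
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])
w = min_variance_weights(cov)
assert abs(w.sum() - 1) < 1e-12
```

Unlike this static one-step optimization, the DRL formulation re-decides the allocation at each step from the observed state, which is what casts the problem as continuous control with delayed reward.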

Submitted: 30 September, 2020

Similarities between policy gradient methods in reinforcement and supervised learning

Link: SSRN paper 3391216

Authors: Eric Benhamou, David Saltiel

Abstract: Reinforcement learning (RL) is about sequential decision making and is traditionally opposed to supervised learning (SL) and unsupervised learning (USL). In RL, given the current state, the agent makes a decision that may influence the next state, as opposed to SL where the next state remains the same regardless of the decisions taken. Although this difference is fundamental, SL and RL are not so different. In particular, we emphasize in this paper that policy gradient methods can be cast as an SL problem where true labels are replaced with discounted rewards. We provide a simple experiment where we interchange labels and pseudo rewards to show that SL techniques can be directly translated into RL methods. Read more
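The correspondence can be made concrete: the REINFORCE gradient coincides with the cross-entropy gradient of a supervised classifier whose one-hot labels (the actions taken) are weighted by discounted returns. A minimal numpy sketch, with illustrative shapes and inputs:

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Backward-accumulated discounted returns G_t = r_t + gamma * G_{t+1}."""
    G = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        G[t] = running
    return G

def policy_gradient_as_sl(logits, actions, rewards):
    """REINFORCE gradient written as a supervised cross-entropy gradient:
    grad = (softmax(logits) - onehot(actions)) * G[:, None].
    With G = 1 everywhere, this is exactly the SL classification gradient."""
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    onehot = np.eye(logits.shape[1])[actions]
    G = discounted_returns(rewards)
    return (probs - onehot) * G[:, None]

grad = policy_gradient_as_sl(
    logits=np.zeros((3, 2)),            # uniform policy over 2 actions
    actions=np.array([0, 1, 0]),
    rewards=np.array([1.0, 0.0, 1.0]),
)
# With zero rewards the "labels" carry no weight and the gradient vanishes.
assert np.allclose(policy_gradient_as_sl(np.zeros((2, 2)),
                                         np.array([0, 1]),
                                         np.zeros(2)), 0)
```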

Submitted: 2 May, 2019

Trade Selection with Supervised Learning and Optimal Coordinate Ascent (OCA)

Link: SSRN paper 3298347

Authors: David Saltiel, Eric Benhamou, Rida Laraki  and Jamal Atif

Abstract: Can we dynamically extract information and strong relationships between financial features in order to select financial trades over time? Despite the advent of representation learning and end-to-end approaches, mainly through deep learning, feature selection remains a key point in many machine learning scenarios. This paper introduces a new theoretically motivated method for feature selection. The approach, which fits within the family of embedded methods, casts the feature selection conundrum as a coordinate ascent optimization with variable dependencies materialized by block variables. Thanks to a limited number of iterations, it proves efficient for gradient boosting methods, implemented with XGBoost. In the case of convex and smooth functions, we are able to prove that the convergence rate is polynomial in the dimension of the full feature set. We provide comparisons with state-of-the-art methods, Recursive Feature Elimination and Binary Coordinate Ascent, and show that this method is competitive when selecting financial trades. Read more
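A minimal sketch of binary coordinate ascent over feature blocks, in the spirit of the method: the toy score function and blocks below are invented for illustration, whereas the paper's OCA scores subsets with a gradient-boosting model such as XGBoost:

```python
import numpy as np

def coordinate_ascent_selection(score, n_features, blocks, n_iter=5):
    """Binary feature selection by block coordinate ascent: features are
    grouped into dependent blocks, and each block's inclusion bits are
    optimized in turn while the other blocks stay fixed.
    `score(mask)` evaluates a boolean inclusion mask (higher is better)."""
    mask = np.ones(n_features, dtype=bool)
    for _ in range(n_iter):
        improved = False
        for block in blocks:
            for j in block:
                flipped = mask.copy()
                flipped[j] = not flipped[j]       # try toggling feature j
                if score(flipped) > score(mask):
                    mask = flipped
                    improved = True
        if not improved:                          # converged
            break
    return mask

# Toy score: features 0 and 2 help, feature 1 is a pure noise penalty.
gains = np.array([1.0, -0.5, 0.8])
mask = coordinate_ascent_selection(lambda m: gains[m].sum(), 3,
                                   blocks=[[0, 1], [2]])
assert mask.tolist() == [True, False, True]   # the noisy feature is dropped
```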

Submitted: September 2020

Deep Reinforcement Learning for Portfolio Selection

Collaboration with the Homa Capital team

Link: SSRN paper 3871070

Authors: Eric Benhamou, David Saltiel, Jean Jacques Ohana, Jamal Atif  and Rida Laraki

Abstract: Deep reinforcement learning (DRL) has reached an unprecedented level on complex tasks like game solving (Go, StarCraft II) and autonomous driving. However, applications to real financial assets are still largely unexplored, and it remains an open question whether DRL can reach superhuman level. In this demo, we showcase state-of-the-art DRL methods for selecting portfolios according to the financial environment, with a final network concatenating three individual networks that use layers of convolutions to reduce network complexity. The multiple inputs of our network enable capturing dependencies from common financial indicator features like risk aversion and the Citigroup surprise index, portfolio-specific features, and previous portfolio allocations. Results on the test set show this approach can outperform traditional portfolio optimization methods, with results available at our demo website. Read more

Submitted: September 2020