(18127047 20127261 20127547 20127672) - Slide

FORECASTING DIRECTIONAL MOVEMENTS OF STOCK PRICES FOR INTRADAY TRADING USING LSTM AND RANDOM FORESTS

18127047 – LÊ HOÀNG LONG


20127261 – NGUYỄN KHÔI NGUYÊN

20127547 – PHAN THÀNH LẬP


20127672 – VŨ MẠNH QUÂN
MAIN FLOW
• Introduction
• Data and technology
• Methodology
• Result and Discussion
I. INTRODUCTION
• Compare LSTM and RF in the single-feature setting (Krauss et al., 2017; Fischer & Krauss, 2018) with a multi-feature setting
• The multi-feature setting consists of features based on closing prices, opening prices and intraday returns
II. DATA AND TECHNOLOGY
• Collecting adjusted closing prices and opening prices of all constituent stocks of the S&P 500 for the period from January 1990 to December 2018.
• Simulation and code development: Python with TensorFlow and scikit-learn.
• Visualization and calculation of statistical values: MATLAB R2016b's toolbox.
• Hardware: an NVIDIA Tesla V100 with 30 GB memory.
III. METHODOLOGY
1. Dividing raw data
2. Introducing features
3. Setting targets
4. Modeling (RF and CuDNNLSTM)
5. Establishing the trading strategy for the trading part
1. DATASET CREATION
• Following the procedure of Krauss et al. (2017) and Fischer & Krauss (2018).
• Using a 4-year window and 1-year stride

Source: report by Pushpendu Ghosh, Ariel Neufeld and Jajati Keshari Sahoo
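The rolling-window scheme can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `study_periods` is hypothetical, and the split of each 4-year window into 3 training years followed by 1 trading year is an assumption that the slides do not state.

```python
def study_periods(first_year, last_year, window=4, stride=1, train_years=3):
    """Return a list of (training_years, trading_years) pairs.

    Assumption: the first `train_years` years of each window are used for
    training and the remaining years for trading.
    """
    periods = []
    start = first_year
    while start + window - 1 <= last_year:
        years = list(range(start, start + window))
        periods.append((years[:train_years], years[train_years:]))
        start += stride  # move the window forward by one stride
    return periods


periods = study_periods(1990, 2018)
print(len(periods))   # 26
print(periods[0])     # ([1990, 1991, 1992], [1993])
print(periods[-1])    # ([2015, 2016, 2017], [2018])
```

With a stride equal to the length of the trading part, the trading years of consecutive windows do not overlap.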
2. SELECTING FEATURES
• T_study: number of days in a study period
• n_i: number of stocks in S for study period i
• cp_t^(s) and op_t^(s): adjusted closing price and opening price of stock s ∈ S at time t
2. SELECTING FEATURES
• Input: opening prices op_t^(s) (t ∈ {0, 1, ..., τ − 1, τ}) and adjusted closing prices cp_t^(s) (t ∈ {0, 1, ..., τ − 1})
• Task: out of all n stocks, predict the k stocks with the highest and the k stocks with the lowest intraday return
2.1. FEATURE GENERATION FOR RANDOM FOREST
2.2. FEATURE GENERATION FOR LSTM
Following the approach of Fischer & Krauss (2018)
Using the multi-feature setting
Model input: 240 timesteps and 3 features
Predict the direction of the 241st intraday return
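A minimal sketch of how overlapping input blocks of 240 timesteps and 3 features could be cut from one stock's feature history. The helper name `make_sequences` and the toy data are hypothetical. Pairing each block with the direction of the intraday return that follows it is left out here.

```python
import numpy as np


def make_sequences(features, timesteps=240):
    """Cut overlapping windows from a (n_days, n_features) array.

    Returns an array of shape (n_days - timesteps + 1, timesteps, n_features).
    """
    n_days, n_features = features.shape
    n_windows = n_days - timesteps + 1
    if n_windows <= 0:
        # Not enough history for a single window
        return np.empty((0, timesteps, n_features))
    return np.stack([features[i:i + timesteps] for i in range(n_windows)])


# Toy data: 250 days, 3 features per day
toy = np.arange(250 * 3, dtype=float).reshape(250, 3)
X = make_sequences(toy)
print(X.shape)  # (11, 240, 3)
```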
2.2. FEATURE GENERATION FOR LSTM

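Three features: intraday returns (ir), returns with respect to the closing price (cr) and returns with respect to the opening price (or)
Robust Scaler standardization: first subtract the median, then scale the data by the interquartile range. This makes the features robust to outliers.
f~ = (f − Q2(f)) / (Q3(f) − Q1(f))
where Q1(f), Q2(f) and Q3(f) are the first, second and third quartiles of f, for each feature f ∈ {ir, cr, or} in the respective training period.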
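The robust scaling step can be sketched in NumPy as below. The function names are hypothetical. Following the slide, the quartiles are estimated on the training period only and then reused for other data. A zero interquartile range is not handled in this sketch.

```python
import numpy as np


def fit_robust_scaler(train):
    """Estimate per-feature median and interquartile range on training data.

    train: array of shape (n_samples, n_features).
    """
    q1, q2, q3 = np.percentile(train, [25, 50, 75], axis=0)
    return q2, q3 - q1


def apply_robust_scaler(x, median, iqr):
    """Subtract the training median and divide by the training IQR."""
    return (x - median) / iqr


# Toy example with one feature and one large outlier
train = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])
median, iqr = fit_robust_scaler(train)
scaled = apply_robust_scaler(train, median, iqr)
print(median, iqr)      # [3.] [2.]
print(scaled.ravel())   # [-1.  -0.5  0.   0.5 48.5]
```

The outlier barely affects the median and the interquartile range, so the remaining values keep a sensible scale. scikit-learn's `RobustScaler` offers the same transformation with its default settings.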


2.3. TRAIN-TEST SPLITTING
3. SETTING TARGET
• Each stock at time t is assigned to one of 2 classes, based on its intraday return.
• Class 0: intraday return below the cross-sectional median intraday return of all stocks at time t
• Class 1: intraday return above the cross-sectional median intraday return of all stocks at time t
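A minimal pandas sketch of this labelling rule. The function name `make_targets` and the toy returns are hypothetical. Assigning a return exactly equal to the median to class 0 is an assumption, since the slide does not cover ties.

```python
import pandas as pd


def make_targets(intraday_returns):
    """Label each stock per day: 1 if above that day's cross-sectional median.

    intraday_returns: DataFrame with dates as rows and stocks as columns.
    Assumption: values equal to the median fall into class 0.
    """
    median = intraday_returns.median(axis=1)
    return intraday_returns.gt(median, axis=0).astype(int)


ir = pd.DataFrame(
    {"A": [0.01, -0.02], "B": [0.03, 0.00], "C": [-0.01, 0.01], "D": [0.02, -0.01]},
    index=["d1", "d2"],
)
targets = make_targets(ir)
print(targets.loc["d1"].tolist())  # [0, 1, 0, 1]
print(targets.loc["d2"].tolist())  # [0, 1, 1, 0]
```

Using the cross-sectional median as the threshold keeps the two classes roughly balanced on every day.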
4. MODELING

• Random forest:

• Number of decision trees in the forest = 1000


• Max depth of each tree = 10
• For every split, select m = ⌊√p⌋ features randomly from the p = 93 features in the data
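These hyperparameters map onto scikit-learn's `RandomForestClassifier` roughly as sketched below. The helper `make_forest`, the random seed, `n_jobs=-1` and the synthetic data are assumptions for illustration. The demo fit uses fewer trees only to keep it fast.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def make_forest(n_trees=1000, max_depth=10, seed=0):
    """Random forest with the settings listed on the slide."""
    return RandomForestClassifier(
        n_estimators=n_trees,   # number of decision trees
        max_depth=max_depth,    # maximum depth of each tree
        max_features="sqrt",    # floor(sqrt(p)) candidate features per split
        n_jobs=-1,              # assumption: use all CPU cores
        random_state=seed,      # assumption: fixed seed for reproducibility
    )


# Synthetic data with p = 93 features, for illustration only
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 93))
y = (X[:, 0] > 0).astype(int)

forest = make_forest(n_trees=50).fit(X, y)
proba = forest.predict_proba(X[:5])          # class probabilities per sample
print(forest.estimators_[0].max_features_)   # 9, i.e. floor(sqrt(93))
```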
4. MODELING

• LSTM:

• Loss function: categorical cross-entropy


• Optimizer: RMSProp (with the Keras default learning rate of 0.001)
• Batch size: 512
• Early stopping: patience of 10 epochs, monitoring the validation loss
• Validation split: 0.2
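A hedged Keras sketch of a network trained with these settings. The layer sizes (`n_cells=25`, `dropout=0.1`), the two-node softmax output and the `epochs` cap are assumptions not given on the slide, and `build_lstm` and `train` are hypothetical names. In TensorFlow 2, `tf.keras.layers.LSTM` with default arguments can use the cuDNN kernel on a GPU, which stands in for the CuDNNLSTM layer named in the methodology.

```python
import numpy as np
import tensorflow as tf


def build_lstm(timesteps=240, n_features=3, n_cells=25, dropout=0.1):
    """LSTM classifier; n_cells and dropout are illustrative assumptions."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(timesteps, n_features)),
        tf.keras.layers.LSTM(n_cells),
        tf.keras.layers.Dropout(dropout),
        tf.keras.layers.Dense(2, activation="softmax"),  # class 0 / class 1
    ])
    model.compile(
        loss="categorical_crossentropy",
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
        metrics=["accuracy"],
    )
    return model


def train(model, X, y_onehot, epochs=1000):
    """Fit with the batch size, validation split and early stopping above."""
    early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10)
    return model.fit(
        X, y_onehot,
        batch_size=512,
        validation_split=0.2,
        epochs=epochs,          # assumption: upper bound, early stopping ends training
        callbacks=[early_stop],
        verbose=0,
    )


model = build_lstm()
probs = model.predict(np.zeros((4, 240, 3), dtype="float32"), verbose=0)
print(probs.shape)  # (4, 2)
```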
5. ESTABLISHING THE TRADING STRATEGY FOR THE TRADING PART
• Predict, for each stock s, the probability of outperforming the cross-sectional median intraday return
• Go long the 10 stocks with the highest predicted probability
• Go short the 10 stocks with the lowest predicted probability
• Each transaction incurs a 0.05% slippage cost, so opening and closing both the long and the short positions is penalized with a total of 0.2% per day
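The long-short rule can be sketched for a single day as below. The function name and toy numbers are hypothetical. Equal weighting within each leg, defining the gross return as the mean long return minus the mean short return, and subtracting the 0.2% cost as a flat daily amount are assumptions for illustration.

```python
import pandas as pd


def daily_long_short_return(prob_up, intraday_ret, k=10, cost=0.002):
    """Net return of a k-long / k-short portfolio for one day.

    prob_up:      Series of predicted probabilities of beating the median, by stock.
    intraday_ret: Series of realized intraday returns, by stock.
    Assumption: equal weights per leg and a flat total cost of 0.2% per day.
    """
    ranked = prob_up.sort_values(ascending=False)
    longs = ranked.index[:k]     # highest predicted probabilities
    shorts = ranked.index[-k:]   # lowest predicted probabilities
    gross = intraday_ret[longs].mean() - intraday_ret[shorts].mean()
    return gross - cost


# Toy example with k = 1
prob = pd.Series({"A": 0.9, "B": 0.5, "C": 0.1})
ret = pd.Series({"A": 0.01, "B": 0.00, "C": -0.02})
net = daily_long_short_return(prob, ret, k=1)
print(round(net, 4))  # 0.028
```

The total of 0.2% corresponds to four transactions at 0.05% each: opening and closing the long leg, and opening and closing the short leg.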
IV. RESULTS AND DISCUSSION
• The multi-feature setting outperforms the single-feature setting of Krauss et al.
• It obtains a higher Sharpe ratio and a lower standard deviation.
• It produces a lower maximum drawdown and a lower daily value at risk (VaR).
• LSTM outperforms random forests.
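For reference, the risk measures named above could be computed from a series of daily returns as in this sketch. The exact definitions used in the study are not given on the slides. The annualization factor of 252, the zero risk-free rate and the historical-quantile VaR are assumptions.

```python
import numpy as np


def sharpe_ratio(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    r = np.asarray(daily_returns, dtype=float)
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)


def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the compounded wealth curve."""
    r = np.asarray(daily_returns, dtype=float)
    wealth = np.concatenate(([1.0], np.cumprod(1.0 + r)))  # start from 1
    running_peak = np.maximum.accumulate(wealth)
    return float(np.max(1.0 - wealth / running_peak))


def value_at_risk(daily_returns, level=0.05):
    """Historical VaR: the loss at the given lower quantile of returns."""
    r = np.asarray(daily_returns, dtype=float)
    return float(-np.quantile(r, level))


print(max_drawdown([0.10, -0.50, 0.20]))  # 0.5
```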
THANKS FOR WATCHING
