
STOCK PRICE PREDICTION USING MACHINE LEARNING

BY

UDEH ONYEDIKACHI PETER

U17/NAS/CSC/251

COMPUTER SCIENCE

COMPUTER SCIENCE/MATHEMATICS

FACULTY OF NATURAL SCIENCES AND

ENVIRONMENTAL STUDIES

GODFREY OKOYE UNIVERSITY, ENUGU NIGERIA


STOCK PRICE PREDICTION USING MACHINE LEARNING

BY

UDEH ONYEDIKACHI PETER

U17/NAS/CSC/251

A RESEARCH SUBMITTED TO THE DEPARTMENT OF COMPUTER

SCIENCE AND MATHEMATICS, FACULTY OF NATURAL SCIENCES

AND ENVIRONMENTAL STUDIES, GODFREY OKOYE UNIVERSITY,

ENUGU.

IN PARTIAL FULFILMENT OF THE REQUIREMENT FOR THE AWARD

OF BACHELOR OF SCIENCE (B.Sc) DEGREE IN COMPUTER SCIENCE

SUPERVISOR: Dr. S.C. Echezona

JULY, 2021
APPROVAL

This research titled STOCK PRICE PREDICTION USING MACHINE LEARNING has been

assessed and approved by the Committees of the Department of Computer Science and Faculty

of Natural Sciences and Environmental Studies, Godfrey Okoye University Enugu.

Dr. S.C. Echezona ……………………….. ……………………………..

Supervisor Signature Date

Dr. J.B Agbogun ……………………….. ……………………………..

Head of Department Signature Date

Assoc. Prof M.N. Unachukwu ……………………….. ……………………..

Dean, FNSES Signature Date

Prof H.C. Inyiama ……………………….. ……………………………..

External Examiner Signature Date

CERTIFICATION

This is to certify that this research titled STOCK PRICE PREDICTION USING MACHINE

LEARNING was carried out by UDEH ONYEDIKACHI PETER with Registration Number

U17/NAS/CSC/251 under supervision in the Department of Computer Science, Godfrey Okoye

University Enugu.

Dr. S.C. Echezona ……………………….. ……………………………..

Supervisor Signature Date

Dr. J.B Agbogun ……………………….. ……………………………..

Head of Department Signature Date

DEDICATION

This research work is dedicated to Almighty God, who gave me the wisdom, understanding and

patience to see this project from the beginning to the end.

ACKNOWLEDGEMENTS

With all my heart, I appreciate God Almighty for his protection and guidance and for seeing this

research work from the beginning to the end. I wish to express my heartfelt gratitude to my able

supervisor Dr. S.C. Echezona who made this research work a reality through his dedication and

corrections. Special thanks to my Head of Department Dr J.B Agbogun and all the Lecturers and

Staff in the Computer science/Maths department for their contributions to my academic life. I

express my immense gratitude to my lovely parents for their unfailing support, words of

encouragement and prayers. I say a big thank you for your immense support.

ABSTRACT

This study lays the groundwork for democratizing machine learning technology for retail
investors by connecting machine learning models to a web application built for them. It provides
predictions and visualizations to help investors navigate the stock markets and make more
intelligent decisions. Prophet, Ridge Regression, Recurrent Neural Networks, and Bagging
Regression are among the stock price prediction algorithms and models created. A wide variety
of basic features, such as technical analysis features computed over distinct historical windows,
are derived from the time series data. Several feature selection and feature extraction approaches
are used to find the best features for the problem. The major technologies used for the web
application include Streamlit, SQLite and Bootstrap.

TABLE OF CONTENTS

Title page

Approval…………………………………………………………………………………………...i

Certification……………………………………………………………………………………….ii

Dedication………………………………………………………………………………………...iii

Acknowledgements……………………………………………………………………………….iv

Abstract…………………………………………………………………………………………....v

Table of contents………………………………………………………………………………….vi

List of tables……………………………………………………………………………………...ix

List of figures……………………………………………………………………………………...x

1.0 CHAPTER ONE: INTRODUCTION………………………………………………………1

1.1 Background of the Study……………………………………………………………………...1

1.2 Statement of the Problem……………………………………………………………………...2

1.3 Aim and Objectives…………………………………………………………………………....2

1.4 Significance of the Study……………………………………………………………………...3

1.5 Scope of the Study…………………………………………………………………………….3

2.0 CHAPTER TWO: LITERATURE REVIEW……………………………………………...5

2.1 Introduction…………………………………………………………………………………....5

2.2 Theoretical Background……………………………………………………………………….5

2.3 Review of Related Literature………………………………………………………………...16

3.0 CHAPTER THREE: SYSTEM ANALYSIS AND DESIGN…………………………….19

3.1 Overview……………………………………………………………………………………..19

3.2 Analysis of the Existing System……………………………………………………………..19

3.3 Analysis of Proposed System………………………………………………………………..20

3.4 Design of Proposed System………………………………………………………………….20

4.0 CHAPTER FOUR: IMPLEMENTATION………………………………………………..23

4.1 Preamble……………………………………………………………………………………..23

4.2 Choice of Development Environment……………………………………………………….23

4.3 Software Testing……………………………………………………………………………..24

4.4 User Manual………………………………………………………………………………….27

5.0 CHAPTER FIVE: CONCLUSION AND RECOMMENDATIONS…………………….28

5.1 Summary……………………………………………………………………………………..28

5.2 Conclusion…………………………………………………………………………………...28

5.3 Recommendation……………………………………………....…………………………….28

REFERENCE…………………………………………………………………………………….29

Appendix A………………………………………………………………………………………32

Appendix B……………………....……………………………………………………………....61

LIST OF TABLES

Table 2.1: The train time, accuracy score, mean squared error (MSE) of both the train and

test data from each of the models used in this study………………………………………….11

LIST OF FIGURES

Figure 2.1: A line plot of the actual price versus the predicted price using Ridge Regression

model…………………………………………………………………………………………….12

Figure 2.2: A line plot of the actual price versus the predicted price using the Prophet

model…………………………………………………………………………………………….12

Figure 2.3: A line plot of the actual price versus the predicted price using Bagging

Regression model……………………………………………………………………………….13

Figure 2.4: A line plot of the actual price versus the predicted price using LSTM model...13

Figure 3.1: A use case diagram of the proposed system……………………………………...21

Figure 3.2: Activity diagram of the proposed system………………………………………...22

Figure 4.1: Different stock tickers of different companies to select from…………………...25

Figure 4.2: The raw data and interactive visualization of the raw data…………………….25

Figure 4.3: The forecast data and interactive visualization of the forecast data…………...26

Figure 4.4: Visualized stock trends……………………………………………………………26

CHAPTER ONE

INTRODUCTION

1.1 Background of the Study

In recent years, economists and investors have made many attempts to predict the behavior of
bonds, currencies, stocks, stock markets and other economic markets. These attempts are
motivated by how uncertainly such markets behave.

Retail investors spend a lot of time trying to find investment opportunities. Wealthier investors
can seek professional financial advisory services, but this is rarely an option for retail investors
because the costs are prohibitive. Retail investors therefore have to study the market themselves
and make informed decisions on their own, which makes investing very stressful for them.

Unfortunately, humans are irrational by nature. Without quantitative, data-driven models,
decisions get swayed by cognitive biases or personal emotions, leading to avoidable losses. Even
when retail investors are watchful, most do not have the skills needed to process the huge
volume of data required to make good investment decisions. Institutional investors rely on
sophisticated models backed by technology to avoid such traps, but retail investors do not have
access to these technologies and often find themselves falling behind the market.

Without access to quantitative and data-driven models, one obvious approach retail investors
could use to gauge the market is through simple indicators, for instance, linear regression,
Bollinger Bands and the exponential moving average (EMA). Another common approach retail
investors might use to predict the stock market is to draw a linear regression line that connects
the maxima or minima of candlesticks.

Inspired by the increasing popularity of machine learning algorithms in forecasting applications,
these algorithms may serve as tools to discover hidden patterns within the trend of stock prices,
and this information could supply extra insights for retail investors when making investment
decisions. This final year project therefore aims to investigate the usefulness of machine learning
in predicting stock prices and to democratize such technologies through an easy-to-use interface
for the general public.

1.2 Statement of the Problem

Various prediction techniques have been studied in the field of stock market prediction, and
researchers continue to apply the latest techniques to improve stock price prediction models.

Retail investors often find it difficult to obtain professional financial advisory services because
the costs are prohibitive. The main goal of this research is to provide retail investors with a web
application that uses machine learning to help them navigate the fast-changing stock market. The
objective of the project is to introduce and democratize current Machine Learning and Deep
Learning technologies for retail investors to help them make investment decisions. No prediction
is 100% accurate. Therefore, the upper bound and lower bound of the stock prices will be
displayed to illustrate the trading range the investors should be looking at. This application is an
additional quantitative tool that lets investors view the market from a different perspective with
the help of machine learning.

1.3 Aim and Objectives of the Study

The main aim of the study is to develop a web application that provides stock price prediction

based on the latest machine learning technologies to retail investors.

Specific Objectives:

i. Explore how different machine learning approaches can be used and will affect the

accuracy of stock price predictions.

ii. Investigate how different hyperparameters can be tuned for better performance of the

machine learning models.

iii. Develop a web application to display the predictions in an intuitive way.

iv. Visualize the prediction result in the web application.

1.4 Significance of the Study

This research is carried out with the main objective of introducing and democratizing the latest
machine learning technologies and helping retail investors make investment decisions. Based on
the results obtained, it is hoped that this study will:

1. Provide retail investors with access to quantitative and data-driven models useful for making
investment decisions.

2. Encourage more studies on machine learning in terms of different model selections for stock
prediction or stock selection.

3. Provide a basis for researchers who are interested in applying machine learning algorithms to
fundamental data such as accounting information and other financial data.

1.5 Scope of the Study

1. The scope of this project does not exceed a generalized suggestion tool.

2. No prediction is 100% accurate, since there are many parameters that can directly affect the
stock market and not every one of them can be taken into account.

3. This system is limited to users that have some knowledge of the stock market.
CHAPTER TWO

LITERATURE REVIEW

2.1 Introduction

This chapter outlines the various technologies used in this study. It also presents related
literature from different authors.

2.2 Theoretical Background

This project is partitioned into two sections: a research segment and an application segment. The
Machine Learning and Deep Learning algorithms used in the research part include the Ridge
Linear Regression model, the BaggingRegressor model, LSTM, Prophet and the
AdaBoostRegressor model, while the major technologies used for the application part include
Streamlit, SQLite and Bootstrap.

2.2.1 Python

Python is one of the most powerful and most popular programming languages for scientific

computing. It is easy to learn, has efficient data structures, and a simple but effective approach to

object-oriented programming. Python’s elegant syntax and dynamic typing, together with its

interpreted nature, make it an ideal language for scripting and rapid application development in

many areas on most platforms (Rossum 2020). In this project, Python packages/libraries will be
used for the Machine Learning predictions. The server-side web framework that will be used to
serve the predicted prices is also written in Python.

2.2.2 Streamlit

Streamlit is an open-source software framework for deploying Data Science and Machine

Learning projects. It provides for simple data optimization, deployment, and statistical analysis

with very little code. It also eliminates the need for prior experience with web service

frameworks such as Django and Flask. This is especially handy when working on data

dashboards with a team that is mainly made up of non-technical people. Streamlit is simple to

use because it builds an interactive data-driven web application using predefined commands.

Simple instructions like st.write() may now be used to build a wide range of objects, from simple

text to pandas dataframes and matplotlib visualizations (Saxena et al 2020).
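
As a rough sketch (not the project's exact code, and with placeholder tickers and prices), a few Streamlit commands are enough to build such a data-driven page:

import streamlit as st
import pandas as pd

st.title("Stock Price Prediction")  # page title
ticker = st.selectbox("Select a stock ticker", ["GOOG", "AAPL", "MSFT"])  # illustrative tickers
df = pd.DataFrame({"close": [101.2, 102.8, 104.1, 103.5]})  # placeholder price data
st.write(f"Showing data for {ticker}")  # st.write renders text, dataframes and more
st.write(df)        # interactive table
st.line_chart(df)   # simple line chart of the close prices

Such a script would be launched with "streamlit run app.py", in the same way the final application is launched in Chapter Four.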

2.2.3 SQLite

SQLite is a relational database management system (RDBMS) that will be used in this project to

store user data and preferences about their choice of stocks. SQLite is an open source embedded

relational database designed to provide a convenient way for applications to manage data without

the overhead that often comes with dedicated relational database systems (Owens 2006). SQLite

is easy to configure, easy to use, highly portable and efficient compared to other relational

database management systems. SQLite is serverless, meaning that it does not need a server to operate.

The database storage file is accessed directly using the SQLite library.
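
As a minimal sketch of how the application can store a user's preferred tickers with Python's built-in sqlite3 module (the database file, table and column names here are illustrative, not the project's actual schema):

import sqlite3

conn = sqlite3.connect("users.db")  # the whole database lives in this single file
conn.execute("CREATE TABLE IF NOT EXISTS preferences (username TEXT, ticker TEXT)")
conn.execute("INSERT INTO preferences VALUES (?, ?)", ("demo_user", "GOOG"))
conn.commit()
rows = conn.execute("SELECT ticker FROM preferences WHERE username = ?", ("demo_user",)).fetchall()
print(rows)  # [('GOOG',)]
conn.close()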

2.2.4 Bootstrap

Bootstrap is now the most widely used HTML, CSS, and JS framework for creating responsive,

mobile-first web projects. It is used to speed up and simplify front-end web development. It's

designed for people of all skill levels, devices of all sizes, and projects of all sizes. In this project,

Bootstrap will be used as the front end technology for displaying the Machine Learning predicted

results and other features the user will be seeing on the web page.

2.2.5 Machine learning/ Deep learning algorithms

This study will explore different Machine Learning models, how they can be used and how they

will affect the accuracy of the stock price prediction. The Machine Learning algorithms used in

this project are part of the Scikit-learn library while the Deep Learning algorithms are part of the

PyTorch library.

2.2.5.1 Scikit learn

Scikit-learn is an open source Python library that provides supervised and unsupervised Machine

Learning algorithms. Scikit-learn harnesses this rich environment to provide state-of-the-art

implementations of many well known machine learning algorithms, while maintaining an

easy-to-use interface tightly integrated with the Python programming language (Pedregosa et al,

2011). Some of the functionalities provided by the Scikit-learn library include regression,
classification, clustering, model selection and preprocessing.

2.2.5.2 PyTorch

PyTorch is a Deep Learning framework developed by Facebook's AI Research lab (FAIR).

PyTorch is open source and it focuses on both usability and speed. PyTorch offers an imperative

and Pythonic programming style that supports code as a model, simplifies debugging, and is

consistent with other widely known machine learning libraries, all while remaining efficient and

continuing to support hardware accelerators like GPUs (Paszke 2019). PyTorch is easy to learn,

simple to use and more pythonic compared to other Deep Learning frameworks like Tensorflow.

PyTorch has a very special feature called data parallelism which allows it to distribute

computational workload among multiple CPU or GPU cores. Some of the Deep Learning
algorithms that can be built with PyTorch include the Recurrent Neural Network (RNN), the
Convolutional Neural Network (CNN), the Artificial Neural Network (ANN) and LSTM.

2.2.5.3 Ridge Linear Regression Model

Ridge Regression (also called Tikhonov regularization) is a regularized version of Linear
Regression: a regularization term equal to α Σ θi² (the sum of the squared model weights θi,
scaled by α) is added to the cost function. This forces the learning algorithm not only to fit the
data but also to keep the model weights as small as possible (Aurélien 2019). The parameter α
determines how much the model should be regularized. If α = 0, Ridge Regression is simply
Linear Regression. If α is very large, all weights end up very close to zero, resulting in a flat line
passing through the mean of the data. Ridge Regression is a model tuning technique used to
analyze data with multicollinearity; it performs L2 regularization. When there is a problem with
multicollinearity, least-squares estimates are unbiased but their variances are large, resulting in
predicted values that are far from the actual values.
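
A minimal sketch of Ridge Regression with Scikit-learn on toy data, showing how the regularization strength α (the alpha argument) shrinks the learned weights:

import numpy as np
from sklearn.linear_model import Ridge

X = np.array([[1.0], [2.0], [3.0], [4.0]])  # toy feature
y = np.array([1.1, 1.9, 3.2, 3.9])          # toy target

for alpha in (0.01, 1.0, 100.0):
    model = Ridge(alpha=alpha).fit(X, y)
    print(alpha, model.coef_, model.intercept_)  # larger alpha -> smaller weights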

2.2.5.4 Prophet Model

The Prophet model is a relatively new methodology developed by Facebook researchers.

Because of its structure of adjusting parameters without investigating the original model's details,

it is a simple yet robust estimation method. It includes a time series model that can be

decomposed into three main model components: trend, holidays, and seasonality. In a recent

study “Long-Term Forecasting of Electrical Loads in Kuwait Using Prophet and Holt–Winters

Models” by Almazrouee et al. (2020), the Prophet model outperformed the well-established

Holt–Winters model in Kuwait’s long-term peak load forecasting. The use of this method

in forecasting is expected to spread due to its robustness and accuracy. Prophet is a method

for forecasting time series data that uses an additive model to fit non-linear trends with yearly,

weekly, and daily seasonality, as well as holiday effects. It works best with time series that have

strong seasonal effects and historical data from multiple seasons.
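
A minimal sketch of fitting Prophet to a closing-price series; Prophet expects a dataframe with a 'ds' (date) column and a 'y' (value) column, and the data below is purely illustrative:

import pandas as pd
from fbprophet import Prophet  # published as 'prophet' in newer releases

df = pd.DataFrame({
    "ds": pd.date_range("2020-01-01", periods=100, freq="D"),
    "y": [100 + 0.5 * i for i in range(100)],  # placeholder prices
})
model = Prophet()
model.fit(df)
future = model.make_future_dataframe(periods=30)  # 30 additional days to forecast
forecast = model.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())

Columns such as yhat_lower and yhat_upper provide the kind of lower and upper bounds that Chapter One proposes displaying to investors.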

2.2.5.5 Bagging Regression Model

Bagging Regression takes its name from Bootstrap aggregating, an ensemble approach that
combines several classification or regression models with the aim of reducing the variance of the
prediction process (Sadrmomtazi 2013). Bagging is built on the development of individual
regression models, each trained on a randomly drawn training set of N instances (where N is the
size of the original training set). In each such bootstrap sample, a significant number of the
original instances may be repeated while others are omitted entirely. After iteratively building
several regression models, the average of their prediction values is used as the final prediction.
The approach has also long been regarded as a sensible way of handling missing values in
datasets used for prediction.
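
A minimal sketch of bagging with Scikit-learn on synthetic data: ten LinearRegression base models are each trained on a bootstrap sample and their predictions are averaged:

import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
X = np.arange(20, dtype=float).reshape(-1, 1)  # toy feature
y = 2.0 * X.ravel() + rng.randn(20)            # toy noisy target

bag = BaggingRegressor(LinearRegression(), n_estimators=10, random_state=0)
bag.fit(X, y)
print(bag.predict([[21.0]]))  # the average of the ten base models' predictions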

2.2.5.6 Long Short-Term Memory (LSTM)

A recurrent neural network (RNN) is a type of artificial neural network that recognizes

sequential patterns in data to predict the following scenarios (Laskowski 2018). This architecture

is particularly powerful due to its node connections, which enable the display of temporal

dynamic behavior. The use of feedback loops to process a sequence is another important feature

of this architecture. Such a characteristic allows information to persist, which is commonly

referred to as memory. Because of this behavior, RNNs are ideal for time series problems. Long

short-term memory (LSTM) architectures were developed based on this structure. LSTMs are

specifically designed to avoid the long-term dependency problem. Remembering information for
long periods of time is practically their default behaviour (Olah 2015). They are now widely used
and perform exceptionally well in a wide range of situations.
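
A minimal sketch of an LSTM layer in PyTorch processing a batch of sequences (the dimensions are illustrative; the full model used in this study is listed in Appendix A):

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, num_layers=2, batch_first=True)
x = torch.randn(32, 10, 8)   # (batch size, sequence length, features)
out, (h_n, c_n) = lstm(x)    # out: (32, 10, 16); h_n and c_n: (2, 32, 16)
print(out.shape, h_n.shape, c_n.shape)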

2.2.5.7 The performance of the Machine Learning algorithms used

The dataset used for this research is OHLCV (Open, High, Low, Close, Volume) historical data
fetched using the Yahoo Finance API. Other features were extracted from the fetched dataset.
The features used are the closing price of each trading day, the volume of stock traded on each
trading day, the S&P 500 index, the Bollinger Bands, and the rolling mean (Simple Moving
Average, SMA) and rolling standard deviation from which they are derived.
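
As a sketch (not the exact project code) of how such features are derived with pandas, using an illustrative five-day window and made-up prices:

import pandas as pd

close = pd.Series([100, 101, 103, 102, 105, 107, 106, 108, 110, 109], dtype=float)
sma = close.rolling(window=5).mean()  # simple moving average (rolling mean)
std = close.rolling(window=5).std()   # rolling standard deviation
upper_band = sma + 2 * std            # upper Bollinger Band
lower_band = sma - 2 * std            # lower Bollinger Band
print(pd.DataFrame({"close": close, "sma": sma, "upper": upper_band, "lower": lower_band}))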

After data extraction, the next step was feature scaling. The purpose of feature scaling is to put
every feature on the same footing without giving any of them undue importance up front.
Another reason feature scaling is applied is that some algorithms, such as neural networks trained
with gradient descent, converge much faster with feature scaling than without it (Roy 2020). The
MinMaxScaler class from the Scikit-learn library was used to perform the feature scaling; it
scales the data using a technique called normalization, in which the values are shifted and
rescaled so that they end up ranging between 0 and 1.
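
For reference, MinMaxScaler applies x_scaled = (x - x_min) / (x_max - x_min) to each feature column, as this small sketch with toy prices shows:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

prices = np.array([[100.0], [105.0], [110.0], [120.0]])  # toy feature column
scaler = MinMaxScaler()
scaled = scaler.fit_transform(prices)  # every value now lies between 0 and 1
print(scaled.ravel())                  # [0.   0.25 0.5  1.  ]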

After feature scaling, the scaled data is used to train each model. After training, the mean squared
error was used to measure the average of the squares of the errors, i.e. the average squared
difference between the estimated values and the actual values.
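
Concretely, MSE = (1/n) * Σ (actual - predicted)², which Scikit-learn computes as follows (toy values):

from sklearn.metrics import mean_squared_error

y_actual = [102.0, 104.0, 103.0]
y_predicted = [101.0, 105.5, 102.5]
print(mean_squared_error(y_actual, y_predicted))  # (1.0 + 2.25 + 0.25) / 3 = 1.1666...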

Table 2.1: The train time, accuracy score, mean squared error (MSE) of both the train and

test data from each of the models used in this study.

             Ridge Regression   Prophet       Bagging Regression   LSTM
Train MSE    875.77453          3706.13251    736.78503            540.7672
Test MSE     7456.85947         133462.64     6298.09608           562.30225
Train Time   0.03511s           1.70573s      0.02368s             1.26454s

Table 2.1 above shows the performance of each model on the prepared dataset. It can be
observed that every model except LSTM is overfitting: their test MSE is far higher than their
training MSE, and as such they may not generalize well on unseen data.

Figure 2.1: A line plot of the actual price versus the predicted price using Ridge Regression

model.

Figure 2.2: A line plot of the actual price versus the predicted price using the Prophet

model.

Figure 2.3: A line plot of the actual price versus the predicted price using Bagging

Regression model

Figure 2.4: A line plot of the actual price versus the predicted price using LSTM model

2.2.6 Hyperparameters and hyperparameter tuning


When building a machine learning model, you'll be given design options for defining your model

architecture. We don't always know what the best model architecture is for a given model, so

we'd like to be able to experiment with a variety of options. In true machine learning fashion,

we'll ideally ask the machine to perform this exploration and automatically select the best model

architecture. The parameters that define the model architecture are known as hyperparameters,

and the process of searching for the best model architecture is known as hyperparameter tuning

(Jordan 2017). The difficulty with hyperparameters is that there is no single magic number that

works everywhere. The best numbers vary depending on the task and the dataset.

Knowing where to begin can be difficult given the number of hyperparameters that you may
want to tune. With this in mind, the following are the hyperparameters tuned in this project.

2.2.6.1 Learning rate

The single most important hyperparameter is learning rate, and it should always be tuned. The

learning rate is a hyper-parameter that governs how much we adjust the weights of our network

in relation to the loss gradient (Zulkifli 2018). A good starting point is 0.01. If our learning rate

is too low, it will take a much longer time (hundreds or thousands of epochs) to reach the ideal

state. On the other hand, if our learning rate is too high, it will overshoot the ideal state and the
algorithm will fail to converge, or may even diverge.

In a report “Cyclical Learning Rates for Training Neural Networks”, Smith (2018) argued that

you could estimate a good learning rate by training the model initially with a very low learning

rate and increasing it (either linearly or exponentially) at each iteration.
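
A sketch of how the learning rate is set in a PyTorch optimizer, and of increasing it exponentially each step in the spirit of the range test described above (the model and numbers are placeholders):

import torch

model = torch.nn.Linear(4, 1)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-5)  # start with a very low learning rate
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=1.5)  # multiply lr by 1.5 each step

for step in range(5):
    # ... the forward pass, loss computation and loss.backward() would go here ...
    optimizer.step()
    scheduler.step()
    print(step, optimizer.param_groups[0]["lr"])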

2.2.6.2 Mini-Batch gradient descent

Gradient descent is a popular optimization algorithm for determining the weights or coefficients

of machine learning algorithms like artificial neural networks and logistic regression. It works by

having the model predict on training data and then using the error on the predictions to update

the model in order to reduce the error (Brownlee 2017).

Mini-batch gradient descent is a gradient descent variation that divides the training dataset into

small batches that are used to calculate model error and update model coefficients (Brownlee

2017). Implementations may choose to sum the gradient over the mini-batch, which reduces the

gradient's variance even further. Mini-batch gradient descent is the recommended gradient

descent variant for most applications, particularly deep learning. Mini-batch sizes, also known as

“batch sizes” for brevity, are frequently tuned to an aspect of the computational architecture on

which the implementation is running. For example, a power of two that corresponds to the

memory requirements of the GPU or CPU hardware, such as 32, 64, 128, 256 etc. Small values

result in a learning process that converges quickly at the expense of training noise. Large values

result in a slow convergent learning process with accurate estimates of the error gradient.
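
A sketch of mini-batch gradient descent with a PyTorch DataLoader on synthetic data; the batch size of 64 is one of the power-of-two values mentioned above:

import torch
from torch.utils.data import TensorDataset, DataLoader

X = torch.randn(1000, 8)  # synthetic features
y = torch.randn(1000, 1)  # synthetic targets
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=False)

model = torch.nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

for x_batch, y_batch in loader:  # one weight update per mini-batch of 64 rows
    optimizer.zero_grad()
    loss = loss_fn(model(x_batch), y_batch)
    loss.backward()
    optimizer.step()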

2.2.6.3 Number of epochs

The number of epochs determines how many times the network's weights are updated over the
training data. As the number of epochs increases, the model's fit shifts from underfitting to
optimal and then to overfitting. The validation error is the metric we should be looking at when
deciding on the number of epochs for our training step. The intuitive manual method is to train
the model for as long as the validation error keeps decreasing. To determine when to stop
training the model, a technique known as Early Stopping is used: if the validation error has not
improved in the last 10 or 20 epochs, the training process is terminated.
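
A sketch of the Early Stopping rule described above, with a patience of 10 epochs and a stand-in validation error (the real training loop is listed in Appendix A):

import random

best_val_loss = float("inf")
patience = 10                   # stop after 10 epochs with no improvement
epochs_without_improvement = 0

for epoch in range(200):
    val_loss = random.uniform(0.0, 1.0)  # stand-in for the real validation error
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
    if epochs_without_improvement >= patience:
        print(f"Stopping early at epoch {epoch}")
        break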

2.3 Review of Related Literature

Various researchers have worked on studies to improve the accuracy of stock price prediction
using Artificial Intelligence techniques. They include the following:

Piramuthu (2004) thoroughly evaluated various feature selection methods for data mining

applications. He compared how different feature selection methods optimized decision tree

performance using datasets such as credit approval data, loan default data, web traffic data, tam

data, and kiang data. He compared probabilistic distance measures such as the Bhattacharyya

measure, the Matusita measure, the divergence measure, the Mahalanobis distance measure, and

the Patrick-Fisher measure. The Minkowski distance measure, city block distance measure,

Euclidean distance measure, and nonlinear distance measure are inter-class distance measures.

The author's evaluation of both probabilistic distance-based and several inter-class feature

selection methods is a strength of this paper. Furthermore, the author conducted the evaluation

using various datasets, which added to the paper's strength. The evaluation algorithm, on the

other hand, was only a decision tree. We don't know if the feature selection methods will still

work on a larger dataset or a more complex model.

In their study "Stock market forecasting using Hidden Markov Model" Hassan and Nath (2005)

used the Hidden Markov Model (HMM) to forecast stock prices of four different airlines. They

divide the model's states into four categories: the opening price, the closing price, the highest

price, and the lowest price. The approach used in this paper is unique in that it does not require

expert knowledge to build a prediction model. However, this work is limited to the airline
industry and is based on a very small dataset, so it may not result in a generalizable prediction
model. The approach could also have been compared against other methods commonly used in
stock market prediction. The authors limited the date range of the training and testing datasets to
a maximum of two years.

Lee (2009) predicted stock trends using the support vector machine (SVM) and a hybrid feature

selection method in "Using support vector machine with a hybrid feature selection method to the

stock trend prediction". The dataset used in this study is a subset of the NASDAQ Index from the

Taiwan Economic Journal Database (TEJD) in 2008. The feature selection part used a hybrid

method, with supported sequential forward search (SSFS) acting as the wrapper. Another

advantage of this work is that they created a detailed procedure for parameter adjustment with

performance under various parameter values. The feature selection model's clear structure is also

heuristic to the primary stage of model structuring. One limitation was that the performance of

SVM was only compared to back-propagation neural networks (BPNNs) and not to other

machine learning algorithms.

Lei (2018) used Wavelet Neural Network (WNN) to predict stock price trends in "Wavelet neural

network prediction method of stock price trend based on rough set attribute reduction". As an

optimization, the author used Rough Set (RS) for attribute reduction. Rough Set was used to

reduce the dimensions of the stock price trend feature. It was also used to determine the Wavelet

Neural Network's structure. This work's dataset includes five well-known stock market indices:

(1) the SSE Composite Index (China), (2) the CSI 300 Index (China), (3) the All Ordinaries

Index (Australia), (4) the Nikkei 225 Index (Japan), and (5) the Dow Jones Index (USA). The

model was evaluated using various stock market indices, and the results were convincing and

general. The computational complexity is reduced by using Rough Set to optimize the feature

dimension before processing. However, in the discussion section, the author only emphasized

parameter adjustment and did not mention the model's flaws. I discovered that because the

evaluations were done on indices, the same model may not perform as well if applied to a

specific stock.

Recent studies make use of input data from a variety of sources and in a variety of formats. Some

systems use historical stock data to perform mathematical analysis, others use financial news

articles, still others use expert reviews to perform sentiment analysis on financial news articles,

and still others use a hybrid system that uses multiple inputs to forecast the market.

CHAPTER THREE

SYSTEM ANALYSIS AND DESIGN

3.1 Overview

In this study, Object-Oriented Analysis and Design (OOAD) methodology and notation symbols

of the Unified Modelling Language (UML) will be used for the analysis of the system.

Object-Oriented Analysis and Design (OOAD) is a new system development approach

encouraging and facilitating reuse of software components. The OOAD methodology uses

diagrams to document an object-based decomposition of systems and to show the interaction

between these objects and the dynamics of these objects.

3.2 Analysis of the Existing System

Before the proposal of the new system, retail investors spent a lot of time trying to find

investment opportunities. One obvious approach retail investors could use to gauge the market is

by drawing technical indicators on a visualized price history, for instance, Bollinger bands,

Simple Moving Average, Exponential moving average (EMA) and Relative strength index (RSI).

This requires a lot of work and decisions get swayed by cognitive biases or personal emotions,

leading to avoidable losses.

3.2.1 Problems of the Existing System

Having an overview of the existing system, some of its problems are as follows.

1. Avoidable losses from bad investment decisions due to cognitive biases or personal
emotions.

2. Avoidable losses due to lack of proper analysis of stock market data.

3. Time consuming: a lot of time is spent trying to find investment opportunities or trying to

gauge the market.

3.3 Analysis of the Proposed System

The proposed system will process a huge volume of data required to make good investment

decisions and use quantitative and data-driven models to gauge the market and predict stock

prices. Because it does not account for human behaviors, the proposed system will not always

produce accurate results. Factors such as a change in company leadership, internal issues, strikes,

protests, natural disasters, and a change in authority cannot be considered when relating to a

change in the stock market by the machine. The system's goal is to provide an estimate of where

the stock market might be headed. Many factors and parameters may influence it along the way.

3.3.1 Advantages of the Proposed System

1. The use of quantitative and data driven models to make investment decisions.

2. Time saving: retail investors no longer have to waste time processing market data.

3. Since the models are data-driven, losses will be smaller because the models cannot be
swayed by emotions.

4. Less expensive: retail investors won’t have to spend the huge amount required to hire

financial advisors.

3.4 Design of the Proposed System

The proposed solution is comprehensive as it includes pre-processing of the stock market dataset,

utilization of multiple features and visualization of the stock market price trend prediction.

3.4.1 Use Case

A use case is a list of steps, typically defining interactions between a role (actor) and a system.

Use case diagrams are used to gather the requirements of a system including internal and

external influences. These requirements are mostly design requirements. Hence, when a system

is analyzed to gather its functionalities, use cases are prepared and actors are identified.

When the initial task is complete, use case diagrams are modelled to present the outside view.

Figure 3.1: A use case diagram of the proposed system

3.4.2 Activity Diagram

Activity diagrams represent workflows in a graphical way. They can be used to describe the

workflow or the operational workflow of any component in a system.

Figure 3.2: Activity diagram of the proposed system

CHAPTER FOUR

IMPLEMENTATION

4.1 Preamble

This chapter describes the implementation of the new system and the software and hardware that

may be required for the system implementation.

4.2 Choice of Development Environment

4.2.1 Integrated Development Environment

My choice of IDE is Jupyter Notebook. Jupyter Notebook is a free and open-source web tool that
lets us write and share code and documents. It provides an environment in which you may
document your code, run it, examine the findings, visualize data, and view the results without
having to leave the environment. This makes it a useful tool for completing end-to-end data
science workflows, including data cleansing, statistical modeling, constructing and training
machine learning models, visualizing data, and so on.

4.2.2 Choice of Programming Language

All the machine learning related code is written in Python. The Python programming language
was used mainly for the development of the back-end server of the system. The Streamlit
framework was used to serve the machine learning predictions to the user interface.

4.3 Software Testing

There was extensive testing of the software throughout the implementation. The testing is in two

phases:

1. Component test

2. System test

Component test - during the development of the project, Jupyter notebook provided the

capability to test each model or function in real time. The functions were tested with different

variables in order to detect any possible error, while the models were tuned with different

parameters to reduce error and increase performance.

System testing - after completion, the whole system was tested on more realistic data to make
sure all components worked correctly together.

Screenshots

Figure 4.1 shows different stock tickers of different companies to select from, figure 4.2 shows

the raw data and interactive visualization of the raw data for any selected stock, figure 4.3 shows

the forecast data and interactive visualization of the forecast data for any selected stock, figure

4.4 shows the trends for the selected stock.

Figure 4.1: Different stock tickers of different companies to select from.

Figure 4.2: The raw data and interactive visualization of the raw data.

Figure 4.3: The forecast data and interactive visualization of the forecast data.

Figure 4.4: Visualized stock trends

4.4 User Manual

To run the whole program on a personal workstation or PC, follow these steps:

1. Make sure that Python >= 3.8 is installed on your computer.

2. Install all the required modules by running “$ pip install setup.py” on the terminal or
command prompt.

3. Then run “$ streamlit run main.py” to launch the program.

4.5 Source Code Listing

Some of the source code snippets for this implementation can be seen at Appendix A and

Appendix B.

CHAPTER FIVE

CONCLUSION AND RECOMMENDATIONS

5.1 Summary

Stock investment can help you build your savings and protect your money from inflation and
taxes. But deciding on what or where to invest your money can be difficult, and the cost of hiring
investment professionals/experts can be prohibitive for retail investors. This study lays the
groundwork for democratizing machine learning technology for retail investors by connecting
machine learning models to a web application built for them. It provides predictions and
visualizations to help investors navigate the stock markets and make more intelligent decisions.

5.2 Conclusion

This stock price prediction web application helps retail investors make informed decisions about
the stock market, using technical and data-driven models to help them reduce investment losses
and save time. Currently, the platform is limited to a small set of features, but it is easily
extensible to support upgrades.

5.3 Recommendation

Having presented all that is necessary for the implementation of this research, the following
recommendation is aimed at improving and correcting some of its limitations: future work should
consider other factors that affect stock market prices, such as fundamental analysis, social media
sentiment and market news.

REFERENCES

Aurélien, G. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow:
Concepts, Tools, and Techniques to Build Intelligent Systems (p. 135).
Brownlee, J. (2017, July 21, para. 8-22). A Gentle Introduction to Mini-Batch Gradient Descent
and How to Configure Batch Size. Machine Learning Mastery.
https://tinyurl.com/kvnd852f
Hassan MR, Nath B (2005). Stock market forecasting using Hidden Markov Model: a new
approach. In: Proceedings—5th international conference on intelligent systems
design and applications 2005, ISDA’05. 2005. pp. 192–6.
https://doi.org/10.1109/ISDA.2005.85.
Jordan, J. (2017, November 2, para. 1). Hyperparameter tuning for machine learning models.
Jeremy Jordan. https://www.jeremyjordan.me/hyperparameter-tuning/
Laskowski, N. (2018, para. 1). Recurrent neural networks. Retrieved from
searchenterpriseai.techtarget.com/definition/recurrent-neural-networks
Lee MC 2009. Using support vector machine with a hybrid feature selection method to the
stock trend prediction. Expert Syst Appl. 2009;36(8):pp. 10896–904.
https://doi.org/10.1016/j.eswa.2009.02.038.
Lei L (2018). Wavelet neural network prediction method of stock price trend based on rough set
attribute reduction. Appl Soft Comput J. 2018;62:pp. 923–32.
https://doi.org/10.1016/j.asoc.2017.09.029.
Olah, C. (2015, August 27, para. 14). Understanding LSTM Networks -- colah’s blog. Colah.
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Owens, M. (2006). The Definitive Guide to SQLite. In The Definitive Guide to SQLite
(1st ed., p. 1). Apress.
Paszke Adam, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan,
Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga,
Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison,
Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai,
Soumith Chintala (2019, p. 1). PyTorch: An Imperative Style, High-Performance
Deep Learning Library.
Pedregosa Fabian, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion,
Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss,
Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau,
Matthieu Brucher, Matthieu Perrot and Edouard Duchesnay (2011).
Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, pp. 2825–2830.

Piramuthu S 2004. Evaluating feature selection methods for learning in data mining applications.
Eur J Oper Res. 2004;156(2):pp. 483–94.
https://doi.org/10.1016/S0377-2217(02)00911-6.
Rossum, V. G., & Team, P. D. (2018b, p. 1). Python Tutorial: Release 3.6.4. In Python Tutorial.
12th Media Services.
Roy, B. (2020, April 7, para. 9). All about Feature Scaling - Towards Data Science. Medium.
https://towardsdatascience.com/all-about-feature-scaling-bcc0ad75cb35
Sadrmomtazi S. M., "Modelling Compressive Strength of EPS Lightweight Concrete Using
Regression, Neural Networks and ANFIS," Construction and Building Materials
Journal, vol. 42, (2013, pp. 205–216).
Saxena, A., Dhadwal, M., & Kowsigan, M. (2020, p. 5). Indian Crop Production:Prediction And
Model Deployment Using Ml And Streamlit. Turkish Journal of Physiotherapy
and Rehabilitation.
Smith, L. N. (2018, July 16, p. 5). Cyclical Learning Rates for Training Neural Networks.
Cyclical Learning Rates for Training Neural Networks
Zulkifli, H. (2019, March 28, para. 3). Understanding Learning Rates and How It Improves
Performance in Deep Learning. Medium. https://tinyurl.com/39kefbyr

APPENDICES

Appendix A: Source code of the research part

# !pip install yfinance --upgrade --no-cache-dir

import yfinance as yf

import datetime as dt

import numpy as np

import pandas as pd

import time

# Importing the necessary charting library

import matplotlib.pyplot as plt

import seaborn as sns

sns.set() # Setting seaborn as default style even if use only matplotlib

# get the start and end date dynamically using the datetime module

start = dt.datetime.today() - dt.timedelta(365*8)

end = dt.datetime.today()

print(start, end)

# assign a stock ticker

indicator = 'GOOG'

ohlc_data = yf.download(indicator, start=start, end=end)

# Data preparation and feature extraction

selected_df = pd.DataFrame({'close':ohlc_data['Adj Close'], 'volume':ohlc_data['Volume']},


index=ohlc_data.index)

# plot the close price of the selected stock

selected_df.close.plot(figsize=(15,7))

# Close price three trading days ahead (note: shift(-3) looks forward, despite the column name)

selected_df['previous_3_days'] = selected_df.close.shift(-3)

# Nasdaq index

selected_df['nasdaq'] = yf.download('NDX', start=start, end=end)['Close']

# Bollinger Band

# 5-day rolling moving average (window=5, although the column is named '21 Day MA')

selected_df['21 Day MA'] = selected_df['close'].rolling(window=5).mean()

# 5-day rolling standard deviation

selected_df['21 Day STD'] = selected_df['close'].rolling(window=5).std()

# upper band

selected_df['Upper Band'] = selected_df['21 Day MA'] + (selected_df['21 Day STD'] * 2)

# lower band

selected_df['Lower Band'] = selected_df['21 Day MA'] - (selected_df['21 Day STD'] * 2)

# Target

interval_for_target = 5

selected_df['target'] = selected_df.close.shift(-interval_for_target)

selected_df.dropna(inplace=True);

# Split data for train and test purposes

from sklearn.model_selection import train_test_split

# split data to train test test split

train_data, test_data = train_test_split(selected_df, test_size=0.2, shuffle=False)

X_train = train_data.drop(columns=['target']) # the training feature

y_train = train_data['target'] # the training target

X_test = test_data.drop(columns=['target'])

y_test = test_data['target']

# scale the data

from sklearn.preprocessing import MinMaxScaler

min_max_scaler = MinMaxScaler()

X_train_scaled = min_max_scaler.fit_transform(X_train)

from sklearn.linear_model import Ridge

# Note that Ridge regression performs linear least squares with L2 regularization.

# Create and train the Ridge Linear Regression Model

ridge_model = Ridge()

start = time.time()

ridge_model.fit(X_train_scaled, y_train)

stop = time.time()

print(f"Training time: {stop - start}s")

# Test the model and calculate its accuracy

X_test_scaled = min_max_scaler.transform(X_test)

accuracy = ridge_model.score(X_test_scaled, y_test)

print(accuracy)

selected_df_plus_pred = test_data.copy()  # copy so the original test split is not modified

selected_df_plus_pred['ridge_predicted'] = ridge_model.predict(X_test_scaled)

selected_df_plus_pred.head()

# Plotting the performance of the Ridge Regression model

selected_df_plus_pred[['target', 'ridge_predicted']][300:].plot(figsize=(15,7))

plt.legend(['Actual', 'Ridge Predicted'])

# get the mean squared error

from sklearn.metrics import mean_absolute_error, mean_squared_error

mean_squared_error(y_train, ridge_model.predict(X_train_scaled))

# Build and Train a Linear Regression Model with BaggingRegressor

from sklearn.model_selection import RandomizedSearchCV

from sklearn.linear_model import LinearRegression

from sklearn.ensemble import BaggingRegressor

lr_bag = BaggingRegressor(LinearRegression())

start = time.time()

lr_bag.fit(X_train_scaled, y_train)

stop = time.time()

print(f"Training time: {stop - start}s")

# Test the model and calculate its accuracy

X_test_scaled = min_max_scaler.transform(X_test)

accuracy = lr_bag.score(X_test_scaled, y_test)

print(accuracy)

# get the mean squared error

from sklearn.metrics import mean_squared_error

mean_squared_error(y_train, lr_bag.predict(X_train_scaled))

selected_df_plus_pred['bagging_predicted'] = lr_bag.predict(X_test_scaled)

selected_df_plus_pred.head()

# Plotting the performance of the Bagging model

selected_df_plus_pred[['target', 'bagging_predicted']][300:].plot(figsize=(15,7))

plt.legend(['Actual', 'BaggingRegressor Predicted'])

# Build and Train a Linear Regression Model with AdaBoostRegressor

from sklearn.model_selection import GridSearchCV

from sklearn.linear_model import LinearRegression

from sklearn.ensemble import AdaBoostRegressor

lr_boost = AdaBoostRegressor(LinearRegression())

lr_boost.fit(X_train_scaled, y_train)

# Test the model and calculate its accuracy

X_test_scaled = min_max_scaler.transform(X_test)

accuracy = lr_boost.score(X_test_scaled, y_test)

print(accuracy)

# get the mean squared error

from sklearn.metrics import mean_squared_error

mean_squared_error(y_test, lr_boost.predict(X_test_scaled))

selected_df_plus_pred['adaboost_predicted'] = lr_boost.predict(X_test_scaled)

selected_df_plus_pred.tail()

# Plotting the performance of the AdaBoost model

selected_df_plus_pred[['target', 'adaboost_predicted']][300:].plot(figsize=(15,7))

plt.legend(['Actual', 'AdaBoostRegressor Predicted'])

# Using Prophet Model

from fbprophet import Prophet

from pandas import to_datetime

df_for_prophet = pd.DataFrame({'ds':selected_df.index, 'y':selected_df.close})

df_for_prophet.reset_index(drop=True, inplace=True)

from sklearn.model_selection import train_test_split

# split data to train test test split

prophet_train_data, prophet_test_data = train_test_split(df_for_prophet, test_size=0.2, shuffle=False)

prophet_X_train = prophet_train_data.drop(columns=['y']) # the training feature

prophet_y_train = prophet_train_data['y'] # the training target

prophet_X_test = prophet_test_data.drop(columns=['y'])

prophet_y_test = prophet_test_data['y']

# Make prediction

pht_model = Prophet()

start = time.time()

pht_model.fit(prophet_train_data)

stop = time.time()

print(f"Training time: {stop - start}s")

future = pht_model.make_future_dataframe(periods=len(prophet_y_test),freq='MS')

prophet_pred = pht_model.predict(future)

prophet_pred

df_for_prophet['predicted'] = prophet_pred['yhat']

from sklearn.metrics import mean_squared_error, r2_score

mean_squared_error(df_for_prophet['y'], df_for_prophet['predicted'])

r2_score(df_for_prophet['y'], df_for_prophet['predicted'])

# Plotting the performance of the Prophet model

df_for_prophet[['y', 'predicted']][300:].plot(figsize=(15,7))

plt.legend(['Actual', ' Prophet Predicted'])

# Using LSTM model

import torch

import torch.nn as nn

from datetime import datetime

device = "cuda" if torch.cuda.is_available() else "cpu"

print(f"{device}" " is available.")

# Get the close and volume data as training data (Input)

training_data = selected_df.copy()

training_data['target'] = training_data.close.shift(-5)

training_data.dropna(inplace=True)

# Splitting the data into train-test split

from sklearn.model_selection import train_test_split

def feature_label_split(df, target_col):

y = df[[target_col]]

X = df.drop(columns=[target_col])

return X, y

def train_val_test_split(df, target_col, test_ratio):

val_ratio = test_ratio / (1 - test_ratio)

X, y = feature_label_split(df, target_col)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_ratio, shuffle=False)

X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=val_ratio, shuffle=False)

return X_train, X_val, X_test, y_train, y_val, y_test

X_train, X_val, X_test, y_train, y_val, y_test = train_val_test_split(training_data, 'target', 0.2)

# Scaling the data

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()

X_train_arr = scaler.fit_transform(X_train)

X_val_arr = scaler.transform(X_val)

X_test_arr = scaler.transform(X_test)

y_train_arr = scaler.fit_transform(y_train)

y_val_arr = scaler.transform(y_val)

y_test_arr = scaler.transform(y_test)

# Loading the data into DataLoaders

from torch.utils.data import TensorDataset, DataLoader

batch_size = 64

train_features = torch.Tensor(X_train_arr)

train_targets = torch.Tensor(y_train_arr)

val_features = torch.Tensor(X_val_arr)

val_targets = torch.Tensor(y_val_arr)

test_features = torch.Tensor(X_test_arr)

test_targets = torch.Tensor(y_test_arr)

train = TensorDataset(train_features, train_targets)

val = TensorDataset(val_features, val_targets)

test = TensorDataset(test_features, test_targets)

train_loader = DataLoader(train, batch_size=batch_size, shuffle=False, drop_last=True)

val_loader = DataLoader(val, batch_size=batch_size, shuffle=False, drop_last=True)

test_loader = DataLoader(test, batch_size=batch_size, shuffle=False, drop_last=True)

test_loader_one = DataLoader(test, batch_size=1, shuffle=False, drop_last=True)

class LSTMModel(nn.Module):

"""LSTMModel class extends nn.Module class and works as a constructor for LSTMs.

LSTMModel class initiates a LSTM module based on PyTorch's nn.Module class.

It has only two methods, namely init() and forward(). While the init()

method initiates the model with the given input parameters, the forward()

method defines how the forward propagation needs to be calculated.

Since PyTorch automatically defines back propagation, there is no need

to define back propagation method.

Attributes:

hidden_dim (int): The number of nodes in each layer

layer_dim (str): The number of layers in the network

lstm (nn.LSTM): The LSTM model constructed with the input parameters.

fc (nn.Linear): The fully connected layer to convert the final state

of LSTMs to our desired output shape.

"""

def __init__(self, input_dim, hidden_dim, layer_dim, output_dim, dropout_prob):

"""The __init__ method that initiates a LSTM instance.

Args:

input_dim (int): The number of nodes in the input layer

hidden_dim (int): The number of nodes in each layer

layer_dim (int): The number of layers in the network

output_dim (int): The number of nodes in the output layer

dropout_prob (float): The probability of nodes being dropped out

"""

super(LSTMModel, self).__init__()

# Defining the number of layers and the nodes in each layer

self.hidden_dim = hidden_dim

self.layer_dim = layer_dim

# LSTM layers

self.lstm = nn.LSTM(

input_dim, hidden_dim, layer_dim, batch_first=True, dropout=dropout_prob

)

# Fully connected layer

self.fc = nn.Linear(hidden_dim, output_dim)

def forward(self, x):

"""The forward method takes input tensor x and does forward propagation

Args:

x (torch.Tensor): The input tensor of the shape (batch size, sequence length, input_dim)

Returns:

torch.Tensor: The output tensor of the shape (batch size, output_dim)

"""

# Initializing hidden state for first input with zeros

h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()

# Initializing cell state for first input with zeros

c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()

# We need to detach as we are doing truncated backpropagation through time (BPTT)

# If we don't, we'll backprop all the way to the start even after going through another batch

# Forward propagation by passing in the input, hidden state, and cell state into the model

out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))

# Reshaping the outputs in the shape of (batch_size, seq_length, hidden_size)

# so that it can fit into the fully connected layer

out = out[:, -1, :]

# Convert the final state to our desired output shape (batch_size, output_dim)

out = self.fc(out)

return out

def get_model(model, model_params):

models = {

"lstm": LSTMModel,

return models.get(model.lower())(**model_params)

class Optimization:

"""Optimization is a helper class that allows training, validation, prediction.

Optimization is a helper class that takes model, loss function, optimizer function

learning scheduler (optional), early stopping (optional) as inputs. In return, it

provides a framework to train and validate the models, and to predict future values

based on the models.

Attributes:

model (RNNModel, LSTMModel, GRUModel): Model class created for the type of RNN

loss_fn (torch.nn.modules.Loss): Loss function to calculate the losses

optimizer (torch.optim.Optimizer): Optimizer function to optimize the loss function

train_losses (list[float]): The loss values from the training

val_losses (list[float]): The loss values from the validation

last_epoch (int): The number of epochs that the models is trained

"""

def __init__(self, model, loss_fn, optimizer):

"""

Args:

model (RNNModel, LSTMModel, GRUModel): Model class created for the type of RNN

loss_fn (torch.nn.modules.Loss): Loss function to calculate the losses

optimizer (torch.optim.Optimizer): Optimizer function to optimize the loss function

"""

self.model = model

self.loss_fn = loss_fn

self.optimizer = optimizer

self.train_losses = []

self.val_losses = []

def train_step(self, x, y):

"""The method train_step completes one step of training.

Given the features (x) and the target values (y) tensors, the method completes

one step of the training. First, it activates the train mode to enable back prop.

After generating predicted values (yhat) by doing forward propagation, it calculates

the losses by using the loss function. Then, it computes the gradients by doing

back propagation and updates the weights by calling step() function.

Args:

x (torch.Tensor): Tensor for features to train one step

y (torch.Tensor): Tensor for target values to calculate losses

"""

# Sets model to train mode

self.model.train()

# Makes predictions

yhat = self.model(x)

# Computes loss

loss = self.loss_fn(y, yhat)

# Computes gradients

loss.backward()

# Updates parameters and zeroes gradients

self.optimizer.step()

self.optimizer.zero_grad()

# Returns the loss

return loss.item()

def train(self, train_loader, val_loader, batch_size=64, n_epochs=50, n_features=1):

"""The method train performs the model training

The method takes DataLoaders for training and validation datasets, batch size for

mini-batch training, number of epochs to train, and number of features as inputs.

Then, it carries out the training by iteratively calling the method train_step for

n_epochs times. If early stopping is enabled, then it checks the stopping condition

to decide whether the training needs to halt before n_epochs steps. Finally, it saves

the model in a designated file path.

Args:

train_loader (torch.utils.data.DataLoader): DataLoader that stores training data

val_loader (torch.utils.data.DataLoader): DataLoader that stores validation data

batch_size (int): Batch size for mini-batch training

n_epochs (int): Number of epochs, i.e., train steps, to train

n_features (int): Number of feature columns

"""

model_path = f'{self.model}_{datetime.now().strftime("%Y-%m-%d %H:%M:%S")}'

for epoch in range(1, n_epochs + 1):

batch_losses = []

for x_batch, y_batch in train_loader:

x_batch = x_batch.view([batch_size, -1, n_features]).to(device)

y_batch = y_batch.to(device)

loss = self.train_step(x_batch, y_batch)

batch_losses.append(loss)

training_loss = np.mean(batch_losses)

self.train_losses.append(training_loss)

43
with torch.no_grad():

batch_val_losses = []

for x_val, y_val in val_loader:

x_val = x_val.view([batch_size, -1, n_features]).to(device)

y_val = y_val.to(device)

self.model.eval()

yhat = self.model(x_val)

val_loss = self.loss_fn(y_val, yhat).item()

batch_val_losses.append(val_loss)

validation_loss = np.mean(batch_val_losses)

self.val_losses.append(validation_loss)

if (epoch <= 10) | (epoch % 50 == 0):

print(

f"[{epoch}/{n_epochs}] Training loss: {training_loss:.4f}\t Validation loss: {validation_loss:.4f}"

torch.save(self.model.state_dict(), model_path)

    def evaluate(self, test_loader, batch_size=1, n_features=1):
        """The method evaluate performs the model evaluation.

        The method takes a DataLoader for the test dataset, the batch size for mini-batch
        testing, and the number of features as inputs. Similar to the model validation,
        it iteratively predicts the target values and collects them together with the
        actual values. Then, it returns two lists that hold the predictions and the
        actual values.

        Note:
            This method assumes that the prediction from the previous step is available
            at the time of prediction, and it only does one-step prediction into the future.

        Args:
            test_loader (torch.utils.data.DataLoader): DataLoader that stores the test data
            batch_size (int): Batch size for mini-batch evaluation
            n_features (int): Number of feature columns

        Returns:
            list[float]: The values predicted by the model
            list[float]: The actual values in the test set
        """
        with torch.no_grad():
            predictions = []
            values = []
            for x_test, y_test in test_loader:
                x_test = x_test.view([batch_size, -1, n_features]).to(device)
                y_test = y_test.to(device)
                self.model.eval()
                yhat = self.model(x_test)
                # Move tensors back to the CPU before converting them to NumPy arrays
                predictions.append(yhat.detach().cpu().numpy())
                values.append(y_test.detach().cpu().numpy())
        return predictions, values

    def plot_losses(self):
        """The method plots the calculated loss values for training and validation."""
        plt.plot(self.train_losses, label="Training loss")
        plt.plot(self.val_losses, label="Validation loss")
        plt.legend()
        plt.title("Losses")
        plt.show()
        plt.close()
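The Note in the evaluate method above points out that the class only performs one-step-ahead prediction. As an illustrative sketch of how a model could instead be rolled forward several steps, the latest window of scaled observations can be fed back recursively; the helper below is hypothetical (it is not part of the code above), assumes a univariate model (n_features = 1), and reuses the device variable from the earlier setup:

def forecast_recursive(model, last_window, n_steps):
    # Hypothetical helper: roll a one-step LSTM forward n_steps times (univariate case).
    # last_window is a tensor of shape (1, seq_length, 1) holding the most recent scaled values.
    model.eval()
    window = last_window.clone().to(device)
    preds = []
    with torch.no_grad():
        for _ in range(n_steps):
            yhat = model(window)          # shape (1, 1): next-step prediction
            preds.append(yhat.item())
            # Slide the window: drop the oldest step and append the new prediction
            window = torch.cat([window[:, 1:, :], yhat.view(1, 1, 1)], dim=1)
    return preds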

import torch.optim as optim

# Hyperparameters for the LSTM model and its training
input_dim = len(X_train.columns)
output_dim = 1
hidden_dim = 64
layer_dim = 3
batch_size = 64
dropout = 0.2
n_epochs = 50
learning_rate = 1e-3
weight_decay = 1e-6

model_params = {'input_dim': input_dim,
                'hidden_dim': hidden_dim,
                'layer_dim': layer_dim,
                'output_dim': output_dim,
                'dropout_prob': dropout}

model = get_model('lstm', model_params)

loss_fn = nn.MSELoss(reduction="mean")
optimizer = optim.Adam(model.parameters(), lr=learning_rate, weight_decay=weight_decay)

opt = Optimization(model=model, loss_fn=loss_fn, optimizer=optimizer)
opt.train(train_loader, val_loader, batch_size=batch_size, n_epochs=n_epochs, n_features=input_dim)
opt.plot_losses()

predictions, values = opt.evaluate(
    test_loader_one,
    batch_size=1,
    n_features=input_dim
)

# Formatting the predictions
def inverse_transform(scaler, df, columns):
    # Undo the scaling so that the values are back on the original price scale
    for col in columns:
        df[col] = scaler.inverse_transform(df[col])
    return df


def format_predictions(predictions, values, df_test, scaler):
    # Flatten the per-batch arrays into single 1-D arrays of predictions and actual values
    vals = np.concatenate(values, axis=0).ravel()
    preds = np.concatenate(predictions, axis=0).ravel()
    df_result = pd.DataFrame(data={"value": vals, "prediction": preds},
                             index=df_test.head(len(vals)).index)
    df_result = df_result.sort_index()
    df_result = inverse_transform(scaler, df_result, [["value", "prediction"]])
    return df_result


df_result = format_predictions(predictions, values, X_test, scaler)
df_result

# Plotting the performance of the LSTM model
df_result[['value', 'prediction']].plot(figsize=(15, 7))
plt.legend(['Actual', 'LSTM Predicted'])

from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def calculate_metrics(df):
    # Compute MAE, RMSE (square root of MSE), and the R^2 score of the predictions
    result_metrics = {'mae': mean_absolute_error(df.value, df.prediction),
                      'rmse': mean_squared_error(df.value, df.prediction) ** 0.5,
                      'r2': r2_score(df.value, df.prediction)}
    print("Mean Absolute Error: ", result_metrics["mae"])
    print("Root Mean Squared Error: ", result_metrics["rmse"])
    print("R^2 Score: ", result_metrics["r2"])
    return result_metrics


result_metrics = calculate_metrics(df_result)
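The train method above persists the trained weights with torch.save. As a minimal sketch of how the saved state could later be reloaded for inference (model_path here stands for whatever file name was generated during training):

# Rebuild the architecture, then load the persisted weights for inference
model = get_model('lstm', model_params)
model.load_state_dict(torch.load(model_path))
model.eval()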

Appendix B: Source code for the Streamlit web application using the Prophet model only

import streamlit as st
from datetime import date

import yfinance as yf
from fbprophet import Prophet
from fbprophet.plot import plot_plotly
from plotly import graph_objs as go

# Historical data is downloaded from the start date up to today
START = "2015-01-01"
TODAY = date.today().strftime("%Y-%m-%d")

st.title('Stock Price Prediction Application')

stocks = ('NSRGF', 'NVDA', 'NFLX', 'AMZN', 'PYPL', 'TSLA', 'GOOG', 'AAPL', 'MSFT', 'GME')
selected_stock = st.selectbox('Select dataset for prediction', stocks)

n_years = st.slider('Years of prediction:', 1, 4)
period = n_years * 365  # forecast horizon in days

@st.cache
def load_data(ticker):
    # Cached so that switching widgets does not re-download the same data
    data = yf.download(ticker, START, TODAY)
    data.reset_index(inplace=True)
    return data

data_load_state = st.text('Loading data...')
data = load_data(selected_stock)
data_load_state.text('Loading data... done!')

st.subheader('Raw data')
st.write(data.tail())

# Plot the raw data
def plot_raw_data():
    fig = go.Figure()
    fig.add_trace(go.Scatter(x=data['Date'], y=data['Open'], name="stock_open"))
    fig.add_trace(go.Scatter(x=data['Date'], y=data['Close'], name="stock_close"))
    fig.layout.update(title_text='Time Series data with Rangeslider', xaxis_rangeslider_visible=True)
    st.plotly_chart(fig)

plot_raw_data()

# Predict the forecast with Prophet.
# Prophet expects a dataframe with a date column named 'ds' and a target column named 'y'
df_train = data[['Date', 'Close']]
df_train = df_train.rename(columns={"Date": "ds", "Close": "y"})

m = Prophet()
m.fit(df_train)
future = m.make_future_dataframe(periods=period)
forecast = m.predict(future)
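# The forecast dataframe returned by Prophet contains, among other columns,
# ds, yhat, yhat_lower and yhat_upper: the point forecast and the bounds of
# its uncertainty interval for every date in the future dataframe.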

# Show and plot the forecast
st.subheader('Forecast data')
st.write(forecast.tail())

st.write(f'Forecast plot for {n_years} years')
fig1 = plot_plotly(m, forecast)
st.plotly_chart(fig1)

st.write("Forecast components")
fig2 = m.plot_components(forecast)
st.write(fig2)
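Assuming the listing above is saved as a Python script, for example app.py (the file name is only illustrative), the application can be launched locally from a terminal with the command streamlit run app.py, after which Streamlit serves the interface in the browser.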
