STOCK PRICE PREDICTION USING MACHINE LEARNING

BY

UDEH ONYEDIKACHI PETER
U17/NAS/CSC/251

DEPARTMENT OF COMPUTER SCIENCE/MATHEMATICS

ENUGU.

JULY, 2021
APPROVAL
This research titled STOCK PRICE PREDICTION USING MACHINE LEARNING has been assessed and approved by the Committees of the Department of Computer Science and the Faculty.
CERTIFICATION
This is to certify that this research titled STOCK PRICE PREDICTION USING MACHINE LEARNING was carried out by UDEH ONYEDIKACHI PETER with Registration Number U17/NAS/CSC/251 of the University, Enugu.
DEDICATION
This research work is dedicated to Almighty God, who gave me the wisdom, understanding and strength to complete it.
ACKNOWLEDGEMENTS
With all my heart, I appreciate God Almighty for his protection and guidance and for seeing this research work through from the beginning to the end. I wish to express my heartfelt gratitude to my able supervisor, Dr. S.C. Echezona, who made this research work a reality through his dedication and corrections. Special thanks to my Head of Department, Dr. J.B. Agbogun, and all the lecturers and staff of the Computer Science/Mathematics department for their contributions to my academic life. I express my immense gratitude to my lovely parents for their unfailing support, words of encouragement and prayers. I say a big thank you for your immense support.
ABSTRACT
This study lays the groundwork for democratizing machine learning technology for retail investors by connecting machine learning models to a web application built for them. It provides predictions and visualizations to help investors navigate the stock markets and make more intelligent decisions. Prophet, Ridge Regression, Recurrent Neural Networks, and Bagging Regression are among the stock price prediction algorithms and models created. The models are driven by time-series data carrying a wide variety of basic features, including technical-analysis features computed over distinct historical windows. Numerous feature selection and feature extraction approaches are used to find the best features for the problem. The major technologies used for the web application include Streamlit, SQLite and Bootstrap.
TABLE OF CONTENTS
Title page
Approval
Certification
Dedication
Acknowledgements
Abstract
Table of contents
List of tables
List of figures
2.1 Introduction
3.1 Overview
4.1 Preamble
5.1 Summary
5.2 Conclusion
5.3 Recommendation
References
Appendix A
Appendix B
LIST OF TABLES
Table 2.1: The train time, accuracy score and mean squared error (MSE) on both the train and test data for each model
LIST OF FIGURES
Figure 2.1: A line plot of the actual price versus the predicted price using the Ridge Regression model
Figure 2.2: A line plot of the actual price versus the predicted price using the Prophet model
Figure 2.3: A line plot of the actual price versus the predicted price using the Bagging Regression model
Figure 2.4: A line plot of the actual price versus the predicted price using the LSTM model
Figure 4.2: The raw data and interactive visualization of the raw data
Figure 4.3: The forecast data and interactive visualization of the forecast data
CHAPTER ONE
INTRODUCTION
In recent years, economists and investors have made many attempts to predict the behavior of bonds, currencies, stocks and other economic markets. These attempts were inspired by various observations of how uncertain economic markets can be.
Retail investors spend a lot of time trying to find investment opportunities. Wealthier investors can seek professional financial advisory services, but this is not true for retail investors because the costs are prohibitive. Thus, retail investors need to study the market themselves and make informed decisions on their own, which makes investing very stressful for them. Their decisions get swayed by cognitive biases or personal emotions, leading to avoidable losses. Even if retail investors are watchful enough, most do not have sufficient skills to process the huge volume of data required to make good investment decisions. Institutional investors rely on sophisticated models backed by technology to avoid such traps, but retail investors do not have access to those technologies and often find themselves falling behind the market.
Without access to quantitative and data-driven models, one obvious approach retail investors could use to gauge the market is through simple indicators, for instance, linear regression, Bollinger bands and the exponential moving average (EMA). Another common approach retail investors might use to predict the stock market is to draw a linear regression line that extrapolates the recent price trend. Inspired by the increasing popularity of machine learning algorithms for forecasting applications, these algorithms might function as potential tools to discover hidden patterns within the trend of stock prices, and such information might supply extra insights for retail investors when making investment decisions. Therefore, this final year project aims to research the usefulness of machine learning in predicting stock prices and to democratize such technologies through a simple web application.
Various prediction techniques have been studied in the stock market prediction field, and researchers still focus on implementing the latest techniques so as to enhance prediction accuracy.
Retail investors often find it difficult to seek professional financial advisory services because the costs are excessive. The main goal of this research project is to provide retail investors with a web application that uses machine learning to help them steer within the fast-changing stock market. The objective of the project is to introduce and democratize current Machine Learning and Deep Learning technologies for retail investors to help them make investment decisions. No prediction is 100% accurate. Therefore, the upper bound and lower bound of the stock prices will be displayed to illustrate the trading range the investors should be looking at. This application is an additional quantitative tool for investors to gauge the market.
The main aim of the study is to develop a web application that provides stock price predictions.
Specific Objectives:
i. Explore how different machine learning approaches can be used and how they will affect the accuracy of the stock price prediction.
ii. Investigate how different hyperparameters can be tuned for better performance of the models.
This research is carried out with the main objective of introducing and democratizing the latest machine learning technologies and helping retail investors make investment decisions. Based on the results obtained, it is hoped that this study will:
1. Provide retail investors access to quantitative and data-driven models useful for making investment decisions.
3. Provide a basis for researchers who are interested in applying machine learning algorithms to fundamental data such as accounting information and other financial data.
1. The scope of this project does not exceed a generalized suggestion tool.
2. No prediction is 100% accurate, since there are many parameters that can directly affect the stock market and not all of them can be taken into account.
3. This system is limited to users who have knowledge of the stock market.
CHAPTER TWO
LITERATURE REVIEW
2.1 Introduction
This chapter outlines the various technologies used in this study and presents a review of related works.
This project is partitioned into two sections, namely a research segment and an application segment. The Machine Learning and Deep Learning algorithms used in the research part include the Ridge Linear Regression model, the BaggingRegressor model, LSTM, Prophet and the AdaBoostRegressor model, while the major technologies used for the application part include Streamlit, SQLite and Bootstrap.
2.2.1 Python
Python is one of the most powerful and most popular programming languages for scientific computing. It is easy to learn, has efficient data structures, and takes a simple but effective approach to object-oriented programming. Python's elegant syntax and dynamic typing, together with its interpreted nature, make it an ideal language for scripting and rapid application development in many areas on most platforms (Rossum 2020). In this project, Python packages/libraries will be used for the Machine Learning predictions. The server-side web framework that will be used to serve the predicted prices is also written in Python.
2.2.2 Streamlit
Streamlit is an open-source software framework for deploying Data Science and Machine Learning projects. It provides simple data optimization, deployment, and statistical analysis with very little code. It also eliminates the need for prior experience with web service frameworks such as Django and Flask. This is especially handy when working on data dashboards with a team that is mainly made up of non-technical people. Streamlit is simple to use because it builds an interactive data-driven web application using predefined commands. Simple instructions like st.write() may be used to build a wide range of objects, from simple text to interactive charts.
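A minimal sketch of this style, assuming Streamlit is installed and the script is launched with "streamlit run app.py" (the data below is a placeholder, not the project's dataset):

import streamlit as st
import pandas as pd

st.title("Stock Price Prediction")  # page title
st.write("Closing prices for a sample ticker")  # st.write renders text, dataframes and more
df = pd.DataFrame({"close": [101.2, 102.5, 99.8, 103.1]})  # placeholder data
st.line_chart(df)  # a one-line interactive chart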
2.2.3 SQLite
SQLite is a relational database management system (RDBMS) that will be used in this project to
store user data and preferences about their choice of stocks. SQLite is an open source embedded
relational database designed to provide a convenient way for applications to manage data without
the overhead that often comes with dedicated relational database systems (Owens 2006). SQLite
is easy to configure, easy to use, highly portable and efficient compared to other relational database management systems. It is serverless: it does not need a server process to operate, and the database storage file is accessed directly through the SQLite library.
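A minimal sketch of how such preferences could be stored with Python's built-in sqlite3 module (the table and column names here are illustrative, not the project's actual schema):

import sqlite3

conn = sqlite3.connect("app.db")  # the whole database is a single local file
conn.execute("CREATE TABLE IF NOT EXISTS preferences (username TEXT, ticker TEXT)")
conn.execute("INSERT INTO preferences VALUES (?, ?)", ("peter", "GOOG"))
conn.commit()
print(conn.execute("SELECT ticker FROM preferences").fetchall())
conn.close()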
2.2.4 Bootstrap
Bootstrap is now the most widely used HTML, CSS, and JS framework for creating responsive,
mobile-first web projects. It is used to speed up and simplify front-end web development. It's
designed for people of all skill levels, devices of all sizes, and projects of all sizes. In this project,
Bootstrap will be used as the front end technology for displaying the Machine Learning predicted
results and other features the user will be seeing on the web page.
This study will explore different Machine Learning models, how they can be used and how they
will affect the accuracy of the stock price prediction. The Machine Learning algorithms used in
this project are part of the Scikit-learn library while the Deep Learning algorithms are part of the
PyTorch library.
2.2.5.1 Scikit-learn
Scikit-learn is an open source Python library that provides supervised and unsupervised Machine Learning algorithms through a consistent and easy-to-use interface tightly integrated with the Python programming language (Pedregosa et al., 2011).
2.2.5.2 PyTorch
PyTorch is open source and focuses on both usability and speed. PyTorch offers an imperative and Pythonic programming style that supports code as a model, simplifies debugging, and is consistent with other widely known machine learning libraries, all while remaining efficient and continuing to support hardware accelerators like GPUs (Paszke 2019). PyTorch is easy to learn, simple to use and more Pythonic compared to other Deep Learning frameworks like TensorFlow. PyTorch has a special feature called data parallelism which allows it to distribute computational workload among multiple CPU or GPU cores. Some of the PyTorch Deep Learning algorithms include the Recurrent Neural Network (RNN) and the Convolutional Neural Network (CNN).
Ridge Regression: a regularization term equal to $\alpha \sum_{i=1}^{n} \theta_i^2$ is added to the cost function. This forces the learning algorithm to not only fit the data but also keep the model weights as small as possible (Aurélien 2019). The parameter $\alpha$ determines how much the model should be regularized. If $\alpha = 0$, Ridge Regression is simply Linear Regression; if it is very large, all weights end up very close to zero, resulting in a flat line passing through the mean of the data. Ridge regression is a model tuning method used on data that suffer from multicollinearity; it performs L2 regularization. When there is a problem with multicollinearity, least-squares estimates are unbiased but their variances are large, resulting in predicted values that are far from the actual values.
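A minimal sketch of the effect of $\alpha$ on the learned weights, using synthetic data rather than the project's dataset:

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Larger alpha shrinks the learned weights toward zero (L2 regularization).
for alpha in (0.0, 1.0, 100.0):
    print(alpha, np.round(Ridge(alpha=alpha).fit(X, y).coef_, 3))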
Prophet is a simple yet robust estimation method because of its structure of adjusting parameters without investigating the original model's details. It includes a time series model that can be decomposed into three main components: trend, holidays, and seasonality. A recent study, "Long-Term Forecasting of Electrical Loads in Kuwait Using Prophet and Holt–Winters Models", compared Prophet with the Holt–Winters model in Kuwait's long-term peak load forecasting. The use of this method in forecasting is expected to spread due to its robustness and accuracy. Prophet is a method for forecasting time series data that uses an additive model to fit non-linear trends with yearly, weekly, and daily seasonality, as well as holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data.
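The decomposition described above can be written as the additive model

$$y(t) = g(t) + s(t) + h(t) + \varepsilon_t$$

where $g(t)$ is the trend component, $s(t)$ the seasonality, $h(t)$ the holiday effects, and $\varepsilon_t$ the error term.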
Bagging combines multiple regression models with the aim of reducing prediction-process variance (Sadrmomtazi 2013). Bagging is built on the development of individual regression models that use a randomly drawn training set to train a single algorithm. There is a random training set of N instances for each regression model (N = size of the original training set), so a significant number of the original instances may be repeated while others are completely omitted. The average of the prediction values is used to provide the final prediction after iteratively building several regression models. For a long time, the approach has also been associated with a proper way of handling missing values in datasets.
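With $M$ such bootstrap models $f_1, \dots, f_M$, the final bagged prediction for an input $x$ is their average:

$$\hat{y}(x) = \frac{1}{M} \sum_{m=1}^{M} f_m(x)$$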
A recurrent neural network (RNN) is a type of artificial neural network that recognizes sequential patterns in data to predict the following scenarios (Laskowski 2018). This architecture is particularly powerful due to its node connections, which enable the display of temporal dynamic behavior. The use of feedback loops to process a sequence is another important feature, referred to as memory. Because of this behavior, RNNs are ideal for time series problems. Long short-term memory (LSTM) architectures were developed based on this structure. LSTMs are specifically designed to prevent the problem of long-term dependency. They don't have to work hard to remember information for lengthy periods of time; it's nearly second nature to them (Olah 2015). They are currently frequently utilized and function exceptionally effectively in a wide range of situations.
The dataset used for this research is OHLCV (Open, High, Low, Close, Volume) historical data fetched using the Yahoo Finance API. Other features were extracted from the fetched dataset. The features used are the closing price of each trading day, the volume of stock traded on each day, the S&P 500 index, and the Bollinger bands derived from the Simple Moving Average.
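A minimal sketch of this feature extraction (the ticker, date range and 20-day window are illustrative; the project's full code is in Appendix A):

import yfinance as yf

df = yf.download("GOOG", start="2015-01-01", end="2021-07-01")

# Bollinger bands: a simple moving average plus/minus two rolling standard deviations
sma = df["Close"].rolling(window=20).mean()
std = df["Close"].rolling(window=20).std()
df["upper_band"] = sma + 2 * std
df["lower_band"] = sma - 2 * std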
After data extraction, the next step was feature scaling. The reason for feature scaling is to bring every feature onto the same footing without any upfront importance. Another reason feature scaling is applied is that some algorithms, such as neural networks trained with gradient descent, converge much faster with feature scaling than without it (Roy 2020). The MinMaxScaler class from the Scikit-learn library was used to perform the feature scaling. It scales the data by using a technique called normalization, in which the values are shifted and rescaled so that they end up ranging between 0 and 1.
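Min-max normalization rescales each feature value $x$ using the feature's minimum and maximum:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$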
After feature scaling, the scaled data is used to train each model. After training, the mean squared error was used to measure the average of the squares of the errors, i.e., the average squared difference between the estimated values and the actual values.
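For $n$ samples with actual values $y_i$ and predicted values $\hat{y}_i$, this is

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$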
Table 2.1: The train time, accuracy score and mean squared error (MSE) on both the train and test data for each model
Table 2.1 above shows the performance of each model on the prepared dataset. It can be observed that every model except the LSTM is overfitting: the MSE on their respective test data is much higher than on their training data, so they may not generalize well to unseen data.
Figure 2.1: A line plot of the actual price versus the predicted price using Ridge Regression
model.
Figure 2.2: A line plot of the actual price versus the predicted price using the Prophet
model.
Figure 2.3: A line plot of the actual price versus the predicted price using Bagging
Regression model
Figure 2.4: A line plot of the actual price versus the predicted price using LSTM model
We don't always know what the best model architecture is for a given problem, so we'd like to be able to experiment with a variety of options. In true machine learning fashion, we'll ideally ask the machine to perform this exploration and automatically select the best model architecture. The parameters that define the model architecture are known as hyperparameters, and the process of searching for the best model architecture is known as hyperparameter tuning (Jordan 2017). The difficulty with hyperparameters is that there is no single magic number that works everywhere; the best values vary depending on the task and the dataset.
Knowing where to begin can be difficult given the number of hyperparameters that you may want to tune. With this in mind, the hyperparameters tuned in this project are described below.
The single most important hyperparameter is the learning rate, and it should always be tuned. The learning rate is a hyperparameter that governs how much we adjust the weights of our network in relation to the loss gradient (Zulkifli 2018). A good starting point is 0.01. If our learning rate is too low, it will take a much longer time (hundreds or thousands of epochs) to reach the ideal state. On the other hand, if our learning rate is too high, it will overshoot the ideal state and training may never converge.
In the report "Cyclical Learning Rates for Training Neural Networks", Smith (2018) argued that you could estimate a good learning rate by training the model initially with a very low learning rate and gradually increasing it, observing at which rate the loss falls fastest.
Gradient descent is a popular optimization algorithm for determining the weights or coefficients of machine learning algorithms like artificial neural networks and logistic regression. It works by having the model predict on training data and then using the error of those predictions to update the model in the direction that reduces the error.
Mini-batch gradient descent is a gradient descent variation that divides the training dataset into small batches that are used to calculate model error and update model coefficients (Brownlee 2017). Implementations may choose to sum the gradient over the mini-batch, which reduces the gradient's variance even further. Mini-batch gradient descent is the recommended gradient descent variant for most applications, particularly deep learning. Mini-batch sizes, also known as "batch sizes" for brevity, are frequently tuned to an aspect of the computational architecture on which the implementation is running, for example a power of two that matches the memory layout of the GPU or CPU hardware, such as 32, 64, 128 or 256. Small values result in a learning process that converges quickly at the expense of training noise; large values result in a slowly converging learning process with accurate estimates of the error gradient.
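A sketch of mini-batch loading in PyTorch (batch size 64 matches the value used later in this project; the tensors here are placeholders):

import torch
from torch.utils.data import DataLoader, TensorDataset

X = torch.randn(1000, 8)  # placeholder features
y = torch.randn(1000, 1)  # placeholder targets
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=False, drop_last=True)
for x_batch, y_batch in loader:
    pass  # one gradient update would happen here per mini-batch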
The number of epochs determines how many times the network's weights will be updated. As the number of epochs increases, the neural network's weights are updated more and more times, and the fit shifts from underfitting to optimal to overfitting. The validation error is the metric we should be looking at when deciding on the number of epochs for our training step. The intuitive manual method is to train the model for as long as the validation error keeps decreasing. To determine when to stop training the model, a technique known as Early Stopping is used: if the validation error has not improved in the last 10 or 20 epochs, the training process is terminated.
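A minimal early-stopping loop under these assumptions, with a patience of 10 epochs (train_one_epoch and validate are hypothetical helpers standing in for the real training and validation code):

best_val, patience, bad_epochs = float("inf"), 10, 0
for epoch in range(200):
    train_one_epoch()  # hypothetical: one pass over the training set
    val_loss = validate()  # hypothetical: mean loss on the validation set
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0  # improvement: reset the counter
    else:
        bad_epochs += 1
        if bad_epochs >= patience:  # no improvement for `patience` epochs
            break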
Various researchers have worked to improve the accuracy of stock price prediction; some of their studies are reviewed below.
Piramuthu (2004) thoroughly evaluated various feature selection methods for data mining applications. He compared how different feature selection methods optimized decision tree performance using datasets such as credit approval data, loan default data, web traffic data, Tam data, and Kiang data. He compared probabilistic distance measures such as the Bhattacharyya measure, the Matusita measure, the divergence measure, the Mahalanobis distance measure, and the Patrick-Fisher measure, as well as inter-class distance measures such as the Minkowski distance measure, city block distance measure, Euclidean distance measure, and nonlinear distance measure. The author's evaluation of both probabilistic distance-based and several inter-class feature selection methods is a strength of this paper, as is the fact that the evaluation was conducted on various datasets. The evaluation algorithm, on the other hand, was only a decision tree; we don't know if the feature selection methods would still perform as well with other learning algorithms.
In their study "Stock market forecasting using Hidden Markov Model", Hassan and Nath (2005) used the Hidden Markov Model (HMM) to forecast the stock prices of four different airlines. They divide the model's states into four categories: the opening price, the closing price, the highest price, and the lowest price. The approach used in this paper is unique in that it does not require expert knowledge to build a prediction model. However, because this work is limited to the airline industry and is based on a very small dataset, it may not result in a generalizable prediction model. One of the approaches used in other stock market prediction works could have been used as a comparison. The authors also limited the date range of the training and testing datasets to a maximum of two years.
Lee (2009) predicted stock trends using the support vector machine (SVM) and a hybrid feature selection method in "Using support vector machine with a hybrid feature selection method to the stock trend prediction". The dataset used in this study is a subset of the NASDAQ Index from the Taiwan Economic Journal Database (TEJD) in 2008. The feature selection part used a hybrid method, with supported sequential forward search (SSFS) acting as the wrapper. Another advantage of this work is that the authors created a detailed procedure for parameter adjustment and reported performance under various parameter values. The feature selection model's clear structure is also helpful at the primary stage of model structuring. One limitation was that the performance of SVM was only compared to back-propagation neural networks (BPNNs) and not to other machine learning models.
Lei (2018) used a Wavelet Neural Network (WNN) to predict stock price trends in "Wavelet neural network prediction method of stock price trend based on rough set attribute reduction". As an optimization, the author used Rough Set (RS) for attribute reduction: it was used to reduce the dimensions of the stock price trend features and to determine the Wavelet Neural Network's structure. This work's dataset includes five well-known stock market indices: (1) the SSE Composite Index (China), (2) the CSI 300 Index (China), (3) the All Ordinaries Index (Australia), (4) the Nikkei 225 Index (Japan), and (5) the Dow Jones Index (USA). The model was evaluated on these various stock market indices, and the results were convincing and general. The computational complexity is reduced by using Rough Set to optimize the feature dimension before processing. However, in the discussion section, the author only emphasized parameter adjustment and did not mention the model's flaws. Because the evaluations were done on indices, the same model may not perform as well when applied to a specific stock.
Recent studies make use of input data from a variety of sources and in a variety of formats. Some systems use historical stock data to perform mathematical analysis, others use financial news articles or expert reviews to perform sentiment analysis, and still others use a hybrid system that combines multiple inputs to forecast the market.
CHAPTER THREE
SYSTEM ANALYSIS AND DESIGN
3.1 Overview
In this study, the Object-Oriented Analysis and Design (OOAD) methodology and the notation symbols of the Unified Modelling Language (UML) will be used for the analysis of the system. OOAD models a system as a group of interacting objects, encouraging and facilitating the reuse of software components. The OOAD methodology uses UML diagrams, such as use case and activity diagrams, to describe the system.
Before the proposal of the new system, retail investors spend a lot of time trying to find investment opportunities. One obvious approach retail investors could use to gauge the market is by drawing technical indicators on a visualized price history, for instance, Bollinger bands, the Simple Moving Average, the Exponential Moving Average (EMA) and the Relative Strength Index (RSI). This requires a lot of work, and decisions still get swayed by cognitive biases or personal emotions.
Having an overview of the existing system, some of its problems are as follows:
1. Avoidable losses from bad investment decisions due to cognitive biases or personal emotions.
2. Avoidable losses due to lack of proper analysis of stock market data.
3. Time consuming: a lot of time is spent trying to find investment opportunities or trying to analyse market data.
The proposed system will process a huge volume of data required to make good investment
decisions and use quantitative and data-driven models to gauge the market and predict stock
prices. Because it does not account for human behaviors, the proposed system will not always
produce accurate results. Factors such as a change in company leadership, internal issues, strikes,
protests, natural disasters, and a change in authority cannot be considered when relating to a
change in the stock market by the machine. The system's goal is to provide an estimate of where
the stock market might be headed. Many factors and parameters may influence it along the way.
1. The use of quantitative and data driven models to make investment decisions.
2. Time saving: retail investors no longer have to waste time processing market data.
3. Since the models are data driven, the losses will be less because models cannot be
swayed by emotions.
4. Less expensive: retail investors won’t have to spend the huge amount required to hire
financial advisors.
The proposed solution is comprehensive as it includes pre-processing of the stock market dataset,
utilization of multiple features and visualization of the stock market price trend prediction.
3.4.1 Use Case
A use case is a list of steps, typically defining interactions between a role (actor) and a system. Use case diagrams are used to gather the requirements of a system, including internal and external influences. These requirements are mostly design requirements. Hence, when a system is analyzed to gather its functionalities, use cases are prepared and actors are identified. When the initial task is complete, use case diagrams are modelled to present the outside view.
Activity diagrams represent workflows in a graphical way. They can be used to describe the flow of activities and actions within the system.
CHAPTER FOUR
IMPLEMENTATION
4.1 Preamble
This chapter describes the implementation of the new system and the software and hardware used.
My choice of IDE is Jupyter Notebook. Jupyter Notebook is a free and open-source web tool that lets us write and share code and documents. It provides an environment in which you may document your code, run it, examine the findings, visualize data, and view the results without having to leave the environment. This makes it a useful tool for completing end-to-end data science workflows, including data cleansing, statistical modeling, and constructing and training machine learning models.
All the machine learning related code is written in Python. The Python programming language was used mainly for the development of the back-end server of the system, and the Streamlit framework was used to serve the machine learning predictions to the user interface.
There was extensive testing of the software throughout the implementation. The testing was in two phases:
1. Component test
2. System test
Component test: during the development of the project, Jupyter Notebook provided the capability to test each model or function in real time. The functions were tested with different variables in order to detect any possible error, while the models were tuned with different hyperparameters.
System test: the completed system was tested on more realistic data after completion to confirm that it behaves as intended.
Screenshots
Figure 4.1 shows different stock tickers of different companies to select from, figure 4.2 shows the raw data and interactive visualization of the raw data for any selected stock, and figure 4.3 shows the forecast data and interactive visualization of the forecast data for any selected stock.
Figure 4.1: Different stock tickers of different companies to select from.
Figure 4.2: The raw data and interactive visualization of the raw data.
Figure 4.3: The forecast data and interactive visualization of the forecast data.
4.5 User Manual
To run the whole program on a personal workstation or PC, follow these steps:
2. Install all the required modules by running "$ pip install ." in the project directory (which contains setup.py) on the terminal or command prompt.
Some of the source code snippets for this implementation can be seen at Appendix A and
Appendix B.
CHAPTER FIVE
SUMMARY, CONCLUSION AND RECOMMENDATION
5.1 Summary
Stock investment can help you build your savings and protect your money from inflation and taxes. But deciding what to invest in, or where, can be difficult, and the cost of hiring investment professionals/experts can be prohibitive for retail investors. This study lays the groundwork for democratizing machine learning technology for retail investors by connecting machine learning models to a web application built for them. It provides predictions and visualizations to help investors navigate the stock markets and make more intelligent decisions.
5.2 Conclusion
This stock price prediction web application helps retail investors make informed decisions about the stock market, using technical and data-driven models to help them reduce investment losses and save time. Currently, the platform is limited to a few features, but it is easily extensible to support upgrades.
5.3 Recommendation
Having presented all that is necessary for the implementation of this research, the following recommendation is made: future work should consider other factors that affect the price of the stock market, such as fundamental data (for example, accounting information) and market sentiment.
REFERENCES
Aurélien, G. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (2019, p. 135).

Brownlee, J. (2017, July 21, para. 8-22). A Gentle Introduction to Mini-Batch Gradient Descent and How to Configure Batch Size. Machine Learning Mastery. https://tinyurl.com/kvnd852f

Hassan, M. R., & Nath, B. (2005). Stock market forecasting using Hidden Markov Model: a new approach. In Proceedings of the 5th International Conference on Intelligent Systems Design and Applications (ISDA'05), pp. 192-196. https://doi.org/10.1109/ISDA.2005.85

Jordan, J. (2017, November 2, para. 1). Hyperparameter tuning for machine learning models. Jeremy Jordan. https://www.jeremyjordan.me/hyperparameter-tuning/

Laskowski, N. (2018, para. 1). Recurrent neural networks. Retrieved from searchenterpriseai.techtarget.com/definition/recurrent-neural-networks

Lee, M. C. (2009). Using support vector machine with a hybrid feature selection method to the stock trend prediction. Expert Systems with Applications, 36(8), pp. 10896-10904. https://doi.org/10.1016/j.eswa.2009.02.038

Lei, L. (2018). Wavelet neural network prediction method of stock price trend based on rough set attribute reduction. Applied Soft Computing Journal, 62, pp. 923-932. https://doi.org/10.1016/j.asoc.2017.09.029

Olah, C. (2015, August 27, para. 14). Understanding LSTM Networks. Colah. http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Owens, M. (2006). The Definitive Guide to SQLite (1st ed., p. 1). Apress.

Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Köpf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., & Chintala, S. (2019, p. 1). PyTorch: An Imperative Style, High-Performance Deep Learning Library.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, pp. 2825-2830.

Piramuthu, S. (2004). Evaluating feature selection methods for learning in data mining applications. European Journal of Operational Research, 156(2), pp. 483-494. https://doi.org/10.1016/S0377-2217(02)00911-6

Rossum, V. G., & Team, P. D. (2018b, p. 1). Python Tutorial: Release 3.6.4. 12th Media Services.

Roy, B. (2020, April 7, para. 9). All about Feature Scaling. Towards Data Science, Medium. https://towardsdatascience.com/all-about-feature-scaling-bcc0ad75cb35

Sadrmomtazi, S. M. (2013). Modelling Compressive Strength of EPS Lightweight Concrete Using Regression, Neural Networks and ANFIS. Construction and Building Materials Journal, 42, pp. 205-216.

Saxena, A., Dhadwal, M., & Kowsigan, M. (2020, p. 5). Indian Crop Production: Prediction and Model Deployment Using ML and Streamlit. Turkish Journal of Physiotherapy and Rehabilitation.

Smith, L. N. (2018, July 16, p. 5). Cyclical Learning Rates for Training Neural Networks.

Zulkifli, H. (2019, March 28, para. 3). Understanding Learning Rates and How It Improves Performance in Deep Learning. Medium. https://tinyurl.com/39kefbyr
APPENDICES
Appendix A: Source code for the machine learning models

import yfinance as yf
import datetime as dt
import numpy as np
import pandas as pd
import time
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.ensemble import BaggingRegressor, AdaBoostRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from prophet import Prophet
# get the start and end date dynamically using the datetime module
end = dt.datetime.today()
start = end - dt.timedelta(days=365 * 5)  # assumed look-back window of about five years
print(start, end)

indicator = 'GOOG'
# download the OHLCV history for the selected ticker and normalize the column/index names
selected_df = yf.download(indicator, start=start, end=end)
selected_df.columns = [col.lower() for col in selected_df.columns]
selected_df.index.name = 'date'
selected_df.close.plot(figsize=(15,7))

selected_df['previous_3_days'] = selected_df.close.shift(-3)
# Nasdaq index
selected_df['nasdaq'] = yf.download('NDX', start=start, end=end)['Close']
# Bollinger Band (assumed 20-day window)
sma_20 = selected_df.close.rolling(window=20).mean()
std_20 = selected_df.close.rolling(window=20).std()
# upper band
selected_df['upper_band'] = sma_20 + 2 * std_20
# lower band
selected_df['lower_band'] = sma_20 - 2 * std_20

# Target: the closing price a few trading days ahead
interval_for_target = 5
selected_df['target'] = selected_df.close.shift(-interval_for_target)
selected_df.dropna(inplace=True)
# Chronological train/test split (assumed 80/20)
split = int(len(selected_df) * 0.8)
train_data, test_data = selected_df.iloc[:split], selected_df.iloc[split:]
X_train = train_data.drop(columns=['target'])
y_train = train_data['target']
X_test = test_data.drop(columns=['target'])
y_test = test_data['target']

min_max_scaler = MinMaxScaler()
X_train_scaled = min_max_scaler.fit_transform(X_train)

# Note that Ridge regression performs linear least squares with L2 regularization.
# Create and train the Ridge Linear Regression Model
ridge_model = Ridge()
start = time.time()
ridge_model.fit(X_train_scaled, y_train)
stop = time.time()
print(f"Training time: {stop - start}s")

X_test_scaled = min_max_scaler.transform(X_test)
accuracy = ridge_model.score(X_test_scaled, y_test)  # R^2 score on the test data
print(accuracy)

selected_df_plus_pred = test_data
selected_df_plus_pred['ridge_predicted'] = ridge_model.predict(X_test_scaled)
selected_df_plus_pred.head()

# Plotting the performance of the Ridge model
selected_df_plus_pred[['target', 'ridge_predicted']][300:].plot(figsize=(15,7))
mean_squared_error(y_train, ridge_model.predict(X_train_scaled))
# Create and train the Bagging Regression model
lr_bag = BaggingRegressor(LinearRegression())
start = time.time()
lr_bag.fit(X_train_scaled, y_train)
stop = time.time()
print(f"Training time: {stop - start}s")

X_test_scaled = min_max_scaler.transform(X_test)
accuracy = lr_bag.score(X_test_scaled, y_test)  # R^2 score on the test data
print(accuracy)
mean_squared_error(y_train, lr_bag.predict(X_train_scaled))

selected_df_plus_pred['bagging_predicted'] = lr_bag.predict(X_test_scaled)
selected_df_plus_pred.head()

# Plotting the performance of the Bagging model
selected_df_plus_pred[['target', 'bagging_predicted']][300:].plot(figsize=(15,7))
# Create and train the AdaBoost Regression model
lr_boost = AdaBoostRegressor(LinearRegression())
lr_boost.fit(X_train_scaled, y_train)

X_test_scaled = min_max_scaler.transform(X_test)
accuracy = lr_boost.score(X_test_scaled, y_test)  # R^2 score on the test data
print(accuracy)
mean_squared_error(y_test, lr_boost.predict(X_test_scaled))

selected_df_plus_pred['adaboost_predicted'] = lr_boost.predict(X_test_scaled)
selected_df_plus_pred.tail()

# Plotting the performance of the AdaBoost model
selected_df_plus_pred[['target', 'adaboost_predicted']][300:].plot(figsize=(15,7))
# Prepare the data in the two-column (ds, y) format that Prophet expects
df_for_prophet = selected_df.reset_index()[['date', 'close']].rename(columns={'date': 'ds', 'close': 'y'})
df_for_prophet.reset_index(drop=True, inplace=True)

# Chronological train/test split for Prophet (assumed 80/20)
prophet_split = int(len(df_for_prophet) * 0.8)
prophet_train_data = df_for_prophet.iloc[:prophet_split]
prophet_test_data = df_for_prophet.iloc[prophet_split:]
prophet_X_test = prophet_test_data.drop(columns=['y'])
prophet_y_test = prophet_test_data['y']

# Make prediction
pht_model = Prophet()
start = time.time()
pht_model.fit(prophet_train_data)
stop = time.time()
print(f"Training time: {stop - start}s")

future = pht_model.make_future_dataframe(periods=len(prophet_y_test), freq='MS')
prophet_pred = pht_model.predict(future)
prophet_pred

df_for_prophet['predicted'] = prophet_pred['yhat']
mean_squared_error(df_for_prophet['y'], df_for_prophet['predicted'])
r2_score(df_for_prophet['y'], df_for_prophet['predicted'])

# Plotting the performance of the Prophet model
df_for_prophet[['y', 'predicted']][300:].plot(figsize=(15,7))
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader

device = 'cpu'  # assumed device; change to 'cuda' if a GPU is available

training_data = selected_df.copy()
training_data['target'] = training_data.close.shift(-5)
training_data.dropna(inplace=True)

# split a dataframe into a feature matrix (X) and a label column (y)
def feature_label_split(df, target_col):
    y = df[[target_col]]
    X = df.drop(columns=[target_col])
    return X, y

# assumed chronological 60/20/20 split built from two sklearn train_test_split calls
def train_val_test_split(df, target_col, test_ratio):
    val_ratio = test_ratio / (1 - test_ratio)
    X, y = feature_label_split(df, target_col)
    X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, test_size=test_ratio, shuffle=False)
    X_train, X_val, y_train, y_val = train_test_split(X_trainval, y_trainval, test_size=val_ratio, shuffle=False)
    return X_train, X_val, X_test, y_train, y_val, y_test
X_train, X_val, X_test, y_train, y_val, y_test = train_val_test_split(training_data, 'target', 0.2)
scaler = MinMaxScaler()
X_train_arr = scaler.fit_transform(X_train)
X_val_arr = scaler.transform(X_val)
X_test_arr = scaler.transform(X_test)
y_train_arr = scaler.fit_transform(y_train)
y_val_arr = scaler.transform(y_val)
y_test_arr = scaler.transform(y_test)
batch_size = 64

train_features = torch.Tensor(X_train_arr)
train_targets = torch.Tensor(y_train_arr)
val_features = torch.Tensor(X_val_arr)
val_targets = torch.Tensor(y_val_arr)
test_features = torch.Tensor(X_test_arr)
test_targets = torch.Tensor(y_test_arr)

# wrap the tensors in datasets and mini-batch loaders
train = TensorDataset(train_features, train_targets)
val = TensorDataset(val_features, val_targets)
test = TensorDataset(test_features, test_targets)
train_loader = DataLoader(train, batch_size=batch_size, shuffle=False, drop_last=True)
val_loader = DataLoader(val, batch_size=batch_size, shuffle=False, drop_last=True)
test_loader = DataLoader(test, batch_size=batch_size, shuffle=False, drop_last=True)
test_loader_one = DataLoader(test, batch_size=1, shuffle=False, drop_last=True)
class LSTMModel(nn.Module):
    """LSTMModel class extends nn.Module class and works as a constructor for LSTMs.

    It has only two methods, namely init() and forward(). While the init()
    method initiates the model with the given input parameters, the forward()
    method defines how the forward propagation is computed.

    Attributes:
        hidden_dim (int): The number of nodes in each layer
        layer_dim (int): The number of layers in the network
        lstm (nn.LSTM): The LSTM model constructed with the input parameters.
        fc (nn.Linear): The fully connected layer mapping the hidden state to the output
    """

    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim, dropout_prob):
        """
        Args:
            input_dim (int): The number of nodes in the input layer
            hidden_dim (int): The number of nodes in each layer
            layer_dim (int): The number of layers in the network
            output_dim (int): The number of nodes in the output layer
            dropout_prob (float): The probability of nodes being dropped out
        """
        super(LSTMModel, self).__init__()
        self.hidden_dim = hidden_dim
        self.layer_dim = layer_dim

        # LSTM layers
        self.lstm = nn.LSTM(
            input_dim, hidden_dim, layer_dim, batch_first=True, dropout=dropout_prob
        )
        # Fully connected output layer
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        """The forward method takes input tensor x and does forward propagation

        Args:
            x (torch.Tensor): The input tensor of the shape (batch size, sequence length, input_dim)

        Returns:
            torch.Tensor: The output tensor of the shape (batch size, output_dim)
        """
        # Initialize hidden state and cell state for the first input with zeros
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()
        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()

        # Detach the states as we are doing truncated backpropagation through time (BPTT)
        # If we don't, we'll backprop all the way to the start even after going through another batch
        # Forward propagation by passing in the input, hidden state, and cell state into the model
        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))

        # Keep only the output of the last time step in the sequence
        out = out[:, -1, :]

        # Convert the final state to our desired output shape (batch_size, output_dim)
        out = self.fc(out)
        return out
def get_model(model, model_params):
    models = {
        "lstm": LSTMModel,
    }
    return models.get(model.lower())(**model_params)
class Optimization:
    """
    Optimization is a helper class that takes model, loss function, optimizer function,
    learning scheduler (optional), early stopping (optional) as inputs. In return, it
    provides a framework to train and validate the models, and to predict future values
    based on the models.

    Attributes:
        model (RNNModel, LSTMModel, GRUModel): Model class created for the type of RNN
        loss_fn (torch.nn.modules.loss): Loss function to calculate the losses
        optimizer (torch.optim.Optimizer): Optimizer function to optimize the loss
        train_losses (list): The loss values from training
        val_losses (list): The loss values from validation
    """

    def __init__(self, model, loss_fn, optimizer):
        """
        Args:
            model (RNNModel, LSTMModel, GRUModel): Model class created for the type of RNN
            loss_fn (torch.nn.modules.loss): Loss function to calculate the losses
            optimizer (torch.optim.Optimizer): Optimizer function to optimize the loss
        """
        self.model = model
        self.loss_fn = loss_fn
        self.optimizer = optimizer
        self.train_losses = []
        self.val_losses = []

    def train_step(self, x, y):
        """The method completes one step of training.

        Given the features (x) and the target values (y) tensors, the method completes
        one step of the training. First, it activates the train mode to enable back prop.
        After generating the predictions (yhat) by forward propagation, it calculates
        the losses by using the loss function. Then, it computes the gradients by doing
        back propagation and updates the weights by calling the step() function.

        Args:
            x (torch.Tensor): Tensor of features for one training step
            y (torch.Tensor): Tensor of target values for calculating losses
        """
        # Sets model to train mode
        self.model.train()
        # Makes predictions
        yhat = self.model(x)
        # Computes loss
        loss = self.loss_fn(y, yhat)
        # Computes gradients
        loss.backward()
        # Updates parameters and zeroes gradients
        self.optimizer.step()
        self.optimizer.zero_grad()
        return loss.item()
    def train(self, train_loader, val_loader, batch_size=64, n_epochs=50, n_features=1):
        """The method performs the model training.

        The method takes DataLoaders for training and validation datasets, batch size for
        mini-batch training, number of epochs and number of features as inputs.
        Then, it carries out the training by iteratively calling the method train_step for
        n_epochs times. If early stopping is enabled, then it checks the stopping condition
        to decide whether the training needs to halt before n_epochs steps. Finally, it saves
        the model in a designated file path.

        Args:
            train_loader (torch.utils.data.DataLoader): DataLoader that stores training data
            val_loader (torch.utils.data.DataLoader): DataLoader that stores validation data
            batch_size (int): Batch size for mini-batch training
            n_epochs (int): Number of epochs, i.e. complete passes, to train for
            n_features (int): Number of feature columns
        """
        model_path = 'lstm_model.pt'  # assumed save path
        for epoch in range(1, n_epochs + 1):
            batch_losses = []
            for x_batch, y_batch in train_loader:
                x_batch = x_batch.view([batch_size, -1, n_features]).to(device)
                y_batch = y_batch.to(device)
                loss = self.train_step(x_batch, y_batch)
                batch_losses.append(loss)
            training_loss = np.mean(batch_losses)
            self.train_losses.append(training_loss)

            with torch.no_grad():
                batch_val_losses = []
                for x_val, y_val in val_loader:
                    x_val = x_val.view([batch_size, -1, n_features]).to(device)
                    y_val = y_val.to(device)
                    self.model.eval()
                    yhat = self.model(x_val)
                    val_loss = self.loss_fn(y_val, yhat).item()
                    batch_val_losses.append(val_loss)
                validation_loss = np.mean(batch_val_losses)
                self.val_losses.append(validation_loss)

            print(
                f"[{epoch}/{n_epochs}] Training loss: {training_loss:.4f}\t Validation loss: {validation_loss:.4f}"
            )
        torch.save(self.model.state_dict(), model_path)
    def evaluate(self, test_loader, batch_size=1, n_features=1):
        """The method performs the model evaluation.

        The method takes DataLoaders for the test dataset, batch size for mini-batch testing,
        and number of features as inputs. Similarly to the train method, it iteratively
        predicts the target values and calculates losses. Then, it returns two lists that
        hold the predictions and the actual values.

        Note:
            This method assumes that the prediction from the previous step is available at
            the time of the prediction, and only does one-step prediction into the future.

        Args:
            test_loader (torch.utils.data.DataLoader): DataLoader that stores test data
            batch_size (int): Batch size for mini-batch testing
            n_features (int): Number of feature columns

        Returns:
            list: The values predicted by the model, and the actual values
        """
        with torch.no_grad():
            predictions = []
            values = []
            for x_test, y_test in test_loader:
                x_test = x_test.view([batch_size, -1, n_features]).to(device)
                y_test = y_test.to(device)
                self.model.eval()
                yhat = self.model(x_test)
                predictions.append(yhat.to(device).detach().numpy())
                values.append(y_test.to(device).detach().numpy())
        return predictions, values

    def plot_losses(self):
        """The method plots the calculated loss values for training and validation
        """
        plt.plot(self.train_losses, label="Training loss")
        plt.plot(self.val_losses, label="Validation loss")
        plt.legend()
        plt.title("Losses")
        plt.show()
        plt.close()
input_dim = len(X_train.columns)
output_dim = 1
hidden_dim = 64
layer_dim = 3
batch_size = 64
dropout = 0.2
n_epochs = 50
learning_rate = 1e-3
weight_decay = 1e-6

model_params = {'input_dim' : input_dim,
                'hidden_dim' : hidden_dim,
                'layer_dim' : layer_dim,
                'output_dim' : output_dim,
                'dropout_prob' : dropout}

model = get_model('lstm', model_params).to(device)

loss_fn = nn.MSELoss(reduction="mean")
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=weight_decay)

opt = Optimization(model=model, loss_fn=loss_fn, optimizer=optimizer)
opt.train(train_loader, val_loader, batch_size=batch_size, n_epochs=n_epochs, n_features=input_dim)
opt.plot_losses()

predictions, values = opt.evaluate(
    test_loader_one,
    batch_size=1,
    n_features=input_dim
)
def inverse_transform(scaler, df, columns):
    # invert the min-max scaling so values are back on the original price scale
    for col in columns:
        df[col] = scaler.inverse_transform(df[col])
    return df

def format_predictions(predictions, values, df_test, scaler):
    # assumed helper: align predictions/actuals with the test index, then un-scale
    vals = np.concatenate(values).ravel()
    preds = np.concatenate(predictions).ravel()
    df_result = pd.DataFrame({"value": vals, "prediction": preds}, index=df_test.head(len(vals)).index)
    df_result = df_result.sort_index()
    return inverse_transform(scaler, df_result, [["value", "prediction"]])

df_result = format_predictions(predictions, values, X_test, scaler)
df_result

# Plotting actual values versus LSTM predictions
df_result[['value', 'prediction']].plot(figsize=(15,7))

def calculate_metrics(df):
    # assumed metrics: MSE and R^2 between actual and predicted values
    return {'mse': mean_squared_error(df.value, df.prediction),
            'r2': r2_score(df.value, df.prediction)}

result_metrics = calculate_metrics(df_result)
Appendix B: Source code for Streamlit web application using Prophet model only
import streamlit as st
import yfinance as yf
from datetime import date
from prophet import Prophet
from prophet.plot import plot_plotly
from plotly import graph_objs as go

START = "2015-01-01"
TODAY = date.today().strftime("%Y-%m-%d")
stocks = ('NSRGF','NVDA','NFLX','AMZN','PYPL','TSLA','GOOG', 'AAPL', 'MSFT', 'GME')
selected_stock = st.selectbox('Select a stock ticker for prediction', stocks)  # assumed widget label
n_years = st.slider('Years of prediction:', 1, 4)  # assumed forecast-horizon slider
period = n_years * 365
@st.cache
def load_data(ticker):
    data = yf.download(ticker, START, TODAY)
    data.reset_index(inplace=True)
    return data
data = load_data(selected_stock)
st.subheader('Raw data')
st.write(data.tail())
def plot_raw_data():
    fig = go.Figure()
    fig.add_trace(go.Scatter(x=data['Date'], y=data['Open'], name='stock_open'))
    fig.add_trace(go.Scatter(x=data['Date'], y=data['Close'], name='stock_close'))
    fig.layout.update(title_text='Time series data', xaxis_rangeslider_visible=True)
    st.plotly_chart(fig)
plot_raw_data()
df_train = data[['Date','Close']]
df_train = df_train.rename(columns={"Date": "ds", "Close": "y"})  # Prophet expects ds/y columns

m = Prophet()
m.fit(df_train)
future = m.make_future_dataframe(periods=period)
forecast = m.predict(future)
st.subheader('Forecast data')
st.write(forecast.tail())

fig1 = plot_plotly(m, forecast)  # interactive forecast figure
st.plotly_chart(fig1)
st.write("Forecast components")
fig2 = m.plot_components(forecast)
st.write(fig2)