BTP MSR
Stock Price Trends Forecasting using Machine Learning Models
B.Tech.
by
2021
CANDIDATES' DECLARATION
We hereby certify that the work presented in this report, entitled Stock Price Trends
Forecasting using Machine Learning Models, in partial fulfillment of the requirements
for the award of the Degree of Bachelor of Technology and submitted to the institution,
is an authentic record of our own work carried out during the period June 2021 to
October 2021 under the supervision of Dr. Rajesh Rajagopal. We have also cited the
sources of the text(s)/figure(s)/table(s) that were taken from elsewhere.
This is to certify that the above statement made by the candidates is correct to the best
of my knowledge.
ABSTRACT
Stock market movements have always been difficult for investors to anticipate because
of the many factors that influence them. Stock market prediction refers to forecasting
the future behaviour of the stock market. The aim of this project is to examine a wide
range of forecasting techniques for anticipating future stock returns on the basis of
past returns and numerical news indicators. The study aims to reduce the risk of
inaccurate trend prediction by applying machine learning methods to stock price
forecasting, interpreting chaotic market information and predicting the future value of
a company's stock. The project uses machine learning algorithms that make predictions
from current stock market indices, and it also analyses the impact of the novel
coronavirus outbreak on a particular company and on the National Stock Exchange index
Nifty 50. Although the stock market can never be predicted with complete accuracy
because of its vast domain, this project aims to establish a relation between the chosen
factors and stock prices using statistical analysis, and thereby to mitigate risk.
Keywords: stock price, risk mitigation, machine learning, future scope, indicators.
ACKNOWLEDGEMENTS
We are highly indebted to Dr. Rajesh Rajagopal for giving us the autonomy to work and
to experiment with ideas. We would like to take this opportunity to express our profound
gratitude to them, not only for their academic guidance but also for their personal
interest in our project and their constant support, coupled with confidence-boosting and
motivating sessions that proved very fruitful and were instrumental in building our
self-assurance and trust. The nurturing and blossoming of the present work is mainly due
to their valuable guidance, suggestions, astute judgment, constructive criticism and eye
for perfection. Our mentor always answered our myriad doubts with smiling graciousness
and prodigious patience, never letting us feel that we are novices, always lending an
ear to our views, appreciating and improving them, and giving us a free hand in our
project. It is only because of their overwhelming interest and helpful attitude that the
present work has attained the stage it has.
Finally, we are grateful to our Institution and colleagues, whose constant encouragement
served to renew our spirit, refocus our attention and energy, and helped us in carrying
out this work.
(Chiranjeev Agrawal)
TABLE OF CONTENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
6 REFERENCES
LIST OF TABLES
LIST OF FIGURES
ABBREVIATIONS
INTRODUCTION AND
LITERATURE SURVEY
Being an inevitable part of any country's economy, the stock market plays an important
role in the growth of businesses, which eventually affects the economy of that country.
Every concerned investor in the stock market must be aware of whether stock prices are
likely to rise or fall over a particular period of time. Although the share market can
never be predicted with complete accuracy because of its vast domain, this project aims
at applying machine learning techniques to stock indicators to forecast stock prices.
Using statistical analysis, we can establish a relation between the chosen factors and
share prices, which will help in producing more accurate forecasts. This study aims to
reduce the risk of inaccurate trend prediction by applying machine learning methods to
stock price forecasting, interpreting chaotic market information and predicting the
future value of a company's stock.
1.1 INTRODUCTION
The higher the demand for a company's stock, the higher its share price; if the demand
for the stock is lower, its share price falls. The current trend in stock market
prediction technologies is the use of various machine learning models that make
predictions from stock indices by training on previous data.
Investors are well aware of the overwhelming nature of the stock market, and because of
its versatility and unpredictability, prediction techniques have always been valued by
financial analysts, business tycoons, brokers and researchers. It is therefore necessary
to build a system that maximizes accuracy while considering all the important factors
that may influence the result.
CHAPTER 1. INTRODUCTION AND LITERATURE SURVEY
(i) Data pre-processing: This refers to the series of steps that the acquired data
passes through in order to convert it into a readable format and generate meaningful
information. It involves removing unnecessary data, corrupted data and missing
values.
(ii) Training and testing dataset: After the data set has been cleaned, it is divided
into a training set and a testing set. The most recent values are included in the
training set, and the testing set consists of approximately 10 percent of the
total dataset.
(iv) Data normalization: For better accuracy, the extracted data needs to be
normalised, thereby ensuring that no factor is given an exceptionally high or
exceptionally low weightage.
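The three steps above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions, not the pipeline used in the project: the row layout, the function names, the `holdout` switch and the choice of min-max scaling fitted on the training rows only are all hypothetical, and the sample rows are placeholder numbers rather than market data. Because the text places the most recent values in the training set, the sketch holds out the oldest rows by default.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[float]]


def clean(rows: List[Row]) -> List[Row]:
    """Step (i): drop rows that contain a missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def split(rows: List[Row], test_fraction: float = 0.1,
          holdout: str = "oldest") -> Tuple[List[Row], List[Row]]:
    """Step (ii): chronological split; rows are assumed sorted oldest first."""
    n_test = max(1, round(len(rows) * test_fraction))
    if holdout == "oldest":
        return rows[n_test:], rows[:n_test]
    return rows[:-n_test], rows[-n_test:]


def min_max_fit(train: List[Row], columns: List[str]) -> Dict[str, Tuple[float, float]]:
    """Step (iv): learn the per-column minimum and maximum from the training rows."""
    return {c: (min(r[c] for r in train), max(r[c] for r in train)) for c in columns}


def min_max_apply(rows: List[Row], bounds: Dict[str, Tuple[float, float]]) -> List[Row]:
    """Rescale each column with the bounds learned from the training rows."""
    scaled = []
    for r in rows:
        out = dict(r)
        for c, (lo, hi) in bounds.items():
            out[c] = 0.0 if hi == lo else (r[c] - lo) / (hi - lo)
        scaled.append(out)
    return scaled


# Placeholder rows (not real prices), with one missing value to be removed.
rows: List[Row] = [{"High": 100.0 + i, "Low": 95.0 + i} for i in range(20)]
rows[5]["High"] = None

train, test = split(clean(rows), test_fraction=0.1)
bounds = min_max_fit(train, ["High", "Low"])
train_scaled = min_max_apply(train, bounds)
test_scaled = min_max_apply(test, bounds)
```

Fitting the scaling bounds on the training rows and reusing them for the test rows keeps information from the held-out data out of the normalization step; test values may therefore fall slightly outside the 0 to 1 range.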
(iii) High: Highest price recorded for that stock on that particular day.
(iv) Low: Lowest price recorded for that stock on that particular day.
(vi) Turnover: Total value of that stock traded on that day.
Moreover, according to further investigations, movements in market prices are not
random. Rather, they behave in a highly non-linear, dynamic manner.
Support vector machines (SVMs) are a very specific type of learning algorithm
characterized by capacity control of the decision function and the use of kernel
functions. SVMs can be of two types:
Linear SVM: It is used for linearly separable data. If a dataset can be classified into
two classes using a single straight line, it is called linearly separable data, and the
classifier used is known as a linear SVM classifier.
Non-linear SVM: If the data cannot be partitioned using a straight line, it is called
non-linearly separable data, and the classifier used is called a non-linear SVM
classifier.
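The distinction between linearly and non-linearly separable data can be illustrated with a small, self-contained sketch. It does not train an SVM: a perceptron is used as a simple stand-in for a linear classifier, and an explicit feature map (adding the product of the two inputs) stands in for what a kernel function does implicitly. The XOR-style points, the function names and the epoch limit are assumptions made purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def perceptron_separates(points: Sequence[Point], labels: Sequence[int],
                         feature_map: Callable[[Point], List[float]],
                         max_epochs: int = 1000) -> bool:
    """Return True if a perceptron finds a separating hyperplane in feature space."""
    feats = [feature_map(p) for p in points]
    w = [0.0] * len(feats[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(feats, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified (or on the boundary): update
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors means the data is separated
            return True
    return False


def identity(p: Point) -> List[float]:
    """Raw inputs: a linear classifier in the original space."""
    return [p[0], p[1]]


def with_product(p: Point) -> List[float]:
    """Hypothetical feature map that adds the product term x1 * x2."""
    return [p[0], p[1], p[0] * p[1]]


# XOR-style toy data: no single straight line separates the two classes.
xor_points: List[Point] = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
xor_labels = [-1, 1, 1, -1]

linear_ok = perceptron_separates(xor_points, xor_labels, identity)
mapped_ok = perceptron_separates(xor_points, xor_labels, with_product)
```

On this toy data the linear classifier never reaches an error-free pass, while the same classifier succeeds once the extra feature is added. A non-linear SVM achieves a comparable effect through its kernel without constructing the features explicitly.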
1.2 MOTIVATION
The variation of stock prices plays a very important role in business, and in many
ways it directly affects the economy of the country, which may indirectly affect the
ease of living and ease of doing business of common people. This makes the research
important. A correct prediction of stock prices can lead to large profits for the seller
and the broker. Machine learning aims to predict a market value close to the actual
value, thereby increasing accuracy. Stocks are affected by many social factors; among
these, a major point of interest is the study and analysis of the impact of the COVID-19
pandemic on stock indexes and on companies' stock prices.
(ii) Use of Support Vector Machines: SVMs are powerful and flexible supervised machine
learning algorithms that can be used for both classification and regression,
although they are mainly used for classification problems. They differ from other
machine learning algorithms in the way they are implemented, and they are popular
today because of their ability to handle multiple continuous variables.
Predicting stock price movements is difficult. Moreover, according to further
investigations, movements in market prices are not random. Rather, they behave in
a highly non-linear, dynamic manner. Support vector machines are a very specific
type of learning algorithm characterized by capacity control of the decision
function and the use of kernel functions. Some previous work on the same topic:
https://ieeexplore.ieee.org/abstract/document/9392366.
(iii) Non-linear multi-factor models: We deploy multilayer feed-forward neural networks
to predict a stock's excess return based on its exposure to various technical and
fundamental factors, as demonstrated previously in
https://ieeexplore.ieee.org/document/8697278/. To assess the effectiveness of the
approach, a hedged portfolio consisting of equally capitalized long and short
positions is constructed, and its historical returns are benchmarked against
Treasury bill returns and the Nifty 50 index.
(v) Random Forest: For large data sets it is better to use multiple trees rather than a
single decision tree, which may lead to an overfitted model. In a random forest we
therefore build multiple decision trees and merge them together for a more accurate
and stable prediction. The generalization error of the forest depends directly on
the strength of the individual trees. The work at https://ieeexplore.ieee.org/document/737
demonstrates this.
(vi) Forecasting stock indices: Notwithstanding extensive research that focuses on
estimating the level of return on a stock exchange index, there is a scarcity of
research examining the predictability of the direction of stock market index
movement, given the view that a prediction with little forecast error does not
necessarily translate into economic benefit. In particular, we will conduct
statistical comparisons between the two types of forecast (level of return and
direction of movement).
(vii) COVID-19 pandemic and the corporate world: We find that the pandemic-induced drop
in stock prices was milder among firms with stronger pre-2021 finances, less
exposure to COVID-19 through international supply chains and customer locations,
more CSR activities, and less entrenched executives. Furthermore, the stock prices
of firms with larger hedge fund ownership performed worse, and those of firms with
larger non-financial corporate ownership performed better. Some related work:
https://ieeexplore.ieee.org/document/9378030 (Stock Price Prediction Under
Anomalous Circumstances).
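The idea in item (v), building several trees on resampled data and merging their predictions, can be sketched as follows. This is a deliberately reduced illustration and not a full random forest: each "tree" is a one-split regression stump, only bootstrap resampling and averaging are shown, and the random feature selection of a real random forest is omitted. All names, the seed, the number of trees and the toy step-shaped data are assumptions.

```python
import random
from statistics import mean
from typing import List, Tuple

Stump = Tuple[float, float, float]  # (threshold, mean left of it, mean right of it)


def fit_stump(xs: List[float], ys: List[float]) -> Stump:
    """Fit a one-split regression tree by minimising the squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not right:  # splitting at the largest x leaves nothing on the right
            continue
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # every x identical: fall back to a constant prediction
        m = mean(ys)
        return (xs[0], m, m)
    return (best[1], best[2], best[3])


def fit_forest(xs: List[float], ys: List[float],
               n_trees: int = 25, seed: int = 0) -> List[Stump]:
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict(forest: List[Stump], x: float) -> float:
    """Merge the trees by averaging their individual predictions."""
    return mean(lm if x <= t else rm for t, lm, rm in forest)


# Toy step-shaped data (placeholder values, not stock prices).
xs = [float(i) for i in range(1, 11)]
ys = [1.0] * 5 + [5.0] * 5

forest = fit_forest(xs, ys)
low_side = predict(forest, 2.0)
high_side = predict(forest, 9.0)
```

Averaging over many resampled trees smooths out the quirks of any single tree, which is the stabilising effect the paragraph describes.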
1. To analyze state-of-the-art machine learning models used for predicting the stock
market.
2. To propose a more accurate machine learning model for predicting the stock market.
3. To analyse the impact of the COVID-19 outbreak on the stock market.
4. To reduce the uncertainty associated with investment decision making in order to
differentiate between traditional and risky stocks.
5. To reduce the dilemma of investors, especially those with very little experience of
the stock market.
The aim of the project is to examine a wide range of forecasting techniques for
anticipating future stock returns on the basis of past returns and numerical news
indicators.
CHAPTER 2
SYSTEM ARCHITECTURE AND METHODOLOGY
Data picked up from various authentic sources may contain noise and missing values, or
may be in a format that cannot be used directly by machine learning models; it is
therefore necessary to clean the data and make it suitable for use. Data pre-processing
also increases the efficiency and accuracy of a machine learning model.
The steps involved in data pre-processing are as follows:
(iii) High: Highest price recorded for that stock on that particular day.
(iv) Low: Lowest price recorded for that stock on that particular day.
(vi) Turnover: Total value of that stock traded on that day.
NIFTY 50 is NSE's diversified index, which consists of stocks of the top 50 Indian
companies across 14 sectors. Its main function is to track the market performance of
the largest-cap companies; hence, it broadly reflects the Indian economy.
The top rows of our data set are shown in Figure 3.1, which is a direct screenshot
of the Nifty index data:
CHAPTER 3. PROGRESS MADE SO FAR
(vi) Analysis of Various Models: The different techniques and models run on the
dataset are compared in this stage.
Bagging Regressor:
Adaboost Regressor:
K Nearest Regressor:
Gradient Boosting:
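As a hedged illustration of how regressors such as those listed above might be compared, the sketch below implements a small k-nearest-neighbours regressor in plain Python and scores it against a naive mean baseline using root mean squared error (RMSE). The choice of RMSE, the baseline, the value of k and all names are assumptions, since the report does not state its evaluation metric here, and the numbers are placeholder values rather than stock data.

```python
from math import sqrt
from typing import Dict, List, Sequence


def knn_predict(train_x: Sequence[Sequence[float]], train_y: Sequence[float],
                query: Sequence[float], k: int = 2) -> float:
    """Average the targets of the k training points closest to the query."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between actual and predicted values."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Placeholder training and test data with a single feature.
train_x = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
train_y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
test_x = [[2.5], [4.5]]
test_y = [5.0, 9.0]

knn_preds = [knn_predict(train_x, train_y, q, k=2) for q in test_x]
baseline_value = sum(train_y) / len(train_y)  # always predict the training mean
baseline_preds = [baseline_value for _ in test_x]

scores: Dict[str, float] = {
    "k-nearest": rmse(test_y, knn_preds),
    "mean baseline": rmse(test_y, baseline_preds),
}

# Rank models from lowest to highest error.
ranking: List[str] = sorted(scores, key=scores.get)
```

The same `scores` dictionary could hold one entry per model under comparison, so that the ranking step stays unchanged as more models are added.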
TASKS TO BE COMPLETED
(a) We will focus on gathering additional stock index data, as until now we have
focused solely on the first six months of the year. In line with the same objective,
we will try to incorporate the stock data of the next few months so that different
machine learning models can be applied to the data set for prediction.
(b) We will analyse various supervised learning models for prediction and comparison,
including the newly incorporated methods. They will then be ranked according to
their performance and accuracy of prediction.
(c) Incorporation of social media sentiment as a new factor that influences stock
trends.
(d) We will analyse the effect of the coronavirus pandemic and the subsequent lockdown
on the Nifty index and will make stock price predictions.
(e) Draw conclusions based on the trends followed by Nestle India Ltd.
(f) We will also try to assess the utility and future scope of our research in the
practical world.
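One simple way to begin the analysis planned in task (d) is to compare daily returns and their volatility in two windows, for example before and after the lockdown. The sketch below is only a starting point under stated assumptions: it assumes a series of daily closing values is available (the report lists High, Low and Turnover but does not confirm a closing-price column), the function names are hypothetical, and the two short series are placeholder numbers, not Nifty data.

```python
from statistics import mean, pstdev
from typing import Dict, List


def daily_returns(closes: List[float]) -> List[float]:
    """Simple day-over-day percentage change, expressed as a fraction."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]


def summarize(closes: List[float]) -> Dict[str, float]:
    """Mean daily return and volatility (standard deviation of daily returns)."""
    r = daily_returns(closes)
    return {"mean_return": mean(r), "volatility": pstdev(r)}


# Placeholder closing values for two windows (illustrative only).
before = [100.0, 101.0, 100.0, 101.0, 100.0]
after = [100.0, 90.0, 99.0, 85.0, 95.0]

before_stats = summarize(before)
after_stats = summarize(after)
```

Comparing `before_stats` with `after_stats` gives a first indication of whether the second window was more volatile; a formal statistical test would be needed before drawing conclusions.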
CHAPTER 5
GANTT CHART
CHAPTER 6
REFERENCES