Stock Market Prediction Using Sentiment Analysis: Prof. Artika Singh
MBA (Tech.)
COMPUTER ENGINEERING
At
October, 2021
DECLARATION
Anubhav Sharma(N264)
Pranit Malviya(N254)
Shubham Kumar(N239)
Place: Mumbai
CERTIFICATE
This is to certify that the project entitled “Stock Market Prediction with Sentiment Analysis”
is the bonafide work carried out by Anubhav Sharma, Shubham Kumar, Pranit Malviya,
Aatmik Sharma, Yagyansh Pareek of MBA (Tech.) (Computer Engineering), MPSTME
(NMIMS), Mumbai, during the VII th semester of the academic year 2021, in partial
fulfillment of the requirements for the award of the Degree of Bachelors of Engineering, as
per the norms prescribed by NMIMS. The project work has been assessed and found to be
satisfactory.
______________________
Artika Singh
Internal Mentor
_______________________ ________________________
Examiner 1 Examiner 2
__________________
Director
Table of contents
1. Declaration
2. Certificate
Abbreviations
Abstract
1. INTRODUCTION
8. LOGBOOK
ABSTRACT
Predicting stock market prices has been a topic of interest among both analysts and
researchers for a long time. Stock prices are hard to predict because of their highly volatile
nature which depends on diverse political and economic factors, change of leadership, investor
sentiment, and many other factors. Predicting stock prices based on either historical data or
textual information alone has proven to be insufficient.
Existing studies in sentiment analysis have found that there is a strong correlation between the
movement of stock prices and the publication of news articles. Several sentiment analysis
studies have been attempted at various levels using algorithms such as support vector machines,
naive Bayes regression, and deep learning. The accuracy of deep learning algorithms depends
upon the amount of training data provided. However, the amount of textual data collected and
analyzed during the past studies has been insufficient and thus has resulted in predictions with
low accuracy.
We improve the accuracy of stock price predictions by gathering a large amount of time series
data and analyzing it in relation to related news articles, using deep learning models. We will
use a dataset for S&P 500 companies for five years, along with more than 265,000 financial
news articles related to these companies. Given the large size of the dataset, we use cloud
computing as an invaluable resource for training prediction models and performing inference
for a given stock in real time.
Chapter 1
Introduction
The fluctuation of the stock market is violent and there are many complicated financial
indicators.
● We will analyse the sentiment of news from Twitter and NDTV, find the correlation
between news and price movement, and use it to predict the price movement of a stock in the
coming days.
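The correlation step mentioned above can be sketched as follows. This is a minimal illustration, assuming that news sentiment has already been reduced to one score per day and that prices have been converted to daily returns. The function names, the one-day lag, and the sample values are hypothetical and are not taken from the project data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

def lagged_correlation(sentiment, returns, lag=1):
    """Correlate the sentiment on day t with the return on day t + lag."""
    if len(sentiment) != len(returns):
        raise ValueError("sentiment and returns must cover the same days")
    if lag < 0 or lag >= len(sentiment) - 1:
        raise ValueError("lag must leave at least two overlapping days")
    return pearson(sentiment[:len(sentiment) - lag], returns[lag:])

# Illustrative values only (assumption): each return equals 0.02 times the
# previous day's sentiment, so the lag-1 correlation is essentially perfect.
daily_sentiment = [0.2, -0.1, 0.4, -0.3, 0.1, 0.5]
daily_returns = [0.000, 0.004, -0.002, 0.008, -0.006, 0.002]
print(lagged_correlation(daily_sentiment, daily_returns, lag=1))
```

A correlation close to zero at every lag would suggest that the sentiment series carries little linear information about the returns, in which case it would add little as a predictive feature.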
Google Colab notebook: Colab notebooks are notebooks that run in the cloud and are highly
integrated with Google Drive, making them easy to set up, access, and share.
Visual Studio Code: Visual Studio Code is a source-code editor made by Microsoft for
Windows, Linux and macOS. Features include support for debugging, syntax highlighting,
intelligent code completion, snippets, code refactoring, and embedded Git.
Cloud GPU: A cloud graphics processing unit (GPU) provides hardware acceleration for an
application without requiring that a GPU be deployed on the user's local device. A common use
case for cloud GPUs is visualization workloads, where powerful server or desktop applications
employ graphically demanding content.
Chapter 2
Research Survey
To understand people's opinions about the stock market and their levels of financial
literacy, a survey was circulated for research. The survey received 73 responses; the
questions and the inferences drawn from them are given below.
Fig. 1 and Fig. 2: Gender and age of the people who filled in the form
Literature Review
For the literature review, approximately 20 papers were reviewed in the domain of stock
prediction with sentiment analysis. The table below gives a brief overview of each of
these papers.
| Paper | Authors | Summary | Techniques | Remarks |
|---|---|---|---|---|
| Predicting Stock Market Behavior using Data Mining Technique and News Sentiment Analysis | Ayman E. Khedr, S.E. Salama, Nagwa Yaseen | They used sentiment analysis of news and combined it with data mining to get an accuracy of 89.80% while using Naive Bayes and a KNN classifier. | Sentiment Analysis, Naive Bayes, KNN classifier | The model is divided into two stages, which increase the prediction accuracy. |
| Stock Market Prediction based on Social Sentiments using Machine Learning | Tejas Mankar, Tushar Hotchandani, Manish Madhwani, Akshay Chidrawar | It predicted the stock price movement in line with the sentiments of the tweets. | Support Vector Machine, Natural Language Toolkit (NLTK) library | Cloud services will enable us to collect large amounts of data. |
| Stock Price Prediction Using News Sentiment Analysis | Saloni Mohan, Sahitya Mullapudi, Sudheer Sammeta, Parag Vijayvergia and David C. Anastasiu | It improves the accuracy of stock price predictions by gathering a large amount of time series data and analyzing it in relation to related news articles, using deep learning models. The dataset gathered includes daily stock prices for S&P500 companies for five years, along with more than 265,000 financial news articles related to these companies. | MAPE is useful while evaluating prediction models where only the magnitude of the difference between predicted values and observed values is important to consider, while the direction of the difference can be ignored. Evaluation using MAPE overcomes the large deviation bias present in Root Mean Square Error (RMSE) and shows robustness for datasets containing long tails. | This model predicted stock prices using time series models, neural networks, and a combination of neural networks and financial news articles. The results suggest that there is a strong relationship between stock prices and financial news articles. |
| Forecasting directional movements of stock prices for intraday | Pushpendu Ghosh, Ariel Neufeld, Jajati Keshari Sahoo | In this paper, they used an LSTM model in a multi-feature setting to get daily returns of 0.64%, and also used a random forest method. | Random forest, LSTM (CuDNNLSTM) | Both of their models outperformed the market in intraday |
| Visualization and Forecasting of stocks using LSTM technique. | Shivang B, Tirtha Roy | Predicts stock prices using the LSTM neural network, and uses the Plotly Dash framework for building dashboards. | Long Short-Term Memory | Helps in accurate prediction of the stock market. |
| Study on the prediction of stock price based on the associated network model of LSTM | G. Ding, L. Qin | They used an LSTM network model and the LSTM deep recurrent neural network model to get an accuracy of over 95%. | LSTM-based deep recurrent neural network | The model can predict multiple stock prices simultaneously. |
| ML Algorithms in stock market prediction. | Jayesh P, Rejo Mathew | Comparison of different machine learning algorithms, namely SVR, LSTM and random forest techniques. | SVR, Random forest, LSTM | Different methods give different benefits according to the requirements of the user. |
| Stock Market Behaviour Prediction using Stacked LSTM Networks | Pius Adewale Owolawi, Maredi Mphahlele | Examines the stacked LSTM model's ability to closely predict stock market behavior based on the historic data it was trained on. | ARIMA, Deep Recurrent Neural Network | The model was able to predict stock market behavior with some accuracy. |
| Stock Price prediction using LSTM and SVR | Gourav Bathla | LSTM is applied on different stock indexes and compared with linear regression and ARIMA. | Deep Learning, LSTM, RNN, SVR | Deep learning is applied to improve prediction accuracy. |
| Stock market price trend prediction using comprehensive deep learning system | Jingyi Shen & M. Omair Shafiq | The LSTM model achieves a binary accuracy of 93.25%. The RAF algorithm achieved a relatively high true-positive rate. | MLP, SVM, LSTM, Deep Learning | They applied feature expansion approaches with RFE. |
| Sentiment Analysis for Indian stock market using Sensex Nifty. | Aditya Bharadwaj, Yogendra Narayan, Vanshika Singh, Maitree | There are various ups and downs in the Indian stock market. In order to invest money in the stock market for purchasing shares, it is essential for investors to predict the stock market condition. In the Indian scenario, Sensex and Nifty are two major indicators for prediction of stock market conditions. | Opinion Mining, Deep learning, LSTM, SVR | This model was able to gather information about markets and improved efficiency and accuracy. |
| Stock Market Analysis: A Review and Taxonomy of Prediction Techniques | Dev Shah, Hasurana Isah | This paper explains different techniques that can be used for stock market prediction and analysis, and how they help in reading the trends. | LSTM, ARIMA, Random forest, RNN, SVM | Different methods give different benefits according to the requirements of the user. |
| Stock Market Prediction Analysis by Incorporating Social and News Opinion and Sentiment | Ziping Lin, Anuj Thakkar, Zaphia Li | This paper aims to predict stock price by analyzing the relationship between the stock price and news sentiments. A novel enhanced learning-based method for stock price prediction is proposed that considers the effect of news sentiments. | LSTM, RNN, Support vector, Deep learning | This paper indicated how these sentiments affect the market and lead to downturns or uptrends in the stock market. |
| Literature on Stock Returns: A Content Analysis | YV Reddy, Pranav Narayan | The objective of any investment is to earn a return. Return on the amount invested in stocks includes dividend and capital appreciation. These returns are influenced by both systematic and unsystematic risks. Systematic risk includes the macroeconomic variables. | Deep learning, LSTM, RNN, Neural network | The different key issues or factors were analyzed and presented in counts and percentages. The study helps the stock exchanges and the regulators. |
| Prediction Models for Indian Stock Market | Aparna Nayak, MM Manhora, Radhika M Pai | There are two common methods to predict stock market prices. One is chartist or technical theories and the second is fundamental or intrinsic value analysis. The proposed method is built on the principle of technical theories. | Data collection, Data analysis, Data processing | There are many predictive models which tell whether the market trend is up or down, but they fail to give accurate results. An attempt has been made to build an efficient predictive model of the stock market where the trend for the next day is predicted. |
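The review above notes that MAPE is less affected by large deviations than RMSE. The following is a small worked sketch of the two metrics, using their standard definitions. The sample prices are made up for illustration and do not come from any of the reviewed papers.

```python
from math import sqrt

def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Undefined if an actual value is 0."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error, in the same units as the data."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Illustrative prices (assumption): three small errors and one large error
# on a high-priced observation.
actual = [100.0, 102.0, 101.0, 250.0]
predicted = [101.0, 101.0, 102.0, 240.0]

print(f"MAPE: {mape(actual, predicted):.2f}%")
print(f"RMSE: {rmse(actual, predicted):.2f}")
```

In this example the single error of 10 on the 250 observation accounts for almost all of the RMSE, because errors are squared before averaging, while in MAPE it contributes only a 4% term because each error is scaled by the actual value.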
Chapter 3
Analysis & Design
From the research and the user survey, we found that the stock market is a lucrative field but
can be very confusing to newcomers. We will model the news dataset to analyse the sentiment of
news and how it is correlated with the market, in order to give users recommendations.
Chapter 4
Proposed Model
For the sake of this project, we are considering two base papers:
1. ARIMA
An ARIMA model is a class of statistical models for analyzing and forecasting time series
data. ARIMA stands for AutoRegressive Integrated Moving Average. It is a
generalization of the simpler AutoRegressive Moving Average and adds the notion of
integration.
This acronym is descriptive, capturing the key aspects of the model itself. Briefly, they are:
● AR (Autoregression): a model that uses the dependent relationship between an
observation and some number of lagged observations.
● I (Integrated): the use of differencing of raw observations (subtracting an
observation from the observation at the previous time step) in order to make the time
series stationary.
● MA (Moving Average): a model that uses the dependency between an
observation and a residual error from a moving average model applied to lagged
observations.
Each of these components is explicitly specified in the model as a parameter. The parameters
of an ARIMA(p, d, q) model are defined as follows:
● p: The number of lag observations included in the model, also called the lag
order.
● d: The number of times that the raw observations are differenced, also called the
degree of differencing.
● q: The size of the moving average window, also called the order of moving
average.
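To make the roles of d and p concrete, the following is a minimal sketch of the differencing and autoregression steps only. It corresponds to an ARIMA(1, 1, 0) model, so the moving average term is omitted (q = 0), and it uses a plain least-squares fit rather than the estimation method of any particular library. The function names and the price series are hypothetical.

```python
def difference(series, d=1):
    """Apply first-order differencing d times (the 'I' step, parameter d)."""
    out = list(series)
    for _ in range(d):
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out

def fit_ar1(series):
    """Least-squares fit of x_t = c + phi * x_(t-1), an AR model with p = 1."""
    if len(series) < 3:
        raise ValueError("need at least three points to fit AR(1)")
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("lagged values are constant; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi

def forecast_next_price(prices):
    """One-step forecast: difference once, fit AR(1), then undo the differencing."""
    diffs = difference(prices, d=1)
    c, phi = fit_ar1(diffs)
    next_diff = c + phi * diffs[-1]
    return prices[-1] + next_diff

# Illustrative prices (assumption): the daily changes halve each day.
prices = [100.0, 102.0, 103.0, 103.5, 103.75]
print(difference(prices))           # daily changes
print(forecast_next_price(prices))  # forecast for the next day
```

In practice a library implementation would also estimate the MA term and select p, d and q from the data; this sketch only shows how differencing and lagged observations enter the forecast.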
2. SARIMAX
The ARIMA model considers only trend information in the data and ignores seasonal variation.
SARIMAX is a variation of the ARIMA model that also considers seasonal variation in the
data. Although our data do not show strong seasonality, we still evaluate this model.
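One simple way to check how much seasonality a series has is to look at its autocorrelation at the candidate seasonal lag. The sketch below uses the standard sample autocorrelation. The period of 5 (one trading week) and the sample series are assumptions made for illustration, not values measured on our data.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom

# Illustrative series (assumption): a pattern that repeats every 5 observations.
seasonal_period = 5
series = [1.0, 2.0, 3.0, 2.0, 1.0] * 4

print(autocorrelation(series, seasonal_period))  # high at the seasonal lag
print(autocorrelation(series, 2))                # lower at a non-seasonal lag
```

This check is best applied to returns or differenced prices rather than raw prices, since a trend inflates the autocorrelation at every lag. A value near zero at the seasonal lag would be consistent with the weak seasonality noted above.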
3. Facebook Prophet
predictions with good accuracy using simple intuitive parameters and has support for
4. LSTM Model
Finance is highly nonlinear and sometimes stock price data can even seem completely
random. Traditional time series methods such as ARIMA, SARIMAX models are effective
only when the series is stationary, which is a restricting assumption that requires the series to
be preprocessed by taking log returns (or other transforms). However, the main issue arises in
This is combated by using Neural Networks (sequential models like LSTM, GRU, etc.),
which do not require any stationarity to be used. Furthermore, neural networks by nature are
effective in finding the relationships between data and using it to predict (or classify) new
data.
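The log-return preprocessing mentioned above, and the way a series is usually cut into fixed-length windows for a sequential model such as an LSTM, can be sketched as follows. The helper names and the window length are hypothetical, and no neural network is trained here; the sketch only prepares the input and target pairs.

```python
from math import exp, log

def log_returns(prices):
    """r_t = ln(P_t / P_(t-1)); requires strictly positive prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [log(curr / prev) for prev, curr in zip(prices, prices[1:])]

def prices_from_log_returns(start_price, returns):
    """Invert the transform: rebuild the price path from a starting price."""
    prices = [start_price]
    for r in returns:
        prices.append(prices[-1] * exp(r))
    return prices

def make_windows(series, window):
    """Split a series into (input window, next value) pairs for a sequence model."""
    if window < 1 or window >= len(series):
        raise ValueError("window must be between 1 and len(series) - 1")
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]

# Illustrative prices (assumption).
prices = [100.0, 101.0, 99.5, 102.0, 103.0, 102.5]
returns = log_returns(prices)
pairs = make_windows(returns, window=3)
print(len(returns), len(pairs))
print(pairs[0])
```

A forecast made on log returns can be mapped back to a price with `prices_from_log_returns`, using the last observed price as the starting point.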
In this project, we aim to obtain the desired result, i.e., a prediction of the stock price, by
using SARIMAX, which is a variation of ARIMA. Our work aims to predict the future stock
market price of a company. The risk profile of that particular stock will also be analysed and
visualized. The model will aim to capture the news sentiment that affects the stock market. We
will take Twitter data as our main dataset. If the news sentiment is positive, our model will
indicate a higher chance that the stock price will go up, and if the news sentiment is
negative, that the stock price may go down. This project is an attempt to build a model that
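The rule described in this section, where positive sentiment suggests an upward move and negative sentiment a downward one, can be sketched as follows. This is a deliberately simple word-count approach, assuming a tiny hand-written lexicon and a fixed threshold. The word lists, the threshold and the sample tweets are all hypothetical, and a real system would use a trained sentiment model instead.

```python
import re

# Hypothetical mini-lexicon (assumption), for illustration only.
POSITIVE = {"gain", "gains", "profit", "surge", "growth", "beat", "upgrade"}
NEGATIVE = {"loss", "losses", "fall", "drop", "fraud", "miss", "downgrade"}

def sentiment_score(text):
    """Score in [-1, 1]: (positive hits - negative hits) / total hits, 0.0 if none."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

def direction_signal(texts, threshold=0.1):
    """Average the scores of one day's texts and map them to 'up', 'down' or 'neutral'."""
    if not texts:
        return "neutral"
    avg = sum(sentiment_score(t) for t in texts) / len(texts)
    if avg > threshold:
        return "up"
    if avg < -threshold:
        return "down"
    return "neutral"

# Made-up example tweets (assumption).
tweets = [
    "Quarterly profit beat estimates, shares surge",
    "Analysts issue upgrade after strong growth",
    "Minor drop in exports",
]
print(direction_signal(tweets))
```

The output is only a directional hint. In the proposed model the sentiment score would be one input alongside the price history, not a prediction on its own.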
Project timeline
In this project, we will analyse stock market news data, process time-series data, and
build deep learning models with a production perspective. Stock price time series are
considered among the most challenging time series, and we will try to predict Nifty Index data
with high accuracy. We will also optimize the model in the post-training phase to make it ready
for deployment. Our work will also surface the number of likes, comments and shares, with the
aim of understanding the significance of social media interactions and what they tell us
about the consumers behind the screens.
Logbook