Data Analysis (Machine Learning and Neural Networks): Stock Prediction Based On Combination of Technical and Sentimental

STOCK PREDICTION BASED ON COMBINATION OF TECHNICAL AND SENTIMENTAL DATA ANALYSIS
(Machine Learning and Neural Networks)

Domain      MACHINE LEARNING AND NEURAL NETWORKS
Algorithms  SENTIMENTAL DATA ANALYSIS
Framework   Java
Platform    Hybrid Windows Application

Abstract

The Efficient Market Hypothesis is a popular theory about stock prediction. Given its failure,
much research has been carried out in the area of stock prediction. This project takes
non-quantifiable data, such as financial news articles about a company, and predicts the
company's future stock trend through news sentiment classification. Assuming that news articles
have an impact on the stock market, it is an attempt to study the relationship between news and
stock trends. To show this, we aim to create different classification models that label the
polarity of tweets and news articles as positive or negative.
Experiments are conducted to evaluate various aspects of the proposed model, and
encouraging results are obtained in all of them.

Background

A stockholder owns part of a corporation, and by buying or selling shares the degree of
ownership can be increased or reduced. If the decision-making body chooses to distribute a
share of the corporation’s profits rather than retain them, the owner of a share is paid a
portion of the profit, called a “dividend”. A stock has a fixed value when it is issued;
afterwards, it may be traded at any price. For publicly traded corporations, stocks are traded
on the stock market, where prices are determined by the equilibrium between supply and
demand. A corporation [1] that misses its revenue estimates can experience major drops in
stock price due to the anxiety of stockholders.

The stock market is one of the most volatile, uncertain and risky ways for any individual
to invest money. The more risk a strategy carries, the more return it can bring. The stock
market is where corporate stocks are traded according to clear rules. In the early stages, only
wealthy people or companies traded in stock, but with the advent of technology, many individual
investors also became able to invest in stock markets. In current markets, the participants
range from individuals to businesses.

A share in the stock market may be held as either a short-term or a long-term investment.
Short-term investments are normally based on speculation. Long-term investments are
commonly based on the corporation’s overall performance, its prospects and its potential to
grow its income and revenue.

Existing Systems and their Drawbacks

The existing method predicts stocks not only with machine learning techniques that use
past data to predict future data, but also with sentiment analysis of data obtained from
Twitter feeds.
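
As a minimal sketch of what using past data to predict future data can look like, the following Java snippet turns a series of closing prices into (past window, next price) training samples that a regression model or neural network could consume. The class name PriceWindows, the window size and the prices are illustrative assumptions, not details taken from the existing method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: turns a closing-price series into supervised samples. */
final class PriceWindows {

    /** One training sample: the past closing prices and the closing price that followed. */
    static final class Sample {
        final double[] inputs;
        final double target;

        Sample(double[] inputs, double target) {
            this.inputs = inputs;
            this.target = target;
        }
    }

    /** Slides a window of the given size over the series, one step at a time. */
    static List<Sample> build(double[] closes, int window) {
        if (window < 1 || closes.length <= window) {
            throw new IllegalArgumentException("need more prices than the window size");
        }
        List<Sample> samples = new ArrayList<>();
        for (int start = 0; start + window < closes.length; start++) {
            double[] inputs = new double[window];
            System.arraycopy(closes, start, inputs, 0, window);
            samples.add(new Sample(inputs, closes[start + window]));
        }
        return samples;
    }

    public static void main(String[] args) {
        double[] closes = {10.0, 11.0, 12.0, 13.0, 14.0}; // made-up prices
        for (Sample s : build(closes, 3)) {
            System.out.println(Arrays.toString(s.inputs) + " -> " + s.target);
        }
    }
}
```

With five prices and a window of three, the sketch yields two samples. A model trained only on such samples sees nothing but past prices.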

The existing method combines sentiment analysis of Twitter data with machine learning
techniques to predict the next stock values. Its authors tried three combinations, using three
different machine learning techniques, to compare and analyze the results. However, relying on
past data alone to predict future data is not always accurate, because it does not take into
consideration any data that is not cyclical. For instance, the stock price of a soft drink
company may rise every summer, but other events, such as the recent arrival of a new rival
brand, are not detected by this kind of analysis.

Hence, fundamental analysis in the form of sentiment analysis is very important in this
case, and the existing method performs it using Twitter. However, tweets are not always as
to the point as news articles. The method performs sentiment analysis on data obtained from
Twitter and labels each tweet as either bullish or bearish in nature.

Along with sentiment analysis, the machine learning techniques used are support vector
machines, a linear regression model and a ridge regression model. The authors compared the
results of the three machine learning models alone with the results of each model combined
with sentiment analysis. The results are compared using mean absolute error (MAE) and root
mean squared error (RMSE), as depicted in Figure 2.1 and Figure 2.2.
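
For reference, mean absolute error and root mean squared error can be computed as in the Java sketch below. The class and method names are illustrative assumptions, and the prices in the usage note are made up; this is not the evaluation code of the existing method.

```java
/** Hypothetical helper with the two error measures used to compare models. */
final class ErrorMetrics {

    /** Mean absolute error: the average of |actual - predicted|. */
    static double meanAbsoluteError(double[] actual, double[] predicted) {
        checkSameLength(actual, predicted);
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - predicted[i]);
        }
        return sum / actual.length;
    }

    /** Root mean squared error: the square root of the average squared difference. */
    static double rootMeanSquaredError(double[] actual, double[] predicted) {
        checkSameLength(actual, predicted);
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - predicted[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum / actual.length);
    }

    private static void checkSameLength(double[] actual, double[] predicted) {
        if (actual.length == 0 || actual.length != predicted.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
    }

    public static void main(String[] args) {
        double[] actual = {10.0, 20.0};    // made-up closing prices
        double[] predicted = {13.0, 16.0}; // made-up predictions
        System.out.println("MAE  = " + meanAbsoluteError(actual, predicted));
        System.out.println("RMSE = " + rootMeanSquaredError(actual, predicted));
    }
}
```

For the made-up values in main, the absolute errors are 3 and 4, so the MAE is 3.5 and the RMSE is the square root of 12.5. RMSE penalizes large individual errors more heavily than MAE, which is one reason to report both.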

From the comparisons of both mean absolute error and root mean squared error, we see
that the hybrid methods that use Twitter analysis outperform the methods that use machine
learning only. Thus, Twitter analysis plays a major role in stock prediction when used
alongside machine learning techniques. The existing method classifies the Twitter analysis
results only as positive or negative.

Drawbacks of Existing Method


The existing method combines machine learning methods with Twitter analysis, but
Twitter data is often unstructured text written by individuals, unlike news articles. For
example, “Apple merchandise transportation hit badly by earthquake in Japan” is a news
headline that clearly expresses a sentiment and can be analyzed easily through words such as
“hit badly”. On Twitter, by contrast, news of this kind is not posted often, and even when a
person does post it, there is no guarantee that it uses the same sort of wording.

Another issue is that the sentiment analysis reports its result only in terms of a
bullishness index and a bearishness index.

Proposed System

To address the two issues above, the proposed model predicts stocks from three sources of
input data: Twitter data, past stock values and news articles. The past data is used for
technical analysis, which predicts the next stock value, i.e. the closing price. The recent
Twitter data and news articles are analyzed for sentiment. Rather than looking only at positive
and negative sentiment, i.e. bullish and bearish values, we classify the data into seven
dimensions instead of two, which gives deeper insight into the kind of situation the
corporation is currently facing. The seven dimensions are as follows:

• Positive
• Negative
• Uncertain
• Superfluous
• Interesting
• Litigious
• Constraining

The above seven dimensions provide further insight. Combined with the technical analysis
performed using artificial neural networks, they give a better method of predicting stocks in
two particular ways. First, the method provides a detailed description rather than just a buy
or do-not-buy signal, with information such as whether the corporation is in a litigious
situation. Second, it improves the sentiment analysis by adding news articles, whose headlines
are more accurate than tweets because they are more structured as textual data.
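
One simple way to score a headline or tweet along the seven dimensions is to count matches against a word list per dimension. The Java sketch below assumes Java 9 or later. The class name HeadlineDimensions and the tiny word lists are hypothetical placeholders for illustration, not the lexicon or classifier used in this project.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical word-list scorer for the seven sentiment dimensions. */
final class HeadlineDimensions {

    enum Dimension {
        POSITIVE, NEGATIVE, UNCERTAIN, SUPERFLUOUS, INTERESTING, LITIGIOUS, CONSTRAINING
    }

    // Placeholder word lists (assumptions); a real system would load a full lexicon.
    private static final Map<Dimension, Set<String>> LEXICON = new EnumMap<>(Dimension.class);
    static {
        LEXICON.put(Dimension.POSITIVE, Set.of("gain", "profit", "strong"));
        LEXICON.put(Dimension.NEGATIVE, Set.of("hit", "badly", "loss"));
        LEXICON.put(Dimension.UNCERTAIN, Set.of("may", "uncertain", "risk"));
        LEXICON.put(Dimension.SUPERFLUOUS, Set.of("furthermore", "moreover"));
        LEXICON.put(Dimension.INTERESTING, Set.of("remarkable", "surprising"));
        LEXICON.put(Dimension.LITIGIOUS, Set.of("lawsuit", "court", "sued"));
        LEXICON.put(Dimension.CONSTRAINING, Set.of("required", "restricted", "obligation"));
    }

    /** Counts, for each dimension, how many words of the text appear in its word list. */
    static Map<Dimension, Integer> score(String text) {
        Map<Dimension, Integer> counts = new EnumMap<>(Dimension.class);
        for (Dimension d : Dimension.values()) {
            counts.put(d, 0);
        }
        // Lower-case the text and split on anything that is not a letter.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (token.isEmpty()) {
                continue;
            }
            for (Map.Entry<Dimension, Set<String>> entry : LEXICON.entrySet()) {
                if (entry.getValue().contains(token)) {
                    counts.merge(entry.getKey(), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String headline = "Apple merchandise transportation hit badly by earthquake in Japan";
        System.out.println(score(headline));
    }
}
```

With these placeholder lists, the example headline scores 2 on the negative dimension (for "hit" and "badly") and 0 on the others. The resulting seven counts could then be fed to a downstream model together with the technical-analysis prediction.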

System Architecture

Hardware Requirements

Component              Configuration
CPU                    A minimum of 3 GHz (smooth operation)
Hard Disk              A minimum of 80 GB
Internet Connectivity  A minimum of 512 kbps
RAM                    4 GB (compiling and running)

Table. Hardware Requirements

Software Requirements

Software / Technology  Usage
Eclipse                IDE
Neuroph                Artificial Neural Networks
JavaFX                 Front End
Java                   Back End (Entire Code)
HTML, XML              Internet Connectivity
PHP                    Request and Response (Web Pages)
JSON                   Formats of requests and responses

Table. Software Requirements
