Data Analysis (Machine Learning and Neural Networks): Stock Prediction Based on Combination of Technical and Sentimental
Abstract
The Efficient Market Hypothesis is a popular theory about stock prediction. With its failure,
much research has been carried out in the area of stock prediction. This project takes
non-quantifiable data, such as financial news articles about a company, and predicts the
company's future stock trend through news sentiment classification. Assuming that news articles
have an impact on the stock market, it is an attempt to study the relationship between news and
stock trend. To show this, we aim to create different classification models that label the
polarity of tweets and articles as positive or negative.
Experiments were conducted to evaluate various aspects of the proposed model, and
encouraging results were obtained in all of the experiments.
Background
A stockholder owns a part of a corporation, and by buying or selling shares the degree of
ownership can be increased or reduced. If the decision-making body chooses to distribute a
share of the corporation’s profits rather than retaining them, the owner of the share is paid a
portion of the profit, called a “dividend”. A stock has a fixed value when issued; after that, it
may be traded at any price. For publicly traded corporations, stocks are traded in the stock
market, where prices are determined by the equilibrium between supply and demand. A
corporation [1] that falls short of revenue estimates can experience major drops in stock value
due to the anxiety of stockholders.
The stock market is one of the most volatile, uncertain and risky ways for an individual to
invest money. The more risk a strategy carries, the more return it can bring. The stock market is
where corporate stocks are traded according to clear rules. In the early stages, only wealthy
people or companies traded in stocks, but with the advent of technology many individual
investors have also become able to invest in stock markets. In current markets, the participants
range from individuals to businesses.
An investment in the stock market may be either short term or long term. Short-term
investments are normally based on speculation. Long-term investments are commonly based on
the corporation’s overall performance, prospects and potential to grow its income and revenue.
The existing method predicts stocks not only with machine learning techniques that use
past data to predict future data, but also with sentiment analysis of data obtained from Twitter
feeds.
The existing method combines sentiment analysis of Twitter data with machine learning
techniques to predict the next stock values. Its authors evaluated three combinations, using
three different machine learning techniques, to compare and analyze the results. Relying on
past data alone to predict future data is not always accurate, because it does not take into
consideration any data that is not cyclical. For instance, the stock price of a soft drink company
would rise every summer, but other events, such as the recent arrival of a new rival brand, are
not detected by this kind of analysis.
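The limitation described above can be illustrated with a minimal sketch of how training rows might be built for such a model. The function name `make_training_rows`, the window size and the sentiment scale are assumptions made for illustration; they are not taken from the existing method. With past data alone, each row contains only earlier closing prices; the optional sentiment score is one way to give the model a non-cyclical signal.

```python
from typing import List, Optional, Sequence, Tuple

Row = Tuple[List[float], float]


def make_training_rows(
    closes: Sequence[float],
    window: int,
    sentiment: Optional[Sequence[float]] = None,
) -> List[Row]:
    """Build (features, target) rows from a closing-price series.

    Features are the `window` closes before day t. If a per-day sentiment
    series is supplied (hypothetical scale, e.g. -1 bearish to +1 bullish),
    the score of day t-1 is appended as an extra feature.
    The target is the close on day t.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if sentiment is not None and len(sentiment) != len(closes):
        raise ValueError("sentiment must have one score per closing price")

    rows: List[Row] = []
    for t in range(window, len(closes)):
        features = list(closes[t - window:t])
        if sentiment is not None:
            features.append(sentiment[t - 1])
        rows.append((features, closes[t]))
    return rows


if __name__ == "__main__":
    closes = [10.0, 11.0, 12.0, 13.0]
    scores = [0.1, -0.2, 0.5, 0.0]
    print(make_training_rows(closes, window=2))
    print(make_training_rows(closes, window=2, sentiment=scores))
```

Rows of this shape could then be passed to any regression model; the choice of model is independent of how the rows are built.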
Hence, fundamental analysis in the form of sentiment analysis is very important in this
case, and the existing method performs it using Twitter. However, tweets are not always to the
point when compared with the way news articles express the same information. The method
performs sentiment analysis on data obtained from Twitter and classifies each tweet as either
bullish or bearish in nature.
Along with sentiment analysis, the machine learning techniques used are support vector
machines, a linear regression model and a ridge regression model. The results of the three
machine learning models on their own are compared with the results of the same models
combined with sentiment analysis. The comparison uses mean absolute error (MAE) and root
mean squared error (RMSE), which are depicted in Figure 2.1 and Figure 2.2.
In both the mean absolute error and the root mean squared error comparisons, the hybrid
methods that use Twitter analysis outperform the methods that use machine learning alone.
Thus, it can be observed that Twitter analysis plays a major role in stock prediction when used
alongside machine learning techniques. The existing method reports the Twitter analysis result
only as either positive or negative.
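For reference, the two error measures used in this comparison can be computed as in the sketch below. The function names and the sample prices are illustrative assumptions and do not reproduce the figures of the existing method. Both measures are in the same unit as the stock price, and RMSE penalizes large individual errors more heavily than MAE.

```python
import math
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Square root of the average squared difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mean_sq)


if __name__ == "__main__":
    # Made-up closing prices, used only to exercise the functions.
    actual = [100.0, 102.0, 101.0]
    predicted = [101.0, 100.0, 101.0]
    print(mean_absolute_error(actual, predicted))
    print(root_mean_squared_error(actual, predicted))
```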
Drawbacks
Proposed System
Looking into the two issues above, the proposed model predicts stocks based on three sources of
data, namely Twitter data, past stock values and news articles. The past data is used to perform
the technical analysis that predicts the next stock value, i.e. the closing price. The Twitter data
and news articles, which are obtained from recent moments, are analyzed for sentiment. Rather
than looking only at positive and negative sentiment, i.e. bullish and bearish values, we classify
the data into seven dimensions instead of two, which provides deeper insight into the kind of
situation the corporation is currently facing. The seven dimensions are as follows:
Positive
Negative
Uncertain
Superfluous
Interesting
Litigious
Constraining
These seven dimensions, combined with technical analysis performed by artificial neural
networks, provide a better method to predict stocks in two particular ways. First, they give a
more detailed description than just buy or do not buy, by providing information such as whether
the corporation is in a litigious situation. Second, the addition of news article analysis gives a
better basis for sentiment-based prediction, because news article headings are more accurate as
structured textual data.
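The seven-dimension classification described above could be approximated with a word-list lookup, as in the following sketch. The word lists are tiny hypothetical placeholders and the function name `score_headline` is an assumption; the source does not specify the lexicon or the classifier actually used.

```python
import re
from collections import Counter
from typing import Dict

DIMENSIONS = (
    "positive", "negative", "uncertain", "superfluous",
    "interesting", "litigious", "constraining",
)

# Hypothetical toy lexicon, for illustration only. A real system would load
# a much larger, curated word list for each dimension.
LEXICON = {
    "positive": {"gain", "growth", "profit", "strong"},
    "negative": {"loss", "decline", "weak", "drop"},
    "uncertain": {"may", "might", "uncertain", "risk"},
    "superfluous": {"furthermore", "whilst"},
    "interesting": {"extraordinary", "surprising"},
    "litigious": {"lawsuit", "court", "plaintiff", "sued"},
    "constraining": {"required", "restricted", "obligation"},
}


def score_headline(text: str) -> Dict[str, int]:
    """Count how many words of a headline or tweet fall into each dimension."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts: Counter = Counter()
    for token in tokens:
        for dim in DIMENSIONS:
            if token in LEXICON[dim]:
                counts[dim] += 1
    # Return every dimension, including those with a zero count.
    return {dim: counts[dim] for dim in DIMENSIONS}


if __name__ == "__main__":
    print(score_headline("Company sued in court; profit may drop"))
```

The resulting seven counts could be fed to the prediction model alongside the technical-analysis output, so that a litigious or uncertain signal is visible separately from plain positive or negative sentiment.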
System Architecture
Hardware Requirements
Component              Configuration
CPU                    A minimum of 3 GHz (smooth operation)
Hard Disk              A minimum of 80 GB
Internet Connectivity  A minimum of 512 kbps
RAM                    4 GB (compiling and running)
Table. Hardware Requirements
Software Requirements