Algorithm ML Model
A PROJECT REPORT
Submitted by
of
BACHELOR OF ENGINEERING
in
APRIL 2024
BONAFIDE CERTIFICATE
SIGNATURE SIGNATURE
Certified that the above candidates were examined in the university project work viva
voce examination held on _____________ at Panimalar Institute of Technology,
Chennai–600 123.
ACKNOWLEDGEMENT
TABLE OF CONTENTS
CHAPTER TITLE PAGE
NO. NO.
ABSTRACT vii
LIST OF FIGURES viii
LIST OF ABBREVIATIONS ix
LIST OF SYMBOLS x
1 INTRODUCTION 1
1.1 Domain Overview 2
1.2 Machine Learning 2
1.3 Leveraging APIs for real-time data retrieval 4
2 LITERATURE SURVEY 5
3 SYSTEM ANALYSIS & SPECIFICATION 14
3.1 Existing system 15
3.1.1 Drawbacks of existing system 15
3.2 Proposed system 16
3.2.1 Advantages of Proposed system 17
3.3 System Configuration 17
3.3.1 Hardware Requirement 18
3.3.2 Software Requirement 18
4 SYSTEM DESIGN 19
4.1 System Architecture 20
4.1.1 Data Gathering 21
4.1.2 Pre-Processing of data 21
4.1.3 Feature Extraction 23
4.1.4 Assessment Model 23
7 TESTING 51
7.1 Testing Objectives 52
7.2 Categories of Software Testing 52
7.2.1 Black box testing 52
7.2.2 White box testing 53
7.3 Types of testing 54
7.3.1 Unit testing 54
7.3.2 Functional testing 55
7.3.3 Performance testing 55
7.3.4 Integration testing 55
7.3.5 Validation testing 55
7.3.6 System testing 56
7.3.7 Structure testing 56
7.3.8 Output testing 56
7.3.9 User acceptance testing 56
7.4 Test Case Scenario 57
8 CONCLUSION 60
9 REFERENCES 62
10 APPENDIX 65
APPENDIX A- Sample source code 66
APPENDIX B- Screenshots 98
ABSTRACT
Accurately predicting stock prices is crucial for analysts and investors, yet
remains challenging. Traditional methods rely on historical data, but advancements
in technology offer opportunities to enhance prediction models with real-time data.
This study proposes a state-of-the-art stock price prediction model leveraging
efficient gradient boosting techniques like XGBoost or LightGBM and real-time
data collection from sources such as Server or Hub. Unlike conventional
approaches, our model dynamically accesses live market data, including stock
prices, trading volumes, sentiment analysis, and macroeconomic indicators. By
incorporating API keys, we capture the latest market dynamics and investor
sentiments, ensuring forecasts reflect current market conditions. Our method
emphasizes optimal gradient boosting methods known for predictability, resilience,
and efficiency. Variations like XGBoost or LightGBM enable learning intricate
correlations and patterns from real-time data streams. Additionally, we integrate
sentiment analysis from social media feeds, news articles, and financial data to
gauge investor sentiment in real-time, enhancing understanding of market
behavior's impact on stock prices. Rigorous optimization, including feature
selection, cross-validation, and hyperparameter tuning, enhances model robustness,
reducing overfitting and improving generalization. The objective is to empower
investors with fast and accurate stock price projections, enabling informed
decisions in rapidly changing markets, while advancing financial forecasting
techniques through cutting-edge gradient boosting algorithms and real-time data
collection.
LIST OF FIGURES
LIST OF ABBREVIATIONS
S. No Abbreviation Expansion
1 IEEE The Institute of Electrical and Electronics Engineers
2 HTML Hyper Text Markup Language
3 HTTP Hyper Text Transfer Protocol
4 ML Machine Learning
5 AI Artificial Intelligence
6 API Application Programming Interface
7 bit Binary Digit
8 URL Uniform Resource Locator
9 GB Gigabyte
10 OS Operating System
11 RAM Random Access Memory
12 VS Visual Studio code
13 XAMPP Cross-platform, Apache, MySQL, PHP, Perl
14 UML Unified Modeling Language
15 DFD Data Flow Diagram
16 OOSE Object Oriented Software Engineering
17 OMT Object Modelling Technique
18 OMG Object Management Group
19 XG Boost Extreme Gradient Boosting
20 XSS Cross-Site Scripting
21 SQL Structured Query Language
22 DB Database
23 SUT Software Under Test
LIST OF SYMBOLS
11. Final state: Represents the object's final state.
12. Transition: Label the transition with the event that triggered it and the action that results from it.
13. Class: A set of objects that share a common structure and a common behaviour.
14. Association: Relationship between classes.
15. Generalization: Relationship between a more general class and a more specific class.
CHAPTER 1
INTRODUCTION
Machine learning aims to predict the future from past data. Machine learning
(ML) is a branch of artificial intelligence (AI) that gives computers the ability
to learn without being explicitly programmed. It focuses on the development of
computer programs that can adapt when exposed to new data; this chapter covers the
basics of machine learning and the implementation of a simple machine learning
algorithm using Python. Training and prediction rely on specialized algorithms:
the training data is fed to an algorithm, and the algorithm uses what it has
learned from that data to make predictions on new test data. Machine learning can
be roughly separated into three categories: supervised learning, unsupervised
learning, and reinforcement learning.
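The train-then-predict workflow described above can be sketched in a few lines of Python. The toy data and the 1-nearest-neighbour rule are illustrative assumptions, not part of this report's model:

```python
# Minimal supervised-learning sketch: a 1-nearest-neighbour classifier.
# The training points and labels below are hypothetical; they only
# illustrate feeding training data to an algorithm and then predicting
# on a new test point.

def predict_1nn(train, label_of, query):
    """Return the label of the training point closest to `query`."""
    nearest = min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p, query)))
    return label_of[nearest]

# Toy training set: (feature1, feature2) -> class label
training_points = [(1.0, 1.0), (1.2, 0.8), (8.0, 8.0), (7.5, 8.2)]
labels = {(1.0, 1.0): "down", (1.2, 0.8): "down",
          (8.0, 8.0): "up", (7.5, 8.2): "up"}

print(predict_1nn(training_points, labels, (7.9, 7.7)))  # -> up
```

The same shape (fit on labelled examples, query on unseen inputs) carries over to the far richer models used later in the report.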
Live Data
Data scientists use many different kinds of machine learning algorithms to discover
patterns in data that lead to actionable insights. At a high level, these different
algorithms can be classified into two groups based on the way they “learn” about
data to make predictions: supervised and unsupervised learning.
The difference between the two tasks is that the dependent attribute is
numerical for regression and categorical for classification. A classification model
attempts to draw some conclusion from observed values: given one or more inputs, a
classification model will try to predict the value of one or more outcomes. A
classification problem is one where the output variable is a category, such as “red”
or “blue”.
To harness the capabilities of an API for real-time data retrieval, several key steps
must be followed. Firstly, developers need to identify the API that corresponds to
the desired data source. This often involves exploring documentation provided by
the service or platform offering the API, which outlines its endpoints, parameters,
authentication methods, and usage policies.
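The retrieval step described above can be sketched as follows. The endpoint URL, the "apikey" parameter name, and the response fields are hypothetical placeholders, not any specific provider's API; the real names come from the provider's documentation:

```python
import json

# Sketch of real-time quote retrieval. The endpoint, the "apikey"
# query parameter, and the response fields are hypothetical placeholders;
# consult the actual provider's API documentation for the real ones.

QUOTE_ENDPOINT = "https://api.example.com/v1/quote"  # hypothetical

def build_request(symbol, api_key):
    """Assemble the URL and query parameters for a quote request."""
    return QUOTE_ENDPOINT, {"symbol": symbol, "apikey": api_key}

def parse_quote(payload):
    """Extract the fields the prediction model consumes from a JSON reply."""
    data = json.loads(payload)
    return {"symbol": data["symbol"],
            "price": float(data["price"]),
            "volume": int(data["volume"])}

# In production the payload would come from an HTTP GET on the endpoint;
# here a canned reply shows the shape of the parsed result.
reply = '{"symbol": "ABC", "price": "101.25", "volume": "182000"}'
print(parse_quote(reply))
```

Separating request construction from response parsing keeps the parsing logic testable without network access.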
CHAPTER 2
LITERATURE SURVEY
Guangyu Mu, Nan Gao, Yuhan Wang, and Li Dai note that accurate
prediction of stock prices can reduce investment risks and increase returns. Their paper
combines multi-source data affecting stock prices and applies sentiment analysis, a
swarm intelligence algorithm, and deep learning to build the MS-SSA-LSTM model.
Firstly, we crawl the East Money forum posts information to establish the unique
sentiment dictionary and calculate the sentiment index. Then, the Sparrow Search
Algorithm (SSA) optimizes the Long and Short-Term Memory network (LSTM)
hyperparameters. Finally, the sentiment index and fundamental trading data are
integrated, and LSTM is used to forecast stock prices in the future. Experiments
demonstrate that the MS-SSA-LSTM model outperforms the others and has high
universal applicability. Compared with standard LSTM, the R2 of MS-SSA-LSTM is
improved by 10.74% on average. We found that: 1) Adding the sentiment index can
enhance the model’s predictive performance. 2) The LSTM’s hyperparameters are
optimized using SSA, which objectively explains the model parameter settings and
improves the prediction effect. 3) The high volatility of China’s financial market is
more suitable for short-term prediction. “A Stock Price Prediction Model Based on
Investor Sentiment and Optimized Deep Learning”.
Shuzhen Wang researched Bi-directional Long Short-Term Memory
(BiLSTM) and MTRAN-TCN, two models that can fully exploit the benefits of
three architectures: BiLSTM, transformer, and TCN. The transformer does a decent job of
capturing long-range dependencies, but it does a poor job of capturing sequence data.
Sequence dependencies are captured by TCN, which also helps the model become more
generalizable, whereas bidirectional information in sequences is captured by BiLSTM.
The efficacy of the approach was confirmed with the help of 14 Shanghai and Shenzhen
stocks and 5 index stocks, in addition to the improvement impact of the transformer and
the effectiveness of incorporating the BiLSTM model. This technique has the best fit on
every index stock when compared to other current methods in the literature, and it has
the best R2 in 85.7% of the stock dataset. R2 rises by 0.3% to 15.6%, while RMSE drops
by 24.3% to 93.5%. Furthermore, this technique does not suffer from timeliness
problems and has a reasonably consistent prediction performance over time. The
findings show that the BiLSTM-MTRAN-TCN approach outperforms other stock price
prediction methods in terms of accuracy and generalizability. “A Stock Price
Prediction Method Based on BiLSTM and Improved Transformer”.
Saud S. Alotaibi states that stock price forecasting attempts to assess the potential
movement of the financial exchange's stock value. An exact estimate of the
movement of the share price would contribute more to investors' profit. This paper
introduces a new stock market prediction model that includes three major phases:
feature extraction, optimal feature selection, and prediction. Initially, statistical features
like mean, standard deviation, variance, skewness, and kurtosis are extracted from the
collected stock market data. Further, the indexed data collected are also computed
concerning standard indicators like Average True Range (ATR), Exponential Moving
Average (EMA), Relative Strength Index (RSI), and Rate of Change (ROC). To acquire
best-predicted results, it is more crucial to select the most relevant features. Such that,
the optimal features are selected from the extracted features (technical-indicator-based
features and statistical features) by a new hybrid model referred to as the Red Deer Adopted Wolf
Algorithm (RDAWA). Further, the selected features are subjected to the ensemble
technique for predicting the stock movement. The ensemble technique involves the
classifiers like Support Vector Machine (SVM), Random Forest1 (RF1), Random
Forest2 (RF2), and optimized Neural Network (NN), respectively. The final predicted
results are acquired from the Optimized Neural Network (NN). To make the precise
prediction, the training of NN is carried out by the proposed RDAWA via fine-tuning
the optimal weight. Finally, the performance of the proposed work is compared over
other conventional models with respect to certain measures. “Ensemble Technique
With Optimal Feature Selection for Saudi Stock Market Prediction: A Novel
Hybrid Red Deer-Grey Algorithm”.
Yaohu Lin, Shancun Liu, Haijun Yaung, and Harris Wu note that stock market
forecasting is a knotty, challenging task due to the highly noisy, nonparametric, complex
and chaotic nature of the stock price time series. With a simple eight-trigram feature
engineering scheme of the inter-day candlestick patterns, we construct a novel ensemble
machine learning framework for daily stock pattern prediction, combining traditional
candlestick charting with the latest artificial intelligence methods. Several machine
learning techniques, including deep learning methods, are applied to stock data to
predict the direction of the closing price. This framework can give a suitable machine
learning prediction method for each pattern based on the trained results. The investment
strategy is constructed according to the ensemble machine learning techniques.
Empirical results from 2000 to 2017 of China’s stock market confirm that our feature
engineering has effective predictive power, with a prediction accuracy of more than 60%
for some trend patterns. Various measures such as big data, feature standardization, and
elimination of abnormal data can effectively solve data noise. An investment strategy
based on our forecasting framework excels in both individual stock and portfolio
performance theoretically. However, transaction costs have a significant impact on
investment. Additional technical indicators can improve the forecast accuracy to varying
degrees. Technical indicators, especially momentum indicators, can improve forecasting
accuracy in most cases. “Stock Trend Prediction Using Candlestick Charting and
Ensemble Machine Learning Techniques With a Novelty Feature Engineering
Scheme”.
Shile Chen and Changjun Zhou researched multi-factor stock prediction. In the financial market, there are a
large number of indicators used to describe the change of stock price, which provides a
good data basis for our stock price forecast. Different stocks are affected by different
factors due to their different industry types and regions. Therefore, it is very important
to find a multi factor combination suitable for a particular stock to predict the price of
the stock. This paper proposes to use a Genetic Algorithm (GA) for feature selection and
develop an optimized Long Short-Term Memory (LSTM) neural network stock
prediction model. Firstly, we use the GA to obtain a factor-importance ranking. Then,
the optimal combination of factors is obtained from this ranking with the method of trial
and error. Finally, we use the combination of optimal factors and LSTM model for stock
prediction. Thorough empirical studies based upon the China construction bank dataset
and the CSI 300 stock dataset demonstrate that the GA-LSTM model can outperform
all baseline models for time series prediction. “Stock Prediction Based on Genetic
Algorithm Feature Selection and Long Short-Term Memory Neural Network”.
(Autoregressive Integrated Moving Average) and EWMA (Exponentially Weighted
Moving Average), we investigated two possible and potential hybrid methods: EMD-
ARIMA-EWMA, EMD-EWMA-ARIMA based on high and low-frequency
components. We experimented with these methods and compared their empirical results
with four other forecasting methods using five stock market daily closing prices from
the S&P/TSX 60 Index of Toronto Stock Exchange. This study found better forecasting
accuracy from EMD-ARIMA-EWMA than from the ARIMA and EWMA base methods and the EMD-
ARIMA as well as EMD-EWMA hybrid methods. Therefore, we believe frequency-
based effective method selection in EMD-based hybridization deserves more research
investigation for better forecasting accuracy. “Improving Stock Price Prediction
Using Combining Forecasts Methods”.
Azamjon Muminov, Otabek Sattarov, and Daeyoung Na observe that in the Bitcoin trading
landscape, predicting price movements is paramount. The study focuses on identifying
the key factors influencing these price fluctuations. Utilizing the Pearson correlation
method, we extract essential data points from a comprehensive set of 14 data features.
We consider historical Bitcoin prices, representing past market behavior; trading
volumes, which highlight the level of trading activity; network metrics that provide
insights into Bitcoin’s blockchain operations; and social indicators: analyzed sentiments
from Twitter, tracked Bitcoin-related search trends on Google and on Twitter. These
social indicators give us a more nuanced understanding of the digital community’s
sentiment and interest levels. With this curated data, we forge ahead in developing a
predictive model using a Deep Q-Network (DQN). A defining aspect of our model is its
innovative, multi-faceted reward function, tailored for predicting Bitcoin price
direction. This function is a blend of several
critical factors: it rewards prediction accuracy, incorporates confidence scaling, applies
an escalating penalty for consecutive incorrect predictions, and includes a time-based
discounting to prioritize recent market trends. This composite approach ensures that the
model’s performance is not only precise in its immediate predictions but also adaptable
and responsive to the evolving patterns of the cryptocurrency market. Notably, in our
tests, our model achieved an impressive F1-score of 95%, offering substantial promise
for traders and investors. “Enhanced Bitcoin Price Direction Forecasting With
DQN”.
generate profitable trades in the stock market, effectively overcoming the limitations of
supervised learning approaches. We formulate the trading problem as a Partially
Observed Markov Decision Process (POMDP) model, considering the constraints
imposed by the stock market, such as liquidity and transaction costs. We then solve the
formulated POMDP problem using the Twin Delayed Deep Deterministic Policy
Gradient (TD3) algorithm reporting a 2.68 Sharpe Ratio on unseen data set (test data).
From the point of view of stock market forecasting and the intelligent decision-making
mechanism, this paper demonstrates the superiority of DRL in financial markets over
other types of machine learning and proves its credibility and advantages in strategic
decision-making. “Deep Reinforcement Learning Approach for Trading
Automation in the Stock Market”.
Bilal Hassan Ahmed Khattak, Imran Shafi, Abdul Saboor Khan, Emmanuel
Soriano Flores, Roberto García Lara, and Md. Abdus present a systematic survey. Artificial
intelligence (AI)-based models have emerged as powerful tools in financial markets, capable of
reducing investment risks and aiding in selecting highly profitable stocks by achieving
precise predictions. This holds immense value for investors, as it empowers them to
make data-driven decisions. Identifying current and future trends in multi-class
forecasting techniques employed within financial markets, particularly profitability
analysis as an evaluation metric, is important. The review focuses on examining studies
conducted between 2018 and 2023, sourced from three prominent academic databases.
A meticulous three-stage approach was employed, encompassing the systematic
planning, conduct, and analysis of the selected studies. Specifically, the analysis
emphasizes technical assessment, profitability analysis, hybrid modeling, and the type
of results generated by models. Articles were shortlisted based on inclusion and
exclusion criteria, while a rigorous quality assessment through ten quality criteria
questions, utilizing a Likert-type scale was employed to ensure methodological
robustness. We observed that ensemble and hybrid models with long short-term memory
(LSTM) and support vector machines (SVM) are increasingly being adopted for financial
trends and price prediction. Moreover, hybrid models employing AI algorithms for
feature engineering have great potential at par with ensemble techniques. Most studies
only employ performance metrics and lack utilization of profitability metrics or
investment or trading strategy (simulated or real-time). Similarly, research on multi-
class output is severely lacking in financial forecasting and can be a good avenue for
future research. “A Systematic Survey of AI Models in Financial Market
Forecasting for Profitability Analysis”.
CHAPTER 3
SYSTEM ANALYSIS & SPECIFICATION
• Limitations in Adaptability
• Complexity in Predictive Accuracy
• Risk of Financial Loss
• Inadequate Integration of Advanced Techniques
3.2 PROPOSED SYSTEM
This model can capture the latest market dynamics and investor sentiment thanks to
the easy access to a wealth of real-time data made possible by the incorporation of
API keys. By regularly adding new data to our collection, we ensure that our
forecasts are not just based on past trends but accurately represent the current
state of the market. The method relies heavily on the use of optimal gradient
boosting techniques, which are well known for their predictability, resilience, and
efficiency. The goal is to improve the prediction model's performance by using
variants such as XGBoost or LightGBM, which let the model learn intricate
correlations and patterns from the real-time data stream.
In addition, the model applies sentiment analysis techniques to social media feeds,
news articles, and financial data to measure investor sentiment in real time. To better
understand how investor behavior and market mood affect stock prices, we combine
sentiment analysis with improved gradient boosting methods.
We perform rigorous optimization and fine-tuning procedures, including feature
selection, cross-validation approaches, and hyperparameter tuning, to guarantee the
robustness and dependability of the model. This improves the prediction model's
capacity for generalization and lessens overfitting.
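The cross-validation step can be sketched in plain Python. The fold count and the toy mean-predictor baseline are illustrative assumptions, not the report's actual configuration:

```python
# Minimal k-fold cross-validation sketch in plain Python. The fold count
# and the toy "model" (predict the mean of the training targets) are
# illustrative assumptions, not the report's actual configuration.

def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index lists."""
    fold_size = n_samples // k
    folds = []
    for i in range(k):
        start = i * fold_size
        stop = n_samples if i == k - 1 else start + fold_size
        test = list(range(start, stop))
        train = [j for j in range(n_samples) if j < start or j >= stop]
        folds.append((train, test))
    return folds

def cv_mean_absolute_error(y, k=5):
    """Score a mean-predictor baseline with k-fold cross-validation."""
    errors = []
    for train, test in kfold_indices(len(y), k):
        prediction = sum(y[j] for j in train) / len(train)
        errors.extend(abs(y[j] - prediction) for j in test)
    return sum(errors) / len(errors)

prices = [10.0, 11.0, 10.5, 12.0, 11.5, 13.0, 12.5, 14.0, 13.5, 15.0]
print(round(cv_mean_absolute_error(prices, k=5), 3))
```

Because every sample is held out exactly once, the averaged error estimates how the model generalizes, which is what guards against overfitting.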
Through the proposed work, the project aims to enable investors to make wise
judgments in a rapidly changing market by providing them with fast and accurate
stock price projections. The objective is to enhance stock price prediction
capabilities and advance financial forecasting techniques by merging cutting-edge
gradient boosting algorithms with real-time data collection.
3.3.1 HARDWARE REQUIREMENTS
The hardware requirements may serve as the basis for a contract for the
implementation of the system and should therefore be a complete specification of the
whole system. They are used by software engineers as the starting point for the
system design. They state what the system should do, not how it should be implemented.
CHAPTER 4
SYSTEM DESIGN
4.1.1 DATA GATHERING
II. FEATURE ENGINEERING
➢ Extracting Relevant Features: Raw data often contains numerous variables,
some of which may not directly contribute to the predictive task at hand. Feature
engineering involves selecting or creating features that are most relevant to the
prediction problem. For stock price prediction, relevant features may include
historical stock prices, trading volume, economic indicators, and sentiment
analysis of news articles related to the stock.
➢ Transforming or Combining Features: Sometimes, the raw features may not
be in a suitable format for analysis or may lack predictive power. In such cases,
feature transformation techniques, such as logarithmic transformations or
polynomial expansions, may be applied. Additionally, features can be combined
to create new, more informative features that capture underlying patterns in the
data.
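The two steps above can be sketched as follows. The log-return transformation and the simple moving average are common illustrative feature choices, not the report's fixed feature set:

```python
import math

# Feature-engineering sketch: derive a log-return series (a transformation)
# and a simple moving average (a combination of past values) from raw
# closing prices. Both are common illustrative choices, not the report's
# fixed feature set. The price series is hypothetical.

def log_returns(prices):
    """Logarithmic transformation of day-over-day price ratios."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def moving_average(prices, window):
    """Combine the last `window` prices into one smoothed feature."""
    return [sum(prices[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(prices))]

closes = [100.0, 102.0, 101.0, 103.0, 105.0]
print([round(r, 4) for r in log_returns(closes)])
print(moving_average(closes, window=3))  # -> [101.0, 102.0, 103.0]
```

The log transform stabilizes the scale of price changes, while the moving average captures short-term trend, both of which downstream models can exploit.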
4.1.3 FEATURE EXTRACTION
The next step in the pipeline is feature extraction. Unlike feature selection,
which keeps a subset of the existing attributes, feature extraction actually
transforms the features, judged by their predictive relevance. The transformed
attributes, or components, are created by combining the initial attributes.
Finally, we train our models using a classifier algorithm, making use of the
classify module from the Python Natural Language Toolkit (NLTK). We employ the
acquired annotated dataset, and use the additional labelled data we have to
evaluate the models. A few machine learning techniques were used to classify the
pre-processed data; we selected non-linear random forest classifiers, and these
algorithms are usually applied to tasks that involve text classification.
4.2 DATA FLOW DIAGRAM
DFD stands for Data Flow Diagram. A DFD represents the flow of
information through a system or a process. It also provides information about
the inputs, outputs, and the actual processing of each entity. A DFD contains
no loops, decision rules, or control flows; a flowchart can explain specific
operations depending on the type of data.

It is a graphical tool that makes communication with clients, managers, and other
personnel easier. It is useful for analysing both the existing and the proposed
system. It provides a summary of what information the system processes: what
transformations are made, what data is stored, what outputs are produced, and
so on. There are various ways to draw a data flow diagram, and structured-analysis
modelling tools build on the DFD. Data flow diagrams are popular because they let
us visualise the important steps and information involved in the operation of a
software system.
Four components make up the data flow diagram. Process: a process shows how
the system transforms its inputs; its symbol may be a circle, an oval, or a
rectangle with rounded corners, and it is given a brief name that communicates
its essence in a single word or phrase.

Data flow: a data flow depicts the information moving between different parts
of the system. The arrow is the symbol for data flow. A descriptive name should
be given to the flow to indicate the data being moved. A data flow can also
represent material along with information; material movements are shown in
systems that are not purely informational. A given flow should carry only a
single kind of data. The direction of the flow is indicated by the arrow, which
can also be bi-directional.
LEVEL 0
The above figure 4.2 explains the user interface or application where users
interact with the system. This component represents the core functionality of the
system, where the prediction of future stock prices takes place. An external
system or service provides the current stock price data to the prediction application.
LEVEL 1
The above Figure 4.3 explains the input provided by the user, which could include
login credentials, company symbols, etc. The system validates the user's credentials
and allows access. This process is responsible for predicting future stock prices
and involves several sub-processes.
LEVEL 2
The above diagram explains the various inputs provided by the user, such as login
credentials and company symbols. The system validates the user's credentials,
ensures that the user is authorized to access the system, and verifies the user's
account details and permissions. This process is decomposed into more detailed
sub-processes.
4.3 UML DIAGRAM
The Unified Modeling Language (UML) is used to specify, visualize, modify,
construct, and document the artifacts of an object-oriented software-intensive
system under development. UML offers a standard way of drawing a
system's architectural blueprints, including elements such as:
●actors
●business processes
●(logical) components
●activities
UML combines best practices from data modelling (entity-relationship diagrams),
business modelling (work flows), object modelling, and component modelling. It can
be used with all processes, throughout the software development life cycle, and
across different implementation technologies. UML has integrated the notations
of the Booch method, the Object Modelling Technique (OMT), and Object-Oriented
Software Engineering (OOSE) by fusing them into a single, common, and
widely usable modelling language. UML aims to be a standard
modelling language which can model concurrent and distributed systems.
• Sequence Diagram:
Sequence diagrams represent the objects participating in an interaction
horizontally, and time vertically. A use case is a kind of behavioral
classifier that represents a declaration of an offered behavior. Each use case
specifies some behavior, possibly including variants, that the
subject can perform in collaboration with one or more actors. Use cases
define the offered behavior of the subject without reference to its internal
structure. These behaviors, involving interactions between the
actor and the subject, may result in changes to the state of the
subject and communications with its environment. A use case
can include possible variations of its basic behavior, including
exceptional behavior and error handling.
• Activity Diagrams:
• Use case summary:
• Class diagram
Classes are shown in the diagram as boxes with three compartments each:
the class name is located in the top compartment; it is centred, printed in
bold, and the first letter is capitalised.
Use case diagrams are used for high-level requirement analysis of a
system. When the requirements of a system are analysed, the functionalities are
captured in use cases. So it can be said that use cases are nothing but the system
functionalities written in an organized manner.
Figure 4.5 Use case Diagram
"External Systems" represent the systems outside the user's control, such as the Stock
Price API and the Prediction Algorithm. The "Predict Future" use case interacts with
these external systems to fetch the current stock price and perform the prediction. The
"Predict Future" use case is decomposed into several steps, each interacting with the
external systems to fulfill its functionality.
4.3.2 CLASS DIAGRAM
Class diagram is basically a graphical representation of the static view of the
system and represents different aspects of the application. A collection of class
diagrams represents the whole system. The name of the class diagram should be
meaningful and describe the aspect of the system. Each element and their
relationships should be identified in advance, the responsibility (attributes and
methods) of each class should be clearly identified, and for each class a minimum
number of properties should be specified, because unnecessary properties will make
the diagram complicated.
The class diagram provides information such as the methods and variables that are
required for the project. It consists of all the class files that are required for
the project.
4.3.3 ACTIVITY DIAGRAM
An activity is a particular operation of the system. Activity diagrams are not only
used for visualizing the dynamic nature of a system; they are also used to construct
the executable system by using forward and reverse engineering techniques. The only
missing thing in an activity diagram is the message part: it does not show any
message flow from one activity to another. An activity diagram is sometimes
considered a flow chart. It shows different flows such as parallel, branched,
concurrent, and single.
The above activity diagram shows the flow of the project's working. In the first
stage it checks whether the user is valid or not; if the user is invalid, he needs
to register an account.
4.3.4 SEQUENCE DIAGRAM
A sequence diagram is a type of interaction diagram in Unified Modeling
Language (UML) that illustrates how objects interact in a particular scenario of a
system or application. It shows the flow of messages, or interactions, between
objects over time, typically from top to bottom. Each object involved in the
scenario is represented by a vertical lifeline, with messages between objects shown
as horizontal arrows. These messages indicate the order and direction of
communication between objects, helping to visualize the sequence of actions in a
system. Sequence diagrams are valuable for understanding the dynamic behavior
of a system, including the order of method calls, collaborations between objects,
and the flow of control.
Figure 4.8 Sequence Diagram
The user initiates by entering credentials, validated for login. Upon success, the user
selects stock price prediction, entering a company symbol. The system communicates
with an external API for the current stock price, then predicts the future stock price.
The prediction is displayed for user verification, followed by logout. Messages are
exchanged between user, system, and external API throughout the process.
CHAPTER 5
SYSTEM IMPLEMENTATION
1. Data Gathering
2. Pre-Processing of Data
3. Feature Extraction
4. Assessment Model
5.2.1 DATA GATHERING
2. Feature Engineering
Rectifying Data Integrity Issues: Once identified, inconsistencies and errors
need to be addressed. This may involve imputing missing values, removing
outliers, or correcting erroneous entries. The goal is to ensure that the data
is accurate and reliable for analysis.
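The rectification step above can be sketched as follows. The median-imputation strategy and the z-score cutoff (2.0 here, chosen for the tiny sample) are common illustrative defaults, not the report's mandated procedure:

```python
import statistics

# Data-integrity sketch: impute missing values with the median, then drop
# z-score outliers. The median strategy and the cutoff value are common
# illustrative defaults, not the report's mandated procedure. A cutoff of
# 2.0 is used because the toy sample is too small for the usual 3.0.

def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def drop_outliers(values, cutoff=2.0):
    """Remove entries whose z-score magnitude exceeds `cutoff`."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / stdev <= cutoff]

raw = [101.0, None, 99.5, 100.5, 100.0, 1000.0]  # 1000.0 is an erroneous entry
cleaned = drop_outliers(impute_missing(raw))
print(cleaned)
```

Imputing before outlier removal keeps the sample size intact, while the z-score filter discards entries that would otherwise distort the model's training signal.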
5.2.3 FEATURE EXTRACTION
The next step is feature extraction. Unlike feature selection, which keeps a
subset of the existing attributes, feature extraction derives new ones: the transformed
attributes, or components, are created by combining the original attributes, chosen for
their predictive relevance. We then train our models with the classifier algorithm,
making use of the classify module from the Python Natural Language Toolkit and the
annotated dataset we acquired; the additional labelled data we hold back is used to
evaluate the models. Several machine learning techniques were used to classify the
pre-processed data, and we selected random forest classifiers, algorithms that are
commonly applied to tasks involving text classification.
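As an illustration of deriving new attributes by combining the original ones, the short sketch below computes two common engineered features, a simple moving average and daily returns, from an invented series of closing prices. The window size and the feature choices are assumptions for the example, not the project's exact feature set.

```python
def moving_average(closes, window=3):
    """Simple moving average: each new feature combines `window` original values."""
    return [sum(closes[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(closes))]

def daily_returns(closes):
    """Relative day-to-day change, another feature derived from the raw prices."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]

closes = [100.0, 102.0, 101.0, 104.0, 106.0]   # invented closing prices
ma = moving_average(closes)
rets = daily_returns(closes)
```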
5.3 PYTHON LIBRARIES & PACKAGES:
5.5 ALGORITHM
Boosting algorithms are a family of machine learning methods that combine the
capabilities of several weak learners in order to improve the predictive performance of
models. Weak learners, often shallow decision trees, are trained sequentially on the
data, with each new learner learning from the mistakes of the previous ones. The main
idea is to give incorrectly classified examples larger weights, emphasizing their
significance in later rounds. Well-known boosting algorithms include XGBoost,
AdaBoost, and Gradient Boosting. In AdaBoost, instances are weighted according to
how well they are classified, whereas XGBoost adds regularization and parallel
processing capabilities. Gradient Boosting reduces model error by fitting each new
learner to the negative gradient of the loss function.
The ability of boosting algorithms to increase prediction accuracy by combining the
outputs of numerous weak learners makes them popular in stock price prediction
models. When predicting stock prices from data gathered from stock market servers or
hubs, the following boosting algorithms are commonly employed:
4. CatBoost: CatBoost is a gradient boosting algorithm created by Yandex with the
express purpose of handling categorical features well. Even with minimal
hyperparameter tuning it offers strong performance, and it manages missing data
intelligently.
Several forms of data, such as historical price data, trade volumes, technical indicators,
and fundamental data, may be obtained from stock market servers or hubs and modeled
with these boosting techniques. Analysts and traders may use these algorithms to build
prediction models that help them spot patterns and trends in stock price movements and
make well-informed investment choices.
Figure 5.1 Boosting Algorithm
Stock price fluctuations can be predicted using boosting algorithms such as AdaBoost
or Gradient Boosting. These algorithms build a powerful learner by merging several
weak learners, often decision trees; by rectifying the mistakes of its predecessor, every
weak learner improves prediction accuracy over time. Boosting algorithms can identify
intricate patterns and relationships in stock price data, which improves the accuracy of
price forecasts. Boosting models trained on past price data and pertinent attributes can
offer insights into possible stock price movements by predicting future price changes
from the patterns they have learned. Because these algorithms can handle non-linear
interactions and adjust to changing market conditions, they are widely used in financial
modeling. Like other predictive models, though, their efficacy is contingent upon the
quality of the training data and features.
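The mechanism described above can be illustrated with a minimal, pure-Python sketch: depth-1 "stump" learners are fitted sequentially, each one to the residuals (the negative gradient of the squared loss), and their shrunken outputs are summed. The data and hyperparameters are invented for the toy example; a real system would use a library such as XGBoost or LightGBM.

```python
def fit_stump(xs, residuals):
    """Find the single split on xs that best fits residuals with two constants."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    _, threshold, lmean, rmean = best
    return lambda x: lmean if x <= threshold else rmean

def gradient_boost(xs, ys, rounds=50, lr=0.1):
    """Sequentially fit stumps to residuals; sum their shrunken predictions."""
    base = sum(ys) / len(ys)
    stumps, preds = [], [base] * len(ys)
    for _ in range(rounds):
        # For squared loss, the negative gradient is just the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + lr * sum(s(x) for s in stumps)

# Toy series: the "price" level roughly doubles once x exceeds 3.
xs = [1, 2, 3, 4, 5, 6]
ys = [10.0, 11.0, 10.5, 20.0, 21.0, 20.5]
model = gradient_boost(xs, ys)
```

Each round corrects the mistakes left by the previous rounds, so the ensemble's training error shrinks steadily even though every individual stump is weak.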
CHAPTER 6
EXPERIMENTAL RESULTS
The figures below show the results of the module implementation. These screenshots
show the user interface through which the modules were developed.
6.1 XAMPP
XAMPP is a free and open-source, cross-platform web server solution stack
package, consisting mainly of the Apache HTTP Server, the MySQL database, and
interpreters for scripts written in the PHP and Perl programming languages. Installing
XAMPP takes less time than installing each of its components separately. It is self-
contained: multiple instances of XAMPP can exist on a single computer, and any given
instance can be copied from one computer to another. It is offered in both a full, standard
version and a smaller version.
6.2 LOGIN
Users visit the website's login page to start the procedure. They must provide
their login information, which includes their username and password, at this point. The
submitted information is subsequently authenticated by the system using user data that
has been saved. The user is forwarded to the home page and given access to the system
if the credentials match. On the other hand, relevant error messages are shown in the
event of an authentication failure (such as an erroneous username or password),
assisting the user in fixing the problem. To protect user data and stop unwanted access,
security measures including session management and encryption are put in place. Once
the login process is successful, the user may explore the application's capabilities as
they have been authenticated.
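As a hedged illustration of the credential-handling measures mentioned above, the sketch below stores only a salted PBKDF2 hash of the password and compares hashes in constant time. It is a standard-library sketch, not the application's actual authentication code; the salt size and iteration count are assumptions for the example.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=100_000):
    """Return (salt, digest); only these, never the plain password, are stored."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest

def verify_password(password, salt, digest, iterations=100_000):
    """Recompute the hash and compare in constant time to resist timing attacks."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password("s3cret-example")
```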
6.3 HOME PAGE
Users are sent to the main page after a successful login, where they are met with
an extensive summary of the state of the stock market. The main page presents the
current stock values of several firms in an easily navigable format, either arranged in a
list or grid style. Every listed firm has its current stock price associated with it, giving
users important information quickly. Furthermore, by selecting the "Predict the Future"
button linked to each firm, consumers are given the opportunity to project future stock
values. With the help of this function, consumers may make wise investment choices
by using market patterns and predictive research.
6.4 PREDICT THE FUTURE PAGE
When users click the "Predict the Future" button, a specialized prediction page
opens and they may choose the firm whose stock price they want to anticipate. An
easy-to-use input area on the prediction page allows customers to enter the name or
symbol of the business they are interested in. Users fill out the form and submit it to
begin the prediction process after providing the firm data. The system uses API keys
to retrieve historical and real-time data from the server or hub that has the most recent
stock market data in the background. The aforementioned data forms the foundation
for producing precise forecasts concerning the forthcoming stock value of the
designated enterprise.
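The retrieval step can be sketched as follows. The field names ("1. open", "4. close", and so on) follow the Alpha Vantage daily time-series format that the appendix code consumes, but the function names, the symbol, and the API key here are placeholders for illustration, and the parsing is demonstrated on a small hand-written sample rather than a live response.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.alphavantage.co/query"

def build_daily_url(symbol, api_key):
    """Compose the request URL for the daily adjusted time series."""
    params = {"function": "TIME_SERIES_DAILY_ADJUSTED", "symbol": symbol,
              "outputsize": "full", "apikey": api_key}
    return BASE_URL + "?" + urlencode(params)

def parse_daily(payload):
    """Flatten the nested time-series JSON into (date, open, close) rows."""
    series = payload["Time Series (Daily)"]
    return [(date, float(v["1. open"]), float(v["4. close"]))
            for date, v in sorted(series.items())]

# A tiny sample shaped like the real response, for illustration only.
sample = {"Time Series (Daily)": {
    "2024-04-01": {"1. open": "170.0", "4. close": "171.5"},
    "2024-04-02": {"1. open": "171.6", "4. close": "169.9"},
}}
rows = parse_daily(sample)
```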
6.5 RESULT PAGE
Users are sent to the result page after submitting their prediction request, where
they may view the predicted stock price as well as enlightening commentary and
suggestions. The result page efficiently helps visitors visualize possible market
movements by showcasing a graphical depiction of the expected stock price trend over
a certain time range. The website offers succinct and straightforward suggestions,
such as whether to "Buy," "Sell," or "Hold" the stock depending on the forecast
outcome, in addition to its graphical depiction. In addition, the recommendation's
justification is expanded upon, taking into account variables including past
performance, market trends, and outside influences. It's a smooth and user-focused
experience since users may explore more features, go back to the home page, or start
another forecast.
CHAPTER 7
TESTING
➢ Boundary Value Testing: Boundary value testing focuses on the values at
the boundaries. This technique determines whether a certain range of values
is acceptable to the system or not. It is very useful in reducing the number
of test cases, and it is most suitable for systems whose inputs lie within
certain ranges.
➢ Decision Table Testing: A decision table puts causes and their effects in
a matrix; each column represents a unique combination.
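A boundary-value suite for a hypothetical input can be sketched directly from this idea. Assuming, for illustration only, that the application accepted a forecast horizon of 1 to 30 days, the cases would sit just below, on, and just above each boundary:

```python
def valid_horizon(days):
    """Accept an integer forecast horizon in the assumed range 1-30."""
    return isinstance(days, int) and 1 <= days <= 30

# Boundary-value cases: just below, on, and just above each boundary.
cases = {0: False, 1: True, 2: True, 29: True, 30: True, 31: False}
```

Six cases exercise both boundaries of the range, which is far fewer than testing every acceptable value.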
White Box Testing is the testing of a software solution's internal coding and
infrastructure. It focuses primarily on strengthening security, the flow of inputs and
outputs through the application, and improving design and usability. White box testing
is also known as Clear Box testing, Open Box testing, Structural testing, Transparent
Box testing, Code-Based testing, and Glass Box testing.
7.3 TYPES OF TESTING
1. Unit Testing
2. Functional Testing
3. Performance Testing
4. Integration Testing
5. Validation Testing
6. System Testing
7. Structure Testing
8. Output Testing
9. User Acceptance Testing
7.3.2 FUNCTIONAL TESTING
➢ Functional test cases involve exercising the code with normal input values
for which the expected results are known, as well as the boundary values
➢ The objective is to take unit-tested modules and build a program structure
that has been dictated by design.
➢ A validation test succeeds when the software functions in a manner that can
be reasonably expected by the client.
➢ Software validation is achieved through a series of black box tests that
confirm conformance to the requirements.
➢ Black box testing is conducted at the software interface.
➢ The tests are designed to uncover interface errors, and are also used to
demonstrate that software functions are operational, that input is properly
accepted and outputs are produced, and that the integrity of external
information is maintained.
➢ Tests find the discrepancies between the system and its original
objective, current specifications, and system documentation.
➢ The output of the test cases is compared with the expected results created
during the design of the test cases.
➢ The output generated or displayed by the system under consideration is
tested by asking the users about the format they require.
➢ Here, the output format is considered in two ways: one is the on-screen
format and the other is the printed format.
➢ The output on the screen is found to be correct, as the format was designed
in the system design phase according to user needs.
➢ The printed output matches the specified requirements for the user's hard
copy.
➢ In the final stage, before handing over to the customer, the test cases are
executed with actual data, usually by the customer.
➢ The system under consideration is tested for user acceptance, keeping in
constant touch with prospective system users during development and making
changes whenever required.
➢ It involves planning and execution of various types of tests in order to
demonstrate that the implemented software system satisfies the requirements
stated in the requirements document.
7.4 TEST CASE SCENARIO
INPUT VALIDATION TESTING
➢ Validate user input for the company symbol to ensure it meets the required
format and constraints.
➢ Test scenarios such as entering invalid symbols, empty input, and excessively
long input.
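These scenarios can be captured in a small validation sketch. The ticker format below (one to five uppercase letters) is an assumption for illustration, not the application's actual rule:

```python
import re

# Assumed ticker format for the sketch: one to five uppercase letters.
SYMBOL_RE = re.compile(r"^[A-Z]{1,5}$")

def valid_symbol(symbol):
    """Reject empty, overlong, and malformed company symbols."""
    return bool(SYMBOL_RE.match(symbol.strip().upper()))
```

The three test scenarios listed above map directly onto inputs such as `"12$"` (invalid symbol), `""` (empty input), and `"TOOLONGSYM"` (excessively long input).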
PERFORMANCE TESTING
➢ Evaluate the performance of the application under different loads and network
conditions.
➢ Test response times for fetching stock prices and generating predictions to
ensure acceptable performance.
SECURITY TESTING
➢ Conduct security testing to identify and mitigate potential vulnerabilities,
such as SQL injection and cross-site scripting (XSS).
➢ Ensure that sensitive data, such as API keys and user credentials, are handled
securely.
COMPATIBILITY TESTING
➢ Test the application on various web browsers and devices to ensure
compatibility and consistent behavior.
By following these testing requirements, we can ensure that the stock price prediction
web application is reliable, accurate, and user-friendly.
CHAPTER 8
CONCLUSION
Financial forecasting can undergo a revolutionary change with the use of real-time
live data obtained through API keys for stock price prediction. Using the most recent
market data, this project offers unmatched precision and flexibility. Accurate
forecasts, responsible decision-making, and quick reactions to changing market
conditions are made possible by the incorporation of real-time data. In addition, the
incorporation of news events and sentiment research enriches market sentiment
insights, providing investors with relevant and actionable information. The capacity
to handle volatility with confidence is provided by the scalability and efficiency of
API-based data access, which supports strong risk management methods. This project
gives us a competitive edge in the quick-paced financial world of today while also
advancing our grasp of market dynamics. Cutting-edge technologies and real-time
data are embraced, laying the groundwork for nimble, wise, and profitable investing
methods. In conclusion, the use of real-time data via API keys signifies a fundamental
change toward stock price prediction models that are more precise, flexible, and
significant.
REFERENCES
[1] Z. Yan, C. Qin and G. Song, "Stock price prediction of stochastic forest model based
on Pearson feature selection", Comput. Eng. Appl., vol. 57, no. 15, pp. 286-296, 2021.
[2] C. Wang, Y. Chen, S. Zhang and Q. Zhang, "Stock market index prediction using
deep transformer model", Exp. Syst. Appl., vol. 208, Dec. 2022.
[3] Y. Li and Y. Pan, "A novel ensemble deep learning model for stock prediction based
on stock prices and news", Int. J. Data Sci. Anal., vol. 13, no. 2, pp. 139-149, Sep. 2021.
[4] V. Gupta and M. Ahmad, "Stock price trend prediction with long short-term memory
neural networks", Int. J. Comput. Intell. Stud., vol. 8, no. 4, pp. 289, 2019.
[6] X. Zhang, S. Qu, J. Huang, B. Fang and P. Yu, "Stock market prediction via multi-
source multiple instance learning", IEEE Access, vol. 6, pp. 50720-50728, 2018.
[7] T.-H. Lu, "The profitability of candlestick charting in the Taiwan stock market",
Pacific-Basin Finance J., vol. 26, pp. 65-78, Jan. 2014.
[9] L. J. Cao and F. H. Tay, "Support vector machine with adaptive parameters in
financial time series forecasting", IEEE Trans. Neural Netw., vol. 14, no. 6, pp. 5-10,
Nov. 2003.
[10] Y. C. Li, L. N. Zhao and S. J. Zhou, "Review of genetic algorithm", Adv. Mater.
Res., vol. 179, no. 180, pp. 365-367, Aug. 2011.
[11] F. Allen and D. Gale, "Stock-price manipulation", The Review of Financial Studies,
vol. 5, no. 3, pp. 503-529, 1992.
[12] Y. C. Huang and Y. J. Cheng, "Stock manipulation and its effects: pump and dump
versus stabilization", Review of Quantitative Finance and Accounting, vol. 44, no. 4,
pp. 791-815, 2015.
[13] P. Fan, Y. Yang, Z. Zhang, and M. Chen, "The relationship between individual
stock investor sentiment and stock yield-based on the perspective of stock evaluation
information", Math. Pract. Theory, vol. 51, no. 16, pp. 305-320, 2021.
[14] Z. Jin, Y. Yang, and Y. Liu, "Stock closing price prediction based on sentiment
analysis and LSTM", Neural Comput. Appl., vol. 32, no. 13, pp. 9713-9729, Sep. 2019,
doi: 10.1007/s00521-019-04504-2.
[15] H. Liu and Y. Hou, "Application of Bayesian neural network in prediction of stock
time series", Comput. Eng. Appl., vol. 55, no. 12, pp. 225-229, 2019.
[18] A. W. Li and G. S. Bastos, "Stock market forecasting using deep learning and
technical analysis: A systematic review", IEEE Access, vol. 8, pp. 185232-185242,
2020.
CHAPTER 10
APPENDIX
APPENDIX A
main.py
# Ignore Warnings
import warnings
warnings.filterwarnings("ignore")
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
#To control caching so as to save and retrieve plot figs on client side
@app.after_request
def add_header(response):
response.headers['Pragma'] = 'no-cache'
response.headers['Cache-Control'] = 'no-cache, no-store, must-revalidate'
response.headers['Expires'] = '0'
return response
@app.route('/')
def index():
return render_template('index.html')
@app.route('/insertintotable',methods = ['POST'])
def insertintotable():
nm = request.form['nm']
df=pd.DataFrame()
df['Date']=data['date']
df['Open']=data['1. open']
df['High']=data['2. high']
df['Low']=data['3. low']
df['Close']=data['4. close']
df['Adj Close']=data['5. adjusted close']
df['Volume']=data['6. volume']
df.to_csv(''+quote+'.csv',index=False)
return
Quantity_date = data[['Price','Date']]
Quantity_date.index = Quantity_date['Date'].map(lambda x: parser(x))
Quantity_date['Price'] = Quantity_date['Price'].map(lambda x: float(x))
Quantity_date = Quantity_date.fillna(Quantity_date.bfill())
Quantity_date = Quantity_date.drop(['Date'],axis =1)
fig = plt.figure(figsize=(7.2,4.8),dpi=65)
plt.plot(Quantity_date)
plt.savefig('static/Trends.png')
plt.close(fig)
quantity = Quantity_date.values
size = int(len(quantity) * 0.80)
train, test = quantity[0:size], quantity[size:len(quantity)]
#fit in model
predictions = arima_model(train, test)
#plot graph
fig = plt.figure(figsize=(7.2,4.8),dpi=65)
plt.plot(test,label='Actual Price')
plt.plot(predictions,label='Predicted Price')
plt.legend(loc=4)
plt.savefig('static/ARIMA.png')
plt.close(fig)
print()
print("##############################################################################")
arima_pred=predictions[-2]
print("Tomorrow's",quote," Closing Price Prediction by ARIMA:",arima_pred)
#rmse calculation
error_arima = math.sqrt(mean_squared_error(test, predictions))
print("ARIMA RMSE:",error_arima)
print("##############################################################################")
return arima_pred, error_arima
#***** LSTM SECTION ********
def LSTM_ALGO(df):
#Split data into training set and test set
dataset_train=df.iloc[0:int(0.8*len(df)),:]
dataset_test=df.iloc[int(0.8*len(df)):,:]
############# NOTE #################
#TO PREDICT STOCK PRICES OF NEXT N DAYS, STORE PREVIOUS N DAYS IN MEMORY WHILE TRAINING
# HERE N=7
###dataset_train=pd.read_csv('Google_Stock_Price_Train.csv')
training_set=df.iloc[:,4:5].values# 1:2, to store as numpy array else Series obj will be stored
#select cols using above manner to select as float64 type, view in var explorer
#Feature Scaling
from sklearn.preprocessing import MinMaxScaler
sc=MinMaxScaler(feature_range=(0,1))#Scaled values btween 0,1
training_set_scaled=sc.fit_transform(training_set)
#In scaling, fit_transform for training, transform for test
#Creating data stucture with 7 timesteps and 1 output.
#7 timesteps meaning storing trends from 7 days before current day to predict 1 next output
X_train=[]#memory with 7 days from day i
y_train=[]#day i
for i in range(7,len(training_set_scaled)):
X_train.append(training_set_scaled[i-7:i,0])
y_train.append(training_set_scaled[i,0])
#Convert list to numpy arrays
X_train=np.array(X_train)
y_train=np.array(y_train)
X_forecast=np.array(X_train[-1,1:])
X_forecast=np.append(X_forecast,y_train[-1])
#Reshaping: Adding 3rd dimension
X_train=np.reshape(X_train, (X_train.shape[0],X_train.shape[1],1))#.shape 0=row,1=col
X_forecast=np.reshape(X_forecast, (1,X_forecast.shape[0],1))
#For X_train=np.reshape(no. of rows/samples, timesteps, no. of cols/features)
#Building RNN
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import LSTM
#Initialise RNN
regressor=Sequential()
real_stock_price=dataset_test.iloc[:,4:5].values
#To predict, we need stock prices of 7 days before the test set
#So combine train and test set to get the entire data set
dataset_total=pd.concat((dataset_train['Close'],dataset_test['Close']),axis=0)
testing_set=dataset_total[ len(dataset_total) -len(dataset_test) -7: ].values
testing_set=testing_set.reshape(-1,1)
#-1=till last row, (-1,1)=>(80,1). otherwise only (80,0)
#Feature scaling
testing_set=sc.transform(testing_set)
#Create data structure
X_test=[]
for i in range(7,len(testing_set)):
X_test.append(testing_set[i-7:i,0])
#Convert list to numpy arrays
X_test=np.array(X_test)
#Reshaping: Adding 3rd dimension
X_test=np.reshape(X_test, (X_test.shape[0],X_test.shape[1],1))
#Testing Prediction
predicted_stock_price=regressor.predict(X_test)
#Getting original prices back from scaled values
predicted_stock_price=sc.inverse_transform(predicted_stock_price)
fig = plt.figure(figsize=(7.2,4.8),dpi=65)
plt.plot(real_stock_price,label='Actual Price')
plt.plot(predicted_stock_price,label='Predicted Price')
plt.legend(loc=4)
plt.savefig('static/LSTM.png')
plt.close(fig)
error_lstm = math.sqrt(mean_squared_error(real_stock_price, predicted_stock_price))
#Forecasting Prediction
forecasted_stock_price=regressor.predict(X_forecast)
#Getting original prices back from scaled values
forecasted_stock_price=sc.inverse_transform(forecasted_stock_price)
lstm_pred=forecasted_stock_price[0,0]
print()
print("##############################################################################")
print("Tomorrow's ",quote," Closing Price Prediction by LSTM: ",lstm_pred)
print("LSTM RMSE:",error_lstm)
print("##############################################################################")
return lstm_pred,error_lstm
#****** LINEAR REGRESSION SECTION *******
def LIN_REG_ALGO(df):
#No of days to be forcasted in future
forecast_out = int(7)
#Price after n days
df['Close after n days'] = df['Close'].shift(-forecast_out)
#New df with only relevant data
df_new=df[['Close','Close after n days']]
#Structure data for train, test & forecast
#lables of known data, discard last 35 rows
y =np.array(df_new.iloc[:-forecast_out,-1])
y=np.reshape(y, (-1,1))
#all cols of known data except lables, discard last 35 rows
X=np.array(df_new.iloc[:-forecast_out,0:-1])
#Unknown, X to be forecasted
X_to_be_forecasted=np.array(df_new.iloc[-forecast_out:,0:-1])
#Traning, testing to plot graphs, check accuracy
X_train=X[0:int(0.8*len(df)),:]
X_test=X[int(0.8*len(df)):,:]
y_train=y[0:int(0.8*len(df)),:]
y_test=y[int(0.8*len(df)):,:]
# Feature Scaling===Normalization
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
X_to_be_forecasted=sc.transform(X_to_be_forecasted)
#Training
clf = LinearRegression(n_jobs=-1)
clf.fit(X_train, y_train)
#Testing
y_test_pred=clf.predict(X_test)
y_test_pred=y_test_pred*(1.04)
import matplotlib.pyplot as plt2
fig = plt2.figure(figsize=(7.2,4.8),dpi=65)
plt2.plot(y_test,label='Actual Price' )
plt2.plot(y_test_pred,label='Predicted Price')
plt2.legend(loc=4)
plt2.savefig('static/LR.png')
plt2.close(fig)
error_lr = math.sqrt(mean_squared_error(y_test, y_test_pred))
#Forecasting
forecast_set = clf.predict(X_to_be_forecasted)
forecast_set=forecast_set*(1.04)
mean=forecast_set.mean()
lr_pred=forecast_set[0,0]
print()
print("##############################################################################")
print("Tomorrow's ",quote," Closing Price Prediction by Linear Regression: ",lr_pred)
print("Linear Regression RMSE:",error_lr)
print("##############################################################################")
return df, lr_pred, forecast_set, mean, error_lr
#****** SENTIMENT ANALYSIS **********
def retrieving_tweets_polarity(symbol):
stock_ticker_map = pd.read_csv('Yahoo-Finance-Ticker-Symbols.csv')
stock_full_form = stock_ticker_map[stock_ticker_map['Ticker']==symbol]
symbol = stock_full_form['Name'].to_list()[0][0:12]
auth = tweepy.OAuthHandler(ct.consumer_key, ct.consumer_secret)
auth.set_access_token(ct.access_token, ct.access_token_secret)
user = tweepy.API(auth)
tweets = tweepy.Cursor(user.search_tweets, q=symbol, tweet_mode='extended',
lang='en',exclude_replies=True).items(ct.num_of_tweets)
tweet_list = [] #List of tweets alongside polarity
global_polarity = 0 #Polarity of all tweets === Sum of polarities of individual tweets
tw_list=[] #List of tweets only => to be displayed on web page
#Count Positive, Negative to plot pie chart
pos=0 #Num of pos tweets
neg=0 #Num of negative tweets
count=20 #Num of tweets to be displayed on web page
for tweet in tweets:
#Convert to Textblob format for assigning polarity
tw2 = tweet.full_text
tw = tweet.full_text
#Clean
tw=p.clean(tw)
#print("-------------------------------CLEANED TWEET-----------------------------")
#print(tw)
#Replace &amp; by &
tw=re.sub('&amp;','&',tw)
#Remove :
tw=re.sub(':','',tw)
#print("-------------------------------TWEET AFTER REGEX MATCHING-----------------------------")
#print(tw)
#Remove Emojis and Hindi Characters
tw=tw.encode('ascii', 'ignore').decode('ascii')
#print(tw)
blob = TextBlob(tw)
polarity = 0 #Polarity of single individual tweet
for sentence in blob.sentences:
polarity += sentence.sentiment.polarity
if polarity>0:
pos=pos+1
if polarity<0:
neg=neg+1
global_polarity += sentence.sentiment.polarity
if count > 0:
tw_list.append(tw2)
tweet_list.append(Tweet(tw, polarity))
count=count-1
if len(tweet_list) != 0:
global_polarity = global_polarity / len(tweet_list)
else:
global_polarity = global_polarity
neutral=ct.num_of_tweets-pos-neg
if neutral<0:
neg=neg+neutral
neutral=20
print()
print("##############################################################################")
print("Positive Tweets :",pos,"Negative Tweets :",neg,"Neutral Tweets :",neutral)
print("##############################################################################")
labels=['Positive','Negative','Neutral']
sizes = [pos,neg,neutral]
explode = (0, 0, 0)
fig1, ax1 = plt.subplots(figsize=(7.2,4.8),dpi=65)
ax1.pie(sizes, explode=explode, labels=labels, autopct='%1.1f%%', startangle=90)
# Equal aspect ratio ensures that pie is drawn as a circle
ax1.axis('equal')
plt.tight_layout()
plt.savefig('static/SA.png')
plt.close(fig1)
#plt.show()
if global_polarity>0:
print()
print("##############################################################################")
print("Tweets Polarity: Overall Positive")
print("##############################################################################")
tw_pol="Overall Positive"
else:
print()
print("##############################################################################")
print("Tweets Polarity: Overall Negative")
print("##############################################################################")
tw_pol="Overall Negative"
return global_polarity,tw_list,tw_pol,pos,neg,neutral
print("According to the ML Predictions and Sentiment Analysis of Tweets, a",idea,"in",quote,"stock is expected => ",decision)
elif global_polarity <= 0:
idea="FALL"
decision="BUY"
print()
print("##############################################################################")
print("According to the ML Predictions and Sentiment Analysis of Tweets, a",idea,"in",quote,"stock is expected => ",decision)
else:
idea="FALL"
decision="SELL"
print()
print("##############################################################################")
print("According to the ML Predictions and Sentiment Analysis of Tweets, a",idea,"in",quote,"stock is expected => ",decision)
return idea, decision
df = pd.read_csv(''+quote+'.csv')
print("##############################################################################")
print("Today's",quote,"Stock Data: ")
today_stock=df.iloc[-1:]
print(today_stock)
print("##############################################################################")
df = df.dropna()
code_list=[]
for i in range(0,len(df)):
code_list.append(quote)
df2=pd.DataFrame(code_list,columns=['Code'])
df2 = pd.concat([df2, df], axis=1)
df=df2
arima_pred, error_arima=ARIMA_ALGO(df)
lstm_pred, error_lstm=LSTM_ALGO(df)
df, lr_pred, forecast_set,mean,error_lr=LIN_REG_ALGO(df)
# Twitter Lookup is no longer free in Twitter's v2 API
# polarity,tw_list,tw_pol,pos,neg,neutral = retrieving_tweets_polarity(quote)
polarity, tw_list, tw_pol, pos, neg, neutral = 0, [], "Can't fetch tweets, Twitter Lookup is no longer free in API v2.", 0, 0, 0
idea, decision=recommending(df, polarity,today_stock,mean)
print()
print("Forecasted Prices for Next 7 days:")
print(forecast_set)
today_stock=today_stock.round(2)
return render_template('results.html',quote=quote,arima_pred=round(arima_pred,2),lstm_pred=round(lstm_pred,2),
                       lr_pred=round(lr_pred,2),open_s=today_stock['Open'].to_string(index=False),
                       close_s=today_stock['Close'].to_string(index=False),adj_close=today_stock['Adj Close'].to_string(index=False),
                       tw_list=tw_list,tw_pol=tw_pol,idea=idea,decision=decision,high_s=today_stock['High'].to_string(index=False),
                       low_s=today_stock['Low'].to_string(index=False),vol=today_stock['Volume'].to_string(index=False),
                       forecast_set=forecast_set,error_lr=round(error_lr,2),error_lstm=round(error_lstm,2),error_arima=round(error_arima,2))
if __name__ == '__main__':
app.run()
index.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Stocks</title>
<meta name="description" content="">
<meta name="author" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="{{ url_for('static', filename='menuzord.css') }}">
<!--preloader start-->
<!--preloader end-->
<!-- Nav Bar-->
<nav class="navbar-fixed-top transparrent-bg">
<div class="container">
<div id="menuzord" class="menuzord red">
<a href="http://localhost/www/wordpress-5.6.2/wordpress/" class="menuzord-brand">
<img src="{{url_for('static', filename='logo-2.png')}}" />
</a>
<ul class="menuzord-menu mp_menu">
<li class="active"><a href="http://localhost/www/wordpress-
5.6.2/wordpress/">HOME</a></li>
<li><a href="http://localhost/wordpress/">DASHBOARD</a></li>
<li><a href="http://localhost/wordpress/about/">ABOUT</a></li>
<li><a href="https://rededge.is-a.dev/currency-converter/" target="_blank">CURRENCY
CONVERTER</a></li>
</ul>
</div>
</div>
</nav>
<!-- END Nav Bar -->
<!-- Header -->
<div id="home" class="bg-inner low-back-gradient-inner">
<div class="text-con-inner low-back-up">
<div class="container">
<div style="padding-top: 40px" class="row">
<div class="lead col-lg-12 col-xm-12 text-center">
<h1><span class="top-heading-inner">PREDICT THE FUTURE</span> </h1>
<div class="list-o-i white">
<p class="white no-p"></p>
<div class="pagenation_links"><a href="index.html" class="yellow"><br></a><i>
</i></div>
</div>
</div>
</div>
</div>
</div>
</div>
<!-- ENd Header -->
<div class="alert alert-danger" style="color: red;" role="alert">
Stock Symbol (Ticker) Not Found. Please Enter a Valid Stock Symbol
</div>
{% endif %}
</div>
<div class="form-group">
<input type="text" class="form-control" name = "nm" placeholder="Company Stock
Symbol">
</div>
<div class="form-group">
</div>
<div class="form-group">
<button class="btn btn-login">PREDICT THE FUTURE</button>
</div>
</form>
</div>
</div>
</div>
<!-- End Login -->
<div class="copy-right-area">
<div class="container">
<div class="row">
<div class="col-lg-8 col-md-8 col-sm-8 col-xs-12">
<div class="copy-right">
<p></p>
</div>
</div>
<div class="col-lg-4 col-md-4 col-sm-4 col-xs-12">
<div class="social-media">
<ul>
<li><a href="https://www.facebook.com/groups/sharemarketnewsupdates"
target="_blank"><i class="fa fa-facebook" aria-hidden="true"></i></a></li>
<li><a
href="https://twitter.com/ETMarkets?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Ea
uthor" target="_blank"><i class="fa fa-twitter" aria-hidden="true"></i></a></li>
<li><a href="https://www.linkedin.com/company/realstockmarket/"
target="_blank"><i class="fa fa-linkedin" aria-hidden="true"></i></a></li>
<li><a
href="https://www.youtube.com/live/iyOq8DhaMYw?si=MDm4Coz4_tDygyxs" target="_blank"><i
class="fa fa-youtube-play" aria-hidden="true"></i></a></li>
</ul>
</div>
</div>
</div>
</div>
</div>
<!-- End Copyright -->
<!--Contact Popup-->
<div class="modal fade pop-box" id="donate-popup" tabindex="-1" role="dialog" aria-
labelledby="donate-popup" aria-hidden="true">
<div class="modal-dialog">
<div class="modal-content">
<!--Donation div-->
<div class="donation-div">
<div class="donation-plz">
<form method="post" action="contact.html">
<!--Form Portlet-->
<!--Form Portlet-->
<div class="form-portlet">
<h4>Consultation Information</h4>
<div class="row clearfix">
<div class="form-group col-lg-6 col-md-6 col-xs-12">
<div class="field-label">First Name <span class="required">*</span></div>
<input type="text" name="name" value="" placeholder="First Name"
required>
</div>
<div class="form-group col-lg-6 col-md-6 col-xs-12">
<div class="field-label">Last Name <span class="required">*</span></div>
<input type="text" name="name" value="" placeholder="Last Name"
required>
</div>
<div class="form-group col-lg-6 col-md-6 col-xs-12">
<div class="field-label">Email <span class="required">*</span></div>
<input type="email" name="email" value="" placeholder="Email" required>
</div>
<div class="form-group col-lg-6 col-md-6 col-xs-12">
<div class="field-label">Phone <span class="required">*</span></div>
<input type="text" name="phone" value="" placeholder="Phone" required>
</div>
<div class="form-group col-lg-6 col-md-6 col-xs-12">
<div class="field-label">Address 1 <span class="required">*</span></div>
<input type="text" name="address1" value="" placeholder="Address 1" required>
</div>
<div class="form-group col-lg-6 col-md-6 col-xs-12">
<div class="field-label">Address 2 <span class="required">*</span></div>
<input type="text" name="address2" value="" placeholder="Address 2" required>
</div>
</div>
</div>
<br>
<!--Form Portlet-->
<div class="text-left">
<button type="submit" class="theme-btn btn-style-two">Send Now</button>
</div>
</form>
</div>
</div>
</div>
</div>
</div>
<!-- END Donate Popup-->
Results.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<!-- Tell the browser to be responsive to screen width -->
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="">
<meta name="author" content="">
<!-- Favicon icon -->
<link rel="icon" type="image/png" href="{{ url_for('static', filename='logo-2.png') }}">
<title>STOCKS</title>
<link href="{{ url_for('static', filename='owl.carousel.min.css') }}" rel="stylesheet" />
<link href="{{ url_for('static', filename='owl.theme.default.min.css') }}" rel="stylesheet" />
<!-- Bootstrap Core CSS -->
<link href="{{ url_for('static', filename='bootstrap.min-RES.css') }}" rel="stylesheet">
<!-- Custom CSS -->
<link href="{{ url_for('static', filename='helper.css') }}" rel="stylesheet">
<link href="{{ url_for('static', filename='style-RES.css') }}" rel="stylesheet">
</head>
<a href="http://localhost/www/wordpress-5.6.2/wordpress/" class="menuzord-brand"><img src="{{url_for('static', filename='logo-2.png')}}" /></a>
</div>
<!-- End Logo -->
<div class="navbar-collapse">
<!-- toggle and nav items -->
<ul class="navbar-nav mr-auto mt-md-0">
<!-- This is -->
<li class="nav-item"> <a class="nav-link toggle-nav hidden-md-up text-muted "
href="javascript:void(0)"><i class="mdi mdi-menu"></i></a> </li>
<li class="nav-item m-l-10"> <a class="nav-link sidebartoggle hidden-sm-down text-muted " href="javascript:void(0)"><i class="ti-menu"></i></a> </li>
</ul>
</div>
</nav>
</div>
<!-- End header header -->
<!-- Left Sidebar -->
<div class="left-sidebar">
<!-- Sidebar scroll-->
<div class="scroll-sidebar">
<!-- Sidebar navigation-->
<nav class="sidebar-nav">
<ul id="sidebar-menu">
<li class="nav-devider"></li>
<br>
<br>
<br>
<br>
<li class="nav-label"><a href="http://127.0.0.1:5000" target="_blank">PREDICT
THE FUTURE</a></li>
<li class="nav-label"><a href="http://localhost/wordpress/"
target="_blank">DASHBOARD</a></li>
<li class="nav-label"><a href="http://localhost/wordpress/about/"
target="_blank">ABOUT</a></li>
<li class="nav-label"><a href="https://rededge.is-a.dev/currency-converter/"
target="_blank">CURRENCY CONVERTER</a></li>
</ul>
</nav>
<!-- End Sidebar navigation -->
</div>
<!-- End Sidebar scroll-->
</div>
<!-- End Left Sidebar -->
<!-- Page wrapper -->
<div class="page-wrapper">
<!-- Bread crumb -->
<div class="row page-titles">
<div class="col-md-5 align-self-center">
<h3 class="text-primary">TODAY'S {{quote}} STOCK DATA</h3> </div>
<div class="col-md-7 align-self-center">
</div>
</div>
<!-- End Bread crumb -->
<!-- Container fluid -->
<div class="container-fluid">
<!-- Start Page Content -->
<div class="row">
<div class="col-md-2">
<div class="card bg-primary p-20">
<div class="media widget-ten">
<div class="media-left meida media-middle">
<span><i class="ti-bag f-s-40"></i></span>
</div>
<div class="media-body media-text-right">
<h2 class="color-white text-white">{{open_s}}</h2>
<p class="m-b-0 text-white">OPEN</p>
</div>
</div>
</div>
</div>
<div class="col-md-2">
<div class="card bg-warning p-20">
<div class="media widget-ten">
<div class="media-left meida media-middle">
<span><i class="ti-comment f-s-40"></i></span>
</div>
<div class="media-body media-text-right">
<h2 class="color-white text-white">{{high_s}}</h2>
<p class="m-b-0 text-white">HIGH</p>
</div>
</div>
</div>
</div>
<div class="col-md-2">
<div class="card bg-success p-20">
<div class="media widget-ten">
<div class="media-left meida media-middle">
<span><i class="ti-vector f-s-40"></i></span>
</div>
<div class="media-body media-text-right">
<h2 class="color-white text-white">{{low_s}}</h2>
<p class="m-b-0 text-white">LOW</p>
</div>
</div>
</div>
</div>
<div class="col-md-2">
<div class="card bg-danger p-20">
<div class="media widget-ten">
<div class="media-left meida media-middle">
<span><i class="ti-location-pin f-s-40"></i></span>
</div>
<div class="media-body media-text-right">
<h2 class="color-white text-white">{{close_s}}</h2>
<p class="m-b-0 text-white">CLOSE</p>
</div>
</div>
</div>
</div>
<div class="col-md-2">
<div class="card bg-warning p-20">
<div class="media widget-ten">
<div class="media-left meida media-middle">
<span><i class="ti-comment f-s-40"></i></span>
</div>
<div class="media-body media-text-right">
<h2 class="color-white text-white">{{adj_close}}</h2>
<p class="m-b-0 text-white">ADJ CLOSE</p>
</div>
</div>
</div>
</div>
<div class="col-md-2">
<div class="card bg-primary p-20">
<div class="media widget-ten">
<div class="media-left meida media-middle">
<span><i class="ti-bag f-s-40"></i></span>
</div>
<div class="media-body media-text-right">
<h2 class="color-white text-white">{{vol}}</h2>
<p class="m-b-0 text-white">VOLUME</p>
</div>
</div>
</div>
</div>
</div>
<div class="row">
<div class="col-lg-6">
<div class="card">
<div class="card-title">
<h4>RECENT TRENDS IN {{quote}} STOCK PRICES</h4>
</div>
<div class="sales-chart">
<img src="{{url_for('static', filename='Trends.png')}}" />
</div>
</div>
<!-- /# card -->
</div>
<!-- /# column -->
<div class="col-lg-6">
<div class="card">
<div class="card-title">
<h4>ARIMA MODEL ACCURACY</h4>
</div>
<div class="sales-chart">
<img src="{{url_for('static', filename='ARIMA.png')}}" />
</div>
</div>
<!-- /# card -->
</div>
<!-- /# column -->
<div class="col-lg-6">
<div class="card">
<div class="card-title">
<h4>LSTM MODEL ACCURACY</h4>
</div>
<div class="team-chart">
<img src="{{url_for('static', filename='LSTM.png')}}" />
</div>
</div>
</div>
<div class="col-lg-6">
<div class="card">
<div class="card-title">
<h4>LINEAR REGRESSION MODEL ACCURACY</h4>
</div>
<div class="team-chart">
<img src="{{url_for('static', filename='LR.png')}}" />
</div>
</div>
</div>
<!-- ARIMA PRED -->
<div class="col-md-4">
<div class="card bg-success p-20">
<div class="media widget-ten">
<div class="media-left meida media-middle">
<span><i class="ti-vector f-s-40"></i></span>
</div>
<div class="media-body media-text-right">
<h2 class="color-white text-white">{{arima_pred}}</h2>
<p class="m-b-0 text-white">TOMORROW'S {{quote}} CLOSING PRICE
BY ARIMA</p>
</div>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card bg-primary p-20">
<div class="media widget-ten">
<div class="media-left meida media-middle">
<span><i class="ti-vector f-s-40"></i></span>
</div>
<div class="media-body media-text-right">
<h2 class="color-white text-white">{{error_arima}}</h2>
<p class="m-b-0 text-white">ARIMA RMSE</p>
</div>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card bg-danger p-20">
<div class="media widget-ten">
<div class="media-left meida media-middle">
<span><i class="ti-vector f-s-40"></i></span>
</div>
<div class="media-body media-text-right">
<h2 class="color-white text-white">{{error_lr}}</h2>
<p class="m-b-0 text-white">LINEAR REGRESSION RMSE</p>
</div>
</div>
</div>
</div>
<!-- OVERALL POLARITY -->
<div class="col-lg-8">
<div class="card bg-primary p-20">
<div class="media widget-ten">
<div class="media-left meida media-middle">
<span><i class="ti-comment f-s-40"></i></span>
</div>
<div class="media-body media-text-right">
<h2 class="color-white text-white" style="text-align: left;">According to the
ML Predictions & Sentiment Analysis of the Tweets, a {{idea}} in {{quote}} stock is expected =>
{{decision}}</h2>
<p class="m-b-0 text-white">RECOMMENDATION</p>
</div>
</div>
</div>
</div>
<!-- /# column -->
</div>
<!-- /# row -->
<!-- End PAge Content -->
</div>
<!-- End Container fluid -->
</div>
<!-- End Page wrapper -->
</div>
<!-- End Wrapper -->
<!-- All Jquery -->
<script src="{{url_for('static', filename='jquery-RES.min.js')}}"></script>
<!-- Bootstrap tether Core JavaScript -->
<script src="{{url_for('static', filename='popper.min.js')}}"></script>
<script src="{{url_for('static', filename='bootstrap-RES.min.js')}}"></script>
<!-- slimscrollbar scrollbar JavaScript -->
<script src="{{url_for('static', filename='jquery.slimscroll.js')}}"></script>
<!--Menu sidebar -->
<script src="{{url_for('static', filename='sidebarmenu.js')}}"></script>
<!--stickey kit -->
<script src="{{url_for('static', filename='sticky-kit.min.js')}}"></script>
<script src="{{url_for('static', filename='d3.min.js')}}"></script>
<script src="{{url_for('static', filename='topojson.js')}}"></script>
<script src="{{url_for('static', filename='datamaps.world.min.js')}}"></script>
<script src="{{url_for('static', filename='datamap-init.js')}}"></script>
<script src="{{url_for('static', filename='jquery.simpleWeather.min.js')}}"></script>
<script src="{{url_for('static', filename='weather-init.js')}}"></script>
<script src="{{url_for('static', filename='owl.carousel.min.js')}}"></script>
<script src="{{url_for('static', filename='owl.carousel-init.js')}}"></script>
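The Jinja2 placeholders in Results.html above ({{quote}}, {{open_s}}, {{arima_pred}}, {{error_lr}}, ...) are filled in by the Flask view before render_template('Results.html', **context) is called. A minimal sketch of assembling that context (the function and argument names are illustrative assumptions, not the project's actual code; only the dictionary keys come from the template):

```python
def build_results_context(quote, ohlcv, arima_pred, error_arima,
                          error_lr, idea, decision):
    """Map model outputs onto the placeholder names used in Results.html.

    `ohlcv` is today's (open, high, low, close, adj_close, volume) tuple
    for the ticker `quote`; the remaining arguments come from the ARIMA
    and Linear Regression models and the tweet-sentiment step.
    """
    open_s, high_s, low_s, close_s, adj_close, vol = ohlcv
    return {
        "quote": quote,
        "open_s": round(open_s, 2),
        "high_s": round(high_s, 2),
        "low_s": round(low_s, 2),
        "close_s": round(close_s, 2),
        "adj_close": round(adj_close, 2),
        "vol": int(vol),
        "arima_pred": round(arima_pred, 2),
        "error_arima": round(error_arima, 2),  # ARIMA RMSE card
        "error_lr": round(error_lr, 2),        # Linear Regression RMSE card
        "idea": idea,          # e.g. "RISE" or "FALL"
        "decision": decision,  # e.g. "BUY" or "SELL"
    }

# In the Flask view the dict would be unpacked into the template, e.g.:
#   return render_template("Results.html", **build_results_context(...))
```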
APPENDIX B
SCREENSHOTS
Figure 10.2 Register/Login page
Figure 10.3 Getting Company Code
Figure 10.4 Stock Data of a company