A Hybrid Optimization Algorithm Using BiLSTM Structure F - 2023 - Measurement S
Measurement: Sensors
journal homepage: www.sciencedirect.com/journal/measurement-sensors
ARTICLE INFO

Keywords:
Sentiment analysis
Product reviews
Taylor series
Harris hawks optimization
RNN-BiLSTM

ABSTRACT

Sentiment analysis can assist consumers in providing clear and objective sentiment recommendations based on large amounts of data, and it is helpful in overcoming unclear human flaws in subjective assessments. Existing sentiment analysis methods, on the other hand, must be enhanced in terms of robustness and accuracy. To improve marketing strategies based on product reviews, a reliable mechanism for forecasting sentiment polarity should be implemented. This paper proposes a new approach for sentiment analysis called Taylor–Harris Hawks Optimization driven long short-term memory (THHO-BiLSTM). By incorporating Taylor series in HHO, Taylor–HHO is formed, which aids in improving the BiLSTM classifier’s performance by picking optimal weights in the hidden layers. The proposed method was evaluated using Amazon product reviews and reviews from the Taboada corpus benchmark datasets, yielding findings with 96.93% and 93% accuracy, respectively. When compared to existing approaches, the suggested model exceeds them in terms of accuracy. The proposed approach helps manufacturers improve their products based on user feedback.
* Corresponding author.
E-mail addresses: sangiprathap@gmail.com (J. Sangeetha), kumaran.u@rediffmail.com (U. Kumaran).
https://doi.org/10.1016/j.measen.2022.100619
Received 8 September 2022; Received in revised form 5 November 2022; Accepted 28 November 2022
Available online 12 December 2022
2665-9174/© 2022 Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
J. Sangeetha and U. Kumaran Measurement: Sensors 25 (2023) 100619
features in standard machine learning methods like Naive Bayes, SVM, and others: the phrases or words that strongly represent the viewpoint as negative or positive. However, poor accuracy and a high error rate are considered the major disadvantages of the existing mechanisms. DL, on the other hand, has the potential to solve several issues that SA faces. Deep learning is the application of multiple-layer artificial neural networks (NNs) to learning problems. It benefits from the learning capabilities of neural networks, which were previously believed to be restricted to a single layer or a small amount of data. The topology of neural networks is modeled after that of the biological brain, and they are composed of a vast number of interconnected layers of information processing units (called neurons). By changing the weights of the connections between neurons, a network can learn to perform tasks (such as classification), much like a biological brain. The main contributions of the paper are:

• THHO is obtained by combining HHO with the Taylor series, which improves the LSTM classifier’s performance by determining ideal hidden layer weights.
• Taylor–HHO is used to train the BiLSTM classifier to enhance the sentiment classification.
• The accuracy of the suggested technique is increased by processing input phrase sequences concurrently across a multi-head attention layer with fine-grained embedding (GloVe) and evaluating the results with varying dropout rates. The data from the two deep multi-layers are then combined and sent as input to the BiLSTM layer.

The rest of the paper is organized as follows: Section 2 reviews related works, Section 3 presents the proposed approach, Section 4 contains the experimental results, and Section 5 concludes the paper.

2. Related works

The model suggested in this paper is based on previous SA and DL research. Although numerous studies have been conducted on sentiment analysis, many of them have relied on classic ML classifiers such as SVM, Naive Bayes, and others. In sentiment analysis, there are primarily three levels of approaches: sentence level, word level, and document level. Word-level sentiment analysis investigates the placement of the review’s words and their effect on the evaluation. It consults a dictionary to obtain each word’s polarity and determines the total polarity of the review to produce the result. Sentence-level sentiment analysis examines the general polarity of a sentence to determine whether it has a positive or negative sentiment. At the document level, the entire document is treated as expressing a single opinion (i.e., negative or positive).

Cheng et al. [6] proposed a component-focusing multi-head co-attention network with three modules: multi-head co-attention, extended context, and component focusing. The average pooling issue, which treats each word as having equal importance, was resolved by the component-focusing module by increasing the weighting of adjectives and adverbs. The multi-head co-attention network was used to train the network to identify the most important words in a multi-word target before obtaining a context representation and applying the attention mechanism to the sequence data.

Sivakumar et al. [7] suggested an intelligent system that uses fuzzy logic and LSTM to classify customer review statements into four categories: highly positive, positive, negative, and highly negative. This model was tested using Amazon mobile phone reviews, Amazon video games reviews, and Amazon goods customer reviews benchmark data sets, yielding accuracy of 96.03%, 83.82%, and 90.92%, respectively.

Rao et al. [8] concentrated on extracting sarcasm from text. Sentiment analysis is a technique for examining a text’s point of view. Sarcasm, a harsh style of delivering information, can also be found in the text. The initial job is to choose a dataset. Amazon datasets were used to obtain the data. Following that, data preprocessing was performed, followed by feature extraction, which included term frequency, inverse document frequency, and n-grams. Then, ML methods including KNN, SVM, and Random Forest were employed to classify the data.

Al-Sharuee et al. [9] proposed employing segregated window clustering (SWC) and window sequential clustering (WSC) to perform a chronological sentiment analysis. SWC was founded purely on the temporal feature of evaluations, whereas WSC was a dynamic analysis. The ACAEC learning algorithm serves as the basis for WSC and SWC. To improve WSC’s performance, ACAEC’s ensemble technique was extended with a supplementary weight scheme and an additional learner. New sets of reviews were used for this investigation, covering four airlines and one Australian real estate agency. Experiments revealed that SWC and WSC accuracy rates are 87.54% and 83.87%, respectively.

Mukherjee et al. [10] proposed a new end-to-end SA approach for dealing with negations, which included negation detection and scope marking. This method used a modified negation marking system for explicit negation detection on Amazon reviews, mainly of cell phones, and conducted sentiment analysis tests using several ML algorithms such as Naive Bayes, ANN, Support Vector Machines, and RNN. The RNN attained the highest accuracy of 95.67% while analyzing the influence of the negation method on SA tasks.

Geetha et al. [11] advocated using sentiment analysis to categorize positive and negative attitudes in consumer review data. For this sentiment analysis task, a strong deep learning model called BERT Base Uncased was introduced. With good prediction and high accuracy, the BERT model outperformed the other ML methods. Fang et al. [12] utilized an SVM to build a framework for word-level sentiment analysis, achieving 82.85% and 86.35% accuracy with the improved SVM on a consumer product review (CPR) dataset. For sentiment analysis, features were retrieved by means of the TF-IDF approach and a sentiment lexicon was created. The model’s performance did not improve when the vector dimension was increased beyond 400.

3. Proposed THHO-BiLSTM model

This research proposes a new method for sentiment analysis called Taylor–Harris Hawks Optimization driven long short-term memory (THHO-BiLSTM). The main steps in the proposed model are: (i) the text is pre-processed; (ii) Taylor–HHO is created by incorporating the Taylor series into HHO, which improves the BiLSTM classifier’s performance by picking appropriate weights in the hidden layers.

3.1. Data preprocessing

Pre-processing [13] looks at the opinions from a syntactic perspective while keeping the phrase’s original syntax. To decrease noise and facilitate feature extraction, several procedures such as stemming, POS tagging, and stop word removal are applied to the data set in this step.

Tokenization: It separates the text of a document into a sequence of tokens. All non-letter characters are used as separation points. As a result, tokens are just one word long (unigrams).

Stemming: Stemming is a key morphological step in the pre-processing module during feature extraction. All inflected words in the text are transformed into a root form known as a stem during the stemming process. The stem ’automat’, for example, is the root of the words ’automatic’, ’automate’, and ’automation’. Stemming is more efficient when precision isn’t a major consideration.

Token filtering: A length-based filtration strategy was used to decrease the generated token set. The tokens were filtered using minimum and maximum length settings; these parameters define the token selection range.

Stop Words: Words like "a," "the," "of," "and," and "an" are frequently employed. Stop-word deletion can be accomplished in a variety of ways, and it increases the feature extraction algorithm’s performance. The removal of stop words reduces the data sets’ dimensionality,
making it easier for automatic feature extraction algorithms to identify the essential phrases that remain in the review corpus. The words to be removed are chosen from a readily available list of stop words.

3.2. Taylor-HHO for optimal selection

The suggested Taylor–HHO is used to train the RNN-BiLSTM, which performs the sentiment classification (see Fig. 1). The Taylor series is an infinite series expansion of a function, including functions of complex variables. The Taylor series is used to predict the linear component of the equation and to describe the previously stored data. The advantage of the Taylor series is that it is a simpler and easier method of computing solutions, even when complex functions are present. The Taylor series has a number of advantages, including accurate assessment of mutual functions and ease of convergence.

HHO [14], on the other hand, is inspired by the cooperative behavior and chasing style of Harris’ hawks. This method is capable of handling the intricacies of the search space and is effective in tackling a variety of optimization problems. It also solves Optimal Power Flow (OPF) problems, both single and multi-objective. Furthermore, it minimizes the objective function, allowing HHO to handle nonlinear, nonconvex, constrained optimization problems. The Taylor–HHO steps are as follows:

3.2.1. Initialization

The first step is to initialize the solution, which is modeled as

$$P = \{P_1, P_2, \ldots, P_i, \ldots, P_v\} \tag{1}$$

where $P_i$ denotes the ith solution and $v$ denotes the total number of solutions.

3.2.2. Error determination

The best solution is determined from the error and is stated as a minimization problem; consequently, the solution with the least mean squared error (MSE) is chosen as the best answer, and the MSE is stated as

$$MS_{err} = \frac{1}{s} \sum_{o=1}^{s} \left[\xi_o - \xi_o^{*}\right]^2 \tag{2}$$

where $\xi_o$ denotes the expected output, $\xi_o^{*}$ denotes the predicted output, and $s$ denotes the sample data count, so that $1 \le o \le s$.

3.2.3. Determining the update equation

The HHO algorithm’s selection strategy aids in progressively updating the position to reach an optimal location. Furthermore, Harris’ hawks update their positions to surround the expected prey. The current locations, in this case, update the solution space and are expressed as

$$P(u+1) = \Delta P(u) - E\,\left|R\,P_{rab}(u) - P(u)\right| \tag{3}$$

where $\Delta P(u)$ is the difference between the rabbit’s position vector and the current location in iteration $u$, $R$ is the rabbit’s random jump strength, $P_{rab}(u)$ denotes the rabbit’s position in iteration $u$, and $E$ denotes the prey’s escaping energy. Substituting $\Delta P(u) = P_{rab}(u) - P(u)$,

$$P(u+1) = P_{rab}(u) - P(u) - E\,\left|R\,P_{rab}(u) - P(u)\right| \tag{4}$$

Taking $P_{rab}(u)$ as positive, so that the absolute value can be dropped,

$$P(u+1) = P_{rab}(u) - P(u) - E R\,P_{rab}(u) + E\,P(u) \tag{5}$$

$$P(u+1) = P_{rab}(u)\,[1 - ER] + P(u)\,[E - 1] \tag{6}$$

The update equation, according to the Taylor series, is supplied by the current and previous observations and is represented as

$$P(u+1) = P(u) + \frac{P'(u)}{1!} + \frac{P''(u)}{2!} \tag{7}$$

$$P'(u) = \frac{P(u) - P(u-n)}{n} \tag{8}$$

$$P''(u) = \frac{P(u) - 2P(u-n) + P(u-2n)}{n^2} \tag{9}$$

Consider $n = 1$ and substitute $P'(u)$ and $P''(u)$,

$$P(u+1) = P(u) + \frac{P(u) - P(u-1)}{1!} + \frac{P(u) - 2P(u-1) + P(u-2)}{2!} \tag{10}$$

$$P(u+1) = P(u)\left[1 + 1 + \frac{1}{2}\right] - P(u-1) - \frac{2P(u-1)}{2} + \frac{P(u-2)}{2} \tag{11}$$

$$P(u+1) = \frac{5}{2}P(u) - 2P(u-1) + \frac{P(u-2)}{2} \tag{12}$$

$$P(u) = \frac{2}{5}\left[P(u+1) + 2P(u-1) - \frac{P(u-2)}{2}\right] \tag{13}$$

Substituting Eqn (13) into Eqn (6),
$$P(u+1) = P_{rab}(u)\,[1 - ER] + (E-1)\,\frac{2}{5}\left[P(u+1) + 2P(u-1) - \frac{P(u-2)}{2}\right] \tag{14}$$

$$P(u+1) = P_{rab}(u)\,[1 - ER] + \frac{2}{5}(E-1)\,P(u+1) + \frac{2}{5}(E-1)\left[2P(u-1) - \frac{P(u-2)}{2}\right] \tag{15}$$

$$P(u+1) - \frac{2}{5}(E-1)\,P(u+1) = P_{rab}(u)\,[1 - ER] + \frac{2}{5}(E-1)\left[2P(u-1) - \frac{P(u-2)}{2}\right] \tag{16}$$

$$P(u+1)\left[1 - \frac{2}{5}(E-1)\right] = P_{rab}(u)\,[1 - ER] + \frac{2}{5}(E-1)\left[2P(u-1) - \frac{P(u-2)}{2}\right] \tag{17}$$

$$P(u+1)\left[\frac{5 - 2(E-1)}{5}\right] = P_{rab}(u)\,[1 - ER] + \frac{2}{5}(E-1)\left[2P(u-1) - \frac{P(u-2)}{2}\right] \tag{18}$$

The final equation that results from combining the Taylor series with HHO is known as the update equation of Taylor-HHO.

$$P(u+1) = \frac{5}{5 - 2(E-1)}\left[P_{rab}(u)\,[1 - ER] + \frac{2}{5}(E-1)\left[2P(u-1) - \frac{P(u-2)}{2}\right]\right] \tag{19}$$

3.2.4. Find the optimal solution using error

The error of each individual solution is evaluated following the update. The optimal solution is the one with the smallest error.

3.2.5. Terminate

The steps are repeated until the maximum number of iterations is reached.

3.3. Classification of sentiments using RNN-BiLSTM

its cell state is controlled by gate structures. The LSTM unit additionally has a hidden state that records the observed sequence’s history.

LSTM differs from traditional neural networks in that each LSTM unit has a unique structure. One memory cell $M_t$ and three gates, comprising the forget gate $o_t$, the input gate $g_t$, and the output gate $r_t$, make up the unit. The three gates work together to manage the state of the memory cell $M_t$. The forget gate determines whether or not to keep historical unit data. The output gate regulates the unit’s output, while the input gate regulates how much new data is fed into the unit via the inputs. LSTM’s forward calculation at time $t$ can be expressed mathematically as follows.

$$o_t = \sigma(Y_o k_{t-1} + Q_o y_t + c_o) \tag{20}$$

$$g_t = \sigma(Y_g k_{t-1} + Q_g y_t + c_g) \tag{21}$$

$$M_t = o_t \odot M_{t-1} + g_t \odot \tanh(Y_h k_{t-1} + Q_h y_t + c_h) \tag{22}$$

$$r_t = \sigma(Y_r k_{t-1} + Q_r y_t + c_r) \tag{23}$$

$$k_t = r_t \odot \tanh(M_t) \tag{24}$$

where $k_t$ is the output of the LSTM unit at time $t$, $\sigma$ is the sigmoid function, $\tanh(\cdot)$ denotes the activation function, $\odot$ is the Hadamard product, $Y$ and $Q$ are weight matrices, and $c$ is the bias vector.

The LSTM takes into account only the sequence’s prior information, which is frequently insufficient. If future information could be accessed in the same way as past information, it would be highly beneficial for sequence tasks. Two LSTM layers, one moving forward and one moving backward, make up the bidirectional LSTM. The guiding idea is as follows: the forward layer records information from the sequence’s past, while the backward layer records information from the sequence’s future. Both layers are connected to the same output layer. The key feature of this structure is that the sequence’s context information is fully considered. Assume that the word embedding $y_t$ is the input at time $t$, that the forward hidden unit’s output at time $t-1$ is $\overrightarrow{k}_{t-1}$, and that the output of the backward hidden unit is $\overleftarrow{k}_{t+1}$. The outputs of the hidden units at time $t$ are then given as follows.

$$\overrightarrow{k}_t = L\left(y_t, \overrightarrow{k}_{t-1}, M_{t-1}\right) \tag{25}$$
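The forward calculation in Eqs. (20) to (24) and the two-direction arrangement described above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the parameter dictionary layout, the helper `constant_params`, the zero initial states, and the choice to concatenate the forward and backward hidden states at each time step are all assumptions made for the example.

```python
import math

def sigmoid(x):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def affine(Y, k_prev, Q, y_t, c):
    """Return Y k_{t-1} + Q y_t + c as a list."""
    return [a + b + bias
            for a, b, bias in zip(matvec(Y, k_prev), matvec(Q, y_t), c)]

def lstm_step(y_t, k_prev, M_prev, p):
    """One forward step; p maps names such as 'Yo', 'Qo', 'co' to weights."""
    o_t = [sigmoid(z) for z in affine(p["Yo"], k_prev, p["Qo"], y_t, p["co"])]    # Eq. (20), forget gate
    g_t = [sigmoid(z) for z in affine(p["Yg"], k_prev, p["Qg"], y_t, p["cg"])]    # Eq. (21), input gate
    h_t = [math.tanh(z) for z in affine(p["Yh"], k_prev, p["Qh"], y_t, p["ch"])]  # tanh term of Eq. (22)
    M_t = [o * m + g * h for o, m, g, h in zip(o_t, M_prev, g_t, h_t)]            # Eq. (22), memory cell
    r_t = [sigmoid(z) for z in affine(p["Yr"], k_prev, p["Qr"], y_t, p["cr"])]    # Eq. (23), output gate
    k_t = [r * math.tanh(m) for r, m in zip(r_t, M_t)]                            # Eq. (24), unit output
    return k_t, M_t

def run_direction(sequence, p, hidden_size):
    """Run the cell over a sequence from zero initial states (an assumption)."""
    k, M = [0.0] * hidden_size, [0.0] * hidden_size
    states = []
    for y_t in sequence:
        k, M = lstm_step(y_t, k, M, p)
        states.append(k)
    return states

def bilstm(sequence, fwd, bwd, hidden_size):
    """Forward and backward passes; per-step concatenation is an assumption."""
    forward = run_direction(sequence, fwd, hidden_size)
    backward = run_direction(sequence[::-1], bwd, hidden_size)[::-1]
    return [f + b for f, b in zip(forward, backward)]

def constant_params(hidden_size, input_size, weight=0.0, bias_h=0.0):
    """Hypothetical helper: one constant in every weight, zero gate biases."""
    p = {}
    for gate in "ogrh":
        p["Y" + gate] = [[weight] * hidden_size for _ in range(hidden_size)]
        p["Q" + gate] = [[weight] * input_size for _ in range(hidden_size)]
        p["c" + gate] = [0.0] * hidden_size
    p["ch"] = [bias_h] * hidden_size
    return p

if __name__ == "__main__":
    seq = [[1.0, 0.5], [0.2, -0.3], [0.0, 0.9]]  # three toy 2-dimensional embeddings
    params = constant_params(hidden_size=2, input_size=2, weight=0.1)
    for vec in bilstm(seq, params, params, hidden_size=2):
        print([round(v, 4) for v in vec])
```

In the proposed model the weights would be selected by Taylor–HHO rather than set to a constant; the constant initialization here only keeps the example deterministic.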
activations of the preceding layer b. The following characteristics describe the dense layer:

$$d = b(x \cdot w) + a \tag{28}$$

where $b$ is the element-wise activation function, $w$ is the weights matrix, and $a$ is the bias vector for the layer.

3.3.3. Softmax classifier

For sentiment prediction, the generated vector is sent directly to the Softmax layer. The following is the outcome of the prediction:

$$F\text{-}Score = 2 \cdot \frac{precision \cdot recall}{precision + recall} \tag{33}$$

$$sensitivity = \frac{TP}{TP + FN} \tag{34}$$

$$Specificity = \frac{TN}{TN + FP} \tag{35}$$

The given data has four possible outcomes: false negative (FN), true negative (TN), false positive (FP), and true positive (TP). True positive data is labeled positive and classified as positive, whereas false negative data is labeled positive and classified as negative. True negative data is labeled negative and classified as negative, whereas false positive data is labeled negative and classified as positive.

The performance of the proposed THHO-BiLSTM model was assessed with publicly available benchmark datasets and standard numerical measures. The findings of the experiments were compared to those of well-known existing approaches (RNN-LSTM, SVM, NB, RF, and DT). The cross-validation method was employed for three-class classification, and Fig. 3 displays the confusion matrix of the proposed model. Three classes are used to summarize and decompose the number of accurate and incorrect classifications. The overall accuracy attained by the proposed model is 96.58% for classification.

Table 1
Amazon reviews format.
Product    Product id: B0005645GFHDY
           Title: Air conditioner for sale
           Price: Unknown
Reviews    Userid: D4365878HJFHGHU9
           Profile name: Kaya
           Helpfulness: 10
           Score: 4.2
           Summary: Service and quality are nice
           Text: Delivery was very prompt. Very fresh air

Fig. 3. Confusion matrix for 3 class classification.
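As an illustration of Eqs. (33) to (35) in the three-class setting, the sketch below derives TP, FP, FN, and TN for one class from a confusion matrix in a one-vs-rest fashion and then evaluates the three measures. The matrix values are invented for the example and are not the paper's results; the row/column convention (rows are true labels, columns are predictions) and the choice to return 0.0 for an empty denominator are assumptions.

```python
def one_vs_rest_counts(cm, cls):
    """Return (TP, FP, FN, TN) for class index cls.

    Assumes cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    total = sum(sum(row) for row in cm)
    tp = cm[cls][cls]
    fn = sum(cm[cls]) - tp                  # true class cls, predicted as another class
    fp = sum(row[cls] for row in cm) - tp   # another class, predicted as cls
    tn = total - tp - fn - fp
    return tp, fp, fn, tn

def precision(tp, fp):
    """TP / (TP + FP); 0.0 when nothing was predicted positive (a convention)."""
    return tp / (tp + fp) if tp + fp else 0.0

def sensitivity(tp, fn):
    """Eq. (34), also called recall."""
    return tp / (tp + fn) if tp + fn else 0.0

def specificity(tn, fp):
    """Eq. (35)."""
    return tn / (tn + fp) if tn + fp else 0.0

def f_score(tp, fp, fn):
    """Eq. (33): harmonic mean of precision and recall."""
    p, r = precision(tp, fp), sensitivity(tp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0

if __name__ == "__main__":
    # Hypothetical 3-class confusion matrix (negative, neutral, positive).
    cm = [[50, 2, 3],
          [4, 40, 6],
          [1, 5, 60]]
    for cls in range(3):
        tp, fp, fn, tn = one_vs_rest_counts(cm, cls)
        print(cls, f_score(tp, fp, fn), sensitivity(tp, fn), specificity(tn, fp))
```

Averaging the per-class values (for example, a macro average) would give a single figure per metric; which averaging the authors used is not stated in the text.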
Fig. 4. Analysis of performance metrics for Kaggle.com.
Fig. 6. ROC curve.
Fig. 7. Accuracy and Loss curve.

Table 2
Performance comparison of existing Sentiment analysis techniques.
Authors Techniques Accuracy

5. Conclusion

The suggested approach for analyzing the sentiment of product evaluations by LSTM with THHO integrates Taylor series and HHO, which helps to improve the effectiveness of the LSTM by determining optimal weights in the hidden layers. The proposed approach was evaluated using Amazon product reviews and achieved an accuracy of 96.93%. When compared to current methods, the suggested

Data availability

Acknowledgements

The author, with a deep sense of gratitude, thanks the supervisor for his guidance and constant support rendered during this research.

References

[1] F. Tang, L. Fu, B. Yao, W. Xu, Aspect based fine-grained sentiment analysis for online reviews, Information Sciences 488 (2019) 190–204.
[2] H. Xia, Y. Yang, X. Pan, Z. Zhang, W. An, Sentiment analysis for online reviews using conditional random fields and support vector machines, Electron. Commer. Res. 20 (2) (2020) 343–360.
[3] M.Y.A. Salmony, A.R. Faridi, Supervised sentiment analysis on Amazon product reviews: a survey, in: 2021 2nd International Conference on Intelligent Engineering and Management (ICIEM), IEEE, 2021, April, pp. 132–138.
[4] J. Guerreiro, P. Rita, How to predict explicit recommendations in online reviews using text mining and sentiment analysis, J. Hospit. Tourism Manag. 43 (2020) 269–272.
[5] M.O. Aftab, U. Ahmad, S. Khalid, A. Saud, A. Hassan, M.S. Farooq, Sentiment analysis of customer for ecommerce by applying AI, in: 2021 International Conference on Innovative Computing (ICIC), IEEE, 2021, November, pp. 1–7.
[6] L.C. Cheng, Y.L. Chen, Y.Y. Liao, Aspect-based sentiment analysis with component focusing multi-head co-attention networks, Neurocomputing (2022).
[7] M. Sivakumar, S.R. Uyyala, Aspect-based sentiment analysis of mobile phone reviews using LSTM and fuzzy logic, Int. J. Data Sci. Anal. 12 (4) (2021) 355–367.
[8] M.V. Rao, C. Sindhu, Detection of sarcasm on Amazon product reviews using machine learning algorithms under sentiment analysis, in: 2021 Sixth International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), IEEE, 2021, March, pp. 196–199.
[9] M.T. Al-Sharuee, F. Liu, M. Pratama, Sentiment analysis: dynamic and temporal clustering of product reviews, Appl. Intell. 51 (1) (2021) 51–70.
[10] P. Mukherjee, Y. Badr, S. Doppalapudi, S.M. Srinivasan, R.S. Sangwan, R. Sharma, Effect of negation in sentences on sentiment analysis and polarity detection, Procedia Comput. Sci. 185 (2021) 370–379.
[11] M.P. Geetha, D.K. Renuka, Improving the performance of aspect based sentiment analysis using fine-tuned Bert Base Uncased model, Int. J. Intell. Netw. 2 (2021) 64–69.
[12] Y. Fang, H. Tan, J. Zhang, Multi-strategy sentiment analysis of consumer reviews based on semantic fuzziness, IEEE Access 6 (2018) 20625–20631.
[13] R.V. Karthik, S. Ganapathy, A fuzzy recommendation system for predicting the customers interests using sentiment analysis and ontology in e-commerce, Appl. Soft Comput. 108 (2021), 107396.
[14] A.A. Heidari, S. Mirjalili, H. Faris, I. Aljarah, M. Mafarja, H. Chen, Harris hawks optimization: algorithm and applications, Future Generat. Comput. Syst. 97 (2019) 849–872.
[15] P. Patel, D. Patel, C. Naik, Sentiment analysis on movie review using deep learning RNN method, in: Intelligent Data Engineering and Analytics, Springer, Singapore, 2021, pp. 155–163.
[16] N. Wedjdane, R. Khaled, K. Okba, Better decision making with sentiment analysis of Amazon reviews, in: 2021 International Conference on Information Systems and Advanced Technologies (ICISAT), IEEE, 2021, December, pp. 1–7.
[17] S. Al-Dabet, S. Tedmori, A.S. Mohammad, Enhancing Arabic aspect-based sentiment analysis using deep learning models, Comput. Speech Lang 69 (2021), 101224.
[18] S. Terra Vieira, R. Lopes Rosa, D. Zegarra Rodríguez, M. Arjona Ramírez, M. Saadi, L. Wuttisittikulkij, Q-meter: quality monitoring system for telecommunication services based on sentiment analysis using deep learning, Sensors 21 (5) (2021) 1880.
[19] P. Ray, A. Chakrabarti, A mixed approach of deep learning method and rule-based method to improve aspect level sentiment analysis, Appl. Comput. Inf. (2020).