
RESEARCH ARTICLE | MAY 10 2024

Sentiment analysis on examining students’ opinions regarding NDUM e-learning during COVID-19

Syahaneim Marzukhi; Muhammad Harith Mohd Nazrul Aman; Nor Fatimah Awang; Syed Nasir Alsagoff Syed Zakaria

AIP Conf. Proc. 3135, 020001 (2024)


https://doi.org/10.1063/5.0213916

 

12 May 2024 12:33:41


Sentiment Analysis on Examining Students’ Opinions
Regarding NDUM e-Learning During COVID-19

Syahaneim Marzukhi 1, a), Muhammad Harith Mohd Nazrul Aman 1, b), Nor Fatimah Awang 1, c) and
Syed Nasir Alsagoff Syed Zakaria 2, d)

1 Cyber Security & Revolution Technology Centre, National Defense University Malaysia, Sg Besi Camp,
57000 Kuala Lumpur, Malaysia
2 Computer Science Department, Faculty Science & Defense Technology, National Defense University Malaysia,
Sg Besi Camp, 57000 Kuala Lumpur, Malaysia

a) Corresponding author: syahaneim@upnm.edu.my
b) harith@gmail.com
c) norfatimah@upnm.edu.my
d) syednasir@upnm.edu.my

Abstract. This study examines students’ opinions about NDUM e-Learning during COVID-19, with the aim of improving
the NDUM e-Learning system using a Sentiment Analysis approach. First, surveys are distributed among the students
to obtain feedback on NDUM e-Learning, which provides the data for the data analytics process. The CRISP-DM
framework and an Opinion Mining architecture are used for this study. Next, Sentiment Analysis is applied to
classify the students’ feedback into “positive” and “negative” sentiment. Two machine learning algorithms (i.e.
Decision Tree and SVM) are then used to construct the classifier model. Finally, the results are displayed in
graphical form to illustrate the findings of the study. The results provide an overview of the level of
satisfaction, suitability, system improvement, suggestions and the overall impact of the use of NDUM e-Learning
during the pandemic.

INTRODUCTION
The advancement of information and communication technology (ICT) has contributed to the improvement of human
lifestyle. Nowadays, various tasks can be completed at one's fingertips through ICT. The COVID-19 pandemic has
affected not only the whole world but also people's lifestyle. Adaptation to the new norms has taken place over
the previous two years of the pandemic. In particular, the field of education has experienced major changes in the
teaching and learning process. During the COVID-19 pandemic, teaching and learning were conducted online, whereas
the learning process was previously face-to-face. Some universities implemented Massive Open Online Courses (MOOC)
as their online learning policy. Using MOOC, classes are conducted by the lecturers through videos, online
assessments, discussions and forums. However, students and lecturers found that the quality of learning in MOOC
differs from that of face-to-face classes, especially because it cannot replace laboratory work, field work,
practical classes and similar activities [5, 12].
Before the pandemic, National Defense University Malaysia (NDUM) e-Learning was introduced as part of the MOOC,
but it was not highlighted as the main learning medium. Therefore, there still remain some issues that need to be
improved. First, NDUM e-Learning services are not fully utilized as teaching and learning tools. Second, there is
no manual or guideline on the website on how to use the system or on the functionality of each module and feature.
Third, the user interface is not interactive and user-friendly. In addition, the students found that the online
learning process reduces their interest in learning. Consequently, this study is performed as an effort to improve
the system, in order to avoid the risks of decreasing user interest and under-utilization of the system. For this
study, the data is collected from a survey distributed to a target group of 105 students from the Computer Science
Department, NDUM. The study has three main objectives: 1) to collect students’ feedback regarding NDUM e-Learning
from the Computer Science students at NDUM, 2) to apply text analytic techniques to gather related data from the
students’ feedback, and 3) to determine the sentiment in the students’ feedback so that it can be described as
positive or negative.
Therefore, data science techniques, particularly text analytics, are applied. Machine learning algorithms (i.e.
Decision Tree and SVM) are used to search for knowledge and patterns in the data, and RapidMiner, a text analytics
software package, is used as the tool. Sentiment analysis is also applied in this text analytic process to
determine whether the sentiment in the data (i.e. the opinion of students at NDUM regarding online learning using
NDUM e-Learning) is positive or negative. The paper is structured as follows. Section 2 provides the background of
the research. Section 3 presents the methodology, Section 4 discusses findings and results, and Section 5 contains
concluding remarks.

The 1st International Conference on Advanced Computing, Systems, and Applications (InCASA) 2023
AIP Conf. Proc. 3135, 020001-1–020001-8; https://doi.org/10.1063/5.0213916
Published under an exclusive license by AIP Publishing. 978-0-7354-4947-3/$30.00

RESEARCH BACKGROUND

Text Analytics

Text analytics is driven by the need to process natural human language, which, unlike numeric or categorical
data, is unlikely to come in a structured format consisting of rows and columns [19]. Understanding unstructured
data, including text, is usually an easy task for a human, but it is a challenge for a computer [15]. Thus, text
analytics combines various methods and techniques, including machine learning, statistics and linguistics, to
process large volumes of unstructured data or text that has no predefined format, in order to derive insights and
patterns. Text analytics also uses other methods such as sentiment analysis, topic modelling, term frequency, and
event extraction. The fundamental steps in text analytics are: 1) data acquisition, 2) feature extraction, 3) data
pre-processing (i.e. converting text into semi-structured data), 4) model induction (i.e. applying text analytic
techniques to train the models and detect patterns in text), and 5) evaluation and interpretation of the results
[8]. Text analytics has been used in various fields such as market trend analysis, social media analysis,
healthcare, business analysis, fraud detection, risk management and many others [7]. For example, in education,
text analytics is used to investigate students’ sentiments or opinions [8]. However, there is still limited
research that applies text analytics to improve the quality of online learning systems.
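
As an illustration of the feature extraction step, the following minimal Python sketch converts free text into a
term-frequency table with one row per document and one column per word. It is not the RapidMiner process used in
this study; the function names and the two sample sentences are hypothetical and serve only to show how
unstructured text can be given a row-and-column form.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Collect the sorted set of all tokens seen in the corpus."""
    return sorted({token for doc in documents for token in tokenize(doc)})


def term_frequency_vector(text, vocabulary):
    """Return the count of each vocabulary word in one document."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


# Hypothetical feedback sentences, invented for illustration only.
documents = ["The system is slow", "Slow access, slow login"]
vocabulary = build_vocabulary(documents)
matrix = [term_frequency_vector(doc, vocabulary) for doc in documents]

print(vocabulary)  # ['access', 'is', 'login', 'slow', 'system', 'the']
print(matrix)      # [[0, 1, 0, 1, 1, 1], [1, 0, 1, 2, 0, 0]]
```

Each row of the resulting matrix can then be passed to a learning algorithm as a feature vector.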

Sentiment Analysis

Sentiment analysis (SA) is the study of people's opinions, sentiments, evaluations, attitudes, and emotions
expressed in written language [13]. SA is also a classification process that identifies the opinions and emotions
of users through written content [13]. Sentiment refers to the positive or negative expression in text. Sentiment
can be extracted from many sources of text, such as surveys, reviews, comments, discussion forums, blogs,
microblogs from social media and social networks, and even articles on the web. Based on [11], SA can be studied
at three levels: (1) document-level SA, (2) sentence-level SA and (3) aspect-level SA. Document-level SA aims to
classify a textual review on a single topic as expressing either positive or negative sentiment. Sentence-level SA
finds the sentiment polarity of a single sentence, and aspect-level SA examines the sentiment towards the various
aspects mentioned in a complex sentence. According to [13], SA approaches can be classified into: 1) Machine
Learning (ML) approach, 2) Lexicon-Based approach and 3) Hybrid approach. The ML approach uses linguistic features
and applies machine learning algorithms. The Lexicon-Based approach determines polarity by comparing the words in
the text against a sentiment lexicon (i.e. a sentiment dictionary). The Hybrid approach combines the ML approach
with the speed of the Lexicon-Based approach to solve the problem. SA has been employed in many different domains.
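
The Lexicon-Based approach can be sketched in a few lines of Python. The word lists below are a tiny hypothetical
lexicon invented for this example and are not the sentiment dictionaries used in this study; the sketch only shows
how a polarity score can be obtained by counting dictionary matches.

```python
import re

# Tiny hypothetical lexicon; real sentiment dictionaries are far larger.
POSITIVE_WORDS = {"good", "easy", "helpful", "useful"}
NEGATIVE_WORDS = {"slow", "difficult", "confusing", "error"}


def lexicon_sentiment(text):
    """Return (score, label), where score = positive hits - negative hits."""
    tokens = re.findall(r"[a-z]+", text.lower())
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    score = positives - negatives
    if score > 0:
        return score, "positive"
    if score < 0:
        return score, "negative"
    return score, "neutral"


print(lexicon_sentiment("The system is easy and helpful"))  # (2, 'positive')
print(lexicon_sentiment("Login is slow and confusing"))     # (-2, 'negative')
```

The sketch returns "neutral" when the score is zero. A two-class setting such as the one in this study would need
an explicit rule for such ties.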

Machine Learning Techniques for Classification

There are various machine learning (ML) techniques that text analytic tasks can use to search for knowledge and
patterns, including regression, classification, prediction, clustering, association analysis, time series
forecasting, and many others. Generally, ML techniques can be categorized according to the type of learning
procedure used to generate the output: supervised, unsupervised, and semi-supervised learning [9]. For this study,
supervised learning algorithms (i.e. Decision Tree and SVM) are used to classify the opinions of students
regarding NDUM e-Learning as positive or negative sentiment. The algorithms are applied and compared on this
classification problem. Each algorithm is discussed as follows.
Decision Tree builds a model by dividing the data into a series of partitions, represented as the nodes of a
tree. The rules are organized as a series of test questions and conditions in a tree structure that comprises leaf
nodes, internal nodes and links. A leaf node denotes a class label or output, an internal node denotes the name of
an attribute or feature, and a link between nodes (i.e. parent node and child node) denotes a rule or decision
[19].
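
The tree structure described above can be illustrated with a small hand-written example in Python. The tree below
is hypothetical and was not learned from the survey data: internal nodes test the count of a word, links
correspond to the outcome of the test, and leaf nodes hold the class label.

```python
def predict(node, features):
    """Follow the tests from the root until a leaf node is reached."""
    while "label" not in node:
        value = features.get(node["attribute"], 0)
        # The link taken depends on the outcome of the test at this node.
        node = node["left"] if value <= node["threshold"] else node["right"]
    return node["label"]


# Hypothetical tree: attributes are word counts in one student response.
tree = {
    "attribute": "slow",
    "threshold": 0,
    "left": {
        "attribute": "easy",
        "threshold": 0,
        "left": {"label": "negative"},
        "right": {"label": "positive"},
    },
    "right": {"label": "negative"},
}

print(predict(tree, {"easy": 2}))  # positive
print(predict(tree, {"slow": 1}))  # negative
```

Because every prediction is the result of a readable chain of tests, the path from root to leaf can be inspected
directly.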
Support Vector Machine (SVM) works by creating a boundary around a region of similar data points and categorizing
that region as one class [19]. Once the boundary has been created during training, SVM decides the class of each
new data point according to the side of the boundary on which it falls. Each data point, which contains values for
a number of different attributes, is referred to as a vector, and the boundary is called a hyperplane [9]. The
training data points that lie closest to the boundary are called support vectors because they support the boundary.
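
For a linear SVM, classifying a new data point reduces to checking on which side of the hyperplane it lies, i.e.
the sign of w·x + b. The Python sketch below assumes a linear kernel and uses hypothetical weights and bias that
were chosen by hand rather than learned; the kernel used in the RapidMiner experiments is not stated in this paper.

```python
def decision_value(weights, bias, x):
    """Signed position of x relative to the hyperplane w.x + b = 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(weights, bias, x):
    """Assign the class according to the side of the hyperplane."""
    return "positive" if decision_value(weights, bias, x) >= 0 else "negative"


# Hypothetical model: feature 0 counts positive words, feature 1 negative words.
weights = [1.0, -1.0]
bias = 0.0

print(classify(weights, bias, [2, 0]))  # positive
print(classify(weights, bias, [0, 3]))  # negative
```

Training an SVM amounts to choosing the weights and bias so that the margin between the hyperplane and the nearest
training points is as large as possible.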

Related Works

There has been a tremendous increase in online learning activities during the COVID-19 pandemic, when the majority
of educational institutions shifted their programs to digital platforms. Various studies, such as [1-4], compare
traditional education with online learning by applying data mining techniques (i.e. Decision Tree, Random Tree,
Naive Bayes, Random Forest, J48, C4.5 and KNN) to analyze students’ performance and the impact of using online
learning platforms during the pandemic. Those studies emphasized the variables and attributes that influence
students’ learning performance and discussed students’ satisfaction with online learning systems during the
pandemic. There are also studies on sentiment analysis (SA) during COVID-19 [18]. The authors of [6] analyzed
public opinion on online learning during the COVID-19 pandemic by applying a document-based text mining method to
Twitter data, and evaluated it using the Naïve Bayes algorithm. The findings showed 25% positive, 74% negative,
and 1% neutral sentiment. In similar studies [14, 16], the researchers investigated the sentiment in national and
public opinion about the actions taken by governments during COVID-19. The key issues discussed were economic,
emotional and internet connectivity problems. In a further investigation [4], a lexicon-based method was applied
to articles extracted using web scraping to determine their sentiment. The results showed that 90% of the articles
had positive sentiment and only 10% had negative sentiment. Generally, the blogs were more positive than the
newspapers, and the blogs were also found to be more opinionated.
Thus, this study aims to investigate students’ satisfaction with online learning and its impact on the students.
The ML approach is used to determine whether the sentiment in the data (i.e. the students’ opinions regarding NDUM
e-Learning) is positive or negative. Two ML algorithms (i.e. Decision Tree and SVM) are applied and compared on
this classification problem. This approach offers decision makers a way to monitor and measure students’
satisfaction with online learning during COVID-19.

METHODOLOGY

Cross Industry Standard Process for Data Mining (CRISP-DM)

Cross Industry Standard Process for Data Mining (CRISP-DM) is among the most widely used frameworks for solving
data science problems and is used for this study. The authors of [10, 17] discuss how the CRISP-DM framework
translates business problems into data mining tasks and allows data mining projects to be executed independently
of the application area and the technology used. Figure 1 shows the six phases of the model [20]: (1) business
understanding, (2) data understanding, (3) data preparation, (4) modelling, (5) evaluation and (6) deployment.

FIGURE 1. Cross Industry Standard Process for Data Mining (CRISP-DM) process [27].

Opinion Mining Architecture

Further, to implement Phase 2 and Phase 3 of CRISP-DM (i.e. Data Understanding and Data Preparation), the Opinion
Mining Architecture is applied (see Figure 2). It consists of four main processes, defined as follows.

FIGURE 2. Opinion Mining Architecture.

1. Opinion Retrieval. Opinion retrieval is concerned with discovering and retrieving content, particularly from
social media, related to the students' information, needs and opinions.
2. Pre-processing. Pre-processing is an iterative process of transforming raw data into an understandable and
usable form. Raw data sets usually contain incomplete data, inconsistent data, data that lacks properties, and
errors. Data pre-processing is therefore important for handling such data, and includes the following steps:
case transformation (case folding), tokenization, stop-word removal and stemming.
3. Data Analysis. Sentiment analysis is performed to analyze the students’ opinions, sentiments, evaluations,
attitudes, and emotions in written language and to label each as a positive or negative opinion.
4. Opinion Summarization. During opinion summarization, opinions related to the same topic are summarized in
order to understand hidden events and sentiments about different incidents.
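
The four pre-processing steps named in item 2 can be sketched in Python as follows. This is a simplified
illustration and not the RapidMiner operators used in this study: the stop-word list is a small hypothetical
sample, and the stemmer is a crude suffix-stripping rule rather than an established algorithm such as Porter's.

```python
import re

# Small hypothetical stop-word list; real lists contain hundreds of words.
STOP_WORDS = {"the", "is", "a", "an", "and", "to", "of", "in", "it"}
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(token):
    """Strip the first matching suffix if at least three letters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Apply case folding, tokenization, stop-word removal and stemming."""
    folded = text.lower()                                   # case folding
    tokens = re.findall(r"[a-z]+", folded)                  # tokenization
    kept = [t for t in tokens if t not in STOP_WORDS]       # stop-word removal
    return [crude_stem(t) for t in kept]                    # stemming


print(preprocess("The system is loading slowly and crashes"))
# ['system', 'load', 'slowly', 'crash']
```

The output is a list of normalized tokens that can be counted or scored in the Data Analysis step.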

RESULTS

Experimental Design

For this study, the dataset is created from a survey distributed to a target group of 105 students from the
Computer Science Department at NDUM. The survey has three main segments: Participant Consent, Demography, and
questions related to the study (Questions 1 to 5). The survey aims to investigate students’ satisfaction with the
system and the impact of online learning on the students. The students can also provide suggestions to improve the
system. The experiments are conducted on a computer running Windows 10 with an Intel(R) Core i5-8250U processor
and 8 GB of memory. RapidMiner Studio Educational 9.8.01 is used for the analysis. Subsequently, the Decision Tree
and SVM algorithms are used to construct the classifier model. The classifier model determines whether an opinion
is positive or negative using the sentiment operator in RapidMiner, which creates a sentiment score by applying
open source sentiment dictionaries. The results are displayed in graphical form to illustrate the findings of the
study. Figures 3 and 4 show how sentiment analysis is performed using RapidMiner, and Figure 5 shows the important
components for running an ML algorithm (i.e. Decision Tree) in RapidMiner.



FIGURE 3. Using RapidMiner to perform pre-processing.

FIGURE 4. Using RapidMiner to perform Sentiment Analysis.

FIGURE 5. Using RapidMiner to perform machine learning algorithm (i.e. Decision Tree).

Results and Analysis


In this section, the results of the experiment are presented and discussed. First, SA is executed to discover
hidden semantic structures that provide insight into the students’ feedback (i.e. Questions 1 to 5). Second, the
Word Cloud technique is used to identify the frequent words that appear in the related topics (i.e. Questions 1 to
5). Finally, ML algorithms (i.e. Decision Tree and SVM) are used to construct the model that classifies the text
(i.e. Questions 1 to 5) into either positive or negative sentiment. Figures 6 to 8 show part of the results after
the process described in Section 4.1 is performed.
Given Question 3, “What are the constraints that students feel towards the NDUM e-Learning?”, the most frequent
word in the survey responses is “access”, which is categorized as negative sentiment. Figure 6 shows the frequent
negative words that appear in the Word Cloud as the main constraints of NDUM e-Learning.

FIGURE 6. Frequent negative words in the Word Cloud related to the constraints of NDUM e-Learning.
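
A Word Cloud is driven by word frequencies, so the most prominent word is simply the one with the highest count
across all responses. The Python sketch below shows this counting step under stated assumptions: the three
responses and the stop-word list are hypothetical and are not taken from the survey data.

```python
import re
from collections import Counter


def top_words(responses, stop_words, n=3):
    """Return the n most frequent non-stop-words across all responses."""
    counts = Counter(
        token
        for response in responses
        for token in re.findall(r"[a-z]+", response.lower())
        if token not in stop_words
    )
    return counts.most_common(n)


# Hypothetical answers to a question about constraints, for illustration only.
responses = ["Hard to access the site", "Access is slow", "Cannot access at night"]
stop_words = {"to", "the", "is", "at"}

print(top_words(responses, stop_words, n=1))  # [('access', 3)]
```

In a Word Cloud, the font size of each word is scaled according to such counts.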

Further, given Question 4, “In your opinion, what needs to be improved from the NDUM e-Learning service?”,
Figure 7 shows the frequent negative words that appear in the Word Cloud as the main concerns, indicating the
areas where NDUM e-Learning needs improvement.

FIGURE 7. Frequent negative words in the Word Cloud related to the improvement of NDUM e-Learning.

TABLE 1. Classification accuracy of Decision Tree and Support Vector Machine (SVM) in determining whether an
opinion is positive or negative for each of Questions 1 to 5.

Question    Algorithm        Classification Accuracy (%)
Q1          Decision Tree    88.27
Q1          SVM              99.03
Q2          Decision Tree    95.09
Q2          SVM              98.38
Q3          Decision Tree    74.64
Q3          SVM              85.43
Q4          Decision Tree    83.27
Q4          SVM              85.22
Q5          Decision Tree    92.27
Q5          SVM              100
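
The values in Table 1 can be summarized with simple arithmetic. The Python sketch below computes the unweighted
mean accuracy of each algorithm over the five questions and the per-question gap between them. These summary
figures are not reported in this paper and are derived here only for illustration; the unweighted mean also
assumes that every question carries equal weight.

```python
# Accuracy values (%) copied from Table 1.
ACCURACY = {
    "Q1": {"Decision Tree": 88.27, "SVM": 99.03},
    "Q2": {"Decision Tree": 95.09, "SVM": 98.38},
    "Q3": {"Decision Tree": 74.64, "SVM": 85.43},
    "Q4": {"Decision Tree": 83.27, "SVM": 85.22},
    "Q5": {"Decision Tree": 92.27, "SVM": 100.0},
}


def mean_accuracy(algorithm):
    """Unweighted mean accuracy of one algorithm over all questions."""
    values = [row[algorithm] for row in ACCURACY.values()]
    return sum(values) / len(values)


def accuracy_gap(question):
    """Accuracy of SVM minus accuracy of Decision Tree for one question."""
    row = ACCURACY[question]
    return row["SVM"] - row["Decision Tree"]


print(round(mean_accuracy("Decision Tree"), 2))  # 86.71
print(round(mean_accuracy("SVM"), 2))            # 93.61
print(all(accuracy_gap(q) > 0 for q in ACCURACY))  # True
```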

Considering accuracy as the performance measure, SVM is the best algorithm for solving this problem, as it
outperformed Decision Tree on every question. Decision Tree also performed well on most of the datasets, with
accuracy above 85%, but it did not achieve its best performance on the others (i.e. Questions 3 and 4). SVM
achieved good classification accuracy due to the nature of the algorithm, which is often more powerful and can
scale to larger datasets; however, Decision Tree provides more insight into how the model works. Decision Tree is
valued for its simplicity and interpretability, but it is limited in its ability to learn complicated rules and to
scale to large datasets. One further consideration is that Decision Tree is sensitive to unbalanced datasets.
Therefore, Decision Tree and SVM will be further explored in future work on other classification problems in order
to determine the best controlling conditions.

CONCLUSION
In conclusion, the objectives of the study have been achieved. Using text mining techniques, information related
to online learning (i.e. NDUM e-Learning) among the students during the pandemic can be extracted, and the
opinions of students regarding online learning can be described as positive or negative. Thus, the data can be
represented in a meaningful way (e.g. through the students’ reviews and the features that appear most often) to
draw conclusions that help improve NDUM e-Learning. It is hoped that the percentage of positive effects will
increase and that the students can carry out the online learning process more efficiently. Although the students
feel that NDUM e-Learning is suitable and relevant for use in the pandemic era, there is still room for
improvement, as shown by the findings from Question 3. The students stated some of the constraints they faced when
using NDUM e-Learning and also suggested some aspects of the system to improve. The administrator can therefore
make improvements to the system based on the suggestions from Question 4. The method could also be generalized to
other domains, such as public health monitoring and crisis management. For example, the system could help
authorities provide a rapid and effective monitoring mechanism to manage future crisis scenarios on a large scale
at a low cost.

REFERENCES

1. Ahmed, A. S. A. M. S., Malik, M. H.: Machine Learning for Strategic Decision Making during COVID-19 at
Higher Education Institutes. In 2020 International Conference on Decision Aid Sciences and Application
(DASA), pp. 663-668, IEEE (2020).
2. Akbar, Arminditya S., Harry P., Panca O. H., Yudhoatmojo, Satrio: User Perception Analysis of Online
Learning Platform “Zenius” During the Coronavirus Pandemic Using Text Mining Techniques. Jurnal Sistem
Informasi, 17, pp. 33-47 (2021).
3. Auliya R. I., Jepi S., Muhammad P. K.: Implementation of K-Nearest Neighbor (K-NN) Algorithm for Public
Sentiment Analysis of Online Learning. IJCCS (Indonesian Journal of Computing and Cybernetics Systems),
Vol. 15, No. 2, pp. 121-130 (2021).
4. Bhagat, K. K., Mishra, S., Dixit, A., Chang, C.-Y.: Public Opinions about Online Learning during COVID-19:
A Sentiment Analysis Approach. In Sustainability, Vol. 13 (3346) (2021).
5. Cooper S., Sahami M.: Reflections on Stanford’s MOOCs. Communications of the ACM, 56(2), pp. 28-30
(2013).
6. Deraman, Noor B., Alya G. D., Siti M.: Mining social media opinion on online distance learning issues during
and after movement control order (MCO) in Malaysia using topic modeling approach. In Int. Journal of
Advanced Tech. and Eng. Exploration, 8, pp. 2394-7454 (2021).
7. Dina N. Z., Yunardi R. T., Firdaus A. A.: Utilizing text mining and feature-sentiment-pairs to support
data-driven design automation Massive Open Online Course. In Int. Journal of Emerging Technologies in Learning
(iJET), 16(01), pp. 134–151 (2021).
8. Ferreira-Mello R., André M., Pinheiro A., Costa E., Romero C.: Text mining in education. Wiley
Interdisciplinary Reviews: Data Mining and Knowledge Discovery (2019).
9. Hongbo Du: Data mining techniques and applications: an introduction. Cengage Learning (2010).
10. Layth Almahadeen, Murat Akkaya, Arif Sari: Mining Student Data Using Crisp-DM Model. In Int. Journal of
Computer Science and Information Security, Vol. 15, No. 2 (2017).

11. Lee S.-W., Jiang G., Kong H.-Y., Liu C.: A difference of multimedia consumer’s rating and review through
sentiment analysis. Multimedia Tools and Applications, Volume 80, Issue 26-27, Nov (2021).
12. Martin F. G.: Will massive open online courses change how we teach? Communications of the ACM, 55(8), pp.
26-28 (2012).
13. Medhat W., Hassan A., Korashy H.: Sentiment analysis algorithms and applications: A survey. Ain Shams
Engineering Journal, 5, 1093–1113 (2014).
14. R. Watrianthos, S. Suryadi, D. Irmayani, M. Nasution, and E. F. S. Simanjorang: Sentiment Analysis Of
Traveloka App Using Naïve Bayes Classifier Method. In Int. J. Science Technology Res., vol. 8, no. 07, pp.
786–788 (2019).
15. Rahmawati P, Larasat, A, Farhan M, Hajji A M, Fanani N A.: Understanding user feedback on Learning
Management System of SIPEJAR by using text mining techniques. IOP Conference Series: Materials Science
and Engineering; Bristol Vol. 1072, Issue 1 (2021).
16. Syafrida Hafni S.: Online learning sentiment analysis during the covid-19 Indonesia pandemic using twitter
data. IOP Conf. Ser.: Mater. Sci. Eng. 1156 012011 (2021).
17. Syahaneim M., Nur Hidayah M. D., Zuraini Z., Omar Z.: Framework of Knowledge-Based System for United
Nations Peacekeeping Operations using Data Mining Technique. 2018 4th Int. Conf. on Info. Retrieval and
Knowledge Mgmt (CAMP), pp. 1-6 (2018).
18. Syahriani, A. A. Yana, and T. Santoso: Sentiment analysis of Facebook comments on Indonesian presidential
candidates using the Naïve Bayes method. In Journal of Physics: Conference Series, vol. 1641, pp. 012012
(2020).
19. Vijay Kotu, Bala Deshpande: Data Science: Concepts and Practice. Morgan Kaufmann (2019).
20. Wiemer, H., Drowatzky, L., Ihlenfeldt: Data Mining Methodology for Engineering Applications (DMME) - A
Holistic Extension to the CRISP-DM Model (Application Science) Vol. 9 (2019).
