COVER FEATURE: ADVANCES IN LEARNING TECHNOLOGIES

A Sentiment Analysis System to Improve Teaching and Learning
Sujata Rani and Parteek Kumar, Thapar University

Natural language processing and machine learning can be applied to student feedback to help university administrators and teachers address problematic areas in teaching and learning. The proposed system analyzes student comments from both course surveys and online sources to identify sentiment polarity, the emotions expressed, and satisfaction versus dissatisfaction. A comparison with direct-assessment results demonstrates the system's reliability.

Sentiment analysis (SA) is the process of identifying and classifying users' opinions from a piece of text into different sentiments—for example, positive, negative, or neutral—or emotions such as happy, sad, angry, or disgusted to determine the user's attitude toward a particular subject or entity. SA plays an important role in many fields including education, where student feedback is essential to assess the effectiveness of learning technologies. Many universities obtain such feedback via a student response system (SRS) during or at the end of a course to analyze the teacher's performance.1 Student feedback about teacher performance, the learning experience, and other course attributes can also be gathered through social media. In recent years, online learning portals like Coursera (www.coursera.org) have attracted many students by providing free courses from a growing number of selected institutions.2 Millions of students join these massive open online courses each year and share their opinions about the course content and quality of teaching on the course's discussion forum. Students also comment about their educational experiences in blogs, online forums such as College Confidential (www.collegeconfidential.com), and teacher review sites such as Rate My Professors (www.ratemyprofessors.com).3

Computer, May 2017 | Published by the IEEE Computer Society | 0018-9162/17/$33.00 © 2017 IEEE


This feedback not only yields useful insights for university administrators and instructors but also plays a key role in influencing student decisions on which universities to attend or courses to take.4

FIGURE 1. Proposed sentiment analysis (SA) system architecture. After preprocessing input data—student feedback obtained from both formal sources such as course surveys and informal sources such as blogs and forums—the system uses natural language processing in conjunction with the NRC Emotion Lexicon to classify sentiments and emotions. Sentiments are classified into two categories, positive and negative, and emotions are classified into one of eight categories—anger, anticipation, disgust, fear, joy, sadness, surprise, and trust—from which the system computes satisfaction or dissatisfaction. The SA system can process multilingual content and includes a data-visualization component to facilitate analysis.

SENTIMENT ANALYSIS
Course outcomes can be assessed directly or indirectly. Direct assessment considers samples of actual student work including exams, assignments, quizzes, and project reports. Indirect assessment is based upon student observations of the learning experience and teaching quality. SA of student feedback is a form of indirect assessment that analyzes text written by students—whether in formal course surveys or informal comments from online platforms—to determine students' interest in a class and to identify areas that could be improved through corrective actions.

SA raises many technical challenges. First, word meaning varies across different domains. For example, in an education context the word "early" connotes a negative sentiment in the sentence "The lecture is too early!" but in a consumer context it connotes a positive one in the sentence "The courier arrived early." Second, performing SA on text in different languages can be difficult. In India, for example, people often express their opinions using a transliterated form of Hindi; thus, they might write "Wo class mein achha padhate hain," which translates into English as "He teaches very well in the class." These types of challenges motivate the need to develop a context-sensitive, multilingual SA system.

Most SA studies have focused on user-review corpora—for example, product, movie, and hotel reviews—with researchers generally classifying the reviews into positive, negative, and sometimes neutral. SA has not been extensively applied to education, though work in this area has grown recently as described in the "Related Research" sidebar. However, most of these approaches limit the classification of sentiments to the two or three categories indicated above, without considering the wide range of emotions that can also affect student feedback. Moreover, they do not process multilingual data. Finally, previous researchers have not attempted to validate their systems by comparing the results of their analysis with those of traditional direct-assessment methods.

PROPOSED SA SYSTEM
Our proposed SA system helps to improve teaching and learning by performing temporal sentiment and emotion analysis of multilingual student feedback in terms of teacher performance and course satisfaction. The system classifies sentiments into two categories, positive and negative, and emotions into Robert Plutchik's eight categories—anger, anticipation, disgust, fear, joy, sadness, surprise, and trust—from which it computes satisfaction or dissatisfaction.

Figure 1 shows the system architecture, which has five main components: data collection, data preprocessing, sentiment and emotion identification, satisfaction and dissatisfaction computation, and data visualization. The system uses the open source R language (www.r-project.org) to perform data preprocessing and sentiment classification.

Data collection
Our initial data corpus consists of student feedback about a Coursera course as well as data obtained from a university SRS. The Coursera dataset includes approximately 4,000 student comments made during the course, which ran from August 2015 to August 2016, and 1,700 student comments made after completion of the course.


RELATED RESEARCH

In recent years, researchers have begun to apply sentiment analysis (SA) to the education field using various machine learning and natural language processing techniques.

In 2011, Zied Kechaou, Mohamed Ben Ammar, and Adel M. Alimi performed sentiment classification of e-learning blogs and forums using a supervised hybrid technique that combined hidden Markov models with support vector machines (SVMs). They performed experiments using three feature-selection methods—mutual information, information gain, and chi statistics—and determined that the chi-statistics method outperformed the other two.1

Two years later, Myriam Munezero and her colleagues performed emotion analysis of student learning diaries and classified them into Robert Plutchik's eight emotion categories. They also computed frustration and anxiety from these eight emotions.2

In 2014, Nabeela Altrabsheh, Mihaela Cocea, and Sanaz Fallahkhair performed SA of student feedback using naive Bayes (NB), complement NB (CNB), SVM, and maximum-entropy classifiers with unigrams as features. They concluded that an SVM with a radial basis function kernel and the CNB technique achieved good results for real-time feedback analysis. They also observed better performance without including the neutral class.3

The following year, Trisha Patel, Jaimin Undavia, and Atul Patel analyzed feedback from meetings of students' parents using the General Architecture for Text Engineering (GATE) tool and its ANNIE application to classify comments as positive or negative.4

Several studies were published in 2016. Francis F. Balahadia, Ma. Corazon G. Fernando, and Irish C. Juanatas proposed an SA system to evaluate teacher performance in courses from student responses in both English and Filipino. They calculated sentiment scores from qualitative and quantitative response ratings using an NB algorithm and graphically represented the percentage of positive and negative sentiments to help university administrators be aware of students' concerns.5

V. Dhanalakshmi, Dhivya Bino, and A.M. Saravanan performed SA on feedback from a student evaluation survey of Middle East College in Oman. They used the RapidMiner tool to classify the comments into positive and negative on the basis of features like teacher, exam, module content, and resources.

The SRS dataset includes about 500 student comments and ratings for lecture and lab sessions after midterm and final-semester examinations for a course taught by one teacher over the past 10 years. It also includes student surveys and comments for 25 courses taught by different teachers at the university over the past two years, which we used in conjunction with direct assessments of student performance to evaluate the system's reliability.

Data preprocessing
During this phase, the SA system prepares collected data for further processing. This involves six steps.

Tokenization. Students' comments are split into words, or tokens, using the tokenize function in R.

Lowercasing. Characters are converted to lowercase to ease the process of matching words in student comments to words in the NRC Emotion Lexicon.5 This step is performed using the tm_map function in R's tm package.

Normalization. Abbreviated content is normalized using a dictionary that maps frequently used Internet slang to standard words. For example, "gud" and "awsm" are mapped to "good" and "awesome," respectively.

Stemming. To further facilitate word matching, words in student comments are reduced to their root form using the tm_map function with R's SnowballC package. For example, "moving," "moved," and "movement" are all converted to "move."

Removal of irrelevant content. Punctuation and stop words, which are irrelevant for SA, are removed to improve system response time and effectiveness.

Transliteration. To address the use of mixed languages in student comments, the text is transliterated using the Google Transliterate API.
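The preprocessing steps above can be sketched in a few lines. The authors implement them in R with the tm and SnowballC packages, so the following Python version is only an illustration; the slang dictionary and stop-word list are invented for the example, and stemming and transliteration are omitted because they depend on external packages.

```python
import re

# Illustrative slang dictionary and stop-word list (not the system's actual data).
SLANG = {"gud": "good", "awsm": "awesome"}
STOP_WORDS = {"the", "is", "a", "an", "and", "to", "in"}

def preprocess(comment):
    """Tokenize, lowercase, normalize slang, and drop stop words/punctuation."""
    tokens = re.findall(r"[a-zA-Z']+", comment)        # tokenization (punctuation dropped)
    tokens = [t.lower() for t in tokens]               # lowercasing
    tokens = [SLANG.get(t, t) for t in tokens]         # normalization
    return [t for t in tokens if t not in STOP_WORDS]  # stop-word removal

print(preprocess("The lectures are gud and awsm!"))
# → ['lectures', 'are', 'good', 'awesome']
```

In a production pipeline, the regex tokenizer would be replaced by a language-aware tokenizer, and the slang dictionary would be far larger.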



The researchers compared the performance of their approach using NB, SVM, k-nearest neighbors, and neural-network classifiers.6

Brojo Kishore Mishra and Abhaya Kumar Sahoo used CUDA C programming with a GPU architecture to evaluate faculty performance. They categorized faculty members as excellent, very good, good, average, or poor on the basis of the average marks given by students in feedback forms. The researchers favorably compared their approach's execution time to that of a similar performance evaluation using a CPU architecture.7

Guadalupe Gutiérrez Esparza and her colleagues proposed a model for SA of student tweets about teacher performance in Spanish. They used an SVM algorithm to classify the tweets into positive, negative, and neutral; they also proposed a syntactic pattern model to compare results using SVMs and syntactic patterns.8

References
1. Z. Kechaou, M.B. Ammar, and A.M. Alimi, "Improving E-learning with Sentiment Analysis of Users' Opinions," Proc. IEEE Global Eng. Education Conf. (EDUCON 11), 2011, pp. 1032–1038.
2. M. Munezero et al., "Exploiting Sentiment Analysis to Track Emotions in Students' Learning Diaries," Proc. 13th Koli Calling Int'l Conf. Computing Education Research, 2013, pp. 145–152.
3. N. Altrabsheh, M. Cocea, and S. Fallahkhair, "Learning Sentiment from Students' Feedback for Real-time Interventions in Classrooms," Adaptive and Intelligent Systems, A. Bouchachia, ed., LNCS 8779, Springer, 2014, pp. 40–49.
4. T. Patel, J. Undavia, and A. Patel, "Sentiment Analysis of Parents Feedback for Educational Institutes," Int'l J. Innovative and Emerging Research in Eng., vol. 2, no. 3, 2015, pp. 75–78.
5. F.F. Balahadia, M.C.G. Fernando, and I.C. Juanatas, "Teacher's Performance Evaluation Tool Using Opinion Mining with Sentiment Analysis," Proc. IEEE Region 10 Symp. (TENSYMP 16), 2016, pp. 95–98.
6. V. Dhanalakshmi, D. Bino, and A.M. Saravanan, "Opinion Mining from Student Feedback Data Using Supervised Learning Algorithms," Proc. 3rd MEC Int'l Conf. Big Data and Smart City (ICBDSC 16), 2016; doi:10.1109/ICBDSC.2016.7460390.
7. B.K. Mishra and A.K. Sahoo, "Evaluation of Faculty Performance in Education System Using Classification Technique in Opinion Mining Based on GPU," Computational Intelligence in Data Mining, vol. 2, H. Behera and D. Mohapatra, eds., AISC 411, Springer, 2016, pp. 109–119.
8. G.G. Esparza et al., "Proposal of a Sentiment Analysis Model in Tweets for Improvement of the Teaching-Learning Process in the Classroom Using a Corpus of Subjectivity," Int'l J. Combinatorial Optimization Problems and Informatics, vol. 7, no. 2, 2016, pp. 22–34.

Sentiment and emotion identification
During this phase, the SA system analyzes the preprocessed data to identify instances of sentiment and emotion. It uses the NRC Emotion Lexicon,5 also known as EmoLex, to associate words with positive or negative sentiment and the eight basic emotions. The lexicon supports 40 languages including several Indian ones like Hindi, Tamil, Gujarati, Marathi, and Urdu. It includes annotations for 14,182 unigram words for English and 8,116 for Hindi.

Each word in the lexicon has an emotion vector (E) containing a Boolean value (b) for each sentiment (s) and emotion (e):

E = Ee + Es,

where Ee ∈ {b0, …, b7} holds the eight emotion values, Es ∈ {b8, b9} holds the two sentiment values, and ∀bi ∈ {0, 1}.

If a word in a student comment matches a word in the lexicon, the corresponding emotion vector is returned; if the comment matches more than one word in the lexicon, the sum of the corresponding emotion vectors is returned. In this way, an emotion vector is created for each comment representing the different emotions and sentiments contained within. For example, for the sentence "Sir, you are great!" the SA system would return the following emotion vector:

Anger  Anticipation  Disgust  Fear  Joy  Sadness  Surprise  Trust  Negative  Positive
    0             0        0     0    0        0         0      1         0         1

This equates to a positive sentiment, as the trust and positive parameters have a b value equal to 1.
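The token-matching step can be sketched as follows. The two lexicon entries below are invented stand-ins for real EmoLex annotations, used only to show the summing behavior; the actual system looks tokens up in the full NRC lexicon.

```python
# Each word maps to a 10-element Boolean vector over (anger, anticipation,
# disgust, fear, joy, sadness, surprise, trust, negative, positive).
DIMS = ("anger", "anticipation", "disgust", "fear", "joy",
        "sadness", "surprise", "trust", "negative", "positive")

# Illustrative entries only, not actual EmoLex data.
LEXICON = {
    "great": (0, 0, 0, 0, 0, 0, 0, 1, 0, 1),
    "doubt": (0, 0, 0, 1, 0, 1, 0, 0, 1, 0),
}

def emotion_vector(tokens):
    """Sum the lexicon vectors of all matching tokens in a comment."""
    total = [0] * len(DIMS)
    for tok in tokens:
        for i, v in enumerate(LEXICON.get(tok, (0,) * len(DIMS))):
            total[i] += v
    return total

print(emotion_vector(["sir", "you", "are", "great"]))
# → [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]  (only "great" matches: trust and positive)
```

A comment with several matching words simply accumulates their vectors, which is why counts greater than 1 can appear in a comment-level vector.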



FIGURE 2. Temporal sentiment analysis. (a) Student ratings of a teacher’s performance in lectures and lab sessions of one course over
a 10-year period. Students rated the performance in lectures slightly higher, and average overall performance exceeded 90 percent
during the last six years. (b) Percentage of positive and negative student comments about and ratings of the same teacher; on average,
85 percent of comments were positive and 15 percent were negative.
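The per-year percentages plotted in Figure 2b can be computed directly from per-comment sentiment labels. The data below is invented for illustration; the system derives such labels from each comment's emotion vector.

```python
from collections import Counter

# Invented (year, sentiment) labels, one per comment — not the real dataset.
labels = [(2015, "positive"), (2015, "positive"), (2015, "negative"),
          (2016, "positive"), (2016, "negative")]

def yearly_percentages(labels):
    """Percentage of positive and negative comments for each year."""
    totals = Counter(year for year, _ in labels)
    counts = Counter(labels)
    return {
        year: {
            "positive": 100 * counts[(year, "positive")] / totals[year],
            "negative": 100 * counts[(year, "negative")] / totals[year],
        }
        for year in totals
    }

print(yearly_percentages(labels)[2016])
# → {'positive': 50.0, 'negative': 50.0}
```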

To enable temporal analysis of sentiments and emotions, the SA system generates a mean emotion vector (Ēj) for each month and year:

Ēj = (1/n) ∑(i = 0 to n−1) ∑(j = 0 to p−1) Eji, ∀Eji ∈ N where N ≥ 0.

Here, n represents the number of comments in each month and year and p represents the emotion and sentiment parameters such that p ∈ {anger, anticipation, disgust, fear, joy, sadness, surprise, trust, negative, positive}. This vector is created to avoid any anomalies that might result from an increase in the value of a particular emotion in that month or year.

Satisfaction and dissatisfaction computation
Satisfaction and dissatisfaction are crucial parameters in education. The SA system derives these from six of the eight emotion parameters—namely, joy, trust, anticipation, anger, disgust, and sadness. Anticipation and trust clearly connote satisfaction, but in some circumstances joy could have a negative connotation—for example, a student could feel joy at skipping a boring class. Therefore, in computing student satisfaction, we multiply the sum of anticipation and trust by a constant (α = 0.6) to give these parameters more weight. We employ the same mechanism in computing student dissatisfaction to give more weight to anger and disgust than to sadness.

The calculations are as follows:

Satisfaction = [α(TA) + (1 − α)(J)]/n
Dissatisfaction = [α(AD) + (1 − α)(S)]/n,

where TA = trust + anticipation, J = joy, AD = anger + disgust, S = sadness, and n = max(TA, J) for satisfaction or max(AD, S) for dissatisfaction.

Consider two examples. For the sentence "He is good at teaching," the SA system returns the following emotion vector from the NRC lexicon:

Anger  Anticipation  Disgust  Fear  Joy  Sadness  Surprise  Trust  Negative  Positive
    0             1        0     0    1        0         1      1         0         1

Here, TA = 2, J = 1, and n = max(TA, J) = 2. Satisfaction is thus calculated as [0.6(2) + 0.4(1)]/2 = 1.6/2 = 0.8.

For the sentence "He is bad at teaching and every student has doubts about the class," the system returns the following emotion vector:

Anger  Anticipation  Disgust  Fear  Joy  Sadness  Surprise  Trust  Negative  Positive
    1             0        1     2    0        2         0      1         2         0
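The satisfaction and dissatisfaction formulas can be sketched as follows, using α = 0.6 and the vector ordering from the examples above. This is an illustrative Python version of the article's R-based computation; the guard for n = 0 is our addition, for comments containing no matching emotion words.

```python
# Vectors follow the order (anger, anticipation, disgust, fear, joy,
# sadness, surprise, trust, negative, positive).
ALPHA = 0.6

def satisfaction(vec):
    ta = vec[1] + vec[7]              # TA = anticipation + trust
    j = vec[4]                        # J = joy
    n = max(ta, j)
    return (ALPHA * ta + (1 - ALPHA) * j) / n if n else 0.0

def dissatisfaction(vec):
    ad = vec[0] + vec[2]              # AD = anger + disgust
    s = vec[5]                        # S = sadness
    n = max(ad, s)
    return (ALPHA * ad + (1 - ALPHA) * s) / n if n else 0.0

# "He is good at teaching": TA = 2, J = 1 -> [0.6(2) + 0.4(1)]/2 = 0.8
print(round(satisfaction((0, 1, 0, 0, 1, 0, 1, 1, 0, 1)), 2))    # 0.8
# "He is bad at teaching...": AD = 2, S = 2 -> [0.6(2) + 0.4(2)]/2 = 1.0
print(round(dissatisfaction((1, 0, 1, 2, 0, 2, 0, 1, 2, 0)), 2)) # 1.0
```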



FIGURE 3. Temporal emotion analysis. (a) Percentage of emotions extracted from student feedback on a one-year Coursera course.
Students expressed positive emotions more than negative ones, signaling satisfaction with the experience. (b) Percentage of emotions
extracted from student comments about the teacher in Figure 2. Trust in the instructor gradually increased, and each year the percentage
of positive emotions exceeded that of negative emotions.
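The monthly grouping behind plots like those in Figure 3 mirrors the mean emotion vector Ēj described earlier: per-comment vectors are bucketed by month and averaged component-wise. The comment vectors below are invented for the demonstration.

```python
from collections import defaultdict

# Invented (month, emotion-vector) pairs in the order (anger, anticipation,
# disgust, fear, joy, sadness, surprise, trust, negative, positive).
comments = [
    ("2015-08", (0, 1, 0, 0, 1, 0, 0, 1, 0, 1)),
    ("2015-08", (0, 1, 0, 0, 0, 0, 0, 1, 0, 1)),
    ("2015-09", (1, 0, 1, 0, 0, 1, 0, 0, 1, 0)),
]

def mean_emotion_vectors(data):
    """Average the emotion vectors of all comments in each month."""
    buckets = defaultdict(list)
    for month, vec in data:
        buckets[month].append(vec)
    return {
        month: tuple(sum(col) / len(vecs) for col in zip(*vecs))
        for month, vecs in buckets.items()
    }

print(mean_emotion_vectors(comments)["2015-08"])
# the joy component (index 4) averages to 0.5 across the two August comments
```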

In this case, AD = 2, S = 2, and n = max(AD, S) = 2. Dissatisfaction is therefore calculated as [0.6(2) + 0.4(2)]/2 = 2.0/2 = 1.

Data visualization
To facilitate analysis of student feedback about course satisfaction and teacher performance, our SA system has a data-visualization component that creates sentiment and emotion word clouds as well as line graphs of changes in sentiments and emotions over time.

Sentiment and emotion word clouds. Students use a variety of words to convey their sentiments or emotions while giving feedback. Visualizing frequently used positive words ("great," "excellent," "interesting," and so on) and negative words ("dull," "confusing," "terrible," and so on) in the form of word clouds can help identify student learning behavior—for example, whether or not they are taking an interest in lectures and lab sessions.

Temporal sentiment and emotion analysis. As indicated earlier, our SA system groups together positive and negative comments and ratings in student feedback by month and year. This makes it possible to track teacher performance and course satisfaction over time. Figure 2a plots overall student ratings (ranging from 0 to 100 percent) of one teacher's performance in lectures and lab sessions of a university course from 2006 to 2016; the graph shows that students rated the teacher's performance in lectures slightly higher than that in lab sessions and that the average overall rating was more than 90 percent during the last six years. Figure 2b plots the percentage of positive and negative student comments about and ratings of the teacher over the same period; the graph reveals that, on average, 85 percent of comments were positive and 15 percent were negative. Sentiment polarity can also be tracked across different teachers and courses over time to analyze overall teaching quality at a given institution.

Our SA system also groups together emotions identified in comments about courses and teachers by month and year, providing more granular insight. Figure 3a plots the percentage of emotions extracted from student feedback on a one-year Coursera course by month; the graph shows


that students expressed the positive emotions of trust, joy, anticipation, and surprise more than the negative emotions of sadness, fear, disgust, and anger. Figure 3b plots the percentage of emotions extracted from student comments about the teacher from Figure 2; it shows that students' trust in the instructor gradually increased over the decade and that each year the percentage of positive emotions exceeded that of negative emotions. In both datasets, about 55 percent of students were satisfied with the teacher.

SYSTEM EVALUATION
In education, there is a general consensus that direct and indirect assessments of teaching quality and learning behavior should agree. Students who perform well in a course, for example, would be expected to give the teacher high ratings and favorable comments; conversely, those who perform poorly are likely to be dissatisfied.

To validate our SA system, we analyzed student surveys and comments obtained from a university SRS for 25 different courses over a two-year period and compared the percentage of positive sentiments students expressed about each course with the average course grade on a 0–100 scale.

FIGURE 4. Comparison of student performance, quantified as average class grade on a 0–100 scale, with the percentage of positive sentiments in (a) surveys and (b) comments obtained from a university student response system (SRS) for 25 different courses over a two-year period. The results of the two methods generally agreed.

As Figure 4 shows, the results generally agreed, with less than 20 percent absolute difference between the methods. In those courses where student performance exceeded satisfaction, there could be a number of explanations: the exams were relatively easy, the course had a particularly bright or hard-working group of students, or students did not like the teacher for personal reasons or felt they did not gain much value from the class. In those courses where student satisfaction exceeded performance, perhaps the exams were exceptionally challenging or students failed to adequately prepare. In either case, the discrepancy in results obtained from the two approaches invites continued analysis.

Our proposed SA system has great potential to improve teaching and learning in universities by analyzing sentiment, emotion, and satisfaction parameters in student feedback to help administrators and teachers understand problematic areas and take corrective actions. The large volume of information contributed by students to course surveys, discussion forums, blogs, and other sources is a largely underutilized resource that can be effectively leveraged with the application of machine learning techniques, which are continually improving. A comparison of our proposed system's results with direct assessments of class performance demonstrates its reliability.

Despite its promise, the system has some limitations. It is only as good as the data it analyzes, so care must be taken in collecting feedback from students. SRSs must be well designed to ensure that they are engaging, and instructors must make a concerted effort to ensure that as many students as possible provide complete and accurate feedback.

In future work, we plan to adapt the SA system API to integrate with SRSs and online learning portals to enable real-time analysis of student feedback. We will also add other Indian languages to extend the system's multilingual capabilities.

REFERENCES
1. N. Altrabsheh, M.M. Gaber, and M. Cocea, "SA-E: Sentiment Analysis for Education," Intelligent Decision Technologies, R. Neves-Silva et al., eds., FAIA 255, IOS Press, 2013, pp. 353–362.
2. M. Wen, D. Yang, and C.P. Rosé, "Sentiment Analysis in MOOC Discussion Forums: What Does It Tell Us?," Proc. 7th Int'l Conf. Educational Data Mining (EDM 14), 2014; www.cs.cmu.edu/~mwen/papers/edm2014-camera-ready.pdf.



3. B.K. Mishra and A.K. Sahoo, "Evaluation of Faculty Performance in Education System Using Classification Technique in Opinion Mining Based on GPU," Computational Intelligence in Data Mining, vol. 2, H. Behera and D. Mohapatra, eds., AISC 411, Springer, 2016, pp. 109–119.
4. A. Abdelrazeq et al., "Sentiment Analysis of Social Media for Evaluating Universities," Proc. 2nd Int'l Conf. Digital Information Processing, Data Mining, and Wireless Comm. (DIPDMWC 15), 2015, pp. 49–62.
5. S.M. Mohammad and P.D. Turney, "Crowdsourcing a Word–Emotion Association Lexicon," Computational Intelligence, vol. 29, no. 3, 2013, pp. 436–465.

ABOUT THE AUTHORS

SUJATA RANI is a research scholar in the Department of Computer Science and Engineering at Thapar University. Her research interests include natural language processing (NLP) and machine learning. Rani received an ME in computer science and engineering from Thapar University. She is a member of ACM. Contact her at sujata.singla@thapar.edu.

PARTEEK KUMAR is an associate professor in the Department of Computer Science and Engineering at Thapar University. His research interests include NLP, databases, and machine learning. Kumar received a PhD in NLP from Thapar University. He is a member of ACM. Contact him at parteek.bhatia@thapar.edu.

