2016 Multimedia Tools Appl
DOI 10.1007/s11042-016-3637-2
1 Introduction
E-learning, Ubiquitous Computing, Ambient Intelligence and the Internet of Things are becoming
widespread. As a result, innovative interactive multimedia applications are emerging, leading
* Alexandros Liapis
aliapis@eap.gr
School of Science and Technology, Hellenic Open University, Parodos Aristotelous 18, Patras 26335,
Greece
GSR signal, along with the performance of various classifiers (e.g. SVM, k-NN etc.) are also
presented.
The rest of the paper is structured as follows. Section 2 presents the research background.
Section 3 presents the experimental details: the stimuli selection, the general experimental
set-up and the protocol. In Section 4, the proposed algorithm for segmenting
the VA space into non-rectangular regions is described. Section 5 presents the results. The
paper concludes with a discussion of the main findings, limitations of the presented work and
directions for future research.
2 Background
Multimedia components and features bridge the gap between traditional computer interfaces
and new innovative systems (e.g. multipurpose public interactive displays [27]) that support
multimodal interaction. The growing number of software applications, along with the wide use
of interactive multimedia, generates a need for methods to evaluate them systematically.
In terms of user experience evaluation, one is mostly interested in identifying system
flaws [33]. A multimedia application or a system with such flaws might cause undesirable
activation of the ANS, which is associated with behavioral effects known as the "fight or flight"
response or stress [24, 60]. Stress is defined as a state transition from calmness to high arousal,
accompanied by biochemical, physiological and behavioral changes that serve to preserve an
organism's integrity [3]. Although stress is often related to negative experience, it may also be
beneficial in some cases by providing an appropriate boost to someone (e.g. to meet an important
deadline for a report submission, or to solve an exercise during an exam). However,
frequent or daily exposure to stressors is a precursor of chronic stress, which can badly affect
people's health [1]. Beyond health issues, stress may also affect users' performance [16], and its
presence in interactive computer environments is likely to be interpreted as a user experience issue.
Thus, assessing stress is particularly important in this context, and this is the goal of this paper.
Approaches for measuring stress can be distinguished into non-intrusive and intrusive
techniques. Non-intrusive techniques are mainly based on self-reporting instruments, such as
the Daily Stress Inventory [7], the NASA Task Load Index (NASA-TLX) [21], the Stroop
color test [26], and the Situation Awareness Global Assessment Technique (SAGAT) [19].
Although such techniques are straightforward and do not require the use of any special
equipment, they have been criticized for lack of objectivity, extra cognitive load and memory
recall problems [24, 39]. In an attempt to overcome these problems, other stress measurement
approaches capture and analyze users' observed behavior in a non-invasive and continuous
way by means of pressure-sensitive keyboards [24, 41].
One particularly popular non-intrusive tool for emotion assessment is the two-dimensional
(2D) Affect Grid [51]. The Affect Grid is an effective and easy-to-use tool for measuring emotion
[28], which is based on the theoretical circumplex model of affect, introduced by Russell
[49]. This tool requires participants to select a point on a 9 × 9 grid (see Fig. 1) that best
indicates their emotional state at a specific moment. The grid consists of two dimensions: the
horizontal valence axis (displeasure-pleasure) and the vertical arousal axis (sleepiness-arousal).
For example, if someone feels neutral, then the middle square of the grid (coordinates = 5, 5) is
expected to be selected. The Affect Grid tool has been used in a variety of research fields.
Regarding the multimedia field, the Affect Grid has been used in the emotional evaluation of
both multimedia environments and content. Specifically, [12, 54] used the 9 × 9 tool in order to
GSR and other physiological signals have been used in order to measure stress in users of
multimedia applications and entertainment technologies [43, 44, 57, 60]. Such studies are
designed to induce intense reactions in users. However, recognizing stress in subtle interaction
events [59], which are typically expected in most interactive computer environments, remains
challenging. This paper studies stress measurement in such subtle interaction events.
3 Experimental details
Inducing emotion in a laboratory environment is particularly challenging. The stimuli must be
carefully designed or selected in order to trigger the appropriate arousal levels of the ANS. In
addition, stimuli should be realistic enough and free from any researcher bias. To this end,
many methods [30, 38, 44, 53] have been used to induce emotions. However, they
all rely on intense contexts, such as viewing movie clips, listening to songs, experiencing
major hardware/software failures, viewing images of specific databases and playing games.
Thus, recognizing emotions in subtle interaction events [12], which are typically expected in
most interactive computer environments, remains rather unexplored.
To the best of our knowledge, a stimuli dataset that relies on such tasks does not exist.
Hence, this paper also presents the stimuli selection approach that was followed. Stimuli
selection was based on a face-to-face interview process and extensive pilot-testing. To this end,
15 typical computer users (University employees, students, and friends) participated on a
voluntary basis and were asked to report stressful computer tasks. Interviews took place
on the premises of the Hellenic Open University, and each session lasted from 15 to 20 min per
interviewee. First, demographic information (e.g. age, skills in computer usage, profession,
education etc.) was recorded. Next, interviewees were asked to describe at least five stressful
computer tasks. Interviewees were neither informed about nor participated in the main stress
monitoring experiment, which is described below. The scenarios provided by the
interviewees did not require any special skills or experience in computer usage.
Subsequently, the collected interview data were analyzed. First, similar answers from
participants were grouped and a frequency table was created. Frequency analysis did not reveal any
significant differences due to demographic parameters. Next, starting from the most
frequently-mentioned answer, appropriate interaction scenarios were designed and pilot-tested.
Although interaction scenarios involving financial transactions and viruses were commonly
reported by interviewees, pilot-testing led us to exclude them in order to ensure the
ecological validity of the study. For instance, pilot-testing participants did not find a
financial transaction with a wrong charge on a provided credit card stressful. In addition,
the interaction scenarios had to require minimum typing effort in order to avoid any noise
during signal recording. In the end, the five most commonly reported scenarios that also met
the aforementioned criteria were selected. The final
interaction scenarios are elaborated in the following section.
of the Hellenic Open University (http://meae.eap.gr). This website was selected in order to
avoid any previous interaction experience. In a three-step scenario, participants were asked to:
a) navigate the website and find a specific file, b) download and save the file to a specific
network folder and c) log in to a provided Google email account and send the file to a provided
email address. Login credentials were already saved on the test PC in order to avoid extra
motor and mental effort. While participants were busy creating the email, experiment
facilitators deleted the downloaded file remotely.
chosen due to complaints about its information architecture that had been collected in a
previous study.
Figure 2 illustrates the differences in skin conductance levels of a randomly-selected
participant during baseline and the first interaction scenario. An increase in the participant's
arousal, as measured by skin conductance, is evident.
[Figure: the Affect Grid, a 9 × 9 grid of numbered cells. Vertical axis: Sleepiness (bottom) to High Arousal (top). Horizontal axis: Unpleasant Feelings (left) to Pleasant Feelings (right).]
asked to provide subjective ratings of their emotional experience using the Affect Grid tool.
The Google Forms service was used to implement the Affect Grid tool and collect participants'
responses (Fig. 3). Skin conductance was not recorded during the breaks and the self-assessment process.
Fig. 4 Participants' ratings in the Affect Grid tool for all stressors. Numbers inside bubbles correspond to the
total number of participants that selected the specific pair of VA values. The minimum and maximum values for
each axis are 1 and 9, respectively
Fig. 5 Participants' ratings in the VA space for all stressors. Two representative examples of regions defined by
Eq. 1 are shown. Numbers inside bubbles represent how many participants selected the specific pair of VA
values. The minimum and maximum values for each axis are 1 and 9, respectively
The exploration of the VA space started from defining R(3, 6), a rather small region in the
upper left corner of VA, which was iteratively expanded horizontally, vertically and diagonally
as far as R(5, 4) (see Fig. 5). Next, each region was associated with the corresponding
participants' physiological signals. Afterwards, six popular classifiers were used in order to test
the classification accuracy in each region. Previous results [36] showed that the regions R(3,
6), R(3, 5) and R(3, 4) achieved the best classification accuracies. However, stress region(s) in the
VA space may not be rectangular and this was a reported limitation of our previous work. In
this paper we explore a new algorithmic approach for segmenting the VA space into non-rectangular regions in order to refine the stress region.
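As an illustration of the rectangular approach, the following minimal Python sketch enumerates the blocks of a region R(v, a). Since Eq. 1 is not reproduced in this text, the definition used here is an assumption drawn from the description above: R(v, a) is taken to span valence 1..v and arousal a..9, i.e. a rectangle anchored at the upper left corner of the grid. The function names are hypothetical, and blocks without ratings are not excluded.

```python
from typing import List, Sequence, Set, Tuple

Block = Tuple[int, int]  # (valence, arousal), each in 1..9


def rectangular_region(v: int, a: int, size: int = 9) -> Set[Block]:
    """Blocks of R(v, a) under the ASSUMED definition:
    valence 1..v and arousal a..size (upper-left anchored rectangle)."""
    if not (1 <= v <= size and 1 <= a <= size):
        raise ValueError("v and a must lie on the grid")
    return {(x, y) for x in range(1, v + 1) for y in range(a, size + 1)}


def label_ratings(ratings: Sequence[Block], region: Set[Block]) -> List[str]:
    """Label each VA rating as 'stress' if it falls inside the region,
    otherwise as 'other emotion'."""
    return ["stress" if r in region else "other emotion" for r in ratings]


small = rectangular_region(3, 6)   # starting region of the exploration
large = rectangular_region(5, 4)   # largest region explored
print(len(small), len(large), small < large)
print(label_ratings([(2, 8), (7, 3)], small))
```

Under this assumption, expanding from R(3, 6) to R(5, 4) only ever adds blocks, which matches the description of an iterative horizontal, vertical and diagonal expansion.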
Fig. 6 A hypothetical stress region in the VA space produced by a non-rectangular approach. Numbers inside
bubbles represent how many participants selected the specific pair of VA values. The minimum and maximum
value for each axis is 1 and 9 respectively
Ri(v, a) and Ci is associated with participants' corresponding physiological signals. Afterwards, six popular classifiers, offered in the MATLAB R2015a Statistics and Machine
Learning Toolbox v10.0, are used to test the classification accuracy between regions. Classification results between Ri(v, a) and Ci are used to determine whether or not a specific Ri(v, a)
will be part of the final region. To this end, we have to predefine a classification threshold as the
termination criterion of the ISRC algorithm.
The following algorithm presents the final region construction process. The algorithm was
tested for six classification thresholds (i.e., 60, 65, 70, 75, 80 and 85 %) and the results are presented
in Section 5.2.
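The algorithm listing itself is not reproduced in this text. The following Python sketch is therefore only one possible reading of the description, not the ISRC algorithm as published. It assumes a greedy growth from the seed block (1, 9), one block at a time in the horizontal and vertical directions, where a block is kept only if the stress-vs-other classification accuracy of the enlarged region reaches the threshold. The `accuracy` callable is a hypothetical placeholder for the classifier evaluation.

```python
from typing import Callable, FrozenSet, Set, Tuple

Block = Tuple[int, int]  # (valence, arousal), each in 1..9


def neighbours(block: Block) -> Set[Block]:
    """Horizontally and vertically adjacent blocks that lie on the 9 x 9 grid."""
    v, a = block
    steps = [(v - 1, a), (v + 1, a), (v, a - 1), (v, a + 1)]
    return {(x, y) for x, y in steps if 1 <= x <= 9 and 1 <= y <= 9}


def grow_region(
    accuracy: Callable[[FrozenSet[Block]], float],
    threshold: float,
    seed: Block = (1, 9),
) -> Set[Block]:
    """Greedy region growing (an assumed scheme): a neighbouring block joins
    the region only if the accuracy of the enlarged region >= threshold.
    Stops when a full pass over the frontier adds no block."""
    region: Set[Block] = {seed}
    changed = True
    while changed:
        changed = False
        frontier = set().union(*(neighbours(b) for b in region)) - region
        for candidate in sorted(frontier):
            if accuracy(frozenset(region | {candidate})) >= threshold:
                region.add(candidate)
                changed = True
    return region


def toy_accuracy(region: FrozenSet[Block]) -> float:
    """Toy stand-in for a classifier: pretends accuracy is 80 % as long as
    the region stays inside valence <= 2 and arousal >= 8, else 50 %."""
    return 80.0 if all(v <= 2 and a >= 8 for v, a in region) else 50.0


print(sorted(grow_region(toy_accuracy, threshold=75.0)))
```

In a real run, `accuracy` would train and cross-validate one of the six classifiers on the signals labeled by the candidate region and its complement; the resulting region need not be rectangular.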
$$\frac{GSR(i) - GSR(min)}{GSR(max) - GSR(min)} \times 100,$$
where GSR(i) is the raw data, GSR(max) is the global maximum and GSR(min) is the global
minimum of the raw GSR per participant. Next, the normalized GSR signals were
smoothed using the Gauss adaptive smoothing function offered in Ledalab V3.4.8
(http://ledalab.de), a MATLAB application [4, 5] that supports electrodermal activity
analysis. The final dataset consists of 149 signals, which were used in the classification
process of the proposed ISRC algorithm; for two signals the corresponding participants' VA ratings
were not recorded.
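The per-participant min-max normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB code; the function name is hypothetical, and the handling of a constant signal (raising an error) is an assumption, since the source does not discuss that case.

```python
from typing import List, Sequence


def normalize_gsr(raw: Sequence[float]) -> List[float]:
    """Rescale one participant's raw GSR samples to the range 0..100
    using the participant's own global minimum and maximum."""
    lo, hi = min(raw), max(raw)
    if hi == lo:
        # Assumption: a flat signal cannot be normalized this way.
        raise ValueError("constant signal: max equals min")
    return [(x - lo) / (hi - lo) * 100.0 for x in raw]


print(normalize_gsr([2.0, 4.0, 3.0]))  # → [0.0, 100.0, 50.0]
```

Because the minimum and maximum are taken per participant, the normalized values are comparable across participants with different baseline conductance levels.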
From each smoothed GSR signal, 7 statistical features (i.e., mean, median, min, max,
standard deviation, minRatio and maxRatio) were extracted. The same statistics were extracted
from the first and the second differences of each signal. Thus, 21 statistical features were
extracted from each smoothed GSR signal. The extracted features were used to train six
classifiers offered in the MATLAB R2015a Statistics and Machine Learning Toolbox v10.0: a)
Linear Discriminant Analysis (LDA), b) Quadratic Discriminant Analysis (QDA), c) Linear
Support Vector Machine (L-SVM), d) Quadratic Support Vector Machine (Q-SVM), e) Cubic
Support Vector Machine (C-SVM), and f) k-Nearest Neighbors (k-NN). The classification
phase used a 3-fold cross-validation [31] approach with 100 simulations (i.e., 300
runs = 3 × 100).
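The feature extraction step (7 statistics × 3 series = 21 features) can be sketched as follows. The text does not define minRatio and maxRatio; this sketch assumes they are the number of local minima and local maxima divided by the number of samples, which is an assumption rather than the authors' stated definition. It also uses the sample standard deviation and requires at least four samples so that the second difference has two values. All function names are hypothetical.

```python
from statistics import mean, median, stdev
from typing import Dict, List, Sequence


def diff(x: Sequence[float]) -> List[float]:
    """First difference of a series."""
    return [b - a for a, b in zip(x, x[1:])]


def seven_stats(x: Sequence[float]) -> Dict[str, float]:
    """Mean, median, min, max, standard deviation, minRatio and maxRatio.
    ASSUMPTION: minRatio/maxRatio = count of strict local minima/maxima
    divided by the number of samples."""
    n = len(x)
    n_min = sum(1 for i in range(1, n - 1) if x[i - 1] > x[i] < x[i + 1])
    n_max = sum(1 for i in range(1, n - 1) if x[i - 1] < x[i] > x[i + 1])
    return {
        "mean": mean(x), "median": median(x), "min": min(x), "max": max(x),
        "std": stdev(x), "minRatio": n_min / n, "maxRatio": n_max / n,
    }


def gsr_features(signal: Sequence[float]) -> Dict[str, float]:
    """21 features: the seven statistics of the signal and of its
    first and second differences."""
    d1 = diff(signal)
    d2 = diff(d1)
    features: Dict[str, float] = {}
    for prefix, series in (("raw", signal), ("d1", d1), ("d2", d2)):
        for name, value in seven_stats(series).items():
            features[f"{prefix}_{name}"] = value
    return features


example = gsr_features([1.0, 3.0, 2.0, 5.0, 4.0])
print(len(example), example["raw_mean"], example["raw_maxRatio"])
```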
5 Results
The main objective of the present study is to identify non-rectangular stress regions in the VA
space by combining self-reported and physiological data. To this end, the following approach
was applied: First, 149 pairs of VA ratings were associated with participants' corresponding
GSR signals. Next, 21 statistical features were extracted from the preprocessed GSR signals and
were used in the classification process. In contrast to the rectangular approach that was
followed in [36], here the VA regions were algorithmically constructed based on the Affect
Grid's smallest unit of analysis: blocks of 1 × 1.
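The evaluation protocol used in the classification process, 3-fold cross-validation repeated 100 times (300 runs), can be sketched in plain Python. The classifier below is a simple nearest-centroid rule chosen only to keep the example self-contained; it is not one of the six MATLAB classifiers used in the study. The folds are not stratified and the seed is arbitrary; both are assumptions, as are all function names.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, label)


def k_fold_indices(n: int, k: int, rng: random.Random) -> List[List[int]]:
    """Shuffle 0..n-1 and deal the indices into k folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_centroid(train: List[Sample]) -> Callable[[Sequence[float]], int]:
    """Stand-in classifier: predict the label of the closest class centroid."""
    by_label = {}
    for x, y in train:
        by_label.setdefault(y, []).append(x)
    centroids = {y: [mean(col) for col in zip(*xs)] for y, xs in by_label.items()}

    def predict(x: Sequence[float]) -> int:
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))
    return predict


def repeated_cv(data: List[Sample], k: int = 3, repeats: int = 100,
                seed: int = 0) -> List[float]:
    """Return one accuracy (in %) per run: k folds x repeats runs."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        for fold in k_fold_indices(len(data), k, rng):
            held_out = set(fold)
            train = [s for i, s in enumerate(data) if i not in held_out]
            predict = nearest_centroid(train)
            hits = sum(predict(data[i][0]) == data[i][1] for i in fold)
            accuracies.append(100.0 * hits / len(fold))
    return accuracies


# Toy, clearly separable data: label 0 near the origin, label 1 near (10, 10).
toy = [([0.0, 0.1 * i], 0) for i in range(6)] + \
      [([10.0, 10.0 + 0.1 * i], 1) for i in range(6)]
runs = repeated_cv(toy)
print(len(runs), mean(runs))
```

The mean and standard deviation of the returned list correspond to the kind of "accuracy mean and SD" summary reported per classifier.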
Results of the proposed ISRC algorithm with a specific classification threshold (i.e., 75 %)
are presented in Section 5.1, in addition to a comparison with the previously-proposed [36]
rectangular approach. Section 5.2 evaluates the effect of the classification accuracy
threshold on the VA stress regions produced by all ISRC instances (classifiers). In Section
5.3, we employ the proposed ISRC algorithm on the regions R(3, 6), R(3, 5) and R(3, 4) that
achieved the best classification accuracies in [36] and compare the obtained results.
Fig. 7 VA stress region(s) identified per ISRC instance based on participants' skin conductance: (a) ISRC:LDA,
(b) ISRC:QDA, (c) ISRC:L-SVM, (d) ISRC:Q-SVM, (e) ISRC:C-SVM, (f) ISRC:k-NN. Regions colored
green represent the output of ISRC with classification accuracy at least 75 %. There were no available ratings or
signals for the red hatched blocks. Red frames represent the corresponding convex hull regions
Classifier   ISRC instance                      Rectangular approach
             Blocks  Signals  Accuracy (%)      Convex Hull  Blocks  Signals  Accuracy (%)
LDA          10      29       75.4 ± 2.5        R(6,5)       18      69       53.8 ± 3.5
QDA          11      35       75.6 ± 1.0        R(4,4)       16      58       56.5 ± 2.9
L-SVM        11      35       75.5 ± 1.0        R(4,4)       16      58       65.6 ± 2.1
Q-SVM        11      30       75.9 ± 2.0        R(4,4)       16      58       54.6 ± 3.3
C-SVM                22       75.1 ± 2.6        R(4,6)               27       69.7 ± 2.6
kNN          12      29       75.2 ± 2.5        R(5,3)       24      96       50.3 ± 3.4
Fig. 8 Orange blocks constitute the intersection of the corresponding stress regions that were produced from six
(6) different ISRC instances for various threshold accuracies
To this end, an experiment was conducted and 31 participants were asked to perform five
carefully selected stress-inducing interaction tasks. The Affect Grid tool was used in order to
collect self-reported data from participants. Participants' skin conductance was also recorded.
The stressful interaction scenarios were produced through a research-based approach: interviews with 15 typical computer users and extensive pilot-testing with participants who were
not involved in the main experiment.
Starting from the upper left point, (v, a) = (1, 9), of the Affect Grid tool, the first region
Ri(v, a) was defined and labeled as "stress". The remaining points in the VA space were defined as the
complementary region Ci, labeled as "other emotion". Next, the Ri region was expanded
horizontally and vertically with a step of one block, the Affect Grid's smallest unit of analysis,
in each direction. In this way, each unit of analysis (1 × 1) contributes to the construction of a
bigger region, which will probably not be rectangular. Afterwards, each region Ri(v, a) and Ci
was associated with the participants' corresponding physiological signals. Subsequently, six
popular classifiers, which constitute the core of the ISRC algorithm and are offered in the
MATLAB R2015a Statistics and Machine Learning Toolbox v10.0, were used to test the
classification accuracy between Ri and Ci.
Classifier   Rectangular                              ISRC                                     Increase in
             Blocks  Signals  Accuracy (Mean ± SD)    Blocks  Signals  Accuracy (Mean ± SD)    percentage points
LDA          10      37       72.4 ± 2.4                      26       78.1 ± 2.0              5.7
QDA          10      37       74.1 ± 1.2                      36       75.0 ± 1.0              0.9
L-SVM        10      37       74.1 ± 1.1              9       33       77.0 ± 0.7              2.9
Q-SVM        10      37       71.1 ± 2.5              9       26       79.1 ± 1.8              8.0
C-SVM        10      37       63.3 ± 3.0                      22       75.2 ± 2.2              11.9
kNN          10      37       64.4 ± 2.8                      22       79.7 ± 2.1              15.3
Our findings show which regions in the VA rating space may reliably indicate (with classification
accuracy from 60 to 85 %) self-reported stress that is in alignment with one's measured skin
conductance in the context of typical interactive applications. As a result, HCI and interactive multimedia
researchers and practitioners can employ the Affect Grid in their UEX evaluation studies,
knowing a priori that a specific VA region is associated with both perceived and physiologically experienced stress. One additional important contribution of this work is the proposed
approach for the empirical identification of affect regions in the VA space, which may be also
used for other emotions in the future.
One limitation of this work is that we did not employ any feature selection techniques,
which might improve the classification accuracies. Furthermore, our dataset included VA
blocks with no ratings, which constitutes an additional limitation of this study. Extra studies
are also required to ensure the generalizability of our findings. One of our immediate future
aims is to enlarge our dataset in order to investigate the effect (if any) of gender on the
identified stress region(s) in the VA space. Future work also includes investigating the reported
stress regions using additional physiological signals, such as blood volume pressure,
respiration and temperature. Finally, VA regions for other emotions might also be investigated
following the methodology described in this paper.
Table 3 Results for R(3, 5): rectangular vs. ISRC with classification accuracy threshold 75 %
Classifier   Rectangular                              ISRC                                     Increase in
             Blocks  Signals  Accuracy (Mean ± SD)    Blocks  Signals  Accuracy (Mean ± SD)    percentage points
LDA          12      41       70.4 ± 2.5              11      30       75.7 ± 2.0              5.3
QDA          12      41       70.7 ± 2.0                      36       75.1 ± 1.1              4.4
L-SVM        12      41       72.1 ± 1.3              10      34       76.4 ± 0.7              4.3
Q-SVM        12      41       69.5 ± 2.6              11      30       76.7 ± 2.0              7.2
C-SVM        12      41       62.8 ± 3.4                      22       75.2 ± 2.4              12.4
kNN          12      41       64.1 ± 2.8              10      26       75.9 ± 2.4              11.8
Classifier   Rectangular                              ISRC                                     Increase in
             Blocks  Signals  Accuracy (Mean ± SD)    Blocks  Signals  Accuracy (Mean ± SD)    percentage points
LDA          15      49       62.8 ± 2.9              11      30       75.7 ± 2.0              12.9
QDA          15      49       63.1 ± 2.5              11      36       75.1 ± 1.0              12.0
L-SVM        15      49       67.3 ± 1.6              10      36       75.0 ± 0.8              7.7
Q-SVM        15      49       63.1 ± 3.4              12      32       75.1 ± 1.9              12.0
C-SVM        15      49       54.9 ± 3.1                      22       75.2 ± 2.5              20.3
kNN          15      49       57.6 ± 3.1              11      28       75.0 ± 2.6              17.4
References
1. Anderson NB (1998) Levels of analysis in health science. A framework for integrating sociobehavioral and
biomedical research. Ann N Y Acad Sci 840:563–576
2. Barrett LF (1998) Discrete emotions or dimensions? The role of valence focus and arousal focus. Cogn Emot
12:579–599. doi:10.1080/026999398379574
3. Baum A (1990) Stress, intrusive imagery, and chronic distress. Health Psychol Off J Div Health Psychol Am
Psychol Assoc 9:653–675
4. Benedek M, Kaernbach C (2010) Decomposition of skin conductance data by means of nonnegative
deconvolution. Psychophysiology 47:647–658. doi:10.1111/j.1469-8986.2009.00972.x
5. Benedek M, Kaernbach C (2010) A continuous measure of phasic electrodermal activity. J Neurosci
Methods 190:80–91. doi:10.1016/j.jneumeth.2010.04.028
6. Boucsein W (1992) Electrodermal activity. Plenum University Press, New York
7. Brantley PJ, Waggoner CD, Jones GN, Rappaport NB (1987) A daily stress inventory: development,
reliability, and validity. J Behav Med 10:61–74
8. Cacioppo JT, Tassinary LG (1990) Inferring psychological significance from physiological signals. Am
Psychol 45:16–28
9. Calhoun BH, Lach J, Stankovic J et al (2012) Body sensor networks: a holistic approach from silicon to
users. Proc IEEE 100:91–106. doi:10.1109/JPROC.2011.2161240
10. Campbell JD, Chew B, Scratchley LS (1991) Cognitive and emotional reactions to daily events: the effects of
self-esteem and self-complexity. J Pers 59:473–505
11. Chanel G, Rebetez C, Bétrancourt M, Pun T (2011) Emotion assessment from physiological signals for
adaptation of game difficulty. IEEE Trans Syst Man Cybern Part Syst Hum 41:1052–1063.
doi:10.1109/TSMCA.2011.2116000
12. Chauncey A, Azevedo R (2010) Emotions and motivation on performance during multimedia learning: how
do I feel and why do I care? In: Aleven V, Kay J, Mostow J (eds) Intell. Tutoring Syst. Springer, Berlin, pp
369–378
13. Chung S, Cheon J, Lee K-W (2015) Emotion and multimedia learning: an investigation of the effects of
valence and arousal on different modalities in an instructional animation. Instr Sci 43:545–559.
doi:10.1007/s11251-015-9352-y
14. de Santos Sierra A, Avila CS, Guerra Casanova J et al (2010) Two stress detection schemes based on
physiological signals for real-time applications. In: 2010 Sixth Int. Conf. Intell. Inf. Hiding Multimed. Signal
Process. IIH-MSP. pp 364–367
15. Deaver CM, Miltenberger RG, Smyth J et al (2003) An evaluation of affect and binge eating. Behav Modif
27:578–599
16. Diamond DM, Campbell AM, Park CR et al (2007) The temporal dynamics model of emotional memory
processing: a synthesis on the neurobiological basis of stress-induced amnesia, flashbulb and traumatic
memories, and the Yerkes-Dodson law. Neural Plast 2007:60803. doi:10.1155/2007/60803
Alexandros Liapis is a Ph.D. candidate in the Hellenic Open University's School of Science and Technology.
His research interests include Human-Computer Interaction, Usability Evaluation and Physiological Signal
Analysis. Liapis graduated from the Department of Financial Applications at the Technological Institute of
Western Macedonia, and received his M.Sc. from the Department of Applied Informatics at the University of
Macedonia. Contact him at aliapis@eap.gr.
Christos Katsanos is a post-doctoral researcher in the Hellenic Open University's School of Science and
Technology and an adjunct professor at the Business Administration Department of the Technological Educational Institute of Western Greece. His research interests include Human-Computer Interaction, Web Accessibility, Information Architecture, Educational Technology and Human-Robot Interaction. Katsanos received his
Dipl.-Ing. and Ph.D. from the Department of Electrical and Computer Engineering at the University of Patras.
Contact him at ckatsanos@eap.gr.
Dimitris Sotiropoulos is an adjunct professor and post-doctoral researcher in the Hellenic Open University's
School of Science and Technology. His research interests include Machine Learning, Global Optimization,
Artificial Neural Networks, Interval Methods and Physiological Signal Analysis. Sotiropoulos received his B.Sc.
and Ph.D. from the Department of Mathematics at the University of Patras. Contact him at dgs@eap.gr.
Nikos Karousos is a post-doctoral researcher in the Hellenic Open University's School of Science and
Technology and an adjunct professor at the Technological Educational Institute of Western Greece. His research
interests include Hypertext, Service Oriented Architecture, Application Development, Design of Knowledge
Management Systems and Software Evaluation. Karousos graduated from the Computer Engineering & Informatics Department of University of Patras, and holds an M.Sc diploma and a Ph.D. diploma from the same
department. Contact him at karousos@eap.gr.
Michalis Xenos is a professor in the Computer Science Department of the School of Science and Technology of
the Hellenic Open University, Director of the Computer Science Course, Director of the Internal Assessment and
Education Unit, Director of the Software Quality Research Group and Director of the Software Quality
Laboratory. His current research interests include, inter alia, Software Quality, Human Computer Interaction
and Educational Technologies. Xenos received his B.Eng., M.Sc. and Ph.D. from the Department of Computer
Engineering & Informatics at the University of Patras. Contact him at xenos@eap.gr.