Duke Spook’em: A responsive fear modulation system in a horror game environment
ANONYMOUS AUTHOR(S)
Fig. 1. Duke Spook’em iteratively senses user affect, adjusts in-game tension, and measures the adjustment’s effects. To create
this system, we collected physiological data (electrodermal activity (EDA) and photoplethysmography (PPG)) to serve as privileged
information for use in a Learning Under Privileged Information (LUPI) machine learning paradigm. We also collected non-privileged
telemetry data, and users performed post-hoc affect annotation as ground truth. In our final game system, embedded in a procedurally-
generated horror game, the LUPI student model uses only non-privileged telemetry data to predict user affect with high accuracy
and guide their experience (top) towards a pre-authored tension curve (bottom).
Recent research on player modeling effectively employs machine learning algorithms to achieve impressive results. However, most
such approaches are impractical for real-world applications, usually for one of two reasons: they use intrusive sensory devices and
require their input at run-time [23], or they are not designed for real-time inference [44]. We introduce Duke Spook’em, an end-to-end
system which overcomes those limitations by combining state-of-the-art methodologies in data acquisition, affect modeling, and
prediction. Duke Spook’em closes the affective loop and creates a tailored, adaptive horror game experience which aims to maximize
player engagement. We first gather ground-truth data from 21 participants, leveraging unbounded affect annotation [39] together with
EDA and PPG biometric signals. We discuss the tonic driver of electrodermal activity (EDA) as a feature for predicting long-running tension.
We then produce an SVM ensemble trained under the Learning Under Privileged Information (LUPI) paradigm [65, 66], enabling
runtime predictions without the need for auxiliary equipment. Using 5-fold cross-validation, our model achieves accuracies of 80% for 2
classes, 74% for 3 classes, and 66% for 5 classes. In our final evaluation with 7 players, Duke Spook’em was shown to be more effective
than a corresponding rule-based system, making for a better and scarier experience.
CCS Concepts: • Human-centered computing → Interactive systems and tools; Empirical studies in HCI.
Additional Key Words and Phrases: affect, games, sensing, LUPI, machine learning, fear
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not
made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components
of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to
redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
© 2022 Association for Computing Machinery.
Manuscript submitted to ACM
CHI PLAY, October 10–13, 2023, Stratford, Canada Anon.
1 INTRODUCTION
Affective computing, a relatively new sub-field of computer science, has recently broken out into the
world of computer games. Recognizing the power and seemingly unlimited applications of this medium, researchers
break new ground each year. So-called "Serious Games" are widely utilized in education [31], mental health
and therapy [18], developing interpersonal communication skills [29], and even military training [62]. Their impact was,
is, and will continue to be widely studied and explored.
Regardless of its application, however, a video game’s power lies in its ability to fully immerse the user in its digital
world and evoke an emotional response, one that grows stronger with the player’s investment and attachment. And what
better way to improve that experience than to create virtual worlds specifically tailored to the player’s current
moods and needs? Researchers have used affective computing to inform and supplement game design [25, 52] and
to effectively model player experience as a function of game content and player behavior, enabling the generation
of personalized content [68]. 2022 also saw a commercially released product, The Anacrusis (a spiritual successor to
Left 4 Dead), leverage an AI-powered system to moderate gameplay. The system constantly tracks the intensity of
combat encounters as perceived by the player and adjusts them to feel rewarding without being overwhelming.
A system’s ability to respond to user emotions is, however, limited by its sensory capabilities. Many techniques for
recognizing emotion have been introduced in the literature over the years—examining voice patterns [20, 36, 50], facial
expressions [2, 17, 26], body gestures [7, 9, 30], or other physiological responses [35, 37, 56], as well as blending multiple
techniques [14, 19, 21]. All of these have shown significant promise in laboratory conditions, but quickly become
cumbersome when one wants to use them in a game. The sensory equipment involved not only lowers the quality
and enjoyment of the experience, but also makes it impossible to distribute the product to end users without requiring
them to have access to specific sensor hardware. Even then, quality variance in consumer equipment (even simple and
pervasive devices, like webcams) might heavily influence the results.
In this paper we present a solution which does away with those limitations, requiring specialized equipment only
during the game’s development and design process. We utilize the Learning Under Privileged Information (LUPI)
paradigm [65, 66] to train an SVM ensemble classification model. LUPI enables use of privileged information (in our
case, intrusive sensor data) to create a “teacher” model. This model’s output is then used to train a “student” model
that, crucially, does not have access to the privileged information. We gather information about users’ in-game actions
as input into the teacher model and contextualize them during training with measured Electrodermal Activity (EDA)
and Photoplethysmography (PPG). Eventually, the "student" model produced can predict users’ affective response
without access to their EDA and PPG signals, and efficiently use that information to drive gameplay elements and
maximize the desired effect. During runtime, our implementation uses a relatively small set of statistical data collected
from the game—e.g., players’ movements and actions—which transfer trivially to any first-person game and could be
generalized to third-person implementations with additional work. This approach sacrifices some accuracy—we do
not know specifically when a player meets a particular enemy or event in our game—but in exchange the technology
becomes far more portable across games. In both first-person horror games and walking
simulators, players move around: we derive most of our input data from players’ movement-related behavior.
But, of course, a model is only as good as its training data. First, we discuss affect labels. Researchers have proposed
a variety of methods for collecting user affect data, which we discuss in more detail in Section 2.3. Ultimately, we
settled on using ordinal unbounded affect annotation for our data collection. Our work represents the first time that
this technique has been tested at any scale outside of the original work, and we also discuss how it performs as ground
truth for establishing long running tension as opposed to “event-based manifestations of tension and stress” [39]. We
performed a data gathering experiment with 21 participants, where we asked them to play a horror game demo, watch
a video capture of the gameplay, and then annotate their experience. We modify the testing protocol of Lopes
et al. [39]—longer play sessions, a focus on building tension rather than “jump-scares”—as we aim to moderate
subject emotion over longer periods of time. Our results show a high rank correlation between EDA and annotation,
demonstrating the annotation approach’s merit. We also demonstrate a high correlation with a previously under-explored feature:
the tonic EDA component, as explained in the next paragraph.
The other aspect of training data is the privileged information: the sensor input. As noted, we leverage both EDA
and PPG signals, which have been shown to correlate with user arousal. However, we also introduce and evaluate a
novel use of the EDA signal’s composition: while its phasic driver has been evaluated as an indicator of momentary
fear, we instead rely on its tonic driver as a cue to users’ experience of long-running tension. In our dataset, collected
with 21 users, we observed a rank correlation of 0.12 between the traditionally used phasic component and users’
self-reported affect. More interestingly, we show markedly better results for the tonic component: 0.46.
By adapting, merging, and improving these cutting-edge techniques and technologies, we present an implementation
of an end-to-end system we call Duke Spook’em: a pre-trained model for affect recognition (built under the LUPI
paradigm) is tightly coupled with a simple system running real-time content adjustment. Duke Spook’em predicts
affect, adjusts gameplay, and gathers feedback on whether said adjustments elicit the desired affective response. With
the loop of prediction and adjustment, we believe we are one step closer to closing the affective loop, and in our final
playtest, our 7 players agreed that their AI-assisted playthrough was more engaging. In addition, we had these players
annotate their playthroughs post hoc, and our model achieved an average arousal prediction accuracy of 67% on their data.
Our contributions are therefore:
(1) The first thorough evaluation of ordinal and unbounded affect annotation against accepted EDA measures [39, 70]
(2) Introduction and evaluation of the tonic component of an EDA signal’s decomposition as a predictor of long-
running tension in subjects
(3) The first practical, real-time application of the LUPI paradigm for prediction of player affect
(4) Duke Spook’em, an end-to-end system that iteratively senses player affect and adjusts gameplay content at
runtime to match a pre-authored target tension curve
2 AFFECT
Affect, whose definition we discuss below, has been studied by psychologists for decades; more recently, it has
been co-opted by computer science researchers in the study of affective computing. From there, researchers have
built affective loops in video games. There is only moderate agreement on precise definitions for this messy concept;
we discuss the definitions and touchpoints that underpin our work.
Fig. 2. Russell’s model of emotion, which we follow, alongside an arousal measurement: (a) the 2D valence-arousal model of emotion proposed by Russell [61]; (b) a trace of a subject’s self-reported arousal from our experiment.
The term "affect" is used to encapsulate the entirety of an individual’s affective experience (emotions, feelings,
moods, and more).
In this work we only deal with emotions—and a pretty narrow subset of them. As with the rest of the nomenclature,
how emotions should be classified is—to this day—a hotly debated topic. Overall, there is some agreement on the
existence of so-called "basic emotions", however their exact number and nature remain contested [27, 45]. Paul Ekman
isolates six of them: anger, fear, disgust, sadness, happiness, and surprise [16]. Robert Plutchik adds anticipation and joy
into the mix, arranging all the emotions into a color wheel: opposing emotions sit on opposite sides of the spectrum, and
their distance from the center is determined by intensity [59]. Ekman’s later work proposes a two-dimensional
scale, which Russell expanded into the valence-arousal model most commonly used today [61], where
valence is the positivity/negativity of an emotion (frustrated is low valence, happy is high) and arousal is the energy
of the emotion (calm is low arousal, excited is high). There exist numerous other models that either re-invent the
taxonomy or expand on pre-existing theories, such as the 3D Lövheim Cube of Emotion [42].
We rely on Russell’s valence-arousal model (see Figure 2a) and specifically focus on the emotion of fear. According to
the model, fear is located in the low-valence-high-arousal region (quadrant II). Given that our work is grounded in a
horror-game environment, we assume that players will naturally gravitate towards negative valence and thus we focus
only on modulating their arousal—trying to steer them towards the "sweet-spot of fear".
A long-standing goal in affective computing is to close the “affective loop” [6, 63]: a system that can work in the smooth cycle of eliciting, sensing, and then responding to the user’s
emotions. Video games are an ideal environment for this, as player input and behavioural data provide rich input,
while ever-improving output techniques can create entire responsive, parallel worlds. Players also seek out experiences
beyond purely positive emotions: they wilfully subject themselves to stressful or downright unpleasant situations to
experience deeper, more powerful involvement in the medium. This allows us to study a much wider spectrum of
emotions, such as tension, as we do here.
We gather preliminary training data which shows the tonic driver of EDA to be a good predictor of long-running tension, and which
evaluates the work of Yannakakis et al. on ordinal annotation [39, 70] in its first real-time prediction application. We
then use these findings to generate ground-truth data for training a LUPI-enabled ensemble SVM capable of multi-label
classification with results of 80%, 74% and 66% for 2, 3, and 5 levels of player arousal, respectively. We discuss a simple,
rule-based system in our horror game which consumes these classifications and moderates in-game content to steer the
player’s emotional response in a desired way—i.e., following a pre-authored tension curve. Lastly, we evaluate Duke
Spook’em in a blind test with 7 participants and show that it meaningfully and noticeably enhanced players’ experience.
3.1.1 Procedure. A researcher welcomed each participant, who was then helped to attach EDA and PPG sensors. The
participant sat in a standard chair and was asked to play a prototype of our horror game on a desktop computer. Sensors
were recorded at a 100 Hz sampling rate, and the user’s gameplay was also recorded at both the pixel and telemetry
levels. Play sessions lasted between 5 and 12 minutes. After the gameplay session concluded, users were asked to annotate
their experienced arousal during gameplay using a system similar to RankTrace [39], which showed a replay of their
gameplay footage while they used an unbounded affect annotation tool controlled by the up/down keyboard arrows
to mark whether they were becoming more or less aroused. There was no required frequency of annotation;
subjects were encouraged to indicate only changes in their arousal. Overall, players averaged 3.2 seconds between
annotations.
3.1.2 An aside on collecting affect ground truth. There is some dispute regarding annotating video game data. There is
the obvious technical limitation—players are already using an input device (mouse, keyboard, controller) to play, and
as such are unable to use yet another tool for live annotation. There is the possibility of verbal communication with
researchers, but this could disrupt the subject’s immersion and skew results. The de facto standard for games-related affect
research is having subjects annotate videos captured during their attempt [46, 47]. One could argue that remembering
emotion is not the same as experiencing it live and—while this is true—there is sufficient evidence that humans are
fairly good at remembering their emotional states. Most studies conclude that emotions actually enhance our ability to
remember details correctly [33]. Furthermore, negative emotions have a more beneficial influence on memory than
positive ones [32], which is perfect for our application.
Additionally, humans are better at discriminating between options than at rating them on an absolute scale [69].
According to adaptation theory, we keep a frame of reference for each stimulus and register stimuli as deviations
from that reference. This baseline reference changes over time—prolonged exposure to a certain emotion will eventually
make a subject less susceptible to impulses from the same category [24]. Studies show significant improvement in
inter-user agreement when using ordinal annotation, with less data needed in order to achieve satisfactory results [70].
Thus, as noted, we implemented a tool similar to Yannakakis’s RankTrace [39], which produces a “trace” of subjects’
perceived and recalled up/down changes in affect as they watch a replay of their game session. The annotation is
continuous and unbounded: users do not require an arbitrary frame of reference and only focus on changes in
their recalled emotional response.
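As a concrete illustration, the up/down key events can be folded into a continuous, unbounded trace by simple accumulation. The function below is our own minimal sketch (the function name, sampling rate, and step size are assumptions, not the tool's actual implementation):

```python
def accumulate_trace(events, duration_s, rate_hz=10, step=1.0):
    """Turn sparse annotation events into a dense, unbounded trace.

    events: (timestamp_seconds, delta) pairs, where delta is +1 (up
    arrow) or -1 (down arrow). Returns cumulative arousal sampled at
    rate_hz; the trace has no fixed bounds, mirroring unbounded
    annotation.
    """
    n_samples = int(duration_s * rate_hz)
    trace, level, ei = [], 0.0, 0
    events = sorted(events)
    for i in range(n_samples):
        t = i / rate_hz
        # apply every key press that happened up to this sample time
        while ei < len(events) and events[ei][0] <= t:
            level += step * events[ei][1]
            ei += 1
        trace.append(level)
    return trace
```

For example, `accumulate_trace([(1.0, +1), (2.0, +1), (4.0, -1)], duration_s=5)` rises twice and falls once; only changes are recorded, consistent with the protocol above.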
Fig. 3. An example of our annotation tool (window in the top-left corner) running, with an example recording of the experiment
playing in the background
3.1.3 Participants. We recruited 21 participants (7 female, 12 male) from our university and social networks, ranging
in age from 24–35. Participants all had previous experience with horror video games, and were not compensated for
their time.
3.1.4 Results. We used standard processing on the collected physiological data: we performed R-peak analysis of the
PPG signal resulting in 3 different heart rate measures (BPM, SDSD, RMSSD), and we decomposed EDA into tonic and
phasic drivers. As research suggests that the baseline response of EDA varies greatly from person to person [38, 41],
we also normalize the EDA signal and extract its gradient. Users’ annotated arousal traces were also normalized and
their gradients extracted, then binned into 2, 3, or 5 equal-width classes (used to train the different
classifiers). Initial analysis showed rank correlation values of 0.7–0.9 between the annotated values and measured EDA
response. We further extracted 9 distinct features from collected telemetry data, capturing players’ movement patterns
and high-level game state (see Table 1).
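The paper does not detail its decomposition procedure (established tools such as cvxEDA or Ledalab are typical); the sketch below conveys the idea only, using a simple moving-average baseline as the tonic estimate, together with the per-participant normalization step. All names here are our own illustration:

```python
import statistics

def decompose_eda(eda, fs=100, win_s=4.0):
    """Split an EDA signal into a slow tonic baseline (a moving
    average here, purely for illustration) and the fast phasic
    residual around that baseline."""
    half = int(win_s * fs / 2)
    tonic = []
    for i in range(len(eda)):
        lo, hi = max(0, i - half), min(len(eda), i + half + 1)
        tonic.append(sum(eda[lo:hi]) / (hi - lo))
    phasic = [x - t for x, t in zip(eda, tonic)]
    return tonic, phasic

def z_normalize(signal):
    """Per-participant z-normalization, controlling for the large
    between-person baseline differences in skin conductance."""
    mu = statistics.fmean(signal)
    sd = statistics.pstdev(signal) or 1.0
    return [(x - mu) / sd for x in signal]
```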
All the data—biometrics, annotation, gameplay features—was gathered together and further processed via the "sliding
window" method. We settled on windows of length 5 seconds with a step of 0.5 seconds; preliminary exploration showed
significant accuracy drops for shorter time frames, while longer windows introduced instability in our models. To reduce
each window to a manageable feature size and thus reduce overfitting, we extracted various statistical features
over each time series within the window. In particular, we took the average, maximum, minimum, and amplitude of
each of our 14 feature values (the 9 telemetry values and the 5 biometric values) within each window. This gave us 56
features to explore. We performed a similar process on the players’ annotations, taking average, minimum, maximum,
and amplitude in aligned windows of the same size. For annotation, we also considered the gradient (and its min, max,
and average) and the integral of the trace within each window.
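The windowing described above can be sketched as follows (a minimal illustration; the function and key names are ours):

```python
def window_features(series, fs=100, win_s=5.0, step_s=0.5):
    """Slide a 5 s window with a 0.5 s step over one time series and
    emit the four per-window statistics used for every feature:
    average, maximum, minimum, and amplitude (max - min)."""
    w, s = int(win_s * fs), int(step_s * fs)
    out = []
    for start in range(0, len(series) - w + 1, s):
        chunk = series[start:start + w]
        hi, lo = max(chunk), min(chunk)
        out.append({"avg": sum(chunk) / w, "max": hi,
                    "min": lo, "amplitude": hi - lo})
    return out
```

Applying this to each of the 14 time series yields the 56 features per window described above.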
From our 21 subjects, data from 18 was usable (the rest were discarded due to sensor and other malfunctions),
amounting to a total of 11,732 data points in our dataset (data points were then windowed as described). To test our
Feature   Description
𝑃𝑉        Player’s velocity vector
𝑃|𝑉|      Player’s velocity magnitude
𝑃𝐹𝑊𝐷      Player’s forward vector
𝑃𝐷𝑂𝑇      Dot product of the velocity and movement vectors
𝑃𝐶Δ       The delta of the player’s mouse input (X, Y in screen space)
𝑃𝐼        Raw input (keystrokes) during the frame
𝑃𝐷        Distance to the nearest in-game entity (NPC, monster)
𝑃𝐿        Types of objects the player looked at during the frame
𝑃𝑊        Number of walls that the player hit head-on
Table 1. Telemetry features extracted from the recorded play sessions. All data
was collected at runtime and processed after the experiment was over.
hypothesis that the EDA tonic component is a good predictor of long-running tension, we use the Pearson correlation
coefficient and Spearman’s rank correlation coefficient between the annotated ground truth and the two EDA signal
components. This follows the rationale of Yannakakis et al. [39] that EDA is a well-established and reliable manifestation
of fear and stress.
We observed promising results in the experiment—as presented in Table 2—showing high correlation between the
normalized signal of unbounded, ordinal annotation and EDA. We observed a rank correlation of 0.116 between the
phasic EDA component and annotated arousal, which is consistent with results reported by Yannakakis et al. [39].
What is more, we note better overall outcomes using the tonic driver over the phasic one: a rank correlation of 0.462.
We conclude that the ordinal, unbounded affect annotation method proposed by Yannakakis et al. is a valuable and
reliable tool in gathering ground truth data on users’ experience of fear, even in our game scenario with its focus on
long-running tension over jump scares. We also chose to move forward with the tonic component of EDA in our LUPI
model, given the strong Spearman rank correlation between it and the annotation delta in our users’ affect traces.
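For reference, Spearman’s coefficient is simply the Pearson correlation of the rank-transformed signals. A dependency-free sketch (in practice one would use `scipy.stats.spearmanr`):

```python
def rank(xs):
    """Average ranks (ties share the mean of their positions)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)
```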
• The LUPI teacher network is trained using only physiological sensor data—EDA and PPG signals. This network
follows the architecture of the previous two, with adjustments to reflect that it only receives biometric data.
• The student network is trained using telemetry as input and supplementing its loss function with predictions
from the teacher network. Architecture follows all the previous ones, with adjustments made to account for the
fact that the student does not receive the biometric signals.
We also tried fitting a variety of simpler models to our data, with a focus on SVMs as they are the model of choice in
Vapnik’s original LUPI work [66]. We believe them to be well-suited to our problem, as they are traditionally considered
to be one of the best predictors to use on small but complex data sets [10]. This also has the implicit benefit of allowing
us to follow the original work more easily.
We observed results similar to those of Yannakakis et al. with the pixel-based classifier [44]—averaging around 75%—but
it quickly became clear that while NNs were a good fit for the pixel-based model, they did not perform as well on our
telemetry-focused data set: we achieved only 52% accuracy for binary classification, and observed the networks to be
very unstable. The SVMs, however, had significantly higher accuracy (68–80% for 3-label classification), particularly
when used in an ensemble: with 20 parallel SVMs we achieved 80% for binary higher/lower arousal classification.
An ensemble of NNs would be prohibitively expensive to run, so we did not test this. We offer further detail on the
SVM-based models, below.
Fig. 4. A user’s annotation trace plotted against their tonic EDA response; panel (b) shows an example of a more erratic signal with visible peaks and valleys. We postulate that the tonic component of the signal is a good predictor of the long-term tension experienced by users.
We evaluated models at each granularity of classification. We perform 10 rounds of model training with 5-fold cross-validation each and report
averaged results; importantly, we ensured that each model was trained on exactly the same data splits.
We also tested different split methods—random split, stratified random split, even random split, withholding whole
participants for testing, and a split into time series based upon whether a player had moved sufficiently far from
their previously recorded position—to account for any implicit bias introduced by treating the data as time series,
but did not notice differences in their performance. Thus we report only the stratified random split here.
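One way to guarantee identical splits across models is to generate fold indices deterministically once and reuse them for every model. A minimal sketch (the seed and function names are our own illustration):

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=5, seed=42):
    """Deal each class's shuffled indices round-robin into k folds, so
    class proportions are preserved and, with a fixed seed, every
    model is trained and evaluated on exactly the same splits."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    return folds
```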
Even though the demo was designed as a non-linear experience, we observed players following similar patterns and
routes through the levels.
The pixel-based LUPI classifier performed very well, achieving average results of 75% on 2 classes, similarly to the
original paper. We note, however, that in our implementation the teacher network was prone to overfitting, which also
affected the student network. We quickly discovered that neural networks were not a good fit for our particular type of
data; even a non-LUPI model produced just 52% accuracy (barely better than chance) and showed instability which we
were unable to eliminate. As such, we explored various basic models and present the results on the 3-class task in Table
3. A particular star was the SVM, which achieved 81.5% accuracy, as compared to the Neural Net’s 41.5%. As SVMs
performed well and the original implementation of LUPI dealt exclusively with them, we settled on SVMs.
We note the slow performance of SVM models—both during training and inference—due to the high dimensionality
of our data. It is often recommended to train an SVM model on a representative subset of the data [49, 51]. We
verify the merit of that approach by training a number of classifiers on randomly selected subsets of our data set, each
consisting of 30% of the total data points. We observe an average accuracy of 69.2% across 5 distinct classifiers with an STD of
0.4 (see Table 4a)—all reported values are averaged over 10 iterations of a 5-fold cross-validation (5-CV) scheme, using
a stratified random split. Encouraged by the results, we trained a LUPI-enabled SVM classifier—which we henceforth
refer to as SVM+—in the same way. We forked and updated the svmplus library for Python—a faithful implementation
of Vapnik’s original work—to use the newest SKLearn API for this task. Accuracies ranged from 44–48.2% for SVM+
LUPI student models on a 3-class classification task (see Table 4b). Needless to say, this is not adequate for our task.
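The subset-training setup can be sketched as drawing one random 30% index subset per classifier; the actual model fitting (e.g. an sklearn `SVC`, or the SVM+ model) would then run on each subset. The helper below is our own illustration:

```python
import random

def draw_training_subsets(n_points, n_models=5, frac=0.30, seed=7):
    """Draw the random 30% index subsets on which the individual
    classifiers are trained (one subset per classifier); sampling is
    without replacement within each subset."""
    rng = random.Random(seed)
    size = int(n_points * frac)
    return [rng.sample(range(n_points), size) for _ in range(n_models)]
```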
Table 4. Training the support vector machine classifiers on data subsets to improve efficiency and prediction
time
We suspect that this lower accuracy can be attributed to the complex nature of the model—with the teacher’s loss
function encoded directly into the student, we increase the potential points of failure for the highly
transient and contextual affect data. The problems compound when we train the models on only data subsets. We thus
explore ensemble learning to alleviate those issues, while still retaining the benefits of faster training/prediction
compared to a single, large SVM [12]. We should note that normally we would not be so concerned with a model’s
runtime and could be more liberal in choosing the approach that gives the best results. However, Duke Spook’em is a live
inference system running alongside a video game on varying consumer-grade hardware, which brings practical runtime
concerns. We implement a simple SVM-based ensemble model with majority voting, testing how different numbers of
classifiers from 5–30 influence prediction runtime, and then we compare their accuracy on 2, 3, and 5 label classification
problems (see Table 5). This followed our usual protocol, and reported accuracies are averaged over running the 5-fold
cross validation scheme 10 times.
No. of classifiers      Average accuracy
in ensemble       2 classes   3 classes   5 classes
 5                  0.721       0.639       0.569
10                  0.784       0.681       0.654
20                  0.808       0.718       0.596
30                  0.801       0.741       0.666
Table 5. Results achieved by utilizing different numbers of classifiers in our ensemble model
Simple ensemble voting enabled us to consistently achieve an average accuracy of 80% on the binary classification problem,
74% for 3 classes, and 66% for 5 classes, using the largest ensemble size of 30 classifiers. While the accuracy of this
variant was superior, its runtime left a lot to be desired, taking over 2 seconds to produce a single prediction. As such,
we later utilize the 3-label variant of a 20-classifier ensemble—with sub-second prediction times and still-respectable
accuracies of 80%, 72%, and 60% for 2, 3, and 5 classes, respectively.
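The voting scheme itself is straightforward. A minimal sketch over any member classifiers exposing a `predict` method (a stand-in for the SVM/SVM+ models, not the paper's exact code):

```python
from collections import Counter

class MajorityVoteEnsemble:
    """Combine member predictions by simple majority vote; ties are
    broken by whichever label was counted first."""

    def __init__(self, members):
        self.members = members

    def predict(self, x):
        votes = Counter(m.predict(x) for m in self.members)
        return votes.most_common(1)[0][0]
```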
3.4.2 Creating the "action" space. While we have now solved the problems of sensing player arousal and understanding
what the target arousal is, we need to give Duke Spook’em some ammunition to scare the players with. We abstract
in-game events designed to change the fear response into "actions" that the system is allowed to take. To give game
designers creative control over Duke Spook’em’s behavior, we store information about any available actions in a
pre-authored knowledge base—a text file whose contents follow simple semantics. Each action is represented by (see
Figure 6a):
(1) Action: A descriptive name, later used as an ID
(2) Ante: A list of prerequisites (other actions) for the action to be available
Fig. 5. Examples of tension curves: (a) an example from popular culture, showing how tension is built throughout the running time of Star Wars Episode IV: A New Hope; (b) the pre-authored curve of our game experience—we assume that tension will follow a sinusoid-like function between certain story beats and that its baseline will be driven up by every milestone the player completes.
(3) Weight: A weight parameter, arbitrarily assigned by the designer. Weights are the determining factor when
multiple actions are available at the same time. The one with higher weight will be considered first.
(4) Expected: Whether the action is expected to raise or lower the tension
(5) Conse: If a node contains both "Ante" and "Conse" fields, it is considered a logical operator determining the
availability of certain actions: "Conse" lists the action(s) made available once the node’s "Ante" is
fulfilled.
We were inspired by Haddawy’s work in Bayesian Networks (BN) generation [22] and we borrow the knowledge base
semantics directly from there. We also implement his algorithm for generating Bayesian Networks from pre-existing
knowledge bases, but we make no assumptions about the probability distribution of our actions and only use the
generated network as a graph structure for easy determination of actions available at any given moment. We use the
graph generation algorithm, but forgo the actual BN in our implementation, as it was an inadequate choice for
representing a game design model—such a model changes constantly throughout development, and we want to be able to extend the
action space easily, if needed. An example of generation can be seen in Figure 6, where we present a simple "knowledge
base" and an action graph constructed from it.
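To make the format concrete, here is a hedged sketch of parsing such a knowledge base. We assume `Field : Value` lines with blank lines separating entries, matching the listing in Figure 6a; the authors' exact file syntax may differ:

```python
def parse_knowledge_base(text):
    """Parse 'Field : Value' lines into a list of entry dicts; entries
    are separated by blank lines. Entries carrying both Ante and Conse
    act as logical operators; the rest describe actions."""
    entries, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            if current:
                entries.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        entries.append(current)
    return entries
```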
3.4.3 The memory model. We introduce a "memory model", based on a simple logging mechanism, to keep information
about actions taken by the system. Executed system actions are batched into 30 second increments in a queue; they
move further and further back and are forgotten after 2.5 minutes (see Figure 7). In each 30 second interval, we keep
track of the following information:
• What actions were taken, including the full description of the action with its expected result and weight
• An array of predicted arousal values (the inference model is queried at 1-second intervals)
Every 30 seconds, the data in the earliest bucket is discarded, everything left is moved one bucket over, and the most
recent one—now empty—starts being populated. We use the memory model mostly for adjusting action weights based
CHI PLAY, October 10–13, 2023, Stratford, Canada Anon.
Ante : PlayerFoundAccessCard
Conse : PlayerNearAnAccessPoint
(a) An example of what a pre-authored action knowledge base looks like. (b) A simple action graph generated from Listing 6a.
Fig. 6. Overview of how we generate actions to be considered by Duke Spook’em from a pre-authored knowledge base.
on whether they had the desired outcome and, if so, how long it took to elicit the response: actions are prioritized
(weight increased) if they elicited the desired response and deprioritized (weight decreased) otherwise.
Fig. 7. A visualisation of the memory model used by Duke Spook’em: actions taken by the system are
bucketed into 30 second increments, with actions taken more than 150 seconds ago forgotten.
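A minimal sketch of this bucketed memory, assuming a fixed five-bucket queue; the class and method names are ours, and only the 30-second/2.5-minute structure comes from the text:

```python
from collections import deque

BUCKET_SECONDS = 30
NUM_BUCKETS = 5  # 5 buckets x 30 s = 2.5 minutes of memory

class MemoryModel:
    """Bucketed log of system actions and per-second arousal predictions."""

    def __init__(self):
        # Oldest bucket on the left, newest (currently filling) on the right.
        self.buckets = deque(
            ({"actions": [], "arousal": []} for _ in range(NUM_BUCKETS)),
            maxlen=NUM_BUCKETS,
        )

    def record_action(self, action):
        """Log an executed system action into the current bucket."""
        self.buckets[-1]["actions"].append(action)

    def record_arousal(self, value):
        """Log a predicted arousal value (queried once per second)."""
        self.buckets[-1]["arousal"].append(value)

    def tick(self):
        """Called every 30 s: the oldest bucket falls off, a fresh one opens."""
        self.buckets.append({"actions": [], "arousal": []})
```

Because the deque has a fixed maximum length, appending a fresh bucket implicitly discards the oldest one, matching the "forgotten after 2.5 minutes" behaviour.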
3.4.4 Putting it all together. Duke Spook’em operates in a constant loop of prediction-adjustment-feedback, trying to
steer the subject to follow the sine curve of expected arousal. This curve is assumed to be a sine-like function partitioned
into sections by milestones—pre-authored story beats within the game that are consistent across playthroughs. We
assume that crossing a milestone into another section increases baseline tension, thus shifting the function up (see
Figure 5b). Arousal is predicted by the inference model at 1 second intervals, while the time between actions is a
configurable parameter: in the gameplay demo we prepared for evaluation it also ran every second (see Algorithm 1).
The feedback loop itself is not concerned with milestones and the increases in baseline tension following them—this is
reflected in the evolving set of actions, which are designed to appear in increasing order of intensity. Duke Spook’em
Duke Spook’em: A responsive fear modulation system in a horror game environment CHI PLAY, October 10–13, 2023, Stratford, Canada
attempts to steer the player response within a single section between milestones to match the sine-like curve and—when
a new milestone is reached—the system is turned off for a brief moment, allowing players an opportunity to go down to
their baseline. This baseline is assumed to have risen—following adaptation theory [24]—and when the system resumes,
its loop treats the new baseline as the lowest arousal label. This framing ensures that we can meaningfully moderate a
long-running experience with a relatively low-resolution classifier.
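The loop described above can be sketched as follows. This is our reading of the text, not the paper's code: the curve period, amplitude, and per-milestone baseline step are illustrative values, and `predict` stands in for the inference model.

```python
import math

def target_arousal(t, section_start, baseline=0.0, period=60.0, amplitude=1.0):
    """Sine-like target curve within a section; all numeric parameters
    here are illustrative, not values from the paper."""
    return baseline + amplitude * math.sin(2 * math.pi * (t - section_start) / period)

def run_director(predict, milestones, duration, action_interval=1):
    """One-second prediction ticks; at each milestone the section restarts
    and the assumed baseline shifts up. Returns the log of adjustment
    directions requested from the action selector."""
    log = []
    section_start, baseline = 0, 0.0
    for t in range(duration):
        if t in milestones:
            # Milestone crossed: skip a beat, treat the (risen) baseline
            # as the new lowest arousal level for the next section.
            section_start, baseline = t, baseline + 0.2
            continue
        arousal = predict(t)
        target = target_arousal(t, section_start, baseline=baseline)
        if t % action_interval == 0:
            log.append("UP" if arousal < target else "DOWN")
    return log
```

For example, `run_director(lambda t: 0.0, milestones={5}, duration=10)` yields one UP/DOWN decision per non-milestone second.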
There are numerous ways in which the system adjusts its action selection algorithm, scattered across different stages
of the loop. We present a brief overview of the methods that contribute to the adaptability of Duke Spook’em’s
director capabilities:
• Moderated action space: only a subset of actions (∼80%) will be loaded into memory on game start, giving every
playthrough the opportunity to be realized under unique circumstances. Additionally, with each milestone new
actions become available and some of the old ones are discarded.
• The memory model: following the well-established reasoning that exposure to a given stimulus increases
our resistance to it [24], we adjust the weights of the available actions to reflect whether they have already been
experienced in recent memory.
• Desired response feedback: when an action is taken, it is committed to memory along with its expected
response (UP/DOWN). We track the action throughout its lifetime across our 5 memory stages and adjust its weight
for future selections in the following ways:
– If the fear response predicted by our inference model changes to meet the desired one, adjust the weight:
w_adjusted = w_current · (1 + 0.2 · F/(n_bucket + n_actions)), where F is a configurable fall-off parameter and n_actions is
the number of actions taken since the action in question. We thus reward quickly meeting the desired response
and penalize responses that are slow or possibly influenced by other actions taken since.
– If the fear response predicted by our inference model changes to the opposite of the desired one, adjust the weight:
w_adjusted = w_current · (1 − 0.2 · F/(n_bucket + n_actions)), where F is the fall-off parameter and n_actions is the number
of actions taken since the action in question. We penalize actions taking the opposite effect, but are more lenient
the further back in time they took place. We also take into account the number of actions taken since,
any of which could have actually caused the change.
– If the action reaches the end of its in-memory lifetime without any recorded change in affect: w_adjusted =
w_current · 0.95. We slightly penalize the action for having no effect, but not as severely as if it had provoked the
opposite one.
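Taken together, the three update rules can be collapsed into one function. A sketch under stated assumptions: the function name is ours, and the default value of the fall-off parameter F is an assumption (the paper only says it is configurable).

```python
def adjust_weight(w_current, outcome, n_bucket, n_actions, falloff=1.0):
    """Apply the three weight-update rules described above.

    outcome: "desired"  - the predicted fear response moved as intended,
             "opposite" - it moved the other way,
             "none"     - no recorded change over the action's lifetime.
    n_bucket is the action's memory bucket, n_actions the number of
    actions taken since it; falloff is the configurable parameter F.
    """
    if outcome == "desired":
        return w_current * (1 + 0.2 * falloff / (n_bucket + n_actions))
    if outcome == "opposite":
        return w_current * (1 - 0.2 * falloff / (n_bucket + n_actions))
    return w_current * 0.95  # no recorded effect: mild penalty
```

Recent, uncluttered responses (small n_bucket + n_actions) receive the largest boosts or penalties, mirroring the leniency towards older and possibly confounded actions described above.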
4 EVALUATION
To evaluate the performance of our system, we conducted a user study in which subjects were asked to play the entire
game in two variants: with and without the use of Duke Spook’em. The participants answered a questionnaire after
each playthrough and took part in a short free-form interview at the end of the study. We also asked each subject
to annotate the captured videos of their attempts—using methodology introduced in Section 3.1.2—to measure the
accuracy of our inference model on previously unseen data.
during their gameplay attempts. On each of those days we asked subjects to play one variant of our game without telling
them which variant they were given at the time. We did not disclose the exact purpose of the study and gave
participants no indication of whether or how the variants differed. The inference model implemented in the gameplay
demo used the 3-label classification variant of a 20-model SVM+ ensemble. As in the previous study, we asked
participants to annotate their recorded gameplay videos after their playthroughs using our RankTrace-inspired interface.
We also recorded the actions taken by the system—along with their weights and predicted effect—and the system’s
second-by-second inferences of player affect.
4.1.1 Participants. We recruited 7 participants for this study (4 male, 3 female), aged 25–26 years.
In contrast to the previous experiments, where we aimed to create as diverse a population under test as possible,
here we deliberately recruited subjects from the same class with similar profiles with regard to video game and horror
media enjoyment. This was done to minimize subjective bias as much as possible and to allow us to draw comparisons
between different players more confidently.
the control scheme and the game environment. Overall, the results are promising: we reached an average accuracy of
approximately 67% on the Duke Spook’em variant and 64% on the simple variant when comparing the affect predicted at
runtime with players’ annotated arousal, as reported in the post-play interview. We are particularly pleased with the
nearly 70% accuracy achieved on the longest playthrough of almost 19 minutes. All of these findings suggest that our
model is a good predictor of long-running tension.
playthrough as "scarier" than they did the regular version. 2 subjects reported equal scores, and the final subject
rated the Duke Spook’em variant lower.
Users also projected sentience onto elements of the game that were not part of the Duke Spook’em system, but that
instead either followed very simple rules or were completely random (such as the monster and how/when it appeared).
6 CONCLUSION
This paper presented Duke Spook’em: an end-to-end system combining the findings of cutting-edge research in the field
of video game affect modelling and putting them to effective use. Duke Spook’em closes the affective loop in a horror video
game by implementing a constant predict-adjust-feedback loop based on arousal prediction, leading to increased player
satisfaction. Along the way, we discussed and evaluated affect ground truth via ordinal and unbounded annotation
and introduced a novel physiological marker of long-running tension: the tonic component of the EDA signal. Our
experiment showed a rank correlation of 0.46 between user-reported affect and tonic EDA, which gave us the confidence
to use the signal as privileged information for training a LUPI-enabled SVM ensemble; this is now the backbone
of Duke Spook’em. Our end-to-end study showed that Duke Spook’em made for a more immersive and scarier experience.
We believe the work presented here lays out a course for designing robust affect detection and modulation systems
that can be practically applied in the real world.
REFERENCES
[1] 2022. Crowdsourcing research questions in science. Research Policy 51, 4 (2022), 104491. https://doi.org/10.1016/j.respol.2022.104491
[2] K. Anderson and P.W. McOwan. 2006. A real-time automated system for the recognition of human facial expressions. IEEE Transactions on Systems,
Man, and Cybernetics, Part B (Cybernetics) 36, 1 (2006), 96–105. https://doi.org/10.1109/TSMCB.2005.854502
[3] Amina Asif, Muhammad Dawood, and Fayyaz ul Amir Afsar Minhas. 2018. A generalized meta-loss function for distillation and learning using
privileged information for classification and regression. CoRR abs/1811.06885 (2018). arXiv:1811.06885 http://arxiv.org/abs/1811.06885
[4] Mahsa Bagheri and Sarah D Power. 2020. EEG-based detection of mental workload level and stress: the effect of variation in each state on
classification of the other. Journal of Neural Engineering 17, 5 (oct 2020), 056015. https://doi.org/10.1088/1741-2552/abbc27
[5] Mathias Benedek and Christian Kaernbach. 2010. A continuous measure of phasic electrodermal activity. J Neurosci Methods 190, 1 (May 2010),
80–91.
[6] Daniel R. Bersak, Gary McDarby, Daragh McDonnell, Brian McDonald, and Rahul Karkun. 2001. Intelligent Biofeedback using an Immersive
Competitive Environment.
[7] Nadia Bianchi-berthouze and Andrea Kleinsmith. 2003. A categorical approach to affective gesture recognition. Connection Science 15, 4 (2003),
259–269. https://doi.org/10.1080/09540090310001658793 arXiv:https://doi.org/10.1080/09540090310001658793
[8] Wolfram Boucsein. 2012. Electrodermal activity. Springer Science & Business Media.
[9] Ginevra Castellano, Santiago D. Villalba, and Antonio Camurri. 2007. Recognising Human Emotions from Body Movement and Gesture Dynamics.
In ACII.
[10] Jair Cervantes, Farid Garcia-Lamont, Lisbeth Rodríguez-Mazahua, and Asdrubal Lopez. 2020. A comprehensive survey on support vector machine
classification: Applications, challenges and trends. Neurocomputing 408 (2020), 189–215. https://doi.org/10.1016/j.neucom.2019.10.118
[11] Luca Chittaro and Riccardo Sioni. 2014. Affective computing vs. affective placebo: Study of a biofeedback-controlled game for relaxation training.
International Journal of Human-Computer Studies 72, 8 (2014), 663–673. https://doi.org/10.1016/j.ijhcs.2014.01.007 Designing for emotional wellbeing.
[12] Marc Claesen, Frank De Smet, Johan A.K. Suykens, and Bart De Moor. 2014. EnsembleSVM: A Library for Ensemble Learning Using Support Vector
Machines. Journal of Machine Learning Research 15, 4 (2014), 141–145. http://jmlr.org/papers/v15/claesen14a.html
[13] Cristina Conati and Heather Maclaren. 2009. Modeling User Affect from Causes and Effects, Vol. 5535. 4–15. https://doi.org/10.1007/978-3-642-
02247-0_4
[14] L.C. De Silva and Pei Chi Ng. 2000. Bimodal emotion recognition. In Proceedings Fourth IEEE International Conference on Automatic Face and Gesture
Recognition (Cat. No. PR00580). 332–335.
[15] Anders Drachen, Georgios Yannakakis, Lennart Nacke, and Anja Pedersen. 2010. Correlation between Heart Rate, Electrodermal Activity and Player
Experience in First-Person Shooter Games (Pre-print). https://doi.org/10.1145/1836135.1836143
[16] Paul Ekman. 1992. An argument for basic emotions. Cognition and Emotion 6, 3-4 (1992), 169–200. https://doi.org/10.1080/02699939208411068
arXiv:https://doi.org/10.1080/02699939208411068
[17] I.A. Essa and A.P. Pentland. 1997. Coding, analysis, interpretation, and recognition of facial expressions. IEEE Transactions on Pattern Analysis and
Machine Intelligence 19, 7 (1997), 757–763. https://doi.org/10.1109/34.598232
[18] Theresa M. Fleming, Lynda Bavin, Karolina Stasiak, Eve Hermansson-Webb, Sally N. Merry, Colleen Cheek, Mathijs Lucassen, Ho Ming Lau, Britta
Pollmuller, and Sarah Hetrick. 2017. Serious Games and Gamification for Mental Health: Current Status and Promising Directions. Frontiers in
Psychiatry 7 (2017). https://doi.org/10.3389/fpsyt.2016.00215
[19] N. Fragopanagos and J.G. Taylor. 2005. Emotion recognition in human–computer interaction. Neural Networks 18, 4 (2005), 389–405. https:
//doi.org/10.1016/j.neunet.2005.03.006 Emotion and Brain.
[20] Michael Grimm, Kristian Kroschel, Emily Mower, and Shrikanth Narayanan. 2007. Primitives-based evaluation and estimation of emotions in speech.
Speech Communication 49, 10 (2007), 787–800. https://doi.org/10.1016/j.specom.2007.01.010 Intrinsic Speech Variations.
[21] Hatice Gunes and Massimo Piccardi. 2007. Bi-modal emotion recognition from expressive face and body gestures. Journal of Network and Computer
Applications 30, 4 (2007), 1334–1345. https://doi.org/10.1016/j.jnca.2006.09.007 Special issue on Information technology.
[22] Peter Haddawy. 2013. Generating Bayesian Networks from Probability Logic Knowledge Bases. https://doi.org/10.48550/ARXIV.1302.6811
[23] Jason Matthew Harley. 2016. Chapter 5 - Measuring Emotions: A Survey of Cutting Edge Methodologies Used in Computer-Based Learning
Environment Research. In Emotions, Technology, Design, and Learning, Sharon Y. Tettegah and Martin Gartmeier (Eds.). Academic Press, San Diego,
89–114. https://doi.org/10.1016/B978-0-12-801856-9.00005-0
[24] Harry Helson. 1948. Adaptation-level as a basis for a quantitative theory of frames of reference. Psychological Review 55, 6 (1948), 297–313.
https://doi.org/10.1037/h0056721
[25] Eva Hudlicka. 2008. Affective computing for game design. 4th International North-American Conference on Intelligent Games and Simulation, Game-On
’NA 2008 (01 2008), 5–12.
[26] Spiros V. Ioannou, Amaryllis T. Raouzaiou, Vasilis A. Tzouvaras, Theofilos P. Mailis, Kostas C. Karpouzis, and Stefanos D. Kollias. 2005. Emotion
recognition through facial expression analysis based on a neurofuzzy network. Neural Networks 18, 4 (2005), 423–435. https://doi.org/10.1016/j.
neunet.2005.03.004 Emotion and Brain.
[27] Rachael E. Jack, Oliver G.B. Garrod, and Philippe G. Schyns. 2014. Dynamic Facial Expressions of Emotion Transmit an Evolving Hierarchy of
Signals over Time. Current Biology 24, 2 (2014), 187–192. https://doi.org/10.1016/j.cub.2013.11.064
[28] Myounghoon Jeon. 2017. Chapter 1 - Emotions and Affect in Human Factors and Human–Computer Interaction: Taxonomy, Theories, Approaches,
and Methods. In Emotions and Affect in Human Factors and Human-Computer Interaction, Myounghoon Jeon (Ed.). Academic Press, San Diego, 3–26.
https://doi.org/10.1016/B978-0-12-801851-4.00001-X
[29] Johan Jeuring, Frans Grosfeld, Bastiaan Heeren, Michiel Hulsbergen, Richta Ijntema, Vincent Jonker, Nicole Mastenbroek, Maarten van der Smagt,
Frank Wijmans, Majanne Wolters, and Henk Zeijts. 2015. Communicate! — A Serious Game for Communication Skills —. 9307 (01 2015), 513–517.
https://doi.org/10.1007/978-3-319-24258-3_49
[30] Asha Kapur, Ajay Kapur, Naznin Virji-Babul, George Tzanetakis, and Peter Driessen. 2005. Gesture-Based Affective Computing on Motion Capture
Data. Affective Computing and Intelligent Interaction 3784, 1–7. https://doi.org/10.1007/11573548_1
[31] Nuri Kara. 2021. A Systematic Review of the Use of Serious Games in Science Education. Contemporary Educational Technology 13 (01 2021), ep295.
https://doi.org/10.30935/cedtech/9608
[32] Elizabeth A. Kensinger. 2007. Negative Emotion Enhances Memory Accuracy: Behavioral and Neuroimaging Evidence. Current Directions in
Psychological Science 16, 4 (2007), 213–218. https://doi.org/10.1111/j.1467-8721.2007.00506.x arXiv:https://doi.org/10.1111/j.1467-8721.2007.00506.x
[33] Elizabeth A. Kensinger. 2009. Remembering the Details: Effects of Emotion. Emotion Review 1, 2 (2009), 99–113. https://doi.org/10.1177/
1754073908100432 arXiv:https://doi.org/10.1177/1754073908100432 PMID: 19421427.
[34] Amjad Rehman Khan. 2022. Facial Emotion Recognition Using Conventional Machine Learning and Deep Learning Methods: Current Achievements,
Analysis and Remaining Challenges. Information 13, 6 (2022). https://doi.org/10.3390/info13060268
[35] K. H. Kim, Seok Won Bang, and S. R. Kim. 2006. Emotion recognition system using short-term monitoring of physiological signals. Medical and
Biological Engineering and Computing 42 (2006), 419–427.
[36] Kang-Kue Lee, Youn-Ho Cho, and Kyu-Sik Park. 2006. Robust Feature Extraction for Mobile-Based Speech Emotion Recognition System. Springer Berlin
Heidelberg, Berlin, Heidelberg, 470–477. https://doi.org/10.1007/978-3-540-37258-5_48
[37] Christine Lisetti, Fatma Nasoz, Cynthia Lerouge, Onur Ozyer, and Kaye Alvarez. 2003. Developing multimodal intelligent affective interfaces for
tele-home health care. Int. J. Hum.-Comput. Stud. 59 (07 2003), 245–255. https://doi.org/10.1016/S1071-5819(03)00051-X
[38] Yun Liu and Siqing Du. 2017. Psychological stress level detection based on electrodermal activity. Behav Brain Res 341 (Dec. 2017), 50–53.
[39] Phil Lopes, Georgios Yannakakis, and Antonios Liapis. 2017. RankTrace: Relative and unbounded affect annotation. 158–163. https://doi.org/10.
1109/ACII.2017.8273594
[40] David Lopez-Paz, Léon Bottou, Bernhard Schölkopf, and Vladimir Vapnik. 2015. Unifying distillation and privileged information. (11 2015).
[41] Erika Lutin, Ryuga Hashimoto, Walter De Raedt, and Chris Van Hoof. 2021. Feature Extraction for Stress Detection in Electrodermal Activity.
In Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 2: BIOSIGNALS. INSTICC,
SciTePress, 177–185. https://doi.org/10.5220/0010244601770185
[42] Hugo Lövheim. 2012. A new three-dimensional model for emotions and monoamine neurotransmitters. Medical Hypotheses 78, 2 (2012), 341–348.
https://doi.org/10.1016/j.mehy.2011.11.016
[43] Ilias Maglogiannis, Eirini Kalatha, and Efrosyni-Alkisti Paraskevopoulou-Kollia. 2014. An overview of Affective Computing from the Physiology and
Biomedical Perspective. 367–395. https://doi.org/10.1201/b17080-19
[44] Konstantinos Makantasis, David Melhart, Antonios Liapis, and Georgios N. Yannakakis. 2021. Privileged Information for Modeling Affect In The
Wild. CoRR abs/2107.10552 (2021). arXiv:2107.10552 https://arxiv.org/abs/2107.10552
[45] Suzan Mansourian, Jacob Corcoran, Anders Enjin, Christer Löfstedt, Marie Dacke, and Marcus C Stensmyr. 2016. Fecal-Derived Phenol Induces
Egg-Laying Aversion in Drosophila. Curr Biol 26, 20 (Sept. 2016), 2762–2769.
[46] David Melhart, Antonios Liapis, and Georgios N. Yannakakis. 2022. The Arousal Video Game AnnotatIoN (AGAIN) Dataset. IEEE Transactions on
Affective Computing 13, 4 (oct 2022), 2171–2184. https://doi.org/10.1109/taffc.2022.3188851
[47] David Melhart, Antonios Liapis, and Georgios Yannakakis. 2019. PAGAN: Video Affect Annotation Made Easy. 130–136. https://doi.org/10.1109/
ACII.2019.8925434
[48] Victor Motogna, Georgina Lupu-Florian, and Eugen Lupu. 2021. Strategy For Affective Computing Based on HRV and EDA. In 2021 International
Conference on e-Health and Bioengineering (EHB). 1–4. https://doi.org/10.1109/EHB52898.2021.9657654
[49] Sara Mourad, Ahmed Tewfik, and Haris Vikalo. 2017. Data subset selection for efficient SVM training. In 2017 25th European Signal Processing
Conference (EUSIPCO). 833–837. https://doi.org/10.23919/EUSIPCO.2017.8081324
[50] R. Nakatsu, A. Solomides, and N. Tosa. 1999. Emotion recognition and its application to computer agents with spontaneous interactive capabilities.
In Proceedings IEEE International Conference on Multimedia Computing and Systems, Vol. 2. 804–808 vol.2. https://doi.org/10.1109/MMCS.1999.778589
[51] Jakub Nalepa and Michal Kawulok. 2019. Selecting training sets for support vector machines: a review. Artificial Intelligence Review 52, 2 (01 Aug
2019), 857–900. https://doi.org/10.1007/s10462-017-9611-1
[52] Yiing Y’ng Ng, Chee Weng Khong, and Robert Jeyakumar Nathan. 2018. Evaluating Affective User-Centered Design of Video Games Using Qualitative
Methods. International Journal of Computer Games Technology 2018 (04 Jun 2018), 3757083. https://doi.org/10.1155/2018/3757083
[53] Binh T. Nguyen, Minh H. Trinh, Tan V. Phan, and Hien D. Nguyen. 2017. An efficient real-time emotion detection using camera and facial landmarks.
In 2017 Seventh International Conference on Information Science and Technology (ICIST). 251–255. https://doi.org/10.1109/ICIST.2017.7926765
[54] Pedro A. Nogueira, Vasco Torres, Rui Rodrigues, Eugénio Oliveira, and Lennart E. Nacke. 2016. Vanishing scares: biofeedback modulation of affective
player experiences in a procedural horror game. Journal on Multimodal User Interfaces 10, 1 (01 Mar 2016), 31–62. https://doi.org/10.1007/s12193-
015-0208-1
[55] W. Gerrod Parrott. 2004. The Nature of Emotion. Blackwell Publishing, Malden, 5–20.
[56] R.W. Picard, E. Vyzas, and J. Healey. 2001. Toward machine emotional intelligence: analysis of affective physiological state. IEEE Transactions on
Pattern Analysis and Machine Intelligence 23, 10 (2001), 1175–1191. https://doi.org/10.1109/34.954607
[57] R. W. Picard. 1995. Affective Computing.
[58] Rosalind W. Picard. 1997. Affective Computing. MIT Press, Cambridge, MA, USA.
[59] Robert Plutchik. 1980. Chapter 1 - A General Psychoevolutionary Theory of Emotion. In Theories of Emotion, Robert Plutchik
and Henry Kellerman (Eds.). Academic Press, 3–33. https://doi.org/10.1016/B978-0-12-558701-3.50007-7
[60] Sara Pourmohammadi and Ali Maleki. 2020. Stress detection using ECG and EMG signals: A comprehensive study. Computer Methods and Programs
in Biomedicine 193 (2020), 105482. https://doi.org/10.1016/j.cmpb.2020.105482
[61] James A. Russell. 1980. A circumplex model of affect. Journal of Personality and Social Psychology 39, 6 (1980), 1161–1178. https://doi.org/10.1037/
h0077714
[62] Andreja Samčović. 2018. Serious games in military applications. Vojnotehnicki glasnik 66 (07 2018), 597–613. https://doi.org/10.5937/vojtehg66-16367
[63] Petra Sundström. 2005. Exploring the Affective Loop.
[64] Konstantinos Tzevelekakis, Zinovia Stefanidi, and George Margetis. 2021. Real-Time Stress Level Feedback from Raw Ecg Signals for Personalised,
Context-Aware Applications Using Lightweight Convolutional Neural Network Architectures. Sensors (Basel) 21, 23 (Nov. 2021).
[65] Vladimir Vapnik and Rauf Izmailov. 2015. Learning Using Privileged Information: Similarity Control and Knowledge Transfer. J. Mach. Learn. Res.
16, 1 (jan 2015), 2023–2049.
[66] Vladimir Vapnik and Akshay Vashist. 2009. A new learning paradigm: Learning using privileged information. Neural Networks 22, 5 (2009), 544–557.
https://doi.org/10.1016/j.neunet.2009.06.042 Advances in Neural Networks Research: IJCNN2009.
[67] Mincheol Whang and Joasang Lim. 2008. A Physiological Approach to Affective Computing. 5 pages. https://doi.org/10.5772/6174
[68] Georgios Yannakakis and Julian Togelius. 2011. Experience-Driven Procedural Content Generation. Affective Computing, IEEE Transactions on 2 (07
2011), 147–161. https://doi.org/10.1109/T-AFFC.2011.6
[69] Georgios N. Yannakakis, Roddy Cowie, and Carlos Busso. 2017. The ordinal nature of emotions. In 2017 Seventh International Conference on Affective
Computing and Intelligent Interaction (ACII). 248–255. https://doi.org/10.1109/ACII.2017.8273608
[70] Georgios N. Yannakakis and Héctor P. Martínez. 2015. Grounding truth via ordinal annotation. In 2015 International Conference on Affective
Computing and Intelligent Interaction (ACII). 574–580. https://doi.org/10.1109/ACII.2015.7344627