
University of Dundee

Adaptive Subtitles
Gorman, Benjamin M.; Crabb, Michael; Armstrong, Mike

Published in:
CHI 2021 - Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems

DOI:
10.1145/3411764.3445509

Publication date:
2021

Document Version
Peer reviewed version

Link to publication in Discovery Research Portal

Citation for published version (APA):


Gorman, B. M., Crabb, M., & Armstrong, M. (2021). Adaptive Subtitles: Preferences and Trade-Offs in Real-
Time Media Adaption. In CHI 2021 - Proceedings of the 2021 CHI Conference on Human Factors in Computing
Systems: Making Waves, Combining Strengths (pp. 1-11). Article 733 (Conference on Human Factors in
Computing Systems - Proceedings). Association for Computing Machinery.
https://doi.org/10.1145/3411764.3445509

General rights
Copyright and moral rights for the publications made accessible in Discovery Research Portal are retained by the authors and/or other
copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with
these rights.

Take down policy


If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately
and investigate your claim.



Adaptive Subtitles: Preferences and Trade-Offs in Real-Time
Media Adaption
Benjamin M. Gorman, Bournemouth University, Bournemouth, England, UK (bgorman@bournemouth.ac.uk)
Michael Crabb, University of Dundee, Dundee, Scotland, UK (mzcrabb@dundee.ac.uk)
Mike Armstrong, BBC Research and Development, Salford, England, UK (mike.armstrong@bbc.co.uk)

ABSTRACT
Subtitles can help improve the understanding of media content. People enable subtitles based on individual characteristics (e.g., language or hearing ability), viewing environment, or media context (e.g., drama, quiz show). However, some people find that subtitles can be distracting and that they negatively impact their viewing experience. We explore the challenges and opportunities surrounding interaction with real-time personalisation of subtitled content. To understand how people currently interact with subtitles, we first conducted an online questionnaire with 102 participants. We used our findings to elicit requirements for a new approach called Adaptive Subtitles that allows the viewer to alter which speakers have subtitles displayed in real-time. We evaluated our approach with 19 participants to understand the interaction trade-offs and challenges within real-time adaptations of subtitled media. Our evaluation findings suggest that granular controls and structured onboarding allow viewers to make informed trade-offs when adapting media content, leading to improved viewing experiences.

CCS CONCEPTS
• Human-centered computing → Interaction paradigms.

KEYWORDS
Subtitles, Captions, Closed-captions, Media, Adaptive-Interfaces

ACM Reference Format:
Benjamin M. Gorman, Michael Crabb, and Mike Armstrong. 2021. Adaptive Subtitles: Preferences and Trade-Offs in Real-Time Media Adaption. In CHI Conference on Human Factors in Computing Systems (CHI ’21), May 8–13, 2021, Yokohama, Japan. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3411764.3445509

1 INTRODUCTION
Subtitles (or closed-captions¹) are used by viewers to help them understand and enjoy media content. A British Broadcasting Corporation (BBC) audience survey reported that 10% of television viewers in the UK use subtitles daily, and 6% use subtitles "most of the time" [4]. It has also been reported that 18% of all BBC iPlayer content (i.e. online media streaming) is viewed with subtitles on, with this increasing to over 20% for tablet users [1]. With such a large percentage of media being consumed with subtitles to assist in the viewing experience, it is important to understand the reasons behind this usage. Developing an understanding of why viewers enable subtitles may allow the overall viewing experience to be better tailored for audience members on an individual basis.

It has been suggested that 80% of subtitle users do not have a hearing impairment [49]. Despite this, research involving subtitles commonly focuses on their usage as an accessibility feature. In this work, we focus on understanding how viewers adapt their subtitle usage depending on the content they are watching and the context they are watching it in. Specifically, we aim to understand the challenges and opportunities within personalising subtitle interactions.

To achieve our aim, we first conducted an online questionnaire with 102 participants to explore subtitle interaction patterns and experiences. Our participants highlighted specific viewing challenges surrounding the language being spoken, accents, scene-context, and programme quality. Participants also described turning on subtitles for specific accents, actors, or content types, and that they had to interrupt their viewing to do so.

Previous work has focused on adaptive interfaces as a system-wide adoption. Based on our survey findings, we determined that the content being consumed is an additional factor that should be considered within such interfaces. With this in mind, we introduce a new approach called Adaptive Subtitles that allows subtitles to be adapted based on the viewers' individual preferences. To evaluate our approach, we created a system that gives the viewer control over which individual speakers have subtitles enabled. This allowed us to explore the opportunities and trade-offs that exist when allowing for real-time personalisation of media content.

Paper Contributions: This paper makes three contributions: First, we contribute online questionnaire data from 102 participants that provides an understanding of why people use subtitles, questioning the breadth of use cases that should be considered in their design. Second, we introduce Adaptive Subtitles, which allows real-time adaptation of subtitled content by the user, and make available sample code for how this can be implemented using modern web technologies through a second screen application. Third, we evaluated Adaptive Subtitles through a lab-based user study with 19 participants and introduce design considerations that outline the trade-offs involved when developing real-time media adaptations. For transparency, we provide anonymised participant data and project code as supplementary material attached to this work.

¹ Closed captions (CC) also provide a text description of sound effects. Most streaming sites only have the option for 'English [CC]' for English subtitles and therefore in this work we collectively refer to both as subtitles.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
CHI ’21, May 8–13, 2021, Yokohama, Japan
© 2021 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-8096-6/21/05...$15.00
https://doi.org/10.1145/3411764.3445509
CHI ’21, May 8–13, 2021, Yokohama, Japan Gorman, Crabb, and Armstrong

2 RELATED WORK
Subtitles are used to convey spoken dialogue and sound effects to the viewer within media content. Subtitles enable audiences to gain additional information about particular aspects of a film or television show that could relate to character identification, time markers, narrative summary, dialogue, and story commentary [14]. The most prominent use for subtitles is as an access service that enables people with hearing impairments to better understand media content. One of the main reasons for creating modern subtitled content is to assist viewers with hearing loss [16] and to prevent this group being "shut out" from broadcast media [38].

Despite subtitles as an access service [66] being one of the primary reasons for television content being subtitled, it is estimated that only 20% of people use subtitles for this reason [49]. There are many factors that determine whether an individual may choose to watch media content with subtitles turned on. Situational factors can influence why individuals may be unable to use traditional audio as the main method of understanding media content [17], and the reasons for using subtitles can be as unique as the individuals themselves that are using them [19]. Alternative reasons for subtitle usage may include characters mumbling, background noise in TV, watching in a loud environment, having to keep the sound low, and the use of unfamiliar words or accents [56]. Context, therefore, is key in understanding why an individual may, or may not, watch video content with subtitles enabled.

2.1 Impact of Subtitles on Viewing Experience
Subtitles are designed to have a positive impact within media; however, sometimes their inclusion can lead to a reduction in overall viewing experience. It has been suggested that when subtitles are present they can take up ~37% of a user's visual attention [12], and eye-tracking work has found that participants spent ~84% of their viewing time on the subtitles when watching media content [31]. It has been argued that the presence of subtitles within a movie can disengage the viewer from the experience of the film and have a negative impact on the overall enjoyment [60]. The presence of subtitles within 3D stereoscopic movies can negatively affect the visual comfort of a viewer [36], and the inclusion of subtitles in musical pieces can lower the amount of expression that a user perceives from a performance, with a possible justification for this being the multi-tasking that is required to listen to music and read subtitles at the same time [57]. Subtitles, therefore, have the potential to distract viewers if they are present when not required.

One method that can be used to alter the impact that subtitles have on overall viewing experience is to adapt their position within the media content. The traditional position for subtitles is at the bottom of the media being presented; however, subtitle placement can be changed to avoid obscuring content and to reduce distraction [16]. The display of subtitle text can also be adapted based on device size [29], and available space outside of the media content frame [19]. Subtitles can also be dynamically positioned [9], with this method showing potential in increasing the overall viewing experience of subtitled content [21] and also being important when considering placement in VR environments [28].

Despite efforts to create new methods of presenting subtitles, they can be distracting to viewers, with dwell time highest for those not using them as an access service [12]. Viewers must perform a complex series of steps for each new subtitle block that appears, and use a variety of communication channels concurrently whilst doing so [34]. However, despite the reduction in viewing experience, the inclusion of subtitles has positive benefits outside of their usage as an accessibility aid [16]. The presence of same-language subtitles may decrease cognitive load when used in an education setting [33] and it has been suggested they focus attention [33], which may be more important for lean-forward experiences [32].

2.2 Customising and Personalising Experiences
Developing experiences that match individuals' preferences is a complex task that involves understanding user needs [58]. The overall experience of using a system is not based on the system itself, but more on the individual that is using it at a given point in time [59]. Creating services that cater for specific user needs is not a domain where one size fits all, due to the ever-changing abilities that individuals may have when using a piece of technology and the environments that they may use these technologies in [26]. Personalisation as a method to increase the overall experience of a service is one that has shown promise in a number of media contexts.

Systems with adaptive accessibility require differing levels of user involvement. System-led adaptations typically involve user models being created that facilitate adaptations automatically for a user [44]. User-led adaptations involve the users themselves leading the adaptations that are taking place in a proactive manner [22]. Both of these methods are valid and their usage depends on the user, context of use, and complexity of the interface and interactions being adapted.

Acceptance of customised subtitled content is not based on comprehension or readability but on culture, habits, age, attitudes, and content [39]; more commonly known as factors relating to User Experience (UX) [47]. Comfort, rather than readability, has been suggested as a metric to use when creating guidelines for subtitle positioning [65] and, in this regard, participants respond positively when given the ability to personalise the position of subtitles when viewing online media [19]. The most common method of subtitle adaptation is based on language, with different subtitle tracks available to suit viewer preferences. The use of second-language subtitling can be used within an education setting to improve word recognition [40] but can also cause confusion between dialects [43]. Many online streaming services (e.g. Netflix [45]) also allow the user to customise how subtitles visually appear across content.

Personalising media content is a difficult task due to the number of interlinking steps involved within the media creation process. However, recent advancements in the use of Object Based Media (OBM) have changed the way that media production and consumption can be thought about [3]. OBM retains content as component parts, rather than rendering a finished artefact, and delivers these separately to the viewer. This allows media to be presented to the viewer in a personalised manner that takes into account individual needs whilst keeping overall viewing experience as a key concept in content delivery [2, 20].

2.3 Understanding Personalised Subtitles
The way that audiences consume media content has shifted towards interactive web-based players [48]. As such, viewers now expect a
personalised viewing experience across all aspects of media, including subtitles. Technology usage is not a one-size-fits-all domain [26] but many services do not embrace personalisation opportunities. Previous work has focused on adaptive interfaces as a system-wide adoption [67]; we take this concept and hypothesise that the content that is being consumed is an additional factor that should be considered. We question why the experience of watching subtitled content is constrained to a binary choice when the content [7], context [51], and abilities of viewers differ significantly.

Typically, research has focused on adapting the appearance of subtitles with regards to location [16], position [9], and text size [29]. However, even if you change how subtitles appear, if they are present all of the time, and especially when viewers don't need them, they can be distracting [12, 31, 60].

Taking the above into consideration, we formulated this paper's main research question as RQ1: "What challenges and opportunities exist when interacting with real-time personalisation of subtitled media content?". To answer this, we have the following aims:
Aim 1: To understand how people currently interact with subtitles. We achieve this by carrying out an online questionnaire to determine how people currently interact with subtitled media content.
Aim 2: To understand trade-offs and challenges that exist when allowing for real-time personalisation of subtitled media content. We achieve this by the design and evaluation of an interaction method that allows for real-time personalisation of subtitled content.

3 STUDY 1: QUESTIONNAIRE ON SUBTITLE USAGE PATTERNS
To explore the context around when and why people choose to use subtitles, we conducted an online questionnaire with people who self-reported regularly watching scripted media. There were four questions framing our questionnaire: 1) How often do people watch scripted entertainment (e.g., movies, documentaries)? 2) What services do people use to watch scripted entertainment? 3) What type of content do people turn subtitles on for? 4) Why do people use subtitles when watching specific types of content?

3.1 Design & Method
There were 24 questions across four sections. The first section contained nine questions that were used to gather basic demographic information: age, gender, highest level of education, level of computer literacy, and details surrounding the participants' hearing ability. The second section contained five questions and focused on participants' viewing frequency: "What devices do you watch scripted entertainment on?", "How often do you watch scripted entertainment?", "How many hours per day do you watch scripted entertainment?", "What services do you use to watch scripted entertainment?", "Alongside terrestrial TV and online streaming, do you use any of the following to watch scripted entertainment?". The third section contained 10 questions and focused on participants' subtitle usage: "Do you regularly watch scripted entertainment with subtitles turned on?", "What are the reasons you watch scripted entertainment with subtitles turned on?", "How often do you watch scripted entertainment with subtitles turned ON?", "How often do you watch scripted entertainment with subtitles turned OFF?", "Have you ever turned subtitles on for a specific show or type of content?", "If yes - What type of content do you turn subtitles on for?", "Have you ever needed to pause or stop watching a show because you couldn't hear what was being said on screen?", "If Yes, please explain:", "Have you ever had trouble understanding an accent on a programme?", "If Yes, please explain:".

Ethical approval for the questionnaire was obtained from our ERB. We distributed the questionnaire using social media (e.g., Facebook, Twitter), Reddit (r/samplesize), university mailing lists, and specific charities and organisations (e.g., RNID).

3.2 Participants
In total, 102 participants completed the questionnaire. Participants were aged between 18 and 86 (M = 29.57, SD = 13.18), with one not given. We used an open text field for gender: Male = 50, Female = 47, Transgender Guy = 1, Other = 1², Not Given = 3. Participants reported on their highest level of education: University (72 participants), College (9), High School (16), Other (4), and Not Given (1). Participants reported on their level of computer literacy: Excellent (80 participants), Good (19), Fair (3), and Poor (0).

In total, 24 participants self-reported having a hearing loss. Participants were asked to describe their hearing loss using textual descriptions provided by RNID [55]: Mild (14 participants), Moderate (5), Severe (4), and Profound (1). Participants also reported how long they had had a hearing loss. This was an open text field that was then categorised into '0-5 Years' (7 participants), '5-10 Years' (7), '10-15 Years' (0), '15-20 Years' (1) and '20 Years plus' (9). Participants were also asked to report the cause of the hearing loss. This was presented as checkboxes with an 'Other' field: Ageing (3 participants), Congenital (5), Viral Infection (4), Exposure to loud noise (6), Unknown (3), Ear Damage (1), Head Trauma (1), Otosclerosis (1), and Not Given (2). Participants reported if they used any assistive technology, with nine participants reporting using hearing aids.

4 QUESTIONNAIRE FINDINGS
Closed-ended questions are reported by frequency of responses. Open-ended questions were analysed independently using open coding [63], based on existing procedure [61, 62]. We used the following four-step process:
(1) Generating and collating initial codes: The lead author read all responses, taking note of initial codes. These were generated using a data-driven approach, collated, collapsed and developed into an initial codebook.
(2) Evaluating codes: Authors 1 and 2 independently coded 1/3 (randomly-selected) of the responses for each question using the initial codebook, agreeing to identify 'mentions' rather than giving a single code to each response. Codes and descriptions were then refined by discussing disagreements.
(3) Coding full data set: Authors 1 and 2 separately re-coded all responses with the updated codebook and rules.
(4) Defining themes: Authors reviewed final coding and identified similarities to allow thematic grouping. We did not calculate survey inter-rater reliability because codes were not the final outcome of our analysis [42].

² The response given by this participant is an internet meme that has previously been discussed as aggressive/transphobic [30] and is not reported on further in this work.
4.1 Viewing Frequency
Participants reported using a variety of devices to watch scripted entertainment: Television (78 participants) and Smart TV (42), Laptop (80), Smartphone (60), Desktop PC/Computer (39), Tablet (24), iPad (29), Overhead Projector (7), Other (4), and Not Given (1). The most common device participants reported using was either a TV or a Smart TV, collectively accounting for 32% of the reported devices. Participants reported using a variety of services to watch scripted entertainment: Netflix (82 participants), Amazon Video (45), Virgin Media (14), Freeview (14), NowTV (13), Terrestrial TV (12), Sky (8), Apple TV (6), Sky Go (4), Not Given (2). Additionally, participants reported if they used any of the following: 'On Demand TV' (59 participants), 'Recorded Programmes' (25), 'On the Go Live TV' (7).

Participants reported varying frequencies of watching scripted entertainment, with 50% of participants reporting that they watch scripted entertainment every day: Every day (51), every other day (34), once a week (5), once every 2 weeks (3), seldom (8), never (1). There was also variety in the number of hours that they reported watching, with the majority (55%) reporting watching for 1-2 hours each day: Less than an hour (21 participants), 1-2 hours (57), 3-4 hours (22), 5 or more hours (2).

4.2 Subtitle Usage
Participants were asked if they regularly watch scripted entertainment with subtitles turned on: Yes (69 participants), No (32), Not Given (1). Participants reported their frequency of watching scripted entertainment with subtitles ON: Daily (36 participants), 2-3 times a week (21), Once a week (10), 1-2 times per month (18), 1-2 times per year (9), Never (6), and Not Given (2). Participants also reported frequency of watching scripted entertainment with subtitles OFF: Daily (35 participants), 2-3 times a week (24), Once a week (11), 1-2 times per month (14), 1-2 times per year (3), Never (14), and Not Given (2).

Participants reported the reasons that they watched scripted entertainment with subtitles turned on, selecting all choices that applied: Helps me understand context (37 participants), Native language translation (35), Noisy viewing conditions (34), Media content has low sound quality (33), Use subtitles to reinforce language (30), Trouble understanding international accents (26), Trouble understanding regional accents (23), Quiet viewing conditions (22), Trouble understanding national accents (16), I have a hearing loss (14), Busy using another device (10), Other (18).

4.2.1 Subtitles to Assist in Understanding. Participants also reported if they ever needed to pause or stop watching a show because they could not understand what was being said on screen; 55% responded that they had experienced this problem: Yes (57 participants), No (29), Maybe (15), Not Given (1).
1) Personal Accessibility Factors: Participants described barriers to watching content due to accessibility issues. Most commonly this was due to participants, such as P15, stating that they "can't hear properly...and there's no subtitles so I couldn't understand them...". Although TV access for people with hearing loss has improved, there is still content that remains unwatchable due to a lack of subtitles, poor quality subtitles, and excessive background noise [48].
2) Accent Challenges: Participants reported if they had ever had trouble understanding an accent on a programme. The majority of participants stated that they experienced this problem: Yes (63 participants), No (37), Not Given (2). Participants described issues with accents belonging to specific people, actors, and characters. For instance, P9 described that "Game of Thrones has some characters, which are very hard to understand". Furthermore, other participants mentioned specific speakers being difficult to understand, such as P97 who commented that "accents like the one Big Narstie has, grime type" were difficult, and P37 stated they "watched a Netflix film with Charlie Hunnam [and] didn't understand one bit of dialogue".
Participants also described challenges with accents specific to individual countries. A wide variety of accents were mentioned, such as British by P13, or any accent different to their own, such as P19 who reported "...difficulty with international accents, regardless of country...because I don't hear them as often.".
3) Content Barriers: There were 28 mentions of barriers within content that led to participants needing to pause or stop watching the content. The most common barrier discussed focused on understanding actors speaking, such as P89 who reported they "...couldn't understand what the actors were saying (in my native language) so I turned on the subtitles and rewatched that sequence.". Both P74 and P22 mention the production quality being an issue. P74 describes that they have difficulty in understanding "Mumbling actors or bad sound! I often rewind to catch the sentence correctly..." and P22 stated that "sound quality or accents may be an issue."

4.2.2 Subtitles to Assist in Context. Participants reported if they turned on subtitles for specific types of content, with 53% of participants reporting that they do (55 participants), 44% reporting that they did not (45), and two participants not responding.
1) Context-specific Content: Participants described content where using subtitles provided additional context. For example, P62 described using subtitles for "quiz shows when the questions are asked very quickly.", and P97 described using subtitles for "Educational and difficult subjects where [they] might encounter new words/expressions".
2) Foreign Language Content: There were 25 mentions of participants using subtitles when watching content in a foreign language. For example, P102 described that they "prefer to watch foreign shows in their native audio language, accompanied by English subtitles".
3) Viewing Environment: There were 30 mentions of external factors leading to participants needing to pause or stop watching content because they missed dialogue. P87 describes how they "multitask often and [they] have poor attention span.". P98 stated that "Volume too variable...didn't want to make it louder (kids sleeping)". P42 discussed similar problems, such as when watching content on their commute and "...sometimes the train gets too loud and [they] need to rewind and put on subtitles."

4.3 Summary of Questionnaire Findings
Our survey findings demonstrate challenges that result in viewers actively enabling subtitles. Participants described interrupting their viewing to enable subtitles due to specific accents, actors, or content types. This is echoed by press articles that criticise actors mumbling [13, 35], with this coined mumblegate in the UK [27]. Whilst having subtitles on all of the time could resolve these issues, our participants reported that they did not use this approach. This may be due to the impact that subtitles have on viewing experience, summarised in our Related Work. Our survey participants were heavy
watchers of content whereas viewers outside of this demographic may have different reasons for using subtitles. Furthermore, our survey was conducted in early 2019, prior to the Covid-19 pandemic. As such, the frequency data we report around viewing and subtitle usage may not be representative of the current population.

We now explore an approach that can be used to transform subtitles from a binary (i.e., on or off) interaction to a personalised and adaptable experience that takes into account our survey results.

5 ADAPTIVE SUBTITLES
Our questionnaire findings highlight that people interact with subtitles based on environmental factors, challenges in content, and personal factors (e.g., tiredness and hearing loss). A key area that was discussed by participants surrounded interacting with subtitles due to challenges in understanding specific accents and actors, and alterations for specific content types. To investigate the potential that offering real-time personalisation of subtitles would have on these areas, we introduce a new subtitling approach.

Adaptive Subtitles is a media personalisation approach that can alter subtitle presentation on a subtitle-block level, contrasting with the content-level approach that is currently used. Our system is based on the principle of Object Based Media (OBM) [3] where content is retained as component parts, rather than a rendered finished artefact, and delivered separately to the viewer. This increases opportunities for adaptation and personalisation based on user needs and the context of use. OBM has previously been used to allow viewers to explore music at live events [52], recap on missed television episodes [20], and to enhance audio mixes [6].

Instead of subtitles being viewed as a single object within media, we propose that additional metadata is added to subtitle files, which is then used to create opportunities for adaptation surrounding words (e.g., names, locations), phrases (e.g., catchphrases), speakers/characters, accents, audio-descriptive elements, and scene composition. This moves subtitles from a single object to a structured set of atomic elements, following the guiding principles of OBM.

5.1 Implementation
Our survey findings demonstrate significant issues in understanding speakers due to content or accent challenges. We use this to motivate our implementation of an adaptive subtitle system that provides the viewer control over which characters have subtitles enabled/disabled. We chose this to take advantage of the working memory that is used when consuming subtitled content and the viewer correlation that must take place between subtitle track and on-screen speaker [37]. To test our concept we created a system consisting of two interfaces.

1) Adaptive Subtitles Controller Interface: This displayed cards with an image and corresponding speaker/character name that gave control over individual speakers (as shown in Figure 1.A). Cards were colour coded to match the subtitle text colour of each speaker (as shown in Figures 1.A and 1.B) and designed to work with portrait and landscape display options. The colours used for our subtitles are typical for terrestrial broadcast in the UK and follow guidance provided by Ofcom [48] and the BBC [16].
The controller interface displays a set of cards that correspond to each speaker in a clip. All speakers are initially in the 'off' state, and the images were given slight opacity to signify this to users [23]. To show/hide subtitles for a particular speaker, the user taps on the respective speaker's image, sending a socket.emit() event to the server. This triggers the TV interface to update style settings for the respective character, resulting in subtitles being shown/hidden.

2) Adaptive Subtitles TV Interface: Our Adaptive Subtitles TV interface allowed viewers to only see subtitles for the specific speakers enabled on the Adaptive Subtitles controller. The interface consisted of a video window with an overlaid subtitle container at the bottom middle of the display (shown in Figure 1.B). Subtitles were styled to match BBC Guidelines [16] and styling preferences [15].
Subtitles typically have a transparent black background to assist with text contrast [16]. In a traditional web video player with subtitle support, subtitles are contained within an element (e.g., a <div>) that surrounds the entire subtitle block, with the transparent black background being applied to this. In our Adaptive Subtitles implementation we styled individual speaker <span> tags to have the transparent black background (i.e. not the overall subtitle container) and used the CSS visibility:hidden style as opposed to display:none to preserve subtitle placement. All noises included in subtitle tracks were unaltered and presented without a black background to differentiate them from speaker text.
WebVTT (Web Video Text Tracks) files were manually coded for each speaker by adding <v.char> tags, demonstrated in W3C Recommendations [53]. The edited WebVTT files could then be parsed by our Adaptive Subtitles application and inserted into HTML <span> elements to make them easily readable within the Document Object Model (DOM). When a socket.emit() event was sent from the controller to the Adaptive Subtitles TV interface containing a speaker's subtitle state, the contents of this were parsed and relevant CSS style rules applied in order to enable/disable subtitles.

6 STUDY 2: EVALUATION OF ADAPTIVE SUBTITLES
The evaluation of Adaptive Subtitles comprised a lab-based user study where participants watched a selection of video clips while
that consists of: 1) A second screen Adaptive Subtitles Controller using the system, followed by a discussion of their experience.
Interface for controlling speaker subtitles, 2) Adaptive Subtitles TV The evaluation took place within our in-house user testing lab.
Interface for viewing media content and rendering speaker subtitles, We arranged the lab to resemble a living room with a sofa directly
and 3) A nodeJS server instance to support communication between facing the television (for participant), and an armchair perpendicu-
interfaces. The system architecture used ExpressJS, Angular, and lar to the television (for researcher). Participants used a Moto Z3
NodeJS, with socket.io for real time communication. Sample code Play as the Adaptive Subtitles controller throughout the study (as
is included in supplementary material. shown in Figure 2). BBC report a median UK household television
1) Adaptive Subtitles Controller Interface: Our Adaptive Sub- viewing distance of 2.63m, 5.5 times screen height (i.e. 5.5H), but
titles controller interface allowed control over whether subtitles also report that the median H is decreasing due to an increase in
were on/off for individual speakers within content. The interface television size [46]. Participants sat 2.44m (4H) away from a 48"
consisted of play and pause buttons for the content, and speaker Samsung J5100 5 Series HD LED television. This is 19cm away from
CHI ’21, May 8–13, 2021, Yokohama, Japan Gorman, Crabb, and Armstrong

Figure 1: A) Adaptive Subtitles ‘Controller’ interface displaying speakers for BBC’s ‘Would I Lie To You?’ clip; Lee Mack has
been toggled to have subtitles displayed. B) Adaptive Subtitles displaying subtitles for speakers selected in A. C) Traditional
subtitles showing dialogue for all speakers.

BBC reported median, but within 1 SD of reported limits [46]. All sessions were video and audio recorded from three angles: the immediate left and right of the participant (to assist with understanding responses), and behind the participant (to view interaction with the controller).
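The per-speaker cue markup and toggling described in Section 5.1 can be sketched as follows. This is an illustrative reconstruction, not the authors' supplementary code: the function names (parseVoices, renderCue) and the sample cue text are our own assumptions. It parses WebVTT voice tags into per-speaker segments and hides disabled speakers with visibility:hidden (rather than display:none), so that hidden speakers still reserve their space in the subtitle block:

```javascript
// Illustrative reconstruction only - not the authors' released code.
// Parse a WebVTT cue payload containing voice tags, e.g.
//   "<v Lee Mack>That is not true.</v><v David Mitchell>It is!</v>"
// into an array of { speaker, text } segments.
function parseVoices(cueText) {
  const segments = [];
  const re = /<v[^>]*?\s([^>]+)>([^<]*)(?:<\/v>)?/g;
  let m;
  while ((m = re.exec(cueText)) !== null) {
    segments.push({ speaker: m[1].trim(), text: m[2].trim() });
  }
  return segments;
}

// Render one cue as HTML. Speakers not in the `enabled` set keep their
// <span> in the DOM but are hidden with visibility:hidden (not
// display:none), preserving the placement of the remaining subtitles.
function renderCue(cueText, enabled) {
  return parseVoices(cueText)
    .map(({ speaker, text }) => {
      const hidden = enabled.has(speaker) ? "" : "visibility:hidden;";
      return `<span class="speaker" style="${hidden}">${text}</span>`;
    })
    .join("");
}
```

In a system shaped like the one described, a socket.io message carrying a speaker's on/off state would simply trigger a re-render of the current cue with an updated `enabled` set.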

6.1 Apparatus
Five short clips of content were selected from the BBC iPlayer
online library (bbc.co.uk/iplayer). As we required the video files
and raw subtitles we used the open source software ‘get_iPlayer’
(github.com/get-iplayer). We chose clips using the following criteria
(similar to [11]): (a) content was not originally broadcast within a month of our study taking place (to reduce potential familiarity with content); (b) content did not contain offensive language or potentially disturbing material; (c) there were extensive talking-head shots (i.e., as much spoken dialogue as possible); (d) content containing speakers with local regional accents (i.e., local to the study location) was excluded; (e) excerpts were around five minutes in duration; (f) there were similar levels of activity and engagement across content; (g) subtitles were not superimposed on content before transcoding; (h) content contained challenges that affect people's ability to lipread/speechread, such as speakers turning away and facial hair [24, 25]. iPlayer subtitles are presented using the EBU-TT (timed text) format [64]. We used get_iPlayer to obtain these and subsequently converted them to SRT (SubRip) subtitle files. Aegisub was used to shift subtitle times to match clip length.

The clips used in the evaluation were:

Would I Lie To You?, 2017, Series 11, Episode 2, Broadcast: 27-Nov-2017, (00:01:35 – 00:06:59) – A comedy panel show (only used for demonstration and study onboarding).

Water Diviner, 2014, Broadcast: 7-Jul-2019, (00:17:44 – 00:22:43) – A movie set after the Battle of Gallipoli.

Peaky Blinders, 2017, Season 4, Episode 2 - "Heathens", Broadcast: 22-Nov-2017, (00:36:45 – 00:42:26) – A drama series set in England.

A Fresh Guide to Florence with Fab 5 Freddy, 2019, Broadcast: 27-Jul-2019, (00:29:46 – 00:35:18) – A documentary on Italian art.

University Challenge, 2019, Season 19/20, Episode 1, Broadcast: 15-Jul-2019, (00:02:49 – 00:05:11) – An academic quiz show.

Figure 2: Evaluation setup, showing the Adaptive Subtitles control interface with speaker Lee Mack selected. The Adaptive Subtitles television interface is playing a clip and only displaying subtitles for the selected speaker.

6.2 Design

Stage 1 - Questionnaire: Participants were greeted, had the purpose of the study explained to them, and were asked to provide informed consent. A questionnaire was then given to participants, similar to the one used within our first study. The questionnaire had 14 questions across two sections. The first section contained nine questions that were used to gather demographic information: age, gender, level of education, level of computer literacy, and participants' hearing ability. The second section contained five questions and focused on participants' viewing experience and subtitle usage: 'How often do you watch scripted entertainment?', 'How many hours per day do you watch scripted entertainment?', 'Do you regularly watch scripted entertainment with subtitles turned on?', 'How often do you watch scripted entertainment with subtitles turned ON/OFF?'.

Stage 2 - Lab-based User Study: Participants were asked to watch the four video clips and use our Adaptive Subtitles remote to control the subtitles on the clips to suit their own personal preferences. Participants were initially shown traditional subtitles using the 'Would I Lie To You?' clip. We asked participants if the volume was at a comfortable level (set at point 15 on the volume slider, ~40 dB) and if this was a typical representation of subtitles that they had used before. Participants were then shown the same clip with Adaptive Subtitles as a form of onboarding.

Participants were shown each of the four clips exclusively with the Adaptive Subtitles approach. Each clip was shown in full, and after each clip we asked questions about how participants used Adaptive Subtitles. Clip order was counterbalanced across participants using a Williams Balanced Latin Square.

Stage 3 - Post-session Discussion: After viewing all of the clips, we used the UX Subtitle Framework [18] to scaffold a semi-structured
Adaptive Subtitles: Preferences and Trade-Offs in Real-Time Media Adaption CHI ’21, May 8–13, 2021, Yokohama, Japan

interview. This framework has been used to assess the UX of different subtitle approaches [9, 10, 19], and also by media industry practitioners [5]. The framework allows an overall measure of UX to be assessed when viewing different methods of subtitle display.

6.3 Participants

Participants were over 18 years old and self-identified as turning on subtitles whilst watching media content at least once in the last three months. We recruited 19 participants from a local university aged between 19 and 53 (M = 28.38 years, SD = 8.77). Participants were compensated with a £10 gift voucher.

We used an open text field for gender: Male = 14, Female = 5. Participants reported on their highest level of education: University (17 participants), College (1), High School (1), Other (0). Participants reported on their level of computer literacy: Excellent (15 participants), Good (4), Fair (0), and Poor (0). In total, two participants self-reported having a hearing loss. One participant had moderate hearing loss for 26 years due to a virus or disease, and one had mild hearing loss for 3 years due to exposure to loud noise. Neither reported using hearing aids or cochlear implants.

Participants reported the frequency of watching scripted entertainment: Every day (10 participants), Every other day (4), Once a week (4), Once every 2 weeks (0), Seldom (1), Never (0). There was also variety in the number of hours that they reported watching each day: <1 hour (4 participants), 1-2 hours (13), 3-4 hours (2), 5 or more hours (0). Participants were asked if they regularly watch scripted entertainment with subtitles turned on: Yes (15), No (4). Participants reported their frequency of watching scripted entertainment with subtitles ON: Daily (6), 2-3 times a week (6), Once a week (5), 1-2 times per month (2), 1-2 times per year (0), Never (0). Participants reported their frequency of watching scripted entertainment with subtitles OFF: Daily (7), 2-3 times a week (7), Once a week (1), 1-2 times per month (3), 1-2 times per year (0), Never (1).

6.4 Results

All sessions were transcribed and analysed by the authors. While every attempt was made to remain impartial throughout data gathering and analysis, a potential bias may exist as an author was present for all interview sessions. The use of an interview guide with structured questions reduces bias in this regard. Transcripts from all sessions were created from the experiment video files and were blocked according to the related sections within the interview guide. Sections were combined between participants and examined individually based on interview guide components. Closed descriptive coding was carried out with attention paid towards the benefits and drawbacks of traditional subtitles and Adaptive Subtitles, whilst also exploring users' perceptions towards real-time adaptions of media elements. Individual quotes were coded with authors agreeing on the inclusion of each within their categories. Conclusions within the results are drawn from general trends in the data.

6.4.1 Context-Based Adaptions. Participants commented they would "use it [adaptive subtitles] pretty often" (P1) but that it "would depend on what I'm watching" (P7). This awareness of context-based adaptions was also highlighted by P10, stating that "if it was a movie or a TV show that was a one off, I would go for traditional [subtitles].". In some situations verbal content is less important than on-screen visuals, and for documentaries you "don't need to know what everyone is saying" (P8). Participants also saw the benefit of having adaptive subtitles present within serialised content, discussing that it "would be useful to have it going across episodes" (P10) and that this would reduce the overall attention lost to initially setting the system – "you would choose a setting and then leave it" (P2).

Our participants were divided in how they approached adaptive subtitling within the University Challenge clip, with usage being very different to story-driven media. In this clip most participants turned on the subtitles for the quiz show host, "even though I can hear the questions, they are long questions so then I can read them." (P4). This was echoed by P5, who stated that "some of the questions might be a bit technical, so it gives you some reassurance.". Turning on subtitles for only the presenter gave the added value of enabling participants to play along with the show itself, with P4 describing that they turned subtitles off for contestants "so that it didn't give me the answers". Some participants took a different approach to adaptive subtitling in this content type and also turned on subtitles when teams were conferring, describing that "you would just hear whispering...but with subtitles you get it and you learn more" (P17).

6.4.2 Within-Content Adaptions. Participants noted that aspects such as character accents and their previous exposure to a given show impacted on why they chose to use subtitles for given characters. Participants discussed that content-based difficulties, such as understanding accents, caused them to enable adaptive subtitles for individual speakers – "I struggled with some of the accents" (P8). P18 elaborated on this by discussing that "the issue isn't how loud they speak, it's really the accents". Participants found some speakers to be more challenging to understand than others, with one participant turning on adaptive subtitles for "the men...because they had stronger accents, with trying to be suspicious and all" (P14). Some participants were quick to adapt to accents that they understood, with P16 stating that "it was the initial anticipation but then I realised that I could hear him fine so turned him off". One participant commented that previous exposure to one of the shows within the study assisted in determining which characters to have subtitles on for, "I watch Peaky Blinders, I've listened to the accents before...I was almost pre-empted to turn them on." (P5). This was contrasted with P12, who stated that "I'm not very familiar with the programme, which made it hard to find out which ones to put on".

6.4.3 Benefit I - Adaptive Subtitles Increases Focus on Main Content. During our evaluation, participants commented that one of the main challenges present with traditional subtitles is that "you are always drawn to the words and you might miss something" (P16). This adds to the cognitive load associated with watching content, as "first you read it [subtitles], then you reinforce it with talking, and then you get it" (P14). Using the adaptive subtitles approach, participants commented that they are "able to focus on the clip itself, there is less to read...I'm only reading what I want to read" (P5). Participants also discussed that adaptive subtitles "helps you focus on what is needed." (P2), and that this approach doesn't "distract me from what is going on as much as traditional subtitles".

6.4.4 Benefit II - Adaptive Subtitle Presentation alters Content Consumption Method. A common view from participants focused on the disconnect that traditional subtitles create, with P18

stating that "if you have subtitles on you are slightly disconnected with what is going on, I'm always concentrating on the subtitles". This was echoed by P14, who reflected on moving between subtitled and non-subtitled content, describing "I was able to be there and see the pictures and felt like I was zooming in and being there while when the other guy was talking and the subtitles come on I feel like I'm shut out. Its a barrier between me and what is happening".

A consequence of our subtitle styling approach (see Implementation) is that in some situations a visual 'gap' between subtitle blocks appears on the screen. Whilst this was not something that we had intended to be an aesthetic feature of Adaptive Subtitles, it is something that participants acknowledged within discussion. Participants commented that they "knew that I shouldn't be reading it all at once" (P14), and when the gap was present they "didn't read the second statement, I was able to wait" (P13). Despite one participant finding this feature to be "offputting" (P7), participants commented that this method of subtitle display altered the method in which they consumed subtitled content and that "you almost leave space in your mind waiting for the reply" (P5).

6.4.5 Trade-Off I - Personalising Content Leads to Increased Physical and Mental Effort. Whilst participants were comfortable with the concept of adaptive subtitles, they saw clear disadvantages in the effort required to create a customised list of characters that would have subtitles enabled. P13 described this as "a very involved process" and P8 added to this by commenting that "I felt a bit less involved because I was doing a task and doing something instead of just sitting back and watching". P13 raised concerns about its usage in some shows, stating that "...for dramas, its moving between the fictional world and the real world. Something like pressing pause is a conscious choice and when you press play you are going back in, with this it is like you're never really getting in". Participants commented on the overall usability of Adaptive Subtitles, saying that it "felt like I had to do more work, it was more effort on my part" (P4). P2 added to this by discussing that the implications of this challenge scale with the number of characters present, and that "if there are many characters, matching the object to the person on the screen is hard". Despite this challenge, P9 highlighted the trade-off that has to happen when personalising media content, stating that our approach was "easy to use, but there is more to use".

6.4.6 Trade-Off II - Second Screen Device Interaction Alters the Passive Media Experience. Our adaptive subtitles implementation was facilitated through a second screen application that allowed participants to individually choose which characters had subtitles enabled. Participants felt that this approach altered the overall experience, "TV is a very passive thing, you want gentle actions...the remote is more involved it turns it into an active experience" (P13). Participants commented on the trade-offs of this approach, with P7 suggesting that they felt "more involved in terms of what was happening in what was being said, but less involved because I had to look at the remote". Challenges in using this second screen device were also discussed by other participants, with P10 noting that "one problem is that you have to look down and look up, you might be missing content that is on the screen, [but] its only a short lapse in concentration". In our developed application we matched up the background colour of characters on the second screen device with their individual subtitle colours on the main display. This was carried out in order to assist with the move from selection of adaptive subtitles to the consumption of adaptive subtitled content, and the move between devices that is part of this. The use of multiple colours in subtitles is a common feature across UK terrestrial TV [16, 50] but less common in other countries and in online platforms. Participants commented that our approach meant that "you already know what colour what character is" (P18) and that they "appreciated the colour coding so I could tell who was talking" (P14). The consistent application of colour across devices (see Figure 1) "help(s) with contextual understanding of characters and their names. When the colours started I was able to marry up who these people were" (P3).

6.5 Study Limitations

Our evaluation of Adaptive Subtitles focused on short, lab-based exposure. Participants viewed four clips in a short amount of time, and as such, we are only able to generalise our findings to this "setup" period. The nature of participant exposure to Adaptive Subtitles in our work focused on initial system usage, and we did not experience the set-and-forget phenomenon [19] that would be expected over longer usage. We encountered issues surrounding split attention due to the use of our second screen remote, echoing challenges discussed in other work [8]. However, the onboarding of users to new technology and concepts should be carefully planned. Our work is a necessary step in understanding challenges in this area. Our approach requires additional effort to edit content into atomic elements, like other OBM systems [6, 20, 52]. However, in some cases this is done automatically when colour is added to subtitles by broadcasters; therefore it only requires effort to match each colour to the first instance of the speaker. Our current approach would only work on a digital web-based display system (e.g., Netflix).

7 DISCUSSION

Giving users the ability to personalise the way they experience content is challenging. Any time that is spent implementing an adaption is time that is not spent consuming the media itself. The task that users go through to adapt an interface follows the same, broad, iterative process: consuming content → deciding that content should be adapted → selecting content to be adapted → evaluating if the adaption is acceptable. The added challenge with real-time media adaptions is that there is a greater emphasis on ensuring that the final three aspects of this cycle take as limited an amount of time as possible. We reflect on this and present design considerations to enable others to better understand the interaction challenges that were faced in this work. We initially discuss these as collective guidance and subsequently expand on them individually.

Providing granular control over content allows viewers to move between understanding content and being immersed in scene context, but this complexity can overwhelm users. To assist with this complexity, there should be an onboarding period for the content being watched, with this viewed as separate to onboarding of the technology itself. This onboarding period can assist in reducing information overload, but can initially be seen as a distraction during viewing. Despite this, the short-term distraction created should lead to long-term benefits for viewers in terms of improved viewing experience. In our study, users had to carry out this process at the start of short clips. Most use-cases for our approach would

be when watching longer content: a 45-minute TV episode, a 120-minute movie, or serialised content that could span several episodes.

7.1 Understanding vs. Immersion

In our work, participants commented that they removed subtitles within shows where the visuals are more important than the information being described, to assist in promoting involvement in the show. In these situations subtitles were important for understanding content but were seen as distracting during establishing shots when voice-over was being used, leading to a reduction in how viewers experience the context of a particular shot. The removal of subtitles during these instances assisted in making users feel more immersed. Conversely, participants described that they added subtitles within shows where they struggled to understand particular characters. In these cases, the lack of subtitles would lead to little understanding of the content and subsequently a reduction in overall contextual experience.

Participants had very different reasons for their individual setup options for adaptive subtitles. Each attempted to find their own sweet spot for understanding content whilst also creating adaptions to allow immersion in the viewing experience. In our work, we found that participants added subtitles to content when they wanted to understand content but removed subtitles when they wanted to feel more immersed in the show. The understanding content → immersive context continuum that participants were interacting with is a careful balancing act that can change often, even between scenes. Giving viewers the ability to move along this continuum highlights the challenge in creating adaptive interfaces for real-time media.

Instead of assuming what type of adaption will be required, we recommend designers should provide users with the ability to create granular adaptions to move between understanding content and being immersed in a specific context. This provides viewers with the opportunity to decide how they wish to consume content and may lead to positive alterations in lean-back or lean-forward experiences [54] based on their own viewing goals.

7.2 Technology and Content Onboarding

In our evaluation, participants took part in two distinct onboarding experiences. Firstly, they were introduced to the concept of adaptive subtitles and our implementation of this method (i.e. onboarding to the technology). Secondly, they decided on the adaptions they would like to make to a given clip before consuming the media (i.e. onboarding to the content). These adaptions to clips took place when participants viewed them as being necessary, and transitioned from onboarding to real-time adaptions.

The personalisation of media content requires focus and attention from viewers. This shift in attention from the media itself to the media controls was challenging for our participants. They discussed how it was a complex process that involved identifying the character on the screen, locating that character on a secondary device, and finally selecting subtitle state. We describe during the introduction to our discussion the broad, iterative process that users go through when making adaptions. Onboarding alters this process in that it produces an opportunity to decide, select, and evaluate adaptions in a situation where the consumption of media is no longer the primary objective.

One method that television shows use to create onboarding experiences is episode recaps, which provide viewers with information regarding ongoing plot lines and important characters [41]. Similar techniques could be used to on-board viewers when interacting with real-time personalisation of media content. This would allow for adaptions to content in a situation where consumption of the media is less important, or carried out based on previously altered content (e.g. between episodes in serialised content or for common actors). This creates a clear separation between onboarding to adaptions for particular content types and the onboarding of how the technology works. We recommend that designers should consider the onboarding of viewers to content and the technology as separate elements and cater for these using different techniques.

7.3 Trade-Offs and Benefits

During the evaluation, participants found it challenging to match up characters between our second screen interface and the content on the main screen. This could be due to the length of our study clips (5 minutes) compared to entire television show episodes (30-60 minutes) and feature films (120+ minutes). Despite the short clip lengths, participants commented that they saw the potential benefits of using this when watching longer content.

The approaches that our participants used to determine which speakers to enable were related to the coping strategies discussed by participants in our initial survey. For instance, some participants turned on speakers they identified as having heavy or unfamiliar accents. Participants commented that once their subtitle view (and therefore overall content) was personalised, they found it to be less distracting than their previous experiences using traditional subtitles. This reduced the level of disconnect between themselves and the media. Participants acknowledged the trade-off between short-term distraction at the start of a piece of content versus the long-term benefits of a personalised viewing experience. This process follows the set-and-forget phenomenon [19]. We recommend that designers should embrace the set-and-forget phenomenon when developing real-time media adaptations to improve viewer involvement with content over the long term. This allows viewers to focus on personalising their experience during points where story elements are limited rather than during key points of content, and will lead to increased levels of engagement.

8 CONCLUSION

Subtitles are commonly thought to be an accessibility feature, and are traditionally viewed as only being used by people with a hearing impairment. However, for 80% of subtitle users this is not the case [49]. To understand how and why viewers adapt their subtitle usage, we conducted an online questionnaire with 102 subtitle users. Our participants reported using subtitles based on the language being spoken, accents, scene context, and programme quality.

Inspired by the challenges that our participants faced, and recent developments in media production [20], we introduced a new subtitling approach called Adaptive Subtitles to investigate the challenges and opportunities that exist when interacting with real-time personalisation of subtitled media content. Our evaluation illustrated that the personalisation needs of an individual change

based on what they are watching and how they wish to consume [16] British Broadcasting Corporation. 2018. BBC Subtitle Guidelines. http://bbc.
it. For example, people may turn on subtitles for individual char- github.io/subtitle-guidelines/
[17] Michael Crabb, Michael Heron, Rhianne Jones, Mike Armstrong, Hayley Reid,
acters in a movie due to challenges in understanding accents, and in quiz shows subtitles can be turned off for contestants so people can more easily play along. We also consider content of different lengths. For example, the benefits of using adaptive subtitles on long-form content (e.g., a movie) likely outweigh the drawbacks, whereas for short-form content (e.g., a short TV episode) the drawbacks may outweigh the benefits.

We propose three design considerations that should be used when developing media personalisation features: 1) Provide users with the ability to create granular adaptations to move between understanding content and experiencing scenes in context, 2) Consider the onboarding of viewers to the content and to the technology as separate elements and cater for these using different techniques, and 3) Embrace the set-and-forget phenomenon when developing real-time media adaptations. We suggest that following these recommendations should increase engagement with media content.
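As a rough illustration of the first and third considerations (not part of any broadcaster's software), a viewer-preference model could store granular, per-speaker subtitle adaptations once and re-apply them automatically thereafter, reflecting the set-and-forget phenomenon. All names below (SubtitlePreferences, filter_cues, the speaker labels) are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class SubtitlePreferences:
    """Illustrative per-viewer model of granular subtitle adaptations."""
    enabled_by_default: bool = True                        # set-and-forget baseline
    show_for_speakers: set = field(default_factory=set)    # granular: always subtitle these speakers
    hide_for_speakers: set = field(default_factory=set)    # granular: never subtitle these speakers

    def should_display(self, speaker: str) -> bool:
        # Explicit per-speaker choices override the stored baseline.
        if speaker in self.hide_for_speakers:
            return False
        if speaker in self.show_for_speakers:
            return True
        return self.enabled_by_default

def filter_cues(cues, prefs):
    """Keep only the subtitle cues the viewer has chosen to see."""
    return [c for c in cues if prefs.should_display(c["speaker"])]

# Example: a viewer who only struggles with one character's accent.
prefs = SubtitlePreferences(enabled_by_default=False)
prefs.show_for_speakers.add("Scrooge")

cues = [
    {"speaker": "Scrooge", "text": "Bah, humbug!"},
    {"speaker": "Narrator", "text": "Marley was dead, to begin with."},
]
print(filter_cues(cues, prefs))  # only the Scrooge cue remains
```

Because the preferences persist between programmes, the viewer sets them once and the adaptation is applied without further interaction on subsequent viewings.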
REFERENCES
[1] M. Armstrong. 2017. Automatic Recovery and Verification of Subtitles for Large Collections of Video Clips. SMPTE Motion Imaging Journal 126, 8 (2017), 1–7. https://doi.org/10.5594/JMI.2017.2732858
[2] M. Armstrong, S. Bowman, M. Brooks, A. Brown, J. Carter, A. Jones, M. Leonard, and T. Preece. 2020. Taking Object-Based Media from the Research Environment Into Mainstream Production. SMPTE Motion Imaging Journal 129, 5 (2020), 30–38. https://doi.org/10.5594/JMI.2020.2990255
[3] Mike Armstrong, Matthew Brooks, Anthony Churnside, Michael Evans, Frank Melchior, and Matthew Shotton. 2014. Object-based broadcasting-curation, responsiveness and user experience. IBC2014 Conference 2014, 1 (2014), 1–8. https://doi.org/10.1049/ib.2014.0038
[4] Mike Armstrong, Andy Brown, Michael Crabb, Chris J. Hughes, Rhianne Jones, and James Sandford. 2016. Understanding the Diverse Needs of Subtitle Users in a Rapidly Evolving Media Landscape. SMPTE Motion Imaging Journal 125, 9 (2016), 33–41. https://doi.org/10.5594/JMI.2016.2614919
[5] BBC Research and Development. 2016. Subtitle Quality - Measuring and improving subtitle quality. www.bbc.co.uk/rd/projects/live-subtitle-quality
[6] BBC Research and Development. 2019. Casualty, Loud and Clear. https://www.bbc.co.uk/rd/blog/2019-08-casualty-tv-drama-audio-mix-speech-hearing
[7] Tim Brooks. 2019. Television and Record Industry Nielson Ratings. https://timbrooks.net/ratings/
[8] Andy Brown, Amaia Aizpurua, Caroline Jay, Michael Evans, Maxine Glancy, and Simon Harper. 2019. Contrasting delivery modes for second screen TV content. Push or Pull? International Journal of Human-Computer Studies 129 (2019), 15–26. https://doi.org/10.1016/j.ijhcs.2019.03.007
[9] Andy Brown, Rhia Jones, Mike Crabb, James Sandford, Matthew Brooks, Mike Armstrong, and Caroline Jay. 2015. Dynamic Subtitles: The User Experience. In Proceedings of the ACM International Conference on Interactive Experiences for TV and Online Video (Brussels, Belgium) (TVX ’15). ACM, New York, NY, USA, 103–112. https://doi.org/10.1145/2745197.2745204
[10] Andy Brown, Jayson Turner, Jake Patterson, Anastasia Schmitz, Mike Armstrong, and Maxine Glancy. 2017. Subtitles in 360-degree Video. In Adjunct Publication of the 2017 ACM International Conference on Interactive Experiences for TV and Online Video (Hilversum, The Netherlands) (TVX ’17 Adjunct). ACM, New York, NY, USA, 3–8. https://doi.org/10.1145/3084289.3089915
[11] Denis Burnham, Greg Leigh, William Noble, Caroline Jones, Michael Tyler, Leonid Grebennikov, and Alex Varley. 2008. Parameters in television captioning for deaf and hard-of-hearing adults: Effects of caption rate versus text reduction on comprehension. Journal of deaf studies and deaf education 13, 3 (2008), 391–404. https://doi.org/10.1093/deafed/enn003
[12] Cristina Cambra, Olivier Penacchio, Núria Silvestre, and Aurora Leal. 2014. Visual attention to subtitles when viewing a cartoon by deaf and hearing children: an eye-tracking pilot study. Perspectives 22, 4 (2014), 607–617. https://doi.org/10.1080/0907676X.2014.923477
[13] Jessica Carpani. 2019. BBC criticised for ’mumbling’ adaptation of A Christmas Carol. https://www.telegraph.co.uk/news/2019/12/23/bbc-criticised-mumbling-adaptation-christmas-carol/
[14] Brad Chisholm. 1987. Reading Intertitles. Journal of Popular Film and Television 15, 3 (1987), 137–142.
[15] British Broadcasting Corporation. 2018. BBC Global Experience Language. https://www.bbc.co.uk/gel
and Amy Wilson. 2019. Developing Accessible Services: Understanding Current Knowledge and Areas for Future Support. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland Uk) (CHI ’19). ACM, New York, NY, USA, Article 216, 12 pages. https://doi.org/10.1145/3290605.3300446
[18] Michael Crabb, Rhianne Jones, and Mike Armstrong. 2015. The Development of a Framework for Understanding the UX of Subtitles. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility (Lisbon, Portugal) (ASSETS ’15). ACM, New York, NY, USA, 347–348. https://doi.org/10.1145/2700648.2811372
[19] Michael Crabb, Rhianne Jones, Mike Armstrong, and Chris J. Hughes. 2015. Online News Videos: The UX of Subtitle Position. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers and Accessibility (Lisbon, Portugal) (ASSETS ’15). Association for Computing Machinery, New York, NY, USA, 215–222. https://doi.org/10.1145/2700648.2809866
[20] Michael Evans, Tristan Ferne, Zillah Watson, Frank Melchior, Matthew Brooks, Phil Stenton, and Ian Forrester. 2016. Creating object-based experiences in the real world. Proceedings of the IBC Conference 2016 2014, 1 (2016), 1–8. https://doi.org/10.1049/ibc.2016.0034
[21] Wendy Fox. 2016. Integrated titles: An improved viewing experience? Eyetracking and Applied Linguistics 2 (2016), 5. https://doi.org/10.17169/langsci.b108.233
[22] Alejandra Garrido, Sergio Firmenich, Gustavo Rossi, Julian Grigera, Nuria Medina-Medina, and Ivana Harari. 2012. Personalized web accessibility using client-side refactoring. IEEE Internet Computing 17, 4 (2012), 58–66. https://doi.org/10.1109/MIC.2012.143
[23] Google. 2019. Material Guidelines - Displaying State. https://material.io/design/interaction/states.html
[24] Benjamin M. Gorman and David R. Flatla. 2017. A Framework for Speechreading Acquisition Tools. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI ’17). Association for Computing Machinery, New York, NY, USA, 519–530. https://doi.org/10.1145/3025453.3025560
[25] Benjamin M. Gorman and David R. Flatla. 2018. MirrorMirror: A Mobile Application to Improve Speechreading Acquisition. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3173574.3173600
[26] Vicki L. Hanson and John T. Richards. 2003. A Web Accessibility Service: Update and Findings. SIGACCESS Access. Comput. 2003, 77-78 (Sept. 2003), 169–176. https://doi.org/10.1145/1029014.1028661
[27] Ellie Harrison. 2017. Why is it so hard to hear the dialogue in TV dramas? https://www.radiotimes.com/news/tv/2017-02-23/tv-sound-problems-drama-dialogue/
[28] Chris Hughes, Mario Montagud Climent, and Peter tho Pesch. 2019. Disruptive Approaches for Subtitling in Immersive Environments. In Proceedings of the 2019 ACM International Conference on Interactive Experiences for TV and Online Video (Salford (Manchester), United Kingdom) (TVX ’19). Association for Computing Machinery, New York, NY, USA, 216–229. https://doi.org/10.1145/3317697.3325123
[29] Chris J. Hughes, Mike Armstrong, Rhianne Jones, and Michael Crabb. 2015. Responsive Design for Personalised Subtitles. In Proceedings of the 12th Web for All Conference (Florence, Italy) (W4A ’15). ACM, New York, NY, USA, Article 8, 4 pages. https://doi.org/10.1145/2745555.2746650
[30] Samantha Jaroszewski, Danielle Lottridge, Oliver L. Haimson, and Katie Quehl. 2018. "Genderfluid" or "Attack Helicopter": Responsible HCI Research Practice with Non-Binary Gender Variation in Online Communities. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (Montreal QC, Canada) (CHI ’18). Association for Computing Machinery, New York, NY, USA, 1–15. https://doi.org/10.1145/3173574.3173881
[31] Carl J Jensema, Ramalinga Sarma Danturthi, and Robert Burch. 2000. Time spent viewing captions on television programs. American annals of the deaf 2000, 1 (2000), 464–468. https://doi.org/10.1353/aad.2012.0144
[32] Helen Katz. 2006. The media handbook: A complete guide to advertising media selection, planning, research, and buying. Routledge, Oxford, UK. https://doi.org/10.4324/9781315537870
[33] Jan-Louis Kruger, Stephen Doherty, and María-T Soto-Sanfiel. 2017. Original Language Subtitles: Their Effects on the Native and Foreign Viewer. Comunicar: Media Education Research Journal 25, 50 (2017), 23–32. https://doi.org/10.3916/C50-2017-02
[34] Jan-Louis Kruger, Agnieszka Szarkowska, and Izabela Krejtz. 2015. Subtitles on the moving image: an overview of eye tracking studies. Refractory: A Journal of Entertainment Media 25 (2015), 1–14. http://hdl.handle.net/1959.14/1040614
[35] Michael Lallo. 2017. Speak up! How ’mumble acting’ is ruining TV and film. https://www.smh.com.au/entertainment/tv-and-radio/speak-up-how-mumble-acting-is-ruining-tv-and-film-20170201-gu2u5j.html
[36] M Lambooij, MJ Murdoch, Wijnand A IJsselsteijn, and Ingrid Heynderickx. 2013. The impact of video characteristics and subtitles on visual comfort of 3D TV. Displays 34, 1 (2013), 8–16. https://doi.org/10.1016/j.displa.2012.09.002
[37] Mina Lee, Beverly Roskos, and David R. Ewoldsen. 2013. The Impact of Subtitles on Comprehension of Narrative Film. Media Psychology 16, 4 (2013), 412–440. https://doi.org/10.1080/15213269.2013.826119
[38] Margaret S Jelinek Lewis. 2000. Television captioning: A vehicle for accessibility and literacy. On-line Proceedings of CSUN 2000, 1 (2000), 1–5.
[39] Lluis Manchon and Pilar Orero. 2018. Usability tests for personalised subtitles. Translation Spaces 7, 2 (2018), 263–284. https://doi.org/10.1075/ts.18016.man
[40] Paul Markham. 1999. Captioned videotapes and second-language listening word recognition. Foreign Language Annals 32, 3 (1999), 321–328. https://doi.org/10.1111/j.1944-9720.1999.tb01344.x
[41] Shaun Patrick McCarthy, Yaron Sole, Trevor James Walker, Arun Velayudhan Pillai, and Venkatraman Prabhu. 2020. Personalized recap clips. US Patent 10,555,023.
[42] Nora McDonald, Sarita Schoenebeck, and Andrea Forte. 2019. Reliability and inter-rater reliability in qualitative research: Norms and guidelines for CSCW and HCI practice. Proceedings of the ACM on Human-Computer Interaction 3, CSCW (2019), 1–23.
[43] Holger Mitterer and James M. McQueen. 2009. Foreign Subtitles Help but Native-Language Subtitles Harm Foreign Speech Perception. PLOS ONE 4, 11 (11 2009), 1–5. https://doi.org/10.1371/journal.pone.0007785
[44] Kyle Montague, Vicki L. Hanson, and Andy Cobley. 2012. Designing for Individuals: Usable Touch-screen Interaction Through Shared User Models. In Proceedings of the 14th International ACM SIGACCESS Conference on Computers and Accessibility (Boulder, Colorado, USA) (ASSETS ’12). ACM, New York, NY, USA, 151–158. https://doi.org/10.1145/2384916.2384943
[45] Netflix. 2019. Netflix Subtitle Preferences. netflix.com/subtitlepreferences
[46] K Noland and L Truong. 2015. A survey of UK television viewing conditions. BBC Research & Development White Paper 287 (2015), 1–58. https://www.bbc.co.uk/rd/publications/whitepaper287
[47] Don Norman, Jim Miller, and Austin Henderson. 1995. What You See, Some of What’s in the Future, and How We Go About Doing It: HI at Apple Computer. In Conference Companion on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI ’95). ACM, New York, NY, USA, 155–. https://doi.org/10.1145/223355.223477
[48] The Office of Communications UK (Ofcom). 2018. Making on-demand services accessible. https://www.ofcom.org.uk/__data/assets/pdf_file/0014/131063/Statement-Making-on-demand-services-accessible.pdf
[49] OfCom. 2006. Television access services review. https://www.ofcom.org.uk/consultations-and-statements/category-1/accessservs
[50] OfCom. 2017. Ofcom’s Code on Television Access Services. https://www.ofcom.org.uk/__data/assets/pdf_file/0020/97040/Access-service-code-Jan-2017.pdf
[51] OfCom. 2019. Adults’ media use and attitudes Report. https://www.ofcom.org.uk/research-and-data/media-literacy-research/adults/adults-media-use-and-attitudes
[52] Matthew Paradis, Rebecca Gregory-Clarke, and Frank Melchior. 2015. VenueExplorer, Object-Based Interactive Audio for Live Events. Proceedings of the International Web Audio Conference 2015, 1 (January 2015), 1–5.
[53] Silvia Pfeiffer. 2019. WebVTT: The Web Video Text Tracks Format. https://www.w3.org/TR/2019/CR-webvtt1-20190404/
[54] Krishnan Ramanathan, Yogesh Sankarasubramaniam, and Vidhya Govindaraju. 2011. Personalized Video: Leanback Online Video Consumption. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (Beijing, China) (SIGIR ’11). Association for Computing Machinery, New York, NY, USA, 1277–1278. https://doi.org/10.1145/2009916.2010158
[55] RNID. 2016. What Happens In An Audiology Appointment. https://rnid.org.uk/information-and-support/hearing-loss/getting-your-hearing-tested/what-happens-in-an-audiology-appointment/
[56] S4C. 2001. Research into the demand for Welsh language subtitling in Wales. RNID, RNID Cymru. https://www.s4c.cymru/abouts4c/corporate/pdf/e_adroddiad_isdeitlo.pdf
[57] Jason M. Silveira and Frank M. Diaz. 2014. The effect of subtitles on listeners’ perceptions of expressivity. Psychology of Music 42, 2 (2014), 233–250. https://doi.org/10.1177/0305735612463951
[58] David Sloan, Matthew Tylee Atkinson, Colin Machin, and Yunqiu Li. 2010. The Potential of Adaptive Interfaces As an Accessibility Aid for Older Web Users. In Proceedings of the 2010 International Cross Disciplinary Conference on Web Accessibility (W4A) (Raleigh, North Carolina) (W4A ’10). ACM, New York, NY, USA, Article 35, 10 pages. https://doi.org/10.1145/1805986.1806033
[59] David Sloan, Peter Gregor, Murray Rowan, and Paul Booth. 2000. Accessible Accessibility. In Proceedings on the 2000 Conference on Universal Usability (Arlington, Virginia, USA) (CUU ’00). ACM, New York, NY, USA, 96–101. https://doi.org/10.1145/355460.355480
[60] Peter Thompson. 2000. Notes on Subtitles and Superimpositions. Chicago Media Works 1, 18 (2000), 1–4.
[61] Garreth W. Tigwell, David R. Flatla, and Rachel Menzies. 2018. It’s Not Just the Light: Understanding the Factors Causing Situational Visual Impairments during Mobile Interaction. In Proceedings of the 10th Nordic Conference on Human-Computer Interaction (Oslo, Norway) (NordiCHI ’18). ACM, New York, NY, USA, 338–351. https://doi.org/10.1145/3240167.3240207
[62] Garreth W. Tigwell, Benjamin M. Gorman, and Rachel Menzies. 2020. Emoji Accessibility for Visually Impaired People. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/3313831.3376267
[63] Sarah J Tracy. 2019. Qualitative research methods: Collecting evidence, crafting analysis, communicating impact. John Wiley & Sons, Oxford, UK.
[64] European Broadcasting Union. 2018. EBU-TT-D Subtitling Distribution Format. https://tech.ebu.ch/docs/tech/tech3380.pdf
[65] T. Vigier, Y. Baveye, J. Rousseau, and P. Le Callet. 2016. Visual attention as a dimension of QoE: Subtitles in UHD videos. 2016 Eighth International Conference on Quality of Multimedia Experience (QoMEX) 2016, 1 (June 2016), 1–6. https://doi.org/10.1109/QoMEX.2016.7498924
[66] W3C. 2019. Making Audio and Video Media Accessible - Captions/Subtitles. https://www.w3.org/WAI/media/av/captions/
[67] Jacob O. Wobbrock, Krzysztof Z. Gajos, Shaun K. Kane, and Gregg C. Vanderheiden. 2018. Ability-Based Design. Commun. ACM 61, 6 (May 2018), 62–71. https://doi.org/10.1145/3148051