See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/323779800

Eye-tracking: A comprehensive guide to methods, paradigms and measures

Book · November 2017


All content following this page was uploaded by Kenneth Holmqvist on 15 March 2018.


Eye tracking

a comprehensive guide to methods, paradigms
and measures

Prof. Kenneth Holmqvist, PhD

Richard Andersson, PhD



Eye tracking: A comprehensive guide to methods, paradigms and measures

Kenneth Holmqvist (Regensburg University, Germany, Masaryk University, Brno, Czech Republic,
and NWU Vaal, South Africa)
Richard Andersson (Tobii AB, Stockholm, Sweden. Formerly Lund University, Sweden)

Please cite this book as


Holmqvist, K. and Andersson, R. (2017). Eye tracking: A comprehensive guide to methods,
paradigms and measures, Lund, Sweden: Lund Eye-Tracking Research Institute.

© Lund Eye-Tracking Research Institute (LETRI) AB


contact: kenneth.holmqvist@ownit.nu

The moral rights of the author have been asserted through RightsLink and in personal
communication.

This 2nd edition was first published in 2017. The first edition was published by Oxford University
Press in 2011.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or
transmitted, in any form or by any means, without the prior permission in writing of the main
author, or as expressly permitted by law, or under terms agreed with the appropriate reprographics
rights organisation. Enquiries concerning reproduction outside the scope of the above should be sent
to the main author.

You may not circulate this book in any other binding or cover and you must impose this same
condition on any acquirer.

Cover image from © Adobe/Eric Reis

ISBN-13: 978-1979484893
ISBN-10: 1979484899

Printed by CreateSpace, Charleston SC, USA


Why we wrote this book


This book is written for and by researchers who are still in that part of their careers where they are
actively using the eye-tracker as a tool; those who have to deal with the technology, the signals, the
filters, the algorithms, the experimental design, the programming of stimulus presentation, instructions
to participants, operating the various tools for data analysis, and, of course, worrying about all
of the different things that must not go wrong!
A central theme of the book concerns the wide range of fields eye tracking covers. Suppose an
educational psychologist wishes to use eye tracking to evaluate a new software package designed to
support learning to read. She may have an excellent idea as a starting point, and some understanding
of the kind of results that eye tracking could provide to tackle her research question. However, unless
she and the group around her are adept in computer science, it is unlikely that she will know how
the eye movement data she collects is generated, in terms of how raw data samples are converted
into fixations and saccades using event detection algorithms, how the different representations of
eye movement data are calculated, and how all of the measures of eye movements relate to these
processes. All of this is important because the subtleties involved in working with eye tracking data
can have large consequences for the final results, and thus whether or not our educational psychologist
can confidently conclude that her software package is effective in supporting the development of
reading skill.
This is not to say that hardcore computer science skills are the crux of good eye tracking research,
for this is certainly not the case. One can equally envisage a situation where an expert in programming
and the manipulation of data plans and executes an eye tracking study poorly, simply because she is
not trained in the principles of experimental design, or knowledgeable in the associated literature on
the visual system and oculomotor control.
There are many contrasts between the diverging schools of thought which use eye tracking;
practices and preferences vary, but experts in different fields certainly do not draw on each other’s
strengths enough. We felt there was a need to pinpoint the relative merits of adopting methods based
in one field alone, whilst highlighting that the lack of synergy between different disciplines can lead
to suboptimal research practices, and new advancements being overlooked.
Besides technical details and theory, however, the heart of this book revolves around practicality.
In the eye tracking group at Lund University we have been teaching eye tracking methodology regularly
since 2000. We commonly see newcomers to the technique run aground when encountering
just the sort of issues raised above, but beginners also struggle with problems which are even more
practical in nature. Hands-on advice for how to actually use eye-trackers is very limited. Setting up
the eye camera and performing a good calibration routine is just as important as the design of the
study and how data is handled, since if the recording is poor your options are limited from the outset.
There are fundamental methodological skills which underpin using eye-trackers, but at the other
end of the spectrum there is also the vast choice of measures available to the eye tracking researcher.
For the present text to be complete, therefore, we felt a requirement should be to draw together eye
tracking measures, as well as methods, into an understandable structure. So, starting around 2005,
we began producing a taxonomy of all eye movement methods and measures used by researchers,
examining how the measures are related to each other, what type of data quality they rely on, and the
preceding data processing required to obtain a certain measure. Our classification work thus consisted
of searching the method sections from thousands of journal papers, book chapters, PhD theses, and
conference proceedings. Every measure and method we found was catalogued and put into a growing
system. Some of the measures were extremely elusive, as they are known by different names, not
only between research fields, but even within, and often the precise implementations are missing in
the published texts. At first, we were very unclear on how to classify measures. Some varieties of
taxonomic structures that we rejected can be found on p. 619. We ended up with a classification
structure where the operational definitions are at the centre.
Some of these many measures are ubiquitous, such as fixation duration, which can be used in
almost any study, while other measures are tightly linked to a specific practice. Antisaccade latencies
do not exist outside of a particular paradigm, for instance, and neither do diversion durations and
regression scanpaths. These measures require a task and a stimulus of a specific kind to make any
sense. Paradigms are often well-known by the researchers in a field where they are used, as they
form the core of the experimental trial, but remain unknown outside of the field they originated from.
Knowledge of paradigms simplifies and strengthens experimental designs, adding value to research.
Users of eye-trackers often lack adequate training because there is little or no teaching community
to rely on. As a result people are often self-taught, or depend on second-hand knowledge
which may be out of date or even incorrect. When they participate in our eye tracking methodology
courses, we find that many new users are very focused on their research questions, but are surprised
by how much time they need to invest in order to master eye tracking properly. Often people attending
have just purchased an eye-tracker to complement their research, or for use in their company to
tackle ergonomic and marketing related questions. Our aim for this book is to make learning to use
eye-trackers a much easier process for these readers. If you have a solid background in experimental
psychology, computer science, or mathematics you will often find it straightforward to embrace the
technologies and workflow surrounding eye tracking. Whatever your background, you should be able
to achieve the same level of knowledge and understanding from this book as you would from training
on eye tracking in-house in a fully competent laboratory.
More specifically, this book has been written to be a support when:
1. Evaluating or acquiring a commercial eye-tracker,
2. Planning an experiment where eye tracking is used as a tool,
3. About to record eye movement data,
4. Planning how to process and interpret the recorded data, before carrying out statistical tests on
it,
5. Reading or reviewing eye movement research.
In our efforts to classify eye tracking methods, measures, and paradigms, combined with useful
practical hints and tips, we hope to provide the reader with a thoroughly updated second edition of
our comprehensive textbook on methodology. While helpful as an introduction for new users of eye
tracking, this book also caters to the advanced researcher.

Material used in the book


We have repeatedly used data from the following studies in examples, all recorded in our laboratory
in Lund:
1. 318 native speakers of Swedish read 16 pages from an English book on marketing, and 160
of these came back two years later to read a very similar text. The first recording was made
monocularly at 1250 Hz using the SMI HiSpeed system, and the second binocularly at 500 Hz
using the same eye-trackers. Data are described in Nyström and Holmqvist (2010).
2. 24 speakers of Swedish read four texts from the standardized Högskoleprovet (Swedish
Scholastic Aptitude test), either in silence, with cafeteria noise, while listening to music they
like to listen to while studying, or while listening to music that they would never consider
studying with. Recordings were made monocularly at 1250 Hz using the SMI HiSpeed system.
Data are described in Johansson, Holmqvist, Mossberg, and Lindgren (2012).
3. 68 people who were on their way into a supermarket were asked to wear the monocular SMI
HED 200/25 Hz system while buying their groceries. Data are described in Gidlöf, Wallin,
Dewhurst, and Holmqvist (2013).
4. 23 engineering students and 21 students of the humanities were asked to solve 43 mathematical
problems of three different kinds, while their eye movements were monitored using the SMI
HiSpeed 1250 Hz system. Data are described in Holmqvist et al. (2011).
5. 24 ninth-graders browsed the web for 15 minutes using the SMI RED 4, 60 Hz system, with a
set of 20 starting links to choose from, but with an instruction to browse freely. Data are described
in Sandberg, Gidlöf, and Holmberg (2011).
Ideally, we would have liked to have collected our examples from all 20+ manufacturers of eye-
trackers, but this has proven to be practically impossible. The vast majority of examples in this book
have been collected with LC Technology, DPI, EyeTribe, SMI, Tobii, and SR Research systems in
our lab, or in the labs of colleagues. We use the examples we have, because we think they are very
general to many eye-trackers, and not specific to a single system or manufacturer. Software used and
described are versions used during 2006–2017, the period during which we wrote the first and second
editions of the book, and are included to describe principles. We refer to manufacturer manuals for
current versions.
By definition, a book on eye tracking methodology will have to contain many examples of what
works well and what can go wrong in eye tracking research. Thus, during a few years, we have
collected many examples of successes and mishaps in our own laboratory and the labs of colleagues,
that we have used in the book as warnings and eye-openers. Our examples should not be read as an
endorsement or critique of a particular eye-tracker; rather, the illustrated property should be critically
evaluated against any eye-tracker you consider using.

Did you find errors in our book?


We are eager to correct erroneous information for future editions. Please send any feedback, corrections,
comments, or suggestions to the authors.

Acknowledgements
Many thanks to everyone who has helped us write this book. Firstly, Halszka Jarodzka has been
a much-appreciated co-author both in the first and during the preparations of the second edition. Ellen
Kok updated Chapters 13, 14, 15, and 16 for the second edition of this book. They have both done
an excellent job throughout and fully deserve co-authorship. We are also very thankful to Richard
Dewhurst for starting the paradigm chapter, and for his valuable work on the first edition, and wish
him the best in his new position and with his new family.
The eye tracking seminar in Lund has debated previous manuscripts of both editions many times
over. In particular, the authors would like to thank Diederick Niehorster, Raimondas Zemblys, Ignace
Hooge, Alexander Strukelj, Kerstin Gidlöf, Paulina Lindström, Nils Holmberg, Roger Johansson,
Roy Hessels, Jana Holsanova, Janna Spanne, Lenisa Brandão, Linnea Larsson, Johan Pihel,
Philip Pärnamets, and the many seminar guests.
Well over five hundred students have read and given feedback on earlier drafts while taking the
7.5 ECTS eye tracking courses given in Lund since 2000, and at the LETA crash workshops in eye
tracking methods that we have organized since 2008.
We would also like to thank the many colleagues who have contributed by reviewing draft chapters
for this edition: Koos van Geel for excellent feedback on the introduction, William Schmidt and
Sam Hutton for critical opinions on Chapter 2 and many other parts of the book. Ignace Hooge gave
us invaluable help with the experimental design and paradigms chapters. Jeff Pelz for his in-depth
review of Chapter 4 and other chapters, and Dan Witzner Hansen, Warren Ward, Jan Ober, Anders
Kingbäck, Carlos Morimoto, Meike Mischko, and Diederick Niehorster for very specific reviews of
this and other chapters. Tom Foulsham and Sam Hutton for their reviews of Chapter 5 and other parts
of the book. We are also very grateful to Pieter Blignaut, Daniel Jacobus Wium and Jan Drewes for
their very thorough reviews of Chapter 6, Oleg Komogortsev for a fresh and initiated view on Chapter
7, Jacob Lund Orquin for carefully checking Chapter 8, and Parag Mital and Andrew Duchowsky for
suggesting excellent improvements in Chapter 9. Olivier Le Meur for the many good suggestions for
Chapter 10. Walter Bishof, Sandra Starke and Nicola Anderson have kindly reviewed parts of the
measure chapters. Many experts and nonexperts have provided invaluable comments and corrections
to our attempt at paradigm overviews: Chrystalina Antoniades, Sara Farshchi, Filip Dechterenko,
Elena Eriksson, Andreas Falck, Sam Hutton, Roger Johansson, Amir Kheradmand, Alan Kingstone,
Jorge Otero-Milan, Jan Theeuwes, Jeremy Wolfe, Ronald Rensink, Michele Rucci, William Schmidt,
and Neal Snape.
Thousands of participants in ours and other studies have contributed by providing data and eye-
images that we have selected. Although anonymous, their participation has provided us with the
examples in many of the chapters, and we extend our thanks to them also.
Thanks to Nils Holmberg for setting up and managing our own Subversion server while we wrote
both editions. Not having to bother with who works with what version made writing this book so very
much easier.
Michael Cutter has done a great job proofreading an earlier version of this book. Our best thanks
for countless improvements on the flow of the text, usage of correct British grammar, spelling and
hyphenation.
This book has been written in the same spirit as the first edition. Our decision to change publisher
has come about for reasons unrelated to our collaboration with OUP. Publication was simplified by
OUP’s generous decision to let us reuse the style of content and cover.
Finally, without electronic access to journal papers, this book would have been much more difficult
to write. The authors want to thank all the libraries and scholars who have given (or sold our
universities) access to thousands of peer-reviewed journal papers on eye-tracking-based research.
Contents

About the authors xiv

I TECHNICAL AND METHODOLOGICAL SKILLS

1 Introduction 3
1.1 Success stories in eye tracking 4
1.2 Your first few eye tracking studies—step-by-step 5

2 Eye movements: biology, neurology and psychology 12


2.1 The human eyes and their movements 12
2.2 The pupil 16
2.3 The lens and accommodation 17
2.4 Coordinate systems for eye movements 18
2.5 Behind the eye muscles 21
2.6 Visual intake at the retina and beyond 24
2.7 Attention, preattention, salience, and intake 26
2.8 Peripheral vision and the visual field 29
2.9 The individual participant 31
2.10 Summary: the biology, neurology and psychology of eye movements 33

3 From Vague Idea to Experimental Design 34


3.1 The initial stage—explorative pilots, fishing trips, operationalizations, and highway research 34
3.2 What caused the effect? The need to understand what you are studying 40
3.3 Planning for statistical success 56
3.4 Summary 62

4 Eye-tracker Hardware 64
4.1 A brief history of eye-tracking technologies 64
4.2 Sampling of raw data 83
4.3 Feature detection 86
4.4 Calibration 88
4.5 Sampling frequency: what speed do you need? 90
4.6 Types of eye-trackers and the properties of their set-up 95
4.7 Manufacturers and customers 108
4.8 How to set up an eye-tracking laboratory 115
4.9 Summary 119

5 Data Recording 121


5.1 Building the experiment 121
5.2 Participant recruitment and ethics 127


5.3 Eye camera set-up 128
5.4 Calibration 142
5.5 Eye camera set-up and calibration with challenging participants 150
5.6 Instructions and start of recording 151
5.7 Debriefing 152
5.8 Preparations for data analysis 152
5.9 Summary 156

6 Raw Data, Their Quality, and Error Propagation 157


6.1 Inspection of raw data 158
6.2 Definitions of data quality 159
6.3 Robustness 165
6.4 Data loss 165
6.5 Spatial accuracy 167
6.6 Linearity 173
6.7 Spatial precision 174
6.8 Resolution 189
6.9 Eye-tracker latency 190
6.10 Temporal precision 192
6.11 Stimulus-synchronization latencies 194
6.12 Recovery time 196
6.13 Precalibrated raw data and its quality 197
6.14 Reporting data quality in publications 198

II DETECTING EVENTS AND BUILDING REPRESENTATIONS

7 Estimating Oculomotor Events from Raw Data Samples 201


7.1 Do-it-yourself event detection 202
7.2 Events in raw data 207
7.3 Basic algorithms and their settings 219
7.4 Special purpose algorithms 234
7.5 Event detection in the larger context 249
7.6 Summary: oculomotor events in eye movement data 252

8 Areas of Interest 254


8.1 An AOI hit and dwell calculator in Excel 254
8.2 The basic AOI events 255
8.3 The AOI editor and your hypothesis 259
8.4 Types of AOIs 273
8.5 AOIs in gaze-overlaid videos 282
8.6 AOI-based representations of data 285
8.7 Summary: events and representations from AOIs 298

9 Gaze Density Maps—Scientific Tools or Fancy Visualizations? 302


9.1 Principles, terminology, and representations 302
9.2 How gaze density maps are built 310
9.3 Interpreting gaze density maps visually 317
9.4 Usage of gaze density maps in data analysis 322
9.5 Summary: gaze density map representations 326

10 Scanpaths—Theoretical Principles and Practical Application 327


10.1 What is a scanpath? 327
10.2 Usages of scanpath visualization 330
10.3 Scanpath events 334
10.4 Scanpath representations 340
10.5 Principles for scanpath comparison 347
10.6 Unresolved issues concerning scanpaths 353
10.7 Summary: scanpath events and representations 358

11 Complementary Data: Recording and Analysis 360


11.1 Types of complementary data that can be collected with eye movement data 360
11.2 Recording complementary data together with eye tracking data 370
11.3 Analysis of complementary data in relation to eye tracking data 376
11.4 Summary: events and representations with Complementary data 385

III PARADIGMS AND MEASURES

12 Paradigms 389
12.1 The extended fixation paradigm 390
12.2 Prosaccade paradigms 391
12.3 Perisaccadic perception paradigms 395
12.4 Multiple target paradigms 396
12.5 Memory-guided saccade paradigms 398
12.6 The multiple object tracking paradigm (MOT) 399
12.7 Electrical stimulation paradigms 400
12.8 The antisaccade paradigm 400
12.9 The oculomotor capture paradigm 403
12.10 Vestibulo-ocular reflex (VOR) paradigms 403
12.11 Optokinetic nystagmus (OKN) paradigms 405
12.12 Vergence paradigms 406
12.13 Smooth pursuit paradigms 407
12.14 Visual search paradigm(s) 409
12.15 The preferential-looking paradigm 413
12.16 Spatial cueing paradigms 414
12.17 Social interaction paradigms 416
12.18 Visual world paradigm 417
12.19 Paradigms of looking at nothing 419
12.20 Change blindness paradigm 421


12.21 Reading paradigms 422
12.22 Paradigms of gaze-contingent stimulus manipulation 428
12.23 Usability and human factors paradigms 434

13 Movement Measures 439


13.1 Movement direction measures 439
13.2 Movement amplitude measures 447
13.3 Movement duration measures 458
13.4 Movement velocity measures 463
13.5 Movement acceleration measures 470
13.6 Movement shape measures 474
13.7 AOI order and transition measures 477
13.8 Scanpath comparison measures 486

14 Position Measures 499


14.1 Basic position measures 500
14.2 Position dispersion measures 501
14.3 Position similarity measures 514
14.4 Position duration measures 526
14.5 Pupil diameter 542

15 Count Measures 547


15.1 Saccades: number, proportion, and rate 551
15.2 Proportion of post-saccadic oscillations 553
15.3 Microsaccade rate 554
15.4 Square-wave jerk rate 555
15.5 Smooth pursuit rate 556
15.6 Blink rate 557
15.7 Fixations: number, proportion, and rate 560
15.8 Dwells: number, proportion, and rate 565
15.9 Participant, area of interest, and trial proportion 569
15.10 Transition number, proportion, and rate 572
15.11 Number and rate of regressions, backtracks, lookbacks, and look-aheads 575

16 Latency and Distance Measures 578


16.1 Latency measures 580
16.2 Distances 601

17 What are Eye-Movement Measures and How can they be Harnessed? 609
17.1 Eye movement measures: plentiful but poorly accessible 609
17.2 Measure concepts and operationalizing them 611
17.3 Proposed model of eye-tracking measures 613
17.4 Measures and paradigms 617
17.5 Classification of eye movement measures 619
17.6 How to construct even more measures 621
17.7 Measures and visualizations 623


17.8 Summary 625

References 627

Index 713

About the authors


While writing the first edition was a collaborative effort by the members of the eye tracking group in
Lund, the second edition has been the work of a focused group of authors. We have divided chapters
between us and rewritten them. Some chapters are new, and many are thoroughly upgraded, by a
single author. This said, everyone in the author group has at some point worked through all the
chapters.
Kenneth Holmqvist is a professor at North-West University, South Africa, and at Masaryk University
in Brno, Czech Republic, and a guest professor at Regensburg University, Germany. Kenneth
founded the eye tracking laboratory in Lund as early as 1995, and has since extended the laboratory
to incorporate almost 50 eye-trackers and a multitude of projects. Kenneth has worked in a large
variety of eye-tracking-based research stretching from reading research and scene perception, over
to newspaper reading, post-saccadic oscillations, the eye movements of dogs, advertisement studies,
and gesture recognition in face to face interaction. Kenneth also has expertise in eye tracking used
in applied areas, including decision making in supermarkets and research on safety in car driving
and air traffic control. In 2000, Kenneth initiated regular master’s courses in eye tracking methodology.
In 2006, he founded the Scandinavian Conference on Applied Eye Tracking, and in 2008, the
international LETA training courses in eye-tracking methodology, which have taught more than 650
scholars around the world. He also arranged for the eye-tracking group in Lund to co-organize
ECEM 2013 in Lund. Kenneth is the initiator and main author of this book, both its first and
its second edition.
Richard Andersson is a PhD in Cognitive Science, and has worked as a researcher in Cognitive
Science at Lund University. He started out in psycholinguistics, but later moved on to more methodological
research, such as the role of event detection algorithms, physiological factors, and procedures
on data quality. Richard was a frequent teacher at the LETA intensive eye-tracking courses as well as
a teacher in eye-tracking and experimental design. He now works for Tobii Pro as a Research Liaison,
understanding and helping to prioritize requests from the researchers, and being a domain expert for
the developers.
Part I
Technical and Methodological Skills

Part I introduces the three central areas of competence required for running an eye tracking study: the
hardware, the experimental design, and the actual recording of the data from participants. Mastering
these skills is the key to recording high-quality eye movement data. Each of the three benefits from
understanding the biology of the eye, the neurology behind visual perception, fundamental vision
science, and the psychological theories and models of attention and eye movements.
1 Introduction

Eye tracking as a research tool is more accessible than ever, and is growing in popularity amongst
researchers from a whole host of different disciplines. Usability analysts, sports scientists, cognitive
psychologists, reading researchers, psycholinguists, neurophysiologists, electrical engineers, and
many others all have a vested interest in eye tracking for different reasons. There is no doubt that it is
useful to record eye movements, and that it advances science and leads to technological innovations.
At the same time, the growth of eye tracking in recent years presents a variety of challenges, the
most pressing of which is how to support the rapidly increasing number of people using eye-trackers.
To address this concern this book follows the process of empirical investigation using eye tracking
from beginning to end, providing detailed advice and discussion of the issues that can be encountered
en route.
Whether a researcher or not, most beginning users of eye tracking equipment today buy both
hardware and analysis software from manufacturers. This includes all of the technical properties of
the eye-tracker, as well as algorithms for recording, filtering, and analysing data, and the provided
default settings. The user initially trusts the system, because they are completely dependent on it.
However, as their understanding grows they find an increasing number of oddities. For instance, they
get an average first fixation duration that differs too much from that reported in the classic literature,
and they are very uncertain why: Is it the hardware with its sampling frequency and resolution, or
possibly a bug in the software? It could perhaps also be the settings, the algorithm, or the study itself
that made the difference. Furthermore, there are strangely short fixations of only 1–2 ms in the data.
What does that mean? It is never reported in the literature; not once. The manual says that they should
be removed, but could they have caused the shorter first fixation durations? Calling the manufacturers
of the system does not really help: They explain the architecture of the system and the algorithms,
but they only implemented an algorithm from a journal paper, and do not really know how good it is
or what precise settings to use for the data in the study.
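To make the worry about very short fixations concrete, here is a minimal sketch of how a handful of 1–2 ms "fixations" can pull down an average fixation duration, and how a minimum-duration filter changes it. The durations, the function name, and the 40 ms cut-off are all hypothetical choices made for illustration; they are not taken from any particular eye-tracker, manual, or study.

```python
from statistics import mean


def mean_fixation_duration(durations_ms, min_duration_ms=0.0):
    """Mean of the fixation durations that are at least min_duration_ms long."""
    kept = [d for d in durations_ms if d >= min_duration_ms]
    if not kept:
        raise ValueError("no fixations remain after filtering")
    return mean(kept)


# Hypothetical fixation durations in milliseconds; three of them are the
# suspiciously short 1-2 ms events discussed in the text.
durations = [212.0, 1.6, 248.0, 0.8, 195.0, 230.0, 2.0, 305.0]

# Average over everything the event detector reported.
with_short = mean_fixation_duration(durations)

# Average after discarding events shorter than an assumed 40 ms threshold.
without_short = mean_fixation_duration(durations, min_duration_ms=40.0)

print(f"all events:       {with_short:.1f} ms")
print(f"short ones removed: {without_short:.1f} ms")
```

In this made-up example the two averages differ by almost 90 ms, which is why it matters whether, and at what threshold, such events are removed before a measure is computed.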
There is in fact a whole range of real issues in eye tracking methodology that have never been
written down and published. For example, how do you best calculate fixation durations from gaze-overlaid
videos? What angles from the camera to the eye allow you to record data for the entire monitor?
Are participants with contact lenses a problem? Can you fix poor data quality post-recording?
What are the sources of latencies in eye tracking data? The tricks of the trade are learnt only from the
experience of recording lots of data from many people in a variety of different set-ups, and looking
at the data that comes out of the system. The manufacturer manuals do not describe the experiences
gathered by eye tracking researchers, because manufacturers are seldom users of their own systems.
The established eye movement researchers themselves, irrespective of research field, are so focused
on publishing their theoretically important results that they often forget or omit any description of
hardware, settings, or their procedure that the rest of us could learn from. Many journal editors are
also disinclined to publish that methodological information.
Occasionally there is a methodological point that is made clear in a result-oriented paper, but
methods papers that are specifically written for eye tracking researchers are only rarely published
(journals such as Behavior Research Methods provide excellent exceptions). Even when such methods
papers appear they reach very few readers, because the users of eye-trackers are so fragmented
into their own research fields, traditions, terminology, and methods that they are not likely to read a
methods paper published in one of the other traditions. We wrote this book to compile as much of the
in-house knowledge from the best eye tracking groups into one single book, available for all.

1.1 Success stories in eye tracking


Research with eye tracking is not just hard work with confusing data. When everything works, really
interesting findings can be obtained. For inspiration, here are four examples where eye tracking has
clearly made a difference.

Clinical neuropsychology
The use of eye tracking to study schizophrenia started only a decade after the first eye-trackers were
constructed (Diefendorf & Dodge, 1908). Today we know that eye movements—specifically smooth
pursuit gain (p. 604) and antisaccades (p. 400)—allow clinicians to diagnose the illness in passive
phases, as well as in relatives who carry the genes but who are not directly affected. The antisaccade
task has become a reliable tool for studying prefrontal executive control, including its development
and deviations. O’Driscoll and Callahan (2008) stated that ‘Average effect sizes and confidence limits
for global measures of pursuit and for maintenance of gain place these measures alongside the very
strongest neurocognitive measures in the literature.’ As if this were not enough, inspection of eye
movements is the standard method for diagnosing issues with the balance system in any hospital you
visit.

Reading
Reading studies were first conducted very early in eye tracking research. Erdmann and Dodge (1898)
conducted a systematic enquiry into reading, showing that we do not look at every letter in the words
we read, and investigated saccade amplitudes and fixation durations during typical reading. Since
then, thousands of eye movement studies of reading have been conducted and published. In the 1970s,
it was found that readers only need to see text in an area around the point of fixation, referred to as
‘the visual span’. When you change words outside of the visual span to gibberish, people are still able
to read unhindered. Eye tracking research into reading continues to thrive partly because text can be
varied in so many ways, not least into different languages with different writing systems, and also
because text is used in so many situations. Since the 1990s, computational models have been able
to successfully simulate eye movements during reading, using word properties as the input (Reichle,
Rayner, & Pollatsek, 2004).

Controlling computers by gaze


The idea has existed since at least the 1960s: If astronauts could control their manoeuvring units
and telescopes with their eyes, then their hands could do a better job of controlling other parts of
the aircraft (Merchant, 1967). However, controlling computers using gaze was first achieved in the
1980s. Bolt (1981) presented the first working system, in which he controlled multiple windows with
gaze. It became a research area within human computer interfaces that could be studied theoretically
and empirically (Jacob, 1990). As the eye muscles are often still functional even when the control of
other skeletal muscles is lost in illness, the dominant application became using gaze to type, and to
control computers. An important early player was the ERICA system, which allowed for gaze control
of electrical machines (TV, lights, music), in addition to writing texts and playing games simply by
looking at them (Hutchinson, White Jr, Martin, Reichert, & Frey, 1989). Over the years, art has been
created and whole books have been written by people who can move no other part of their body except
their eyes. Eye-trackers to control computers are now a commercial product, and the technology is
even appearing in some cellphones.
YOUR FIRST STUDIES | 5

Understanding eye movement control


Exploring what drives eye movements was not the first research field to take off when eye-trackers
were beginning to be built in the early 20th century. When it started, this field soon became two
strands of research: One group looked at the psychological determinants of eye movements. This
group designed experiments to reveal a system that selects fixation targets based on peripheral vision.
They found that saccades are launched by a timing mechanism which takes attentional processing
demands into consideration (Findlay & Walker, 1999). The other group looked for these mechanisms
in the brain, using electrical probes that either stimulated brain regions to launch saccades, or
measured electrical activity in connection to saccade execution. Today, we know a large amount about
the visual and eye movement circuitry in the brain (pages 21-26 and 400).

1.2 Your first few eye tracking studies—step-by-step


If you are a master’s or PhD student and your supervisor has agreed with you that you are going to
run an eye-tracking study, then this section provides a step-by-step checklist of things to think about.
Studies obviously differ from each other, so some check points may not pertain to your study, but this
list should give you a rough idea of what decisions lie ahead of you.
First of all, make sure that at least one of your supervisors is knowledgeable in eye tracking
and eye movements. Ideally, they should have experience of the whole process, from initial ideas to
publication, so that you can consult them about any of the points we mention here. You must then
plan what to do and how much time each step will take. Throughout, take notes on your decisions.
Many of them can be directly inserted into the document that will become your publications. This is
true of the introduction, hypothesis, method, procedure and data analysis. If you do this well, all that
you will need to write after statistical analysis is completed will be the general discussion.
Expect a study to take roughly a year, at minimum, from idea to publication. In rare cases, six
months is enough. Undergraduate theses that employ eye-tracking methods tend to need a full
semester. At the other end, we have witnessed many projects that take years from idea to publication
and a few unfortunate ones that will likely never lead to public dissemination.

Carefully plan the experiment


Plan your experiment so that you answer exactly the questions you intend to, and so that external
variation that you cannot control for does not invalidate your conclusions. In this, eye tracking is like
any other method in experimental psychology, but today most users of eye-trackers are not trained
in experimentation. Whether you are or not, it is useful to know that eye tracking has a number of
particularities when it comes to experimental design, including the fact that gaze positions are more
ambiguous than we want to believe. Looking at an item does not necessarily mean processing it, and
you may end up measuring a process other than the one you hoped to measure. Also, a lot of eye
tracking data is idiosyncratic, so choose a within-subjects design as often as possible.
Build a trial mechanism that captures your hypothesis, and with this as your starting point select
the measure which fits your hypothesis best. If you are working within an established paradigm,
use whatever trial mechanisms and measures are used in your field to maximize the compatibility
of your research with the prior literature. Prioritize measures that have been extensively tested, as
there is better insight into potential factors affecting them. For example, first fixation durations in
reading have been tested extensively and we know how they will react to changes in, for instance,
word frequency. It would be less problematic to make a reverse inference about the effect of your
manipulation on processing difficulty with this kind of measure than with other less well-explored
measures. Select measures that are as fine-grained as possible, such as those that focus on particular
points in time rather than prolonged gaze sequences. This allows you to perform analyses where you
identify points in time when your participant should be engaged in the particular behaviour in which
you are interested (e.g. search behaviour), and then extract the selected measures during just these
points. This is more powerful than extracting all instances of this measure during the whole trial,
where the particular behaviour of interest is mixed with many other forms of gaze behaviour, which
essentially just contribute noise to your data. Check that:
☐ I have a concrete hypothesis or prediction.
☐ I have drawn the eye movements I expect from my theory, previous research, and other predictions
onto the stimuli images that I plan to show.
☐ I have then carefully studied the lines and blobs in my drawings, and figured out what kind of
measures could demonstrate a contradiction vs. a validation of my hypothesis.
☐ I have imagined that I recorded data from 20 participants, and that I have all the records of their
movements. With this, I made an imaginary chart that illustrates the pattern of results I expect
in my measures.
☐ I searched for and identified other functionally equivalent measures for my research question.
Are you interested in mental workload for example? Then find out what other measures are
used to investigate this, for instance using the index of this book. Perhaps some of the alternative
measures are better and completely missed by you and others using your paradigm.
☐ I have formulated the consent form, the participant information letter, and the data management
plan.
☐ I have applied for and been granted approval from the ethics committee.
☐ I have decided what kind of statistics I will use to analyse my data.
Chapter 3 describes experimental design in detail, with a particular emphasis on eye-tracking
research.

Choose a suitable eye-tracker


You may have access to a single eye-tracker, and then there is not much else to do other than to learn
how to use it. If you can choose between several different models, you may want to investigate how
suitable the eye-tracker is for your particular study. Do not let yourself be tricked by the apparent
ease of usage of the system. How it is built and how it is attached—or not—to the participant can be
decisive for some studies. In other studies it is the quality of the data that decides whether an
eye-tracker is suitable. Note that some eye-trackers produce data that cannot even be used to calculate
valid fixation durations, while some eye-trackers cannot be used with AOIs smaller than a quarter
of the stimulus monitor, so be mindful about what data quality your experiment requires. Chapter 4
describes eye-trackers in detail, so read it as part of choosing, or if you would like to understand the
mechanisms inside the black box. Then learn to use your eye-tracker. Record friends and colleagues
on it until you feel comfortable operating it. Always check your recorded data: are they good enough?
Chapter 6 describes data quality assessment and error propagation in more detail.
☐ I have checked that the data are good enough for my experiment.
☐ I can consistently record participants with a minimum of fuss.
☐ I can easily recreate my setup if a colleague records a participant between my sessions and
changes some of the machine settings.
☐ The data quality is good enough and compatible with the size and layout of my areas of interest
and the events I want to detect.
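One way to make "are they good enough?" concrete is to record someone looking at a target with a known position and summarize the offset and the noise. The sketch below is only an illustration under stated assumptions: gaze samples and the target are already expressed in degrees of visual angle, accuracy is taken as the mean distance to the target, and precision as the root mean square of sample-to-sample distances. These are common operationalizations, but the function names and the sample values are made up; Chapter 6 is the place for the full treatment.

```python
import math

def accuracy_deg(samples, target):
    """Mean distance (deg) between gaze samples and the known target."""
    tx, ty = target
    return sum(math.hypot(x - tx, y - ty) for x, y in samples) / len(samples)

def precision_rms_deg(samples):
    """RMS of the distances (deg) between successive gaze samples."""
    squared = [
        (x2 - x1) ** 2 + (y2 - y1) ** 2
        for (x1, y1), (x2, y2) in zip(samples, samples[1:])
    ]
    return math.sqrt(sum(squared) / len(squared))

# Made-up samples (deg) recorded while a participant fixated a target at (0, 0).
samples = [(0.5, 0.0), (0.5, 0.1), (0.5, 0.0), (0.5, 0.1)]
print("accuracy (deg):", round(accuracy_deg(samples, (0.0, 0.0)), 3))
print("precision RMS (deg):", round(precision_rms_deg(samples), 3))
```

Comparing these two numbers with the size of your smallest area of interest and the margins around it gives a first indication of whether the recording set-up can support the planned analysis.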
If you have no eye-tracker, you may need to buy or borrow one. Eye-trackers have been expensive
in the past, but are now dropping in price, and, unfortunately, quality. If they think that you may buy
it, manufacturers are not unwilling to lend you an eye-tracker to try it out.

Program your experiment and pilot it


Now take your experimental design and program it in the software you have for your particular
eye-tracker. In some cases, this is easy. A handful of images or webpages are added into slideshow
software that comes with the eye-tracker. In most cases, however, the experiment has several
conditions with a certain cross-balance, conditional branching, and many other features that require careful
programming and testing to ensure that the actual experiment you want to run is generated.
Test that any external equipment can interface with the eye-tracker correctly. Select stimulus
presentation tools with care, and test them properly for timing. Be sure to pilot the technical set-up
as well as participant instructions and data analysis properly before you bring in real participants and
commit yourself to the recruitment and recording process.
Once you feel you have a running version, use a few of your friends and colleagues to pilot
the experiment. Pretend that they are real participants and do everything as you would do it in the
real data recording. See whether the desired behaviour is elicited. If not, you may need to redesign
something in your experiment. If you did get data more or less in line with what you expected, export
your data as the measures (fixation durations, for instance) that you will analyse, and plot them. Does
it look like the imaginary chart you made when designing the experiment? Good, then you may really
find an effect. If not, rethink your experiment again.
☐ I have successfully conducted a pilot study.
☐ When I plotted the pilot data, it resembled the expected data I drew when planning the
experiment.
☐ I can export the data in a format that allows me to calculate the measures I planned to
investigate.
☐ The analyses that I planned are feasible to conduct.
Finally, take your data through the statistical analysis that you planned. Do your data work with
the requirements of the analysis? If so, great! Did you see a tendency in your data that was in line with
or contradicted your expectations? If you could not use the analysis as you thought then rethink your
measures, your hypothesis, and your statistics. Chapter 3 and the beginning of Chapter 5 describe
this phase in more detail.

Recruit real participants


In some studies, you recruit whoever happens to pass by: students in the cafeteria, visitors to a
supermarket, etc. Alternatively, you may have particular requirements for your experiment: children of
a specific age, students with a particular educational background, or a balance between genders, or
possibly even a clinical group with a particular diagnosis. Sometimes, recruitment can take months
of preparation and include giving presentations to parental associations and writing applications to
an ethical board. Sometimes, you need to train your dog or chimpanzee for weeks on how to sit still
in your unconstrained environment and look at calibration points before you can record them in the
real task. You may then even need to build a factory line so that you can record as many as possible
in a short time! If you care about high data quality, prioritize recruiting young participants with open,
dark eyes, and avoid recruiting people with makeup, glasses, contact lenses, a narrow eye cleft or
downward pointing eye lashes.
☐ I developed a successful recruitment strategy.
☐ I have a cover story for participants who ask what the study is about.
☐ I have made a recording schedule.
Chapter 3 describes experimental design and Chapter 5 the recording of data.

Recording of real data


Recording data is the pivotal point in all eye-tracker-based research. All preparations lead to data
recording, and all analyses use the collected data. Once you have started to test your participants
with a specific type of experimental set-up, you have committed to it. If you have prepared the data
recording well, you only need to think about how to greet participants as they arrive and how to
operate the eye-tracker and the experiment you have created. If you expect to have participants with
glasses, mascara, or contact lenses, read Chapters 5 and 6 before recording data.
☐ I know my eye-tracker well enough to capture good data for my particular participant
group, whether adult, infant, or animal, and I know how much freedom I have to adjust the
participant position and the eye-tracker.
☐ I prepared and made copies of all consent forms, rewards, and other documentation.
☐ Participants know when and how to get to my lab.
Setting up the eye camera for a given participant is the most decisive factor for good data quality
(aside from the eye-tracker itself), and requires an understanding of the principles of video-based eye
tracking which can only be acquired by experience. The good thing is that this experience can be
transferred to any other video-based eye-tracker. When they arrive, look at the participant’s eyes to
quickly identify any mascara, eyeshadow, drooping eyelids and downward eye lashes, or squint. Also
check for contact lenses or difficult glasses with antireflective coating and thick, dark frames with
reflective parts in them. Get acquainted with the level of freedom that you have in directing the eye
camera towards the participant’s eye(s), and learn how to use contrast, luminance, and focus settings
to make it the clearest possible eye image.
Back up your recorded data regularly and systematically, in a way that respects the anonymity of
your participants and that cannot be accessed by third parties. Treat participants so that they feel
reasonably good about the task. This is an investment in trust that makes them come back to participate
in your next experiment.

Processing of data
When all of your data are recorded and backed up, your day-to-day work will change to the
processing of your data. This can appear very simple, as manufacturer supplied software provides many
functions for the advanced export of values for measures calculated from your data. However, a lot
of things happen under the hood, and decisions you make about your data depend on them. Many
researchers prefer to program their entire analysis themselves, to control it as they want to, and to
know what happens. The data processing step is one where it is easy to get lost. Getting through this
stage is easier if you decided during your experimental design which measures and what statistics to
use, and even easier if you already piloted it. Part II describes the most common types of processing.

Using event detection


Do you need event detection at all? If you are hunting for effects that do not necessarily require
events, and those effects are small, using raw data makes sense. There is always a risk that the bias or
variability of an algorithm can drown out the effects you are looking for.
If you need to analyse your data with a fixation or saccade detection algorithm, and care about
the validity of the output, what should you do? Recommendations for algorithmic settings depend
on factors such as the eye-tracker used for data collection, individual traits such as fixation stability,
and the particular circumstances during calibration and recording (see Chapter 7 for a detailed
discussion). If you have access to a velocity-based algorithm, you are more likely to produce a good
output. Dispersion-based algorithms are not suited to analysing data collected with higher sampling
frequencies (> 200 Hz). If you do not trust manufacturer algorithms, researchers have built many
algorithms that you can choose from, many of which are specialized for specific needs.
☐ I have plotted my fixations next to my raw data, as in Figure 7.31 on p. 233, and selected a
setting so that all of the fixations I see in the raw data appear in the fixation plot.
☐ I have added a note to my manuscript specifying the algorithm and the detection parameters
that I have used.
If you cannot get the algorithm to produce the fixations you see in the raw data, you could examine
the distribution of events in your measure at different settings (look at the histograms), before you
decide which setting to use. Make parallel analyses with several settings, and see how this affects
your results (see Green, 2006, who does this).
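The idea of running parallel analyses with several settings can be sketched in a few lines. This is a deliberately simplified illustration and not any manufacturer's or published algorithm: it assumes gaze positions in degrees of visual angle at a fixed sampling rate, labels a sample as slow when its sample-to-sample speed is below a velocity threshold, and reports runs of slow samples as fixations. The function names, the thresholds, the minimum duration, and the synthetic recording (two fixations, a slow drift inside the first one, and one saccade) are all hypothetical.

```python
import math

def detect_fixations(x, y, hz, velocity_threshold, min_duration_ms=50.0):
    """Return fixation durations (ms) using a simple velocity-threshold rule."""
    durations = []
    run = 1  # samples in the current candidate fixation; sample 0 starts one
    for i in range(1, len(x)):
        speed = math.hypot(x[i] - x[i - 1], y[i] - y[i - 1]) * hz  # deg/s
        if speed < velocity_threshold:
            run += 1
            continue
        # A fast sample closes the current run of slow samples.
        duration = run * 1000.0 / hz
        if duration >= min_duration_ms:
            durations.append(duration)
        run = 0
    duration = run * 1000.0 / hz
    if duration >= min_duration_ms:
        durations.append(duration)
    return durations

def make_synthetic_gaze():
    """Synthetic 500 Hz recording: fixation, slow drift, fixation, saccade, fixation."""
    x = [0.0] * 100
    for _ in range(5):          # drift at about 20 deg/s
        x.append(x[-1] + 0.04)
    x.extend([x[-1]] * 95)
    for _ in range(10):         # saccade at about 500 deg/s
        x.append(x[-1] + 1.0)
    x.extend([x[-1]] * 100)
    return x, [0.0] * len(x)

x, y = make_synthetic_gaze()
for threshold in (10.0, 30.0, 100.0):
    durations = detect_fixations(x, y, hz=500, velocity_threshold=threshold)
    mean = sum(durations) / len(durations)
    print(f"threshold {threshold:5.1f} deg/s: {len(durations)} fixations, mean {mean:.1f} ms")
```

On this synthetic recording the 10 deg/s setting splits the first fixation at the drift and reports three fixations, whereas the 30 and 100 deg/s settings report two longer ones. A difference of that kind in your own data is a signal that any effect in fixation duration should be checked at more than one setting.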
Beware of smooth pursuit and movements which appear to be smooth pursuit in your data file.
This is likely to be part of your data if you use dynamic stimuli or a head-mounted eye-tracker.
Current manufacturer algorithms have been designed to analyse only data recorded from static stimuli;
if you use animation, look for algorithms designed for such data. Specific algorithms will also be
needed if you have recorded very noisy data and still want to use it, or if you want to de-saccade your
data—the practice of removing saccades from smooth pursuit data and interpolating—in order to
calculate gain and phase.
Beware that some implementations divide events (such as a fixation) that cross a trial boundary
into two parts. This may lead to artificially low first fixation durations for trials; one option is to
exclude such partial events from the analysis. Clearly define the events that you use in your article.
For example, do ‘fixations’ refer to implicitly detected intersaccadic intervals or explicitly detected
oculomotor periods of stillness? If saccade endings or fixation onsets are important, what did you do
with post-saccadic oscillations?
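The option of excluding partial events can be sketched as a simple filter. The example assumes a hypothetical data layout in which each fixation is a (start_ms, end_ms) pair on the same clock as the trial onset and offset; your export format will differ, and the function name is made up for illustration.

```python
def fixations_within_trial(fixations, trial_start, trial_end):
    """Keep only fixations that both start and end inside the trial interval."""
    return [
        (start, end)
        for start, end in fixations
        if start >= trial_start and end <= trial_end
    ]

# Made-up fixations (ms); the first and last ones cross the trial boundaries.
fixations = [(-120, 80), (100, 350), (400, 620), (900, 1100)]
kept = fixations_within_trial(fixations, trial_start=0, trial_end=1000)
first_start, first_end = kept[0]
print("kept:", kept)
print("first fixation duration (ms):", first_end - first_start)
```

Had the fixation that began before trial onset been clipped at the boundary instead, the first fixation duration for this trial would have been 80 ms rather than 250 ms.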

Using areas of interest (AOIs)


There are many measures that use the events and representations based on AOIs, and each and
every one of them is very sensitive to how you divide your stimulus into AOIs. Avoid arbitrary AOI
positioning; ensure your AOIs are as precise as possible in relation to the important elements of the
stimulus. For complex real life stimuli, use an external method (such as expert ratings) to decide
whether your division of stimulus space is suitable. If you are free to design your stimulus, do not
put objects so close together that you cannot have a margin between AOIs. The minimal AOI size
is limited by the precision and accuracy (pp. 159–189) of the data that your eye-tracker can record.
Inaccurate data (offsets) can be repaired, but it is manual and tedious, and should not be done unless
you know exactly what calculations to make and the potential consequences for your data.
☐ My research hypothesis determined what AOIs I needed on the stimulus.
☐ I set up my AOIs during experimental design and did not change them after I had inspected my
data, except for adding margins to account for poor data quality.
☐ I set up my AOIs on the basis of the distribution of data, and I can motivate this from my
experimental design.
☐ My AOIs cover areas with homogeneous semantics, which are founded in the rationale behind
my experimental design.
Overlapping AOIs should not be used unless the hypothesis and stimulus demand it, and then the
calculation of first fixations, dwell times, and transitions must be reconsidered. Do not distribute a
single AOI over many areas of your stimulus, unless there is a clear link between the semantics of
those areas, your research hypothesis, and the measures you employ. When using transition measures,
report what is known as ‘whitespace’ (parts of the stimulus not covered by any AOIs) as a proportion
over the whole stimulus. Define how transitions are calculated with regard to whitespace. Be aware of
measures that are scaling dependent with respect to factors such as the size or the content of an AOI.
Scaling must be motivated by the semantics of the stimulus, and the baseline probability of looking
towards each area.
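Reporting whitespace and stating how transitions treat it can be made explicit in code. The sketch below assumes rectangular, non-overlapping AOIs given as (x, y, width, height) in pixels and fixations given as (x, y) points; the AOI names, the layout, and the two transition definitions (skipping whitespace or counting it as a region of its own) are illustrative choices, not a recommendation for any particular one.

```python
from collections import Counter

WHITESPACE = None  # label for fixations that fall outside every AOI

def whitespace_proportion(aois, stim_w, stim_h):
    """Share of the stimulus area not covered by any (non-overlapping) AOI."""
    covered = sum(w * h for (_x, _y, w, h) in aois.values())
    return 1.0 - covered / (stim_w * stim_h)

def aoi_of(point, aois):
    """Name of the AOI containing the point, or WHITESPACE."""
    px, py = point
    for name, (x, y, w, h) in aois.items():
        if x <= px < x + w and y <= py < y + h:
            return name
    return WHITESPACE

def count_transitions(labels, skip_whitespace=True):
    """Count moves between different regions in a sequence of AOI labels."""
    if skip_whitespace:
        labels = [label for label in labels if label is not WHITESPACE]
    counts = Counter()
    for a, b in zip(labels, labels[1:]):
        if a != b:
            counts[(a, b)] += 1
    return counts

aois = {"text": (0, 0, 400, 300), "image": (500, 0, 300, 300)}
fixations = [(100, 100), (450, 100), (600, 100), (100, 100)]
labels = [aoi_of(f, aois) for f in fixations]

print("whitespace proportion:", round(whitespace_proportion(aois, 1000, 600), 2))
print("skipping whitespace:", dict(count_transitions(labels)))
print("counting whitespace:", dict(count_transitions(labels, skip_whitespace=False)))
```

The same four fixations give two transitions under one definition and three under the other, which is why the definition needs to be stated alongside the whitespace proportion.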
Manually coding dwells and transitions from gaze-overlaid videos is not overly difficult for
limited amounts of data, but coding hours of recordings with data from head-mounted eye-trackers is
very time-consuming.

Using gaze density maps


If you plan to use gaze density maps to exemplify results, make exploratory investigations of your
data, perform statistical tests between groups, or simply because heat maps are a deliverable
presentation to your customer, there is a lot to consider. First of all, gaze density maps represent the spatial
distribution of data and nothing else. Visualizations such as heat maps tend to inspire less
experienced users of eye-trackers to jump to conclusions about why participants look at the hot-spots. A
heat map can only show where participants look, not why they look there (p. 40 and 317-322).
Gaze density maps ignore the underlying semantic areas of your stimulus. Use areas of interest
(AOIs) if you want to relate eye movement data to semantic areas. Gaze density maps collapse over
time and often also over participants, which means that you lose a lot of potentially useful
information. If you plan to use gaze density map-based statistics, see to it that your hypothesis compares the
information retained, namely the overall spatial distribution.
Visualizations such as heat maps can exemplify, support, and even reveal nuances in quantitative
results, but should be published on their own only after careful consideration. When you publish
your gaze density map visualization, always report the type of eye-tracker used and its sampling
frequency and the analysis software version. Also specify the settings for the basic constructs, the
settings for the mapping of colour, luminance, or contrast to the height of the map, the time segment
that the visualization was created from, whether you use fixations rather than raw data samples,
and the criteria for fixation detection. Make sure that you have sufficient amounts of data for your
visualizations.
There are virtually no guidelines or systematic investigations of the effects of settings for the
colour mapping. A value corresponding to a circle on the screen with a diameter of around 2°
of visual angle will give an indication of what is looked at with the point of highest visual acuity
(the fovea), but note that this can be misleading because items can still be attended while in peripheral
vision outside of this area. Blignaut (2010) addressed this issue by incorporating the perceptual span
into gaze density map settings. Do not make a habit of changing values up and down between heat
map visualizations of your different groups, participants, or conditions. If you do, your visualizations
will not be comparable or useful for data interpretation. In fact, you will be producing arbitrary art
rather than scientific data visualizations. Select one setting and stick with it for all of the heat maps
you make, so that you know that any differences that you see between the heat maps are actually an
effect in your data and not in your settings. If you have data from a low-speed eye-tracker with lower
precision, use fixations for building your gaze density maps. If you have a high-end eye-tracker, you
can also use raw data. If you are having participants look at a central fixation cross before stimuli
onset, make sure to remove the first fixation in your recordings, otherwise your gaze density map
may have an artificially large hot-spot in the centre.
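Three of the points above (a kernel size tied to visual angle, one fixed setting for every map, and removal of the fixation on the central cross) can be sketched together. Everything specific here is an assumption made for illustration: the viewing distance and monitor size, the use of a Gaussian kernel, the choice of half the 2° diameter as its standard deviation, and the coarse 20-pixel grid. Chapter 9 covers the actual mathematics.

```python
import math

def visual_angle_to_pixels(angle_deg, distance_cm, screen_width_cm, screen_width_px):
    """On-screen size in pixels of an object subtending angle_deg at the eye."""
    size_cm = 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)
    return size_cm * screen_width_px / screen_width_cm

def density_map(fixations, width, height, sigma_px, cell=20):
    """Sum of Gaussian kernels, one per fixation, evaluated at grid cell centres."""
    cols, rows = width // cell, height // cell
    grid = [[0.0] * cols for _ in range(rows)]
    for fx, fy in fixations:
        for r in range(rows):
            cy = (r + 0.5) * cell
            for c in range(cols):
                cx = (c + 0.5) * cell
                d2 = (cx - fx) ** 2 + (cy - fy) ** 2
                grid[r][c] += math.exp(-d2 / (2.0 * sigma_px ** 2))
    return grid

def drop_first_fixation(fixations):
    """Remove the fixation that landed on the pre-stimulus fixation cross."""
    return fixations[1:]

# Assumed set-up: 65 cm viewing distance, 53 cm wide monitor, 1920 x 1080 pixels.
diameter_px = visual_angle_to_pixels(2.0, 65.0, 53.0, 1920)
SIGMA_PX = diameter_px / 2.0  # one fixed setting, reused for every map

fixations = [(960, 540), (300, 300), (320, 310)]  # the first is on the central cross
with_cross = density_map(fixations, 1920, 1080, SIGMA_PX)
without_cross = density_map(drop_first_fixation(fixations), 1920, 1080, SIGMA_PX)

print("2 deg in pixels:", round(diameter_px, 1))
print("density near screen centre, cross kept:   ", round(with_cross[27][48], 3))
print("density near screen centre, cross removed:", round(without_cross[27][48], 3))
```

Keeping `SIGMA_PX` as a single constant is the point of the design: any difference between two maps then comes from the data and not from the settings.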
Gaze density maps are much more versatile than generally believed, and can also be used
scientifically. Figure 9.6 on p. 308 illustrates the range of possibilities.
Chapter 9 describes gaze density maps, heat maps, and focus maps in detail, including their
mathematics and how they interact with your experimental design. Addressing people who use eye
tracking for applied purposes, Bojko (2009) gave similar (but not identical) advice.

Using scanpaths
Scanpath visualizations are excellent for first inspections of data, answering questions such as: is
the data quality good, did the fixation detection algorithm do a good job, and is this recording in
line with my hypothesis? Do not put scanpath visualizations in your papers just as decoration. Ask
yourself why you have put it there, and see to it that the scanpath visualization aligns well with your
hypothesis, operationalizations, and results. There are a number of scanpath events ready to be
used in statistical analyses, such as returns, regressions, look-aheads and sweeps, and many more that
could be defined. In order to attribute meaningful interpretations to individual scanpaths, you need to
disambiguate the data using a tight experimental design, and verbal data or other complementary data
recordings. All of the scanpath representations used in particular measures reduce the level of detail
in the scanpaths, in terms of both spatial and temporal accuracy. Other properties such as fixation
duration are sometimes ignored completely.
Be sure to use a scanpath representation that retains the properties that you want to measure. If
you are using measures that utilize scanpath representations, be aware that raw data quality, event
detection algorithms and their settings, and all issues around AOI identification may all introduce
noise into the values you get from the measure. Always see to it that you have a baseline similarity to
compare against your measured similarity. Scanpath events and representations are at the top of the
hierarchy, and results may depend upon choices that you made in earlier steps of the analysis.
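The advice to compare a measured similarity against a baseline can be illustrated with one of the simplest scanpath representations, a string of AOI labels. The sketch below is an assumption-laden example and not the method of Chapter 10: it uses a normalized string-edit (Levenshtein) similarity and, as one possible baseline among many, the mean similarity obtained when the order of one scanpath is randomly shuffled. The AOI strings and all function names are made up.

```python
import random

def edit_distance(a, b):
    """Levenshtein distance between two sequences of AOI labels."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, 1):
        cur = [i]
        for j, item_b in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                       # deletion
                cur[j - 1] + 1,                    # insertion
                prev[j - 1] + (item_a != item_b),  # substitution
            ))
        prev = cur
    return prev[-1]

def similarity(a, b):
    """1.0 for identical sequences, 0.0 when every position must be edited."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest

def baseline_similarity(a, b, n_shuffles=1000, seed=0):
    """Mean similarity between a and randomly reordered copies of b."""
    rng = random.Random(seed)
    shuffled = list(b)
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        total += similarity(a, shuffled)
    return total / n_shuffles

# Made-up AOI strings for two participants viewing the same stimulus.
scanpath_1 = "ABCDABEF"
scanpath_2 = "ABCDBAEF"
print("measured similarity:", round(similarity(scanpath_1, scanpath_2), 3))
print("shuffled baseline:  ", round(baseline_similarity(scanpath_1, scanpath_2), 3))
```

A shuffled baseline keeps the AOIs that were visited and destroys only their order, so it answers a narrower question than a baseline built from, say, other participants or other stimuli. Which baseline is appropriate depends on the hypothesis.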
In Chapter 10, we present in detail the scanpath concept, the many usages of scanpaths, common
scanpath events, and the methods for comparing scanpaths.

Statistical analysis
In a study with a simple experimental design, statistical analysis can be done in almost no time.
In more complex studies, statistical analysis often requires a series of explorative investigations of
statistical distributions of your measures (and the intercorrelations between them) before you can
actually calculate effect sizes and p-values. Participants may have to be removed, and this can skew
your balanced experimental design so much that you may decide to go back and recruit some more
participants.
When the statistical analysis is done, you just need to write up and publish. If you have prepared
texts from each of the steps above, a lot of the writing will be done already. In many cases, however,
writing up involves as many complicated considerations and as much work as conducting the actual
study.
