Computational Characteristics of Human Escape Decisions

Juliana K. Sporrer 1*, Jack Brookes 1*, Samson Hall 1, Sajjad Zabbah 1, Ulises Daniel Serratos
Hernandez 1, Dominik R. Bach 1,2

1 Max Planck UCL Centre for Computational Psychiatry and Ageing Research and
Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology,
University College London, United Kingdom
2 Hertz Chair for Artificial Intelligence and Neuroscience, University of Bonn, Germany
* These authors contributed equally to this work
Address correspondence to d.bach@uni-bonn.de

Abstract: Animals including humans must cope with immediate threat and make rapid
decisions between action options to survive. Without much leeway for cognitive or motor errors,
this poses a formidable computational problem. Utilizing fully-immersive virtual reality with 13
natural threats, we examined escape decisions in humans. We show that escape goals are
dynamically updated according to environmental changes. The decision whether and when to
escape depends on time-to-impact, threat properties, and stable personal characteristics. Its
implementation integrates secondary goals such as behavioral affordances. Perturbance
experiments show that the underlying decision algorithm exhibits planning properties and can
integrate novel actions, in keeping with model-based planning. In contrast, rapid
information-seeking and foraging-suppression are initiated based on scalar action values, which
adapt over time. Our results suggest that instead of being instinctive, stereotypical, or hardwired
stimulus-response patterns, human escape decisions integrate multiple variables in a flexible
computational architecture, whereas secondary defensive behaviors are controlled by simpler
decision mechanisms. Taken together, we provide steps towards a computational model of how
the human brain rapidly solves survival challenges.

Main Text: Field observations and laboratory experiments have revealed that many
species employ complex and sophisticated defensive behaviors1 to escape from immediate threat.
Despite advances in some non-human species1,2, little is known about human escape actions and
the computational mechanisms that control them, which are difficult to study for ethical reasons.
Beyond imagined threat scenarios3, more recent work has used third-person view computer
games4,5, but these restrict possible actions to key presses or joystick movements. Such tasks are
likely to underestimate the complexity of the action space, and the ensuing decision problem6. For
example, Homer's Iliad anecdotally describes at least 13 distinct behavioral patterns under
conspecific attack, not counting the use of weapons (Table S1). However, a systematic empirical
assessment of the mechanisms that compute this behavior in humans remains elusive6–8.

Here, we investigated human escape behavior in a fully-immersive virtual reality (VR)
environment in which participants could move freely within a 5 x 10 m physical space (Fig. 1.A.).
In two independent experiments, N1=29 (E1) and N2=30 (E2) participants were instructed to
forage for fruit on a bush, and to avoid contact with various threats over 68/36 (E1/E2 part 1)
independent epochs. We included 13/7 natural and 3/1 artificial (control) threats (Figs. 1.B., S1.;
Table S2), which appeared on 60/32 of all epochs (see Supplementary Information). A shelter,
which was always 5 meters behind the fruit bush, provided protection from all threats. On half of
these epochs, the threat moved toward the participant (“attack” condition; Fig. 1.C.), requiring
escape, while on the other half, the threat diverted from the attack trajectory after covering 20%
of the distance, making any escape unnecessary (“divert” condition; Fig. 1.D., see
Supplementary Information). All threats were animated to show realistic behavior (URL to
replay the experiments will be made public upon publication). Fast animal threats would chase
and outrun the participants, forcing them to enter the shelter to survive. Slow animal threats
would chase but not outrun the participants, thus requiring escape but not necessarily entry into
shelter. The inanimate threat did not chase, so entering the shelter was unnecessary.
Analyses in main text refer to natural threats included across both experiments (Fig. 1.B., see
Supplementary Information). In an exploration-confirmation strategy, hypotheses were generated
by exploring contrasts of estimated marginal means from linear mixed-effects models in the first
sample, which were then tested in the second sample with Holm-Bonferroni correction for
multiple comparisons (Fig. 1.E.; Tables S5, S6).

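The Holm-Bonferroni step-down procedure mentioned above can be sketched as follows. This is a minimal illustration of the general method, not the authors' analysis code; the function name and the example p-values are hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True if it is rejected.

    Step-down rule: sort p-values in ascending order and compare the
    k-th smallest (k = 1..m) with alpha / (m - k + 1). Stop at the first
    p-value that exceeds its threshold; it and all larger ones are retained.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):  # rank = k - 1
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject


# Hypothetical p-values for four confirmatory tests (not taken from the paper).
example_p = [0.001, 0.04, 0.03, 0.0001]
decisions = holm_bonferroni(example_p)
print(decisions)
```

In this example the two smallest p-values pass their thresholds (0.05/4 and 0.05/3), whereas 0.03 exceeds 0.05/2, so testing stops there and the remaining two hypotheses are retained.
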
We define “virtual survival” as avoiding contact with the threat. When
attacked, virtual survival was 81.8%/75.4% and plateaued after around 3/5 epochs (Fig. S2).
Participants collected on average 9.1/9.0 fruits per epoch (Fig. S3). Beyond complying with
explicit instructions to forage and survive, they also engaged in task-irrelevant behaviors that
might be adaptive in natural environments. Alarm vocalization is a commonly used defensive
behavior in the animal kingdom to deter or distract predators and warn conspecifics9. Participants
vocalized toward the threat (e.g. shrieks, squeaks) in 9.1%/18.8% of epochs (Fig. S3). They also
sought shelter in the absence of threat in 16.2%/7.1% of epochs. Compared to typical VR
experiments, cybersickness was low (see Supplementary Information)10.

Escape is often conceptualized as “instinctive”: triggered by specific features, with
flexible motor implementation, but a predictable end goal1,11. In our experiments, escape was
initiated on 93.8%/91.7% of attack epochs, and on 54.4%/70.8% of divert epochs (Fig. S4). Once
escape was initiated, its ultimate target depended on the threat trajectory. For fast threats,
participants did not reach the shelter on 14.2%/23.2% of initiated escapes when attacked. In
contrast, when threat diverted, shelter was not reached in 63.9%/37.2% of initiated escapes (H1,
p<.001; Figs. 1.F., S5.). Notably, in attack epochs, escape interruption appeared non-intentional,
as it occurred later than during divert epochs and invariably led to virtual death (Figs. 1.G., S5.).
Importantly, in divert epochs, 27.4%/37.5% of interrupted escapes had already been initiated
before the threat diverted and were thus presumably started with an intention to go to shelter.
These results demonstrate that initiated escapes with an intention to go to shelter can be
interrupted when the environment changes. This suggests that escape goals are dynamically
updated during escape, rather than being predictable from the outset.

A commonly suggested escape trigger is “defensive distance”11 or “predatory
imminence”12 of the threat, which is thought to integrate physical distance and type of threat.
Here, we formalized this concept as time-to-impact of the threat, defined as the maximum time
available to initiate an escape and just about collide with the threat during escape (i.e. if
participants were minimally faster they would surely survive). By adjusting physical threat
distance and the surrounding covering grass, we implemented two time-to-impact conditions, 1.5
s and 5 s (Fig. S2; Table S3). When threat attacked, participants initiated escape earlier during
short time-to-impact (1.3 s/1.2 s) compared to long time-to-impact (2.9 s/2.5 s; H2, p<.0001; Fig.
1.H.). When threat diverted, participants initiated escape more often when time-to-impact was
short vs. long (69.2%/82.6% vs. 39.8%/59.0%; H3, p<.0001). These results suggest that
time-to-impact is an important factor for determining whether and when to initiate escape.

Beyond time-to-impact, biological and physical threat characteristics contributed to
participants’ escape decisions. For short time-to-impact, participants were more likely to enter
the shelter when attacked by fast feral (i.e. elephant, bear; 61.6%/68.3%) than fast familiar threat
(i.e. dog, human; 51.9%/56.9%; H4, p<.05; Figs. 1.I., S6). Participants entered the shelter less
often when attacked by the (ballistic) rock than by any other threat (38.6%/40.7% vs.
76.0%/64.9%; H5, p<.0001) and initiated escape later than for any other threat when
time-to-impact was long (4.8 s/4.2 s vs. 3.2 s/2.9 s; H6, p<.0001; Figs. 1.H., S7). These results
suggest that the decision whether and when to escape integrates the expected trajectory and
behavior of the threat with its time-to-impact. Additionally, escape decisions were affected by
stable personal characteristics (Tables S7, S8). Across all threats and conditions, fearfulness13,
fear of spiders14, and sex jointly explained 27%/40% of the between-person variance in escape
initiation time (Q-H1, p<.005), and 27%/46% of the variance in minimum distance from threat
during escape (Q-H2, p<.005): generally fearful females with high fear of spiders initiated escape
earlier, and left more space between themselves and the threat.

Fig. 1. Investigation of escape decisions and their implementation using fully-immersive virtual reality. (A)
Participant view (example epoch). (B) Images of the 7 natural threats used in E1-E2. (C) Schematic view of the
scenario setup. In “attack” condition, threats moved toward the participant’s position. Fast threats appeared from
obscuring tall grass (dark green) and slow threats from short grass (light green) with dashed lines representing their
possible trajectories. The escape path toward the shelter is shown in blue. (D) In “divert” condition, the threat deviated
after covering 20% of the distance to the fruit bush. (E) Schematic representation of all behavioral outcomes whose
hypotheses replicated. (F) Percentage of interrupted escape. (G) Distance of the participant from the fruit bush over
time. Each colored line represents a participant, and the black line is the overall mean. (H) Time of escape initiation.
(I) Percentage of shelter entries. (J) Mean speed during escape. (K) Body orientation (mean cosine of orientation
angle from threat; 1: toward; -1: away). Large points with error bars represent mean and standard error across
participants and epochs, and small points represent individual epochs. Threats are sorted by speed, with the ballistic
rock on top, from elephant (fastest) to spider (slowest). Orange: 1.5 s time-to-impact; brown: 5 s time-to-impact.

Next, we addressed the implementation of escape once initiated. When attacked, mean
escape speed - and thus, energy expenditure15 - depended on the threat’s speed (slow threat:
1.4/1.3 m/s, fast threat: 1.9/1.9 m/s; H7, p<.0001; Figs. 1.J., S8). Similarly, participants
oriented their body more toward the threat during escape for slow vs. fast threats when attacked
(averaged cosine of orientation angle relative to the threat, ranging from -1 (away from threat) to 1
(toward threat): 0.2/0.1 vs. -0.3/-0.3; H8, p<.0001; Figs. 1.K., S9). Body orientation during
movement affects energy expenditure16 as well as how easy it is to observe the threat, and to
return to foraging. Furthermore, head orientation during escape, and mean escape speed,
depended on time-to-impact: participants oriented their head more toward the threat (0.0/0.0 vs.
-0.1/-0.2; H9, p<.05; Fig. S10) and moved more slowly (1.5/1.5 m/s vs. 1.9/1.9 m/s;
H10, p<.0001) when they had more time. Finally, visual scanning during the first 1.5 s after
threat appeared was less pronounced for a human threat, compared to any other fast threat
(56.9/71.3 °/s vs. 66.2/79.9 °/s; H11, p<.0001; Fig. S11). Taken together, this suggests
that escape implementation accounts for energy optimization and behavioral affordances (e.g.
monitor threat). In turn, it also depended on stable personal characteristics (Tables S7, S8).
Sensation seeking17, fearfulness13, fear of spiders14, and sex jointly explained 36%/34% of the
between-person variance in head orientation when the threat appeared (Q-H4, p<.05). High
sensation seekers oriented more, and people with high fearfulness and fear of spiders oriented
less, toward the threat.

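The orientation measure used above (cosine of the angle between a forward vector and the line to the threat, 1 = toward, -1 = away) can be sketched as follows, together with a trapezium-rule time average of the kind the Methods describe for irregularly sampled data. This is only an illustration: the function names, the restriction to 2D ground-plane coordinates, and the sample values are assumptions, not the authors' implementation.

```python
import math


def orientation_cosine(forward, position, threat_position):
    """Cosine of the angle between a forward vector and the direction to the threat.

    All arguments are (x, y) pairs on the ground plane (an assumed simplification).
    Returns 1.0 when facing the threat and -1.0 when facing directly away.
    """
    to_threat = (threat_position[0] - position[0], threat_position[1] - position[1])
    dot = forward[0] * to_threat[0] + forward[1] * to_threat[1]
    norm = math.hypot(*forward) * math.hypot(*to_threat)
    if norm == 0:
        raise ValueError("forward vector and threat direction must be non-zero")
    return dot / norm


def trapezoid_time_average(times, values):
    """Time-weighted mean of irregularly sampled values (trapezium rule)."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two samples with matching lengths")
    area = 0.0
    for i in range(1, len(times)):
        area += 0.5 * (values[i] + values[i - 1]) * (times[i] - times[i - 1])
    return area / (times[-1] - times[0])


# Hypothetical samples: facing the threat, then turning away over irregular time steps.
facing = orientation_cosine((0.0, 1.0), (0.0, 0.0), (0.0, 5.0))
away = orientation_cosine((0.0, -1.0), (0.0, 0.0), (0.0, 5.0))
mean_cos = trapezoid_time_average([0.0, 0.5, 1.5], [facing, facing, away])
print(facing, away, mean_cos)
```

Weighting by the interval between samples keeps the average from being dominated by periods in which the tracker happened to deliver more frames.
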
These results strengthen the view that escaping humans face a complex decision-making
problem. Statistically optimal decision algorithms, such as model-based tree search, are time-
and memory-consuming6. Thus, defensive behavior, especially at short defensive distance, has
been suggested to be controlled by mechanisms that are solely based on scalar action values (also
termed model-free)8,18. These action values might be malleable from experience, but could even
be hard-wired. Such characteristics have a potential to improve decision speed but reduce
flexibility to adapt to changing environments. Hence, in the second part of E2, we perturbed the
environment in non-natural ways to reveal the computational characteristics of the underlying
mechanisms through behavior. All epochs in this part employed a predatory threat that had not
occurred in the first part of E2 (i.e. panther; Fig. 2.A.).

The first characteristic we investigated was sensitivity to reinforcer devaluation. This
addresses whether an action is suppressed when the outcome (e.g., escape to shelter) is suddenly
no longer desirable. It distinguishes, for example, certain model-based from classical model-free
mechanisms19. On half of epochs in a “force shield” block (Fig. 2.B.), we devalued the shelter by
endowing participants with a protective force shield that built up around the participant at the
start of an epoch and then visually disappeared. In a tutorial, participants learned from
experience that objects could not penetrate the force shield even when it was invisible, meaning
it would protect just like the shelter. Participants did not encounter the panther during this
tutorial. A devaluation-sensitive mechanism would stop escaping immediately when
encountering the panther within the force shield, whereas an insensitive mechanism might need
repeated exposure to the panther in order to learn to suppress escape19. Hence, the following
analyses only refer to the first epoch of each condition (subsequent epochs are visualized in Fig.
S12). We found that participants almost never escaped to shelter when attacked within the force
shield compared to without, independent of time-to-impact (3.3% vs. 80%; E2-H1, p<.0001; Fig.
2.C.; Table S9). When the force shield was active, they stayed at a comparable distance from
shelter regardless of threat presence (2.5 m vs. 2.6 m; E2-H2). This demonstrates that escape
initiation is sensitive to reinforcer devaluation and likely to be under model-based control. On
the other hand, some secondary defensive behaviors showed a different picture. When within the
force shield and a threat appeared, fruit picking rate over the remaining epoch was suppressed
compared to a no-threat epoch (0.8/s vs. 1.1/s; E2-H3, p<.0001; Fig. 2.C.). In turn, visual
scanning over 0-1.5 s after threat appearance was increased, compared to a no-threat epoch
(45.0 °/s vs. 25.6 °/s; E2-H4, p<.05; Fig. 2.C.). This suggests that the mechanisms controlling
rapid information-seeking and foraging-suppression are partly insensitive to reinforcer
devaluation.

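The logic of the devaluation test can be illustrated with a toy contrast between a controller that recomputes action values from a model of outcomes and one that relies on cached scalar action values. This is a deliberately simplified sketch and not a model fitted or proposed in the paper; the utilities (1 for survival, -1 for virtual death, 0.2 foraging cost of escaping) and the learning rate are hypothetical.

```python
ACTIONS = ("escape", "stay")


def outcome_value(action, shield_active):
    """Hypothetical utility of an action given whether the force shield protects."""
    survives = action == "escape" or shield_active
    foraging_cost = 0.2 if action == "escape" else 0.0
    return (1.0 if survives else -1.0) - foraging_cost


def model_based_choice(shield_active):
    """Recompute values from the outcome model on every decision."""
    return max(ACTIONS, key=lambda a: outcome_value(a, shield_active))


def cached_values_without_shield():
    """Scalar action values as they would be cached after learning without a shield."""
    return {a: outcome_value(a, shield_active=False) for a in ACTIONS}


def model_free_first_choice():
    """First shielded epoch: cached values have not yet been updated."""
    q = cached_values_without_shield()
    return max(ACTIONS, key=lambda a: q[a])


def exposures_until_switch(learning_rate=0.3, max_exposures=100):
    """Number of shielded 'stay' experiences before cached values favor staying."""
    q = cached_values_without_shield()
    for n in range(1, max_exposures + 1):
        reward = outcome_value("stay", shield_active=True)
        q["stay"] += learning_rate * (reward - q["stay"])  # delta-rule update
        if q["stay"] > q["escape"]:
            return n
    return None


print(model_based_choice(shield_active=True))  # stay
print(model_free_first_choice())               # escape
print(exposures_until_switch())
```

Under these assumed numbers, the model-based controller stays within the shield on the very first epoch, whereas the cached-value controller still escapes and needs several exposures before its values change. This is why the analyses above are restricted to the first epoch of each condition.
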
The second characteristic we addressed was the flexibility to update behavior from
experience (sometimes termed “Pavlovian” vs. “instrumental”). Another important defensive
behavior is information-seeking. In a “Medusa” block, participants were instructed that they had
to identify and suppress a particular movement (head orientation toward the threat) that would
activate a lethal magical force. They were not instructed what this movement was. Most
participants learned to suppress this action, and thus reduced mortality from the magical force
from 90.0% (epoch 1) to 18.5% (epoch 7; E2-H5, p<.0001; Fig. 2.D.). In a post-hoc interview,
60.0% of participants reported the correct force-activating movement. The remaining 40% still
appeared successful in reducing their mortality (from 83.3% to 30.0%; Fig. S13), possibly by
altering related movements (e.g. body orientation) or by learning the correct movement but not
forming awareness of it. Overall, these results indicate that human information-seeking behavior
preceding escape, while partly insensitive to reinforcer devaluation, is malleable by experience
rather than hardwired.

Fig. 2. Computational properties of the controllers underlying human escape behavior. (A) Participant view of
the force shield. (B) All epochs in E2 part 2 used the panther as threat. (C) Minimum distance from the shelter (top),
fruit picking rate over the entire epoch after threat appearance (centre), and visual scanning during 0-1.5 s after
threat appearance (bottom) in the force shield block. (D) Percentage of virtual death from magical force over epochs
in the Medusa block. (E) Percentage of escape to shelter in the hands-up block. Large points with error bars
represent the mean and standard error across all participants and epochs, and small points represent individual
epochs.

Finally, we addressed the flexibility of the action repertoire. In a “hands up” block,
participants were instructed that some panthers were trained to stop attacking when humans
raised both hands high above their head. This action is novel and non-natural in this context as
participants would normally use their hands to aid escape or protect their torso. The potential
presence of these specific panthers was indicated by a sign at the start of each epoch. Participants
were so efficient in utilizing this novel action that even in the first epoch for each condition, they
never escaped to shelter when the action was available (0.0% vs. 86.7%; E2-H6, p<.0001; Figs. 2.E.,
S14 for subsequent epochs). Furthermore, in two exploratory epochs in the last block of E2,
participants were attacked by a panther in a visually different scenario without shelter (see
Supplementary Information). Unexpectedly, 26.7% of participants spontaneously used the
non-natural hands-up action they had previously learned, despite lack of instruction to do so, or
proof of effectiveness in this new context. Taken together, these results confirm that escape
initiation is sensitive to reinforcer devaluation, and demonstrate that the action-selection
mechanism can easily integrate novel instructed actions and later retrieve them in novel
situations.

Discussion: While escape from immediate attack is sometimes depicted as a fixed
reaction pattern (“fight or flight response”), the absence of observable and explicit deliberation
does not imply a lack of dynamic and complex decision-making1. We demonstrate that
behavioral goals during human escape are dynamically updated, integrate time-to-impact with
threat properties such as attack probability (feral vs. domestic) and expected trajectory (pursuit
vs. ballistic interception), and are implemented in a manner that allows optimizing secondary
goals (energy expenditure, behavioral affordances). Thus, in line with field observations in
non-human animals1, human escape responses are flexible, and can integrate multiple variables such
as spatial constraints of the environment and economic trade-offs even under strong timing
constraints1,20. This could pose a substantial challenge for computing accurate actions in limited
time. Nevertheless, perturbance experiments show that escape decisions are controlled by a
goal-directed mechanism that is sensitive to reinforcer devaluation in two separate tests. Furthermore,
the action-selection mechanism can easily learn a novel non-natural action and integrate it into
an enduring action repertoire. At the same time, information-seeking and foraging-suppression
behavior before escape appear insensitive to reinforcer devaluation and thus possibly rely on
scalar action values, although information-seeking can still be un-learned over time when leading
to negative outcomes. Taken together, this suggests that the human brain might use different
algorithms depending on the behavior required, or on the time constraints involved, which in our
experiment were more pronounced for rapid information seeking than for the decision to escape.
It is possible that with a time-to-impact shorter than the 1.5 s used here, humans would resort to
simplified algorithms even for escape decisions.

Using third-person view and strategic computer games, we have previously shown that
humans use goal-directed mechanisms to decide whether and when to approach a foraging
opportunity under threat21,22, but that in doing so they may often employ approximate
computations that rely on situation-specific heuristics23,24. While we demonstrate that escape
initiation depends on time-to-impact, we cannot yet quantitatively formalize the underlying
computations and whether they represent or approximate statistical optimality. These are likely to
involve additional and possibly threat-specific influences of physical distance, player speed,
foraging success, and the players’ actual escape speed which is variable across trials. One might
argue that VR, although more realistic than third-person view computer games, is still
insufficient to match the rich and large repertoire of defensive behaviors informally observed or
imagined in real life. Notwithstanding, the use of task-irrelevant vocalization (e.g. shrieks,
squeaks, screams) by participants testifies to the realism and immersive nature of our VR
environment.

In summary, our results provide an entry point for understanding how the human brain
computes and implements flexible and sophisticated escape decisions.

Method

Participants

We included participants aged 18-40 years who had no mobility impairments and were at
low risk of psychological trauma in the VR. We report 29 participants (19 females; mean age ±
SD = 25.9 ± 5.6) for the first experiment (E1) and 30 participants (15 females; mean age ± SD =
24.7 ± 5.3) for the second experiment (E2). Due to hardware failures, some individual epochs
were not completed such that participants experienced, on average, 66.5 out of 68 and 35.8 out of
36 epochs. One additional participant in E1 and three additional participants in E2 did not
complete the experiment per protocol due to VR hardware failure and were not included. Since
we were interested in escape behavior (which could only be initiated after the threat appeared), we
excluded one additional participant in E2 who moved away from the fruit bush before a threat
appeared on 31 out of 36 epochs (corresponding to 5 standard deviations above group average).

All participants gave written informed consent before starting the experiment and
received a fixed monetary compensation. Experiments complied with all relevant ethical
regulations and were approved by the UCL Research Ethics Committee (6649/003).

Experimental procedure

Each experiment consisted of a sequence of short encounters (“epochs”) with various
threats (Table S2), and short breaks in-between. Participants were tasked with collecting as many
pieces of fruit (resembling kumquats) as possible, whilst avoiding physical contact with any
threat.

Tutorial: An interactive tutorial took place in a white barren environment. First, we
instructed participants to walk around the boundaries of the physical environment, run back and
forth across the length of the space, and walk backwards towards the edge of the physical
environment. Secondly, we showed the participants an example of a fruit bush. Participants were
taught to hold their hand over any appearing fruit for one second, which made it disappear and
play a “string pluck” sound, before the next fruit appeared. The third stage allowed the
participant to experience a red display and loud white noise that occurred on threat contact.
Finally, participants were shown the shelter, and told it would always appear behind their starting
position. They entered the shelter to end the tutorial.

Epochs: At the start of each epoch, the participant was positioned in a low grass clearing
surrounded by tall grass, such that a single fruit bush was 2.5 m in front of them and the shelter
was 2.5 m behind them. To reduce cue and context conditioning, color and shape of the fruit
bush were randomly varied, and additional bushes, grass patches, animal carcasses, and distant
flocks of birds were added randomly around the participant without blocking any escape paths.
Once they started fruit collection, a threat could emerge from the grass after a delay which was
uniformly drawn from 1-11 s. Threats appeared randomly at a -45-, 0-, or 45-degree angle from
the initial participant forward direction and were accompanied by an initial rustling sound when
they broke through the grass. An epoch could end in one of three ways: 1. Contact of any body
part with a threat (approximated by a set of simple volumes), turning the display red, playing an
uncomfortable high amplitude white noise sound, and removing any collected fruit in that epoch
while playing a chime sound. 2. Upon entering the shelter, the door was slammed shut (unless
the threat was already in close proximity, thus blocking the door, leading to outcome 1). 3. After
a pre-determined time, if none of outcomes 1-2 occurred. After outcomes 2-3, a white display
appeared, and collected fruit were added to a total count while a “ding” sound was played. In all
cases, verbal feedback was given on a sign in front of the participant (e.g. “you were killed by
the threat”, “you escaped safely”, or “you survived”). Afterwards, the participant was placed in a
transition environment with floor markers, on which they had to stand in order to start the next
epoch.

Experimental conditions

In E1 (68 epochs) and the first part of E2 (36 epochs), we implemented a 2 x 2 factorial
design with “threat behavior” (attack/divert) and “time-to-impact” (1.5 s/5 s) as the two
independent variables. Two of the 16 threats in E1 (crocodile and dynamite) were not compatible
with divert behavior, and so were only used with the “attack” animation (SI methods). For E1,
this resulted in 60 epochs, plus 8 no-threat epochs visually identical to threat epochs. For E2, this
resulted in 32 epochs, plus 4 no-threat epochs.

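The epoch counts above follow from the factorial design and can be checked with a short enumeration. The sketch below assumes that each attack-only threat in E1 contributed two attack epochs; the text does not state this explicitly, but it is consistent with the reported total of 60. Function and variable names are hypothetical.

```python
from itertools import product

THREAT_BEHAVIORS = ("attack", "divert")
TIME_TO_IMPACT_S = (1.5, 5.0)


def threat_epochs(n_threats, n_attack_only=0, attack_only_epochs=2):
    """Enumerate (threat index, behavior, time-to-impact or None) for all threat epochs.

    Threats compatible with both behaviors get the full 2 x 2 design.
    Attack-only threats get `attack_only_epochs` attack epochs each (an assumption);
    their time-to-impact is left as None because the text defers it to the SI.
    """
    n_full = n_threats - n_attack_only
    epochs = list(product(range(n_full), THREAT_BEHAVIORS, TIME_TO_IMPACT_S))
    for threat in range(n_full, n_threats):
        epochs.extend((threat, "attack", None) for _ in range(attack_only_epochs))
    return epochs


e1 = threat_epochs(n_threats=16, n_attack_only=2)  # 13 natural + 3 artificial threats
e2 = threat_epochs(n_threats=8)                    # 7 natural + 1 artificial threat
e1_total = len(e1) + 8  # plus 8 no-threat epochs
e2_total = len(e2) + 4  # plus 4 no-threat epochs
print(len(e1), e1_total, len(e2), e2_total)
```

With these assumptions, E1 yields 14 x 4 + 2 x 2 = 60 threat epochs (68 in total) and E2 yields 8 x 4 = 32 threat epochs (36 in total), matching the numbers reported above.
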
Time-to-impact: We defined time-to-impact as the maximum time available to initiate
escape and just about collide with the threat (i.e. if participants were minimally faster they could
escape). For slow animals and the ballistic rock, this was simply the interval between first
occurrence of the threat and their arrival at the fruit-picking position. For chasing fast threats,
this was the time that would lead to simultaneous arrival of the threat and the participant at the
shelter, given assumed participant speed of 2 m/s based on pilot data. In other words,
time-to-impact was the participant’s lead time at the shelter if they could escape with zero delay. For all
moving threats (excluding dynamite and crocodile in E1, see SI), time-to-impact was realized by
adjusting the placement of surrounding covering grass from which the threat appeared (Table
S3).

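The two definitions of time-to-impact given above can be written out as a small calculation. This is a sketch under stated assumptions rather than the authors' implementation: threat speeds and path lengths are hypothetical, threats are assumed to move at constant speed, and the participant is assumed to start the escape from the fruit bush, 5 m from the shelter. Only the assumed participant speed of 2 m/s comes from the text.

```python
ASSUMED_PARTICIPANT_SPEED_M_S = 2.0  # from the text (based on pilot data)


def tti_non_chasing(threat_distance_m, threat_speed_m_s):
    """Slow animals and ballistic rock: time until the threat reaches the participant."""
    return threat_distance_m / threat_speed_m_s


def tti_chasing(threat_path_to_shelter_m, threat_speed_m_s, participant_to_shelter_m,
                participant_speed_m_s=ASSUMED_PARTICIPANT_SPEED_M_S):
    """Chasing fast threats: participant's lead time at the shelter with zero delay.

    Equivalently, the latest escape initiation time that still yields simultaneous
    arrival of threat and participant at the shelter.
    """
    threat_arrival_s = threat_path_to_shelter_m / threat_speed_m_s
    participant_travel_s = participant_to_shelter_m / participant_speed_m_s
    return threat_arrival_s - participant_travel_s


def required_threat_path(tti_s, threat_speed_m_s, participant_to_shelter_m,
                         participant_speed_m_s=ASSUMED_PARTICIPANT_SPEED_M_S):
    """Invert tti_chasing: path length at which the threat must appear for a target TTI."""
    participant_travel_s = participant_to_shelter_m / participant_speed_m_s
    return (tti_s + participant_travel_s) * threat_speed_m_s


# Hypothetical fast threat moving at 5 m/s; participant 5 m from the shelter.
path_m = required_threat_path(tti_s=1.5, threat_speed_m_s=5.0, participant_to_shelter_m=5.0)
check_s = tti_chasing(path_m, threat_speed_m_s=5.0, participant_to_shelter_m=5.0)
print(path_m, check_s)  # 20.0 1.5
```

In this example the participant needs 2.5 s to reach the shelter, so a threat appearing 20 m of path length away at 5 m/s arrives after 4 s and leaves a lead time of 1.5 s.
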
Threat behavior: Threats would always start on a collision course, but in divert condition
they would deviate from this trajectory after 20% of the distance to the fruit bush. Animals
would then simply make a turn and move into the grass (SI for the other threats).

In the second part of E2 (blocks 2-4), we used non-natural elements in the environment,
to address computational characteristics of action-selection mechanisms for one particular threat
not used in block 1, namely the panther.

Force shield: In E2 block 2, we tested reinforcer devaluation in a 2 (force shield vs. no
force shield) x 2 (panther vs. no threat) x 2 (1.5 vs. 5 s time-to-impact) design. In force shield
epochs, a visible and audible force shield built up and surrounded the participant at the beginning
of the epoch and then disappeared visually but continued to protect against the threat.
Participants learned about the force shield in a preceding tutorial, during which the panther did
not occur. To prevent participants from inferring the type of epoch, we set a variable number of
epochs per condition (see Supplementary Information, Extended Data Table 5; Table S4).

Hands-up: In E2 block 3, we addressed whether participants could learn to use an
instructed non-natural behavior (here, raising both hands above their head) to stop the threat
from attacking. Participants were informed at the beginning of each epoch whether panthers in
this environment would be sensitive to the novel action, by a pictorial sign in the grass clearing.
The factorial design and number of epochs were analogous to block 2.

Medusa: In E2 block 4, we sought to investigate whether participants could learn, by
trial and error, to identify and avoid a common action (here, looking at the threat) when it led to a
negative outcome (here, virtual death by a “magical force”). We used a 2 (panther vs. no threat) x
2 (1.5 vs. 5 s time-to-impact) design.

Exploratory epochs: Block 5 comprised several exploratory epochs related to the shelter
presence and position. We included 2 epochs where the participant started in a cul-de-sac gorge
with no shelter, which made escape from the threat impossible.

Equipment

We used an HTC Vive Pro Eye HMD, Windows PC with an Intel i7 9700K CPU and
Nvidia RTX 2080Ti GPU, running an experiment built in the Unity Engine with SteamVR &
Unity Experiment Framework25. Vive controllers were held in each hand, and Vive Trackers
were attached to the waist and feet to allow for real-time body tracking. The VR headsets
included a built-in microphone positioned on the underside of the HMD.

Demographics and self-reported questionnaires

358 We aimed to determine if heterogeneity between participants in threat-related behaviors arose
359 due to individual differences, especially those that have an ethological origin. As such, we
360 wanted to assess predictors of cautiousness such as fear susceptibility, phobia and anxiety levels,
361 and risk preferences.
362 We implemented all questionnaires using REDCap electronic data capture tools hosted at
363 University College London26. A few days before the experiment, participants completed self-
364 report questionnaires assessing fear (Fear Survey Schedule-III, FSS)13, trait anxiety (State-Trait
365 Inventory for Cognitive and Somatic Anxiety, STICSA-T)27–29, sensation seeking (Brief
366 Sensation Seeking Scale, BSSS)17, disgust (Disgust Propensity and Sensitivity Scale, DPSS-
367 12)30, spider phobia (SPQ-12)14, snake phobia (SNAQ-12)14, motion sickness (Motion Sickness
368 Susceptibility Questionnaire, MSSQ)31,32, video game usage (Video game usage questionnaire,
369 VGUQ)33, as well as questions enquiring about their experience and expertise in martial arts.
370 Immediately before the VR game, participants provided demographic information including sex
371 and gender, age, body weight and height as well as current health and physical status and
372 completed the state portion of the STICSA27–29. Immediately after the VR game, participants
373 assessed their cybersickness using the Simulator Sickness Questionnaire (SSQ)34. For details
374 about questionnaire selection, see Supplementary Information.
375
376 Task measures
377 A priori, we extracted 18 epoch-level summary statistics from our data (Table S5) for E1;
378 a subset of these was used for statistical analysis of the first part of E2. When averaging over
379 time periods, we used the trapezium rule to ensure that the average is accurate for irregularly
380 sampled data. (1-3) We coded three possible outcomes: “escape to shelter” by going into the shelter,
381 “survived” by not going into the shelter, or “virtual death” upon contact with the
382 threat. We determined “initiated escape” (4) and “escape initiation time” (5) from the time when
383 participants first moved away from the fruit bush, using the head tracker, which had the highest
384 data quality. When participants initiated their escape but did not go into the shelter, we
385 considered this an “interrupted escape” (6). We extracted the smallest distance between the
386 participant and the shelter (7) or the threat (8). We extracted peak (9) and mean speed (10) of the
387 participant during escape. The following measures were extracted and averaged over the 1.5
388 seconds after the threat appeared (corresponding to the shorter time-to-impact), or during the
389 entire escape (until entering the shelter, or epoch end, whichever occurred earlier): (11-12) body
390 orientation (mean cosine of angle between a vector pointing forward from the participant’s
391 pelvis, and the line between the participant and the threat), (13-14) head orientation (similar for a
392 vector pointing forward from the participant’s forehead), (15-16) fruit picking rate, and (17-18)
393 visual scanning, defined as cumulative angle of head movements. For no-threat trials, the
394 corresponding measures were taken from a random time point defined a priori within Unity. For
395 trials in which participants did not escape, all measures relating to escape were considered
396 missing.
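The trapezium-rule averaging mentioned above can be illustrated with a minimal Python sketch. This is not the authors' code (their analysis was written in R); the function name `trapezium_mean` and the sample values are hypothetical and chosen only to show why a time-weighted mean differs from a plain mean when samples are irregularly spaced.

```python
def trapezium_mean(times, values):
    """Time-weighted mean of irregularly sampled values (trapezium rule).

    Hypothetical helper for illustration; not taken from the original analysis.
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    if len(times) < 2:
        raise ValueError("need at least two samples")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt < 0:
            raise ValueError("times must be sorted in ascending order")
        # Area of the trapezium spanned by two neighbouring samples
        area += 0.5 * (values[i] + values[i - 1]) * dt
    duration = times[-1] - times[0]
    if duration <= 0:
        raise ValueError("samples must span a positive duration")
    return area / duration


# Made-up speed samples (m/s) at irregular time stamps (s)
times = [0.0, 0.1, 0.5, 1.5]
speeds = [0.0, 1.0, 1.0, 3.0]
weighted = trapezium_mean(times, speeds)
unweighted = sum(speeds) / len(speeds)
```

In this toy example the unweighted mean is 1.25, whereas the time-weighted mean is larger because the long final interval, during which speed is high, receives proportionally more weight.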
397 For the second part of E2, we considered the following additional summary statistics:
398 (19) fruit picking rate from threat appearance to the minimum duration of the epoch (12.5 s);
399 (20) virtual death by magical force. Furthermore, as we had eye tracker data available for E2, we
400 used gaze orientation (rather than head orientation) to compute (17) for E2 part 2 only; results for
401 head orientation were similar.
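The orientation and visual scanning measures (11-14 and 17-18) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, vectors are treated as plain coordinate tuples, and the use of degrees for the cumulative angle is an assumption, since the text does not state the unit.

```python
import math


def _cosine(a, b):
    """Cosine of the angle between two vectors of equal dimension."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("zero-length vector")
    # Clamp to [-1, 1] to guard against floating-point overshoot
    return max(-1.0, min(1.0, dot / norm))


def orientation_cosine(forward, participant_pos, threat_pos):
    """Cosine of the angle between a forward vector (pelvis or forehead)
    and the line from participant to threat: 1 = facing the threat,
    -1 = facing directly away."""
    to_threat = [t - p for t, p in zip(threat_pos, participant_pos)]
    return _cosine(forward, to_threat)


def cumulative_scan_angle(forward_vectors):
    """Sum of angles (degrees, assumed unit) between successive head
    forward vectors, as a proxy for visual scanning."""
    total = 0.0
    for a, b in zip(forward_vectors, forward_vectors[1:]):
        total += math.degrees(math.acos(_cosine(a, b)))
    return total
```

For example, `orientation_cosine((1, 0), (0, 0), (5, 0))` describes a participant facing the threat directly, and a head that turns from `(1, 0)` via `(0, 1)` to `(-1, 0)` accumulates two quarter turns.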
402
403 Statistical analyses
404 The numerous possible analysis methods combined with the high dimensionality of the
405 dataset posed a formidable multiple comparison problem. Therefore, we opted for a rigorous
406 exploration-confirmation approach. We generated 11 hypotheses by exploring E1 (Table S6),
407 which we replicated in E2 while correcting for multiple comparisons (Holm-Bonferroni method).
408 For statistical analysis, we used generalized linear mixed-effects models for binomial variables
409 (virtual death, initiated escape, interrupted escape) and linear mixed-effects models for
410 continuous variables (glmer() and lmer() from the lme4 package in R). All models followed a 7 x
411 2 x 2 factorial design with threats (Elephant, Rock, Human, Bear, Dog, Snake, and Spider), time-
412 to-impact (1.5 s and 5 s), and threat behavior (attack, divert; model syntax DV ~ threat * time-to-
413 impact * threat behavior + (1 | Subject); see Supplementary Information for model optimizer
414 details). To test pre-defined hypotheses, we estimated marginal means (EMMs) for each cell of
415 the design using emmeans() and generated contrasts using contrast() (from the emmeans package in
416 R).
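The Holm-Bonferroni step-down procedure used for the confirmation tests can be sketched in a few lines of Python. This is a generic illustration of the method rather than the authors' R code; the function name `holm_bonferroni`, the significance level of 0.05, and the example p-values are assumptions made for the example.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-down procedure: compare the k-th smallest p-value (k = 0, 1, ...)
    with alpha / (m - k) and stop at the first non-rejection.
    alpha = 0.05 is an assumed default, not taken from the source.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are retained as well
    return reject


# Made-up p-values for four hypothetical contrasts
decisions = holm_bonferroni([0.01, 0.04, 0.03, 0.005])
```

With these made-up values, the two smallest p-values pass their thresholds (0.0125 and about 0.0167), the third (0.03) exceeds 0.025, and testing stops there. In R, an equivalent adjustment is available via `p.adjust(p, method = "holm")`.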
417 To determine which personal characteristics to include in a multiple regression, we
418 selected questionnaire scores that fulfilled r² > .10 in bivariate correlations with a behavioral
419 outcome and retained up to three variables with the highest correlation. Additionally, sex was
420 included in all multiple regressions as it is an important predictor of cautious behavior35. We
421 generated 4 hypotheses by exploring E1 (Table S7 and S8), which we replicated in E2 while
422 correcting for multiple comparisons (Holm-Bonferroni method).
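The predictor selection rule described above (r² > .10, up to three variables with the highest correlation) can be sketched as follows. This Python sketch is illustrative only: the function names, the generic questionnaire labels, and the toy numbers are hypothetical, and ranking by r² (that is, by absolute correlation) is an assumed reading of "highest correlation".

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def select_predictors(scores, outcome, r2_threshold=0.10, max_predictors=3):
    """Keep questionnaires with r^2 above the threshold, ranked by r^2
    (assumed ranking criterion), and retain at most max_predictors."""
    r2 = {name: pearson_r(vals, outcome) ** 2 for name, vals in scores.items()}
    eligible = [name for name in r2 if r2[name] > r2_threshold]
    eligible.sort(key=lambda name: r2[name], reverse=True)
    return eligible[:max_predictors]


# Toy data for five hypothetical questionnaires (A-E) and one outcome
outcome = [1, 2, 3, 4, 5]
scores = {
    "A": [1, 2, 3, 4, 5],  # r^2 = 1.00
    "B": [5, 4, 3, 1, 2],  # r^2 = 0.81
    "C": [2, 1, 4, 3, 5],  # r^2 = 0.64
    "D": [3, 1, 3, 5, 3],  # r^2 = 0.20, passes threshold but is cut by the cap
    "E": [1, 3, 5, 3, 1],  # r^2 = 0.00, fails the threshold
}
selected = select_predictors(scores, outcome)
```

In the analysis described in the text, sex would then be added to the resulting regression regardless of its bivariate correlation.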
423 For E2 part 2, the a priori primary outcome measures were “escape to shelter” and
424 “minimum distance from shelter” in force shield and hands-up epochs, and “virtual death by
425 magical force” in Medusa epochs. As secondary outcome measures for force shield epochs, we
426 considered “fruit picking rate from threat appearance to the minimum duration of the epoch (12.5
427 s)” and “mean visual scanning over 0-1.5 s after threat appearance”. For no-threat trials, the
428 corresponding measures were taken from a random time point defined a priori within Unity. The
429 resulting statistical analysis can be found in Table S9.
430
431 Acknowledgements
432 The authors thank Jules Brochard, Peter Dayan, and Marina Rodriguez Lopez for
433 feedback on the first draft of this manuscript.
434 Funding: This work was supported by the European Research Council (ERC) under the
435 European Union’s Horizon 2020 research and innovation programme (Grant agreement No.
436 ERC-2018 CoG-816564 ActionContraThreat). DRB received funding from the National Institute
437 for Health Research (NIHR) UCLH Biomedical Research Centre. The Wellcome Centre for
438 Human Neuroimaging is supported by core funding from the Wellcome (203147/Z/16/Z).
439 Author contributions: J.B., S.H., and D.R.B. conceived and designed the experiments
440 with input from all other authors. J.K.S., J.B., and U.D.S.H. collected the data. J.K.S., J.B., S.Z.,
441 U.D.S.H., and D.R.B. contributed analytic tools. J.K.S., J.B., S.Z., and D.R.B. analyzed the data.
442 J.K.S., J.B., and D.R.B. wrote the paper with input from all other authors. D.R.B. supervised and
443 coordinated the project.
444 Competing interests: The authors declare that they have no competing interests.
445 Data and code availability: The compiled versions of the two VR games are publicly
446 available on Open Science Framework (OSF, will be made available upon publication). All R
447 code used for analysis is publicly available on OSF (will be made available upon publication).
448 Curated data, including questionnaire summary scores, are available on OSF (will be made
449 available upon publication). Raw questionnaire and Unity data are available upon request under a
450 data sharing agreement in line with local data protection regulations.
451
452 References
453 1. Evans, D. A., Stempel, A. V., Vale, R. & Branco, T. Cognitive Control of Escape
454 Behaviour. Trends in Cognitive Sciences 23, 334–348 (2019).
455 2. Evans, D. A. et al. A synaptic threshold mechanism for computing escape decisions.
456 Nature 558, 590–594 (2018).
457 3. Blanchard, C. D., Hynd, A. L., Minke, K. A., Minemoto, T. & Blanchard, R. J. Human
458 defensive behaviors to threat scenarios show parallels to fear and anxiety-related defense
459 patterns of non-human mammals. Neuroscience & Biobehavioral Reviews 25, 761–770
460 (2001).
461 4. Bach, D. R. Cross-species anxiety tests in psychiatry: pitfalls and promises. Molecular
462 Psychiatry (2021). doi:10.1038/s41380-021-01299-4
463 5. Mobbs, D. et al. When Fear Is Near: Threat Imminence Elicits Prefrontal-Periaqueductal
464 Gray Shifts in Humans. Science 317, 1079–1083 (2007).
465 6. Bach, D. R. & Dayan, P. Algorithms for survival: A comparative perspective on emotions.
466 Nature Reviews Neuroscience 18, 311–319 (2017).
467 7. Mobbs, D. & LeDoux, J. Editorial overview: Survival behaviors and circuits. Current
468 Opinion in Behavioral Sciences 24, 168–171 (2018).
469 8. Mobbs, D., Headley, D. B., Ding, W. & Dayan, P. Space, Time, and Fear: Survival
470 Computations along Defensive Circuits. Trends in Cognitive Sciences 24, 228–241 (2020).
471 9. Seyfarth, R. M. & Cheney, D. L. Meaning and Emotion in Animal Vocalizations. Annals
472 of the New York Academy of Sciences 1000, 32–55 (2003).
473 10. Davis, S., Nesbitt, K. & Nalivaiko, E. A Systematic Review of Cybersickness. in
474 Proceedings of the 2014 Conference on Interactive Entertainment 19, 1–9 (ACM, 2014).
475 11. Gray, J. & McNaughton, N. The neuropsychology of anxiety: an enquiry into the functions
476 of the septohippocampal system. (Oxford UP, 2000).
477 12. Fanselow, M. S. & Lester, L. S. A functional behavioristic approach to aversively
478 motivated behavior: Predatory imminence as a determinant of the topography of defensive
479 behavior. in Evolution and learning (eds. Bolles, R. C. & Beecher, M. D.) 185–212
480 (Lawrence Erlbaum Associates, 1988).
481 13. Wolpe, J. & Lang, P. J. A fear survey schedule for use in behaviour therapy. Behaviour
482 Research and Therapy 2, 27–30 (1964).

483 14. Zsido, A. N., Arato, N., Inhof, O., Janszky, J. & Darnai, G. Short versions of two specific
484 phobia measures: The snake and the spider questionnaires. Journal of Anxiety Disorders
485 54, 11–16 (2018).
486 15. van der Walt, W. H. & Wyndham, C. H. An equation for prediction of energy expenditure
487 of walking and running. Journal of applied physiology 34, 559–563 (1973).
488 16. Chaloupka, E. C., Kang, J., Mastrangelo, M. A. & Donnelly, M. S. Cardiorespiratory and
489 metabolic responses during forward and backward walking. Journal of Orthopaedic and
490 Sports Physical Therapy 25, 302–306 (1997).
491 17. Hoyle, R. H., Stephenson, M. T., Palmgreen, P., Lorch, E. P. & Donohew, R. L. Reliability
492 and validity of a brief measure of sensation seeking. Personality and Individual
493 Differences 32, 401–414 (2002).
494 18. LeDoux, J. & Daw, N. D. Surviving threats: neural circuit and computational implications
495 of a new taxonomy of defensive behaviour. Nature Reviews Neuroscience 19, 269–282
496 (2018).
497 19. Dickinson, A. & Balleine, B. Motivational Control of Instrumental Action. Current
498 Directions in Psychological Science 4, 162–167 (1995).
499 20. Barrett, L. F. & Finlay, B. L. Concepts, goals and the control of survival-related behaviors.
500 Current Opinion in Behavioral Sciences 24, 172–179 (2018).
501 21. Castegnetti, G. et al. Representation of probabilistic outcomes during risky decision-
502 making. Nature Communications 11, 1–11 (2020).
503 22. Bach, D. R. The cognitive architecture of anxiety-like behavioral inhibition. Journal of
504 Experimental Psychology: Human Perception and Performance 43, 18–29 (2017).
505 23. Korn, C. W. & Bach, D. R. Minimizing threat via heuristic and optimal policies recruits
506 hippocampus and medial prefrontal cortex. Nature Human Behaviour 3, 733–745 (2019).
507 24. Korn, C. W. & Bach, D. R. Heuristic and optimal policy computations in the human brain
508 during sequential decision-making. Nature Communications 9, (2018).
509 25. Brookes, J., Warburton, M., Alghadier, M., Mon-Williams, M. & Mushtaq, F. Studying
510 human behavior with virtual reality: The Unity Experiment Framework. Behavior
511 Research Methods 455–463 (2019). doi:10.3758/s13428-019-01242-0
512 26. Harris, P. A. et al. Research electronic data capture (REDCap)-A metadata-driven
513 methodology and workflow process for providing translational research informatics
514 support. Journal of Biomedical Informatics 42, 377–381 (2009).
515 27. Ree, M. J., MacLeod, C., French, D. & Locke, V. The State–Trait Inventory for Cognitive
516 and Somatic Anxiety: Development and validation. in (Annual meeting of the Association
517 for the Advancement of Behavior Therapy, New Orleans, LA, 2000).
518 28. Grös, D. F., Antony, M. M., Simms, L. J. & McCabe, R. E. Psychometric Properties of the
519 State-Trait Inventory for Cognitive and Somatic Anxiety (STICSA): Comparison to the
520 State-Trait Anxiety Inventory (STAI). Psychological Assessment 19, 369–381 (2007).
521 29. Ree, M. J., French, D., MacLeod, C. & Locke, V. Distinguishing Cognitive and Somatic
522 Dimensions of State and Trait Anxiety: Development and Validation of the State-Trait
523 Inventory for Cognitive and Somatic Anxiety (STICSA). Behavioural and Cognitive
524 Psychotherapy 36, 313–332 (2008).
525 30. Fergus, T. A. & Valentiner, D. P. The Disgust Propensity and Sensitivity Scale-Revised:
526 An examination of a reduced-item version. Journal of Anxiety Disorders 23, 703–710
527 (2009).
528 31. Golding, J. F. Predicting individual differences in motion sickness susceptibility by

529 questionnaire. Personality and Individual Differences 41, 237–248 (2006).
530 32. Golding, J. F. Motion sickness susceptibility questionnaire revised and its relationship to
531 other forms of sickness. Brain Research Bulletin 47, 507–516 (1998).
532 33. Tolchinsky, A. The Development of a Self-Report Questionnaire to Measure Problematic
533 Video Game Play and its Relationship to Other Psychological Phenomena. Master’s
534 Theses and Doctoral Dissertations. 555. 555, (2013).
535 34. Lane, N. E. & Kennedy, R. S. A New Method for Quantifying Simulator Sickness:
536 Development and Application of the Simulator Sickness Questionnaire (SSQ). (Essex
537 Corporation, Orlando, FL 88-7, 1988).
538 35. Bach, D. R., Moutoussis, M., Bowler, A. & Dolan, R. J. Predictors of risky foraging
539 behaviour in healthy young people. Nature Human Behaviour 4, 832–843 (2020).
540
