Computational Characteristics of Human Escape Decisions


1 Computational characteristics of human escape decisions

2 Juliana K. Sporrer 1*, Jack Brookes 1*, Samson Hall 1, Sajjad Zabbah 1, Ulises Daniel Serratos

3 Hernandez1, Dominik R. Bach1,2

4
5 1 Max Planck UCL Centre for Computational Psychiatry and Ageing Research and

6 Wellcome Centre for Human Neuroimaging, UCL Queen Square Institute of Neurology,

7 University College London, United Kingdom


8 2 Hertz Chair for Artificial Intelligence and Neuroscience, University of Bonn, Germany

9 * These authors contributed equally to this work

10 Address correspondence to d.bach@uni-bonn.de

11

12 Abstract: Animals including humans must cope with immediate threat and make rapid

13 decisions between action options to survive. Without much leeway for cognitive or motor errors,

14 this poses a formidable computational problem. Utilizing fully-immersive virtual reality with 13

15 natural threats, we examined escape decisions in humans. We show that escape goals are

16 dynamically updated according to environmental changes. The decision whether and when to

17 escape depends on time-to-impact, threat properties, and stable personal characteristics. Its

18 implementation integrates secondary goals such as behavioral affordances. Perturbance

19 experiments show that the underlying decision algorithm exhibits planning properties and can

20 integrate novel actions, in keeping with model-based planning. In contrast, rapid information-

21 seeking and foraging-suppression are initiated based on scalar action values, which adapt over

22 time. Our results suggest that instead of being instinctive, stereotypical, or hardwired stimulus-

23 response patterns, human escape decisions integrate multiple variables in a flexible



24 computational architecture, whereas secondary defensive behaviors are controlled by simpler

25 decision mechanisms. Taken together, we provide steps towards a computational model of how

26 the human brain rapidly solves survival challenges.

27 Main Text: Field observations and laboratory experiments have revealed that many

28 species employ complex and sophisticated defensive behaviors1 to escape from immediate threat.

29 Despite advances in some non-human species1,2, little is known about human escape actions, and

30 the computational mechanisms that control them, and this is difficult to study for ethical reasons.

31 Beyond imagined threat scenarios3, more recent work has used third-person view computer

32 games4,5 but these restrict possible actions to key presses or joystick movements. This is likely to

33 underestimate the complexity of the action space, and the ensuing decision problem6. For

34 example, Homer's Iliad anecdotally describes at least 13 distinct behavioral patterns under

35 conspecific attack, not counting the use of weapons (Table S1). However, a systematic empirical

36 assessment of the mechanisms that compute this behavior in humans remains elusive6–8.

37 Here, we investigated human escape behavior in a fully-immersive virtual reality (VR)

38 environment in which participants could move freely within a 5 x 10 m physical space (Fig. 1.A.).

39 In two independent experiments, N1=29 (E1) and N2=30 (E2) participants were instructed to

40 forage for fruit on a bush, and to avoid contact with various threats over 68/36 (E1/E2 part 1)

41 independent epochs. We included 13/7 natural and 3/1 artificial (control) threats (Figs. 1.B., S1.;

42 Table S2), which appeared on 60/32 of all epochs (see Supplementary Information). A shelter,

43 which was always 5 meters behind the fruit bush, provided protection from all threats. On half of

44 these epochs, the threat moved toward the participant (“attack” condition; Fig. 1.C.) requiring

45 escape, while on the other half, the threat diverted from the attack trajectory after covering 20%

46 of the distance, making any escape unnecessary (“divert” condition; Fig. 1.D., see

47 Supplementary Information). All threats were animated to show realistic behavior (URL to

48 replay the experiments will be made public upon publication). Fast animal threats would chase

49 and outrun the participants, forcing them to enter the shelter to survive. Slow animal threats

50 would chase but not outrun the participants, thus requiring escape but not necessarily entry into

51 shelter. The inanimate threat did not chase such that entering the shelter was unnecessary.

52 Analyses in main text refer to natural threats included across both experiments (Fig. 1.B., see

53 Supplementary Information). In an exploration-confirmation strategy, hypotheses were generated

54 by exploring contrasts of estimated marginal means from linear mixed-effects models in the first

55 sample, which were then tested in the second sample with Holm-Bonferroni correction for

56 multiple comparisons (Fig. 1.E.; Tables S5, S6).
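
The confirmation step relies on the Holm-Bonferroni step-down procedure. As a minimal sketch of that procedure only (the function name and the example p-values are hypothetical and are not taken from the study), the correction could be written as:

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Hypothetical helper illustrating the Holm-Bonferroni step-down rule.
    """
    m = len(p_values)
    # Visit hypotheses from the smallest to the largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        # Step-down threshold: alpha / (m - rank), with rank starting at 0.
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            # Once one test fails, all remaining (larger) p-values are retained.
            break
    return rejected


# Hypothetical p-values for four confirmatory tests.
example_p = [0.001, 0.04, 0.03, 0.20]
decisions = holm_bonferroni(example_p)
```

In this made-up example only the first hypothesis survives correction, because the second-smallest p-value (0.03) exceeds its threshold of 0.05/3.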

57 We define “virtual survival” as participants avoiding contact with the threat. When

58 attacked, virtual survival was 81.8%/75.4% and plateaued after around 3/5 epochs (Fig. S2).

59 Participants collected on average 9.1/9.0 fruits per epoch (Fig. S3). Beyond complying with

60 explicit instructions to forage and survive, they also engaged in task-irrelevant behaviors that

61 might be adaptive in natural environments. Alarm vocalization is a commonly used defensive

62 behavior in the animal kingdom to deter or distract predators and warn conspecifics9. Participants

63 vocalized toward the threat (e.g. shrieks, squeaks) in 9.1%/18.8% of epochs (Fig. S3). They also

64 sought shelter in the absence of threat in 16.2%/7.1% of epochs. Compared to typical VR

65 experiments, cybersickness was low (see Supplementary Information)10.

66 Escape is often conceptualized as “instinctive”: triggered by specific features, with

67 flexible motor implementation, but a predictable end goal1,11. In our experiments, escape was

68 initiated on 93.8%/91.7% of attack epochs, and on 54.4%/70.8% of divert epochs (Fig. S4). Once

69 escape was initiated, its ultimate target depended on the threat trajectory. For fast threats,

70 participants did not reach the shelter on 14.2%/23.2% of initiated escapes when attacked. In

71 contrast, when threat diverted, shelter was not reached in 63.9%/37.2% of initiated escapes (H1,

72 p<.001; Figs. 1.F., S5.). Notably in attack epochs, escape interruption appeared non-intentional,

73 as it occurred later than during divert epochs and invariably led to virtual death (Figs. 1.G., S5.).

74 Importantly in divert epochs, 27.4%/37.5% of interrupted escapes had already been initiated

75 before the threat diverted and were thus presumably started with an intention to go to shelter.

76 These results demonstrate that initiated escapes with an intention to go to shelter can be

77 interrupted when the environment changes. This suggests that escape goals are dynamically

78 updated during escape, rather than being predictable from the outset.

79 A commonly suggested escape trigger is “defensive distance”11 or “predatory

80 imminence”12 of the threat, which is thought to integrate physical distance and type of threat.

81 Here, we formalized this concept as time-to-impact of the threat, defined as maximum time

82 available to initiate an escape and just about collide with the threat during escape (i.e. if

83 participants were minimally faster they would surely survive). By adjusting physical threat

84 distance and the surrounding covering grass, we implemented two time-to-impact conditions, 1.5

85 s and 5 s (Fig. S2; Table S3). When threat attacked, participants initiated escape earlier during

86 short time-to-impact (1.3 s/1.2 s) compared to long time-to-impact (2.9 s/2.5 s; H2, p<.0001; Fig.

87 1.H.). When threat diverted, participants initiated escape more often when time-to-impact was

88 short vs. long (69.2%/82.6% vs. 39.8%/59.0%; H3, p<.0001). These results suggest that time-to-

89 impact is an important factor for determining whether and when to initiate escape.

90 Beyond time-to-impact, biological and physical threat characteristics contributed to

91 participants’ escape decisions. For short time-to-impact, participants were more likely to enter

92 the shelter when attacked by fast feral (i.e. elephant, bear; 61.6%/68.3%) than fast familiar threat

93 (i.e. dog, human; 51.9%/56.9%; H4, p<.05; Figs. 1.I., S6). Participants entered the shelter less

94 often when attacked by the (ballistic) rock than by any other threat (38.6%/40.7% vs.

95 76.0%/64.9%; H5, p<.0001) and initiated escape later than for any other threat when time-to-

96 impact was long (4.8 s/4.2 s vs. 3.2 s/2.9 s; H6, p<.0001; Figs. 1.H., S7). These results suggest

97 that the decision whether and when to escape integrates the expected trajectory and behavior of

98 the threat with its time-to-impact. Additionally, escape decisions were affected by stable personal

99 characteristics (Tables S7, S8). Across all threats and conditions, fearfulness13, fear of spiders14,

100 and sex jointly explained 27%/40% of the between-person variance in escape initiation time (Q-

101 H1, p<.005), and 27%/46% of the variance in minimum distance from threat during escape (Q-

102 H2, p<.005): generally fearful females with high fear of spiders initiated escape earlier, and left

103 more space between themselves and the threat.



104 Fig. 1. Investigation of escape decisions and their implementation using fully-immersive virtual reality. (A)

105 Participant view (example epoch). (B) Images of the 7 natural threats used in E1-E2. (C) Schematic view of the

106 scenario setup. In “attack” condition, threats moved toward the participant’s position. Fast threats appeared from

107 obscuring tall grass (dark green) and slow threats from short grass (light green) with dashed lines representing their

108 possible trajectories. The escape path toward the shelter is shown in blue. (D) In “divert” condition, the threat deviated

109 after covering 20% of the distance to the fruit bush. (E) Schematic representation of all behavioral outcomes whose

110 hypotheses replicated. (F) Percentage of interrupted escape. (G) Distance of the participant from the fruit bush over

111 time. Each colored line represents a participant, and the black line is the overall mean. (H) Time of escape initiation.

112 (I) Percentage of shelter entries. (J) Mean speed during escape. (K) Body orientation (mean cosine of orientation

113 angle from threat; 1: toward; -1: away). Large points with error bars represent mean and standard error across

114 participants and epochs, and small points represent individual epochs. Threats are sorted by speed from elephant

115 (fastest) to spider (slowest), with the ballistic rock on top. Orange: 1.5 s time-to-impact; brown: 5 s time-to-impact.

116 Next, we addressed the implementation of escape once initiated. When attacked, mean

117 escape speed - and thus, energy expenditure15 - depended on the threat’s speed (slow threat: 1.4

118 m s⁻¹/1.3 m s⁻¹, fast threat: 1.9 m s⁻¹/1.9 m s⁻¹; H7, p<.0001; Figs. 1.J., S8). Similarly, participants

119 oriented their body more toward the threat during escape for slow vs. fast threats when attacked

120 (averaged cosine of orientation angle from threat, ranging from -1 (away from threat) to 1

121 (toward threat): 0.2/0.1 vs. -0.3/-0.3; H8, p<.0001; Figs. 1.K., S9). Body orientation during

122 movement affects energy expenditure16 as well as how easy it is to observe the threat, and to

123 return to foraging. Furthermore, head orientation during escape, and mean escape speed,

124 depended on time-to-impact: participants oriented their head more toward the threat (0.0/0.0 vs.

125 -0.1/-0.2; H9, p<.05; Fig. S10) and moved more slowly (1.5 m s⁻¹/1.5 m s⁻¹ vs. 1.9 m s⁻¹/1.9 m s⁻¹;

126 H10, p<.0001) when they had more time. Finally, visual scanning during the first 1.5 s after

127 threat appeared was less pronounced for a human threat, compared to any other fast threat

128 (56.9 °s⁻¹/71.3 °s⁻¹ vs. 66.2 °s⁻¹/79.9 °s⁻¹; H11, p<.0001; Fig. S11). Taken together, this suggests

129 that escape implementation accounts for energy optimization and behavioral affordances (e.g.

130 monitor threat). In turn, it also depended on stable personal characteristics (Tables S7, S8).

131 Sensation seeking17, fearfulness13, fear of spiders14, and sex jointly explained 36%/34% of the

132 between-person variance in head orientation when the threat appeared (Q-H4, p<.05). High

133 sensation seekers oriented more, and people with high fearfulness and fear of spiders oriented

134 less, toward the threat.
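
The orientation measure used above (cosine of the angle between facing direction and the direction to the threat; 1: toward, -1: away) can be illustrated with a short sketch. This is a hedged illustration under our own assumptions, not the analysis code: it assumes 2D ground-plane coordinates, and the function names `orientation_cosine` and `trapezoid_mean` are hypothetical. The time-weighted average follows the trapezium rule mentioned in the Method section for irregularly sampled data.

```python
import math


def orientation_cosine(position, forward, threat_position):
    """Cosine of the angle between the facing direction and the direction to the threat.

    All arguments are (x, y) tuples in the ground plane (an assumption of this sketch).
    Returns 1.0 when facing the threat and -1.0 when facing directly away.
    """
    to_threat = (threat_position[0] - position[0], threat_position[1] - position[1])
    norm_forward = math.hypot(forward[0], forward[1])
    norm_threat = math.hypot(to_threat[0], to_threat[1])
    if norm_forward == 0 or norm_threat == 0:
        raise ValueError("orientation is undefined for zero-length vectors")
    dot = forward[0] * to_threat[0] + forward[1] * to_threat[1]
    return dot / (norm_forward * norm_threat)


def trapezoid_mean(times, values):
    """Time-weighted mean of irregularly sampled values using the trapezium rule."""
    if len(times) != len(values) or len(times) < 2 or times[-1] <= times[0]:
        raise ValueError("need at least two samples with increasing time stamps")
    area = 0.0
    for i in range(1, len(times)):
        # Area of the trapezoid between consecutive samples.
        area += 0.5 * (values[i] + values[i - 1]) * (times[i] - times[i - 1])
    return area / (times[-1] - times[0])
```

For example, a participant at the origin facing along +y with the threat straight ahead yields a cosine of 1, and facing along -y yields -1.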

135 These results strengthen the view that escaping humans face a complex decision-making

136 problem. Statistically optimal decision algorithms, such as model-based tree search, are time-

137 and memory-consuming6. Thus, defensive behavior, especially at short defensive distance, has

138 been suggested to be controlled by mechanisms that are solely based on scalar action values (also

139 termed model-free)8,18. These action values might be malleable from experience, but could even

140 be hard-wired. Such characteristics have a potential to improve decision speed but reduce

141 flexibility to adapt to changing environments. Hence, in the second part of E2, we perturbed the

142 environment in non-natural ways to reveal the computational characteristics of the underlying

143 mechanisms through behavior. All epochs in this part employed a predatory threat that had not

144 occurred in the first part of E2 (i.e. panther; Fig. 2.A.).

145 The first characteristic we investigated was sensitivity to reinforcer devaluation. This

146 addresses whether an action is suppressed when the outcome (e.g., escape to shelter) is suddenly

147 no longer desirable. It distinguishes, for example, certain model-based from classical model-free

148 mechanisms19. On half of epochs in a “force shield” block (Fig. 2.B.), we devalued the shelter by

149 endowing participants with a protective force shield that built up around the participant at the

150 start of an epoch and then visually disappeared. In a tutorial, participants learned from

151 experience that objects could not penetrate the force shield even when it was invisible, meaning

152 it would protect just like the shelter. Participants did not encounter the panther during this

153 tutorial. A devaluation-sensitive mechanism would stop escaping immediately when

154 encountering the panther within the force shield, whereas an insensitive mechanism might need

155 repeated exposure to the panther in order to learn to suppress escape19. Hence, the following

156 analyses only refer to the first epoch of each condition (subsequent epochs are visualized in Fig.

157 S12). We found that participants almost never escaped to shelter when attacked within the force

158 shield compared to without, independent of time-to-impact (3.3% vs. 80%; E2-H1, p<.0001; Fig.

159 2.C.; Table S9). When the force shield was active, they stayed at a comparable distance from

160 shelter regardless of threat presence (2.5 m vs. 2.6 m; E2-H2). This demonstrates that escape

161 initiation is sensitive to reinforcer devaluation and likely to be under model-based control. On

162 the other hand, some secondary defensive behaviors showed a different picture. When within the

163 force shield and a threat appeared, fruit picking rate over the remaining epoch was suppressed

164 compared to a no-threat epoch (0.8 s⁻¹ vs. 1.1 s⁻¹; E2-H3, p<.0001; Fig. 2.C.). In turn, visual

165 scanning over 0-1.5 s after threat appearance was increased, compared to a no-threat epoch

166 (45.0 °s⁻¹ vs. 25.6 °s⁻¹; E2-H4, p<.05; Fig. 2.C.). This suggests that the mechanisms controlling

167 rapid information-seeking and foraging-suppression are partly insensitive to reinforcer

168 devaluation.
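
The logic of the devaluation test can be illustrated with a toy contrast between a controller that relies on cached scalar action values and one that evaluates actions through a model of their outcomes. This is an illustrative sketch only, not a model fitted to or proposed for the data: all action names, outcome labels, and numbers are hypothetical.

```python
# Toy illustration only: actions, outcomes, and all numbers are hypothetical.

def model_free_choice(cached_values):
    """Pick the action with the highest cached (scalar) action value."""
    return max(cached_values, key=cached_values.get)


def model_based_choice(outcome_model, outcome_utility, action_cost):
    """Pick the action whose predicted outcome has the highest utility minus cost."""
    def value(action):
        return outcome_utility[outcome_model[action]] - action_cost[action]
    return max(outcome_model, key=value)


# Action values cached from earlier epochs without a force shield,
# in which escaping paid off and staying did not.
cached_values = {"escape_to_shelter": 1.0, "keep_foraging": -1.0}

outcome_utility = {"safe_no_fruit": 0.0, "safe_with_fruit": 1.0, "virtual_death": -10.0}
action_cost = {"escape_to_shelter": 0.2, "keep_foraging": 0.0}

# Outcome predictions without and with the force shield (the latter learned in a tutorial).
model_no_shield = {"escape_to_shelter": "safe_no_fruit", "keep_foraging": "virtual_death"}
model_with_shield = {"escape_to_shelter": "safe_no_fruit", "keep_foraging": "safe_with_fruit"}

mf_with_shield = model_free_choice(cached_values)
mb_no_shield = model_based_choice(model_no_shield, outcome_utility, action_cost)
mb_with_shield = model_based_choice(model_with_shield, outcome_utility, action_cost)
```

In this toy setting, the cached-value controller keeps escaping on the first force-shield epoch because its values have not yet been updated, whereas the model-based controller switches to foraging as soon as its outcome predictions change.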

169 The second characteristic we addressed was the flexibility to update behavior from

170 experience (sometimes termed “Pavlovian” vs. “instrumental”). Another important defensive

171 behavior is information-seeking. In a “Medusa” block, participants were instructed that they had

172 to identify and suppress a particular movement (head orientation toward the threat) that would

173 activate a lethal magical force. They were not instructed what this movement was. Most

174 participants learned to suppress this action, and thus reduced mortality from the magical force

175 from 90.0% (epoch 1) to 18.5% (epoch 7; E2-H5, p<.0001; Fig. 2.D.). In a post-hoc interview,

176 60.0% of participants reported the correct force-activating movement. The remaining 40% still

177 appeared successful in reducing their mortality (from 83.3% to 30.0%; Fig. S13), possibly by

178 altering related movements (e.g. body orientation) or by learning the correct movement but not

179 forming awareness of it. Overall, these results indicate that human information-seeking behavior

180 preceding escape, while partly insensitive to reinforcer devaluation, is malleable by experience

181 rather than hardwired.



197 Fig. 2. Computational properties of the controllers underlying human escape behavior. (A) Participant view of

198 the force shield. (B) All epochs in E2 part 2 used the panther as threat. (C) Minimum distance from the shelter (top),

199 fruit picking rate over the entire epoch after threat appearance (centre), and visual scanning during 0-1.5 s after

200 threat appearance (bottom) in the force shield block. (D) Percentage of virtual death from magical force over epochs

201 in the Medusa block. (E) Percentage of escape to shelter in the hands-up block. Large points with error bars

202 represent the mean and standard error across all participants and epochs, and small points represent individual

203 epochs.

204 Finally, we addressed the flexibility of the action repertoire. In a “hands up” block,

205 participants were instructed that some panthers were trained to stop attacking when humans

206 raised both hands high above their head. This action is novel and non-natural in this context as

207 participants would normally use their hands to aid escape or protect their torso. The potential

208 presence of these specific panthers was indicated by a sign at the start of each epoch. Participants

209 were so efficient in utilizing this novel action that even in the first epoch for each condition, they

210 never escaped to shelter when the action was available (0.0% vs. 86.7%; E2-H6, p<.0001; Figs. 2.E., S14

211 for subsequent epochs). Furthermore, in two exploratory epochs in the last block of E2,

212 participants were attacked by a panther in a visually different scenario without shelter (see

213 Supplementary Information). Unexpectedly, 26.7% of participants spontaneously evoked the

214 non-natural hands-up action they had previously learned, despite lack of instruction to do so, or

215 proof of effectiveness in this new context. Taken together, these results confirm that escape

216 initiation is sensitive to reinforcer devaluation, and demonstrate that the action-selection

217 mechanism can easily integrate novel instructed actions and later retrieve them in novel

218 situations.

219
220 Discussion: While escape from immediate attack is sometimes depicted as a fixed

221 reaction pattern (“fight or flight response”), the absence of observable and explicit deliberation

222 does not imply a lack of dynamic and complex decision-making1. We demonstrate that

223 behavioral goals during human escape are dynamically updated, integrate time-to-impact with

224 threat properties such as attack probability (feral vs. domestic) and expected trajectory (pursuit

225 vs. ballistic interception), and are implemented in a manner that allows optimizing secondary

226 goals (energy expenditure, behavioral affordances). Thus, in line with field observations in non-

227 human animals1, human escape responses are flexible, and can integrate multiple variables such

228 as spatial constraints of the environment and economic trade-offs even under strong timing

229 constraints1,20. This could pose a substantial challenge for computing accurate actions in limited

230 time. Nevertheless, perturbance experiments show that escape decisions are controlled by a goal-

231 directed mechanism that is sensitive to reinforcer devaluation in two separate tests. Furthermore,

232 the action-selection mechanism can easily learn a novel non-natural action and integrate it into

233 an enduring action repertoire. At the same time, information-seeking and foraging-suppression

234 behavior before escape appear insensitive to reinforcer devaluation and thus possibly rely on

235 scalar action values, although information-seeking can still be un-learned over time when leading

236 to negative outcomes. Taken together, this suggests that the human brain might use different

237 algorithms depending on the behavior required, or on the time constraints involved, which in our

238 experiment were more pronounced for rapid information seeking than for the decision to escape.

239 It is possible that with a time-to-impact shorter than the 1.5 s used here, humans would resort to

240 simplified algorithms even for escape decisions.

241 Using third-person view and strategic computer games, we have previously shown that

242 humans use goal-directed mechanisms to decide whether and when to approach a foraging

243 opportunity under threat21,22, but that in doing so they may often employ approximate

244 computations that rely on situation-specific heuristics23,24. While we demonstrate that escape

245 initiation depends on time-to-impact, we cannot yet quantitatively formalize the underlying

246 computations and whether they represent or approximate statistical optimality. These are likely to

247 involve additional and possibly threat-specific influences of physical distance, player speed,

248 foraging success, and the players’ actual escape speed which is variable across trials. One might

249 argue that VR, although more realistic than third-person view computer games, is still

250 insufficient to match the rich and large repertoire of defensive behaviors informally observed or

251 imagined in real life. Notwithstanding, the use of task-irrelevant vocalization (e.g. shrieks,

252 squeaks, screams) by participants testifies to the realism and immersive nature of our VR

253 environment.

254 In summary, our results provide an entry point for understanding how the human brain

255 computes and implements flexible and sophisticated escape decisions.


256

257 Method

258 Participants

259 We included participants aged 18-40 years who had no mobility impairments and were at

260 low risk of psychological trauma in the VR. We report 29 participants (19 females; mean age ±

261 SD = 25.9 ± 5.6) for the first experiment (E1) and 30 participants (15 females; mean age ± SD =

262 24.7 ± 5.3) for the second experiment (E2). Due to hardware failures, some individual epochs

263 were not completed such that participants experienced, on average, 66.5 out of 68 and 35.8 out of

264 36 epochs. One additional participant in E1 and three additional participants in E2 did not

265 complete the experiment per protocol due to VR hardware failure and were not included. Since

266 we were interested in escape behavior (which could only be initiated after the threat appeared), we

267 excluded one additional participant in E2 who moved away from the fruit bush before a threat

268 appeared on 31 out of 36 epochs (corresponding to 5 standard deviations above group average).

269 All participants gave written informed consent before starting the experiment and

270 received a fixed monetary compensation. Experiments complied with all relevant ethical

271 regulations and were approved by the UCL Research Ethics Committee (6649/003).

272

273 Experimental procedure

274 Each experiment consisted of a sequence of short encounters (“epochs”) with various

275 threats (Table S2), and short breaks in-between. Participants were tasked with collecting as many

276 pieces of fruit (resembling kumquats) as possible, whilst avoiding physical contact with any

277 threat.

278 Tutorial: An interactive tutorial took place in a white barren environment. First, we

279 instructed participants to walk around the boundaries of the physical environment, run back and

280 forth across the length of the space, and walk backwards towards the edge of the physical

281 environment. Secondly, we showed the participants an example of a fruit bush. Participants were

282 taught to hold their hand over any appearing fruit for one second, which made it disappear and

283 play a “string pluck” sound, before the next fruit appeared. The third stage allowed the

284 participant to experience a red display and loud white noise that occurred on threat contact.

285 Finally, participants were shown the shelter, and told it would always appear behind their starting

286 position. They entered the shelter to end the tutorial.

287 Epochs: At the start of each epoch, the participant was positioned in a low grass clearing

288 surrounded by tall grass, such that a single fruit bush was 2.5 m in front of them and the shelter

289 was 2.5 m behind them. To reduce cue and context conditioning, color and shape of the fruit

290 bush were randomly varied, and additional bushes, grass patches, animal carcasses, and distant

291 flocks of birds were added randomly around the participant without blocking any escape paths.

292 Once they started fruit collection, a threat could emerge from the grass after a delay which was

293 uniformly drawn from 1-11 s. Threats appeared randomly at a -45-, 0-, or 45-degree angle from

294 the initial participant forward direction and were accompanied by an initial rustling sound when

295 they broke through the grass. An epoch could end in one of three ways: 1. Contact of any body

296 part with a threat (approximated by a set of simple volumes), turning the display red, playing an

297 uncomfortable high amplitude white noise sound, and removing any collected fruit in that epoch

298 while playing a chime sound. 2. Upon entering the shelter, the door was slammed shut (unless

299 the threat was already in close proximity, thus blocking the door, leading to outcome 1). 3. After

300 a pre-determined time, if none of outcomes 1-2 occurred. After outcomes 2-3, a white display

301 appeared, and collected fruit were added to a total count while a “ding” sound was played. In all

302 cases, verbal feedback was given on a sign in front of the participant (e.g. “you were killed by

303 the threat”, “you escaped safely”, or “you survived”). Afterwards, the participant was placed in a

304 transition environment with floor markers, on which they had to stand in order to start the next

305 epoch.

306

307 Experimental conditions

308 In E1 (68 epochs) and the first part of E2 (36 epochs), we implemented a 2 x 2 factorial

309 design with “threat behavior” (attack/divert) and “time-to-impact” (1.5 s/5 s) as the two

310 independent variables. Two of the 16 threats in E1 (crocodile and dynamite) were not compatible

311 with divert behavior, and so were only used with the “attack” animation (SI methods). For E1,

312 this resulted in 60 epochs, plus 8 no-threat epochs visually identical to threat epochs. For E2, this

313 resulted in 32 epochs, plus 4 no-threat epochs.
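
The epoch counts follow from the factorial design. As a quick arithmetic check (a sketch that assumes one epoch per design cell and that attack-only threats still appeared once per time-to-impact condition; the function name is hypothetical), the totals can be reproduced as follows:

```python
def count_epochs(n_threats, n_attack_only, n_no_threat, n_time_to_impact=2):
    """Return (threat epochs, total epochs) for the 2 x 2 design.

    Assumes one epoch per cell; attack-only threats skip the divert condition.
    """
    n_behaviors = 2  # attack and divert
    full_design = (n_threats - n_attack_only) * n_behaviors * n_time_to_impact
    attack_only = n_attack_only * 1 * n_time_to_impact
    threat_epochs = full_design + attack_only
    return threat_epochs, threat_epochs + n_no_threat


# E1: 16 threats (13 natural + 3 artificial), 2 of them attack-only, 8 no-threat epochs.
e1_counts = count_epochs(n_threats=16, n_attack_only=2, n_no_threat=8)  # (60, 68)

# E2 part 1: 8 threats (7 natural + 1 artificial), 4 no-threat epochs.
e2_counts = count_epochs(n_threats=8, n_attack_only=0, n_no_threat=4)  # (32, 36)
```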

314 Time-to-impact: We defined time-to-impact as the maximum time available to initiate

315 escape and just about collide with the threat (i.e. if participants were minimally faster they could

316 escape). For slow animals and the ballistic rock, this was simply the interval between first

317 occurrence of the threat and their arrival at the fruit-picking position. For chasing fast threats,

318 this was the time that would lead to simultaneous arrival of the threat and the participant at the

319 shelter, given assumed participant speed of 2 m/s based on pilot data. In other words, time-to-

320 impact was the participant’s lead time at the shelter if they could escape with zero delay. For all

321 moving threats (excluding dynamite and crocodile in E1, see SI), time-to-impact was realized by

322 adjusting the placement of surrounding covering grass from which the threat appeared (Table

323 S3).
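
The two definitions of time-to-impact can be sketched in code. This is a simplified illustration under stated assumptions, not the experiment's implementation: threats and participants move at constant speed, distances are path lengths in meters, and all function and parameter names are hypothetical. Only the assumed participant speed of 2 m/s is taken from the text.

```python
ASSUMED_PARTICIPANT_SPEED = 2.0  # m/s, the value assumed in the text from pilot data


def tti_non_chasing(threat_distance_to_participant, threat_speed):
    """Slow animals and ballistic rock: time until the threat reaches the fruit-picking position."""
    return threat_distance_to_participant / threat_speed


def tti_chasing(threat_path_to_shelter, threat_speed,
                participant_distance_to_shelter,
                participant_speed=ASSUMED_PARTICIPANT_SPEED):
    """Chasing fast threats: latest escape onset that yields simultaneous arrival at the shelter.

    Equivalently, the participant's lead time at the shelter when escaping with zero delay.
    """
    threat_arrival = threat_path_to_shelter / threat_speed
    participant_travel = participant_distance_to_shelter / participant_speed
    return threat_arrival - participant_travel


def start_path_for_tti(target_tti, threat_speed,
                       participant_distance_to_shelter,
                       participant_speed=ASSUMED_PARTICIPANT_SPEED):
    """Invert tti_chasing: threat path length needed to realize a target time-to-impact."""
    participant_travel = participant_distance_to_shelter / participant_speed
    return threat_speed * (target_tti + participant_travel)
```

For instance, with a hypothetical threat speed of 5 m/s and a participant 5 m from the shelter, `start_path_for_tti(1.5, 5.0, 5.0)` gives a 20 m threat path, and `tti_chasing(20.0, 5.0, 5.0)` recovers the 1.5 s condition.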

324 Threat behavior: Threats would always start on a collision course, but in divert condition

325 they would deviate from this trajectory after 20% of the distance to the fruit bush. Animals

326 would then simply make a turn and move into the grass (SI for the other threats).

327 In the second part of E2 (blocks 2-4), we used non-natural elements in the environment,

328 to address computational characteristics of action-selection mechanisms for one particular threat

329 not used in block 1, namely the panther.

330 Force shield: In E2 block 2, we tested reinforcer devaluation in a 2 (force shield vs. no

331 force shield) x 2 (panther vs. no threat) x 2 (1.5 vs. 5 s time-to-impact) design. In force shield

332 epochs, a visible and audible force shield built up and surrounded the participant at the beginning

333 of the epoch and then disappeared visually but continued to protect against the threat.

334 Participants learned about the force shield in a preceding tutorial, during which the panther did

335 not occur. To prevent participants from inferring the type of epoch, we set a variable number of

336 epochs per condition (see Supplementary Information, Extended Data Table 5; Table S4).

337 Hands-up: In E2 block 3, we addressed whether participants could learn to use an

338 instructed non-natural behavior (here, raising both hands above their head) to stop the threat

339 from attacking. Participants were informed at the beginning of each epoch whether panthers in

340 this environment would be sensitive to the novel action, by a pictorial sign in the grass clearing.

341 The factorial design and number of epochs were analogous to block 2.

342 Medusa: In E2 block 4, we sought to investigate whether participants could learn, by

343 trial and error, to identify and avoid a common action (here, looking at the threat) when it led to a

344 negative outcome (here, virtual death by a “magical force”). We used a 2 (panther vs. no threat) x

345 2 (1.5 vs. 5 s time-to-impact) design.



346 Exploratory epochs: Block 5 comprised several exploratory epochs related to the shelter

347 presence and position. We included 2 epochs where the participant started in a cul-de-sac gorge

348 with no shelter, which made escape from the threat impossible.

349

350 Equipment

351 We used an HTC Vive Pro Eye HMD, Windows PC with an Intel i7 9700K CPU and

352 Nvidia RTX 2080Ti GPU, running an experiment built in the Unity Engine with SteamVR &

353 Unity Experiment Framework25. Vive controllers were held in each hand, and Vive Trackers

354 were attached to the waist and feet to allow for real-time body tracking. The VR headsets

355 included a built-in microphone positioned on the underside of the HMD.

356

357 Demographics and self-reported questionnaires

358 We aimed to determine if heterogeneity between participants in threat-related behaviors arose

359 due to individual differences, especially those that have an ethological origin. As such, we

360 wanted to assess predictors of cautiousness such as fear susceptibility, phobia and anxiety levels,

361 and risk preferences.

We implemented all questionnaires using REDCap electronic data capture tools hosted at University College London26. A few days before the experiment, participants completed self-report questionnaires assessing fear (Fear Survey Schedule-III, FSS)13, trait anxiety (State-Trait Inventory for Cognitive and Somatic Anxiety, STICSA-T)27–29, sensation seeking (Brief Sensation Seeking Scale, BSSS)17, disgust (Disgust Propensity and Sensitivity Scale, DPSS-12)30, spider phobia (SPQ-12)14, snake phobia (SNAQ-12)14, motion sickness (Motion Sickness Susceptibility Questionnaire, MSSQ)31,32, video game usage (Video game usage questionnaire, VGUQ)33, as well as questions enquiring about their experience and expertise in martial arts. Immediately before the VR game, participants provided demographic information including sex and gender, age, body weight and height as well as current health and physical status and completed the state portion of the STICSA27–29. Immediately after the VR game, participants assessed their cybersickness using the Simulator Sickness Questionnaire (SSQ)34. For details about questionnaire selection, see Supplementary Information.

Task measures

A priori, we extracted 18 epoch-level summary statistics from our data (Table S5) for E1; a subset of these was used for statistical analysis of the first part of E2. When averaging over time periods, we used the trapezium rule to ensure that averages are accurate over irregularly sampled time points. Measures (1-3) were the three possible outcomes, namely “escape to shelter” by going into the shelter, “survived” by not going into the shelter, or “virtual death” when coming into contact with the threat. We estimated “initiated escape” (4) and “escape initiation time” (5) as the time when participants first moved away from the fruit bush, using the head tracker, which had the highest data quality. When participants initiated their escape but did not go into the shelter, we considered this an “interrupted escape” (6). We extracted the smallest distance between the participant and the shelter (7) or the threat (8). We extracted peak (9) and mean speed (10) of the participant during escape. The following measures were extracted and averaged over the 1.5 seconds after the threat appeared (corresponding to the shorter time-to-impact), or during the entire escape (until entering the shelter, or epoch end, whichever occurred earlier): (11-12) body orientation (mean cosine of the angle between a vector pointing forward from the participant’s pelvis and the line between the participant and the threat), (13-14) head orientation (similar for a vector pointing forward from the participant’s forehead), (15-16) fruit picking rate, and (17-18) visual scanning, defined as the cumulative angle of head movements. For no-threat trials, the corresponding measures were taken from a random time point defined a priori within Unity. For trials in which participants did not escape, all measures relating to escape were considered missing.
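As an illustration of the trapezium-rule averaging and the orientation measure described above, the following Python sketch computes a time-weighted mean over irregularly spaced samples and the cosine of the angle between a forward vector and the participant-to-threat line. It is not the authors' analysis code: the function names, the restriction to the horizontal plane (2D vectors), and the toy sample values are assumptions made for this example.

```python
import math


def trapezium_mean(times, values):
    """Time-weighted mean of irregularly sampled values (trapezium rule).

    The integral of the signal is approximated by trapezia between
    successive samples and divided by the total duration.
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    if len(times) < 2:
        raise ValueError("need at least two samples")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        area += 0.5 * (values[i] + values[i - 1]) * dt
    return area / (times[-1] - times[0])


def orientation_cosine(forward, position, threat):
    """Cosine of the angle between a forward vector and the line from
    the participant to the threat (assumption: 2D horizontal plane)."""
    to_threat = (threat[0] - position[0], threat[1] - position[1])
    dot = forward[0] * to_threat[0] + forward[1] * to_threat[1]
    norm = math.hypot(forward[0], forward[1]) * math.hypot(to_threat[0], to_threat[1])
    if norm == 0:
        raise ValueError("forward vector and threat offset must be non-zero")
    return dot / norm


# Toy example: three samples at irregular times (hypothetical values).
times = [0.0, 0.5, 1.5]
forwards = [(1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
cosines = [orientation_cosine(f, (0.0, 0.0), (2.0, 0.0)) for f in forwards]
mean_orientation = trapezium_mean(times, cosines)
```

A value of 1 means the participant faces the threat, -1 means facing directly away. With irregular sampling, the time-weighted mean differs from the plain arithmetic mean of the samples, which is the reason for using the trapezium rule.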

For the second part of E2, we considered the following additional summary statistics: (19) fruit picking rate from threat appearance to the minimum duration of the epoch (12.5 s); (20) virtual death by magical force. Furthermore, as we had eye tracker data available for E2, we used gaze orientation (rather than head orientation) to compute (17) for E2 part 2 only; results for head orientation were similar.
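Visual scanning is defined above as the cumulative angle of head (or gaze) movements. One way to compute such a quantity is to sum the angles between successive direction vectors, as in the Python sketch below. The function names and the use of 3D direction vectors are assumptions for illustration; how the authors sampled and normalized this measure is not specified in this section.

```python
import math


def angle_between(u, v):
    """Angle in radians between two 3D vectors.

    Uses atan2(|u x v|, u . v), which is numerically stable for
    very small and very large angles.
    """
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    cross_norm = math.sqrt(cx * cx + cy * cy + cz * cz)
    dot = u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
    return math.atan2(cross_norm, dot)


def cumulative_scanning_angle(directions):
    """Sum of angular changes between successive direction samples."""
    return sum(angle_between(a, b) for a, b in zip(directions, directions[1:]))


# Toy example: look ahead, turn 90 degrees, turn back (hypothetical samples).
samples = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
total_degrees = math.degrees(cumulative_scanning_angle(samples))
```

Because the measure accumulates absolute angular change, turning the head away and back counts twice, unlike the net rotation between the first and last sample.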

Statistical analyses

The numerous possible analysis methods combined with the high dimensionality of the dataset posed a formidable multiple comparison problem. Therefore, we opted for a rigorous exploration-confirmation approach. We generated 11 hypotheses by exploring E1 (Table S6), which we replicated in E2 while correcting for multiple comparisons (Holm-Bonferroni method). For statistical analysis, we used generalized linear mixed-effects models for binomial variables (virtual death, initiated escape, interrupted escape) and linear mixed-effects models for continuous variables (glmer() and lmer() from the lme4 package in R). All models followed a 7 x 2 x 2 factorial design with threat (Elephant, Rock, Human, Bear, Dog, Snake, and Spider), time-to-impact (1.5 s and 5 s), and threat behavior (attack, divert); the model syntax was DV ~ threat * time-to-impact * threat behavior + (1 | Subject) (see Supplementary Information for model optimizer details). To test pre-defined hypotheses, we estimated marginal means (EMMs) for each cell of the design using emmeans() and generated contrasts using contrast() (both from the emmeans package in R).
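The Holm-Bonferroni correction mentioned above can be sketched in a few lines of Python. This is a generic illustration of the step-down procedure, not the authors' R code; the function name and the example p-values are hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm-Bonferroni step-down correction.

    Returns adjusted p-values (in the original order) and a list of
    booleans indicating which hypotheses are rejected at level alpha.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted p-values never decrease.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    rejected = [p <= alpha for p in adjusted]
    return adjusted, rejected


# Hypothetical p-values for three confirmatory tests.
adjusted, rejected = holm_bonferroni([0.01, 0.04, 0.03])
```

In this toy case only the first hypothesis survives correction: once the second-smallest p-value fails its threshold, all larger p-values are also retained as non-significant, which the monotonicity step encodes.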

To determine which personal characteristics to include in a multiple regression, we selected questionnaire scores that fulfilled r² > .10 in bivariate correlations with a behavioral outcome and retained up to three variables with the highest correlation. Additionally, sex was included in all multiple regressions as it is an important predictor of cautious behavior35. We generated 4 hypotheses by exploring E1 (Tables S7 and S8), which we replicated in E2 while correcting for multiple comparisons (Holm-Bonferroni method).
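The predictor selection rule described above (r² > .10, then at most three variables with the highest correlation) might be sketched as follows. The questionnaire names and scores are placeholders, and ranking by r² (that is, by absolute correlation) is an assumption about how "highest correlation" was operationalized.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def select_predictors(scores, outcome, r2_threshold=0.10, max_predictors=3):
    """Keep scores with r^2 above threshold, then the strongest few."""
    r2 = {name: pearson_r(vals, outcome) ** 2 for name, vals in scores.items()}
    eligible = [name for name in r2 if r2[name] > r2_threshold]
    eligible.sort(key=lambda name: r2[name], reverse=True)
    return eligible[:max_predictors]


# Placeholder data: five hypothetical questionnaires, five participants.
outcome = [1, 2, 3, 4, 5]
scores = {
    "q1": [1, 2, 3, 4, 5],
    "q2": [1, 2, 3, 5, 4],
    "q3": [2, 1, 4, 3, 5],
    "q4": [2, 3, 1, 5, 4],
    "q5": [3, 1, 5, 1, 3],
}
selected = select_predictors(scores, outcome)
```

Here "q5" is uncorrelated with the outcome and fails the threshold, while "q4" passes it but is dropped because three stronger predictors are already retained.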

For E2 part 2, the a priori primary outcome measures were “escape to shelter” and “minimum distance from shelter” in force shield and hands-up epochs, and “virtual death by magical force” in Medusa epochs. As secondary outcome measures for force shield epochs, we considered “fruit picking rate from threat appearance to the minimum duration of the epoch (12.5 s)” and “mean visual scanning over 0-1.5 s after threat appearance”. For no-threat trials, the corresponding measures were taken from a random time point defined a priori within Unity. The resulting statistical analysis can be found in Table S9.

Acknowledgements

The authors thank Jules Brochard, Peter Dayan, and Marina Rodriguez Lopez for feedback on the first draft of this manuscript.

Funding: This work was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. ERC-2018 CoG-816564 ActionContraThreat). DRB received funding from the National Institute for Health Research (NIHR) UCLH Biomedical Research Centre. The Wellcome Centre for Human Neuroimaging is supported by core funding from Wellcome (203147/Z/16/Z).

Author contributions: J.B., S.H., and D.R.B. conceived and designed the experiments with input from all other authors. J.K.S., J.B., and U.D.S.H. collected the data. J.K.S., J.B., S.Z., U.D.S.H., and D.R.B. contributed analytic tools. J.K.S., J.B., S.Z., and D.R.B. analyzed the data. J.K.S., J.B., and D.R.B. wrote the paper with input from all other authors. D.R.B. supervised and coordinated the project.

Competing interests: The authors declare that they have no competing interests.

Data and code availability: The compiled versions of the two VR games are publicly available on Open Science Framework (OSF, will be made available upon publication). All R code used for analysis is publicly available on OSF (will be made available upon publication). Curated data, including questionnaire summary scores, are available on OSF (will be made available upon publication). Raw questionnaire and Unity data are available upon request under a data sharing agreement in line with local data protection regulations.

References

1. Evans, D. A., Stempel, A. V., Vale, R. & Branco, T. Cognitive Control of Escape Behaviour. Trends in Cognitive Sciences 23, 334–348 (2019).
2. Evans, D. A. et al. A synaptic threshold mechanism for computing escape decisions. Nature 558, 590–594 (2018).
3. Blanchard, C. D., Hynd, A. L., Minke, K. A., Minemoto, T. & Blanchard, R. J. Human defensive behaviors to threat scenarios show parallels to fear and anxiety-related defense patterns of non-human mammals. Neuroscience & Biobehavioral Reviews 25, 761–770 (2001).
4. Bach, D. R. Cross-species anxiety tests in psychiatry: pitfalls and promises. Molecular Psychiatry (2021). doi:10.1038/s41380-021-01299-4
5. Mobbs, D. et al. When Fear Is Near: Threat Imminence Elicits Prefrontal-Periaqueductal Gray Shifts in Humans. Science 317, 1079–1083 (2007).
6. Bach, D. R. & Dayan, P. Algorithms for survival: A comparative perspective on emotions. Nature Reviews Neuroscience 18, 311–319 (2017).
7. Mobbs, D. & LeDoux, J. Editorial overview: Survival behaviors and circuits. Current Opinion in Behavioral Sciences 24, 168–171 (2018).
8. Mobbs, D., Headley, D. B., Ding, W. & Dayan, P. Space, Time, and Fear: Survival Computations along Defensive Circuits. Trends in Cognitive Sciences 24, 228–241 (2020).
9. Seyfarth, R. M. & Cheney, D. L. Meaning and Emotion in Animal Vocalizations. Annals of the New York Academy of Sciences 1000, 32–55 (2003).
10. Davis, S., Nesbitt, K. & Nalivaiko, E. A Systematic Review of Cybersickness. in Proceedings of the 2014 Conference on Interactive Entertainment 19, 1–9 (ACM, 2014).
11. Gray, J. & McNaughton, N. The neuropsychology of anxiety: an enquiry into the functions of the septohippocampal system. (Oxford UP, 2000).
12. Fanselow, M. S. & Lester, L. S. A functional behavioristic approach to aversively motivated behavior: Predatory imminence as a determinant of the topography of defensive behavior. in Evolution and learning (eds. Bolles, R. C. & Beecher, M. D.) 185–212 (Lawrence Erlbaum Associates, 1988).
13. Wolpe, J. & Lang, P. J. A fear survey schedule for use in behaviour therapy. Behaviour Research and Therapy 2, 27–30 (1964).
14. Zsido, A. N., Arato, N., Inhof, O., Janszky, J. & Darnai, G. Short versions of two specific phobia measures: The snake and the spider questionnaires. Journal of Anxiety Disorders 54, 11–16 (2018).
15. van der Walt, W. H. & Wyndham, C. H. An equation for prediction of energy expenditure of walking and running. Journal of Applied Physiology 34, 559–563 (1973).
16. Chaloupka, E. C., Kang, J., Mastrangelo, M. A. & Donnelly, M. S. Cardiorespiratory and metabolic responses during forward and backward walking. Journal of Orthopaedic and Sports Physical Therapy 25, 302–306 (1997).
17. Hoyle, R. H., Stephenson, M. T., Palmgreen, P., Lorch, E. P. & Donohew, R. L. Reliability and validity of a brief measure of sensation seeking. Personality and Individual Differences 32, 401–414 (2002).
18. LeDoux, J. & Daw, N. D. Surviving threats: neural circuit and computational implications of a new taxonomy of defensive behaviour. Nature Reviews Neuroscience 19, 269–282 (2018).
19. Dickinson, A. & Balleine, B. Motivational Control of Instrumental Action. Current Directions in Psychological Science 4, 162–167 (1995).
20. Barrett, L. F. & Finlay, B. L. Concepts, goals and the control of survival-related behaviors. Current Opinion in Behavioral Sciences 24, 172–179 (2018).
21. Castegnetti, G. et al. Representation of probabilistic outcomes during risky decision-making. Nature Communications 11, 1–11 (2020).
22. Bach, D. R. The cognitive architecture of anxiety-like behavioral inhibition. Journal of Experimental Psychology: Human Perception and Performance 43, 18–29 (2017).
23. Korn, C. W. & Bach, D. R. Minimizing threat via heuristic and optimal policies recruits hippocampus and medial prefrontal cortex. Nature Human Behaviour 3, 733–745 (2019).
24. Korn, C. W. & Bach, D. R. Heuristic and optimal policy computations in the human brain during sequential decision-making. Nature Communications 9, (2018).
25. Brookes, J., Warburton, M., Alghadier, M., Mon-Williams, M. & Mushtaq, F. Studying human behavior with virtual reality: The Unity Experiment Framework. Behavior Research Methods 455–463 (2019). doi:10.3758/s13428-019-01242-0
26. Harris, P. A. et al. Research electronic data capture (REDCap)-A metadata-driven methodology and workflow process for providing translational research informatics support. Journal of Biomedical Informatics 42, 377–381 (2009).
27. Ree, M. J., MacLeod, C., French, D. & Locke, V. The State–Trait Inventory for Cognitive and Somatic Anxiety: Development and validation. in (Annual meeting of the Association for the Advancement of Behavior Therapy, New Orleans, LA, 2000).
28. Grös, D. F., Antony, M. M., Simms, L. J. & McCabe, R. E. Psychometric Properties of the State-Trait Inventory for Cognitive and Somatic Anxiety (STICSA): Comparison to the State-Trait Anxiety Inventory (STAI). Psychological Assessment 19, 369–381 (2007).
29. Ree, M. J., French, D., MacLeod, C. & Locke, V. Distinguishing Cognitive and Somatic Dimensions of State and Trait Anxiety: Development and Validation of the State-Trait Inventory for Cognitive and Somatic Anxiety (STICSA). Behavioural and Cognitive Psychotherapy 36, 313–332 (2008).
30. Fergus, T. A. & Valentiner, D. P. The Disgust Propensity and Sensitivity Scale-Revised: An examination of a reduced-item version. Journal of Anxiety Disorders 23, 703–710 (2009).
31. Golding, J. F. Predicting individual differences in motion sickness susceptibility by questionnaire. Personality and Individual Differences 41, 237–248 (2006).
32. Golding, J. F. Motion sickness susceptibility questionnaire revised and its relationship to other forms of sickness. Brain Research Bulletin 47, 507–516 (1998).
33. Tolchinsky, A. The Development of a Self-Report Questionnaire to Measure Problematic Video Game Play and its Relationship to Other Psychological Phenomena. Master’s Theses and Doctoral Dissertations 555 (2013).
34. Lane, N. E. & Kennedy, R. S. A New Method for Quantifying Simulator Sickness: Development and Application of the Simulator Sickness Questionnaire (SSQ). (Essex Corporation, Orlando, FL 88-7, 1988).
35. Bach, D. R., Moutoussis, M., Bowler, A. & Dolan, R. J. Predictors of risky foraging behaviour in healthy young people. Nature Human Behaviour 4, 832–843 (2020).
