The Liar's Walk: Detecting Deception With Gait and Gesture
http://gamma.cs.unc.edu/GAIT/
1. Introduction
tional information, by verbal and/or nonverbal means, in order to create or maintain in another or others a belief that the communicator himself or herself considers false". Motives for deceptive behaviors can vary from the inconsequential to those constituting a serious security threat. Many applications related to computer vision, human-computer interfaces, security, and computational social sciences need to be able to automatically detect such behaviors in public areas (airports, train stations, shopping malls), simulated environments, and social media [60]. In this paper, we address the problem of automatically detecting deceptive behavior from gaits and gestures in walking videos.

Deception detection is a challenging task because deception is a subtle human behavioral trait, and deceivers attempt to conceal their actual cues and expressions. However, there is considerable research on verbal (explicit) and nonverbal (implicit) cues of deception [22]. Implicit cues such as facial and body expressions [22], eye contact [79], and hand movements [30] can provide indicators of deception. Facial expressions have been widely studied as cues for automatic recognition of deception [20, 44, 43, 53, 52, 29, 69]. However, deceivers try to alter or control whatever they think is getting the most attention from others [21]. Compared to facial expressions, body movements such as gaits are more implicit and less likely to be controlled. This makes gait an excellent avenue for observing deceptive behaviors. The psychological literature on deception also shows that multiple factors need to be considered when deciphering nonverbal information [12].

Main Results: We present a data-driven approach for detecting the deceptive walking of individuals based on their gaits and gestures as extracted from their walking videos (Figure 1). Our approach is based on the assumption that humans are less likely to alter or control their gaits and gestures than their facial expressions [21], which arguably makes such cues better indicators of deception.

Given a video of an individual walking, we extract his/her gait as a series of 3D poses using state-of-the-art human pose estimation [16]. We also annotate the various gestures performed during the video. Using this data, we compute psychology-based gait features, gesture features, and deep features learned using an LSTM-based neural network. These gait, gesture, and deep features are collectively referred to as the deceptive features. Then, we feed the deceptive features into the fully connected layers of the neural network to classify normal and deceptive behaviors. We train this neural network classifier (the Deception Classifier) to learn the deep features and to classify the data into behavior labels on a novel dataset (DeceptiveWalk). Our Deception Classifier achieves an accuracy of 93.4% when classifying deceptive walking, an improvement of 5.9% and 10.1% over classifiers based on state-of-the-art emotion [10] and action [57] classification algorithms, respectively.

Additionally, we present our deception dataset, DeceptiveWalk, which contains 1144 annotated gaits and gestures collected from 88 individuals performing deceptive and natural walking. The videos in this dataset provide interesting observations about participants' behavior in the deceptive and natural conditions. Deceivers put their hands in their pockets and look around more than participants in the natural condition. Our observations also corroborate previous findings that deceivers display the opposite of the expected behavior or display controlled movement [22, 19, 48].

Some of the novel components of our work include:
1. A novel deception feature formulation of gaits and gestures obtained from walking videos, based on psychological characterization and deep features learned from an LSTM-based neural network.
2. A novel deception detection algorithm that detects deceptive walking with an accuracy of 93.4%.
3. A new public-domain dataset, DeceptiveWalk, containing annotated gaits and gestures with deceptive and natural walks.

The rest of the paper is organized as follows. In Section 2, we give a brief background on deception and discuss previous methods for detecting deceptive walking automatically. In Section 3, we give an overview of our approach, present the details of our user study, and introduce our novel DeceptiveWalk dataset. In Section 4, we present the novel deceptive features and provide the details of our method for detecting deception from walking videos. We highlight the results generated using our method in Section 5.

2. Related Work

In this section, we give a brief background of research on deception and discuss previous methods for detecting deceptive behaviors automatically.

2.1. Deception

Research on deception shows that people behave differently when they are being deceitful, and different clues can be used to detect deception [22]. Darwin first suggested that certain movements are "expressive" and escape the control of the will [17]. Additionally, when people know that they are being watched, they alter their behavior, a phenomenon known as "the Hawthorne effect" [39]. To avoid detection, deceivers try to alter or control whatever they think others are paying the most attention to [21]. Different areas of the body have different communicating abilities and provide information based on differences in movement, visibility, and speed of transmission [22]. Therefore, channels to which people pay less attention and that are harder to control are good indicators of deception.

Body expressions present an alternative channel for perception and communication, as shown in emotion research [6]. Body movements are more implicit and may be less likely to be controlled compared to facial expressions. Prior work suggests that clues such as less eye contact [79], downward gazing (through the affective experiences of guilt/sadness) [9, 21], and general hand movements [30] are good indicators of deception. Though there is a large amount of research on nonverbal cues of deception, there is no single conclusive universal cue [63]. To address this issue, we present a novel data-driven approach that computes features based on gaits and gestures, along with deep learning.
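The end-to-end pipeline outlined in Main Results (3D pose extraction, then gait, gesture, and deep features, then classification) can be sketched with placeholder stages. All function names below are hypothetical; only the feature dimensions (29-d gait, 7-d gesture, 32-d deep, 68-d combined) come from the paper.

```python
# Illustrative sketch of the deceptive-feature pipeline; the stage
# implementations are stubs, not the authors' code.
from typing import List

def extract_3d_poses(video_frames: List) -> List[List[float]]:
    """Placeholder for state-of-the-art 3D human pose estimation [16]."""
    return [[0.0, 0.0, 0.0] for _ in video_frames]  # one dummy 3D point per frame

def gait_features(poses) -> List[float]:
    """Placeholder for the 29 psychology-based gait features."""
    return [0.0] * 29

def gesture_features(annotations) -> List[float]:
    """Placeholder for the 7 annotated gesture features."""
    return [0.0] * 7

def deep_features(poses) -> List[float]:
    """Placeholder for the 32-d feature vector learned by the LSTM module."""
    return [0.0] * 32

def deceptive_features(video_frames, annotations) -> List[float]:
    # Gait, gesture, and deep features are concatenated into the
    # 68-dimensional "deceptive feature" vector fed to the classifier.
    poses = extract_3d_poses(video_frames)
    return gait_features(poses) + gesture_features(annotations) + deep_features(poses)

features = deceptive_features(video_frames=[None] * 100, annotations=[])
assert len(features) == 68  # 29 + 7 + 32
```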
3.3. Data Collection Study

We conducted a user study to collect gait and gesture data of individuals performing either deceptive or natural walks.

3.3.1 Participants

We recruited 88 participants (51 female, 30 male, 7 preferred not to say; mean age = 20.35) from a university campus. Appropriate institutional review board (IRB) approval was obtained for the study, and each participant provided informed written consent to participate in the study and be recorded on video.

7. After the exchange, the participant walks back towards the chair. We refer to this walk as Walk 3; it is captured by the camera placed behind the chair (Cam 1).

8. The participant places the exchanged package on the chair.

9. The participant walks towards the stairs located near the Start location. We refer to this walk as Walk 4; it is captured by the camera placed behind the chair (Cam 2).

10. The participant walks towards the experimenter on Floor 1 and is debriefed about the experiment.
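The classification stage that follows consumes fixed-length sequences of 3D poses, normalized so that each value lies within [0, 1] (Section 4.3.1). A minimal NumPy sketch, under the assumption that normalization is a min-max scaling over the whole sequence (the paper does not specify the exact scheme, and the joint count here is an illustrative placeholder):

```python
# Min-max normalization of a 3D pose sequence to [0, 1], one plausible
# reading of Section 4.3.1's per-value normalization.
import numpy as np

def normalize_pose_sequence(poses: np.ndarray) -> np.ndarray:
    """poses: (T, J, 3) array of 3D joint positions over T frames."""
    lo, hi = poses.min(), poses.max()
    if hi == lo:  # degenerate sequence; avoid division by zero
        return np.zeros_like(poses)
    return (poses - lo) / (hi - lo)

seq = np.random.default_rng(0).normal(size=(240, 16, 3))  # 240 frames, 16 joints
norm = normalize_pose_sequence(seq)
assert norm.shape == seq.shape
assert norm.min() >= 0.0 and norm.max() <= 1.0
```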
Given a sequence of 3D pose data for a fixed number of time frames as input, the task of the classification algorithm is to assign the input one of two class labels: natural or deceptive. To achieve this, we develop a deep neural network consisting of long short-term memory (LSTM) [28] layers. LSTM units consist of feedback connections and gate functions that help them retain useful patterns from input data sequences. Since the inputs in our case are walking motions, which are periodic in time, we reason that LSTM units can learn to efficiently encode the different walking patterns in our data, which, in turn, helps the network segregate the data into the respective classes. We call the feature vectors learned from the LSTM layers the deep features, f_d ∈ R^32.

4.3.1 Network Architecture

Our overall neural network is shown in Figure 5. We first normalize the input sequences of 3D poses so that each individual value lies within the range of 0 and 1. We feed the normalized sequences of 3D poses into an LSTM module, which consists of 3 LSTM units of sizes 128, 64, and 32, respectively, each of depth 2. We concatenate the 32-dimensional feature vectors output from the LSTM module with the 29-dimensional gait and 7-dimensional gesture features and feed the 68-dimensional combined deceptive feature vectors into a convolution and pooling module. This module consists of 2 convolution layers. The first convolution layer has a depth of 48 and a kernel size of 3. It is followed by a maxpool layer with a window size of 3. The second convolution layer has a depth of 16 and a kernel size of 3. The output of the second convolution layer is flattened and passed into the fully connected module, which consists of 2 fully connected (FC) layers of sizes 32 and 8, respectively. All the FC layers are equipped with the ELU activation function. The output feature vectors from the fully connected module are passed through a 2-dimensional fully connected layer with the softmax activation function to generate the output class probabilities. We assign the predicted class label as the one with the higher probability.

5. Results

We first describe the implementation details of our classification network, followed by a detailed summary of the experimental results.

5.1. Implementation Details

We randomly selected 80% of the dataset for training the network, 10% for cross-validation, and kept the remaining 10% for testing. Inputs were fed to the network with a batch size of 8. We used the standard cross-entropy loss to train the network. The loss was optimized by running the Adam optimizer [33] for n = 500 epochs with a momentum of 0.9 and a weight decay of 10^-4. The initial learning rate was 0.001, which was halved after n/2 = 250, 3n/4 = 375, and 7n/8 ≈ 437 epochs.

5.2. Experiments

We evaluate the performance of our LSTM-based network, as well as the usefulness of the deep features, through exhaustive experiments. All the experimental results are summarized in Table 3.

Since ours is the first algorithm that detects deception from walking gaits, we compare the performance of our algorithm with a random forest classification method. First, we feed all the features and feature combinations to an off-the-shelf random forest classifier with 100 decision trees, each of max depth 2. Each tree performs binary classification on sub-samples of the dataset, and the final estimate is calculated as a weighted average of the predictions of the individual trees. Random forest classifiers do not naturally capture the temporal patterns of the walking styles in the 3D pose sequences; hence, they lead to a lower accuracy compared to the LSTM-based network. Moreover, since each tree in the random forest is a weak classifier and the final estimate is obtained by averaging, the performance is poorer than that of the LSTM-based network even when only the gait and gesture features are used for classification. Overall, we find a consistent 3% improvement in accuracy across the board when using the LSTM-based network over the random forest classifier.

Second, we compare the performance of the LSTM-based network with the various input features taken individually. Gesture features by themselves provide the lowest classification accuracies for both classifiers because they only coarsely summarize the subject's activities and not their walking patterns. Gait features, on the other hand, contain only this information and are seen to be more helpful in distinguishing between the class labels. Gesture and gait features collectively perform better than either does individually. However, these features still lose some of the useful temporal patterns in the original 3D pose sequences.
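The network of Section 4.3.1 and the training setup of Section 5.1 can be sketched in PyTorch. The paper does not publish code, so the plumbing details here are assumptions: the pose dimension per frame is a placeholder, the 68-d deceptive feature vector is treated as a 1-channel signal for the 1-D convolutions, and Adam's "momentum of 0.9" is mapped to its first beta.

```python
# Plausible PyTorch realization of the Deception Classifier; layer sizes
# follow Section 4.3.1, training hyperparameters follow Section 5.1.
import torch
import torch.nn as nn

class DeceptionClassifier(nn.Module):
    def __init__(self, pose_dim: int = 48):  # pose_dim is an assumed placeholder
        super().__init__()
        # LSTM module: 3 LSTM units of sizes 128, 64, and 32, each of depth 2.
        self.lstm1 = nn.LSTM(pose_dim, 128, num_layers=2, batch_first=True)
        self.lstm2 = nn.LSTM(128, 64, num_layers=2, batch_first=True)
        self.lstm3 = nn.LSTM(64, 32, num_layers=2, batch_first=True)
        # Convolution and pooling module over the 68-d combined features.
        self.conv = nn.Sequential(
            nn.Conv1d(1, 48, kernel_size=3),   # depth 48, kernel size 3
            nn.MaxPool1d(3),                   # maxpool window size 3
            nn.Conv1d(48, 16, kernel_size=3),  # depth 16, kernel size 3
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 20, 32), nn.ELU(),  # FC layers of sizes 32 and 8
            nn.Linear(32, 8), nn.ELU(),
            nn.Linear(8, 2),                   # 2-way logits; softmax at inference
        )

    def forward(self, poses, gait, gesture):
        # poses: (B, T, pose_dim); gait: (B, 29); gesture: (B, 7)
        h, _ = self.lstm1(poses)
        h, _ = self.lstm2(h)
        h, _ = self.lstm3(h)
        deep = h[:, -1, :]                                  # 32-d deep features f_d
        combined = torch.cat([deep, gait, gesture], dim=1)  # 68-d deceptive features
        return self.fc(self.conv(combined.unsqueeze(1)))

model = DeceptionClassifier()
loss_fn = nn.CrossEntropyLoss()  # standard cross-entropy loss
opt = torch.optim.Adam(model.parameters(), lr=0.001,
                       betas=(0.9, 0.999), weight_decay=1e-4)
# Halve the learning rate after n/2 = 250, 3n/4 = 375, and 7n/8 ≈ 437 epochs.
sched = torch.optim.lr_scheduler.MultiStepLR(opt, milestones=[250, 375, 437],
                                             gamma=0.5)

logits = model(torch.randn(8, 240, 48), torch.randn(8, 29), torch.randn(8, 7))
assert logits.shape == (8, 2)
```

Training with `nn.CrossEntropyLoss` on raw logits is equivalent to the described softmax output, since the loss applies log-softmax internally.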
Figure 6. Separation of Features: We present the scatter plot of the features for each of the two class labels. We project the features to a 3-dimensional space using PCA for visualization. The gait and gesture features are not separable in this space. However, the combined gait+gesture features and the deep features are well separated even in this low-dimensional space. The boundary demarcated for the deceptive class features is only for visualization and does not represent the true class boundary.

Figure 7. Gesture Features: We present the percentage of videos in which the gestures were observed for both deceptive and natural walks. We observe that deceivers put their hands in their pockets and look around more than participants in the natural condition. However, the trend is unclear for other gestures. This may be explained by previous literature suggesting that deceivers try to alter or control what they think others are paying the most attention to [22, 19, 48].

Using LSTMs to learn the temporal patterns directly from the 3D pose sequences, we observe an improvement of 5% over using the combined gait and gesture features. Finally, combining the deep features learned from the LSTM module with the gait and gesture features leads to an overall classification accuracy of 93.4%.

We also compared our method with prior methods for emotion and action recognition from gaits, since these methods also solve the problem of gait classification (albeit with a different set of labels). We compare with approaches that use spatial-temporal graph convolution networks for emotion recognition (STEP [10]) and action recognition (ST-GCN [71]). We also compare our method with an approach that uses a novel directed graph neural network (DGNN [57]) for action recognition. We train these methods on our DeceptiveWalk dataset and obtain their performance on the same testing set as our method. Our method outperforms these state-of-the-art models used for emotion and action recognition by a minimum of 5.9% (Table 2).

Furthermore, we show the scatter of the features from both the natural and the deceptive classes in Figure 6. The original features are high-dimensional (f_gait ∈ R^29, f_gesture ∈ R^7, and f_d ∈ R^32). Hence, we perform PCA to project and visualize them in a 3-dimensional space. The gait and gesture features from the two classes are not separable in this space. However, the combined gait+gesture features, as well as the deep features from the two classes, are well separated even in this lower-dimensional space, implying that the deep features f_d and the gait+gesture features f_g can efficiently separate the two classes.

5.3. Analysis of Gesture Cues

We tabulate the distribution of various gestures in Figure 7. We can make several interesting observations from this data.

1. Deceivers are more likely to put their hands in their pockets than the participants in the natural condition.

2. Deceivers look around and check if anyone is noticing them more than the participants in the natural condition.

3. Deceivers touch their hair more than the participants in the natural condition.

4. Participants in the natural condition touched their shirt/jacket more than the deceivers.

5. Deceivers touched their faces and hair, folded their hands, and looked at their phones at about the same rate as those in the natural condition.

Previous literature suggests that deceivers are sometimes very aware of the cues they are putting out and may, in turn, display the opposite of the expected behavior or display controlled movement [22, 19, 48]. This would explain observations 4 and 5. Factors such as how often a person lies, or how aware they are of others' perceptions of them, could contribute to whether they show fidgeting and nervous behavior or controlled and stable movements.

6. Conclusion, Limitations, and Future Work

We presented a novel method for distinguishing the deceptive walks of individuals from natural walks based on their walking style or pattern in videos. Based on our novel DeceptiveWalk dataset, we train an LSTM-based classifier that classifies the deceptive features and predicts whether an individual is performing a deceptive walk. We observe an accuracy of 93.4% obtained using 10-fold cross-validation on 1144 data points.

There are some limitations to our approach. Our approach is based on our dataset collected in a controlled setting. In the future, we would like to test our algorithm on real-world videos involving deceptive walks. Since the accuracy of our algorithm depends on the accurate extraction of 3D poses, the algorithm may perform poorly in cases where pose estimation is inaccurate (e.g., occlusions, bad
lighting). For this work, we manually annotate the gesture data. In the future, we would like to automatically annotate the different gestures performed by the individual and automate the entire method. Additionally, our approach is limited to walking. In the future, we would like to extend our algorithm to more general cases that include a variety of activities. The analysis of our data collection study reveals that the mapping between gestures and deception is unclear and may depend on other factors. In the future, we would like to explore the factors that govern the relationship between gestures and deception. Finally, we would like to combine our method with other verbal and nonverbal cues of deception (e.g., gazing, facial expressions, etc.) and compare the usefulness of body expressions to other cues in detecting deception.

References

[1] Mohamed Abouelenien, Verónica Pérez-Rosas, Rada Mihalcea, and Mihai Burzo. Deception detection using a multimodal approach. In Proceedings of the 16th International Conference on Multimodal Interaction, pages 58-65. ACM, 2014.
[2] Mohamed Abouelenien, Verónica Pérez-Rosas, Rada Mihalcea, and Mihai Burzo. Detecting deceptive behavior via integration of discriminative features from multiple modalities. IEEE Transactions on Information Forensics and Security, 12(5):1042-1055, 2016.
[3] Mohamed Abouelenien, Verónica Pérez-Rosas, Bohan Zhao, Rada Mihalcea, and Mihai Burzo. Gender-based multimodal deception detection. In Proceedings of the Symposium on Applied Computing, pages 137-144. ACM, 2017.
[4] Anthony P Atkinson, Mary L Tunstall, and Winand H Dittrich. Evidence for distinct contributions of form and motion information to the recognition of emotions from body gestures. Cognition, 104(1):59-72, 2007.
[5] Dan Atlas and Gideon L Miller. Detection of signs of attempted deception and other emotional stresses by detecting changes in weight distribution of a standing or sitting person, Feb. 8 2005. US Patent 6,852,086.
[6] H. Aviezer, Y. Trope, and A. Todorov. Body cues, not facial expressions, discriminate between intense positive and negative emotions. Science, 228(6111):1225-1229, 2012.
[7] Danilo Avola, Luigi Cinque, Gian Luca Foresti, and Daniele Pannone. Automatic deception detection in rgb videos using facial action units. In Proceedings of the 13th International Conference on Distributed Smart Cameras, page 5. ACM, 2019.
[8] Gene Ball and Jack Breese. Relating personality and behavior: Posture and gestures. In International Workshop on Affective Interactions, pages 196-203. Springer, 1999.
[9] Roy F. Baumeister, Arlene M. Stillwell, and Todd F. Heatherton. Guilt: An interpersonal approach. Psychological Bulletin, 115(2):243-267, 1994.
[10] Uttaran Bhattacharya, Trisha Mittal, Rohan Chandra, Tanmay Randhavane, Aniket Bera, and Dinesh Manocha. Step: Spatial temporal graph convolutional networks for emotion perception from gaits. arXiv preprint arXiv:1910.12906, 2019.
[11] Pradeep Buddharaju, Jonathan Dowdall, Panagiotis Tsiamyrtzis, Dvijesh Shastri, I Pavlidis, and MG Frank. Automatic thermal monitoring system (athemos) for deception detection. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), volume 2, pages 1179-vol. IEEE, 2005.
[12] David Buller and Judee Burgoon. Interpersonal deception theory. Communication Theory, 6(3):203-242, 1996.
[13] Niall J Conroy, Victoria L Rubin, and Yimin Chen. Automatic deception detection: Methods for finding fake news. Proceedings of the Association for Information Science and Technology, 52(1):1-4, 2015.
[14] A Crenn, A Khan, et al. Body expression recognition from anim. 3d skeleton. In IC3D, 2016.
[15] Qian Cui, Eric J Vanman, Dongtao Wei, Wenjing Yang, Lei Jia, and Qinglin Zhang. Detection of deception based on fmri activation patterns underlying the production of a deceptive response and receiving feedback about the success of the deception after a mock murder crime. Social cognitive and affective neuroscience, 9(10):1472-1480, 2013.
[16] R Dabral, A Mundhada, et al. Learning 3d human pose from structure and motion. In ECCV, 2018.
[17] C. Darwin and P. Prodger. The Expression of Emotions in Man and Animals. Philosophical Library, 1872/1955.
[18] P Ravindra De Silva, Minetada Osano, Ashu Marasinghe, and Ajith P Madurapperuma. Towards recognizing emotion with affective dimensions through body gestures. In 7th International Conference on Automatic Face and Gesture Recognition (FGR06), pages 269-274. IEEE, 2006.
[19] Bella DePaulo, James Lindsay, Brian Malone, Laura Muhlenbruck, Kelly Charlton, and Harris Cooper. Cues to deception. Psychological Bulletin, 129(1):74-118, 2003.
[20] Mingyu Ding, An Zhao, Zhiwu Lu, Tao Xiang, and Ji-Rong Wen. Face-focused cross-stream network for deception detection in videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7802-7811, 2019.
[21] P. Ekman. Telling Lies: Clues to deceit in the marketplace, marriage, and politics. Norton, 1994.
[22] Paul Ekman and Wallace Friesen. Nonverbal leakage and clues to deception. Psychiatry, 32(1):88-106, 1969.
[23] Tommaso Fornaciari and Massimo Poesio. Automatic deception detection in italian court cases. Artificial intelligence and law, 21(3):303-340, 2013.
[24] David Givens. The Nonverbal Dictionary of Gestures, Signs, and Body Language Cues. Center for Nonverbal Studies Press, 2002.
[25] Viresh Gupta, Mohit Agarwal, Manik Arora, Tanmoy Chakraborty, Richa Singh, and Mayank Vatsa. Bag-of-lies: A multimodal dataset for deception detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019.
[26] Ikhsanul Habibie, Daniel Holden, Jonathan Schwarz, Joe Yearsley, and Taku Komura. A recurrent variational autoencoder for human motion synthesis. In BMVC, 2017.
[27] Md Kamrul Hasan, Wasifur Rahman, Luke Gerstner, Taylan Sen, Sangwu Lee, Kurtis Glenn Haut, and Mohammed Ehsan Hoque. Facial expression based imagination index and a transfer learning approach to detect deception. 2019.
[28] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computation, 9(8):1735-1780, 1997.
[29] Guosheng Hu, Li Liu, Yang Yuan, Zehao Yu, Yang Hua, Zhihong Zhang, Fumin Shen, Ling Shao, Timothy Hospedales, Neil Robertson, et al. Deep multi-task learning to recognise subtle facial expressions of mental states. In Proceedings of the European Conference on Computer Vision (ECCV), pages 103-119, 2018.
[30] M. L. Jensen, T. O. Meservy, J. Kruse, J. K. Burgoon, and J. F. Nunamaker. Identification of deceptive behavioral cues extracted from video. IEEE Intelligent Transportation Systems, pages 1135-1140, 2005.
[31] Hamid Karimi, Jiliang Tang, and Yanen Li. Toward end-to-end deception detection in videos. In 2018 IEEE International Conference on Big Data (Big Data), pages 1278-1283. IEEE, 2018.
[32] Mehran Khodabandeh, Hamid Reza Vaezi Joze, Ilya Zharkov, and Vivek Pradeep. Diy human action dataset generation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 1448-1458, 2018.
[33] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[34] A Kleinsmith, N Bianchi-Berthouze, et al. Affective body expression perception and recognition: A survey. IEEE TAC, 2013.
[35] Gangeshwar Krishnamurthy, Navonil Majumder, Soujanya Poria, and Erik Cambria. A deep learning approach for multimodal deception detection. arXiv preprint arXiv:1803.00344, 2018.
[36] Chieh-Ming Kuo, Shang-Hong Lai, and Michel Sarkis. A compact deep learning model for robust facial expression recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 2121-2129, 2018.
[37] Weston LaBarre. The cultural basis of emotions and gestures. Journal of personality, 16(1):49-68, 1947.
[38] Jiyoung Lee, Seungryong Kim, Sunok Kim, Jungin Park, and Kwanghoon Sohn. Context-aware emotion recognition networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 10143-10152, 2019.
[39] S. D. Levitt and J. A. List. Was there really a hawthorne effect at the hawthorne plant? an analysis of the original illumination experiments. American Economic Journal: Applied Economics, 3(1):224-38, 2011.
[40] Elisabeta Marinoiu, Mihai Zanfir, Vlad Olaru, and Cristian Sminchisescu. 3d human sensing, action and emotion recognition in robot assisted therapy of children with autism. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2158-2167, 2018.
[41] J. Masip, E. Garrido, and C. Herrero. Defining deception. Anales de Psicología, 20(1):147-171, 2004.
[42] Daniel McDuff and Mohammad Soleymani. Large-scale affective content analysis: Combining media content features and facial reactions. In 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pages 339-345. IEEE, 2017.
[43] Thomas O Meservy, Matthew L Jensen, John Kruse, Judee K Burgoon, Jay F Nunamaker, Douglas P Twitchell, Gabriel Tsechpenakis, and Dimitris N Metaxas. Deception detection through automatic, unobtrusive analysis of nonverbal behavior. IEEE Intelligent Systems, 20(5):36-43, 2005.
[44] Nicholas Michael, Mark Dilsizian, Dimitris Metaxas, and Judee K Burgoon. Motion profiles for deception detection using visual cues. In European Conference on Computer Vision, pages 462-475. Springer, 2010.
[45] Johannes Michalak, Nikolaus F Troje, Julia Fischer, Patrick Vollmar, Thomas Heidenreich, and Dietmar Schulte. Embodiment of sadness and depression: gait patterns associated with dysphoric mood. Psychosomatic medicine, 71(5):580-587, 2009.
[46] Romero Morais, Vuong Le, Truyen Tran, Budhaditya Saha, Moussa Mansour, and Svetha Venkatesh. Learning regularity in skeleton trajectories for anomaly detection in videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 11996-12004, 2019.
[47] Allyson Mount. Intentions, gestures, and salience in ordinary and deferred demonstrative reference. Mind & Language, 23(2):145-164, 2008.
[48] J. Navarro. Four-domain model for detecting deception: An alternative paradigm for interviewing. FBI Law Enforcement Bulletin, 72(6):19-24, 2003.
[49] Trong-Nguyen Nguyen and Jean Meunier. Anomaly detection in video sequence with appearance-motion correspondence. In Proceedings of the IEEE International Conference on Computer Vision, pages 1273-1283, 2019.
[50] Fatemeh Noroozi, Dorota Kaminska, Ciprian Corneanu, Tomasz Sapinski, Sergio Escalera, and Gholamreza Anbarjafari. Survey on emotional body gesture recognition. IEEE Transactions on Affective Computing, 2018.
[51] Dario Pavllo, David Grangier, and Michael Auli. Quaternet: A quaternion-based recurrent model for human motion. arXiv preprint arXiv:1805.06485, 2018.
[52] Verónica Pérez-Rosas, Mohamed Abouelenien, Rada Mihalcea, and Mihai Burzo. Deception detection using real-life trial data. In Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, pages 59-66. ACM, 2015.
[53] Verónica Pérez-Rosas, Mohamed Abouelenien, Rada Mihalcea, Yao Xiao, CJ Linton, and Mihai Burzo. Verbal and nonverbal clues for real-life deception detection. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 2336-2346, 2015.
[54] Tanmay Randhavane, Aniket Bera, Kyra Kapsaskis, Uttaran Bhattacharya, Kurt Gray, and Dinesh Manocha. Identifying emotions from walking using affective and deep features. arXiv preprint arXiv:1906.11884, 2019.
[55] Rodrigo Rill-Garcia, Hugo Jair Escalante, Luis Villasenor-Pineda, and Veronica Reyes-Meza. High-level features for multimodal deception detection in videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019.
[56] C Roether, L Omlor, et al. Critical features for the perception of emotion from gait. Vision, 2009.
[57] Lei Shi, Yifan Zhang, Jian Cheng, and Hanqing Lu. Skeleton-based action recognition with directed graph neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7912-7921, 2019.
[58] M. L. Slepian, E. J. Masicampo, N. R. Toosi, and N. Ambady. The physical burdens of secrecy. Journal of Experimental Psychology: General, 141(4):619-624, 2012.
[59] Waqas Sultani, Chen Chen, and Mubarak Shah. Real-world anomaly detection in surveillance videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6479-6488, 2018.
[60] Michail Tsikerdekis and Sherali Zeadally. Online deception in social media. Communications of the ACM, 57(9):72, 2014.
[61] Shinji Umeyama. Least-squares estimation of transformation parameters between two point patterns. TPAMI, (4):376-380, 1991.
[62] Sophie Van der Zee, Ronald Poppe, Paul J Taylor, and Ross Anderson. To freeze or not to freeze: A culture-sensitive motion capture approach to detecting deceit. PloS one, 14(4):e0215000, 2019.
[63] Aldert Vrij, Maria Hartwig, and Pär Anders Granhag. Reading lies: nonverbal communication and deception. Annual Review of Psychology, 70:295-317, 2019.
[64] Changsheng Wan, Li Wang, and Vir V Phoha. A survey on gait recognition. ACM Computing Surveys (CSUR), 51(5):89, 2019.
[65] Shangfei Wang and Qiang Ji. Video affective content analysis: a survey of state-of-the-art methods. IEEE Transactions on Affective Computing, 6(4):410-430, 2015.
[66] Yanxiang Wang, Bowen Du, Yiran Shen, Kai Wu, Guangrong Zhao, Jianguo Sun, and Hongkai Wen. Ev-gait: Event-based robust gait recognition using dynamic vision sensors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6358-6367, 2019.
[67] Xiu-Shen Wei, Chen-Lin Zhang, Hao Zhang, and Jianxin Wu. Deep bimodal regression of apparent personality traits from short video sequences. IEEE Transactions on Affective Computing, 9(3):303-315, 2017.
[68] Di Wu, Nabin Sharma, and Michael Blumenstein. Recent advances in video-based human action recognition using deep learning: A review. In 2017 International Joint Conference on Neural Networks (IJCNN), pages 2865-2872. IEEE, 2017.
[69] Zhe Wu, Bharat Singh, Larry S Davis, and VS Subrahmanian. Deception detection in videos. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
[70] Baohan Xu, Yanwei Fu, Yu-Gang Jiang, Boyang Li, and Leonid Sigal. Heterogeneous knowledge transfer in video emotion recognition, attribution and summarization. IEEE Transactions on Affective Computing, 9(2):255-270, 2016.
[71] Sijie Yan, Yuanjun Xiong, and Dahua Lin. Spatial temporal graph convolutional networks for skeleton-based action recognition. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
[72] Ceyuan Yang, Zhe Wang, Xinge Zhu, Chen Huang, Jianping Shi, and Dahua Lin. Pose guided human video generation. In Proceedings of the European Conference on Computer Vision (ECCV), pages 201-216, 2018.
[73] Guangle Yao, Tao Lei, and Jiandan Zhong. A review of convolutional-neural-network-based action recognition. Pattern Recognition Letters, 118:14-22, 2019.
[74] Shiqi Yu, Tieniu Tan, Kaiqi Huang, Kui Jia, and Xinyu Wu. A study on gait-based gender classification. IEEE Transactions on Image Processing, 18(8):1905-1910, 2009.
[75] Chen-Lin Zhang, Hao Zhang, Xiu-Shen Wei, and Jianxin Wu. Deep bimodal regression for apparent personality analysis. In European Conference on Computer Vision, pages 311-324. Springer, 2016.
[76] Kaihao Zhang, Wenhan Luo, Lin Ma, Wei Liu, and Hongdong Li. Learning joint gait representation via quintuplet loss minimization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4700-4709, 2019.
[77] Ziyuan Zhang, Luan Tran, Xi Yin, Yousef Atoum, Xiaoming Liu, Jian Wan, and Nanxin Wang. Gait recognition via disentangled representation learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4710-4719, 2019.
[78] Lina Zhou and Dongsong Zhang. Following linguistic footprints: automatic deception detection in online communication. Communications of the ACM, 51(9):119-122, 2008.
[79] Miron Zuckerman, Bella DePaulo, and Robert Rosenthal. Verbal and nonverbal communication of deception. Advances in Experimental Social Psychology, 14:1-59, 1981.
[80] Jiaxu Zuo, Tom Gedeon, and Zhenyue Qin. Your eyes say you're lying: An eye movement pattern analysis for face familiarity and deceptive cognition. In 2019 International Joint Conference on Neural Networks (IJCNN), pages 1-8. IEEE, 2019.