Motiv Emot (2012) 36:371–381
DOI 10.1007/s11031-011-9257-2


Do you see what I see? Learning to detect micro expressions of emotion
Carolyn M. Hurley

Published online: 11 November 2011

© Springer Science+Business Media, LLC 2011

Abstract The ability to detect micro expressions is an important skill for understanding a person’s true emotional state; however, these quick expressions are often difficult to detect. This is the first study to examine the effects of boundary factors such as training format, exposure, motivation, and reinforcement on the detection of micro expressions of emotion. A 3 (training type) by 3 (reinforcement) fixed factor design with three control groups was conducted, in which 306 participants were trained and evaluated immediately after exposure and at 3 and 6 weeks post-training. Training improved the recognition of micro expressions, and the greatest success was found when a knowledgeable instructor facilitated the training and employed diverse training techniques such as description, practice and feedback (d’s > .30). Recommendations are offered for future training of micro expressions, which can be used in security, health, business, and intercultural contexts.

Keywords Micro expression · Facial expression · Emotion · Training

This work was submitted in partial fulfillment of a Doctor of Philosophy degree at the University at Buffalo by the author. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the Transportation Security Administration, the Department of Homeland Security, or the United States of America. The author would like to thank Drs. Mark Frank and David Matsumoto for loan of the Micro Expression Training Tool, second edition.

C. M. Hurley (✉)
Transportation Security Administration, 601 South 12th Street, Arlington, VA 22202, USA
e-mail:

Introduction

If facial expressions of emotion were delivered uniformly each and every time an emotion was elicited, eventually all of us would be near perfect perceivers of others. However, pressures to conceal or mask one’s true feelings may result in emotional displays that are quick or fragmented (called micro momentary expressions, Haggard and Isaacs 1966; or micro expressions, Ekman and Friesen 1969). Since daily life features many pressures to conceal or mask one’s emotions, as a function of status, culture, context, politeness, and so forth (Ekman 1972), the ability to accurately perceive and interpret these quick expressions would improve our interpersonal skills, allowing us to better understand individuals’ true emotional states.

The ability to “read” others is advantageous for the average person, but in particular for clinicians and security practitioners, where the ability to understand others can result in more informed judgments regarding threats to oneself and others. Practitioners are already utilizing web-based micro expression (ME) training in security (e.g., Department of State, Department of Homeland Security, Department of Defense) and health contexts, although testing of these efforts has been largely limited to clinical populations (e.g., Marsh et al. 2010; Russell et al. 2006, 2008). Identifying effective training methods is imperative, especially in these critical situations where a superior understanding of emotion can significantly improve our national security and quality of life.

The best available research in concealment of emotion suggests that these masked emotional signals, particularly MEs, are very difficult to detect (Ekman and Friesen 1969, 1974a; Etcoff et al. 2000; Porter and ten Brinke 2008). Recent research has found that it is possible to train these skills in a short period (Matsumoto and Hwang, in press),
yet few boundary factors that may affect training success have been explored. This manuscript examines the trainability of MEs of emotion, the optimal method of training, the role of motivating factors, the effect of reinforcement, and the retention of training materials over a 6-week period. This will help identify more effective training methods, which can be used to train individuals, such as those in national security contexts, who may encounter concealed emotions like MEs.

Background

Micro expressions of emotion

Emotions can be defined as “short-lived psychological-physiological phenomena that represent efficient modes of adaptation to changing environmental demands” (Levenson 1994, p. 123). Emotions are automatic responses that are triggered (aroused in a fraction of a second) by environmental stimuli that alter our attention and organize biological responses, preparing us to react. Emotions are complex and involve a number of bodily response systems such as expression, muscular tonus, voice, and autonomic nervous system activity (Levenson 1994).

Besides unique internal signals, emotions also generate external signals, such as facial expressions, that provide clues of these internal changes. A significant body of literature has examined the basic emotions of anger, contempt, disgust, fear, happiness, sadness, and surprise, revealing that each appears to have a characteristic facial expression that is universal across cultures (e.g., Ekman 2003; Elfenbein and Ambady 2002). The universal production of these facial signals suggests that these emotional expressions are genetically determined and biology is largely responsible for establishing which facial movements are associated with certain emotions (DeJong 1979; DeMyer 1980).

An ME is a special case of the basic emotional expression, which was first discovered by Haggard and Isaacs (1966) while studying clinical interviews. They believed MEs were caused by an unconscious repression of conflict and that those expressions occurred too quickly to be seen in real time. Ekman and Friesen (1969, 1974b) undertook a more rigorous program of study that fully articulated the nature of MEs. After examining recorded psychiatric interviews frame-by-frame, they found that MEs were emotional expressions that “leaked” out when individuals attempted to inhibit or manage their facial displays (Ekman 2003). They concluded that these quick expressions represented signs of concealed emotion, as uninhibited or naturally occurring emotional expressions generally last several seconds in length or more (Hess and Kleck 1990).

The existence of MEs has been verified in studies of concealment (Porter and ten Brinke 2008) and is relevant to high-stakes contexts like law enforcement and national security. For example, if someone is transiting a security checkpoint and is in possession of illegal drugs, he may have a fear of discovery. He will in all likelihood try to hide these feelings, so any emotional clues he produces may be more subtle than in a context where he is not trying to manage his behavior. Research has shown that the ability to detect MEs is related to skill at detecting deception in high-stakes scenarios (Ekman and O’Sullivan 1991, 2006; Ekman et al. 1999), likely because it is easier to judge veracity when an observer is able to accurately understand how the target is feeling. This research emphasizes the importance of ME recognition skills for any individual whose profession requires interpersonal interaction or deception detection.

Facial and micro expression training

Scientists have long endeavored to train people to better recognize facial expressions. As early as the 1920s, researchers had students study pictorials or verbal descriptions of facial expressions (Allport 1924; Guilford 1929; Jarden and Fernberger 1926; Jenness 1932). However, the absence of clear stimulus materials (drawings versus photographs) and clear identification of expressions limited this training research. After researchers began to systematically study and define the muscle movements inherent in emotional expressions, they were able to create detailed facial coding systems (e.g., Ekman and Friesen 1978; Izard 1979). This allowed researchers to create standardized sets of valid emotion training and testing materials (e.g., BART, Ekman and Friesen 1974b; PoFA, Ekman and Friesen 1975; JACFEE, Matsumoto and Ekman 1988; JACBART, Matsumoto et al. 2000).

The Japanese and Caucasian Brief Affect Recognition Test (JACBART) was the first published test of micro expression recognition accuracy (MERA) that was rigorously evaluated (Matsumoto et al. 2000). The JACBART created the appearance of more dynamic expressions, as each poser’s neutral face was imposed before and after the emotional expression face, reducing the aftereffects of the stimuli. All expression images were scored with the Facial Action Coding System (FACS; Ekman and Friesen 1978) to ensure the same muscle actions occurred for each emotion and were consistent with universally recognized expressions (Ekman 2003). Additionally, these images were tested with an international audience to ensure cross-cultural agreement (Biehl et al. 1997). Matsumoto and colleagues provided evidence of internal and temporal reliability and convergent and concurrent validity for this test across five studies and found similar accuracy patterns
even with the differences made to presentation speed and judgment task (Matsumoto et al. 2000).

This ME testing procedure evolved into a self-instructional training tool, originally called the Micro Expression Training Tool (METT; now available as the METT Advanced and the Microexpression Recognition Tool [MiX]). The METT is presented as a stand-alone training tool; it offers a pre-test, a training section, practice examples with feedback, a review section, and a post-test. The stimuli used in these training tools are laboratory produced, which provides the necessary consistency and reliability of expression, poser, intensity, angle and so forth to provide a scientific test of MERA. However, use of this type of material limits the ability to generalize to naturally occurring spontaneous expressions, which have more dynamic features (Naab and Russell 2007).

Researchers have used versions of the METT to train department store employees and trial consultants (Matsumoto and Hwang, in press) and individuals with schizophrenia (Marsh et al. 2010; Russell et al. 2006, 2008) to detect MEs. A 2-h instructor-led session using the MiX not only significantly improved Korean department store employees’ ability to identify MEs (N = 81, 18% increase), but also led to higher social and communication skills scores (Matsumoto and Hwang, in press). A similar experiment using a small group of trial consultants also showed improvements in accuracy (N = 25, 18% increase). Further analyses revealed no skill decay over a 2-week period for both groups (Matsumoto and Hwang, in press).

The METT has also been used to train clinical patients with emotion recognition deficiencies to more accurately recognize emotion (Marsh et al. 2010; Russell et al. 2006, 2008). Training individuals with schizophrenia to read facial expressions using the METT resulted in a significant improvement in ME recognition at the post-test (9% increase, Russell et al. 2006; 18% increase, Russell et al. 2008), illustrating the tool’s robustness to different populations. These studies support a meaningful training-accuracy relationship for identifying MEs and highlight some possible social benefits.

Researchers have used other materials to teach others about facial expressions. Stickle and Pellegreno (1982) and Elfenbein (2006) used the Pictures of Facial Affect (PoFA, Ekman and Friesen 1975) to train American students to recognize emotional expressions (Elfenbein also used a subset of Chinese posing facial expressions; Wang and Markham 1999). Although both studies reported success for training, the authors did not report either the pre and post accuracy scores and within subjects change (Stickle and Pellegreno 1982) or the baseline recognition accuracy (Elfenbein 2006). Those limitations inhibit interpretation of these data. These studies also did not examine the ability to detect quick expressions such as MEs, further limiting the ability to compare these methods to standardized tools such as the METT or MiX.

Boundary factors to training

While research demonstrates the validity of using commercial ME training tools to train recognition skills (Matsumoto and Hwang, in press; Russell et al. 2006, 2008), little research has analyzed the underlying factors associated with these skill improvements. Training formats such as simple feedback (Elfenbein 2006), lecture and practice (Stickle and Pellegreno 1982), and the METT/MiX (Matsumoto and Hwang, in press) have all improved expression recognition, but it is unknown which methods have produced the greatest improvements or had the greatest retention, due to differences in both testing materials and measures of effectiveness. It is also unknown which format and materials are optimal for training individuals to detect MEs.

These studies revealed that individuals can be trained to recognize laboratory produced MEs fairly quickly and effectively; however, retention has only been examined in one study and only at 2 weeks (Matsumoto and Hwang, in press). Although training with the METT can improve individuals’ recognition in as little as a few hours, how long the training effect outlasts the post-test is unknown. Skill decay is an important variable to examine, as many military or government employees may only be able to receive ME training once a year or once in a career span.

Another factor to consider is that understanding emotional expressions is a skill that may improve with practice. People who have repeated exposure to individuals who try to conceal their emotions or who scrutinize nonverbal behavior for their jobs (such as law enforcement officers, judges, clinical psychologists, and secret service personnel) are often more accurate judges of how others are feeling (Ekman and O’Sullivan 1991; Ekman et al. 1999). Studies that have repeatedly tested the same participants have found they improved without training (Matsumoto et al. 2000). This suggests that repeated exposure to the task or stimuli may serve as a training function as well and should be examined.

Motivation can also influence a person’s ability to learn material. Even though micro expression training may improve MERA for all individuals, those who are more motivated may learn and retain more material. Motivation to learn is positively related to skill acquisition (Colquitt et al. 2000), making it an important area for investigation. It is important to examine individuals’ motivation to learn
both at the start and completion of each testing phase, as motivation may be affected by external factors such as the quality or content of the training or assignment to the training or control group. Any differences must be controlled for to ensure that any gains made post-training can be properly attributed to the training.

Overall, the previously published studies raise questions regarding the optimal method of training, the role of exposure and motivating factors, and the persistence of training effects over time. It is important to examine these boundary factors that may reduce skill loss so that researchers can identify more effective training techniques. The METT is an ideal instructional tool for testing these differences. This training can be self-administered or administered by an instructor in a group setting and provides enough stimulus materials to examine skill retention. This will allow us to assess these factors in an existing and well-used training.

Based on the above literature review, which found significant improvements in MERA with different iterations of the METT training (Matsumoto and Hwang, in press; Russell et al. 2006, 2008), the following set of specific hypotheses is proposed:

H1 ME Training will significantly improve participants’ MERA and result in greater skill retention, as opposed to the control conditions, which will experience no change in

Although training by feedback alone has significantly improved expression recognition skills (Elfenbein 2006), ME recognition is an advanced skill which requires understanding of subtle differences among expressions. Thus,

H2 An instructor-led, multi-faceted ME training condition will produce the greatest increases in MERA, as opposed to ME training conditions that are self-led, or only provide feedback to participants.

Any increased exposure to training material should also provide an advantage to the exposed group. Thus,

H3 Reinforcement will significantly improve retention of MERA.

Previous studies have assumed that a comparison group assigned to do nothing during the training time serves as an adequate control for examining training effects. Factors such as mere exposure to stimuli or motivation to learn could affect ME post-test scores or moderate effectiveness of training. Thus, three control groups will also be examined to answer the following research question:

RQ1 What is the effect of motivation and simple exposure on MERA?

Method

Participants

Three hundred thirty-four (334) participants were recruited from large introductory communication courses. An in-class announcement advertised the study as “an evaluation of students’ nonverbal communication skills” and interested students signed up for three 1-h appointments through an online sign-up system. Participants who completed the study received 3 h of research credit in partial fulfillment of their 5 h departmental requirement.

Design

The study employed a 3 (training type: instructor feedback; instructor feedback plus description; or self-led) by 3 (reinforcement: none; at time 2 only; or at time 3 only) fixed factor design with three control groups (traditional control; control with additional exposure of items; or control with a motivating lecture). The four times at which participants’ accuracy at judging MEs was assessed (pre-training, immediately after training, 3 weeks later, and 6 weeks later) were treated as a within-subject independent variable. The dependent variable was the participants’ accuracy on the various ME tests. Participants were randomly assigned to each condition.

Conditions

Participants in the control conditions received no training to serve as comparison groups to the training manipulations. Participants in the “traditional” control condition occupied themselves for the length of the manipulation and were not exposed to any other emotional expression items. Participants in the “exposure” control condition were exposed to the same stimulus items (photographs of facial expressions) as the training conditions during the manipulation period, but received no feedback or other information to facilitate their judgment. Participants in the “motivating lecture” control condition were provided with a lecture on the importance of accurately perceiving and interpreting human emotion based on the work of Ekman (2001, 2003), but were not exposed to any other facial expression material.

Training techniques previously published were combined to allow for a fair comparison and evaluation of these different training methods. Participants in both the “feedback only” training and “full instruction” training conditions received the METT training led by an instructor highly knowledgeable in the area of facial expressions of emotion (the author). The difference was that the feedback only training manipulation consisted solely of feedback
regarding the MEs of emotion (available in the practice section of the METT), whereas in the full instruction training manipulation the instructor also discussed subtle differences among expressions (according to the “training” and “review” sections of METT) and answered questions raised by participants. Participants in the “self-led” training condition also received training via the METT. These participants led themselves through the training, feedback, and review sections of the METT on a personal computer (monitored by the instructor). The self-led training group was exposed to the same materials as the full instruction training group, except the instructor was not allowed to answer questions or discuss subtle differences, to mirror a true self-led training environment. The length of time was standardized (25 min) for all six manipulations (both training and control).

Reinforcement was manipulated by randomly assigning the training participants to either receive or not receive re-training at their second and third appointments. Refreshers were identical in format to participants’ original training conditions (i.e., feedback only, full instruction, or self-led), although the instruction time was reduced to 15 min. Trained participants were randomly assigned into one of three Refresher conditions: approximately one-third received no refresher training, one-third received refresher training at time 2, and one-third received refresher training at time 3.

Stimulus materials

The second version of the METT was used for the testing and training of MERA. The laboratory produced METT expression items involve full-face flash displays that show a subject’s neutral expression, a quick expression flash (1/15th of a second), and then a return to the subject’s neutral face. The METT training is divided into five sections: (1) a 14-item pre-test (anger, contempt, disgust, fear, happiness, sadness, and surprise, each shown twice), (2) a training section in which each of the universal expressions is introduced and described, (3) a 42-item practice section, (4) a review section, and (5) a 28-item post-test (the same seven emotions shown four times). Elements of this training program were manipulated to form the stimulus materials used to assess MERA and also functioned as the training materials in the training manipulations. To enable three post-training assessment periods, a pilot test was conducted to evaluate the difficulty of the expression items so they could be grouped into equivalent post-tests. (The third post-test was also used to assess MERA at the pre-test period.) The 42 ME items taken from the pre-test and post-test sections of the METT were shown separately to 12 communication undergraduates, who judged each of these items at the speed of 1/15th second. These 42 items were then divided into three sets to create three MERA post-tests, each having two examples of each emotion. Paired samples t-tests revealed no significant differences in test difficulty among the three tests. The mean difficulty for each of these tests based on the pilot data was 0.63 (post-test 1), 0.66 (post-test 2), and 0.62 (post-test 3).

Procedure

Time 1

This study was conducted over an 8-week period and was approved by the University’s Institutional Review Board. Participants were scheduled in small groups for hour-long sessions at three points in the semester. Participants were randomly assigned to one of the six conditions and each condition was run separately. One instructor (the author) led all sessions. After arrival, participants completed an informed consent document and then completed a demographic questionnaire and personality indexes. Then the instructor provided an overview of the experiment and explained the ME test procedure to the group. The format and procedure of each ME test were identical. At this point in the experiment the pre-test was administered according to the procedure described below.

Before each test, participants were asked to indicate their confidence in their ability to perform well, as well as their motivation to correctly identify the ME items. Confidence was measured using a 1 (Very poor) to 7 (Very good) rating to the question: How well do you think you will do at recognizing the upcoming facial expressions of emotion? Motivation was assessed using a 1 (Not Motivated) to 7 (Very Motivated) rating response to the question: How motivated are you to recognize people’s emotional expressions?

Next, participants viewed the fourteen-item ME test. Each item was projected on a blank wall in the research room at the speed of 1/15th of a second. Participants were given approximately 10 s to judge each expression by circling the appropriate response on the provided answer sheet (choices included anger, contempt, disgust, fear, happiness, sadness, surprise, and none of the above). After all items had been judged, participants indicated their confidence in their judgments. Post-confidence was measured using a 1 (Very poor) to 7 (Very good) rating to the question: How well do you think you did at recognizing these facial expressions of emotion?

After the pre-test was completed, the next 25 min served as the manipulation period for the experiment. Control participants received no training, and training participants received ME training in one of three styles described previously. After the manipulation, all participants completed the fourteen-item ME post-test (1) according to the
procedure described above. After the post-test, participants were reminded of their next research appointment and dismissed from the research space.

Time 2

Exactly 3 weeks after the first session, participants returned to the research space. At this time participants in the training conditions were randomly assigned as a group to one of the three Reinforcement conditions: none, refresher at time 2, or refresher at time 3. Participants assigned to a refresher at time 2 received 15 min of training based on their original training condition. After the manipulation, participants completed the fourteen-item ME post-test (2) according to the procedure described previously. After all participants completed the post-test, they were reminded of their next research appointment and dismissed from the research space.

Time 3

Exactly 6 weeks after the original training, participants returned to the research space. Participants assigned to a refresher at time 3 received 15 min of training based on their original training condition. After the manipulation, participants completed the fourteen-item ME post-test (3) according to the procedure described previously. After the post-test, all participants completed a questionnaire exploring how this study had impacted their lives. Last, participants were debriefed regarding the purpose of the study, provided research credit, and dismissed from the research space.

A total of 334 students participated at Time 1, with a 92% completion rate (N = 306). Analyses were conducted to determine if there were any differences in the demographic makeup (age, gender, and ethnicity) of the 306 final subject sample and the 28 participants who did not complete the study. These analyses revealed no significant demographic differences between the group who completed the study and the group that dropped out. Henceforth, only the final sample (N = 306) is discussed.

The participants were 174 female (57%) and 132 male (43%) undergraduates with an average age of 20.13 (SD = 3.08) years. Participants were mostly Caucasian (70.9%), but there were also participants who identified themselves as Asian or Pacific Islander (11.1%), African or Caribbean (8.8%), Hispanic (6.2%), Middle Eastern (1.6%), or another ethnic background (1.4%). Participants were mostly sophomores (38.2%) and juniors (32.7%), although some seniors (15.0%) and freshmen (13.1%) also participated (1.0% did not list class year).

In this study, participants were asked to rate how motivated they were on a one-item scale (1 = Not Motivated to 7 = Very Motivated). Independent samples t tests were conducted to examine motivation differences between untrained participants and trained participants. One significant difference was uncovered for the pre-test, t(304) = -2.133, p = .034, d = -.24, suggesting that trained participants (M = 5.51, SD = 1.07) were more motivated to succeed than the controls (M = 5.23, SD = 1.04) before the manipulation. At this point in the experiment participants had not received any information regarding the training manipulation, so the cause of the greater motivation level is unknown. There were no significant differences in motivation between controls (M = 5.18, SD = 1.06) and training (M = 5.40, SD = 1.19) participants after the manipulation was introduced. A one-way ANOVA was conducted to examine change in motivation at Time 1. No significant differences were uncovered, suggesting that assignment to a training group did not significantly increase motivation to perform well in this paradigm.

Pearson correlations were computed to examine the relationship between motivation and accuracy. Motivation was not significantly related to accuracy at any test for control participants. For trained participants, motivation was significantly positively related to accuracy at post-test 1, r(212) = .191, p = .005, and post-test 3, r(212) = .157, p = .021, revealing that trained participants who were more motivated to succeed were more accurate on these tests. Since motivation was not significantly related to accuracy at the pre-test (the only test in which groups differed in motivation), it was dropped as a potential covariate in ensuing analyses.

Confidence

In the current study confidence in judgment was measured on a one-item scale (1 = Very poor to 7 = Very good) both before and after each ME test. Pearson correlations were computed to examine the relationship between confidence and accuracy for both trained and control participants. All but two relationships were significant (Table 1). The only negative relationship occurred for trained participants at the pre-test; all other relationships were positive. This suggests that people’s perceptions regarding their
Table 1 Relationship between confidence and accuracy

Condition Confidence Accuracy
Pre-test Post-test 1 Post-test 2 Post-test 3

Control Pre- -.010 .170 .223* .260*

Post- .404*** .408*** .337** .598***
Training Pre- -1.77** .246*** .215** .155*
Post- .388*** .535*** .316*** .254***
* p \ .05; ** p \ .01; *** p \ .001

MERA were not so different from their objective ability after MEs had been defined.

Training effects

H1 predicted a significant main effect for training, such that trained participants would improve in accuracy post-manipulation at Time 1 and retain this improvement, whereas controls would experience no change in accuracy. A mixed model ANOVA was conducted to examine the differences in accuracy across time within each of the six conditions. Mauchly’s test indicated that the assumption of sphericity had been violated, χ²(5) = 14.849, p = .011; therefore, degrees of freedom were corrected using Huynh–Feldt estimates of sphericity (ε = .994). There was a significant main effect for time, F(2.983, 894.954) = 104.967, p < .001, η² = .259. Pairwise comparisons uncovered that accuracy was significantly different (p < .001) for all tests except between post-test 2 and post-test 3, showing that accuracy improved from pre-test to post-test 1 and post-test 2.

There was also a significant main effect for condition, F(5, 300) = 4.994, p < .001, η² = .077. Pairwise comparisons indicated that the full instruction training condition was significantly more accurate than the traditional control and motivating lecture control conditions, but was not significantly different from the control group with exposure, or feedback only training, or self-led training.

A significant interaction was revealed for time by condition, F(14.916, 894.954) = 5.421, p < .001, η² = .083. To further explore this interaction, one-way ANOVAs were conducted at each test (pre-test, post-test 1, post-test 2, and post-test 3) to examine between-subject differences. There were no significant differences at the pre-test, revealing that all groups began at approximately the same skill level. A significant difference was revealed at post-test 1, F(5, 300) = 7.561, p < .001. Bonferroni post hoc tests revealed that the three control conditions were significantly less accurate than the two instructor-led training conditions. There were no significant differences between the control conditions and the self-led training condition. There was a significant main effect at post-test 2, F(5, 300) = 3.388, p = .005. Bonferroni post hoc tests revealed that the full instruction training participants were significantly more accurate than the traditional control participants. At post-test 3 there was a significant main effect for condition, F(5, 300) = 8.328, p < .001. Bonferroni post hoc tests revealed that full instruction training participants were significantly more accurate than the traditional control, motivating lecture control, feedback only training, and self-led training participants. There was no significant difference between the full instruction training condition and the exposure control condition at post-test 3.

Table 2 Within subjects comparisons for accuracy from test to test

Condition            Pre-test versus    Post-test 1 versus   Post-test 2 versus
                     post-test 1, t     post-test 2, t       post-test 3, t
Traditional          .560               3.477**              -2.065*
Exposure             -.205              2.554*               1.745
Motivating lecture   2.143*             3.266**              -1.929
Feedback only        7.346***           2.908**              -1.586
Full instruction     8.757***           3.106**              1.430
Self-led             5.635***           6.289***             -3.898***
* p < .05; ** p < .01; *** p < .001

To further explore the within-subjects differences, paired samples t-tests were conducted for each of the six conditions to examine accuracy change over time. A total of 3 comparisons (pre-test vs. post-test 1, post-test 1 vs. post-test 2, and post-test 2 vs. post-test 3) were conducted for each condition. The significant differences are outlined in Table 2. Between the pre-test and post-test at Time 1, all three training conditions significantly increased in accuracy (feedback only: +14.19%; full instruction: +19.52%; and self-led: +10.66%), and two of the control conditions experienced no significant increase (traditional: +1.43%; and exposure: -0.76%), revealing support for H1. Surprisingly, one of the control conditions (motivating
lecture: +6.15%) also significantly increased from the pre-test to post-test 1, t(28) = 2.143, p = .041, d = -.40. Between post-test 1 and post-test 2, all of the conditions significantly improved in accuracy, suggesting a possible exposure or practice effect to the stimuli, or that the material shown in post-test 2 was easier than the other tests, although pilot testing suggested that all three tests were equivalent in difficulty. Between post-test 2 and post-test 3, both the traditional control condition (-5.95%) and the self-led training condition (-6.91%) significantly decreased.

H1 was partially supported, as the combined training participants outperformed control participants and, more specifically, the full instruction training participants outperformed most controls on all tests. However, the motivating lecture control group also significantly improved after the manipulation, suggesting an effect for the motivating lecture. Additionally, all control groups significantly improved from post-test 1 to post-test 2. This pattern of results is illustrated in Fig. 1.

Fig. 1 Accuracy across time and conditions

Type of training

H2 predicted that the full instruction training would result in greater improved accuracy compared to feedback only and self-led trainings. Independent samples t tests (one-tailed) were conducted to examine the differences in improvement from the pre-test to post-test 1 for the three training types. Tests revealed that the full instruction condition (+19.52%) improved significantly more than both the self-led condition (+10.66%), t(138) = 3.021, p < .005, d = .51, and the feedback only condition (+14.19%), t(143) = 1.811, p < .05, d = .30, supporting H2. There were no significant differences between the accuracy change of the feedback only and self-led conditions.

The role of refreshers

H3 predicted that refreshers would aid in retention of training material. A mixed model ANOVA showed a significant main effect for time, F(2, 410) = 23.338, p < .001, η² = .102 (Table 3). Pairwise comparisons revealed that accuracy was significantly higher for post-tests 2 and 3 compared to post-test 1 (p < .001). There was also a significant main effect for condition, F(2, 205) = 4.017, p = .019, η² = .038. Pairwise comparisons indicated that the full instruction condition was significantly more accurate than the self-led condition. There was no main effect for refresher type.

Table 3 Within subjects differences for ME accuracy over time

Training type      Refresher   N    Accuracy
                                    Post-test 1 (%)   Post-test 2 (%)   Post-test 3 (%)
Feedback only      None        27   74.60             78.70             78.57
                   At time 2   22   70.45             80.30             72.40
                   At time 3   25   83.43             86.67             85.14
Full instruction   None        24   81.55             85.07             91.96
                   At time 2   25   78.00             87.33             89.43
                   At time 3   22   78.25             84.47             88.31
Self-led           None        25   72.29             85.33             74.57
                   At time 2   20   71.79             80.83             76.43
                   At time 3   24   72.32             88.89             83.93

There was a significant interaction for time by condition, F(4, 410) = 4.300, p = .002, η² = .040. This interaction was previously explored and reported, and revealed that the full instruction condition improved significantly more than the self-led and feedback only conditions. The two-way interactions for time by refresher and condition by refresher, and the three-way interaction for time by refresher by condition, were not significant.

Paired samples t tests (post-test 1 vs. post-test 2, post-test 2 vs. post-test 3, and post-test 1 vs. post-test 3) were conducted to evaluate MERA retention for each of the three refresher manipulations within the three training groups. In the feedback only condition, the time 2 refresher group significantly improved in accuracy after the refresher, t(21) = 3.346, p = .003, d = .71 (post-test 1 to post-test 2). T tests also revealed that accuracy significantly decreased from post-test 2 to post-test 3, t(21) = -3.186, p = .004, d = -.68. No other significant differences,
including improvements for the time 3 refresher group, were uncovered.

In the full instruction condition, the time 2 refresher group significantly increased in accuracy from post-test 1 to post-test 2, t(24) = 2.402, p = .024, d = .48, and post-test 1 to post-test 3, t(24) = 3.578, p = .002, d = .72. For the time 3 refresher group, paired samples t tests revealed that accuracy significantly increased from post-test 1 to post-test 3, t(21) = 3.241, p = .004, d = .69. No other significant differences were uncovered.

In the self-led condition, the no refresher group significantly increased in accuracy from post-test 1 to post-test 2, t(24) = 3.496, p = .002, d = .70. Accuracy at post-test 3 was significantly less than at post-test 2, t(24) = -4.064, p < .001, d = -.81. This was the only non-refresher group that performed significantly differently on one of the post-tests. For the time 2 refresher group, paired samples t tests revealed success for the refresher in significantly increasing accuracy from post-test 1 to post-test 2, t(19) = 2.323, p = .031, d = .52. For the time 3 refresher group, accuracy significantly increased from post-test 1 to post-test 2, t(23) = 5.175, p < .001, d = 1.06, and from post-test 1 to post-test 3, t(23) = 3.646, p = .011, d = .74, revealing an increase prior to the refresher. No other significant differences were uncovered.

These inconsistent results do not support H3. The time 2 refresher groups did experience significant increases in accuracy after their refresher, but this result did not outlast the time period. Actually, all groups increased at Time 2 (although not all changes were significant), suggesting that the increases may have been caused not by the refresher but rather by some condition of the ME test.

Discussion

This study provides the first data comparing different methods of training micro expressions, the effects of motivation and exposure on recognition, and skill retention over three points in time. As predicted, the training was successful. At the pre-test, there were no significant differences in accuracy based on condition, but after the manipulation trained participants performed better on post-tests 1, 2 and 3 (76, 84, and 82% respectively) than controls (64, 75, and 73% respectively). Further, having an expert present to guide participants through the subtle differences among these expressions and answer questions was an advantage over the other tested training methods. The full instruction training condition continually provided consistent results: it was the only training condition that was significantly more accurate than one or more control conditions on all three post-tests.

These results revealed that the best method for using the METT in a short session was to fully explore all sections of the program including the training and review, have a knowledgeable instructor describe subtle differences between the expressions, and practice identifying the ME items and provide feedback to trainees. This suggests that feedback paired with additional training techniques may produce a more effective training manipulation. Although the METT has been designed as a self-instructional tool to train emotion recognition, this method of training was considerably less effective compared to an instructor-led training.

Two explanations for this finding may be that the full instruction training provided both more material than the self-led training (instructor answered questions) as well as enthusiasm for the topic. This study does not definitively show which factor provided significant benefits. However, the surprising finding that the motivating lecture control group also significantly increased after the manipulation (+6%) suggests that the instructor’s enthusiasm may have provided motivation to concentrate or attend closer to the post-test. Further research should examine both the role of content and instructor in training ME skills, as there may be some ideal combination of content and charisma that produces the greatest effects.

In this experiment, the same instructor was used for all trials in an attempt to keep the presentation consistent and eliminate the possibility of attention or motivation biases caused by the instructor’s appearance or presentation style. However, this is also a limitation of the study since the instructor was aware of the hypotheses and experimental design, which could have unintentionally affected portrayals within the experiment.

Research examining confidence of judgment has generally not found a relationship between one’s confidence in judgment and the accuracy of that judgment (DePaulo et al. 1997; Patterson et al. 2001). The current study revealed a clear relationship between confidence and MERA: before individuals were introduced to the concept of MEs, their confidence and accuracy were not calibrated, but after individuals had seen MEs, they became calibrated such that accurate judges were more confident and less accurate judges were less confident. Previous studies examining emotion recognition have not found this strong a link. In this study, participants’ confidence in interpreting the MEs was significantly positively correlated with their accuracy in the post-testing period, even though they had never received feedback on their performance. This may suggest a very parsimonious means to determine whether trainees understand the material: the trainer merely has to ask. The novelty of this finding suggests that this relationship should be verified in subsequent research. Of particular importance
would be to replicate this finding with naturally occurring MEs, which may be more difficult to spot.

In this study a possible repeated exposure effect was found, as untrained individuals improved without training, most markedly in the repeated exposure control. If practice, or exposure, improves performance, this suggests additional training time would be beneficial. One limitation of the current study was the limited time spent training (25 min) and refreshing (15 min) ME skills. Perhaps this is the reason why the reinforcement sessions were ineffective. These data suggest that exposure is an important element for learning, and future studies should explore increased exposure and training time manipulations.

A limitation of the current study was that all conditions performed significantly better on post-test 2 (as compared to the pre-test and post-test 1), which suggests that the stimuli utilized in post-test 2 may have been easier than the other tests. Pilot tests were conducted to ensure that the ME items were divided into equally difficult post-tests, and while not significant, the pattern revealed by the means suggests that post-test 2 was slightly easier. Although the availability of a subject pool was prohibitive in this sense, future research should counterbalance the order of these tests to ensure that accuracy is due to the manipulation, not the ease of any particular test.

Another limitation was the nature of the MEs used to both test and train recognition. These MEs were full-face but very quick expressions of emotion that were embedded within a poser’s neutral expression. Research has shown that spontaneous expressions are more difficult to interpret than posed expressions, as often naturally occurring expressions blend with other emotions or expressions (Naab and Russell 2007). Naturally occurring MEs may not

these MEs, their average post-training accuracy (38%) was much lower than ME accuracy for posed photographs seen both in their study (78%) and this current study (81% across all post-tests). The posed faces selected for training with the METT are not representative of all of the facial expressions encountered daily, and therefore future research is required to determine whether these are the best materials for training spontaneous expression recognition.

There is a need to correctly recognize and interpret a person’s true feelings in any number of interpersonal, health, business, legal, and social contexts. The detection of concealed or masked emotions is invaluable in law enforcement and national security settings, medical contexts and the corporate world, where better understanding of our suspects, patients, or partners can allow us to make more informed decisions about a person’s true feelings and intent. In any context, the ability to recognize emotional displays can make us more effective perceivers of others, which can enhance the quality of our interpersonal relationships and reduce the potential for misunderstanding.

This study was among the first to evaluate the specific features and use of the METT, a facial expression training program currently in use in security and health contexts. These findings validate use of the METT for improving MERA, suggest the training persists at 6 weeks, and further provide the optimal way to deploy that training. Previous work suggests that this type of training will translate to real-time spontaneously expressed MEs, but this particular tool requires further testing to conclusively demonstrate its utility across the wide variety of situations seen in daily life.
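The pooled training accuracies quoted in the Discussion (76, 84, and 82% on post-tests 1 to 3, and 81% across all post-tests) can be recovered from Table 3 as N-weighted means of the nine training-by-refresher cells. The sketch below is only a consistency check, not the author's procedure: the names `TABLE_3` and `pooled_means` are illustrative, and treating the overall figure as the simple mean of the three pooled post-test values is an assumption.

```python
# Rows copied from Table 3:
# (training type, refresher, N, post-test 1 %, post-test 2 %, post-test 3 %)
TABLE_3 = [
    ("Feedback only", "None", 27, 74.60, 78.70, 78.57),
    ("Feedback only", "At time 2", 22, 70.45, 80.30, 72.40),
    ("Feedback only", "At time 3", 25, 83.43, 86.67, 85.14),
    ("Full instruction", "None", 24, 81.55, 85.07, 91.96),
    ("Full instruction", "At time 2", 25, 78.00, 87.33, 89.43),
    ("Full instruction", "At time 3", 22, 78.25, 84.47, 88.31),
    ("Self-led", "None", 25, 72.29, 85.33, 74.57),
    ("Self-led", "At time 2", 20, 71.79, 80.83, 76.43),
    ("Self-led", "At time 3", 24, 72.32, 88.89, 83.93),
]


def pooled_means(rows):
    """Return the N-weighted mean accuracy for each of the three post-tests."""
    total = sum(row[2] for row in rows)
    return [sum(row[2] * row[3 + i] for row in rows) / total for i in range(3)]


total_n = sum(row[2] for row in TABLE_3)
pooled = pooled_means(TABLE_3)
overall = sum(pooled) / len(pooled)  # assumption: unweighted mean across post-tests

print(total_n, [round(m, 2) for m in pooled], round(overall, 2))
```

The cell sizes sum to 214, which agrees with the error degrees of freedom reported for the refresher ANOVA (214 participants minus 9 cells gives 205), and the pooled means round to the 76, 84, 82, and 81% figures given in the text.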
