

What Personality Scales Measure: A New Psychometrics and Its Implications for Theory and Assessment

Current Directions in Psychological Science, 1–6
© The Author(s) 2019
Article reuse guidelines: sagepub.com/journals-permissions
DOI: 10.1177/0963721419849559
www.psychologicalscience.org/CDPS

Robert R. McCrae¹ and René Mõttus²,³

¹Gloucester, Massachusetts; ²Department of Psychology, University of Edinburgh; ³Institute of Psychology, University of Tartu

Corresponding Author: Robert R. McCrae, 90 Magnolia Ave., Gloucester, MA 01930. E-mail: rrmccrae@gmail.com

Abstract
Classical psychometrics held that scores on a personality measure were determined by the trait assessed and random measurement error. A new view proposes a much richer and more complex model that includes trait variance at multiple levels of a hierarchy of traits and systematic biases shaped by the implicit personality theory of the respondent. The model has implications for the optimal length and content of scales and for the use of scales intended to correct for evaluative bias; further, it suggests that personality assessments should supplement self-reports with informant ratings. The model also has implications for the very nature of personality traits.

Keywords
true score, method bias, implicit personality theory, five-factor model, reliability, informant ratings

A typical personality scale consists of several items—statements or single adjectives. The respondent rates the extent to which each item describes a target—either oneself (in self-reports) or someone whom the respondent knows well, such as a spouse, sibling, or close friend (in informant reports). The sum of these item ratings is the target's score on the scale, and ideally it is directly proportional to the amount of the trait the target actually possesses. We refer to the veridical portion of the observed score as the true score.

Neither test developers nor respondents are perfect, so the scale score only approximates the true score. Items can be ambiguous or misread, and numerical ratings may be somewhat arbitrary; all of these introduce random error into the scale score. This is why people seldom receive exactly the same score when the test is readministered even a short time later. Retest reliability is a measure of how close retest scores are to the original scores.

In classical psychometrics (e.g., Lord & Novick, 1968), the term "true score" referred to the hypothetical average of an indefinitely large number of administrations of a scale to a respondent; the observed score on any one occasion was thus seen as a constant (i.e., true) score plus or minus some quantity of random error. But in fact the constant component may include systematic errors or method biases that do not accurately reflect the trait and thus do not contribute to the true score in our sense of the term.

Although there are many different kinds of method bias, by far the most attention has been given to evaluative biases (variously conceived as social desirability, impression management, defensiveness, faking good, or simply lying): the enduring tendency of the respondent to exaggerate good qualities and minimize bad qualities. People vary in this tendency; some actually have negative scores and even exaggerate their weaknesses. If assessors could measure evaluative bias, they could subtract it out of the observed score and better approximate the true score. Enormous efforts have been made to do just that, but with curiously limited success (Piedmont, McCrae, Riemann, & Angleitner, 2000; Roth & Altmann, 2019).

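In symbols (our shorthand for the contrast just described, not notation taken from the article), the two views differ by a single stable bias term:

    X = T + E        (classical test theory: true score plus random error)
    X = T + M + E    (true score, stable method bias, random error)

Because M is constant across administrations of the same instrument to the same respondent, it inflates the apparent "true" portion of the score in the classical account.
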

Psychologists have primarily relied on self-reports to assess personality. With only one source of information, a model of personality scores consisting of true score, random error, and evaluative bias may seem plausible. But a body of research has also examined personality ratings from well-acquainted informants and shown that, like self-reports, these are stable, heritable, and related to a variety of life outcomes. One would be tempted to say that self-reports and informant ratings are interchangeable, except that they show far less than perfect agreement. For example, Costa and McCrae (1988) reported a 6-year longitudinal study of three broad traits—Neuroticism, Extraversion, and Openness—using both self-reports and spouse ratings. Both assessments showed substantial stability, with retest correlations of about .80. But agreement between self-reports and spouse ratings was much lower, only about .55. Clearly, there must be something in self-reports that is stable over time but not shared by spouse ratings, and vice versa. We argue that this fact allows us to build a better model of personality scores.

One might guess that the traits being rated are actually somewhat different: Perhaps the self-report, which is based in part on private thoughts, feelings, and desires, reflects the "inner" personality, whereas the informant rating reflects the "outer." If that were true, we would expect that two different informants would agree more strongly with each other than they agree with the target's self-report, because both informants see the outer personality. But that is not the case; informants agree with each other only about as well as they do with self-reports (e.g., Costa & McCrae, 1992).

It appears that each respondent has a view of the target shaped in part by the trait but also in part by systematic, enduring, and idiosyncratic biases—source-method biases—that consistently over- or underestimate trait levels. By examining properties of traits, we can tease apart the magnitude of these biases. For example, in the Costa and McCrae (1988) study, about 80% of the variance was stable over time, but only 55% was shared by different sources, suggesting that 25% (80% − 55%) is stable source-method bias. The discovery of the presence, structure, and magnitude of source-method biases has ushered in a new view of what scales measure, with profound implications for how personality should be assessed.¹

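That reasoning translates into a back-of-envelope calculation. The percentages below are the approximate values quoted above and, following the article's own approximation, the correlations are treated directly as proportions of variance; the snippet is only an illustration:

    # Rough partition of self-report variance implied by the Costa and McCrae (1988) figures.
    stable = 0.80        # proportion of variance stable over 6 years (retest r of about .80)
    cross_source = 0.55  # proportion shared with spouse ratings (agreement of about .55)

    source_method_bias = stable - cross_source  # stable but source-specific: 0.25
    unstable = 1.0 - stable                     # random or transient:        0.20

    print(f"shared trait (true score):  {cross_source:.2f}")
    print(f"stable source-method bias:  {source_method_bias:.2f}")
    print(f"unstable remainder:         {unstable:.2f}")
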
Nuances: A New Source of True Score

Most psychologists know the adage that the reliability of a scale sets an upper limit to its validity—but there are different forms of reliability. McCrae, Kurtz, Yamagata, and Terracciano (2011) compared internal-consistency reliability (the extent to which all items in a scale assess the same thing) with retest reliability as predictors of the longitudinal stability, heritability, and cross-observer validity of the 30 facets of the Revised NEO Personality Inventory (NEO-PI-R; McCrae & Costa, 2010; see Table 1). More reliable facets showed higher validity by all three criteria—but only for retest reliability. Further, retest reliabilities almost invariably exceeded internal-consistency reliabilities,² suggesting that scales contain something reliable beyond their items' shared variance.

In an attempt to understand these findings, McCrae (2015) analyzed different components of variance in facet scales and items, distinguishing between true score, method bias, and random error. It became clear that there were different kinds of true-score variance: Some was shared by all of the items in a scale, and some was specific to each item. For example, the NEO angry-hostility facet has one item that concerns feelings of bitterness and resentment and another about being hot blooded and quick tempered. Both are expressions of the same trait, but they are also subtly different in content; each represents a different nuance of angry hostility. Figure 1 illustrates the components of variance in a typical facet scale and shows how they contribute to various properties of the trait. This model explains why retest reliability is a better predictor of validity than internal consistency: It contains all the true-score variance (T + S1–k), whereas internal consistency contains only the common portion (T).

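A small simulation makes the pattern concrete. The sketch below is ours, and the weights given to each variance component are arbitrary illustrative values: items share a common trait (T), each item carries its own stable nuance (S), each rating source adds its own stable method bias (M), and every administration adds fresh random error.

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 5_000, 8                      # targets and items per facet scale
    bT, bS, bM, sE = 0.6, 0.5, 0.4, 0.6  # weights: trait, nuance, method bias, error

    T = rng.normal(size=(n, 1))          # common facet trait, shared by all items
    S = rng.normal(size=(n, k))          # stable item-specific nuances (also true score)
    M_self = rng.normal(size=(n, 1))     # stable bias of the self-report source
    M_obs = rng.normal(size=(n, 1))      # stable bias of the observer source

    def administer(method_bias):
        """One administration: stable components plus fresh random error."""
        return bT * T + bS * S + bM * method_bias + sE * rng.normal(size=(n, k))

    self_t1 = administer(M_self)
    self_t2 = administer(M_self)         # short-term retest by the same source
    observer = administer(M_obs)

    def alpha(items):
        """Cronbach's alpha for an n-by-k matrix of item scores."""
        k_ = items.shape[1]
        return k_ / (k_ - 1) * (1 - items.var(axis=0, ddof=1).sum()
                                / items.sum(axis=1).var(ddof=1))

    def agree(a, b):
        return np.corrcoef(a.sum(axis=1), b.sum(axis=1))[0, 1]

    print(f"internal consistency (alpha): {alpha(self_t1):.2f}")
    print(f"retest reliability:           {agree(self_t1, self_t2):.2f}")
    print(f"cross-observer agreement:     {agree(self_t1, observer):.2f}")

Under these assumptions the sum score's retest reliability exceeds coefficient alpha, because the stable nuance variance is counted by the former but not the latter, and cross-observer agreement is lowest because only the trait and nuance components, not the source's method bias, are shared between raters. This is the pattern described above.
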
It has been known for decades that personality traits form a hierarchy: The five broad domains at the top of the NEO-PI-R hierarchy are composed of narrower and more specific facets (see Table 1). McCrae (2015) hypothesized that there is at least one more, lower level: nuances that express facets. But are they actually traits? If so, like higher-level traits, they should be stable over time and heritable, and different observers should agree on them. Mõttus, Kandler, Bleidorn, Riemann, and McCrae (2017) tested these hypotheses in a German sample of twins on whom self-report and informant-rating data were available from two assessments 5 years apart. Not surprisingly, most of the items were stable, heritable, and consensually valid because they are expressions of higher-order traits that are known to have those properties. But Mõttus and colleagues also analyzed items' nuance-specific variance after statistically removing all of the higher-order variance, and a majority of these item residuals still showed the characteristic properties of traits. These findings were later replicated in samples from other countries (Mõttus et al., 2018). Nuances are real—if quite narrow—traits, and people appear to have hundreds of them.

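The residualization step in that kind of analysis can be sketched as follows. This is our generic illustration with simulated, hypothetical data, not the authors' code: the higher-order (facet) trait is partialled out of each item by regression, and the residuals are then checked for retest stability.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 2_000
    facet = rng.normal(size=n)            # higher-order facet trait
    nuance = rng.normal(size=(n, 2))      # stable item-specific nuances

    def occasion():
        """One measurement occasion for two items of the facet."""
        return 0.7 * facet[:, None] + 0.5 * nuance + 0.5 * rng.normal(size=(n, 2))

    items_t1, items_t2 = occasion(), occasion()

    def residualize(items, covariate):
        """Remove higher-order variance from each item by ordinary least squares."""
        X = np.column_stack([np.ones(len(covariate)), covariate])
        beta, *_ = np.linalg.lstsq(X, items, rcond=None)
        return items - X @ beta

    res_t1, res_t2 = residualize(items_t1, facet), residualize(items_t2, facet)

    # If nuances are real traits, the residuals should remain stable over time.
    for j in range(res_t1.shape[1]):
        r = np.corrcoef(res_t1[:, j], res_t2[:, j])[0, 1]
        print(f"item {j}: retest correlation of nuance-specific residual = {r:.2f}")
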
Table 1. Domains, Definitions, and Facets of the NEO Inventories

Neuroticism
Definition: The tendency to experience many forms of emotional distress and have unrealistic ideas and troublesome urges; individuals low in Neuroticism are emotionally stable, do not get upset easily, and are not prone to depression.
Facets: Anxiety, angry hostility, depression, self-consciousness, impulsiveness, vulnerability

Extraversion
Definition: The tendency to prefer intense and frequent interpersonal interactions and to be energized and optimistic; individuals low in Extraversion are reserved and tend to prefer a few close friends to large groups of people.
Facets: Warmth, gregariousness, assertiveness, activity, excitement seeking, positive emotions

Openness to experience
Definition: The tendency to seek out new experience and have a fluid style of thought; individuals low in Openness are traditional, conservative, and prefer familiarity to novelty.
Facets: Fantasy, aesthetics, feelings, actions, ideas, values

Agreeableness
Definition: The tendency to regard other people with sympathy and act unselfishly; individuals low in Agreeableness are not concerned with other people and tend to be antagonistic and hostile.
Facets: Trust, straightforwardness, altruism, compliance, modesty, tendermindedness

Conscientiousness
Definition: The tendency to control one's behavior in the service of one's goals; individuals low in Conscientiousness have a hard time keeping to a schedule, are disorganized, and can be unreliable.
Facets: Competence, order, dutifulness, achievement striving, self-discipline, deliberation

Note: Adapted from McCrae and Sutin (2007).

Hierarchy in Source-Method Biases

McCrae (2015) focused on components of variance within items and facets. A later examination of relations between facets and the broader domains led to further insights about source-method biases (McCrae, 2018a).

Although a global evaluative bias might affect all traits, there are also biases specific to domains and to facets. If for some reason—say, a faulty first impression—a rater overestimates a target's Conscientiousness, the rater is likely to overestimate the target's standing on all of its facets: competence, order, dutifulness, and so on (see Table 1). The same rater might systematically underestimate the target's true level of Agreeableness and all of its facets. Similar biases exist at the facet level, affecting all items in a facet scale.

McCrae, Mõttus, Hřebíčková, Realo, and Allik (2018) tested these ideas in Estonian and Czech samples. They subtracted informant ratings from self-ratings, essentially removing the true-score component and leaving only biases and random error. When facet residuals (self-ratings − informant ratings) were factored, they showed a structure that mimicked the structure of the traits seen in Table 1. When item residuals were factored, they reproduced the structure of facets within each domain. These results supported the hypothesis that biases have a hierarchical structure similar to that of true scores (McCrae, 2018a).

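The logic of that analysis can be expressed in a few lines. This sketch is ours and assumes hypothetical self-report and informant facet-score matrices (targets by 30 facets, ordered by domain); it is not the authors' code, and the factor-analysis routine is just one convenient choice.

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    def bias_factors(self_scores: np.ndarray, informant_scores: np.ndarray,
                     n_factors: int = 5) -> np.ndarray:
        """Factor the self-minus-informant facet residuals. The shared true score
        largely cancels in the difference, so the factors describe rater biases."""
        residuals = self_scores - informant_scores
        fa = FactorAnalysis(n_components=n_factors, random_state=0)
        fa.fit(residuals)
        return fa.components_  # loadings: n_factors x 30

    # If biases are hierarchical, each factor should load mainly on the six
    # facets of a single domain, mimicking the structure shown in Table 1.
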
The analysis of facet residuals might have shown just a single factor of evaluative bias, or two or three method factors, or no meaningful structure at all. Why should biases have the same structure as real traits? Probably because people at some level understand how traits go together; they know, for example, that if people are sociable, they are also likely to be cheerful and active—they are extraverts. That people have such an implicit personality theory has been known for some time. The five-factor structure of facet-level traits emerges from ratings of strangers, in which there can be very little true-score component (Passini & Norman, 1966);³ such a structure must be attributed to the mind of the rater rather than characteristics of the target. McCrae and colleagues (2018) showed that people also have an implicit personality theory for nuance-level traits.

McCrae's (2018a) model suggested that about 17% of the variance in the average facet scale was attributable to method bias at the domain level (influencing all facets in a domain), and another 22% was method bias at the facet level (influencing all items in a facet). McCrae and colleagues (2018) calculated the average domain-level bias to be 18% in the Estonian sample and 17% in the Czech sample, almost exactly in line with predictions. These consistent findings give grounds for some confidence in McCrae's (2018a) model and lead to the sobering conclusion that facet-level variance contains almost as much source-method bias (39%)⁴ as it does the combined true score of facets and nuances (42%; the rest is random error).

Although most of the analyses discussed here have used the NEO inventories, there is every reason to expect similar findings with other hierarchical instruments, such as Lee and Ashton's (2004) six-factor HEXACO model or Krueger, Derringer, Markon, Watson, and Skodol's (2012) Personality Inventory for DSM-5 (Ashton, de Vries, & Lee, 2017; McCrae, 2018a).

Figure 1. Illustration of the variance attributable to random error (ε), method (M), common trait (T), and item specifics (S1–k) in facet-scale scores from an observer (O) and from a self-report test (S) and short-term retest. Dashed lines indicate distinct contributions from different items. Internal consistency (α) is attributable to method and trait variance shared by items. Cross-observer agreement (rca) and retest reliability (rtt) reflect shared components of variance across observers and occasions, respectively. The bottom half of each column represents the true score; the top half represents random and systematic error. This is a simplified model; for a more detailed model, see McCrae (2018a). Adapted with permission from "A more nuanced view of reliability: Specificity in the trait hierarchy," by R. R. McCrae, 2015, Personality and Social Psychology Review, 19, p. 100. doi:10.1177/1088868314541857. Copyright 2015 by the Society for Personality and Social Psychology, Inc.

Theoretical Implications

It now appears that the variance in personality scales is a mixture of true-score variance specific to domains, facets, and nuances; source-method biases that mimic the hierarchical structure of the true score; and random error. Moreover, we can now quantify these components. This new psychometrics has both theoretical and practical implications.

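Written out (our paraphrase of the verbal model, with symbols of our own choosing), the observed score on an item i belonging to facet f within domain d is roughly

    X_i = T_d + T_f + T_i + M_g + M_d + M_f + E_i

where the T terms are true-score variance at the domain, facet, and nuance levels; M_g, M_d, and M_f are a given rater's global (evaluative), domain-level, and facet-level source-method biases; and E_i is random error.
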
A century ago, most psychologists thought the human mind was a blank slate, with personality traits shaped by enculturation, childhood experiences, and learning. We have known for the past 50 years that traits are in part heritable; now we know that this applies not only to broad temperament factors but also to specific traits, even at the level of unique nuances. One person may be prone to express anger through loud outbursts, whereas another is prone to seethe with resentment, and this nuanced difference is almost as heritable as any other trait. Personality theories need to account for this.

Personality traits are important because they predict a host of outcomes, from occupational interests to happiness. Today most researchers use broad domains as predictors, although some have advocated the use of more informative facets (Paunonen & Ashton, 2001). A program of research is needed to determine whether nuances provide yet more predictive power; initial evidence is that they do (Seeboth & Mõttus, 2018).

The hierarchical model of trait structure has implications for how traits are conceptualized. Is Extraversion the sum (or, in the language of set theory, the union) of its component facets and nuances, or is it only the essential core all of its facets and nuances have in common (their intersection)? Union traits include both the variance shared by all components and the variance unique to each, whereas intersection traits exclude the unique variance; this can make a substantial difference. Personality-scale scores are usually the sum of items and thus implicitly follow the union model, but statistical-modeling approaches such as factor analysis and item-response theory almost always assume an intersection model. New statistical methods are needed to analyze union traits (Bentler, 2017).

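The difference between the two conceptions is easy to see in a toy example. In this sketch (ours, with simulated data), the union-style score is the plain sum of items and keeps their unique variance, whereas the intersection-style score is a single common-factor score that discards it; the two track each other closely but not perfectly.

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(2)
    n, k = 3_000, 12
    common = rng.normal(size=(n, 1))                  # shared core of the trait
    unique = rng.normal(size=(n, k))                  # item- and nuance-specific parts
    items = 0.6 * common + 0.6 * unique + 0.4 * rng.normal(size=(n, k))

    union_score = items.sum(axis=1)                   # sum score keeps unique variance
    fa = FactorAnalysis(n_components=1, random_state=0).fit(items)
    intersection_score = fa.transform(items).ravel()  # factor score keeps the common part

    # Absolute value because the sign of a factor is arbitrary; high, but below 1.
    print(abs(np.corrcoef(union_score, intersection_score)[0, 1]))
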
Nuances are natural units for, and lend credibility to, network and functionalist approaches that explore associations among numerous personality components to understand why they coalesce into broader traits such as Extraversion in the first place (Mõttus & Allerhand, 2018; Wood, Gardner, & Harms, 2015).

Implications for Research and Assessment

Retest reliability appears to be more informative than internal consistency and must be used to correct estimates of long-term stability (Costa, McCrae, & Löckenhoff, 2019). But researchers usually report only internal consistency because they can calculate it from a single administration and need not reassemble their sample a few days later to complete the scale again. Wood, Qiu, Lu, Lin, and Tov (2018) have proposed that retest reliability can also be assessed by a retest administered within the same session. If so, reporting retest reliability should become routine. Note that retest reliability can also be calculated for single-item nuances, whereas internal consistency cannot.⁵

To save respondents time, researchers sometimes measure broad traits with very short scales of only one or two items. In addition to including true-score variance for the broad trait of interest, such miniscales likely contain as much or more nuance-specific variance. If the miniscales predict some outcome, one cannot determine whether it is because of the trait variance, the nuance variance, or both. Miniscales should be used to measure broad traits only as a last resort and with the acknowledgement that conclusions drawn about their correlates are tentative and must be confirmed using the full-length scale. (Miniscales with narrow content are suitable for measuring nuances.)

However, very long scales may be ill-advised as well. If scales contained only true score and random error, lengthening the scale would consistently improve its validity; the errors would cancel out, leaving relatively more true score (Lord & Novick, 1968). But scales also contain source-method bias, which does not cancel out with increased scale length. The ratio of true score to method bias sets an upper limit to the validity of scales completed by a single source.

Very similar items overrepresent some nuances and thus bias the overall scale. Scale developers could maximize information and efficiency by including a single item for each distinct nuance that can be identified for a trait. Attention should therefore be paid to individual items' psychometric properties (e.g., retest reliability or sufficient variance).

Efforts to measure evaluative bias and thus improve estimates of the true score have had limited success, in part because these measures almost invariably also contain true-score variance; in fact, correction can sometimes reduce scale validity (Piedmont et al., 2000). But even if successful, eliminating evaluative bias would still leave domain- and facet-level biases. Single-source assessments of personality are inherently limited. Perhaps the only practical way to improve assessment is by combining information from two or more sources. Researchers can increase the precision of their trait estimates by using multiple raters. Clinicians (e.g., Piedmont, 1998; Singer, 2005) who gather both self-reports and informant ratings of their clients—often from a spouse or partner—find a comparison of personality profiles especially revealing: Major disagreements on a trait may point to problems in the client's self-concept or to sources of conflict in the relationship. Although the use of multiple sources complicates personality assessment, it also enriches it and should become standard practice.

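A minimal sketch of that kind of profile comparison, assuming hypothetical standardized T-scores (mean 50, SD 10) from a self-report and a partner rating; the pooling rule and the flagging threshold are arbitrary choices of ours:

    # Hypothetical domain-level T-scores for one client from two sources.
    self_report = {"N": 62, "E": 45, "O": 58, "A": 40, "C": 47}
    informant   = {"N": 55, "E": 44, "O": 57, "A": 52, "C": 48}

    pooled = {}
    for domain in self_report:
        diff = self_report[domain] - informant[domain]
        pooled[domain] = (self_report[domain] + informant[domain]) / 2  # simple average
        if abs(diff) >= 10:  # one standard deviation; arbitrary cutoff
            print(f"{domain}: sources disagree by {diff:+d} T-score points")
    print("pooled profile:", pooled)
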

Recommended Reading

McCrae, R. R. (2015). (See References). Introduces nuances and the distinction between traits as unions and intersections.
McCrae, R. R. (2018a). (See References). Proposes that method biases are hierarchical; the online supplemental material (McCrae, 2018b) provides a tutorial on the (fairly elementary) statistical methods used to develop models of variance components.
McCrae, R. R., Kurtz, J. E., Yamagata, S., & Terracciano, A. (2011). (See References). Reports the puzzling superiority of retest reliability as a predictor of validity, a finding that stimulated later articles.
McCrae, R. R., Mõttus, R., Hřebíčková, M., Realo, A., & Allik, J. (2018). (See References). Presents empirical evidence using Estonian and Czech data that method biases are hierarchical and structured by implicit personality theory.
Mõttus, R., Kandler, C., Bleidorn, W., Riemann, R., & McCrae, R. R. (2017). (See References). Presents empirical evidence using a longitudinal, multiobserver German twin study that nuances are real traits.

Action Editor

Randall W. Engle served as action editor for this article.

ORCID iD

Robert R. McCrae: https://orcid.org/0000-0003-1325-1485

Declaration of Conflicting Interests

R. R. McCrae receives royalties from the NEO inventories. The author(s) declared that there were no other potential conflicts of interest with respect to the authorship or the publication of this article.

Notes

1. There are sources of error in personality-scale scores that are not considered in our model. One source is response sets (or transient errors; Schmidt, Le, & Ilies, 2003), which are systematic biases on one occasion but variable across occasions. Another is what Schmidt and colleagues call specific-factor errors, such as a respondent's consistent misunderstanding of a word, which are random errors on one administration but systematic errors across administrations. Empirical data suggest that our model works reasonably well, at least for the NEO inventories, as these are normally used. Research is needed to determine whether more fully elaborated models would work better with other instruments or circumstances.
2. This was not true in a study of the Big Five Inventory–2 (Soto & John, 2017), perhaps because it used an 8-week retest interval in a student sample, during which true personality changes might reduce test–retest correlations.
3. However, as little acquaintance with strangers as viewing their photograph can lead to better-than-chance accuracy for some traits (Borkenau & Liebler, 1992).
4. Facet scales have a higher percentage of method bias than the domain scales they contribute to because the bias specific to each facet tends to cancel out when facet scales are summed.
5. One-week retest data for the items of the NEO-PI-R are available in "Item retest correlations" at the following link: https://osf.io/cjwtb/ (Henry & Mõttus, 2019).

References

Ashton, M. C., de Vries, R. E., & Lee, K. (2017). Trait variance and response style variance in the scales of the Personality Inventory for DSM-5 (PID-5). Journal of Personality Assessment, 99, 192–203.
Bentler, P. M. (2017). Specificity-enhanced reliability coefficients. Psychological Methods, 22, 527–540.
Borkenau, P., & Liebler, A. (1992). Trait inferences: Sources of validity at zero acquaintance. Journal of Personality and Social Psychology, 62, 645–657.
Costa, P. T., Jr., & McCrae, R. R. (1988). Personality in adulthood: A six-year longitudinal study of self-reports and spouse ratings on the NEO Personality Inventory. Journal of Personality and Social Psychology, 54, 853–863.
Costa, P. T., Jr., & McCrae, R. R. (1992). Trait psychology comes of age. In T. B. Sonderegger (Ed.), Nebraska Symposium on Motivation: Vol. 39. Psychology and aging (pp. 169–204). Lincoln: University of Nebraska Press.
Costa, P. T., Jr., McCrae, R. R., & Löckenhoff, C. E. (2019). Personality across the life span. Annual Review of Psychology, 70, 423–448.
Henry, S., & Mõttus, R. (2019). Item retest correlations [Data file]. Retrieved from https://osf.io/cjwtb/
Krueger, R. F., Derringer, J., Markon, K. E., Watson, D., & Skodol, A. E. (2012). Initial construction of a maladaptive personality trait model and inventory for DSM-5. Psychological Medicine, 42, 1879–1890.
Lee, K., & Ashton, M. C. (2004). Psychometric properties of the HEXACO Personality Inventory. Multivariate Behavioral Research, 39, 329–358.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
McCrae, R. R. (2015). A more nuanced view of reliability: Specificity in the trait hierarchy. Personality and Social Psychology Review, 19, 97–112.
McCrae, R. R. (2018a). Method biases in single-source personality assessments. Psychological Assessment, 30, 1160–1173.
McCrae, R. R. (2018b). Method biases in single-source personality assessments [Supplemental material]. Retrieved from http://supp.apa.org/psycarticles/supplemental/pas0000566/pas0000566_supp.html
McCrae, R. R., & Costa, P. T., Jr. (2010). NEO inventories professional manual. Odessa, FL: Psychological Assessment Resources.
McCrae, R. R., Kurtz, J. E., Yamagata, S., & Terracciano, A. (2011). Internal consistency, retest reliability, and their implications for personality scale validity. Personality and Social Psychology Review, 15, 28–50.
McCrae, R. R., Mõttus, R., Hřebíčková, M., Realo, A., & Allik, J. (2018). Source method biases as implicit personality theory at the domain and facet levels. Journal of Personality. Advance online publication. doi:10.1111/jopy.12435
McCrae, R. R., & Sutin, A. R. (2007). New frontiers for the five-factor model: A preview of the literature. Social and Personality Psychology Compass, 1, 423–440. doi:10.1111/j.1751-9004.2007.00021.x
Mõttus, R., & Allerhand, M. H. (2018). Why do traits come together? The underlying trait and network approaches. In V. Zeigler-Hill & T. K. Shackelford (Eds.), The SAGE handbook of personality and individual differences: Vol. 1. The science of personality and individual differences (pp. 130–151). Thousand Oaks, CA: SAGE.
Mõttus, R., Kandler, C., Bleidorn, W., Riemann, R., & McCrae, R. R. (2017). Personality traits below facets: The consensual validity, longitudinal stability, heritability, and utility of personality nuances. Journal of Personality and Social Psychology, 112, 474–490.
Mõttus, R., Sinick, J., Terracciano, A., Hřebíčková, M., Kandler, C., Ando, J., . . . Jang, K. L. (2018). Personality characteristics below facets: A replication and meta-analysis of cross-rater agreement, rank-order stability, heritability, and utility of personality nuances. Journal of Personality and Social Psychology. Advance online publication. doi:10.1037/pspp0000202
Passini, F. T., & Norman, W. T. (1966). A universal conception of personality structure? Journal of Personality and Social Psychology, 4, 44–49.
Paunonen, S. V., & Ashton, M. C. (2001). Big Five factors and facets and the prediction of behavior. Journal of Personality and Social Psychology, 81, 524–539.
Piedmont, R. L. (1998). The Revised NEO Personality Inventory: Clinical and research applications. New York, NY: Plenum Press.
Piedmont, R. L., McCrae, R. R., Riemann, R., & Angleitner, A. (2000). On the invalidity of validity scales: Evidence from self-reports and observer ratings in volunteer samples. Journal of Personality and Social Psychology, 78, 582–593.
Roth, M., & Altmann, T. (2019). A multi-informant study of the influence of targets' and perceivers' social desirability on self-other agreement in ratings of the HEXACO personality dimensions. Journal of Research in Personality, 78, 138–147.
Schmidt, F. L., Le, H., & Ilies, R. (2003). Beyond alpha: An empirical examination of the effects of different sources of measurement error on reliability estimates for measures of individual differences constructs. Psychological Methods, 8, 206–224.
Seeboth, A., & Mõttus, R. (2018). Successful explanations start with accurate descriptions: Questionnaire items as personality markers for more accurate predictions. European Journal of Personality, 32, 186–201.
Singer, J. A. (2005). Personality and psychotherapy: Treating the whole person. New York, NY: Guilford Press.
Soto, C. J., & John, O. P. (2017). The next Big Five Inventory (BFI-2): Developing and assessing a hierarchical model with 15 facets to enhance bandwidth, fidelity, and predictive power. Journal of Personality and Social Psychology, 113, 117–143.
Wood, D., Gardner, M. H., & Harms, P. D. (2015). How functionalist and process approaches can explain trait covariation. Psychological Review, 122, 84–111.
Wood, D., Qiu, L., Lu, J., Lin, H., & Tov, W. (2018). Adjusting bilingual ratings by retest reliability improves estimation of translation quality. Journal of Cross-Cultural Psychology, 49, 1325–1339.
