
Journal of Experimental Child Psychology 109 (2011) 232–238

Contents lists available at ScienceDirect

Journal of Experimental Child Psychology
journal homepage: www.elsevier.com/locate/jecp

An empirical examination of sex differences in scoring preschool children’s aggression
Anthony D. Pellegrini a,⇑, Catherine M. Bohn-Gettler b, Danielle Dupuis a,1,
Meghan Hickey a,1, Cary Roseth c,1, David Solberg a,1
a Department of Educational Psychology, University of Minnesota, Minneapolis, MN 55455, USA
b Department of Counseling, Educational and School Psychology, Wichita State University, Wichita, KS 67260, USA
c Department of Educational Psychology, Michigan State University, East Lansing, MI 48824, USA

Article info

Article history:
Received 27 September 2010
Revised 8 November 2010
Available online 13 December 2010

Keywords:
Sex stereotype
Sex differences
Aggression
Peer interaction
Observer bias
Rater bias

Abstract

Sex differences in adults’ observations and ratings of children’s aggression were studied in a sample
of preschool children (N = 89, mean age = 44.00 months, SD = 8.48). When examining the direct
observations made by trained observers, male observers, relative to female observers, more frequently
recorded aggressive bouts, especially of boys. On rating scales assessing aggression, trained male
raters also gave higher aggression ratings than female raters. Lastly, we compared the ratings of
trained female raters and female teachers on the same scale and found no differences. Results are
discussed in terms of male raters’ and observers’ prior experiences in activating their experiential
schemata, where males’ greater experience in aggression, relative to that of females, leads them to
perceive greater levels of aggression.
© 2010 Elsevier Inc. All rights reserved.

Introduction

Sex differences in aggression during early childhood are considered serious risk factors for subsequent
behavioral problems such as persistent antisocial behavior and dropping out of school (Dodge,
Coie, & Lynam, 2006). From this view, early identification of aggression should be an important part of
any attempted remediation. Identification of children’s aggression, however, is not a simple matter.
For example, there are issues associated with obtaining a valid sample of aggressive behavior as well
as problems with observers’ reliability. With regard to valid sampling of behavior, many researchers
recommend direct observations of behavior by trained observers because they are said to be the “gold
standard” in assessment (e.g., Kagan, 1998).

⇑ Corresponding author.
E-mail address: pelle013@umn.edu (A.D. Pellegrini).
1 The order of authorship, after Bohn-Gettler, is alphabetical.

0022-0965/$ - see front matter © 2010 Elsevier Inc. All rights reserved.
doi:10.1016/j.jecp.2010.11.003
What is not often alluded to in these recommendations (but see Condry & Ross, 1985; Gurwitz &
Dodge, 1975; Lyons & Serbin, 1986; Marsh & Hanlon, 2004; Ostrov, Crick, & Keating, 2005; Susser &
Keating, 1990) is the sex difference (see footnote 2) between observers of sex role stereotypical
behavior such as aggression. The issue of sex differences in observers’ scoring of aggression has
recently been pointed out in the comparative literature. For example, a study of aggression in
nonhuman animals (i.e., salamanders) found that male observers, relative to female observers,
systematically recorded higher rates of specific types of aggression (Marsh & Hanlon, 2004). To
exacerbate this problem, many observational studies in the developmental psychological literature
either do not even specify the sex of the observers (e.g., Jacklin & Maccoby, 1978; Pellegrini et al.,
2007) or, when the sex of the observers is specified, tend to use only female observers (e.g., Martin &
Fabes, 2001; Serbin, Moller, Gulko, Powlishta, & Colburne, 1994).
In the first objective of the current study, sex differences between observers recording preschool
boys’ and girls’ aggression were examined. We predicted that both male and female observers would
record higher frequencies of aggression for boys than for girls, possibly because of observers’ extant
sex schemata that boys are more aggressive than girls (Lyons & Serbin, 1986). That is, instruction to
observers stressing objectivity may actually prime them to see boys as more aggressive than girls given
their extant beliefs. Furthermore, male observers typically have greater experience in aggression
than do females, so they should record more aggression in direct observations (Condry & Ross, 1985).
With that said, it has also been widely recognized that aggression in schools occurs very infre-
quently, thereby making direct observations of aggression too expensive to be practical (e.g., Caspi,
1998; Pellegrini & Bartini, 2000). Consequently, rating scales completed by classroom teachers are of-
ten used instead of observations, with the rationale being that teachers spend extended time with
children and these experiences form a valid base from which to rate children. Similar to the findings
in the direct observation literature, however, there is evidence of sex differences between raters of
children’s aggression. For example, male raters, relative to female raters, give boys higher scores on
antisocial and aggressive items (Davidson, MacGregor, MacLean, McDermott, & Farquharson, 1996;
Sideridis, Antoniou, & Padeliadu, 2008).
Next, we examined differences between raters of children’s aggression on a standardized rating
scale. To do this, male and female trained observers, in addition to the children’s female teachers, pro-
vided ratings of children’s aggression on the same instrument, a variant of the Teacher Checklist
(Dodge & Coie, 1987). Specifically, in the second objective, the ratings of trained male and female re-
search associates who had spent an entire school year observing the same children were compared.
Following the same logic specified in the first objective, we hypothesized a sex of rater difference
on the scoring of boys’ and girls’ aggression. In earlier research using the same rating scale, research-
ers’ ratings of children, relative to those of teachers, were better predictors of risk status because the
researchers were more rigorously trained in identifying dimensions of aggressive behavior (Pellegrini
& Bartini, 2000).
In the final objective, we compared female preschool teachers’ and female research associates’
ratings of children’s aggressive behaviors. The difference between the two sets of raters was primarily
in terms of training on the meaning of aggressive behavior. Specifically, as part of the researchers’
training as observers, they received extensive and continuous training and monitoring on the criteria
for aggressive behaviors. Furthermore, the researchers observed the children across an entire school
year, thereby minimizing context differences that can bias ratings (Achenbach, McConaughy, & Howell,
1987; Lorenz, Melby, Conger, & Xu, 2007). Teachers, although highly competent in their knowledge
of child development and early education, were not explicitly and repeatedly trained on the specific
construct being rated. For this third objective, we hypothesized that teachers, relative to female
research associates, would rate boys as more aggressive.

2 We use the term sex, rather than gender, throughout this article for the following reasons. Although
the literature (e.g., Maccoby, 1998) recommends that sex be used to refer to “biological” differences
and gender be used to describe differences associated with socialization, we suggest that biology
versus socialization is a false dichotomy akin to nature versus nurture. Furthermore, “biological sex”
of offspring is affected by “social” processes, following the Trivers–Willard hypothesis (Trivers &
Willard, 1973). We used sex to simply describe differences between males and females as identified by
their parents (if children) or by themselves (if adults).

Method

Participants

This study was conducted at a university laboratory preschool located on the campus of a large
midwestern university across a full school year from September through May. A total of 89 children
(44 girls and 45 boys), ranging in age from 29 to 59 months, participated in the study (mean
age = 44.00 months, SD = 8.48). Although three children left the school at mid-year, eight children
joined the school during the year, and one child switched from one classroom to another at mid-year,
these 12 children were included in the study because their inclusion did not change the results.
The research associates who observed and rated children (three males and four females) were PhD
students in educational psychology and were not aware that sex differences in their coding would be
examined. All five preschool teachers were females; four had a master’s degree in early childhood edu-
cation, one had a bachelor’s degree in sociology, and all had state certification in the field.

Observational procedures

Preschoolers were observed across an entire school year by seven researchers after a 4-week training
regimen that involved videotape viewing and discussions followed by live recording, discussion,
and reliability checks. Data collection began once training yielded suitable levels of reliability
(aggregated across all observers, κ ≥ 0.80). Reliability and retraining sessions were held on
alternating months across the entire year, resulting in reliability data for 60 simultaneous
observations (5% of all observations); the average cross-observer kappa value was 0.88.
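
As an illustration of the reliability statistic mentioned above, the following is a minimal sketch of a two-observer kappa computation. It assumes Cohen's kappa on categorical codes for simultaneously observed events; the article does not state which kappa variant was used or how values were averaged across observer pairs, and the function name and sample codes are hypothetical.

```python
from collections import Counter


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two observers' categorical codes of the same events."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need two equal-length, non-empty code sequences")
    n = len(codes_a)
    # Proportion of events on which the two observers gave the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each observer's marginal code frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used one identical code throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for ten simultaneously observed events (not data from the study).
observer_1 = ["agg", "none", "none", "agg", "none", "none", "none", "agg", "none", "none"]
observer_2 = ["agg", "none", "none", "agg", "none", "none", "none", "none", "none", "none"]
kappa = cohen_kappa(observer_1, observer_2)
```

With more than two observers, one common choice is to compute kappa for each pair and average the results, which may be what "average cross-observer kappa" refers to here.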
Observations were conducted daily during fall and spring semesters. Observations were entered di-
rectly onto laptops using an Excel spreadsheet. All observations were conducted in children’s class-
rooms, in the gymnasium, and on the playground. Aggression (physical and verbal) was event
sampled with continuous recording rules (Pellegrini, 2004); when observers saw an aggressive bout,
they recorded associated behaviors for the duration of the aggressive bout (i.e., all behaviors listed as
aggressive) and for 5 min after it concluded. Physical aggression was defined as behaviors used to con-
trol resources, including hitting, kicking, chasing, pushing–pulling, snatching (i.e., grabbing or taking
an object), and displacing (e.g., cutting in line, taking a seat or spot in a circle); rough-and-tumble play
was not coded as aggression. Verbal aggression was defined as yelling at or threatening another child.
We aggregated the two categories because they occurred with relatively low frequency. We did not
include relational aggression because the level of occurrence was too low to be meaningful (but see
Ostrov, 2006). The unit of analysis was the frequency with which each child was observed engaging
in aggression divided by the total number of times the child was observed. Because proportion data
are often non-normal, the arcsine transformation was used.
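
The unit of analysis and transformation described above can be sketched as follows. This assumes the common arcsine square-root form, asin(sqrt(p)); the article says only that "the arcsine transformation was used", so the exact variant is an assumption, and the function names and counts are hypothetical.

```python
import math


def aggression_proportion(aggressive_count, total_observations):
    """Proportion of a child's observations in which aggression was recorded."""
    if total_observations <= 0:
        raise ValueError("total_observations must be positive")
    if not 0 <= aggressive_count <= total_observations:
        raise ValueError("aggressive_count must lie between 0 and total_observations")
    return aggressive_count / total_observations


def arcsine_sqrt(p):
    """Arcsine square-root transform of a proportion in [0, 1], in radians."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return math.asin(math.sqrt(p))


# Hypothetical child: aggression recorded in 3 of 40 observations.
p = aggression_proportion(3, 40)
transformed = arcsine_sqrt(p)
```

The transform stretches values near 0 and 1, which is why it is often applied to low-frequency proportions such as these before an ANOVA.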

Teacher Checklist ratings

Children’s classroom teachers (all females) and research associates who also observed the children
completed an adapted form of the Teacher Checklist (Dodge & Coie, 1987) at the end of the spring
semester. This 7-point Likert-type scale is a valid measure of aggression with preschool children
and early adolescents (e.g., Pellegrini & Bartini, 2000; Pellegrini et al., 2007). The aggression factor
had five items (e.g., hits or shoves, says mean things, verbally threatens) (α = .74). All ratings were
transformed into standard scores, such that they were standardized within classrooms.
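
Standardizing within classrooms can be sketched as a z-score computed against each classroom's own mean and standard deviation. This sketch assumes the sample standard deviation; the article does not say whether sample or population values were used, and the data layout, function name, and scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def standardize_within_classroom(ratings):
    """Convert raw ratings to z-scores relative to each child's own classroom.

    ratings: list of (classroom, child_id, raw_score) tuples.
    Returns a dict mapping child_id to its within-classroom z-score.
    """
    by_room = defaultdict(list)
    for room, _child, score in ratings:
        by_room[room].append(score)
    # A classroom with a single rating has no spread, so its SD is treated as 0.
    stats = {
        room: (mean(scores), stdev(scores) if len(scores) > 1 else 0.0)
        for room, scores in by_room.items()
    }
    z_scores = {}
    for room, child, score in ratings:
        room_mean, room_sd = stats[room]
        z_scores[child] = (score - room_mean) / room_sd if room_sd > 0 else 0.0
    return z_scores


# Hypothetical raw aggression-factor scores (not data from the study).
raw = [
    ("room_A", "a1", 1.0), ("room_A", "a2", 2.0), ("room_A", "a3", 3.0),
    ("room_B", "b1", 4.0), ("room_B", "b2", 6.0),
]
z = standardize_within_classroom(raw)
```

Standardizing this way removes differences between classrooms in overall rating level, so a child's score reflects standing relative to classmates rated by the same teacher.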

Results

In the first objective, we hypothesized that male observers, relative to female observers, would
record higher rates of aggressive behavior. Descriptive statistics are displayed in Table 1. A 2 (sex
of observer) × 2 (sex of child) repeated measures analysis of variance (ANOVA) calculated for aggressive
behavior revealed a main effect for sex of observer, F(1, 83) = 8.13, p = 0.005, η² = 0.09, where male
observers, relative to female observers, recorded more aggressive bouts. There was also a main effect
for sex of child, F(1, 83) = 11.86, p < 0.001, η² = 0.13, where boys, relative to girls, were observed as
being more aggressive. Finally, there was a sex of child by sex of observer interaction,
F(1, 83) = 14.03, p < 0.001, η² = 0.15, such that male observers recorded boys as more aggressive than
girls, t(44) = 3.75, p < 0.001, d = 0.75; female observers’ scoring of boys’ and girls’ aggression was
not statistically different, t(39) = 1.17, p = 0.25, d = 0.21.

Table 1
Untransformed proportional means (and standard deviations) for initiated aggressive bouts.

Male coders                   Female coders
Boys           Girls          Boys           Girls
0.11 (0.02)    0.02 (0.10)    0.01 (0.01)    0.01 (0)

Note: standard deviations are in parentheses.
In the second objective, we hypothesized that male researchers, relative to female researchers,
would rate children higher on aggressive behaviors on the Teacher Checklist. The descriptive statistics
associated with this objective are displayed in the upper portion of Table 2. This hypothesis was tested
with a 2 (sex of child) × 2 (sex of rater) ANOVA, which indicated a main effect for sex of rater,
F(1, 13) = 48.42, p < 0.001, η² = 0.79; male researchers’ ratings (M = 2.12) were higher than female
researchers’ ratings (M = 1.89). Neither the main effect for sex of child, F(1, 13) = 2.16, p > 0.05,
η² = 0.14, nor the sex of rater by sex of child interaction, F(1, 13) = 1.54, p > 0.05, η² = 0.11, was
significant.
In the third objective, we hypothesized that female teachers would rate children’s aggressive
behavior higher than would female researchers. This hypothesis was tested with a 2 (sex of child) × 2
(rater: teacher or researcher) ANOVA on the aggression factor. There was not a significant main effect
for rater, F(1, 65) = 1.12, p > 0.05, η² = 0.02, for sex of child, F(1, 65) = 0.01, p > 0.05, η² < 0.001,
or for the rater by sex of child interaction, F(1, 65) = 1.57, p > 0.05, η² = 0.02. The descriptive
statistics for these analyses are displayed in the lower portion of Table 2.
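
For readers who want to relate the reported F ratios to the effect sizes, the sketch below recovers partial eta squared from an F value and its degrees of freedom. It assumes the reported effect sizes are partial eta squared, which the article does not state; under that assumption, F(1, 83) = 8.13 gives about 0.09 and F(1, 13) = 48.42 gives about 0.79. The function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios taken from the Results text above.
observer_effect = partial_eta_squared(8.13, 1, 83)   # sex of observer, first objective
rater_effect = partial_eta_squared(48.42, 1, 13)     # sex of rater, second objective
status_effect = partial_eta_squared(1.12, 1, 65)     # teacher vs. researcher, third objective
```

Because the inputs are already rounded to two decimals, recovered values can differ from published ones in the last digit.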

Discussion

The issue of sex differences associated with observers and raters of children’s aggressive behavior
has received surprisingly little recent attention in the developmental psychological literature. We say
surprisingly because for at least the past 35 years (e.g., Maccoby & Jacklin, 1974) researchers have at-
tended closely to sex differences in children and youths and the associated biases of researchers in the
field (e.g., Maccoby, 1998). Even in this light, many researchers fail to take such basic steps as speci-
fying the sex of the researchers conducting their observations of sex role stereotypical behavior (e.g.,
Jacklin & Maccoby, 1978; Pellegrini et al., 2007). Furthermore, when sex of the observers is specified, it
is often the case that all of the observers are female (e.g., Martin & Fabes, 2001; Serbin et al., 1994).

Table 2
Teacher Checklist means (and standard deviations) by sex and status of rater and sex of child.

                       Boys           Girls
Coder comparisons
  Male coders          2.02 (0.40)    2.22 (0.37)
  Female coders        1.93 (0.77)    1.84 (0.63)
Status comparisons
  Female coders        1.99 (0.79)    1.82 (1.02)
  Female teachers      1.72 (0.83)    1.85 (0.88)

Note: standard deviations are in parentheses.



The results from the current research show that this is problematic on a number of levels. First,
specification of the sex of observers or raters is as important as specification of the sex of participants,
a practice recommended by the American Psychological Association (2001). This level of explicitness is
necessary in any discipline that aspires to scientific objectivity and, correspondingly, to the scrutiny of
replication efforts.
Second, the sex of observers and raters seems to be especially important where judgments are
made about boys’ and girls’ sex role stereotypical behavior. We would expect male observers and
raters to “see” male stereotypical behavior (e.g., Condry & Ross, 1985; Lyons & Serbin, 1986; Marsh &
Hanlon, 2004). This hypothesis has been supported in the few areas where it has been tested such
as male and female observers’ coding of nonhuman animals’ aggression (e.g., Marsh & Hanlon,
2004) and males’ and females’ ratings of boys’ and girls’ hostile behavior (Davidson et al., 1996). So,
in cases where all female observers or raters are used, we would expect a systematic bias to
underestimate aggression.
In the first objective, and consistent with our hypothesis, the sex of observer difference was signif-
icant but in the context of an interaction with the sex of child such that male observers, relative to
female observers, recorded aggression more frequently in boys than in girls. The sex of observer dif-
ference may be explained in terms of the training itself. Instructions and training associated with the
direct observations stressed objectivity and tried to minimize bias and may have been responsible for
activating sex schemata where boys are represented as more aggressive than girls (Lyons & Serbin,
1986). Furthermore, the interaction between sex of observer and sex of child may have been the result
of male observers, relative to female observers, having more direct experiences in aggression. That is,
past experiences in having engaged in aggression may have primed male observers, especially, to
interpret boys’ behavior as more aggressive than girls’ behavior (Condry & Ross, 1985; Pellegrini,
2003). The robust sex differences in aggression in the literature (Archer, 2004; Hyde, 2005) certainly
support this claim.
In the second objective, we examined sex differences between trained researchers’ ratings of
children’s aggression. The results for this objective replicated those found for the direct observations
such that male raters, relative to female raters, assigned higher aggression scores. This pattern is also
consistent with the research examining male and female differences in rating individuals’ hostility,
where trained male raters are more likely than female raters to give high “hostility scores” to
participants and female raters are more likely than male raters to give more positive ratings (Davidson
et al., 1996), possibly as a result of male and female raters having different experiences with
aggression. That there was no sex of observer by sex of child interaction on the teacher rating scale as
there was for the direct observations may be due to the speed with which observers needed to make
judgments about children’s behavior relative to the time afforded while completing rating scales (Lyons
& Serbin, 1986).
In the final objective of this study, we addressed the issue of training and sex differences by com-
paring the ratings of two groups of female raters: highly trained research associates and preschool
teachers. Our analyses suggested that female teachers and female researchers viewed children simi-
larly, although the latter group had more training than the former group. These results, in concert with
those from the first set of objectives, point to the potent effect of the sex of observer when making
judgments about children’s sex stereotypical behavior.
The ubiquity of sex bias among highly trained researchers is not unique to the field of psychology.
Similar findings have been reported in research examining nonhuman animal behavior (e.g., Marsh &
Hanlon, 2004) and health psychology (e.g., Davidson et al., 1996) and across the wider social sciences
(e.g., Pierotti, Annett, & Hand, 1997). A number of eminent female ethologists (e.g., Gowaty, 1997;
Hrdy, 1981) have suggested that sex-biased observation may have compromised the degree to which
researchers recorded behavior that went against prevailing human sex roles and behavior. Specifically,
if students are trained in laboratories where there is a certain ethos (e.g., to either find or not find
certain types of sex differences), then, as they become socialized, they are likely to view the behavior
of their research participants as consistent with this ethos. Indeed, a similar pressure toward a
confirmatory bias is one reason why we have double-blind research procedures.
There are limitations associated with this study. First, a larger and more varied sample of observers
and raters would have added generalizability to the results. However, even with this limited statistical
power, statistically significant results were observed and there is probably a larger threat of type II
error than of type I error. Furthermore, our results did replicate a number of other findings involving
highly trained observers and raters across a variety of social and behavioral sciences, and this likely
attenuates this limitation.
Even in light of these limitations, this study adds substantively to the literature. This study
examined sex differences on a comparable construct, aggression, across different assessment formats.
Even direct observation by highly trained observers showed sex of observer differences. Although sex of
observer differences among highly trained researchers have been recognized in other fields, such as
behavioral biology, they have not been as widely recognized in educational and developmental psychology
(but see Condry & Ross, 1985; Lyons & Serbin, 1986; Ostrov et al., 2005; Susser & Keating, 1990), as
evidenced perhaps most directly when researchers do not even report the sex of their observers
(e.g., Jacklin & Maccoby, 1978).

References

Achenbach, T. M., McConaughy, S. H., & Howell, C. T. (1987). Child/adolescent behavioral and emotional problems: Implications
of cross-informant correlations for situational specificity. Psychological Bulletin, 101, 213–232.
American Psychological Association (2001). Publication manual of the American Psychological Association (5th ed.). Washington,
DC: American Psychological Association.
Archer, J. (2004). Sex differences in aggression in real-world settings: A meta-analytic review. Review of General Psychology, 8,
291–322.
Caspi, A. (1998). Personality development across the life course. In N. Eisenberg (Ed.), Handbook of child psychology: Vol. 3. Social,
emotional, and personality development (5th ed., pp. 311–388). New York: John Wiley.
Condry, J. C., & Ross, D. F. (1985). Sex and aggression: The influence of gender label on the perception of aggression in children.
Child Development, 56, 225–233.
Davidson, K., MacGregor, M. W., MacLean, D. R., McDermott, N., & Farquharson, J. (1996). Coder gender and potential hostility
ratings. Health Psychology, 15, 298–302.
Dodge, K. A., & Coie, J. D. (1987). Social information processing factors in reactive and proactive aggression in children’s peer
groups. Journal of Personality and Social Psychology, 53, 1146–1158.
Dodge, K. A., Coie, J. D., & Lynam, D. (2006). Aggression and antisocial behavior in youth. In N. Eisenberg (Ed.), Handbook of child
psychology: Vol. 3. Social, emotional, and personality development (6th ed., pp. 719–788). New York: John Wiley.
Gowaty, P. A. (1997). Darwinian feminists and feminist evolutionists. In P. A. Gowaty (Ed.), Feminism and evolutionary biology
(pp. 1–18). New York: Chapman and Hall.
Gurwitz, S. B., & Dodge, K. A. (1975). Adults’ evaluations of a child as a function of sex of adult and sex of child. Journal of
Personality and Social Psychology, 32, 822–828.
Hrdy, S. B. (1981). The woman that never evolved. Cambridge: Harvard University Press.
Hyde, J. S. (2005). The gender similarities hypothesis. American Psychologist, 60, 581–592.
Jacklin, C., & Maccoby, E. (1978). Social behavior at thirty-three months in same sex and mixed sex dyads. Child Development, 49,
557–569.
Kagan, J. (1998). Biology and the child. In N. Eisenberg (Ed.), Handbook of child psychology: Vol. 3. Social, emotional, and personality
development (5th ed., pp. 177–236). New York: John Wiley.
Lorenz, F. O., Melby, J. N., Conger, R. D., & Xu, X. (2007). The effects of context on the correspondence between observational
ratings and questionnaire reports of hostile behavior: A multi-trait, multi-method approach. Journal of Family Psychology, 21,
498–509.
Lyons, J. A., & Serbin, L. A. (1986). Observer bias in scoring boys’ and girls’ aggression. Sex Roles, 14, 301–313.
Maccoby, E. E. (1998). The two sexes: Growing up apart, coming together. Cambridge, MA: Harvard University Press.
Maccoby, E., & Jacklin, C. (1974). The psychology of sex differences. Stanford, CA: Stanford University Press.
Marsh, D. M., & Hanlon, T. J. (2004). Observer gender and observation bias in animal behavior research: Experimental tests with
red-backed salamanders. Animal Behaviour, 68, 1425–1433.
Martin, C. L., & Fabes, R. A. (2001). The stability and consequences of young children’s same-sex peer interactions. Developmental
Psychology, 37, 431–446.
Ostrov, J. M. (2006). Deception and subtypes of aggression during early childhood. Journal of Experimental Child Psychology, 93,
322–336.
Ostrov, J. M., Crick, N. R., & Keating, C. F. (2005). Gender biased perceptions of preschoolers’ behavior: How much is aggression
and prosocial behavior in the eye of the beholder? Sex Roles, 52, 393–398.
Pellegrini, A. D. (2003). Perceptions and functions of play and real fighting in early adolescence. Child Development, 74,
1522–1533.
Pellegrini, A. D. (2004). Observing children in their natural worlds: A methodological primer (2nd ed.). Mahwah, NJ: Lawrence
Erlbaum.
Pellegrini, A. D., & Bartini, M. (2000). An empirical comparison of methods of sampling aggression and victimization in school
settings. Journal of Educational Psychology, 92, 360–366.
Pellegrini, A. D., Roseth, C., Mliner, S., Bohn, C., Van Ryzin, M., Vance, N., et al. (2007). Social dominance in preschool classrooms.
Journal of Comparative Psychology, 121, 54–64.
Pierotti, R., Annett, C. A., & Hand, J. L. (1997). Male and female perceptions of pair-bond dynamics: Monogamy in Western gulls,
Larus occidentalis. In P. A. Gowaty (Ed.), Feminism and evolutionary biology (pp. 261–275). New York: Chapman and Hall.

Serbin, L. A., Moller, L. C., Gulko, J., Powlishta, K. K., & Colburne, K. A. (1994). The emergence of gender segregation in toddler
playgroups. In C. Leaper (Ed.), Childhood gender segregation: Causes and consequences (pp. 7–17). San Francisco: Jossey–Bass.
Sideridis, D. S., Antoniou, F., & Padeliadu, S. (2008). Teacher biases in the identification of learning disabilities: An application of
a logistic multilevel model. Learning Disability Quarterly, 31, 199–209.
Susser, S. A., & Keating, C. F. (1990). Adult sex role orientation and perceptions of aggressive interactions between girls and boys.
Sex Roles, 23, 147–155.
Trivers, R. L., & Willard, D. E. (1973). Natural selection of parental ability to vary the sex ratio of offspring. Science, 179, 90–92.
