
Abstract and supplemental tables are available here:

David A. Stringham and Thomas R. Graham
James Madison University
Purpose and Research Questions
The purpose of this study was to examine ratings of Virginia high school wind and percussion ensembles from events sponsored by the Virginia Band and Orchestra Directors Association (VBODA) in 2009, 2011, 2012, 2013, and 2014.


Research questions were:
(1) How were performance, sightreading, and final ratings distributed among participating ensembles?
(2) How were performance, sightreading, and final ratings distributed among performance levels (i.e., Grades I-VI) and VBODA Districts (i.e., Districts I-XVI)?
(3) What was the reliability of performance ratings among performance levels and VBODA Districts?
(4) Did final average ratings differ among performance levels and VBODA Districts?

Methods
Data from 1,667 high school wind and percussion ensembles at 80 events in 2009, 2011, 2012, 2013, and 2014 were downloaded from VBODA's website. Data were entered in SPSS Version 21 for analysis; ratings were transposed from Roman to Arabic numerals as needed.
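
The Roman-to-Arabic transposition step can be illustrated in a few lines of code. This is a minimal sketch, assuming ratings arrive as Roman-numeral strings; it is an illustration, not the authors' SPSS procedure.

# Hypothetical illustration of transposing Roman-numeral ratings (I-V)
# to Arabic integers before analysis.
ROMAN_TO_ARABIC = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5}

def transpose_rating(rating):
    """Return the integer rating, or None for unrated (comments-only) entries."""
    return ROMAN_TO_ARABIC.get(rating.strip().upper())

print([transpose_rating(r) for r in ["I", "ii", "IV"]])  # [1, 2, 4]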

Three evaluators rated performances of prepared repertoire using VBODA rubrics. Groups entering under Option I: Concert Performance with Sight-Reading subsequently participated in a sightreading performance with one evaluator. Seven of the 1,667 groups elected to enter under VBODA's Option II: Concert Performance in Lieu of Sight-Reading. Forty-six groups entered to perform for comments only.

Using procedures similar to Hash (2012), we used four constructs to evaluate interrater reliability: (a) IRC (Spearman's rank order coefficient) measured the extent to which individual judges' ratings moved in the same direction, (b) Cronbach's alpha measured the degree to which judges' ratings corresponded with one another, (c) IRApw measured the average percentage of pairwise agreement between individual judges, and (d) IRAco measured the percentage of agreement for all ratings combined.
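
As a rough illustration, the sketch below computes these four constructs for simulated ratings from three judges. The formulas follow the descriptions above rather than Hash's (2012) exact procedures; in particular, IRAco is interpreted here as the percentage of ensembles on which all three judges agreed, which is an assumption.

from itertools import combinations
import numpy as np
from scipy.stats import spearmanr

def reliability(ratings):
    """ratings: array of shape (n_ensembles, n_judges), integer ratings 1-5."""
    n_judges = ratings.shape[1]
    pairs = list(combinations(range(n_judges), 2))
    # (a) IRC: mean pairwise Spearman rank-order coefficient
    irc = np.mean([spearmanr(ratings[:, i], ratings[:, j])[0]
                   for i, j in pairs])
    # (b) Cronbach's alpha, treating judges as items
    alpha = (n_judges / (n_judges - 1)
             * (1 - ratings.var(axis=0, ddof=1).sum()
                / ratings.sum(axis=1).var(ddof=1)))
    # (c) IRApw: average percentage of exact pairwise agreement
    ira_pw = np.mean([(ratings[:, i] == ratings[:, j]).mean() for i, j in pairs])
    # (d) IRAco (assumed): percentage of ensembles where all judges agreed
    ira_co = (ratings == ratings[:, :1]).all(axis=1).mean()
    return {"IRC": irc, "alpha": alpha, "IRApw": ira_pw, "IRAco": ira_co}

rng = np.random.default_rng(0)
base = rng.integers(1, 6, size=(50, 1))  # underlying quality per ensemble
judges = np.clip(base + rng.integers(-1, 2, size=(50, 3)), 1, 5)
print(reliability(judges))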

Mean performance, sightreading, and final ratings were examined by performance level and VBODA District using Kruskal-Wallis ANOVAs and post-hoc Mann-Whitney U tests.
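
A minimal sketch of this testing sequence, using scipy in place of SPSS and hypothetical ratings grouped by performance level:

from itertools import combinations
from scipy.stats import kruskal, mannwhitneyu

# Hypothetical final ratings (1 = I ... 5 = V) for three performance levels
levels = {
    "Grade IV": [1, 2, 2, 3, 2, 4, 1, 2],
    "Grade V":  [1, 1, 2, 2, 3, 2, 1, 1],
    "Grade VI": [1, 1, 1, 2, 1, 1, 2, 1],
}

# Omnibus Kruskal-Wallis ANOVA across levels
h, p = kruskal(*levels.values())
print(f"Kruskal-Wallis chi-square = {h:.3f}, p = {p:.4f}")

# Post-hoc pairwise Mann-Whitney U tests between levels
for a, b in combinations(levels, 2):
    u, p = mannwhitneyu(levels[a], levels[b])
    print(f"{a} vs. {b}: U = {u:.1f}, p = {p:.4f}")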

Results

How were performance, sightreading, and final ratings distributed among participating ensembles?

The majority of participating ensembles earned performance ratings of I or II (84.6%), sightreading ratings of I or II (94.7%), and final ratings of I or II (88%).

Rating  I (n, %)    II (n, %)   III (n, %)  IV (n, %)  V (n, %)  CO (n, %)
P       757, 45.4   654, 39.2   188, 11.3   21, 1.3    1, .1     46, 2.8
SR      1122, 67.3  457, 27.4   34, 2       1, .1      0, 0      53, 3.2
F       757, 45.2   713, 42.8   148, 8.9    6, .4      0, 0      46, 2.8

Note: P = performance, SR = sightreading, F = final, CO = comments only (no rating)

How were performance, sightreading, and final ratings distributed among performance levels and VBODA Districts?

Distribution of ratings by performance level appears in the table below. To view distribution of ratings by VBODA District, please view our supplemental tables using the QR code above.

Level  n    Type  I (n, %)   II (n, %)   III (n, %)  IV (n, %)  V (n, %)  CO (n, %)
I      2    P     1, 50      1, 50       0, 0        0, 0       0, 0      0, 0
            SR    1, 50      1, 50       0, 0        0, 0       0, 0      0, 0
            F     1, 50      1, 50       0, 0        0, 0       0, 0      0, 0
II     21   P     7, 33.3    8, 38.1     4, 19       0, 0       0, 0      2, 9.5
            SR    10, 47.6   8, 38.1     0, 0        0, 0       0, 0      3, 14.3
            F     7, 33.3    8, 38.1     4, 19       0, 0       0, 0      2, 9.5
III    320  P     61, 19.1   163, 50.9   64, 20      8, 2.5     0, 0      24, 7.5
            SR    171, 53.4  109, 34.1   13, 4.1     0, 0       0, 0      27, 8.4
            F     63, 19.7   172, 53.8   59, 18.4    2, .6      0, 0      24, 7.5
IV     656  P     245, 37.3  293, 44.6   90, 13.7    10, 1.5    0, 0      18, 2.7
            SR    430, 65.4  197, 30     10, 1.5     0, 0       0, 0      20, 3
            F     243, 37    325, 49.5   69, 10.5    2, .3      0, 0      18, 2.7
V      443  P     254, 57.3  159, 35.9   27, 6.1     2, .5      0, 0      1, .2
            SR    325, 73.4  109, 24.6   7, 1.6      1, .2      0, 0      1, .2
            F     252, 56.9  173, 39.1   16, 3.6     1, .2      0, 0      1, .2
VI     223  P     189, 84.8  30, 13.5    3, 1.3      1, .4      0, 0      1, .4
            SR    185, 83    33, 14.8    4, 1.8      0, 0       0, 0      1, .4
            F     188, 84.3  34, 15.2    0, 0        1, .4      0, 0      1, .4

Note: P = performance, SR = sightreading, F = final, CO = comments only (no rating)

What was the reliability of performance ratings among performance levels and VBODA Districts?

Reliability of ratings by performance level appears in the table below. To view reliability of ratings by VBODA District, please view our supplemental tables using the QR code above.

Level  n    IRC   IRApw  IRAco  alpha
I      2    1     0.67   0.83   0.75
II     19   0.83  0.75   0.89   0.92
III    296  0.75  0.75   0.88   0.91
IV     639  0.74  0.74   0.88   0.91
V      442  0.7   0.78   0.89   0.89
VI     223  0.74  0.91   0.95   0.91

Did performance, sightreading, or final ratings differ among performance levels and VBODA Districts?

There were no significant differences by VBODA District for performance, sightreading, or final ratings. Kruskal-Wallis ANOVAs indicated significant differences by performance level for performance ratings (N = 1621, χ²(5) = 256.361, p < .001), sightreading ratings (N = 1614, χ²(5) = 43.762, p < .001), and overall ratings (N = 1621, χ²(5) = 259.520, p < .001). Post-hoc Mann-Whitney U tests revealed significant differences between several pairs, shown in the tables below.

Significant Differences in Performance Ratings:

Grade Level  n     U        p
2 vs. 6      244   418.625  <.001
3 vs. 4      976   161.977  <.001
3 vs. 5      763   337.168  <.001
3 vs. 6      543   549.75   <.001
4 vs. 5      1099  175.19   <.001
4 vs. 6      879   387.777  <.001
5 vs. 6      666   212.587  <.001

Significant Differences in Sightreading Ratings:

Grade Level  n    U        p
3 vs. 5      763  126.158  <.001
3 vs. 6      543  203.62   <.001
4 vs. 6      879  124.404  <.001

Significant Differences in Final Ratings:

Grade Level  n     U        p
2 vs. 6      244   436.496  <.001
3 vs. 4      976   163.984  <.001
3 vs. 5      763   334.584  <.001
3 vs. 6      543   549.223  <.001
4 vs. 5      1099  170.6    <.001
4 vs. 6      879   385.239  <.001
5 vs. 6      666   214.638  <.001

Discussion and Future Research

What factor(s) contribute to the preponderance of I and II ratings?

How could interrater reliability be improved (e.g., adjudicator training)?

What relationship exists between student achievement measured through large ensemble performance and individual measures of student achievement aligned with state (Virginia Department of Education, 2013) and national (National Coalition for Core Arts Standards, 2014) standards (e.g., creating, performing, connecting, responding)?

To what extent are ratings from these events appropriate measures of student learning and/or teacher effectiveness (Hash, 2013)?

To what extent are distribution and reliability of these ratings similar to those for other performing ensembles (e.g., orchestra, choir, marching band)?

To what extent are distribution and reliability of these ratings similar to those in other states?

What factor(s) influence teachers' decisions between Option I: Concert Performance with Sightreading and Option II: Concert Performance in Lieu of Sightreading?

What factor(s) influence teachers' decisions to perform for comments only?

What factor(s) influence teachers' decisions not to attend this type of assessment event?

References

Hash, P. M. (2012). An analysis of the ratings and interrater reliability of high school band contests. Journal of Research in Music Education, 60(1), 81–100.

Hash, P. M. (2013). Large-group contest ratings and music teacher evaluation: Issues and recommendations. Arts Education Policy Review, 114, 163–169.

National Coalition for Core Arts Standards (2014). National core music standards. Retrieved from http://musiced.nafme.org/musicstandards

Virginia Department of Education (2013). Music standards of learning for Virginia public schools. Retrieved from http://www.doe.virginia.gov/testing/sol/standards_docs/fine_arts/music/complete/musicartsk-12.pdf
