Wroblewski FLT 808 Assessment Design Task Paper
I chose to develop this assessment because it will be very useful in my new district.
When I met with some of my colleagues, they expressed a wish for a more authentic-materials-based midterm assessment (I will discuss this more in the design portion of this
report). While I could not develop an entire midterm on my own due to time and resources,
I created a unit test with similar parameters and for a similar purpose that could be
adapted into the overall midterm exam if necessary. The test will be for 8th grade Spanish 1,
the second level of Spanish offered in the district but the first non-exploratory course.
Students are around 13-15 years old and attend the middle schools. Most students in
Spanish 1 are at a Novice Low to Novice Mid level, as measured on the ACTFL proficiency
scale, meaning they often rely on memorized phrases and isolated words (ACTFL, 2012).
The exam is intended for after the first semester of study, at the end of unit 2. This test
would be used as a summative assessment for the end of the unit, and would be worth
slightly more than other assessments during the unit. The results would be used only for course grading purposes.
As a unit test, students would be the major stakeholders in the exam results, which
would be calculated into their overall grade. However, with over 6 units total this one exam
would be a smaller portion of the grade. The exam would be criterion-referenced, meaning students are measured against the course objectives rather than against other test takers. If students performed poorly on the exam it could affect their grade negatively, but the outcome of the course would depend on
many more factors. Parents could also be considered stakeholders as they usually wish to
see their students graduate, and the first level of a language is a requirement for graduation in the district. Preparing students for this exam would involve interpreting authentic reading and listening excerpts, and writing practice (although somewhat formulaic). As a result, backwash would
likely be positive and largely focused on interpreting authentic materials. I would like to
point out that I am not pleased by using multiple choice for the majority of the exam.
Nevertheless, my district requested it and I am following their guidelines. I will discuss the format further in the design section below.
Overall Design
As discussed above, the teachers from my district were looking for an exam based
on authentic materials. Today many language educators push the use of authentic materials in the classroom. Some have found authentic materials to be motivating for learners, while others say they can cause difficulty
due to specific vocabulary (Gilmore, 2007). In truth, Gilmore (2007) argues that there is
little empirical evidence either way. He also points out that there are many definitions of
“authentic”: for instance, type of task, or materials made by native speakers, for native
speakers (Gilmore, 2007). In this case, the district was looking for the latter on the
interpretive portion of the test (listening, reading, & vocabulary). Despite a lack of empirical evidence, authentic materials seem beneficial when used correctly (Gilmore, 2007). The largest portion of time in developing this test went into
searching for appropriate authentic materials. My district already provided some sources,
but I found additional sources through extended Internet searches. Links to the original materials are included with the test. The exam consists of two portions, interpretive and presentational. The first assesses the skills of vocabulary, reading, and listening and consists of 25 multiple
choice items that would be scored via scantron. There are many advantages and
disadvantages to this format. Scoring of multiple choice is much quicker and more reliable,
which is very useful if used by a large group of teachers in different buildings, as ours
would be (Hughes, 2003). However, multiple choice questions only test recognition, scores
can be affected by guessing, and if not written properly the questions could be poor
(Hughes, 2003). In creating this portion of the exam I relied heavily on the brownbag
lecture on writing effective multiple choice questions (Ohlrogge, 2014). For example, I tried
to keep prompts short and to use simple language that students are familiar with. I did my best to balance the responses and make the distractors plausible (Ohlrogge, 2014). I will discuss each section of the exam in more detail below.
Vocabulary: The first two sections of the exam (A and B) are focused on vocabulary
and in a secondary sense, reading. The vocabulary being assessed comes from the first
three units of the textbook Avancemos with a focus on Unidad 2 about school supplies, class
schedules, and school activities (McDougall Littell, 2010). Due to the use of multiple choice
questions, only recognition can be assessed. Both of these portions could be considered
reading tasks as well because they include an authentic piece of "text" (a map and a chart); however, vocabulary knowledge is the main construct being measured. Although it could be
argued that by including some reading skill in vocabulary assessment validity could be
compromised (Hughes, 2003), I chose this style because it is more authentic and better
reflects tasks students would do in real life. For part A students must know how to tell time,
numbers, class titles, and generally how to read a schedule in order to complete the tasks.
They would also need basic knowledge of question construction in Spanish. Part B contains
a school map and students interpret where various labeled rooms are located by using
their knowledge of prepositions of location (e.g. to the left of, next to, etc.). While the vocabulary is basic, these tasks embed it in a realistic context. In this section I chose to use 3-option multiple choice questions instead of 4, simply because I had trouble creating other distractors, and the extra option makes little practical difference. Like Part A, this section mixes reading into vocabulary assessment, which could result in a murkier construct, but does align with the
curricular objectives of students being able to interpret communications such as maps and
schedules.
Reading: The second portion of the exam is focused on the skill of reading. Some of
the main objectives in Spanish 1 are for students to be able to skim texts, scan texts, use
context clues to determine unknown words, and identify general ideas: a mixture of expeditious and careful operations (Hughes, 2003). The text I chose is an Internet forum post I found in which children discuss their opinions of school. The excerpts are very short,
are on a familiar topic, and are similar to what students may see in their daily lives
exploring online. The text does contain many unknown words, but I decided not to simplify
it based on findings that simplifying a text does not make a significant difference in overall
text comprehension (Young, 1999). Questions are once again multiple choice with 4
responses and are in English to avoid any issues in comprehension (Hughes, 2003). I put all
of the questions in the same order as the text and included questions on all portions of the excerpt. The items target a mix of operations: some involve inference (question 13), some are local details (11, 12, 14), and there is one
that involves an inference based on cultural knowledge (that Barcelona is in Spain)
(Hughes, 2003). One major issue with this section is that to accurately measure reading
ability it would likely be necessary to include more items and more reading excerpts to give
a better idea of students’ true performance on the construct (Hughes, 2003). While the first
two sections do include some reading skills, on future iterations of this exam or if it were
used as a midterm I would add more. In this case, time constraints limit what is possible for a single unit test.
Listening: Parts D and E of the exam focus on the assessment of listening skill with
the objective of students being able to identify key details as well as the overall message of
the “text”. Both excerpts are authentic in that native speakers created them, although the
second one is likely for language learners. Both are monologues. The pacing of the first
listening on a school is fairly quick for Novice level learners, which is why I included the
video. Vandergrift (2007) discusses how visuals can aid in comprehension, noting that learners "who view and listen simultaneously appear to use more… strategies to compensate for inadequate linguistic knowledge than those who only listen" (p. 200). It's important to note
that some researchers have found video does not distract learners (Wagner, 2007), but
others have found the opposite (Coniam, 2001). I believe comprehension advantages
outweigh potential distraction. The video can also be played at a reduced speed, and instructors could determine whether this (as well as repetition of the video) is appropriate depending on the
student population. The second excerpt is a teen describing her school schedule. For this
text I chose to keep the questions in Spanish to have students focus on listening for key
words. The questions for both are in order based on the listening, are far enough apart to
avoid missing one, and include 3 responses only to aid students (Ohlrogge, 2016). One
drawback is the same as with the reading: there are a limited number of items, which makes it difficult to sample the construct adequately.
For all sections of the interpretive portion students would receive their scores as a
total number out of 25, although each staff member may choose to go over the questions
and answers with their students to provide them with feedback. In my classroom I always review the exam with students, since feedback and student involvement in assessment support learning (Anderson, 2012).
The second portion of the exam is the presentational portion assessing students’
presentational writing skills; in this case, their ability to write a personal letter. This is only one task and not necessarily a representative sample of writing ability; however, this would
only be one of likely many summative and formative assessments during the course. The
prompt I included is very specific and offers little choice, as recommended by Hughes
(2003) to restrict candidates and make scoring more reliable. I also chose to use an analytic
rubric as a “heterogeneous… less well-trained group” (Hughes, 2003) will be the ones
scoring the writing samples. The rubric is based on the Jacobs et al. (1981) weighted
scoring profile and is out of 100 points total (Hughes, 2003). East (2009) constructed a
highly reliable analytic rubric specific to foreign language assessment (not ESL), so I also
used input from his rubric in constructing my own. The categories rated include, in order of weight, content, organization, vocabulary, language use, and mechanics.
Students will respond to the prompts by writing their letter on the sheet provided
using paper and pencil without the use of a dictionary. Multiple scoring of this assessment
would be ideal, with at least 2 Spanish teachers rating each sample. Feedback for this portion would come from the completed rubric as well as writing-specific comments, and teacher training would need to be given on how to best rate as well as give feedback. It's likely some teachers would weight this writing
assessment overall at 50% or out of 50 points rather than 100 depending on their course
grading scale.
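As a rough illustration of the weighted scoring described above, the sketch below shows how per-category ratings could combine into a total out of 100, how two raters' totals could be averaged, and how a total could be rescaled to 50 points. The category names and weights here are hypothetical placeholders in the spirit of the Jacobs et al. (1981) profile, not the actual rubric used on this exam.

```python
# Hypothetical weights for an analytic rubric out of 100 points.
# These are illustrative only, not the rubric's actual categories/weights.
WEIGHTS = {
    "content": 30,
    "organization": 20,
    "vocabulary": 20,
    "language_use": 25,
    "mechanics": 5,
}

def score_sample(ratings: dict) -> float:
    """Combine per-category ratings (each a fraction 0.0-1.0 of that
    category's maximum) into a weighted total out of 100."""
    return sum(WEIGHTS[cat] * frac for cat, frac in ratings.items())

def average_raters(*totals: float) -> float:
    """With at least 2 raters per sample, take the mean of their totals."""
    return sum(totals) / len(totals)

def rescale(total: float, out_of: float = 50) -> float:
    """Rescale a 100-point total for teachers who weight writing at 50."""
    return total * out_of / 100

ratings = {"content": 0.5, "organization": 0.75, "vocabulary": 0.5,
           "language_use": 1.0, "mechanics": 1.0}
total = score_sample(ratings)  # 15 + 15 + 10 + 25 + 5 = 70.0
```

A second rater's total would then be averaged in (e.g. `average_raters(70.0, 80.0)` gives 75.0), and `rescale(70.0)` yields 35.0 on a 50-point scale.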
Administration
This test would likely need to be given in one 50 minute class period, which is why
there were limitations on how many items could be included. The writing portion could take longer, and repeating the listening could also increase administration time, so some instructors might choose to use 2 class periods. Each classroom teacher would
administer the assessment in his or her own class in a paper-based format. Required
materials include test booklets, scantrons, cover sheets, pencils, a video-audio projector for
playing the listening excerpts, papers for the writing portion, and rubrics for each student
for ratings. All of these resources would be easily accessible. Some teachers may choose to
have students listen to the excerpts on their own, and in that case each student would need an individual device and headphones.
Conclusion
While it is impossible to fully evaluate this test without piloting and analyzing it, by following the recommendations laid out in our course on assessment and the input of other researchers, I believe this assessment is at least a solid starting point.

References

ACTFL. (2012). ACTFL proficiency guidelines 2012. Retrieved from http://www.actfl.org/publications/guidelines-and-manuals/actfl-proficiency-guidelines-2012
Anderson, N. J. (2012). Student involvement in assessment: Healthy self-assessment and effective peer assessment. In C. Coombe, P. Davidson, B. O'Sullivan, & S. Stoynoff (Eds.), The Cambridge Guide to Second Language Assessment (pp. 187–197). New York: Cambridge University Press.
Coniam, D. (2001). The use of audio or video comprehension as an assessment instrument in the certification of English language teachers: A case study. System, 29(1), 1–14. doi: 10.1016/S0346-251X(00)00057-9
East, M. (2009). Evaluating the reliability of a detailed analytic scoring rubric for foreign language writing. Assessing Writing, 14(2), 88–115.
Gilmore, A. (2007). Authentic materials and authenticity in foreign language learning. Language Teaching, 40(2), 97–118.
McDougall Littell. (2010). ¡Avancemos! 1. Retrieved from http://classzone.com/cz/books/avancemos_1/book_home.htm?state=KS
Hughes, A. (2003). Testing for Language Teachers (2nd ed.). Cambridge, UK: Cambridge
University Press.
Ohlrogge, A. (2014). CeLTA Language Learner Training. Multiple choice items: The art and the science [brownbag lecture]. Retrieved from http://learninglanguages.celta.msu.edu/writing-multiple-choice-items/
Ohlrogge, A. (2016). Module 4 Part 1_Reading and Listening & Module 4 Part 2_Grammar [course lecture]. Retrieved from https://d2l.msu.edu/d2l/le/content/423528/Home?itemIdentifier=D2L.LE.Content.ContentObject.ModuleCO-3650854
Vandergrift, L. (2007). Recent developments in second and foreign language listening comprehension research. Language Teaching, 40(3), 191–210. doi: 10.1017/S0261444807004338
Wagner, E. (2007). Are they watching? Test-taker viewing behaviour during an L2 video listening test. Language Learning & Technology, 11(1), 67–86. Retrieved from http://llt.msu.edu/vol11num1/wagner/
Young, D. J. (1999). Linguistic simplification of SL reading material: Effective instructional practice? The Modern Language Journal, 83(3), 350–366. doi: 10.1111/0026-7902.00027