
1. Assessment involves the use of empirical data on student learning to refine programs and improve student learning. (Assessing Academic Programs in Higher Education by Allen 2004)

2. Assessment is the process of gathering and discussing information from multiple and
diverse sources in order to develop a deep understanding of what students know,
understand, and can do with their knowledge as a result of their educational experiences;
the process culminates when assessment results are used to improve subsequent learning.
(Learner-Centered Assessment on College Campuses: shifting the focus from teaching to
learning by Huba and Freed 2000)

3. Assessment is the systematic basis for making inferences about the learning and
development of students. It is the process of defining, selecting, designing, collecting,
analyzing, interpreting, and using information to increase students' learning and
development. (Assessing Student Learning and Development: A Guide to the Principles,
Goals, and Methods of Determining College Outcomes by Erwin 1991)

4. Assessment is the systematic collection, review, and use of information about educational
programs undertaken for the purpose of improving student learning and development.
(Assessment Essentials: planning, implementing, and improving assessment in higher
education by Palomba and Banta 1999)

There are two main types of assessment: summative assessment and formative assessment.

Summative Assessment

Oftentimes, summative assessments can be considered high-stakes. Summative assessments are used to gauge children's learning against a standard or a benchmark. They are often given at the end of the year and are sometimes used to make important educational decisions about children. Summative assessments are a snapshot of students' understanding, which is useful for summarizing student learning. What helps me remember the difference between the two types is that summative is like a summary. Summative is the big picture or the grand summary of a child's learning.

Summative assessments aren't used a lot in early childhood programs because they're not really
considered developmentally appropriate as a form of assessment for very young children. One
example that you might see or use in your program is a Kindergarten Readiness Assessment or a
developmental skills assessment that enables the child to move to the next classroom. I've heard
of some programs doing assessments like that, where a child has to have a certain score on this
assessment in order to move up to the next preschool room or the four- or five-year-old room, or whatever
it was for that particular program.

There's a little bit of debate within our field about whether it is developmentally appropriate or
not to test children to move them up to that next level. In my experience, there have been several
times where I felt that a child was ready to move up to the next class even though age-wise or
chronologically he wasn't the right age to move. I'm sure we've all had those children where
we're in the three-year-old class and the child's mentally five, but chronologically he's three.
Then there have been other times where the child was chronologically ready to move up at age
five, but developmentally I really felt like he should have been in the other room for a little bit
longer.

There are all kinds of implications for using an assessment of that type for that reason. Not to say
that that's wrong to do. It's just there's a little bit of debate in our profession about using those
types of assessments.

Formative Assessments

That takes us to the second type of assessment, which is formative assessment. These are considered low-stakes. So summative assessments are high-stakes and formative assessments are low-stakes. They're ongoing, and they tend to be based on teachers' intentional observations of children, typically made during specific learning experiences and/or during everyday interactions or classroom involvement. These assessments are most useful for planning learning experiences, activities, and environments.

These are the everyday interactions that we talked about, where assessment naturally emerges
from the work that you're already doing. Those would be considered more of the formative
assessment. Again, these assessments are used to determine activities for the lesson plan after
asking questions such as

• What kinds of things should I change out in my centers?

• What kinds of items in the science center are the kids just throwing?

• What kinds of things in the science center are they actually sitting down with, investigating, and genuinely curious about?

When I was a preschool teacher, I had many four to five-year-old children in my classroom
because at the time, the ratio for our state was one to 15. I had 30 children in my classroom and I
had to really be on top of what my children were interested in and what they had figured out or
had gotten over the excitement of. When you have that many children in the classroom, you have
to keep them engaged, active, and busy. Formative assessments were extremely helpful for me in
that way.

Formative assessments are most appropriate for use with young children. Remember, summative
assessments are not necessarily appropriate for age five years and under, but formative
assessments are definitely appropriate as they're often more authentic, more real, and more
holistic. They show a picture of the whole child as well so they can be more useful. Because
young children's learning can be so varied and sometimes erratic, using multiple sources of
assessment information is ideal. That goes back to what we were just talking about where
children develop in such a wide range, with a variety of contexts and situations.

There's such a wide range of development when it comes to young children that even though you might have a classroom full of three-year-olds, developmentally they're going to be on a spectrum. That's because development and learning are varied and can be erratic. The term erratic may be a little bit shocking at first, but young children's learning can be erratic. For example, if you work with infants, one day you send them home and they can't sit up or roll over and are just lying there looking at you. Then they come back on Monday and they're rolling and moving and grooving and doing all kinds of stuff. If you work with toddlers, one day you send them home and they barely say two or three words; the next week they come back and you can't get down all the words that they're speaking. In this context, erratic means development is sometimes very sudden and sometimes drawn out. It depends on the child.

Formative assessments can be formal, where you're actually making time to sit down and take notes at a set time, in a particular center, or focused on a particular child. They can also be informal, such as when you're out on the playground and a child is sitting under the tree with a book, and you just go over, sit down, and say, "Hey, can I read with you?" You notice, wow, this child knows a lot of words in this book, and you make a note of that. That would be more of an informal type of assessment that you've done.

Formative assessments can be initial or ongoing. The initial formative assessment is usually done
to find out as much as we can about the child, usually at the beginning of the year or as a child
enters a program. It usually involves observing, studying existing information, and reviewing
home background info.

In the program that I supervised, when we had a new child join our program, we had a sheet that
the parents would fill out that asked all kinds of information like, "What's your child's favorite
stuffed animal? How does your child go to sleep at night? What's the bedtime routine? What's
your child's favorite food? What's your child's favorite movie?" It was all background
information about the child so that we could get to know them. That helped us begin those
connections that are so important in early childhood. That home background information would
be a part of that first initial formative assessment.

The other type of formative assessment is an ongoing formative assessment. This typically
provides more in-depth information, often because it takes more time. An ongoing formative
assessment isn’t a quick form that you’re through with once. It’s an ongoing thing you will look
at every week, month, three months, or however it is set up in your program.

Here are some examples of published formative assessment tools often used in early childhood
programs.

• The Work Sampling System (WSS) www.worksamplingonline.com

• Teaching Strategies GOLD www.teachingstrategies.com

• High Scope COR (Child Observation Record) www.onlinecor.net

• The Creative Curriculum Developmental Continuum www.teachingstrategies.com

Sometimes a state or funding sources will mandate that certain early childhood programs use a
specific assessment tool. Sometimes your program itself mandates that. I've had the experience
of working with all of these tools at one time or another in my career. All of them have definite
benefits to using them and many of them are pretty easy to complete. As you know, in early
childhood time is not a luxury that we have a lot of. It's always nice to have a tool that's easy to
use so that when you find five minutes to sit down and work on something or do an assessment,
then it's easy to figure out.

Test items fall into two general categories: (1) objective items, which require students to select the correct response from several alternatives or to supply a word or short phrase to answer a question or complete a statement; and (2) subjective or essay items, which permit the student to organize and present an original answer. Objective items include multiple-choice, true-false, matching, and completion, while subjective items include short-answer essay, extended-response essay, problem solving, and performance test items. For some instructional purposes one or the other item type may prove more efficient and appropriate. To begin our discussion of the relative merits of each type of test item, test your knowledge of these two item types by answering the following questions.

These are some characteristics of objective and subjective tests:

Objective test characteristics:

• They are so definite and so clear that a single, definite answer is expected.

• They ensure perfect objectivity in scoring.

• They can be scored objectively and easily.

• They take less time to answer than an essay test.

Subjective test characteristics:

• Subjective items are generally easier and less time-consuming to construct than are most objective test items.

• Different readers can rate identical responses differently, and the same reader can rate the same paper differently over time.

How to interpret criterion-referenced tests

Criterion-referenced tests compare a person's knowledge or skills against a predetermined standard, learning goal, performance level, or other criterion. With criterion-referenced tests, each person's performance is compared directly to the standard, without considering how other students perform on the test. Criterion-referenced tests often use "cut scores" to place students into categories such as "basic," "proficient," and "advanced."
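
To make the cut-score idea concrete, here is a minimal sketch in Python; the score bands and category labels are invented for illustration and are not drawn from any real test:

# Hypothetical cut scores for illustration only; real tests define their own.
CUT_SCORES = [(70, "advanced"), (50, "proficient"), (0, "basic")]

def categorize(score):
    """Return the performance category for a score, ignoring all other students."""
    for cutoff, label in CUT_SCORES:
        if score >= cutoff:
            return label
    return "basic"

# Each student is compared only against the criterion, never against peers.
print(categorize(82))  # advanced
print(categorize(55))  # proficient
print(categorize(40))  # basic

Notice that the function never looks at anyone else's score; that independence is the defining feature of criterion-referenced interpretation.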


If you’ve ever been to a carnival or amusement park, think about the signs that read “You must
be this tall to ride this ride!” with an arrow pointing to a specific line on a height chart. The line
indicated by the arrow functions as the criterion; the ride operator compares each person’s height
against it before allowing them to get on the ride.

Note that it doesn’t matter how many other people are in line or how tall or short they are;
whether or not you’re allowed to get on the ride is determined solely by your height. Even if
you’re the tallest person in line, if the top of your head doesn’t reach the line on the height chart,
you can’t ride.

Criterion-referenced assessments work similarly: an individual's score, and how that score is categorized, is not affected by the performance of other students. In the charts below, you can see that the student's score and performance category ("below proficient") do not change, regardless of whether they are a top-performing student, in the middle, or a low-performing student.

[Charts: the student's score and "below proficient" category stay the same whether the other students perform well or poorly.]
This means knowing a student's score on a criterion-referenced test will tell you only how that specific student performed relative to the criterion, but not whether they performed below average, above average, or average compared to their peers.

How to interpret norm-referenced tests

Norm-referenced measures compare a person's knowledge or skills to the knowledge or skills of the norm group. The composition of the norm group depends on the assessment. For student assessments, the norm group is often a nationally representative sample of several thousand students in the same grade (and sometimes, at the same point in the school year). Norm groups may also be further narrowed by age, English Language Learner (ELL) status, socioeconomic level, race/ethnicity, or many other characteristics.

One norm-referenced measure that many families are familiar with is the baby weight growth chart in the pediatrician's office, which shows which percentile a child's weight falls in. A child in the 50th percentile has an average weight; a child in the 75th percentile weighs more than 75% of the babies in the norm group and the same as or less than the heaviest 25% of babies in the norm group; and a child in the 25th percentile weighs more than 25% of the babies in the norm group and the same as or less than 75% of them. It's important to note that these norm-referenced measures do not say whether a baby's birth weight is "healthy" or "unhealthy," only how it compares with the norm group.

For example, a baby who weighed 2,600 grams at birth would be in the 7th percentile, weighing
the same as or less than 93% of the babies in the norm group. However, despite the very low
percentile, 2,600 grams is classified as a normal or healthy weight for babies born in the United
States—a birth weight of 2,500 grams is the cut-off, or criterion, for a child to be considered low
weight or at risk. (For the curious, 2,600 grams is about 5 pounds and 12 ounces.) Thus, knowing
a baby’s percentile rank for weight can tell you how they compare with their peers, but not if the
baby’s weight is “healthy” or “unhealthy.”
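
Here is a minimal sketch of that contrast in Python; the norm-group weights are invented for illustration, while the 2,500-gram cut-off comes from the passage above:

# Invented norm-group birth weights in grams, for illustration only.
norm_group = [2400, 2700, 2900, 3000, 3100, 3200, 3300, 3400, 3600, 4000]

def percentile_rank(value, group):
    """Percent of the norm group weighing the same as or less than the value."""
    at_or_below = sum(1 for g in group if g <= value)
    return 100 * at_or_below / len(group)

baby = 2600
# Norm-referenced: the answer depends entirely on who is in the norm group.
print(percentile_rank(baby, norm_group))  # 10.0 with this invented group
# Criterion-referenced: the answer depends only on the fixed cut-off.
print("low weight" if baby < 2500 else "normal weight")  # normal weight

The same weight produces two different kinds of answers: a percentile that would shift if the norm group changed, and a category that would not.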

Norm-referenced assessments work similarly: an individual student's percentile rank describes their performance in comparison to the performance of students in the norm group, but does not indicate whether or not they met or exceeded a specific standard or criterion.

In the charts below, you can see that, while the student's score doesn't change, their percentile rank does change depending on how well the students in the norm group performed. When the individual is a top-performing student, they have a high percentile rank; when they are a low-performing student, they have a low percentile rank. What we can't tell from these charts is whether or not the student should be categorized as proficient or below proficient.

[Charts: the student's score stays the same, but their percentile rank rises or falls with the performance of the norm group.]
This means knowing a student’s percentile rank on a norm-referenced test will tell you how well
that specific student performed compared to the performance of the norm group, but will not tell
you whether the student met, exceeded, or fell short of proficiency or any other criterion.

3.d Item analysis is a process which examines student responses to individual test items (questions) in order to assess the quality of those items and of the test as a whole. Item analysis is especially valuable in improving items which will be used again in later tests, but it can also be used to eliminate ambiguous or misleading items in a single test administration. In addition, item analysis is valuable for increasing instructors' skills in test construction and for identifying specific areas of course content which need greater emphasis or clarity. Separate item analyses can be requested for each raw score.

3.e Test-retest reliability is a measure of reliability obtained by administering the same test twice over a period of time to the same group of individuals. Example: a test designed to assess student learning in psychology could be given to a group of students twice, with the second administration perhaps coming a week after the first. The obtained correlation coefficient would indicate the stability of the scores.
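
As a minimal sketch of that procedure, assuming two made-up lists of scores for the same ten students, the correlation between the administrations could be computed like this in Python:

import statistics

# Made-up scores for the same ten students on two administrations, a week apart.
first_week  = [78, 85, 62, 90, 71, 66, 88, 74, 59, 81]
second_week = [80, 83, 65, 92, 70, 64, 85, 76, 61, 79]

# Pearson correlation coefficient (statistics.correlation requires Python 3.10+).
r = statistics.correlation(first_week, second_week)
print(f"test-retest reliability: r = {r:.2f}")  # values near 1.0 mean stable scores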

Test validity refers to how well a test measures what it is purported to measure. A test can be reliable without being valid, but it cannot be valid unless it is reliable. For example, if your scale is off by 5 lbs, it reads your weight every day with an excess of 5 lbs. The scale is reliable because it consistently reports the same weight every day, but it is not valid because it adds 5 lbs to your true weight. It is not a valid measure of your weight.

Alternatives   A    B    C    D
Upper 25       2    3    5    15
Lower 25       5    2    15   3

A) Item index of difficulty (p) = (number of students who got the item right) / (total number of students)

For alternative A: p = (2 + 5) / 50 = 7/50 = 0.14

B) Discrimination index (D) = (number of students in the upper group who got it right − number of students in the lower group who got it right) / (number of students in one group)

For alternative A: D = (2 − 5) / 25 = −0.12

Alternative   Upper 25   Lower 25   Item index of difficulty (p)   Discrimination index (D)
A             2          5          0.14                           −0.12
B             3          2          0.10                           0.04
C             5          15         0.40                           −0.40
D             15         3          0.36                           0.48
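
As a minimal Python sketch of the same arithmetic, using the upper- and lower-group counts from the table above and the conventional formulas p = (U + L) / total and D = (U − L) / group size:

# Number of students in each group who chose each alternative, from the table above.
upper = {"A": 2, "B": 3, "C": 5, "D": 15}
lower = {"A": 5, "B": 2, "C": 15, "D": 3}
GROUP_SIZE = 25  # students in each of the upper and lower groups (50 total)

for option in "ABCD":
    u, l = upper[option], lower[option]
    p = (u + l) / (2 * GROUP_SIZE)   # difficulty: proportion of all 50 students
    d = (u - l) / GROUP_SIZE         # discrimination: upper minus lower, per group
    print(f"{option}: p = {p:.2f}, D = {d:+.2f}")

A positive D means the upper group chose that alternative more often than the lower group, which is what you want for the keyed answer; here only D shows that pattern.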
