
Course: Educational Measurement and Evaluation (6507)

Level: MA/M.Ed

Semester: Spring 2022

Assignment No. 2

Q.1 Discuss some of the popular observational techniques. Present a criticism of the use and credibility of self-reported data and its techniques.

Social scientists use many methods to collect data. The most common method is self-report, in
which people respond to questions about themselves regarding a wide variety of issues such as
personality traits, moods, thoughts, attitudes, preferences, and behaviors. In fact, much of
social science knowledge and theory are based largely on self-report data.

ADVANTAGES OF SELF-REPORT

The main advantage of self-report is that it is a relatively simple way to collect data from many
people quickly and at low cost. A second advantage is that self-report data can be collected in
various ways to suit the researcher’s needs. Questionnaires can be completed in groups or
individually and can be mailed to respondents or made available on the Internet. Self-report
data can also be collected in an interview format, either in person or over the telephone.
Researchers can thus obtain data from respondents across a large geographic area or from respondents to whom
they do not have direct access.

Furthermore, researchers can collect data regarding behaviors that cannot be observed directly
or are unethical to simulate in the laboratory (i.e., activities typically done in private and
behaviors that would cause embarrassment if done in public). The only person with direct
access to mental events and some behaviors is the self; therefore, the self is the best person to
report on these variables. Also, respondents completing pencil-and-paper questionnaires with
the assurance of confidentiality and anonymity might be more willing to report accurately on a
variety of behaviors and characteristics.

DISADVANTAGES OF SELF-REPORT AND POSSIBLE SOLUTIONS

There are several disadvantages of self-report that threaten the reliability and validity of
measurement. Researchers must ensure that measures are reliable—meaning the outcomes of measurements are repeatable—and valid—meaning the intended variable is measured.
Some threats to validity derive from the way the measure is designed, such as using ambiguous
words or words that are not appropriate for the reading level of the respondents. This is
especially problematic when different groups (e.g., men and women, adolescents and adults)
are likely to interpret words differently or have different reading levels. Researchers must
choose words carefully to convey their desired meaning precisely and at an appropriate reading
level. Another way to ensure consistent interpretation of items is to provide respondents with a
reference group. For example, respondents may be instructed to indicate their level of shyness
in relation to people of their same age and gender. It may also be important to provide
additional information to ensure that respondents interpret and use response scales correctly,
such as providing examples of portion sizes when asking questions about eating habits.

The order of items may also influence how a person responds. For example, people may report
different levels of happiness depending on whether they answer this question before or after
reporting how many dates they had last month. Researchers should consider how questions
might influence answers to subsequent questions. Furthermore, when asking about
controversial or potentially embarrassing information, researchers might plan to ask less
controversial questions first so that respondents feel comfortable at the outset and thus will be
more likely to provide honest responses to more difficult questions.

Other threats to the validity of self-report are due to respondent characteristics. One such well-
known threat is socially desirable responding, or the attempt by respondents to make
themselves look good. Researchers avoid this problem by using neutral items; by using the Q-
sort method of response, in which the respondents are allowed to rate only a certain number of
items as highly descriptive of themselves; and by informing respondents that answers are
anonymous and/or confidential, thereby encouraging honest responding. Finally, researchers
may choose not to use data from respondents who score high on measures of social
desirability.

Respondents may have other biases unrelated to item content. They might tend to embrace or
avoid extreme responses, or they might exhibit acquiescence, a tendency to agree with
statements. These threats can be reduced by requiring respondents to choose one answer from
a list. To reduce acquiescence bias, researchers can use measures with items keyed in different
directions so that agreeing with some items lowers the total score.
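
As a minimal sketch of how reverse keying might be scored (the 5-point Likert scale, items, and responses below are hypothetical, chosen only to illustrate the technique), a reverse-keyed item is rescored as the scale maximum plus one minus the response, so indiscriminate agreement no longer inflates the total:

```python
# Sketch of reverse-keyed scoring on an assumed 5-point Likert scale
# (1 = strongly disagree ... 5 = strongly agree). Reverse-keyed items are
# rescored as (scale_max + 1 - response) before summing, so habitual
# agreement lowers rather than raises the total on those items.

responses = [5, 4, 5, 2, 5]                        # hypothetical answers to five items
reverse_keyed = [False, False, True, False, True]  # items 3 and 5 are reverse-keyed

def total_score(responses, reverse_keyed, scale_max=5):
    """Sum item scores, flipping reverse-keyed items around the scale midpoint."""
    return sum(
        (scale_max + 1 - r) if rev else r
        for r, rev in zip(responses, reverse_keyed)
    )

print(total_score(responses, reverse_keyed))  # 5 + 4 + 1 + 2 + 1 = 13
```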

An especially harmful threat to self-report occurs when respondents are intentionally dishonest. Dishonest responding can be decreased by ensuring confidentiality (and anonymity,
if possible). Data from people who score high on measures of lying or faking (in both negative
and positive directions) can also be excluded from further analyses.

Experimenters may pose threats to the validity of self-report by unintentionally influencing how
people respond to questions, which is especially likely with an interview format. Typically, this
threat results from experimenters expecting people to respond in line with hypotheses. To
lessen this threat, experimenters and interviewers should be unaware of the research
hypotheses, and several interviewers should be used to reduce systematic influences of any
one interviewer.

The situation and location of interviews may also influence self-report measures. For example,
people interviewed on college campuses may agree with the statement “The government
should give more money to education” in greater numbers than people interviewed in a park.
The situation may serve as a cue to the respondents about the desirable answer, and they may
respond accordingly. Even when the respondent is not consciously aware of contextual cues, he
or she may still be influenced by them on a subconscious level.

A final threat to validity can occur if all data have been collected with self-report because this is
likely to artificially inflate correlations; this problem is known as method variance. In fact,
collecting data with any single method has its pitfalls. Researchers who use a single method for
collecting data need to be aware of method variance so they can correctly interpret the
magnitudes of the correlations found between various measures.

METHODS OTHER THAN SELF-REPORT

Multiple methods can be used to gain a multifaceted understanding of variables of interest and
to determine consistency between self-reports and other measurements. Other-report (asking
another person about the person of interest) is increasingly being used to examine variables
that were previously primarily measured with self-report. Behavioral observation is another
method for gathering information about variables commonly assessed with self-report,
although this method is more time-consuming and expensive than self-report. To check for
accuracy, self-reports can be compared with archival data (e.g., criminal records).

Researchers must keep in mind the purpose of their research to determine which method(s) of
data collection to use. If researchers are interested in people’s subjective experience of their
own thoughts and behaviors, then self-report is appropriate. However, if researchers are
interested in more than people’s subjective experience of themselves, then a multimethod
approach should be used to ensure reliable and valid measurement. Social science researchers
are often concerned with more than subjective experience and are beginning to embrace
multimethod approaches to data collection. This shift in data collection methods is likely to
increase the quality of data available to test and revise social science theories.

Q.2 What kinds of difficulties are faced by examiners during the test administration process? Give some suggestions to overcome the problems during test administration.

Assessing the quality and quantity of learning has been, and always will be, a regular feature of
classroom practice in every public school. For teachers to establish whether their pupils have
been learning, they have to set, administer, score and grade examinations. Testing provides
information about the examinees' abilities and performance. It also enables meaningful
observations and comparisons to be made of the kind of behaviour learners acquire during the
teaching-learning process (Child, 1997; Farrant, 2000).

Similarly, students of comparable ability should obtain similar grades on the same test, and similar results should be obtained by groups of comparable candidates using the test on other occasions, even when marked by a different examiner. This kind of result may only be obtained if, among other things, the test is carefully administered; implying that the quality of test management
and administration ensures its validity and reliability (Walklin, 1990).

It is, therefore, incumbent upon persons entrusted with the management of tests to learn the
principles and good practices of test administration to ensure these qualities of a test are
upheld in testing. When test administrators are not conversant with the principles of test
administration, the overall aim of examination process fails and more often than not, the
examiner and the examinees suffer the consequences.

Across the world, a number of scholars have documented vast and interesting literatures
regarding principles of test administration and good testing practices in schools. Gronlund and
Linn (1990), for example, suggest that tests can be successfully administered by any
conscientious teacher or test administrator, as long as the prescribed testing procedures are
rigorously followed. They maintain that test administration becomes simple if:

i. the pupils are motivated to do their best,

ii. test administration directions are followed closely,

iii. time is accurately kept,

iv. any significant events that might influence test scores are recorded, and

v. testing materials are collected promptly.

In Nigeria, different approaches to test administration are adopted by different examination bodies. A number of stakeholders, including the federal, state and local governments, Non-Governmental Organizations and concerned individuals, amalgamate their efforts for successful
test administration in the public schools. In addition, punitive measures for those individuals
who commit offences during the examination process have been put in place. The punitive
measures range from imprisonment to monetary fines, depending on the type and magnitude
of the offence committed during the examination process (Jegede, 2003; Nwahunanya, 2004).
Related to punitive measures, according to Adewale (2008); if less emphasis is placed on paper
qualification and continuous assessment encouraged, irregularities during examination
administration can be eliminated, and consequently, the examination administration process
can be more effective.

The situation in Uganda, regarding examination administration does not greatly differ from
other countries. However, in Uganda, examination management, national assessment and
administration depends on the level of learning and the purpose of the examination
coordinated by a national body. At the highest level of learning, universities have the autonomy
to manage their respective examinations. It is, therefore, the responsibility of each university to
put guidelines for administration of its examinations in place and to follow such guidelines. For
secondary schools and colleges, the examination process is entrusted to the examining body,
the Uganda National Examinations Board (UNEB). Conscious of the need to obtain information
on what learners actually learn in school, many countries now operate what are variously called
national assessments, system assessments, learning assessments, or assessment of learning
outcomes (Greaney & Kellaghan, 1996).

In Uganda, the Education Policy Review Commission (1989) reported a lack of reliable and up-to-date data on educational indicators. The only assessment information used for monitoring and evaluation was based on public examinations such as the Primary Leaving Examination (PLE) and the Uganda Certificate of Education (UCE) examination results. However,
public examinations are done only at the end of a cycle of education and are designed to serve
primarily as instruments for certification and selection of learners into institutions of higher
learning (UNEB, 2010).

In his study conducted on behalf of the UNEB, Ogwang (2007) stipulates that the process of
examination administration is an uphill task, as it is sometimes marred by irregularities. This is
why the UNEB concedes that tracking down examination irregularities is a management feat:
that it requires a lot of additional resources, both human and monetary ones to curb
examination malpractices. Kagoro (2008) in his study conducted for the UNEB agrees with
Ogwang and contends that an examination supervisor is the overall officer responsible for the
smooth and proper conduct and supervision of examinations. He asserts that the examination
administration should ensure that the rules and regulations on the conduct and supervision of
examinations are followed.

In any case, the examination process in Uganda is tailored towards achieving high validity and
reliability of any examination. This is why effective supervision of examinations is a very crucial
element in the administration of public examinations. The officers involved in the
administration of examinations must ensure that examinations are conducted in accordance
with the laid down rules to maintain credibility of the system (UNEB, 2004).

Principles of Test Administration

The paramount guiding principle in administering any classroom test is that all examinees
should be given a fair chance to demonstrate their achievement of the learning outcomes
intended or planned. This implies that the physical and psychological environment in which the examination is taking place has to be conducive for the examinee, facilitating the achievement of the testing outcome. The factors that might interfere with the validity of the measurement also
have to be controlled. Even though the evidence regarding the effects of physical and
environmental conditions on test performance is inconclusive, examinees should be as relaxed
as possible and distractions should be eliminated or minimized. Whereas distractions during testing are known to have little effect on the scores of students, they may have a profound effect, especially on young children (Gronlund & Linn, 1990; Mehrens & Lehmann, 1999; Linn & Miller, 2005).

Another principle is that students should have positive attitudes towards a test. People are likely to perform better at any endeavor, including test taking, when they approach the experience with a positive attitude. Unfortunately, teachers frequently fail to help students develop positive attitudes toward tests. Students are not likely to perform at their best when they are excessively tense; hence the experience of test anxiety among some students (Thorndike, 1977; Mehrens & Lehmann, 1999; Linn & Miller, 2005).

It is imperative that test administrators are suitably qualified and trusted persons. This is to
ensure that tests are properly managed to obtain valid and reliable results. Test administrators
need to have the opportunity to learn their responsibilities as a prerequisite to accurate test
results (United States Department of Labour, 1999). It should also be noted that a well
prepared test is easy to administer, and the reverse is true with a poorly prepared test. It is
equally important to realize that a successful test administration exercise is a product of test
planning. Cheating is most likely to occur in a poorly planned test, thus posing a challenge to test administration (Fontana, 1995; Mehrens & Lehmann, 1999; Cottrell, 2001; Linn & Miller, 2005). Nevertheless, good test administration is paramount, irrespective of how well the test was prepared.

Good Test Administration Practices

Good testing practices rest in the hands of the examiner, who should ensure that the testing exercise runs smoothly. The period before the test, during the test and after the test should be
effectively managed to realize a highly efficient testing period.

Period before the test

Security of testing instruments:


All test materials used in the assessment process, whether paper-and-pencil or computer-based, must be kept secure. Lack of security may result in some test takers having access to test questions before the test, thus compromising the quality and invalidating their scores. To
prevent irregularities, test administrators should, for example, keep testing materials in locked
rooms or cabinets and limit access to those materials to staff involved in the assessment
process. Test security is also a responsibility of test developers to ensure the test is not
compromised over time. To maintain their security, test developers should introduce new
forms of tests periodically (Gronlund & Linn, 1990; United States Department of Labour, 1999).

Related to security of tests, testing authorities should endeavor to open cartons containing test
materials and inspect the contents to verify that appropriate test levels and quantities have
been received. After inspection of the testing materials, they should be securely stored since
examination monitors may during unannounced visits inspect these materials to ascertain the
seals have not been tampered with before the due date (Gronlund & Linn, 1990).

After securing an adequate number of tests, the following considerations should be part of
prior preparation checklist:

a.) examinees and parents have been notified regarding the test date and time.

b.) candidates have been reminded to bring materials necessary for the test.

c.) all students with special needs (e.g. glasses and hearing aids) have been considered before
the start of the test.

d.) adequate invigilation has been planned.

e.) examination administrators have read appropriate test administration procedures such as
timing, examination regulations and test modifications.

f.) the rooms where the test is to be conducted have adequate ventilation and lighting and have
been properly arranged.

g.) seats are arranged in such a way that candidates cannot look at each other's work.

h.) candidates have been thoroughly prepared for the examination by suggesting to them ways of studying, giving them practice tests like those to be used, teaching them test-taking skills and stressing the value of tests for improving learning (Gronlund & Linn, 1990; National College Testing Association, 2010).

i.) when all is set for the exam, secure the room, including posting a sign that reads "Testing in Progress, Do Not Enter".

Period during the Test

The proper preparation for examinations may not produce the desired results if the conditions during the test are mishandled. It is the cardinal duty of the test administrators or institutions to ensure that conditions during testing lead to successful testing (Gronlund & Linn, 1990).
The following are guidelines that need to be observed to ensure required conditions for
successful testing are fulfilled:

Observe precision in giving instructions or clarifications

When an examiner announces that there will be "a full three hours" to complete the test and
then talks for the first fifteen minutes, examinees feel that they are being unfairly deprived of
testing time. Besides, just before a test is no time to make assignments, admonish the class, or
introduce the next topic. In other words, examinees are mentally set for the test and will ignore
anything not pertaining to the test for fear it will hinder their recall of information needed to
answer the questions. Thus, the well-intentioned remarks fall on "deaf ears" and merely
increase anxiety toward the test and create hostility toward the teacher.

Avoid interruptions

At times, an examinee will ask to have an ambiguous item clarified, and it may be beneficial to
explain the item to the entire group at the same time. All other distractions outside and inside
the examination room should be eliminated, where possible. The challenge, however, is that
more often than not, the distractions are beyond the test administrators' reach!

Avoid giving hints to students who ask about individual items

If the item is ambiguous, it should be clarified for the entire group. If it is not ambiguous, refrain
from helping the pupil to answer it. The challenge is that at times, refraining from giving hints
to examinees who ask for help may be difficult, especially for newcomers in the field of testing.
Nevertheless, giving unfair aid to some students decreases the validity of the test results and
lowers class morale.

Discourage cheating

When there is good teacher-student rapport and students view tests as helpful rather than harmful, cheating is usually not a problem. Under other conditions, however, it might be
necessary to discourage cheating by special seating arrangements and careful supervision.
Candidates receiving unauthorized assistance from other examinees during an examination
have the same deleterious effect on validity of test results and class morale as does receiving
special hints from the teacher. We are interested in pupils doing their best; but for valid results,
their scores must be based on their own unaided efforts.

Careful proctoring of the testing session, such as periodically walking around the room and
observing how the students are doing is also of paramount importance in preventing cheating.
The obstacle is that many teachers define proctoring as "being present in the examination
room". They consequently become physically present but spend their time reading a novel,
writing a letter or marking and scoring previous tests. The best way to proctor an examination is
to observe students doing the test and not being preoccupied at one's desk (Gronlund & Linn,
1990; Mehrens & Lehmann, 1999).

Another way of discouraging cheating is discouraging students from using any form of
communication devices, either in the room where the test is being administered or while on a
supervised break, such as a bathroom visit. It would be better if students are reminded earlier
that they may not use any devices including but not limited to cellular telephones, pagers,
audiocassette players, radios, personal digital assistants, video devices, associated headphones,
headsets, microphones, or earplugs while taking an examination.

Ensure that no eating takes place in the examination hall

Students should not be allowed to bring any food items into the examination room, unless it is on
proven medical grounds. Under such circumstances, it is advisable that special arrangements
are made in advance for purposes of securing a designated area where the food items could be
kept, to avoid distracting those who do not require the food items.

Identify each examinee to prevent a situation where someone may attempt to take the
examination on someone else's behalf

Students should, therefore, be informed in advance to bring with them their identity cards
and/or examination cards (Mbarara University of Science and Technology, MUST, 2008).

Handle emergencies appropriately

If an examinee becomes ill during the examination and must leave the examination hall, they
should not be allowed to return. The test administrator is advised to make a comprehensive
report about the candidate's situation to make it possible for authorities to consider a retest for
such a candidate, to be scheduled for another time.

Inform students on progress of testing

It is the responsibility of the test supervisor or invigilator to keep the students informed of the time remaining (e.g., by writing the time left on the blackboard at 15-minute intervals) (Mehrens & Lehmann, 1999).

Period after the Test


Orderliness is needed for a successful testing process until all the test materials are securely in
the hands of the test administrators. After the completion of the examination, the following are
expected;

a) all test materials and documents, both used and unused, should be collected and accounted for. They should be kept in a secure and lockable facility.

b) count candidates' scripts to ensure that their number corresponds with the names on the examination attendance register. Counting also eliminates scenarios where the attendance register shows a student attended an examination but his or her script is not available.

Conclusions

Tests or examinations are among the components of the teaching-learning process in any public school system, and they are conducted at all levels of learning in different institutions in Nigeria,
Africa and the globe. Established examination bodies have set regulations and procedures in
the administration of the tests or examinations to individuals, whether in the classrooms or
designated settings. Just as classroom environment is vital for teaching-learning and personal
educational growth and development of the individual, so is participation in examinations.
Therefore, during test administration process;

a) All examinees should be accorded a fair chance through the provision of conducive physical
and psychological environment.

b) Candidates involved in the examination should develop positive attitudes, adhere to the
rules, and therefore conduct themselves decently during the examination.

c) The teachers' roles must be recognized because they contribute to the success of
examination or test administration.

d) To realize a smooth test administration exercise, the period before, during and after the test
should be carefully managed.

Q.3 Discuss the functions of marking and reporting students' performance at the school level. Highlight the weaknesses in our examinations at the secondary level.

The functions of marking and reporting students' performance include the following:

To increase grading and reporting consistency throughout the district.

To improve communication with parents, students, guidance counselors, other teachers, colleges, future employers, and any others who need this information.

To diagnose student weaknesses earlier and more accurately so that children can get the help they need.

To more accurately measure our students’ achievement of the Ohio Academic Content
Standards.

To ensure that our grading practices support learning.

To solve the problem that our grading practices district-wide were inconsistent.

How were the Grading Guidelines developed?

A district-wide committee, comprised of our Teacher Leaders and the members of the
Leadership Team, met to ensure that the grades earned by students are consistent, accurate,
meaningful and supportive of learning.

As a team, we read the research, articles and books and compared that to what we do based on
our teacher, parent and student input.

We outlined seven grading and reporting issues that needed our attention in order to achieve
our goal.

Why is consistency important?

The Grading Policies address certain core grading and reporting practices that need to be consistent throughout the district. Consistency in grading practices increases fairness for
children. Consistency also improves communication.

How do Grading Guidelines improve communication?

Because the guidelines provide consistency in grading, the grades students earn will mean the
same thing from teacher to teacher. For example, everyone will have a much better
understanding and agreement on what an “A” means and what it takes to earn one.

Communication is also improved because the meaning is made more specific.

Why isn’t the school system sending interim reports (for grades 3-12)?

There is far more information provided online in Progress Book than is provided in an interim report. Any parent can request a hard copy by contacting their school office. Updated grades can be obtained by checking your student’s Progress Book account.

What is the difference between Academic Practice and Academic Achievement?


If a student is learning something for the first time, or is still in the early stages of learning the
material, it is Academic Practice. The purpose of Academic Practice is not to judge a student’s
final achievement of a topic, but to evaluate where he or she is in the learning process,
diagnose any problems, and aid in getting the help needed to learn or extend the material.
Academic Practice could consist of many different types of assessments including, but not
limited to:

Some quizzes

Some homework

First drafts of writing

Teacher questions during instruction

Some worksheets

Informal observations

Pre-testing

Exit Slips

Class participation

Oral Assessment

If a student has had sufficient instruction and practice on a topic, so that it is fair to evaluate
him or her on the material, then it is Academic Achievement. The purpose of Academic
Achievement is to evaluate how well a student has learned the material. Academic
Achievement could consist of many different types of assessments including, but not limited to:

Tests (written, oral, and performance)

Some quizzes

Some homework

Writings (term papers, essays, stories, etc.)

Projects

Presentations

Performances
You do NOT distinguish between Academic Practice and Academic Achievement by the type of
assessment it is. For example, homework is NOT necessarily Academic Practice; quizzes are NOT
necessarily Academic Achievement, etc. This depends on how the information is used.

Why are Academic Practice and Academic Achievement weighted differently?

Academic Achievement has a higher weight value than Academic Practice.

Student progress is more accurately reflected when Academic Achievement is weighted more.
During Academic Practice a student is still learning the material, and it is reasonable to expect
mistakes.

In summary, there is benefit in helping develop good learning habits of practicing and studying.
Yet as an accurate measure of what a student has learned, a final grade needs to be based
primarily on work that was actually graded for correctness. The assessment needs to be given
at a time when the student has had sufficient instruction and practice to be held responsible for
the material. Therefore, Academic Achievement is weighted more heavily.
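
As a minimal sketch of how such weighting might be computed (the 20%/80% split and the scores below are assumed for illustration, not an actual district policy), a final grade can be formed as a weighted average of the two category means:

```python
# Minimal sketch of a weighted final grade. The 20/80 split between
# Academic Practice and Academic Achievement and the scores themselves
# are assumed for illustration only.

practice_scores = [70, 80, 90]    # hypothetical Academic Practice percentages
achievement_scores = [85, 95]     # hypothetical Academic Achievement percentages

PRACTICE_WEIGHT = 0.20
ACHIEVEMENT_WEIGHT = 0.80

def mean(xs):
    return sum(xs) / len(xs)

final_grade = (PRACTICE_WEIGHT * mean(practice_scores)
               + ACHIEVEMENT_WEIGHT * mean(achievement_scores))
print(round(final_grade, 1))  # 0.2 * 80 + 0.8 * 90 = 88.0
```

Under this assumed split, a weak practice average pulls the final grade down only slightly, while achievement performance dominates the result.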

The administration and professional staff devise grading systems for evaluating and recording
student progress. The records and reports of individual students are kept in a form that is
understandable to parents as well as teachers. The Board approves the grading and reporting
systems as developed by the faculty, upon recommendation of the Superintendent.

The Board recognizes that any grading system, however effective, has subjective elements.
There are fundamental principles that must guide all instructors in the assignment of marks for achievement.

1. The achievement mark in any subject should represent the most objective measurement by
the teacher of the achievement of the individual. A variety of evaluation measures are used
and accurate records are kept to substantiate the grade given.

2. An individual should not receive a failing grade unless he/she has not met stated minimum
requirements.

3. Grades are a factor used to motivate students. Poor or failing grades should trigger a variety
of instructional and intervention activities to assist the student in achieving better grades by the
next grading period, if possible.

Q.4 How can minimum competency testing be useful at lower grade levels? Explain with local examples.

Testing students for academic achievement or competency is not new. As early as the 1970s,
some states were making adequate performance on "exit examinations" a prerequisite for high
school graduation. This was done in an effort to enhance teacher quality as well as student
achievement during an era when many questions were raised by parents, educators, and the
public at large about the seeming lack of basic skills in high school graduates.

While varying and inconsistent approaches have been taken to measure student performance at the elementary school level, there is more uniformity in setting certain minimum criteria for graduation from high school. The vast majority of states require an overall accumulation of
"Carnegie units" (reflecting the number of classroom hours spent learning) in addition to
passing grades in certain core subjects. But by 2002, nearly half of all states required (or were
planning to require within the next two years) "exit exams" in addition to accumulated credit
hours in order for students to receive diplomas evidencing high school graduation.

"Exit Examinations" for High School Graduates

Following years of complaints from both employers and academic institutions of higher learning
(that many high school graduates lacked basic educational skills in reading, writing, and math),
both legislators and educators agreed to work toward raising educational standards
nationwide. This has resulted in renewed focus on learning rather than remediation and more
accountability for teachers and school systems.

Educational standards (and correlative exams) for gauging performance have been criticized in
the past for being local or parochial in substance, making grades and class standing a "relative"
achievement based only upon how well others in the same school system or state performed.
The Education Reform Act helped standardize student performance on a national level, but new
questions were raised as to whether teachers were actually enhancing learning skills or merely
"teaching to the test," (i.e., merely teaching those things they knew students would be tested
on, in order to make the school and/or the teacher appear favorable on assessment reviews).

However, questions remain as to which system is best for assessing the academic
competency of graduating students. By far the most often used tool of assessment is the
multiple-choice examination, in many cases combined with a writing sample. This, in
combination with passing grades in key subjects and a minimum number of credit units, seems
to be a growing method of choice for ensuring minimum competency levels of high school
graduates in the United States. Because graduation from high school may be dependent upon
passing an "exit exam," the process has been dubbed "high stakes testing."

Legal Authority for Setting Educational Standards


Most education reform since the 1980s has focused on "performance-based standards" which
ostensibly indicate a minimum level of academic achievement that all graduating students
should have mastered. Some important laws concerning standards-based school reform
include:

The No Child Left Behind Act, signed into law by President George W. Bush in January 2002,
refines and makes major amendments to Title I (see below). Among other factors (like
substantial flexibility for states in the use of federal funds), the new law requires states to
assess reading and math skills in students from grades three to eight on an annual basis.

The Educate America Act (20 USC 5801 et seq.) is only binding upon states that accept its grant
funding (nearly all) but sets as its primary goal the development of strategies for setting
statewide student performance standards and for assessing achievement of those standards.

Title I of the Improving America's Schools Act of 1994 (20 USC 6301 et seq.) contains an explicit
set of requirements for states to submit plans for challenging content and performance
standards and assessing student mastery of the requirements in order to receive Title I funds
(the largest federal school aid program).

The Individuals with Disabilities Education Act (IDEA), (20 USC 1400 et seq.) was substantially
amended in 1997. The Act requires that states which receive grant funds under its auspices
must develop IEPs (individual education plans) for students with disabilities or who are deemed
in need of special services. The 1997 amendments required states to develop policies and
procedures to allow students with disabilities to participate in state and district-wide testing
programs, with necessary accommodations.

Legal Challenges to Educational Testing

Courts have had numerous opportunities over the decades to pass on the validity of education
testing in conjunction with high school graduation and promotion (e.g., to the next level grade).
Most legal challenges have been grounded in the Due Process Clause and the Equal Protection
Clause of the Fourteenth Amendment to the U.S. Constitution. Challenges to testing of special
education students have invoked IDEA and Section 504 of the Rehabilitation Act of 1973.

Q.5 How may the measures of central tendency help to interpret classroom performance in the context of high and low achievers? (20)

In this workshop, you will develop the ability to identify the educational significance of statistics and to
interpret and apply useful statistics for the classroom. The accompanying video will review statistical
concepts and calculations.

This lesson will introduce the following measures of central tendency (the center points of data) and
variability (the diversity of the data).

Measures of Central Tendency

One of the most useful statistics for teachers is the center point of the data. Knowing the center point
answers such questions as, “what is the middle score?” or “which student attained the average score?”
There are three fundamental statistics that measure the central tendency of data: the mode, median,
and mean. All three provide insights into “the center” of a distribution of data points. These measures of
central tendency are defined differently because they each describe the data in a different manner and
will often reflect a different number. Each of these statistics can be a good measure of central tendency
in certain situations and an inappropriate measure in other scenarios. The next section describes each
statistic and both its educational value and its limitations.

Mode

The mode is defined as the most frequently occurring score. If the data are arranged in a frequency
distribution similar to illustration 4, then the mode is easy to identify. In illustration 4 the mode is 89.
Why is the mode 89? Because there were four students who scored an 89, and that was the largest
number of students who scored at the same level on this assessment.

The mode is easy to locate on any type of distribution curve graph, regardless of skewing. Let’s examine
several examples to further understand the concept of mode by locating it on three representative types
of graphs.

From Illustration 7: Mode of a Normal Curve

Graph of a normal curve with the mean, median and mode labeled at the highest point on the curve.

Note that the mode is located at the highest point of the graphed data. This represents the greatest
frequency of that score.

From Illustration 8: Mode of Skewed Graphs

As expected, the mode is located at the highest point on both the positively and negatively skewed
graph. Again, the highest point indicates the score with the greatest frequency. Note that the mode
moves to the left on a positively skewed distribution and to the right on a negatively skewed distribution.
Both cases are examples of non-normal data distributions. Also note that non-normal does not imply that the data is incorrect. It simply means the data would not produce a normal curve when graphed.
Let’s complicate the process by looking at the data collected from an elementary class where 14
students were given the same 10-point quiz. The frequency distribution for the class is listed in
Illustration 9.

Illustration 9: Elementary Class Frequency Distribution

Score Frequency

10 1

9 5

8 1

6 1

2 5

1 1

Notice that the frequency distribution only lists those scores that were actually attained by students, not
all the possible scores. What is the mode for illustration 9?

If you selected 9 you are correct; if you selected 2 you are also correct. This is a trick question because
the data displays two modes: both 9 and 2 are correct. However, it would be more correct to describe
the data as a “bimodal distribution of data.” Bimodal simply means that there are two modes within the
same distribution of data. In this case, because the modes are considerably far apart, the elementary
teacher likely has a class where a substantial number of the students understand the content and a
substantial number of students who do not. However, if in the same bimodal scenario, one mode was a
score of 10 and a second mode was a score of 9, then the teacher would be entitled to a victory lap
around the school parking lot.
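
A minimal Python sketch of this mode-finding logic, using the Illustration 9 data, reports every score that ties for the highest frequency, so a bimodal distribution yields two modes:

```python
# Finding all modes of the Illustration 9 data with collections.Counter.
# A distribution is bimodal when two scores tie for the highest frequency.
from collections import Counter

scores = [10] + [9] * 5 + [8] + [6] + [2] * 5 + [1]  # the 14 scores from Illustration 9

counts = Counter(scores)
top_frequency = max(counts.values())
modes = sorted(score for score, freq in counts.items() if freq == top_frequency)

print(modes)  # [2, 9] -- a bimodal distribution
```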

A bimodal graph is easy to identify. In every case, there will be two peaks in the data. The two peaks
represent the frequency that students attained those scores. Illustration 10 is a graph of the data
displayed in illustration 9. Note the two humps in the graph representing a bimodal distribution of the
data.

Illustration 10: Bimodal Distribution

A bimodal distribution of student scores with frequency peaks at 2 and 9.

Is it possible to have more than two modes? Multi-modal distributions become more common as the
amount of data gets considerably larger. The same rules for selecting the mode apply, although the
educational implications may vary. A distribution with four modes at equally spaced intervals of 90, 80,
70, and 60 on a diagnostic exam indicates a wide variety of levels of understanding. This type of
information would be useful to guide the teacher in the selection of appropriate types of activities that
should include lesson preparations that will reach all students. The teacher would prepare differently
when the four modes were clustered in the following manner: 99, 97, 45, and 38. In this scenario, the
teacher would prepare two different lesson plans for this class: one for the high achievers and one for
the lower achievers.

The determination of the mode is a useful statistic for teachers. It not only measures the central
tendency or grouping of data, but it also provides a reference point to assist teachers in understanding
the nature of the students and their needs, and then guides teachers in planning instruction that will
meet their needs.

Median

The median divides a distribution exactly in half so that 50% of the scores are at or below the median
and 50% of the scores are at or above it. It is the “middle value” in a frequency distribution. When the
number of data points is an odd number, the middle score is the median. For example, given 13 scores,
the 7th score would be the median. When the number of data points is even, like 14, then the median is
equal to the sum of the two middle scores in a frequency distribution divided by 2.

Illustration 11: Ordered Array of Unit Exam Scores (Odd number of scores)

Student Scores

50, 49, 46, 41, 34, 32, 29, 29, 29, 27, 24, 19, 12, 12, 7

Consider these scores, collected from a unit exam worth 50 points in a class of 15 students. Whenever dealing with an odd number of scores, the median is the middle number. In Illustration 11, the total number of student scores is 15, an odd number. The midpoint of 15 is the 8th score because there are 7 scores above it and 7 scores below it. The teacher then counts down or up to the 8th score to determine the midpoint, or median. In the case of Illustration 11, the median is 29. Note that, for this data set, 29 is
also the mode.

What if this teacher had a class with an even number of students? How would the median be
calculated? Illustration 12 provides an example of how to determine the median in an even numbered
class. Let’s assume that the class size is 6 and they have just completed an exam worth 50 points. The
following illustration displays their scores.

Illustration 12: Ordered Array of Unit Exam Scores (Even number of scores)

Student Score

3, 6, 12, 19, 35, 47

To determine the median of an even number of scores, we begin by adding the 2 middle numbers and dividing by 2. In this case, the numbers 12 and 19 are the middle numbers. Together they total 31. Dividing 31 by 2 gives a median of 15.5. Note that the median does not have to represent
one of the listed scores. For a teacher using an ordered array of test scores, the median locates the
middle or center grade. On a display of the normal curve the median is exactly the midpoint of the data
distribution and is located in the exact center of the graph. This is also the highest point on the curve.
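
Both rules can be checked with Python's standard statistics module; the minimal sketch below uses the Illustration 11 and Illustration 12 data:

```python
# Median of an odd-length and an even-length score list, following the
# rules above; statistics.median averages the two middle values when the
# count is even.
from statistics import median

odd_scores = [50, 49, 46, 41, 34, 32, 29, 29, 29, 27, 24, 19, 12, 12, 7]  # Illustration 11
even_scores = [3, 6, 12, 19, 35, 47]                                       # Illustration 12

print(median(odd_scores))   # 29   (the 8th of 15 ordered scores)
print(median(even_scores))  # 15.5 ((12 + 19) / 2)
```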

Would the median be affected by a skewed data distribution? Since the median represents the midpoint, skewed data pulls the midpoint away from the mode, in the direction of the tail. Illustration
13 displays how the median is influenced by a positively or negatively skewed data distribution.

Illustration 13: Median Location with Skewed Data

It is interesting to note that skewed data moves the median off of the mode, or the highest peak on the
normal curve. In skewed data, the median moves toward the direction of the skew or tail. For a
positively skewed data distribution, the median moves to the right of the mode; for negatively skewed
data, to the left. The movement of the median to the right or left of the mode indicates that a larger
than normal number of scores are located in that area.

Mean

The mean is the arithmetic average of all of the data points. It is also the most common measure of
central tendency and is the most widely understood. In fact, when most people think of average, they
are imagining the mean. The mean is easy to calculate and most people have been doing it since
elementary school. To calculate the mean, add up all of the data points and divide that result by the
total number of data points. Consider the following ordered array of test scores on a 25-point quiz from
a typical middle school class of 20 students.

Illustration 14: Ordered Array of Students’ Quiz Scores

Student Score

25, 25, 25, 24, 23, 23, 21, 21, 20, 17, 17, 15, 14, 14, 12, 9, 9, 6, 5, 5

There are 20 scores listed in the ordered array. Note that they do not have to be organized in an ordered
array to calculate the mean.

The sum of these scores is 330. To calculate the mean, divide the total of the scores (330) by the number of scores (20): 330/20 = 16.5. Observe that the mean score does not have to be represented by any of the actual scores, as no student scored a 16.5 on this assessment.
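
The same arithmetic can be verified in a short Python sketch using the Illustration 14 scores:

```python
# Mean of the Illustration 14 quiz scores: sum the scores and divide by
# the number of scores.
from statistics import mean

scores = [25, 25, 25, 24, 23, 23, 21, 21, 20, 17,
          17, 15, 14, 14, 12, 9, 9, 6, 5, 5]

print(sum(scores), len(scores))  # 330 20
print(mean(scores))              # 16.5
```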

For the teacher, it is helpful to calculate the mean to get a sense of the average score. However, the
mean has a major drawback: it is greatly influenced by extreme scores. Consider the data below in illustration 15. Assume the data points are from a single student on a series of 10-point tests.

Illustration 15: Student X’s Quiz Scores

Student Score

10, 10, 10
From the data, it is easy to calculate that the student’s mean is 10. This student undoubtedly deserves
one of the top grades in class. But let’s imagine that the child leaves on vacation and misses school for a
week. On the next exam, the student scores a 2, so the new data looks like the following:

Illustration 16: Student X’s Updated Quiz Scores

Student Score

10, 10, 10, 2

From this data the new mean is 8. An 8 is a considerable drop from the previous mean of 10. In this
case, the child has scored the highest possible grade three times and a low grade only once. However,
the extremeness of the low grade has a dramatic effect on the mean, which reduced the child’s average
by 20%.

Let’s get an idea of how many 10’s the student would have to get to move the mean back up to a 10. To
keep the calculations simple, let’s try adding 6 more scores.

Illustration 17: Student X’s 10 Quiz Scores

Student Score

10, 10, 10, 2, 10, 10, 10, 10, 10, 10

The total number of scores is 10 and the sum of the numbers is 92. Therefore, the mean is 9.2. How
might this affect the child? One score out of ten was enough to keep the child from regaining a mean
score of 10. In fact, the child could never get an average of 10 because there is no way to recoup the
mathematical effects of the low score. The mean has limitations as a statistic and this is a classic
example of the most common one. This is a teacher’s dilemma: what score does the student deserve? It
is important for teachers to remember that the mean is strongly influenced by extreme scores. At this
point it may be useful for the teacher to reference the median and mode for additional support.
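
A minimal sketch comparing the mean and the median of Student X's scores shows why the median is the more robust reference point in this situation:

```python
# How one extreme score drags the mean while the median resists it,
# using Student X's scores from Illustrations 15-17.
from statistics import mean, median

before = [10, 10, 10]
after_low_score = [10, 10, 10, 2]
after_six_more_tens = [10, 10, 10, 2] + [10] * 6

for label, scores in [("before", before),
                      ("after the low score", after_low_score),
                      ("after six more 10s", after_six_more_tens)]:
    print(f"{label}: mean = {mean(scores):.1f}, median = {median(scores)}")

# before: mean = 10.0, median = 10
# after the low score: mean = 8.0, median = 10.0
# after six more 10s: mean = 9.2, median = 10.0
```

The single extreme score moves the mean from 10 to 8 and then holds it at 9.2, while the median stays at 10 throughout.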

Using the mean as the sole source of information for determining a student’s grade may be unfair to a
student if the student’s scores contain an extremely low score. Instead, it may be a good idea to drop
the score or minimize the weight of the score. It is unwise to drop an extreme score if it is unusually
high.

Some school districts may have a policy stating that a teacher cannot fail a student by recording a score
lower than a certain grade, like 40% for example. This is to help avoid situations where a student can
never bring up their scores. When grades are deflated to a hopelessly low number, this can have very
negative effects on classroom behavior and participation.

The way that extreme scores affect the mean is apparent in illustration 18. The mean is identified in a
positively and negatively skewed data distribution as it generally relates to both the mode and the
median.
Illustration 18: Mean Values of Skewed Data

Skewed data moves the mean away from the center point of a normal curve. The more skewed the data,
the further the mean migrates to the area of the skew. The more extreme the scores, the more the
mean is affected.

Like the median, in a positively skewed frequency distribution, the mean moves to the right and the majority of the scores fall below the mean. For a frequency distribution that is negatively skewed, the mean moves to the left and the majority of the scores fall above the mean.
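
A small hypothetical data set (the scores below are assumptions chosen for illustration) makes this ordering concrete for a positive skew; note that the mean is pulled furthest toward the tail:

```python
# In a small positively skewed data set the usual ordering appears:
# mode <= median <= mean, with the mean pulled furthest toward the tail.
from statistics import mean, median, mode

positively_skewed = [1, 2, 2, 3, 3, 3, 3, 4, 4, 10]  # hypothetical scores, long right tail

print(mode(positively_skewed))    # 3
print(median(positively_skewed))  # 3.0
print(mean(positively_skewed))    # 3.5
```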

For a teacher, the use of the mean may be inappropriate. In the case where the bulk of scores are
located in one mode, and a minimum number of scores are a significant distance from the mode, the
mean average may create an arithmetic model that does not approximate the nature of the students.
Likewise, the mean of a bimodal distribution may not describe anything useful to the teacher.

When I was a student in high school, my Latin teacher created a chart with all of our names on it. Every
time she asked a question, she scored the student’s answer on a scale of 0-4, with 4 being the top grade
equal to an “A”. I remember being on a high-performing streak where I had scored several 4s in a row.
This was closely followed by a score of 2. When my score was calculated at the end of the year, the
score of 2 prevented me from reaching the “A” level in that subject. Just one score out of many
subtracted so much from my grade that I could never replace it.
