
Standardized tests

Introduction
A test is a device or technique used to measure the performance, skill level, or knowledge of a learner on a specific subject matter. A test is a systematic procedure for observing persons and describing them with either a numerical scale or a category system; thus a test may give either qualitative or quantitative results. A test commonly refers to a set of items or questions administered under specific conditions. A series of questions, problems, or physical responses designed to determine knowledge, intelligence, or ability is called a test.
Purposes of test

• Tests are conducted to find out whether the objectives set for a particular course, lesson, or topic have been achieved.
• Tests enable the teacher to determine the progress made by the students in the class.
• Tests help teachers to determine what students have or have not learned in the class.
• Tests are used to place students/candidates into a particular class, school, level, or employment.
• Tests can reveal the problems or difficulty areas of a learner.
• Tests are used to predict outcomes: they help to predict whether or not a learner will be able to do a certain job or task, use language to study at a university, or perform well in a particular school, college, or university.
Characteristics of a good test

• A good test should be valid.
• A good test should be reliable.
• A good test must be capable of accurately measuring the academic ability of the learner.
• A good test should combine both discrete-point and integrative test procedures for a fuller representation of teaching-learning points.
• A good test must represent teaching-learning objectives and goals.
• Test material must be properly and systematically selected.
• Variety is also a characteristic of a good test.
• A good test should be objective.
• A good test should be comprehensive.
• A good test should discriminate between learners of different ability levels.
Validity and reliability of a test
Validity essentially means establishing that the marking measures what it is supposed to measure. This is a difficult principle, especially when assessing higher-order skills such as critical thinking, formulating, modeling, and solving problems in written work, which is why markers sometimes focus on lower-order skills such as referencing, grammar, and spelling. In science and practitioner disciplines where competencies are essential, validity may be established through competency models, but there are also competencies which are hard to quantify (Knight, 2007). Validity refers to how well a test measures what it is supposed to measure:

• Face validity ascertains that the measure appears to be assessing the intended construct
under study. An expert panelist can easily assess face validity. This is not a very ‘scientific’
type of validity.
• Construct validity is used to ensure that the measure is actually measuring what it is intended
to measure (i.e., the construct), and not other variables. Using a panel of experts familiar with
the construct is a way in which this type of validity can be assessed. The experts can examine
the items and decide what each specific item is intended to measure. Students can be involved
in this process to obtain their feedback.
• Content validity ensures that the measure covers the broad range of areas within the
concept under study. Not everything can be covered, so items need to be sampled from all of
the domains. This may need to be completed using a panel of ‘experts’ to ensure that the
content area is adequately sampled. Additionally, a panel can help limit ‘expert’ bias (i.e., a
test reflecting what an individual personally feels are the most important or relevant areas).
Reliability of a Test
Reliability is the degree to which an assessment tool produces stable and consistent results.
Types of reliability:
• Test-retest reliability is a measure of reliability obtained by administering the same test twice, over a period of time, to a group of individuals. The scores from Time 1 and Time 2 can then be correlated in order to evaluate the test for stability over time.
Example: A test designed to assess student learning in psychology could be given to a group of students twice, with the second administration perhaps coming a week after the first. The obtained correlation coefficient would indicate the stability of the scores.
Statistical calculation (test-retest method)
The procedure for calculating the test-retest reliability (stability) of a research instrument involves the following steps:
• Administer the research instrument to a sample of participants on two different occasions.
• Compare the scores obtained on the two occasions and compute the correlation coefficient using Formula 9.1.
The correlation coefficient reveals the magnitude and direction of the relationship between scores generated by a research instrument on two separate occasions.
Interpretation of results: The correlation coefficient ranges from -1.00 through 0.0 to +1.00, and the results are interpreted as follows:
• A score of +1.00 indicates perfect reliability.
• A score of zero indicates no reliability.
• A score above 0.70 indicates an acceptable level of reliability of a tool.
Formula 9.1: Karl Pearson’s correlation coefficient formula for estimation of reliability:

r = [n(Σxy) − (Σx)(Σy)] / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}

In the above formula, r = correlation coefficient, n = number of pairs of scores, Σxy = sum of the products of paired scores, Σx = sum of x scores, Σy = sum of y scores, Σx² = sum of squared x scores, and Σy² = sum of squared y scores.
Example: A questionnaire is developed by an investigator to assess the knowledge of nurses about the Glasgow Coma Scale. The test and retest scores of the tool, tried out on 10 nurses, are given in Table 9.11. The reliability score of this questionnaire is computed using Karl Pearson’s correlation coefficient formula, as described below:
• Parallel forms reliability is a measure of reliability obtained by administering different versions of
an assessment tool (both versions must contain items that probe the same construct, skill, knowledge
base, etc.) to the same group of individuals. The scores from the two versions can then be correlated in
order to evaluate the consistency of results across alternate versions.
Example: If you wanted to evaluate the reliability of a critical thinking assessment, you might create a
large set of items that all pertain to critical thinking and then randomly split the questions up into two
sets, which would represent the parallel forms.
Inter-rater reliability is a measure of reliability used to assess the degree to which different judges or raters agree in their assessment decisions. Inter-rater reliability is useful because human observers will not necessarily interpret answers the same way; raters may disagree as to how well certain responses or materials demonstrate knowledge of the construct or skill being assessed.
Example:
A rating scale is developed to assess the cleanliness of the bone marrow transplantation unit; this rating scale may be administered by two different observers simultaneously but independently to observe the cleanliness of the bone marrow transplantation unit.
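The degree of agreement between two such observers can be quantified with a chance-corrected statistic such as Cohen's kappa (a common inter-rater statistic, though not named in the text above). A minimal sketch with hypothetical cleanliness ratings:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance.
    kappa = (p_observed - p_expected) / (1 - p_expected)"""
    n = len(rater_a)
    # Observed proportion of agreement
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Two observers independently rating unit cleanliness (hypothetical data)
obs1 = ["clean", "clean", "dirty", "clean", "dirty", "clean", "clean", "dirty"]
obs2 = ["clean", "clean", "dirty", "dirty", "dirty", "clean", "clean", "clean"]

print(f"Cohen's kappa = {cohens_kappa(obs1, obs2):.2f}")
```

Kappa of 1.0 means perfect agreement; 0 means agreement no better than chance.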

• Internal consistency reliability is a measure of reliability used to evaluate the degree to which
different test items that probe the same construct produce similar results.
a. Average inter-item correlation is a subtype of internal consistency reliability. It is obtained
by taking all of the items on a test that probe the same construct (e.g., reading
comprehension), determining the correlation coefficient for each pair of items, and finally
taking the average of all of these correlation coefficients. This final step yields the average
inter-item correlation.
b. Split-half reliability is another subtype of internal consistency reliability. The process of
obtaining split-half reliability begins by ‘splitting in half’ all items of a test that are intended
to probe the same area of knowledge (e.g., the human body) in order to form two ‘sets’ of items.
The entire test is administered to a group of individuals, the total score for each ‘set’ is
computed, and finally the split-half reliability is obtained by determining the correlation
between the two total ‘set’ scores.
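The split-half procedure can be sketched as follows. The Spearman-Brown correction applied at the end (a standard step, though not mentioned above) adjusts the half-test correlation up to an estimate for the full-length test; the item responses are hypothetical:

```python
import math

def pearson_r(x, y):
    # Karl Pearson's correlation coefficient (as in Formula 9.1)
    n = len(x)
    sx, sy = sum(x), sum(y)
    num = n * sum(a * b for a, b in zip(x, y)) - sx * sy
    den = math.sqrt((n * sum(a * a for a in x) - sx ** 2) *
                    (n * sum(b * b for b in y) - sy ** 2))
    return num / den

def split_half_reliability(item_scores):
    """item_scores: one row per examinee, one 0/1 column per item.
    Split the items into odd/even halves, correlate the half totals,
    then apply the Spearman-Brown correction."""
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    return (2 * r_half) / (1 + r_half)  # Spearman-Brown prophecy formula

# Hypothetical 0/1 item responses for 6 students on an 8-item test
scores = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 1, 0],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(f"Split-half reliability = {split_half_reliability(scores):.2f}")
```

An odd/even split is used here rather than first-half/second-half, so that item difficulty and fatigue effects are spread evenly across the two ‘sets’.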
Tests can be standardized or non-standardized.
Standardized tests are tests characterized by uniformity and equality in scoring, administering, and interpreting the results, e.g., any examination in which the same test is given in the same manner to all the students.
A non-standardized test is one that allows for an assessment of an individual’s abilities or performances, but does not allow for a fair comparison of one student to another.
Difference between standardized and non-standardized (teacher-made) tests

Learning outcomes and content measured:
• Standardized test: measures basic skills and complex learning outcomes.
• Teacher-made test: tends to neglect complex learning outcomes; well adapted to the outcomes and content of the local curriculum.

Quality of test items:
• Standardized test: written by specialists; general quality of items is high; items selected on the basis of effectiveness.
• Teacher-made test: prepared by the teacher; quality typically lower than a standardized test.

Frequency of occurrence:
• Standardized test: used occasionally, although their variety and their “high-stakes” nature may make them feel dominant.
• Teacher-made test: frequently used in the classroom to evaluate the students’ performances.

Reliability:
• Standardized test: high.
• Teacher-made test: usually unknown.

Administration and scoring:
• Standardized test: standardized procedure; specific instructions provided.
• Teacher-made test: flexible approach; uniform procedure possible.

Norms for comparison:
• Standardized test: allows wider comparison of achievement.
• Teacher-made test: teachers have only their own students for comparison, and this is mostly informal.

Standardized test
Introduction
Standardized means uniformity of procedure in scoring, administering, and interpreting the results. A standardized test is a test that is administered and scored in a consistent, or “standard”, manner. Standardized tests are designed in such a way that the questions, conditions for administering, scoring procedures, and interpretations are consistent and are administered and scored in a predetermined, standard manner. Assessment devices are instruments used to determine both how well a student has learned covered material and/or how well he or she will do in future endeavors. Assessment can be accomplished through tests, homework, seatwork, etc. Most formal assessments that are used to assign grades and/or for selection purposes or predictions involve tests. A test is a systematic method for measuring students’ behaviors and evaluating these behaviors against standards and norms. Tests can be standardized or teacher-made.
Definition
Standardized tests are instruments that measure and predict ability/aptitude and achievement. Such tests are:

• Normed on an appropriate reference group (e.g., a group of people similar to those that the test will be used with).
• Always administered, scored, and interpreted in the same way.

A standardized test is a test that is administered and scored in a consistent, or standard, manner. Standardized tests are designed in such a way that the questions, conditions for administering, scoring procedures, and interpretations are consistent and are administered and scored in a predetermined, standard manner.
Purposes of standardized tests

• Diagnose students’ strengths and weaknesses
• Provide information for planning and instruction
• Provide information about students’ progress and program placement
• Contribute to accountability
• Help in program evaluation.
Characteristics

• Constructed by test experts or specialists.
• Covers broad or wide areas of objectives and content.
• Items are selected very carefully, and the validity, reliability, and usefulness of the test are ascertained in a systematic way.
• The procedure of administration is standardized.
• The test has clear directions and is motivating and encouraging for students.
• A scoring key is provided.
• The test manual provides norms for the test.
• The test content and format are fixed.
• Specific directions are given for administering and scoring the test.
• It consists of standard content and procedures.
• It provides a standardized frame of reference for determining individual performance.
Types of standardized tests
1. Norm-referenced test: a test that compares an individual’s performance with the performance of others.
2. Criterion-referenced test: a test that compares a person’s performance to a set of objectives; anyone who meets the criterion can get a high score.
Forms of standardized test

• Achievement test
• Psychological test
Achievement test :
Introduction
An achievement test is an important tool in school evaluation and has great significance in measuring instructional progress and the progress of the students in the subject area. Accurate achievement data are very important for planning curriculum and instruction and for program evaluation.
Definition
A systematic procedure for determining the amount a student has learned through instruction.
It is a type of ability test that describes what a person has learned to do.
An achievement test is designed/used as a sampling of skills or abilities in a specified area of knowledge.
Standardized achievement tests may assess any or all of reading, mathematics, and written language, as well as subject areas such as science and social studies, e.g., reading tests, mathematics tests, social studies tests.
Classification of achievement tests

Achievement tests may be oral, written, or performance tests. Written tests are further classified as follows:

• Essay tests
  - Extended response type (e.g., short essay)
  - Restricted response type (e.g., short answer, very short answer)
• Objective tests
  - Supply type: completion items, short-answer items
  - Selective type: true-false items, multiple choice, matching type, multiple response, assertion-reason, and interpretive item types
Psychological test
Introduction
Psychological tests are standard measures devised to assess behaviour objectively, used by psychologists to help people make decisions about their lives and understand more about themselves. A psychological test is done to assess the psychological condition of an individual. Psychological assessment is a process that involves the integration of information from multiple sources, such as tests of normal and abnormal personality, tests of ability or intelligence, tests of interests or attitudes, as well as information from personal interviews. Psychological testing of patients is ideally conducted by a clinical psychologist who has been trained in the administration, scoring, and interpretation of these procedures.
Definition: A psychological test is an instrument designed to describe and measure a sample of certain aspects of human behavior.
Psychological tests yield objective and standardized descriptions of behavior, quantified by numerical scores.
Types of psychological tests
Intelligence test
Intelligence is the global capacity of an individual to think rationally, to act purposefully, and to deal effectively with the environment.
Intelligence tests are psychological tests that are designed to measure functions such as reasoning, comprehension, and judgment.
Examples of intelligence tests

• Wechsler Adult Intelligence Scale (WAIS)
• Stanford-Binet Scale
Aptitude test
According to Warren, “aptitude is a set of characteristics symptomatic of an individual’s ability to acquire, with training, some specific field of knowledge, skill, or set of responses.”
Examples of aptitude tests

• Verbal reasoning measures
• Numerical reasoning

Uses of aptitude tests

• Determining student performance
• Determining an individual’s knowledge
• Selecting the employee who is best suited for a job
• Career tests: aptitude-based tests are regarded as one of the most useful varieties of career tests, e.g., CET
• Guidance: these can be used for guiding a candidate toward a suitable course or training.

Personality tests
Personality is the sum of activities that can be discovered by actual observation over a period of time to give reliable information.
Personality tests attempt to measure personality traits, states, types, and other aspects of personality (such as self-concept). Examples include emotional intelligence tests, self-concept inventories, the Big Five Inventory, the Keirsey Temperament Sorter, etc.
Examples:

• Minnesota Multiphasic Personality Inventory
• The 16 PF
Construction of a standardized test
It includes the following steps:
1. Planning: involves

• Fixing the objectives/purposes
• Determining the weightage given to different instructional objectives
• Determining the weightage given to different content areas
• Determining the item types to be included
• Preparing the table of specifications or blueprint
• Taking decisions about mechanical aspects such as time duration, test size, total marks, printing, size of letters, etc.
• Giving instructions for scoring the test and its administration procedure
• Fixing the weightage given to different categories of difficulty level of the questions
2. Preparing the test:

• Write test items according to the rules of construction for the type(s) chosen.
• Select the items to be included in the test according to the table of specifications.
• Review and edit items according to guidelines.
• Arrange items.
• Decide on the method of scoring.
3. Trying out the test:
Once the test is prepared, it is time to confirm the validity, reliability, and usability of the test. Try-out helps us to identify defective and ambiguous items, to determine the difficulty level of the test, and to determine the discriminating power of the items.
Try-out involves two important functions:
(a) Administration of the test.
(b) Scoring the test.
4. Evaluating the test:
Evaluating the test is the most important step in the test-construction process. Evaluation is necessary to determine the quality of the test and the quality of the responses. Quality of the test implies how good and dependable the test is (validity and reliability). Quality of the responses means identifying which items are misfits in the test. Evaluation also enables us to assess the usability of the test in a general classroom situation.
Evaluating the test involves the following functions:
(a) Item analysis.
(b) Determining the validity of the test.
(c) Determining the reliability of the test.
(d) Determining the usability of the test.
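Step (a), item analysis, is commonly carried out by computing a difficulty index and a discrimination index for each item. The sketch below uses the conventional upper/lower 27% groups (an assumption, since the text does not specify a method) and hypothetical 0/1 responses:

```python
def item_analysis(item_scores, proportion=0.27):
    """item_scores: one 0/1 row per examinee, one column per item.
    Returns (difficulty, discrimination) per item:
      difficulty P = proportion of all examinees answering correctly,
      discrimination D = P(upper group) - P(lower group),
    where upper/lower groups are the top/bottom 27% by total score."""
    ranked = sorted(item_scores, key=sum, reverse=True)
    k = max(1, round(len(ranked) * proportion))
    upper, lower = ranked[:k], ranked[-k:]
    results = []
    for i in range(len(item_scores[0])):
        p = sum(row[i] for row in item_scores) / len(item_scores)
        d = (sum(row[i] for row in upper) - sum(row[i] for row in lower)) / k
        results.append((round(p, 2), round(d, 2)))
    return results

# Hypothetical responses: 10 students x 4 items
scores = [
    [1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 1], [1, 1, 1, 0], [1, 0, 1, 0],
    [1, 0, 0, 1], [0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0],
]
for i, (p, d) in enumerate(item_analysis(scores), start=1):
    print(f"Item {i}: difficulty P = {p}, discrimination D = {d}")
```

Items with very high or very low P are too easy or too hard, and items with low or negative D fail to separate strong from weak students; both are candidates for revision or removal.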
Advantages of standardized tests

• Practical tests
• Standardized tests are scored via computer
• Not subject to bias or emotions
• Provide a longitudinal report of student progress; teachers can see growth and decline
• Fast to mark
• They are reliable and valid
• Allow for the discovery of talented teachers who prepare students effectively for standardized tests
Disadvantages
• Standardized tests do not evaluate higher-level thinking skills
• Are not aligned with typical classroom skills and behaviors
• Do not support multiple intelligence theory
• Create pressure and anxiety among students
• Subjectivity in scoring
Bibliography
BOOKS
Jaspreet Kaur Sodhi, Nursing Education, 2nd edition, Jaypee Publications, 2017, pp. 188-193.
Nima Bhaskar, Textbook of Nursing Education, 2nd edition, EMMESS Medical Publishers, pp. 244-246.
INTERNET:
http://www.slideshare.net/pokray/standardized-testing-53809235?from_m_app=android .
https://www.scribd.com/presentation/460345036/standardized-and-non-standardized-test-pptxs.
http://www.slideshare.net/romasmart/standardized-test-239131886?from_m_app=android.
https://www.yourarticlelibrary.com/education/test/top-4-steps-for-constructing-a-test/64781.
https://researchmethod.net/split-half-reliability/.
https://definepedia.in/2023/08/split-half-method.html.
https://www.statology.org/inter-rater-reliability/.
