
CHAPTER 4

Of Tests and Testing


LEARNING OBJECTIVES
When you have completed this chapter, you should be able to:
• Understand the assumptions about psychological testing and
assessment
• Discuss and explain the elements of a good test
• Recognize the importance of developing norms for a standardized
test
• Distinguish between norm-referenced and criterion-referenced
evaluation
• Use the appropriate statistical tools for correlating variables.
• Understand the meaning of correlation coefficient.
• Recognize the do’s and don’ts in culturally informed assessment
SOME ASSUMPTIONS ABOUT PSYCHOLOGICAL TESTING AND ASSESSMENT

Assumption 1: Psychological Traits and States Exist


• A trait is “any distinguishable, relatively
enduring way in which one individual varies
from another.”
• States also distinguish one person from another
but are relatively less enduring.
• Samples of behavior may be obtained
in a number of ways, ranging from
direct observation to the analysis of
self-report statements or pencil-and-
paper test answers.
• Examples of traits and states:
– Intelligence
– Specific intellectual abilities
– Cognitive style
– Adjustment
– Interests
– Attitudes
– Sexual orientation and preferences
– Psychopathology
– Personality in general
– Specific personality traits
• A psychological trait exists only as a
construct: an informed, scientific
concept developed or constructed to
describe or explain behavior.
Assumption 2: Psychological Traits and
States Can Be Quantified and Measured
Assumption 3: Test-Related Behavior
Predicts Non-Test-Related Behavior
Assumption 4: Tests and Other
Measurement Techniques Have Strengths
and Weaknesses
Assumption 5: Various Sources of Error
Are Part of the Assessment Process
• Error variance is the component of a
test score attributable to sources other
than the trait or ability measured
Assumption 6: Testing and Assessment
Can Be Conducted in a Fair and Unbiased
Manner
Assumption 7: Testing and Assessment
Benefit Society
Psychometric Soundness

• Reliability – the test measures consistently, in the same way each time.
– Reliability is a necessary but not sufficient element of a
good test.
• Validity – the test measures what it purports to measure.
Other considerations:
• A good test is one that trained examiners can
administer, score, and interpret with a minimum of
difficulty.
• If the purpose of a test is to compare the
performance of the test taker with the performance
of other test takers, a good test is one that contains
adequate norms
NORMS

• Norm-referenced testing and assessment is a method of evaluation
and a way of deriving meaning from test scores by evaluating an
individual test taker's score and comparing it to the scores of a
group of test takers.
• The goal is to yield information on a test taker's standing
or ranking relative to some comparison group
of test takers.
• Norms are the test performance data of a
particular group of test takers that are designed
for use as a reference when evaluating or
interpreting individual test scores.
• A normative sample is that group of
people whose performance on a
particular test is analyzed for reference
in evaluating the performance of
individual test takers.
• Norming refers to the process of
deriving norms.
Sampling to Develop Norms

• The process of administering a test to a
representative sample of test takers for the
purpose of establishing norms is referred to as
standardization or test standardization.
• A sample of the population is a portion of the
universe of people deemed to be
representative of the whole population.
• The process of selecting the portion of
the universe deemed to be
representative of the whole population
is referred to as sampling.
Developing norms for a standardized test
• Establishing a standard set of
instructions and conditions under which
the test is given makes the test scores of
the normative sample more comparable
with the scores of future test takers
Types of Norms

A. Percentile norms are the raw data from a
test’s standardization sample converted to
percentile form.
B. Age norms (also called age-equivalent scores)
indicate the average performance of
different samples of test takers who were at
various ages at the time the test was
administered.
C. Grade norms – are designed to indicate
the average test performance of test
takers in a given school grade.
- They are developed by administering the
test to representative samples of
children over a range of consecutive
grade levels.
- The primary use of grade norms is as a
convenient, readily understandable gauge of
how one student’s performance compares
with that of fellow students in the same
grade.
- Drawback: they are useful only with respect to
years and months of schooling completed.
D. National norms - are derived from a
normative sample that was nationally
representative of the population at the
time the norming study was conducted.
E. National anchor norms – provide some
stability to test scores by anchoring them
to other test scores.
F. Subgroup norms - result from segmenting
the normative sample by any of the criteria
initially used to select subjects for it.
G. Local norms - provide normative
information with respect to the local
population’s performance on some
test.
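As a rough illustration of percentile norms (item A above), the sketch below converts a raw score to a percentile rank against a standardization sample. The function name `percentile_rank`, the made-up sample, and the convention of counting half of any tied scores are all assumptions for illustration; published tests may define percentiles differently.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(raw_score, norm_scores):
    """Percentage of the normative sample scoring below raw_score.

    Assumption: half of the scores tied with raw_score are counted
    as "below" (one common convention, not the only one).
    """
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, raw_score)           # scores strictly lower
    ties = bisect_right(ordered, raw_score) - below   # scores exactly equal
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical standardization sample of 10 raw scores (invented data).
norm_sample = [12, 15, 15, 18, 20, 21, 23, 25, 27, 30]

print(percentile_rank(20, norm_sample))  # 45.0
print(percentile_rank(30, norm_sample))  # 95.0
```

A raw score of 20 means little by itself; the percentile rank gives it meaning relative to the normative sample.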
Fixed Reference Group Scoring Systems
• In a fixed reference group scoring system,
the distribution of scores obtained on
the test from one group of test takers
(the fixed reference group) is
used as the basis for the calculation of
test scores for future administrations of the test.
Norm-Referenced versus Criterion-Referenced Evaluation

• One way to derive meaning from a test
score is to evaluate it in
relation to other scores on the same
test. This approach is norm-referenced.
• A criterion is a standard on which a judgment or
decision may be based.
• Criterion-referenced testing and assessment is a
method of evaluation and a way of deriving
meaning from test scores by evaluating an
individual’s score with reference to a set standard.
– It is also called domain- or content-referenced testing and
assessment.
Norm-Referenced VS Criterion-Referenced

• The two approaches differ in the area of focus regarding test results.
• Criterion-referenced (C-R) tests are also referred to as mastery tests.
• Norm-Referenced: a usual area of focus is how an individual
performed relative to other people who took the test.
• Criterion-Referenced: a usual area of focus is the test taker’s
performance: what the test taker can or cannot do; what the
test taker has or has not learned; whether the test taker does
or does not meet specified criteria for inclusion in some group,
access to certain privileges, and so forth.
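The contrast above can be sketched in a few lines. This is only an illustration under stated assumptions: the comparison group, the cut score of 85, and both function names are hypothetical, and a z score is used here as just one way of expressing standing relative to a group.

```python
from statistics import mean, pstdev


def norm_referenced(score, group_scores):
    """Standing relative to a comparison group, expressed as a z score."""
    return (score - mean(group_scores)) / pstdev(group_scores)


def criterion_referenced(score, cut_score):
    """Mastery decision: does the score meet the set standard?"""
    return score >= cut_score


# Hypothetical comparison group and cut score (invented for illustration).
group = [60, 65, 70, 75, 80]
cut = 85

z = norm_referenced(80, group)            # about 1.41 SD above the group mean
mastered = criterion_referenced(80, cut)  # False: below the set standard
print(round(z, 2), mastered)
```

The same score of 80 is the best in this group (norm-referenced view) yet still falls short of the standard (criterion-referenced view).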
CORRELATION AND INFERENCE

• A coefficient of correlation (or correlation
coefficient) is a number that provides us
with an index of the strength of the
relationship between two things.
• Correlation is an expression of the degree
and direction of correspondence between
two things.
The Pearson r
• Also known as the Pearson correlation
coefficient and the Pearson product-moment
coefficient of correlation.
• Devised by Karl Pearson
• r can be the statistical tool of choice when the
relationship between the variables is linear and
when the two variables being correlated are
continuous
• The formula used to calculate a Pearson r from raw
scores is:
r = Σ(X − X̄)(Y − Ȳ) / √( [Σ(X − X̄)²] [Σ(Y − Ȳ)²] )
• The coefficient of determination (r²) is an
indication of how much variance is shared
by the X- and the Y-variables. Multiplying r² by 100
expresses it as a percentage.
• The remaining variance could presumably
be accounted for by chance, error, or
otherwise unmeasured or unexplainable
factors.
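A minimal sketch of computing a Pearson r and the coefficient of determination from raw scores, using the standard deviation-score form of the formula. The function name `pearson_r` and the data (hours studied and quiz scores) are invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation from paired raw scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired measurements (invented data).
hours = [1, 2, 3, 4, 5]
quiz = [2, 4, 5, 4, 5]

r = pearson_r(hours, quiz)
shared = r ** 2 * 100  # coefficient of determination as a percentage
print(round(r, 3), round(shared, 1))  # 0.775 60.0
```

Here about 60% of the variance is shared by the two variables; the remaining 40% is left to chance, error, or unmeasured factors.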
The Spearman Rho

• Also called a rank-order correlation coefficient or a
rank-difference correlation coefficient
• Developed by Charles Spearman
• This coefficient of correlation is frequently used
when the sample size is small (fewer than 30
pairs of measurements) and especially when
both sets of measurements are in
ordinal (or rank-order) form.
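The slides do not give a formula for rho. A minimal sketch, assuming the standard rank-difference formula rho = 1 − 6Σd² / (n(n² − 1)), which holds only when there are no tied values. The function name `spearman_rho` and the two judges' rankings are hypothetical.

```python
def spearman_rho(xs, ys):
    """Spearman rank-difference correlation (assumes no tied values)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    if len(set(xs)) != n or len(set(ys)) != n:
        raise ValueError("this simple formula does not handle ties")

    def ranks(values):
        # Rank 1 goes to the smallest value, rank n to the largest.
        ordered = sorted(values)
        return [ordered.index(v) + 1 for v in values]

    # d is the difference between the paired ranks.
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical rankings of five essays by two judges (invented data).
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

print(spearman_rho(judge_a, judge_b))  # 0.8
```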
Graphic Representations of Correlation

• One type of graphic representation of
correlation is referred to by many
names, including a bivariate
distribution, a scatter diagram, a
scattergram, or (our favorite) a
scatterplot.
• Scatterplot is a simple graphing of the coordinate
points for values of the X -variable (placed along
the graph’s horizontal axis) and the Y -variable
(placed along the graph’s vertical axis).
– Scatterplots are useful because they provide a
quick indication of the direction and
magnitude of the relationship, if any, between
the two variables.
• To distinguish positive from negative
correlations, note the direction of the
curve.
• To estimate the strength or magnitude
of the correlation, note the degree to
which the points form a straight line.
• Scatterplots are useful in revealing the
presence of curvilinearity in a relationship.
• Curvilinearity refers to an “eyeball gauge” of
how curved a graph is.
• An outlier is an extremely atypical point
located at a relatively long distance from the
rest of the coordinate points in a scatterplot
Regression

• Regression “is the analysis of relationships among
variables for the purpose of understanding
how one variable may predict another.”
• Simple regression involves one independent
variable (X), typically referred to as the
predictor variable, and one dependent variable
(Y), typically referred to as the outcome
variable
• Simple regression analysis results in an
equation for a regression line.
• The regression line is the line of best
fit: the straight line that comes closest
to the greatest number of points on the
scatterplot of X and Y.
• Multiple regression uses more than one predictor variable.
– A multiple regression equation takes
into account the intercorrelations
among all the variables involved.
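A minimal sketch of simple regression, assuming the line of best fit is found by ordinary least squares (the usual formal criterion; the slides describe it only informally). The function names `fit_line` and `predict` and the data are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line Y' = a + bX."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the predictor has no variance")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(x, intercept, slope):
    """Predicted outcome (Y) for a given predictor value (X)."""
    return intercept + slope * x


# Hypothetical predictor (hours studied) and outcome (quiz score).
hours = [1, 2, 3, 4, 5]
quiz = [2, 4, 5, 4, 5]

a, b = fit_line(hours, quiz)
print(round(a, 2), round(b, 2))      # 2.2 0.6
print(round(predict(6, a, b), 2))    # 5.8
```

The resulting equation, Y' = 2.2 + 0.6X, is the regression line for these data; multiple regression extends the same idea to several predictors at once.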
INFERENCE FROM MEASUREMENT

• Meta-Analysis
– Meta-analysis refers to a family of
techniques used to statistically
combine information across studies
to produce single estimates of the
statistics being studied.
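One of the simplest techniques in this family is a sample-size-weighted average of the effect sizes reported by each study. The sketch below assumes that approach; the function name `weighted_mean_effect` and the three studies are hypothetical, and published meta-analyses typically use more refined weighting (for example, inverse-variance weights).

```python
def weighted_mean_effect(effects, sample_sizes):
    """Combine study-level estimates into one, weighting by sample size."""
    if len(effects) != len(sample_sizes) or not effects:
        raise ValueError("need one sample size per effect estimate")
    total_n = sum(sample_sizes)
    return sum(e * n for e, n in zip(effects, sample_sizes)) / total_n


# Hypothetical correlations from three studies and their sample sizes.
study_rs = [0.30, 0.50, 0.40]
study_ns = [100, 50, 50]

print(round(weighted_mean_effect(study_rs, study_ns), 3))  # 0.375
```

The larger study pulls the combined estimate toward its own value, which is the point of weighting.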
Culture and Inference

Culturally Informed Assessment: Some “Do’s” and “Don’ts”

• Do: Be aware of the cultural assumptions on which a test is based.
– Do not: Take for granted that a test is based on assumptions that impact all groups in much the same way.

• Do: Consider consulting with members of particular cultural communities regarding the appropriateness of particular assessment techniques, tests, or test items.
– Do not: Take for granted that members of all cultural communities will automatically deem particular techniques, tests, or test items appropriate for use.

• Do: Strive to incorporate assessment methods that complement the worldview and lifestyle of assessees who come from a specific cultural and linguistic population.
– Do not: Take a “one-size-fits-all” view of assessment when it comes to evaluation of persons from various cultural and linguistic populations.

• Do: Be knowledgeable about the many alternative tests or measurement procedures that may be used to fulfill the assessment objectives.
– Do not: Select tests or other tools of assessment with little or no regard for the extent to which such tools are appropriate for use with the assessees.

• Do: Be aware of equivalence issues across cultures, including equivalence of language used and the constructs measured.
– Do not: Simply assume that a test that has been translated into another language is automatically equivalent in every way to the original.

• Do: Score, interpret, and analyze assessment data in its cultural context with due consideration of cultural hypotheses as possible explanations for findings.
– Do not: Score, interpret, and analyze assessment in a cultural vacuum.
Thank you
