
UNIT-3: VALIDITY OF TESTS

3.1: Validity: Meaning

Validity is the extent to which a test measures what it is supposed to measure. To be valid, a test must be reliable; but reliability does not guarantee validity, i.e. it is possible to have a highly reliable test that is nevertheless meaningless (invalid).

In quantitative research, you have to consider both the reliability and the validity of your methods and measurements.

Validity tells you how accurately a method measures something. If a method measures
what it claims to measure, and the results closely correspond to real-world values, then
it can be considered valid. There are four main types of validity:

3.2: Types: Content, criterion and construct

There are different types of validity, depending both on what the scores measure and on the purpose of the measurement. The various types of validity may be classified as given below:

1) Content Validity: Is the test fully representative of what it aims to measure?

This type of validity is concerned with measuring attainment during a course, according to a prescribed syllabus, for assessing the achievement of the candidate. It is therefore the validity most relevant to achievement or attainment tests. To increase content validity, the items of such a test should cover both the content and the objectives of the course. For this purpose, the content and objectives may be analysed by consulting teachers, textbooks, and previous question papers, and a kind of 'blue-print' is prepared.

Content validity refers to the extent to which a test measures your definition of the construct or behavior of interest:

• Does a physical test measure your knowledge of psychology?
• Does the Psychology exam measure your knowledge of psychology?
• Does a physical test measure how athletic you are?

2) Criterion Validity: Do the results accurately measure the concrete outcome they are designed to measure? Criterion validity concerns the relationship between scores on a test and actual performance.

Criterion validity consists of predictive and concurrent validity.


Predictive Validity: “Does the test predict an individual’s performance in specific abilities?”
It refers to the function of a test in predicting a particular behavior or trait, assessing those attributes that are likely to predict success in a course or job. A test with predictive validity should predict the success for which its scores are used as predictors; this is checked by correlating the test scores with measures of performance obtained after the course, or on the job, over a specified period.

This type of validity is estimated statistically against criterion measures of success, such as scores in some later examination or ratings by superiors. Note that the criterion measures are not immediately available; one has to wait for them.

Concurrent Validity: “Does the measure relate to other signs of the construct (the subject of study) the test is supposed to be measuring?”
In many situations, new tools are developed to replace those already in use for collecting information. For example, a series of clinical tests in medical laboratories may be needed to determine whether a person is neurotic. One may try to shorten this process by producing an objective-type test, with a number of questions, that determines the same thing.
The validity of the new test is then determined statistically by correlating its results with the assessments made by the established procedure at almost the same time. In this case there is no need to wait, as there is with 'predictive' validity.

3) Construct Validity: Does the test measure the concept that it’s intended to measure?
Construct validity refers to the ability of a measurement tool (e.g., a survey, test, etc.) to actually measure the psychological concept being studied.

Construct validity concerns abstract psychological constructs (complex concepts) such as intelligence, creativity, anxiety, and motivation. Because these are so ill-defined, no 'blue-print' can be prepared for the construction of their tests.

For example, if we want to know our height we would use a tape measure and not a
bathroom scale because all height measurements are expressed in inches and not in
pounds.

Construct validity is the most important kind of validity: if a measure has construct validity, it measures what it purports to measure.
Establishing construct validity is a long and complex process.
The various qualities that contribute to construct validity include:
• Criterion validity (includes predictive and concurrent)
• Convergent validity
• Discriminant validity

To create a measure with construct validity, first define the domain of interest (i.e. what is to be measured), then design measurement items that adequately cover that domain (topic). Finally, a scientific process of rigorously testing and modifying the measure is undertaken.

3.3: Convergent and discriminant validity

Convergent Validity: It is important to know whether a test gives results similar to those of other tests that purport to measure the same or related constructs. For example:

• Observation of behaviour (criterion) can be compared with self-report scores (measure).
• Trained interviewer ratings (criterion) can be compared with self-report scores (measure).

Discriminant Validity: shows that a measure does not measure what it is not meant to measure, i.e. it discriminates.
For example, discriminant validity would be shown by a low correlation between a quantitative reasoning test and scores on a reading comprehension test, since reading ability is an irrelevant variable in a test designed to measure quantitative reasoning.
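As a minimal sketch of the two checks above, the Pearson correlation of a new test with an established measure of the same construct should be high (convergent), while its correlation with a test of an unrelated construct should be near zero (discriminant). The scale names and all score values below are hypothetical illustration data, not real test norms:

```python
# Sketch: convergent vs. discriminant validity via Pearson's r.
# All scores below are hypothetical illustration values.

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

new_anxiety_scale = [12, 18, 9, 22, 15, 20, 11, 17]   # newly built measure
old_anxiety_scale = [14, 19, 10, 24, 16, 21, 12, 18]  # established measure, same construct
reading_test      = [50, 55, 48, 52, 60, 45, 53, 49]  # unrelated construct

print(f"convergent   r = {pearson_r(new_anxiety_scale, old_anxiety_scale):.2f}")  # high
print(f"discriminant r = {pearson_r(new_anxiety_scale, reading_test):.2f}")       # near zero
```

A high first correlation and a near-zero second correlation together support the construct validity of the new scale.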

4) Face validity: Does the content of the test appear to be suitable to its aims?

Face validity is the least important aspect of validity, because validity still needs to be checked directly through other methods. All face validity means is that, on the face of it, the test seems to measure what it is intended to measure.
But we need to be wary of any measure that is supposed to measure one thing yet seems to measure something different; e.g. in political polls, a politician’s current popularity is not necessarily a valid indicator of who is going to win an election.

3.4: Validity: Statistical calculation method

To evaluate criterion validity, you calculate the correlation between the results of
your measurement and the results of the criterion measurement. If there is a high
correlation, this gives a good indication that your test is measuring what it intends to
measure.

Tests of Correlation: The validity of a test is measured by the strength of association, or correlation, between the results obtained by the test and those obtained by the criterion measure.

The following are some of the methods of estimating the validity of a test:

1. Correlation Coefficient Method
2. Cross-Validation Method
3. Expectancy Table Method
4. Item Analysis Method
5. Method of Inter-Correlation of Items and Factor Analysis

1. Correlation Coefficient Method:

In this method, the scores of the newly constructed test are correlated with the criterion scores. The coefficient of correlation gives the validity index of the test. For this purpose, Pearson’s method of correlation is the most widely used. The appropriate correlation technique depends on the nature of the data obtained on the test as well as on the criterion.

2. Cross-Validation Method:
Cross-validation is a trial of the selected items on new groups. To evaluate the usefulness of a test, the prediction equation or cut-off score must be derived from one sample and then validated on a second sample of subjects from the same universe or population.
Cross-validation is accomplished by trying out the previously developed and refined test on a completely new group.
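The two-sample procedure above can be sketched as follows: a cut-off score is derived on the first sample, then checked on a second, independent sample. The cut-off rule (midpoint between group means) and all scores are hypothetical simplifications for illustration:

```python
# Sketch of cross-validation: derive a cut-off score on sample 1,
# then validate it on a completely new sample 2 from the same population.
# All scores and success labels are hypothetical illustration values.

# Each pair: (test score, succeeded on the job?)
sample1 = [(35, False), (42, False), (48, True), (55, True), (61, True), (40, False)]
sample2 = [(38, False), (50, True), (58, True), (44, False), (63, True)]  # new group

# Derivation: simplest rule - midpoint between mean scores of the two outcome groups
pass_scores = [s for s, ok in sample1 if ok]
fail_scores = [s for s, ok in sample1 if not ok]
cutoff = (sum(pass_scores) / len(pass_scores) + sum(fail_scores) / len(fail_scores)) / 2

# Validation: how many sample-2 cases does the cut-off classify correctly?
hits = sum((score >= cutoff) == ok for score, ok in sample2)
print(f"cut-off = {cutoff:.1f}, hit rate on new sample = {hits}/{len(sample2)}")
```

If the hit rate holds up on the second sample, the cut-off was not merely capitalising on chance features of the first sample.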

3. Expectancy Table Method:

In this method, the scores of the newly constructed test are tabulated against ratings given by supervisors. The resulting table provides empirical probabilities that serve as a validity index.
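An expectancy table groups test scores into bands and, for each band, tabulates the empirical probability of a favourable supervisor rating. The score bands and all data below are hypothetical illustration values:

```python
# Sketch of an expectancy table: for each test-score band, the proportion of
# examinees who received a favourable supervisor rating is tabulated.
# Scores and ratings are hypothetical illustration values.

# Each pair: (test score, supervisor rated the worker "satisfactory"?)
data = [(25, False), (32, False), (38, True), (41, False), (47, True),
        (52, True), (58, True), (61, True), (35, False), (44, True)]

bands = [(20, 39), (40, 59), (60, 79)]  # score bands, inclusive

print("score band | P(satisfactory)")
for lo, hi in bands:
    in_band = [ok for score, ok in data if lo <= score <= hi]
    p = sum(in_band) / len(in_band)
    print(f"  {lo}-{hi}    |   {p:.2f}")
```

If the probability of success rises steadily across the bands, the test has useful criterion validity even without computing a correlation coefficient.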

4. Item Analysis Method:

The items of a valid test should have appropriate difficulty values and discriminating power. Both can be calculated through ‘Item Analysis’, the process by which the difficulty value and discriminating power of the individual items of a test are computed.
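For a single item, the difficulty value is the proportion of examinees answering it correctly, and a simple discrimination index is the difference between the proportions correct in the upper and lower scoring groups. The response data below, and the upper-half/lower-half split used here, are hypothetical simplifications for illustration:

```python
# Sketch of item analysis for one item: difficulty value (proportion correct)
# and discrimination index (upper-group minus lower-group proportion correct).
# Responses are hypothetical illustration values (1 = correct, 0 = wrong).

# Each pair: (total test score, response to the item under analysis)
examinees = [(90, 1), (85, 1), (80, 1), (75, 1), (70, 0),
             (60, 1), (55, 0), (50, 0), (45, 0), (40, 0)]

examinees.sort(key=lambda e: e[0], reverse=True)
half = len(examinees) // 2
upper, lower = examinees[:half], examinees[half:]

difficulty = sum(r for _, r in examinees) / len(examinees)  # item p-value
discrimination = (sum(r for _, r in upper) / half
                  - sum(r for _, r in lower) / half)        # D index

print(f"difficulty value p = {difficulty:.2f}")
print(f"discrimination index D = {discrimination:.2f}")
```

An item answered correctly mostly by high scorers (large positive D) discriminates well; an item with D near zero or negative contributes nothing to the test's validity and should be revised or discarded.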

5. Method of Inter-Correlation of Items and Factor Analysis:

Factor analysis is carried out using advanced statistical methods. Methods of inter-correlation and other statistical techniques are used to estimate factorial validity.

Besides the above methods, some other forms of expressing validity are as follows:
a. By expert judgement.
b. By analysing the test with reference to content and objectives.
c. From the reliability of a test.
d. By group difference methods etc.
