
SEMESTER III

C-6: EDUCATIONAL MEASUREMENT AND EDUCATIONAL STATISTICS

UNIT 1: CONCEPT OF MEASUREMENT AND EVALUATION

1. Meaning, Nature and Needs of Measurement in Education

Introduction
Measurement is a common feature of our day-to-day life. Something or other is being measured at
every moment. According to Ross, “From birth to death almost every aspect of our daily life is
touched by measurement in its numerous forms.”

Meaning
Measurement refers to the process by which the attributes or dimensions of some physical objects
are determined. Measurement aims at finding out the quantity of some commodity or characteristic in
some definite units. It means finding out how small or big something is, or knowing how much more or less
it is than another.
When used in the context of learning, it refers to applying a standard scale or measuring device to an
object, a series of objects, events or conditions, according to practices accepted by those who are skilled in the use
of the device or scale.
Measurement answers the question of ‘how much’. In our day-to-day life, we measure height,
weight, miles travelled, etc. The tailor measures the dimensions of an individual’s body to prepare a dress of
the required size; shopkeepers weigh commodities like rice, wheat, sugar, fruits and vegetables. We
have a wrong notion that measurement takes place only with tapes and scales. The ranking of contestants in a
debate competition can be considered measurement, and the rating of human behaviour also comes under measurement.
Hence, measurement may be defined as, ‘the assignment of one of a set of numbers to each of a set of persons
or objects according to certain established rules’.

Definitions
James M. Bradfield defined measurement as ‘the process of assigning symbols to the dimension of
phenomenon in order to characterize the status of phenomenon as precisely as possible’.
J.P. Guilford defined measurement as the ‘assignment of numerals to objects or events according to
certain rules’.
According to Norman E. Gronlund, ‘measurement results are some score or numerical value and
quantitative descriptions of the pupils’.
E.L. Thorndike stated that ‘anything that exists at all, exists in some quantity; and anything that exists
in some quantity is capable of being measured’. Measurement of any kind is a matter of determining how much
or how little, how great or how small, how much more than or how much less than.
The Encyclopedia of Educational Research explains measurement in more refined terms; to
measure means ‘to observe or determine the magnitude of a variant’.

Measurement involves the process of quantification. Quantification indicates to what extent a particular
attribute is present in a particular object. It has been observed that measurement in any field always involves
three essentials:
i. Identification and definition of quantity, attribute or variable that is to be measured.
ii. Determining the set of operations by which the attribute or variable may be made perceivable.
iii. Establishing a set of procedures for translating observations into quantitative statements of degree,
extent or amount.
Types of measurement
Measurement is of two types: (i) physical measurement and (ii) mental measurement/ psychological
measurement/educational measurement.
i. Physical measurement: Physical measurement is the measurement of the object which has
absolute existence. For example, we measure the height of individuals, the weight of rice, etc.
ii. Mental measurement: Mental measurement is also known as ‘educational measurement’ or
‘psychological measurement’. It is always relative and there is no absolute zero in case of mental
measurement.

In the educational system, measurement is the quantitative assessment of the performance of students
in a given test. It can be used to compare performance between different students and to indicate the strengths
and weaknesses of the students. It helps in classifying students into homogeneous groups, in providing educational
and vocational guidance, and in providing remedial measures to low achievers.
In the teaching–learning situation, teachers should be competent enough to measure the student’s
achievement, intelligence, attitude, aptitude etc. To develop competency among the teachers in educational
measurement, Ebel has suggested the following measures:
i. Know how to administer a test properly, efficiently and fairly.
ii. Know how to interpret test scores correctly and fully, but with recognition of their limitations.
iii. Know how to select a standardized test that will be effective in a particular situation.
iv. Know how to plan a test and write the test questions, to be included in it.

Nature/characteristics of Measurement
To measure the psychological traits with validity and reliability, the measuring instrument or tests
should be far away from the aspects like personal errors, variable errors, constant errors and interpretative
errors. The important characteristics of a good measuring tool are as follows:
i. Should be valid: Validity of a test refers to its truthfulness. It refers to the extent to which a test
measures what it actually wishes to measure.
ii. Should be reliable: Reliability means the consistency of a measuring instrument (how
consistently it measures). It refers to the faithfulness of the test. To express it in a general way, if a
measuring instrument measures consistently, it is reliable.
iii. Should be objective: Objectivity of a test refers to two aspects: (a) item objectivity and (b)
scoring objectivity. By ‘item objectivity’ we mean that each item of the test must call for a definite single
answer. By ‘scoring objectivity’ we mean that if the answer is scored by different examiners, the marks would not vary.
iv. Should be usable and practicable: ‘Usability’ refers to the practicability of the test. In the
teaching–learning situation, by usability we mean the degree to which the test can be successfully used
by teachers and school administrators.
v. Should be comprehensive and precise: The test must be comprehensive and precise. It means
that the items must be free from ambiguity. The directions to test items must be clear and
understandable.
vi. Should be easy in administering: If the directions for administration are complicated, or if they
involve more time and labour, users may find the test difficult to use.
vii. Should be economical: A measurement tool should be less time consuming. The cost of the test
must be reasonable so that the schools or educational institutions can afford to purchase and use it.
viii. Should be easy in scoring: The scoring procedure of the test should be clear and simple. The
scoring directions and adequate scoring key should be provided to the scorer so that the test is easily
scored.
ix. Should be easily available: Some standardized tests are well-known all over India, but they are
not easily available. Such tests have less usability. It is desirable that in order to be usable, the test
must be readily and easily available.
x. Should have good and attractive get up/appearance: The quality of papers used, typography and
printing, letter size, spacing, pictures and diagrams presented, its binding, space for pupil’s
responses etc., need to be of very good quality and attractive.

Needs/functions of measurement in Education


Measurement plays an important role in our day-to-day life. It serves many specific purposes and
functions. Measurement in education is needed for the following reasons:

1. Selection: Measurement is an essential tool for selection. Whenever there is selection, we have to select
a few and reject many.

2. Classification: This purpose of measurement is served when measurement helps us to rank the
individuals of a given group. Classification is very essential in different areas such as in school, army,
industry, etc.

3. Prediction/prognosis: Tests reveal differences among people’s performance at this moment. All
decisions based on psychological tests involve prediction; for example, an IQ test is given to
students in school to predict their academic performance.
Measurement provides the extent of a variable with the specific purpose of predicting future
behaviour. The prognosis function also serves administrative functions such as classification, selection, promotion and
gradation of students.

4. Comparison: There are individual differences in the traits and qualities of different individuals. These
differences can be ascertained by means of comparison. It helps to find out which one is superior and which
one is inferior.

5. Diagnosis: The diagnosis function identifies weaknesses in student learning. Remedial instruction
can be prepared on the basis of diagnosis. The diagnostic function establishes the cause–effect relationship.
Thus, diagnosis is helpful not only in identifying weaknesses, but also in anticipating causes and remedies.

6. Research: The use of measurement for research purposes, however, is not much when compared to prediction
and diagnosis. This is because a measurement is usually considered a completely valid measure of certain
human characteristics.
An investigator must treat test scores in an experiment as an accurate quantification of a real and
useful variable. Measurement provides a more objective and dependable basis for comparison than rough
impressions.

2. Concept of evaluation in education


Evaluation is a process of judgement of value or worth of a process or product, which may be the
achievement, aptitude, interest, skill or other aspects of student’s personality or the method of teaching and
learning.
Evaluation is the act of placing value on something. Evaluation may be said to be the process by
which the value judgements of the educational status or achievement of students are formed. Thus, to
evaluate means to form judgement on the level of achievement and this presupposes that there is a
predetermined level available.
According to Wesley, “It (evaluation) indicates all kinds of efforts and all kinds of means to
ascertain the quality, value and effectiveness of desired outcomes. It is a compound of objective evidence
and subjective observation. It is the total and final estimate.”

In education, evaluation is all the more important because only through evaluation can a teacher
judge the growth and development of students, the changes in their behaviour, their progress in the
class, and also the effectiveness of his/her own teaching.
The sole purpose of evaluation should be to provide feedback in order to improve the object of the
evaluation. It determines the difference between where we are and where we would like to be and, if
necessary, designs ways to eliminate or lessen the difference.
Thus, evaluation is a continuous process of ascribing unique value judgement to teaching-learning
outcomes in the light of educational objectives. The process of evaluation consists of three basic
components:
i. Testing situation – the behaviour to be evaluated
ii. Measurement – measuring that behaviour
iii. Placing a value on the level of achievement shown in that behaviour which the measurement
indicates.

Characteristics of evaluation
The essential characteristics of evaluation are as follows:
i. Evaluation is an act of placing value on something.
ii. The purpose of evaluation is to determine the current status of the object of evaluation.
iii. Evaluation includes both quantitative and qualitative description of behaviour.
iv. It is a comprehensive process. It is total and final estimate.
v. It is continuous and systematic.
vi. Evaluation aims at improving the level of achievement and proficiency through diagnosis and
remediation.

3. Relation between measurement and evaluation


There is a common confusion in making a clear distinction between measurement and evaluation.
Very often they are assumed to mean one and the same and are used as though they are synonymous. The
relationship between measurement and evaluation is given in the points below:

i. Measurement is a technique necessary for evaluation.
ii. Measurement describes a situation; evaluation judges its worth or value.
iii. Evaluation is a process that uses measurement. The success of evaluation depends on the quality
of the data collected, i.e. measurement.
iv. A good measurement leads to accurate evaluation.
v. Evaluation is integrated with all activities of education but measurement is only one activity
among various activities.
vi. Measurement is concerned with knowing the level of attainment while evaluation is concerned
with improvement.
vii. Measurement is the science of collecting information about objects to be studied. Evaluation
includes the use of information collected by the process of measurement.
viii. Measurement is the act of assigning marks in an examination process. However, evaluation
goes beyond and judges whether the marks are good or bad or how good or how bad these are with
reference to the performance in the class.
ix. Evaluation aims at the modification of the education system by bringing change in
behaviour. Measurement aims at measurement only.
x. In evaluation the interests, attitudes, tendencies, mental abilities, ideals, behaviours and social
adjustment etc. of the pupils are evaluated. But in measurement such aspects cannot be evaluated.
xi. Evaluation is integrated with the entire task of education and not only with examinations, tests
and measurements. Evaluation encompasses tests and measurement but also goes beyond them.
It depends upon measurement but is not synonymous with it.
xii. Measurement is only a tool to be used in evaluation. By itself it is meaningless, but without it
evaluation is likely to be of little significance.

Norm-referenced and Criterion-referenced Tests

1. Norm referenced tests


Norms are the average scores or values determined by the actual measurement of a group of
persons who are representative of a specified population. Establishing norms involves the transformation of raw scores into
standard scores. Norm-referenced tests are used to determine how much overall knowledge of some subject
a particular pupil has achieved.
A norm-referenced test is a uniform test. It ranks and compares students in relation to one another.
It measures performance against a theoretical average, using the results of a
statistically selected group.
In simple words, norm-referenced tests compare a student’s performance with that of others who
have taken the same test. The process of calculating norms is known as the “norming
process”, and the comparison group is the “norming group”.
A norm-referenced test determines the position of students, assesses their performance and measures their
behavior. These estimates are derived from the analysis of test scores, which shows whether a student has done better or worse than
others.
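
The transformation of raw scores into standard scores can be sketched in Python as below. This is only an illustration, not a procedure taken from these notes: the scores in norming_group are made up, and the function names z_scores and percentile_rank are hypothetical. It uses the common z-score (deviation from the group mean in standard-deviation units), the T-score (mean 50, SD 10) and a percentile rank that gives half credit for tied scores.

```python
from statistics import mean, pstdev

def z_scores(raw_scores):
    """Convert raw scores to z-scores relative to the norming group."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # population SD of the norming group
    return [(x - m) / sd for x in raw_scores]

def percentile_rank(raw_scores, score):
    """Percentage of the group below `score`, counting ties as half."""
    below = sum(1 for x in raw_scores if x < score)
    equal = sum(1 for x in raw_scores if x == score)
    return 100.0 * (below + 0.5 * equal) / len(raw_scores)

norming_group = [40, 50, 60, 70, 80]  # hypothetical raw scores
z = z_scores(norming_group)
t = [50 + 10 * v for v in z]          # T-scores: mean 50, SD 10

for raw, zi, ti in zip(norming_group, z, t):
    print(raw, round(zi, 2), round(ti, 1), percentile_rank(norming_group, raw))
```

A raw score of 60 here equals the group mean, so its z-score is 0, its T-score is 50 and its percentile rank is 50. The same raw score would get a different standing in a different norming group, which is the point of norm-referenced interpretation.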

Characteristics of the Norm-referenced test


The characteristics of the norm reference test are as follows:
1. Defining: They measure the performance of a student in comparison to all students, but they do not
define who ‘all’ are. Thus, they measure the success of an educational restructuring against fixed
aims.
2. Preset results: The norms are traditionally set in advance, with the score level set at 50 percent.
This goal is hard to achieve for all students.
3. Quality of grades: The norms for grading are set by teachers according to their own judgement, although they have to
judge the performance of students. The level of knowledge of the two also differs.
4. Changing difficulty level: The difficulty level changes from year to year, and the passing
rates of students vary from class to class. For example, the 4th grade and the 10th grade have
different difficulty levels.
5. Fear of failure: In a norm-referenced test students have a fear of failure, because it compares their
performance with that of other students.
6. Be competitive: It gives students a chance to improve their performance. Students can also see how
much they have to prepare to compete with others.
7. Being self-confident: Students should take their performance confidently and
work hard to improve it.

2. Criterion referenced tests


Criterion referenced test is intended to measure how well a person has learned a specific body of
knowledge and skills.
Glaser (1963) – “The criterion referenced test may be defined as one in which the test performance
is linked or related to some behaviour measure or referents.”
A criterion-referenced test is an assessment that measures a student’s performance
against fixed criteria. These criteria include brief written
descriptions of what students are capable of doing at different stages.
In other words, a criterion-referenced test uses a set of fixed criteria to measure and assess a student’s
performance.
A criterion-referenced test uses test scores to judge students and to
generate statements about students’ behavior, using the test scores as the reference. Criterion-referenced
testing mostly uses quizzes. Its main objective is to check whether students have learned the topic
or not.
These tests generally have multiple-choice, true-false, and open-ended questions. They play an important
role in making decisions about a student’s performance.
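
Judging each student against a fixed criterion can be sketched as follows. The 80 percent cut-off, the quiz scores and the function name mastery_report are assumptions made for illustration only; these notes do not prescribe a particular cut score.

```python
def mastery_report(scores, max_score, cut_percent=80):
    """Classify each student against a fixed criterion, ignoring other students."""
    report = {}
    for name, score in scores.items():
        percent = 100 * score / max_score
        # The decision depends only on the student's own score and the criterion.
        report[name] = "mastered" if percent >= cut_percent else "not yet mastered"
    return report

quiz = {"A": 18, "B": 15, "C": 20}  # hypothetical scores out of 20
report = mastery_report(quiz, max_score=20)
print(report)
```

Each decision here uses only the student's own score and the fixed criterion, so every student could be classed as having mastered the topic, or none of them.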

Characteristics of the Criterion Reference Test


1. Authority (validity): This concerns whether the test measures what it claims to measure: whether each individual item matches
its goal, and whether the situations and performance specified in the aim are reflected in the item.
2. Consistency (reliability): This concerns whether the test always measures what it states, and whether we can have
a high degree of confidence in the scores. Any random error in the tool can make it unreliable.
3. Practicality: Not every assessment is practicable, because of cost and time. It is not always possible to design
reliable and accurate tests. Also, the decisions made should relate to important factors.
4. Subject mastery: This helps to track the performance of students within the course of study.
Test items can be made to match precise purposes. A criterion-referenced test also judges how well the
student knows and understands the topic.
5. Managed locally: Generally, these tests are developed at the classroom level, so the teacher can easily
check whether the standards are met and identify shortcomings. Results of tests are quickly
obtained to give students helpful feedback on performance.

Relation between norm-referenced and criterion referenced tests

Aspect | Criterion-Referenced Test | Norm-Referenced Test
Performance | Each student is independently assessed. | Judged on the basis of other students’ performance.
Comparison | It does not compare a student’s performance with other students. | It compares a student’s performance with other students.
Objective | Its main objective is to help students learn without getting questioned about grades. | Its main objective is to assess a student’s performance with other students.
Criteria | They have fixed criteria for assessment. | Their criteria change with outcomes.
Results | Results can be derived quickly. | Takes little time to derive results.
Examples | Clinical skill competency tools. | Class examination.

UNIT II: TOOLS OF MEASUREMENT
1. Measuring Instruments and their Classification; Errors in Measurement; Types of Scales in
Educational Measurement

A. Measuring Instruments and their classification
For both physical and mental measurement, some tools and methods are necessary. The variation of
method may be due to the nature of variable and purpose of measurement.
The tools of measurement are as follows:
i. Tests: A test consists of a set of questions to be answered or tasks to be performed. Tests are
used to assess the ability or trait in question. Psychological and educational tests are
standardized procedures to measure, quantitatively or qualitatively, one or more aspects or
traits by means of a sample of verbal or non-verbal behaviours. Items of a test are placed in
increasing order of difficulty and its procedure of administration is standardized to ensure
maximum objectivity. The psychological tests are used to know the ability of the students, to
diagnose the weakness, to predict the future progress, and to provide educational and vocational
guidance. The different types of tests are: achievement tests, intelligence tests, attitude tests,
aptitude tests, personality tests, creativity tests etc.
ii. Inventories: Different inventories are used for different traits. Interest inventories are used to
measure interest; personality inventories are used to measure certain traits of personality, etc.
iii. Observation: There are certain traits like honesty, punctuality, persistence, truthfulness etc.,
which can hardly be measured objectively via tests. So here, observation is an important technique
of measurement. The observation may be participant observation or non-participant observation. For
accurate and scientific observation, one may use an observation schedule and other instruments.
iv. Interview: An interview is a face-to-face interaction between one interviewee and one or more
interviewers. There are certain things which an individual does not want to express,
and they can only be assessed through interviews. Interview schedules may be used, and the
interviewer, through better personal support and in a congenial atmosphere, can succeed in bringing out
the inner feelings of the interviewee through carefully planned interviews.
v. Checklist: A checklist consists of a series of items which needs response from the respondent.
The presence or absence of an item may be indicated by ‘Yes’ or ‘No’ (by a ‘✓’ or ‘X’ against the
items). Checklists are popularly employed for appraisal of studies, school buildings, textbooks,
outcomes, instructional procedures etc.
vi. Rating scales: Psychological traits are relative concepts, so it is very difficult to make watertight
compartments between them. Sometimes the rater has to judge the degree to which a trait is present.
Rating scale is used to evaluate the personal and social conduct of the learner. We take the opinion of
teachers or parents or friends or judges on a particular quality or trait of a pupil along a scale. The rating
scale may be of 5 points, 7 points, 9 points or 11 points. For example, to assess particular trait, we can
use a 5 point scale as: very good, good, average, below average, and poor. The trait in question is
marked by the judges in any one of the five categories. Rating scales can be used to evaluate:
personality traits, tests, school courses, school practices, and other school programmes.
vii. Attitude scales: Attitude refers to the bent of mind or feelings of an individual towards an object, an
idea, an institution, a belief, a subject or even a person. Attitude scales are used to measure this trait
objectively and accurately.
viii. Projective techniques: Projective techniques are very ambiguous and subjective in nature.
Through projective techniques, the sub-conscious and pre-conscious mind of an individual is reflected.
For example, with the help of the Thematic Apperception Test (TAT), we measure the personality of
individuals.

*******************************************************
B. Errors in Measurement
It is desirable that a test should give us an accurate estimate of the ability it measures. A good
measuring instrument measures without error and gives us the true scores. Thus, one way of looking at
the concept of a ‘good test’ is to consider the variables that influence the test scores.
Error is any effect that is irrelevant to the purposes of the test and results in inconsistencies in
measurement. Inaccurate scores in measurement can be viewed as introducing error.

Source or Causes of error in measurement


a. Errors resulting from the test itself: Variations resulting from the particular sampling of items
included on a specific form of the test.
b. Error variables related to the conditions of the particular test administration: These include the
physical situation, directions, distracting factors, errors of timing, etc.
c. Error variables involving changes within the test taker: These may be long-term effects
resulting from education, maturation and changes in environment, or short-term
fluctuations in mood, health or attention.

Types of Error
The types of errors are discussed below:
i. Constant Error: Constant errors are the ones that produce effects irrelevant to the purpose of the testing.
These errors are produced because of wrong selection or wrong placement of the measuring instrument. For
example, suppose a test is selected to measure intelligence, but the individual must answer its items
on the basis of memory power. In such a case, the test is not an appropriate test of intelligence; rather it is
a test of memory. Hence, it is a wrong selection to use this test for the purpose of measuring intelligence.

ii. Variable Error: Variable error occurs as a result of inconsistency in test scores from occasion to
occasion. These are also known as random errors or chance errors. For example, giving a test with two
different time limits might produce different scores for the same individual. Physical quantities
such as length, weight and speed can be measured with high precision and little variation from occasion
to occasion. But such precision does not occur in psychological measurement, because behaviour
changes from occasion to occasion.

iii. Errors of Observation: Errors of observation may be produced because of subjective factors such
as personal opinion, bias and personal judgement on the part of the observers or the scorers. Thus,
errors of observation presuppose subjectivity. For example, on essay-type tests and other free-response tests,
scorers may disagree and thus produce a large error effect.
iv. Errors of Interpretation: Errors may also arise in the interpretation of scores. The
same score may have different interpretations from occasion to occasion, from person to person or from test
to test. Thus, interpretation of test scores needs an accurate standard of reference for interpreting the test results
in educational and psychological tests.
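
The difference between constant and variable error can be illustrated numerically. The sketch below assumes a simple additive model (observed score = true score + constant error + variable error), which is a common textbook simplification and is not stated in these notes. All numbers are hypothetical.

```python
from statistics import mean, pstdev

true_score = 50
constant_error = 5                      # hypothetical systematic bias of the instrument
variable_errors = [-3, 1, 0, 2, -1, 1]  # hypothetical chance fluctuations (they sum to 0)

# Six repeated measurements of the same person under the assumed additive model.
observed = [true_score + constant_error + e for e in variable_errors]

bias = mean(observed) - true_score  # shift caused by the constant error
spread = pstdev(observed)           # scatter caused by the variable error

print(observed, bias, round(spread, 2))
```

In this sketch, averaging the repeated measurements cancels the variable error but leaves the constant error of 5 untouched. Under the assumed model, a constant error shows up as a bias in the average score and a variable error shows up as scatter between occasions.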
*******************************************

C. Types of Scales in Educational Measurement


In measurement, numerical values are expressed on well defined scales. On the basis of
mathematical and logical assumptions various types of scales are used in psychological measurement. The
scales used in educational measurement are:
i. Nominal scale
ii. Ordinal scale
iii. Interval scale
iv. Ratio scale
i. Nominal Scale: Nominal scale refers to simply assigning different categories or classes on the basis of
some common characteristics. This scale is also called classificatory scale. Hence, the fundamental
operation of this scale is to determine whether two persons are members of the same category or class. In
nominal scale numerals are used for identification only. For example, classifications of persons as male and
female, classifying students according to communities, classifying objects according to colour, etc.
Nominal scale is so primitive that some experts do not recognize it as measurement. It is the least
precise, or crudest, of the four basic scales of measurement. It simply implies the classification of an item
into two or more categories without any extent or magnitude. There is no particular order assigned to them.
ii. Ordinal Scale: Ordinal scale is used to rank people on some dimension. This scale is also known as
ranking scale. This scale suggests a continuum of some kind which has the property of order. For example,
if we rank school children in order according to their height from tallest to shortest, it will constitute an
ordinal scale.
In ordinal scale the objects or events are ranked or ordered from lowest to highest or from highest
to lowest according to the characteristic we wish to measure. Thus ordinal scale corresponds to quantitative
classification of a set of objects with reference to some attribute. In the educational institutions or hierarchy
we find professional as well as administrative classifications on ordinal level.
iii. Interval Scale: The third level of measurement is known as the interval level. It has the characteristics of
both the nominal and ordinal levels of scales. The additional characteristic it possesses is equality of intervals. It
means the distance or difference between any two adjacent classes on the scale can be known numerically. The
intervals on the scale are the same; it has a constant unit of measurement.
This consistency of intervals is lacking in the two previous levels of scale. In other words, the intervals
of the scale, i.e. the differences between two consecutive points on the scale, are equal over the entire scale.
For example, the difference between 6 cm. and 7 cm. is equal to the difference between 11 cm. and 12 cm.
Thus interval scale is also known as equal-interval scale.
Interval scales have an arbitrary zero. That is, there is no absolute zero-point or unique origin. With
interval scales the measurement units are equal. Interval scales show that a person or item is so many units
larger or smaller, heavier or lighter, brighter or duller etc. from the other.
iv. Ratio Scale: It is the most refined among the four basic scales. It has all the characteristics of an interval
scale. In addition to that, it has an absolute zero point as its origin representing complete absence of the
property being measured.
Ratios of numbers correspond to ratios of the attribute. As it has an absolute zero point, we can
say that 10 kg. is twice 5 kg. In this scale, as in the interval scale, the difference between 15 and 10 is equal to the difference
between 83 and 78.
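
The four scales can be contrasted with a short Python sketch. The data are hypothetical, and the pairing of each scale with a typical statistic (mode, median, mean, ratio) follows a commonly taught convention rather than anything stated in these notes. The temperature example is likewise an added assumption, chosen because the Celsius scale has an arbitrary zero.

```python
from statistics import mode, median, mean

# Nominal: numerals are labels only, so counting and the mode are meaningful.
community_codes = [1, 2, 2, 3, 2]  # hypothetical category codes
most_common = mode(community_codes)

# Ordinal: ranks carry order but not equal gaps, so the median is appropriate.
debate_ranks = [1, 2, 3, 4, 5]
middle_rank = median(debate_ranks)

# Interval: equal units but an arbitrary zero, so differences and means are
# meaningful while ratios are not (20 degrees C is not "twice as hot" as 10).
temps_c = [10.0, 20.0, 30.0]
mean_temp = mean(temps_c)
ratio_c = temps_c[1] / temps_c[0]                        # numerically 2.0, but not meaningful
ratio_k = (temps_c[1] + 273.15) / (temps_c[0] + 273.15)  # same temperatures on the Kelvin scale

# Ratio: absolute zero, so ratios are meaningful (10 kg is twice 5 kg).
weights_kg = [5.0, 10.0]
weight_ratio = weights_kg[1] / weights_kg[0]

print(most_common, middle_rank, mean_temp, ratio_c, round(ratio_k, 3), weight_ratio)
```

The Celsius ratio changes from 2.0 to roughly 1.035 when the same temperatures are expressed in kelvin, which shows that ratios depend on where the zero is placed. The weight ratio stays at 2.0 in any unit because weight has an absolute zero.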

************************************************************
2. Characteristics of a Good measuring Instrument – Validity, Reliability, and Objectivity and their
methods of Determination
A good measuring instrument should give an accurate estimate of the ability it measures. To give
an accurate estimate, the measurement must be free from errors. Corresponding to the different types
of errors, we are concerned with different qualities of a good measuring instrument, such as:
i. Validity: Validity of a test refers to its truthfulness; it refers to the extent to which a test measures what it
intends to measure. Standardization of a test requires the important characteristic viz., validity. If the objectives
of a test are fulfilled, we can say that the test is a valid one. The validity of a test is determined by measuring the
extent to which it matches with a given criterion.
Freeman states, ‘an index of validity shows the degree to which a test measures what it is supposed to
measure when compared with the accepted criteria’.
Lee J. Cronbach held the view that validity ‘is the extent to which a test measures what it purports to
measure’.

Methods of determination of validity


a. Impressionistic assessment of the test items on the basis of judgement.
b. Examination of the content or ability domain or content analysis on the basis of judgement.
c. Find out by judgement considering the relationship between the test scores and the known
psychological concepts and theories about the construct.
d. Find out by correlating the test scores with some concurrent criterion.
e. Find out by correlating the test scores with some future performance.
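
Methods (d) and (e) both come down to computing a correlation between test scores and a criterion. A minimal sketch, assuming the Pearson product-moment correlation is used as the validity coefficient (the notes do not name a specific coefficient) and using made-up scores:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

test_scores = [10, 12, 15, 18, 20]       # hypothetical scores on the new test
criterion_scores = [32, 35, 41, 47, 50]  # hypothetical criterion (e.g. later marks)

validity_coefficient = pearson_r(test_scores, criterion_scores)
print(round(validity_coefficient, 3))
```

If the criterion is measured at about the same time as the test, the coefficient estimates concurrent validity. If it is a future performance, it estimates predictive validity.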

ii. Reliability: Reliability refers to the consistency of scores obtained by the same individuals when they are
re-examined with the same test on different occasions, or with different sets of equivalent items, or under
different examining conditions.
Reliability paves the way for consistency that makes validity possible and identifies the degree to which
various kinds of generalizations are justifiable. It refers to the consistency of measurement, i.e., how stable test
scores or other assessment results are from one measurement to another.
If a measuring device yields consistent results upon testing and retesting, it is reliable. The reliability of a test also refers to the degree to
which the test results obtained are free from errors of measurement or chance errors.

Methods of determination of reliability


a. Test-retest: Giving the same test twice to the same group with a considerable time-gap between the
two administrations.
b. Alternate forms: Give two forms of test to the same group in succession with minimal time gap.

c. Retest Method using parallel forms: Give two forms of test to the same group in succession with
considerable time gap.
d. Split Half Method: Administer a test once. Get sub-scores for items of two equivalent odd-even
halves of the test. Use Spearman-Brown formula to estimate reliability.
e. Kuder-Richardson Method: Administer objective type test once and apply Kuder-Richardson
formula.
f. Cronbach's Alpha Method: Administer subjective type test and apply Cronbach's Alpha formula.
g. Scorer reliability: Administer a test once. Get it scored by two or more scorers independently.
Correlate the sets of scores to find out scorer reliability using Spearman-Brown formula.
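
The single-administration methods (d), (e) and (f) can be sketched in a few lines. The code below is an
illustration only: the function names and the response matrix are hypothetical, the odd-even split follows
method (d), and population variances are used throughout (an assumption; some textbooks use sample
variances instead).

```python
from math import sqrt
from statistics import pvariance


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step up a half-test correlation to a full-length reliability estimate."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(scores):
    """Odd-even split-half reliability; rows are examinees, columns are items."""
    odd_half = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_half, even_half))


def kr20(scores):
    """Kuder-Richardson formula 20 for items scored 0 (wrong) or 1 (right)."""
    n, k = len(scores), len(scores[0])
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in scores) / n  # proportion answering item i correctly
        sum_pq += p * (1 - p)
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_pq / total_var)


def cronbach_alpha(scores):
    """Cronbach's alpha for items with any numeric scoring (e.g. marks per question)."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of six examinees to a six-item objective test.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(round(split_half_reliability(responses), 2))
print(round(kr20(responses), 2))
print(round(cronbach_alpha(responses), 2))
```

For items scored 0 or 1, KR-20 and Cronbach's alpha give the same value when the same variance convention
is used. The pearson_r helper can also be applied to two scorers' marks for the same papers, as in method (g).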

iii. Objectivity: Objectivity is an important characteristic of a good test. Without objectivity, the reliability
and validity of a test is a matter of question. It is a pre-requisite for both validity and reliability. Objectivity
of a test indicates two things— item objectivity and scoring objectivity.
‘Item objectivity’ means that each item must call for a single definite answer. In an objective-type
question, a definite answer is expected from the test-takers. While framing the questions, the following
should be avoided: ambiguous questions, lack of proper direction, double-barreled questions, questions with
double negatives, etc. These faults affect the objectivity of a test.
Objectivity of scoring means that the test paper would fetch the same score whoever checks it. The
subjectivity, personal judgment or bias of the scorer should not affect the scores. Essay-type questions
are subjective, and the scores are affected by a number of factors like the mood of the examiner, his
language, his bias, etc. Essay-type questions can have objectivity if a scoring key and proper directions
for scoring are provided.

Methods of determination of objectivity


a. Item analysis: By analyzing each item included in the test, we determine the effectiveness of each
item by computing two indices, namely (i) difficulty level and (ii) discriminating power.

i. Difficulty level of an item is indicated by percentage of examinees who have responded to the item
correctly.
ii. Discriminating Power of a test item refers to its ability to discriminate between the high and the
low achievers.
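
The two indices can be computed as in the following sketch. The difficulty level follows the definition
given above. For discriminating power, the code compares the top and bottom scoring groups; the 27%
group size, the function names and the sample data are assumptions introduced for illustration, not
something specified in these notes.

```python
def difficulty_index(scores, item):
    """Percentage of examinees who answered the given item correctly."""
    correct = sum(row[item] for row in scores)
    return 100.0 * correct / len(scores)


def discrimination_index(scores, item, group_fraction=0.27):
    """Proportion correct in the high group minus proportion correct in the low group.

    Examinees are ranked by total score; the top and bottom `group_fraction`
    of them form the high and low groups (0.27 is an assumed convention).
    """
    ranked = sorted(scores, key=sum, reverse=True)
    group_size = max(1, round(group_fraction * len(ranked)))
    high_group = ranked[:group_size]
    low_group = ranked[-group_size:]
    high_correct = sum(row[item] for row in high_group)
    low_correct = sum(row[item] for row in low_group)
    return (high_correct - low_correct) / group_size


# Hypothetical responses (1 = correct, 0 = wrong); rows are examinees, columns are items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

for item in range(len(responses[0])):
    print(item + 1,
          round(difficulty_index(responses, item), 1),
          discrimination_index(responses, item))
```

An item with a discrimination index near zero, or a negative one, does not separate high achievers from low
achievers and is a candidate for revision or removal.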

*******************************************************

3. General Principles of test construction and Standardization


There are certain principles of test construction. The procedure used to construct a test is designed
to attain its desired goals and purposes. The general principles of test construction are discussed below:

1. Planning the Test: The first and most important step in constructing a test is to plan the test. While
planning the test, the following aspects have to be taken into account:
a. Specification of purpose: The purpose of the test has to be clearly defined. It should take into
consideration the test takers age, intellectual level, education, socio-economic and cultural background,
etc.
b. Translating the purpose into operational terms: The test constructor must translate the test purposes
into operational terms. He must determine what knowledge or skills the test will cover and how items
will be presented in the test. Here, there are two major aspects –
i. Test content: The test constructor must specify the skill to be tested such as, knowledge,
application, analysis, synthesis, evaluation, etc.
ii. Test formats: Some common dimensions of test formats are:
-Alternative versus free response
-Speed versus power test
-Maximum performance versus typical performance
-Paper and pencil versus performance
-Group versus individual administration
- Structured versus projective

2. Constructing the test: This involves collecting different test items on the basis of the content and the
format of the test. More items should be written than will finally be needed, since many items will be
eliminated by succeeding analyses. It also involves correcting ambiguous wording, strengthening weak
alternatives and removing inappropriate items.
The test items should also take into consideration specific characteristics such as, language spoken,
social background, etc. Thus, great care is needed in developing test items.

3. Construction of Pilot test: The arrangement of the items is to be done with several acceptable plans,
such as equal difficulty plan, increasing difficulty plan, etc. The constructor of the test has to make several
important decisions such as, giving directions, fixing time, scoring the items, etc.

4. Application of the pilot test or Tryout: The prepared items are administered to a representative sample of
the group for whom the test is designed. The goal of this tryout is to obtain information on how students react
to the items.
At this stage, the test constructor has to keep in mind the following aspects:
i. The environmental condition should be free from undue noise and distraction.
ii. There should be adequate sitting facilities and working space.
iii. The sample group should not be too large.
iv. The mental conditions or anxiety of the subjects should be taken into account.

5. Item Analysis: After the test has been administered and scored it is desirable to appraise the effectiveness
of the items. This can be done by considering the responses of subjects to each item.
By analyzing each item included in the test, we determine the effectiveness of each item by computing
two indices, namely (i) difficulty level and (ii) discriminating power.
i. Difficulty level of an item is indicated by percentage of examinees who have responded to the item
correctly.
ii. Discriminating Power of a test item refers to its ability to discriminate between the high and the
low achievers.

6. Finalizing the test form: This step involves selection of those items that provide the best discrimination, are
of appropriate difficulty and have effective alternatives.

7. Analysis of the test: Standardization: A good test must be a standardized measure. Standardization refers
to the uniformity in administration, scoring and interpretation of test scores. It involves control of errors which
is done by minimizing the influence of irrelevant factors on testing.

8. Psychometric Analysis: Finally, psychometric analysis is needed to establish the reliability and validity
of the test.

Thus, the process of test construction and standardization is completed through the above mentioned
steps.

*************************************************************

4. Scoring of Student Achievement, Method of Interpreting test Scores, reporting test result –
Cumulative Record Card.

A test score is a piece of information, usually a number, that conveys the performance of an
examinee on a test. One formal definition is that it is "a summary of the evidence contained in an examinee's
responses to the items of a test that are related to the construct or constructs being measured."

Achievement tests are universally used in the classroom mainly for the following purposes:
1. To measure whether students possess the pre-requisite skills needed to succeed in any unit or
whether the students have achieved the objective of the planned instruction.
2. To monitor students' learning and to provide ongoing feedback to both students and teachers
during the teaching-learning process.
3. To identify the students' learning difficulties - whether persistent or recurring.
4. To assign grades.

Despite the objectivity of scoring short answer tests, certain procedures are indispensable if scoring
is to be done with maximum accuracy and efficiency.

i. Order of Scoring: With essay tests it may be desirable to have one person score all answers to the first
question, then to the second, and so on. If, for objective tests separate answer sheets are provided, the scorer
may score a given page in all booklets first, then the next page, and so on, rather than scoring all of one
booklet before going on to the next. If so many booklets must be scored that several scorers are needed,
each person may specialize on a given page or group of pages of the booklet but should score only one page
in all booklets at a time.

ii. Rescoring: With a large number of booklets to be scored and sufficient help available, it is always
worthwhile to re-score them so as to eliminate errors that otherwise are almost inevitable in a clerical task
like this. If complete rescoring is not feasible, every fifth or tenth booklet should be re-scored to get an
idea of the frequency and magnitude of scoring errors. Rescoring a sample sometimes uncovers such an
inaccuracy as to make it desirable to re-score the remainder.

iii. Keeping Records: As soon as possible after the tests have been administered, the answer sheet should
be checked and scored, and the scores should be recorded on the permanent records of the school. Each
teacher should be given copies of the score reports for the pupils in his/her classes. Usually schools have
some type of permanent record for each pupil which provides space for recording standardized test results.

Methods of Interpreting test scores

Test scores are interpreted with a norm-referenced or criterion-referenced interpretation, or occasionally
both. A norm-referenced interpretation means that the score conveys meaning about the examinee with
regard to their standing among other examinees. A criterion-referenced interpretation means that the score
conveys information about the examinee with regard to a specific subject matter, regardless of other
examinees' scores.
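
The difference between the two interpretations can be illustrated with a short sketch. The percentile-rank
definition used here (scores below, plus half of the tied scores), the 80% mastery cut-off, the function
names and the sample scores are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def percentile_rank(score, group_scores):
    """Norm-referenced: percentage of the group scoring below the given score.

    Scores equal to the given score are counted as half (an assumed convention).
    """
    below = sum(1 for s in group_scores if s < score)
    equal = sum(1 for s in group_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(group_scores)


def z_score(score, group_scores):
    """Norm-referenced: distance from the group mean in standard deviation units."""
    return (score - mean(group_scores)) / pstdev(group_scores)


def meets_criterion(score, max_score, cutoff=0.80):
    """Criterion-referenced: has the examinee reached the required proportion?

    The result does not depend on how other examinees performed.
    """
    return score / max_score >= cutoff


# Hypothetical class scores on a 100-mark test.
class_scores = [35, 42, 48, 50, 55, 61, 67, 72, 78, 90]

print(percentile_rank(67, class_scores))    # standing relative to the class
print(round(z_score(67, class_scores), 2))  # standing in standard deviation units
print(meets_criterion(67, 100))             # mastery judged against a fixed standard
```

The same raw score can therefore look good in a norm-referenced sense (above most of the class) and still
fall short of a fixed criterion.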

UNIT III: MEASURING HUMAN ABILITIES AND POTENTIALITIES

1. Intelligence Test – Meaning, Classification, Uses

Meaning

According to Wechsler, “Intelligence is the aggregate or global capacity of the individual to act
purposefully, to think rationally and to deal effectively with his environment.”

There are three main aspects of intelligence.


i. Ability to learn: Intelligence is the ability to learn. The more intelligent a person is, the more extensive
is his ability to learn.
ii. Ability to adjust to new situations: Intelligence is the capacity of an individual to act effectively
and appropriately in novel situations.
iii. Ability to carry on abstract thinking: Intelligence includes the ability to think abstractly. It means
effective use of concepts and symbols in dealing with situations.

Intelligence tests are standardized tests that aim to determine how a person can handle problem
solving using higher level cognitive thinking. Often just called an IQ test for common use, a typical IQ
test asks problems involving pattern recognition and logical reasoning. It then takes into account the time
needed and how many questions the person completes correctly, with penalties for guessing. Specific
tests and how the results are used change from district to district but intelligence testing is common
during the early years of schooling.

Nature of Intelligence
The important points about intelligence and its functioning are as follows:
i. Intelligence is a native capacity.
ii. It is independent of training and acquired experiences.
iii. It is the comprehensive term used for such behaviours as understanding, thinking, remembering,
reasoning, etc.
iv. It has no organic entity. It is a way of behaving.
v. It is the ability to deal with abstraction.
vi. It helps to face and solve problems effectively.

Classification of Intelligence Test


A large number of intelligence tests have been developed after Binet’s intelligence test scale.
They can be classified as follows:

I. Classification on the basis of form – Verbal tests and Non-verbal tests/Performance tests.
II. Classification on the basis of administration – Individual tests and group tests

I. Classification on the basis of form – Verbal tests and Non-verbal tests/Performance tests.

1. Verbal intelligence test: In these tests the subjects make use of language. Instructions are given in
words – written, oral or both. Some of the verbal tests include –
i. Vocabulary tests
ii. Memory tests
iii. Comprehension tests
iv. Information tests
v. Association tests
vi. Reasoning tests
2. Non-verbal intelligence test: An intelligence test that is based on symbols or figures, and in which language
is used very little or not at all, is referred to as a non-verbal intelligence test. The subject indicates the correct
answer either through the use of language or by marking one of a number of choices. The symbolic response
might be made with respect to objects rather than printed materials.

Some of the non-verbal tests include –


i. Chicago non-verbal tests
ii. Raven’s progressive Matrices tests
iii. Army Beta tests. It includes several sub-tests such as,
a. Maze drawing
b. Cube Analysis
c. ‘X-O’ series
d. Digit symbol
e. Number checking
f. Picture completion
g. Geometric completion
3. Performance intelligence test: In this type of intelligence test the subject is required to handle or
manipulate objects in such a way as to complete a specified task. Intelligence regarding the skills of performing
some task is tested through this test. Performance tests usually place more emphasis on coordination, speed,
and perceptual and spatial factors.

II. Classification on the basis of administration – Individual tests and group tests

1. Individual intelligence test: An individual test is administered to one person at a time. It was
originally designed by Alfred Binet. An individual test can be verbal, non-verbal or performance. Thus,
individual tests are sub-divided as –
Verbal and Non-verbal intelligence tests
i. Individual verbal tests: These tests involve the use of language, which are administered to
one individual at a time. Example, Wechsler Bellevue Intelligence test, Wechsler Intelligence
scale for children (WISC).
ii. Individual Non-verbal tests: These tests do not use language and include items which
require responses in terms of motor activities. Example – block designs, mazes, object
assembly or puzzles, picture arrangement, picture completion, etc.

2. Group Intelligence tests: Group tests can be administered to more than one individual of a given age
at a time by one examiner.
Group verbal and non-verbal tests
i. Group verbal tests: These tests require the use of language and are applied to a group of
individuals at a time. Example, Army Alpha tests.
ii. Group non-verbal tests: These are designed to test intelligence without the use of language.
The test items contain pictures, diagrams and geometrical figures etc. printed in a booklet.

Uses of Intelligence tests


Intelligence tests are used for a variety of purposes, such as:
1. Use for Classification: Intelligence tests find their greatest use in classification and grading of pupils
according to their ability. Only then their course of study, nature of treatment and method of instruction
may be planned according to their abilities and fitness.

2. Use for selection: Intelligence tests can be properly used for the purpose of selection of suitable
candidates for different purposes such as admission to an educational programme or course of instruction.

3. Use for Diagnosis: Intelligence tests are used to diagnose and discriminate the differences in mental
functioning of an individual. Example, identifying exceptional children like gifted, backward, feeble
minded, etc.

4. Use for prediction: Intelligence tests can also be used in deciding success in chosen professions and
social careers.

5. Use in Educational Guidance: Intelligence tests judge more or less correctly a child’s capacity to
learn, to read, and to understand. After assessing the IQ of an individual, he/she can be directed to the
suitable channel for him/her.

6. Use in Vocational Guidance and Selection: Intelligence test scores may help the counselor in
suggesting occupation and career that should be aimed at by a particular individual.

7. Use in research work: Intelligence tests are very much needed for research in the field of education,
psychology and sociology. They serve as a useful tool in measuring the mental abilities and qualities of
children.

2. Educational Achievement Test – Meaning, Classification, Uses

Meaning and definition

The word achievement means the level of success attained by an individual or group on the
completion of a certain task. It is the behavioural changes which take place within the individual as a
result of learning experiences of various kinds.
An achievement test is designed to measure the extent to which an individual has achieved
something, acquired certain information or mastered certain skill, usually as a result of specific
instruction provided in a classroom or training programme.
Freeman defines a test of educational achievement as a test “designed to measure knowledge,
understanding, skills in a specific subject or group of subjects.”

Classification of Achievement Test


Achievement tests are broadly classified as
I. Standardized Achievement Test
II. Teacher-Made Test.

I. Standardized Achievement Test: Standardized achievement test can be classified on the basis of purpose
or function as follows:
i. Pre-test or readiness test: This type of test is given to measure the necessary skills the students
possess to succeed in a particular educational task. It is designed to appraise the student’s readiness and
capability to take up a new course or to study a particular skill.
ii. Mastery test: This type of achievement test is a technical test confined to the minimum
performance which students are expected to master. The individual’s performance is judged as
indicating mastery or non-mastery of each skill.
iii. Diagnostic test: Diagnostic tests have been developed to pinpoint the causes of a student’s
learning difficulties. These tests help us to know the particular strengths and weaknesses of the individual
and identify those in need of remediation.
iv. General Achievement or Survey Tests: General Achievement tests are further classified into
two:
a. General Achievement Batteries: This type of tests can be used from primary grades to adult
level. They measure continuity of educational growth over several school grades. They also reveal
group or classroom differences in subject matter, skills or insights being tested.
b. Standardized Tests in Separate content areas: These tests measure achievement in specialized
areas included in the educational curriculum. These tests have been prepared for almost every
subject.
II. Teacher-made Tests: These are specially designed by the teacher for his class for a special purpose. A
teacher-made test is likely to reflect, to a greater degree, what was actually taught in the classroom. A
teacher-made test may mainly consist of the following:
i. Simple questions: These types of questions are used to measure the lower order of knowledge and
comprehension.
ii. Completion type questions: These types of questions are like simple questions, but the format
comprises an incomplete statement.
E.g. Lord Mountbatten ……………………………. (possible completions: was a historical personality,
was an Englishman, was born in England, etc.)
iii. Short answer questions: Here, the answer is given in short and may consist of few sentences or
paragraphs. It is used to measure abilities like analysis, synthesis and evaluation.
iv. Essay Type questions: These tests require a long essay as an answer to the question and the
examinee has full freedom of expression and organization of his thoughts. Response to essay
questions may also reflect student’s attitudes, creativity and verbal fluency – factors that may or may
not be relevant to the purpose of the testing.

Uses of Achievement Tests

Different uses of Achievement tests are:
i. Achievement tests provide feedback to students regarding the effectiveness of their learning.
ii. Achievement tests motivate students to study. These tests are frequently used to show the student how
much he does not know, and thereby stimulate him to study.
iii. Achievement tests help to identify the source of a student’s difficulty and indicate possible courses of
remedial action.
iv. Achievement tests may also be useful in the counseling process.
v. Achievement tests can also be utilized to predict future academic success.

3. Personality Test – Meaning, Classification

The word personality is derived from the Latin word ‘persona’, which means ‘mask’. So, personality
originally meant the outward appearance of a person. Later it came to mean the real nature of a man or the
inner make-up of the individual. Now, the word personality is used in a wide sense. It implies the
organization and pattern of everything which an individual possesses.
According to J.B Watson, “Personality is the sum of activities that can be discovered by actual
observation over a long enough period of time to give reliable information.”
According to R.B Cattell, “Personality is that which permits a prediction of what a person will do in
a given situation.”

Classification of Personality Tests

1. Subjective Methods: These include,


i. Case History: This is a description of the past development of the individual. Here, the
psychologists collect information about the heredity and environmental factors which must have
influenced the individual’s life.

ii. Autobiography: This is the story of life given by the subject himself. This technique is employed
to get a faithful record of one’s experiences of past as well as of present.

iii. Interview: It is the technique of collecting information directly from the subject about his
personality by face-to-face contacts.

iv. Personality Inventory: This is a self-rating and self-evaluating process. These are used to have
the subject’s responses to a number of items which are meant to study his behaviour, his liking and
disliking, etc.

2. Objective Methods: These include,


i. Observation: In this method, the observer decides which personality traits or characteristics he
needs to know, and then observes the relevant activities of the subject in real life situations.

ii. Checklist: A checklist is a simple tool for evaluating personality. Here, lists of items are given. It
is constructed to cover various aspects of an individual’s behavioural adjustment and personality.

iii. Rating scale: This technique is devised for assessing personality on certain traits by gathering
opinions from others. It takes into account the impression a person has made on another with whom
he has close acquaintance.

3. Socio-metric Method: It is a technique for revealing and evaluating the social structure of a group
through the measurement of frequency of acceptance and non-acceptance among individuals who constitute
the group.

4. Projective Tests: In psychology the term ‘projection’ means the tendency of an individual to see his/her
own unwanted traits, ideas and motives in other persons or objects. In this type of test, the test materials are
unstructured, vague, ambiguous and neutral. The subject is asked to supply meaning, significance and
organizations and in doing so he unconsciously leaves the impression of his own personality upon the
undefined stimulus materials.

Some of the major projective tests are:


a. Word Association Techniques
b. Sentence-completion test
c. Figure drawing
d. Expressive methods
f. Rorschach Inkblot test
g. Thematic Apperception Test

4. Aptitude Test – Meaning, Type, Uses

Aptitude may be described as a special ability or capacity which helps an individual to acquire the
required degree of proficiency or achievement in a specific field. It is a condition, a quality, or a set of
qualities in an individual which is indicative of the probable extent to which he will be able to acquire some
specific knowledge, understanding or skill in different activities with suitable training.
Aptitude tests measure the degree or level of one’s special ability in the same way as intelligence
tests are employed for measuring one’s general mental ability. Aptitude tests indicate the ability to acquire
certain behaviour or skills when given an opportunity.
According to Hull, “An aptitude test is a test designed to discover what potentiality a given person
has for learning some particular vocation or acquiring some particular skill.”

Types of Aptitude Tests

1. Multiple-Aptitude Tests: The multiple aptitude approach represents an adoption of the group-factor
approach to abilities. This approach recognizes that testing time will always be limited and that specific
abilities can be clustered into groups.

2. Differential Aptitude Tests: Each test in this battery is a power test and should measure the level of
ability. Each of the abilities represented in the battery should be independently tested. The battery includes
the following eight tests:
i. Verbal reasoning
ii. Numerical Ability
iii. Abstract reasoning
iv. Spatial Relations
v. Mechanical Comprehensive Test
vi. Language usage (Part-I)
vii. Language usage (Part-II)
viii. Clerical Speed and Accuracy
3. General Aptitude test battery: This battery consists of 12 tests selected to measure nine aptitudes
important for success in a wide variety of occupations. These are:
i. Intelligence
ii. Verbal aptitude
iii. Numerical Aptitude
iv. Spatial Aptitude
v. Form perception
vi. Clerical perception
vii. Motor Coordination
viii. Finger dexterity
ix. Manual dexterity
4. Test of Primary Mental Ability: The factors measured under this test include:
i. Verbal reasoning
ii. Number facility
iii. Spatial relations
iv. Reasoning
v. Perceptual speed
vi. Word fluency
5. Measures of Specific Aptitudes: These tests have been devised to measure the aptitudes of individuals
in various specific fields of activities. It focuses on measuring a specific ability rather than attempting to
provide a broad picture of abilities.

6. Mechanical aptitude test: Mechanical ability is an ability involved in manipulating concrete objects,
such as tools and in dealing mentally with mechanical movements.

7. Test of clerical Aptitude: Clerical aptitude refers to the ability to do routine clerical work. It includes:
i. Perceptual ability
ii. Intellectual ability
iii. Various Mental skills
iv. Motor ability
8. Musical and Artistic abilities: These tests have been devised for discovering musical and artistic talents.
It includes:
i. Seashore Measures of Musical Talents
ii. Judgement of Rhythm
iii. Intensity or Loudness discrimination
iv. Time discrimination
v. Tonal memory
9. Measuring Aptitude of Art: Art aptitude is a complex of interest, energy, perseverance traits and other
uncommon factors.

10. Tests of Scholastic and Professional aptitudes: Scholastic aptitude tests have been developed for the
selection of students for admission to specific courses or professions like engineering, medicine, law,
teaching, etc.

Uses of Aptitude Test

1) Career counseling: Aptitude tests are used mostly by career counselors to help students make a proper
choice of courses or occupation. In such cases, the counselor administers aptitude test batteries that are a
combination of tests measuring a wide variety of abilities.

2) Clinical service: Information obtained from aptitude tests can also be used for making a clinical decision
regarding an underachieving, maladjusted student whether he has motivational or other conduct problem or
he simply lacks the ability to learn.

3) Personnel selection: Employers use vocational aptitude tests to select employees. Usually, they use
special aptitude tests that measure the particular skill required for the job. These tests predict success in the
particular job.

4) On the job training: Organizations rely on aptitude test scores for training and need analysis, i.e.,
exploring the strengths and weaknesses of the individual employees, so as to provide them need-based
on-the-job training.

5) Screening for admission: Most educational institutions select candidates to give admission into different
courses on the basis of scores on aptitude tests, e.g., courses on education (B.Ed) and management.

6) Curricular planning: School administrators use performance on multiple aptitude test batteries as a frame
of reference for curricular planning- which courses to be taught and who are to be taught etc.

5. Attitude Scale and Interest Inventory – Meaning
The term attitude refers to certain regularities of an individual’s feeling, thoughts, and
predispositions to act toward some aspect of one’s environment.
Feelings are often referred to as the affective component, thoughts as the cognitive component, and
predispositions to act as the behavioural component.

Attitude Scale
Attitude scales are the most commonly used technique for measuring attitudes. They are used for
discovering the opinions of individuals concerning different objects, problems and persons.
They are most commonly used because a precise measurement is possible with these scales. They
provide the degree of affect that individuals may associate with the attitudinal object. There are four
methods of constructing attitude scales, viz. (a) Thurstone type scale, (b) Likert type scale, (c) Guttman’s
scalogram and (d) Osgood’s semantic differential type.

a) In the Thurstone type scale, the respondent is given a fixed set of statements from which he must choose.
The statements are assigned scale values so that a quantitative index of the attitude may be obtained. A scale
value is assigned to each statement at the time of scale construction. The scale is standardised by giving the
statements to a large number of judges, who decide the degree to which each statement is favourable,
neutral or unfavourable by placing it in one of a series of equal-appearing intervals (labelled A to I, running
from Agree through Neutral to Disagree). The median of all the judgements becomes the scale value assigned
to the statement. The respondent selects those items with which he agrees. His attitude score is the average
of the scale values of the items with which he agrees.

b) In Likert type scale the respondent chooses one of the five possible responses to each item. These are
strongly agree, agree, undecided, disagree and strongly disagree. These are given weights of 1, 2, 3, 4 and
5 respectively. The total score of an individual is the sum total of the weights for each response he makes
to the statements.

c) Guttman’s scalogram is considered uni-dimensional. A respondent's responses to every item are consistent
with his overall position on the attitude dimension. For example, suppose an attitude scale consists of three
items. Individuals could make four possible scores on this scale, 3, 2, 1 and 0, representing agreement with
all three items at one extreme and disagreement with all three at the other. Everyone who agrees with item 3
also agrees with items 2 and 1, and everyone who agrees with item 2 agrees with item 1.

d) In Osgood’s semantic differential scale each statement is provided with two opposite responses like good-
bad, fair-unfair. This is relatively simple to construct. This method has appeared to be useful for certain
kinds of scaling problems.
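
The scoring rules described for the first three scale types can be sketched as follows. The Likert weights
are the ones stated above; the function names, statement labels and sample data are hypothetical and serve
only as an illustration.

```python
from statistics import mean, median


def thurstone_scale_value(judgements):
    """Scale value of one statement: the median of the judges' placements."""
    return median(judgements)


def thurstone_score(scale_values, agreed_items):
    """Attitude score: average scale value of the statements the respondent agrees with."""
    return mean(scale_values[item] for item in agreed_items)


# Weights as given in the text for the five Likert response categories.
LIKERT_WEIGHTS = {
    "strongly agree": 1,
    "agree": 2,
    "undecided": 3,
    "disagree": 4,
    "strongly disagree": 5,
}


def likert_score(responses):
    """Total score: the sum of the weights of the chosen responses."""
    return sum(LIKERT_WEIGHTS[r] for r in responses)


def is_guttman_consistent(pattern):
    """Check a response pattern (1 = agree, 0 = disagree) for items ordered 1, 2, 3, ...

    In a perfect Guttman scale, agreeing with a later item implies agreeing
    with every earlier item, so all the 1s must come before all the 0s.
    """
    return list(pattern) == sorted(pattern, reverse=True)


# Hypothetical judges' placements (1 to 9) for three statements.
scale_values = {
    "s1": thurstone_scale_value([2, 2, 3, 1, 2]),
    "s2": thurstone_scale_value([5, 6, 5, 6, 5]),
    "s3": thurstone_scale_value([8, 9, 8, 7, 8]),
}

print(thurstone_score(scale_values, ["s1", "s2"]))
print(likert_score(["strongly agree", "agree", "undecided"]))
print(is_guttman_consistent([1, 1, 0]), is_guttman_consistent([0, 1, 1]))
```

In practice, negatively worded Likert statements are often reverse-scored before summing; that step is left
out here because the notes do not describe it.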

Interest Inventory
Interest is a behaviour orientation towards certain objects, activities or experiences. It is an
expression of our likes and dislikes, or our attractions and aversions. An individual chooses the most
acceptable, suitable alternative out of many, goes after preferred objectives, activities, etc., and consequently
derives satisfaction, success and happiness out of the activities selected.

Methods of Measuring Interests


We can measure the interests of individuals by the following methods :
1) Observation: We may observe manifest interests. What an individual actually does is a good indication
of what his interests are.

2) Claims of the Counsellee: We can know the interests by knowing the expressed interests of the individual
in a subject, activity, object or vocation. A verbal claim can be an indicator of his interests.

3) Use of Instruments: We may assess interests using an instrument like the Michigan Vocabulary Test, on the
ground that if an individual is really interested in something, he will know the vocabulary involved in that area.

4) Use of Inventories: We may determine the pattern of an individual's interest from his responses to lists
of occupations and activities. Interest inventories provide information about the student's preferences which
are more stable than the verbally claimed interests. The latter are too often influenced by his limited and
faulty knowledge of occupations. This technique is by far the most common means of assessing interests.

1. Kuder Interest Inventories: There are various forms, versions and editions of the Kuder Interest
Inventories. They help in the measurement of interests from different angles and are designed for different
purposes. The items in the Kuder inventories are of the forced-choice triad type. For each set of three
activities listed, the respondent indicates which he would like the most and which he would like the least.

The following forms of Kuder Interest Inventories are quite common :

i) The Kuder Vocational Preference Record - It provides 10 interest scales plus a verification scale for
detecting carelessness, misunderstanding and the choice of socially-desirable but unlikely answers. The
interest scales include: Outdoor, Mechanical, Computational, Scientific, Persuasive, Artistic, Literary,
Musical, Social Service and Clerical. Forced-choice triad items are used. The respondents indicate which
of the three activities they would like most and which least. The scores are obtained not for specific
vocations but for 10 broad interest areas.

ii) Kuder General Interest Survey (KGIS) - It has been developed as a revision and downward extension
of the Kuder Vocational Preference Record. It is designed for grades 6 to 12. It employs simpler
language and easier vocabulary.

iii) Kuder Occupational Interest Inventory (KOII) - The occupations covered by this inventory vary
widely in level, ranging from baker and truck driver to chemist and lawyer.

2. Strong Vocational Interest Blanks (SVIB) - It is based on the assumption that a person who has the
interest patterns typical of successful people in a given occupation will enjoy and find satisfaction in
that occupation.

UNIT IV: EDUCATIONAL STATISTICS
Meaning, Nature and Scope and Use of Educational Statistics. Source of Educational Data and Difference
between Statistic and Parameter

Introduction
The word statistics seems to have been derived from the Latin word ‘status’ or the Italian word ‘statista’ or
the German word ‘Statistik’ each of which means a political state. In ancient times the governments used to
collect information regarding the population and the property or wealth of the country, the former enabling
the government to have an idea of the manpower of the country (to safeguard itself against external
aggression, if any) and the latter providing it a basis for introducing new taxes and levies.

Meaning and definition


Statistics has been defined differently by different authors from time to time. In ancient times statistics was
confined only to the affairs of the state but now it embraces almost every sphere of human activity.
Webster defines statistics as classified facts representing the conditions of the people in a state, especially
those facts which can be stated in numbers or in any other tabular or classified arrangement. This definition,
which confines statistics only to the data pertaining to the state, is inadequate as the domain of statistics is
much wider.
Bowley defines statistics as Numerical statements of the facts in any department of enquiry placed in
relation to each other. He himself defines statistics in three different ways:
(i) Statistics may be called as the science of counting.

(ii) Statistics may rightly be called as the science of averages.


(iii) Statistics is the science of the measurement of social organism, regarded as a whole in all its
manifestations.
Statistics, therefore is defined as the science of collection, compilation, tabulation, analysis and
interpretation of quantitative data. It is essentially a branch of applied mathematics i.e. mathematics applied
to the observational data. Statistics essentially means the procedure by which we understand data.

Nature of Educational Statistics


1. Statistics as a Science:
Statistics is a discipline that gives us orderly or systematic knowledge. Judged by the accepted
conditions of a science, statistics can be called a science on the following grounds:
(i) Statistics is a body of knowledge and is developing at a very rapid pace.

(ii) Its various principles are widely used in all areas. The Law of Statistical Regularity, the Law of Inertia
of Large Numbers, the Theory of Probability, etc. are universal rules.

(iii) Based on the facts of the past and present, future trends are predicted by many statistical methods.
In this way, it is entirely appropriate to call statistics a science.
2. Statistics as an Art:
If science is knowledge then art is action, i.e., art refers to the branch of knowledge which identifies the best
methods for solving various problems and also suggests the measures for achieving the desired results.
Statistics is also an art because:
(i) Statistics presents solutions, methods and conclusions for solving various problems.

(ii) Statistics also studies how to use different statistical methods and rules to solve various problems.
For example, statistics tells us where the arithmetic mean is best used and in which
situations the median is best used, how to construct an index and which average should be used in it.

(iii) The application of statistical methods requires special skill, experience and self-restraint in
the person applying them, which is what qualifies a subject as an art.

3. Statistics as a Scientific Method:


Statistics should be understood in the context of the general scientific method of acquiring knowledge. There
are four stages in this method:
(i) Observation
(ii) Hypothesis
(iii) Prediction
(iv) Verification
In fact, statistical methods are a useful tool needed for research, which is why Crookston has also considered
statistics as a scientific method.

Uses of Statistics in Education


Need, Importance and Uses of Statistics:
1. Group Comparison:
The achievements of a class are not uniform in every subject. It is found that one class is progressing faster
in one subject, while another is progressing faster in a different one. Even the various sections of a particular
class do not progress uniformly.
2. Individual Comparison:
Statistics helps in the individual comparison of students differing in respect of their ages, abilities and
intelligence levels. It is statistics which tells us why students who are similar in every other respect
do not show similar achievement in one particular subject.

3. Educational and Vocational Guidance:
Every individual student differs from others in his intellectual ability, interests, attitudes and mental
abilities. Students are given educational and vocational guidance so that they make the best use of these
abilities, and the process of guidance is based upon statistics.

4. Educational Experiments and Research:


With a change in place, time and circumstances, the aims, curricula and methods of education keep on
changing. The work of research and experimentation cannot become reliable and valid without the use of
statistics.

5. Essential for Professional Efficiency:


The teacher’s responsibility does not end when he teaches a particular subject in the classroom. His
responsibility includes teaching the students, obtaining the desired level of knowledge for himself and
assessing the achievement of modification in behaviour also.

6. Basis of Scientific Approach to Problems:


Statistics forms the basis of scientific approach to problems of Educational Psychology.

Scope of Statistics
Statistics may be defined as the collection, presentation, analysis and interpretation of numerical
data.

1. Statistics in Business and decision making: With the help of statistical methods, quantitative
information about production, sale, purchase, finance, etc. can be obtained. This type of information
helps the businessmen in formulating suitable policies.

2. Statistics in Mathematics: In statistical quality control, we analyze the data which are based on the
principles involved in Normal curve.

3. Statistics in Economics: Statistics is the basis of economics. The consumer’s maximum satisfaction
can be determined on the basis of data pertaining to income and expenditure. The various laws of
demand depend on the data concerning price and quantity. The price of a commodity is well determined
on the basis of data relating to its buyers, sellers, etc.

4. Statistics in business and industry: In past days, decisions regarding business were made only on
personal judgement. However, in these days, they are based on several mathematical and statistical
techniques, and the best decision is arrived at by using these techniques. For example, by using
hypothesis testing, we can reject or accept the null hypothesis, which is based upon an assumption
made about the population or universe.

5. Statistics in Science and Research: Statistics has great significance in the field of physical and
natural sciences. It is widely used in verifying scientific laws and phenomena.

For example, it is used to formulate standards of body temperature, pulse rate, blood pressure, etc. The
success of modern computers depends on the conclusions drawn on the basis of statistics.

6. Statistics in Banking: In banking industry, the bankers have to relate demand deposits, time deposits,
credit etc. It is on the basis of data relating to demand and time deposits that the bankers determine the
credit policies. The credit policies are based on the theory of probability.

7. Statistics in State: In the modern era, the role of the State has increased and various governments of the
world also take care of the welfare of their people. Therefore, these governments require much greater
information in the form of numerical figures for the fulfilment of welfare objectives in addition to the
efficient running of their administration.

8. Statistics in Planning: One of the aims of planning could be to achieve a specified rate of growth of
the economy. Using statistical techniques, it is possible to assess the amounts of various resources
available in the economy and accordingly determine whether the specified rate of growth is
sustainable or not.

9. Statistics in Psychology and Education: Statistical methods help in the construction and
standardization of various tests and measures like achievement tests in various subjects, intelligence
tests, aptitude tests, interest inventories, attitude tests or scales and various other measures of
personality testing.

Sources of Educational Data

Sources of Data
There are two sources of data in Statistics. Statistical sources refer to data that are collected for some official
purposes and include censuses and officially conducted surveys. Non-statistical sources refer to the data that
are collected for other administrative purposes or for the private sector.

Statistical Survey
A statistical Survey is normally conducted using a sample. It is also called Sample Survey. It is the method of
collecting sample data and analyzing it using statistical methods. This is done to make estimations about
population characteristics. The advantage is that it gives you full control over the data. You can ask questions
suited to the study you are carrying out. But, the disadvantage is that there is a chance of sampling error
creeping in. This is because a sample is chosen and the entire population is not studied. Leaving out some units of the
population while choosing the sample causes this error to arise.

Census
In contrast to a sample survey, a census is based on all items of the population, and the data are then analyzed. Data
collection happens for a specific reference period. For example, the Census of India is conducted every 10
years. Other censuses are conducted roughly every 5-10 years. Data is collected using questionnaires that may
be mailed to the respondents.
Responses can also be collected over other modes of communication like the telephone. An advantage is that
even the most remote of the units of the population get included in the census method. The major disadvantage
lies in the high cost of data collection and that it is a time-consuming process.

Register
Registers are basically storehouses of statistical information from which data can be collected and analysis can
be made. Registers tend to be detailed and extensive. It is beneficial to use data from here as it is reliable. Two
or more registers can be linked together based on common information for even more relevant data collection.
From agriculture to business, all industries maintain registers for record-keeping. Some administrative registers
also serve the purpose of acting as a repository of data for other statistical bodies in a country.

Types of Data and Data Collection


There are two types of data: primary and secondary.

Primary data
Primary data, as the name suggests, are first-hand information collected by the surveyor. The data so collected are pure and
original and collected for a specific purpose. They have never undergone any statistical treatment before. The
collected data may be published as well. The Census is an example of primary data.
Methods of primary data collection:

1. Personal investigation: The surveyor collects the data himself/herself. The data so collected is reliable
but is suited for small projects.
2. Collection Via Investigators: Trained investigators are employed to contact the respondents to collect
data.
3. Questionnaires: Questionnaires may be used to ask specific questions that suit the study and get
responses from the respondents. These questionnaires may be mailed as well.
4. Telephonic Investigation: The collection of data is done through asking questions over the telephone to
give quick and accurate information.

Secondary data
Secondary data are the opposite of primary data. They have already been collected and published (by some
organization, for instance). Surveyors can use them as a source of data for their own analysis.
Secondary data are impure in the sense that they have undergone statistical treatment at least once.
Methods of secondary data collection:

1. Official publications such as the Ministry of Finance, Statistical Departments of the government, Federal
Bureaus, Agricultural Statistical boards, etc. Semi-official sources include State Bank, Boards of
Economic Enquiry, etc.
2. Data published by Chambers of Commerce and trade associations and boards.
3. Articles in the newspaper, from journals and technical publications.

Difference between Statistic and Parameter

Definition of Statistic

A statistic is defined as a numerical value, which is obtained from a sample of data. It is a descriptive
statistical measure and function of sample observation. A sample is described as a fraction of the population,
which represents the entire population in all its characteristics. The common use of statistic is to estimate a
particular population parameter.

From the given population, it is possible to draw multiple samples, and the result (statistic) obtained from
different samples will vary, which depends on the samples.
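
A small simulation can make this point concrete. The sketch below is only illustrative: the population of 10,000 values, its mean and spread, and the sample size of 40 are all invented, and the seed is fixed so that the run is repeatable.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Hypothetical population of 10,000 values (invented for the example).
population = [rng.gauss(54, 8) for _ in range(10_000)]

# The parameter: computed from every unit in the population, so it is fixed.
parameter = mean(population)

# The statistic: computed from a sample, so it changes from sample to sample.
sample_means = [mean(rng.sample(population, 40)) for _ in range(5)]

print(round(parameter, 2))
print([round(m, 2) for m in sample_means])
```

Each sample mean differs a little from the others and from the population mean, which is why a statistic is treated as a variable and a parameter as a fixed value.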

Definition of Parameter

A fixed characteristic of population based on all the elements of the population is termed as the parameter.
Here population refers to an aggregate of all units under consideration, which share common characteristics.
It is a numerical value that remains unchanged, as every member of the population is surveyed to know the
parameter. It indicates true value, which is obtained after the census is conducted.

Key Differences Between Statistic and Parameter

The difference between statistic and parameter can be drawn clearly on the following grounds:

1. A statistic is a characteristic of a small part of the population, i.e. sample. The parameter is a fixed
measure which describes the target population.
2. The statistic is a variable and known number which depends on the sample of the population, while the
parameter is a fixed and unknown numerical value.
3. Statistical notations are different for population parameters and sample statistics, which are given as
under:
• In population parameter, µ (Greek letter mu) represents mean, P denotes population proportion,
standard deviation is labeled as σ (Greek letter sigma), variance is represented by σ², population
size is indicated by N, standard error of mean is represented by σx̄, standard error of proportion
is labeled as σp, standardized variate (z) is represented by (X − µ)/σ, and coefficient of variation is
denoted by σ/µ.
• In sample statistics, x̄ (x-bar) represents mean, p̂ (p-hat) denotes sample proportion, standard
deviation is labeled as s, variance is represented by s², n denotes sample size, standard error of
mean is represented by sx̄, standard error of proportion is labeled as sp, standardized variate (z)
is represented by (x − x̄)/s, and coefficient of variation is denoted by s/x̄.

Illustration

1. A researcher wants to know the average weight of females aged 22 years or older in India. The
researcher obtains the average weight of 54 kg, from a random sample of 40 females.
Solution: In the given situation, the statistic is the average weight of 54 kg, calculated from a simple
random sample of 40 females in India, while the parameter is the mean weight of all females aged 22
years or older in India.
2. A researcher wants to estimate the average amount of water consumed by male teenagers in a day.
From a simple random sample of 55 male teens the researcher obtains an average of 1.5 litres of water.
Solution: In this question, the parameter is the average amount of water consumed in a day by all male
teenagers, whereas the statistic is the average of 1.5 litres of water consumed in a day by male
teens, obtained from a simple random sample of 55 male teens.

Parameter | Statistic
A parameter describes a whole population. | A statistic describes a sample but can be used to estimate the characteristics of whole populations.
A parameter is a fixed, unknown value. | A statistic is a known variable that depends on the sample.
In most cases, a parameter is not directly observable and calculable (unless we’re talking about very small populations which can easily be surveyed or observed in their entirety). | A statistic is easily observable and directly calculable.
In case of parameters, the Greek letter “mu” represents the population mean. | In case of statistics, the “x-bar” symbol represents the sample mean, and most other notations differ too.

Conclusion

To sum up the discussion, it is important to note that when the result is obtained from the population, the
numerical value is known as the parameter, whereas if the result is obtained from the sample, the numerical
value is called a statistic.

2. Measures of Central Tendency – Its uses and limitations – Mean from ungrouped data and grouped
data (long and short method).

The three most commonly used measures of central tendency are the mean, the median and the mode.

i. The Mean:
The mean of a distribution is commonly understood as the arithmetic average. It is perhaps the most familiar,
most frequently used and best understood average.
The mean of a set of observations or scores is obtained by dividing the sum of all the values by the
total number of values.
The formula for finding the mean of ungrouped data is,

M = ∑X / N, where,

M = mean
∑ = sum of
X = scores in a distribution
N = total number of scores
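
The formula can be checked with a short Python sketch. The five scores are hypothetical; the variable names follow the symbols defined above.

```python
# Hypothetical ungrouped scores (invented for the example).
scores = [10, 20, 30, 40, 50]

N = len(scores)      # total number of scores
sum_X = sum(scores)  # ∑X, the sum of all the scores
M = sum_X / N        # M = ∑X / N

print(M)  # 30.0
```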
Uses of Mean

The mean is used when,
a. The scores are distributed symmetrically about the centre of the distribution
b. The most stable measure of central tendency is desired
c. Additional statistics are to be computed later;
d. The centre of gravity of a sample is desired.
e. Changes in absolute magnitudes are to be averaged.
f. The data represent either an interval or a ratio scale.

Limitations of Mean
a. It is very much affected by the extreme values. A single abnormal value will distort the Mean
obtained from the mass values.
b. It cannot be calculated for open-ended classes when lower limit of the first class interval and
upper limit of the last class interval are not known.
c. It cannot be located graphically.
d. It cannot be accurately determined even if one of the values is not known.

ii. The Median


In statistics and probability theory, the median is the value separating the higher half from the lower half
of a data sample, a population, or a probability distribution. For a data set, it may be thought of as "the
middle" value. The basic feature of the median in describing data compared to the mean is that it is
not skewed by a small proportion of extremely large or small values, and therefore provides a better
representation of a "typical" value.

For example, 36 is the median of the scores:


31, 33, 36, 37, 40
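
A minimal Python sketch of the same idea, using the scores from the example above. The second list, with an even number of scores, is an added illustration: there the median is taken as the average of the two middle values.

```python
from statistics import median

scores = [31, 33, 36, 37, 40]
print(median(scores))  # 36, the middle value of the ordered scores

# With an even number of scores there is no single middle value,
# so the two middle values are averaged (illustrative extra case).
even_scores = [31, 33, 36, 37]
print(median(even_scores))  # (33 + 36) / 2 = 34.5
```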

Uses of Median
Median is used,
a. When a quick estimate of an average is desired
b. When the exact mid-point of the distribution is wanted
c. When the average is to be located graphically
d. It is specifically useful for the data related to qualitative phenomena, e.g. honesty, character, etc.
e. When there is not sufficient time to compute a mean.
f. When an incomplete distribution is given

Limitations of Median
a. It does not depend upon all the observations
b. It ignores the extreme items
c. It is more affected by fluctuations of sampling than the Mean.
d. Median is not amenable to further algebraic manipulation

iii. The Mode


The mode is the value that appears most frequently in a data set. A set of data may have one mode, more
than one mode, or no mode at all. Other popular measures of central tendency include the mean, or the
average of a set, and the median, the middle value in a set.
For example, in the series 9, 10, 11, 16, 18, 18, 19 and 21, the most recurring measure, namely, 18,
is the crude or empirical mode.
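
The crude mode can be found by counting how often each value occurs. The sketch below uses the series from the example; the helper name modes and the two extra test series are assumptions added for illustration. It returns every value tied for the highest frequency, and an empty list when no value repeats.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list if nothing repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([9, 10, 11, 16, 18, 18, 19, 21]))  # [18]
print(modes([1, 1, 2, 2, 3]))                  # [1, 2], two modes
print(modes([1, 2, 3]))                        # [], no mode
```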

Uses of Mode
Mode is used,
a. When a quick and approximate measure of central tendency is wanted.
b. When the measures of central tendency should be the most typical value.
c. When any measure of central tendency is roughly workable.
d. When we need to know the most often recurring scores or value of the item in a series.

Limitations of Mode
a. It is not based on all the observations of the series.
b. It is influenced by magnitude of the class intervals – that is, affected by changes in the grouping
scheme.
c. It is stable only when the sample is large.
d. It is not suitable for further mathematical treatment. It is not used extensively, because its results
are not considered very reliable.

3. Measures of Variability – Its use and limitations

The measures of central tendency alone will not give the researcher a complete picture of the data. They do
not tell the researcher how the scores tend to be distributed. For this, another kind of statistic, the measure
of variability, is used. It is also called the measure of spread or dispersion.

i. The Range
Range is defined as the difference between the highest score and the lowest score. It is the simplest measure
of dispersion.

Thus,
Range = Largest value – Smallest value

Suppose the scores of a group of students are:


100, 90, 82, 81, 80, 79, 78, 70, 60.

Here, the largest value is 100 and the smallest value is 60.


Therefore, Range = 100 – 60 = 40.
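
The same calculation in Python, using the scores listed above (the variable names are only illustrative):

```python
scores = [100, 90, 82, 81, 80, 79, 78, 70, 60]

largest = max(scores)
smallest = min(scores)
score_range = largest - smallest  # Range = largest value - smallest value

print(largest, smallest, score_range)  # 100 60 40
```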

Uses of range
Range is used,
a. When knowledge of extreme scores or total spread is all that is wanted.
b. When the data are too small and scattered to justify the computation of a more precise measure
of variability.
c. When a rough and quick comparison is needed. It is very frequently used where variations are
not much.

Limitations of Range
a. It is not reliable, because it is unduly influenced by the two extreme values (largest and
smallest).
b. It does not depend upon intermediate values. It cannot give any information about the general
character of the distribution.
c. It is a biased estimate.
d. It has very high sampling fluctuations.
e. It cannot be applied to open-end cases.

ii. The Quartile Deviation


The quartile deviation is one-half the range of the middle 50 percent of the cases. It is one half the scale
distance between the third quartile and the first quartile. The first quartile, denoted by Q1, is the point below
which lie 25 percent of the scores on the score scale. The third quartile, denoted by Q3, is the point below
which lie 75 percent of the scores on the score scale.

The quartile deviation is denoted by the formula:


Q = (Q3 − Q1) / 2
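
A hedged Python sketch of the formula follows. The nine scores are hypothetical, and the quartiles are located with the "inclusive" interpolation rule of the standard library, which is one of several conventions; textbook methods for grouped data may give slightly different values of Q1 and Q3.

```python
from statistics import quantiles

# Hypothetical scores (invented for the example).
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median) and Q3.
Q1, Q2, Q3 = quantiles(scores, n=4, method="inclusive")

Q = (Q3 - Q1) / 2  # quartile deviation

print(Q1, Q3, Q)  # 3.0 7.0 2.0
```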

Uses of Quartile Deviation


Quartile deviation is used,
a. When the Median is the appropriate measure of central tendency
b. When there are scattered or extreme scores and when it is desirable to rule out the influence of
the extreme values.
c. When the concentration around the Median is of primary interest. Here, we take the measure as
Median ±Q.
d. In comparing variations or uniformity in different distributions.

Limitations of Quartile Deviation


a. It is concerned with the middle 50 percent and leaves out of consideration the top 25% and the
bottom 25% of the values.
b. It is a positional measure; hence not amenable to further mathematical treatment.
c. It is very much affected by fluctuation of sampling.
d. It gives only a rough measure.

iii. The Variance and the Standard Deviation


The average of the squared deviations of the measures or scores from their mean is known as the variance.
The standard deviation is the positive square root of variance.

σ² = ∑x² / N

σ ² = variance of the distribution


x = deviation of the raw score from the mean
N = Number of scores or measures
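
The definitions above translate directly into code. In this sketch the eight scores are hypothetical; x denotes the deviation of a raw score from the mean, as in the text.

```python
from math import sqrt

# Hypothetical raw scores (invented for the example).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

N = len(scores)
M = sum(scores) / N                       # mean of the scores
deviations = [X - M for X in scores]      # x = raw score minus the mean
variance = sum(x ** 2 for x in deviations) / N   # σ² = ∑x² / N
sd = sqrt(variance)                       # σ = positive square root of σ²

print(M, variance, sd)  # 5.0 4.0 2.0
```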

Uses of Standard Deviation


a. It is used when the statistic having the greatest stability is sought.
b. It is used when extreme deviations should exercise a proportionally greater effect upon the
variability.
c. It is used in coefficient of correlation and in the study of symmetrical frequency distribution.

Limitations

a. It is neither easy to understand nor easy to calculate. The square root of the average of the squared
deviations is not readily understandable to a non-mathematical mind.
b. It gives more weight to extreme values.
c. It cannot be calculated in open end classes.
d. It is affected by the value of every item in the series.

4. Concept of Normal Distribution – Properties and Uses of Normal Probability curve in
Interpretation of Test Scores, Divergence from Normality – Skewness and Kurtosis, Derived Scores:
Linear and Normalized – their uses.

Normal Distribution is highly useful in the field of statistics and is an important continuous probability
distribution. It was first discovered in 1733 by Abraham de Moivre (1667 – 1754), a French-born mathematician
who worked in England, to solve problems in games of chance.

Normal Distribution is also known as Gaussian distribution as Karl Friedrich Gauss used this concept to
explain the error of measurement. There is a general tendency of quantitative data to take the symmetrical
bell-shaped form. This general tendency may be stated in the form of a ‘principle’ as “measurement of many
natural phenomena and of many mental and social traits under certain conditions tend to be distributed
symmetrically about their Means in proportion which approximate those of the normal probability
distribution.”

The normal distribution is a continuous probability distribution that is symmetrical on both sides of the
mean, so the right side of the center is a mirror image of the left side.
The area under the normal distribution curve represents probability and the total area under the curve sums
to one.
Most of the continuous data values in a normal distribution tend to cluster around the mean, and the further
a value is from the mean, the less likely it is to occur. The tails are asymptotic, which means that they
approach but never quite meet the horizontal axis (i.e. the x-axis).
For a perfectly normal distribution the mean, median and mode will be the same value, visually represented
by the peak of the curve.

Properties of Normal Probability Curve
The Normal Probability curve is drawn to show the equal distribution of scores on either side of the
mean, with a perfect bell-shaped curve that does not touch the base line.

1) The normal curve is symmetrical about the mean.


2) The number of cases below mean in a normal distribution is equal to the number of cases above the
mean. So mean and median are at same points.
3) The mean, median, and mode are all equal.
4) A normal distribution is perfectly symmetrical around its center. That is, the right side of the center is a
mirror image of the left side. There is also only one mode, or peak, in a normal distribution.
5) The curve does not touch the baseline.
6) The Normal Probability curve is a bell-shaped curve.
7) The normal probability curve is unimodal that is it has only one mode.
8) The maximum height of the curve is at the center i.e. mean.
9) The curve has no boundaries.
10) The total percentage of area within two points is fixed.
11) Most of the cases, about 99.73 % of the population, fall between +3σ and -3σ.
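
The fixed areas under the curve can be verified numerically. The sketch below uses the standard normal distribution (mean 0, standard deviation 1) from Python's standard library; the helper name area_within is an assumption made for the example.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve: mean 0, standard deviation 1


def area_within(k):
    """Proportion of cases lying between -kσ and +kσ of the mean."""
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2))
# Roughly 68.27 %, 95.45 % and 99.73 % of the cases respectively.
```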

Uses of Normal Probability Curve


1) To calculate the percentile rank scores in a normal probability distribution.
2) To normalize a frequency distribution, an important process in standardizing a psychological test or
inventory.
3) To test the significance of observed measures and to find out sampling errors.
4) To determine the percentage of cases within the given limits or scores.
5) To know how many students fall below and above the average performance.
6) To find the limits of the scores.
7) To compare two different distributions.
8) To find out the relative difficulty of test items.
9) To find out the number of cases between the mean and one standard deviation.
10) To divide a group into sub-groups of the same ability and assign the same grade to each.
11) To find out the percentile rank of a student from the score, and the score from the percentile rank.
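
Several of these uses rest on converting between a score and a percentile rank. The sketch below assumes, purely for illustration, that test scores are normally distributed with a mean of 50 and a standard deviation of 10; these figures and the variable names are hypothetical.

```python
from statistics import NormalDist

# Assumed distribution of test scores (hypothetical mean and SD).
dist = NormalDist(mu=50, sigma=10)

# Percentile rank of a score: percentage of cases falling below it.
percentile_rank = 100 * dist.cdf(60)

# Reverse direction: the score that corresponds to a percentile rank of 90.
score_at_p90 = dist.inv_cdf(0.90)

print(round(percentile_rank, 2))  # about 84.13
print(round(score_at_p90, 2))     # about 62.82
```

A score one standard deviation above the mean thus has a percentile rank of about 84, because roughly 34 percent of the cases lie between the mean and +1σ.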

Skewness and Kurtosis
i. Skewness

It is the degree of distortion from the symmetrical bell curve or the normal distribution. It measures the lack
of symmetry in data distribution. It differentiates extreme values in one versus the other tail. A symmetrical
distribution will have a skewness of 0.
There are two types of Skewness: Positive and Negative

Positive Skewness is when the tail on the right side of the distribution is longer or fatter. The mean
and median will be greater than the mode.

Negative Skewness is when the tail of the left side of the distribution is longer or fatter than the tail on the
right side. The mean and median will be less than the mode.

ii. Kurtosis
Kurtosis is all about the tails of the distribution, not the peakedness or flatness. It is used to describe the
extreme values in the tails. It is actually the measure of outliers present in the distribution.

High kurtosis in a data set is an indicator that the data have heavy tails or outliers. If there is high kurtosis,
then we need to investigate why we have so many outliers. It may indicate several things, such as wrong data
entry.

Low kurtosis in a data set is an indicator that data has light tails or lack of outliers. If we get low kurtosis(too
good to be true), then also we need to investigate and trim the dataset of unwanted results.

Mesokurtic (Kurtosis = 3): The kurtosis statistic is similar to that of the normal distribution, which means
that the extreme values of the distribution resemble those of a normal distribution. Under the definition
used here, the standard normal distribution has a kurtosis of three.

Leptokurtic (Kurtosis > 3): The distribution is longer and its tails are fatter. The peak is higher and sharper
than in a mesokurtic distribution, which means that the data are heavy-tailed, with a profusion of outliers.
Outliers stretch the horizontal axis of the histogram, which makes the bulk of the data appear in a
narrow (“skinny”) vertical range, thereby giving the “skinniness” of a leptokurtic distribution.

Platykurtic (Kurtosis < 3): The distribution is shorter and its tails are thinner than those of the normal
distribution. The peak is lower and broader than in a mesokurtic distribution, which means that the data are
light-tailed, with a lack of outliers. This is because extreme values occur less often than in the normal
distribution.
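
Both measures can be computed from the central moments of the data. The sketch below is one common moment-based formulation, assumed here because the notes do not give a formula: skewness = m3 / m2^1.5 and kurtosis = m4 / m2^2, so that a normal distribution scores about 3. The tolerance `tol` used for labelling and the sample data are hypothetical choices for illustration.

```python
def central_moment(data, k):
    """k-th central moment: the mean of (x - mean) ** k."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n

def skewness(data):
    """Moment coefficient of skewness; 0 for a symmetrical distribution."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    """Moment coefficient of kurtosis (not excess); about 3 for a normal curve."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

def classify(data, tol=0.1):
    """Label the shape; `tol` is an arbitrary cut-off chosen for this sketch."""
    s, k = skewness(data), kurtosis(data)
    if abs(s) <= tol:
        skew_label = "symmetrical"
    else:
        skew_label = "positively skewed" if s > 0 else "negatively skewed"
    if abs(k - 3) <= tol:
        kurt_label = "mesokurtic"
    else:
        kurt_label = "leptokurtic" if k > 3 else "platykurtic"
    return skew_label, kurt_label

# Made-up data: an even spread, and a set with one extreme high value.
even_spread = [1, 2, 3, 4, 5]
long_right_tail = [1, 1, 2, 2, 3, 10]

print(classify(even_spread))
print(classify(long_right_tail))
```

The functions divide by the variance, so they fail on data in which every value is identical. Small samples such as these only illustrate the arithmetic; the labels become meaningful with larger data sets.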

UNIT V: GRAPH AND VARIABLE DISTRIBUTION


1. Concept of variable. Types of Data – Grouped and Ungrouped Data
A variable is concerned with variation in some characteristic of a person, object, animal, place, situation
or natural phenomenon. It can be defined as:
A characteristic under study whose identity or value changes, or can change, from unit to unit
is called a variable.
OR
A variable is a characteristic that varies in its value or identity.

From these definitions, any characteristic possessed by a living or non-living unit whose value may change
from unit to unit, or from one group of units to another, is called a variable. Such a characteristic is called a
variable characteristic in a research study. For example, if we want to study the number of members in the families
of a village, the number of members is the variable characteristic, because its value changes from family to family,
and the family is the unit of study. In the same way, if we want to study the Mathematical Reasoning Ability (MRA)
of students, MRA is the variable characteristic and the students are the units of study. A unit of study is called
a subject in a research study.

Types
There are five types of variable in terms of research methodology as follows.

Independent Variable
A variable whose value affects the value of another variable is known as an independent variable. It is
not affected by changes in the other variable, but it affects that variable's value. Generally, the effect of
such a variable on another variable is what is measured or studied in research.
An independent variable is also known as an absolute variable.
The following examples illustrate the concept.
In a comparative study of the Computer Aptitude (CA) of undergraduate students of different faculties, ‘Faculty’
is the independent variable, because the researcher checks the impact of faculty on the computer aptitude of the
students. Faculty may have different levels, such as Arts, Commerce and Science. Here the researcher assumes that
the CA of students may differ from faculty to faculty. Each level of an independent variable is called a
Stratum, and all levels together are known as Strata.
The number of levels into which an independent variable is divided depends on how large an area the study
is to cover. If the researcher also wants to compare the CA of Engineering and Medical students in the above
study, the independent variable Faculty will have five levels.
Some independent variables, like Gender, have a fixed number of levels. E.g. in a study of the Emotional Maturity
of students in terms of their Gender, Gender has only two levels, Male and Female. Nowadays a third level
of gender, transgender, is also widely accepted. In such cases Gender has three levels: Male, Female
and Transgender.
As discussed earlier, research generally studies the impact of an independent variable on a dependent variable,
or studies the dependent variable in relation to the independent variable. So we now turn to the dependent
variable.

Dependent Variable
A variable whose value may change due to a change in the value of another variable is called a dependent
variable.

In other words, a dependent variable is a characteristic for which different values can be obtained as the
independent variable changes. Thus the value of the dependent variable may change due to a change in the
value of the independent variable.

Let’s take an example to understand this concept. In a comparative study of the Mathematical Reasoning Ability
(MRA) of students in the context of their Intelligence, MRA is the dependent variable and Intelligence is the
independent variable, because the study checks the impact of Intelligence on MRA.
The researcher may divide the students according to their level of intelligence. The levels may be high, medium
and low, or very high, high, medium, low and very low. Levels are decided according to the needs and
objectives of the study.
Generally, there is a cause-and-effect relationship between the independent and dependent variables,
where the independent variable acts as the cause and the dependent variable as the effect. In our example, the
researcher takes intelligence as the cause and MRA as the effect, because he wants to check whether MRA is
affected by intelligence or not.
Let’s take one more example to understand the relationship between dependent and independent variables.
Suppose we want to check the impact of teaching strategies such as the Concept Attainment Model (CAM) and the
Project Method (PM) on the achievement of students in Geography. We will then teach certain units of Geography to
the students by these two strategies, following the procedure of the experimental method. In this case, teaching
strategy is the independent variable, the cause that can affect the achievement of students, which is the
dependent variable.
In real life, a single characteristic is often affected by more than one factor. In such cases, two or
more variables may change the value of the dependent variable. For example, the achievement of students
in any subject may be affected by factors such as teaching strategy, intelligence, attention or understanding level,
and the study habits of students. There may be more factors as well. In such cases, the researcher also has to
consider moderator variables.

Moderator Variable
We know that the independent variable affects the value of the dependent variable and that there is a
cause-and-effect relationship between the two. A variable that affects this cause-and-effect relationship
is called a moderator variable. This means the effect of the independent variable on the dependent variable may be
different in the presence of a moderator variable.
E.g. in a study of the Value Awareness (VA) of urban, rural and semi-urban students, area is the
independent variable and VA is the dependent variable. But if the researcher thinks that the gender of students
may also affect the relationship between area and VA, gender is considered a moderator variable.
There may be several moderator variables for one pair of independent and dependent variables. In such cases the
researcher has to decide which variable to take as the moderator variable. If, in this case, the researcher feels
that the Socio-Economic Status (SES) of students may also affect the relationship between Area and VA, he can
take SES as a second moderator variable.

Controlled Variable
If the effect of a variable that can affect the cause-and-effect relationship between the dependent and
independent variables is eliminated, that variable is called a controlled variable. In other words, a moderator
variable whose effect is controlled is known as a controlled variable.
If the researcher defines a problem as ‘A study of Value Awareness of male students of urban, rural and semi
urban secondary schools of Ahmedabad district’, the variable ‘Gender’ becomes a controlled variable, because
he does not want to check the impact of gender on value awareness and takes only boys as the sample.
Consider the problem ‘A study of Value Awareness of male students, having High Socio-Economic Status, of urban,
rural and semi urban secondary schools of Ahmedabad district’. In this study, the value awareness of only those
boys who have High Socio-Economic Status is studied, so both Gender and SES become controlled variables.

Intervening Variable
An intervening variable is any variable that may affect the cause-and-effect relationship between the dependent
and independent variables but either cannot be measured clearly or is to be ignored during the research. That is,
intervening variables are neither controlled nor taken into account during research. In other words, any moderator
variable that cannot be measured or observed clearly, or that is ignored, is called an intervening variable.
In our earlier example of the study of value awareness of students, the researcher has classified variables
like Area, SES, Gender and Value Awareness as shown in Tables 1 to 4. But besides the moderator and controlled
variables mentioned in the tables, the following variables can also affect the cause-and-effect relationship
between the dependent and independent variables of our example.
• School Environment
• Friend Circle / Peer group of Student
• Social environment
• Emotional Maturity of Students
• Culture of family
• Parenting style of parents
• Value Awareness of Parents
• Age of student
• Extra Reading

Grouped vs. Ungrouped Data

Grouped Data – Data that has been organized into groups (into a frequency distribution).
If you see a table similar to the one below, you will know that you are
dealing with grouped data:

Class      Frequency
0 – 5      4
6 – 10     5
11 – 15    12
16 – 20    7

The frequency of a class is the number of numbers in that class. For example, there
must have been four numbers between 0 and 5.

Ungrouped Data – Data that has not been organized into groups.
Ungrouped data looks like a big ol’ list of numbers.

How to Group Data

On your exam, you may have to construct a frequency distribution. Constructing a
frequency distribution is the same thing as grouping data.

The first step in grouping data is deciding how large of a class interval to use.
(Class interval = Class size)

There are 2 formulas for determining the appropriate class interval. You must be able to
choose which one would be appropriate for any given problem.

1. Class interval =          (Use when the problem states the number of classes to be used.)

2. Class interval =          (Use when the problem does not state the number of classes to be used.)

**Don’t forget to always round up to the nearest whole number when dealing
with class interval.
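
The grouping procedure can be sketched in Python as follows. Two assumptions are made that the text above does not confirm: when the number of classes is stated, the class interval is taken as the range (highest value minus lowest value) divided by that number; when it is not stated, Sturges' rule (1 + 3.322 log10 N) is used as one common way of choosing the number of classes. The function names and the sample scores are hypothetical, and the data are assumed to be whole numbers.

```python
import math

def class_interval(data, num_classes=None):
    """Class size, always rounded up to the nearest whole number.

    Assumption: size = range / number of classes. If the number of classes
    is not given, Sturges' rule (1 + 3.322 * log10(N)) is used in its place.
    """
    data_range = max(data) - min(data)
    if num_classes is None:
        num_classes = 1 + 3.322 * math.log10(len(data))
    return math.ceil(data_range / num_classes)

def frequency_distribution(data, interval):
    """Group whole-number data into classes of the given size.

    Returns a list of (lower limit, upper limit, frequency) rows, adding
    classes until the highest value is covered.
    """
    table = []
    lower = min(data)
    while lower <= max(data):
        upper = lower + interval - 1
        freq = sum(1 for x in data if lower <= x <= upper)
        table.append((lower, upper, freq))
        lower = upper + 1
    return table

# Made-up ungrouped data: a plain list of scores.
scores = [2, 4, 5, 7, 8, 8, 11, 12, 13, 15, 16, 19]

size = class_interval(scores, num_classes=4)
for lower, upper, freq in frequency_distribution(scores, size):
    print(f"{lower} - {upper}: {freq}")
```

The frequencies in the resulting table should add up to the number of scores in the original list, which is a quick check that no value was missed or counted twice.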

4. Application of Computer in Data Processing
Educational research involves collection, collation and analysis of large amounts of data, which can be
handled manually or by using electromechanical devices. Calculators and mechanical Facit machines are
the common calculating devices. Computers provide the best alternative for more than one reason. Besides
being able to process data, a computer can store data over a long period of time, and its capacity is
enormous, so it can house large amounts of data.

Research involves not only collection and storage of large amounts of data; it also involves complicated
calculations for testing hypotheses. Imagine having to work through long and
complicated problems like multiple or stepwise regression, analysis of variance and covariance, or factor
analysis with the help of a tiny calculator. Computers, with the help of relevant programs, can carry out these
jobs for you in minutes.

Computers carry out such complicated calculations flawlessly and with mind-boggling speed. This has also
been possible due to the sophistication of software. Whereas earlier programmes had to be written for each
instance of data analysis, readymade packages are now available for data analysis. Calculations that
took months in the pre-computer age now take a few minutes or hours. Also, access to computers
has increased substantially over the last few years all over the world, including in third world countries.

Data Processing

In computer literature, there are definitions and elaborate descriptions of various related words like data,
information, etc. We will restrict our discussion to educational research. As you have seen in various units
in previous blocks, research tools are used to collect data. There is a separate unit on data collection in Block
IV. When you administer a structured questionnaire or a test, what you get is the data. Similarly, when you
take interviews or use observation or participant observation techniques, you land up with a lot of
information.

These data, however, are discrete and pertain to each individual respondent. As a researcher, you need to
find a pattern -- how does a variable behave in a group. For example, let us assume that you have collected
information from 400 distance learners on their motivation for joining the distance education course. Instead
of saying what 400 individuals say, you should be able to say on the basis of the 400 responses why people
join distance education courses. To come to that kind of a conclusion, you need to analyze and process data.
There are several stages to be passed when we process data with the help of a computer, namely, data
feeding, data checking, creating a data file, actual data processing, data output.

Data collected through the use of research tools need to be entered into the computer. When you use EXCEL,
the data can be entered directly on to the worksheets. The numbers in the columns can represent respondents
and rows can represent variables. For other programs, a specific data entry format has to be developed. The
pattern, however, remains more or less the same: data on each variable against every respondent. Research
students often find difficulty in entering data because they do not take sufficient care when the
questionnaires and other tools are being designed. The research tools and response patterns can be structured
keeping in mind the requirements of data entry in the computer.

Data entry can be done manually using the keyboard, or electro-mechanically by using other types of
input devices. For example, for large scale data on examinations or admissions, responses are sought on a
predesigned response sheet; these sheets are then used to transfer data to computer memory through OMR
(optical mark recognition).
The choice depends upon the volume of the data and the way data have been recorded. However, for the
research projects that you will deal with in your project for MADE, data can be entered manually by using
a computer keyboard, since the data will be small in volume, though on multiple variables, and will be
recorded on sheets of paper transferred from the scoring sheets of the research tools. You can enter the data
yourself if you are well versed with the computer and its keyboard. Alternatively, you can take the help of a
data entry operator. Whether you do it yourself or use the services of a data entry operator, care must be taken
to ensure correct entry of data. A computer will take the input and analyze whatever data are fed into it. Wrong
data will give incorrect results and hence lead to flawed conclusions. Therefore, it is
necessary to check and verify data before they are analyzed.

Data verification can be done in more than one way. There can be a sample check or a comprehensive check
of the data. After the entry of the data, a data sheet should be printed. The printed data can be compared
with the data contained in the original sheet. The second possibility is to check the data on the computer
terminal itself by comparing the same on the display unit with the data recorded on paper. For large scale
data and where complete accuracy is required, like examination results, they are fed parallel into two or
more computers. Sample check is done by using computer programs. What is important is to check data for
accuracy in data entry. For our purpose of processing research data, it can be checked manually by
comparing the entered data either in the printed form or on the screen with the original data sheets. Following
this, inaccuracies have to be corrected.
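
One way to picture a comprehensive check is to compare two copies of the same data, for example one keyed from the original sheets and one keyed a second time (or by a second operator), and list every place where they disagree. The sketch below assumes each copy is held as a dictionary mapping a respondent code to that respondent's values on each variable; the function name, respondent codes and variable names are hypothetical.

```python
def find_mismatches(first_entry, second_entry):
    """Compare two keyed copies of the same data set.

    Each copy maps respondent code -> {variable name: value}.
    Returns (respondent, variable, value in first, value in second) for
    every disagreement; a value missing from one copy shows up as None.
    """
    mismatches = []
    for respondent in sorted(set(first_entry) | set(second_entry)):
        row_a = first_entry.get(respondent, {})
        row_b = second_entry.get(respondent, {})
        for variable in sorted(set(row_a) | set(row_b)):
            a = row_a.get(variable)
            b = row_b.get(variable)
            if a != b:
                mismatches.append((respondent, variable, a, b))
    return mismatches

# Made-up example: the second copy has a transposed score and a missing respondent.
entry_1 = {
    "R001": {"age": 24, "score": 61},
    "R002": {"age": 31, "score": 58},
    "R003": {"age": 27, "score": 70},
}
entry_2 = {
    "R001": {"age": 24, "score": 16},
    "R002": {"age": 31, "score": 58},
}

for problem in find_mismatches(entry_1, entry_2):
    print(problem)
```

Each reported mismatch still has to be resolved by going back to the original data sheet, since the program can only show that the two copies differ, not which one is right.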

Research data entry and data checking are mechanical processes which require great attention. Unlike word
processing, there are no automatic indications of mistakes and wrong entries. Therefore it is wiser to use the
services of a competent data entry operator to reduce the chances of errors.

Coming to the issue of actual data processing, it is important to remember that as you enter data in the
computer, you create a data file. Probably, the computer already houses a program file. Technically, data
processing is interfacing the data file with the program file. The moment you drag the mouse and block
a certain amount of data, say a particular column in a worksheet, the computer will immediately register the
segment of data you wish to deal with. Next, when you call upon a particular symbol or a formula, the
computer will activate the corresponding segment of the program and the subroutines required to carry out
the operation. The program draws the data from the data file according to its own specification. It carries
out the necessary mathematical operations and produces the result. The entire process operates so fast that
it appears to be automatic. Once the processing of the data is over, the results can be seen on the display
unit of the computer or can be printed. It may not be possible to use the printed results directly for the
research reports. You may need to construct tables from the result sheets.

Using a Computer for Data Processing

In using a computer for data processing, it is really not necessary to be an expert in computer applications
or an expert programmer. You, as a researcher are the user and you should have the skills of the user.
Personal knowledge of computers and software is an additional competence but is not crucial. What is
important is to have a fair idea about hardware and software.
The skill required of a researcher is his/her ability to understand the statistical tools for data analysis. The
decision regarding the statistical tools depends upon the objectives and hypotheses, research designs,
research tools, size of the sample, etc., the details of which you have read in Blocks 3 and 4. While we look
at the use of a computer for data processing, we need to be clear about the relevant software for data
analysis. For example, if the research warrants the use of 'chi-square' or analysis of variance, we must choose
programs that can perform those functions. But how do we know that?

As mentioned earlier, for data processing you can use two types of software: tailor-made or a package. For
example, if you need to study the relationship between two variables, you can write a program on correlation
or get one written. Alternatively, you can look for a readily available package that has a program on
correlation. For example, the Statistical Package for the Social Sciences (SPSS) has a wide range of statistical
programs that are normally required by researchers in the social sciences. Similarly, EXCEL can carry out
certain operations, including graphics.
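
To give an idea of what a tailor-made program on correlation might look like, here is a minimal Python sketch. It assumes the Pearson product-moment coefficient is the measure wanted, which the text does not specify, and the function name and the two sets of scores are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two paired lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired lists with at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable does not vary")
    return sxy / math.sqrt(sxx * syy)

# Made-up scores of five students on two tests.
test_a = [1, 2, 3, 4, 5]
test_b = [2, 1, 4, 3, 5]

print(round(pearson_r(test_a, test_b), 2))
```

A package saves this effort: recent versions of Python (3.10 and later), for instance, provide `statistics.correlation` in the standard library, so such a function rarely needs to be written by hand.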

One important factor in a research exercise is the process of consulting. Generally, you would consult your
research guide. There are, however, certain specialized areas in research which need special consultations
with experts, besides the research guide. Two such areas are research designs and data processing. If you
or your research guide are not very conversant with statistical techniques and computer programs, it is worth
consulting experts: first specialists in research design and statistical methods, and then a computer software
professional.
