CH IV. Measurement and Scaling

Measurement

• Measurement in research consists of assigning
numbers to empirical events in compliance with a set
of rules. It occurs when an established yardstick
verifies the quantity and quality of an object or event.
• To measure is to discover the extent, dimension,
quality or capacity of something, especially by
comparison with a standard.
• Measurement involves three processes:
– Selecting observable empirical events.
– Developing a set of mapping rules: a scheme for
assigning numbers or symbols to represent aspects
of the event being measured.
– Applying the rules to each observation of that event.
• Measurement involves not only quantification
(assigning numbers to objects or their properties)
but also opinion rating scales and
characterization.
• Variables being studied in research may be
objects or properties of objects. Objects include
the things of ordinary experience such as
people, books, altitude whereas properties are
the characteristics of the objects such as
physical (weight, height), psychological
(attitude, intelligence) or social (leadership
ability, class or status).
• In a literal sense, researchers do not measure
either objects or properties; they measure
indicants of the properties or objects.
Four Different Scales of Measurement
Nominal Scale
– A nominal scale of measurement labels
observations so that they fall into different
categories. A nominal scale consists of a set of
categories that have different names.
– It labels and categorizes observations but does not
make any quantitative distinctions between
observations.
– Example:
• Gender categories: male and female
• Occupational categories: farmer, teacher, govt.
employee etc.
Interval Scale
• An interval scale of measurement consists of an
ordered set of categories (like an ordinal scale) with
the additional requirement that the categories form a
series of intervals that are all exactly the same size.
• Equal intervals make it possible to compute the
distance between values on an interval scale.
• It allows us to measure differences in the size and
amount of events. However, it does not have an absolute
zero point that indicates complete absence of the
variable being measured.
• Example: temperature, SAT score, knowledge level.
– We cannot compare two values by claiming that one is
twice as large as another.
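The point about ratios can be checked with a short Python sketch using temperature, one of the examples above. The readings are made up for illustration; the conversion is the standard Celsius-to-Fahrenheit formula.

```python
def celsius_to_fahrenheit(c):
    """Convert between two interval scales that differ in unit and zero point."""
    return c * 9 / 5 + 32

# Hypothetical readings, chosen only for illustration.
low_c, mid_c, high_c = 10, 20, 30
low_f, mid_f, high_f = (celsius_to_fahrenheit(c) for c in (low_c, mid_c, high_c))

# Ratios of values depend on the arbitrary zero point, so the two scales disagree.
ratio_c = mid_c / low_c   # 20 / 10
ratio_f = mid_f / low_f   # 68 / 50

# Ratios of differences (distances) are the same on both scales.
diff_ratio_c = (high_c - mid_c) / (mid_c - low_c)
diff_ratio_f = (high_f - mid_f) / (mid_f - low_f)

print(ratio_c, ratio_f)
print(diff_ratio_c, diff_ratio_f)
```

Because "twice as warm" holds in Celsius but not in Fahrenheit, the statement says nothing about the temperature itself; only the distances survive the change of scale.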
Ordinal scale
• An ordinal scale consists of a set of categories that are
organized in an ordered sequence. It ranks the
observation in terms of size and magnitude.
• The word ordinal implies that observations
measured on an ordinal scale are categorized and
arranged in rank order.
• It is usually used to rate the preferences of the
respondents.
• A measurement on an ordinal scale provides
information about the direction of the difference between
two individuals, but it does not reveal the magnitude of
the difference.
• Example: ranking the performance of the workers, students,
response order of preferences.
– First, second, third etc.
Ratio Scale
A ratio scale is an interval scale with the additional
feature of an absolute zero point. With a ratio scale,
ratios of numbers do reflect ratios of magnitude.
• A ratio scale of measurement has all the features of an
interval scale, but adds an absolute zero point.
• On a ratio scale, a value of zero indicates none
(complete absence) of the variable being measured.
• The ratio scale not only allows you to measure the
difference between two individuals but also allows you to
describe the difference in terms of a ratio.
• For example: we can state that one measurement is
three times larger than another or can say that one
score is only half as large as another.
– Income, expenditure, sales volume, ad expenditure etc.
Mapping Rules characteristics:
• Classification:
– Numbers are used to group or sort responses, but no order exists.
• Order:
– Numbers are ordered. One number is greater than, less than or
equal to another number
• Distance:
– Differences between numbers are ordered. The difference between
any pair of numbers is greater than, less than or equal to the difference
between any other pair of numbers.
• Origin:
– the number series has a unique origin indicated by the number
zero. All other characteristics exist.
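One way to see how the four characteristics build up across the four scales is a small lookup table. This is only a sketch: the dictionary and the helper name `scales_supporting` are illustrative, and the assignment of characteristics to scales follows the scale descriptions given earlier.

```python
# Characteristics possessed by each scale, from weakest to strongest.
SCALE_CHARACTERISTICS = {
    "nominal": {"classification"},
    "ordinal": {"classification", "order"},
    "interval": {"classification", "order", "distance"},
    "ratio": {"classification", "order", "distance", "origin"},
}

def scales_supporting(characteristic):
    """Return the scales that possess the given characteristic, weakest first."""
    return [
        name
        for name, traits in SCALE_CHARACTERISTICS.items()
        if characteristic in traits
    ]

print(scales_supporting("distance"))  # ['interval', 'ratio']
```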
Validity
• Validity refers to the extent to which a test
measures what we actually wish to measure.
• Two facets: External and internal validity
• Internal validity refers to the ability of a research
instrument to measure what it is supposed to measure
(Does the instrument really measure what its designer
claims it does?)
• External validity of research findings refers to the
data’s ability to be generalized across persons,
settings and times.
Types of validity
• There are three forms of validity:
1) Content validity
2) Criterion-related validity
3) Construct validity
Content validity
• It refers to the extent to which a measuring instrument (scale)
provides adequate coverage of the investigative questions
guiding the study.
• The degree to which the content of the items adequately
represents the universe of all relevant items under study. If the
instrument contains a representative sample of the universe of
subject matter of interest, then the content validity is good.
• Content validity is assessed through two methods:
– Judgmental
– Panel evaluation
Criterion-related validity
• It shows the accuracy of a measurement by
comparing it with another measurement that is assumed to be
valid.
– Predictive validity
It is the power of a measure to predict an outcome correctly. An
opinion questionnaire that correctly forecasts the outcome of an
election has predictive validity.
– Concurrent validity
It estimates the existence of a current behavior or condition. An
observational method that correctly categorizes families by
current income class has concurrent validity.
• The criterion must be judged in terms of the following four qualities.
– Relevance: the criterion is judged to be the proper measure
– Freedom from bias: each subject has an equal opportunity
– Reliability: the measure is stable or reproducible
– Availability: the specified information must be available
Construct validity
• A measure is said to possess construct validity to the
degree that it conforms to predicted correlations with
other theoretical propositions.
• Construct validity applies when we measure or infer the
presence of abstract characteristics for which no
empirical validation seems possible. It is more complex
and abstract than the other forms.
• Construct validity is the degree to which scores on a
test can be accounted for by the explanatory constructs
of a sound theory.
• To determine it, we associate a set of other
propositions with the results obtained from using our
measurement instrument. If the correlation is high, the
measure is considered valid.
Factors that can lower Validity
• Unclear directions
• Difficult reading vocabulary and sentence structure
• Ambiguity in statements
• Inadequate time limits
• Inappropriate level of difficulty
• Poorly constructed test items
• Test items inappropriate for the outcomes being
measured
• Tests that are too short
• Improper arrangement of items (complex to easy?)
• Identifiable patterns of answers
• Administration and scoring
• Nature of criterion
Reliability
• Reliability refers to a measure’s ability to
capture an individual’s true score, i.e. to distinguish
accurately one person from another.
• It is a measure of the consistency of test results from
one administration of the test to the next.
• Therefore, reliability is consistency of
measurement, or stability of measurement over
a variety of conditions in which basically the
same results should be obtained.
Types of Reliability
I. Test-retest reliability
• When the same test is applied to the same people after a
period of time, the approach is known as test-retest reliability.
• The same treatment (equipment) is used under
conditions that are as similar as possible.
• Finally, the results obtained from the two tests
are compared and the degree of correspondence is
determined.
II. Alternate or parallel form Method
When two measuring tools covering the same concept are
developed and administered to the same people to
see whether the results differ, the approach is known as
alternate form reliability.
If the results are highly correlated, then the measuring
instruments are reliable and the results of the research
are also considered reliable.
The chance of error is lower in this method
because the two forms of the test are applied at the
same time to the same people (sample).
III. Split half method
In this method, the content of the measuring
instrument is divided into two equal groups and
a measure of similarity between them is computed.
The instrument can be divided on the basis of
odd and even item numbers or by random
sampling of items.
Finally, a correlation is calculated between the two
halves. If the correlation is high, this indicates that the
instrument is reliable; otherwise it is not.
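The split-half procedure can be sketched in Python as follows. The item scores are invented for illustration, the odd/even split is one of the two options mentioned above, and Pearson's correlation is assumed as the measure of similarity (the slides do not name a specific coefficient).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: one row per respondent, one column per item (6 items).
responses = [
    [4, 5, 4, 5, 3, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 1, 2, 1],
]

# Odd-numbered items (1st, 3rd, 5th) form one half, even-numbered items the other.
odd_totals = [sum(row[0::2]) for row in responses]
even_totals = [sum(row[1::2]) for row in responses]

split_half_r = pearson_r(odd_totals, even_totals)
print(odd_totals, even_totals, round(split_half_r, 2))
```

A correlation close to 1 between the two half-scores suggests that both halves measure the same thing consistently. The same `pearson_r` helper could be applied to scores from two administrations (test-retest) or two forms (alternate form).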
IV. Inter-rater Method
Under this method, reliability is measured
based on the judgment of several raters (respondents)
on the same question or the same issue under the
same circumstances.
If the majority of the raters rate in a similar
way, then the instrument can be considered
reliable.
It is particularly useful when the data are
obtained through observation.
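A minimal sketch of one simple way to quantify agreement between two raters is the share of cases on which they give the same rating. Percent agreement is an assumption here, since the slides do not name a specific index, and the ratings are hypothetical.

```python
def percent_agreement(ratings_a, ratings_b):
    """Share of cases on which two raters gave the same rating (0.0 to 1.0)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)

# Hypothetical ratings of five observed cases by two raters.
rater_1 = ["high", "low", "medium", "high", "low"]
rater_2 = ["high", "low", "high", "high", "low"]

agreement = percent_agreement(rater_1, rater_2)
print(agreement)  # 4 of 5 cases match -> 0.8
```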
Improving Reliability
• Quality of items; concise statements,
homogenous words (some sort of uniformity)
• Adequate sampling of content domain;
comprehensiveness of items
• Longer assessment – less distorted by chance
factors
• Developing a scoring plan (esp. for subjective
items – rubrics)
• Ensure VALIDITY
Criteria of good measurement
In general, there are three criteria of good
measurement.
A. Reliability
B. Validity
C. Practicability
– Economy
– Convenience
– Interpretability
Validity and Reliability
[Figure: four panels labeled "Neither Valid nor Reliable", "Reliable but not Valid", "Fairly Valid but not very Reliable", and "Valid & Reliable"]
Attitude Measurement
• In marketing and other areas of business,
sometimes, it is essential to measure attitudes of
respondents towards certain tangible or
intangible goods.
• It is the process of converting qualitative facts
into quantitative figures.
• Although the conversion of qualitative facts into
quantitative figures may not be accurate,
sometimes it is essential to convert the facts into
figures.
Components of attitudes
Attitude consists of three principal components.
Affective
This dimension measures the feelings or emotions of
people towards certain events or objects. In fact, it
measures the respondents’ like or dislike for a particular
object.
Cognitive
This dimension is related to knowledge and beliefs: how
the respondents perceive and understand the object.
Behavioral
This dimension measures the tendency to action,
intention and behavioral expectation.
Techniques of measuring Attitudes
Although a large number of techniques are available for
measuring attitude, there are four principal techniques.
A. Choice
Respondents are given a large number of alternatives and
asked to select their favorite. The selected
alternative represents the preferred alternative.
B. Ranking
Many alternatives are provided to the respondents, who are
requested to rank them based on their priority. This
helps to identify the preferred alternatives in order.
C. Rating
Respondents are requested to rate the magnitude of the
quality or characteristics of an event or product.
D. Sorting
Many alternatives are provided to the
respondents and they are asked to arrange them
in order based on their priority. This
arrangement reveals the attitude of the
respondents.
Types of rating scales
Six main types of rating scales
1. Category scale
2. Semantic differential scale
3. Stapel scale
4. Likert scale (Summated ratings scale)
5. Constant sum scale
6. Graphic scale
I. Category Scale
• A rating scale in which the response options provided for a
closed-ended question are labeled with specific verbal
descriptions.
Example
Please rate car model A on each of the following dimensions:
                      Poor  Fair  Good  V. good  Excellent
a) Durability         [ ]   [ ]   [ ]   [ ]      [ ]
b) Fuel consumption   [ ]   [ ]   [ ]   [ ]      [ ]

Characteristics
• Response options are still verbal descriptions.
• Response categories are usually ordered according to a particular
descriptive or evaluative dimension.
• Therefore scale has ordinal properties.
• However, researchers often assume that it possesses interval properties
=> but this is only an assumption.
** One special version is the Simple category scale.
II. Likert Scale (summated rating scale)
• Respondents rate a series of statements according to
how much they agree.
• Scores for each statement are summed to give an overall
attitude score.
• It is the most widely used approach to scaling responses
in survey research.
• When responding to a Likert questionnaire item
respondents specify their level of agreement or
disagreement on a symmetric agree-disagree scale for a
series of statements.
The format of a typical five-level Likert item
1. Strongly disagree
2. Disagree
3. Neither agree nor disagree
4. Agree
5. Strongly agree
Characteristics of the Likert Scale
The following procedure is used to analyze data from Likert scales
1. First, weights are assigned to the response options, e.g.
Totally agree=1, Agree=2, etc.
2. Then negatively-worded statements are reverse-coded (or
reverse scored). E.g. a score of 2 for a negatively-worded
statement with 5-point response options is equivalent to a
score of 4 on an equivalent positive statement.
3. Next, scores are summed across statements to arrive at a total
(or summated) score.
4. Each respondent’s score can then be compared with the mean
score or the scores of other respondents to determine his or her level
of attitude, loyalty, or other construct that is being measured.
• Note that the response for each individual statement is
expressed on a category scale.
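The scoring procedure above can be sketched in a few lines of Python. The item names, the answers, and the helper names `reverse_code` and `summated_score` are hypothetical; the reverse-coding rule (points + 1 − score) reproduces the example in step 2, where a 2 becomes a 4 on a 5-point item.

```python
def reverse_code(score, points=5):
    """Reverse-score an item: on a 5-point scale 1<->5, 2<->4, 3 stays 3."""
    return points + 1 - score

def summated_score(responses, negative_items, points=5):
    """Sum item scores after reverse-coding the negatively worded items."""
    total = 0
    for item, score in responses.items():
        if not 1 <= score <= points:
            raise ValueError(f"score for {item!r} is outside 1..{points}")
        if item in negative_items:
            score = reverse_code(score, points)
        total += score
    return total

# Hypothetical respondent: q2 is a negatively worded statement.
answers = {"q1": 4, "q2": 2, "q3": 5}
negatively_worded = {"q2"}

print(summated_score(answers, negatively_worded))  # 4 + 4 + 5 = 13
```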
III. Semantic Differential Scale
A rating scale in which bipolar options are placed at both
ends (or poles) of the scale, and response options are
expressed as “semantic” space.
Example
Please rate car model A on each of the following dimensions
Durable ---:-X-:---:---:---:---:--- Not durable
Low fuel consumption ---:---:---:---:---:-X-:--- High fuel consumption
Characteristics
1. The scale has properties of an interval scale.
2. Sometimes descriptive phrases are used instead of
bipolar adjectives, especially when it is difficult to get
adjectives that are exact opposites
3. It is often used to construct an image profile.
IV. Graphic Ratings Scales
• Rating scales in which respondents rate an object
on a graphic continuum, usually a straight line.
• Modified versions are the ladder scale and happy
face scale.

Characteristics
1. The straight line scale has ratio level properties.
2. The ladder and happy face scales have properties
depending on the labeling option chosen –
whether all response categories are labeled
(ordinal properties) or only the scale end-points
are labeled (interval properties).
V. Stapel Scale
A simplified version of the semantic differential scale in which a
single adjective or descriptive phrase is used instead of bipolar
adjectives.

Characteristics
1. The scale measures both the direction and intensity of the attribute simultaneously.
2. It has properties similar to the semantic differential.

Example

Model A
-3 -2 -1 Durable Car 1 2 3
-3 -2 -1 Good Fuel Consumption 1 2 3
VI. Numerical Scale
Any rating scale in which numbers rather than
semantic space or verbal descriptions are used as
response options.

Examples
Poor Excellent
Durability 1 2 3 4 5 6 7

Durable 1 2 3 4 5 6 7 Not durable


Sampling
Concept
• Sampling is a statistical method of obtaining representative data or
observations from a group (lot, batch, population or
universe).
• The main objective of sampling is to minimize cost and
time.
• In addition, it helps to obtain the best possible estimates
of the population parameters.
• It also helps to test the significance of a population
parameter on the basis of the sample statistic.
Parameter and Statistic
The characteristics of the population (universe) are
known as parameters.
Similarly, the characteristics of the sample are
known as statistics.
Basic terms used for parameter and statistics
Terms                     Population (Parameter)   Sample (Statistic)
Size                      N                        n
Mean                      µ                        x̄
Standard Deviation        σ                        s
Proportion                P                        p
Correlation Coefficient   ρ                        r
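To make the distinction concrete, the sketch below computes the parameters of a small artificial population and the corresponding statistics of one sample drawn from it. The population values, seed, and sizes are assumptions made for illustration; `statistics.pstdev` divides by N (population) while `statistics.stdev` divides by n − 1 (sample).

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 ages and one sample of 50 drawn from it.
population = [rng.randint(20, 60) for _ in range(1000)]
sample = rng.sample(population, 50)

N, n = len(population), len(sample)

mu = statistics.mean(population)        # parameter: population mean
sigma = statistics.pstdev(population)   # parameter: population standard deviation
x_bar = statistics.mean(sample)         # statistic: sample mean
s = statistics.stdev(sample)            # statistic: sample standard deviation

print(f"N={N}, mu={mu:.2f}, sigma={sigma:.2f}")
print(f"n={n}, x_bar={x_bar:.2f}, s={s:.2f}")
```

The sample values x̄ and s serve as estimates of µ and σ; a different sample would give slightly different estimates.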
Sampling Methods
I. Probability (Random) Sampling
With probability sampling, every element of the
population has a known probability of being included
in the sample.
In its simplest form (simple random sampling), the
chance of selection is equal for each element.
II. Non-probability (Non-random) Sampling
With non-probability sampling, we cannot specify the
probability that each element will be included in the
sample.
The chance of selection is unknown and generally not
equal for each element in this sampling.
Types of Probability Sampling
• Simple Random Sampling

• Stratified Random Sampling

• Systematic Sampling

• Cluster Sampling

• Multistage Sampling
I. Simple Random Sampling
• It is the technique of drawing a sample in such a way
that each unit of the population has an equal &
independent chance of being selected in the sample.
• SRS is the simplest form of sampling and is the basis for
many other sampling methods.
• It is most applicable for the initial survey in an
investigation and for studies that involve sampling from
a small area where the sample size is relatively small.
• SRS can be of two types
– SRS with replacement
– SRS without replacement
Applications of SRSWOR: SRS without replacement can be
carried out in two ways
A. Lottery method
This is the simplest traditional method of random
sampling. It can be done as follows:
1. List out the sampling frame
2. Number the sampling frame consecutively
3. Decide the sample size , for example
4. Make a slip for each element
5. Mix up the slips
6. Pick the slips randomly
B. Random Number Methods
We can get random numbers from the random number table in a
standard statistics textbook, or generate them
using computer application software such as Microsoft
Excel or SPSS.
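As an alternative to Excel or SPSS, the same draw can be sketched with Python's standard `random` module. The frame of 20 numbered units, the sample size, and the seed are assumptions for illustration.

```python
import random

rng = random.Random(7)  # fixed seed so the draw can be repeated

# Hypothetical sampling frame: units numbered consecutively from 1 to 20.
frame = list(range(1, 21))
sample_size = 5

# SRS without replacement: a unit can be selected at most once.
srswor = rng.sample(frame, sample_size)

# SRS with replacement: the same unit may be selected more than once.
srswr = rng.choices(frame, k=sample_size)

print("without replacement:", srswor)
print("with replacement:   ", srswr)
```

`rng.sample` plays the role of drawing slips without putting them back, while `rng.choices` corresponds to returning each slip before the next draw.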
Advantages of SRS
Scientific method with no room for bias
Simple and Easy
Disadvantages of SRS
For a scattered sample, it is costly and time consuming
A frame of the population is needed
For good precision, it requires a large sample size
II. Stratified Random Sampling
• In this sampling, the total population is divided into sub-
populations, called strata, of the same or different sizes in
such a way that units within a stratum are
homogeneous but units in different strata are
heterogeneous.
• Samples are then taken from each stratum by SRS or
any other method, according to the optimum or
proportional allocation method.
• For example, if the volume of sale of a product is
different (heterogeneous) in different geographical
regions of our country, then we use stratified
random sampling.
There are two ways of selecting the required number of
samples under this method.
Proportional Allocation
When information regarding the relative variances within strata
and cost of operations are not available, the allocation in the
different strata may be made in proportion to the number of
units in them or the total area of each stratum.
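Neyman’s Allocation
Under this allocation, the total sample size (and hence the cost) is
fixed, and the sample is distributed among the strata in proportion
to the product of stratum size and within-stratum standard deviation.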
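The two allocation rules can be compared with a short sketch. The stratum sizes and standard deviations are invented, and the Neyman formula used here (allocation proportional to stratum size times stratum standard deviation) is the standard textbook definition rather than something spelled out on the slide.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate n sample units in proportion to each stratum's size N_h."""
    total = sum(stratum_sizes.values())
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

def neyman_allocation(stratum_sizes, stratum_sds, n):
    """Allocate n sample units in proportion to N_h * S_h for each stratum."""
    weights = {name: stratum_sizes[name] * stratum_sds[name] for name in stratum_sizes}
    total = sum(weights.values())
    return {name: round(n * weight / total) for name, weight in weights.items()}

# Hypothetical strata: number of units and standard deviation of sales per region.
sizes = {"east": 500, "central": 300, "west": 200}
sds = {"east": 6, "central": 10, "west": 20}

print(proportional_allocation(sizes, 100))  # {'east': 50, 'central': 30, 'west': 20}
print(neyman_allocation(sizes, sds, 100))   # {'east': 30, 'central': 30, 'west': 40}
```

In this made-up case the smallest stratum gets the largest share under Neyman allocation because it is the most variable. With other inputs, rounding can make the allocations sum to slightly more or less than n, so a real design would need an adjustment step.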
Advantages
• More representative than SRS and systematic sampling (SYS)
• Greater accuracy (more efficient) than SRS
• Administrative convenience
Disadvantages
• It is almost impossible to form the homogenous
strata
• Need prior & additional information about
population & its subpopulation.
III. Systematic Sampling
• In systematic sampling, the elements of the
population are put into a list and then every kth
element in the list is chosen (systematically) for
inclusion in the sample.
• It is a commonly used technique if the complete &
up to date list of the sampling units is available.
• This consists in selecting only the 1st unit at random, the rest
being automatically selected according to some predetermined
pattern involving regular spacing of units.
• Let us suppose that N sampling units are serially numbered from
1 to N in some order and a sample of size n is to be drawn such that
N = nk; then k = N/n,
where k = sampling interval (an integer).
• For example, if there are 2,000 students at a high school and the
researcher wanted a sample of 100 students, the students would
be put into list form and then every 20th student would be
selected for inclusion in the sample.
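The school example can be reproduced with a short sketch: k = N/n = 2000/100 = 20, a random start among the first 20 students, then every 20th student after that. The function name and the seed are illustrative, and the sketch assumes N is an exact multiple of n, as in the text.

```python
import random

def systematic_sample(units, n, rng=random):
    """Pick a random start among the first k units, then take every kth unit."""
    N = len(units)
    if n <= 0 or N % n != 0:
        raise ValueError("this sketch assumes N is an exact multiple of n")
    k = N // n                 # sampling interval
    start = rng.randrange(k)   # index of the first selected unit (0 .. k-1)
    return units[start::k]

# 2,000 students numbered 1..2000; a sample of 100 gives k = 20.
students = list(range(1, 2001))
selected = systematic_sample(students, 100, random.Random(1))

print(len(selected), selected[:5])
```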
Advantages
• This method is simple, administratively easier, cheaper and quicker;
it is very easy to operate and checking can be done quickly.
• It is possible to select a sample in the field without a sampling
frame.

Disadvantages
• If the population is not in random order, one cannot validly
estimate parameter of the population.
• Not suitable for more heterogeneous data.
• Not suitable for infinite population
IV. Cluster Sampling
• The population is divided in non-overlapping groups
called clusters.
• One version of cluster sampling is area sampling or
geographical cluster sampling.
• The clusters are to be formed in such a way that the
variation within clusters should be high but between the
clusters should be low.
Cluster sampling should fulfill the following steps.
• The population is divided into N groups, called clusters.
• The researcher randomly selects n clusters to include in the sample.
• The number of observations within each cluster, M_i, is known, and
M = M_1 + M_2 + M_3 + ... + M_(N-1) + M_N.
• Finally there are two options:
One-stage cluster sampling in which all of the elements
within selected clusters are included in the sample.
Two-stage cluster sampling in which a subset of elements
within selected clusters are randomly selected for
inclusion in the sample.
It is often used in marketing research.
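The one-stage and two-stage options can be sketched with a single helper. The village data, the function name `cluster_sample`, and the seeds are hypothetical; clusters are chosen by simple random sampling, as described in the steps above.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, rng=random):
    """Randomly pick clusters, then take all members (one-stage)
    or a random subset of each picked cluster (two-stage)."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            sample.extend(members)  # one-stage: every element is included
        else:
            take = min(per_cluster, len(members))
            sample.extend(rng.sample(members, take))  # two-stage: subsample
    return chosen, sample

# Hypothetical population: 4 villages (clusters) with 5 households each.
villages = {
    f"village_{v}": [f"v{v}_household_{h}" for h in range(1, 6)]
    for v in range(1, 5)
}

chosen_one, one_stage = cluster_sample(villages, 2, rng=random.Random(3))
chosen_two, two_stage = cluster_sample(villages, 2, per_cluster=2, rng=random.Random(3))

print(chosen_one, len(one_stage))  # 2 clusters, 10 households
print(chosen_two, len(two_stage))  # 2 clusters, 4 households
```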
Advantages
It can be cheaper than other methods
Disadvantages
Higher sampling error, which can be expressed in the
so-called "design effect"
V. Multistage Sampling
• Multistage sampling refers to sampling plans where the
sampling is carried out in stages using smaller and smaller
sampling units at each stage.
• In a two-stage sampling design, a sample of primary units is
selected and then a sample of secondary units is selected
within each primary unit.
• Stratified random sampling and cluster sampling can be
viewed as special cases of two-stage sampling. A stratified
random sample is a census of the primary units (the strata)
followed by an SRS of the secondary units within each
primary unit.
• Similarly, a cluster sample is an SRS of the primary units
(the clusters) followed by a census of the secondary units
within each selected primary unit.
Types of non Probability Sampling
• Judgment (Purposive, deliberate or subjective) Sampling
In this method, the researcher exercises his or her judgment in
the choice and includes those items in the sample which
he or she thinks are most typical of the universe with regard
to the characteristics under investigation.
• Convenience Sampling
A convenience sample is obtained by selecting
convenient population units. In this sampling, the
fraction of the population being investigated is
selected neither by probability nor by judgment but
by convenience.
A sample obtained from readily available lists such as telephone
directories is a convenience sample.
• Quota Sampling
In a quota sample, first, quotas are set up according to some
specified characteristics such as income, age, political, or
religious group etc.
In the second stage, required number of samples are collected
within each quota.
For example: selection of candidates according to ethnicity.
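The two stages of quota sampling can be sketched as follows: quotas are fixed first, and respondents are then accepted in the order they are encountered until each quota is full. The names, age groups, and quota sizes are invented for illustration.

```python
def quota_sample(respondents, quotas, group_of):
    """Accept respondents in arrival order until every group's quota is filled."""
    remaining = dict(quotas)  # copy so the caller's quotas are not modified
    selected = []
    for person in respondents:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
    return selected

# Hypothetical respondents as (name, age group), in the order they are met.
walk_ins = [
    ("Asha", "under_30"), ("Bikash", "30_plus"), ("Chandra", "under_30"),
    ("Dev", "under_30"), ("Esha", "30_plus"), ("Farid", "30_plus"),
]
quotas = {"under_30": 2, "30_plus": 2}

picked = quota_sample(walk_ins, quotas, group_of=lambda person: person[1])
print([name for name, _ in picked])  # ['Asha', 'Bikash', 'Chandra', 'Esha']
```

Selection within each quota depends on who happens to be encountered first, which is why this is a non-probability method.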
• Snowball sampling (chain sampling)
This type of sampling relies on previously identified members of
a group to identify other members of the population.
As newly identified members name others, the sample
snowballs. This technique is useful when a population listing is
unavailable.
It is beneficial for obtaining secret (sensitive) information.
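The way a snowball sample grows can be sketched as a walk over a referral list: each sampled member names others, who are added to the queue until the target size is reached. The referral data and the function name `snowball_sample` are hypothetical.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Start from known members and follow their referrals, first named first."""
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        for named in referrals.get(person, []):
            if named not in seen:  # do not recruit the same person twice
                seen.add(named)
                queue.append(named)
    return sampled

# Hypothetical referrals: who each member names when interviewed.
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["F"],
}

print(snowball_sample(referrals, ["A"], max_size=5))  # ['A', 'B', 'C', 'D', 'E']
```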
