Vaibhav Chawla Session 10


Course: Research for Marketing Decisions, Session 10

Measurement and Scaling: Multi-Item Scales

Instructor: Vaibhav Chawla
Email: vaibhavchawla@iitm.ac.in
Multi-Item Scales

2
Single-Item Scale or
Single Question Scale
Measuring Food Quality of Zaitoon Restaurant
Strongly Disagree (1)   Disagree (2)   Neutral (3)   Agree (4)   Strongly Agree (5)

Food in Zaitoon Restaurant is tasty: 1 2 3 4 5

Food Quality Score out of 5 = 2

3
Multi-Item Scale
Measuring Food Quality of Zaitoon Restaurant
Strongly Disagree (1)   Disagree (2)   Neutral (3)   Agree (4)   Strongly Agree (5)

Food in Zaitoon Restaurant is tasty: 1 2 3 4 5
Food in Zaitoon Restaurant has good smell: 1 2 3 4 5
Food in Zaitoon Restaurant has attractive presentation: 1 2 3 4 5
Food in Zaitoon Restaurant is healthy: 1 2 3 4 5

Food Quality Score out of 5 = (2 + 3 + 2 + 3) / 4 = 2.5
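The composite score above is simply the mean of the item ratings. A minimal sketch of that computation, using the hypothetical respondent's ratings from the slide:

```python
# Hypothetical 5-point ratings for one respondent (from the slide).
ratings = {
    "is tasty": 2,
    "has good smell": 3,
    "has attractive presentation": 2,
    "is healthy": 3,
}

# The composite food-quality score is the mean of the item ratings.
food_quality = sum(ratings.values()) / len(ratings)
print(food_quality)  # 2.5
```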

4
Single-Item Scale or
Single Question Scale
Measuring Food Quality of Zaitoon Restaurant
Strongly Disagree (1)   Disagree (2)   Neutral (3)   Agree (4)   Strongly Agree (5)

The food in Zaitoon Restaurant has good quality: 1 2 3 4 5

Food Quality Score out of 5 = 2

5
Single-Item Scale or
Single Question Scale
Measuring Food Quality of Zaitoon Restaurant

WHICH FOOD QUALITY MEASURE WAS BEST, AND WHY?

6
Single-Item Scale or
Single Question Scale
Measuring Food Quality of Zaitoon Restaurant
MULTI-ITEM SCALE WAS BEST.
Reasons:
(1) It measured the full domain of “food quality”, whereas the first single-item scale measured only the “taste” aspect.
(2) It captured information about each and every aspect of “food quality”, whereas the third single-item scale measured “food quality” directly, without information about its individual aspects.
7
Difference between Single-Item
and Multi-Item Scales
Single-item scales measure a construct with only one item or question.

A multi-item scale has several questions targeting the same underlying construct, and the final composite score is based on all of the questions.

8
Why Use Multi-Item Scales?
Multi-item scales can be superior to a single-item, straightforward question:
- With a single question, people are less likely to give consistent answers over time.
- Many measured social characteristics are broad in scope and simply cannot be assessed with a single question.

9
Why Use Multi-Item Scales?
An Example of a Scale Measuring Introversion:
- I blush easily. (Strongly Agree ..................... Strongly Disagree)
- At parties, I tend to be a wallflower. (Strongly Agree ..................... Strongly Disagree)
- Staying home every night is all right with me. (Strongly Agree ..................... Strongly Disagree)
- I prefer small gatherings to large gatherings. (Strongly Agree ..................... Strongly Disagree)
- When the phone rings, I usually let it ring at least a couple of times. (Strongly Agree ..................... Strongly Disagree)

10
Why Use Multi-Item Scales?
An Example of a Scale Measuring Job Satisfaction:
- I am not satisfied with my work. (Strongly Agree ..................... Strongly Disagree)
- I am not satisfied with my supervisor. (Strongly Agree ..................... Strongly Disagree)
- I am not satisfied with my salary. (Strongly Agree ..................... Strongly Disagree)
- I am not satisfied with my coworkers. (Strongly Agree ..................... Strongly Disagree)
- I am not satisfied with the work content. (Strongly Agree ..................... Strongly Disagree)

11
Multi-Item Scale: Example

Mental Fitness
- How good are you at Maths? 0 ---------------------------------- 100
- How good are you at sports like Chess? 0 ---------------------------------- 100
- How good are you at handling stress? 0 ---------------------------------- 100

14
Multi-Item Scale: Example
• In multi-item scales, the concept we measure is not directly observable.

• We can see colour, height, and thickness (yes), but can we see mental fitness? (No.)

• Generally, the concepts we measure using multi-item scales are non-observable, or latent, concepts.
15
Development of a Multi-Item Scale
Fig. 9.4
Develop Theory

Generate Initial Pool of Items: Theory, Secondary Data, and


Qualitative Research

Select a Reduced Set of Items Based on Qualitative Judgment

Collect Data from a Large Pretest Sample

Statistical Analysis

Develop Purified Scale

Collect More Data from a Different Sample

Evaluate Scale Reliability, Validity, and Generalizability

Final Scale

16
Multi-Item Scales Example:
Attitude Measurement

17
What is an Attitude?

18
Attitude

An enduring system of positive or
negative evaluations, emotional
feelings, and pro and con action
tendencies with respect to a
social object
19
Attitudes
as Hypothetical Constructs

A variable that is not directly
observable but is measurable by
indirect means, such as verbal
expression or overt behavior
20
Attitude towards Cigarette
Smoking
1. Cigarette smoking is injurious to
health. (Cognitive)
2. Cigarette smoking is a risk.
(Cognitive)
3. I hate cigarette smoking. (Affective)
4. I do not intend to smoke cigarettes
throughout my life. (Behavioral)
21
Attitudes Behaviors

22
Affective
• Feelings or emotions
toward an object

23
Cognitive
Knowledge and
beliefs

24
Behavioral
• Predisposition to action
• Intentions
• Behavioral expectations

25
When Designing Attitude
Measures, Theory is Important

Example: Laziness

29
Example: Laziness (as a behavior)
is defined as delaying activities.
1. I get up late in the morning
2. I always reach my office late
3. Most often, I complete my
work long after the deadline
4. Being inactive is what I enjoy
30
Example: Laziness (as an attitude) is
defined as the evaluations and the
feelings towards delaying activities
(the conceptual definition). The items
below form the operational definition:
1. Getting up late is acceptable
(Cognitive)
2. Missing deadlines is okay
(Cognitive)
3. I think being inactive is an
individual’s choice (Cognitive)
4. I like doing nothing (Affective)
31
When Designing Attitude
Measures, Theory is Important

Example: Salesperson’s
Customer Orientation

32
Example: Salesperson’s
Customer Orientation (as a
behavior)
1. I help my customers select
the best product
2. I address the queries of my
customers in a polite manner
3. I try to understand the needs
of my customers

33
Example: Salesperson’s Customer
Orientation (as an Attitude)
1. A salesperson’s job is to help the
customer select the best product
(Cognitive)
2. Understanding customer needs
is exciting (Affective)
3. I like to help my customers
(Affective)
34
Concept

• Generalized idea about a


class of objects,
attributes, occurrences,
or processes

35
Operational Definition
• Specifies what
researchers must
do to measure the
concept under
investigation
36
Media Skepticism:
Conceptual Definition
Degree to which people are skeptical about the
reality presented by mass media. Media
skepticism varies across people, from
– those who are mildly skeptical and accept
most of what they see and hear in mass
media, to
– those who completely discount and disbelieve
the facts, values, and portrayal of reality in
mass media.

37
Media Skepticism:
Operational Definition
Please tell me how true each statement is about
the media. Is it very true, not very true, or not at
all true?
– The program was not very accurate in its
portrayal of the problem.
– Most of the story was staged for
entertainment purposes.
– The presentation was slanted and unfair.

38
Constitutive (Conceptual) vs.
Measurement (Operational) Definition

39
Developing Sound Attitude
Measures
1. Specify conceptual/constitutive
definition
2. Specify operational/measurement
definition
3. Perform item analysis
4. Perform reliability checks
5. Perform validity checks
40
Attitude Measurement Process

41
Attitude Measuring Process
Ranking: Rank order preference
Rating: Estimates magnitude of a
characteristic
Sorting: Arrange or classify concepts
Choice: Selection of preferred
alternative

42
Ranking Tasks

Ranking tasks require respondents to rank
a small number of objects on the basis of
overall preference or some characteristic
of the stimulus
43
Rating Tasks

Rating tasks ask respondents to estimate the


magnitude of a characteristic, or quality, that
an object possesses. Respondents’ position
on a scale is where they would rate that
object.
44
Example: Attitude Scale Using Rating
Attitude towards the product (Affective)

1. I love my bike
2. My bike is one of my favorite
possessions
3. My bike is fun to use

45
Example: Attitude Scale Using Rating
Attitude towards the Ad (Cognitive)

The ad …
1. Was believable
2. Was interesting
3. Was informative
4. Was well-designed
5. Was easy to follow
6. Was attention-getting
7. Was clear

46
Sorting Tasks

Sorting tasks present several


concepts —represented either on
typed cards or a computer display—
and require respondents to arrange
the concepts into a number of piles or
groupings.

47
Choice Tasks

Choice between two or more


alternatives is a type of attitude
measurement that assumes the
chosen object is preferred over the
other object(s)

48
Scale Evaluation
Fig. 9.5

Reliability: Test/Retest, Alternative Forms, Internal Consistency
Validity: Content, Criterion, Construct (Convergent, Discriminant, Nomological)
Generalizability

49
Scale Evaluation

Measurement Reliability
and Validity

50
Scale Evaluation

51
Measurement Accuracy

The true score model provides a framework for understanding the accuracy of measurement:

XO = XT + XS + XR

where

XO = the observed score or measurement
XT = the true score of the characteristic
XS = systematic error
XR = random error
52
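The true score model can be illustrated with a small simulation (made-up numbers): random error averages out over many measurements, while systematic error biases every observation in the same direction.

```python
import random

random.seed(0)

# True score model: X_O = X_T + X_S + X_R (hypothetical values).
def observe(true_score, systematic_error=0.5):
    random_error = random.gauss(0, 1)  # X_R: zero-mean noise
    return true_score + systematic_error + random_error

scores = [observe(10.0) for _ in range(10_000)]
mean_observed = sum(scores) / len(scores)

# Random error cancels out on average; systematic error does not:
# mean_observed is close to 10.5 (= X_T + X_S), not to the true score 10.
```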
Reliability
• Degree to which
measures are free from
random error and
therefore yield consistent
results
53
Validity

• Ability of a scale to
measure what was
intended to be measured

54
Rulers are Reliable and Valid

56
Potential Sources of Error in
Measurement

1) Other relatively stable characteristics of the individual that influence


the test score, such as intelligence, social desirability, and education.
2) Short-term or transient personal factors, such as health, emotions,
and fatigue.
3) Situational factors, such as the presence of other people, noise, and
distractions.
4) Sampling of items included in the scale: addition, deletion, or changes
in the scale items.
5) Lack of clarity of the scale, including the instructions or the items
themselves.
6) Mechanical factors, such as poor printing, overcrowding items in the
questionnaire, and poor design.
7) Administration of the scale, such as differences among interviewers.
8) Analysis factors, such as differences in scoring and statistical
analysis.

57
Approaches to Reliability
Assessment
• Test-retest
– identical scale items are administered at two
different times to the same set of respondents
– assess (via correlation) whether respondents
give similar answers

58
Approaches to Reliability
Assessment
• Alternative forms
– two equivalent forms of the scale are
constructed
– same respondents are measured at two different
times, with a different form being used each time
– assess (via correlation) if respondents give
similar answers
– Note: this approach is hardly ever practical

59
Approaches to Reliability Assessment

• Internal consistency reliability determines the extent to


which different parts of a summated scale are consistent
in what they indicate about the characteristic being
measured.
• In split-half reliability, the items on the scale are divided
into two halves and the resulting half scores are
correlated.
• The coefficient alpha, or Cronbach's alpha, is the
average of all possible split-half coefficients resulting from
different ways of splitting the scale items. This coefficient
varies from 0 to 1, and a value of 0.6 or less generally
indicates unsatisfactory internal consistency reliability.
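Coefficient alpha can be computed directly from the item variances and the variance of the summed scores. A minimal sketch with hypothetical ratings (the formula is the standard one; the data are made up):

```python
from statistics import pvariance

# Hypothetical ratings: rows = respondents, columns = items of one scale.
items = [
    [4, 4, 5, 4],
    [2, 3, 2, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

k = len(items[0])                                    # number of items
item_vars = [pvariance(col) for col in zip(*items)]  # per-item variances
total_var = pvariance([sum(row) for row in items])   # variance of summed scores

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
# alpha here is well above the 0.6 rule of thumb given on the slide
```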

60
Approaches to Validity Assessment
• The validity of a scale may be defined as the extent to which
differences in observed scale scores reflect true differences
among objects on the characteristic being measured, rather
than systematic or random error. Perfect validity requires
that there be no measurement error (XO = XT, XR = 0, XS = 0).
• Content validity is a subjective but systematic evaluation of
how well the content of a scale represents the measurement
task at hand.
• Criterion validity reflects whether a scale performs as
expected in relation to other variables selected (criterion
variables) as meaningful criteria.

61
Approaches to Validity Assessment
Construct validity is evidenced if we can establish –
convergent validity, discriminant validity and nomological
validity

Convergent validity is the extent to which the scale
correlates positively with other measures of the same
construct.

Discriminant validity is the extent to which the scale does
not correlate with other, conceptually distinct constructs.

Nomological validity is the extent to which the scale
correlates in theoretically predicted ways with measures of
distinct but related constructs.
62
Relationship Between Reliability and
Validity

• If a measure is perfectly valid, it is also perfectly reliable.


In this case XO = XT, XR = 0, and XS = 0.
• If a measure is unreliable, it cannot be perfectly valid, since
at a minimum XO = XT + XR. Furthermore, systematic error
may also be present, i.e., XS ≠ 0. Thus, unreliability implies
invalidity.
• If a measure is perfectly reliable, it may or may not be
perfectly valid, because systematic error may still be
present (XO = XT + XS).
• Reliability is a necessary, but not sufficient, condition for
validity.
63
Sensitivity
Measurement instrument’s
ability to accurately
measure variability in
stimuli or responses

64
References
• Prof. N. K. Malhotra’s Textbook and Slides
• Dr. Michael Hyman’s Slides

65
