Measurement and Scaling Concepts


MEASUREMENT AND SCALING CONCEPTS
Presented by
Group 7
Ajai Govind G (191065)
Ambika Gupta (191067)
Ankit Jain (191074)
Bhanupriya Deswal (191081)
Chandrika Mittal (191082)
Kulvir Singh Gill (191092)
Measurement & Scaling Techniques
Measurement is the process of assigning numbers to objects or observations.

Scaling is a procedure for assigning numbers to a property of objects in order to impart some of the characteristics of numbers to the property in question.
 Concept
A generalized idea about a class of objects,
attributes, occurrences, or processes. (For
example- age, sex, brand loyalty etc.)

 Operational definition
A definition that gives meaning to a concept by
specifying the activities or operations necessary
in order to measure it
Rules of Measurement

A rule is an instruction that guides the assignment of a number or other measurement designation.

Scale
Any series of items that are arranged progressively
according to value or magnitude, into which an item
can be placed according to its quantification
Measurement Scales
 Nominal Scale
Uses numbers or letters to identify different objects.
Eg: A scale to measure employment status
1) Public sector
2) Private Sector
3) Self employed
4) Unemployed
5) Others
A nominal scale does not indicate any order or relationship between the categories.
Nominal Scale

 Measure of central tendency – Mode


 Statistical Test – Chi-square test
 Least Powerful scale
Eg: Assignment of numbers to basketball players to
identify them.
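
As a rough illustration of the chi-square test mentioned for nominal data, the sketch below computes a goodness-of-fit statistic for the five employment categories. The observed counts are invented for illustration, and the equal expected frequencies are an assumption rather than anything stated in the slides.

```python
# Hypothetical observed counts per employment category (illustrative data only).
observed = {
    "Public sector": 30,
    "Private sector": 40,
    "Self employed": 15,
    "Unemployed": 10,
    "Others": 5,
}


def chi_square_statistic(observed_counts, expected_counts):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected_counts))


total = sum(observed.values())
# Assumption: under the null hypothesis every category is equally likely.
expected = [total / len(observed)] * len(observed)

statistic = chi_square_statistic(observed.values(), expected)
degrees_of_freedom = len(observed) - 1

print(statistic, degrees_of_freedom)
```

The statistic would then be compared with a chi-square critical value for the given degrees of freedom.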
 Ordinal Scale

 Places events in a particular order


 Variables in an ordinal scale can be ranked
 It only gives relative position of the variables
 Implies greater than or less than
 Measure of central tendency is median
 Statistical test – Non-parametric methods
Eg: Question: Please rank the following mobile telephone
service providers from 1 to 5 with 1 representing the
most preferred & 5 representing the least preferred
Airtel _____
Hutch _____
Idea _____
BSNL _____
Reliance _____
 Interval scale

The intervals between adjacent points on the scale are equal.
Eg: Interval scale with points placed at an interval of 1 point
[10]---[9]---[8]---[7]---[6]---[5]---[4]---[3]---[2]---[1]
More powerful than ordinal scale.
Measure of central tendency – Mean; measure of dispersion – Standard deviation
Statistical tests – t-test, F-test
 Ratio Scale

Has an absolute zero point of measurement and also has equal intervals.
Allows ratios of values measured on the scale to be compared.
Represents actual amounts of variables.
This is the most precise type of scale.
Can be subjected to any type of mathematical operation.
Eg: Age, weight, money and height are common ratio scales.
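
The pairing of scale level and measure of central tendency described above (mode for nominal, median for ordinal, mean for interval and ratio) can be sketched as follows. The helper name `central_tendency` and all sample data are hypothetical and serve only as an illustration.

```python
from statistics import mean, median, mode


def central_tendency(values, level):
    """Return the measure of central tendency listed for each scale level."""
    if level == "nominal":
        return mode(values)      # categories can only be counted
    if level == "ordinal":
        return median(values)    # ranks have order but not equal spacing
    if level in ("interval", "ratio"):
        return mean(values)      # equal intervals make averaging meaningful
    raise ValueError(f"unknown scale level: {level}")


# Illustrative data only.
employment = ["Private sector", "Public sector", "Private sector", "Self employed"]
preference_ranks = [1, 3, 2, 5, 4]
ages = [21, 25, 30, 24]

print(central_tendency(employment, "nominal"))
print(central_tendency(preference_ranks, "ordinal"))
print(central_tendency(ages, "ratio"))
```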
Scale construction decisions

 What level of data is involved (nominal, ordinal, interval, or ratio)?
 What will the results be used for?
 What types of statistical analysis would be useful?
 Should you use a comparative scale or a noncomparative
scale?
 How many scale divisions or categories should be used (1 to
10; 1 to 7; -3 to +3)?
 Should there be an odd or even number of divisions? (Odd
gives neutral center value; even forces respondents to take a
non-neutral position.)
 What should the nature and descriptiveness of the scale labels
be?
 What should the physical form or layout of the scale be?
(graphic, simple linear, vertical, horizontal)
 Should a response be forced or be left optional?
Sources of Measurement Problems/Errors
1. Respondent Associated errors
Non-response errors:
Complete failure to respond (unit non-response)
Failure to respond to one or more questions (item non-response)
Reasons for non-response
Lack of knowledge
Doesn’t want to answer
Response bias
When respondents consciously or unconsciously
misrepresent the truth.
2. Instrument associated errors
Due to poor questionnaire design or improper selection of samples.
Inadequate space for registering the answers in the questionnaire
Ambiguous questions – confusion for respondents
Complicated words & sentences – misinterpretation
3. Situational Errors
Respondents may not answer properly if a third person is present during the interview
Interviews held in public places may lead to a lack of response
No assurance of data confidentiality
4. Measurer as a source
Body language & gestures of the interviewer discourage
certain responses.
Failing to record the full response of the respondent
Inappropriate coding & tabulations
Irrelevant statistical tools
Bases for classification of scales

 Subject orientation
A scale may be designed to measure the characteristics of the respondents, or to judge the stimulus object presented to the respondent.
In the latter case, the respondent is asked to judge some specific objects in terms of one or more dimensions.
 Response form

Categorical – Rating (without reference to other objects)
Comparative – Ranking (compares with other objects)

 Degree of subjectivity
Subjective personal preference – choose which person
he favours, which solution he likes.
Non-preference judgments – judge which solution will
take fewer resources
 Scale Properties
Based on the scale the researcher chooses (nominal, ordinal, etc.)
 Number of Dimensions
Unidimensional scales – measure only one attribute
Multidimensional scales – measure more than one
attribute.
Scale construction Approaches

 Arbitrary Approach
Scale is developed on ad hoc basis.
Most widely used approach.
 Consensus approach (Thurstone Scale)
A panel of judges evaluates the items chosen.
 Item analysis approach (Likert Scale)
Individual items are tested by a group of respondents.
Total scores are calculated.
Items are analysed on the basis of their degree of discrimination.

 Cumulative scales (Guttman’s Scalogram)
Items conform to some ranking in ascending or descending order.
 Factor scales (Semantic Differential Scale)
Constructed on the basis of intercorrelations of items to identify the common factors.
Factor analysis is used.
Important Scaling Techniques

 Rating Scale
 Ranking Scale
Rating Scale

 Qualitative description of a limited number of aspects


 Judge in terms of specific criteria
 Like --- Dislike
 Above average, average, below average
 3 to 7 point scales are used
 The more points on the rating scale, the greater its sensitivity
A Classification of Noncomparative Rating Scales

Noncomparative Rating Scales
  Continuous Rating Scales
  Itemized Rating Scales
    Likert
    Semantic Differential
    Stapel
Non-comparative Rating Techniques

 Respondents evaluate only one object at a time, and for this reason noncomparative scales are often referred to as monadic scales.
 Noncomparative techniques consist of continuous
and itemized rating scales.
Rating Scale Types

 Graphical Rating / Continuous rating Scale


 Points are put in a continuum
 Indicate rating by tick mark

Like Very Much | Like Somewhat | Neutral | Dislike Somewhat | Dislike Very Much
Continuous/Graphic Rating Scale

Respondents rate the objects by placing a mark at the appropriate position on a line that runs from one extreme of the criterion variable to the other.
The form of the continuous scale may vary considerably.
How would you rate Bigbazaar as a department store?
Version 1
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
Version 2
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
0 10 20 30 40 50 60 70 80 90 100
Version 3
                   Very bad            Neither good nor bad            Very good
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
0 10 20 30 40 50 60 70 80 90 100
Rating Scale Types

 Itemized Rating
 Presents a series of statements
 Respondent selects the statement that fits best
He is always involved in some friction with his fellow workers
He is often at odds with one or more of his fellow workers
He sometimes gets involved in friction
He infrequently becomes involved in friction with others
He almost never gets involved in friction with his fellow workers
Itemized Rating Scales

 The respondents are provided with a scale that has a number or brief description associated with each category.
 The categories are ordered in terms of scale
position; and the respondents are required to select
the specified category that best describes the object
being rated.
 The commonly used itemized rating scales are the
Likert, semantic differential, and Stapel scales.
Likert Scale (Summated Scale)
 Evaluates each item on its ability to discriminate between those with a high score and those with a low score
 Respondent indicates degree of agreement or disagreement with the statements in the instrument
 Each response is given a numerical score indicating favourableness or unfavourableness, and the total score represents the attitude
Likert Scale
The Likert scale requires the respondents to indicate a degree of
agreement or disagreement with each of a series of statements about
the stimulus objects.
                                           Strongly   Disagree   Neither     Agree   Strongly
                                           disagree              agree nor           agree
                                                                 disagree
1. Sears sells high quality merchandise.      1          2X         3          4        5
2. Sears has poor in-store service.           1          2X         3          4        5
3. I like to shop at Sears.                   1          2          3X         4        5
 The analysis can be conducted on an item-by-item basis (profile
analysis), or a total (summated) score can be calculated.
 When arriving at a total score, the categories assigned to the negative
statements by the respondents should be scored by reversing the scale.
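
A minimal sketch of the summated score with reverse scoring, using the three Sears statements and the marked responses (2, 2 and 3) from the example. The function name `summated_score` and the dictionary keys are hypothetical labels chosen for this illustration.

```python
SCALE_MAX = 5  # 1 = strongly disagree ... 5 = strongly agree


def summated_score(responses, negative_items):
    """Add up item scores, reversing the scale for negatively worded items."""
    total = 0
    for item, answer in responses.items():
        if not 1 <= answer <= SCALE_MAX:
            raise ValueError(f"response out of range for {item!r}: {answer}")
        # Reversal maps 1<->5 and 2<->4, and leaves 3 unchanged.
        total += (SCALE_MAX + 1 - answer) if item in negative_items else answer
    return total


responses = {
    "Sears sells high quality merchandise": 2,
    "Sears has poor in-store service": 2,
    "I like to shop at Sears": 3,
}
negative_items = {"Sears has poor in-store service"}

score = summated_score(responses, negative_items)
print(score)  # 2 + (6 - 2) + 3 = 9
```

Without the reversal, disagreeing with a negative statement would wrongly lower the total instead of raising it.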
Likert Scale Construction

 Identify the attitudinal object and delimit it quite specifically.


 Compose a series of statements about the attitudinal object that
are half positive and half negative and are not extreme,
ambiguous, or neutral.
 Establish (at a minimum) content validity with the help of an expert panel.
 Pilot test the statements to establish reliability for each domain.
 Eliminate statements that negatively affect internal consistency.
 Construct the final scale by using the fewest number of items while still maintaining validity and reliability; create a balance of positive and negative items.
 Administer the scale and instruct respondents to indicate their
level of agreement with each statement.
 Sum each respondent’s item scores to determine attitude.
Likert Scale (Multi Item) - Example

1. Bigbazaar is an attractive store.
   Strongly Agree (1)   Agree (2)   Neither Agree Nor Disagree (3)   Disagree (4)   Strongly Disagree (5)

2. The service at Bigbazaar is slow.
   Strongly Agree (1)   Agree (2)   Neither Agree Nor Disagree (3)   Disagree (4)   Strongly Disagree (5)

3. Bigbazaar has attractive prices.
   Strongly Agree (1)   Agree (2)   Neither Agree Nor Disagree (3)   Disagree (4)   Strongly Disagree (5)
Likert Scale (Summated Scale)

 Advantages
 Easier to construct than the Thurstone scale
 Does not require a panel of judges
 More reliable as it considers each item statement and respondent
 Limitations
 Shows only differences in attitude and does not quantify them
Semantic Differential Scale

The semantic differential is a seven-point rating scale with end points associated with bipolar labels that have semantic meaning.
RELIANCE IS:
Powerful --:--:--:--:-X-:--:--: Weak
Unreliable --:--:--:--:--:-X-:--: Reliable
Modern --:--:--:--:--:--:-X-: Old-fashioned
The negative adjective or phrase sometimes appears at the left
side of the scale and sometimes at the right.
 This controls the tendency of some respondents, particularly
those with very positive or very negative attitudes, to mark
the right- or left-hand sides without reading the labels.
 Individual items on a semantic differential scale may be
scored on either a -3 to +3 or a 1 to 7 scale.
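
The two scoring conventions can be related with a small conversion, sketched below for the RELIANCE example (marks at positions 5, 6 and 7, counted from the left). Because the favourable adjective switches sides between items, each position is first oriented so that a higher value is more favourable. Treating "Powerful", "Reliable" and "Modern" as the favourable poles is an assumption made for illustration, as is the function name `to_bipolar`.

```python
POINTS = 7
MIDPOINT = (POINTS + 1) // 2  # 4 on a seven-point scale


def to_bipolar(position, favorable_on_left):
    """Convert a 1..7 position (counted from the left) to a -3..+3 score.

    Higher scores are more favourable regardless of which side the
    favourable adjective was printed on.
    """
    if not 1 <= position <= POINTS:
        raise ValueError(f"position must be between 1 and {POINTS}")
    oriented = POINTS + 1 - position if favorable_on_left else position
    return oriented - MIDPOINT


# (marked position, favourable adjective printed on the left?)
marks = {
    "Powerful / Weak": (5, True),
    "Unreliable / Reliable": (6, False),
    "Modern / Old-fashioned": (7, True),
}

scores = {item: to_bipolar(pos, left) for item, (pos, left) in marks.items()}
print(scores)
```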
Semantic Differential Procedure

 Identify the concept to be measured


 Generate a list of approximately 7 or 8 pairs of bipolar adjectives with a number of positions between each pair. (Subjects lose focus after 8)
 Administer the scale and instruct respondents to
identify where, on the continuum between the two
adjectives, their beliefs about the concept lie.
 The spaces or positions between the adjectives
become categories with a numerical value (e.g.
1=unfavorable and 6=favorable) and responses are
summed to determine attitude.
Semantic Differential Scale - Example

Service is discourteous 1…2…3…4…5…6…7 Service is courteous
Location is convenient 1…2…3…4…5…6…7 Location is inconvenient
Hours are inconvenient 1…2…3…4…5…6…7 Hours are convenient
Loan interest rates are high 1…2…3…4…5…6…7 Loan interest rates are low
Stapel Scale

The Stapel scale is a unipolar rating scale with ten categories numbered from -5 to +5, without a neutral point (zero). This scale is usually presented vertically.
Bigbazaar
     +5                 +5
     +4                 +4
     +3                 +3
     +2                 +2X
     +1                 +1
HIGH QUALITY       POOR SERVICE
     -1                 -1
     -2                 -2
     -3                 -3
     -4X                -4
     -5                 -5
The data obtained by using a Stapel scale can be analyzed in the
same way as semantic differential data.
Ranking Scale/Comparative scale

 Make comparative/relative judgments


 Approaches
 Method of paired comparison
 Method of rank order
Paired Comparisons

 Description - Paired comparison scales ask a respondent to pick one of two objects from a set based upon a given criterion
 Example - In each of the following pairs, which attribute is more important to you when selecting a toothpaste?
a. Fights decay b. Affordable
a. Affordable b. Longer germ protection
a. Longer germ protection b. Fights decay
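
One way to turn paired-comparison answers into a rank order is to count how often each attribute is chosen. The sketch below generates all n(n-1)/2 pairs for the three toothpaste attributes and tallies the choices; the respondent's choices are invented for illustration.

```python
from collections import Counter
from itertools import combinations

attributes = ["Fights decay", "Affordable", "Longer germ protection"]

# Every attribute is paired once with every other: n(n-1)/2 pairs.
pairs = list(combinations(attributes, 2))

# Hypothetical choices of one respondent (pair -> preferred attribute).
choices = {
    ("Fights decay", "Affordable"): "Fights decay",
    ("Fights decay", "Longer germ protection"): "Fights decay",
    ("Affordable", "Longer germ protection"): "Longer germ protection",
}

wins = Counter({attribute: 0 for attribute in attributes})
for pair in pairs:
    wins[choices[pair]] += 1

# Attributes chosen more often rank higher.
ranking = sorted(attributes, key=lambda attribute: wins[attribute], reverse=True)
print(len(pairs), ranking)
```

The number of pairs grows quickly with the number of objects (10 objects already need 45 comparisons), which is why long lists become tiring for respondents.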
Rank-Order Scale

 Description - Respondent is asked to judge one item against another.
 Example - Rank the following brands of cereal according
to your preference (1=most preferred).
__ Kellogg’s Corn Flakes
__ Rice Krispies
__ Wheaties
__ Kellogg’s Raisin Bran ...
Other Types
Constant Sum Scale

 This technique requires the respondent to divide a given number of points, typically 100, among two or more attributes based on their importance
 Constant sum scales are used more often than paired
comparisons because the long list of paired items is
avoided
Characteristics of a supermarket        Number of points
It is conveniently located              _____
Sales persons are cooperative           _____
The ambience is pleasing                _____
Parking facility is adequate            _____
Total                                   100 points
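
A constant sum response is only usable if the points add up to the required total. The sketch below checks that condition and converts the points into importance shares. The allocation shown and the function name `validate_constant_sum` are hypothetical.

```python
TOTAL_POINTS = 100


def validate_constant_sum(allocation, total=TOTAL_POINTS):
    """Check that the points are non-negative and add up to the total,
    then return each attribute's share of the total."""
    if any(points < 0 for points in allocation.values()):
        raise ValueError("points must not be negative")
    allocated = sum(allocation.values())
    if allocated != total:
        raise ValueError(f"points add up to {allocated}, expected {total}")
    return {attribute: points / total for attribute, points in allocation.items()}


# Hypothetical allocation by one respondent.
allocation = {
    "It is conveniently located": 40,
    "Sales persons are cooperative": 20,
    "The ambience is pleasing": 15,
    "Parking facility is adequate": 25,
}

shares = validate_constant_sum(allocation)
print(shares)
```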
Cumulative Scale / Guttman’s Scalogram Scale

 Here the respondent checks each item with which they agree
 The items are constructed so that they are cumulative: if you agree with one, you probably agree with all of the ones above it on the list
 Can be a good way to gauge how people feel about controversial
topics
 Requires care when writing so that it doesn’t seem leading
 Example :
 Please check each statement that you agree with:
 __ Willing to permit immigrants to live in the U.S.
 __ Willing to permit immigrants to live in your community.
 __ Willing to permit immigrants to live in your neighborhood.
 __ Willing to have an immigrant as a next door neighbor.
 __ Willing to let your child marry an immigrant.
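
The cumulative property can be checked mechanically: with the items listed from easiest to hardest to agree with, a consistent respondent agrees with an unbroken run of items from the top. The sketch below is an illustrative check of that pattern; the function names and response patterns are hypothetical.

```python
def is_cumulative(responses):
    """True if all agreements come before the first disagreement.

    `responses` lists agree (True) / disagree (False) answers in the order
    the items appear, from easiest to hardest to agree with.
    """
    seen_disagreement = False
    for agreed in responses:
        if agreed and seen_disagreement:
            return False  # agreed with a harder item after rejecting an easier one
        if not agreed:
            seen_disagreement = True
    return True


def scale_score(responses):
    """Number of items agreed with; meaningful when the pattern is cumulative."""
    return sum(responses)


consistent = [True, True, True, False, False]
inconsistent = [True, False, True, False, False]

print(is_cumulative(consistent), scale_score(consistent))
print(is_cumulative(inconsistent))
```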
Differential Scale (Thurstone Scale)

 Uses consensus approach


 Used to measure attitudes on a single dimension
 Used to measure attitudes towards issues like war, religion, etc.
Thurstone Scales

 A large number of items (80 to 100) are formed
 Items are given to a group of judges
 The panel assigns a value from 1 to 11 to each item, according to how favourable or unfavourable the judges consider it
 Items on which the judges reach consensus are selected; the other items are eliminated.
 Mean or median scores are calculated for each item
 Attitudes are compared on the basis of this median.
 It is a time-consuming method.
Thurstone Scales

 Example:
Please check the item that best describes your level
of willingness to try new tasks
 I seldom feel willing to take on new tasks (1.7)
 I will occasionally try new tasks (3.6)
 I look forward to new tasks (6.9)
 I am excited to try new tasks (9.8)
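
A minimal sketch of scoring with the scale values shown in the example. It assumes that a respondent who checks more than one statement is scored by the median of the values of the checked statements, in line with the median-based comparison described earlier. The function name `thurstone_score` is hypothetical.

```python
from statistics import median

# Scale values from the example above.
scale_values = {
    "I seldom feel willing to take on new tasks": 1.7,
    "I will occasionally try new tasks": 3.6,
    "I look forward to new tasks": 6.9,
    "I am excited to try new tasks": 9.8,
}


def thurstone_score(checked_statements):
    """Median scale value of the statements the respondent checked."""
    if not checked_statements:
        raise ValueError("at least one statement must be checked")
    return median(scale_values[statement] for statement in checked_statements)


single = thurstone_score(["I am excited to try new tasks"])
double = thurstone_score(
    ["I will occasionally try new tasks", "I look forward to new tasks"]
)
print(single, double)
```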
Balanced and Unbalanced Scales

Balanced Scale                  Unbalanced Scale

Surfing the Internet is         Surfing the Internet is
____ Extremely Good             ____ Extremely Good
____ Very Good                  ____ Very Good
____ Good                       ____ Good
____ Bad                        ____ Somewhat Good
____ Very Bad                   ____ Bad
____ Extremely Bad              ____ Very Bad
Criteria for good measurement

 Reliability
 Validity
 Sensitivity
 Relevance
 Versatility
 Ease of response
Scale Evaluation

Scale Evaluation
  Reliability
    Test-Retest
    Alternative Forms
    Internal Consistency
  Validity
    Content
    Criterion
    Construct
Criteria for good measurement
 Reliability
When the outcome of the measuring process is reproducible, the measuring instrument is reliable.
Eg: If a coffee vending machine dispenses the same quantity of coffee every time, then the machine's measurement is reliable.
Ability to obtain similar results by measuring an object, trait or construct with independent but comparable measures
Example: Do both CAT and MAT scores measure the candidate's performance?
Reliability can be defined as the degree to which the
measurements of a particular instrument are free from
errors and as a result produce consistent results.
 Evaluation of reliability

1. Test-retest reliability
If the result of a study is the same when it is conducted a second or third time, this confirms its repeatability.
Eg: If 40% of the population say that they do not watch movies, and the result is the same when the research is repeated after some time, then the measurement process is said to be reliable.

2. Split-half method
In this method, the researcher divides the scale items into two halves and then checks one half against the other half.
3. Internal consistency
When the data give the same results even after some manipulation.
Eg: After a result is obtained for a particular study, it can be split into two parts and the result of one part tested against the result of the other; if they are consistent, then the measure is reliable.
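
One common way to check one half of the items against the other is to correlate the two half-scores across respondents. The sketch below splits the items into odd-numbered and even-numbered halves and computes a Pearson correlation; both that choice of split and the response data are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def split_half_correlation(item_scores):
    """Correlate each respondent's odd-item total with their even-item total."""
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    return pearson(odd_half, even_half)


# Hypothetical data: 5 respondents answering 4 items on a 1-5 scale.
item_scores = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [1, 2, 1, 1],
]

r = split_half_correlation(item_scores)
print(round(r, 2))
```

A correlation close to 1 suggests the two halves are measuring the same thing consistently.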
Criteria for good measurement
 Validity

Ability of a scale or measuring instrument to measure what it is intended to measure can be termed as the validity of the measurement.
Eg: Measuring morale based on absenteeism alone calls the validity of the measure into question.

Test for validity


1. Face validity
Collective agreement of the experts and researchers on
the validity of the measurement scale.
Weakest form of validity
2. Criterion-related validity

It is the degree to which the measurement instrument agrees with an external criterion for the variable being measured.
If a new method is developed, one has to ensure that it correlates with other measures of the same construct.
Eg: The length of an object can be measured with a tape measure, calipers or a ruler; if a new technique is developed, one has to ensure that this new measure correlates with the other measures of length.
Types
Predictive Validity – The extent to which the future level
of a criterion variable can be predicted by the current
measurement on a scale.
Eg: A scale measuring the future occupancy of an
apartment.
 Concurrent validity
 It concerns the relationship between the predictor variable & criterion variable.
 Both the predictor variable & criterion variable are measured on
the same scale.
 A measure is used to predict something assessed at the same
point in time

3. Construct Validity
It refers to the degree to which the measurement instrument represents and logically connects with the underlying theory.
It assesses the underlying aspects relating to behaviour
It measures why a person behaved in a certain way rather
than how he has behaved.
Assessment of how well the instrument captures the
construct, concept, or trait it is supposed to be measuring
 Sensitivity
Sensitivity refers to an instrument’s ability to accurately
measure variability in stimuli or responses.
Sensitivity is not high in instruments offering only ‘Agree’ or ‘Disagree’.
It will be higher with categories such as ‘Strongly agree, mildly agree, mildly disagree, none of the above’.
 Generalizability
The amount of flexibility in interpreting the data in
different research designs.
 Relevance
Appropriateness of using a particular scale.
Examples Of Category (Itemized) Rating Scales

1. Balanced, forced-choice, odd-interval scale focusing on an attitude toward a specific attribute
(1) How do you like the taste of Classic Coke?
___ Like It Very Much   ___ Like It   ___ Neither Like Nor Dislike It   ___ Dislike It   ___ Strongly Dislike It
2. Balanced, forced-choice, even-interval scale focusing on an overall attitude
(2) Overall, how would you rate Ultra Brite Toothpaste?
___ Extremely Good   ___ Very Good   ___ Somewhat Good   ___ Somewhat Bad   ___ Very Bad   ___ Extremely Bad

3. Unbalanced, forced-choice, odd-interval scale focusing on an overall attitude
(3) What is your reaction to this advertisement?
___ Enthusiastic   ___ Very Favorable   ___ Favorable   ___ Neutral   ___ Unfavorable

4. Balanced, non-forced, odd-interval scale focusing on a specific attribute
(4) How would you rate the friendliness of the sales personnel at Sears’ downtown store?
__ Very Friendly   __ Moderately Friendly   __ Slightly Friendly   __ Neither Friendly Nor Unfriendly   __ Slightly Unfriendly   __ Moderately Unfriendly   __ Very Unfriendly   __ Don’t Know
