Research Methodology

Unit 6: Measurement and Scaling (3 Hrs.)

Concept of measurement, different levels of measurement, measurement of variables in Likert scales, the issue of
validity and reliability in research

Review of the Concepts of Measurement and Scales.

Measurement:

Measurement is the process of mapping aspects of one set of items onto aspects of another set according to some rule
of correspondence. It is the assignment of numerals to objects or events according to some rule. The rule is the
defining feature of this assignment of numerals to objects. In general, ‘a rule is a guide, a method or a command
that tells us what to do’.

The assignment of numerals to objects is, in mathematical terms, a function or rule of correspondence. Thus we can
say, ‘a rule is a function that assigns the objects of one set to the numerals of another set’. It is just like a
function y = f(x), where f is the assignment.

Scale:

The function or rule of correspondence is called a scale. A scale is simply a range of levels or numbers used for
measuring something. It is the set of all the levels, symbols or numbers, arranged from lowest to highest, that can
be assigned by rule to the objects, items, individuals or behaviors to which it is applied. In general usage, a scale
is also known as a quantifying device, and the term is used in the following two ways:

i. To indicate a measuring instrument, and

ii. To indicate the systematized numerals of the measuring instrument.

In the mapping diagram given above, the objects (a, b, c, d) that are mapped to the numerals 0 and 1 are the members
of the domain. The rule ‘f’ that maps them to the numerals is the measurement, and the set B, which contains the
numerals 0 and 1 of the range, is the scale.
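
The mapping idea can be sketched in a few lines of Python. The objects a, b, c, d and the numerals 0 and 1 come from the passage above; the property attached to each object is hypothetical and chosen only for illustration.

```python
# Domain: objects to be measured, each with a hypothetical observed property
# (True means the property is present). These values are illustrative only.
objects = {"a": True, "b": False, "c": True, "d": False}

# Scale (range): the set of numerals that may be assigned.
scale = {0, 1}

def f(has_property: bool) -> int:
    """Rule of correspondence: assign 1 if the property is present, else 0."""
    return 1 if has_property else 0

# Measurement: apply the rule to every object in the domain.
measurement = {name: f(value) for name, value in objects.items()}

print(measurement)  # {'a': 1, 'b': 0, 'c': 1, 'd': 0}
```

Every assigned numeral belongs to the scale, and changing the rule f changes the measurement without changing the objects themselves.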

In the measuring process, we set up rules in the range and then map the properties of objects from the domain onto
it. Thus measurement is the function or rule that assigns numerals or numeric symbols to objects or observations.
The rules used to assign numerals to objects define the levels of measurement, or scales of measurement.

● The lowest level (scale) of measurement is nominal measurement, which is used only to distinguish objects from
one another.
● The second level is ordinal measurement, which indicates rank order and nothing more.
● The third level is interval measurement, which also measures the size of the gap between values.
● The fourth level is ratio measurement, which can also express the relative level of the measure.

Indicants:

We usually say that we measure objects, but this is not strictly true. We measure the properties or characteristics
of objects. Even this is not quite true; what we actually measure are indicants of the properties of objects.
‘Something that points to something else’ is an indicant. Essentially, measurement is carried out on the indicants of
the properties of objects.

Constructs:

A construct is an idea or belief that is based on various pieces of evidence, which are not always true, and that is
put to the test.

Question 1: Define measurement and scale.

Types of Physical Scales:

We have already discussed in chapter two that data can be classified according to scales of measurement. Here we
review the types of scale. The scales of measurement of variables are broadly classified into four groups: nominal
scale, ordinal scale, interval scale and ratio scale.

Nominal Scale:

It is the simplest type of scale, also known as the categorical scale. It is simply a system of assigning numbers or
symbols to objects or events in order to label them or distinguish one from another. The symbols or numbers have no
numerical meaning, arithmetic operations cannot be applied to them, and their order has no mathematical meaning.

For example, labeling the 7 samples in a sensory test from 1 to 7 has no mathematical meaning. We cannot say that
5>3 or 4<7, and so on. Similarly, we cannot write 5-2 = 6-3 or 2+3+1 = 2*3 or 6/3 = 4/2. The main objective of this
scale is no more than to classify or distinguish samples/objects from one another by labeling them. The numbers
assigned to one set of objects are not comparable with those of another set. It makes no sense to compute the
arithmetic mean or standard deviation of such a set of observations.
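
A short Python sketch of what can be done with nominal data. Counting category frequencies and finding the most frequent category (the mode) are standard summaries for nominal data, although the notes do not state this explicitly. The sample labels and the letter relabeling are hypothetical.

```python
from collections import Counter

# Hypothetical nominal data: sample labels (1 to 7) picked by eight panelists.
labels = [3, 5, 3, 7, 1, 3, 5, 2]

# Frequencies and the mode are meaningful for nominal data.
counts = Counter(labels)
mode_label, mode_count = counts.most_common(1)[0]
print(mode_label, mode_count)  # 3 3

# Relabeling the categories with letters loses nothing: the counts simply
# move to the new names, which shows that the numerals were only labels.
relabel = {1: "A", 2: "B", 3: "C", 5: "E", 7: "G"}
relabelled_counts = Counter(relabel[x] for x in labels)
print(relabelled_counts["C"])  # 3
```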

Ordinal Scale:

The second level of measurement, and the lowest level of ordered scale, is the ordinal scale. It is the
quantification of items by ranking/ordering. In this scale, the numerals are arranged in some order, but the
intervals/gaps between the positions of the numerals are not necessarily equal. Rank orders represent an ordinal
scale and are mostly useful in scaling qualitative phenomena.

For example, the 5-point hedonic scores given in a sensory test of a product are 1, 3, 5, 7 and 9, in increasing
order of preference or satisfaction (for poor, satisfactory, good, very good and excellent respectively). We can say
that 9>5 and 5>1, but it is not true to say that 9-5 = 5-1.

For the ordinal/ranking scale, if a<b and b<c then a<c, but b-a need not equal c-b. Thus, it is customary to say that
the use of an ordinal scale implies a statement of ‘greater than’ or ‘less than’, ‘superior to’ or ‘inferior to’,
‘is above’ or ‘is below’, etc.

For a variable measured on an ordinal scale, the median is the appropriate measure of central tendency. Percentile
ranks and the quartile deviation are used as measures of dispersion. The rank correlation method can be used to
measure the association between two sets of ordered data. Only non-parametric statistical tests can be used for
significance testing.
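
The median and the rank correlation mentioned above can be sketched as follows. This is a minimal illustration that assumes there are no tied ranks; it uses the untied Spearman formula 1 - 6Σd²/(n(n²-1)), and all scores and rankings are hypothetical.

```python
from statistics import median

def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman rank correlation for untied data: 1 - 6*sum(d^2)/(n(n^2-1))."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical hedonic scores (from 1, 3, 5, 7, 9) given by five panelists.
scores = [3, 9, 5, 7, 1]
print(median(scores))  # 5

# Hypothetical rankings of five products by two judges.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]
print(round(spearman_rho(judge_1, judge_2), 3))  # 0.8
```

Only the order of the scores enters both calculations, so replacing 1, 3, 5, 7, 9 with any other increasing set of numerals would give the same median category and the same rank correlation.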

Interval Scale:

In addition to ordering the data, this scale uses equidistant units to measure the difference between scores. This
scale does not have an absolute zero, so ratios have no intrinsic meaning for interval scale data. The interval scale
is the developed form of the ordinal scale: the intervals between the ordered numerals are made equal according to
some rule.

The Fahrenheit/Celsius scale of temperature is an example of an interval scale. For an increase in temperature from
32°F to 42°F and from 64°F to 74°F, we can say that the increases are equal, at 10°F each; but one cannot say that a
temperature of 64°F is twice as hot as a temperature of 32°F. 0°C and 32°F are arbitrarily set points for the
freezing point of water, so the temperatures 32°F and 64°F cannot meaningfully be expressed as a ratio, because the
zero is not a true zero but an arbitrary point.
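
The temperature example can be checked numerically. The sketch below uses the standard Fahrenheit to Celsius conversion and the readings from the paragraph above; it shows that equal differences stay equal after a change of unit, while the apparent ratio of 2 does not survive.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit reading to Celsius."""
    return (f - 32) * 5 / 9

# The "ratio" of 64°F to 32°F is 2 only in Fahrenheit units.
ratio_f = 64.0 / 32.0
low_c = fahrenheit_to_celsius(32.0)   # 0.0, so a Celsius ratio is undefined
high_c = fahrenheit_to_celsius(64.0)  # about 17.78

# Equal differences in Fahrenheit remain equal in Celsius.
diff_1 = fahrenheit_to_celsius(42.0) - fahrenheit_to_celsius(32.0)
diff_2 = fahrenheit_to_celsius(74.0) - fahrenheit_to_celsius(64.0)

print(ratio_f)                              # 2.0
print(low_c)                                # 0.0
print(round(diff_1, 4), round(diff_2, 4))   # 5.5556 5.5556
```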

For interval scale data, we can use the symbols =, > and <. The arithmetic mean and the standard deviation are the
common measures of central tendency and variability respectively. For bivariate interval scale data, the product
moment correlation coefficient is the measure of association, and the t-test and F-test can be used as tests of
significance.

Ratio Scale:

The ratio scale is the ideal scale and an extended form of the interval scale. It possesses the characteristics of
the nominal, ordinal and interval scales. The ratio scale has an absolute zero (true zero or natural zero) of
measurement, which has an empirical meaning. The true zero point, or initial point, indicates the complete absence
of the property of the object that is being measured.

For example, the absolute or natural zero on the centimeter scale indicates the absence of length. Numbers on the
scale indicate the actual amount of the property being measured. All the arithmetic operations, such as addition,
subtraction, multiplication and division, can be applied. Ratios on a ratio scale are themselves meaningful, which
facilitates comparisons that are not possible on an interval scale.

It represents the actual amounts of variables and is used to measure physical dimensions. Some examples of ratio
scale variables are lifetime (age), length/height, mass, distance, cost and income. All arithmetic operations and
rank orderings on ratio scale data are meaningful. Ratio scale data can be manipulated and partitioned.
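
A brief Python sketch of why ratios are meaningful on a ratio scale: because zero means the absence of the property in every unit, a ratio of two lengths does not depend on the unit chosen. The two lengths are hypothetical; 2.54 cm per inch is the standard conversion factor.

```python
from math import isclose

CM_PER_INCH = 2.54

def cm_to_inches(cm):
    """Convert a length in centimeters to inches."""
    return cm / CM_PER_INCH

# Hypothetical lengths of two objects, in centimeters.
length_a_cm, length_b_cm = 50.0, 100.0

ratio_cm = length_b_cm / length_a_cm
ratio_in = cm_to_inches(length_b_cm) / cm_to_inches(length_a_cm)

print(ratio_cm)                     # 2.0
print(isclose(ratio_cm, ratio_in))  # True
print(cm_to_inches(0.0))            # 0.0, zero length is zero in any unit
```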

Question: Define physical scale. Describe different types of physical scale.

Likert Scale:

A Likert scale is a rating scale used to measure opinions, attitudes, or behaviors. It consists of a statement or a
question, followed by a series of five or seven answer statements. Respondents choose the option that best
corresponds with how they feel about the statement or question. Because respondents are presented with a range of
possible answers, Likert scales are well suited to capturing their level of agreement or their feelings regarding
the topic in a more nuanced way.

The Likert scale is extremely popular for measuring attitudes because the method is simple to administer. With the
Likert scale, respondents indicate their own attitudes by checking how strongly they agree or disagree with carefully
worded statements that range from very positive to very negative towards the attitudinal object. Respondents
generally choose from five alternatives (say strongly agree, agree, neither agree nor disagree, disagree, strongly
disagree).

A Likert scale may include a number of items or statements. A disadvantage of the Likert scale is that it takes
longer to complete than other itemized rating scales, because respondents have to read each statement. Despite this
disadvantage, the scale has several advantages: it is easy to construct, administer and use.

Example of Likert Scale:

1. Seven-point Likert scale:

A customer satisfaction survey can simply ask, “How do you rate the customer service you received?” The
respondent could choose from the following options:

● Exceptional
● Excellent
● Very good
● Good
● Fair
● Poor
● Very poor

2. Seven-point agreement scale: An example could be this statement: “Reducing deficit spending is crucial to
maintaining the economic health of the country.” The response options would be

● Strongly disagree
● Disagree
● Somewhat disagree
● Neither agree nor disagree
● Somewhat agree
● Agree
● Strongly agree

3. Five-point Likert scale:

The five-point Likert scale is the one most familiar to the general public. Survey of customer service experience
might offer the following answer options:

● Excellent
● Good
● No opinion
● Poor
● Very poor.
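
One common way to turn Likert responses into numbers can be sketched as follows. The seven agreement options are those listed in example 2; the codes 1 to 7, the reverse-coding of negatively worded items and the summing of item scores are widely used conventions that are assumed here, not taken from the notes. The respondent's answers are hypothetical.

```python
# Assumed numeric codes for the seven agreement options of example 2.
AGREEMENT_CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Somewhat disagree": 3,
    "Neither agree nor disagree": 4,
    "Somewhat agree": 5,
    "Agree": 6,
    "Strongly agree": 7,
}

def score_item(response, reverse=False):
    """Return the numeric code; flip it (8 - code) for negatively worded items."""
    code = AGREEMENT_CODES[response]
    return 8 - code if reverse else code

# Hypothetical respondent answering three statements. The third statement is
# negatively worded, so it is reverse-coded before the scores are summed.
answers = [
    ("Agree", False),
    ("Somewhat agree", False),
    ("Disagree", True),
]
total = sum(score_item(response, reverse) for response, reverse in answers)
print(total)  # 6 + 5 + (8 - 2) = 17
```

Strictly speaking the codes are ordinal, so whether sums and means of such scores are appropriate is a judgment that depends on the study.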

Question: What do you mean by Likert scale? Write two examples of Likert scale.

The issue of validity and reliability in research:

The validity of your experiment depends on your experimental design. What are threats to internal validity? There
are eight threats to internal validity: history, maturation, instrumentation, testing, selection bias, regression to the
mean, social interaction and attrition.

Reliability:

In research, the term reliability means “repeatability” or “consistency”. It is the degree to which an assessment
tool produces stable and consistent results. It refers to the extent to which a test is internally consistent and the
extent to which it yields consistent results on testing and retesting. A measure is considered reliable if it would
give us the same result over and over again (assuming that what we are measuring isn’t changing).

Types of reliability

i. Test-Retest Reliability
ii. Equivalent or Parallel-Forms Reliability
iii. Split-Half Method
iv. Inter-Rater or Inter-Observer Reliability
v. Internal Consistency Reliability
vi. Rational Equivalence Reliability

Test-Retest Reliability:

In this method, the same set of objects/items is measured (tested) repeatedly using the same or a comparable
measuring instrument. The results so obtained are compared by computing the correlation coefficient between the
scores of the different tests (measures). If this method cannot be used directly because of the long time gap between
tests, it must be considered whether any causative factor has had an effect during the period between the two tests.
To address this problem, the control group technique (checking by a team of trained and motivated persons) is applied.
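
The comparison of test and retest scores can be sketched with a hand-written product moment correlation. The notes only say "correlation coefficient", so the choice of the Pearson coefficient is an assumption, and the scores of the five respondents are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Product moment correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of the same five respondents on two administrations
# of the same instrument.
first_scores = [10, 12, 15, 18, 20]
second_scores = [11, 12, 14, 19, 20]

r = pearson_r(first_scores, second_scores)
print(round(r, 3))  # a value close to 1 suggests good test-retest reliability
```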

Validity:

A scale possesses validity when it actually measures what it claims to measure. In other words, a scale is said to be
valid if it measures what it is expected to measure. Interpretation of test scores ultimately involves prediction
about a subject’s behavior in a specified situation. If a test is an accurate predictor, it is said to have good
validity. Before validity can be demonstrated, a test must first yield consistent, reliable measurements. In addition
to reliability, psychologists recognize several main types of validity.

Types of Validity:

There are mainly the following types of validity.

1. Face Validity
2. Content Validity
3. Criterion Related Validity
4. Construct Validity

Face Validity:

Face validity refers to the extent to which a test appears to measure what it claims to measure based on face value.
For example, a researcher develops a questionnaire to measure depression level in employees working in private
organizations. A colleague of the researcher then looks over the questions and judges the questionnaire to be valid
purely on face value. Face validity means that the content of the test is relevant and appropriate in its appearance.
It is the weakest and simplest form of validity.

Content Validity:

Content validity is the extent to which the measurements cover all aspects of the concept being measured. For
example, a researcher aims to measure the English language ability of college-level students. The researcher develops
a test that contains reading and writing components, but no listening component. Listening is an essential aspect of
language ability, so the test lacks content validity as a measure of English language ability.

Criterion Validity:

Criterion validity evaluates how accurately a test measures the outcome it was designed to measure. The outcome could
be a behavior, performance, etc. For example, a researcher wants to know whether a college entrance exam is able to
predict the future academic performance of newly enrolled students. First semester GPA can then serve as the
criterion variable, as it is an accepted measure of academic performance. After the students complete the first
semester, the researcher can compare their college entrance exam scores with their GPA. If the two sets of scores are
strongly correlated, then the college entrance exam has criterion validity.

a. Concurrent Validity: Concurrent validity is used when the scores of a test and the criterion variable are
obtained at the same time. The scores of the new test are correlated with those of another test that is already
considered valid. A high correlation between the new test and the criterion variable indicates concurrent validity.
b. Predictive Validity: Predictive validity is used when the criterion variable is measured after the scores of
the test. For example, a researcher examines how the result of a job recruitment test can be used to predict the
future performance of employees.
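
The entrance exam example can be sketched as a correlation between test scores and the criterion variable. The sketch computes the correlation as covariance divided by the product of the standard deviations; the exam scores and GPA values are hypothetical, and how high a correlation must be to count as evidence of criterion validity is a matter of judgment.

```python
from statistics import mean, pstdev

def correlation(x, y):
    """Correlation as population covariance over the product of std. deviations."""
    mean_x, mean_y = mean(x), mean(y)
    cov = mean([(a - mean_x) * (b - mean_y) for a, b in zip(x, y)])
    return cov / (pstdev(x) * pstdev(y))

# Hypothetical entrance exam scores and the first semester GPA (the criterion
# variable) later obtained by the same five students.
entrance_scores = [55, 60, 70, 80, 90]
first_semester_gpa = [2.4, 2.9, 3.0, 3.5, 3.8]

validity_coefficient = correlation(entrance_scores, first_semester_gpa)
print(round(validity_coefficient, 2))  # a high positive value supports predictive validity
```

If both sets of scores were collected at the same time, the same calculation would speak to concurrent rather than predictive validity.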

Construct Validity:

A construct is a theoretical concept or idea that is usually not directly measurable, for example self-esteem,
motivation or anxiety. Construct validity concerns the extent to which your test or measure accurately assesses
what it is supposed to.

a. Convergent Validity: The extent to which measures of the same or similar constructs actually correspond to
each other.
b. Discriminant Validity: The extent to which two measures of unrelated constructs (e.g. anxiety and self-esteem),
which should be unrelated, very weakly related or negatively related, actually are so in practice.

Question: What are the issues with validity in research?

Research validity refers to how accurately a method, instrument, or technique measures the object of study. Validity
is an essential consideration during research design for any study, as well as when planning methods and writing up
one's results.

The validity of the findings, the data collected, the instrument used in data collection and the research design is
an important concern in social research. Similar to reliability, the issue of validity transcends methodological
boundaries. In
quantitative research validity refers to the ability of the instrument to measure what it is supposed to measure,
whereas in qualitative research the issue of validity goes beyond data extending to the research design adopted, the
techniques (for example, observation, ethnography, interviews and narratives) used in data collection and the
findings discussed in the research study.

Researcher bias is the biggest threat to validity in qualitative research. As mentioned earlier, when the researcher
becomes the instrument of data collection in qualitative research, the potential for bias in recording observations
is enormous. The researcher’s personal factors (religious, economic, cultural, gender, etc.), theoretical assumptions,
political affiliations, etc. influence the collection and interpretation of data.

For example, a thermometer is designed to measure body temperature; it cannot be used to check blood pressure. If a
thermometer is used to measure body temperature, we say that it is a valid instrument. But if a thermometer is used
to check blood pressure, we say that it is an invalid instrument. As another example, a test designed to measure job
satisfaction is supposed to measure job satisfaction, not employees’ performance.

Issue of Reliability:

Reliability refers to the extent to which studies can be repeated, and if they can, whether the same sorts of results
would be obtained. Consistency of research is the important principle here.

Reliability is another issue where opinions may vary regarding each study. You may disagree if you discuss this in a
group, or read someone else’s view. The most important point is that you can apply your knowledge and make a case
for your ideas by referring to evidence from the studies. Reliability tends to be linked with laboratory studies which
are more tightly controlled, so it may well be the case that reliable studies lack ecological validity.

‘Replicability’ is a useful term to know when considering reliability. This is how we refer to whether the steps of the
procedure could be copied by someone else. A good analogy is with a recipe, which a cookery book will describe in
careful detail so that it can be followed precisely by another cook. The more like a recipe the procedure of a piece of
research is, the more likely it is to be replicable. ‘Representativeness’ is also an issue related to reliability, because if
the sample is unique in some way, the results are more likely to be different if the study was repeated. Therefore, the
more representative of the target population the sample is, the more it is probable that the research would produce
the same results next time. ‘Inter-rater reliability’ is a phrase which refers chiefly to observations, meaning the
extent to which two or more observers agree on what they have seen.
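
Inter-rater reliability can be illustrated with a simple agreement calculation. Percent agreement follows directly from the description above; Cohen's kappa, which corrects agreement for chance, is a commonly used statistic that the notes do not mention and is included here as an assumption. The observers' codes are hypothetical.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of observations on which the two raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Expected chance agreement from each rater's marginal frequencies.
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes given by two observers to the same ten observed behaviors.
observer_1 = ["on", "on", "off", "on", "off", "on", "on", "off", "on", "off"]
observer_2 = ["on", "on", "off", "off", "off", "on", "on", "off", "on", "on"]

print(percent_agreement(observer_1, observer_2))       # 0.8
print(round(cohens_kappa(observer_1, observer_2), 3))  # 0.583
```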

Questions:

1. Define measurement and scale.
2. Define physical scale. Describe different types of physical scale.
3. What do you mean by Likert scale? Write two examples of Likert scale.
4. What are reliability and validity?
5. What are issues with reliability and validity?
6. How are reliability and validity used in research?
7. What are examples of reliability and validity?
8. What are the issues with validity in research?
