BUS 332 Scientific Research Techniques

Main Textbook:

William G. Zikmund
Business Research Methods

Lectured by Prof. Dr. Lütfihak Alpkan, Gebze Institute of Technology



20.02 Introduction
27.02 Ch. 9: Survey Research: An Overview
05.03 Ch. 10: Survey Research: Communication with Respondents
12.03 Ch. 11: Observation Methods 1
19.03 Ch. 11: Observation Methods 2

26.03 General overview



09.04 Ch. 12: Experimental Research 1
16.04 Ch. 12: Experimental Research 2
30.04 Ch. 12: Experimental Research 3
07.05 Ch. 13: Measurement and Scaling 1
14.05 Ch. 13: Measurement and Scaling 2


21.05 General overview

Chapter 9: Survey Research

1. Basic Definitions for surveys
2. Errors in Surveys
3. Classification of Survey Methods

1. Basic Definitions for surveys

Survey: a research technique in which information (primary data) is gathered from a sample of people in order to make generalizations.
Primary data: data gathered and assembled specifically for the project at hand.
Sample of the survey: the respondents who are asked to provide information, on the assumption that they represent (possess the same features as) a target population.

Selecting a Sample
Sample: Subset of a larger population

Sampling from a population raises three questions:
Who is to be sampled?
How large a sample?
How will sample units be selected?

Basic Definitions for sampling

Target population: the group about which the researcher wishes to draw conclusions and make generalizations

Random sampling: selecting a sample from a larger target population where each respondent is chosen entirely by chance and each member of the population has a known, but possibly nonequal, chance of being included in the sample.
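As an illustration only, the sketch below draws a simple random sample, the special case in which every member of the population has the same known chance (n/N) of inclusion. The sampling frame of 100 labelled people and the function name `simple_random_sample` are hypothetical and not part of the course material.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units purely by chance (equal inclusion probability n/N)."""
    rng = random.Random(seed)          # a seed makes the draw reproducible
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical sampling frame of 100 people.
population = [f"person_{i}" for i in range(1, 101)]
sample = simple_random_sample(population, 10, seed=42)

# Known chance of inclusion for each member of the population.
inclusion_probability = 10 / len(population)
print(sample, inclusion_probability)
```

Designs with known but unequal chances (for example stratified samples) would need weights per unit; this sketch does not cover them.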

Basic Definitions for data collection

Surveys ask respondents (who are the subjects of the research) questions by use of a questionnaire.
Respondent: the person who provides information (primary data) by answering a questionnaire or an interviewer's questions.
Questionnaire: a list of structured questions designed by the researchers for the purpose of codifying and analyzing the respondents' answers scientifically.
Advantages of surveys: a quick, inexpensive, efficient, accurate, and flexible way of gathering information.

2. Errors in Surveys
2.1. Random Sampling Error
2.2. Systematic Error (sample bias)
2.2.1. Respondent error
* Nonresponse bias
* Response bias
2.2.2. Administrative error
* Data processing error
* Sample selection error
* Interviewer error
* Interviewer cheating

2.1. Random Sampling Error

Even if randomly selected, samples may possess different characteristics than the target population (the likelihood of bias is reduced but still exists). This is a statistical fluctuation due to chance variation. As a result, a difference can arise between the findings obtained from this sample and the findings that would be obtained from a census of the whole target population.

Consider the hypothetical case in which a study sample could be increased until it was infinitely large: chance variation of the mean, or random error, would be reduced toward zero. Systematic errors, by contrast, would not be diminished by increasing sample size. (Bias in Research Studies,
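The contrast between the two kinds of error can be sketched with a small simulation. Everything here is an assumption made for illustration: the synthetic population (normal, mean 50, standard deviation 10), the biased frame that omits all values of 45 or less, and the helper name `mean_abs_error`.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical target population of 20,000 values.
population = [rng.gauss(50, 10) for _ in range(20_000)]
true_mean = statistics.fmean(population)

def mean_abs_error(frame, n, repeats=200):
    """Average absolute gap between a sample mean and the true population mean."""
    errors = [
        abs(statistics.fmean(rng.sample(frame, n)) - true_mean)
        for _ in range(repeats)
    ]
    return statistics.fmean(errors)

# A frame that systematically misses low values (a sample selection error).
biased_frame = [x for x in population if x > 45]

random_small = mean_abs_error(population, 25)      # random error, small sample
random_large = mean_abs_error(population, 2500)    # random error, large sample
biased_small = mean_abs_error(biased_frame, 25)    # biased frame, small sample
biased_large = mean_abs_error(biased_frame, 2500)  # biased frame, large sample

print(random_small, random_large, biased_small, biased_large)
```

With the full frame the error shrinks as the sample grows; with the biased frame the error stays roughly the size of the bias no matter how large the sample is.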

2.2. Systematic Error

Systematic error results from some mistake(s) made in the design and/or execution of the research. All types of error, except random sampling error, are included in this definition.
Sample bias: a persistent tendency for the results of a sample to deviate in one direction from the true value of the population parameter. Sample bias can arise when the intended sample does not adequately reflect the spectrum of characteristics in the target population.

2.2.1. Respondent Bias

A classification of sample bias resulting from some respondent action or inaction:
Nonresponse bias
Response bias

Nonresponse Error
Nonrespondents: in almost every survey, information cannot be collected from some portion of the sample, small or large. Nonrespondents are the people who refuse to respond or who cannot be contacted (not-at-homes).

Self-selection bias: only those people who are strongly interested in the topic of the survey may respond, while those who are in the same sample but are indifferent or afraid avoid participating.
This leads to the over-representation of some extreme positions and the under-representation of others.

Response Bias
A bias that occurs when respondents tend to answer questions with a certain inclination or viewpoint that consciously (deliberate falsification) or unconsciously (unconscious misinterpretation) misrepresents the truth.

Reasons of response bias

Knowingly or unknowingly, people who answer an interviewer's questions may feel uncomfortable about sharing the truth with others and alter it in their responses. They may wish to present themselves as more intelligent, wealthy, sensitive, etc. than they really are.

Types of Response Bias

Deliberate falsification (consciously false answers)
Acquiescence bias (positive answers)
Extremity bias (exaggerated answers)
Interviewer bias (answers acceptable by the interviewer)
Auspices bias (answers acceptable by the organization)
Social desirability bias (answers creating a favorable impression)

2.2.2. Administrative Error

Inadvertent or careless improper administration and execution of the research task.
Blunders are:
Confusion
Neglect
Omission

Types of Administrative Errors

Data processing error: incorrect data entry, computer programming, or other procedural errors during the analysis stage.
Sample selection error: improper sample design (e.g. based on incomplete databases) or sampling procedure execution (e.g. executed in daytime while most of the target population are working).
Interviewer error: mistakes made by the interviewer (e.g. taking wrong or incomplete notes about the answers of the respondents).
Interviewer cheating: filling in fake or false answers that were not given by the respondents.

3. Classification of Survey Methods

3.1. Structure of the questionnaire:
* whether standardized questions with a limited number of allowable answers (multiple choice)
* or unstandardized open-ended questions that can be answered in numerous ways.
3.2. Level of directness of the questions:
* whether direct/undisguised questions
* or indirect/disguised questions that hide the real purpose of the survey


3.3. Time basis of the Survey:
Cross-Sectional Study: data on various segments of a target population are collected at a single moment in time to make comparisons among segments.

Longitudinal Study: data are collected at different times from similar respondents to compare trends and identify changes.
Panel Study: A longitudinal survey of exactly the same respondents to record (in a diary) their attitudes, behaviors, or purchasing habits over time.

Chapter 10: Survey Research: Basic Communication Methods

Comparison of Basic Communication Methods in Surveys:
* Questionnaires administered by an interviewer
1. Door-to-door interviews
2. Mall intercepts
3. Telephone interviews
* Self-administered questionnaires
4. Questionnaires sent by mail, fax, or e-mail
5. Internet questionnaires

1. Door-to-Door Personal Interview

Speed of data collection: Moderate to fast

Geographical flexibility: Limited to moderate

Respondent cooperation: Excellent

Versatility of questioning: Quite versatile

Questionnaire length: Long

Item non-response: Low

Possibility of respondent misunderstanding: Lowest


Degree of interviewer influence of answer: High

Supervision of interviewers: Moderate

Anonymity of respondent: Low

Ease of call back or follow-up: Difficult

Cost: Highest

Special features: Visual materials may be shown or demonstrated; extended probing possible

2. Mall Intercept Personal Interview

Speed of data collection: Fast

Geographical flexibility: Confined, urban bias

Respondent cooperation: Moderate to low

Versatility of questioning: Extremely versatile

Questionnaire length: Moderate to long

Item non-response: Medium

Possibility of respondent misunderstanding: Lowest


Degree of interviewer influence of answers: Highest

Supervision of interviewers: Moderate to high

Anonymity of respondent: Low

Ease of call back or follow-up: Difficult

Cost: Moderate to high

Special features: Taste test, viewing of TV commercials possible

3. Telephone Surveys
Speed of Data Collection: Very fast

Geographical Flexibility: High

Respondent Cooperation: Good

Versatility of Questioning: Moderate

Questionnaire Length: Moderate

Item Non-response: Medium

Possibility of Respondent Misunderstanding: Average

Degree of Interviewer Influence of Answer: Moderate

Supervision of interviewers: High, especially with central location WATS (Wide Area Telecommunications Service) interviewing

Anonymity of respondent: Moderate

Ease of call back or follow-up: Easy

Cost: Low to moderate

Special features: Fieldwork and supervision of data collection are simplified; quite adaptable to computer technology (e.g. central location interviewing, computer-assisted telephone interviewing, computerized voice-activated interviews)

Self-Administered Questionnaires








4. Mail Surveys
Speed of data collection: Researcher has no control over return of questionnaire; slow

Geographical flexibility: High

Respondent cooperation: Moderate, but a poorly designed questionnaire will have a low response rate

Versatility of questioning: Highly standardized format

Questionnaire length: Varies depending on incentive

Item non-response: High

Possibility of respondent misunderstanding: Highest (no interviewer present for clarification)

Degree of interviewer influence of answer: None (interviewer absent)

Supervision of interviewers: Not applicable

Anonymity of respondent: High

Ease of call back or follow-up: Easy, but takes time

Cost: Lowest

5. E-Mail Questionnaire Surveys

Speed of data collection: Instantaneous

Geographic flexibility: Worldwide

Cheaper distribution and processing costs

Flexible, but extensive differences in the capabilities of respondents' computers and e-mail software limit the types of questions and the layout

E-mails are not secure and eavesdropping can possibly occur

Respondent cooperation: Varies depending on whether the e-mail is seen as spam

6. Internet Surveys
A self-administered questionnaire posted on a Web site. Respondents provide answers to questions displayed online by highlighting a phrase, clicking an icon, or keying in an answer.

Speed of data collection: Instantaneous
Geographic flexibility: Worldwide
Cost effective, visual and interactive
Respondent cooperation:
Varies depending on web site
Varies depending on type of sample
When the user does not opt in or expect a voluntary survey, cooperation is low.
Self-selection problems: in web site visitation surveys, participants tend to be more deeply involved than the average person.

Versatility of questioning: Extremely versatile
Questionnaire length: Varies according to the answers of each respondent
Item non-response: Software can assure none
Possibility for respondent misunderstanding: High
Interviewer influence of answers: None
Supervision of interviewers: Not required
Anonymity of respondent: Respondent can be anonymous or known
Ease of callback or follow-up: Difficult unless e-mail address is known
Special features: Allows graphics and streaming media

Chapter 11: Observation Methods

1. Types of Observed Phenomena
2. Advantages and Disadvantages of Observation
3. Types of Observation Techniques

1. Types of Observed Phenomena

Physical actions
Verbal behavior
Expressive behavior
Spatial relations and locations
Temporal patterns
Verbal and pictorial records

Examples for Observed Phenomena

Phenomena: Example

Human behavior or physical action: Shoppers' (buyers') movement pattern in a store
Verbal behavior: Statements made by airline travelers who wait in line
Expressive behavior: Facial expressions, tone of voice, and other forms of body language
Spatial relations and locations: How close visitors at an art museum stand to paintings
Temporal patterns: How long fast-food customers wait for their order to be served
Physical objects: What brand name items are stored in consumers' pantries
Verbal and pictorial records: Bar codes on product packages

2. Advantages and Disadvantages of Observation


2.1. Benefits of Observing Human Behavior

Communication with respondent is not necessary
Data without distortions due to self-report (e.g. without social desirability bias)
No need to rely on respondents' memory
Nonverbal behavior data may be obtained


Certain data may be obtained more quickly
Environmental conditions may be recorded
May be combined with survey to provide supplemental evidence

2.2. Limitations of Observing Human Behavior

Cognitive phenomena cannot be observed
Interpretation of data may be a problem (e.g. misinterpretation)
Not all activity can be recorded
Only short periods can be observed
Observer bias possible (e.g. selective perception)
Possible invasion of privacy

3. Types of Observation Techniques

Natural versus Contrived Observation
Direct versus Indirect Observation
Disguised versus Nondisguised Observation
Physical-trace evidence Observation
Mechanical Observation

3.1. Natural versus Contrived Observation

Natural Observation: Reactions and behavior observed as they occur naturally in real-life situations.
A wide variety of companies are sending researchers to the field to observe consumers in their natural environment.
Natural observation is also suited for ethnographic research on foreign cultures.

Contrived Observation: Environment artificially set up by the researcher.
Researchers are increasingly relying on computers to conduct simulated market testing.
Offers a greater degree of control
Speedy
Efficient
Less expensive

However, it may be questionable whether the data collected truly reflect a "real life" situation.

3.2. Direct versus Indirect Observation

Direct observation captures the actual behavior or phenomenon of interest.
Indirect observation consists of examining the results of the phenomenon:
Can give only relatively crude or imprecise indications of a phenomenon
More efficient use of research time
More efficient use of research budget
May be the only way to get data from situations that are impractical to observe directly.

3.3. Disguised versus Nondisguised Observation

Nondisguised observation: Respondents are aware that they are being observed.
Data may be contaminated by respondent-induced errors.
Data gathered through disguised observation might not be as rich as those from nondisguised observation.

Disguised observation: Respondents are unaware that they are being observed.
Allows for monitoring of the true reactions of individuals.
Unethical if disguised observation monitors:
Normally private behaviors
Behaviors that may not be voluntarily revealed to researchers.

Mystery shopping: a popular disguised observational technique.
The mystery shopper:
Is unknown to the retail establishment
Visits the store
Uses a structured script
Observes and records the shopping experience.

3.4. Physical-trace evidence

Wear and tear of a book indicates how often it has been read
Garbology: looking for traces of purchase patterns in garbage
Detecting store traffic patterns by observing the wear in the floor (long term) or the dirt on the floor (short term)

3.5. Types of Mechanical Observation

Eye-Tracking
Response Latency
Voice Pitch Analysis
People Meter
Psychogalvanometer
Monitoring Web Site Traffic

Eye Tracking
Measures unconscious eye movements
Records how the subject actually reads or views an advertisement, product packaging, promotional displays, websites, etc.
Measures which sections attract customers' attention and how much time they spend looking at those sections
Oculometers: what the subject is looking at
Pupilometers: how interested the viewer is (this device observes and records changes in the diameter of the subject's pupils)

Voice Pitch Analysis

Measures emotional reactions through physiological changes in a person's voice
Used to determine:
how strongly a respondent feels about an answer
how much emotional commitment is attached to an answer.

Variations from normal voice pitch are considered a measure of emotional commitment to the question's answer.

Response Latency
It measures the speed with which a respondent decides between alternatives.
It records the decision time necessary to make this choice.
For instance, it can measure the effectiveness of an advertisement on brand preferences.
It assumes that a quick expression of brand preference indicates a stronger preference.

People Meter
Electronic device to monitor television viewing behavior:
who is watching
what shows are being watched.

Psychogalvanometer
Measures galvanic skin response
Involuntary changes in the electrical resistance of the skin
Assumption: physiological changes accompany emotional reactions

Chapter 12.1.: Basics of Experimental Research

1. Basics of Experiment & Causality
2. Advantages and disadvantages of the experimental method
3. Steps of a well-planned experiment
4. Validity in experiments

1. Basics of Experiment & Causality

1.1. Definition of Experiment:
An experiment is a study involving intervention by the researcher beyond that required for measurement.
The usual intervention is to manipulate some variable in a setting and observe how it affects the participants or subjects being studied. There is at least one independent variable and one dependent variable in a causal relationship.

1.2. Causal Evidence There are three types of evidence necessary to support causality.
Agreement between Independent and Dependent Variables

Time order of occurrence

Extraneous variables did not influence Dependent Variables

1.2.1. Agreement between Independent and Dependent Variables

First, there must be an agreement between independent and dependent variables. The presence or absence of one is associated with the presence or absence of the other.

1.2.2. Time order of occurrence

Second, beyond the correlation of independent and dependent variables, we consider the time order of the occurrence of the variables.

The effect on the dependent variable should not precede the manipulation of the independent variable.
The effect and manipulation may occur simultaneously or the manipulation may occur before the effect.

1.2.3. Extraneous variables did not influence Dependent Variables

The third source of support comes when researchers are confident that other extraneous variables did not influence the dependent variable. To ensure that these other variables are not the source of influence, researchers control their ability to confound the planned comparison.

2. Advantages and disadvantages of the experimental method

Advantages:
Ability to manipulate Independent Variable
Use of control group
Control of extraneous variables
Replication possible
Field experiments possible

Disadvantages:
Artificiality of labs
Non-representative sample
Expensive
Focus on present and immediate future
Ethical limitations

2.1. Explanation of Some Advantages of Experiments

Replication is the process of repeating an experiment with different participant groups and conditions to determine the average effect of the Independent Variables across people, situations, and times.
A field experiment is a study of the dependent variable in actual environmental conditions.

2.2. Explanation of Some Disadvantages of Experiments

The artificiality of a lab is possibly the greatest disadvantage of experiments. Also, experiments typically use small convenience samples which cannot be generalized to a larger population. Compared to surveys, they are expensive. They also cannot deal with past events or predict events in the far-off future. Finally, marketing research is often concerned with the study of people and there are limits to the types of manipulation and controls that are ethical.

3. Steps of a well-planned experiment

Specify treatment variables
Specify treatment levels
Control environment
Choose experimental design
Select and assign participants
Pilot-test, revise, and test
Collect data
Analyze data


The activities the researcher must accomplish to make an experiment a success:

3.1. Specify treatment variables: a) select variables that are the best operational definitions of the original concepts, b) determine how many variables to test, c) select or design appropriate measures for the chosen variables. The selection of measures for testing requires a thorough review of the available literature and instruments.

3.2. Specify treatment levels: In an experiment, participants experience a manipulation of the independent variable, called the experimental treatment. The treatment levels are the arbitrary or natural groups the researcher makes within the independent variable. A control group is a group of participants that is measured but not exposed to the independent variable being studied. A control group can provide a base level for comparison.

3.3. Control environment:

Environmental control means holding the physical environment of the experiment constant. When participants do not know if they are receiving the experimental treatment, they are said to be blind.

When neither the participant nor the researcher knows, the experiment is said to be double-blind.
3.4. Choose experimental design: The design is then selected. Several designs are discussed on the next several slides.

3.5. Select and assign participants:

The participants selected for the experiment should be representative of the population to which the researcher wishes to generalize the study's results. Random assignment is required to make the groups as comparable as possible. Random assignment uses a randomized sample frame for assigning participants to experimental and control groups.

Matching is an equalizing process for assigning participants to experimental and control groups.

3.5.1. Random assignment: The sampling frame is often small for experiments and the participants may be self-selected. However, if randomization is used, those assigned to the experimental group are likely to be similar to those assigned to the control group. Random assignment allows one to make the groups as comparable as possible. It means that participants have an equal and known chance of being assigned to any of the groups in the experiment.
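A minimal sketch of random assignment, assuming 20 hypothetical participants and two groups. The function name `randomly_assign` and the participant labels are illustrative; shuffling the list and dealing participants out in turn gives each one an equal and known chance of landing in either group.

```python
import random

def randomly_assign(participants, groups=("experimental", "control"), seed=None):
    """Shuffle the participants, then deal them out to the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)           # chance alone decides the order
    assignment = {group: [] for group in groups}
    for i, person in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(person)
    return assignment

# Hypothetical participant pool.
participants = [f"P{i:02d}" for i in range(1, 21)]
assignment = randomly_assign(participants, seed=7)
print(assignment)
```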

3.5.2. Matching: Matching is a control procedure to ensure that experimental and control groups are equated on one or more variables before the experiment.
The object of matching is to have each experimental and control participant matched on every characteristic used in the research. Matching employs a nonprobability quota sampling approach.

A quota matrix is a means of visualizing the matching process. If matching does not alleviate assignment problems, a combination of matching, randomization, and increasing the sample size may be useful.

Quota Matrix Example

Exhibit 10-3 presents an example of a quota matrix. One-third of the participants from each cell of the matrix would be assigned to each of the three groups.
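The idea of splitting each cell of a quota matrix evenly over three groups can be sketched as follows. The exhibit itself is not reproduced here, so the matrix below is an assumption: four cells defined by gender and age band, six hypothetical participants per cell, and three made-up group names.

```python
import random
from collections import Counter, defaultdict

def assign_by_quota(participants, cell_of, groups, seed=None):
    """Spread the members of every quota-matrix cell evenly over the groups."""
    rng = random.Random(seed)
    cells = defaultdict(list)
    for person in participants:
        cells[cell_of(person)].append(person)
    assignment = {group: [] for group in groups}
    for cell in sorted(cells):
        members = cells[cell]
        rng.shuffle(members)  # chance decides who goes where within a cell
        for i, person in enumerate(members):
            assignment[groups[i % len(groups)]].append(person)
    return assignment

# Hypothetical matrix: 4 cells (gender x age band), 6 participants per cell.
traits = [(g, a) for g in ("F", "M") for a in ("under 35", "35 and over")]
participants = [
    (f"P{cell * 6 + k + 1:02d}", gender, age)
    for cell, (gender, age) in enumerate(traits)
    for k in range(6)
]
groups = ("treatment A", "treatment B", "control")
assignment = assign_by_quota(participants, lambda p: (p[1], p[2]), groups, seed=3)

# Each group should hold one-third (here 2 of 6) of every cell.
cell_counts = {g: Counter((p[1], p[2]) for p in assignment[g]) for g in groups}
print(cell_counts)
```

Shuffling inside each cell combines matching with randomization, which the text suggests when matching alone does not solve assignment problems.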

4. Validity in Experimentation
Internal validity exists when the conclusions drawn about a demonstrated experimental relationship truly imply cause.
External validity exists when an observed causal relationship can be generalized across persons, settings, and times.

4.1. Threats to Internal Validity

There are twelve possible threats to internal validity:
History
Maturation
Testing
Instrumentation
Selection
Statistical regression
Experimental mortality
Diffusion or imitation of treatment
Compensatory equalization
Compensatory rivalry
Resentful demoralization of the disadvantaged
Local history


History: In the experimental designs a control measurement (O1) of dependent variable is often taken before introducing the manipulation (X). After the manipulation an after measurement (O2) of the dependent variable is taken. Then the difference between O1 and O2 is attributed to the manipulation. (See also One Group Pretest-Posttest Design)
However some events may occur during the course of the experimental study, which will affect the relationship between the variables under the study.


Maturation: Changes may also occur within the participant that are a function of the passage of time and are not specific to any particular event. A participant may become hungry, bored, or tired and these conditions can affect response results.

Testing: The process of taking a test can affect the scores of a second test. For instance, repeatedly taking (the same or similar) intelligence tests usually leads to score gains.


Instrumentation: This threat to internal validity results from changes between observations in either the measuring instrument or the observer.

Selection: Differential selection of subjects for experimental and control groups affects the validity. Validity considerations require the groups to be equivalent in every aspect. The problem can be overcome by randomly assigning the subjects to experimental and control groups. In addition, matching the members of the groups on key factors enhances the equivalence of the groups.


Statistical regression: This factor operates especially when groups have been selected by their extreme scores. For example, when children with the worst reading scores are selected to participate in a reading course, improvements at the end of the course might not be due to the course's effectiveness.
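Statistical regression can be made concrete with a simulation in which no course is given at all. The numbers are assumptions chosen for illustration: 2,000 hypothetical children, a stable reading skill (mean 100, standard deviation 10), and two tests that each add independent measurement noise of the same size.

```python
import random
import statistics

rng = random.Random(1)
n = 2000

# Stable true skill plus independent noise on each test; no treatment is applied.
ability = [rng.gauss(100, 10) for _ in range(n)]
test1 = [a + rng.gauss(0, 10) for a in ability]
test2 = [a + rng.gauss(0, 10) for a in ability]

# Select the children with the worst 10 percent of first-test scores.
cutoff = sorted(test1)[n // 10]
selected = [i for i in range(n) if test1[i] < cutoff]

mean_before = statistics.fmean(test1[i] for i in selected)
mean_after = statistics.fmean(test2[i] for i in selected)
apparent_gain = mean_after - mean_before

print(mean_before, mean_after, apparent_gain)
```

The selected group scores noticeably higher on the retest even though nothing was done to it, because part of its extreme first score was bad luck that does not repeat. A control group selected the same way is one way to separate this effect from a real treatment effect.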

Experimental mortality: This occurs when the composition of the study groups changes during the test. Some participants may drop out of the experiment.


Diffusion or imitation of treatment: If people in the experimental and control groups talk, then those in the control group may learn of the treatment. This eliminates the difference between the groups.

Compensatory equalization: Where the experimental treatment is much more desirable for the experimental group, there may be an administrative reluctance to deprive the control group members. Actions to compensate the control group may confound the experiment.


Compensatory rivalry: This may occur when members of the control group know they are in the control group. This may generate competitive pressures, causing the control group members to try harder (e.g. Hawthorne effect).

Resentful demoralization of the disadvantaged: When the treatment is desirable and the experiment is conspicuous, control group members may become resentful that they are deprived and lower their cooperation and output.


Local history: The regular history effect already mentioned impacts both experimental and control groups alike. When one assigns all experimental persons to one group session and all control group people to another, there is a chance for some peculiar event to confound results.

4.2. Threats to External Validity

External validity is concerned with the interaction of the experimental treatment (X) with other factors and the resulting impact on the ability to generalize to (and across) times, settings, or persons. External validity is high when the results of an experiment are applicable to a larger population.

Three major threats to external validity are as follows:

Reactivity of testing on X
Interaction of selection and X
Other reactive factors

Reactivity of testing on X
The reactive effect refers to sensitizing participants via a pretest so that they respond to the experimental stimulus (X) in a different way.
For instance, people who participate in a web survey may then be sensitized to store displays and organization.

Interaction of selection and X

The process by which test participants are selected for an experiment may be a threat to external validity. The population from which one selects participants may not be the same as the population to which one wishes to generalize the results. It limits the generalizability of the findings.

Other reactive factors

The experimental settings themselves may have a biasing effect on a participant's response to X. An artificial setting can produce results that are not representative of larger populations. If participants know they are participating in an experiment, there may be a tendency to role-play in a way that distorts the effects of X. Another reactive effect is the possible interaction between X and participant characteristics.

Chapter 12.2.: Types of Experimental Research Designs

1. Pre-experiments
2. True experiments
3. Field experiments

X refers to the treatment or manipulation of the independent variable (more than one X refers to a different level of treatment).
O refers to the observation or measurement of the dependent variable. Experimental designs vary widely in their power to control contamination of the relationship between the independent and dependent variables. Experiments can be categorized as pre-experiments, true experiments, and field experiments based on the characteristic of control.

1. Pre-experiment
Pre-experimental research designs are research designs that are characterized by a lack of random selection and assignment.
Types of Pre-experiments:
After-Only Case Study
One Group Pretest-Posttest Design
Static Group Comparison

1.1. After-Only Case Study


In this type of experimental design only one treatment (X) or manipulation is done on the independent variable. Then, the dependent variable is measured.

An example is a media campaign about a product's features without a prior measurement of consumer knowledge.

Results would reveal only how much target consumers know after the media campaign, but there is no way to judge the effectiveness of the campaign.
The lack of a pretest and control group makes this design inadequate for establishing causality.

1.2. One Group Pretest-Posttest Design



This design meets the threats to internal validity better than the one-shot case study, but it is still a weak design. For example, a researcher examining the effect of a commercial on brand liking would begin by taking a pre-test to determine current levels of brand liking among the participants.



The commercial would be shown.

Then a post-test would measure brand liking after the commercial.

A comparison between the post-test and the pre-test shows the change in liking.
However, any changes in liking are not necessarily due to the commercial. The act of giving a pre-test could have influenced liking (testing effect).

1.3. Static Group Comparison

Experimental Group:


Control Group:


This design provides for two groups, one of which receives the experimental stimulus while the other serves as a control.

For example, imagine that a new type of cheeseburger is being introduced, and an advertisement campaign is run.
After the ad airs, those who remember seeing it would be in the experimental group (X). Those who have no recall of the ad would be in the control group. The intent of each group to purchase the cheeseburger would be measured. The main weakness of this design is that there is no way to be certain that the two groups are equivalent or that the individuals are representative.

2. True experiment
A true experiment is a method of social research in which there are two kinds of variables. The independent variable is manipulated by the experimenter, and the dependent variable is measured. The defining characteristic of a true experiment is that it randomly allocates subjects to groups in order to neutralize potential bias and ensure equivalence. There is also a control group for comparison.
Types of true experiments:
Pretest-Posttest Control Group Design
Posttest-Only Control Group Design

2.1.Pretest-Posttest Control Group Design

Experimental Group: R O1 X O2

Control Group: R O3 O4

The symbol R means that true experimental designs use randomly assigned groups to ensure equivalence. The effect of the experimental treatment is: E = (O2 - O1) - (O4 - O3). This design deals with many of the threats to internal validity, but local history, maturation, and communication among groups can still lead to problems. External validity is threatened because there is a chance for a reactive effect from testing.
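The effect calculation can be illustrated with a short sketch. It assumes the notation in which O1 and O2 are the experimental group's pretest and posttest and O3 and O4 are the control group's pretest and posttest, so that E = (O2 - O1) - (O4 - O3). The function name and the brand-liking scores are hypothetical and serve only as an illustration.

```python
from statistics import mean

def treatment_effect(exp_pre, exp_post, ctl_pre, ctl_post):
    """Return E = (O2 - O1) - (O4 - O3), using the group mean for each observation."""
    o1, o2 = mean(exp_pre), mean(exp_post)   # experimental group: pretest, posttest
    o3, o4 = mean(ctl_pre), mean(ctl_post)   # control group: pretest, posttest
    return (o2 - o1) - (o4 - o3)

# Hypothetical brand-liking scores on a 1-7 scale (illustrative data only).
effect = treatment_effect(
    exp_pre=[3, 4, 3, 4], exp_post=[5, 6, 5, 6],
    ctl_pre=[3, 4, 4, 3], ctl_post=[4, 4, 4, 4],
)
print(effect)  # 1.5
```

Subtracting the control group's change removes the part of the experimental group's change that would have happened without the treatment, such as testing effects shared by both groups.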

2.2. Posttest-Only Control Group Design

Experimental Group: R X O1

Control Group: R O2

In this design, the pretest measurements are omitted. Pretests are well established in classical research design but are not really necessary when it is possible to randomize. The experimental effect is measured by the difference between O1 and O2. Internal validity threats from history, maturation, selection, and statistical regression are controlled adequately by the random assignment. Different mortality rates could cause a problem.

Example for Posttest-Only Control Group Design

Buick dealerships wish to determine the effectiveness of a special test-drive incentive. Buick dealerships nationwide are randomly assigned to either the control group or the experimental group. Those in the experimental group use a promotion to encourage test drives. The control group does not use any such promotion. The number of test drives in each group is measured and compared to determine whether the promotion resulted in significantly more test drives.
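A minimal sketch of the two steps in this design, random assignment followed by a posttest comparison, is given below. The dealership identifiers, the test-drive counts, and the function names are hypothetical; the comparison is a simple difference of group means and does not include a significance test.

```python
import random
from statistics import mean

def randomly_assign(units, seed=None):
    """Shuffle the units and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def posttest_effect(experimental_scores, control_scores):
    """Return O1 - O2: mean posttest of the experimental group minus that of the control group."""
    return mean(experimental_scores) - mean(control_scores)

# Hypothetical dealership IDs (illustrative only).
dealers = [f"dealer_{i}" for i in range(1, 9)]
experimental, control = randomly_assign(dealers, seed=42)

# Hypothetical test-drive counts collected after the promotion period.
difference = posttest_effect([30, 34, 28, 32], [22, 25, 24, 21])
print(difference)  # 8
```

Because chance alone decides which dealership lands in which group, systematic differences between the groups are unlikely, which is why the pretest can be omitted.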

3. Field experiment
Experiment conducted in a natural setting (e.g. on a sports field during play). The conditions of field experiments are usually very difficult to replicate.
Types of Field experiments:
Nonequivalent Control Group Design
Separate Sample Pretest-Posttest Design
Group Time Series Design

3.1. Nonequivalent Control Group Design

Experimental Group: O1 X O2

Control Group: O3 O4

This is a strong and widely used quasi-experimental design. It differs from the pretest-posttest control group design because the test and control groups are not randomly assigned. There are two varieties: the intact equivalent design and the self-selected experimental group design.


In the intact equivalent design, the membership of the experimental and control groups is naturally assembled. The self-selected experimental group design is weaker because volunteers are recruited to form the experimental group, while non-volunteer participants are used for control. A comparison of the pretest results for each group is one indicator of the degree of equivalence between test and control groups.

Example for Nonequivalent Control Group Design

Children from two different classes in a school may be asked to test a toy. Participants are pre-tested on their interest in the toy. The experimental group spends time playing with the toy while the control group is not exposed to the toy. A post-test then measures interest in the toy.

3.2.Separate Sample Pretest-Posttest Design

Experimental Group: R O1 (X)

Control Group: R X O2


This design is most applicable when we cannot know when and to whom to introduce the treatment but we can decide when and whom to measure. The parenthesized treatment (X) means that the experimenter cannot control exposure to the treatment. This is not a strong design because several threats to internal validity are not handled adequately. History can confound the results.

Example for Separate Sample Pretest-Posttest Design

For example, a new advertising campaign for a prescription drug is introduced on television. Awareness of the brand name is measured prior to the campaign introduction. After the campaign ends, awareness is measured again.

3.3. Group Time Series Design


O1 O2 O3 X O4 O5 O6 O7 O8 O9 O10 O11 O12

A time series design introduces repeated observations before and after treatment and allows participants to act as their own controls. The single treatment group design has before-after measurements as the only controls. There is also a multiple design with two or more comparison groups as well as the repeated measurements in each treatment group.

This format is especially useful where regularly kept records are a natural part of the environment and are unlikely to be reactive. The time series approach is also a good way to study unplanned events in an ex post facto manner. The internal validity problem for this design is history. To reduce this risk, we keep a record of possible extraneous factors and attempt to adjust the result to reflect their influence. For example, if the government were to begin price controls, we could still study the effects of this action on gasoline prices later if we had regularly collected records for the period before and after the advent of price control.

Summary: types of experiments

Types of Pre-experiments:
After-Only Case Study
One Group Pretest-Posttest Design
Static Group Comparison

Types of True experiments:
Pretest-Posttest Control Group Design
Posttest-Only Control Group Design

Types of Field experiments:
Nonequivalent Control Group Design
Separate Sample Pretest-Posttest Design
Group Time Series Design

Chapter 13: Measurement and Scaling

1. MEASUREMENT
1.1. Conceptual Definition
1.2. Operational Definition/Operationalization
2. SCALES
2.1. Levels of Scale Measurement
2.2. Index or Composite Measures
3. CRITERIA FOR GOOD MEASUREMENT
3.1. Reliability
3.2. Validity
3.3. Sensitivity

1. MEASUREMENT

Measurement: the process of assigning numbers or scores to attributes of people or objects; more precisely, the process of describing some property of a phenomenon of interest by assigning numbers in a reliable and valid way.
Precise measurement requires:
a) Careful conceptual definition, i.e. careful definition of the concept (e.g. loyalty) to be measured
b) Operational definition of the concept
c) Assignment rules by which numbers or scores are assigned to different levels of the concept that an individual (or object) possesses

1.1. Conceptual Definition

Concept - A generalized idea about a class of objects, attributes, occurrences, or processes. Examples: gender, age, education, brand loyalty, satisfaction, market orientation.
Construct - A concept that is measured with multiple variables. Examples: brand loyalty, satisfaction, market orientation, socio-economic status.
Variable - Anything that varies or changes from one instance to another; can exhibit differences in value, usually in magnitude or strength, or in direction.

Concepts must be precisely defined for effective measurement. Consider the following definitions of brand loyalty:
1. The degree to which a consumer consistently purchases the same brand within a product class. (Peter & Olson)
2. A favorable attitude toward, and consistent purchases of, a particular brand. (Wilkie, p. 276)
The two definitions have different implications for measurement: they imply different operationalizations of the concept of brand loyalty.

1.2. Operational Definition/Operationalization

Operational definition - A definition that gives meaning to a concept by specifying what the researcher must do (i.e. activities or operations that should be performed) in order to measure the concept under investigation.
Operationalization - The process of identifying scales that correspond to variance in a concept.

Operationalization should be in line with the conceptual definition. For example:

Conceptual definition #1 for brand loyalty: the degree to which a consumer consistently purchases the same brand within a product class.
Operational definition #1 for brand loyalty: the percent of purchases going to brand A over a period of time.
Operationalization: in order to measure loyalty for brand A, you will need to:
a) Observe consumers' brand purchases over a period of time, and
b) Compute the percent of purchases going to brand A.

Conceptual definition #2 for brand loyalty: a favorable attitude toward, and consistent purchases of, a particular brand.
Operational definition #2 for brand loyalty: consumers' attitude toward brand A and the percent of purchases going to brand A over a period of time.
Operationalization: in order to measure loyalty for brand A, you will need to:
a) Observe consumers' brand purchases over a period of time,
b) Compute the percent of purchases going to brand A, and
c) Ask consumers questions to determine their attitudes toward brand A.
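The purchase-share part of these operational definitions can be sketched in a few lines. The function name and the purchase record below are hypothetical, and the attitude component of definition #2 would have to be measured separately with survey questions.

```python
def purchase_share(purchases, brand):
    """Return the percent of observed purchases that went to `brand`."""
    if not purchases:
        raise ValueError("no purchases observed")
    return 100 * purchases.count(brand) / len(purchases)

# Hypothetical record of one consumer's purchases in a product class over time.
observed = ["A", "B", "A", "A", "C", "A", "B", "A"]
share_a = purchase_share(observed, "A")
print(share_a)  # 62.5
```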

Conceptual Definition of Media Skepticism

Media skepticism - the degree to which individuals are skeptical toward the reality presented in the mass media. Media skepticism varies across individuals, from those who are mildly skeptical and accept most of what they see and hear in the media to those who completely discount and disbelieve the facts, values, and portrayal of reality in the media.

Operational Definition of Media Skepticism

Please tell me how true each statement is about the media. Is it very true, not very true, or not at all true?
1. The program was not very accurate in its portrayal of the problem.
2. Most of the story was staged for entertainment purposes.
3. The presentation was slanted and unfair.

2. SCALES

To effectively carry out any measurement, we need to use some form of scale. A scale is any series of items (numbers) arranged along a continuous spectrum of values for the purpose of quantification (i.e. for the purpose of placing objects based on how much of an attribute they possess). The thermometer, for instance, consists of numbers arranged in a continuous spectrum to indicate the magnitude of heat possessed by an object.

2.1. Levels of Scale Measurement

Numbers assigned in measurement can take on different levels of meaning depending on one of four mapping characteristics possessed by the numbers:
1. Classification
2. Order
3. Distance
4. Origin
The type of mapping characteristic assumed depends on the properties of the attribute (or construct) being measured.

The Four Characteristics of Mapping Rules

1. Classification: The numbers are used only to group or sort responses. No order exists.
2. Order: The numbers are ordered. One number is greater than, less than, or equal to another.
3. Distance: Differences between the numbers are ordered. The difference between any pair of numbers is greater than, less than, or equal to the difference between any other pair of numbers.
4. Origin: The number series has a unique origin indicated by the number zero.

Four levels of scale measurement result from this mapping:
1. Nominal Scale: a scale in which the numbers or letters assigned to an object serve only as labels for identification or classification, e.g. Gender (Male=1, Female=2).
2. Ordinal Scale: a scale that arranges objects or alternatives according to their magnitude in an ordered relationship, e.g. Academic status (Freshman=1, Sophomore=2, Junior=3, etc.).
3. Interval Scale: a scale that both arranges objects according to their magnitude and distinguishes this ordered arrangement in units of equal intervals, but does not have a natural zero representing absence of the given attribute, e.g. the temperature scale (40°C is not twice as hot as 20°C).
4. Ratio Scale: a scale that has absolute rather than relative quantities and an absolute (natural) zero where there is an absence of a given attribute, e.g. income, age.
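The temperature example can be checked numerically. The sketch below assumes the Kelvin scale as the ratio-level counterpart of Celsius, since Kelvin has a natural zero; the variable and function names are illustrative.

```python
def celsius_to_kelvin(celsius):
    """Convert an interval-scale Celsius reading to the ratio-scale Kelvin reading."""
    return celsius + 273.15

# On the interval scale the ratio of the numbers is 2, but it has no physical meaning.
ratio_celsius = 40 / 20

# On the ratio scale the same two temperatures differ by only about 7 percent.
ratio_kelvin = celsius_to_kelvin(40) / celsius_to_kelvin(20)

# Differences, unlike ratios, are preserved across the two scales.
diff_celsius = 40 - 20
diff_kelvin = celsius_to_kelvin(40) - celsius_to_kelvin(20)

print(ratio_celsius, round(ratio_kelvin, 3), diff_celsius, round(diff_kelvin, 2))
```

The ratio changes when the zero point moves while the difference does not, which is why interval scales support statements about distances but not statements such as "twice as hot".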

Nominal, Ordinal, Interval, and Ratio Scales Provide Different Information

Type of Scale: Nominal
Data Characteristics: Classification but no order, distance, or origin
Numerical Operation: Counting
Descriptive Statistics: Frequency in each category, Percent in each category, Mode
Examples: Gender (1=Male, 2=Female)

Type of Scale: Ordinal
Data Characteristics: Classification and order but no distance or unique origin
Numerical Operation: Rank ordering
Descriptive Statistics: Median, Range, Percentile ranking
Examples: Academic status (1=Freshman, 2=Sophomore, 3=Junior, 4=Senior)

Type of Scale: Interval
Data Characteristics: Classification, order, and distance but no unique origin
Numerical Operation: Arithmetic operations that preserve order and magnitude
Descriptive Statistics: Mean, Standard deviation, Variance
Examples: Temperature in degrees, Satisfaction on semantic differential scale

Type of Scale: Ratio
Data Characteristics: Classification, order, distance, and unique origin
Numerical Operation: Arithmetic operations on actual quantities
Descriptive Statistics: Geometric mean, Coefficient of variation
Examples: Age in years, Income in dollars

2.2. Index or Composite Measures

Both index and composite measures use combinations (or collection) of several variables to measure a single construct (or concept); they are multi-item measures of constructs.

However, for index measures, the variables need not be strongly correlated with each other, whilst for composite measures, the variables are typically strongly correlated, as they are all assumed to be measuring the construct in the same way.

Index Measures
Example 1: Index Measure
Construct: Social class
Measures: Linear combination (index) of occupation, education, income.
Social class = β1 Education + β2 Occupation + β3 Family Background

Composite Measures
Example 2: Composite Measure
Construct: Attitude Toward Brand A
Measures: Extent of agreement/disagreement with multiple statements:
a) I like Brand A very much
b) Brand A is the best in the market
c) I always buy Brand A
Statements a), b), and c) constitute a scale to measure attitudes toward Brand A.

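Computing Scale Values for Composite Scales

Summated Scale
A scale created by simply summing (adding together) the responses to each item making up the composite measure.

Reverse Coding
Means that the value assigned for a response is treated oppositely from the other items. This applies to negatively worded items: on a 5-point scale, for example, a response of 1 is recoded as 5 and a response of 2 as 4 before the items are summed.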
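A summated scale with one reverse-coded item can be sketched as follows. The item names, the 1-5 response format, and the responses themselves are hypothetical; the negatively worded item "never_buy" stands in for any statement whose agreement signals a less favorable attitude.

```python
def reverse_code(value, scale_min=1, scale_max=5):
    """Flip a response so that, on a 1-5 scale, 1 becomes 5, 2 becomes 4, and so on."""
    return scale_max + scale_min - value

def summated_score(responses, reverse_items=(), scale_min=1, scale_max=5):
    """Sum the item responses after reverse coding the negatively worded items."""
    total = 0
    for item, value in responses.items():
        if item in reverse_items:
            value = reverse_code(value, scale_min, scale_max)
        total += value
    return total

# Hypothetical responses of one respondent (1 = strongly disagree, 5 = strongly agree).
responses = {"like_brand": 4, "best_in_market": 5, "never_buy": 2}
score = summated_score(responses, reverse_items={"never_buy"})
print(score)  # 13
```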


3. CRITERIA FOR GOOD MEASUREMENT

Three criteria are commonly used to assess the quality of measurement scales:
1. Reliability
2. Validity
3. Sensitivity


3.1. Reliability

Reliability: the degree to which a measure is free from random error and therefore gives consistent results. It is an indicator of the measure's internal consistency.

Stability: Test-Retest
Internal Consistency: Splitting halves, Equivalent forms

3.1.1. Stability (Repeatability)

Stability: the extent to which results obtained with the measure can be reproduced.
1. Test-Retest Method
Administering the same scale or measure to the same respondents at two separate points in time to test for stability.

2. Test-Retest Reliability Problems
The pre-measure, or first measure, may sensitize the respondents and subsequently influence the results of the second measure. Time effects may also produce changes in attitude or other maturation of the subjects.

3.1.2. Internal Consistency

Internal Consistency: the degree of homogeneity among the items in a scale or measure.
1. Split-half Method
Assessing internal consistency by checking the results of one half of a set of scaled items against the results from the other half.
Coefficient alpha (α): the most commonly applied estimate of a multiple-item scale's reliability. It represents the average of all possible split-half reliabilities for a construct.

2. Equivalent forms
Assessing internal consistency by using two scales designed to be as equivalent as possible.
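Coefficient alpha can be computed directly from item scores. The sketch below assumes the usual formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items; the function name and the response data are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """Compute coefficient alpha.

    `items` holds one list of scores per item; position j in every list
    belongs to respondent j.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    totals = [sum(scores) for scores in zip(*items)]      # summated score per respondent
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))

# Hypothetical responses of five respondents to three 5-point items.
item_scores = [
    [4, 5, 3, 2, 4],
    [4, 4, 3, 2, 5],
    [5, 5, 2, 3, 4],
]
alpha = cronbach_alpha(item_scores)
print(round(alpha, 3))  # 0.886
```

Alpha approaches 1 as the items move together across respondents, which is what homogeneity among the items means in practice.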

3.2. Validity

Validity: the accuracy of a measure, or the extent to which a score truthfully represents a concept; the ability of a measure (scale) to measure what it is intended to measure.
Establishing validity involves answers to the following:
1. Is there a consensus that the scale measures what it is supposed to measure?
2. Does the measure correlate with other measures of the same concept?
3. Does the behavior expected from the measure predict actual observed behavior?


Types of validity: Face or Content Validity, Criterion Validity, Construct Validity



1. Face or content validity: The subjective agreement among professionals that a scale logically appears to measure what it is intended to measure.

2. Criterion Validity: the degree of correlation of a measure with other standard measures of the same construct. Concurrent Validity: the new measure/scale is taken at the same time as the criterion measure. Predictive Validity: the new measure is able to predict a future event or measure (the criterion measure).

3. Construct Validity: the degree to which a measure/scale confirms a network of related hypotheses generated from theory based on the concepts. It includes Convergent Validity and Discriminant Validity.
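Criterion validity is usually judged by a correlation coefficient. The sketch below computes Pearson's r between a new scale and a criterion measure taken at the same time (concurrent validity). The choice of Pearson's r, the function name, and the scores are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Return the Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x, mean_y = mean(x), mean(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)

# Hypothetical scores of five respondents on the new scale and on an established criterion.
new_scale = [12, 15, 9, 18, 11]
criterion = [30, 34, 25, 40, 29]
r = pearson_r(new_scale, criterion)
print(round(r, 3))  # 0.995
```

A high positive correlation supports criterion validity; for predictive validity the criterion scores would be collected at a later point in time.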

Relationship Between Reliability & Validity

1. A measure that is not reliable cannot be valid, i.e. for a measure to be valid, it must be reliable. Thus, reliability is a necessary condition for validity.
2. A measure that is reliable is not necessarily valid; indeed, a measure can be reliable but not valid. Thus, reliability is not a sufficient condition for validity.
3. Therefore, reliability is a necessary but not sufficient condition for validity.

Reliability and Validity on Target

3.3. Sensitivity

Sensitivity: the ability of a measure/scale to accurately measure variability in stimuli or responses; that is, the ability to make fine distinctions among respondents or objects with different levels of the attribute (construct). Example - A typical bathroom scale is not sensitive enough to be used to measure the weight of jewelry; it cannot make fine distinctions among objects with very small weights.

Because composite measures allow for a greater range of possible scores, they are more sensitive than single-item scales. Sensitivity is generally increased by adding more response points or adding scale items.

Main Textbook: William G. Zikmund's Business Research Methods
Other Textbook: Donald R. Cooper and Pamela S. Schindler's Business Research Methods
Lecture Notes: Dr. Alhassan G. Abdul-Muhmin

