Business Research Techniques (BRT) - Tilburg University
Research: the process of finding solutions to a problem after a thorough study and analysis of the
situational factors.
- A series of well thought out activities and carefully executed data analyses that help a manager to
avoid, solve or minimize a problem;
- A systematic and organized effort to investigate a specific problem encountered in the work
setting.
Managers who have knowledge about research can more easily communicate with researchers about the
expectations of both parties and can better foresee what information researchers might require.
Types of research:
- Management: The study of employee attitude and behaviors, strategy formulation, information
systems etc.
- Marketing: Research into consumer decision making, customer satisfaction and loyalty, and market
segmentation.
Types of business research: Applied vs. Fundamental research
1. Applied research: Specific research with the purpose of solving a current problem faced by the
manager in the work setting. For example, a particular product is not selling well and the manager
wants to know why.
- Applied research applies to a specific company;
- Mostly done within firms.
2. Fundamental, basic or pure research: Research that focuses on generating a body of knowledge
by trying to comprehend how certain problems that occur in organizations can be solved.
- Basis for applied research;
- Applies to several organizational settings;
- Used to build theories on the research results;
- (Basic research is driven purely by curiosity and a desire to expand knowledge).
Scientific or business research: Focuses on solving problems and pursues a step-by-step logical, organized
and rigorous method to identify the problems, gather data, analyse them and draw valid conclusions
from them. Note: Scientific research is not based on hunches, experience and intuition. Scientific research
includes both applied and basic research.
Examples of applying the hallmarks of scientific research | What’s wrong in terms of the hallmarks
Samsung wants to find out if people are satisfied with their Samsung phones. They decide to do a survey,
which they email to everyone who has subscribed to the newsletter. Answer: Not generalizable, because
people who are unsatisfied might have unsubscribed from the newsletter and may not be willing to answer.
Samsung understood that they should also ask people who were not subscribed to their newsletter. They
decided to invite a hundred randomly selected people, who were promised a compensation of 20 dollars for an
interview. During the interview, they were handed a Samsung phone and asked questions about it.
Answer: This is generalizable, but not rigorous. The methodological design doesn’t seem to be right,
because (1) people are paid to do the interview and (2) people are interviewed by someone who might
expect a certain response from them. Both of these can cause bias.
Deductive vs. Inductive research
The seven-step research process for deductive research | Theory to data (test if the theory holds)
The seven-step research process for inductive research | Data to theory (develop theory)
STEP 1: Define the business problem
A ‘problem’ does not necessarily mean that something is seriously wrong with a current situation that
needs to be rectified immediately. It is any situation where a gap exists between an actual and desired
ideal state. Two types of (business) problems:
- Actual state: The actual situation is seriously wrong and needs to be improved asap.
- Desired state: The actual situation is not seriously wrong but can be improved.
Primary or secondary data (such as interviews or archival research) are used to transform the broad
management problem into a research problem. Although the exact nature of the information needed for
this purpose depends on the type of problem one is addressing, it may be broadly classified under two
headings:
- Background information: on the organization and its environment (contextual factors). Like, the
origin and history of the company, size in terms of employees, location, resources etc.
- Information on the topic of interest: the body of knowledge available to you as researcher may also
help you to think about and/or better understand the problem. A careful review of textbooks,
journals, conference proceedings and other published materials (Chapter 4) ensures that you have
a thorough awareness and understanding of current work and viewpoints on the subject area.
The data for preliminary research can come from both primary and secondary sources:
- Primary data: Refers to information that the researchers gather first hand through instruments such
as surveys, interviews, focus groups or observation(s).
- Secondary data: Data that already exist (although collected for another purpose than the current study) and
do not have to be collected by the researcher. Note: it is often beneficial to gather primary and
secondary data at the same time.
What makes a GOOD business problem?
Step 2: FORMULATING A PROBLEM STATEMENT & RESEARCH QUESTIONS
A good problem statement includes both a statement of the research objective(s) and the research
question(s). The objective of the study explains why the study is being done. The research questions
clarify the issue to be resolved. They specify what you want to learn about the topic. Criteria for good
quality of the problem statement:
Step 3: Develop a Theoretical framework:
A theoretical framework is the foundation of the hypothetico-deductive research process. The theoretical
framework represents your beliefs on how certain phenomena (variables or concepts) are
related to each other (a model) or an explanation of why you believe that these variables are related to
each other (a theory). The process of building a theoretical framework includes three steps:
Variable(s): A variable is anything that can take on differing or varying values. The values can differ at various times
for the same object or person, or at the same time for different objects or persons. Variables are based on
careful literature review. Avoid jargon; use more simplified and common terms. For example: motivation,
absenteeism and production are variables. A variable can be discrete (e.g., male or female) or continuous (the age of an
individual).
Choose the right definition based on the literature review. When many different definitions exist:
- Select one and explain why this is the best definition; or
- Create a new definition, based on existing definitions, which fits your research best.
Explaining the problem statement by visualizing the different variables and relationships and effects they
have on each other. There are four types of variables:
- Independent variable (IV): A variable that influences the dependent variable in either a positive or
a negative way. Also known as the predictor variable or X.
- Dependent variable (DV): The variable that is of primary interest to the researcher. Also known as the
criterion variable or Y. It is possible to have more than one dependent variable in a study.
For example, salary level (IV) influences job satisfaction (DV).
To establish that a change in the independent variable causes a change in the dependent variable, all four
conditions should be met:
1. A change in the dependent variable should be associated with a change in the IV;
2. The cause must occur before the effect (you cannot have a hangover without drinking beer);
3. No other factor should be a possible cause of the change in the DV;
4. A logical explanation is needed and must explain why the IV affects the DV.
Mediator (MED): A variable that explains the mechanism at work between X and Y
(independent/dependent), also known as the intervening variable. It surfaces between the time the IV starts
operating to influence the DV and the time its impact on the DV is felt. Two types of mediating variables:
Moderator (MOD): A variable that alters the strength and sometimes even the direction (positive or
negative) of the relationship between X and Y. Example is hours of driving lessons (X), gender (MOD)
and parking skills (Y). Two types of moderating variables:
- Pure-moderation: Only influences the relationship between X and Y, but no direct effect on Y;
- Quasi-moderation: Moderates the relationship between X (IV) and Y (DV), but also has a direct
effect on the DV (rare).
Conditional process model: When both a mediator and a moderator are present.
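A moderator such as the one above (gender moderating the lessons–parking-skills link) can be spotted by comparing regression slopes across moderator groups: if the slope of Y on X differs per group, the moderator alters the strength of the relationship. A minimal pure-Python sketch with invented numbers:

```python
def ols_slope(x, y):
    """Simple OLS regression slope of y on x: cov(x, y) / var(x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den

# Hypothetical data: X = hours of driving lessons, Y = parking skill,
# split by the moderator into two groups with different slopes.
hours = [1, 2, 3, 4]
skill_group_a = [2, 4, 6, 8]          # slope 2.0: lessons help a lot here
skill_group_b = [0.5, 1.0, 1.5, 2.0]  # slope 0.5: lessons help far less here

print(ols_slope(hours, skill_group_a))  # 2.0
print(ols_slope(hours, skill_group_b))  # 0.5
```

Differing slopes across groups is exactly what a significant X × MOD interaction term captures in a moderated regression.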
3. Building hypotheses
The relationships between variables are explained by formulating hypotheses. A hypothesis is a
tentative, yet testable statement about the coherence between two or more
variables, which you hope to find in your empirical data (primary data).
Step 4: Choose a Research design
Research design: is a blueprint for the collection, measurement, analysis of data, based on the research
questions of the study. Studies may be either exploratory, descriptive or causal in nature. The nature of
the study depends on the stage to which knowledge about the research topic has advanced. Research
strategies will help you to meet your research objective(s) and to answer the research questions. The
choice for a particular research strategy depends on the research objective and the type of questions.
Variables that have a causal relationship will show a significant correlation. A correlation shows a linear
positive or negative relationship between variables X and Y, but correlation does not always imply causality
(a logical explanation for the effect of X on Y is needed). Watch out for: (1) expert bias: 'an authority says so' is not
an argument; and (2) confirmation bias: only citing articles that confirm your point of view.
Causal research
Causal research starts from an assumed relationship. Causality means that a dependent variable Y is
caused by independent variable(s) X. In a causal study, the researcher is interested in delineating one or
more factors that are causing the problem. This research is conducted to establish cause-and-effect
relationships, where the researcher tries to manipulate certain variables so as to study the effects of such
manipulation on the DV of interest.
- Lab experiments: Take place in an artificial environment (not in a natural setting), also called
contrived or artificial settings. Independent variables are manipulated and the effect on the
dependent variable is measured. Cause-and-effect relationships are explored with a high degree of
control.
- Field experiments: An experiment carried out in the natural environment in which life goes on as
usual (non-contrived). Manipulation is still possible. There are two options for field experiments:
o Field studies: Where various factors are examined in the natural setting in which daily
activities go on as normal with minimal researcher interference. There is no manipulation.
(note! field studies are used within correlational, descriptive research).
o Field experiments: Where cause-and-effect relationships are studied with some amount of
researcher interference, but still in the natural setting where events continue normally.
Example: When French music was played in the store, people who were in the wine section were
more likely to buy French wine rather than German wine. There is a manipulation of an
independent variable (type of music) and the dependent variable (type of purchased wine).
Correlational, descriptive research
The focus of correlational research is to look for a certain relationship; it describes the relationship between
variables. Descriptive studies are often designed to collect data that describe the characteristics of
persons, events or situations. This study is either quantitative or qualitative in nature. Finding a
correlation does not mean that one variable causes a change in another variable. It is conducted in a
natural environment with minimal interference by the researcher with the normal flow of events.
Use group labels such as 'work-related factors' (satisfaction with current job, attitude towards current job); all
these variables can then be combined into one variable. Correlation can indicate causation in non-experimental settings as
well if proper methods are used.
The correlation coefficient (ρ) is a measure that determines the degree to which the movement of two
different variables is associated. The most common correlation coefficient, generated by the Pearson
product-moment correlation, may be used to measure the linear relationship between two
variables. However, in a non-linear relationship, this correlation coefficient may not always be a suitable
measure of dependence.
The possible range of values for the correlation coefficient is -1.0 to 1.0. In other words, the values cannot
exceed 1.0 or be less than -1.0, and a correlation of -1.0 indicates a perfect negative correlation, and a
correlation of 1.0 indicates a perfect positive correlation. Anytime the correlation coefficient is greater
than zero, it's a positive relationship. Conversely, anytime the value is less than zero, it's a negative
relationship. A value of zero indicates that there is no relationship between the two variables.
2. Choosing between Statistical techniques
A scale is a tool or mechanism by which individuals are distinguished as to how they differ from one
another on the variables of interest to study. There are four different types of scales. The first is the least
powerful and the fourth the most powerful. Always use the most powerful measurement scale possible.
The four types of scales:
1. Nominal scale: Meant to categorize a group of data; just labels for categories, with no further meaning (e.g., gender or nationality).
2. Ordinal scale: Also meant to categorize the data and rank-orders the preferences of categories in a
meaningful way. The distances between rank-orders do not have to be the same everywhere. For
example: scale from 1 = hot, 2 = hotter, 3 = hottest or the position in a race.
3. Interval: Numerically equal distances on the scale represent equal values in the characteristics being
measured. Allows us to compare differences between objects: the distance between measurements
means something, but there is no absolute zero point. For example: IQ scores. You can calculate the
arithmetic mean, range, standard deviation and the variance. Temperature is also measured on an interval scale.
4. Ratio: Has an absolute zero point. Not only measures the magnitude of the differences between the
points on the scale, but also taps the proportions in the differences. Meaningful differences and ratios
between values: you can say that something is twice as much as something else. Examples: money and
distance. The most powerful scale; you can also calculate the arithmetic mean, range, standard
deviation and the variance.
1. Likert scale (interval): Used to examine how strongly subjects agree or disagree with statements on a
five-point scale (1= strongly disagree until 5 = strongly agree). Commonly used to measure opinions
and attitudes. Also known as a summated scale.
2. Semantic differential (interval): Several bipolar attributes are identified at the extremes of the
scale, and respondents are asked to indicate their attitudes. You have two opposites, and people
choose a position between them. See picture for examples.
Goodness of measures
It is important to make sure that the instrument that we develop to measure a particular concept is
indeed accurately measuring the variable.
1. Item analysis: Item analysis is carried out to see if the items in the instrument belong there or not. In
item analysis, the means of a high-score group and a low-score group are compared to detect
significant differences through the t-values.
2. Validity: Does an instrument measure what it was intended to measure? The extent to which
observations accurately record the behavior in which you are interested. Internal validity is the
authenticity of the cause-and-effect relationships. External validity is the generalizability to the
external environment.
o Interviewer biases: Loaded questions, expressing one’s own opinion and judging.
Selective perception: hearing what you want to hear and observing what you want to
observe;
o Interviewee biases: Obedience: desire to please the interviewer; conformity: do/think
what the majority does/thinks (=normative social influence).
3. Reliability: Is the data accurate (free from measurement error) and consistent (from one occasion to
another) across time and across the various items in the instrument. In other words, the stability and
consistency of the measures contributes to the goodness of measures.
3. Choosing between sampling designs
The process of selecting the right individuals, objects or events as representatives for the entire
population is known as sampling. We use sampling because it is sometimes impossible to study the entire
population, or because in terms of costs and speed it is better not to study the entire population. When
measuring an element destroys it (so the entire population cannot be studied), this is called destructive sampling.
Sampling process
Sampling frame: The physical representation of all the elements in the population from which the sample is drawn. For
example: if the target population consists of TiSEM students 2017/2018, then the sampling frame is the
database of students from TiSEM. Sometimes errors occur, called coverage errors. This means that
the sampling frame is not equal to the population.
Solutions: If small, recognize but ignore. If large, redefine the population in terms of sampling frame.
Probability sampling
Probability sampling is when elements in the population have a known, non-zero chance of being selected as
sample subjects. The results are generalizable to the population, but it costs more time and is resource
intensive. Four types of probability sampling (use probability sampling if you have a good sampling frame;
if not, choose non-probability sampling):
Non-probability sampling
With non-probability sampling, the elements do not have a known predetermined chance of being
selected as
subjects. Not everything or everyone has an equal chance of being selected. If you don’t have a sampling
frame, you have non-probability sampling. Results are non-
generalizable, but it is less costly and time consuming as probability
sampling. Four types of non-probability sampling:
Example: 'Do you know people who…' This is used for rare characteristics (experts); note that the first
participants strongly influence the sample. Not easily generalizable.
4. Determine the appropriate sample size:
There are no exact rules for how large a sample size must be. Factors affecting decisions on sample size:
RULE OF THUMB
EXAMPLE 3:
So if the number of parameters is asked, you count the arrows + 1 (the constant).
Survey research| Quantitative data
Research based on a survey/questionnaire to which respondents record their answers, typically with
closely defined alternatives. A survey is a preformulated set of questions to which respondents record
their answers. It is a research strategy for correlational research. That means that it is quantitative and is
used within the deductive research process.
- When you are interested in quantitative descriptors (numbers rather than text);
- When you want to say something about a population, but you cannot measure the whole
population (happens very often);
- When you are interested in the ‘perception’ of customers;
- Quantitative research often uses a big sample size (N). Generalizability is very important: the
purpose can be that the results represent the population, because important decisions have to be
made based on them.
How can you measure all of your concepts and/or variables in your
conceptual model?
- Reduction of abstract concepts to render them measurable in a tangible way.
Categories of questions:
2. Single-item vs. multi-item measures
a. Single-item: When concrete singular object/attribute: What is your marital status? How long
have you been working for this company? NOT: How diverse is your company’s workforce?
Or how effective is your organization?
b. Multi-item: When you have an abstract construct. Use 'off-the-shelf' scales or develop your
own scale. Attitudes, perceptions and feelings are measured with multiple items. For example, the
overall perceived service quality of a bank, measured by 'the bank is expert', 'the bank adapts to
my needs' and 'the bank is accessible'.
- Avoid double-barreled questions: Two questions in one question, which results in vague answers.
F.e. How is the taste and appearance of your pancakes?
- Avoid ambiguous, vague questions
- Avoid leading questions: f.e. a true American’s favorite colors are red, white and blue. Are these
your favorite colors?
- Avoid loaded questions: ‘Do you still beat your wife?’ If you say yes, you still beat her and if you say
no, you say you have beaten her in the past.
- Avoid social desirability: Elicit socially desirable responses ‘Do you think that older people should be
laid off?’.
- Avoid double negatives: ‘Do you oppose not allowing the board to pass article 10 of the ballot?’.
What does not oppose to not allowing the board to pass article 10 really mean? Confusing: unclear
what people mean when answering yes/no.
- Recall-dependent questions: A question might require respondents to recall experiences from the
past which they may not remember accurately. F.e. when exactly did you start smoking? This can lead to bias.
Decide on response categories | Item response scales
Response scales are divided into two categories:
- Rating scales (non-comparative scales): Each object is scaled independently of the other objects in
the study;
- Ranking scales (comparative scales): Make comparisons among objects and list the preferred choices
among those objects.
1. Comparative scales (Ranking): With ranking you compare one or more subjects to one another.
Ranking is ordinal in nature. There are several types of ranking scales:
a. Paired comparison: Used when there is a small number of objects; respondents are asked to
choose between two objects at a time;
b. Forced choice: Enables respondents to rank objects relative to one another, among the
alternatives provided. For example, rank the following magazines that you would like to
subscribe to in order of preference.
c. Comparative scale: Provides a benchmark or a point of reference to assess attitudes
towards the current object, event or situation under study.
2. Non-comparative scales (Rating): Through rating scales each subject gets its own separate score.
Several types of rating scales:
a. Continuous rating scales: How would you rate Bijenkorf as a department store? 0 to 100;
b. Semantic differentials: Good – bad, powerful – weak, modern – old-fashioned;
c. Likert: Disagree or Agree on a scale of 1 – 5 or 1 – 7;
2. Decide on survey mode| How the data are collected
1. Personally administered: Interviewers asking questions and recording responses. Can be used
when the survey is confined to a local area. Advantage is that information is gathered within a short
period of time, and the researcher can give extra explanation and clarify ambiguities. A disadvantage
is that respondents may feel that they cannot freely answer because it is not anonymous.
Two types:
o Personal, face-to-face questionnaires;
o Telephone questionnaires;
2. Self-administered: Respondents reading and recording their own answers. A big advantage is the
anonymity and the ease to reach a lot of people irrespective of their geographical location. A
disadvantage is the low response rate. Two types:
o Mail questionnaires (sent to respondents through mail);
o Electronic, online questionnaires (quick distribution of the survey, most commonly used).
Take the following aspects into account when choosing a survey mode:
Mixed-mode designs: To trade off cost and errors, mixed-mode designs can be used. E.g., using web
surveys + mail surveys to senior citizens. The web is cheaper, but via mail there is better coverage.
Step 3 | Appearance of the questionnaire
Response rate:
(Emphatically mentioned that the response rate is important for the exam)
The response rate is generally low:
- Internal surveys: 30–40%
- External surveys: 10–15%
- Up to 85% when the respondent population is motivated, the survey is well executed, and you are very
lucky (reached in panels);
- <2% when the respondent population is less targeted, contact information is unreliable, or there is little
incentive and motivation to respond.
Validity and reliability | Survey research
Validity:
- Everybody-does-it: ‘Even the most truthful people may sometimes not declare all income for taxes.
Has this happened to you?’
- Assume-the-behavior: 'How often have you overeaten in the past week?';
- Authorities-recommend-it: ‘Doctors generally acknowledge that drinking wine in moderation is
beneficial. Did you drink wine yesterday?’;
- Reasons-for-doing-it: ‘Did things happen, so that you could not go to the dentist for the regular
check-up, or did you go?’.
Reliability
• For multi-item measures: Cronbach’s alpha. For example, to measure customer satisfaction;
• Cronbach’s alpha measures to what extent a set of items are inter-related;
• High inter-relatedness = high reliability;
• Cronbach’s alpha is between 0 and 1, values > 0.7 are considered acceptable.
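Cronbach's alpha can be computed from item and total-score variances as α = k/(k−1) · (1 − Σ var(item) / var(total)). A minimal pure-Python sketch with made-up item scores:

```python
from statistics import variance

def cronbach_alpha(items):
    """items: k lists (one per item), each with one score per respondent.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # summated scale score
    item_var_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))

# Three perfectly inter-related items -> maximal reliability.
items = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
print(round(cronbach_alpha(items), 6))  # 1.0
```

With less inter-related items, the item variances grow relative to the total-score variance and alpha drops; values above 0.7 are considered acceptable, as the notes state.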
Experimental research
Example:
- 4 stores are chosen as test units; 3 of these stores
will use a POP display (POP1, POP2, POP3); 1 store
will not use a POP display (control group);
- All stores are AH stores, with the same yearly
turnover;
- Sales of Coca-Cola are measured 5 days before and 5 days after placement of the POP displays.
POP displays (X) vs. Sales (Y)
X (IV): 4 levels (manipulated by 'type'): three different displays + one store with no POP display;
Y (DV): Sales difference (sales after minus sales before);
Possible Z: yearly turnover of the store (controlled for: 'All stores are AH stores, with the same yearly turnover').
Validity | Experimental research
Validity issues
- Experimental studies have two main objectives, namely:
o To draw valid conclusions about the effects of IV’s on DV (requires internal validity);
o To make valid generalizations towards a broader group (requires external validity).
- Without internal validity, no external validity;
- Confound: A variable (Z) that threatens internal validity; to prevent confounds, control for extraneous variables. Threats to internal validity:
1. History effect: Events or factors outside the experiment have an impact on the DV during the experiment;
F.e. road constructions towards one of the AH test stores (Coca-Cola);
2. Maturation effect: Biological or psychological changes over time; the passage of time can have an effect on
the dependent variable (e.g., getting hungry). F.e. an R&D director wants to test whether workers will work more efficiently
with the help of technology. After 3 months there is in fact an increase in efficiency, but this is also
caused by the experience workers gained with working with the technology;
3. Testing effect: Prior testing affects the DV; people start behaving differently after you pre-measured
your DV. Pretesting = first a measure of Y is taken (pretest), then the treatment is given, and after that
a second measure of Y is taken (posttest). There are two threats of the testing effect:
a. Main testing effect: The prior observation (pretest) affects the later observation (posttest).
This effect occurs because participants want to be consistent and therefore want to try to
answer the same in the posttest as in the pretest. Therefore, no significant effect on the
dependent variables can be found;
b. Interactive testing effects: The pretest affects the participants’ reaction to the treatment (IV).
Discussed below in ‘Threats to external validity’.
4. Instrumentation effect: The observed effect is due to a change in measurement between the pretest
and posttest: the results cannot really be compared because the frame of
measurement is different. Can happen when the pretest and posttest are led by different people
(or when observers get tired toward the end);
5. Selection bias effect: Incorrect selection of respondents (experimental and/or control group):
a. Threat to internal validity: improper or unmatched selection of subjects for the experimental
and control groups. Such bias in the selection might contaminate the cause-and-effect
relationships. Randomization or matching groups is recommended to avoid this threat;
b. Threat to external validity: Discussed below in ‘Threats to external validity’.
6. Mortality effect: Drop-out of respondents during the experiment. The group composition then
changes over time and it is difficult to draw an unbiased conclusion from the gathered data. Try to
prevent this and find out why respondents drop out.
1. Randomization: Every participant has an equal chance of being assigned to each condition. The process of controlling the
nuisance variables by randomly assigning members among the various experimental and control
groups, so that the confounding variables are randomly distributed across all groups. Random
allocation of participants to different conditions (decreases selection bias, but also
instrumentation, history and mortality effects). Matching might be less effective than randomization.
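Random allocation as described above can be sketched in a few lines of Python (the participant IDs and condition names are made up for illustration):

```python
import random

def randomize(participants, conditions, seed=None):
    """Shuffle participants, then deal them round-robin into the conditions,
    so nuisance variables are spread randomly across all groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {c: [] for c in conditions}
    for i, p in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(p)
    return groups

# 20 hypothetical participant IDs split over a treatment and a control group.
groups = randomize(range(1, 21), ["treatment", "control"], seed=42)
print({name: len(members) for name, members in groups.items()})
# {'treatment': 10, 'control': 10}
```

The round-robin deal keeps the groups equal in size; which individuals land in which group is determined purely by the shuffle, which is what distributes confounding variables randomly across conditions.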
3. Statistical control: Measure extraneous variables, and include these in the statistical analysis
(covariance analysis) (to decrease history and selection bias effect).
Experimental designs:
- True experimental design: Includes both the treatment and control groups and record information
both before and after the experimental group is exposed to the treatment.
- Quasi experimental design: Expose an experimental group to a treatment and measure its effects.
This is the weakest design because there is no comparison between groups nor a situation without
the treatment. Therefore, the true cause-and-effect relationship is hard to determine.
Solomon four group design: Almost never used, because of its complexity. Strongest design.
The participants in the study are randomly assigned to four different conditions, namely:
1. Manipulation, intervention with pre-test and post-test: What you would see in a typical pre-test,
post-test design with a control group;
2. Pre-test and post-test without manipulation or intervention: What you would see in a typical pre-
test, post-test design with a control group;
3. Manipulation, intervention with post-test: Replicates conditions 1 and 2, except that no pre-test is
included;
4. Post-test without manipulation, intervention: Replicates conditions 1 and 2, except that no pre-test
is included.
Analyzing your full factorial experiment: If the IVs are categorical and the DV is continuous, as is often the
case in experimental designs:
Wrap-up:
Archival Research
Data that already exist, collected by someone else to answer another research question.
Archival data are thus secondary data. Note that archival data were at some point primary data and can
therefore contain, for example, survey data.
Archival-based research:
Research that uses archival data, what is already out there (rather than generating new primary data).
Unit of analysis: The level of data aggregation/analysis at which you are going to run your analysis. From
high to low, the levels are country, industry, firm, manager(s), product, etc. The IVs should be at the same
or a higher level than the DV; the level of the IVs can never be lower than the level of the DV.
- Tap into industry wisdom: Learn from past successes and failures in the industry when you
cannot rely on your own experiences. F.e. the effect of buying group entry on firm profitability;
- Power: High likelihood of rejecting H0 when H0 is false (a correct decision) = low likelihood of
missing a real effect. F.e. the effect of innovation on firm value and risk;
- Examining effects across time: Examine whether a phenomenon changes over time, or examine
the duration of an effect. F.e. effect of price and advertising on sales during business cycle
contractions and expansions;
- Examining effects across countries: Primary international research is expensive and
cumbersome (surveys cost a lot of time and are difficult). F.e. the effect of retailers' entry decisions
in Eastern Europe on performance;
- Examining socially sensitive phenomena: Archival data is unobtrusive (onopvallend, niet
opdringerig); good way to gather true and genuine data. There is a big difference between what
people say and what people do. Minimize the opportunity of distorted responses.
35
Unit of analysis
Variables can be time-varying (different values per year) or time-invariant (the same value for all years).
Look where the variation is: do the values differ across years? Are there different countries? Then use
country-year as the unit! Choose the level at which the data actually varies, and always look at the DV!
36
Reliability | Archival Research
Multi-item vs. single-item measures: if you want to combine multiple archival indicators, you have to
standardize them first.
- Missing observations | problem: some data points may be missing (e.g., for some customers we
have no income information). Solutions for cross-sectional data:
1. Mean substitution: plug in the mean of all other observations for the missing value.
Why the mean? A regression analysis is hardly influenced when you substitute means.
2. Listwise deletion: drop all observations with missing values (pretty extreme, since it leaves
fewer observations to analyze). When to use it? When a respondent answered only 3 out of
100 questions; otherwise use substitution.
3. Setting to zero: sometimes no information means zero (e.g., the customer has no credit
card). In that case do not fill in the mean, but fill in 0.
Solution for longitudinal data: interpolate. If some years of a time series are missing (say, only
2002, 2004, 2007 and 2008 are available), estimate the 2003 value as the average of the 2002
and 2004 values.
- Inaccurately recorded observations | problem: data errors produce extreme observations (make a
plot to spot them). Solutions:
1. Trim/truncate: remove the most extreme observations (only for very large data sets);
2. Winsorize: suppose the sample is small, so you do not want to throw out 1% of it. Instead,
replace each extreme observation with the next, more realistic value in line.
- Fake observations: be critical and always check who collected the data, when, where, and for
what purpose.
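The cleaning steps above can be sketched in a few lines of Python, using hypothetical data: mean substitution and interpolation for missing values, trimming and winsorizing for extreme values.

```python
from statistics import mean

def mean_substitute(values):
    """Cross-sectional fix: replace missing entries (None) with the mean
    of the observed values."""
    m = mean(v for v in values if v is not None)
    return [m if v is None else v for v in values]

def interpolate(series):
    """Longitudinal fix: fill internal gaps by linear interpolation
    between the nearest observed years."""
    filled = list(series)
    for i, v in enumerate(series):
        if v is None:
            lo = max(j for j in range(i) if series[j] is not None)
            hi = min(j for j in range(i + 1, len(series)) if series[j] is not None)
            step = (series[hi] - series[lo]) / (hi - lo)
            filled[i] = series[lo] + step * (i - lo)
    return filled

def trim(values, k=1):
    """Drop the k smallest and k largest observations (large samples only)."""
    return sorted(values)[k:-k]

def winsorize(values, k=1):
    """Replace the k most extreme values on each side with the next value
    in line, instead of deleting them (better for small samples)."""
    s = sorted(values)
    lo, hi = s[k], s[-k - 1]
    return [min(max(v, lo), hi) for v in values]

incomes = [30, None, 50, 40]                # missing income for one customer
print(mean_substitute(incomes))             # [30, 40, 50, 40]
sales = [100, None, 120, None, None, 150]   # missing years in a time series
print(interpolate(sales))                   # [100, 110.0, 120, 130.0, 140.0, 150]
obs = [12, 15, 14, 13, 16, 400]             # 400 is an implausible outlier
print(trim(obs))                            # [13, 14, 15, 16]
print(winsorize(obs))                       # [13, 15, 14, 13, 16, 16]
```

Note how winsorizing keeps the sample size intact while trimming shrinks it; that is exactly the trade-off the notes describe for small vs. very large data sets.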
37
Validity | Archival Research
Archival proxy: a quantitative measure used to represent a theoretical construct that is relevant to
the design and completion of a research study. Construct validity: how close an association actually
exists between a measure and the theoretical construct that the measure is meant to represent or
capture. Type I error: incorrectly rejecting a true null hypothesis (H0), i.e., falsely inferring the
existence of an effect that is not there. Type II error: incorrectly retaining a false null hypothesis (H0),
i.e., falsely inferring the absence of an effect that is there.
To defend a proxy's construct validity:
- Provide precedence (prior studies that have used this measure);
- Provide 'sound logic' to support that considerable conceptual overlap exists between proxy and
construct (maybe things have changed over time);
- Provide evidence of significant correlations with related constructs (nomological validity): r > 0.3 is
an acceptable correlation, r > 0.5 a high correlation.
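The nomological-validity check can be illustrated with a hand-computed Pearson correlation; the proxy and related-construct scores below are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) *
                      sum((b - my) ** 2 for b in y))

proxy = [1, 2, 3, 4, 5]       # hypothetical proxy scores
related = [2, 4, 5, 4, 5]     # hypothetical scores on a related construct
print(round(pearson_r(proxy, related), 2))   # 0.77 -> above 0.5, high correlation
```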
38
Generalizability | Archival Research
Assess generalizability (external validity) | To what extent can you generalize the conclusions from your
sample to the population and to other settings?
39
Field Experimental Research
Pros:
- Real-world impact (high external validity);
- Authenticity;
- Novel insights (does the discount strategy really work?).
Cons:
- Time consuming;
- Challenging to implement;
- Focus on observed behavior;
- High degree of noise;
- Ethical considerations;
- Unexpected factors;
- Poor timing;
- Failure to randomize;
- Spillover and side effects;
- Non-compliance;
- Insufficient sample size;
- Second-order effects (effects that appear after the main effect).
40
Reliability: for concrete constructs use single-item measures; for abstract constructs use multi-item
measures → report Cronbach's alpha (preferably > 0.7).
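A quick sketch of how Cronbach's alpha is computed from multi-item data, here for hypothetical answers of 5 respondents to 3 items: alpha = k/(k−1) · (1 − Σ item variances / variance of the summed scale).

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, same respondents in each list."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # scale score per respondent
    return k / (k - 1) * (1 - sum(variance(i) for i in items) / variance(totals))

# Hypothetical answers of 5 respondents to 3 items measuring one construct:
item1 = [1, 2, 3, 4, 5]
item2 = [2, 2, 3, 4, 5]
item3 = [1, 3, 3, 4, 5]
print(round(cronbach_alpha([item1, item2, item3]), 2))   # 0.98 -> well above 0.7
```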
Best practice: randomization checks (a priori & ad hoc), choose the unit of randomization, run power
calculations a priori, and capture all relevant outcome metrics.
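An a-priori power calculation can be sketched with the standard normal approximation; the effect sizes below are assumed Cohen's d values, with the conventional α = .05 and power = .80.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate sample size per condition for a two-sided, two-sample
    comparison of means with standardized effect size d (Cohen's d)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = .05
    z_power = NormalDist().inv_cdf(power)           # ~0.84 for power = .80
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)

print(n_per_group(0.5))   # medium effect -> 63 per group (exact t-test tables say 64)
print(n_per_group(0.2))   # small effect -> 393 per group
```

The smaller the effect you want to detect, the (quadratically) larger the sample per condition must be.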
A/B Testing
41
To implement successfully:
- Instead of focusing on overall effects only, consider looking at effects in subgroups/segments
(heterogeneous treatment effects);
- Effects may vary across segments, for example by gender (men vs. women), historical purchase
patterns (heavy vs. light shoppers), etc.;
- Focusing on aggregate data might lead to the incorrect conclusion that there is no effect for any
participant;
- Important 1: partition subjects into subgroups based on pre-treatment covariates;
- Important 2: you need a sufficiently large sample in each condition to get valid results;
- Appreciate the value of A/B tests: tiny changes can have a big impact, and
experiments can guide investment decisions.
- Build a large-scale capability | center-of-excellence model: a third
option is to have some data scientists in a centralized function and others
within the different business units. (Microsoft uses this approach.) A center
of excellence focuses mostly on the design, execution, and analysis of
controlled experiments. It significantly lowers the time and resources those
tasks require by building a companywide experimentation platform and
related tools. It can also spread best testing practices throughout the
organization by hosting classes, labs, and conferences. The main
disadvantages are a lack of clarity about what the center of excellence
owns and what the product teams own, who should pay for hiring more
data scientists when various units increase their experiments, and who is
responsible for investments in alerts and checks that indicate results aren't
trustworthy.
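The heterogeneous-treatment-effects point above can be illustrated with hypothetical A/B data in which two segments react in opposite directions, so the aggregate lift is exactly zero:

```python
from statistics import mean

# Hypothetical A/B test records: (segment, variant, converted) per customer.
data = (
    [("heavy", "A", 1)] * 30 + [("heavy", "A", 0)] * 70 +   # heavy shoppers, control: 30%
    [("heavy", "B", 1)] * 45 + [("heavy", "B", 0)] * 55 +   # heavy shoppers, treated: 45%
    [("light", "A", 1)] * 45 + [("light", "A", 0)] * 55 +   # light shoppers, control: 45%
    [("light", "B", 1)] * 30 + [("light", "B", 0)] * 70     # light shoppers, treated: 30%
)

def lift(rows):
    """Treatment effect: conversion rate under B minus conversion rate under A."""
    return (mean(y for _, v, y in rows if v == "B")
            - mean(y for _, v, y in rows if v == "A"))

print(round(lift(data), 2))     # 0.0 -> "no effect" in the aggregate
for seg in ("heavy", "light"):
    print(seg, round(lift([r for r in data if r[0] == seg]), 2))
# heavy 0.15, light -0.15: real but opposite effects in the two segments
```

Averaging over the segments cancels the two effects out, which is exactly why partitioning on pre-treatment covariates matters.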
42
- Causality trap: correlation ≠ causation;
- Apparent causalities often fail to hold up under examination;
- Field experiments permit causal inferences;
- 5 steps of implementation.
43
Big data
Advantages:
Disadvantages:
1. Incomplete: Big data records what consumers do, not why they do it.
2. Inaccessible: from outside the organization there are ethical and legal barriers; from inside the
company, data is often not integrated and lacking variables.
3. Nonrepresentative: If the sample is representative, you can make inferences about the
population based on your sample. How representative are online opinions?
4. Drifting: If you want to measure change, don’t change the measure.
5. Algorithmically confounded: How the platform is designed can influence behavior, introducing
bias or noise into what you’re trying to study.
6. Dirty: Big data sources can be loaded with junk or spam.
7. Sensitive: Some of the information that companies have is sensitive (Strava).
GDPR:
44
Exploratory research -> Inductive Research
What is exploratory research? Aim: to acquire an in-depth understanding when prior theory is absent,
often based on qualitative data. The data are words rather than numbers, the sample size is small,
questions are often asked one by one, and it offers rich information on each respondent.
1. Open ended:
- No need to predetermine precise constructs;
- Flexible and exploratory.
2. Concrete and vivid:
- See the world through the eyes of the subjects.
3. Rich and nuanced:
- Capture details.
In inductive research, the only thing you do not do is apply statistical techniques (obviously, because
you do not use quantitative data).
45
Research strategies:
1. In-depth interviews;
2. Focus groups;
3. Observations;
Unstructured interviews:
- Interviewer: has only a vague idea of the information needed; no planned sequence of questions;
- Respondent: talks openly and widely about the topic.
Structured interviews:
- Interviewer: knows at the outset what information is needed and has a list of predetermined
questions;
- Respondent: is asked the same set of questions in the same order.
1. Designing:
1. Introduce yourself;
2. Introduce the purpose of the interview;
3. Assure confidentiality;
4. Ask permission to tape-record the interview;
5. Construct the questions.
2. Interviewing:
1. Warm up questions (easy to answer);
2. Main questions: per topic (first an open answer, followed by one or more probing questions).
3. Transcribe
1. Write down questions and answers;
2. Do it immediately and exactly (in same language).
4. Analyzing:
1. Data reduction;
2. Data display (identify themes and patterns).
5. Reporting:
1. Empirical description of themes and patterns.
2. Quotes from interviews.
3. Explanations for observed patterns and relationships (theory development).
46
Focus groups
Observation
- The watching and analysis;
- Of the behavior;
- Of employees, consumers, investors.
4 types of observation:
47
Sampling design:
Sampling designs in qualitative research: (generalizability is not important, so usually a small sample size).
- Convenience sampling;
- Quota sampling;
- Judgement sampling;
- Snowball sampling.
1. Interjudge reliability (how much consensus there is in the ratings given by the judges):
- Degree of agreement among raters/judges;
- Calculations: percentage of agreement, Cohen's kappa, etc.
2. Interviewer biases:
- Loaded questions (a hidden purpose behind the asking);
- Expressing one's own opinion and judging;
- Selective perception (hearing what you want to hear, observing what you want to
observe).
3. Interviewee biases:
- Obedience = desire to please the interviewer;
- Conformity = doing/thinking what the majority does/thinks (= normative social influence).
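Cohen's kappa corrects raw percentage agreement for the agreement expected by chance. A minimal sketch with two hypothetical raters coding 50 interview fragments:

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """kappa = (p_observed - p_chance) / (1 - p_chance)."""
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum(c1[cat] * c2[cat] for cat in c1) / n ** 2   # chance agreement
    return (p_o - p_e) / (1 - p_e)

# 50 fragments: the raters agree on 20 "yes" and 15 "no", and disagree on 15.
r1 = ["yes"] * 20 + ["yes"] * 5 + ["no"] * 10 + ["no"] * 15
r2 = ["yes"] * 20 + ["no"] * 5 + ["yes"] * 10 + ["no"] * 15
print(round(cohens_kappa(r1, r2), 2))   # 0.4: raw agreement is 70%, kappa is lower
```

The gap between 70% raw agreement and kappa = 0.4 is the part of the agreement that chance alone would have produced.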
Summary:
Mixed methods:
48
International research
Differences between domestic and international research – going from the local country to research at
the international level:
- Increased complexity;
- Most slips/blunders come from inadequate research;
- Previous business research knowledge still applies.
1. Cultural factors: widely shared norms or patterns of behavior within a large group of people.
Differences in family structure, household roles and attitudes.
2. Ethnic factors: do not influence behavior, but differences in physical features matter. E.g.,
different hair types mean different hair care products are needed.
3. Climatic factors: climates in different parts of the world account for many differences between
cultures. Differences in produced/consumed products (the British drink beer).
4. Economic factors: the level of wealth and taxation also affects behavior in countries. Prices of
products; Norwegians drink little alcohol.
5. Religious factors: religions can lay down specific rules and patterns, e.g., dietary rules about
what you may not eat.
6. Historical factors: differences have slowly evolved over time but can have a strong effect on
consumer behavior (bullfighting in Spain).
7. Geographic factors: some target groups are not easy to reach (small towns vs. cities).
8. Consumption pattern factors: differences in consumption patterns between regions (French
wine; port before or after dinner).
9. Research condition factors: researchers have to note small differences between cultures while
performing research (a different mode for different preferences).
Also deal with differences in: language, and market research facilities and capabilities.
49
Classification of different types of international research
- Single-country research: needed when you want to know whether strategies from country X can be
adopted in country Y.
- Multi-country research:
1. Independent multi-country research: the most common form. Branches independently conduct
similar research on the same products in different countries. This often leads to double effort and to
numbers that cannot be compared between countries (e.g., brand awareness checks or job
satisfaction measured differently in different countries).
2. Sequential multi-country research: countries are researched one after the other. An attractive way
to research a range of geographical regions: you learn from prior research, the limits of the subject
matter covered are defined early, operational problems learned earlier are avoided, key findings
become the focus of later studies, and costs are spread.
1. Construct equivalence:
- Are we studying the same phenomena/concepts?
- Does gender have the same construct across all countries?
- 'Coffee' covers a lot of different beverages or forms;
- Important in both primary research and secondary data;
- Data may not be readily comparable.
How to ensure it?
50
- Do not only use domestic literature but also country-specific literature;
- Conduct qualitative research, e.g., a focus group.
2. Measurement equivalence: are phenomena/concepts measured in the same way?
- Careful: with secondary data, categories and calibration systems may differ across countries
(monetary units, measures of weight, distance and volume).
3. Sampling equivalence:
Without sampling equivalence, the validity of the findings is called into question: you cannot rule out
that differences between samples are mistaken for differences between countries. Consider 3 elements:
1. Timing – data collection should preferably take place as simultaneously as possible;
2. The sampling frame may need to be different in different countries;
3. The data collection procedure.
Sampling frame – use comparable sampling frames, unless that gives inadequate coverage in some
countries. E.g., when women were not allowed to vote in Saudi Arabia, an election list contained only
men.
51
52
53
Units of analysis: The unit of analysis refers to the level of aggregation of the data collected during the
research. The research objective determines what kind of 'units' will be analysed. (The units are
named differently on p25).
- Individual level: when the problem statement focuses on motivational levels of employees for
example;
- Dyads: when the researcher is interested in two-person interactions;
- Groups: when the problem statement is related to group effectiveness. Collect individual data
first and categorize it into groups to see whether there are differences between these groups;
- Organisations: when the researcher wants to look at differences between companies;
- Nations: when for example cultural differences among nations are studied.
Mixed methods
Mixed methods research: Combinations of methods are used in many studies. For example: You can
interview managers to collect data about the nature of managerial work. Based on the analysis of this
interview data, you can formulate theories of managerial roles, the nature and types of managerial
activities. These have been tested in different setting through both interviews and questionnaire
surveys.
• Triangulation: Is a technique that is associated with mixed methods, because triangulation means that
one can be more confident in a result if the use of different methods or sources leads to the same
results. Triangulation can be addressed from multiple perspectives, namely:
o Method: using multiple methods of data collection and analysis;
o Data: collecting data from several sources and/or at different time periods;
o Researcher: multiple researchers collect and/or analyse the data;
o Theory: multiple theories and/or perspective are used to interpret and explain the data.
Omitted variable: when a relevant variable is omitted (left out), the estimated model will have biased
parameters.
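The bias can be demonstrated with a small simulation on hypothetical data: the true effect of x on y is 1, but an omitted confounder z (correlated with x) pushes the estimated slope toward 2.

```python
import random

def ols_slope(x, y):
    """Simple OLS slope of y on x: cov(x, y) / var(x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sum((a - mx) ** 2 for a in x)

random.seed(42)
n = 10_000
z = [random.gauss(0, 1) for _ in range(n)]               # the omitted variable
x = [zi + random.gauss(0, 1) for zi in z]                # x is correlated with z
y = [xi + 2 * zi + random.gauss(0, 1) for xi, zi in zip(x, z)]

# Regressing y on x alone gives slope ~ 1 + 2*cov(x,z)/var(x) = 1 + 2*0.5 = 2,
# roughly twice the true coefficient of 1.
print(round(ols_slope(x, y), 1))
```

Only controlling for z (e.g., by including it in a multiple regression) recovers the true effect of 1.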
54