
PHILOSOPHY OF SCIENCE

LESSON 1: INTRODUCTION
Carsten Bergenholtz (cabe@mgmt.au.dk)
Department of Management, Aarhus BSS
Aarhus University
AGENDA
Introduction
• Why Philosophy of Science – what is it?
• Practical info: Lesson overview, curriculum
• Learning objectives & the exam
• Illustrating a topic and type of exam question
• Video: How to study efficiently (for this course)

What is (social) science?


1) What is science, actually? (Ritchie 2020)
2) Why is social science different from other sciences (Watts 2011)

2
WHY AM I TEACHING THE COURSE?
Education
• Master’s degree in Philosophy
• PhD in Business Administration at BSS Aarhus

Research: Apply a broad range of methods


• Experiments
• Qualitative interviews & observations
• Quantitative surveys
• Simulations

Teaching & Supervision: Also supervise cand. merc. research


methods projects (Digital Business Management)

3
PHILOSOPHY OF SCIENCE: A META-LEVEL COURSE
Program overview by semester:
1: Business Math, Statistics, Organizational Behavior, Financial Accounting
2: Quantitative Research Methods, Management Accounting, Business Economics
3: Qualitative Methods, Business Dev., Industrial Economics, Strategy, Macroeconomics
4: Marketing Management, Finance, Operations Management
5: Electives, or Stay abroad
6: Commercial law, Philosophy of Science, Bachelor Project

Different levels of knowledge:
1) Everyday business life
2) Theories and methods about business life
3) Theory about science (e.g. business administration theories)
4
WHY STUDY PHILOSOPHY OF SCIENCE (POS)? I
(DANISH TITLE: SCIENTIFIC METHOD)

Three big meta-questions – beyond individual theories and methods


• How certain can we be about scientific conclusions?
• What kind of social reality is business administration about?
• What is good science and how do we get there?

5
WHY STUDY PHILOSOPHY OF SCIENCE (POS)? II
(DANISH TITLE: SCIENTIFIC METHOD)

(some) Key insights and competencies


• A more realistic understanding of what social science and ‘truth’ are
• How are social sciences different from other (e.g. physics) sciences
• What it means to claim science is objective
• How does science actually unfold and develop - and why do scientists at times disagree
• How PoS assumptions shape research approaches / methods / theories / questions
• Facilitate an ability to assess (science) claims made in studies, various media or in businesses
• What types of research designs allow what kind of conclusions
• How can we move from correlation to causal insights
• Be an informed skeptic, not an anti-science skeptic

All central competencies: As a citizen, academic


and employee + a cand. merc. student
6
Lesson | Week | Topics | Literature
1 | 5 | Introduction to the course: Navigating social science | Ritchie ch 1; Watts 2011; 2021 Spring exam case
2 | 5 | Part 1: Correlation vs causation & experiments; Part 2: Establishing causation – experiments | Isager 2023; BRM ch 3 (pp. 50-59); Bergenholtz 2024a
3 | 6 | Experiments – continued | See lesson 2 part 2; Bergenholtz 2024b
4 | 6 | Research designs: Strengths, weaknesses, and synergies | BRM 3 (pp. 59-74); AMJ Editorial 2011
5 | 7 | Study cafe 1: Research designs | See lesson 4; 2020 Spring exam case
6 | 7 | The essence of theory | Bergenholtz 2024c; NYT 2016
7 | 8 | Kuhn and Popper on Scientific Progress - Part 1 | Kuhn 1962 (section 4 p. 191-198 & section 6 p. 205-207); PoS 5; ABM 2018 ch 3; Popper 1962
8 | 8 | Kuhn and Popper on Scientific Progress - Part 2; Why business administration research is particular | See lesson 7; Nelson 2016
9 | 9 | Philosophical assumptions in science | Guba 1990
10 | 9 | Rationality and biases in research | Kahneman 2011 ch 1 & 7
11 | 10 | Scientific processes and objectivity | Aschwanden 2015; Harrer 2021 (p. 1-9)
12 | 10 | Scientific integrity | Danish Committee on Research Misconduct 2020; Data Colada 2023; Piper 2023; Carter 2000
13 | 11 | Complexity in social science | Sullivan 2011 (p. 1-3); Watts 2007; Feldman-Barrett 2021
14 | 11 | Study cafe II | 2023 Spring exam case
15 | 12 | Course conclusions and Q&A | –
7
PRACTICAL INFO I: CURRICULUM AND HOW TO
FIND LITERATURE
• Curriculum: See Brightspace – no books to buy
• Finding articles: AU network and / or Google Scholar from anywhere

8
PRACTICAL INFO II: ABOUT LECTURES
• Videos 2020-21: https://www.youtube.com/@PhilosophyScience/videos

• Reading guides: To facilitate reading of difficult texts (see Brightspace)

• Quizzes: Vevox.app 143-136-950

• Slides sent out evening before

• Slide style:
• Blue: New topic
• Yellow: (probably) only in slideset
• Green: Interaction
• Orange: Summing up / repetition

• Open office: Mondays, weeks 6-11, from 2.10 to 3 pm in 2623, 103a or online at https://aarhusuniversity.zoom.us/my/bergenholtz – ask PoS questions

9
PRACTICAL INFO III: A CUSTOMIZED
CHATGPT-4 CHATBOT

More info soon, on what you can get access to, and how.

10
PRACTICAL INFO IV: EXAMS
Exam form: WO+: On-site written exam with upload of the exam paper in WISEflow; use of the Internet NOT allowed during the exam; own PC required. The assignment is completed in FLOWlock+.

5 hours on-site exam, March.

Partially based on an unknown case. During the lectures, the students will have had the opportunity to reflect upon similar types of questions, in order to prepare for this exam.

Re-exam: Same as ordinary exam

FLOWlock+: ”a closed browser which prevents you from accessing anything but PDF-files saved locally on your own computer. The lockdown browser must be installed before the exam.”

https://kursuskatalog.au.dk/en/course/123022/Philosophy-of-Science

Examples of former exam questions


Question 1 (17%) a) Using the case as an illustrative example, please explain the benefits of a randomized, controlled trial and why it is considered by many to be the ‘gold standard’ of science. b) Furthermore, please discuss the external and ecological validity of the experiment presented in the case.
Question 5 (17%) Please discuss whether the text “Science isn’t broken” (referred to as Aschwanden 2015 in the curriculum) suggests a conception of science that is most similar to positivism, post-positivism or constructivism. Please be specific.

See Brightspace for prior exam sets (assessment guides & good answers uploaded later)

11
ILLUSTRATING A TOPIC & THE EXAM
RESEARCH DESIGNS, LESSON 2-5

• Qualitative and Quantitative/Statistics methods classes provide a range of tools to


ensure quality of analysis

• In Philosophy of Science the focus is on more general decisions: Certain research designs* (don’t) allow certain kinds of conclusions

• What does this mean? Consider this example from a prior exam →

*Research design: overall framework to collect and analyze data (more in lesson 4)
EXAMPLE OF PRIOR EXAM QUESTION (SPRING 2021)
READ AND REFLECT (3-4 MIN)
Question 2 (17 %)
Please explain what you think are the main weaknesses of the following study, in terms of its research
design.
Imagine a group of researchers recently set out to study how Covid-19 has impacted employees'
perception and behaviour concerning engaging in remote, virtual work. The study is based on a
survey, where individuals responded to a range of pre-validated questions about their work practices;
e.g. how useful online meetings are, and to what degree their organisation was able to facilitate high
quality online meetings. Survey responses were collected via online polls. One poll was sent out
during a large online conference for individuals in the particular industry surveyed in August 2020
(1400 responses). Another poll was sent out in December 2020 during a different, large online
conference for individuals in the particular industry surveyed (1700 responses). One of the results was,
for example, that most respondents appreciated virtual meetings, because they were easier to
access. The group of researchers also notes that the responses show that during 2020 organisations
became better at facilitating online work for their employees.

13
SIMILAR POINT IN THIS EXAMPLE: SELECTION BIAS

“Back during World War II, the RAF lost a lot


of planes to German anti-aircraft fire. So
they decided to armor them up. But where
to put the armor? The obvious answer was
to look at planes that returned from
missions, count up all the bullet holes in
various places, and then put extra armor in
the areas that attracted the most fire.”

14
SELECTION BIAS: HOW DATA COLLECTION SHAPES
SAMPLE AND THUS CONCLUSIONS TO DRAW

“Obvious but wrong. As Hungarian-born mathematician


Abraham Wald explained at the time, if a plane makes it
back safely even though it has, say, a bunch of bullet holes
in its wings, it means that bullet holes in the wings aren't
very dangerous. What you really want to do is armor up the
areas that, on average, don't have any bullet holes. Why?
Because planes with bullet holes in those places never
made it back. That's why you don't see any bullet holes
there on the ones that do return…”

http://www.motherjones.com/kevin-drum/2010/09/counterintuitive-world

https://medium.com/@penguinpress/an-excerpt-from-how-not-to-be-wrong-by-jordan-
ellenberg-664e708cfc3d#.yterlkdg0
Even if you randomly selected which returned planes to investigate – you would still face selection bias

15
A CORE TOPIC IN POS: HOW RESEARCH
DESIGNS LIMIT CONCLUSIONS
• We can’t know if X causes Y if we don’t have an appropriate
research design, and e.g. know where the data comes from
(selection bias)
• Interviewing 100 successful entrepreneurs doesn’t
necessarily tell you much about how to succeed
• Finding a strong correlation between employee benefits
and performance doesn’t necessarily tell you if employee
benefits are important
• Many claims in business are not valid – will show why/how

• Q2 in 2021 Spring exam just one illustration. Lesson 2-5:


Additional challenges and frameworks

16
QUESTIONS ABOUT THE COURSE, EXAM AND
SETUP? (DISCUSS 1-2 MIN)

17
NEED HELP? OR SOMEONE TO TALK TO?
• Our student welfare counsellors are there for you to talk to.
• They have an overall knowledge of rules and options at the degree
programmes at Aarhus BSS. The counselling sessions can relate to all
aspects of your student life, for example:
• Delays in your degree programme
• Doubts about your degree programme
• Illness affecting you or your relatives
• How to resume your degree programme
• Personal problems that affect your life as a student
If they are not able to help you directly, they can assist you in getting the help
you need.
HOW DO I LEARN MOST EFFICIENTLY AS A STUDENT?
IN GENERAL, AND FOR THIS CLASS

19
VIDEO WHERE I PRESENT GENERAL AND POS SPECIFIC INSIGHT ON HOW
TO STUDY EFFICIENTLY → https://youtu.be/Ae5bPqsFiuU (13 MIN)

20
CONSIDERATIONS ON HOW TO PREPARE POS EXAMS
• Use reading guides to read texts. Often read twice.
• Use lectures/slides as pointers to identify key elements and get overview of curriculum
• Create own overview, e.g. including key quotes of key texts
• Read prior exam questions + example answers to test yourself.
• Don’t only review notes, but practice exam questions
• Flowlock+: Remember to save notes / documents on your own computer, not online!

5 ECTS course ≈ 140 hours

Exam + preparation: 40 hours
Remaining 100 hours / 15 lessons ≈ 7 hours each: 2-hour lesson + 5 hours of preparation

21
WHAT IS SOCIAL SCIENCE AND DOES IT WORK, ACTUALLY?
(RITCHIE 2020, WATTS 2011)

Courses usually present final & stylized science models

• Ritchie 2020: The idealized conception of science vs. the actual, messier scientific process

• Watts 2011: Social science is particular, different from natural sciences


RITCHIE 2020 (CH 1)
IDEALIZED CONCEPTION VS ACTUAL SCIENCE

• The idealized conception of science: Mertonian norms of (p. 21)


• Universalism
• Disinterestedness
• Communality
• Organized scepticism

• Ritchie: Not quite ”How Science works” (p. 13)


• For example: All researchers always disinterested?

• Main overall message of book: Science Fictions


• Does believe in science (see e.g. p. 13)
• But highlights it is more messy than textbooks indicate
”SCIENCE IS A SOCIAL CONSTRUCT” (P. 13)

• What does this quote mean?


• Does not imply anti-science, that science is made up, or doesn’t refer to
something real - more in lesson 7-9
• Science is an intersubjective process; we need some consensus among
researchers (not necessarily full agreement) – more in lesson 11

p. 14
SCIENCE: ”UNMATCHED DEGREE OF OBJECTIVITY”
• Irrespective of flaws
• Science is still ”our best way of moving towards the truth” (p. 13)
• Science represents an ”unmatched degree of objectivity” (p. 14) – more in lesson 7-9

• However, need to be aware of flaws in the system


• We are biased / subjective (lesson 10)
p. 15
• Some even cheat (lesson 12)
• Science is complicated, mistakes should be expected (lesson 11)

• Ritchie has great trust in science – but does that also hold for business administration?

p. 13
BUSINESS ADMINISTRATION AS A SCIENCE, COMPARED
TO E.G. MEDICINE (VEVOX.APP 143-136-950, 1 MIN)

How strong is the causal effect of taking a painkiller (e.g. ibuprofen) on the relief of a headache?
VARIOUS CORRELATIONS (ESTIMATES), TO ILLUSTRATE
BUSINESS ADMINISTRATION CAN BE EFFECTIVE
Gender and height ≈ .67
Weight and height ≈ .40
Financial incentives and simple tasks ≈ .32
Goal setting and performance ≈ .28
Headache pill and pain reduction ≈ .24
Alcohol intake and aggressive behavior ≈ .23
Personality and job performance ≈ .20
Childhood lead exposure and IQ reduction ≈ .12
See https://twitter.com/BrianNosek/status/1416427265118908419?s=20 for further examples

27
TO BE FAIR, ALSO LARGE DIFFERENCES BETWEEN
E.G. MEDICINE AND BUSINESS ADMINISTRATION

“Because companies vary so wildly in size, form,


and age, …it is far more risky in business to
presume that a proven ‘cure’ developed in one
place will be effective elsewhere.” (Pfeffer and
Sutton 2006: 64)
RITCHIE 2020 (CH 1): OVERALL TAKE-AWAY
• Overall take-away
• Some have (too) idealized conception of science (Mertonian, rational norms)
• Acknowledge and understand flaws of science (see also lesson 10, 12)
• Realize difference between science process and science results (see also lesson 11)
• Science is ”moving towards the truth”, p. 13 (see also lessons 7-9, 11)
• Implies a question and challenge: how does science reach this ‘unmatched degree of objectivity’? (see also lessons 7-9, 11)

• Ritchie has introduced some core questions & indicated answers - to be expanded
2021 SPRING EXAM QUESTION 6
• How can Ritchie 2020 contribute to answer the following question? (other relevant texts are
presented later)

“A famous scientist recently claimed that science provides insight into objective reality. Please rely
on the curriculum to provide counterarguments to this claim.”

• Science is not an idealized, impartial process that clearly leads to truth (about objective reality)
• Science might ‘move us towards the truth’
• But human subjectivity, flaws and biases lead to an imperfect, social consensus-seeking process (constructs), which can shape and constrain outcomes
• …
WATTS 2011: WHY EVERYTHING IS (NOT) OBVIOUS
• Highlights how social sciences (incl. business administration) differ from e.g. physics
• But not due to methods or research designs

(Social sciences: Economics, political science, psychology, business administration etc.)

• Physics:
• We identify something new / non-obvious
• E.g. planets, laws of gravity, what is inside cells etc.

• In social science:
• Investigates the familiar social world we live in
• Everything is obvious – once you know the answer
• Aim is to find what is a myth, and what works

Ritchie and Watts: Two complementary perspectives on how social science


works, and its main challenges
WATTS 2011 - VIDEO

Watts 2011 also presented in this video: https://youtu.be/ulqv_tX_uAY



https://www.youtube.com/watch?v=D9XF0QOzWM0 34

Social science vs. classical physics:


Measuring the weight of planets

An experimental device in Cavendish’s 18th century lab → able to measure the weight of the earth (!) and, following Newton’s law of gravity, infer the weight of other planets in the solar system
This did not seem obvious at all!

What can social scientists infer from a small wooden / metal box in our lab?

35
https://www.insidescience.org/news/measuring-earth-wire

Social sciences: That’s obvious!


Following WW2, 600,000 soldiers were interviewed about life as a soldier, in order to investigate:

Who handles life as a soldier better: people from the countryside or from cities?

36

Why everything is not obvious

https://twitter.com/random_walker/status/909486051135299584
37

https://www.youtube.com/watch?v=D9XF0QOzWM0 38

Why can social science seem obvious?


1) Familiar!
• We face a social world – not a world of atoms and black holes
• If someone explains quantum mechanics to me, it is obviously new ≠ world of business

2) Natural sciences can identify entirely new things and mechanisms


• Physics: Cf. Cavendish example
• New vaccines, new planets, new animals, new laws...

3) Social science insights can seem obvious after the fact


• Cf. WW2 example
• How to motivate, reduce inflation, market products…

39

WHY EVERYTHING IS NOT


OBVIOUS
Predictions are hard because

• Complex world
• Many independent actors interact and influence each other (cf. lesson 13 on complexity)
• In other sciences one can ‘isolate’ what one is interested in (gravity, DNA, planetary movements…)

• Self-reflexive
• Definition: Humans are capable of changing their perception of the world and their behavior and targets
• Self-fulfilling prophecies? (cf. lesson 13 on complexity)

• The most important things never happened?


• ”The greatest hero with the greatest impact - the one that legislated that all pilot doors have to be bullet
proof and locked, thus avoiding 9-11…” (Taleb: The Black Swan).

40
SUMMING UP LESSON 1

• Intro to and overview of the course setup


• Teaching & practicalities
• Exam: Type of question and Flowlock+
• How to study (efficiently)

• Core focus of early parts of the course


• Research design and correlation vs. causality
• Ritchie 2020: How does science actually work
• Watts 2011: Why everything seems obvious but isn’t – or how social science is different
from e.g. natural sciences

41

Philosophy of Science

How to study efficiently – in general and this


course in particular (video)

https://youtu.be/Ae5bPqsFiuU

Carsten Bergenholtz (cabe@mgmt.au.dk)


Department of Management, Aarhus BSS
Aarhus University


Comparison of 10 study techniques

43
Improving Students’ Learning With Effective Learning Techniques
by Dunlosky et al., 2013

How to read curriculum efficiently


(general advice)
https://twitter.com/adammgrant/status/1117069078403829760?s=11

“To remember things you read, don't reread or highlight. Instead:


(1) Consolidate: take a 10-minute break in a quiet place
(2) Quiz yourself: test your recall and identify gaps
(3) Share: make it stick by teaching someone else about it”

44

Efficient study techniques

• Based on a meta study of many


studies, comparing 10 study
techniques

• Comparison criteria: Contribution to


learning + easy to do

Read more here in this short, insightful essay:


http://bigthink.com/neurobonkers/assessing-the-evidence-for-the-
one-thing-you-never-get-taught-in-school-how-to-learn

45

Results of studies on (no) practice, and (non)


distributed practice


Philosophy of Science (BSc) grade distribution


2019

47

Grades and effort


Grades (average) | Student effort, # hours (index)
2017: 4.6 | Index 10
2018: 6.3 | 13 (30% more)
2019: 5.5 | 11 (10% more)

More hours spent → better grades?

Not necessarily! Correlation not always causation…but often it is

5 ECTS course ≈ 140 hours
Exam + preparation: 40 hours
Remaining 100 hours / 15 lessons ≈ 7 hours each: 2-hour lesson + 5 hours of preparation
48

Qualitative insights from two very successful 2020 students

• Some of the texts are difficult – pay attention to reading guides’


advice on what to focus on
• Produce written notes on texts’ key message (incl. quotations to use),
to ensure actual understanding
• Answer former exam sets + polleverywhere questions
• Consider the exam: Produce overview of what material and notes
might be applicable for various types of exam questions
• Watch videos before lecture (when relevant)

49

50
PHILOSOPHY OF SCIENCE

LESSON 2: CORRELATION VS CAUSALITY


& EXPERIMENTS (PART 1)
Carsten Bergenholtz (cabe@mgmt.au.dk)
Department of Management, Aarhus BSS
Aarhus University
LESSON 1
WHAT CHARACTERIZES THE
SOCIAL SCIENCES AND HOW DO
THEY WORK?
AGENDA: LESSON 2 (PART 1)

• Why correlation ≠ causation


• Chance, confounding, selection & controls
• Different types of causality
• Why we need a counterfactual
• Experiments: Establishing the counterfactual and
ensuring causality (→ Part 2 + lesson 3)

• Main text for part 1: Isager 2023


CORRELATION ≠ CAUSATION I
VIDEO: HTTPS://YOUTU.BE/WZ-1B_B6LNU

Correlation: Statistical relationship between two variables. When one


variable changes, the other tends to change in a predictable pattern.

Causation: One event is the result of the other event, i.e. one caused an
effect. Causal terms: leads to, brings about, impacts, reduces, increases, weakens etc.
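To make the correlation definition concrete, a minimal Python sketch (my illustration, not from the curriculum): we simulate data where mood partly tracks exercise and compute the correlation coefficient.

import numpy as np

rng = np.random.default_rng(seed=1)
exercise = rng.normal(size=1000)               # standardized 'exercise' scores
mood = 0.5 * exercise + rng.normal(size=1000)  # mood partly tracks exercise, plus noise
r = np.corrcoef(exercise, mood)[0, 1]
print(round(r, 2))  # roughly 0.45: higher exercise tends to go with higher mood

Note the causal link is built in by construction here; real data only hands us r, never the arrow.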

We often really want to find out if X actually causes Y!

• If we increase salary, will performance improve?
• If we increase the marketing budget, will sales increase?
• If I (or others) get a vaccine, will I become less sick?
• If I exercise more, will my mood improve?
• Are increased CO2 emissions linked to climate change?
• …
Arguably the most fundamental questions in life are about causality!

4
CORRELATION ≠ CAUSATION II
SOMETIMES CORRELATION DUE TO CHANCE

www.Tylervigen.com
CORRELATION ≠ CAUSATION III
CORRELATION DUE TO A THIRD - CAUSAL – VARIABLE?

https://pedermisager.org/blog/why_does_correlation_not_equal_causation/
https://imgs.xkcd.com/comics/correlation_2x.png
WHY IS IT DIFFICULT TO DISTINGUISH
CORRELATION AND CAUSALITY?
Data might show a correlation, but is it causal?
• Relation due to pure chance/randomness
• Reverse causation
• Confounders (or omitted variable bias)
• Selection bias

• Actual causation
REVERSE CAUSATION
CORRELATION OFTEN IMPLIES CAUSATION – BUT WHAT KIND?

Isager 2023: “If we see correlation, we have good reason to favor the hypothesis “exercise causes
mood” over the hypothesis “exercise and mood are completely independent”.”

Yet, reverse causation? Maybe both directions are causal?

Isager 2023: “Correlation does not imply your favored causal direction”

Simple statistical analysis cannot tell you which direction is true!
CONFOUNDING VARIABLES
Isager 2023: “Confounding happens when a third variable is a cause of both variables we are
interested in.”

Isager 2023: “This is what we mean when we say that stress confounds the relationship between
exercise and mood, and why confounding is also known as third variable problems.”
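A minimal simulation of the third-variable problem (my own sketch with made-up effect sizes, not from Isager 2023): stress lowers both exercise and mood, so exercise and mood correlate even though neither causes the other in this setup.

import numpy as np

rng = np.random.default_rng(seed=2)
n = 10_000
stress = rng.normal(size=n)
exercise = -0.7 * stress + rng.normal(size=n)  # stressed people exercise less
mood = -0.7 * stress + rng.normal(size=n)      # stress also worsens mood
# Exercise has NO causal effect on mood in this simulation, yet they correlate:
print(round(np.corrcoef(exercise, mood)[0, 1], 2))  # ~0.33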
SELECTION BIAS

Isager 2023: “…suppose that people’s desire to participate in our


exercise/mood survey is influenced both by their mood and by how much
they exercise”

“Desire to participate in survey” is marked with dashed lines in the figure

Those that participate in the survey differ from the rest = selection bias
ADDITIONAL EXAMPLES I: CONFOUNDERS
Drinking more diet coke is associated with obesity?

Figure: Diet soda → Obesity? A diet plan (aiming at weight loss) or another factor (e.g. exercise) may drive both.

https://www.ft.com/content/0a5a4404-7c8e-11e7-ab01-a13271d1ee9c
ADDITIONAL EXAMPLE: SELECTION BIAS
• Organisation X surveys relationship between job
satisfaction and productivity among employees. Not
everyone answers survey.
• Maybe respondents are both more likely to be highly
satisfied and highly productive
• Hence, relation would not be due to causality, but
selection bias (who responded)

Satisfaction (vertical) vs. Productivity (horizontal):
High S / Low P | High S / High P
Low S / Low P | Low S / High P

12
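The same point as a runnable sketch (hypothetical numbers, my illustration): satisfaction and productivity are independent in the workforce, but if the chance of answering the survey rises with both, the correlation observed among respondents is distorted (here it even flips sign), purely because of who responded.

import numpy as np

rng = np.random.default_rng(seed=3)
n = 100_000
satisfaction = rng.normal(size=n)
productivity = rng.normal(size=n)  # independent of satisfaction in the population
# People high on both are more likely to respond to the survey:
responded = (satisfaction + productivity + rng.normal(size=n)) > 1.0
pop_r = np.corrcoef(satisfaction, productivity)[0, 1]
obs_r = np.corrcoef(satisfaction[responded], productivity[responded])[0, 1]
print(round(pop_r, 2), round(obs_r, 2))  # ~0.00 in the population, ~-0.33 among respondents

The observed relation is an artifact of the response process, not of any causal link between the two variables.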
WHAT ISSUE IS AT STAKE HERE?
VEVOX.APP 143-136-950 (REFLECT AND DISCUSS 2 MIN)
An organization with 100 employees introduced a wellness program. After 3 months, all employees
completed a survey on their use of the program and productivity levels. It was found that the more
an employee used the wellness program, the more they reported an increase in job productivity.
What should researchers primarily be worried about, when interpreting these results?

1) Selection bias
2) Neither selection bias nor confounding variables, since all data comes from within the
organization.
3) Confounding variables
4) Both selection bias and confounding variables.
CORRELATION VS.
CAUSALITY – WHICH IS IT?
Isager 2023: “Put simply, correlation indicates that
something interesting might be going on, but we need
to investigate further (do more research, collect more
data, and utilize our expert knowledge) to figure out
exactly what is going on!”

Isager 2023 stops here.

But, what do we do?


1) Use control variables in statistical analysis
2) Experiments
CONTROL FOR CONFOUNDERS
STATISTICAL APPROACH TO SOLVE PROBLEM I
Regression analysis: Aim is to identify correlation/association between independent variable (X) and dependent variable (Y).
Controls are added, to ensure that correlation between X and Y is not due to confounding variables.

Example: Physical exercise (X) and mood (Y).


Potential control variables: Diet, sleep, stress, social support, weather/season, personality traits etc.

Made up numbers!
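Since the table above uses made-up numbers, here is a runnable version of the same idea (simulated data; assumes the statsmodels package): regressing mood on exercise alone overstates the effect, while adding the confounder stress as a control recovers the true coefficient.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(seed=4)
n = 5_000
stress = rng.normal(size=n)                                # the confounder
exercise = -0.7 * stress + rng.normal(size=n)
mood = 0.3 * exercise - 0.7 * stress + rng.normal(size=n)  # true exercise effect: 0.3
naive = sm.OLS(mood, sm.add_constant(exercise)).fit()
controlled = sm.OLS(mood, sm.add_constant(np.column_stack([exercise, stress]))).fit()
print(round(naive.params[1], 2))       # ~0.63: biased upward by the omitted confounder
print(round(controlled.params[1], 2))  # ~0.30: close to the true effect

The catch, as the next slide stresses: this only works for confounders we have measured; an unmeasured one leaves the bias in place.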
CONTROL FOR CONFOUNDERS:
STATISTICAL APPROACH TO SOLVE PROBLEM II
• Controlling for confounders can increase likelihood of finding
causal relations

• Theory drives choice of controls


• Theory tells us that diet, age, season etc. can influence both
exercise and mood, e.g.

• However: Can’t know (!) if other relevant variables are hidden


& missing (=omitted variable bias, see also lesson 4)

• Hence, controlling for confounders (simple statistical analysis)


can never guarantee causality!
IDEALLY, WE WANT THE
COUNTERFACTUAL

• We can add controls to isolate correlation


between exercise and mood

• But what we really want to know is what


happens when same person
• Exercises
• Does not exercise

• We want the counterfactual

• Sadly, in the real world we only make one choice


ILLUSTRATING NEED FOR COUNTERFACTUALS
EBAY ADS: A CAUSAL EFFECT ON SALES?
The more ads (y-axis) the more sales (x-axis)
But what is the counterfactual…?

Stylized illustration of Ebay data

https://thecorrespondent.com/100/the-new-dot-com-bubble-is-here-
its-called-online-advertising/109970897300-fd465983
18
COUNTERFACTUAL: NEED TO COMPARE A
WORLD OF ADS WITH A WORLD OF NO ADS

Maybe statistical
analysis / controls
could have ensured
counterfactual?

19
ADDITIONAL EXAMPLES: HOW TO GUARANTEE ALL
HIDDEN (POSSIBLY CONFOUNDING) VARIABLES HAVE
BEEN IDENTIFIED?
Org Behavior
• Employee Productivity Programs
• Organizational Restructuring
• Financial Incentives

Marketing
• Marketing Campaign Effectiveness
• Customer Loyalty Interventions

Etc.
A FEW MORE WORDS ABOUT CAUSALITY I

Average (causal) effects

Deterministic vs probabilistic causality


A FEW MORE WORDS ABOUT CAUSALITY II
‘ONE’ OR ‘MULTIPLE’ CAUSAL EFFECTS?

https://www.thenewatlantis.com/publications/correlation-causation-and-confusion p. 3

It is not only exercise that causes an effect on mood


• Also diet, health, season, personality etc.

Most types of events we are interested in are due to multiple causes

22
A FEW MORE WORDS ABOUT CAUSALITY III
HOW ARE CAUSAL EFFECTS ‘REAL’? MORE IN LESSON 7-9

https://www.thenewatlantis.com/publications/correlation-causation-and-confusion p. 4

23
A FEW MORE WORDS ABOUT CAUSALITY IV
DIFFERENT TYPES OF CAUSAL EFFECTS

http://gusevlab.org/projects/hsq/#h.a6jctcodj87b
CREATING THE COUNTERFACTUAL:
RANDOMIZED CONTROLLED TRIALS (RCT)
We can’t clone a person, but
we can compare how
‘similar’ people respond to
exercise (or not)

How do we create ‘similarity’?


We randomly divide people

(Barrowman p. 14)
BUSINESS EXAMPLE: RUNNING EXPERIMENTS TO
DISTINGUISH CORRELATION FROM CAUSATION
"Booking had learned early on that • At Expedia the scientific method is
it could not trust intuition and ”absolutely critical”
assumptions. "We see evidence • Thousands of experiments run
every day that people are terrible at continuously
guessing. Our predictions of how • User experience, human resources,
customers will behave are wrong sales etc.
nine out of ten times,"…We follow • Thousands of employees involved in
what the majority wants. And if you experimentation
fail fast, you can try a lot of things". • New employees take a crash-course
in the scientific method (experiments)
(Thomke , Experimentation Works p. 160)
(Thomke, Experimentation Works)

See also this short post for further perspectives on why and how businesses should run
experiments: https://www.linkedin.com/posts/carstenbergenholtz_innovation-experiments-
business-activity-6813800441286053888-Noil
26
CORRELATION OFTEN CONFUSED WITH CAUSALITY
IN MEDIA STORIES ABOUT SCIENCE

50 studies, shared 1.3 million times on social media


https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0196346
27
LEARNING, PART 1
• Correlation ≠ causality - and why it matters
• How to distinguish correlation & causation (chance, confounders, reverse causality, selection)
• How correlations often are mistaken for causality (in business and in science)
• Different types of causality (probabilistic, multiple causes, type of causality)
• We want the counterfactual: what would have happened if?
• How statistical analysis & controls attempt to eliminate confounding variables
• The gold standard for eliminating confounding variables, selection bias & creating a counterfactual: → RCT

28
Extra chance to meet the cand.merc. coordinators!
Still in doubt about which cand.merc. track to choose?

Stop by the S-building “Øvre Multirum” today – Friday 2nd February from 10-12.

You can grab a free cup of coffee and meet the coordinators – to
make sure that you choose the right cand.merc. track.
PHILOSOPHY OF SCIENCE

LESSON 2 PART 2
INTRODUCTION TO EXPERIMENTS
VIDEO: HTTPS://YOUTU.BE/EDCYXONPTXQ

Carsten Bergenholtz (cabe@mgmt.au.dk)


Department of Management, Aarhus BSS
Aarhus University
AGENDA PART 2
• What is a Randomized Controlled Trial? (=RCT, experiment)
• Why are RCTs considered the ‘gold standard’ of (quantitative) research?
• Lab and field experiments
• Internal and external validity

Objective: Learn to assess the quality of an experiment – across scientific disciplines

Based on: Mainly BRM ch 3 (pp. 50-59) (Experimental design), & also Bergenholtz 2024a

31
Treatment: Independent
variable

Outcome: Dependent
variable

Slide borrowed from Phanish Puranam. See also Bergenholtz 2024A: p. 1 32


WHY RANDOMIZE?
(BERGENHOLTZ 2024A: P.2)

• Want counterfactual: Compare same person getting treatment and in control

• If 1 person gets treatment, and 1 person is control → differences in outcome could have many reasons (everybody is different)

• If we randomly place many people in treatment & control


• Control group is the treatment group’s counterfactual
• Not identical groups, but statistically equivalent

• Individual Rolls: 2 people (1 in each group) rolling a die


will likely see different results due to chance.
• Group Rolls: With e.g. 100 people (50 per group) rolling,
the average results of each group likely similar.
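The dice analogy is easy to check in code (a toy sketch, not from Bergenholtz 2024a):

import random

random.seed(5)
# Individual rolls: two single people often differ a lot purely by chance
print(random.randint(1, 6), random.randint(1, 6))
# Group rolls: averages of 50 rolls per group land close to each other (near 3.5)
group_1 = [random.randint(1, 6) for _ in range(50)]
group_2 = [random.randint(1, 6) for _ in range(50)]
print(sum(group_1) / 50, sum(group_2) / 50)

This is why randomly allocated treatment and control groups become statistically equivalent as they grow.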
3 KEY ELEMENTS OF A RCT
Bergenholtz 2024a: p. 1-2

1) Randomized process in allocation of participants


2) Treatment (‘intervention’) of one variable (in control vs.
treatment group) – decided and implemented by researchers
3) Controlled setting: Only treatment differs across treatment
and control group (everything else identical)

An experiment has to abide by all 3 elements


When you assess a design at the exam, assess all 3 elements

34
EXAMPLE OF LAB-EXPERIMENT
(SAME EXAMPLE AS IN BERGENHOLTZ 2024A)

• Randomization: Two sessions run pr. day, randomly decided if 9 or 10 AM session was higher / lower pay
• Treatment: “For each correctly solved task, 1 [5 in treatment] DKK will be added to your payment”.
• Controlled setting: All received the same information, same room and tools, and solved the same two
kinds of tasks - addition and anagram tasks

35
IS THIS A PROPER RCT? WHY, WHY NOT?
VEVOX.APP 143-136-950 (1 MIN)

A researcher is interested in the impact of rewards on behavior.


Participants are randomly allocated one of 2 groups.

• Group 1: 50 people are set to do a task as many times as possible


and receive 1 DKK pr. task done.

• Group 2: Another set of 50 people are to do the same task as many


times as possible, and are told that they will receive 1.50 DKK pr. task
and that the goal is to solve 50 tasks.

36
Randomized Controlled Trial
• Allocation to each treatment (not systematically different)
• Everything else the same; no other changes
• 1 (only 1) treatment tried out

Note, each group is not identical – but only randomly different (statistically ‘equivalent’)
RCT creates causality: ‘Two worlds’; we can now compare a world with and without the treatment
37
3 KEY ELEMENTS OF A RCT
(Compared to a sampling strategy – a different “randomization” term!)

1) Randomized process in allocation of participants
2) Treatment (‘intervention’) of one variable (in control vs. treatment group) - decided and implemented by researchers
3) Controlled setting: Only treatment differs across treatment and control group (everything else identical)

An experiment has to abide by all 3 elements


When you assess a design at the exam, assess all 3 elements

38
2020 SPRING EXAM EXAMPLE
READ, REFLECT, DISCUSS 4-5 MIN

Question 2 (16%)
The COBE Lab at BSS, Aarhus University, is a social science lab that allows social science researchers to run
randomised controlled trials, similar in type to the experiment described in Bergenholtz (2024a). Individuals can sign
up to participate in studies via an online system. Before the government closed down universities in Denmark due to
the Coronavirus, the COBE Lab experienced that more individuals than usual cancelled the studies, for which they
had already signed up. This was probably due to more people preferring to stay at home, rather than being around
other people. Now, consider a situation before universities were locked down. Imagine that a study at COBE Lab has
started, and participants have been randomised into different treatments, creating two different groups. As usual,
after the randomisation but still at the beginning of the study, participants are informed about what they are
supposed to do. Imagine that, in the treatment group, they are being told that they are about to be part of a study
which involves participants being physically fairly close to others. This is related to the treatment, albeit participants of
course do not know if they are in the treatment or control group. In this hypothetical scenario, some people therefore
decide to drop out from the study, since they do not feel comfortable being close to others, while the Coronavirus is
beginning to spread in society. It is a common practice in the COBE Lab that participants can end their participation
at any time. Participants in the other (control) group are not physically close to each other, which might be the reason
why no participants drop out of that part of the study.
Question: How did it impact the quality of this RCT, that some participants decided to leave the experiment because they
did not want to be physically close to others?
Exam example & question has been slightly adapted 39
LAB VS FIELD EXPERIMENTS
Lab-experiments: “experiments conducted in an artificial environment where conditions
are much easier to control than in field experiments” (BRM p. 54)

40
FIELD EXPERIMENT I: DOES MANAGEMENT MATTER
(DIFFICULT QUESTION TO STUDY IN THE LAB)

41
FIELD EXPERIMENT II: DOES MANAGEMENT MATTER

42
A FIELD STUDY: IS THIS A PROPER RCT? WHAT ARE
POTENTIAL WEAKNESSES (READ AND DISCUSS, 2 MIN)

Former edition of BRM p 53 (not in current version)


43
WE CONSTANTLY PARTICIPATE IN FIELD EXPERIMENTS

https://twitter.com/Devin_G_Pope/status/1258492772442681346?s=20
https://www.nytimes.com/2016/06/13/insider/which-headlines-attract-most-readers.html

44
ONLY MEN WERE INVITED TO PARTICIPATE IN THIS STUDY. IS IT
STILL A RCT? (VEVOX.APP 143-136-950 , 1-2 MIN)

Seeing the fruits of our labor may make us more productive

The Study: In a study conducted at Harvard University, Ariely


asked participants to build characters from
Lego’s Bionicles series. In both conditions, participants were
paid decreasing amounts for each subsequent Bionicle: $3
for the first one, $2.70 for the next one, and so on. But while
one group’s creations were stored under the table, to be
disassembled at the end of the experiment, the other group’s
Bionicles were disassembled as soon as they’d been built.
“This was an endless cycle of them building and we
destroying in front of their eyes,” Ariely says.

http://blog.ted.com/2013/04/10/what-motivates-us-at-work-7-
fascinating-studies-that-give-insights/

45
ABOUT RANDOMIZATION
(TYPICAL ERROR IN EXAMS!)

Randomization: The process of randomly allocating participants (not, as in quant methods, whether participants are a randomly selected sample of the population)

• Think about it in 2 phases


• First phase: We gather a group of participants. Maybe not representative nor
random sample (cf. men in Ariely 2008)
• Second (key) phase: Participants allocated randomly to each group

• If participants are randomly allocated into 2 groups, probably (as good


as) identical groups
• They might not have turned out identical – which is why you still statistically
compare treatment and control group (cf. Bergenholtz 2024a: p. 2)
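The two phases as a small sketch (hypothetical participant pool, my illustration): phase 1 is whoever signed up; phase 2, the one that matters for the RCT, is the random allocation. The group averages end up close but not identical, which is why we still compare the groups statistically.

import random

random.seed(6)
# Phase 1: a convenience pool, e.g. 100 volunteers (not a random sample of the population)
ages = [random.randint(20, 35) for _ in range(100)]
# Phase 2 (the key phase): random allocation into treatment and control
random.shuffle(ages)
treatment, control = ages[:50], ages[50:]
print(sum(treatment) / 50, sum(control) / 50)  # close, but only 'as good as' identical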
INTERNAL AND EXTERNAL VALIDITY
CRITERIA FOR ASSESSING THE QUALITY OF A STUDY
• Validity: Can we be certain about the results of a scientific study?

• Internal validity: Can we be certain that we have identified the effects of the
treatment (or lack of effect) (typically high in lab-experiments)
• External validity: Can findings from this particular study be transferred to other
settings? (typically low / challenging in lab-experiments)

Internal validity: the study itself. External validity: going beyond this study, generalizing.
47
RCT THEME TO BE CONTINUED IN LESSON 3

48
PHILOSOPHY OF SCIENCE

LESSON 3
EXPERIMENTS - CONTINUED
Carsten Bergenholtz (cabe@mgmt.au.dk)
Department of Management, Aarhus BSS
Aarhus University
SUMMING UP LESSON 2

Figure: Exercise → Mood? Or does Diet (a confounder) drive both exercise and mood? See https://rpsychologist.com/correlation/
2
SUMMING UP LESSON 2
(Field vs. lab experiments)

• Jeff Bezos: Our success at Amazon is a function of how many experiments we run
• Hal Varian, chief economist at Google: If you want to find out what happens if you
change something…you have to change it
3
RANDOMIZATION VS. RANDOMIZATION
Randomization in RCTs:
Purpose: The primary aim is to evenly distribute known and unknown confounding variables across the treatment and
control groups to isolate the effect of the treatment being tested.
Implementation: Randomization occurs after the sample has been selected, and participants are randomly assigned
to different intervention groups within the study.
Outcome: This ensures that the groups are comparable, and any differences in outcomes can be attributed to the
treatment rather than other factors.

Randomization in Sampling Strategies:


Purpose: Here, randomization is used to select a representative sample from a larger population to ensure that the
sample reflects the population’s characteristics.
Implementation: Randomization occurs at the stage of choosing participants for the study from a broader population.
Outcome: The goal is to reduce sampling bias, ensuring that every individual in the population has an equal chance
of being selected for the sample, making the study results generalizable to the entire population.

In summary, randomization in RCTs is about creating internal validity and ensuring that the causal relationship
between the treatment and the outcome is not confounded by other variables. In contrast, randomization in sampling
is about creating external validity and ensuring that the sample accurately represents the population, allowing the
results to be generalized.
ChatGPT-4 generated 4
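The two uses of randomization side by side, as a minimal sketch (illustrative names, not from the curriculum):

import random

random.seed(7)
population = list(range(10_000))  # everyone we would like to generalize to
# Sampling randomization (external validity): draw a representative sample
sample = random.sample(population, 200)
# RCT randomization (internal validity): allocate that sample across conditions
random.shuffle(sample)
treatment, control = sample[:100], sample[100:]

A study can have either, both, or neither: e.g. a random sample with self-selected treatment, or (as in most lab experiments) a convenience sample with properly randomized allocation.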
WHAT IS THE MAIN CHALLENGE WITH THIS
KIND OF CLAIM? (LESS SLEEP → MORE POST-COVID RISK)

5
DOES THIS CONSTITUTE A RCT?
VEVOX.APP 143-136-950 (2 MIN)

Cognitive fatigue influences students’ performance on standardized tests


“Using test data for all children attending Danish public schools between school years 2009/10
and 2012/13, we examine how the time of the test affects performance. Test time is determined
by the weekly class schedule and computer availability at the school. We find that, for every hour
later in the day, test performance decreases by 0.9% of an SD (95% CI, 0.7–1.0%). However, a 20-
to 30-minute break improves average test performance by 1.7% of an SD (95% CI, 1.2–2.2%).
These findings have two important policy implications: First, cognitive fatigue should be taken into
consideration when deciding on the length of the school day and the frequency and duration of
breaks throughout the day. Second, school accountability systems should control for the influence
of external factors on test scores.”
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4790980/

6
AGENDA
• Internal vs. external validity (and lab vs. field experiments)
• Quasi-experiments
• Limits of RCTs

Objective: Learn to assess the quality of an experiment – across scientific disciplines

BRM ch 3 (pp. 50-59) (Experimental design)


Mainly Bergenholtz 2024a & Bergenholtz 2024b

7
VIDEO ON LAB VS FIELD EXPERIMENTS, INTERNAL
AND EXTERNAL VALIDITY AND OTHER TOPICS
https://www.youtube.com/watch?v=TY1xoPon42U
(different order and structure, same message)

8
INTERNAL AND EXTERNAL VALIDITY
CRITERIA FOR ASSESSING THE QUALITY OF A STUDY
• Validity: Can we be certain about the results of a scientific study?

• Internal validity: Can we be certain that we have identified the effects of the
treatment (or lack of effect) - typically high in RCTs in particular lab-experiments
• External validity: Can findings from this particular study be transferred to other
settings? (typically low in lab-experiments)

Internal validity: the study itself. External validity: other studies, generalizing.
9
THREATS TO INTERNAL VALIDITY THAT A
PROPER RCT DISMANTLES (BERGENHOLTZ 2024A: P. 2)
• Selection bias
• No self-selection but randomization

• Confounding variables
• Since treatment and control groups are ‘equivalent’ & only one
difference (treatment) between groups – no other confounding
variables can explain outcome.

• (no) Reverse causality


• The treatment ensures direction (treatment before outcome)

10
Premise: If no internal validity – not possible to generalize

EXTERNAL VALIDITY: GENERALIZING


We usually do not just want to investigate the sample of the study
• Mice from this particular supplier
• Management in these particular Indian textile firms
• How people in the COBE lab respond to incentives
• ….

We want to say something more general across various populations


and settings  external validity

11
THREATS TO EXTERNAL VALIDITY
CRITERIA TO PAY ATTENTION TO (BRM P. 54-55, SEE ALSO BERGENHOLTZ 2024A P2-3)

• Interaction of selection and treatment


• Representative participants?
• Generalizes to other people?

• Interaction of setting and treatment


• What activity is taking place, in what setting?
• Might be particular and not generalize to other activities/settings

• Interaction of history and treatment


• Society changes, studies from 1990s still valid?

• Reactive effects of experimental arrangements

• Hawthorne effect: do these results occur only when participants know they are being observed?
INTERACTION OF SELECTION
AND TREATMENT I

https://us.macmillan.com/books/9780374710453
INTERACTION OF SELECTION AND TREATMENT II
The treatment could be so complicated that it only works on a specialist chef – and can’t generalize to a non-specialist, thus limiting external validity.
INTERACTION OF SELECTION AND TREATMENT III

From students to actual prisoners

“What happens when prisoners are asked to solve


The Prisoner’s Dilemma?” Inmates are... able to
better solve their classical dilemma situation than
students: on average one can expect inmates to
mutually cooperate in 30% of cases, while only
13% of students’ pairs fully cooperate”
https://twitter.com/emollick/status/1304799875926765570?s=20

15
INTERACTION OF SETTING AND TREATMENT:
DO PARACHUTES WORK?
• Results: Parachute use did not significantly reduce death or
major injury (0% for parachute v 0% for control; P>0.9). This
finding was consistent across multiple subgroups.

• Well, if you make people jump from a height of 0.60


meters, they don't make much of a difference!

https://www.bmj.com/content/363/bmj.k5094

Setting: Can refer to the task used (as in a lab-experiment), or the overall setup/environment – as in this example
16
WHAT ARE THE LIMITS TO THE EXTERNAL VALIDITY OF
EXAMPLE PRESENTED IN BERGENHOLTZ 2024A
CONSIDER EACH OF THE THREATS/CRITERIA (3-4 MIN)

Typical type of exam question


If you don’t have the info at hand – consider what you would need to know
17
FIELD-EXPERIMENTS CAN HAVE BOTH (FAIRLY)
HIGH INTERNAL AND EXTERNAL VALIDITY

But often a trade-off: (lab vs. field experiment)

18
SOME NOTES ON HOW TO GENERALIZE
• Not just binary – a spectrum (a lot, medium, a little)
• A study on paying 1 vs. 5 DKK pr. task will probably generalize to 1.1 vs. 5.1 DKK pr. task
• Effect of headache pill can probably generalize
• People are quite similar, chemically speaking
• Much more challenging with humans / organizations
Be nuanced: How much can it generalize, how far?
• Assess multiple criteria (sample, setting, history etc.)
• Can it generalize to much higher sums, to people with high-paying jobs, to people
from other ages, regions, countries…
• Generalizations stronger when based on theory
• In an exam setting, you might not know the theory – speculate on what you find to be limiting characteristics

19

COMPARING LAB VS. FIELD EXPERIMENTS

• Lab experiment

• Field experiment (generally)
+ More real behavior / less artificial settings (ecological)
- Less control over variables
- More difficult to obtain clean randomization
+- Potentially more relevant sample (self-selected?)
- Resource demanding
+ Embedded in real life: Participants might know the setting
+ Stakes often higher
- More difficult to replicate

Exam question re-exam 2018: “Now, imagine you read about the following (hypothetical, lab vs. field) studies. Which of these two studies (A or B) would you trust the most, and why? Please use the different versions of the validity concepts to answer this question.“

20
SURVEY EXPERIMENT
DIFFERENT TYPE OF EXPERIMENT, SAME PRINCIPLES
An experimental treatment can be added to a survey

• Group A gets X question / framing
• Group B gets Y question / framing
→ Random allocation
→ Fairly controlled
→ One (treatment) change

Everyone gets information on a CV based on a given fictitious job posting

• Control: Based on this CV, would you hire [very Danish name]?
• Treatment: Based on this CV, would you hire [very non-Danish name]?

Only captures perceptions, not behavior!
Lab/field RCT can capture behavior
21
SURVEY VS FIELD EXPERIMENTS: PROS AND CONS

22
QUASI-EXPERIMENTS
VIDEO: https://www.youtube.com/watch?v=62QVoC7QBG4
QUASI EXPERIMENTS ≠ RCT
• RCT: Randomly assigns subjects to treatment vs control

• ”…quasi-experiments….have certain characteristics of experimental designs but …do not


fulfill all the internal validity requirements” (BRM p. 57 – see also first paragraph of
Bergenholtz 2024b)

• Quasi: No actual randomization – but ”as if” random and no self-selection into treatment

24
Key question: Has the treatment been randomized
across groups?

https://twitter.com/jenniferdoleac/status/1092057392542552067
25
IS THIS A QUASI-RCT?

Cognitive fatigue influences students’ performance on standardized tests


“Using test data for all children attending Danish public schools between school years 2009/10 and
2012/13, we examine how the time of the test affects performance. Test time is determined by the
weekly class schedule and computer availability at the school. We find that, for every hour later in the
day, test performance decreases by 0.9% of an SD (95% CI, 0.7–1.0%). However, a 20- to 30-minute
break improves average test performance by 1.7% of an SD (95% CI, 1.2–2.2%). These findings have
two important policy implications: First, cognitive fatigue should be taken into consideration when
deciding on the length of the school day and the frequency and duration of breaks throughout the day.
Second, school accountability systems should control for the influence of external factors on test scores.”
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4790980/

26
QUASI-EXPERIMENT: AS GOOD AS RANDOM
Quote from paper: “In Denmark, as in many other places across the globe, test time
is determined by the weekly class schedule and computer availability at
schools….We claim that this creates variation in test time that is as good as random
to the individual.”
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4790980/#si1

Students haven’t self-selected into classrooms / best students have (probably) not
been put into certain timeslots.

Therefore: Quasi-experiment - almost as good as randomization.

Example explained in Bergenholtz 2024b: p. 1

27
HOW TO ASSESS IF SOMETHING IS A QUASI-RCT?
1) Check for Random Assignment:
• RCT: Deliberate randomization by researchers.
• Quasi-RCT: Participants assigned without researcher-controlled randomization.
2) Confounding Variable Scan:
• RCT: Randomization aims to balance confounders.
• Quasi-RCT: Higher risk of confounder influence.
3) Study Context Analysis:
• RCT: Controlled settings with researcher oversight.
• Quasi-RCT: Real-world contexts, where full control isn't feasible.

Consider: How do these factors influence the study's conclusions?

28
ANOTHER EXAMPLE: QUASI-EXPERIMENTS
(FROM BERGENHOLTZ 2024B: P. 2)

• Rolling out a new bonus program – to half of their employees, e.g. those in a particular location (no randomization)
• Maybe other significant changes at the same time? (scan for
confounding variables)
• Office relocation or on-the-job training to some employees
• Maybe more employees in treatment than control were
relocated? (thus confounding)?

• Main strategy to eliminate that confounding threat? → Randomizing

29
THE PROMISE OF QUASI-EXPERIMENTS?!

• Experimental = gold standard

• Yet, do lab-experiments in an artificial environment → real-world insights?
• Field-experiments difficult to execute

• Solution: Quasi-experiments that combine the real world and the experimental method

https://noahpinion.substack.com/p/the-
econ-nobel-we-were-all-waiting

30
TERMINOLOGY: NATURAL EXPERIMENT ≈ QUASI RCT
HAS NATURE OR THE RESEARCHER ‘DECIDED’ THE TREATMENT? (CF. BRM P. 58)

Wiki: “A natural experiment is an empirical study in which individuals (or clusters of


individuals) are exposed to the experimental and control conditions that are
determined by nature or by other factors outside the control of the investigators.
The process governing the exposures arguably resembles random assignment.”

I only use the ”quasi-experiment” terminology in exam questions


So, just focus on identifying RCTs; ‘as good as random’ = quasi

31
Data collected on the same group of students
pre- and post Covid-19 lockdown

32
Lab-experiment

Natural experiment

33
Randomized Controlled Trial
• Allocation to each treatment (not systematically different)
• Everything else the same; no other changes
• 1 (only 1) treatment tried out

Quasi: Can we assume allocation was as good as random? Is the treatment related to the randomization?
Quasi: Can we assume only one difference? Or are there other systematic differences (e.g. created by those that allocated participants)?
34
ECOLOGICAL VALIDITY
(SUB-CATEGORY OF EXTERNAL VALIDITY)

Particular advantage of field-


experiments / quasi-experiments
in the field

Additional validity perspective

BRM p 55

External validity: generalizes to other labs, tasks, societies, times etc.


Ecological validity: is it ”naturalistic”, like the real world?

35
HOW STRONG IS THE EXTERNAL & ECOLOGICAL
VALIDITY? (VEVOX.APP 143-136-950, 1-2 MIN)

Cognitive fatigue influences students’ performance on standardized tests


“Using test data for all children attending Danish public schools between school years 2009/10 and
2012/13, we examine how the time of the test affects performance. Test time is determined by the
weekly class schedule and computer availability at the school. We find that, for every hour later in the
day, test performance decreases by 0.9% of an SD (95% CI, 0.7–1.0%). However, a 20- to 30-minute
break improves average test performance by 1.7% of an SD (95% CI, 1.2–2.2%). These findings have
two important policy implications: First, cognitive fatigue should be taken into consideration when
deciding on the length of the school day and the frequency and duration of breaks throughout the
day. Second, school accountability systems should control for the influence of external factors on test
scores.”

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4790980/

If this was an exam question: How would you answer it?

36
INTENTION TO TREAT(MENT): IS THIS AN ACTUAL RCT?
VEVOX.APP 143-136-950, 1-2 MIN

• Study on how groups perform in an Escape Room


• Research question: Does leadership (having a leader) lead to better performance?
• Randomly assigned to one of two conditions: Control
or Leadership
• In leadership condition: Groups are encouraged to
select a leader
• All other features the same across conditions
https://conference.nber.org/conf_papers/f162245.pdf

37
TO CONSIDER AT THE EXAM
If you are asked to ”assess” an ”experiment”:
• Validity: Internal, external, maybe ecological
• Internal: Is it actually an RCT (the 3 elements) – or a quasi-RCT
• External: Don’t just state ”it’s low” or ”high”, but use (some of the) criteria mentioned in
chapter/slides. What’s the sample, what’s the task, in what time/history was it done…

• Not your subjective assessment: Curriculum based assessment of the quality of the RCT
• Not just a binary (good/bad) assessment: Characterize strengths and weaknesses

38
REFLECT: HOW WOULD YOU SET UP AN
EXPERIMENT TO STUDY THE RQ IN YOUR BP

• Lab or field experiment?


• Which variable to manipulate and what data to collect?
• How to ensure a randomized, controlled setup?

39
LIMITS OF EXPERIMENTS

Carsten Bergenholtz (cabe@mgmt.au.dk)


Department of Management, Aarhus BSS
Aarhus University
WHY CAN’T RCT’S SOLVE ALL OUR RESEARCH QUESTIONS
AND BUSINESS PROBLEMS? (REFLECT, DISCUSS 1-2 MIN)

For example, can you think of topics that can not be addressed via RCTs, be it
field- or lab-experiments? And why?

41
LIMITATIONS OF RCT’S I: OFTEN NOT POSSIBLE
• Long vs. short term: Experiments are usually relatively short-term – how to study the impact of a 5-year organizational change process?

• Sometimes RCTs are impossible – can’t manipulate impact of gender, or economic impact of a depression

• Ethics: Maybe the control group is unethical – usefulness of parachutes, or impact of a psychopath
manager, or impact of a new teaching design (BRM p. 51)

• Speed (and ethics): Maybe no time to wait for an RCT? – usefulness of facemasks etc.

• Dynamic world: The world does not change one (discrete) variable at a time (more in Nelson
2016 and lesson 13) – cf. multipath causality, e.g. macro-economics or organizational culture

42
LIMITATIONS OF RCTS II: DON’T PROVIDE ALL THE
ANSWERS
Even when we have done an experiment…

• RCTs tell us whether but not why a treatment works


• Why would it work elsewhere? (external validity challenge)

• RCTs provide insight into average effects – not into differences across subjects
• Why do people decide to do what they do? (more in lesson 6, on theory)

• RCT can’t stand alone. Need theory (and other research designs) to generalize.

 More in lessons 4-5, 8-9 & 13

43
LIMITATIONS OF RCTS III: WHAT CAN WE LEARN
FROM THEM? (SELECTION BIAS)
Hypothetical example
• Let’s say a company for the first time introduces something new (wellness program) – and
performance improves
• might the improvement be due to the wellness program, or just that something new
happened?

Real example
• Tennessee (a small’ish state) successfully reduced class size (improved performance)
• when California tried the same, they failed: they lacked qualified teachers

• Hence, there can be limits to how much one can generalize from – even great – field experiments

44
LIMITATIONS OF RCTS IV: STATUS AND VALUE
OF RCT DEPENDS ON FIELD

45
BUSINESS ADMINISTRATION AND RCT’S

• Strong business case for business experimentation. Why not more widespread?
• Not aligned with incentives, org culture, and leadership
• But also difficult: Expedia CEO: ”People need to be trained in the scientific method” -> all
new employees get a crash-course (Thomke, Experimentation Works, p. 148)

Therefore: Teach the scientific method / RCT to business school students – to prepare them for
business life and to ensure more businesses run good RCT’s in the future

46
RCT’S: SUMMING UP
• Why are experiments considered the "gold standard"?
• The R and C and T – vs. the counterfactual
• Assessing the quality of RCTs
• Internal vs. external and ecological validity
• (dis)advantages of different types of experiments
• Lab vs. field experiments & quasi-experiments
• Limitations of RCTs

• Experiments: A method that provides insight into general philosophy of science issues
• What is good, certain science
• How to balance certain (high internal validity) with more uncertain but more
generalizable knowledge (high external validity)

47
PHILOSOPHY OF SCIENCE

LESSON 4
RESEARCH DESIGNS
Carsten Bergenholtz (cabe@mgmt.au.dk)
Department of Management, Aarhus BSS
Aarhus University
RCT: MORE EXERCISE, BETTER MOOD?
DISCUSS AND VOTE, VEVOX.APP 143-136-950 (2 MIN)

Imagine the following RCT

• Randomization: 1000 people were randomly allocated to treatment and control


• Treatment: Daily exercise program scheduled for a month vs. control group: no program
• Controlled environment: Everything else similar

• Result: Treatment group reported (statistically significant) better moods compared to control

We can conclude a causal relation: more exercise → better mood?
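A minimal Python sketch (my own illustration, not from the curriculum) of how this hypothetical RCT could be simulated; the true effect of 0.3 mood points and the mood scale are assumptions made up for the example:

import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Randomization: each person assigned to treatment (1) or control (0)
treatment = rng.permutation(np.repeat([0, 1], n // 2))

# Assumed data-generating process: exercise raises mood by 0.3 points
mood = 5.0 + 0.3 * treatment + rng.normal(0, 1, n)

# With random assignment, the simple difference in means estimates the causal effect
effect = mood[treatment == 1].mean() - mood[treatment == 0].mean()
print(f"Estimated effect of exercise on mood: {effect:.2f}")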

2
WHAT IS THE EXTERNAL
VALIDITY?
Well, what if exercise program (treatment) was:
• Run 5 kilometers in a forest – still believe the general claim?

Would we get same results if following treatment


• Run 5 kilometers – on a treadmill
• Lifting weights
• Playing team sports
• ….

All treatment effects might be particular to the chosen treatment (=setting)


• and chosen selection of people, time in history, extent of monitoring
To establish general relation between exercise and mood: Run many studies, with different people,
settings etc. (more in lesson 11)

Keep threats to external validity in mind when you see a study claiming: RCT showed X
3
SUMMING UP LESSON 3

4
LESSON 4: AGENDA AND LEARNING GOALS
Learning goals
• Research designs (RD)
• Experiments
• Cross-sectional
• Longitudinal
• Case study

Not just to be able to summarize different types of RDs, but also to be able to assess research critically, and question if a given RD can support the conclusion.

Notice: There are no 100% unambiguous rules for when a RD is appropriate.

• Identifying RDs, and their strengths and weaknesses


• AMJ 2011: General challenges that RDs (might fail to) resolve
• Examples illustrating strengths, weaknesses and complementarity
FOLLOWING OVERVIEW OF 4 RESEARCH DESIGNS ALSO
AVAILABLE IN 2021 VIDEO

https://www.youtube.com/watch?v=U9GNoIQ0qOw

6
RESEARCH DESIGN VS RESEARCH METHODS

7

RESEARCH METHODS VS RESEARCH DESIGNS BRM P. 47

8
WHERE ARE WE IN THE RESEARCH PROCESS?

Research Question
Research Design
Theory (more in lessons 6-7)
Data collection & analysis
Presenting and discussing findings

9
A GOOD RESEARCH DESIGN IS A DRIVING
FORCE OF GOOD, CREDIBLE RESEARCH

10
RESEARCH QUESTIONS AND RESEARCH DESIGN:
WHAT DO WE WANT TO KNOW?
RCTs can eliminate confounders and establish
causality. Yet, that is not the only interest we have

Examples
• How and why has the introduction of daily exercise programs shaped the mood?
• How much do employees at different age levels exercise, and what are their moods?
• How does the mood level generally change and fluctuate over a long period (a year e.g.)?

In other words
• Maybe we want to understand what happens in specific contexts (case)
• Maybe we want to know if something is correlated (cross-sectional)
• Maybe we really want to understand how something develops over time (longitudinal)

11
DIFFERENT TYPES OF RESEARCH DESIGNS I
• Experiments (cf. lesson 2 & 3)
• Cross-sectional
• Case study
• Longitudinal

In order to pass exam questions on research designs, one has to be able to


a) identify what research design(s) are in play in a given study, as well as
b) assess strengths and weaknesses of the applied research design.
DIFFERENT TYPES OF RESEARCH DESIGNS II
(diagram: before/after measurement points for each design)

• Experiments

• Cross-sectional

• Longitudinal

• Case study

Also presented here: https://youtu.be/L15ZJ8OmpQA


CROSS-SECTIONAL: SNAPSHOT (BRM P. 63-69)
Characteristics
• More than one case (variation)
• At a single point in time – not longitudinal/over time
• Quantitative/quantifiable data

Typical data: surveys, databases (cf. quantitative methods course)

Aim: To find correlations, generalization

IV typically weak(er), EV strong(er) (if random sample)

14
LONGITUDINAL: STUDY OVER TIME (BRM P. 62-63)
Characteristics
• Data collected on the same units (individual, firm, case etc.) at
least twice – capturing change over time
• Ex: How motivation measured in time point 1 is related to
motivation and productivity measured in time point 2

Typical methods: Repeated surveys, databases, ethnographic/qualitative fieldwork

Aim: To examine X over time


• Resolves ‘reverse causality’ challenge – sequence of events
established

Cross-sectional study done twice on same participants over time = longitudinal → internal validity strengthened.
EBAY EXAMPLE: LONGITUDINAL
VIDEO GAMES AND AGGRESSIVENESS: VERSION 1 & 2
WHAT RESEARCH DESIGN ARE THESE STUDIES BUILT ON? (2-3 MIN)
Version 1
In this study, we investigated the extent to which adolescents who spend time playing violent video games exhibit higher levels of aggressive behaviour when compared with those who do not. A large sample of British adolescent participants (n = 1004) aged either 14 or 15 years were interviewed. Over a period of 2 years young people provided weekly reports of how many hours a week they played what they consider violent video games. Furthermore, at the beginning of the 2 year period, after 1 year, and after 2 years they provided information on to what extent they display aggressive behavior (based on Likert-scales in a survey). Results were statistically analysed, to identify if a correlation between playing violent video games and displaying violent behavior can be identified.

Version 2
In this study, we investigated the extent to which adolescents who spend time playing violent video games exhibit higher levels of aggressive behaviour when compared with those who do not. A large sample of British adolescent participants (n = 1004) aged 14 and 15 years. The kids informed us how many hours a week they played what they consider violent video games, and how often and to what extent they display aggressive behavior (based on Likert-scales in a survey). Results were statistically analysed, to identify if a correlation between playing violent video games and displaying violent behavior can be identified.

https://royalsocietypublishing.org/doi/10.1098/rsos.171474 (slightly modified)


Longitudinal
(and quasi-RCT)
Data collected on the same group of students
pre- and post Covid-19 lockdown

18
CASE STUDY DESIGN (BRM P 63-70)
Characteristics
• Bounded organization / location / person / event
• In-depth focus on the unique / context – less about averages and variation (≠ cross-sectional)
• Can both be comparative (multiple cases) or longitudinal (e.g. lengthy study at one site)

Aim: Explain how and why something happened


• Ex.: Why is it difficult to execute a certain leadership style
• More on theory building vs. theory testing in lesson 6

Typical methods
• Interviews, observations, but also surveys, databases
• Usually qualitative, can also contain quantitative elements

“Case study research frequently includes a longitudinal element.” (BRM p. 67)

In business research: often (post) positivistic – more in lesson 7-9

19
SINGLE CASE VS. (MANY) MULTIPLE CASES (BRM P. 67)
• Single case study: Opportunity to focus on the particular context -> rich, deep
insight into the particular case

• Multiple case study (e.g. 4 or 8 cases): Focuses less on individual unique context,
more on variation/similarity across cases (closer to cross-sectional aim)

”Thus, case studies can be associated with both theory generation and theory testing.” (BRM p. 67)

• What is most appropriate depends on context and research question

• Good (shortish) discussion on single vs. multiple cases:


http://amr.aom.org/content/16/3/613.short

20
A FEW FURTHER NOTES ON RESEARCH DESIGNS I
(BASED ON TYPICAL QUESTIONS)
Longitudinal
• Collecting data on alcohol consumption over multiple weeks to obtain a more reliable measurement of
consumption (creating a better measurement) → not longitudinal change, thus cross-sectional.
• Collecting data on alcohol consumption in week 1 and 15 to assess development over time (longitudinal)

Strength and weaknesses of research designs


• Always assess in relation to what the research question is. If it is a correlational question, then a cross-sectional
approach is fine.

When is it a case (study)


• If entities of interest are, somehow, bounded. Employees at a company, students in a class, people at a certain
bar, or following a certain manager in an organization over time

Why and how correlational research might be of interest


• To show that there is no correlation between X and Y variables. Then an experiment would probably also not
find it (game aggression ex)
• If correlational data are collected from different sources, its value increases (game aggression ex)
A FEW FURTHER NOTES ON RESEARCH DESIGNS II
(BASED ON TYPICAL QUESTIONS)

• A study can combine different types of research designs (BRM p. 68)


• E.g., a case study can be longitudinal, if one continuously collects data over time. A case study is typically not considered
longitudinal if one asks a respondent about this person’s behavior over time; in that case it would still only be data
collected at one particular point in time. In contrast, with an archival dataset (cf. Goodreads ex. in lesson 5) where data
covering a longer time period is collected at one particular point in time (2011), the archival nature of the data makes it a
longitudinal approach, since data on book ratings have been collected continuously in an archive. Had the interviewed
employee kept a diary or allowed access to emails, that would also be of a longitudinal nature.

• Cross-sectional and longitudinal studies


• A study can also be both cross-sectional and longitudinal. E.g., one could have data on 100 different firms (cross-sectional)
as well as data on the same firms at different points in time (longitudinal). This would allow a cross-sectional analysis
between the 100 firms, but also a comparison of the same firm in time point 1, 2, 3 etc. (see the sketch after this list)

• A construct
• A construct is a general expression, very similar to the term ‘concept’. Often one would call motivation a concept, while the
way one has measured / speaks of motivation would be constructs; e.g., job satisfaction or intention to quit (cf. Nohria). Any
empirical project relies on the use of constructs.
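The sketch referenced above: a small hypothetical Python (pandas) example of data that supports both a cross-sectional and a longitudinal analysis; the firms, years and profit numbers are invented for illustration:

import pandas as pd

# Hypothetical panel data: the same 3 firms observed in 3 different years
panel = pd.DataFrame({
    "firm":   ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "year":   [2021, 2022, 2023] * 3,
    "profit": [10, 12, 15, 8, 8, 9, 20, 18, 17],
})

# Cross-sectional slice: compare different firms within one year
print(panel[panel["year"] == 2022])

# Longitudinal slice: follow the same firm over time
print(panel[panel["firm"] == "A"].sort_values("year"))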
ANOTHER OVERVIEW OF RESEARCH DESIGNS

23
RESEARCH QUESTION: SHOULD WE BE WORRIED
ABOUT VIDEOGAMES & KIDS BEING OR
BECOMING AGGRESSIVE?

Can be addressed via various research designs, cf. the following 3 examples
VIDEO GAMES AND AGGRESSIVENESS: VERSION 1 & 2
WHAT RESEARCH DESIGN ARE THESE STUDIES BUILT ON?
Version 1 and Version 2: same as before (see the earlier slide for the full texts).

https://royalsocietypublishing.org/doi/10.1098/rsos.171474 (slightly modified)


VIDEO GAMES AND AGGRESSIVENESS: VERSION 1

(Vevox.app 143-136-950, 2 min)

Let’s assume a longitudinal design found a positive link between violent videogames and aggression. Could we be certain about the causal nature of this link? Why, why not?

Version 1
In this study, we investigated the extent to which adolescents who spend time playing violent video games exhibit higher levels of aggressive behaviour when compared with those who do not. A large sample of British adolescent participants (n = 1004) aged either 14 or 15 years were interviewed. Over a period of 2 years young people provided weekly reports of how many hours a week they played what they consider violent video games. Furthermore, at the beginning of the 2 year period, after 1 year, and after 2 years they provided information on to what extent they display aggressive behavior (based on Likert-scales in a survey). Results were statistically analysed, to identify if a correlation between playing violent video games and displaying violent behavior can be identified.

26
https://royalsocietypublishing.org/doi/10.1098/rsos.171474 (slightly modified)
VIDEO GAMES AND AGGRESSIVENESS: VERSION 3
WHAT RESEARCH DESIGN IS THIS & WHICH OF THE THREE DESIGNS PROVIDES
THE MOST CREDIBLE RESULTS, AND WHY? (VEVOX.APP 143-136-950 2-3 MIN)

https://www.nature.com/articles/s41380-018-0031-7
SPEAKING OF VIOLENT VIDEO GAMES
AND AGGRESSIVE BEHAVIORS…

https://nyheder.tv2.dk/video/bklXaHMzNTZCaF9fWEp0ZzNYNUhBM0thSkhvaEplTkY=
ASSESSING ONE STUDY VS. MULTIPLE
REFLECT ON FOLLOWING (HYPOTHETICAL) COMPARISON (1-2 MIN)

Which setup provides the strongest evidence?

A) A range of cross-sectional studies investigate several different violent games, samples from different populations, and different measurements (of aggressiveness e.g.) and show no effect of violent games on aggressive behavior

B) One RCT relying on one particular game and a small’ish sample shows a
positive link between violent videogames and aggression.

More in lesson 6 & 11 (Aschwanden 2016)

29
CASE STUDY VS. CROSS-SECTIONAL VS
LONGITUDINAL DESIGNS

Different internal and external validity & logistical strengths and weaknesses
30
2023 EXAM QUESTION ON QUASI-EXPERIMENTS
(READ, DISCUSS 4 MIN)
Question 2 (15%)
Imagine a large Danish company that offers telemarketing services. The company has 60 different units
across Scandinavia, each with at least 60 employees. It has for years been fairly profitable but also realised
an increasing level of competition. The top management team is worried about its future earnings and
decides to implement a new system that integrates a chatbot (a computer program that uses advanced
technology to imitate human conversations) in the sales process. The idea is that telemarketing employees
would be able to get inspiration from the chatbot on how to optimise their sales. 30 of the 60 unit locations
have been selected to try out this new feature. The selection was partially related to logistical challenges
and the availability of IT expertise. The team that implemented the chatbot visited each of the 30 different
units during a span of 15 weeks. Subsequent analysis of sales performance shows that the implementation
of the chatbot, on average, led to a subsequent and immediate increase in performance, compared to the
units that have not (yet) implemented the chatbots. The analysis controlled for prior performance, size of the
unit and the experience of the manager in charge of each of the units.

You have now received some limited information on the above outlined research design. Based on the
provided information one can infer that it is not a randomised controlled trial. What information would you
have liked to know about this case in order to identify if the setup constitutes a quasi-experiment or not?
Please be specific when explaining what information you need.
31
FROM AN OVERALL CATEGORIZATION OF TYPES OF
RESEARCH DESIGNS TO IDENTIFYING SPECIFIC
CHALLENGES → AMJ 2011

2021 video available here: https://youtu.be/iVBvrr48kDQ

32
AMJ 2011: ON RESEARCH DESIGN
HTTPS://WWW.JSTOR.ORG/STABLE/PDF/23045105.PDF

• Audience: Top-researchers
• Aim: How to optimize research designs, or: key reasons for rejecting a
scientific paper at one of the world’s leading scientific journals
• Provides: Incomplete list of challenges
AMJ 2011: LIST OF KEY REASONS FOR
REJECTION
• Limits of cross-sectional data (when addressing change and causality)
• Omitted variable bias (don’t overlook the confounding / causal variable)
• Don’t use inappropriate samples (if interested in CEO decisions, why study students)
• Engage in careful construct ’creation’ (data should match theory)
• Avoid common method variance/bias (when independent & dependent variables stem
from the same source)

34
Internal validity challenge

CROSS-SECTIONAL DATA ≠ CHANGE & CAUSALITY


• (not) Understanding change: How to know if X impacts Y, rather than Y impacts
X - when data is collected at one point in time?

Productivity → salary increase

Salary increase → productivity

→ Longitudinal data or experimental design needed (or fancy statistical methods)

AMJ 2011 p. 657: “The problem with cross-sectional data is that they are mismatched with
research questions that implicitly or explicitly deal with causality or change, strong tests of
which require either measurement of some variable more than once, or manipulation of one
variable that is subsequently linked to another.”
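A small simulated illustration of this point (my own sketch, not from AMJ 2011): two hypothetical worlds with opposite causal directions produce essentially the same cross-sectional correlation, so the snapshot alone cannot tell us which way the causality runs. The 0.7 coefficient is an arbitrary assumption.

import numpy as np

rng = np.random.default_rng(1)
n = 5000

# World 1: productivity causes salary increases
prod_1 = rng.normal(0, 1, n)
salary_1 = 0.7 * prod_1 + rng.normal(0, 1, n)

# World 2: salary increases cause productivity
salary_2 = rng.normal(0, 1, n)
prod_2 = 0.7 * salary_2 + rng.normal(0, 1, n)

# One-point-in-time data yields (roughly) the same correlation in both worlds
print(np.corrcoef(prod_1, salary_1)[0, 1])
print(np.corrcoef(prod_2, salary_2)[0, 1])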
Internal validity challenge

OMITTED VARIABLE BIAS
(MAYBE CONFOUNDING VARIABLE IS OMITTED)

If a relevant variable is not included, the results could be biased / misleading


• Could be a confounding variable (that influences both independent and dependent variable)

Examples
• Exercise and mood
• School closures and learning
• Sleep and job performance
•…

Thus: Include relevant control variables in statistical analysis (cf. AMJ 2011 p. 659) to account for
confounders and reduce omitted variable bias.

Good video if further interested: https://mru.org/dictionary-economics/omitted-variable-economics
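A hypothetical simulation sketch of omitted variable bias (the coefficients and the ‘health’ confounder are assumptions for illustration): exercise has no causal effect on mood here, yet the two correlate because health drives both; adjusting for the confounder removes the correlation.

import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Confounder: health influences both exercise and mood
health = rng.normal(0, 1, n)
exercise = 0.8 * health + rng.normal(0, 1, n)  # no causal arrow exercise -> mood
mood = 0.8 * health + rng.normal(0, 1, n)

# Naive analysis: a clear correlation (~0.39), despite no causal effect
print(np.corrcoef(exercise, mood)[0, 1])

# Including the confounder: correlate what is left after removing health's influence
print(np.corrcoef(exercise - 0.8 * health, mood - 0.8 * health)[0, 1])  # ~0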


INAPPROPRIATE SAMPLES
• Is the sample appropriate for the research question and theory?

• If we want to explain how an average Danish firm should hire, should we rely on data from Google’s hiring practices?
https://www.weforum.org/agenda/2016/02/googles-4-rules-for-hiring-the-best-employees

• If we want to find out if vaccines work, can we just compare those vaccinated vs. those that are not?

• If we want to explain how investment funds work, should we only speak to existing funds (not those who already failed)?

Internal and external validity challenge


CAREFUL CONSTRUCT ‘CREATION’
(RELATION BETWEEN DATA AND THEORY)

• Construct: Theoretical concept that captures phenomena we are interested in


• Often we cannot easily or directly measure what we are interested in
• Concept: Exercise – Construct: How we measure exercise (and create data from phenomena)

• Does your construct capture the phenomena you are interested in?
• Measuring exercise – by asking people how much they exercise
• Measuring motivation – by asking people about motivation or effort
• Measuring firm performance – short / long term, turnover, profits, growth…
• …
• Reason to use established surveys/measures: Already tested (for reliability, construct
validity)

Internal validity challenge


FORMER CROSS-SECTIONAL VIDEO GAME EXAMPLE –
ILLUSTRATING COMMON METHOD BIAS (READ, REFLECT 1 MIN)
Former version
In this study, we investigated the extent to which adolescents who spend time playing violent video games exhibit higher levels of aggressive behaviour when compared with those who do not. A large sample of British adolescent participants (n = 1004) aged 14 and 15 years. The kids informed us how many hours a week they played what they consider violent video games, and how often and to what extent they display aggressive behavior (based on Likert scales in a survey). Hereafter the variables were statistically analysed, to identify if a correlation between playing violent video games and displaying violent behavior can be identified.

Actual version
In this study, we investigated the extent to which adolescents who spend time playing violent video games exhibit higher levels of aggressive behaviour when compared with those who do not. A large sample of British adolescent participants (n = 1004) aged either 14 or 15 years and an equal number of their carers were interviewed. Young people provided reports of their recent gaming experiences. Further, the violent contents of these games were coded using official EU and US ratings, and carers provided evaluations of their adolescents' aggressive behaviours in the past month. Following a preregistered analysis plan, multiple regression analyses tested the hypothesis that recent violent game play is linearly and positively related to carer assessments of aggressive behaviour. Results did not support this prediction, nor did they support the idea that the relationship between these factors follows a nonlinear parabolic function.

https://royalsocietypublishing.org/doi/10.1098/rsos.171474 (slightly modified)
Internal validity challenge

COMMON METHODS VARIANCE/BIAS


• Common methods variance/bias: Variance due to method
• Independent and dependent variable from same source

• Potential systematic error – if the source is biased, bias influences both independent
and dependent variable in same, systematic way
• Imagine a student grading their own paper
• Imagine a manager rating both work habits and performance of employee – manager might like
a particular employee, skewing both ratings?
• Imagine an accountant in charge of assessing asset quality and financial health of company –
any correlation might be due to CMB

• Preferably different sources for independent and dependent variable – less bias

If a variable is based on actual behavior/performance rather than perception – common method bias can be minimized
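A toy simulation of the manager example (my own illustration, with invented numbers): the manager's liking of an employee leaks into both ratings, creating a correlation even though true habits and performance are unrelated; an independent second source removes the shared bias.

import numpy as np

rng = np.random.default_rng(3)
n = 5000

# True (unobserved) work habits and performance: unrelated by construction
habits_true = rng.normal(0, 1, n)
performance_true = rng.normal(0, 1, n)

# The same manager rates both, and how much they like the employee
# biases both ratings in the same direction (the shared source)
liking = rng.normal(0, 1, n)
habits_rated = habits_true + liking
performance_rated = performance_true + liking

print(np.corrcoef(habits_rated, performance_rated)[0, 1])  # ~0.5, purely CMB
print(np.corrcoef(habits_rated, performance_true)[0, 1])   # ~0, independent source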
VIDEO GAME EXAMPLE REVISITED – HOW TO
ADDRESS (ALL?) RESEARCH DESIGN CHALLENGES

• Limits of cross-sectional designs (change & causality direction) - via longitudinal and RCT designs
• Omitted variable bias – via RCT’s, careful inclusion of confounding variables and case studies
• Appropriate samples – the young we are interested in, both aggressive and not
• Construct creation – studying perception, behavior, archival data
• Avoid common methods bias – unbiased assessment of behavior from multiple sources

41
OVERALL AIM: IMPROVE YOUR ABILITY TO ASSESS IF OTHERS’
CLAIMS ARE VALID / CREATE GOOD PROJECTS YOURSELF

https://www.nytimes.com/2018/10/21/smarter-living/tips-better-morning-routine.html

https://towardsdatascience.com/correlation-vs-causation-a-real-world-example-9e939c85581e
https://www.rebeccabarter.com/cv/talks/Cal_Day_Presentation.pdf

42
CORRELATION OFTEN CONFUSED WITH CAUSALITY
IN MEDIA STORIES ABOUT SCIENCE

50 studies, with 1.3 million social media shares


https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0196346
43
SUMMING UP LESSON 4

Lesson 5: Study cafe


Practice assessing research
designs / exam situation

44
PHILOSOPHY OF SCIENCE

LESSON 5
STUDY CAFE 1
Carsten Bergenholtz (cabe@mgmt.au.dk)
Department of Management, Aarhus BSS
Aarhus University

Agenda

1) Identify what research design a study is based on


2) Assess strengths and weaknesses of a given research design

What is this course about?


Not just knowing what a research design is, but to be able to
critically assess (research) articles, and consider if the conclusions
are supported given the research design.

2

A) Book reviews and rewards, version 1


(discuss 5 min)

Imagine a study that investigates all books reviewed on a book review site
(about 50.000 books). The study has data on the average review rating (scale
from 1 to 5) submitted for all books, in all of 2011 - thus, one average number
for each book. Then the study compares if the average review rating a book
got, is correlated with the likelihood of getting a book award from a specific
institution. The award was given in the summer of 2011. Results show that the
reviews books receive are correlated with the likelihood of getting an award
(the higher the rating, the higher the likelihood of an award). One might then conclude that
getting a good average review rating increases the likelihood of getting an
award.

a) What research design(s) are in play here? (case study, cross-sectional,


longitudinal, (quasi) experiment)
b) Is it reasonable to draw this conclusion? 3

Feedback on A)

4
B) ‘Book reviews’ and ‘awards’, version 2

(discuss 5 min)

In 2011 we set out to compare thousands of reader reviews posted


on Goodreads.com of 64 English-language books that either won or were short-listed
for prestigious book awards between 2007 and 2011. We had access to time-
stamped reviews posted daily between 2007 and 2011, and we thus know the
average review rating a book receives for any given month in the entire time period.
We find that books that won a prize (e.g. in August 2009) tend to attract more readers
and reviews (in the rest of 2009 and 2010 and 2011) following the announcement of
an award. Readers’ ratings of award-winning books tend to decline more strongly
following the announcement of an award relative to books that were named as
finalists but did not win.

a) What research design(s) are in play here? (case study, cross-sectional,


longitudinal, (quasi) experiment)
b) Why can we be more confident about a causal effect in this design, compared with
the design in the prior version 1?

Kovács, B. and A. J. Sharkey. 2014, "The Paradox of Publicity: How Awards Can Negatively Affect the Evaluation of Quality."
Administrative Science Quarterly 59(1): 1-33. (slightly modified abstract) 5

Feedback on B)

6
C) What are strengths / weaknesses of this study + how could the authors’ claim (see title) be strengthened by a new, different RCT? (15-20 min)


The following information is adapted from a study on how students can take the most effective notes while studying; using
a laptop or writing longhand with an actual pen. The study is carried out by Mueller and Oppenheimer and published in
Psychological Science (2014): “The Pen Is Mightier Than the Keyboard: Advantages of Longhand Over Laptop Note
Taking”.

Case description
How should one take notes while studying, e.g. in a class room setting? It is examined whether students learn the most
effectively (e.g. get the best grades) when taking notes using a computer or taking notes using a pen and paper. The
simple message is that students learn far better when writing with a pen, in contrast to typing on a computer. Mueller and
Oppenheimer carried out two lab-based experiments in order to test the varying hypotheses. In total 327 (primarily
psychology) students (about 20-22 years old) were involved in the two experimental studies; students were allocated
randomly to one of the following settings: either they had to use a computer to take notes, or they had to rely on the use of
pen and paper. All students were given a 45 min lecture in organizational behavior, and then tested immediately after the
lecture and a week later. The 15-min test consisted of remembering both theories and concepts.

Intuitively it could seem plausible that the reason why not using a computer would be beneficial is that wireless internet or
other technological distractions didn’t disturb the student while trying to listen to and learn what was being taught.
Classroom behavior clearly shows how much time students spend on the internet, both doing useful fact-finding search
and wasting time on Facebook. However, the researchers emphasize that the experimental setting did not allow the use of
any internet, and the reason why using pen and paper was more effective is – they argue – because it facilitates the
learning process. When taking notes on a computer, students tend to write long sentences, while when using a pen, the
students attempt to abbreviate and simplify what is being taught. Students using a pen simply wrote fewer words.

This could be an exam challenge – discuss in detail what your answer would be.
7

Feedback on C) Lab-study on note-taking

8
Assessing quality of RCT research design: What to pay attention to (rules of thumb) I


• Is this an actual randomized controlled experiment / are there
issues with the internal validity?
• Are all 3 elements ‘completed’ (randomized, controlled, one trial/treatment)
• If not, is this ‘as good as’ an experiment? (i.e. a quasi-RCT, like the elementary school study)

• How strong is the external validity


• Use the criteria listed in sessions 2-3 (selection, setting, history, Hawthorne)

• How strong is the ecological validity


• E.g.: Solving math problems in a lab-setting (quite weak)
• E.g.: Field experiment (generally strong, since in the field)

• When assessing validity


• Don’t make a binary assessment (good vs. bad). Characterize strengths and weaknesses, be specific,
use criteria, assess the sample, task etc. Maybe compare to examples seen during the course.

Not your personal assessment: Curriculum based assessment of the quality of the RCT
9

Assessing (quality of) research designs: What to pay attention to, in general (rules of thumb)
• Don’t focus on the content of the study, but research design decisions
• Is it a longitudinal, cross-sectional, RCT, case study design?
• What data is collected? How many times is data collected? Words or
numbers? Focus on one context, or a larger sample (of people/firms)?
• What sample does the study rely upon?
• How did we end up with this sample, i.e. is selection bias an issue? (cf.
WW2 airplane example) Does the conclusion match the sample?
• Might there be important data left out (omitted variable bias / confounders)?
• Are the variables in the study capturing what they set out to measure?
• Is all data collected from a single source, or multiple different ones
(common methods bias, cf. video games examples)?
• Does the overall design, sample and the data match the (causal) conclusion
made?
• Does the evidence and data actually speak to the conclusions and theory?

• Not an exhaustive list! Not clear-cut rules, therefore: Practice!
10


D) Assessing research designs

Danish supermarkets (4-5 min)

A study by Yougov asked a representative


sample of the Danish population, 1023
respondents, the following 2 questions:
a) which supermarkets they have been to in
the last 6 months
b) if they have encountered a mistake (wrong
price etc.) on their receipt at the given
supermarket within the last 6 months

Numbers to the right indicate (in %) how many


customers of each chain have experienced a
mistake on the receipt within the last 6 months.

Based on this it is claimed that some


supermarkets make more mistakes, than others.

Please identify the main challenge with this


research design
11

Feedback on D) Supermarkets

12

Assessing research designs: What design is this, and what is the main weakness of this study? (3-4 min)

A researcher has collected data on the performance of children in


various types of elementary schools. The quantitative results, comparing
final year grades as well as salary later in life, seem to clearly show that
children that have attended private schools (which cost money to sign up
to) perform significantly better than children attending public schools
(which are free) – both getting higher grades in school as well as higher
salary later in life. The researcher writes up the results, suggesting
allocating governmental support for allowing more children to attend
private schools. Why is this suggestion not well supported?

13

14

E) 2020 Spring exam, question 1: Please discuss and


outline how you would answer the questions (20-25 min)

• See exam question on the next slide.


• Note, that in the 2020 exam you would have had about 1
hour in total to provide an answer.
• Consider what elements are relevant to include, and what
the core of your answer would be.

15
2020 Spring exam: Question 1 (16%)
Please identify the research design that the research study presented in section i) relies on. Furthermore, please assess and explain if the interpretation (presented in section ii) of the research study (presented in section i) seems credible. In other words, does the research design applied in the study justify the interpretation?

i) Research study
Is there any link between individuals having played a musical instrument at any point in their life, and how
well they perform on cognitive (IQ) tests, compared to people that have not played a musical instrument? In
a study that researchers made, 103 healthy, elderly humans complete a range of questions and tests, in
order to capture their IQ. In addition, at the same time the researchers also asked the 103 individuals if they
had played any musical instruments during their lifetime, and for how many years they had done this. The
researchers also contacted the study participants’ family members, in order to check this information.
Furthermore, the researchers collected data on the current health as well as current physical activity, e.g. if
the 103 individuals currently engaged in any sports. This data makes it possible to run a statistical regression
analysis, where the researchers could contrast and compare the cognitive ability of the group of people that
had played musical instruments in their life, with the group of individuals that had not, or at least had played
less. In the analysis, the researchers could control for how healthy the individuals currently were.

ii) Interpretation of the research study


Many individuals want to make sure they also have a sharp and functioning mind when they get older. How
can one do that? Well, it turns out one should play an instrument, to maintain your IQ and wits. Old people
who have played a musical instrument in their lifetime are smarter than similarly aged people, who did not
play a musical instrument. More specifically, results of a study on the IQ of a range of elderly individuals
show that people who had played a musical instrument performed better on the mental tests and questions.
Results also show that if one had played the instrument for many years, the results on their tests were even
better. Furthermore, it also became clear that individuals that were in better physical health condition,
performed even better. So, find a musical instrument, and begin to play, and maintain your brain.
16

A) Book reviews and rewards, version 1


a) What research design(s) are in play here? (case study, cross-sectional,
longitudinal, (quasi) experiment)

b) Reasonable conclusion? Weak internal validity: Maybe high review rating generated after getting the award in summer 2011?

17
B) ‘Book reviews’ and ‘awards’, version 2
The study’s setup
• A given book (X) receives ratings from 2007
• Intervention: Award given to X book in August 2009
• The book (X) continues to receive ratings in 2009-2011: Higher or lower than
before?
1) Thus a) longitudinal study + b) weak form of quasi-experiment
• Not RCT because the books are not selected at random (like test time in
elementary schools): A particular (probably good) type of books get the
awards
• One can argue for quasi: compare the one group with the other
• E.g. compare the effect of the award: reviews before and after the award
• Books with awards compared to books without awards

2) Even though one platform is examined, not categorized as a case study


• Focus is on variation and average effects, not how readers (qualitatively)
perceive meaning and value of books.
18
A short note expanding on the book award example: Let's imagine how different kinds of research designs can investigate the link between getting a book award and the quality of the reviews that the book receives. I

Cross-sectional
• One could have data on 1000 books, and their awards, and (quality of) reviews at one point in time
in 2010. This would enable us to correlate what reviews books that get an award have. However, let's
say the award is handed out in the middle of the year. If the books with awards have high reviews, we
do not know if these reviews have emerged because the book got an award, or if books with good
reviews get an award - since we only have data on the books at one point in time.
• Another angle. Let's say we have data on the nr. and quality of reviews on the exact day a book gets
an award (or not). We might be able to show that books with good reviews are more likely to get an
award, but we would not know if there is any clear causal relation. Maybe good books lead to
good reviews and awards - but maybe something else can explain it. Maybe books by famous
authors are more likely to get good reviews and are also more likely to get awards - even though
the books really aren't better than other books.
• The key advantage here (compared to, e.g., the case study approach) is that one can create a quite
large sample. Thus, if one had selected 1000 books in the year 2011 (instead of 2010), there is a
reasonable chance one might get the same effect - unless completely different kinds of
books are picked in each year. So, having data from multiple years or investigating the relation to
different kinds of awards will strengthen one's ability to generalize from the particular dataset to other
situations.
• One could also engage in cross-sectional survey designs where one sends out a survey to a lot of
people, asking them what they think of awards. I won't go further into this here, but just to emphasize
that correlational research designs could look very different, depending on one's angle.
19
A short note expanding on the book award example: Let's imagine how different kinds of research designs can investigate the link between getting a book award and the quality of the reviews that the book receives. II
Longitudinal
• In the longitudinal design we have multiple data points on the reviews and if they get an award at a
given point (we still investigate the same 1000 books as in the cross-sectional design). So, let's say
we have the average review received in each month of a given year, from 2009 to 2015. If a book
gets an award in August 2011, we can now track if the average quality of reviews changes over
time. If the review quality decreases it is likely that getting an award has attracted different kinds of
readers that might think differently of the book or just write different kinds of reviews. However, we
can't rule out that the book got an award because it got a lot of attention (let's say it was mentioned
in Oprah Winfrey's talkshow in early 2011) and that this is both the reason for it getting award later
on, as well as the change in review quality (omitted variable bias). A more sophisticated study would
then statistically control for "was the book mentioned on TV shows" in order to eliminate the impact
of this. This is of course challenging, not just because one needs to find the data, but also one needs
to define what TV shows to include, what it means to be mentioned etc. Maybe some books are
mentioned with a bad review on a TV show, and that should maybe be categorized differently? This
is related to Nelson (2016) and his point about heterogeneous variables and numerous influences.

Experiment
• In an experimental design one randomly assigns books with the award or not. So, out of the 1000
books, 50 randomly selected books get an award. We can then investigate if getting an award has
any identifiable impact on the quality of reviews. Since we know that the only systematic difference
between books is if they got this particular award or not, we can be more certain about the causal
effect. Maybe some books were also mentioned on Oprah's show, but if we study many books, such
singular effects should be averaged out.
20

A short note expanding on the book award example: Let's imagine how
different kinds of research designs can investigate the link between getting
a book award and the quality of the reviews that the book receives. III

Case study
• One could also interview readers of books and ask them if a book getting an award a) influences
them to buy a given book, and b) if it also influences whether they write a review and what review they
write. Maybe some readers state that they had really high hopes for this book because of the
award it received, but the high expectations were not fulfilled. One could thus try to inductively
infer that when the individual reads a book that did not get an award, they don't have the same
high expectation and the review might become better.
different review at a local, cheap cafe vs. a (more expensive) restaurant with multiple Michelin
stars. However, we need to keep in mind that we get individuals' self-perceptions. Maybe readers
don't really know why they buy books. Maybe they think that the award has one impact, but it
actually influenced their thinking differently. Nevertheless, a case study design can get to the
question of why people buy something, in a more in-depth way than a survey e.g. could, or the
data that the above 3 designs imply. Several long'ish interviews with different kinds of readers
could generate insight into how different kinds of readers seem to think differently about awards
and reviews. While the smaller scale involves drawbacks, there are also key advantages
associated with this approach.

• Overall, different designs allow different kinds of conclusions, and different certainty in the conclusion.
21
C) Is the Pen mightier than the Sword? A few notes

• Since lab-based RCT, high internal validity


• Students are a relatively homogeneous group, which enhances internal validity
• In a lab setting, so very well controlled: treatment only difference

• External validity
• Which other types of students can this be transferred to, e.g. in terms of age, educational level, countries
etc. (cf. interaction between selection and treatment)
• Might depend on type of course: What if we had to take notes on equations / drawings, or words (cf.
interaction between setting and treatment). What if the test is different, e.g. analyzing a case.
• Might depend on the time the study is carried out: Are we worse at taking long-hand notes in 2023,
compared to decades ago? (interaction between history and treatment).
• Maybe students worked hard because the study was run by their professors? Maybe difference is therefore
smaller because everyone worked harder. (Hawthorne effect)

• Ecological validity limited (since lab-experiment)


• How artificial is the learning and test situation?
• Performance measured in a short 15 min test right after / a week later: How is regular student life different?
• In regular student life one has many notes, across courses – easier to keep track of on a computer
• One might want to share notes, online
• Computer written notes are always readable, vs. handwritten notes

• What about Ewriters, where you write with a pen on a screen, and get the advantage of both?

Note other researchers have tried to replicate the results and ‘failed’: The pen is not mightier than the sword. https://twitter.com/HeatherUrry/status/1289013799828090880?s=20
22

D) Assessing research designs: Danish supermarkets

Large sample, data shows difference in responses

However
• Different kinds of customers in different supermarkets?
Who looks at the receipt?
• Maybe Netto customers go there daily, while you only go
to Bilka once a month + maybe buy a lot?
Selection bias and risk of confounding variables

We cannot with certainty infer which supermarket makes the most mistakes.

23
E) 2020 Spring exam: Question 1, a few notes

In order to pass this question, the following should be established in the answer:
• Identify that the research design is cross-sectional
• Identify that a causal claim is made
• Internal validity is weak’ish: Explain why the causal claim is not warranted due to the
cross-sectional nature of the data, e.g. by referring to the risk of confounding variables
(omitted variable bias), or that the causality could be reversed i.e. high IQ could lead to
playing a musical instrument.

Additional points can also be made


• Explain how the external validity is limited (only studied healthy, older people)
• Common methods bias is reduced by asking family members if the person has played a
musical instrument.
• Omitted variable bias is reduced by including the health of the individuals, but other
factors might still be confounding / unaccounted for (education, demography etc.).
• The quality of the constructs: E.g., is ‘wits’ the same as IQ? (Nelson 2016 and AMJ 2011
discuss this challenge).
• Selecting on the dependent variable: We only interview people who made it to this age – but what
about those who died before they got old? So, survival bias, as in the WW2 planes example
• This is not an exhaustive list of how to generate points.
24
PHILOSOPHY OF SCIENCE

LESSON 6
WHAT IS A THEORY
Carsten Bergenholtz (cabe@mgmt.au.dk)
Department of Management, Aarhus BSS
Aarhus University
CONCERNING CHATBOT ACCESS
Dear students,
As promised in the first lesson, you can get access to a ChatGPT-4 driven, customized chatbot. Let's call it Phil (as in Philosophy of Science). Roughly
speaking, we have created a platform, where any question you ask is answered by a ChatGPT-4 search tool that relies on parts of the curriculum to
answer your question.

This is a pilot project and - as far as I know - the first of its kind at our university. A few things for you to consider before using the platform:
•Purpose: The platform is designed to only answer questions related to this course. Phil costs money, and therefore search has to be limited to the
course. If we encounter usage that goes clearly beyond the topics of the course, we can (and will) shut down the individual's access.
•Validity: While I have tested Phil, I cannot guarantee that all answers are true - or identical to what I would have answered. However, I am very confident
that the vast majority of answers will be as good as any I would have generated and better than using any other available chatbot. I will continuously
monitor the questions posed and answers provided, in order to ensure the quality of the answers. If errors emerge, I will of course flag them.
•Privacy: Please note that this also means that the questions you ask are stored, linked to your user id. However, my colleague (Oana Vuculescu) will only
send me the questions and answers, not user information. I am honestly not interested in who asks which questions, just the nature of the questions and
quality of the answers. Btw., nothing is sent to OpenAI (the owner of ChatGPT).
•Content: Copyright limits how much of the curriculum I can upload to Phil. Main documents are prior lectures, various texts I have written and good
student exam answers. Phil is probably strongest in relation to Research Design and Theory (lesson 2 to 6), while it also does contain information about
the rest of the curriculum.
•Usage limit: We are not entirely certain how costly this is going to be, so due to a limited budget we are for now setting a limit to 30 searches pr. week.
We will revisit this limit.
•Usage, how to: Note that when you have received an answer to your question, you can click on the citation (below the answer) and get information on
where the chatbot retrieved the answer from.
•Plagiarism (important!): Different students can get a) identical (!) answers to similar questions and b) answers can be very similar to material in the text.
Hence, if you want to use answers from Phil in the exam, you will have to quote it, in order not to plagiarize from the curriculum or other students. More
in lesson 12.

I might adjust the settings and functionality as we go along. If you encounter any errors or chatbot 'behavior' that you find odd, please let me know at
cabe@mgmt.au.dk. Link: https://app-backend-ubqtrwmmqio3c.azurewebsites.net/ (you should have access since Microsoft Azure is connected to your
AU email account). Best regards, Carsten
2
ANOTHER RESEARCH
DESIGN EXAMPLE:
HARM OF SCREENTIME?
Journalistic causal conclusion: Letting young children sit in front of a screen for
too long can have significant negative consequences…
Actual correlational claim of study: Infants who had around two hours of
screen time per day had weakened cognitive abilities later in life.
The study’s longitudinal research design: 437 one-year-olds, same children
examined when 18 months old and again at 9 years old.

Confounding variable: "I could imagine that those who give their children a
screen at the age of one are parents who may demand less of their children or
are less interested in their children…"

https://ekstrabladet.dk/nyheder/samfund/nyt-studie-chokerer-saadan-paavirker-skaermtid-boernene/9634816

3
AGENDA
HA / BSc bachelor project course description:
• Empirical thesis: “The purpose is to collect knowledge about a given topic by relating theory to
empirical observations, for example by using existing data from databases, questionnaires, interviews,
observations, simulations or experiments”
• “Irrespective of the topic chosen, the use of relevant theory and literature is fundamental to the thesis.”
https://kursuskatalog.au.dk/da/course/123045/Bachelors-Project

However: What is a theory, what do we use it for, and how is the use of theory different
depending on your approach?

4
VIDEO FROM 2021 ON WHAT THEORY IS (NOT)

https://youtu.be/NF1059J7KBo

Based on different curriculum, same core content

5
RESEARCH QUESTION – RESEARCH
DESIGN - THEORY

Theory
• Informs relevance of research question
• Guides choice of research design
• Shapes how to measure and analyze data
• Provides framework for interpreting results
• …

6
A THEORY IS NOT JUST A THEORY (NYT 2016)
OR: THEORY IN EVERYDAY LIFE VS IN RESEARCH

• “In everyday conversation, we tend to use the word “theory” to mean a hunch, an
idle speculation, or a crackpot notion. That’s not what “theory” means to
scientists.” (NYT 2016)

• Ancient Greek word for theorizing: Attentive observer sent to observe foreign
practices and “theorize” about them (Swedberg 2014: p. 19)

• “In science, the word theory isn’t applied lightly,” Kenneth R. Miller, a cell biologist
at Brown University, said. “It doesn’t mean a hunch or a guess. A theory is a system
of explanations that ties together a whole bunch of facts…” (NYT 2016)
https://www.nytimes.com/2016/04/09/science/in-science-its-never-just-a-theory.html?_r=0

7
THEORIES AS MAPS – NAVIGATION GUIDE
“To judge a map’s quality, we can see how well it guides us through its territory. In a
similar way, scientists test out new theories against evidence. Just as many maps have
proven to be unreliable, many theories have been cast aside.” (NYT 2016)

• Different maps / theories for different purposes

• But what is a theory? – First, more about what researchers do

https://www.youtube.com/watch?v=4WVelCswXo4&t=9s

8
FIRST: WHAT DO RESEARCHERS DO? I
DESCRIPTIONS VS EXPLANATIONS VS PREDICTIONS

Descriptions vs explanations
"Empirical analyses in economics have diverse
How much, how
goals—all valuable. Some analyses advance many etc.
knowledge by uncovering new facts or providing
richer descriptions of old facts ... Other analyses
seek to understand the [theoretical] mechanisms
producing outcomes with an eye toward
interpretation and counterfactual policy
evaluation."
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5804814/ (authored by Remember: Need theory to make
Heckman, 2000 Nobel Prize winner in Economics)
descriptions – e.g. to define & measure GDP,
inflation, cost of living etc.
Yet a description is not a theory
9
WHAT DO RESEARCHERS DO? II
DESCRIPTIONS VS EXPLANATIONS VS PREDICTIONS

Predictions vs explanations
Predicting future events – using large datasets
Remember: Need theory to define/analyse variables
Yet: A prediction is not necessarily a theory

For example: A model might predict employee success based on historical performance data – without explaining why (some succeed, others don’t)

Good descriptions and predictions might lead one to ask:
‘Why’ do we find these relations? → explanations
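A minimal sketch of the prediction-without-explanation point (hypothetical data and variable names, invented for illustration): a fitted model can predict reasonably well, while its weights carry no theoretical explanation of why employees succeed.

import numpy as np

rng = np.random.default_rng(4)

# Hypothetical historical data: three past performance metrics per employee
X = rng.normal(0, 1, (200, 3))
y = X @ np.array([0.5, 0.2, 0.1]) + rng.normal(0, 0.5, 200)

# Fit a purely predictive linear model (ordinary least squares)
weights, *_ = np.linalg.lstsq(X, y, rcond=None)

print(X[:3] @ weights)  # predicts who is likely to succeed...
# ...but the weights themselves offer no theory of WHY they succeed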

10
Ok, so research often (not always) involves aiming to explain something. But what does
that mean?

11
A THEORETICAL EXPLANATION
(BERGENHOLTZ 2023A SECTION 1.3)

‘Abstracting away’ → ignoring many details, in order to generalize

12
ELEMENTARY SCHOOL QUASI-EXPERIMENT EX
WHAT IS THE ”ABSTRACTED” EXPLANATION? (DISCUSS 1 MIN)

“We identify one potential source of bias that influences children’s performance
on standardized tests and that is predictable based on psychological theory: the
time at which students take the test. Test time is determined by the weekly class
schedule and computer availability at the school. Using test data for all children
attending Danish public schools between school years 2009/10 and 2012/13,
we find that, for every hour later in the day, test scores decrease by 0.9% of an SD.
In addition, a 20- to 30-minute break improves average test scores. Time of day
affects students’ test performance because, over the course of a regular day,
students’ mental resources get taxed.”
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4790980/, (p. 2621, order of sentences slightly modified)

13
A THEORY IS NOT

• List of references
• Data
• Hypotheses
• List of variables
• Graphs

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4790980/
14
AN ABSTRACTED DESCRIPTION CAN GUIDE US AND PREDICT –
BUT NOT NECESSARILY INVOLVE THEORETICAL EXPLANATION
OF HOW OR WHY

15
WHAT EXPLAINS EXERCISE I (DISCUSS 1-2 MIN)

How many different theoretical explanations for why people exercise (so much / so
little) can you think of?

16
WHAT EXPLAINS EXERCISE II
Let’s say we investigate why people exercise (so much / so little)

• Individual explanations
• Intrinsic motivation (enjoyment)
• Extrinsic motivation (social status)
• Individual well-being (stress, sleep)
• …
• Structural explanations:
• Economic factors (affordability)
• Urban design (availability of parks/gyms)
• Cultural norms (social values)
• …
• Biological explanations
• Genetics
• Age, gender
• …

In a project one would usually not investigate all possible theoretical explanations
17
HOW THEORY CAN HELP
GENERALIZE

• How can we know if these results transfer to other settings (= generalizes / external validity)?

• Theoretical explanation of why people’s behavior is impacted
• Some business contexts allow online meetings, others not – limits generalizability?

• Theory -> informs generalizability

https://twitter.com/I_Am_NickBloom/status/1512446919875842051 18
THEORY AND GENERALIZATION IN PHYSICS,
MEDICINE ≠ BUSINESS ADMINISTRATION

• Theoretical explanations – usually – more universal in e.g. physics and often also medicine
• Atoms, cells etc. less different than humans & organizations

• Hence, a physics analysis of X atoms usually transfers to other atoms ≠ business administration

19
TWO DIFFERENT METAPHORS FOR WHAT
CONSTITUTES A THEORETICAL EXPLANATION
What does it mean to have ‘theoretically explained’ something?

An explanation – the causal logic


• What produced the outcome?
• Predicting relationships
• Causal diagrams can be drawn

An explanation – understanding
• Make sense of a situation
• Subjective meanings/experiences

https://www.nature.com/articles/s41390-018-0071-3

20
CAUSAL EXPLANATIONS I

Incentives → Motivation → Effort → Performance (Expectancy theory – an economic theory)

Social comparison (Equity theory – a socio/psychological theory)

21
CAUSAL EXPLANATIONS II
COUNTERFACTUAL

• We investigate what happens if incentives are increased

Incentives → Motivation → Effort → Performance

• Counterfactual: Can – theoretically – infer what happens if we reduce incentives or the context changes
• If you only have data on X setting – how do we know if we can transfer to Y?
• We need a theoretical explanation to be able to transfer from X to Y setting

• Explains why RCTs are popular in this conception of ‘theoretical explanations’ (counterfactual is provided)
22
EXPLANATION AS UNDERSTANDING I

Bergenholtz 2024c, section 2.2

• Natural sciences vs humans and organizations


• Planets and atoms have no intentions that we can make sense of

23
COMPARING THE TWO METAPHORS I
CAUSALITY VS UNDERSTANDING
Explanation as causality:

Explanation as understanding (different focus/RQ):


• Investigate how/why individuals experience and make sense

• For example: What do I think about the incentive change? What do I think was
changed, and how does this matter for me? Why does it matter what incentives
others get? (intentions, beliefs, behaviors)

• Not just about getting data that can validate quantitative data – but open up
rich(er) perspective and make sense of situation
24
EXPLANATION AS UNDERSTANDING II
ALSO CONTAINS CAUSAL ELEMENT?

Bergenholtz 2024c, section 2.2

• Understanding perspective can, but does not have to, make causal claims.
• Understanding is primary, causality secondary

• Often different research questions – not how incentives impact performance (for example)

• World is complex (more in lesson 13), simple causal relationships less likely

25
COMPARING THE METAPHORS II
HOW DO WE KNOW IF EXPLANATIONS ARE GOOD?

Causality: Did the answer ‘hit the bullseye’ – comparison with the world (validity concepts)

Understanding:

Bergenholtz 2024c, section 2.3

26
WHAT IS YOUR IDEA OF A THEORY - IN YOUR
BACHELOR PROJECT FOR EXAMPLE
(VEVOX.APP 143-136-950, 1-2 MIN)

27
INDUCTION VS DEDUCTION
(VEVOX.APP 143-136-950, REFLECT AND DISCUSS 1-2 MIN)

Imagine one has a dataset consisting of 1000s of book reviews; incl. data on age,
gender, educational level and personality of those writing the reviews. One now
examines the dataset in order to see if one can find any correlations between age
groups, educational level and various types of personality and the review rating
provided. Results are then produced based on this analysis.

Is this an inductive or deductive approach?
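To make the scenario concrete, a minimal sketch (file name and column names are hypothetical) of what such a theory-free scan of the dataset could look like:

```python
# Minimal sketch of the scenario above: scan review data for any
# correlations, without a theory-derived hypothesis stated in advance.
import pandas as pd

reviews = pd.read_csv("book_reviews.csv")  # hypothetical dataset

attributes = ["age", "educational_level", "extraversion", "openness"]
correlations = reviews[attributes + ["rating"]].corr()["rating"]
print(correlations.drop("rating").sort_values(ascending=False))
```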

28
HOW DO WE GET TO A THEORETICAL EXPLANATION?
THEORY BUILDING VS THEORY TESTING

Theory building: Induction
Theory testing: Deduction

Theory as outcome or starting point of empirical process

De Vaus 2001, p. 3-4


29
INDUCTION VS DEDUCTION
Imagine one has a dataset consisting of 1000s of book reviews; incl. data on age,
gender, educational level and personality of those writing the reviews. One now
examines the dataset in order to see if one can find any correlations between age
groups, educational level and various types of personality and the review rating
provided. Results are then produced based on this analysis.

Is this an inductive or deductive approach?

30
THEORY TESTING
• “De Vaus (2001: p. 8) notes: “If the theory is true
certain things should follow in the real world”, in
other words certain observations should be implied
by the theory.” (Bergenholtz 2024c, section 3.1)

• Researcher is making an informed ‘guess’ (hypothesis), pre-data collection: This is what the data will look like (and we test only that guess)

• Hypothesis derived from theory; first theory, then hypothesis

• If hypothesis is changed after data analysis -> inductive element (see abduction perspective)

Figure 2.1, BRM p. 21


31
THEORY VS. HYPOTHESIS VS MODEL I

Bergenholtz 2024c, section 3.1.1

• Theory: General explanation – the relation between independent and dependent variable (e.g.
incentives and performance)

• Hypothesis: Specific, testable prediction (of specific variables). Involves range of choices


• Performance measurement: E.g. most tasks solved in 10 min, or the best solution, or the speed of the
quickest solution etc.
• Nature of incentives: Size of monetary reward, recognition etc.
• …
Hypothesis: Bridge between theory and data
• A theoretically informed ‘guess’ / prediction
• Some hypotheses are obvious, others more speculative

Control variables reduce risk of confounding variables – are in some sense not hypotheses
32
PRE-REGISTERING HYPOTHESES (NEW TREND)
• How do we know if a hypothesis was derived from
theory, not from data?
• We do not – unless researchers have pre-registered
hypothesis = committed their hypothesis and
registered it on a closed online platform
• If no pre-registration: Researchers might, after looking at the data, change hypothesis/measurement -> make it appear a success
• Pre-registration = actual deductive science

33
THEORY VS. HYPOTHESIS VS MODEL II

Bergenholtz 2024c, section 2.4

34
RELATION BETWEEN HYPOTHESIS AND THEORY
(VEVOX.APP 143-136-950, REFLECT AND DISCUSS 1-2 MIN)

a) If the hypothesis turns out to get statistical support (e.g. p-value < 0.01) – is the theory then true?

b) If the hypothesis is not supported by statistical analysis (e.g. p-value > 0.20) – is the theory then false?

35
RELATION BETWEEN HYPOTHESIS AND THEORY
(VEVOX.APP 143-136-950, REFLECT AND DISCUSS 1-2 MIN)

a) If the hypothesis turns out to get statistical support (e.g. p-value <
0.01) – is the theory then true?

b) If the hypothesis is not supported by statistical analysis (e.g. p-value > 0.20) – is the theory then false?

Concerning b)
• Imagine two dice
• One is slightly off: 20% chance of a 6
• The other is fair: 16.6% chance of getting a 6

• Imagine throwing the dice
• 30 times: Not significantly different # of 6s
• (if you throw them 2000+ times, a difference will probably emerge)
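A minimal simulation sketch of this point (my own illustration, not from the curriculum), testing the biased die's sixes against the fair-die expectation of 1/6:

```python
# Minimal sketch: with 30 throws, a die with a 20% chance of a six is
# usually indistinguishable from a fair die; with 2000 throws the
# difference typically becomes statistically visible.
import numpy as np
from scipy.stats import binomtest

rng = np.random.default_rng(42)

for n_throws in (30, 2000):
    sixes = rng.binomial(n_throws, 0.20)        # throws of the slightly-off die
    test = binomtest(sixes, n_throws, p=1/6)    # compare against a fair die
    print(f"{n_throws:>4} throws: {sixes} sixes, p-value = {test.pvalue:.3f}")
```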
36
THEORY TESTS ARE INCOMPLETE
(UNDER-DETERMINATION OF THEORY)

• If support is found for hypothesis: Theory is supported – not confirmed


• If support is not found for hypothesis: Evidence against theory – not necessarily false/falsified

• Test of hypothesis always an incomplete test of theory, based on theoretical assumptions, constructs etc. (cf. Guba 1990, p. 25: underdetermination of theory)

• We always (see e.g. Bergenholtz 2024a) test a particular version of a theory - based on
particular tasks, with a particular sample, particular size incentive, particular time, country etc.

• Full test of all combined options is incredibly large (millions) – see also lesson 13 on complexity

37
ILLUSTRATING DEDUCTIVE APPROACH IN
QUALITATIVE RESEARCH

Slide borrowed from Franziska Günzel 38


THEORY BUILDING: INDUCTION I

Bergenholtz 2024c, section 3.2

Slide borrowed from Franziska Günzel


39
THEORY BUILDING: INDUCTION II
• Pure induction is rare: One still does literature reviews and has speculations.

• However: Aim not to test a theory but to make (new) sense of some empirical phenomena

• Induction’s main challenge: ”No matter how many times you see a pattern (all swans are white), the next might be black” (more in lesson 7-8)

40
THEORY BUILDING: INDUCTION III

41
WHEN TO ENGAGE IN THEORY BUILDING
(INDUCTION) VS. TESTING (DEDUCTION)?
• Depends on research question, but general rules of thumb:

• Induction
• When little is known / clear theory does not exist
• If there are tensions and disagreement in existing knowledge
• One stumbles upon new empirical insight
• If case study, and no clear theory to be tested

• Deduction
• Applied when testing hypotheses derived from existing theories.
• If theory can be extended to new untested fields, or contexts (samples)
• If large quantitative dataset is to be studied – and theoretical field is well
established so hypotheses developed in advance
• …

42
INDUCTION VS DEDUCTION: NOT ALWAYS EASY
TO SEPARATE TYPES OF REASONING

”However, just as deduction often entails an element of induction, the inductive process is likely to involve some deduction” (BRM p. 24)

”Deductive and inductive strategies are better thought of as tendencies rather than as a hard-and-fast distinction” (BRM p. 24)
What started out deductive (e.g. Hawthorne studies) can turn into an inductive study (BRM p. 24)

→ Abduction?
43
ABDUCTION: DIFFERENT TYPE OF INFERENCE

Technical definition: Inference to the best explanation – not either from data or from theory

44
ABDUCTION: NOT JUST SEQUENTIALLY SWITCHING BETWEEN
INDUCTION AND DEDUCTION

Abduction – then deduction, ideally


“This approach addresses the problem
of using the same data to formulate and
test hypotheses. Analysts are advised to
test provisional models on fresh data”
(Heckman and Singer 2017: p. 302)

45
ABDUCTION VS INDUCTION / DEDUCTION
• Some/much research characterized as deductive, is actually abductive (less so in
experiments) – according to Heckman
• Important not to oversell: don’t claim deduction if it is not

However:
• Don’t just (uncritically) adopt the term abduction.
• Few projects are 100% deductive / inductive – small changes ok’ish
• Do reflect on your process
• Maybe add a critical paragraph outlining the degree to which your process was in fact
inductive/deductive

Your bachelor project: Induction, deduction or abduction? (reflect 1-2 min)

46
THIS ALL SOUNDS VERY RATIONAL AND CUMULATIVE
IS THIS HOW THEORIES ACTUALLY DEVELOP? (Kuhn 1970 vs. Popper 1962)

• Rational perspective: Cumulative improvements


• Due to better research designs, larger samples, stronger evidence etc.
• Assumes researchers are willing to change their minds!

47
SUMMARY OF LESSON 6
• What is a theory (not)?
• A theoretical explanation
• Different aims of theory: Causality vs. understanding
• Theory vs. models vs. hypotheses
• Theory building and theory testing, or different types of theoretical reasoning: Inductive vs. deductive – and abductive approaches

48
PHILOSOPHY OF SCIENCE

LESSON 7
KUHN AND POPPER ON SCIENTIFIC
PROGRESS - PART 1
Carsten Bergenholtz (cabe@mgmt.au.dk)
Department of Management, Aarhus BSS
Aarhus University
From lesson 6 slides: Plagiarism (important!): Different students can get a) identical (!) answers to similar
questions and b) answers can be very similar to material in the text. Hence, if you want to use answers from Phil
in the exam, you will have to quote it, in order not to plagiarize from the curriculum or other students. More in
lesson 12.
https://app-backend-ubqtrwmmqio3c.azurewebsites.net/
2
Why is a deductive test better than a
non-deductive test? Example:
• Throw a die X number of times
• Some outcome might be surprising (# of pips shown on the die)
• If no pre-registered expectation
• → 6 times bigger chance of finding something interesting due to random chance
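A minimal simulation sketch of the "6 times bigger chance" point (my own illustration; the cutoff of 16 sixes in 60 throws is an arbitrary choice of what counts as "surprising"):

```python
# Minimal sketch: pre-registering one face to watch vs. checking all six
# faces after the fact. Post hoc, roughly six times as many runs of a
# fair die produce a "surprising" face purely by chance.
import numpy as np

rng = np.random.default_rng(0)
n_sims, n_throws, threshold = 10_000, 60, 16  # "surprising" = a face shown 16+ times in 60 throws

hits_prereg = hits_posthoc = 0
for _ in range(n_sims):
    counts = np.bincount(rng.integers(0, 6, n_throws), minlength=6)
    hits_prereg += counts[5] >= threshold        # only the face committed to in advance
    hits_posthoc += (counts >= threshold).any()  # any face counts as a "finding"

print(f"pre-registered: {hits_prereg / n_sims:.3f}")
print(f"post hoc:       {hits_posthoc / n_sims:.3f}")
```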
3
AGENDA
From improving particular research designs and getting insights on how to develop (good) theory → more general questions about nature and truth of theories and how science develops

4
PHILOSOPHY OF SCIENCE: HOW DO THEORIES
DEVELOP AND IMPROVE OVER TIME?
Not about one individual theory – but the philosophy of science (in general)
Insights on research design & deduction and induction can give impression of
• Science directly building on prior work
• Science continuously improving
• We can observe the world as it is
• We can test theories & verify truth
→ Positivism: Science as a very rational, cumulative process

5
THE (POSITIVISM) POSITION THAT
KUHN & POPPER ARGUE AGAINST
Positivism: Classic ‘textbook’ conception of science pre-Kuhn (1962/1970)
• “…positivists believed that disputes between rival scientific theories could be solved in a perfectly
objective way, comparing the theories directly to the observations…” (PoS 5, p.80)
• The world can be observed as it is / there are neutral facts (PoS p. 80)
• (true) Theory corresponds to / mirrors world (PoS p. 78)
• Theory development as a rational process (PoS p. 78-79)

Linear Lego brick metaphor: Scientific development is rational and cumulative ≠ Kuhn

Implication I: If rational, cumulative science, we can verify theories and establish truth
Implication II: If theory development is rational, a better theory should always be accepted

Right?
Kuhn & Popper disagree
6
The following introduction of Kuhn is also available in this
video from 2021: https://youtu.be/gD84WMLmQes

7
KUHN 1962: MOST CITED SOCIAL SCIENCE RESEARCH
(= MOST OFTEN IN REFERENCE LIST)

http://blogs.lse.ac.uk/impactofsocialsciences/2016/05/12/what-are-the-most-cited-publications-in-the-social-sciences-according-to-google-scholar/ 8
HOW RATIONAL IS SCIENCE?
SOMETIMES A CLEARLY BETTER EXPLANATION IS NOT ACCEPTED
BY RESEARCHERS
Semmelweis had no
(good) theory though

Kuhn’s project: Explaining why clearly ‘better’ explanations are not always (immediately) accepted by a scientific community

Thus not always rational development of science

http://www.jameslindlibrary.org/articles/how-dramatic-were-the-effects-of-handwashing-on-maternal-mortality-observed-by-ignaz-semmelweis/
9
THE RELEVANT GERM THEORY WAS ACTUALLY
PROPOSED THOUSANDS OF YEARS AGO

https://twitter.com/page_eco/status/1356953614544064516?s=20

10
KUHN: SCIENTIFIC DEVELOPMENT – 2 TYPES

Kuhn: Science not completely rational – a messy social/psychological process

Paradigm: fundamental assumptions that members of scientific community accept (PoS p. 81) – will expand on this concept

http://www.thwink.org/sustain/glossary/KuhnCycle.htm

11
THE ESSENCE OF KUHN’S MESSAGE: NORMAL
SCIENCE VS. PARADIGM SHIFTS

https://ftp.cs.ucla.edu/pub/stat_ser/r487.pdf

12
BUT WHY ISN’T SCIENCE JUST RATIONAL -
WHAT ARE THE BARRIERS?

Kuhn’s explanations of why researchers resist new theories → Perception + biases + norms

(psychological-sociological features and limits)

13
PERCEPTION: WE, LITERALLY, DO NOT SEE THE SAME I

• Data / the world does not prescribe just one interpretation, says Kuhn
• We don’t see the world itself – just a processed version
• “In the metaphorical no less than in the literal use of ’seeing’,
interpretation begins where perception ends….what
perception leaves for interpretation to complete depends
drastically on the nature and amount of prior experience
and training“ (Kuhn p. 198, my emphasis).

≠ positivism

14
PERCEPTION: WE, LITERALLY, DO NOT SEE THE SAME II
”If two people stand at the same place and gaze in the same direction, we must…conclude that
they receive closely similar …But people do not see stimuli…Instead they have sensations, and
we are under no compulsion to suppose that the sensations of two viewers are the same…much
neural processing takes place between the receipt of a stimulus and the awareness of a
sensation…the route from stimulus to sensation is in part conditioned by education.” (Kuhn p. 192-
193, my emphasis)

Although, one can’t make up any kind of interpretation: If you can’t distinguish a wolf from a
dog, you won’t survive (Kuhn p. 195, not direct quote)

• Outside world creates stimuli
• Humans don’t ‘observe’ stimuli – but experience an already processed version of stimuli
• Thus different humans experience different ‘worlds’
• Therefore: No one ‘objective’ observation of the world
≠ positivism
15
P. 196: ”WE DO NOT SEE ELECTRONS, BUT RATHER THEIR
TRACKS OR ELSE BUBBLES OF VAPOR IN A CLOUD CHAMBER”.
”In the metaphorical no less than in the literal use of ‘seeing’,
interpretation begins where perception ends. The two
processes are not the same, and what perception leaves for
interpretation to complete depends drastically on the nature
and amount of prior experience and training.” (Kuhn p. 198)

”What we see depends in part on what we believe” (PoS p. 88)

”Facts are never just facts, facts are embedded within a paradigm” (PoS p. 84)

https://upload.wikimedia.org/wikipedia/commons/a/ac/Alpha_particle_and_electrons_from_a_thorium_rod_in_a_cloud_chamber.jpg

• Kuhn: We don’t directly ‘see’ motivation, efficient market hypothesis, productivity, firm performance etc.
• We rely on a paradigmatic lens: techniques, norms, perspectives.
16
WE ONLY HAVE INDIRECT ACCESS TO REALITY / NATURE (HEISENBERG: QUANTUM PHYSICIST)
≠ positivism

17
SOCIAL SCIENCE EXAMPLES: WE DON’T DIRECTLY SEE THE
WORLD AS IT IS – BUT THROUGH A PARADIGMATIC LENS

If this is right – how can we be certain our theories actually represent the world / are true?

18
VIDEO OF WHAT YOU SAY?
Data is not ”raw”, always interpreted
on a paradigmatic background

https://twitter.com/michaelgalanin/status/988087010270285825?s=11

19
KUHN’S TAKE ON SCIENTIFIC DEVELOPMENT

Paradigm: fundamental assumptions that members of scientific community accept (PoS p. 81)

http://www.thwink.org/sustain/glossary/KuhnCycle.htm
KUHN’S PARADIGM SHIFTS ILLUSTRATED

http://www.youtube.com/watch?v=3cp6pEzx3uw
21
A SCIENTIFIC PARADGM (DEFINITION)
WHAT PARADIGMS ARE AND WHY THEY ARE IMPORTANT

Paradigm: ”…a paradigm is an entire scientific outlook – a constellation of shared assumptions, beliefs, and values that unite a scientific community” (PoS p. 81)

Paradigms refer to the way “things are done” (norms), in a scientific community, and are
implicit.

For example: How we usually measure X variable, what questions we ask, what science gets
funded, who a science institution would hire, what methods are popular in a field, what qualifies as
a good answer etc.

“A paradigm governs, in the first instance, not a subject matter but rather a group of
practitioners” (Kuhn, p. 180)

Paradigms are about fundamental assumptions, thus difficult to change


New evidence – even good evidence – might not change minds (more in lesson 10)
22
COMPARING ACROSS PARADIGMS:
INCOMMENSURABLE
• Within paradigm (normal science): Objective comparison possible
as everyone 'speaks the same language' of methods and standards

• Across paradigms: Incommensurable (see PoS p. 87)


• No common standard: Different assumptions, norms and
techniques used
• Making objective comparison very difficult

• For example
• Within the same quantitative approach and paradigm, we can
directly compare two (similar) surveys on job satisfaction.
• Across paradigms, comparing a quantitative survey to an
ethnographic study on job satisfaction risks comparing volume to
weight – the measures don’t easily translate.

23
KUHN’S TAKE ON SCIENTIFIC DEVELOPMENT

Typical (normal) science

Revolutions are – very – rare!

Paradigm: fundamental assumptions that members of scientific community accept (PoS p. 81)

http://www.thwink.org/sustain/glossary/KuhnCycle.htm
KUHN’S TAKE ON SCIENTIFIC DEVELOPMENT
(REFLECT, DISCUSS 2 MIN – QUESTIONS?)

http://www.thwink.org/sustain/glossary/KuhnCycle.htm

25
SUMMING UP: A SCIENTIFIC PARADIGM
Paradigms showcase a research agenda
• What is important
• Which skills a researcher should have
• Which data are collected
• What is being taught
• Which research is being accepted
• ….

Theories can more easily change – paradigms are embedded practices that are difficult to change

The concept explains why research changes slowly – difficult to change these institutionalized practices

Yet – have there really been such revolutionary changes in social sciences?

26
ILLUSTRATING SLOW CHANGES IN (ECONOMICS) SCIENCE
1994 landmark (Nobel prize winning) study claims: Increasing
minimum wage did not negatively impact employment

An example of a
fundamental change in
assumptions about the world

https://noahpinion.substack.com/p/why-15-minimum-wage-is-pretty-safe

27
DOES WATER RUN UPHILL?
OF COURSE MINIMUM WAGE WOULD NEGATIVELY IMPACT EMPLOYMENT…RIGHT?

“The inverse relationship between quantity demanded and price is the core proposition in economic
science, which embodies the presupposition that human choice behavior is sufficiently rational to allow
predictions to be made. Just as no physicist would claim that “water runs uphill,” no self-respecting
economist would claim that increases in the minimum wage increase employment. Such a claim, if seriously
advanced, becomes equivalent to a denial that there is even minimal scientific content in economics, and
that, in consequence, economists can do nothing but write as advocates for ideological interests.
Fortunately, only a handful of economists are willing to throw over the teaching of two centuries; we have
not yet become a bevy of camp-following whores.”

~James M. Buchanan, 1986 Nobel laureate in economics, writing in the Wall Street Journal on April 25, 1996

28
SLOWLY CHANGING BELIEF ON EFFECT OF
MINIMUM WAGE

https://noahpinion.substack.com/p/why-15-minimum-wage-is-pretty-safe

29
CARD’S (2021 NOBEL PRIZE WINNER) EXPERIENCE
A RATIONAL DISCUSSION?
“[E]conomists who objected to our work were upset by the thought that we were giving free rein to
people who wanted to set wages everywhere at any possible level…I've subsequently stayed away from
the minimum wage literature for a number of reasons. First, it cost me a lot of friends. People that I had
known for many years, for instance, some of the ones I met at my first job at the University of Chicago,
became very angry or disappointed. They thought that in publishing our work we were being traitors to
the cause of economics as a whole.”
https://noahpinion.substack.com/p/why-15-minimum-wage-is-pretty-safe

• Changes in fundamental (paradigmatic) assumptions and beliefs are difficult, stubborn and slow
• Science not a fully rational process / institution
• Yet, not necessarily a full-blown Kuhnian revolution → arguably Kuhn partially right

30
SO, HOW DO PARADIGMS CHANGE AND IMPROVE
AFTER ALL, ACCORDING TO KUHN?
If science can’t improve and two paradigms are just different → pure relativism, anything goes

Answer: About truth, funerals and survival of the fittest

https://en.wikipedia.org/wiki/Planck%27s_principle
31
SCIENCE, ONE FUNERAL AT A TIME?

What research
design is this?

What to pay
attention to?

”…a new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.” (Max Planck)

http://www.econ.upf.edu/docs/papers/downloads/1498.pdf 32
NEW PARADIGMS ARE NOT MORE TRUE I
Kuhn p. 206: ”There is, I think, no theory-independent way to reconstruct phrases like ‘really
there’; the notion of a match between the ontology of a theory and its ”real” counterpart in
nature now seems to me illusive in principle” (my emphasis. A clear critique of positivism)

POS, p. 84: ”…Kuhn questioned whether the concept of objective truth actually makes sense at
all. The idea that there is a fixed set of facts about the world, independent of any particular
paradigm, was of dubious coherence, he believed. Kuhn suggested a radical alternative: That
facts about the world are paradigm-relative, and thus change when paradigms change. If this
suggestion is correct, then it makes no sense to ask whether a given theory corresponds to the
facts ‘as they really are’, nor therefore to ask whether it is objectively true.” (my emphasis)

33
NEW PARADIGMS ARE NOT MORE TRUE II
Kuhn p. 206: ”There is, I think, no theory-independent way to reconstruct phrases like ‘really there’;
the notion of a match between the ontology of a theory and its ”real” counterpart in nature
now seems to me illusive in principle” (my emphasis. A clear critique of positivism)

POS, p. 84: ”…Kuhn questioned whether the concept of objective truth actually makes sense at
all. The idea that there is a fixed set of facts about the world, independent of any particular
paradigm, was of dubious coherence, he believed. Kuhn suggested a radical alternative: That
facts about the world are paradigm-relative, and thus change when paradigms change. If this
suggestion is correct, then it makes no sense to ask whether a given theory corresponds to the
facts ‘as they really are’, nor therefore to ask whether it is objectively true.” (my emphasis)

Kuhn p. 204: ”…as argument piles on argument and as challenge after challenge is successfully met, only blind stubbornness can at the end account for continued resistance”.
34
COMPARING DIFFERENT PARADIGMS → SURVIVAL OF THE FITTEST
• Not only about funerals – newer paradigms can be better / not simple relativism
• Yet not better representation of reality (not an ontological argument), cf. p. 206 → survival of the fittest, across paradigms

• Criteria for comparison (Kuhn p. 206)


• Accuracy of prediction
• Number of different problems solved
• Simplicity, scope, compatibility with other disciplines

”…scientific development is, like biological, a unidirectional and irreversible process. Later scientific theories are better than earlier ones for solving puzzles in the often quite different environments to which they are applied” (Kuhn p. 206)

35
WHICH QUOTE IS NOT BY KUHN?
(VEVOX.APP 143-136-950, 2 MIN)

1) Later scientific theories are better than earlier ones for solving puzzles in the often quite different
environments to which they are applied. That is not a relativist's position, and it displays the sense in
which I am a convinced believer in scientific progress.

2) 'Normal science' means research firmly based upon one or more past scientific achievements,
achievements that some particular scientific community acknowledges for a time as supplying the
foundation for its further practice.

3) A new paradigm in the end wins over the former, established paradigm, by developing new theories
and practices that manage to obtain a clearer and better understanding of reality as it is.

https://todayinsci.com/K/Kuhn_Thomas/KuhnThomas-Quotations.htm
QUESTION FROM 2018 RE-EXAM
Question 3 (15%)

Please explain why Kuhn would NOT agree with the following
statement: “A new paradigm in the end wins over the former
established paradigm, by developing new theories and practices that
manage to obtain a clearer and better understanding of reality as it is.”

Please be specific about what part of the statement Kuhn would disagree with, and explain why.
ANSWER TO Q3 FROM 2018 RE-EXAM
A FEW BULLET-POINTS TO ILLUSTRATE WHAT A VERY GOOD ANSWER CAN LOOK LIKE
• To see the world as it is = positivism (Guba 1990, see lesson 9)
• ”There is, I think, no theory-independent way to reconstruct phrases like ”really there” (Kuhn 1962, p. 206)
• We see reality through our paradigmatic (values, techniques, etc. that members of our scientific society share)
lenses
• Interpretation begins where perception ends (Kuhn p. 198)
• Newer, better theories not approximately closer to reality (Kuhn p. 206)
• As post-positivism would claim (Guba 1990, lesson 8)
• Different paradigms are incommensurable
• Difficult to communicate across: “Since the vocabularies in which they discuss such situations consist, however, predominantly
of the same terms, they must be attaching some of those terms to nature differently, and their communication is inevitably only
partial” (Kuhn 1969, p. 198)
• But, in an evolutionary perspective: Better at solving problems, more accurate predictions, solves more and more
important problems (Kuhn p. 206)
• One paradigm might only be able to explain lab-experiment, while another can explain phenomena in the field
• Thus, a paradigm can win over another – but not an ontological argument
• ”as argument piles on” (Kuhn p. 204)
• Therefore, since different conceptual frameworks and ways of seeing the world, it is not the world itself that is
being compared, but paradigmatic approaches: Some are better than others, thus not simple
relativism/subjectivism.
KUHN: SUMMING UP
Kuhn: A history of science perspective
• Keywords: Paradigms, normal science, crisis, revolutions, scientific community
• How have the sciences developed, historically – a psychological-sociological
perspective
• We – literally – don’t see the same. Depends on training/ education/ expectations.
• In what sense is truth achieved: Linear, accumulated development ≠ evolutionary
perspective
POPPER 1962

Popper’s agenda
• Problem with induction and idea of observing the world as it is
• Asymmetry
• Verification vs. falsification of theories
• Realism and tentative truths (-> post-positivism, more in lesson 9)

Overall question: How do we use theory, and can theories be true?


E.g.: can induction or deduction lead to truth?
40
PROBLEM WITH INDUCTION 1: CAN WE ‘JUST’
OBSERVE THE WORLD?
Idea of pure induction: That we can observe the world and then theorize

41
PROBLEM WITH INDUCTION II: THE NEXT SWAN
MIGHT BE BLACK

ABH: p. 47

ABH: p. 48

42
WE PROPOSE A THEORY AND TEST IT I
Yet: We should not try to verify a theory. Too easy to find examples that (seem to) support.

Thus: Too easy to deductively create a hypothesis and find support for it

43
WE PROPOSE A THEORY AND TEST IT SEVERELY
– TO FALSIFY IT II
Make conjectures (informed guesses) – and then seek to refute/falsify them (!)

ABH: p. 48

44
WE MAKE RISKY PREDICTIONS – AND AIM TO
FALSIFY IT
A theory should not just state (easy to “confirm”):
• More motivation → better performance
• Movements by heavy objects influence other objects
• …

We want more specific, risky predictions (a more risky and challenging theory)
• Challenging, specific goals → better performance
• A new marketing campaign targeting a smaller, niche audience will result in higher profit margins than mass marketing efforts.

Having falsified theories we can advance the scientific field

45
VIDEO: FALSIFICATION

https://www.youtube.com/watch?v=wf-sGqBsWv4 46
POPPER: IS SCIENCE BASED ON INDUCTION OR
DEDUCTION?
We often look at the world / data and try to find a pattern – inductive theory building

However, actual science (according to Popper) is testing the theory to see if theory is
wrong - deductively

47
WHAT DOES THIS MEAN FOR QUALITATIVE /
QUANTITATIVE STUDIES?
Quantitative Hypothesis Testing:

• Popper's principle of falsifiability can be directly applied to quantitative hypothesis testing in


business research. In this context, a hypothesis is a predictive statement that can be tested
through data analysis. According to Popper, a hypothesis should be structured in a way that
allows for it to be refuted by empirical evidence. For business students, this translates into
designing and testing hypotheses in a manner that there are clear criteria for what evidence
would disprove the hypothesis. (ChatGPT-4 generated)

• Keep in mind that a hypothesis test is specific, and always an incomplete test of a theory
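A minimal sketch (hypothetical numbers) of what this looks like in practice, with the hypothesis and its refutation criterion fixed before the data are analysed:

```python
# Minimal sketch of a falsifiable quantitative test in the Popperian
# spirit: state the refutation criterion before looking at the data.
# Hypothesis (fixed in advance): the incentivized group solves more tasks.
# Refutation criterion (fixed in advance): one-sided p >= 0.05.
from scipy.stats import ttest_ind

incentive = [12, 15, 14, 16, 13, 17, 15, 14]  # tasks solved, incentive group (hypothetical)
control   = [11, 12, 13, 12, 14, 11, 13, 12]  # tasks solved, control group (hypothetical)

result = ttest_ind(incentive, control, alternative="greater")
print(f"p-value = {result.pvalue:.4f}")
print("corroborated (not verified)" if result.pvalue < 0.05 else "refuted by this test")
```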

48
WHAT DOES THIS MEAN FOR QUALITATIVE /
QUANTITATIVE STUDIES?
Qualitative Case Studies:

• While qualitative case studies are not typically structured around statistical hypothesis testing,
Popper's ideas still hold relevance. The critical approach to developing theory from case
studies involves looking for disconfirming evidence as well as confirming evidence,
challenging the researcher to refine their theories. Popper would likely caution against over-
reliance on inductive reasoning from case studies and stress the importance of seeking
potential falsification even in qualitative research. (ChatGPT-4 generated)

49
IF THE THEORY IS REFUTED/WRONG, IT IS NOT
UNSCIENTIFIC – JUST NOT VALID

ABH: p. 49

50
WE CAN ONLY APPROXIMATE TRUTH

Post-positivism → lesson 9 (Guba)

http://www.proprofs.com/quiz-school/user_upload/ckeditor/Trooth.jpg 51
POPPER – TO BE CONTINUED IN LESSON 8

52
PHILOSOPHY OF SCIENCE

LESSON 8
KUHN AND POPPER ON SCIENTIFIC
PROGRESS - PART 2
Carsten Bergenholtz (cabe@mgmt.au.dk)
Department of Management, Aarhus BSS
Aarhus University
AGENDA: FIRST HALF OF LESSON

2
REVISITING KUHN I: HOW SCIENCE DEVELOPS
Positivism, pre-Kuhn: “…positivists believed that disputes between rival
scientific theories could be solved in a perfectly objective way, comparing
the theories directly to the observations…” (PoS 5, p.80)
REVISITING KUHN II: PARADIGM SHIFTS AND DIFFICULTY
OF COMPARING BETWEEN PARADIGMS

• Within paradigms: Challenging but possible to compare competing theories – same standards, measurements, assumptions etc.

• Early Kuhn: Paradigms incommensurable – can’t compare (at all) across paradigms

• Later Kuhn: Very difficult to compare across paradigms
• New data points don’t fit & discarded by stubborn scientists who process the world differently
• But not impossible, cf. minimum wage & Semmelweis examples

4
REVISITING KUHN III: EXAMPLE ON HOW COMPARISON
ACROSS ‘PARADIGMS’ CAN BE DIFFICULT
For example: If studying emotions
• Approaches can be so different, so very difficult to compare
• Observing (subjectively) real world situations vs. what happens (objectively) inside the brain
• What is the correct / objective theory?
→ incomparable – or, very difficult to compare. No common standard

5
CONSIDER THIS: IS IT A GOOD THING IF 100% OF
SCIENTISTS AGREE ON SOMETHING? (REFLECT 1-2 MIN)

6
AFTER KUHN: SCIENCE & POLITICS
Politicians (mis)use science for politics

Yet, science also value-driven / political to some degree
• No external, objective standard for truth
• → Best science is less than 100% objective
• E.g.: What problems are most important to solve?
• Science also a sociological dynamic: influence, institutional power etc.

Science therefore is, inherently, somewhat political

7
REVISITING POPPER I: HOW TO DO SCIENCE
Given how induction & deduction (do not) work → this is what science can do

8
REVISITING POPPER II: WHY ONE (OR MANY) TESTS
DON’T VERIFY A THEORY
Imagine a theory: Higher financial incentives increase performance in simple tasks

One can test this theory in a wide range of ways


• Different research designs
• Different populations
• Different settings / tasks
• Increase incentives a lot / a bit
• Use different kinds of tasks
• Measure over short/long periods of time
• ….
No matter how many tests one has run…one can always think of additional – potentially
falsifying / refuting scenarios
• Maybe theory should be: ”Higher financial incentives increase performance in simple
tasks in X kind of conditions”
9
One overall (theoretical) claim: Many ways of testing it

Aschwanden 2015: https://fivethirtyeight.com/features/science-isnt-broken/

10
EXAMPLE: KUHN, POPPER AND THE CHALLENGE OF
NEW CLAIMS
Why wasn’t a better explanation accepted – quicker?

Popper:
• The medical community should have conducted
objective tests to falsify hypotheses
• A critical, rational approach could have resolved it

Kuhn
• Paradigms explains resistance to germ
theory/handwashing
• Not just objective evidence that led to scientific
change
• Survival of the fittest explanation…

11
2021 SPRING EXAM QUESTION (4-5 MIN)

Question 6: ” A famous scientist recently claimed that science provides insight into objective
reality. Please rely on the curriculum to provide counterarguments to this claim.”

• How can Kuhn & Popper contribute to answer the following question? (note: further relevant
texts are presented later)
• Please consider, in some detail, how you would structure an answer to the question
OUTLINE OF ANSWER TO 2021 Q6
Kuhn
• Focuses on socio-psychological factors
• We see a processed version of the world
• Therefore no theory-independent observations of the world -> embedded in paradigms
• Observations embedded in (somewhat) incommensurable paradigms
• Evolutionary ‘survival of the fittest’ perspective on improvement of theories – not closer to reality

Popper
• Focuses on nature of induction & deduction -> what science can (not) do
• Conjectures and Refutations: Create theories/hypotheses and aim to falsify – not to verify/prove
• Problem of induction – next data point might prove us wrong
• We can always think of a different way to deductively test theory
• We can approximate truth, but will always ‘only’ be ”science of the time”

Both agree
• No completely objective truths
• No direct access to the world in itself / objective reality
• Later theories often improve upon and replace earlier ones
13
SYNTHESIZING KUHN AND POPPER: IN RESEARCH
AND IN BUSINESS

In science and in Business


• Prioritize testing over confirmation
• Aim to falsify – and break out of paradigmatic boundaries
• Anticipate and address resistance with evidence and engagement

14
AGENDA: SECOND HALF OF LESSON

• Earlier, related perspectives


• Watts 2011: Similar focus & questions (how is social science particular, compared to
physics)
• Kuhn & Popper: Focus on limits of science – in general
• Nelson 2016:
• Focuses on particular challenges for business administration (e.g. vs. physics)
• How the nature of the phenomena we study limits what our theories can do

15
Following presentation of Nelson 2016 also available
in this 2021 video: https://youtu.be/t-yLe1LllbA
WHAT ARE WE MEASURING, AND HOW CERTAIN CAN WE
BE ABOUT OUR CONCLUSIONS?
Different approaches to these questions in different fields

17
NELSON 2016: WHAT KIND OF SCIENCE IS
ECONOMICS / BUSINESS ADMINISTRATION

Is business administration / economics even a science?

Overview of Nelson’s claims


• We should not model social science on physics – social sciences are different (see e.g. p. 1697)
• Some other (natural) sciences are more similar to social sciences than physics (see e.g. p. 1698-1699)
• Identifies 3 key features that distinguish business/economics from physics

In other words, how can we talk about business administration as a science? What world are we theorizing about in different sciences?
THE CLAIM AND ARGUMENT, BRIEFLY PUT
BUSINESS: A SCIENCE? I
WHY DOES SOCIAL SCIENCE NOT LEAD TO NEWTONIAN LAWS, LIKE IN PHYSICS?

Newtonian physics: Homogeneous phenomena and stable causal influences

“The location of any planet at any time can be described completely in terms of numbers,
as can its motion at that time” (Nelson 2016: p. 1693)
“Thus while Newton’s calculations treat only the masses of a particular planet and the sun,
the effect of other masses and potential influences like galactic dust on the planet’s orbit
turns out to be very small” (Nelson 2016: p. 1694)
BUSINESS: A SCIENCE? II
EVEN SEVERAL NATURAL SCIENCES AREN’T LIKE
CLASSICAL PHYSICS
Meteorology (e.g. storms) ≠ astrophysics, Newtonian laws

Nevertheless, meteorology is a
pretty precise science.

“Every storm is different in some often significant way…. Measures of physical characteristics of
storms, like wind velocity, provide important indicators of how destructive that storm is likely to
be. However, the measurements here are not precise, and in their nature cannot be, and the
relevant causal connections are somewhat variable” (Nelson 2016: p. 1699)
BUSINESS: A SCIENCE? III
THE WORLD(S) WE STUDY
1) “First, while the subject matter they study falls into different classes, within each
class there is considerable heterogeneity, which cannot be reduced to a small
number of well defined variables, given the questions being asked of that subject
matter. And in many cases the classes themselves do not have sharply defined
boundaries“ (Nelson 2016: p. 1697)

2) “Second, the forces and conditions that influence the subjects of study in ways
that the science seeks to understand are numerous, highly variable, and often
cannot be separated sharply one from another. Indeed the circumstances
associated with any particular set of observations may need to be understood as,
in some sense, unique.” (p. 1697)

3) …the subject matter they study often changes over time…” (p. 1698) – see also
lesson 9

The world limits what science we can do (p. 1693)


Kuhn focused not on the world, but on human/social factors
Popper focused on how the nature of induction/deduction limits what science can do
HOMOGENEITY OF ATOMS OR CELLS VS. HUMANS

In contrast, business administration →
"A major challenge is the heterogeneity of
entrepreneurial phenomena. Different people start
different types of new economic activities for different
reasons; through different processes; at different pace;
with different resource bases, and in different industry,
geographic, cultural, and economic environments.”

https://eprints.qut.edu.au/221162/1/95492.pdf

https://en.wikipedia.org/wiki/E._coli_long-term_evolution_experiment

Imagine letting 12 populations of humans develop separately for 50,000 generations…

23
BUSINESS: A SCIENCE? IV
NELSON 2016 ON NATURE OF SOCIAL SCIENCE
CONSTRUCTS (MEASURES OF CONCEPTS)

“I want to suggest that, where the underlying conception of the phenomenon being studied has blurry
edges, a constructed number meant to measure an aspect of it may take on a definitional quality.
That is, the number, and the assumptions behind its construction, comes to be treated as what the
phenomenon is, rather than as a useful if limited measure of that phenomenon. I propose that this is a
real danger, and needs to be guarded against.” (Nelson 2016: p. 1695, my emphasis)
EXAMPLES OF DIFFERENCES IN MEASURES - THE
DIFFERENCES MATTER

http://jreiss.org/Philosophy_of_Economics/PhilEcon_Manuscript_Master.pdf
BIG 5 VS OTHER
PERSONALITY CONCEPTS:
NO SHARP BOUNDARIES ->
FUZZY OVERLAP

“I want to suggest that, where the underlying conception of the phenomenon being studied has blurry edges, a constructed number meant to measure an aspect of it may take on a definitional quality….”

https://twitter.com/PsychoSchmitt/status/1482023538710167556
Heterogeneous variables without clear boundaries

Aschwanden 2015: https://fivethirtyeight.com/features/science-isnt-broken/

27
HETEROGENEOUS VARIABLES INFLUENCED BY
NUMEROUS OTHER VARIABLES (NELSON POINT #1+2)

https://doi.org/10.1038/nature13977
HETEROGENEOUS VARIABLES INFLUENCED BY
NUMEROUS OTHER VARIABLES (NELSON POINT #1+2)

29
HOW MANY VARIABLES MIGHT IMPACT
ORGANIZATIONAL PERFORMANCE?
Market demand Employee morale Customer satisfaction
Economic conditions Organizational structure Supply chain efficiency
Industry competition Innovation capacity Technological adoption
Company culture Financial health Regulatory environment
Management effectiveness Brand reputation Globalization effects

Pricing strategies Capital availability Risk management


Marketing campaigns Talent recruitment and retention Sustainability practices
Operational processes Product quality Stakeholder relationships
Legal issues Service delivery Digital transformation progress
Strategic partnerships Corporate governance Change management capabilities

And the chatbot could have generated many more


30
COMPARE WITH MEDICINE (SEMMELWEIS) EXAMPLE
MORE DISTINCT VARIABLES: VERY EASY TO MEASURE IF ONE HAS WASHED
HANDS & IF SOMEONE IS DEAD (OR NOT)

31
METEOROLOGY VS. SOCIAL SCIENCES
PEOPLE ARE NOT JUST HETEROGENEOUS - CAN ALSO DECIDE TO
CHANGE BEHAVIOR OVER TIME (NELSON POINT #3, MORE IN LESSON 13)

https://twitter.com/kjhealy/status/1040220704527011840

Think Nelson 2016 makes a weather / storm reference.

Physics: Atoms do not change behavior


Social science: Humans can change their thoughts and behaviors
BUSINESS: A SCIENCE V
ECONOMISTS’ DREAMS ABOUT HOMO ECONOMICUS, BASED ON
PHYSICS AS AN IDEAL ≠ NELSON 2016

https://economics.stackexchange.com/questions/5706/ https://www.youtube.com/watch?v=GdujVdjAKUg
what-do-supply-demand-curves-really-look-like

We have to assume homogeneous, unchanging individuals whose behavior is not disturbed by numerous influences – necessary if we are to achieve mathematical-economics law-like insights (as in physics)
ECONOMICS ≠ NEWTONIAN (PLANETARY) GRAVITY
CAN WE REALLY SIMPLIFY THE ECONOMIC WORLD TO THIS DEGREE?

Pfleiderer, Paul. 2014. Chameleons: The Misuse of Theoretical Models in Finance and Economics. Stanford University.
NELSON ON LIMITS OF EXPERIMENTS (P. 1699)
How homogeneous and
predictable is the world?

Are the different schools in the same treatment really identical?

My summary of Nelson 2016 + his take on experiments: https://youtu.be/1I0N0o6FaUA


NELSON AND
EXPERIMENTS
Does Nelson thus ‘oppose’ experiments?

No, does not oppose, as such. However:


• Risk of schools too different – so even randomization won’t create 2 equal groups
• The more homogeneous your variables and context – the easier to do an RCT
• Acknowledge heterogeneity and be transparent about definition of variables
• Try out different variables and setups
NELSON ON (DANGER OF) DREAM OF HOMO ECONOMICUS
AND ASSUMING ALL SCIENCES ARE LIKE PHYSICS (P. 1700)

We risk misguided science and science policy/investments

If we expect precise science, why fund science that is not precise?

37
BUSINESS: A SCIENCE? III
THE WORLD(S) WE STUDY
1) “First, while the subject matter they study falls into different classes, within each
class there is considerable heterogeneity, which cannot be reduced to a small
number of well defined variables, given the questions being asked of that subject
matter. And in many cases the classes themselves do not have sharply defined
boundaries.“ (Nelson 2016: p. 1697)

2) “Second, the forces and conditions that influence the subjects of study in ways
that the science seeks to understand are numerous, highly variable, and often
cannot be separated sharply one from another. Indeed the circumstances
associated with any particular set of observations may need to be understood as,
in some sense, unique.” (p. 1697)

3) …the subject matter they study often changes over time…” (p. 1698) – see also
lesson 9

The world limits what science we can do (p. 1693)


Kuhn focused not on the world, but on human/social factors
Popper focused on how the nature of induction/deduction limits what science can do
HOW CAN THESE TOPICS APPEAR IN AN EXAM SITUATION?
(REFLECT ON THIS EXAM QUESTION, 2-3 MIN)

2018: Question 6 (23%)


Will business administration researchers ever be able to come up
with clear, universal theories about all individual and business
behavior; e.g., is it possible to create a clear, universal theory about
how all firms should be innovative to succeed? (just like the theory
of gravity informs us about the motion of all physical entities) Please
provide reasons for your answer and base it on the curriculum and
course material.

Additional angles on this question will be presented later in the course

Extracts from an actual exam answer
‘YOUR’ MEASURES AND PHENOMENA, E.G. IN YOUR
BACHELOR PROJECT (REFLECT, 1-2 MIN)

How easy is it to measure / make sense of what you are interested in?
CONSIDER THIS: IS IT A GOOD THING IF 100% OF
SCIENTISTS AGREE ON SOMETHING? (REFLECT 1-2 MIN)

43
IMPLICATIONS OF NELSON’S CHARACTERISTIC OF
BUSINESS ADMINISTRATION / ECONOMICS
• Physics’ quantitative ”law-focus” should not be the goal of business administration
• When phenomena and theoretical relations are heterogeneous, we have to
acknowledge limitations & be transparent in our categorization of our data/variables
• Both quantitative and qualitative work important – selection of method should be based
on specific problem field
• The more homogeneous the field, the easier it is to quantify average relations

…it is a mark of the trained mind never to expect more precision in the treatment of any subject than the nature of that
subject permits; for demanding logical demonstrations from a teacher of rhetoric is clearly about as reasonable as
accepting mere plausibility from a mathematician
(Aristotle, 1976: p. 65)
WHY READ KUHN, POPPER AND NELSON?
They have shaped scientists’ (and societies’) ideas about what science is
• How rational and objective science is
• What certainty / kind of truth can we expect
• How are some sciences different than others
• Established counterpoints to positivism & a naive conception of science

However: Science still the best option we have, to learn about the world
• Might not be true, but best approximation
• Different (paradigmatic) approaches are difficult – but possible – to compare
• Fittest theories survive
• Science is (somewhat) self-correcting & explanations are – continuously - improving

Overall: Important to acknowledge weaknesses and strengths of science


• Not a flaw that scientists disagree. Don’t hope for 100% consensus
• Ok that theories we thought were true, turn out to be false

45
PHILOSOPHY OF SCIENCE

LESSON 9
PHILOSOPHICAL ASSUMPTIONS IN SCIENCE

Carsten Bergenholtz (cabe@mgmt.au.dk)


Department of Management, Aarhus BSS
Aarhus University
CHATBOT

• Isager 2023 added

• Will increase nr. of allowed prompts in the last weeks up to the exam

• Chatbot less strong on Popper – has less material

2
ANOTHER REMINDER: REMEMBER FLOWLOCK+ SETUP
NOTES: PDF + OWN COMPUTER

More info on how this works and how you can try out the setup to follow

3
SUMMING UP: NELSON 2016

4
AGENDA

In order to understand Guba 1990, we need to understand two sets of concepts


• Ontology, epistemology, methodology
• Positivism, post-positivism, constructivism
Explained via in-depth reading of Guba 1990

5
TERMINOLOGY: ONTOLOGY & EPISTEMOLOGY

Guba 1990 uses these concepts – but what do they mean?


What question are they addressing?

See 2021 video here: https://youtu.be/37-4-0jlf7k


WHAT WORLD (ONTOLOGY) ARE WE THEORIZING ABOUT
AND HOW DO WE ACCESS IT / KNOW (EPISTEMOLOGY)? I
• Ontology: ”What is the nature of reality” (Guba 1990, p. 18), or: What ‘real’
status do concepts and theories have?
• Epistemology: ”What is the nature of the relationship between the knower
(inquirer) and the known (or knowable)” (Guba 1990 p. 18) or: How do we
know something (or: how do we access / create concepts and theories?)

Easier to think about in physics compared to social sciences, cf. Nelson 2016?

We can’t directly ‘see’ these concepts


WHAT WORLD (ONTOLOGY) ARE WE THEORIZING ABOUT
AND HOW DO WE ACCESS IT / KNOW (EPISTEMOLOGY)? II

One could argue that we in physics ”uncover” reality and the mathematical,
‘100%’ correct laws that explain behavior of atoms, planets etc.

Social sciences?
Overview

Guba 1990, p. 18

https://journals.sagepub.com/doi/full/10.1111/1467-9256.00173
9
”YES, PRIME MINISTER” I

https://www.youtube.com/watch?v=G0ZZJXw4MTA

10
”YES, PRIME MINISTER”: IS THE PROBLEM ONTOLOGICAL
OR EPISTEMOLOGICAL, OR BOTH?

Ontology: Does an independent opinion on national service exist?


Epistemology: Is it because we don’t have access to what people actually think?
How one asks, matters!

https://x.com/keiranpedley/status/1759970566776197307?s=20
12
TYCHO BRAHE’S MEASUREMENTS
OF THE SOLAR SYSTEM –
ONTOLOGICAL DISTINCTION
• Measurements of planets ≠ measurements of
attitudes, unemployment, stock markets etc.

• For Brahe easier to claim an ontological distinction between measurement and what is measured
WHAT WORLD (ONTOLOGY) ARE WE THEORIZING ABOUT AND
HOW DO WE ACCESS IT/ KNOW (EPISTEMOLOGY)? III
• If we can measure something (e.g. an atom) and perfectly predict future (≈ effect size of 1), ‘real-
ness’ of theoretical entities can be assumed?

• If we struggle to measure something (e.g. unemployment, innovation) and can’t perfectly predict
future (≈ effect size of e.g. 0.2), more difficult to assume ‘real-ness’ and ‘real knowledge’?

Therefore:
• Issue with our ability to measure and understand the world (epistemology) or are these concepts just not independently real (ontology), cf. Prime Minister ex.?

→ Guba 1990 outlines possible answers


GUBA 1990: 3 DIFFERENT APPROACHES TO THE
ONTOLOGY / EPISTEMOLOGY CHALLENGES PRESENTED
Positivism
• Ontology: Reality independent of us, general concepts and theories exist (and to be uncovered ≈ an atom)
• Epistemology: We can observe the world as it is. Have access to theories / constructs, which represent and correspond to the world = objective knowledge
• Methodology (or investigator): Uncover truth, fly on the wall, see ‘through’ a microscope, RCTs the best

Postpositivism (critical realism)
• Ontology: Reality independent of us, general concepts and theories exist
• Epistemology: Objectivity the ideal, theories / constructs are approximations, can’t see the world as it is ≈ Popper’s position
• Methodology (or investigator): RCTs are good, but valuable to add ‘naturalistic settings’, many sources reduces distortions, aim to remove filter, see ‘with’ a microscope

Constructivism
• Ontology: Multiple realities, social entities don’t exist independently of our thinking & theories about them, who is to say which reality perspective is true?
• Epistemology: Theories / constructs are constructed, (inter)subjective knowledge, findings are creation of interaction between inquiry / inquired into
• Methodology (or investigator): We can’t go outside our theories, don’t aim to see unfiltered world, re-construct the world, intersubjective approach, interesting world is in the minds of those we research
RICKY GERVAIS ON (POSITIVISTIC) SCIENCE

16
GUBA: POSITIVISM, SOME RELEVANT QUOTES
P. 19: ”The basic belief system of positivism is rooted in a realist ontology,
that is, the belief that there exists a reality out there, driven by immutable
natural laws. The business of science is to discover the ”true” nature” -
ontology

P. 19: "If there is a real world operating according to natural laws, then the
inquirer must behave in ways that put questions directly to nature and
allow nature to answer back directly” - epistemology

p. 19: ”But how can that be done, given the possibility of inquirer bias, on
the one hand, and nature's propensity to confound, on the other? The
positivist's answer: by the use of manipulative methodology that controls
for both, and empirical methods that place the point of decision with
nature rather than with the inquirer. The most appropriate methodology is
thus empirical experimentalism, or as close an approximation there to as
can be managed." - methodology

Same answer

Even if effect size is 0.2, that effect represents something real
17
GUBA: POST-POSITIVISM, SOME RELEVANT
QUOTES AND INSIGHTS
P. 20: "...although a real world driven by real natural causes exist, it
is impossible for humans truly to perceive it with their imperfect
sensory and intellective mechanisms” - ontology

P. 26: "Even post-positivists have conceded that objectivity is not possible, the results of an inquiry are always shaped by the interaction of inquirer and inquired into". - epistemology

P. 21: ”Further, if objectivity can never be entirely attained, relying on many different sources makes it less likely that distorted interpretations will be made.” - methodology

‘Modified objectivity’ / approximate truths (= Popper’s position)

There is a real effect out there – but we can never ‘know’ if our theories are true
We can only compare validity of theory with validity of another theory, never direct access to the world itself

Not necessarily same answer
18
POPPER ON TRUTH (EPISTEMOLOGY)

https://youtu.be/li0ciaqJ0m0?t=155

19
GUBA: CONSTRUCTIVISM, SOME
RELEVANT QUOTES AND INSIGHTS
P. 25: ”…facts are only within some theoretical framework…basis for discovering ”how
things really are”…is lost”.
P. 26: ”Realities are multiple”. You can’t step outside your theoretical, constructed
framework - ontology

P. 26: "...the results of an inquiry are always shaped by the interaction of inquirer and
inquired into.”
p. 25: “…no unequivocal explanation is ever possible. There can be many constructions,
and there is no foundational way to choose among them.”
P. 26: ”…it makes the findings of an inquiry not a report about what is ”out there” but the
residue of a process that literally creates them [the findings]” - epistemology

P. 26: ”If realities exist only in respondents' minds, subjective interaction seems to be the only way to access them…” Concepts are constructed (e.g. motivation, inflation), let's be explicit about it and investigate how and why we have constructed it. Notice: not a direct quote. - methodology

Theories don’t represent something externally real; we ‘construct’ data, don’t find results
20
REFLECT: DO THESE ‘ISMS’ MAKE SENSE TO YOU? (2 MIN)
QUESTIONS?
Positivism vs. Postpositivism (critical realism) vs. Constructivism

Ontology. Positivism: reality independent of us; general concepts and theories exist (and are to be uncovered, ≈ an atom). Postpositivism: reality independent of us; general concepts and theories exist. Constructivism: multiple realities; social entities don’t exist independently of our thinking & theories about them; who is to say which reality perspective is true?

Epistemology. Positivism: we can observe the world as it is; we have access to theories / constructs, which represent and correspond to the world = objective knowledge. Postpositivism: objectivity is the ideal; theories / constructs are approximations; we can’t see the world as it is (≈ Popper’s position). Constructivism: theories / constructs are constructed, (inter)subjective knowledge; findings are a creation of the interaction between inquirer and inquired into.

Methodology (or investigator). Positivism: uncover truth, be a fly on the wall, see ‘through’ a microscope; RCTs the best. Postpositivism: RCTs are good, but valuable to add ‘naturalistic settings’; many sources reduce distortions; aim to remove the filter, see ‘with’ a microscope. Constructivism: we can’t go outside our theories; don’t aim to see the unfiltered world; re-construct the world; intersubjective approach; the interesting world is in the minds of those we research.
ON CONSTRUCTIVISM I

https://twitter.com/alexgaggio/status/1023131258065436672?s=11
Each row orders the color chips for one language: Colors
farther left are easier to communicate, those farther to the
right are harder to communicate.
https://images.theconversation.com/files/186385/original/file-20170918-8258-
oz7qnm.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip
ON CONSTRUCTIVISM II: TYPICAL
MISCONCEPTIONS
• Just because something is constructed doesn’t mean it is made up
• Constructivism is about the process of constructing theories – not a claim that the world is made up
• You can’t build just any kind of theory – the world constrains what can be built
• You can’t build a building without materials – similar with theories (there is a world / building material)
• Buildings can easily be destroyed – our institutionalized theories cannot (easily)

Constructivism vs. Social constructivism


WHAT POSITION DO (PSYCHOLOGY)
RESEARCHERS HAVE?

“The dialog is not to determine which paradigm is, finally, to win out”
Guba 1990, p. 27

https://twitter.com/research_tim/status/993756255423430656?s=11
WHERE DOES KUHN FIT INTO GUBA’S FRAMEWORK?
VEVOX.APP 143-136-950 (CLICK ON IMAGE)

Positivism vs. Postpositivism (critical realism) vs. Constructivism

Ontology. Positivism: reality independent of us; general concepts and theories exist (and are to be uncovered, ≈ an atom). Postpositivism: reality independent of us; general concepts and theories exist. Constructivism: multiple realities; social entities don’t exist independently of our thinking & theories about them; who is to say which reality perspective is true?

Epistemology. Positivism: we can observe the world as it is; we have access to theories / constructs, which represent and correspond to the world = objective knowledge. Postpositivism: objectivity is the ideal; theories / constructs are approximations; we can’t see the world as it is (≈ Popper’s position). Constructivism: theories / constructs are constructed, (inter)subjective knowledge; findings are a creation of the interaction between inquirer and inquired into.

Methodology (or investigator). Positivism: uncover truth, be a fly on the wall, see ‘through’ a microscope; RCTs the best. Postpositivism: RCTs are good, but valuable to add ‘naturalistic settings’; many sources reduce distortions; aim to remove the filter, see ‘with’ a microscope. Constructivism: we can’t go outside our theories; don’t aim to see the unfiltered world; re-construct the world; intersubjective approach; the interesting world is in the minds of those we research.
KUHN AND GUBA
“However, the classification of Kuhn into the paradigm of constructivism is not perfect. A central claim of constructivism is
that due to the underdetermination of theory, i.e. that “no theory can ever be fully tested”, “no unequivocal explanation is
ever possible. There can be many constructions, and there is no foundational way to choose among them.” (Guba 1990,
pp. 25) This stands somewhat in contrast to Kuhn’s perspectives on scientific development. Indeed, while he argues that
the incommensurability of paradigms implies that both may be right because they cannot be compared; “The
proponents of different theories are like the members of different language-culture communities. Recognizing the
parallelism suggests that in some sense both groups may be right.” (Kuhn 1962, pp. 205) However, Kuhn does then go on
to argue – somewhat – that scientific progress means later theories are better for solving scientific “puzzles” than earlier
theories; “…scientific development is, like biological, a unidirectional and irreversible process. Later scientific theories are
better than earlier ones for solving puzzles in the often quite different environments to which they are applied.” (Kuhn
1962, pp. 206) This implies that Kuhn is not entirely aligned with the subjectivist epistemology of constructivism despite
seemingly being aligned with the relativist ontology of the paradigm. Thus, it seems that Kuhn does not adhere perfectly to any of the three paradigms defined by Guba (1990), but that his perspectives on ontology place him closest to the constructivist view on the social science world.”

(Student exam answer)

26
NELSON 2016 AND PARADIGMS
(REFLECT 2 MIN)

Question 5 (21%) – 2020 Spring exam

How would you categorise Nelson's (2016) take on business economics as a science: Is it a positivist, post-positivist or constructivist position, as per Guba's (1990) categorisation of these positions? Please be specific in your answer and offer at least two quotes from Nelson (2016) that support your argument.
NELSON 2016, SOME QUOTES
Clearly not a positivist. Post-positivist or constructivist, according to Guba?

”That is, the number, and the assumptions behind its construction, comes to be treated as what the phenomenon is,
rather than as a useful if limited measure of that phenomenon.” (Nelson 2016, p. 1695).

“The numbers used in the social and behavioral sciences almost always are, by themselves, somewhat limited and imprecise characterizers of the phenomena they are designed to measure, and need to be understood as a part of a broader and more detailed if qualitative characterization” (Nelson 2016, p. 1695).

”And in these fields the kind of precise law-like relationships that physics has identified simply may not exist, even in
the sense of tightly stable stochastic relationships.” (Nelson 2016, p. 1693)

“If these phenomena are to be associated with numbers, some proxies need to be invented or some quantitative
indicators constructed that relate to phenomena of interest, but are not natural measures of that.” (Nelson 2016, p.
1696)

Guba 1990, p. 26: ”First, it renders the distinction between ontology and epistemology obsolete; what can be known and the individual who comes to know it are fused into a coherent whole”
A GOOD ANSWER TO 2020 Q5, PART 1
To categorise Nelson’s take on business economics as a science as per Guba (1990), I will need to first establish what Guba’s positions
are, as to be able to place Nelson in one (or several) of them. In his 1990 paper titled “The Alternative Paradigm Dialog”, Guba
discusses the three paradigms of positivism, post-positivism and constructionism. He discusses how the different paradigms will have
different answers to the ontological, epistemological and methodological questions (Guba 1990, page 18).
Positivism has the ontology that concepts and theories are real, i.e. there is one real “truth”. The epistemology is that the theories used
then represents the world as it is, and is therefore objective knowledge.
Post-positivism has almost the same ontology as positivism, in that there is one real “truth”. However, they are more critical about one’s
access to it, as they argue it is not reachable. Their epistemology is that objectivity then is the ideal, but we will never be able to see the
world as it is, and therefore our best bet is theories that approximate the real “truth” as closely as possible.
Constructivism has a completely different ontology, as this paradigm argues that there are several “truths”. As Guba puts it: “ “Reality”
exists only in the context of a mental framework (construct) for thinking about it” (Guba 1990, page 25). This also means, in relation to
constructivism’s epistemology, that theories are constructed, and that the constructed reality only then can be “seen” through this
window of constructed theory (Guba 1990, page 25).
First, I establish why I do NOT consider Nelson to be a positivist. As described above, a positivist will argue that the theories they use
are a window straight into how the world really is, and is therefore objective knowledge. However, in his paper, Nelson several times
mentions that many variables in social science do not make sense to evaluate at face value: “A good part of the reason why one or
a few numbers alone generally do not cover adequately the subject matter being studied, and the numbers themselves often are
somewhat fuzzy, is that the phenomena studied have several aspects and each of these has blurry edges.” (Nelson 2016, page 1696).
Nelson argues that we cannot just use one number to define a variable such as unemployment, as unemployment as a subject is
inherently fuzzy. You can get different arguably correct answers to what one considers unemployment. Whether it be people out of
work, people out of work but who are willing to work etc. But you cannot get different answers to what the weight of a specific atom is
(obviously given the respondent knows the answer). And therefore, it does not make sense to view numbers or theories in the social science field as a window into reality, as their inherent fuzziness needs to be accounted for.
A GOOD ANSWER TO 2020 Q5, PART II
A quote that could place Nelson on the more constructivist side is the following: “Many of the particular subject
matters treated by the social and behavioural sciences not only are quite heterogeneous, but the general conception
of the nature of the phenomenon often has uncertain boundaries.” (Nelson, page 1695). In the quote, Nelson argues
that the nature of phenomena has uncertain boundaries, which I think hints at him meaning the phenomena are a
construct of the created data and the given researcher, and should thus be treated as such. However, one could also
argue that he says the uncertain boundaries mean that the data are only an approximation to reality, which would then seem like him being a post-positivist.

In general, I would argue the way he addresses quantifiable measures makes him seem like a post-positivist, as it
seems as though some data/numbers do have a face value on their own, such as in the following quote: “The
numbers used in the social and behavioural sciences almost always are, by themselves, somewhat limited and
imprecise characterizers of the phenomena they are designed to measure, and need to be understood as part of a
broader and more detailed if qualitative characterization.” (Nelson 2016, page 1695). However, the way he
addresses that data needs to be understood as part of a broader qualitative characterization makes him seem more
constructivist, as it seems he relies more on the creation between the data and the researcher to create meaning.

To conclude, I would argue that Nelson’s 2016 paper is definitely not positivistic, as he does not believe theories are a
window of the one “true” reality. However, when evaluating whether he is post-positivistic or constructivist, I would
argue that the discussion gets blurrier, and that he is somewhere in between.
MORE ON METHODOLOGY
FOR EXAMPLE, HOW TO DO INTERVIEWS I

What kind of approach is this?

31
HOW TO DO INTERVIEWS – AS A
CONSTRUCTIVIST? II
A constructivist would
• probably not hide identity – but participate openly
• not focus on ‘contamination’ and ‘purity’
• seek out multiple, different understandings – and
understand why participants think/behave
• contrast responses from one respondent with responses from others (dialectic, Guba pp. 26-27)
• while in PP the aim of a standardized approach is to be able to smoothly compare respondents’ responses – in C one accepts and uses their different realities
• rely on more open-ended, flexible approaches &
questions

Semi-structured vs. Narrative approach


32
HOW DO WE ASSESS TRUTH / QUALITY OF WORK?

Positivism: Truth (validity, hitting the bullseye metaphor)


Post-positivism: Approximating truth (validity, bullseye metaphor)
Constructivism: “Constructions are not more or less "true," in any absolute sense, but simply
more or less informed and / or sophisticated.” (Lincoln & Guba 1994, p. 110)
 quality criteria: credibility, authenticity, transparency

33
CHATBOT ANSWER: “CAN YOU GIVE EXAMPLES OF HOW INTERVIEW QUESTIONS
WOULD DIFFER IN A POST-POSITIVISTIC VS CONSTRUCTIVIST APPROACH?”
Certainly! Here are examples of interview questions that might differ between a post-positivist and constructivist approach:

Post-positivist approach: In a post-positivist approach, researchers attempt to identify patterns and relationships while
acknowledging the limitations of human knowledge. They strive for objectivity and often try to minimize their own biases. Interview
questions in a post-positivist approach might include:
Can you describe the steps involved in your decision-making process?
What factors do you consider when evaluating a new opportunity or project?
How do you measure the success of a particular initiative or project?

Constructivist approach: In a constructivist approach, researchers focus on understanding the subjective experiences and
meaning-making processes of the participants. They embrace the co-construction of knowledge between the researcher and the
participant and acknowledge their own role in shaping the research findings. Interview questions in a constructivist approach
might include:
Can you tell me about a time when you faced a difficult decision, and how you made sense of the situation?
How do your personal values and beliefs influence your evaluation of new opportunities or projects?
In what ways has your understanding of success evolved throughout your experiences with various initiatives or projects?
The main difference between the questions lies in the emphasis on the participant's subjective experiences and meaning-making
processes in the constructivist approach. While both approaches use interviews to explore participants' perspectives, the
constructivist approach places more emphasis on the individual's interpretation of events and personal context. In contrast, the
post-positivist approach focuses more on identifying patterns and relationships across participants' experiences, while still
acknowledging the limitations of human knowledge and the potential influence of context.”
34
POST-POSITIVISM VS CONSTRUCTIVISM: A FEW NOTES
Typical (PP) aim: Find out what is really the case
Most (?) bachelor projects: Some variant of post-positivism
• What is the aim: To discover / carve out information (like a treasure hunt / archaeologist) – or to produce/construct knowledge?
• If risk of interfering: Aiming to be as distant / invisible as possible – or can the interviewer also be engaged and passionate (e.g. in an interview situation)?
• Interview guide: Standardized, and how strictly followed?
• Considered necessary to quantitatively test (qualitative) insights afterwards?
• How do you assess objectivity? ‘Truth’ vs trustworthiness
• ….

Not always easy to differentiate PP and C!

35
ANOTHER EXAMPLE: STUDYING THE VARYING
KINDS OF IMPACT OF BREASTFEEDING I

Example presented in this video from 2020


https://youtu.be/4ZOeQomRHpc
STUDYING THE VARYING KINDS OF IMPACT OF
BREASTFEEDING II

https://w0.pngwave.com/png/791/656/medical-laboratory-hospital-physician-science-science-png-clip-art.png

37
BREASTMILK IS GOOD FOR
THE BABY!
Breast milk  good for baby

https://lh3.googleusercontent.com/proxy/AawPg7iPTMOJez2tm
0RTBo6qJdfX0V7d4fuLpbFLUTRRdi_KB1qr_T7ahed0pxlzBTBFEIu
_jj9WHGP7PDKPgnPKUA

38
BUT…IT IS NOT SO SIMPLE
WHAT IS THE RESEARCH QUESTION?
Breastfeeding can be very painful, and basically impossible  potentially loss of sleep & self-esteem, + feelings of shame for the mother

Can these concepts be measured via numbers?

https://media.istockphoto.com/id/1346197512/vector/rate-us-bar-from-1-to-10-best-for-website-design-app-ui-isolated-on-white-
background-eps-10.jpg?s=612x612&w=0&k=20&c=RrUgDdl7SLsP2snleuivNDyslYbWbAZ91Xl3t4zGfjc=

39
• Norms of self-esteem and when to feel shame
vary across cultures ≠ physiological impact
https://media.istockphoto.com/id/1346197512/vector/rate-us-bar-from-1-to-10-best-for-website-design-app-ui-isolated-on-white-
background-eps-10.jpg?s=612x612&w=0&k=20&c=RrUgDdl7SLsP2snleuivNDyslYbWbAZ91Xl3t4zGfjc=

• Maybe ‘level of esteem/shame’ can’t be properly measured via a number? (too heterogeneous, fuzzy?)

• Mothers’ emotions (shame, self-esteem) can also  wellbeing of baby

40
WHAT IS THE RESEARCH QUESTION (RQ) 
PHILOSOPHY OF SCIENCE ASSUMPTIONS
When the RQ shifts from physiological components to psychological considerations:

• Shift in how distinct, pre-existing the reality is?
• Shift in how certain we can be about actually measuring what is out there?
• Might the shift require moving from measurement to engaging in more reflexive conversation with anxious mothers?

Which paradigm applies thus also depends on the RQ and the phenomena the RQ seeks to investigate
Also: If one has certain (paradigm) assumptions, one addresses certain RQs
FINAL PERSPECTIVES ON
THEORY
Self-falsification? Did saying a theory out
loud change behavior? (Price and Wolfers
2010; Pope et al. 2013)

42
PARADIGMS AND METAPHORS OF THEORY I

(Guba & Lincoln 1994, p. 113)

Theories shouldn’t be able to self-falsify vs. Not surprising if theories can self-falsify
SUMMING UP LESSON 9

44
PHILOSOPHY OF SCIENCE

LESSON 10
HUMAN ‘RATIONALITY’ AND BIASES
Carsten Bergenholtz (cabe@mgmt.au.dk)
Department of Management, Aarhus BSS
Aarhus University
PHILOSOPHY OF SCIENCE ASSUMPTIONS ABOUT
HUMAN (RESEARCHERS)
Philosophy of Science: Usually focuses on assumptions about
research designs, what a theory is and what world we investigate
[Diagram: RD / Methods, Theory, Social world – Humans?]

• Classical economic assumption about humans: “everyone is reacting rationally to the market situation” (quote by Spence, Nobel Prize winning economist)

https://www.simplypsychology.org/information-processing.html

• Kahneman questions the assumption about the rational homo economicus: How does the human mind ACTUALLY work and how do humans actually behave?

2
https://cdn-images-1.medium.com/max/2000/1*kQdOtDv-7KolqkWPyQD0kw.png
AGENDA FOR LESSON 10
• Where does ‘irrationality’ of economic thinking come from?
• Introducing system 1 & system 2
• What You See Is All There Is (cf. chapter 7)
• Systematic cognitive biases (e.g. confirmation bias) – not just random errors
• Efficient yet imperfect cognitive systems
• How to solve the flaws of system 1 & 2 – in general and in science
KAHNEMAN: THINKING, FAST AND SLOW (2011). NOBEL PRIZE
WINNER WHOSE WORK LED TO ”BEHAVIORAL ECONOMICS”

4
THE RISE OF ‘BEHAVIORAL ECONOMICS’ (≈ KAHNEMAN
INSPIRED LITERATURE) IN ECONOMIC LITERATURE


https://www.bloomberg.com/view/articles/2018-08-02/how-economics-went-from-philosophy-to-science
WHERE DOES THIS LACK OF RATIONALITY ORIGINATE? I

Same question, very different answers!

This seems ‘irrational’ – due to emotions or poor thinking?

https://image.slidesharecdn.com/behaviourchangeivovla
evclahrc-150515122817-lva1-app6892/95/what-is-
unique-about-behavioural-economics-prof-ivo-vlaev-30-
638.jpg?cb=1431693095


6
WHERE DOES THIS LACK OF RATIONALITY COME FROM? II
NO EMOTIONS INVOLVED, BUT STILL NOT QUITE RATIONAL

https://image.slidesharecdn.com/goldpresentation-130107111506-phpapp01/95/how-behavioral-
economics-informs-website-design-5-638.jpg?cb=1357719312

7
WHAT DOES LINDA DO?
(REFLECT 1 MIN, VEVOX.APP 143-136-950)

“Linda is thirty-one years old, single, outspoken, and very bright,” they wrote. “She
majored in philosophy. As a student, she was deeply concerned with issues of
discrimination and social justice, and also participated in antinuclear
demonstrations.” Then they went around asking people the same question:

Which alternative is more probable?

• Linda is a bank teller.


• Linda is an insurance salesperson
• Linda is a bank teller and is active in the feminist movement.

https://janav.files.wordpress.com/2013/11/lindafixed.jpg
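For reference, the probability rule at stake (Kahneman's 'conjunction fallacy'): for any two events $A$ and $B$,

$$P(A \wedge B) \le P(A)$$

so "Linda is a bank teller and is active in the feminist movement" can never be more probable than "Linda is a bank teller" alone, however representative the description feels. System 1's similarity matching is what makes most respondents rank the conjunction higher.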
‘IRRATIONALITY’ OF HUMAN THINKING
WHERE DOES IT COME FROM? (P. 8)

KAHNEMAN (2011): SYSTEM 1 AND 2

System 1 & 2 are presented in this 2021 video: https://youtu.be/QOOdtPJIlbk

10
SYSTEM 1 VS. SYSTEM 2 (P. 19-22): TWO DIFFERENT
SYSTEMS / FORMS OF HUMAN THINKING I

System 1: 2+2 = 4. What is the capital of Denmark?
System 2: 17*24 = ? What is Kuhn’s main argument?

”System 1 operates automatically and quickly, with little or no effort and no sense of voluntary control” (p. 20)

”System 2 allocates attention to the effortful mental activities that demand it, including complex computations. The operations of System 2 are often associated with the subjective experience of agency, choice and concentration” (p. 21)
SYSTEM 1 VS. SYSTEM 2 (P. 19-22)
RABBIT VS. THE TURTLE

System 1:
• Detect that one object is more distant
• Orient to the source of a sudden sound
• Detect hostility in a voice
• Understand simple sentences

System 2:
• Focus on the voice of a particular person in a crowded and noisy room
• Search memory to identify a surprising sound
• Solve difficult math problems
• Evaluate complicated information
• Check the validity of a complex logical argument

(see p. 21-22 for further examples)
Illustration by David Plunkert
19
SYSTEM 1 VS
SYSTEM 2
• ”[some] mental activities become fast and automatic through prolonged practice” (p. 22)

• Two different thinking processes, partially trained – not two systems that are genetically determined and located in different parts of the brain

System 1 is gullible and biased to believe, System 2 is in charge of doubting and unbelieving, but System 2 is sometimes busy, and often lazy (p. 81).

13
HOWEVER, THESE SYSTEMS LEAD TO SYSTEMATIC
MISTAKES / TENDENCY TO MAKE ERRORS
System 1 shortcuts + system 2 laziness  errors

https://thumbs.dreamstime.com/z/biases-human-mind-pictured-as-word-inside-head-to-
symbolize-relation-psyche-d-illustration-172338048.jpg

14
WHAT DO WE SEE HERE?

Kahneman 2011, ch. 7, p. 79

15
THE ‘STROOP’ TEST: WE CAN FAIL AT TASKS THAT SEEM QUITE SIMPLE

Try to identify the colour of the word, not what is written

See also Kahneman 2011, p. 25 for a similar example


IF QUICK THINKING SYSTEM 1 RESPONDS…
The ball and the bat
A baseball and a bat cost $1.10 together
The bat costs $1 more than the ball
How much does the ball cost?
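The slow, system 2 check, worked out: let $b$ be the price of the ball, so the bat costs $b + 1$. Then

$$b + (b + 1) = 1.10 \implies 2b = 0.10 \implies b = 0.05$$

The ball costs 5 cents, not the intuitive 10 cents (a 10-cent ball would make the bat \$1.10 and the total \$1.20).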
WYSIATI: WHAT YOU SEE IS ALL THERE IS
(KAHNEMAN 2011, CH 7, P. 85)

Kahneman 2011, ch 7, p. 85

Kahneman 2011, ch 7, p. 87

18
19
WE DON’T JUST SEE THE WORLD AS IT IS
ATTENTION SHAPES PERCEPTION  TOP DOWN FILTERING OF INFORMATION

We don’t see the world just as it is, cf. Kuhn

Our systems direct attention, which shapes perception of the world

WYSIATI
https://twitter.com/Foone/status/1014273847518851073?s=20

20
TOP DOWN ‘FILTERING’ I: EXPECTATIONS MAKE US MORE
READY FOR SOME KIND OF INFORMATION

https://twitter.com/page_eco/status/1461306098757947396?s=20&t=qZ2hAz8rZxex3lDLwXmBMw

21
TOP DOWN ‘FILTERING’ II

What we see is in part what we predict will be out there – our brain is literally a prediction machine

https://twitter.com/page_eco/status/1155077311399489536?s=20
SLIDE BORROWED FROM JACOB ORQUIN
KAHNEMAN ON THINKING FAST AND SLOW
https://www.youtube.com/watch?v=iVrLsN5zM_A

24
SYSTEM 1 VS. SYSTEM 2: OVERVIEW

http://ameritest.wordpress.com/2013/01/29/completing-the-picture-part-ii-system-1-vs-system-2-thinking0/

Irrationality does not only come from lack of cognitive effort or interrupting emotions – it is an integrated feature of our brain’s cognitive setup

Kahneman: Important to understand the systematic nature of these mistakes, i.e. biases
SYSTEM 1: NOT JUST RANDOM ERRORS, BUT
SYSTEMATIC COGNITIVE BIASES

Kahneman 2011, p. 25: “System 1 is generally very good at what it does: its models of familiar situations are accurate, its short-term predictions are usually accurate as well, and its initial reactions to challenges are swift and generally appropriate. System 1 has biases, however, systematic errors that it is prone to make in specified circumstances” (my emphasis)

https://www.castoredc.com/blog/cognitive-bias/

26
NON-SYSTEMATIC LIST OF COGNITIVE BIASES
(EMERGING FROM SYSTEM 1, LACK OF SYSTEM 2 ACTIVATION)

http://lifehacker.com/this-graphic-explains-20-cognitive-biases-that-affect-y-1730901381?sidebar_promotions_icons=testingon&utm_expid=66866090-67.e9PWeE2DSnKObFD7vNEoqg.2
ANCHORING BIAS & CONFIRMATION BIAS
• Scenario: Two headlines on a scientific study appear on your
feed—one confirms, the other contradicts your beliefs.
• Inclination: We naturally gravitate towards information that
supports our existing views.
• Impact: This bias affects not only our perception but also the
information we choose to share and endorse.
 Shapes what information we process & thus conclusions
CONFIRMATION BIAS (NOT JUST EMOTIONS, SOCIAL CONFORMITY,
CLASSICAL IRRATIONALITY THAT DRIVES BIAS)

https://www.all-about-psychology.com/selective-perception.html

This doesn’t seem rational. But where does this ‘irrationality’ / bias come from? (p. 5)
CONFIRMATION BIAS: WHEN ARE WE CRITICAL?

An investor hears a rumor that a company is on the verge of declaring bankruptcy and considers selling the stock. The investor then […] looks for negative news about the company and ignores positive indicators.
https://www.investopedia.com/terms/c/confirmation-bias.asp

https://twitter.com/dsquintana/status/1488815799427022849?s=12

30
WHY DOES OUR SOPHISTICATED BRAIN ‘SUFFER’
FROM CONFIRMATION BIAS?
System 1 is lazy:
• We prefer to process information that matches expectations – easier for a new bit of information to find ”its place” in the brain
• Processing information that challenges beliefs, requires
more energy and is ‘risky’

We like positive rewards: Finding out one is right, ‘feels’ better

31
CONFIRMATION BIAS: NEGATIVE VS. POSITIVE
FEEDBACK PROCESSED DIFFERENTLY

Emotions and / or
Cognition?
32
THIS IS NOT NEW:
HUMANS HAVE KNOWN
ABOUT BIASES FOR
CENTURIES

Thomke: Experimentation works, p. 135

Bacon’s writings from the 17th century

33
BIASES - AND PRIOR PHILOSOPHY OF SCIENCE
CURRICULUM I

34
BIASES - AND PRIOR PHILOSOPHY OF SCIENCE
CURRICULUM II

Kahneman 2011, ch 7, p. 81

• Popper’s idea of science goes against human (system 1) nature

• Thus in line with Kahneman’s idea – we like to confirm; we should try to disconfirm

35
HOWEVER: BIASES AREN’T JUST BAD

36
SYSTEM 1 AND 2: EFFICIENT YET IMPERFECT
COGNITIVE SYSTEMS
• Why these systems? Why is (lazy) system 1 the default?

• Evolutionary perspective
• When humans ran away from wild animals  a quick system 1 decision was necessary
• Confirmation bias  strengthens groups
• System 2 requires energy, energy is expensive
• Thus, we can’t constantly question ourselves (p. 28)  efficient heuristics needed.
• The system is efficient – just has biases/errors
• …

Systems not adapted to statistics and logical challenges – not wired into our brains (yet?).
KAHNEMAN EXAM QUESTION EXAMPLES
(GOOD ANSWERS WILL BE UPLOADED)

2021 Spring: Question 5 (12%)


Please explain Kahneman's system 1 and 2 using an example from an organisational context. You are welcome to use an example
used in teaching or draw on your own experience or readings.

2022 Spring: Question 3 (16%)


Which of the following sentences is Kahneman unlikely to have authored? Please be specific in your explanation of why some of
the sentences do not match Kahneman's presentation of system 1 vs. system 2.

• System 1 is effective but also implies a range of cognitive systematic biases.


• My message is that it is possible to greatly reduce the number of these systematic biases.
• System 2 is a mental process that requires effort, but an individual can ensure to activate system 2 when necessary, thus
reducing the number of systematically biased errors system 1 might make.
• Thus, the more one is aware of the systematic mistakes one tends to make, the better chance one will have to limit the
number of biases to which one falls prey.
• Interestingly, since one knows oneself best, it is easier to identify the mistakes oneself makes, rather than the biases to which
one's friends and colleagues fall prey.

38
SO, WE HAVE SYSTEMATIC BIASES. HOW TO REDUCE
BIASES AND ‘IRRATIONALITY’? I (REFLECT 1 MIN)

39
HOW TO REDUCE BIASES? (PRE-KAHNEMAN)
• Pre-Kahneman, homo economicus assumption: Irrationality stems from lack of
incentives / information or emotions

• To improve thinking and rationality  increase incentives, teach logic/probability, and control emotions

40
INCENTIVES CAN’T EASILY FIX IT THOUGH

Camerer and Hogarth (1999) conclude that “…no replicated study has made rationality violations disappear purely by raising incentives.” (p. 8)

41
POST-KAHNEMAN: DIFFERENT PROPOSALS ON
HOW TO FIX BIASES

How to reduce risk of “falling” for system 1’s biases via focus on cognition / thinking

1) Debiasing: More information, better (critical) thinking skills

2) Nudges / choice architecture: Change the way options and information are presented

42
1) DEBIASING: ENHANCING CRITICAL
THINKING (IN SCIENCE)
a) Proactively seek differing viewpoints to
challenge your own (disconfirm)
b) Rely on evidence, not mere intuition, to support
claims (be aware of (your) bias)
c) Carefully analyse information to ensure it
substantiates your claims (correlation vs.
causation e.g.)
d) Aim for objectivity by slowing down thinking &
controlling emotions (system 2 activation)
e) Clearly define terms to reduce
misunderstandings (our world is ambiguous, cf.
Nelson)

43
2) NUDGING: CHANGE THE DEFAULT
SETTING OF A CHOICE SITUATION

• To change human behavior, don’t target ”homo economicus” or emotions.

Video by Ariely: https://youtu.be/9X68dm92HVI?t=754

• Choice architecture: Influence ”what they see” and engage system 1 to your advantage
44
45
SUMMING UP: WHAT TO LEARN FROM KAHNEMAN
• Emotions vs. cognition: Where do biases / irrationality
come from

• Be able to describe system-differences

• What You See Is All There Is  system 1 biases

• However, system 1 & 2 also efficient

• How to improve thinking and decisions?

Both the humans we study and the researchers studying them are prone to such systematic biases
TO THE CURIOUS ONES: FURTHER READINGS
• The entire book: Thinking, Fast and Slow (2011)
• Short video, overview: https://www.youtube.com/watch?v=JiTz2i4VHFw
• Video of Kahneman, on the book (45 min): http://www.youtube.com/watch?v=DUpqr7nWc3s

• Other books:
• Thaler: Misbehaving (Nobel Prize 2017 winner)
• Galef: The Scout Mindset (2021)

• Further examples of questions and applications: http://www.vanityfair.com/business/features/2011/12/kahneman-quiz-201112
• Google and heuristics:
http://www.nytimes.com/2011/03/13/business/13hire.html?pagewanted=3&_r=0

• Galef: Rationally Speaking podcast


PHILOSOPHY OF SCIENCE

LESSON 11
SCIENTIFIC PROCESSES AND OBJECTIVITY

Carsten Bergenholtz (cabe@mgmt.au.dk)


Department of Management, Aarhus BSS
Aarhus University
SUMMING UP LESSON 10

Confirmation bias:
Both about how we collect /
process / obtain information

Jigsaw puzzles analogy 

2
LESSON 11: AGENDA
• Aschwanden 2015
• Science is difficult I – politics example
• Is science broken due to cheating & biases?
• Science is difficult II – soccer example
• Is there a right scientific analysis of a study?
• Science as a collective process & what kind of truth?

• Harrer 2021
• What is a meta-analysis
• Strengths and pitfalls

Lesson 7-9: The principles of why it is difficult to reach true objectivity
Lesson 11: A practice-oriented look at the objectivity challenge and options to solve it
3
ASCHWANDEN 2015: IS SCIENCE BROKEN?
Science troubled by
• Replication crisis
• Fraudulent papers getting published
• Researchers have biases
• Inconsistent results
• ….

Aim of today’s lesson 

4
MANY ORIGINAL STUDIES DO NOT REPLICATE?
(WHEN REDONE BY OTHERS)

http://theneuroeconomist.com/2016/03/is-the-replication-crisis-in-psychology-real/
https://science.sciencemag.org/content/349/6251/aac4716
WHY THIS LACK OF REPLICATION?
Shouldn’t science have institutions, peer review processes and criteria that ensure that only true / good research gets published?

6
PEER REVIEW CAN’T FIX EVERYTHING: CHEATING,
(BAD?) INCENTIVES AND HUMAN BIASES
Partially due to fraud / cheating (see lesson 12)
• Some papers get retracted (0.02%), most due to fraud

Partially due to human biases (cf. Kahneman)
• ”falling prey to human biases…”
• ”you believe your hypothesis…”
• ”you have to believe…to give energy and passion”
(all Aschwanden 2015 quotes)

https://twitter.com/lakens/status/1335227645135564800?s=20

7
P-(VALUE) HACKING

"That’s because answering even a


simple scientific question — which
party is correlated with economic
success — requires lots of choices
that can shape the results.”
(Aschwanden 2015)

“The variables in the data sets you used to test your hypothesis had 1,800 possible combinations. Of these, 1,078 yielded
a publishable p-value,1 but that doesn’t mean they showed that which party was in office had a strong effect on the
economy. Most of them didn’t.” (Aswchwanden 2015, https://fivethirtyeight.com/features/science-isnt-broken/#part1)
8
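A minimal simulation of this mechanism (a sketch with invented numbers, not Aschwanden's actual data or variables): even when the data contain no true effect at all, trying many defensible analysis choices on the same dataset makes it likely that some of them produce a "publishable" p < 0.05.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical data with NO true effect: the outcome is pure noise.
n = 200
party = rng.integers(0, 2, n)       # e.g. which party is in office
economy = rng.normal(size=n)        # e.g. some economic indicator

# Many defensible "analysis choices": different outlier-trimming rules.
n_choices, n_significant = 0, 0
for low in [0.00, 0.05, 0.10, 0.15]:        # trim the bottom x% ...
    for high in [0.85, 0.90, 0.95, 1.00]:   # ... and the top (1 - y)%
        lo_cut, hi_cut = np.quantile(economy, [low, high])
        keep = (economy >= lo_cut) & (economy <= hi_cut)
        p = stats.ttest_ind(economy[keep & (party == 0)],
                            economy[keep & (party == 1)]).pvalue
        n_choices += 1
        n_significant += p < 0.05

print(f"{n_significant} of {n_choices} analysis paths gave p < 0.05, "
      f"despite no true effect")
```

With only 16 analysis paths the count of false positives will often be small; Aschwanden's interactive example had 1,800 combinations, which is exactly why some of them are bound to "work".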
CHARLES BABBAGE, IN 1830:
REFLECTIONS ON THE DECLINE
OF SCIENCE IN ENGLAND

On the problem of bias in science, and the lack of replications in science

An old problem!

https://twitter.com/lakens/status/1305153073258561542?s=20
9
ASCHWANDEN 2015: SCIENCE IS NOT BROKEN
Yet also better reasons for why science studies do
not always replicate:

“…I’ve spent months investigating the problems hounding science, and I’ve learned that the headline-grabbing cases of misconduct and fraud are mere distractions. The state of our science is strong, but it’s plagued by a universal problem: Science is hard — really fucking hard.”

https://fivethirtyeight.com/features/science-isnt-broken/#part1
HOWEVER: WHY SAME DATA, DIFFERENT CONCLUSIONS?
IS THE CONCLUSION TRUE? ARE DARK-SKINNED PLAYERS
MORE LIKELY TO GET A RED CARD? (VEVOX.APP 143-136-950)

• Yes, true
• No, not all studies show this result. Maybe they were (politically,
cognitively, emotionally) biased?
• Well, what does ‘truth’ mean?
NO BIAS? WELL, NO PRIOR SYSTEMATIC DIFFERENCES
BETWEEN THE 29 TEAMS

For example: the research teams’ prior (personal) beliefs about an expected relation were unrelated to the results

Of course difficult to completely rule out the impact of any prior bias

Table 4. Covariates Included by Each Team
DOI: 10.1177/2515245917747646

Nelson 2016: Heterogeneous constructs and numerous influences…
Is there always a correct decision to be made? Many decisions

Fig. 3. Point estimates (clustered by analytic approach) and 95% confidence intervals for the effect of soccer players’ skin tone on the
number of red cards awarded by referees. Reported results, along with the analytic approach taken, are shown for each of the 29
analytic teams. The teams are clustered according to the distribution used in their analyses; within each cluster, the teams are listed in
order of the magnitude of the reported effect size, from smallest at the top to largest at the bottom. The asterisks indicate upper bounds
that have been truncated to increase the interpretability of the plot (see Fig. 2). OLS = ordinary least squares; WLS = weighted least
squares; Misc = miscellaneous.

DOI: (10.1177/2515245917747646)
IS THERE EVEN SUCH A THING AS THE RIGHT ANALYSIS?
WELL, SOME CHOICES ARE BETTER THAN OTHERS

Some analyses are wrong
Maybe multiple analyses can be right?

https://twitter.com/lakens/status/1166049199470850049?s=20
DIFFERENT INTERPRETATION OF RESEARCH
QUESTION  DIFFERENT CHOICES
”In 2018…published an article in which 29 teams analyzed the same research question with the same
data: … In this article, we investigate why results diverged so much. We argue that the main reason was
an unclear research question: Teams differed in their interpretation of the research question and
therefore used diverse research designs and model specifications. We show by reanalyzing the data
that with a clear research question, a precise definition of the parameter of interest and theory-guided
causal reasoning, results vary only within a narrow range. The broad conclusion of our reanalysis is that
social science research needs to be more precise in its “estimands” to become credible.”

https://doi.org/10.1177/23780231211024421

Less variation in results!

17
ANOTHER EXAMPLE: MANY ANALYSTS, MANY (QUITE?)
DIFFERENT RESULTS

• 162 researchers (73 teams).
• Data: 6 questions from the International Social Survey Programme
• Hypothesis: (whether) immigration reduces public support for social policies
• Results: substantial variation, not predicted by researcher skills or beliefs

• Some (not all!) questions are just really difficult – many choices to be made

https://twitter.com/tomstafford/status/1377131265405505546
18
SCIENTISTS ALSO AGREE ON MANY QUESTIONS
IMPRESSIVE WHEN THEY DO, GIVEN POTENTIAL FOR VARIATION…

https://en.wikipedia.org/wiki/Scientific_consensus_on_climate_change

19
SAME (RED CARD) DATA, MOSTLY SIMILAR CONCLUSIONS
• “On the other hand, it also suggests there’s
a there there. It’s hard to look at that data
and say there’s no bias against dark-
skinned players.”

• “The important lesson here is that a single analysis is not sufficient to find a definitive answer. Every result is a temporary truth, one that’s subject to change when someone else comes along to build, test and analyze anew.”

• “But these disparate results don’t mean that studies can’t inch us toward truth.”

(Aschwanden 2015) http://www.proprofs.com/quiz-school/user_upload/ckeditor/Trooth.jpg


TRUST NOT ONE STUDY, BUT MANY
SCIENTIFIC DEVELOPMENT: IN PRACTICE

“The standard way of thinking about the scientific method is: ask a question, do a study, get an answer. But this notion is vastly oversimplified. A more common path to truth looks like this: ask a question, do a study, get a partial or ambiguous answer, then do another study, and then do another to keep testing potential hypotheses and homing in on a more complete answer. Human fallibilities send the scientific process hurtling in fits, starts and misdirections instead of in a straight line from question to truth.”
(Aschwanden 2015)

http://undsci.berkeley.edu/images/us101/paths.gif
SCIENTIFIC DEVELOPMENT:
COLLECTIVE PROCESS
"Since people are fallible, and often obstinate and overly fond of
their own ideas, the objectivity of the process which tests
conjectures lies not in the emotional detachment and
impartiality of individual scientists, but rather in the scientific
community being organized in certain ways, with certain
institutions, norms and traditions, so that individuals’ prejudices
more or less wash out (Popper, 1945, Chapters 23–24)."
http://www.stat.columbia.edu/~gelman/research/published/philosophy.pdf ≈ Kahneman, Kuhn

23
KUHN VS ASCHWANDEN: WHAT DOES THE WORD
‘TRUTH’ ACTUALLY MEAN?
But the scientists have not realized that Kuhn’s claim is based on a philosopher’s
definition of “truth,” which is not achieved (if ever) until there is exact
knowledge of the ultimate constituents of matter on the quantum-gravity scale.
Scientists use a less demanding definition for the word “truth,” in which
measured parameters can be subject to nonzero error bars.”
Kenneth G. Wilson, in Physics Today 54, 3, 53 (2001)

Kuhn
Aschwanden
24
A RELATED METAPHOR, ON TRUTH
"The empirical basis of science has nothing 'absolute'
about it. Science does not rest upon solid bedrock. The
bold structure of its theories rise, as it were above a
swamp. It is like a building erected on piles. The piles
are driven down from above into the swamp, but not
down to any natural or 'given' base; and if we stop
driving the piles deeper, it is not because we have
reached firm ground. We simply stop when we are
satisfied that the piles are firm enough to carry the
structure at least for the time being.”
(Popper 1959, p. 111, quoted from Duberley & Johnson 2000)

25
SCIENCE AS UNCERTAINTY REDUCTION I
WE OFTEN SHOULDN’T WAIT FOR 100% CERTAINTY

"Science is not a magic wand that turns everything it touches to truth. Instead, “science
operates as a procedure of uncertainty reduction,” said Nosek, of the Center for Open
Science. “The goal is to get less wrong over time.” (Aschwanden 2015)

https://twitter.com/erikleejohnson/status/1357413415606755334?s=20
Reflect 2-3 min: What is Aschwanden’s (2015) main message?

"Science is not a magic wand that turns everything it touches to truth. Instead, “science operates as a procedure of uncertainty reduction,” said Nosek, of the Center for Open Science. “The goal is to get less wrong over time.” (Aschwanden 2015)
SO, WHY IS SCIENCE NOT BROKEN?
(OR WHY SHOULD THE PUBLIC BELIEVE IN SCIENCE)

Yes, researchers have biases, some cheat, some are sloppy. But 

• Science is ******* difficult, failures are to be expected
• Some fields are more heterogeneous, fuzzy and difficult than others
• Researchers are fallible and biased, paradigmatic groupthink constrains
• Don’t trust one study, trust multiple studies with different research designs
• Be critical: What is measured (and where), what is theorized?

Thus: Avoid these 2 flawed interpretations of science
• Pure relativism  because we collectively can’t just produce any result
• Blind belief in science  because we collectively don’t just produce one result

This video by John Oliver relates to today’s issue: https://www.youtube.com/watch?v=0Rnq1NpHdmw


2021 SPRING EXAM QUESTION (DISCUSS 2-3 MIN)

Question 3 (20%)
i) Please identify a quote in Aschwanden (2015) that you think has
the best fit with the overall message in Nelson (2016). Provide an
explanation for how the quote fits into Nelson's overall message. ii)
Please also identify a quote in Aschwanden (2015) that seems to be
in disagreement with Nelson (2016).

Note that a quote can be one or several sentences long.

I will upload a good exam answer to Brightspace

29
ASCHWANDEN AND NELSON: 2021 GRADING GUIDE
Several quotes can be applied in i). The following lists some of the most obvious ones. Again, others can be applied. Arguments should be
presented, both on Nelson’s position (not elaborated here), but also on how the quote links to Nelson.
• ”That’s because answering even a simple scientific question — which party is correlated with economic success — requires lots of
choices that can shape the results. This doesn’t mean that science is unreliable. It just means that it’s more challenging than we
sometimes give it credit for.” - this is probably the most ideal quote
• ”Even the most skilled researchers must make subjective choices that have a huge impact on the result they find”
• ”If we are going to rely on science as a means for reaching the truth – and it’s still the best tool we have – it’s important that we
understand and respect just how difficult it is to get a rigorous result.”
• “As a society, our stories about how science works are also prone to error. The standard way of thinking about the scientific method is:
ask a question, do a study, get an answer. But this notion is vastly oversimplified”
• ”The important lesson here is that a single analysis is not sufficient to find a definitive answer.”
• ”The uncertainty inherent in science doesn’t mean that we can’t use it to make important policies or decisions. It just means that we
should remain cautious and adopt a mindset that’s open to changing course if new data arises.”

Some rely on the following quote: “The state of our science is strong but it’s plagued by a universal problem: Science is hard - really fucking
hard.” Nelson would not disagree as such, but since this quote is not specific, using this quote requires one to specify what exactly
Aschwanden means, in order to enable a direct comparison to Nelson’s position.

A relevant quote for ii)


• “But these disparate results don’t mean that studies can’t inch us toward truth.” Several other similar quotes can be applied. The position one can unfold is that Aschwanden’s position is more akin to post-positivism, while Nelson’s position is more akin to constructivism.
• "P-hacking and similar types of manipulations often arise from human biases." - one could use this quote to state that while
Aschwanden focuses somewhat on biases and p-hacking, Nelson argues that social science is difficult because of the nature of the
subject, rather than the biased humans studying social science.
• Other quotes are also useful.

30
LINKING ASCHWANDEN TO PRIOR CURRICULUM
Lesson 10 (Kahneman): Human biases shape decisions
Lesson 8 (Nelson): Heterogeneous variables, numerous influences
Lesson 7-8 (Kuhn): Paradigms and prior beliefs / standards in scientific communities
Lesson 9 (Guba): Are the effects real and true?
Lesson 2 (Isager): Difficult to identify relevant confounding variables
Lesson 6 (Theory): Induction vs. deduction (inductive search or ONE test?)

31
FROM ASCHWANDEN TO
HARRER ET AL. 2021:
META-ANALYSES

How to improve objectivity?

• Aschwanden: one dataset, many analyses
• Meta-analysis: many datasets, one analysis

https://www.stevestewartwilliams.com/p/exercise-can-be-more-effective-than
HOW TO MAKE SENSE OF A
DIVERSE RESEARCH FIELD?

What is the answer to question X? Go to the literature and check

But 1) there is a lot of research, and 2) (cf. Aschwanden 2015) not all prior research is true?

Need to ”critically appraise bodies of evidence in their entirety” (Harrer et al. 2021, p. 2)
HOW TO DIGEST AN ENTIRE FIELD? (P. 2-3)
AND WHAT IS A META-ANALYSIS
How to summarize the evidence on a given topic? (e.g. whether exercise reduces depression)
• Traditional / narrative reviews: No strict rules, risk of bias – qualitative conclusions
• Systematic reviews: Summarizes evidence based on pre-defined and transparent standards –
qualitative conclusions

• Meta-analysis: Systematic selection and statistical analysis – quantitatively


• Averages effects across studies; e.g., Study 1: 0.3 effect, Study 2: 0 effect, Study 3: 0.5 effect
→ meta-analytic average: 0.27 effect
• An analysis of existing analyses: Prior studies become units of analysis
• Only relevant for quantitative results

• Sounds like an easy solution? Well, still difficult


HOW TO DO A META-ANALYSIS
Define what studies to include
• What kind of exercise (independent variable)
• What kind of mood measurements (dependent variable)
• What sample criteria: all populations, all kinds of study designs, all time periods?

Specify how analysis is to be done


• Weighted average of effect sizes (e.g. bigger samples / better studies get more weight; see the sketch below)
• Sensitivity analysis (i.e. does it matter exactly how analysis is done (cf. Aschwanden))

Should multiple meta-analyses on the same topic lead to identical results?

Well, pitfalls exist  next slides
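A minimal sketch of the averaging step itself (hypothetical effect sizes and standard errors, not from any real meta-analysis): a common fixed-effect approach weights each study by the inverse of its variance, so bigger / more precise studies pull the pooled estimate harder.

```python
import numpy as np

# Hypothetical per-study effect sizes with standard errors.
effects = np.array([0.3, 0.0, 0.5])
ses     = np.array([0.08, 0.05, 0.25])   # large study -> small standard error

# Naive unweighted average (the toy example above): (0.3 + 0.0 + 0.5) / 3 ≈ 0.27
print("unweighted average:", round(effects.mean(), 2))

# Inverse-variance (fixed-effect) weighting: precise studies count more.
w = 1 / ses**2
pooled = np.sum(w * effects) / np.sum(w)
pooled_se = np.sqrt(1 / np.sum(w))
print(f"weighted average: {pooled:.2f} (SE {pooled_se:.2f})")
```

Here the large, precise null study dominates, so the weighted average lands near 0.10 rather than 0.27; which weighting scheme to use is itself one of the analysis choices (cf. Aschwanden).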


BRIEF EXPLAINER OF META-ANALYSES

https://www.youtube.com/watch?v=i675gZNe3MY
COMPARING APPLES AND ORANGES? (P. 7) I
COMPARING APPLES AND ORANGES? II

https://datacolada.org/103

But what kind of ”choice architecture interventions” (cf. Kahneman) are we comparing?

Intervention in one study: menu presentation. Another study: serving sizes. Very different interventions / variables (cf. Nelson 2016)

https://datacolada.org/105
COMPARING APPLES AND ORANGES? III

If very consistent results across interventions / variables  less to worry about (?)
GARBAGE IN, GARBAGE OUT

Harrer et al. 2021, p. 8

“When we average valid with invalid studies, the resulting average is invalid” https://datacolada.org/104
FILE DRAWER PROBLEM / PUBLICATION BIAS (P. 8)
“The file drawer problem refers to the issue that not all relevant research findings are published, and are therefore missing in our meta-analysis.”

“There is good reason to believe that studies with negative or “disappointing” results are systematically underrepresented in the published literature and that there is a so called publication bias. The exact nature and extent of this bias can be at best a “known unknown” in meta-analyses.” (Harrer et al. 2021, p. 8)

One can check how symmetric published results are: more results just barely significant than just barely non-significant?

Scenario A: Researcher 1 does a study on X’s impact on Y, finds no (interesting) result. Doesn’t manage to get the study published.
Scenario B: Researcher 2 does a study on X’s impact on Y, finds an (interesting) result. Manages to get the study published.
Scenario C….
Result: Published studies are biased

Links to Kuhn’s (constraining) paradigms
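A minimal simulation of the file drawer mechanism (invented numbers, a sketch only): if only significant results get published, then the published literature, and any meta-analysis built on it, overstates the true effect.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

true_effect = 0.1    # small true effect of X on Y (hypothetical)
n = 50               # participants per group in each study

all_estimates, published = [], []
for _ in range(2000):                       # 2000 teams each run the same study
    control = rng.normal(0.0, 1.0, n)
    treated = rng.normal(true_effect, 1.0, n)
    estimate = treated.mean() - control.mean()
    p = stats.ttest_ind(treated, control).pvalue
    all_estimates.append(estimate)
    if p < 0.05:                            # the file drawer: null results stay unpublished
        published.append(estimate)

print("average effect, ALL studies:      ", round(np.mean(all_estimates), 2))  # close to 0.10
print("average effect, PUBLISHED studies:", round(np.mean(published), 2))      # inflated
print("share of studies published:       ", len(published) / 2000)
```

Only the studies that happened to draw a large estimate clear the significance bar, so averaging the published ones gives an effect several times the true 0.1; this is the asymmetry that funnel-plot checks try to detect.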


RESEARCHER BIAS?
“There is evidence that, given one and the same data set, even experienced analysts with the best intentions can come to drastically varying conclusions (Silberzahn et al. 2018 [the red card study]).” (Harrer et al. 2021, p. 9)

”One way to reduce the researcher agenda problem is pre-registration, and publishing a detailed analysis plan before beginning with the data collection for a meta-analysis” (Harrer et al. 2021, p. 9)
WHAT IS THE BEST KIND OF STUDY / GOLD
STANDARD?
What to do, if searching for scientific
answer to a question?
• Search literature, take research
designs into consideration
• Preferably find a systematic review
• Ideally a meta-analysis
• Still not bullet-proof (cf. pitfalls)
• Different meta-analyses on same
topic can provide different results

Level 4: Anecdotes, opinions

A post-positivistic approach, focused on quantitative challenges

Modified from: https://www.researchgate.net/publication/336191285/figure/fig3/AS:843719670505473@1578169853483/ASPAN-hierarchy-of-evidence-ASPAN-American-Society-of-PeriAnesthesia-Nurses-O-Laura.jpg
SUMMARY LESSON 11
• Science is complicated and produced by biased researchers that sometimes cheat
• Not all science results replicate  non-trustworthy science? Can we make anything up?

• Aschwanden: How one dataset can be analyzed in many ways  if most agree, probably inching towards truth
• From individual subjectivity to collective intersubjectivity / ‘objectivity’
• While multiple choices might be right, many will be wrong
• Trust many studies, not one – yet, how to interpret many studies?

• Harrer: From narrative reviews to quantitative meta-analyses  if consistent results, probably uncovered something that is indicative?
• Risk of comparing apples and oranges
• Publication bias
• Garbage in, garbage out
• Researcher bias not eliminated
PHILOSOPHY OF SCIENCE

LESSON 12
SCIENTIFIC INTEGRITY
Carsten Bergenholtz (cabe@mgmt.au.dk)
Department of Management, Aarhus BSS
Aarhus University
META-ANALYSIS VS. INDIVIDUAL STUDIES:
REDUCING RISK OF SELECTION/CONFORMITY BIAS

https://twitter.com/NBTiller/status/1765376104125038940/photo/1

2
OVERVIEW: REST OF THE COURSE

‘Lesson 16’: Wednesday the 20th of March I am available on-site + Zoom meetings.
First hour of lesson in Fuglesangssalen, second hour of lesson on Zoom:
https://aarhusuniversity.zoom.us/my/bergenholtz (see info on Brightspace)

3
CONTINUOUS Q&A: VEVOX.APP 143-136-950

Will address questions in lesson 15 + provide answers on Brightspace

4
LESSON 12: AGENDA

iv) Flowlock+

Scientific misconduct = Videnskabelig uredelighed

5
The following introduction to scientific misconduct is also available on video here:
https://youtu.be/uIRFJlp0V2o

(slightly different curriculum, same overall points)

6
WHY IS IT – PARTICULARLY - IMPORTANT THAT SCIENCE
HAS HIGH INTEGRITY?
Science
Good science Bad science

• Cumulative science: We always build on prior work


• Depend on prior findings: We don’t re-run each study
• Data: We often don’t even have access to the data – thus can’t check it
• Ethics: Thus need to assume honesty/integrity of science/scientists

7
INFAMOUS EXAMPLE: SCIENTIFIC MISCONDUCT’S
SERIOUS CONSEQUENCES
Vaccine and autism – a correlation? No!!
• Wakefield 1998 Lancet study showing correlation – later retracted
• Personal economic interests + fabricated data + distorted message + important impact on
science = scientific misconduct
http://tallguywrites.livejournal.com/148012.html

8
https://www.wsj.com/articles/SB10001424127887323300004578555453881252798
GOOD SCIENCE VS BAD SCIENCE VS SCIENTIFIC
MISCONDUCT/DISHONESTY I
Bad science
Misconduct=videnskabelig uredelighed

• Bad science vs. actual misconduct/dishonesty? (≈ no integrity)

• Kuhn: Paradigms and norms develop  research freedom essential. What seems like
misconduct to some, might be ok to others?

Scientific freedom refers in particular to the freedom to choose research topic, the
freedom to ask questions, the freedom to choose materials and methods to find the
answers, and freedom to publicly present hypotheses, results and reasoning
https://forskerportalen.dk/en/freedom-of-research/

• All agree: Deliberate dishonesty ≠ ok. Yet, is all misconduct deliberate?

9
BAD SCIENCE VS. SCIENCE MISCONDUCT/DISHONESTY II

• Science can be bad for many reasons


• Incompetent analysis, dataset with errors, wrong research design, unfounded conclusions, cheating etc.
• Scientific misconduct is about methods and norms
• Not whether the conclusions are ‘true’ or not

Official Danish guidelines for how to make rulings on scientific dishonesty:

”Section 3: The Committees shall not be entitled to consider cases involving the validity or truth of scientific theories or cases involving the research quality of a scientific product.”

All agree: Deliberate dishonesty ≠ ok. Yet, is all misconduct deliberate?


10
OFFICIAL DANISH RULES FOR SCIENTIFIC MISCONDUCT
DANISH COMMITTEE ON RESEARCH MISCONDUCT 2020

Intention

We often change/omit data – it has to mislead to be misconduct

Misleading

Plagiarism

• Mistakes and misunderstandings ≠ scientific misconduct
• Willful or grossly negligent (that is, ”should have known”, cf. traffic rules)
11
RESEARCH PROCESS: WHERE CAN SCIENTIFIC MISCONDUCT
EMERGE

12
EXAMPLE OF SCIENTIFIC MISCONDUCT I: PENKOWA CASE

• Scientific misconduct
• Fabricated data
• Cheated with funding sources

• Reported to the police for document forgery – acquitted solely because the case was too old

https://da.wikipedia.org/wiki/Milena_Penkowa

Official letter from Penkowa:
“I wanted to hand in my response as soon as possible, but will hand in today the 28th of May 2003. I haven’t been able to do this earlier due to an unfortunate incident in my family. Wednesday the 14th of May my mother and sister died in a car in Belgium, which is why I have had to spend time in the hospital and with the funeral…”

13
DOES THIS EXAMPLE CONSTITUTE SCIENTIFIC MISCONDUCT?
(DISCUSS 1-2 MIN)

https://x.com/literalbanana/status/1744406652680622205?s=20

14
https://slate.com/technology/2018/02/how-brian-wansink-forgot-the-difference-between-science-and-marketing.html

WANSINK’S FOOD RESEARCH – PROBABLY MISCONDUCT/DISHONEST

https://www.vox.com/science-and-health/2018/9/19/17879102/brian-wansink-cornell-food-brand-lab-retractions-jama

http://news.cornell.edu/media-relations/advisories/cornell-university-statement-regarding-questions-about-professor-brian
15
A RESULT BASED ON SCIENTIFIC MISCONDUCT CAN STILL BE TRUE
Scientific misconduct is about the process, not the result

https://x.com/mbialek82/status/1720472868641735148?s=20

16
WHAT ONE SHOULD KNOW (TRAFFIC RULES VS. ALL
STATISTICAL METHODS)

17
WHAT ARE THE POTENTIAL CONSEQUENCES OF
MISCONDUCT?

• Scientific Community: Reduced trust in science / researcher loses status


• Journals: Paper retractions.
• Workplace: Disciplinary actions, possibly firing.
• Legal Aspect: Varies by country; court involvement possible (e.g. in the USA).
NOTICE: VARIANCE ACROSS COUNTRIES
DIFFERENCE BETWEEN DK AND USA DEFINITIONS
Denmark

USA

19
NOTICE ANYTHING
PARTICULAR ON THIS LIST?!

20
https://retractionwatch.com/the-retraction-watch-leaderboard/
GINO: AN ILLUSTRATIVE EXAMPLE
THE DIFFICULTY OF ASSESSING MISCONDUCT

21
DATA COLADA 2023 I
Investigation of study 1 (of a paper ON HONESTY!) with 4 studies, where study 3 had already been shown to be fabricated

(Data Colada 2023, p. 3)

For context: the study's treatment & topic

22
DATA COLADA 2023 II
THE EXCEL FILE: MANUALLY ADDED DATA
• Some data entries are out of sequence / duplicates
• All those entries strongly (!) support the hypothesis
• Excluding those entries = no significant result
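To make the check concrete, here is a minimal sketch (in Python) of the kind of forensic test Data Colada describe: flag rows that break the expected ID ordering or duplicate an earlier ID, then re-run the analysis without them. The column names and data are invented for illustration; this is not their actual code or dataset.

    # Toy forensic check: rows were expected in ascending ID order, so
    # breaks in the ordering and duplicated IDs are worth inspecting.
    import pandas as pd

    df = pd.DataFrame({
        "participant_id": [101, 102, 103, 104, 250, 105, 103],  # invented
        "claimed_miles":  [120, 130, 110, 125, 480, 140, 115],  # invented
    })

    breaks_order = df["participant_id"] < df["participant_id"].shift()
    duplicated = df["participant_id"].duplicated()
    suspicious = breaks_order | duplicated

    print(df[suspicious])  # rows around an ordering break need manual review
    # Re-run the key analysis with and without the flagged rows:
    print(df["claimed_miles"].mean(), df.loc[~suspicious, "claimed_miles"].mean())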

23
DATA COLADA III
COULD THIS BE EXTREME RANDOMNESS?

• Clearly not random.
• Someone adjusted the data = cheated
• But who?

• Legally, a difficult burden of proof.

24
GINO’S DEFENSE

US vs. DK definition of misconduct


https://storage.courtlistener.com/recap/gov.uscourts.mad.259933/gov.uscourts.mad.259933.1.0.pdf

https://x.com/justsomeoneDK/status/1687064352522657793?s=20

25
STATUS OF THE CASE
• Gino is currently on academic leave from Harvard University
• All papers where she has collected data are being investigated
• Gino has sued Data Colada for 25 million dollars (Data Colada's legal defence has already been crowdfunded)

• Someone fabricated data & thus cheated. Who? Gino? A research assistant?
→ incredibly difficult to prove gross negligence / dishonest intentions

• So: How to prevent such cases?

26
HOW TO REDUCE SCIENTIFIC MISCONDUCT?
SOME SUGGESTIONS

Pre-registering hypotheses & data analysis plan

Data transparency

https://www.aeaweb.org/journals/data/data-code-policy

Implications for (student) academic projects: Be transparent


• Promotes honesty & increases credibility
• Allows replicability
• Good practice for future work
27
PLAGIARISM (CARTER 2000): SUBCATEGORY OF
SCIENTIFIC MISCONDUCT

“Plagiarism is using someone else's material as if it were your own (books, articles, films, web pages,
tables, figures, etc.), without referring explicitly to the original source. It is plagiarism when you
knowingly cheat – if, for example, you borrow someone else's work and submit it as your own - but it is
also plagiarism if your references to sources are imprecise and incomplete.”
https://library.au.dk/en/students/plagiarism

Based on academic norms + law

28
PLAGIARISM IN MUSIC I

https://www.youtube.com/watch?v=R5sO9dhPK8g 29
PLAGIARISM IN MUSIC II
NOT THE SAME, BUT STILL TOO SIMILAR

https://www.youtube.com/watch?v=wUgNJ7Qdst8 30
PLAGIARISM IS DISTORTING THE MESSAGE

“Plagiarism involves using someone else’s words or ideas without giving proper credit…to
the author of the original…’the false assumption of authorship’” (Carter 2000).

“Using texts, ideas etc. produced or conceived by others does not in itself constitute
plagiarism. Plagiarism only arises when the examiner(s) are led to believe that you are the
author or originator.” https://studerende.au.dk/fileadmin/studerende.au.dk/studerende/Avoid_cheating_on_exams.pdf

But students might ask: Haven't others written it better?

• We want an assessment of you, not of others
• If written independently – an indicator of having thought it through independently

31
2 FORMS OF PLAGIARISM

1) Direct copy: Using (segments of) another text without marking it as a quote
2) Very close resemblance / paraphrasing: "it can also involve paraphrasing someone else's argument as your own"

What you should do:
• Quote if direct citation (author, year, page number)
• Make a reference if the idea for your thought / the information comes from somewhere else (author, year)

See example next slide →

32
EXAMPLE OF REFERENCING PROPERLY

Don't have to provide reference for every single technical term (confounding variable, causality etc.)

Question 4 answer, 2020 exam (Brightspace)

You could reference this exam by e.g. writing (Philosophy of Science Brightspace: 2020 exam question 4 answer, p. 9)

33
EXAMPLE FROM CARTER 2000 I (VEVOX.APP 143-136-950, 2 MIN)
Original (Trachtenberg):
"The major events in Indian-white relations in these years were military and legal: more than two hundred pitched battles, not to speak of guerilla warfare in outlying Western regions, and a reservation policy promulgated in 1887, to remain in effect until the 1930's. In each case, military action and legal solution, economic and cultural issues figured as unspoken but vital imperatives. The drama of these years was played largely on the Great Plains, but against a scenery put in place by the forced removal or expulsion of tribes East of the Mississippi in the decades before the Civil War."

Rewritten version (discuss):
"Major events in Native American-settler relations during these years were military and legal, and were shaped in important ways by both the US Army and the Federal legislature. There were more than two hundred pitched battles as well as guerilla warfare in remote Western areas, and a reservation policy which remained in effect from 1887 until the 1930's. In the case of both legal solution and military action, cultural and economic issues were unspoken but vital imperatives. The drama of the period, which took place mostly on the Great Plains, had as its context a scenery put in place by the ante-bellum expulsion or forced removal of tribes living East of the Mississippi River."

34
EXAMPLE FROM CARTER 2000 II (VEVOX.APP 143-136-950, 1-2 MIN)
Original (Trachtenberg):
"The major events in Indian-white relations in these years were military and legal: more than two hundred pitched battles, not to speak of guerilla warfare in outlying Western regions, and a reservation policy promulgated in 1887, to remain in effect until the 1930's. In each case, military action and legal solution, economic and cultural issues figured as unspoken but vital imperatives. The drama of these years was played largely on the Great Plains, but against a scenery put in place by the forced removal or expulsion of tribes East of the Mississippi in the decades before the Civil War."

Rewritten version (with a footnote linking to the original source):
"Alan Trachtenberg notes in The Incorporation of America that the major events in Indian-white relations in these years were military and legal, with over two hundred pitched battles as well as guerilla warfare in outlying Western regions. In each case, military action and legal solution, he adds, economic and cultural issues were latent but crucial imperatives. Though played out largely on the Great Plains, the drama of these years took place 'against a scenery put in place by the forced removal or expulsion of tribes East of the Mississippi in the decades before the Civil War.' 1"

35
EXAMPLE FROM CARTER 2000 III (VEVOX.APP 143-136-950, 1-2 MIN)
Original (Trachtenberg):
"The major events in Indian-white relations in these years were military and legal: more than two hundred pitched battles, not to speak of guerilla warfare in outlying Western regions, and a reservation policy promulgated in 1887, to remain in effect until the 1930's. In each case, military action and legal solution, economic and cultural issues figured as unspoken but vital imperatives. The drama of these years was played largely on the Great Plains, but against a scenery put in place by the forced removal or expulsion of tribes East of the Mississippi in the decades before the Civil War."

Rewritten version (with quotation, in-text reference and a footnote linking to the original source):
"The relationship between Native Americans and settlers in the trans-Mississippi West during the late nineteenth century was marked by military conflict and legal coercion. As Alan Trachtenberg notes, there were "more than two hundred pitched battles, not to speak of guerilla warfare in outlying Western regions, and a reservation policy promulgated in 1887, to remain in effect until the 1930's." (Trachtenberg 1982, p. 29). But if the relationship was played out on the battlefield and in response to legislative action, it was also shaped in significant ways by economic and cultural concerns. 2"

36
(FORMER) HARVARD UNIVERSITY PRESIDENT
(NOW RESIGNED, DUE TO THIS SCANDAL)

https://x.com/aaronsibarium/status/1734340667055419852?s=20 Just 1 of several such examples


37
OH THE IRONY: SELF-PLAGIARISM
Oh, the irony: Paper on “Ethics and Integrity of the Publishing Process” retracted for plagiarism
http://retractionwatch.com/2014/03/24/oh-the-irony-paper-on-ethics-and-integrity-of-the-
publishing-process-retracted-for-duplication/
“For over three years I accepted every opportunity to address management scholars about the
importance of knowing, understanding, and following our codes of ethics. My script for these
talks took shape quickly as I learned which stories and lessons had the greatest impact on
audiences. Over time it grew and evolved, but as is often the case in our teaching, some of the
earliest and most effective narratives stayed put. The AMR editorial in 2009 was an early effort in
delivering this message in print form. The introductory essay in MOR was one of the last. At the
time each was written, they represented my best effort at conveying the most powerful message
possible on the importance of ethical publishing practices. But I goofed. Big time. When
translating the most current form of my ‘script’ into the MOR piece, I did not pull up
the AMR editorial from two years prior to make sure any overlap with it was properly credited to
that earlier piece. Unfortunately, that’s the very definition of self-plagiarism. And perhaps even
more unfortunate is that all of this could have been avoided with the simple insertion of
‘(Schminke, 2009)’ in the appropriate place(s) in MOR so as to guide readers to the original
location of those ideas.”

38
PLAGIARIZE: ”THEY’LL NEVER NOTICE?” - OURIGINAL
• Automatic system to find original sources (a toy sketch of the idea follows at the end of this slide)
• https://medarbejdere.au.dk/en/administration/au-studies-
administration/sagsbehandlingsportal/eksamen/plagiarism-detection

• Serious sanctions
• Don’t want to create unnecessary anxiety!
• Reference your points and ideas, if needed
• Produce independent wording
• Please do get inspired
• No need to provide reference for every term

Remember: Cite group notes! Might be used word-by-word by someone else (applies to notes
found online or shared among few people)
E.g. write: (Study group notes 2024)
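For intuition only: a toy sketch of the core idea behind overlap detection – comparing word n-grams between a submission and a source text. This is an invented illustration of the general technique, not Ouriginal's actual algorithm:

    # Toy word-trigram overlap check (illustration only).
    def trigrams(text):
        words = text.lower().split()
        return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}

    source = "the major events in indian white relations were military and legal"
    student = "major events in indian white relations were largely military and legal"

    overlap = trigrams(source) & trigrams(student)
    score = len(overlap) / max(1, len(trigrams(student)))
    print(f"{score:.0%} of the submission's trigrams also appear in the source")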
SHORT BSS VIDEO (BY ME) ON AVOIDING PLAGIARISM
English: https://youtu.be/BrJ0UnAgZn0
Danish: https://youtu.be/Hx5vBIFezg4

40
2020 SPRING EXAM, QUESTION 6 (DISCUSS 3-4 MIN)
Question 6 (10%)
Aschwanden (2015) describes a study in which 29 teams of researchers analysed the same dataset to assess whether players with a darker skin colour received more red cards. The 29 teams generated different results, although they generally indicated the same tendency; players with a darker skin colour tended to receive more red cards. Please read the following hypothetical descriptions and assess – based on the information provided – whether the descriptions imply scientific misconduct. Explain your assessment for each of the 2 descriptions.

A) One of the teams had a strong, prior assumption that a darker skin colour would not lead to more red cards. They considered
several different models, and were told by multiple colleagues during the process that the model they selected was probably
inferior, although the team disagreed. Just before they submitted their article, they realised they had left out a major chunk of the
data, which meant they had only analysed some of the data. They submitted the article after all, not disclosing their mistake, since
including the missing data would not have changed their results. They ended up arguing for a model that was statistically
insignificant.

B) One of the teams was funded by an international soccer organisation. The research team would only get the funding if they
could show that there was no relation between skin colour and getting a red card. The team did not disclose their funding source,
and ended up picking a model that fitted their expectations the best. In order to achieve the results they had to leave out parts of
the data that did not fit their interpretation. They ended up arguing for a statistically significant model that very clearly showed there
was no relation.
Good exam answer will be uploaded to Brightspace

41
FROM 2020 ‘CORRECTION GUIDE’
All answers should explicitly rely on a relevant conceptual framework from the curriculum. The definition Denmark / the curriculum has worked with is that scientifically dishonest behavior a) is intentional or grossly negligent and b) misleads the scientific message. It is also fine if an answer has referred to Resnik and Stewart (2000), in particular their 'decision tree' figure, which includes a definition of misconduct. It should be clear that misconduct does not equal errors nor disagreement.

In order to pass, the answer should identify that case b) demonstrates intentional, biased behavior and
leaving out the data, misleads the scientific message. We know from Aschwanden 2015, that out of the 29
research teams, none found a significant negative effect, so the researchers in scenario b) found a quite
particular finding.

Case a) could be argued to imply intentional or grossly negligent behavior, but the behavior does not
mislead the scientific message. According to the Danish rules, one should probably not categorize the
behavior as scientific misconduct/dishonest (videnskabeligt uredeligt), even though it is certainly a
questionable scientific practice.

42
SUMMING UP: SCIENTIFIC MISCONDUCT

• Mistakes do not constitute scientific misconduct

• Dishonesty & gross negligence: How to demonstrate the researcher knew / should have known?
• Gino: Dishonesty can be very difficult to prove

• Plagiarism: “Using texts, ideas etc. produced or conceived by others does not in itself
constitute plagiarism. Plagiarism only arises when the examiner(s) are led to believe that
you are the author or originator.” (Carter 2000).
43
GDPR: GENERAL DATA PROTECTION REGULATION

https://commission.europa.eu/law/law-topic/data-protection/data-protection-eu_en

What: A set of (EU) rules that protects your personal data.


• Personal data is any data that can identify you
• Makes sure that organisations and researchers respect data privacy / keep data safe

Why: Mainly developed to protect against misuse of data available online


• Thus mainly commercial companies
• Yet also applies to researchers/academic (student) work – you!

Examples of how GDPR helps resolve ethical challenges


• Data collected without disclosing purpose
• Data collected on people without consent
• Data leaked due to careless data storage
• …

https://studerende.au.dk/en/it-support/information-
security/data-protection-gdpr
44
SOME EXAM INFO I
• 23rd of March: 9-14 (don't hand in a second late!)
• Be familiar with the Flowlock+ system (only PDF
documents, no internet) – more info in 2 slides

• Don’t plagiarize or cheat


• When you quote, provide citations
• Refer to texts to signal that you are on top of them
• References: See former exam answers for how to
reference. I’ll create a full list for you, with all papers.

45
(SOME) GENERAL CRITERIA FOR A GOOD EXAM ANSWER
(I’LL RETURN TO THIS LATER)
• Read the question carefully. And then re-read it.
• Make sure you answer the question (surprisingly many forget/don’t)
• Rely on the curriculum, avoid common sense approach (if you could have written it without relying on the curriculum,
probably not relevant answer)

• Identify what the question is about (if "strength and weaknesses of an experiment" → probably validity concepts)
• Key challenge for poor exams: Too superficial, so be specific and provide reasons (e.g.: ”External validity is
limited because they relied on students”.)
• Be specific and detailed: a) link directly to the question, b) show which part of the curriculum is relevant
for answering a given question (don’t just repeat general terminology – show how is it relevant for THIS situation / question?)
• Signal you have read the curriculum, by quoting / referencing curriculum when possible (maybe even quote a
snippet not shown on a slide)

• Good answers provide multiple reasons when possible


• The very good answers are able to make links across different texts explicit

46
REMINDER: FLOWLOCK+ SETUP
NOTES: PDF + OWN COMPUTER

Why Flowlock+? Only way to ensure you can access notes / documents

47
ACCESSING WISEFLOW
NEW UPDATE -> LOG IN AGAIN
• AU implemented new Wiseflow system March 1: https://europe.wiseflow.net/login
• Same link: Therefore, log out of current/old Wiseflow (need to use different browser/delete history?)

48
TRY OUT FLOWLOCK+ SYSTEM I
ACCESSING THE WISEFLOW SYSTEM (SEE ALSO BRIGHTSPACE FOR THE
FOLLOWING INFORMATION)

49
TRY OUT FLOWLOCK+ SYSTEM II
OVERVIEW OF FUNCTIONS

50
TRY OUT FLOWLOCK+ SYSTEM III
ACCESSING PDF DOCUMENTS

51
COPYING FROM PDF DOCUMENTS

52
TRY OUT FLOWLOCK+ SYSTEM IV
WHITE-LISTED WEB PAGES (DICTIONARIES)

53
PHILOSOPHY OF SCIENCE

LESSON 13
COMPLEXITY IN SOCIAL SCIENCE
Carsten Bergenholtz (cabe@mgmt.au.dk)
Department of Management, Aarhus BSS
Aarhus University
FULL EXAM ANSWERS UPLOADED

2
SUMMING UP SCIENTIFIC INTEGRITY
Scientific misconduct?
• Assess intention / gross negligence
• And if fabrication (undisclosed), falsification (misleading research) or plagiarism (without rightful crediting) has been committed by a researcher.

3
LESSON 13: AGENDA

iv) Exam & Flowlock+

4
PUZZLE: CURRENT ECONOMIC PARADIGMS HAVE
LIMITED EXPLANATORY / PREDICTIVE POWER – WHY?
Many economic phenomena and outcomes can't be predicted, or fully explained – why?

For example, the Big 5 personality traits explain only about 20% of job performance. Why not more?

Because
• The unexplained part is random/noise?
• We simply have not discovered the true relation yet?
• Or do we need an entirely new approach?

5
WATTS 2007: P 489
SOCIAL SCIENCE IS REALLY
DIFFICULT/HARD

Why hard science/problems?

6
ASCHWANDEN 2015 ANSWER: SOCIAL SCIENCE IS
COMPLICATED / HARD
Complicated due to many variables / links between them

"The variables in the data sets you used to test your hypothesis had 1,800 possible combinations. Of these, 1,078 yielded a publishable p-value, but that doesn't mean they showed that which party was in office had a strong effect on the economy. Most of them didn't." (Aschwanden 2015, https://fivethirtyeight.com/features/science-isnt-broken/#part1)
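How quickly do analytic choices multiply into 1,800 combinations? A hedged toy enumeration in Python (the specific choices below are invented for illustration, not Aschwanden's actual variables):

    # A handful of defensible analytic decisions multiply into hundreds of models.
    from itertools import product

    choices = {
        "party_measure":      ["president", "congress", "both"],
        "economic_outcome":   ["gdp", "employment", "inflation", "stock_market"],
        "lag_in_years":       [0, 1, 2],
        "control_recessions": [True, False],
        "exclude_war_years":  [True, False],
        "time_window":        ["full", "postwar", "post-1980"],
    }

    specs = list(product(*choices.values()))
    print(len(specs))  # 3 * 4 * 3 * 2 * 2 * 3 = 432 specifications already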
WATTS 2007 ANSWER
(P. 489)

Social science is hard because of interactions between elements → complexity
The following introduction of complexity also available
in this 2021 video: https://youtu.be/O1kKCTI-9i8
(slightly different curriculum, same overall points)

9
TERMINOLOGY: COMPLICATED VS COMPLEX I

Complexity in common language
• Something really difficult

Complexity in science
• Not just difficult
• More particular meaning

https://thumbs.dreamstime.com/b/complicated-flowchart-cartoon-two-arms-holding-tangled-flow-chart-paper-77231017.jpg

10
THREE CHARACTERISTICS OF A COMPLEX SYSTEM
SULLIVAN 2011
• Heterogeneous agents: Decisions evolve and adapt over time
• Interaction: Agents are interdependent
• Emergence: The whole is greater than the sum of its parts
→ not completely predictable, not complete chaos

Aim of lecture: Explaining these characteristics and their implications

11
COMPARISON: COMPLICATED VS COMPLEX SYSTEMS I

• Independence vs interdependence

• Complex systems are interdependent: Elements interact and adapt → feedback loops → change system

• Independent elements: Behaviour is not adaptive

Complicated systems can be completely predicted ≠ complex systems

12
COMPARISON: COMPLICATED VS COMPLEX
SYSTEMS II
Complicated system: We can analyze all parts of a system = predict result

Complex system: We can’t just look at the parts, we need to understand interactions and changes of
behavior ≈ understand system dynamics

Complex system: cake ingredients + interactions + system environment
Complicated system: cake ingredients

13
WEATHER AS A COMPLEX SYSTEM
BETWEEN CHAOS AND UNPREDICTABILITY

https://www.weather-atlas.com/en/denmark/copenhagen-
weather-december

https://www.youtube.com/watch?v=M7Hf6VfsJ0U

14
COMPLEX SYSTEMS: ECONOMIC EXAMPLES

Market Economies:
• Heterogeneous agents: Consumers and producers with diverse and adaptive preferences and
production techniques.
• Interaction: Market transactions, price mechanisms and social influence
• Emergence: Economic dynamics that are at times unpredictable

Innovation Ecosystems:
• Heterogeneous agents: Startups, investors, universities, and R&D departments.
• Interaction: Collaboration, competition, and knowledge transfer.
• Emergence: Technological advancements and industry disruptions beyond single entities

15
IF STUDYING INCENTIVES AND PERFORMANCE AS A
COMPLICATED SYSTEM

• Starting point: Other factors probably also impact incentives & performance
• Yet, if the system is equal to the sum of its parts, we can
• Investigate sub components of an economic system independently
• Investigate incentives and performance
• E.g. investigate leadership style and performance
• …
• And then combine all results into overall insight into how organizations work.

• Complexity perspective challenges this approach: Interactions between interdependent factors matter
COMPLEXITY: NOT ONLY IN HUMAN, ECONOMIC SYSTEMS

Humans, animals, cells, climate systems

All can consist of heterogeneous (adaptive) agents that interact (are interdependent) and where the outcome is more than the sum of its parts (and somewhat unpredictable)

Let's assume a really, really simple system →

17
EXAMPLE I: SIMULATION OF ANTS THAT FOLLOW
SIMPLE RULES
We can't look at each kind of ant individually, and then simply combine their behavior & then explain the system

Description of simulation: None of the ants have a plan or an idea about getting food! → the overall system behavior seems coordinated, and solves a very difficult problem (effectively finding all food)

http://netlogoweb.org/launch#http://netlogoweb.org/assets/modelslib/Sample%20Models/Biology/Ants.nlogo

Description from the program


In this project, a colony of ants forages for food. Each ant follows a set of simple rules, but the colony as a whole acts in a sophisticated way.
When an ant finds a piece of food, it carries the food back to the nest, dropping a chemical as it moves. When other ants “sniff” the chemical, they
follow the chemical toward the food. As more ants carry food to the nest, they reinforce the chemical trail.
18
THE WHOLE SYSTEM IS MORE THAN ITS PARTS
• Not just a complicated system

• Complex system due to interconnectedness and feedback – behavior 'adapts'/changes
• If some individual ants are removed – still works
• No centralized 'leader' in a complex system

• The whole system is more than its parts – can't be predicted even if one knows its parts (the different kinds of ants)

Sullivan 2011, p. 2
19
HOW ABOUT AN EVEN SIMPLER SYSTEM?
EXAMPLE II: GAME OF LIFE I
• Very structured, complex behavior emerges from
very (!!) simple rules, in a system of black and white
dots that are interconnected in a feedback loop

• All examples follow the same type of simple rules, just different starting formations

https://www.youtube.com/watch?v=C2vgICfQawE
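The rules really are that simple. A minimal sketch in Python (assuming a fixed 10x10 grid with no wrap-around; the versions in the video differ in such details):

    # Game of Life: two local rules, yet coordinated patterns emerge.
    def step(alive, width=10, height=10):
        def live_neighbours(x, y):
            return sum((nx, ny) in alive
                       for nx in (x - 1, x, x + 1)
                       for ny in (y - 1, y, y + 1)
                       if (nx, ny) != (x, y))
        new = set()
        for x in range(width):
            for y in range(height):
                n = live_neighbours(x, y)
                # Live cell survives with 2-3 neighbours; dead cell is born with 3.
                if ((x, y) in alive and n in (2, 3)) or ((x, y) not in alive and n == 3):
                    new.add((x, y))
        return new

    alive = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}  # a "glider"
    for _ in range(4):
        alive = step(alive)
    print(sorted(alive))  # after 4 steps the glider has moved one cell diagonally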
EXAMPLE II: GAME OF LIFE II

https://www.youtube.com/watch?v=C2vgICfQawE
21
BUT WHAT ABOUT COMPLEXITY IN HUMAN SYSTEMS?

To sum up
• Complexity is more than just ”very complicated”:
• Interdependence (interactions) → feedback loops → adaptive/changing behavior
• Complex, sophisticated systems not coordinated by centralized leader
• Emerging behavior and system: The whole is more than the sum of its parts
• But, how about human, economic systems?

First one more animal system! (then humans)
WOLVES AND YELLOWSTONE PARK I

https://www.youtube.com/watch?v=ysa5OBhXz-Q
WOLVES AND YELLOWSTONE PARK II (DISCUSS 2 MIN)

Why watch a video about wolves and Yellowstone, in a class about people and economic systems?
SULLIVAN 2011: P. 3

Cascading events – or, a kind of chain-reaction, in a complex maze system

25
TRADITIONAL STATISTICAL ANALYSIS
(E.G. REGRESSION) VS. COMPLEXITY
Why would a traditional statistical (e.g. linear
regression) analysis not work well?

Linear regression assumes


• Independent observations
• Static/fixed variables
• Linear relations between variables
• And produces average results

If components are interdependent, and adapt/learn behavior after interaction, and small changes can have big impacts → linear regression doesn't quite work

In a regression, the average effect is independent of the starting point. All parks should be equally influenced by wolves.
Complexity: Starting point matters – dynamics could be different in a different park
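A made-up numerical illustration of this point: two "parks" respond to the same intervention with opposite dynamics, and a pooled linear regression reports an average effect near zero, hiding that the starting point mattered (all numbers invented):

    # Same treatment, opposite effects in two contexts; pooled OLS averages them away.
    import numpy as np

    rng = np.random.default_rng(0)
    treatment = rng.integers(0, 2, size=200)   # e.g. wolves reintroduced: 0/1
    park = np.repeat([0, 1], 100)              # two different starting contexts
    effect = np.where(park == 0, 2.0, -2.0)    # +2 in one park, -2 in the other
    outcome = effect * treatment + rng.normal(0, 1, size=200)

    X = np.column_stack([np.ones(200), treatment])
    beta = np.linalg.lstsq(X, outcome, rcond=None)[0]
    print(round(beta[1], 2))  # close to 0: the average hides both dynamics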

26
A LEGO BRICK METAPHOR

https://brickarchitect.com/2018/overcoming-builders-block/

Linear regression approach: Predicting a Lego building based on bricks, colour etc.
– a Lego brick is a Lego brick, stays the same → limits what you can build

What if bricks could adapt: change shape and behavior based on interacting with other bricks?
Dynamics would be complex, much more difficult to predict (whole more than sum of its parts)

27
PEOPLE AND ECONOMIC SYSTEMS?

Are individuals and organizations like independent, unchanging Lego bricks (just all rational, homo economicus)?

Or do we adapt/change behavior based on feedback from others/the system, thus create new things, adapt ≈ a changing Yellowstone world?

28
EXAMPLE: WATTS, MUSIC AND LINEAR
REGRESSIONS I
• If we want to predict and explain how well songs will do on a hit list.
• We just let people listen to the songs and let them rate/buy them → result?
• Well… what about interdependence and the feedback listeners get from others?

https://en.wikipedia.org/wiki/Billboard_Year-End_Hot_100_singles_of_2022
29
EXAMPLE: WATTS, MUSIC AND LINEAR REGRESSIONS II

A simple linear regression wouldn't capture these dynamic, non-linear (feedback) effects / interdependence

http://www.nytimes.com/2012/08/05/business/of-luck-and-success-economic-view.html
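A toy version of the feedback in Watts's music-market studies (assuming equal-quality songs and a simple copy-the-crowd rule; the numbers are invented for illustration):

    # Toy cumulative-advantage market: identical songs, social feedback only.
    import random

    def run_market(n_songs=10, n_listeners=5000, seed=None):
        random.seed(seed)
        downloads = [1] * n_songs  # every song starts out equal
        for _ in range(n_listeners):
            # Listeners pick songs with probability proportional to current
            # download counts: early random luck compounds via feedback.
            pick = random.choices(range(n_songs), weights=downloads)[0]
            downloads[pick] += 1
        return downloads

    for world in range(3):
        print(sorted(run_market(seed=world), reverse=True))
    # Same songs, same rule - yet each re-run "world" crowns a different hit.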
WATTS 2007 ON COMPLEXITY IN SOCIAL SCIENCE
KEY QUOTE!
“It is hard to understand, for example, why even a single organization behaves the way it
does without considering
a) the individuals who work in it
b) the other organizations with which it competes, cooperates and compares itself to
c) the institutional and regulatory structure within which it operates
d) the interactions between all these components.” (p. 489)

Watts 2007: ”…interactions have consequences for the choices of others" (p. 489)

Usually one studies one level at a time – but what happens when different components interact, as in Yellowstone?

31
EXAMPLE: INCENTIVES AND PERFORMANCE
• Holst-Hansen & Bergenholtz (cf. Bergenholtz 2024b) studied how isolated individuals
responded to a single variable (incentives, 1 vs. 5 DKK) change
• What about interactions with other individuals?
• What about different management styles, organizational cultures?
• What about effects from competitors, who might implement similar system – will
the labour supply change?
• What about national cultures, financial crises etc.?

• Watts argues we should worry about multiple components interacting

The effect might be real in isolation – but only in isolation?

Watts 2007: Challenges the existing paradigm of how to study business (what data to collect, methods, ontology etc.)
(what data to collect, methods, ontology etc.)
INTERDEPENDENT FACTORS IN MANAGEMENT
1) Performance incentives → better performance

But what about?
2) Skill training
3) Leadership coaching

• How to run an RCT?
• Maybe incentives work differently at different skill / leadership levels
• Other factors' influence not necessarily additive (more not always better)
• Therefore, if interdependence, we need multiple RCTs: 1+2, 1+2+3, 2+3, 1+3 etc. (see the sketch after the list below)

Other factors that might influence impact of performance incentives?


• i) Basic Salary ii) Performance Bonuses iii) Flexible Work Schedules iv) Professional Development Opportunities v) Positive Workplace Culture
vi) Access to Health and Wellness Programs vii) Modern and Ergonomic Office Equipment viii) Transparency and Open Communication ix)
Empowering Leadership Style x) Team Building Activities xi) Recognition and Awards Programs xii) Access to Childcare Services xiii) Constructive
Feedback Mechanisms xiv) Technology and Software Upgrades xv) Efficient Internal Processes xvi) Employee Autonomy xvii) Collaborative
Team Structures xviii) Clear Career Advancement Paths xix) Employee Engagement Initiatives xx) Support for Work-Life Balance… (Chatbot
written list)
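To see why interdependence is so costly to test experimentally, a short sketch: crossing just the three factors above in a full factorial design already gives 2^3 = 8 experimental arms, and crossing all twenty factors in the list above would give 2^20 (over a million):

    # Full factorial design over three binary (on/off) factors: 2**3 = 8 arms.
    from itertools import product

    factors = ["incentives", "skill_training", "leadership_coaching"]
    for arm in product([0, 1], repeat=len(factors)):
        print(dict(zip(factors, arm)))

    print(2 ** 20)  # crossing all twenty listed factors: 1,048,576 arms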
33
WAIT, WHY NOT JUST CONTROL FOR
CONFOUNDING VARIABLES?
When controlling for (confounders), we assume
independent variables are fixed (in the analysis)

In complex system:
• Adaptive behavior, variables not fixed
• Interdependent variables
• Sum is more than the parts

• Therefore, can't just control for other factors, since these factors might (individually or in combination) be adaptive (changing)

34
BARRETT-FELDMAN 2021 I
MECHANISTIC VS COMPLEXITY MINDSET
BARRETT-FELDMAN 2021 II
MECHANISTIC VS COMPLEXITY MINDSET

• Maybe effect of incentives depends on: The particular task, if the task length is 10 or 20 min,
recent economic news, weather, room temperature etc. + the combined interaction

• Every single factor maybe basically unimportant – but all interconnected and combined?

• Would results change if other factors change?


LAB-EXPERIMENTS - MECHANISTIC MINDSET
Experiments are best suited to assessing single factors with simple and separable effects. They are
based on the implicit assumption that a treatment simply has an effect or not, and the purpose of
an experiment is just to find out which. They are far less suited to multifactor problems that require
tracking many complex interactions, feedback loops, and highly correlated processes and
outcomes simultaneously. Yet many problems of great importance are like this, such as increasing
life expectancy, educating children, improving social policy, and improving public health…. These
dynamic systems cannot be expected to be influenced by an intervention in a simple yes/no
fashion. Instead, they typically produce unexpected effects and require ongoing longitudinal,
observational analysis. Addressing this type of dynamic process requires a fundamentally different
ontological orientation, wherein nonlinear dynamic systems are the expectation rather than the
exception.

Diener E, Northcott R, Zyphur M, West SG (2022) Beyond Experiments. Perspectives on Psychological Science 17(4):1101-1119.

37
EVALUATION OF THE COURSE

Evaluation link (also available on Brightspace)

https://go.blueja.io/hcsMgsb5QkqSxrDin9ZdIA

38
LINKING BACK TO PRIOR CURRICULUM I
COMPLEXITY AND KUHN’S PARADIGM SHIFTS

• Complexity approach is a competing paradigm to traditional paradigm, since


• Challenges conventional view of economics as complicated
• Important anomalies/phenomena not explained by traditional approach (model crisis)
• Collection of new kinds of data
• New methods and analytic techniques
• Redefining what the economic world / economic system is

39
LINKING BACK TO PRIOR CURRICULUM II
COMPLEXITY AND GUBA: WHICH PARADIGM?
Very difficult question – wouldn’t be exam question

Maybe most closely related to post-positivism


• Acknowledgment of uncertainty
• Reality not directly observable, independent of researchers
• Models, hypotheses produced
• Methodological pluralism (many methods can be applied)

40
LINKING BACK TO PRIOR CURRICULUM III
COMPLEXITY AND CASE STUDIES
Case studies could via in-depth study of unique context capture
• How elements are connected / interdependent
• Feedback loops
• Adaptive, changing behavior

For (hypothetical) example: How a change in a performance management system → changed team dynamics → more competitive behavior → change in communication and trust → reduced employee engagement

Longitudinal study: Might identify the link between performance management system → reduced employee engagement, but not the actual changes in the system

41
Some of the following points are covered (but with
different examples) in this video
https://www.youtube.com/watch?v=_Z2DVh2ws2I

42
INTERDEPENDENT OR INDEPENDENT SYSTEMS?
HOW TO ANALYSE ECONOMIC PROBLEMS

Understanding system components in isolation vs. understanding interconnectedness of the system → complexity perspective

43
EXAMPLE FROM LESSON 7: BUSINESS ECONOMICS
- SIMPLE CAUSATION FROM A TO B?

https://twitter.com/graykimbrough/status/1108403322690654208?s=11
WATER DOES RUN UPHILL? FEEDBACK LOOPS
CHANGE BEHAVIOR AND ECONOMIC SYSTEM
“Why did so many economists get this wrong? They looked at the micro and ignored the macro.
In the basic model of supply and demand, an increase in the cost of something reduces
demand. This may be true in an isolated laboratory setting. But when minimum wages went up
in the real world, it affected not just the parties to that transaction, but the regional economy.
By considering only the relationship between employer and employee, the dismal-science set
was focusing too narrowly. The critics failed to consider the impact of lower-wage employees
earning more money; these folks typically spends almost everything they earn, which means
that when they’re paid more it goes right back into the local economy.”

https://www.bloomberg.com/view/articles/2016-10-21/doomsayers-keep-getting-it-wrong-on-higher-minimum-wages
FROM IDENTIFYING WHAT A COMPLEX
SYSTEM IS TO HOW TO ANALYZE I
In a regression analysis:
• Collect data on main variables (including potential confounders) and
run regressions (cross-sectional or longitudinal)
• Reductionist approach: If you understand the parts, you have
understood the whole

Complexity perspective:
• Understand how variables interact and influence each other
• Study system’s behavior over time, capturing patterns and changes
• Aim to model system behavior, over time
• Holistic, system approach: Need to understand the parts, and their
interactions, to understand the whole

46
FROM IDENTIFYING WHAT A COMPLEX SYSTEM IS
TO HOW TO ANALYZE II - EXAMPLE
Regression approach
Example: Explaining consumer spending through isolated factors like income levels,
interest rates, and consumer confidence, assuming direct and predictable relationships.
• → Establish direct causal relationships?
• An equation can be written

Complexity approach
Example: Investigating how agents (e.g. individuals and institutions) interact and
influence each others’ adaptive behavior
• Requires tracking all individuals’ consumer spending behavior + how they are
connected + the changes in behavior of interacting
• → overall system level interdependencies and dynamics
• No equation can be written
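A hedged toy contrast of the two approaches (invented dynamics and numbers): instead of one fixed equation, an agent-based sketch lets each agent adapt to its neighbours, so the system's trajectory depends on who interacts with whom:

    # Toy agent-based sketch: consumers adjust spending toward their neighbours.
    import random

    random.seed(1)
    n = 50
    spending = [random.uniform(50, 150) for _ in range(n)]

    for _ in range(20):
        updated = []
        for i in range(n):
            left, right = spending[(i - 1) % n], spending[(i + 1) % n]
            social = 0.3 * ((left + right) / 2 - spending[i])  # imitate neighbours
            shock = random.gauss(0, 2)                         # income noise
            updated.append(spending[i] + social + shock)
        spending = updated

    print(round(sum(spending) / n, 1))  # the path depends on the interaction structure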

47
2021 SPRING EXAM QUESTION (DISCUSS 2-3 MIN)

Question 4 (17%)
Complexity researchers like Duncan Watts (the author of Watts 2007) do carry out experiments and
find great value in this approach. However, what challenges might the complexity perspective
identify when doing experiments - in particular lab-experiments? When referring to the complexity
perspective you can rely on the curriculum or the Covid-19 digital exercise we engaged in, in the
live part of lesson 12. Please explain your answer in detail.

Good answer will be uploaded to Brightspace

48
SUMMING UP COMPLEXITY

49
A FEW COMMENTS ABOUT THE EXAM
Suggestion for preparation
• Read curriculum + reflect on exercises/questions in class + study previous exams
• You’ll be too busy to read the curriculum in the exam situation
• Practice Research Design challenges + look at former exam questions and consider how you would
approach them. This year’s question will be similar in style
• This course doesn't call for a template approach, as some other courses do.
• Not a course where one (only) learns a model or table or technique: critical thinking and reflection on
what science and truth is about is needed

To illustrate, extract of 2024 exam questions (there are 5 questions in total):

• Please identify the weaknesses of each of the three studies and explain […]
• Four statements are listed below. Has XXX written all these? Please be specific in your explanation of why you think
some (if any) of the sentences might not match the content provided in XXX.
• As discussed in the lectures, XXXs interpretation of YYY arguably shares common ground with ZZZs philosophy of
science. Please explain in detail how ZZZs philosophy of science is similar to and differs from XXXs presentation of
YYY.
FLOWLOCK+ (= W0+) EXAM

Aids: Physical books, physical notes and articles. PDF files on computer and
access to dictionary.

• Test-flow has been open this week. Hope you all have ensured you can use Flowlock+ and
can access PDF notes / documents.

• See info about access in lesson 12 slides / Brightspace folder (Flowlock+)

• Notice, you might have to log out & in at the exam, the 23rd of March!
51
PHILOSOPHY OF SCIENCE

LESSON 14
STUDY CAFE 2
Carsten Bergenholtz (cabe@mgmt.au.dk)
Department of Management, Aarhus BSS
Aarhus University
AARHUS UNIVERSITY
SCHOOL OF BUSINESS AND SOCIAL SCIENCES

Agenda

Work with questions that either have been former exam questions or are similar in type

2
AARHUS UNIVERSITY
SCHOOL OF BUSINESS AND SOCIAL SCIENCES

Assessing quality of RCT research design: What to pay attention to (rules of thumb) I


• Is this an actual randomized controlled experiment / are there
issues with the internal validity?
• Are all 3 elements ‘completed’ (randomized, controlled, one trial/treatment)
• If not, is this ‘as good as’ an experiment? (i.e. a quasi-RCT, like the elementary school study)

• How strong is the external validity


• Use the criteria listed in sessions 2-3 (selection, setting, history, Hawthorne)

• How strong is the ecological validity


• E.g.: Solving math problems in a lab-setting (quite weak)
• E.g.: Field experiment (generally strong, since in the field)

• When assessing validity


• Don’t make a binary assessment (good vs. bad). Characterize strengths and weaknesses, be specific,
use criteria, assess the sample, task etc. Maybe compare to examples seen during the course.

Not your personal assessment: Curriculum based assessment of the quality of the RCT
3
AARHUS UNIVERSITY
SCHOOL OF BUSINESS AND SOCIAL SCIENCES

Assessing (quality of) research designs: What to pay attention to, in general (rules of thumb)
• Don’t focus on the content of the study, but research design decisions
• Is it a longitudinal, cross-sectional, RCT, case study design?
• What data is collected? How many times is data collected? Words or
numbers? Focus on one context, or a larger sample (of people/firms)?
• What sample does the study rely upon?
• How did we end up with this sample, i.e. is selection bias an issue? (cf.
WW2 airplane example) Does the conclusion match the sample?
• Might there be important data left out (omitted variable bias / confounders)?
• Are the variables in the study capturing what they set out to measure?
• Is all data collected from a single source, or multiple different ones
(common methods bias, cf. video games examples)?
• Does the overall design, sample and the data match the (causal) conclusion
made?
• Does the evidence and data actually speak to the conclusions and theory?

• Not an exhaustive list! Not clear-cut rules, therefore: Practice!

4
REMEMBER THE OPPORTUNITY TO ASK / VOTE ON
VEVOX (143-136-950) QUESTIONS – FOR LESSON 15

5
PLEASE DISCUSS THE FOLLOWING 3
EXERCISES (30 MIN IN TOTAL)

• Do ask us questions

• Feedback is given afterwards, at the end of the first half

(you don’t have time to write a full answer, but I suggest you think about a) what
criteria you will rely upon when assessing external validity, b) how you would
structure your answer, and c) maybe think about how the external validity
compares to some of the other examples we have heard about in class)

6
A) 2023 SPRING EXAM QUESTION (18-20 MIN)
Question 1 (25%)
A recent paper provided evidence for a relationship between recessions and job satisfaction. The authors argue that during an economic
recession, when the job market is less robust and the economy is suffering, people will perceive their jobs to be more satisfying, as
alternative job prospects appear less attractive. The authors of the paper set out to test this hypothesis through three separate studies.
Please identify the weaknesses of each of the three studies and explain how a given study's weakness can be remedied by the other
studies.
Study 1 involved a survey sent out to adults in different regions of South America. The questions in the survey covered a broad range of
variables, including the respondents' perspective on the current economic situation, their job security, job satisfaction and a range of
demographic variables. At the time of data collection, some participants lived in regions experiencing high growth, while others lived in
regions that experienced a more neutral economic development. The total sample size consisted of 12.300 participants, across all
regions and ages. The statistical analysis found that people perceive their jobs as more satisfying if the economic conditions are poorer,
compared to a high growth environment, thus supporting the overall hypothesis.
Study 2 aimed to replicate the findings of Study 1, relying on surveys of adults living in three regions in the United States. The sample
size was 8.500 participants. The country (and thus all regions) was in the middle of a very serious economic crisis when the data were
collected. The sample size of Study 2 was 8.150 participants, across all regions and ages. Some participants were more exposed to the
crisis than others were, e.g. by working in industries that suffered the most economically. The survey consisted of the same questions as
those in Study 1. This study also found that people that were the most highly exposed to the economic crisis were the ones reporting the
highest job satisfaction. This provides further support for the hypothesis that economic conditions influence people's job satisfaction.
Study 3 aimed for a different approach to answer the research question. The study, which had a sample size of 512 participants, sought
to investigate the mechanism behind the observed relationship between economic conditions and job satisfaction. Participants were
from the United States. The study created a treatment and control condition (and randomised participants into each of them), by
making participants read either a New York Times article that praised the current economy or an article from the New York Times that
highlighted the negative characteristics of the current economy (both were actual articles, printed in the New York Times). Being in the
"bad economy" group (having read the negative article) led to a more negative assessment of the current economy, compared to the
average assessment of the economy by the people in the other group (who had read the article praising the economy). After being
exposed to the treatment (or control), the participants answered questions about their job and their job satisfaction. As predicted and in
line with the former studies, participants in the 'bad economy' condition reported greater job satisfaction.
While previous research has shown that job satisfaction can be influenced by factors within organisations, these studies suggest that
external factors such as economic conditions can also play a role in how people evaluate their jobs. The findings of the three studies
support the hypothesis that people perceive their jobs to be more satisfying during economic downturns, which may have implications
for employers and policymakers seeking to improve job satisfaction levels during times of economic instability.
7
B) 2023 SPRING EXAM QUESTION (12-14 MIN)
Question 2 (15%)
Imagine a large Danish company that offers telemarketing services. The company has 60 different
units across Scandinavia, each with at least 60 employees. It has for years been fairly profitable but
also realised an increasing level of competition. The top management team is worried about its future
earnings and decides to implement a new system that integrates a chatbot (a computer program that
uses advanced technology to imitate human conversations) in the sales process. The idea is that
telemarketing employees would be able to get inspiration from the chatbot on how to optimise their
sales. 30 of the 60 unit locations have been selected to try out this new feature. The selection was
partially related to logistical challenges and the availability of IT expertise. The team that
implemented the chatbot visited each of the 30 different units during a span of 15 weeks. Subsequent
analysis of sales performance shows that the implementation of the chatbot, on average, led to a
subsequent and immediate increase in performance, compared to the units that have not (yet)
implemented the chatbots. The analysis controlled for prior performance, size of the unit and the
experience of the manager in charge of each of the units.
You have now received some limited information on the above outlined research design. Based on
the provided information one can infer that it is not a randomised controlled trial. What information
would you have liked to know about this case in order to identify if the setup constitutes a quasi-
experiment or not? Please be specific when explaining what information you need.

8
9
PLEASE DISCUSS THE FOLLOWING 3
EXERCISES (30 MIN IN TOTAL)

• Do ask us questions

• Feedback is given afterwards, at the end

(you don't have time to write a full answer, but I suggest you think about a) what texts and perspectives you would rely upon, b) what the main points of these texts are, c) how you would structure an answer, and d) remember the question asks for counterargumentS – plural.

10
C) 2023 SPRING EXAM QUESTION (12-14 MIN)

Question 3 (20%)
In Guba (1990), the positions post-positivism and constructivism are outlined. Please
identify both similarities and differences across these two positions and explain these
similarities and differences in detail. Your explanation should use at least two different
quotes from Guba.

11
D) 2021 SPRING EXAM QUESTION (10-12 MIN)

Question 4 (15%)
Please rely on the curriculum and course material to explain which features of complexity are not covered in the following brief presentation of the concept. Provide at least one quote from the curriculum to support your argument.
In the context of economics or business administration, complexity refers to the degree of
difficulty involved in understanding and managing the dynamics of a particular system or
organisation. A complex system or organisation is characterised by a large number of
interconnected elements or variables that relate to each other in often complicated ways.
Managing complexity in business administration involves analysing and understanding the
various elements and their interconnectedness, identifying potential risks and
opportunities, and developing strategies to optimise performance and mitigate negative
impacts. This requires a multidisciplinary approach. Overall, understanding and managing
complexity is a crucial skill for businesses and organisations that operate in dynamic and
uncertain environments, and it requires a deep understanding of the underlying
mechanisms and dynamics of the system.

12
E) ASSESSING RESEARCH DESIGNS: WHAT IS
THE MAIN CHALLENGE HERE? (5 MIN)
As stated in the text below, the aim of the study was to figure out if running increases the risk of hip and
knee problems. Please look at the information below and based on our curriculum identify the primary
obstacle to trusting the study's findings.

Dr. Hartwell and his colleagues conducted the prospective cohort study, “Does Running Increase the Risk for
Hip and Knee Arthritis? A Survey of 3,804 Chicago Marathon Runners” …

An electronic survey was completed by 3,804 participants registered for the 2019 and 2021 Chicago
Marathons. The mean age of respondents was 43.9. Participants had run an average of five or fewer
marathons, and they had been running for an average of 14.7 years.
The survey featured 30 questions targeted at assessing:
• Demographic information (age, sex, height, weight, country of origin and occupation)
• Running history (number of years running; average weekly mileage over the past year; average running
pace during training runs; number of marathons completed, including ultra-marathons;
and participation with cross training)
• Hip/knee health (occurrence of hip or knee pain over the past year that prevented running; a history of
hip or knee injuries that required a break in running; a family history of hip and/or knee arthritis;
surgical procedures performed on the hip or knee; and receiving a hip or knee arthritis diagnosis)

https://aaos-annualmeeting-presskit.org/2023/research-news/new-study-demonstrates-that-running-does-not-increase-risk-of-arthritis/ (slightly
modified)

13
14
A) 2023 SPRING EXAM QUESTION (17-19 MIN)

Overall comments
• Cross-sectional studies (1 & 2) imply the risk of confounding variables, or omitted
variable bias (or similar terminology). It is not adequate to write the generic
statement that cross-sectional studies have low internal validity. One should
specify the given weakness (e.g. that even though control variables are included
in study 1 & 2, some variables might be confounding (or maybe there are
variables missing, constituting an omitted variable bias).
• The RCT (study 3) can eliminate the risk of such confounding variables/omitted
variable bias (in study 1-2), but implies lower external validity.
• Study 1 + 2 complement each other and imply higher external validity compared
to 3.
Note also
• Different economic contexts in study 1 & 2
• All relevant control variables collected? Might some be confounding?
• Common methods bias in study 1 & 2?

Make sure to explain how one could remedy [=afhjælpe] the weaknesses.

15
B) 2023 SPRING EXAM QUESTION (12-14 MIN)
Not a RCT: No randomization between treatment and control
As good as random?
Key questions: How was treatment allocated? Might the units in the different conditions be as similar as possible; that is, is there only one difference (the treatment) between them?

Highlight that IT expertise and logistics were partial reasons for allocation of units to treatment
and condition

So, threats to internal validity, due to lack of randomization might imply


- selection bias: Maybe availability of IT expertise, means that most IT knowledgeable units
got the treatment. Maybe only more IT knowledgeable units can cope with chatbots?
- confounding variables: Maybe difference in employee skills, or market conditions across
units. Maybe IT knowledgeable units are in certain regions.

Maybe Hawthorne effect: Treatment group might be happy about access to new technology
– vice versa, control condition might be unhappy about being left out. Related to managerial
visit.

16
C) 2023 SPRING EXAM QUESTION (12-14 MIN)
Similarities
• This is probably the most relevant quote and will be provided by many: “Even post-positivists
have conceded that objectivity is not possible, the results of an inquiry are always shaped by the
interaction of inquirer and inquired into” (both PP & C)
• Some will refer to a clip of “Yes Prime Minister” that I showed in class. It illustrates a problem that
both PP and C agree we have (neutral observations of the world not possible). They have
different solutions to the problem though.
• Complete Truths with capital T is unachievable, due to human limitations, biases. This similarity
leads into differences (either modified objectivity, or subjectivity).
• Both agree qualitative methods and multiple perspectives could be useful

Differences
• PP assumes that there are regularities to uncover, and seeks approximate truths to uncover
those ≠ C
• PP argues we should aim to reduce biases, by relying on triangulation, distanced
• C: ”…it makes the findings of an inquiry not a report about what is ”out there” but the residue of a
process that literally creates them[findings]” ≠ PP
• PP: "...although a real world driven by real natural causes exist, it is impossible for humans truly
to perceive it with their imperfect sensory and intellective mechanisms” ≠ C
• C argues that realities are multiple ≠ PP

17
D) 2021 SPRING EXAM QUESTION (10-12 MIN)
• Please note that the questions asked you to identify which featureS are missing
from the given description.
• The exam set provides some information that is relevant, such as:
interconnectedness, analyzing various elements, deep understanding of
dynamics of the system, very difficult systems etc. This does not fully capture the
essence of complexity though. Notice in particular the text says: “A complex
system or organisation is characterised by a large number of interconnected
elements or variables that relate to each other in often complicated ways.”
Relevant features of complex systems not clearly explained in Q4 text.
• Interdependence: Elements/components in a system interact and adapt to
each other. This constitutes feedback loops and change in the behavior that can
be difficult to predict.
• The whole is more than the sum of its parts (emergence)
• Watts quote, p. 489: "…the institutional and regulatory structure within which it operates [and] the interactions between all these components." (p. 489)

18
E) ASSESSING RESEARCH DESIGNS: WHAT IS THE
MAIN CHALLENGE HERE? (5 MIN)
As stated in the text below, the aim of the study was to figure out if running increases the risk of hip and knee problems.
Please look at the information below and based on our curriculum identify the primary obstacle to trusting the study's
findings.

Dr. Hartwell and his colleagues conducted the prospective cohort study, “Does Running Increase the Risk for Hip and Knee
Arthritis? A Survey of 3,804 Chicago Marathon Runners” …

An electronic survey was completed by 3,804 participants registered for the 2019 and 2021 Chicago Marathons. The mean
age of respondents was 43.9. Participants had run an average of five or fewer marathons, and they had been running for
an average of 14.7 years.
The survey featured 30 questions targeted at assessing:
• Demographic information (age, sex, height, weight, country of origin and occupation)
• Running history (number of years running; average weekly mileage over the past year; average running pace during
training runs; number of marathons completed, including ultra-marathons; and participation with cross training)
• Hip/knee health (occurrence of hip or knee pain over the past year that prevented running; a history of hip or knee
injuries that required a break in running; a family history of hip and/or knee arthritis; surgical procedures performed on
the hip or knee; and receiving a hip or knee arthritis diagnosis)

https://aaos-annualmeeting-presskit.org/2023/research-news/new-study-demonstrates-that-running-does-not-increase-risk-
of-arthritis/ (slightly modified)

19
PHILOSOPHY OF SCIENCE

LESSON 15
Q&A
Carsten Bergenholtz (cabe@mgmt.au.dk)
Department of Management, Aarhus BSS
Aarhus University
AGENDA: VEVOX.APP QUESTIONS

• Flowlock+ / references
• Evaluation: Follow up
• Misc questions
• Overview of the course
• Illustrating how/when Phil of Science is relevant
• Chatbot survey(s)

2
FLOWLOCK+ EXAM
• 5 hours: 5 questions
• No internet access
• Access to PDF documents (notes, texts) on own PC (don’t just
save online) + physical/printed notes allowed
• Access to dictionary
• About references (quotes, chatbot, group notes, general
references)
• In particular when you refer to / quote other sources than
curriculum  provide reference
• How to reference -> see prior exam answers

https://studerende.au.dk/studier/fagportaler/businessadm/eksamen/retningslinjer-for-skriftlige-opgaver-paa-hacandmerc
3
EVALUATION: FOLLOW UP I
RESPONSE RATE: 27%

A bit higher than last year


4
KUHN & TRUTHS

5
OVERVIEW OF THE COURSE / CONNECTIONS

6
ILLUSTRATING HOW / WHEN THE COURSE
CONTENT IS RELEVANT 

7
APPLYING PHIL OF SCIENCE INSIGHTS I:
ASSESSING CONSULTANCY REPORTS

8
CHALLENGES WITH THE REPORT I:
REVERSE CAUSALITY?

https://maycontainlies.com/discernment-matters-even-more/

9
CHALLENGES WITH THE REPORT II:
HETEROGENEOUS VARIABLES

10
CHALLENGES WITH THE REPORT III:
HETEROGENEOUS VARIABLES

11
CHALLENGES WITH THE REPORT IV:
CAUSALITY? & CONFOUNDING VARIABLES?

12
APPLYING PHIL OF SCIENCE INSIGHTS II:
CORRELATION & CAUSALITY IN MEDIA STORY ABOUT SCIENCE

13

50 studies, shared 1.3 million times on social media


https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0196346
CORE GOAL IN PHIL OF
SCIENCE: LEARNING ABOUT
CONFOUNDING & OMITTED
VARIABLES, AND HOW TO
GENERALIZE

14
CHATBOT SURVEY(S)

We would highly appreciate it if you fill out the following survey
• Both about your Bachelor's project and chatbots
• And Phil, the chatbot for this course

Answers can shape the message to supervisors and the future prioritization and adoption of similar chatbot offerings

https://survey.au.dk/LinkCollector?key=DJFNXYJ8SKC6

15
