Lecture 1

AIL 7310: MACHINE LEARNING FOR ECONOMICS

Lecture 1

3rd August, 2023

AIL 7310: ML for Econ Lecture 1 1 / 15


What is Machine Learning?

A concise definition by Athey (2018):

"...Machine Learning is a field that develops algorithms designed to
be applied to datasets, with the main areas of focus being prediction
(regression), classification, and clustering or grouping tasks."

Broadly, there are three types of Machine Learning:

Supervised Machine Learning
Unsupervised Machine Learning
Reinforcement Learning

Most supervised algorithms and many unsupervised algorithms are actually
quite old.

Below is a list of algorithms and their approximate year of discovery:

Linear and logistic regression (1805, 1958).
Decision and regression trees (1984).
K-nearest neighbors (1967).
Support vector machines (1990s).
Neural networks (1940s, 1970s, 1980s, 1990s).
Random forests (2001), bagging (1996), boosting (1990).

So why all the fuss now?

The Rise of Big Data

What is Big Data?

The 4 Vs of Big Data:

Volume - Scale of data.
Velocity - Analysis of streaming data.
Variety - Different forms of data.
Veracity - Uncertainty of data.

"A billion hours ago, modern homo sapiens emerged. A billion minutes
ago, Christianity began. A billion seconds ago, the IBM PC was
released. A billion Google searches ago ... was this morning."
Hal Varian (2013)
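The time scales in the quote can be checked with quick arithmetic. The sketch below converts a billion seconds, minutes, and hours into years, assuming a 365.25-day year; the function and variable names are illustrative and not part of the lecture.

```python
BILLION = 1_000_000_000
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: average Julian year


def billion_units_in_years(seconds_per_unit):
    """Length of one billion time units, expressed in years."""
    return BILLION * seconds_per_unit / SECONDS_PER_YEAR


years_for_seconds = billion_units_in_years(1)     # roughly 32 years
years_for_minutes = billion_units_in_years(60)    # roughly 1,900 years
years_for_hours = billion_units_in_years(3600)    # roughly 114,000 years

for label, value in [("seconds", years_for_seconds),
                     ("minutes", years_for_minutes),
                     ("hours", years_for_hours)]:
    print(f"A billion {label} is about {value:,.0f} years")
```

Counting back from 2013, a billion seconds lands in 1981, the year the IBM PC was released.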

Big Data and ML

Big Data + Computational Advancements = Rise of ML

What is Economics?

Economics is the study of choices we make under scarcity.

All economics questions arise because we have unlimited wants but limited
resources.

This leads to scarcity, so we need to make choices.

Note: Anything can be scarce, e.g. money, personnel, physical
infrastructure, time, attention.

Why do we need a separate ML for Economics course?

Economics has traditionally focused on causality.

We focus on finding the marginal effect of policies or interventions.
Traditional econometrics has been about this.

E.g. How many extra children can be vaccinated if an additional one lakh
rupees were allocated?

In contrast, ML has traditionally been about prediction.

Very recently, economists and econometricians have been combining
traditional econometrics tools with ML tools to get new Causal ML tools.

E.g. Double/Debiased Lasso, Causal ML

Economic data has some specific issues that were not traditionally
addressed by ML.

E.g. panel data and time-series data (serial correlation).

Most important: we need economic theory and intuition to get
meaningful and impactful applications of ML in economics.
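To make "serial correlation" concrete, here is a small sketch on simulated data (not from the lecture). It compares the lag-1 autocorrelation of independent draws with that of an AR(1) series built from the same shocks. The persistence of 0.8, the sample size, and the function name `lag1_autocorrelation` are assumptions chosen for illustration.

```python
import random


def lag1_autocorrelation(series):
    """Sample correlation between each observation and the one before it."""
    n = len(series)
    m = sum(series) / n
    num = sum((series[t] - m) * (series[t - 1] - m) for t in range(1, n))
    den = sum((x - m) ** 2 for x in series)
    return num / den


rng = random.Random(0)  # fixed seed so the run is reproducible

# Independent draws: the usual i.i.d. setting assumed by many ML methods.
iid = [rng.gauss(0, 1) for _ in range(2000)]

# AR(1) series: each value carries over 80% of the previous value plus a shock.
ar1 = [0.0]
for shock in iid[1:]:
    ar1.append(0.8 * ar1[-1] + shock)

print("i.i.d. lag-1 autocorrelation:", round(lag1_autocorrelation(iid), 3))
print("AR(1)  lag-1 autocorrelation:", round(lag1_autocorrelation(ar1), 3))
```

When observations depend on their own past like this, randomly shuffled train/test splits can leak information across time, which is one reason such data needs separate treatment.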

Machine Learning in Economics

So how does Economics use Machine Learning?

There are three commonly used ways (so far) in empirical research:

Use ML tools to harvest non-traditional data (usually Big Data) - Big Data and Economics.
Pure prediction problems - predict health/labour/education events/outcomes.
Combine inference with prediction - causal ML (economists’ contribution to the ML literature).

How Economics uses Machine Learning

Incorporating Big Data into economic research

Text Analysis: analyzing MPC meeting minutes to understand members’ attitudes towards the inflation target.
Social Media data: harvesting data from Twitter/Facebook to understand sentiments about a brand.
Remote Sensing Data: using data from satellite images to identify instances of crop-burning.
Night Lights Data: using night lights data to construct small area estimates of economic outcomes, e.g. poverty.
and many more...

Predicting various economic outcomes


Predicting school dropouts, child mortality, creditworthiness of loan candidates, food security status of households.
Targeting inspections to increase occupational safety and decrease injuries in the workplace (Johnson et al. (SSRN, 2020)).
Constructing measures/indices: identifying the best survey questions to predict an agency score of women (Jayachandran et al., NBER, 2021).
Better survey design (Kasy and Sautmann, Econometrica, 2021).

Combining Econometrics (Causal Inference) and Machine Learning (Predictions)

Double/Debiased Lasso: Here control variables are selected using ML to conduct propensity score matching and get doubly robust causal estimates (Belloni et al. (Journal of Economic Perspectives, 2014), Angrist and Frandsen (NBER, 2019)).
Causal Forest: Decision trees are used to estimate heterogeneous treatment effects of any intervention (e.g. summer jobs, Davis and Heller (AER P&P, 2017)).
Matrix Completion method for causal inference: matrix completion methods (e.g. singular value decomposition), primarily used in recommendation engines, are used for causal inference (Athey and Imbens, JASA, 2021).
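The intuition behind double/debiased Lasso can be sketched on simulated data. This is a simplified illustration under assumptions that are not from the lecture: it uses residual-on-residual regression ("partialling out") rather than matching, a fixed penalty `lam`, roughly centred variables, and no cross-fitting. The helper names (`lasso_residuals`, `slope`, `simulate`) are hypothetical, and the Lasso is a bare-bones coordinate descent written with the standard library only.

```python
import random


def soft_threshold(z, lam):
    """Shrink z towards zero by lam (the Lasso coordinate update)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_residuals(X, y, lam, n_sweeps=50):
    """Coordinate-descent Lasso without intercept; returns (coefficients, residuals).

    Minimises (1/2n) * sum((y - X b)^2) + lam * sum(|b_j|).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new_beta
    return beta, resid


def slope(x, y):
    """OLS slope of y on x without intercept (variables assumed centred)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def simulate(n=400, p=20, theta=1.0, seed=0):
    """Simulated data: control 0 drives both the treatment d and the outcome y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    d = [row[0] + rng.gauss(0, 1) for row in X]
    y = [theta * di + 2.0 * row[0] + rng.gauss(0, 1) for di, row in zip(d, X)]
    return X, d, y


X, d, y = simulate()

# Naive estimate: regress y on d and ignore the confounding control.
theta_naive = slope(d, y)

# Double Lasso (partialling out): remove what the controls predict from both
# the treatment and the outcome, then regress residual on residual.
_, d_res = lasso_residuals(X, d, lam=0.1)
_, y_res = lasso_residuals(X, y, lam=0.1)
theta_double = slope(d_res, y_res)

print("true effect: 1.0")
print("naive estimate:", round(theta_naive, 2))
print("double Lasso estimate:", round(theta_double, 2))
```

In this simulation the naive slope absorbs the effect of the confounder, while the residual-on-residual estimate should land close to the true effect. In applied work the penalty would normally be chosen by cross-validation or a plug-in rule, and standard errors would be needed for inference.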

What Machine Learning can learn from Econometrics

Approaches to working with non-iid data (esp. panel data)
Causal inference tools
▶ confounding variables
▶ randomized controlled trials/lab-in-field experiments (Duflo, Kremer and Banerjee, Nobel Prize 2019)
▶ natural experiments/quasi-experiments (Card, Angrist and Imbens, Nobel Prize 2021)
▶ difference-in-differences
▶ regression discontinuity
▶ instrumental variables
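As one example from the list above, a two-group, two-period difference-in-differences estimate can be computed directly from four cell means. The numbers below are made up for illustration, and the function names are hypothetical. The estimate is only causal under the usual parallel-trends assumption, which this sketch does not test.

```python
from statistics import mean

# Hypothetical outcomes: (group, period, outcome)
observations = [
    ("treated", "before", 40.0), ("treated", "before", 42.0),
    ("treated", "after", 55.0), ("treated", "after", 57.0),
    ("control", "before", 30.0), ("control", "before", 32.0),
    ("control", "after", 35.0), ("control", "after", 37.0),
]


def cell_mean(data, group, period):
    """Average outcome for one group in one period."""
    return mean(y for g, t, y in data if g == group and t == period)


def diff_in_diff(data):
    """Change in the treated group minus change in the control group."""
    treated_change = cell_mean(data, "treated", "after") - cell_mean(data, "treated", "before")
    control_change = cell_mean(data, "control", "after") - cell_mean(data, "control", "before")
    return treated_change - control_change


effect = diff_in_diff(observations)
print("difference-in-differences estimate:", effect)
```

Here the treated group improves by 15 and the control group by 5, so the estimate attributes the remaining 10 to the intervention.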

Conclusions

ML in economics is more than the sum of its parts. It is much more than
mere Applied ML. Remember that econometrics was the original data
science before "data science" was coined and became cool!

Econometrics is the study of measuring economic concepts. The field has
developed numerous techniques over the last 50 years that have proved to be
immensely useful. They are used by other fields as well, e.g. epidemiology,
neuroscience, sociology, business, finance, political science.

The combination of ML and Econometrics is just the beginning of many
more exciting tools and contributions.
