Chapter 1: Economic questions and data

1.1

The experiment is carried out to test the effect of reading on the improvement of a person's
vocabulary. Students take a vocabulary test and are then randomly assigned to either the control
group or the treatment group in such a way that the initial average test results of the two groups
are equal. The students in the treatment group are required to read from a list of randomly chosen
books for at least a fixed number of hours per day during the testing period. The students in the
control group are told to read nothing during the same period. After the testing period, students
in both groups take the vocabulary test again. The causal effect of reading on the improvement of
a person's vocabulary is estimated by comparing the average test results of the treatment group
and the control group.
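
The design above can be sketched as a small simulation. This is only an illustration under stated assumptions: the score scale, the sample size, the noise levels, and the assumed true effect of 5 points are all hypothetical, and the sketch compares average improvement (post-test minus pre-test) rather than raw post-test averages.

```python
import random
from statistics import mean


def simulate_reading_experiment(n_students=2000, true_effect=5.0, seed=0):
    """Simulate pre-test, random assignment, and post-test.

    All numbers are hypothetical: scores are on an arbitrary scale and
    true_effect is the assumed gain caused by reading.
    """
    rng = random.Random(seed)
    ids = list(range(n_students))
    rng.shuffle(ids)
    treated = set(ids[: n_students // 2])  # random assignment to treatment

    gains = {True: [], False: []}
    for i in range(n_students):
        pre = rng.gauss(50, 10)  # pre-test score
        in_treatment = i in treated
        # Post-test = pre-test + causal effect (if treated) + other influences
        post = pre + true_effect * in_treatment + rng.gauss(0, 5)
        gains[in_treatment].append(post - pre)

    # Difference in average improvement estimates the causal effect
    return mean(gains[True]) - mean(gains[False])


estimate = simulate_reading_experiment()
print(f"estimated effect: {estimate:.2f} (assumed true effect: 5.0)")
```

Because assignment is random, the "other influences" term averages out across the two groups, which is why the difference in means recovers something close to the assumed effect.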

When carrying out this experiment in practice, it's hard to make sure that the students in the
control group do not read any books during the testing period. Another impediment is that it's
hard to hold other factors constant, since one's vocabulary can also improve in other ways, such
as through conversation. It's hard to separate the effects of those factors on the improvement of
one's vocabulary from the effect of reading books.

1.2

The experiment is carried out to test the effect of the consumption of alcohol on long-term memory
loss. The participants take a memory test and are then randomly assigned to either the control
group or the treatment group in such a way that the initial average test results of the two groups
are equal. The people in the treatment group are told to drink a certain amount of alcohol per day
over a predetermined long period of time (the testing period), while those in the control group
are told not to drink any alcohol during the same period. After the testing period, the
participants in both groups take the test again. The causal effect of the consumption of alcohol
on long-term memory loss is estimated by comparing the average final test results of the two
groups.

However, there are some impediments to implementing this experiment in practice. The first is
ethical: would it be ethical to tell someone to drink alcohol over a long period of time purely
for the purpose of an experiment, without knowing whether that person really wants to do so? The
second big impediment is that it is hard to hold constant other factors that also affect one's
memory, such as aging and sickness.

1.3

The experiment is carried out to study the causal effect of hours spent in remedial classes at
school by students who are struggling in mathematics on their final test scores and performance in
the subject. First, the students take a mathematics test without knowing the purpose of the
experiment. The students whose scores are below a certain level are regarded as struggling in
mathematics and are therefore chosen to take part in the experiment.

The participants are then randomly assigned to either the control group or the treatment group.
Students in the treatment group are told to attend the remedial classes at school, while those in
the control group are asked not to do so. After the testing period, the average performance and
the average final test scores of students in the treatment group are compared to those of the
control group to estimate the causal effect.

An observational cross-sectional data set would consist of a number of different schools with the
observations collected at the same point in time. For example, the data set might contain data on
the performance of students who attended and didn't attend the remedial classes in the subject at
100 different schools during the year 2016.

An observational time series data set would consist of observations for a single school collected
at different points in time. For example, the data set might contain data on the final test scores
and the performance of the participants in a certain school who either attended or didn't attend
the remedial class from the year 1990 to 2016.

An observational panel data set would consist of observations from different schools collected at
different points in time. For example, the data might consist of the final test scores and the
performance of the participants in 100 different schools from the year 1990 to 2016.
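
The three data layouts differ only in which (school, year) pairs are observed. A minimal sketch, using the 100 schools and the 1990 to 2016 range from the examples above; the integer school IDs and the choice of school 1 for the time series are hypothetical labels, and each key would carry the scores and attendance status in a real data set.

```python
schools = range(1, 101)    # 100 schools, labelled 1..100 (hypothetical IDs)
years = range(1990, 2017)  # 1990 through 2016 inclusive

# Cross-section: many schools, one point in time
cross_section = [(s, 2016) for s in schools]

# Time series: one school, many points in time
time_series = [(1, y) for y in years]

# Panel: many schools, each observed at many points in time
panel = [(s, y) for s in schools for y in years]

print(len(cross_section), len(time_series), len(panel))  # prints 100 27 2700
```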

Chapter 2: Review of Probability

2.1

Each of the following examples, (a) the gender of the next person one meets, (b) the number of
times a computer crashes, (c) the time it takes to commute to school, (d) whether the computer
one is assigned in the library is new or old, and (e) whether it is raining or not, can be thought
of as random since its outcome is not known with certainty until it actually happens.

2.2

If $X$ and $Y$ are independently distributed, we have:

$\Pr(Y = y \mid X = x) = \Pr(Y = y)$

for all values of $x$ and $y$. That is, the conditional distribution of $Y$ given any value of $X$
equals the marginal distribution of $Y$, so learning the value of $X$ does not change the
probability distribution of $Y$. Therefore, if $X$ and $Y$ are independently distributed, knowing
the value of $X$ tells us nothing about the value of $Y$.
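
The condition can be checked mechanically on a small discrete example. The sketch below is illustrative only: both joint distributions are hypothetical, and the helper names are made up for this example. It compares the conditional distribution of Y given each value of X with the marginal distribution of Y, using exact fractions.

```python
from fractions import Fraction
from itertools import product


def marginal_y(joint):
    """Return Pr(Y = y) for every y."""
    out = {}
    for (_, y), p in joint.items():
        out[y] = out.get(y, 0) + p
    return out


def conditional_y_given_x(joint, x):
    """Return Pr(Y = y | X = x) for every y."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / p_x for (xi, y), p in joint.items() if xi == x}


def is_independent(joint):
    """True if the conditional of Y equals its marginal for every x."""
    marginal = marginal_y(joint)
    xs = {x for x, _ in joint}
    return all(conditional_y_given_x(joint, x) == marginal for x in xs)


# Hypothetical independent case: joint built as a product of marginals
px = {0: Fraction(3, 10), 1: Fraction(7, 10)}
py = {0: Fraction(1, 4), 1: Fraction(3, 4)}
joint_indep = {(x, y): px[x] * py[y] for x, y in product(px, py)}

# Hypothetical dependent case: Y always equals X
joint_dep = {(0, 0): Fraction(1, 2), (0, 1): Fraction(0),
             (1, 0): Fraction(0), (1, 1): Fraction(1, 2)}

print(is_independent(joint_indep), is_independent(joint_dep))
```

In the second case, knowing X pins down Y completely, so the conditional and marginal distributions differ and the check fails.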

2.3

Though there is no direct causal link between the amount of rainfall in one's hometown during a
randomly selected month and the number of children born in Los Angeles during the same
month, the amount of rainfall might tell something about the season, and births are seasonal.
Thus, knowing the amount of rainfall might reveal something about the season, which in turn
reveals something about the number of children born. Therefore, they are not independently
distributed.

2.4

Since the students in a sample of four are chosen randomly, knowing the average weight of one
specific sample of four doesn't give any information about the average weight of another
randomly chosen sample of four. This is simple random sampling, in which a sample of four is
chosen randomly from a population of eighty and each sample in the pool of possible
samples is an independently distributed random variable. It follows that the average
weight calculated from the sample is also a random variable, so the average weight of the sample
might not equal that of the population.
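
A short simulation makes the point concrete. It is a sketch under stated assumptions: the population of eighty weights is generated from a hypothetical uniform range of 45 to 90 kg, and the function name is made up for this example.

```python
import random
from statistics import mean

rng = random.Random(1)

# Hypothetical population: weights (kg) of 80 students
population = [rng.uniform(45, 90) for _ in range(80)]


def sample_average(population, k, rng):
    """Average weight of a simple random sample of k students (no replacement)."""
    return mean(rng.sample(population, k))


# Repeat the sampling many times: each draw gives a different sample average
averages = [sample_average(population, 4, rng) for _ in range(1000)]

print(f"population mean:         {mean(population):.2f}")
print(f"smallest sample average: {min(averages):.2f}")
print(f"largest sample average:  {max(averages):.2f}")
```

The sample averages scatter around the population mean rather than matching it, which is what it means for the sample average to be a random variable.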

2.5

Since $Y_1, \dots, Y_n$ are i.i.d. random variables with a $N(1, 4)$ distribution, their sample
average $\bar{Y}$ is distributed $N(1, 4/n)$. For $n = 2$, $n = 10$, and $n = 100$, the
distributions of $\bar{Y}$ are therefore $N(1, 2)$, $N(1, 2/5)$, and $N(1, 1/25)$, respectively.

[Plots of the three densities: $N(1, 2)$, $N(1, 2/5)$, $N(1, 1/25)$]
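
The arithmetic behind these three distributions can be sketched in a few lines, assuming the standard result that the average of n i.i.d. draws with mean mu and variance sigma^2 has mean mu and variance sigma^2 / n (and is itself normal when the draws are normal). The function name is hypothetical.

```python
from fractions import Fraction


def sample_mean_distribution(mu, sigma2, n):
    """Mean and variance of the average of n i.i.d. N(mu, sigma2) draws."""
    return mu, Fraction(sigma2, n)


for n in (2, 10, 100):
    mean, var = sample_mean_distribution(1, 4, n)
    print(f"n = {n}: N({mean}, {var})")
```

The variance shrinks in proportion to 1/n, so the density of the sample average becomes more tightly concentrated around 1 as n grows.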

2.6

The normal approximation fits better when $n = 25$ and $n = 100$ than when $n = 5$, so
calculating $\Pr(\bar{Y} \le 0.1)$ with the normal approximation gives a more accurate answer for
$n = 25$ and $n = 100$.
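
The comparison can be illustrated numerically. The sketch below rests on assumptions: the distribution from the textbook figure is not reproduced here, so a skewed Bernoulli distribution with success probability 0.05 stands in for it, and the probability of interest is taken to be that the sample average is at most 0.1. The exact probability comes from the binomial distribution and the approximation from the central limit theorem.

```python
from math import comb, floor, sqrt
from statistics import NormalDist

P = 0.05  # hypothetical success probability (skewed stand-in distribution)


def exact_prob(n, c, p=P):
    """Exact Pr(Ybar <= c) when the Y_i are i.i.d. Bernoulli(p)."""
    k_max = floor(n * c + 1e-9)  # largest number of ones with k / n <= c
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_max + 1))


def normal_approx(n, c, p=P):
    """Central-limit approximation: Ybar is roughly N(p, p(1 - p) / n)."""
    return NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n)).cdf(c)


errors = {}
for n in (5, 25, 100):
    exact = exact_prob(n, 0.1)
    approx = normal_approx(n, 0.1)
    errors[n] = abs(exact - approx)
    print(f"n = {n:3d}: exact = {exact:.4f}, normal = {approx:.4f}, "
          f"abs. error = {errors[n]:.4f}")
```

Under this stand-in distribution the approximation error is largest at n = 5; how quickly it shrinks depends on how skewed the underlying distribution is.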
