Professional Documents
Culture Documents
Statistics
Statistics
Kosuke Imai
Department of Politics
Princeton University
Fall 2011
Sampling distribution of θ̂
Estimation error: θ̂ − θ0 where θ0 is the true value
Bias: B = E(θ̂) − θ0
Efficiency: V(θ̂)
Mean Squared Error (MSE): E(θ̂ − θ0 )2 = B 2 + V(θ̂)
Consistency: θ̂ → θ0 as n increases
Quota sampling
fixed quota of certain respondents for each interviewer
sample resembles the population w.r.t. these characteristics
potential selection bias
R EADINGS : FPP Chapters 20, 21, and 23; A&F 4.4–4.5, 5.1–5.4
Unbiasedness: E(X ) = p
Consistency (Law of Large Numbers): X → p as n increases
δ̂ − δ approx.
Z = ∼ N (0, 1)
s.e.
i.i.d.
If Xi ∼ N (µ, σ 2 ), then X ∼ N (µ, σ 2 /n) without the CLT
√
Standard error: s.e. = SX / n
Moreover, without the CLT
X −µ exactly
t−statistic = ∼ tn−1
s.e.
where tn−1 is the t distribution with n − 1 degrees of freedom
Use tn−1 (rather than N (0, 1)) to obtain the critical value for exact
confidence intervals
n=2
n = 10
n = 50
density
0.2
0.1
0.0
−6 −4 −2 0 2 4 6
0.20
Group: Germany vs Poland
Group: Croatia vs Germany
0.15
Group: Austria vs Germany
Density
Quarter-final: Portugal vs Germany
0.10
Semi-final: Germany vs Turkey
0.05
Final: Germany vs Spain
A total of 14 matches
0.00
12 correct guesses 0 2 4 6 8 10 12 14
1 − 0.9510 ≈ 0.4
If you do this with enough animals, you will find another Paul
Kosuke Imai (Princeton University) Statistical Inference POL 345 Lecture 20 / 46
Hypothesis Testing for Proportions
1 Hypotheses – H0 : p = p0 and H1 : p 6= p0
2 Test statistic: X
3 Under the null, by the central limit theorem
X − p0 X − p0 approx.
Z −statistic = = p ∼ N (0, 1)
s.e. p0 (1 − p0 )/n
4 Is Zobs unusual under the null?
Reject the null when |Zobs | > zα/2
Retain the null when |Zobs | ≤ zα/2
H0 : p = p0 and Ha : p > p0
X ∼ N (p∗ , p∗ (1 − p∗ )/n)
p
Reject H0 if X > p0 + zα/2 × p0 (1 − p0 )/n
H0 : δ = 0 and H1 : δ 6= 0
How large the sample size must be in order to reject H0 at the
95% confidence level with probability 0.9 when δ = ±0.02?
Sampling distribution of δ̂ when δ = 0.02:
0.4
n = 50
0.2
n = 100
n = 1000
0.0
90
80
70
60
Frequency
50
40
30
20
10
0
0.16 1.06 1.96 2.86 3.76 4.66 5.56 6.46 7.36 8.26 9.16 10.06 10.96 11.86 12.76 13.66
z-Statistic
est. = X T − X C = 0.07
Standard error:
s
X T (1 − X T ) X C (1 − X C )
s.e. = + = 0.028
nT nC
Two-sample test
H0 : pT = pC and H1 : pT 6=
pC .
Reference distribution: N 0, p(1−p)
nT + p(1−p)
nC
p-value: 0.010
Power calculation:
pT = 0.37 and pC = 0.30
Two-sample test at the 5% significance level
Equal group size: nT = nC
If n = 1000, what is the power of the test?
Residuals:
Min 1Q Median 3Q Max
-80.333 -16.756 -3.967 17.744 87.089
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -38.399 8.708 -4.410 2.78e-05 ***
comp 77.465 16.663 4.649 1.10e-05 ***
--
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
100
Margin of Victory (percentage points) ● ●
●
●
●
50
●
● ● ●
● ● ● ●
● ●●
● ● ● ● ●
●
● ●● ●
● ● ●●
● ● ●
● ●●●● ●
● ● ●
●● ● ● ● ● ●
0
● ●
● ● ●
● ● ● ● ●
● ●● ●
●●
●● ●
● ●
● ●●●
● ●
● ●● ● ● ●
●
●
●
−50
● ● ●
●
●●
●●
●
alpha.hat = −38.4 ( s.e. = 16.7 )
−100
Residuals:
Min 1Q Median 3Q Max
-4.8182 -1.6462 -0.1049 1.3896 5.9684
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 24.883976 0.078750 315.986 <2e-16 ***
ELF -0.023370 0.001653 -14.142 <2e-16 ***
oil -0.003064 0.004527 -0.677 0.499
--
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
1.0
● ● ●●
Democratic Voteshare in Current Election
●
0.8
●
●
● ● ●
● ● ● ● ● ● ●
●
● ● ●● ●● ●● ● ● ●●●
● ● ●●
● ● ●●● ●
● ● ●
●
● ●●●● ● ● ●● ● ● ●●● ● ●●
0.6
● ●
● ● ● ● ● ● ● ●● ● ●● ●●●● ● ●● ● ●
● ●● ●
●●
● ● ● ●●
● ● ● ● ●● ● ●
●
● ●●● ●●●
● ● ● ●● ● ● ● ● ● ● ● ●
● ●● ● ● ●●
● ●
● ●●● ● ●● ● ● ●●● ●● ●
●● ●
●● ●●●● ● ● ●● ● ● ●●● ●● ● ●●●●
●● ●●●
●● ●●
●● ●
●●
● ● ●●
●●● ●● ● ●●● ●● ●●
●●●●● ●● ●
● ●● ● ● ● ● ●●●●● ● ● ● ●● ●● ● ● ●● ●●● ●●●● ●● ●●●●●●
●●
●●● ● ●●● ●●●
●● ● ●
●
●●● ● ●●●●●● ● ● ●● ● ● ● ●●●
●●●●●
●● ●●● ●● ●● ●●●
●●●
● ● ●● ● ● ● ● ●●●● ● ● ● ●●●●●
●
●●● ●●● ● ●●●●●
● ●● ●●●● ●
●
●
●●● ●●
●
● ● ●
●●
●● ●●
●● ●●●
●●●●●● ●●
● ● ●● ● ●
●●●● ● ● ● ● ●● ●● ●● ● ●
● ●●●● ● ●●● ● ● ● ● ● ●● ●● ●● ●
● ● ●● ● ●●● ● ● ●
●● ● ●● ● ●●●●● ● ● ●●● ● ●
0.4
●
● ●●
● ●● ● ● ●
●● ● ● ●
● ● ●
●●● ● ● ● ● ● ●● ● ●
●● ●●●●●●● ● ● ●
●●● ●
● ●
● ●
● ●● ●● ● ● ● ● ● ● ●● ●
●
● ●● ●●● ●
● ●
● ● ● ●●
0.2
●●●
0.0
● ● ●
Final Exam:
1 A take home exam
2 40% of the overall grade
3 An open book exam (no outside materials) but no collaboration
4 Posted on Jan 18 and due noon on Jan 23
5 Another problem set (with conceptual and data analysis questions)
Preparation: Review the materials you did not understand well. Do
some practice problems including a practice final exam