Computer Project MAS291 SE150263
Subsample
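id <- 263        # seed value
set.seed(id)     # make the random subsample reproducible
wage <- read.csv("wage.csv")
# draw 30 rows at random, without replacement
yourdata <- wage[sample(1:nrow(wage), 30), ]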
row   educ  age  wage  IQ   exper  lwage     black
1917  14    27   265   75   7      5.57973   1
161   17    27   285   121  4      5.652489  0
1592  12    34   817   93   16     6.705639  0
1589  16    27   529   108  5      6.270988  1
2113  16    24   565   NA   2      6.336826  0
2835  12    28   558   90   10     6.324359  0
1467  12    31   428   NA   13     6.059123  1
2566  12    29   753   NA   11     6.624065  1
1627  11    25   700   93   8      6.55108   0
2994  12    27   263   100  9      5.572154  0
2957  10    25   200   NA   9      5.298317  1
956   11    26   705   89   9      6.558198  0
606   9     24   427   NA   9      6.056784  0
2128  17    25   442   NA   2      6.09131   0
223   12    26   475   104  8      6.163315  0
1064  12    33   1374  NA   15     7.225482  0
2350  15    27   577   93   6      6.357842  0
1946  12    31   365   70   13     5.899898  1
227   13    26   300   122  7      5.703783  0
2202  18    33   1309  NA   9      7.177019  0
634   12    33   708   101  15     6.562444  0
2025  12    30   325   86   12     5.783825  1
1712  14    27   250   NA   7      5.521461  1
655   18    25   462   107  1      6.135565  0
1224  13    32   650   99   13     6.476973  0
1780  13    29   364   80   10     5.897154  1
1508  17    29   955   132  6      6.861712  0
829   12    24   640   NA   6      6.461468  1
2765  17    25   560   116  2      6.327937  0
1767  13    24   462   NA   5      6.135565  0
Topic: Probability
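Ex 1
X: number of black individuals in a sample of 30 drawn from the full data set
Number of black individuals in the data: 703 (out of 3010, so 2307 are not black)
$$P(X=6) = \frac{\binom{703}{6}\binom{2307}{24}}{\binom{3010}{30}} \approx 0.16$$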
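The hypergeometric probability in Ex 1 can be checked numerically. The following is a minimal sketch in base R, assuming the counts quoted in the text (3010 rows, 703 of them black, 30 drawn); the variable names are illustrative only.

```r
# Counts quoted in the text; variable names are hypothetical.
N_total <- 3010   # rows in the full data set
N_black <- 703    # rows with black == 1
n_draw  <- 30     # size of the sample

# dhyper(x, m, n, k): probability of x "successes" in k draws without
# replacement from a population with m successes and n failures.
p_exact <- dhyper(6, m = N_black, n = N_total - N_black, k = n_draw)

# The same probability written out with binomial coefficients.
p_formula <- choose(N_black, 6) * choose(N_total - N_black, 24) /
  choose(N_total, n_draw)

print(c(dhyper = p_exact, formula = p_formula))
```

Both lines evaluate the same quantity; `dhyper` is usually preferable because it avoids forming very large binomial coefficients explicitly.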
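Ex 2
X: number of black individuals among 5 people drawn from the subsample
Number of black individuals in the subsample: 10 (out of 30)
The probability that at least one of the 5 is black:
$$P(X \ge 1) = 1 - P(X=0) = 1 - \frac{\binom{20}{5}}{\binom{30}{5}} = 0.891$$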
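As a quick check of Ex 2, the complement rule can be evaluated in base R. This sketch assumes the subsample counts stated in the text (10 black, 20 not black, 5 drawn); the variable names are illustrative.

```r
# Probability that none of the 5 drawn individuals is black.
p_none <- choose(20, 5) / choose(30, 5)

# Complement: at least one black individual among the 5.
p_at_least_one <- 1 - p_none

# Cross-check with the hypergeometric upper tail P(X > 0).
p_check <- phyper(0, m = 10, n = 20, k = 5, lower.tail = FALSE)

print(round(c(p_at_least_one, p_check), 3))
```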
Ex 5
X: discrete uniform on the integers 24, 25, ..., 34
$$E(X) = \frac{24+34}{2} = 29$$
$$V(X) = \frac{(34-24+1)^2-1}{12} = 10$$
= 0.2776-0.002
= 0.2774
Ex 7
μ = 99.04
σ = 20.2
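a. For a sample of size n = 20, the sample mean $\bar X$ has standard error $\sigma/\sqrt{n} = 20.2/\sqrt{20}$:
$$P(100<\bar X<110) = P(\bar X<110) - P(\bar X<100)$$
$$= P\left(Z<\frac{110-99.04}{20.2/\sqrt{20}}\right) - P\left(Z<\frac{100-99.04}{20.2/\sqrt{20}}\right)$$
$$= P(Z<2.43) - P(Z<0.21)$$
$$= 0.9925 - 0.5832$$
$$= 0.4093$$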
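b. For a standard error $E = 1$ we need $\sigma/\sqrt{n} = E$, so $\sqrt{n} = \sigma/E = 20.2$ and $n = 20.2^2 = 408.04$.
The sample size must be at least 409 for the standard error of the sample mean to be no larger than 1.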
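Both parts of Ex 7 can be reproduced with `pnorm`. This is a sketch under the values given in the text (μ = 99.04, σ = 20.2, n = 20); it works with unrounded z-scores, so the probability differs slightly from the table-based 0.4093.

```r
mu    <- 99.04
sigma <- 20.2
n     <- 20

# a. Standard error of the sample mean and P(100 < Xbar < 110).
se <- sigma / sqrt(n)
p_between <- pnorm(110, mean = mu, sd = se) - pnorm(100, mean = mu, sd = se)

# b. Smallest n with sigma / sqrt(n) <= 1, i.e. n >= sigma^2.
target_se <- 1
n_needed  <- ceiling((sigma / target_se)^2)

print(c(p_between = p_between, n_needed = n_needed))
```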
Ex 8
- Statistics for wage:
R Code:
Statistic      Value
Minimum        200
Maximum        1374
1st Quartile   364.25
3rd Quartile   687.5
Mean           557.1
Median         502
Variance       79733.4724
Stdev          282.371161
- Statistics for IQ:
Statistic      Value
NAs            11
Minimum        70
Maximum        132
1st Quartile   89.5
3rd Quartile   107.5
Mean           98.894737
Median         99
Variance       266.766082
Stdev          16.332975
- Statistics for educ:
Statistic      Value
Minimum        9
Maximum        18
1st Quartile   12
3rd Quartile   15.75
Mean           13.466667
Median         12.5
Variance       6.050575
Stdev          2.459792
- Statistics for exper:
Statistic      Value
Minimum        1
Maximum        16
1st Quartile   6
3rd Quartile   10.75
Mean           8.3
Median         8.5
Variance       16.493103
Stdev          4.06117
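The summary statistics above can be reproduced along the following lines. This is only a sketch, not the original R code of the report: the `wage` vector is typed in from the subsample listing so the block runs without `wage.csv`, and the name `wage_stats` is illustrative.

```r
# Wage column of the 30-row subsample, copied from the listing above.
wage <- c(265, 285, 817, 529, 565, 558, 428, 753, 700, 263,
          200, 705, 427, 442, 475, 1374, 577, 365, 300, 1309,
          708, 325, 250, 462, 650, 364, 955, 640, 560, 462)

# quantile() uses R's default (type 7) interpolation.
wage_stats <- c(
  Minimum  = min(wage),
  Maximum  = max(wage),
  Q1       = unname(quantile(wage, 0.25)),
  Q3       = unname(quantile(wage, 0.75)),
  Mean     = mean(wage),
  Median   = median(wage),
  Variance = var(wage),   # sample variance (divides by n - 1)
  Stdev    = sd(wage)
)
print(wage_stats)
```

For a column with missing values such as IQ, the same functions accept `na.rm = TRUE`, and `sum(is.na(x))` counts the NAs.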
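$$\hat p - z_{\alpha/2}\sqrt{\frac{\hat p(1-\hat p)}{n}} \le p \le \hat p + z_{\alpha/2}\sqrt{\frac{\hat p(1-\hat p)}{n}}$$
$$0.682 - 2.576\sqrt{\frac{0.682 \cdot 0.318}{3010}} \le p \le 0.682 + 2.576\sqrt{\frac{0.682 \cdot 0.318}{3010}}$$
$$0.6602 \le p \le 0.7038$$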
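The interval above can be recomputed directly. A minimal sketch, assuming the values used in the text (p̂ = 0.682, n = 3010, z = 2.576 for 99% confidence); the result agrees with the reported bounds up to rounding in the fourth decimal.

```r
p_hat <- 0.682
n     <- 3010
z     <- 2.576   # table value used in the text; qnorm(0.995) gives about 2.5758

# Half-width of the large-sample confidence interval for a proportion.
half_width <- z * sqrt(p_hat * (1 - p_hat) / n)

ci <- c(lower = p_hat - half_width, upper = p_hat + half_width)
print(round(ci, 4))
```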
Ex 14
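Number of black individuals = 703
$$\hat p = \frac{703}{3010} = 0.23355$$
$$n = \left(\frac{z_{\alpha/2}}{E}\right)^2 \hat p(1-\hat p) = \left(\frac{2.576}{0.01}\right)^2 \cdot 0.23355 \cdot 0.76645 = 11878.33$$
Rounding up, the required sample size is n = 11879.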
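The sample-size calculation in Ex 14 can be sketched as follows, assuming the same inputs as the text (703 of 3010, z = 2.576, margin of error E = 0.01). Because p̂ is not rounded here, the unrounded n differs from 11878.33 in the first decimal.

```r
p_hat <- 703 / 3010
z     <- 2.576   # table value used in the text
E     <- 0.01    # desired margin of error

# Required n from n = (z / E)^2 * p_hat * (1 - p_hat), rounded up.
n_raw    <- (z / E)^2 * p_hat * (1 - p_hat)
n_needed <- ceiling(n_raw)

print(c(n_raw = n_raw, n_needed = n_needed))
```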
$$z_0 = \frac{\bar x-\mu}{\sigma/\sqrt{n}} = \frac{98.894-100}{15/\sqrt{19}} = -0.321$$
Since $|z_0| = 0.321 < z_{\alpha/2} = 2.33$:
Fail to reject H0
Ex 16
H0: mean(lwage) = 6
H1: mean(lwage) > 6
Sample mean: $\bar x = 6.212$
Sample standard deviation: $s = 0.473$
$$t_0 = \frac{\bar x-\mu}{s/\sqrt{n}} = \frac{6.212-6}{0.473/\sqrt{30}} = 2.455 > t_{\alpha,n-1} = 1.31$$
Reject H0
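The one-sample t test of Ex 16 can be reproduced from the summary values. In this sketch the significance level α = 0.10 is an assumption inferred from the critical value 1.31 quoted in the text; it is not stated there explicitly.

```r
x_bar <- 6.212   # sample mean of lwage
s     <- 0.473   # sample standard deviation
n     <- 30
mu0   <- 6
alpha <- 0.10    # assumed: matches the critical value 1.31 in the text

t0      <- (x_bar - mu0) / (s / sqrt(n))
t_crit  <- qt(1 - alpha, df = n - 1)             # upper-tail critical value
p_value <- pt(t0, df = n - 1, lower.tail = FALSE)

print(c(t0 = t0, t_crit = t_crit, p_value = p_value))
reject_h0 <- t0 > t_crit
```

With the raw data available, `t.test(yourdata$lwage, mu = 6, alternative = "greater")` performs the same test in one call.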
Ex 17
H0: p = 0.07
H1: p > 0.07
Number of people with less than 10 years of work experience = 1850
$$\hat p = \frac{1850}{3010} = 0.615$$
$$z_0 = \frac{\hat p-p_0}{\sqrt{\hat p(1-\hat p)/n}} = \frac{0.615-0.07}{\sqrt{0.615 \cdot 0.385/3010}} = 61.4485 > z_{\alpha} = 2.05$$
Reject H0
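The test statistic in Ex 17 can be recomputed as below. This sketch follows the text in using p̂ in the standard error; many textbooks use the null value p0 there instead, so that variant is shown for comparison. The level α = 0.02 is an assumption inferred from the critical value 2.05.

```r
n     <- 3010
p_hat <- round(1850 / n, 3)   # 0.615, rounded as in the text
p0    <- 0.07
alpha <- 0.02                 # assumed: qnorm(0.98) is about 2.05

# Statistic as computed in the text (standard error based on p_hat).
z_text <- (p_hat - p0) / sqrt(p_hat * (1 - p_hat) / n)

# Common textbook variant (standard error based on p0), for comparison.
z_null <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

z_crit <- qnorm(1 - alpha)    # upper-tail critical value
print(c(z_text = z_text, z_null = z_null, z_crit = z_crit))
```

Either form leads to the same decision here, since both statistics are far above the critical value.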
Ex 18
a.
H0: Δ = 0
H1: Δ ≠ 0
X1: wage of black individuals
X2: wage of non-black individuals
$\bar x_1 = 411.9$, $n_1 = 10$, $s_1 = 178.6023$
$\bar x_2 = 629.7$, $n_2 = 20$, $s_2 = 299.9067$
$$s_p = \sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}} = \sqrt{\frac{(10-1)\cdot 178.6023^2+(20-1)\cdot 299.9067^2}{28}} = 266.9955$$
$$t_0 = \frac{\bar x_1-\bar x_2-\Delta}{\sqrt{\frac{s_p^2}{n_1}+\frac{s_p^2}{n_2}}} = -2.106$$
$$-t_{0.005,28} = -2.76 < t_0 < t_{0.005,28} = 2.76$$
Fail to reject H0
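The pooled two-sample t test in part a can be checked from the group summaries. This is a sketch assuming equal variances, as the pooled formula does, and α = 0.01 (two-sided), inferred from the subscript 0.005; the degrees of freedom are n1 + n2 − 2 = 28.

```r
x1 <- 411.9; n1 <- 10; s1 <- 178.6023   # black
x2 <- 629.7; n2 <- 20; s2 <- 299.9067   # not black
delta0 <- 0
alpha  <- 0.01                          # assumed from t_{0.005}

df <- n1 + n2 - 2
sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df)   # pooled std. dev.
t0 <- (x1 - x2 - delta0) / (sp * sqrt(1 / n1 + 1 / n2))

t_crit    <- qt(1 - alpha / 2, df = df)
reject_h0 <- abs(t0) > t_crit

print(c(sp = sp, t0 = t0, t_crit = t_crit))
```

With the raw subsample, `t.test(wage ~ black, data = yourdata, var.equal = TRUE)` runs the same pooled test.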
b.
H0: p1 − p2 = 0
H1: p1 − p2 ≠ 0
X1: black
X2: not black
Ex 19:
x: “education”
y: “wage”
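Σx = 404, Σx² = 5616, Σxy = 228219
Σy = 16713, Σy² = 11623083
$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 175.4667$$
$$S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 3150.6$$
a. Slope: $\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{3150.6}{175.4667} = 17.955$
$$se(\hat\beta_1) = \sqrt{\frac{\hat\sigma^2}{S_{xx}}} = 21.42$$
b. Test statistic: $t_0 = \frac{\hat\beta_1}{se(\hat\beta_1)} = 0.83 < t_{\alpha/2,n-2} = 1.7$
Fail to reject H0
c. Coefficient of determination: $R^2 = \frac{\hat\beta_1 S_{xy}}{SS_T} = 0.024465$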
1164.322
$$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 175.4667$$
$$S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 5.240778$$
d. Slope: $\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{5.240778}{175.4667} = 0.0298$
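The regression quantities in Ex 19 follow from the five sums alone. The sketch below recomputes parts a to c from the sums given in the text and the slope in part d from the reported S_xy; α = 0.10 is an assumption inferred from the critical value 1.7, and the variable names are illustrative.

```r
n   <- 30
sx  <- 404;   sxx_raw <- 5616       # sum of x, sum of x^2 (education)
sy  <- 16713; syy_raw <- 11623083   # sum of y, sum of y^2 (wage)
sxy_raw <- 228219                   # sum of x * y

Sxx <- sxx_raw - sx^2 / n
Sxy <- sxy_raw - sx * sy / n
SST <- syy_raw - sy^2 / n

b1     <- Sxy / Sxx                 # a. slope
SSE    <- SST - b1 * Sxy
sigma2 <- SSE / (n - 2)             # estimate of the error variance
se_b1  <- sqrt(sigma2 / Sxx)        # a. standard error of the slope
t0     <- b1 / se_b1                # b. test statistic for H0: beta1 = 0
t_crit <- qt(1 - 0.10 / 2, df = n - 2)   # assumed alpha = 0.10
R2     <- b1 * Sxy / SST            # c. coefficient of determination

b1_d <- 5.240778 / Sxx              # d. slope from the S_xy reported there

print(c(Sxx = Sxx, Sxy = Sxy, b1 = b1, se_b1 = se_b1, t0 = t0, R2 = R2))
```

With the raw subsample, `summary(lm(wage ~ educ, data = yourdata))` reports the same slope, standard error, t value and R².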