Team 10 Final Project
STATISTICS PROJECT-GROUP 10
Dataset: program
Hypotheses:
1. Test scores differ by participation
2. Grades increased after program participation
Summary of Data:
Number of Observations                   32
Median Test Score                        22
Average Test Score                       21.94
Number of participants in program        14
Number of grade increments               11
Of which, number of participants         8
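Let,
a. Sample 1: Sample of students who have not participated in the program
$\bar{x}_1$ is the mean test score, $s_1$ is the sample standard deviation, $n_1$ is the sample size
b. Sample 2: Sample of students who have participated in the program
$\bar{x}_2$ is the mean test score, $s_2$ is the sample standard deviation, $n_2$ is the sample size
Then, with $\mu_1$ and $\mu_2$ the corresponding population mean test scores,
$$H_0: \mu_1 - \mu_2 = 0$$
$$H_1: \mu_1 - \mu_2 \neq 0$$
Test Statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t_{n_1 + n_2 - 2}$$
where,
Pooled variance, $$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$$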
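Calculated Values:
Sample 1 (non-participants):   $\bar{x}_1$ = 21.56   $s_1$ = 4.00   $n_1$ = 18
Sample 2 (participants):       $\bar{x}_2$ = 22.43   $s_2$ = 3.86   $n_2$ = 14
Pooled standard deviation:     $s_p$ = 3.94
Test statistic:                $t_{n_1 + n_2 - 2}$ = -0.62
From the Stata output, the two-tailed p-value is 0.5388, so we cannot reject $H_0$ at the 10%, 5% or 1% level: the data do not show that test scores differ by participation.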
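As a cross-check of the calculated values above, the pooled-variance t statistic can be recomputed from the summary statistics alone. This is a minimal Python sketch, not the procedure used for the report (whose figures come from Stata). The function names `pooled_t` and `t_two_sided_p` are illustrative, the inputs are the unrounded means and standard deviations from the Stata output, and the p-value is approximated by Simpson integration of the t density rather than taken from a statistics library.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic under the equal-variance assumption."""
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, math.sqrt(sp2)

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * (integral of the t density over [0, |t|])."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = abs(t) / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 1 - 2 * (total * h / 3)

# Unrounded summary statistics taken from the Stata output in this report
t, df, sp = pooled_t(21.55556, 4.003267, 18, 22.42857, 3.857346, 14)
p = t_two_sided_p(t, df)
print(f"t = {t:.4f}, df = {df}, s_p = {sp:.2f}, two-sided p = {p:.4f}")
```

The printed values can be compared with the reported t = -0.62, s_p = 3.94 and p = 0.5388.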
Madhuresh Kumar | Sushil Kumar A C | Naveen Kumar Singh | Karthik Cheboli | Yashni Nagarajan | Mintu Kumar Singh | Jitendra Sokal
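Here, we have used the t-test with equal population variances because of the result of the following hypothesis test:
$$H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1$$
$$H_1: \frac{\sigma_1^2}{\sigma_2^2} \neq 1$$
Test Statistic:
$$F = \frac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2} \sim F(n_1 - 1,\ n_2 - 1)$$
$F = 1.0771$ (when $\sigma_1^2 = \sigma_2^2$)
With 17 and 13 degrees of freedom, the two-sided p-value from the Stata output is 0.9072, so we cannot reject the hypothesis of equal population variances.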
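The variance-ratio test above can be reproduced from the two sample standard deviations. The following Python sketch is illustrative only: the helper names `f_pdf`, `f_cdf` and `variance_ratio_test` are assumptions, and the F distribution function is approximated by Simpson integration of the F density, which is adequate here but assumes the numerator degrees of freedom exceed 2.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    if x <= 0:
        return 0.0  # correct limit at 0 only when d1 > 2
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = (0.5 * d1 * math.log(d1 / d2) + (d1 / 2 - 1) * math.log(x)
               - 0.5 * (d1 + d2) * math.log1p(d1 * x / d2) - log_beta)
    return math.exp(log_pdf)

def f_cdf(f, d1, d2, steps=2000):
    """Pr(F <= f) by Simpson's rule on [0, f]; steps must be even."""
    h = f / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(f, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return total * h / 3

def variance_ratio_test(sd1, n1, sd2, n2):
    """F statistic s1^2 / s2^2 and its two-sided p-value."""
    f = sd1 ** 2 / sd2 ** 2
    d1, d2 = n1 - 1, n2 - 1
    cdf = f_cdf(f, d1, d2)
    return f, (d1, d2), 2 * min(cdf, 1 - cdf)

# Standard deviations and sample sizes from the Stata output in this report
f_stat, dfs, p_two_sided = variance_ratio_test(4.003267, 18, 3.857346, 14)
print(f"F = {f_stat:.4f}, df = {dfs}, two-sided p = {p_two_sided:.4f}")
```

The result can be compared with the reported F = 1.0771 on (17, 13) degrees of freedom and 2*Pr(F > f) = 0.9072.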
Challenges
Here, we have assumed that the population follows a normal distribution. From the above k-density plot, we can see that the sample is also approximately normal.
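For the second hypothesis, let $p_1$ and $p_2$ be the population proportions of students whose grades increased among non-participants and participants, with sample proportions $\hat{p}_1$, $\hat{p}_2$ and sample sizes $n_1$, $n_2$. Then,
$$H_0: p_1 - p_2 = 0$$
$$H_1: p_1 - p_2 < 0$$
Test Statistic:
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\frac{\hat{p}(1 - \hat{p})}{n_1} + \frac{\hat{p}(1 - \hat{p})}{n_2}}} \sim N(0, 1)$$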
Since the hypothesis is that the population proportions are equal, the best estimate of overall population
proportion is a combined proportion of success, given by
$$\hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$$
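Calculated Values
Sample 1 (non-participants):   $\hat{p}_1$ = 0.167   $n_1$ = 18
Sample 2 (participants):       $\hat{p}_2$ = 0.571   $n_2$ = 14
Combined proportion:           $\hat{p}$ = 0.344
Test statistic:                $z$ = -2.391
Left-tailed p-value (from the Stata output): 0.0084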
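The z-test for two proportions above can be verified with a few lines of Python. This is a sketch under stated assumptions: the function name `two_proportion_z` is illustrative, and the counts of grade increases (3 of 18 non-participants, 8 of 14 participants) are inferred from the summary table (11 increments, 8 of them among participants) rather than read from the raw data. The normal distribution function is computed with `math.erfc`.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z-test for proportions with a left-tailed p-value."""
    p1, p2 = x1 / n1, x2 / n2
    # Combined proportion of successes, used for the standard error under H0
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Standard normal CDF: Phi(z) = 0.5 * erfc(-z / sqrt(2))
    p_left = 0.5 * math.erfc(-z / math.sqrt(2))
    return z, p_pool, se, p_left

# Counts inferred from the summary table: 3/18 non-participants, 8/14 participants
z, p_pool, se, p_left = two_proportion_z(3, 18, 8, 14)
print(f"z = {z:.4f}, pooled p = {p_pool:.3f}, se = {se:.4f}, Pr(Z < z) = {p_left:.4f}")
```

The output can be compared with the reported z = -2.391, combined proportion 0.344 and Pr(Z < z) = 0.0084.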
Challenges
Here we have assumed normality even though the sample sizes are not very large (18 and 14). This may distort the standard error used in the z-value, so the p-value may also be inaccurate to some degree. However, the conclusion will remain the same.
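To check this, we can treat the 0/1 grade-increase indicator as coming from a normal population instead of a Bernoulli population and use the t-test. Test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t_{n_1 + n_2 - 2}$$
Calculated Values
Sample 1 (non-participants):   $\bar{x}_1$ = 0.167   $s_1$ = 0.383   $n_1$ = 18
Sample 2 (participants):       $\bar{x}_2$ = 0.571   $s_2$ = 0.513   $n_2$ = 14
Pooled standard deviation:     $s_p$ = 0.444
Test statistic:                $t_{n_1 + n_2 - 2}$ = -2.55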
t-test p-value (left-tailed test): p = 0.00796
The p-value is lower than each of the required significance levels (0.01, 0.05, 0.1). This means we can be more than 99% confident that grades have increased after program participation.
As we can observe, the t-test and the z-test give similar p-values. Therefore, we can conclude at the 90%, 95% and 99% confidence levels that grades have increased after program participation.
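The t-test on the 0/1 grade-increase indicator described above can be checked by rebuilding the indicator samples and applying the pooled t formula. This is a rough Python sketch rather than the report's procedure: the lists `nonpart` and `part` are reconstructed from the counts implied by the summary table (3 of 18 and 8 of 14), the function names are illustrative, and the left-tail probability is approximated by Simpson integration of the t density.

```python
import math
from statistics import mean, stdev

# 0/1 indicators rebuilt from the counts implied by the report (an assumption):
# 3 of 18 non-participants and 8 of 14 participants had a grade increase.
nonpart = [1] * 3 + [0] * 15
part = [1] * 8 + [0] * 6

def pooled_t_from_samples(a, b):
    """Pooled-variance t statistic computed directly from two samples."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * stdev(a) ** 2 + (n2 - 1) * stdev(b) ** 2) / df
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df, math.sqrt(sp2)

def t_left_tail_p(t, df, steps=2000):
    """Pr(T <= t) from the Simpson integral of the t density over [0, |t|]."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = abs(t) / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 - area if t < 0 else 0.5 + area

t, df, sp = pooled_t_from_samples(nonpart, part)
p_left = t_left_tail_p(t, df)
print(f"s1 = {stdev(nonpart):.3f}, s2 = {stdev(part):.3f}, s_p = {sp:.3f}")
print(f"t = {t:.3f}, df = {df}, Pr(T < t) = {p_left:.5f}")
```

The values can be compared with the reported standard deviations (0.383, 0.513, pooled 0.444), t = -2.55 and left-tail p = 0.00796.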
Variance ratio test (95% confidence intervals)
Group      Obs   Mean       Std. Err.   Std. Dev.   [95% Conf. Interval]
nonpar~e   18    21.55556   .943579     4.003267    19.56478   23.54633
part_t~e   14    22.42857   1.030919    3.857346    20.20141   24.65574
combined   32    21.9375    .6896959    3.901509    20.53086   23.34414
f = 1.0771
degrees of freedom = 17, 13
Ha: ratio != 1
2*Pr(F > f) = 0.9072
Two-sample t test with equal variances (95% confidence intervals)
Group      Obs   Mean       Std. Err.   Std. Dev.   [95% Conf. Interval]
nonpar~e   18    21.55556   .943579     4.003267    19.56478   23.54633
part_t~e   14    22.42857   1.030919    3.857346    20.20141   24.65574
combined   32    21.9375    .6896959    3.901509    20.53086   23.34414
diff             -.8730159  1.404261                -3.7409    1.994868
t = -0.6217
degrees of freedom = 30
Ha: diff != 0
Pr(|T| > |t|) = 0.5388
Variance ratio test (90% confidence intervals)
Group      Obs   Mean       Std. Err.   Std. Dev.   [90% Conf. Interval]
nonpar~e   18    21.55556   .943579     4.003267    19.9141    23.19701
part_t~e   14    22.42857   1.030919    3.857346    20.60288   24.25426
combined   32    21.9375    .6896959    3.901509    20.76811   23.10689
f = 1.0771
degrees of freedom = 17, 13
Ha: ratio != 1
2*Pr(F > f) = 0.9072
Two-sample t test with equal variances (90% confidence intervals)
Group      Obs   Mean       Std. Err.   Std. Dev.   [90% Conf. Interval]
nonpar~e   18    21.55556   .943579     4.003267    19.9141    23.19701
part_t~e   14    22.42857   1.030919    3.857346    20.60288   24.25426
combined   32    21.9375    .6896959    3.901509    20.76811   23.10689
diff             -.8730159  1.404261                -3.256413  1.510382
t = -0.6217
degrees of freedom = 30
Ha: diff != 0
Pr(|T| > |t|) = 0.5388
Variance ratio test (99% confidence intervals)
Group      Obs   Mean       Std. Err.   Std. Dev.   [99% Conf. Interval]
nonpar~e   18    21.55556   .943579     4.003267    18.82085   24.29026
part_t~e   14    22.42857   1.030919    3.857346    19.32316   25.53398
combined   32    21.9375    .6896959    3.901509    20.04495   23.83005
f = 1.0771
degrees of freedom = 17, 13
Ha: ratio != 1
2*Pr(F > f) = 0.9072
Two-sample t test with equal variances (99% confidence intervals)
Group      Obs   Mean       Std. Err.   Std. Dev.   [99% Conf. Interval]
nonpar~e   18    21.55556   .943579     4.003267    18.82085   24.29026
part_t~e   14    22.42857   1.030919    3.857346    19.32316   25.53398
combined   32    21.9375    .6896959    3.901509    20.04495   23.83005
diff             -.8730159  1.404261                -4.734728  2.988696
t = -0.6217
degrees of freedom = 30
Ha: diff != 0
Pr(|T| > |t|) = 0.5388
Two-sample test of proportions (95% confidence intervals)
Variable      Obs   Mean        Std. Err.   z       P>|z|   [95% Conf. Interval]
nonpart_inc   18    .1666667    .087841                                 .338832
part_inc      14    .5714286    .13226                                  .8306534
diff                -.4047619   .1587727                    -.7159506   -.0935732
under Ho:                       .1692508    -2.39   0.017
z = -2.3915
Ho: diff = 0
Ha: diff < 0     Pr(Z < z) = 0.0084
Ha: diff != 0    Pr(|Z| > |z|) = 0.0168
Two-sample test of proportions (90% confidence intervals)
Variable      Obs   Mean        Std. Err.   z       P>|z|   [90% Conf. Interval]
nonpart_inc   18    .1666667    .087841                                 .3111523
part_inc      14    .5714286    .13226                                  .7889769
diff                -.4047619   .1587727                    -.6659197   -.1436041
under Ho:                       .1692508    -2.39   0.017
z = -2.3915
Ha: diff != 0    Pr(|Z| > |z|) = 0.0168
Two-sample test of proportions (99% confidence intervals)
Variable      Obs   Mean        Std. Err.   z       P>|z|   [99% Conf. Interval]
nonpart_inc   18    .1666667    .087841                                 .3929302
part_inc      14    .5714286    .13226                                  .9121078
diff                -.4047619   .1587727                    -.8137332   .0042094
under Ho:                       .1692508    -2.39   0.017
z = -2.3915
Ha: diff != 0    Pr(|Z| > |z|) = 0.0168