Team 10 Final Project

Madhuresh Kumar | Sushil Kumar A C | Naveen Kumar Singh | Karthik Cheboli | Yashni Nagarajan | Mintu Kumar Singh | Jitendra Sokal
STATISTICS PROJECT-GROUP 10
Dataset: program
Hypotheses:
1. Test scores differ by participation
2. Grades increased after program participation
Summary of Data:
Number of Observations:              32
Median Test Score:                   22
Average Test Score:                  21.94
Number of participants in program:   14
Number of grade increments:          11
Of which, number of participants:    8
Data Source: Spector and Mazzeo (1980).
Hypothesis 1: Test Scores differ by participation

Definitions:
1. Population parameters:
a. $\mu_1$ is the mean test score of students who have not participated in the program
b. $\mu_2$ is the mean test score of students who have participated in the program
2. Sample statistics:
a. Sample 1: Sample of students who have not participated in the program
$\bar{x}_1$ is the mean test score, $s_1$ is the sample standard deviation, $n_1$ is the sample size
b. Sample 2: Sample of students who have participated in the program
$\bar{x}_2$ is the mean test score, $s_2$ is the sample standard deviation, $n_2$ is the sample size
Then,
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \neq 0$
Test Statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \sim t_{n_1 + n_2 - 2}$$
where,
Pooled variance, $s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
Calculated Values:
$\bar{x}_1$ = 21.56, $s_1$ = 4.00, $n_1$ = 18
$\bar{x}_2$ = 22.43, $s_2$ = 3.86, $n_2$ = 14
$s_p$ = 3.94
$t$ = -0.62
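The pooled t statistic can be reproduced from the summary statistics alone. The following is a minimal Python sketch, not the method the team used (they ran Stata's ttest); the function name pooled_t is a hypothetical helper, and the inputs are the unrounded means and standard deviations reported in the Stata log.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (pooled SD, t statistic, degrees of freedom) for a
    two-sample t-test that assumes equal population variances."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference in means under equal variances.
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return math.sqrt(pooled_var), (mean1 - mean2) / std_err, df

# Non-participants first, participants second (values from the Stata log).
s_p, t_stat, df = pooled_t(21.55556, 4.003267, 18, 22.42857, 3.857346, 14)
print(round(s_p, 2), round(t_stat, 2), df)  # 3.94 -0.62 30
```

The two-tailed p-value additionally needs the CDF of the t distribution with 30 degrees of freedom, which the Python standard library does not provide, so it is left to the Stata output.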
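Here, we have used the t-test with equal population variances because of the result of the following hypothesis test:
$H_0: \sigma_1^2 / \sigma_2^2 = 1$
$H_1: \sigma_1^2 / \sigma_2^2 \neq 1$
Test Statistic:
$$F = \frac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2} \sim F(n_1 - 1, n_2 - 1)$$
$F = s_1^2 / s_2^2 = 1.0771$ (when $\sigma_1^2 = \sigma_2^2$)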
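Under the null hypothesis the F statistic reduces to the ratio of the two sample variances. A short Python sketch of that arithmetic follows; variance_ratio is a hypothetical helper name, and the standard deviations are taken from the Stata log.

```python
def variance_ratio(sd1, n1, sd2, n2):
    """Return the F statistic s1^2 / s2^2 and its (numerator, denominator)
    degrees of freedom for a test of equal variances."""
    f_stat = sd1 ** 2 / sd2 ** 2
    return f_stat, (n1 - 1, n2 - 1)

# Non-participants in the numerator, participants in the denominator.
f_stat, dfs = variance_ratio(4.003267, 18, 3.857346, 14)
print(round(f_stat, 4), dfs)  # 1.0771 (17, 13)
```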
Test Results and Interpretation

F-test:
p-value: Two-tailed test, p = 0.9072
The p-value is greater than each significance level considered (0.01, 0.05, 0.10).
Therefore, we cannot reject the null hypothesis that the ratio of the population variances is 1.
The data give no evidence that the variance of test scores differs between those who have and have not participated in the program, so we use the t-test with equal variances.
t-test:
p-value: Two-tailed test, p = 0.5388
The p-value is greater than each significance level considered (0.01, 0.05, 0.10).
Therefore, we cannot reject the null hypothesis: the data do not show that test scores differ by participation.
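The decision rule applied to both tests is the same: reject the null hypothesis only when the p-value falls below the chosen significance level. A small Python sketch of that rule, with a hypothetical helper name (decisions) and the p-values reported above:

```python
def decisions(p_value, alphas=(0.01, 0.05, 0.10)):
    """Map each significance level to True if the null hypothesis
    is rejected at that level (p-value below alpha)."""
    return {alpha: p_value < alpha for alpha in alphas}

f_test = decisions(0.9072)  # variance ratio test, two-tailed
t_test = decisions(0.5388)  # pooled t-test, two-tailed
print(f_test)
print(t_test)
```

Neither test rejects at any of the three levels, which matches the interpretation given above.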
Challenges
Here, we have assumed that the population follows a normal distribution. From the kernel density plot of the test scores (kdensity with a normal overlay), we can see that the sample is also approximately normal.
Hypothesis 2: Grades increased after program participation

Definitions:
1. Population parameters:
a. $p_1$ is the proportion of students whose grades have increased despite not having participated in the program
b. $p_2$ is the proportion of students whose grades have increased after participating in the program
2. Sample statistics:
a. Sample 1: Sample of students who have not participated in the program
$\hat{p}_1$ is the proportion whose grades increased, $n_1$ is the sample size
b. Sample 2: Sample of students who have participated in the program
$\hat{p}_2$ is the proportion whose grades increased, $n_2$ is the sample size
Then,
$H_0: p_1 - p_2 = 0$
$H_1: p_1 - p_2 < 0$
Test Statistic:
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p}) \frac{1}{n_1} + \hat{p}(1 - \hat{p}) \frac{1}{n_2}}} \sim N(0, 1)$$
Since the null hypothesis is that the population proportions are equal, the best estimate of the overall population proportion is the combined proportion of successes, given by
$$\hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$$
Calculated Values
$\hat{p}_1$ = 0.167, $n_1$ = 18
$\hat{p}_2$ = 0.571, $n_2$ = 14
$\hat{p}$ = 0.344
$z$ = -2.391
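The z statistic and its left-tail p-value can be checked with the Python standard library. This is a sketch only: two_proportion_z is a hypothetical helper, and the success counts (3 of 18 non-participants, 8 of 14 participants) are derived from the summary table, which reports 11 grade increments of which 8 were participants.

```python
import math
from statistics import NormalDist

def two_proportion_z(successes1, n1, successes2, n2):
    """Return (pooled proportion, z statistic, left-tail p-value) for a
    two-sample test of proportions with a pooled standard error."""
    p1, p2 = successes1 / n1, successes2 / n2
    # Combined proportion of successes, used because H0 says p1 = p2.
    pooled = (successes1 + successes2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_err
    # Left-tail p-value, matching H1: p1 - p2 < 0.
    return pooled, z, NormalDist().cdf(z)

pooled, z, p_left = two_proportion_z(3, 18, 8, 14)
print(round(pooled, 3), round(z, 3), round(p_left, 4))  # 0.344 -2.391 0.0084
```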
Test Results and Interpretation

z-test
p-value: Left-tailed test, p = 0.0084
The p-value is lower than each significance level considered (0.01, 0.05, 0.10).
We therefore reject the null hypothesis at each of these levels. This means that, even at the 1% significance level (99% confidence), we conclude that grades have increased after program participation.
Challenges
Here we have assumed normality of the test statistic even though the sample sizes are not very large (18 and 14).
This may distort the standard error used in the z-value, so the p-value may also be inaccurate to some degree. However, the conclusion remains the same.
To check this, we can instead assume that the population follows a Normal distribution rather than a Bernoulli distribution.
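We can use the t-test here, test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \sim t_{n_1 + n_2 - 2}$$
where, Pooled variance, $s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
Here $\bar{x}_i$ and $s_i$ are the sample mean and sample standard deviation of the 0/1 grade-increase indicator in sample $i$, so $\bar{x}_i$ equals the sample proportion $\hat{p}_i$.

Calculated Values:
$\bar{x}_1$ = 0.167, $s_1$ = 0.383, $n_1$ = 18
$\bar{x}_2$ = 0.571, $s_2$ = 0.513, $n_2$ = 14
$s_p$ = 0.444
$t$ = -2.55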
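The t statistic for the 0/1 grade-increase indicator can be rebuilt from the counts. The sketch below assumes 3 of 18 non-participants and 8 of 14 participants improved (derived from the summary table); indicator_sd and pooled_t_from_counts are hypothetical helper names, and the sample standard deviation uses the usual n - 1 divisor.

```python
import math

def indicator_sd(successes, n):
    """Sample standard deviation (n - 1 divisor) of a 0/1 variable."""
    p = successes / n
    return math.sqrt(n * p * (1 - p) / (n - 1))

def pooled_t_from_counts(successes1, n1, successes2, n2):
    """Return (s1, s2, pooled SD, t) treating the indicators as numeric."""
    s1, s2 = indicator_sd(successes1, n1), indicator_sd(successes2, n2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (successes1 / n1 - successes2 / n2) / std_err
    return s1, s2, math.sqrt(pooled_var), t

s1, s2, s_p, t_stat = pooled_t_from_counts(3, 18, 8, 14)
print(round(s1, 3), round(s_p, 3), round(t_stat, 2))  # 0.383 0.445 -2.56
```

To three decimals the pooled standard deviation is 0.445, and the report's 0.444 is the same value truncated (0.4445). The statistic is about -2.555, which the report gives as -2.55; the only difference from the z statistic is the standard error, which here comes from the pooled sample variance rather than from $\hat{p}(1 - \hat{p})$.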
t-test
p-value: Left-tailed test, p = 0.00796
The p-value is lower than each significance level considered (0.01, 0.05, 0.10). This means that, even at the 1% significance level (99% confidence), we conclude that grades have increased after program participation.
As we can observe, the t-test and the z-test give very similar p-values. Therefore, we conclude at the 90%, 95% and 99% confidence levels that grades have increased after program participation.
RBI stats project 1
Sunday January 8 15:48:30 2017
Stata 13.1 (Small Stata), Statistics/Data Analysis
User: Naveen Kumar Singh
Project: Team 10
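. use "C:\Users\Naveen\Desktop\4thweek\stats assignment\team 10.dta"

. sdtest nonpart_tuce == part_tuce

Variance ratio test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
nonpar~e |      18    21.55556     .943579    4.003267    19.56478    23.54633
part_t~e |      14    22.42857    1.030919    3.857346    20.20141    24.65574
---------+--------------------------------------------------------------------
combined |      32     21.9375    .6896959    3.901509    20.53086    23.34414
------------------------------------------------------------------------------
    ratio = sd(nonpart_tuce) / sd(part_tuce)                      f =   1.0771
Ho: ratio = 1                                    degrees of freedom =   17, 13

    Ha: ratio < 1               Ha: ratio != 1                 Ha: ratio > 1
  Pr(F < f) = 0.5464         2*Pr(F > f) = 0.9072           Pr(F > f) = 0.4536
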
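. ttest nonpart_tuce == part_tuce, unpaired

Two-sample t test with equal variances
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
nonpar~e |      18    21.55556     .943579    4.003267    19.56478    23.54633
part_t~e |      14    22.42857    1.030919    3.857346    20.20141    24.65574
---------+--------------------------------------------------------------------
combined |      32     21.9375    .6896959    3.901509    20.53086    23.34414
---------+--------------------------------------------------------------------
    diff |           -.8730159    1.404261                 -3.7409    1.994868
------------------------------------------------------------------------------
    diff = mean(nonpart_tuce) - mean(part_tuce)                   t =  -0.6217
Ho: diff = 0                                     degrees of freedom =       30

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.2694         Pr(|T| > |t|) = 0.5388          Pr(T > t) = 0.7306
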
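. sdtest nonpart_tuce == part_tuce, level(90)

Variance ratio test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [90% Conf. Interval]
---------+--------------------------------------------------------------------
nonpar~e |      18    21.55556     .943579    4.003267     19.9141    23.19701
part_t~e |      14    22.42857    1.030919    3.857346    20.60288    24.25426
---------+--------------------------------------------------------------------
combined |      32     21.9375    .6896959    3.901509    20.76811    23.10689
------------------------------------------------------------------------------
    ratio = sd(nonpart_tuce) / sd(part_tuce)                      f =   1.0771
Ho: ratio = 1                                    degrees of freedom =   17, 13

    Ha: ratio < 1               Ha: ratio != 1                 Ha: ratio > 1
  Pr(F < f) = 0.5464         2*Pr(F > f) = 0.9072           Pr(F > f) = 0.4536
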
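. ttest nonpart_tuce == part_tuce, unpaired level(90)

Two-sample t test with equal variances
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [90% Conf. Interval]
---------+--------------------------------------------------------------------
nonpar~e |      18    21.55556     .943579    4.003267     19.9141    23.19701
part_t~e |      14    22.42857    1.030919    3.857346    20.60288    24.25426
---------+--------------------------------------------------------------------
combined |      32     21.9375    .6896959    3.901509    20.76811    23.10689
---------+--------------------------------------------------------------------
    diff |           -.8730159    1.404261               -3.256413    1.510382
------------------------------------------------------------------------------
    diff = mean(nonpart_tuce) - mean(part_tuce)                   t =  -0.6217
Ho: diff = 0                                     degrees of freedom =       30

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.2694         Pr(|T| > |t|) = 0.5388          Pr(T > t) = 0.7306
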
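. sdtest nonpart_tuce == part_tuce, level(99)

Variance ratio test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [99% Conf. Interval]
---------+--------------------------------------------------------------------
nonpar~e |      18    21.55556     .943579    4.003267    18.82085    24.29026
part_t~e |      14    22.42857    1.030919    3.857346    19.32316    25.53398
---------+--------------------------------------------------------------------
combined |      32     21.9375    .6896959    3.901509    20.04495    23.83005
------------------------------------------------------------------------------
    ratio = sd(nonpart_tuce) / sd(part_tuce)                      f =   1.0771
Ho: ratio = 1                                    degrees of freedom =   17, 13

    Ha: ratio < 1               Ha: ratio != 1                 Ha: ratio > 1
  Pr(F < f) = 0.5464         2*Pr(F > f) = 0.9072           Pr(F > f) = 0.4536
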
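. ttest nonpart_tuce == part_tuce, unpaired level(99)

Two-sample t test with equal variances
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [99% Conf. Interval]
---------+--------------------------------------------------------------------
nonpar~e |      18    21.55556     .943579    4.003267    18.82085    24.29026
part_t~e |      14    22.42857    1.030919    3.857346    19.32316    25.53398
---------+--------------------------------------------------------------------
combined |      32     21.9375    .6896959    3.901509    20.04495    23.83005
---------+--------------------------------------------------------------------
    diff |           -.8730159    1.404261               -4.734728    2.988696
------------------------------------------------------------------------------
    diff = mean(nonpart_tuce) - mean(part_tuce)                   t =  -0.6217
Ho: diff = 0                                     degrees of freedom =       30

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.2694         Pr(|T| > |t|) = 0.5388          Pr(T > t) = 0.7306
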
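. prtest nonpart_inc == part_inc

Two-sample test of proportions           nonpart_inc: Number of obs =       18
                                            part_inc: Number of obs =       14
------------------------------------------------------------------------------
    Variable |       Mean   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 nonpart_inc |   .1666667    .087841                     -.0054986     .338832
    part_inc |   .5714286     .13226                      .3122037    .8306534
-------------+----------------------------------------------------------------
        diff |  -.4047619   .1587727                     -.7159506   -.0935732
             |  under Ho:   .1692508    -2.39   0.017
------------------------------------------------------------------------------
        diff = prop(nonpart_inc) - prop(part_inc)                 z =  -2.3915
    Ho: diff = 0

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(Z < z) = 0.0084         Pr(|Z| > |z|) = 0.0168          Pr(Z > z) = 0.9916
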
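The prtest output reports two standard errors for the difference: .1587727 on the diff row and .1692508 "under Ho". The first appears consistent with an unpooled standard error used for the confidence interval, the second with the pooled standard error used for the z statistic. The Python sketch below illustrates the unpooled interval under that assumption; diff_ci is a hypothetical helper, and the counts 3 of 18 and 8 of 14 are derived from the report's summary table.

```python
import math
from statistics import NormalDist

def diff_ci(successes1, n1, successes2, n2, level=0.95):
    """Return (unpooled SE, lower, upper) of a normal-approximation
    confidence interval for the difference in two proportions."""
    p1, p2 = successes1 / n1, successes2 / n2
    # Unpooled SE: each sample contributes its own p(1 - p) / n.
    std_err = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p1 - p2
    return std_err, diff - z_crit * std_err, diff + z_crit * std_err

se, lower, upper = diff_ci(3, 18, 8, 14)
print(round(se, 4), round(lower, 4), round(upper, 4))  # 0.1588 -0.716 -0.0936
```

Passing level=0.90 or level=0.99 gives the intervals for the other two runs of the command.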
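. prtest nonpart_inc == part_inc, level(90)

Two-sample test of proportions           nonpart_inc: Number of obs =       18
                                            part_inc: Number of obs =       14
------------------------------------------------------------------------------
    Variable |       Mean   Std. Err.      z    P>|z|     [90% Conf. Interval]
-------------+----------------------------------------------------------------
 nonpart_inc |   .1666667    .087841                       .022181    .3111523
    part_inc |   .5714286     .13226                      .3538802    .7889769
-------------+----------------------------------------------------------------
        diff |  -.4047619   .1587727                     -.6659197   -.1436041
             |  under Ho:   .1692508    -2.39   0.017
------------------------------------------------------------------------------
        diff = prop(nonpart_inc) - prop(part_inc)                 z =  -2.3915
    Ho: diff = 0

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(Z < z) = 0.0084         Pr(|Z| > |z|) = 0.0168          Pr(Z > z) = 0.9916
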
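. prtest nonpart_inc == part_inc, level(99)

Two-sample test of proportions           nonpart_inc: Number of obs =       18
                                            part_inc: Number of obs =       14
------------------------------------------------------------------------------
    Variable |       Mean   Std. Err.      z    P>|z|     [99% Conf. Interval]
-------------+----------------------------------------------------------------
 nonpart_inc |   .1666667    .087841                     -.0595969    .3929302
    part_inc |   .5714286     .13226                      .2307494    .9121078
-------------+----------------------------------------------------------------
        diff |  -.4047619   .1587727                     -.8137332    .0042094
             |  under Ho:   .1692508    -2.39   0.017
------------------------------------------------------------------------------
        diff = prop(nonpart_inc) - prop(part_inc)                 z =  -2.3915
    Ho: diff = 0

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(Z < z) = 0.0084         Pr(|Z| > |z|) = 0.0168          Pr(Z > z) = 0.9916

. twoway (function y=tden(31,x), range(-5 -0.6247) color(ltblue) recast(area)) (function y=tden
> olor(ltblue) recast(area)) (function y=tden(31,x), range(-5 5)), legend(off) plotregion(margi
> title("t") text(0 -0.6247 "-0.6247", place(s)) text(0 0.6247 "0.6247", place(s)) title("Two-t
> 31), alpha=0.05")
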
. twoway (function y=Fden(13,17,x), range(0.9287 3) color(ltblue) recast(area)) (function y=Fde
> legend(off) plotregion(margin(zero)) ytitle("f(t)") xtitle("t") text(0 0.9287 "0.9287", plac
> ection region" "F(13,17), alpha=0.05")

. twoway (function y=normalden(x), range(-5 -2.3915) color(ltblue) recast(area)) (function y=no
> legend(off) plotregion(margin(zero)) ytitle("f(z)") xtitle("z") text(0 -2.3915 "-2.3915", pl
> ejection region" "z, alpha=0.05")

. kdensity tuce, kernel(epanechnikov) normal
(n() set to 32)

. graph save Graph "C:\Users\Naveen\Desktop\4thweek\stats assignment\Graph5.gph", replace
(file C:\Users\Naveen\Desktop\4thweek\stats assignment\Graph5.gph saved)

. save "C:\Users\Naveen\Desktop\4thweek\stats assignment\team 10.dta", replace
file C:\Users\Naveen\Desktop\4thweek\stats assignment\team 10.dta saved
