Exam 3
###1.
Because sex and race have no missing values, the only two variables of concern are sei and
ann.income. I subtracted the 252 cases with a missing value on either variable from the 624 total
cases, which leaves 372 cases with both sei and ann.income.
d=read.csv(file.choose())
> sum(is.na(d$sei)|is.na(d$ann.income))
[1] 252
> 624-252
[1] 372
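The same count can be obtained without the manual subtraction by using complete.cases(). A minimal sketch, assuming a small made-up data frame (`toy` is hypothetical; only its column names mirror the exam file):

```r
# Hypothetical data, not the exam file; only the column names are borrowed.
toy <- data.frame(
  sei        = c(43.1, NA, 37.7, 62.2, NA, 29.3),
  ann.income = c("LOW", "MED", NA, "HIGH", NA, "LOW")
)

# Rows missing at least one of the two variables (the quantity subtracted above).
n_missing <- sum(is.na(toy$sei) | is.na(toy$ann.income))

# Rows with both variables present, counted directly.
keep <- complete.cases(toy[, c("sei", "ann.income")])
n_complete <- sum(keep)

# Subset used for later analysis (plays the role of subd in the exam).
sub_toy <- toy[keep, ]
```

Counting the complete rows directly also gives the logical index needed to build the analysis subset in one step.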
###2.
> summary(subd$sei)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.20 29.28 37.70 43.46 62.20 92.30
> describe(subd$sei)
vars n mean sd median trimmed mad min max range skew kurtosis
1 1 372 43.46 19.03 37.7 42.86 18.24 0.2 92.3 92.1 0.27 -0.64
se
1 0.99
The absolute values of both skewness (0.27) and kurtosis (-0.64) are less than 1. There are no
extreme outliers. There is some evidence of bimodality and perhaps some skewness, but the
skewness value is well within ±1. Many points fall outside the confidence band in the QQ plot,
but they are fairly close to it.
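The |1| rule of thumb can be checked by hand. The sketch below computes moment-based skewness and excess kurtosis with the sample SD in the denominator; I believe this corresponds to the default `type = 3` in psych::describe(), but that correspondence is an assumption, and `skew_kurt` is a hypothetical helper name.

```r
# Moment-based skewness and excess kurtosis, using the sample SD (n - 1).
skew_kurt <- function(x) {
  x <- x[!is.na(x)]          # drop missing values
  dev <- x - mean(x)         # deviations from the mean
  s <- sd(x)                 # sample standard deviation
  c(skew     = mean(dev^3) / s^3,
    kurtosis = mean(dev^4) / s^4 - 3)
}

# Toy example: a perfectly symmetric, flat set of values.
res <- skew_kurt(c(1, 2, 3, 4, 5))

# Rule of thumb used in this answer: both absolute values should stay below 1.
within_rule <- all(abs(res) < 1)
```

For this toy vector the skewness is 0 but the kurtosis is about -1.91, so a flat distribution can fail the rule even when it is perfectly symmetric.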
###3.
###4.
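The design is unbalanced because the group sizes are different.
The output below shows that it is also disproportional: the expected count under proportionality
matches the observed count in only one of the five cells checked.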
> test.prop*N
[1] 22.76101
>
> round(test.prop*N) ==table3[1]
F:HIGH:NON-WHITE
FALSE
> round(test.prop*N) ==table3[2]
F:HIGH:WHITE
FALSE
> round(test.prop*N) ==table3[3]
F:LOW:NON-WHITE
FALSE
> round(test.prop*N) ==table3[4]
F:LOW:WHITE
TRUE
> round(test.prop*N) ==table3[5]
F:MED:NON-WHITE
FALSE
> table3[4]
F:LOW:WHITE
23
> round(test.prop*N)
[1] 23
>
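One common way to check proportionality is to compare every observed cell count with the count expected from the marginal totals. The sketch below does this for a hypothetical two-way table; the counts are made up, `is_proportional` is an illustrative helper, and this is not necessarily how test.prop was built in the exam.

```r
# TRUE when every cell equals (row total * column total) / N after rounding.
is_proportional <- function(counts) {
  expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)
  all(round(expected) == counts)
}

# Hypothetical 2 x 3 table in which the second row is exactly twice the first.
counts <- matrix(c(10, 20, 30,
                   20, 40, 60),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(sex = c("F", "M"),
                                 income = c("LOW", "MED", "HIGH")))
prop_ok <- is_proportional(counts)

# Changing a single cell breaks proportionality.
counts2 <- counts
counts2["F", "LOW"] <- 25
prop_broken <- is_proportional(counts2)
```

A design can be unbalanced and still proportional, as in the first table; it becomes disproportional only when the cell counts stop following the marginal totals.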
###5.
I do not reject the null hypothesis that the population variance is the same for all 12 groups.
> max(VAR)
[1] 436.1599
> min(VAR)
[1] 165.2449
> max(VAR)/min(VAR)
[1] 2.639475
>
There does not appear to be a violation of the assumption of homogeneity of variance. The
largest variance is less than 4 times the smallest (ratio = 2.64).
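The ratio can be roughly reproduced from the group standard deviations printed in the describe() output. This is a sketch under two assumptions: the SDs are rounded to two decimals, and only the 11 groups shown in the output are included (F:HIGH:NON-WHITE is not printed).

```r
# Group SDs as printed by describe(); rounded values, 11 of the 12 groups.
sds <- c("F:HIGH:WHITE" = 18.97, "F:LOW:NON-WHITE" = 14.99,
         "F:LOW:WHITE" = 12.85, "F:MED:NON-WHITE" = 15.24,
         "F:MED:WHITE" = 16.78, "M:HIGH:NON-WHITE" = 20.88,
         "M:HIGH:WHITE" = 19.33, "M:LOW:NON-WHITE" = 15.48,
         "M:LOW:WHITE" = 15.75, "M:MED:NON-WHITE" = 14.56,
         "M:MED:WHITE" = 19.29)

vars <- sds^2                        # variances from the rounded SDs
ratio <- max(vars) / min(vars)       # largest-to-smallest variance ratio

largest  <- names(which.max(vars))   # group with the largest variance
smallest <- names(which.min(vars))   # group with the smallest variance

# Rule of thumb used in this answer: a ratio below 4 is acceptable.
ratio_ok <- ratio < 4
```

The ratio from the rounded SDs is about 2.64, which agrees with max(VAR)/min(VAR) above to two decimals.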
> mean(subd$sei)
[1] 43.4621
Cell Means:

                     Male                  Female
Annual Income   White    Non-White    White    Non-White
LOW             35.6489  34.325       32.23    31.3
MED             42.265   44.4         39.8265  34.96
HIGH            54.785   40.9909      49.664   38.2574

Marginal Means:

GENDER          Female     Male
                43.21762   43.7257

RACE            Non-White  White
                37.80353   45.13798

ANN.INCOME      Low        Med        High
                34.09474   40.73176   49.30573

Grand Mean      43.4621
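Because the design is unbalanced, the marginal means reported here are weighted by cell size rather than simple averages of the cell means. The sketch below checks this for the LOW and MED income levels, using the cell sizes and means from the describe() output by group (the MED means are the rounded describe() values, so the match is only approximate).

```r
# LOW income: cell means and cell sizes (M:W, M:NW, F:W, F:NW).
low_means <- c(35.6489, 34.325, 32.23, 31.3)
low_n     <- c(45, 16, 23, 11)

# MED income: means rounded as printed by describe(), same cell order.
med_means <- c(42.27, 44.4, 39.83, 34.96)
med_n     <- c(29, 12, 34, 10)

# Weighted marginal means: each cell mean counts in proportion to its n.
low_marginal <- weighted.mean(low_means, low_n)
med_marginal <- weighted.mean(med_means, med_n)

# Unweighted alternative, for contrast: simple average of the four cell means.
low_unweighted <- mean(low_means)
```

The weighted values (about 34.09 and 40.73) agree with the reported marginal means, while the unweighted LOW mean (about 33.38) does not.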
$`F:HIGH:WHITE`
vars n mean sd median trimmed mad min max range skew kurtosis
1 1 101 49.66 18.97 50.2 49.43 20.31 0.2 92.3 92.1 -0.09 -0.2
se
1 1.89
$`F:LOW:NON-WHITE`
vars n mean sd median trimmed mad min max range skew kurtosis se
1 1 11 31.3 14.99 31.1 31.59 11.27 5.5 54.5 49 -0.25 -1.12 4.52
$`F:LOW:WHITE`
vars n mean sd median trimmed mad min max range skew kurtosis se
1 1 23 32.23 12.85 28.7 31.24 8.3 13.1 62.5 49.4 0.85 -0.02 2.68
$`F:MED:NON-WHITE`
vars n mean sd median trimmed mad min max range skew kurtosis se
1 1 10 34.96 15.24 30.2 32.92 11.86 22 64.2 42.2 0.8 -1.01 4.82
$`F:MED:WHITE`
vars n mean sd median trimmed mad min max range skew kurtosis se
1 1 34 39.83 16.78 33.7 38.59 7.26 0.2 80.9 80.7 0.62 0.15 2.88
$`M:HIGH:NON-WHITE`
vars n mean sd median trimmed mad min max range skew kurtosis
1 1 22 40.99 20.88 37.25 42.23 15.86 0.2 71.6 71.4 -0.16 -0.93
se
1 4.45
$`M:HIGH:WHITE`
vars n mean sd median trimmed mad min max range skew kurtosis
1 1 55 54.79 19.33 62.5 56.47 17.49 0.6 80.9 80.3 -0.78 0.07
se
1 2.61
$`M:LOW:NON-WHITE`
vars n mean sd median trimmed mad min max range skew kurtosis se
1 1 16 34.32 15.48 28.45 33.32 5.19 13.1 69.6 56.5 0.98 -0.2 3.87
$`M:LOW:WHITE`
vars n mean sd median trimmed mad min max range skew kurtosis
1 1 45 35.65 15.75 32.1 34.13 10.53 14.6 70.2 55.6 0.96 -0.22
se
1 2.35
$`M:MED:NON-WHITE`
vars n mean sd median trimmed mad min max range skew kurtosis se
1 1 12 44.4 14.56 40.9 44.25 13.05 17.1 73.2 56.1 0.14 -0.52 4.2
$`M:MED:WHITE`
vars n mean sd median trimmed mad min max range skew kurtosis
1 1 29 42.27 19.29 35.4 41.08 11.42 17.1 84.2 67.1 0.78 -0.67
se
1 3.58
library(car)  # assumed source of qqPlot(), which draws the confidence envelope
grp <- levels(subd$sex.ann.race2)  # the 12 sex x income x race groups
par(mfrow = c(2, 3))  # six panels per page, so the 12 plots fill two pages
for (g in grp) {
  x <- subd$sei[subd$sex.ann.race2 == g]  # SEI scores for this group
  qqPlot(x, ylab = "SEI", main = paste0(g, " (n = ", length(x), ")"))
}
par(mfrow = c(1, 1))  # restore the single-panel layout
The QQ plots are presented above. One of the plots shows some evidence of skewness, but no
skewness value exceeds |1|. There is also little departure from a normal distribution. There
does not appear to be an extreme violation of the normality assumption. There are two instances
in which the kurtosis values exceed |1| (-1.12 and -1.01), but not greatly.
There is some evidence against the assumption of normality, but considering that most groups look
normal and there was no problem with the variances, I would go ahead with the test.
###6.
> summary(sei.mod)
Call:
lm(formula = sei ~ sex * ann.income * race, data = subd, contrasts = list(sex = contr.sum,
ann.income = contr.sum, race = contr.sum))
Residuals:
Min 1Q Median 3Q Max
-54.185 -12.561 -4.085 13.836 42.636
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 39.8878 1.1548 34.540 < 2e-16 ***
sex1 -2.1814 1.1548 -1.889 0.059699 .
ann.income1 6.0366 1.5108 3.996 7.82e-05 ***
ann.income2 -6.5118 1.6636 -3.914 0.000108 ***
race1 -2.5157 1.1548 -2.178 0.030026 *
sex1:ann.income1 0.2177 1.5108 0.144 0.885487
sex1:ann.income2 0.5706 1.6636 0.343 0.731819
sex1:race1 -0.3517 1.1548 -0.305 0.760901
ann.income1:race1 -3.7848 1.5108 -2.505 0.012680 *
ann.income2:race1 1.9521 1.6636 1.173 0.241416
sex1:ann.income1:race1 0.9485 1.5108 0.628 0.530510
sex1:ann.income2:race1 0.4500 1.6636 0.271 0.786917
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> summary(sei.mod)$fstatistic
value numdf dendf
6.159856 11.000000 360.000000
> 1 - pf(summary(sei.mod)$fstatistic[1], summary(sei.mod)$fstatistic[2], summary(sei.mod)
$fstatistic[3])
value
2.732248e-09
Response: sei
Sum Sq Df F value Pr(>F)
(Intercept) 374789 1 1193.0062 < 2.2e-16 ***
sex 1121 1 3.5682 0.05970 .
ann.income 6935 2 11.0379 2.226e-05 ***
race 1491 1 4.7454 0.03003 *
sex:ann.income 68 2 0.1088 0.89695
sex:race 29 1 0.0927 0.76090
ann.income:race 1978 2 3.1478 0.04413 *
sex:ann.income:race 232 2 0.3689 0.69179
Residuals 113096 360
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
The omnibus null hypothesis is rejected [F(11, 360) = 6.16, p < .001]. Because it is rejected, I
will test all the effects in the model.
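The omnibus p-value can be recomputed from the reported F statistic and its degrees of freedom. A minimal sketch: asking pf() for the upper tail directly gives the same number as 1 - pf(...) and avoids subtracting two nearly equal values when p is very small.

```r
# Values copied from summary(sei.mod)$fstatistic.
f_value <- 6.159856
df1 <- 11
df2 <- 360

# Upper-tail probability of the F distribution, requested directly.
p_upper <- pf(f_value, df1, df2, lower.tail = FALSE)

# The form used in the transcript, for comparison.
p_diff <- 1 - pf(f_value, df1, df2)

# Decision at the .001 level.
reject <- p_upper < 0.001
```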
The main effects of RACE and ANN.INCOME are significant; SEX is close to significant (p = .060).
There is also a significant interaction between RACE and ANN.INCOME. The three-way interaction is
not significant. Before performing post hoc tests to explore the nature of the statistically
significant effects, I will compute the eta-squared values to estimate an effect size for each
effect.
The overall R² for the model is .158, while the eta-squared values for the individual effects sum to .089.
The difference (.158 - .089 = .069) means that the variance the effects share in common accounts for
6.9% of the variance in SEI.
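These quantities can be approximately reconstructed from the ANOVA table. The sketch assumes eta-squared is defined as SS_effect / SS_total and recovers SS_total from the residual sum of squares and the R² implied by the omnibus F; the exam's own calculation is not shown, so small rounding differences are expected.

```r
# Type III sums of squares copied from the ANOVA table.
ss <- c("sex" = 1121, "ann.income" = 6935, "race" = 1491,
        "sex:ann.income" = 68, "sex:race" = 29,
        "ann.income:race" = 1978, "sex:ann.income:race" = 232)
ss_resid <- 113096

# R-squared implied by the omnibus F(11, 360) = 6.159856.
f_value <- 6.159856; df1 <- 11; df2 <- 360
r2 <- (f_value * df1) / (f_value * df1 + df2)

# Total SS follows from R-squared = 1 - SS_resid / SS_total.
ss_total <- ss_resid / (1 - r2)

# Eta-squared per effect, and the variance not unique to any single effect.
eta_sq <- ss / ss_total
shared <- r2 - sum(eta_sq)
```

With these inputs R² is about .158, the eta-squared values sum to about .088, and the shared portion is about .070, which matches the figures above to within rounding.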
###7.
There are two levels for RACE (WHITE and NON-WHITE). The main effect size for RACE is .414.
HIGH LOW
LOW 5.5e-11 -
MED 0.00028 0.01370
With the Fisher LSD test (which gave the lowest p-values), all three null hypotheses are rejected because the
p-values are all below .05. There appears to be a statistically significant difference in every pairwise
comparison: low vs. medium, medium vs. high, and low vs. high.
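A table of this shape can be produced with pairwise.t.test(). The sketch below uses simulated data (`toy` is hypothetical, not the exam data) and treats unadjusted pooled-SD comparisons as the Fisher LSD procedure, with a Bonferroni version for contrast.

```r
set.seed(42)

# Simulated scores for three income groups of unequal size (made-up values).
toy <- data.frame(
  income = factor(rep(c("HIGH", "LOW", "MED"), times = c(30, 20, 25))),
  sei    = c(rnorm(30, 50, 18), rnorm(20, 34, 15), rnorm(25, 41, 17))
)

# Fisher LSD: pooled-SD t tests with no multiplicity adjustment.
lsd <- pairwise.t.test(toy$sei, toy$income, p.adjust.method = "none")

# Bonferroni: the same tests with p-values multiplied by the number of comparisons.
bonf <- pairwise.t.test(toy$sei, toy$income, p.adjust.method = "bonferroni")

# Lower-triangular p-value matrices: rows LOW and MED, columns HIGH and LOW.
lsd_p  <- lsd$p.value
bonf_p <- bonf$p.value
```

Because LSD applies no adjustment, its p-values are never larger than the Bonferroni ones, which is why it gives the lowest p-values of the available methods.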
Simple effects at the LOW income level (residual df = 91):
Response: sei
Sum Sq Df F value Pr(>F)
(Intercept) 81345 1 362.9952 <2e-16 ***
race 23 1 0.1035 0.7484
sex 189 1 0.8456 0.3602
race:sex 1 1 0.0032 0.9553
Residuals 20392 91
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Med.aov
Anova Table (Type III tests)
Response: sei
Sum Sq Df F value Pr(>F)
(Intercept) 105436 1 353.8539 <2e-16 ***
race 30 1 0.1013 0.7511
sex 571 1 1.9156 0.1701
race:sex 198 1 0.6654 0.4171
Residuals 24135 81
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> High.aov
Anova Table (Type III tests)
Response: sei
Sum Sq Df F value Pr(>F)
(Intercept) 232778 1 638.2301 < 2.2e-16 ***
race 4381 1 12.0124 0.0006548 ***
sex 426 1 1.1669 0.2814159
race:sex 39 1 0.1078 0.7430351
Residuals 68568 188
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
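The three tables above are simple-effects analyses: the RACE by SEX model is fitted separately within each income level, and RACE is significant only at the HIGH level. A base-R sketch of that idea on simulated data follows; `toy` is hypothetical, and drop1() with sum-to-zero contrasts is used here as a stand-in for the Type III tests, which is an assumption about equivalence rather than the exam's actual code.

```r
set.seed(1)

# Hypothetical unbalanced data: every income x race x sex cell is non-empty.
cells <- expand.grid(income = c("LOW", "MED", "HIGH"),
                     race   = c("WHITE", "NON-WHITE"),
                     sex    = c("F", "M"))
n_per_cell <- c(6, 5, 9, 4, 4, 5, 8, 6, 7, 5, 4, 6)
toy <- cells[rep(seq_len(nrow(cells)), n_per_cell), ]
toy$y <- rnorm(nrow(toy), mean = 40, sd = 15)

# Fit race * sex within each income level and test each term marginally.
simple_effects <- lapply(split(toy, toy$income), function(s) {
  fit <- lm(y ~ race * sex, data = s,
            contrasts = list(race = contr.sum, sex = contr.sum))
  drop1(fit, scope = . ~ ., test = "F")  # one F test per term, each given the others
})

# One table per income level, each with rows for race, sex, and race:sex.
level_names <- names(simple_effects)
```

Each element of `simple_effects` can then be read like the Low, Med, and High tables above.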