
Assignment 1 – Part 1

Instructions:

All of these questions must be completed using SPSS. Questions 2-4 must be completed both
manually and with SPSS. Please create a Word document to answer the questions, show your
manual calculations for Q2-4 (you can use Equation in Microsoft Word to write the formulas,
or other software such as MathType), and copy your SPSS results into it.

Please submit your assignment on Blackboard by clicking on “Assignment 1 - Part 1” in
Assignments. The submission must include: 1. the Word document and 2. the syntax file
(include the syntax used for all 4 questions). Please use “Attach file” to select the two
files you are going to submit, then remember to click the Submit button.

The due date is 10/27/2016, by 11:59pm.

QUESTION 1 (10pts)

Use SPSS: Get the file surgery.sav from Bb. This file contains data from 141 babies who
were referred to a paediatric hospital for surgery. Examine the distributions of three
continuous variables in the data set, that is, birth weight, gestational age and length of stay.

Decide whether a variable is normally distributed and fill in the following tables:

Variable          Check                   Value             Assessment

Birth weight      Mean / Median           2463.9 / 2425.0
                  Skewness                .336              approx. normal
                  Z-value (skewness)      1.633             approx. normal
                  Kurtosis                -.323             approx. normal
                  Z-value (kurtosis)      -0.792            normal
                  Shapiro-Wilk p-value    .056              normal
                  Plots                                     approx. normal
                  Overall decision                          approx. normal

Gestational age   Mean / Median           36.564 / 37.000
                  Skewness                -.590             approx. normal
                  Z-value (skewness)      -2.81             normal
                  Kurtosis                .862              approx. normal
                  Z-value (kurtosis)      2.06              normal
                  Shapiro-Wilk p-value    .000              non-normal
                  Plots                                     approx. normal
                  Overall decision                          approx. normal

Length of stay    Mean / Median           38.05 / 27.00
                  Skewness                3.212             non-normal
                  Z-value (skewness)      15.23             non-normal
                  Kurtosis                12.675            non-normal
                  Z-value (kurtosis)      30.27             non-normal
                  Shapiro-Wilk p-value    .000              non-normal
                  Plots                                     non-normal
                  Overall decision                          non-normal
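
The Z-values in the table appear to be each skewness or kurtosis statistic divided by its standard error (for example, .336 / 1.633 implies a standard error of about .206). The Python below is a minimal sketch of that calculation, not the SPSS implementation. It assumes the usual small-sample standard-error formulas, and the function names are illustrative. The sample sizes used to check it are taken from the SPSS Descriptives output for Questions 2 and 3, where n = 11 gives standard errors of .661 and 1.279.

```python
import math

def se_skewness(n):
    """Standard error of sample skewness for a sample of size n (n >= 3)."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_kurtosis(n):
    """Standard error of sample excess kurtosis for a sample of size n (n >= 4)."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

def z_value(statistic, standard_error):
    """Z-value used in the normality check: statistic divided by its standard error."""
    return statistic / standard_error

# n = 11 reproduces the standard errors reported by SPSS for Question 2.
print(round(se_skewness(11), 3))         # 0.661
print(round(se_kurtosis(11), 3))         # 1.279
# Z-value for a skewness of .428 with standard error .661 (the Pre variable).
print(round(z_value(0.428, 0.661), 2))   # 0.65
```

The standard errors depend only on the number of valid cases, so a variable with missing values has slightly larger standard errors than the full sample size would suggest.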

Provide proper descriptive statistics for each variable

Descriptive statistics for normally distributed variables:

Variable          Mean      SD        Min    Max
Birth weight      2463.9    514.632   1150   3900
Gestational age   36.564    2.0481    30.0   41.0

Descriptive statistics for non-normally distributed variables:

Variable         Median   25th percentile   75th percentile   IQR   Min   Max   Range
Length of stay   27.00    20.25             42                22    0     244   244

Transform non-normal variables to normal. Give proof that your transformed variable(s)
have a normal distribution.

Taking log10(LOS), we obtain a variable that is approximately normally distributed:

Skewness of log10(LOS): -0.11
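
As a sketch of how this skewness check could be reproduced outside SPSS, the Python below computes the adjusted Fisher-Pearson sample skewness before and after a log10 transform. The values are hypothetical, not the surgery.sav data. Zero stays are dropped because log10(0) is undefined; that handling is an assumption of this sketch, not something the assignment specifies.

```python
import math
import statistics

def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (needs at least 3 values)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, divisor n - 1
    cubed = sum(((v - mean) / sd) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * cubed

# Hypothetical right-skewed lengths of stay in days (NOT the real data).
los = [0, 1, 10, 100, 1000, 10000]

# Assumption: drop zero stays, since log10(0) is undefined.
positive_los = [v for v in los if v > 0]
log_los = [math.log10(v) for v in positive_los]

print(sample_skewness(positive_los) > 1)      # True: strongly right-skewed
print(abs(sample_skewness(log_los)) < 1e-9)   # True: symmetric after the transform
```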

Histogram:

QUESTION 2 (10pts)

The daily dietary intake of 11 young and healthy women was measured over a longer period.
The women didn’t know the aim of the study in advance, which was the comparison of pre-
and post-menstrual ingestion, to avoid any deliberate influence on the study results. The
mean dietary intake (in kJ) over 10 pre- (PREMENS) and 10 post-menstrual days
(POSTMENS) of each woman is given in the following table:

The research question is: Is there a difference between pre- and postmenstrual dietary
intake?
State the null and alternative hypotheses, perform the test by your own calculation and by
SPSS, and give your conclusion based on these data.

Answer:

Null hypothesis: The mean difference between pre- and postmenstrual food intake is equal
to zero: $H_0: \mu_{diff} = 0$

Two-sided alternative hypothesis: The mean difference between pre- and postmenstrual food
intake is not equal to zero: $H_A: \mu_{diff} \neq 0$

Woman   Premens   Postmens   Pre-Post ($x_{diff,i}$)   $(x_{diff,i} - \bar{x}_{diff})^2$
1       5260      3910       1350                      873.20
2       5470      4220       1250                      4963.20
3       5640      3885       1755                      188833.70
4       6180      5160       1020                      90270.20
5       6390      5645       745                       331142.70
6       6515      4680       1835                      264761.70
7       6805      5265       1540                      48202.20
8       7515      5975       1540                      48202.20
9       7515      6790       725                       354560.70
10      8230      6900       1330                      91.20
11      8770      7335       1435                      13121.70

$\bar{x}_{pre} = 6753.64$, $\bar{x}_{post} = 5433.18$, $\bar{x}_{diff} = 1320.45$


$s_{diff} = \sqrt{\frac{\sum_{i=1}^{n} (x_{diff,i} - \bar{x}_{diff})^2}{n - 1}} = 366.75$

As we are comparing the means of a paired sample, and assuming that food intake in the
population is normally distributed, we will use the paired-sample t-test.

Calculate the t-statistic:

$t = \frac{\bar{x}_{diff} - 0}{\widehat{SE}(\bar{x}_{diff})} = \frac{1320.45 - 0}{366.75 / \sqrt{11}} = 11.94$

Degrees of freedom: df = n - 1 = 10
p-value (according to the t-table): $\Pr(|T_{df=10}| \geq 11.94) < 0.001$
→ p-value < 0.001 → reject $H_0$

Conclusion: The mean difference between pre- and postmenstrual food intake is not equal to
zero (that is, there is a difference in food intake between the pre- and postmenstrual
periods).
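
The manual calculation above can be cross-checked with a short script. This is a minimal sketch using only the Python standard library and the eleven pairs from the table. The variable names are illustrative, and the critical value is hard-coded from a standard t-table rather than computed.

```python
import math
import statistics

premens = [5260, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770]
postmens = [3910, 4220, 3885, 5160, 5645, 4680, 5265, 5975, 6790, 6900, 7335]

# Within-woman differences (pre minus post), as in the Pre-Post column.
diffs = [pre - post for pre, post in zip(premens, postmens)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)     # sample SD, divisor n - 1
se_diff = sd_diff / math.sqrt(n)      # standard error of the mean difference
t_stat = (mean_diff - 0) / se_diff    # test of H0: mean difference = 0

# Two-sided 5% critical value for 10 degrees of freedom (standard t-table).
T_CRIT_DF10 = 2.228

print(f"mean diff = {mean_diff:.2f}, sd = {sd_diff:.2f}, t = {t_stat:.2f}, df = {n - 1}")
print("reject H0" if abs(t_stat) > T_CRIT_DF10 else "do not reject H0")
```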

SPSS results:
Check for normality:

Descriptives

Statistic                        Pre             Post
Mean                             6753.64         5433.18
Std. Error of Mean               344.363         366.889
95% CI for Mean, Lower Bound     5986.35         4615.70
95% CI for Mean, Upper Bound     7520.93         6250.66
5% Trimmed Mean                  6724.60         5413.54
Median                           6515.00         5265.00
Variance                         1304445.455     1480681.364
Std. Deviation                   1142.123        1216.833
Minimum                          5260            3885
Maximum                          8770            7335
Range                            3510            3450
Interquartile Range              1875            2570
Skewness (Std. Error)            .428 (.661)     .218 (.661)
Kurtosis (Std. Error)            -.793 (1.279)   -1.256 (1.279)

Tests of Normality

        Kolmogorov-Smirnov(a)         Shapiro-Wilk
        Statistic   df   Sig.         Statistic   df   Sig.
Pre     .128        11   .200*        .952        11   .674
Post    .140        11   .200*        .936        11   .479

*. This is a lower bound of the true significance.

a. Lilliefors Significance Correction

According to the results, the distribution of food intake is normal in both periods
(Shapiro-Wilk p = .674 for Pre and p = .479 for Post).


We perform the paired-sample t-test.
As shown in the SPSS results, the paired t-test has a p-value reported as .000 (that is,
p < 0.001) → reject H0 and conclude that the mean difference between pre- and
postmenstrual food intake is not equal to zero (that is, there is a difference in food
intake between the pre- and postmenstrual periods).

QUESTION 3 (10pts)

In an animal experiment by Dr. X, two groups of female rats receive food with high or low
protein content. The research question: Is there a difference in weight gain between the
groups, assuming the two populations have equal variances?

State the null and alternative hypotheses, perform the test by your own calculation and by
SPSS, and give your conclusion based on these data.

Answer:

Null hypothesis: There is no difference in the means of weight gain between the
groups of rats receiving food with high and low protein (or the mean difference of
weight gain between the groups of rats receiving food with high and low protein is
equal to zero)

$H_0: \mu_1 = \mu_2$ or $\mu_1 - \mu_2 = 0$

Alternative hypothesis: There is a difference in the means of weight gain between the
groups of rats receiving food with high and low protein (or the mean difference of
weight gain between the groups of rats receiving food with high and low protein is
unequal to zero)

$H_A: \mu_1 \neq \mu_2$ or $\mu_1 - \mu_2 \neq 0$

As we are comparing the means of two independent samples of rats, and assuming the
distribution of weight gain in the population is normal, we will use the two-sample t-test.

Group 1: $n_1 = 12$, $\bar{x}_1 = \frac{\sum_{i=1}^{12} x_{1i}}{n_1} = 120$, $s_1 = \sqrt{\frac{\sum_{i=1}^{12} (x_{1i} - \bar{x}_1)^2}{n_1 - 1}} = 21.388$

Group 2: $n_2 = 7$, $\bar{x}_2 = \frac{\sum_{i=1}^{7} x_{2i}}{n_2} = 101$, $s_2 = \sqrt{\frac{\sum_{i=1}^{7} (x_{2i} - \bar{x}_2)^2}{n_2 - 1}} = 20.623$

As we assume the two populations have equal variances, we calculate the pooled estimate of
the variance:

$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{(12 - 1)(21.388)^2 + (7 - 1)(20.623)^2}{12 + 7 - 2} = 446.118$

Calculate the t-statistic:

$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 [(1/n_1) + (1/n_2)]}} = \frac{(120 - 101) - 0}{\sqrt{(446.118)[(1/12) + (1/7)]}} = 1.89$

Degrees of freedom: df = n1 + n2 - 2 = 17
p-value (according to the t-table): $0.05 < \Pr(|T_{df=17}| \geq 1.89) < 0.1$
→ p-value > 0.05 → do not reject $H_0$

Conclusion: There is insufficient evidence of a difference in mean weight gain between the
groups of rats receiving food with high and low protein.
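
Because the raw weight gains are not listed here, the pooled calculation can only be re-derived from summary statistics. The Python below is a minimal sketch of that, using the group sizes, means and variances reported in the SPSS Descriptives output for this question. The function name is illustrative, and the critical values are hard-coded from a standard t-table.

```python
import math

def pooled_t_from_summary(n1, mean1, var1, n2, mean2, var2):
    """Equal-variance two-sample t statistic from group sizes, means and variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df, pooled_var

# High-protein group first, low-protein group second.
t_stat, df, pooled_var = pooled_t_from_summary(12, 120.0, 457.455, 7, 101.0, 425.333)

# Two-sided critical values for 17 degrees of freedom (standard t-table).
T_CRIT_05 = 2.110   # two-sided alpha = 0.05
T_CRIT_10 = 1.740   # two-sided alpha = 0.10

print(f"pooled variance = {pooled_var:.3f}, t = {t_stat:.2f}, df = {df}")
print(T_CRIT_10 < abs(t_stat) < T_CRIT_05)   # True, so 0.05 < p < 0.10
```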

SPSS results:

Check for normality

Descriptives: Gain, by groups of protein content

Statistic                        High             Low
Mean                             120.0000         101.0000
Std. Error of Mean               6.17424          7.79499
95% CI for Mean, Lower Bound     106.4106         81.9263
95% CI for Mean, Upper Bound     133.5894         120.0737
5% Trimmed Mean                  119.7778         101.0000
Median                           121.0000         101.0000
Variance                         457.455          425.333
Std. Deviation                   21.38819         20.62361
Minimum                          83.00            70.00
Maximum                          161.00           132.00
Range                            78.00            62.00
Interquartile Range              28.00            33.00
Skewness (Std. Error)            .230 (.637)      .018 (.794)
Kurtosis (Std. Error)            .167 (1.232)     -.241 (1.587)

Tests of Normality

Gain, by groups of    Kolmogorov-Smirnov(a)         Shapiro-Wilk
protein content       Statistic   df   Sig.         Statistic   df   Sig.
High                  .092        12   .200*        .992        12   1.000
Low                   .100        7    .200*        .998        7    1.000

*. This is a lower bound of the true significance.

a. Lilliefors Significance Correction



According to the results, the distribution of weight gain is normal in both groups.
We perform the independent-samples t-test.

The test for equality of the two variances has p-value = 0.905 → do not reject the null
hypothesis and treat the variances of the two populations as equal → we read the t-test
results for equal variances assumed.

p-value = 0.076 → do not reject the null hypothesis; the data do not show a significant
difference in mean weight gain between the low-protein and high-protein groups.

QUESTION 4 (10pts)

The 24-hour total energy consumption (in MJ/day) was determined for 13 skinny and 9
heavily overweight women.

Values for skinny women:

6.13, 7.05, 7.48, 7.48, 7.53, 7.58, 7.90, 8.08, 8.09, 8.11, 8.40, 10.15, 10.88

Values for heavily overweight women:

8.79, 9.19, 9.21, 9.68, 9.69, 9.97, 11.51, 11.85, 12.79

Is there a difference in energy consumption between both groups?

State the null and alternative hypotheses, perform the test by your own calculation and by
SPSS, and give your conclusion based on these data.

Answer:

Null hypothesis: There is no difference in the means of energy consumption between skinny
and overweight women (or the mean difference of energy consumption between skinny and
overweight women is equal to zero)

$H_0: \mu_1 = \mu_2$ or $\mu_1 - \mu_2 = 0$

Alternative hypothesis: There is a difference in the means of energy consumption between
skinny and overweight women (or the mean difference of energy consumption between skinny
and overweight women is not equal to zero)

$H_A: \mu_1 \neq \mu_2$ or $\mu_1 - \mu_2 \neq 0$

As we are comparing the means of two independent samples of women, and assuming the
distribution of energy consumption in the population is normal, we will use the two-sample
t-test.

Group 1: $n_1 = 13$, $\bar{x}_1 = \frac{\sum_{i=1}^{13} x_{1i}}{n_1} = 8.066$, $s_1 = \sqrt{\frac{\sum_{i=1}^{13} (x_{1i} - \bar{x}_1)^2}{n_1 - 1}} = 1.238$

Group 2: $n_2 = 9$, $\bar{x}_2 = \frac{\sum_{i=1}^{9} x_{2i}}{n_2} = 10.298$, $s_2 = \sqrt{\frac{\sum_{i=1}^{9} (x_{2i} - \bar{x}_2)^2}{n_2 - 1}} = 1.398$

Assuming the two populations have equal variances, we calculate the pooled estimate of the
variance:

$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{(13 - 1)(1.238)^2 + (9 - 1)(1.398)^2}{13 + 9 - 2} = 1.701$

Calculate the t-statistic:

$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 [(1/n_1) + (1/n_2)]}} = \frac{(8.066 - 10.298) - 0}{\sqrt{(1.701)[(1/13) + (1/9)]}} = -3.946$

Degrees of freedom: df = n1 + n2 - 2 = 20
p-value (according to the t-table): $\Pr(|T_{df=20}| \geq 3.946) < 0.001$
→ p-value < 0.001 → reject $H_0$

Conclusion: There is a difference in the means of energy consumption between skinny and
overweight women.
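
Since the raw values are listed in the question, the whole calculation can be re-derived from the data. The Python below is a minimal sketch using only the standard library. The variable names are illustrative, and the critical value for a two-sided test at the 0.001 level with 20 degrees of freedom is hard-coded from a standard t-table.

```python
import math
import statistics

skinny = [6.13, 7.05, 7.48, 7.48, 7.53, 7.58, 7.90, 8.08, 8.09, 8.11, 8.40, 10.15, 10.88]
overweight = [8.79, 9.19, 9.21, 9.68, 9.69, 9.97, 11.51, 11.85, 12.79]

n1, n2 = len(skinny), len(overweight)
mean1, mean2 = statistics.mean(skinny), statistics.mean(overweight)
var1, var2 = statistics.variance(skinny), statistics.variance(overweight)  # divisor n - 1

# Pooled variance under the equal-variance assumption.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df

# t statistic for H0: mu1 - mu2 = 0.
t_stat = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Two-sided critical value at alpha = 0.001 for 20 degrees of freedom (t-table).
T_CRIT_001_DF20 = 3.850

print(f"means = {mean1:.3f}, {mean2:.3f}; pooled variance = {pooled_var:.3f}")
print(f"t = {t_stat:.3f}, df = {df}")
print(abs(t_stat) > T_CRIT_001_DF20)   # True, so p < 0.001
```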

SPSS results:

Check for normality

Descriptives: Energy consumption, by group

Statistic                        Skinny           Overweight
Mean                             8.0662           10.2978
Std. Error of Mean               .34338           .46596
95% CI for Mean, Lower Bound     7.3180           9.2233
95% CI for Mean, Upper Bound     8.8143           11.3723
5% Trimmed Mean                  8.0174           10.2431
Median                           7.9000           9.6900
Variance                         1.533            1.954
Std. Deviation                   1.23808          1.39787
Minimum                          6.13             8.79
Maximum                          10.88            12.79
Range                            4.75             4.00
Interquartile Range              .77              2.48
Skewness (Std. Error)            1.161 (.616)     .849 (.717)
Kurtosis (Std. Error)            1.768 (1.191)    -.719 (1.400)

Tests of Normality

Energy consumption,   Kolmogorov-Smirnov(a)         Shapiro-Wilk
by group              Statistic   df   Sig.         Statistic   df   Sig.
Skinny                .255        13   .020         .867        13   .048
Overweight            .259        9    .082         .876        9    .143

a. Lilliefors Significance Correction

According to the results, the distribution of energy consumption is only approximately
normal: the Shapiro-Wilk test is borderline for the skinny group (p = .048) and
non-significant for the overweight group (p = .143).


We perform the independent-samples t-test.
The test for equality of the two variances has p-value = 0.329 → do not reject the null
hypothesis and treat the variances of the two populations as equal → we read the t-test
results for equal variances assumed.

p-value = 0.001 → reject the null hypothesis and conclude that there is a difference in
the means of energy consumption between skinny and overweight women.
