Pertemuan 7 - New
SMOKE: 1 = Smoker, 2 = Non-smoker
Initial Analysis
We first perform a simple cross-tabulation to check whether the
frequencies in each cell are large enough to allow log-linear
analysis.
                                     BMI
SMOKE   ECG                      1       2      Total
1       1    Count              47       8       55
             Expected Count     34.4    20.6     55.0
        2    Count              25      35       60
             Expected Count     37.6    22.4     60.0
Total   2    Count              40      65      105
             Expected Count     57.9    47.1    105.0
Chi-Square Tests

SMOKE   Test                            Value       df   Asymp. Sig. (2-sided)
1       Pearson Chi-Square              23.503 (a)   1   .000
        Continuity Correction (b)       21.670       1   .000
        Likelihood Ratio                24.906       1   .000
        Fisher's Exact Test
        Linear-by-Linear Association    23.298       1   .000
        N of Valid Cases                115
2       Pearson Chi-Square               4.151 (c)   1   .042
        Continuity Correction (b)        3.033       1   .082
        Likelihood Ratio                 4.113       1   .043
        Fisher's Exact Test
        Linear-by-Linear Association     4.083       1   .043
        N of Valid Cases                 61
Total   Pearson Chi-Square              30.472 (d)   1   .000
        Continuity Correction (b)       28.791       1   .000
        Likelihood Ratio                32.094       1   .000
        Fisher's Exact Test
        Linear-by-Linear Association    30.299       1   .000
        N of Valid Cases                176

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 20.57.
b. Computed only for a 2x2 table.
c. 0 cells (.0%) have expected count less than 5. The minimum expected count is 6.56.
d. 0 cells (.0%) have expected count less than 5. The minimum expected count is 31.87.
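As a rough check on the output above, the expected counts and the Pearson and likelihood ratio statistics for the smokers' layer (SMOKE = 1) can be recomputed from the four observed counts. This is a minimal sketch in plain Python; the function name `chi_square_2x2` is illustrative and not part of SPSS or any package.

```python
import math

def chi_square_2x2(table):
    """Expected counts, Pearson chi-square and likelihood ratio chi-square
    for a two-way table given as a list of rows.
    Assumes every observed count is greater than zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    pairs = [(o, e)
             for o_row, e_row in zip(table, expected)
             for o, e in zip(o_row, e_row)]
    pearson = sum((o - e) ** 2 / e for o, e in pairs)
    likelihood_ratio = 2 * sum(o * math.log(o / e) for o, e in pairs)
    return expected, pearson, likelihood_ratio

# Smokers (SMOKE = 1): rows are ECG 1 and ECG 2, columns are BMI 1 and BMI 2.
smokers = [[47, 8], [25, 35]]
expected, pearson, likelihood_ratio = chi_square_2x2(smokers)
print([[round(e, 1) for e in row] for row in expected])
print(round(pearson, 3), round(likelihood_ratio, 3))
```

The results should agree with the SMOKE = 1 rows of the table (expected counts 34.4, 20.6, 37.6, 22.4; Pearson 23.503; likelihood ratio 24.906) up to rounding.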
a. For saturated models, .500 has been added to all observed cells.
The likelihood ratio chi-square for the model with no parameters other than
the mean is 69.822. With the first-order effects in the model it falls to
44.530. The difference, 69.822 − 44.530 = 25.292, is displayed on the first
line of the next table.
This difference measures how much the model improves when the first-order
effects are included. The very small P value (printed as 0.0000) means that
the hypothesis that the first-order effects are zero is rejected. In other
words, there is a first-order effect.
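The two likelihood ratio values can be reproduced from the eight cell counts. The sketch below is an illustration under one assumption: the smokers' counts come from the cross-tabulation, while the non-smokers' counts (10, 6, 15, 30) are taken as total minus smokers, which matches N = 61 for that layer. All names are illustrative.

```python
import math

# Observed counts keyed by (SMOKE, ECG, BMI), each coded 1 or 2.
# Non-smokers' counts are an assumption (total minus smokers).
counts = {
    (1, 1, 1): 47, (1, 1, 2): 8, (1, 2, 1): 25, (1, 2, 2): 35,
    (2, 1, 1): 10, (2, 1, 2): 6, (2, 2, 1): 15, (2, 2, 2): 30,
}
n = sum(counts.values())

def g2(expected):
    """Likelihood ratio chi-square: 2 * sum of O * ln(O / E)."""
    return 2 * sum(o * math.log(o / expected[cell])
                   for cell, o in counts.items())

def margin(axis, level):
    """One-way marginal total for one level of one variable."""
    return sum(o for cell, o in counts.items() if cell[axis] == level)

# Model with only the mean: every cell gets the same expected count.
null_expected = {cell: n / len(counts) for cell in counts}

# First-order model: expected count is the product of the three
# one-way margins divided by n squared.
first_expected = {
    cell: margin(0, cell[0]) * margin(1, cell[1]) * margin(2, cell[2]) / n ** 2
    for cell in counts
}

g2_null = g2(null_expected)
g2_first = g2(first_expected)
print(round(g2_null, 3), round(g2_first, 3), round(g2_null - g2_first, 3))
```

With these counts the three printed values should come out close to 69.822, 44.530 and 25.292.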
K-Way and Higher-Order Effects
The same reasoning now applies to the second-order effects. Adding them
improves the likelihood ratio chi-square by 43.142, which is also
significant. Adding the third-order term, however, does not help: its
P value is not significant.
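The model with all second-order effects has no closed-form fitted values, so it has to be fitted iteratively. The sketch below uses iterative proportional fitting, a standard technique for hierarchical log-linear models; it is not taken from the slides and is not necessarily the algorithm SPSS uses. The non-smokers' counts are an assumption (total minus smokers), and all names are illustrative.

```python
import math
from itertools import product

# Observed counts keyed by (SMOKE, ECG, BMI); non-smokers' counts assumed.
counts = {
    (1, 1, 1): 47, (1, 1, 2): 8, (1, 2, 1): 25, (1, 2, 2): 35,
    (2, 1, 1): 10, (2, 1, 2): 6, (2, 2, 1): 15, (2, 2, 2): 30,
}
cells = list(product((1, 2), repeat=3))

def g2(fitted):
    """Likelihood ratio chi-square of fitted against observed counts."""
    return 2 * sum(counts[c] * math.log(counts[c] / fitted[c]) for c in cells)

def fit_ipf(margins, tol=1e-10, max_iter=1000):
    """Fit a hierarchical log-linear model by iterative proportional fitting.
    `margins` lists the axis tuples whose observed margins must be matched,
    e.g. [(0, 1), (0, 2), (1, 2)] for all two-way interactions."""
    fitted = {c: 1.0 for c in cells}
    for _ in range(max_iter):
        largest_change = 0.0
        for axes in margins:
            observed_margin, fitted_margin = {}, {}
            for c in cells:
                key = tuple(c[a] for a in axes)
                observed_margin[key] = observed_margin.get(key, 0) + counts[c]
                fitted_margin[key] = fitted_margin.get(key, 0.0) + fitted[c]
            # Rescale each cell so the fitted margin equals the observed one.
            for c in cells:
                key = tuple(c[a] for a in axes)
                updated = fitted[c] * observed_margin[key] / fitted_margin[key]
                largest_change = max(largest_change, abs(updated - fitted[c]))
                fitted[c] = updated
        if largest_change < tol:  # estimates have stopped changing
            break
    return fitted

g2_first = g2(fit_ipf([(0,), (1,), (2,)]))
g2_second = g2(fit_ipf([(0, 1), (0, 2), (1, 2)]))
print(round(g2_first, 3), round(g2_second, 3), round(g2_first - g2_second, 3))
```

The improvement from adding the second-order effects should come out within rounding of the 43.142 quoted above, and the chi-square left over for the third-order term should be small (about 1.4).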
In log-linear analysis, the change in the likelihood ratio chi-square
statistic when terms are removed from (or added to) the model indicates
their contribution. We saw the same idea in multiple linear regression
with regard to R².
The difference is that in linear regression large values of R² are
associated with good models. The opposite holds in log-linear analysis:
small values of the likelihood ratio chi-square mean a good model.
Backward Elimination Statistics
Step Summary

Step  Effects                                               Chi-Square   df
0     Generating Class    SMOKE*BMI*ECG                         .000      0
      Deleted Effect  1   SMOKE*BMI*ECG                        1.389      1
1     Generating Class    SMOKE*BMI, SMOKE*ECG, BMI*ECG        1.389      1
      Deleted Effect  1   SMOKE*BMI                            3.080      1
                      2   SMOKE*ECG                            3.505      1
                      3   BMI*ECG                             27.631      1
2     Generating Class    SMOKE*ECG, BMI*ECG                   4.469      2
      Deleted Effect  1   SMOKE*ECG                            7.968      1
                      2   BMI*ECG                             32.094      1
3     Generating Class    SMOKE*ECG, BMI*ECG                   4.469      2

The purpose here is to find the unsaturated model that provides the best
fit to the data. This is done by checking that the model currently being
tested does not give a worse fit than its predecessor.
As a first step, the procedure commences with the most complex model. In
our case it is BMI * ECG * SMOKE. Its elimination produces a chi-square
change of 1.389, which has an associated significance level of 0.2386.
Since this is greater than the criterion level of 0.05, the term is removed.
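The significance level attached to a chi-square change with one degree of freedom can be checked by hand. The sketch below relies on the identity P(X > x) = erfc(sqrt(x / 2)), which holds for 1 degree of freedom only; the function name is illustrative.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variate with 1 degree of
    freedom, via P(X > x) = erfc(sqrt(x / 2)). Valid for df = 1 only."""
    return math.erfc(math.sqrt(x / 2))

change = 1.389     # chi-square change when the three-way term is deleted
criterion = 0.05   # removal criterion quoted in the text
p_value = chi2_sf_df1(change)
print(round(p_value, 4), p_value > criterion)
```

The P value should come out at about 0.2386, above the 0.05 criterion, so the three-way term is dropped.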
The procedure moves on to the next hierarchical level, described under
step 1. All two-way interactions between the three variables are tested.
Removal of BMI * ECG would produce a large change of 27.631 in the
likelihood ratio chi-square. The P value for that change is highly
significant (P < 0.0005).
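The change of 27.631 can be traced back to the chi-square tests from the initial analysis. Without BMI * ECG, the model (SMOKE*BMI, SMOKE*ECG) treats BMI and ECG as independent within each level of SMOKE, so its likelihood ratio chi-square is the sum of the likelihood ratio statistics of the two layers. The sketch below assumes the non-smokers' layer is (10, 6) and (15, 30), i.e. total minus smokers; names are illustrative.

```python
import math

def lr_chi_square(table):
    """Likelihood ratio chi-square for independence in a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return 2 * sum(
        o * math.log(o * n / (r * c))
        for row, r in zip(table, row_totals)
        for o, c in zip(row, col_totals)
    )

# ECG (rows) by BMI (columns) within each level of SMOKE.
smokers = [[47, 8], [25, 35]]
non_smokers = [[10, 6], [15, 30]]  # assumed: total minus smokers

# Fit of the model without BMI*ECG: sum of the per-layer statistics.
g2_without_bmi_ecg = lr_chi_square(smokers) + lr_chi_square(non_smokers)

g2_all_two_way = 1.389  # step 1 generating class, from the Step Summary
change = g2_without_bmi_ecg - g2_all_two_way
print(round(g2_without_bmi_ecg, 3), round(change, 3))
```

The per-layer statistics correspond to the 24.906 and 4.113 in the Chi-Square Tests table, and the resulting change should agree with 27.631 up to rounding.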
The smallest change (3.080) is related to the BMI * SMOKE interaction, so
this term is removed next. The procedure continues until it reaches the
final model, which contains the second-order interactions BMI * ECG and
ECG * SMOKE.
Each time a new set of estimates is obtained, it is called an iteration.
Iteration stops when the largest difference between successive estimates
falls below the convergence criterion.
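The selection rule can be sketched from the printed chi-square changes alone. The code below assumes the rule is "delete the effect with the largest P value, provided it exceeds the 0.05 criterion"; this is a reading of the Step Summary, not a statement about SPSS internals. Every change shown has 1 degree of freedom, and the function names are illustrative.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

def effect_to_delete(changes, criterion=0.05):
    """Return the effect whose deletion is least significant, or None
    if every deletion is significant at the criterion level."""
    p_values = {effect: chi2_sf_df1(change)
                for effect, change in changes.items()}
    effect, p_value = max(p_values.items(), key=lambda item: item[1])
    return effect if p_value > criterion else None

# Chi-square changes copied from the Step Summary table.
step1 = {"SMOKE*BMI": 3.080, "SMOKE*ECG": 3.505, "BMI*ECG": 27.631}
step2 = {"SMOKE*ECG": 7.968, "BMI*ECG": 32.094}

print(effect_to_delete(step1))  # SMOKE*BMI
print(effect_to_delete(step2))  # None
```

At step 1 both SMOKE*BMI and SMOKE*ECG have P values above 0.05, and SMOKE*BMI, with the smaller change, goes first. At step 2 neither remaining term can be removed, so elimination stops with SMOKE*ECG and BMI*ECG in the model.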
We conclude that being overweight and smoking each have a significant
association with an abnormal cardiogram. However, in this particular group
of subjects being overweight is the more harmful of the two: deleting
BMI*ECG changes the chi-square by 32.094, against 7.968 for SMOKE*ECG.
Odds Ratio
We could have inferred this by calculating the odds ratio when we
performed the cross tabulation.
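Cell counts:

         Y = 1   Y = 0
X = 1    n11     n10
X = 0    n01     n00

Estimated cell probabilities:

         Y = 1   Y = 0
X = 1    p̂11     p̂10
X = 0    p̂01     p̂00

where p̂ij = nij / n, with n = n11 + n10 + n01 + n00 being the sum of all four cell counts.

The sample log odds ratio is

\[
L = \log\left(\frac{\hat{p}_{11}\,\hat{p}_{00}}{\hat{p}_{10}\,\hat{p}_{01}}\right)
  = \log\left(\frac{n_{11}\,n_{00}}{n_{10}\,n_{01}}\right).
\]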
The odds ratio calculation is shown below:

                          (ECG 1)   (ECG 2)
Overweight (BMI 1)          47        25

                          (ECG 1)   (ECG 2)
Smoker (Smoking 1)          10        15
Non-Smoker (Smoking 2)       6        30
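The arithmetic behind these tables can be sketched as follows. The function implements the sample odds ratio n11 n00 / (n10 n01) and its logarithm; its name is illustrative. The second table is used exactly as printed. For the first table only the overweight row appears above, so the second row (8, 35) is an assumption taken from the smokers' layer of the cross-tabulation.

```python
import math

def odds_ratio(n11, n10, n01, n00):
    """Sample odds ratio and log odds ratio for a 2x2 table
    laid out as [[n11, n10], [n01, n00]]."""
    ratio = (n11 * n00) / (n10 * n01)
    return ratio, math.log(ratio)

# Second table, as printed: rows (10, 15) and (6, 30).
ratio_b, log_ratio_b = odds_ratio(10, 15, 6, 30)

# First table: overweight row (47, 25) as printed; the second row (8, 35)
# is an assumption taken from the smokers' cross-tabulation.
ratio_a, log_ratio_a = odds_ratio(47, 25, 8, 35)

print(round(ratio_a, 3), round(log_ratio_a, 3))
print(round(ratio_b, 3), round(log_ratio_b, 3))
```

An odds ratio above 1 means the row-1 group has higher odds of ECG 1 than the row-2 group; the further the ratio is from 1, the stronger the association.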