8 Sept - SPSS Workshop - Exercises 3 - Kate Reid
Lab Class 3
Review of Basic Concepts in SPSS
Random samples
Producing correlations
Performing t-tests
Exercise 1
Data was collected by a real estate agent in Tullamarine about some characteristics of all houses sold in the area in
the last month.
Data was collected on 40 people to indicate whether they had a high or low income and whether they lived in an
urban or rural area.
Column 1 indicates whether the respondent lived in an urban or rural area (urban = 1, rural = 2).
Column 2 indicates whether the respondent had a “high” or a “low” income (low = 1, high = 2).
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 2
1 2
1 2
1 2
1 2
1 2
1 2
1 2
1 2
1 2
1 2
1 2
1 2
1 2
1 2
2 1
2 1
2 1
2 1
2 1
2 1
2 1
2 1
2 1
2 1
2 1
2 1
2 2
2 2
2 2
2 2
2 2
2 2
2. Perform a crosstabs and find out, a) the frequency (and percentage) of urban respondents who had a high and
low income and b) the frequency (and percentage) of rural dwellers who had a high and low income.
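For checking the SPSS crosstabs output by hand, the same table can be tallied in a few lines of Python. This is only a sketch: the cell counts are taken from the data listing above, while the names `rows`, `counts` and `row_percentages` are invented for this illustration and are not part of SPSS.

```python
from collections import Counter

# Data from the exercise listing, written as (locality, income) pairs.
# Coding follows the handout: locality 1 = urban, 2 = rural; income 1 = low, 2 = high.
rows = [(1, 1)] * 7 + [(1, 2)] * 15 + [(2, 1)] * 12 + [(2, 2)] * 6

locality_labels = {1: "urban", 2: "rural"}
income_labels = {1: "low", 2: "high"}

# Cell frequencies keyed by (locality, income).
counts = Counter(rows)


def row_percentages(counts, locality):
    """Return {income code: percentage of respondents within this locality}."""
    total = sum(counts[(locality, inc)] for inc in income_labels)
    return {inc: 100.0 * counts[(locality, inc)] / total for inc in income_labels}


for loc, loc_name in locality_labels.items():
    pct = row_percentages(counts, loc)
    for inc, inc_name in income_labels.items():
        print(f"{loc_name:5s} {inc_name:4s} n={counts[(loc, inc)]:2d} ({pct[inc]:.1f}%)")
```

The percentages here are row percentages (within each locality), which is what parts a) and b) ask for.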
3. A chi square statistic is a measure of association between two categorical variables. You may be interested in
knowing for instance whether your level of income is dependent upon whether you happen to be in an urban or
rural area or whether income is INDEPENDENT of locality.
This is a really useful statistic to have some familiarity with because in social research we’re often dealing with
purely categorical variables. The chi square statistic compares the OBSERVED frequencies (i.e. those that you
obtained in your data) with the frequencies that would be EXPECTED if there was no relationship between the two
variables (i.e. if they were INDEPENDENT).
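The comparison of observed and expected frequencies can be sketched by hand as follows. This assumes the Pearson chi square without a continuity correction, and the 2x2 counts are tallied from the income and locality listing above. The variable names are illustrative only, and the p-value shortcut used here holds only for one degree of freedom.

```python
import math

# Observed 2x2 table: rows = locality (urban, rural), columns = income (low, high).
observed = [[7, 15],
            [12, 6]]

n = sum(sum(row) for row in observed)
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# Expected count under independence: row total * column total / N.
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

# Pearson chi square: sum over cells of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2)
    for j in range(2)
)

# For a 2x2 table df = 1, and the upper-tail probability of a chi-square
# variable with 1 df equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.3f}, df = 1, p = {p_value:.3f}")
```

SPSS also prints a continuity-corrected value for 2x2 tables, which will be somewhat smaller than the uncorrected statistic computed here.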
Remember that in chi square we’re interested in the issue of INDEPENDENCE vs. DEPENDENCE. So if
you’re just as likely to have a high or a low income in an urban area as in a rural area, we can say that income is
INDEPENDENT of LOCALITY. In this case your chi square statistic WILL NOT be significant (i.e. sig > 0.05).
To produce the chi square statistic go back to CROSSTABS. Click on STATISTICS, check CHI SQUARE and click
CONTINUE. Then click on CELLS and under RESIDUALS check ADJ. STANDARDIZED.
A significance level less than 0.05 indicates that there is a relationship between income and locality.
The adjusted standardized residuals give you an indication of where the OBSERVED frequencies differ
significantly from the EXPECTED frequencies.
As a rough guide you should look for adjusted standardized residuals greater than +2 or less than -2.
A large positive residual tells you that there are more observations in that cell than you’d expect if the variables
were independent. A large negative residual tells you that there are fewer observations in that cell than you’d
expect if the variables were independent.
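To see where the residuals come from, the sketch below computes adjusted standardized residuals for the income and locality table using the usual formula, (observed - expected) / sqrt(expected * (1 - row proportion) * (1 - column proportion)). The counts are tallied from the listing above. The names are illustrative, and the cutoff of 2 is the rough guide given in this handout, not a strict rule.

```python
import math

# Observed 2x2 table: rows = locality (urban, rural), columns = income (low, high).
observed = [[7, 15],
            [12, 6]]

n = sum(sum(row) for row in observed)
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]


def adjusted_residual(i, j):
    """Adjusted standardized residual for the cell in row i, column j."""
    expected = row_totals[i] * col_totals[j] / n
    spread = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
    return (observed[i][j] - expected) / math.sqrt(spread)


adjusted = [[adjusted_residual(i, j) for j in range(2)] for i in range(2)]

row_names = ["urban", "rural"]
col_names = ["low", "high"]
for i in range(2):
    for j in range(2):
        flag = "  <- beyond +/-2" if abs(adjusted[i][j]) > 2 else ""
        print(f"{row_names[i]:5s} {col_names[j]:4s} {adjusted[i][j]:+.2f}{flag}")
```

In a 2x2 table all four residuals have the same magnitude, so the signs carry the information about which cells are over- or under-represented.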
Exercise 3
Data was collected on 20 people who enrolled in a new fitness program at a gym.
2 3 18
2 3 26
1 2 20
2 1 19
1 3 25
2 1 22
1 1 23
2 3 19
1 2 18
2 2 24
2 1 21
1 3 27
2 3 22
2 3 19
1 1 23
1 2 25
2 1 23
1 1 19
2 2 21
2 3 20
1. Enter the data into SPSS
2. Produce some simple descriptions of the people in your sample and produce graphs as well.
3. What percentage of males and females are in the different health categories?
4. Is there a relationship between gender and health rating?
5. Find out whether the mean age of males and females is significantly different.
6. Perform an ANOVA to see if the average age of the participants varies according to health rating.
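As a cross-check on steps 5 and 6, the test statistics can be computed by hand as sketched below. Several assumptions are made here: the handout does not label the data columns, so the order (gender code, health rating, age) is inferred from the questions; the gender codes are left as 1 and 2 because the coding is not stated; and the t-test assumes equal variances. Only the statistics and degrees of freedom are computed, and the significance values should be read from the SPSS output.

```python
import math
from statistics import mean, variance

# Rows copied from the Exercise 3 listing.
# ASSUMPTION: the columns are (gender code, health rating, age); the handout
# does not label them, so this order is inferred from the questions.
data = [
    (2, 3, 18), (2, 3, 26), (1, 2, 20), (2, 1, 19), (1, 3, 25),
    (2, 1, 22), (1, 1, 23), (2, 3, 19), (1, 2, 18), (2, 2, 24),
    (2, 1, 21), (1, 3, 27), (2, 3, 22), (2, 3, 19), (1, 1, 23),
    (1, 2, 25), (2, 1, 23), (1, 1, 19), (2, 2, 21), (2, 3, 20),
]


def pooled_t(a, b):
    """Independent-samples t statistic (equal variances assumed) and its df."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, na + nb - 2


def one_way_f(groups):
    """One-way ANOVA F statistic with (between, within) degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


ages_by_gender = {g: [age for gg, _, age in data if gg == g] for g in (1, 2)}
ages_by_health = {h: [age for _, hh, age in data if hh == h] for h in (1, 2, 3)}

t_stat, t_df = pooled_t(ages_by_gender[1], ages_by_gender[2])
f_stat, f_df1, f_df2 = one_way_f(list(ages_by_health.values()))

print(f"t({t_df}) = {t_stat:.3f}")
print(f"F({f_df1}, {f_df2}) = {f_stat:.3f}")
```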
Exercise 4
Open the SPSS data file DIETSTUDY.SAV. To find it, click FILE → OPEN → DATA, double click on the TUTORIAL
folder, double click on the SAMPLE_FILES folder, then open DIETSTUDY.SAV.
This file contains data from 16 individuals who underwent a weight reduction program. The file records the
participants’ age and gender, their triglyceride levels at the start of the program (tg0) and their weight in pounds (it
is an American study) at the start of the program (wgt0). The subsequent triglyceride and weight measurements
were taken at two-weekly intervals after the commencement of the program.
Note: Triglycerides are a fat that shows up in a blood test (you’re probably more familiar with the term cholesterol
test). Higher readings indicate higher levels of the fat and are said to be a risk factor for heart disease, stroke etc.
a) Write a paragraph that describes the main attributes of the sample at the beginning of the study.
b) Perform a test to determine if the average triglyceride reading for the participants decreases between the first
and the last measurement.
c) Perform a test to determine if the average weight for the participants decreases between the first and the last
measurement.
d) Perform tests to determine if males and females have i) different triglyceride levels at the start of the program
and ii) different weights at the start of the program
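Parts b) and c) compare two measurements on the same people, which suggests a paired-samples t-test. A minimal sketch of the calculation follows. The function name `paired_t` and the example numbers are hypothetical and are NOT values from DIETSTUDY.SAV; the real figures come from running the test in SPSS.

```python
import math
from statistics import mean, stdev


def paired_t(first, last):
    """Return (t, df) for a paired-samples t-test on the differences first - last."""
    if len(first) != len(last):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(first, last)]
    n = len(diffs)
    # t = mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# HYPOTHETICAL values for illustration only; these are NOT taken from DIETSTUDY.SAV.
example_first = [10, 12, 14, 16]
example_last = [9, 10, 13, 13]

t_stat, df = paired_t(example_first, example_last)
print(f"t({df}) = {t_stat:.3f}")
```

A positive t here means the first measurement is higher on average than the last, which is the direction a decrease would produce.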