Preprocessing The Data, and Cross-Tabs
Figure 1: Histogram and Frequency Polygon of Incomes of Families in Car Ownership Study
(Figure not reproduced. Income axis runs from 0k to 105k.)
Figure 2: Cumulative Distribution of Incomes of Families in Car Ownership Study
(Figure not reproduced. Income axis runs from 0k to 105k.)
Family Income and Number of Cars Family Owns

                      Number of Cars
                   1 or None   2 or More   Total
TOTAL                  75          25       100
Number of Cars by Family Income

                      Number of Cars
Income             1 or None   2 or More   Total   (# of Cases)
                      Number of Cars
Size of Family     1 or None   2 or More   Total
4 or Less              70           8        78
5 or More               5          17        22
Total                  75          25       100
Number of Cars by Size of Family

                      Number of Cars
Size of Family     1 or None   2 or More   Total   (# of Cases)
5 or More             23%         77%      100%       (22)
Number of Cars by Income and Size of Family

                                        Size of Family
                       4 or Less            5 or More              Total
                    1 or   2 or          1 or   2 or          1 or   2 or
Income              None   More  Total   None   More  Total   None   More  Total
Less than $37,500     44      2     46      4      4      8     48      6     54
More than $37,500     26      6     32      1     13     14     27     19     46
TOTAL                 70      8     78      5     17     22     75     25    100
Number of Cars by Income and Size of Family

                                              Size of Family
                         4 or Less                 5 or More                   Total
                    1 or   2 or               1 or   2 or               1 or   2 or
Income              None   More  Total        None   More  Total        None   More  Total
Less than $37,500    96%     4%  100% (46)     50%    50%  100% (8)      89%    11%  100% (54)
More than $37,500    81%    19%  100% (32)      7%    93%  100% (14)     59%    41%  100% (46)
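The percentages in this table are row percentages: within each family-size panel, every count is divided by its row total, and the base is kept in parentheses. A minimal Python sketch of that step, using the counts from the three-way count table; the function name `row_percentages` and the dictionary layout are illustrative assumptions, not something the slides define.

```python
# Counts of families owning (1 or none, 2 or more) cars, keyed by
# family-size panel and income group. The dictionary layout is illustrative.
counts = {
    "4 or less": {"Less than $37,500": (44, 2), "More than $37,500": (26, 6)},
    "5 or more": {"Less than $37,500": (4, 4), "More than $37,500": (1, 13)},
}


def row_percentages(row):
    """Return whole-number percentages of the row total, plus the total."""
    total = sum(row)
    return tuple(round(100 * count / total) for count in row), total


for size, panel in counts.items():
    for income, row in panel.items():
        pct, n = row_percentages(row)
        print(f"{size:10s} {income:18s} {pct[0]:3d}% {pct[1]:3d}%  100% ({n})")
```

Keeping the row total next to the percentages matters here because some bases are small (only 8 large, lower-income families), so a 50% figure rests on very few cases.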
Car Ownership for Small, Below Average Income Families
(Figure not reproduced. Axis label: Number of Cars.)
Some Relationship
  I     A. Refine Explanation
        B. Reveal Spurious Explanation
        C. Provide Limiting Conditions
  II
No Relationship
  III
  IV
The Researcher’s Dilemma

                                    True Situation
Researcher’s Conclusion    No Relationship     Some Relationship
No Relationship            Correct Decision    Spurious Noncorrelation
Chi-Square Tests
Measures of Association for Nominal Data
                      #Cars:
Family Size:       0 or 1     2+    Total
4 or less             70       8      78
5 or more              5      17      22
Total                 75      25     100
                      #Cars:
Family Size:       0 or 1     2+    Total
4 or less                             78    78%
5 or more                             22    22%
Total                 75      25     100
                     75%     25%
                      #Cars:
Family Size:       0 or 1     2+    Total
4 or less           58.5    19.5      78    78%
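The 58.5 and 19.5 are the counts we would expect in the "4 or less" row if family size and car ownership were independent: row total times column total, divided by the grand total (78 × 75 / 100 = 58.5). A minimal Python sketch of that calculation for the whole table; the function name `expected_counts` is an illustrative assumption.

```python
# Observed counts: rows are family size (4 or less, 5 or more),
# columns are number of cars (0 or 1, 2+).
observed = [[70, 8], [5, 17]]


def expected_counts(table):
    """Expected cell counts if the row and column variables were independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Each expected count is (row total * column total) / grand total.
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
print(expected)  # [[58.5, 19.5], [16.5, 5.5]]
```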
Chi-square measures how much our data differ from what we’d expect under the hypothesis of independence.
Are the row and column variables associated?
$$X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}$$
Chi-Square for Our Data
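As a worked check, the statistic can be computed directly from the family size by number of cars table (70, 8, 5, 17) by summing (observed − expected)² / expected over the four cells. This is a sketch in plain Python; the function name `chi_square` is an illustrative assumption.

```python
# Observed counts: rows are family size (4 or less, 5 or more),
# columns are number of cars (0 or 1, 2+).
observed = [[70, 8], [5, 17]]


def chi_square(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected under independence
            stat += (o - e) ** 2 / e
    return stat


x2 = chi_square(observed)
print(f"{x2:.2f}")  # 41.10
```

Every cell deviates from its expected count by 11.5 in absolute value, so the cells with the smallest expected counts (5.5 and 16.5) contribute most to the total.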
Is this large?
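One way to answer this is to compare the statistic with a chi-square distribution on (r − 1)(c − 1) degrees of freedom, which is 1 for a 2 × 2 table. The sketch below assumes the conventional 5% significance level (critical value 3.841 for 1 degree of freedom), a choice the slides do not state. The p-value shortcut via `math.erfc` is valid only for 1 degree of freedom.

```python
import math

# Observed counts: family size (rows) by number of cars (columns).
observed = [[70, 8], [5, 17]]
rows = [sum(r) for r in observed]
cols = [sum(c) for c in zip(*observed)]
n = sum(rows)

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
x2 = sum(
    (observed[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
    for i in range(len(rows))
    for j in range(len(cols))
)

df = (len(rows) - 1) * (len(cols) - 1)  # 1 for a 2 x 2 table
CRITICAL_05_DF1 = 3.841                 # assumed 5% level, 1 degree of freedom

# With 1 degree of freedom, chi-square is a squared standard normal,
# so the upper-tail probability is erfc(sqrt(x2 / 2)).
p_value = math.erfc(math.sqrt(x2 / 2))

print(f"chi-square = {x2:.2f}, df = {df}")
print("exceeds 5% critical value:", x2 > CRITICAL_05_DF1)
print(f"p-value = {p_value:.2e}")
```

The statistic is far above the critical value, so under these assumptions the data are hard to reconcile with independence of family size and car ownership.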
Three-way table: