Professional Documents
Culture Documents
Statistical Interference On SAS
Statistical Interference On SAS
Statistical Interference On SAS
STATISTICAL INFERENCE
ON SAS
YI-TING HWANG
DEPARTMENT OF STATISTICS
NATIONAL TAIPEI UNIVERSITY
X − µ0 X − µ0
Z = =
σx σ/ n
σx
where is the true population s.d., then the
test follows a normal distribution with mean
0 and variance 1.
10/26/2006 ch4 Statistical inference on SAS 6
T TEST
If σ x is unknown,
X − µ0 X − µ0
T = = .
sx s/ n
where s is the sample standard deviation, the
test follows a t distribution with df n-1.
2 1
2
s2
1
s
2
2 n1 + n 2
n + n n1 − 1 n2 − 1
1 2
10/26/2006 ch4 Statistical inference on SAS 14
T TEST WITH EQUAL
VARIANCE
Let X −Y
T =
1 1
sp +
n1 n 2
2 2
( n − 1) s + ( n − 1) s
where s 2p = 1 1 2 2
is the pooled
n1 + n 2 − 2
sample standard deviation, the test follows a
t distribution with df n1 + n 2 − 2 .
where 2
sD =
∑ (d i − d ) 2
n −1
is the sample standard deviation of the
differences, the test follows a t distribution
with df n-1.
10/26/2006 ch4 Statistical inference on SAS 18
EXAMPLE 5
• Observational studies have suggested that low dietary
intake or low plasma concentrations of retinol, beta-
carotene, or other carotenoids might be associated with
increased risk of developing certain types of cancer.
However, relatively few studies have investigated the
determinants of plasma concentrations of these
micronutrients.
• A cross-sectional study is designed to investigate the
relationship between personal characteristics and dietary
factors, and plasma concentrations of retinol, beta-
carotene and other carotenoids. Study subjects (N = 315)
were patients who had an elective surgical procedure
during a three-year period to biopsy or remove a lesion of
the lung, colon, breast, skin, ovary or uterus that was
found to be non-cancerous.
10/26/2006 ch4 Statistical inference on SAS 19
VARIABLE NAMES
• Variable Names in order from left to right:
– AGE: Age (years)
– SEX: Sex (1=Male, 2=Female).
– SMOKSTAT: Smoking status (1=Never, 2=Former, 3=Current Smoker)
– QUETELET: Quetelet (weight/(height^2))
– VITUSE: Vitamin Use (1=Yes, fairly often, 2=Yes, not often, 3=No)
– CALORIES: Number of calories consumed per day.
– FAT: Grams of fat consumed per day.
– FIBER: Grams of fiber consumed per day.
– ALCOHOL: Number of alcoholic drinks consumed per week.
– CHOLESTEROL: Cholesterol consumed (mg per day).
– BETADIET: Dietary beta-carotene consumed (mcg per day).
– RETDIET: Dietary retinol consumed (mcg per day)
– BETAPLAS: Plasma beta-carotene (ng/ml)
– RETPLAS: Plasma Retinol (ng/ml)
10/26/2006 ch4 Statistical inference on SAS 20
DISCRETE VARIABLES
Test statistic :
(O − E ) 2
2 k j j
χ = ∑ ≈ χ2 ,
j=1 E k −1
j
where E = np
j j
0
10/26/2006 ch4 Statistical inference on SAS 25
TEST OF INDEPENDENCE
Consider two variables of interest, X and Y,
with I and J categories, respectively.
Objective:
H : the classifications are independent
0
H a : the classifications are dependent
A B C
Cash 40 35 20
Credit card 40 30 30
Check 20 35 50
10/26/2006 ch4 Statistical inference on SAS 34