
Lecture 7: Test of Hypothesis

The χ² Distribution

Assume throughout that $Z_1, Z_2, \ldots, Z_n$ are IID random variables, each with a normal distribution with mean 0 and variance 1. Then the random variable
$$Y_n = Z_1^2 + Z_2^2 + \cdots + Z_n^2$$
has the chi-square ($\chi^2$) distribution with $n$ degrees of freedom, i.e.
$$f(\chi^2) = \frac{(\chi^2)^{\nu/2 - 1}\, e^{-\chi^2/2}}{2^{\nu/2}\, \Gamma(\nu/2)}$$

Here $\nu = k - 1$ is the degrees of freedom (d.f.) of the density $f(\chi^2)$, and $k$ is the number of pairs of frequencies to be compared.
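As a quick numerical check of the density formula above, the following minimal Python sketch (assuming the standard library plus SciPy, neither of which is part of these notes) evaluates f(χ²) directly from the formula and compares it with scipy.stats.chi2.pdf.

```python
import math

from scipy.stats import chi2

def chi2_pdf(x, nu):
    """Chi-square density from the formula: x^(nu/2-1) e^(-x/2) / (2^(nu/2) Gamma(nu/2))."""
    return x ** (nu / 2 - 1) * math.exp(-x / 2) / (2 ** (nu / 2) * math.gamma(nu / 2))

nu = 6                                 # example degrees of freedom
for x in [1.0, 3.0, 6.0, 12.0]:
    by_formula = chi2_pdf(x, nu)
    by_scipy = chi2.pdf(x, df=nu)      # SciPy's implementation of the same density
    print(f"x={x:5.1f}  formula={by_formula:.6f}  scipy={by_scipy:.6f}")
```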

• Chi-square (χ²) goodness-of-fit test – an inferential procedure used to determine whether a categorical frequency distribution follows a claimed distribution.

• Expected counts – probability of an outcome times the sample size, for k mutually exclusive outcomes
• One-way table – a table of k mutually exclusive observed values
• Cells – the individual entries in the one-way table
• The total area under a chi-square curve is equal to 1.
• The distribution is not symmetric; it is skewed right.
• The shape of the chi-square distribution depends on the degrees of freedom.
• As the number of degrees of freedom increases, the chi-square distribution becomes more nearly symmetric.
• The values of χ² are nonnegative, i.e. always greater than or equal to zero; the density rises to a peak and then asymptotically approaches 0.

• Table of 𝝌𝟐 gives critical values


Goodness-of-Fit Test
[Figure: χ² density curve. The P-value is the highlighted right-tail area beyond the observed χ²; the critical region lies to the right of χ²_α.]

Test statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ is the observed count and $E_i$ is the expected count for the $i$-th category.

Reject the null hypothesis if
P-value < α, or equivalently χ² > χ²_{α, k−1} (right-tailed test).
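The test statistic and rejection rule above translate directly into a few lines of code. The sketch below is only an illustration under the assumption that SciPy is available; the helper name chi2_gof is my own, not something from the lecture.

```python
from scipy.stats import chi2

def chi2_gof(observed, expected, alpha=0.05):
    """Right-tailed chi-square goodness-of-fit test with k - 1 degrees of freedom."""
    k = len(observed)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = k - 1
    critical = chi2.ppf(1 - alpha, df)   # chi^2_{alpha, k-1}
    p_value = chi2.sf(stat, df)          # P(chi^2 > stat), the right-tail area
    reject = stat > critical             # equivalently: p_value < alpha
    return stat, critical, p_value, reject

# Toy usage: 60 observations over 3 equally likely categories.
stat, crit, p, reject = chi2_gof([25, 18, 17], [20, 20, 20])
print(stat, crit, p, reject)
```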
Example 1
Are you more likely to have a motor vehicle collision when using a cell phone? A study of 699 drivers who were using a cell phone when they were involved in a collision examined this question. These drivers made 26,798 cell-phone calls during a 14-month study period. Each of the 699 collisions was classified in various ways.

Day      Sun   Mon   Tue   Wed   Thu   Fri   Sat
Count     20   133   126   159   136   113    12

Are accidents equally likely to occur on any day of the week?


Example 1 – Graphical Analysis
Are accidents equally likely to occur on any day of the week?
[Figure: bar chart of the observed collision counts by day of the week.]
Example 1 – Chi-Square Analysis
Are accidents equally likely to occur on any day of the week?

Hypotheses:
H0: Motor vehicle accidents involving cell phones are equally likely to occur on every day of the week.
Ha: Motor vehicle accidents involving cell phones are not equally likely to occur on every day of the week (not all the same).

Conditions:
Expected count (each day) = 699/7 = 99.857
1) All expected counts > 0
2) All expected counts > 5
Example 1 – Chi-Square Analysis
Are accidents equally likely to occur on any day of the week?
Calculations:
$$\chi^2_{\nu} = \sum \frac{(O - E)^2}{E}$$

Item          Sun     Mon     Tue     Wed     Thu     Fri     Sat
Observed      20      133     126     159     136     113     12
Expected      99.86   99.86   99.86   99.86   99.86   99.86   99.86
(O − E)²/E    63.86   11.00   6.84    35.03   13.08   1.73    77.30

χ² = 63.86 + 11.00 + 6.84 + 35.03 + 13.08 + 1.73 + 77.30 = 208.84

Interpretation:
The critical value with n − 1 = 6 d.f. is χ²(6, 0.0005) = 24.1.

Since our χ² value is much greater than the critical value (208.84 > 24.1), we reject H0 and conclude that the accidents are not equally likely each day of the week.
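For Example 1, the same conclusion can be reached with a single library call; the sketch below (assuming SciPy, which the notes do not use) reproduces χ² ≈ 208.8 and shows that the p-value is far below any common significance level.

```python
from scipy.stats import chisquare

observed = [20, 133, 126, 159, 136, 113, 12]   # Sun..Sat collision counts
expected = [699 / 7] * 7                       # 99.857 per day under H0

result = chisquare(f_obs=observed, f_exp=expected)
print(result.statistic)   # ~208.8, matching the hand calculation
print(result.pvalue)      # essentially 0, so H0 is rejected
```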
Multinomial Experiments
A multinomial experiment is a probability experiment in which there are a fixed number of independent trials and there are more than two possible outcomes for each trial.

• The probability of each outcome is fixed.
• The sum of the probabilities of all possible outcomes is one.

A chi-square goodness-of-fit test is used to test whether a frequency distribution fits a specific distribution.
Chi-Square Test for Goodness-of-Fit
Example: A social service organization claims 50% of all marriages are the first marriage for both bride and groom, 12% are first for the bride only, 14% for the groom only, and 24% are a remarriage for both.

First Marriage     %
Bride and Groom    50
Bride only         12
Groom only         14
Neither            24

H0: The distribution of first-time marriages is 50% for both bride and groom, 12% for the bride only, 14% for the groom only, and 24% are remarriages for both.
H1: The distribution of first-time marriages differs from the claimed distribution.
Goodness-of-Fit Test
Observed frequency, O, is the frequency of the
category found in the sample.
Expected frequency, E, is the calculated frequency for
the category using the specified distribution. Ei = npi

In a survey of 103 married couples, find the expected number E in each category.

First Marriage     %     E = np
Bride and Groom    50    103(0.50) = 51.50
Bride only         12    103(0.12) = 12.36
Groom only         14    103(0.14) = 14.42
Neither            24    103(0.24) = 24.72
Chi-Square Test
If the observed frequencies are obtained from a
random sample and each expected frequency is at
least 5, the sampling distribution for the goodness-
of-fit test is a chi-square distribution with k – 1
degrees of freedom (where k = the number of
categories).

The test statistic is
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where O = observed frequency in each category and E = expected frequency in each category.
A social service organization claims 50% of all marriages are the first marriage for both bride and groom, 12% are first for the bride only, 14% for the groom only, and 24% are a remarriage for both. The results of a study of 103 randomly selected married couples are listed in the table. Test the distribution claimed by the agency. Use α = 0.01.

First Marriage     f
Bride and Groom    55
Bride only         12
Groom only         12
Neither            24
1. Write the null and alternative hypothesis.
H0: The distribution of first-time marriages is 50% for both bride and groom, 12% for the bride only, 14% for the groom only, and 24% are remarriages for both.
Ha: The distribution of first-time marriages differs from the claimed
distribution.
2. State the level of significance.
   α = 0.01
3. Determine the sampling distribution.
   A chi-square distribution with 4 – 1 = 3 d.f.
4. Find the critical value.
   χ²₀ = 11.34
5. Find the rejection region.
   χ² > 11.34 (the right tail of the chi-square curve beyond 11.34)
6. Find the test statistic.

                    %     O     E       (O – E)²   (O – E)²/E
Bride and groom    50    55    51.50    12.2500     0.2379
Bride only         12    12    12.36     0.1296     0.0105
Groom only         14    12    14.42     5.8564     0.4061
Neither            24    24    24.72     0.5184     0.0210
Total             100   103   103.00                0.6755

χ² = 0.6755
7. Make your decision.

The test statistic 0.6755 does not fall in the rejection region (χ² > 11.34), so fail to reject H0.

8. Interpret your decision.

The distribution fits the specified distribution for first-time marriages.
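The marriage example can be reproduced step by step. The following plain-Python sketch rebuilds the E = np column and the (O − E)²/E contributions from the claimed percentages and observed counts, arriving at χ² ≈ 0.6755 (the loop structure and variable names are illustrative, not from the notes).

```python
# Claimed distribution and observed counts for the 103 surveyed couples.
claimed = {"Bride and groom": 0.50, "Bride only": 0.12, "Groom only": 0.14, "Neither": 0.24}
observed = {"Bride and groom": 55, "Bride only": 12, "Groom only": 12, "Neither": 24}
n = sum(observed.values())           # 103

chi2_stat = 0.0
for category, p in claimed.items():
    e = n * p                        # expected count E = n * p_i
    o = observed[category]
    contribution = (o - e) ** 2 / e
    chi2_stat += contribution
    print(f"{category:16s} O={o:3d}  E={e:6.2f}  (O-E)^2/E={contribution:.4f}")

print(f"chi-square = {chi2_stat:.4f}   (critical value 11.34 at alpha = 0.01, 3 d.f.)")
```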


χ² Test Applied to a Poisson Assumption
The number of vehicles arriving at the northwest corner of an intersection in a 5-minute period between 7:00 a.m. and 7:05 a.m. was monitored for five workdays over a 20-week period. The following table shows the resulting data. The first entry in the table indicates that there were 12 five-minute periods during which zero vehicles arrived, 10 periods during which one vehicle arrived, and so on.

Number of arrivals in a 5-minute period:

Arrivals/Period   Frequency     Arrivals/Period   Frequency
0                 12            6                 7
1                 10            7                 5
2                 19            8                 5
3                 17            9                 3
4                 10            10                3
5                  8            11                1
The mean is obtained as
$$\bar{x} = \alpha = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{364}{100} = 3.64$$

Since the histogram of the data appeared to follow a Poisson distribution with mean 3.64 (estimated from the data), the following hypotheses may be made:
H0: The random variable is Poisson distributed.
H1: The random variable is not Poisson distributed.

The pmf of the Poisson distribution is
$$p(x) = f(x) = \frac{\alpha^x e^{-\alpha}}{x!}, \qquad x = 0, 1, 2, 3, \ldots$$
For α = 3.64, the probabilities associated with the various values of x are obtained as:

f(0) = 0.026    f(6) = 0.085
f(1) = 0.096    f(7) = 0.044
f(2) = 0.174    f(8) = 0.020
f(3) = 0.211    f(9) = 0.008
f(4) = 0.192    f(10) = 0.003
f(5) = 0.140    f(11) = 0.001
With this information the following table was constructed. The value of E₁ is given by np₀ = 100 × 0.026 = 2.6. In a similar manner, the remaining Eᵢ values are determined. Since E₁ < 5, E₁ and E₂ are combined; in that case O₁ and O₂ are also combined, and k is reduced by 1. The last five class intervals are also combined for the same reason, and k is further reduced by 4.

xᵢ       Observed Freq. Oᵢ    Expected Freq. Eᵢ    (Oᵢ − Eᵢ)²/Eᵢ
0–1      22                   12.2                  7.87     (cells 0 and 1 combined: O = 12 + 10, E = 2.6 + 9.6)
2        19                   17.4                  0.15
3        17                   21.1                  0.80
4        10                   19.2                  4.41
5         8                   14.0                  2.57
6         7                    8.5                  0.26
7–11     17                    7.6                 11.62     (cells 7–11 combined: O = 5 + 5 + 3 + 3 + 1, E = 4.4 + 2.0 + 0.8 + 0.3 + 0.1)
Total   100                                         27.68
The calculated value is χ² = 27.68.

The degrees of freedom for the tabulated value of χ² are
$$\nu = k - s - 1 = 7 - 1 - 1 = 5$$
Here s = 1, since one parameter, α, was estimated from the data. At the 0.05 level of significance, the critical value is χ²₀.₀₅,₅ = 11.1. Since the computed value of χ² exceeds the tabulated value, H0 is rejected at the 0.05 level of significance. The analyst would therefore want to search for a better-fitting model or use the empirical distribution of the data.
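The whole procedure — estimate α from the data, compute Eᵢ = n·f(xᵢ), pool cells with Eᵢ < 5 from the ends, and use k − s − 1 degrees of freedom — can be sketched as follows. SciPy and the helper pool_ends are assumptions of this sketch; small rounding differences from the hand calculation (which used probabilities rounded to three decimals) are expected.

```python
from scipy.stats import chi2, poisson

counts = {0: 12, 1: 10, 2: 19, 3: 17, 4: 10, 5: 8, 6: 7, 7: 5, 8: 5, 9: 3, 10: 3, 11: 1}
n = sum(counts.values())                                 # 100 five-minute periods
alpha_hat = sum(x * f for x, f in counts.items()) / n    # 3.64, estimated from the data

observed = [counts[x] for x in sorted(counts)]
expected = [n * poisson.pmf(x, alpha_hat) for x in sorted(counts)]

def pool_ends(obs, exp, minimum=5.0):
    """Pool cells with small expected counts from both ends, as done by hand."""
    obs, exp = obs[:], exp[:]
    while exp[0] < minimum:                  # merge the leftmost cells
        exp[0:2] = [exp[0] + exp[1]]
        obs[0:2] = [obs[0] + obs[1]]
    while exp[-1] < minimum:                 # merge the rightmost cells
        exp[-2:] = [exp[-2] + exp[-1]]
        obs[-2:] = [obs[-2] + obs[-1]]
    return obs, exp

obs_p, exp_p = pool_ends(observed, expected)
stat = sum((o - e) ** 2 / e for o, e in zip(obs_p, exp_p))
df = len(obs_p) - 1 - 1                      # k - s - 1, with s = 1 estimated parameter
print(stat, chi2.ppf(0.95, df))              # ~27.6 > 11.07, so H0 is rejected
```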
Gibson Mix:
One way to measure computer performance is by CPU execution time. Since there are a number of common instruction types, an appropriate way to proceed is to take a weighted average
$$\sum_{i=1}^{r} p_i t_i$$
where pᵢ is the probability of calling instruction Iᵢ and tᵢ is the execution time of Iᵢ. A specific set of instructions I₁, I₂, ..., I_r, in conjunction with their occurrence probabilities, comprises an instruction mix. The execution-time rating depends on the accuracy of the mix. One mix in use is the Gibson Mix, given in the table below, with the occurrence probabilities in the pᵢ column.

Consider the problem of determining whether the Gibson Mix is suitable in a particular environment. Since there are r = 7 classifications in the mix, the appropriate hypothesis test is one in which the null hypothesis specifies the 7 cell probabilities. The expected and observed frequencies are given in the Eᵢ and Oᵢ columns of the table. The observed frequencies correspond to a sample of 200 instructions randomly selected from typical programs in use. Thus, for each i, Eᵢ = 200pᵢ. The χ² test is applicable with 7 − 1 = 6 degrees of freedom.
1. Hypothesis test: H0: p₁ = 0.31, p₂ = 0.18, p₃ = 0.17, p₄ = 0.12, p₅ = 0.07, p₆ = 0.04, p₇ = 0.11
2. Level of significance: α = 0.05
3. Test statistic: χ²₆
4. Sample size: n = 200
5. Critical value: χ²₀.₀₅,₆ = 12.592
Instruction Type                     Prob. of instruction type pᵢ    Expected freq. Eᵢ    Observed freq. Oᵢ
Transfer to and from main memory     0.31                            62                   72
Indexing                             0.18                            36                   30
Branching                            0.17                            34                   32
Floating-point arithmetic            0.12                            24                   14
Fixed-point arithmetic               0.07                            14                   22
Shifting                             0.04                             8                   10
Miscellaneous                        0.11                            22                   20
$$\chi^2_6 = \frac{(62-72)^2}{62} + \frac{(36-30)^2}{36} + \frac{(34-32)^2}{34} + \frac{(24-14)^2}{24} + \frac{(14-22)^2}{14} + \frac{(8-10)^2}{8} + \frac{(22-20)^2}{22} = 12.15 < 12.592$$
Thus the Gibson Mix probabilities are accepted.
* Keep in mind the meaning of the test. It has not demonstrated the accuracy of the Gibson Mix, only that, at the 0.05 level, the sample value is not significant and the Gibson Mix probabilities are therefore not rejected.
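SciPy's chisquare can run the same comparison directly when the expected counts are supplied through f_exp; this is an illustrative alternative to the hand calculation above, not the method used in the notes.

```python
from scipy.stats import chisquare

p = [0.31, 0.18, 0.17, 0.12, 0.07, 0.04, 0.11]   # Gibson Mix probabilities
observed = [72, 30, 32, 14, 22, 10, 20]          # sample of 200 instructions
expected = [200 * pi for pi in p]                # 62, 36, 34, 24, 14, 8, 22

result = chisquare(f_obs=observed, f_exp=expected)
print(result.statistic)   # ~12.15
print(result.pvalue)      # > 0.05, so the Gibson Mix probabilities are not rejected
```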
Example: A computer system has six I/O channels and the system personnel are
reasonably certain that the load on the channels is balanced. If 𝑋 is the random
variable denoting the index of the channel to which a given I/O operation is
directed, then its pmf is assumed to be
$$p_X(i) = p_i = \tfrac{1}{6}, \qquad i = 0, 1, \ldots, 5.$$
Out of n = 150 I/O operations observed, the numbers of operations directed to the various channels were:
n₀ = 22, n₁ = 23, n₂ = 29, n₃ = 31, n₄ = 26, n₅ = 19

We wish to test the hypothesis that the load on the channels is balanced; that is,
$$H_0: p_i = \tfrac{1}{6}, \quad i = 0, 1, 2, \ldots, 5.$$
With each expected count equal to 150/6 = 25,
$$\chi^2_5 = \frac{(22-25)^2}{25} + \frac{(23-25)^2}{25} + \frac{(29-25)^2}{25} + \frac{(31-25)^2}{25} + \frac{(26-25)^2}{25} + \frac{(19-25)^2}{25} = 4.08 < 11.1$$
Therefore we cannot reject the null hypothesis that the channels are load-balanced.
Kolmogorov-Smirnov Test
The K-S test is applicable to unbinned distributions that are functions of a single independent variable, that is, to data sets where each data point can be associated with a single number. In such cases, the list of data points can easily be converted to an unbiased estimate S_N(x) of the cumulative distribution function (CDF) of the probability distribution from which it was drawn.

If the N events are located at values xᵢ, i = 1, 2, ..., N, then S_N(x) is the function giving the fraction of data points to the left of a given value of x. This function is obviously constant between consecutive xᵢ's and jumps by the same constant 1/N at each xᵢ, as shown below.
[Figure: the empirical CDF S_N(x) plotted as a step function alongside a continuous CDF.]
The K-S test compares the continuous CDF, F(x), of the uniform distribution to the empirical CDF, S_N(x), of the sample of N observations. By definition,
$$F(x) = x, \qquad 0 \le x \le 1.$$
If the sample from the random-number generator is R₁, R₂, ..., R_N, then the empirical CDF S_N(x) is defined by
$$S_N(x) = \frac{\text{number of } R_1, R_2, \ldots, R_N \le x}{N}$$
As N becomes larger, S_N(x) should become a better approximation to F(x), provided that the null hypothesis is true.

Different distribution functions, or sets of data, give different CDF estimates by the above procedure. However, all CDFs agree at the smallest allowable value of x (where they are zero) and the largest allowable value of x (where they are unity). So it is the behavior between the largest and smallest values that distinguishes distributions.

One can think of any number of statistics to measure the overall difference between two CDFs: the absolute value of the area between them, for example, or their integrated mean-square difference. The K-S statistic D is a particularly simple measure: it is defined as the maximum value of the absolute difference between two CDFs. Thus, comparing one data set's S_N(x) to a known CDF F(x), the K-S statistic is
$$D = \max_{-\infty < x < \infty} \left| S_N(x) - F(x) \right|$$
What makes the K-S statistic useful is that its distribution in the case of the null hypothesis (data sets drawn from the same distribution) can be calculated, at least to a useful approximation, thus giving the significance of any observed nonzero value of D. The function that enters into the calculation of the significance can be written as the following sum:
$$Q_{K\text{-}S}(\lambda) = 2 \sum_{j=1}^{\infty} (-1)^{j-1} e^{-2 j^2 \lambda^2}$$
which is a monotonic function with limiting values Q_{K-S}(0) = 1 and Q_{K-S}(∞) = 0. In terms of this function, the significance level of an observed value of D (as a disproof of the null hypothesis that the distributions are the same) is given by the formula
$$\text{Probability}(D > \text{observed}) = Q_{K\text{-}S}\left(\sqrt{N}\, D\right)$$
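The series for Q_{K-S} converges very quickly, so it can be evaluated by truncating once the terms become negligible. The sketch below (the function name q_ks and the truncation tolerance are my own choices) applies the slide's approximation P(D > observed) ≈ Q_{K-S}(√N · D).

```python
import math

def q_ks(lam, terms=100, tol=1e-12):
    """Q_KS(lambda) = 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 j^2 lambda^2), truncated."""
    if lam <= 0:
        return 1.0
    total = 0.0
    for j in range(1, terms + 1):
        term = (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        total += term
        if abs(term) < tol:
            break
    return 2.0 * total

# Approximate significance of an observed K-S statistic D for sample size N.
N, D = 10, 0.1658
print(q_ks(math.sqrt(N) * D))   # well above 0.05, consistent with not rejecting H0
```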
For testing against a uniform CDF, the test procedure follows these steps:

Step 1: Rank the data from smallest to largest. Let Rᵢ denote the i-th smallest observation, so that
$$R_1 \le R_2 \le R_3 \le \cdots \le R_N$$

Step 2: Compute
$$D^+ = \max_{1 \le i \le N} \left( \frac{i}{N} - R_i \right), \qquad D^- = \max_{1 \le i \le N} \left( R_i - \frac{i-1}{N} \right)$$

Step 3: Compute
$$D = \max(D^+, D^-)$$

Step 4: Determine the critical value D_α from the table for the specified significance level α and the given sample size N.

Step 5: If the sample statistic D is greater than D_α, the null hypothesis that the data are a sample from a uniform distribution is rejected. If D ≤ D_α, conclude that no difference has been detected between the true distribution of R₁, R₂, ..., R_N and the uniform distribution.
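Steps 1–3 amount to a few lines of plain Python; the function name ks_statistic_uniform and the toy sample are illustrative only.

```python
def ks_statistic_uniform(sample):
    """Compute D = max(D+, D-) for a sample tested against the Uniform(0, 1) CDF."""
    r = sorted(sample)                                   # Step 1: rank the data
    n = len(r)
    d_plus = max((i + 1) / n - r[i] for i in range(n))   # Step 2: D+ = max(i/N - R_i)
    d_minus = max(r[i] - i / n for i in range(n))        #         D- = max(R_i - (i-1)/N)
    return max(d_plus, d_minus)                          # Step 3

print(ks_statistic_uniform([0.05, 0.14, 0.44, 0.81, 0.93]))   # 0.26 for this toy sample
```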
Example: Given the random sample of size n = 10:
0.3404, 0.1440, 0.6960, 0.8675, 0.5649, 0.5793, 0.1514, 0.5044, 0.9859, 0.4658
we wish to test the hypothesis that the population distribution function is the uniform distribution over (0, 1).

Solution: The following table details the procedure.

i               1        2        3        4        5        6        7        8        9       10
Rᵢ              0.1440   0.1514   0.3407   0.4658   0.5044   0.5649   0.5793   0.6960   0.8675   0.9859
i/N             0.10     0.20     0.30     0.40     0.50     0.60     0.70     0.80     0.90     1.00
(i − 1)/N       0.00     0.10     0.20     0.30     0.40     0.50     0.60     0.70     0.80     0.90
i/N − Rᵢ        --       0.0486   --       --       --       0.0351   0.1207   0.1040   0.0325   0.0141
Rᵢ − (i − 1)/N  0.1440   0.0514   0.1407   0.1658   0.1044   0.0649   --       --       0.0675   0.0859

Therefore D = max(D⁺, D⁻) = 0.1658.

Using the table of critical values, we find that at α = 0.05 and N = 10, D_α = 0.41. Since D ≤ D_α, we do not reject the null hypothesis at the 5% level of significance.
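As a cross-check (SciPy is not used in the notes, so this is only an aside), scipy.stats.kstest against the uniform distribution returns the same statistic for these ten observations.

```python
from scipy.stats import kstest

sample = [0.3404, 0.1440, 0.6960, 0.8675, 0.5649, 0.5793, 0.1514, 0.5044, 0.9859, 0.4658]
result = kstest(sample, "uniform")    # two-sided K-S test against Uniform(0, 1)
print(result.statistic)               # ~0.1658, as computed by hand
print(result.pvalue)                  # well above 0.05, so H0 is not rejected
```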
