Test of Hypothesis: Distribution
χ²-Distribution
Here $\nu = k - 1$ is the number of degrees of freedom (d.f.), where $k$ is the number of pairs of frequencies to be compared.
[Figure: density $f(\chi^2)$ plotted against $\chi^2$]
• Goodness-of-fit test – an inferential procedure used to determine
whether a categorical frequency distribution follows a claimed
distribution.
[Figure: χ² density showing the P-value and the critical region to the right of $\chi^2_\alpha$]
Test statistic:
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ is the observed count for the ith category and $E_i$ is the expected count for the ith category.
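The test statistic can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square_statistic` and the three-category counts are hypothetical and do not come from the slides.

```python
def chi_square_statistic(observed, expected):
    """Return the sum over categories of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three equally likely categories (illustration only).
observed = [18, 22, 20]
expected = [20.0, 20.0, 20.0]

stat = chi_square_statistic(observed, expected)
print(round(stat, 4))
```

A larger value of the statistic means the observed counts lie further from the claimed distribution.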
Hypotheses:
H0: Motor vehicle accidents involving cell phones are equally
likely to occur on every day of the week
Conditions:
Category            %     O      E       (O − E)²   (O − E)²/E
Bride and groom    50    55     51.50    12.2500    0.2379
Bride only         12    12     12.36     0.1296    0.0105
Groom only         14    12     14.42     5.8564    0.4061
Neither            24    24     24.72     0.5184    0.0210
Total             100   103    103.00               0.6755
χ² = 0.6755
[Figure: χ² density from 0 with the rejection region to the right of the critical value 11.34]
7. Make your decision.
The test statistic 0.6755 does not fall in the rejection region, so fail
to reject H0.
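The table and the decision above can be reproduced with a short sketch. It assumes the expected counts are the claimed percentages applied to the 103 observations, and it takes the critical value 11.34 directly from the slide. The variable names are illustrative.

```python
# Claimed percentages and observed counts, copied from the table.
claimed_pct = {"Bride and groom": 50, "Bride only": 12,
               "Groom only": 14, "Neither": 24}
observed = {"Bride and groom": 55, "Bride only": 12,
            "Groom only": 12, "Neither": 24}

n = sum(observed.values())  # total number of observations
expected = {k: n * pct / 100 for k, pct in claimed_pct.items()}

# Sum of (O - E)^2 / E over the four categories.
chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

critical_value = 11.34  # taken from the slide, not computed here
reject_h0 = chi2 > critical_value

print(n, round(chi2, 4), reject_h0)
```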
x_i   f_i      x_i   f_i
 0    12        6     7
 1    10        7     5
 2    19        8     5
 3    17        9     3
 4    10       10     3
 5     8       11     1
The mean can be obtained as:
$$\bar{x} = \alpha = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{364}{100} = 3.64$$
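As a quick arithmetic check, the frequency-weighted mean can be computed from the tabulated data. The sketch assumes the first column of each pair in the table is the value $x_i$ and the second its frequency $f_i$.

```python
# Values x_i = 0..11 and their frequencies f_i, read from the table.
values = list(range(12))
freqs = [12, 10, 19, 17, 10, 8, 7, 5, 5, 3, 3, 1]

total = sum(freqs)                                       # sum of f_i
weighted_sum = sum(f * x for f, x in zip(freqs, values))  # sum of f_i * x_i
mean = weighted_sum / total

print(total, weighted_sum, mean)
```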
Since the histogram of the data appeared to follow a Poisson distribution with
mean 3.64 (determined from the data), the following hypotheses may be stated:
$H_0$: The random variable is Poisson distributed.
$H_1$: The random variable is not Poisson distributed.
f(0) = 0.026      f(6)  = 0.085
f(1) = 0.096      f(7)  = 0.044
f(2) = 0.174      f(8)  = 0.020
f(3) = 0.211      f(9)  = 0.008
f(4) = 0.192      f(10) = 0.003
f(5) = 0.140      f(11) = 0.001
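These probabilities are consistent with the Poisson probability mass function $f(k) = e^{-\alpha}\alpha^k / k!$ evaluated at $\alpha = 3.64$. A minimal sketch, where the helper name `poisson_pmf` is an assumption of this example:

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of observing exactly k events."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


alpha = 3.64  # sample mean estimated from the data
probs = [round(poisson_pmf(k, alpha), 3) for k in range(12)]

for k, p in enumerate(probs):
    print(f"f({k}) = {p:.3f}")
```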
With this information the following table was constructed. The value of $E_1$ is given by
$n p_0 = 100 \times 0.026 = 2.6$. The remaining $E_i$ values are determined in the same
way. Since $E_1 < 5$, $E_1$ and $E_2$ are combined. In that case $O_1$ and $O_2$ are
also combined and $k$ is reduced by 1. The last five class intervals are
combined for the same reason, and $k$ is further reduced by four, leaving $k = 12 - 1 - 4 = 7$ classes.
[Table: observed and expected frequencies for the combined classes; column totals 100 and 27.68]
The calculated $\chi^2 = 27.68$.
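The pooling and the resulting statistic can be reproduced as follows. This sketch uses the rounded probabilities listed earlier and groups the classes exactly as the text describes (the first two together, the last five together). The helper `pool` and the `groups` list are illustrative choices, not part of the slides.

```python
observed = [12, 10, 19, 17, 10, 8, 7, 5, 5, 3, 3, 1]
probs = [0.026, 0.096, 0.174, 0.211, 0.192, 0.140,
         0.085, 0.044, 0.020, 0.008, 0.003, 0.001]

n = sum(observed)
expected = [n * p for p in probs]  # E_i = n * p_i, so the first is 2.6


def pool(values, groups):
    """Add up the entries of `values` inside each index group."""
    return [sum(values[i] for i in group) for group in groups]


# Classes 0 and 1 are merged, and classes 7 to 11 are merged.
groups = [[0, 1], [2], [3], [4], [5], [6], [7, 8, 9, 10, 11]]
pooled_obs = pool(observed, groups)
pooled_exp = pool(expected, groups)

chi2 = sum((o - e) ** 2 / e for o, e in zip(pooled_obs, pooled_exp))
print(pooled_obs)
print([round(e, 1) for e in pooled_exp])
print(round(chi2, 2))
```

Carried to full precision the sum is about 27.687, which agrees with the slide's 27.68 up to rounding.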
$$\sum_{i=1}^{r} p_i t_i$$
where $p_i$ is the probability of calling instruction $I_i$ and $t_i$ is the execution time of
$I_i$. A specific set of instructions $I_1, I_2, \dots, I_r$ together with their occurrence
probabilities constitutes an instruction mix. The execution time rating
depends on the accuracy of the mix. One mix in use is the Gibson Mix, given in
the table below, with the occurrence probabilities listed in the $p_i$ column.
Consider the problem of determining whether the Gibson Mix is suitable in a particular
environment. Since there are $r = 7$ classifications in the mix, the appropriate
hypothesis test is one in which the null hypothesis specifies the 7 cell probabilities.
The expected and observed frequencies are given in the $E_i$ and $O_i$ columns of the table.
The observed frequencies correspond to a sample of 200 instructions that have been
randomly selected from typical programs in use. Thus, for each $i$, $E_i = 200 p_i$. The
$\chi^2$ test is applicable with $7 - 1 = 6$ degrees of freedom.
1. Hypothesis test: $H_0: p_1 = 0.31,\ p_2 = 0.18,\ p_3 = 0.17,\ p_4 = 0.12,\ p_5 = 0.07,\ p_6 = 0.04,\ p_7 = 0.11$
2. Level of significance: $\alpha = 0.05$
3. Test statistic: $\chi^2_6$
4. Sample size: $n = 200$
5. Critical value: $\chi^2_{0.05,6} = 12.592$
Instruction type                    Prob. of instruction   Expected freq.   Obs. freq.
                                    type (p_i)             (E_i)            (O_i)
Transfer to and from main memory    0.31                   62               72
Indexing                            0.18                   36               30
Branching                           0.17                   34               32
Floating-point arithmetic           0.12                   24               14
Fixed-point arithmetic              0.07                   14               22
Shifting                            0.04                    8               10
Miscellaneous                       0.11                   22               20
$$\chi^2_6 = \frac{(62-72)^2}{62} + \frac{(36-30)^2}{36} + \frac{(34-32)^2}{34} + \frac{(24-14)^2}{24} + \frac{(14-22)^2}{14} + \frac{(8-10)^2}{8} + \frac{(22-20)^2}{22} = 12.15 < 12.592$$
Thus the Gibson Mix probabilities are accepted.
* Keep in mind the meaning of the test. It has not demonstrated the accuracy of the Gibson Mix, only
that, at the 0.05 level, the sample value is not significant and the Gibson Mix probabilities are therefore
not rejected.
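The Gibson Mix calculation can be checked with a short sketch. The probabilities, observed counts and the critical value 12.592 are taken from the slides. The tuple layout and variable names are only illustrative.

```python
# (instruction type, p_i, observed count O_i) as given in the table.
gibson_mix = [
    ("Transfer to and from main memory", 0.31, 72),
    ("Indexing", 0.18, 30),
    ("Branching", 0.17, 32),
    ("Floating-point arithmetic", 0.12, 14),
    ("Fixed-point arithmetic", 0.07, 22),
    ("Shifting", 0.04, 10),
    ("Miscellaneous", 0.11, 20),
]

n = sum(obs for _, _, obs in gibson_mix)  # sample size, 200 instructions

chi2 = 0.0
for name, p, obs in gibson_mix:
    exp = n * p                        # E_i = 200 * p_i
    term = (obs - exp) ** 2 / exp      # contribution of this category
    chi2 += term
    print(f"{name:34s} E={exp:5.1f} O={obs:3d} term={term:.4f}")

critical_value = 12.592  # chi-square(0.05, 6), quoted from the slide
print(round(chi2, 2), chi2 < critical_value)
```

Printing the individual terms shows that floating-point and fixed-point arithmetic contribute most of the statistic, which is why the result falls only slightly below the critical value.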
Example: A computer system has six I/O channels and the system personnel are
reasonably certain that the load on the channels is balanced. If $X$ is the random
variable denoting the index of the channel to which a given I/O operation is
directed, then its pmf is assumed to be:
$$p_X(i) = p_i = \frac{1}{6}, \quad i = 0, 1, \dots, 5.$$
Out of $n = 150$ I/O operations observed, the numbers of operations directed to
the various channels were:
$n_0 = 22$, $n_1 = 23$, $n_2 = 29$, $n_3 = 31$, $n_4 = 26$, $n_5 = 19$
We wish to test the hypothesis that the load on the channels is balanced; that is,
$$H_0: p_i = \frac{1}{6}, \quad i = 0, 1, 2, \dots, 5.$$
$$\chi^2_5 = \frac{(22-25)^2}{25} + \frac{(23-25)^2}{25} + \frac{(29-25)^2}{25} + \frac{(31-25)^2}{25} + \frac{(26-25)^2}{25} + \frac{(19-25)^2}{25} = 4.08 < 11.1$$
Therefore we cannot reject the null hypothesis that the channels are load balanced.
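For a uniform null hypothesis every expected count is the same, $E_i = n/6 = 25$, so the check reduces to a few lines. The sketch below uses the counts and the critical value 11.1 quoted above. The variable names are illustrative.

```python
counts = [22, 23, 29, 31, 26, 19]  # operations observed on channels 0..5

n = sum(counts)
expected = n / len(counts)         # balanced load: n / 6 per channel

chi2 = sum((c - expected) ** 2 / expected for c in counts)

critical_value = 11.1              # chi-square(0.05, 5), quoted from the slide
print(n, expected, round(chi2, 2), chi2 < critical_value)
```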
Kolmogorov-Smirnov Test
The K-S test is applicable to unbinned distributions that are functions of a single
independent variable, that is, to data sets where each data point can be associated
with a single number. In such cases, the list of data points can easily be converted
to an unbiased estimator $S_N(x)$ of the cumulative distribution function (CDF) of
the probability distribution from which it was drawn:
[Figure: CDF]
The K-S test compares the continuous CDF, $F(x)$, of the uniform distribution to the
empirical CDF, $S_N(x)$, of the sample of $N$ observations. By definition
$$F(x) = x, \quad 0 \le x \le 1.$$
If the sample from the random-number generator is $R_1, R_2, \dots, R_N$, then the
empirical CDF, $S_N(x)$, is defined by
$$S_N(x) = \frac{\text{number of } R_1, R_2, \dots, R_N \text{ that are} \le x}{N}$$
As $N$ becomes larger, $S_N(x)$ should become a better approximation to $F(x)$,
provided that the null hypothesis is true.
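The definition of $S_N(x)$ translates directly into code. In this sketch the function name `empirical_cdf` and the four-point sample are hypothetical and serve only to illustrate the definition.

```python
def empirical_cdf(sample, x):
    """Fraction of observations in `sample` that are <= x."""
    return sum(1 for r in sample if r <= x) / len(sample)


# Hypothetical sample (illustration only).
sample = [0.25, 0.50, 0.75, 1.00]

for x in (0.10, 0.50, 0.80, 1.00):
    # For the uniform distribution on (0, 1), F(x) = x.
    print(x, empirical_cdf(sample, x), "F(x) =", x)
```

$S_N(x)$ is a step function that jumps by $1/N$ at each observation, which is why the K-S statistic only needs to be evaluated at the sample points.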
Step-1: Rank the data from smallest to largest. Let $R_i$ denote the $i$-th smallest
observation, so that
$$R_1 \le R_2 \le R_3 \le \dots \le R_N$$
Step-2: Compute
$$D^+ = \max_{1 \le i \le N} \left\{ \frac{i}{N} - R_i \right\}, \qquad D^- = \max_{1 \le i \le N} \left\{ R_i - \frac{i-1}{N} \right\}$$
Step-3: Compute
$$D = \max(D^+, D^-)$$
Step-4: Determine the critical value $D_\alpha$ from the table for the specified
significance level $\alpha$ and the given sample size $N$.
Step-5: If the sample statistic $D > D_\alpha$, the null hypothesis that the data are a
sample from a uniform distribution is rejected. If $D \le D_\alpha$, conclude that no
difference has been detected between the true distribution of $R_1, R_2, \dots, R_N$ and the
uniform distribution.
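Steps 1 to 3 can be sketched as a single function. The name `ks_uniform_statistic` and the five-point sample are hypothetical. Steps 4 and 5 still require a critical value $D_\alpha$ looked up in a K-S table, which this sketch does not supply.

```python
def ks_uniform_statistic(sample):
    """Return (D+, D-, D) for a test against the uniform(0, 1) CDF."""
    ordered = sorted(sample)            # Step 1: rank the data
    n = len(ordered)
    # Step 2: largest gaps above and below the line F(x) = x.
    d_plus = max(i / n - r for i, r in enumerate(ordered, start=1))
    d_minus = max(r - (i - 1) / n for i, r in enumerate(ordered, start=1))
    # Step 3: the test statistic is the larger of the two.
    return d_plus, d_minus, max(d_plus, d_minus)


# Hypothetical sample (illustration only).
sample = [0.44, 0.81, 0.14, 0.05, 0.93]
d_plus, d_minus, d = ks_uniform_statistic(sample)
print(round(d_plus, 4), round(d_minus, 4), round(d, 4))
```

The returned `d` would then be compared with the tabulated $D_\alpha$ for the chosen $\alpha$ and $N$.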
# Given the random sample of size $n = 10$:
0.3404, 0.1440, 0.6960, 0.8675, 0.5649, 0.5793, 0.1514, 0.5044, 0.9859, 0.4658
We wish to test the hypothesis that the population distribution function is the
uniform distribution over (0,1).
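 i    R_i      i/N    (i−1)/N   i/N − R_i   R_i − (i−1)/N
 1    0.1440   0.10   0.00      --          0.1440
 2    0.1514   0.20   0.10      0.0486      0.0514
 3    0.3404   0.30   0.20      --          0.1404
 4    0.4658   0.40   0.30      --          0.1658
 5    0.5044   0.50   0.40      --          0.1044
 6    0.5649   0.60   0.50      0.0351      0.0649
 7    0.5793   0.70   0.60      0.1207      --
 8    0.6960   0.80   0.70      0.1040      --
 9    0.8675   0.90   0.80      0.0325      0.0675
10    0.9859   1.00   0.90      0.0141      0.0859
(-- marks a negative difference, which cannot be the maximum.)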
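The table entries and the resulting statistic can be recomputed from the ten sample values. This is a sketch only: the critical value 0.410 for $\alpha = 0.05$ and $N = 10$ is an assumption taken from standard K-S tables, not from the slides, so the decision at the end is illustrative.

```python
sample = [0.3404, 0.1440, 0.6960, 0.8675, 0.5649,
          0.5793, 0.1514, 0.5044, 0.9859, 0.4658]

ordered = sorted(sample)
n = len(ordered)

# One row per ranked observation: (i, R_i, i/N - R_i, R_i - (i-1)/N).
rows = [(i, r, i / n - r, r - (i - 1) / n)
        for i, r in enumerate(ordered, start=1)]

for i, r, up, down in rows:
    print(f"{i:2d}  {r:.4f}  {up:+.4f}  {down:+.4f}")

d_plus = max(up for _, _, up, _ in rows)
d_minus = max(down for _, _, _, down in rows)
d = max(d_plus, d_minus)

d_alpha = 0.410  # ASSUMED critical value (alpha = 0.05, N = 10) from a standard table
print(round(d_plus, 4), round(d_minus, 4), round(d, 4), d > d_alpha)
```

Under that assumed critical value, $D = 0.1658$ is well below $D_\alpha$, so the hypothesis of a uniform distribution over (0, 1) would not be rejected.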