ASM_Mid Sem_slide 2 to 8
Pilani Campus
“It is a good idea to check for outliers before making
decisions based on data analysis. Errors are often made
in recording data and entering data into the computer.
Outliers should not necessarily be deleted, but their
accuracy and appropriateness should be verified.”
Empirical rule
Applies to data that have a bell-shaped distribution.
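The percentages behind the empirical rule can be checked numerically. The sketch below assumes an exactly normal distribution (the idealized bell shape) and uses Python's standard-library `NormalDist`; the helper name `coverage` is only illustrative.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
z = NormalDist()

def coverage(k: float) -> float:
    """Probability that a normal value lies within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s) of the mean: {coverage(k):.4f}")
```

Under this assumption, roughly 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean, which is why a value more than 3 standard deviations away is often treated as a candidate outlier.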
• Is the outlier due to a measurement error or a data-entry error? Then it is noise and should be dropped (or corrected, if you know the real value of the data).
• Does the outlier leave the results unchanged but affect the assumptions? In this case, you may or may not drop the outlier.
• Does the outlier affect both the statistical results and the assumptions? In this case, we cannot merely drop the outlier.

To summarize our discussion:
• Why do you want to find the outlier? You might want to see the outlier because you are interested in the abnormality. Think about what your question is.
• Is the outlier actually causing any problems with the results, influence, or assumptions?
• Where did the outlier come from? Answering this might take in-depth analysis and domain expertise.
• You can’t always tell where an outlier came from, but try to consider different possibilities, because doing so can help inform the best way to proceed.
[6/288/17]
The mean cost of domestic airfares in the USA rose
to an all-time high of $385 per ticket. Airfares
were based on the total ticket value, which
consisted of the price charged by the airlines
plus any additional taxes and fees. Assume
domestic airfares are normally distributed with a
standard deviation of $110.
a. What is the probability that a domestic airfare is
$550 or more?
b. What is the probability that a domestic airfare is
$250 or less?
c. What is the probability that a domestic airfare is
between $300 and $550?
d. What is the cost for the 3% highest domestic
airfares?
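One way to check answers to parts (a) to (d) is sketched below. It uses only the mean ($385) and standard deviation ($110) stated in the problem; the variable names are illustrative, and part (d) is read here as asking for the fare exceeded by only the top 3% of tickets, that is, the 97th percentile.

```python
from statistics import NormalDist

# Airfare model from the problem statement: mean $385, standard deviation $110.
fares = NormalDist(mu=385, sigma=110)

p_a = 1 - fares.cdf(550)               # (a) P(X >= 550)
p_b = fares.cdf(250)                   # (b) P(X <= 250)
p_c = fares.cdf(550) - fares.cdf(300)  # (c) P(300 <= X <= 550)
cutoff_d = fares.inv_cdf(0.97)         # (d) fare at the 97th percentile

print(f"(a) P(X >= 550)        = {p_a:.4f}")
print(f"(b) P(X <= 250)        = {p_b:.4f}")
print(f"(c) P(300 <= X <= 550) = {p_c:.4f}")
print(f"(d) top-3% cutoff      = ${cutoff_d:.2f}")
```

The same numbers follow from standardizing by hand, for example z = (550 - 385) / 110 = 1.5 for part (a), and then reading the standard normal table.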
BITS Pilani, Deemed to be University under Section 3 of UGC Act, 1956
BA ZG524 Advanced Statistical Methods, BITS-Pilani
• The article “A Rapid and Simple Method for Simultaneous Determination of Glycerol,
Fructose, and Glucose in Wine” (American J. of Enology and Viticulture, 2007: 279–283)
includes the following observations on glycerol concentration (mg/mL) for samples of
standard-quality (uncertified) white wines: 2.67, 4.62, 4.14, 3.81, 3.83. Suppose the desired
concentration value is 4.
• Does the sample data suggest that true average concentration is something other than the
desired value?
• Let’s carry out a test of appropriate hypotheses using the one-sample t test with a
significance level of .05.
The accompanying Minitab output from a request to
perform a two-tailed one-sample t test shows
identical calculated values to those just obtained.
Hypotheses
Test statistic
Rejection region (critical value approach)
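The three steps named above can be sketched in a few lines. This is only an illustration under stated assumptions: the hypotheses are taken to be H0: mu = 4 against Ha: mu != 4, the data and significance level come from the problem statement, and the critical value t(0.025, 4) = 2.776 is copied from a standard t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Glycerol concentrations (mg/mL) from the problem statement.
sample = [2.67, 4.62, 4.14, 3.81, 3.83]
mu0 = 4.0     # hypothesized (desired) mean concentration
alpha = 0.05  # significance level from the problem statement

n = len(sample)
x_bar = mean(sample)
s = stdev(sample)                       # sample standard deviation (n - 1 divisor)
t_stat = (x_bar - mu0) / (s / sqrt(n))  # one-sample t statistic

# Two-tailed critical value for alpha = 0.05 and n - 1 = 4 degrees of freedom,
# hard-coded from a standard t table (an assumption of this sketch).
t_crit = 2.776

reject_h0 = abs(t_stat) >= t_crit
print(f"x_bar = {x_bar:.3f}, s = {s:.3f}, t = {t_stat:.3f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

With these data the statistic falls well inside the interval from -2.776 to 2.776, so the sketch does not reject H0 at the .05 level.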
61 64 65 62 90 69 76 79 77 54 64 74 65 65 61 56 63 80 56 71 79 84
Test Score Interval   Observed frequency   Expected frequency   Chi-square test statistic
Less than 55.10       5                    5                    0
55.10 to 59.68        5                    5                    0
59.68 to 63.01        9                    5                    3.2
63.01 to 65.82        6                    5                    0.2
65.82 to 68.42        2                    5                    1.8
68.42 to 71.02        5                    5                    0
71.02 to 73.83        2                    5                    1.8
73.83 to 77.16        5                    5                    0
77.16 to 81.74        5                    5                    0
81.74 and over        6                    5                    0.2
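The last column of the table can be reproduced and summed as follows. The observed and expected frequencies are copied from the table; everything about the decision step is an assumption of this sketch, since the slides do not state it: a normal distribution with mean and standard deviation estimated from the sample (giving 10 - 2 - 1 = 7 degrees of freedom) and a significance level of .05, with the critical value 14.067 taken from a standard chi-square table.

```python
# Observed and expected frequencies copied from the table.
observed = [5, 5, 9, 6, 2, 5, 2, 5, 5, 6]
expected = [5] * 10

# Contribution of each interval: (observed - expected)^2 / expected.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(terms)

# Assumption: two parameters (mean, standard deviation) were estimated
# from the sample, so degrees of freedom = k - p - 1.
k, p = len(observed), 2
df = k - p - 1

# Assumption: alpha = 0.05; upper-tail critical value for df = 7,
# hard-coded from a standard chi-square table.
chi_crit = 14.067

reject_h0 = chi_square >= chi_crit
print(f"chi-square = {chi_square:.1f}, df = {df}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

The per-interval terms sum to 7.2, which is below the assumed critical value, so under these assumptions the hypothesized distribution is not rejected.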
When H0 is true
When H0 is false
Source of variation   Sum of squares   Degrees of freedom
Error                 SSE              n_T - k
Total                 SST              n_T - 1
Example: Reed Manufacturing
Janet Reed would like to know if there is any significant
difference in the mean number of hours worked per week for
the department managers at her three manufacturing plants
(in Buffalo, Pittsburgh, and Detroit). An F test will be
conducted using α = .05.
Comment on your results using the ANOVA table below.

Source of variation   SS    df   MS         F
Plants                490   ?    ?          ?
Error                 ?     ?    25.66667
Total                 ?     ?
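Some of the blank cells follow directly from the two values that survive in the table. The sketch below assumes the 490 is the treatment (Plants) sum of squares and the 25.66667 is the mean square error, and takes k = 3 from the three plants named in the example. The total sample size n_T does not appear in the extracted text, so the error degrees of freedom, SSE, SST, and the critical F value are deliberately left out.

```python
# Values that survive in the partially blank ANOVA table.
sstr = 490.0    # sum of squares due to treatments (Plants row)
mse = 25.66667  # mean square error (Error row)
k = 3           # number of plants: Buffalo, Pittsburgh, Detroit

df_treatments = k - 1        # degrees of freedom for treatments
mstr = sstr / df_treatments  # mean square due to treatments
f_stat = mstr / mse          # F test statistic

print(f"df (Plants) = {df_treatments}, MSTR = {mstr:.1f}, F = {f_stat:.2f}")
```

Completing the remaining cells requires n_T, since the error degrees of freedom are n_T - k and SSE = MSE × (n_T - k).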
Groups     Count   Sum   Average   Variance
A          4       78    19.5      5.666667
B          4       84    21        11.33333
C          4       90    22.5      5.666667
D          4       99    24.75     9.583333
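A one-way ANOVA can be carried out from group summaries alone. The sketch below assumes the four columns for groups A to D are count, sum, average, and variance (the sums and counts are consistent with the averages), and assumes a significance level of .05, with the critical value F(.05; 3, 12) = 3.49 copied from a standard F table.

```python
# Per-group summaries copied from the rows for A to D: (count, mean, variance).
groups = {
    "A": (4, 19.5, 5.666667),
    "B": (4, 21.0, 11.33333),
    "C": (4, 22.5, 5.666667),
    "D": (4, 24.75, 9.583333),
}

n_total = sum(n for n, _, _ in groups.values())
k = len(groups)
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Between-groups (treatment) and within-groups (error) sums of squares.
sstr = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
sse = sum((n - 1) * v for n, _, v in groups.values())

mstr = sstr / (k - 1)
mse = sse / (n_total - k)
f_stat = mstr / mse

# Assumption: alpha = 0.05; critical value for (3, 12) degrees of freedom,
# hard-coded from a standard F table.
f_crit = 3.49

print(f"SSTR = {sstr:.4f}, SSE = {sse:.2f}, F = {f_stat:.3f}")
print("reject H0" if f_stat >= f_crit else "fail to reject H0")
```

Under these assumptions F is about 2.49, which is below 3.49, so the sketch does not find a significant difference among the four group means.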
Groups     Count   Sum   Average   Variance
West End   5       107   21.4      7.3
ANOVA
Total 229.2 19