L04-Pengujian Hipotesis - r1


DATA ANALYTICS FOR BUSINESS

Testing Statistical Hypothesis

Dr. Uka Wikarya

MAGISTER AKUNTANSI
FAKULTAS EKONOMI DAN BISNIS
UNIVERSITAS INDONESIA
1
Outline of Presentation Material
1. Basic Idea of Hypothesis Testing
2. Hypothesis Testing for Population Mean
• Two-Tailed Z Test of Mean (σ Known)
• One-Tailed Z Test of Mean (σ Known)
• Two-Tailed t Test of Mean (σ Unknown)
3. Hypothesis Testing for Population Proportion
4. Hypothesis Testing for Two Population Means and Proportion
5. Decision Making Risks

2
Hypothesis Testing
[Figure: a researcher states "I believe the population mean age is 50" (the hypothesis). A random sample drawn from the population has mean X̄ = 20, which is not close to 50, so the hypothesis is rejected.]
3
Stating a Hypothesis

A hypothesis is a belief about a population parameter.
• The parameter is a population mean, proportion, or variance
• The hypothesis must be stated before analysis

Example: "I believe the mean of electricity expense is more than Rp250000 per month."

4
Basic Idea
[Figure: sampling distribution of the sample mean, centered at the hypothesized value μ = 50 (H0), with the observed sample mean of 20 far out in the lower tail.]

It is unlikely that we would get a sample mean of this value (20) if in fact μ = 50 were the population mean; therefore, we reject the hypothesis that μ = 50.
5
Hypothesis Testing

Hypothesis and Hypothesis Testing


HYPOTHESIS. A statement about the value of a population parameter developed for
the purpose of testing.

HYPOTHESIS TESTING. A procedure based on sample evidence and probability
theory to determine whether the hypothesis is a reasonable statement.

Null and Alternate Hypothesis

NULL HYPOTHESIS (H0). A statement about the value of a population
parameter developed for the purpose of testing numerical evidence.

ALTERNATE HYPOTHESIS (Ha). A statement that is accepted if the
sample data provide sufficient evidence that the null hypothesis is false.

6
Test Statistic versus Critical Value

TEST STATISTIC. A value, determined from sample information, used to
determine whether to reject the null hypothesis.

Examples: z, t, F, χ²

CRITICAL VALUE. The dividing point between the region where the null
hypothesis is rejected and the region where it is not rejected.
Level of Significance
1. A probability
2. Defines unlikely values of the sample statistic if the null
hypothesis is true
• Called the rejection region of the sampling
distribution
3. Designated α (alpha)
• Typical values are .01, .05, .10
4. Selected by the researcher at the start

8
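Rejection Region (One-Tail Test)
[Figure: sampling distribution of the sample statistic, centered at the H0 value. The critical value separates the rejection region (area α) in one tail from the nonrejection region (area 1 − α, the level of confidence). The observed sample statistic is compared with the critical value.]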

9
Rejection Regions (Two-Tailed Test)
[Figure: sampling distribution of the sample statistic, centered at the H0 value. Two critical values separate the rejection regions (area 1/2 α in each tail) from the nonrejection region (area 1 − α, the level of confidence). The observed sample statistic is compared with the critical values.]
11
[Figure: Two-Tail Test in the Standard Normal Distribution (Z)]
[Figure: One-Tail Test in the Standard Normal Distribution (Z)]
One-Tailed Z Test: Finding Critical Z

What is Z given α = .025?
The area between 0 and the critical value is .500 − .025 = .475. Looking up .475 in the body of the table gives Z = 1.96.

Standardized Normal Probability Table (Portion)

Z      .05     .06     .07
1.6   .4505   .4515   .4525
1.7   .4599   .4608   .4616
1.8   .4678   .4686   .4693
1.9   .4744   .4750   .4756

16
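The table lookup above can also be reproduced numerically. The following is a minimal sketch using Python's standard library; the helper name `critical_z` and its `tails` parameter are illustrative assumptions, not part of the slides.

```python
from statistics import NormalDist


def critical_z(alpha: float, tails: int = 1) -> float:
    """Return the positive critical Z for significance level alpha.

    tails=1 places all of alpha in one tail; tails=2 splits it in half.
    (Hypothetical helper, named here only for illustration.)
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    tail_area = alpha / tails
    # inv_cdf is the inverse of the cumulative standard normal distribution.
    return NormalDist().inv_cdf(1.0 - tail_area)


print(round(critical_z(0.025), 2))           # area .025 in one tail
print(round(critical_z(0.05, tails=2), 2))   # alpha = .05 split over two tails
print(round(critical_z(0.05), 3))            # one-tailed, alpha = .05
print(round(critical_z(0.01), 2))            # one-tailed, alpha = .01
```

The four calls correspond to the critical values 1.96, 1.96, 1.645, and 2.33 used in the worked examples of this deck.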
Hypothesis Setups for Testing a Mean (μ)

17
Two-Tailed Z Test
of Mean (σ Known)

18
Two-Tailed Z Test for Mean (σ Known)

1. Assumptions
• Population is normally distributed
• If not normal, the sampling distribution can be approximated by
a normal distribution when n ≥ 30
2. Alternative hypothesis has ≠ sign

3. Z-test statistic

$Z = \frac{\bar{X} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
19
Two-Tailed Z Test Example

Does an average box of cereal contain 368 grams
of cereal?
A random sample of 25 boxes showed x̄ = 372.5.
The company has specified σ to be 15 grams.
Test at the .05 level of significance.
20
Two-Tailed Z Test Solution

• H0: μ = 368
• Ha: μ ≠ 368
• α = .05
• n = 25
• Critical value(s): Z = ±1.96 (reject H0 if Z < −1.96 or Z > 1.96; area .025 in each tail)

Test statistic:
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{372.5 - 368}{15 / \sqrt{25}} = 1.50$

Decision: Do not reject H0 at α = .05
Conclusion: No evidence that the average is not 368
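The solution steps above can be sketched in code. This is a minimal illustration using only the standard library; the function name `two_tailed_z_test` and its return layout are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha):
    """Return (z, z_crit, reject) for H0: mu = mu0 vs Ha: mu != mu0.

    sigma is the known population standard deviation.
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Two-tailed test: alpha is split equally between both tails.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z, z_crit, abs(z) > z_crit


# Cereal example: x-bar = 372.5, mu0 = 368, sigma = 15, n = 25, alpha = .05
z, z_crit, reject = two_tailed_z_test(372.5, 368, 15, 25, 0.05)
print(f"Z = {z:.2f}, critical values = ±{z_crit:.2f}, reject H0: {reject}")
```

Because |Z| = 1.50 is smaller than 1.96, the sketch reaches the same "do not reject" decision as the slide.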
21
One-Tailed Z Test of Mean
(σ Known)

22
One-Tailed Z Test for Mean (σ Known)

1. Assumptions
• Population is normally distributed
• If not normal, the sampling distribution can be approximated by
a normal distribution when n ≥ 30
2. Alternative hypothesis has < or > sign

3. Z-test statistic

$Z = \frac{\bar{X} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
23
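One-Tailed Z Test for Mean Hypothesis

Left-tailed test: H0: μ ≥ 0, Ha: μ < 0. The rejection region (area α) is in the lower tail; the sample result must be significantly below μ to reject H0.

Right-tailed test: H0: μ ≤ 0, Ha: μ > 0. The rejection region (area α) is in the upper tail; small values satisfy H0, so don't reject!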

24
One-Tailed Z Test Example

Does an average box of cereal contain more than
368 grams of cereal?

A random sample of 25 boxes showed x̄ = 372.5.
The company has specified σ to be 15 grams. Test at
the .05 level of significance.
25
One-Tailed Z Test Solution

• H0: μ = 368
• Ha: μ > 368
• α = .05
• n = 25
• Critical value(s): Z = 1.645 (reject H0 if Z > 1.645; area .05 in the upper tail)

Test statistic:
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{372.5 - 368}{15 / \sqrt{25}} = 1.50$

Decision: Do not reject H0 at α = .05
Conclusion: No evidence that the average is more than 368
26
Use the Mean Comparison z-test
Calculator made by Stata v.15

Don’t reject null hypothesis,


thus no evidence average
is more than 368

27
One-Tailed Z Test Thinking Challenge

You’re an analyst for Ford. You want


to find out if the average miles per
gallon of Escorts is at least 32 mpg.
Similar models have a standard
deviation of 3.8 mpg. You take a
sample of 60 Escorts & compute a
sample mean of 30.7 mpg. At the .01
level of significance, is there evidence
that the miles per gallon is at least
32?

28
One-Tailed Z Test Solution*

• H0: μ = 32
• Ha: μ < 32
• α = .01
• n = 60
• Critical value(s): Z = −2.33 (reject H0 if Z < −2.33; area .01 in the lower tail)

Test statistic:
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{30.7 - 32}{3.8 / \sqrt{60}} = -2.65$

Decision: Reject H0 at α = .01
Conclusion: There is evidence that the average is less than 32
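For one-tailed tests the only change is that all of α sits in one tail and the direction follows the alternative hypothesis. A minimal sketch, assuming a hypothetical helper `one_tailed_z_test` with a `tail` argument ("lower" for Ha: μ < μ0, "upper" for Ha: μ > μ0):

```python
from math import sqrt
from statistics import NormalDist


def one_tailed_z_test(sample_mean, mu0, sigma, n, alpha, tail):
    """Return (z, z_crit, reject) for a one-tailed Z test with known sigma."""
    if tail not in ("lower", "upper"):
        raise ValueError("tail must be 'lower' or 'upper'")
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # All of alpha is placed in a single tail.
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    if tail == "lower":
        return z, -z_crit, z < -z_crit
    return z, z_crit, z > z_crit


# Ford example: Ha: mu < 32, sigma = 3.8, n = 60, x-bar = 30.7, alpha = .01
z_ford, crit_ford, reject_ford = one_tailed_z_test(30.7, 32, 3.8, 60, 0.01, "lower")
print(round(z_ford, 2), round(crit_ford, 2), reject_ford)

# Cereal example: Ha: mu > 368, sigma = 15, n = 25, x-bar = 372.5, alpha = .05
z_box, crit_box, reject_box = one_tailed_z_test(372.5, 368, 15, 25, 0.05, "upper")
print(round(z_box, 2), round(crit_box, 3), reject_box)
```

The Ford statistic falls below the lower critical value, so H0 is rejected; the cereal statistic stays below the upper critical value, so H0 is not rejected.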
29
Use the Mean Comparison z-test
Calculator made by Stata v.15

Reject null hypothesis,


thus there is evidence
that average is less
than 32

30
Observed Significance
Levels: p-Values

31
p-Value

1. Probability of obtaining a test statistic more
extreme (≤ or ≥) than the actual sample value,
given H0 is true
2. Called the observed level of significance
• Smallest value of α for which H0 can be
rejected
3. Used to make the rejection decision
• If p-value ≥ α, do not reject H0
• If p-value < α, reject H0

32
Two-Tailed Z Test p-Value Example

Does an average box of cereal contain 368 grams of
cereal?
A random sample of 25 boxes showed x̄ = 372.5.
The company has specified σ to be 15 grams. Find the
p-value.

33
Two-Tailed Z Test p-Value Solution

H0: μ = 368
Ha: μ ≠ 368

Z value of the sample statistic (observed):
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{372.5 - 368}{15 / \sqrt{25}} = 1.50$

The p-value is P(Z ≤ −1.50 or Z ≥ 1.50).
From the Z table, the area between 0 and 1.50 is .4332, so each tail holds .5000 − .4332 = .0668 (1/2 p-value).
p-value = .0668 + .0668 = .1336

(p-value = .1336) ≥ (α = .05). Do not reject H0.
Each half of the p-value (.0668) exceeds 1/2 α = .025, so the test statistic is in the "Do not reject" region.
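The table arithmetic above can be checked directly from the standard normal cumulative distribution. This is a minimal sketch; the function name `z_p_value` and the `alternative` labels are illustrative assumptions.

```python
from statistics import NormalDist


def z_p_value(z: float, alternative: str) -> float:
    """p-value of an observed Z statistic.

    alternative: "two-sided" (Ha: !=), "greater" (Ha: >) or "less" (Ha: <).
    """
    phi = NormalDist().cdf  # cumulative standard normal distribution
    if alternative == "two-sided":
        return 2.0 * (1.0 - phi(abs(z)))
    if alternative == "greater":
        return 1.0 - phi(z)
    if alternative == "less":
        return phi(z)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


alpha = 0.05
p_two = z_p_value(1.50, "two-sided")
print(round(p_two, 4), "reject H0" if p_two < alpha else "do not reject H0")

p_upper = z_p_value(1.50, "greater")
print(round(p_upper, 4), "reject H0" if p_upper < alpha else "do not reject H0")
```

The two-sided call reproduces .1336 and the upper-tail call reproduces .0668 for the cereal example; both exceed α = .05.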

37
One-Tailed Z Test p-Value Example

Does an average box of cereal contain more than
368 grams of cereal?
A random sample of 25 boxes showed x̄ = 372.5.
The company has specified σ to be 15 grams. Find the
p-value.

38
One-Tailed Z Test p-Value Solution

H0: μ = 368
Ha: μ > 368

Z value of the sample statistic:
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{372.5 - 368}{15 / \sqrt{25}} = 1.50$

Use the alternative hypothesis to find the direction: the p-value is P(Z ≥ 1.50).
From the Z table, the area between 0 and 1.50 is .4332, so the upper tail holds .5000 − .4332 = .0668.
p-value = .0668

(p-value = .0668) ≥ (α = .05). Do not reject H0.
The test statistic is in the "Do not reject" region.

42
p-Value: Thinking Challenge

You’re an analyst for Ford. You


want to find out if the average miles
per gallon of Escorts is at least 32
mpg.
Similar models have a standard
deviation of 3.8 mpg. You take a
sample of 60 Escorts & compute a
sample mean of 30.7 mpg. What is
the value of the observed level of
significance (p-Value)?

43
p-Value: Solution*

H0: μ ≥ 32
Ha: μ < 32

Use the alternative hypothesis to find the direction: the p-value is P(Z ≤ −2.65).
From the Z table, the area between 0 and 2.65 is .4960, so the lower tail holds .5000 − .4960 = .0040.
p-value = .004

(p-value = .004) < (α = .01). Reject H0.

44
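Since the p-value is the smallest α at which H0 can be rejected, one p-value settles the decision for every significance level at once. A minimal sketch for the Ford example (the variable names are illustrative assumptions):

```python
from math import sqrt
from statistics import NormalDist

# Ford example: Ha: mu < 32, sigma = 3.8, n = 60, x-bar = 30.7
z = (30.7 - 32) / (3.8 / sqrt(60))
# Lower-tailed alternative, so the p-value is the area to the left of z.
p_value = NormalDist().cdf(z)
print(f"Z = {z:.2f}, p-value = {p_value:.3f}")

# H0 is rejected for every alpha larger than the p-value.
decisions = {alpha: p_value < alpha for alpha in (0.10, 0.05, 0.01, 0.001)}
for alpha, reject in decisions.items():
    print(f"alpha = {alpha}: {'reject H0' if reject else 'do not reject H0'}")
```

With a p-value of about .004, H0 is rejected at α = .10, .05, and .01, but would not be rejected at the stricter α = .001.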
Two-Tailed t Test
of Mean (σ Unknown)

45
t Test for Mean (σ Unknown)
1. Assumptions
• Population is normally distributed
• If not normal, only slightly skewed & large
sample (n ≥ 30) taken
2. Parametric test procedure
3. t test statistic

$t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$

46
Two-Tailed t Test, Finding Critical t Values

Given: n = 3; α = .10
df = n − 1 = 2; α/2 = .05 in each tail
Critical values: t = ±2.920

Critical Values of t Table (Portion)

v     t.10    t.05    t.025
1    3.078   6.314   12.706
2    1.886   2.920    4.303
3    1.638   2.353    3.182

 47
Two-Tailed t Test, Example

Does an average box of cereal contain 368 grams
of cereal? A random
sample of 36 boxes had
a mean of 372.5 and a
standard deviation of 12
grams. Test at the .05
level of significance.

48
Two-Tailed t Test, Solution

• H0: μ = 368
• Ha: μ ≠ 368
• α = .05
• df = 36 − 1 = 35
• Critical value(s): t = ±2.030 (reject H0 if t < −2.030 or t > 2.030; area .025 in each tail)

Test statistic:
$t = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{372.5 - 368}{12 / \sqrt{36}} = 2.25$

Decision: Reject H0 at α = .05
Conclusion: There is evidence that the population average is not 368
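The t statistic is computed the same way as Z, with the sample standard deviation S in place of σ. The sketch below uses only the standard library, so the critical values are typed in from the slides' t table rather than computed (if SciPy is available, `scipy.stats.t.ppf` could supply them instead). The function name `t_statistic` is an illustrative assumption.

```python
from math import sqrt


def t_statistic(sample_mean: float, mu0: float, s: float, n: int) -> float:
    """One-sample t statistic; s is the sample standard deviation."""
    return (sample_mean - mu0) / (s / sqrt(n))


# Cereal example: n = 36, x-bar = 372.5, S = 12, alpha = .05, df = 35
T_CRIT_CEREAL = 2.030  # t(.025, 35), taken from the slide
t_cereal = t_statistic(372.5, 368, 12, 36)
print(round(t_cereal, 2), abs(t_cereal) > T_CRIT_CEREAL)

# Detergent example: n = 64, x-bar = 3.238, S = .117, alpha = .01, df = 63
T_CRIT_DETERGENT = 2.656  # t(.005, 63), taken from the slide
t_detergent = t_statistic(3.238, 3.25, 0.117, 64)
print(round(t_detergent, 2), abs(t_detergent) > T_CRIT_DETERGENT)
```

The first comparison leads to rejecting H0 and the second does not, matching the decisions in the two t-test solutions of this deck.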
49
Use the Mean Comparison t-test
Calculator made by Stata v.15

Reject null hypothesis,


so there is evidence that
population average is not
368

50
Two-Tailed t Test
Thinking Challenge
You work for the FTC. A
manufacturer of detergent claims
that the mean weight of detergent
is 3.25 lb. You take a random
sample of 64 containers. You
calculate the sample average to be
3.238 lb. with a standard deviation
of .117 lb. At the .01 level of
significance, is the manufacturer
correct? 3.25 lb.

51
Two-Tailed t Test
Solution*

• H0: μ = 3.25
• Ha: μ ≠ 3.25
• α = .01
• df = 64 − 1 = 63
• Critical value(s): t = ±2.656 (reject H0 if t < −2.656 or t > 2.656; area .005 in each tail)

Test statistic:
$t = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{3.238 - 3.25}{.117 / \sqrt{64}} = -.82$

Decision: Do not reject H0 at α = .01
Conclusion: There is no evidence that the average is not 3.25
52
Use the Mean Comparison t-test
Calculator made by Stata v.15

Don’t reject null


hypothesis, thus there is
no evidence average is
not 3.25

53
Hypothesis Testing For
Proportion

54
Qualitative Data
1. Qualitative random variables yield
responses that classify
• e.g., Gender (male, female)
2. Measurement reflects number in category
3. Nominal or ordinal scale
4. Examples
• Do you own savings bonds?
• Do you live on-campus or off-campus?

55
Proportions
1. Involve qualitative variables
2. Fraction or percentage of population in a
category
3. If two qualitative outcomes, binomial
distribution
• Possess or don’t possess characteristic

4. Sample Proportion (p)

$p = \frac{x}{n} = \frac{\text{number of successes}}{\text{sample size}}$

56
Sampling Distribution of Proportion
1. Approximated by the normal distribution
• Usable when the interval $np \pm 3\sqrt{np(1 - p)}$ does not include 0 or n
2. Mean
$\mu_p = \pi$
3. Standard error
$\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$, where π = population proportion

[Figure: histogram of the sampling distribution P(p) for p from .0 to 1.0]

57
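Standardizing Sampling Distribution of
Proportion

$Z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$

[Figure: the sampling distribution of p̂, with mean μ_p̂ and standard deviation σ_p̂, is mapped to the standardized normal distribution with μ_Z = 0 and σ_Z = 1.]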
58
One-Sample Z Test for Proportion

1. Assumptions
• Random sample selected from a binomial
population
• Normal approximation can be used if
np ≥ 15 and n(1 − p) ≥ 15

2. Z-test statistic for proportion

$Z = \frac{p - \pi_0}{\sqrt{\frac{\pi_0(1 - \pi_0)}{n}}}$, where π0 is the hypothesized population proportion

59
Hypothesis Setups for Testing a Proportion (π)

60
One-Proportion Z Test Example

The present packaging


system produces 10%
defective cereal boxes.
Using a new system, a
random sample of 200 boxes
had 11 defects. Does the
new system produce fewer
defects? Test at the .05 level
of significance.

61
One-Proportion Z Test Solution

• H0: π = .10
• Ha: π < .10
• α = .05
• n = 200
• Critical value(s): Z = −1.645 (reject H0 if Z < −1.645; area .05 in the lower tail)

Test statistic:
$Z = \frac{p - \pi_0}{\sqrt{\frac{\pi_0(1 - \pi_0)}{n}}} = \frac{\frac{11}{200} - .10}{\sqrt{\frac{(.10)(.90)}{200}}} = -2.12$

Decision: Reject H0 at α = .05
Conclusion: There is evidence that the new system produces fewer than 10% defective boxes
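The proportion test follows the same pattern, with the standard error built from the hypothesized proportion π0. A minimal sketch, assuming hypothetical helper names `normal_approx_ok` and `one_proportion_z`:

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_ok(n: int, pi0: float, minimum: float = 15) -> bool:
    """Sample-size check from the assumptions slide, applied to pi0."""
    return n * pi0 >= minimum and n * (1 - pi0) >= minimum


def one_proportion_z(successes: int, n: int, pi0: float) -> float:
    """Z statistic for H0: pi = pi0, using the standard error under H0."""
    p_hat = successes / n
    return (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)


# Packaging example: 11 defects in 200 boxes, pi0 = .10, Ha: pi < .10
ok_boxes = normal_approx_ok(200, 0.10)
z_boxes = one_proportion_z(11, 200, 0.10)
reject_boxes = z_boxes < -NormalDist().inv_cdf(1 - 0.05)
print(ok_boxes, round(z_boxes, 2), reject_boxes)

# Audit example: 25 errors in 500 transactions, pi0 = .04, Ha: pi != .04
ok_audit = normal_approx_ok(500, 0.04)
z_audit = one_proportion_z(25, 500, 0.04)
reject_audit = abs(z_audit) > NormalDist().inv_cdf(1 - 0.05 / 2)
print(ok_audit, round(z_audit, 2), reject_audit)
```

The packaging statistic falls in the lower rejection region, while the audit statistic stays inside ±1.96, so only the first null hypothesis is rejected.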
62
Use the Proportion test
Calculator made by Stata v.15

Reject null hypothesis, so


there is evidence that new
system defective < 10%

63
One-Proportion Z Test Thinking Challenge

You’re an accounting manager. A


year-end audit showed 4% of
transactions had errors. You
implement new procedures. A
random sample of 500 transactions
had 25 errors. Has the proportion
of incorrect transactions changed
at the .05 level of significance?

64
One-Proportion Z Test Solution*

• H0: π = .04
• Ha: π ≠ .04
• α = .05
• n = 500
• Critical value(s): Z = ±1.96 (reject H0 if Z < −1.96 or Z > 1.96; area .025 in each tail)

Test statistic:
$Z = \frac{p - \pi_0}{\sqrt{\frac{\pi_0(1 - \pi_0)}{n}}} = \frac{\frac{25}{500} - .04}{\sqrt{\frac{(.04)(.96)}{500}}} = 1.14$

Decision: Do not reject H0 at α = .05
Conclusion: There is no evidence that the proportion is not 4%
65
Use the Proportion test
Calculator made by Stata v.15

Don’t reject H0. There is no
evidence that the population
proportion is not 4%

66
TWO-SAMPLES MEAN TEST
Comparing Two Population Means
• No assumptions about the shape of the populations are required.
• The samples are from independent populations.
• The formula for computing the value of z is:

Use if sample sizes ≥ 30 or if σ1 and σ2 are known:
$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

Use if sample sizes ≥ 30 and if σ1 and σ2 are unknown:
$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Comparing Two Population Means - Example

The U-Scan facility was recently installed at the Byrne Road Food-Town
location. The store manager would like to know if the mean checkout
time using the standard checkout method is longer than using the U-
Scan. She gathered the following sample information. The time is
measured from when the customer enters the line until their bags are in
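the cart. Hence the time includes both waiting in line and checking out.

Customer type   Sample mean     Population standard deviation   Sample size
Standard        5.50 minutes    0.40 minutes                    50
U-Scan          5.30 minutes    0.30 minutes                    100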
EXAMPLE 1 continued
Step 1: State the null and alternate hypotheses.
(keyword: “longer than”)
H0: µS ≤ µU
H1: µS > µU

Step 2: Select the level of significance.


The .01 significance level is stated in the problem.

Step 3: Determine the appropriate test statistic.


Because both population standard deviations are known, we can use the
z-distribution as the test statistic.
EXAMPLE 1 continued
Step 4: Formulate a decision rule.

Reject H0 if $Z > Z_{\alpha}$
Z > 2.33
EXAMPLE 1 continued
Step 5: Compute the value of z and make a decision

$z = \frac{\bar{X}_s - \bar{X}_u}{\sqrt{\frac{\sigma_s^2}{n_s} + \frac{\sigma_u^2}{n_u}}} = \frac{5.5 - 5.3}{\sqrt{\frac{0.40^2}{50} + \frac{0.30^2}{100}}} = \frac{0.2}{0.064} = 3.13$

The computed value of 3.13 is larger than the critical value of 2.33.
Our decision is to reject the null hypothesis. The difference of .20 minutes between the mean checkout times of the standard method and the U-Scan is too large to have occurred by chance.

We conclude the U-Scan method is faster.
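Step 5 can be sketched as a small function. The name `two_sample_z` is an illustrative assumption; the inputs are the values used in the computation above.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sigma1, n1, mean2, sigma2, n2):
    """Z statistic for the difference of two independent sample means
    when both population standard deviations are known."""
    standard_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Standard checkout: mean 5.5, sigma 0.40, n 50; U-Scan: mean 5.3, sigma 0.30, n 100
z = two_sample_z(5.5, 0.40, 50, 5.3, 0.30, 100)
z_crit = NormalDist().inv_cdf(1 - 0.01)  # one-tailed test at alpha = .01
print(round(z, 2), round(z_crit, 2), z > z_crit)
```

Carrying full precision gives a statistic of about 3.12; the 3.13 in the worked solution comes from rounding the standard error to 0.064 before dividing. The decision to reject H0 is the same either way.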
Use the Mean-Comparison z-test
Calculator made by Stata v.15

Reject the null hypothesis, the


difference of .20 minutes is too
large to have occurred by
chance.

We conclude the U-Scan (var-y)


method is faster.

73
Two-Sample Tests of Hypothesis: Dependent Samples
Dependent samples are samples that are paired or related
in some fashion.

For example:
• If you wished to buy a car you would look at the same
car at two (or more) different dealerships and
compare the prices.
• If you wished to measure the effectiveness of a new
diet you would weigh the dieters at the start and at
the finish of the program.
Hypothesis Testing Involving Paired Observations
Use the following test when the samples are
dependent:

$t = \frac{\bar{d}}{s_d / \sqrt{n}}$

where
$\bar{d}$ is the mean of the differences
$s_d$ is the standard deviation of the differences
n is the number of pairs (differences)
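The paired statistic is just a one-sample t statistic computed on the differences. The sketch below uses made-up numbers: the `before` and `after` lists are hypothetical data chosen for illustration and are not taken from the slides, and `paired_t` is an assumed helper name.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for paired observations, testing H0: mean difference = 0."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    d_bar = mean(differences)   # mean of the differences
    s_d = stdev(differences)    # sample standard deviation of the differences
    return d_bar / (s_d / sqrt(n)), n - 1


# Hypothetical measurements on the same five subjects (illustration only).
before = [10, 12, 9, 11, 13]
after = [8, 11, 9, 10, 10]
t_value, df = paired_t(before, after)
print(f"t = {t_value:.3f} with {df} degrees of freedom")
```

The resulting t would then be compared with the critical value for n − 1 degrees of freedom, exactly as in a one-sample t test.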
Hypothesis Testing Involving Paired Observations - Example

Nickel Savings and Loan wishes to compare


the two companies it uses to appraise the
value of residential homes. Nickel Savings
selected a sample of 10 residential
properties and scheduled both firms for an
appraisal. The results, reported in $000,
are shown on the table (right).

At the .05 significance level, can we conclude


there is a difference in the mean appraised
values of the homes?
Hypothesis Testing Involving Paired Observations - Example
Step 1: State the null and alternate hypotheses.

H0: μd = 0
H1: μd ≠ 0

Step 2: State the level of significance.


The .05 significance level is stated in the problem.

Step 3: Find the appropriate test statistic.


We will use the t-test
Hypothesis Testing Involving Paired Observations -
Example
Step 4: State the decision rule.

Reject H0 if
$t > t_{\alpha/2,\,n-1}$ or $t < -t_{\alpha/2,\,n-1}$
$t > t_{.025,\,9}$ or $t < -t_{.025,\,9}$
t > 2.262 or t < −2.262
Hypothesis Testing Involving Paired Observations
- Example
Step 5: Compute the value of t and make a decision

The computed value of t is greater than the higher


critical value, so our decision is to reject the null
hypothesis. We conclude that there is a difference
in the mean appraised values of the homes.
Hypothesis Testing Involving Paired Observations – Excel Example
Use the Mean-Comparison t-test
Calculator made by Stata v.15

Reject the null hypothesis. We


conclude that there is a
difference in the mean
appraised values of the homes.

81
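2 SAMPLES PROPORTION
TEST
Two Sample Tests of Proportions
• We investigate whether two samples came from
populations with an equal proportion of successes.

• The two samples are pooled using the following formula.

$p_c = \frac{X_1 + X_2}{n_1 + n_2}$

where X1 and X2 are the numbers of successes in the two samples and n1 and n2 are the sample sizes.

Two Sample Tests of Proportions continued

The value of the test statistic is computed from the
following formula.

$z = \frac{p_1 - p_2}{\sqrt{\frac{p_c(1 - p_c)}{n_1} + \frac{p_c(1 - p_c)}{n_2}}}$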
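A minimal sketch of the pooled two-proportion test, assuming the usual pooled form: the pooled proportion is (X1 + X2)/(n1 + n2), and the standard error uses that pooled proportion for both samples. The function name is an illustrative assumption, and the counts below are hypothetical (the slides' data table did not survive extraction); they were chosen for illustration and give a statistic close to the −2.21 quoted in the Heavenly example.

```python
from math import sqrt
from statistics import NormalDist


def pooled_two_proportion_z(x1: int, n1: int, x2: int, n2: int):
    """Return (z, pooled proportion) for H0: pi1 = pi2."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) / n1 + pooled * (1 - pooled) / n2)
    return (p1 - p2) / standard_error, pooled


# Hypothetical counts: 19 of 100 in group 1 and 62 of 200 in group 2.
z, pooled = pooled_two_proportion_z(19, 100, 62, 200)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-tailed test at alpha = .05
print(round(pooled, 2), round(z, 2), abs(z) > z_crit)
```

With these assumed counts the statistic lies beyond −1.96, so H0 of equal proportions would be rejected at the .05 level.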
Two Sample Tests of Proportions - Example

Manelli Perfume Company recently developed a


new fragrance that it plans to market under the
name Heavenly. A number of market studies
indicate that Heavenly has very good market
potential. The Sales Department at Manelli is
particularly interested in whether there is a
difference in the proportions of younger and
older women who would purchase Heavenly if it
were marketed. Samples are collected from each
of these independent groups. Each sampled
woman was asked to smell Heavenly and
indicate whether she likes the fragrance well
enough to purchase a bottle.
Two Sample Tests of Proportions - Example

Step 1: State the null and alternate hypotheses.


(keyword: “there is a difference”)
H0: π1 = π2
H1: π1 ≠ π2

Step 2: Select the level of significance.


The .05 significance level is stated in the problem.

Step 3: Determine the appropriate test statistic.


We will use the z-distribution
Two Sample Tests of Proportions - Example

Step 4: Formulate the decision rule.


Reject H0 if $Z > Z_{\alpha/2}$ or $Z < -Z_{\alpha/2}$
$Z > Z_{.05/2}$ or $Z < -Z_{.05/2}$
Z > 1.96 or Z < −1.96
Two Sample Tests of Proportions - Example

Step 5: Select a sample and make a decision

Let p1 = young women p2 = older women

The computed value of -2.21 is in the area of rejection. Therefore, the null hypothesis is rejected at
the .05 significance level.
To put it another way, we reject the null hypothesis that the proportion of young women who would
purchase Heavenly is equal to the proportion of older women who would purchase Heavenly.
Use the Proportion test
Calculator made by Stata v.15

Reject the null


hypothesis at the .05
significance level. The
proportion of young
women who would
purchase Heavenly is
not equal to the
proportion of older
women who would
purchase Heavenly.

89
Decision Making Risks

90
Errors in Making Decision
1. Type I Error
• Reject true null hypothesis
• Has serious consequences
• Probability of Type I Error is α (alpha)
— Called level of significance
2. Type II Error
• Do not reject false null hypothesis
• Probability of Type II Error is β (beta)

91
Decisions and Consequences in
Hypothesis Testing

                       True Situation
Researcher Decision    H0 is True                        H0 is False
Accept H0              Right decision,                   Wrong decision (Type II Error),
                       with probability (1 − α)          with probability β
Reject H0              Wrong decision (Type I Error),    Right decision (Power of Test),
                       with probability α                with probability (1 − β)

92
α & β Have an Inverse Relationship

You can’t reduce both errors
simultaneously!
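The trade-off can be made concrete for a fixed sample size. The sketch below reuses the cereal setup (H0: μ = 368, σ = 15, n = 25, upper-tailed test) and assumes, purely for illustration, that the true mean is 375; that value and the function name are hypothetical and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_upper_tail(mu0, mu_true, sigma, n, alpha):
    """beta = P(do not reject H0 | true mean is mu_true) for Ha: mu > mu0."""
    std_normal = NormalDist()
    standard_error = sigma / sqrt(n)
    # Largest sample mean that still falls in the nonrejection region.
    cutoff = mu0 + std_normal.inv_cdf(1 - alpha) * standard_error
    # Probability that the sample mean stays below the cutoff when mu = mu_true.
    return std_normal.cdf((cutoff - mu_true) / standard_error)


alphas = (0.01, 0.05, 0.10)
betas = [type_ii_error_upper_tail(368, 375, 15, 25, a) for a in alphas]
for a, b in zip(alphas, betas):
    print(f"alpha = {a:.2f}  beta = {b:.3f}  power = {1 - b:.3f}")
```

Holding n fixed, each increase in α lowers β (and raises power), and each decrease in α raises β.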


93
The end …….
Thank You

94
Searching Stata Syntax Location

95
