Stat Deck4

Parametric Tests
harishram@hotmail.com
KDUE73YTIL
Introduction to Statistics
This file is meant for personal use by harishram@hotmail.com only.

Sharing or publishing the contents in part or full is liable for legal action.
Agenda
● Large Sample Test
○ Two Sample
● Small Sample Test
○ One Sample
○ Two Sample
● Test for Population Proportion
○ One Sample
○ Two Sample

In this session…
Testing of Hypothesis: Large sample tests, Small sample tests, Tests for proportion

Large Sample Tests

Large sample tests
Large Sample Test: One Sample, Two Sample

Large sample tests
● One sample
● Two sample

Number of defective items
Production Line A   Production Line B
1                   5
3                   8
5                   2
6                   5
2                   3
4                   6

Can we say that Production Line B produces fewer defectives than Production Line A?
Two sample tests
● Two sample tests are used to test the equality of the means of two populations
● Used to compare:
○ Performance of two machines
○ Performance of two portfolios

Two sample tests for population mean
● Let there be two different populations, each following a normal distribution
● The two samples are independent of each other
● The two populations are assumed to have equal variance (this can be tested statistically)

Test for equality of variances
● The hypothesis to test whether two populations have equal variances is
H0: The variances are equal against H1: The variances are not equal
● Failing to reject H0 means the data are consistent with the two populations having equal variance
● Levene's test is used for this purpose
Note: The Python function for Levene's test is scipy.stats.levene()
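
A minimal sketch of how Levene's test might be run. It assumes the defective-item counts from the production line example are the two samples and that α = 0.05; the significance level is an assumption, not something the slides state.

```python
from scipy import stats

# Defective-item counts from the production line example earlier in the deck
line_a = [1, 3, 5, 6, 2, 4]
line_b = [5, 8, 2, 5, 3, 6]

alpha = 0.05  # assumed level of significance

# H0: the two population variances are equal
levene_stat, levene_p = stats.levene(line_a, line_b)

print(f"Levene statistic = {levene_stat:.4f}, p-value = {levene_p:.4f}")
if levene_p <= alpha:
    print("Reject H0: the variances appear to differ")
else:
    print("Fail to reject H0: no evidence against equal variances")
```

By default scipy.stats.levene centres each sample on its median, which makes the test less sensitive to non-normal data.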

Two sample test - hypothesis
● Let there be two samples of sizes n1 and n2 drawn from normal populations N(µ1, σ1²)
and N(µ2, σ2²) respectively
● The hypothesis to test whether the population means are equal is
H0 : µ1 = µ2 against H1 : µ1 ≠ µ2
● It implies
H0: The two population means are equal (i.e. µ1 = µ2)
against H1: The two population means are not equal (i.e. µ1 ≠ µ2)
● As with the one sample tests, it is possible to test for
H0 : µ1 ≤ µ2 against H1 : µ1 > µ2
Or
H0 : µ1 ≥ µ2 against H1 : µ1 < µ2
● Failing to reject H0 means the data do not provide enough evidence against the null
hypothesis; it does not prove that H0 is true

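Two sample tests - test statistic
● When σ1² and σ2² are known, the test statistic is Z given by
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
where x̄1 and x̄2 are the sample means and µ1 - µ2 is the difference specified under H0 (0 when testing equality)
● Under H0, the test statistic follows the standard normal distribution

● If σ1² and σ2² are not known, they are replaced by the sample estimates s1² and s2²
respectively
● The test statistic is Z given by
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
where
$$s_k^2 = \frac{1}{n_k - 1}\sum_{i=1}^{n_k} (x_{ki} - \bar{x}_k)^2, \quad k = 1, 2$$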

The Python function to conduct a Z test for two population means is
statsmodels.stats.weightstats.ztest(Sample_1, Sample_2, value, alternative)

Two sample tests - decision rule
● Based on critical region:
○ For two tailed test (H1: µ1 ≠ µ2): reject H0 if |Z| ≥ Zα/2
○ For left tailed test (H1: µ1 < µ2): reject H0 if Z ≤ -Zα
○ For right tailed test (H1: µ1 > µ2): reject H0 if Z ≥ Zα
● Based on p-value: reject H0 if the p-value is less than or equal to the level of significance
● Based on confidence interval: reject H0 if the value of µ1 - µ2 specified under H0 does not lie in the confidence interval

Two sample test for population mean (σ unknown)
Question:
A study was carried out to understand the amount of hemoglobin in blood for males and
females. A random sample of 160 males and 180 females has mean hemoglobin of 13 g/dl
and 15 g/dl respectively. The sample standard deviations are 4.1 g/dl for male donors and
3.5 g/dl for female donors. Can it be said that the population mean hemoglobin is the same
for men and women? Use α = 0.01.

Solution:
X: Amount of hemoglobin in blood for males
Y: Amount of hemoglobin in blood for females
Here n1 = 160, s1 = 4.1, x̄ = 13
n2 = 180, s2 = 3.5, ȳ = 15
To test H0: µ1 = µ2 against H1: µ1 ≠ µ2

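Solution:
The test statistic
$$Z = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{13 - 15}{\sqrt{\dfrac{4.1^2}{160} + \dfrac{3.5^2}{180}}} = -4.807$$
Decision Rule: Reject H0 if |Zcalc| ≥ Zα/2
Here Zα/2 = 2.58, so the critical region is Z ≤ -2.58 or Z ≥ 2.58
Since |Zcalc| = 4.807 > 2.58, reject H0.
We may conclude that males and females have different hemoglobin averages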
Python solution: Calculate critical z-value
If the test statistic is less than -2.58 or greater than 2.58, then we reject H0.
Python solution: Calculate test statistic
As test statistic (= -4.8068) < critical value (= -2.58), we reject H0.

Python solution: Calculate p-value
As p-value < 0.01, we reject H0.
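
The code for these three steps did not survive in the slides, so the following is only a sketch of how they might be carried out from the summary statistics in the question. It uses scipy.stats.norm and the unpooled standard error from the test statistic formula; the variable names are illustrative.

```python
import math
from scipy import stats

# Summary statistics from the hemoglobin question
n1, xbar1, s1 = 160, 13.0, 4.1   # males
n2, xbar2, s2 = 180, 15.0, 3.5   # females
alpha = 0.01

# Step 1: critical z-value for a two tailed test
z_crit = stats.norm.isf(alpha / 2)

# Step 2: test statistic with sample variances in place of sigma^2
std_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
z_calc = (xbar1 - xbar2) / std_error

# Step 3: two tailed p-value
p_value = 2 * stats.norm.sf(abs(z_calc))

print(f"critical value = ±{z_crit:.2f}")
print(f"test statistic = {z_calc:.4f}")
print(f"p-value = {p_value:.2e}")
print("Reject H0" if abs(z_calc) >= z_crit else "Fail to reject H0")
```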

Small Sample Tests

Small sample tests
● Tests based on samples of size n < 30 are considered small sample tests
● Under the null hypothesis, the test statistic follows an exact distribution: the t distribution,
F distribution or χ² distribution
● Small sample tests are regarded as exact tests


One sample tests
● To test for the population mean, if σ² is known the Z test is used
● If σ² is unknown, use the t-test
● The t-tests are based on the t-distribution
● The unknown σ is replaced by the sample standard deviation (s)

The t-distribution
Figure: The t distribution for different values of df
● It is based on the degrees of freedom
● The t distribution is symmetric about its mean
● Let n denote the degrees of freedom
● Its mean is 0 (for n > 1) and its variance is n/(n - 2) (for n > 2)
● It is leptokurtic
● For large n it approaches the standard normal distribution, i.e. it becomes mesokurtic
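
These properties can be checked numerically. The sketch below, using scipy.stats.t with a few arbitrarily chosen degrees of freedom, prints the variance, the excess kurtosis and the two tailed 5% critical value next to the standard normal critical value.

```python
from scipy import stats

z_crit = stats.norm.isf(0.025)  # two tailed 5% critical value of N(0, 1)

summary = {}
for df in (5, 10, 30, 100):     # arbitrary illustrative degrees of freedom
    variance = float(stats.t.var(df))                        # df / (df - 2)
    excess_kurtosis = float(stats.t.stats(df, moments="k"))  # 0 for a normal
    t_crit = float(stats.t.isf(0.025, df))
    summary[df] = (variance, excess_kurtosis, t_crit)
    print(f"df={df:>3}  variance={variance:.4f}  "
          f"excess kurtosis={excess_kurtosis:.4f}  t_crit={t_crit:.4f}")

print(f"normal   z_crit={z_crit:.4f}")
```

As df grows, the variance tends to 1, the excess kurtosis tends to 0 and the critical value tends to the normal one, which is why the Z test is used for large samples.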

Degrees of freedom
Degrees of freedom (d.f.) is the number of values in a system that are free to vary independently.
Example:
Consider a system of equations:
● Y = 8 + 5Z
● Z = X + 10
There are three variables and two equations, so once the value of one variable (say Z) is
known, the values of the other variables can be found. Thus the d.f. is 1.

One sample test - hypothesis
● The hypothesis to test whether the population mean is equal to a specified value
when σ is unknown is
H0 : µ = µ0 against H1 : µ ≠ µ0
● It implies
H0: The population mean is equal to µ0 (i.e. µ = µ0)
against H1: The population mean is not equal to µ0 (i.e. µ ≠ µ0)

● As with the one sample Z test, it is possible to test for
H0 : µ ≤ µ0 against H1 : µ > µ0
Or
H0 : µ ≥ µ0 against H1 : µ < µ0
● Failing to reject H0 means the data do not provide enough evidence against the null
hypothesis; it does not prove that H0 is true

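One sample test - test statistic
● The test statistic t is given by
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
s² is the sample estimate of the population variance and is given by
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 \quad \text{or} \quad s^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)$$
● Under H0, the test statistic t follows the t distribution with n-1 degrees of freedom
The Python function to conduct a t test for a population mean is
scipy.stats.ttest_1samp(Sample, popmean)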
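
As a rough illustration of the call, the sketch below runs scipy.stats.ttest_1samp on made-up data and compares it with the formula for t. The sample values and µ0 = 100 are hypothetical and are not taken from the slides.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements, invented purely for illustration
sample = np.array([92, 101, 88, 97, 95, 90, 104, 93, 96, 92], dtype=float)
mu0 = 100.0  # hypothetical value specified under H0

# Library call: two tailed test of H0: mu = mu0
result = stats.ttest_1samp(sample, popmean=mu0)

# Same statistic from the formula t = (xbar - mu0) / (s / sqrt(n))
n = sample.size
s = sample.std(ddof=1)  # ddof=1 gives the (n - 1) divisor used for s
t_manual = (sample.mean() - mu0) / (s / np.sqrt(n))

print(f"scipy:  t = {result.statistic:.4f}, two tailed p = {result.pvalue:.4f}")
print(f"manual: t = {t_manual:.4f}, df = {n - 1}")
```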

Decision rule
● Based on critical region:
○ For two tailed test (H1: µ ≠ µ0): reject H0 if |t| ≥ tdf,α/2
○ For left tailed test (H1: µ < µ0): reject H0 if t ≤ -tdf,α
○ For right tailed test (H1: µ > µ0): reject H0 if t ≥ tdf,α
● Based on p-value: reject H0 if the p-value is less than or equal to the level of significance
● Based on confidence interval: reject H0 if µ0 does not lie in the confidence interval

One sample t-test
Question:
A researcher is studying the growth of bacteria in the waters of Lake Beach. The
mean bacteria count of 100 per unit volume of water is within the safety level.
The researcher collected 10 water samples of unit volume and found the mean
bacteria count to be 94.8 with a sample variance of 72.66. Does the data indicate
that the bacteria count is within the safety level? Test at the α = 0.05 level.
Assume that the measurements constitute a sample from a normal population.

One sample t-test
Solution:
The mean bacteria count of 100 per unit volume of water is within the safety level.
i.e. µ0 = 100
Let X: bacteria count per unit volume
The researcher collected 10 water samples of unit volume and found the mean bacteria
count to be 94.8 with a sample variance of 72.66.
i.e. n = 10, x̄ = 94.8 and s² = 72.66

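One sample t-test
Solution:
To test: The bacteria count is within the safety level
i.e. H0 : µ ≥ 100 against H1 : µ < 100
The test statistic is
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{94.8 - 100}{\sqrt{72.66/10}} = -1.929$$
Under H0, t follows the t distribution with n - 1 = 9 degrees of freedom
From the table we have t9,0.05 = 1.833, so for this left tailed test the critical value is -1.833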
One sample t-test
Solution:
tcritical (-1.833) > tcalc (-1.929)
Since the calculated statistic is less than the critical value, it falls in the rejection region.
We reject H0 and can conclude that the average bacteria count per unit volume (true mean)
is within the safety level.

One sample t-test
Python solution: Calculate critical t-value
As the t-distribution is symmetric, for a left-tailed test we reject H0 if the test statistic is
less than -1.83.
Python solution: Calculate test statistic
As test statistic (= -1.929) < critical value (= -1.83), we reject H0.

Python solution: Calculate p-value
As p-value < 0.05, we reject H0.
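
The original code for these steps is not preserved, so the following sketch only shows one way they might be reproduced from the summary statistics of the bacteria question, using scipy.stats.t; the variable names are illustrative.

```python
import math
from scipy import stats

# Summary statistics from the bacteria count question
n, xbar, s2 = 10, 94.8, 72.66
mu0 = 100.0
alpha = 0.05
df = n - 1

# Left tailed critical value: the alpha quantile of the t distribution
t_crit = stats.t.ppf(alpha, df)

# Test statistic t = (xbar - mu0) / sqrt(s^2 / n)
t_calc = (xbar - mu0) / math.sqrt(s2 / n)

# Left tailed p-value P(T <= t_calc)
p_value = stats.t.cdf(t_calc, df)

print(f"critical value = {t_crit:.3f}")
print(f"test statistic = {t_calc:.3f}")
print(f"p-value = {p_value:.4f}")
print("Reject H0" if t_calc <= t_crit else "Fail to reject H0")
```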


Two sample tests
● Like the two sample Z test, the two sample t test is used to test the equality of
means of two populations for unpaired data
● For unpaired data, it can test the relative performance of two machines or two investment portfolios

Two sample tests for unpaired data
● To test the population means of two populations
● Let there be two different populations, each following a normal distribution
● Two samples of sizes n1 and n2 are drawn from normal populations N(µ1, σ1²) and
N(µ2, σ2²) respectively
● The two samples are independent of each other

● The hypothesis to test whether the population means are equal is
H0 : µ1 = µ2 against H1 : µ1 ≠ µ2
● It implies
H0: The two population means are equal (i.e. µ1 = µ2)
against H1: The two population means are not equal (i.e. µ1 ≠ µ2)
● The one sided hypotheses are
H0 : µ1 ≤ µ2 against H1 : µ1 > µ2
Or
H0 : µ1 ≥ µ2 against H1 : µ1 < µ2

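● If σ1² and σ2² are known, the test statistic under H0 : µ1 = µ2 is
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
● If σ1² and σ2² are unknown, the test statistic under H0 : µ1 = µ2 is
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
Where s² is the pooled sample variance given by
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
and s1² and s2² are the sample variances of the first and second sample respectively
● The pooled sample variance (s²) is used because the two populations are assumed to have equal variance
● Under H0, the test statistic follows the t distribution with n1 + n2 - 2 df
● Failing to reject H0 means the data are consistent with equal population means (i.e. µ1 = µ2)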

The Python function to conduct a t test for two population means which are not paired is
scipy.stats.ttest_ind(Sample_1, Sample_2)
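
A possible usage sketch, assuming the defective-item counts from the production line example are the two unpaired samples and that equal variances are acceptable. The left tailed p-value for H1: µB < µA is derived from the t statistic directly, so the code does not rely on optional keyword arguments.

```python
from scipy import stats

# Defective-item counts from the production line example
line_a = [1, 3, 5, 6, 2, 4]
line_b = [5, 8, 2, 5, 3, 6]

# Pooled-variance t test (equal_var=True is the default)
result = stats.ttest_ind(line_b, line_a)
t_stat = result.statistic
df = len(line_a) + len(line_b) - 2

# Left tailed p-value for H1: mean of B < mean of A
p_left = stats.t.cdf(t_stat, df)

print(f"t = {t_stat:.4f}, df = {df}")
print(f"two tailed p = {result.pvalue:.4f}, left tailed p = {p_left:.4f}")
```

Here the sample mean of Line B is larger than that of Line A, so the t statistic is positive and the left tailed p-value is large; these data give no support to the claim that Line B produces fewer defectives.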

Two sample test for unpaired data
Question:
An experiment was conducted to compare the pain relieving hours of two new medicines.
Two groups of 14 and 15 patients were selected and were given comparable doses. Group
1 was given medicine 1 and group 2 was given medicine 2. The following data is obtained
from the two samples. Test whether the two populations give the same mean hours of relief.
Assume the data comes from normal distributions with equal variance. [Use α = 0.01]
                          Medicine 1   Medicine 2
Mean of hours of relief   6.4          7.3
S.D of hours of relief    1.4          1.5

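Solution:
Let X: hours of relief for patients receiving medicine 1
Y: hours of relief for patients receiving medicine 2
To test H0: µ1 = µ2 against H1: µ1 ≠ µ2
The pooled variance is
$$s^2 = \frac{(14-1)(1.4)^2 + (15-1)(1.5)^2}{14 + 15 - 2} = \frac{56.98}{27} = 2.110$$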

Solution:
The test statistic is
$$t = \frac{6.4 - 7.3}{\sqrt{2.110\left(\dfrac{1}{14} + \dfrac{1}{15}\right)}} = -1.667$$
Decision rule: Reject H0 if |tcal| ≥ tn1+n2-2, α/2
tn1+n2-2, α/2 = t27,0.005 = 2.771, so the critical region is t ≤ -2.771 or t ≥ 2.771
Since 2.771 > |-1.667|, we fail to reject H0.
There is not enough evidence to conclude that the two medicines differ in mean hours of relief.
Python solution: Calculate critical t-value
If test statistic is less than -2.77 or greater than 2.77, then we reject H0.
Python solution: Calculate test statistic
As test statistic (= -1.667) > critical value (= -2.77), we fail to reject H0.
Python solution: Calculate p-value
As p-value > 0.01, we fail to reject H0.
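
The code behind these statements is missing from the slides. One way the three steps might be reproduced from the summary statistics of the medicine question is sketched below; the variable names are illustrative.

```python
import math
from scipy import stats

# Summary statistics from the medicine question
n1, mean1, sd1 = 14, 6.4, 1.4   # medicine 1
n2, mean2, sd2 = 15, 7.3, 1.5   # medicine 2
alpha = 0.01
df = n1 + n2 - 2

# Pooled variance s^2 = ((n1-1)s1^2 + (n2-1)s2^2) / (n1+n2-2)
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df

# Critical value, test statistic and p-value for the two tailed test
t_crit = stats.t.isf(alpha / 2, df)
t_calc = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
p_value = 2 * stats.t.sf(abs(t_calc), df)

print(f"pooled variance = {pooled_var:.4f}")
print(f"critical value = ±{t_crit:.3f}")
print(f"test statistic = {t_calc:.3f}")
print(f"p-value = {p_value:.4f}")
print("Reject H0" if abs(t_calc) >= t_crit else "Fail to reject H0")
```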

● The test is carried out under the assumption that the samples are drawn
from independent normal populations
● If the populations from which the samples are drawn do not have equal
variance, then the pooled t-test cannot be used. In this case the Welch test is used
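
The slides do not show code for the Welch test. As a sketch, scipy.stats.ttest_ind_from_stats accepts summary statistics and switches between the pooled test and the Welch test through its equal_var argument; the medicine data are reused here purely for comparison.

```python
from scipy import stats

# Summary statistics from the medicine question, reused for comparison
summary = dict(mean1=6.4, std1=1.4, nobs1=14,
               mean2=7.3, std2=1.5, nobs2=15)

# Pooled t test: assumes equal population variances
pooled = stats.ttest_ind_from_stats(**summary, equal_var=True)

# Welch test: does not assume equal population variances
welch = stats.ttest_ind_from_stats(**summary, equal_var=False)

print(f"pooled: t = {pooled.statistic:.4f}, p = {pooled.pvalue:.4f}")
print(f"Welch:  t = {welch.statistic:.4f}, p = {welch.pvalue:.4f}")
```

With raw samples, the same switch is available as scipy.stats.ttest_ind(Sample_1, Sample_2, equal_var=False). The two results are close here because the sample standard deviations and sample sizes are similar.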

Two sample tests for paired data
● For paired data, the effectiveness of some training/treatment is measured
● The observations are recorded on the same individual/item twice, resulting in pairs of
observations
Example:
An energy drink manufacturing company wants to test if sales increase after they advertise
the drink with a life-sized picture of a well-known athlete.

● Let {(Xi, Yi), i = 1, 2, 3, … , n} be paired data on X and Y

● Let µx, µy be the population means of X and Y respectively
● Define
di = yi - xi
● Let µd = µy - µx
● In the paired t-test, we test for one of
H0 : µd = µ0 against H1 : µd ≠ µ0
H0 : µd ≤ µ0 against H1 : µd > µ0
H0 : µd ≥ µ0 against H1 : µd < µ0
● Suppose the mean of di is d̄ and the variance of di is s_d², i.e.
$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad s_d^2 = \frac{1}{n-1}\sum_{i=1}^{n}(d_i - \bar{d})^2$$
● The test statistic is
$$t = \frac{\bar{d} - \mu_0}{s_d/\sqrt{n}}$$
● Under H0, the test statistic follows the t-distribution with n-1 degrees of freedom
● Failing to reject H0 means the data do not provide enough evidence against the null
hypothesis; it does not prove that H0 is true

The Python function to conduct a t test for two population means which are paired is
scipy.stats.ttest_rel(Sample_1, Sample_2)

Two sample t-test for paired data
Question:
An energy drink distributor claims that a new advertisement poster, featuring a life-size
picture of a well-known athlete, increases sales. For a random sample of 9 outlets, the
following data was collected. Test whether the advertisement was effective in
increasing the sales.
Before    33  32  38  45  37  47  48  41  45
After     42  35  31  41  37  36  49  49  48
Test the hypothesis using critical region technique. [Use α = 0.05].

Solution:
To test,
H0: The advertisement was not effective (µd ≤ 0) against H1: The advertisement was
effective (µd > 0)
Compute di
Before    33  32  38  45  37  47  48  41  45
After     42  35  31  41  37  36  49  49  48
di=yi-xi   9   3  -7  -4   0 -11   1   8   3
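Solution:
We have the mean of the differences and its variance
$$\bar{d} = \frac{2}{9} \approx 0.22, \qquad s_d^2 = \frac{1}{8}\sum_{i=1}^{9}(d_i - \bar{d})^2 \approx 43.69$$
The test statistic is
$$t = \frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{0.22}{6.61/\sqrt{9}} = 0.0998$$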

Solution:
Decision Rule: Reject H0 if tcalc ≥ tdf,α
Here tdf,α = tn-1,α = t8,0.05 = 1.86
Since 1.86 > 0.0998, we fail to reject H0.
Thus, there is no evidence that the advertisement increased sales.

Python solution: Calculate critical t-value
If test statistic is greater than 1.86, then we reject H0.

Python solution: Calculate test statistic and p-value
The test statistic (= 0.1) < critical value (= 1.86), and the p-value > 0.05, thus we fail to
reject H0.
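
Since the original code is not preserved, here is a sketch of how the paired test might be run on the Before/After data. scipy.stats.ttest_rel returns a two tailed p-value, so the right tailed p-value for H1: µd > 0 is taken from the t distribution directly.

```python
import numpy as np
from scipy import stats

# Sales data from the advertisement question
before = np.array([33, 32, 38, 45, 37, 47, 48, 41, 45])
after = np.array([42, 35, 31, 41, 37, 36, 49, 49, 48])
alpha = 0.05

d = after - before          # di = yi - xi
df = d.size - 1

# Right tailed critical value
t_crit = stats.t.isf(alpha, df)

# Paired t statistic for the differences (after - before)
result = stats.ttest_rel(after, before)
t_calc = result.statistic

# Right tailed p-value P(T >= t_calc)
p_value = stats.t.sf(t_calc, df)

print(f"critical value = {t_crit:.2f}")
print(f"test statistic = {t_calc:.4f}")
print(f"p-value = {p_value:.4f}")
print("Reject H0" if t_calc >= t_crit else "Fail to reject H0")
```

The statistic computed without rounding is about 0.10, slightly above the hand-computed 0.0998, which uses d̄ rounded to 0.22.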
Test for Population Proportion

Test for proportion
● For qualitative data, the proportion of items having a desired characteristic is obtained
● Test for proportion:
○ One sample: Testing whether the population proportion (P) is equal to a specified value (P0)
○ Two sample: Testing the equality of two population proportions (P1 = P2)
● Similar to the tests of population mean


● The hypothesis to test whether the population proportion is equal to a specified value is
H0 : P = P0 against H1 : P ≠ P0
● It implies
H0: The population proportion is equal to P0
against H1: The population proportion is not equal to P0
● Failing to reject H0 means the data are consistent with the population proportion being equal to P0

Test for proportions
● The test statistic is given by
$$Z = \frac{p - P_0}{\sqrt{\dfrac{P_0(1 - P_0)}{n}}}$$
where p is the sample proportion, P0 is the specified proportion and n is the sample size
● Under H0, the test statistic follows the standard normal distribution

One sample test for proportion - decision rule
● Based on critical region:
○ For two tailed test (H1: P ≠ P0): reject H0 if |Z| ≥ Zα/2
○ For left tailed test (H1: P < P0): reject H0 if Z ≤ -Zα
○ For right tailed test (H1: P > P0): reject H0 if Z ≥ Zα
● Based on p-value: reject H0 if the p-value is less than or equal to the level of significance
● Based on confidence interval: reject H0 if P0 does not lie in the confidence interval

One sample test for proportion
Question:
A sample of 361 business owners had gone into bankruptcy due to recession. On taking
a survey, it was found that 105 of them had not consulted any professional for managing
their finances before opening the business. Test the null hypothesis that at most 25% of all
businesses had not consulted a professional before opening the business.
Test the claim using p-value technique. [Use α = 0.05].

Solution:
A sample of 361 business owners had gone into bankruptcy due to recession.
i.e. n = 361
On taking a survey, it was found that 105 of them had not consulted any professional for
managing their finances before opening the business.
Let X: number of businesses which did not consult a professional before opening
x = 105
The sample proportion (p) = x/n = 105/361 = 0.2909

Solution:
To test: The null hypothesis that at most 25% of all businesses had not consulted a
professional before opening the business
Here P0 = 0.25
To test, H0: P ≤ 0.25 against H1: P > 0.25

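Solution:
The test statistic
$$Z = \frac{p - P_0}{\sqrt{P_0(1-P_0)/n}} = \frac{0.2909 - 0.25}{\sqrt{0.25 \times 0.75/361}} = 1.79$$
The p-value = P(Z > 1.79) = 0.0367
As p-value < 0.05, reject H0.
We may conclude that more than 25% of all businesses had not consulted a professional
before starting the business.
Python solution: Calculate test statistic and p-value
As the p-value < 0.05, we reject H0.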
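
A sketch of how this computation might look in Python, working directly from the test statistic formula with scipy.stats.norm. The original code is not preserved, and the variable names are illustrative.

```python
import math
from scipy import stats

# Data from the bankruptcy question
x, n = 105, 361
p0 = 0.25
alpha = 0.05

p_hat = x / n  # sample proportion

# Z = (p - P0) / sqrt(P0 (1 - P0) / n), standard error evaluated under H0
z_calc = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Right tailed p-value P(Z > z_calc)
p_value = stats.norm.sf(z_calc)

print(f"sample proportion = {p_hat:.4f}")
print(f"test statistic = {z_calc:.4f}")
print(f"p-value = {p_value:.4f}")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```

The standard error here uses P0, as in the formula for the test statistic. Library routines may default to the sample proportion instead, which would give a slightly different statistic.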


Two sample tests for population proportion
● Let there be two samples of sizes n1 and n2 from different populations, such that x1 and
x2 are the number of specific items in each of them respectively
● Suppose these samples have proportions of specific items p1 = x1/n1 and p2 = x2/n2 respectively
● The aim is to test the equality of the population proportions from which these samples are chosen

● The hypothesis to test the equality of the population proportions is
H0 : P1 = P2 against H1 : P1 ≠ P2
● It implies
H0: The two population proportions are equal (P1 = P2)
against H1: The two population proportions are not equal (P1 ≠ P2)
● Failing to reject H0 means the data are consistent with the two population proportions being equal
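Test for proportions
● The test statistic is given by
$$Z = \frac{p_1 - p_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
where p̄ is the proportion of the pooled sample such that
$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
● Under H0, the test statistic follows the standard normal distribution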

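The Python function to conduct a Z test for two population proportions is
statsmodels.api.stats.proportions_ztest(count, nobs)
where count = [x1, x2] holds the number of specific items in each sample and nobs = [n1, n2] holds the sample sizes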

Two sample test for proportion - decision rule
● Based on critical region:
○ For two tailed test (H1: P1 ≠ P2): reject H0 if |Z| ≥ Zα/2
○ For left tailed test (H1: P1 < P2): reject H0 if Z ≤ -Zα
○ For right tailed test (H1: P1 > P2): reject H0 if Z ≥ Zα
● Based on p-value: reject H0 if the p-value is less than or equal to the level of significance
● Based on confidence interval: reject H0 if the value of P1 - P2 specified under H0 does not lie in the confidence interval

Two sample test for proportion
Question:
Steve owns a kiosk where he sells two magazines, A and B, each month. He buys 100 copies
of magazine A, out of which 78 were sold, and 70 copies of magazine B, out of which 65
were sold. Is there enough evidence to say that magazine B is more popular?
Test the claim using p-value technique. [Use α = 0.05].

Solution:
Steve owns a kiosk where he sells two magazines, A and B, each month.
Let X: the number of magazines sold
Magazine A: out of 100 copies, 78 are sold
Here, x1 = 78 and n1 = 100
Let p1 be the proportion of magazine A sold
p1 = x1/n1 = 78/100 = 0.78
Magazine B: out of 70 copies, 65 are sold
Here, x2 = 65 and n2 = 70
Let p2 be the proportion of magazine B sold
p2 = x2/n2 = 65/70 = 0.928

Solution:
To test whether magazine B is more popular, i.e.
H0: P1 ≥ P2 against H1: P1 < P2
where P1 denotes the population proportion of magazine A sold
and P2 denotes the population proportion of magazine B sold

Solution:
The pooled proportion is
$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{78 + 65}{100 + 70} = 0.84$$

Solution:
The test statistic
$$Z = \frac{0.78 - 0.928}{\sqrt{0.84 \times 0.16 \times \left(\dfrac{1}{100} + \dfrac{1}{70}\right)}} = -2.5905$$
The p-value = P(Z < Zcalc, under H0) = P(Z < -2.5905) = 0.0048
Since p-value < 0.05, we reject H0.
Thus there is enough evidence to conclude that magazine B is more popular.

Python solution: Calculate test statistic and p-value
As the p-value < 0.05, we reject H0.
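
The original code for this step is not preserved. The sketch below computes the pooled-proportion Z statistic directly from the formula with scipy.stats.norm, so it does not depend on any particular library routine; the variable names are illustrative.

```python
import math
from scipy import stats

# Data from the magazine question
x1, n1 = 78, 100   # magazine A
x2, n2 = 65, 70    # magazine B
alpha = 0.05

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)   # pooled proportion

# Z = (p1 - p2) / sqrt(p_pool (1 - p_pool) (1/n1 + 1/n2))
std_error = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z_calc = (p1 - p2) / std_error

# Left tailed p-value for H1: P1 < P2
p_value = stats.norm.cdf(z_calc)

print(f"pooled proportion = {p_pool:.4f}")
print(f"test statistic = {z_calc:.4f}")
print(f"p-value = {p_value:.4f}")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```

Without rounding the intermediate values, the statistic comes out near -2.61 rather than the hand-computed -2.5905, which uses p2 = 0.928 and a pooled proportion of 0.84. The conclusion is the same.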

Summary

Parametric tests
● The tests considered so far have two features:
○ The probability distribution of the samples was assumed to be known
○ The hypothesis test was about the parameter of the probability distribution
● These tests are known as parametric tests
● When these assumptions are not satisfied, use non-parametric tests

Thank You

