Statistical Inference
MaMaEuSch†
Management Mathematics for European Schools
94342 - CP - 1 - 2001 - 1 - DE - COMENIUS - C21
University of Seville
This project has been carried out with the partial support of the European Community in the framework of the Sokrates programme. The content does not necessarily reflect the position of the European Community, nor does it involve any responsibility on the part of the European Community.
Contents
1 Statistical Inference
1.1 Introduction to Statistical Inference
1.2 Sample distribution of a statistic or an estimator
1.3 Point estimation. Sample distribution of the main estimators
1.4 Interval estimation
1.4.1 Estimation errors and sample size
1.5 Hypothesis tests
1.5.1 Relationship between confidence intervals and hypothesis tests
1.5.2 Chi-square test for adjustment to a distribution
1.5.3 Dependence and independence tests
1.5.4 Homogeneity test for several samples
1.6 Bayesian inference
Chapter 1
Statistical Inference
In this chapter we will see how we can obtain conclusions about a population through the data we get from sampling. We will give an overview of the concepts we need to know, as well as of the techniques we can use.
If we decide to apply techniques of classical inference, our conclusions can be obtained in different ways:
• We can search for a value of the average height of the students of the high school (which we will find through an estimator or statistic) and consider it as the value of the parameter. In this case we are using point estimation.
• We can also look for a random interval inside which, with some "certainty", we can find the real value of the parameter, for instance, the real value of the average pocket money of the students of the high school. In this case, we are talking about interval estimation.
• Let us imagine that we have a possible value for the average height of the students of the high school and we want to test whether this value is "suitable", with some "certainty". In this case we would apply a hypothesis test.
[Diagram: classification of inference techniques.
By purpose: parametric methods, non-parametric methods.
By type of information: classical inference (point estimation, interval estimation, hypothesis tests), Bayesian inference.]
Example 1.2.1 Let us imagine now that we have 3 papers in a bag. We are going to make a draw with two possible winners, that is, one person takes out one of the papers, we put it back inside the bag, and then another person takes out a paper. The papers have values of 0 euros, 500 euros and 1000 euros. How will the average of these samples of size 2 behave in our population of 3 papers? Which is the most probable result?
The possibilities we have are the following:
(0, 0), (0, 500), (0, 1000), (500, 0), (500, 500), (500, 1000), (1000, 0), (1000, 500), (1000, 1000).
We calculate the average for each of them and the probability with which it appears:

Sample average: 0, 250, 500, 750, 1000
Probability: 1/9, 2/9, 3/9, 2/9, 1/9

The average of the population (0, 500, 1000) is 500 and its variance is $166666.\overline{6}$, while for the random variable "sample average" the average is 500 and the variance is $83333.\overline{3}$. As we can see, they have the same average but the variance of the sample average is lower. If we represent the distribution of the sample average, we see that it is concentrated around 500, the most probable value.
The Central Limit Theorem confirms what we have noticed in the example above: given a random variable with average $\mu$ and variance $\sigma^2$, the distribution of the sample averages, as the sample size $n$ increases to infinity, tends asymptotically to a distribution $N(\mu, \sigma/\sqrt{n})$.
In the example above we have presented all the possible samples, but imagine what it would be to calculate the same with all the possible samples of 60 students out of the 544 students of the high school. It would be a never-ending calculation. That is why the Monte Carlo method is usually used, which consists in simulating, through tables of random numbers or through a computer, the drawing of a great number of samples; we calculate the value of the statistic for each of them, and with that we get an approximate probability distribution (the greater the number of samples generated, the better the approximation).
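Both the exact enumeration of Example 1.2.1 and the Monte Carlo approximation just described can be sketched in Python:

```python
import itertools
import random
from statistics import pvariance

# The three papers of Example 1.2.1 and all ordered samples of size 2
# drawn with replacement (the nine pairs listed above).
papers = [0, 500, 1000]
samples = list(itertools.product(papers, repeat=2))
sample_means = [sum(s) / len(s) for s in samples]

# Exact distribution of the sample average: value -> probability
dist = {v: sample_means.count(v) / len(samples) for v in sorted(set(sample_means))}
print(dist)                    # 500 is the most probable result (probability 3/9)
print(pvariance(papers))       # population variance: 166666.66...
print(pvariance(sample_means)) # variance of the sample average: 83333.33...

# Monte Carlo approximation: instead of enumerating, simulate many samples
random.seed(1)
simulated = [sum(random.choices(papers, k=2)) / 2 for _ in range(100_000)]
print(pvariance(simulated))    # close to 83333.33 for a large number of samples
```

The simulated variance approaches the exact one as the number of generated samples grows, which is exactly the behaviour described in the paragraph above.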
1.3 Point estimation. Sample distribution of the main estimators
We come now to the point in which we need to estimate the parameters of the population. We want to know the average and variance of the height and pocket money of the students of the high school. We can take as an estimation the value of the sample average and the sample variance for our sample of 60 students out of the 544. In this case, we are making point estimation, because we estimate the value of the unknown parameter through only one value of the estimator.
But now, does the average of the distribution of our estimator coincide with the value of the population parameter? For instance, the average of the sample variance does not coincide with the population variance, so it will not be a good estimator for the variance. Does the value of the estimation get closer to the parameter if we increase the sample size? These and other properties are what we want an estimator to have.
When we want to estimate the value of a population parameter we want the estimator to have
a certain number of properties in order to get a ”good” estimation:
• Centered or unbiased: the average of the sample distribution of the sample statistic coincides with the unknown population parameter.
• Consistent: if we increase the size of the sample, the average value of the sample distribution
of the sample statistic converges to the estimated parameter.
• Efficient: that it has the minimum variance of all the centered estimators.
• Sufficient: that it uses all the information about the parameter provided by the sample data.
Let $x_1, x_2, \ldots, x_n$ be a random sample of a population. The most common estimators are:

For the population average, the sample average:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}.$$

For the population proportion, the sample proportion:
$$p = \frac{\text{observed values of } A}{\text{sample size}}.$$

For the population variance, the sample quasivariance:
$$S_c^2 = \frac{n}{n-1}\,S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1};$$

as we have already mentioned, we will not use the sample variance because it is a biased estimator of the population variance.
It can be proved that if we have a random variable $X$ from a population with average $\mu$ and standard deviation $\sigma$, we have that:

• In sampling with replacement, or in an infinite population, $\bar{x}$ has $\mu$ as average value and $\dfrac{\sigma}{\sqrt{n}}$ as standard deviation.

• In sampling without replacement from a finite population, $\bar{x}$ has $\mu$ as average value and $\dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}$ as standard deviation.
We can notice that the only difference between the case of an infinite population (or sampling with replacement) and the case of sampling without replacement from a finite population is that the standard deviation is multiplied by the correction factor $\sqrt{\frac{N-n}{N-1}}$, where $N$ is the size of the population and $n$ is the size of the sample.
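A minimal sketch of both formulas; the value $\sigma = 10$ is only an illustrative assumption, not a figure from the text:

```python
import math

def std_error(sigma, n, N=None):
    """Standard deviation of the sample average: sigma / sqrt(n) for sampling
    with replacement (or an infinite population); multiplied by the correction
    factor sqrt((N - n) / (N - 1)) when sampling without replacement from a
    finite population of size N."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# A sample of 60 of the 544 students, with an assumed sigma of 10:
print(std_error(10, 60))         # with replacement
print(std_error(10, 60, N=544))  # without replacement: slightly smaller
```

Note that for a fixed sample size the correction factor is below 1, so sampling without replacement always gives a slightly smaller standard error.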
Moreover, if $X$ is a random variable with a normal distribution having $\mu$ as average and $\sigma$ as a known standard deviation, we have that $\bar{x}$ follows a distribution $N(\mu, \sigma/\sqrt{n})$.
As we can see in this expression, the bigger the sample size $n$ is, the lower the standard deviation, and so the lower the error committed when we consider the sample average as an estimator for the population average. We will have to check how convenient it is to increase the sample size, fitting the economic budget we have.
It can be proved that if $X$ is a random variable with a normal distribution with $\mu$ as average and $\sigma$ as standard deviation, then
$$\frac{\bar{x} - \mu}{S_c/\sqrt{n}}$$
has a t-Student distribution with $(n-1)$ degrees of freedom. If the sample size is greater than 30, the t-Student distribution can be approximated by a $N(0,1)$ distribution.
It can be proved that if $X$ has a normal distribution with $\mu$ as average and $\sigma$ as standard deviation, then
$$\frac{(n-1)S_c^2}{\sigma^2} = \frac{nS^2}{\sigma^2}$$
has a chi-square distribution $\chi^2(n-1)$ with $n-1$ degrees of freedom.
$\mu$ = population average, $\sigma$ = population standard deviation,
$N$ = size of the population, $n$ = size of the sample,
$\bar{x}$ = sample average, $p$ = sample proportion ($q = 1 - p$),
$S^2 = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n}$ sample variance, $S_c^2 = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$ sample quasivariance.
Moreover,
• $z_\alpha$ is the value of a variable $N(0,1)$ which leaves an area (probability) $\alpha$ on its right side.
• $t_\alpha(n-1)$ is the value of a t-Student variable with $(n-1)$ degrees of freedom that leaves an area (probability) $\alpha$ on its right side.
• $\chi^2_\alpha(n-1)$ is the value of a chi-square variable with $(n-1)$ degrees of freedom that leaves an area (probability) $\alpha$ on its right side.
• $F_\alpha(m,n)$ is the value of an F of Snedecor variable with $(m,n)$ degrees of freedom that leaves an area (probability) $\alpha$ on its right side.
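The values $z_\alpha$ can be computed with the Python standard library, as sketched below; the critical values $t_\alpha$, $\chi^2_\alpha$ and $F_\alpha$ are not in the standard library and need statistical tables or a statistics package:

```python
from statistics import NormalDist

def z_crit(alpha):
    """z_alpha: the value of a N(0, 1) variable leaving an area
    (probability) alpha on its right side."""
    return NormalDist().inv_cdf(1 - alpha)

print(round(z_crit(0.05), 4))    # 1.6449, used in one-sided 95% procedures
print(round(z_crit(0.025), 4))   # 1.96, used in two-sided 95% procedures
```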
So, from now on we search for an interval $(a, b)$ such that the unknown population parameter can be found inside it with a certain precision or confidence level. To find that interval we use the data provided by the sample, which means we will find different intervals for different samples.
The concept of confidence level (for instance 95%) refers to the fact that if we consider a big number of samples, and for each of these samples we calculate the confidence interval for a certain unknown parameter, this parameter is inside approximately 95% of these intervals.
This fact is very important: when we build our confidence interval for a certain sample, we should not make the mistake of thinking that "the population parameter is inside the interval with a probability of 0.95", because this is a wrong interpretation. The interval is random before calculating the value of the statistic; once it is calculated for a concrete sample, it is not random anymore and it either contains the parameter or it does not.
To build the confidence interval of a population parameter $\theta$, we start by considering an estimator $\hat{\theta}$ (generally unbiased), and from it we construct an interval of the form $(\hat{\theta} - \lambda b, \hat{\theta} + \lambda b)$, with the condition that the probability that the unknown parameter $\theta$ is inside the interval is $1 - \alpha$, that is,
$$P[\hat{\theta} - \lambda b \le \theta \le \hat{\theta} + \lambda b] = 1 - \alpha.$$
The term $\lambda b$ is the margin of error or precision of the estimation of the unknown population parameter; $b$ is usually called the typical error of estimation or standard error.
We are now going to give in a detailed way the confidence intervals for a confidence level $1 - \alpha$ according to the different situations and population parameters for which we want to calculate those intervals.
For the case of one population and sampling with replacement, we have:

Population $N(\mu,\sigma)$, $\sigma$ known; parameter $\mu$:
$$\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$

Population $N(\mu,\sigma)$, $\sigma$ unknown; parameter $\mu$:
$$\left(\bar{x} - t_{\alpha/2}(n-1)\frac{S_c}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}(n-1)\frac{S_c}{\sqrt{n}}\right)$$

Population $N(\mu,\sigma)$, $\mu$ known; parameter $\sigma^2$:
$$\left(\frac{\sum_{i=1}^{n}(x_i-\mu)^2}{\chi^2_{\alpha/2}(n)},\ \frac{\sum_{i=1}^{n}(x_i-\mu)^2}{\chi^2_{1-\alpha/2}(n)}\right)$$

Population $N(\mu,\sigma)$, $\mu$ unknown; parameter $\sigma^2$:
$$\left(\frac{(n-1)S_c^2}{\chi^2_{\alpha/2}(n-1)},\ \frac{(n-1)S_c^2}{\chi^2_{1-\alpha/2}(n-1)}\right)$$

Population $B(n,p)$, $n > 30$; parameter $p$:
$$\left(p - z_{\alpha/2}\sqrt{\frac{p\,q}{n}},\ p + z_{\alpha/2}\sqrt{\frac{p\,q}{n}}\right)$$
In case we make sampling without replacement, or the population is finite, in general we should multiply the standard error by the factor $\sqrt{\dfrac{N-n}{N-1}}$.
As we can see, the structure of the confidence interval is $(\hat{\theta} - \lambda b, \hat{\theta} + \lambda b)$, where $\hat{\theta}$ is an estimator of the population parameter we want to calculate the interval for, $\lambda$ is a value (critical point) of a well-known distribution and $b$ depends on the size of the sample $n$.
For instance, if we have a normal distribution with known standard deviation, a confidence interval for the average is $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$, where $\lambda$ is in this case $z_{\alpha/2}$, a critical point of $N(0,1)$ that leaves on its right side an area of $\alpha/2$, $b$ would be $\frac{\sigma}{\sqrt{n}}$, and the estimator $\hat{\theta}$ of the population average would be $\bar{x}$.
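This sigma-known interval can be sketched as follows; the sample values xbar = 170, sigma = 8 and n = 60 are illustrative assumptions, not data from the text:

```python
import math
from statistics import NormalDist

def mean_ci_known_sigma(xbar, sigma, n, conf=0.95):
    """Confidence interval for mu in a N(mu, sigma) population, sigma known:
    (xbar - z_{alpha/2} * sigma/sqrt(n), xbar + z_{alpha/2} * sigma/sqrt(n))."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # the critical point lambda
    b = sigma / math.sqrt(n)   # b in the (theta - lambda*b, theta + lambda*b) scheme
    return xbar - z * b, xbar + z * b

# Assumed sample: xbar = 170 cm, sigma = 8 cm, n = 60 students
lo, hi = mean_ci_known_sigma(170, 8, 60, conf=0.90)
print(round(lo, 2), round(hi, 2))   # (168.3, 171.7)
```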
For the case of two independent populations (and random sampling with replacement), we have:

Populations $N(\mu,\sigma)$, $\sigma$ known; parameter $\mu_x - \mu_y$:
$$(\bar{x} - \bar{y}) \pm z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}$$

Populations $N(\mu,\sigma)$, $\sigma$ unknown and equal; parameter $\mu_x - \mu_y$:
$$(\bar{x} - \bar{y}) \pm t_{\alpha/2}(n_x + n_y - 2)\sqrt{\frac{(n_x-1)S_{cx}^2 + (n_y-1)S_{cy}^2}{n_x + n_y - 2}}\sqrt{\frac{1}{n_x} + \frac{1}{n_y}}$$

Populations $B(n,p)$, $n > 30$; parameter $p_x - p_y$:
$$(p_x - p_y) \pm z_{\alpha/2}\sqrt{\frac{p_x q_x}{n_x} + \frac{p_y q_y}{n_y}}$$

Populations $N(\mu,\sigma)$, $\mu$ unknown; parameter $\dfrac{\sigma_x^2}{\sigma_y^2}$:
$$\left(\frac{S_{cx}^2}{S_{cy}^2}\,\frac{1}{F_{\alpha/2}(n_x-1, n_y-1)},\ \frac{S_{cx}^2}{S_{cy}^2}\,\frac{1}{F_{1-\alpha/2}(n_x-1, n_y-1)}\right)$$
For samples which are big enough, we can consider as valid the intervals built by applying the previous expressions.
In case we are not in any of the situations above, we can still build a confidence interval for the population average of any population by applying Tchebycheff's theorem:
Let $X$ be a random variable with average $\mu$ and standard deviation $\sigma$. It holds that for any value of $k > 0$, $P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$.
If $\bar{x}$ is the sample average, we would have the following interval for the population average with known standard deviation:
$$\left(\bar{x} - k\frac{\sigma}{\sqrt{n}},\ \bar{x} + k\frac{\sigma}{\sqrt{n}}\right).$$
Therefore, the preliminary steps to build a confidence interval could be one of the following:
1. Fix the confidence level and the error we would like to have, and calculate the appropriate sample size needed.
2. Calculate the error we make with a given sample size and confidence level.
3. Calculate the confidence level we can have with a given sample size and the error we are willing to commit.
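Step 1 above (choosing the sample size for a desired error, in the sigma-known case) can be sketched as follows; sigma = 8 and error = 2 are illustrative assumptions:

```python
import math
from statistics import NormalDist

def sample_size_for_error(sigma, error, conf=0.95):
    """Smallest n such that the margin z_{alpha/2} * sigma / sqrt(n)
    does not exceed the desired error (sigma-known case)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil((z * sigma / error) ** 2)

print(sample_size_for_error(8, 2))              # n needed for a margin of +/- 2
print(sample_size_for_error(8, 2, conf=0.99))   # higher confidence needs larger n
```

As the formula shows, halving the desired error multiplies the required sample size by four.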
If we really want to check whether these statements are right or wrong, we should measure each and every element of the population. But, as usual, this will not in general be a real possibility, so we have to try to answer all those questions using the data we have from our sample.
A hypothesis test allows us to accept or reject statements depending on the data we get through a sample.
Obviously this means that the conclusion we reach may not be true, therefore we should try to assure a certain degree of precision for the case in which we accept the hypothesis that is posed. This degree of precision is what we call the confidence level.
We can have 2 main types of hypothesis tests:
• The ones which pose hypotheses about the parameters of the probability distribution of the population. For instance, that the average of a normal population is equal to 7. We will call them parametric tests.
• The ones posing other types of hypotheses. For instance, that a certain population has a normal distribution, or that there is no dependence between the variables height and pocket money of the students of a high school. We will call them non-parametric tests.
Once we have applied a hypothesis test and we accept the initial hypothesis, this does not mean that we have proved (in the mathematical sense) the statement, because we have not checked all the elements of the population, and the hypothesis could even have been rejected with the data of another sample. What we can say is that we cannot reject the statement with the data that we have.
From now on we are going to present parametric tests (on the average, variance and proportion) as well as non-parametric ones (homogeneity or heterogeneity of the population and independence in contingency tables).
We first need some concepts:
• Significance level: it represents the probability of rejecting $H_0$ when it is true, and it is the complement of the confidence level, that is, $\alpha$. It gives us the probability of the rejection region under the null hypothesis.
• Two-sided (bilateral) tests: the null hypothesis is presented in such a way that the population parameters are univocally determined. For instance, the average is equal to 5, or the variance is 3.
• One-sided tests: the null hypothesis is presented in such a way that the values of the unknown population parameter lie inside a semi-open interval. To know the distribution of the sample statistic, we suppose that the population parameter takes the value of one of the limits of the interval. For instance, $H_0: \mu \ge 3$, that is, the average is greater than or equal to 3, against the alternative hypothesis $H_1: \mu < 3$. When we have to determine the distribution of the statistic, we suppose that under the null hypothesis $\mu = 3$.
When we apply the hypothesis test we use the values of a statistic whose probability distribution should be known under the null hypothesis. The sample data can then lead us to two types of errors:
• Type I error: the error produced when we reject the null hypothesis when it is true. The probability of this decision is the significance level $\alpha$.
• Type II error: the error produced when we accept the null hypothesis when it is false, which is the same as rejecting $H_1$ when it is true. The probability of rejecting the alternative hypothesis when it is true is denoted by $\beta$.
• Power of the test: it represents the probability of rejecting $H_0$ when $H_1$ is true.
We can summarize the decisions taken and the errors made in the following table:

            $H_0$ true                      $H_0$ false
Accept $H_0$   correct decision ($1-\alpha$)   type II error ($\beta$)
Reject $H_0$   type I error ($\alpha$)         correct decision ($1-\beta$)
The probabilities of type I and type II errors are linked, in the sense that decreasing one increases the other and vice versa, so we should try to minimize the error that we consider more relevant, accepting that this increases the other one. A possible solution consists in searching for the appropriate sample size which makes the levels of type I ($\alpha$) and type II ($\beta$) errors compatible; that is, once one of the errors is fixed, we choose the sample size so that the other is inside the desired limits.
The steps to follow to make a hypothesis test are:
1. Establish the distribution of the population, the null hypothesis $H_0$ and the alternative hypothesis $H_1$.
2. Fix the confidence level, $1 - \alpha$, and the sample size, $n$.
3. Select a sample and calculate the value of the corresponding statistic, whose distribution will be known under $H_0$.
4. Determine the acceptance region and the rejection region.
5. Accept $H_0$ if the value of the statistic is inside the acceptance region. Otherwise, $H_0$ is rejected.
6. Draw statistical conclusions.
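The steps above, for a one-sided test on the average with unknown sigma and n > 30 (so the t distribution is approximated by N(0,1), as noted later in the text), can be sketched as follows; the sample figures are illustrative assumptions:

```python
import math
from statistics import NormalDist

def one_sided_mean_test(xbar, mu0, sc, n, alpha=0.05):
    """Test H0: mu <= mu0 against H1: mu > mu0 with T = (xbar - mu0)/(sc/sqrt(n)).
    For n > 30 the t(n-1) critical value is approximated by z_alpha."""
    T = (xbar - mu0) / (sc / math.sqrt(n))
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    return T, T > z_alpha          # True means T is in the critical region

# Assumed sample: xbar = 63 minutes, sc = 12, n = 36, testing mu0 = 60
T, reject = one_sided_mean_test(63, 60, 12, 36)
print(round(T, 2), reject)   # T = 1.5, below z_0.05 = 1.645: H0 is not rejected
```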
In the following table we can see the statistics which are normally used, as well as the critical regions depending on the type of test applied. For the case of only one population:

Population $N(\mu,\sigma)$, $\sigma$ known, statistic $T = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$:
• $H_0: \mu = \mu_0$ vs $H_1: \mu \ne \mu_0$: critical region $|T| \ge z_{\alpha/2}$.
• $H_0: \mu \ge \mu_0$ vs $H_1: \mu < \mu_0$: critical region $T < z_{1-\alpha}$.
• $H_0: \mu \le \mu_0$ vs $H_1: \mu > \mu_0$: critical region $T > z_\alpha$.

Population $N(\mu,\sigma)$, $\sigma$ unknown, statistic $T = \dfrac{\bar{x} - \mu_0}{S_c/\sqrt{n}}$:
• $H_0: \mu = \mu_0$ vs $H_1: \mu \ne \mu_0$: critical region $|T| \ge t_{\alpha/2}(n-1)$.
• $H_0: \mu \ge \mu_0$ vs $H_1: \mu < \mu_0$: critical region $T < t_{1-\alpha}(n-1)$.
• $H_0: \mu \le \mu_0$ vs $H_1: \mu > \mu_0$: critical region $T > t_\alpha(n-1)$.

Population $N(\mu,\sigma)$, $\mu$ known, statistic $T = \dfrac{\sum_{i=1}^{n}(x_i - \mu)^2}{\sigma_0^2}$:
• $H_0: \sigma = \sigma_0$ vs $H_1: \sigma \ne \sigma_0$: critical region $T \ge \chi^2_{\alpha/2}(n)$ or $T \le \chi^2_{1-\alpha/2}(n)$.
For the case of two populations:

Populations $N(\mu,\sigma)$, $\sigma$ known, statistic $T = \dfrac{\bar{x} - \bar{y} - a}{\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}}$:
• $H_0: \mu_x - \mu_y = a$ vs $H_1: \mu_x - \mu_y \ne a$: critical region $|T| \ge z_{\alpha/2}$.
• $H_0: \mu_x - \mu_y \ge a$ vs $H_1: \mu_x - \mu_y < a$: critical region $T < z_{1-\alpha}$.
• $H_0: \mu_x - \mu_y \le a$ vs $H_1: \mu_x - \mu_y > a$: critical region $T > z_\alpha$.

Populations $N(\mu,\sigma)$, $\sigma$ unknown and equal, statistic $T = \dfrac{\bar{x} - \bar{y} - a}{\sqrt{\frac{(n_x-1)S_{cx}^2 + (n_y-1)S_{cy}^2}{n_x+n_y-2}}\sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}$:
• $H_0: \mu_x - \mu_y = a$ vs $H_1: \mu_x - \mu_y \ne a$: critical region $|T| \ge t_{\alpha/2}(n_x+n_y-2)$.
• $H_0: \mu_x - \mu_y \ge a$ vs $H_1: \mu_x - \mu_y < a$: critical region $T < t_{1-\alpha}(n_x+n_y-2)$.
• $H_0: \mu_x - \mu_y \le a$ vs $H_1: \mu_x - \mu_y > a$: critical region $T > t_\alpha(n_x+n_y-2)$.

Populations $N(\mu,\sigma)$, $\mu$ known, statistic $T = \dfrac{\sum_{i=1}^{n_x}(x_i - \mu_x)^2}{\sum_{i=1}^{n_y}(y_i - \mu_y)^2}$:
• $H_0: \sigma_x^2 = \sigma_y^2$ vs $H_1: \sigma_x^2 \ne \sigma_y^2$: critical region $T > \frac{n_x}{n_y}F_{\alpha/2}(n_x,n_y)$ or $T < \frac{n_x}{n_y}F_{1-\alpha/2}(n_x,n_y)$.
• $H_0: \sigma_x^2 \ge \sigma_y^2$ vs $H_1: \sigma_x^2 < \sigma_y^2$: critical region $T < \frac{n_x}{n_y}F_{1-\alpha}(n_x,n_y)$.
• $H_0: \sigma_x^2 \le \sigma_y^2$ vs $H_1: \sigma_x^2 > \sigma_y^2$: critical region $T > \frac{n_x}{n_y}F_\alpha(n_x,n_y)$.

Populations $N(\mu,\sigma)$, $\mu$ unknown, statistic $T = \dfrac{S_{cx}^2}{S_{cy}^2}$:
• $H_0: \sigma_x^2 = \sigma_y^2$ vs $H_1: \sigma_x^2 \ne \sigma_y^2$: critical region $T > F_{\alpha/2}(n_x-1,n_y-1)$ or $T < F_{1-\alpha/2}(n_x-1,n_y-1)$.
• $H_0: \sigma_x^2 \ge \sigma_y^2$ vs $H_1: \sigma_x^2 < \sigma_y^2$: critical region $T < F_{1-\alpha}(n_x-1,n_y-1)$.
• $H_0: \sigma_x^2 \le \sigma_y^2$ vs $H_1: \sigma_x^2 > \sigma_y^2$: critical region $T > F_\alpha(n_x-1,n_y-1)$.
Let us recall that when we search for the values of $t_\alpha(n-1)$ with values of $n$ greater than 30, this distribution is approximated by $N(0,1)$, so we look instead for the values of $z_\alpha$.
We start now presenting non-parametric hypothesis tests. The tests we will study from now on are based on the chi-square distribution. We will see tests about the adjustment of a theoretical distribution to an empirical distribution, as well as the application to contingency tables.
1.5.3 Dependence and independence tests
We want to know if two variables $X$ and $Y$ from the same population are dependent or independent. We suppose that the possible values of the variables are:
$X: x_1, x_2, \ldots, x_k$,
$Y: y_1, y_2, \ldots, y_m$,
and we have a sample of size $n$ in which we measure both variables $X$ and $Y$.
We denote:
$O_{ij}$ = number of elements presenting values $x_i$ and $y_j$.
$e_{ij}$ = number of expected elements presenting values $x_i$ and $y_j$ if the variables are independent.
We can build a contingency table in which the empirical and theoretical frequencies appear. To calculate the theoretical frequencies, provided that the variables are independent, we use
$$e_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n},$$
where $n_{i\cdot}$ and $n_{\cdot j}$ are the totals of row $i$ and column $j$. The statistic
$$T = \sum_{i=1}^{k}\sum_{j=1}^{m} \frac{(O_{ij} - e_{ij})^2}{e_{ij}}$$
has a chi-square distribution with $(k-1)(m-1)$ degrees of freedom when the variables are independent. For a confidence level of $1 - \alpha$:
We accept $H_0$ if $T < \chi^2_\alpha((k-1)(m-1))$ (ACCEPTANCE REGION).
We reject $H_0$ if $T \ge \chi^2_\alpha((k-1)(m-1))$ (REJECTION REGION).
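The computation of T for a contingency table can be sketched as follows; the 2x2 counts are illustrative assumptions, and the critical value chi^2_0.05(1) = 3.841 is the standard tabulated one:

```python
def independence_statistic(table):
    """T = sum over cells of (O_ij - e_ij)^2 / e_ij,
    with e_ij = (row_i total) * (column_j total) / n."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    T = 0.0
    for i, row in enumerate(table):
        for j, O in enumerate(row):
            e = rows[i] * cols[j] / n   # expected frequency under independence
            T += (O - e) ** 2 / e
    return T

# Assumed 2x2 table (k = m = 2, hence (k-1)(m-1) = 1 degree of freedom)
T = independence_statistic([[30, 20], [10, 40]])
print(round(T, 3))   # 16.667 >= chi^2_0.05(1) = 3.841: independence is rejected
```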
1.5.4 Homogeneity test for several samples
We try to decide whether several samples measuring the same characteristic $A$ belong or not to the same population with respect to that characteristic.
Let us suppose that we have $k$ samples of sizes $n_1, n_2, \ldots, n_k$, where $y_1, y_2, \ldots, y_k$ are the numbers of elements of each sample presenting the characteristic $A$ (the rest do not present it).
If we suppose that all the samples come from the same population, the proportion of elements presenting characteristic $A$ is
$$p = \frac{y_1 + y_2 + \cdots + y_k}{n_1 + n_2 + \cdots + n_k}.$$
Under this supposition, the expected numbers of elements with characteristic $A$ in each sample are $n_1 p, n_2 p, \ldots, n_k p$.
We can build a table in which we present the observed and expected values:
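The pooled proportion and the resulting statistic can be sketched as follows; the sample counts are illustrative assumptions, and with k samples the statistic has k - 1 degrees of freedom:

```python
def homogeneity_statistic(y, n):
    """k samples of sizes n[i], with y[i] elements presenting characteristic A.
    Pooled proportion p = sum(y)/sum(n); T adds (observed - expected)^2/expected
    over both the A cells and the not-A cells of every sample."""
    p = sum(y) / sum(n)
    T = 0.0
    for yi, ni in zip(y, n):
        e_a, e_not_a = ni * p, ni * (1 - p)   # expected counts under homogeneity
        T += (yi - e_a) ** 2 / e_a + ((ni - yi) - e_not_a) ** 2 / e_not_a
    return p, T

# Assumed: 3 samples of 30 students, with 4, 6 and 2 presenting characteristic A
p, T = homogeneity_statistic([4, 6, 2], [30, 30, 30])
print(round(p, 3), round(T, 3))   # p = 0.133, T = 2.308 (2 degrees of freedom)
```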
When we want to analyze a population, we have to take into account that the population may be divided into subpopulations which differ from each other; if we do not take this into account, we can obtain mistaken results.
Let us consider, for instance, the following data about students of a high school admitted to some seminars:

        N of applications   N of admitted   Admitted proportion
Men     1000                470             0.47
Women   1000                570             0.57

If we suppose that the population is homogeneous, we will reach the conclusion that there is a significant difference between men and women, in favor of women, when they apply for a seminar.
But if the data are analyzed depending on the seminar A, B or C, we get the following table:
As we can see, discrimination is in favor of men in every seminar. Therefore, the conclusions are different when we group the data. This situation is known as Simpson's paradox.
In the case of a normal population with known standard deviation $\sigma_x$ and a normal prior $N(\mu_0, \sigma_0)$ for the population average, the posterior distribution is
$$N(\mu_1, \sigma_1) = N\left(\frac{\dfrac{\mu_0}{\sigma_0^2} + \dfrac{n\bar{x}}{\sigma_x^2}}{\dfrac{1}{\sigma_0^2} + \dfrac{n}{\sigma_x^2}},\ \sqrt{\frac{1}{\dfrac{1}{\sigma_0^2} + \dfrac{n}{\sigma_x^2}}}\right).$$
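The Bayesian update in the formula above can be sketched as follows; the prior N(500, 100) and the sample figures are illustrative assumptions, not data from the text:

```python
import math

def normal_posterior(mu0, sigma0, xbar, sigma_x, n):
    """Posterior N(mu1, sigma1) for the average of a normal population with
    known sigma_x, starting from a N(mu0, sigma0) prior:
    mu1 = (mu0/sigma0^2 + n*xbar/sigma_x^2) / (1/sigma0^2 + n/sigma_x^2),
    sigma1 = sqrt(1 / (1/sigma0^2 + n/sigma_x^2))."""
    precision = 1 / sigma0**2 + n / sigma_x**2
    mu1 = (mu0 / sigma0**2 + n * xbar / sigma_x**2) / precision
    return mu1, math.sqrt(1 / precision)

# Assumed prior N(500, 100) for an average, updated with n = 25, xbar = 540:
mu1, sigma1 = normal_posterior(500, 100, 540, 80, 25)
print(round(mu1, 1), round(sigma1, 1))   # the posterior concentrates near xbar
```

With more data the posterior average moves toward the sample average and the posterior standard deviation shrinks, which is the typical behaviour of this update.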
Chapter 2
An example of application of inference
1. One of the things we want to do is to make t-shirts of the high school and sell them to the students to organize a trip. We will use our data to find a confidence interval for the average pocket money of the students, because it can give us an orientation about how high the price can be so that the students can afford it.
2. Recent studies say that young people devote most of their spare time to connecting to the internet and watching television. Can we say that the students of the high school spend more than one hour daily connected to the internet?
3. We want to see if in our population we can consider to be true the commonly believed figure that 10% of the population is left-handed.
We have data about 25 students referring to the variables mentioned above. The data are the following:
Observation Pocket money Internet Left-handed
1 0 0 0
2 12 10 0
3 12 10 0
4 5 90 0
5 8 90 0
6 8 0 1
7 0 30 0
8 40 60 0
9 21 0 1
10 0 60 0
11 9 45 0
12 4.5 15 0
13 20 0 0
14 0 30 0
15 15 60 0
16 0 30 0
17 0 0 0
18 0 0 0
19 12 30 1
20 9.4 60 0
21 10 60 1
22 2 120 1
23 5 90 0
24 3.5 150 0
25 10 60 0
Let us solve the problems we have posed. We start with our first goal:
Confidence interval for the average pocket money
We start by searching for the limits between which we can find the average pocket money. The first thing to be done is to fix a confidence level; we fix it at 90%.
In which situation are we? We suppose that our population is normal. Do we know $\sigma$? The answer is no. Therefore, we have a normal population with unknown $\sigma$. We recall that the confidence interval for the average in this situation is
$$\left(\bar{x} - t_{\alpha/2}(n-1)\frac{S_c}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}(n-1)\frac{S_c}{\sqrt{n}}\right),$$
for the case of sampling with replacement. As we have made sampling without replacement, we apply the correction factor $\sqrt{\frac{N-n}{N-1}}$, and so we have:
$$\left(\bar{x} - t_{\alpha/2}(n-1)\frac{S_c}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}},\ \bar{x} + t_{\alpha/2}(n-1)\frac{S_c}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}\right).$$
Therefore we need the following data:
$$\bar{x} = 8.256,\quad S_c = 8.895,\quad t_{\alpha/2}(n-1) = t_{0.05}(24) = 1.711,$$
and the interval would be
$$\left(8.256 - 1.711\,\frac{8.895}{\sqrt{25}}\sqrt{\frac{558-25}{558-1}},\ 8.256 + 1.711\,\frac{8.895}{\sqrt{25}}\sqrt{\frac{558-25}{558-1}}\right) = (5.2785, 11.2335).$$
What we get is that the appropriate price limits would be 5.27 euros and 11.23 euros for the t-shirts we want to sell.
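The computation above can be checked numerically; all figures are the ones given in the text:

```python
import math

# Data from the text: xbar = 8.256, Sc = 8.895, n = 25 students out of
# N = 558, and t_{0.05}(24) = 1.711 for a 90% confidence level.
xbar, sc, n, N, t = 8.256, 8.895, 25, 558, 1.711
fpc = math.sqrt((N - n) / (N - 1))        # finite-population correction factor
margin = t * sc / math.sqrt(n) * fpc
print(round(xbar - margin, 2), round(xbar + margin, 2))   # about (5.28, 11.23)
```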
Time devoted by young people to connecting to the internet
We ask ourselves now if we can state that the students of the high school spend more than one hour daily connected to the internet. Which technique can we use to answer our question? We will use a one-sided hypothesis test in which we try to decide if the average of our variable is greater than one hour (60 minutes).
Which is our situation now? We suppose again that our population is normal and, again, $\sigma$ is unknown. We now have to choose a confidence level; let it be 95%.
The null and alternative hypotheses for our test are:
$H_0$: the average time daily devoted to the internet is greater than or equal to 60 minutes.
$H_1$: the average time daily devoted to the internet is lower than 60 minutes.
Since $\sigma$ is unknown, our statistic is
$$T = \frac{\bar{x} - \mu_0}{S_c/\sqrt{n}}.$$
$H_1$: The proportion of left-handed students is not equal to 0.1.
We make the contrast for a confidence level of 95%. We recall that the statistic to be used is
$$T = \frac{p - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}},$$
and the critical region is
$$|T| \ge z_{\alpha/2} = z_{0.025} = 1.96.$$
Since the value of the statistic does not fall in the critical region, we cannot reject the hypothesis that in our high school 10% of the students are left-handed.
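With the data table above, 5 of the 25 students are left-handed, so the value of the statistic can be checked as follows:

```python
import math

# p = 5/25 from the data table; testing p0 = 0.1 at the 95% confidence level
p, p0, n = 5 / 25, 0.1, 25
T = (p - p0) / math.sqrt(p0 * (1 - p0) / n)
print(round(T, 3))   # 1.667, below z_{0.025} = 1.96, so H0 is not rejected
```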
$H_1$: The variance of the height of the students of the 5th level ($\sigma_x^2$) is lower than or equal to the one of the students of the 4th level ($\sigma_y^2$).
We are in the case of two normal populations with unknown averages, so our statistic will be
$$T = \frac{S_{cx}^2}{S_{cy}^2}.$$
As we have $S_{cx}^2 = 66.982$ and $S_{cy}^2 = 58.72$, then $T = 1.14$, which lets us accept the equality of variances and compare the averages with the pooled statistic.
For the averages we have $\bar{x} = 166.692$ and $\bar{y} = 167.8$. If we substitute,
$$T = \frac{166.692 - 167.8 - 0}{\sqrt{\dfrac{(26-1)\,66.982 + (25-1)\,58.72}{26+25-2}}\sqrt{\dfrac{1}{26} + \dfrac{1}{25}}} = -1.69957.$$
The critical region is $T > t_\alpha(n_x + n_y - 2) = 1.6766$, thus we cannot reject the hypothesis. And if we pay attention to the two-sided test ($\mu_x - \mu_y = 0$) and its critical region, $|T| \ge t_{\alpha/2}(n_x + n_y - 2) = 2.0096$, we also cannot reject the null hypothesis, so we cannot say that the students of the 5th level are shorter than the ones of the 4th level.
Our conclusion is that neither claim is right, at least for now. The differences between the averages and the variances of the two populations are not significant.