
Prior Knowledge Check

1) X ~ B(20, 0.4). Calculate:

a) P(X = 5) b) P(X = 10)
c) P(X ≤ 2) d) P(X ≥ 18)

2) Wanda rolls a fair dice eight times.

a) Suggest a suitable model for the random variable X, the number of times the dice lands on 5
b) Calculate:
i) P(X = 2) ii) P(X ≥ 4)


Hypothesis Testing
A hypothesis is a statement made about the value of a population parameter. You can test a hypothesis to see if there is enough evidence to change it.

This chapter is all about testing statistically whether statements are true or not.

→ There are several key terms you need to know…

Hypothesis
A hypothesis is a statement made about a population parameter.

Null hypothesis H0
The ‘default’ position, which we usually initially assume to be true.

Alternative hypothesis H1
This hypothesis tells us about the parameter/situation if our null hypothesis turns out to be incorrect.

Population parameter
A population parameter is a statistical measure relating to a population.

Test statistic
A test statistic is the result of the experiment we are using, which we use to test H0.

7A
Let us think about a practical example:

Imagine we believe that a dice is biased towards landing on 6s.

→ We roll the dice 20 times and get a 6 on 8 occasions

What is the test statistic for this situation?
Write a sensible null hypothesis for this situation
Write a sensible alternative hypothesis for this situation

7A
If X ~ B(n, p), then $P(X = r) = \binom{n}{r} p^r (1 - p)^{n - r}$

A researcher wants to test, at the 5% significance level, whether the dice is biased.

Null hypothesis
→ The dice is unbiased
H0: p = 1/6

Under what conditions would we reject the null hypothesis?

P(8 sixes) = 0.84%

As this is below the 5% significance level, we reject the null hypothesis that the dice is unbiased.

Think about why we are rejecting the null hypothesis.

Imagine we had rolled 7 sixes instead…

P(7 sixes) = 2.5%

As this is below the 5% significance level, we reject the null hypothesis that the dice is unbiased.

Imagine we had rolled 6 sixes instead…

P(6 sixes) = 6.4%

As this is above the 5% significance level, we do not reject the null hypothesis that the dice is unbiased.

Note that in section 7A you will not need to do these calculations, but they are useful in helping understand the concepts behind what we are doing!
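The percentages in the dice example come from the binomial formula with n = 20 and p = 1/6. A minimal Python sketch, assuming those values (the function names `binom_pmf` and `binom_upper_tail` are illustrative and not from the slides), reproduces them. It also prints P(X ≥ r), because a formal test compares the probability of the observed value or a more extreme one with the significance level. In this example both versions lead to the same decisions.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def binom_upper_tail(r, n, p):
    """P(X >= r) for X ~ B(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(r, n + 1))

n, p = 20, 1 / 6  # 20 rolls; H0 says the dice is unbiased
for sixes in (8, 7, 6):
    exact = binom_pmf(sixes, n, p)
    tail = binom_upper_tail(sixes, n, p)
    print(f"{sixes} sixes: P(X = {sixes}) = {exact:.4f}, P(X >= {sixes}) = {tail:.4f}")
```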

7A
John wants to see whether a coin is unbiased, or whether it is biased towards coming down on heads. He tosses the coin 8 times and counts the number of times, X, that it lands heads up.

a) Describe the test statistic
b) Write down a suitable null hypothesis
c) Write down a suitable alternative hypothesis

If we do not specify which way we believe the coin to be biased, then our answer to part c changes…
7A
An election candidate believes she has the support of 40% of the residents in a particular town. A researcher wants to test, at the 5% significance level, whether the candidate is overestimating her support. The researcher asks 20 people whether they support the candidate, and 3 say that they do.

a) Write down a suitable test statistic
b) Write down two suitable hypotheses
c) Explain the condition under which the null hypothesis would be rejected
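For part c), the null hypothesis would be rejected if the probability of 3 or fewer supporters, assuming p = 0.4, is below 5%. A minimal sketch of that check follows; the helper name `binom_cdf` is illustrative, and the calculation goes beyond what section 7A asks for.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p0, observed, alpha = 20, 0.4, 3, 0.05
p_value = binom_cdf(observed, n, p0)  # P(X <= 3) assuming H0: p = 0.4
reject_h0 = p_value < alpha
print(f"P(X <= {observed}) = {p_value:.4f}; reject H0: {reject_h0}")
```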
7A
A critical region is one which, if the test statistic falls within it, would cause you to reject the null hypothesis.

In section 7A, we considered testing the probability that a dice was biased, and calculated the following data:

P(6 sixes out of 20) = 0.064
P(7 sixes out of 20) = 0.025
P(8 sixes out of 20) = 0.0084

Let us put this in a table, and fill in some more data (all calculated using the binomial probability formula)

x (the number of sixes from 20 throws)    Probability
0                                         0.026
1                                         0.104
2                                         0.198
3                                         0.238
4                                         0.202
5                                         0.129
6                                         0.064
7                                         0.025
8                                         0.0084

We can draw this information using a diagram…

Originally, we were considering whether the dice was biased towards 6s, at the 5% significance level. This can be represented on the diagram…

→ The critical region would be the set of values that would lead to the null hypothesis being rejected.

Note that we were only considering whether the dice was biased towards sixes

→ If we were just considering whether the dice was biased, we could also include the value of 0 in the critical region

→ These are called one or two-tailed tests, and we will see more about them later in the chapter…
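The table can be regenerated from the binomial formula, and the critical region read off from the cumulative upper tail. The sketch below assumes X ~ B(20, 1/6) and a 5% one-tailed test for bias towards sixes; `upper_critical_value` is a hypothetical helper name. Some printed values differ in the last digit from the table, which appears to truncate rather than round.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def upper_critical_value(n, p, alpha):
    """Smallest c such that P(X >= c) < alpha when X ~ B(n, p)."""
    tail = 0.0
    critical = None
    for c in range(n, -1, -1):  # build the tail from the top down
        tail += binom_pmf(c, n, p)
        if tail >= alpha:
            break
        critical = c
    return critical

n, p = 20, 1 / 6
for x in range(9):
    print(x, round(binom_pmf(x, n, p), 4))
c = upper_critical_value(n, p, 0.05)
print(f"Critical region: X >= {c}")
```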

7B
A single observation is taken from a binomial distribution B(6, p). The observation is then used to test H0: p = 0.35 against H1: p > 0.35.

a) Using a 5% significance level, find the critical region for this test
b) State the actual significance level of this test

a) H0: p = 0.35

Assuming H0 is true, then X ~ B(6, 0.35)

→ You can then use your calculator or the statistical tables to find the value for which the probability would be less than 5%

The critical region is X = 5 or 6

b) The ‘actual significance level’ is the probability of the test statistic falling within the critical region

→ It can also be thought of as ‘the probability of incorrectly rejecting the null hypothesis’

→ We already found the critical region in part a

→ In this case, the chance of getting 5 or 6 ‘successes’ can be calculated as follows

P(X ≥ 5) = 2.23%
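Both parts can be checked numerically. This is a minimal sketch assuming X ~ B(6, 0.35) under the null hypothesis; the function name `upper_critical_region` is illustrative. It grows the upper tail one value at a time while the tail probability stays below 5%, then reports the tail probability as the actual significance level.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def upper_critical_region(n, p, alpha):
    """Largest upper tail c..n with P(X >= c) < alpha, and its probability."""
    tail, start = 0.0, None
    for c in range(n, -1, -1):
        if tail + binom_pmf(c, n, p) >= alpha:
            break
        tail += binom_pmf(c, n, p)
        start = c
    region = list(range(start, n + 1)) if start is not None else []
    return region, tail

region, actual_level = upper_critical_region(6, 0.35, 0.05)
print(region, round(actual_level, 4))
```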

7B
A random variable X has binomial distribution B(40, p). A single observation is used to test H0: p = 0.25 against H1: p ≠ 0.25.

a) Using the 2% level of significance, find the critical region of this test. The probability in each ‘tail’ should be as close as possible to 0.01
b) State the actual significance level of the test

It can help to summarise the information…

H0: p = 0.25    H1: p ≠ 0.25    n = 40    p = 0.25    Significance level: 2%

So the critical regions are as follows:

P(X ≤ 3) = 0.47%
P(X ≥ 17) = 1.16%

Please note that in some early editions of the textbook, the value has been incorrectly stated as 18 rather than 17!
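The ‘as close as possible to 0.01’ rule can be sketched as a search over every possible boundary in each tail. The code assumes X ~ B(40, 0.25) under the null hypothesis, and `closest_two_tailed_region` is a hypothetical helper name. Adding the two tail probabilities gives the actual significance level asked for in part b).

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def closest_two_tailed_region(n, p, tail_target):
    """Pick each tail so its probability is as close as possible to tail_target."""
    pmf = [binom_pmf(r, n, p) for r in range(n + 1)]
    lower = [sum(pmf[: c + 1]) for c in range(n + 1)]  # P(X <= c)
    upper = [sum(pmf[c:]) for c in range(n + 1)]       # P(X >= c)
    lo = min(range(n + 1), key=lambda c: abs(lower[c] - tail_target))
    hi = min(range(n + 1), key=lambda c: abs(upper[c] - tail_target))
    return lo, hi, lower[lo], upper[hi]

lo, hi, p_lo, p_hi = closest_two_tailed_region(40, 0.25, 0.01)
print(f"X <= {lo}: {p_lo:.4f}, X >= {hi}: {p_hi:.4f}, actual level: {p_lo + p_hi:.4f}")
```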

7B
Hypothesis Testing
You need to be able to carry out a one-tailed hypothesis test

This might be given to you as numbers only or through a context

Reminder:
A one-tailed test is where we are considering only one ‘end’ of the binomial distribution.

For example:

H1: p > 0.3

H1: p < 0.2

→ So it might be worded that a dice is ‘biased towards’ or ‘biased against’, rather than just ‘biased’

7C
A single observation, x, is taken from a binomial distribution B(12, p) and a value of 8 is obtained. Use this observation to test H0: p = 0.4 against H1: p > 0.4 using a 5% significance level.

→ Make sure you read questions carefully!

→ Consider whether you are testing if the probability is above or below the critical value

It can help to summarise the information…

H0: p = 0.4    H1: p > 0.4    n = 12    p = 0.4    x = 8    Significance level: 5%

Write out the calculation we need to do…
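The calculation needed here is P(X ≥ 8) for X ~ B(12, 0.4), compared with 5%. A minimal sketch, with the illustrative helper name `binom_upper_tail`:

```python
from math import comb

def binom_upper_tail(x, n, p):
    """P(X >= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

n, p0, observed, alpha = 12, 0.4, 8, 0.05
p_value = binom_upper_tail(observed, n, p0)  # P(X >= 8) assuming H0: p = 0.4
reject_h0 = p_value < alpha
print(f"P(X >= {observed}) = {p_value:.4f}; reject H0: {reject_h0}")
```

Under these assumptions the tail probability is just above 5%, so the observation is not sufficient to reject H0.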

7C
If the question is given to you in context, it helps to follow a series of steps…

1) Decide on the test statistic
2) Identify the null and alternative hypotheses
3) Calculate the probability of the test statistic taking the observed value (or a more extreme one), assuming the null hypothesis is true
4) Compare this with the significance level
5) Write a conclusion in the context of the question

7C
The standard treatment for a particular disease has a probability 1/4 of success. A certain doctor has undertaken research in this area and has produced a new drug which has been successful with 10 out of 20 patients. The doctor claims that the new drug represents an improvement on the standard treatment. Test, at the 5% significance level, the claim made by the doctor.

The test statistic is the number of patients cured out of 20; the observed value is 10

Summarise the information…

X ~ B(20, p)    n = 20    p = 0.25    x = 10    Significance level: 5%
H0: p = 0.25
H1: p > 0.25

Write out the calculation we need to do…
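The calculation for the doctor's claim is P(X ≥ 10) with X ~ B(20, 0.25). The sketch below wraps the comparison and the decision in one function; `one_tailed_upper_test` is a hypothetical name, and the wording of the conclusion is only one way to phrase it in context.

```python
from math import comb

def one_tailed_upper_test(observed, n, p0, alpha):
    """Test H0: p = p0 against H1: p > p0 using P(X >= observed)."""
    p_value = sum(comb(n, k) * p0**k * (1 - p0)**(n - k)
                  for k in range(observed, n + 1))
    return p_value, p_value < alpha

p_value, reject_h0 = one_tailed_upper_test(observed=10, n=20, p0=0.25, alpha=0.05)
conclusion = ("evidence that the new drug is an improvement" if reject_h0
              else "insufficient evidence that the new drug is an improvement")
print(f"P(X >= 10) = {p_value:.4f}: {conclusion}")
```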

7C
You also need to be able to carry out a two-tailed hypothesis test

It is important to check the wording of questions.

A one-tailed test is used to test whether a probability seems to have increased, or decreased.

A two-tailed test is used to test whether the probability has changed in either direction (the direction will not be specified)

7D
A single observation, x, is taken from a binomial distribution X ~ B(10, p), and a value of 1 is obtained. Use this observation to test H0: p = 0.45 against H1: p ≠ 0.45 using a 5% significance level.

→ Test whether the observation is sufficient to reject H0

For a two-tailed test, we need to halve the significance level used.

→ In this case we would be considering 2.5% at either ‘extreme’ of the distribution

Summarise the information…

X ~ B(10, p)    n = 10    p = 0.45    x = 1    Significance level in each tail: 2.5%
H0: p = 0.45
H1: p ≠ 0.45

A starting point is to find whether you need to check the upper or lower limit

We can now write the following:
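The two-tailed procedure can be sketched as follows: compare the observation with the expected value n × p to choose the tail, then compare that tail probability with half the significance level. The function name `two_tailed_test` is illustrative, and the values are those of the B(10, 0.45) question with an observed value of 1.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def two_tailed_test(observed, n, p0, alpha):
    """Compare the tail on the observed side of the mean n*p0 with alpha / 2."""
    if observed < n * p0:
        tail = sum(binom_pmf(k, n, p0) for k in range(observed + 1))     # P(X <= observed)
    else:
        tail = sum(binom_pmf(k, n, p0) for k in range(observed, n + 1))  # P(X >= observed)
    return tail, tail < alpha / 2

tail, reject_h0 = two_tailed_test(observed=1, n=10, p0=0.45, alpha=0.05)
print(f"Tail probability = {tail:.4f}; reject H0: {reject_h0}")
```

The same function applies to any two-tailed binomial test based on a single observation.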

7D
You need to be able to use your calculator to find values using probabilities that are not in the table…

→ We highly recommend the ‘Casio ClassWiz’

→ We will see how to use this to find values that are not in the statistical tables…

In the previous question, we needed to find P(X ≤ 1) when n = 10 and p = 0.45

→ This value is also in your ClassWiz calculator

→ Let’s see how to calculate it…

Start by pressing the ‘mode’ button
On this screen, press 7
On this screen, press down
On this screen, press 1 (Binomial Cumulative Distribution)
On this screen, press 2
On this screen, fill in the details. In this case, x = 1, n = 10 and p = 0.45

This will give you the value from the table
7D
Over a long period of time it has been found that in Enrico’s restaurant the ratio of non-vegetarian to vegetarian meals sold is 2:1. In Manuel’s restaurant, in a random sample of 12 people ordering meals, 2 ordered a vegetarian meal. Using a 5% significance level, test whether or not the proportion of people eating vegetarian meals in Manuel’s restaurant is different to that in Enrico’s restaurant.

Summarise the information…

X ~ B(12, p)    n = 12    p = 1/3    x = 2    Significance level in each tail: 2.5%
H0: p = 1/3
H1: p ≠ 1/3

The expected number of ‘successes’ would be 12 × 1/3 = 4

Since 2 is less than this, we need to find P(X ≤ 2)

Put the values into your calculator: P(X ≤ 2) = 0.1811

This is the value for P(X ≤ 2) based on the probability and number of trials

Since 0.1811 > 0.025, we cannot reject the null hypothesis that the probabilities are the same.

→ There is not enough evidence to suggest that the proportion in each restaurant is different

7D
Hypothesis Testing – Test A (17 mins) Fundamentals Challenge  Expert
Subtopics: Hypothesis testing, finding critical values, one-tailed tests, two-tailed tests

1. A test statistic has a distribution X ~ B(9, p). Given that H0: p = 0.25, H1: p > 0.25, find the critical
region for the test using a 5% significance level. [4]

2. A random variable has distribution X ~ B(20, p). A single observation of x = 2 is taken from this
distribution. Test, at the 10% significance level, H0: p = 0.15 against H1: p ≠ 0.15. [4]

3. An article states that 45% of drivers in town X drive a black car. A researcher wants to test, at the
10% significance level, whether the article is overestimating the number of black-car drivers. The
researcher asks 40 drivers what colour their car is. Thirteen people say black.
a) Write down a suitable test statistic. [1]
b) Write down a suitable null hypothesis and a suitable alternative hypothesis. [2]
c) Explain the condition under which the null hypothesis would be rejected. [1]

4. On average, a machine fails 4 times out of 10. An engineer designs a new machine that he believes
has a reduced failure rate. He uses his new machine 15 times in order to test his belief.
a) Describe the test statistic. [1]
b) State suitable null and alternative hypotheses. [2]
c) Using a 10% level of significance, find the critical region for a test to check the engineer’s
belief, ensuring the probability is as close to 0.1 as possible. [4]
d) Write down the actual significance level of the test. [1]

5. It is claimed that 10% of women use a particular perfume called ‘Daisy’. In a random survey of 50
women, 41 said they do not use this perfume.
Test, at the 5% significance level, whether or not there is evidence that the proportion of women
using the ‘Daisy’ perfume is 0.1. State your hypotheses carefully. [7]

6. A doctor claims that 80% of patients suffering from a certain illness recover when they are treated
with a new medicine.
A random sample of 25 patients with this illness is taken from hospital records.
a) Write down a suitable distribution to model the number of patients in this sample who recover
when given the new medicine. [1]
b) Assuming that the claim is correct, find the probability that the medicine will be successful for
exactly 19 patients. [2]
The hospital believes that the doctor’s claim is incorrect and the percentage who will recover is
lower. A random sample of 40 patients with the illness who had been prescribed the medicine is
taken from the hospital records. It is found that of these 40 patients, 26 had recovered.
c) Stating your hypotheses clearly, test, at the 2% level of significance, the hospital’s belief. [6]

TOTAL 36 MARKS

Test 6.3a – Hypothesis Testing Page 1 of 1 © ZigZag Education, 2018


Prior knowledge check
1) Given that $y = 3 \times 2^x$

a) Show that log y = A + Bx, where A and B are constants to be found
b) The straight line graph of x against log y is plotted. Write down the gradient of the line and the intercept on the vertical axis

2) The height, h cm, and the handspan, s cm, of 20 students are recorded. The regression line of h on s is found to be h = 22 + 11.3s. Give an interpretation of the value 11.3 in this model.

3) A single observation of x = 32 is taken from the random variable X ~ B(40, p). Test, at the 1% significance level:
H0: p = 0.6 against H1: p > 0.6
Regression, correlation and hypothesis testing
You need to be able to use logarithms and coding to analyse trends in non-linear data

→ You have seen this in the Pure Year 1 course, chapter 14

→ Let’s have a reminder of how the relationships should be written…

If $y = ax^n$:

$\log y = \log(ax^n)$
$\log y = \log a + \log x^n$
$\log y = \log a + n \log x$

If $y = ab^x$:

$\log y = \log(ab^x)$
$\log y = \log a + \log b^x$
$\log y = \log a + x \log b$
1A
The table shows some data collected on the temperature, in °C, of a colony of bacteria (t), and its growth rate (g).

t    3     5     6     8     9     11
g    1.04  1.49  1.79  2.58  3.1   4.46

The data are coded using the changes of variable x = t and y = log g. The regression line of y on x is found to be:
y = −0.2215 + 0.0792x

a) Mika says that the constant −0.2215 in the regression line means that the colony is shrinking when the temperature is 0°C. Explain why Mika is wrong.

→ Using the coding, when t = 0, x = 0 as well…
→ Substitute x = 0 into the equation…

b) Given that the data can be modelled by an equation of the form $g = ab^t$, where a and b are constants, find the values of a and b.

→ You can use the coding again to get the equation in the required form; substitute both ‘codes’ into the equation you are given…
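Undoing the coding can be sketched numerically. The code below assumes the logarithm is base 10 and calls the two constants `a` (the multiplier) and `b` (the base); both the base-10 assumption and the names are choices made for this illustration. Matching y = −0.2215 + 0.0792x against log g = log a + t log b gives log a = −0.2215 and log b = 0.0792.

```python
# Regression line of y on x from the question, with x = t and y = log10(g).
intercept, gradient = -0.2215, 0.0792

# log g = log a + t log b, so a = 10**intercept and b = 10**gradient.
a = 10 ** intercept
b = 10 ** gradient

def growth_rate(t):
    """Modelled growth rate g = a * b**t."""
    return a * b ** t

print(f"a = {a:.3f}, b = {b:.3f}")
print(f"g at t = 0: {growth_rate(0):.3f}")
```

The modelled growth rate at t = 0 is positive, which is why the negative constant in the coded line does not mean the colony is shrinking: that constant is log g, not g.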

1A
You need to be able to calculate and use the product moment correlation coefficient (PMCC)

→ The PMCC (usually denoted by the letter r) will tell you how strong the correlation is in a set of data, plotted on a scatter graph

→ It can take values from −1 (perfect negative correlation) to 1 (perfect positive correlation)

→ On your exam you can use a calculator to work out this value…

1B
From the large data set, the daily mean windspeed, w knots, and the daily maximum gust, g knots, were recorded for the first 10 days in September in Hurn in 1987.

Day of month   1   2   3   4   5   6   7   8    9    10
w              4   4   8   7   12  12  3   4    7    10
g              13  12  19  23  33  37  10  n/a  n/a  23

a) State the meaning of n/a in the table

→ The n/a in the table indicates that no data are available on those days

b) Calculate the product moment correlation coefficient for the remaining 8 days

→ On your Casio ClassWiz, press menu
→ Now press 6
→ Now press 2, since we want a linear regression
→ Now enter the data (remember to ignore the n/a values from this question…)
→ Now press the OPTN button
→ Now press 4

→ Your screen displays 3 values. The PMCC is r, in this case 0.9533 (4 d.p.)

→ The other values help give you the equation of the line of best fit for this data (in the form shown above it). This is not required in this chapter though!

c) With reference to your answer to part b), comment on the suitability of a linear regression model for this data
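The calculator result can be cross-checked from the definition r = Sxy / √(Sxx × Syy). This is a minimal sketch using the table values; the function name `pmcc` and the use of `None` to mark the n/a days are choices made for this illustration.

```python
from math import sqrt

def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

w = [4, 4, 8, 7, 12, 12, 3, 4, 7, 10]
g = [13, 12, 19, 23, 33, 37, 10, None, None, 23]  # None marks the n/a days

# Drop the days with no gust reading before calculating r.
pairs = [(wi, gi) for wi, gi in zip(w, g) if gi is not None]
r = pmcc([wi for wi, _ in pairs], [gi for _, gi in pairs])
print(round(r, 4))
```

A value this close to 1 indicates strong positive linear correlation, which is the kind of evidence part c) asks you to refer to.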

1B
You need to be able to perform a hypothesis test to determine whether a data set has no correlation

An important note is that in the previous section, we used r to denote the PMCC

The letter r is used when the PMCC is calculated for a sample

When we calculate the PMCC for the whole population of a data set, ρ is used (it is the Greek letter rho)

→ When testing if the population PMCC, ρ, is either greater than or below zero, you need to use a one-tailed test

→ When testing whether it is not equal to 0, you should use a two-tailed test…

1C
A scientist takes 30 observations of the masses of two reactants in an experiment. She calculates a PMCC of r = −0.45.

The scientist believes there is no correlation between the masses of the two reactants. Test, at the 10% level of significance, the scientist’s claim, stating your hypotheses clearly.

→ Start by stating your hypotheses

→ You should also state the sample size, and halve the significance level (since this is a two-tailed test)

H0: ρ = 0    H1: ρ ≠ 0    Sample size = 30
Significance level in each tail: 0.05

Finding the critical region
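The decision step can be sketched once a critical value has been looked up. The value 0.3061 used below is an assumption: it is the figure standard PMCC critical value tables list for a sample size of 30 at 0.05 in each tail, and it should be checked against your own table. The function name `pmcc_two_tailed_test` is illustrative.

```python
def pmcc_two_tailed_test(r, critical_value):
    """Reject H0: rho = 0 if r lies in either tail beyond the critical value."""
    return abs(r) > critical_value

# Assumed critical value for sample size 30 at 0.05 in each tail,
# as listed in standard PMCC tables; check it against your own table.
CRITICAL_VALUE_N30 = 0.3061

reject_h0 = pmcc_two_tailed_test(r=-0.45, critical_value=CRITICAL_VALUE_N30)
print("Reject H0" if reject_h0 else "Do not reject H0")
```

With that assumed value the critical region is r < −0.3061 or r > 0.3061, so r = −0.45 would fall inside it.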

1C
The table from the large data set shows the daily maximum gust, w kn, and the daily maximum relative humidity, h%, in Leeming for a sample of eight days in May 2015.

w    31  28  38  37  18  17  21  29
h    99  94  87  80  80  89  84  86

a) Find the PMCC for these data

Use your calculator to find the PMCC (as seen in section 1B)

r = 0.1149

b) Test, at the 10% level of significance, whether there is evidence of a positive correlation between daily maximum gust and daily maximum humidity. State your hypotheses clearly

→ State hypotheses, the sample size, and the significance level

H0: ρ = 0    H1: ρ > 0    Sample size = 8
Significance level: 0.1 (one-tailed test)
1C
