Correlation Questions
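5. What is a scatter diagram? Distinguish between positive and negative correlation with the
help of scatter diagrams.
A scatter diagram is a graphical method of studying the correlation between two variables. One
variable is shown on the X axis and the other on the Y axis. Each pair of values is plotted on
the graph as a dot. If the dots rise from the lower left to the upper right, the correlation is
positive; if they fall from the upper left to the lower right, the correlation is negative.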
6. Define Karl Pearson’s co-efficient of correlation and explain the various formulae through
which Pearson’s co-efficient of correlation can be obtained.
Karl Pearson’s coefficient of correlation:
It gives a numerical expression for the measure of correlation. It is denoted by ‘r’.
The value of ‘r’ gives the magnitude of the correlation and its sign denotes the direction.
Method No.1
Direct method based on deviations
Deviations of the items are taken either from the actual means or from assumed means.
The actual mean method is convenient when the values of the variables are large and their
deviations from their respective means come out as whole numbers.
The assumed mean method is advisable when the arithmetic means of the two variables are not
whole or round numbers, i.e. when the actual means are fractions. It is then better to take the
deviations from an assumed mean.
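Formula
Actual mean method:
$$ r = \frac{\sum dx\,dy}{\sqrt{\sum dx^2 \cdot \sum dy^2}} $$
where $dx = X - \bar{X}$ and $dy = Y - \bar{Y}$, so that $\sum dx = 0$ and $\sum dy = 0$.
Assumed mean method:
$$ r = \frac{n\sum dx\,dy - \sum dx \sum dy}{\sqrt{\left[n\sum dx^2 - (\sum dx)^2\right]\left[n\sum dy^2 - (\sum dy)^2\right]}} $$
where $dx = X - A$ and $dy = Y - B$, with $A$ and $B$ denoting the assumed means of X and Y.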
Steps (actual mean method)
1. Find the arithmetic mean of both the series.
2. Find the deviations of the values of both the variables from their respective means and
present them as x and y (or dx and dy) respectively.
3. Square the deviations of each variable and present them as x² and y² (or dx² and dy²).
4. Find the product of each pair of deviations and total them under ∑xy (or ∑dxdy).
5. Find the total of the squares of the deviations of each variable, ∑x² and ∑y² (or ∑dx² and
∑dy²).
6. Put the respective values in the relevant formula.
7. Interpret the result, which lies between +1 and -1.
Steps (assumed mean method)
a) Find the deviations of both the X and Y variables from the assumed mean. They are denoted
by dx and dy, and their totals by ∑dx and ∑dy respectively.
b) Square the deviations under the headings dx² and dy² and total them as ∑dx² and ∑dy².
c) Multiply dx and dy and total the products as ∑dxdy.
d) Substitute the values in the formula.
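$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n \cdot \mathrm{SD}_X \cdot \mathrm{SD}_Y} $$
$$ \mathrm{SD}_X = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}, \qquad \mathrm{SD}_Y = \sqrt{\frac{\sum (Y - \bar{Y})^2}{n}} $$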
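The two methods can be checked against each other numerically. The following is a minimal Python sketch, not part of the original notes: the function names, the sample data and the assumed means 5 and 4 are illustrative assumptions. It uses the actual-mean form r = ∑dxdy / √(∑dx² · ∑dy²) and the usual assumed-mean shortcut r = (n∑dxdy - ∑dx∑dy) / √[(n∑dx² - (∑dx)²)(n∑dy² - (∑dy)²)].

```python
from math import sqrt


def pearson_actual_mean(xs, ys):
    """Pearson's r with deviations taken from the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]          # deviations of X from its mean
    dy = [y - mean_y for y in ys]          # deviations of Y from its mean
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_dxdy / sqrt(sum_dx2 * sum_dy2)


def pearson_assumed_mean(xs, ys, a, b):
    """Pearson's r with deviations taken from assumed means a and b."""
    n = len(xs)
    dx = [x - a for x in xs]               # deviations of X from assumed mean a
    dy = [y - b for y in ys]               # deviations of Y from assumed mean b
    sum_dx, sum_dy = sum(dx), sum(dy)      # generally non-zero here
    sum_dxdy = sum(p * q for p, q in zip(dx, dy))
    sum_dx2 = sum(p * p for p in dx)
    sum_dy2 = sum(q * q for q in dy)
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = sqrt((n * sum_dx2 - sum_dx ** 2) * (n * sum_dy2 - sum_dy ** 2))
    return numerator / denominator


# Illustrative data (hypothetical, not taken from the problems in these notes).
xs = [2, 4, 6, 8, 10]
ys = [1, 3, 4, 8, 9]

r_actual = pearson_actual_mean(xs, ys)
r_assumed = pearson_assumed_mean(xs, ys, a=5, b=4)
print(round(r_actual, 4), round(r_assumed, 4))
```

Both calls should print the same value, because the correction terms ∑dx and ∑dy in the assumed-mean formula remove the effect of the arbitrary choice of assumed means.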
7. What are the properties of Karl Pearson’s co-efficient of correlation?
a) The correlation coefficient has a well-defined formula.
b) r is a pure number and is independent of the units of measurement.
c) It lies between -1 and +1.
d) r does not change with a change of origin or a change of scale.
e) r between X and Y is the same as that between Y and X.
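Properties (c), (d) and (e) can be illustrated with a short Python sketch. The helper `pearson_r`, the data and the particular shifts and scale factors are assumptions chosen for illustration only. The scale factors used here are positive; multiplying just one of the variables by a negative factor would reverse the sign of r while leaving its magnitude unchanged.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_dxdy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_dx2 = sum((x - mean_x) ** 2 for x in xs)
    sum_dy2 = sum((y - mean_y) ** 2 for y in ys)
    return sum_dxdy / sqrt(sum_dx2 * sum_dy2)


# Illustrative data (hypothetical).
xs = [2, 4, 6, 8, 10]
ys = [1, 3, 4, 8, 9]

r_xy = pearson_r(xs, ys)

# (e) Symmetry: r between X and Y equals r between Y and X.
r_yx = pearson_r(ys, xs)

# (d) Change of origin and (positive) scale: u = (X - 6) / 2, v = 10 * (Y - 5).
us = [(x - 6) / 2 for x in xs]
vs = [10 * (y - 5) for y in ys]
r_uv = pearson_r(us, vs)

print(abs(r_xy - r_yx) < 1e-12)   # symmetry holds
print(abs(r_xy - r_uv) < 1e-12)   # unchanged by origin and scale
print(-1 <= r_xy <= 1)            # (c) r lies between -1 and +1
```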
8. Explain the assumptions on which Karl Pearson’s ‘r’ is based.
Ans. Prof. Pearson’s co-efficient is based on the following assumptions:
1) Linear relationship
In devising the formulae, Prof. Pearson assumed that there is a linear relationship between
the variables, which means that if the values of the two variables are plotted on a scatter
diagram, the points will lie along a straight line.
2) Cause and effect relationship
There is a cause and effect relationship between the variables, which means that a change in
the value of one variable causes a change in the value of the other.
3) Normality of distribution
It is assumed that the population from which the data are collected is normally distributed.
4) Multiplicity of causes
Each of the variables under study is affected by a multiplicity of causes.
5) Probable error of measurement
He assumed that some error may creep into the measurement of ‘r’.
9. What do you mean by probable error? What are its uses?
The probable error of ‘r’ is a statistical measure of the reliability and dependability of the
value of ‘r’.
The magnitude of the probable error is obtained from the following formula.
$$ PE(r) = 0.6745 \times \frac{1 - r^2}{\sqrt{n}} $$
$$ SE(r) = \frac{1 - r^2}{\sqrt{n}} $$
where r = coefficient of correlation, n = number of pairs of the two variables, and 0.6745 is a
constant.
Usually ‘r’ is calculated from samples. For different samples drawn from the same population,
‘r’ may vary, but the numerical value of such variations is expected to be less than the PE.
Probable error is the difference (error) that arises from taking samples from the population.
Interpretation of ‘r’ on the basis of PE
r < PE(r) (r less than PE): not at all significant; correlation is taken to be almost absent.
By adding the value of PE to the coefficient of correlation and subtracting it from the
coefficient, we get the upper and lower limits within which ‘r’ in the population can be
expected to lie.
Symbolically, ρ = r ± PE(r), where ρ is the correlation coefficient in the population.
Uses of PE
a) It is used to determine the limits within which the population correlation coefficient
may be expected to lie.
b) It measures the reliability and dependability of the value of ‘r’.
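The probable error calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the values r = 0.8 and n = 16 are made up for the example rather than taken from the problems in these notes. The only significance rule coded is the one stated above, namely that r smaller than PE(r) is treated as not significant.

```python
from math import sqrt


def standard_error(r, n):
    """SE(r) = (1 - r^2) / sqrt(n)."""
    return (1 - r * r) / sqrt(n)


def probable_error(r, n):
    """PE(r) = 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * standard_error(r, n)


def population_limits(r, n):
    """Lower and upper limits r - PE and r + PE for the population coefficient."""
    pe = probable_error(r, n)
    return r - pe, r + pe


def is_negligible(r, n):
    """True when |r| < PE(r), i.e. correlation is taken to be almost absent."""
    return abs(r) < probable_error(r, n)


# Illustrative values (hypothetical).
r, n = 0.8, 16
pe = probable_error(r, n)
lower, upper = population_limits(r, n)
print(round(pe, 4), round(lower, 4), round(upper, 4))
print(is_negligible(r, n))
```

With these illustrative values PE is about 0.0607, so the limits are roughly 0.7393 and 0.8607, and the correlation is not negligible.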
10. Explain the methods for calculating Spearman’s rank correlation co-efficient.
This method is a development over Karl Pearson’s method of correlation coefficient.
There are many occasions where the values of certain variables cannot be measured in
quantitative terms. If we want to study the association between two attributes such as
intelligence and beauty, we cannot assign definite values to them. To study the correlation
between such attributes, a method was developed by the British psychologist Charles Edward
Spearman in 1904. This method is known as rank correlation.
$$ R = 1 - \frac{6\sum D^2}{N^3 - N} \quad \text{or} \quad R = 1 - \frac{6\sum D^2}{N(N^2 - 1)} $$
where ‘D’ is the difference of ranks and ‘N’ is the number of pairs.
When the actual ranks are given
Steps:
a) Take the difference of the two ranks, i.e. D = R1 - R2.
b) Square these differences (D²) and total them, i.e. ∑D².
c) Apply the formula.
When the actual ranks are not given
Steps:
a) Assign the ranks in either ascending or descending order.
b) In case of ascending order, the smallest value is assigned the first rank.
c) In case of descending order, the largest value is assigned the first rank.
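Both cases can be sketched in Python as follows. This is a minimal illustration, not a general implementation: the function names and the sample marks are hypothetical, ranks are assigned in descending order, and the ranking helper assumes there are no tied values (ties need a separate adjustment that these notes do not cover).

```python
def spearman_from_ranks(ranks_1, ranks_2):
    """R = 1 - 6 * sum(D^2) / (N * (N^2 - 1)), with D = R1 - R2."""
    n = len(ranks_1)
    sum_d2 = sum((r1 - r2) ** 2 for r1, r2 in zip(ranks_1, ranks_2))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


def rank_descending(values):
    """Assign rank 1 to the largest value. Assumes all values are distinct."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


# Case 1: the actual ranks are given (illustrative ranks).
given_1 = [1, 2, 3, 4, 5]
given_2 = [2, 1, 4, 3, 5]
r_given = spearman_from_ranks(given_1, given_2)

# Case 2: only marks are given, so ranks are assigned first (illustrative marks).
marks_a = [80, 70, 60, 50, 40]
marks_b = [65, 75, 45, 55, 35]
ranks_a = rank_descending(marks_a)
ranks_b = rank_descending(marks_b)
r_marks = spearman_from_ranks(ranks_a, ranks_b)

print(ranks_a, ranks_b)
print(round(r_given, 4), round(r_marks, 4))
```

In this example the marks produce the same two sets of ranks as the first case, so both calls give the same coefficient. Ranking both series in ascending order instead would leave the differences D, and hence R, unchanged in magnitude and sign.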
No.4
Find the co-efficient of correlation between age and playing habit of the following students.
Age:              14.5-15.5  15.5-16.5  16.5-17.5  17.5-18.5  18.5-19.5  19.5-20.5
No. of students:  250        200        150        120        100        80
Regular players:  200        150        90         48         30         12
No.5
Find the co-efficient of correlation between the density of population and the death rate.
Cities:              A     B     C     D     E     F
Area in sq. miles:   150   180   100   60    120   80
Population in ’000:  30    90    40    42    72    24
No. of deaths:       300   1440  560   840   1224  312
No 6
A student calculates the value of ‘r’ as 0.7 when the number of items (n) is 25. Find the limits
within which ‘r’ lies for another sample from the same universe.
No.7
Test the significance of correlation for the following values based on the number of
observations:
i) 10 and ‘r’ = +0.4
No 8
The coefficient of rank correlation of the marks obtained by 10 students in statistics and
accountancy was 0.2. It was later discovered that the difference in ranks in the two subjects
of one of the students was wrongly entered as 7 instead of 9. Find the correct correlation
coefficient.
No 9
The rankings of 10 individuals at the start and at the finish of a course of training are as
follows.
Individuals:  A   B   C   D   E   F   G   H   I   J
Rank before:  1   6   3   9   5   2   7   10  8   4
Rank after:   6   8   3   2   7   10  5   9   4   1
No 10
No 11
No 12
No.13
                                          X     Y
No. of pairs of observations              20    20
Sum of squares of deviations from mean    136   138
Summation of product of deviations of X and Y from their means = 122
Calculate the product moment correlation coefficient.
Additional problems
1. Calculate Karl Pearson’s ‘r’. The arithmetic means of X and Y are 6 and 8 respectively.
X:  6   2   10  -   8
Y:  9   11  -   8   7
2.                                                 X series   Y series
Sum of deviations from assumed mean                -14        18
Sum of squares of deviations from assumed mean     4304       6308
Sum of products of deviations from their respective assumed means = 1510
No. of pairs of observations = 12
Calculate Karl Pearson’s ‘r’ between X and Y.
4. The covariance between X and Y is 488, and the variances of X and Y are 824 and 325
respectively. Find out ‘r’.
6. While calculating ‘r’ from 30 pairs of X and Y, the following results were obtained:
∑X = 120, ∑X² = 600, ∑Y = 90, ∑Y² = 250, ∑XY = 356.
It was, however, discovered that two pairs of observations were wrongly taken as
X   Y
8   10
12  7
while the correct values were
X   Y
8   12
10  8
Calculate the correct correlation coefficient.
7. The correlation co-efficient (on the basis of rank) of the marks awarded to 10 students in
commerce and economics was 0.2. Later, it was found that the difference in ranks in the two
subjects of one of the students was wrongly entered as 7 instead of 9. Calculate the correct
rank correlation co-efficient.
8. From the following data relating to the marks secured by a batch of candidates, ascertain the
rank correlation coefficient and interpret the result.
Candidates:           A   B   C   D   E   F   G   H   I   J
Marks in statistics:  55  40  50  35  37  18  30  22  15  5
Maths:                58  60  48  50  30  32  45  37  42  52
Economics:            70  68  75  40  80  50  30  85  25  90