
CORRELATION AND REGRESSION
Unit 5 Correlation and Regression

❖ Bivariate Data:

When data are collected on two variables simultaneously, they are known as
bivariate data and the corresponding frequency distribution is known as Bivariate
Frequency Distribution.

1. Marginal Distribution: It is the frequency distribution of one variable (x or y) across the full range of values of the other variable.
2. Conditional Distribution: It is the frequency distribution of one variable (x or y) across a particular sub-population of the other variable.

Example:

Y ↓ / X →        10         11         12         13      Total ($f_y$)
21             | (1)        -        || (2)     | (1)         4
22             | (1)     ||| (3)     | (1)        -           5
23             || (2)       -        | (1)      | (1)         4
24             | (1)      || (2)     || (2)     || (2)        7
Total ($f_x$)     5          5          6          4         20

(1) Marginal Frequency Distribution of X:

X    10   11   12   13   Total
f     5    5    6    4    20

(2) Marginal Frequency Distribution of Y:

Y    21   22   23   24   Total
f     4    5    4    7    20

(3) Conditional Frequency Distribution of X when Y = 23:

X    10   11   12   13   Total
f     2    0    1    1     4

(4) Conditional Frequency Distribution of Y when X = 11:

Y    21   22   23   24   Total
f     0    3    0    2     5
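The four distributions above can be reproduced mechanically from the bivariate table. The following is a minimal Python sketch; the names `freq`, `marginal_x`, `marginal_y`, `conditional_x_given_y` and `conditional_y_given_x` are illustrative choices, not notation from these notes.

```python
# Bivariate frequency table from the example: rows are Y values, columns are X values.
x_values = [10, 11, 12, 13]
y_values = [21, 22, 23, 24]
freq = [
    [1, 0, 2, 1],  # Y = 21
    [1, 3, 1, 0],  # Y = 22
    [2, 0, 1, 1],  # Y = 23
    [1, 2, 2, 2],  # Y = 24
]

# Marginal distribution of X: add each column over the full range of Y.
marginal_x = {x: sum(row[j] for row in freq) for j, x in enumerate(x_values)}

# Marginal distribution of Y: add each row over the full range of X.
marginal_y = {y: sum(freq[i]) for i, y in enumerate(y_values)}


def conditional_x_given_y(y):
    """Frequencies of X restricted to the row where Y equals y."""
    i = y_values.index(y)
    return dict(zip(x_values, freq[i]))


def conditional_y_given_x(x):
    """Frequencies of Y restricted to the column where X equals x."""
    j = x_values.index(x)
    return {yv: freq[i][j] for i, yv in enumerate(y_values)}


print(marginal_x)                 # {10: 5, 11: 5, 12: 6, 13: 4}
print(marginal_y)                 # {21: 4, 22: 5, 23: 4, 24: 7}
print(conditional_x_given_y(23))  # {10: 2, 11: 0, 12: 1, 13: 1}
print(conditional_y_given_x(11))  # {21: 0, 22: 3, 23: 0, 24: 2}
```

A marginal distribution is a row or column of totals, whereas a conditional distribution is a single row or column of the body of the table.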

Prof. Shubham Agarwal For Concept Query Only 8806337760 5.1



❖ Correlation:

If two variable quantities are interdependent, i.e. a change in one variable tends to be accompanied by a corresponding change in the other variable, either directly or inversely, then the two variables are said to be associated or correlated, and the process of establishing the relation between them is known as correlation.

Types of correlation:

1. Positive correlation: If two variables move in the same direction, i.e. when one variable increases the other also increases, and when one decreases the other also decreases, the correlation is called positive.
Example: Profit and investment; the length of your hair and the amount of shampoo you need; knowledge and study.

2. Negative correlation: If two variables move in opposite directions, i.e. when one variable increases the other decreases, the correlation is called negative.
Example: Temperature of the environment and sale of winter wear; hours of sleep and time spent preparing for exams; speed and the time required to cover a given distance.

3. Zero correlation: When an increase or decrease in x is not accompanied by any systematic linear change in y, x and y are said to be uncorrelated and the correlation coefficient between them is zero. Such variables are also described as dissociated. Independent variables are always uncorrelated, but uncorrelated variables need not be independent.
Example: Shoe size and intelligence; the time you spend playing mobile games at home and the rise in the number of Covid cases; height and exam scores.

❖ Correlation coefficient:
It measures the degree (strength) and direction of the linear relation between two variables. It lies between -1 and +1, and is denoted by “r”. It is a unit-free measure.
• If r = 0, there is no linear correlation
• If r = +1, there is perfect positive correlation
• If r = -1, there is perfect negative correlation


❖ Measures of Correlation Co-efficient:

1. Scatter Diagram Method.
2. Karl Pearson’s Product Moment Correlation Coefficient.
3. Spearman Rank Coefficient of Correlation.
4. Concurrent Deviation Method.

1. Scatter diagram method:

It is the simplest method to establish correlation between two variables, although it fails to measure the extent of relationship between the two variables. The greater the scatter of plotted points on the chart, the lesser is the relationship between the two variables. The more closely the points come to a straight line, the higher is the degree of relationship.

[Figure: scatter diagrams. a) Perfect positive correlation; b) Positive correlation with high degree; c) Positive correlation with low degree; d) Perfect negative correlation]


[Figure: scatter diagrams. e) Negative correlation with high degree; f) Negative correlation with low degree; g) No correlation]

2. Karl Pearson’s coefficient of correlation:

It is the most widely used method, as it measures both the direction and the extent of the relationship between the two variables. It can measure correlation only when the variables have a linear relationship.
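• $r_{xy} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x \times \sigma_y}$
• $r_{xy} = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$
• $\mathrm{Cov}(x, y) = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{n}$
• $\mathrm{Cov}(x, y) = \dfrac{\sum xy}{n} - \bar{x}\,\bar{y}$
• $S.D_x = \sigma_x = \sqrt{\dfrac{\sum x^2}{n} - (\bar{x})^2}$
• $S.D_y = \sigma_y = \sqrt{\dfrac{\sum y^2}{n} - (\bar{y})^2}$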


where,

Cov(x, y) = covariance of x and y (lies between $-\infty$ and $\infty$)
$\sigma_x$ and $\sigma_y$ = standard deviations (S.D.) of x and y

Properties of Karl Pearson’s coefficient of correlation:

a) Based on actual data.
b) Its magnitude is not affected by change of origin or scale; its sign reverses only when the two scale factors have opposite signs.

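Problem no.1: Find correlation coefficient from the following information:

x       y       xy      $x^2$   $y^2$
2       9       18      4       81
3       8       24      9       64
5       8       40      25      64
5       6       30      25      36
6       5       30      36      25
8       3       24      64      9
Total:
29      39      166     163     279

Solution:
Now, $\bar{x} = \frac{29}{6} = 4.83$, $\bar{y} = \frac{39}{6} = 6.50$

$\mathrm{Cov}(x, y) = \frac{\sum xy}{n} - \bar{x}\,\bar{y} = \frac{166}{6} - \frac{29}{6} \times \frac{39}{6} = -3.75$

$S.D_x = \sqrt{\frac{\sum x^2}{n} - (\bar{x})^2} = \sqrt{\frac{163}{6} - \left(\frac{29}{6}\right)^2} = 1.95$

$S.D_y = \sqrt{\frac{\sum y^2}{n} - (\bar{y})^2} = \sqrt{\frac{279}{6} - (6.50)^2} = 2.06$

$r_{xy} = \frac{\mathrm{Cov}(x, y)}{\sigma_x \times \sigma_y} = \frac{-3.75}{1.95 \times 2.06} = -0.93$

Alternate Method: $r_{xy} = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}} = \frac{6(166) - (29)(39)}{\sqrt{6(163) - 29^2}\,\sqrt{6(279) - 39^2}} = \frac{-135}{\sqrt{137}\,\sqrt{153}} = -0.93$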
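As a cross-check on Problem no.1, both forms of Karl Pearson's formula can be evaluated in a few lines of Python. This is only a sketch under the notes' convention of dividing by n (population moments); the function names `pearson_r` and `pearson_r_sums` are made up for illustration.

```python
from math import sqrt

# Data of Problem no.1
x = [2, 3, 5, 5, 6, 8]
y = [9, 8, 8, 6, 5, 3]


def pearson_r(xs, ys):
    """r = Cov(x, y) / (sd_x * sd_y), using divide-by-n moments."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum(a * b for a, b in zip(xs, ys)) / n - mean_x * mean_y
    sd_x = sqrt(sum(a * a for a in xs) / n - mean_x ** 2)
    sd_y = sqrt(sum(b * b for b in ys) / n - mean_y ** 2)
    return cov / (sd_x * sd_y)


def pearson_r_sums(xs, ys):
    """Alternate form built directly from the column totals."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(a * b for a, b in zip(xs, ys))
    sxx = sum(a * a for a in xs)
    syy = sum(b * b for b in ys)
    return (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))


print(round(pearson_r(x, y), 2))       # -0.93
print(round(pearson_r_sums(x, y), 2))  # -0.93
```

The two functions agree because the second is the first with numerator and denominator multiplied through by n squared.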

Problem No.2: Given that the correlation coefficient between x and y is 0.8, write down the correlation coefficient between u and v where

i) 2u + 3x + 4 = 0 and 4v + 16y + 11 = 0

Ans: Here u = -(3/2)x - 2 and v = -4y - 11/4. The coefficients of x and y are both negative, so their product is positive; therefore the value and sign remain the same. ∴ $r_{uv} = r_{xy} = 0.8$


ii) 2u - 3x + 4 = 0 and 4v + 16y + 11 = 0

Ans: Here u = (3/2)x - 2 and v = -4y - 11/4. The coefficient of x is positive and that of y is negative, therefore the value remains the same and the sign changes. ∴ $r_{uv} = -r_{xy} = -0.8$

iii) 2u - 3x + 4 = 0 and 4v - 16y + 11 = 0

Ans: Here u = (3/2)x - 2 and v = 4y - 11/4. The coefficients of x and y are both positive, therefore the value and sign remain the same. ∴ $r_{uv} = r_{xy} = 0.8$

iv) 2u + 3x + 4 = 0 and 4v - 16y + 11 = 0

Ans: Here u = -(3/2)x - 2 and v = 4y - 11/4. The coefficient of x is negative and that of y is positive, therefore the value remains the same and the sign changes. ∴ $r_{uv} = -r_{xy} = -0.8$
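The sign rule applied in Problem No.2 can be written as a small check. The sketch below assumes each relation has the implicit form p·u + q·x + k = 0, so that u = -(q/p)·x - k/p; the helper names `slope_from_implicit` and `transformed_r` are hypothetical.

```python
def slope_from_implicit(p, q):
    """Slope of u on x for the relation p*u + q*x + k = 0, i.e. u = -(q/p)*x - k/p."""
    return -q / p


def transformed_r(r_xy, slope_u, slope_v):
    """r_uv when u = slope_u * x + const and v = slope_v * y + const."""
    if slope_u == 0 or slope_v == 0:
        raise ValueError("both slopes must be non-zero")
    sign = 1 if slope_u * slope_v > 0 else -1
    return sign * r_xy


r_xy = 0.8
# (coefficient of u, coefficient of x), (coefficient of v, coefficient of y)
cases = {
    "i": ((2, 3), (4, 16)),
    "ii": ((2, -3), (4, 16)),
    "iii": ((2, -3), (4, -16)),
    "iv": ((2, 3), (4, -16)),
}

results = {
    name: transformed_r(r_xy, slope_from_implicit(*u_eq), slope_from_implicit(*v_eq))
    for name, (u_eq, v_eq) in cases.items()
}
print(results)  # {'i': 0.8, 'ii': -0.8, 'iii': 0.8, 'iv': -0.8}
```

Only the signs of the two slopes matter; the constants 4 and 11 (change of origin) and the sizes of the slopes (change of scale) do not affect the magnitude of r.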

3. Spearman Rank Coefficient of Correlation:

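It is applied to identify the correlation between two qualitative characteristics (attributes) that can be ranked.

• $r = 1 - \dfrac{6\sum D^2}{n(n^2 - 1)}$
• In case of tie in ranks: $r = 1 - \dfrac{6\left[\sum D^2 + \sum \frac{(t^3 - t)}{12}\right]}{n(n^2 - 1)}$

where,

r = Rank Coefficient of Correlation
D = R1 – R2, the difference in ranks
n = number of observations
t = number of observations sharing a tied rank (one term for each group of ties)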

Properties of Spearman rank coefficient of correlation:

a) Based on Ranks
b) Not affected by change of origin and scale.
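The tie-corrected formula is easier to follow on actual numbers. The sketch below applies it to the series given in question 36 of the exercise set, under the usual assumption (not spelled out in these notes) that tied values each receive the mean of the ranks they would jointly occupy. The function names are illustrative.

```python
def average_ranks(values):
    """Rank in descending order (largest value = rank 1); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def tie_correction(values):
    """Sum of (t^3 - t) / 12 over every group of t tied values."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return sum((t ** 3 - t) / 12 for t in counts.values() if t > 1)


def spearman_with_ties(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    correction = tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))


x = [10, 13, 12, 15, 8, 15]
y = [12, 16, 18, 16, 7, 18]
print(average_ranks(x))                    # [5.0, 3.0, 4.0, 1.5, 6.0, 1.5]
print(average_ranks(y))                    # [5.0, 3.5, 1.5, 3.5, 6.0, 1.5]
print(round(spearman_with_ties(x, y), 3))  # 0.657
```

Here the sum of squared rank differences is 10.5 and the three tied pairs (one in x, two in y) each add (2³ - 2)/12 = 0.5, giving r = 1 - 6(10.5 + 1.5)/(6 × 35).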

Problem No.3: Compute the coefficient of rank correlation:

Sales (x)   Ad (y)   Rank (x)   Rank (y)   D = R1 - R2   $D^2$
90          7        2          2           0            0
85          6        3          3           0            0
68          2        8          7           1            1
75          3        6          6           0            0
82          4        4          5          -1            1
80          5        5          4           1            1
95          8        1          1           0            0
70          1        7          8          -1            1
Total       -        -          -           0            4


Solution:

Since n = 8 and $\sum D^2 = 4$,
$r = 1 - \dfrac{6\sum D^2}{n(n^2 - 1)} = 1 - \dfrac{6 \times 4}{8 \times 63} = 0.95$
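The ranking and the final value in Problem No.3 can be verified with a short Python sketch. It assumes there are no tied values (true for this data) and ranks the largest value as 1; `ranks_desc` and `spearman` are illustrative names.

```python
sales = [90, 85, 68, 75, 82, 80, 95, 70]
ads = [7, 6, 2, 3, 4, 5, 8, 1]


def ranks_desc(values):
    """Rank 1 goes to the largest value; assumes all values are distinct."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman(xs, ys):
    n = len(xs)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_desc(xs), ranks_desc(ys)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


print(ranks_desc(sales))              # [2, 3, 8, 6, 4, 5, 1, 7]
print(ranks_desc(ads))                # [2, 3, 7, 6, 5, 4, 1, 8]
print(round(spearman(sales, ads), 2)) # 0.95
```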

4. Concurrent deviation method:


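It is the simplest and quickest numerical method to find correlation, as it uses only the direction of change of each variable.
• $r_c = \pm\sqrt{\pm\left(\dfrac{2C - N}{N}\right)}$
Where,
C = number of concurrent deviations (pairs of deviations having the same sign)
N = number of pairs of deviations, i.e. number of observations - 1
The sign of (2C - N) is taken both inside and outside the square root, so that the quantity under the root is never negative.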

Properties of concurrent deviation method:

a) Based on direction of change
b) Not affected by change of origin and scale.

Problem No. 4: Find coefficient of concurrent deviation:

Year   Price   Sign of deviation    Demand   Sign of deviation    Product of
               from previous fig.            from previous fig.   deviations
1990   25                           35
1991   28      +                    34       -                    -
1992   30      +                    35       +                    +
1993   23      -                    30       -                    +
1994   35      +                    29       -                    -
1995   38      +                    28       -                    -
1996   39      +                    26       -                    -
1997   42      +                    23       -                    -

Solution:

Since, N = no. of pairs of deviations = 7

C = no. of positive signs in the product column = 2

Here $\dfrac{2C - N}{N} = \dfrac{4 - 7}{7} = -\dfrac{3}{7}$ is negative, thus $r_c = -\sqrt{\dfrac{3}{7}} = -0.65$
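The steps of Problem No. 4 can be sketched in Python as follows. One assumption beyond the notes: a pair in which either series does not change is treated as non-concurrent (no such pair occurs in this data). The names `sign_changes` and `concurrent_deviation` are illustrative.

```python
from math import sqrt

price = [25, 28, 30, 23, 35, 38, 39, 42]
demand = [35, 34, 35, 30, 29, 28, 26, 23]


def sign_changes(series):
    """+1, -1 or 0 for each move from the previous figure."""
    return [(b > a) - (b < a) for a, b in zip(series, series[1:])]


def concurrent_deviation(xs, ys):
    dx, dy = sign_changes(xs), sign_changes(ys)
    n = len(dx)                                      # N = observations - 1
    c = sum(1 for a, b in zip(dx, dy) if a * b > 0)  # C = concurrent pairs
    ratio = (2 * c - n) / n
    # The same sign is used inside and outside the root.
    return sqrt(ratio) if ratio >= 0 else -sqrt(-ratio)


print(sign_changes(price))   # [1, 1, -1, 1, 1, 1, 1]
print(sign_changes(demand))  # [-1, 1, -1, -1, -1, -1, -1]
print(round(concurrent_deviation(price, demand), 2))  # -0.65
```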

❖ Properties of coefficient of correlation:


• If x = bu + a and y = dv + c, then $r_{xy} = \dfrac{bd}{|b|\,|d|}\, r_{uv}$
• $-1 \le r \le 1$


❖ Regression:

Correlation is used to establish the relationship between two variables, whereas regression describes how the dependent variable is numerically related to the independent variable. It is concerned with estimating the value of a variable for a given value of another variable, in the form of an equation.

Example: One can estimate the profit for a given level of investment on the basis of the past records.

Note: If y depends on x, then y is known as the dependent variable, regressand or explained variable, and x is known as the independent variable, predictor or explanatory variable (explanator).

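1. Regression equation y on x: To estimate the value of y for a given value of x
$y - \bar{y} = b_{yx}(x - \bar{x})$

2. Regression equation x on y: To estimate the value of x for a given value of y
$x - \bar{x} = b_{xy}(y - \bar{y})$

3. Regression coefficient of y on x: $b_{yx} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x^2} = r\,\dfrac{\sigma_y}{\sigma_x} = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$

4. Regression coefficient of x on y: $b_{xy} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_y^2} = r\,\dfrac{\sigma_x}{\sigma_y} = \dfrac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2}$

5. $r = \pm\sqrt{b_{yx} \times b_{xy}}$, where r takes the common sign of the two regression coefficients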
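To see how these formulas fit together, the sketch below computes both regression coefficients for the data of Problem no.1 and recovers r from them. It is an illustration only; `regression_coefficients` and `predict_y` are made-up names.

```python
from math import sqrt

# Data of Problem no.1
x = [2, 3, 5, 5, 6, 8]
y = [9, 8, 8, 6, 5, 3]


def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy) from the column totals."""
    n = len(xs)
    sxy = n * sum(a * b for a, b in zip(xs, ys)) - sum(xs) * sum(ys)
    sxx = n * sum(a * a for a in xs) - sum(xs) ** 2
    syy = n * sum(b * b for b in ys) - sum(ys) ** 2
    return sxy / sxx, sxy / syy


b_yx, b_xy = regression_coefficients(x, y)

# r is the geometric mean of the two coefficients, carrying their common sign.
r = sqrt(b_yx * b_xy)
if b_yx < 0:
    r = -r

mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)


def predict_y(x0):
    """Regression line of y on x: y - mean_y = b_yx * (x - mean_x)."""
    return mean_y + b_yx * (x0 - mean_x)


print(round(b_yx, 4), round(b_xy, 4))  # -0.9854 -0.8824
print(round(r, 2))                     # -0.93
print(predict_y(mean_x) == mean_y)     # True: the line passes through the means
```

Both coefficients share the numerator nΣxy - ΣxΣy, which is why they always have the same sign.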

Properties of regression:

1. (x̅, y̅) is the point of intersection of the two regression lines.
2. Both regression coefficients must have the same sign, i.e. either both are positive or both are negative.
3. If the regression lines coincide, i.e. become identical, the value of r will be -1 or +1.
4. If the regression lines cut each other at an angle of 90°, r will be 0.
5. The correlation coefficient is the geometric mean of the two regression coefficients.
6. Regression coefficients are not affected by change of origin.
7. Regression coefficients are affected by change of scale.
8. The covariance, correlation coefficient and regression coefficients have the same sign.
9. If u = ax + b, v = cy + d, then $b_{uv} = \dfrac{a}{c} \times b_{xy}$ and $b_{vu} = \dfrac{c}{a} \times b_{yx}$


Problem No. 5: The following gives the mean and S.D. of the prices of two shares. Coefficient of correlation between the share prices = 0.48. Find the most likely price of share B for a price of Rs. 50 of share A.

Share        Mean               SD
Company A    $\bar{x}$ = 44     $\sigma_x$ = 5.60
Company B    $\bar{y}$ = 58     $\sigma_y$ = 6.30

Solution:
The regression line of y on x is given by: y = a + bx
where $b = b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x} = 0.48 \times \dfrac{6.30}{5.60} = 0.54$
$a = \bar{y} - b\,\bar{x} = 58 - 0.54 \times 44 = 34.24$
Thus, when x = 50, y = 34.24 + 0.54 × 50 = Rs. 61.24
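The arithmetic of Problem No. 5 can be checked with a few lines of Python. This is a sketch with illustrative variable names; share A plays the role of x and share B the role of y.

```python
mean_a, sd_a = 44, 5.60   # share A (x)
mean_b, sd_b = 58, 6.30   # share B (y)
r = 0.48

# Regression coefficient of y on x and intercept of the line y = a + b*x.
b_yx = r * sd_b / sd_a
intercept = mean_b - b_yx * mean_a


def predict_b(price_a):
    """Most likely price of share B for a given price of share A."""
    return intercept + b_yx * price_a


print(round(b_yx, 2))           # 0.54
print(round(intercept, 2))      # 34.24
print(round(predict_b(50), 2))  # 61.24
```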

❖ Probable error:

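It is a measure used to judge the reliability of the sample correlation coefficient and to obtain limits for the correlation coefficient of the population.

1. Probable error: $P.E = 0.6745 \times \dfrac{1 - r^2}{\sqrt{N}}$
2. Standard error: $S.E = \dfrac{1 - r^2}{\sqrt{N}}$
3. The value of r is significant (i.e. correlation exists) only if |r| > 6 P.E
4. The limits of the population correlation coefficient are r ± P.E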
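The four rules above translate directly into code. In the sketch below the constant is taken as 0.6745 (often rounded to 0.674), and the inputs r = 0.6 and N = 36 are illustrative values chosen for this example, not data from the notes. Function names are hypothetical.

```python
from math import sqrt


def standard_error(r, n):
    """S.E = (1 - r^2) / sqrt(N)"""
    return (1 - r ** 2) / sqrt(n)


def probable_error(r, n):
    """P.E = 0.6745 * S.E"""
    return 0.6745 * standard_error(r, n)


def is_significant(r, n):
    """r is taken as significant only if |r| exceeds 6 * P.E."""
    return abs(r) > 6 * probable_error(r, n)


def population_limits(r, n):
    """Limits r - P.E and r + P.E for the population correlation coefficient."""
    pe = probable_error(r, n)
    return r - pe, r + pe


r, n = 0.6, 36  # illustrative inputs
print(round(probable_error(r, n), 4))  # 0.0719
print(is_significant(r, n))            # True, since 0.6 > 6 * 0.0719
low, high = population_limits(r, n)
print(round(low, 4), round(high, 4))   # 0.5281 0.6719
```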

❖ Spurious correlation:

There are some cases where we may find a correlation between two variables although the two variables are not causally related. This is due to the existence of a third variable which is related to both the variables under consideration. Such a correlation is known as spurious correlation or nonsense correlation.

Example: there could be a positive correlation between production of rice and that
of iron in India for last 20 years due to the effect of third variable time on both
these variables.

❖ Coefficient of Determination:
$r^2 = \dfrac{\text{Explained variance}}{\text{Total variance}}$
Thus, a value of 0.6 for r indicates that $(0.6)^2 \times 100\%$ or 36% of the variation has been accounted for by the factor under consideration and the remaining 64% variation is due to other factors.

Coefficient of non-determination = $(1 - r^2)$


Multiple Choice Questions

❖ Problems Based on Theory:


1. Which is the quickest method to find correlation between two variables? {}
a) Scatter diagram
b) Method of concurrent deviation
c) Method of rank correlation
d) Method of product moment correlation

2. For finding correlation between two attributes, we consider {}


a) Pearson’s correlation coefficient
b) Scatter diagram
c) Spearman’s rank correlation coefficient
d) Coefficient of concurrent deviations

3. If the relation between two variables x and y is given by 2x + 3y + 4 = 0, then the value of
the correlation coefficient between x and y is {}
a) 0
b) 1
c) – 1
d) Negative

4. [Nov 07] In rank correlation, the association need not be linear: {}
a) True
b) False
c) Partly True
d) Partly False

5. [June 08] If the correlation coefficient between two variables is 1, then the two lines of
regression are: {}
a) Parallel
b) At right angles
c) Coincident
d) None of these

6. [June 10] ________ of the regression coefficients is greater than the correlation coefficient.
{}
a) Combined mean
b) Harmonic mean
c) Geometric mean
d) Arithmetic mean

7. [May 19] A.M of regression coefficient is {}


a) Equal to r
b) Greater than r
c) Half of r


d) None of these

8. [Jan 21] The intersection points of the two regression lines: y on x and x on y is {}
a) (0,0)
b) ($\bar{x}$, $\bar{y}$)
c) ($b_{yx}$, $b_{xy}$)
d) (1,1)

9. [Jan 21] The regression coefficients remain unchanged due to {}


a) Shift of scale
b) Shift of origin
c) Replacing x values by 1/x
d) Replacing y values by 1/y

10. [July 21] If the sum of the product of the deviation of x and y from their mean is zero, the
correlation coefficient between x and y is {}
a) Zero
b) Positive
c) Negative
d) 10

11. [July 21] The straight-line graph of the linear equation y = a + bx is horizontal if {}
a) b = 1
b) b ≠ 0
c) b = 0
d) a = b ≠ 0

12. [Dec 21] The regression coefficients remain unchanged due to {}
a) Shift of origin
b) Shift of scale
c) Always
d) Never

13. If high values of one variable tend to be associated with low values of the other, they are said to be {}
a) Negatively correlated
b) Indirectly correlated
c) Both
d) None

14. Correlation coefficient can be found out by {}


a) Scatter diagram
b) Rank method
c) Both
d) None

15. Age of applicants for life insurance and premium of insurance – correlation is {}


a) Positive
b) Negative
c) Zero
d) None

16. Unemployment index and the purchasing power of common man – correlation is {}
a) Positive
b) Negative
c) Zero
d) None

17. Production of Pig iron and soot content in Durgapur – correlation is {}
a) Positive
b) Negative
c) Zero
d) None

18. In case – “years of education and income” correlation is {}


a) Positive
b) Negative
c) Zero
d) None

19. $r_{12}$ is the same as $r_{21}$ {}


a) True
b) False
c) Cannot be predicted
d) None

20. [MTP: S1 Dec 21] For a p × q bivariate frequency table, the maximum number of
conditional distributions is {}
a) p
b) p + q
c) pq
d) p or q

21. [MTP: S1 Dec 21] For a p × q bivariate frequency table, the maximum number of marginal
distributions is {}
a) p
b) p + q
c) 1
d) 2

❖ Problems Based on Direct Formulae:


22. If for two variables x and y, the covariance, variance of x and variance of y are 40, 16 and 256 respectively, what is the value of the correlation coefficient? {}
a) 0.01


b) 0.625
c) 0.4
d) 0.5
23. If the sum of squares of difference of ranks, given by two judges A and B, of 8 students is 21, what is the value of rank correlation coefficient? {}
a) 0.7
b) 0.65
c) 0.75
d) 0.8

24. If the rank correlation coefficient between marks in management and mathematics for a group of students is 0.6 and the sum of squares of the differences in ranks is 66, what is the number of students in the group? {}
a) 10
b) 9
c) 8
d) 11

25. For 10 pairs of observations, No. of concurrent deviations was found to be 4. What is the
value of the coefficient of concurrent deviation? {}
a) √0.2
b) √−0.2
c) 1/3
d) -1/3

26. The coefficient of concurrent deviation for p pairs of observations was found to be 1/ √3 . If
the number of concurrent deviations was found to be 6, then the value of p is. {}
a) 10
b) 9
c) 8
d) None of these

27. What is the value of correlation coefficient due to Pearson on the basis of the following
data: {}
x: –5 –4 –3 –2 –1 0 1 2 3 4 5

Y: 27 18 11 6 3 2 3 6 11 18 27

a) 1
b) -1
c) 0
d) -0.5

28. If r = 0.6 then the coefficient of non-determination is {}


a) 0.4
b) – 0.6
c) 0.36
d) 0.64


29. If the regression line of y on x and that of x on y are given by y = –2x + 3 and 8x = –y + 3
respectively, what is the coefficient of correlation between x and y? {}
a) 0.5
b) -1/√2
c) – 0.5
d) None of these

30. If 4y – 5x = 15 is the regression line of y on x and the coefficient of correlation between x and
y is 0.75, what is the value of the regression coefficient of x on y? {}
a) 0.45
b) 0.9375
c) 0.6
d) None of these

31. If the regression coefficient of y on x, the coefficient of correlation between x and y and variance of y are -3/4, $\frac{\sqrt{3}}{2}$ and 4 respectively, what is the variance of x? {}
a) 2/√3/2
b) 16/3
c) 4/3
d) 4

32. [Nov 06] Take 200 and 150 resp. as the assumed mean for X and Y series of 11 values, then dx = X – 200, dy = Y – 150, $\sum dx = 13$, $\sum dx^2 = 2667$, $\sum dy = 42$, $\sum dy^2 = 6964$, $\sum dx\,dy = 3943$. The value of r is: {}
a) 0.77
b) 0.98
c) 0.92
d) 0.82

33. [May 07] If the sum of squares of the rank difference in mathematics and physics marks of
10 students is 22, then the coefficient of rank correlation is: {}
a) 0.267
b) 0.867
c) 0.92
d) None

34. [Dec 10] If the sum of the product of deviations of x and y series from their means is zero,
then the coefficient of correlation will be {}
a) 1
b) – 1
c) 0
d) None of these

35. [Dec 10] Given $\bar{x} = 16$, $\sigma_x = 4.8$, $\bar{y} = 20$, $\sigma_y = 9.6$. The coefficient of correlation between x and y is 0.6. What will be the regression coefficient of x on y? {}


a) 0.03
b) 0.3
c) 0.2
d) 0.05

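36. [May 19] Given the following series: {}

X    10   13   12   15    8   15
Y    12   16   18   16    7   18

The rank correlation coefficient r =
a) $1 - \dfrac{6\left[\sum D^3 + \sum \frac{t(t^3 - 1)}{12}\right]}{n(n^2 - 1)}$
b) $1 - \dfrac{6\left[\sum D^2 + \sum \frac{t(t^2 - 1)}{12}\right]}{n(n^2 - 1)}$
c) $1 - \dfrac{6\left[\sum D^2 + \sum \frac{(t^3 - t)}{12}\right]}{(n^2 - 1)}$
d) $1 - \dfrac{36\left[\sum D^2 + \sum \frac{t(t^2 - 1)}{12}\right]}{n(n^2 - 1)}$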

37. [May 19] Find the probable error if $r = \frac{2}{\sqrt{10}}$ and n = 36 {}
a) 0.6745
b) 0.06745
c) 0.5287
d) None

38. [Jan 21] For the set of observations {(1,2), (2,5), (3,7), (4,8), (5,10)}, the value of Karl
Pearson’s coefficient of correlation is approx. given by {}
a) 0.755
b) 0.655
c) 0.525
d) 0.985

39. [Jan 21] The coefficient of correlation between x and y is 0.5, the covariance is 16 and the
standard deviation of x is 4. Then the standard deviation of y is {}
a) 4
b) 8
c) 16
d) 64

40. [July 21] If $b_{yx} = -1.6$ and $b_{xy} = -0.4$, then $r_{xy}$ will be {}
a) 0.4
b) – 0.8
c) 0.64
d) 0.8


❖ Problems based on properties:


41. If u + 5x = 6 and 3y – 7v = 20 and the correlation coefficient between x and y is 0.58 then what
would be the correlation coefficient between u and v? {}
a) 0.58
b) – 0.58
c) – 0.84
d) 0.84

42. From the following data {}


x: 2 3 5 4 7

y: 4 6 7 8 10

The coefficient of correlation was found to be 0.93. What is the correlation between u and v as given below?

u: –3 –2 0 –1 2

v: –4 –2 –1 0 2

a) -0.93
b) 0.93
c) 0.57
d) -0.57

43. Referring to the data presented in Q. No. 42, what would be the correlation between u and v?
{}
u: 10 15 25 20 35

v: –24 –36 –42 –48 -60

a) -0.6
b) 0.6
c) -0.93
d) 0.93

44. Given the regression equations as 3x + y = 13 and 2x + 5y = 20, which one is the regression
equation of y on x? {}
a) 1st equation
b) 2nd equation
c) Both (a) and (b)
d) None of these

45. Given the following equations: 2x – 3y = 10 and 3x + 4y = 15, which one is the regression
equation of x on y? {}
a) 1st equation
b) 2nd equation
c) Both equations
d) None of these


46. If u = 2x + 5 and v = –3y – 6 and regression coefficient of y on x is 2.4, what is the regression
coefficient of v on u? {}
a) 3.6
b) -3.6
c) 2.4
d) -2.4

47. If the regression line of y on x and of x on y are given by 2x + 3y = –1 and 5x + 6y = –1 then


the arithmetic means of x and y are given by {}
a) (1, -1)
b) (-1,1)
c) (-1,1)
d) (2,3)

48. If y = 3x + 4 is the regression line of y on x and the arithmetic mean of x is –1, what is the
arithmetic mean of y? {}
a) 1
b) -1
c) 7
d) None of these

49. If the coefficient of correlation between x and y is 0.46, find the coefficient of correlation between x and $\frac{y}{2}$. {}
a) 0.46
b) 0.92
c) – 0.46
d) – 0.92

50. If the correlation coefficient between x and y is r, then between $U = \frac{x - 5}{10}$ and $V = \frac{y - 7}{2}$ is {}
a) r
b) – r
c) (r - 5)/2
d) (r - 7)/10

51. [Jan 21] The intersection points of the two regression lines: y on x and x on y is {}
a) (0,0)
b) ($\bar{x}$, $\bar{y}$)
c) ($b_{yx}$, $b_{xy}$)
d) (1,1)

52. [Dec 21] For any two variables x and y regression equations are given as 2x + 5y – 9 = 0 and
3x – y – 5 = 0. What is the A.M of x and y? {}
a) 2, 1
b) 1, 2
c) 4, 2


d) 2, 4

❖ Problems Based on Brain Twisters:


53. If cov (x, y) = 15, what restrictions should be put for the standard deviations of x and y?
{}
a) No restriction
b) The product of the standard deviations should be more than 15.
c) The product of the standard deviations should be less than 15
d) The sum of the standard deviations should be less than 15.

54. If the covariance between two variables is 20 and the variance of one of the variables is 16,
what would be the variance of the other variable? {}
a) More than 25
b) More than 10
c) Less than 10
d) More than 1.25

55. If y = a + bx, then what is the coefficient of correlation between x and y? {}
a) 1
b) -1
c) 1 or –1 according as b > 0 or b < 0
d) None of these

56. While computing rank correlation coefficient between profit and investment for the last 6
years of a company the difference in rank for a year was taken 3 instead of 4. What is the
rectified rank correlation coefficient if it is known that the original value of rank correlation
coefficient was 0.4? {}
a) 0.3
b) 0.2
c) 0.25
d) 0.28

57. Following are the two normal equations obtained for deriving the regression line of y on x:
5a + 10b = 40

10a + 25b = 95

The regression line of y on x is given by {}

a) 2x+3y=5
b) 2y+3x=5
c) Y=2+3x
d) Y=3+5x

58. [May 19] Given that:


X -3 - 3/2 0 3/2 3
Y 9 9/4 0 9/4 9


The Karl Pearson’s coefficient of correlation is {}


a) Positive
b) Zero
c) Negative
d) None

59. [May 19] If the regression line of y on x is given by Y = x + 2 and Karl Pearson’s coefficient of correlation is 0.5 then $\frac{\sigma_y^2}{\sigma_x^2}$ = ______. {}
a) 3
b) 2
c) 4
d) None

60. [Jan 19] Given that the variance of x is equal to the square of standard deviation of y and
the regression line of y on x is y = 40 + 0.5(x - 30). Then the regression line of x on y is………
{}
a) Y = 40 + 4(x - 30)
b) Y = 40 + (x - 30)
c) Y = 40 + 2(x - 30)
d) Y = 30 + 2(x - 40)

61. [July 21] If the slope of the regression line is calculated to be 5.5 and the intercept 15 then
the value of y when x is 6 is: {}
a) 88
b) 48
c) 18
d) 78

62. [July 21] If y = 9x and x = 0.01y, then r is equal to {}


a) – 0.1
b) 0.1
c) 0.3
d) – 0.3

63. [Dec 21] The point of intersection of the two regression lines lies on the x-axis. If the mean of x values is 16, and the standard deviations of x and y are resp. 3 and 4, then the mean of y is {}
a) 16/3
b) 4
c) 0
d) 1

64. [MTP − S1 Dec 21] If the coefficient of correlation between two variables is 0.7 then the
percentage of variation unaccounted for is {}
a) 70%
b) 30%
c) 51%
d) 49%
