
anchal.Jarshan M.

BUSINESS STATISTICS
SEM UNIT 1

SIMPLE CORRELATION & REGRESSION


CORRELATION
(1) CORRELATION
Correlation is a statistical measure of the degree or strength of association between two
or more variables. Association here means the tendency of the variables to move together.
When a change in one variable is accompanied by a change in the other variable or variables,
the variables are said to be correlated.

(2) TYPES OF CORRELATION:


i. Positive and Negative Correlation:
Positive correlation: If the movements of two variables are in the same direction, i.e. they increase
or decrease together, then they are positively correlated. e.g. Height and weight of human
beings, Demand and supply, Amount of rainfall and yield of crop (up to a point), Price and supply
of a commodity.
Negative correlation: If the movements of two variables are in opposite directions, i.e. an increase in
one brings a decrease in the other, they are said to be negatively correlated. e.g. Sale of woollen
garments and the day temperature, Expenses and savings, Price and demand of a commodity.

YATRI CLASSES Page 1


ii. Linear and Non-Linear Correlation:
Linear Correlation: If the amount of variation in variable X bears a constant ratio to the
corresponding amount of variation in Y, then the correlation between X and Y is said to be
linear. Graphically it gives a straight line.
Non-Linear Correlation: If this ratio does not remain constant, the correlation is known as
non-linear or curvilinear correlation.

iii. Simple (Bivariate), Multiple and Partial Correlation:
Simple Correlation: The study of correlation between only two variables is known as simple
correlation or bivariate correlation.
Multiple Correlation: In multiple correlation three or more variables are studied
simultaneously. Multiple correlation consists of the measurement of the relationship between a
dependent variable and two or more independent variables.
Partial Correlation: Correlation between a dependent variable and one particular
independent variable, keeping the other variables constant, is called partial correlation.

(3) METHODS OF STUDYING CORRELATION:


1) Scatter Diagram:
The existence of correlation can be shown graphically by means of a scatter diagram. Statistical
data relating to simultaneous movements (or variations) of two variables can be graphically
represented by points.
One of the two variables, say X, is shown along the horizontal axis OX and the other variable Y
along the vertical axis OY. All the pairs of values of X and Y are now shown by points (or dots)
on the graph paper. This diagrammatic representation of bivariate data is known as scatter
diagram.
The scatter of these points, and also the direction of the scatter, reveals the nature and strength of
correlation between the two variables. The following are some scatter diagrams showing different
types of correlation between two variables.

[Figure: scatter diagrams. Perfect correlation: positive (r = +1) and negative (r = -1). Strong correlation: positive and negative. Weak correlation: positive and negative. Special category: non-linear (curvilinear) correlation, and no correlation, i.e. lack of correlation (r = 0).]

2) Karl Pearson's Coefficient of Correlation:


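a. Karl Pearson's coefficient of correlation is given by

$$r = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$$

Formulae of Coefficient of Correlation (r), and when each is to be applied:

1) $r = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$
   To be applied when Cov(X, Y), $\sigma_X$ and $\sigma_Y$ are directly given in the problem.

2) $r = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{n \sigma_X \sigma_Y}$
   To be applied when $\sum (X - \bar{X})(Y - \bar{Y})$, n, $\sigma_X$ and $\sigma_Y$ are directly given in the problem.

3) $r = \dfrac{\sum xy}{n \sigma_X \sigma_Y}$, where $x = X - \bar{X}$ and $y = Y - \bar{Y}$
   To be applied when $\sum xy$, n, $\sigma_X$ and $\sigma_Y$ are directly given in the problem.

4) $r = \dfrac{n \sum XY - \sum X \sum Y}{\sqrt{n \sum X^2 - (\sum X)^2}\,\sqrt{n \sum Y^2 - (\sum Y)^2}}$
   To be applied when n, $\sum XY$, $\sum X$, $\sum Y$, $\sum X^2$ and $\sum Y^2$ are directly given in the problem, or for rectification.

5) $r = \dfrac{n \sum dx\,dy - \sum dx \sum dy}{\sqrt{n \sum dx^2 - (\sum dx)^2}\,\sqrt{n \sum dy^2 - (\sum dy)^2}}$, where $dx = X - a$ and $dy = Y - b$
   To be applied when (i) the X-series and Y-series are given, and (ii) $\bar{X}$ or $\bar{Y}$ or both are fractions.

6) $r = \dfrac{\sum xy}{\sqrt{\sum x^2}\,\sqrt{\sum y^2}}$, where $x = X - \bar{X}$ and $y = Y - \bar{Y}$
   To be applied when (i) the X-series and Y-series are given, and (ii) $\bar{X}$ and $\bar{Y}$ are both integers.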
C. Properties of Coefficient of Correlation (Karl Pearson's Coefficient):
1. It is a pure number and it measures the strength of association between two variables.
2. Its value lies between -1 and +1, i.e. $-1 \le r \le 1$.
3. It is independent of any change of origin of reference and of the units of measurement.
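
The properties above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `pearson_r` and the small data set are hypothetical, chosen only for illustration. It computes r from raw sums and shows that a change of origin and scale leaves r unchanged, while a negative multiplier on one variable reverses its sign only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from raw sums: (n*Sxy - Sx*Sy) / (sqrt(n*Sxx - Sx^2) * sqrt(n*Syy - Sy^2))."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Hypothetical illustrative data (an assumption, not taken from the notes).
X = [2, 4, 6, 8, 10]
Y = [12, 8, 10, 7, 5]
r_xy = pearson_r(X, Y)

# Change of origin and scale with positive multipliers: U = 2X + 6, V = 3Y - 15.
U = [2 * x + 6 for x in X]
V = [3 * y - 15 for y in Y]
r_uv = pearson_r(U, V)          # same value as r_xy

# A negative multiplier on one variable: W = -3Y + 15.
W = [-3 * y + 15 for y in Y]
r_uw = pearson_r(U, W)          # same size as r_xy, opposite sign

print(round(r_xy, 4), round(r_uv, 4), round(r_uw, 4))
```

The sign reversal in the last case is the reason the independence property is usually stated for multipliers of the same sign.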
D. Interpretation of Karl Pearson's Coefficient of Correlation:
i. r = +1
Perfect positive linear correlation between the two variables X and Y.
The graph (scatter diagram) of X and Y will be a straight line with positive slope (upward sloping).
ii. r = -1
Perfect negative linear correlation between the two variables X and Y.
The graph (scatter diagram) of X and Y will be a straight line with negative slope (downward sloping).
iii. r = 0
If r = 0, there is a lack of linear correlation between X and Y.
(However, it should be noted that r = 0 implies lack of linear correlation only; there may be non-linear
or curvilinear correlation such as quadratic, bi-quadratic, sine curve etc.)
iv. Imperfect correlation
- If the value of r is between 0 and 1, the correlation is imperfect positive.
- If the value of r is between 0 and -1, the correlation is imperfect negative.
- If the value of r is close to 1, the correlation is strong positive.
- If the value of r is close to -1, the correlation is strong negative.
- If the value of r is close to 0 (on either side), the correlation is weak positive or weak negative,
depending upon the sign of r.
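
The remark that r = 0 rules out only linear correlation can be illustrated with a short sketch. The data below are hypothetical (an assumption made for illustration): Y is an exact quadratic function of X, yet r comes out as zero, whereas an exact straight-line relation gives r = +1.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means: sum(xy) / sqrt(sum(x^2) * sum(y^2))."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    dev_x = [x - x_bar for x in xs]
    dev_y = [y - y_bar for y in ys]
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_x2 = sum(a * a for a in dev_x)
    sum_y2 = sum(b * b for b in dev_y)
    return sum_xy / sqrt(sum_x2 * sum_y2)

X = [-3, -2, -1, 0, 1, 2, 3]           # hypothetical values, symmetric about zero

Y_quadratic = [x * x for x in X]       # perfect curvilinear relation Y = X^2
Y_linear = [2 * x + 1 for x in X]      # perfect linear relation Y = 2X + 1

r_quadratic = pearson_r(X, Y_quadratic)
r_linear = pearson_r(X, Y_linear)

print(r_quadratic, r_linear)
```

Y is completely determined by X in both cases; only the linear case is picked up by r, which is why a scatter diagram is worth drawing before reading r = 0 as "no relationship".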
(4) CORRELATION AND CAUSATION (i.e. CAUSE-EFFECT RELATIONSHIP)
Correlation analysis gives us an idea about the nature and strength of the relation between
the two variables under study. However, it fails to reflect the cause-effect relationship between
the variables. In bivariate analysis, correlation does not imply causation. Even a high degree of
correlation does not imply a cause-effect relationship. But a cause-effect relationship always
implies correlation.
A high degree of correlation between the variables may be due to the following reasons.



1. Mutual dependence:
The phenomena under study may influence each other. Such situations are usually observed
in data relating to economic and business situations.
e.g. It is a well known principle in economics that the prices of a commodity are influenced by the forces of
supply and demand. An increase in demand (supply being constant) may lead the price to increase.
Here the change in price is the result and the change in demand is the cause. At the same time, we can
also examine the law of demand, where a change in price (other factors remaining constant)
brings an inverse change in demand. Here the change in price is the cause and the change in demand is the
result.
Accordingly, the two variables may show a good degree of correlation due to their interdependence.
Here, it becomes very difficult to isolate the exact cause from the effect.

2. Both the variables are influenced by the same external factors:
A high degree of correlation between the two variables may be due to the effect of a third variable,
or a number of variables, on each of these two variables.
e.g. A high degree of correlation may be observed between the yields of two crops (say, potato and
rice) due to the effect of a number of other factors like favourable weather conditions, fertilizer used,
irrigation facilities etc. on each of them. But neither of the two is the cause of the other.

3. Pure chance:
It may happen that a small randomly selected sample from a bivariate distribution shows a
high degree of correlation, though the variables may not actually be correlated in the population.
e.g. The correlation between the shoe size and the intelligence of a group of individuals should
be zero, since the forces affecting the two variables are entirely independent of each other.
However, in any sample taken from such a population, there is a chance that the
value of r is non-zero. Such a correlation is termed chance correlation, spurious correlation
or nonsense correlation.
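
Chance correlation can be seen in a small simulation. This is only a sketch under stated assumptions: both variables are drawn independently from a standard normal population (so the population correlation is zero), the sample size of 5 and the 1000 repetitions are arbitrary choices, and the seed is fixed merely to make the run repeatable.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_x2 = sum((x - x_bar) ** 2 for x in xs)
    sum_y2 = sum((y - y_bar) ** 2 for y in ys)
    return sum_xy / sqrt(sum_x2 * sum_y2)

random.seed(0)        # fixed seed so that the run is repeatable
sample_size = 5       # a deliberately small sample
n_samples = 1000      # number of independent samples drawn

sample_rs = []
for _ in range(n_samples):
    # X and Y are generated independently, so any correlation is due to chance alone.
    xs = [random.gauss(0, 1) for _ in range(sample_size)]
    ys = [random.gauss(0, 1) for _ in range(sample_size)]
    sample_rs.append(pearson_r(xs, ys))

largest_abs_r = max(abs(r) for r in sample_rs)
share_above_half = sum(abs(r) > 0.5 for r in sample_rs) / n_samples

print("largest |r| observed:", round(largest_abs_r, 3))
print("share of samples with |r| > 0.5:", share_above_half)
```

With samples this small, a sizeable share of them show |r| above 0.5 even though the variables are unrelated; increasing `sample_size` makes such values rare.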

PRACTICAL EXAMPLES
1. Calculate the coefficient of correlation and interpret the results.
X   28  37  40  38  35  33  40  32  34  33
Y   23  32  33  34  30  26  29  31  34  38
Sol.
  X     Y     x     y    x^2   y^2    xy
  28    23   -7    -8    49    64     56
  37    32    2     1     4     1      2
  40    33    5     2    25     4     10
  38    34    3     3     9     9      9
  35    30    0    -1     0     1      0
  33    26   -2    -5     4    25     10
  40    29    5    -2    25     4    -10
  32    31   -3     0     9     0      0
  34    34   -1     3     1     9     -3
  33    38   -2     7     4    49    -14
 350   310    0     0   130   166     60
where $x = X - \bar{X}$ and $y = Y - \bar{Y}$, with $\bar{X} = 350/10 = 35$ and $\bar{Y} = 310/10 = 31$.
Required:
(i) Coefficient of Correlation
$$r = \frac{\sum xy}{\sqrt{\sum x^2}\,\sqrt{\sum y^2}} = \frac{60}{\sqrt{130}\,\sqrt{166}} = 0.41$$
(ii) Interpretation
There is weak +ve correlation.
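
The working of Example 1 can be reproduced with a few lines of code. This is a sketch rather than part of the original solution; the second X value is taken as 37, as it appears in the solution table, and the variable names are illustrative.

```python
from math import sqrt

# Data of Example 1 (second X value taken as 37, as used in the solution table).
X = [28, 37, 40, 38, 35, 33, 40, 32, 34, 33]
Y = [23, 32, 33, 34, 30, 26, 29, 31, 34, 38]

n = len(X)
x_bar = sum(X) / n                      # 35.0
y_bar = sum(Y) / n                      # 31.0

# Deviations from the actual means, as in the columns x and y of the table.
x = [xi - x_bar for xi in X]
y = [yi - y_bar for yi in Y]

sum_x2 = sum(xi * xi for xi in x)                 # column total of x^2
sum_y2 = sum(yi * yi for yi in y)                 # column total of y^2
sum_xy = sum(xi * yi for xi, yi in zip(x, y))     # column total of xy

r = sum_xy / (sqrt(sum_x2) * sqrt(sum_y2))
print(sum_x2, sum_y2, sum_xy, round(r, 2))
```

The deviation method is convenient here because both means are whole numbers, so every entry in the table stays an integer.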
2. Find the correlation coefficient between sales and advertising expenditure from the following
data:
Sales (Rs. Lakh) 65 66 67 67 68 69 70 72
Advertising expenditure (Rs. '000) 67 68 65 68 72 72 69 71
(Ans. 0.60)
3. Calculate Pearson's coefficient of correlation between advertisement cost and sales as per the
data given below.
Advertisement Cost in '000 Rs. 39 65 62 90 82 75 25 98 36 78
Sales in lakh Rs. 47 53 58 86 62 68 60 91 51 84
(Ans. 0.78)
4. Calculate Pearson's coefficient of correlation from the following, taking 100 and 50 as the
assumed averages of X and Y respectively.
X 104 111 104 114 118 117 105 108 106 100 104 105
Y 57 55 47 45 45 50 64 63 66 62 69 61
(Ans. -0.67)
5. From the following data, find out the correlation coefficient between heights of fathers and sons.
Heights of fathers (inches): 65 66 67 67 68 69 70 72
Heights of sons (inches): 67 68 65 68 72 72 69 71 (Ans. 0.603)
6. Calculate the coefficient of correlation and interpret the result.
Age of Husbands 23 27 39 29 30 31 33 35 36
Age of Wives    18 20 21 22 24 27 29 28 29
Sol.
Let X be the Age of Husbands and Y be the Age of Wives. Take assumed means 30 and 25, so that
$dx = X - 30$ and $dy = Y - 25$.
  X     Y    dx    dy   dx^2  dy^2  dx.dy
  23    18   -7    -7    49    49     49
  27    20   -3    -5     9    25     15
  39    21    9    -4    81    16    -36
  29    22   -1    -3     1     9      3
  30    24    0    -1     0     1      0
  31    27    1     2     1     4      2
  33    29    3     4     9    16     12
  35    28    5     3    25     9     15
  36    29    6     4    36    16     24
 283   218   13    -7   211   145     84
$\bar{X} = 283/9 = 31.44$, $\bar{Y} = 218/9 = 24.22$
Required:
(i) Coefficient of Correlation
$$r = \frac{n \sum dx\,dy - \sum dx \sum dy}{\sqrt{n \sum dx^2 - (\sum dx)^2}\,\sqrt{n \sum dy^2 - (\sum dy)^2}} = \frac{(9)(84) - (13)(-7)}{\sqrt{(9)(211) - (13)^2}\,\sqrt{(9)(145) - (-7)^2}} = 0.57$$
(ii) Interpretation
There is moderate +ve correlation.
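
The assumed-mean (step-deviation) working of Example 6 can be sketched in code as follows. The ages are those listed in the solution table; summed directly from them, the products dx.dy come to 84, which is the total consistent with the answer 0.57. The helper name `r_assumed_mean` is hypothetical. The last lines also illustrate that the choice of assumed means does not affect r.

```python
from math import sqrt

# Ages as listed in the solution table of Example 6.
husbands = [23, 27, 39, 29, 30, 31, 33, 35, 36]   # X
wives = [18, 20, 21, 22, 24, 27, 29, 28, 29]      # Y

def r_assumed_mean(xs, ys, a, b):
    """Pearson's r using deviations dx = X - a, dy = Y - b from assumed means a and b."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    s_dx, s_dy = sum(dx), sum(dy)
    s_dx2 = sum(d * d for d in dx)
    s_dy2 = sum(d * d for d in dy)
    s_dxdy = sum(p * q for p, q in zip(dx, dy))
    numerator = n * s_dxdy - s_dx * s_dy
    denominator = sqrt(n * s_dx2 - s_dx ** 2) * sqrt(n * s_dy2 - s_dy ** 2)
    return numerator / denominator, (s_dx, s_dy, s_dx2, s_dy2, s_dxdy)

# Assumed means 30 and 25, as in the worked solution.
r, column_totals = r_assumed_mean(husbands, wives, 30, 25)

# Any other assumed means give the same r (here 0 and 0, i.e. the raw values).
r_other, _ = r_assumed_mean(husbands, wives, 0, 0)

print(column_totals, round(r, 2), round(r_other, 2))
```

The assumed means only shift the origin, so they are chosen purely for arithmetic convenience when the actual means are fractions.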
7. Calculate the coefficient of correlation and interpret the results.
Firm        1  2  3  4  5  6  7  8  9  10
Sales       50 55 55 60 65 65 60 60 50 55
Expenditure 11 13 14 16 16 15 14 13 13 15
(Ans. 0.80)
8. Calculate Karl Pearson's coefficient from the following table of prices and supply of a
commodity during the years 1977-1986.


Year 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986
Price/Kg 10 12 18 16 15 19 18 17 15 16
Supply 30 35 45 44 42 48 44
47 46 44 45 (Ans. 0.96)



9. Calculate coefficient of correlation between X and Y.
X: 78 89 97 69 59 79 68 61
Y: 125 137 156 112 197 136 123 108 (Ans. 0.03)
10. The following data refer to advertisement expense and number of units sold in the last six months.
Ad. Expense (in '000 Rs.) 14 21 26 22 15 15
No. of units sold (in lakhs) 31 37 5050 45 33
39
Calculate the correlation coefficient and comment on the result. Also draw a scatter diagram and
interpret it. (Ans. 0.95)
11. To study the effectiveness of an advertisement, a survey is conducted by calling people at
random and asking the number of advertisements read or seen in a week and the number of items
purchased in the week.
Adv. seen/read 05 10 4
No. of items purchased 10 12 5 2 8
Calculate the correlation coefficient and comment on the result. (Ans. 0.75)

12. Compute Karl Pearson's coefficient of correlation in the following series relating to cost of living
and wages.
Wages          100 101 102 100 99 98 97 98 96 96
Cost of living 98 99 99 97 95 92 95 94 90 91
(Ans. 0.92)
13. Calculate Karl Pearson's coefficient of correlation from the following data, for price and demand.
Price 14 16 17 18 19 20 21 22 23
Demand 84 78 70 75 66 67 62 58 60 (Ans. -0.954)
14. Two variates x and y, when expressed as deviations from their respective means, are given as follows.
Find the coefficient of correlation and comment on the results.


X -4 -3 T 0 1 3
4 -2 1 2 2 -1 (Ans. -0.13)
15. i) Compute the correlation coefficient between the corresponding values of X and Y in the
following table.
2 6 8 11
Y 12 8 10 7 5 (Ans. -0.92)
ii) Multiply each X-value in the table by 2 and add 6. Multiply each value of Y in the table by
3 and subtract 15. Find the correlation coefficient between the two new sets of values. Explain
why you do or do not obtain the same result as in (i). (Ans. -0.92)

16. Calculate the correlation coefficient from the following data


X 12 9 8 10 11 13 7
Y 14 8 9 11 12 3
Now let each value of X be multiplied by 2 and then 6 be added to it. Similarly, let each value of Y be
multiplied by 3 and 2 be subtracted from it. What will be the r between the new series of X and Y?
(Ans. 0.95 & 0.95)



17. If the coefficient of correlation between X and Y is -0.92, then find the coefficient of correlation between
(i) U = 2X + 6 and V = 3Y - 15
(ii) U = 2X + 6 and V = -3Y + 15
(Ans. (i) -0.92; (ii) 0.92)
18. In order to find the correlation coefficient between two variables X and Y from 12 pairs of
observations, the following calculations were made:
$\sum X = 30$, $\sum Y = 5$, $\sum X^2 = 670$, $\sum Y^2 = 285$, $\sum XY = 334$
Calculate the coefficient of correlation between X and Y. (Ans. 0.78)


19. A test in Mathematics was given to 10 students who were about to begin a course in
Statistics. The scores (X) in their test were examined in relation to the scores (Y) in the final
examination in Statistics.
The following results were obtained:
$\sum X = 71$, $\sum Y = 70$, $\sum X^2 = 555$, $\sum Y^2 = 526$, $\sum XY = 527$
Find the coefficient of correlation between X and Y. (Ans. 0.701)


20. A computer, while calculating the correlation coefficient between two variables X and Y from 25
pairs of observations, obtained the following results:
n = 25, $\sum X = 125$, $\sum Y = 100$, $\sum X^2 = 650$, $\sum Y^2 = 460$ and $\sum XY = 508$
It was however discovered at the time of checking that two pairs of observations were not
correctly copied. They were taken as (6, 14) and (8, 6) while the correct values were (8, 12) and
(6, 8). Prove that the correct value of the correlation coefficient should be 2/3.
Sol. Given: Incorrect results are
n = 25, $\sum X = 125$, $\sum Y = 100$, $\sum X^2 = 650$, $\sum Y^2 = 460$ and $\sum XY = 508$
Working:
Corrected result = Incorrect result - Wrong observations + Correct observations
n          : 25 (unchanged)
$\sum X$   : 125 - 6 - 8 + 8 + 6 = 125
$\sum Y$   : 100 - 14 - 6 + 12 + 8 = 100
$\sum X^2$ : 650 - 6^2 - 8^2 + 8^2 + 6^2 = 650
$\sum Y^2$ : 460 - 14^2 - 6^2 + 12^2 + 8^2 = 436
$\sum XY$  : 508 - (6)(14) - (8)(6) + (8)(12) + (6)(8) = 520
Required:
(i) Coefficient of Correlation
$$\text{Corrected } r = \frac{n \sum XY - \sum X \sum Y}{\sqrt{n \sum X^2 - (\sum X)^2}\,\sqrt{n \sum Y^2 - (\sum Y)^2}} = \frac{(25)(520) - (125)(100)}{\sqrt{(25)(650) - (125)^2}\,\sqrt{(25)(436) - (100)^2}} = \frac{500}{(25)(30)} = 0.67$$
i.e. r = 2/3.
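
The rectification in Example 20 follows a simple rule: subtract the contribution of each wrongly copied pair from every sum and add the contribution of each correct pair. A minimal sketch of that rule is given below; the dictionary keys and variable names are illustrative choices, not notation from the notes.

```python
from math import sqrt

n = 25
# Sums as originally (incorrectly) obtained.
sums = {"x": 125, "y": 100, "xx": 650, "yy": 460, "xy": 508}

wrong_pairs = [(6, 14), (8, 6)]     # pairs as wrongly copied
correct_pairs = [(8, 12), (6, 8)]   # pairs as they should have been

def contribution(x, y):
    """Contribution of one (x, y) pair to each of the five sums."""
    return {"x": x, "y": y, "xx": x * x, "yy": y * y, "xy": x * y}

# Corrected sum = incorrect sum - wrong observations + correct observations.
for x, y in wrong_pairs:
    for key, value in contribution(x, y).items():
        sums[key] -= value
for x, y in correct_pairs:
    for key, value in contribution(x, y).items():
        sums[key] += value

numerator = n * sums["xy"] - sums["x"] * sums["y"]
denominator = sqrt(n * sums["xx"] - sums["x"] ** 2) * sqrt(n * sums["yy"] - sums["y"] ** 2)
r_corrected = numerator / denominator

print(sums, r_corrected)
```

Since n is unchanged (pairs are replaced, not added or dropped), only the five sums need adjusting before the raw-sums formula is applied.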
21. In order to find the correlation coefficient between two variables X and Y from 12 pairs of
observations, the following calculations were made:
$\sum X = 30$, $\sum Y = 5$, $\sum X^2 = 670$, $\sum Y^2 = 285$, $\sum XY = 334$
On subsequent verification it was found that the pair (X = 11, Y = 4) was copied wrongly, the correct value
being (X = 10, Y = 14). Find the correct value of the correlation coefficient. (Ans. 0.78)
22. Karl Pearson's coefficient of correlation between two variables X and Y is 0.28 and their
covariance is +7.6. If the variance of X is 9, find the standard deviation of the Y series.

23. Calculate the coefficient of correlation between the X and Y series from the following data:
$\sum (X - \bar{X})^2 = 136$, $\sum (Y - \bar{Y})^2 = 138$ and $\sum (X - \bar{X})(Y - \bar{Y}) = 122$
(Ans. 0.89)
24. From the following data, compute the coefficient of correlation between X and Y.
                                        X series   Y series
No. of items                               15         15
Arithmetic mean                            25         18
Sum of squares of deviations from mean    136        138
Sum of the products of deviations of the X and Y series from their
respective means = 122. (Ans. 0.89)
25. Calculate the correlation coefficient from the following results:
N = 10, $\sum X = 140$, $\sum Y = 150$, $\sum (X - 14)^2 = 180$, $\sum (Y - 15)^2 = 215$, $\sum (X - 14)(Y - 15) = 60$
(Ans. 0.305)
26. Calculate the correlation coefficient from the following results:
n = 10, $\sum X = 100$, $\sum Y = 150$, $\sum (X - 10)^2 = 180$, $\sum (Y - 15)^2 = 215$, $\sum (X - 10)(Y - 15) = 160$
(Ans. 0.81)
27. Given: r = 0.8, $\sum xy = 60$, $\sigma_y = 2.5$, $\sum Y = 150$, $\sum x^2 = 90$. Find the number of items, where
x and y are deviations from their respective arithmetic means. (Ans. 10)
28. From the following data find the number of items:
r = 0.5, $\sum xy = 120$, $\sigma_y = 8$, $\sum x^2 = 90$,
where x and y are deviations from their respective means. (Ans. 10)
29. The coefficient of correlation between two variates X and Y is 0.8 and their covariance is 20. If
the variance of the X series is 16, find the standard deviation of the Y series. (Ans. 6.25)
30. Interpret the values r = +1, r = -1, r = 0.
31. Draw scatter diagrams for r = -1, r = +1, r = 0.3, r = 0.8.
32. The coefficient of correlation between two variables X and Y is 0.48. Their covariance is 36. If the
variance of X is 16, find the S.D. of the Y series.
(Ans. 18.75)
33. Find the value of the coefficient of correlation, if $\sum xy = 450$, n = 50, $\sigma_x = 4.5$ and
$\sigma_y = 3.5$. (Ans. 0.75)
34. The following table gives age and marks of 100 students. Calculate
coefficient of correlation.
Marks Age (in years)
18 19 20 21

10-20 2 2
20-30 A 6 4
30-40 10 11
40-50 A 6 8 (Ans. 0.26)
50-60 2 4
60-70 2 1

35. Calculate the coefficient of correlation and
interpret it. (Ans. -0.08)
Sales
Advertising Expenditure
revenue 5-15 15-25 25-35 35-45
75-125 3 4 8
125-175 8 5 7
175-225 2 3 4
225-275 3 2

36. Following is the distribution of students according to their heights and weights.
Height (in inches) Weight X(in lbs.)
90-100 100-110 110-120 120-130
50-55
55-60 10
60-65 12 10
65-70 3 8 3
Find out the correlation coefficient between height and weight. (Ans. 0.078)

REGRESSION
(1) REGRESSION:
Regression is a measure of the average relationship between two or more variables. One of these
variables is called the dependent or explained variable, and the other variable or variables the
independent or explaining variable(s).

(2) DEPENDENT AND INDEPENDENT VARIABLES


1. Dependent Variable (Regressed or Explained variable):
The variable whose value is to be predicted is the dependent variable.
2. Independent Variable (Regressor, Predictor or Explanatory variable):
The variable which influences the values, or the variable which is used for prediction, is the
independent variable.

(3) TYPEs OF REGRESSION:


1) Simple and Multiple:
In simple regression only two variables are involved, one dependent and the other independent.
Simple regression gives the functional relationship between only two variables. In multiple
regression more than two variables are involved, with one dependent and two or more
independent variables.

2) Linear and Curvilinear:
In linear regression the rate of change of the dependent variable with respect to the independent variable
remains constant, and graphically linear regression gives a straight line.



In non-linear or curvilinear regression the rate of change of the dependent variable with respect to the
independent variable does not remain constant. Graphically it does not give a straight line.
(4) SIMPLE LINEAR REGRESSION EQUATION:

Regression equation of Y on X
Used to estimate Y when the value of X is given.
Standard form: $Y = a + bX$
where a = Y-intercept
b = regression coefficient $b_{YX}$
  = slope of the line, representing the change in the Y variable for a unit change in the X variable
  = slope coefficient of the regression equation of Y on X
Y = dependent variable
X = independent variable
Practically, Y on X is obtained using $Y - \bar{Y} = b_{YX}(X - \bar{X})$.

Regression equation of X on Y
Used to estimate X when the value of Y is given.
Standard form: $X = a + bY$
where a = X-intercept
b = regression coefficient $b_{XY}$
  = slope of the line, representing the change in the X variable for a unit change in the Y variable
  = slope coefficient of the regression equation of X on Y
X = dependent variable
Y = independent variable
Practically, X on Y is obtained using $X - \bar{X} = b_{XY}(Y - \bar{Y})$.
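FORMULAE OF $b_{YX}$ AND $b_{XY}$

1) To be applied when n, $\sum XY$, $\sum X$, $\sum Y$, $\sum X^2$ and $\sum Y^2$ are directly given in the problem, or for rectification:
$$b_{YX} = \frac{n \sum XY - \sum X \sum Y}{n \sum X^2 - (\sum X)^2} \qquad b_{XY} = \frac{n \sum XY - \sum X \sum Y}{n \sum Y^2 - (\sum Y)^2}$$

2) To be applied when (i) the X-series and Y-series are given and (ii) $\bar{X}$ or $\bar{Y}$ or both are fractions ($dx = X - a$ and $dy = Y - b$):
$$b_{YX} = \frac{n \sum dx\,dy - \sum dx \sum dy}{n \sum dx^2 - (\sum dx)^2} \qquad b_{XY} = \frac{n \sum dx\,dy - \sum dx \sum dy}{n \sum dy^2 - (\sum dy)^2}$$

3) To be applied when (i) the X-series and Y-series are given and (ii) $\bar{X}$ and $\bar{Y}$ are both integers ($x = X - \bar{X}$ and $y = Y - \bar{Y}$):
$$b_{YX} = \frac{\sum xy}{\sum x^2} \qquad b_{XY} = \frac{\sum xy}{\sum y^2}$$

4) To be applied when r, $\sigma_Y$ and $\sigma_X$ are directly given in the problem:
$$b_{YX} = r \cdot \frac{\sigma_Y}{\sigma_X} \qquad b_{XY} = r \cdot \frac{\sigma_X}{\sigma_Y}$$

5) To be applied when a two-way frequency table is given in the problem ($dx = \frac{X - a}{i_x}$ and $dy = \frac{Y - b}{i_y}$, where $i_x$ and $i_y$ are the class widths):
$$b_{YX} = \frac{n \sum f\,dx\,dy - \sum f\,dx \sum f\,dy}{n \sum f\,dx^2 - (\sum f\,dx)^2} \times \frac{i_y}{i_x} \qquad b_{XY} = \frac{n \sum f\,dx\,dy - \sum f\,dx \sum f\,dy}{n \sum f\,dy^2 - (\sum f\,dy)^2} \times \frac{i_x}{i_y}$$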

(5) PROPERTIES OF REGRESSION LINES:


1. There are always two regression lines
i. Regression equation of Y on X. (Used to estimate Y given X)
ii. Regression equation of X on Y. (Used to estimate X given Y)



2. The two regression lines intersect each other at the point $(\bar{X}, \bar{Y})$.
[Figure: the lines "X on Y" and "Y on X" crossing at the point $(\bar{X}, \bar{Y})$.]

3. In case of perfect correlation, positive or negative, i.e. for r = ±1, the two regression lines coincide
and constitute only one line.
[Figure: perfect correlation. The lines "X on Y" and "Y on X" coincide, for r = +1 and for r = -1.]

4. The smaller the angle between the two regression lines, the greater is the degree of correlation
between the variables.
[Figure: strong correlation (small angle between "X on Y" and "Y on X") and weak correlation (large angle).]

5. When the coefficient of correlation between the two variables is zero, i.e. for r = 0, the two regression lines
are perpendicular to each other.
[Figure: no correlation, i.e. lack of correlation (r = 0). The lines "X on Y" and "Y on X" are at right angles.]
(6) PROPERTIES OF REGRESSION COEFFICIENT:
1. The geometric mean of the two regression coefficients gives the coefficient of correlation, i.e. $r = \pm\sqrt{b_{YX} \cdot b_{XY}}$
2. Since $-1 \le r \le 1$, $0 \le b_{YX} \cdot b_{XY} = r^2 \le 1$
3. Both regression coefficients always carry the same sign, i.e. both are either positive or both are
negative.
4. $b_{YX} = r \cdot \sigma_Y / \sigma_X$ and $b_{XY} = r \cdot \sigma_X / \sigma_Y$, where $\sigma_X$ and $\sigma_Y$ are always positive; therefore the regression coefficients
and the correlation coefficient carry the same sign.
5. If one regression coefficient is greater than unity, i.e. 1, then the other regression coefficient is always
less than 1.
6. Regression coefficients are independent of change of origin but dependent on change of scale.
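
These properties can be verified on actual figures. The sketch below uses the price and demand data of practical example 13 (a negatively correlated pair, so the sign property is visible); the helper names and the particular shifts and multipliers are assumptions made for illustration.

```python
from math import sqrt, copysign

# Price (X) and demand (Y) from practical example 13.
X = [14, 16, 17, 18, 19, 20, 21, 22, 23]
Y = [84, 78, 70, 75, 66, 67, 62, 58, 60]

def raw_sums(xs, ys):
    """Return n*Sxy - Sx*Sy, n*Sxx - Sx^2 and n*Syy - Sy^2."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return n * sxy - sx * sy, n * sxx - sx ** 2, n * syy - sy ** 2

def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy) from the raw-sums formulae."""
    num, den_x, den_y = raw_sums(xs, ys)
    return num / den_x, num / den_y

def pearson_r(xs, ys):
    num, den_x, den_y = raw_sums(xs, ys)
    return num / (sqrt(den_x) * sqrt(den_y))

b_yx, b_xy = regression_coefficients(X, Y)
r = pearson_r(X, Y)

# Properties 1, 3 and 4: r is the geometric mean of the coefficients, with their common sign.
r_from_b = copysign(sqrt(b_yx * b_xy), b_yx)

# Property 6, change of origin: the coefficients are unchanged.
b_yx_shifted, b_xy_shifted = regression_coefficients([x + 100 for x in X], [y - 50 for y in Y])

# Property 6, change of scale: with U = 2X and V = 5Y, b_vu = (5/2) b_yx and b_uv = (2/5) b_xy.
b_vu, b_uv = regression_coefficients([2 * x for x in X], [5 * y for y in Y])

print(round(b_yx, 3), round(b_xy, 3), round(r, 3), round(r_from_b, 3))
```

Here one coefficient exceeds 1 in size and the other is below 1, as property 5 requires, since their product equals r squared and cannot exceed 1.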

(7) USES OF REGRESSION ANALYSIS:


1. To get the functional relationship between a dependent variable and one or more independent
variables.
2. To provide estimates of the value of the dependent variable from the values of the independent
variables.
3. Using the regression coefficients we can calculate the correlation coefficient.

(8) COEFFICIENT OF DETERMINATION:
1. The coefficient of determination is used to decide how far the assumption of linear correlation
between the variables is valid for determining the regression line. It is denoted by $R^2$. The
value of $R^2$ lies between 0 and 1. If the value of $R^2$ is close to 1, the assumption of linearity is
valid, and if it is close to 0, the assumption cannot be regarded as valid.
2. $R^2$ = (Correlation coefficient between Y and $\hat{Y}$)$^2$
   = (Correlation coefficient between Y and a + bX)$^2$
   = (Correlation coefficient between Y and X)$^2$ (since r is independent of change of origin and scale)
   = $r^2$
   = Explained variance / Total variance

3. Interpretation on the basis of the value of $R^2$
On the basis of the value of $R^2$, we can know the trustworthiness of the estimates obtained
using the regression, and we can also have an idea about the validity of the assumption of linear
correlation between X and Y.

Situation             Forecasts obtained by      Assumption about linear
                      regression lines           correlation between X and Y
$R^2$ = 1             100% reliable              Perfectly valid
$R^2$ = 0             100% unreliable            Perfectly invalid
$R^2$ is close to 1   Considerably reliable      Considerably valid
$R^2$ is close to 0   Considerably unreliable    Considerably invalid
4. Practically:
If r = 0.8, the coefficient of determination $R^2 = 0.64$.
It means that 64% of the variation in Y is explained by the regression model.
The remaining 36% of the variation in Y is unexplained, i.e. due to error.
Note:
This is true for models with only one independent variable.
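
The split into explained and unexplained variation can be checked directly. The sketch below fits the line of Y on X to the intelligence test scores and weekly sales of practical example 40 and compares explained variation / total variation with r squared; the variable names are illustrative, and the choice of this data set is only for convenience.

```python
from math import sqrt

# Intelligence test scores (X) and weekly sales (Y) of the nine salesmen in example 40.
X = [50, 60, 50, 60, 80, 50, 80, 40, 70]
Y = [30, 60, 40, 50, 60, 30, 70, 50, 60]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
s_xx = sum((x - x_bar) ** 2 for x in X)
s_yy = sum((y - y_bar) ** 2 for y in Y)

# Regression line of Y on X: Y = a + b X.
b = s_xy / s_xx
a = y_bar - b * x_bar
y_hat = [a + b * x for x in X]

total_variation = s_yy                                          # sum of (Y - Y_bar)^2
explained_variation = sum((yh - y_bar) ** 2 for yh in y_hat)    # sum of (Y_hat - Y_bar)^2
unexplained_variation = sum((y - yh) ** 2 for y, yh in zip(Y, y_hat))

r = s_xy / sqrt(s_xx * s_yy)
r_squared = explained_variation / total_variation

print(a, b, r, r_squared)
```

For this data r = 0.75, so about 56% of the variation in weekly sales is explained by the test scores and the remaining 44% is left unexplained.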

(9) DISTINCTION BETWEEN CORRELATION AND REGRESSION


Sr. No. 1
Correlation: Correlation measures the nature and strength of association between two or more variables.
Regression: Regression gives mathematically an average relationship between the variables.

Sr. No. 2
Correlation: Correlation does not imply a cause-effect relationship between the variables under study.
Regression: Regression analysis pre-assumes a cause-effect relationship between the variables under study.

Sr. No. 3
Correlation: The study of correlation gives an idea about the direction and degree of relationship between two variables. It does not enable us to predict the value of one variable when the value of the other variable is given.
Regression: Regression analysis establishes the functional relationship between the variables under study. It is used to estimate the value of the dependent variable when the value of the independent variable is given.

Sr. No. 4
Correlation: The correlation coefficient r is independent of change of origin and scale.
Regression: Regression coefficients are independent of change of origin but dependent on change of scale.

Sr. No. 5
Correlation: Correlation analysis is confined only to the study of linear relationship between the variables and therefore has limited applications.
Regression: Regression analysis has much wider application, as it studies linear as well as non-linear relationships between variables.

Sr. No. 6
Correlation: There may be nonsense or spurious correlation between two variables because of pure chance.
Regression: There is nothing like nonsense regression.

Sr. No. 7
Correlation: $r_{XY} = r_{YX}$
Regression: $b_{XY} \ne b_{YX}$

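PRACTICAL EXAMPLES
37. From the following data, find the two regression equations.
X 1 2 3 4 5 6 7
Y 2 4 7 6 5 6 5
Sol.
  X    Y     x     y    x^2   y^2   xy
  1    2    -3    -3     9     9     9
  2    4    -2    -1     4     1     2
  3    7    -1     2     1     4    -2
  4    6     0     1     0     1     0
  5    5     1     0     1     0     0
  6    6     2     1     4     1     2
  7    5     3     0     9     0     0
 28   35     0     0    28    16    11
where $x = X - \bar{X}$ and $y = Y - \bar{Y}$, with $\bar{X} = 28/7 = 4$ and $\bar{Y} = 35/7 = 5$.
Working:
$b_{YX} = \dfrac{\sum xy}{\sum x^2} = \dfrac{11}{28} = 0.393$
$b_{XY} = \dfrac{\sum xy}{\sum y^2} = \dfrac{11}{16} = 0.688$
(i) Regression Equation Y on X
$Y - \bar{Y} = b_{YX}(X - \bar{X})$
Y - 5 = 0.393 (X - 4)
Y - 5 = 0.393X - 1.57
Y = 0.393X - 1.57 + 5
Y = 3.43 + 0.393X
(ii) Regression Equation X on Y
$X - \bar{X} = b_{XY}(Y - \bar{Y})$
X - 4 = 0.688 (Y - 5)
X - 4 = 0.688Y - 3.44
X = 0.688Y - 3.44 + 4
X = 0.56 + 0.688Y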
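
The two regression equations of Example 37 can be reproduced with a short sketch. The function name `regression_line` is hypothetical; it returns the intercept and slope of the line of the second argument on the first, using deviations from the actual means as in the working above.

```python
# Data of Example 37.
X = [1, 2, 3, 4, 5, 6, 7]
Y = [2, 4, 7, 6, 5, 6, 5]

def regression_line(independent, dependent):
    """Return (intercept, slope) of the regression of `dependent` on `independent`."""
    n = len(independent)
    mean_i = sum(independent) / n
    mean_d = sum(dependent) / n
    sum_prod = sum((i - mean_i) * (d - mean_d) for i, d in zip(independent, dependent))
    sum_sq = sum((i - mean_i) ** 2 for i in independent)
    slope = sum_prod / sum_sq
    intercept = mean_d - slope * mean_i
    return intercept, slope

a_yx, b_yx = regression_line(X, Y)   # Y on X: Y = a_yx + b_yx * X
a_xy, b_xy = regression_line(Y, X)   # X on Y: X = a_xy + b_xy * Y

print(f"Y on X: Y = {a_yx:.2f} + {b_yx:.3f} X")
print(f"X on Y: X = {a_xy:.2f} + {b_xy:.3f} Y")

# Both lines pass through the point of means (4, 5).
y_at_mean_x = a_yx + b_yx * 4
x_at_mean_y = a_xy + b_xy * 5
```

Swapping the roles of the two lists is all that is needed to move from the Y on X line to the X on Y line, which makes it clear that the two equations are not algebraic rearrangements of each other.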
38. From the following data obtain the two regression equations.
Sales 91 97 108 121 67 124 51 73 111 57
Purchases 71 75 69 97 70 91 39 61 80 47
(Ans. Y = 15.1 + 0.61X; X = -5.2 + 1.36Y)
39. From the following data obtain the two regression equations:
X 6 2 10 4 8
Y 9 11 5 8 7
(Ans. Y = 11.9 - 0.65X; X = 16.4 - 1.3Y)
40. The following data relate to the scores obtained by 9 salesmen of a company in an intelligence
test and their weekly sales in thousand rupees.
Salesman A B C D E F G H I
Intelligence Test Scores 50 60 50 60 80 50 80 40 70
Weekly Sales 30 60 40 50 60 30 70 50 60
a. Obtain the regression equation of sales on intelligence test scores of the salesmen.
b. If the intelligence test score of a salesman is 65, what would be his
expected weekly sales?
(Ans. Y = 5 + 0.75X; 53.75)
41. Using the following data obtain the two regression equations.
X 14 19 24 21 26 22 15 20 19
Y 31 36 48 37 50 45 33 41 39
(Ans. Y = 7.8 + 1.61X; X = -2.4 + 0.56Y)
42. Given the following information:
Year 1999 2000 2001 2002 2003 2004
Research expense (in'000 Rs.) 5 11 4 5 3 2
Annual Profit (in '000 Rs.) 31 40 30 34 25 20
i. Develop the estimating equation that best describes the given data.
ii. Estimate the annual profit when the research expense is Rs. 7,000.
iii. How much of the variation in the annual profits is explained by the variation in the research
expenditure? (Ans. (i) Y = 20.6 + 1.88X; (ii) 33.76 (Rs. '000); (iii) 81.78%)

43. From the following data of the age of husband and the age of wife, form the two regression lines.
Calculate the husband's age when the wife's age is 16. Calculate the wife's age when the husband's age is
25.
28 28 29 30 31 33 35
Husband's age 36 23 27 27 29 28
Wife's age 29 18 20 22 27 21 29
(Ans. Y = -1.73 + 0.891X; X = 8.2 + 0.872Y; Esti. Y = 21; Esti. X = 22)

44. The following table shows the ages (X) and blood pressure (Y) of 8 persons.
X 52 63 45 36 72 65 47 25
Y 62 53 51 25 79 43 60 33
Obtain the regression equation of Y on X and find the expected blood pressure of a person who
is 49 years old.
Sol.
Take $dx = X - 50$ and $dy = Y - 50$.
  X     Y    dx    dy   dx^2   dx.dy
  52    62    2    12      4     24
  63    53   13     3    169     39
  45    51   -5     1     25     -5
  36    25  -14   -25    196    350
  72    79   22    29    484    638
  65    43   15    -7    225   -105
  47    60   -3    10      9    -30
  25    33  -25   -17    625    425
 405   406    5     6   1737   1336
$\bar{X} = 405/8 = 50.63$, $\bar{Y} = 406/8 = 50.75$
Working:
$$b_{YX} = \frac{n \sum dx\,dy - \sum dx \sum dy}{n \sum dx^2 - (\sum dx)^2} = \frac{8(1336) - (5)(6)}{8(1737) - (5)^2} = 0.77$$
Required:
(i) Regression Equation Y on X
$Y - \bar{Y} = b_{YX}(X - \bar{X})$
Y - 50.75 = 0.77 (X - 50.63)
Y - 50.75 = 0.77X - 38.985
Y = 0.77X - 38.985 + 50.75
Y = 11.76 + 0.77X
(ii) Estimation of Y when X = 49
Taking Y on X:
Y = 11.76 + 0.77(49)
  = 11.76 + 37.73
  = 49.49
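
Example 44 uses deviations from an assumed mean of 50 for both series. A sketch of the same shortcut in code is given below; the names are illustrative. Because the code keeps the slope unrounded, the estimate differs from 49.49 in the second decimal place.

```python
# Ages (X) and blood pressure (Y) of the 8 persons in Example 44.
X = [52, 63, 45, 36, 72, 65, 47, 25]
Y = [62, 53, 51, 25, 79, 43, 60, 33]

assumed_x, assumed_y = 50, 50          # assumed means used in the worked solution
n = len(X)

dx = [x - assumed_x for x in X]
dy = [y - assumed_y for y in Y]

sum_dx, sum_dy = sum(dx), sum(dy)
sum_dx2 = sum(d * d for d in dx)
sum_dxdy = sum(p * q for p, q in zip(dx, dy))

# Slope of Y on X from the assumed-mean formula.
b_yx = (n * sum_dxdy - sum_dx * sum_dy) / (n * sum_dx2 - sum_dx ** 2)

# The line passes through the actual means.
x_bar = assumed_x + sum_dx / n         # 50.625
y_bar = assumed_y + sum_dy / n         # 50.75
a_yx = y_bar - b_yx * x_bar

estimate_at_49 = a_yx + b_yx * 49
print(round(b_yx, 2), round(a_yx, 2), round(estimate_at_49, 2))
```

Only the slope needs the assumed-mean sums; the intercept then follows from the actual means, which are recovered by adding the average deviation back to the assumed mean.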
45. From the following data find the regression equations, and estimate the likely value of Y when
X = 100.
X 72 98 76 81 56 76 92 88 49
Y 124 131 117 132 96 120 136 97 85
(Ans. Y = 51.88 + 0.83X; X = 2.63 + 0.64Y; Esti. Y = 134.88)

46. The heights of a sample of 10 fathers and their eldest sons are given below (to the nearest cm.):
Height of father (X) 170 167 162 163 167 166 169 171 164 165
Height of son (Y) 168 167 166 166 168 165 168 170 165 168
(Ans. Y = 98.88 + 0.41X; X = -70.88 + 1.42Y)

47. To know what relationship exists between unemployment and suicide attempts, a sociologist
surveyed twelve cities and obtained the following data.
City                                         1    2    3    4    5    6    7    8    9    10   11    12
Unemployment rate (percent)                  7.3  6.4  6.2  5.5  6.4  4.7  5.8  7.9  6.7  9.6  10.3  7.2
No. of suicide attempts per 1000 residents   22   17   9    8    12   5    7    19   13   29   33    18
i. Develop the estimating equation that best describes the given data.
ii. Estimate the attempted suicide rate when the unemployment rate happens to be 6 percent.
iii. Calculate the coefficient of determination and interpret it.
(Ans. (i) Y = -20.47 + 5.21X; (ii) Esti. Y = 10.79 (per 1000 residents); (iii) 0.9274)

48. Find the regression equations from the following data:
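$\sum X = 60$, $\sum Y = 40$, $\sum XY = 1150$, $\sum X^2 = 4160$, $\sum Y^2 = 1720$, N = 10
Sol.
Given: $\sum X = 60$, $\sum Y = 40$, $\sum XY = 1150$, $\sum X^2 = 4160$, $\sum Y^2 = 1720$, N = 10
Working:
(i) $\bar{X} = \dfrac{\sum X}{N} = \dfrac{60}{10} = 6$, $\bar{Y} = \dfrac{\sum Y}{N} = \dfrac{40}{10} = 4$
(ii) $b_{YX} = \dfrac{N \sum XY - \sum X \sum Y}{N \sum X^2 - (\sum X)^2} = \dfrac{10(1150) - (60)(40)}{10(4160) - (60)^2} = 0.239$
$b_{XY} = \dfrac{N \sum XY - \sum X \sum Y}{N \sum Y^2 - (\sum Y)^2} = \dfrac{10(1150) - (60)(40)}{10(1720) - (40)^2} = 0.583$
Required:
(i) Regression Equation Y on X
$Y - \bar{Y} = b_{YX}(X - \bar{X})$
Y - 4 = 0.239 (X - 6)
Y - 4 = 0.239X - 1.43
Y = 0.239X - 1.43 + 4
Y = 2.57 + 0.239X
(ii) Regression Equation X on Y
$X - \bar{X} = b_{XY}(Y - \bar{Y})$
X - 6 = 0.583 (Y - 4)
X - 6 = 0.583Y - 2.33
X = 0.583Y - 2.33 + 6
X = 3.67 + 0.583Y
49. Find the regression equations of X and Y from the following data:
$\sum X = 24$, $\sum Y = 44$, $\sum XY = 306$, $\sum X^2 = 164$, $\sum Y^2 = 574$, N = 4
(Ans. Y = -1.6 + 2.1X; X = 0.86 + 0.467Y)

50. Obtain the regression equations for the following information:
$\sum X = 50$, $\sum Y = 30$, $\sum XY = 1000$, $\sum X^2 = 3000$, $\sum Y^2 = 1800$, N = 10
(Ans. Y = 1.45 + 0.309X; X = 3.51 + 0.497Y)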
51. From the following results, obtain the two regression equations and estimate the yield of crops
when the rainfall is 29 cms, and the rainfall when the yield is 600 kg.
         Yield in Kgs. (Y)   Rainfall in cms. (X)
Mean          508.4               26.7
S.D.           36.8                4.61
Coefficient of correlation between yield and rainfall = 0.52
Sol.
Given: $\bar{Y} = 508.4$, $\bar{X} = 26.7$, $\sigma_Y = 36.8$, $\sigma_X = 4.61$, r = 0.52
Working:
$b_{YX} = r \cdot \dfrac{\sigma_Y}{\sigma_X} = 0.52 \times \dfrac{36.8}{4.61} = 4.151$
$b_{XY} = r \cdot \dfrac{\sigma_X}{\sigma_Y} = 0.52 \times \dfrac{4.61}{36.8} = 0.065$
Required:
(i) Two Regression Equations
Regression Equation Y on X
$Y - \bar{Y} = b_{YX}(X - \bar{X})$
Y - 508.4 = 4.151 (X - 26.7)
Y - 508.4 = 4.151X - 110.83
Y = 4.151X - 110.83 + 508.4
Y = 397.57 + 4.151X
Regression Equation X on Y
$X - \bar{X} = b_{XY}(Y - \bar{Y})$
X - 26.7 = 0.065 (Y - 508.4)
X - 26.7 = 0.065Y - 33.05
X = 0.065Y - 33.05 + 26.7
X = -6.35 + 0.065Y
(ii) Estimation
Estimation of Y when X = 29, taking Y on X:
Y = 397.57 + 4.151(29) = 517.95 Kg
Estimation of X when Y = 600, taking X on Y:
X = -6.35 + 0.065(600) = 32.65 Cms
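
When only the means, standard deviations and r are given, as in Example 51, both lines follow from $b_{YX} = r \sigma_Y / \sigma_X$ and $b_{XY} = r \sigma_X / \sigma_Y$. The sketch below wraps this in a hypothetical helper, `lines_from_summary`. It keeps the coefficients unrounded, so the intercept of X on Y and the rainfall estimate differ slightly from the hand-rounded figures.

```python
def lines_from_summary(x_bar, y_bar, sd_x, sd_y, r):
    """Return ((a_yx, b_yx), (a_xy, b_xy)) for the lines Y = a_yx + b_yx*X and X = a_xy + b_xy*Y."""
    b_yx = r * sd_y / sd_x
    b_xy = r * sd_x / sd_y
    a_yx = y_bar - b_yx * x_bar     # Y on X passes through (x_bar, y_bar)
    a_xy = x_bar - b_xy * y_bar     # X on Y passes through (x_bar, y_bar)
    return (a_yx, b_yx), (a_xy, b_xy)

# Summary figures of Example 51: X = rainfall in cms, Y = yield in kgs.
(a_yx, b_yx), (a_xy, b_xy) = lines_from_summary(
    x_bar=26.7, y_bar=508.4, sd_x=4.61, sd_y=36.8, r=0.52
)

yield_at_29_cms = a_yx + b_yx * 29       # use Y on X to estimate yield
rainfall_at_600_kg = a_xy + b_xy * 600   # use X on Y to estimate rainfall

print(round(b_yx, 3), round(b_xy, 3))
print(round(yield_at_29_cms, 2), round(rainfall_at_600_kg, 2))
```

Note that each estimate uses its own line: yield is estimated from the Y on X equation, and rainfall from the X on Y equation, never by inverting the other one.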
52. Find out the likely production corresponding to a rainfall of 40 cms. from the following data.
          Rainfall (in cms.)   Output (in quintals)
Average         30                   50
S.D.             5                   10
r = 0.8 (Ans. Y = 2 + 1.6X; Esti. Y = 66)
53. You are given the following data:
                      X     Y
Arithmetic Mean       36    85
Standard Deviation    11    8
Correlation coefficient between X and Y = 0.66
You are required to
1. Find the two regression equations. 2. Estimate the value of X when Y = 75.
(Ans. Y = 67.72 + 0.48X; X = -41.35 + 0.91Y; Esti. X = 26.9)

54. In a correlation study the following values are obtained:
                             X      Y
Mean                         65     67
Standard Deviation           2.5    3.5
Coefficient of Correlation   0.8
Find the two regression equations that are associated with the above values.
(Ans. Y = -5.8 + 1.12X; X = 26.81 + 0.57Y)
55. Compute the two regression equations on the basis of the following information:
                      X     Y
Mean                  40    45
Standard Deviation    10    9
Karl Pearson's correlation coefficient between X and Y = 0.5
Also estimate the value of Y for X = 48 using the appropriate regression equation.
(Ans. Y = 27 + 0.45X; X = 14.8 + 0.56Y; Esti. Y = 48.6)

56. The information given below relates to the advertisement and sales of a company.
                      Advertisement Expenditure (Rs. Lakhs) (X)   Sales (Rs. Lakhs) (Y)
Arithmetic Mean                 20                                     100
Standard Deviation               3                                      12
Correlation coefficient between X and Y = 0.8
i. Find the two regression equations.
ii. What should be the advertisement expenditure, if the company wants to attain a sales
target of Rs. 120 lakhs?
(Ans. Y = 36 + 3.2X; X = 0.2Y; Esti. X = 24)
57. Given the following results for the height (X) and weight (Y) in appropriate units of 1000 students:
X̄ = 68, Ȳ = 150, σx = 2.5, σy = 20, r = 0.6
Obtain the equations of the two regression lines. Estimate the height of a student whose weight is 200 units and also estimate the weight of a student whose height is 60 units.
(Ans. Y = -176.4 + 4.8X; X = 56.75 + 0.075Y; Esti. Y = 111.6; Esti. X = 71.75)
58. The following data are given regarding expenditure on advertising and sales of a particular firm:
            Advertisement Expenditure (Rs. Lakhs) (X)    Sales (Rs. Lakhs) (Y)
Mean                         10                                   90
S.D.                          3                                   12
Coefficient of correlation rxy = 0.8
i. Calculate the regression equation of Y on X.
ii. Estimate the advertisement expenditure required to attain a sales target of Rs. 120 lakhs.
(Ans. Y = 58 + 3.2X; Esti. X = 16)
59. Find out the regression equation showing the regression of capacity utilization on production from the following data:
                                   Average    Standard deviation
Production (in lakh units) (X)       35.6           10.5
Capacity utilization (in %) (Y)      84.8            8.5
Correlation coefficient r = 0.62
Estimate the production when capacity utilization is 70%.
(Ans. Y = 66.93 + 0.502X; X = -29.36 + 0.766Y; Esti. X = 24.26 lakh units)
60. From the following data, find out the probable yield when the rainfall is 29".
                        Rainfall    Yield
Mean                      25"       40 units per hectare
Standard deviation         3"        6 units per hectare
Correlation coefficient between rainfall and production = 0.8
(Ans. 46.4)
61. You are given variance of X = 9. The regression equations are 8X - 10Y + 66 = 0 and 40X - 18Y = 214. Find:
1. Average values of X and Y.
2. Correlation coefficient between the two variables.
3. S.D. of Y.
62. Regression equations of two variables X and Y are as follows: 3X + 2Y - 26 = 0 and 6X + Y - 31 = 0. Find:
1. The means.
2. The regression coefficients.
3. The coefficient of correlation between X and Y.
(Ans. (1) 4, 7; (2) -1.5, -0.17; (3) -0.5)
63. In a partially destroyed record the following data are available: variance of X = 25, regression equation of X upon Y is 5X - Y = 22 and regression equation of Y upon X is 64X - 45Y = 24. Find:
1. Mean values of X and Y.
2. S.D. of Y.
3. Coefficient of correlation between X and Y.
(Ans. (1) 6, 8; (2) 13.39; (3) 0.53)
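Problems in which only the two regression lines are given follow one routine: the lines intersect at (X̄, Ȳ), their slopes give byx and bxy, and r = ±√(byx × bxy), taking the common sign of the two coefficients. A hedged Python sketch of that routine follows; the function name and the (a, b, c) encoding of a line aX + bY = c are assumptions for illustration, and the caller must state which line is Y on X.

```python
import math


def analyse_regression_lines(y_on_x, x_on_y, var_x):
    """Each line is a tuple (a, b, c) meaning a*X + b*Y = c.
    Returns (mean_x, mean_y, r, sd_y). Hypothetical helper for illustration."""
    a1, b1, c1 = y_on_x
    a2, b2, c2 = x_on_y

    # Both lines pass through (mean_x, mean_y): solve the 2x2 system (Cramer's rule).
    det = a1 * b2 - a2 * b1
    mean_x = (c1 * b2 - c2 * b1) / det
    mean_y = (a1 * c2 - a2 * c1) / det

    b_yx = -a1 / b1   # slope of Y on X, from Y = (c1 - a1*X) / b1
    b_xy = -b2 / a2   # slope of X on Y, from X = (c2 - b2*Y) / a2

    product = b_yx * b_xy          # equals r squared
    if product < 0 or product > 1:
        raise ValueError("inconsistent regression equations")

    r = math.copysign(math.sqrt(product), b_yx)
    # b_yx = r * sd_y / sd_x, so sd_y = b_yx * sd_x / r (undefined when r = 0)
    sd_y = b_yx * math.sqrt(var_x) / r if r != 0 else None
    return mean_x, mean_y, r, sd_y


# Problem 63: Y on X is 64X - 45Y = 24, X on Y is 5X - Y = 22, variance of X = 25
mean_x, mean_y, r, sd_y = analyse_regression_lines((64, -45, 24), (5, -1, 22), 25)
print(mean_x, mean_y, round(r, 2), round(sd_y, 2))
```

For problem 63 this gives means 6 and 8 and r of about 0.53. The unrounded S.D. of Y is about 13.33; the printed 13.39 presumably reflects rounding in intermediate steps. The `ValueError` branch covers pairs of equations whose coefficients have opposite signs or whose product exceeds 1, which cannot both be regression lines of the same data.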

64. For 100 students of a class, the regression equation of marks in Statistics (X) on marks in Commerce (Y) is 3Y - 5X + 180 = 0. The mean marks in Commerce is 50 and the variance of marks in Statistics is 4/9 of the variance of marks in Commerce. Find the mean marks in Statistics and the coefficient of correlation between marks in the two subjects.
(Ans. 66; 0.9)
65. For 50 students of a class, the regression equation of marks in Statistics (X) on marks in Accountancy (Y) is 3Y - 5X + 180 = 0. The mean marks in Accountancy is 44 and the variance of marks in Statistics is 9/16th of the variance of marks in Accountancy. Find:
i. Mean marks in Statistics.
ii. Coefficient of correlation.
(Ans. (i) 62.4; (ii) 0.8)
66. A student obtained the following two regression equations. Do you agree with him?
6X = 15Y + 21 and 21X + 14Y = 56
(Ans. Data are inconsistent)

67. Compute the two regression coefficients from the data given below and find the value of r (the correlation coefficient) using the same.
X    7    4    8    6    5
Y    6    5    9    8    2
(Ans. (1) 1.2, 0.4; (2) 0.69)
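For ungrouped data the regression coefficients come straight from the sums of deviations: byx = Σdxdy / Σdx², bxy = Σdxdy / Σdy², and r = ±√(byx × bxy). The Python sketch below illustrates this under one loudly stated assumption: the Y values are read as 6, 5, 9, 8, 2, the ordering that reproduces the printed answers. The function name is hypothetical.

```python
import math


def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy, r) from paired raw observations.
    Hypothetical helper written for this illustration."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    b_yx = s_xy / s_xx
    b_xy = s_xy / s_yy
    # r is the geometric mean of the two coefficients, with their common sign
    r = math.copysign(math.sqrt(b_yx * b_xy), s_xy)
    return b_yx, b_xy, r


xs = [7, 4, 8, 6, 5]
ys = [6, 5, 9, 8, 2]   # assumed ordering of the Y row
b_yx, b_xy, r = regression_coefficients(xs, ys)
print(round(b_yx, 2), round(b_xy, 2), round(r, 2))
```

Under that reading the coefficients are 1.2 and 0.4 and r is about 0.69, in line with the answer given for the problem.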
68. Obtain the regression equation of Y on X and the value of r from the following table giving the marks in Accountancy and Statistics:
Marks in                 Marks in Accountancy (X)
Statistics (Y)     5-15    15-25    25-35    35-45    Total
0-10                 1       1        -        -        2
10-20                3       6        5        1       15
20-30                1       8        9        2       20
30-40                -       3        9        3       15
40-50                -       -        4        4        8
Total                5      18       27       10       60
(Ans. Y = 8.88 + 0.671X; r = 0.53)
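For a bivariate frequency table the same formulas apply with class mid-points weighted by the cell frequencies. The following Python sketch shows one way to organise the arithmetic; it is an illustration only. The function name is hypothetical, and the cell frequencies used are an assumed reading of the table, chosen to agree with the printed row and column totals (Accountancy taken as X, Statistics as Y).

```python
import math


def grouped_regression(x_mid, y_mid, freq):
    """freq[i][j] = frequency of the cell with Y = y_mid[i] and X = x_mid[j].
    Returns (a_yx, b_yx, r) for the line Y = a_yx + b_yx * X."""
    n = sum(sum(row) for row in freq)
    col_tot = [sum(row[j] for row in freq) for j in range(len(x_mid))]
    row_tot = [sum(row) for row in freq]

    mean_x = sum(f * x for f, x in zip(col_tot, x_mid)) / n
    mean_y = sum(f * y for f, y in zip(row_tot, y_mid)) / n
    var_x = sum(f * x * x for f, x in zip(col_tot, x_mid)) / n - mean_x ** 2
    var_y = sum(f * y * y for f, y in zip(row_tot, y_mid)) / n - mean_y ** 2
    cov = sum(
        freq[i][j] * y_mid[i] * x_mid[j]
        for i in range(len(y_mid))
        for j in range(len(x_mid))
    ) / n - mean_x * mean_y

    b_yx = cov / var_x
    a_yx = mean_y - b_yx * mean_x
    r = cov / math.sqrt(var_x * var_y)
    return a_yx, b_yx, r


x_mid = [10, 20, 30, 40]          # mid-points of 5-15, 15-25, 25-35, 35-45
y_mid = [5, 15, 25, 35, 45]       # mid-points of 0-10, ..., 40-50
freq = [                          # assumed cell frequencies (see lead-in)
    [1, 1, 0, 0],
    [3, 6, 5, 1],
    [1, 8, 9, 2],
    [0, 3, 9, 3],
    [0, 0, 4, 4],
]
a_yx, b_yx, r = grouped_regression(x_mid, y_mid, freq)
print(round(a_yx, 2), round(b_yx, 3), round(r, 2))
```

With these frequencies the slope is about 0.671 and r about 0.53; the intercept comes out near 8.87, and the printed 8.88 follows from rounding the slope to 0.671 before computing it.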

69. The following table gives the ages of husbands and wives for 50 newly married couples.
i) Find the two regression equations.
ii) Estimate the age of the husband when the wife is 20 years of age.
iii) Estimate the age of the wife when the husband is 30 years of age.
                        Age of husband (X)
Age of wife (Y)     20-25    25-30    30-35
16-20                 9       14        0
20-24                 6       11        3
24-28                 0        0        7
(Ans. (i) Y = 8.03 + 0.47X, X = 11.87 + 0.73Y; (ii) Esti. X = 26.47 ≈ 26; (iii) Esti. Y = 22.13 ≈ 22)
