Business Statistics Unit 1
BUSINESS STATISTICS
SEM UNIT 1
ii. Correlation:
Simple (Bi variate)
Multiple
Partial
Simple Correlation: Study of correlation between only two variables is known as simple correlation or bivariate correlation.
Multiple Correlation: In multiple correlation three or more variables are studied simultaneously. Multiple correlation consists of measurement of the relationship between a dependent variable and two or more independent variables.
Partial Correlation: Correlation between a dependent variable and one particular independent variable, keeping the other variables constant, is called partial correlation.
YATRI CLASSES
Page 2
[Figure: scatter diagrams of the types of correlation: perfect correlation (positive, r = +1; negative, r = -1), strong correlation (positive and negative), weak correlation (positive and negative), and a special category covering non-linear (curvilinear) correlation and no correlation, i.e. lack of correlation (r = 0).]
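Formulae of Coefficient of Correlation (r) and when each is to be applied:
1) $r = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$
   To be applied when Cov(X, Y), $\sigma_X$ and $\sigma_Y$ are directly given in the problem.
2) $r = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{n \sigma_X \sigma_Y}$
   To be applied when $\sum (X - \bar{X})(Y - \bar{Y})$, n, $\sigma_X$ and $\sigma_Y$ are directly given in the problem.
3) $r = \dfrac{\sum xy}{n \sigma_X \sigma_Y}$, where $x = (X - \bar{X})$ and $y = (Y - \bar{Y})$
   To be applied when $\sum xy$, n, $\sigma_X$ and $\sigma_Y$ are directly given in the problem.
4) $r = \dfrac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}}$
   To be applied when n, $\sum XY$, $\sum X$, $\sum Y$, $\sum X^2$ and $\sum Y^2$ are directly given in the problem, or for rectification.
5) $r = \dfrac{n\sum d_x d_y - \sum d_x \sum d_y}{\sqrt{n\sum d_x^2 - (\sum d_x)^2}\,\sqrt{n\sum d_y^2 - (\sum d_y)^2}}$, where $d_x = X - a$ and $d_y = Y - b$
   To be applied when: i) X-series & Y-series are given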
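As a rough check that the formulae above describe the same quantity, the Python sketch below evaluates formulae 1, 4 and 5 on the fathers' and sons' heights from exercise 5 of these notes. The function names and the assumed means 67 and 68 are illustrative choices, not part of the notes.

```python
import math

# Heights of fathers (X) and sons (Y), taken from exercise 5 of these notes.
X = [65, 66, 67, 67, 68, 69, 70, 72]
Y = [67, 68, 65, 68, 72, 72, 69, 71]


def r_from_covariance(xs, ys):
    """Formula 1: r = Cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sigma_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sigma_x * sigma_y)


def r_from_raw_sums(xs, ys):
    """Formula 4: uses only n, sum X, sum Y, sum XY, sum X^2 and sum Y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def r_from_assumed_means(xs, ys, a, b):
    """Formula 5: formula 4 applied to deviations dx = X - a, dy = Y - b."""
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    return r_from_raw_sums(dx, dy)


r1 = r_from_covariance(X, Y)
r4 = r_from_raw_sums(X, Y)
r5 = r_from_assumed_means(X, Y, a=67, b=68)  # assumed means chosen arbitrarily
print(round(r1, 3), round(r4, 3), round(r5, 3))
```

All three values agree (about 0.603, the answer quoted for exercise 5), so the choice of formula depends only on which quantities the problem supplies.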
C. Properties of Coefficient of Correlation: (Karl Pearson's Coefficient)
1. It is a pure number and it measures the strength of association between two variables.
If the value of "r" is close to 0 (on either side), the correlation is weak positive or weak negative, depending upon the sign of "r".
3) CORRELATION AND CAUSATION (i.e. CAUSE-EFFECT RELATIONSHIP)
Correlation analysis gives an idea about the nature and strength of the relation between the two variables under study. However, it fails to reflect the cause-effect relationship between the variables. In bivariate analysis, correlation does not imply causation. Even a high degree of correlation does not imply a cause-effect relationship. But a cause-effect relationship always implies correlation.
A high degree of correlation between the variables may be due to the following reasons.
3. Pure chance:
It may happen that a small randomly selected sample from a bivariate distribution shows a high degree of correlation, though the variables may not actually be correlated in the population.
E.g. The correlation between the size of the shoe and the intelligence of a group of individuals should be zero, since the forces affecting the two variables are entirely independent of each other. However, in any sample taken from the above population, there is a chance that the value of r is non-zero. Such correlation is termed chance correlation, spurious correlation or nonsense correlation.
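The effect of pure chance can be illustrated with a small simulation. The Python sketch below is only an illustration under stated assumptions: the two variables are modelled as independent standard normal draws (so the population correlation is zero), and the sample sizes 5 and 500, the 500 repetitions and the helper names are hypothetical choices.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson's r from deviations about the sample means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_x2 = sum((x - mean_x) ** 2 for x in xs)
    sum_y2 = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


def mean_abs_r(sample_size, trials, rng):
    """Average |r| over repeated samples of two independent variables."""
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]
        total += abs(pearson_r(xs, ys))
    return total / trials


rng = random.Random(0)  # fixed seed so repeated runs give the same figures
small_sample = mean_abs_r(5, 500, rng)
large_sample = mean_abs_r(500, 500, rng)
print("average |r| with n = 5:  ", round(small_sample, 3))
print("average |r| with n = 500:", round(large_sample, 3))
```

Small samples routinely show a sizeable r even though the variables are unrelated, while large samples give values close to zero.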
PRACTICAL EXAMPLES
1. Calculate the coefficient of correlation and interpret the results.
X 28 37 40 38 35 33 40 32 34 33
Y 23 32 33 34 30 26 29 31 34 38
Sol.
   X     Y    x = X - 35   y = Y - 31    x²     y²     xy
  28    23       -7           -8         49     64     56
  37    32        2            1          4      1      2
  40    33        5            2         25      4     10
  38    34        3            3          9      9      9
  35    30        0           -1          0      1      0
  33    26       -2           -5          4     25     10
  40    29        5           -2         25      4    -10
  32    31       -3            0          9      0      0
  34    34       -1            3          1      9     -3
  33    38       -2            7          4     49    -14
 350   310        0            0        130    166     60
$\bar{X} = 35$, $\bar{Y} = 31$, where $x = (X - \bar{X})$ and $y = (Y - \bar{Y})$.
Required:
(i) Coefficient of Correlation
$r = \dfrac{\sum xy}{\sqrt{\sum x^2}\,\sqrt{\sum y^2}} = \dfrac{60}{\sqrt{130}\,\sqrt{166}} = 0.41$
(ii) Interpretation
There is weak +ve correlation.
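The table for Example 1 can be reproduced in a few lines of Python. This is a minimal sketch; the variable names are illustrative, and the second X value is taken as 37, as it appears in the solution table.

```python
import math

# Data of Example 1 (second X value taken as 37, as in the solution table).
X = [28, 37, 40, 38, 35, 33, 40, 32, 34, 33]
Y = [23, 32, 33, 34, 30, 26, 29, 31, 34, 38]

n = len(X)
mean_x = sum(X) / n  # 35
mean_y = sum(Y) / n  # 31

# Deviations from the actual means.
x = [xi - mean_x for xi in X]
y = [yi - mean_y for yi in Y]

sum_x2 = sum(d * d for d in x)             # column total of x^2
sum_y2 = sum(d * d for d in y)             # column total of y^2
sum_xy = sum(p * q for p, q in zip(x, y))  # column total of xy

r = sum_xy / math.sqrt(sum_x2 * sum_y2)
print(sum_x2, sum_y2, sum_xy, round(r, 2))
```

The column totals 130, 166 and 60 and the value r = 0.41 match the worked solution.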
2. Find the correlation coefficient between sales and advertising expenditure from the following data:
Sales (Rs. Lakh) 65 66 67 67 68 69 70 72
Advertising expenditure (Rs. '000) 67 68 65 68 72 72 69 71
(Ans. 0.60)
3. Calculate Pearson's coefficient of correlation between advertisement cost and sales as per the data given below:
Advertisement Cost in '000 Rs. 39 65 62 90 82 75 25 98 36 78
Sales in lakh Rs. 47 53 58 86 62 68 60 91 51 84
(Ans. 0.78)
4. Calculate Pearson's coefficient of correlation from the following, taking 100 and 50 as the assumed averages of X and Y respectively.
X 104 111 104 114 118 117 105 108 106 100 104 105
Y 57 55 47 45 45 50 64 63 66 62 69 61
(Ans. -0.67)
5. From the following data, find out the correlation coefficient between heights of fathers and sons.
Heights of fathers (inches): 65 66 67 67 68 69 70 72
Heights of sons (inches): 67 68 65 68 72 72 69 71 (Ans. 0.603)
6. Calculate the coefficient of correlation and interpret the result.
Age of Husbands 23 27 39 29 30 31 33 35 36
Age of Wives 18 20 21 22 24 27 29 28 29
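Sol.
Let X be the age of husbands and Y be the age of wives. Deviations are taken from the assumed means 30 and 25: dx = X - 30, dy = Y - 25.
   X     Y    dx = X - 30   dy = Y - 25   dx²    dy²   dx.dy
  23    18       -7            -7          49     49     49
  27    20       -3            -5           9     25     15
  39    21        9            -4          81     16    -36
  29    22       -1            -3           1      9      3
  30    24        0            -1           0      1      0
  31    27        1             2           1      4      2
  33    29        3             4           9     16     12
  35    28        5             3          25      9     15
  36    29        6             4          36     16     24
 283   218       13            -7         211    145     84
$\bar{X} = 31.44$, $\bar{Y} = 24.22$
Required:
(i) Coefficient of Correlation (r)
$r = \dfrac{n\sum d_x d_y - \sum d_x \sum d_y}{\sqrt{n\sum d_x^2 - (\sum d_x)^2}\,\sqrt{n\sum d_y^2 - (\sum d_y)^2}}$
$= \dfrac{(9)(84) - (13)(-7)}{\sqrt{(9)(211) - (13)^2}\,\sqrt{(9)(145) - (-7)^2}} = \dfrac{847}{\sqrt{1730}\,\sqrt{1256}} = 0.57$
(ii) Interpretation
There is moderate +ve correlation.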
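The assumed-mean working of Example 6 can be checked with the short Python sketch below. The variable names are illustrative; the sixth wife's age is taken as 27, the value used in the solution table. With the data as tabulated, the products dx.dy sum to 84, and that total reproduces the stated answer of 0.57.

```python
import math

# Data of Example 6 (sixth wife's age taken as 27, as in the solution table).
husband = [23, 27, 39, 29, 30, 31, 33, 35, 36]
wife = [18, 20, 21, 22, 24, 27, 29, 28, 29]

A, B = 30, 25  # assumed means used in the notes

dx = [x - A for x in husband]
dy = [y - B for y in wife]
n = len(dx)

sum_dx, sum_dy = sum(dx), sum(dy)
sum_dx2 = sum(d * d for d in dx)
sum_dy2 = sum(d * d for d in dy)
sum_dxdy = sum(p * q for p, q in zip(dx, dy))

numerator = n * sum_dxdy - sum_dx * sum_dy
denominator = (math.sqrt(n * sum_dx2 - sum_dx ** 2)
               * math.sqrt(n * sum_dy2 - sum_dy ** 2))
r = numerator / denominator
print(sum_dx, sum_dy, sum_dx2, sum_dy2, sum_dxdy, round(r, 2))
```

Because r is independent of change of origin, any other pair of assumed means would give the same result.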
7. Calculate the coefficient of correlation and interpret the results.
Firm 1 2 3 4 5 6 7 8 9 10
Sales 50 55 55 60 65 65 60 60 50 55
Expenditure 11 13 14 16 16 15 14 13 13 15
(Ans. 0.80)
8. Calculate Karl Pearson's co - efficient from the following table of prices and supply of a
12. Compute Karl Pearson's coefficient of correlation in the following series relating to cost of living and wages.
Wages 100 101 102 100 99 98 97 98 96 96
Cost of living 98 99 99 97 95 92 95 94 90 91
(Ans. 0.92)
13. Calculate Karl Pearson's coefficient of correlation from the following data, for price and demand.
Price 14 16 17 18 19 20 21 22 23
Demand 84 78 70 75 66 67 62 58 60 (Ans. -0.954)
14. Two variates x and y when expressed as deviations from respective means, are given as follows.
Find the co -
n = 25, ΣX = 125, ΣY = 100, ΣX² = 650, ΣY² = 460, and ΣXY = 508
Working:
Corrected Result = Incorrect Result - Wrong Observations + Correct Observations
 Quantity   Incorrect result   Wrong observations   Correct observations   Corrected result
 n               25                   -                     -                    25
 ΣX             125                 6, 8                  8, 6                  125
 ΣY             100                14, 6                 12, 8                  100
 ΣX²            650                6², 8²                8², 6²                 650
 ΣY²            460               14², 6²               12², 8²                 436
 ΣXY            508            (6×14), (8×6)         (8×12), (6×8)              520
Required:
(i) Coefficient of Correlation
Corrected $r = \dfrac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}}$ (using the corrected totals)
$= \dfrac{(25)(520) - (125)(100)}{\sqrt{(25)(650) - (125)^2}\,\sqrt{(25)(436) - (100)^2}} = \dfrac{500}{(25)(30)} = 0.67$
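The rectification rule (corrected total = incorrect total - wrong observations + correct observations) is mechanical enough to sketch in Python. The helper names `correct_sums` and `pearson_from_sums` are hypothetical; the figures are those of the working above.

```python
import math


def correct_sums(sums, wrong_pairs, right_pairs):
    """Remove wrongly copied (X, Y) pairs from the totals and add the right ones.

    `sums` is (sum X, sum Y, sum X^2, sum Y^2, sum XY).
    """
    sx, sy, sxx, syy, sxy = sums
    for x, y in wrong_pairs:
        sx, sy, sxx, syy, sxy = sx - x, sy - y, sxx - x * x, syy - y * y, sxy - x * y
    for x, y in right_pairs:
        sx, sy, sxx, syy, sxy = sx + x, sy + y, sxx + x * x, syy + y * y, sxy + x * y
    return sx, sy, sxx, syy, sxy


def pearson_from_sums(n, sx, sy, sxx, syy, sxy):
    """Raw-sums formula for r (formula 4 of these notes)."""
    numerator = n * sxy - sx * sy
    denominator = math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2)
    return numerator / denominator


incorrect = (125, 100, 650, 460, 508)
corrected = correct_sums(incorrect,
                         wrong_pairs=[(6, 14), (8, 6)],
                         right_pairs=[(8, 12), (6, 8)])
r = pearson_from_sums(25, *corrected)
print(corrected, round(r, 2))
```

The corrected totals come out as 125, 100, 650, 436 and 520, giving r = 0.67 as in the working.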
21. In order to find the correlation coefficient between two variables X and Y from 12 pairs of observations, the following calculations were made:
ΣX = 30, ΣY = 5, ΣX² = 670, ΣY² = 285, ΣXY = 334
On subsequent verification it was found that the pair (X=11, Y=4) was copied wrongly, the correct value being (X=10, Y=14). Find the correct value of the correlation coefficient. (Ans. 0.78)
22. Karl Pearson's coefficient of correlation between two variables X and Y is 0.28, their covariance is +7.6. If the variance of X is 9, find the standard deviation of Y series.
23. Calculate the coefficient of correlation between X and Y series from the following data:
Σ(X - X̄)² = 136, Σ(Y - Ȳ)² = 138 and Σ(X - X̄)(Y - Ȳ) = 122
(Ans. 0.89)
24. From the following data, compute the coefficient of correlation between X and Y.
                                          X series   Y series
No. of items                                 15         15
Arithmetic mean                              25         18
Sum of squares of deviations from mean      136        138
Sum of the products of deviations of X and Y series from their respective means = 122. (Ans. 0.89)
25. Calculate correlation coefficient from the following results:
N = 10, ΣX = 140, ΣY = 150, Σ(X - 14)² = 180, Σ(Y - 15)² = 215, Σ(X - 14)(Y - 15) = 60
(Ans. 0.305)
26. Calculate correlation coefficient from the following results:
n = 10, ΣX = 100, ΣY = 150, Σ(X - 10)² = 180, Σ(Y - 15)² = 215, Σ(X - 10)(Y - 15) = 160
(Ans. 0.81)
27. Given: r = 0.8, Σxy = 60, σy = 2.5, ΣY = 150, Σx² = 90. Find the no. of items, where x and y are deviations from their respective arithmetic means. (Ans. 10)
28. From the following data find the number of items.
r = 0.5, Σxy = 120, σy = 8, Σx² = 90,
where x and y are deviations from their respective means. (Ans. 10)
29. The coefficient of correlation between two variates X and Y is 0.8 and their covariance is 20. If the variance of X series is 16, find the standard deviation of Y series. (Ans. 6.25)
30. Interpret the values r = +1, r = -1, r = 0.
31. Draw scatter diagrams for r = -1, r = +1, r = 0.3, r = 0.8.
32. Coefficient of correlation between two variables X and Y is 0.48. Their covariance is 36. If the variance of X is 16, find the S.D. of Y series.
(Ans. 18.75)
33. Find the value of coefficient of correlation, if X xy=450, n = 50, 4.5 and
o,= a,=3.5. (Ans. 0.75)
34. The following table gives age and marks of 100 students. Calculate
coefficient of correlation.
Marks Age (in years)
18 19 20 21
10-20 2 2
20-30 A 6 4
30-40 10 11
40-50 A 6 8 (Ans. 0.26)
50-60 2 4
60-70 2 1
35. Calculate the coefficient of correlation and
interpret it. (Ans. -0.08)
Sales
Advertising Expenditure
revenue 5-15 15-25 25-35 35-45
75-125 3 4 8
125-175 8 5 7
175-225 2 3 4
225-275 3 2
36. Following is the distribution of students according to their heights and weights.
Height (in inches) Weight X(in lbs.)
90-100 100-110 110-120 120-130
50-55
55-60 10
60-65 12 10
65-70 3 8 3
Find out the correlation coefficient between height and weight. (Ans. 0.078)
REGRESSION
(1) REGRESSION:
Regression is a measure of average relationship between two or more variables. One of these
explaining variable(s).
independent variable.
independent variables.
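1) $b_{YX} = \dfrac{\sum xy}{\sum x^2}$, $b_{XY} = \dfrac{\sum xy}{\sum y^2}$, where $x = (X - \bar{X})$ and $y = (Y - \bar{Y})$
   To be applied when: i) X-series & Y-series are given, ii) $\bar{X}$ & $\bar{Y}$ both are integers.
2) $b_{YX} = r\,\dfrac{\sigma_Y}{\sigma_X}$, $b_{XY} = r\,\dfrac{\sigma_X}{\sigma_Y}$
   To be applied when r, $\sigma_Y$ & $\sigma_X$ are directly given in the problem.
3) $b_{YX} = \dfrac{N\sum f d_x d_y - \sum f d_x \sum f d_y}{N\sum f d_x^2 - (\sum f d_x)^2} \times \dfrac{i_y}{i_x}$, $b_{XY} = \dfrac{N\sum f d_x d_y - \sum f d_x \sum f d_y}{N\sum f d_y^2 - (\sum f d_y)^2} \times \dfrac{i_x}{i_y}$, where $d_x = \dfrac{X - a}{i_x}$ and $d_y = \dfrac{Y - b}{i_y}$
   To be applied when a two-way frequency table is given in the problem.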
3. In case of perfect correlation, positive or negative, i.e. for r = ±1, the two regression lines coincide and constitute only one line.
[Figure: perfect correlation; the regression lines X on Y and Y on X coincide, shown for r = +1 and r = -1.]
4. The smaller the angle between the two regression lines, the greater is the degree of correlation between the variables.
[Figure: regression lines X on Y and Y on X drawn at different angles, including the case r = 0.]
(6) PROPERTIES OF REGRESSION COEFFICIENT:
1. Geometric mean of the two regression coefficients gives the coefficient of correlation, i.e. $r = \pm\sqrt{b_{YX} \cdot b_{XY}}$
negative.
4. $b_{YX} = r \cdot \sigma_Y / \sigma_X$ and $b_{XY} = r \cdot \sigma_X / \sigma_Y$; since $\sigma_X$ and $\sigma_Y$ are always positive, the regression coefficients and the correlation coefficient carry the same sign.
5. If one regression coefficient is greater than unity, i.e. 1, then the other regression coefficient is always less than 1.
6. Regression coefficients are independent of change of origin but dependent on change of scale.
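Properties 1, 4 and 6 can be checked numerically. The Python sketch below uses the data of Practical Example 37 from these notes; the helper name `regression_coefficients`, the shift by (4, 5) and the scale factor 10 are illustrative assumptions.

```python
import math


def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy, r) computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_x2 = sum((x - mean_x) ** 2 for x in xs)
    sum_y2 = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sum_x2, sum_xy / sum_y2, sum_xy / math.sqrt(sum_x2 * sum_y2)


X = [1, 2, 3, 4, 5, 6, 7]
Y = [2, 4, 7, 6, 5, 6, 5]
b_yx, b_xy, r = regression_coefficients(X, Y)

# Properties 1 and 4: r is the geometric mean of the two coefficients,
# carrying their common sign.
geometric_mean = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

# Property 6, change of origin: subtracting constants leaves b_yx and b_xy unchanged.
b_yx_origin, b_xy_origin, _ = regression_coefficients(
    [x - 4 for x in X], [y - 5 for y in Y])

# Property 6, change of scale: multiplying X by 10 divides b_yx by 10
# and multiplies b_xy by 10, while r is unchanged.
b_yx_scale, b_xy_scale, r_scale = regression_coefficients([10 * x for x in X], Y)

print(round(r, 4), round(geometric_mean, 4))
print(round(b_yx, 4), round(b_yx_origin, 4), round(b_yx_scale, 4))
```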
(8) COEFFICIENT OF DETERMINATION:
1. The coefficient of determination is used to decide how far the assumption of linear correlation between the variables is valid for determining the regression line. It is denoted by R². The value of R² lies between 0 and 1. If the value of R² is close to 1, the assumption of linearity is valid, and if it is close to 0, the assumption cannot be regarded as valid.
2. R² = (Correlation coefficient between Y and Ŷ)²
= (Correlation coefficient between Y and a + bX)²
= (Correlation coefficient between Y and X)² (∵ r is independent of change of origin & scale)
= Explained variance / Total variance
= r² = bxy × byx
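The chain of equalities for R² can be verified on a small data set. The Python sketch below uses the data of Practical Example 37 as an illustration; the variable names are assumptions made for this sketch.

```python
import math

# Data of Practical Example 37 in these notes.
X = [1, 2, 3, 4, 5, 6, 7]
Y = [2, 4, 7, 6, 5, 6, 5]

n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n
sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
sum_x2 = sum((x - mean_x) ** 2 for x in X)
sum_y2 = sum((y - mean_y) ** 2 for y in Y)

b_yx = sum_xy / sum_x2
b_xy = sum_xy / sum_y2
r = sum_xy / math.sqrt(sum_x2 * sum_y2)

# Fitted values from the regression line of Y on X: Y-hat = a + b_yx * X.
a = mean_y - b_yx * mean_x
fitted = [a + b_yx * x for x in X]

total_variance = sum_y2 / n
explained_variance = sum((f - mean_y) ** 2 for f in fitted) / n
r_squared = explained_variance / total_variance

print(round(r_squared, 4), round(r ** 2, 4), round(b_yx * b_xy, 4))
```

Here only about 27% of the variation in Y is explained by X, so the linear assumption is weak for this data set.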
PRACTICAL EXAMPLES
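37. From the following data, find the two regression equations.
X 1 2 3 4 5 6 7
Y 2 4 7 6 5 6 5
Sol.
   X     Y    x = X - 4   y = Y - 5    x²    y²    xy
   1     2       -3          -3         9     9     9
   2     4       -2          -1         4     1     2
   3     7       -1           2         1     4    -2
   4     6        0           1         0     1     0
   5     5        1           0         1     0     0
   6     6        2           1         4     1     2
   7     5        3           0         9     0     0
  28    35        0           0        28    16    11
$\bar{X} = 4$, $\bar{Y} = 5$
Working:
$b_{YX} = \dfrac{\sum xy}{\sum x^2} = \dfrac{11}{28} = 0.393$
$b_{XY} = \dfrac{\sum xy}{\sum y^2} = \dfrac{11}{16} = 0.688$
(i) Regression Equation Y on X
$Y - \bar{Y} = b_{YX}(X - \bar{X})$
Y - 5 = 0.393 (X - 4)
Y - 5 = 0.393X - 1.57
Y = 0.393X - 1.57 + 5
Y = 3.43 + 0.393X
(ii) Regression Equation X on Y
$X - \bar{X} = b_{XY}(Y - \bar{Y})$
X - 4 = 0.688 (Y - 5)
X - 4 = 0.688Y - 3.44
X = 0.688Y - 3.44 + 4
X = 0.56 + 0.688Y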
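The two regression equations of Example 37 can be reproduced with the Python sketch below. The function name `regression_lines` and its return layout are illustrative choices, not part of the notes.

```python
def regression_lines(xs, ys):
    """Return ((a_yx, b_yx), (a_xy, b_xy)) for Y = a_yx + b_yx*X and X = a_xy + b_xy*Y."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_x2 = sum((x - mean_x) ** 2 for x in xs)
    sum_y2 = sum((y - mean_y) ** 2 for y in ys)
    b_yx = sum_xy / sum_x2
    b_xy = sum_xy / sum_y2
    # Both lines pass through the point of means (mean_x, mean_y).
    return (mean_y - b_yx * mean_x, b_yx), (mean_x - b_xy * mean_y, b_xy)


X = [1, 2, 3, 4, 5, 6, 7]
Y = [2, 4, 7, 6, 5, 6, 5]
(a_yx, b_yx), (a_xy, b_xy) = regression_lines(X, Y)
print(f"Y = {a_yx:.2f} + {b_yx:.3f} X")
print(f"X = {a_xy:.2f} + {b_xy:.3f} Y")
```

The coefficients are 11/28 and 11/16, which round to the 0.393 and 0.688 used in the worked solution.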
38. From the following data obtain the two regression equations.
Sales 91 97 108 121 67 124 51 73 111 57
Purchases 71 75 69 97 70 91 39 61 80 47
(Ans. Y = 15.1 + 0.61X; X = -5.2 + 1.36Y)
39. From the following data obtain the two regression equations:
X 6 2 10 4 8
Y 9 11 5 8 7
(Ans. Y = 11.9 - 0.65X; X = 16.4 - 1.3Y)
40. The following data relate to the scores obtained by 9 salesmen of a company in an intelligence test and their weekly sales in thousand rupees.
Salesman A B C D E F G H I
Intelligence Test Scores 50 60 50 60 80 50 80 40 70
Weekly Sales 30 60 40 50 60 30 70 50 60
a. Obtain the regression equation of sales on intelligence test scores of the salesmen.
b. If the intelligence test score of a salesman is 65, what would be his expected weekly sales?
(Ans. Y = 5 + 0.75X; 53.75)
41. Using the following data obtain the two regression equations.
X 14 19 24 21 26 22 15 20 19
Y 31 36 48 37 50 45 33 41 39
(Ans. Y = 7.8 + 1.61X; X = -2.4 + 0.56Y)
42. Given the following information:
Year 1999 2000 2001 2002 2003 2004
Research expense (in '000 Rs.) 5 11 4 5 3 2
Annual Profit (in '000 Rs.) 31 40 30 34 25 20
i. Develop the estimating equation that best describes the given data.
ii. Estimate the annual profit when the research expense is Rs. 7000.
iii. How much variation in the annual profits is explained by the variation in the research expenditure? (Ans. (i) Y = 20.6 + 1.88X; (ii) 33.76 (Rs. '000); (iii) 81.78%)
43. From the following data of the age of husband and the age of wife, form two regression lines. Calculate the husband's age when the wife's age is 16. Calculate the wife's age when the husband's age is 25.
Husband's age 36 23 27 28 28 29 30 31 33 35
Wife's age 29 18 20 22 27 21 29 27 29 28
(Ans. Y = -1.73 + 0.891X, X = 8.2 + 0.872Y; Esti. Y = 21; Esti. X = 22)
44. The following table shows the ages (X) and blood pressure (Y) of 8 persons.
X 52 63 45 36 72 65 47 25
Y 62 53 51 25 79 43 60 33
Obtain the regression equation of Y on X and find the expected blood pressure of a person who is 49 years old.
Sol.
   X     Y    dx = X - 50   dy = Y - 50    dx²    dx.dy
  52    62        2            12            4      24
  63    53       13             3          169      39
  45    51       -5             1           25      -5
  36    25      -14           -25          196     350
  72    79       22            29          484     638
  65    43       15            -7          225    -105
  47    60       -3            10            9     -30
  25    33      -25           -17          625     425
 405   406        5             6         1737    1336
$\bar{X} = 50.63$, $\bar{Y} = 50.75$
Working:
$b_{YX} = \dfrac{n\sum d_x d_y - \sum d_x \sum d_y}{n\sum d_x^2 - (\sum d_x)^2} = \dfrac{8(1336) - (5)(6)}{8(1737) - (5)^2} = 0.77$
Required:
(i) Regression Equation Y on X
$Y - \bar{Y} = b_{YX}(X - \bar{X})$
Y - 50.75 = 0.77 (X - 50.63)
Y - 50.75 = 0.77X - 38.985
Y = 0.77X - 38.985 + 50.75
Y = 11.76 + 0.77X
(ii) Estimation of Y when X = 49
Taking Y on X:
Y = 11.76 + 0.77(49)
= 11.76 + 37.73
= 49.49
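The working of Example 44 can be reproduced with the Python sketch below, which takes deviations from the assumed mean 50 exactly as in the table. The variable names are illustrative.

```python
# Ages (X) and blood pressure (Y) of the 8 persons in Example 44.
X = [52, 63, 45, 36, 72, 65, 47, 25]
Y = [62, 53, 51, 25, 79, 43, 60, 33]

ASSUMED_MEAN = 50  # same assumed mean for both series, as in the notes
dx = [x - ASSUMED_MEAN for x in X]
dy = [y - ASSUMED_MEAN for y in Y]
n = len(X)

sum_dx, sum_dy = sum(dx), sum(dy)
sum_dx2 = sum(d * d for d in dx)
sum_dxdy = sum(p * q for p, q in zip(dx, dy))

# Regression coefficient of Y on X from assumed-mean deviations.
b_yx = (n * sum_dxdy - sum_dx * sum_dy) / (n * sum_dx2 - sum_dx ** 2)

mean_x, mean_y = sum(X) / n, sum(Y) / n
intercept = mean_y - b_yx * mean_x
estimate_at_49 = intercept + b_yx * 49

print(sum_dx, sum_dy, sum_dx2, sum_dxdy)
print(round(b_yx, 2), round(estimate_at_49, 2))
```

Carrying the unrounded coefficient through gives an estimate of about 49.5; the 49.49 in the worked solution comes from rounding the coefficient to 0.77 and the means to two decimals first.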
45. From the following data find the regression equations, and estimate the likely value of Y when X = 100.
X 72 98 76 81 56 76 92 88 49
Y 124 131 117 132 96 120 136 97 85
47. To know what relationship exists between unemployment and suicide attempts, a sociologist surveyed twelve cities and obtained the following data.
City 1 2 3 4 5 6 7 8 9 10 11 12
Unemployment rate percent 7.3 6.4 6.2 5.5 6.4 4.7 5.8 7.9 6.7 9.6 10.3 7.2
7.2
0.9274)
48. Find the regression equations from the following data:
ΣX = 60, ΣY = 40, ΣXY = 1150, ΣX² = 4160, ΣY² = 1720, N = 10
Sol.
Given: ΣX = 60, ΣY = 40, ΣXY = 1150, ΣX² = 4160, ΣY² = 1720, N = 10
Working:
(i) $\bar{X} = \dfrac{\sum X}{N} = \dfrac{60}{10} = 6$, $\bar{Y} = \dfrac{\sum Y}{N} = \dfrac{40}{10} = 4$
(ii) $b_{YX} = \dfrac{N\sum XY - \sum X \sum Y}{N\sum X^2 - (\sum X)^2} = \dfrac{10(1150) - (60)(40)}{10(4160) - (60)^2} = 0.239$
$b_{XY} = \dfrac{N\sum XY - \sum X \sum Y}{N\sum Y^2 - (\sum Y)^2} = \dfrac{10(1150) - (60)(40)}{10(1720) - (40)^2} = 0.583$
Required:
(i) Regression Equation Y on X
$Y - \bar{Y} = b_{YX}(X - \bar{X})$
Y - 4 = 0.239 (X - 6)
Y - 4 = 0.239X - 1.43
Y = 0.239X - 1.43 + 4
Y = 2.57 + 0.239X
(ii) Regression Equation X on Y
$X - \bar{X} = b_{XY}(Y - \bar{Y})$
X - 6 = 0.583 (Y - 4)
X - 6 = 0.583Y - 2.33
X = 0.583Y - 2.33 + 6
X = 3.67 + 0.583Y
49. Find the regression equations of X and Y from the following data:
ΣX = 24, ΣY = 44, ΣXY = 306, ΣX² = 164, ΣY² = 574, N = 4
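When only the totals are given, as in Example 48, the regression coefficients follow directly from the sums. The Python sketch below is one way to organise that calculation; the function name `regression_from_sums` is hypothetical, and the figures passed in are those of Example 48.

```python
def regression_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Return ((a_yx, b_yx), (a_xy, b_xy)) computed from raw totals only."""
    numerator = n * sum_xy - sum_x * sum_y
    b_yx = numerator / (n * sum_x2 - sum_x ** 2)
    b_xy = numerator / (n * sum_y2 - sum_y ** 2)
    mean_x, mean_y = sum_x / n, sum_y / n
    # Each line passes through the point of means.
    return (mean_y - b_yx * mean_x, b_yx), (mean_x - b_xy * mean_y, b_xy)


# Totals of Example 48.
(a_yx, b_yx), (a_xy, b_xy) = regression_from_sums(
    n=10, sum_x=60, sum_y=40, sum_xy=1150, sum_x2=4160, sum_y2=1720)

print(f"Y = {a_yx:.2f} + {b_yx:.3f} X")
print(f"X = {a_xy:.2f} + {b_xy:.3f} Y")
```

The coefficients round to 0.239 and 0.583 as in the worked solution. The intercept of Y on X comes out near 2.56 when the unrounded coefficient is used, against 2.57 in the notes, which round at each step.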
56. The information given below relates to the advertisement and sales of the company.
                        Advertisement Expenditure (Rs. Lakhs)   Sales (Rs. Lakhs)
Arithmetic Mean                        20                             100
Standard Deviation                      3                              12
Correlation coefficient between X and Y = 0.8
i. Find the two regression equations
Find
1.Mean values of X and Y. 2. S.D. ofY
3.Coefficient of correlation between X and Y. (Ans. (1) 6,8; (2) 13.39; (3) 0.53)
statistics is 4/9 of the variance of marks in commerce. Find the mean marks in Statistics and coefficient of correlation between marks in the two subjects.
(Ans. 66; 09)
65. For 50 students of a class the regression equation of marks in Statistics (X) on marks in Accountancy (Y) is 3Y-5X+180=0. If the mean marks of Accountancy is 44 and variance of marks
69. Following table gives ages of husbands and wives for 50 newly married couples.
Age of husband
20-25 25-30 30-35
Age of 16-20 9 14 0
Wife 20-24 6 11 3
24-28 0