
Noida Institute of Engineering and Technology, Greater Noida

Descriptive Measures

Unit: 1

Subject Name and Subject code:


Statistics and Probability (AAS0303)
Dr. JYOTI SHARMA
B Tech 3rd Sem: AI, Data Science AKTU
Mathematics

Faculty Name Aakansha Vyas Unit Number 1


1
9/22/2022
Faculty Introduction



Evaluation Scheme



Syllabus



Branch Wise Application

• Probability and statistics form the basis of data science. Probability theory is very helpful for making predictions. ... With the help of statistical methods, we make estimates for further analysis. Statistical methods therefore depend largely on the theory of probability.

Course Objective

• The objective of this course is to familiarize engineers with the concepts of statistical techniques, probability distributions, hypothesis testing, ANOVA and numerical aptitude. It aims to equip students with standard concepts and tools at B. Tech level for dealing with the advanced mathematics and applications that are essential for their disciplines. The student will be able to understand:
• The concept of Statistical techniques.
• The concept of probability distribution.
• The concept of hypothesis testing.
• The concept of ANOVA.
• The concept of numerical aptitude.

Course Outcome

• CO 1: Understand the concept of moments, skewness, kurtosis, correlation, curve fitting and regression analysis.
• CO 2: Understand the concept of Probability and Random variables.
• CO 3: Remember the concept of probability to evaluate probability distributions.
• CO 4: Apply the concept of hypothesis testing and estimation of parameters.
• CO 5: Solve the problems of Time & Work, Pipe & Cistern, Time, Speed & Distance, Boat & Stream, Sitting Arrangement, Clock & Calendar.



Program Outcome



CO-PO Mapping(CO1)

Sr. Course PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
No Outcome

1 CO 1 3 3 3 3 1 1 2

2 CO 2 3 3 3 2 1 1 2 2

3 CO 3 3 2 3 2 1 1 1

4 CO 4 3 2 2 3 1 1 1

5 CO.5 3 3 2 2 1 1 1 2 2

*1= Low *2= Medium *3= High



Prerequisite and Recap (CO1)

▪ Knowledge of Maths 1 B.Tech.


▪ Knowledge of Maths 2 B.Tech.
▪ Knowledge of Permutation and Combination.



Brief Introduction about subject (CO1)

Statistics is concerned with making inferences about the way the world is, based upon things we observe happening. ...
Probability is the language of uncertainty, and so to understand statistics, we must understand uncertainty, and hence understand probability.



Unit Content (CO1)

• Introduction
• Measures of central tendency – mean, median, mode
• Measures of dispersion – mean deviation, standard deviation,
quartile deviation, variance
• Moment
• Skewness and kurtosis
• Least squares principles of curve fitting, Covariance
• Correlation and Regression analysis
• Correlation coefficient: Karl Pearson coefficient, rank correlation
coefficient
• Uni-variate and multivariate linear regression
• Application of regression analysis, Logistic Regression
• Time series analysis- Trend analysis (Least square method).
Unit Objective(CO 1)
• The objective of this unit is to familiarize engineers with the concepts of statistical techniques.
• It aims to equip students with standard concepts and tools at B. Tech level for dealing with the advanced mathematics and applications that are essential for their disciplines.

Topic objective (CO1)

Measures of central tendency


• To present a brief picture of data: a measure of central tendency gives a brief description of the main feature of the entire data set.
• Essential for comparison: it reduces the data to a single value that can be used in comparative studies.
• Helps in decision making: many companies use measures of central tendency to plan and develop their business.
• Formulation of policies: many governments rely on such measures when forming policies.



Measures of Central Tendency (CO1)

❑ Measures of Central Tendency or Averages:


Definition : According to Prof. Bowley: Averages are “statistical
constants which enable us to comprehend in a single effort the
significance of the whole.”
Types of Measures of Central Tendency: There are five types of
measures of central tendency
➢ Arithmetic Mean or Simple Mean
➢ Median
➢ Mode
➢ Geometric Mean
➢ Harmonic Mean



Arithmetic Mean (CO1)
➢ Arithmetic Mean
Definition
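Arithmetic mean of a set of observations is their sum divided by the number of observations, e.g., the arithmetic mean $\bar{x}$ of n observations $x_1, x_2, \ldots, x_n$ is given by:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

❖ In case of the frequency distribution $x_i \mid f_i$, $i = 1, 2, \ldots, n$, where $f_i$ is the frequency of the variable $x_i$,

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i, \qquad N = \sum_{i=1}^{n} f_i$$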



Arithmetic Mean(CO1)
In case of grouped or continuous frequency distribution, x is taken as
the mid-value of the corresponding class.
Example: Find the arithmetic mean of the following frequency
distribution:

Solution:
Computation of mean
$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n}$$



Arithmetic Mean(CO1)

By using the formula with $\sum_{i=1}^{n} f_i = N = 73$ and $\sum_{i=1}^{n} f_i x_i = 299$, we get $\bar{x} = \frac{299}{73} \approx 4.10$.



Daily Quiz (CO1)

Example: Calculate the mean for the following frequency distribution:

Class interval   0-8   8-16   16-24   24-32   32-40   40-48
Frequency        8     7      16      24      15      7

Solution: Arithmetic mean = 1956/77 ≈ 25.40
Example: The average salary of male employees in a farm was Rs. 5,200 and that of females was Rs. 4,200. The mean salary of all the employees was Rs. 5,000. Find the percentage of male and female employees.
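
The grouped-mean rule (x taken as the mid-value of each class) can be checked with a minimal Python sketch. The function name `grouped_mean` is an illustrative assumption, not part of the course material; the data are the class intervals and frequencies of the first quiz example.

```python
def grouped_mean(classes, freqs):
    """Mean of a grouped distribution, taking x as the mid-value of each class."""
    mids = [(lo + hi) / 2 for lo, hi in classes]
    total = sum(freqs)  # N = sum of the frequencies
    return sum(f * x for f, x in zip(freqs, mids)) / total

# Class intervals and frequencies from the quiz example
classes = [(0, 8), (8, 16), (16, 24), (24, 32), (32, 40), (40, 48)]
freqs = [8, 7, 16, 24, 15, 7]

mean = grouped_mean(classes, freqs)
print(round(mean, 2))  # 25.4
```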



Median(CO1)
➢ Median:
Definition: Median of a distribution is the value of the variable which
divides it into two equal parts.
It is the value such that the number of observations above it is equal
to the number of observations below it. The median is thus a
positional average.
❖ Ungrouped Data:
• If the number of observations is odd then median is the middle
value after the values have been arranged in ascending or descending
order of magnitude.
• In case of even number of observations, there are two middle
terms and median is obtained by taking the arithmetic mean of
middle terms.



Median(CO1)

Example
1. Median of Values 25, 20, 15, 35, 18. Median: 20
2. Median of Values 8, 20, 50, 25, 15, 30. Median: 22.5

❖ Discrete Frequency Distribution

In this case the median is obtained by considering the cumulative frequencies. The steps involved are:
i. Find $\frac{N}{2}$, where $N = \sum_{i=1}^{n} f_i$.
ii. See the cumulative frequency (c.f.) just greater than $\frac{N}{2}$.
iii. The corresponding value of x is the median.



Median(CO1)

Example: Obtain the median for the following frequency distribution:

Solution:
i. Find $\frac{N}{2} = \frac{8+10+11+16+20+25+15+9+6}{2} = \frac{120}{2} = 60$, where $N = \sum_{i=1}^{n} f_i$.
ii. See the cumulative frequency (c.f.) just greater than $\frac{N}{2}$.
iii. The corresponding value of x is the median.

Here N = 120. The cumulative frequency just greater than $\frac{N}{2} = 60$ is 65, and the value of x corresponding to 65 is 5. Therefore, the median is 5.



Median(CO1)

❖ Continuous Frequency Distribution

In this case, the class corresponding to the c.f. just greater than $\frac{N}{2}$ is called the median class, and the value of the median is obtained by the formula:

$$\text{Median} = l + \frac{h}{f}\left(\frac{N}{2} - c\right)$$

where
• l is the lower limit of the median class,
• f is the frequency of the median class,
• h is the magnitude of the median class,
• c is the c.f. of the class preceding the median class,
• $N = \sum_{i=1}^{n} f_i$



Daily Quiz(CO1)

Example: Find the median wages of the following distribution.

Wages        No. of workers
2000-3000    3
3000-4000    5
4000-5000    20
5000-6000    10
6000-7000    5

Solution: The median wage is Rs. 4,675.
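
The median-class formula can be traced on the wage distribution with a short Python sketch. The helper name `grouped_median` is hypothetical; it walks the cumulative frequencies, stops at the first class whose c.f. exceeds N/2, and applies Median = l + (h/f)(N/2 − c).

```python
def grouped_median(classes, freqs):
    """Median of a continuous frequency distribution via the median-class formula."""
    n_half = sum(freqs) / 2
    cum = 0  # c.f. of the classes preceding the current one
    for (lo, hi), f in zip(classes, freqs):
        if cum + f > n_half:  # first class whose c.f. is just greater than N/2
            return lo + (hi - lo) / f * (n_half - cum)
        cum += f
    raise ValueError("distribution has no observations")

classes = [(2000, 3000), (3000, 4000), (4000, 5000), (5000, 6000), (6000, 7000)]
workers = [3, 5, 20, 10, 5]

median_wage = grouped_median(classes, workers)
print(median_wage)  # 4675.0
```

Here N = 43, so N/2 = 21.5; the median class is 4000-5000 with c = 8, f = 20 and h = 1000.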



Mode(CO1)

➢ Mode:
• Mode is the value which occurs most frequently in a set of
observations and around which the other items of the set cluster
densely.
• It is the point of maximum frequency or the point of greatest
density.
• In other words the mode or modal value of the distribution is that
value of the variate for which frequency is maximum.
Calculation of Mode
❖ In case of discrete distribution: Mode is the value of x corresponding to the maximum frequency. However, in any one (or more) of the following cases, the value of the mode is determined by the method of grouping:
i. If the maximum frequency is repeated.
ii. If the maximum frequency occurs at the very beginning or at the end of the distribution.
iii. If there are irregularities in the distribution.
❖ In case of continuous frequency distribution: mode is given by the formula

$$\text{Mode} = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times h$$

where $l$ is the lower limit, $h$ the width and $f_m$ the frequency of the modal class, and $f_1$ and $f_2$ are the frequencies of the classes preceding and succeeding the modal class respectively. While applying the above formula it is necessary to see that the class intervals are of the same size.



Mode(CO1)

❖ For a symmetrical distribution, mean, median and mode coincide.

When the mode is ill-defined and the method of grouping also fails, its value can be ascertained by the formula
Mode = 3 Median − 2 Mean
This measure is called the empirical mode.
Q. Calculate the mode from the following frequency distribution.

Size (x)        4   5   6   7   8    9    10   11   12   13
Frequency (f)   2   5   8   9   12   14   14   15   11   13

Solution: Method of Grouping:



Mode(CO1)

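Column 1 holds the frequencies. Columns 2 and 3 add them in twos (starting from the first and the second frequency); columns 4, 5 and 6 add them in threes (starting from the first, second and third frequency). Each total is written against the first size in its group.

Size (x)   Col 1   Col 2   Col 3   Col 4   Col 5   Col 6
4          2       7               15
5          5               13              22
6          8       17                              29
7          9               21      35
8          12      26                      40
9          14              28                      43
10         14      29              40
11         15              26              39
12         11      24
13         13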



Mode(CO1)

Column   Maximum total   Size of item having max. frequency
1        15              11
2        29              10, 11
3        28              9, 10
4        40              10, 11, 12
5        40              8, 9, 10
6        43              9, 10, 11

Since the item 10 occurs the maximum number of times, i.e. 5 times, the mode is 10.
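
The grouping and analysis tables can be reproduced with a minimal Python sketch. The function name `grouping_mode` and the (width, start) encoding of the six columns are illustrative assumptions; the data are the sizes and frequencies of the question above.

```python
from collections import Counter

def grouping_mode(sizes, freqs):
    """Method of grouping: count how often each size falls in a column's maximum group."""
    n = len(freqs)
    votes = Counter()
    # (group width, starting index) for columns 1 to 6
    for width, start in [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]:
        groups = [
            (sum(freqs[i:i + width]), sizes[i:i + width])
            for i in range(start, n - width + 1, width)
        ]
        best = max(total for total, _ in groups)
        for total, members in groups:
            if total == best:
                votes.update(members)  # every size in a maximum group gets one vote
    return votes.most_common(1)[0][0], votes

sizes = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
freqs = [2, 5, 8, 9, 12, 14, 14, 15, 11, 13]

mode, votes = grouping_mode(sizes, freqs)
print(mode, votes[mode])  # 10 5
```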



Mode(CO1)

Q. Find the mode of the following:

Marks               0-5   6-10   11-15   16-20   21-25   26-30   31-35   36-40   41-45
No. of candidates   7     10     16      32      24      18      10      5       1

Solution: Here the greatest frequency 32 lies in the class 16-20. Hence the modal class is 16-20. But the actual limits of this class are 15.5-20.5.
$l = 15.5,\ f_m = 32,\ f_1 = 16,\ f_2 = 24,\ h = 5$



Mode(CO1)

$$\text{Mode} = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times h$$
$$= 15.5 + \frac{32 - 16}{64 - 16 - 24} \times 5$$
$$= 15.5 + \frac{16}{24} \times 5$$
$$= 15.5 + \frac{10}{3}$$
$$= 18.83 \text{ marks}$$
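
As a quick check of the arithmetic, the modal-class formula can be written as a small Python function. The name `grouped_mode` and its parameter names are illustrative assumptions that mirror the symbols l, h, f_m, f_1 and f_2.

```python
def grouped_mode(lower, width, f_m, f_1, f_2):
    """Mode = l + (f_m - f_1) / (2 f_m - f_1 - f_2) * h for equal class widths."""
    return lower + (f_m - f_1) / (2 * f_m - f_1 - f_2) * width

# Modal class 15.5-20.5 from the worked example
mode = grouped_mode(lower=15.5, width=5, f_m=32, f_1=16, f_2=24)
print(round(mode, 2))  # 18.83
```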



Daily Quiz(CO1)

Q.1 Calculate the mean, median and mode of the following data-

Wages (in Rs) 0-20 20-40 40-60 60-80 80-100 100-120 120-140

No. of Workers 6 8 10 12 6 5 3



Recap(CO1)

✓ Measures of central tendency


✓ Mean
✓ Mode
✓ Median



OBJECTIVES OF MEASURING DISPERSION
(CO-1)
❖ To determine the reliability of an average
❖ To compare the variability of two or more series
❖ For facilitating the use of other statistical measures
❖ Basis of Statistical Quality Control

Measures of Dispersion(CO1)

Definition

• Measures of dispersion are descriptive statistics that describe how similar a set of scores are to each other
– The more similar the scores are to each other, the lower the
measure of dispersion will be
– The less similar the scores are to each other, the higher the
measure of dispersion will be
– In general, the more spread out a distribution is, the larger the
measure of dispersion will be

Measures of Dispersion(CO1)

• Which of the distributions of scores has the larger dispersion?
[Figure: two bar charts of scores, one above the other; horizontal axis 1 to 10, vertical axis 0 to 125]
• The upper distribution has more dispersion because the scores are more spread out
• That is, they are less similar to each other

MEASURES OF DISPERSION(CO1)

Absolute: Expressed in the same units in which the data is expressed. Ex: Rupees, Kgs, Ltr, Km etc.
Relative: In the form of a ratio or percentage, so it is independent of units. It is also called Coefficient of Dispersion.

Measures of Dispersion(CO1)

• There are some measures of dispersion


– Range
– Inter quartile range
– Mean deviation
– Standard deviation
– Variance
– Coefficient of Variation

RANGE (R) (CO1)

RANGE:
• It is the simplest measure of dispersion
• It is defined as the difference between the largest and smallest values in the series
R = L − S
R = Range
L = Largest Value
S = Smallest Value
$$\text{Coefficient of Range} = \frac{L - S}{L + S}$$

RANGE(CO1)

❖ Individual Series:
Q1: Find the range & Coefficient of Range for the following data: 20, 35, 25, 30, 15
Solution:
L = Largest Value = 35
S = Smallest Value = 15
Range R = L − S = 35 − 15 = 20
$$\text{Coefficient of Range} = \frac{L - S}{L + S} = \frac{35 - 15}{35 + 15} = \frac{20}{50} = 0.4$$
❖ Discrete Frequency Distribution:
Q2: Find the range & Coefficient of

Solution: L = Largest Value = 70, S = Smallest Value = 10
Range R = L − S = 70 − 10 = 60
$$\text{Coefficient of Range} = \frac{L - S}{L + S} = \frac{70 - 10}{70 + 10} = \frac{60}{80} = 0.75$$
❖ Continuous Frequency Distribution
Q3: Find the range & Coefficient of Range:

Size   5-10   10-15   15-20   20-25   25-30
F      4      9       15      30      40

Solution: L = Upper limit of the largest class = 30
S = Lower limit of the smallest class = 5
Range R = L − S = 30 − 5 = 25
$$\text{Coefficient of Range} = \frac{L - S}{L + S} = \frac{30 - 5}{30 + 5} = \frac{25}{35} = \frac{5}{7} = 0.714$$

Daily Quiz(CO1)

Q1: Find the range & Coefficient of Range for the


following data: 25, 38, 45, 30, 15
Ans:30,0.5
Q2: Find the range & Coefficient of Range:

Q3: Find the range & Coefficient of Range:

INTERQUARTILE RANGE & QUARTILE DEVIATION (CO1)

• Interquartile Range is the difference between the upper quartile (Q3) and the lower quartile (Q1)
• It covers dispersion of the middle 50% of the items of the series
• Symbolically, Interquartile Range = $Q_3 - Q_1$
• Symbolically, Quartile Deviation = $\frac{Q_3 - Q_1}{2}$
• Quartile Deviation is half of the interquartile range. It is also called Semi Interquartile Range
• Coefficient of Quartile Deviation: It is the relative measure of quartile deviation.
• Coefficient of Q.D. = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$

IQR & QD(CO1)

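Example: Find interquartile range, quartile deviation and coefficient of quartile deviation: 28, 18, 20, 24, 27, 30, 15.
Solution:
Arranging data in ascending order
15, 18, 20, 24, 27, 28, 30

$Q_1$ = Size of $\left(\frac{n+1}{4}\right)$th item = Size of $\left(\frac{7+1}{4}\right)$th item = Size of 2nd item = 18
$Q_3$ = Size of $3\left(\frac{n+1}{4}\right)$th item = Size of $3\left(\frac{7+1}{4}\right)$th item = Size of 6th item = 28
Interquartile Range = $Q_3 - Q_1$ = 28 − 18 = 10
Quartile Deviation = $\frac{Q_3 - Q_1}{2} = \frac{28 - 18}{2} = 5$
Coefficient of Q.D. = $\frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{28 - 18}{28 + 18} = 0.217$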
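
The quartile positions used above can be sketched in Python. The helper `quartile` is hypothetical; when k(n+1)/4 is not a whole number it interpolates linearly between neighbouring items, which is an assumption on top of the slide (the worked example only needs whole positions).

```python
def quartile(values, k):
    """k-th quartile as the k(n+1)/4-th item of the sorted data."""
    data = sorted(values)
    pos = k * (len(data) + 1) / 4   # 1-based position of the quartile
    idx = int(pos)                  # whole part of the position
    frac = pos - idx                # fractional part, used for interpolation
    if idx < 1:
        return data[0]
    if idx >= len(data):
        return data[-1]
    return data[idx - 1] + frac * (data[idx] - data[idx - 1])

values = [28, 18, 20, 24, 27, 30, 15]
q1, q3 = quartile(values, 1), quartile(values, 3)
iqr = q3 - q1
qd = iqr / 2
coeff_qd = (q3 - q1) / (q3 + q1)
print(q1, q3, iqr, qd, round(coeff_qd, 3))  # 18.0 28.0 10.0 5.0 0.217
```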

IQR & QD(CO1)

Example:

X   10   20   30   40   50   60
F   2    8    20   35   42   20

Solution:

X    F    C.F.
10   2    2
20   8    10
30   20   30
40   35   65
50   42   107
60   20   127
N = 127

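IQR & QD(CO1)
Solution: $Q_1$ = Size of $\left(\frac{N+1}{4}\right)$th item = Size of $\left(\frac{127+1}{4}\right)$th item = Size of 32nd item = 40
$Q_3$ = Size of $3\left(\frac{N+1}{4}\right)$th item = Size of $3\left(\frac{127+1}{4}\right)$th item = Size of 96th item = 50
Interquartile Range = $Q_3 - Q_1$ = 50 − 40 = 10
Quartile Deviation = $\frac{Q_3 - Q_1}{2} = \frac{50 - 40}{2} = 5$
Coefficient of Q.D. = $\frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{50 - 40}{50 + 40} = 0.11$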

Daily Quiz(CO1)

Q1: Find quartile deviation and coefficient of quartile deviation:
4, 8, 10, 7, 15, 11, 18, 14, 12, 16
Ans: 3.75, 0.32

Q2:
X   0-10   10-20   20-30   30-40   40-50   60
F   2      8       20      35      42      20
Ans: 10, 5, 0.11

Q3:
Age       0-20   20-40   40-60   60-80   80-100
Persons   4      10      15      20      11
Ans: 14.33, 0.19

3. MEAN DEVIATION (M.D.) (CO1)

• It is also called Average Deviation
• It is defined as the arithmetic average of the deviations of the various items of a series, taken without sign, computed from a measure of central tendency such as the mean or median.
The formulas used to calculate mean deviation are:
M.D. from Mean: $M.D._{\bar{x}} = \frac{\sum |d_{\bar{x}}|}{n}$, where $d_{\bar{x}} = x - \bar{x}$
Coefficient of $M.D._{\bar{x}} = \frac{M.D._{\bar{x}}}{\bar{x}}$
M.D. from Median: $M.D._{M} = \frac{\sum |d_{m}|}{n}$, where $d_{m} = x - M$
Coefficient of $M.D._{M} = \frac{M.D._{M}}{M}$

MEAN DEVIATION(CO1)

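Q1: Calculate M.D. from Mean & Median & coefficient of Mean Deviation from the following data: 20, 22, 25, 38, 40, 50, 65, 70, 75
Solution: Mean $\bar{x} = \frac{\sum x}{n} = \frac{20+22+25+38+40+50+65+70+75}{9} = \frac{405}{9} = 45$
Median M = Size of $\left(\frac{9+1}{2}\right)$th term = Size of 5th term = 40
M.D. from Mean: $M.D._{\bar{x}} = \frac{\sum |d_{\bar{x}}|}{n} = \frac{160}{9} = 17.78$
Coefficient of $M.D._{\bar{x}} = \frac{M.D._{\bar{x}}}{\bar{x}} = \frac{17.78}{45} = 0.395$
M.D. from Median: $M.D._{M} = \frac{\sum |d_{m}|}{n} = \frac{155}{9} = 17.22$
Coefficient of $M.D._{M} = \frac{M.D._{M}}{M} = \frac{17.22}{40} = 0.43$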
MEAN DEVIATION(CO1)
Marks X   Deviation from mean 45      Deviation from median 40
          |d_x̄| = |X − 45|            |d_m| = |X − 40|
20        25                          20
22        23                          18
25        20                          15
38        7                           2
40        5                           0
50        5                           10
65        20                          25
70        25                          30
75        30                          35
N = 9, ΣX = 405, Σ|d_x̄| = 160, Σ|d_m| = 155
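
The two deviation columns can be reproduced with a minimal Python sketch. The function name `mean_deviation` is an assumption for illustration; `statistics.mean` and `statistics.median` come from the Python standard library.

```python
from statistics import mean, median

def mean_deviation(values, center):
    """Average of the absolute deviations of the values from a chosen center."""
    return sum(abs(x - center) for x in values) / len(values)

marks = [20, 22, 25, 38, 40, 50, 65, 70, 75]
m, md = mean(marks), median(marks)    # 45 and 40 for this series

md_mean = mean_deviation(marks, m)    # 160 / 9
md_median = mean_deviation(marks, md) # 155 / 9
print(round(md_mean, 2), round(md_median, 2))      # 17.78 17.22
print(round(md_mean / m, 3), round(md_median / md, 3))
```

The last line prints the two coefficients of mean deviation, i.e. each mean deviation divided by the average it was measured from.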
MEAN DEVIATION(CO1)

Example: Calculate M.D. from Mean & Median & coefficient of Mean Deviation from the following data:

Solution:

MEAN DEVIATION(CO1)
x    F    c.f.   |d_m| = |X − 40|   f|d_m|   Fx    |d_x̄| = |X − 41|   f|d_x̄|
20   8    8      20                 160      160   21                 168
30   12   20     10                 120      360   11                 132
40   20   40     0                  0        800   1                  20
50   10   50     10                 100      500   9                  90
60   6    56     20                 120      360   19                 114
70   4    60     30                 120      280   29                 116
N = 60, Σf|d_m| = 620, ΣFx = 2460, Σf|d_x̄| = 640

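MEAN DEVIATION(CO1)

M.D. from Median $= \frac{\sum f |d_{m}|}{N} = \frac{620}{60} = 10.33$
Coefficient of $M.D._{M} = \frac{M.D._{M}}{M} = \frac{10.33}{40} = 0.258$
Mean $\bar{x} = \frac{\sum f x}{N} = \frac{2460}{60} = 41$
M.D. from Mean $= \frac{\sum f |d_{\bar{x}}|}{N} = \frac{640}{60} = 10.67$
Coefficient of $M.D._{\bar{x}} = \frac{M.D._{\bar{x}}}{\bar{x}} = \frac{10.67}{41} = 0.26$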

MEAN DEVIATION(CO1)

Q1. Calculate M.D. from Mean & coefficient of Mean Deviation from the following data:

Marks             0-10   10-20   20-30   30-40   40-50
No. of students   5      8       15      16      6

Variance (CO 1)

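❖ For an Individual Series: If $x_1, x_2, \ldots, x_n$ are the values of the variable under consideration, the mean $\bar{x}$ and the variance $\sigma^2$ are defined as

$$\bar{x} = \frac{\sum x}{n}, \qquad \sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$$

❖ For a Frequency Distribution: If $x_1, x_2, \ldots, x_n$ are the values of a variable $x$ with the corresponding frequencies $f_1, f_2, \ldots, f_n$ respectively, the mean $\bar{x}$ and the variance $\sigma^2$ are defined as

$$\mu = \bar{x} = \frac{\sum f x}{\sum f}, \qquad \sigma^2 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2, \qquad N = \sum f$$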



Variance (CO 1)

where $N = \sum_{i=1}^{n} f_i$
Note. In case of a frequency distribution with class intervals, the values of $x$ are the midpoints of the intervals.
Example 1. Find the variance and standard deviation for the following individual series.
x:  3  6  8  10  18
Solution:



Variance (CO 1)

x     x − x̄   (x − x̄)²
3     −6      36
6     −3      9
8     −1      1
10    1       1
18    9       81
Σx = 45, Σ(x − x̄)² = 128

$n = 5$, $\sum x = 45$, $\bar{x} = \frac{\sum x}{n} = \frac{45}{5} = 9$
$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{128}{5} = 25.6$
Standard deviation $= \sqrt{\text{variance}} = \sqrt{25.6} = 5.06$



Variance (CO 1)

• Example: Find the variance and standard deviation for the following frequency distribution.

Marks             5-15   15-25   25-35   35-45   45-55   55-65
No. of students   10     20      25      20      15      10

• Sol.



Variance (CO 1)
Marks   No. of students (f)   Mid-point (x)   fx    x − x̄ = x − 34   f(x − x̄)²
5-15    10                    10              100   −24              5760
15-25   20                    20              400   −14              3920
25-35   25                    30              750   −4               400
35-45   20                    40              800   6                720
45-55   15                    50              750   16               3840
55-65   10                    60              600   26               6760
N = 100, Σfx = 3400, Σf(x − x̄)² = 21400



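Variance (CO 1)

$\bar{x} = \frac{\sum f x}{N} = \frac{3400}{100} = 34$
Variance $\sigma^2 = \frac{\sum f (x - \bar{x})^2}{N} = \frac{21400}{100} = 214$
Standard deviation $(\sigma) = \sqrt{\text{variance}} = \sqrt{214} = 14.63$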
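
The grouped variance and standard deviation can be verified with a short Python sketch. The name `grouped_variance` is an illustrative assumption; the divisor is N, matching the population form used in this unit.

```python
from math import sqrt

def grouped_variance(mids, freqs):
    """Variance of a frequency distribution: sum of f (x - mean)^2 divided by N."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, mids)) / n
    return sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / n

mids = [10, 20, 30, 40, 50, 60]     # mid-points of the classes 5-15, ..., 55-65
freqs = [10, 20, 25, 20, 15, 10]

variance = grouped_variance(mids, freqs)
sd = sqrt(variance)
print(variance, round(sd, 2))  # 214.0 14.63
```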



Daily Quiz(CO1)

• Find the mean of the following data:


15,20,30,22,25,18,40,50,55 and 65
• Find the mode of the following distribution:
7,4,3,5,6,3,3,2,4,3,4,3,3,4,4,2,3

Recap(CO1)

• Measures of Central tendency


• Measures of dispersions

Topic Objective (CO1)

Moments
• In mathematical statistics, moments involve a basic calculation. These calculations can be used to find a probability distribution's mean, variance, and skewness.



Moments (CO1)

❑ Moments: The moments of a distribution are the arithmetic means of the various powers of the deviations of items from some given number.
➢ Moments about mean (central moment)
➢ Moments about any arbitrary number (Raw Moment)
➢ Moments about origin



Central Moments (CO1)

➢ Moment about mean (central moment):

❖ For an Individual Series: If $x_1, x_2, \ldots, x_n$ are the values of the variable under consideration, the $r$th moment $\mu_r$ about the mean $\bar{x}$ is defined as

$$\mu_r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^r}{n}; \quad r = 0, 1, 2, \ldots$$

❖ For a Frequency Distribution: If $x_1, x_2, \ldots, x_n$ are the values of a variable $x$ with the corresponding frequencies $f_1, f_2, \ldots, f_n$ respectively, then the $r$th moment $\mu_r$ about the mean $\bar{x}$ is defined as



Central Moments (CO1)

$$\mu_r = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^r}{N}; \quad r = 0, 1, 2, \ldots$$

where $N = \sum_{i=1}^{n} f_i$.
In particular, $\mu_0 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^0 = \frac{1}{N}\sum_{i=1}^{n} f_i = \frac{N}{N} = 1$
Note. In case of a frequency distribution with class intervals, the values of $x$ are the midpoints of the intervals.
Example 1. Find the first four moments for the following individual series.
x:  3  6  8  10  18
Solution: Calculation of Moments



Central Moments (CO1)



Central Moments (CO1)

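For any distribution, $\mu_0 = 1$.

For any distribution, $\mu_1 = 0$. For r = 2, $\mu_2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sigma^2$.

Therefore, for any distribution, $\mu_2$ coincides with the variance of the distribution.
Similarly, $\mu_3 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^3 = \frac{486}{5} = 97.2$
$\mu_4 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^4 = \frac{7940}{5} = 1588$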



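Central Moments (CO1)

Now $\bar{x} = \frac{\sum x}{n} = \frac{45}{5} = 9$
$\mu_1 = \frac{\sum (x - \bar{x})}{n} = \frac{0}{5} = 0$,
$\mu_2 = \frac{\sum (x - \bar{x})^2}{n} = \frac{128}{5} = 25.6$,
$\mu_3 = \frac{\sum (x - \bar{x})^3}{n} = \frac{486}{5} = 97.2$,
$\mu_4 = \frac{\sum (x - \bar{x})^4}{n} = \frac{7940}{5} = 1588$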
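
The four central moments of the individual series 3, 6, 8, 10, 18 can be recomputed with a few lines of Python. The helper name `central_moment` is an assumption made for this sketch.

```python
def central_moment(values, r):
    """r-th moment about the mean for an individual series."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n

series = [3, 6, 8, 10, 18]
moments = [central_moment(series, r) for r in range(1, 5)]
print(moments)  # [0.0, 25.6, 97.2, 1588.0]
```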



Central Moments (CO1)

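For any distribution, $\mu_0 = 1$. For r = 1, $\mu_1 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x}) = \bar{x} - \bar{x} = 0$.

For any distribution, $\mu_1 = 0$. For r = 2, $\mu_2 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2 = \sigma^2$.

Therefore, for any distribution, $\mu_2$ coincides with the variance of the distribution.
Similarly, $\mu_3 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^3$,
$\mu_4 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^4$, and so on.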



Central Moments (CO1)

• Example: Find $\mu_1, \mu_2, \mu_3, \mu_4$ for the following frequency distribution.

Marks             5-15   15-25   25-35   35-45   45-55   55-65
No. of students   10     20      25      20      15      10

• Sol. Calculation of Moments



Central Moments (CO1)
Marks   f    Mid-point (x)   fx    x − x̄ = x − 34   f(x − x̄)   f(x − x̄)²   f(x − x̄)³   f(x − x̄)⁴
5-15    10   10              100   −24              −240       5760        −138240     3317760
15-25   20   20              400   −14              −280       3920        −54880      768320
25-35   25   30              750   −4               −100       400         −1600       6400
35-45   20   40              800   6                120        720         4320        25920
45-55   15   50              750   16               240        3840        61440       983040
55-65   10   60              600   26               260        6760        175760      4569760
N = 100, Σfx = 3400, Σf(x − x̄) = 0, Σf(x − x̄)² = 21400, Σf(x − x̄)³ = 46800, Σf(x − x̄)⁴ = 9671200



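Central Moments (CO1)

$\bar{x} = \frac{\sum f x}{N} = \frac{3400}{100} = 34$
$\mu_1 = \frac{\sum f (x - \bar{x})}{N} = \frac{0}{100} = 0$
$\mu_2 = \frac{\sum f (x - \bar{x})^2}{N} = \frac{21400}{100} = 214$
$\mu_3 = \frac{\sum f (x - \bar{x})^3}{N} = \frac{46800}{100} = 468$
$\mu_4 = \frac{\sum f (x - \bar{x})^4}{N} = \frac{9671200}{100} = 96712$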



Raw Moments (CO1)

➢ MOMENTS ABOUT AN ARBITRARY NUMBER (Raw Moments):

❖ If $x_1, x_2, x_3, \ldots, x_n$ are the values of a variable $x$ with the corresponding frequencies $f_1, f_2, f_3, \ldots, f_n$ respectively, then the $r$th moment $\mu'_r$ about the number $x = A$ is defined as

$$\mu'_r = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A)^r; \quad r = 0, 1, 2, \ldots$$

where $N = \sum_{i=1}^{n} f_i$.
For $r = 0$, $\mu'_0 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A)^0 = 1$



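Raw Moments (CO1)

For $r = 1$, $\mu'_1 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A) = \frac{1}{N}\sum_{i=1}^{n} f_i x_i - \frac{A}{N}\sum_{i=1}^{n} f_i = \bar{x} - A$
For $r = 2$, $\mu'_2 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A)^2$
For $r = 3$, $\mu'_3 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - A)^3$, and so on.
In calculation work, if we find that there is some common factor $h\ (> 1)$ in the values of $x - A$, we can ease our calculation work by defining $u = \frac{x - A}{h}$.

In that case, we have

$$\mu'_r = h^r \cdot \frac{1}{N}\sum_{i=1}^{n} f_i u_i^r$$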



Moments about the origin (CO1)

➢ MOMENTS ABOUT THE ORIGIN:

If $x_1, x_2, \ldots, x_n$ are the values of a variable $x$ with corresponding frequencies $f_1, f_2, \ldots, f_n$ respectively, then the $r$th moment about the origin $v_r$ is defined as

$$v_r = \frac{1}{N}\sum_{i=1}^{n} f_i x_i^r; \quad r = 0, 1, 2, \ldots$$

where $N = \sum_{i=1}^{n} f_i$.
For $r = 0$, $v_0 = \frac{1}{N}\sum_{i=1}^{n} f_i x_i^0 = \frac{N}{N} = 1$
For $r = 1$, $v_1 = \frac{1}{N}\sum_{i=1}^{n} f_i x_i = \bar{x}$
For $r = 2$, $v_2 = \frac{1}{N}\sum_{i=1}^{n} f_i x_i^2$, and so on.



Relations (CO1)

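• RELATION BETWEEN $\mu_r$ AND $\mu'_r$

$\mu_1 = 0$
$\mu_2 = \mu'_2 - \mu_1'^2$
$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2\mu_1'^3$
$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2\mu_1'^2 - 3\mu_1'^4$

• RELATION BETWEEN $v_r$ AND $\mu_r$

$v_1 = \bar{x}$
$v_2 = \mu_2 + \bar{x}^2$
$v_3 = \mu_3 + 3\mu_2\bar{x} + \bar{x}^3$
$v_4 = \mu_4 + 4\mu_3\bar{x} + 6\mu_2\bar{x}^2 + \bar{x}^4$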
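
Both sets of relations translate directly into code. The sketch below is a minimal illustration; the function names `central_from_raw` and `origin_from_central` are assumptions, and the inputs are the raw moments −1.5, 17, −30 and 108 about A = 4 from the unit's fourth worked example on Karl Pearson's coefficients.

```python
def central_from_raw(m1, m2, m3, m4):
    """Central moments mu_1..mu_4 from raw moments mu'_1..mu'_4 about a point A."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return 0.0, mu2, mu3, mu4

def origin_from_central(mean, mu2, mu3, mu4):
    """Moments v_1..v_4 about the origin from the mean and the central moments."""
    v2 = mu2 + mean ** 2
    v3 = mu3 + 3 * mu2 * mean + mean ** 3
    v4 = mu4 + 4 * mu3 * mean + 6 * mu2 * mean ** 2 + mean ** 4
    return mean, v2, v3, v4

A = 4
raw = (-1.5, 17, -30, 108)          # moments about A = 4
central = central_from_raw(*raw)
mean = raw[0] + A                   # x-bar = mu'_1 + A
about_origin = origin_from_central(mean, *central[1:])

print(central)       # (0.0, 14.75, 39.75, 142.3125)
print(about_origin)  # (2.5, 21.0, 166.0, 1132.0)
```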



KARL PEARSON'S COEFFICIENTS (CO1)

❖ KARL PEARSON'S $\beta$ AND $\gamma$ COEFFICIENTS:

Karl Pearson defined the following four coefficients based upon the first four moments of a frequency distribution about its mean:

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} \qquad (\beta\text{-coefficients})$$

$$\gamma_1 = +\sqrt{\beta_1}, \qquad \gamma_2 = \beta_2 - 3 \qquad (\gamma\text{-coefficients})$$

The practical use of these coefficients is to measure the skewness and kurtosis of a frequency distribution. These coefficients are pure numbers independent of units of measurement.
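
A minimal Python sketch of the four coefficients follows. The function name `pearson_coefficients` is an assumption; the inputs are the central moments 25.6, 97.2 and 1588 found earlier for the series 3, 6, 8, 10, 18. Note that γ1 is taken as the positive root, as in the definition above, so it does not carry the sign of μ3.

```python
from math import sqrt

def pearson_coefficients(mu2, mu3, mu4):
    """Karl Pearson's beta and gamma coefficients from the central moments."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    gamma1 = sqrt(beta1)   # positive root, following the definition above
    gamma2 = beta2 - 3
    return beta1, beta2, gamma1, gamma2

beta1, beta2, gamma1, gamma2 = pearson_coefficients(25.6, 97.2, 1588)
print(round(beta1, 4), round(beta2, 4), round(gamma1, 4), round(gamma2, 4))
```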



KARL PEARSON'S COEFFICIENTS (CO1)

Example 1: The first three moments of a distribution about the value "2" of the variable are 1, 16 and −40. Show that the mean is 3, the variance is 15 and $\mu_3 = -86$.
Solution: We have $A = 2$, $\mu'_1 = 1$, $\mu'_2 = 16$ and $\mu'_3 = -40$.
We have that $\mu'_1 = \bar{x} - A \Rightarrow \bar{x} = \mu'_1 + A = 1 + 2 = 3$
Variance $= \mu_2 = \mu'_2 - \mu_1'^2 = 16 - (1)^2 = 15$
$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2\mu_1'^3 = -40 - 3(16)(1) + 2(1)^3 = -40 - 48 + 2 = -86$.



KARL PEARSON'S COEFFICIENTS (CO1)

Example 2: The first four moments of a distribution about the value "35" are −1.8, 240, −1020 and 144000. Find the values of $\mu_1, \mu_2, \mu_3, \mu_4$.
Solution: $\mu_1 = 0$
$\mu_2 = \mu'_2 - \mu_1'^2 = 240 - (-1.8)^2 = 236.76$
$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2\mu_1'^3 = -1020 - 3(240)(-1.8) + 2(-1.8)^3 = 264.336$
$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2\mu_1'^2 - 3\mu_1'^4 = 144000 - 4(-1020)(-1.8) + 6(240)(-1.8)^2 - 3(-1.8)^4 = 141290.11$.



KARL PEARSON'S COEFFICIENTS (CO1)

Example 3: Calculate the variance and third central moment from the following data.

x   0   1   2    3    4    5    6    7   8
f   1   9   26   59   72   52   29   7   1

Solution: Calculation of Moments, with $u = \frac{x - A}{h}$, $A = 4$, $h = 1$

x   f    u    fu    fu²   fu³
0   1    −4   −4    16    −64
1   9    −3   −27   81    −243
2   26   −2   −52   104   −208
3   59   −1   −59   59    −59
4   72   0    0     0     0
5   52   1    52    52    52
6   29   2    58    116   232
7   7    3    21    63    189
8   1    4    4     16    64
N = 256, Σfu = −7, Σfu² = 507, Σfu³ = −37



KARL PEARSON'S COEFFICIENTS (CO1)

$\mu'_1 = h\,\frac{\sum f u}{N} = \frac{-7}{256} = -0.02734$
$\mu'_2 = h^2\,\frac{\sum f u^2}{N} = \frac{507}{256} = 1.9805$



KARL PEARSON'S COEFFICIENTS (CO1)

$\mu'_3 = h^3\,\frac{\sum f u^3}{N} = \frac{-37}{256} = -0.1445$
Moments about Mean:
$\mu_1 = 0$
$\mu_2 = \mu'_2 - \mu_1'^2 = 1.9805 - (-0.02734)^2 = 1.97975$
Variance = 1.97975
Also $\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2\mu_1'^3 = -0.1445 - 3(1.9805)(-0.02734) + 2(-0.02734)^3 = 0.0178997$
Third central moment = 0.0178997.



KARL PEARSON'S COEFFICIENTS (CO1)

Example 4: The first four moments of a distribution about the value '4' of the variable are −1.5, 17, −30 and 108. Find the moments about the mean and about the origin, and $\beta_1$ and $\beta_2$. Also find the moments about the point $x = 2$.
Solution: We have $A = 4$, $\mu'_1 = -1.5$, $\mu'_2 = 17$, $\mu'_3 = -30$, $\mu'_4 = 108$
Moments about mean
$\mu_1 = 0$
$\mu_2 = \mu'_2 - \mu_1'^2 = 14.75$
$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2\mu_1'^3 = 39.75$
$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2\mu_1'^2 - 3\mu_1'^4 = 142.3125$
$\bar{x} = \mu'_1 + A = -1.5 + 4 = 2.5$



KARL PEARSON'S COEFFICIENTS (CO1)

Moments about origin:

$v_1 = \bar{x} = 2.5$
$v_2 = \mu_2 + \bar{x}^2 = 14.75 + (2.5)^2 = 21$
$v_3 = \mu_3 + 3\mu_2\bar{x} + \bar{x}^3 = 166$
$v_4 = \mu_4 + 4\mu_3\bar{x} + 6\mu_2\bar{x}^2 + \bar{x}^4 = 1132$
Calculation of $\beta_1$ and $\beta_2$
$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0.492377$, $\beta_2 = \frac{\mu_4}{\mu_2^2} = 0.654122$
Moments about the point $x = 2$
$\mu'_1 = \bar{x} - A = 2.5 - 2 = 0.5$
$\mu'_2 = \mu_2 + \mu_1'^2 = 14.75 + (0.5)^2 = 15$
$\mu'_3 = \mu_3 + 3\mu'_2\mu'_1 - 2\mu_1'^3 = 39.75 + 3(15)(0.5) - 2(0.5)^3 = 62$
$\mu'_4 = \mu_4 + 4\mu'_3\mu'_1 - 6\mu'_2\mu_1'^2 + 3\mu_1'^4 = 244$



Daily Quiz(CO1)

Q1. The first four moments of a distribution are 3, 10.5, 40.5 and 168. Comment upon the nature of the distribution.
Q2. For a distribution, the mean is 10, variance is 16, $\gamma_1$ is 1 and $\beta_2$ is 4. Find the first four moments about the origin.



Topic objective (CO1)

Skewness
• It tells us whether the distribution is normal or not
• It gives us an idea about the nature and degree of concentration of
observations about the mean
• The empirical relation between mean, median and mode is based on a moderately skewed distribution



Skewness(CO1)

❑ Skewness:
• It means lack of symmetry.
• It gives us an idea about the shape of the curve which we can draw with the help of the given data.
• A distribution is said to be skewed if:
• Mean, median and mode fall at different points, i.e., Mean ≠ Median ≠ Mode;
• Quartiles are not equidistant from the median; and
• The curve drawn with the help of the given data is not symmetrical but stretched more to one side than to the other.



Skewness(CO1)

Symmetrical Distribution
A symmetric distribution is a type of distribution where the left side of
the distribution mirrors the right side. In a symmetric distribution, the
mean, mode and median all fall at the same point.



Skewness(CO1)

Measures of Skewness:
The measures of skewness are:
• $S_k = M - M_d$,
• $S_k = M - M_o$,
• $S_k = (Q_3 - M_d) - (M_d - Q_1)$,
where $M$ is the mean, $M_d$ the median, $M_o$ the mode, $Q_1$ the first quartile and $Q_3$ the third quartile of the distribution.
These are the absolute measures of skewness.
• Coefficients of Skewness: For comparing two series we do not calculate these absolute measures; instead we calculate the relative measures called the coefficients of skewness, which are pure numbers independent of units of measurement.



Skewness(CO1)

The following are the coefficients ofskewness:


• Prof. Karl Pearson’s Coefficient of Skewness,
• Prof. Bowley’s Coefficient of Skewness,
• Coefficient of Skewness based upon Moments.
Prof. Karl Pearson’s Coefficient of Skewness:
Definition
• It is defined as:
\[ S_{Kp} = \frac{\text{A.M.} - \text{Mode}}{\text{S.D.}} = \frac{M - M_o}{\sigma} \]
where σ is the standard deviation of the distribution. If the mode is
ill-defined, then using the empirical relation for a moderately asymmetrical distribution,
\[ M_o = 3M_d - 2M, \]
we have
\[ S_{Kp} = \frac{3(M - M_d)}{\sigma} \]

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 92


Skewness(CO1)

• From the above two formulas, we observe that Sk = 0 if M = Mo = Md.
• Hence for a symmetrical distribution, mean, median and mode coincide.
• Skewness is positive if M > Mo or M > Md, and negative if M < Mo or M < Md.
• Limits are: |Sk| ≤ 3, i.e., −3 ≤ Sk ≤ 3.
• However, in practice, these limits are rarely attained.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 93


Skewness(CO1)

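Coefficient of Skewness based upon Moments
Definition
It is defined as:
\[ \gamma_1 = \frac{\mu_3}{\sqrt{\mu_2^3}} \]
In terms of Pearson’s coefficients β1 and β2, defined as
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2}, \]
the moment coefficient of skewness is
\[ S_k = \frac{\sqrt{\beta_1}\,(\beta_2 + 3)}{2(5\beta_2 - 6\beta_1 - 9)} \]
Sk = 0 if either β1 = 0 or β2 = −3. Since β2 = μ4/μ2² cannot be negative, Sk = 0 if and
only if β1 = 0.
Thus for a symmetrical distribution β1 = 0.
In this respect β1 is taken as a measure of skewness.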

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 94


Skewness(CO1)

• The coefficient of skewness based upon moments is to be regarded as
without sign.
• The Pearson’s and Bowley’s coefficients of skewness can be positive as
well as negative.
❖ Positively Skewed Distribution: The skewness is
positive if the larger tail of the distribution lies towards the higher
values of the variate (the right), i.e., if the curve drawn
with the help of the given data is stretched more to the right than
to the left.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 95


Skewness(CO1)

❖ Negatively Skewed Distribution:
The skewness is negative if the larger tail of the distribution lies towards
the lower values of the variate (the left), i.e., if the curve drawn with the
help of the given data is stretched more to the left than to the right.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 96


Skewness(CO1)

Pearson’s β1 and γ1 Coefficients:
\[ \gamma_1 = \sqrt{\beta_1} = \pm\frac{\mu_3}{\sqrt{\mu_2^3}} \]
Q1. Karl Pearson’s coefficient of skewness of a distribution is 0.32, its
standard deviation is 6.5 and its mean is 29.6. Find the mode of the
distribution.
Solution: Given that S_Kp = 0.32, σ = 6.5 and mean = 29.6,
\[ S_{Kp} = \frac{\text{A.M.} - \text{Mode}}{\text{S.D.}} \]
\[ 0.32 = \frac{29.6 - \text{Mode}}{6.5} \implies \text{Mode} = 27.52 \]
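
The rearrangement used in Q1 can be checked with a short Python sketch. The function name `mode_from_pearson` is a hypothetical helper introduced here for illustration; it simply inverts Sk_p = (mean − mode)/σ.

```python
def mode_from_pearson(skewness, mean, std_dev):
    """Invert Sk_p = (mean - mode) / std_dev to recover the mode."""
    return mean - skewness * std_dev


# Values from Q1 above.
mode = mode_from_pearson(skewness=0.32, mean=29.6, std_dev=6.5)
print(round(mode, 2))
```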

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 97


Topic objective (CO1)

Kurtosis
• Describe the concepts of kurtosis
• Explain the different measures of kurtosis
• Explain how kurtosis describes the shape of a distribution.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 98


Kurtosis (CO1)

❑ Kurtosis
• If we know the measures of central tendency, dispersion and skewness,
we still cannot form a complete idea about the distribution. Let us
consider the figure in which all the three curves A, B, and C are
symmetrical about the mean and have the same range.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 99


Kurtosis (CO1)

Definition: Kurtosis, a term due to Prof. Karl Pearson, is also known as the
convexity of the frequency curve.
• It enables us to have an idea about the flatness or peakedness of the
frequency curve.
• It is measured by the coefficient β2 or its derivative γ2, given as:
\[ \beta_2 = \frac{\mu_4}{\mu_2^2}, \qquad \gamma_2 = \beta_2 - 3 \]
• A curve of type A, which is neither flat nor peaked, is called the normal
curve or mesokurtic curve, and for such a curve β2 = 3, i.e., γ2 = 0.
• A curve of type B, which is flatter than the normal curve, is known as a
platykurtic curve, and for such a curve β2 < 3, i.e., γ2 < 0.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 100


Kurtosis (CO1)

A curve of type C, which is more peaked than the normal curve, is called a leptokurtic
curve, and for such a curve β2 > 3, i.e., γ2 > 0.
Q2. For a distribution, the mean is 10, the variance is 16, γ1 is +1 and β2 is 4. Comment
on the nature of the distribution. Also find the third central moment.
Solution: Here μ2 = 16, so
\[ \gamma_1 = \frac{\mu_3}{\sqrt{\mu_2^3}} \implies 1 = \frac{\mu_3}{\sqrt{4096}} \implies \mu_3 = 64 \]
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} \implies 4 = \frac{\mu_4}{256} \implies \mu_4 = 1024 \]
Since γ1 = +1, the distribution is moderately positively skewed, i.e.,
if we draw the curve of the given distribution, it will have a longer tail towards the right.
Further, since β2 = 4 > 3, the distribution is leptokurtic, i.e.,
it will be slightly more peaked than the normal curve.
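
As a quick check of Q2, the two definitions can be inverted numerically. This is a minimal sketch; the function name `moments_from_coefficients` is an assumption made for illustration, not part of the course material.

```python
def moments_from_coefficients(variance, gamma1, beta2):
    """Return (mu3, mu4) implied by gamma1 = mu3 / mu2**1.5 and beta2 = mu4 / mu2**2."""
    mu3 = gamma1 * variance ** 1.5
    mu4 = beta2 * variance ** 2
    return mu3, mu4


# Values from Q2: variance 16, gamma1 = +1, beta2 = 4.
mu3, mu4 = moments_from_coefficients(variance=16, gamma1=1, beta2=4)
print(mu3, mu4)
```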

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 101


Kurtosis (CO 1)

Example 3. The first four moments about the working mean 28.5 of a
distribution are 0.294, 7.144, 42.409 and 454.98. Calculate the first four
moments about the mean. Also evaluate β1 and β2 and comment upon
the skewness and kurtosis of the distribution.
Solution: μ′1 = 0.294, μ′2 = 7.144, μ′3 = 42.409, μ′4 = 454.98.
Moments about the mean:
\[ \mu_1 = 0 \]
\[ \mu_2 = \mu'_2 - \mu'^{\,2}_1 = 7.0576 \]
\[ \mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2\mu'^{\,3}_1 = 36.1588 \]
\[ \mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2\mu'^{\,2}_1 - 3\mu'^{\,4}_1 = 408.7896 \]

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 102


Kurtosis (CO 1)

\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = 3.7193, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} = 8.207 \]
Skewness: μ3 is positive, so γ1 = +√β1 = 1.9285 and the distribution is positively skewed.
Kurtosis: β2 = 8.207 > 3, so the distribution is leptokurtic.
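
The conversion in Example 3 can be reproduced with a few lines of Python. This is a sketch under the formulas stated on the previous slide; `central_from_raw` is a hypothetical helper name.

```python
import math


def central_from_raw(m1, m2, m3, m4):
    """Convert moments about a working mean into moments about the mean."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4


# Moments about the working mean 28.5 from Example 3.
mu2, mu3, mu4 = central_from_raw(0.294, 7.144, 42.409, 454.98)

beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
# gamma1 carries the sign of mu3, since beta1 itself is never negative.
gamma1 = math.copysign(math.sqrt(beta1), mu3)

print(round(mu2, 4), round(mu3, 4), round(mu4, 4))
print(round(beta1, 4), round(beta2, 3), round(gamma1, 4))
```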

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 103


Daily Quiz(CO1)

Q1. Find all four central moments and discuss the skewness and kurtosis
of the following distribution:

Range of expenditures   2-4   4-6   6-8   8-10   10-12
No. of families         38    292   389   212    69

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 104


Weekly Assignment(CO1)

Q1. The first four moments of a distribution about x = 4 are
1, 4, 10 and 45. Find the first four moments about the mean. Discuss the
skewness and kurtosis and also comment upon the nature of the
distribution.
Q2. Define the mode and calculate the mode for the distribution of
monthly rent paid by libraries in Karnataka:

Monthly rent       500-1000   1000-1500   1500-2000   2000-2500   2500-3000   3000 & above
No. of libraries   5          10          8           16          14          12

Q3. Write short notes on:
i. Range ii. Interquartile range iii. Mean deviation iv. Standard
deviation v. Variance
Q4. Explain the measures of dispersion and also find the range and
coefficient of range for the following data: 20, 35, 25, 30, 15.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 105


Recap(CO1)

✓ Moments
✓ Relation between 𝑣𝑟 𝑎𝑛𝑑 𝜇𝑟
✓ Relation between 𝜇𝑟 𝑎𝑛𝑑 𝜇′𝑟
✓ Moment generating function.
✓ Skewness
✓ Kurtosis

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 106


Topic objectives(CO1)

Curve Fitting
• The objective of curve fitting is to find the parameters of a
mathematical model that describes a set of data in a way that
minimizes the difference between the model and the data.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 107


Curve Fitting (CO1)

❑ Curve Fitting: Curve fitting means finding an exact relationship between
two variables, expressed by an algebraic equation. It enables us to represent the
relationship between two variables by simple algebraic expressions,
e.g., polynomials, exponential or logarithmic functions. It is also
used to estimate the values of one variable corresponding to
specified values of the other variable.

❖ METHOD OF LEAST SQUARES: The method of least squares provides a
unique set of values for the constants and hence suggests a curve of
best fit to the given data.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 108


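Curve Fitting (CO1)

• FITTING A STRAIGHT LINE: Let (x_i, y_i), i = 1, 2, …, n be n sets of
observations of related data, and let the line to be fitted be
\[ y = a + bx \quad (1) \]
Normal equations:
\[ \sum y = na + b\sum x \quad (2) \]
\[ \sum xy = a\sum x + b\sum x^2 \quad (3) \]
When the x values are equally spaced with interval h, the arithmetic can be simplified by a change of variable:
If n is odd, then \( u = \dfrac{x - (\text{middle term})}{h} \)
If n is even, then \( u = \dfrac{x - (\text{mean of the two middle terms})}{\frac{1}{2}h} \)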

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 109


Curve Fitting (CO1)

Q. Fit a straight line to the following data by the least squares method.

x   0   1     2     3     4
y   1   1.8   3.3   4.5   6.3

Sol. Let the straight line obtained from the given data be
\[ y = a + bx \quad (1) \]
Then the normal equations are
\[ \sum y = ma + b\sum x \quad (2) \]
\[ \sum xy = a\sum x + b\sum x^2 \quad (3) \]
where m = 5 is the number of observations.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 110


Curve Fitting (CO1)

From (2) and (3), with Σx = 10, Σy = 16.9, Σx² = 30 and Σxy = 47.1:
\[ \sum y = ma + b\sum x \implies 16.9 = 5a + 10b \]
\[ \sum xy = a\sum x + b\sum x^2 \implies 47.1 = 10a + 30b \]
Solving, we get a = 0.72, b = 1.33.
The required line is y = 0.72 + 1.33x.
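
The normal equations above can be solved in closed form. The following Python sketch does so for the data of this example; `fit_line` is a hypothetical helper name, and only the standard library is assumed.

```python
def fit_line(xs, ys):
    """Solve the two normal equations for y = a + b*x and return (a, b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


# Data from the worked example above.
a, b = fit_line([0, 1, 2, 3, 4], [1, 1.8, 3.3, 4.5, 6.3])
print(round(a, 2), round(b, 2))
```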

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 111


Curve Fitting (CO1)

➢ FITTING OF AN EXPONENTIAL CURVE
Let
\[ y = ae^{bx} \quad (1) \]
Taking logarithms on both sides, we get
\[ \log_{10} y = \log_{10} a + bx\log_{10} e \]
\[ Y = A + BX \]
where Y = log10 y, A = log10 a, B = b log10 e, X = x.
The normal equations for (1) are
\[ \sum Y = nA + B\sum X \quad\text{and}\quad \sum XY = A\sum X + B\sum X^2 \]
Solving these, we get A and B.
Then a = antilog A and \( b = \dfrac{B}{\log_{10} e} \).
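
The log transformation can be sketched in Python as follows. The data here are hypothetical: they are generated exactly from y = 2e^{0.5x} so that the recovered constants are known in advance. `fit_exponential` is an illustrative helper name.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by a least squares line through (x, log10 y)."""
    big_y = [math.log10(y) for y in ys]
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(big_y)
    sum_xy = sum(x * y for x, y in zip(xs, big_y))
    sum_xx = sum(x * x for x in xs)
    big_b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    big_a = (sum_y - big_b * sum_x) / n
    a = 10 ** big_a                    # a = antilog A
    b = big_b / math.log10(math.e)     # b = B / log10(e)
    return a, b


# Hypothetical data lying exactly on y = 2 * exp(0.5 * x).
xs = [0, 1, 2, 3, 4]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys)
print(round(a, 6), round(b, 6))
```

Note that this minimizes squared errors in log10 y rather than in y itself, so for noisy data the result can differ from a direct non-linear fit.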

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 112


Curve Fitting (CO1)

➢ FITTING OF THE CURVE y = ax^b
Let
\[ y = ax^b \quad (1) \]
Taking logarithms on both sides, we get
\[ \log_{10} y = \log_{10} a + b\log_{10} x \]
\[ Y = A + BX \]
where Y = log10 y, A = log10 a, B = b, X = log10 x.
The normal equations for (1) are
\[ \sum Y = nA + B\sum X \quad\text{and}\quad \sum XY = A\sum X + B\sum X^2 \]
which give A and B on solving, and then a = antilog A, b = B.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 113


Curve Fitting (CO1)

Example 5. Use the method of least squares to fit the curve
\[ y = \frac{c_0}{x} + c_1\sqrt{x} \]
to the following table of values:

x   0.1   0.2   0.4   0.5   1   2
y   21    11    7     6     5   6

➢ Solution: Let the given curve be y = c0/x + c1√x.
The normal equations are
\[ \sum \frac{y}{x} = c_0\sum\frac{1}{x^2} + c_1\sum\frac{1}{\sqrt{x}} \]
\[ \sum y\sqrt{x} = c_0\sum\frac{1}{\sqrt{x}} + c_1\sum x \]

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 114


Curve Fitting (CO1)

x      y    y/x     y√x       1/√x      1/x²
0.1    21   210     6.64078   3.16228   100
0.2    11   55      4.91935   2.23607   25
0.4    7    17.5    4.42719   1.58114   6.25
0.5    6    12      4.24264   1.41421   4
1      5    5       5         1         1
2      6    3       8.48528   0.70711   0.25
Σ 4.2       302.5   33.7152   10.1008   136.5

302.5 = 136.5c0 + 10.10081c1

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 115


Curve Fitting (CO1)

33.71524 = 10.10081c0 + 4.2c1
so we have
c0 = 1.97327, c1 = 3.28182
Hence the curve is
\[ y = \frac{1.97327}{x} + 3.28182\sqrt{x} \]

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 116
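
The two normal equations of Example 5 can be assembled and solved programmatically. The sketch below treats 1/x and √x as the two basis functions and solves the 2×2 system by Cramer's rule; the variable names are illustrative assumptions.

```python
import math

# Data from Example 5.
xs = [0.1, 0.2, 0.4, 0.5, 1, 2]
ys = [21, 11, 7, 6, 5, 6]

# Sums that appear in the normal equations for y = c0/x + c1*sqrt(x).
s_y_over_x = sum(y / x for x, y in zip(xs, ys))
s_y_sqrt_x = sum(y * math.sqrt(x) for x, y in zip(xs, ys))
s_inv_x2 = sum(1 / x ** 2 for x in xs)
s_inv_sqrt_x = sum(1 / math.sqrt(x) for x in xs)
s_x = sum(xs)

# Cramer's rule on:
#   s_y_over_x = c0 * s_inv_x2     + c1 * s_inv_sqrt_x
#   s_y_sqrt_x = c0 * s_inv_sqrt_x + c1 * s_x
det = s_inv_x2 * s_x - s_inv_sqrt_x ** 2
c0 = (s_y_over_x * s_x - s_inv_sqrt_x * s_y_sqrt_x) / det
c1 = (s_inv_x2 * s_y_sqrt_x - s_inv_sqrt_x * s_y_over_x) / det

print(round(c0, 5), round(c1, 5))
```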


Daily Quiz(CO1)

Q. Fit a second-degree parabola to the following data:

𝑥 0 1 2 3 4
𝑓 1 0 3 10 21

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 117


Topic Objective (CO1)

Time series
1. It helps to understand the concept of Time-series for future
prediction of data values.
2. Understand the different basic concept / fundamentals of Time
Series Analysis.
3. Understand the importance of Time Series Analysis.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 118

INTRODUCTION OF TIME SERIES(CO1)

Planning for the future is necessary for every business firm, every
government institution, every individual and every country. Every family
plans its income and expenditure. Likewise, every business plans for
its financial resources and sales, and for maximizing its profit.
Definition: “A time series is a set of observations taken at specified
times, usually at equal intervals”.
“A time series may be defined as a collection of readings belonging to
different time periods of some economic or composite variables”.
By Ya-Lun-Chau
▪ A time series establishes a relation between “cause” and “effect”.
▪ One variable is “Time”, which is the independent variable, and the
second is “Data”, which is the dependent variable.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 119

TIME SERIES ANALYSIS(CO1)

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 120

Example(CO1)
We explain it with the following examples:

Example 1
Day         No. of packets of milk sold
Monday      90
Tuesday     88
Wednesday   85
Thursday    75
Friday      72
Saturday    90
Sunday      102

Example 2
Year   Population (in million)
1921   251
1931   279
1941   319
1951   361
1961   439
1971   548
1981   685

• From example 1 it is clear that the sale of milk packets decreases from
Monday to Friday and then starts to increase again.
• Similarly, in example 2 the population increases continuously.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 121

Time Series (CO1)
Examples
• Stock price, Sensex
• Exchange rate, interest rate, inflation rate, national GDP
• Retail sales
• Electric power consumption
• Number of accident fatalities

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 122

Time Series (CO1)

The method of least squares can be used to fit either a straight line trend or a
parabolic trend.
The straight line trend is represented by the equation:
Yc = a + bX

where Yc = trend value to be computed
X = unit of time (independent variable)
a = constant to be calculated
b = constant to be calculated
❑ Example:
Draw a straight line trend and estimate the trend value for 1996:

Year         1991   1992   1993   1994   1995
Production   8      9      8      9      16

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 123

Time Series (CO1)

Year   X (deviation from 1990)   Y    XY   X²   Trend Yc = a + bX
(1)    (2)                       (3)  (4)  (5)  (6)
1991   1                         8    8    1    5.2 + 1.6(1) = 6.8
1992   2                         9    18   4    5.2 + 1.6(2) = 8.4
1993   3                         8    24   9    5.2 + 1.6(3) = 10.0
1994   4                         9    36   16   5.2 + 1.6(4) = 11.6
1995   5                         16   80   25   5.2 + 1.6(5) = 13.2

N = 5, ΣX = 15, ΣY = 50, ΣXY = 166, ΣX² = 55

Now we calculate the values of the two constants ‘a’ and ‘b’ with the help of
two equations:

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 124

Time Series (CO1)

ΣY = Na + bΣX
ΣXY = aΣX + bΣX²

Now we substitute the values of ΣX, ΣY, ΣXY, ΣX² and N:

50 = 5a + 15b ……………. (i)
166 = 15a + 55b ……………… (ii)
or 5a + 15b = 50 ……………… (iii)
15a + 55b = 166 …………………. (iv)

Multiplying equation (iii) by 3 and subtracting equation (iv) from it:

-10b = -16
b = 1.6
Now we put the value of “b” in equation (iii):

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 125

Time Series (CO1)

5a + 15(1.6) = 50
5a = 26
a = 5.2
With these values of ‘a’ and ‘b’, the trend line is:
Yc = a + bX
Yc = 5.2 + 1.6X

Now we calculate the trend value for 1996 (X = 6):
Y1996 = 5.2 + 1.6(6) = 14.8
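
The same trend calculation can be sketched in Python. The variable names (`origin`, `trend`, `forecast_1996`) are illustrative assumptions; the year 1990 is taken as the origin, as in the table above.

```python
years = [1991, 1992, 1993, 1994, 1995]
production = [8, 9, 8, 9, 16]
origin = 1990  # X is measured as the deviation from 1990

X = [year - origin for year in years]
N = len(X)
sum_x, sum_y = sum(X), sum(production)
sum_xy = sum(x * y for x, y in zip(X, production))
sum_xx = sum(x * x for x in X)

# Closed-form solution of the two normal equations.
b = (N * sum_xy - sum_x * sum_y) / (N * sum_xx - sum_x ** 2)
a = (sum_y - b * sum_x) / N

trend = [a + b * x for x in X]
forecast_1996 = a + b * (1996 - origin)

print(a, b)
print([round(t, 1) for t in trend], round(forecast_1996, 1))
```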

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 126

Daily Quiz (CO1)

Q1. Fit a straight line trend by the method of least square (taking 1980 as
year of origin) to the following data:
Year 1980 1981 1982 1983 1984 1985 1986

Production 125 128 133 135 140 141 143

And obtain the trend values.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 127

Recap(CO1)

✓ Moments
✓ Relation between 𝑣𝑟 𝑎𝑛𝑑 𝜇𝑟
✓ Relation between 𝜇𝑟 𝑎𝑛𝑑 𝜇′𝑟
✓ Moment generating function.
✓ Skewness & kurtosis
✓ Curve fitting
✓ Time Series

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 128


Topic objective (CO1)

Correlation
• Identify the direction and strength of a correlation between two factors.
• Compute and interpret the Pearson correlation coefficient and test for
significance.
• Compute and interpret the coefficient of determination.
• Compute and interpret the Spearman correlation coefficient and test
for significance.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 129


Correlation(CO1)

➢ Correlation: In a bivariate distribution we are interested in finding
out if there is any correlation between the two variables under study.
• If a change in one variable affects a change in the other variable, the
variables are said to be correlated.
❖ Positive Correlation
• If the two variables deviate in the same direction, i.e., if the increase
(or decrease) in one results in a corresponding increase (or decrease) in
the other, correlation is said to be direct or positive.
• For example, the correlation between (i) the heights and weights of a
group of persons, and (ii) the income and expenditure, is positive.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 130


Correlation(CO1)

➢ Negative Correlation:
• If the two variables deviate in opposite directions, i.e., if an increase (or
decrease) in one results in a corresponding decrease (or increase) in the other,
correlation is said to be diverse or negative.
• For example, the correlation between (i) the price and demand of a
commodity, and (ii) the volume and pressure of a perfect gas, is
negative.
➢ Perfect Correlation
• Correlation is said to be perfect if the deviation in one variable is
followed by a corresponding and proportional deviation in the other.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 131


Correlation(CO1)

Correlation Coefficient:
• The correlation coefficient due to Karl Pearson is defined as a measure
of the intensity or degree of linear relationship between two variables.
• Karl Pearson’s Correlation Coefficient
• Karl Pearson’s correlation coefficient between two variables X and Y,
denoted by r(X, Y) or r_XY, is a measure of the linear relationship between them
and is defined as:
\[ r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X\,\sigma_Y} \]
• If (x_i, y_i); i = 1, 2, ..., n is the bivariate distribution, then
\[ \mathrm{Cov}(X, Y) = E[\{X - E(X)\}\{Y - E(Y)\}] \]

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 132


Correlation(CO1)

KARL PEARSON’S COEFFICIENT OF CORRELATION (OR PRODUCT
MOMENT CORRELATION COEFFICIENT)
The correlation coefficient between two variables x and y, usually
denoted by r(x, y) or r_xy, is a numerical measure of the linear relationship
between them and is defined as
\[ r_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}} \]

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 133


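Correlation(CO1)

\[ r_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{n\,\sigma_x\,\sigma_y} \]
or
\[ r(x, y) = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} \]
Here n is the number of pairs of values of x and y.
Note: The correlation coefficient is independent of change of origin and
scale.
Let us define two new variables u and v as
\[ u = \frac{x - a}{h}, \qquad v = \frac{y - b}{k}, \]
where a, b, h, k are constants; then r_xy = r_uv, with
\[ r(u, v) = \frac{n\sum uv - \sum u \sum v}{\sqrt{\left[n\sum u^2 - (\sum u)^2\right]\left[n\sum v^2 - (\sum v)^2\right]}} \]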

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 134


Correlation(CO1)

Q. Find the coefficient of correlation between the values of x and y:

x   1   3    5    7    8    10
y   8   12   15   17   18   20

Sol. Here n = 6. The table is as follows.

x    y    x²    y²    xy
1    8    1     64    8
3    12   9     144   36
5    15   25    225   75
7    17   49    289   119
8    18   64    324   144
10   20   100   400   200

Σx = 34, Σy = 90, Σx² = 248, Σy² = 1446, Σxy = 582

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 135


Correlation(CO1)

Karl Pearson’s coefficient of correlation is given by
\[ r(x, y) = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} \]
\[ r(x, y) = \frac{6 \times 582 - 34 \times 90}{\sqrt{\left[6 \times 248 - 34^2\right]\left[6 \times 1446 - 90^2\right]}} = 0.9879 \]
Q. Find the coefficient of correlation for the following table:

x   10   14   18   22   26   30
y   18   12   24   6    30   36

Solution: Let u = (x − 22)/4, v = (y − 24)/6.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 136


Correlation(CO1)

x    y    u    v    u²   v²   uv
10   18   -3   -1   9    1    3
14   12   -2   -2   4    4    4
18   24   -1   0    1    0    0
22   6    0    -3   0    9    0
26   30   1    1    1    1    1
30   36   2    2    4    4    4

Total: Σu = −3, Σv = −3, Σu² = 19, Σv² = 19, Σuv = 12

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 137


Correlation(CO1)

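Hence n = 6, ū = (1/n)Σu = (1/6)(−3) = −1/2; v̄ = (1/n)Σv = (1/6)(−3) = −1/2.
Then
\[ r_{uv} = \frac{n\sum uv - \sum u \sum v}{\sqrt{\left[n\sum u^2 - (\sum u)^2\right]\left[n\sum v^2 - (\sum v)^2\right]}} \]
\[ = \frac{6 \times 12 - (-3)(-3)}{\sqrt{\left[6 \times 19 - (-3)^2\right]\left[6 \times 19 - (-3)^2\right]}} = \frac{63}{\sqrt{105 \times 105}} = 0.6 \]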
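
Both worked examples, and the invariance of r under a change of origin and scale, can be checked with a short Python sketch. `pearson_r` is a hypothetical helper that implements the computational formula used above.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's r via the sums-of-products formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


# First example.
r1 = pearson_r([1, 3, 5, 7, 8, 10], [8, 12, 15, 17, 18, 20])

# Second example: raw data, then the transformed variables u and v.
x2 = [10, 14, 18, 22, 26, 30]
y2 = [18, 12, 24, 6, 30, 36]
r2 = pearson_r(x2, y2)
u = [(x - 22) / 4 for x in x2]
v = [(y - 24) / 6 for y in y2]
r2_uv = pearson_r(u, v)

print(round(r1, 4), round(r2, 4), round(r2_uv, 4))
```

Working with u and v only makes the hand arithmetic smaller; the value of r is the same as for the raw data.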

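❖ Calculation of the coefficient of correlation for a bivariate frequency
distribution.
• If the bivariate data on x and y are presented in a two-way
correlation table and f is the frequency of a particular rectangle
in the correlation table, then, with n = Σf,
\[ r_{xy} = \frac{n\sum fxy - \sum fx \sum fy}{\sqrt{\left[n\sum fx^2 - (\sum fx)^2\right]\left[n\sum fy^2 - (\sum fy)^2\right]}} \]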

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 138


Correlation(CO1)

Since change of origin and scale do not affect the coefficient of
correlation, r_xy = r_uv, where the new variables u, v are suitably
chosen.
Q. The following table gives, according to age, the frequency of marks
obtained by 100 students in an intelligence test:

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 139


Correlation(CO1)

Marks \ Age   18   19   20   21   Total
10-20         4    2    2    -    8
20-30         5    4    6    4    19
30-40         6    8    10   11   35
40-50         4    4    6    8    22
50-60         -    2    4    4    10
60-70         -    2    3    1    6
Total         19   22   31   28   100

Calculate the coefficient of correlation between age and intelligence.
Solution: Let age and intelligence be denoted by x and y respectively.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 140


Correlation(CO1)

Mid value   Marks (y)   x=18   x=19   x=20   x=21   f     u = (y − 45)/10   fu    fu²   fuv
15          10-20       4      2      2      -      8     -3                -24   72    30
25          20-30       5      4      6      4      19    -2                -38   76    20
35          30-40       6      8      10     11     35    -1                -35   35    9
45          40-50       4      4      6      8      22    0                 0     0     0
55          50-60       -      2      4      4      10    1                 10    10    2
65          60-70       -      2      3      1      6     2                 12    24    -2
f                       19     22     31     28     100   Total             -75   217   59
v = x − 20              -2     -1     0      1
fv                      -38    -22    0      28     Total = -32
fv²                     76     22     0      28     Total = 126
fuv                     56     16     0      -13    Total = 59

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 141
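Correlation(CO1)

Let us define two new variables u and v as u = (y − 45)/10, v = x − 20. Then, with n = Σf = 100,
\[ r_{xy} = r_{uv} = \frac{n\sum fuv - \sum fu \sum fv}{\sqrt{\left[n\sum fu^2 - (\sum fu)^2\right]\left[n\sum fv^2 - (\sum fv)^2\right]}} \]
\[ = \frac{100(59) - (-75)(-32)}{\sqrt{\left[100(217) - 75^2\right]\left[100(126) - 32^2\right]}} = \frac{3500}{\sqrt{16075 \times 11576}} \approx 0.2566 \]
(0.25 when truncated to two decimal places).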
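
The sums in the correlation table can be recomputed directly from the cell frequencies. In the sketch below, the positions of the empty cells are an assumption inferred from the row and column totals of the table, and the variable names are illustrative.

```python
import math

ages = [18, 19, 20, 21]
# Rows keyed by the class mid-value of marks; 0 marks an empty cell.
# Cell positions are inferred from the row and column totals (an assumption).
table = {
    15: [4, 2, 2, 0],
    25: [5, 4, 6, 4],
    35: [6, 8, 10, 11],
    45: [4, 4, 6, 8],
    55: [0, 2, 4, 4],
    65: [0, 2, 3, 1],
}

n = sum_fu = sum_fu2 = sum_fv = sum_fv2 = sum_fuv = 0
for mid, row in table.items():
    u = (mid - 45) // 10          # u = (y - 45) / 10
    for age, f in zip(ages, row):
        v = age - 20              # v = x - 20
        n += f
        sum_fu += f * u
        sum_fu2 += f * u * u
        sum_fv += f * v
        sum_fv2 += f * v * v
        sum_fuv += f * u * v

r = (n * sum_fuv - sum_fu * sum_fv) / math.sqrt(
    (n * sum_fu2 - sum_fu ** 2) * (n * sum_fv2 - sum_fv ** 2)
)
print(n, sum_fu, sum_fu2, sum_fv, sum_fv2, sum_fuv, round(r, 4))
```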

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 142


Rank Correlation(CO1)

RANK CORRELATION:
Definition: Assuming that no two individuals are bracketed equal in either
classification, each of the rank variables X and Y takes the values 1, 2, ..., n.
Hence, the rank correlation coefficient between A and B is denoted by r,
and is given as:
\[ r = 1 - \frac{6\sum D^2}{n(n^2 - 1)} \]
where D is the difference between the two ranks of an individual.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 143


Rank Correlation(CO1)

Question. Compute the rank correlation coefficient for the following
data.

Person            A   B    C   D   E   F   G   H   I   J
Rank in maths     9   10   6   5   7   2   4   8   1   3
Rank in physics   1   2    3   4   5   6   7   8   9   10

Sol. Here the ranks are given and n = 10.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 144


Rank Correlation(CO1)

Person   R1   R2   D = R1 − R2   D²
A        9    1    8             64
B        10   2    8             64
C        6    3    3             9
D        5    4    1             1
E        7    5    2             4
F        2    6    -4            16
G        4    7    -3            9
H        8    8    0             0
I        1    9    -8            64
J        3    10   -7            49

ΣD² = 280, so
\[ r = 1 - \frac{6 \times 280}{10(10^2 - 1)} = 1 - \frac{1680}{990} \approx -0.697 \]

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 145
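
For untied ranks, the rank correlation formula is a one-liner. The sketch below applies it to the maths and physics ranks of the example; `spearman_rho` is a hypothetical helper name.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rank correlation for untied ranks: 1 - 6*sum(D^2) / (n*(n^2 - 1))."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


maths = [9, 10, 6, 5, 7, 2, 4, 8, 1, 3]
physics = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
rho = spearman_rho(maths, physics)
print(round(rho, 3))
```

A strongly negative value indicates that persons ranked high in one subject tend to be ranked low in the other.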


Rank Correlation(CO1)

Uses:
• It is used for finding correlation coefficient if we are dealing with
qualitative characteristics which cannot be measured quantitatively but
can be arranged serially.
• It can also be used where actual data are given.
• In case of extreme observations, Spearman’s formula is preferred to
Pearson’s formula.
Limitations
• It is not applicable in the case of bivariate frequency distribution.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 146


Tied Correlation(CO1)

• For n > 30, this formula should not be used unless the ranks are given,
since in the contrary case the calculations are quite time-consuming.

TIED RANKS: If some of the individuals receive the same rank in a
ranking of merit, they are said to be tied.
• Let us suppose that m of the individuals, say, the (k + 1)th, (k + 2)th, ..., (k +
m)th, are tied.
• Then each of these m individuals is assigned a common rank, which is the
arithmetic mean of the ranks k + 1, k + 2, ..., k + m.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 147


Tied Correlation(CO1)

Question: Obtain the rank correlation coefficient for the following
data:

x   68   64   75   50   64   80   75   40   55   64
y   62   58   68   45   81   60   68   48   50   70

Solution: Here marks are given, so we first write down the ranks.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 148


Tied Correlation(CO1)

X                68   64   75    50   64   80   75    40   55   64   Total
Y                62   58   68    45   81   60   68    48   50   70
Ranks in X (x)   4    6    2.5   9    6    1    2.5   10   8    6
Ranks in Y (y)   5    7    3.5   10   1    6    3.5   9    8    2
D = x − y        -1   -1   -1    -1   5    -5   -1    1    0    4    0
D²               1    1    1     1    25   25   1     1    0    16   72

Tied values: in X, 75 occurs 2 times and 64 occurs 3 times; in Y, 68 occurs 2 times.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 149


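Tied Correlation(CO1)

For each group of m tied values, the correction m(m² − 1)/12 is added to ΣD². Here the
corrections are 2(2² − 1)/12 + 3(3² − 1)/12 + 2(2² − 1)/12 = 0.5 + 2 + 0.5 = 3, so
\[ r = 1 - \frac{6(72 + 3)}{10(10^2 - 1)} = 1 - \frac{6 \times 75}{990} = \frac{6}{11} = 0.545 \]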
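
The tied-rank procedure can be sketched in Python: assign average ranks, add the correction m(m² − 1)/12 for each tie group, and apply the rank correlation formula. The helper names `average_ranks` and `tie_correction` are assumptions introduced for illustration.

```python
from collections import Counter


def average_ranks(values):
    """Rank in descending order; tied values share the mean of their ranks."""
    counts = Counter(values)
    first_position = {}
    for position, value in enumerate(sorted(values, reverse=True), start=1):
        first_position.setdefault(value, position)
    return [first_position[v] + (counts[v] - 1) / 2 for v in values]


def tie_correction(values):
    """Sum of m*(m^2 - 1)/12 over every group of m tied values."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values() if m > 1)


x = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
y = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]

rank_x, rank_y = average_ranks(x), average_ranks(y)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
correction = tie_correction(x) + tie_correction(y)
n = len(x)
r = 1 - 6 * (sum_d2 + correction) / (n * (n * n - 1))

print(rank_x, rank_y)
print(sum_d2, correction, round(r, 3))
```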

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 150


Daily Quiz(CO1)

Q1. Find the rank correlation coefficient for the following data:
𝑥 23 27 28 28 29 30 31 33 35 36

𝑦 18 20 22 27 21 29 27 29 28 29

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 151


Recap(CO1)

✓ Correlation
✓ Karl Pearson coefficient of correlation
✓ Rank Correlation
✓ Tied Rank

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 152


Topic objectives (CO1)

Regression
• Explanation of the variation in the dependent variable, based on
the variation in independent variables and Predict the values of the
dependent variable.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 153


Regression Analysis(CO1)

❑ REGRESSION ANALYSIS:
• Regression measures the nature and extent of correlation.
Regression is the estimation or prediction of unknown values of
one variable from known values of another variable.
Difference between curve fitting and regression analysis: The only
fundamental difference, if any, between problems of curve fitting and
regression is that in regression any of the variables may be considered
as independent or dependent, while in curve fitting one variable
cannot be dependent.
Curve of regression and regression equation:
• If two variates x and y are correlated, i.e., there exists an
association or relationship between them, then the scatter diagram
will be more or less concentrated round a curve. This curve is called
the curve of regression, and the relationship is said to be expressed by
means of curvilinear regression.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 154


Regression Analysis(CO1)

• The mathematical equation of the regression curve is called the
regression equation.

The following types of regression are discussed here:


➢ Linear Regression
➢ Non- linear Regression
➢ Multiple linear Regression

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 155


Linear Regression(CO1)

➢ LINEAR REGRESSION:
• When the points of the scatter diagram are concentrated round a
straight line, the regression is called linear and this straight line is
known as the line of regression.
• Regression is called non-linear if there exists a relationship
other than a straight line between the variables under
consideration.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 156


Linear Regression(CO1)

LINES OF REGRESSION: A line of regression is the straight line which
gives the best fit in the least squares sense to the given frequency.

Let
\[ y = a + bx \quad (1) \]
be the equation of the regression line of y on x. The normal equations are
\[ \sum y = na + b\sum x \quad (2) \]
\[ \sum xy = a\sum x + b\sum x^2 \quad (3) \]
Solving (2) and (3) for a and b, we get
\[ b = \frac{\sum xy - \frac{1}{n}\sum x \sum y}{\sum x^2 - \frac{1}{n}(\sum x)^2} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} \quad (4) \]
𝑛

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 157


Linear Regression(CO1)

\[ a = \frac{\sum y}{n} - b\,\frac{\sum x}{n} = \bar{y} - b\bar{x} \quad (5) \]
Eq. (5) gives ȳ = a + b x̄.
Hence the line y = a + bx passes through the point (x̄, ȳ).
Putting a = ȳ − b x̄ in the equation y = a + bx, we get
\[ y - \bar{y} = b(x - \bar{x}) \quad (6) \]
Eq. (6) is called the regression line of y on x. ‘b’ is called the regression
coefficient of y on x and is usually denoted by b_yx:
\[ y - \bar{y} = b_{yx}(x - \bar{x}), \qquad b_{yx} = r\,\frac{\sigma_y}{\sigma_x} \]

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 158


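Linear Regression(CO1)
Similarly, the regression line of x on y, x = a + by, can be written as
\[ x - \bar{x} = b_{xy}(y - \bar{y}) \]
where b_xy is the regression coefficient of x on y and is given by
\[ b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2} \]
or \( b_{xy} = r\,\dfrac{\sigma_x}{\sigma_y} \), where the terms have their usual meanings.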

USE OF REGRESSION ANALYSIS:


A) In the field of business this tool of statistical analysis is widely
used. Businessmen are interested in predicting future production,
consumption, investment, prices, profits, sales, etc.
B) In the field of economic planning and sociological studies,
projections of population, birth rates, death rates and other similar variables
are of great use.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 159


Linear Regression(CO1)

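where x̄ and ȳ are the mean values, while
\[ b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} \]
In eq. (3), shifting the origin to (x̄, ȳ), we get
\[ \sum (x - \bar{x})(y - \bar{y}) = a\sum (x - \bar{x}) + b\sum (x - \bar{x})^2 \]
\[ \implies n r \sigma_x \sigma_y = a(0) + b\,n\sigma_x^2 \]
\[ \implies b = r\,\frac{\sigma_y}{\sigma_x} \]
where r is the coefficient of correlation, and σx and σy are the standard
deviations of the x and y series respectively.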

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 160


Regression Analysis Properties(CO1)

PROPERTIES OF REGRESSION COEFFICIENTS:


Property 1. The correlation coefficient is the geometric mean between the
regression coefficients.
Proof: The coefficients of regression are rσy/σx and rσx/σy.
\[ \text{G.M. between them} = \sqrt{\frac{r\sigma_y}{\sigma_x} \times \frac{r\sigma_x}{\sigma_y}} = \sqrt{r^2} = r = \text{coefficient of correlation.} \]
Property 2. If one of the regression coefficients is greater than unity,
the other must be less than unity.
Proof. The two regression coefficients are b_yx = rσy/σx and b_xy = rσx/σy.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 161


Regression Analysis Properties(CO1)

Let b_yx > 1; then 1/b_yx < 1.
Since b_yx · b_xy = r² ≤ 1,
\[ b_{xy} \le \frac{1}{b_{yx}} < 1 \]
Similarly, if b_xy > 1, then b_yx < 1.
Property 3. The arithmetic mean of the regression coefficients is greater than
the correlation coefficient.
Proof. We have to prove that
\[ \frac{b_{yx} + b_{xy}}{2} > r \]
\[ r\,\frac{\sigma_y}{\sigma_x} + r\,\frac{\sigma_x}{\sigma_y} > 2r \]

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 162


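Regression Analysis Properties(CO1)

\[ \sigma_x^2 + \sigma_y^2 > 2\sigma_x\sigma_y \]
\[ (\sigma_x - \sigma_y)^2 > 0, \text{ which is true.} \]
Property 4: Regression coefficients are independent of the origin but
not of scale.
Proof. Let u = (x − a)/h, v = (y − b)/k, where a, b, h and k are constants.
Then σx = hσu, σy = kσv and r_xy = r_uv, so
\[ b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{k}{h}\, r\,\frac{\sigma_v}{\sigma_u} = \frac{k}{h}\, b_{vu} \]
Similarly,
\[ b_{xy} = \frac{h}{k}\, b_{uv} \]
Thus b_yx and b_xy are both independent of a and b but not of h and k.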

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 163


Regression Analysis Properties(CO1)

Property 5: The correlation coefficient and the two regression
coefficients have the same sign.
Proof: Regression coefficient of y on x = b_yx = rσy/σx
Regression coefficient of x on y = b_xy = rσx/σy
Since σx and σy are both positive, b_yx, b_xy and r have the same sign.

• ANGLE BETWEEN TWO LINES OF REGRESSION:


If 𝜃 is the acute angle between the two regression lines in the case of
two variables 𝑥 𝑎𝑛𝑑 𝑦 ,show that

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 164


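Regression Analysis Properties(CO1)

\[ \tan\theta = \frac{1 - r^2}{r} \cdot \frac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2}, \]
where r, σx, σy have their usual meanings.
Explain the significance of the formula when r = 0 and r = ±1.
Proof: The equations of the lines of regression of y on x and x on y are
\[ y - \bar{y} = \frac{r\sigma_y}{\sigma_x}(x - \bar{x}) \quad\text{and}\quad x - \bar{x} = \frac{r\sigma_x}{\sigma_y}(y - \bar{y}) \]
The slopes are m1 = rσy/σx and m2 = σy/(rσx), so
\[ \tan\theta = \pm\frac{m_2 - m_1}{1 + m_1 m_2} = \pm\frac{\dfrac{\sigma_y}{r\sigma_x} - \dfrac{r\sigma_y}{\sigma_x}}{1 + \dfrac{\sigma_y^2}{\sigma_x^2}} = \pm\frac{1 - r^2}{r} \cdot \frac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2} \]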

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 165


Regression Analysis Properties(CO1)

Since r² ≤ 1 and σx, σy are positive, the positive sign gives the acute angle
between the two lines:
\[ \tan\theta = \frac{1 - r^2}{r} \cdot \frac{\sigma_x\sigma_y}{\sigma_x^2 + \sigma_y^2} \]
When r = 0, tan θ is infinite, so θ = π/2 and the two lines of regression
are perpendicular to each other. Hence the estimated value of y is the
same for all values of x and vice versa.
When r = ±1, tan θ = 0 so that θ = 0 or π.
Hence the lines of regression coincide and there is perfect correlation
between the two variates x and y.

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 166


Linear Regression(CO1)

Q. The equations of two regression lines, obtained in a correlation
analysis of 60 observations, are:
5x = 6y + 24 and 1000y = 768x − 3608.
What is the correlation coefficient? Show that the ratio of the coefficient
of variability of x to that of y is 5/24. What is the ratio of the variances of x and y?
Solution: The regression line of x on y is
5x = 6y + 24
\[ x = \frac{6}{5}\,y + \frac{24}{5} \]
\[ b_{xy} = \frac{6}{5} \]
The regression line of y on x is

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 167


Linear Regression(CO1)

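1000y = 768x − 3608
y = 0.768x − 3.608
b_yx = 0.768
Hence
\[ r\,\frac{\sigma_x}{\sigma_y} = \frac{6}{5} \quad (3) \]
\[ r\,\frac{\sigma_y}{\sigma_x} = 0.768 \quad (4) \]
Multiplying equations (3) and (4), we get
r² = 0.9216 ⇒ r = 0.96 (positive, since both regression coefficients are positive)
Dividing (3) by (4), we get
\[ \frac{\sigma_x^2}{\sigma_y^2} = \frac{6/5}{0.768} = 1.5625 \]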

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 168


Linear Regression(CO1)

Taking the square root, we get
\[ \frac{\sigma_x}{\sigma_y} = 1.25 = \frac{5}{4} \]
Since the regression lines pass through the point (x̄, ȳ), we have
5x̄ = 6ȳ + 24
1000ȳ = 768x̄ − 3608
Solving the above equations for x̄ and ȳ, we get x̄ = 6, ȳ = 1.
Coefficient of variability of x = σx/x̄
Coefficient of variability of y = σy/ȳ
\[ \text{Required ratio} = \frac{\sigma_x}{\bar{x}} \times \frac{\bar{y}}{\sigma_y} = \frac{\bar{y}}{\bar{x}} \cdot \frac{\sigma_x}{\sigma_y} = \frac{1}{6} \times \frac{5}{4} = \frac{5}{24} \]

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 169
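
The steps of this solution can be verified numerically. The sketch below reads the two regression coefficients off the given lines, recovers r and σx/σy, and solves for the means by Cramer's rule; all variable names are illustrative assumptions.

```python
import math

b_xy = 6 / 5        # from 5x = 6y + 24      (x on y)
b_yx = 768 / 1000   # from 1000y = 768x - 3608 (y on x)

# r is the geometric mean of the regression coefficients, with their common sign.
r = math.copysign(math.sqrt(b_xy * b_yx), b_xy)
sd_ratio = math.sqrt(b_xy / b_yx)   # sigma_x / sigma_y

# Both lines pass through (x_bar, y_bar):
#   5*x - 6*y = 24   and   768*x - 1000*y = 3608
a1, b1, c1 = 5, -6, 24
a2, b2, c2 = 768, -1000, 3608
det = a1 * b2 - a2 * b1
x_bar = (c1 * b2 - c2 * b1) / det
y_bar = (a1 * c2 - a2 * c1) / det

# Ratio of the coefficients of variability of x and y.
cv_ratio = sd_ratio * y_bar / x_bar

print(round(r, 2), round(sd_ratio, 2), x_bar, y_bar, round(cv_ratio, 4))
```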


Non-Linear Regression(CO1)

➢ NON-LINEAR REGRESSION:
Let
\[ y = a + bx + cx^2 \]
be a second-degree parabolic curve of regression of y on x. The normal equations are
\[ \sum y = na + b\sum x + c\sum x^2 \]
\[ \sum xy = a\sum x + b\sum x^2 + c\sum x^3 \]
\[ \sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4 \]

9/22/2022 Faculty Name Aakansha Vyas Unit Number 1 170


Multiple-Linear Regression(CO1)

➢ MULTIPLE LINEAR REGRESSION:


Multiple linear regression is used where the dependent variable is a function
of two or more linear or non-linear independent variables. Consider such a linear function as
\[ y = a + bx + cz \]
The normal equations are
\[ \sum y = ma + b\sum x + c\sum z \]
\[ \sum xy = a\sum x + b\sum x^2 + c\sum xz \]
\[ \sum yz = a\sum z + b\sum xz + c\sum z^2 \]
Solving the above equations, we get the values of a, b and c. The resulting
linear function y = a + bx + cz is called the regression plane.



Multiple Linear Regression(CO1)

Q. Obtain a regression plane by using multiple linear regression to fit the data given below.

x    1    2    3    4
y   12   18   24   30
z    0    1    2    3

Sol. Let $y = a + bx + cz$ be the required regression plane, where $a$, $b$, $c$ are the constants to be determined by the following equations:
$\sum y = ma + b\sum x + c\sum z$
$\sum xy = a\sum x + b\sum x^2 + c\sum xz$

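$\sum yz = a\sum z + b\sum xz + c\sum z^2$

Here $m = 4$, $\sum x = 10$, $\sum y = 84$, $\sum z = 6$, $\sum x^2 = 30$, $\sum z^2 = 14$, $\sum xy = 240$, $\sum xz = 20$, $\sum yz = 156$. Substitution yields
$84 = 4a + 10b + 6c$
$240 = 10a + 30b + 20c$
$156 = 6a + 20b + 14c$
These equations are satisfied by $a = 10$, $b = 2$, $c = 4$. (In this data $z = x - 1$, so the three equations are not independent and this solution is not the only one.)
Hence a required regression plane is
$y = 10 + 2x + 4z$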
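The sums and the stated constants of this example can be verified with a few lines of code. The sketch below is only a check, with illustrative names. It also computes the determinant of the normal equations, which is zero here because every $z$ value equals $x - 1$; the second triple tested, $a = 6$, $b = 6$, $c = 0$, is an extra solution chosen to illustrate that.

```python
xs = [1, 2, 3, 4]
ys = [12, 18, 24, 30]
zs = [0, 1, 2, 3]
m = len(xs)

# Sums that appear in the three normal equations.
sx, sy, sz = sum(xs), sum(ys), sum(zs)
sxx = sum(x * x for x in xs)
szz = sum(z * z for z in zs)
sxz = sum(x * z for x, z in zip(xs, zs))
sxy = sum(x * y for x, y in zip(xs, ys))
syz = sum(y * z for y, z in zip(ys, zs))

coeffs = [[m, sx, sz],
          [sx, sxx, sxz],
          [sz, sxz, szz]]
rhs = [sy, sxy, syz]


def satisfies(a, b, c):
    """True if (a, b, c) satisfies all three normal equations exactly."""
    return all(row[0] * a + row[1] * b + row[2] * c == value
               for row, value in zip(coeffs, rhs))


slide_solution_ok = satisfies(10, 2, 4)   # y = 10 + 2x + 4z
other_solution_ok = satisfies(6, 6, 0)    # y = 6 + 6x fits the same data

# Determinant of the coefficient matrix (cofactor expansion along the first row).
determinant = (coeffs[0][0] * (coeffs[1][1] * coeffs[2][2] - coeffs[1][2] * coeffs[2][1])
               - coeffs[0][1] * (coeffs[1][0] * coeffs[2][2] - coeffs[1][2] * coeffs[2][0])
               + coeffs[0][2] * (coeffs[1][0] * coeffs[2][1] - coeffs[1][1] * coeffs[2][0]))

print(rhs, slide_solution_ok, other_solution_ok, determinant)
```

A zero determinant means the data cannot separate the effects of $x$ and $z$; any constants with $a - c = 6$ and $b + c = 6$ reproduce the given $y$ values.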



Daily Quiz(CO1)

Q1. Two lines of regression are given by $7x - 16y + 9 = 0$ and
$-4x + 5y - 3 = 0$, and $\mathrm{var}(x) = 16$. Calculate
(i) the means of $x$ and $y$
(ii) the variance of $y$
(iii) the correlation coefficient.



Weekly Assignment(CO1)

Q1. Fit a straight-line trend by the method of least squares (taking 1978
as the year of origin) to the following data, and obtain the trend values:

Year         1979  1980  1981  1982  1983  1984
Production      5     7     9    10    12    17


Q2. From the following data, calculate Karl Pearson's coefficient
of skewness:

Marks less than    10   20   30   40   50   60   70
No. of students    10   30   60  110  150  180  200

Q3. Write the regression equations of X on Y and of Y on X for the
following data:

X   1   2   3   4   5
Y   2   4   5   3   6

Q4. Fit a straight-line trend by the method of least squares to the
following data:

Year                           2012  2013  2014  2015  2016  2017
Sales of T.V. sets (in '000)      7    10    12    14    17    24



Faculty Video Links, YouTube & NPTEL Video Links and Online
Course Details

YouTube/other video links:


• https://youtu.be/wWenULjri40
• https://youtu.be/mL9-WX7wLAo
• https://youtu.be/nPsfqz9EljY
• https://youtu.be/nqPS29IvnHk
• https://youtu.be/aaQXMbpbNKw
• https://youtu.be/wDXMYRPup0Y
• https://youtu.be/m9a6rg0tNSM
• https://youtu.be/Qy1YAKZDA7k
• https://youtu.be/s94k4H6AE54
• https://youtu.be/lBB4stn3exM
• https://youtu.be/0WejW9MiTGg
• https://youtu.be/QAEZOhE13Wg
• https://youtu.be/ddYNq1TxtM0
• https://youtu.be/YciBHHeswBM
• https://youtu.be/VCJdg7YBbAQ
• https://youtu.be/yhzJxftDgms



MCQ

Q1. Which of the following is true?
i. Correlation helps to determine the validity of a test.
ii. Correlation helps to determine the reliability of a test.
iii. Correlation indicates the nature of the relationship between two
variables.
iv. All of the above
Q2. Which of the following is true?
i. If $b_{xy} > 1$, then $b_{yx} < 1$.
ii. $\frac{b_{yx} + b_{xy}}{2} > r$
iii. $\frac{b_{yx} + b_{xy}}{4} > 2r$
iv. If $b_{yx} > 1$, then $b_{xy} < 1$.


Q3. The sum of squares of the items is 2430, the mean is 7 and N = 12. Find the variance.
i. 176.5
ii. 12.38
iii. 153.26
iv. 14
Q4. Calculate the standard deviation of the following:
9, 8, 6, 5, 8, 6
i. 2
ii. 3
iii. 1.414
iv. 2.414



Glossary (CO1)

Q1. An incomplete distribution is given below:

x   10-20  20-30  30-40  40-50  50-60  60-70  70-80
f      12     30      X     65      Y     25     18

Given that the median value is 46 and N = 229, find:
i. X
ii. Y
iii. Mean
iv. Mode
Pick the correct option from the glossary:
a. 45.82
b. 33.5
c. 46.07
d. 45


Q2. For the following:


i. Equation of line y on x
ii. Regression coefficient x on y
iii. Correlation coefficient
iv. Equation of line x on y
Pick the correct option from glossary
a. $x - \bar{x} = b_{xy}(y - \bar{y})$
b. $r(x, y)$
c. $y - \bar{y} = b_{yx}(x - \bar{x})$
d. $b_{xy}$



Expected Questions for University Exam

• Set A.docx



Recap Unit 1
✓ Measures of central tendency – mean, median, mode
✓ Measures of dispersion – mean deviation, standard
deviation, quartile deviation, variance
✓ Moments
✓ Skewness and kurtosis
✓ Least squares principles of curve fitting
✓ Correlation
✓ Regression analysis
✓ Time series analysis




