PS CH1,2,3


2.1 Introduction
In practical statistics, we often need to find a relationship between two or more variables.
➔ For example, the weight and height of a person, demand and supply, or expenditure as a function of income.
Such relations may, in general, be expressed by a polynomial, or they may be exponential or logarithmic. To determine such a relationship, it is first required to collect data showing corresponding values of the variables under consideration. Suppose
\[ (x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n) \tag{2.1} \]
be the data showing corresponding values of the variables x and y under consideration. If we plot these data points on a rectangular coordinate system, the set of points so plotted forms a scatter diagram. From this diagram, it is sometimes possible to visualise a smooth curve approximating the data. Such a curve is called an approximating curve.
In particular, if the data are approximated well by a straight line, we say that a linear relationship exists between the variables. It is quite possible that the relationship is nonlinear. Thus, the general problem of finding a functional relationship of the form y = f(x) between two variables x and y, giving the approximating curve that approximately fits the given data (2.1), is called curve fitting.
The fitting of curves to a set of numerical data is of considerable importance from the theoretical as well as the practical statistics point of view. Theoretically, it is useful in the study of correlation and regression (lines of regression can be regarded as the fitting of linear curves to a given bivariate frequency or probability distribution). In practical statistics, curve fitting enables us to represent a close functional relationship between two variables by polynomial, exponential or logarithmic functions using the principle of least squares.

Ch.2 Curve Fitting
2.2 The Method of Least Squares
The method of least squares assumes that the best-fit curve of a given type is the one that has the minimum sum of the squares of the deviations (least square error) from a given set of data.
Suppose that the data points are $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, where x is the independent and y is the dependent variable. Let the fitting curve f(x) have the following deviations (or errors or residuals) from each data point:
\[ d_1 = y_1 - f(x_1),\quad d_2 = y_2 - f(x_2),\quad \ldots,\quad d_n = y_n - f(x_n). \]
Clearly, some of the deviations will be positive and others negative. Thus, to give equal weightage to each error, we square each of these and form their sum; that is,
\[ D = d_1^2 + d_2^2 + \cdots + d_n^2. \]

Now, according to the method of least squares, the best fitting curve has the property that
\[ D = d_1^2 + d_2^2 + \cdots + d_n^2 = \sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} [y_i - f(x_i)]^2 = \text{a minimum}. \]
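The criterion can be made concrete with a few lines of Python. This is only an illustrative sketch: the helper name `sum_squared_deviations` is hypothetical, the data are the five points of Example 2.1, and the second candidate line y = x is an arbitrary choice for comparison.

```python
def sum_squared_deviations(f, xs, ys):
    """Return D, the sum of [y_i - f(x_i)]^2 over all data points."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys))


# Data of Example 2.1 (used here only for illustration).
xs = [1, 2, 3, 4, 5]
ys = [1, 3, 5, 6, 5]

# Candidate 1: the least squares line found in Example 2.1.
d_fitted = sum_squared_deviations(lambda x: 0.7 + 1.1 * x, xs, ys)
# Candidate 2: an arbitrary alternative line, y = x.
d_other = sum_squared_deviations(lambda x: 1.0 * x, xs, ys)

print(d_fitted, d_other)  # roughly 3.9 versus 9.0
```

The least squares line gives the smaller value of D, which is what the minimum property asserts for every other straight line as well.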
Note The principle of least squares does not help in determining the form of the appropriate curve which can fit the given data, but it helps only in determining the best possible values of the constants of the resulting equations when the approximate form of the curve is known in advance.
2.2.1 Fitting of a Straight Line
Suppose the equation of a straight line of the form $y = a + bx$ is to be fitted to the n-data points
\[ (x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n);\quad n \ge 2, \]
where a is the y-intercept and b is its slope (refer Figure 2.1).

[Figure 2.1: The line y = a + bx, a data point (x_i, y_i), and its vertical deviation d_i from the line.]

For the general point $(x_i, y_i)$, if the vertical distance of this point from the line $y = a + bx$ is the deviation $d_i$, then
\[ d_i = y_i - f(x_i) = y_i - a - bx_i. \]

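Applying the method of least squares, the values of a and b are so determined that they minimise
\[ D = \sum_{i=1}^{n} (y_i - a - bx_i)^2. \]
This will be so, if
\[ \frac{\partial D}{\partial a} = 0 \;\Rightarrow\; -2\sum_{i=1}^{n} (y_i - a - bx_i) = 0, \]
\[ \frac{\partial D}{\partial b} = 0 \;\Rightarrow\; -2\sum_{i=1}^{n} x_i (y_i - a - bx_i) = 0. \]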

Simplifying and expanding the above equations, we have
\[ \sum_{i=1}^{n} y_i = a\sum_{i=1}^{n} 1 + b\sum_{i=1}^{n} x_i, \]
\[ \sum_{i=1}^{n} x_i y_i = a\sum_{i=1}^{n} x_i + b\sum_{i=1}^{n} x_i^2, \]
which implies
\[ \sum_{i=1}^{n} y_i = an + b\sum_{i=1}^{n} x_i, \tag{2.2} \]
\[ \sum_{i=1}^{n} x_i y_i = a\sum_{i=1}^{n} x_i + b\sum_{i=1}^{n} x_i^2. \tag{2.3} \]

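Equations (2.2) and (2.3) are known as normal equations or least square equations. From these equations, we have
\[ a = \frac{\left(\sum_{i=1}^{n} y_i\right)\left(\sum_{i=1}^{n} x_i^2\right) - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} x_i y_i\right)}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}, \tag{2.4} \]
\[ b = \frac{n\sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}. \tag{2.5} \]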
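One way to solve the normal equations (2.2) and (2.3) for a and b is Cramer's rule. The following is a sketch of that step, assuming the x_i are not all equal so that the determinant Δ is nonzero; the symbol Δ is introduced here only for this derivation.

```latex
\[
\Delta =
\begin{vmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{vmatrix}
= n\sum x_i^2 - \Bigl(\sum x_i\Bigr)^2 ,
\]
\[
a = \frac{1}{\Delta}
\begin{vmatrix} \sum y_i & \sum x_i \\ \sum x_i y_i & \sum x_i^2 \end{vmatrix}
= \frac{\bigl(\sum y_i\bigr)\bigl(\sum x_i^2\bigr) - \bigl(\sum x_i\bigr)\bigl(\sum x_i y_i\bigr)}{\Delta},
\qquad
b = \frac{1}{\Delta}
\begin{vmatrix} n & \sum y_i \\ \sum x_i & \sum x_i y_i \end{vmatrix}
= \frac{n\sum x_i y_i - \bigl(\sum x_i\bigr)\bigl(\sum y_i\bigr)}{\Delta}.
\]
```

All sums run over i = 1, ..., n.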

Substituting these values of a and b in the equation of a straight line $y = a + bx$, the required best straight line fit to the given data is obtained.

Example 2.1
Using the method of least squares, find the best fitting straight line to the following data.
x    1    2    3    4    5
y    1    3    5    6    5
Solution
Here, n = 5. Let y = a + bx be the required straight line fit. In order to find a and b, let us first calculate the following table.

       x_i     y_i     x_i^2   x_i y_i
       1       1       1       1
       2       3       4       6
       3       5       9       15
       4       6       16      24
       5       5       25      25
Σ      15      20      55      71
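Using (2.4),
\[ a = \frac{\left(\sum y_i\right)\left(\sum x_i^2\right) - \left(\sum x_i\right)\left(\sum x_i y_i\right)}{n\sum x_i^2 - \left(\sum x_i\right)^2} = \frac{(20)(55) - (15)(71)}{(5)(55) - (15)^2} = 0.7. \]
Using (2.5),
\[ b = \frac{n\sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{n\sum x_i^2 - \left(\sum x_i\right)^2} = \frac{(5)(71) - (15)(20)}{(5)(55) - (15)^2} = 1.1. \]
Therefore, the best fitted straight line is
\[ y = a + bx = 0.7 + 1.1x. \]  Answer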
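The hand computation of Example 2.1 can be cross-checked with a short Python sketch of the closed forms (2.4) and (2.5). The function name `fit_line` is hypothetical, and the check for a zero denominator is an added safeguard, not part of the text.

```python
def fit_line(xs, ys):
    """Least squares line y = a + b*x using the closed forms (2.4) and (2.5)."""
    n = len(xs)
    sx = sum(xs)                                # sum of x_i
    sy = sum(ys)                                # sum of y_i
    sxx = sum(x * x for x in xs)                # sum of x_i^2
    sxy = sum(x * y for x, y in zip(xs, ys))    # sum of x_i * y_i
    denom = n * sxx - sx ** 2
    if denom == 0:
        raise ValueError("all x values are equal; the line is not determined")
    a = (sy * sxx - sx * sxy) / denom           # equation (2.4)
    b = (n * sxy - sx * sy) / denom             # equation (2.5)
    return a, b


a, b = fit_line([1, 2, 3, 4, 5], [1, 3, 5, 6, 5])
print(a, b)  # 0.7 1.1
```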
Figure 2.2 shows a plot of the given data and the corresponding fitted straight line.

[Figure 2.2: Scatter plot of the data points (1,1), (2,3), (3,5), (4,6), (5,5) with the fitted line y = 0.7 + 1.1x.]
2.2.2 Fitting of a Second Degree Curve
Suppose the equation of a second degree curve of the form $y = a + bx + cx^2$ is to be fitted to the n-data points
\[ (x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n);\quad n \ge 3, \]
where a, b, c are unknown coefficients.
Applying the method of least squares, the values of a, b and c are so determined that they minimise
\[ D = \sum_{i=1}^{n} (y_i - a - bx_i - cx_i^2)^2. \]
This will be so, if
\[ \frac{\partial D}{\partial a} = 0,\quad \frac{\partial D}{\partial b} = 0,\quad \frac{\partial D}{\partial c} = 0, \]
which implies (by following a procedure similar to that in Section 2.2.1) the normal equations as
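\[ \sum_{i=1}^{n} y_i = an + b\sum_{i=1}^{n} x_i + c\sum_{i=1}^{n} x_i^2, \tag{2.6} \]
\[ \sum_{i=1}^{n} x_i y_i = a\sum_{i=1}^{n} x_i + b\sum_{i=1}^{n} x_i^2 + c\sum_{i=1}^{n} x_i^3, \tag{2.7} \]
\[ \sum_{i=1}^{n} x_i^2 y_i = a\sum_{i=1}^{n} x_i^2 + b\sum_{i=1}^{n} x_i^3 + c\sum_{i=1}^{n} x_i^4. \tag{2.8} \]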

Solving these equations, we get a, b and c. Substituting these values in the equation of a second degree curve $y = a + bx + cx^2$, the required best second degree curve fit to the given data is obtained.
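As an illustration of how the coefficients of (2.6) to (2.8) are assembled from the data, here is a minimal Python sketch. The function name `quadratic_normal_equations` is hypothetical, and the sample data are those of Example 2.2.

```python
def quadratic_normal_equations(xs, ys):
    """Return the augmented 3x4 matrix of the normal equations (2.6)-(2.8).

    Row r holds the coefficients of a, b, c followed by the right-hand side.
    """
    # power_sums[k] = sum of x_i^k for k = 0..4 (power_sums[0] is n).
    power_sums = [sum(x ** k for x in xs) for k in range(5)]
    # moments[k] = sum of x_i^k * y_i for k = 0..2.
    moments = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    return [
        [power_sums[r + c] for c in range(3)] + [moments[r]]
        for r in range(3)
    ]


rows = quadratic_normal_equations([1, 2, 3, 4], [6, 11, 18, 27])
for row in rows:
    print(row)
# [4, 10, 30, 62]
# [10, 30, 100, 190]
# [30, 100, 354, 644]
```

Each printed row corresponds to one of the three normal equations, in the order (2.6), (2.7), (2.8).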

Example 2.2
Using the method of least squares, find the best fitting second degree curve to the following data.
x    1    2    3    4
y    6    11   18   27
Solution
Here, n = 4. Let $y = a + bx + cx^2$ be the required second degree curve. In order to find a, b, c, let us first calculate the following table.

       x_i    y_i    x_i^2   x_i^3   x_i^4   x_i y_i   x_i^2 y_i
       1      6      1       1       1       6         6
       2      11     4       8       16      22        44
       3      18     9       27      81      54        162
       4      27     16      64      256     108       432
Σ      10     62     30      100     354     190       644

Using (2.6) - (2.8),
62 = 4a + 10b + 30c
190 = 10a + 30b + 100c
644 = 30a + 100b + 354c
By the partial pivoting procedure, the given system can be rewritten as
30a + 100b + 354c = 644
10a + 30b + 100c = 190
4a + 10b + 30c = 62
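The augmented matrix is
\[ \left[\begin{array}{ccc|c} 30 & 100 & 354 & 644 \\ 10 & 30 & 100 & 190 \\ 4 & 10 & 30 & 62 \end{array}\right] \]
Operating $R_1(1/2)$, $R_2(1/2)$, $R_3(1/2)$, we get
\[ \sim \left[\begin{array}{ccc|c} 15 & 50 & 177 & 322 \\ 5 & 15 & 50 & 95 \\ 2 & 5 & 15 & 31 \end{array}\right] \]
Operating $R_{21}(-1/3)$, $R_{31}(-2/15)$, we get
\[ \sim \left[\begin{array}{ccc|c} 15 & 50 & 177 & 322 \\ 0 & -\frac{5}{3} & -9 & -\frac{37}{3} \\ 0 & -\frac{5}{3} & -\frac{129}{15} & -\frac{179}{15} \end{array}\right] \]
Operating $R_{32}(-1)$, we get
\[ \sim \left[\begin{array}{ccc|c} 15 & 50 & 177 & 322 \\ 0 & -\frac{5}{3} & -9 & -\frac{37}{3} \\ 0 & 0 & \frac{2}{5} & \frac{2}{5} \end{array}\right] \]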
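By back substitution,
\[ \tfrac{2}{5}c = \tfrac{2}{5} \;\Rightarrow\; c = 1. \]
\[ -\tfrac{5}{3}b - 9c = -\tfrac{37}{3} \;\Rightarrow\; -\tfrac{5}{3}b - 9(1) = -\tfrac{37}{3} \;\Rightarrow\; -\tfrac{5}{3}b = -\tfrac{37}{3} + 9 \;\Rightarrow\; -\tfrac{5}{3}b = -\tfrac{10}{3} \;\Rightarrow\; b = 2. \]
\[ 15a + 50b + 177c = 322 \;\Rightarrow\; 15a + 50(2) + 177(1) = 322 \;\Rightarrow\; 15a = 45 \;\Rightarrow\; a = 3. \]
Therefore, the best fitted second degree curve is
\[ y = 3 + 2x + x^2. \]  Answer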
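The elimination in Example 2.2 can be reproduced with exact rational arithmetic. The following Python sketch uses Gaussian elimination with partial pivoting and back substitution; the function name `solve_linear_system` is hypothetical, and unlike the hand computation it does not halve the rows first, which does not affect the solution.

```python
from fractions import Fraction


def solve_linear_system(augmented):
    """Solve a square linear system given as an augmented matrix [A | rhs]."""
    m = [[Fraction(v) for v in row] for row in augmented]
    n = len(m)
    for col in range(n):
        # Partial pivoting: move the row with the largest |entry| in this column up.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    # Back substitution, from the last unknown to the first.
    solution = [Fraction(0)] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (m[r][n] - known) / m[r][r]
    return solution


# Normal equations of Example 2.2, in the order (2.6), (2.7), (2.8).
a, b, c = solve_linear_system([
    [4, 10, 30, 62],
    [10, 30, 100, 190],
    [30, 100, 354, 644],
])
print(a, b, c)  # 3 2 1
```

Using `Fraction` keeps every intermediate value exact, so the result matches the hand computation without rounding.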
2.2.3 Fitting of an Exponential Curve
Suppose the exponential curve of the form $y = ae^{bx}$ is to be fitted to the n-data points
\[ (x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n), \]
where a, b are unknown coefficients.
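Taking the common logarithm (base 10) on both sides of $y = ae^{bx}$, we get
\[ \log_{10} y = \log_{10} a + bx\log_{10} e. \]
Let
\[ Y = \log_{10} y,\quad A = \log_{10} a \quad\text{and}\quad B = b\log_{10} e, \]
then the above equation becomes
\[ Y = A + Bx. \]
The above equation is linear in x and Y, since $Y = \log_{10} y$ is known. The normal equations, using (2.2) and (2.3), become
\[ \sum_{i=1}^{n} Y_i = nA + B\sum_{i=1}^{n} x_i, \tag{2.9} \]
\[ \sum_{i=1}^{n} x_i Y_i = A\sum_{i=1}^{n} x_i + B\sum_{i=1}^{n} x_i^2. \tag{2.10} \]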
From the above equations, A and B can be found, and consequently
\[ a = \operatorname{antilog} A,\quad b = \frac{B}{\log_{10} e} \]
can be calculated. Substituting these in the equation of an exponential curve $y = ae^{bx}$, the required best exponential fit to the given data is obtained.
Note One can similarly fit the curve $y = ab^x$.
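The linearisation above translates directly into code. The Python sketch below is one possible reading of the procedure: the function name `fit_exponential` is hypothetical, the positivity check is an added safeguard, and the sample data are those of Example 2.3. Because A is kept at full precision instead of being rounded to two decimals, the value of a comes out near 4.64 rather than the hand-computed 4.68.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by a least squares line through (x, log10 y)."""
    if any(y <= 0 for y in ys):
        raise ValueError("all y values must be positive to take logarithms")
    n = len(xs)
    big_y = [math.log10(y) for y in ys]          # Y_i = log10(y_i)
    sx = sum(xs)
    sy = sum(big_y)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * v for x, v in zip(xs, big_y))
    denom = n * sxx - sx ** 2
    big_b = (n * sxy - sx * sy) / denom          # slope B of Y = A + B x
    big_a = (sy - big_b * sx) / n                # intercept A, from (2.9)
    a = 10 ** big_a                              # a = antilog A
    b = big_b / math.log10(math.e)               # b = B / log10(e)
    return a, b


a, b = fit_exponential([0, 2, 4], [5.012, 10, 31.62])
print(round(a, 2), round(b, 2))
```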

Example 2.3
Fit the curve $y = ae^{bx}$ to the following data.
x    0        2     4
y    5.012    10    31.62
Solution
Here, n = 3. Let us first calculate the following table.

       x_i    y_i      Y_i = log10 y_i   x_i^2   x_i Y_i
       0      5.012    0.70              0       0
       2      10       1.00              4       2
       4      31.62    1.50              16      6
Σ      6               3.2               20      8

Using equations (2.9) and (2.10),
3.2 = 3A + 6B      ...(i)
8 = 6A + 20B       ...(ii)
Multiplying (i) by (-2) and adding to (ii), we get
1.6 = 8B ⇒ B = 0.2.
Using (ii),
8 = 6A + 20(0.2) ⇒ A ≈ 0.67.
Therefore,
\[ a = \operatorname{antilog} A \approx \operatorname{antilog}(0.67) \approx 4.68, \]
\[ b = \frac{B}{\log_{10} e} \approx \frac{0.2}{0.4343} \approx 0.46. \]
Therefore, the best fitted exponential curve is
\[ y = 4.68\,e^{0.46x}. \]  Answer
2.2.4 Fitting of a Geometric (Power) Curve
Suppose the geometric curve of the form $y = ax^b$ is to be fitted to the n-data points
\[ (x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n), \]
where a and b are unknown coefficients.

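Taking the common logarithm (base 10) on both sides of $y = ax^b$, we get
\[ \log_{10} y = \log_{10} a + b\log_{10} x. \]
Let
\[ Y = \log_{10} y,\quad A = \log_{10} a \quad\text{and}\quad X = \log_{10} x, \]
then the above equation becomes
\[ Y = A + bX. \]
The above equation is linear in X and Y, since $Y = \log_{10} y$ and $X = \log_{10} x$ are known. The normal equations, using (2.2) and (2.3), become
\[ \sum_{i=1}^{n} Y_i = nA + b\sum_{i=1}^{n} X_i, \tag{2.11} \]
\[ \sum_{i=1}^{n} X_i Y_i = A\sum_{i=1}^{n} X_i + b\sum_{i=1}^{n} X_i^2. \tag{2.12} \]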
From the above equations, A and b can be found, and consequently
\[ a = \operatorname{antilog} A \]
can be calculated. Substituting these values of a and b in the equation of a geometric curve $y = ax^b$, the required best geometric fit to the given data is obtained.
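The same idea can be sketched in Python for the geometric curve. The function name `fit_power` is hypothetical and the positivity check is an added safeguard. The sample data are those of Review Exercise 05, which lie exactly on y = 0.5x², so the fit should return a close to 0.5 and b close to 2 up to floating-point rounding.

```python
import math


def fit_power(xs, ys):
    """Fit y = a * x**b by a least squares line through (log10 x, log10 y)."""
    if any(v <= 0 for v in list(xs) + list(ys)):
        raise ValueError("all x and y values must be positive to take logarithms")
    n = len(xs)
    big_x = [math.log10(x) for x in xs]          # X_i = log10(x_i)
    big_y = [math.log10(y) for y in ys]          # Y_i = log10(y_i)
    sx = sum(big_x)
    sy = sum(big_y)
    sxx = sum(u * u for u in big_x)
    sxy = sum(u * v for u, v in zip(big_x, big_y))
    denom = n * sxx - sx ** 2
    b = (n * sxy - sx * sy) / denom              # slope b of Y = A + b X
    big_a = (sy - b * sx) / n                    # intercept A, from (2.11)
    return 10 ** big_a, b                        # a = antilog A


a, b = fit_power([1, 2, 3, 4, 5], [0.5, 2, 4.5, 8, 12.5])
print(round(a, 4), round(b, 4))
```

Hand computations with four-figure logarithms can differ from this in the third or fourth decimal place, because the slope is sensitive to rounding in the logarithms.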

Example 2.4
Fit the curve $y = ax^b$ to the following data.
x    61     26     7      2.6
y    350    400    500    600
Solution
Here, n = 4. Let us first calculate the following table.

       x_i    y_i    X_i = log10 x_i   Y_i = log10 y_i   X_i^2    X_i Y_i
       61     350    1.7853            2.5441            3.187    4.542
       26     400    1.4150            2.6021            2.002    3.682
       7      500    0.8451            2.6990            0.714    2.281
       2.6    600    0.4150            2.7782            0.172    1.153
Σ                    4.4604            10.6234           6.075    11.658


Using equations (2.11) and (2.12),
10.6234 = 4A + 4.4604b          ...(i)
11.658 = 4.4604A + 6.075b       ...(ii)
Solving these equations, we get
A ≈ 2.845, b ≈ -0.1697.
Therefore,
a = antilog A ≈ antilog(2.845) ≈ 699.8.
Therefore, the best fitted geometric curve is
\[ y = 699.8\,x^{-0.1697}. \]  Answer
Special Case When the data values are very large, then for the sake of convenience and ease of calculations, it is sometimes advisable to change the origin and scale using the substitution
\[ X = \frac{x - A}{h} \quad\text{and}\quad Y = \frac{y - B}{h}, \]
where A and B are the assumed means (or middle values) of the x and y series, respectively, and h is the width of the interval.

Example 2.5
Determine the equation of a straight line which best fits the following data.
x    2003    2004    2005    2006    2007
y    35      56      79      80      40
Solution
Here, n = 5. Let y = a + bx be the required straight line fit. Let us first calculate the following table.

       x_i     y_i    X_i = x_i - 2005   X_i^2   X_i y_i
       2003    35     -2                 4       -70
       2004    56     -1                 1       -56
       2005    79     0                  0       0
       2006    80     1                  1       80
       2007    40     2                  4       80
Σ              290    0                  10      34
Using equations (2.2) and (2.3),
5a + 0b = 290      ...(i)
0a + 10b = 34      ...(ii)
Solving these equations, we get
a = 290/5 = 58 and b = 34/10 = 3.4.
Therefore, the best fitted straight line is
y = a + bX = 58 + 3.4X.
Replacing X by (x - 2005), we get
y = 58 + 3.4(x - 2005) = -6759 + 3.4x.  Answer
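The change of origin used in Example 2.5 can be sketched in Python as follows. The function name `fit_line_shifted` and its parameters are hypothetical, only x is shifted and scaled here (y is left unchanged, as in the example), and the y values are those read from the solution table of Example 2.5.

```python
def fit_line_shifted(xs, ys, origin, scale=1):
    """Fit y = a + b*x after substituting X = (x - origin) / scale.

    Returns (a_shifted, b_shifted, a, b): the coefficients of
    y = a_shifted + b_shifted * X and of y = a + b * x.
    """
    big_x = [(x - origin) / scale for x in xs]
    n = len(xs)
    sx = sum(big_x)
    sy = sum(ys)
    sxx = sum(u * u for u in big_x)
    sxy = sum(u * y for u, y in zip(big_x, ys))
    denom = n * sxx - sx ** 2
    a_shifted = (sy * sxx - sx * sxy) / denom    # equation (2.4) in X
    b_shifted = (n * sxy - sx * sy) / denom      # equation (2.5) in X
    # Undo the substitution: y = a_shifted + b_shifted * (x - origin) / scale.
    b = b_shifted / scale
    a = a_shifted - b * origin
    return a_shifted, b_shifted, a, b


a_shifted, b_shifted, a, b = fit_line_shifted(
    [2003, 2004, 2005, 2006, 2007], [35, 56, 79, 80, 40], origin=2005
)
print(a_shifted, b_shifted, round(a, 6), round(b, 6))
```

Choosing the origin at the middle value makes the sum of the X_i zero, so the two normal equations decouple and each coefficient is obtained from a single division.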
2.3 Short Questions

Example 2.6
Define curve fitting. [GTU, May 2016]
Solution
The process of finding the equation of the curve of best fit to the given data points, which may be most suitable for predicting the unknown values, is known as curve fitting. Answer

Example 2.7
What is meant by the curve of best fit?
[GTU, June 2017 - Comp.]
Solution
The curve of best fit is that curve for which the sum of squares of errors is minimum.
Answer
Review Exercises

01. Fit a straight line to the following data.
x    1    2    3
y    3    4    5


02. Fit a straight line to the following data. Using this equation, find the value of y when x = 2.4.
x    1      2      3      4      5      6      7
y    0.5    2.5    2.0    4.0    3.5    6.0    5.5
[GTU, May 2017]
03. If P is the pull required to lift a load W by means of a pulley block, find a linear approximation of the form P = mW + c connecting P and W, using the following data.
P    13    18    23     27
W    51    75    102    119
[GTU, June 2017 - Comp.]
04. Fit a second degree curve to the following data.
x    1    2    3    4    5     6     7     8     9
y    2    6    7    8    10    11    11    10    9
05. Fit a least square geometric curve $y = ax^b$ to the following data.
x    1      2    3      4    5
y    0.5    2    4.5    8    12.5
06. Fit a second degree parabola $y = a + bx + cx^2$ to the following data.
x    1.0    1.5    2.0    2.5    3.0    3.5    4.0
y    1.2    1.4    1.9    2.4    2.8    3.3    4.2
[GTU, June 2017 - Comp.]

Answers
Review Exercises
01. $y = 1.6 + 1.2x$
02. $y = 0.0714 + 0.8392x$, $y(2.4) = 2.0854$
03. $P = 0.2309W + 0.2186$
04. $y = -0.2673x^2 + 3.523x - 0.9283$
05. $a = 0.5012$, $b = 1.9977$
06. $y = 0.8353 + 0.1932x + 0.1571x^2$
