
8.11. CURVE-FITTING

Let there be two variables $x$ and $y$ which give us a set of $n$ pairs of numerical values $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$. In order to have an approximate idea about the relationship of these two variables, we plot these $n$ paired points on a graph; thus, we get a diagram showing the simultaneous variation in values of both the variables, called a scatter or dot diagram. From a scatter diagram, we get only an approximate non-mathematical relation between two variables.

Curve fitting means an exact relationship between two variables by algebraic equations; in fact, this relationship is the equation of the curve. Therefore, curve fitting means to form an equation of the curve from the given data. Curve fitting is considered of immense importance both from the point of view of theoretical and practical statistics.
COMPUTER BASED NUMERICAL AND STATISTICAL TECHNIQUES

Theoretically, curve fitting is useful in the study of correlation and regression. Practically, it enables us to represent the relationship between two variables by simple algebraic expressions, e.g., polynomials, exponential or logarithmic functions. It is also used to estimate the values of one variable corresponding to the specified values of the other variable.

The constants occurring in the equation of an approximate curve can be found by the following methods:
(i) Graphical method
(ii) Method of group averages
(iii) Method of least squares
(iv) Method of moments.
Out of the above four methods, we will only discuss and study here the method of least squares.
8.12. METHOD OF LEAST SQUARES

The method of least squares provides a unique set of values to the constants and hence suggests a curve of best fit to the given data.
Suppose we have $m$ paired observations $(x_1, y_1), (x_2, y_2), \dots, (x_m, y_m)$ of two variables $x$ and $y$. It is required to fit a polynomial of degree $n$ of the type

$$y = a + bx + cx^2 + \dots + kx^n \qquad \ldots(1)$$

to these values. We have to determine the constants $a, b, c, \dots, k$ such that it represents the curve of best fit of that degree.


In case $m = n + 1$, so that there are as many equations as constants, we get in general a unique set of values satisfying the given system of equations.

But if $m > n + 1$, then we get $m$ equations by putting different values of $x$ and $y$ in equation (1), and we want to find only the values of the $n + 1$ constants. Thus there may be no such solution to satisfy all $m$ equations.

Therefore we try to find out those values of $a, b, c, \dots, k$ which satisfy all the equations as nearly as possible. We apply the method of least squares in such cases.
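Putting $x_1, x_2, \dots, x_m$ for $x$ in (1), we get

$$\begin{aligned}
y_1' &= a + bx_1 + cx_1^2 + \dots + kx_1^n \\
y_2' &= a + bx_2 + cx_2^2 + \dots + kx_2^n \\
&\;\;\vdots \\
y_m' &= a + bx_m + cx_m^2 + \dots + kx_m^n
\end{aligned}$$

where $y_1', y_2', \dots, y_m'$ are the expected values of $y$ for $x = x_1, x_2, \dots, x_m$ respectively. The values $y_1, y_2, \dots, y_m$ are called observed values of $y$ corresponding to $x = x_1, x_2, \dots, x_m$ respectively.

The expected values are different from the observed values; the differences $y_r - y_r'$ for different values of $r$ are called residuals.

Introduce a new quantity $U$ such that

$$U = \sum (y_r - y_r')^2 = \sum (y_r - a - bx_r - cx_r^2 - \dots - kx_r^n)^2$$

The constants $a, b, c, \dots, k$ are chosen in such a way that the sum of the squares of the residuals is minimum.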
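Now the condition for $U$ to be maximum or minimum is

$$\frac{\partial U}{\partial a} = 0 = \frac{\partial U}{\partial b} = \frac{\partial U}{\partial c} = \dots = \frac{\partial U}{\partial k}$$

On simplifying these relations, we get

$$\begin{aligned}
\sum y &= ma + b\sum x + \dots + k\sum x^n \\
\sum xy &= a\sum x + b\sum x^2 + \dots + k\sum x^{n+1} \\
\sum x^2 y &= a\sum x^2 + b\sum x^3 + \dots + k\sum x^{n+2} \\
&\;\;\vdots \\
\sum x^n y &= a\sum x^n + b\sum x^{n+1} + \dots + k\sum x^{2n}
\end{aligned}$$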

These are known as Normal equations and can be solved as simultaneous equations to give the values of the constants $a, b, c, \dots, k$. These equations are $(n + 1)$ in number.

If we calculate the second order partial derivatives and these values are put, they give a positive value of the function, so $U$ is minimum.

This method does not help us to choose the degree of the curve to be fitted but helps us in finding the values of the constants when the form of the curve has already been chosen.
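The construction of the normal equations can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the text: the function names `normal_equations`, `solve` and `fit_polynomial` are hypothetical, the sample data are made up so that they lie exactly on $y = 1 + x + x^2$, and exact rational arithmetic is assumed so that no rounding enters.

```python
from fractions import Fraction

def normal_equations(xs, ys, degree):
    """Build the (degree + 1) normal equations as an augmented matrix."""
    size = degree + 1
    fx = [Fraction(str(x)) for x in xs]
    fy = [Fraction(str(y)) for y in ys]
    # power_sums[p] = sum of x**p, for p = 0 .. 2*degree (power_sums[0] is m)
    power_sums = [sum(x ** p for x in fx) for p in range(2 * degree + 1)]
    # moment_sums[p] = sum of x**p * y, for p = 0 .. degree
    moment_sums = [sum(x ** p * y for x, y in zip(fx, fy)) for p in range(size)]
    # Row r reads: sum(x**r * y) = a*sum(x**r) + b*sum(x**(r+1)) + ...
    return [[power_sums[r + c] for c in range(size)] + [moment_sums[r]]
            for r in range(size)]

def solve(aug):
    """Solve an augmented linear system by Gauss-Jordan elimination."""
    n = len(aug)
    aug = [row[:] for row in aug]
    for col in range(n):
        # Raises StopIteration if the system is singular (no nonzero pivot).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def fit_polynomial(xs, ys, degree):
    """Return [a, b, c, ...] for y = a + b*x + c*x**2 + ..."""
    return solve(normal_equations(xs, ys, degree))

# Made-up data lying exactly on y = 1 + x + x**2.
coeffs = fit_polynomial([0, 1, 2, 3], [1, 3, 7, 13], 2)
print([float(c) for c in coeffs])  # prints [1.0, 1.0, 1.0]
```

As in the text, the degree has to be chosen beforehand; the sketch only finds the constants for that degree.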

8.13. FITTING A STRAIGHT LINE

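Let $(x_i, y_i)$, $i = 1, 2, \dots, n$ be $n$ sets of observations of related data and

$$y = a + bx \qquad \ldots(1)$$

be the straight line to be fitted. The residual at $x = x_i$ is

$$E_i = y_i - f(x_i) = y_i - a - bx_i$$

Introduce a new quantity $U$ such that

$$U = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a - bx_i)^2$$

By the principle of Least squares, $U$ is minimum, so that

$$\frac{\partial U}{\partial a} = 0 \quad \text{and} \quad \frac{\partial U}{\partial b} = 0$$

i.e.,

$$2\sum_{i=1}^{n} (y_i - a - bx_i)(-1) = 0 \quad \text{or} \quad \sum y = na + b\sum x \qquad \ldots(2)$$

and

$$2\sum_{i=1}^{n} (y_i - a - bx_i)(-x_i) = 0 \quad \text{or} \quad \sum xy = a\sum x + b\sum x^2 \qquad \ldots(3)$$

Since $x_i, y_i$ are known, equations (2) and (3) result in two equations in $a$ and $b$. Solving these, the best values for $a$ and $b$ can be known and hence equation (1).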
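Equations (2) and (3) can be solved once and for all for $a$ and $b$. The following Python sketch does this; the function name `fit_line` is hypothetical and the data used are those of Example 1 in this section.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x from normal equations (2) and (3)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # Determinant of the 2x2 system: n*a + sum_x*b = sum_y,
    #                                sum_x*a + sum_xx*b = sum_xy
    det = n * sum_xx - sum_x * sum_x
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")
    b = (n * sum_xy - sum_x * sum_y) / det
    a = (sum_y - b * sum_x) / n
    return a, b

a, b = fit_line([0, 1, 2, 3, 4], [1, 1.8, 3.3, 4.5, 6.3])
print(round(a, 2), round(b, 2))  # prints 0.72 1.33
```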
Note. In case of change of origin, if $n$ is odd then

$$u = \frac{x - (\text{middle term})}{\text{interval } (h)}$$

but if $n$ is even then

$$u = \frac{x - (\text{mean of two middle terms})}{\frac{1}{2}(\text{interval})}$$
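A short Python sketch of this change of origin and scale follows. The helper name `coded_values` is hypothetical, and the sketch assumes the $x$ values are sorted and equally spaced. The data and the choice $y_0 = 20$ are those of Example 3 in this section.

```python
def coded_values(xs):
    """Code sorted, equally spaced x values as u = (x - origin) / scale."""
    n = len(xs)
    h = xs[1] - xs[0]                      # common interval
    if n % 2 == 1:
        origin, scale = xs[n // 2], h      # odd n: middle term, full interval
    else:
        origin = (xs[n // 2 - 1] + xs[n // 2]) / 2  # even n: mean of two middle terms
        scale = h / 2                               # and half the interval
    return origin, scale, [(x - origin) / scale for x in xs]

xs = [0, 5, 10, 15, 20, 25]
ys = [12, 15, 17, 22, 24, 30]
origin, scale, us = coded_values(xs)
vs = [y - 20 for y in ys]                  # y0 = 20, as chosen in Example 3

# Because sum(us) == 0, the two normal equations decouple.
a = sum(vs) / len(vs)
b = sum(u * v for u, v in zip(us, vs)) / sum(u * u for u in us)

# Undo the coding: v = a + b*u with u = (x - origin)/scale and v = y - 20.
slope = b / scale
intercept = 20 + a - b * origin / scale
print(round(slope, 3), round(intercept, 3))  # prints 0.697 11.286
```

The point of the coding is that $\sum u = 0$, so each normal equation contains a single unknown.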
EXAMPLES

Example 1. Fit a straight line to the following data:

x:   0     1     2     3     4
y:   1     1.8   3.3   4.5   6.3.

Sol. Let the straight line obtained from the given data be $y = a + bx$; then the normal equations are

$$\sum y = ma + b\sum x \qquad \ldots(1)$$
$$\sum xy = a\sum x + b\sum x^2 \qquad \ldots(2)$$

Here $m = 5$.

x      y      xy     x^2
0      1      0      0
1      1.8    1.8    1
2      3.3    6.6    4
3      4.5    13.5   9
4      6.3    25.2   16

$\sum x = 10$, $\sum y = 16.9$, $\sum xy = 47.1$, $\sum x^2 = 30$

From (1) and (2),

$$16.9 = 5a + 10b \quad \text{and} \quad 47.1 = 10a + 30b$$

Solving, we get $a = 0.72$, $b = 1.33$.

Required line is $y = 0.72 + 1.33x$.


Example 2. Fit a straight line to the following data regarding $x$ as the independent variable:

x:   1      2     3     4     5     6
y:   1200   900   600   200   110   50.

Sol. Let the equation of the straight line to be fitted be $y = a + bx$.
Here $m = 6$.

x      y      x^2    xy
1      1200   1      1200
2      900    4      1800
3      600    9      1800
4      200    16     800
5      110    25     550
6      50     36     300

$\sum x = 21$, $\sum y = 3060$, $\sum x^2 = 91$, $\sum xy = 6450$

From the normal equations, we get

$$3060 = 6a + 21b, \qquad 6450 = 21a + 91b$$

Solving, we get $a = 1362$, $b = -243.43$ (to two decimal places).

Required line is $y = 1362 - 243.43x$.
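The arithmetic of Example 2 can be checked with exact rational arithmetic. The sketch below applies Cramer's rule to the two normal equations $3060 = 6a + 21b$ and $6450 = 21a + 91b$; the helper name `solve_2x2` is hypothetical.

```python
from fractions import Fraction

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*p + b1*q = c1 and a2*p + b2*q = c2 by Cramer's rule."""
    det = Fraction(a1 * b2 - a2 * b1)
    if det == 0:
        raise ValueError("the system is singular")
    p = Fraction(c1 * b2 - c2 * b1) / det
    q = Fraction(a1 * c2 - a2 * c1) / det
    return p, q

a, b = solve_2x2(6, 21, 3060, 21, 91, 6450)
print(float(a), round(float(b), 2))  # prints 1362.0 -243.43
```

Exact arithmetic gives $a = 1362$ and $b = -1704/7 \approx -243.43$. In a hand computation, rounding $b$ before substituting it back can shift the last digits of $a$.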
Example 3. Show that the line of fit to the following data is given by $y = 0.7x + 11.285$:

x:   0     5     10    15    20    25
y:   12    15    17    22    24    30.

Sol. Since $m$ is even,
Let $x_0 = 12.5$, $h = 5$, $y_0 = 20$ (say).
Then let

$$u = \frac{x - 12.5}{2.5} \quad \text{and} \quad v = y - 20$$

x      y      u      v      uv     u^2
0      12     -5     -8     40     25
5      15     -3     -5     15     9
10     17     -1     -3     3      1
15     22     1      2      2      1
20     24     3      4      12     9
25     30     5      10     50     25

Total: $\sum u = 0$, $\sum v = 0$, $\sum uv = 122$, $\sum u^2 = 70$

Normal equations are $0 = 6a$ and $122 = 70b$, so that $a = 0$, $b = 1.743$.

Line of fit is $v = 1.743u$.

Put $u = \dfrac{x - 12.5}{2.5}$ and $v = y - 20$, we get

$$y = 0.7x + 11.285.$$
Example 4. Fit a straight line to the following data:

x:   71    68    73    69    67    65    66    67
y:   69    72    70    70    68    67    68    64.

Sol. Let the equation of the straight line to be fitted be

$$y = a + bx \qquad \ldots(1)$$

Normal equations are

$$\sum y = ma + b\sum x \qquad \ldots(2)$$

and

$$\sum xy = a\sum x + b\sum x^2 \qquad \ldots(3)$$

Here $m = 8$. Table is as below:

x      y      xy     x^2
71     69     4899   5041
68     72     4896   4624
73     70     5110   5329
69     70     4830   4761
67     68     4556   4489
65     67     4355   4225
66     68     4488   4356
67     64     4288   4489

$\sum x = 546$, $\sum y = 548$, $\sum xy = 37422$, $\sum x^2 = 37314$

Substituting these values in equations (2) and (3), we get

$$548 = 8a + 546b$$
$$37422 = 546a + 37314b$$

Solving, we get $a = 39.5454$, $b = 0.4242$.

Hence the required line of best fit is

$$y = 39.5454 + 0.4242x.$$
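Example 5. Show that the best fitting linear function for the points $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$ may be expressed in the form

$$\begin{vmatrix} x & y & 1 \\ \sum x_i & \sum y_i & n \\ \sum x_i^2 & \sum x_i y_i & \sum x_i \end{vmatrix} = 0 \qquad (i = 1, 2, \dots, n)$$

Show that the line passes through the mean point $(\bar{x}, \bar{y})$.

Sol. Let the best fitting linear function be

$$y = a + bx \qquad \ldots(1)$$

Then the normal equations are

$$\sum y_i = na + b\sum x_i \qquad \ldots(2)$$

and

$$\sum x_i y_i = a\sum x_i + b\sum x_i^2 \qquad \ldots(3)$$

Equations (1), (2), (3) may be rewritten as

$$bx - y + a = 0$$
$$b\sum x_i - \sum y_i + na = 0$$

and

$$b\sum x_i^2 - \sum x_i y_i + a\sum x_i = 0$$

Eliminating $a$ and $b$ between these equations,

$$\begin{vmatrix} x & y & 1 \\ \sum x_i & \sum y_i & n \\ \sum x_i^2 & \sum x_i y_i & \sum x_i \end{vmatrix} = 0 \qquad \ldots(4)$$

which is the required best fitting linear function. For the mean point $(\bar{x}, \bar{y})$,

$$\bar{x} = \frac{1}{n}\sum x_i, \qquad \bar{y} = \frac{1}{n}\sum y_i.$$

Clearly, the line (4) passes through the point $(\bar{x}, \bar{y})$: putting $x = \bar{x}$, $y = \bar{y}$ makes the first row equal to $1/n$ times the second row, and a determinant with two proportional rows is zero.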
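The determinant form (4) is easy to check numerically. The sketch below evaluates the determinant at the mean point, using the tabulated data of Example 2 of this section; the function names `det3` and `line_determinant` are hypothetical, and exact fractions are used so that the result is exactly zero rather than zero up to rounding.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def line_determinant(x, y, xs, ys):
    """Value of the determinant in (4) at the point (x, y)."""
    n = len(xs)
    rows = [
        [x, y, 1],
        [sum(xs), sum(ys), n],
        [sum(xi * xi for xi in xs),
         sum(xi * yi for xi, yi in zip(xs, ys)),
         sum(xs)],
    ]
    return det3(rows)

xs = [1, 2, 3, 4, 5, 6]
ys = [1200, 900, 600, 200, 110, 50]
x_bar = Fraction(sum(xs), len(xs))
y_bar = Fraction(sum(ys), len(ys))
print(line_determinant(x_bar, y_bar, xs, ys))  # prints 0
```

Any other point of the fitted line also makes the determinant vanish, while a point off the line does not.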

ASSIGNMENT
8.28. POLYNOMIAL FIT: NON-LINEAR REGRESSION

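Let $y = a + bx + cx^2$ be a second degree parabolic curve of regression of $y$ on $x$ to be fitted for the data $(x_i, y_i)$, $i = 1, 2, \dots, n$.

Residual at $x = x_i$ is

$$E_i = y_i - f(x_i) = y_i - a - bx_i - cx_i^2$$

Now, let

$$U = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a - bx_i - cx_i^2)^2$$

By the principle of Least squares, $U$ should be minimum for the best values of $a$, $b$ and $c$.

For this,

$$\frac{\partial U}{\partial a} = 0, \quad \frac{\partial U}{\partial b} = 0 \quad \text{and} \quad \frac{\partial U}{\partial c} = 0$$

$$\frac{\partial U}{\partial a} = 0 \;\Rightarrow\; 2\sum_{i=1}^{n} (y_i - a - bx_i - cx_i^2)(-1) = 0 \;\Rightarrow\; \sum y = na + b\sum x + c\sum x^2 \qquad \ldots(1)$$

$$\frac{\partial U}{\partial b} = 0 \;\Rightarrow\; 2\sum_{i=1}^{n} (y_i - a - bx_i - cx_i^2)(-x_i) = 0 \;\Rightarrow\; \sum xy = a\sum x + b\sum x^2 + c\sum x^3 \qquad \ldots(2)$$

$$\frac{\partial U}{\partial c} = 0 \;\Rightarrow\; 2\sum_{i=1}^{n} (y_i - a - bx_i - cx_i^2)(-x_i^2) = 0 \;\Rightarrow\; \sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4 \qquad \ldots(3)$$

Equations (1), (2) and (3) are the normal equations for fitting a second degree parabolic curve of regression of $y$ on $x$. Here $n$ is the no. of pairs of values of $x$ and $y$.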
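The three normal equations can be set up and solved mechanically. The Python sketch below uses Cramer's rule; the function names `det3` and `fit_parabola` are hypothetical, and the data are those of Example 1(a) in the examples that follow.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_parabola(xs, ys):
    """Return (a, b, c) for y = a + b*x + c*x**2 from normal equations (1)-(3)."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    rhs = [sum(ys),
           sum(x * y for x, y in zip(xs, ys)),
           sum(x * x * y for x, y in zip(xs, ys))]
    matrix = [[n, s1, s2],
              [s1, s2, s3],
              [s2, s3, s4]]
    det = det3(matrix)
    if det == 0:
        raise ValueError("fewer than three distinct x values")
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column by the right-hand side.
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        coeffs.append(det3(replaced) / det)
    return tuple(coeffs)

print(fit_parabola([1, 2, 3, 4], [6, 11, 18, 27]))  # prints (3.0, 2.0, 1.0)
```

Cramer's rule is chosen here only because the system is 3 by 3; for higher degrees an elimination method would be the more usual choice.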

EXAMPLES

Example 1. (a) Fit a second degree curve of regression of $y$ on $x$ to the following data:

x:   1.0   2.0    3.0    4.0
y:   6.0   11.0   18.0   27

(b) Fit a second degree parabola in the following data:

x:   0.0   1.0   2.0    3.0    4.0
y:   1.0   4.0   10.0   17.0   30.0

Sol. The equation of second degree parabola is given by

$$y = a + bx + cx^2 \qquad \ldots(1)$$

Normal equations are

$$\sum y = ma + b\sum x + c\sum x^2 \qquad \ldots(2)$$
$$\sum xy = a\sum x + b\sum x^2 + c\sum x^3 \qquad \ldots(3)$$

and

$$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4 \qquad \ldots(4)$$

(a) Here $m = 4$. Table is as follows:

x     y     x^2   x^3   x^4   xy    x^2 y
1     6     1     1     1     6     6
2     11    4     8     16    22    44
3     18    9     27    81    54    162
4     27    16    64    256   108   432

$\sum x = 10$, $\sum y = 62$, $\sum x^2 = 30$, $\sum x^3 = 100$, $\sum x^4 = 354$, $\sum xy = 190$, $\sum x^2 y = 644$

Substituting values in eqns. (2), (3) and (4), we get

$$62 = 4a + 10b + 30c \qquad \ldots(5)$$
$$190 = 10a + 30b + 100c \qquad \ldots(6)$$
$$644 = 30a + 100b + 354c \qquad \ldots(7)$$

Solving equations (5), (6) and (7), we get $a = 3$, $b = 2$, $c = 1$.

Hence the required second degree parabola is $y = 3 + 2x + x^2$.


(b) Here $m = 5$.
Table is as follows:

x     y      x^2   x^3   x^4   xy    x^2 y
0.0   1.0    0     0     0     0     0
1.0   4.0    1     1     1     4     4
2.0   10.0   4     8     16    20    40
3.0   17.0   9     27    81    51    153
4.0   30.0   16    64    256   120   480

$\sum x = 10$, $\sum y = 62$, $\sum x^2 = 30$, $\sum x^3 = 100$, $\sum x^4 = 354$, $\sum xy = 195$, $\sum x^2 y = 677$

Substituting values in eqns. (2), (3) and (4), we get

$$62 = 5a + 10b + 30c \qquad \ldots(8)$$
$$195 = 10a + 30b + 100c \qquad \ldots(9)$$
$$677 = 30a + 100b + 354c \qquad \ldots(10)$$

Solving eqns. (8), (9) and (10), we get $a = 1.2$, $b = 1.1$ and $c = 1.5$.

Hence the required second degree parabola is

$$y = 1.2 + 1.1x + 1.5x^2$$

Example 2. Fit a parabola $y = ax^2 + bx + c$ in least square sense to the data

x:   10    12    15    23    20
y:   14    17    23    25    21

Sol. The normal equations to the curve are

$$\sum y = a\sum x^2 + b\sum x + 5c$$
$$\sum xy = a\sum x^3 + b\sum x^2 + c\sum x \qquad \ldots(1)$$

and

$$\sum x^2 y = a\sum x^4 + b\sum x^3 + c\sum x^2$$

The values of $\sum x$, $\sum x^2$, etc., are calculated by means of the following table:

x     y     x^2   x^3     x^4      xy    x^2 y
10    14    100   1000    10000    140   1400
12    17    144   1728    20736    204   2448
15    23    225   3375    50625    345   5175
23    25    529   12167   279841   575   13225
20    21    400   8000    160000   420   8400

$\sum x = 80$, $\sum y = 100$, $\sum x^2 = 1398$, $\sum x^3 = 26270$, $\sum x^4 = 521202$, $\sum xy = 1684$, $\sum x^2 y = 30648$

Substituting the obtained values from the table in normal equation (1), we have

$$100 = 1398a + 80b + 5c$$
$$1684 = 26270a + 1398b + 80c$$
$$30648 = 521202a + 26270b + 1398c$$

On solving, $a = -0.07$, $b = 3.01$, $c = -8.73$ (to two decimal places).

The required equation is $y = -0.07x^2 + 3.01x - 8.73$.
Example 3. Fit a parabolic curve of regression of $y$ on $x$ to the following data:

x:   1.0   1.5   2.0   2.5   3.0   3.5   4.0
y:   1.1   1.3   1.6   2.0   2.7   3.4   4.1

Sol. Here $m = 7$ (odd).
Let

$$u = \frac{x - 2.5}{0.5} = 2x - 5 \quad \text{and} \quad v = y$$

Results in tabular form are

x     y     u     v     u^2   uv     u^2 v   u^3   u^4
1.0   1.1   -3    1.1   9     -3.3   9.9     -27   81
1.5   1.3   -2    1.3   4     -2.6   5.2     -8    16
2.0   1.6   -1    1.6   1     -1.6   1.6     -1    1
2.5   2.0   0     2.0   0     0      0       0     0
3.0   2.7   1     2.7   1     2.7    2.7     1     1
3.5   3.4   2     3.4   4     6.8    13.6    8     16
4.0   4.1   3     4.1   9     12.3   36.9    27    81

Total: $\sum u = 0$, $\sum v = 16.2$, $\sum u^2 = 28$, $\sum uv = 14.3$, $\sum u^2 v = 69.9$, $\sum u^3 = 0$, $\sum u^4 = 196$

Let the curve to be fitted be $v = a + bu + cu^2$, so that the normal equations are

$$\sum v = 7a + b\sum u + c\sum u^2$$
$$\sum uv = a\sum u + b\sum u^2 + c\sum u^3$$

and

$$\sum u^2 v = a\sum u^2 + b\sum u^3 + c\sum u^4$$

i.e., $16.2 = 7a + 28c$, $14.3 = 28b$, $69.9 = 28a + 196c$.

Solving, we get $a = 2.07$, $b = 0.511$, $c = 0.061$.

Hence the curve of fit is

$$v = 2.07 + 0.511u + 0.061u^2$$

or

$$y = 2.07 + 0.511(2x - 5) + 0.061(2x - 5)^2 = 1.04 - 0.193x + 0.243x^2$$
Example 4. Fit a second degree parabola to the following data by Least squares method:

x:   1      2      3      4      5
y:   1090   1220   1390   1625   1915

Sol. Here $m = 5$ (odd).
Let $u = x - 3$, $v = y - 1220$.

x    y      u     v      u^2   u^2 v   uv     u^3   u^4
1    1090   -2    -130   4     -520    260    -8    16
2    1220   -1    0      1     0       0      -1    1
3    1390   0     170    0     0       0      0     0
4    1625   1     405    1     405     405    1     1
5    1915   2     695    4     2780    1390   8     16

Total: $\sum u = 0$, $\sum v = 1140$, $\sum u^2 = 10$, $\sum u^2 v = 2665$, $\sum uv = 2055$, $\sum u^3 = 0$, $\sum u^4 = 34$

Putting these values in normal equations, we get

$$1140 = 5a' + 10c', \qquad 2055 = 10b', \qquad 2665 = 10a' + 34c'$$

so that $a' = 173$, $b' = 205.5$, $c' = 27.5$ and

$$v = 173 + 205.5u + 27.5u^2 \qquad \ldots(1)$$

Put $u = x - 3$ and $v = y - 1220$.

From (1), $y - 1220 = 173 + 205.5(x - 3) + 27.5(x - 3)^2$

$$y = 27.5x^2 + 40.5x + 1024.$$
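The back-substitution used at the end of Examples 3 and 4 follows one pattern: expand $v = a + bu + cu^2$ with $u = (x - x_0)/h$ and $v = y - y_0$, and collect powers of $x$. A minimal Python sketch of that expansion is given below; the function name `decode_parabola` and its parameter names are hypothetical.

```python
def decode_parabola(a, b, c, x0, h, y0=0.0):
    """Convert v = a + b*u + c*u**2, where u = (x - x0)/h and v = y - y0,
    into y = A + B*x + C*x**2 and return (A, B, C)."""
    C = c / h ** 2                                   # coefficient of x**2
    B = b / h - 2 * c * x0 / h ** 2                  # coefficient of x
    A = y0 + a - b * x0 / h + c * x0 ** 2 / h ** 2   # constant term
    return A, B, C

# Example 4: u = x - 3, v = y - 1220, with a' = 173, b' = 205.5, c' = 27.5.
print(decode_parabola(173, 205.5, 27.5, x0=3, h=1, y0=1220))
# prints (1024.0, 40.5, 27.5)
```

For Example 3 the same call with `x0=2.5`, `h=0.5` and the unrounded constants $a = 14.5/7$, $b = 14.3/28$, $c = 5.1/84$ gives approximately $1.04$, $-0.193$ and $0.243$.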
