Session: Modeling, Simulation and Optimization

Applications and Algorithms of Non-Linear Regression Using Least Squares
Approximation Based on Case Studies
By
S. Vignesh, T.K. Premannanth

Department of Chemical Engineering,
St. Joseph’s College of Engineering,
Chennai – 600119.
Abstract
Three data sets from the literature were used to investigate the importance of the
method of least squares as an approximation method in engineering. The case studies
were: fitting heat capacity data to a quadratic equation in temperature, fitting
vapour-liquid equilibrium data to the Wilson equation, and fitting Gilliland-Sherwood
data. A Multiple Non-Linear Regression (MNLR) model was obtained for each case by
least squares approximation. Each regression model was treated as a “distributed
error” approximation, which is characterized by a decrease of some global error
measure over the whole approximation interval as the order of approximation
increases. The “best” Multiple Non-Linear Regression model was taken to be the one
with the smallest error. “Goodness of Fit” graphs were drawn for the three data
sets, and the fits were found to be good.
Introduction
• The method of least squares is a very important approximation method in
engineering.
• It is perhaps the best known “distributed error” approximation method. Least
squares approximation is valuable in problems such as fitting equations to
discrete data points and analyzing measurement errors.
• Least squares analysis also plays a central role in the application of the
theory of statistics, which treats problems involving random variables.
• The subject of random variables and statistics is beyond the scope of this
presentation; we therefore restrict ourselves to applying least squares in our
case studies.
• Least squares is also useful for continuous approximations, such as developing
simple approximations to known functions.
• Distributed error methods such as least squares are characterized by a decrease
of some global error measure over the whole approximation interval as the order
of approximation increases.
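For least squares, the global error measure is the sum of squared residuals over all n data points. Written for a fitted model, with S and \hat{y} as notation introduced here for illustration (they do not appear on the original slides):

```latex
S = \sum_{i=1}^{n} \left( y_i - \hat{y}(x_i) \right)^2
```

The coefficients a0, a1, ... of the fitted model are the values that minimize S.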
Linear Regression
• Linear regression is solved by the method of least squares; the error
percentage is then calculated and a Goodness of Fit graph is plotted. The least
squares equations are as follows.
General equation:
y = a0 + a1x (2.1)
where
a0 = (∑yi / n) - a1 (∑xi / n) (2.2)
a1 = [∑xi yi - (∑xi ∑yi) / n] / [∑xi^2 - (∑xi)^2 / n] (2.3)
Here a0 and a1 are the constants to be determined and n is the number of data points.
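Equations 2.2 and 2.3 translate directly into a few lines of code. The following is a minimal sketch in Python; the function name `linear_least_squares` and the sample points are illustrative assumptions, not part of the original work.

```python
def linear_least_squares(xs, ys):
    """Return (a0, a1) for y = a0 + a1*x using equations 2.2 and 2.3."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denominator = sum_xx - sum_x ** 2 / n
    if denominator == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    a1 = (sum_xy - sum_x * sum_y / n) / denominator  # equation 2.3
    a0 = sum_y / n - a1 * sum_x / n                  # equation 2.2
    return a0, a1


if __name__ == "__main__":
    # Illustrative points lying exactly on y = 1 + 2x (not case-study data).
    a0, a1 = linear_least_squares([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
    print(a0, a1)
```

Note that the slope a1 is computed first because equation 2.2 for a0 depends on it.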

Polynomial Regression
• Non-linear regression is solved by the polynomial regression method (second
order). The polynomial regression equations are as follows:
General equation:
y = a0 + a1x + a2x^2 + … + anx^n (2.4)
(2.5)
On solving the above matrix system, we get the values of a0, a1 and a2.
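For a second-order fit, the least squares conditions lead to a 3 x 3 linear system in a0, a1 and a2. The textbook form of these normal equations is shown below; it is assumed here to correspond to the matrix of equation 2.5, with all sums taken over the n data points.

```latex
\begin{bmatrix}
n & \sum x_i & \sum x_i^2 \\
\sum x_i & \sum x_i^2 & \sum x_i^3 \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{bmatrix}
```

Dropping the third row and column recovers the linear case, whose solution is given by equations 2.2 and 2.3.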
Case Studies
Three case studies have been analyzed:
• Fitting heat capacity data to a quadratic equation in temperature.
• Fitting vapour-liquid equilibrium data to the Wilson equation.
• Fitting Gilliland-Sherwood data.
Case I
Heat Capacity Data to the Quadratic Equation
• Here we analyzed the heat capacity data for liquid methylcyclohexane (C7H14)
using the equation:
Cp = a0 + a1T
where Cp is the heat capacity, T is the absolute temperature, and a0 and a1 are
parameters determined by linear least squares.
Data: TABLE I (a)
T (K)   Cp (kJ/kg·K)
150     1.426
160     1.447
170     1.469
180     1.492
190     1.516
200     1.541
210     1.567
220     1.596
230     1.627
240     1.661
250     1.696
260     1.732
270     1.770
280     1.808
290     1.848
300     1.888
• Solving these data by the least squares method gave a0 = 0.96 and a1 = 0.00297.
• Substituting a0 and a1 into equation 2.1 gives the fitted line, whose
predictions are close to the experimental values:
y = 0.96 + 0.00297x
• Substituting the temperature values into the above equation gives the
predicted values.
TABLE I (b)
Predicted values   Experimental values
1.405 1.426
1.434 1.447
1.457 1.469
1.489 1.492
1.509 1.516
1.538 1.541
1.558 1.567
1.587 1.596
1.611 1.627
1.658 1.661
1.684 1.696
1.725 1.732
1.768 1.770
1.795 1.808
1.821 1.848
1.867 1.888
Fig I (c): Goodness of fit (experimental vs. predicted)
The above graph shows the Goodness of Fit for the heat capacity data of
methylcyclohexane. The experimental and predicted values shown above have a small
percentage error, which lies approximately between 2 and 5%.
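The slides do not state how the error percentage was computed. The sketch below assumes the common definition 100 x |experimental - predicted| / experimental and applies it to the Table I (a) data with the reported coefficients a0 = 0.96 and a1 = 0.00297. The function names are hypothetical, and the values it prints need not match the hand-calculated Table I (b) exactly.

```python
# Table I (a): temperature (K) and experimental heat capacity (kJ/kg.K).
temperatures = list(range(150, 301, 10))
cp_experimental = [1.426, 1.447, 1.469, 1.492, 1.516, 1.541, 1.567, 1.596,
                   1.627, 1.661, 1.696, 1.732, 1.770, 1.808, 1.848, 1.888]

# Coefficients reported in the text for Cp = a0 + a1*T.
A0, A1 = 0.96, 0.00297


def percent_error(experimental, predicted):
    """Assumed definition: absolute deviation relative to the experimental value."""
    return 100.0 * abs(experimental - predicted) / experimental


def error_table(ts, cps, a0, a1):
    """Return one (T, Cp experimental, Cp predicted, error %) tuple per data point."""
    rows = []
    for t, cp in zip(ts, cps):
        predicted = a0 + a1 * t
        rows.append((t, cp, predicted, percent_error(cp, predicted)))
    return rows


if __name__ == "__main__":
    for t, cp, predicted, err in error_table(temperatures, cp_experimental, A0, A1):
        print(f"{t:5d} {cp:8.3f} {predicted:8.3f} {err:6.2f}%")
```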
Case II
Vapour Liquid Equilibrium to
Wilson Equation
• Vapour-liquid equilibrium data were taken for the heptane-toluene binary
system at 1 atm pressure.
• Here we fitted activity coefficient data to the Wilson equation.
• As this requires non-linear regression, we used the polynomial method and
obtained predicted values that are close to the experimental ones.
• The equation used is:
y = a0 + a1x + a2x^2 + … + anx^n

Data: TABLE II (a)
xi yi
1.000 0.0000
0.790 0.1259
0.596 0.1509
0.480 0.1392
0.390 0.1250
0.293 0.1111
0.220 0.0950
0.150 0.0707
0.065 0.0290
0.000 0.0000
• Solving these data by the polynomial method gave a0 = -0.00425, a1 = 0.575
and a2 = -0.5568.
• Substituting these values into equation 2.4 gave predicted values that were
very close to the experimental values:
y = -0.00425 + 0.575x - 0.5568x^2
• Substituting the xi values gave the predicted values tabulated below.
TABLE II (b)
Predicted Experimental
0.0012 0.000
0.1154 0.1259
0.1495 0.1509
0.1435 0.1392
0.1343 0.1250
0.1102 0.1111
0.0925 0.0950
0.0695 0.0707
0.0278 0.0290
0.0000 0.0000
Fig II (c): Goodness of fit (experimental vs. predicted)
The above graph shows the Goodness of Fit for the vapour-liquid equilibrium data.
The experimental and predicted lines shown above have a small percentage error,
which lies approximately between 6 and 7%.
Case III
Mass Transfer Data of Gilliland-Sherwood Equation
• The mass transfer data were taken and analyzed using the Gilliland-Sherwood
equation.
• As this requires non-linear regression, we used the polynomial method and
obtained predicted values that are close to the experimental values.
• The equation used is:
y = a0 + a1x + a2x^2 + … + anx^n

Data:
TABLE III (a)
xi yi
43.7 0.60
24.2 1.80
51.6 1.87
32.3 1.86
26.1 2.16
92.8 2.17
• Solving these data by the polynomial method gave a0 = 16.11, a1 = -0.7588
and a2 = 0.0053.
• Substituting these values into equation 2.4 gave predicted values that were
close to the experimental values:
y = 16.11 - 0.7588x + 0.0053x^2
• Substituting the xi values gave the predicted values tabulated below.
TABLE III (b)
PREDICTED EXPERIMENTAL
0.48 0.60
1.62 1.80
1.69 1.87
1.70 1.86
1.92 2.16
1.95 2.17
Fig III (c): Goodness of fit (experimental vs. predicted)
The above graph shows the Goodness of Fit for the mass transfer data from the
Gilliland-Sherwood equation. The experimental and predicted lines shown above have
a percentage error that lies approximately between 20 and 25%. The error is
somewhat high because the data are non-linear.
Results and Discussion
• The three case studies analyzed using least squares and polynomial regression
yielded good results.
• The goodness of fit graphs were plotted and the error % was calculated for all
three data sets.
• The error percentage was found to be approximately 2 to 5% in the first case
study, 7 to 10% in the second and 20 to 25% in the third.
• The error in the first two case studies was very small, and the best fit
graphs were plotted. For the Gilliland-Sherwood data the error was somewhat
higher because the values are highly non-linear and it was difficult to apply
polynomial regression to them. Further study is needed to minimize the error in
the third case study.
Discussion - Case I
• In the first case study, Table I (a) shows the data taken for performing the
regression.
• Using those values and manually calculating the necessary terms, we
substituted the terms into equations 2.1, 2.2 and 2.3 and obtained the fitted
equation.
• Table I (b) contains the experimental and predicted data. Using those values
we plotted graph I (c), which shows the goodness of fit.
• From the graph we concluded that the first case study has come out well, with
a small error %. The two lines in the graph lie very close together, indicating
that the regression was successful.
Discussion - Case II
• In the second case study, Table II (a) shows the data taken for performing the
regression.
• Using those values and manually calculating the necessary terms, we
substituted the terms into equations 2.4 and 2.5 and obtained the fitted
equation.
• Table II (b) contains the experimental and predicted data. Using those values
we plotted graph II (c), which shows the goodness of fit.
• From the graph we concluded that the second case study has come out well, with
a small error %. The two lines in the graph lie very close together, indicating
that the regression was successful.
Discussion - Case III
• In the third case study, Table III (a) shows the data taken for performing the
regression.
• Using those values and manually calculating the necessary terms, we
substituted the terms into equations 2.4 and 2.5 and obtained the fitted
equation.
• Table III (b) contains the experimental and predicted data. Using those values
we plotted graph III (c), which shows the goodness of fit.
• From the graph we concluded that the third case study has come out well: the
two lines lie close together, indicating that the regression was successful.
• The error % is high compared to the first two cases because the data is
non-linear.
Algorithm
General algorithm for all the three cases:
• Step 1: The data were taken from the case studies.
• Step 2: Regression was applied to the data using the formulas stated at the
beginning.
• Step 3: The values of a0 and a1 (and a2 for the second-order fits) were
determined.
• Step 4: These values were substituted into the general equations (2.1 for
case study I and 2.4 for case studies II and III).
• Step 5: The predicted values were determined from the resulting equation.
• Step 6: A graph of the experimental and predicted data was drawn.
• Step 7: The error percentage was calculated.
• Step 8: The plotted graph is the goodness of fit.
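The steps above can be sketched as a small pipeline. This is a minimal illustration in Python under stated assumptions, not the authors' implementation: the fit solves the polynomial normal equations by Gaussian elimination, the error percentage uses the assumed definition 100 x |y - predicted| / |y|, and the function names (`fit_polynomial`, `predict`, `run_case`) are hypothetical. Plotting (steps 6 and 8) is left out.

```python
def fit_polynomial(xs, ys, order):
    """Least squares coefficients [a0, ..., a_order] from the normal equations."""
    size = order + 1
    # Augmented matrix: row r holds sum(x^(r+c)) for each column c, then sum(x^r * y).
    rows = [
        [sum(x ** (r + c) for x in xs) for c in range(size)]
        + [sum((x ** r) * y for x, y in zip(xs, ys))]
        for r in range(size)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, size):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, size + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution.
    coeffs = [0.0] * size
    for r in range(size - 1, -1, -1):
        known = sum(rows[r][c] * coeffs[c] for c in range(r + 1, size))
        coeffs[r] = (rows[r][size] - known) / rows[r][r]
    return coeffs


def predict(coeffs, x):
    """Evaluate a0 + a1*x + a2*x^2 + ... at x."""
    return sum(a * x ** k for k, a in enumerate(coeffs))


def run_case(xs, ys, order):
    coeffs = fit_polynomial(xs, ys, order)        # steps 2 and 3
    predicted = [predict(coeffs, x) for x in xs]  # steps 4 and 5
    # Step 7 (assumed definition); points with y = 0 are skipped to avoid dividing by zero.
    errors = [100.0 * abs(y - p) / abs(y) for y, p in zip(ys, predicted) if y != 0]
    return coeffs, predicted, errors


if __name__ == "__main__":
    # Step 1: Table II (a) data.
    xi = [1.000, 0.790, 0.596, 0.480, 0.390, 0.293, 0.220, 0.150, 0.065, 0.000]
    yi = [0.0000, 0.1259, 0.1509, 0.1392, 0.1250, 0.1111, 0.0950, 0.0707, 0.0290, 0.0000]
    coeffs, predicted, errors = run_case(xi, yi, order=2)
    print("coefficients:", coeffs)
    for x, y, p in zip(xi, yi, predicted):
        print(f"{x:6.3f} {y:8.4f} {p:8.4f}")
```

Passing `order=1` reproduces the linear fit of equation 2.1, so the same routine covers all three case studies. A machine fit may differ slightly from coefficients obtained by hand calculation.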
Conclusion
• The three data sets were analyzed and good results were obtained in all three
case studies.
• The first two case studies came out well, with small error, and the goodness
of fit graphs were found to be good.
• For the third case study the error was somewhat higher than in the other two
because the data are strongly non-linear.
• A goodness of fit graph was also plotted for the third case study and came
out well.
• Further studies are needed to minimize the error for the Gilliland-Sherwood
data.
• On the whole, the regression models obtained using least squares and
polynomial regression were successful in prediction, and the representation of
the system was good.
Our sincere thanks
Organizing committee - J.N.T.U College of Engineering, Anantapur
Judges and coordinators for their best support.
Head of the Department, Chemical Engineering,
St. Joseph’s College of Engineering
