CHAPTER 3: Curve fitting

3.1 Introduction

Next we will move on to an equally interesting topic, namely, curve fitting. Why study curve fitting at all? If the values of a function are available only at a few points, but we want the values of the function at some other points, we can use interpolation, provided the "unknown points" lie within the interval where we have information from other points. Sometimes extrapolation is also required. For example, we may want to forecast the demand of a product or the progress of a cyclone. Whether the forecasting is in economics or weather science, we need to establish the relationship between the independent variable(s) and the dependent variables to support our proposal of extrapolation. Suppose our data is collected over 40 years; we first develop a model using the data of, say, 35 years and then test it using the data for the 36th, 37th year and so on. Then we can confidently say we have built a model considering the data from years 0-35, and that in the period of years 36-40 it is working reasonably well. So there is no reason why it should misbehave in the 41st year. This is an extremely short introduction to, say, a "101 course on Introduction to Forecasting".

In thermodynamics, thermodynamic properties are often measured only at specific pressures and temperatures. But we want enthalpy and entropy at other values of pressure and temperature for our calculations, and so we need a method by which we can obtain these properties. This is only one part of the story. Recall the truck problem of Chapter 2, where we wanted to get the operating point for the truck climbing uphill. There, we started off with the torque-speed characteristics of the engine and the load. These curves have to be generated from limited and finite data. We want functional forms of the characteristics because these are more convenient for us to work with.
Therefore, the first step is to look at these points, draw a curve and get the best function. So when we do curve fitting, some optimization is already inherent in it, because we are only looking at the "best fit". What is the difference between the best fit and some other curve? For example, if we have some 500 data points, it will be absolutely meaningless for us to have a curve that passes through all the 500 points, because we know that error is inherent in this data. When error is inherent in the data, we want to pass a line which gives the minimum error. Even so, there can be cases when we want the curve to pass through all the points. So, in the light of the above, when will one want to have an exact fit and when will one want to have a best fit? The answer is: when the number of parameters and also the number of measurements are small, and we have absolute faith and confidence in our measurements, we can go in for an exact fit. But when the number of parameters and measurements is larger and we are using a polynomial, the order of the polynomial keeps increasing. The basic problem with higher order polynomials is that they tend to get very oscillatory, and if we want to work with higher order information like derivatives and so on, it gets very messy. Therefore, if we are talking about high accuracy and a limited amount of data, it is possible for us to do an exact fit. But if we are dealing with a large number of data points that are error prone, it is fine to have the best fit. In the light of the above, curve fitting can be of two types: exact fit and best fit.

Uses of curve fitting:
• to carry out system simulation and optimization
• to determine the values of properties at intermediate points
• to do trend analysis to make sense of the data
• to do hypothesis testing. For example, we want to check if the data follow some basic rule of economics or some basic model of heat transfer or fluid mechanics.
A bird's eye view of the general curve fitting problem is shown in Fig. 3.1.

Figure 3.1: A bird's eye view of the general curve fitting problem. An exact fit (1) passes through every point, (2) is suitable for a small number of parameters, and (3) is suited for limited, high accuracy data. A best fit (1) does not pass through every point, (2) is suitable for a large number of parameters, and (3) is suitable for voluminous and reasonably accurate data.

(a) Exact fit: Example, enthalpy h = f(T, P). When we want to write a program to simulate a thermodynamic system, it is very cumbersome to use tables of temperature and pressure, and it is easier to use functions. For property data like the above, or calibration data like thermocouple emf versus temperature, exact fits are possible and desired.

(b) Best fit: Any regression in engineering problems, like a Nusselt number correlation. For example, consider flow over a heated cylinder. Under steady state, we measure the total amount of heat which is dissipated in order to maintain the cylinder temperature constant. We increase the velocity, and for every Reynolds number we determine the heat transfer rate. Using Newton's law of cooling, the heat transfer rate is converted to a heat transfer coefficient. From this, we get the dimensionless heat transfer coefficient called the Nusselt number. The Nusselt number goes with the correlating variables as follows:

Nu = a Re^b Pr^c    (3.1)

It is then possible for us to get a, b and c from experiments and curve fitting. The basis for all the above ideas comes from calculations or experiments. Regardless of what we do, we get values of the dependent variables only at a discrete number of points.

(Footnote: "System" in this book refers to a collection of components. It is different from the concept of a system in thermodynamics, wherein no mass can enter or leave a system.)
Our idea of doing calculations or experiments is not just to get the values at discrete points. As a scientist, we know that if the experiment is done carefully, we will get 5 values of the Nusselt number for 5 values of the Reynolds number. We are not going to stop there. We would rather want a predictive relationship like the one given in Eq. 3.1, which can be used first for predicting Nu at values of Re and Pr for which we did not do experiments, and further, possibly to simulate the whole system of which the heated cylinder may be a part. So, with the data we have, before we can do system simulation and optimization, there is an intermediate step involved, which requires that the data be converted to an equation which can be played around with.

3.2 Exact fit and its types

Important exact fits are listed below:
1. Polynomial interpolation
2. Lagrange interpolation
3. Newton's divided difference polynomial(s)
4. Spline fit

3.2.1 Polynomial interpolation

Let us propose that y varies quadratically with x. So we write the relation as

y = a0 + a1 x + a2 x^2    (3.2)

where a0, a1 and a2 are constants. In order to get the 3 constants, we need the values of y at 3 values of x:

y0 = a0 + a1 x0 + a2 x0^2    (3.3)
y1 = a0 + a1 x1 + a2 x1^2    (3.4)
y2 = a0 + a1 x2 + a2 x2^2    (3.5)

So we have 3 equations in 3 unknowns, which can be solved to obtain a0, a1 and a2. There will be no error associated with the proposed equation if it is applied at x0, x1 and x2. There will be some error when we apply it in between x0, x1 and x2. This is because we are approximating the physics by a quadratic. With 3 points, a quadratic is all we can make pass through; we can only do that much.

Example 3.1: Evaluate the natural logarithm of 5 using quadratic interpolation, given
ln(1) = 0
ln(3) = 1.099
ln(7) = 1.946
Report the error by comparing your answer with the value obtained from your calculator.
Solution:
When x = 1: 0 = a0 + a1 + a2    (3.6)
When x = 3: 1.099 = a0 + 3 a1 + 9 a2    (3.7)
When x = 7: 1.946 = a0 + 7 a1 + 49 a2    (3.8)

Solving for a0, a1 and a2:

a0 = -0.719
a1 = 0.775
a2 = -0.056

y = -0.719 + 0.775 x - 0.056 x^2    (3.9)

At x = 5, y = 1.757 (with the rounded coefficients), while y_actual = ln(5) = 1.609 (from calculator). Hence the percentage error is about 9%. This demonstrates how polynomial interpolation works. We can complicate this by increasing the degree of the polynomial (and taking more points), but then the polynomial becomes very oscillatory.

3.2.2 Lagrange interpolation

Consider three points x0, x1 and x2, with corresponding y values y0, y1 and y2, respectively. A depiction of this is given in Fig. 3.2.

Figure 3.2: Lagrange interpolation for three points

We can use the polynomial interpolation we just saw. However, the Lagrange interpolating polynomial is much more potent, as it can easily be extended to any number of points. The post-processing work of CFD software, after we get the velocities and temperatures, generally uses the Lagrange interpolating polynomial to obtain the gradients of velocity and temperature. Once we get the gradient of temperature, we can directly get the Nusselt number, and then correlate for the Nusselt number if we have data at a few values of velocity (please recall Eq. 3.1).

In Fig. 3.2, x0, x1 and x2 need not be equally spaced. For the above situation, the Lagrange interpolating polynomial is given by

y = [(x - x1)(x - x2) / ((x0 - x1)(x0 - x2))] y0 + [(x - x0)(x - x2) / ((x1 - x0)(x1 - x2))] y1 + [(x - x0)(x - x1) / ((x2 - x0)(x2 - x1))] y2    (3.10)

This is called the Lagrange interpolating polynomial of order 2. When written in compact mathematical form, the Lagrange interpolating polynomial of order N is given by

y = Σ_{i=0}^{N} y_i Π_{j=0, j≠i}^{N} (x - x_j)/(x_i - x_j)    (3.11)

It is possible for us to get the first, second and higher order derivatives. The first derivative will still be a function of x. The second derivative will be a constant for a second order polynomial. For a third order polynomial, we need the function y at four points, and the second derivative will be linear in x.
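Example 3.1 can be checked with a short script. This is a minimal sketch that evaluates the order-2 Lagrange form of Eq. 3.10 directly, which is equivalent to solving Eqs. 3.6-3.8; the result differs slightly from the hand-worked 1.757 because the hand calculation rounds the coefficients. The function name `lagrange2` is our own choice, not from the text.

```python
# Example 3.1: interpolate ln(5) from ln(1), ln(3), ln(7) using the
# order-2 Lagrange interpolating polynomial (Eq. 3.10).
import math

def lagrange2(x, xs, ys):
    """Evaluate the order-2 Lagrange interpolating polynomial at x."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys
    return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
            + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
            + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))

xs = (1.0, 3.0, 7.0)
ys = (0.0, 1.099, 1.946)
y_fit = lagrange2(5.0, xs, ys)                        # ~1.748
err = abs(y_fit - math.log(5)) / math.log(5) * 100    # ~9 %
print(f"interpolated ln(5) = {y_fit:.3f}, error = {err:.1f} %")
```

Because the data are exact function values, the only error here is the quadratic approximation itself, which is what the ~9% figure measures.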
Example 3.2: A long heat generating wall is bathed by a cold fluid on both sides. The thickness of the wall is 10 cm, and the thermal conductivity of the wall material, k, is 45 W/mK. The temperature of the cold fluid is 30°C. The problem geometry, with the boundary conditions, is shown in Fig. 3.3. Thermocouples located inside the wall show the following temperatures (see Table 3.1) under steady state.

Table 3.1: Temperature of the heat generating wall at various locations (Example 3.2)

x, m  | T(x), °C
0     | 80
0.01  | 82.1
0.03  | 84.9
0.05  | 87.6

The temperature distribution is symmetric about the mid plane, and the origin is indicated in Fig. 3.3. Using the second order Lagrange interpolation formula, determine the heat transfer coefficient h at the surface. If the cross sectional area of the wall is 1 m², determine the volumetric heat generation rate in the wall for steady state conditions.

Figure 3.3: Problem geometry with boundary conditions for example 3.2

Solution: What is it we are trying to do here? We are trying to get T as a function of x and then write q = -k dT/dx. The q that comes from the wall is due to conduction. Therefore q_cond = q_conv. The next logical question is: instead of doing all this, why can we not directly insert thermocouples in the flow? That will affect the flow, and probably may even make a laminar flow turbulent, and hence is not a good idea. We do such measurements with instruments like the hot wire anemometer, but the boundary layer itself is so small and thin that it is very difficult to do those measurements. Therefore we would much rather prefer to do the measurements on the wall, where it is easy to insert thermocouples, and then, using the conduction-convection coupling and Fourier's law, we can determine the heat transfer coefficient.
This is the so called "standard operating procedure" in convection heat transfer. The problem at hand is typically called an inverse problem. We have a mathematical model for this, which is

d²T/dx² + q_v/k = 0    (3.12)

We have some measurements of temperature at x0, x1 and x2. So if we marry the mathematical model with these measurements, we are able to get much more information about the system. What is the information we are getting? We are getting two more parameters, which are the heat transfer coefficient and the volumetric heat generation rate. The straight problem is very simple: there is a wall and there is convection at the boundary; the heat transfer coefficient and volumetric heat generation rate are given. What is the surface temperature? Or what is the temperature 1 cm away from the surface? These are easy to answer! But here we are making some measurements and inferring some properties of the system. This is what is known as an inverse problem.

Here we have 4 readings, but we need to choose only 3 for the Lagrange method. Which 3 do we choose? We choose the ones closest to the wall, so we are going to employ the first 3 readings.

T = [(x - x1)(x - x2) / ((x0 - x1)(x0 - x2))] T0 + [(x - x0)(x - x2) / ((x1 - x0)(x1 - x2))] T1 + [(x - x0)(x - x1) / ((x2 - x0)(x2 - x1))] T2    (3.13)

dT/dx = {[2x - (x1 + x2)] / [(x0 - x1)(x0 - x2)]} T0 + {[2x - (x0 + x2)] / [(x1 - x0)(x1 - x2)]} T1 + {[2x - (x0 + x1)] / [(x2 - x0)(x2 - x1)]} T2    (3.14)

Determining dT/dx at x0, we get:

dT/dx|x0 = {[2(0) - (0.01 + 0.03)] / [(0 - 0.01)(0 - 0.03)]} × 80 + {[2(0) - (0 + 0.03)] / [(0.01 - 0)(0.01 - 0.03)]} × 82.1 + {[2(0) - (0 + 0.01)] / [(0.03 - 0)(0.03 - 0.01)]} × 84.9

dT/dx|x0 = 233.33 K/m

k dT/dx|x0 = h (T(x0) - T∞)
45 × 233.33 = h (80 - 30)
h = 210 W/m²K

Figure 3.4: Qualitative temperature distribution and direction of heat flow for example 3.2

Fig. 3.4 shows the qualitative variation of the temperature across the wall, while the arrow indicates the direction of heat flow. The last part of the problem is the determination of the volumetric heat generation rate. There are several ways of obtaining this.
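The gradient and heat transfer coefficient above can be reproduced with a few lines of code. This is a sketch of Eq. 3.14 evaluated at the wall, followed by the conduction-convection coupling k dT/dx = h (T_s - T∞); the function name `dlagrange2` is our own.

```python
# Example 3.2, first part: wall temperature gradient from the derivative of
# the order-2 Lagrange polynomial (Eq. 3.14), then the heat transfer
# coefficient h from k dT/dx = h (T_s - T_inf).
def dlagrange2(x, xs, ys):
    """Derivative of the order-2 Lagrange interpolating polynomial at x."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys
    return (y0 * (2 * x - (x1 + x2)) / ((x0 - x1) * (x0 - x2))
            + y1 * (2 * x - (x0 + x2)) / ((x1 - x0) * (x1 - x2))
            + y2 * (2 * x - (x0 + x1)) / ((x2 - x0) * (x2 - x1)))

k, T_inf = 45.0, 30.0                              # W/mK, deg C
xs, Ts = (0.0, 0.01, 0.03), (80.0, 82.1, 84.9)     # first three readings
grad = dlagrange2(0.0, xs, Ts)                     # ~233.33 K/m
h = k * grad / (Ts[0] - T_inf)                     # ~210 W/m^2 K
print(f"dT/dx at wall = {grad:.2f} K/m, h = {h:.0f} W/m^2K")
```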
The heat transfer from the right side is hAΔT, and that from the left side is also hAΔT. So that makes a total of 2hAΔT. Where does all this heat come from? From the heat generated. Therefore, equating the two, we have

q_v V = 2 h A ΔT    (3.15)

So we determine q_v from this. The above procedure does not take recourse to the governing equation. (This is strictly not correct, as the governing equation is itself a differential form of this energy balance, with the constitutive relation (Fourier's law) embedded in it.) When we are employing 3 temperatures, we can consider any 3 of the 4. But the pair (0, 80°C) is very crucial, because we are evaluating the gradient at the solid-fluid interface. So, using the overall energy balance,

q_v V = 2 h A ΔT
q_v A L = 2 × 210 × A × 50
q_v = (2 × 210 × 50) / 0.1 = 2.1 × 10^5 W/m³

There is a heat generating wall that has a thickness of 10 cm, whose thermal conductivity is 45 W/mK, and the heat generation rate is estimated to be 2.1 × 10^5 W/m³. Now it is possible for us to apply our knowledge of heat transfer and obtain the maximum temperature, and see whether it is the same as what is measured at 0.05 m. The governing equation for the problem under consideration is

d²T/dx² + q_v/k = 0    (3.16)

The solution to the above equation is

dT/dx = -(q_v/k) x + A    (3.17)

T = -(q_v/k) x²/2 + A x + B    (3.18)

T = 80°C at x = 0, which gives B = 80°C. At x = 0.05 m, dT/dx = 0, hence A = (q_v/k)(0.05). Substituting for A and B in Eq. 3.18, at x = 0.05 m we get T = 85.8°C. This value is quite close to the data given in Table 3.1.

This is a very simple presentation of inverse heat transfer. Inverse heat transfer can be quite powerful. For example, in the international terminals at an airport during the swine flu seasons of 2009 and 2010, when people were arriving, thermal infrared images were taken to evaluate whether any of the passengers showed symptoms of flu. Here, we are basically working out an inverse problem.
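Before moving on, the closing arithmetic of Example 3.2 can be verified end to end: the energy balance of Eq. 3.15 gives q_v, and substituting into Eq. 3.18 gives the mid-plane temperature. A minimal sketch:

```python
# Example 3.2, last part: volumetric heat generation from the overall
# energy balance (Eq. 3.15 with V = A*L), then the mid-plane temperature
# from the solution of the governing equation (Eq. 3.18).
k, h, L = 45.0, 210.0, 0.1        # W/mK, W/m^2K, m
dT = 80.0 - 30.0                  # wall-to-fluid temperature difference, K
q_v = 2 * h * dT / L              # ~2.1e5 W/m^3

# T(x) = -q_v x^2 / (2k) + A x + B, with B = 80 and dT/dx = 0 at x = 0.05 m
A = q_v * 0.05 / k
T_mid = -q_v * 0.05**2 / (2 * k) + A * 0.05 + 80.0   # ~85.8 deg C
print(f"q_v = {q_v:.3g} W/m^3, T(0.05 m) = {T_mid:.1f} C")
```

The computed 85.8°C is close to the measured 87.6°C at x = 0.05 m, as the text notes.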
When the imager sees that the radiation from the nose and other parts exceeds a certain value, it is a sign that the passenger could be infected. Thermal infrared can also be used for cancer detection, for example breast tumours. If we use an infrared camera and obtain the surface temperature image, and there is a tumour inside the breast, the metabolism will be high. This will cause the volumetric heat generation to be high compared to the tissue which is non-cancerous. Therefore this will show up as a signature on the surface temperature image of the breast. This is called a breast thermogram. Now, from these temperatures, one solves an inverse problem to determine the size and location of the tumour.

Lagrange interpolation polynomial: Higher order derivatives

Figure 3.5: Lagrange interpolating polynomial of order 2 with three points

The Lagrange interpolating polynomial of order 2 for φ = f(x) (refer to Fig. 3.5) is given by

φ = [(x - x1)(x - x2) / ((x0 - x1)(x0 - x2))] φ0 + [(x - x0)(x - x2) / ((x1 - x0)(x1 - x2))] φ1 + [(x - x0)(x - x1) / ((x2 - x0)(x2 - x1))] φ2    (3.19)

Equation 3.19 is exactly satisfied at x = x0, x = x1 and x = x2, and there will be no error at these three points. But at all intermediate points there will be an error, as we are approximating the functional form of φ by a quadratic. Given that we have 3 points and we want the polynomial to pass through all the 3 points, the best fit is a polynomial of order 2. Let us now determine the first and second derivatives of the polynomial with respect to x.

dφ/dx = {[2x - (x1 + x2)] / [(x0 - x1)(x0 - x2)]} φ0 + {[2x - (x0 + x2)] / [(x1 - x0)(x1 - x2)]} φ1 + {[2x - (x0 + x1)] / [(x2 - x0)(x2 - x1)]} φ2    (3.20)

d²φ/dx² = 2φ0 / [(x0 - x1)(x0 - x2)] + 2φ1 / [(x1 - x0)(x1 - x2)] + 2φ2 / [(x2 - x0)(x2 - x1)]    (3.21)

For equispaced intervals such that x1 - x0 = x2 - x1 = Δx, we have

d²φ/dx² = (φ0 - 2φ1 + φ2)/Δx²    (3.22)

Purely by using the concept of an interpolating polynomial, we are able to get the second derivative of the variable φ, where this variable could be anything, like temperature, stream function or potential difference. The above is akin to the central difference formula in finite differences, which is one way of getting a discrete form for derivatives. The central difference method basically uses the Taylor series approximation. Consider Fig. 3.6, where φ is the variable of interest.

Figure 3.6: Depiction of the central difference scheme

Using the finite difference method, d²φ/dx² can be written as

d²φ/dx²|i = (dφ/dx|i+1/2 - dφ/dx|i-1/2)/Δx    (3.23)

d²φ/dx²|i = (φi+1 - 2φi + φi-1)/Δx²    (3.24)

This is the way the central difference is worked out in the finite difference method. Now we can see that the results obtained using the finite difference method are the same as what we obtained using the Lagrange interpolating polynomial. When we use Lagrange polynomials of order 3 and 4, this leads to results that are similar to ones obtained with higher order schemes in the finite difference method. The right hand side of Eq. 3.24 can also be written, in compass-point notation, as (φE - 2φP + φW)/Δx².

Let us now look at an example to make these ideas more clear. Consider two dimensional steady state heat conduction in a slab, as given in Fig. 3.7. The governing equation is

∇²T = 0    (3.25)

∂²T/∂x² + ∂²T/∂y² = 0    (3.26)

The above equation is called the Laplace equation, in which ∇² is referred to as the Laplacian operator.

Figure 3.7: Two dimensional steady state heat conduction across a slab without heat generation; one side is at 100°C and the other three sides are at 0°C

We can solve the above equation using finite differences. We already have a representation for the x derivative, which is

∂²φ/∂x² = (φE - 2φP + φW)/Δx²    (3.27)

Similarly,

∂²φ/∂y² = (φN - 2φP + φS)/Δy²    (3.28)

When Δx = Δy, ∂²φ/∂x² + ∂²φ/∂y² = 0 becomes

(φE - 2φP + φW)/Δx² + (φN - 2φP + φS)/Δx² = 0    (3.29)

which reduces to

φP = (φE + φW + φN + φS)/4    (3.30)
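Equation 3.30 can be exercised numerically by sweeping the 4-point average over a grid until it stops changing (plain Gauss-Seidel iteration). A minimal sketch for the slab of Fig. 3.7, with an assumed grid size of 11 × 11 points:

```python
# Laplace equation on a square slab via the 4-point formula (Eq. 3.30),
# iterated Gauss-Seidel style: top side at 100 C, other sides at 0 C.
n = 11                                   # grid points per side (assumed)
T = [[0.0] * n for _ in range(n)]
for j in range(n):
    T[0][j] = 100.0                      # hot boundary

for _ in range(2000):                    # enough sweeps to converge
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            T[i][j] = 0.25 * (T[i - 1][j] + T[i + 1][j]
                              + T[i][j - 1] + T[i][j + 1])

centre = T[n // 2][n // 2]
print(f"centre temperature = {centre:.2f} C")
```

The centre value converges to 25°C, the commonsense answer discussed next: superposing the four problems with 100°C on each side in turn must give a uniform 100°C, so the centre gets exactly a quarter of 100.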
Please note that Eq. 3.30 can also be obtained by using Lagrange interpolating polynomials for φ(x, y), one variable at a time, and getting the second derivative at node P. If we simplify further, we see that the value of the temperature at a particular node turns out to be the algebraic average of the temperatures at its surrounding nodes. This looks reasonable, and it is also the commonsensical answer we would have obtained. For the problem in question, if we apply this formula at the centre point, it gives (100 + 0 + 0 + 0)/4 = 25°C. Regardless of the sophistication of the mathematical technique we use, the centre temperature must be 25°C. This serves as a basic validation of the approximation we have made to the governing equation.

3.2.3 Newton's divided difference method

A general polynomial can be represented as

y = a0 + a1 (x - x0) + a2 (x - x0)(x - x1) + a3 (x - x0)(x - x1)(x - x2) + ... + an (x - x0)(x - x1) ... (x - x_{n-1})    (3.31)

Here x0, x1, x2, ..., xn need not all be equispaced. We are trying to fit a polynomial for y in terms of x, and we have to get all the coefficients a0, a1, a2, ..., an. If we substitute x = x0, we get y0 = a0, because all the other terms become 0. This way, we get a0. Now we substitute x = x1. We get y1 = a0 + a1 (x1 - x0); all the other terms vanish. We already know a0, we know (x1 - x0), and hence we can get a1. We thus obtain all the "a"s in this recursive fashion. Suppose we have a cubic polynomial: we have four equations with four unknowns, and the system can be solved. Here all the unknowns are determined sequentially and not simultaneously; there is no need to solve simultaneous equations. It is a lot simpler than the polynomial interpolation. Even the Lagrange interpolation is a lot simpler compared to the polynomial interpolation we did earlier, because the coefficients do not have to be determined simultaneously.
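The sequential recovery of the coefficients of Eq. 3.31 is exactly what a divided-difference table computes. This is a sketch using the ln(x) data of Example 3.1; the helper names are our own.

```python
# Newton's divided differences: coefficients of Eq. 3.31 found one after
# another, then evaluation of the Newton-form polynomial.
def newton_coeffs(xs, ys):
    """Return a0, a1, ..., an of Eq. 3.31 via the divided-difference table."""
    c = list(ys)
    for level in range(1, len(xs)):
        # update in place from the bottom up so lower entries stay usable
        for i in range(len(xs) - 1, level - 1, -1):
            c[i] = (c[i] - c[i - 1]) / (xs[i] - xs[i - level])
    return c

def newton_eval(x, xs, c):
    """Evaluate the Newton-form polynomial at x (nested, Horner-like)."""
    y = c[-1]
    for i in range(len(c) - 2, -1, -1):
        y = y * (x - xs[i]) + c[i]
    return y

xs, ys = [1.0, 3.0, 7.0], [0.0, 1.099, 1.946]
c = newton_coeffs(xs, ys)
# The polynomial reproduces the data points exactly ...
assert all(abs(newton_eval(x, xs, c) - y) < 1e-12 for x, y in zip(xs, ys))
# ... and agrees with the quadratic of Example 3.1 in between.
print(f"ln(5) ~ {newton_eval(5.0, xs, c):.3f}")
```

Since all exact fits through the same points give the same polynomial, the value at x = 5 matches the quadratic interpolation of Example 3.1.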
3.2.4 Spline approximation

Figure 3.8: General representation of a spline approximation

If f(x) varies with x as shown in Fig. 3.8, we would like to join the 4 points by a smooth curve. The key is to fit lower order polynomials for subsets of the points, and mix and match all of them to get a "nice" smooth curve. But all of them do not follow the same equation: locally, for every 3 points there is a parabola, or for every 4 points there is a cubic. In order to avoid discontinuities at the intermediate points, we have to match the functions as well as the slopes at these intermediate points. We will again get a set of simultaneous equations which, when solved simultaneously, will give us the values of the coefficients, and this is how spline fitting is done.

Consider a spline approximation with a quadratic fit in each sub interval, as depicted in Fig. 3.9. We can divide the domain into 3 intervals.

Figure 3.9: Spline approximation with a quadratic polynomial in each interval

The equations for the three intervals are as follows:

Interval 1: f(x) = a1 x² + b1 x + c1    (3.32)
Interval 2: f(x) = a2 x² + b2 x + c2    (3.33)
Interval 3: f(x) = a3 x² + b3 x + c3    (3.34)

We have the value of the function only at 4 points, corresponding to x0, x1, x2 and x3. There are 9 unknowns to be determined: a1, a2, a3, b1, b2, b3, c1, c2, c3. We require 9 equations to solve for these. At the intermediate points, there is only one value of f(x), whether we use the polynomial in the interval to the left or to the right. So for every intermediate point, we have two equations, one coming from behind and one from ahead. If n is the number of intervals, there are n-1 intermediate points. The total number of intervals here is 3, i.e. n = 3, and the intermediate points are 2. For every intermediate point we have 2 equations, and hence we have a total of 2(n-1) equations. At the end points, we have 2 equations. So the total number of equations we have so far is 2n - 2 + 2 = 2n.
The number of constants, on the other hand, is 3n. There should be continuity of the slope at the intermediate points, hence f'(x) should be the same whether we come from the left or the right. At x1, that means

2 a1 x1 + b1 = 2 a2 x1 + b2    (3.35)

Like this, we can generate equations at the intermediate points. We have (n-1) such equations. So now we have 2n + n - 1 = 3n - 1 equations in total. We are still one short! The second derivative at either of the two end points may be assumed to be 0. This means that either 2a1 or 2a3 is 0, depending on the end point chosen, which means that either a1 or a3 is 0. So the last condition is f''(x) = 0 at one end.

Normally we use the cubic spline when there are hundreds of points. When we do this, our eyes will not notice that this juggling has been done somewhere. There is no other simpler way to mathematically close it. This is what a graphing software typically does when we use an approximating polynomial. So much mathematics goes on in the background! (See Chapra and Canale, 2009, for a fuller discussion of spline fits.) This brings us to the end of exact fits.

3.3 Best fit

All of the methods discussed in section 3.2 are applicable for properly determined systems, where the number of points is equal to the order of the polynomial + 1, and so on. For example, if we have 3 points, we have a second order polynomial. Oftentimes, we have over determined systems. For example, if we perform the experiments for determining the heat transfer coefficient and obtain results for 10 Reynolds numbers, we know that Nu = a Re^b Pr^c, and when the Prandtl number Pr is fixed, we have two constants. To determine these, 2 data points are sufficient. But we have 10 data points. So we have to get the best fit. What is our criterion of "best"?
Do we want to minimize the sum of the differences, or the sum of the differences in modular form, or the maximum deviation from any point, or the sum of the squares of the errors, or of some higher power of the error? Often we only get data points, which are discrete (for example, the performance of equipment like a turbomachine). However, we are interested in getting a functional relationship between the independent and the dependent variables, so that we are able to do system simulation and, eventually, optimization. If the goal is calibration, we want to have an exact fit, where the curve should pass through all the points. We discussed various strategies for this, like the Newton's divided difference polynomial, the Lagrange interpolation polynomial and so on. These are basically exact fits, used where we have very few points whose measurements are very accurate.

However, there are several cases where the measurements are error prone. There are also often far too many points, and it is very unwise on our part to come up with a ninth degree or a twelfth degree polynomial, which will generally show a highly oscillatory behaviour. These are over determined systems, as already mentioned. What do we exactly mean by this? These are systems that have far too many data points compared to the number of equations required to regress the particular form of the equation. For example, if we know that the relationship between enthalpy and temperature is linear, we can state it as h = a + bT. So if we have enthalpy at two values of temperature, we can get both a and b. But suppose we have 25 or 30 values of temperature, each of which has an enthalpy, and all of which also have an error associated with them. Now, this is an over determined system, as any two pairs of temperature-enthalpy can be used to determine a and b. Among these values of a and b, which is the most desirable, we do not know.
Therefore, we have to come up with a strategy for how to handle the over determined system. We do not want the curve to pass through all the points. So what form of the curve do we want? Polynomial, exponential, power law, or some other form? Some of the possible representations are given below.

Polynomial: y = a + bx + cx²    (3.36)
Exponential: y = a e^(bx)    (3.37)
Power law: y = a x^b    (3.38)

The polynomial given here is a general depiction; we may have higher order polynomials also. Or we can have the exponential form y = a e^(bx), which is typically the case when we have an initial value problem, e.g., a concentration decreasing with time or a population changing with time. Or we can have the power law form, as for example with the Nusselt number or the skin friction coefficient. Who will tell us the best way to get the values of a, b and c if it is a polynomial, or of a and b if it is the exponential or the power law? We have chosen the form. But what is the strategy to be used, since it is an over determined system? So we now have to discuss the strategies for a best fit.

3.4 Strategies for best fit

Let us take a straight line, y = ax + b. We have the data as shown in Table 3.2.

Table 3.2: Simple tabulation of y for various values of x

x   | y
x1  | y1
x2  | y2
... | ...
xn  | yn

We want (a, b). We have the table containing the values of y corresponding to different values of x, and we require the values of a and b to be determined from this. How do we get the table of values? We could perform experiments in the laboratory or calculations on the computer. What could this x and y be? x could be the Reynolds number while y is the skin friction coefficient; x could be the temperature difference while y is the heat flux. We are looking at a simple one variable problem.

Strategy 1: The first strategy is to minimize the sum of the residuals R_i, or rather

Minimise S = Σ_{i=1}^{n} R_i = Σ_{i=1}^{n} [y_i - (a x_i + b)]    (3.39)

where n refers to the number of points. This is the simplest possibility. But it has a problem.
Let us highlight this difficulty with an example. Consider two points, 1 and 2, as shown in Fig. 3.10.

Figure 3.10: Regression by minimising the sum of the differences

Common sense tells us that we can join these two by a straight line; let this be called line A. Suppose we take a point P on this line and pass another line through it, namely line B. As seen in Fig. 3.10, line B has a deviation of -Δ on one side of line A, while the deviation is +Δ on the other side. These two cancel out, and line B gives exactly the same value of S, the sum of the residuals, as the correct line (which is A in our case). Any line through P other than the vertical will reduce the sum of the residuals S to 0. So we will not get a unique line. Hence, from a common sense point of view, it is possible that large negative errors are compensated by large positive errors. In view of this, this is not the best strategy for the best fit of a curve.

Figure 3.11: Regression by minimising the sum of the moduli of the differences

Strategy 2: Another strategy would be to minimize the sum of the moduli of the differences between the data and the fit:

Minimise S = Σ_{i=1}^{n} |R_i| = Σ_{i=1}^{n} |y_i - (a x_i + b)|    (3.40)

Here we minimize the modulus of the difference between y_data and y_model, so a large negative error cannot cancel out a large positive error. This raises a lot of hope in us. Let us take a hypothetical example to see whether this strategy works. There are 4 points, as seen in Fig. 3.11. Let us join points 1 and 3 by a line, and points 2 and 4 by another line. Any line between these two lines will minimize S equally well. There could be many lines which satisfy this. However, the original goal was to obtain unique values of a and b which give the best fit. So this too does not work!

Figure 3.12: Regression with the minimax strategy

Strategy 3: Consider a set of data points 1, 2, 3, 4 and 5.
Common sense tells us that a line passing through points 1, 2, 3 and 4 (or very close to them) is a good way of representing the data. But point 5 gets neglected in this process. In order to accommodate point 5, let us choose the dashed line shown in Fig. 3.12 as the "best" fit. On what basis was this chosen? The dashed line of Fig. 3.12 minimizes the maximum deviation from any point. So it satisfies what is known as the minimax criterion. This basically comes from decision theory: minimax is based on maximizing the minimum gain and minimizing the maximum loss. As far as we are concerned, we are trying to minimize the maximum distance from any point. But point 5 is a rank outsider, and there is something fundamentally wrong with this value. Statistically, point 5 is called an outlier; it is an "outstanding point"! The minimax criterion unnecessarily gives undue importance to an outlier. Sometimes in meetings too, the most vocal person will be heard and given importance, though his or her view may not be the most reasonable! For the problem at hand, common sense suggests that we had better remove the point (here, point 5) and repeat the simulations or experiments. We cannot suddenly get new physics, or heat transfer that goes up and down; there is something wrong with that point. Maybe there was an error in taking the reading, or steady state was not allowed to be reached, or there was a fluctuation in the operating conditions, like voltage, and so on.

With all these strategies having failed, we go on to least square regression (LSR), which is Strategy 4. Here, by minimizing the sum of the squares of the differences between the data and the fit, we try to see whether we can get a good handle on the problem (an excellent treatment of regression is available in Chapra and Canale, 2009).

3.4.1 Least Square Regression (LSR)

Let y = ax + b. If we have two points, we can get a and b right away.
If we have 200 points, then we can have many combinations of a and b. Now we are trying to fit one global value of a and b which best fits the data. Whenever we have a procedure by which we determine a and b, we substitute a value of x in ax + b; the difference of the resulting y from the y of the data, y_data − y_fit, is called the residue. The residue in general will be non-zero. We take the square of this and try to minimize the sum of these residues:

S = Σ R_i² = Σ (y_i − (a x_i + b))²,  i = 1, …, N    (3.41)

in which y_i refers to y_data and (a x_i + b) refers to y_fit. The square takes care of all the negative and positive errors and also helps give us the best values of a and b that fit all the data. This is called the L2 norm or the Euclidean norm. In order to obtain a and b, we differentiate the norm as follows.

∂S/∂a = 0 = 2 Σ (y_i − a x_i − b)(−x_i)    (3.42)

∂S/∂b = 0 = 2 Σ (y_i − a x_i − b)(−1)    (3.43)

On rearranging the above two equations, we have

−Σ x_i y_i + a Σ x_i² + b Σ x_i = 0    (3.44)

−Σ y_i + a Σ x_i + n b = 0    (3.45)

The above equations can be solved for a and b. From eqn (3.45), n b = Σ y_i − a Σ x_i, so b = (Σ y_i − a Σ x_i)/n. Substituting for b, we get

a = (n Σ x_i y_i − Σ x_i Σ y_i) / (n Σ x_i² − (Σ x_i)²)    (3.46)

Please note that the two terms Σ x_i² and (Σ x_i)² are not the same.

For the heat transfer example worked out in the text, where the Nusselt number is correlated against the Reynolds number in power law form, S_t = 1.65 and S_r = 0.05. Now we introduce a new term r², the coefficient of determination:

r² = (S_t − S_r)/S_t

√r² = r is known as the correlation coefficient. Here r² = (1.65 − 0.05)/1.65 = 0.97 and r = √0.97 = 0.98.

Meaning and uses of correlation coefficient

We are able to determine the constants a and b, and we are able to calculate some statistical quantities like r, which is close to 1. So we believe that the correlation is good. We did some heat transfer experiments: we varied the Reynolds number and got the Nusselt number. We believe that these two are related by a power law form, and hence went ahead and did a least square regression and got some values. Has all this really helped us or not? We need some statistical measures to answer this question.
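The slope and intercept formulas of eqns (3.44)–(3.46) derived above can be transcribed directly into code. The sketch below uses made-up data chosen to lie exactly on y = 2x + 1, so the recovered coefficients are known in advance:

```python
# Straight-line least squares from the raw sums, following eqns (3.44)-(3.46).
def lsr_line(xs, ys):
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # eqn (3.46)
    b = (sy - a * sx) / n                          # rearranged eqn (3.45)
    return a, b

# hypothetical data lying exactly on y = 2x + 1
a, b = lsr_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a, b)  # 2.0 1.0
```

For collinear data the residues are all zero and the formulas return the exact line; with scattered data they return the line that minimizes S of eqn (3.41).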
Suppose we had no heat transfer knowledge and just had the Reynolds number and the Nusselt number; what we would possibly have done as a first step is to look at the Nusselt number (y) and get its mean. Then we would tell the outside world that we do not know any functional relationship, but the mean value is likely to be like this. We get the mean value of ln y as 3.9944. But we do not stop with the mean of the natural logarithm of the Nusselt number. If we say that the mean is like this, then it is expected to have a variance which goes like (Y − Ȳ)². Therefore the total residual, which is given by the sum of the squares of the differences between ln(Nu) and the mean of ln(Nu), will be 1.65, and the average of this will be 0.33. However, we can get a little smart and say that instead of qualifying the data only by the mean, we can have a functional correlation y = ax + b; we can regress a and b, and see whether doing this really helps us. For the problem under consideration, we are saying that with respect to the fit, we are able to reduce the sum of the residuals from S_t = 1.65 to S_r = 0.05. So, of the 1.65, 1.60 is explained by the fact that the Nusselt number goes as a Re^b. Therefore 97% of the variance in the data is explained by this correlation, and so it is a good correlation. In the absence of a correlation, we would report only the mean and the standard deviation. But apart from this, we go deeper, find some physics behind this, propose a power law relationship, and we are able to explain almost all of the variance in the data with a very high r². Sometimes we may get a correlation coefficient which is negative (with a minimum of −1), which may arise in some correlation where, when x increases, y decreases. Suppose we have a correlation that explains only 60% of the variance; either our experiments are erroneous or there are additional variables that we have not taken into account. So when we do a regression, it tells us many things.
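The bookkeeping described above — total variation about the mean S_t, residual after the fit S_r, and the fraction explained — can be sketched as follows. The numbers here are invented for illustration, not the Nusselt number data of the text:

```python
# Coefficient of determination r^2 = (S_t - S_r) / S_t for a data set and a fit.
def r_squared(ys, y_fit):
    mean = sum(ys) / len(ys)
    s_t = sum((y - mean) ** 2 for y in ys)              # spread about the mean
    s_r = sum((y - f) ** 2 for y, f in zip(ys, y_fit))  # spread about the fit
    return (s_t - s_r) / s_t

# hypothetical data scattered about the line y = 2x + 1
ys    = [1.1, 2.9, 5.2, 6.8]
y_fit = [1.0, 3.0, 5.0, 7.0]
print(r_squared(ys, y_fit))  # about 0.995: the fit explains ~99.5% of the variance
```

Reporting only the mean corresponds to r² = 0; a fit that passes through every point gives r² = 1.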
It is not just the experiments or the simulations that help us understand science! Regression itself can be a good teacher! A correlation that has a high r² is not necessarily a good correlation; it is a good correlation purely from a statistical perspective. We do not know if there is a causal relationship between x and y. For example, let us calculate the ratio of the number of rainy days in India in a year to the total number of days (365). Let us also look at the ratio of the number of one day international cricket matches won by India to the total number of matches played in a year. Say we look at the data for the last 5 years. The two may have a beautiful correlation, but we are just trying to plot two irrelevant things. Such a correlation is known as a spurious correlation! So, first we have to really know whether there exists a physical relationship between the variables in question. This can come when we perform experiments, when we look at the non dimensional form of the equations, or by using the Buckingham Pi theorem.

Figure 3.15: A typical parity plot

Outside of the above, we can also draw a parity plot. If we plot y_data against y_fit, the centre line is a 45° line about which the points will be found. The points must be equally spread on both sides of this 45° line, called the parity line. A typical parity plot is given in Fig. 3.15. If all the points lie on the parity line, it is incredible. This may also lead to suspicion! Furthermore, any experiment will have a natural variation. Approximately 50% of the points should be above the parity line and 50% of the points should be below the line.
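The 50–50 balance check on a parity plot can be automated. A minimal sketch, with invented data:

```python
# Count how the data fall about the 45-degree parity line y_data = y_fit.
def parity_balance(y_data, y_fit):
    above = sum(1 for d, f in zip(y_data, y_fit) if d > f)
    below = sum(1 for d, f in zip(y_data, y_fit) if d < f)
    n = len(y_data)
    return above / n, below / n

y_data = [1.1, 2.9, 5.2, 6.8]   # hypothetical measurements
y_fit  = [1.0, 3.0, 5.0, 7.0]   # corresponding fitted values
print(parity_balance(y_data, y_fit))  # (0.5, 0.5): evenly spread about the line
```

A strong imbalance (most points on one side) is the numerical signature of the bunching discussed next.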
When all the points are bunched together on one side, it means there is some higher order physics. For example, when the Nusselt number is increasing and we get more error, the correlation suggests there are additional effects, like the effect of variation of viscosity, that we have not taken into account. This parity plot is also called a scattergram or scatter plot. On the same parity plot, we can have red, blue and green colours to indicate fluids of different Prandtl numbers. Beyond a certain point, it is just creativity, when it comes to how we present the results and how we put the plots together!

Suppose we want to get the height of the students in this class: we measure the height of each student and take the average. We then report the average height and the variance S_y. If the class is large, the heights are likely to follow a Gaussian or normal distribution. Now we can go one step further and attempt a correlation. Suppose we know the date of birth of each person; the height is possibly directly related to the date of birth, such that y = ax + b. With the date of birth, we get a much better fit than just taking all the heights alone and getting the mean. So the goodness of the fit y = ax + b is a measure of whether, with the date of birth, we are able to predict heights much better than just reporting the mean of the population.

The concept of Maximum Likelihood Estimate (MLE)

Figure 3.16: Relationship of LSR with probability

Let us take 4 points (1, 2, 3 and 4), as shown in Fig. 3.16. We want to draw a straight line through these, which is the LSR fit, for example. Assuming that all the measurements are made using the same instruments, we can assume that the distribution of errors in the measurements follows a Gaussian with a standard deviation of σ. If y_fit = ax + b, the probability of getting Y₁ is given by:

P(Y₁|(a,b)) = (1/(√(2π) σ)) e^(−(Y₁ − y_fit,1)²/(2σ²))    (3.61)

This comes from the normal or Gaussian distribution.
Similarly, the probability of getting Y₂ for the same a and b is given by:

P(Y₂|(a,b)) = (1/(√(2π) σ)) e^(−(Y₂ − y_fit,2)²/(2σ²))    (3.62)

Therefore the probability of getting Y₁, Y₂ and so on is given by the product of the individual probabilities, since these are independent events. The total probability L is given by:

L = P(Y₁, Y₂, …|(a,b)) = Π (1/(√(2π) σ)) e^(−(Y_i − y_fit,i)²/(2σ²)),  i = 1, …, N    (3.63)

−2 ln(L) = N ln(2πσ²) + Σ (Y_i − y_fit,i)²/σ²    (3.64)

Now we want to determine the best values of a and b that maximize L. This procedure is known as Maximum Likelihood Estimation. The standard deviation σ is not a variable in the problem; it is known (in fact, for the maximum likelihood estimation procedure to work, we do not even have to know σ; if we know that σ is a constant across measurements, that would suffice). If −2 ln(L) = P, we can make the first derivatives stationary as follows

∂P/∂a = 0 = ∂P/∂b    (3.65)

and get a and b. The resulting equations (3.65) are exactly the same as what we obtained earlier in the derivation of LSR for a straight line (eqns 3.41–3.43).

3.4.2 Performance metrics of LSR

1. Before the correlation, we have the data Y_i and the mean Ȳ, and hence S_t is the sum of the squares of the deviation of each value from the mean: S_t = Σ (Y_i − Ȳ)².

2. S_r is the residue, given by S_r = Σ (Y_i − Y_fit,i)².

3. The coefficient of determination is basically the fractional reduction brought about in S_t by proposing a correlation wherein we fit Y as a function of x. The square root of the coefficient of determination is the correlation coefficient.

4. There is one more quantity of interest, the standard error of the estimate. It is something like the root mean square error (RMSE):

Std. error = √(S_r/(n − 2))    (3.66)

The numerator in this is actually S_r. We divide by (n − 2) and not n because if there are 2 points and it is a straight line, there is no question of an error; it is an exact fit. Therefore, this formula breaks down, or is meaningless, if we try to do a linear fit for 2 points.
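The equivalence claimed above — that maximizing L, or equivalently minimizing −2 ln L of eqn (3.64), reproduces the least-squares coefficients — can be checked numerically. In this sketch the data and σ are invented; the LSR solution is computed from the normal equations, and −2 ln L is verified to be smallest there:

```python
import math

xs, ys = [0.0, 1.0, 2.0, 3.0], [1.1, 2.9, 5.2, 6.8]  # hypothetical data
sigma = 0.5                                           # assumed constant across measurements

def neg2_log_L(a, b):
    """-2 ln(L) of eqn (3.64) for the straight-line model y_fit = a x + b."""
    ss = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return len(xs) * math.log(2 * math.pi * sigma ** 2) + ss / sigma ** 2

# LSR coefficients from the normal equations (3.44)-(3.46)
n, sx, sy = len(xs), sum(xs), sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))
a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
b = (sy - a * sx) / n

# any perturbation of (a, b) only increases -2 ln(L), i.e. decreases L
for da in (-0.1, 0.0, 0.1):
    for db in (-0.1, 0.0, 0.1):
        assert neg2_log_L(a + da, b + db) >= neg2_log_L(a, b)
print("MLE and LSR agree at a = %.2f, b = %.2f" % (a, b))
```

Because −2 ln L differs from the sum of squares only by a constant and a positive factor 1/σ², the two minimization problems share the same solution.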
So when n = 2, the denominator becomes zero and the expression becomes infinite, which does not mean that the standard error is infinity. It simply tells us not to use this formula for n = 2. The formula is good when n is much greater than 2, like 20 or 30 data points. For example, if we develop a correlation between the Nusselt number and the Reynolds number, and we have 40 or 50 data points, instead of dividing by 48, if we divide by 46, it will not make much of a difference; we will get more or less the same standard error. The n and (n − 2) look so different only when n is very small!

Let us look at another example in linear regression. Consider the cooling of a first order system — say, a mercury-in-glass thermometer kept in a patient's mouth. The system has a surface area A and mass m, and on the outside we have a heat transfer coefficient h and an ambient temperature T∞. The governing equation for the temperature of the volume of mercury is

m C_p dT/dt = −hA(T − T∞)    (3.67)

θ = T − T∞    (3.68)

m C_p dθ/dt = −hA θ    (3.69)

θ = θ_i e^(−t/τ)    (3.70)

where τ = m C_p/(hA) is the time constant. This is called a first order system. There are some inherent assumptions.

1. The heat transfer coefficient is constant and does not depend on the temperature of the body (which is a questionable assumption). The mass and specific heat are constant; these are not questionable assumptions. But is the heat transfer coefficient really a constant? Under natural convection, this is not so. The heat transfer coefficient is really variable, but we keep it constant here.

2. Because the body is cooling and giving heat to the surroundings, the temperature of the surroundings is not increasing. That is why it is a first order system. But we can complicate it by having a second order system. For example, there is an infant crying for milk, so the mother is trying to prepare baby food. She puts in milk powder and hot water and mixes them in a glass bottle.
If the milk is too hot, the bottle with milk is immersed in a basin of water. Unfortunately here, T∞ is the temperature of the surrounding water, and this starts increasing as it gets heat from the milk in the bottle. We have got to model this as a second order system. For the problem under consideration, we keep T∞ constant.

The goal of this exercise is to estimate the time constant if we have the temperature versus time history. Basically, we can generate different curves for τ₁, τ₂ and so on. There will be one particular value of τ at which the sum of the squares of (θ_data − θ_fit) is minimum. That is the best estimate of the time constant for this system. From τ, if we know m, C_p and A, we have a very simple method to determine the heat transfer coefficient by means of an unsteady heat transfer experiment.

Example 3.4: Consider the cooling of a first order system of mass m, specific heat C_p and surface area A, at an initial temperature T_i °C. Such a system is placed in "cold" quiescent air or still air at T∞ = 30 °C with a heat transfer coefficient of h. From the experimental data given in Table 3.6, determine the time constant τ and the initial temperature T_i. Evaluate the standard error of the estimate of the temperature and the correlation coefficient.

Table 3.6: Temperature-time history for example 3.4

SNo | t (s) | T (°C)
1   | 10    | 93.3
2   | 30    | 82.2
3   | 60    | 68.1
4   | 90    | 57.9
5   | 130   | 49.2
6   | 180   | 41.4
7   | 250   | 36.3
8   | 300   | 32.9

Solution: We first reduce the solution to the standard equation of a straight line.

θ = θ_i e^(−t/τ)    (3.71)

ln(θ) = ln(θ_i) − t/τ    (3.72)

Y = ax + b

Table 3.7: LSR for example 3.4

SNo | t (s) | θ (°C) | y = ln θ | x² | xy
1 | 10  | 63.3 | 4.1479 | 100    | 41.479
2 | 30  | 52.2 | 3.9551 | 900    | 118.653
3 | 60  | 38.1 | 3.6402 | 3600   | 218.412
4 | 90  | 27.9 | 3.3286 | 8100   | 299.574
5 | 130 | 19.2 | 2.9549 | 16900  | 384.137
6 | 180 | 11.4 | 2.4336 | 32400  | 438.048
7 | 250 | 6.3  | 1.8405 | 62500  | 460.125
8 | 300 | 2.9  | 1.0647 | 90000  | 319.410
Σ | 1050 |     | 23.37  | 214500 | 2280

Table 3.8: Calculation of goodness of fit for example 3.4

ȳ      | y_fit | (y − y_fit)² | (y − ȳ)²
2.9175 | 4.17  | 4.84 × 10⁻⁴  | 1.51
2.9175 | 3.94  | 2.25 × 10⁻⁴  | 1.076
2.9175 | 3.67  | 9 × 10⁻⁴     | 0.522
2.9175 | 3.37  | 1.68 × 10⁻³  | 0.169
2.9175 | 2.93  | 6.25 × 10⁻⁴  | 2 × 10⁻³
2.9175 | 2.42  | 1.69 × 10⁻⁴  | 0.261
2.9175 | 1.71  | 0.017        | 1.158
2.9175 | 1.19  | 0.0157       | 3.432
       |       | ≈ 0.04 (S_r) | 8.13 (S_t)

Now we can regress for a and b (see Table 3.7):

a = (n Σ x_i y_i − Σ x_i Σ y_i) / (n Σ x_i² − (Σ x_i)²) = −0.01

b = (Σ x_i² Σ y_i − Σ x_i Σ x_i y_i) / (n Σ x_i² − (Σ x_i)²) = 4.27

θ = θ_i e^(−t/τ)

θ = 71.7 e^(−t/100)

so τ = 100 s and θ_i = 71.7 °C, which means T_i = 71.7 + 30 = 101.7 °C. This is the initial temperature of the system. We can now evaluate the goodness of the fit (see Table 3.8):

r² = (S_t − S_r)/S_t = (8.13 − 0.04)/8.13 ≈ 0.995, so r ≈ 0.997 or 99.7%

This is a terrific correlation. The correlation coefficient is very high, so this is a very accurate representation of the data. If we are able to regress the data (i.e., the data can be represented as e^(−t/τ)), then about 99.5% of the variance of the data can be explained by proposing this curve. The standard error of the estimate is given by

Std. error = √(S_r/(n − 2)) = √(0.04/6) ≈ 0.08

This standard error of the estimate is acceptable. It means that at any time t, if we use the correlation to obtain the value of θ, there is a 99% chance that ln θ will lie within ±3 × 0.08 = ±0.24 of the fit. This gives us the confidence with which we can predict θ, and is an independent measure of the validity of the estimate. So when θ is high, 0.08 is small in comparison and it is fine. But when θ goes down, the error becomes comparable. Therefore, when we do the experiment, we must try to use the maximum data when the system is hot.
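Example 3.4 can be reproduced in a few lines. The sketch below regresses ln θ against t for the data of Table 3.6 and recovers the time constant, the initial temperature excess, r² and the standard error; the results agree with the worked values above to within the rounding used in the text:

```python
import math

t = [10, 30, 60, 90, 130, 180, 250, 300]               # s
T = [93.3, 82.2, 68.1, 57.9, 49.2, 41.4, 36.3, 32.9]   # deg C
T_inf = 30.0

ys = [math.log(Ti - T_inf) for Ti in T]                # y = ln(theta)
n, sx, sy = len(t), sum(t), sum(ys)
sxx = sum(x * x for x in t)
sxy = sum(x * y for x, y in zip(t, ys))

a = (n * sxy - sx * sy) / (n * sxx - sx ** 2)          # slope = -1/tau
b = (sy - a * sx) / n                                  # intercept = ln(theta_i)
tau, theta_i = -1.0 / a, math.exp(b)

# goodness of fit
mean = sy / n
s_t = sum((y - mean) ** 2 for y in ys)
s_r = sum((y - (a * x + b)) ** 2 for x, y in zip(t, ys))
r2 = (s_t - s_r) / s_t
std_err = math.sqrt(s_r / (n - 2))                     # eqn (3.66)

print(tau, theta_i, r2, std_err)   # close to 100 s, 71.7 C, 0.995 and 0.08
```

The slight differences from the hand calculation come from carrying full precision in the sums rather than the rounded table entries.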
When the system is approaching the temperature of the surroundings and θ is very small — when sufficient time has elapsed and the driving force for heat transfer itself is small — the estimates are more prone to error. Other effects may also come into play in this example. In fact, we can subdivide the data into 3 sets, as early, middle and final phases, and for each of these phases we may get a different value of τ.

There are some other forms that are amenable to linear regression. Let us look at functions of more than one variable, Y = f(x₁, x₂). Suppose we propose that

Y = C₀ + C₁x₁ + C₂x₂    (3.73)

which is the simplest linear form in two variables. Then

S = Σ (Y_i − [C₀ + C₁x₁,i + C₂x₂,i])²    (3.74)

∂S/∂C₀ = ∂S/∂C₁ = ∂S/∂C₂ = 0    (3.75)

This S is the same as Σ(Y_i − Y_fit,i)², which has to be minimized. What is the immediate provocation for trying to understand a form like this? Many thermal sciences problems are two variable problems. For example, convective heat transfer across a tube is a function of the length and diameter of the tube. As another example, in convection the Nusselt number goes as Nu = a Re^m Pr^n; the expression is of the same form for turbulent flow through a pipe. When we take the logarithm on both sides,

ln(Nu) = ln(a) + m ln(Re) + n ln(Pr)

This is of the form Y = C₀ + C₁x₁ + C₂x₂, where x₁ = ln(Re) and x₂ = ln(Pr). Like this, we have several systems in heat transfer and thermal engineering where the power law form can be reduced to the linear form. For certain kinds of problems we can do all these tricks and solve, but for some, beyond a certain point, we have to take up non linear regression. More on this later.

Some other forms that can be linearised:

1. Y = a x^b can be linearised as ln(Y) = ln(a) + b ln(x)
2. Y = a e^(bx) can be linearised as ln(Y) = ln(a) + bx
3. Y = ax/(b + x) can be linearised as 1/Y = 1/a + (b/a)(1/x)

3.4.4 Linear least squares with matrix algebra

Let us now redo the straight line fit using matrix algebra. To make matters simple, let us take only 4 data points, as shown in Table 3.9.
Table 3.9: Linear least square regression with matrix algebra

SNo | x | y    | x² | xy
1   | 0 | 3.2  | 0  | 0
2   | 1 | 6.7  | 1  | 6.7
3   | 2 | 11.5 | 4  | 23.0
4   | 3 | 14.4 | 9  | 43.2
Σ   | 6 | 35.8 | 14 | 72.9

S = Σ (y_i − (a x_i + b))²,  i = 1, …, N    (3.77)

∂S/∂a = 0    (3.78)

∂S/∂b = 0    (3.79)

We can write the above in a compact form as follows:

[Zᵀ Z]{A} = [Zᵀ]{Y}    (3.80)

where each row of Z is [x_i 1], Y is the column of data values and A = [a; b] (rows separated by semicolons). Then

Zᵀ Z = [Σ x_i²  Σ x_i ; Σ x_i  n] = [14 6 ; 6 4]

and the right side, called the forcing vector, is

Zᵀ Y = [Σ x_i y_i ; Σ y_i] = [72.9 ; 35.8]

When we propose [Zᵀ Z]{A} = [Zᵀ]{Y}, we have very concisely written the linear least squares formulation. Now all the tools available for matrix inversion can be used; there is no need to simultaneously solve two equations like we did before. We will see how, by matrix inversion, we can get the same answers.

[14 6 ; 6 4][a ; b] = [72.9 ; 35.8]

[a ; b] = (1/20)[4 −6 ; −6 14][72.9 ; 35.8] = (1/20)[4 × 72.9 − 6 × 35.8 ; −6 × 72.9 + 14 × 35.8] = [3.84 ; 3.19]

Therefore y = 3.84x + 3.19. The values of a and b are the same as those obtained without using matrix algebra. This is a very smart way of doing it. When there are many equations and unknowns, we can use the power of matrix algebra to solve the system of equations elegantly.

3.5 Non-linear least squares

3.5.1 Introduction

We need to first establish the need for non-linear regression in thermal systems. If we disassemble the chassis of our desktop, we will see the processor and an aluminium heat sink on top of it. There may be more than one such sink. There are at least two fans: one near the outlet, while the other is dedicated to the CPU. When we boot, the second fan will not turn on; only when we run several applications and the CPU exceeds a certain temperature does it turn on. Basically, we have a heat sink like what is shown in Fig. 3.17.
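The matrix route can be sketched without any linear algebra library, since the 2×2 system is small enough to invert by hand. The data are those of Table 3.9:

```python
xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.2, 6.7, 11.5, 14.4]

Z = [[x, 1.0] for x in xs]                 # each row of Z is [x_i, 1]

# Z^T Z (2x2) and Z^T Y (2x1), assembled from the sums
ztz = [[sum(r[i] * r[j] for r in Z) for j in range(2)] for i in range(2)]
zty = [sum(r[i] * y for r, y in zip(Z, ys)) for i in range(2)]

# solve [Z^T Z]{A} = [Z^T]{Y} with the explicit 2x2 inverse
det = ztz[0][0] * ztz[1][1] - ztz[0][1] * ztz[1][0]
a = (ztz[1][1] * zty[0] - ztz[0][1] * zty[1]) / det
b = (ztz[0][0] * zty[1] - ztz[1][0] * zty[0]) / det

print(a, b)  # 3.84 and 3.19, as obtained above
```

For more parameters, Z simply gains more columns and the same [Zᵀ Z]{A} = [Zᵀ]{Y} system is handed to a general linear solver.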
Figure 3.17: Depiction of a heat sink used in a desktop PC (aluminium fins, heat sink fan, substrate)

Suppose the whole heat sink is made of a highly conducting material and can be assumed to be at the same temperature; then we can treat it as a lumped capacitance system. So when the system is turned on, we want to determine its temperature response. There is a processor that generates heat at the rate of Q W; the mass is m, the specific heat is C_p, the heat transfer coefficient afforded is h, and the surface area is A. If we treat the heat sink and the processor to be at the same temperature, which is a reasonably good assumption to start with, the governing equation will be

m C_p dT/dt = Q − hA(T − T∞)    (3.81)

It is to be noted that in equation 3.81, m is the combined mass of the CPU and heat sink, and C_p is the mass averaged specific heat. Initially, when we start, the processor is constantly generating heat at the rate of Q watts. At t = 0, T = T∞. So when T = T∞, even though hA is available, ΔT is so small that it is not able to compensate for Q; hence (Q − hAΔT) is positive, which forces the system to a higher temperature. The system temperature keeps climbing. As it climbs, Q, the heat generation from the processor, remains constant, while ΔT keeps going up. So a time will come when Q = hAΔT and the left side is switched off, which means that m C_p dT/dt becomes 0 and the system approaches steady state. After the system approaches steady state, when we shut down the system, Q = 0. Then m C_p dT/dt = −hAΔT, so the temperature will follow a cooling curve that is analogous to what we studied previously. Therefore
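The behaviour described above — the temperature climbing from T∞ and levelling off where Q = hAΔT — follows directly from eqn (3.81). A sketch with invented parameter values, checking the analytic solution against a crude numerical march:

```python
import math

# all parameter values below are made up for illustration
Q, h, A = 20.0, 25.0, 0.05        # W, W/(m^2 K), m^2
m, Cp = 0.3, 900.0                # kg, J/(kg K)
T_inf = 30.0                      # deg C
tau = m * Cp / (h * A)            # time constant of the lumped system, s

def T_analytic(t):
    """Solution of eqn (3.81) with T(0) = T_inf and constant Q, h, A."""
    return T_inf + (Q / (h * A)) * (1.0 - math.exp(-t / tau))

# forward-Euler march of the same ODE, out to five time constants
T, dt = T_inf, 0.01
for _ in range(int(5 * tau / dt)):
    T += dt * (Q - h * A * (T - T_inf)) / (m * Cp)

steady = T_inf + Q / (h * A)      # steady state, where Q = hA (T - T_inf)
print(T_analytic(5 * tau), T, steady)
```

For these numbers both solutions approach a steady state of 46 °C; switching Q to zero from there recovers the first order cooling curve studied earlier.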
