Variations in Repeated Measured Values

* Suppose that we want to measure the temperature of boiling water at atmospheric pressure, which is 100°C.
* We take samples at a sample rate of $f_s = 10$ Hz for $T = 10$ seconds.
* Thus, the number of samples is $M = f_s T = 100$.
* The measured values vary with time (noise), and none of the data points is exactly 100°C. What is the measured temperature? What is the temperature that you report?

Scanned with CamScanner

Review - Statistics

* Finite Statistics
  - Deals with real (finite) datasets
  - Useful for measurements
* Infinite Statistics
  - Deals with infinite data samples
  - Useful for theoretical studies

Sample mean and sample standard deviation (finite dataset):

$$\bar{x} = \frac{1}{M}\sum_{i=1}^{M} x_i, \qquad S_x = \sqrt{\frac{1}{M-1}\sum_{i=1}^{M}\left(x_i - \bar{x}\right)^2}$$

True mean and true standard deviation (infinite dataset):

$$x' = \lim_{M\to\infty}\frac{1}{M}\sum_{i=1}^{M} x_i, \qquad \sigma = \sqrt{\lim_{M\to\infty}\frac{1}{M}\sum_{i=1}^{M}\left(x_i - x'\right)^2}$$

The sample mean is not the true mean. As the number of samples approaches infinity (in practice, more than 10,000 samples), it reaches the true mean.

Histogram

[Slide text not legible in the scan.]

1/6/2021

Histograms & Probability Distributions

[Figure: two panels. Left: "Area under histogram represents all data (= 100% of data)". Right: "The probability distribution of the data". Regions are labeled 100%, 55%, and 12%.]

* A probability distribution is a function that represents the likelihood that a data point falls within a certain range.
* The probability that a point is inside the histogram is 100%.

Standard Statistical Distributions to Measurements

* The actual shape that the probability density function takes depends on the nature of the variable it represents and the circumstances surrounding the process in which the variable is involved.
* There are a number of standard distribution shapes that suggest how a variable could be distributed on the probability density plot.
* Often, experimentally determined histograms are used to identify which standard distribution the measured variable tends to follow.

Probability: Infinite Dataset

The data follows a Normal or Gaussian distribution (bell curve).

[Figure: bell curve centered on the mean, with the $z$ axis marked at -1, 0, and +1.]

* $z$ is a normalized parameter: $z = (x - x')/\sigma$, where $x$ is the measured value.
* $z = 0$: the original data point is equal to the mean value.
* $z = \pm 1$: the original data point is 1 standard deviation away from the mean value.
* About 68% of the data is within 1 standard deviation of the mean value.
* 95.4% of the data is within 2 standard deviations of the mean value: $x_{95.4\%} = x' \pm 2\sigma$
* 99.7% of the data is within 3 standard deviations of the mean value: $x_{99.7\%} = x' \pm 3\sigma$

What is the z value for a different probability P?

| P | 50% | 90% | 95% | 99% | 99.9% |
|---|-----|-----|-----|-----|-------|
| z | 0.6745 | 1.6449 | 1.9600 | 2.5758 | 3.2905 |

Probability: Finite Dataset

* The z factors are based on infinite datasets: $x' \pm z\sigma$.
* For finite datasets, the Student's t-distribution has to be used instead of the z value: $\bar{x} \pm t_{\nu,P} S_x$.
* The Student's t-distribution depends on two parameters:
  - $P$: probability (two-tail confidence level)
  - $\nu$: degrees of freedom, $\nu = M - 1$
* For a large dataset (sample size $M > 10{,}000$), the values of t and z are practically identical.

Estimation of True Mean Value

* How close is the sample mean to the true mean?
* Using statistics we can estimate a confidence interval for the true mean based on the sample mean.

Step 1: Calculate the standard deviation of the mean:

$$S_{\bar{x}} = \frac{S_x}{\sqrt{M}}$$

Step 2: Estimate the uncertainty interval:

$$x' = \bar{x} \pm t_{\nu,P}\, S_{\bar{x}} \quad (P\%)$$

Example: Estimation of True Mean

Given $M = 20$, $\bar{x} = 25.72$, $S_x = 0.34$, and $P = 99\%$:

$$S_{\bar{x}} = \frac{S_x}{\sqrt{M}} = \frac{0.34}{\sqrt{20}} = 0.076$$

$$\nu = M - 1 = 19, \qquad t_{\nu,P} = t_{19,99\%} = 2.861$$

$$x'_{99\%} = \bar{x} \pm t_{\nu,P}\, S_{\bar{x}} = 25.72 \pm 2.861 \times 0.076 = 25.72 \pm 0.22$$

Desired Confidence Interval of True Mean

* Sometimes, we want to determine the number of data samples, $M$, needed to obtain a given confidence interval for the true mean.

$$x' = \bar{x} \pm t_{\nu,P}\frac{S_x}{\sqrt{M}}$$

Set the term $t_{\nu,P} S_x/\sqrt{M}$ equal to the desired confidence interval, $CI$, and solve for the number of required samples, $M$:

$$CI = t_{\nu,P}\frac{S_x}{\sqrt{M}} \quad\Rightarrow\quad M = \left(\frac{t_{\nu,P}\, S_x}{CI}\right)^2$$

Example: Desired Error Range of True Mean

Use the data from the previous example ($\nu = 19$, $P = 99\%$, $S_x = 0.34$) to estimate the number of samples required to reduce the confidence interval of the true mean to $\pm 0.1$.

$$M = \left(\frac{t_{\nu,P}\, S_x}{CI}\right)^2 = \left(\frac{2.861 \times 0.34}{0.1}\right)^2 = 95$$

For this new $M$ ($\nu = 94$) we have to find the new $t_{\nu,P}$ from the t-table ($t_{94,99\%} = 2.629$) and recalculate $M$, which gives $M = 80$.
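The true-mean interval and the sample-size estimate above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the t-values are typed in from the t-table rather than computed.

```python
import math


def true_mean_interval(sample_mean, sample_std, m, t_value):
    """Return (low, high) for x' = x_bar +/- t * S_x / sqrt(M)."""
    std_of_mean = sample_std / math.sqrt(m)   # Step 1: S_xbar
    half_width = t_value * std_of_mean        # Step 2: t * S_xbar
    return sample_mean - half_width, sample_mean + half_width


def required_samples(sample_std, t_value, ci):
    """Return M = (t * S_x / CI)^2, rounded up to a whole sample."""
    return math.ceil((t_value * sample_std / ci) ** 2)


# Numbers from the worked examples: M = 20, x_bar = 25.72, S_x = 0.34, P = 99 %.
low, high = true_mean_interval(25.72, 0.34, 20, t_value=2.861)
print(f"true mean lies in [{low:.2f}, {high:.2f}] at 99 %")

# First pass uses t for nu = 19; second pass uses t for nu = 94 (t-table values).
m_first = required_samples(0.34, t_value=2.861, ci=0.1)
m_second = required_samples(0.34, t_value=2.629, ci=0.1)
print(m_first, m_second)
```

With the example's numbers, the first pass gives 95 samples and the second pass gives 80. Each further pass needs a fresh t-value looked up for the new degrees of freedom.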
Excerpt from the t-table (two-tail probability $P$):

| $\nu$ | 50% | 90% | 95% | 99% |
|-------|-----|-----|-----|-----|
| 60 | 0.6786 | 1.6706 | 2.0003 | 2.6603 |
| 100 | 0.6770 | 1.6602 | 1.9840 | 2.6259 |
| 1000 | 0.6747 | 1.6464 | 1.9623 | 2.5808 |

For $M = 80$ ($\nu = 79$), the new t-value is obtained ($t_{\nu,P} = 2.64$), and $M$ is recalculated as $M = 81$. If no significant change is observed in $M$, we can say that the iteration has converged.

Finally, we need to conduct the measurement and sample 81 data points, followed by calculation of the standard deviation. If no significant change is observed between the new and old standard deviations, the calculations are finished. Otherwise, the above steps should be repeated for the new standard deviation.

Regression Analysis

* Sometimes, we need to fit an equation to the measured input and output data (e.g., for calibration).

[Figure: two panels, "Linear Fit" and "Nonlinear Fit".]

* Regression analysis is used for:
  - finding the "best fit" equation
  - estimating the "goodness of fit" of the regression to the measurement data

We usually deal with linear regression in measurements, where the objective is to fit a straight line to an input/output dataset.

Linear Regression Analysis

The straight-line regression $y = a_1 x + a_0$ requires the identification of two variables: the intercept $a_0$ and the slope $a_1$.

Linear Regression Analysis (Cont'd)

* How good is the fit?
* We need to find the deviation of the data from the regression line. This deviation is measured by the standard deviation of the regression, with $\nu = M - 2$ degrees of freedom:

$$S_{yx} = \sqrt{\frac{\sum_{i=1}^{M}\left(y_i - (a_1 x_i + a_0)\right)^2}{M - 2}}$$

* $r^2 = 1$ indicates perfect correlation to the line fit and $r^2 = 0$ indicates no correlation.
* Typically, $r^2 > 0.9$ is desirable.

Linear Regression: The Confidence Band

"Confidence band" of the regression line:

$$y = a_1 x + a_0 \pm t_{\nu,P}\, S_{yx}\left[\frac{1}{M} + \frac{(x - \bar{x})^2}{\sum_{i=1}^{M}(x_i - \bar{x})^2}\right]^{1/2}$$

* The two confidence bands surrounding the best-fit line define the confidence interval of the best-fit line.
* You can be P% confident that the two curved confidence bands enclose the true best-fit linear regression line.
* The confidence bands are curved. This does not mean that the confidence band includes the possibility of curves.
In fact, the curves are the boundaries of all possible straight lines.

[Figure: regression line with the upper and lower confidence bands.]

Linear Regression: The Prediction Band

Confidence Band vs. Prediction Band

The P% prediction band is the area in which you expect P% of all data points to fall. In contrast, the P% confidence band is the area that has a P% chance of containing the true regression line.

$$y = a_1 x + a_0 \pm t_{\nu,P}\, S_{yx}\left[1 + \frac{1}{M} + \frac{(x - \bar{x})^2}{\sum_{i=1}^{M}(x_i - \bar{x})^2}\right]^{1/2}$$

True Value of Slope & Intercept

Uncertainty in the intercept:

$$S_{a_0} = S_{yx}\sqrt{\frac{\sum_{i=1}^{M} x_i^2}{M\sum_{i=1}^{M}(x_i - \bar{x})^2}}, \qquad a_{0,\mathrm{true}} = a_0 \pm t_{\nu,P}\, S_{a_0}$$

Uncertainty in the slope:

$$S_{a_1} = S_{yx}\sqrt{\frac{1}{\sum_{i=1}^{M}(x_i - \bar{x})^2}}, \qquad a_{1,\mathrm{true}} = a_1 \pm t_{\nu,P}\, S_{a_1}$$

Useful Excel Commands

* Mean of a dataset: =AVERAGE(Dataset)
* Standard deviation of a dataset: =STDEV(Dataset)
* Value of the Student's t-distribution: =T.INV.2T((1 - confidence level), degrees of freedom)

Useful Excel Commands (Cont'd)

Linear regression data: =LINEST(y dataset, x dataset, TRUE, TRUE)

* First, select 6 empty cells (2 columns by 3 rows), then enter this command.
* After entering the command, do not press Enter; press CTRL+SHIFT+ENTER instead.
* The values obtained in the selected cells (2 columns by 3 rows) are as follows:

| Column 1 | Column 2 |
|----------|----------|
| Value of the slope ($a_1$) | Value of the intercept ($a_0$) |
| Standard deviation of the slope ($S_{a_1}$) | Standard deviation of the intercept ($S_{a_0}$) |
| Correlation coefficient ($r^2$) | Standard deviation of the regression ($S_{yx}$) |

In this Excel command:

* The 1st TRUE tells LINEST not to force the y-intercept to be zero.
* The 2nd TRUE tells LINEST to return additional regression parameters besides the slope and intercept.

Example: Confidence/Prediction Bands & True Value of Slope & Intercept

Measured data ($M = 8$):

| x [mV] | y [lbf in] |
|--------|------------|
| 0.1798 | 9.262 |
| … | … |
| … | … |
| 1.782 | 132.675 |
| 2.362 | 173.812 |
| 2.467 | 191.715 |
| 2.986 | 227.700 |
| 3.427 | 271.320 |

Using the linear regression analysis:

* $M = 8$, $a_0 = -6.432$, $a_1 = 80.025$, $\bar{x} = 1.828$
* Best-fit equation: $y\,[\text{lbf in}] = 80.025\, x\,[\text{mV}] - 6.432$
* Standard deviation of the regression: $S_{yx} = 4.496$
* Standard deviation of the intercept: $S_{a_0} = 3.010$
* Standard deviation of the slope: $S_{a_1} = 1.399$

Example (Cont'd)

* For a probability of 95% and $M = 8$ data points ($\nu = M - 2 = 6$), we find $t_{6,95\%} = 2.447$ from the t-table.
* The confidence bands of the regression line will be:

$$y = 80.025\,x - 6.432 \pm t_{\nu,P}\, S_{yx}\left[\frac{1}{M} + \frac{(x - \bar{x})^2}{\sum_{i=1}^{M}(x_i - \bar{x})^2}\right]^{1/2} = 80.025\,x - 6.432 \pm 2.447 \times 4.496 \times \left[\frac{1}{8} + \frac{(x - 1.828)^2}{10.330}\right]^{1/2}$$

Upper confidence band:

$$y_{\text{upper band}} = 80.025\,x - 6.432 + 11.000 \times \left[\frac{1}{8} + \frac{(x - 1.828)^2}{10.330}\right]^{1/2}$$

Lower confidence band:

$$y_{\text{lower band}} = 80.025\,x - 6.432 - 11.000 \times \left[\frac{1}{8} + \frac{(x - 1.828)^2}{10.330}\right]^{1/2}$$

Example (Cont'd)

* The prediction bands of the measured data will be:

$$y = 80.025\,x - 6.432 \pm t_{\nu,P}\, S_{yx}\left[1 + \frac{1}{M} + \frac{(x - \bar{x})^2}{\sum_{i=1}^{M}(x_i - \bar{x})^2}\right]^{1/2}$$

Upper prediction band:

$$y_{\text{upper band}} = 80.025\,x - 6.432 + 11.000 \times \left[1 + \frac{1}{8} + \frac{(x - 1.828)^2}{10.330}\right]^{1/2}$$

Lower prediction band:

$$y_{\text{lower band}} = 80.025\,x - 6.432 - 11.000 \times \left[1 + \frac{1}{8} + \frac{(x - 1.828)^2}{10.330}\right]^{1/2}$$

Example (Cont'd)

[Figure: measured data points, line fit, upper and lower confidence bands, and upper and lower prediction bands, plotted against voltage (mV).]

Example (Cont'd)

* True intercept value:

$$a_{0,\mathrm{true}} = a_0 \pm t_{\nu,P}\, S_{a_0} = -6.432 \pm 2.447 \times 3.010 = -6.432 \pm 7.366$$

* True slope value:

$$a_{1,\mathrm{true}} = a_1 \pm t_{\nu,P}\, S_{a_1} = 80.025 \pm 2.447 \times 1.399 = 80.025 \pm 3.423$$
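The band formulas from the example can be evaluated at any input voltage with a short Python sketch. It is a minimal illustration, not the lecture's own implementation: the constant and function names are hypothetical, and every number is copied from the example rather than refitted from the data.

```python
import math

# Numbers copied from the worked example; the names are illustrative only.
A0, A1 = -6.432, 80.025   # intercept and slope of the best-fit line
S_YX = 4.496              # standard deviation of the regression
M = 8                     # number of data points
X_MEAN = 1.828            # mean of the x values
SXX = 10.330              # sum of (x_i - x_mean)^2
T_VALUE = 2.447           # t-table value for nu = M - 2 = 6, P = 95 %


def half_width(x, prediction=False):
    """Half-width of the confidence band, or of the prediction band."""
    inside = 1.0 / M + (x - X_MEAN) ** 2 / SXX
    if prediction:
        inside += 1.0  # the extra "1 +" term of the prediction band
    return T_VALUE * S_YX * math.sqrt(inside)


def band(x, prediction=False):
    """Return (lower, fit, upper) at input x."""
    fit = A1 * x + A0
    h = half_width(x, prediction)
    return fit - h, fit, fit + h


for x in (0.5, X_MEAN, 3.0):
    lo_c, fit, hi_c = band(x)
    lo_p, _, hi_p = band(x, prediction=True)
    print(f"x={x:.3f} fit={fit:.2f} "
          f"conf=[{lo_c:.2f}, {hi_c:.2f}] pred=[{lo_p:.2f}, {hi_p:.2f}]")
```

Both bands are narrowest at the mean of the x values and widen toward the ends of the range, which is why they appear curved in the plot. The prediction band is always the wider of the two.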
