MMW Chapter 4 2

Example: Given a random sample of size n = 10: 4, 7, 8, 2, 8, 8, 9, 2, 5, 7. Using the measures of central tendency, tell whether the given data are symmetric, skewed to the left, or skewed to the right.

Mean = 6, Median = 7, Mode = 8

Since Mean < Median < Mode, the distribution is negatively skewed (skewed to the left).

The formula for the Pearsonian coefficient of skewness, denoted by SK, is

$$SK = \frac{3(\mu - Md)}{\sigma}$$

where
SK = Pearsonian coefficient of skewness
$\mu$ = the mean
Md = the median
$\sigma$ = the standard deviation

If SK = 0, then the distribution is symmetric.
If SK > 0, then the distribution is positively skewed.
If SK < 0, then the distribution is negatively skewed.

Example 10: The following data represent the scores of 7 BS Applied Statistics students in a quiz: $X_1 = 4$, $X_2 = 7$, $X_3 = 8$, $X_4 = 2$, $X_5 = 2$, $X_6 = 9$, $X_7 = 3$. Compute the coefficient of skewness.

Solution: Md = 4, $\mu = 5$, $\sigma = 2.73$

$$SK = \frac{3(5 - 4)}{2.73} = 1.0989 \approx 1.10$$

Hence, the distribution is positively skewed.

Example 11: Given a random sample of size n = 10: $X_1 = 4$, $X_2 = 7$, $X_3 = 8$, $X_4 = 2$, $X_5 = 2$, $X_6 = 8$, $X_7 = 9$, $X_8 = 2$, $X_9 = 5$, $X_{10} = 7$. Compute the coefficient of skewness.

Solution: The sample mean is 4.9; S = 2.69; Md = 6

$$SK = \frac{3(4.9 - 6)}{2.69} = -1.2268 \approx -1.23$$

Hence, the distribution is negatively skewed.

108 | Mathematics in the Modern World

Example 12: Using the data from the Frequency Distribution Table in Example 6, compute the coefficient of skewness.

Solution: Md = 20.06, $\mu = 19.89$, $\sigma = 7.897$

$$SK = \frac{3(19.89 - 20.06)}{7.897} = -0.0646 \approx -0.06$$

Hence, the distribution is negatively skewed.

6. Coefficient of Kurtosis

Kurtosis measures the flatness or peakedness of the distribution of a given data set. It also measures the degree of departure from the normal distribution. A distribution that is more peaked than the normal distribution is called a leptokurtic distribution. A distribution that is flatter than the normal distribution is called a platykurtic distribution. Between these two types is the distribution that is more "normal" in shape, referred to as a mesokurtic distribution.
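Returning to the Pearsonian coefficient of skewness, the arithmetic in Example 10 can be checked with a few lines of Python. This is only a sketch: the function name `pearson_skewness` is a hypothetical helper, and it assumes the population standard deviation (dividing by n), which is what reproduces the value 2.73 used in Example 10.

```python
from statistics import mean, median, pstdev

def pearson_skewness(data):
    """SK = 3 * (mean - median) / sd, using the population sd (assumption)."""
    return 3 * (mean(data) - median(data)) / pstdev(data)

# Scores of the 7 students in Example 10
scores = [4, 7, 8, 2, 2, 9, 3]
sk = pearson_skewness(scores)

# A positive value indicates a positively skewed distribution
print(f"SK = {sk:.2f}")
```

The sign of the result is what matters for the interpretation: positive for a right-skewed set, negative for a left-skewed set, and zero when the mean and median coincide.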
The coefficient of population kurtosis is denoted by K and is given by

$$K = \frac{\sum (X_i - \mu)^4 / N}{\sigma^4}$$

The coefficient of sample kurtosis is denoted by K and is given by

$$K = \frac{\sum (X_i - \bar{x})^4 / n}{s^4}$$

and the coefficient of kurtosis for grouped data, denoted by K, is given by

$$K = \frac{\sum f_i (X_i - \mu)^4 / N}{\sigma^4}$$

If K < 3, then the distribution is platykurtic.
If K > 3, then the distribution is leptokurtic.
If K = 3, then the distribution is mesokurtic.

Example 13: The following data represent the scores of 7 BS Applied Statistics students in a quiz: $X_1 = 4$, $X_2 = 7$, $X_3 = 8$, $X_4 = 2$, $X_5 = 2$, $X_6 = 9$, $X_7 = 3$. Compute the coefficient of kurtosis.

Solution: $\mu = 5$, $\sigma = 2.73$

$$\sum (X_i - \mu)^4 = (4 - 5)^4 + (7 - 5)^4 + \dots + (3 - 5)^4 = 532$$

$$K = \frac{532 / 7}{(2.73)^4} = 1.3682 \approx 1.37$$

Hence, it is a platykurtic distribution.

Example 14: Given a random sample of size n = 10: $X_1 = 4$, $X_2 = 7$, $X_3 = 8$, $X_4 = 2$, $X_5 = 2$, $X_6 = 8$, $X_7 = 9$, $X_8 = 2$, $X_9 = 5$, $X_{10} = 7$. Compute the coefficient of kurtosis.

Solution: The sample mean is 4.9; S = 2.69

$$\sum (X_i - \bar{x})^4 = (4 - 4.9)^4 + (7 - 4.9)^4 + \dots + (7 - 4.9)^4 = 719.017$$

$$K = \frac{719.017 / 10}{(2.69)^4} = 1.373188 \approx 1.37$$

Hence, it is a platykurtic distribution.

Example 15: The table below represents the scores of 64 students in a long quiz.

| Class | Frequency (f) | X | fX | f(X - μ)^4 |
|-------|---------------|----|------|------------|
| 5-9   | 7  | 7  | 49   | 193245.64 |
| 10-14 | 10 | 12 | 120  | 38753.24  |
| 15-19 | 13 | 17 | 221  | 906.85    |
| 20-24 | 18 | 22 | 396  | 356.78    |
| 25-29 | 8  | 27 | 216  | 20444.12  |
| 30-34 | 5  | 32 | 160  | 107534.19 |
| 35-39 | 3  | 37 | 111  | 257111.38 |
| Total | 64 |    | 1273 | 618352.20 |

Solution: $\mu = 19.89$, $\sigma = 7.89761$

$$K = \frac{618352.20 / 64}{(7.898)^4} = 2.483061 \approx 2.48$$

Hence, it is a platykurtic distribution.

D. Measures of Relative Position

A measure of position identifies the rank or position occupied by a data value in an array of collected data.

1. Percentiles are values that divide a set of observations into 100 equal parts. These values, denoted by $P_1, P_2, P_3, \dots, P_{99}$, mean that 1% of the data fall below $P_1$, 2% fall below $P_2$, ..., and 99% fall below $P_{99}$. The position occupied by each score in an array of collected data is based on hundredths when the scores are arranged from highest to lowest or vice versa.
To determine the data value of the desired percentile, the formula $\left(\frac{k}{100}\right)n$ gives the number of observations below the percentile; then, counting from 1 to $\left(\frac{k}{100}\right)n$ in the data arranged in ascending order gives the percentile.

2. Deciles are values that divide a set of observations into 10 equal parts. These values, denoted by $D_1, D_2, D_3, \dots, D_9$, indicate that 10% of the data fall below $D_1$, 20% fall below $D_2$, ..., and 90% fall below $D_9$. The position occupied by each score in an array of collected data is based on tenths when the scores are arranged from highest to lowest or vice versa.

To determine the data value of the desired decile, the formula $\left(\frac{k}{10}\right)n$ gives the number of observations below the decile; then, counting from 1 to $\left(\frac{k}{10}\right)n$ in the data arranged in ascending order gives the decile.

3. Quartiles are values that divide a set of observations into 4 equal parts.

* The 1st Quartile, $Q_1$, also called the lower quartile, is equivalent to $P_{25}$. To determine the 1st quartile, the formula $Q_1 = \frac{n}{4}$ gives the number of observations below the quartile; then, counting from 1 to $\frac{n}{4}$ in the data arranged in ascending order gives the quartile.
* The 2nd Quartile, $Q_2$, is the middlemost score or the median and is equivalent to the 50th percentile. To determine the 2nd quartile, the formula $Q_2 = \frac{2n}{4} = \frac{n}{2}$ gives the number of observations below the quartile; then, counting from 1 to $\frac{n}{2}$ in the data arranged in ascending order gives the quartile.
* The 3rd Quartile, $Q_3$, also called the upper quartile, is equivalent to the 75th percentile. To determine the 3rd quartile, the formula $Q_3 = \frac{3n}{4}$ gives the number of observations below the quartile; then, counting from 1 to $\frac{3n}{4}$ in the data arranged in ascending order gives the quartile.

Example: The scores of ten students in a 20-point Math quiz are as follows: 6, 12, 18, 8, 9, 10, 9, 15, 17, 15. Find the values of $Q_1$, $Q_2$, $D_1$, $D_5$, $P_{10}$, $P_{25}$, $P_{50}$.
Interpret the values.

Solution: Arranged in ascending order, the scores are 6, 8, 9, 9, 10, 12, 15, 15, 17, 18.

$Q_1 = \frac{n}{4} = \frac{10}{4} = 2.5 \approx 3$. This implies that the value is located at the 3rd position, which is 9. Thus, $Q_1 = 9$. This means that 25% of the students got scores equal to or below 9, or 75% of the students got scores equal to or above 9.

$Q_2 = \frac{2n}{4} = \frac{2(10)}{4} = 5$. This implies that the value is located at the 5th position, which is 10. Thus, $Q_2 = 10$. This means that 50% of the students got scores equal to or below 10 and 50% got scores above 10.

$D_1 = \frac{n}{10} = \frac{10}{10} = 1$. This implies that the value is located at the 1st position, which is 6. Thus, $D_1 = 6$. This means that 10% of the students got scores equal to or below 6, or 90% of the students got scores equal to or above 6.

$D_5 = \frac{5n}{10} = \frac{5(10)}{10} = 5$. This implies that the value is located at the 5th position, which is 10. Thus, $D_5 = 10$. This means that 50% of the students got scores equal to or below 10 and 50% got scores above 10.

$P_{10} = \frac{10n}{100} = \frac{10(10)}{100} = 1$. This implies that the value is located at the 1st position, which is 6. Thus, $P_{10} = 6$. This means that 10% of the students got scores equal to or below 6, or 90% of the students got scores equal to or above 6.

$P_{25} = \frac{25n}{100} = \frac{25(10)}{100} = 2.5 \approx 3$. This implies that the value is located at the 3rd position, which is 9. Thus, $P_{25} = 9$. This means that 25% of the students got scores equal to or below 9, or 75% of the students got scores equal to or above 9.

$P_{50} = \frac{50n}{100} = \frac{50(10)}{100} = 5$. This implies that the value is located at the 5th position, which is 10. Thus, $P_{50} = 10$. This means that 50% of the students got scores equal to or below 10 and 50% got scores above 10.

Notice that $Q_1 = P_{25}$, $D_1 = P_{10}$, and $Q_2 = D_5 = P_{50}$.

For grouped data: The formulas for quartiles, deciles and percentiles are derived from the formula of the median, i.e.

$$Q_k = L_{CB} + C\left(\frac{kN/4 - F}{f}\right)$$

where
$L_{CB}$ = lower class boundary of the quartile class
C = class size
F = [...]

[...] $\sigma > 0$, $e = 2.71828$ and $\pi = 3.14159$.

The graph of the normal distribution is called the normal curve.
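The constants e and π quoted above enter the usual normal density, $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$. Assuming that standard form, the sketch below evaluates the ordinate of the curve and compares it with Python's built-in `statistics.NormalDist`. The helper name `normal_pdf` is hypothetical and used only for illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Ordinate of the normal curve at x, for mean mu and sd sigma > 0."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Standard normal distribution: mean 0, standard deviation 1
std = NormalDist(mu=0.0, sigma=1.0)

height_at_mean = normal_pdf(0.0, 0.0, 1.0)  # highest point of the curve
area_left_of_mean = std.cdf(0.0)            # area under the curve left of the mean

print(round(height_at_mean, 4), area_left_of_mean)
```

Because the curve is symmetric about the mean, the area to the left of the mean is one half of the total area, and ordinates at equal distances on either side of the mean are equal.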
Properties of a Normal Curve

* The curve is bell-shaped and symmetric about a vertical axis through the mean $\mu$.
* The normal curve approaches the horizontal axis asymptotically as it proceeds in either direction away from the mean.
* The total area under the curve and above the horizontal axis is equal to 1.

The distribution of a normal random variable with a mean of zero and a standard deviation of 1 is called a standard normal distribution.

From the equation of the normal curve, two parameters describe the shape of the normal curve: the mean and the variance. The three measures of central tendency (mean, median and mode) are identical. Once the mean $\mu$ and the variance $\sigma^2$ are given, the ordinate of f(x) can be easily computed for possible values of x. Symbolically, if X is normally distributed with mean $\mu$ and variance $\sigma^2$, then it is denoted by $X \sim N(\mu, \sigma^2)$.

Characteristics of the Normal Distribution

1. The normal distribution is a continuous distribution in which the random variable X can assume values between $-\infty$ [...]

[...] $P(Z > -0.46) = 1 - P(Z < -0.46) = 1 - 0.3228 = 0.6772$

c. P(-1.17 [...]

[...] P(X > 75) = ?

$$z = \frac{75 - 50}{10} = 2.50$$

$P(X > 75) = P(Z > 2.50) = 1 - P(Z < 2.50) = 1 - 0.9938 = 0.0062$

Therefore, 0.62% of the students will have a score above 75.

What percent of the students will have scores between 45 and 65?

P(45 < X < 65)

$$z_1 = \frac{45 - 50}{10} = -0.50$$

$$z_2 = \frac{65 - 50}{10} = 1.50$$

P(45 [...]

[...] $P(X > X_a) = \frac{[...]}{800} = 0.5675$

$P(Z > z_a) = 0.5675$
$P(Z < z_a) = 1 - 0.5675 = 0.4325$
$P(Z < -0.17) = 0.4325$
$z_a = -0.17$

$$\frac{X_a - 100}{15} = -0.17$$

$X_a = 100 - 0.17(15) = 97.45 \approx 98$, the passing score.

Therefore, a passing score of 98 in the exam should be imposed by the University.

F. Linear Regression and Correlation

A correlation is a relationship or association between two variables.

* A direct or positive relationship between two variables implies that an increase in the value of one of the variables corresponds to an increase in the value of the other variable.
* An inverse or negative relationship between two variables means that an increase in the value of one variable corresponds to a decrease in the value of the other variable.
* A zero relationship exists between two variables if an increase in one is not accompanied by either an increase or a decrease in the other.

Correlation Coefficient

The linear correlation coefficient, denoted by $\rho$ (rho), is a measure of the strength of the linear relationship existing between two variables, X and Y, which is independent of their respective scales of measurement.

Important notes about the linear correlation coefficient:

* $-1 \le \rho \le 1$
* A positive $\rho$ means that the line slopes upward to the right; a negative $\rho$ means that the line slopes downward to the right.
* When $\rho$ is 1 or -1, there is a perfect linear relationship between X and Y and all the points (x, y) fall on a straight line.
* A $\rho$ close to 1 or -1 indicates a strong linear relationship, but it does not necessarily imply that X causes Y or that Y causes X. It is possible that a third variable may have caused the change in both X and Y, producing the observed relationship.
* If $\rho = 0$, then there is no linear correlation between X and Y. A value of $\rho = 0$, however, does not mean lack of association; if a strong quadratic relationship exists between X and Y, a zero correlation may still be obtained because the relationship is nonlinear.
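The last note can be illustrated numerically. The sketch below uses the usual sample product-moment formula for the correlation coefficient and two small made-up data sets (hypothetical values, not taken from the text): one exactly linear and one exactly quadratic. The function name `pearson_r` is likewise only illustrative.

```python
import math

def pearson_r(xs, ys):
    """Sample product-moment correlation from raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

xs = [-2, -1, 0, 1, 2]               # hypothetical x values
linear = [2 * x + 1 for x in xs]     # y depends linearly on x
quadratic = [x * x for x in xs]      # y is fully determined by x, but not linearly

r_linear = pearson_r(xs, linear)
r_quadratic = pearson_r(xs, quadratic)
print(r_linear, r_quadratic)
```

In the quadratic case Y is completely determined by X, yet the coefficient is zero, which is why a zero correlation should be read as "no linear relationship" and not as "no relationship".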
Interpretation of Correlation Coefficient

| Correlation Coefficient | Interpretation |
|-------------------------|----------------|
| -1.00          | Perfect Negative Correlation |
| -0.76 to -0.99 | Very High Negative Correlation |
| -0.51 to -0.75 | High Negative Correlation |
| -0.26 to -0.50 | Moderately Small Negative Correlation |
| -0.01 to -0.25 | Very Small Negative Correlation |
| 0              | No Correlation |
| 0.01 to 0.25   | Very Small Positive Correlation |
| 0.26 to 0.50   | Moderately Small Positive Correlation |
| 0.51 to 0.75   | High Positive Correlation |
| 0.76 to 0.99   | Very High Positive Correlation |
| 1.00           | Perfect Positive Correlation |

Pearson Product Moment Coefficient

Some typical scatterplots with approximate values of the coefficient of correlation r:

[Figure: scatterplots showing a strong positive linear correlation (r is near 1), a strong negative linear correlation (r is near -1), and no apparent linear correlation (r is near 0)]

Data Layout

| n | X | Y | XY | X² | Y² |
|---|---|---|----|----|----|
| 1 | $X_1$ | $Y_1$ | $X_1Y_1$ | $X_1^2$ | $Y_1^2$ |
| 2 | $X_2$ | $Y_2$ | $X_2Y_2$ | $X_2^2$ | $Y_2^2$ |
| 3 | $X_3$ | $Y_3$ | $X_3Y_3$ | $X_3^2$ | $Y_3^2$ |
| ... | ... | ... | ... | ... | ... |
| n | $X_n$ | $Y_n$ | $X_nY_n$ | $X_n^2$ | $Y_n^2$ |
| Total | $\sum X$ | $\sum Y$ | $\sum XY$ | $\sum X^2$ | $\sum Y^2$ |

Example 1: A principal of a public high school wishes to investigate how well the entrance examination scores affect the grade point average of the freshmen students. The data of a random sample of 15 freshmen students are as follows:

[Table: Entrance Score (X), GPA (Y), XY, [...]]

$$r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[n\sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n\sum y_i^2 - \left(\sum y_i\right)^2\right]}}$$

r = 0.858083 ≈ 0.858

Testing for the Significance of Pearson r

1. Ho: $\rho = 0$, or there is no significant relation between entrance scores and GPA.
   Ha: $\rho \ne 0$, or there is a significant relation between entrance scores and GPA.
2. Level of significance: $\alpha = 0.05$ and sample size n = 15
3. Test statistic: t-test
4. Critical region: Reject Ho if $|t_c| > 2.160$
5. Computations: Compute the t-test statistic using the formula

$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} = \frac{0.858\sqrt{13}}{\sqrt{1 - (0.858)^2}} \approx 6.023$$
6. Decision: Since $t_c > 2.160$, reject Ho and conclude that there is a significant relation between the entrance examination scores and the grade point average of freshmen students.

Important notes about the Pearson product moment coefficient of correlation:

* r is used to estimate $\rho$ based on a random sample of n pairs of measurements (X, Y).
* $-1 \le r \le 1$
* Just like $\rho$, when r = 1 or -1, all the points (x, y) fall on a straight line; when r = 0, they are scattered and give no evidence of a linear relationship. Any other value of r suggests the degree to which the points tend to be linearly related.

Spearman Rank Correlation Coefficient

The Spearman rank correlation coefficient is the best-known measure of relationship between two variables based on ranks (ordinal scale). It is applicable when quantitative measurements of the variables are not normally distributed and could be ranked in two ordered series. Its formula is given by

$$r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$

where $d_i$ is the difference between the ith paired ranks, and n is the total number of paired measurements.

Data Layout

| n | X | Y | Rank of X | Rank of Y | $d_i$ | $d_i^2$ |
|---|---|---|-----------|-----------|-------|---------|
| 1 | $X_1$ | $Y_1$ | | | $d_1$ | $d_1^2$ |
| 2 | $X_2$ | $Y_2$ | | | $d_2$ | $d_2^2$ |
| 3 | $X_3$ | $Y_3$ | | | $d_3$ | $d_3^2$ |
| ... | ... | ... | | | ... | ... |
| n | $X_n$ | $Y_n$ | | | $d_n$ | $d_n^2$ |

Example: An administrator wishes to determine whether there is a significant relationship between the self-evaluation and the supervisors' evaluation of their faculty members. A random sample of 10 faculty members were asked to rate their overall performance on a scale ranging from 1 to 5 (5 as the highest rate). Their ratings are given as [...]
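The Spearman formula can be sketched in a few lines of Python. This assumes the usual form $r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$ and ranks with no ties. The ranks below are hypothetical and are not the faculty ratings from the example. The function name `spearman_rs` is also only illustrative.

```python
def spearman_rs(rank_x, rank_y):
    """Spearman rank correlation from two lists of paired ranks (no ties assumed)."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical paired ranks for five subjects
rank_x = [1, 2, 3, 4, 5]
rank_y = [2, 1, 4, 3, 5]

rs = spearman_rs(rank_x, rank_y)
print(round(rs, 2))
```

Identical rankings give a coefficient of 1 and exactly reversed rankings give -1, so the value is read on the same scale as the Pearson coefficient.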
