DEPARTMENT OF STATISTICS
FACULTY OF COMPUTING AND MATHEMATICAL SCIENCES
ALIKO DANGOTE UNIVERSITY OF SCIENCE AND TECHNOLOGY, WUDIL

COURSE TITLE: Engineering Statistics
COURSE CODE: ENG4201
Lecture Note
PREPARED BY Musa Uba Muhammad (PhD)

COURSE OUTLINE
1. Sampling and Sampling Distribution, Frequency Tables & Their Graphs.
2. Center of Distribution & Spread of Distribution.
3. Probabilities and Their Outcomes & Conditional Probability.
4. Independence and Standard Deviation.
5. Random Variables, Expectation & Variance for both Discrete and Continuous Distribution.
6. Higher Dimensional Random Variables (Bivariate & Multivariate).
7. Normal Probability Distributions.
8. Correlation & Regression Analysis.
9. Law of Large Numbers & Central Limit Theorem.
10. Test of Hypothesis & Quality Control.

COURSE REQUIREMENTS
This is a compulsory course for all Level 400 Engineering students. Students are expected to have a minimum of 75% attendance to be able to write the final examination. The proportion of the total allocated marks for this course is:
* Examination: 60% (at semester end)
* Continuous Assessments: 40% (attendance, test, and assignment)

SECTION ONE: SAMPLING AND SAMPLING DISTRIBUTION
Definition: Sampling is a statistical technique for selecting individual members or a subset of the population in order to make statistical inferences from them and to estimate the characteristics of the whole population (Sing 2014). It is also a time-saving and cost-effective method, and hence forms the basis of any research design.

Example: If an agronomic manufacturing company decides to check the adverse effect of a certain fertilizer on a given farmland, it may be difficult to check the complete farm. In this case, the researcher may decide to select a sample, i.e. a fraction of the farmland, chosen in such a way as to represent the entire population.

Types of Sampling Techniques
1. 
Probability sampling: Sampling in which a researcher sets selection criteria and then chooses members of the population randomly. Every member has an equal opportunity to be part of the sample under this selection parameter. Probability sampling techniques include:
   A. Simple Random Sampling (SRS)
   B. Systematic Sampling
   C. Stratified Random Sampling
   D. Cluster Sampling
2. Non-probability or Non-random Sampling: A technique in which the choice of individuals for a sample is based on convenience, personal choice, or interest. This includes:
   A. Convenience Sampling
   B. Judgemental Sampling
   C. Quota Sampling

Note: Let N = population size and n = sample size. Then there are:
   $N^n$ possible samples if sampling is with replacement;
   $\binom{N}{n}$ possible samples if sampling is without replacement.

Exercise 1
1. Write and explain all types of sampling techniques you know.
2. With the aid of examples, differentiate between probability and non-probability sampling.

1.2. SAMPLING DISTRIBUTION
Given a random variable X, if we arrange its values in ascending order, assign a probability to each of the values, and present the result in the form of a relative frequency distribution, the result is called the sampling distribution of X.

Definitions:
a. A Statistical Population is a collection of all items with the same characteristics.
b. A Sample is a fraction or a subset of the population.
c. A Random Sample of size k is a sample chosen in such a way as to ensure that every sample of size k has an equal probability of being selected.
d. A Parameter is a number describing some UNKNOWN characteristic of the population, e.g. $\mu$.
e. A Statistic is a function of the sample observations, e.g. $\bar{x}$.
f. The probability distribution of a Statistic is known as its sampling distribution (how $\bar{x}$ is distributed).

There is a need to distinguish the distribution of a random variable from its realization. For instance, a sample statistic (mean or variance) computed from data takes a single value, e.g. $\bar{x} = 4.2$.
The random variable, say $\bar{X}$, covers all possible observations, whereas a realization is the single value obtained when we get data and calculate some statistic.

1.3. SAMPLING DISTRIBUTION OF THE SAMPLE MEAN
The sampling distribution of the sample mean is a theoretical probability distribution that shows the functional relationship between the values of a given sample mean, based on samples of size n, and the probability associated with each value, for all possible samples of size n drawn from a specific population.

There are three (3) properties of interest in a given sampling distribution:
1. Its mean
2. Its variance
3. Its functional form

Steps for Construction of the Sampling Distribution of the Mean
1. Given a finite population of size N, draw all possible samples of size n.
2. Calculate the mean for each sample.
3. Summarize the means obtained in step 2 in terms of a frequency distribution.

Example 2
Suppose we have a population of size N = 5, consisting of the ages of five children: 6, 8, 10, 12 and 14. Take samples of size 2 with replacement and construct the sampling distribution of the mean.

Solution: N = 5, n = 2. We have $N^n = 5^2 = 25$ possible samples, since sampling is with replacement.

Step 1: Draw all possible samples:

        6          8          10         12         14
6       (6, 6)     (6, 8)     (6, 10)    (6, 12)    (6, 14)
8       (8, 6)     (8, 8)     (8, 10)    (8, 12)    (8, 14)
10      (10, 6)    (10, 8)    (10, 10)   (10, 12)   (10, 14)
12      (12, 6)    (12, 8)    (12, 10)   (12, 12)   (12, 14)
14      (14, 6)    (14, 8)    (14, 10)   (14, 12)   (14, 14)

Step 2: Sample means:

        6     8     10    12    14
6       6     7     8     9     10
8       7     8     9     10    11
10      8     9     10    11    12
12      9     10    11    12    13
14      10    11    12    13    14

Step 3: Summary of the means obtained in Step 2:

$\bar{x}$    Frequency
6            1
7            2
8            3
9            4
10           5
11           4
12           3
13           2
14           1
Total        25

a) Mean of $\bar{X}$: $\mu_{\bar{x}} = \frac{\sum f\bar{x}}{\sum f} = \frac{250}{25} = 10$
b) Variance of $\bar{X}$: $\sigma_{\bar{x}}^2 = \frac{\sum f(\bar{x} - \mu_{\bar{x}})^2}{\sum f} = \frac{100}{25} = 4$

1.4. FREQUENCY DISTRIBUTION AND THEIR GRAPHS
A frequency distribution shows how the total number of observations is distributed among the various categories, classes or groups of a variable. The data points can be summarized with the aid of a simple frequency distribution table or a graph.
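The three construction steps of Example 2 can be checked with a short Python sketch. It is only an illustration under stated assumptions: the variable names are hypothetical, samples are treated as ordered pairs (so (6, 8) and (8, 6) count separately, as in the Step 1 table), and the variance is the frequency-weighted variance of the sample means. The table it builds is an ungrouped frequency distribution of the kind described in 1.4.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

ages = [6, 8, 10, 12, 14]   # population of Example 2 (N = 5)
n = 2                       # sample size

# Step 1: all ordered samples of size n drawn with replacement (N**n of them).
samples = list(product(ages, repeat=n))

# Step 2: the mean of each sample (exact fractions avoid rounding issues).
means = [Fraction(sum(s), n) for s in samples]

# Step 3: frequency distribution of the sample means.
freq = Counter(means)

total = sum(freq.values())
mean_of_means = sum(m * f for m, f in freq.items()) / total
var_of_means = sum(f * (m - mean_of_means) ** 2 for m, f in freq.items()) / total

for m in sorted(freq):
    print(m, freq[m])
print("mean:", mean_of_means, "variance:", var_of_means)
```

Using `Fraction` keeps every sample mean exact, so the frequency counts do not depend on floating-point rounding.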
Example 3
Refer back to Example 2, Step 3 (ungrouped frequency distribution).

Example 4
When the dataset is relatively large, especially when the observations are far apart, listing every value would defeat the goal of summarizing the data in tabular form. It then becomes necessary to collapse the data into "classes" along with their corresponding frequencies. This is known as a grouped frequency distribution.

The data below represent several observations recorded in an engineering firm.

12, 01, 03, 05, 06, 13, 11, 07, 08, 10, 03, 02, 21, 08, 10, 16, 20, 03, 12, 13, 04, 06, 20, 03, 11, 16, 21, 04, 19, 16, 06, 10, 16, 05, 12, 17, 08, 09, 20, 21, 13, 09, 14, 07, 14, 18, 02, 10, 21, 17, 18, 19, 15, 07, 15, 19, 01, 11, 20, 10

Use the data above to form a frequency distribution, using the intervals 1-3, 4-6, 7-9, 10-12, etc.

Solution:

Class Interval    Tally               Frequency
1-3               ///// ///           8
4-6               ///// //            7
7-9               ///// ///           8
10-12             ///// /////         10
13-15             ///// //            7
16-18             ///// ////          9
19-21             ///// ///// /       11
Total                                 60

Lower Limit    Upper Limit    Lower Boundary    Upper Boundary    Class Mark
1              3              0.5               3.5               2
4              6              3.5               6.5               5
7              9              6.5               9.5               8
10             12             9.5               12.5              11
13             15             12.5              15.5              14
16             18             15.5              18.5              17
19             21             18.5              21.5              20

Class mark / Mid-point = (Lower limit + Upper limit) / 2

Graphical Presentation of Data
A graph is a geometric image of a frequency distribution. It is a mathematical picture, usually converted into a visual model to ease understanding. The graphical methods to be considered are: bar chart, pie chart, histogram, frequency polygon, and ogive.

BAR CHART
A simple bar chart consists of vertical bars of equal width, whose heights represent the various frequencies of the data points. It is appropriate if the data points or groups are not split into components.

Example 5
The data below represent the distribution of marks scored by a level three student in the Statistics department.

Course     Marks
STA3301    61
STA3303    75
STA3305    82
STA3307    92
MTH3303    58

Represent the above information with the aid of a bar chart.
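As a rough illustration of how such a chart is built, the sketch below prints a text bar chart of the Example 5 marks. The function name `text_bar_chart` and the `scale` parameter are hypothetical, and the bars are drawn horizontally only because that is simpler to print as text; each bar's length is proportional to the mark.

```python
# Marks from Example 5, keyed by course code.
marks = {"STA3301": 61, "STA3303": 75, "STA3305": 82,
         "STA3307": 92, "MTH3303": 58}

def text_bar_chart(data, scale=2):
    """Return one line per category; bar length is value // scale."""
    width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        bar = "#" * (value // scale)          # one '#' per `scale` marks
        lines.append(f"{label:<{width}} | {bar} {value}")
    return lines

chart = text_bar_chart(marks)
print("\n".join(chart))
```

All bars share the same scale and thickness, so only their lengths differ, which is the defining property of a simple bar chart.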
[Figure: bar chart of Marks by course (STA3301, STA3303, STA3305, STA3307, MTH3303)]

Note: If the categories of the subject matter are split into two or more components, a component or multiple bar chart is more appropriate.

EXERCISE 2
1. Read on component and multiple bar charts.
2. Also read on the merits and demerits of the bar chart.

PIE CHART
A pie chart consists of a circle partitioned into sectors that represent the various proportions of the variables or groups in a given data set. Each group in the data is represented by one sector of the circle. Since the angle at a point is 360°, the angle of each sector is obtained by multiplying the proportion of each group by 360°.

EXAMPLE 6
Represent the information given in Example 5 with a pie chart.

Course     Marks    Angle of sector     Angle (nearest degree)
STA3301    61       61/368 × 360°       60°
STA3303    75       75/368 × 360°       73°
STA3305    82       82/368 × 360°       80°
STA3307    92       92/368 × 360°       90°
MTH3303    58       58/368 × 360°       57°
TOTAL      368                          360°

[Figure: pie chart of marks by course (STA3301, STA3303, STA3305, STA3307, MTH3303)]

EXERCISE 3
Read the merits and demerits of the pie chart.

HISTOGRAM
A histogram is a set of vertical bars with equal bases and different heights. It is also called a bar graph or frequency histogram.

EXAMPLE 7
The data below show the distribution of waiting times for some customers in a bank.

Waiting time (class boundary)    Class mark    No. of customers
1.45 - 1.95                      1.7           3
1.95 - 2.45                      2.2           10
2.45 - 2.95                      2.7           18
2.95 - 3.45                      3.2           10
3.45 - 3.95                      3.7           7
3.95 - 4.45                      4.2           2
TOTAL                                          50

[Figure: histogram of the number of customers in each waiting-time class (Group 1 to Group 6)]

SECTION TWO: MEASURES OF LOCATION / CENTRAL TENDENCY
A measure of central tendency describes how the data tend toward a central value, around which the individual observations cluster. Measures of central tendency are very useful parameters because they describe the properties of the population. The average refers to the "centre" of the data set. It is a single value intended to represent the distribution as a whole.
There are three types of averages commonly in use: (i) Mean, (ii) Median, and (iii) Mode.

MEAN: This is the most commonly used and the most important of the three averages. It is further divided into:
(a) Arithmetic mean: The sum of all items in the group under consideration, divided by the number of items in the same group. Mathematically, it is expressed as:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

EXAMPLE 8
Calculate the arithmetic mean of the following data set: 4, 6, 7, 3 and 5.

Solution:
$\bar{x} = \frac{4 + 6 + 7 + 3 + 5}{5} = \frac{25}{5} = 5$

Nevertheless, if $x_1, x_2, \ldots, x_n$ occur with frequencies $f_1, f_2, \ldots, f_n$ respectively, the mean is obtained as:

$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}$

Note that, for a grouped frequency distribution, $x_i$ represents the $i$th class mark.

EXAMPLE 9
Given the following frequency table:

Class Interval    9-14    15-20    21-26    27-32    33-38    39-44
Frequency         8       5        3        4        9        10

Find the mean.

Solution:

Class interval    Class mark ($x_i$)    $f_i$    $f_i x_i$
9-14              11.5                  8        92.0
15-20             17.5                  5        87.5
21-26             23.5                  3        70.5
27-32             29.5                  4        118.0
33-38             35.5                  9        319.5
39-44             41.5                  10       415.0
TOTAL                                   39       1102.5

$\bar{x} = \frac{1102.5}{39} = 28.27$

Merits of the Arithmetic Mean
1. It is easy to calculate and widely understood.
2. It can be determined when the total value and number of observations are available.
3. It lends itself to precise mathematical treatment.

Demerits
1. It is affected by extreme values.
2. It may not correspond to any value in the data set.

EXERCISE
Read:
b) Assumed mean method
c) Mean by coding method

MEDIAN: The median is the middle number in a given data set or frequency distribution when the numbers are arranged in order of magnitude. For an even number of observations, the median is the mean of the two middle numbers.

EXAMPLE 10
Given the observations below, find the median.
i) $x_1, x_2, x_3, x_4, x_5$
ii) $x_1, x_2, x_3, x_4, x_5, x_6$

Solution:
i) Median is $x_3$
ii) Median is $(x_3 + x_4)/2$

Nevertheless, if we are dealing with grouped data, the median is obtained using the formula given below:

$\text{Median} = L + \left(\frac{N/2 - C_{fb}}{F}\right) C$

Where: $L$ is the lower class boundary of the median class, $N$ is the sum of the frequencies, i.e.
$\sum f_i$, $C_{fb}$ is the cumulative frequency of the class before the median class, $F$ is the frequency of the median class, and $C$ is the class width (upper minus lower class boundary).

Furthermore, the median can also be obtained from the cumulative frequency curve (ogive), i.e. the middle quartile $Q_2$.

EXAMPLE 11
The data below represent the amount of blood donated over six successive days, from donors, in cm³.

Blood amount (cm³)    3    6    9    12    15    18
Number of donors      3    9    6    15    12    5

Find the median.

Solution:

x     3    6     9     12    15    18
f     3    9     6     15    12    5
cf    3    12    18    33    45    50

N/2 = 50/2 = 25. The median corresponds to the observation whose cumulative frequency (cf) is just above N/2, which is 12 cm³.

EXAMPLE 12
Let us use the grouped frequency data below:

Hourly wage       10-19    20-29    30-39    40-49    50-59    60-69
No. of workers    4        6        10       20       18       2

Solution:

Hourly wage    f     CF    Class boundaries
10-19          4     4     9.5 - 19.5
20-29          6     10    19.5 - 29.5
30-39          10    20    29.5 - 39.5
40-49          20    40    39.5 - 49.5
50-59          18    58    49.5 - 59.5
60-69          2     60    59.5 - 69.5

Using the formula above with N = 60 and N/2 = 30, the median class is 40-49, so $L = 39.5$, $C_{fb} = 20$, $F = 20$ and $C = 10$:

$\text{Median} = 39.5 + \left(\frac{30 - 20}{20}\right) \times 10 = 39.5 + 5 = 44.5$

Merits of the median
1. It is not affected by extreme values.
2. Computation is very easy.
3. Since it is an actual value, it appears representative and accurate.

Demerits
1. For grouped data, the median is only an estimate.
2. It does not take all values into account.
3. It is not useful or realistic for qualitative data.

MODE: The mode of a given data set or frequency distribution is the value that occurs most often in the distribution. In grouped data, the modal class is the class with the highest frequency, and the mode is given by the formula:

$\text{Mode} = L + \left(\frac{F_m - F_b}{2F_m - F_b - F_a}\right) C$

Where $F_m$ is the frequency of the modal class, $F_b$ is the frequency of the class before the modal class, $F_a$ is the frequency of the class after the modal class, $L$ is the lower class boundary of the modal class, and $C$ is the class width (upper minus lower class boundary).
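The grouped-data formulas for the median and the mode can be checked numerically on the hourly-wage data of Example 12. The sketch below makes several assumptions. The mode is taken to be the usual interpolation formula Mode = L + (F_m - F_b) / (2F_m - F_b - F_a) × C. Class limits are assumed to be integers one unit apart, so each boundary lies 0.5 beyond its limit. The function names `grouped_median` and `grouped_mode` are hypothetical.

```python
# Hourly-wage data of Example 12: (lower limit, upper limit, frequency).
classes = [(10, 19, 4), (20, 29, 6), (30, 39, 10),
           (40, 49, 20), (50, 59, 18), (60, 69, 2)]

def grouped_median(classes):
    """Median = L + ((N/2 - C_fb) / F) * C for grouped data."""
    n = sum(f for _, _, f in classes)
    cum = 0  # cumulative frequency before the current class (C_fb)
    for lo, hi, f in classes:
        if cum + f >= n / 2:              # first class reaching N/2
            lower_boundary = lo - 0.5     # L
            width = hi - lo + 1           # C
            return lower_boundary + (n / 2 - cum) / f * width
        cum += f

def grouped_mode(classes):
    """Mode = L + (F_m - F_b) / (2*F_m - F_b - F_a) * C for grouped data."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))                        # modal class index
    f_m = freqs[i]
    f_b = freqs[i - 1] if i > 0 else 0                 # class before
    f_a = freqs[i + 1] if i < len(freqs) - 1 else 0    # class after
    lo, hi, _ = classes[i]
    return (lo - 0.5) + (f_m - f_b) / (2 * f_m - f_b - f_a) * (hi - lo + 1)

print(grouped_median(classes))
print(round(grouped_mode(classes), 1))
```

Both functions locate the relevant class first and then interpolate within it, which is why the results are estimates rather than observed values.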
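Note: In a graphical presentation, the mode is the peak of the frequency curve of the distribution.

EXAMPLE 13
Given: 8, 10, 9, 9, 10, 8, 8, ...; here, by inspection, the mode is 8.

Using the data of Example 12 above, the modal class of the distribution is 40-49 (frequency 20), with lower class boundary 39.5 and class width 10. The classes before and after it have frequencies 10 and 18, so

$\text{Mode} = 39.5 + \left[\frac{20 - 10}{2(20) - 10 - 18}\right] \times 10 = 39.5 + 8.3 = 47.8$

The merits of the mode
1. It can be determined by inspection, without calculations.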
