Data Representation Chapter

16 Probability and Statistics

SECTION A Data representation

By the end of this section you will be able to:
> distinguish between continuous and discrete data
> construct frequency distributions
> draw a histogram and bar chart
> plot a frequency polygon

A1 Discrete and continuous data

The following are examples of discrete data:
> Number of people in a room
> Number of rejects on an assembly line
> Shoe sizes of children

What do you think is the definition of 'discrete data'?

Data which can only take certain values. The number of people in a room can only be 1, 2, 3, ... and not 1.23, 1.57, 10.11, ...

The following are examples of continuous data:
> Weights of people
> Output voltage of an analogue system
> Loads on a beam

What do you think is the definition of 'continuous data'?

Data which can take any value between two end points. The weights of people can be 60.28 kg, 70.1 kg, ...

A2 Frequency distribution

What does the term 'frequency' mean?

It is the number of times a particular value occurs in some data. The combination of particular values and their frequencies is called a frequency distribution.

One way of representing the distribution of data is by a frequency table. This is a table that summarizes the data in some order.

Example 1

The number of rejects, in the last 30 days, from an assembly line has yielded the following data:

360087 4930 36 35 400 4200 4437 33 42. 370 4l aaa 30 36 37373630 44 31 3000 «424d 44 39 42.

Construct a frequency distribution table.

Solution

Remember that the frequency of a value is the number of times it occurs in the data. For example, there are 30 rejects on 4 days. We say the frequency of 30 is 4. We can summarize the above data as detailed in Table 1.

TABLE 1
Number of rejects   Frequency f
30 31 33 35 36 37 39 1 40 41 2 44 “9

The representation of data is a lot clearer in the table. We can check that all the data has been placed in the table. How?
The sum of the frequencies should add up to 30 because there are 30 data values, that is $\sum f = 30$. [Remember $\sum$ means 'sum of'.] This is no guarantee that our frequency distribution is correct but it is a good guide.

This table is the frequency distribution for what sort of data?

Discrete data (number of rejects).

We can use a similar idea to form a frequency distribution for continuous data. One way of representing continuous data is to group it into particular 'classes' or 'intervals', as the next example shows. This is particularly useful for a large amount of data.

Example 2a

The diameters, in mm, of 20 pipes are as follows:

40.6 40.7 40.9 41.0 4d a4 als aL 41.2 41.2 41.9 41.3 a4 41.6 41.8 41.6 41.2 40.5 40.8 41.9

Form a frequency distribution table, by grouping the data into five classes.

Solution

How do we form a frequency distribution for this data?

We can group the data into classes, but classes of what size?

That depends on the data. The smallest value is 40.5 and the largest value is 41.9, a range of 1.4. If we use classes of size 0.3, then we will get five classes, which together span 5 × 0.3 = 1.5 and so cover the range. Let's form the frequency distribution table for classes of size 0.3 (Table 2).

TABLE 2 Frequency distribution of pipe diameters d (mm), with classes 40.45 ≤ d < 40.75, 40.75 ≤ d < 41.05, 41.05 ≤ d < 41.35, 41.35 ≤ d < 41.65 and 41.65 ≤ d < 41.95.

A3 Histogram

A histogram is a graphical representation of a frequency distribution.

Example 2b

Considering the data of Example 2a, draw a histogram for this data.

Solution

The frequency is plotted along the vertical axis and the grouped diameter of pipes along the horizontal axis. Figure 1 shows a histogram of the data contained in Table 2.

Fig. 1 Histogram of the pipe diameters (vertical axis: Frequency; horizontal axis: Diameter)

The break symbol on the horizontal axis of Fig. 1 means that there is no data before the specified value, in this case 40.45.
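The class frequencies that set the heights of the rectangles in a histogram like Fig. 1 can be counted mechanically. The following is a minimal Python sketch, not the book's method: the class boundaries are those of Example 2a (width 0.3, starting at 40.45), but the function name `grouped_frequencies` and the list `diameters` are invented here for illustration, and the values are hypothetical rather than the data of the example.

```python
from bisect import bisect_right


def grouped_frequencies(values, edges):
    """Count how many values fall in each class [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if not (edges[0] <= v < edges[-1]):
            raise ValueError(f"{v} lies outside the class boundaries")
        # bisect_right finds the first boundary greater than v,
        # so the class index is one less than that position.
        counts[bisect_right(edges, v) - 1] += 1
    return counts


# Class boundaries of width 0.3 starting at 40.45, as in Example 2a.
edges = [40.45, 40.75, 41.05, 41.35, 41.65, 41.95]

# Hypothetical diameters in mm, invented for illustration only.
diameters = [40.5, 40.6, 40.9, 41.0, 41.1, 41.2, 41.2, 41.4, 41.7, 41.9]

freq = grouped_frequencies(diameters, edges)
for lo, hi, f in zip(edges, edges[1:], freq):
    print(f"{lo:.2f} <= d < {hi:.2f}: {f}")

# Same check as for Table 1: the frequencies sum to the number of data values.
assert sum(freq) == len(diameters)
```

Placing the boundaries at values such as 40.45, which no measurement recorded to one decimal place can equal, means every value falls unambiguously into exactly one class.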
This is one of the simplest histograms to draw because it has equal class widths and so the height represents the frequency.

When either axis does not start at zero (in this case the horizontal axis starts at 40.45), it is normally abbreviated by omitting a section of the scale, indicated by a break symbol on the axis.

In a histogram the area of the rectangle is proportional to the frequency. Only in the case of equal class widths do we have the height of each rectangle representing the frequency. For histograms with unequal class widths we need to be careful, as Example 3a, below, shows.

Example 3a

The resistances of 100 resistors are given in Table 3a.

TABLE 3a
Resistance R (kΩ)   Frequency f
° & 20 = 8 12 16=R<18 18 18=R<20 2 205R<23 a

Draw a histogram for this data.

Solution

Remember that the frequency is proportional to the area. We need to choose a standard class width. By looking down the left-hand column of the table, we find the class widths are of sizes 0.1, 0.2, 0.1, 0.1, 0.2, 0.2 and 0.3.

Which class width would you choose to be standard?

It really doesn't make much difference, but the most suitable seems to be 0.1 because there are three intervals with this width and it keeps the arithmetic easy, that is 0.1 × 2 = 0.2, 0.1 × 3 = 0.3.

If we choose our standard width = 0.1 then the second interval is twice the standard width and so we halve the frequency height. Similarly for the last interval, we take 1/3 of the frequency height (Table 3b). This new figure is known as the standard frequency. In general, standard frequency = frequency ÷ (class width ÷ standard width).

TABLE 3b
Resistance R (kΩ)   Class width (Standard width, SW)   Frequency f   Standard frequency

A4 Frequency polygons

Another graphical representation of a frequency distribution is a frequency polygon. There are two ways of constructing a frequency polygon:

1 Draw a histogram and join the midpoints of the tops of the rectangles.
2 Plot the standard frequency on the vertical axis against the midpoint of the interval.

Example 3b

Plot the frequency polygon for the data of Example 3a.

Solution

Which method should we use?

Method 1, because we have already drawn a histogram for the data. Figure 2b shows the frequency polygon of resistance values.

Fig. 2b Frequency polygon of resistance values (vertical axis: Standard frequency; horizontal axis: Resistance (kΩ), marked at 1.1, 1.2, 1.4, 1.5, 1.6, 1.8, 2.0 and 2.3)

A5 Bar charts

Another graphical way to represent data is to plot a bar chart. A bar chart consists of bars which can be drawn vertically or horizontally, and the height or length of these bars gives the frequency. We will confine ourselves to vertical bars. You will find it easier to plot a bar chart using appropriate software.

Example 4

Table 4 shows the number of new registrations with the Engineering Council at the end of each year. Draw a vertical bar chart to represent

a the number of CEng registrations against the year of entry
b the number of CEng, IEng and EngTech registrations against the year of entry.

TABLE 4 Number of new registrations with the Engineering Council
Year   CEng   IEng   EngTech
2000   $096   1708   683
2001   4932   1362   392
2002   $180   789    574
2003   4504   599    466
2004   4518   484    758
2005   $906   532    1880
2006   $563   498    944
2007   3489   586    839
2008   3439   498    1343
2009   3750   547    1314

Solution

a The number of CEng registrations is given in the second column. We plot a series of vertical bars of the same width with the year plotted horizontally and the number of CEng registrations (numbers in the second column of Table 4) vertically. This is illustrated in Fig. 3a.

Fig. 3a The number of new registrations for CEng (vertical axis: Number registered; horizontal axis: Year, 2000 to 2009)

Note that the vertical axis starts at just above 3000 because all the entries in the CEng column of Table 4 are above 3000. We could start at zero, but it would be more difficult to visualize the difference between the various years.
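The effect of where the frequency axis starts can be seen even in a crude text chart. The sketch below is illustrative only: the helper names `bar_lengths` and `render` and the figures in `registrations` are hypothetical and are not the Table 4 data. For convenience in text output the bars are drawn horizontally, one row per year.

```python
def bar_lengths(values, axis_start, units_per_char):
    """Bar length for each category, measured from axis_start."""
    return {k: (v - axis_start) // units_per_char for k, v in values.items()}


def render(values, axis_start=0, units_per_char=250):
    """Return a text bar chart with one row per category."""
    lengths = bar_lengths(values, axis_start, units_per_char)
    rows = [f"{k} | {'#' * lengths[k]} {values[k]}" for k in values]
    return "\n".join(rows)


# Hypothetical registration counts, invented for illustration only.
registrations = {2007: 3500, 2008: 4000, 2009: 4500}

print("Axis starting at zero:")
print(render(registrations, axis_start=0))
print("Axis starting at 3000:")
print(render(registrations, axis_start=3000))
```

With the axis starting at zero the three bars are 14, 16 and 18 characters long and look much alike; starting at 3000 they are 2, 4 and 6 characters long, so the year-to-year differences stand out.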
b We can also plot three bars for each year showing each of the categories CEng, IEng and EngTech as illustrated in Fig. 3b.

Fig. 3b The number of new engineers registered (vertical axis: Number registered; horizontal axis: Year; separate bars for CEng, IEng and EngTech)

Note that this time the vertical axis starts at zero to enable the smaller quantities of IEng and EngTech registrations to be shown.

SUMMARY

Discrete data can only take certain values while continuous data can take any value between two end points.

A frequency table is one way of representing the distribution of data.

A histogram is a graphical representation of a frequency distribution. The frequency is proportional to the area.

A frequency polygon is another graphical representation of a frequency distribution.

Another graphical representation of data is a bar chart.

Exercise 16(a)

1 State whether the following are discrete or continuous data:
a The weights of people
b Marks in an examination
c The number of motors in ...
d The lifetimes of light bulbs
e The resistance value of a variable resistor

2 The temperatures, to the nearest degree Celsius, for the last 30 days are as follows:

22 23 214 23s 16 18 23 19 23 18 26 7 20 16 7 au 22. 22 17 20 19 21 20 20 2423 21.20

Construct a frequency distribution.

3 The heights, in m, of 40 students are shown below:

1.68 1.67 1.81 1.85 1.82 1.76 1.95 1.87 1.86 1.88
1.53 1.70 1.69 1.76 1.66 1.91 1.84 1.55 1.61 1.97
1.56 1.99 1.93 1.64 1.89 1.71 1.71 1.80 1.95 1.87
1.80 1.85 1.93 1.88 1.83 1.74 1.83 1.90 1.72 1.95

Construct a frequency distribution with an equal class width of 0.1.

4 Draw a histogram for question 3 with the same class width.

5 Draw a frequency polygon for question 3.
6 The table below shows the time taken, in is, for 105 op-amps to become fully operational:

Time taken t (ans)
lost mst 30st 40st 455 sost ssst yo=t <20 30 <40 <45 50

9 The table below shows the number of new registrations with the Engineering Council at the end of each year. Draw a vertical bar chart to represent the number of registrations of CEng, IEng and EngTech against each year on the same graph. Comment on your graph and data.

Number of new registrations with the Engineering Council
Year   CEng   IEng   EngTech
1984   3911   2391   1002
1985   5002   2774   1337
1986   5960   2682   1039
1987   6022   3066   1130
1988   seve   asi    1189
1989   4746   2335   1210
1990   9207   2559   1315
1991   5413   2634   1185
1992   5588   2128   1184
1993   6189   2050   1190
1994   5721   1556   1237
1995   5376   1433   1146
1996   5485   1579   1082
1997   5641   1595   903
1998   4792   1484   789
1999   5187   1562   916

SECTION B Data summaries

By the end of this section you will be able to:
> evaluate the mean
> understand what standard deviation means
> evaluate the standard deviation
> derive properties of mean and standard deviation
> evaluate the mean and standard deviation of data in a frequency distribution

B1 Averages

The sample mean, or average, of $n$ numbers, $x_1, x_2, x_3, \ldots, x_n$, denoted by $\bar{x}$, is given by

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x_i}{n} \qquad \left( = \frac{\text{sum of observations}}{\text{number of observations}} \right)$$

The notation $\sum x_i$ means sum $x_i$ from 1 to $n$, that is $x_1 + x_2 + x_3 + \cdots + x_n$.
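As a quick illustration of the definition, the sample mean can be computed directly as the sum of the observations divided by their number. This is a minimal Python sketch; the function name `sample_mean` and the list `observations` are hypothetical, chosen only for this example.

```python
def sample_mean(xs):
    """Sample mean: sum of observations divided by number of observations."""
    if not xs:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(xs) / len(xs)


# Hypothetical observations, invented for illustration only.
observations = [2, 4, 4, 6, 9]

# The sum is 25 and there are 5 observations, so the mean is 25 / 5.
print(sample_mean(observations))
```

Here `sum(xs)` plays the role of $\sum x_i$ and `len(xs)` the role of $n$.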
