BASICS OF STATISTICS - I
◻ Tabular presentation
◻ Construction of graphs
Source: Hafalla, V. Jr. & Calub, E. (2011) Modeling the performance of electronics
and communications engineering students in the licensure examination. UB Research
Journal, 35(1)
Tabular Presentation
Table 8
Results of the F-Test on the BOD Levels of the Different Points Along the Magsaysay Creek
(figure callout: "boxhead" marks the row of column headings)

Source  Correction          Type III Sum   df      Mean        F      p-value  Noncent.   Observed
                            of Squares             Square                      Parameter  Power*
BOD     Sphericity Assumed  17947.6        6       2991.267    5.699  0.000    34.194     0.994
        Greenhouse-Geisser  17947.6        3.108   57737.924   5.699  0.003    17.715     0.920
        Huynh-Feldt         17947.6        4.937   36355.966   5.699  0.000    28.134     0.985
        Lower-bound         17947.6        1.000   179473.60   5.699  0.041    5.699      0.567
◻ Line Graph
◻ Time Series graph
◻ Bar Graph
◻ Pie Chart
◻ Scatterplot
◻ Stem-and-Leaf displays
◻ Boxplot
Line Graph
[Figure: line graph of pH (vertical axis, 6.80 to 8.00) against Sampling Points 1 to 11 (horizontal axis). Legend dates: 19-Feb-03, 21-Feb-03, 27-Feb-03, 28-Feb-03, 7-Mar-03, 10-Mar-03, 13-Mar-03, 14-Mar-03, 17-Mar-03, 20-Mar-03]
Remark:
◻ Use a horizontal arrangement of the individual bars
when categories are being compared.
◻ Use a vertical arrangement of the individual bars when
chronological comparisons are being made.
Example:
1. Simple Vertical Bar Graph
Student Population of College of Engineering by Course
SY 2009-2010
Examples
2. Simple Horizontal Bar Graph
Ozone Concentration in Selected
Regions in Antarctica
4. Grouped Bar/Column Chart
Defects in 3 Factories
5. Subdivided Bar/Column Chart
i. Create a scale and locate the lowest and highest values of the
data on this scale.
ii. Construct a rectangle on this scale with one end at the first
quartile (Q1) and the other end at the third quartile (Q3).
iii. Draw a vertical line across the interior of the rectangle at the
median.
iv. Mark the lowest and highest values in the data set with vertical
lines. Identify the outliers and mark them 'O' on the scale.
An outlier in the boxplot is an observation falling below
Q1 - 1.5(IQR) (the lower fence) or above Q3 + 1.5(IQR) (the upper
fence), where IQR = Q3 - Q1.
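The fence rule above can be sketched in Python. This is only an illustration under stated assumptions: the data are the 50 achievement test scores from Example 1 of this document (not necessarily the "previous data set" the slide refers to), the function name boxplot_summary is hypothetical, and the quartiles use the "inclusive" interpolation of Python's statistics.quantiles. Textbooks use several quartile conventions, so hand-computed values may differ slightly.

```python
import statistics


def boxplot_summary(data):
    """Return the quartiles, fences, and outliers used to draw a boxplot."""
    values = sorted(data)
    # Assumption: 'inclusive' interpolation; other quartile conventions exist.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Observations outside the fences are flagged as outliers.
    outliers = [x for x in values if x < lower_fence or x > upper_fence]
    return {
        "min": values[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": values[-1],
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Achievement test scores from Example 1 (illustrative choice of data set).
scores = [
    43, 51, 53, 55, 57, 58, 58, 59, 61, 61,
    61, 62, 63, 64, 65, 65, 66, 66, 67, 68,
    68, 69, 69, 69, 69, 70, 70, 70, 71, 71,
    72, 73, 73, 74, 74, 75, 76, 76, 77, 78,
    79, 79, 81, 82, 82, 85, 87, 89, 91, 96,
]

summary = boxplot_summary(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under this quartile convention the fences are 44.125 and 95.125, so the scores 43 and 96 are flagged as outliers. A different quartile rule could move the fences enough to change that conclusion.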
Example:
From the previous data set, construct a boxplot. Determine if
there are outliers in the data set.
◻ Measures of Location
(e.g., quartiles, percentiles, deciles)
Sample Mean
◻ is the mean of the sample observations
Example 1:
◻ The achievement test scores in General Science of all 50
freshman students from a certain college are as follows
(maximum score: 100):
43 51 53 55 57 58 58 59 61 61
61 62 63 64 65 65 66 66 67 68
68 69 69 69 69 70 70 70 71 71
72 73 73 74 74 75 76 76 77 78
79 79 81 82 82 85 87 89 91 96
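As a quick check, the mean of these 50 scores can be computed directly as the sum of the observations divided by their number. The variable names below are illustrative only.

```python
# Achievement test scores from Example 1.
scores = [
    43, 51, 53, 55, 57, 58, 58, 59, 61, 61,
    61, 62, 63, 64, 65, 65, 66, 66, 67, 68,
    68, 69, 69, 69, 69, 70, 70, 70, 71, 71,
    72, 73, 73, 74, 74, 75, 76, 76, 77, 78,
    79, 79, 81, 82, 82, 85, 87, 89, 91, 96,
]

n = len(scores)        # number of observations
total = sum(scores)    # sum of all observations
mean_score = total / n # arithmetic mean

print(f"n = {n}, sum = {total}, mean = {mean_score:.2f}")
```

The scores sum to 3498, so the mean is 3498 / 50 = 69.96.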
Frequency Counts
Provisions                    Agree   Agree but with   Disagree
                                      reservations
1. compulsory vacation        44      32               24
2. unpaid unexcused leaves    18      52               30
3. shifting schedules         23      22               55
First assign weights to the opinions: agree = 3, agree but with
reservations = 2, disagree = 1.
Calculate the weighted opinion for each provision, for example:
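The worked computation is missing from the slide, so the following is a minimal sketch of one standard way to do it: multiply each frequency by its weight, add the products, and divide by the total number of responses. The dictionary keys and the function name weighted_mean are illustrative assumptions.

```python
# Weights assigned to the opinions, as stated in the text.
weights = {"agree": 3, "agree with reservations": 2, "disagree": 1}

# Frequency counts from the table of provisions.
counts = {
    "compulsory vacation": {"agree": 44, "agree with reservations": 32, "disagree": 24},
    "unpaid unexcused leaves": {"agree": 18, "agree with reservations": 52, "disagree": 30},
    "shifting schedules": {"agree": 23, "agree with reservations": 22, "disagree": 55},
}


def weighted_mean(freqs, weights):
    """Weighted mean = sum(weight * frequency) / sum(frequency)."""
    weighted_total = sum(weights[opinion] * f for opinion, f in freqs.items())
    return weighted_total / sum(freqs.values())


results = {name: weighted_mean(freqs, weights) for name, freqs in counts.items()}
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

For compulsory vacation this gives (3)(44) + (2)(32) + (1)(24) = 220, divided by 100 responses, or 2.20. The other two provisions work out to 1.88 and 1.68.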
The median is
Characteristics of the Median
◻ It is a positional measure.
◻ It is not influenced by extreme values. Hence it is
preferable to the mean when extreme values are
present in the data set or when the distribution is
skewed.
◻ It can be applied to data that are measured on at least
an ordinal scale.
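The resistance of the median to extreme values can be illustrated with a small sketch. It uses the 50 achievement test scores from Example 1 and then replaces the highest score, 96, with 960. That replacement is a hypothetical data-entry error introduced only for this illustration.

```python
import statistics

# Achievement test scores from Example 1.
scores = [
    43, 51, 53, 55, 57, 58, 58, 59, 61, 61,
    61, 62, 63, 64, 65, 65, 66, 66, 67, 68,
    68, 69, 69, 69, 69, 70, 70, 70, 71, 71,
    72, 73, 73, 74, 74, 75, 76, 76, 77, 78,
    79, 79, 81, 82, 82, 85, 87, 89, 91, 96,
]

# Hypothetical extreme value: the top score 96 mistyped as 960.
with_extreme = scores[:-1] + [960]

mean_before = statistics.mean(scores)
median_before = statistics.median(scores)
mean_after = statistics.mean(with_extreme)
median_after = statistics.median(with_extreme)

print(f"original:     mean = {mean_before:.2f}, median = {median_before}")
print(f"with extreme: mean = {mean_after:.2f}, median = {median_after}")
```

The single extreme value pulls the mean from 69.96 up to 87.24, while the median stays at 69.5 because it depends only on the middle positions of the ordered data.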
Mode
◻ the value in the data set that occurs with the greatest
frequency
◻ usually denoted by Mo
Example
◻ A psychologist has developed a new technique intended to improve
rote memory. To test the method, 30 high school students representing
three sections are selected at random, and each is taught the new
technique. The students are then asked to memorize a list of 100 word
phrases using the technique. The following are the number of word
phrases memorized correctly by the students selected from each
section.
A: 83 64 98 66 83 87 83 93 86 80 93 83 75
B: 87 76 96 77 94 92 88 85 89
C: 68 84 79 79 84 75 80
The mode for each section is: MoA = 83; MoB does not exist;
MoC = 79 and 84 (bimodal).
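These modes can be verified by counting how often each value occurs. The sketch below follows the convention used in this example, that a data set in which every value occurs exactly once has no mode. The function name find_modes is an illustrative assumption.

```python
from collections import Counter


def find_modes(data):
    """Return the most frequent value(s), or [] if no value repeats."""
    frequency = Counter(data)
    highest = max(frequency.values())
    if highest == 1:
        # Every value occurs once, so the mode does not exist.
        return []
    return sorted(value for value, count in frequency.items() if count == highest)


# Number of word phrases memorized correctly, by section.
section_a = [83, 64, 98, 66, 83, 87, 83, 93, 86, 80, 93, 83, 75]
section_b = [87, 76, 96, 77, 94, 92, 88, 85, 89]
section_c = [68, 84, 79, 79, 84, 75, 80]

modes_a = find_modes(section_a)
modes_b = find_modes(section_b)
modes_c = find_modes(section_c)

print("Section A:", modes_a)
print("Section B:", modes_b)
print("Section C:", modes_c)
```

Section A has the single mode 83 (four occurrences), Section B returns an empty list, and Section C returns both 79 and 84.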
Characteristics of the Mode
◻ It is the easiest to interpret among the measures of
central tendency.
◻ It is not affected by extreme values.
◻ The mode is not unique; that is, a given data set may
contain more than one mode or none at all. If a data
set has two modes, we call it bimodal; if it has three
modes, we call it trimodal; and so on.
◻ One advantage of the mode is that it can be applied to
observations that are measured at the nominal level.
Measures of Location
The range is
R = 77.1 - 52.0 = 25.1
Characteristics of the Range
58 72 77 89 63 85 51
Concentration of Ozone    Number of    Class Mark
in the Atmosphere (ppm)   Areas (fi)   (Xi)         fiXi     Xi - 49.70   (Xi - 49.70)^2   fi(Xi - 49.70)^2
10 – 19                   6            14.5         87       -35.2        1239.04          7434.24
20 – 29                   5            24.5         122.5    -25.2        635.04           3175.20
30 – 39                   14           34.5         483      -15.2        231.04           3234.56
40 – 49                   11           44.5         489.5    -5.2         27.04            297.44
50 – 59                   17           54.5         926.5    4.8          23.04            391.68
60 – 69                   15           64.5         967.5    14.8         219.04           3285.60
70 – 79                   4            74.5         298      24.8         615.04           2460.16
80 – 89                   2            84.5         169      34.8         1211.04          2422.08
90 – 99                   3            94.5         283.5    44.8         2007.04          6021.12
Total                     77                        3826.5                                 28722.08
The value 49.70 is the mean and is computed as the sum of the fiXi column divided by the sum of the fi column: 3826.5 / 77 ≈ 49.70.
It is computed as follows:
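The slide's formula is not preserved here, so the following is only a sketch of the usual grouped-data computation, under stated assumptions. The frequencies 11 and 17 for the 40 – 49 and 50 – 59 classes are the ones consistent with the table's fiXi column and totals. The mean is rounded to 49.7 as in the table. Dividing by n - 1 (sample variance) is an assumption, since the slide does not say whether the sample or the population formula is intended.

```python
import math

# (lower limit, upper limit, frequency) for each ozone concentration class.
# Frequencies 11 and 17 match the table's fiXi values (489.5 and 926.5).
classes = [
    (10, 19, 6), (20, 29, 5), (30, 39, 14),
    (40, 49, 11), (50, 59, 17), (60, 69, 15),
    (70, 79, 4), (80, 89, 2), (90, 99, 3),
]

# Class mark Xi is the midpoint of the class limits.
marks = [((low + high) / 2, f) for low, high, f in classes]

n = sum(f for _, f in marks)               # total frequency
sum_fx = sum(f * x for x, f in marks)      # sum of fi * Xi
mean = sum_fx / n
mean_rounded = round(mean, 1)              # 49.7, as used in the table

# Sum of fi * (Xi - mean)^2, using the rounded mean to match the table.
sum_sq_dev = sum(f * (x - mean_rounded) ** 2 for x, f in marks)

# Assumption: sample variance with divisor n - 1.
sample_variance = sum_sq_dev / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(f"n = {n}, sum fiXi = {sum_fx}, mean = {mean:.2f}")
print(f"sum fi(Xi - mean)^2 = {sum_sq_dev:.2f}")
print(f"sample variance = {sample_variance:.2f}, sample sd = {sample_sd:.2f}")
```

If the population formula is intended instead, replace the divisor n - 1 with n.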
Example:
◻ In an experiment, two groups of cattle were fed differently:
one group received the usual hay mix and the other the
improved high-protein mix. After a year, the following
statistics were collected on the two groups of cattle: