Module 1.1 Basic Statistical Concepts

Definition of Statistical Concepts and Principles

• To understand the basic concepts used in statistics and to
differentiate descriptive and inferential statistics.

Basic Concepts of Probability and Statistics

Probability is a branch of mathematics concerned with
theories of uncertainty, ways of measuring uncertainty,
and the application of techniques involving uncertainty.

Statistics is a branch of mathematics that examines and
investigates ways to process and analyze data.

1. Descriptive Statistics - includes those methods
concerned with collecting, organizing, summarizing and
presenting data without drawing inferences about a larger
set of data.
2. Inferential Statistics - refers to those methods
concerned with the analysis of a subset of data leading to
predictions and inferences about the entire set of data.
- also called Inductive Statistics or Statistical Inference
Population - consists of the totality of the observations with
which we are concerned
Sample - a collection of some of the elements obtained from the
population
Parameter - any numerical value describing a characteristic
of a population
Statistic - any numerical value describing a characteristic of a
sample
Constant - a characteristic or property for which the members
of the population are the same
Variable - a characteristic that changes or varies over time
and/or for different individuals or objects under consideration
1. Qualitative Variables - measure a quality or characteristic on each
experimental unit
Examples: eye color, gender
2. Quantitative Variables - measure a numerical quantity or amount on
each experimental unit
Examples: number of accidents, volume in a glass, weight
a. Discrete variable - takes a countable number of values
Example: number of family members
b. Continuous variable - takes an uncountable number of values
Examples: time, distance, volume, height
Levels of Measurement
1. Nominal - values represent categories with no inherent order
Examples: Gender, Civil Status
2. Ordinal - values represent categories with inherent order (ranking)
Examples: Educational background, Quality of Service, Grades
3. Interval - values represent ordered categories with equal intervals
between them, but with no true zero point
Example: temperature in °C or °F
4. Ratio - values have equal intervals and a true zero point, so
ratios between values are meaningful
Example: employment size
Sampling Methods
1. Simple Random Sampling
2. Stratified Random Sampling
3. Systematic Random Sampling
4. Cluster Sampling
5. Multistage Sampling
6. Slovin’s Formula (used to determine the sample size)

How to Present Data?

1. Graphs
2. Table charts

Graphs for Qualitative Data
- what values of the variable have been measured
- how often each value has occurred
Three measures are available for this purpose:
1. Frequency - the number of times a score or group of scores (class)
occurs in a population or sample
2. Relative Frequency - the frequency of one score or group of scores
divided by the total frequency of all the observations
Relative Frequency = frequency/n
where n is the sum of the frequencies
3. Percent - the percentage of measurements in each category
Percent = 100 × Relative Frequency
To display the distribution of data:
Pie Chart - circular graph that shows how the measurements are
distributed among the categories
[one sector of a circle is assigned to each category; the angle of
each sector should be proportional to the proportion of
measurements (or relative frequency) in that category:
Angle = relative frequency × 360°]
Bar Chart - the height of the bar measures how often a particular
category was observed
Sample Problem
In a survey concerning public education, 400 school administrators
were asked to rate the quality of education in the United States.
Their responses are summarized as follows:

U.S. education rating by 400 educators
Rating   Frequency
A            35
B           260
C            93
D            12
Total       400

Construct both a pie chart and a bar chart to describe the data.
Rating   Frequency   Relative Frequency   Percent   Angle
A            35            0.09               9%     32.4°
B           260            0.65              65%    234.0°
C            93            0.23              23%     82.8°
D            12            0.03               3%     10.8°
Total       400            1.00             100%    360.0°
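
The table above can be reproduced with a short script. This is only a sketch: the function name `pie_table` is made up for illustration, and it assumes, as the table appears to do, that the relative frequency is rounded to two decimals before the percent and the angle are computed (which is why rating A gets 32.4° rather than the unrounded 31.5°).

```python
from decimal import Decimal, ROUND_HALF_UP

# Ratings and frequencies from the sample problem (400 school administrators).
frequencies = {"A": 35, "B": 260, "C": 93, "D": 12}
n = sum(frequencies.values())  # sum of the frequencies


def pie_table(freqs):
    """Relative frequency, percent and pie-chart angle for each category.

    Assumption: the relative frequency is rounded half-up to 2 decimals
    first, and the percent and angle are derived from the rounded value.
    """
    total = sum(freqs.values())
    table = {}
    for category, f in freqs.items():
        relative = (Decimal(f) / Decimal(total)).quantize(
            Decimal("0.01"), rounding=ROUND_HALF_UP
        )
        table[category] = {
            "frequency": f,
            "relative": relative,
            "percent": relative * 100,  # Percent = 100 x relative frequency
            "angle": relative * 360,    # Angle = relative frequency x 360 degrees
        }
    return table


table = pie_table(frequencies)
for category, row in table.items():
    print(category, row["frequency"], row["relative"],
          f"{row['percent']:.0f}%", f"{row['angle']:.1f}°")
```

Decimal arithmetic is used here so that 35/400 = 0.0875 rounds up to 0.09 exactly as in the table; ordinary binary floats can round such halfway cases either way.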

[Figure: pie chart and bar graph of the education ratings]
Graphs for Quantitative Data
*Describing data by the amount measured in each category:
1. Pie Chart - displays how the total quantity is distributed among
the categories
2. Bar Chart - uses the height of the bar to display the amount in a
particular category
*Describing data by time series:
1. Line Chart - when a quantitative variable is recorded over time at
equally spaced intervals (such as daily, weekly, monthly, quarterly), the
data set forms a time series. Time series data are most effectively
presented on a line chart with time as the horizontal axis.

*Describing data by frequency of occurrence:
1. Relative Frequency Histogram - for a quantitative data set, a bar
graph in which the height of each bar represents the proportion or relative
frequency of occurrence for a particular class or sub-interval being
measured. The classes or sub-intervals are plotted along the horizontal
axis.
2. Frequency Distribution
a. For ungrouped data - a tabulation of data showing the frequency
of occurrence of the different values of the variable.
b. For grouped data - a tabulation of data showing the number of
observations that fall in each of the classes.
c. Class/Class Interval - a symbol defining the arbitrary groupings.
Example: 9-11
d. Class Limits - the end numbers of the class or class interval.
Example: 9 and 11, where 9 is the lower class limit and 11 is the
upper class limit
e. Class Interval Size - the difference between two successive lower class
limits or two successive upper class limits.
f. Class Boundary - the point halfway between the lower limit of one class
and the upper limit of the preceding class. It is the exact limit.
Example: In the interval 9-11, 8.5 is the lower class boundary and
11.5 is the upper class boundary
g. Class Mark - the midpoint between the upper and lower class
boundaries or class limits of a class interval
h. Class Width - the difference between the upper and lower class
boundaries of a class interval
i. Class Frequency (f) - the number of observations falling in a particular
class
j. Relative Frequency - the frequency of one observation or group of
observations divided by the total frequency of all observations.
k. Cumulative Frequency - the frequency of any class plus the frequencies of
all preceding classes in a distribution.
l. Histogram - a vertical bar graph that shows the frequencies of scores or
classes of scores by the height of the bar.
m. Frequency Polygon - a graph on which the frequencies of classes are
plotted at the class marks and the plotted points are connected by straight lines.
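
The class terms above can be checked on the 9-11 example with a small helper. This is a sketch under one assumption: the data are recorded to the nearest whole unit, so each boundary sits half a unit beyond the class limit. The function name `class_summary` and its `unit` parameter are illustrative only.

```python
def class_summary(lower_limit, upper_limit, unit=1):
    """Boundaries, mark and width of one class interval.

    Assumes values are recorded to the nearest `unit`, so the exact
    limits (class boundaries) lie half a unit outside the class limits.
    """
    half = unit / 2
    lower_boundary = lower_limit - half
    upper_boundary = upper_limit + half
    return {
        "limits": (lower_limit, upper_limit),
        "boundaries": (lower_boundary, upper_boundary),
        "mark": (lower_limit + upper_limit) / 2,    # midpoint of the class
        "width": upper_boundary - lower_boundary,   # boundary to boundary
    }


summary = class_summary(9, 11)
print(summary["boundaries"])  # (8.5, 11.5)
print(summary["mark"])        # 10.0
print(summary["width"])       # 3.0
```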
Sample Problem:

The following scores represent the final examination grades for an elementary
statistics course.

23 60 79 32 57 74 57 70 82 36 80 77 81 95 41 65 92 85 55 76 52 10 64 75 78
25 80 98 81 67 41 71 83 54 64 72 88 62 74 43 60 78 89 76 84 48 84 90 15 79
34 67 17 82 69 74 63 80 85 61

Using a class interval size of 10, with the lowest class starting at 10,
set up a frequency distribution and a cumulative frequency distribution.

Class     Frequency (f)   Cumulative Frequency (cf<)   Class Boundary   Class Mark
90-99           4                    60                  89.5-99.5         94.5
80-89          14                    56                  79.5-89.5         84.5
*70-79         14                    42                  69.5-79.5         74.5
60-69          11                    28                  59.5-69.5         64.5
50-59           5                    17                  49.5-59.5         54.5
40-49           4                    12                  39.5-49.5         44.5
30-39           3                     8                  29.5-39.5         34.5
20-29           2                     5                  19.5-29.5         24.5
10-19           3                     3                   9.5-19.5         14.5
n = 60
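
The tally behind this table can be sketched in a few lines of Python. The function name `frequency_distribution` and its parameters are illustrative, and the sketch assumes whole-number scores with classes of size 10 starting at 10, as in the table.

```python
# The 60 final examination scores from the sample problem.
scores = [
    23, 60, 79, 32, 57, 74, 57, 70, 82, 36, 80, 77, 81, 95, 41, 65, 92, 85,
    55, 76, 52, 10, 64, 75, 78, 25, 80, 98, 81, 67, 41, 71, 83, 54, 64, 72,
    88, 62, 74, 43, 60, 78, 89, 76, 84, 48, 84, 90, 15, 79, 34, 67, 17, 82,
    69, 74, 63, 80, 85, 61,
]


def frequency_distribution(data, start=10, size=10, classes=9):
    """Tally integer data into classes, lowest class first."""
    rows = []
    cumulative = 0
    for k in range(classes):
        lower = start + k * size
        upper = lower + size - 1               # class limits, e.g. 10-19
        f = sum(lower <= x <= upper for x in data)
        cumulative += f                        # less-than cumulative frequency
        rows.append({
            "class": (lower, upper),
            "f": f,
            "cf": cumulative,
            "boundaries": (lower - 0.5, upper + 0.5),
            "mark": (lower + upper) / 2,
        })
    return rows


rows = frequency_distribution(scores)
for row in reversed(rows):                     # highest class first, as in the table
    lower, upper = row["class"]
    print(f"{lower}-{upper}  f={row['f']:>2}  cf={row['cf']:>2}  mark={row['mark']}")
```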
The Choice of a Graph
1. Histogram - preferred when only one distribution is to be
presented.
2. Frequency Polygon - more useful for comparing two or
more distributions graphically on the same axes.
3. Ogive - useful in making estimates of quantiles, medians and
other similar points of relative position.
4. Pie diagram (circle graph) - useful when one wishes to
picture proportions in a striking way.

Note: The choice of the graph or graphs to use depends on the judgment
of the user, based on the purposes or intentions that he/she has in using
these graphical representations of data.
Properties of Frequency Distribution
Frequency distributions differ from each other in terms of four
important properties: central location, variation, skewness and
kurtosis.
1. Central location - refers to the value near the center of the frequency
distribution.
2. Variation - refers to the extent of spreading out of individual
measures from the measure of central tendency.
3. Kurtosis - refers to the flatness or peakedness of one distribution in
relation to another.
4. Skewness - refers to the symmetry or asymmetry of a frequency
distribution.
Measures of Central Tendency
Three Measures of Central Tendency
A measure of central tendency is any measure indicating the center of a
set of data arranged in increasing or decreasing order of magnitude.
1. Mean - the arithmetic average of all the scores or groups of
scores in a distribution. It is denoted by the symbol μ for the
population mean and $\bar{X}$ for the sample mean.
For ungrouped data:
$$\bar{X} = \frac{\sum X}{n}$$
where X = score or measure in the series
n = number of measures in the series
Mean for ungrouped data:

Example: Given the scores of 7 students on a quiz [65 54 83 89 70 75 84],
find the mean.

$$\bar{X} = \frac{\sum X}{n} = \frac{65 + 54 + 83 + 89 + 70 + 75 + 84}{7} = \frac{520}{7} \approx 74.29$$
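
As a quick check of the arithmetic, the same mean can be computed directly. The variable names are illustrative.

```python
# Quiz scores of the 7 students from the example.
quiz_scores = [65, 54, 83, 89, 70, 75, 84]

total = sum(quiz_scores)          # sum of X
n = len(quiz_scores)              # number of measures
mean = total / n                  # X-bar = (sum of X) / n

print(total, n, round(mean, 2))   # 520 7 74.29
```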

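For grouped data:

$$\bar{X} = \frac{\sum fM}{n}$$
where f = frequency
M = midpoint or class mark
n = number of observations in the sample

$$\bar{X} = A + \left(\frac{\sum fd}{n}\right) i$$
where f = frequency
M = class mark
A = assumed mean (the class mark of the chosen class)
d = deviation of a class mark from A in class-interval units, d = (M - A)/i
i = class interval size
n = number of observations in the sample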
Class      f    CF   Class mark (M)     fM      d     fd
90-99      4    60       94.5          378      2      8
80-89     14    56       84.5         1183      1     14
*70-79    14    42       74.5         1043      0      0
60-69     11    28       64.5          709.5   -1    -11
50-59      5    17       54.5          272.5   -2    -10
40-49      4    12       44.5          178     -3    -12
30-39      3     8       34.5          103.5   -4    -12
20-29      2     5       24.5           49     -5    -10
10-19      3     3       14.5           43.5   -6    -18
∑f = n = 60    ∑fM = 3960    ∑fd = -51
For grouped data:

$$\bar{X} = \frac{\sum fM}{n} = \frac{3960}{60} = 66$$

$$\bar{X} = A + \left(\frac{\sum fd}{n}\right) i = 74.5 + \left(\frac{-51}{60}\right)(10) = 66$$
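
Both grouped-mean formulas can be verified against the table with a short sketch. The variable names are illustrative, and the assumed mean A = 74.5 is the class mark of the starred class 70-79.

```python
# Class marks (M) and frequencies (f), highest class first, as in the table.
class_marks = [94.5, 84.5, 74.5, 64.5, 54.5, 44.5, 34.5, 24.5, 14.5]
frequencies = [4, 14, 14, 11, 5, 4, 3, 2, 3]

n = sum(frequencies)                                            # 60

# Long method: X-bar = (sum of fM) / n
sum_fM = sum(f * M for f, M in zip(frequencies, class_marks))   # 3960.0
mean_long = sum_fM / n

# Coded method: X-bar = A + (sum of fd / n) * i
A = 74.5   # assumed mean: class mark of the starred class 70-79
i = 10     # class interval size
deviations = [round((M - A) / i) for M in class_marks]          # d = 2, 1, 0, -1, ...
sum_fd = sum(f * d for f, d in zip(frequencies, deviations))    # -51
mean_coded = A + sum_fd * i / n

print(n, sum_fM, sum_fd)       # 60 3960.0 -51
print(mean_long, mean_coded)   # 66.0 66.0
```

The two methods always agree; the coded method just keeps the intermediate numbers small, which matters mostly for hand computation.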

2. Median - the point on the scale of measurement that divides a series
of ranked observations into halves, such that half of the
observations fall above it and the other half fall below it.
For ungrouped data:
The median is the middle value if there is an odd number of
observations:
Median = $\left(\frac{n+1}{2}\right)$th largest observation

Example: For the observations 7 9 2 4 3 (ranked: 2 3 4 7 9), n = 5,
so the median is the 3rd observation, which is 4.
The median is the average or mean of the two middle scores if there
is an even number of observations:
Median = midpoint between the $\left(\frac{n}{2}\right)$th observation and the $\left(\frac{n}{2}+1\right)$th observation

Example: For the observations 2 3 4 7 9 11 (n = 6):

$\left(\frac{n}{2}\right)$th observation = 3rd observation, which is 4
$\left(\frac{n}{2}+1\right)$th observation = 4th observation, which is 7

Median = (4 + 7)/2 = 5.5
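
The odd and even rules for ungrouped data can be written as one small function. This is a sketch; the function name `median_ungrouped` is illustrative, and the result is cross-checked against `statistics.median` from the Python standard library.

```python
import statistics


def median_ungrouped(values):
    """Median of ungrouped data using the odd/even rules above."""
    ranked = sorted(values)              # rank the observations first
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        return ranked[middle]            # the ((n + 1) / 2)th observation
    # even n: midpoint of the (n/2)th and (n/2 + 1)th observations
    return (ranked[middle - 1] + ranked[middle]) / 2


odd_example = median_ungrouped([7, 9, 2, 4, 3])
even_example = median_ungrouped([2, 3, 4, 7, 9, 11])
print(odd_example, even_example)         # 4 5.5

# The standard library gives the same answers.
print(statistics.median([7, 9, 2, 4, 3]), statistics.median([2, 3, 4, 7, 9, 11]))
```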
For grouped data:
$$\text{Median} = L + \left(\frac{0.5n - f_c}{f_m}\right) i$$
where L = exact lower limit (lower class boundary) of the class containing the median
$f_c$ = sum of all frequencies below L
$f_m$ = frequency of the class interval containing the median
n = number of cases or observations
i = class interval size
Class      f   Class mark (M)     fM      d     fd
90-99      4       94.5          378      2      8
80-89     14       84.5         1183      1     14
*70-79    14       74.5         1043      0      0
60-69     11       64.5          709.5   -1    -11
50-59      5       54.5          272.5   -2    -10
40-49      4       44.5          178     -3    -12
30-39      3       34.5          103.5   -4    -12
20-29      2       24.5           49     -5    -10
10-19      3       14.5           43.5   -6    -18
∑f = n = 60

For grouped data (median class 70-79, so L = 69.5, $f_c$ = 28, $f_m$ = 14, i = 10):
$$\text{Median} = L + \left(\frac{0.5n - f_c}{f_m}\right) i = 69.5 + \left(\frac{0.5(60) - 28}{14}\right)(10) \approx 70.93$$
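
The grouped-median formula can be sketched as a function that first locates the median class from the running total of frequencies. The name `grouped_median` is illustrative, and the sketch assumes whole-number data, so the exact lower limit L is the lower class limit minus 0.5.

```python
def grouped_median(lower_limits, frequencies, size):
    """Median = L + ((0.5n - fc) / fm) * i for classes listed lowest first.

    Assumes whole-number data, so L = lower class limit - 0.5.
    """
    n = sum(frequencies)
    half = 0.5 * n
    fc = 0                                   # sum of frequencies below L
    for lower, fm in zip(lower_limits, frequencies):
        if fc + fm >= half:                  # this class contains the median
            L = lower - 0.5
            return L + (half - fc) / fm * size
        fc += fm
    raise ValueError("no observations")


lower_limits = [10, 20, 30, 40, 50, 60, 70, 80, 90]
frequencies = [3, 2, 3, 4, 5, 11, 14, 14, 4]

median = grouped_median(lower_limits, frequencies, size=10)
print(round(median, 2))                      # 70.93
```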

3. Mode - the point on the measurement scale with the maximum
frequency in the given distribution
For ungrouped data:
The mode is the measurement which occurs most frequently.
Example: Find the mode of the following data:
4 7 7 7 9 10 12 11
13 13 13 13 13 13 15 16

The mode is 13.
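
For ungrouped data the mode is just a tally. A minimal sketch using `collections.Counter` from the standard library (variable names are illustrative); it returns a list because a data set may have more than one most frequent value.

```python
from collections import Counter

data = [4, 7, 7, 7, 9, 10, 12, 11, 13, 13, 13, 13, 13, 13, 15, 16]

counts = Counter(data)                   # how often each measurement occurs
highest = max(counts.values())           # the maximum frequency
modes = [value for value, count in counts.items() if count == highest]

print(modes, highest)                    # [13] 6
```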
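For grouped data:

$$\text{Mode} = L + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) i$$
where L = exact lower limit (lower class boundary) of the modal class
Δ1 = frequency of the modal class minus the frequency of the class just below it
Δ2 = frequency of the modal class minus the frequency of the class just above it
i = class interval size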
Class      f   Class mark (M)     fM      d     fd
90-99      4       94.5          378      2      8
80-89     14       84.5         1183      1     14
*70-79    14       74.5         1043      0      0
60-69     11       64.5          709.5   -1    -11
50-59      5       54.5          272.5   -2    -10
40-49      4       44.5          178     -3    -12
30-39      3       34.5          103.5   -4    -12
20-29      2       24.5           49     -5    -10
10-19      3       14.5           43.5   -6    -18
∑f = n = 60

For grouped data (taking the starred class 70-79 as the modal class, so L = 69.5,
Δ1 = 14 - 11 = 3, Δ2 = 14 - 14 = 0, i = 10):
$$\text{Mode} = 69.5 + \left(\frac{3}{3 + 0}\right)(10) = 79.5$$
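
The grouped-mode computation can be sketched as follows. The function name `grouped_mode` is illustrative. Two assumptions are made: the data are whole numbers (so L is the lower class limit minus 0.5), and when two classes tie for the highest frequency, as 70-79 and 80-89 do here, the lower one is taken as the modal class, which matches the starred class in the table.

```python
def grouped_mode(lower_limits, frequencies, size):
    """Mode = L + (d1 / (d1 + d2)) * i for classes listed lowest first.

    Assumptions: whole-number data (L = lower limit - 0.5); on a tie,
    the lowest class with the maximum frequency is the modal class.
    """
    k = frequencies.index(max(frequencies))           # first class with the top frequency
    f_modal = frequencies[k]
    f_below = frequencies[k - 1] if k > 0 else 0
    f_above = frequencies[k + 1] if k < len(frequencies) - 1 else 0
    d1 = f_modal - f_below                            # delta 1
    d2 = f_modal - f_above                            # delta 2
    if d1 + d2 == 0:
        raise ValueError("formula undefined: neighbouring classes have the same frequency")
    L = lower_limits[k] - 0.5
    return L + d1 / (d1 + d2) * size


lower_limits = [10, 20, 30, 40, 50, 60, 70, 80, 90]
frequencies = [3, 2, 3, 4, 5, 11, 14, 14, 4]

mode = grouped_mode(lower_limits, frequencies, size=10)
print(mode)                                           # 79.5
```

Because Δ2 = 0, the estimate lands exactly on the boundary shared by the two tied classes, 79.5.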

Assessment Task 1: Measures of Central Tendency
Annual Salary        Number of Faculty
P49,000-58,999              10
39,000-48,999               35
29,000-38,999               20
19,000-28,999                8
Determine the following:
a. Mean
b. Median
c. Mode
