G.E. 4 Pre-Final Handout
Statistics is the study of the collection, analysis, interpretation, presentation, and organization of data. In other words, it is a
mathematical discipline for collecting and summarizing data.
According to the Merriam-Webster dictionary, statistics is defined as “classified facts representing the conditions of a people in a
state – especially the facts that can be stated in numbers or any other tabular or classified arrangement”.
According to statistician Sir Arthur Lyon Bowley, statistics is defined as “Numerical statements of facts in any department of
inquiry placed in relation to each other”.
Mathematical Statistics
Mathematical statistics is the application of Mathematics to Statistics, which was initially conceived as the science of the state,
that is, the collection and analysis of facts about a country: its economy, military, population, and so forth.
Mathematical techniques used for different analytics include mathematical analysis, linear algebra, stochastic analysis,
differential equations and measure-theoretic probability theory.
Scope
Statistics is used in many fields, such as psychology, geology, sociology, weather forecasting, probability and many more. The
goal of statistics is to gain understanding from data. Because it focuses on applications, it is considered a distinct
mathematical science.
Population
Population consists of all subjects (people, objects, events) that are being studied.
Sample
Sample refers to a group of subjects selected from a population of interest. The size of the sample is always less than the total
size of the population.
Two Broad Areas of Statistics
Descriptive Statistics – refers to the processes that are used in presenting and describing data.
Inferential Statistics – refers to the processes of making inferences about the population based on observations of a smaller
group.
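The relationship between a population, a sample, and the two areas of statistics can be sketched in Python. This is only an illustration: the population of 500 student heights is invented, and the names `population` and `sample` are assumptions made for the example.

```python
import random
import statistics

random.seed(4)  # fixed seed so the sketch gives the same result on every run

# Hypothetical population: heights (cm) of 500 students, invented for illustration.
population = [round(random.gauss(160, 8), 1) for _ in range(500)]

# Sample: 30 subjects selected from the population, without repeats.
sample = random.sample(population, 30)

# Descriptive statistics: describe the data that were actually observed.
sample_mean = statistics.mean(sample)

# Inferential statistics: use the sample mean as an estimate of the population mean.
population_mean = statistics.mean(population)
print(f"sample mean = {sample_mean:.1f} cm, population mean = {population_mean:.1f} cm")
```

In practice the population mean is usually unknown; it is computed here only to show how close the estimate from the smaller group can be.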
Data
Data is a collection of facts, such as numbers, words, measurements, observations etc.
Types of Data
Variables
Variables are those characteristics which can take on more than one value and which show variation from person to person,
from case to case, or from event to event. If, for instance, a researcher observes a group of individuals and measures their
age, height, weight, intelligence, achievement and attitudes, such characteristics are called variables because one would
expect to find variations in these attributes from person to person.
The types of variables are: Quantitative, Qualitative, Independent, Dependent, Discrete, Continuous and Suppressor Variables.
Quantitative Variables
These are variables which vary in quantity and are therefore recorded in numerical form. Examples are age, test scores, time
required to solve problems, height, weight etc.
Qualitative Variables
These are variables which vary in quality and may be recorded by means of a verbal label or through the use of code numbers.
Examples are sex, colour of hair or eyes, handedness, shape of face etc.
Independent Variables
These are the variables which the experimenter manipulates in an experimental study in order to see what effect
changes in them have on another variable which is hypothesized to depend on them. Variations in an
independent variable are presumed to result in variations in the dependent variable.
Dependent Variables
These are variables so called because their values are thought to depend on, or vary with the values of the independent
variables. Note that in nonexperimental studies the independent variable is not manipulated by the experimenter, but is a
pre-existing variable which is hypothesized to influence a dependent variable.
Note also that those characteristics which serve as variables in one study may be kept constant in another study by selecting
members of a sample on the basis of similarity in these characteristics. The term constant can be used in reference to
characteristics which do not vary for the members of a particular group.
Discrete Variables
These are variables which can take on only a specific set of values. When they are counts, they yield whole number values only, and
no fractions. Take, for instance, the number of students in a class. This can be 35, 40, 42, 45 etc., but cannot be 35½ or 40¼.
Family size is another discrete variable. Your own family may be composed of 4, 5, 7 or more people, but values between
these numbers are not possible, as you cannot have 4½ or 5½ people.
Discrete variables may be either qualitative (sex, marital status, handedness, state of origin, nationality) or quantitative
(number of books in the library, number of goats in the farm, number of graduate teachers in the school, number of boys
doing chemistry in a class etc.).
Continuous Variables
These are variables which can assume any value, including fractional values, within a range of values. In other words,
continuous variables are measured in both whole and fractional units. Age, height, weight, intelligence test scores, achievement test
scores, daily caloric intake, time, length etc. are all examples of continuous variables. An individual can be described as 12½
years old, 1.7 m tall, and 70.3 kg in weight. These variables are always quantitative in nature and exist along a continuum
from the smallest amount of the variable at one extreme to the largest amount possible at the other end. For instance, to go
from 25 to 26, one must pass through a large number of fractional parts such as 25.0, 25.1, 25.2 etc. Measurement here is
always an approximation of the true value, because no matter how accurately you try to measure, it is not possible to
measure and record all the possible values of a continuous variable. Continuous variables are therefore measured to the nearest
convenient unit. Take 25.0 to 25.1 as an example: between them lie 25.01, 25.02, 25.03 etc., and between 25.00 and 25.01 lie
25.001, 25.002, 25.003 etc. Counting every value from 25.0 to 26 would never end.
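The difference between the two kinds of variables can be illustrated with a short Python sketch. The class sizes come from the text above; the "true" weight of 70.3127 kg is a hypothetical value chosen only to show rounding to the nearest convenient unit.

```python
from fractions import Fraction

# Discrete: a count can only take whole-number values.
class_sizes = [35, 40, 42, 45]

# Continuous: between any two values there is always another value.
low, high = Fraction(25), Fraction(26)
between = []
for _ in range(5):
    mid = (low + high) / 2   # the midpoint lies strictly inside the interval
    between.append(mid)
    high = mid               # shrink the interval and repeat
print([float(x) for x in between])

# A continuous measurement is recorded to the nearest convenient unit.
true_weight_kg = 70.3127     # hypothetical exact value
recorded_kg = round(true_weight_kg, 1)
print(recorded_kg)
```

The loop could run for any number of steps and would still find a new value between 25 and 26, which is why a continuous variable can only ever be recorded approximately.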
• To find the mean, add up all the numbers and divide by the number of numbers.
• To find the median, place all the numbers in order and select the middle number (or, if there are two middle numbers, their mean).
• The mode is the number which appears most often.
• The range gives an idea of how the data are spread out and is the difference between the smallest and largest
values.
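The four rules above can be checked with Python's standard `statistics` module. This is a minimal sketch that uses the data set from the first worked example that follows; the variable names are chosen for the illustration.

```python
import statistics

data = [5, 6, 2, 4, 7, 8, 3, 5, 6, 6]

mean = statistics.mean(data)         # sum of the values divided by how many there are
median = statistics.median(data)     # middle of the sorted values (mean of the two middle ones here)
mode = statistics.mode(data)         # value that appears most often
data_range = max(data) - min(data)   # largest value minus smallest value

print(mean, median, mode, data_range)  # 5.2 5.5 6 6
```

Because this data set has an even number of values, `statistics.median` averages the 5th and 6th sorted values, which is the same step taken by hand in the solution.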
Worked Example 1
Find (a) the mean, (b) the median, (c) the mode and (d) the range of this set of data.
5, 6, 2, 4, 7, 8, 3, 5, 6, 6
Solution
(a) The mean is
(5 + 6 + 2 + 4 + 7 + 8 + 3 + 5 + 6 + 6) / 10 = 52 / 10
= 5.2
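(b) To find the median, place all the numbers in order.
2, 3, 4, 5, 5, 6, 6, 6, 7, 8
There are 10 numbers, so the median is the mean of the two middle numbers, the 5th and the 6th.
median = (5 + 6) / 2 = 11 / 2 = 5.5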
(c) From the list above it is easy to see that 6 appears more than any other number, so
mode = 6
(d) The range is the difference between the smallest and largest numbers, in this case 2 and 8. So the range is 8
− 2 = 6.
Worked Example 2
Five people play golf and at one hole their scores are
3, 4, 4, 5, 7
For these scores, find (a) the mean, (b) the median, (c) the mode and (d) the range.
Solution
(a) The mean is
(3 + 4 + 4 + 5 + 7) / 5 = 23 / 5
= 4.6
(b) The numbers are already in order and the middle number is 4. So
median = 4
(c) The score 4 occurs most often, so,
mode = 4
(d) The range is the difference between the smallest and largest numbers, in this case 3 and 7, so
range = 7 − 3 = 4
Worked Example 1
A football team keeps records of the number of goals it scores per match during a season. The list is shown below.
Find the mean number of goals per match.
No. of Goals    Frequency
0               8
1               10
2               12
3               3
4               5
5               2
Solution
The previous table can be used, with a third column added. The mean can now be calculated.
No. of Goals    Frequency    No. of Goals ⋅ Frequency
0               8            0 ⋅ 8 = 0
1               10           1 ⋅ 10 = 10
2               12           2 ⋅ 12 = 24
3               3            3 ⋅ 3 = 9
4               5            4 ⋅ 5 = 20
5               2            5 ⋅ 2 = 10
TOTALS          40           73
Mean = 73 / 40
= 1.825
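The same calculation can be written as a short Python sketch: multiply each value by its frequency, add the products, and divide by the total frequency. The dictionary name `goals_frequency` is an assumption made for the illustration.

```python
# No. of goals -> frequency (number of matches), copied from the table.
goals_frequency = {0: 8, 1: 10, 2: 12, 3: 3, 4: 5, 5: 2}

total_matches = sum(goals_frequency.values())                              # 40
total_goals = sum(goals * freq for goals, freq in goals_frequency.items())  # 73

mean_goals = total_goals / total_matches
print(mean_goals)  # 1.825
```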
Worked Example 2
A police station kept records of the number of road traffic accidents in their area each day for 100 days. The figures below give
the number of accidents per day.
1 4 3 5 5 2 5 4 3 2 0 3 1 2 2 3 0 5 2 1
3 3 2 6 2 1 6 1 2 2 3 2 2 2 2 5 4 4 2 3
3 1 4 1 7 3 3 0 2 5 4 3 3 4 3 4 5 3 5 2
4 4 6 5 2 4 5 5 3 2 0 3 3 4 5 2 3 3 4 4
1 3 5 1 1 2 2 5 6 6 4 6 5 8 2 5 3 3 5 4
Find the mean number of accidents per day.
Solution
The first step is to draw out and complete a tally chart. The final column shown below can then be added and completed.
No. of Accidents    Tally                     Frequency    No. of Accidents ⋅ Frequency
0                   IIII                      4            0 ⋅ 4 = 0
1                   IIII IIII                 10           1 ⋅ 10 = 10
2                   IIII IIII IIII IIII II    22           2 ⋅ 22 = 44
3                   IIII IIII IIII IIII III   23           3 ⋅ 23 = 69
4                   IIII IIII IIII I          16           4 ⋅ 16 = 64
5                   IIII IIII IIII II         17           5 ⋅ 17 = 85
6                   IIII I                    6            6 ⋅ 6 = 36
7                   I                         1            7 ⋅ 1 = 7
8                   I                         1            8 ⋅ 1 = 8
TOTALS                                        100          323
Mean = 323 / 100
= 3.23
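The tally can also be done by machine. The sketch below uses `collections.Counter` from the Python standard library to count each value in the 100 recorded days and then divides the total number of accidents by the number of days. The variable names are chosen for the illustration.

```python
from collections import Counter

# Accidents per day for 100 days, copied row by row from the example.
accidents = [
    1, 4, 3, 5, 5, 2, 5, 4, 3, 2, 0, 3, 1, 2, 2, 3, 0, 5, 2, 1,
    3, 3, 2, 6, 2, 1, 6, 1, 2, 2, 3, 2, 2, 2, 2, 5, 4, 4, 2, 3,
    3, 1, 4, 1, 7, 3, 3, 0, 2, 5, 4, 3, 3, 4, 3, 4, 5, 3, 5, 2,
    4, 4, 6, 5, 2, 4, 5, 5, 3, 2, 0, 3, 3, 4, 5, 2, 3, 3, 4, 4,
    1, 3, 5, 1, 1, 2, 2, 5, 6, 6, 4, 6, 5, 8, 2, 5, 3, 3, 5, 4,
]

tally = Counter(accidents)   # value -> frequency, the same job as the tally chart
for value in sorted(tally):
    print(value, tally[value], value * tally[value])

total_days = sum(tally.values())
total_accidents = sum(value * freq for value, freq in tally.items())
mean_accidents = total_accidents / total_days
print(mean_accidents)  # 3.23
```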
Worked Example 3
3 4 5 6 5
5 1 2 3 3
4 7 5 1 5
2 5 6 5 4
6 4 5 4 3
(a) Copy and complete the frequency table below to present the information given above.
Marks Frequency
1 2
2 2
3 4
4 -
5 -
6 3
7 1
(b) Using the frequency distribution, state (i) the modal mark, (ii) the median mark and (iii) the range.
Solution
(a) Marks    Frequency
    1        2
    2        2
    3        4
    4        5
    5        8
    6        3
    7        1
    (Check: total frequency = 2 + 2 + 4 + 5 + 8 + 3 + 1 = 25)
(b) (i) Modal mark = 5 (with frequency 8)
    (ii) Median mark = 4 (as we need the 13th number, when in order)
    (iii) Range = 7 − 1 = 6
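Reading the mode, median and range from a frequency table can be sketched in Python as follows. The sketch assumes an odd total frequency, as in this example, so that the median is a single position, (n + 1) / 2; the variable names are chosen for the illustration.

```python
# Mark -> frequency, from the completed table.
freq_table = {1: 2, 2: 2, 3: 4, 4: 5, 5: 8, 6: 3, 7: 1}

n = sum(freq_table.values())                      # total frequency, 25

# Mode: the mark with the largest frequency.
modal_mark = max(freq_table, key=freq_table.get)

# Median: add up frequencies in order until the middle position is reached.
median_position = (n + 1) // 2                    # the 13th value when n = 25
running_total = 0
for mark in sorted(freq_table):
    running_total += freq_table[mark]
    if running_total >= median_position:
        median_mark = mark
        break

# Range: highest mark minus lowest mark.
mark_range = max(freq_table) - min(freq_table)

print(modal_mark, median_mark, mark_range)  # 5 4 6
```

The running total reaches 2, 4, 8 and then 13 at mark 4, which is why the 13th number in order is a 4.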
EXERCISES
Directions: Read the following carefully. Do as indicated. Write your answers on a separate sheet.
A. Identify whether each of the following situations makes use of descriptive statistics or inferential statistics.
1. Mr. Cruz checked the classroom attendance and recorded the names of the absentees.
2. In a pen factory, a production manager is asked to check 20 pens from each box of 100 pens to determine the
quality of the product.
3. A certain supermarket offers a free taste of a brand of fruit juice. Each customer is offered 10 ml of the juice in a
cup.
B. Identify the population and sample in each of the following situations.
1. A certain supermarket offers a free taste of a brand of fruit juice. Each customer is offered 10 ml of the juice in a
cup.
2. In a pen factory, a production manager is asked to check 20 pens from each box of 100 pens to determine the
quality of the product.
3. Mr. Uy wants to know the percentage of defective drums produced in a week by examining 10 drums each day,
taken at various times during the day.
4. Mang Al bought a cavan of rice. He first examined only a handful of rice from the cavan to determine its quality.
5. In a blood test, the medical technician or nurse takes only a few cubic centimeters of blood from the patient.
C. Write D if the item is a discrete variable or C if it is a continuous variable.
1. Number of classrooms in CMC.
2. Travel time from Cataingan to Masbate City.
3. Distance from CMC to your house.
4. Weight of a book.
5. Number of chairs in a classroom.
6. Height of the tallest man among your classmates.
7. Number of freshmen students in CMC.
D. Write N if the item is at the nominal level, O if ordinal, I if interval and R if ratio.
1. ID numbers
2. Citizenships
3. Blood Types
4. Grades in Math
5. Movie Ratings
6. Mobile Numbers
7. Gender
8. House Number
E. Find the mean, median, mode and range of each set of numbers below.
1. 3, 4, 7, 3, 5, 2, 6, 10
2. 8, 10, 12, 14, 7, 16, 5, 7, 9, 11
3. 17, 18, 16, 17, 17, 14, 22, 15, 16, 17, 14, 12
LESSON 3 – FREQUENCY DISTRIBUTION TABLE
Array – this is the arrangement of the data from the highest to lowest or vice versa.
Range – this is the difference between the highest number and the lowest number.
Class frequency – this refers to the number of observations belonging to a class interval, or the number of items within a
category.
Class Interval – this is the grouping or category defined by a lower limit and an upper limit.
Class Boundaries – they are more precise expressions of the class limits, found by extending each limit by 0.5 when the data are
whole numbers. Each boundary is situated midway between the upper limit of one interval and the lower limit of the next interval.
Class Mark – this is computed by adding the lower and upper limit and the sum divided by 2.
Class Size – this is the width of class interval.
A frequency distribution table is commonly used to organize and interpret data. Some companies use it to analyze their sales
so that they can present their reports easily.
Example:
Given below are the scores of 25 students in their G.E. 4 Module 2. Make a frequency distribution table and a
graph.
30 50 27 28 32
41 62 51 50 35
37 38 43 57 61
28 36 36 30 24
36 50 37 55 54
CLASS      TALLY      CLASS      CLASS   CLASS BOUNDARY  CUMULATIVE FREQUENCY  RELATIVE
INTERVAL   MARKS      FREQUENCY  MARK    LOWER   UPPER   <CF        >CF        FREQUENCY (%)
24 – 31    IIII I     6          27.5    23.5    31.5    6          25         24%
32 – 39    IIII III   8          35.5    31.5    39.5    14         19         32%
40 – 47    II         2          43.5    39.5    47.5    16         11         8%
48 – 55    IIII I     6          51.5    47.5    55.5    22         9          24%
56 – 63    III        3          59.5    55.5    63.5    25         3          12%
                      n = 25                                                   Total = 100%
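Each column of the table can be computed from the raw scores. The Python sketch below is one way to do it, under the assumptions that the first class starts at the lowest score (24) and that the class size is 8, as in the table; the dictionary keys are names chosen for the illustration.

```python
scores = [30, 50, 27, 28, 32, 41, 62, 51, 50, 35, 37, 38, 43,
          57, 61, 28, 36, 36, 30, 24, 36, 50, 37, 55, 54]

class_size = 8        # width of each class interval (assumed from the table)
n = len(scores)       # 25

rows = []
less_than_cf = 0
lower = min(scores)   # first lower limit, 24
while lower <= max(scores):
    upper = lower + class_size - 1
    freq = sum(lower <= s <= upper for s in scores)   # class frequency
    less_than_cf += freq
    rows.append({
        "interval": (lower, upper),
        "frequency": freq,
        "class_mark": (lower + upper) / 2,            # midpoint of the class limits
        "boundaries": (lower - 0.5, upper + 0.5),     # limits extended by 0.5
        "<CF": less_than_cf,                          # scores at or below this class
        ">CF": n - less_than_cf + freq,               # scores at or above this class
        "relative_%": 100 * freq / n,
    })
    lower += class_size

for row in rows:
    print(row)
```

Changing `class_size` or the starting lower limit produces a different but equally valid grouping of the same scores.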
[LINE GRAPH: class frequency plotted against CLASS INTERVAL (24 – 31, 32 – 39, 40 – 47, 48 – 55, 56 – 63)]
[BAR GRAPH: class frequency for each CLASS INTERVAL (24 – 31, 32 – 39, 40 – 47, 48 – 55, 56 – 63)]
[PIE GRAPH, G.E. 4 MODULE 2 SCORES: 24 – 31 (24%), 32 – 39 (32%), 40 – 47 (8%), 48 – 55 (24%), 56 – 63 (12%)]
Exercise:
Given below are the scores of 60 students in the Mathematics prelim examination.
17 31 36 26 34 32
44 33 37 39 45 21
24 38 40 42 39 32
43 18 24 32 49 33
33 33 40 24 46 22
29 33 37 30 43 43
26 39 57 30 40 33
25 33 48 39 34 29
29 37 39 35 41 29
23 32 48 28 45 19
Find or construct the following:
1. Mean
2. Median
3. Mode
4. Range
5. Class size
6. Frequency Distribution Table
7. Line graph
8. Bar graph
9. Pie graph