G.E. 4 Pre-Final Handout


LESSON 1 - BASIC CONCEPTS IN STATISTICS

Statistics is the study of the collection, analysis, interpretation, presentation, and organization of data. In other words, it is a
mathematical discipline for collecting, summarizing, and interpreting data.
According to the Merriam-Webster dictionary, statistics is defined as “classified facts representing the conditions of a people in a
state – especially the facts that can be stated in numbers or any other tabular or classified arrangement”.
According to statistician Sir Arthur Lyon Bowley, statistics is defined as “Numerical statements of facts in any department of
inquiry placed in relation to each other”.

Mathematical Statistics
Mathematical statistics is the application of Mathematics to Statistics, which was initially conceived as the science of the state:
the collection and analysis of facts about a country, such as its economy, military, population, and so forth.
Mathematical techniques used for different analytics include mathematical analysis, linear algebra, stochastic analysis,
differential equations, and measure-theoretic probability theory.
Scope
Statistics is used in many fields, such as psychology, geology, sociology, weather forecasting, probability, and many more. The
goal of statistics is to gain understanding from data. Because it focuses on applications, it is considered a distinct
mathematical science.
Population
Population consists of all subjects (people, objects, events) that are being studied.
Sample
Sample refers to a group of subjects selected from a population of interest. The size of the sample is always less than the total
size of the population.
Two Broad Areas of Statistics
Descriptive Statistics – refers to the processes that are used in presenting and describing data.
Inferential Statistics – refers to the processes of making inferences about the population based on observations of a smaller
group.
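The relationship between a population, a sample, and the two areas of statistics can be sketched in Python. This is only an illustration: the population of 100 heights below is hypothetical (generated by a fixed formula, not real measurements), and the sample size of 20 is an arbitrary choice.

```python
import random
import statistics

# Hypothetical population: heights (cm) of 100 people, produced by a fixed
# formula so that the example is reproducible. These are not real data.
population = [150 + (i * 7) % 40 for i in range(100)]

# Select a sample of 20 subjects without replacement (seeded for repeatability).
rng = random.Random(0)
sample = rng.sample(population, 20)

# Descriptive statistics: summarize the data that were actually observed.
sample_mean = statistics.mean(sample)

# Inferential statistics: treat the sample mean as an estimate of the
# population mean, which in practice is usually unknown.
population_mean = statistics.mean(population)

print(f"sample size = {len(sample)}, population size = {len(population)}")
print(f"sample mean = {sample_mean:.2f} (estimate)")
print(f"population mean = {population_mean:.2f} (normally unknown)")
```

The sample mean will generally be close to, but not exactly equal to, the population mean.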
Data
Data is a collection of facts, such as numbers, words, measurements, observations, etc.
Types of Data

1. Qualitative data - it is descriptive data.
o Example - She can run fast. He is thin.
2. Quantitative data - it is numerical information.
o Example - An octopus is an eight-legged creature.

Variables
Variables are characteristics that can take on more than one value and that show variation from person to person,
from case to case, or from event to event. If, for instance, a researcher observes a group of individuals and measures their
age, height, weight, intelligence, achievement and attitudes, such characteristics are called variables because one would
expect these attributes to vary from person to person.

The types of variables are: Quantitative, Qualitative, Independent, Dependent, Discrete, Continuous and Suppressor Variables.

Quantitative Variables
These are variables which vary in quantity and are therefore recorded in numerical form. Examples are age, test scores, time
required to solve problems, height, weight, etc.

Qualitative Variables
These are variables which vary in quality and may be recorded by means of a verbal label or through the use of code numbers.
Examples are sex, colour of hair or eyes, handedness, shape of face etc.

Independent Variables
These are the variables which the experimenter manipulates in an experimental study in order to see what effect
changes in them have on another variable which is hypothesized to be dependent upon them. Variations in an
independent variable are presumed to result in variations in the dependent variable.

Dependent Variables
These are variables so called because their values are thought to depend on, or vary with the values of the independent
variables. Note that in nonexperimental studies the independent variable is not manipulated by the experimenter, but is a
pre-existing variable which is hypothesized to influence a dependent variable.

Note also that those characteristics which serve as variables in one study may be kept constant in another study by selecting
members of a sample on the basis of similarity in these characteristics. The term constant can be used in reference to
characteristics which do not vary for the members of a particular group.

Discrete Variables
These are variables which can take on only a specific set of values. Quantitative discrete variables yield only whole-number
values, with no fractions. Take, for instance, the number of students in a class. This can be 35, 40, 42, 45, etc., but cannot be
35½ or 40¼. Family size is another discrete variable. Your own family, for example, may be composed of 4, 5, 7 or
more people, but values between these numbers are not possible, as you cannot have 4½ or 5½ people.
Discrete variables may be either qualitative (sex, marital status, handedness, state of origin, nationality) or quantitative
(number of books in the library, number of goats on the farm, number of graduate teachers in the school, number of boys
doing chemistry in a class, etc.).

Continuous Variables
These are variables which can assume any value, including fractional values, within a range of values. In other words,
continuous variables are measured in both whole and fractional units. Age, height, weight, intelligence test scores, achievement
test scores, daily caloric intake, time, length, etc. are all examples of continuous variables. An individual can be described as 12½
years old, 1.7 m tall, and 70.3 kg in weight. These variables are always quantitative in nature and exist along a continuum
from the smallest amount of the variable at one extreme to the largest amount possible at the other end. For instance, to go
from 25 to 26, one must pass through a large number of fractional parts such as 25.0, 25.1, 25.2, etc. Measurement here is
always an approximation of the true value. This is because no matter how accurately you try to measure, it is not possible to
measure and record all the possible values of a continuous variable. These variables are therefore measured to the nearest
convenient unit. Take 25.0 to 25.1 as an example: between them lie 25.01, 25.02, 25.03, etc., and between 25.00 and 25.01
lie 25.001, 25.002, 25.003, etc. The values between 25.0 and 26 are endless and cannot all be counted.

Types of variables with regard to measurement (Level of measurement)


The first level of measurement is the nominal level of measurement. At this level, the values of the variable are
used only to classify the data; words, letters, and alphanumeric symbols can be used as well as numbers. Suppose
there are data about people belonging to three different gender categories. In this case, a person belonging to the female
gender could be classified as F, a person belonging to the male gender as M, and a transgender person
as T. This type of classification is the nominal level of measurement.
The second level of measurement is the ordinal level of measurement. This level of measurement depicts some ordered
relationship among the variable’s observations. Suppose a student scores the highest grade of 100 in the class. In this case, he
would be assigned the first rank. Then, another classmate scores the second highest grade of 92; she would be assigned the
second rank. A third student scores 81 and he would be assigned the third rank, and so on. The ranks show the order of the
scores but not how far apart the scores are. The ordinal level of measurement indicates an ordering of the measurements.
The third level of measurement is the interval level of measurement. The interval level of measurement not only classifies and
orders the measurements, but it also specifies that the distances between adjacent points on the scale are equal
all along the scale, from the low end to the high end. For example, on an anxiety scale measured at the interval
level, the difference between scores of 10 and 11 is the same as the difference between scores of 40 and 41.
A popular example of this level of measurement is temperature in degrees centigrade (Celsius), where, for
example, the distance between 94°C and 96°C is the same as the distance between 100°C and 102°C.
The fourth level of measurement is the ratio level of measurement. At this level of measurement, the observations, in addition
to having equal intervals, have a true zero: a value of zero means that none of the quantity is present. The true zero makes
this type of measurement unlike the other types, although its other properties are the same as those of the interval level of
measurement. As at the interval level, the divisions between the points on the scale have an equivalent distance between them.
The researcher should note that among these levels of measurement, the nominal level is simply used to classify data,
whereas the interval level and the ratio level are much more exact.
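The difference between the interval and ratio levels can be checked with a little arithmetic. The short Python sketch below uses illustrative values that are not from the handout (10°C, 20°C, 35 kg, 70 kg) and the standard Celsius-to-kelvin conversion.

```python
# Interval level: differences are meaningful, ratios are not.
low_c, high_c = 10.0, 20.0        # temperatures in degrees Celsius (illustrative)
diff_c = high_c - low_c           # a 10-degree difference is meaningful

# 0 degrees Celsius does not mean "no temperature", so 20 degrees is not
# "twice as hot" as 10 degrees. Converting to kelvin, a scale with a true
# zero, shows the actual ratio.
low_k, high_k = low_c + 273.15, high_c + 273.15
ratio_c = high_c / low_c          # 2.0, but not a meaningful comparison
ratio_k = high_k / low_k          # about 1.035

# Ratio level: weight has a true zero, so ratios are meaningful.
weight_a, weight_b = 35.0, 70.0   # kilograms (illustrative values)
ratio_w = weight_b / weight_a     # 2.0: b really is twice as heavy as a

print(diff_c, ratio_c, round(ratio_k, 3), ratio_w)
```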

LESSON 2 – MEASURES OF CENTRAL TENDENCY


Mean, Median, Mode and Range
The mean, median and mode are three different ways of describing the average.

• To find the mean, add up all the numbers and divide by the number of numbers.
• To find the median, place all the numbers in order and select the middle number.
• The mode is the number which appears most often.
• The range gives an idea of how the data are spread out and is the difference between the smallest and largest
values.

Worked Example 1
Find (a) the mean, (b) the median, (c) the mode, and (d) the range of this set of data.

5, 6, 2, 4, 7, 8, 3, 5, 6, 6
Solution
(a) The mean is

mean = (5 + 6 + 2 + 4 + 7 + 8 + 3 + 5 + 6 + 6) / 10 = 52 / 10 = 5.2
(b) To find the median, place all the numbers in order.
2, 3, 4, 5, 5, 6, 6, 6, 7, 8

As there are two middle numbers in this example, 5 and 6, the median is their mean:

median = (5 + 6) / 2 = 11 / 2 = 5.5
(c) From the list above it is easy to see that 6 appears more than any other number, so

mode = 6
(d) The range is the difference between the smallest and largest numbers, in this case 2 and 8. So the range is 8
− 2 = 6.
Worked Example 2
Five people play golf and at one hole their scores are
3, 4, 4, 5, 7
For these scores, find (a) the mean, (b) the median, (c) the mode, and (d) the range.

Solution
(a) The mean is

mean = (3 + 4 + 4 + 5 + 7) / 5 = 23 / 5 = 4.6

(b) The numbers are already in order and the middle number is 4. So

median = 4
(c) The score 4 occurs most often, so,

mode = 4
(d) The range is the difference between the smallest and largest numbers, in this case 3 and 7, so
range = 7 − 3 = 4
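The answers to Worked Examples 1 and 2 can be checked with Python's standard `statistics` module. This is a minimal sketch; the helper name `summarize` is made up for this illustration.

```python
import statistics

def summarize(data):
    """Return (mean, median, mode, range) for a list of numbers."""
    return (
        statistics.mean(data),
        statistics.median(data),   # averages the two middle values if needed
        statistics.mode(data),     # the most frequent value
        max(data) - min(data),     # range = largest value - smallest value
    )

example_1 = [5, 6, 2, 4, 7, 8, 3, 5, 6, 6]
example_2 = [3, 4, 4, 5, 7]

print(summarize(example_1))  # (5.2, 5.5, 6, 6)
print(summarize(example_2))  # (4.6, 4, 4, 4)
```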

Finding the Mean from Tables and Tally Charts


Often data are collected into tables or tally charts. This section considers how to find the mean in such cases.

Worked Example 1
A football team keeps records of the number of goals it scores per match during a season. The list is shown below.
Find the mean number of goals per match.

No. of Goals    Frequency
0               8
1               10
2               12
3               3
4               5
5               2

Solution
The previous table can be used, with a third column added.

No. of Goals    Frequency    No. of Goals ⋅ Frequency
0               8            0 ⋅ 8 = 0
1               10           1 ⋅ 10 = 10
2               12           2 ⋅ 12 = 24
3               3            3 ⋅ 3 = 9
4               5            4 ⋅ 5 = 20
5               2            5 ⋅ 2 = 10
TOTALS          40           73

The mean can now be calculated.

Mean = 73 / 40 = 1.825
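The same calculation can be written as a short Python sketch: multiply each value by its frequency, add the products, and divide by the total frequency. The variable names are chosen only for this illustration.

```python
# Frequency table from the football example: {goals per match: frequency}
goal_frequencies = {0: 8, 1: 10, 2: 12, 3: 3, 4: 5, 5: 2}

# Total number of matches is the sum of the frequencies.
total_matches = sum(goal_frequencies.values())

# Total number of goals is the sum of (goals x frequency) over all rows.
total_goals = sum(goals * freq for goals, freq in goal_frequencies.items())

mean_goals = total_goals / total_matches
print(total_matches, total_goals, mean_goals)  # 40 73 1.825
```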
Worked Example 2
A police station kept records of the number of road traffic accidents in their area each day for 100 days. The figures below give
the number of accidents per day.

1 4 3 5 5 2 5 4 3 2 0 3 1 2 2 3 0 5 2 1

3 3 2 6 2 1 6 1 2 2 3 2 2 2 2 5 4 4 2 3
3 1 4 1 7 3 3 0 2 5 4 3 3 4 3 4 5 3 5 2
4 4 6 5 2 4 5 5 3 2 0 3 3 4 5 2 3 3 4 4
1 3 5 1 1 2 2 5 6 6 4 6 5 8 2 5 3 3 5 4
Find the mean number of accidents per day.

Solution

The first step is to draw out and complete a tally chart. The final column shown below can then be added and completed.

Number of Accidents   Tally                          Frequency   No. of Accidents ⋅ Frequency
0                     IIII                           4           0 ⋅ 4 = 0
1                     IIIII IIIII                    10          1 ⋅ 10 = 10
2                     IIIII IIIII IIIII IIIII II     22          2 ⋅ 22 = 44
3                     IIIII IIIII IIIII IIIII III    23          3 ⋅ 23 = 69
4                     IIIII IIIII IIIII I            16          4 ⋅ 16 = 64
5                     IIIII IIIII IIIII II           17          5 ⋅ 17 = 85
6                     IIIII I                        6           6 ⋅ 6 = 36
7                     I                              1           7 ⋅ 1 = 7
8                     I                              1           8 ⋅ 1 = 8
TOTALS                                               100         323

Mean number of accidents per day = 323 / 100 = 3.23
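Tallying by hand is easy to get wrong with 100 values, so the chart can be checked by counting in Python. This sketch uses `collections.Counter` from the standard library to do the tallying; the data are the 100 daily figures given in the example.

```python
from collections import Counter

# Number of accidents per day for 100 days (from the worked example).
accidents = [
    1, 4, 3, 5, 5, 2, 5, 4, 3, 2, 0, 3, 1, 2, 2, 3, 0, 5, 2, 1,
    3, 3, 2, 6, 2, 1, 6, 1, 2, 2, 3, 2, 2, 2, 2, 5, 4, 4, 2, 3,
    3, 1, 4, 1, 7, 3, 3, 0, 2, 5, 4, 3, 3, 4, 3, 4, 5, 3, 5, 2,
    4, 4, 6, 5, 2, 4, 5, 5, 3, 2, 0, 3, 3, 4, 5, 2, 3, 3, 4, 4,
    1, 3, 5, 1, 1, 2, 2, 5, 6, 6, 4, 6, 5, 8, 2, 5, 3, 3, 5, 4,
]

# Counter plays the role of the tally chart: value -> frequency.
frequency = Counter(accidents)
for value in sorted(frequency):
    print(value, frequency[value], value * frequency[value])

total_days = sum(frequency.values())
total_accidents = sum(value * freq for value, freq in frequency.items())
mean_accidents = total_accidents / total_days
print(total_days, total_accidents, mean_accidents)  # 100 323 3.23
```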

Worked Example 3

The marks obtained by 25 pupils on a test are shown below.

3 4 5 6 5

5 1 2 3 3

4 7 5 1 5

2 5 6 5 4

6 4 5 4 3
(a) Copy and complete the frequency table below to present the information given above.
Marks Frequency
1 2
2 2
3 4
4 -
5 -
6 3
7 1
(b) Using the frequency distribution, state

(i) the modal mark

(ii) the median mark

(iii) the range.

Solution
(a) Marks Frequency
1 2
2 2
3 4
4 5
5 8
6 3
7 1
(Check: total frequency = 2 + 2 + 4 + 5 + 8 + 3 + 1 = 25)
(b) (i) Modal mark = 5 (with frequency 8)
(ii) Median mark = 4 (as we need the 13th number, when in order)
(iii) Range = 7 − 1 = 6
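Finding the median from a frequency table means adding up the frequencies until the middle position is reached. A minimal Python sketch of that idea, using the completed table from this example (the variable names are illustrative only):

```python
# Completed frequency table: {mark: frequency}
marks_frequency = {1: 2, 2: 2, 3: 4, 4: 5, 5: 8, 6: 3, 7: 1}

n = sum(marks_frequency.values())   # total number of pupils

# Mode: the mark with the largest frequency.
modal_mark = max(marks_frequency, key=marks_frequency.get)

# Median: with an odd n, it is the value in position (n + 1) / 2.
target_position = (n + 1) // 2
running_total = 0
median_mark = None
for mark in sorted(marks_frequency):
    running_total += marks_frequency[mark]   # cumulative frequency so far
    if running_total >= target_position:
        median_mark = mark
        break

# Range: every mark listed here has a non-zero frequency, so the keys
# give the smallest and largest observed marks.
mark_range = max(marks_frequency) - min(marks_frequency)

print(n, modal_mark, target_position, median_mark, mark_range)  # 25 5 13 4 6
```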
EXERCISES
Directions: Read the following carefully. Do as indicated. Write your answers on a separate sheet.
A. Identify whether each of the following situations makes use of descriptive statistics or inferential statistics.
1. Mr. Cruz checked the classroom attendance and recorded the names of the absentees.
2. In a pen factory, a production manager is asked to check 20 pens from each box of 100 pens to determine the
quality of the product.
3. A certain supermarket offers a free taste of a brand of fruit juice. Each customer is offered 10 ml of the juice in a
cup.
B. Identify the population and sample in each of the following situations.
1. A certain supermarket offers a free taste of a brand of fruit juice. Each customer is offered 10 ml of the juice in a
cup.
2. In a pen factory, a production manager is asked to check 20 pens from each box of 100 pens to determine the
quality of the product.
3. Mr. Uy wants to know the percentage of defective drums they produce in a week by examining 10 drums each day,
produced at various times during the day.
4. Mang Al bought a cavan of rice. He first examined only a handful of rice from the cavan to determine its quality.
5. In a blood test, the medical technician or nurse takes only a few cubic centimeters of blood from the patient.
C. Write D if the item is a discrete variable or C if it is a continuous variable.
1. Number of classrooms in CMC.
2. Travel time from Cataingan to Masbate City.
3. Distance from CMC to your house.
4. Weight of a book.
5. Number of chairs in a classroom.
6. Height of the tallest man among your classmates.
7. Number of freshmen students in CMC.
D. Write N if the item is Nominal, O if ordinal, I if interval and R if ratio level.
1. ID numbers
2. Citizenships
3. Blood Types
4. Grades in Math
5. Movie Ratings
6. Mobile Numbers
7. Gender
8. House Number

E. Find the mean, median, mode and range of each set of numbers below.
1. 3, 4, 7, 3, 5, 2, 6, 10
2. 8, 10, 12, 14, 7, 16, 5, 7, 9, 11
3. 17, 18, 16, 17, 17, 14, 22, 15, 16, 17, 14, 12
LESSON 3 – FREQUENCY DISTRIBUTION TABLE
Array – this is the arrangement of the data from the highest to lowest or vice versa.
Range – this is the difference between the highest number and the lowest number.
Class frequency – this refers to the number of observations belonging to a class interval, or the number of items within a
category.
Class Interval – this is the grouping or category defined by a lower limit and an upper limit.
Class Boundaries – these are more precise expressions of the class limits, extending each limit by 0.5 when the data are whole
numbers. Each boundary is situated midway between the upper limit of one interval and the lower limit of the next interval.
Class Mark – this is computed by adding the lower and upper limits and dividing the sum by 2.
Class Size – this is the width of the class interval.
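The definitions above can be tried out on a single class interval. The Python sketch below uses the interval 24 – 31 from the example in this lesson; the function name `describe_class` is made up for this illustration, and the 0.5 adjustment assumes whole-number data.

```python
def describe_class(lower_limit, upper_limit):
    """Return (class mark, lower boundary, upper boundary, class size)."""
    class_mark = (lower_limit + upper_limit) / 2
    lower_boundary = lower_limit - 0.5   # assumes whole-number data
    upper_boundary = upper_limit + 0.5
    class_size = upper_boundary - lower_boundary
    return class_mark, lower_boundary, upper_boundary, class_size

print(describe_class(24, 31))  # (27.5, 23.5, 31.5, 8.0)
```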

A frequency distribution table is usually used to organize and interpret data. Some companies use it to analyze their sales
so that they can present their reports easily.

Example:
Given below are the scores of 25 students in their G.E. 4 Module 2. Make a frequency distribution table and a
graph.
30 50 27 28 32
41 62 51 50 35
37 38 43 57 61
28 36 36 30 24
36 50 37 55 54

Solution:

First arrange the given data in an array.
24 27 28 28 30
30 32 35 36 36
36 37 37 38 41
43 50 50 50 51
54 55 57 61 62

Then find the range:
Range = Hv – Lv
Hv = highest value
Lv = lowest value
Range = 62 – 24 = 38

Get the class size:
cs = r / √n
r = range
n = no. of data
r = 38
n = 25
cs = 38 / √25 = 38 / 5 = 7.6 or 8
Note: always change the class size into a whole number.
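The class size computation can be written in a few lines of Python. Rounding up with `math.ceil` is an assumption made here; the handout only says to change the class size into a whole number, and for 7.6 ordinary rounding gives the same result.

```python
import math

highest_value = 62
lowest_value = 24
n = 25                                        # number of scores

data_range = highest_value - lowest_value     # r = Hv - Lv
raw_class_size = data_range / math.sqrt(n)    # cs = r / sqrt(n)
class_size = math.ceil(raw_class_size)        # whole number (rounding up is an assumption)

print(data_range, raw_class_size, class_size)  # 38 7.6 8
```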

CLASS      TALLY      CLASS       CLASS   CLASS BOUNDARY    CUMULATIVE FREQUENCY   RELATIVE
INTERVAL   MARKS      FREQUENCY   MARK    LOWER    UPPER    <CF      >CF           FREQUENCY (%)
24 – 31    IIIII I    6           27.5    23.5     31.5     6        25            24%
32 – 39    IIIII III  8           35.5    31.5     39.5     14       19            32%
40 – 47    II         2           43.5    39.5     47.5     16       11            8%
48 – 55    IIIII I    6           51.5    47.5     55.5     22       9             24%
56 – 63    III        3           59.5    55.5     63.5     25       3             12%
                      n = 25                                                       Total = 100%
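The whole table can be rebuilt from the raw scores as a check. This Python sketch assumes, as the table does, that the first class starts at the lowest score and that the class size is 8; the variable names are illustrative only.

```python
scores = [
    30, 50, 27, 28, 32, 41, 62, 51, 50, 35, 37, 38, 43,
    57, 61, 28, 36, 36, 30, 24, 36, 50, 37, 55, 54,
]
class_size = 8
n = len(scores)

# Build the class intervals, starting at the lowest score (an assumption
# that matches the table).
intervals = []
lower = min(scores)
while lower <= max(scores):
    intervals.append((lower, lower + class_size - 1))
    lower += class_size

# Class frequency: how many scores fall inside each interval.
frequencies = [sum(lo <= s <= hi for s in scores) for lo, hi in intervals]

# "<CF" accumulates from the first class down; ">CF" starts at n and
# loses each class frequency in turn.
less_than_cf, greater_than_cf = [], []
running_up, running_down = 0, n
for f in frequencies:
    running_up += f
    less_than_cf.append(running_up)
    greater_than_cf.append(running_down)
    running_down -= f

# Relative frequency as a percentage of n.
relative = [100 * f / n for f in frequencies]

for (lo, hi), f, lcf, gcf, rf in zip(
    intervals, frequencies, less_than_cf, greater_than_cf, relative
):
    mark = (lo + hi) / 2
    print(f"{lo} - {hi}  {f:>2}  {mark}  {lo - 0.5}  {hi + 0.5}  {lcf:>2}  {gcf:>2}  {rf:.0f}%")
```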

Now let’s graph the data. We are going to make only 3 graphs:


1. Line graph
2. Bar graph
3. Pie graph
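[Figure: LINE GRAPH titled "G.E. 4 MODULE 2 SCORES". Horizontal axis: class interval. Vertical axis: class frequency (0 to 9). Plotted values: 24 – 31: 6, 32 – 39: 8, 40 – 47: 2, 48 – 55: 6, 56 – 63: 3.]

[Figure: BAR GRAPH titled "G.E. 4 MODULE 2 SCORES". Horizontal axis: class interval. Vertical axis: class frequency (0 to 9). One bar per class interval, with the same frequencies as the line graph.]

[Figure: PIE GRAPH titled "G.E. 4 MODULE 2 SCORES". One slice per class interval: 24 – 31: 24%, 32 – 39: 32%, 40 – 47: 8%, 48 – 55: 24%, 56 – 63: 12%.]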
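A quick way to preview a bar graph without any drawing tools is to print it as text. The Python sketch below is only a rough stand-in for a real graph: each `#` stands for one student, and the percentage of each class (the pie graph slices) is computed alongside. The function name `text_bar_chart` is made up for this illustration.

```python
# Class frequencies from the frequency distribution table.
class_frequencies = {
    "24 - 31": 6,
    "32 - 39": 8,
    "40 - 47": 2,
    "48 - 55": 6,
    "56 - 63": 3,
}

def text_bar_chart(frequencies, symbol="#"):
    """Return one line per class: label, a bar of symbols, and the count."""
    lines = []
    for label, count in frequencies.items():
        lines.append(f"{label} | {symbol * count} {count}")
    return "\n".join(lines)

chart = text_bar_chart(class_frequencies)
print(chart)

# Pie graph slices: each class as a percentage of all scores.
total = sum(class_frequencies.values())
shares = {label: 100 * count / total for label, count in class_frequencies.items()}
print(shares)
```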
Exercise:
Given below are the scores of 60 students in the Mathematics prelim examination.

17 31 36 26 34 32
44 33 37 39 45 21
24 38 40 42 39 32
43 18 24 32 49 33
33 33 40 24 46 22
29 33 37 30 43 43
26 39 57 30 40 33
25 33 48 39 34 29
29 37 39 35 41 29
23 32 48 28 45 19

Do the following:
1. Mean
2. Median
3. Mode
4. Range
5. Class size
6. Frequency Distribution Table
7. Line graph
8. Bar graph
9. Pie graph
