General Objectives:
At the end of this section, the students are expected to:
● organize and present data in forms that are both meaningful and useful for decision-making;
● use a variety of statistical tools to process and manage numerical data;
● use the methods of linear regression and correlations to predict the value of a variable
under certain conditions; and
● advocate the use of statistical data in making important decisions.

Introduction to Statistics
Statistics has a great influence in almost all fields of human endeavor. It may have different
meanings, but what matters is how we understand statistics so we can make proper judgments when
a person or company presents us with an argument supported by data. Thus, there is a need for
statistical data in every walk of life.
Whenever we watch television, listen to a radio, or read newspapers, magazines or books, we
encounter statistics. We can find statistics in articles on business, politics, science and technology,
education, sports, and many other subjects. In order to comprehend all the information presented,
we must possess a considerable level of understanding about statistics.
Statistics plays a vital role in many aspects of life. It aids in making inferences and
decisions, helps in summarizing or describing data, and assists in forecasting or predicting future
outcomes and in comparing or establishing certain relationships. In education, statistics gives
information about changes in a school’s population. In business, economics, and government,
statistics helps in controlling and maintaining product quality and assists financial analysts in
making investment decisions (human resource allocation).
With regard to its etymology, the word “statistics” was derived from the Latin word “status”
or from the Italian word “statista”, which means “political state” or “government”. It is actually
used to describe collection, reliability, organization, representation, analysis, and interpretation of
data and not just a collection of numerical results. Simply put, Statistics is a branch of mathematics
that deals with collection, tabulation or representation, analysis, and interpretation of numerical or
quantitative data, and drawing of conclusions about a population from knowledge of the properties
of a sample.
Division of Statistics
Descriptive Statistics is a statistical procedure concerned with the description of the
characteristics and properties of a group of persons, places or things which are based on easily
verifiable facts. It organizes the presentation, description, and interpretation of the gathered data. It
also includes the study of relationships between or among variables.
Inferential Statistics, on the other hand, is a statistical procedure that draws inferences about the
population on the basis of information obtained from a sample, using the various techniques of
descriptive statistics.

Table 2.1 Descriptive vs. Inferential Statistics

Descriptive Statistics | Inferential Statistics
Describes what is or what the data shows | Draws conclusions that extend beyond the immediate data alone
Provides summaries about the sample and the measures | Infers from the sample data what the population might think
Describes the data in hand | Infers the nature of a larger (typically infinite) set of data
Simply describes what’s going on with the data | Makes inferences from the data to more general conditions
Uses sampling techniques | Uses sampling distributions and hypothesis testing
Presents the summary measures of data | Uses simple time series analysis, correlation, and regression
Works in a normal distribution | Works in a test on proportion and chi-square
Example: Descriptive statistics answers questions like: “How many students are interested in taking Statistics online?” | Example: Inferential statistics answers questions like: “Is there a significant difference in the academic performance of the male and female students in Statistics?”
A basketball player wants to find his average shots for the past 10 games. | A politician wants to estimate his chance of winning in the upcoming senatorial election.

Read the research article, then classify each statement as descriptive or inferential. Write DS in
the space provided if the statement is an example of descriptive statistics and IS if it is an
example of inferential statistics. Each number refers to the statement that precedes it.

Philippine Economic Update: Investing in the Future

In 2017, the Philippines was among the top three growth performers in Region III. (1) _____
The Philippine economy grew from 6.9 percent year-to-year in 2016 to 6.7 percent year-to-year
in 2017. (2) _____
Sustained economic growth is likely to continue to contribute to poverty reduction. (3)
_____ As a matter of fact, the responsiveness of the poverty rate to economic growth was then
projected to decline from 27.0 percent in 2015 to 22.9 percent and 21.7 percent in 2018 and 2019,
respectively. (4) _____ These projections would imply a continuing trend of one million Filipinos
being lifted out of poverty each year. (5) _____
In 2020, growth is expected to level up at 6.6 percent. (6) _____ Actually, the economy is
currently growing at its potential, making productive investment in physical and human capital
essential so that it can continue to grow along its current growth trajectory. (7) _____
In the recent years, the Philippine economy has made great strides in delivering inclusive
growth, as evidenced by the declining poverty rates and a falling Gini coefficient. (8) _____
Unemployment has reached historic low rates, but underemployment remains high, near its
18-20 percent decade-long average. (9) _____ Nonetheless, the employment rate increased
between 2006 and 2015, whilst mean wages remained stagnant with only a four percent increase
in real terms over the same period. (10) _____


Lesson Objectives:
At the end of the lesson, the students are expected to:
● define what data is and identify its types;
● classify information according to its level of measurement;
● identify the most commonly used methods of data collection;
● determine the different ways of deriving a sample; and
● identify the different methods of presenting data.

For a statistician to gain information, he collects data for certain variables which are used to
describe an event. Data is a set of observations, values, elements or objects under consideration.
Types of Data:
1. Raw Data – It pertains to the data collected from the original information.
2. Grouped Data – This type of data is placed in a tabular form and characterized by class intervals
with a corresponding frequency.
3. Primary Data – This data type is measured and gathered by the researcher who published it.
4. Secondary Data – This type of data is republished by another researcher or agency.


When we collect data, we usually classify the information obtained according to one of the four
levels of measurement:

1. The lowest level of measurement is the nominal level. The data at this level of measurement
consist of names only, or qualities with no implied criteria by which the data can be identified
as greater than or less than other data items.
Examples:
Sex (male or female)
Soft drink brands (Pepsi, Fruit Soda, Coke, Sparkle)
Religious affiliation

2. The next level of measurement is the ordinal level. The data at the ordinal level may be
arranged in some order, but the actual differences between the data values either cannot be
determined or are meaningless. In other words, the ordinal level of measurement permits the rank
ordering of the members of a group, but exact differences are not computed.
Example: Responses of students on the Faculty Evaluation Sheet:
4 – Excellent
3 – Very Satisfactory
2 – Satisfactory
1 – Poor
3. The interval level of measurement is like the ordinal level, but it has the additional property
that meaningful differences between the data values can be computed. However, interval level
data may not have an inherent starting point or “zero” point. Consequently, differences are
meaningful, but ratios of data values are not.
Example: Mental ability as defined by IQ
4. The ratio level of measurement is the highest level. This ratio level is similar to the interval
level, but it includes an inherent zero as a starting point for all measurements. Consequently,
at this level, both differences and ratios are meaningful.
Example: Number of objects
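
To see why ratios are not meaningful at the interval level, consider temperature. It is not one of
the examples given above and is used here only as an assumed illustration. The Celsius scale has
an arbitrary zero, while the Kelvin scale has an inherent zero. The sketch below, with made-up
readings, suggests that the difference between two readings is the same on both scales, whereas
their ratio changes with the choice of zero point.

```python
# Two temperature readings in degrees Celsius (illustrative values, not from the text).
a_celsius, b_celsius = 20.0, 10.0

def to_kelvin(celsius):
    """Convert Celsius (arbitrary zero) to Kelvin (inherent zero)."""
    return celsius + 273.15

a_kelvin, b_kelvin = to_kelvin(a_celsius), to_kelvin(b_celsius)

# Differences are meaningful at the interval level: both scales agree.
diff_celsius = a_celsius - b_celsius
diff_kelvin = a_kelvin - b_kelvin

# Ratios depend on where zero sits, so "20 degrees is twice as hot as 10 degrees"
# is not a valid statement on the Celsius scale.
ratio_celsius = a_celsius / b_celsius
ratio_kelvin = a_kelvin / b_kelvin

print(f"difference: {diff_celsius:.2f} (Celsius) vs {diff_kelvin:.2f} (Kelvin)")
print(f"ratio:      {ratio_celsius:.3f} (Celsius) vs {ratio_kelvin:.3f} (Kelvin)")
```

The same reasoning explains why a count of objects is at the ratio level: zero objects means none
at all, so "twice as many" is a meaningful comparison.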


Categorize these measurements associated with student life according to level: nominal,
ordinal, interval, or ratio.
1. Class category: freshman, sophomore, junior, senior
2. Subject evaluation scale: poor, acceptable, good
3. Time of first class
4. Score on last exam (based on 100 possible points)
5. Age of a student
6. Weekly allowance of a student
7. Length of time to complete an exam
8. Grade of a student in Math
9. Civil status
10. Class standing in a particular section


There are five (5) commonly used methods of data collection in educational and
psychological research: 1) interview method, 2) questionnaire method, 3)
observation method, 4) registration method, and 5) experiment method.
Table 2.2 Methods of Data Collection
Methods | Characteristics | Advantages | Disadvantages
1. Interview or direct method | It is a person-to-person exchange between the interviewer and the interviewee. | It provides consistent and more precise information since clarification may be given by the interviewee. The questions may be repeated or modified to suit each interviewee’s level of understanding. | It is time-consuming, expensive, and has limited field coverage.
2. Questionnaire or indirect method | The responses are written and the research participants are given more time to answer the prepared questions. A questionnaire is a list of questions which are intended to elicit answers to the problems of a study. It may be mailed or personally delivered. | It saves time and money. Also, a large number of samples can be reached in a shorter span of time. Additionally, the informers may feel a greater sense of freedom to express their views and opinions because their anonymity is maintained. | There is a high level of probability of having no response, especially if the questionnaires are mailed. Likewise, the questions which are not easily understood will probably not be answered.
3. Observation method | The investigator observes the behavior of persons or organizations and their outcomes. It is usually used when the subjects cannot talk or write. | The data can be easily gathered during the available time of the researcher since it can be done anytime. | The information may be subjected to subjective judgments.
4. Registration method | Gathering information from the respondents is enforced by certain laws, policies, rules, regulations, decrees, or standard practices. Examples are the registration of births, deaths, motor vehicles, marriages and licenses. | The most reliable information is kept systematized and made available to all because of the requirement of the law. | The data are limited to what is listed in the documents.
5. Experiment method | It is used when the objective is to determine the cause-and-effect relationship of certain phenomena under controlled conditions. | It can go beyond plain description. | There are lots of threats to internal and external validity.

Identify the best method of collecting data applicable in each objective. Write your answer on
the second column aligned to the item number.
1. To differentiate the actuations and actions of elementary pupils and
high school students.
2. To identify the effects of training and physical workshops on the Body
Mass Index (BMI) of the dancers.
3. To determine the proportion of dismissed students from the total
number of enrolled students.
4. To identify the students’ preferred type of examination.
5. To know the teachers’ opinion on the K-to-12 program of the Basic
Education Curriculum.


It is not necessary for the researcher to examine every member of the population to get the data
or information about the population. The cost and time constraints will prohibit one from
undertaking a study of the entire population. At any rate, all that he needs is to draw sample units
systematically or at random. This process is called sampling.
The term sampling refers to the process which involves selecting a part of the population,
making observations on this representative group, and then generalizing the findings to the bigger
population. Also, it refers to the strategies which enable you to pick a subgroup from a larger group
and then use this subgroup as a basis in making judgments about the larger group.
Sampling techniques or strategies refer to the different ways of deriving a sample. There are
two kinds of sampling techniques:
1. Probability Sampling. Probability sampling is a technique where all elements in the population
frame have an equal chance of being selected. Representative samples of the population are
selected using this technique. The findings of studies that use probability sampling can be
used to infer the characteristics of the population. In fact, the findings are more valid when
probability sampling is used.
The different sampling strategies under probability sampling are the following:
a. Random Sampling
This is done by using lottery sampling or a table of random numbers. To illustrate, number
each subject in the population. Afterwards, write each number on a card, place the cards in a
bowl, and draw as many cards as needed. The subjects whose numbers are drawn will constitute
the sample.
b. Systematic Sampling
This is done by numbering each subject of the population and then selecting every kth
number. For example, suppose there are 5000 families in a city and only 50 families are needed
as a sample for an experiment. Since 5000 ÷ 50 = 100, then k = 100. This means that every 100th
subject will be selected, with the first subject selected at random from subjects 1 to 100.
Suppose subject 88 is selected; then the sample will consist of the subjects whose numbers are
88, 188, 288, and so on up to 4988, until 50 families are obtained.
c. Stratified Sampling
Stratified sampling is a sampling strategy designed so that specific subgroups or strata
will have a sufficient number of representatives within the sample to provide ample numbers
for sub-analysis of the members of these subgroups. Strata are designed so that members in
each stratum are more homogeneous, that is, more similar to each other. A sample is drawn from
each stratum, and the results are then grouped together to form the overall sample. This
technique is particularly useful in populations that can be stratified into groups by gender,
race, or geography.
d. Cluster Sampling
Cluster sampling occurs when you select the members of your sample in clusters rather
than as separate individuals. It is a sampling strategy in which groups, not individuals, are
randomly selected. Thus, any intact group with similar characteristics is a cluster. This is
sometimes referred to as area sampling because it is frequently applied on a geographical basis.
e. Multi-Stage Sampling
This technique uses several stages or phases in getting the sample from the general
population. However, the selection of the sample is still done at random. Moreover, multi-stage
sampling is useful in conducting nationwide surveys or any survey involving a large universe.
2. Non-probability Sampling. Non-probability sampling is a strategy where not all elements in the
population frame have an equal chance of being selected. Certain parts in the overall group are
deliberately not included in the selection of the representative subgroup. This strategy is also
called non-random or judgment sampling because it makes use of judgment in the selection of
items to put into the subgroup.
Under non-probability sampling, the following strategies are considered:
a. Purposive or Deliberate Sampling
This type of sampling strategy is based on certain criteria laid down by the researcher.
Thus, the people who satisfy the criteria are interviewed. For instance, a researcher might want
to find out the reactions of the banking community to a particular Central Bank Circular.
Instead of interviewing the executives of all banks, he can purposely choose to interview only
the key executives of the five (5) biggest banks in the country if he believes that it is the
reaction of these big banks that counts. Of course, the answers obtained through this
procedure are not representative of the entire banking system.
b. Quota Sampling
In quota sampling, you identify a set of important characteristics of a population and then
select your desired samples in a non-random way. It is assumed that the samples will match the
population with regard to the chosen set of characteristics.
For instance, if you are required in a research class to determine the most favored soft
drinks from a population of televiewers, you should interview televiewers who drink soft
drinks. You continue this process until you arrive at your quota.
c. Convenience or Accidental Sampling
This sampling strategy is based on the convenience of the researcher. For instance, if you
want to know the opinions of Filipinos about national reconciliation in the Philippines through
telephone interviews, you will have the chance to interview only those who have telephones,
which somehow manifests bias against those who have no telephones.
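
The probability sampling strategies described earlier (random, systematic, and stratified
sampling) can be sketched in code. The following is a minimal illustration, not a prescribed
procedure: the function names, the fixed seed, the "north" and "south" strata, and the choice of
proportional allocation for the stratified case are all assumptions introduced here. The
population of 5000 families and the sample size of 50 come from the systematic sampling example.

```python
import random

def simple_random_sample(population, n, rng):
    """Lottery method: draw n distinct subjects, each equally likely."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng, start=None):
    """Select every k-th subject after a start chosen among the first k subjects."""
    k = len(population) // n              # sampling interval, e.g. 5000 // 50 = 100
    if start is None:
        start = rng.randint(1, k)         # random start from subjects 1 to k (inclusive)
    # Subjects are numbered from 1, so subject i sits at index i - 1.
    return [population[i - 1] for i in range(start, start + k * n, k)]

def stratified_sample(strata, n, rng):
    """Draw a random sample from each stratum, sized in proportion to the stratum
    (proportional allocation is an assumption; the text does not specify one)."""
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        share = round(n * len(members) / total)
        sample.extend(rng.sample(members, share))
    return sample

rng = random.Random(2024)                 # fixed seed so the sketch is repeatable
families = list(range(1, 5001))           # 5000 families numbered 1 to 5000

lottery = simple_random_sample(families, 50, rng)
every_100th = systematic_sample(families, 50, rng)
from_88 = systematic_sample(families, 50, rng, start=88)   # the start used in the text

# Hypothetical strata: the text does not give stratum names or sizes.
strata = {"north": list(range(1, 3001)), "south": list(range(3001, 5001))}
by_area = stratified_sample(strata, 50, rng)

print(from_88[:3], from_88[-1], len(from_88))
```

With a start of 88, the systematic sample reproduces the sequence 88, 188, 288, and so on from the
example. Note that rounding the stratum shares may not always add up exactly to the requested
sample size when the strata do not divide evenly.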


Identify what sampling technique is exemplified in each statement. Write your answer on the
second column aligned to the item number.
1. Every 12th customer entering a shopping mall is asked to select his or
her favorite store.
2. In a university, all teachers from three buildings are interviewed to
determine whether they think students have higher grades now than in
previous years.
3. Supervisors are selected using random numbers in order to determine
their annual salaries.

4. A teacher writes the name of each student on a card, shuffles the cards,
and then draws five names.
5. A head nurse selects 10 patients from each floor of a hospital.


The collected data must be organized in order to show significant characteristics. They can be
presented in three (3) forms:
Textual, where the data are presented in a paragraph form.
Tabular, where the data are presented in rows and columns.
Graphical, where the data are presented in a visual form.
1. Textual Form. This is the simplest method of presenting data particularly when there are only
a few numbers to be presented. In this form, results are explained in a paragraph form. This
includes enumerating the important characteristics, emphasizing the most significant features,
and highlighting the most striking attributes of the set of data.
In the College of Education, the data show that out of 186 freshmen, 89 or 47.85% are male
while 97 or 52.15% are female.
2. Tabular Form. The data are presented in a systematic and orderly manner to catch one’s
attention as it may facilitate the comprehension and analysis of the data presented. The
frequency distribution table (FDT) is a statistical table showing the frequency or number of
observations contained in each of the defined classes or categories. Each category in the table
is placed in a row or column and the data are assigned in suitable cells.
Parts of a Statistical Table
1. Table Heading includes the table number and title of the table.
2. Body refers to the main part of the table that contains the information or figures.
3. Stubs or classes refer to the classifications or categories describing the data and are usually
found at the leftmost side of the table.
4. Box head is located at the top of the body.

3. Graphical Form. In this form, the data are presented in a visual form. The numerical data
provided in a frequency distribution or contingency table can be made more interesting and
easier to understand when depicted in a graphical form. A graph is a pictorial presentation of a
given set of data. It should have good appearance, and should be accurate, clear and simple.

Types of Graph:
1. Scatter Graph – It is a graph used to present values or measurements that are thought to be
related. This graph is used when the data are interval or ratio.

2. Line Graph – It is a graphical presentation of data especially useful in showing trends over
a period of time. This graph is used when the data are interval or ratio.

3. Bar or Column Graph – Like a circle graph, it is applicable only to grouped data. It
consists of bars or heavy lines of equal widths, either vertical or horizontal. This graph is
used when the data are nominal or ordinal.

4. Circle Graph – It is also known as a pie chart. This is used to represent the parts
that make up a whole. Pie charts are most effective when illustrating budget allocations of
a family or an agency, or in dealing with qualitative variables involving popularity. This
graph is used when the data are nominal, ordinal, interval, or ratio. However, it is not
practical to use a pie chart when there are more than five or six possible values for a
variable.
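
The link between the textual and tabular forms can be sketched with the College of Education
example given earlier (186 freshmen, 89 male and 97 female). The snippet below is a minimal
illustration that rebuilds the percentages and prints them as a small frequency distribution
table. The column labels and variable names are assumptions made for this sketch.

```python
# Counts from the textual example: 186 freshmen in the College of Education.
counts = {"Male": 89, "Female": 97}
total = sum(counts.values())

# Each row of the frequency distribution table: category, frequency, percentage.
rows = [
    (category, freq, round(100 * freq / total, 2))
    for category, freq in counts.items()
]

# Box head (column labels), then the stubs (categories) with their figures.
print(f"{'Sex':<10}{'Frequency':>10}{'Percentage':>12}")
for category, freq, pct in rows:
    print(f"{category:<10}{freq:>10}{pct:>11.2f}%")
print(f"{'Total':<10}{total:>10}{100:>11.2f}%")
```

The computed percentages agree with the 47.85% and 52.15% stated in the textual form, and the same
rows could feed a bar or circle graph since sex is a nominal variable.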


Determine whether the statement is true or false. Write your answer on the second column
aligned to the item number.
1. The data collected over a period of time can be graphed using a pie graph.
2. In a tabular form, data are presented in a systematic and orderly manner.

3. Bar graphs can be drawn using vertical or horizontal lines.

4. Textual form can be used if there are several numbers to be presented.

5. The data collected must be organized to show insignificant characteristics.

