Statistics and Probability Handouts - Basic Terms in Statistics

CHAPTER 1: EXPLORING DATA
Introducing Statistics
Statistics as a Tool in Decision-Making
Statistics is defined as a science that studies data in order to make a decision.
Hence, it is a tool in the decision-making process. As a science, Statistics involves the
methods of collecting, processing, summarizing and analyzing data in order to provide answers
or solutions to an inquiry. One also needs to interpret and communicate the results of these
methods to support a decision made when faced with a problem or an inquiry.
Trivia: The word “statistics” actually comes from the word “state”, because
governments have long been involved in statistical activities, especially the conduct of censuses
either for military or taxation purposes. The need for and conduct of censuses are recorded in
the pages of holy texts. In the Christian Bible, particularly the Book of Numbers, God is reported
to have instructed Moses to carry out a census. Another census mentioned in the Bible is the
census ordered by Caesar Augustus throughout the entire Roman Empire before the birth of
Christ.
Statistics enables us to:
• characterize persons, objects, situations, and phenomena;
• explain relationships among variables;
• formulate objective assessments and comparisons; and, more importantly
• make evidence-based decisions and predictions.
To summarize, a statistical process for making a decision or providing solutions to
a problem includes the following:
• Planning or designing the collection of data to answer statistical questions in a way
that maximizes information content and minimizes bias;
• Collecting the data as required in the plan;
• Verifying the quality of the data after they were collected;
• Summarizing the information extracted from the data; and
• Examining the summary statistics so that insight and meaningful information can be
produced to support decision-making or solutions to the question or problem at
hand.
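The verifying and summarizing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data (daily allowances in pesos reported by ten students) and the validity rule (values must be present and non-negative) are hypothetical and are not part of the handout.

```python
from statistics import mean, median

# Hypothetical collected data: daily allowance in pesos from ten students.
# None marks a missing response; -50 is an impossible (invalid) value.
raw = [100, 120, None, 80, 150, -50, 100, 90, 110, 100]

# Verify data quality: keep only values that are present and non-negative.
clean = [x for x in raw if x is not None and x >= 0]

# Summarize the verified data.
summary = {
    "n": len(clean),
    "mean": mean(clean),
    "median": median(clean),
}
print(summary)
```

Examining the summary (for instance, comparing the mean with the median) is the last step, and it is where the decision or answer to the original question is supported.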
KEY POINTS
• Some questions can be answered using Statistics, while others cannot.
• Statistics is a science that studies data.
• There are many uses of Statistics, but its main use is in decision-making.
• Logical decisions or solutions to a problem can be attained through a statistical process.
Basic Terms in Statistics
Definition of Basic Terms
The collection of respondents from whom one obtains the data is called the universe of
the study. A universe is not necessarily composed of people, since there are studies where the
observations are taken from plants, animals, or even non-living things like buildings,
vehicles, farms, etc. So formally, we define the universe as the collection or set of units or
entities from which the data are obtained.
A variable is a characteristic that is observable or measurable in every unit of the
universe.
The set of all possible values of a variable is referred to as a population. Thus for each
variable we observe, we have a population of values. The number of populations in a study is
equal to the number of variables observed.
A subgroup of a universe or of a population is a sample. There are several ways to take
a sample from a universe or a population and the way we draw the sample dictates the kind of
analysis we do with our data.
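The four terms can be made concrete with a small sketch. Everything here is hypothetical: the five students, their recorded values, and the choice of a simple random sample of three units. The populations are built from the values recorded for the units, which is a simplification of "all possible values" adopted for illustration.

```python
import random

# Hypothetical universe: five students (the units of the study).
universe = [
    {"name": "Ana",   "sex": "female", "height_cm": 152.0},
    {"name": "Ben",   "sex": "male",   "height_cm": 165.5},
    {"name": "Carla", "sex": "female", "height_cm": 158.2},
    {"name": "Dino",  "sex": "male",   "height_cm": 170.1},
    {"name": "Ella",  "sex": "female", "height_cm": 149.8},
]

# Variables: characteristics observed or measured on every unit.
variables = ["sex", "height_cm"]

# One population of values per variable.
populations = {v: [unit[v] for unit in universe] for v in variables}

# A sample is a subgroup of the universe; here, a simple random sample of 3 units.
rng = random.Random(0)  # fixed seed so the draw is reproducible
sample = rng.sample(universe, k=3)

print(len(populations), "populations for", len(variables), "variables")
print([unit["name"] for unit in sample])
```

Note that there is one universe but two populations, one for each variable observed.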
Broad Classification of Variables

Variables can be broadly classified as either qualitative or quantitative, with the latter
further classified into discrete and continuous types.
i. Qualitative variables express a categorical attribute, such as sex (male or female),
religion, marital status, region of residence, highest educational attainment. Qualitative
variables do not strictly take on numeric values (although we can have numeric codes
for them, e.g., for the sex variable, 1 and 2 may refer to male and female, respectively).
Qualitative data answer the question “what kind.” Sometimes, there is a sense of ordering
in qualitative data, e.g., income data grouped into high, middle and low-income status.
Data on sex or religion do not have the sense of ordering, as there is no such thing as a
weaker or stronger sex, and a better or worse religion. Qualitative variables are
sometimes referred to as categorical variables.
ii. Quantitative (otherwise called numerical) data, whose sizes are meaningful, answer
questions such as “how much” or “how many”. Quantitative variables have actual units of
measure. Examples of quantitative variables include the height, weight, number of
registered cars, household size, and total household expenditures/income of survey
respondents. Quantitative data may be further classified into:
a. Discrete data are those data that can be counted, e.g., the number of days for
cellphones to fail, the ages of survey respondents measured to the nearest year,
and the number of patients in a hospital. These data assume only a finite or
countably infinite number of values.
b. Continuous data are those that can be measured, e.g. the exact height of a survey
respondent and the exact volume of some liquid substance. The possible values are
uncountably infinite.
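The classification matters in practice because it determines which summaries make sense. The sketch below uses hypothetical data and a hypothetical helper, `summarize`, to show that numeric codes for a qualitative variable should be summarized by their most frequent category, not averaged, while quantitative data (discrete or continuous) can be averaged.

```python
from statistics import mean, mode

# Hypothetical data for five respondents.
sex_codes = [1, 2, 2, 1, 2]                      # qualitative: 1 = male, 2 = female
household_size = [3, 5, 4, 6, 4]                 # quantitative, discrete (counted)
height_cm = [152.4, 160.1, 148.9, 171.3, 165.0]  # quantitative, continuous (measured)

def summarize(values, kind):
    """Return a summary suited to the kind of variable (illustrative helper)."""
    if kind == "qualitative":
        return mode(values)   # most frequent category; a mean of codes is meaningless
    if kind == "quantitative":
        return mean(values)   # sizes are meaningful, so averaging makes sense
    raise ValueError(f"unknown kind: {kind}")

print(summarize(sex_codes, "qualitative"))        # most common code
print(summarize(household_size, "quantitative"))  # average household size
print(summarize(height_cm, "quantitative"))       # average height
```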
KEY POINTS
• A universe is a collection of units from which the data were gathered.
• A variable is a characteristic we observed or measured from every element of the
universe.
• A population is a set of all possible values of a variable.
• A sample is a subgroup of a universe or a population.
• In a study there is only one universe, but there could be several populations.
• Variables could be classified as qualitative or quantitative, and the latter could be further
classified as discrete or continuous.
Levels of Measurement
a. Nominal level of measurement arises when we have variables that are categorical and
non-numeric or where the numbers have no sense of ordering. Examples of the
variables measured at the nominal level include sex, marital status, religious affiliation.
The numbers used at the nominal level are simply numerical codes and cannot be used
for ordering or any mathematical computation.
b. Ordinal level also deals with categorical variables like the nominal level, but at this level
ordering is important, that is, the values of the variable can be ranked. Using the codes,
the responses can be ranked. Examples of the ordinal scale include socioeconomic
status (A to E, where A is wealthy, E is poor), difficulty of questions in an exam (easy,
medium, difficult), rank in a contest (first place, second place, etc.), and perceptions in
Likert scales.
c. Interval level tells us that one unit differs by a certain amount from another
unit. Knowing how much one unit differs from another is an additional property of the
interval level on top of having the properties possessed by the ordinal level. The Celsius
scale is at the interval level. Another example of a variable measured at the interval level
is the Intelligence Quotient (IQ) of a person.
d. Ratio level also tells us that one unit has so many times as much of the property as
does another unit. The ratio level possesses a meaningful (unique and non-arbitrary)
absolute, fixed zero point and allows all arithmetic operations. The existence of the zero
point is the only difference between the ratio and interval levels of measurement. Examples
of the ratio scale include mass, height, weight, energy and electric charge.
In summary, we have the following levels of measurement:
Level | Property | Basic Empirical Operation
Nominal | No order, distance, or origin | Determination of equivalence
Ordinal | Has order but no distance or unique origin | Determination of greater or lesser values
Interval | Both with order and distance but no unique origin | Determination of equality of intervals or difference
Ratio | Has order, distance and unique origin | Determination of equality of ratios or means
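The properties in the summary above can be encoded directly, which makes it easy to check which operations a level supports. The names `LEVELS` and `allows` are hypothetical and used only for illustration. The second half of the sketch shows why ratios are not meaningful on the Celsius (interval) scale: converting the same two temperatures to Kelvin, which has a true zero, gives a very different ratio.

```python
# Properties per level: (has order, has distance, has unique origin).
LEVELS = {
    "nominal":  (False, False, False),
    "ordinal":  (True,  False, False),
    "interval": (True,  True,  False),
    "ratio":    (True,  True,  True),
}

def allows(level, operation):
    """Return True if the operation is meaningful at the given level."""
    order, distance, origin = LEVELS[level]
    supported = {
        "equivalence": True,     # every level can tell whether two values are equal
        "ranking": order,        # needs order
        "difference": distance,  # needs distance
        "ratio": origin,         # needs a unique zero point
    }
    return supported[operation]

print(allows("interval", "difference"))  # differences are meaningful
print(allows("interval", "ratio"))       # ratios are not

# Celsius is an interval scale: 20 degrees is not "twice as hot" as 10 degrees.
c1, c2 = 10.0, 20.0
k1, k2 = c1 + 273.15, c2 + 273.15  # same temperatures in Kelvin (ratio scale)
print(c2 / c1)  # 2.0, an artifact of where the Celsius zero was placed
print(k2 / k1)  # roughly 1.035, the physically meaningful ratio
```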
Methods of Data Collection
Variables are observed or measured using any of three methods of data collection,
namely: objective, subjective and use of existing records. The objective and subjective
methods obtain the data directly from the source. The former uses any one or a combination of
the five senses (sight, touch, hearing, taste and smell) to measure the variable, while the
latter obtains data by getting responses through a questionnaire. The resulting data from these
two methods of data collection are referred to as primary data.
On the other hand, secondary data are obtained through the use of existing records or
data collected by other entities for certain purposes. For example, when we use data gathered
by the Philippine Statistics Authority, we are using secondary data and the method we employ
to get the data is the use of existing records. Other data sources include administrative records,
news articles, the internet, and the like. However, when we use existing data we must be
confident of the quality of the data by knowing how the data were gathered. Also, we must
remember to request permission and acknowledge the source of the data when using data
gathered by other agencies or people.
KEY POINTS
• Four levels of measurement: Nominal, Ordinal, Interval and Ratio.
• Knowing the level at which a variable was measured or observed guides the choice of the
type of analysis to apply.
• Three methods of data collection: objective, subjective and use of existing
records.
• Using the data collection method as basis, data can be classified as either primary or
secondary data.
