Stat Compile


The Nature of Probability and Statistics
STAT-CBEA: Statistical Analysis with Software Application

Harold B. Badilla
harold.badilla@email.lcup.edu.ph
Mathematics Faculty Member
College of Arts, Sciences, and Education
La Consolacion University Philippines

Outline
Preliminaries
Descriptive and Inferential Statistics
Variables and Types of Data
Levels of Measurement
Data Collection and Sampling Techniques
Uses and Misuses of Statistics

Introduction
(Bluman, 2012 & PSA, 2020)

You may be familiar with probability and statistics through radio, television, newspapers, and magazines. For example, you may have read statements like the following found in newspapers.

The College Stress and Mental Illness Poll reported that 85% of college and university students reported feeling stress daily; 77% reported stress from school work, and 74% experienced stress from grades.

Teenage pregnancy in the Philippines is on the rise. In fact, the data show that about one in every ten women of childbearing age was a teenager, and there were 24 babies born every hour to teenage mothers.

Introduction
(Bluman, 2012 & PSA, 2020)

In the Philippines, from 2014 to 2016, the median age at marriage was 28 for males and 26 for females.

Valentine’s Day had the highest number of marriages for the past ten years. Cavite, Pangasinan, and Negros Occidental had the highest number of marriages recorded on February 14.

In 1974 and 1994, when the Miss Universe pageants were held in the Philippines, and in 1969, 1973, and 2015, when the Philippines won the Miss Universe crown, babies were named after the beauty queens.

Definition of Terms

Definition (Statistics)
Statistics is the science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data (Bluman, 2012).

Why Students Study Statistics

Like professional people, you must be able to read and understand the various statistical studies performed in your fields. To have this understanding, you must be knowledgeable about the vocabulary, symbols, concepts, and statistical procedures used in these studies.

You may be called on to conduct research in your field, since statistical procedures are basic to research. To accomplish this, you must be able to design experiments; collect, organize, analyze, and summarize data; and possibly make reliable predictions or forecasts for future use. You must also be able to communicate the results of the study in your own words.

Why Students Study Statistics

You can also use the knowledge gained from studying statistics to become better consumers and citizens. For example, you can make intelligent decisions about what products to purchase based on consumer studies, about government spending based on utilization studies, and so on.

These reasons can be considered some of the goals for studying statistics.

Variable

To gain knowledge about seemingly haphazard situations, statisticians collect information for variables, which describe the situation.

Definition (Variable)
A variable is a characteristic or condition that can change or take on different values.

Definition (Data)
Data are the values (measurements or observations) that the variables can assume. Variables whose values are determined by chance are called random variables.

Variable

Definition (Data Set)
A collection of data values forms a data set. Each value in the data set is called a data value or a datum.

Sample vs. Population

In statistics it is important to distinguish between a sample and a population.

Definition (Population)
A population consists of all subjects (human or otherwise) that are being studied. When data are collected from every subject in the population, it is called a census.

Definition (Parameter)
A parameter is a numerical value summarizing all the data of an entire population.

Most of the time, due to the expense, time, size of population, medical concerns, etc., it is not possible to use the entire population for a statistical study; therefore, researchers use samples.

Sample vs. Population

Definition (Sample)
A sample is a group of subjects selected from a population. When data are collected from every subject in the sample, it is called a survey.

Definition (Statistic)
A statistic is a numerical value summarizing the sample data.

If the subjects of a sample are properly selected, most of the time they should possess the same or similar characteristics as the subjects in the population.

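The difference between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the population of student ages is made up, and the names `population`, `parameter`, and `statistic` are chosen here for clarity rather than taken from the slides.

```python
import random
import statistics

# Hypothetical population: ages of 1,000 students (made-up data for illustration).
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(17, 25) for _ in range(1000)]

# A parameter summarizes the entire population.
parameter = statistics.mean(population)

# A statistic summarizes only the sample drawn from that population.
sample = rng.sample(population, 50)
statistic = statistics.mean(sample)

print(f"Population mean (parameter): {parameter:.2f}")
print(f"Sample mean (statistic):     {statistic:.2f}")
```

When the sample is selected properly, the two printed values should usually be close, though they will rarely be identical.
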
Important Note!

The information obtained from a statistical sample is said to be “biased” if the results from the sample of a population are radically different from the results of a census of the population. Also, a sample is said to be biased if it does not represent the population from which it has been selected. The techniques used to properly select a sample are explained in the next section.

Two Branches of Statistics

The body of knowledge called statistics is sometimes divided into two main areas, depending on how data are used. The two areas are
1 Descriptive Statistics
2 Inferential Statistics

Descriptive Statistics

Definition (Descriptive Statistics)
Descriptive statistics consists of the collection, organization, summarization, and presentation of data.

In descriptive statistics, the statistician tries to describe a situation. Consider the national census conducted by the Philippine government every 10 years. Results of this census give you the average age, income, and other characteristics of the Philippine population. To obtain this information, the Philippine Statistics Authority (PSA) must have some means to collect relevant data. Once data are collected, the PSA must organize and summarize them. Finally, the PSA needs a means of presenting the data in some meaningful form, such as charts, graphs, or tables.

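The organize, summarize, and present steps described above can be sketched with Python's standard library. The ages below are made-up values for ten hypothetical respondents, not census data, and the variable names are illustrative only.

```python
import statistics
from collections import Counter

# Hypothetical ages of ten survey respondents (made-up data).
ages = [18, 19, 19, 20, 20, 20, 21, 22, 22, 25]

# Organize: count how often each age occurs.
frequency = Counter(ages)

# Summarize: compute a few descriptive measures.
summary = {
    "count": len(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
}

# Present: print a small table followed by the summary.
print("Age  Frequency")
for age in sorted(frequency):
    print(f"{age:>3}  {frequency[age]}")
print(summary)
```
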
Inferential Statistics

Definition (Inferential Statistics)
Inferential statistics consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions.

Here, the statistician tries to make inferences from samples to populations. Inferential statistics uses hypothesis testing, a decision-making process for evaluating claims about a population, based on information obtained from samples. Statisticians also use statistics to determine relationships among variables.

Example

Determine whether descriptive or inferential statistics were used.

a. The average jackpot for the top five lottery winners was $367.6 million.

Descriptive statistics were used because this is an average, and it is based on data obtained from the top five lottery winners at this time.

b. A study done by the American Academy of Neurology suggests that older people who had a high caloric diet more than doubled their risk of memory loss.

Inferential statistics were used since this is a generalization made from a sample to a population.

Example

Determine whether descriptive or inferential statistics were used.

c. Based on a survey of 9317 consumers done by the National Retail Federation, the average amount that consumers spent on Valentine’s Day in 2011 was $116.

Descriptive statistics were used since this is an average based on a sample of 9317 respondents.

d. Scientists at the University of Oxford in England found that a good laugh significantly raises a person’s pain level tolerance.

Inferential statistics were used since this is a generalization made from a sample to a population.

Applying the Concepts

Example
A study conducted at Manatee Community College revealed that students who attended class 95 to 100% of the time usually received an A in the class. Students who attended class 80 to 90% of the time usually received a B or C in the class. Students who attended class less than 80% of the time usually received a D or an F or eventually withdrew from the class. Based on this information, attendance and grades are related. The more you attend class, the more likely it is you will receive a higher grade. If you improve your attendance, your grades will probably improve. Many factors affect your grade in a course. One factor that you have considerable control over is attendance. You can increase your opportunities for learning by attending class more often.

Applying the Concepts

Answer the following questions:
1 What are the variables under study?
2 What are the data in the study?
3 Are descriptive, inferential, or both types of statistics used?
4 What is the population under study?
5 Was a sample collected? If so, from where?
6 From the information given, comment on the relationship between the variables.

Classification of Variables

Definition (Qualitative Variable)
Qualitative variables are variables that have distinct categories according to some characteristic or attribute.

For example, if subjects are classified according to sex (male or female), then the variable sex is qualitative. Other examples of qualitative variables are religious preference and geographic locations.

Classification of Variables

Definition (Quantitative Variable)
Quantitative variables are variables that can be counted or measured.

For example, the variable age is numerical, and people can be ranked in order according to the value of their ages. Other examples of quantitative variables are heights, weights, and body temperatures.

Classification of Quantitative Variables

Quantitative variables can be further classified into two groups: discrete and continuous.

Definition (Discrete Variables)
Discrete variables assume values that can be counted.

Discrete variables can be assigned values such as 0, 1, 2, 3 and are said to be countable. Examples of discrete variables are the number of children in a family, the number of students in a classroom, and the number of calls received by a switchboard operator each day for a month.

Classification of Quantitative Variables

Definition (Continuous Variables)
Continuous variables can assume an infinite number of values between any two specific values. They are obtained by measuring. They often include fractions and decimals.

Continuous variables, by comparison, can assume an infinite number of values in an interval between any two specific values. Temperature, for example, is a continuous variable, since the variable can assume an infinite number of values between any two given temperatures.

Example

Classify each variable as a discrete variable or a continuous variable.

The highest wind speed of a hurricane
The weight of baggage on an airplane
The number of pages in a statistics book
The amount of money a person spends per year for online purchases

Sidenote

In addition to being classified as qualitative or quantitative, variables can be classified by how they are categorized, counted, or measured.

For example, can the data be organized into specific categories, such as area of residence (rural, suburban, or urban)? Can the data values be ranked, such as first place, second place, etc.? Or are the values obtained from measurement, such as heights, IQs, or temperature?

This type of classification (that is, how variables are categorized, counted, or measured) uses measurement scales, and four common types of scales are used: nominal, ordinal, interval, and ratio.

Levels of Measurement

Definition (Nominal Level of Measurement)
The nominal level of measurement classifies data into mutually exclusive (non-overlapping) categories in which no order or ranking can be imposed on the data.

Example:
Religion
Hometown
Civil Status
Zip Code

Definition (Ordinal Level of Measurement)
The ordinal level of measurement classifies data into categories that can be ranked; however, precise differences between the ranks do not exist.

Example:
First Place, Second Place, Third Place
Shirt Size (Small, Medium, Large)
Job Position (Clerk, Supervisor, Director, Chief Executive)
Letter Grades (A, B+, B, C+, C, D, F)

Definition (Interval Level of Measurement)
The interval level of measurement ranks data, and precise differences between units of measure do exist; however, there is no meaningful zero.

Example:
IQ
Temperature

Definition (Ratio Level of Measurement)
The ratio level of measurement possesses all the characteristics of interval measurement, and there is a true zero. In addition, true ratios exist when the same variable is measured on two different members of the population.

Example:
Height
Weight
Daily Allowance
Test Score

Scale     Description                                    Examples
Nominal   Labels, names, or categories only. Data        Gender, Civil Status, Blood Type,
          cannot be arranged in an ordering scheme.      Telephone Numbers
Ordinal   Categories are ordered, but differences        Job Position, Military Ranks,
          cannot be determined or they are meaningless.  Letter Grades, Size of Shirts
Interval  Quantitative but has NO true zero point        IQ, Temperature
          (or natural starting point).
Ratio     Quantitative and has true zero point.          Number of Children, Math Test Scores

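One way to see why a true zero matters is to check whether a ratio survives a change of units. The sketch below is an illustration, not part of the original slides: it assumes temperature is recorded in degrees Celsius or Fahrenheit (interval) and height in centimeters or inches (ratio), and the helper function names are hypothetical.

```python
def celsius_to_fahrenheit(c: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


def cm_to_inches(cm: float) -> float:
    """Convert centimeters to inches."""
    return cm / 2.54


# Interval scale: the ratio of two temperatures depends on the unit chosen,
# so "20 degrees is twice as hot as 10 degrees" is not a meaningful claim.
ratio_c = 20 / 10
ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Ratio scale: the ratio of two heights is the same in any unit,
# so "180 cm is twice as tall as 90 cm" is meaningful.
ratio_cm = 180 / 90
ratio_in = cm_to_inches(180) / cm_to_inches(90)

print(f"Temperature ratio: {ratio_c:.2f} (Celsius) vs {ratio_f:.2f} (Fahrenheit)")
print(f"Height ratio:      {ratio_cm:.2f} (cm) vs {ratio_in:.2f} (inches)")
```
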
Identifying Measurement Scales

Identify the measurement scales of the items below:
1 Ice cream flavor preferred by customer (nominal)
2 Highest level of cooling effect in Fahrenheit of the air-conditioning unit produced (interval)
3 Production cost incurred each month of manufacturing firms (ratio)
4 Satisfaction level described on a scale of 1-5 where 1 is the lowest (ordinal)
5 Educational attainment of the household member (ordinal)
6 Average satisfaction scores of customers on different features of the product (ratio)

Data Collection

In research, statisticians use data in many different ways. As such, the importance of data collection cannot be overstated. Some of the most common methods of data collection are
1 Telephone surveys
2 Mailed questionnaire surveys
3 Personal interview surveys
4 Surveying records
5 Direct observation of situations

Notes on Data Collection

Researchers use samples to collect data and information about a particular variable from a large population. Using samples saves time and money and in some cases enables the researcher to get more detailed information about a particular subject. Remember, samples cannot be selected in haphazard ways because the information obtained might be biased.

Four Basic Sampling Techniques

To obtain samples that are unbiased (that is, samples that give each subject in the population an equally likely chance of being selected), statisticians use four basic methods of sampling:
1 random
2 systematic
3 stratified
4 cluster

Random Sampling

Definition
A random sample is a sample in which all members of the population have an equal chance of being selected.

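A random sample can be sketched with Python's standard `random` module. The population of 100 numbered subjects and the sample size of 10 are assumptions made for illustration.

```python
import random

# Hypothetical sampling frame: subjects numbered 1 through 100.
population = list(range(1, 101))

rng = random.Random(2024)  # fixed seed so the sketch is repeatable
# Draw 10 distinct subjects; every subject has the same chance of selection.
sample = rng.sample(population, k=10)

print(sorted(sample))
```
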
Systematic Sampling

Definition
A systematic sample is a sample obtained by selecting every kth member of the population, where k is a counting number.

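A minimal sketch of systematic sampling follows, assuming the population is an ordered list and that the first subject is chosen at random from the first k, as the summary of sampling techniques describes. The function name `systematic_sample` is hypothetical.

```python
import random


def systematic_sample(population, k, rng):
    """Pick a random start among the first k subjects, then take every kth one."""
    start = rng.randrange(k)  # index 0..k-1, i.e. the 1st through kth subject
    return population[start::k]


# Hypothetical population: subjects numbered 1 through 100.
population = list(range(1, 101))
sample = systematic_sample(population, k=10, rng=random.Random(7))

print(sample)
```

With 100 subjects and k = 10, the sample always contains 10 subjects spaced 10 apart; only the starting point varies.
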
Stratified Sampling

Definition
A stratified sample is a sample obtained by dividing the population into subgroups or strata according to some characteristic relevant to the study. (There can be several subgroups.) Then subjects are selected from each subgroup.

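The following sketch shows one way stratified sampling might be coded, assuming subjects are selected at random within each stratum and that the same number is taken from each. The student records, the year-level strata, and the function name `stratified_sample` are all made up for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(subjects, stratum_of, per_stratum, rng):
    """Group subjects by stratum, then randomly select from each group."""
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}


# Hypothetical students tagged by year level (40 first-years, 20 seniors).
students = [(f"S{i:03d}", "first-year" if i % 3 else "senior") for i in range(1, 61)]

sample = stratified_sample(
    students,
    stratum_of=lambda student: student[1],
    per_stratum=5,
    rng=random.Random(1),
)

for name, chosen in sample.items():
    print(name, [student_id for student_id, _ in chosen])
```
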
Cluster Sampling

Definition
A cluster sample is obtained by dividing the population into sections or clusters and then selecting one or more clusters and using all members in the cluster(s) as the members of the sample.

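Cluster sampling can be sketched as follows. The class sections are hypothetical, and choosing the clusters at random is an assumption of this sketch; the definition above only requires that whole clusters be selected and every member of them used.

```python
import random

# Hypothetical clusters: class sections and their students.
clusters = {
    "Section A": ["A1", "A2", "A3"],
    "Section B": ["B1", "B2", "B3", "B4"],
    "Section C": ["C1", "C2"],
    "Section D": ["D1", "D2", "D3"],
}

rng = random.Random(3)  # fixed seed so the sketch is repeatable
# Select whole clusters, not individual students.
chosen = rng.sample(sorted(clusters), k=2)
# Every member of each chosen cluster becomes part of the sample.
sample = [student for name in chosen for student in clusters[name]]

print(chosen)
print(sample)
```
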
Summary of the Sampling Techniques

Random      Subjects are selected by random numbers.
Systematic  Subjects are selected by using every kth number after the first subject is randomly selected from 1 through k.
Stratified  Subjects are selected by dividing up the population into subgroups (strata), and subjects are randomly selected within subgroups.
Cluster     Subjects are selected by using an intact subgroup that is representative of the population.

Example

State which sampling method was used.

a. Out of 10 hospitals in a municipality, a researcher selects one and collects records for a 24-hour period on the types of emergencies that were treated there. (Cluster)
b. A researcher divides a group of students according to gender, major field, and low, average, and high grade point average. Then she randomly selects six students from each group to answer questions in a survey. (Stratified)
c. The subscribers to a magazine are numbered. Then a sample of these people is selected using random numbers. (Random)
d. Every 10th bottle of Super-Duper Cola is selected, and the amount of liquid in the bottle is measured. The purpose is to see if the machines that fill the bottles are working properly. (Systematic)

How Statistics can be misused

Statistical techniques can be used to describe data, compare two or more data sets, determine if a relationship exists between variables, test hypotheses, and make estimates about population characteristics. However, there is another aspect of statistics, and that is the misuse of statistical techniques to sell products that don’t work properly, to attempt to prove something true that is really not true, or to get our attention by using statistics to evoke fear, shock, and outrage.

Suspect Samples

The first thing to consider is the sample that was used in the research study. Sometimes researchers use very small samples to obtain information.

“Three out of four doctors surveyed recommend brand such and such.”

Not only is it important to have a sample size that is large enough, but also it is necessary to see how the subjects in the sample were selected.

Convenience Sample
Volunteer Sample

How survey results can be problematic
http://www.publicusasia.com/sara-bongbong-isko-and-leni-tied-in-top-spot/

Take a look at the survey from Publicus Asia (2020).

“It is a nationwide purposive sampling survey comprised of 1500 respondents drawn from a research panel of approximately 100K Filipinos maintained by a Singapore-based firm.”

What is Purposive Sampling?

Purposive sampling, also known as judgmental, selective, or subjective sampling, is a form of non-probability sampling in which researchers rely on their own judgment when choosing members of the population to participate in their study (Alchemer, 2018).

Ambiguous Averages

You will learn that there are four commonly used measures that are loosely called averages. They are the mean, median, mode, and midrange. For the same data set, these averages can differ markedly. People who know this can, without lying, select the one measure of average that lends the most evidence to support their position.

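To see how far the four averages can drift apart, consider the sketch below. The salary figures are invented for illustration, with one deliberately large value; the midrange is computed as the midpoint of the smallest and largest values.

```python
import statistics

# Hypothetical monthly salaries at a small firm; one value is very large.
salaries = [15000, 15000, 18000, 20000, 22000, 25000, 150000]

mean = statistics.mean(salaries)                  # about 37,857
median = statistics.median(salaries)              # 20000, the middle value
mode = statistics.mode(salaries)                  # 15000, the most frequent value
midrange = (min(salaries) + max(salaries)) / 2    # 82500.0

print(f"mean={mean:.0f} median={median} mode={mode} midrange={midrange:.0f}")
```

An employer could truthfully quote the midrange or the mean as the "average" salary, while an employee could just as truthfully quote the mode or the median.
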
Changing the Subject

Another type of statistical distortion can occur when different values are used to represent the same data.

For example, one political candidate who is running for reelection might say, “During my administration, expenditures increased a mere 3%.”

His opponent, who is trying to unseat him, might say, “During my opponent’s administration, expenditures have increased a whopping $6,000,000.”

Here both figures are correct; however, expressing a 3% increase as $6,000,000 makes it sound like a very large increase. Here again, ask yourself, Which measure better represents the data?

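If both candidates are describing the same change, the two figures together imply the size of the original budget. The short calculation below works this out; the variable names are illustrative, and the implied budget is an inference from the two quoted figures rather than a number stated in the example.

```python
percent_increase = 3           # "a mere 3%"
dollar_increase = 6_000_000    # "a whopping $6,000,000"

# If 3% of the starting budget equals $6,000,000, the starting budget follows.
base_budget = dollar_increase * 100 // percent_increase
new_budget = base_budget + dollar_increase

print(f"Implied starting budget: ${base_budget:,}")   # $200,000,000
print(f"Budget after increase:   ${new_budget:,}")    # $206,000,000
```

Seen against a $200,000,000 budget, the $6,000,000 figure and the 3% figure are the same fact presented with different emphasis.
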
Detached Statistics

A claim that uses a detached statistic is one in which no comparison is made.

For example, you may hear a claim such as “Our brand of crackers has one-third fewer calories.” Here, no comparison is made.

Another example is a claim that uses a detached statistic such as “Brand A aspirin works four times faster.” Four times faster than what?

Implied Connections

Many claims attempt to imply connections between variables that may not actually exist.

For example, consider the following statement: “Eating fish may help to reduce your cholesterol.”

“Studies suggest that using our exercise machine will reduce your weight.”

Another claim might say, “Taking calcium will lower blood pressure in some people.”

Be careful when you draw conclusions from claims that use words such as may, in some people, and might help.

Misleading Graphs

Statistical graphs give a visual representation of data that enables viewers to analyze and interpret data more easily than by simply looking at numbers.

However, if graphs are drawn inappropriately, they can misrepresent the data and lead the reader to draw false conclusions.

Example of Misleading Graphs

(Figure: example of a misleading graph; image not recovered.)

Faulty Survey Questions

When analyzing the results of a survey using questionnaires, you should be sure that the questions are properly written since the way questions are phrased can often influence the way people answer them.

For example, the responses to a question such as “Do you feel that the North Huntingdon School District should build a new football stadium?” might differ from the responses to a question such as “Do you favor increasing school taxes so that the North Huntingdon School District can build a new football stadium?”

Each question asks something a little different, and the responses could be radically different.

Frequency Distribution and Graphs
STAT-CBEA: Statistical Analysis with Software Application

Harold B. Badilla
harold.badilla@email.lcup.edu.ph
Mathematics Faculty Member
College of Arts, Sciences and Education
La Consolacion University Philippines

Outline:
Introduction
Frequency Distribution
Graphical Representation of Frequency Distributions
Distribution Shapes
Introduction
(Bluman, 2014)

When conducting a statistical study, the researcher must gather data for the particular variable under study.
To describe situations, draw conclusions, or make inferences about events, the researcher must organize the data in some meaningful way. The most convenient method of organizing data is to construct a frequency distribution.
After organizing the data, the researcher must present them so they can be understood by those who will benefit from reading the study. The most useful method of presenting the data is by constructing statistical charts and graphs.
Organizing Data
(Bluman, 2014)

Suppose a researcher wished to do a study on the ages of the 50 wealthiest people in the world.
The following ages are data gathered from Forbes Magazine:

This is an example of raw data.


Frequency Distribution
(Bluman, 2014)

Frequency Distribution
A frequency distribution is the organization of raw data in table form, using classes and frequencies.
Each raw data value is placed into a quantitative or qualitative category called a class.

Frequency
The frequency of a class is the number of data values contained in a specific class.
Example of Frequency Distribution

The classes in this distribution are 27–35, 36–44, etc. These values are called class limits. The data values 27, 28, 29, 30, 31, 32, 33, 34, 35 can be tallied in the first class; 36, 37, 38, 39, 40, 41, 42, 43, 44 in the second class; and so on.
Frequently Used Types of Frequency Distribution

1. Categorical Frequency Distribution
2. Grouped Frequency Distribution
Categorical Frequency Distribution (CFD)

Categorical Frequency Distribution
The categorical frequency distribution is used for data that can be placed in specific categories, such as nominal- or ordinal-level data.

Examples of data for CFD:
political affiliation
religious affiliation
major field of study
Example of Categorical Frequency Distribution

Example
Twenty-five army inductees were given a blood test to determine their blood type. The data set is

Construct a frequency distribution table for the data.


Solution

Conclusion: More people have type O blood than any other type.
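
A categorical frequency distribution of this kind can be tallied programmatically. The sketch below uses `collections.Counter` from the Python standard library. The list `blood_types` and the helper `categorical_frequency` are hypothetical: the 25 values are illustrative stand-ins for the inductees' data, so the resulting counts are assumptions and not the author's table.

```python
from collections import Counter

# Hypothetical sample of 25 blood types (illustrative values only).
blood_types = [
    "A", "B", "B", "AB", "O",
    "O", "O", "B", "AB", "B",
    "B", "B", "O", "A", "O",
    "A", "O", "O", "O", "AB",
    "AB", "A", "O", "B", "A",
]

def categorical_frequency(values):
    """Return {class: (frequency, percent)} for categorical data."""
    counts = Counter(values)
    n = len(values)
    return {cls: (f, 100 * f / n) for cls, f in counts.items()}

table = categorical_frequency(blood_types)

# Print one row per class: class, frequency, percent of the total.
for cls in ("A", "B", "O", "AB"):
    f, pct = table[cls]
    print(f"{cls:<3} {f:>3} {pct:>5.0f}%")
```

With this stand-in sample, type O has the largest frequency, which is the kind of conclusion the table is meant to support.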
Grouped Frequency Distribution

When the range of the data is large, the data must be grouped into classes that are more than one unit in width, in what is called a grouped frequency distribution.
For example, a distribution of the blood glucose levels in milligrams per deciliter (mg/dL) for 50 randomly selected college students is shown.
Grouped Frequency Distribution

In this distribution, the values 58 and 64 of the first class are called class limits.
The lower class limit is 58; it represents the smallest data value that can be included in the class. The upper class limit is 64; it represents the largest data value that can be included in the class.
The numbers in the second column are called class boundaries. These numbers are used to separate the classes so that there are no gaps in the frequency distribution.
Finally, the class width for a class in a frequency distribution is found by subtracting the lower (or upper) class limit of one class from the lower (or upper) class limit of the next class. For example, the class width in the preceding distribution of blood glucose levels is 7, found from 65 − 58 = 7.
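
The relationship between class limits, class boundaries, and class width can be sketched in a few lines of Python. The function names `class_boundaries` and `class_width` are hypothetical helpers, and the half-unit offset assumes the data are recorded in whole units.

```python
def class_boundaries(lower_limit, upper_limit, unit=1):
    """Boundaries lie half a unit outside the class limits."""
    return (lower_limit - unit / 2, upper_limit + unit / 2)

def class_width(lower_limit, next_lower_limit):
    """Width = lower limit of the next class minus lower limit of this class."""
    return next_lower_limit - lower_limit

# First class of the blood glucose distribution: limits 58 and 64.
boundaries = class_boundaries(58, 64)

# The next class starts at 65, so the width is 65 - 58.
width = class_width(58, 65)

print(boundaries, width)
```

Note that the width is not 64 − 58 = 6; subtracting the limits of the same class undercounts by one unit because both limits are included in the class.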
Record High Temperature

These data represent the record high temperatures in degrees Fahrenheit (°F) for each of the 50 states. Construct a grouped frequency distribution for the data, using 7 classes.
Step 1
Determine the classes

Highest and Lowest Values
Highest Value (H) = 134, Lowest Value (L) = 100

Range (R)
R = H − L
  = 134 − 100
R = 34

Number of Classes (k)
Select the number of classes desired. In this case, 7 is arbitrarily chosen.
Step 1
Determine the classes (cont.)

Class Width
$$\text{Width} = \left\lceil \frac{R}{k} \right\rceil = \left\lceil \frac{34}{7} \right\rceil = \lceil 4.9 \rceil = 5$$

True Class Boundaries (TCB)
This removes discontinuity between classes and considers the true range of values.

How to obtain column entries
(Lower TCB) LTCB = LL − 0.5 (unit)
(Upper TCB) UTCB = UL + 0.5 (unit)
where LL and UL are the lower and upper class limits.
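
Step 1 can be sketched as a small routine that derives the width, the class limits, and the true class boundaries from H, L, and k. The function `build_classes` is a hypothetical helper; it assumes the first class starts at the lowest value and that the data are whole numbers (unit = 1).

```python
import math

def build_classes(lowest, highest, k, unit=1):
    """Return (width, limits, boundaries) for k classes starting at the lowest value."""
    data_range = highest - lowest
    width = math.ceil(data_range / k)  # round up, as in Step 1
    limits, boundaries = [], []
    lower = lowest
    for _ in range(k):
        upper = lower + width - unit                              # upper class limit
        limits.append((lower, upper))
        boundaries.append((lower - unit / 2, upper + unit / 2))   # LTCB, UTCB
        lower += width                                            # next lower limit
    return width, limits, boundaries

# Record high temperatures: H = 134, L = 100, k = 7.
width, limits, boundaries = build_classes(100, 134, 7)

for (ll, ul), (ltcb, utcb) in zip(limits, boundaries):
    print(f"{ll}-{ul}   {ltcb}-{utcb}")
```

The third class produced this way has boundaries 109.5 and 114.5, which is the class the completed distribution identifies as having the most temperatures.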
Steps 2 and 3

Step 2
Tally the data.

Step 3
Find the numerical frequencies from the tallies.
Complete Grouped Frequency Distribution

The completed frequency distribution is shown below.

The frequency distribution shows that the class 109.5–114.5 contains the largest number of temperatures (18) followed by the class 114.5–119.5 with 13 temperatures. Hence, most of the temperatures (31) fall between 110 and 119°F.
Cumulative Frequency Distribution

Definition (Cumulative Frequency Distribution)
A cumulative frequency distribution is a distribution that shows the number of data values less than or equal to a specific value (usually an upper boundary).
Histograms, Frequency Polygons, and Ogives

After you have organized the data into a frequency distribution, you can present them in graphical form. The purpose of graphs in statistics is to convey the data to the viewers in pictorial form.
Statistical graphs can be used to describe the data set or to analyze it.
The three most commonly used graphs in research are
1. the histogram
2. the frequency polygon
3. the cumulative frequency graph, or ogive (pronounced o-jive).
Histogram

Histogram
The histogram is a graph that displays the data by using contiguous vertical bars (unless the frequency of a class is 0) of various heights to represent the frequencies of the classes.
Frequency Polygon

Frequency Polygon
The frequency polygon is a graph that displays the data by using lines that connect points plotted for the frequencies at the midpoints of the classes. The frequencies are represented by the heights of the points.
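
The x-coordinates of a frequency polygon are the class midpoints, each found by averaging the two boundaries of a class. The sketch below computes them for the seven classes of the record high temperature example; the helper `class_midpoints` is hypothetical, and the boundaries assume classes of width 5 starting at 99.5.

```python
def class_midpoints(boundaries):
    """Midpoint of a class = (lower boundary + upper boundary) / 2."""
    return [(lower + upper) / 2 for lower, upper in boundaries]

# Seven classes of width 5: 99.5-104.5, 104.5-109.5, ..., 129.5-134.5.
boundaries = [(99.5 + 5 * i, 104.5 + 5 * i) for i in range(7)]
midpoints = class_midpoints(boundaries)

print(midpoints)

# The class 109.5-114.5 has frequency 18, so its polygon point is (112.0, 18).
point = (midpoints[2], 18)
print(point)
```

Each point of the polygon pairs one midpoint with the frequency of its class, and the points are then joined by line segments.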
Example of Frequency Polygon

Cumulative Frequency

Cumulative Frequency
The cumulative frequency is the sum of the frequencies accumulated up to the upper boundary of a class in the distribution.
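
Cumulative frequencies are running totals of the class frequencies, so they can be produced with `itertools.accumulate`. In the sketch below, only the frequencies 18 and 13 come from the record high temperature discussion; the other five values are assumed placeholders chosen so that the total is 50.

```python
from itertools import accumulate

# Upper class boundaries for the seven temperature classes.
upper_boundaries = [104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]

# Assumed frequencies: 18 and 13 are stated in the text, the rest are
# illustrative placeholders that sum to 50 states.
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Running total: number of values less than or equal to each upper boundary.
cumulative = list(accumulate(frequencies))

for boundary, cf in zip(upper_boundaries, cumulative):
    print(f"less than or equal to {boundary}: {cf}")
```

The pairs (upper boundary, cumulative frequency) are the points that an ogive connects, and the last cumulative frequency always equals the number of data values.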
Cumulative Frequency Graph or Ogive

Cumulative Frequency Graph or Ogive
The ogive is a graph that represents the cumulative frequencies for the classes in a frequency distribution.
Distribution Shapes

When one is describing data, it is important to be able to recognize the shapes of the distribution values.
A distribution can have many shapes, and one method of analyzing a distribution is to draw a histogram or frequency polygon for the distribution.
Several of the most common shapes are
the bell-shaped or mound-shaped
the uniform-shaped
the J-shaped
the reverse J-shaped
the positively or right-skewed shape
the negatively or left-skewed shape
the bimodal-shaped, and
the U-shaped
Distributions are most often not perfectly shaped, so it is not necessary to have an exact shape but rather to identify an overall pattern.
Bell-shaped Distribution

Characteristics
A bell-shaped distribution has a single peak and tapers off at either end. It is approximately symmetric; i.e., it is roughly the same on both sides of a line running through the center.
Uniform Distribution

Characteristic
A uniform distribution is basically flat or rectangular.
J-shaped Distribution

Characteristic
A J-shaped distribution has a few data values on the left side, and the frequencies increase as one moves to the right.
Reverse J-shaped Distribution

Characteristic
A reverse J-shaped distribution is the opposite of the J-shaped distribution; i.e., it has a few data values on the right side, and the frequencies increase as one moves to the left.
Positively-Skewed Distribution

Characteristic
When the peak of a distribution is to the left and the data values taper off to the right, a distribution is said to be positively or right-skewed.
Negatively-Skewed Distribution

Characteristic
When the data values are clustered to the right and taper off to the left, a distribution is said to be negatively or left-skewed.
Unimodal Distribution

Characteristics
Distributions with one peak, such as the bell-shaped, right-skewed, and left-skewed distributions, are said to be unimodal. (The highest peak of a distribution indicates where the mode of the data values is. The mode is the data value that occurs more often than any other data value. Modes are explained in Chapter 3.)
Bimodal Distribution

Characteristic
When a distribution has two peaks of the same height, it is said to be bimodal.
U-shaped Distribution
(Glen, 2020)

Characteristic
A U-shaped distribution is a bimodal distribution with frequencies that steadily fall and then steadily rise. There is a higher chance of a measurement being found at the extremes than in the center of the distribution.
Notes

When you are analyzing histograms and frequency polygons, look at the shape of the curve.
For example, does it have one peak or two peaks? Is it relatively flat, or is it U-shaped?
Are the data values spread out on the graph, or are they clustered around the center?
Are there data values in the extreme ends? These may be outliers.
Are there any gaps in the histogram, or does the frequency polygon touch the x-axis somewhere other than at the ends?
Finally, are the data clustered at one end or the other, indicating a skewed distribution?
Analyzing Shapes of Histogram

Example
The histogram for the record high temperatures shows a single-peaked distribution, with the class 109.5–114.5 containing the largest number of temperatures. The distribution has no gaps, and there are fewer temperatures in the highest class than in the lowest class.
Measures of Central Tendency
STAT-CBEA: Statistical Analysis with Software Application

Harold B. Badilla
harold.badilla@email.lcup.edu.ph
Mathematics Faculty Member
College of Arts, Sciences, and Education
La Consolacion University Philippines

Outline:
Introduction
Measures of Central Tendency for Ungrouped Data
Measures of Central Tendency for Grouped Data
Mean, Median and Mode in Distribution Shapes
Introduction
(Bluman, 2014)

In the book American Averages by Mike Feinsilber and William B. Mead, the authors state:

“Average” when you stop to think of it is a funny concept. Although it describes all of us it describes none of us.... While none of us wants to be the average American, we all want to know about him or her.
Introduction
(Bluman, 2014)

The authors go on to give examples of averages:
The average American man is five feet, nine inches tall; the average woman is five feet, 3.6 inches.
The average American is sick in bed seven days a year, missing five days of work.
On the average day, 24 million people receive animal bites.
By his or her 70th birthday, the average American will have eaten 14 steers, 1050 chickens, 3.5 lambs, and 25.2 hogs.
In these examples, the word average is ambiguous, since several different methods can be used to obtain an average.
Measures of Central Tendency
(Bluman, 2014)

Measures of average are also called measures of central tendency because they describe the center of the data. A measure of central tendency is a single value about which the observations tend to cluster. The measures of central tendency include the following:
mean
median
mode
midrange
weighted mean
Measures of Central Tendency
Definition of Terms

Mean (µ or x̄)
The sum of the observations divided by the total number of observations

Median (Mdn or mdn)
The middle value of an ordered array (arranged in either ascending or descending order)

Mode (Mo or mo)
The observation or observations that occur most frequently in the data set
Measures of Central Tendency
The Mean (µ or x̄)

Characteristics
Interval statistic
Calculated average
Value is determined by every case in the distribution
Affected by extreme values

When to use
Variables are in at least interval scale
Value of each score is desired
Values are considerably concentrated or close to each other
Measures of Central Tendency
The Median (Mdn or mdn)

Characteristics
Ordinal statistic
Rank or position average
NOT affected by extreme values

When to Use
Ordinal interpretation is needed
Middle score is desired
We want to avoid influence of extreme values
Measures of Central Tendency
The Mode (Mo or mo)

Characteristics
Nominal statistic
Inspection average
NOT unique
Most popular score
NOT affected by extreme values
Represents the majority

When to Use
Nominal interpretation is needed
Quick approximation of central tendency is required
Formulas for the Sample Mean, Median, and Mode
of Ungrouped Data
Given a data set $\{x_1, x_2, x_3, x_4, \ldots, x_n\}$ ...

Sample Mean x̄
... the sample mean x̄ is
$$\bar{x} = \frac{x_1 + x_2 + x_3 + x_4 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Sample Median mdn
... the sample median mdn is the $\frac{n+1}{2}$th term of the ordered data.

Sample Mode mo
... the data point with the highest frequency is the mode.
Drills

Example 1
Find the mean, median, and mode of the following data set.
{8, 8, 9, 10, 11, 13, 17, 20, 21}

Example 2
Find the mean, median, and mode of the following data set.
{8, 8, 9, 10, 11, 13, 17, 20, 21, 75}
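
One way to check answers to the two drills is with Python's standard `statistics` module, as in the sketch below. The helper `summarize` is a hypothetical name, and `multimode` (available from Python 3.8) is used because a data set can have more than one mode.

```python
from statistics import mean, median, multimode

example_1 = [8, 8, 9, 10, 11, 13, 17, 20, 21]
example_2 = example_1 + [75]  # same data plus one extreme value

def summarize(data):
    """Return the mean, median, and list of modes of a data set."""
    return {"mean": mean(data), "median": median(data), "mode": multimode(data)}

s1 = summarize(example_1)
s2 = summarize(example_2)

print(s1)
print(s2)
```

Comparing the two summaries shows the effect of an extreme value: adding 75 moves the mean from 13 to 19.2, while the median only moves from 11 to 12 and the mode stays at 8.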
Try this!
(Koh, et al., 2020)

The data below show the number of confirmed COVID-19 cases in ASEAN countries as of March 5, 2020.

Country        No. of Confirmed Cases
Singapore      117
Malaysia       55
Thailand       47
Vietnam        16
Philippines    3
Indonesia      2
Cambodia       1
Myanmar        0
Laos           0
Brunei         0

For the mean
$$\mu = \frac{\sum_{i=1}^{10} x_i}{10} = \frac{241}{10} = 24.1$$

For the median
{0, 0, 0, 1, 2, 3, 16, 47, 55, 117}
Since the number of values is even, we take the (n/2)th and (n/2 + 1)th terms, that is, the 5th and 6th terms, and find their midpoint.
Mdn = (2 + 3)/2 = 2.5

For the mode
The number 0 occurs three times, more than any other number in the data set. Hence, the mode is 0.
Interpretations

Mean
µ = 24.1 means that if the confirmed COVID-19 cases were distributed equally among the ASEAN countries, each country would have 24.1 cases.

Median
Mdn = 2.5 means that half of the ASEAN countries have 2.5 or fewer COVID-19 cases and the other half have 2.5 or more COVID-19 cases.

Mode
Mo = 0 means that the most common number of confirmed COVID-19 cases among ASEAN countries as of March 5, 2020 is zero.
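
The positional rule for the median can be written out explicitly. The sketch below is a minimal illustration using the ASEAN case counts from the table; `median_by_position` is a hypothetical helper that takes the ((n + 1)/2)th term for an odd number of values and the midpoint of the (n/2)th and (n/2 + 1)th terms for an even number.

```python
def median_by_position(data):
    """Median via term positions in the ordered data (1-based positions)."""
    ordered = sorted(data)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th term; subtract 1 for 0-based indexing.
        return ordered[(n + 1) // 2 - 1]
    # Even n: midpoint of the (n/2)th and (n/2 + 1)th terms.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

# Confirmed cases per ASEAN country as of March 5, 2020 (from the table).
cases = [117, 55, 47, 16, 3, 2, 1, 0, 0, 0]

mean_cases = sum(cases) / len(cases)
median_cases = median_by_position(cases)

print(mean_cases, median_cases)
```

The large gap between the mean and the median reflects how strongly the few countries with many cases pull the mean upward.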
Midrange

The midrange (MR) is the sum of the lowest and highest values in the data set divided by 2.
It is a rough estimate of the middle and can be affected by one extremely high or low value.

Formula for the Midrange
$$MR = \frac{LV + HV}{2}$$

Example
Find the midrange of the data set below.

{8, 8, 9, 10, 11, 13, 17, 20, 21, 22}
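
A short sketch of the midrange computation for this data set follows. The function name `midrange` is a hypothetical helper; it applies the formula with LV as the minimum and HV as the maximum of the data.

```python
def midrange(data):
    """Midrange = (lowest value + highest value) / 2."""
    return (min(data) + max(data)) / 2

data = [8, 8, 9, 10, 11, 13, 17, 20, 21, 22]
mr = midrange(data)  # (8 + 22) / 2

# Replacing the highest value with an extreme one shifts the midrange sharply.
mr_with_outlier = midrange(data[:-1] + [75])  # (8 + 75) / 2

print(mr, mr_with_outlier)
```

Because only the two end values enter the formula, a single extreme value changes the midrange from 15 to 41.5 here.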


Weighted Mean

The weighted mean of a variable X is found by multiplying each value by its corresponding weight and dividing the sum of all the products by the sum of the weights.
It is used when the values are not all equally represented.

Formula for the Weighted Mean
$$\bar{X} = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i} = \frac{w_1 X_1 + w_2 X_2 + w_3 X_3 + \cdots + w_n X_n}{w_1 + w_2 + w_3 + \cdots + w_n}$$

where $w_1, w_2, w_3, \ldots, w_n$ are the weights and $X_1, X_2, X_3, \ldots, X_n$ are the values.
Example

La Consolacion University Philippines uses this College Grading System for its students.

% Equivalent    Numerical Equivalent    Description
98 to 100       1.00                    Excellent
95 to 97        1.25                    Superior
92 to 94        1.50                    Very Good
89 to 91        1.75                    Good
86 to 88        2.00                    Very Satisfactory
83 to 85        2.25                    Satisfactory
80 to 82        2.50                    Fairly Satisfactory
77 to 79        2.75                    Fair
75 to 76        3.00                    Passed
74 and below    5.00                    Failed
Example

The General Weighted Average (GWA) is used to measure a student’s academic performance. To determine the GWA, we use the formula for the weighted mean. The weights w are the credit units of each subject and the values X are the numerical equivalents of each grade.
The table below shows the grades of Student A for the First Semester of SY 2021-2022. Compute the GWA of Student A.

SUBJCODE     Units (Weight)    % Equivalent    N.G. (Value)
ACIS         3                 98              1.00
STRABA       3                 95              1.25
COST         6                 88              2.00
TH5          3                 94              1.50
AUDPROB      6                 96              1.25
INTACC3      3                 86              2.00
STAT-CBEA    3                 93              1.50
ABUSCOM      3                 85              2.25
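
The GWA of Student A can be worked out by applying the weighted mean formula to the table. The sketch below is one possible way to do it; `weighted_mean` is a hypothetical helper, with the units as weights and the numerical equivalents as values.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    total = sum(w * x for w, x in zip(weights, values))
    return total / sum(weights)

# (subject code, units, numerical equivalent) taken from the table.
grades = [
    ("ACIS", 3, 1.00),
    ("STRABA", 3, 1.25),
    ("COST", 6, 2.00),
    ("TH5", 3, 1.50),
    ("AUDPROB", 6, 1.25),
    ("INTACC3", 3, 2.00),
    ("STAT-CBEA", 3, 1.50),
    ("ABUSCOM", 3, 2.25),
]

units = [u for _, u, _ in grades]
values = [x for _, _, x in grades]

gwa = weighted_mean(values, units)  # 48 / 30
print(round(gwa, 2))
```

The six-unit subjects count twice as much as the three-unit subjects, which is why the GWA differs from the simple average of the eight numerical equivalents.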
Measures of Central Tendency for Grouped Data

The measures of central tendency discussed earlier deal with ungrouped data.
However, what if the given data are grouped, i.e., presented as a grouped frequency distribution?
Mean of Grouped Frequency Distribution

The procedure for finding the mean for grouped data assumes that the mean of all the raw data values in each class is equal to the midpoint of the class.
In reality, this is not true, since the average of the raw data values in each class usually will not be exactly equal to the midpoint.
However, using this procedure will give an acceptable approximation of the mean, since some values fall above the midpoint and other values fall below the midpoint for each class, and the midpoint represents an estimate of all values in the class.
Estimating the Mean of Grouped Data

Procedure Table
1. Make a table as shown.

2. Find the midpoints of each class and place them in column C.
3. Multiply the frequency by the midpoint for each class, and place the product in column D.
4. Find the sum of column D.
5. Divide the sum obtained in column D by the sum of the frequencies obtained in column B.
The formula for the mean is
$$\bar{X} = \frac{\sum f \cdot X_m}{n}.$$
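
The five steps above can be sketched in Python as follows. The class boundaries and frequencies are an assumed, illustrative distribution for 20 runners, not necessarily the one in the example; `grouped_mean` is a hypothetical helper.

```python
def grouped_mean(boundaries, frequencies):
    """Estimate the mean of grouped data as sum(f * Xm) / n."""
    # Step 2: class midpoints (column C).
    midpoints = [(lower + upper) / 2 for lower, upper in boundaries]
    # Steps 3 and 4: products f * Xm (column D) and their sum.
    total = sum(f * xm for f, xm in zip(frequencies, midpoints))
    # Step 5: divide by the sum of the frequencies (column B).
    return total / sum(frequencies)

# Assumed illustrative distribution (boundaries, frequencies) for 20 runners.
boundaries = [(5.5, 10.5), (10.5, 15.5), (15.5, 20.5), (20.5, 25.5),
              (25.5, 30.5), (30.5, 35.5), (35.5, 40.5)]
frequencies = [1, 2, 3, 5, 4, 3, 2]

estimate = grouped_mean(boundaries, frequencies)
print(estimate)
```

The result is an estimate of the mean, since every value in a class is treated as if it were equal to the class midpoint.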
Calculating Grouped Mean

Example: Miles Run per Week
Using the following frequency distribution, find the mean. The data represent the number of miles run during one week for a sample of 20 runners.
Estimating Median of Grouped Data

Formula
The estimated median of grouped data can be found using the formula:
$$\text{Estimated Median} = L + \frac{\frac{n}{2} - B}{f} \cdot w$$
where
L is the lower class boundary of the group containing the median
n is the total number of values
B is the cumulative frequency of the groups before the median group
f is the frequency of the median group
w is the class width
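
The formula can be sketched as a routine that first locates the median group and then interpolates within it. The distribution used here is an assumed, illustrative one for 20 runners, and `grouped_median` is a hypothetical helper; the variable names L, B, f, and w mirror the symbols in the formula.

```python
def grouped_median(boundaries, frequencies):
    """Estimate the median of grouped data: L + ((n/2 - B) / f) * w."""
    n = sum(frequencies)
    B = 0  # cumulative frequency of the groups before the current one
    for (lower, upper), f in zip(boundaries, frequencies):
        if B + f >= n / 2:          # this group contains the median
            L = lower               # lower class boundary of the median group
            w = upper - lower       # class width
            return L + (n / 2 - B) * w / f
        B += f
    raise ValueError("the distribution has no data values")

# Assumed illustrative distribution (boundaries, frequencies) for 20 runners.
boundaries = [(5.5, 10.5), (10.5, 15.5), (15.5, 20.5), (20.5, 25.5),
              (25.5, 30.5), (30.5, 35.5), (35.5, 40.5)]
frequencies = [1, 2, 3, 5, 4, 3, 2]

estimate = grouped_median(boundaries, frequencies)
print(estimate)
```

With these assumed values, n/2 = 10 falls in the class 20.5–25.5, so L = 20.5, B = 6, f = 5, and w = 5, which gives 20.5 + (4/5)(5) = 24.5.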
Calculating Grouped Median

Example: Miles Run per Week
Using the following frequency distribution, find the median. The data represent the number of miles run during one week for a sample of 20 runners.
Estimating Mode of Grouped Data

The mode for grouped data is the modal class. The modal class is the class with the largest frequency.

Example: Miles Run per Week
Using the following frequency distribution, find the mode. The data represent the number of miles run during one week for a sample of 20 runners.
Distribution Shapes

Frequency distributions can assume many shapes. The three most important shapes are positively skewed, symmetric, and negatively skewed.
In a positively skewed or right-skewed distribution, the majority of the data values fall to the left of the mean and cluster at the lower end of the distribution; the “tail” is to the right.
Also, the mean is to the right of the median, and the mode is to the left of the median.
An example of a positively skewed distribution is the incomes of the population of the Philippines. Most of the incomes cluster about the low end of the distribution; those with high incomes are in the minority and are in the tail at the right of the distribution.
Figure of Positively-Skewed Distribution

Symmetric Distribution

In a symmetric distribution, the data values are evenly distributed on both sides of the mean.
In addition, when the distribution is unimodal, the mean, median, and mode are the same and are at the center of the distribution.
Examples of symmetric distributions are IQ scores and heights of adult males.
Figure of Symmetric Distribution

Negatively-Skewed Distribution

When the majority of the data values fall to the right of the mean and cluster at the upper end of the distribution, with the tail to the left, the distribution is said to be negatively skewed or left-skewed.
Also, the mean is to the left of the median, and the mode is to the right of the median.
As an example, a negatively skewed distribution results if the majority of students score very high on an instructor’s examination. These scores will tend to cluster to the right of the distribution.
Figure of Negatively-Skewed Distribution
