Introduction To Statistics
NATURE OF PROBABILITY
AND STATISTICS
cdcjaurigue
Contents
1 Introduction
2 Descriptive and Inferential Statistics
4 Data Collection
5 Sampling Techniques
Objectives
Why study Statistics?
Data are everywhere
▪ GDP gained momentum growing by 2.5 percent
in the first quarter of 2012 while GNI grew by a
slower pace of 1.3 percent.
▪ The Agriculture, Hunting, Forestry and Fishery
sector posted a turnaround growth of 2.5
percent in the first quarter from two consecutive
quarters of decline.
▪ Industry slowed down to 2.2 percent growth
from 3.3 percent in the previous quarter.
▪ The Services sector accelerated to 2.6 percent from 1.3
percent in the previous quarter as all its subsectors
recorded positive growth.
▪ With the projected population reaching 95.2 million, per capita
- GDP grew by 4.6 percent
- GNI grew by 4.0 percent
- HFCE grew by 4.9 percent
• HFCE (Household Final Consumption Expenditure) has been
growing robustly since the fourth quarter of 2010
Source: http://www.nscb.gov.ph
✓ Statistical techniques are used to make
many decisions that affect our lives
▪ Insurance companies use statistical analysis to set
rates for home, automobile, life, and health insurance.
▪ The Laguna Lake Development Authority monitors the water quality of Laguna Lake. It periodically takes water samples to establish the level of contamination and maintain the level of quality.
▪ Medical researchers study the cure rates for diseases using different drugs and different forms of treatment.
What is Statistics?
Statistics Defined
Types of Statistics
❖ Descriptive statistics utilizes numerical and graphical methods to look for patterns in a data set, to summarize the information revealed in a data set, and to present the information in a convenient form.
❖ Inferential statistics utilizes sample data to make estimates, decisions, predictions, or other generalizations about a larger set of data or population.
Descriptive Statistics
❖Collect data
▪ survey
❖Present data
▪ tables and graphs
❖Characterize data
▪ sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
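As a minimal sketch of the sample mean, the snippet below adds up the observations and divides by their count. The weights and the function name `sample_mean` are made up for illustration and do not come from the slides.

```python
def sample_mean(values):
    """Return the sum of the observations divided by the number of observations."""
    if not values:
        raise ValueError("sample mean is undefined for an empty sample")
    return sum(values) / len(values)

# Hypothetical sample of five weights in pounds (illustrative data only).
weights = [118, 122, 125, 119, 121]
mean_weight = sample_mean(weights)
print(mean_weight)
```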
Inferential Statistics
❖ Estimation
▪ Estimate the population mean weight using the sample mean weight
❖ Hypothesis testing
▪ Test the claim that the population mean weight is 120 pounds
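Both ideas can be sketched on one small sample. The slides do not say which test to use; the one-sample t statistic below is an assumed choice, and the weights are made-up values. Reaching a decision would also require comparing the statistic with a t distribution on n - 1 degrees of freedom, which is not shown here.

```python
import math
import statistics

# Hypothetical sample of weights in pounds (illustrative data only).
sample = [118, 122, 125, 119, 121, 124, 117, 123]

n = len(sample)
x_bar = statistics.mean(sample)   # estimation: point estimate of the population mean
s = statistics.stdev(sample)      # sample standard deviation (n - 1 divisor)

# Hypothesis testing: t statistic for the claim that the population mean is 120.
mu_0 = 120
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
print(x_bar, t_stat)
```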
❖ population: the complete collection of individuals, items, or data under consideration in a statistical study
❖ sample: the portion of the population selected for analysis
Population vs. Sample
[Figure: a population of 26 items labeled a through z, and a sample of 9 of them: b, c, g, i, n, o, r, u, y]
Why Sample?
Example.
• 100 owners of a certain car reported 85 problems
in the first 90 days of ownership.
• The statistic “85” describes the number of
problems per 100 cars during the first 90 days of
ownership.
• It suggests that the entire population of owners of these cars experiences an average of 0.85 problems per car (85 problems / 100 cars) in the first 90 days of ownership.
Parameter
1 Variable
❖ a characteristic of interest concerning the individual elements of a population or a sample
❖ often represented by a letter such as x, y, or z
2 Observation
❖ the value of a variable for one particular element from the sample or population
3 Data set
❖ consists of the observations of a variable for the elements of a sample
Example
All engineering majors taking IE101 are polled, and each one is asked whether they approve or disapprove of the student council’s policies.
1 Variable
❖ the opinion of the engineering student taking IE101 of the council’s policies
❖ Let x = 0, if disapprove; x = 1, if approve
2 Observation
❖ “approve” or “disapprove” (0 or 1)
3 Data set
❖ consists of the observations of all the engineering students taking IE101: 1,0,0,1,0,...,1,1
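The 0/1 coding in this example can be sketched as follows. The five responses are made up for illustration; the real poll would hold one observation per IE101 student.

```python
# Hypothetical poll responses (illustrative data only).
responses = ["approve", "disapprove", "disapprove", "approve", "approve"]

# Encode each observation: x = 1 if approve, x = 0 if disapprove.
encoding = {"disapprove": 0, "approve": 1}
data_set = [encoding[r] for r in responses]

# With a 0/1 coding, the mean of x is the proportion who approve.
proportion_approve = sum(data_set) / len(data_set)
print(data_set, proportion_approve)
```

A convenient side effect of this coding is that the sample mean of x equals the share of students who approve.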
Data Types
◼ Qualitative (defined categories). Examples: Marital Status, Political Party, Eye Color
◼ Quantitative, discrete (counted items). Examples: Number of Children, Defects per hour
◼ Quantitative, continuous (measured characteristics). Examples: Weight, Voltage
Classification of Data
Qualitative data
Examples
▪ Satisfaction ratings (on a scale from “not satisfied” to “very satisfied”) by users of a website
▪ Party affiliation (Liberal, Nacionalista, Pwersa ng Masang Pilipino, Lakas Kampi, Bangon Pilipinas, etc.) of voters
▪ Eye colors (blue, brown, and so on) of babies
▪ Names (first and last) of a group of students who took an exam
▪ Student numbers of a group of engineering students
▪ Foremost colors (red, yellow, orange, and so on) of flowers in a garden
▪ Sex (male or female) of users of a website
Quantitative data
Examples
▪ Daytime temperature readings (in degrees Fahrenheit) in a 30-day period
▪ Heights (in centimeters) of plants in a plot of land
▪ Number (0, 1, 2, and so on) of people attending a conference
▪ Distances (in miles) traveled by students commuting to school
▪ Heights (in inches) of girls in a classroom
▪ Number (0, 1, 2, and so on) of students in a classroom
▪ Number (0, 1, 2, and so on) of teachers in favor of school uniforms
▪ Ages (in months) of children in a preschool
Quantitative Data
Discrete Data
▪ quantitative data that are countable using a finite count, such as 0, 1, 2, and so on
▪ integer-valued
Continuous Data
▪ quantitative data that can take on any value within a range of values on a numerical scale in such a way that there are no gaps, jumps, or other interruptions
▪ real-valued
Discrete or Continuous?
Examples
▪ Daytime temperature readings (in degrees Fahrenheit) in a 30-day period: continuous
▪ Heights (in centimeters) of plants in a plot of land: continuous
▪ Number (0, 1, 2, and so on) of people attending a conference: discrete
▪ Distances (in miles) traveled by students commuting to school: continuous
▪ Heights (in inches) of girls in a classroom: continuous
Levels of Measurement
1 Ratio
2 Interval
3 Ordinal
4 Nominal
Nominal Scale
▪ the lowest level of data
▪ applied to data that are used for category identification
▪ characterized by data that consist of names, labels, or categories only
▪ data cannot be arranged in an ordering scheme
▪ arithmetic operations are not performed for nominal data
[Table: Qualitative variable | Possible nominal level data values]
Interval Scale
▪ applied to data that can be arranged in some order and for which differences in data values are meaningful
▪ results from counting or measuring
▪ data can be arranged in an ordering scheme and differences can be calculated and interpreted
▪ the value zero is arbitrarily chosen for interval data and does not imply an absence of the characteristic being measured
▪ ratios are not meaningful for interval data
▪ Examples: temperature, IQ scores
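A short sketch can show why ratios are not meaningful for temperature while differences are. The two readings are made up for illustration, and the helper name `celsius_to_fahrenheit` is an assumption of this example.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Hypothetical readings (illustrative data only).
cool_c, warm_c = 10.0, 20.0
cool_f = celsius_to_fahrenheit(cool_c)
warm_f = celsius_to_fahrenheit(warm_c)

# The ratio depends on where each scale puts its arbitrary zero,
# so "twice as hot" has no fixed meaning.
ratio_c = warm_c / cool_c
ratio_f = warm_f / cool_f

# The difference converts consistently: a 10-degree Celsius gap is
# an 18-degree Fahrenheit gap wherever it sits on the scale.
gap_f = warm_f - cool_f
print(ratio_c, ratio_f, gap_f)
```

The same two temperatures give a ratio of 2 in Celsius and a different ratio in Fahrenheit, which is the sense in which the zero point is arbitrary.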
Ratio Scale
Sampling Techniques
❖ Nonstatistical sampling: Convenience, Judgment
❖ Statistical sampling: Simple Random, Systematic, Stratified, Cluster
Nonstatistical Sampling
❖ Convenience
▪ Collected in the most convenient manner for the researcher
❖ Judgment
▪ Based on judgments about who in the population would be most likely to provide the needed information
Statistical Sampling
(Probability Sampling)
[Figure: stratified sampling. Population divided into 4 strata; sample]
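A population divided into 4 strata can be sampled along the following lines. This is only a sketch: the slides do not say how many items to draw from each stratum, so the equal allocation of 2 per stratum, the stratum contents, and the function name `stratified_sample` are all assumptions.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Draw a simple random sample of the same size from every stratum."""
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

# Hypothetical population of 40 items split into 4 strata (illustrative only).
strata = {
    "stratum_1": list(range(0, 10)),
    "stratum_2": list(range(10, 20)),
    "stratum_3": list(range(20, 30)),
    "stratum_4": list(range(30, 40)),
}
sample = stratified_sample(strata, per_stratum=2, seed=1)
print(sample)
```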
Systematic Random Sampling
❖ Decide on sample size: n
❖ Divide ordered (e.g., alphabetical) frame of N individuals into groups of k individuals: k = N/n
❖ Randomly select one individual from the 1st group
❖ Select every kth individual thereafter
Example: N = 64, n = 8, so k = 64/8 = 8; the first group is the first 8 individuals in the frame.
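The steps for N = 64 and n = 8 can be sketched as below. The frame of numbered individuals and the function name `systematic_sample` are illustrative assumptions, and the sketch assumes N is an exact multiple of n.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first group, then take every kth individual."""
    k = len(frame) // n              # group size, k = N / n
    rng = random.Random(seed)
    start = rng.randrange(k)         # random position within the first group
    return [frame[start + i * k] for i in range(n)]

# Hypothetical ordered frame of N = 64 individuals numbered 1 to 64.
frame = list(range(1, 65))
sample = systematic_sample(frame, n=8, seed=0)
print(sample)
```

Only the starting point is random; once it is chosen, the rest of the sample is fully determined by k.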
Cluster Sampling
❖ Divide population into several “clusters,” each representative of the population (e.g., regions)
❖ Select a simple random sample of clusters
▪ All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique
[Figure: population divided into 16 clusters, with randomly selected clusters forming the sample]
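Cluster sampling from a population divided into 16 clusters might look like the sketch below. The number of clusters selected (4), the 4 items per cluster, and the function name `cluster_sample` are assumptions; this version uses every item in each selected cluster.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly select whole clusters and keep every item inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    sample = [item for c in chosen for item in clusters[c]]
    return chosen, sample

# Hypothetical population: 16 clusters of 4 items each (illustrative only).
clusters = {c: [f"cluster{c}_item{i}" for i in range(4)] for c in range(16)}
chosen, sample = cluster_sample(clusters, num_clusters=4, seed=0)
print(chosen, len(sample))
```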
Surveys
Advantages
❖ Efficient way of collecting information from a large number of people.
❖ Relatively easy to administer.
❖ Wide variety of information can be collected.
❖ They can be focused.
Disadvantages
❖ They depend on the subjects’ motivation, honesty, memory and ability to respond.
❖ Answers could lead to vague data.
Designing a Survey
❖ Surveys can take different forms. They can ask only one question or a series of questions. We can use surveys to gauge people’s opinions or to test a hypothesis.
❖ When designing a survey, the following steps are useful:
1. Determine the goal of the survey.
2. Identify the sample population.
3. Choose an interviewing method.
4. Decide what questions you will ask, in what order, and how to phrase them.
5. Conduct the interview and collect the information.
6. Analyze the results.
Design of Experiment
❖ Design of experiments is a branch of applied statistics that deals with planning, conducting, analyzing, and interpreting controlled tests to evaluate the factors that control the value of a parameter or group of parameters.