
Chapter 3: Descriptive Measures

Dr. Alan Polansky


Division of Statistics
Northern Illinois University
Chapter Concepts
! Characteristics of data: location and variation
! Measuring location: mean, median, mode
! Measuring variation: range, standard deviation
! Population and sample characteristics

Copyright 2001 Alan M. Polansky 2


Characteristics of Data

! Location: The center or middle of the data set
! Variation: How far data is likely to be from
the location



The Nature of Location
! Consider the two sets of squares.
! Difference: The red squares tend to be
larger than the blue squares.
! This is a difference in location.



The Mean

! The mean is a measure of location.
! The mean is the sum of the observations divided by the
number of observations.
! When computed on a sample, the mean is
called the sample mean and is denoted $\bar{x}$.
! When computed on a population, the mean is
called the population mean and is denoted µ.



The Σ Notation

! The Σ notation means to add the sequence
that is specified afterwards.
! Hence, Σx means to add together all of the x
values.
! If x1,…,xn represents the observations in a
data set, then the sample mean can be
written as
$$\bar{x} = \frac{1}{n} \sum x$$

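As a rough illustration (not part of the original slides), the formula can be sketched in Python. The function name `sample_mean` is an assumption chosen here, and the data are the first data set from the example that follows in this chapter.

```python
def sample_mean(xs):
    """Return (1/n) * (sum of the observations) for a non-empty list."""
    n = len(xs)
    if n == 0:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(xs) / n  # sum(xs) plays the role of the sigma notation


print(sample_mean([24, 29, 26, 18]))
```

The built-in `sum` does the adding that Σx describes; dividing by `len(xs)` supplies the 1/n factor.
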
The Median

! The median is a measure of location.
! To compute the median, first sort the
observations from smallest to largest.
! If there is an odd number of observations, then
the median is the middle sorted observation.
! If there is an even number of observations,
then the median is the average of the middle
two sorted observations.
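
The rule above can be sketched in Python as follows. This is a minimal illustration, not the only way to compute a median; the function name `median` is chosen here for convenience.

```python
def median(xs):
    """Median of a non-empty list of numbers, following the sort-then-pick rule."""
    s = sorted(xs)  # sort from smallest to largest
    n = len(s)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return s[mid]  # odd count: the middle sorted observation
    return (s[mid - 1] + s[mid]) / 2  # even count: average of the middle two


print(median([24, 29, 26, 18]))
print(median([18, 28, 17, 15, 28]))
```
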



The Mode

! The mode is the observation in the data set
that occurs most often.
! The mode may not be unique, or may not
exist at all.
! The mode is rarely used except in basic
statistics classes.



Example

! Data: 24, 29, 26, 18


! Mean = 24.25
! Median = 25.00
! Mode does not exist.
! Data: 18, 28, 17, 15, 28
! Mean = 21.20
! Median = 18
! Mode = 28
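
These values can be checked with a short Python sketch. It uses the standard library `statistics` module for the mean and median. The helper `modes` is a hypothetical name introduced here; it follows the convention of this example that the mode does not exist when no value repeats.

```python
from collections import Counter
import statistics


def modes(xs):
    """Return the most frequent values, or an empty list if no value repeats."""
    if not xs:
        return []
    counts = Counter(xs)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: treat the mode as not existing
    return [value for value, count in counts.items() if count == top]


for data in ([24, 29, 26, 18], [18, 28, 17, 15, 28]):
    print(data, statistics.mean(data), statistics.median(data), modes(data))
```

Returning a list rather than a single number allows for data sets where the mode is not unique.
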



The Nature of Variation

! Both sets of squares have an average area of 10.


! The difference: The blue squares have more
variation than the red ones.



Measuring Variation: The Range

! The range is the difference between the
largest observation and the smallest
observation:
Range = Largest Obs. – Smallest Obs.
! The range is not used much in practice
because it ignores most of the observations.
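
A small Python sketch may make the last point concrete. The name `data_range` and the second data set are assumptions made up for this illustration; the first data set is the one used in this chapter's examples.

```python
def data_range(xs):
    """Largest observation minus smallest observation."""
    return max(xs) - min(xs)


original = [24, 29, 26, 18]
altered = [18, 18, 18, 29]  # hypothetical data: same extremes, different interior values

print(data_range(original), data_range(altered))
```

Both calls return the same value because only the largest and smallest observations enter the calculation.
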



More on Variation
! Variation reflects how far away observations typically
are from the location.
! The difference between each observation and the mean
is called a deviation:
deviation of $x_i$ = $x_i - \bar{x}$

! Taking the mean of these deviations, after squaring to
get rid of the negatives, results in a measure of
variation.



Measuring Variation:
The Sample Standard Deviation
! The standard deviation of a set of n
observations is
$$S_x = \sqrt{\frac{1}{n-1} \sum (x - \bar{x})^2}$$
! To compute the standard deviation:
1. Compute the squared deviations of the observations.
2. Add up the squared deviations and divide by n – 1.
3. Take the square root of the result.

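The three steps can be followed one at a time in Python. This is a minimal sketch; the function name `sample_std` is an assumption, and the data are taken from this chapter's examples.

```python
import math


def sample_std(xs):
    """Sample standard deviation, computed in the three steps listed above."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are needed to divide by n - 1")
    mean = sum(xs) / n
    squared_devs = [(x - mean) ** 2 for x in xs]  # step 1: squared deviations
    variance = sum(squared_devs) / (n - 1)        # step 2: add up, divide by n - 1
    return math.sqrt(variance)                    # step 3: square root


print(round(sample_std([18, 28, 17, 15, 28]), 2))
```

The standard library function `statistics.stdev` performs the same calculation and can be used as a cross-check.
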
Example

! Data: 24, 29, 26, 18


! Range = 11
! Sx = 4.65
! Data: 18, 28, 17, 15, 28
! Range = 13
! Sx = 6.30



The Empirical Rule

! For any data set with a bell-shaped
histogram:
! Roughly 68% of the data will be within one
standard deviation of the mean.
! Roughly 95% of the data will be within two
standard deviations of the mean.
! Roughly 99.7% of the data will be within three
standard deviations of the mean.
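
The rule can be tried out on simulated data. The sketch below is only an illustration: the data are randomly generated from a normal distribution (the center of 100 and spread of 15 are arbitrary assumptions), and the helper name `fraction_within` is made up here.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable
data = [random.gauss(100, 15) for _ in range(10_000)]  # simulated bell-shaped data

mean = statistics.mean(data)
sd = statistics.stdev(data)  # sample standard deviation (divides by n - 1)


def fraction_within(k):
    """Fraction of observations within k standard deviations of the mean."""
    return sum(abs(x - mean) <= k * sd for x in data) / len(data)


for k in (1, 2, 3):
    print(k, fraction_within(k))
```

The printed fractions should land close to the percentages in the rule, though not exactly on them, since the rule is only approximate.
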



The Empirical Rule

! An observation is within k standard
deviations of the mean if it is between
$\bar{x} - kS_x$ and $\bar{x} + kS_x$



Example

! For the gasoline octane data we have:


! Mean = 90.67
! Standard Deviation = 2.83
! According to the empirical rule:
! Roughly 68% of the observations are between 87.84 and 93.50
! Roughly 95% of the observations are between 85.01 and 96.33
! Roughly 99.7% of the observations are between 82.18 and 99.16
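
The interval endpoints are the mean minus and plus k standard deviations, for k = 1, 2, 3. A short Python sketch of that arithmetic, using the mean and standard deviation quoted above (the function name `empirical_intervals` is an assumption):

```python
def empirical_intervals(mean, sd):
    """Return {k: (low, high)} for k = 1, 2, 3, rounded to two decimals."""
    return {
        k: (round(mean - k * sd, 2), round(mean + k * sd, 2))
        for k in (1, 2, 3)
    }


for k, (low, high) in empirical_intervals(90.67, 2.83).items():
    print(f"within {k} standard deviation(s): {low} to {high}")
```
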



The Population
Standard Deviation
! When computing the standard deviation of a
population, use the formula:
$$\sigma = \sqrt{\frac{1}{N} \sum (x - \mu)^2}$$
where N is the number of items in the
population and µ is the population mean
defined earlier.
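
A minimal Python sketch of the population formula follows. The function name `population_std` is an assumption, and the data set is borrowed from the earlier examples and treated as a complete population purely for illustration.

```python
import math
import statistics


def population_std(xs):
    """Population standard deviation: divide the squared deviations by N."""
    N = len(xs)
    if N == 0:
        raise ValueError("the population must contain at least one item")
    mu = sum(xs) / N
    return math.sqrt(sum((x - mu) ** 2 for x in xs) / N)


data = [18, 28, 17, 15, 28]  # treated as an entire population for this sketch
print(population_std(data), statistics.pstdev(data), statistics.stdev(data))
```

`statistics.pstdev` divides by N and should agree with `population_std`, while `statistics.stdev` divides by n – 1 and gives a slightly larger value on the same numbers.
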



Descriptive Statistics and
Statistical Inference
! Suppose we are interested in the mean (µ) and
the standard deviation (σ) of a population.
! We can take a sample and use the sample
mean ($\bar{x}$) and the sample standard deviation
($S_x$) to tell us about µ and σ.
! The reason we divide by n – 1 in the sample
standard deviation is to remove some bias
caused by the sampling: dividing by n would
tend to underestimate the population variation.
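
The effect of the divisor can be seen exactly on a tiny case. The sketch below is an illustration under stated assumptions: a made-up population of four values, and every possible sample of size 2 drawn with replacement. It compares the squared quantities (variances), which is where dividing by n – 1 removes the bias exactly.

```python
from itertools import product

population = [1, 2, 3, 4]  # hypothetical tiny population
N = len(population)
mu = sum(population) / N
sigma_sq = sum((x - mu) ** 2 for x in population) / N  # population variance


def variance(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / divisor


n = 2
samples = list(product(population, repeat=n))  # all 16 samples with replacement

avg_div_n = sum(variance(s, n) for s in samples) / len(samples)
avg_div_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)

print(sigma_sq, avg_div_n, avg_div_n_minus_1)
```

Averaged over all samples, the n – 1 version matches the population variance, while the n version falls short. After taking square roots a small bias remains in the standard deviation, which is consistent with the slide saying that only some of the bias is removed.
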



Homework

! Section 3.3
! Pages 154 – 159
! Practice: 3.55 (a,b), 3.65
! Turn In: 3.54 (a,b), 3.64
