Statistics For Business and Economics: Describing Data: Graphical
6th Edition
Chapter 2
Chap 2-1
Chapter Goals
After completing this chapter, you should be able to:
Identify types of data and levels of measurement
Create and interpret graphs to describe categorical variables: frequency distribution, bar chart, pie chart, Pareto diagram
Create a line chart to describe time-series data
Create and interpret graphs to describe numerical variables: frequency distribution, histogram, ogive, stem-and-leaf display
Construct and interpret graphs to describe relationships between variables: scatter plot, cross table
Describe appropriate and inappropriate ways to display data graphically
Chap 2-2
Types of Data
Data
Categorical (defined categories or groups)
Examples: Marital Status; Are you registered to vote?; Eye Color
Numerical
Discrete
Continuous
Chap 2-3
Measurement Levels
Quantitative Data
Ratio Data: differences between measurements, true zero exists
Interval Data: differences between measurements, but no true zero
Qualitative Data
Ordinal Data
Nominal Data
Chap 2-4
Data in raw form are usually not easy to use for decision making
Some type of organization is needed:
Table
Graph
The type of graph to use depends on the variable being summarized
Chap 2-5
Chap 2-6
Graphing Data
Bar Chart
Pie Chart
Pareto Diagram
Chap 2-7
Bar charts and Pie charts are often used for qualitative (category) data
Height of bar or size of pie slice shows the frequency or percentage for each category
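As a rough illustration of where bar heights and pie slice sizes come from, the sketch below tallies a categorical variable into frequencies and percentages. The eye color responses are hypothetical data invented for this example, and `category_summary` is an illustrative helper name, not something from the chapter.

```python
from collections import Counter

# Hypothetical survey responses (illustrative only, not from the chapter).
eye_colors = ["brown", "blue", "brown", "green",
              "brown", "blue", "brown", "hazel"]

def category_summary(values):
    """Return {category: (frequency, percentage)} for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}

summary = category_summary(eye_colors)
for category, (freq, pct) in summary.items():
    # A bar's height (or a pie slice's size) would be freq or pct.
    print(f"{category:<6} frequency={freq} percentage={pct:.1f}%")
```

Either column can drive the chart: frequencies give bar heights in counts, percentages give slice sizes that sum to 100.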
Chap 2-9
[Figure: pie chart example; surviving slice label: Emergency 25%]
Chap 2-11
Pareto Diagram
Used to portray categorical data
A bar chart, where categories are shown in descending order of frequency
A cumulative polygon is often shown in the same graph
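The two ingredients of a Pareto diagram, the descending ordering and the cumulative polygon, can be sketched as a small table computation. The defect counts below are hypothetical values made up for illustration (only the category names appear in the chapter), and `pareto_table` is an assumed helper name.

```python
# Hypothetical defect counts (assumed for illustration; not the chapter's data).
defect_counts = {
    "Bad Weld": 20,
    "Poor Alignment": 120,
    "Missing Part": 10,
    "Paint Flaw": 40,
    "Electrical Short": 4,
    "Cracked case": 6,
}

def pareto_table(counts):
    """Sort categories by descending frequency and add cumulative percentages."""
    total = sum(counts.values())
    rows = []
    running = 0
    for category, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += n
        # (category, frequency, percentage, cumulative percentage)
        rows.append((category, n, 100 * n / total, 100 * running / total))
    return rows

rows = pareto_table(defect_counts)
for category, n, pct, cum_pct in rows:
    print(f"{category:<17} {n:>4} {pct:>6.1f}% {cum_pct:>6.1f}%")
```

The third column gives the bar heights and the fourth gives the points of the cumulative polygon, which always ends at 100%.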
Chap 2-12
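[Figure: Pareto diagram; bars for Poor Alignment, Paint Flaw, Bad Weld, Missing Part, Cracked case, Electrical Short in descending order of frequency, with a cumulative polygon; percentage scales from 0% to 100%]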
Chap 2-15
A line chart (time-series plot) is used to show the values of a variable over time
Chap 2-16
[Figure: line chart; vertical axis: Thousands of subscribers; horizontal axis: year, 1990 to 2006]
Chap 2-17
Stem-and-Leaf Display
Histogram
Ogive
Chap 2-18
Frequency Distributions
What is a Frequency Distribution?
A frequency distribution is a list or a table containing class groupings (categories or ranges within which the data fall) ... and the corresponding frequencies with which data fall within each class or category
Chap 2-19
The distribution condenses the raw data into a more useful form and allows for a quick visual interpretation of the data
Chap 2-20
Each class grouping has the same width
Determine the width of each interval by
$$w = \text{interval width} = \frac{\text{largest number} - \text{smallest number}}{\text{number of desired intervals}}$$
Use at least 5 but no more than 15-20 intervals
Intervals never overlap
Round up the interval width to get desirable interval endpoints
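The width rule can be tried out numerically. The sketch below uses the smallest and largest values from the chapter's example (12 and 58); the choice of 5 intervals and the rounding step of 10 are assumptions made for illustration, and `interval_width` is a hypothetical helper name.

```python
import math

def interval_width(smallest, largest, n_intervals, round_to=1):
    """Return (raw width, width rounded UP to a multiple of round_to).

    raw width = (largest - smallest) / n_intervals
    """
    raw = (largest - smallest) / n_intervals
    rounded = math.ceil(raw / round_to) * round_to
    return raw, rounded

# Range 58 - 12 = 46; 5 intervals and round_to=10 are assumed choices.
raw, width = interval_width(12, 58, 5, round_to=10)
print(f"raw width = {raw}, rounded-up width = {width}")
```

Rounding up rather than to the nearest value guarantees that the chosen number of intervals still covers the whole range.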
Chap 2-21
Chap 2-22
Find range: 58 - 12 = 46
Determine interval boundaries: 10 but less than 20, 20 but less than 30, ..., 50 but less than 60
Count observations & assign to classes
Chap 2-23
Interval                Frequency   Relative Frequency   Percentage
10 but less than 20         3             0.15               15
20 but less than 30         6             0.30               30
30 but less than 40         5             0.25               25
40 but less than 50         4             0.20               20
50 but less than 60         2             0.10               10
Total                      20             1.00              100
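A minimal sketch of how such a table could be produced from raw observations follows. The raw data are not reproduced in this section, so the 20 values below are assumed for illustration: they were chosen to span 12 to 58 and to give the class counts 3, 6, 5, 4, 2. The function name `frequency_distribution` is likewise hypothetical.

```python
# Assumed illustrative data (not from the source): 20 values from 12 to 58.
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

def frequency_distribution(values, start, width, n_classes):
    """Count values in classes 'lower but less than lower + width'.

    Returns rows of (lower, upper, frequency, relative frequency, percentage).
    Values outside the classes are not counted in any class.
    """
    counts = [0] * n_classes
    for v in values:
        k = int((v - start) // width)
        if 0 <= k < n_classes:
            counts[k] += 1
    total = len(values)
    rows = []
    for k, freq in enumerate(counts):
        lower = start + k * width
        rows.append((lower, lower + width, freq, freq / total, 100 * freq / total))
    return rows

rows = frequency_distribution(data, start=10, width=10, n_classes=5)
for lower, upper, freq, rel, pct in rows:
    print(f"{lower} but less than {upper}: {freq:>2}  {rel:.2f}  {pct:.0f}%")
```

Because each class includes its lower endpoint and excludes its upper one, a value such as 30 falls in "30 but less than 40", so the intervals never overlap.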
Chap 2-24
Histogram
Bars of the appropriate heights are used to represent the number of observations within each class
Chap 2-25
Histogram Example
Interval                Frequency
10 but less than 20         3
20 but less than 30         6
30 but less than 40         5
40 but less than 50         4
50 but less than 60         2
[Figure: histogram; vertical axis: Frequency (0 to 7); horizontal axis: Temperature in Degrees (0 to 70); bar heights 0, 3, 6, 5, 4, 2, 0]
Chap 2-26
Histograms in Excel
Chap 2-27
Histograms in Excel
(continued)
2. Choose Histogram
3. Input data range and bin range (bin range is a cell range containing the upper interval endpoints for each class grouping)
How wide each interval should be (and so how many classes to use) is often answered by trial and error, subject to user judgment
The goal is to create a distribution that is neither too "jagged" nor too "blocky"
The goal is to appropriately show the pattern of variation in the data
Chap 2-29
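Many narrow class intervals
may yield a very jagged distribution with gaps from empty classes
can give a poor indication of how frequency varies across classes
[Figure: histogram with many narrow intervals; vertical axis: Frequency; horizontal axis: Temperature]
Few wide class intervals
may compress variation too much and yield a blocky distribution
can obscure important patterns of variation
[Figure: histogram with bins 0, 30, 60, More; vertical axis: Frequency; horizontal axis: Temperature]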
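The trade-off between jagged and blocky groupings can be seen by counting the same observations under two different interval widths. This is only a sketch: the 20 values are assumed illustrative data (not the chapter's), the widths 2 and 30 are arbitrary choices, and `class_counts` is a hypothetical helper.

```python
# Assumed illustrative data (not from the source): 20 values from 12 to 58.
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

def class_counts(values, start, width, n_classes):
    """Count values in equal-width classes [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        k = (v - start) // width
        if 0 <= k < n_classes:
            counts[k] += 1
    return counts

narrow = class_counts(data, start=10, width=2, n_classes=25)  # many narrow intervals
wide = class_counts(data, start=0, width=30, n_classes=2)     # few wide intervals

print("narrow:", narrow, "empty classes:", narrow.count(0))
print("wide:  ", wide)
```

With width 2 many classes are empty and the counts jump between 0 and 3, while with width 30 everything collapses into two nearly equal bars.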
Chap 2-30
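Class                   Frequency   Percentage   Cumulative Frequency   Cumulative Percentage
10 but less than 20         3           15                 3                     15
20 but less than 30         6           30                 9                     45
30 but less than 40         5           25                14                     70
40 but less than 50         4           20                18                     90
50 but less than 60         2           10                20                    100
Total                      20          100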
Chap 2-31
The Ogive
Graphing Cumulative Frequencies
Interval                Upper interval endpoint   Cumulative Percentage
Less than 10                      10                        0
10 but less than 20               20                       15
20 but less than 30               30                       45
30 but less than 40               40                       70
40 but less than 50               50                       90
50 but less than 60               60                      100
[Figure: ogive; vertical axis: cumulative percentage (0 to 100); horizontal axis: interval endpoints (10 to 60)]
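The cumulative percentages plotted in an ogive follow directly from the class frequencies. The sketch below starts from the frequencies 3, 6, 5, 4, 2 used in the chapter's example; the variable names are illustrative.

```python
from itertools import accumulate

frequencies = [3, 6, 5, 4, 2]          # class frequencies from the example
upper_endpoints = [20, 30, 40, 50, 60]  # upper endpoint of each class

total = sum(frequencies)
cumulative = list(accumulate(frequencies))           # running totals
cumulative_pct = [100 * c / total for c in cumulative]

# The ogive starts at 0% at the lower endpoint of the first class (10),
# then passes through each (upper endpoint, cumulative percentage) point.
ogive_points = [(10, 0.0)] + list(zip(upper_endpoints, cumulative_pct))
for endpoint, pct in ogive_points:
    print(f"less than {endpoint}: {pct:.0f}%")
```

Plotting `ogive_points` and joining them with straight lines gives the ogive.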
Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 2-32
Distribution Shape
The shape of the distribution is said to be symmetric if the observations are balanced, or evenly distributed, about the center.
[Figure: Symmetric Distribution; histogram with vertical axis Frequency]
Chap 2-33
Distribution Shape
(continued)
The shape of the distribution is said to be skewed if the observations are not symmetrically distributed around the center.
A positively skewed distribution (skewed to the right) has a tail that extends to the right in the direction of positive values.
[Figure: Positively Skewed Distribution; histogram with vertical axis Frequency]
A negatively skewed distribution (skewed to the left) has a tail that extends to the left in the direction of negative values.
[Figure: Negatively Skewed Distribution; histogram with vertical axis Frequency]
Chap 2-34
Stem-and-Leaf Diagram
A simple way to see distribution details in a data set
METHOD: Separate the sorted data series into leading digits (the stem) and the trailing digits (the leaves)
Chap 2-35
Example
Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41
21 is shown as stem 2, leaf 1
38 is shown as stem 3, leaf 8
Chap 2-36
Example
(continued)
Stem   Leaves
2      1 4 4 6 7 7
3      0 2 8
4      1
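The method can be sketched in a few lines for the ordered array in this example, using the tens digit as the stem and the units digit as the leaf. The function name `stem_and_leaf` is an illustrative choice, and the sketch assumes non-negative whole numbers.

```python
from collections import defaultdict

data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]

def stem_and_leaf(values):
    """Group sorted values by tens digit (stem); units digits are the leaves."""
    display = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        display[stem].append(leaf)
    return dict(display)

display = stem_and_leaf(data)
for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Unlike a frequency table, the display keeps every original value recoverable: stem 3 with leaf 8 is the observation 38.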
Chap 2-37
Stem   Leaf
6      1
7      8
12     2
Chap 2-38
Stem   Leaves
9      1 3 3 6 8
10     3 5 6
11     4 7
12     2
Chap 2-39
Graphs illustrated so far have involved only a single variable
When two variables exist other techniques are used:
Categorical (Qualitative) Variables: Cross tables
Numerical (Quantitative) Variables: Scatter plots
Chap 2-40
Scatter Diagrams
Scatter Diagrams are used for paired observations taken from two numerical variables
The Scatter Diagram: one variable is measured on the vertical axis and the other variable is measured on the horizontal axis
Chap 2-41
x      y
23     125
26     140
29     146
33     160
38     167
42     170
50     188
55     195
60     200
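To make the idea of paired observations concrete, the sketch below turns the two columns of values listed above into (x, y) points, the form a scatter diagram plots. It assumes the values pair up in the order given; the names `x`, `y`, and `rises_together` are illustrative, since the variable names are not preserved here.

```python
# Values from the example, assumed to pair up in the order listed.
x = [23, 26, 29, 33, 38, 42, 50, 55, 60]
y = [125, 140, 146, 160, 167, 170, 188, 195, 200]

# Each observation becomes one point: x on the horizontal axis, y on the vertical.
points = list(zip(x, y))

def rises_together(pts):
    """True if y strictly increases whenever the points are ordered by x."""
    ordered = sorted(pts)
    return all(b[1] > a[1] for a, b in zip(ordered, ordered[1:]))

for px, py in points:
    print(f"({px}, {py})")
print("y increases with x:", rises_together(points))

# With matplotlib installed, plt.scatter(x, y) would draw these points.
```

The check at the end is only a crude summary of what the plot shows at a glance: the points climb from lower left to upper right.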
Chap 2-42
2. Select XY(Scatter) option, then click Next
3. When prompted, enter the data range, desired legend, and desired destination to complete the scatter diagram
Chap 2-43
Cross Tables
Cross Tables (or contingency tables) list the number of observations for every combination of values for two categorical or ordinal variables
If there are r categories for the first variable (rows) and c categories for the second variable (columns), the table is called an r x c cross table
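A small sketch of how an r x c cross table could be tallied from raw observations is given below. The six observations are hypothetical, invented for illustration using two of the categorical variables mentioned earlier (marital status and voter registration), and `cross_table` is an assumed helper name.

```python
from collections import Counter

# Hypothetical paired observations: (marital status, registered to vote?)
observations = [
    ("Married", "Yes"), ("Single", "No"), ("Married", "Yes"),
    ("Single", "Yes"), ("Married", "No"), ("Single", "Yes"),
]

def cross_table(pairs):
    """Count observations for every combination of row and column category."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table

rows, cols, table = cross_table(observations)
print(f"{len(rows)} x {len(cols)} cross table")
print(" " * 8, *cols)
for r, counts_row in zip(rows, table):
    print(f"{r:<8}", *counts_row)
```

Combinations that never occur still get a cell, with a count of 0, because `Counter` returns 0 for missing keys.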
Chap 2-44
Investment Category   Col. 1   Col. 2   Col. 3   Total
Stocks                  46.5     55      27.5     129
Bonds                   32.0     44      19.0      95
CD                      15.5     20      13.5      49
Savings                 16.0     28       7.0      51
Total                  110.0    147      67.0     324
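The totals in this example can be checked by summing the cells. The sketch assumes the values form four investment categories across three unlabeled columns, with the remaining figures being row, column, and grand totals; the variable names are illustrative.

```python
# Cell values from the example; the three column labels are not preserved,
# so the columns are kept as positions 0, 1, 2 (an assumption of this sketch).
cells = {
    "Stocks":  [46.5, 55, 27.5],
    "Bonds":   [32.0, 44, 19.0],
    "CD":      [15.5, 20, 13.5],
    "Savings": [16.0, 28, 7.0],
}

row_totals = {category: sum(values) for category, values in cells.items()}
column_totals = [sum(col) for col in zip(*cells.values())]
grand_total = sum(row_totals.values())

for category, total in row_totals.items():
    print(f"{category:<8} {total}")
print("Columns:", column_totals)
print("Grand total:", grand_total)
```

The row totals and the column totals must both add up to the same grand total, which is a quick consistency check on any cross table.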
Chap 2-45
Chap 2-46
Present data to display essential information
Communicate complex ideas clearly and accurately
Chap 2-48
Unequal histogram interval widths
Compressing or distorting the vertical axis
Providing no zero point on the vertical axis
Failing to provide a relative basis in comparing data between groups
Chap 2-49
Chapter Summary
Reviewed types of data and measurement levels
Data in raw form are usually not easy to use for decision making. Some type of organization is needed:
Table
Graph
Line chart
Frequency distribution
Histogram and ogive
Stem-and-leaf display
Scatter plot
Cross tables and side-by-side bar charts
Chap 2-50