Chapter 2

Organizing and
Visualizing Variables

ALWAYS LEARNING Copyright © 2020 Pearson Education Ltd. 1


Objectives
In this chapter you learn:
■ How the DCOVA framework guides your application of
statistics.
■ To understand the different measurement scales.
■ How to organize and visualize categorical variables.
■ How to organize and visualize numerical variables.



To Properly Apply Statistics You Should Follow A
Framework To Minimize Possible Errors (DCOVA)

❑ Define the data you want to study to solve a problem or meet an objective.
❑ Collect the data from appropriate sources. (Chapter 1)
❑ Organize the data collected by developing tables. (Chapter 2)
❑ Visualize the data collected by developing charts. (Chapter 2)
❑ Analyze the data collected to reach conclusions and present those results.



Definitions

VARIABLE
A characteristic or property of an item or individual.

DATA
The set of values associated with one or more
variables.

Classifying Variables By Type
DCOVA
▪ Categorical (qualitative) variables take categories as
their values such as “yes”, “no”, or “blue”, “brown”,
“green”.

▪ Numerical (quantitative) variables have values that
represent a counted or measured quantity.
    ▪ Discrete variables arise from a counting process.
    ▪ Continuous variables arise from a measuring process.



Two Types of Numerical Variables…
1) Discrete Random Variable
– one that takes on a countable number of values.

Examples:
X = sum of the values on a roll of two dice:
X has to be 2, 3, 4, …, or 12.
Y = number of accidents in Doha during a week:
Y has to be 0, 1, 2, 3, …, up to some very large number.
Z = number of children in a family:
Z has to be 0, 1, 2, 3, …
Two Types of Numerical Variables…
2) Continuous Random Variable
– one whose values are not discrete and not countable.
Example 1:
- Let X = time to write a statistics exam in a university
where the time limit is 3 hours and students cannot
leave before 30 minutes.
➢ The smallest value of X is 30 minutes.
➢ What is the next value: is it 30.1? 30.01? 30.001? 30.0001?
➢ No next-smallest value can be named, so the values of X cannot be counted.



Examples of Types of Variables
DCOVA

Question                                       Responses          Variable Type

Do you have a Facebook profile?                Yes or No          Categorical

How many text messages have you sent           ---------------    Numerical (discrete)
in the past three days?

How long did the mobile app update             ---------------    Numerical (continuous)
take to download?



Types of Variables
DCOVA
Variables
    Categorical
        Nominal (Defined Categories)
            Examples:
            ■ Marital Status
            ■ Political Party
            ■ Eye Color
        Ordinal (Ordered Categories)
            Examples: Ratings
            ■ Good, Better, Best
            ■ Low, Med, High
    Numerical
        Discrete (Counted items)
            Examples:
            ■ Number of Children
            ■ Defects per hour
        Continuous (Measured characteristics)
            Examples:
            ■ Weight
            ■ Voltage



Measurement Scales
DCOVA
A nominal scale classifies data into distinct
categories in which no ranking is implied.

Categorical Variables                Categories

Do you have a Facebook profile?      Yes, No
Type of investment                   Growth, Value, Other
Cellular Provider                    AT&T, Sprint, Verizon, Other, None



Measurement Scales (cont.)
DCOVA
An ordinal scale classifies data into distinct
categories in which ranking is implied.
Categorical Variable               Ordered Categories

Student class designation          Freshman, Sophomore, Junior, Senior
Product satisfaction               Very unsatisfied, Fairly unsatisfied, Neutral,
                                   Fairly satisfied, Very satisfied
Faculty rank                       Professor, Associate Professor,
                                   Assistant Professor, Instructor
Standard & Poor’s bond ratings     AAA, AA, A, BBB, BB, B, CCC, CC, C, DDD, DD, D
Student Grades                     A, B, C, D, F

Measurement Scales (cont.)
DCOVA
▪ An interval scale is an ordered scale in which the difference
between measurements is a meaningful quantity.
▪ Interval Data are real numbers, such as heights, weights,
prices, distance, etc.
▪ Arithmetic operations can be performed on Interval Data
➢ Also referred to as quantitative or numerical.



Exercise
■ Identify each of the following as examples of
nominal, ordinal or numerical data.
1. The temperature in Doha on any given day

2. The make (Toyota, Honda, etc.) of automobile

3. Whether or not a student is married

4. The weight of a pencil

5. The length of time billed for a long distance telephone call

6. The brand quality of cereal

7. The type of book (fiction, drama, etc.)


Organizing Data Creates Both Tabular
And Visual Summaries
DCOVA
■ Summaries both guide further exploration and
sometimes facilitate decision making.

■ Visual summaries enable rapid review of larger
amounts of data and show possible significant
patterns.

■ Often, the Organize and Visualize steps in
DCOVA occur concurrently.



Categorical Data Are Organized By Utilizing
Tables
DCOVA
Categorical Data
    Tallying Data
        One Categorical Variable: Summary Table
        Two Categorical Variables: Contingency Table

Organizing Categorical Data: Summary Table
DCOVA
▪ A summary table tallies the frequencies or percentages of items in a set of
categories so that you can see differences between categories.

Devices Millennials Use to Watch Movies or Television Shows

Devices Used To Watch Movies or TV Shows    Percent
Television Set                              49%
Tablet                                       9%
Smartphone                                  10%
Laptop / Desktop                            32%

Source: Data extracted and adapted from A. Sharma, “Big Media Needs to Embrace
Digital Shift Not Fight It,” Wall Street Journal, June 22, 2016, p. 1-2.

A Contingency Table Helps Organize
Two or More Categorical Variables
DCOVA
■ Used to study patterns that may exist between the
responses of two or more categorical variables.

■ Cross tabulates or tallies jointly the responses of
the categorical variables.

■ For two variables the tallies for one variable are
located in the rows and the tallies for the second
variable are located in the columns.

Contingency Table - Example
DCOVA
■ A random sample of 400 invoices is drawn.
■ Each invoice is categorized as a small, medium, or large amount.
■ Each invoice is also examined to identify if there are any errors.
■ These data are then organized in the contingency table below.

Contingency Table Showing Frequency of Invoices Categorized
By Size and The Presence Of Errors

                 No Errors   Errors   Total
Small Amount        170        20      190
Medium Amount       100        40      140
Large Amount         65         5       70
Total               335        65      400



Contingency Table Based On Percentage Of Overall Total
DCOVA

                 No Errors   Errors   Total
Small Amount        170        20      190
Medium Amount       100        40      140
Large Amount         65         5       70
Total               335        65      400

42.50% = 170 / 400
25.00% = 100 / 400
16.25% = 65 / 400

                 No Errors   Errors    Total
Small Amount       42.50%     5.00%    47.50%
Medium Amount      25.00%    10.00%    35.00%
Large Amount       16.25%     1.25%    17.50%
Total              83.75%    16.25%    100.0%

83.75% of sampled invoices have no errors and 47.50%
of sampled invoices are for small amounts.
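
The overall-total percentages can be checked with a few lines of Python. This is a minimal sketch: the dictionary name `counts` and the rounding to two decimals are illustrative choices, and only the invoice counts themselves come from the slide.

```python
# Invoice counts from the contingency table.
# Each row is (no errors, errors) for one invoice size.
counts = {
    "Small": (170, 20),
    "Medium": (100, 40),
    "Large": (65, 5),
}

# Grand total of all sampled invoices.
grand_total = sum(sum(row) for row in counts.values())

# Express every cell as a percentage of the grand total.
overall_pct = {
    size: tuple(round(100 * cell / grand_total, 2) for cell in row)
    for size, row in counts.items()
}

for size, (no_err, err) in overall_pct.items():
    print(f"{size:<7} {no_err:6.2f}% {err:6.2f}%")
```

Dividing by the grand total answers "what share of all invoices falls in this cell", which is why the six cells together add up to 100%.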
Contingency Table Based On Percentage of Row Totals
DCOVA

                 No Errors   Errors   Total
Small Amount        170        20      190
Medium Amount       100        40      140
Large Amount         65         5       70
Total               335        65      400

89.47% = 170 / 190
71.43% = 100 / 140
92.86% = 65 / 70

                 No Errors   Errors    Total
Small Amount       89.47%    10.53%    100.0%
Medium Amount      71.43%    28.57%    100.0%
Large Amount       92.86%     7.14%    100.0%
Total              83.75%    16.25%    100.0%

Medium invoices have a larger chance (28.57%) of having
errors than small (10.53%) or large (7.14%) invoices.
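
Row percentages divide each cell by the total of its own row. The sketch below assumes the same invoice counts as the slide; the variable names are hypothetical.

```python
# Invoice counts: (no errors, errors) for each invoice size.
counts = {
    "Small": (170, 20),
    "Medium": (100, 40),
    "Large": (65, 5),
}

row_pct = {}
for size, (no_err, err) in counts.items():
    row_total = no_err + err  # total invoices of this size
    row_pct[size] = (
        round(100 * no_err / row_total, 2),
        round(100 * err / row_total, 2),
    )

for size, (no_err, err) in row_pct.items():
    print(f"{size:<7} {no_err:6.2f}% {err:6.2f}%")
```

Each row now sums to 100%, so the error rates of the three invoice sizes can be compared directly even though the sizes have different sample counts.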
Contingency Table Based On Percentage Of Column Totals
DCOVA

                 No Errors   Errors   Total
Small Amount        170        20      190
Medium Amount       100        40      140
Large Amount         65         5       70
Total               335        65      400

50.75% = 170 / 335
30.77% = 20 / 65

                 No Errors   Errors    Total
Small Amount       50.75%    30.77%    47.50%
Medium Amount      29.85%    61.54%    35.00%
Large Amount       19.40%     7.69%    17.50%
Total              100.0%    100.0%    100.0%

An invoice with errors has a 61.54% chance of being
of medium size.
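
Column percentages divide each cell by the total of its column instead. A minimal sketch, again using the slide's counts with illustrative names:

```python
# Invoice counts: (no errors, errors) for each invoice size.
counts = {
    "Small": (170, 20),
    "Medium": (100, 40),
    "Large": (65, 5),
}

# Column totals: all invoices without errors, all invoices with errors.
col_totals = [sum(col) for col in zip(*counts.values())]

# Express every cell as a percentage of its column total.
col_pct = {
    size: tuple(
        round(100 * cell / total, 2) for cell, total in zip(row, col_totals)
    )
    for size, row in counts.items()
}

for size, (no_err, err) in col_pct.items():
    print(f"{size:<7} {no_err:6.2f}% {err:6.2f}%")
```

Here each column sums to 100%, so the second column describes how invoices that contain errors are spread across the three sizes.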



Tables Used For Organizing
Numerical Data
DCOVA
Numerical Data
    Ordered Array
    Frequency Distributions
    Cumulative Distributions



Organizing Numerical Data:
Ordered Array
DCOVA
▪ An ordered array is a sequence of data, in rank order, from the smallest
value to the largest value.
▪ Shows range (minimum value to maximum value).
▪ May help identify outliers (unusual observations).

Age of Surveyed College Students

Day Students
16 17 17 18 18 18
19 19 20 20 21 22
22 25 27 32 38 42

Night Students
18 18 19 19 20 21
23 28 32 33 41 45
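
Building an ordered array amounts to sorting the raw values. The sketch below uses the night students' ages; the scrambled input order is an assumption made for illustration, since the slide lists the ages already sorted.

```python
# Night students' ages, deliberately scrambled here (hypothetical input order).
night_ages = [23, 18, 45, 19, 32, 20, 18, 41, 21, 33, 19, 28]

# The ordered array: values ranked from smallest to largest.
ordered = sorted(night_ages)

# The range is read off the two ends of the ordered array.
smallest, largest = ordered[0], ordered[-1]

print(ordered)
print(f"range: {smallest} to {largest}")
```

Once the values are ordered, unusually large or small observations sit at the ends of the list, which is what makes outliers easier to spot.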



Organizing Numerical Data:
Frequency Distribution
DCOVA
▪ The frequency distribution is a summary table in which the data are arranged into
numerically ordered classes.

▪ You must give attention to selecting the appropriate number of class groupings for
the table, determining a suitable width of a class grouping, and establishing the
boundaries of each class grouping to avoid overlapping.

▪ The number of classes depends on the number of values in the data. With a larger
number of values, typically there are more classes. In general, a frequency
distribution should have at least 5 but no more than 15 classes.

▪ To determine the width of a class interval, you divide the range (Highest value -
Lowest value) of the data by the number of class groupings desired.



Organizing Numerical Data:
Frequency Distribution Example
DCOVA

Example 1: A manufacturer of insulation randomly
selects 20 winter days and records the daily high
temperature.

24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27



Organizing Numerical Data:
Frequency Distribution Example
DCOVA
1) Sort raw data in ascending order:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58.
2) Find range: Range = Largest Observation – Smallest Observation =58 - 12 =
46.
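3) Select number of classes:
Method 1: choose the number directly: Number of classes = 5
Method 2: alternatively, we could use Sturges’ formula:
Number of class intervals = 1 + 3.3 log (n)
where log is the base-10 logarithm and n is the number of observations.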
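
As a quick check of Sturges' formula for the 20 temperature readings, the sketch below evaluates it in Python. The function name `sturges_classes` is hypothetical, and rounding the result to the nearest whole number is one common convention rather than something the slide prescribes.

```python
import math


def sturges_classes(n: int) -> float:
    """Suggested number of class intervals: 1 + 3.3 * log10(n)."""
    return 1 + 3.3 * math.log10(n)


k = sturges_classes(20)   # 20 daily high temperatures
suggested = round(k)      # nearest whole number of classes

print(f"Sturges value: {k:.2f}, suggested classes: {suggested}")
```

For n = 20 the formula gives a value a little above 5, which agrees with the choice of 5 classes in the example.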



Organizing Numerical Data:
Frequency Distribution Example
DCOVA
4) Compute class interval (width): 10 (46/5 then round up).
Range ÷ (# classes) = 46 ÷ 5=9.2 ≈ 10 (or 9)
5) Determine class boundaries (limits):
▪ Class 1: 10 but less than 20.
▪ Class 2: 20 but less than 30.
▪ Class 3: 30 but less than 40.
▪ Class 4: 40 but less than 50.
▪ Class 5: 50 but less than 60.

6) Compute class midpoints: 15, 25, 35, 45, 55.

7) Count observations & assign to classes.
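
Steps 2 to 6 can be sketched in Python as follows. Rounding the width up to the next multiple of 10 and starting the first class at a multiple of 10 are assumptions chosen to match the slide's convenient boundaries; other roundings (such as a width of 9) are equally valid.

```python
import math

data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]
num_classes = 5

# Step 2: range of the data.
data_range = max(data) - min(data)

# Step 4: raw class width, then rounded up to a convenient multiple of 10.
raw_width = data_range / num_classes
width = math.ceil(raw_width / 10) * 10

# Step 5: start at a convenient value at or below the minimum.
start = math.floor(min(data) / 10) * 10
boundaries = [(start + i * width, start + (i + 1) * width)
              for i in range(num_classes)]

# Step 6: each midpoint is halfway between its class boundaries.
midpoints = [(lo + hi) / 2 for lo, hi in boundaries]

for (lo, hi), mid in zip(boundaries, midpoints):
    print(f"{lo} but less than {hi}: midpoint {mid}")
```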



Organizing Numerical Data: Frequency
Distribution Example
DCOVA
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class                  Midpoints   Frequency

10 but less than 20       15           3
20 but less than 30       25           6
30 but less than 40       35           5
40 but less than 50       45           4
50 but less than 60       55           2
Total                                 20
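
Counting the observations in each class can be sketched as below. The half-open test `lo <= x < hi` mirrors the "but less than" wording of the class labels; the variable names are illustrative.

```python
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]

# A value belongs to a class if it is at least the lower boundary
# and strictly less than the upper boundary.
frequencies = [sum(1 for x in data if lo <= x < hi) for lo, hi in classes]

for (lo, hi), freq in zip(classes, frequencies):
    print(f"{lo} but less than {hi}: {freq}")
print("Total:", sum(frequencies))
```

Because every boundary value belongs to exactly one class, no observation is counted twice and the frequencies add up to the sample size.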



Organizing Numerical Data: Relative &
Percent Frequency Distribution Example
DCOVA
Class                  Frequency   Relative Frequency   Percentage
10 but less than 20        3             .15               15%
20 but less than 30        6             .30               30%
30 but less than 40        5             .25               25%
40 but less than 50        4             .20               20%
50 but less than 60        2             .10               10%
Total                     20            1.00              100%

Relative Frequency = Frequency / Total, e.g. 0.10 = 2 / 20
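
The relative frequency and percentage columns follow directly from the class frequencies, as in this short sketch (names are illustrative):

```python
frequencies = [3, 6, 5, 4, 2]   # class frequencies from the table
total = sum(frequencies)        # number of observations

# Relative frequency = frequency / total; percentage = 100 * that share.
relative = [f / total for f in frequencies]
percentage = [100 * f / total for f in frequencies]

for f, r, p in zip(frequencies, relative, percentage):
    print(f"{f:2d}  {r:.2f}  {p:.0f}%")
```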



Organizing Numerical Data: Cumulative
Frequency Distribution Example
DCOVA
Class                  Frequency   Percentage   Cumulative Frequency   Cumulative Percentage

10 but less than 20        3          15%                3                     15%
20 but less than 30        6          30%                9                     45%
30 but less than 40        5          25%               14                     70%
40 but less than 50        4          20%               18                     90%
50 but less than 60        2          10%               20                    100%

Total                     20         100%               20                    100%

Cumulative Percentage = Cumulative Frequency / Total * 100 e.g. 45% = 100*9/20
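
A running total gives the cumulative columns. The sketch below uses `itertools.accumulate` from the Python standard library; the variable names are illustrative.

```python
from itertools import accumulate

frequencies = [3, 6, 5, 4, 2]   # class frequencies from the table
total = sum(frequencies)

# Running total of the frequencies, class by class.
cumulative_freq = list(accumulate(frequencies))

# Cumulative percentage = cumulative frequency / total * 100.
cumulative_pct = [100 * c / total for c in cumulative_freq]

for c, p in zip(cumulative_freq, cumulative_pct):
    print(f"{c:2d}  {p:.0f}%")
```

The last cumulative frequency always equals the sample size, and the last cumulative percentage is always 100%, which is a handy check on the arithmetic.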



Why Use a Frequency Distribution?
DCOVA
■ It condenses the raw data into a more
useful form.
■ It allows for a quick visual interpretation of
the data.
■ It enables the determination of the major
characteristics of the data set including
where the data are concentrated /
clustered.



Frequency Distributions:
Some Tips
DCOVA
■ Different class boundaries may provide different pictures for
the same data (especially for smaller data sets).

■ Shifts in data concentration may show up when different class
boundaries are chosen.

■ As the size of the data set increases, the impact of alterations
in the selection of class boundaries is greatly reduced.

■ When comparing two or more groups with different sample
sizes, you must use either a relative frequency or a
percentage distribution.



Visualizing Categorical Data Through
Graphical Displays
DCOVA
Categorical Data
    Visualizing Data
        Summary Table For One Variable
            Bar Chart
            Pie or Doughnut Chart
        Contingency Table For Two Variables
            Side By Side Bar Chart



Visualizing Categorical Data:
The Bar Chart
DCOVA
▪ The bar chart visualizes a categorical variable as a series of bars. The
length of each bar represents either the frequency or percentage of values
for each category. Each bar is separated by a space called a gap.

Devices Used to Watch    Percent
Television Set           49%
Tablet                    9%
Smartphone               10%
Laptop / Desktop         32%



Visualizing Categorical Data:
The Pie Chart
DCOVA
▪ The pie chart is a circle broken up into slices that represent categories.
The size of each slice of the pie varies according to the percentage in
each category.

Devices Used to Watch    Percent
Television Set           49%
Tablet                    9%
Smartphone               10%
Laptop / Desktop         32%



Visualizing Categorical Data:
The Doughnut Chart DCOVA
▪ The doughnut chart is the outer part of a circle broken up into pieces
that represent categories. The size of each piece of the doughnut varies
according to the percentage in each category.

Devices Used to Watch    Percent
Television Set           49%
Tablet                    9%
Smartphone               10%
Laptop / Desktop         32%



Visualizing Categorical Data:
Side By Side Bar Charts DCOVA
▪ The side by side bar chart represents the data from a contingency
table.
                 No Errors   Errors    Total
Small Amount       50.75%    30.77%    47.50%
Medium Amount      29.85%    61.54%    35.00%
Large Amount       19.40%     7.69%    17.50%
Total              100.0%    100.0%    100.0%

Invoices with errors are much more likely to be of
medium size (61.5% vs 30.8% & 7.7%).
Visualizing Categorical Data:
Doughnut Charts DCOVA
▪ A Doughnut Chart can be used to represent the data from a contingency table.

                 No Errors   Errors    Total
Small Amount       50.75%    30.77%    47.50%
Medium Amount      29.85%    61.54%    35.00%
Large Amount       19.40%     7.69%    17.50%
Total              100.0%    100.0%    100.0%

Invoices with errors are much more likely to be of
medium size (61.5% vs 30.8% & 7.7%).
Visualizing Categorical Data

Example 2: Newspaper Readership Survey

▪ In a major North American city, there are four competing
newspapers: Globe and Mail, Post, Star and Sun.

➢ To help design advertising campaigns, the advertising
managers of the newspapers need to know which
segments of the newspaper market are reading their
papers.

▪ A survey was conducted to analyze the relationship
between newspapers read and occupation.



Visualizing Categorical Data

Example 2: Newspaper Readership Survey

■ A sample of newspaper readers (354
participants) was asked to report which
newspaper they read: Globe and Mail (1), Post (2),
Star (3), Sun (4), and to indicate whether they
were a blue-collar worker (1), white-collar worker
(2), or professional (3).

■ Some of the data are listed here.



Visualizing Categorical Data
Example 2:
Reader Occupation Newspaper
1 2 2
2 1 4
3 2 1
. . . .
. . . .
352 3 2
353 1 3
354 2 3
➢ Determine whether the two nominal variables are related.



Visualizing Categorical Data

➢ By counting the number of times each of the 12 combinations occurs, we
produced the following table.

Newspaper
Occupation G&M Post Star Sun Total
Blue collar 27 18 38 37 120
White collar 29 43 21 15 108
Professional 33 51 22 20 126
Total 89 112 81 72 354
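
The tallying step can be sketched with `collections.Counter`. Only the six coded records printed on the earlier slide are used here, so the counts below do not reproduce the full 354-reader table; the dictionary names are illustrative.

```python
from collections import Counter

# Code books from the survey description.
OCCUPATIONS = {1: "Blue collar", 2: "White collar", 3: "Professional"}
NEWSPAPERS = {1: "G&M", 2: "Post", 3: "Star", 4: "Sun"}

# (occupation, newspaper) pairs for readers 1, 2, 3, 352, 353, 354 only.
records = [(2, 2), (1, 4), (2, 1), (3, 2), (1, 3), (2, 3)]

# Count how often each of the 12 possible combinations occurs.
tally = Counter(records)

print("Occupation   " + "".join(f"{name:>6}" for name in NEWSPAPERS.values()))
for occ_code, occ_name in OCCUPATIONS.items():
    row = [tally[(occ_code, paper_code)] for paper_code in NEWSPAPERS]
    print(f"{occ_name:<13}" + "".join(f"{count:>6}" for count in row))
```

A `Counter` returns 0 for combinations that never occur, so every cell of the cross-classification table is filled in even when some pairs are absent from the data.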



Visualizing Categorical Data
➢ By dividing each frequency by its row total, we produced the following table.

Newspaper
Occupation G&M Post Star Sun Total
Blue collar 27/120 18/120 38/120 37/120 120/120
White collar 29/108 43/108 21/108 15/108 108/108
Professional 33/126 51/126 22/126 20/126 126/126
Total 89/354 112/354 81/354 72/354 354/354



Visualizing Categorical Data
▪ If occupation and newspaper are related, then there will be differences in the
newspapers read among the occupations.

Newspaper
Occupation G&M Post Star Sun Total
Blue collar .23 .15 .32 .31 1.00
White collar .27 .40 .19 .14 1.00
Professional .26 .40 .17 .16 1.00
Total .25 .32 .23 .20 1.00
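
The row relative frequencies can be recomputed from the counts as a check. This sketch assumes two-decimal rounding, as on the slide; because of that rounding, a row of printed values may sum to slightly more or less than 1.00.

```python
# Readership counts by occupation, in the order G&M, Post, Star, Sun.
table = {
    "Blue collar": [27, 18, 38, 37],
    "White collar": [29, 43, 21, 15],
    "Professional": [33, 51, 22, 20],
}

# Divide each count by its row total and round to two decimals.
row_props = {
    occupation: [round(count / sum(row), 2) for count in row]
    for occupation, row in table.items()
}

for occupation, props in row_props.items():
    print(f"{occupation:<13}" + "".join(f"{p:6.2f}" for p in props))
```

If occupation and newspaper were unrelated, the three rows would look roughly alike; visible differences between rows are what suggest a relationship.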



Visualizing Categorical Data
❑ Use the data from the cross-classification table to create
bar charts…

Newspaper
Occupation      G&M   Post   Star   Sun   Total
Blue collar      27     18     38    37     120
White collar     29     43     21    15     108
Professional     33     51     22    20     126
Total            89    112     81    72     354

[Bar chart: Frequency (vertical axis, 0 to 120) by Occupation (horizontal axis:
Blue collar, White collar, Professional, Grand Total)]

Professionals tend to read the Post more than twice as often as the Star or
Sun…



Visualizing Numerical Data
By Using Graphical Displays
DCOVA
Numerical Data
    Frequency Distributions and Cumulative Distributions
        Histogram
        Polygon
        Ogive



Visualizing Numerical Data:
The Histogram
DCOVA
▪ A vertical bar chart of the data in a frequency distribution is called
a histogram.

▪ In a histogram there are no gaps between adjacent bars.

▪ The class boundaries (or class midpoints) are shown on the
horizontal axis.

▪ The vertical axis is either frequency, relative frequency, or
percentage.

▪ The heights of the bars represent the frequency, relative frequency,
or percentage.



Visualizing Numerical Data:
The Histogram
DCOVA
Class                  Frequency   Relative Frequency   Percentage
10 but less than 20        3             .15                15
20 but less than 30        6             .30                30
30 but less than 40        5             .25                25
40 but less than 50        4             .20                20
50 but less than 60        2             .10                10
Total                     20            1.00               100

[Figure: Histogram: Temperature (vertical axis: Frequency)]

(In a percentage histogram the vertical axis would be defined to
show the percentage of observations per class).



Shapes of Histograms…
1) Symmetry
A histogram is said to be symmetric if, when we
draw a vertical line down the center of the
histogram, the two sides are identical in shape and
size:

[Figure: three symmetric histograms (vertical axis: Frequency; horizontal axis: Variable)]



Shapes of Histograms…
2) Skewness
A skewed histogram is one with a long tail
extending to either the right or the left:

[Figure: two histograms (vertical axis: Frequency; horizontal axis: Variable)]

Positively Skewed (long tail to the right)    Negatively Skewed (long tail to the left)



Shapes of Histograms…
3) Modality
A unimodal histogram is one with a single peak, while
a bimodal histogram is one with two peaks:

[Figure: a unimodal histogram and a bimodal histogram (vertical axis: Frequency; horizontal axis: Variable)]

A modal class is the class with
the largest number of observations.



Shapes of Histograms…

4) Bell Shape
A special type of symmetric unimodal histogram
is one that is bell shaped:

- Many statistical techniques require that the population be
bell shaped.

- Drawing the histogram helps verify the shape of the
population in question.

[Figure: bell-shaped histogram (vertical axis: Frequency; horizontal axis: Variable)]



Visualizing Numerical Data:
The Frequency Polygon DCOVA
Useful When Comparing Two or More Groups
Example 3:
[Figure: frequency polygon (vertical axis: Frequency)]



Visualizing Numerical Data:
The Percentage Polygon DCOVA

▪ A percentage polygon is formed by having the midpoint
of each class represent the data in that class and then
connecting the sequence of midpoints at their respective
class percentages.

▪ Useful when there are two or more groups to compare.



Visualizing Numerical Data: DCOVA
The Percentage Polygon
Useful When Comparing Two or More Groups
Example 4: Frequency Distributions for Cost of a Meal at 50 Center City
Restaurants and 50 Metro Area Restaurants



Visualizing Numerical Data: DCOVA
The Percentage Polygon
Useful When Comparing Two or More Groups
➢ Example 4: Relative Frequency = (# of observations in a class) /
(Total # of observations)





Visualizing Numerical Data:
The Cumulative Percentage Polygon
DCOVA

▪ The cumulative percentage polygon, or ogive,
displays the variable of interest along the X axis, and
the cumulative percentages along the Y axis.

▪ Useful when there are two or more groups to compare.



Visualizing Numerical Data:
The Cumulative Percentage Polygon (Ogive)
Useful When Comparing Two or More Groups DCOVA
Example 4:

                         Center City                               Metro Area
Meal Cost ($)            Percentage (%)  Cumulative Percentage (%) Percentage (%)  Cumulative Percentage (%)
20 but less than 30            8         8                               8                  8
30 but less than 40            6         8+6=14                         28                 36
40 but less than 50           24         8+6+24=38                      32                 68
50 but less than 60           28         8+6+24+28=66                   24                 92
60 but less than 70           14         8+6+24+28+14=80                 4                 96
70 but less than 80            8         8+6+24+28+14+8=88               2                 98
80 but less than 90           10         8+6+24+28+14+8+10=98            2                100
90 but less than 100           2         8+6+24+28+14+8+10+2=100         0                100
Total                        100                                       100
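
The cumulative percentage columns for both restaurant groups are running sums of the class percentages. A minimal sketch with `itertools.accumulate`, using illustrative variable names:

```python
from itertools import accumulate

# Class percentages for meal costs from $20 up to $100, in $10 classes.
center_city = [8, 6, 24, 28, 14, 8, 10, 2]
metro_area = [8, 28, 32, 24, 4, 2, 2, 0]

# Each cumulative value adds the current class to all classes before it.
center_cumulative = list(accumulate(center_city))
metro_cumulative = list(accumulate(metro_area))

lower_bounds = range(20, 100, 10)
for lo, c, m in zip(lower_bounds, center_cumulative, metro_cumulative):
    print(f"{lo} but less than {lo + 10}: center city {c}%, metro area {m}%")
```

Plotting these cumulative values against the class boundaries gives the two ogives; the metro area curve rises sooner because a larger share of its meals falls in the cheaper classes.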



Visualizing Numerical Data:
The Cumulative Percentage Polygon (Ogive)
DCOVA
Useful When Comparing Two or More Groups



Visualizing Two Numerical Variables
By Using Graphical Displays
DCOVA

Two Numerical Variables
    Scatter Plot
    Time-Series Plot



Visualizing Two Numerical Variables:
The Scatter Plot
DCOVA
▪ Scatter plots are used for numerical data consisting of paired
observations taken from two numerical variables.

▪ One variable’s values are displayed on the horizontal or X axis and
the other variable’s values are displayed on the vertical or Y axis.

▪ Scatter plots are used to examine possible relationships between
two numerical variables.



Visualizing Two Numerical Variables: The
Scatter Plot

Example 5:
❑ A real estate agent wanted to know to what extent
the selling price of a home is related to its size.
➢ To acquire this information he took a sample of 12
homes that had recently sold, recording the price in
thousands of dollars and the size in square feet.
▪ These data are listed in the accompanying table.
Use a graphical technique to describe the
relationship between size and price.

Size    2354  1807  2637  2024  2241  1489  3377  2825  2302  2068  2715  1833
Price    314   229   355   261   234   216   308   306   289   204   265   195



Visualizing Two Numerical Variables: The
Scatter Plot
Example 5: DCOVA
Size 2354 1807 2637 2024 2241 1489 3377 2825 2302 2068 2715 1833
Price 314 229 355 261 234 216 308 306 289 204 265 195

The Scatter Plot:
[Figure: scatter plot of Price (vertical axis) against Size (horizontal axis)]

➢ It appears that there is in fact a relationship: the greater the house size,
the greater the selling price…
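
A scatter plot of these 12 homes could be drawn as sketched below. The sketch assumes the third-party matplotlib library is installed, which the slides do not mention (they use Excel); the function name `draw_scatter` and the output file name are hypothetical.

```python
# Paired observations from Example 5: one (size, price) pair per home.
sizes = [2354, 1807, 2637, 2024, 2241, 1489, 3377, 2825, 2302, 2068, 2715, 1833]
prices = [314, 229, 355, 261, 234, 216, 308, 306, 289, 204, 265, 195]
points = list(zip(sizes, prices))


def draw_scatter(points, filename="scatter.png"):
    """Save a scatter plot of the pairs; requires matplotlib (an assumption)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without opening a window
    import matplotlib.pyplot as plt

    xs, ys = zip(*points)
    fig, ax = plt.subplots()
    ax.scatter(xs, ys)                     # size on X, price on Y
    ax.set_xlabel("Size")
    ax.set_ylabel("Price ($ thousands)")
    ax.set_title("Selling price versus house size")
    fig.savefig(filename)
    plt.close(fig)


# Example use: draw_scatter(points)
```

Size goes on the horizontal axis because it is the variable thought to influence price, which matches the X and Y roles described for scatter plots.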



Patterns of Scatter Plots…
❑ Linearity and Direction are two concepts we are interested in.

[Figure: three scatter plots of Y against X: Positive Linear Relationship,
Negative Linear Relationship, Weak or Non-Linear Relationship]
Visualizing Two Numerical
Variables: The Time Series Plot
DCOVA
■ A Time-Series Plot is used to study
patterns in the values of a numeric
variable over time.

■ In the Time-Series Plot, the numeric variable’s
values are on the vertical axis and the time
period is on the horizontal axis.



Time Series Plot Example DCOVA
Example 6: Movie Revenues (in $ billions) from 1995 to 2016



Statistics in Excel
■ Part 1:
An increasing number of statistics courses use a computer and software
rather than manual calculations. A survey of statistics instructors asked
them to report the software their courses use. The responses are:
1. Excel 2. Minitab 3. SAS 4. SPSS 5. Other
The data can be found in Sheet1.
a) Produce a frequency and a relative frequency distribution using Excel.

b) Graphically summarize the frequency data on the Excel file using the
bar chart and pie chart. If you are selling software to statistics
instructors, which software will sell most?



Statistics in Excel
Part 2:

2) The Excel file (sheet 2) lists the traffic volume (number of
cars observed in a day) and the carbon monoxide level.

■ Draw a scatter diagram on the Excel file. Explain what it tells
us about the relationship between the traffic volume and the
carbon monoxide level.



Chapter Summary

In this chapter we covered:


■ Organizing and visualizing categorical variables.
■ Organizing and visualizing numerical variables.
