Organizing and Visualizing Variables
❑ Define the data you want to study to solve a problem or meet an objective.
❑ Collect the data from appropriate sources. (Chapter 1)
❑ Organize the data collected by developing tables. (Chapter 2)
❑ Visualize the data collected by developing charts. (Chapter 2)
❑ Analyze the data collected to reach conclusions and present those results.
VARIABLE
A characteristic or property of an item or individual.
DATA
The set of values associated with one or more
variables.
Examples:
X = sum of the values on a roll of two dice: X has to be 2, 3, 4, …, or 12.
Y = number of accidents in Doha during a week: Y has to be 0, 1, 2, 3, 4, 5, 6, 7, 8, …, up to some very large number.
Z = number of children in a family: Z has to be 0, 1, 2, 3, …
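As a minimal sketch of why X in the first example is discrete, the Python snippet below enumerates every outcome of rolling two six-sided dice and collects the distinct sums. The name `possible_sums` is introduced here for illustration only.

```python
from itertools import product

# Enumerate all 36 ordered outcomes of two six-sided dice
# and keep the distinct sums (the possible values of X).
faces = range(1, 7)
possible_sums = sorted({a + b for a, b in product(faces, repeat=2)})

print(possible_sums)
```

The result is a finite, countable list of values, which is what makes X a discrete variable.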
ALWAYS LEARNING Copyright © 2020 Pearson Education Ltd. 7
Two Types of Numerical Variables…
2) Continuous Random Variable: one whose values are not discrete and not countable.
Example 1:
Let X = time to write a statistics exam in a university.
30.0001?
Variable types: Categorical, Numerical
Categorical example: Do you have a Facebook profile? (Yes, No)
Tallying Data
One Categorical Variable: Summary Table
Two Categorical Variables: Contingency Table
Source: Data extracted and adapted from A. Sharma, “Big Media Needs to Embrace
Digital Shift Not Fight It,” Wall Street Journal, June 22, 2016, p. 1-2.
Ordered Array; Frequency Distributions; Cumulative Distributions
22 25 27 32 38 42
Night Students
18 18 19 19 20 21
23 28 32 33 41 45
▪ You must give attention to selecting the appropriate number of class groupings for the table, determining a suitable width of a class grouping, and establishing the boundaries of each class grouping to avoid overlapping.
▪ The number of classes depends on the number of values in the data. With a larger number of values, typically there are more classes. In general, a frequency distribution should have at least 5 but no more than 15 classes.
▪ To determine the width of a class interval, you divide the range (Highest value - Lowest value) of the data by the number of class groupings desired.
24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27
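The class-width rule above can be sketched in Python for the 20 values just listed. The choice of five classes, the rounding of the width up to a multiple of 10, and the variable names are assumptions made for illustration; other reasonable choices would give a different table.

```python
import math

data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

num_classes = 5                         # assumption: five class groupings
data_range = max(data) - min(data)      # highest value minus lowest value
raw_width = data_range / num_classes    # range divided by number of classes
width = math.ceil(raw_width / 10) * 10  # assumption: round up to a multiple of 10
start = (min(data) // width) * width    # lower boundary of the first class

rows = []
cumulative = 0
for i in range(num_classes):
    lower = start + i * width
    upper = lower + width
    # "lower but less than upper" keeps the classes from overlapping.
    freq = sum(lower <= x < upper for x in data)
    cumulative += freq
    rows.append((lower, upper, freq, freq / len(data), cumulative))

for lower, upper, freq, rel, cum in rows:
    print(f"{lower} but less than {upper}: {freq:2d}  {rel:.2f}  cumulative {cum}")
```

Each row holds the class boundaries, the frequency, the relative frequency, and the running cumulative frequency, so the same loop yields both the frequency distribution and the cumulative distribution.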
Visualizing Data
Summary Table For One Variable: Bar Chart, Pie or Doughnut Chart
Contingency Table For Two Variables: Side By Side Bar Chart
Devices Used to Watch    Percent
Television Set           49%
Tablet                   9%
Smartphone               10%
Laptop / Desktop         32%
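As a rough illustration of how a bar chart represents a summary table, the sketch below draws a text-only bar chart of the device percentages using plain Python. The function name `text_bar_chart` and the one-character-per-percent scale are assumptions for this example; a spreadsheet or plotting library would normally be used instead.

```python
# Percentages taken from the summary table of devices used to watch.
devices = {
    "Television Set": 49,
    "Tablet": 9,
    "Smartphone": 10,
    "Laptop / Desktop": 32,
}

def text_bar_chart(table, scale=1):
    """Return one line per category, with a bar of '#' proportional to the value."""
    label_width = max(len(label) for label in table)
    lines = []
    for label, pct in table.items():
        bar = "#" * round(pct * scale)
        lines.append(f"{label:<{label_width}} | {bar} {pct}%")
    return lines

chart = text_bar_chart(devices)
print("\n".join(chart))
```

The length of each bar is proportional to the category's percentage, which is the same idea a bar chart conveys graphically.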
                 No Errors   Errors    Total
Small Amount     50.75%      30.77%    47.50%
Medium Amount    29.85%      61.54%    35.00%
Large Amount     19.40%      7.69%     17.50%
Total            100.0%      100.0%    100.0%
Newspaper
Occupation G&M Post Star Sun Total
Blue collar 27 18 38 37 120
White collar 29 43 21 15 108
Professional 33 51 22 20 126
Total 89 112 81 72 354
Newspaper
Occupation G&M Post Star Sun Total
Blue collar 27/120 18/120 38/120 37/120 120/120
White collar 29/108 43/108 21/108 15/108 108/108
Professional 33/126 51/126 22/126 20/126 126/126
Total 89/354 112/354 81/354 72/354 354/354
Newspaper
Occupation G&M Post Star Sun Total
Blue collar .23 .15 .32 .31 1.00
White collar .27 .40 .19 .14 1.00
Professional .26 .40 .17 .16 1.00
Total .25 .32 .23 .20 1.00
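The step from the table of counts to the table of row proportions can be sketched as follows. The variable names are illustrative assumptions; the counts are the ones in the newspaper table above.

```python
papers = ["G&M", "Post", "Star", "Sun"]
counts = {
    "Blue collar":  [27, 18, 38, 37],
    "White collar": [29, 43, 21, 15],
    "Professional": [33, 51, 22, 20],
}

# Divide each cell by its row total to get the share of each newspaper
# within an occupation.
row_proportions = {}
for occupation, row in counts.items():
    total = sum(row)
    row_proportions[occupation] = [n / total for n in row]

# The "Total" row uses the column totals divided by the grand total.
column_totals = [sum(col) for col in zip(*counts.values())]
grand_total = sum(column_totals)
row_proportions["Total"] = [n / grand_total for n in column_totals]

for occupation, props in row_proportions.items():
    cells = "  ".join(f"{p:.2f}" for p in props)
    print(f"{occupation:<13} {cells}")
```

Because every row is divided by its own total, each row of proportions sums to 1, which makes occupations of different sizes directly comparable.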
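[Figure: side-by-side bar chart of newspaper readership by occupation. Horizontal axis: Occupation (Blue collar, White collar, Professional, Grand Total). Annotation: "… Post more than twice as often as the Star or Sun…"]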
Frequency Distributions and Cumulative Distributions
Class                  Frequency   Relative Frequency   Percentage
50 but less than 60    2           .10                  10
Total                  20          1.00                 100
[Figure: Histogram: Temperature. Vertical axis: Frequency.]
(In a percentage histogram the vertical axis would be defined to show the percentage of observations per class.)
[Figure: histogram shape sketches, each with Frequency on the vertical axis and Variable on the horizontal axis, including Unimodal and Bimodal panels.]
4) Bell Shape
A special type of symmetric unimodal histogram is one that is bell shaped.
Two Numerical Variables: Scatter Plot, Time-Series Plot
Example 5:
❑ A real estate agent wanted to know to what extent
the selling price of a home is related to its size.
➢ To acquire this information he took a sample of 12 homes that had recently sold, recording the price in thousands of dollars and the size in square feet.
▪ These data are listed in the accompanying table.
Use a graphical technique to describe the
relationship between size and price.
Size 2354 1807 2637 2024 2241 1489 3377 2825 2302 2068 2715 1833
Price 314 229 355 261 234 216 308 306 289 204 265 195
➢ It appears that in fact there is a relationship, that is, the greater the house size
the greater the selling price…
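As a numerical check on that visual impression, the sketch below computes the sample correlation coefficient for the 12 homes. This is a swapped-in complement, not the graphical technique the example asks for, and the variable names are assumptions for illustration.

```python
import math

size = [2354, 1807, 2637, 2024, 2241, 1489, 3377, 2825, 2302, 2068, 2715, 1833]
price = [314, 229, 355, 261, 234, 216, 308, 306, 289, 204, 265, 195]

n = len(size)
mean_size = sum(size) / n
mean_price = sum(price) / n

# Sums of cross-products and squared deviations from the means.
sxy = sum((x - mean_size) * (y - mean_price) for x, y in zip(size, price))
sxx = sum((x - mean_size) ** 2 for x in size)
syy = sum((y - mean_price) ** 2 for y in price)

# Sample correlation coefficient: positive values indicate that
# larger sizes tend to go with higher prices.
r = sxy / math.sqrt(sxx * syy)
print(f"sample correlation r = {r:.2f}")
```

A positive value of r is consistent with the upward pattern a scatter plot of these data would show.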
[Figure: three scatter plot panels with X on the horizontal axis and Y on the vertical axis: Positive Linear Relationship; Negative Linear Relationship; Weak or Non-Linear Relationship.]
Visualizing Two Numerical Variables: The Time Series Plot
DCOVA
■ A Time-Series Plot is used to study
patterns in the values of a numeric
variable over time.
b) Graphically summarize the frequency data in the Excel file using a bar chart and a pie chart. If you are selling software to statistics instructors, which software would sell the most?