02.2 Graphical Summary Techniques PDF


Graphical Summary Techniques

Glyzel Grace M. Francisco


2nd Semester, 2022-2023

CENTRAL LUZON STATE UNIVERSITY


DEPARTMENT of STATISTICS

Graphical Summary Techniques

1. Stem-and-Leaf Display

2. Boxplots

3. Histogram

GGMFRANCISCO Graphical Summary Techniques | 2


Stem and Leaf Plot

- A special table where each data value is split into a “stem” (the first digit or digits) and a “leaf” (usually the last digit).
- A way to organize data.
- It sorts data quickly.
- Gives an overview of the spread of the data.
- Shows the minimum value, maximum value, and range at a glance.

Example 1:

The test scores of 20 students are as follows:

83, 72, 73, 65, 65, 95, 70, 50, 100, 88,
87, 13, 92, 35, 56, 8, 40, 23, 39, 45

The leaf will be the rightmost digit of each score; the stem will be the other digits listed in ascending order.

Stem   Leaf   Data set
0      8      8
1      3      13
2      3      23
3      59     35, 39
4      05     40, 45
5      06     50, 56
6      55     65, 65
7      230    72, 73, 70
8      387    83, 88, 87
9      52     95, 92
10     0      100

Key: “1 | 3” means “13”
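The split into stems and leaves in Example 1 can be sketched in a few lines of Python. This is only an illustration, assuming non-negative whole-number scores; the function name `stem_and_leaf` is hypothetical, and leaves are kept in the order the scores were listed, as on the slide.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its leaves (last digits)."""
    plot = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)   # 83 -> stem 8, leaf 3; 100 -> stem 10, leaf 0
        plot[stem].append(leaf)
    # List every stem from the smallest to the largest so gaps stay visible.
    lo, hi = min(plot), max(plot)
    return {s: plot.get(s, []) for s in range(lo, hi + 1)}

scores = [83, 72, 73, 65, 65, 95, 70, 50, 100, 88,
          87, 13, 92, 35, 56, 8, 40, 23, 39, 45]
plot = stem_and_leaf(scores)
for stem, leaves in plot.items():
    print(f"{stem:>2} | {''.join(str(d) for d in leaves)}")
```

Sorting each list of leaves before printing would give an ordered stem and leaf plot, which makes the median easier to read off.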

Example 2:

The long jump distances (measured in meters) in a P.E. class of 25 students are as follows:

2.3, 2.5, 3.5, 6.5, 2.0, 2.3, 3.7, 4.5, 3.8, 4.0,
3.5, 6.1, 4.3, 3.8, 5.0, 3.5, 4.5, 4.7, 4.5, 2.8,
6.3, 2.2, 3.2, 3.4, 4.5

The leaf will be the tenths digit of each reading; the stem will be the ones digits listed in ascending order.

Stem   Leaf       Data set
2      350382     2.3, 2.5, 2.0, 2.3, 2.8, 2.2
3      57858524   3.5, 3.7, 3.8, 3.5, 3.8, 3.5, 3.2, 3.4
4      5035755    4.5, 4.0, 4.3, 4.5, 4.7, 4.5, 4.5
5      0          5.0
6      513        6.5, 6.1, 6.3

Key: “2 | 3” means “2.3”
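For readings with one decimal place, as in Example 2, one possible approach is to rescale each value to whole tenths before splitting it. The sketch below assumes every reading is recorded to exactly one decimal place; `stem_and_leaf_tenths` is a hypothetical name.

```python
def stem_and_leaf_tenths(values):
    """Stem = ones digit(s), leaf = tenths digit, for one-decimal readings."""
    plot = {}
    for v in values:
        tenths = round(v * 10)           # 2.3 -> 23; round() absorbs float error
        stem, leaf = divmod(tenths, 10)  # 23 -> stem 2, leaf 3
        plot.setdefault(stem, []).append(leaf)
    return dict(sorted(plot.items()))

jumps = [2.3, 2.5, 3.5, 6.5, 2.0, 2.3, 3.7, 4.5, 3.8, 4.0,
         3.5, 6.1, 4.3, 3.8, 5.0, 3.5, 4.5, 4.7, 4.5, 2.8,
         6.3, 2.2, 3.2, 3.4, 4.5]
jump_plot = stem_and_leaf_tenths(jumps)
for stem, leaves in jump_plot.items():
    print(f"{stem} | {''.join(map(str, leaves))}")
```

Counting the leaves per stem (6, 8, 7, 1, and 3 here) and checking that they total 25 is a quick way to confirm that no reading was dropped.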


- A stem and leaf plot can also be used to compare two classes of scores.
- The two plots are drawn back to back, sharing one column of stems.

Example 3:

Female scores: 76, 45, 58, 79, 82, 63, 76, 72, 58, 13, 45, 90, 72, 65, 70

Male scores: 56, 34, 72, 89, 15, 97, 45, 34, 72, 65, 98, 12, 26, 64, 54

The leaf will be the ones digit of each score; the stem will be the tens digits listed in ascending order.

Female                   Leaf (F)   Stem   Leaf (M)   Male
13                              3   1      52         15, 12
                                    2      6          26
                                    3      44         34, 34
45, 45                         55   4      5          45
58, 58                         88   5      64         56, 54
63, 65                         35   6      54         65, 64
76, 79, 76, 72, 72, 70     696220   7      22         72, 72
82                              2   8      9          89
90                              0   9      78         97, 98

Key: “3 | 1 | 5 2” means Female: “13” and Male: “15” and “12”
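A back-to-back plot like the one in Example 3 can be sketched as follows. This is a minimal illustration assuming two-digit, non-negative scores; `back_to_back` is a hypothetical helper, and leaves are kept in data order on both sides, as on the slide.

```python
def back_to_back(left, right):
    """Return rows (left_leaves, stem, right_leaves) over a shared stem column."""
    both = left + right
    rows = []
    for stem in range(min(both) // 10, max(both) // 10 + 1):
        left_leaves = [v % 10 for v in left if v // 10 == stem]
        right_leaves = [v % 10 for v in right if v // 10 == stem]
        rows.append((left_leaves, stem, right_leaves))
    return rows

female = [76, 45, 58, 79, 82, 63, 76, 72, 58, 13, 45, 90, 72, 65, 70]
male = [56, 34, 72, 89, 15, 97, 45, 34, 72, 65, 98, 12, 26, 64, 54]
rows = back_to_back(female, male)
for left_leaves, stem, right_leaves in rows:
    left_text = "".join(map(str, left_leaves))
    right_text = "".join(map(str, right_leaves))
    print(f"{left_text:>8} | {stem} | {right_text}")
```

Because both groups share the stems, the comparison is visual: the female scores pile up on stem 7, while the male scores are spread more evenly from stem 1 to stem 9.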
Boxplot

• Also called the box-and-whisker plot.
• A graph that is very useful for displaying the following features of data:
  ▪ Location
  ▪ Spread
  ▪ Symmetry
  ▪ Extremes/outliers
• Useful for identifying outliers.
• Used in comparing distributions.


[Figure: anatomy of a boxplot. The box runs from $Q_1$ (25th percentile) to $Q_3$ (75th percentile) with the median marked inside; its length is the interquartile range (IQR). Whiskers extend from the box toward the minimum / lower fence ($Q_1 - 1.5 \times \mathrm{IQR}$) and the maximum / upper fence ($Q_3 + 1.5 \times \mathrm{IQR}$). Points beyond the fences are outliers.]


Steps in Constructing a Boxplot

1. Compute the following values:

1. 1st quartile

$Q_1 = P_{25}$

$L = \frac{25}{100} \times N = \frac{25}{100} \times 46 = 11.5 \approx 12$, so $Q_1$ is the 12th data value: $Q_1 = 17$.

 1  5  5  6  8 10 12 13 14 15
15 17 18 18 18 18 18 20 24 25
26 26 26 26 27 28 31 31 34 37
37 37 39 40 40 41 41 42 43 44
44 46 46 47 49 50
2. 2nd quartile or the median

$Q_2 = P_{50}$

$L = \frac{50}{100} \times N = \frac{50}{100} \times 46 = 23$, a whole number, so $Q_2$ is the average of the 23rd and 24th data values:

$Q_2 = \frac{26 + 26}{2} = 26$
3. 3rd quartile

$Q_3 = P_{75}$

$L = \frac{75}{100} \times N = \frac{75}{100} \times 46 = 34.5 \approx 35$, so $Q_3$ is the 35th data value: $Q_3 = 40$.
4. Interquartile range

$\mathrm{IQR} = Q_3 - Q_1 = 40 - 17 = 23$

5. Lower and Upper Fence

$F_L = Q_1 - 1.5 \times \mathrm{IQR} = 17 - 1.5 \times 23 = -17.5$

$F_U = Q_3 + 1.5 \times \mathrm{IQR} = 40 + 1.5 \times 23 = 74.5$
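The quartile, IQR, and fence calculations can be checked with a short script. The sketch below follows the position rule used in the worked example; the helper name `percentile` and the choice to round a fractional position up (11.5 to 12, 34.5 to 35) are assumptions read off the slides, not a general definition.

```python
import math

def percentile(sorted_data, p):
    """Percentile by position L = (p / 100) * N, as in the worked example."""
    n = len(sorted_data)
    pos = p * n / 100
    if pos == int(pos):
        # Whole-number position: average that value and the next one.
        i = int(pos)
        return (sorted_data[i - 1] + sorted_data[i]) / 2
    # Fractional position: move up to the next whole position (assumed rule).
    return sorted_data[math.ceil(pos) - 1]

data = [1, 5, 5, 6, 8, 10, 12, 13, 14, 15,
        15, 17, 18, 18, 18, 18, 18, 20, 24, 25,
        26, 26, 26, 26, 27, 28, 31, 31, 34, 37,
        37, 37, 39, 40, 40, 41, 41, 42, 43, 44,
        44, 46, 46, 47, 49, 50]

q1, q2, q3 = (percentile(data, p) for p in (25, 50, 75))
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
print(q1, q2, q3, iqr, lower_fence, upper_fence)
```

Statistical libraries interpolate between positions in several different ways, so their quartiles may differ slightly from the values obtained with this rule.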
Summary of computed values: $Q_1 = 17$, $Q_2 = 26$, $Q_3 = 40$, $F_L = -17.5$, $F_U = 74.5$

2. Construct a rectangle with one end at the first quartile and the other end at the third quartile.
3. Put a line across the interior of the rectangle at the median.
4. Locate the smallest value/observation in the interval $[F_L, Q_1]$. Draw a line from this value to $Q_1$.
5. Locate the largest value/observation in the interval $[Q_3, F_U]$. Draw a line from this value to $Q_3$.
6. Values falling outside the fences are considered outliers and are usually denoted by “x”.

[Figure: the finished boxplot drawn on a number line from -20 to 110, with -17.5, 17, 26, 40, and 74.5 marked.]

Observations:
• The median is not exactly at the middle of the rectangle.
• There are no possible outliers: every observation (from 1 to 50) lies between the fences.
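Steps 4 to 6 can be sketched in code as well. This is an illustration only; `whiskers_and_outliers` is a hypothetical helper that takes the quartiles already computed above as inputs.

```python
def whiskers_and_outliers(data, q1, q3):
    """Whisker ends and outliers from the 1.5 * IQR fences."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Step 4: smallest observation in [F_L, Q1]; step 5: largest in [Q3, F_U].
    low_end = min(x for x in data if lower_fence <= x <= q1)
    high_end = max(x for x in data if q3 <= x <= upper_fence)
    # Step 6: anything beyond a fence is flagged as an outlier.
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return low_end, high_end, outliers

data = [1, 5, 5, 6, 8, 10, 12, 13, 14, 15,
        15, 17, 18, 18, 18, 18, 18, 20, 24, 25,
        26, 26, 26, 26, 27, 28, 31, 31, 34, 37,
        37, 37, 39, 40, 40, 41, 41, 42, 43, 44,
        44, 46, 46, 47, 49, 50]

result = whiskers_and_outliers(data, q1=17, q3=40)
print(result)
```

For this data set the whiskers would end at 1 and 50, and the outlier list is empty, which agrees with the observation that there are no possible outliers.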


Histogram

• The word histogram comes from the Greek histos, meaning pole or mast, and gram, which means chart or graph.
• Read literally, “histogram” means “pole chart.”


• A histogram is a graphical display of data using bars of different heights.
• It is used to display the distribution of data values along the real number line.
• It is created by dividing the range of the data into a small number of intervals or bins.
• The number of observations falling in each interval is counted. This gives a frequency distribution.
• A histogram is a graph of the frequency distribution in which the vertical axis represents the count (frequency) and the horizontal axis represents the possible range of the data values.

➢ A histogram visually represents the distribution of a continuous variable.


Steps in Constructing a Quantitative Frequency Distribution Table (FDT)

1. Arrange the data in an array (sorted order); this makes it easier to detect the smallest and largest values and to find frequencies and measures of position.
2. Determine the range (R):
   $R = \text{highest value} - \text{lowest value}$
3. Solve for the number of classes or class intervals (k):
   $k = \sqrt{n}$
   where n is the number of observations.
4. Determine the class size (c):
   $c = \frac{R}{k}$
   Note: Round c to the same number of decimal places as the raw data.
5. Determine and enumerate the classes. Each class is an interval of values defined by its lower and upper class limits. As a rule, the lowest value in the data becomes the lower class limit (LL) of the first class interval. Each succeeding lower limit is obtained by adding c to the lower class limit of the preceding class interval. Upper class limits (UL) are obtained using the following formula:

   $UL = LL + c - \text{one unit of measure}$

Data                          One unit of measure
0.32, 24.56, 119.02, 3.67     0.01
1.5, 5.5, 123.8, 2.7, 12.3    0.1
2, 17, 29, 6, 176             1
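The “one unit of measure” in the table is the smallest increment the data can show, which depends only on the number of decimal places. A small sketch, assuming the values are supplied as text so that trailing digits are preserved; `unit_of_measure` and `upper_limit` are hypothetical helpers, and the class size of 35 in the last line is a made-up value used only to show the formula.

```python
from decimal import Decimal

def unit_of_measure(raw_values):
    """Smallest increment implied by the decimal places of the raw data."""
    places = max(0, *(-Decimal(v).as_tuple().exponent for v in raw_values))
    return Decimal(1).scaleb(-places)   # 2 decimal places -> 0.01

def upper_limit(ll, c, unit):
    """UL = LL + c - one unit of measure."""
    return ll + c - unit

print(unit_of_measure(["0.32", "24.56", "119.02", "3.67"]))    # two decimal places
print(unit_of_measure(["1.5", "5.5", "123.8", "2.7", "12.3"]))  # one decimal place
print(unit_of_measure(["2", "17", "29", "6", "176"]))           # whole numbers

# Hypothetical class size c = 35 for the whole-number row, for illustration only.
print(upper_limit(Decimal("2"), Decimal("35"), Decimal("1")))
```

`Decimal` is used instead of `float` so that limits such as 19.9 stay exact when classes are compared with the data.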
Example

Jeff is the branch manager at a clinic. Recently, Jeff has been receiving customer feedback saying that the wait times for a client to be served by a customer service representative are too long. Jeff decides to observe and record the time each customer spends waiting. Here are the wait times he recorded for 20 customers:

Customer Wait Time in minutes (n = 20)
13.1   42.2   43.5   45.2
25.6   15.5   40.3   54.1
37.6   30.3   10.2   45.6
36.5   21.4   37.3   36.5
45.3   35.6   31.2   43.1
1. Array:
   10.2 13.1 15.5 21.4 25.6 30.3 31.2 35.6 36.5 36.5
   37.3 37.6 40.3 42.2 43.1 43.5 45.2 45.3 45.6 54.1
2. Determine the range (R):
   $R = \text{highest value} - \text{lowest value} = 54.1 - 10.2 = 43.9$
3. Solve for the number of classes or class intervals (k):
   $k = \sqrt{n} = \sqrt{20} \approx 4.5 \approx 5$
   where n is the number of observations.
4. Determine the class size (c):
   $c = \frac{R}{k} = \frac{43.9}{4.5} \approx 9.76 \approx 9.8$
   Note: Round c to the same number of decimal places as the raw data.
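Steps 2 to 4 for the wait-time data can be reproduced with the sketch below. It mirrors the slide's arithmetic under two stated assumptions: the square root is first rounded to one decimal place (4.5) for the division, and the number of classes is rounded up to 5.

```python
import math

wait_times = [13.1, 42.2, 43.5, 45.2, 25.6, 15.5, 40.3, 54.1,
              37.6, 30.3, 10.2, 45.6, 36.5, 21.4, 37.3, 36.5,
              45.3, 35.6, 31.2, 43.1]

n = len(wait_times)
R = round(max(wait_times) - min(wait_times), 1)  # range, kept to one decimal place
k_raw = round(math.sqrt(n), 1)                   # sqrt(20) is about 4.47, shown as 4.5
k = math.ceil(k_raw)                             # assumed: round up to whole classes
c = round(R / k_raw, 1)                          # class size, one decimal like the data

print(n, R, k_raw, k, c)
```

Dividing by the unrounded square root instead gives about 9.82, which also rounds to 9.8, so the class size does not depend on that choice here.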
Lower Limit   Upper Limit (UL = LL + c - 1 unit of measure)   Frequency
10.2          10.2 + 9.8 - 0.1 = 19.9                         3
20.0          20.0 + 9.8 - 0.1 = 29.7                         2
29.8          29.8 + 9.8 - 0.1 = 39.5                         7
39.6          39.6 + 9.8 - 0.1 = 49.3                         7
49.4          49.4 + 9.8 - 0.1 = 59.1                         1
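The class limits and frequencies in the table can be rebuilt programmatically. The sketch below takes c = 9.8, k = 5, and a unit of measure of 0.1 from the worked example as given, and uses `Decimal` so the limits stay exact; the text bars are only a rough stand-in for the histogram.

```python
from decimal import Decimal

raw = ("10.2 13.1 15.5 21.4 25.6 30.3 31.2 35.6 36.5 36.5 "
       "37.3 37.6 40.3 42.2 43.1 43.5 45.2 45.3 45.6 54.1")
wait_times = [Decimal(s) for s in raw.split()]
c, unit, k = Decimal("9.8"), Decimal("0.1"), 5

classes = []
lower = min(wait_times)                # first LL is the lowest value
for _ in range(k):
    upper = lower + c - unit           # UL = LL + c - one unit of measure
    freq = sum(lower <= x <= upper for x in wait_times)
    classes.append((lower, upper, freq))
    lower += c                         # next LL = previous LL + c

for lo, hi, freq in classes:
    print(f"{str(lo):>4} - {str(hi):<4} | {'#' * freq} ({freq})")
```

The frequencies should add up to n = 20; if they do not, a value has fallen in a gap between classes or has been counted twice.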


[Figure: Histogram of customer wait time (n = 20). Horizontal axis: customer wait time (minutes), with classes 10.2 - 19.9, 20.0 - 29.7, 29.8 - 39.5, 39.6 - 49.3, and 49.4 - 59.1. Vertical axis: frequency.]
