1 CH 1-2

Instructor: Dr. Aabed Mohammed
E-mail: aabedukm@yahool.com

Course name: Statistics and Probability

Textbook: Elementary Statistics: A Step by Step Approach, 8th Edition, by Allan Bluman, McGraw-Hill.
Syllabus
1- Introduction:
Definition of statistics, variables and types of variables,
sample, population, types of statistics
2- Frequency distribution and graph:
❖ categorical frequency distribution
❖ grouped frequency distribution
❖ ungrouped frequency distribution
❖ histogram, frequency polygon, ogive, stem and leaf
plots

3- Data description
➢ measures of central tendency
➢ measures of variation
➢ measures of position
4- Probability
➢ Basic concept: probability experiment- outcome-
sample space- event- Tree diagram
➢ Probability of an event, complement of an event,
mutually exclusive events
➢ addition rule
➢ The Multiplication Rules
➢ Conditional Probability
5- Discrete probability distributions
➢ Probability Distributions
➢ Mean, variance, standard deviation, and expectation
➢ The binomial distribution
6- The Normal Distribution:
➢ Properties of normal distribution
➢ The Standard normal distribution
➢ Application of the normal distribution
Chapter (1)

Introduction and Basic concept

Introduction
Statistics is the science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data.
A population consists of all subjects (human or otherwise) that are being studied.
A sample is a group of subjects selected from a population.
Types of statistics

Descriptive statistics: consists of the collection, organization, summarization, and presentation of data.
EX: "The average age of the students is 14 years."

Inferential statistics: consists of generalizing from samples to populations, performing estimations and hypothesis testing, determining relationships among variables, and making predictions.
EX: "The relationship between smoking and lung cancer."

Inferential statistics uses probability, i.e., the chance of an event occurring.

The Variable and its classifications
• A variable is a characteristic or attribute that can assume different values.
• Data are the values that the variables can assume.
• Data set: a collection of data values. Each value in the data set is called a data value or a datum.
Variables and Types of Data
Variables are either qualitative or quantitative; quantitative variables are either discrete or continuous.
• Qualitative variables are variables that can be placed into distinct categories according to some characteristic or attribute. For example: gender, major, color, etc.
• Quantitative variables are numerical and can be ordered or ranked. For example: age, height, weight, temperature, etc.
• Discrete variables assume values that can be counted. For example: number of children in a family, number of students in a classroom, etc.
• Continuous variables assume an infinite number of values between any two specific values. For example: temperature, time, etc.
The boundaries of a continuous variable
The boundaries of a continuous variable are given in one
additional decimal place and always end with the digit 5.
Example:

Exercise: Give the boundaries of each value.


a. 36 inches. b. 105.4 miles. c. 72.6 tons.
d. 5.27 centimeters. e. 5 ounces.
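
The boundary rule stated above (one additional decimal place, ending in the digit 5) can be sketched in code. This is a minimal illustration, not part of the textbook: the function name `boundaries` and the sample value 15.2 are assumptions made here, and the value is passed as a string so that the number of recorded decimal places is preserved.

```python
from decimal import Decimal

def boundaries(value: str) -> tuple[Decimal, Decimal]:
    """Return the (lower, upper) boundaries of a recorded measurement."""
    d = Decimal(value)
    # exponent is 0 for "36", -1 for "105.4", -2 for "5.27"
    exponent = d.as_tuple().exponent
    # half of one unit in the last recorded place, e.g. 0.05 for tenths
    half_unit = Decimal(5).scaleb(exponent - 1)
    return d - half_unit, d + half_unit

# Hypothetical value, chosen so it is not one of the exercise items
lower, upper = boundaries("15.2")
print(f"{lower} to {upper}")
```

Using `Decimal` rather than `float` keeps the trailing 5 exact, which a binary floating-point value would not guarantee.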
H.W
• Review Exercises, page 26, Num. 6, 8, 9, 10.

Frequency Distributions
and Graphs

Chapter 2 Overview
Introduction
• 2-1 Organizing Data
• 2-2 Histograms, Frequency Polygons, and Ogives
• 2-3 Other Types of Graphs

Bluman, Chapter 2 14
Chapter 2 Objectives
1. Organize data using frequency distributions.
2. Represent data in frequency distributions
graphically using histograms, frequency
polygons, and ogives.
3. Draw and interpret a stem and leaf plot.

• Data collected in original form is called raw data.
• A frequency distribution is the organization of raw data in table form, using classes and frequencies.
Categorical frequency distributions.

Example 2-1. Page #38


Twenty-five army inductees were given a blood test to
determine their blood type. The data set is

Construct a frequency distribution for the data.

Frequency    Relative frequency    Percent
5            5/25 = 0.20           20
7            7/25 = 0.28           28
9            9/25 = 0.36           36
4            4/25 = 0.16           16
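
As a rough sketch of how a categorical frequency distribution with relative frequencies and percents could be computed, the snippet below counts each category and divides by the total. The function name `categorical_distribution` and the short sample list are hypothetical; the actual blood-type data set from the example is not reproduced here.

```python
from collections import Counter

def categorical_distribution(data):
    """Map each category to (frequency, relative frequency, percent)."""
    counts = Counter(data)
    n = len(data)
    return {cls: (f, f / n, 100 * f / n) for cls, f in counts.items()}

# Hypothetical sample, for illustration only
sample = ["A", "B", "B", "O", "O", "O", "AB", "A", "O", "B"]
for cls, (f, rel, pct) in categorical_distribution(sample).items():
    print(f"{cls:<3} {f:>2}  {rel:.2f}  {pct:.0f}%")
```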

Grouped Frequency Distribution
• Grouped frequency distributions are used when the range of the data is large.

Constructing a Grouped Frequency Distribution

Section 2-1
Example 2-2 Page #41

The following data represent the record high temperatures for
each of the 50 states. Construct a grouped frequency
distribution for the data using 7 classes.

Solution
Determine the classes.
Determine the lowest value (L), L = 100, and the highest value (H), H = 134.
Find the range (R). Range = highest value – lowest value
R = H – L = 134 – 100 = 34.
Find the class width.
Class width = Range / number of classes
= 34/7 ≈ 4.9, which rounds up to 5.
Rounding Rule: Always round up if there is a remainder.
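
The range and class-width calculation can be sketched as follows. The function name `class_width` is an assumption made for illustration; the inputs are the lowest value, highest value, and number of classes from the example.

```python
import math

def class_width(lowest, highest, num_classes):
    """Range divided by the number of classes, rounded up to a whole number."""
    data_range = highest - lowest
    return math.ceil(data_range / num_classes)

# Record high temperatures: L = 100, H = 134, 7 classes
print(class_width(100, 134, 7))
```

Note that `math.ceil` leaves the quotient unchanged when there is no remainder, so that case would need separate handling.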

• For convenience' sake, we will choose the lowest data value, 100, for the first lower class limit.
• The subsequent lower class limits are found by adding the width to the previous lower class limits.
• The first upper class limit is one less than the next lower class limit.
• The subsequent upper class limits are found by adding the width to the previous upper class limits.
Class Limits
100 - 104
105 - 109
110 - 114
115 - 119
120 - 124
125 - 129
130 - 134
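
The procedure for building the class limits can be sketched as below. This is a minimal illustration that assumes whole-number data, so that each upper limit is exactly one less than the next lower limit; the function name `class_limits` is hypothetical.

```python
def class_limits(first_lower, width, num_classes):
    """Return (lower limit, upper limit) pairs for whole-number data."""
    # each lower limit is the previous lower limit plus the class width
    lowers = [first_lower + i * width for i in range(num_classes)]
    # each upper limit is one less than the next lower limit
    return [(lo, lo + width - 1) for lo in lowers]

for lo, hi in class_limits(100, 5, 7):
    print(f"{lo} - {hi}")
```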
The class width from the frequency distribution table
Class width = lower (or upper) class limit of one class – lower (or upper) class limit of the preceding class
Or
Class width = (upper class limit – lower class limit of the same class) + 1 (when the limits are whole numbers)
Or
Class width = upper class boundary – lower class boundary of the same class

The class midpoint Xm

$$X_m = \frac{\text{lower limit} + \text{upper limit}}{2} = \frac{\text{lower boundary} + \text{upper boundary}}{2}$$

Xm of any class = Xm of the preceding class + the class width
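
A short sketch of these formulas for a single class follows. It assumes both limits are written to the same precision and are passed as strings so that precision is known; the function name `class_summary` is hypothetical. The width is taken as upper boundary minus lower boundary, which also works for limits with decimal places.

```python
from decimal import Decimal

def class_summary(lower: str, upper: str) -> dict:
    """Boundaries, midpoint and width of one class, given its limits."""
    lo, hi = Decimal(lower), Decimal(upper)
    # one unit in the last recorded place: 1 for "100", 0.1 for "12.3"
    unit = Decimal(1).scaleb(min(lo.as_tuple().exponent, hi.as_tuple().exponent))
    half = unit / 2
    lower_boundary, upper_boundary = lo - half, hi + half
    return {
        "boundaries": (lower_boundary, upper_boundary),
        "midpoint": (lo + hi) / 2,
        "width": upper_boundary - lower_boundary,
    }

# First class of the record high temperature distribution
print(class_summary("100", "104"))
```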

Exercise
• Find the class boundaries, midpoints, and widths for each class.
a. 32–38 b. 86–104 c. 895–905 d. 12.3–13.5 e. 3.18–4.96

Rules for Classes in Grouped Frequency
Distributions
1. There should be 5-20 classes.
2. The class width should be an odd number.
3. The classes must be mutually exclusive.
4. The classes must be continuous.
5. The classes must be exhaustive.
6. The classes must be equal in width (except in open-
ended distributions).

Cumulative Frequency
A cumulative frequency distribution is a distribution that
shows the number of data values less than or equal to a specific
value (usually an upper boundary).

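Class boundaries     Cumulative frequency
Less than 99.5       0
Less than 104.5      2
Less than 109.5      10
Less than 114.5      28
Less than 119.5      41
Less than 124.5      48
Less than 129.5      49
Less than 134.5      50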
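
The cumulative frequencies are running totals of the class frequencies. As a small sketch, using the class frequencies of the record high temperature distribution (2, 8, 18, 13, 7, 1, 1) and a hypothetical function name `cumulative_frequencies`:

```python
from itertools import accumulate

def cumulative_frequencies(frequencies):
    """Running totals, starting at 0 for the lowest class boundary."""
    return list(accumulate(frequencies, initial=0))

frequencies = [2, 8, 18, 13, 7, 1, 1]
print(cumulative_frequencies(frequencies))
```

The leading 0 corresponds to the number of values below the first lower boundary; the last entry equals the total number of data values.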
Ungrouped Frequency Distribution
When the range of the data values is relatively small, a frequency distribution
can be constructed using single data values for each class. This type of
distribution is called an ungrouped frequency distribution.

Example 2-2 Page #41


The data shown here represent the number of miles per gallon (mpg) that 30
selected four-wheel-drive sports utility vehicles obtained in city driving.
Construct a frequency distribution.

Solution
STEP 1 Determine the classes.
Determine the lowest value (L), L=12, highest value (H), H=19.
Find the range (R), R=H-L=19-12=7.
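
A minimal sketch of an ungrouped frequency distribution, with one class per whole-number value from the lowest to the highest, might look like this. The function name `ungrouped_distribution` and the short list of mpg values are hypothetical; the 30 values from the example are not reproduced here.

```python
from collections import Counter

def ungrouped_distribution(data):
    """One class per whole-number value from the lowest to the highest."""
    counts = Counter(data)
    # Counter returns 0 for values that never occur, so empty classes are kept
    return [(value, counts[value]) for value in range(min(data), max(data) + 1)]

# Hypothetical mpg values, for illustration only
mpg = [12, 15, 15, 19, 16, 14, 15, 17, 18, 16]
for value, frequency in ungrouped_distribution(mpg):
    print(value, frequency)
```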

Cumulative Frequency

H.W
• Applying the concepts 2-1, page 45
• H.W page 46, No. 3, 7, 8, 15, 17, 18

2-2 Histograms, Frequency Polygons, and
Ogives
3 Most Common Graphs in Research
1. Histogram
2. Frequency Polygon
3. Cumulative Frequency Polygon (Ogive)

1- Histograms
The histogram is a graph that displays the data by using
contiguous (unless the frequency of a class is 0) vertical bars of
various heights to represent the frequencies of the classes.

Steps
1: Draw and label the x and y axes. The x axis is
always the horizontal axis, and the y axis is always
the vertical axis.
2: Represent the class boundaries on the x axis and the frequency on the y axis.
3: Using the frequencies as the heights, draw vertical
bars for each class.
Example 2-4
Construct a histogram to represent the data for
the record high temperatures for each of the 50
states (see Example 2–2 for the data).
Class Limits   Frequency
100 - 104 2
105 - 109 8
110 - 114 18
115 - 119 13
120 - 124 7
125 - 129 1
130 - 134 1
Histograms
Histograms use class boundaries and
frequencies of the classes.
Class Limits   Class Boundaries   Frequency
100 - 104 99.5 - 104.5 2
105 - 109 104.5 - 109.5 8
110 - 114 109.5 - 114.5 18
115 - 119 114.5 - 119.5 13
120 - 124 119.5 - 124.5 7
125 - 129 124.5 - 129.5 1
130 - 134 129.5 - 134.5 1
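
As a rough text-only sketch of the histogram for this table, each class can be drawn as a bar whose length equals its frequency. The bars here run sideways rather than vertically, which is a simplification made for plain-text output; the function name `text_histogram` is hypothetical.

```python
def text_histogram(boundaries, frequencies, symbol="#"):
    """One text row per class: its boundaries, then a bar as long as the frequency."""
    rows = []
    for (low, high), f in zip(boundaries, frequencies):
        rows.append(f"{low:>5.1f} - {high:>5.1f} | {symbol * f} {f}")
    return rows

boundaries = [(99.5, 104.5), (104.5, 109.5), (109.5, 114.5), (114.5, 119.5),
              (119.5, 124.5), (124.5, 129.5), (129.5, 134.5)]
frequencies = [2, 8, 18, 13, 7, 1, 1]
print("\n".join(text_histogram(boundaries, frequencies)))
```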

Frequency Polygon
The frequency polygon is a graph that displays the data by using
lines that connect points plotted for the frequencies at the class
midpoints. The frequencies are represented by the heights of the
points.
Steps
1: Draw and label the x and y axes.
2: Represent the class midpoints on the x axis.
3: Choose a suitable scale for the frequencies, and label it on the y
axis.
4: Connect adjacent points with line segments. Draw a line back to
the x axis at the beginning and end of the graph, at the same
distance that the previous and next midpoints would be located.
Example 2-5
Construct a frequency polygon to represent the
data for the record high temperatures for each of
the 50 states (see Example 2–2 for the data).
Class Limits   Frequency
100 - 104 2
105 - 109 8
110 - 114 18
115 - 119 13
120 - 124 7
125 - 129 1
130 - 134 1
Frequency Polygons
Frequency polygons use class midpoints
and frequencies of the classes.
Class Limits   Class Midpoints   Frequency
100 - 104 102 2
105 - 109 107 8
110 - 114 112 18
115 - 119 117 13
120 - 124 122 7
125 - 129 127 1
130 - 134 132 1
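
The points of the frequency polygon, including the two closing points on the x axis, can be sketched as below. The sketch assumes equal class widths, so the closing points sit one class width before the first midpoint and one after the last; the function name `polygon_points` is hypothetical.

```python
def polygon_points(midpoints, frequencies):
    """(x, y) points of a frequency polygon, closed on the x axis at both ends."""
    width = midpoints[1] - midpoints[0]   # equal class widths assumed
    start = (midpoints[0] - width, 0)     # where the previous midpoint would be
    end = (midpoints[-1] + width, 0)      # where the next midpoint would be
    return [start, *zip(midpoints, frequencies), end]

midpoints = [102, 107, 112, 117, 122, 127, 132]
frequencies = [2, 8, 18, 13, 7, 1, 1]
print(polygon_points(midpoints, frequencies))
```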

An Ogive (Cumulative Frequency Polygon)
The ogive is a graph that represents the cumulative
frequencies for the classes in a frequency distribution.
Steps
1: Draw and label the x and y axes.
2: Represent the class boundaries on the x axis.
3: Choose a suitable scale for the cumulative frequencies, and
label it on the y axis.
4: Plot the points and then connect them with line segments.

Example 2-6
Construct an ogive to represent the data for the
record high temperatures for each of the 50
states (see Example 2–2 for the data).
Class Limits   Frequency
100 - 104 2
105 - 109 8
110 - 114 18
115 - 119 13
120 - 124 7
125 - 129 1
130 - 134 1
Solution
Ogives use upper class boundaries and
cumulative frequencies of the classes.
Class Limits   Class Boundaries   Frequency   Cumulative Frequency
100 - 104 99.5 - 104.5 2 2
105 - 109 104.5 - 109.5 8 10
110 - 114 109.5 - 114.5 18 28
115 - 119 114.5 - 119.5 13 41
120 - 124 119.5 - 124.5 7 48
125 - 129 124.5 - 129.5 1 49
130 - 134 129.5 - 134.5 1 50

Ogives
Ogives use upper class boundaries and
cumulative frequencies of the classes.
Class Boundaries   Cumulative Frequency
Less than 99.5 0
Less than 104.5 2
Less than 109.5 10
Less than 114.5 28
Less than 119.5 41
Less than 124.5 48
Less than 129.5 49
Less than 134.5 50
2.2 Relative Frequency Graphs
If proportions are used instead of frequencies, the
graphs are called relative frequency graphs.

Relative frequency graphs are used when the proportion of data values that fall into a given class is more important than the actual number of data values that fall into that class.

Example 2-7 Page #57
Construct a histogram, frequency polygon, and ogive
using relative frequencies for the distribution (shown
here) of the miles that 20 randomly selected runners
ran during a given week.
Class Boundaries   Frequency
5.5 - 10.5 1
10.5 - 15.5 2
15.5 - 20.5 3
20.5 - 25.5 5
25.5 - 30.5 4
30.5 - 35.5 3
35.5 - 40.5 2

Histograms
The following is a frequency distribution of
miles run per week by 20 selected runners.
Divide each frequency by the total frequency to get the relative frequency.
Class Boundaries    Frequency    Relative Frequency
5.5 - 10.5          1            1/20 = 0.05
10.5 - 15.5         2            2/20 = 0.10
15.5 - 20.5         3            3/20 = 0.15
20.5 - 25.5         5            5/20 = 0.25
25.5 - 30.5         4            4/20 = 0.20
30.5 - 35.5         3            3/20 = 0.15
35.5 - 40.5         2            2/20 = 0.10

Σf = 20             Σrf = 1.00

Histograms
Use the class boundaries and the
relative frequencies of the classes.

Frequency Polygons
The following is a frequency distribution of
miles run per week by 20 selected runners.
Class Boundaries   Class Midpoints   Relative Frequency
5.5 - 10.5 8 0.05
10.5 - 15.5 13 0.10
15.5 - 20.5 18 0.15
20.5 - 25.5 23 0.25
25.5 - 30.5 28 0.20
30.5 - 35.5 33 0.15
35.5 - 40.5 38 0.10

Frequency Polygons
Use the class midpoints and the
relative frequencies of the classes.

Ogives
The following is a frequency distribution of
miles run per week by 20 selected runners.
Class Boundaries   Frequency   Cumulative Frequency   Cum. Rel. Frequency
5.5 - 10.5 1 1 1/20 = 0.05
10.5 - 15.5 2 3 3/20 = 0.15
15.5 - 20.5 3 6 6/20 = 0.30
20.5 - 25.5 5 11 11/20 = 0.55
25.5 - 30.5 4 15 15/20 = 0.75
30.5 - 35.5 3 18 18/20 = 0.90
35.5 - 40.5 2 20 20/20 = 1.00
f = 20
Ogives
Ogives use upper class boundaries and
cumulative frequencies of the classes.
Class Boundaries   Cum. Rel. Frequency
Less than 5.5 0
Less than 10.5 0.05
Less than 15.5 0.15
Less than 20.5 0.30
Less than 25.5 0.55
Less than 30.5 0.75
Less than 35.5 0.90
Less than 40.5 1.00
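
The relative and cumulative relative frequencies for the runners' distribution can be sketched as follows. The function names are hypothetical, and `Fraction` is used here (an implementation choice, not something the slides prescribe) so that the running totals stay exact before being shown as decimals.

```python
from fractions import Fraction
from itertools import accumulate

def relative_frequencies(frequencies):
    """Each frequency divided by the total frequency, kept as an exact fraction."""
    n = sum(frequencies)
    return [Fraction(f, n) for f in frequencies]

def cumulative_relative_frequencies(frequencies):
    """Running totals of the relative frequencies."""
    return list(accumulate(relative_frequencies(frequencies)))

freqs = [1, 2, 3, 5, 4, 3, 2]   # miles run per week by 20 runners
rel = relative_frequencies(freqs)
cum = cumulative_relative_frequencies(freqs)
print([float(r) for r in rel])
print([float(c) for c in cum])
```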
Ogives
Use the upper class boundaries and the
cumulative relative frequencies.

H.W
• Exercises 2-3
• Page 61, No. 1, 3, 4
• Extending the concept, page 63, No. 19, 20

Shapes of Distributions

Other Types of Graphs
Stem and Leaf Plots
A stem and leaf plot is a data plot that uses part of a data
value as the stem and part of the data value as the leaf to
form groups or classes.

It has the advantage over a grouped frequency distribution of retaining the actual data while showing them in graphic form.

Section 2-3
Example 2-13
Page #80

At an outpatient testing center, the number of
cardiograms performed each day for 20 days is shown.
Construct a stem and leaf plot for the data.

25 31 20 32 13
14 43 2 57 23
36 32 33 32 44
32 52 44 51 45

Solution
Step 1 Arrange the data in order:
02, 13, 14, 20, 23, 25, 31, 32, 32, 32,
32, 33, 36, 43, 44, 44, 45, 51, 52, 57

Step 2 Separate the data according to the first digit, as shown.
02 13, 14 20, 23, 25 31, 32, 32, 32, 32, 33, 36
43, 44, 44, 45 51, 52, 57

Stem and Leaf Plot

Stem Leaf
0 2
1 3 4
2 0 3 5
3 1 2 2 2 2 3 6
4 3 4 4 5
5 1 2 7
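
The two steps above (sort the data, then split each value into a stem and a leaf) can be sketched in a few lines. The sketch assumes non-negative whole numbers below 100, so the tens digit is the stem and the units digit is the leaf; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Group sorted values by tens digit (stem); units digits are the leaves."""
    plot = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

cardiograms = [25, 31, 20, 32, 13, 14, 43, 2, 57, 23,
               36, 32, 33, 32, 44, 32, 52, 44, 51, 45]
for stem, leaves in stem_and_leaf(cardiograms).items():
    print(stem, "|", " ".join(map(str, leaves)))
```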

Section 2-3
Example 2-14
Page #82

An insurance company researcher conducted a survey on the number of car
thefts in a large city for a period of 30 days last summer. The raw data are
shown. Construct a stem and leaf plot by using classes 50–54, 55–59, 60–64,
65–69,70–74, and 75–79.
52 62 51 50 69
58 77 66 53 57
75 56 55 67 73
79 59 68 65 72
57 51 63 69 75
65 53 78 66 55

Solution
Step 1 Arrange the data in order:

Step 2 Separate the data according to the classes.

Stem and Leaf Plot

Stem Leaf
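
A minimal sketch of how the car-theft data could be separated into the classes 50–54, 55–59, and so on: each stem gets two rows, the first for leaves 0 to 4 and the second for leaves 5 to 9. The function name `split_stem_and_leaf` and the `(stem, half)` row key are assumptions made for illustration.

```python
def split_stem_and_leaf(data):
    """Two rows per stem: half 0 holds leaves 0-4, half 1 holds leaves 5-9."""
    rows = {}
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        key = (stem, 0 if leaf < 5 else 1)
        rows.setdefault(key, []).append(leaf)
    return rows

thefts = [52, 62, 51, 50, 69, 58, 77, 66, 53, 57,
          75, 56, 55, 67, 73, 79, 59, 68, 65, 72,
          57, 51, 63, 69, 75, 65, 53, 78, 66, 55]
for (stem, _), leaves in split_stem_and_leaf(thefts).items():
    print(stem, "|", " ".join(map(str, leaves)))
```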

Other Types of Graphs
❑ The Pie Graph:
Pie graphs are used extensively in statistics.
The purpose of the pie graph is to show the relationship of the parts to the whole.
The pie graph is used to represent a nominal or categorical variable.
A pie graph is a circle that is divided into sections according to the percentage of frequencies in each category of the distribution.
Example: Construct a pie graph showing the blood
types of the army inductees described in Example 2–1.
The frequency distribution is repeated here.
Step 3 Using a protractor, graph each section and write its name and
corresponding percentage, as shown in the following figure.
Example
The average amounts spent by college freshmen for
school items are shown. Construct a pie graph.
Electronics/computers $728
Dorm items $344
Clothing $ 141
Shoes $ 72
Solution:

Convert each amount to degrees and to a percentage of the total (n = 728 + 344 + 141 + 72 = 1285):

$$\text{Degrees} = \frac{f}{n} \times 360^\circ \qquad \text{Percent} = \frac{f}{n} \times 100\%$$

Item                     Degrees                      Percent
Electronics/computers    728/1285 × 360° ≈ 204°       728/1285 × 100% ≈ 56.7%
Dorm items               344/1285 × 360° ≈ 96°        344/1285 × 100% ≈ 26.8%
Clothing                 141/1285 × 360° ≈ 40°        141/1285 × 100% ≈ 11.0%
Shoes                    72/1285 × 360° ≈ 20°         72/1285 × 100% ≈ 5.6%
Step 3 Using a protractor, graph each section and write its name and
corresponding percentage, as shown in the following figure.
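
The conversion of each amount to degrees and to a percent can be sketched as below, using the spending figures from the example. The function name `pie_sections` is hypothetical, and the results are left unrounded so that the rounding step stays visible in the printout.

```python
def pie_sections(amounts):
    """Map each category to (degrees, percent) of the whole circle."""
    total = sum(amounts.values())
    return {name: (value / total * 360, value / total * 100)
            for name, value in amounts.items()}

spending = {"Electronics/computers": 728, "Dorm items": 344,
            "Clothing": 141, "Shoes": 72}
for name, (degrees, percent) in pie_sections(spending).items():
    print(f"{name:<22} {degrees:6.1f} degrees {percent:5.1f}%")
```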
H.W: page 85, Num. 10, 12; page 87, Num. 23.
Bar Graphs
When the data are qualitative or categorical, bar graphs can
be used to represent the data. A bar graph can be drawn
using either horizontal or vertical bars.
A bar graph represents the data by using vertical or
horizontal bars whose heights or lengths represent the
frequencies of the data.
Example:

Pareto Charts
A Pareto chart is used to represent a frequency
distribution for a categorical variable, and the
frequencies are displayed by the heights of vertical bars,
which are arranged in order from highest to lowest.
Example:

Solution
Step 1 Arrange the data from the largest to smallest
according to frequency.

Step 2 Draw and label the x and y axes.


Step 3 Draw the bars corresponding to the
frequencies.
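
Step 1 is the part of a Pareto chart that differs from an ordinary bar graph, and it can be sketched as a sort by frequency. The function name `pareto_order` and the category counts below are hypothetical placeholders, not the data from the example.

```python
def pareto_order(frequencies):
    """Categories sorted from highest to lowest frequency."""
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

# Hypothetical categories and counts, for illustration only
counts = {"Category A": 12, "Category B": 30, "Category C": 7}
for name, frequency in pareto_order(counts):
    print(f"{name:<12} {'#' * frequency} {frequency}")
```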

Pareto Charts

The graph shows that the number of homeless people is about the same for Atlanta and Chicago and a lot less for Baltimore and St. Louis.
The Time Series Graph
When data are collected over a period of time, they can
be represented by a time series graph.
Example
The number of homicides that occurred in the
workplace for the years 2003 to 2008 is shown. Draw a
time series graph for the data.

Solution

Step 1 Draw and label the x and y axes.


Step 2 Label the x axis for years and the y axis for the
number.
Step 3 Plot each point according to the table.
Step 4 Draw line segments connecting adjacent points.

There was a slight decrease in the years ’04, ’05, and
’06, compared to ’03, and again an increase in ’07. The
largest decrease occurred in ’08.

H.W
• Page 85, No. 10, 11, 12, 14
• Review Exercises, page 95, No. 1, 5, 6, 7, 8, 9, 10, 11, 12, 13, 20, 21, 22.
• Chapter Quiz, page 98.
