22 Chapter 4 Data Management
DATA MANAGEMENT
4.1 • Descriptive Statistics
Discrete
number of students present
number of red marbles in a jar
number of heads when flipping three coins
students’ grade level
Continuous
height of students in class
weight of students in class
time it takes to get to school
distance traveled between classes
Types of Statistical Data
1. Numerical data
These data have meaning as a measurement such as a
person’s height, weight, IQ, or blood pressure or shares
of stocks a person owns.
2. Categorical data
Categorical data represent characteristics such as a
person’s gender, marital status, hometown, or the
types of movies they like. Categorical data can take on
numerical values (such as “1” indicating male and “2”
indicating female) but those numbers don’t have
mathematical meaning.
Four Levels of Measurement
1) In-Person Interviews
Pros: In-depth; high degree of confidence in the data
Cons: Time-consuming, expensive, and can be dismissed as anecdotal
2) Mail Surveys
Pros: Can reach anyone and everyone, with no barrier
Cons: Expensive, data collection errors, lag time
3) Phone Surveys
Pros: High degree of confidence in the data collected; can reach almost anyone
Cons: Expensive, cannot be self-administered, need to hire an agency
4) Web/Online Surveys
Pros: Cheap, can be self-administered, very low probability of data errors
Cons: Not all of your customers may have an email address or be on the internet;
customers may be wary of divulging information online
Three Ways of Presenting Data
Definition: Frequency
Class      Frequency, f
1 – 4      4
5 – 8      5
9 – 12     3
13 – 16    4
17 – 20    2
The first number in each class is its lower class limit, the second is its
upper class limit, and the f column lists the frequencies.
Class      Frequency, f    Difference between consecutive lower limits
1 – 4      4
5 – 8      5               5 – 1 = 4
9 – 12     3               9 – 5 = 4
13 – 16    4               13 – 9 = 4
17 – 20    2               17 – 13 = 4
The class width is 4.
Guidelines
1. Decide on the number of classes to include. The number of
classes should be between 5 and 20; otherwise, it may be
difficult to detect any patterns. (Alternatively, use number of
classes ≈ √N, where N is the number of data entries.)
2. Find the class width as follows. Determine the range of the
data, divide the range by the number of classes, and round up
to the next convenient number.
3. Find the class limits. You can use the minimum entry as the
lower limit of the first class. To find the remaining lower limits,
add the class width to the lower limit of the preceding class.
Then find the upper class limits.
4. Make a tally mark for each data entry in the row of the
appropriate class.
5. Count the tally marks to find the total frequency f for each
class.
Example:
The following data represents the ages of 30 students in a
statistics class. Construct a frequency distribution that
has five classes.
Ages of Students
18 20 21 27 29 20
19 30 32 19 34 19
24 29 18 37 38 22
30 39 32 44 33 46
54 49 18 51 21 21
Larson & Farber, Elementary Statistics: Picturing the World, 3e 21
Constructing a Frequency Distribution Table
Example continued:
1. The number of classes (5) is stated in the problem.
2. The minimum data entry is 18 and the maximum is 54, so the
range is 54 – 18 = 36. Divide the range by the number of classes
and round up: 36 / 5 = 7.2, so the class width is 8.
3. The minimum data entry of 18 may be used for the
lower limit of the first class. To find the lower class
limits of the remaining classes, add the width (8) to each
lower limit.
The lower class limits are 18, 26, 34, 42, and 50.
The upper class limits are 25, 33, 41, 49, and 57.
4. Make a tally mark for each data entry in the
appropriate class.
Example continued:
Ages of Students
Class      Frequency, f (number of students)
18 – 25    13
26 – 33    8
34 – 41    4
42 – 49    3
50 – 57    2
           Σf = 30
Check that the sum equals the number in the sample.
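The guideline steps can be sketched in Python as follows. This is a minimal illustration for integer data only; the function name frequency_distribution is hypothetical, and "round up to the next convenient number" is assumed here to mean the next whole number above range / number of classes.

```python
ages = [18, 20, 21, 27, 29, 20,
        19, 30, 32, 19, 34, 19,
        24, 29, 18, 37, 38, 22,
        30, 39, 32, 44, 33, 46,
        54, 49, 18, 51, 21, 21]


def frequency_distribution(data, num_classes):
    """Return (class_width, [(lower_limit, upper_limit, frequency), ...])."""
    low, high = min(data), max(data)
    # Range divided by the number of classes, moved up to the next whole
    # number (36 / 5 = 7.2 becomes 8). If the quotient is already whole,
    # this still moves up by one so that the maximum entry fits in the last class.
    width = (high - low) // num_classes + 1
    table = []
    for i in range(num_classes):
        lower = low + i * width          # add the width to the previous lower limit
        upper = lower + width - 1        # upper limit for integer-valued data
        freq = sum(1 for x in data if lower <= x <= upper)  # the "tally"
        table.append((lower, upper, freq))
    return width, table


width, table = frequency_distribution(ages, 5)
for lower, upper, freq in table:
    print(f"{lower} - {upper}: {freq}")
print("class width:", width, "| sum of f:", sum(f for _, _, f in table))
```

The final line performs the same check as the table: the frequencies should add up to the number of entries in the sample.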
Class      Frequency, f    Relative Frequency
1 – 4      4               0.222
           Σf = 18
Relative frequency = f / n = 4 / 18 ≈ 0.222
Relative Frequency
Example:
Find the relative frequencies for the “Ages of Students”
frequency distribution.
Class      Frequency, f    Relative Frequency (portion of students)
18 – 25    13              0.433
26 – 33    8               0.267
34 – 41    4               0.133
42 – 49    3               0.1
50 – 57    2               0.067
           Σf = 30         Σ(f / n) = 1
For the first class, f / n = 13 / 30 ≈ 0.433.
Cumulative Frequency
The cumulative frequency of a class is the sum of the
frequency for that class and all the previous classes.
Ages of Students
Class      Frequency, f    Cumulative Frequency
18 – 25    13              13
26 – 33    + 8             21
34 – 41    + 4             25
42 – 49    + 3             28
50 – 57    + 2             30 (total number of students)
           Σf = 30
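As a rough sketch, the relative and cumulative frequency columns for the "Ages of Students" distribution can be computed from the class frequencies alone. The variable names below are illustrative, not part of the original slides.

```python
from itertools import accumulate

# Class frequencies for 18-25, 26-33, 34-41, 42-49, 50-57
frequencies = [13, 8, 4, 3, 2]
n = sum(frequencies)  # number of students in the sample

# Relative frequency of a class: f / n, rounded to three decimals as in the table
relative = [round(f / n, 3) for f in frequencies]

# Cumulative frequency: running total of the class and all previous classes
cumulative = list(accumulate(frequencies))

print(relative)
print(cumulative)
```

The last cumulative value should equal n, and the relative frequencies should sum to 1 (up to rounding).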
[Figure: histogram of Ages of Students. Horizontal axis: Age (in years), broken axis, class boundaries 17.5, 25.5, 33.5, 41.5, 49.5, 57.5. Vertical axis: frequency f. Bar heights: 13, 8, 4, 3, 2.]
Frequency Polygon
A frequency polygon is a line graph that emphasizes the
continuous change in frequencies.
[Figure: frequency polygon of Ages of Students. Horizontal axis: Age (in years), broken axis, class marks 13.5, 21.5, 29.5, 37.5, 45.5, 53.5, 61.5. Vertical axis: frequency f. The line is extended to the x-axis.]
[Figure: cumulative frequency graph of Ages of Students. Horizontal axis: Age (in years), class boundaries 17.5, 25.5, 33.5, 41.5, 49.5, 57.5. Vertical axis: cumulative frequency (portion of students), 0 to 30. The graph ends at the upper boundary of the last class.]
Pie Chart
A pie chart is a circle that is divided into sectors that
represent categories. The area of each sector is proportional
to the frequency of each category.
Accidental Deaths in the USA in 2002
Type                        Frequency
Motor Vehicle               43,500
Falls                       12,200
Poison                      6,400
Drowning                    4,600
Fire                        4,200
Ingestion of Food/Object    2,900
Firearms                    1,400
(Source: US Dept. of Transportation)
To create a pie chart for the data, find the relative frequency
(percent) of each category.
Type                        Frequency    Relative Frequency
Motor Vehicle               43,500       0.578
Falls                       12,200       0.162
Poison                      6,400        0.085
Drowning                    4,600        0.061
Fire                        4,200        0.056
Ingestion of Food/Object    2,900        0.039
Firearms                    1,400        0.019
                            n = 75,200
Next, find the central angle. To find the central angle,
multiply the relative frequency by 360°.
Type                        Frequency    Relative Frequency    Angle
Motor Vehicle               43,500       0.578                 208.2°
Falls                       12,200       0.162                 58.4°
Poison                      6,400        0.085                 30.6°
Drowning                    4,600        0.061                 22.0°
Fire                        4,200        0.056                 20.1°
Ingestion of Food/Object    2,900        0.039                 13.9°
Firearms                    1,400        0.019                 6.7°
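The relative frequencies and central angles can be reproduced with a short Python sketch. It assumes the angles are computed from the unrounded relative frequencies, which is what matches the tabulated values (for example, 43,500 / 75,200 × 360° ≈ 208.2°, whereas 0.578 × 360° ≈ 208.1°). The dictionary name deaths is illustrative.

```python
# Accidental deaths in the USA in 2002, by type
deaths = {
    "Motor Vehicle": 43500,
    "Falls": 12200,
    "Poison": 6400,
    "Drowning": 4600,
    "Fire": 4200,
    "Ingestion of Food/Object": 2900,
    "Firearms": 1400,
}

n = sum(deaths.values())  # total number of deaths in the table

relative = {}
angles = {}
for kind, f in deaths.items():
    rel = f / n                         # relative frequency (unrounded)
    relative[kind] = round(rel, 3)
    angles[kind] = round(rel * 360, 1)  # central angle in degrees

for kind in deaths:
    print(f"{kind:26s} {relative[kind]:.3f} {angles[kind]:6.1f}")
```

Because every category is included, the unrounded angles add up to 360°.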
[Figure: pie chart of Accidental Deaths in the USA in 2002. Motor Vehicle 58%, Falls 16%, Poison 8%, Drowning 6%, Fire 6%, Ingestion 4%, Firearms 2%.]
Scatter Plot
When each entry in one data set corresponds to an entry in
another data set, the sets are called paired data sets.
Absences (x)    Final grade (y)
8               78
2               92
5               90
12              58
15              43
9               74
6               81
[Figure: scatter plot of final grade (y), 40 to 100, against absences (x), 0 to 16.]
From the scatter plot, you can see that as the number of
absences increases, the final grade tends to decrease.
Time Series Chart
A data set that is composed of quantitative data entries
taken at regular intervals over a period of time is a time
series. A time series chart is used to graph a time series.
Example:
The following table lists the number of minutes Robert used on his cell
phone for the last six months.
Month       Minutes
January     236
February    242
March       188
April       175
May         199
June        135
Construct a time series chart for the number of minutes used.
[Figure: time series chart of Robert's cell phone minutes. Horizontal axis: Month (Jan to June). Vertical axis: Minutes.]
Population mean: μ = Σx / N
Solution:
μ = Σx / N = (53 + 32 + 61 + 57 + 39 + 44 + 57) / 7 = 343 / 7 = 49
Example 1:
Calculate the median age of the seven employees.
53 32 61 57 39 44 57
Example 1:
Find the mode of the ages of the seven employees.
53 32 61 57 39 44 57
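As a quick check, the mean, median, and mode of the seven employees' ages can be computed with Python's standard statistics module. This is only an illustrative sketch; the variable names are not from the slides.

```python
import statistics

ages = [53, 32, 61, 57, 39, 44, 57]

mean_age = statistics.mean(ages)      # sum of the entries divided by their count
median_age = statistics.median(ages)  # middle entry of the sorted data
mode_age = statistics.mode(ages)      # entry that occurs most often

print("sorted:", sorted(ages))
print("mean:", mean_age, "median:", median_age, "mode:", mode_age)
```

Sorting the data gives 32, 39, 44, 53, 57, 57, 61, so the middle (fourth) entry is the median, and 57 is the only value that appears twice.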
Example:
Grades in a statistics class are weighted as follows:
Tests are worth 50% of the grade, homework is worth 30% of the
grade and the final is worth 20% of the grade. A student receives a
total of 80 points on tests, 100 points on homework, and 85 points
on his final. What is his current grade?
Weighted Mean
x̄ = Σ(x · w) / Σw = 87 / 100 = 0.87
The student's current grade is 87%.
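The weighted mean for the grade example can be sketched as below. The weights are entered as percentages (50, 30, 20) and the scores as points out of 100, which is one assumed way of setting up the computation; the function name weighted_mean is illustrative.

```python
def weighted_mean(values, weights):
    """Return sum(x * w) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)


scores = [80, 100, 85]   # tests, homework, final (points out of 100)
weights = [50, 30, 20]   # percentage of the grade carried by each score

grade = weighted_mean(scores, weights)
print(grade)  # grade in points out of 100
```

Dividing by the sum of the weights means the same function works whether the weights are written as 50, 30, 20 or as 0.5, 0.3, 0.2.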
Example:
The following data are the closing prices for a certain stock
on ten successive Fridays. Find the range.
Stock 56 56 57 58 61 63 63 67 67 67
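A minimal sketch of the range computation for the stock prices, assuming the range is defined as the maximum entry minus the minimum entry:

```python
# Closing prices on ten successive Fridays
stock = [56, 56, 57, 58, 61, 63, 63, 67, 67, 67]

# Range = maximum data entry - minimum data entry
price_range = max(stock) - min(stock)
print(price_range)
```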
Sample variance: s² = Σ(x – x̄)² / (n – 1)
Guidelines
In Words                                        In Symbols
1. Find the mean of the population data set.    μ = Σx / N
Guidelines
In Words                                        In Symbols
1. Find the mean of the sample data set.        x̄ = Σx / n
Example 1:
The following data are the closing prices for a certain stock on five
successive Fridays. The population mean is 61. Find the population
standard deviation.
Always positive!
mathsisfun.com
Solution:
A) Mean:
x̄ = (60 + 47 + 17 + 43 + 30) / 5 = 197 / 5 = 39.4
Solution:
B) Sample variance:
s² = Σ(x – x̄)² / (n – 1) = 1085.2 / 4 = 271.3
So, using the standard deviation we have a "standard"
way of knowing what is normal, and what is extra large
or extra small.
Sample standard deviation: s = √271.3 ≈ 16.47
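The sample variance and sample standard deviation for the data 60, 47, 17, 43, 30 can be reproduced step by step. This sketch assumes the sample formulas (dividing by n – 1) and cross-checks the manual result against Python's statistics module.

```python
import math
import statistics

data = [60, 47, 17, 43, 30]
n = len(data)

mean = sum(data) / n                             # sample mean
squared_devs = [(x - mean) ** 2 for x in data]   # squared deviations from the mean
variance = sum(squared_devs) / (n - 1)           # sample variance divides by n - 1
std_dev = math.sqrt(variance)                    # sample standard deviation

print(round(mean, 1), round(variance, 1), round(std_dev, 2))

# Cross-check with the standard library's sample variance and deviation
assert math.isclose(variance, statistics.variance(data))
assert math.isclose(std_dev, statistics.stdev(data))
```

For a population rather than a sample, the divisor would be N instead of n – 1 (statistics.pvariance and statistics.pstdev in the standard library).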
4.6 Testing a Statistical Hypothesis
Level of Significance
• The level of significance, denoted by alpha (α), is the threshold
used to decide whether to reject the null hypothesis.
• 100% accuracy is not possible in accepting or rejecting a
hypothesis.
• The significance level is also the probability of rejecting the
null hypothesis when it is true.
Types of Statistical Hypothesis
Types of Statistical Tests
Examples: H₁: x ≠ 8
Possibilities in the Decision Procedure
1) Type I error
The null hypothesis is rejected when in fact it is true.
2) Type II error
The null hypothesis is accepted when in fact it is false.
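The four possible outcomes of the decision procedure can be summarized with a small Python sketch. The function name decision_outcome is hypothetical and serves only to restate the two definitions above.

```python
def decision_outcome(null_is_true: bool, reject_null: bool) -> str:
    """Classify the result of a hypothesis test decision."""
    if null_is_true and reject_null:
        return "Type I error"      # rejected a true null hypothesis
    if not null_is_true and not reject_null:
        return "Type II error"     # accepted a false null hypothesis
    return "correct decision"


# Enumerate every combination of truth and decision
for null_is_true in (True, False):
    for reject_null in (True, False):
        print(null_is_true, reject_null, decision_outcome(null_is_true, reject_null))
```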
Constructing the Null and Alternative Hypothesis