CHAPTER 6 - Basic Statistical Concepts

BTV3413 –

INDUSTRIAL QUALITY CONTROL

BASIC STATISTICAL CONCEPTS

SLIDE | 1
LEARNING OUTCOMES

• To review basic statistical concepts
• To understand how to graphically and analytically study a process by using statistics
• To know how to create and interpret a frequency diagram and a histogram
• To know how to calculate the mean, median, mode, range and standard deviation for
a given set of numbers
• To understand the importance of the normal curve and the central limit theorem in
quality assurance.
• To know how to find the area under a curve using the standard normal probability
distribution (tables)
• To understand how to interpret the information analyzed

SLIDE | 2
Introduction

• Statistics is the collection, tabulation, analysis, interpretation and presentation
of analytical data, which provides a viable method of supporting or clarifying a
topic under discussion.

• In industry, the statistical representation of data is the foundation for quality
assurance and quality improvement processes if correctly used. It can be used
to support decisions to change a process or to pursue a particular course of
action.

• Assumptions based on incomplete information can lead to incorrect decisions,
unwise investments and an uncomfortable working environment.

SLIDE | 3
Population Vs Samples

• The entire group of individuals is called the population

• For example, a researcher may be interested in the relation between class
size (variable 1) and academic performance (variable 2) for the population of
third-grade children.

• Usually, populations are so large that a researcher cannot examine the entire
group.

• Therefore, a sample is selected to represent the population in a research
study. The goal is to use the results obtained from the sample to help answer
questions about the population.

SLIDE | 4
Variables

• In quality control, two types of numerical data can be collected: Variable data
and attribute data.

• A variable is a characteristic or condition that can change or take on different
values.

• Attribute data are those quality characteristics that are observed to be either
present or absent, conforming or nonconforming.

• Although both variable and attribute data can be described by numbers,
attribute data are countable, not measurable.

• Attribute data are always whole numbers because they count the presence or
absence of a chosen characteristic.

SLIDE | 5
SLIDE | 6
Statistic

• A sample is a subset of elements or measurements taken from a population.

• Descriptive or deductive statistics describe a population or complete group of
data. When describing a population using deductive statistics, the investigator
must study each entity within the population. This provides a great deal of
information about the population, product, or process, but gathering the
information is time-consuming.

• Inductive statistics deal with a limited amount of data or a representative
sample of the population.

• Measurement error is the difference between a value measured and the true
value. The error that occurs is one of either accuracy or precision.

SLIDE | 7
Statistic

• Accuracy refers to how far the measurement is from the actual or real value

• Precision is the ability to repeat a series of measurements and get the same
value each time (sometimes referred to as repeatability)
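
The difference between the two kinds of error can be put into numbers. The sketch below is only an illustration: the gauge readings and the 10.00 mm reference value are hypothetical, and using the mean offset as the accuracy error and the sample standard deviation as the precision measure is one common convention, not the only one.

```python
from statistics import mean, stdev

def accuracy_and_precision(measurements, true_value):
    """Return (bias, spread) for repeated measurements of one known reference.

    bias   = mean of the measurements minus the true value (accuracy error)
    spread = sample standard deviation of the measurements (repeatability)
    """
    bias = mean(measurements) - true_value
    spread = stdev(measurements)
    return bias, spread

# Hypothetical readings of a 10.00 mm reference block taken with two gauges.
gauge_a = [10.21, 10.19, 10.20, 10.20, 10.21]  # tightly grouped, but off target
gauge_b = [9.90, 10.12, 10.05, 9.88, 10.05]    # centred on target, but scattered

bias_a, spread_a = accuracy_and_precision(gauge_a, 10.00)
bias_b, spread_b = accuracy_and_precision(gauge_b, 10.00)

print(f"Gauge A: bias {bias_a:+.3f} mm, spread {spread_a:.3f} mm")
print(f"Gauge B: bias {bias_b:+.3f} mm, spread {spread_b:.3f} mm")
```

Under these made-up readings, gauge A is precise but not accurate, while gauge B is accurate on average but not precise.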

SLIDE | 8
SLIDE | 9
Statistic : Frequency diagram
• A frequency diagram shows the number of times each of the measured values
occurred when the data were collected. This diagram can be created either
from measurements taken from a process or from data taken from the
occurrences of events.

SLIDE | 10
Statistic : Frequency diagram
To create :
1. Collect the data. Record the measurements or counts of the characteristics of
interest.
2. Count the number of times each measurement or count occurs.
3. Construct the diagram by placing the counts or measured values on the x axis and
the frequency or number of occurrences on the y axis. The x axis must contain each
possible measurement value from the lowest to the highest, even if a particular
value does not have any corresponding measurements. A bar is drawn on the
diagram to depict each of the values and the number of times the value occurred in
the data collected.
4. Interpret the frequency diagram. Study the diagrams you create and think about the
diagram’s shape, size, and location in terms of the desired target specification.
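
Steps 1 to 3 can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used for the clutch plate study: the coded thickness values are hypothetical, and the helper names `frequency_table` and `print_diagram` are invented for the example. Note that every value between the lowest and highest is listed, even when its count is zero.

```python
from collections import Counter

def frequency_table(values):
    """Count how often each integer value occurs, from lowest to highest.

    Values with no occurrences are kept with a frequency of 0 so that the
    x axis contains every possible value in the observed range.
    """
    counts = Counter(values)
    lowest, highest = min(values), max(values)
    return [(v, counts[v]) for v in range(lowest, highest + 1)]

def print_diagram(table):
    """Draw a simple text bar for each value (x axis) and its frequency."""
    for value, freq in table:
        print(f"{value:>4} | {'#' * freq} ({freq})")

# Hypothetical coded measurements (for example, the last two digits of a thickness).
coded = [25, 26, 24, 26, 27, 25, 26, 29, 26, 25]

table = frequency_table(coded)
print_diagram(table)
```

Step 4, the interpretation, still has to be done by a person comparing the shape and location of the bars with the target specification.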

SLIDE | 11
Example : clutch plate grouped data for thickness

To respond to customer issues, the engineers
involved in the clutch plate problem are
studying the thickness of the part. To gain a
clearer understanding of incoming material
thickness, they plan to create a frequency
diagram for the grouped data shown in the
table. The first step is performed by the
operator, who randomly selects five parts
each hour, measures the thickness of each
part and records the values.

SLIDE | 12
Data analysis : graphical

[Figure: Clutch plate thickness tally sheet]
[Figure: Clutch plate thickness frequency distribution (coded 0.06)]

SLIDE | 13
Statistic : Histogram
• Similar to frequency diagrams.
❑ The most notable difference between the two is that on a histogram the data are
grouped into cells. Each cell contains a range of values.
Step 1: Collect the data and construct a tally sheet
Step 2: Calculate the range
Step 3: Create the cells by determining the cell intervals, midpoints, and
boundaries
Step 4: Label the axes
Step 5: Post the values
Step 6: Interpret the histogram
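
Steps 2 to 5 can be sketched as follows. This is a simplified illustration under stated assumptions: the number of cells is passed in by the caller (textbook procedures usually pick it from a rule of thumb), the cells are equal-width and start at the lowest value, and the data and the function name `build_cells` are hypothetical.

```python
def build_cells(values, n_cells):
    """Group values into equal-width cells and count the values in each cell.

    Returns the range and a list of cells, each with its lower boundary,
    upper boundary, midpoint and count.
    """
    low, high = min(values), max(values)
    data_range = high - low                      # Step 2: calculate the range
    if data_range == 0:
        raise ValueError("all values are identical; no cells can be formed")
    width = data_range / n_cells                 # Step 3: cell interval

    cells = []
    for i in range(n_cells):
        lower = low + i * width
        upper = lower + width
        cells.append({"lower": lower, "upper": upper,
                      "midpoint": (lower + upper) / 2, "count": 0})

    for v in values:                             # Step 5: post the values
        # The highest value falls on the last upper boundary; keep it in the last cell.
        index = min(int((v - low) / width), n_cells - 1)
        cells[index]["count"] += 1
    return data_range, cells

# Hypothetical measurements.
data = [1, 2, 2, 3, 4, 4, 4, 5, 6, 7, 8, 9]
data_range, cells = build_cells(data, n_cells=4)

for cell in cells:
    print(f"{cell['lower']:.1f} to {cell['upper']:.1f} "
          f"(midpoint {cell['midpoint']:.1f}): {'#' * cell['count']}")
```

A different choice of cell count or starting boundary changes the picture, which is why the cell interval is chosen before the values are posted.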

SLIDE | 14
Statistic : Histogram

SLIDE | 15
Statistic : Histogram
• Analyze histogram by studying :

❑ Shape

❑ Location

❑ Spread

SLIDE | 16
Statistic : Histogram
• Shape : the form that the values of the measurable characteristics take on
when graphed. Shape is based on the distribution’s symmetry, skewness, and
kurtosis

• Location : Where is the distribution in relation to the target?

• Spread : the distance between the highest and lowest values.

SLIDE | 17
Statistic : Histogram
• Analytical methods of describing histograms exist.

• Although shape is easily seen from a picture, location and spread can be
more clearly identified mathematically

• Location is described by measures of central tendency: the mean, mode, and
median. Spread is defined by measures of dispersion: the range and standard
deviation.

SLIDE | 18
Statistic : Histogram
• Mathematical description of histogram : measures of central tendency:

• Mean – is determined by adding the values together and then dividing this sum
by the total number of values

• Median – is the value that divides an ordered series of numbers so that there is
an equal number of values on either side of the center, or median value

• Mode – is the most frequently occurring number in a group of values.

SLIDE | 19
Statistic : Histogram
• Mathematical description of histogram : measures of dispersion:

• Range – is the difference between the highest value in a series of values or
sample and the lowest value in the same series

• Standard deviation – shows the dispersion of the data within the distribution

SLIDE | 20
Mean = average

SLIDE | 21
Median = middle
• Put numbers in order from lowest to highest and find the number that is exactly
in the middle

1, 10, 10, 10, 15, 20

• Since there is an even number of values, the median is 10 years (the average of
the 2 middle values, 10 and 10)

• For an odd number of values, the median is the number in the middle.
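
The even and odd cases can be checked with a short sketch. The function name `manual_median` is invented for the illustration; the six values are the ones used on this slide, and the five-value list is the one used for the range example. Python's built-in `statistics.median` is used only as a cross-check.

```python
from statistics import median

def manual_median(values):
    """Sort the values, then take the middle one (odd count)
    or the average of the two middle ones (even count)."""
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

even_case = [20, 15, 10, 10, 10, 1]   # six values: average the two middle values
odd_case = [20, 15, 10, 10, 1]        # five values: take the single middle value

print(manual_median(even_case), median(even_case))
print(manual_median(odd_case), median(odd_case))
```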

SLIDE | 22
Mode = most frequently occurring number
• Number in data set that occurs most often : 20, 15, 10, 10, 10, 1 (mode = 10)

• Sometimes there will not be a mode : 20, 17, 15, 8, 3

• Record answer as “none” or “no mode” – NOT “0”

• Sometimes there will be more than one mode
• 20, 15, 15, 10, 10, 1 (two modes: 15 and 10)
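
The three situations (one mode, no mode, more than one mode) can be told apart by counting. The sketch below is an illustration only: the function name `modes` is invented, the rule "no mode when every value occurs exactly once" follows the slide, and the last list is a hypothetical example of a data set with two modes.

```python
from collections import Counter

def modes(values):
    """Return a sorted list of the most frequently occurring values.

    An empty list means "no mode": every value occurs exactly once.
    """
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([20, 15, 10, 10, 10, 1]))   # one mode
print(modes([20, 17, 15, 8, 3]))        # no mode: report "none", not 0
print(modes([20, 15, 15, 10, 10, 1]))   # hypothetical list with two modes
```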

SLIDE | 23
Range = difference between the highest and lowest number

20, 15, 10, 10, 1

20-1 = 19 years

• The range tells you how spread out the data points are.

SLIDE | 24
Example

The mean of four numbers is 50.5

101 99 1 1

• What is the median?

• What is the mode?
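
One way to check the answers is to compute all three measures with Python's standard `statistics` module. The sketch only uses the four numbers given above; it shows how far apart the measures of central tendency can be when the values cluster at two extremes.

```python
from statistics import mean, median, mode

numbers = [101, 99, 1, 1]

average = mean(numbers)    # sum of the values divided by how many there are
middle = median(numbers)   # average of the two middle values of 1, 1, 99, 101
most_common = mode(numbers)

print(f"mean = {average}, median = {middle}, mode = {most_common}")
```

None of the three results is close to a typical value in this data set, which is why location should be read together with spread.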

SLIDE | 25
Measured Values

• When making a set of repetitive measurements, the standard deviation
(S.D.) can be determined to

❑ indicate how much the samples differ from the mean

❑ indicate how spread out the values of the samples are

Variance vs standard deviation? The standard deviation is the square root of the variance.

SLIDE | 26
Standard Deviation

• The smaller the standard deviation, the higher the quality of the measuring
instrument and your technique

• A small standard deviation also indicates that the data points are fairly close
together, with a small value for the range.

• It indicates that your measurements were precise.

SLIDE | 27
Standard Deviation

A high or large standard deviation

• Indicates that the values or measurements are not similar

• There is a high value for the range

• Indicates a low level of precision (you didn’t make measurements that were
close to the same)

• The standard deviation will be “0” if all the values or measurements are the
same.

SLIDE | 28
SLIDE | 29
Example

You and your friends have just measured the heights of your dogs (in
millimeters):

The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and
300mm.

Find out the Mean, the Variance, and the Standard Deviation.

SLIDE | 30
• Our example was for a Population (the 5 dogs were the only dogs we were
interested in).

• But if the data is a Sample (a selection taken from a bigger Population), then
the calculation changes!

• When you have "N" data values that are:

• The Population: divide by N when calculating Variance (like we did)

• A Sample: divide by N-1 when calculating Variance

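The "Population Standard Deviation":

\[ \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2} \]

The "Sample Standard Deviation":

\[ s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2} \]

Here \mu is the population mean and \bar{x} is the sample mean.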
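
The effect of dividing by N or by N-1 can be seen by running both calculations on the same numbers. The sketch below uses the five dog heights from the example and Python's standard `statistics` module, where `pvariance`/`pstdev` divide by N and `variance`/`stdev` divide by N-1. Treating the same five dogs as a sample is only for comparison.

```python
from statistics import pstdev, pvariance, stdev, variance

heights_mm = [600, 470, 170, 430, 300]

# The five dogs are the whole population: divide by N.
population_variance = pvariance(heights_mm)
population_sd = pstdev(heights_mm)

# The five dogs are a sample from a bigger population: divide by N - 1.
sample_variance = variance(heights_mm)
sample_sd = stdev(heights_mm)

print(f"population: variance {population_variance}, S.D. {population_sd:.2f} mm")
print(f"sample:     variance {sample_variance}, S.D. {sample_sd:.2f} mm")
```

The sample figures are always the larger of the two, and the gap shrinks as N grows.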

SLIDE | 31
Example :

The frequency table of the monthly salaries of 20 people is shown below

Salary ($)   Frequency
3500         5
4000         8
4200         5
4300         2

a) Calculate the mean of the salaries of the 20 people

b) Calculate the standard deviation of the salaries of the 20 people

SLIDE | 32
a) Mean = (3500 * 5 + 4000 * 8 + 4200 * 5 + 4300 * 2) / 20 = 79100 / 20 = $3955

b) The 20 people are treated as the whole population, so divide by N = 20.

Variance = [(3500 - 3955)^2 * 5 + (4000 - 3955)^2 * 8 + (4200 - 3955)^2 * 5 +
(4300 - 3955)^2 * 2] / 20 = [(-455)^2 * 5 + (45)^2 * 8 + (245)^2 * 5 + (345)^2 *
2] / 20 = (207025 * 5 + 2025 * 8 + 60025 * 5 + 119025 * 2) / 20 = 1589500 / 20 = 79475

Standard deviation = √(Variance) = √(79475) ≈ $281.91
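
Hand arithmetic on a frequency table is easy to get wrong, so it is worth recomputing it. The sketch below is a minimal check, assuming the 20 people are the whole population (divide by N); the function name `grouped_stats` is invented for the illustration.

```python
from math import sqrt

# Salary ($) -> number of people earning it, taken from the table above.
salary_freq = {3500: 5, 4000: 8, 4200: 5, 4300: 2}

def grouped_stats(freq):
    """Return (n, mean, variance, standard deviation) for a frequency table.

    Each value is weighted by its frequency; the variance divides by n,
    i.e. the data are treated as a population.
    """
    n = sum(freq.values())
    mean = sum(x * f for x, f in freq.items()) / n
    variance = sum(f * (x - mean) ** 2 for x, f in freq.items()) / n
    return n, mean, variance, sqrt(variance)

n, mean_salary, variance, sd = grouped_stats(salary_freq)
print(f"n = {n}, mean = {mean_salary}, variance = {variance}, S.D. = {sd:.2f}")
```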

SLIDE | 33
• Mean = (600 + 470 + 170 + 430 + 300)/5 = 1970/5 = 394

• Now we calculate each dog's difference from the Mean: 206, 76, -224, 36, -94

• To calculate the Variance, take each difference, square it, and then average
the results: (206^2 + 76^2 + (-224)^2 + 36^2 + (-94)^2)/5 = 108,520/5 = 21,704

• The standard deviation = sqrt(21,704) ≈ 147

• Now, we can show which heights are within one S.D (147) of the mean.

• Using the S.D, we have a standard way of knowing what is normal, and what is
extra large or extra small
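
The same steps, including the check of which heights lie within one S.D. of the mean, can be written out as a short sketch. It assumes the five dogs are the whole population, as in the example.

```python
from math import sqrt

heights = [600, 470, 170, 430, 300]   # shoulder heights in mm

mean_height = sum(heights) / len(heights)
differences = [h - mean_height for h in heights]            # difference from the mean
variance = sum(d ** 2 for d in differences) / len(heights)  # average squared difference
sd = sqrt(variance)

# Heights no further than one standard deviation from the mean.
within_one_sd = [h for h in heights if abs(h - mean_height) <= sd]
outside_one_sd = [h for h in heights if abs(h - mean_height) > sd]

print(f"mean = {mean_height}, variance = {variance}, S.D. = {sd:.0f}")
print("within one S.D.:", within_one_sd)
print("outside one S.D.:", outside_one_sd)
```

With these numbers, the 600 mm dog is unusually tall and the 170 mm dog is unusually short.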

SLIDE | 34
The graphs show three normal distributions with the same
mean, but the taller graph is less “spread out.”
Therefore, the data represented by the taller graph has a
smaller standard deviation

SLIDE | 35
Statistic : Central Limit Theorem
• The central limit theorem states that a group of sample averages tends to be
normally distributed; as the sample size n increases, this tendency toward
normality improves.

• This means that the population from which the samples are taken does not
need to be normally distributed for the sample averages to tend to be normally
distributed.

• In the field of quality, the central limit theorem supports the use of sampling to
analyze the population. The mean of the sample averages will approximate
the mean of the population.
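
A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions: the population is a hypothetical, strongly skewed (exponential) set of 10,000 values, samples are drawn with replacement, and the seed, sample sizes and number of trials are arbitrary choices.

```python
import random
from statistics import stdev

def average(values):
    return sum(values) / len(values)

def sample_means(population, n, trials, rng):
    """Draw `trials` random samples of size n and return their averages."""
    return [average(rng.choices(population, k=n)) for _ in range(trials)]

rng = random.Random(42)   # fixed seed so the run is repeatable

# Hypothetical population that is clearly NOT normally distributed.
population = [rng.expovariate(1.0) for _ in range(10_000)]

means_n5 = sample_means(population, n=5, trials=1000, rng=rng)
means_n50 = sample_means(population, n=50, trials=1000, rng=rng)

print(f"population mean:         {average(population):.3f}")
print(f"mean of averages (n=5):  {average(means_n5):.3f}, spread {stdev(means_n5):.3f}")
print(f"mean of averages (n=50): {average(means_n50):.3f}, spread {stdev(means_n50):.3f}")
```

The mean of the sample averages stays close to the population mean, and the averages cluster more tightly as n increases. Plotting a histogram of `means_n50` would show the bell shape.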

SLIDE | 36
Statistic : Normal Frequency Distribution
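• The normal frequency distribution, the familiar bell-shaped curve, is
commonly called a normal curve. A normal frequency distribution is
described by the normal density function:

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty \]

where µ is the mean and σ is the standard deviation.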

SLIDE | 37
Statistic : Normal Frequency Distribution
• Normal Frequency Distribution (the Normal Curve)
❑ A normal curve is symmetrical about µ
❑ The mean, mode, and median are equal
❑ The curve is unimodal and bell-shaped
❑ Data values concentrate around the mean and decrease in number further
away
❑ The area under the normal curve equals 1
❑ The distribution can be described in terms of the mean and standard
deviation

SLIDE | 38
Percentage of Measurements Falling Within
Each Standard Deviation
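
The percentages for one, two and three standard deviations can be computed rather than read from a figure. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `share_within` is invented for the illustration. Values read from a printed Z table may differ in the last digit because table entries are rounded.

```python
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, standard deviation 1

def share_within(k):
    """Fraction of a normal population lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} S.D.: {share_within(k) * 100:.2f} %")
```

Because the curve is symmetrical about the mean, half of each share lies on either side of it.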

SLIDE | 39
Standard Normal Probability Distribution : Z tables

SLIDE | 40
Statistic : Normal Frequency Distribution
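• To find the area under the normal curve, convert the value of interest X_i to a
standard Z value and look up the corresponding area in the Z table:

\[ Z = \frac{X_i - \bar{X}}{s} \]

where \bar{X} is the average and s is the standard deviation.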

SLIDE | 41
Statistic : Normal Frequency Distribution

SLIDE | 42
Example

The engineers working with the clutch plate thickness data have determined that their
data approximates a normal curve. They would like to determine what percentage of
parts from the samples taken is below 0.0624 inch and above 0.0629 inch.

They calculated an average of 0.0627 and a standard deviation of 0.00023. They used
the Z tables to determine the percentage of parts under 0.0624 inch thick.

SLIDE | 43
• Area = 0.0968 or 9.68 percent of the parts are thinner than 0.0624 inch
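
The Z-table lookup can be reproduced with `statistics.NormalDist` from the Python standard library. This is a sketch, not the engineers' own working: the Z values are rounded to two decimals to mimic a printed table, and the share above 0.0629 inch is computed here only because the question asks for it.

```python
from statistics import NormalDist

average_thickness = 0.0627   # inch
sd_thickness = 0.00023       # inch

def z_value(x, average, sd):
    """Standardize x: how many standard deviations it lies from the average."""
    return (x - average) / sd

standard_normal = NormalDist()   # mean 0, standard deviation 1

# Share of parts thinner than 0.0624 inch (area to the left of Z).
z_low = round(z_value(0.0624, average_thickness, sd_thickness), 2)
area_below = standard_normal.cdf(z_low)

# Share of parts thicker than 0.0629 inch (area to the right of Z).
z_high = round(z_value(0.0629, average_thickness, sd_thickness), 2)
area_above = 1 - standard_normal.cdf(z_high)

print(f"Z = {z_low}: {area_below:.4f} of the parts are below 0.0624 inch")
print(f"Z = {z_high}: {area_above:.4f} of the parts are above 0.0629 inch")
```

Without the rounding of Z, the computed areas differ slightly from the table values.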

SLIDE | 44
Exercise

The Rockwell hardness of specimens of an alloy shipped by your supplier
varies according to a normal distribution with mean 70 and standard deviation 3.
Specimens are acceptable for machining only if their hardness is greater than
65. What percentage of specimens will be acceptable? Draw the normal curve
diagram associated with this problem.

SLIDE | 45
SLIDE | 46
