Applied Statistics For QA/QC, MFG, and R+D Advanced Applications
Workshop in
Applied Statistics for
QA/QC, Mfg, and R+D
Part 3 of 3:
Advanced Applications
Instructor: John Zorich
www.JOHNZORICH.COM JOHNZORICH@YAHOO.COM
Methods of extrapolating Cumulative %
Not possible with K-tables:
-- unfinished experiments
-- data that can't be normalized
-- data from 2 different populations
-- data with many duplicates
Transformed DATA, as we shall see next...
Actual data from presenter's client...
continued from previous slide...
In reliability statistics textbooks, a plot like this, or one that is not even as straight as this, is sometimes shown as an example of a "Normal" distribution; but...
even tho this data does pass the best tests for Normality (Anderson-Darling A2*, Cramer-von Mises W2*, and Shapiro-Francia W'), with test p-values all > 0.425, ...
and even tho the correlation coefficient is very high...
this plot is slightly curved; and therefore this data is not truly Normal (it is almost Normal).
This is the Excel equivalent of a Normal Probability Plot (data is Normal if it shows as a straight line on this plot).
Is "almost" good enough?
continued from previous slide...
Using Reliability Plotting.xls
Because this data does NOT form a straight line on NPP paper, it's not valid to use K-factor tables.
(plotting Z(F) vs. X(untransformed) is equal to Normal Probability Plotting paper)
Burst strength
Also, notice that the data includes
many replicate values...
Burst strength
...and notice that on a basic cumulative plot
( = F(untransformed) vs. X(untransformed) )
the data seem to include 2 different populations:
A single population would look like a smooth "S" curve. This one has a break, or corner, in it, indicating a dual population.
To use Reliability Plotting, must "censor" these data (they appear as a shoulder on a line chart -- see next slide).
(continued from previous slide)
[Frequency Distribution chart]
Reproducibility (99%) = 5.15 x StdDev( 4.63, 4.42, 4.65 )
Gage R&R (99%) = sqrt( Reproducibility² + Repeatability² )
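The two formulas above can be sketched in a few lines of Python. The three appraiser averages come from the slide; the Repeatability value below is a hypothetical stand-in (in a real study it would come from the within-appraiser variation):

```python
import math
import statistics

# 99% spread factor used on the slide (5.15 sigma covers ~99% of a Normal)
SPREAD_99 = 5.15

# The three appraiser averages from the slide
appraiser_means = [4.63, 4.42, 4.65]

# Reproducibility (99%) = 5.15 x StdDev of the appraiser averages
reproducibility = SPREAD_99 * statistics.stdev(appraiser_means)

# Hypothetical repeatability value, for illustration only
repeatability = 0.50

# Gage R&R (99%) = sqrt( Reproducibility^2 + Repeatability^2 )
gage_rr = math.sqrt(reproducibility**2 + repeatability**2)
```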
Gage Correlation
A "Gage Correlation" study typically is used to compare
measurements of identical parts by 2 different companies ---
for example, by the Supplier of the part, and their Customer.
One practical use is to validate that the Supplier gets the "same" answer as the Customer, thus justifying the Customer's use of Supplier-provided QC data rather than performing QC itself (the part could then justifiably go "dock to stock").
In a Gage Correlation study, we identify the...
Linear Regression relationship between the measurements
by the 2 companies
Correlation Coefficient for that linear regression
Offset Values that could be used to "correct" any identified
differences in measurement between the 2 companies.
Such a study could also evaluate R&D vs. Pilot Production, or
Pilot Production vs. Manufacturing, in the same company.
Gage Correlation
This is an example of a data
input table for a simple
Gage Correlation study
(a complicated one would
involve 3 or more gages).
In this case, each "Set #" is
a unique part to be
measured by "Gage # 1"
at one company
(or department), and
"Gage # 2" at the other
company (or department).
Altho not identical, the
measurements from the 2
companies look to be
linearly related and to be
highly correlated.
For Gage #1 to read like Gage #2, multiply each Gage #1 result by 0.998 and then subtract 4.7732 from that result...
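The regression behind a Gage Correlation study can be sketched with ordinary least squares. The paired data below are hypothetical, so the fitted slope and intercept are illustrative rather than the slide's 0.998 and 4.7732:

```python
import math

# Hypothetical paired measurements of the same parts by two gages
gage1 = [100.2, 101.5, 99.8, 102.3, 100.9]
gage2 = [95.5, 96.7, 95.1, 97.4, 96.1]

n = len(gage1)
mean1 = sum(gage1) / n
mean2 = sum(gage2) / n

# Least-squares slope and intercept for predicting Gage #2 from Gage #1
sxx = sum((x - mean1) ** 2 for x in gage1)
sxy = sum((x - mean1) * (y - mean2) for x, y in zip(gage1, gage2))
slope = sxy / sxx
intercept = mean2 - slope * mean1

# Correlation coefficient for that linear regression
syy = sum((y - mean2) ** 2 for y in gage2)
r = sxy / math.sqrt(sxx * syy)

# "Offset-corrected" Gage #1 readings, on the Gage #2 scale
corrected = [slope * x + intercept for x in gage1]
```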
Gage Bias
A "Gage Bias" study is, in effect, a one-point Gage Linearity Study, in which either a gold-standard ("Reference") calibrator or a gold-standard ("Reference") gage is used. The difference between the on-test gage and the gold-standard gage or calibrator is considered the "bias".
QC Sampling Plans
(are they worth the effort?)
What people say about why they use
traditional AQL sampling plans is...
"FDA / ISO auditors won't ask any challenging questions."
That is true, for field auditors (= untrained in statistics).
PMA / CE auditors and their staff statisticians are much
more statistically savvy, and have been known to ask you
to justify your sampling plans, based on risk analysis, for
critical parts (e.g., implant components).
The manner in which lot quality and lot size affect the
Pass Rate is described by 2 types of...
Operating Characteristic curves = OC curves
In this presentation, those 2 types of curves are called...
% Defective OC Curves
and
Lot Size OC Curves
examples of each are shown on upcoming slides...
Predicting Pass Rates
This is the typical OC CURVE found in textbooks, i.e., Lot % Defective vs. Pass Rate
(N = 1000, n = 15, c = 0)
How can such variations in "4% AQL" pass rates ensure that
"Suppliers provide consistently high quality product "?
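The Pass Rate behind such an OC curve can be computed exactly. This sketch uses the hypergeometric acceptance probability for the slide's N = 1000, n = 15, c = 0 plan:

```python
from math import comb

def pass_rate_c0(lot_size, sample_size, pct_defective):
    """P(lot accepted) under a c = 0 plan: every sampled unit must be
    good (hypergeometric: sampling without replacement from the lot)."""
    defectives = round(lot_size * pct_defective / 100)
    good = lot_size - defectives
    if good < sample_size:
        return 0.0
    return comb(good, sample_size) / comb(lot_size, sample_size)

# The slide's plan: N = 1000, n = 15, c = 0
rates = {pct: pass_rate_c0(1000, 15, pct) for pct in (0, 1, 4, 10)}
```

A perfect lot always passes, while a 4%-defective lot passes only about half the time, which is the variation in pass rates the slide is questioning.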
Do AQL plans control consumer risk consistently?
Conclusion: Z1.4 and C=0 AQL sampling plans do not control consumer risk consistently, unless Lot Size is controlled.
Lot Size OC Curves for v4 ASQC-C=0, Single Sample, 4% AQL
Important lesson from 2 previous slides:
[Charts contrasting an IN CONTROL process with an OUT OF CONTROL process, each plotted against the Lower Spec, Target, and Upper Spec]
The GHTF would have been much more helpful had they added the word "sample" just before the words "average" or "range".
Control Chart per GHTF (FDA approved !!)
(Process Validation Guidance, GHTF/SG3/N99-10:2004 (Edition 2))
You should choose the lots that are "relevant" to the current
process. That is, lots (i.e., data) that represent the current
production process.
For example....
The lots marked with squares were
used to calculate the control limits.
[Control chart with y-axis from 9.0 to 12.0]
In effect, the control limits are set at +/- 3 std errors from the midline (i.e., the distance between the control limits is 6 standard errors wide), calculated indirectly, using tables.
Control limits on the "Averages" chart are equivalent to +/- 3 std errors of the mean.
Control limits on the "Ranges" chart are equivalent to +/- 3 std errors of the range.
XbarR Chart, (n = 2 or more)
For sample averages:
UCL = AvgAvg + ( A2 x AvgRange )
LCL = AvgAvg - ( A2 x AvgRange )
AvgAvg = Average of all chosen measurements
(Factors such as A2 come from a table.)
# of data pts per Sample Avg = 9
# of Sample Avgs = 7
Average of all 7 Avgs = 100
Average of all 7 Ranges = 10
Answer:
UCLavg = 100 + (10 x 0.337) = 103.37
LCLavg = 100 - (10 x 0.337) = 96.63
UCLrange = 10 x 1.816 = 18.16
LCLrange = 10 x 0.184 = 1.84
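The worked example above can be checked in a few lines; A2, D3, and D4 are the standard Shewhart chart factors for a subgroup size of 9, taken from the usual factor table:

```python
# Shewhart control-chart factors for subgroup size n = 9
A2, D3, D4 = 0.337, 0.184, 1.816

avg_of_avgs = 100.0   # average of the 7 sample averages
avg_range = 10.0      # average of the 7 sample ranges

# Averages chart: UCL/LCL = AvgAvg +/- A2 x AvgRange
ucl_avg = avg_of_avgs + A2 * avg_range
lcl_avg = avg_of_avgs - A2 * avg_range

# Ranges chart: UCL = D4 x AvgRange, LCL = D3 x AvgRange
ucl_range = D4 * avg_range
lcl_range = D3 * avg_range
```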
The lots circled in blue are shown in the histogram and were used to calculate the control limits.
LCL(avg) = 6.5, UCL(avg) = 13.5
The lots circled in blue are shown in the histogram and were used to calculate the control limits.
LCL(avg) = 7.5, UCL(avg) = 12.5
The lots circled in blue are shown in the histogram and were used to calculate the control limits.
LCL(avg) = 9.5, UCL(avg) = 10.5
Because the variation in both Averages and Ranges has been
greatly reduced, the process on the far right is making
higher quality product than the process on the far left,
regardless of what % of product is made "in spec".
The purpose of SPC is to help processes move from left to right!
The goal of an SPC program is to make product that is more "on target"
with "minimum variation".
How is "Out of Control" detected?
OUT OF CONTROL = any data point or set of points (on
the control chart ) that would have little likelihood of occurring
by chance alone, assuming the data is normally distributed
( "out of control" = "special cause" is present).
YOU decide what " little likelihood " means. Over 80 years
ago, that meant a probability of 1 in 20 , whereas current
preference is 1 in 370.
[Chart annotations: an "out of control" point, an "out of control" trend, and an "out of control" series]
"Rules" for detecting "out of control"
(all taken from SPC textbooks)
Probability of occurring by chance
(assuming no "special cause" is acting)
1 point outside either control limit 1 in 370
9 in sequence on one side of midline 1 in 256
9 in an ascending or descending trend 1 in 256 (on avg)
8 in sequence on one side of midline 1 in 128
10 of 11 on same side of midline 1 in 102
12 of 14 on same side of midline 1 in 105
14 of 17 on same side of midline 1 in 117
16 of 20 on same side of midline 1 in 135
many, many others !! 1 in 100 to 400 !!!
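A minimal sketch of how the first two rules in the table could be checked in code; the function itself is illustrative, not from the deck:

```python
def out_of_control(points, midline, ucl, lcl):
    """Flag points that trip two common rules:
    Rule 1: one point outside a control limit (~1 in 370 by chance).
    Rule 2: 9 in sequence on one side of the midline (~1 in 256)."""
    flags = []
    run, last_side = 0, 0
    for i, x in enumerate(points):
        if x > ucl or x < lcl:
            flags.append((i, "outside limit"))
        side = 1 if x > midline else (-1 if x < midline else 0)
        run = run + 1 if (side == last_side and side != 0) else (1 if side != 0 else 0)
        last_side = side
        if run == 9:
            flags.append((i, "9 on one side"))
    return flags
```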
Reasons not to have too many rules...
USING THE ROLL OF 4 DICE AS AN EXAMPLE
The chance of not getting a 6 on any die is 5/6 x 5/6 x 5/6 x 5/6 =
approximately 50 % ; therefore, about 50% of the time, a toss of 4 dice
will have a 6 showing on at least 1 of the dice.
USING ALL THE RULES ON THE PREVIOUS SLIDE
369/370 x 255/256 x 255/256 x 127/128 x 101/102 x 104/105 x 116/117
x 134/135 = 95 % ; therefore, about 5 % of the time ( = 1 out of every 20
times), a point will be called "out of control" even tho it is the result of
random variation ( = "common cause"; that is, NOT "special cause").
That may be too frequent for the boss's taste !!!
USING ONLY (the first) 3 RULES
369/370 x 255/256 x 255/256 = 99 %, which means only 1% or about 1
in a hundred times will a "false alarm" be triggered by chance (is that
more acceptable to your boss ??). That % is recommended in the
Handbook of Statistical Methods in Manufacturing (R. B. Clements, 1991).
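The two false-alarm products above are easy to verify directly:

```python
from math import prod

# P(rule NOT tripped by chance), one entry per rule in the table
no_alarm = [369/370, 255/256, 255/256, 127/128,
            101/102, 104/105, 116/117, 134/135]

p_all_rules = prod(no_alarm)      # ~0.95: ~5% false alarms per point
p_first_3 = prod(no_alarm[:3])    # ~0.99: ~1% false alarms per point
```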
Random vs. Representative Sampling
The data on your SPC chart should faithfully represent the
process you're trying to improve.
Wait till the end of the production run, and then randomly
choose a sample from output of the entire run, OR...
Take one item per arbitrary time period (e.g., one part
every hour, or one part after each 100 parts made, or ??)
and combine all the collected parts as a "representative"
sample.
If your boss makes you take only the first few parts from a day's run, don't argue -- it's better than having no SPC program !!
Rational Sub-grouping of Samples
Basic rule for good SPC charts: Plotted points must have NO KNOWN SYSTEMATIC SOURCE OF VARIATION between them, other than sequential production over time.
For example:
Manufacturing occurs on day, swing, and graveyard shifts. The average of a sample from each shift's production is plotted sequentially on the same SPC chart.
This is GOOD Rational Sub-grouping !!
Rational Sub-grouping
Basic rule for good SPC charts: Plotted points must
have NO KNOWN SYSTEMATIC SOURCE OF
VARIATION between them, other than sequential
production over time.
If you ignore that rule, you may miss chances for valuable
investigations of "special cause" incidents, or you may
waste time investigating "common cause" effects.
Cp =
ratio of the width of the specification limits to the SPC-estimated width of the range that encompasses 99.7% of the product population:
= ( USL - LSL ) / ( 6Sigma [indirect] )
The larger Cp is, the better, because large numbers
indicate that a large % of the product might lie within
the Upper and Lower spec limits
( = USL & LSL), that is, a large % might pass QC.
NOTE: If you use "6Sigma [direct]" instead of "6Sigma [indirect]", you're actually calculating Pp, not Cp.
Capability Indices
Cp
This is useful only if the average data value is
currently near the specification target. If the average
data value is not near the target spec, then this
gives a false indication of % in-spec.
Cpk =
2 times the distance (as a positive number) that the AVG (data) VALUE is from the nearest spec limit (NSL), divided by "6 x SigmaX [indirect]":
= 2 x | NSL - AvgValue | / ( 6Sigma [indirect] )
The larger Cpk is, the better, because large numbers indicate
that a large % of the product does pass QC.
NOTE: If you use "6Sigma [direct]" instead of "6Sigma [indirect]", you're actually calculating Ppk, not Cpk.
Capability Indices
Cpk is useful no matter whether the avg data value is
currently near the specification target or not; i.e., even if
the average data value is not near the target spec, Cpk
still gives good indications of % in-spec.
That % in-spec would be higher IF the average data value
were nearer the spec target
(Cp gives an indication of that higher %).
Express Cpk as a negative number only if the "Avg Value" is outside the spec limits.
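The Cpk definition above, sketched with hypothetical numbers (the spec limits, average, and 6Sigma below are invented for illustration). Note that using the signed distance to the nearest limit automatically makes Cpk negative exactly when the average is outside the spec limits:

```python
# Hypothetical values for illustration
usl, lsl = 130.0, 70.0       # spec limits
avg = 110.0                  # average data value
six_sigma_indirect = 30.0    # SPC-estimated 6-sigma spread

# Signed distance to the nearest spec limit:
# negative exactly when the average is outside the spec limits
nearest = min(usl - avg, avg - lsl)

cpk = 2 * nearest / six_sigma_indirect
```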
Most companies claim to be calculating Cp & Cpk, but
an examination of their formulas reveals that they are
really calculating Pp & Ppk !!!
Classroom exercise
Calculate the Cp, based upon this data, using the
equations given on the previous slides...
9 = n = Sample Size
130 = USL = Upper (QC) Specification Limit
70 = LSL = Lower (QC) Specification Limit
105 = UCL = Upper Control (chart) Limit of Avgs
95 = LCL = Lower Control (chart) Limit of Avgs
Answer:
Cp = ( USL - LSL ) / ( 6Sigma [indirect] )
( USL - LSL ) = 130 - 70 = 60
"6Sigma..." = ( 105 - 95 ) x sqrt( 9 ) = 10 x 3 = 30
Cp = 60 / 30 = 2.00
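The same answer, step by step in code:

```python
from math import sqrt

n = 9                     # data points per sample average
usl, lsl = 130.0, 70.0    # QC specification limits
ucl, lcl = 105.0, 95.0    # control-chart limits of the averages

# Control limits of averages span 6 standard errors of the mean,
# so the indirect 6-sigma of individuals is (UCL - LCL) x sqrt(n)
six_sigma_indirect = (ucl - lcl) * sqrt(n)

cp = (usl - lsl) / six_sigma_indirect
```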
What % is "in spec"
For a given set of data, Cpk and Ppk values are always
smaller than or equal to Cp & Pp, respectively (they can
never be larger).
[Histograms comparing a centered process, where Cp = Cpk, with off-center processes, where Cp > Cpk]
[Histograms of raw data BEFORE and AFTER process improvements: SPC helps to get you from here to here]
Raw data
Spec limits = 0.07 to 0.15
Cpk = 0.97
3 parts per 2000 are predicted to be out-of-spec
Non-normal Data, TRANSFORMED
Raw data transformed (1/X)
Transformed Spec limits = 6.7 to 14.3
Cpk = 1.09
1 part in 2,000 is actually out-of-spec
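A sketch of the 1/X approach with hypothetical raw data (the measurements below are invented; the spec limits are the slide's). Because it uses the directly calculated standard deviation, the index computed here is strictly Ppk rather than Cpk, per the deck's own note about direct vs. indirect sigma:

```python
import statistics

# Hypothetical raw measurements; real spec limits from the slide
raw = [0.090, 0.095, 0.100, 0.105, 0.110,
       0.098, 0.102, 0.097, 0.108, 0.093]

# 1/X transformation; note it swaps which spec limit is the upper one
transformed = [1 / x for x in raw]
usl_t, lsl_t = 1 / 0.07, 1 / 0.15   # ~14.3 and ~6.7, as on the slide

def ppk(data, usl, lsl):
    """Capability using the direct (overall) standard deviation."""
    avg = statistics.mean(data)
    s = statistics.stdev(data)
    return min(usl - avg, avg - lsl) / (3 * s)

result = ppk(transformed, usl_t, lsl_t)
```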