
Basic Statistics for Process Improvement

Visteon Corporation BB Mod #5 Basic Stats Rev 1.0 3/02

The Breakthrough Strategy

Phases: Define, Measure, Characterize, Analyze, Improve, Optimize, Control
Define: BB works with management

Steps:
1. Select output characteristic and identify key process input and output variables
2. Define performance standards
3. Validate measurement system
4. Establish product capability
5. Define performance objectives
6. Identify variation sources
7. Screen potential causes
8. Discover variable relationships
9. Establish operating tolerances
10. Validate measurement system
11. Determine process capability
12. Implement process controls


Measurement Phase
Project Definition:
- Problem Description
- Project Metrics
Process Exploration:
- Process Flow Diagram
- C&E Matrix, PFMEA, Fishbones
- Data collection system
Measurement System(s) Analysis (MSA):
- Attribute / Variable Gage Studies
Capability Assessment (on each Y):
- Capability (Cpk, Ppk, σ Level, DPU, RTY)
- Graphical & Statistical Tools
Project Summary:
- Conclusion(s)
- Issues and barriers
- Next steps
Completed Local Project Review

Basic Statistics Fundamentals of Improvement

Variability: Is the process on target with minimum variability?
- We use the mean to determine if the process is on target.
- We use the standard deviation (σ) to determine spread.
Stability: How does the process perform over time?
- Stability is represented by a constant mean and predictable variability over time.
[Figure: X-Bar Chart for Process A and X-Bar Chart for Process B; x-axis: Sample Number (0 to 25), y-axis: Sample Mean. Labels shown: UCL = 77.20 and 77.27, center line X̄ = 70.98 and 70.91, LCL = 64.70 and 64.62.]

Warm-Up Exercise

- Assume machines A, B, and C make identical products (with range charts in control).
- Assume that the target value for each product output variable is 100 mm.
Answer the following questions:
- Which machine(s) exhibit variation?
- Where is each machine centered?
- Which machines are predictable over time?
- Which machines have special cause variation?
- Which machine would you want making your product?
- Which machine would probably be easiest to fix?
[Figure: X-bar Charts for Machine A, Machine B, and Machine C; x-axis: Sample Number (0 to 20), y-axis: Sample Mean. Labels recoverable from one panel: 108.5, X̄ = 101.0, 93.42.]

Can We Tolerate Variability?

There will always be variability present in any process.
We can tolerate variability if:
- the process is on target
- the total variability is relatively small compared to the process specifications
- the process is stable over time



[Figure: two panels comparing the Traditional View with the Taguchi Loss Function (New View)]

Data Analysis Tasks for Improvement

- Determine if the process is stable.
  - If the process is not stable, identify and remove causes (Xs) of instability (obvious non-random variation).
- Estimate the magnitude of the total variability.
  - Is it acceptable with respect to the customer requirements (spec limits)?
  - If not, identify the sources of the variability and eliminate or reduce their influence on the process.
- Determine the location of the process mean.
  - Is it on target?
  - If not, identify the variables (Xs) which affect the mean and determine optimal settings to achieve the target value.
We will now review statistics that help this process.

Types of Outputs (Data)

Attribute Data (Qualitative)
Categories:
- Yes, No
- Go, No go
- Machine 1, Machine 2, Machine 3
- Pass/Fail

Variable Data (Quantitative)

Discrete (Count) Data
Maintenance equipment failures, fiber breakouts, number of clogs

Continuous Data
Decimal subdivisions are meaningful Dimension, chemical yield, cycle time

Selecting Statistical Techniques

There are statistical techniques available to analyze all combinations of input / output data.

Input (X)               Output (Y)              Technique
Discrete (Attribute)    Discrete (Attribute)    Chi-square
Discrete (Attribute)    Continuous (Variable)   Analysis of Variance
Continuous (Variable)   Discrete (Attribute)    Discriminant Analysis, Logistic Regression
Continuous (Variable)   Continuous (Variable)   Correlation, Multiple Regression

Statistical Distributions
We can describe the behavior of any process or system by plotting multiple data points for the same variable:
- over time
- across products
- on different machines, etc.

The accumulation of this data can be viewed as a distribution of values, represented by:
- dot plots
- histograms
- normal curve or other smoothed distribution


Dot plot distribution

- Imagine a metering pump, geared to pump material at 50 gallons/minute.
- The actual pump rate is measured at 100 separate instances in time.
- Each dot is plotted and represents one event of output at a given value (pump speed).
- As the dots accumulate, the nature of the pump's actual performance can be seen as a distribution of pump speed values.

[Figure: dot plot of the 100 pump-rate readings; x-axis: GPM, 49.00 to 51.00]

Histogram Distribution
Now imagine the same data, grouped into intervals with the number of times that a pump speed data point falls within a given interval determining the height of the interval bar.
[Figure: histogram of the pump-rate data; x-axis: pump rate, 48.8 to 51.3; y-axis: frequency]
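The grouping step described for the histogram can be sketched in a few lines of Python. This is only an illustration: the readings are simulated stand-ins (the slide's actual pump data is not available), and the bin start, bin width, and the helper name `histogram_counts` are assumptions chosen for the example.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical stand-in for the 100 pump-rate readings (GPM); not the slide's data.
readings = [random.gauss(50.0, 0.4) for _ in range(100)]

def histogram_counts(values, start, width, n_bins):
    """Count the values falling in each interval [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        k = int((v - start) // width)
        if 0 <= k < n_bins:  # values outside the plotted range are ignored
            counts[k] += 1
    return counts

counts = histogram_counts(readings, start=48.0, width=0.5, n_bins=8)
for k, c in enumerate(counts):
    low = 48.0 + 0.5 * k
    # The bar height is simply the number of readings in the interval.
    print(f"{low:5.1f} to {low + 0.5:5.1f} | {'*' * c}")
```

Changing `width` shows how the apparent shape of the distribution depends on the interval size chosen.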


Smoothed (Normal) distribution

Finally, we can view the data as a smoothed distribution (red line). In this example, using the normal distribution assumption (we'll discuss this later) provides an approximation of how the data might look if we were to collect an infinite number of data points.

[Figure: histogram of the pump-rate data with a smoothed normal curve overlaid; x-axis: 48.0 to 52.0]

Population Parameters Vs Sample Statistics

Population: an entire group of objects that have been made or will be made containing a characteristic of interest
- Is it likely we can ever know the true population parameters?
Sample: the group of objects actually measured in a statistical study
- A sample is usually a subset of the population of interest

Population Parameters
μ = Population mean
σ = Population standard deviation

Sample Statistics
X̄ = Sample mean
s = Sample standard deviation


Computational Equations
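Population Mean
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$

Sample Mean
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

Population Standard Deviation
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

Sample Standard Deviation
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$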
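As a rough illustration of the computational equations, the sketch below evaluates the mean and both standard deviations for a small made-up data set. The data values and function names are assumptions for the example only; the one point to notice is the divisor, N for a population and n - 1 for a sample.

```python
import math

def mean(values):
    """Arithmetic average: sum of the values divided by their count."""
    return sum(values) / len(values)

def std_population(values):
    """Population standard deviation: squared deviations divided by N."""
    mu = mean(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

def std_sample(values):
    """Sample standard deviation: squared deviations divided by n - 1."""
    xbar = mean(values)
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (len(values) - 1))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative values only

print("mean          :", mean(data))
print("population sd :", std_population(data))
print("sample sd     :", round(std_sample(data), 4))
```

For the same data the sample formula always gives the larger value, and the gap shrinks as n grows.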

Measures of Central Tendency

Mean: Arithmetic average of a set of values
- Reflects the influence of all values
- Strongly influenced by extreme values
Median: Reflects the 50% rank, the center number after a set of numbers has been sorted
- Does not necessarily include all values in calculation
- Is robust to extreme scores
Mode: Most frequently occurring value in a data set

Why would we mainly use the mean, instead of the median, in process improvement efforts?
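One way to explore the question above is to see how each measure reacts to a single extreme value. The sketch below uses Python's standard `statistics` module; the measurements are invented for illustration and are not from the course data.

```python
import statistics

# Hypothetical measurements around a 100 mm target.
baseline = [99, 100, 100, 101, 100]
with_outlier = baseline + [150]  # one extreme reading added

mean_before = statistics.mean(baseline)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(baseline)
median_after = statistics.median(with_outlier)

# The mean moves toward the extreme value; the median does not.
print("mean  :", mean_before, "->", round(mean_after, 2))
print("median:", median_before, "->", median_after)
```

The mean shifts because it uses every value, which is the property that makes it sensitive to changes anywhere in the process output.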


Measures of Variability:
Range: numerical distance between the highest and the lowest values in a data set.
$$\text{Range} = \max - \min$$

Variance ($\sigma^2$; $s^2$): the average squared deviation of each individual data point from the mean.
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \qquad s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

Standard Deviation ($\sigma$; $s$): the square root of the variance.
- most commonly used measurement to quantify variability

The Quadratic Deviation

Squaring the deviation weights extreme deviations from the natural mean very heavily

[Figure: plot of the squared deviation $(X - \bar{X})^2$, which rises steeply as the deviation from the mean grows]


Principle of Six Sigma

Variances add, standard deviations do not.
Variances of the inputs add to calculate the total variance in the output (this holds when the inputs vary independently of one another).

If
- $\sigma^2_{Total}$ = variance of the process output;
- $\sigma^2_{X_1}$ = variance due to Input Variable $X_1$;
- $\sigma^2_{X_2}$ = variance due to Input Variable $X_2$;

then
$$\sigma^2_{Total} = \sigma^2_{X_1} + \sigma^2_{X_2}$$

So,
$$\sigma_{Total} = \sqrt{\sigma^2_{X_1} + \sigma^2_{X_2}}$$
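A short numeric sketch may make the "variances add" rule concrete. The two input standard deviations below are made-up values, and the calculation assumes the inputs are independent.

```python
import math

# Illustrative standard deviations contributed by two independent inputs.
sigma_x1 = 3.0
sigma_x2 = 4.0

# Variances add ...
var_total = sigma_x1 ** 2 + sigma_x2 ** 2
sigma_total = math.sqrt(var_total)

# ... standard deviations do not.
naive_sum = sigma_x1 + sigma_x2

# Effect on the output of eliminating one input's contribution entirely.
sigma_without_x1 = math.sqrt(var_total - sigma_x1 ** 2)
sigma_without_x2 = math.sqrt(var_total - sigma_x2 ** 2)

print("total sigma              :", sigma_total)
print("naive sum of sigmas      :", naive_sum)
print("total sigma if X1 removed:", sigma_without_x1)
print("total sigma if X2 removed:", sigma_without_x2)
```

Removing the larger contributor (X2 here) cuts the total far more than removing the smaller one, which is why the largest source of variation is usually attacked first.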


The Normal Distribution

- The Normal Distribution is a distribution of data which has certain consistent properties.
- These properties are very useful in our understanding of the characteristics of the underlying process from which the data were obtained.
- Many natural phenomena and man-made processes are distributed approximately normally, or can be represented as normally distributed.


The Normal Distribution

Property 1: A normal distribution can be described completely by knowing only the:
mean, and standard deviation
[Figure: three normal curves labeled Distribution One, Distribution Two, and Distribution Three]

What is the difference among these three normal distributions?


The Normal Curve and Its Probabilities

Property 2: The area under sections of the curve can be used to estimate the cumulative probability of a certain event occurring
[Figure: normal curve; x-axis: number of standard deviations from the mean (-4 to 4); y-axis: probability of sample value (0% to 40%). The area between two values is the cumulative probability of obtaining a value between them: 68% within ±1 standard deviation and 99.73% within ±3.]
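The areas quoted for the normal curve can be checked with Python's standard library. This is a minimal sketch using `statistics.NormalDist` (available in Python 3.8 and later); the helper name `prob_within` is an assumption for the example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def prob_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/- {k} standard deviations: {prob_within(k):.4%}")
```

The same subtraction of two cumulative probabilities gives the area between any two values, not only symmetric ones.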


Empirical Rules for the Standard Deviation

The previous rules of cumulative probability closely apply even when a set of data is not perfectly normally distributed. Let's compare the values for a theoretical (perfect) normal distribution to empirical (real-world) distributions.

Number of Standard Deviations   Theoretical Normal   Empirical Normal
+/- 1σ                          68%                  60-75%
+/- 2σ                          95%                  90-98%
+/- 3σ                          99.7%                99-100%


Normal Probability Plots

We can test whether a given data set can be described as normal with a test called a Normal Probability Plot.
- If a distribution is close to normal, the normal probability plot will be a straight line.

Minitab makes the normal probability plot easy.
- Open Distskew.Mtw
- Choose: Stat > Basic Stats > Normality Test
- Produce a normal plot of each of the first 3 columns. Which appear to be normal?
- Now, graph a histogram of each. What does this reveal?
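Outside Minitab, the idea behind a normal probability plot can be sketched as follows: sort the data, pair each value with the normal quantile expected at its rank, and check how close the pairs are to a straight line. The plotting position (i - 0.5)/n is one common convention and is an assumption here (Minitab may use a different one); the data sets are simulated, not the contents of Distskew.Mtw.

```python
import math
import random
from statistics import NormalDist

def normal_plot_points(values):
    """Pair each sorted value with the standard normal quantile for its rank."""
    n = len(values)
    ordered = sorted(values)
    # Plotting position (i - 0.5) / n: an assumed, commonly used convention.
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

def straightness(points):
    """Pearson correlation of the plotted points; values near 1 mean a nearly straight line."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)

random.seed(5)
# Simulated stand-ins with mean 70 and standard deviation 10.
normal_like = [random.gauss(70, 10) for _ in range(500)]
pos_skewed = [60 + random.expovariate(1 / 10) for _ in range(500)]

r_normal = straightness(normal_plot_points(normal_like))
r_skewed = straightness(normal_plot_points(pos_skewed))
print("normal-like data:", round(r_normal, 4))
print("skewed data     :", round(r_skewed, 4))
```

The correlation is only a rough summary of straightness; it does not replace looking at the plot or a formal test such as Anderson-Darling.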


Normal Probability Plots

[Figure: normal probability plots (y-axis: Probability, .001 to .999) for three data sets, each with Average: 70, Std Dev: 10, N of data: 500]

Anderson-Darling Normality Test results shown on the plots:

Panel                          Variable   A-Squared   p-value
Positive Skewed Distribution   Pos Skew   46.447      0.000
Normal Distribution            Normal     0.418       0.328
Negative Skewed Distribution   Neg Skew   43.953      0.000


Mystery Distribution
Generate a Normal Probability Plot for the Mystery variable in C5. What is your conclusion? Is this a normal distribution?
[Figure: normal probability plot of the Mystery variable (y-axis: Probability, .001 to .999)]
Average: 100, Std Dev: 32.3849, N of data: 500
Anderson-Darling Normality Test: A-Squared: 27.108, p-value: 0.000


Open file DISTSKEW.MTW
Stat > Basic Statistics > Display Descriptive Statistics

Variable    N     Mean      Median    Tr Mean   StDev    SE Mean
Normal      500   70.000    69.977    70.014    10.000   0.447
Pos Skew    500   70.000    65.695    68.554    10.000   0.447
Neg Skew    500   70.000    73.783    71.368    10.000   0.447
Mystery     500   100.00    104.20    99.94     32.38    1.45

Variable    Min      Max       Q1       Q3
Normal      29.824   103.301   63.412   76.653
Pos Skew    62.921   130.366   63.647   72.821
Neg Skew    1.866    77.106    67.891   76.290
Mystery     41.77    162.82    68.69    130.81

Graphical Summary
Descriptive Statistics
Variable: Mystery

Anderson-Darling Normality Test
A-Squared:      27.11
p-value:        0.00

Mean            100.00
Std Dev         32.38
Variance        1048.78
Skewness        0.01
Kurtosis        -1.63
n of data       500.00

Minimum         41.77
1st Quartile    68.69
Median          104.20
3rd Quartile    130.81
Maximum         162.82

95% Confidence Interval for Mu:      97.15 ...
95% Confidence Interval for Sigma:   30.49 to 34.53
95% Confidence Interval for Median:  82.78 to 117.66

[Figure: histogram of the Mystery variable with interval plots of the 95% confidence intervals for Mu and the Median]

Stat > Basic Statistics > Display Descriptive Statistics > Graphs > Graphical Summary


Exercise in Data Mining

Remember the basic premise of Six Sigma, that sources of variation can be:
- Identified
- Quantified
- Eliminated or Controlled

The following example investigates potential sources of variation in breaking strength in a spin draw process.
- Output: Breaking Strength
- Inputs Tracked: Day, Doff, Spinneret and Draw Ratio

Objective: Which Xs affect variation in Y?
Filename: Bhhmult.mtw


Data Set
Column   Count   Missing   Name
C1       36      0         Day
C2       36      0         Doff
C3       36      0         Spinnert
C4       36      0         DrwRatio

The Info window of Minitab shows that the data set contains information about Day, Doff, Spinneret, Draw Ratio and Breaking Strength. There are 36 observations. The challenge is to determine what inputs are causing variation in the output.


Total Variation of Breaking Strength

Using the Graph > Histogram function we see the distribution of Breaking Strength. Values range from about 15 to about 30.


[Figure: histogram of Breaking Strength; x-axis: 15 to 29]

Variable    N    Mean     Median   Tr Mean   StDev   SE Mean
BrkStren    36   21.865   22.380   21.819    3.428   0.571

Variable    Min      Max      Q1       Q3
BrkStren    15.330   29.720   19.242   24.138


Mining the Data

Let's look at Draw Ratio and its effects on the variability of Breaking Strength. We can go to Stat > Basic Stats > Display Descriptive Statistics and use the By statement to break the results down by Draw Ratio.


Breakdown by Draw Ratio

Variable    DrwRatio   N    Mean     Median   Tr Mean   StDev   SE Mean
BrkStren    1          12   18.774   18.990   18.625    2.560   0.739
            5          12   22.282   22.815   22.377    1.821   0.526
            10         12   24.538   24.565   24.621    3.017   0.871

Variable    DrwRatio   Min      Max      Q1       Q3
BrkStren    1          15.330   23.710   16.373   20.317
            5          18.960   24.650   20.888   23.220
            10         18.530   29.720   22.715   26.898

These results show that, as Draw Ratio varies from 1% to 10%, the average Breaking Strength varies from 18.8 to 24.5. If we could center Draw Ratio on 5%, the sigma for Breaking Strength would be reduced from about 3.4 (the overall StDev of 3.428) to about 1.8.
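The breakdown by Draw Ratio can be tied back to the overall variation with a small calculation on the summary statistics alone. The sketch below takes the group sizes, means, and standard deviations from the table above and splits the total sum of squares into a within-group part and a between-group part; this decomposition is a standard identity, but using it here is an illustration and not a step taken in the course material.

```python
import math

# Summary statistics for BrkStren by DrwRatio, copied from the table above:
# draw ratio -> (n, mean, standard deviation)
groups = {
    1: (12, 18.774, 2.560),
    5: (12, 22.282, 1.821),
    10: (12, 24.538, 3.017),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Variation inside each Draw Ratio level (not explained by Draw Ratio).
ss_within = sum((n - 1) * s ** 2 for n, _, s in groups.values())
# Variation of the level means around the grand mean (explained by Draw Ratio).
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())

ss_total = ss_within + ss_between
overall_sd = math.sqrt(ss_total / (n_total - 1))
share_between = ss_between / ss_total

print("grand mean            :", round(grand_mean, 3))
print("overall StDev         :", round(overall_sd, 3))
print("share due to DrwRatio :", round(share_between, 3))
```

The reconstructed overall standard deviation agrees with the 3.428 reported for all 36 observations, and roughly half of the total sum of squares is associated with differences between Draw Ratio levels.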


Data Mining Graphically

Go to Graph > Character Graph > Dotplot and display Break Strength BY Draw Ratio.



[Figure: character dot plots of BrkStren for DrwRatio = 1, 5, and 10 on a common axis, 15.0 to 30.0]

Exercise: Investigate Day, Doff and Spinneret in the same way and be ready to report conclusions. Which is the strongest input in explaining variation in Breaking Strength?

