
Quality Management in the Bosch Group | Technical Statistics

1. Basic Concepts of Technical Statistics


Variable Characteristics
Edition 10.1996

© 1996 Robert Bosch GmbH


Table of Contents

1. Introduction ................................................................................................................... 5

2. Terms............................................................................................................................. 7
2.1 Characteristic ......................................................................................................... 7
2.2 Population .............................................................................................................. 7
2.3 Sample ................................................................................................................... 8
2.4 Random Variable ................................................................................................... 8
2.5 Probability ........................................................................................................... 11

3. Statistical Characteristics ............................................................................................. 13


3.1 Median ................................................................................................................. 13
3.2 Arithmetic Mean .................................................................................................. 15
3.3 Moving Mean ....................................................................................................... 16
3.4 Geometric Mean................................................................................................... 17
3.5 Harmonic Mean.................................................................................................... 19
3.6 Standard Deviation............................................................................................... 20
3.7 Coefficient of Variation ....................................................................................... 22
3.8 Range ................................................................................................................... 23
3.9 Range Method for Determining the Standard Deviation ....................................... 23

4. Statistical Calculations with Pocket Calculator ............................................................ 25

5. Graphic Representation of Data ................................................................................... 28


5.1 Original Value Diagram ....................................................................................... 28
5.2 Tally Chart, Dot Diagram..................................................................................... 29
5.3 Grouping, Histogram............................................................................................ 30
5.4 Cumulative Frequency Curve ............................................................................... 35

6. Statistical Distributions................................................................................................ 36
6.1 Gaussian Normal Distribution .............................................................................. 36
6.1.1 Properties and Characteristics of the Normal Distribution ................................... 37
6.1.2 Distribution Function ........................................................................................... 39
6.1.3 Standard Normal Distribution .............................................................................. 41
6.2 Normal Probability Paper ..................................................................................... 44
6.3 Lognormal Distribution ........................................................................................ 51
6.3.1 Lognormal Probability Paper................................................................................ 53
6.3.2 Relationship between Normal Distribution and Lognormal Distribution ............. 53
6.4 Mixture of Distributions....................................................................................... 57

7. Quality Control Charts ................................................................................................. 58


7.1 Location Control Charts ....................................................................................... 58
7.1.1 Average Chart ...................................................................................................... 59
7.1.2 Original Value Chart ............................................................................................ 61
7.2 Variation Control Charts ...................................................................................... 63
7.2.1 s-Chart ................................................................................................................. 63
7.2.2 R-Chart ................................................................................................................ 65

8. Assessing Frequency Distributions in Connection with a Tolerance ............................ 66

9. Accuracy of Estimating Mean and Standard Deviation................................................. 69

10. Table of the Standard Normal Distribution................................................................. 73

11. Bibliography .............................................................................................................. 77

12. Symbols and Terms .................................................................................................... 78

Index................................................................................................................................ 80

1. Introduction
Mathematical statistics - in short, statistics - originally developed out of the need to count populations, which served to determine the status (Latin: status) of a nation and to describe it with respect to public economics. According to the dictionary, statistics is the science of numerically recording, investigating and evaluating mass phenomena.

This definition comprises the two essential aspects of statistics: recording, arranging and representing statistical data are the tasks of descriptive statistics, whereas evaluating (analysing, interpreting) data is the task of inductive (conclusive) statistics. In the media, we often encounter examples of applications of both aspects.

Examples of descriptive (describing) statistics are:

- the development over time of exchange rates of foreign currencies or of stock indices (original value charts),
- the distribution of seats in state parliaments (pie charts),
- the market shares of different automobile types as part of the total number of newly registered cars in Germany in one year (histograms),
- the information on per capita consumption of milk products in the EC countries in one year (mean values).

The following examples of inductive statistics procedures still have an unmistakable relationship with the national census:

- forecasts of election results based on representative interviews on election Sunday,
- projections of television audience figures based on the viewing participation of selected test viewers,
- estimates of the number of visitors at public events,
- estimation of the population of a particular type of animal in an area of specific size,
- analysis of the effect of an advertising campaign upon the consumer behaviour of a market, based on the behaviour of selected test customers.

In all of the last-mentioned cases, a statement concerning a larger collective (population) is derived from the knowledge about a limited subset of individuals (sample). Thereby the fact is exploited that in the case of mass phenomena, although the result of a single observation is random (e.g. the yearly number of lightning strikes per square kilometre in an area XY) and therefore cannot be predicted with certainty, a conformity to a mathematically describable law nevertheless exists.

Frequently, inductive statistics also aims at drawing conclusions about future behaviour from the momentary state (trend), i.e. at predicting the future in some way.

Statistics works with mathematical models for this purpose (distribution functions), which describe the characteristics of so-called random variables.

Misconceptions in the application of statistical methods nearly always result from neglecting the underlying model assumptions and the prerequisites stated in connection with the methods.

For the newcomer to the subject, the understanding of statistical statements and methods is made more difficult by several problematic points listed below.

1. Terminological Difficulties
In colloquial language, the term "probable" is often replaced by other terms, e.g. "impossible", "maybe", "presumably", "with reasonable certainty" or "dead certain", each of which, according to our experience, represents a measure of the confidence in the truthfulness of a statement. Depending upon the person using such a term, his state of mind (euphoria, depression) and the respective situation, each of them can have an entirely different meaning.
In contrast, statistics uses a mathematically defined probability, a number between zero (impossible event) and one (certain event), as a measure for the estimated occurrence or non-occurrence of an event. The difficulties of explaining the term probability in simple words detached from the statistical context are obvious.

2. Logical Difficulties
A risk exists that the user of inductive statistics gets the impression of an objective certainty which actually does not exist. This misconception is reflected in the sequence of terms "unknown", "random" (random variable), "probable" (probability), which in common language peaks in the term "certain" (certainty).
It should be clear that in reality there is no possibility of building a bridge between the unknown and the certain state.

3. Transferability
The examples considered in statistics textbooks show that one cannot avoid using natural phenomena and correspondingly measured quantities, or examples from the theory of games with known constraints, to illustrate random (chaotic?) behaviour:

- the number of lightning strikes per square kilometre of the earth's surface and year,
- the annual rainfall quantity per square metre,
- the movement of gas molecules (Brownian molecular movement),
- radioactive decay,
- winning chances at games of chance (throwing dice, roulette, lotto).

In comparison with such examples, the phenomena investigated in industrial practice seldom appear to be compatible with the terms random or even chaotic.

Regardless of these basic problems, statistical methods have won a fixed place in industrial practice.

This booklet is the introduction to the series "Quality Assurance in the Bosch Group, Technical Statistics", which covers a variety of special topics.

2. Terms

2.1 Characteristic

Objects of statistical considerations and calculations in industrial practice are, in general, continuously variable, measurable characteristics and discretely countable characteristics of the units considered. In correspondence with these subgroups, the first two booklets of the Bosch script series carry the subtitles "Variable Characteristics" (Booklet No. 1) and "Attributive Characteristics" (Booklet No. 2).

The variable characteristics dealt with in this Booklet 1 are measurable or observable properties (length, weight, temperature) of objects or events (lifetime, bursting pressure).

Although a physically measured value is always given in the form of a numerical value (e.g. 48) and a unit of measurement (e.g. mm), the latter is of subordinate importance with respect to statistical considerations. We can therefore mostly concentrate on the pure numerical values in the examples.

Statistical analyses of the properties of variable characteristics find application in many areas of industrial practice, e.g. within the scope of

- capability studies of measuring instruments and machines,
- the assessment of manufacturing processes,
- statistical process control (SPC), and
- the evaluation of experimental data.

The procedures of inductive statistics are of special interest in connection with risk analyses, and where one must work with relatively small sample sizes on economic grounds, e.g. in the area of (expensive) quality inspections (destructive inspection, lifetime investigations).

2.2 Population

Under the term population we understand a limited or unlimited quantity of considered units which, with regard to a statistical question at hand, are considered equivalent. Such considered units can, for example, also be observations or results of experiments conducted under the same conditions.

Examples of limited populations are the set of

- pupils in a school,
- persons entitled to vote in a federal state,
- television viewers who watched the final match of the last football world championship,
- the parts of a goods delivery,
- goods manufactured within one shift in the XY-factory.

Examples of (theoretically) infinite populations are the sets of

- observed scores when repeatedly throwing a die,
- results determined when repeatedly measuring a standard of length,
- parts which a machine will manufacture under the assumption that it will retain its momentary state forever.

The last examples show especially that a population need not always be real, but can also be fictitious. Furthermore, it is evident that a statistical question can occasionally aim at a prognosis (forecast) of future events.

2.3 Sample

A sample is, in contrast, always a real and thus a finite set of things or events. Examples of this are the set of:

- vehicles which passed through the Engelberg tunnel (near Leonberg) on May 1, 1996,
- observed scores of 10 throws of a die,
- results when measuring a standard of length 25 times,
- 50 parts manufactured within the scope of a machine capability study.

The German term for sample (Stichprobe) comes from the "tapping" previously used for the purpose of quality control of cereal sacks and cotton wool bales. A sample consists of one or several units which are taken from a real or a fictitious population according to the random principle. The number of these elements is called the sample size. The properties of a sample should be representative of the population. A random drawing presupposes that each element of the population has the same chance (same probability) of getting into the sample. In general, it is only possible in a few cases to realize the random principle in a nearly ideal manner (coin tossing, roulette, drawing of lottery numbers). Especially in connection with fictitious populations, the imagination mostly fails; "drawing" is understood here only in a figurative sense.

2.4 Random Variable

In statistics, one circumvents these difficulties by introducing the terms "random experiment" and "random variable". A random experiment is a process that may in principle be repeated arbitrarily often and whose (individual) result is not predictable (e.g. a throw of a die). The random variable represents the possible results of a random experiment (e.g. the numbers 1, 2, ..., 6). Mathematically seen, it is a function to which a real number is assigned through a random experiment (e.g. the score of a throw). The units or elements which are observed as results of a random experiment (taken from the population as a sample) are the so-called realizations of this random variable.

If these definitions are compared with the explanations in 2.2 and 2.3, one recognizes that the terms population and sample, corresponding to colloquial and illustrative language, have been replaced by the mathematical quantities random variable and realization of the random variable, with the limitation that the latter two always involve real numbers.

For example, the results $x_1, x_2, \ldots, x_{10}$ of a series of 10 repeated measurements on a standard of length are realizations of a random variable $X$, which represents the entirety of all possible (infinitely many) measurement results on this standard.

In common language, even the characteristic values measured on a sample of real parts (the set of measurement results) are often termed a sample.

Example: Figures 2.4.1 and 2.4.2 each show a population of 4,900 balls. The population according to Figure 2.4.1 consists of 4,420 white and 480 black balls. We want to consider the latter as representative of defective (nonconforming) parts. The proportion $p$ of the black balls is therefore

$p = \frac{480}{4{,}900} = 0.098 = 9.8\,\%$.

For the estimation of the proportion nonconforming (in this case the random variable of interest), a sample of 490 balls is drawn, illustrated here by the area confined by a rectangle ($35 \times 14$ balls). The sample contains 44 black balls, whose proportion in the sample is thus

$p = \frac{44}{490} = 0.09 = 9.0\,\%$.

Therefore, in this case, the sample delivers a relatively good estimate of the nonconforming proportion of the population.

Fig. 2.4.1: Uniformly mixed population with some 10% proportion nonconforming. The proportion nonconforming is quite well estimated by means of the sample (rectangle).

-9-
The example according to Fig. 2.4.2 shows that such an estimation can also lead to wrong conclusions. Here the population is not homogeneously mixed; the proportion nonconforming decreases from the bottom upwards. This situation can, for instance, occur when the proportion nonconforming of a manufacturing line decreases within a limited period and the produced parts are put into a container in the order of production. In the example at hand, 469 balls of the population are black and 4,431 white.

A nonconforming proportion of

$p = \frac{469}{4{,}900} = 0.096 = 9.6\,\%$

lies at hand, which scarcely distinguishes itself from that in Fig. 2.4.1. It is, however, obvious that the proportion nonconforming in the sample,

$p = \frac{4}{490} = 0.008 = 0.8\,\%$,

leads to a wrong conclusion.

The difference between the two sample results and the proportion estimates derived from them is to be blamed on the violation of the random principle (in both cases): not every part of the population had the same chance of being drawn into the sample.

In reality, we are naturally in the situation that we don't know the nonconforming proportion in the population and must rely exclusively on the information from the sample.

Fig. 2.4.2: Inhomogeneous population with some 10% nonconforming proportion. The nonconforming proportion is wrongly estimated by means of the sample (rectangle).

2.5 Probability

Mathematical probability is a number which is closely connected with the result of a random experiment.

Its classical definition is derived from the theory of games, from which the probability
theory and statistics originally developed.

A random experiment often considered in statistics textbooks is the tossing of a coin. It is generally accepted that the result of a coin toss cannot be predicted and that, owing to the (sufficient) symmetry of the coin, the outcomes "heads" and "tails" are equally probable.

According to the classical definition, the mathematical probability $P(A)$ of an event $A$ during a random experiment is given by

$P(A) = \frac{g}{m}$.

Therein, $g$ designates the number of (favourable) cases in which $A$ occurs, and $m$ the number of all possible outcomes of the experiment in question. Referred to the toss of a coin, this means: the probability of the event "heads" is

$P(\text{"heads"}) = \frac{1}{2} = 0.5 = 50\,\%$.

The number $g$ of cases in which "tails" occurs (the favourable result for the player who predicted "tails") is equal to 1; the number of all possible outcomes of the (one-time) coin toss is equal to two ("heads" and "tails"). Apparently, from the symmetry of the coin follows the symmetry of the probabilities. Both results are equally probable:

$P(\text{"heads"}) = P(\text{"tails"}) = \frac{1}{2}$.

Let us consider a random experiment with a finite number of possible events, the occurrence probabilities of which cannot be deduced directly from considerations of symmetry. Then there exists (at least theoretically) the possibility of repeating the experiment very often and determining the relative frequencies (see Section 5.3) with which each event occurs. One can then define the probability of a particular event as the limit which the relative frequency (a number between zero and one) of the considered event approaches in the case of a large number (approaching infinity) of repetitions of the random experiment.

When repeatedly tossing a coin, for example, one will observe that the relative frequency with which each result occurs (e.g. the number of tosses with the result "tails" divided by the total number of tosses) continually approaches the value 0.5 with an increasing number of tosses (Fig. 2.5). Generally, this phenomenon is described by the law of large numbers.

Fig. 2.5: Illustration of the Law of Large Numbers. The relative frequency of the result
tails achieved when repeatedly tossing a coin approaches the theoretical value
0.5 after sufficiently many repetitions.

From the following table, it is clear how the relative frequencies represented in Fig. 2.5 have been calculated.

Toss No.   Result   Rel. frequency of      Toss No.   Result   Rel. frequency of
                    the outcome "tails"                        the outcome "tails"
    1        H        0/1  = 0.00              991       T        0.504
    2        T        1/2  = 0.50              992       T        0.504
    3        H        1/3  = 0.33              993       T        0.505
    4        H        1/4  = 0.25              994       T        0.505
    5        H        1/5  = 0.20              995       T        0.506
    6        T        2/6  = 0.33              996       H        0.505
    7        H        2/7  = 0.29              997       H        0.505
    8        T        3/8  = 0.38              998       H        0.504
    9        T        4/9  = 0.44              999       T        0.505
   10        H        4/10 = 0.40            1,000       T        0.505

Table 2.5
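
To make the law of large numbers tangible, the relative frequencies can also be produced by simulation. The following Python sketch is an illustrative addition (not part of the original booklet); the random seed and the printed toss numbers are arbitrary choices.

```python
# Illustrative sketch: simulating repeated tosses of a fair coin and
# printing the relative frequency of "tails" (cf. Fig. 2.5 / Table 2.5).
import random

random.seed(1)                       # fixed seed, so the run is reproducible
tails = 0
for toss in range(1, 1001):
    if random.random() < 0.5:        # model of a fair coin: P("tails") = 0.5
        tails += 1
    if toss in (1, 2, 3, 10, 100, 1000):
        print(f"toss {toss:4d}: relative frequency = {tails / toss:.3f}")
# With an increasing number of tosses, the printed relative
# frequencies approach the probability 0.5.
```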

3. Statistical Characteristics
Essential properties of a data set are the mean position of the individual values as well as their dispersion on the number line, which reaches from minus infinity to plus infinity. In this section, characteristic quantities suitable for describing these properties are explained.

3.1 Median

Already during the collection of individual values, it must be considered that their sequence generally contains essential information, namely the chronological order of their origination (e.g. temperature curve, experimental sequence). The values are written down in the sequence in which they arise. The resulting list is called the original list.

If several parts whose characteristic values are finally to be measured and analysed are drawn consecutively from series production, then it is appropriate to number the parts in accordance with their production sequence.
This is especially important when the measuring process takes place at a different location than the place of production and there is a danger that the sequence would otherwise be lost. Obviously, the characteristic values (measured values) obtained from the parts are recorded in correspondence with the numbering.

Example 3.1:

The following nine measured values were taken: 5, 6, 6, 3, 5, 8, 6, 7, 4.

One can assume that they are deviations from a certain nominal value (midpoint of the tolerance zone), e.g. in 1/100 mm or mV.

In general, one designates a characteristic value with the letter $x$ and the number of values with $n$. Here, a running index $i$ is added to the symbol $x$:

$x_i\,; \quad i = 1, 2, 3, \ldots, n$.

That is, the values are designated with

$x_1, x_2, x_3, \ldots, x_n$.

If the values are arranged according to their size and written down beginning with the smallest value, then one speaks of an ordered list. According to Example 3.1:

3, 4, 5, 5, 6, 6, 6, 7, 8

Generally formulated:

$x_{(1)} \le x_{(2)} \le \ldots \le x_{(n)}$

To distinguish them from the original list, the indices are placed in parentheses.

In an ordered list, the first quantity corresponds to the minimum, the last to the maximum characteristic value:

$x_{(1)} = x_{\min}\,, \qquad x_{(n)} = x_{\max}$.

If the characteristic values are plotted on a characteristic axis, one obtains the frequency distribution of the characteristic values (Fig. 3.1).

Fig. 3.1: Frequency diagram (dot diagram)

A characteristic value that is simple to determine is the median. It subdivides the sample into two halves of equal numbers. The median, designated with $\tilde{x}$ (read: "x tilde"), is determined by counting the values of an ordered list:

$\tilde{x} = x_{\left(\frac{n+1}{2}\right)}$   if $n$ is odd,

$\tilde{x} = \frac{1}{2} \left( x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)} \right)$   if $n$ is even.

The median thus only appears in the set of values in the case of an odd number of measured values; for an even number, it is defined as the mean of the two neighbouring values $x_{(n/2)}$ and $x_{(n/2+1)}$.

The ordered list already given above,

$x_{(1)} = 3\,,\; x_{(2)} = 4\,,\; x_{(3)} = 5\,,\; x_{(4)} = 5\,,\; x_{(5)} = 6\,,\; x_{(6)} = 6\,,\; x_{(7)} = 6\,,\; x_{(8)} = 7\,,\; x_{(9)} = 8\,,$

therefore has the median $\tilde{x} = x_{(5)} = 6$.

The essential advantage of the median is its independence from the extreme values of the
data set.
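
As an illustrative addition (not from the booklet), the counting rule for the median can be written as a short Python sketch and applied to Example 3.1:

```python
# Illustrative sketch: median by counting in the ordered list.
def median(values):
    x = sorted(values)               # ordered list x_(1) <= ... <= x_(n)
    n = len(x)
    if n % 2 == 1:                   # odd n: the middle value x_((n+1)/2)
        return x[n // 2]
    return (x[n // 2 - 1] + x[n // 2]) / 2   # even n: mean of the two middle values

print(median([5, 6, 6, 3, 5, 8, 6, 7, 4]))   # -> 6
```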

3.2 Arithmetic Mean

The arithmetic mean is defined as the sum of all individual values divided by the number of individual values:

$\bar{x} = \frac{\text{sum of all individual values}}{\text{number of all individual values}}$   ($\bar{x}$: say "x bar")

or, mathematically formulated:

$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$   (arithmetic mean).

The sum of all individual values is simply represented by the sum sign $\Sigma$ (upper-case Greek letter sigma). It means that all $n$ characteristic values $x$, beginning with the first measured value $x_1$ (corresponding to $i = 1$) and continuing to the last measured value $x_n$ (corresponding to $i = n$), are added up.

As a formula, it appears as follows:

$\sum_{i=1}^{n} x_i = x_1 + x_2 + \ldots + x_n$.

According to Example 3.1, the nine measured values taken ($n = 9$) produce the sum

$\sum_{i=1}^{9} x_i = 5 + 6 + 6 + 3 + 5 + 8 + 6 + 7 + 4 = 50$

and thereby the arithmetic mean

$\bar{x} = \frac{50}{9} \approx 5.6$.

If $\bar{x}$ is not indexed, it always stands for the arithmetic mean in this booklet.
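
As an illustrative addition, the arithmetic mean of Example 3.1 in a minimal Python sketch:

```python
# Illustrative sketch: arithmetic mean x_bar = (1/n) * sum of all x_i.
values = [5, 6, 6, 3, 5, 8, 6, 7, 4]
x_bar = sum(values) / len(values)
print(round(x_bar, 1))               # -> 5.6 (exactly 50/9)
```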

As the following example shows, the mean value solely depicts an orientation value for the mean position of the values on the number line. Without additional information, it can be almost worthless:

20 pupils in a class have an average body height of 1.70 m.

Can a conclusion be drawn about the actual distribution of the body heights?

It is possible that the body heights 1.50 m, 1.60 m, 1.80 m and 1.90 m are each represented equally often, and the remaining pupils are 1.70 m tall.

It is also possible, though, that one half of the pupils are about 1.60 m tall and the other half about 1.80 m tall.

It can also not be ruled out that 19 pupils are about 1.66 m tall and one is extremely tall at 2.46 m.

The example clarifies that the meaningfulness of the arithmetic mean always stands in connection with an associated distribution model (single-peaked, multiple-peaked, symmetrical, skewed distribution). It is especially clear that extreme values can substantially influence the arithmetic mean.

3.3 Moving Mean

A moving mean is formed from a sequence of characteristic values by formally uniting $n$ consecutive values of this sequence into one group and calculating the mean value of these $n$ values.
For each new characteristic value that is added to the sequence, one discards the first value of the previous group, so that a new group of size $n$ results, from which the next moving mean is calculated, and so on.

Example for $n = 5$:

3 7 4 9 1                  $\bar{x}_1 = 4.8$   (mean of values 1 to 5)
3 7 4 9 1 8                $\bar{x}_2 = 5.8$   (mean of values 2 to 6)
3 7 4 9 1 8 5              $\bar{x}_3 = 5.4$   (mean of values 3 to 7)
3 7 4 9 1 8 5 2            $\bar{x}_4 = 5.0$   (mean of values 4 to 8)

Obviously, the moving means determined in this way are no longer independent of one another. This characteristic value therefore responds to spontaneous changes only with a delay, which is in fact also desired.
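
As an illustrative addition, the following Python sketch reproduces the moving means of the example above (group size n = 5):

```python
# Illustrative sketch: moving mean over groups of n consecutive values.
def moving_means(sequence, n=5):
    return [sum(sequence[i:i + n]) / n       # mean of values i+1 ... i+n
            for i in range(len(sequence) - n + 1)]

print(moving_means([3, 7, 4, 9, 1, 8, 5, 2]))   # -> [4.8, 5.8, 5.4, 5.0]
```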

For example, a long-term trend represented as the monthly number of new registrations of
passenger cars and station wagons within a time interval may be easily recognized if a
moving mean is derived from the numbers of the last 6 or 12 months (see Fig. 3.3). Short-
term fluctuations have hardly any effect on the moving mean.

Within the scope of statistical process control, the possibility of using a quality control chart with a moving mean exists. In this case, the delayed response of the moving mean to spontaneous and undesired process states can be disadvantageous.

Fig. 3.3: Moving 12-months mean value (thick line)

3.4 Geometric Mean

The geometric mean is the $n$-th root of the product of all $n$ values of a series of numbers:

$x_g = \sqrt[n]{\text{product of all } n \text{ individual values}}$

or, mathematically formulated:

$x_g = \sqrt[n]{\prod_{i=1}^{n} x_i}$   (geometric mean).

The product of all individual values is simply represented by the product sign $\Pi$ (upper-case Greek letter pi). This notation means that all $n$ characteristic values $x$, beginning with the first value $x_1$ (corresponding to $i = 1$) up to the last value $x_n$ (corresponding to $i = n$), are multiplied by each other:

$\prod_{i=1}^{n} x_i = x_1 \cdot x_2 \cdot \ldots \cdot x_n$.

As an example, the nine measured values from Example 3.1 are used again. Their product is

$\prod_{i=1}^{9} x_i = 5 \cdot 6 \cdot 6 \cdot 3 \cdot 5 \cdot 8 \cdot 6 \cdot 7 \cdot 4 = 3{,}628{,}800\,,$

and thus the geometric mean is

$x_g = \sqrt[9]{3{,}628{,}800} \approx 5.4$.

The geometric mean finds application in connection with growth processes.

Example 3.4:

Let us assume that the population of a town grows exponentially. We want to rule out spontaneous (discontinuous) changes through mass influxes or catastrophic events.

Time in years    Year    Population
$t_1 = 0$        1970    $N_1 = 100{,}000$
$t_2 = 10$       1980    $N_2 = 141{,}000$
$t_3 = 20$       1990    $N_3 = 200{,}000$

We assume that the populations for the years 1970 and 1990 are known, and that the population in the year 1980 is to be deduced from these two pieces of information.

The arithmetic mean would deliver 150,000 as the result for the year 1980. Calculating in this way, one would, however, disregard the exponential growth and mistakenly calculate the value that would result from linear growth.

The geometric mean delivers the correct estimate in this example:

$x_g = \sqrt{100{,}000 \cdot 200{,}000} \approx 141{,}000$.

The reason for this relationship becomes clear when one considers what exponential growth means. The population increases in correspondence with the function

$N = N_0 \cdot e^{a\,t}$

within the time $t$. With the help of this law of growth and the information for the years 1970 and 1990, the growth parameter $a$ may be calculated:

$a = \frac{1}{t} \ln\frac{N_3}{N_1} = \frac{1}{20} \ln\frac{200{,}000}{100{,}000} = 0.03466$.

The population in the year 1980 may be determined by substituting the mean time $t_2 = \frac{t_1 + t_3}{2}$:

$N_2 = N_1 \cdot e^{\,a\,\frac{t_1 + t_3}{2}} = 100{,}000 \cdot e^{0.03466 \cdot 10} \approx 141{,}000$.

This corresponds to the result which one obtains via the geometric mean.
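
As an illustrative addition, a short Python sketch confirms that the geometric mean of the 1970 and 1990 figures coincides with the exponential interpolation:

```python
# Illustrative sketch: geometric mean vs. exponential growth (Example 3.4).
import math

N1, N3 = 100_000, 200_000
x_g = math.sqrt(N1 * N3)             # geometric mean of the two known values
a = math.log(N3 / N1) / 20           # growth parameter from the 20-year span
N2 = N1 * math.exp(a * 10)           # population after 10 years
print(round(x_g), round(N2))         # -> 141421 141421 (approx. 141,000)
```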

3.5 Harmonic Mean

If the measured values $x_i$ represent ratios (or reciprocal values), calculating the arithmetic mean can lead to a false result.

Example 3.5:
A motorist covers a distance of 200 km on the motorway. During the first half of the distance, $s_1 = 100\ \mathrm{km}$, he drives at $v_1 = 80\ \mathrm{km/h}$; during the second half, $s_2 = 100\ \mathrm{km}$, at $v_2 = 160\ \mathrm{km/h}$. What is the average speed?

The obvious answer $v = \frac{80 + 160}{2}\ \mathrm{km/h} = 120\ \mathrm{km/h}$ is false!

The correct result is found when one divides the entire distance by the total time taken:

$v = \frac{s_1 + s_2}{t_1 + t_2} = \frac{s_1 + s_2}{\frac{s_1}{v_1} + \frac{s_2}{v_2}}$.

Since both partial distances are equally long ($s_1 = s_2$), the result is:

$v = \frac{2}{\frac{1}{80\ \mathrm{km/h}} + \frac{1}{160\ \mathrm{km/h}}} \approx 107\ \mathrm{km/h}$.

The value to be considered in the general case,

$x_H = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \ldots + \frac{1}{x_n}}\,,$

is called the harmonic mean.
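
As an illustrative addition, the harmonic mean of the two speeds from Example 3.5 in a minimal Python sketch:

```python
# Illustrative sketch: harmonic mean x_H = n / (1/x_1 + ... + 1/x_n).
def harmonic_mean(values):
    return len(values) / sum(1 / x for x in values)

print(round(harmonic_mean([80, 160])))   # -> 107 (km/h), not 120 km/h
```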

3.6 Standard Deviation

The data sets represented in Fig. 3.6, each consisting of 7 measured values, are all characterized by the same arithmetic mean $\bar{x} = 5$.


Fig. 3.6: Dot frequency diagram of data sets with the same mean value ( x = 5 )

Although the mean value is the same in all cases, the individual values are obviously dispersed in different concentrations around it.
That is, a more or less large deviation of the individual values from the mean is present; in Fig. 3.6c this deviation is smallest, in Fig. 3.6b greatest.
It therefore appears useful to calculate an average deviation from the mean by dividing the sum of all individual deviations $(x_i - \bar{x})$ by the number of individual values $n$:

average deviation $= \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})$.

Here, however, the difficulty arises that the sum of all individual deviations is zero:

$\sum_{i=1}^{n} (x_i - \bar{x}) = (x_1 - \bar{x}) + (x_2 - \bar{x}) + (x_3 - \bar{x}) + \ldots + (x_n - \bar{x})$
$= (x_1 + x_2 + x_3 + \ldots + x_n) - n\,\bar{x}$
$= \sum_{i=1}^{n} x_i - n\,\bar{x}$.

Due to the relationship $\sum_{i=1}^{n} x_i = n\,\bar{x}$ (definition of the mean value), it follows conclusively that

$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$.

Obviously, the sum of the deviations of the individual values from the mean value is not a useful measure of variation.

An alternative consists of forming the sum of the absolute values of the individual deviations from the mean value and dividing it by the sample size. The measure of variation so defined is called the mean linear deviation:

$D = \frac{1}{n} \sum_{i=1}^{n} |x_i - \bar{x}|$.

This statistic is, however, not common.

A very important and frequently used measure of variation is found, on the contrary, when not the absolute values but the squared individual deviations are added together and used as the basis of the statistic:

$\sum_{i=1}^{n} (x_i - \bar{x})^2$.

By squaring, on the one hand, the individual contributions to the total deviation become positive, and on the other hand, the individual values lying further away from the mean value are weighted more strongly. A suitable measure of variation is finally found by dividing the sum of the squared deviations by the sample size reduced by one:

$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$.

This characteristic value $s^2$ of the deviation is designated as the variance.

It is apparent that the division is not performed with the number of individual values $n$, but rather with that very number reduced by one ($n - 1$). The reason for this is that the variance of the sample so defined is a good (mathematically expressed: unbiased) estimate of the unknown variance of the investigated population.

The value obtained by taking the square root of the variance $s^2$ is called the (empirical) standard deviation $s$:

$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$.

Because it is calculated from a sum of squared expressions, it is always a non-negative number (greater than or equal to zero).

In connection with repeated measurements for the purpose of estimating the measurement uncertainty, one occasionally finds measurement results stated in the form 23 ± 0.2 mm.
Thereby, the number 23 corresponds to the mean value $\bar{x}$ and the number 0.2 to the standard deviation $s$, both calculated from the individual measured values.
The statement $\bar{x} \pm s$ thus contains, besides the measured mean value, a piece of information about the spread of the measured values.

Since each measured value (e.g. 23 mm) consists of a numerical value (the number 23) and a unit of measurement (mm), it is apparent that the variance is unsuitable for stating the measurement uncertainty (a statement of the form 23 mm ± 0.04 mm² would be senseless).

The manual calculation of the standard deviation $s$ will be illustrated using the values from Example 3.1. For this, the following table is useful:

Running index   Individual value   Deviation          Square of deviation
i               $x_i$              $x_i - \bar{x}$    $(x_i - \bar{x})^2$
1               5                  -0.55              0.303
2               6                   0.45              0.203
3               6                   0.45              0.203
4               3                  -2.55              6.503
5               5                  -0.55              0.303
6               8                   2.45              6.003
7               6                   0.45              0.203
8               7                   1.45              2.103
9               4                  -1.55              2.403
Sum             50                                   18.227

Table 3.6

$\bar{x} = \frac{50}{9} \approx 5.55 \qquad s = \sqrt{\frac{18.227}{9 - 1}} \approx 1.51$
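
As an illustrative addition, the manual calculation of Table 3.6 can be reproduced with a short Python sketch; note the divisor n - 1:

```python
# Illustrative sketch: variance and empirical standard deviation (Example 3.1).
values = [5, 6, 6, 3, 5, 8, 6, 7, 4]
n = len(values)
x_bar = sum(values) / n
s2 = sum((x - x_bar) ** 2 for x in values) / (n - 1)   # variance s^2
s = s2 ** 0.5                                          # standard deviation s
print(round(s, 2))                                     # -> 1.51
```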

3.7 Coefficient of Variation

A characteristic value of no less importance for assessing populations is the coefficient of variation $v$. Here, the extent of the variation of the individual values is related to the size of the arithmetic mean:

$v = \frac{s}{\bar{x}} \cdot 100\,\%$.

The application of this statistic is advantageous when the mean values of two data sets which are subject to the same type of distribution and are to be compared differ strongly from one another.

3.8 Range

A further common measure of variation is the range $R$. The range is the difference between the last and the first value of an ordered series of values:

$R = x_{(n)} - x_{(1)}$

or, when applied to an arbitrary group of unordered values, the difference between the greatest and the smallest value:

$R = x_{\max} - x_{\min}$.

The range is always a non-negative number (greater than or equal to zero).

Example 3.8:

For the set of values (2, 3, 7, 5, 3, 2, -2, 0, 4, 3), the following results:

$x_{\max} = 7\,, \quad x_{\min} = -2\,, \quad R = 7 - (-2) = 9$.

3.9 Range Method for Determining the Standard Deviation

The range method is a simplified calculation procedure for the quick determination of a standard deviation $s_R$. This measure of variation $s_R$ corresponds in good approximation to $s$ and is sufficiently exact for many cases occurring in practice. The prerequisite for the application of this simple procedure is that the data set is based on a normal distribution and, in particular, does not contain an outlier.

The values of the measurement series are subdivided into $m$ groups of $n$ individual values each; the data set thus consists of $m \cdot n$ individual values in total. This procedure generally finds application when the measured values appear in groups anyway, e.g. in the case of the median-R chart within the scope of SPC in the form of samples of 5.

The respective group mean value $\bar{x}_j$ is then

$\bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{i,j}$

with $i$: running index within a group, and $j$: running index for the groups ($j = 1, 2, \ldots, m$).

The range of each group is $R_j = x_{\max,j} - x_{\min,j}$.

The mean $\bar{R}$ of the ranges of all groups is

$\bar{R} = \frac{1}{m} \sum_{j=1}^{m} R_j$.

The standard deviation $s_R$ is finally calculated from $\bar{R}$ and a tabulated auxiliary quantity $d_n$:

$s_R = \frac{\bar{R}}{d_n}$.

$d_n$ depends upon the number $n$ of individual values per group:

n     2     3     4     5     6     7     8     9     10
d_n   1.13  1.70  2.06  2.33  2.50  2.70  2.85  2.97  3.08

Example:

                 Group No. j
i           1     2     3     4     5     6
1          70    71    68    72    72    72
2          68    67    72    76    66    69
3          69    66    69    67    63    63
4          69    64    67    68    73    68
5          75    72    69    69    72    68
$\tilde{x}_j$   69    67    69    69    72    68
$R_j$       7     8     5     9    10     9

$\bar{R} = \frac{1}{6} \sum_{j=1}^{6} R_j = \frac{7 + 8 + 5 + 9 + 10 + 9}{6} = 8$

$s_R = \frac{\bar{R}}{d_n} = \frac{8}{2.33} \approx 3.4$
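
As an illustrative addition, the following Python sketch reproduces the range-method example above:

```python
# Illustrative sketch: range method with m = 6 groups of n = 5 values.
groups = [
    [70, 68, 69, 69, 75], [71, 67, 66, 64, 72], [68, 72, 69, 67, 69],
    [72, 76, 67, 68, 69], [72, 66, 63, 73, 72], [72, 69, 63, 68, 68],
]
ranges = [max(g) - min(g) for g in groups]   # R_j = x_max,j - x_min,j
R_bar = sum(ranges) / len(ranges)            # mean range, here 8.0
s_R = R_bar / 2.33                           # d_n = 2.33 for n = 5 (table above)
print(ranges, round(s_R, 1))                 # -> [7, 8, 5, 9, 10, 9] 3.4
```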

4. Statistical Calculations with Pocket Calculator

Since many inexpensive pocket calculators are now equipped with statistical functions, it is fortunately no longer necessary to calculate mean values and standard deviations in the manner explained above.
Although the handling, the assignment of keys and the labelling differ between calculators, the necessary inputs and the keys to be pressed mostly follow the scheme below:

1. Switch the calculator over to the statistics functions.

2. Type in the 1st value and press the enter key (data key).
In the display, the numeral 1 appears to show that the 1st value has been registered.

Type in the 2nd value and press the enter key (data key).
In the display, the numeral 2 appears to show that the 2nd value has been registered.
Input all remaining values in the same manner.

3. By pressing the key $\bar{x}$, the mean value is displayed.

4. By pressing the key $s$, the standard deviation is displayed.

The key for displaying the standard deviation can, for instance, be marked with the symbols $s$, $s_x$ or even $\sigma_{n-1}$. In this case, it is assumed that the entered values are sample results, with the help of which the standard deviation of the underlying population is to be estimated. The calculation of $s$ takes place in correspondence with the formula given above:

$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$.

In this case, the sum of the squared deviations (sum of squares) is divided by the sample size reduced by one.

If additional keys with symbols such as $s_n$, $\sigma$ or $\sigma_n$ exist, then, when these are activated, the calculation takes place according to the formula

$s_n = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$.

In this case, it is assumed that the entered values already correspond to the complete population (e.g. body heights of the pupils in a class), and the sum of the squared deviations is divided by the size of the population (the number of input values). When dealing with larger data sets, say from $n = 50$ onwards, the difference between the factors $\frac{1}{n-1}$ and $\frac{1}{n}$ becomes meaningless (the relative error in the case of $n = 50$ is approximately 2%).
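
As an illustrative addition (not from the booklet): Python's standard library offers both calculation variants, so the difference between the divisors n - 1 and n can be checked directly:

```python
# Illustrative sketch: s (divisor n - 1) vs. s_n (divisor n).
import statistics

values = [5, 6, 6, 3, 5, 8, 6, 7, 4]
print(round(statistics.stdev(values), 3))    # s,   divisor n - 1 -> 1.509
print(round(statistics.pstdev(values), 3))   # s_n, divisor n     -> 1.423
```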

Caution with Characteristic Values!

The following example shows that statistical characteristics alone do not yet allow a definite conclusion with respect to a nonconforming fraction relative to a tolerated limiting value (upper specification limit USL).

Measurement series 1, measured values $x_i$:

22 25 25 26 26
26 22 22 23 25
24 23 24 23 23
23 24 22 26 25
26 25 24 22 24

Mean: $\bar{x} = 24.0$   Standard deviation: $s_x = 1.4434$

Measurement series 2, measured values $y_i$:

24 23 22 25 23
23 25 21 24 24
24 23 22 26 24
23 26 24 24 25
24 25 28 24 24

Mean: $\bar{y} = 24.0$   Standard deviation: $s_y = 1.4434$

Fig. 4: Dot diagrams of measurement series 1 (above) and measurement series 2 (below)

Although the mean values and standard deviations of both measurement series are respectively equal, in the first case not a single value lies above the upper specification limit USL, while in the second case one measured value exceeds USL.

The example shows that when evaluating measurement series, it is indispensable to adopt as comprehensive and integral a point of view as possible and not to draw conclusions from only a few individual pieces of information. Apparently, in the example at hand, the distribution of the measured values must be considered (see Section 6).

Hint:

Frequently it is assumed that the quantities mean value and standard deviation are always based upon the normal distribution (see Section 6). This assumption is not correct. A pocket calculator obviously always calculates these statistical characteristics according to the same calculation algorithm, independently of any theoretical distribution.

5. Graphic Representation of Data
A graphic representation of data enables the viewer to quickly perceive the essential properties of a data set. In this way, it supports and simplifies the evaluation of measurement series.

Using an original value diagram, it is, for example, quite easy to recognize special features such as starting point, end point, trend, periodicity, accumulations of points, or single points which clearly lie far away from the majority of the remaining points, the so-called outliers.

Furthermore, the majority of graphical representations serve for the evaluation of data with respect to their statistical properties and characteristic quantities, even without computer aid.

5.1 Original Value Diagram

The original value diagram (original data chart) is a plot of measured values in the sequence of their emergence; the abscissa (x-axis) frequently corresponds to the time. Within the scope of process investigations, for instance, data are recorded at intervals of minutes, hours, shifts or days. In these cases, the time can therefore be given in the form of date and time of day. If an investigation is being conducted on parts which have been drawn from a production lot as a sample, the time of drawing the parts and the time of measuring the parts' characteristics can differ greatly. Under certain circumstances, this plays a role when dealing with products which can change with time (e.g. plastic parts, adhesive connections).

In the case of experimental investigations, however, a process can be of interest whose beginning is defined as the zero point of the time scale (e.g. the transient response of a control device). The following example shows the temperature response of a drying oven with a simple on-off control system.

Fig. 5.1: Example of an original value diagram

5.2 Tally Chart; Dot Diagram

Examples of dot diagrams are found in the preceding Sections 3.1, 3.6 and 4, and will be used to explain grouping in Section 5.3.

A dot diagram originates from a tally chart when each group of five (obviously, other groupings are also possible) is represented by a point, according to the following example. One should compare Fig. 5.2 with Fig. 4 (below). In both cases, a grouping naturally exists, so to speak (see Section 5.3), since the characteristic values only appear as integers. The height of a column is a measure of the absolute frequency of the corresponding characteristic value.

Fig. 5.2: Example of a tally chart

5.3 Grouping; Histogram

If the number of measured values in a sample is greater than about 25, a grouping is undertaken for convenience.

The procedure of grouping will be explained via an example with the following list of measured values. The measured values, given in millimetres (mm), are assumed; they could, however, originate from a production process such as the sawing of rod material (Example 5.3).

The following list of 50 values is given:

8.0 7.0 7.4 8.0 7.0
7.4 7.8 7.5 7.7 6.9
6.5 7.5 7.6 7.3 8.0
7.0 7.5 7.1 7.4 8.6
6.0 8.0 7.0 8.0 6.9

7.5 8.4 6.8 8.3 8.0
8.3 7.3 7.0 7.5 7.9
8.0 7.5 7.0 6.5 7.8
5.8 7.8 6.3 7.5 7.9
9.0 8.0 7.1 7.0 7.4

If one enters the measured values in a frequency diagram using a grouping with 7 classes, the following representation is produced:

Fig. 5.3.1: Frequency diagram (dot diagram) for Example 5.3; $k = 7$, $w = 0.5$

A different grouping with 22 classes, however, produces the following frequency diagram:

Fig. 5.3.2: Frequency diagram (dot diagram) for Example 5.3; $k = 22$, $w = 0.15$

The selected class number and class width apparently have a great influence on the appearance of the frequency diagram.

A good base value for an appropriately selected number of classes $k$ is supplied by the formula

$k \approx \sqrt{n}$   for $25 \le n \le 100$.

The limitation with reference to the sample size shows that this formula is only valid for measurement series of up to 100 values. In the case of fewer than 25 values, it is mostly not necessary to create a frequency chart.

If, on the contrary, more than 100 measured values are at hand, then it is recommendable to select the class number in accordance with the formula

$k \approx 5 \cdot \log(n)$   for $n > 100$.

Considering these rules, the following class numbers result:

Number of measured values n    Number of classes k
 25                             5
 30                             6
 45                             7
 60                             8
 75                             9
100                            10
200                            12

Table 5.3.1: Number of classes k depending upon the number of measured values n

As one can deduce from the frequency diagrams, each class includes a certain interval of values. The limits of such an interval are called the lower and upper class limits; the length of the interval is called the class width $w$.

The recommendation found now and then in the literature of calculating the class width $w$ from the range according to

$w = \frac{x_{\max} - x_{\min}}{k - 1}$

mostly results in class limits with several decimal places, which is inappropriate for the manual creation of such a chart and can, above all, lead to empty classes.

The frequency diagram according to Fig. 5.3.1 was created using the $n = 50$ values from Example 5.3. The following values resulted in correspondence with the above rules for the selection of the quantities $k$ and $w$:

$k \approx \sqrt{50} \approx 7$   and   $w = \frac{x_{\max} - x_{\min}}{k - 1} = \frac{9.0 - 5.8}{7 - 1} \approx 0.5$.

In Figures 5.3.1 and 5.3.2, the class limits were designated through a corresponding choice of the second decimal place, so that every characteristic value can clearly be assigned to one class. Other possibilities of unambiguous assignment consist of including the right class limit in the interval, or of counting values which are identical with a class limit as one half in each of the two neighbouring classes.

Hint:
Situations are conceivable in which it can be advantageous to choose the class width differently. For instance, the above rule for calculating the class width fails when the data set contains an outlier.
Strict compliance with this rule could then lead to the result that only the two outer classes become occupied (one of them containing only the outlier) and all the rest remain empty. One can avoid this, for instance, by leaving individual extreme values unconsidered when grouping, and then assigning them to the corresponding outer class (first or last class) after determining a grouping that is reasonable for the situation. This means that the outer right-hand class is open upwards and the outer left-hand class is open downwards.

It cannot be ruled out that even statistics programs which perform groupings in compliance with a few simple rules (for instance, for creating a histogram) deliver useless representations for the mentioned reasons, depending upon how exotic the data set is. Therefore, they mostly offer the user the possibility of correcting the grouping at their own discretion.

Before going into the calculation of the mean $\bar{x}$ and the standard deviation $s$ from a grouping, several important terms will be explained first.

Class limits:
Each class of a grouping is limited by a lower class limit $x'_{j-1}$ and an upper class limit $x'_j$.

Class midpoint $x_j$:
The class midpoint corresponds to the arithmetic mean of the lower and upper class limits:

$x_j = \frac{x'_{j-1} + x'_j}{2}$.

Class width $w_j$:
The class width corresponds to the distance between the lower and the upper class limit:

$w_j = x'_j - x'_{j-1}$.

In general, all classes have the same class width, i.e. $w_j = w$ for all classes.

Absolute frequency $n_j$:
The number of values which are allotted to the corresponding ($j$-th) class (one also speaks of the absolute class frequency).

Relative frequency $h_j$:
The absolute frequency divided by the total number $n$ of values in the data set:

$h_j = \frac{n_j}{n}$   with   $n = \sum_{j=1}^{k} n_j = n_1 + n_2 + n_3 + \ldots + n_k$.
j =1

Absolute cumulative frequency $G_j$:
The sum of the absolute frequencies $n_i$ from the first to the $j$-th class (inclusive):

$G_j = \sum_{i=1}^{j} n_i = n_1 + n_2 + n_3 + \ldots + n_j$.

Cumulative relative frequency $H_j$:
The relative proportion of all values below the upper class limit of the $j$-th class:

$H_j = \sum_{i=1}^{j} h_i = h_1 + h_2 + h_3 + \ldots + h_j$

or, in a simpler form for manual calculation: $H_j = \frac{G_j}{n}$.
n

A frequency chart represents the distribution of the measured values, that is, the relationship between a variable $x$ and the frequency of its occurrence. If the absolute frequencies are plotted above the characteristic axis, one obtains a dot diagram (see Figs. 5.3.1 and 5.3.2).

If, however, one plots the relative frequencies, one obtains the so-called histogram (also called a bar chart). The following bar chart is produced from the values of Example 5.3:

Fig. 5.3.3: Histogram of the values of Example 5.3

Here, rectangles are drawn above the characteristic classes whose heights correspond to the frequencies $h_j$ (with constant class width).
The following table contains all the essential characteristics:

Class   Lower class        Upper class     Absolute          Relative          Cumulative rel.   Auxiliary value   Auxiliary value
no. j   limit $x'_{j-1}$   limit $x'_j$    frequency $n_j$   frequency $h_j$   frequency $H_j$   $n_j x_j$         $n_j x_j^2$
1       5.75               6.25             2                 4%                 4%               12.0               72.0
2       6.25               6.75             3                 6%                10%               19.5              126.8
3       6.75               7.25            12                24%                34%               84.0              588.0
4       7.25               7.75            15                30%                64%              112.5              843.8
5       7.75               8.25            13                26%                90%              104.0              832.0
6       8.25               8.75             4                 8%                98%               34.0              289.0
7       8.75               9.25             1                 2%               100%                9.0               81.0
Sum                                        n = 50            100%                               375.0             2,832.6

Table 5.3.2 (the auxiliary values are explained in Section 5.4)

5.4 Cumulative Frequency Curve

If the cumulative relative frequencies are plotted over the upper class limits, one obtains an s-shaped curve, the so-called cumulative frequency curve (cumulative frequency polygon).

Fig. 5.4: Cumulative frequency curve of the values of Example 5.3

The advantage of the cumulative frequency curve as opposed to the frequency diagram is easily recognizable: the percentage of measured values lying within given limits (e.g. for the estimation of proportions nonconforming) can be read off without great effort.
If the original values of the data set are not known, the following formulas can be helpful; they enable one to calculate the mean and the standard deviation using the information contained in the histogram (as a reminder: $x_j$ designates the class midpoint here).

Mean:   $\bar{x} = \frac{1}{n} \sum_{j=1}^{k} (n_j \cdot x_j) = \frac{n_1 x_1 + n_2 x_2 + n_3 x_3 + \ldots + n_k x_k}{n}$

Variance:   $s^2 = \frac{1}{n-1} \sum_{j=1}^{k} n_j\,(x_j - \bar{x})^2 = \frac{1}{n-1} \left( \sum_{j=1}^{k} (n_j x_j^2) - n\,\bar{x}^2 \right)$

Standard deviation:   $s = \sqrt{s^2}$

The absolute frequencies $n_j$ can be calculated from the relative frequencies $h_j$: $n_j = n \cdot h_j$. In the above example (see Table 5.3.2), one finds:

$\bar{x} = \frac{375.0}{50} = 7.5\,, \qquad s^2 = \frac{1}{50-1}\,(2{,}832.6 - 50 \cdot 7.5^2) = 0.41$   and finally   $s = 0.64$.

When calculating these statistics using the original values, one obtains $\bar{x} = 7.454$ and $s = 0.6399$.
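
As an illustrative addition, the grouped calculation can be reproduced in Python from the class midpoints and absolute frequencies of Table 5.3.2:

```python
# Illustrative sketch: mean and standard deviation from grouped data.
midpoints = [6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0]     # class midpoints x_j
freqs     = [2, 3, 12, 15, 13, 4, 1]                # absolute frequencies n_j

n = sum(freqs)                                       # 50
x_bar = sum(nj * xj for nj, xj in zip(freqs, midpoints)) / n
s2 = (sum(nj * xj ** 2 for nj, xj in zip(freqs, midpoints))
      - n * x_bar ** 2) / (n - 1)
print(x_bar, round(s2, 2), round(s2 ** 0.5, 2))      # -> 7.5 0.41 0.64
```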

6. Statistical Distributions
6.1 Gaussian Normal Distribution

If one consecutively enlarges the size $n$ of a measurement series (under constant measuring conditions), i.e. the number of measured values becomes theoretically infinitely large, and concurrently reduces the class width (towards zero), then the cumulative frequency curve (see 5.4) approaches a limit curve which corresponds to the distribution function of the (infinite) population. Analogously, the stepped line which represents the upper boundary of the histogram approaches a limit curve which is the graphical representation of a probability density function (see Fig. 6.1, $n \to \infty$, class width $w \to 0$).

Fig. 6.1: Scheme for illustrating the transition from the histogram to the density function
by example of a normal distribution.

Every point on the characteristic axis corresponds to a number with theoretically infinitely many decimal places, e.g. $x = 73.26452718\ldots$. The probability that a characteristic (with known distribution) assumes exactly this value is zero. On the contrary, the probability that a characteristic value lies within the interval between 73.2 and 73.4 is a finite number greater than zero. One obtains such a probability by multiplying a value of the probability density function by the interval width. The probability density function is thus the generalization of the relative frequency for class widths shrinking towards zero.
The term density alludes to an analogy between the probability calculation and the mechanics of solid bodies (see e.g. [3]).

The surface area which is confined by the density function curve and a particular interval of the characteristic axis corresponds to the probability with which characteristic values of the population fall into this interval. This area is therefore a graphic analogue of probability. The total surface area confined by an arbitrary probability density function and the characteristic axis (between minus infinity and plus infinity) always corresponds to the value 1 (= 100%).

Experience from experimental investigations and statistical observations shows that one frequently finds characteristic distributions which result in histograms of similar appearance. The mathematician C. F. Gauss investigated this phenomenon on land surveying data. This type of distribution bears the name normal distribution and frequently serves as a distribution model for technical-statistical phenomena. Due to its characteristic form, the graphical representation of the density function of this distribution is also called the Gaussian bell-shaped curve.

6.1.1 Properties and Characteristics of the Normal Distribution

The bell-shaped curve is the graphical representation of the density function of the normal distribution, which is described by the mathematical relationship

$f(x) = \frac{1}{\sigma \sqrt{2 \pi}} \cdot e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2}$.

The function (and thus the curve) is determined by the parameters $\mu$ and $\sigma$. Thereby, $\mu$ represents the mean value of the distribution and $\sigma$ its standard deviation.

By means of the functional equation or its graphical representation, several peculiarities can be recognized:

- The curve is symmetrical about the mean value $\mu$.

- The curve exhibits inflection points at $\mu - \sigma$ and $\mu + \sigma$. This means that, e.g. at the point $\mu - \sigma$, the curvature which is orientated away from the $x$-axis transforms into a curvature orientated towards the $x$-axis.

- The curve runs from $x = -\infty$ to $x = +\infty$. This is, however, only interesting for theoretical considerations. The curve is mostly of practical importance only within a distance of three to four standard deviations to the left and right of the mean value $\mu$ (Fig. 6.1.1). There, the curve already closely approaches the $x$-axis.

As already mentioned the surface area under the Gaussian curve corresponds to an
infinitely large number of measured values from a normally distributed population. If this
area is set equal to 1 (corresponding to 100%), then a portion lying between two points (in
%) can be ascertained.

If distances in multiples of the standard deviation σ are marked on the characteristic axis,
beginning from the mean value μ towards the left and right, then the corresponding
proportions of the distribution can be given. In Fig. 6.1.1, these
proportions are distinguished by cross-hatching for the intervals μ ± 1σ, μ ± 2σ and
μ ± 3σ.

Accordingly, the following is obtained for the interval

μ ± 1σ a proportion of 68.3 %,

μ ± 2σ a proportion of 95.4 %,

μ ± 3σ a proportion of 99.7 %.

One can see that outside μ ± 3σ only a vanishingly small proportion of the
distribution is found, namely 0.3 % (= 100 % - 99.7 %) (see Fig. 6.1.1).
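
These proportions can be checked numerically. A short sketch using Python's standard library (no assumptions beyond the standard normal model itself):

```python
from statistics import NormalDist

u = NormalDist()  # standard normal distribution, mu = 0, sigma = 1
for k in (1, 2, 3):
    p = u.cdf(k) - u.cdf(-k)   # proportion within mu +/- k*sigma
    print(f"mu +/- {k} sigma: {100 * p:.1f} %")
# prints 68.3 %, 95.4 %, 99.7 %
```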

Fig. 6.1.1: Area proportions under the bell-shaped curve

6.1.2 Distribution Function

In Section 5.4 it is described how the cumulative frequency curve of a frequency
distribution is determined. By addition of the individual relative class frequencies, the
cumulative relative frequencies are determined, which are finally plotted versus the upper
class limits. The corresponding points are connected piece-wise by straight lines. At the
last upper class limit all the values are then accounted for (= 100% of the distribution).

The cumulative frequency curve of the Gaussian distribution is in principle ascertained
in the same way. However, the cumulative frequency formed in dependence upon the
characteristic value x, which corresponds to a certain area proportion under the
Gaussian curve, must be calculated by means of a special mathematical procedure:
integration.

Fig. 6.1.2.1: Illustration of integration

The function which describes the cumulative frequency curve of a probability distribution
is called a distribution function F(x). For every x it gives the probability that a randomly
measured value is smaller than or equal to x.

Mathematically formulated, the probability up to the point x is given by the distribution
function:

$$F(x) = \int_{-\infty}^{x} \frac{1}{\sigma \sqrt{2\pi}} \; e^{-\frac{1}{2}\left(\frac{v-\mu}{\sigma}\right)^2} \, dv \,.$$

F(x) corresponds to the area below the Gaussian bell-shaped curve up to the value x.
Fig. 6.1.2.1 illustrates the meaning of integration. The area below the curve up to the point x
is approximately calculated by determining and summing up the areas of
narrow rectangles (width Δx). The result becomes the more accurate, the narrower the
rectangles are made (passage to the limit: Δx → 0).
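
The following sketch imitates this passage to the limit numerically; the lower integration bound of -10 is an assumed practical stand-in for minus infinity:

```python
from statistics import NormalDist

std = NormalDist()          # standard normal density and distribution function
x = 1.0                     # evaluate F(1.0); exact value ~0.841345
lower = -10.0               # practical stand-in for minus infinity

for k in (10, 100, 1000, 10000):
    dx = (x - lower) / k
    # sum of rectangle areas: density at the midpoint times width dx
    approx = sum(std.pdf(lower + (i + 0.5) * dx) * dx for i in range(k))
    print(f"{k:>6} rectangles: F(1.0) ~ {approx:.6f}")
print(f" exact: {std.cdf(x):.6f}")
```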

In Fig. 6.1.2.2 (below) the distribution function is represented as a curve.

Fig. 6.1.2.2: Comparison of the probability density function (top) and the distribution
function (below) of the normal distribution

From this representation, it is evident that at the position of the mean μ the cumulative
frequency is 50%. The 0%-line as well as the 100%-line are touched by the curve
theoretically only at infinity. At μ - 3σ and μ + 3σ, however, the corresponding lines
are nearly reached. At μ - 3σ the cumulative probability is 0.135%, at μ + 3σ it has the
value 99.865%.

It is easily recognizable that the proportion of the distribution between μ - 3σ and
μ + 3σ is approximately equal to 99.73%, namely 99.865% - 0.135%.

6.1.3 Standard Normal Distribution

The Gaussian curves attain their great practical importance only through a standardization
process. This becomes understandable when one considers that a separate Gaussian curve
can be drawn for each arbitrary normal distribution.

Standardization has the effect that all Gaussian curves can be transformed into a standard
curve with the mean value μ = 0 and standard deviation σ = 1. This is attained
through the following transformation:

$$u = \frac{x - \mu}{\sigma} \,.$$

Subtracting μ from x shifts the mean value to the zero point. Division by
the standard deviation corresponds to a compression (or stretching) of the abscissa, so that
the standard deviation 1 results. Fig. 6.1.3.2 illustrates this transformation of scale.
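
As a small illustration, the following sketch standardizes a single value; μ = 346, σ = 7 and x = 360 are assumed example values:

```python
from statistics import NormalDist

mu, sigma = 346.0, 7.0      # assumed parameters of an arbitrary normal distribution
x = 360.0

u = (x - mu) / sigma        # standardizing transformation
# the probability below x can now be read from the standard normal distribution
p = NormalDist().cdf(u)
print(f"u = {u:.2f}, P(X <= {x}) = {100 * p:.2f} %")   # u = 2.00, ~97.72 %
```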

Fig. 6.1.3.1: Bell-shaped curve of the standard normal distribution

To draw the bell-shaped curve, one can use the following approximation values, whereby
the vertex height S (in mm) can be chosen arbitrarily (cp. Fig. 6.1.3.1):

Abscissa u   0.5        1          1.5        2          3
Ordinate     7/8 S      5/8 S      2.5/8 S    1/8 S      0.1/8 S
             = 0.88 S   = 0.63 S   = 0.33 S   = 0.13 S   = 0.01 S

Table 6.1.3.1: Approximation values for the graphical representation of the Gaussian curve.

The advantage of standardization lies in the fact that for Gaussian distributions with
arbitrary μ and σ, the probability density, and thus also the cumulative function (cumulative
frequency), depend only upon the value of the variable u.

Fig. 6.1.3.2: Representation of fractions nonconforming

Two new terms emerge in connection with this representation:

the quantity P, which gives the probability that a randomly measured value lies
between x_α and x_{1-α}, and

the nonconforming fraction α. This quantity corresponds to the probability with which
a measured value is smaller than x_α. Due to the symmetry, α is also equal to the
probability that a measured value is greater than x_{1-α}.

The area below the Gaussian curve corresponds to P + 2α = 1 = 100 %.

For the interval μ ± 3σ the probability P = 99.73 % results, and therefore the one-
sided nonconforming fraction has the value α = 0.135 % (due to 2α = 0.27 %).

So-called limits of variation (positions on the characteristic axis) are assigned to the
nonconforming fraction α: a lower limit, which is designated with x_α, and an upper limit,
which is designated with x_{1-α}.

By rearranging the standardizing equation

$$u = \frac{x - \mu}{\sigma}$$

these limits of variation may easily be calculated. The following applies:

$$x_{1-\alpha} = \mu + u_{1-\alpha} \cdot \sigma \,.$$

Because the normal distribution is symmetrical, the following applies:

$$x_{\alpha} = \mu + u_{\alpha} \cdot \sigma \quad \text{with} \quad u_{\alpha} = -u_{1-\alpha} \,, \quad \text{thus} \quad x_{\alpha} = \mu - u_{1-\alpha} \cdot \sigma \,.$$

Now, the practical usefulness of the standardization becomes understandable: for each
Gaussian distribution with arbitrary μ and σ, the portion P of the distribution lies between
the variation limits x_α and x_{1-α}.

This generally valid connection between u and P is apparent from the following table:

u          1.0    1.28   1.64   1.96   2.0    2.33   2.58   3.0     3.29
P(u) in %  68.3   80.0   90.0   95.0   95.4   98.0   99.0   99.7    99.9
α in %     15.9   10.0   5.0    2.5    2.28   1.0    0.5    0.135   0.05

Table 6.1.3.2: Relationship between u and P

In Section 10 a detailed table of the standard normal distribution is given. However,
somewhat different designations are used there: D(u) corresponds to the quantity which is
designated with P here, and Φ(-u) is identical to α.

6.2 Normal Probability Paper

The probability paper of the normal distribution is an aid for graphic determination of
statistics of a data set. Furthermore, it affords one the possibility to test whether the values
of the data set can originate from a normal distribution or not.

The ordinate division of the probability paper is designed such that data sets originating
from normal distributions deliver dot sequences which lie on a straight line. Pictorially
speaking, the sum curve from Fig. 6.1.2.2 is straightened by correspondingly distorting
the ordinate axis. This is attained by giving the cumulative frequencies belonging to the
integral multiples of the (dimension-less) quantity u equal distances to each other
on the ordinate. The abscissa, i.e. the characteristic axis (here represented for the
quantity u; see Section 6.1.3), has a linear division.

Fig. 6.2: Construction of the normal probability paper

Because the values of the cumulative relative frequency determined from a
sample normally follow the theoretical normal distribution only approximately, the
corresponding points in the probability plot are to be approximated by a best-fit line.

The application and the practical benefit of the probability paper are explained in the
following section.

Frequency Diagram with Probability Paper

The form sheet BVE 21723 "Frequency Diagram with Probability Paper" (only available
in German) supports the creation of a frequency diagram as well as the graphical
representation on normal probability paper (see Examples 6.2.1 - 6.2.4).

To the cumulative relative frequency of

50%     corresponds the mean value x̄,
99.85%  corresponds the value x̄ + 3s,
0.15%   corresponds the value x̄ - 3s.

These relationships justify the handling of the probability paper.

In the case of measurement series which consist of more than 25 measured values and
which are evaluated with the help of a grouping, one plots the values of the
cumulative relative frequency versus the upper class limits on the probability paper.

When dealing with smaller measurement series, one first creates an ordered list (measured
values arranged according to size)

x_(1), x_(2), x_(3), ..., x_(n)

and then assigns to these values cumulative relative frequencies H_i(n), which can be
calculated with the help of the approximation formula

$$H_i(n) = \frac{i - 0.3}{n + 0.4} \qquad (i = 1, 2, \ldots, n):$$

x_(1), x_(2), ..., x_(n)
H_1(n), H_2(n), ..., H_n(n).

Finally, the points (x_(i), H_i(n)) are plotted on the probability paper.
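
As an illustration, the following sketch computes these plotting positions for the measurement series of Example 6.2.3 (given further below); nothing beyond the approximation formula above is used:

```python
# plotting positions H_i(n) for an ordered measurement series,
# computed with the approximation formula (i - 0.3) / (n + 0.4)
values = [311, 319, 321, 321, 321, 321, 325, 327,
          329, 329, 329, 331, 333, 333, 335]      # data of Example 6.2.3

n = len(values)
for i, x in enumerate(sorted(values), start=1):
    h = (i - 0.3) / (n + 0.4)                     # cumulative relative frequency
    print(f"x_({i:2}) = {x}, H_{i}({n}) = {100 * h:.1f} %")
```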

Through the resulting dot sequence, a best-fit line is drawn in such a manner that the
deviations of the points from the line are as small as possible and roughly equally many
dots lie above and below the line. The better the dot sequence is approximated by the line,
the more suitable the model of the normal distribution is for describing the evaluated data
set.

The graphic determination of statistical characteristic quantities is explained using
Example 6.2.1.

Graphic determination of the mean value

The mean value x̄ is found by searching for the intersection point of the horizontal line at
50% cumulative frequency with the best-fit line and reading the corresponding value on
the x-axis.

One finds x̄ = 346.

Graphic determination of the standard deviation

The standard deviation s can be determined by reading the x-values belonging to the
cumulative frequencies 99.85% (corresponds to x̄ + 3s) and 0.15% (corresponds to
x̄ - 3s); in the example at hand these values are x = 366 and x = 324. Their difference is
divided by 6 (see the corresponding evaluation aid on the right-side boundary of the form
sheet, Example 6.2.1).

One therefore calculates:

$$s = \frac{366 - 324}{6} = \frac{42}{6} = 7 \,.$$

Graphic determination of a fraction nonconforming

In the case at hand, the upper limiting value (upper specification limit) is USL = 360.
One finds the theoretical portion of the population which exceeds USL by drawing a
vertical line at this point and determining its intersection point with the best-fit line.
The cumulative frequency belonging to this intersection point corresponds to the portion
of the population which lies below USL, thus about 97.5%.

The nonconforming fraction being sought (one-sided, right) is then the difference to
100%, that is 2.5%.

The fraction nonconforming with respect to a specific lower limiting value (lower
specification limit) LSL results analogously. One must therefore draw a perpendicular
line at the position of LSL and then read the cumulative frequency that belongs to the
intersection point with the best-fit line. This value directly corresponds to the sought
nonconforming fraction (one-sided, left).

The so-called two-sided nonconforming fraction is the sum of both one-sided non-
conforming fractions (upper and lower). It gives the proportion of the units of the
population whose characteristic values lie outside the tolerance interval.
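
For comparison with the graphical reading, the following sketch computes the one-sided fraction arithmetically from the graphically determined statistics of Example 6.2.1 (x̄ = 346, s = 7, USL = 360), assuming the normal distribution model:

```python
from statistics import NormalDist

x_bar, s = 346.0, 7.0      # graphically determined mean and standard deviation
usl = 360.0                # upper specification limit of the example

dist = NormalDist(x_bar, s)
p_exceeding = 1 - dist.cdf(usl)      # one-sided fraction above USL
print(f"fraction nonconforming: {100 * p_exceeding:.2f} %")   # ~2.28 %
```

The result of about 2.3% agrees with the roughly 2.5% read from the plot within the reading accuracy of the paper.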

Example 6.2.1: Dynamic pressure values of conical seats

The frequency distribution of a sample of 50 measured values is depicted here. Each
measured value is marked with an X in the corresponding class.
The cumulative relative frequency H does not need to be calculated here, since the
vertical positions of the points on the G_j-scale (for n = 50) can be read on the right
boundary of the probability paper.

Example 6.2.2: Guiding diameter of nozzle needles

This sample contains n = 240 measured values. In order to be able to represent all
values in the frequency diagram, each measured value is represented by a tally stroke, and
five values respectively are combined into one group (see Section 5.2, tally chart).

Example 6.2.3: Dynamic pressure values of conical seats

This example shows the evaluation of a small measurement series (n = 15 measured
values) on probability paper. The following table gives the measured values x_i
(in 10 N/mm²) and the corresponding cumulative relative frequencies H_i(n) for plotting
the points (x_(i), H_i(n)) on probability paper (in ascending sequence).
The latter were calculated with the help of the approximation formula

$$H_i(n) = \frac{i - 0.3}{n + 0.4} \qquad \text{for } i = 1, 2, \ldots, 15\,.$$

Example: $H_2(15) = \frac{2 - 0.3}{15 + 0.4} \approx 0.11 = 11\%$.

No.          1     2     3     4     5     6     7     8
x_i          311   319   321   321   321   321   325   327
H_i(n) in %  4.5   11.0  17.5  24.0  30.5  37.0  43.5  50.0

No.          9     10    11    12    13    14    15
x_i          329   329   329   331   333   333   335
H_i(n) in %  56.5  63.0  69.5  76.0  82.5  89.0  95.5

The scale on the characteristic axis must be chosen with some thought. On the one hand,
it should still be possible to draw the best-fit line completely, so that one can if necessary
read the intersection points on the upper and lower boundary of the paper. On the other
hand, the line should not run too steeply, for the sake of reading accuracy.

Example 6.2.4: Investigating a bush grinding process

In this case the measured values were recorded in the upper part of the form sheet in a
time sequence, similar to an original value diagram. From this representation, the
frequency diagram is created in a simple manner.

6.3 Lognormal Distribution

When a characteristic value cannot exceed or fall below a certain limit due to physical
reasons, a so-called skewed (asymmetric) distribution of characteristic values often occurs.
This is the case, e.g., with the characteristics deflection and concentricity, where the
lower limit is zero, or with the Rockwell hardness testing of steel, where a certain minimum
hardness is never undershot. The evaluation of the frequency chart of such a skewed
distribution on normal probability paper produces a curved line (Fig. 6.3.0.1).

Fig. 6.3.0.1: Evaluating a skewed distribution on normal probability paper

Fig. 6.3.0.2: The same data set plotted on lognormal probability paper

If, on the contrary, one represents the same data set on lognormal probability paper, an
approximately straight-lined chain of dots results (see Fig. 6.3.0.2).

6.3.1 Lognormal Probability Paper

The ordinate scale of lognormal probability paper is identical to the scale of normal
probability paper. The two papers differ only in the scaling of the abscissa
(x-axis). In the case of the lognormal probability paper, this scaling is logarithmic. Owing
to the relationships given in the following table (the first two columns contain different
writing styles for the same number x), the points corresponding to the values 10, 100,
1,000 have equal distances from one another on a logarithmically divided axis.

x               x        log(x)
1/100 = 0.01    10^-2    -2
1/10  = 0.1     10^-1    -1
1               10^0      0
10              10^1      1
100             10^2      2
1,000           10^3      3

It is thus possible by simple means to represent data sets which extend over two or more
orders of magnitude. In the case of the example according to Fig. 6.3.1, the values lie in the
interval between 0.15 and 0.95. It is therefore reasonable to assign the value 0.1 to the
first scale point labeled 10, and the value 1 to the middle 10. It should be observed that
the value zero can never appear on a logarithmic scale, since the position where the zero
would have to stand lies at minus infinity.

The plotting of the points takes place just like on normal probability paper (see Section
6.2).

Characteristic quantities of the lognormal distribution are the geometric mean x_g and the
shape parameter ε (epsilon). Due to the asymmetry of the distribution, the mean value x_g
is not identical to the most frequent value (mode). x_g is the median, i.e. 50% of the
values of the distribution lie to the left and right of this number. It can therefore easily be
ascertained graphically by searching for the intersection point of the best-fit line with the
horizontal line at 50% and then reading the corresponding x-value.

To determine the shape parameter ε, one first determines in the probability plot the
intersection point of the best-fit line with the horizontal line at a cumulative frequency of
84.13% (corresponding to x_g · ε), reads the corresponding value on the x-axis and finally
divides this number by x_g.
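
Numerically, both quantities can be obtained from the logarithms of the data; this anticipates the relationships of Section 6.3.2 (z̄ = ln(x_g), s_z = ln(ε)). The measurement series in the following sketch is hypothetical:

```python
import math

# hypothetical zero-limited measurement series (e.g. concentricity values)
values = [0.18, 0.22, 0.25, 0.28, 0.31, 0.35, 0.41, 0.48, 0.55, 0.72]

logs = [math.log(x) for x in values]
n = len(logs)
z_mean = sum(logs) / n
s_z = math.sqrt(sum((z - z_mean) ** 2 for z in logs) / (n - 1))

x_g = math.exp(z_mean)       # geometric mean
eps = math.exp(s_z)          # shape parameter epsilon
print(f"x_g = {x_g:.3f}, epsilon = {eps:.3f}")
```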

Similarly to the normal distribution, area fractions which are limited within certain
intervals by the curve of the density function and the x-axis can be determined with the
help of x_g and ε:

68.3% of all values of a lognormal distribution lie between x_g / ε and x_g · ε.

95.4% of all values of a lognormal distribution lie between x_g / ε² and x_g · ε².

99.7% of all values of a lognormal distribution lie between x_g / ε³ and x_g · ε³.

In general, in connection with zero-limited characteristics, only the upper specification
limit USL is of interest. In this case, the corresponding nonconforming fraction can be
determined in the logarithmic probability plot. For this, one draws a perpendicular line at
the point USL (on the x-axis) and determines the cumulative frequency value H(USL) in
percent belonging to the point of intersection with the best-fit line. The value
100% - H(USL) corresponds to the theoretical portion of the distribution (in percent)
which exceeds the limiting value.
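
A sketch of the corresponding calculation, using the values read from Fig. 6.3.1 further below (x_g = 0.31, ε = 1.65, USL = 1.0) and the lognormal model:

```python
import math
from statistics import NormalDist

x_g, eps = 0.31, 1.65      # values from the evaluation in Fig. 6.3.1
usl = 1.0                  # upper specification limit

# standardize on the logarithmic scale: u = (ln(USL) - ln(x_g)) / ln(eps)
u = (math.log(usl) - math.log(x_g)) / math.log(eps)
p_exceeding = 1 - NormalDist().cdf(u)
print(f"H(USL) ~ {100 * (1 - p_exceeding):.1f} %, "
      f"fraction nonconforming ~ {100 * p_exceeding:.1f} %")   # ~99 % and ~1 %
```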

Fig. 6.3.1 shows an example of such an evaluation. One should observe that the frequency
diagram (above) and the probability paper (below) have different abscissa scales (linear
and logarithmic division); the points on the probability paper therefore no longer lie
perpendicularly below the positions of the upper class limits of the frequency
diagram.

If one goes along the horizontal line at a cumulative frequency of 50% towards the right
up to the best-fit line and then moves perpendicularly downwards at this intersection
point, one finds the geometric mean x_g = 0.31. Correspondingly, one finds at the
intersection point of the 84.13%-line with the best-fit line the value x_g · ε = 0.51.
Division of these two numbers delivers the shape parameter ε = 0.51 / 0.31 = 1.65. If one
follows the perpendicular line at the upper limiting value USL = 1.0, one finds the
intersection point with the best-fit line at a cumulative frequency of 99%. The
nonconforming fraction with reference to this limit is then 1%.

Fig. 6.3.1: Graphic evaluation on lognormal probability paper

6.3.2 Relationship between the Normal and the Lognormal Distribution

By taking the logarithm, a lognormally distributed characteristic x is transformed into a
normally distributed characteristic z. Taking the logarithm strongly compresses the upper
part of a lognormal distribution and strongly stretches the interval between the geometric
mean x_g and the zero point. This corresponds to a reflection with reference to the curve
of the function z = ln(x). Fig. 6.3.2 illustrates that x_g is transformed to z̄ and the point
x_g · ε corresponds to the value z̄ + s_z. Thus:

$$\bar{z} = \ln(x_g) \;\Leftrightarrow\; x_g = e^{\bar{z}} \qquad \text{and} \qquad s_z = \ln(\varepsilon) \;\Leftrightarrow\; \varepsilon = e^{s_z}\,.$$

Fig. 6.3.2: Representation of the relationship between the normal
and the lognormal distribution.

Hint: It may appear somewhat confusing that the preceding explanations are based on the
natural logarithm ln(x) (base e logarithm), while the probability paper exhibits a
division corresponding to the common logarithm log(x) (base 10 logarithm). This fact is,
however, irrelevant for the comparability with numerical calculations, because when
calculating x_g and ε a reverse transformation into the original coordinate system
always takes place, and calculations of nonconforming fractions or limiting values are only
performed with the help of these two quantities. By application of log(x) and its
inverse function 10^x, one finds the expressions x_g = 10^{z̄} and ε = 10^{s_z}.

6.4 Mixture of Distributions
It can happen that the dimensions of parts which were manufactured on two different
machines or production lines exhibit different distributions. In general one observes that
average and/or variation deviate from each other whereas the distribution types are the
same. If the parts are not separated, then this is expressed statistically in the origination of
a mixture distribution. Mixture distributions can also occur when, in a running production,
substantial influence quantities change practically erratically (tool exchange, change of
material batch).

The histogram of a mixture distribution shows in general two or more maxima (peaks).
One also speaks of a bi-modal or multi-modal distribution.
Insofar as mixture distributions originate through superposition of two normal
distributions whose mean values differ greatly, a plot (of cumulative frequencies or
individual values) on probability paper shows dot sequences that may be approximated
piece-wise by two different straight lines.

Fig. 6.4: Representation of a mixture distribution originating from two collectives.

7. Quality Control Charts
Hint: In this chapter only the statistical fundamentals of the control chart technique are
illustrated. Special procedures and current control systems with regard to their practical
application are described in Booklet 7 "Statistical Process Control (SPC)" of the series
"Quality Assurance in the Bosch Group, Technical Statistics".

Statistical Process Control is a proven methodology for controlling a manufacturing
process on the basis of statistical methods.
Samples of parts are drawn from the process in accordance with process-specific
sampling rules, their characteristic values are measured and entered in a form
sheet, the so-called quality control chart. The statistical characteristic quantities derived
from the characteristic values are then used for assessing the current process condition. If
necessary, the process condition is corrected through suitable measures.

The control chart technique was developed by Walter Andrew Shewhart (1891-1967) in
the nineteen-twenties and described in detail in 1931 in his work "Economic Control of
Quality of Manufactured Product".

SPC is an application from the field of inductive (conclusive) statistics. Not all measured
values are available, as would be the case with a 100% inspection. From a small data
set, the sample values, a conclusion is drawn about the population.
The mathematical model for variable quantities is based on the idea that many influence
quantities have an effect on a process. The 5 Ms (Man, Machine, Material, Milieu,
Method) form the main elements of the influence quantities.
Each M may be subdivided further, e.g. Milieu (= environment) into temperature,
humidity, vibration, contamination, lighting, ....

The uncontrollable, random effect of many influence quantities leads to deviations of
actual characteristic values from target values (in general, the midpoint of the tolerance
zone), despite a thorough approach.
From the random interaction of many influence quantities, a Gaussian normal distribution
results for the considered characteristic of the product. This fact is described by
the central limit theorem of statistics. The normal distribution is therefore of
fundamental importance to SPC.

7.1 Location Control Charts

A first hint about the process average is found already by taking a produced part as a
random sample, measuring the characteristic of interest and then comparing the found
characteristic value with the estimate μ̂ of the mean. If the single value lies within an
interval of, for instance, μ̂ ± 3σ̂, this result will not be further surprising, because under
the mentioned prerequisites, in the case of a purely random process behaviour, 99.73% of
all characteristic values should lie in this interval. One could then conclude from such a
result that the current process average corresponds to that of the pre-production run
(estimate μ̂), or that there is no hint of a change of the process average.

The confidence in such a statement would, however, be substantially greater if it were
based not only on an individual sample value, but for instance on n = 5 values. The
preceding thoughts form the basis of the function of a quality control chart. A conclusion
about the momentary process level is drawn from the result of the current 5-piece sample.
Still undecided is only the question of which characteristic quantities should be
considered as carriers of information about the process average.

Besides the possibility of considering all five individual values separately, it is
recommendable to undertake an assessment based on the location of the mean value x̄ or
the median x̃ of the five values. These three possibilities correspond to three different
control charts for the location: the original data chart, the average chart and the median
chart. In this documentation we limit ourselves to a representation of the average
chart and the original data chart.

7.1.1 Average Chart

The average chart is the most important and, in practice, the most used quality control
chart. Its function shall be explained in the following description.
At constant time intervals, samples of size n = 5 are drawn from a production process, the
characteristic to be monitored is measured, and the five individual measured results as
well as their standard deviation s and mean x̄ are entered in a suitable form sheet, the
quality control chart.
We now consider only the sample means x̄_i found in time sequence. The standard
deviations s_i are handled in connection with the s chart (Section 7.2.1).
The calculated mean values are entered graphically in a diagram and then interconnected
by a chain of lines.

Fig. 7.1.1.1: Schematic representation of the function of an average chart. To exemplify
the relationships, in addition to the mean values (dots) also the individual
values (crosses) of each sample are depicted.
Horizontal scale: No. of the sample.

The mean values, like the individual values, show a variation about the process average μ;
however, their variation is smaller than that of the individual values by the factor 1/√5.
This relationship is exemplified by the following figure.

Fig. 7.1.1.2: Relationship between the variation of individual values
(original values) and the variation of mean values.

In general the following applies: If the individual values of a process characteristic are
dispersed with the standard deviation σ about the mean μ, then the standard deviation of
the means x̄ of n values each is equal to σ/√n.

Here, we switch over to the consideration of the general case of a sample of size n.

The random variation range of the mean values is now easily found when one exploits the
fact that the transformation u = (x - μ)/σ transforms a normally distributed variable x into
a standard normally distributed variable u (see Section 6.1.3), and substitutes for x (with
the standard deviation σ) the mean values x̄ (with the standard deviation σ/√n):

$$u = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \,.$$

In the table of the standard normal distribution, one finds the limits u_lower = -2.58 and
u_upper = +2.58 for the (two-sided) 99% random variation range of the quantity u. With a
probability of 99%, u therefore lies within the limits -2.58 and +2.58:

$$-2.58 \le u \le +2.58 \,.$$

By substituting u, the 99% random variation range of the mean x̄ of a sample of size n is
attained:

$$-2.58 \le \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \le +2.58 \quad\Longleftrightarrow\quad \mu - 2.58\,\frac{\sigma}{\sqrt{n}} \;\le\; \bar{x} \;\le\; \mu + 2.58\,\frac{\sigma}{\sqrt{n}} \,.$$

If one substitutes the still unknown quantity μ with the nominal value C and the similarly
unknown σ of the population with the corresponding estimate σ̂ (it is determined within the
scope of a process capability analysis using at least 20 samples of 5 parts each), then one
gets

$$C - 2.58\,\frac{\hat{\sigma}}{\sqrt{n}} \;\le\; \bar{x} \;\le\; C + 2.58\,\frac{\hat{\sigma}}{\sqrt{n}} \,,$$

a central relationship in connection with all location control charts.

The quantities

$$UCL = C + 2.58\,\frac{\hat{\sigma}}{\sqrt{n}} \qquad \text{and} \qquad LCL = C - 2.58\,\frac{\hat{\sigma}}{\sqrt{n}}$$

are called the upper and lower control limit for the mean value x̄. Since they only depend
upon the process variation and are independent of the tolerance of the characteristic, one
speaks of natural or process-related control limits.
They limit the range in which 99% of all mean values of n individual values each are
lying. These control limits are drawn as horizontal continuous lines in the control chart
(see Figs. 7.1.1.1 and 7.1.1.2). All considerations to date have the prerequisite that the
process average is stable. If an observed mean value x̄_i exceeds the upper or falls below
the lower control limit, then it is concluded that the prerequisite of a stable process average
no longer exists; the process has thus significantly shifted and a corrective action by the
machine operator must take place.

7.1.2 Original Data Chart (x Chart)

As practical experience shows, it is in some cases desirable to assess the process location
directly using the individual values of the samples. This is done in the original data chart.

The derivation of its natural control limits takes place by considering the probability
that all individual values of a sample of size n lie within these limits.

In the section about the average chart it was shown that under certain circumstances
(normally distributed characteristic, stable process) an individual characteristic value lies
with a probability of 99% in the interval between

C - 2.58 σ̂ and C + 2.58 σ̂.

Because, for example, the 5 values of a sample are random results independent of one
another, the probability that all 5 values lie in this interval is equal to the product of the
individual probabilities, thus

$$P_{total} = 0.99 \cdot 0.99 \cdot 0.99 \cdot 0.99 \cdot 0.99 = 0.99^5 \approx 0.95 \,.$$

To calculate the natural control limits for the original data chart with n = 5, one must
demand the total probability P_total = 0.99 and transform the equation

$$P_{total} = \left( P_{individual} \right)^5 = 0.99 \quad \text{into} \quad P_{individual} = \sqrt[5]{0.99} \approx 0.998 \,.$$

The control limits are therefore derived from the table of the standard normal distribution
as (two-sided) 99.8 % random variation limits. For the example n = 5, this is the value
u = 3.09, and the control limits are thus

$$UCL = C + 3.09\,\hat{\sigma}\,, \qquad LCL = C - 3.09\,\hat{\sigma} \qquad (n = 5)\,.$$

Generalized to samples of size n, the calculation of the natural control limits for the
original data chart takes place according to

$$UCL = C + u_{\sqrt[n]{0.99}} \cdot \hat{\sigma}\,, \qquad LCL = C - u_{\sqrt[n]{0.99}} \cdot \hat{\sigma}\,.$$

Fig. 7.1.2: Schematic representation of the function of an original data chart. For
illustration purposes, the largest and smallest individual value of each sample
are connected by a line (same data as in Fig. 7.1.1.1).
Horizontal scale: No. of the sample.

7.2 Variation Control Charts

Like the process average, the process variation, which is generally identified with the
standard deviation σ of a characteristic, is a central characteristic quantity for assessing the
production quality. Thereby, the recognition of variation increases is just as interesting as
that of variation reductions. The latter case should be seen as an opportunity for finding
out the cause of the short-term improvement and permanently retaining the favorable
process state.

Variation control charts are suitable aids for detecting such changes. For this purpose, the
samples of size n considered in connection with the average chart are used.
Besides the information about the current process location, each group of sample
individual values also contains a piece of information about the momentary process
variation.

As a characteristic quantity for this information, one can consider the standard deviation s
or the range R of each sample.

7.2.1 s Chart

As already mentioned, it is convenient not to regard the s chart as a separate chart, but
rather to represent it as a second diagram on the form sheet of the average chart. One calls
this combination an x̄-s chart.

In parallel to the mean of each i-th sample of size 5 (or generally of size n), its standard
deviation s_i is recorded here.

The magnitude of the current value of this standard deviation s_i naturally depends on the
mean process variation estimated by σ̂ and can, in a particular case, by chance be somewhat
greater or smaller than a long-term mean value s̄. In order to be able to decide when
such fluctuations of s are no longer to be interpreted as a random result, but rather as a
hint of an actually prevailing change of the process variation, one needs limiting values for
s which, for instance, delimit the 99% random variation range of s. If the current
value of s is greater than the upper or smaller than the lower limiting value, then this is
interpreted as a hint of a significant change of the process variation. Such limit values may
be determined when one considers the fact that the quantity $f \cdot \frac{s^2}{\sigma^2}$ is subject to a χ²-
distribution (spoken: chi-squared).

From

$$f \cdot \frac{s^2}{\sigma^2} = \chi^2 \quad \text{with} \quad s_{ob} = \sigma \sqrt{\frac{\chi^2_{f;\,1-\alpha/2}}{f}} \quad \text{and} \quad s_{un} = \sigma \sqrt{\frac{\chi^2_{f;\,\alpha/2}}{f}}$$

one attains the control limits for s being sought.

The χ²-distribution, which depends on the number f of degrees of freedom (f corresponds
to the sample size reduced by 1, thus n - 1), is tabled like the standard normal distribution.
However, the χ²-distribution is not symmetrical.

Therefore, different factors result for calculating the upper and lower control limits. For
simplicity, the square-root terms in the above expressions are designated B_E,ob and
B_E,un and are given directly for the corresponding sample size n in the table below. The
equations for the control limits of the s chart are thus:

$$UCL = B_{E,ob} \cdot \hat{\sigma}\,, \qquad LCL = B_{E,un} \cdot \hat{\sigma}\,.$$

n        2      3      4      5      6
B_E,ob   2.807  2.302  2.069  1.927  1.830
B_E,un   0.006  0.071  0.155  0.227  0.287

Table 7.2.1: Factors for calculating the control limits of the s chart
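
The factors of Table 7.2.1 can be reproduced from the quantiles of the χ²-distribution. The following sketch assumes that SciPy is available; chi2.ppf is its quantile function:

```python
import math
from scipy.stats import chi2

alpha = 0.01                          # 99 % random variation range
for n in (2, 3, 4, 5, 6):
    f = n - 1                         # degrees of freedom
    b_ob = math.sqrt(chi2.ppf(1 - alpha / 2, f) / f)   # upper factor
    b_un = math.sqrt(chi2.ppf(alpha / 2, f) / f)       # lower factor
    print(f"n = {n}: B_E,ob = {b_ob:.3f}, B_E,un = {b_un:.3f}")
```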

Fig. 7.2.1: Schematic representation of an s chart (same data as in Figs. 7.1.1.1 and
7.1.2). The standard deviations obey a skewed distribution. Since the control
limits for s are calculated via those for s², the partial areas above UCL and
below LCL are depicted too large in the representation.
Horizontal scale: No. of the sample.

7.2.2 R Chart

In the case of the R chart, one uses the range R, i.e. the difference between the largest
and the smallest individual value of a group of 5 (or a sample of size n), as a measure of
the momentary process variation.

The basis for calculating the variation limits of R is formed by the distribution of the
relative range w_n.
This distribution may, for instance, be simulated with the help of a computer by
repeatedly drawing a random sample of size n from a population of standard normally
distributed values and determining its range R. The 99% random variation range of this
quantity is then given by the upper limit w_{n; 0.995} and the lower limit w_{n; 0.005}.
Because, when applying the range chart, the process standard deviation σ is as a matter of
convenience estimated via the mean range R̄ (see Section 3.9) as

$$\hat{\sigma} = \frac{\bar{R}}{d_n} \,,$$

it is advantageous to combine the factors w_{n; 0.005} and w_{n; 0.995} with 1/d_n and to
denote the resulting factors by D_E,un and D_E,ob. These quantities are tabled and enable
the calculation of the control limits for the range chart according to

$$UCL = D_{E,ob} \cdot \bar{R}\,, \qquad LCL = D_{E,un} \cdot \bar{R}\,.$$

n        2      3      4      5      6
D_E,ob   3.518  2.614  2.280  2.100  1.986
D_E,un   0.008  0.080  0.166  0.239  0.296

Table 7.2.2: Factors for calculating the control limits of the R chart
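
A sketch of the simulation described above, assuming NumPy is available; with a finite number of repetitions the quantiles are of course only approximate:

```python
import numpy as np

rng = np.random.default_rng(0)
n, repetitions = 5, 1_000_000

# repeatedly draw samples of size n from the standard normal distribution
samples = rng.standard_normal((repetitions, n))
ranges = samples.max(axis=1) - samples.min(axis=1)   # range of each sample

w_lower = np.quantile(ranges, 0.005)   # estimate of w_(5; 0.005)
w_upper = np.quantile(ranges, 0.995)   # estimate of w_(5; 0.995)
print(f"w_(5;0.005) ~ {w_lower:.3f}, w_(5;0.995) ~ {w_upper:.3f}")
```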

8. Assessing Frequency Distributions
in Connection with a Tolerance
In the following diagrams, several examples of possible frequency distributions which can
occur in an investigation of a production process are schematically illustrated and
explained in a shorthand fashion.

A: Variation range considerably smaller than the tolerance. Mean agrees well with the
midpoint of the tolerance zone.

B: Variation range considerably smaller than the tolerance. Mean, however, lies outside
the tolerance center. The development of scrap is to be reckoned with.
Center the process!

C: Variation range considerably smaller than the tolerance. Mean of the distribution,
however, unfavourably far away from the tolerance center.
Center the process!

D: Spread is about equal to the tolerance. Mean of the distribution agrees well with the
tolerance center. A systematic change of the average can result in scrap.
Reduce the variability!

E: Spread is about equal to the tolerance. However, the mean of the distribution does not
coincide with the tolerance center. Scrap at the upper specification limit.
Center the process, reduce the variability!

F: Mean value of the distribution corresponds well with the tolerance midpoint. Spread is
too large. Nonconformities at both tolerance limits.
Reduce the variability!

G: Superposition of two distributions, possibly caused by a systematic process change
(e.g. tool, material). After eliminating the cause, the tolerance can easily be maintained,
because the spread of both distributions is comparably small.

H: Similar to Fig. G; however, the means of both distributions lie so far away from one
another that scrap develops at both specification limits.

I: The mean of the distribution deviates from the tolerance center towards the lower
specification limit. The lot was obviously 100% sorted. If the process is modified so that
mean and tolerance center coincide, the 100% sorting can become unnecessary.

J: The main distribution has a small spread. The average corresponds well with the
tolerance midpoint. A small part lies beyond the upper tolerance limit. This could be
scrap which came about during machine set-up and was not sorted out.

9. Accuracy of Estimating Mean and Standard Deviation
Statements about a population derived from samples are always connected with a
statistical uncertainty. In general, this uncertainty is the greater, the smaller the data basis
on which the statement rests, i.e. the size n of the sample.

We assume in the following that the population involves a normal distribution whose
mean value μ and standard deviation σ are unknown.

The (empirical) quantities x̄ and s calculated from a sample are estimates of the unknown
quantities μ and σ. This is expressed in the notation

μ̂ = x̄ (x̄ is an estimate for μ)

σ̂ = s (s is an estimate for σ).

One can give an interval about x̄ or about s in which the unknown quantity μ or σ
respectively lies with a great probability. The width of this so-called confidence interval is
on the one hand dependent upon the sample size n, and on the other hand dependent upon
a prescribed confidence level P_A. The quantity 1 - P_A is the corresponding error
probability.

In Figures 9.1 and 9.2 the confidence intervals of μ and σ are represented depending
upon the sample size. The curves are valid for a confidence level of 95%. That means that
in 95 of 100 cases in which μ is estimated through x̄ or σ through s, μ or σ lie
within the confidence limits derived from the curves.

a) Confidence interval for σ:

$$\frac{s_R}{D_{ob}} \;\le\; \sigma \;\le\; \frac{s_R}{D_{un}}$$

s_R is the standard deviation of the sample determined with the range method (see Section
3.9). The factors 1/D_un and 1/D_ob are found by looking for the sample size n on the
abscissa (which is divided logarithmically), going vertically upwards to both curves,
and from there going horizontally towards the left to the ordinate. There, the
corresponding value for 1/D_un or 1/D_ob is read (Fig. 9.1).

b) Confidence interval for μ:

$$\bar{x} - \frac{t}{\sqrt{n}}\, s_R \;\le\; \mu \;\le\; \bar{x} + \frac{t}{\sqrt{n}}\, s_R$$

One finds the factor t/√n analogously to the procedure described in a) (Fig. 9.2). Since
the confidence interval for μ is symmetrical, it is sufficient to read t/√n on the upper
curve.

The greater the sample is, the smaller the factors 1/D_un, 1/D_ob and t/√n, and thus the
confidence interval.

The shapes of the curves allow the deduction that with sample sizes above about n = 50
the confidence interval becomes only negligibly smaller (despite increasing experimental
effort). On the other hand, statements concerning the mean value and standard deviation in
the case of sample sizes n < 25 are to be interpreted with caution, because the
corresponding confidence intervals become very large.
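
For comparison with the graphical procedure, the following sketch computes 95% confidence intervals with the classical formulas based on x̄ and s (used here in place of the range-method estimate s_R and the factors read from the curves). It assumes SciPy is available, and the sample statistics are hypothetical:

```python
import math
from scipy.stats import chi2, t

x_bar, s, n = 346.0, 7.0, 25      # assumed sample statistics
f = n - 1                         # degrees of freedom
P_A = 0.95                        # confidence level
a = 1 - P_A                       # error probability

# confidence interval for mu
half = t.ppf(1 - a / 2, f) * s / math.sqrt(n)
print(f"mu:    {x_bar - half:.2f} ... {x_bar + half:.2f}")

# confidence interval for sigma
lo = s * math.sqrt(f / chi2.ppf(1 - a / 2, f))
hi = s * math.sqrt(f / chi2.ppf(a / 2, f))
print(f"sigma: {lo:.2f} ... {hi:.2f}")
```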

Fig. 9.1: Diagram for determining the confidence interval for σ

Fig. 9.2: Diagram for determining the confidence interval for μ

10. Standard Normal Distribution    Φ(-u) = 1 - Φ(u)    D(u) = Φ(u) - Φ(-u)

u      Φ(-u)      Φ(u)       D(u)       u      Φ(-u)      Φ(u)       D(u)
0.01 0.496011 0.503989 0.007979 0.51 0.305026 0.694974 0.389949
0.02 0.492022 0.507978 0.015957 0.52 0.301532 0.698468 0.396936
0.03 0.488033 0.511967 0.023933 0.53 0.298056 0.701944 0.403888
0.04 0.484047 0.515953 0.031907 0.54 0.294598 0.705402 0.410803
0.05 0.480061 0.519939 0.039878 0.55 0.291160 0.708840 0.417681
0.06 0.476078 0.523922 0.047845 0.56 0.287740 0.712260 0.424521
0.07 0.472097 0.527903 0.055806 0.57 0.284339 0.715661 0.431322
0.08 0.468119 0.531881 0.063763 0.58 0.280957 0.719043 0.438085
0.09 0.464144 0.535856 0.071713 0.59 0.277595 0.722405 0.444809
0.10 0.460172 0.539828 0.079656 0.60 0.274253 0.725747 0.451494
0.11 0.456205 0.543795 0.087591 0.61 0.270931 0.729069 0.458138
0.12 0.452242 0.547758 0.095517 0.62 0.267629 0.732371 0.464742
0.13 0.448283 0.551717 0.103434 0.63 0.264347 0.735653 0.471306
0.14 0.444330 0.555670 0.111340 0.64 0.261086 0.738914 0.477828
0.15 0.440382 0.559618 0.119235 0.65 0.257846 0.742154 0.484308
0.16 0.436441 0.563559 0.127119 0.66 0.254627 0.745373 0.490746
0.17 0.432505 0.567495 0.134990 0.67 0.251429 0.748571 0.497142
0.18 0.428576 0.571424 0.142847 0.68 0.248252 0.751748 0.503496
0.19 0.424655 0.575345 0.150691 0.69 0.245097 0.754903 0.509806
0.20 0.420740 0.579260 0.158519 0.70 0.241964 0.758036 0.516073
0.21 0.416834 0.583166 0.166332 0.71 0.238852 0.761148 0.522296
0.22 0.412936 0.587064 0.174129 0.72 0.235762 0.764238 0.528475
0.23 0.409046 0.590954 0.181908 0.73 0.232695 0.767305 0.534610
0.24 0.405165 0.594835 0.189670 0.74 0.229650 0.770350 0.540700
0.25 0.401294 0.598706 0.197413 0.75 0.226627 0.773373 0.546745
0.26 0.397432 0.602568 0.205136 0.76 0.223627 0.776373 0.552746
0.27 0.393580 0.606420 0.212840 0.77 0.220650 0.779350 0.558700
0.28 0.389739 0.610261 0.220522 0.78 0.217695 0.782305 0.564609
0.29 0.385908 0.614092 0.228184 0.79 0.214764 0.785236 0.570472
0.30 0.382089 0.617911 0.235823 0.80 0.211855 0.788145 0.576289
0.31 0.378281 0.621719 0.243439 0.81 0.208970 0.791030 0.582060
0.32 0.374484 0.625516 0.251032 0.82 0.206108 0.793892 0.587784
0.33 0.370700 0.629300 0.258600 0.83 0.203269 0.796731 0.593461
0.34 0.366928 0.633072 0.266143 0.84 0.200454 0.799546 0.599092
0.35 0.363169 0.636831 0.273661 0.85 0.197662 0.802338 0.604675
0.36 0.359424 0.640576 0.281153 0.86 0.194894 0.805106 0.610211
0.37 0.355691 0.644309 0.288617 0.87 0.192150 0.807850 0.615700
0.38 0.351973 0.648027 0.296054 0.88 0.189430 0.810570 0.621141
0.39 0.348268 0.651732 0.303463 0.89 0.186733 0.813267 0.626534
0.40 0.344578 0.655422 0.310843 0.90 0.184060 0.815940 0.631880
0.41 0.340903 0.659097 0.318194 0.91 0.181411 0.818589 0.637178
0.42 0.337243 0.662757 0.325514 0.92 0.178786 0.821214 0.642427
0.43 0.333598 0.666402 0.332804 0.93 0.176186 0.823814 0.647629
0.44 0.329969 0.670031 0.340063 0.94 0.173609 0.826391 0.652782
0.45 0.326355 0.673645 0.347290 0.95 0.171056 0.828944 0.657888
0.46 0.322758 0.677242 0.354484 0.96 0.168528 0.831472 0.662945
0.47 0.319178 0.680822 0.361645 0.97 0.166023 0.833977 0.667954
0.48 0.315614 0.684386 0.368773 0.98 0.163543 0.836457 0.672914
0.49 0.312067 0.687933 0.375866 0.99 0.161087 0.838913 0.677826
0.50 0.308538 0.691462 0.382925 1.00 0.158655 0.841345 0.682689

u      Φ(-u)      Φ(u)       D(u)       u      Φ(-u)      Φ(u)       D(u)
1.01 0.156248 0.843752 0.687505 1.51 0.065522 0.934478 0.868957
1.02 0.153864 0.846136 0.692272 1.52 0.064256 0.935744 0.871489
1.03 0.151505 0.848495 0.696990 1.53 0.063008 0.936992 0.873983
1.04 0.149170 0.850830 0.701660 1.54 0.061780 0.938220 0.876440
1.05 0.146859 0.853141 0.706282 1.55 0.060571 0.939429 0.878858
1.06 0.144572 0.855428 0.710855 1.56 0.059380 0.940620 0.881240
1.07 0.142310 0.857690 0.715381 1.57 0.058208 0.941792 0.883585
1.08 0.140071 0.859929 0.719858 1.58 0.057053 0.942947 0.885893
1.09 0.137857 0.862143 0.724287 1.59 0.055917 0.944083 0.888165
1.10 0.135666 0.864334 0.728668 1.60 0.054799 0.945201 0.890401
1.11 0.133500 0.866500 0.733001 1.61 0.053699 0.946301 0.892602
1.12 0.131357 0.868643 0.737286 1.62 0.052616 0.947384 0.894768
1.13 0.129238 0.870762 0.741524 1.63 0.051551 0.948449 0.896899
1.14 0.127143 0.872857 0.745714 1.64 0.050503 0.949497 0.898995
1.15 0.125072 0.874928 0.749856 1.65 0.049471 0.950529 0.901057
1.16 0.123024 0.876976 0.753951 1.66 0.048457 0.951543 0.903086
1.17 0.121001 0.878999 0.757999 1.67 0.047460 0.952540 0.905081
1.18 0.119000 0.881000 0.762000 1.68 0.046479 0.953521 0.907043
1.19 0.117023 0.882977 0.765953 1.69 0.045514 0.954486 0.908972
1.20 0.115070 0.884930 0.769861 1.70 0.044565 0.955435 0.910869
1.21 0.113140 0.886860 0.773721 1.71 0.043633 0.956367 0.912734
1.22 0.111233 0.888767 0.777535 1.72 0.042716 0.957284 0.914568
1.23 0.109349 0.890651 0.781303 1.73 0.041815 0.958185 0.916370
1.24 0.107488 0.892512 0.785024 1.74 0.040929 0.959071 0.918141
1.25 0.105650 0.894350 0.788700 1.75 0.040059 0.959941 0.919882
1.26 0.103835 0.896165 0.792331 1.76 0.039204 0.960796 0.921592
1.27 0.102042 0.897958 0.795915 1.77 0.038364 0.961636 0.923273
1.28 0.100273 0.899727 0.799455 1.78 0.037538 0.962462 0.924924
1.29 0.098525 0.901475 0.802949 1.79 0.036727 0.963273 0.926546
1.30 0.096801 0.903199 0.806399 1.80 0.035930 0.964070 0.928139
1.31 0.095098 0.904902 0.809804 1.81 0.035148 0.964852 0.929704
1.32 0.093418 0.906582 0.813165 1.82 0.034379 0.965621 0.931241
1.33 0.091759 0.908241 0.816482 1.83 0.033625 0.966375 0.932750
1.34 0.090123 0.909877 0.819755 1.84 0.032884 0.967116 0.934232
1.35 0.088508 0.911492 0.822984 1.85 0.032157 0.967843 0.935687
1.36 0.086915 0.913085 0.826170 1.86 0.031443 0.968557 0.937115
1.37 0.085344 0.914656 0.829313 1.87 0.030742 0.969258 0.938516
1.38 0.083793 0.916207 0.832413 1.88 0.030054 0.969946 0.939892
1.39 0.082264 0.917736 0.835471 1.89 0.029379 0.970621 0.941242
1.40 0.080757 0.919243 0.838487 1.90 0.028716 0.971284 0.942567
1.41 0.079270 0.920730 0.841460 1.91 0.028067 0.971933 0.943867
1.42 0.077804 0.922196 0.844392 1.92 0.027429 0.972571 0.945142
1.43 0.076359 0.923641 0.847283 1.93 0.026803 0.973197 0.946393
1.44 0.074934 0.925066 0.850133 1.94 0.026190 0.973810 0.947620
1.45 0.073529 0.926471 0.852941 1.95 0.025588 0.974412 0.948824
1.46 0.072145 0.927855 0.855710 1.96 0.024998 0.975002 0.950004
1.47 0.070781 0.929219 0.858438 1.97 0.024419 0.975581 0.951162
1.48 0.069437 0.930563 0.861127 1.98 0.023852 0.976148 0.952297
1.49 0.068112 0.931888 0.863776 1.99 0.023295 0.976705 0.953409
1.50 0.066807 0.933193 0.866386 2.00 0.022750 0.977250 0.954500

u      Φ(-u)      Φ(u)       D(u)       u      Φ(-u)      Φ(u)       D(u)
2.01 0.022216 0.977784 0.955569 2.51 0.006037 0.993963 0.987927
2.02 0.021692 0.978308 0.956617 2.52 0.005868 0.994132 0.988264
2.03 0.021178 0.978822 0.957644 2.53 0.005703 0.994297 0.988594
2.04 0.020675 0.979325 0.958650 2.54 0.005543 0.994457 0.988915
2.05 0.020182 0.979818 0.959636 2.55 0.005386 0.994614 0.989228
2.06 0.019699 0.980301 0.960602 2.56 0.005234 0.994766 0.989533
2.07 0.019226 0.980774 0.961548 2.57 0.005085 0.994915 0.989830
2.08 0.018763 0.981237 0.962475 2.58 0.004940 0.995060 0.990120
2.09 0.018309 0.981691 0.963382 2.59 0.004799 0.995201 0.990402
2.10 0.017864 0.982136 0.964271 2.60 0.004661 0.995339 0.990678
2.11 0.017429 0.982571 0.965142 2.61 0.004527 0.995473 0.990946
2.12 0.017003 0.982997 0.965994 2.62 0.004397 0.995603 0.991207
2.13 0.016586 0.983414 0.966829 2.63 0.004269 0.995731 0.991461
2.14 0.016177 0.983823 0.967645 2.64 0.004145 0.995855 0.991709
2.15 0.015778 0.984222 0.968445 2.65 0.004025 0.995975 0.991951
2.16 0.015386 0.984614 0.969227 2.66 0.003907 0.996093 0.992186
2.17 0.015003 0.984997 0.969993 2.67 0.003793 0.996207 0.992415
2.18 0.014629 0.985371 0.970743 2.68 0.003681 0.996319 0.992638
2.19 0.014262 0.985738 0.971476 2.69 0.003573 0.996427 0.992855
2.20 0.013903 0.986097 0.972193 2.70 0.003467 0.996533 0.993066
2.21 0.013553 0.986447 0.972895 2.71 0.003364 0.996636 0.993272
2.22 0.013209 0.986791 0.973581 2.72 0.003264 0.996736 0.993472
2.23 0.012874 0.987126 0.974253 2.73 0.003167 0.996833 0.993666
2.24 0.012545 0.987455 0.974909 2.74 0.003072 0.996928 0.993856
2.25 0.012224 0.987776 0.975551 2.75 0.002980 0.997020 0.994040
2.26 0.011911 0.988089 0.976179 2.76 0.002890 0.997110 0.994220
2.27 0.011604 0.988396 0.976792 2.77 0.002803 0.997197 0.994394
2.28 0.011304 0.988696 0.977392 2.78 0.002718 0.997282 0.994564
2.29 0.011011 0.988989 0.977979 2.79 0.002635 0.997365 0.994729
2.30 0.010724 0.989276 0.978552 2.80 0.002555 0.997445 0.994890
2.31 0.010444 0.989556 0.979112 2.81 0.002477 0.997523 0.995046
2.32 0.010170 0.989830 0.979659 2.82 0.002401 0.997599 0.995198
2.33 0.009903 0.990097 0.980194 2.83 0.002327 0.997673 0.995345
2.34 0.009642 0.990358 0.980716 2.84 0.002256 0.997744 0.995489
2.35 0.009387 0.990613 0.981227 2.85 0.002186 0.997814 0.995628
2.36 0.009137 0.990863 0.981725 2.86 0.002118 0.997882 0.995763
2.37 0.008894 0.991106 0.982212 2.87 0.002052 0.997948 0.995895
2.38 0.008656 0.991344 0.982687 2.88 0.001988 0.998012 0.996023
2.39 0.008424 0.991576 0.983152 2.89 0.001926 0.998074 0.996147
2.40 0.008198 0.991802 0.983605 2.90 0.001866 0.998134 0.996268
2.41 0.007976 0.992024 0.984047 2.91 0.001807 0.998193 0.996386
2.42 0.007760 0.992240 0.984479 2.92 0.001750 0.998250 0.996500
2.43 0.007549 0.992451 0.984901 2.93 0.001695 0.998305 0.996610
2.44 0.007344 0.992656 0.985313 2.94 0.001641 0.998359 0.996718
2.45 0.007143 0.992857 0.985714 2.95 0.001589 0.998411 0.996822
2.46 0.006947 0.993053 0.986106 2.96 0.001538 0.998462 0.996923
2.47 0.006756 0.993244 0.986489 2.97 0.001489 0.998511 0.997022
2.48 0.006569 0.993431 0.986862 2.98 0.001441 0.998559 0.997117
2.49 0.006387 0.993613 0.987226 2.99 0.001395 0.998605 0.997210
2.50 0.006210 0.993790 0.987581 3.00 0.001350 0.998650 0.997300

u      Φ(-u)      Φ(u)       D(u)       u      Φ(-u)      Φ(u)       D(u)
3.01 0.001306 0.998694 0.997387 3.51 0.000224 0.999776 0.999552
3.02 0.001264 0.998736 0.997472 3.52 0.000216 0.999784 0.999568
3.03 0.001223 0.998777 0.997554 3.53 0.000208 0.999792 0.999584
3.04 0.001183 0.998817 0.997634 3.54 0.000200 0.999800 0.999600
3.05 0.001144 0.998856 0.997711 3.55 0.000193 0.999807 0.999615
3.06 0.001107 0.998893 0.997786 3.56 0.000185 0.999815 0.999629
3.07 0.001070 0.998930 0.997859 3.57 0.000179 0.999821 0.999643
3.08 0.001035 0.998965 0.997930 3.58 0.000172 0.999828 0.999656
3.09 0.001001 0.998999 0.997998 3.59 0.000165 0.999835 0.999669
3.10 0.000968 0.999032 0.998065 3.60 0.000159 0.999841 0.999682
3.11 0.000936 0.999064 0.998129 3.61 0.000153 0.999847 0.999694
3.12 0.000904 0.999096 0.998191 3.62 0.000147 0.999853 0.999705
3.13 0.000874 0.999126 0.998252 3.63 0.000142 0.999858 0.999717
3.14 0.000845 0.999155 0.998310 3.64 0.000136 0.999864 0.999727
3.15 0.000816 0.999184 0.998367 3.65 0.000131 0.999869 0.999738
3.16 0.000789 0.999211 0.998422 3.66 0.000126 0.999874 0.999748
3.17 0.000762 0.999238 0.998475 3.67 0.000121 0.999879 0.999757
3.18 0.000736 0.999264 0.998527 3.68 0.000117 0.999883 0.999767
3.19 0.000711 0.999289 0.998577 3.69 0.000112 0.999888 0.999776
3.20 0.000687 0.999313 0.998626 3.70 0.000108 0.999892 0.999784
3.21 0.000664 0.999336 0.998673 3.71 0.000104 0.999896 0.999793
3.22 0.000641 0.999359 0.998718 3.72 0.000100 0.999900 0.999801
3.23 0.000619 0.999381 0.998762 3.73 0.000096 0.999904 0.999808
3.24 0.000598 0.999402 0.998805 3.74 0.000092 0.999908 0.999816
3.25 0.000577 0.999423 0.998846 3.75 0.000088 0.999912 0.999823
3.26 0.000557 0.999443 0.998886 3.76 0.000085 0.999915 0.999830
3.27 0.000538 0.999462 0.998924 3.77 0.000082 0.999918 0.999837
3.28 0.000519 0.999481 0.998962 3.78 0.000078 0.999922 0.999843
3.29 0.000501 0.999499 0.998998 3.79 0.000075 0.999925 0.999849
3.30 0.000483 0.999517 0.999033 3.80 0.000072 0.999928 0.999855
3.31 0.000467 0.999533 0.999067 3.81 0.000070 0.999930 0.999861
3.32 0.000450 0.999550 0.999100 3.82 0.000067 0.999933 0.999867
3.33 0.000434 0.999566 0.999131 3.83 0.000064 0.999936 0.999872
3.34 0.000419 0.999581 0.999162 3.84 0.000062 0.999938 0.999877
3.35 0.000404 0.999596 0.999192 3.85 0.000059 0.999941 0.999882
3.36 0.000390 0.999610 0.999220 3.86 0.000057 0.999943 0.999887
3.37 0.000376 0.999624 0.999248 3.87 0.000054 0.999946 0.999891
3.38 0.000362 0.999638 0.999275 3.88 0.000052 0.999948 0.999896
3.39 0.000350 0.999650 0.999301 3.89 0.000050 0.999950 0.999900
3.40 0.000337 0.999663 0.999326 3.90 0.000048 0.999952 0.999904
3.41 0.000325 0.999675 0.999350 3.91 0.000046 0.999954 0.999908
3.42 0.000313 0.999687 0.999374 3.92 0.000044 0.999956 0.999911
3.43 0.000302 0.999698 0.999396 3.93 0.000042 0.999958 0.999915
3.44 0.000291 0.999709 0.999418 3.94 0.000041 0.999959 0.999918
3.45 0.000280 0.999720 0.999439 3.95 0.000039 0.999961 0.999922
3.46 0.000270 0.999730 0.999460 3.96 0.000037 0.999963 0.999925
3.47 0.000260 0.999740 0.999479 3.97 0.000036 0.999964 0.999928
3.48 0.000251 0.999749 0.999498 3.98 0.000034 0.999966 0.999931
3.49 0.000242 0.999758 0.999517 3.99 0.000033 0.999967 0.999934
3.50 0.000233 0.999767 0.999535 4.00 0.000032 0.999968 0.999937
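
The tabled values can be reproduced with the error function, since Φ(u) = (1 + erf(u/√2))/2; a short sketch using Python's standard library:

```python
import math

def phi(u: float) -> float:
    """Distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

for u in (0.5, 1.0, 2.0, 3.0):
    print(f"u = {u}: Phi(-u) = {phi(-u):.6f}, "
          f"Phi(u) = {phi(u):.6f}, D(u) = {phi(u) - phi(-u):.6f}")
```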

11. Bibliography

[1] Brochure: Elementary Quality Assurance Tools (ZQF)

[2] M. Sadowy: Industrielle Statistik, Vogel-Verlag, Würzburg, 1970

[3] Erwin Kreyszig: Statistische Methoden und ihre Anwendungen,
Vandenhoeck u. Ruprecht, Göttingen, 1988

[4] H. Weber: Einführung in die Wahrscheinlichkeitsrechnung und Statistik für
Ingenieure, Teubner, Stuttgart, 1988

[5] Karl Bosch: Elementare Einführung in die Wahrscheinlichkeitsrechnung,
Vieweg, Braunschweig, 1989

[6] Karl Bosch: Elementare Einführung in die angewandte Statistik,
Vieweg, Braunschweig, 1989

[7] F. Barth, H. Berghold, R. Haller: Stochastik (Grundkurs, Leistungskurs),
Ehrenwirth, München, 1992

[8] G. Wagner und R. Lang: Statistische Auswertung von Meß- und Prüfergebnissen,
Hrsg.: Deutsche Gesellschaft für Qualität e. V., DGQ 14, 1976

[9] Graf, Henning, Stange, Wilrich: Formeln und Tabellen der angewandten
mathematischen Statistik, Springer-Verlag, Berlin, 1987

[10] Lothar Sachs: Angewandte Statistik, Springer-Verlag, Berlin, 1992

[11] Hartung: Statistik, Oldenbourg, München, 1989

12. Symbols and Terms


∫_{-∞}^{+∞} = Integral from minus infinity to plus infinity

√ = Radical sign; root sign

Σ = Sum symbol; summation sign

Π = Product symbol

≤ = Smaller than or equal to

≥ = Greater than or equal to

≠ = Not equal

|x| = Absolute value of x (positive value of x)

B_lower, B_upper = Factors for calculating random variation limits of s

C = Midpoint of the tolerance zone or target value

D_un, D_ob = Factors for the calculation of the confidence interval for σ

e = Base of the natural logarithm (≈ 2.71828)

f = Degrees of freedom

f(x) = Probability density function

G_j = Cumulative frequency

h_j = Relative frequency

H_j = Cumulative relative frequency

i, j = Count indices

k = Number of classes

ln = Natural (base e) logarithm

m = Number of groups

n = Sample size

n_j = Absolute frequency in the j-th class

P = Probability

P_A = Probability (confidence level)

R = Range

s = Standard deviation of a sample

s_R = Standard deviation determined through the range method

t = Factor for calculating the confidence interval for μ when the
standard deviation of the population is unknown

u = Standardised parameter of the normal distribution N(μ = 0; σ² = 1)

v = Coefficient of variation

w = Class width

x = Continuous characteristic values

x_i, y_i = Values of a measurement series

x_(1), ..., x_(n) = Values of a measurement series arranged in order of magnitude

x_g = Geometric mean of a sample

x_max = Largest value of a sample

x_min = Smallest value of a sample

x̃ = Median of a sample

x̄ = Arithmetic mean of a sample

x̄_D = Average of the difference distribution from samples

α = Probability for type I error

β = Probability for type II error

γ = Skewness of a population

Δ = Difference in the determination of the measurement accuracy

ε = Shape parameter of the lognormal distribution

ε_e = Excess of a population

μ = Average of a population

μ_D = Average of the difference distribution

μ_g = Geometric mean of a population

σ = Standard deviation of a population

σ² = Variance of a population

π = Number Pi (≈ 3.1416)

Index

Average chart 59
Average deviation 20
Bar chart 34
Bell-shaped curve 37; 38; 39
Characteristic 7
Class limit 32; 33; 34
Class midpoint 33
Class width 33
Coefficient of variation 22
Confidence interval for the mean 69
Confidence interval for the standard deviation 69
Confidence level 69
Cumulative frequency, absolute 33
Cumulative frequency curve 35; 39
Cumulative frequency polygon 35
Cumulative frequency, relative 33; 34
Density function 36; 40
Distribution function 36; 39; 40
Dot diagram 29
Fraction nonconforming 42; 46
Frequency, absolute 33; 34
Frequency, relative 33; 34
Frequency diagram 30; 45
Gauss 36
Grouping 30
Histogram 30
Influence quantities ("5M") 58
Integration 39
Law of large numbers 12
Location control charts 58
Logarithm 53; 56
Lognormal distribution 51; 56
Mean 37
Mean, arithmetic 15
Mean, geometric 17; 53; 56
Mean, graphic determination 46
Mean, harmonic 19
Mean, moving 16
Median 13; 14; 53
Mixture of distributions 57
Normal distribution 36; 37; 58
Ordered list 14
Original data chart (x chart) 61
Original value diagram 28
Pocket calculator 25
Population 7
Probability 11; 36
Probability paper, lognormal 52; 53
Probability paper, normal 44; 45
Quality control chart 58
R chart 65
Random experiment 8; 11
Random variable 8
Random variation range of the mean 60
Range 23; 65
Range method 23
Realisation of a random variable 8
s chart 63
Sample 8
Shape parameter 53
Shewhart 58
Skewed distribution 51
Standard deviation 20
Standard deviation, graphic determination 46
Standard deviation of the normal distribution 37
Standard normal distribution 41
Statistics 5
Statistics, descriptive 5
Statistics, inductive 5
Tally chart 29
Variance 21
Variation of the individual values 60
Variation of the means 60
Variation control charts 63

Robert Bosch GmbH
C/QMM
Postfach 30 02 20
D-70442 Stuttgart
Germany
Phone +49 711 811-4 47 88
Fax +49 711 811-2 31 26
www.bosch.com
