
Sources of Error in Brake Wheel Sizes

What is the cause of this size discrepancy?


Project by Rebecca Winzer
University of Idaho
STAT 301
5 December 2012






Winzer 1
The purpose of this study is to determine the source of the size discrepancies in brake wheels.
These discrepancies can be attributed to three different operators as well as three different
machines, so this study is multi-variable and rather complex. I expect to find that the error can
be attributed to at least one of the operators or one of the machines, but the errors could
accumulate from more than one source. This particular data set interested me because of its
complexity and the wide range of possibilities for the source of the greatest error; I thought it
would be a great study in which to apply many of the statistical tests learned over the semester.
For each of machines one, two, and three, 15 samples were collected: 5 with each of operators
one, two, and three. Every machine is therefore paired with every operator, giving nine
machine/operator combinations and 45 measurements in total, so machines and operators have to be
analyzed together throughout the study. For these particular brake wheels, the population
is determined by the total amount of brake wheels made from those specific machines and
operators. The sample is random and draws from the pool of the total brake wheels made (the
population) to reflect the total population. There are concerns for statistical bias throughout the
study. Representativeness is a potential issue when doing this study. If a sample does not
represent the true population, results will be skewed and biased. Fortunately the subject that the
study concerns is not complex. Brake wheels bring no potential bias threat to the study because
obviously they are inanimate objects with no ability to impact the data collected. The factor that
we have to take into account is selection bias, in other words equal representation for all of the
different combinations. So really we just have to represent an equal number of samples for each
Machine/Operator combination, which the data accumulated does have. There are other potential
bias threats. The potential omission of information has to be understood. Was it just the
machines and operators that played a role in the size discrepancy of the brake wheels, or have
other steps to the process been omitted because they were deemed negligible? We put some trust
in the data collectors that they appropriately analyzed all of the factors that went into making
these brake wheels and that they have given us an accurate reflection of these discrepancies.
Detection bias might play a role. Once again, the data collectors might have been biased towards
choosing brake wheels with discrepancies because that is what they were looking for when
collecting the data. We have to consider how the data was taken. We trust that the data collectors
collected the data as randomly as they possibly could for each machine and operator.
Reporting bias might be another contributing factor. We assume that exactly 5 samples were
collected at random for each combination and that they were not hand-picked from a larger data
collection. I believe the sample to be representative of the true population because we have an
equal number of samples for each machine/operator combination and good reason to trust that our
data collectors are competent.
I plotted a boxplot with the difference as the graph variable and the three machines and three
operators as the categorical variables. The boxplot showed that the overall error was roughly
equal across the machines. What appears to be causing the most error is the operators. When
looking at the boxplot, one sees that operators one, two, and three are very different from one
another. Operator two appears to contribute the greatest amount of error and operator three the
least. The differences between operators one and three and between operators two and three are
quite drastic. It is true that operator two contributes the greatest amount of error, but it is
the spread between the operators' outputs that appears to produce the greatest discrepancy. The
descriptive statistics agree. In the results grouped by machine, the operator means vary widely
within each machine (for machine one: 2.800, 3.400, and 2.300). In the results grouped by
operator, the machine means are very close to one another within each operator (for operator
three: 2.300, 2.500, and 2.500), and the minimum and maximum values are close as well.
Are the mean values for machines one, two, and three essentially equal? We perform a
two-way ANOVA test of the difference versus the machine and operator. Our null hypothesis is
that the mean values of machines one, two, and three are equal. Our alternative hypothesis is
that at least two of the three mean values differ. I performed the test and found a p-value of
0.391, which is much greater than the 0.05 significance level. We cannot reject the null
hypothesis, so the data give no evidence that the machine means differ.
Are the mean values for operators one, two, and three essentially equal? Again we use the
two-way ANOVA test. Our null hypothesis is that the mean values of operators one, two, and three
are equal. Our alternative hypothesis is that at least two of the three mean values differ. I
performed the test and found a p-value of 0.000 (that is, less than 0.0005) for operators. Since
this is less than the 0.05 significance level, we reject the null hypothesis and conclude that
at least two of the operator mean values are different.
Does the effect of the operator depend on the machine? We use the interaction term of the
two-way ANOVA test. Our null hypothesis is that there is no machine/operator interaction; our
alternative hypothesis is that an interaction exists. The test gives a p-value of 0.896 for the
interaction, which is much greater than the 0.05 significance level. We cannot reject the null
hypothesis, so the data give no evidence of a machine/operator interaction. The statistical
inference data show that the operators are contributing to the discrepancy. To further
emphasize this conclusion, I performed one-way ANOVA tests of difference versus machine and of
difference versus operator, using the Tukey method for both. For the operators, operators one
and two were in the same grouping, A, while operator three was in grouping B. The method shows
the large difference between operator three and the other two operators. Again we get a p-value
of 0.000, which shows the difference in the mean values of the operators. For the machines,
machines one, two, and three are all in the same grouping, A. The p-value is 0.499, which is
much greater than 0.05, so again there is no evidence that the machine means differ. The method
was appropriate because we want to find what is causing the largest discrepancy in brake wheel
size. The important information is the difference measured for each operator and machine
combination.
I collected a set of four residual plots
of the difference. For the normal probability plot, the residuals follow the line overall, which
indicates that the residuals are approximately normally distributed. This suggests that the
data collectors did a good job of capturing the variables that contribute to the difference in
brake wheel size.
For the versus fits graph, there is a little change in variability across the plot, but not too
much, so the equal-variance condition is reasonably met and the data fit a linear model.
The histogram does not quite show a bell curve, but we also have to take into account the very
small sample that we are working from. The larger the sample, the more closely the histogram
should reflect a normal bell curve. For the most part, though, the histogram has a bell curve
shape, which is consistent with normally distributed residuals.
The residual versus observation order graph is very unpredictable, reflecting no pattern, which
is good. This graph shows that the residuals are independent of one another, so the
independence condition is met. These graphs show that there are no serious violations of the
assumptions of the linear model.
From our studies we can conclude that the operators are the greatest source of error when
making the brake wheels. Our descriptive statistics show that the greatest discrepancy occurs
among the operators: they have the greatest difference among their mean values. The boxplot
confirms this. It shows that the machines overall have around the same level of error, but the
error contributed by the operators is very widespread. In our statistical inference data, only
the operators showed a significant difference in mean error. For both the machines and the
machine/operator interaction, no significant difference was found. Of the three operators, two
differed greatly from the third: operators one and two were drastically different from operator
three (operators one and two were in grouping A while operator three was in grouping B). One
question that naturally arises: will changing the operators also change the machines' output?
We can address this question. The interaction term in the two-way ANOVA is not significant
(p-value 0.896), and our residual versus observation order graph shows independence among the
residuals. Therefore we can conclude that the effects of operators and machines are (for the
most part) independent of one another and that changing one will not change the other.
We found our large source of error: fluctuation among the operators. To further our study
of this problem, I suggest that we use a much larger sample size. Instead of 5 samples for each
machine/operator combination, I advise that we take 50 samples. We must still use every
available method to obtain a completely random sample, but taking a larger sample would give
more precise results. From our studies, I would also propose that we take a much closer look at
the operators. They appear to be the leading cause of brake wheel size discrepancy, and
addressing the differences among the operators would help diminish the variance. Of someone
conducting a similar study, I would first ask how the operators specifically function in making
the brake wheels. Is there a distinct function the operators perform that creates the
variation? Are there multiple factors? There must be many elements that contribute to how an
operator functions, so I suggest that anyone conducting a corresponding study take these factors
into account. The factors that contribute to machine output and operator output were not really
considered in this study. Since multiple factors contribute to operator output, they must also
be considered in forthcoming studies to improve efficiency.
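
The gain from a larger sample can be illustrated with the standard error of a machine/operator cell mean, which is the standard deviation divided by the square root of the cell size. The sketch below is only a rough illustration: it assumes the spread of the measurements would stay near the pooled value S = 0.6749 reported in the two-way ANOVA output, and the function name is illustrative.

```python
import math

S = 0.6749  # pooled standard deviation from the two-way ANOVA output (assumed to stay the same)

def se_of_cell_mean(n, s=S):
    """Standard error of the mean of one machine/operator cell with n samples."""
    return s / math.sqrt(n)

se_5 = se_of_cell_mean(5)    # current design: 5 samples per combination
se_50 = se_of_cell_mean(50)  # proposed design: 50 samples per combination

print(f"SE with  5 samples per cell: {se_5:.4f}")
print(f"SE with 50 samples per cell: {se_50:.4f}")
print(f"Reduction factor: {se_5 / se_50:.2f}")
```

Under that assumption, a tenfold increase in samples per combination shrinks the standard error by a factor of the square root of 10, so each cell mean would be estimated roughly three times more precisely.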













Descriptive Statistics: Difference

Results for Machine = 1

Variable    Operator   Mean  SE Mean  StDev  Variance  Minimum     Q1  Median
Difference  1         2.800    0.255  0.570     0.325    2.000  2.250   3.000
            2         3.400    0.292  0.652     0.425    3.000  3.000   3.000
            3         2.300    0.255  0.570     0.325    1.500  1.750   2.500

Variable    Operator     Q3  Maximum  Range    IQR
Difference  1         3.250    3.500  1.500  1.000
            2         4.000    4.500  1.500  1.000
            3         2.750    3.000  1.500  1.000
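
Several columns in this table are derived from the others, which gives a quick consistency check. The sketch below assumes 5 observations per machine/operator cell, as stated in the study design, and uses only the machine 1 values printed above; the names `machine_1` and `derived_columns` are illustrative.

```python
import math

N_PER_CELL = 5  # assumption: 5 samples per machine/operator cell, per the study design

# Machine 1 rows as printed: operator -> (StDev, Minimum, Q1, Q3, Maximum)
machine_1 = {
    1: (0.570, 2.000, 2.250, 3.250, 3.500),
    2: (0.652, 3.000, 3.000, 4.000, 4.500),
    3: (0.570, 1.500, 1.750, 2.750, 3.000),
}

def derived_columns(stdev, minimum, q1, q3, maximum, n=N_PER_CELL):
    """Recompute SE Mean, Variance, Range, and IQR from the other columns."""
    return {
        "se_mean": stdev / math.sqrt(n),  # SE Mean = StDev / sqrt(n)
        "variance": stdev ** 2,           # Variance = StDev squared
        "range": maximum - minimum,       # Range = Maximum - Minimum
        "iqr": q3 - q1,                   # IQR = Q3 - Q1
    }

checks = {op: derived_columns(*row) for op, row in machine_1.items()}
for op, cols in checks.items():
    print(op, {name: round(value, 3) for name, value in cols.items()})
```

The recomputed values can be compared with the SE Mean, Variance, Range, and IQR columns above; small differences in the last digit would only reflect rounding of the printed StDev.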


Results for Machine = 2

Variable    Operator   Mean  SE Mean  StDev  Variance  Minimum     Q1  Median
Difference  1         3.300    0.200  0.447     0.200    3.000  3.000   3.000
            2         3.400    0.485  1.084     1.175    2.500  2.500   3.000
            3         2.500    0.224  0.500     0.250    2.000  2.000   2.500

Variable    Operator     Q3  Maximum  Range    IQR
Difference  1         3.750    4.000  1.000  0.750
            2         4.500    5.000  2.500  2.000
            3         3.000    3.000  1.000  1.000


Results for Machine = 3

Variable    Operator   Mean  SE Mean  StDev  Variance  Minimum     Q1  Median
Difference  1         3.200    0.255  0.570     0.325    2.500  2.750   3.000
            2         3.800    0.406  0.908     0.825    2.500  3.000   4.000
            3         2.500    0.224  0.500     0.250    2.000  2.000   2.500

Variable    Operator     Q3  Maximum  Range    IQR
Difference  1         3.750    4.000  1.500  1.000
            2         4.500    5.000  2.500  1.500
            3         3.000    3.000  1.000  1.000

Results for Operator = 1

Variable    Machine    Mean  SE Mean  StDev  Variance  Minimum     Q1  Median
Difference  1         2.800    0.255  0.570     0.325    2.000  2.250   3.000
            2         3.300    0.200  0.447     0.200    3.000  3.000   3.000
            3         3.200    0.255  0.570     0.325    2.500  2.750   3.000

Variable    Machine      Q3  Maximum  Range    IQR
Difference  1         3.250    3.500  1.500  1.000
            2         3.750    4.000  1.000  0.750
            3         3.750    4.000  1.500  1.000


Results for Operator = 2

Variable    Machine    Mean  SE Mean  StDev  Variance  Minimum     Q1  Median
Difference  1         3.400    0.292  0.652     0.425    3.000  3.000   3.000
            2         3.400    0.485  1.084     1.175    2.500  2.500   3.000
            3         3.800    0.406  0.908     0.825    2.500  3.000   4.000

Variable    Machine      Q3  Maximum  Range    IQR
Difference  1         4.000    4.500  1.500  1.000
            2         4.500    5.000  2.500  2.000
            3         4.500    5.000  2.500  1.500


Results for Operator = 3

Variable    Machine    Mean  SE Mean  StDev  Variance  Minimum     Q1  Median
Difference  1         2.300    0.255  0.570     0.325    1.500  1.750   2.500
            2         2.500    0.224  0.500     0.250    2.000  2.000   2.500
            3         2.500    0.224  0.500     0.250    2.000  2.000   2.500

Variable    Machine      Q3  Maximum  Range    IQR
Difference  1         2.750    3.000  1.500  1.000
            2         3.000    3.000  1.000  1.000
            3         3.000    3.000  1.000  1.000
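
Because the design is balanced, the sums of squares of a two-way ANOVA can be rebuilt from the nine cell means and variances listed above. The following is a minimal sketch under that assumption (5 observations in every cell); the variable names are illustrative and the raw measurements are not needed.

```python
# Cell means and variances as printed above: (machine, operator) -> value
cell_mean = {
    (1, 1): 2.8, (1, 2): 3.4, (1, 3): 2.3,
    (2, 1): 3.3, (2, 2): 3.4, (2, 3): 2.5,
    (3, 1): 3.2, (3, 2): 3.8, (3, 3): 2.5,
}
cell_var = {
    (1, 1): 0.325, (1, 2): 0.425, (1, 3): 0.325,
    (2, 1): 0.200, (2, 2): 1.175, (2, 3): 0.250,
    (3, 1): 0.325, (3, 2): 0.825, (3, 3): 0.250,
}
N_PER_CELL = 5  # assumption: balanced design with 5 samples per cell
levels = (1, 2, 3)

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# In a balanced design, marginal and grand means are plain averages of cell means.
machine_mean = {m: mean(cell_mean[m, o] for o in levels) for m in levels}
operator_mean = {o: mean(cell_mean[m, o] for m in levels) for o in levels}
grand_mean = mean(cell_mean.values())

n_per_level = N_PER_CELL * len(levels)  # 15 observations per machine or per operator
ss_machine = n_per_level * sum((machine_mean[m] - grand_mean) ** 2 for m in levels)
ss_operator = n_per_level * sum((operator_mean[o] - grand_mean) ** 2 for o in levels)
ss_interaction = N_PER_CELL * sum(
    (cell_mean[m, o] - machine_mean[m] - operator_mean[o] + grand_mean) ** 2
    for m in levels for o in levels
)
# Within-cell (error) sum of squares: (n - 1) times each cell variance, summed.
ss_error = (N_PER_CELL - 1) * sum(cell_var.values())

print(round(ss_machine, 4), round(ss_operator, 4),
      round(ss_interaction, 4), round(ss_error, 4))
```

The four printed values can be compared with the SS column of the two-way ANOVA table in the statistical inference section.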













Statistical Inference Data for Machines and Operators

Two-way ANOVA: Difference versus Machine, Operator

Source       DF       SS       MS      F      P
Machine       2   0.8778  0.43889   0.96  0.391
Operator      2   9.2111  4.60556  10.11  0.000
Interaction   4   0.4889  0.12222   0.27  0.896
Error        36  16.4000  0.45556
Total        44  26.9778

S = 0.6749 R-Sq = 39.21% R-Sq(adj) = 25.70%
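
The F and P columns follow from the SS and DF columns: each mean square is SS/DF, each F is that mean square divided by the error mean square, and P is the upper tail of the F distribution. The sketch below recomputes them using only the standard library. The helper `f_survival_even_df1` is an illustrative function, not part of any library; it relies on a finite-sum form of the F tail probability that holds only when the numerator degrees of freedom are even, which is the case here (2 and 4).

```python
import math

def f_survival_even_df1(f, df1, df2):
    """P(F > f) for an F(df1, df2) variable; finite sum valid only for even df1."""
    if df1 % 2 != 0:
        raise ValueError("this closed form requires an even numerator df")
    a, b = df1 // 2, df2 / 2
    x = df1 * f / (df1 * f + df2)
    term, total = 1.0, 1.0
    for j in range(1, a):
        term *= (b + j - 1) / j * x
        total += term
    return (1 - x) ** b * total

# (DF, SS) per source, copied from the table above
rows = {"Machine": (2, 0.8778), "Operator": (2, 9.2111), "Interaction": (4, 0.4889)}
df_error, ss_error = 36, 16.4000
ss_total = 26.9778

ms_error = ss_error / df_error
results = {}
for source, (df, ss) in rows.items():
    ms = ss / df
    f = ms / ms_error
    results[source] = (ms, f, f_survival_even_df1(f, df, df_error))

s = math.sqrt(ms_error)           # S: square root of the error mean square
r_sq = 1 - ss_error / ss_total    # R-Sq: share of variation not left in the error term

for source, (ms, f, p) in results.items():
    print(f"{source:12s} MS={ms:.5f}  F={f:.2f}  P={p:.3f}")
print(f"S={s:.4f}  R-Sq={100 * r_sq:.2f}%")
```

A statistics package would normally supply the F distribution directly; the finite sum is used here only to keep the check free of dependencies.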


Individual 95% CIs For Mean Based on Pooled StDev
[Interval plot by machine; axis ticks at 2.70, 3.00, 3.30, 3.60]
Machine     Mean
1        2.83333
2        3.06667
3        3.16667


Individual 95% CIs For Mean Based on Pooled StDev
[Interval plot by operator; axis ticks at 2.50, 3.00, 3.50, 4.00]
Operator     Mean
1         3.10000
2         3.53333
3         2.43333


One-way ANOVA: Difference versus Operator

Source    DF      SS     MS      F      P
Operator   2   9.211  4.606  10.89  0.000
Error     42  17.767  0.423
Total     44  26.978

S = 0.6504 R-Sq = 34.14% R-Sq(adj) = 31.01%


Individual 95% CIs For Mean Based on Pooled StDev
[Interval plot by operator; axis ticks at 2.50, 3.00, 3.50, 4.00]
Level   N    Mean   StDev
1      15  3.1000  0.5412
2      15  3.5333  0.8550
3      15  2.4333  0.4952

Pooled StDev = 0.6504
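
The one-way ANOVA for operators can be reproduced from the group sizes, means, and standard deviations listed above, without the raw data. This is a minimal sketch using those printed summary values; the variable names are illustrative, and small differences from the Minitab output would come from rounding in the printed summaries.

```python
import math

# Operator level -> (N, Mean, StDev), copied from the output above
groups = {
    1: (15, 3.1000, 0.5412),
    2: (15, 3.5333, 0.8550),
    3: (15, 2.4333, 0.4952),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Between-group SS: spread of the group means around the grand mean.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
# Within-group SS: (n - 1) times each group variance, summed.
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)
ms_within = ss_within / df_within
f_stat = (ss_between / df_between) / ms_within
pooled_sd = math.sqrt(ms_within)

print(f"SS between = {ss_between:.3f}, SS within = {ss_within:.3f}")
print(f"F = {f_stat:.2f}, pooled StDev = {pooled_sd:.4f}")
```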


Grouping Information Using Tukey Method

Operator   N    Mean  Grouping
2         15  3.5333  A
1         15  3.1000  A
3         15  2.4333    B

Means that do not share a letter are significantly different.


Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Operator

Individual confidence level = 98.07%


Operator = 1 subtracted from:

Operator    Lower   Center    Upper
2         -0.1444   0.4333   1.0110
3         -1.2444  -0.6667  -0.0890

Operator = 2 subtracted from:

Operator    Lower   Center    Upper
3         -1.6777  -1.1000  -0.5223

[Interval plots for both comparisons; axis ticks at -1.0, 0.0, 1.0, 2.0]
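
The Tukey groupings follow from these intervals: a pair of operators differs significantly exactly when its simultaneous interval excludes zero. The sketch below applies that rule to the printed limits. It also backs out the critical value implied by the interval half-width, assuming the equal-group-size relation half-width = q * s / sqrt(n) with the pooled StDev of 0.6504 and n = 15; the names used are illustrative.

```python
import math

POOLED_SD = 0.6504  # pooled StDev from the one-way ANOVA output
N_PER_GROUP = 15    # observations per operator

# (reference operator, compared operator) -> (Lower, Center, Upper), as printed
intervals = {
    (1, 2): (-0.1444, 0.4333, 1.0110),
    (1, 3): (-1.2444, -0.6667, -0.0890),
    (2, 3): (-1.6777, -1.1000, -0.5223),
}

def excludes_zero(lower, upper):
    """True when the interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0

significant = {pair: excludes_zero(lo, hi) for pair, (lo, _, hi) in intervals.items()}
half_width = {pair: (hi - lo) / 2 for pair, (lo, _, hi) in intervals.items()}
# Critical value implied by the printed intervals (an inferred quantity, not Minitab output)
implied_q = {pair: hw * math.sqrt(N_PER_GROUP) / POOLED_SD for pair, hw in half_width.items()}

for pair in intervals:
    verdict = "differ" if significant[pair] else "do not differ"
    print(f"Operators {pair[0]} and {pair[1]} {verdict} "
          f"(half-width {half_width[pair]:.4f}, implied q {implied_q[pair]:.2f})")
```

Applied to the printed limits, the rule separates operator three from both other operators and leaves operators one and two together, which matches groupings A, A, and B.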


One-way ANOVA: Difference versus Machine

Source   DF      SS     MS     F      P
Machine   2   0.878  0.439  0.71  0.499
Error    42  26.100  0.621
Total    44  26.978

S = 0.7883 R-Sq = 3.25% R-Sq(adj) = 0.00%


Individual 95% CIs For Mean Based on Pooled StDev
[Interval plot by machine; axis ticks at 2.70, 3.00, 3.30, 3.60]
Level   N    Mean   StDev
1      15  2.8333  0.7237
2      15  3.0667  0.7988
3      15  3.1667  0.8381

Pooled StDev = 0.7883


Grouping Information Using Tukey Method

Machine   N    Mean  Grouping
3        15  3.1667  A
2        15  3.0667  A
1        15  2.8333  A

Means that do not share a letter are significantly different.


Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Machine

Individual confidence level = 98.07%


Machine = 1 subtracted from:

Machine    Lower  Center   Upper
2        -0.4668  0.2333  0.9335
3        -0.3668  0.3333  1.0335

Machine = 2 subtracted from:

Machine    Lower  Center   Upper
3        -0.6002  0.1000  0.8002

[Interval plots for both comparisons; axis ticks at -0.50, 0.00, 0.50, 1.00]
