Math 1040 Skittles Term Project
Number of orange candies: 224
Number of yellow candies: 205
Number of green candies: 209
Number of purple candies: 235
Introduction
The purpose of this assignment is to see whether the variation in colors between
packages is in fact random, or whether the Skittles company really does produce
different amounts of each color. For our experiment, each student in the class recorded
the number of candies of every color in their Skittles package. We then combine the
results into one large sample and work from there. Using the large sample, we compare
the mean number of Skittles per bag. I then construct confidence intervals at several
confidence levels, after which I test various hypotheses regarding the statistics of
the Skittles.
Jamison Pittl
[Figure: pie chart of the proportion of each color (Red, Orange, Yellow, Green, Purple); surviving percentage labels: 19.24, 19.33, 20.72, 18.96]
[Figure: chart of candy counts by color (Orange, Yellow, Green, Purple); axis ticks run from 190 to 235 and from 10 to 100]
Table: number of candies of each color per bag, by student
Columns: Name, Green, Orange, Purple, Red, Yellow, Sum
Names: Alison Digravio, Anton, Ashley, Austin, Brandon, Bryce, Cesar, Chet, Clarissa, Danielle, Derek, Jamison, Katherine, Kiran, Matani, Remington, Whitney, William
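Color counts per bag (in extraction order):
17, 15, 16, 7, 9, 12, 14, 9, 11, 19, 13, 8, 18, 14, 12, 7, 11, 8,
6, 14, 14, 18, 12, 7, 15, 12, 12, 12, 10, 10, 14, 11, 13, 12, 15, 16,
10, 13, 14, 9, 17, 13, 12, 11, 10, 15, 13, 9, 16, 16, 12, 19, 9, 9,
13, 10, 15, 9, 11, 13, 10, 13, 11, 6, 17, 16, 15, 11, 11, 14, 9, 11,
9, 9, 15, 16, 13, 11, 12, 14, 13, 11, 7, 8, 11, 6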
Sum per bag: 60, 58, 58, 60, 57, 62, 64, 61, 63, 60, 58, 60, 65, 62, 59, 57, 59
Color     Sum     Average
Green     209     11.61111
Orange    224     12.44444
Purple    235     13.05556
Red       208     11.55556
Yellow    205     11.38889
Total     1081    60.05556
10
16
58
As the compiled data and graphs show, some colors of Skittles appeared
noticeably more often than others. Purple in particular had a total of 235
while yellow had only 205, a difference of 30 Skittles. However, the relative
frequencies of the colors surprised me: each was close to 20%, give or take a
percentage point or so. I've included a table of all statistics and
highlighted mine in yellow. With regard to my own statistics, I was very
surprised to see that yellow, one of my highest counts, was one of the
lowest overall for the class. However, my results for the purple Skittles did
coincide with the final result of purple being the highest.
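The "close to 20%" observation can be checked directly from the class totals. The following is a minimal sketch, assuming the per-color sums from the compiled table; the function name `color_proportions` is made up for illustration.

```python
# Class totals per color, taken from the compiled table.
color_totals = {"Green": 209, "Orange": 224, "Purple": 235, "Red": 208, "Yellow": 205}


def color_proportions(totals):
    """Return each color's share of all candies as a fraction of the grand total."""
    grand_total = sum(totals.values())
    return {color: count / grand_total for color, count in totals.items()}


proportions = color_proportions(color_totals)

# Print the colors from most to least common, as percentages.
for color, share in sorted(proportions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{color:<7}{share:7.2%}")
```

Under these totals every color falls within about two percentage points of 20%, with purple the largest share and yellow the smallest.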
[Figure: histogram of the number of candies per bag (bins 57, 59, 61, 63, More) with Frequency bars and a Cumulative % line from 0.00% to 120.00%]
Mean: 60.1
Standard Deviation: 2.36
Min: 57
Q1: 58
Med: 60
Q3: 62
Max: 65
N: 18
The data are roughly normal but can also be viewed as skewed to the
right. The average number of Skittles per bag was about 60.1, with a high of
65 and a low of 57. As the histogram shows, most students got about
59-60 Skittles. This is pretty much what I expected to see. However, my bag
contained fewer Skittles than average.
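The summary statistics above can be reproduced with Python's standard `statistics` module. This is a sketch under one stated assumption: seventeen per-bag sums appear in the table, and the eighteenth (58) is inferred here from the class total of 1081 rather than read from the source.

```python
import statistics

# Candies per bag for the 18 students. Seventeen sums are listed in the table;
# the final value (58) is an ASSUMPTION inferred from the class total of 1081.
bag_totals = [60, 58, 58, 60, 57, 62, 64, 61, 63, 60, 58, 60, 65, 62, 59, 57, 59, 58]

mean = statistics.mean(bag_totals)
sample_sd = statistics.stdev(bag_totals)  # sample SD, divides by n - 1
# quantiles() with n=4 returns the three quartile cut points
# (default "exclusive" method).
q1, median, q3 = statistics.quantiles(bag_totals, n=4)

print(f"N: {len(bag_totals)}")
print(f"Mean: {mean:.1f}")
print(f"Standard Deviation: {sample_sd:.2f}")
print(f"Min: {min(bag_totals)}  Q1: {q1:g}  Med: {median:g}  Q3: {q3:g}  Max: {max(bag_totals)}")
```

Different quartile conventions can give slightly different Q1 and Q3 values on small samples, so a calculator or spreadsheet may not always agree exactly with `statistics.quantiles`.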
Reflection
The difference between categorical and quantitative data is that quantitative
data are measurable or quantifiable, such as length, weight, and age.
Categorical data are things that can be grouped together according to some
common property or characteristic, such as color, sex, and type of vehicle.
The graphs that make the most sense for categorical data are bar graphs,
line graphs, pie charts, and pictographs, because each category can be
assigned a bar, wedge, or picture and the categories need no numerical scale.
Histograms and scatterplots don't work well with categorical data because
they depend on numerical values, which categorical data do not have. The
best graphs to use with quantitative data are histograms, boxplots, and
scatterplots. While a bar graph can be used, it's usually reserved for
categorical data because it doesn't accurately display numerical data unless
you turn it into a histogram. From quantitative graphs one can read off a
maximum, minimum, and median. You can definitely obtain more information
from quantitative graphs, but categorical graphs do a good job of organizing
information and turning it into a visual.
We are 99% confident that the true proportion of yellow candies lies between
.15893 and .22035.
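The interval for the yellow proportion matches a standard one-proportion z-interval. The following is a sketch, assuming 205 yellow candies out of 1081 total (from the class table); the helper name `proportion_ci` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation (z) confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 205 yellow candies out of 1081 in the class sample.
low, high = proportion_ci(205, 1081, 0.99)
print(f"({low:.5f}, {high:.5f})")
```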
Construct a 95% confidence interval estimate for the true mean
number of candies per bag.
(58.995, 61.117), also written as 58.995 < μ < 61.117 (found using a
calculator)
We are 95% confident that the true mean number of Skittles per bag lies
between 58.995 and 61.117.
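The reported interval appears to match a z-interval that treats the population standard deviation as known. That is an assumption about how the calculator was used, not something the source states. The sketch below uses a sample mean of 1081/18 and σ ≈ 2.2967 (the value listed as 2.29 in the hypothesis test section, carried to more digits); the helper name `mean_z_ci` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mean_z_ci(sample_mean, sigma, n, confidence):
    """z confidence interval for a mean, treating sigma as a known population SD."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# ASSUMPTIONS: mean = 1081 / 18 candies per bag, sigma = 2.2967, n = 18 bags.
low, high = mean_z_ci(1081 / 18, 2.2967, 18, 0.95)
print(f"({low:.3f}, {high:.3f})")
```

A t-interval based on the sample standard deviation (2.36) with 17 degrees of freedom would be somewhat wider, which is the more usual choice when σ is estimated from a sample this small.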
5.697
We are 98% confident that the population's standard deviation of candies per
bag lies between 3.432 and 16.619.
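As a cross-check, the usual chi-square interval for a standard deviation can be sketched as follows. It assumes the sample standard deviation of 2.36 and n = 18 from the summary statistics, with chi-square critical values for 17 degrees of freedom copied from a standard table (33.409 and 6.408 for 98% confidence). Under those inputs the sketch gives roughly 1.68 to 3.84, which differs from the interval reported above, so the reported figures may have come from different inputs.

```python
from math import sqrt


def sd_chi_square_ci(sample_sd, n, chi2_right, chi2_left):
    """Confidence interval for a population SD from chi-square critical values.

    chi2_right and chi2_left are the upper and lower critical values for
    n - 1 degrees of freedom at the chosen confidence level.
    """
    sum_of_squares = (n - 1) * sample_sd ** 2
    return sqrt(sum_of_squares / chi2_right), sqrt(sum_of_squares / chi2_left)


# ASSUMPTIONS: s = 2.36, n = 18; table values for df = 17 at 98% confidence.
low, high = sd_chi_square_ci(2.36, 18, chi2_right=33.409, chi2_left=6.408)
print(f"({low:.3f}, {high:.3f})")
```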
Reflect
From the first confidence interval, we can see that the proportion of yellow
Skittles lies between 15.9% and 22.0% with 99% confidence. From the second
confidence interval we can conclude that the mean number of candies per
bag lies between 59.0 and 61.1 with 95% confidence. From the final
confidence interval we can conclude that the population standard
deviation lies somewhere between 3.432 and 16.619 with 98% confidence.
Hypothesis Tests
A hypothesis test is a statistical test used to determine whether there is
enough evidence in a sample of data to infer that a certain condition is
true for the entire population. A hypothesis test examines two opposing
hypotheses about a population: the null hypothesis and the alternative
hypothesis. The null hypothesis is the statement being tested. Usually the
null hypothesis states that a population parameter is equal to a specific
value. The alternative hypothesis is the statement you want to be able to
conclude is true, and it is stated as the parameter being greater than, less
than, or not equal to that value.
Use a 0.05 significance level to test the claim that 20% of all Skittles
candies are red.
H0 (null): p = .20
H1 (alternative): p ≠ .20
z = -.624
p = .533 (found on calculator)
α = .05
p = .533 > .05 = α
Because the p-value is greater than our significance level, we fail to reject
the null hypothesis. There is not sufficient evidence to reject the claim that
20% of Skittles are red.
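The test statistic and p-value for the red proportion can be reproduced with a one-proportion z-test. This is a sketch, assuming 208 red candies out of 1081 (from the class table) and a two-sided alternative; the function name `one_proportion_z_test` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0 using the normal approximation."""
    p_hat = successes / n
    # The standard error uses the hypothesized proportion p0, not p_hat.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


# 208 red candies out of 1081, tested against the claimed 20%.
z, p_value = one_proportion_z_test(208, 1081, 0.20)
print(f"z = {z:.3f}, p = {p_value:.3f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```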
Use a 0.01 significance level to test the claim that the mean number
of candies in a bag of Skittles is 55.
H0 (null): μ = 55
H1 (alternative): μ ≠ 55
α = .01
Mean: 60.06
SD: 2.36
σ (population SD): 2.29
n = 18
p = 9.8450E-21 (found on calculator)
p = 9.8450E-21 < .01 = α
Because the p-value is less than the significance level, we reject the null
hypothesis. There is sufficient evidence to reject the claim that the mean
number of Skittles per bag is 55.
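A p-value this small is consistent with a two-sided z-test that treats σ as known. The sketch below makes that assumption, using a mean of 1081/18, σ ≈ 2.2967 (listed as 2.29 above), and n = 18; the function name `one_sample_z_test` is made up for illustration. It uses `math.erfc` because computing `1 - cdf(z)` loses all precision this far into the tail.

```python
from math import erfc, sqrt


def one_sample_z_test(sample_mean, mu0, sigma, n):
    """Two-sided z-test of H0: mu = mu0, treating sigma as known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# ASSUMPTIONS: mean = 1081 / 18, sigma = 2.2967, n = 18, claimed mean 55.
z, p_value = one_sample_z_test(1081 / 18, 55, 2.2967, 18)
print(f"z = {z:.2f}, p = {p_value:.3e}")
print("reject H0" if p_value < 0.01 else "fail to reject H0")
```

The exact p-value is sensitive to the digits carried in σ, so this sketch lands on the same order of magnitude as the calculator result rather than matching it digit for digit.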
Reflect
Before an interval estimate can be made, you must first make sure a few
conditions are met. First, the sample must be a simple random sample.
Second, the sample needs to be sufficiently large. For a proportion, a sample
can usually be considered sufficiently large if it includes at least 10
successes and 10 failures. As with interval estimates, hypothesis tests also
have conditions that need to be met, and they vary depending on the type of
test. For a proportion test, the sample must be a simple random sample and
the population must be larger than 10n. For a one-sample t-test, the sample
size needs to be greater than 30, the population must be larger than 10n,
and the sample must be a simple random sample. As far as I can tell, my
tests did meet the conditions under which they had to be run. A sampling
method can be improved by increasing the sample size, since a larger sample
gives more accurate estimates. Additionally, the more random the sample is,
the more likely it is to be diverse and to represent the population
accurately. From the statistical work we have done, we can conclude that
the mean number of Skittles per bag is more than 55. We also found no
evidence against the claim that 20% of Skittles are red.
Final Reflection
What have you learned as a result of this project?
As a result of this project I have become much more proficient with not
only Microsoft Word, but Microsoft Excel as well. Previously I hadn't had much
experience with data sheets and making graphs. But throughout the course
of this project, I was really pushed to my limits in trying to create graphs that
accurately displayed the Skittles data. I've become a lot more confident