Math 1040 Skittles Term Project


Jamison Pittl



Color     Number of candies
Red       208
Orange    224
Yellow    205
Green     209
Purple    235

Introduction
The purpose of this assignment is to see whether the variation in colors between
packages is in fact random, or whether the Skittles company really does produce
different amounts of each color. For our experiment, each student in the class has
recorded the number of candies of every color in their Skittles package. We will then
combine the results into one large sample and continue from there. Using the large
sample, we will compare the mean number of Skittles per bag. I will then construct
confidence intervals at several confidence levels, after which I will test various
hypotheses about the Skittles statistics.


[Figure: Skittles Color Frequency (%). Red 19.24, Orange 20.72, Yellow 18.96, Green 19.33, Purple 21.74]
[Figure: Number of Skittles by Color. Red 208, Orange 224, Yellow 205, Green 209, Purple 235]
[Figure: Skittles Count and Relative Frequency. Count axis from 190 to 240, frequency axis from 10 to 100; colors shown: Purple, Orange, Green, Red, Yellow]
Name             Green  Orange  Purple    Red  Yellow    Sum
Alison Digravio     15       6      16      9      14     60
Anton               16      14      10      9       9     58
Ashley               7      14      13     13      11     58
Austin               9      18      14     10       9     60
Brandon             12      12       9     15       9     57
Bryce               14       7      17      9      15     62
Cesar                9      15      13     11      16     64
Chet                11      12      12     13      13     61
Clarissa            19      12      11     10      11     63
Danielle            13      12      10     13      12     60
Derek                8      10      15     11      14     58
Jamison              6      17      10      9      16     58
Katherine           18      10      13      6      13     60
Kiran               14      14       9     17      11     65
Matani              12      11      16     16       7     62
Remington            7      13      16     15       8     59
Whitney             11      12      12     11      11     57
William              8      15      19     11       6     59
Sum                209     224     235    208     205   1081
Average          11.61   12.44   13.06  11.56   11.39  60.06

As the compiled data and graphs show, some colors of Skittles were noticeably
more common than others. Purple in particular had a total of 235 candies
while yellow had only 205, a difference of 30 Skittles. However, the relative
frequencies of the colors surprised me. They were all close to 20%, give or
take a percentage point or two. I've included a table of all the data and
highlighted my own row in yellow. As for my own bag, I was very surprised to
see that yellow, one of my highest counts, was the lowest overall for the
class. However, my results for the purple Skittles did coincide with the
final results of purple being the highest.
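
The percentages in the color frequency chart can be checked with a few lines of Python. This is only a sketch: it uses the class totals reported above, and the variable names are made up for illustration.

```python
# Class totals by color, taken from the counts reported in this project.
counts = {"Red": 208, "Orange": 224, "Yellow": 205, "Green": 209, "Purple": 235}

# All candies in the combined class sample.
total = sum(counts.values())

# Relative frequency of each color, as a percentage of the total.
percentages = {color: 100 * n / total for color, n in counts.items()}

for color, pct in percentages.items():
    print(f"{color}: {counts[color]} candies, {pct:.2f}%")
```

Every color lands within about two percentage points of 20%, which is what the paragraph above describes.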


[Figure: Skittles Per Package. Histogram of frequency with a cumulative % line; bins 57, 59, 61, 63, More]

Mean: 60.1
Standard Deviation: 2.36
Min: 57
Q1: 58
Med: 60
Q3: 62
Max: 65
N: 18

This data is roughly normal but can also be viewed as skewed to the right.
The mean number of Skittles per bag was about 60.1, with a high of 65 and a
low of 57. As the histogram shows, most students got about 59-60 Skittles.
This is pretty much what I expected to see. However, my own bag had fewer
Skittles than average.
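
The summary statistics above can be reproduced from the 18 bag totals in the class data. The sketch below uses Python's standard `statistics` module. Note that `statistics.quantiles` uses its default "exclusive" method, which is an assumption here, since the project does not say how the quartiles were computed; for this data it gives the same quartiles.

```python
import statistics

# Total candies in each of the 18 bags, from the class data table.
bag_totals = [60, 58, 58, 60, 57, 62, 64, 61, 63,
              60, 58, 58, 60, 65, 62, 59, 57, 59]

n = len(bag_totals)
mean = statistics.mean(bag_totals)
sd = statistics.stdev(bag_totals)  # sample standard deviation (divides by n - 1)

# Quartiles with the module's default "exclusive" method.
q1, median, q3 = statistics.quantiles(bag_totals, n=4)

print(f"N: {n}")
print(f"Mean: {mean:.1f}")
print(f"Standard Deviation: {sd:.2f}")
print(f"Min: {min(bag_totals)}  Q1: {q1}  Med: {median}  Q3: {q3}  Max: {max(bag_totals)}")
```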


Reflection
The difference between categorical and quantitative data is that quantitative
data is measurable or quantifiable, such as length, weight, and age.
Categorical data are things that can be grouped together according to some
common property or characteristic, such as color, sex, and type of vehicle.
The kinds of graph that make the most sense for categorical data are a bar
graph, pie chart, or pictograph, because each category can be assigned a bar,
wedge, or picture and only counts are needed. Histograms and scatterplots
don't work well with categorical data because they depend on numerical data,
and categorical data has no numerical values. The best kinds of graph to use
when dealing with quantitative data are a histogram, boxplot, or scatterplot.
While a bar graph can be used, it's usually only used for categorical data
because it doesn't accurately display numerical data unless you turn it into
a histogram. From quantitative graphs one can obtain a max, min, and median.
You can definitely obtain more information from quantitative graphs, but
categorical graphs do a good job of organizing information and turning it
into a visual.

Confidence Interval Estimates


The purpose of a confidence interval is to give us a range of values for an
estimated population parameter rather than a single value, or point
estimate. The confidence interval gives us a range of values within which we
believe, with a stated degree of confidence, that the true population value
falls. After calculating a sample statistic, it is useful to use that value
to estimate the true value of the population parameter. A confidence interval
estimate is a method to do just that, and the result is an interval of
numbers that is likely to include the population parameter. The process
varies slightly depending on the level of confidence you desire, with 95%
being the most common choice. For example, you gather data from a sample of
the population, calculate the mean of that data, and build the interval
around that sample mean.
Construct a 99% confidence interval estimate for the true proportion
of yellow candies.
(.15893, .22035), also written as .15893 < p < .22035 (found using a
calculator)


We are 99% confident that the true proportion of yellow candies lies between
.15893 and .22035.
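
The calculator result for the yellow proportion can be checked with the usual one-proportion z-interval. This is a minimal sketch, assuming that is the method the calculator used; the counts are the class totals (205 yellow out of 1081 candies) and the variable names are only illustrative.

```python
from math import sqrt
from statistics import NormalDist

yellow = 205        # yellow candies in the class sample
total = 1081        # all candies in the class sample
confidence = 0.99

p_hat = yellow / total

# Critical value z* that leaves (1 - confidence) / 2 in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error for a proportion: z* times the standard error.
margin = z_star * sqrt(p_hat * (1 - p_hat) / total)
lower, upper = p_hat - margin, p_hat + margin

print(f"{lower:.5f} < p < {upper:.5f}")
```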
Construct a 95% confidence interval estimate for the true mean
number of candies per bag.
(58.995, 61.117), also written as 58.995 < μ < 61.117 (found using a
calculator)
We are 95% confident that the true mean number of Skittles per bag lies
between 58.995 and 61.117.
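
A short Python sketch can reproduce this interval from the 18 bag totals. The reported numbers seem to match a z-interval that treats the standard deviation computed with divisor n (about 2.30) as a known σ; that reading is an inference from the numbers, not something stated in the project. For comparison, the sketch also computes a t-interval with the sample standard deviation, using the table value t* = 2.110 for 17 degrees of freedom.

```python
import statistics
from math import sqrt
from statistics import NormalDist

# Total candies in each of the 18 bags, from the class data table.
bag_totals = [60, 58, 58, 60, 57, 62, 64, 61, 63,
              60, 58, 58, 60, 65, 62, 59, 57, 59]

n = len(bag_totals)
mean = statistics.mean(bag_totals)

# z-interval: treats the SD computed with divisor n as a known sigma (assumption).
sigma = statistics.pstdev(bag_totals)
z_star = NormalDist().inv_cdf(0.975)
z_margin = z_star * sigma / sqrt(n)
z_interval = (mean - z_margin, mean + z_margin)

# t-interval: uses the sample SD; 2.110 is the t table value for df = 17, 95%.
s = statistics.stdev(bag_totals)
t_star = 2.110
t_margin = t_star * s / sqrt(n)
t_interval = (mean - t_margin, mean + t_margin)

print(f"z-interval: {z_interval[0]:.3f} < mu < {z_interval[1]:.3f}")
print(f"t-interval: {t_interval[0]:.3f} < mu < {t_interval[1]:.3f}")
```

The t-interval comes out slightly wider, which is expected when σ is estimated from a small sample.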

Construct a 98% confidence interval estimate for the standard
deviation of the number of candies per bag.
(1.683, 3.844), also written as 1.683 < σ < 3.844 (found using the SD
formula with n - 1 = 17 degrees of freedom and chi-square critical values
33.409 and 6.408)
$$\sqrt{\frac{17(2.36)^2}{33.409}} < \sigma < \sqrt{\frac{17(2.36)^2}{6.408}}$$

We are 98% confident that the population standard deviation of candies per
bag lies between 1.683 and 3.844.
Reflect
From the first confidence interval, we can see that the proportion of yellow
Skittles lies between 15.9% and 22.0% with 99% confidence. From the second
confidence interval, we can conclude that the mean number of candies per
bag lies between 58.995 and 61.117 with 95% confidence. From the final
confidence interval, we can conclude that the population standard deviation
lies somewhere between 1.683 and 3.844 with 98% confidence.
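
For reference, the standard chi-square interval for a population standard deviation can be sketched in Python as follows. The critical values 33.409 and 6.408 are the usual table values for 17 degrees of freedom with 0.01 in each tail; they are hardcoded here so the sketch needs no extra libraries. The square root at the end is what turns the interval for the variance into an interval for the standard deviation.

```python
from math import sqrt

n = 18          # number of bags
s = 2.36        # sample standard deviation of candies per bag
df = n - 1      # degrees of freedom

# Chi-square critical values for df = 17 and 98% confidence
# (0.01 in each tail), copied from a standard chi-square table.
chi2_right = 33.409
chi2_left = 6.408

# Interval for the variance, then square roots for the standard deviation.
lower = sqrt(df * s**2 / chi2_right)
upper = sqrt(df * s**2 / chi2_left)

print(f"{lower:.3f} < sigma < {upper:.3f}")
```

A quick sanity check on any interval of this kind is that the sample standard deviation itself (2.36 here) should fall inside it.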


Hypothesis Tests
A hypothesis test is a statistical test used to determine whether there is
enough evidence in a sample of data to infer that a certain condition is
true for the entire population. A hypothesis test examines two opposing
hypotheses about a population: the null hypothesis and the alternative
hypothesis. The null hypothesis is the statement being tested. Usually the
null hypothesis states that a population parameter is equal to a specific
value. The alternative hypothesis is the statement you want to be able to
conclude is true, and it is stated as the parameter being greater than, less
than, or not equal to that value, never equal to it.
Use a 0.05 significance level to test the claim that 20% of all Skittles
candies are red.
H0 (null): p = .20
H1 (alternative): p ≠ .20
z = -.624 (found on calculator)
p-value = .533
α = .05
p-value = .533 > .05 = α
Because the p-value is greater than our significance level, we fail to reject
the null hypothesis. There is not sufficient evidence to reject the claim that
20% of Skittles are red.
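
The test statistic and p-value for the red proportion can be checked with a one-proportion z-test. This is a sketch under the assumption that the calculator ran a two-sided test; the counts are the class totals (208 red out of 1081 candies).

```python
from math import erfc, sqrt

red = 208       # red candies in the class sample
total = 1081    # all candies in the class sample
p0 = 0.20       # claimed proportion under the null hypothesis
alpha = 0.05    # significance level

p_hat = red / total

# Standard error uses the claimed proportion p0, as the null hypothesis assumes.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / total)

# Two-sided p-value: P(|Z| >= |z|) for a standard normal Z.
p_value = erfc(abs(z) / sqrt(2))

reject_null = p_value < alpha
print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject_null}")
```

Failing to reject the null hypothesis means the data are consistent with 20% red; it does not prove that the proportion is exactly 20%.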
Use a 0.01 significance level to test the claim that the mean number
of candies in a bag of Skittles is 55.
H0 (null): μ = 55
H1 (alternative): μ ≠ 55
α = .01
Mean: 60.06
SD: 2.36
σ (population SD): 2.29
n = 18
p-value = 9.8450E-21 (found on calculator)
p-value = 9.8450E-21 < .01 = α
Because the p-value is less than the significance level, we reject the null
hypothesis. There is sufficient evidence to reject the claim that the mean
number of Skittles per bag is 55.
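
A p-value this small is consistent with a two-sided z-test that uses the listed population SD of 2.29, although the project does not say which test the calculator ran, so treat the sketch below as an assumption. Because it uses the rounded inputs listed above, it will not match the calculator's p-value digit for digit.

```python
from math import erfc, sqrt

mu0 = 55             # claimed mean under the null hypothesis
alpha = 0.01         # significance level
sample_mean = 60.06  # rounded value listed in the project
sigma = 2.29         # listed population SD, treated as known (assumption)
n = 18               # number of bags

z = (sample_mean - mu0) / (sigma / sqrt(n))

# Two-sided p-value. erfc keeps precision in the far tail, where
# computing 1 - cdf(z) would round to exactly 0.
p_value = erfc(abs(z) / sqrt(2))

reject_null = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.3e}, reject H0: {reject_null}")
```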

Reflect
Before an interval estimate can be made, you must first make sure a few
conditions are met. First, you need to make sure the sample used is a simple
random sample. Second, the sample needs to be sufficiently large. For a
proportion, a sample can usually be considered sufficiently large if it
includes at least 10 successes and 10 failures. As with interval estimates,
hypothesis tests also have conditions that need to be met. However, they vary
depending on the type of test that you are doing. For a proportion test, the
sample must be a simple random sample and the population must be greater
than 10n. For a one-sample t-test, the sample size needs to be greater than
30 (or the data should be approximately normal), the population must be
greater than 10n, and the sample must be a simple random sample. As far as I
can tell, my proportion tests met these conditions. My test of the mean used
only 18 bags, fewer than 30, so it relies on the bag counts being roughly
normal. To improve a sampling method, use a larger sample: the larger the
sample is, the more precise the estimates will be. Additionally, the more
random the sample is, the more likely it is to represent the population
well. From the statistical work we have done, we can conclude that the mean
number of Skittles per bag is not 55, and the sample mean of about 60
suggests it is higher. We did not find enough evidence to reject the claim
that 20% of Skittles are red.
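
The sample-size rules of thumb mentioned in this reflection can be checked against the class data with a short sketch. The thresholds (at least 10 successes and 10 failures, more than 30 observations) are the ones named above; the variable names are only illustrative.

```python
total = 1081                              # all candies in the class sample
color_counts = {"yellow": 205, "red": 208}
n_bags = 18                               # bags used for the mean and SD

# Proportion procedures: at least 10 successes and 10 failures.
proportion_ok = {}
for color, successes in color_counts.items():
    failures = total - successes
    proportion_ok[color] = successes >= 10 and failures >= 10
    print(f"{color}: {successes} successes, {failures} failures, ok = {proportion_ok[color]}")

# Mean procedures: more than 30 observations, otherwise the data
# should look approximately normal.
large_sample = n_bags > 30
print(f"bags: {n_bags}, more than 30 = {large_sample}")
```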

Final Reflection
What have you learned as a result of this project?
As a result of this project I have become much more proficient with not
only Microsoft Word, but Microsoft Excel as well. Previously I hadn't had much
experience with spreadsheets and making graphs. But throughout the course
of this project, I was really pushed to my limits in trying to create graphs that
accurately displayed the Skittles data. I've become a lot more confident
when hypothesis testing. Conducting hypothesis tests used to be quite
challenging for me, especially in distinguishing the null from the alternative,
but through this project I've learned a lot more about the conditions under
which hypothesis tests can actually be done, and it has made conducting the
tests much easier.
Discuss how the math skills that you applied in this project will
impact other classes you will take in your school career.
I've learned that my previous belief that statistics aren't useful was
quite wrong. Statistics can be applied in countless ways in countless areas. I
think that the skills I've learned through doing this project can help me
understand the process by which statistics are obtained in textbooks in other
classes. Additionally, because I am working toward a Bachelor of Science in
political science, I will need to take a lot of statistics-based classes, and I
think that applying everything I learned will definitely help me get ahead in
those classes.
Discuss how the project helped to develop your problem solving
skills.
This project may have been very easy for some, but for me it was
anything but. I've struggled quite a bit in statistics, and so a project as
extensive as this really pushed me to my limits. There was a lot I didn't know
how to do and a lot I needed to figure out, but throughout the course of
doing this project, I believe I gained some very valuable problem-solving
skills. For example, making graphs proved to be extremely challenging for
me. They would always come out differently than I had hoped or with the
wrong information, but as soon as I started thinking about the problem and
how to solve it from multiple perspectives, things began to get
easier. When trying to make confidence intervals, I struggled quite a bit with
making a standard deviation interval. However, when I put my newfound
problem-solving skills into play, I realized I had multiple resources at
my fingertips, such as YouTube, the textbook, and various websites, that
could help me figure out how to construct the interval.
