
Karla Moreno-Rodriguez

Math 1040
Skittles Project – Final

Part 1:
In the first part of the Skittles project, we organized our data in tables and graphs. We also
calculated and interpreted the proportion of red skittles by finding p-values and t-values.
After purchasing a 2.17-ounce bag of Original Skittles, the counts of Skittles of each color in
the bag were the following:
Red: 9
Orange: 18
Yellow: 9
Green: 14
Purple: 10
Total: 60
Bar Graph - representing the colors of skittles in the sample bag.
The most common color was orange with the two least common colors being red and yellow.
The relative frequency of red skittles is 0.15. With the known relative frequency, if given a bag
of 5000 skittles we would predict 750 of them to be red.
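
The relative frequency and the prediction above can be checked with a short sketch. The variable names are illustrative only, and the 5000-skittle bag is the hypothetical bag from the text.

```python
# Color counts from the sample bag (taken from the list above).
counts = {"Red": 9, "Orange": 18, "Yellow": 9, "Green": 14, "Purple": 10}

total = sum(counts.values())            # total skittles in the bag
rel_freq_red = counts["Red"] / total    # relative frequency of red

# Scale the sample proportion up to a hypothetical bag of 5000 skittles.
bag_size = 5000
predicted_red = rel_freq_red * bag_size

print(f"Total skittles: {total}")
print(f"Relative frequency of red: {rel_freq_red:.2f}")
print(f"Predicted red skittles in a bag of {bag_size}: {predicted_red:.0f}")
```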
The skittles used are from a random sample. It would be classified as a multistage sample, since
a store was picked first and then a skittle bag was chosen at random.

The p-value calculated is 0.0894. Since the p-value is greater than 0.05, we fail to reject the null
hypothesis. There is not sufficient evidence to conclude that the proportion of red skittles is
different than 0.20.

Part 2:
Throughout this portion of the project, we started looking at the population as well as samples
outside of our own. We interpreted the results through confidence intervals and hypothesis
testing.
Compiled data from the entire class.

Color Frequency
Red 513
Orange 501
Yellow 519
Green 481
Purple 485
Total 2499

The conditions for computing a 95% confidence interval for the class data are met. The data
come from a random sample, as the machines randomly place the skittles into the bags. The
observations are independent, as a skittles bag from one individual is not going to affect the
observations of another. The sample is large enough since it is larger than 30.

The p-hat value obtained from Part 1 was 0.15. Based on the interval from 0.0742 to 0.0962,
the p-hat was not in the interval, so it was not a likely value.
To run a hypothesis test, we need to verify that the sample is random, the observations are
independent, and the sample size is large enough. Looking at the class sample, it is a random
sample. The observations are independent since the sample is less than 5% of total skittles
(n < 0.05N). Lastly, the sample size is large enough (513 > 10, 1,980.20 > 10).

Part 3:
In Part 3, we started looking more at the outliers. We used a histogram and a box plot to see
how outliers affect our numbers. We used our results to test and estimate the means.

Box Plot

Histograms

Looking at the histogram, the distribution is slightly right skewed. The median lies between 58
and 62, and outliers are shown at 91, 104, and 106.
The box plot shows a normal distribution shape with the median 68. It excludes outliers, making
the median more accurate for the sample data. The outliers shown are 42, 91, 104, and 106. This
can be determined with the IQR (60-57=3). The lower fence is 52.5 (57-1.5*3) and the upper fence
is 64.5 (60+1.5*3).
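
As a quick check of the fence arithmetic, here is a minimal sketch that assumes Q1 = 57 and Q3 = 60 as stated above. The list `reported` holds only the four outlier values named in the text, not the full data set, which is not reproduced here. Note that 60 + 1.5*3 works out to 64.5.

```python
# Quartiles as stated in the text (assumed: Q1 = 57, Q3 = 60).
q1, q3 = 57, 60
iqr = q3 - q1                      # interquartile range

# Standard 1.5 * IQR rule for outlier fences.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values the report lists as outliers (not the full data set).
reported = [42, 91, 104, 106]
outliers = [x for x in reported if x < lower_fence or x > upper_fence]

print(f"IQR = {iqr}, lower fence = {lower_fence}, upper fence = {upper_fence}")
print(f"Values outside the fences: {outliers}")
```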
To run a hypothesis test, we need to verify that the sample is random, the observations are
independent, and the sample size is large enough. Looking at the class sample, it is a random
sample. The observations are independent since the sample is less than 5% of total skittles
(n < 0.05N). Lastly, the sample size is large enough (513 > 10, 1,980.20 > 10).
Mean: 59.8947
S: 9.8771

With the calculations above, we are 95% confident that the interval from 59.5073 to 60.2821
contains the true mean of the total number of skittles per bag.
We fail to reject the null hypothesis. There is not sufficient evidence to conclude that the total
number of skittles per bag is different than 60.

The p-value is approximately 0, which is less than 0.05, so we reject the null hypothesis.
According to the p-value, there is enough evidence to conclude that the proportion is not equal
to 0.2.
This result is different from the one in Part 1, where we did not reject the null hypothesis.
In this case the testing in Part 1 is more accurate since the data could have been skewed due to
the large number of skittles for 2 of the samples.

Part 4:
Lastly, in part 4 we started to look at probability. We looked at the probability of picking certain
skittles compared to another. In this portion, we also compared the difference of population
proportions.

            Red  Orange  Yellow  Green  Purple  Total
Me            9      18       9     14      10     60
Student A    13      12       7     15      11     58
Student B     9      16      14      6      15     60
Total        31      46      30     35      36    178

The probability that a randomly selected skittle is red or yellow – 61/178 = 0.3427
The probability that a randomly selected skittle is red or from your bag – 82/178 = 0.4607
The probability that a randomly selected skittle is red and from your bag – 9/178 = 0.0506
The probability that a randomly selected skittle is from your bag, given it is red – 9/31 = 0.2903
The probability that three randomly selected skittles (without replacement) are all red –
(31/178)*(30/177)*(29/176) = 0.0049
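
The five probabilities above can be reproduced from the per-bag counts in the table. This is a minimal sketch; the dictionary layout and variable names are illustrative choices, and "Me" stands for "your bag" in the questions.

```python
from fractions import Fraction

# Per-bag color counts, copied from the three data rows of the table.
bags = {
    "Me":        {"Red": 9,  "Orange": 18, "Yellow": 9,  "Green": 14, "Purple": 10},
    "Student A": {"Red": 13, "Orange": 12, "Yellow": 7,  "Green": 15, "Purple": 11},
    "Student B": {"Red": 9,  "Orange": 16, "Yellow": 14, "Green": 6,  "Purple": 15},
}

total = sum(sum(bag.values()) for bag in bags.values())   # all skittles
red = sum(bag["Red"] for bag in bags.values())            # all red skittles
yellow = sum(bag["Yellow"] for bag in bags.values())      # all yellow skittles
mine = sum(bags["Me"].values())                           # skittles in my bag
red_and_mine = bags["Me"]["Red"]                          # red skittles in my bag

# Red and yellow are mutually exclusive, so the counts simply add.
p_red_or_yellow = Fraction(red + yellow, total)
# General addition rule: subtract the overlap (red skittles in my bag).
p_red_or_mine = Fraction(red + mine - red_and_mine, total)
p_red_and_mine = Fraction(red_and_mine, total)
# Conditional probability: restrict the sample space to red skittles.
p_mine_given_red = Fraction(red_and_mine, red)
# Without replacement: both numerator and denominator drop by one each draw.
p_three_red = (Fraction(red, total)
               * Fraction(red - 1, total - 1)
               * Fraction(red - 2, total - 2))

for name, p in [("red or yellow", p_red_or_yellow),
                ("red or my bag", p_red_or_mine),
                ("red and my bag", p_red_and_mine),
                ("my bag given red", p_mine_given_red),
                ("three reds, no replacement", p_three_red)]:
    print(f"P({name}) = {float(p):.4f}")
```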

We are 95% confident that the interval calculated, -0.2144 to 0.0662, contains the difference in
the population proportions of red skittles in a 2.17 oz bag. The confidence interval does not
provide significant evidence of a difference in the proportion of red skittles, since zero is
within our confidence interval.
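
The interval can be reproduced with a two-proportion z-interval. This is a sketch under assumptions: it takes the two samples to be the red counts from my bag (9 of 60) and Student A's bag (13 of 58), and it uses the unpooled standard error with a normal critical value. The report does not say which tool produced the interval.

```python
import math
from statistics import NormalDist

# Assumed inputs: red counts and bag totals for my bag and Student A's bag.
x1, n1 = 9, 60
x2, n2 = 13, 58

p1 = x1 / n1
p2 = x2 / n2
diff = p1 - p2

# Unpooled standard error for a difference of two proportions.
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Critical value for 95% confidence (about 1.96).
z_star = NormalDist().inv_cdf(0.975)

lower = diff - z_star * se
upper = diff + z_star * se

print(f"Difference in sample proportions: {diff:.4f}")
print(f"95% confidence interval: ({lower:.4f}, {upper:.4f})")
```

Because the interval runs from a negative value to a positive one, it includes zero, which matches the conclusion in the text.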
Sample size check:
n1p = 9
n1(1-p) = 51
n2p = 13
n2(1-p) = 45
Looking at the sample size check, the condition is not met for the first sample since
n1p = 9 < 10, suggesting that the results may not be valid.

Test statistic: z = -1.0338
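
The test statistic is consistent with a pooled two-proportion z-test on the red counts. The sketch below assumes the same inputs (9 of 60 for my bag, 13 of 58 for Student A) and a two-sided alternative. The two-sided p-value it prints is computed here for illustration and is not a figure quoted in the report.

```python
import math
from statistics import NormalDist

# Assumed inputs: red counts and bag totals for my bag and Student A's bag.
x1, n1 = 9, 60
x2, n2 = 13, 58

p1 = x1 / n1
p2 = x2 / n2

# Under H0: p1 = p2, pool the two samples to estimate the common proportion.
pooled = (x1 + x2) / (n1 + n2)
se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

z = (p1 - p2) / se

# Two-sided p-value from the standard normal distribution.
p_value = 2 * NormalDist().cdf(-abs(z))

alpha = 0.05
print(f"z = {z:.4f}")
print(f"two-sided p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```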


Since the p-value > α, we fail to reject the null hypothesis and have insufficient evidence that
there is a difference in the proportion of red skittles between the place where I purchased the
bag and where Student A purchased the bag.
