Eportfolio Stats

Courtney Tiatia

Statistics 1040
This semester we applied statistics to the number of Skittles in a bag and to how many Skittles
there were of each color.

Count          Red   Orange   Yellow   Green   Purple   Total
My Bag          14       16       16      10       14      58
Class Counts  1340     1356     1410    1245     1329    6680

I expected red to have the lowest count of all the colors, because red is a favorite among a
lot of people and the company knows it, but green was the lowest.

There were six observations that seemed to be outliers. Outliers can distort our graphs by
leaving a big gap between the outlier and the rest of the data, and they can skew our graphs
and summary statistics.

The distribution of colors in the total class data matches my data for the most part, and that
actually didn’t surprise me. I would hope that each Skittles bag has roughly the same amount
of each color, so customers can rely on the colors they love being in the bag.

After we completed individual work we moved on to working as a group to get the following
results on the next two pages.
Group 3 Part 2

1.
Red count: 1340, proportion: .201
Orange count: 1356, proportion: .203
Yellow count: 1410, proportion: .211
Green count: 1245, proportion: .186
Purple count: 1329, proportion: .199
What did we expect the proportions to be and why?
Everyone thought the proportions were going to be different based on their own
bag of Skittles (e.g., one person thought orange would have the highest count because they
had many more orange candies in their bag than any other color). Since everyone had different
guesses, we expected the proportions to be pretty close to each other, which was right. It is
hard to guess any proportions when everyone’s bags of Skittles are so different. We did guess
that the proportions would be close to .2 for nearly all of the colors, because with so many
people in the class we figured there would be many people with different amounts of each
color, and the differences would even out. I am sure Skittles does not favor one color over
another when making them. I am sure they make the same amount of every color and then
divide the candies randomly into the bags. This is probably why we had so many different
numbers and variations.
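The proportions in item 1 come from dividing each class count by the class total. A minimal Python sketch of that arithmetic, using only the class counts reported above (the variable names are illustrative, not part of the original project):

```python
# Class counts by color, copied from the class data table.
class_counts = {
    "Red": 1340,
    "Orange": 1356,
    "Yellow": 1410,
    "Green": 1245,
    "Purple": 1329,
}

# Total number of candies counted by the class.
total = sum(class_counts.values())

# Proportion of each color, rounded to three decimals as in the write-up.
proportions = {color: round(count / total, 3) for color, count in class_counts.items()}

for color, p in proportions.items():
    print(f"{color}: count {class_counts[color]}, proportion {p:.3f}")
```

Each proportion lands close to .2, which is what we would see if the five colors were produced in equal amounts.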
2.
My findings about the variable “Total candies in each bag”

I found that the shape of the distribution is slightly skewed to the right, but if we
discount the outliers it is roughly bell shaped. The graphs show basically what I expected from
the data, and the data is a lot easier to understand when you get a visual of all the numbers
in a graph. The mean we found for the whole class is roughly 60 candies per bag, and my own
bag had 58 candies in total. So, my bag is slightly below the class mean.
There are a couple of types of data that we use. The first is categorical data, which I
know better as qualitative data. Qualitative data is data such as the breeds of dogs (German
Shepherd, Golden Retriever, Australian Shepherd, etc.). Because the data describes
characteristics rather than actual numbers, it is considered qualitative (categorical) data.
Quantitative data is numerical data: data that can be written as numbers, such as shoe size,
height, weight, etc. Since one type of data is descriptive (qualitative) and the other is
numerical (quantitative), certain graphs and calculations make sense for each type. For
qualitative (categorical) data the best graphs are pie charts (my personal favorite) or bar
graphs, because these graphs make the categories easy for the eye to take in quickly.
Qualitative data is not numerical, so it wouldn’t work with boxplots and histograms as
quantitative data would. The best way to summarize qualitative data is to group it into
categories and count the frequency of each. On the other hand, the best graphs for
quantitative data are boxplots, histograms, stem and leaf plots, etc. We have a wider range of
graphs and calculations for quantitative data because this type of data is numerical and more
complex. It wouldn’t make sense to use a pie chart for quantitative data, because the values
are measurements rather than categories. We need both types of data, for simpler and for more
complex questions, and as long as you know the difference you can come away with a better
understanding of the data you’ve collected.

As a group we applied statistics to our findings as follows:

1. Mean, Standard Deviation, 5-Number Summary for “Total Candies in Each Bag”
a. Mean = 60.2
b. Standard Deviation = 7.0
c. (Min) 35, (Q1) 58, (Median) 59, (Q3) 61, (Max) 97
2. Frequency Histogram for the variable “Total Candies in Each Bag”:
3. Box Plot for the variable “Total Candies in Each Bag”:
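The five-number summary in item 1c also gives a quick numerical check for outliers. The sketch below assumes the common 1.5 × IQR fence rule; the write-up does not say which rule was used to flag outliers, so treat the rule (and the variable names) as an assumption.

```python
# Five-number summary for "Total Candies in Each Bag", from item 1c.
minimum, q1, median, q3, maximum = 35, 58, 59, 61, 97

# Interquartile range: the spread of the middle half of the bags.
iqr = q3 - q1

# Assumed rule: values beyond 1.5 * IQR from the quartiles count as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR = {iqr}, fences = {lower_fence} to {upper_fence}")
print(f"Minimum {minimum} is an outlier: {minimum < lower_fence}")
print(f"Maximum {maximum} is an outlier: {maximum > upper_fence}")
```

Under this rule both the smallest bag (35) and the largest bag (97) fall outside the fences, which fits the long tails seen in the class data.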

What is a confidence interval? Confidence intervals are a way to make an inference from a
sample to a population. In other words, a confidence interval draws a conclusion about a
population from sample data. So, if we conducted a survey and we had a 99% confidence
interval, we could say that if we used the same sampling method to select different samples and
got an interval estimate for each sample, we would expect the true population parameter to
fall within the interval estimates 99% of the time. Confidence intervals are a great choice for
statisticians because they give both an estimate and a measure of the uncertainty in that
estimate.
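The "repeated samples" idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true proportion (0.2), the sample size (500), the number of trials (2000), and the helper names `wald_interval` and `coverage` are all made up for illustration and do not come from the class data.

```python
import random
from statistics import NormalDist


def wald_interval(successes, n, level):
    """Standard one-sample confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # critical value
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin


def coverage(true_p, n, level, trials, seed=0):
    """Share of simulated samples whose interval contains true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and count the "successes".
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(successes, n, level)
        if low <= true_p <= high:
            hits += 1
    return hits / trials


# Hypothetical settings, chosen only for illustration.
rate = coverage(true_p=0.2, n=500, level=0.99, trials=2000)
print(f"Share of intervals containing the true proportion: {rate:.3f}")
```

The printed share should come out close to the 99% confidence level, which is what the level is describing.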

Now, as a group, we will use confidence intervals with our findings from our class Skittles
statistics.

1. p for yellow candies = .211

Confidence level is 99%
n = 6680
np(1-p) is greater than or equal to 10
(6680)(.211)(1-.211)
1409.48(.789)
1112.08 is greater than or equal to 10, so the condition is met
n is less than or equal to .05N (the sample is no more than 5% of the population)
Condition is met
StatCrunch
1. Stat
2. Proportion stats
3. One sample
4. With summary
5. # of successes 1410
6. # of observations 6680
7. Confidence level .99
8. Compute
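The condition check and the StatCrunch steps above can be reproduced by hand. The sketch below assumes the standard large-sample formula for a one-proportion interval, p ± z·sqrt(p(1-p)/n); the write-up does not show which formula the software used, so that is an assumption, and the variable names are illustrative.

```python
from statistics import NormalDist

# Inputs from the StatCrunch steps above.
successes = 1410       # yellow candies
n = 6680               # all candies counted by the class
level = 0.99           # confidence level

p_hat = successes / n  # sample proportion, about .211

# Condition: n * p * (1 - p) must be at least 10.
condition_value = n * p_hat * (1 - p_hat)

# Critical value for the chosen confidence level (about 2.576 for 99%).
z = NormalDist().inv_cdf(1 - (1 - level) / 2)

# Margin of error and interval endpoints (assumed standard formula).
margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
low, high = p_hat - margin, p_hat + margin

print(f"np(1-p) is at least 10: {condition_value >= 10}")
print(f"99% interval for yellow: {low:.3f} to {high:.3f}")
```

Rounded to three decimals, the endpoints are .198 and .224.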

2. Enter the total number of Skittles in each bag into StatCrunch

Check that it satisfies both conditions
StatCrunch
a. Graph: QQ plot and boxplot
b. Select number of skittles (data entered in stat crunch)
c. Add correlation statistic
d. Compute
e. If we disregard the outliers, as we were instructed to, the conditions should be
satisfied: the sample size is large and the data are approximately normally distributed.
Both conditions are satisfied.
a. Stat
b. T stats
c. One sample
d. With Data
e. Select data (# of skittles)
f. Confidence level for the mean, which is .95
g. Compute

3. In conclusion, (1) we are 99% confident that the proportion of Skittles in a 2.17
oz bag that are yellow is between .198 and .224,
and (2) we are 95% confident that the mean number of Skittles per 2.17 oz bag
is between 58.863 and 61.497.
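As a quick sanity check, the reported interval for the mean can be split back into its center and margin of error. This sketch uses only the two endpoints given above and is plain arithmetic, not a re-run of the StatCrunch calculation.

```python
# Endpoints of the 95% interval for the mean number of candies per bag.
low, high = 58.863, 61.497

# The center of the interval should match the sample mean.
midpoint = (low + high) / 2

# Half the width of the interval is the margin of error.
margin = (high - low) / 2

print(f"Center of interval: {midpoint:.2f}")
print(f"Margin of error:    {margin:.3f}")
```

The center rounds to 60.2, which agrees with the class mean reported earlier, and the margin of error is about 1.3 candies.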

The mathematics and statistics skills that I applied in this project will help me in future
classes in chemistry, a variety of nursing courses, and even Nurse Practitioner courses. I will
need the mathematics skills for a good number of future courses and in my career. The
statistics skills I learned in this course will also be very useful for nursing and NP
school, because statistics are used constantly in healthcare degrees.

The group part of the project was a great way to keep up my group skills: coming up
with solutions to the questions together and making sure everyone in the group completed
their part on time. This is helpful for all my future classes, as I will need to find classmates
for group study sessions and continue to work with different personalities and schedules. It
will also carry forward to my career in healthcare, working with many different patient
personalities, schedules, and motivations.

This project didn’t change how I think about real-world statistics; instead, it
supported my current thinking. Statistics control how so much of our world works, from
presidential politics to drive-thru restaurants. All companies use statistics and all countries
use statistics. In my current job in Labor and Delivery we use statistics to estimate how
many babies will be born in a certain month, but there is a standard deviation of 1-2 weeks, and
then there are always those preemie babies who are the statistical outliers. It helps us to know
how busy our staff is going to be and how to plan supply orders. Statistics are
everywhere in the real world, and this project helps confirm that.
